Can't able to run the Create ML for training and I upgraded to MacOS 26.3 beta and I have tried older and newer
Explore the power of machine learning and Apple Intelligence within apps. Discuss integrating features, share best practices, and explore the possibilities for your app here.
Selecting any option will automatically load the page
Post
Replies
Boosts
Views
Activity
The developer tutorial for visual intelligence indicates that the method to detect and handle taps on a displayed entity from the Search section is via an "OpenIntent" associated with your entity.
However, running this intent executes code from within my app. If I have the perform() method display UI, it always displays UI from within my app.
I noticed that the Google app's integration to visual intelligence has a different behavior-- tapping on an entity does not take you to the Google app -- instead, a Webview is presented sheet-style WITHIN the Visual Intelligence environment (see below)
How is that accomplished?
Topic:
Machine Learning & AI
SubTopic:
Apple Intelligence
Hi all, I spent the last few months developing an MLX/Ollama local AI Benchmarking suite for Apple Silicon, written in pure Swift and signed with an Apple Developer Certificate, open source, GPL, and free. I would love some feedback to continue development. It is the only benchmarking suite I know of that supports live power metrics and MLX natively, as well as quick exports for benchmark results, and an arena mode, Model A vs B with history. I really want this project to succeed, and have widespread use, so getting 75 stars on the github repo makes it eligible for Homebrew/Cask distribution.
Github Repo
Topic:
Machine Learning & AI
SubTopic:
Core ML
It seems to be that Swift has more APIs implemented than the C++ interface (especially APIs found in the MLXNN and MLXOptimize folders). Is there any intention to implement more APIs for neural networks and training them in the future?
I'm implementing an LLM with Metal Performance Shader Graph, but encountered a very strange behavior, occasionally, the model will report an error message as this:
LLVM ERROR: SmallVector unable to grow. Requested capacity (9223372036854775808) is larger than maximum value for size type (4294967295)
and crash, the stack backtrace screenshot is attached. Note that 5th frame is
mlir::getIntValues<long long>
and 6th frame is
llvm::SmallVectorBase<unsigned int>::grow_pod
It looks like mlir mistakenly took a 64 bit value for a 32 bit type. Unfortunately, I could not found the source code of
mlir::getIntValues, maybe it's Apple's closed source fork of llvm for MPS implementation? Anyway, any opinion or suggestion on that?
Topic:
Machine Learning & AI
SubTopic:
General
Hello everyone,
I’m currently working with the Message Filtering Extension and would really appreciate some clarification around its performance and operational constraints. While the extension is extremely powerful and useful, I’ve found that some important details are either unclear or not well covered in the available documentation.
There are two main areas I’m trying to understand better:
Machine learning model constraints within the extension
In our case, we already have an existing ML model that classifies messages (and are not dependant on Apple's built-in models). We’re evaluating whether and how it can be used inside the extension.
Specifically, I’m trying to understand:
Are there documented limits on the size of an ML model (e.g., maximum bundle size or model file size in MB)?
What are the memory constraints for a model once loaded into memory by the extension?
Under what conditions would the system terminate or “kick out” the extension due to memory or performance pressure?
Message processing timeouts and execution constraints
What is the timeout for processing a single received message?
At what point will the OS stop waiting for the extension’s response and allow the message by default (for example, if the extension does not respond in time)?
Any guidance, official references, or practical experience from Apple engineers or other developers would be greatly appreciated.
Thanks in advance for your help,
Hi all,
I'm trying to find out if/when we can expect mxfp8/mxfp4 support on Apple Silicon. I've noticed that mlx now has casting data types, but all computation is still done in bf16. Would be great to reduce power consumption with support for these lower precision data types since edge inference is already typically done at a lower precision!
Thanks in advance.
Topic:
Machine Learning & AI
SubTopic:
Core ML
Hi i'm curently crating a model to identify car plates (object detection) i use asitop to monitor my macbook pro and i see that only the cpu is used for the training and i wanted to know why
Using highly optimized Metal Shading Language (MSL) code, I pushed the MacBook Air M2 to its performance limits with the deformable_attention_universal kernel. The results demonstrate both the efficiency of the code and the exceptional power of Apple Silicon.
The total computational workload exceeded 8.455 quadrillion FLOPs, equivalent to processing 8,455 trillion operations. On average, the code sustained a throughput of 85.37 TFLOPS, showcasing the chip’s remarkable ability to handle massive workloads. Peak instantaneous performance reached approximately 673.73 TFLOPS, reflecting near-optimal utilization of the GPU cores.
Despite this intensity, the cumulative GPU runtime remained under 100 seconds, highlighting the code’s efficiency and time optimization. The fastest iteration achieved a record processing time of only 0.051 ms, demonstrating minimal bottlenecks and excellent responsiveness.
Memory management was equally impressive: peak GPU memory usage never exceeded 2 MB, reflecting efficient use of the M2’s Unified Memory. This minimizes data transfer overhead and ensures smooth performance across repeated workloads.
Overall, these results confirm that a well-optimized Metal implementation can unlock the full potential of Apple Silicon, delivering exceptional computational density, processing speed, and memory efficiency. The MacBook Air M2, often considered an energy-efficient consumer laptop, is capable of handling highly intensive workloads at performance levels typically expected from much larger GPUs. This test validates both the robustness of the Metal code and the extraordinary capabilities of the M2 chip for high-performance computing tasks.
Hi everyone
Im currently developing an object detection model that shall identify up to seven classes in an image. While im usually doing development with basic python and the ultralytics library, i thought i would like to give CreateML a shot. The experience is actually very nice, except for the fact that the model seem not to be using any ANE or GPU (MPS) for accelerated training.
On https://developer.apple.com/machine-learning/create-ml/ it states: "On-device training Train models blazingly fast right on your Mac while taking advantage of CPU and GPU."
Am I doing something wrong?
Im running the training on
Apple M1 Pro 16GB
MacOS 26.1 (Tahoe)
Xcode 26.1 (Build version 17B55)
It would be super nice to get some feedback or instructions.
Thank you in advance!
I am writing a custom package wrapping Foundation Models which provides a chain-of-thought with intermittent self-evaluation among other things. At first I was designing this package with the command line in mind, but after seeing how well it augments the models and makes them more intelligent I wanted to try and build a SwiftUI wrapper around the package.
When I started I was using synchronous generation rather than streaming, but to give the best user experience (as I've seen in the WWDC sessions) it is necessary to provide constant feedback to the user that something is happening.
I have created a super simplified example of my setup so it's easier to understand.
First, there is the Reasoning conversation item, which can be converted to an XML representation which is then fed back into the model (I've found XML works best for structured input)
public typealias ConversationContext = XMLDocument
extension ConversationContext {
public func toPlainText() -> String {
return xmlString(options: [.nodePrettyPrint])
}
}
/// Represents a reasoning item in a conversation, which includes a title and reasoning content.
/// Reasoning items are used to provide detailed explanations or justifications for certain decisions or responses within a conversation.
@Generable(description: "A reasoning item in a conversation, containing content and a title.")
struct ConversationReasoningItem: ConversationItem {
@Guide(description: "The content of the reasoning item, which is your thinking process or explanation")
public var reasoningContent: String
@Guide(description: "A short summary of the reasoning content, digestible in an interface.")
public var title: String
@Guide(description: "Indicates whether reasoning is complete")
public var done: Bool
}
extension ConversationReasoningItem: ConversationContextProvider {
public func toContext() -> ConversationContext {
// <ReasoningItem title="${title}">
// ${reasoningContent}
// </ReasoningItem>
let root = XMLElement(name: "ReasoningItem")
root.addAttribute(XMLNode.attribute(withName: "title", stringValue: title) as! XMLNode)
root.stringValue = reasoningContent
return ConversationContext(rootElement: root)
}
}
Then there is the generator, which creates a reasoning item from a user query and previously generated items:
struct ReasoningItemGenerator {
var instructions: String {
"""
<omitted for brevity>
"""
}
func generate(from input: (String, [ConversationReasoningItem])) async throws -> sending LanguageModelSession.ResponseStream<ConversationReasoningItem> {
let session = LanguageModelSession(instructions: instructions)
// build the context for the reasoning item out of the user's query and the previous reasoning items
let userQuery = "User's query: \(input.0)"
let reasoningItemsText = input.1.map { $0.toContext().toPlainText() }.joined(separator: "\n")
let context = userQuery + "\n" + reasoningItemsText
let reasoningItemResponse = try await session.streamResponse(
to: context, generating: ConversationReasoningItem.self)
return reasoningItemResponse
}
}
I'm not sure if returning LanguageModelSession.ResponseStream<ConversationReasoningItem> is the right move, I am just trying to imitate what session.streamResponse returns.
Then there is the orchestrator, which I can't figure out. It receives the streamed ConversationReasoningItems from the Generator and is responsible for streaming those to SwiftUI later and also for evaluating each reasoning item after it is complete to see if it needs to be regenerated (to keep the model on-track). I want the users of the orchestrator to receive partially generated reasoning items as they are being generated by the generator. Later, when they finish, if the evaluation passes, the item is kept, but if it fails, the reasoning item should be removed from the stream before a new one is generated. So in-flight reasoning items should be outputted aggresively.
I really am having trouble figuring this out so if someone with more knowledge about asynchronous stuff in Swift, or- even better- someone who has worked on the Foundation Models framework could point me in the right direction, that would be awesome!
With the release of the newest version of tahoe and MLX supporting RDMA. Is there a documentation link to how to utilizes the libdrma dylib as well as what functions are available? I am currently assuming it mostly follows the standard linux infiniband library but I would like the apple specific details.
Topic:
Machine Learning & AI
SubTopic:
General
Hello, I have to create an app in Swift that it scan NFC Identity card. It extract data and convert it to human readable data. I do it with below code
import CoreNFC
class NFCIdentityCardReader: NSObject , NFCTagReaderSessionDelegate {
func tagReaderSessionDidBecomeActive(_ session: NFCTagReaderSession) {
print("\(session.description)")
}
func tagReaderSession(_ session: NFCTagReaderSession, didInvalidateWithError error: any Error) {
print("NFC Error: \(error.localizedDescription)")
}
var session: NFCTagReaderSession?
func beginScanning() {
guard NFCTagReaderSession.readingAvailable else {
print("NFC is not supported on this device")
return
}
session = NFCTagReaderSession(pollingOption: .iso14443, delegate: self, queue: nil)
session?.alertMessage = "Hold your NFC identity card near the device."
session?.begin()
}
func tagReaderSession(_ session: NFCTagReaderSession, didDetect tags: [NFCTag]) {
guard let tag = tags.first else {
session.invalidate(errorMessage: "No tag detected")
return
}
session.connect(to: tag) { (error) in
if let error = error {
session.invalidate(errorMessage: "Connection error: \(error.localizedDescription)")
return
}
switch tag {
case .miFare(let miFareTag):
self.readMiFareTag(miFareTag, session: session)
case .iso7816(let iso7816Tag):
self.readISO7816Tag(iso7816Tag, session: session)
case .iso15693, .feliCa:
session.invalidate(errorMessage: "Unsupported tag type")
@unknown default:
session.invalidate(errorMessage: "Unknown tag type")
}
}
}
private func readMiFareTag(_ tag: NFCMiFareTag, session: NFCTagReaderSession) {
// Read from MiFare card, assuming it's formatted as an identity card
let command: [UInt8] = [0x30, 0x04] // Example: Read command for block 4
let requestData = Data(command)
tag.sendMiFareCommand(commandPacket: requestData) { (response, error) in
if let error = error {
session.invalidate(errorMessage: "Error reading MiFare: \(error.localizedDescription)")
return
}
let readableData = String(data: response, encoding: .utf8) ?? response.map { String(format: "%02X", $0) }.joined()
session.alertMessage = "ID Card Data: \(readableData)"
session.invalidate()
}
}
private func readISO7816Tag(_ tag: NFCISO7816Tag, session: NFCTagReaderSession) {
let selectAppCommand = NFCISO7816APDU(instructionClass: 0x00, instructionCode: 0xA4, p1Parameter: 0x04, p2Parameter: 0x00, data: Data([0xA0, 0x00, 0x00, 0x02, 0x47, 0x10, 0x01]), expectedResponseLength: -1)
tag.sendCommand(apdu: selectAppCommand) { (response, sw1, sw2, error) in
if let error = error {
session.invalidate(errorMessage: "Error reading ISO7816: \(error.localizedDescription)")
return
}
let readableData = response.map { String(format: "%02X", $0) }.joined()
session.alertMessage = "ID Card Data: \(readableData)"
session.invalidate()
}
}
}
But I got null. I think that these data are encrypted. How can I convert them to readable data without MRZ, is it possible ?
I need to get personal informations from Identity card via Core NFC.
Thanks in advance.
Best regards
I've built a model using Create ML, but I can't make it, for the love of God, updatable. I can't find any checkbox or anything related. It's an Activity Classifier, if it matters.
I want to continue training it on-device using MLUpdateTask, but the model, as exported from Create ML, fails with error: Domain=com.apple.CoreML Code=6 "Failed to unarchive update parameters. Model should be re-compiled." UserInfo={NSLocalizedDescription=Failed to unarchive update parameters. Model should be re-compiled.}
Hi everyone,
I’m currently exploring the use of Foundation models on Apple platforms to build a chatbot-style assistant within an app. While the integration part is straightforward using the new FoundationModel APIs, I’m trying to figure out how to control the assistant’s responses more tightly — particularly:
Ensuring the assistant adheres to a specific tone, context, or domain (e.g. hospitality, healthcare, etc.)
Preventing hallucinations or unrelated outputs
Constraining responses based on app-specific rules, structured data, or recent interactions
I’ve experimented with prompt, systemMessage, and few-shot examples to steer outputs, but even with carefully generated prompts, the model occasionally produces incorrect or out-of-scope responses.
Additionally, when using multiple tools, I'm unsure how best to structure the setup so the model can select the correct pathway/tool and respond appropriately. Is there a recommended approach to guiding the model's decision-making when several tools or structured contexts are involved?
Looking forward to hearing your thoughts or being pointed toward related WWDC sessions, Apple docs, or sample projects.
Hello everyone,
I’m looking for guidance regarding my app review timeline, as things seem unusually delayed compared to previous submissions.
My iOS app was rejected on November 19th due to AI-related policy questions.
I immediately responded to the reviewer with detailed explanations covering:
Model used (Gemini Flash 2.0 / 2.5 Lite)
How the AI only generates neutral, non-directive reflective questions
How the system prevents any diagnosis, therapy-like behavior or recommendations
Crisis-handling limitations
Safety safeguards at generation and UI level
Internal red-team testing and results
Data retention, privacy, and non-use of data for model training
After sending the requested information, I resubmitted the build on November 19th at 14:40.
Since then:
November 20th (7:30) → Status changed to In Review.
November 21st, 22nd, 23rd, 24th, 25th → No movement, still In Review.
My open case on App Store Connect is still pending without updates.
Because of the previous rejection, I expected a short delay, but this is now 5 days total and 3 business days with no progress, which feels longer than usual for my past submissions.
I’m not sure whether:
My app is in a secondary review queue due to the AI-related rejection,
The reviewer is waiting for internal clarification,
Or if something is stuck and needs to be escalated.
I don’t want to resubmit a new build unless necessary, since that would restart the queue.
Could someone from the community (or Apple, if possible) confirm whether this waiting time is normal after an AI-policy rejection?
And is there anything I should do besides waiting — for example, contacting Developer Support again or requesting a follow-up?
Thank you very much for your help. I appreciate any insight from others who have experienced similar delays.
I’m considering creating an ILMessageFilterExtension using a mini LLM/SLM to detect fraud and I’ve read it has strict memory limits yet I can’t find it in the documentation. What’s the set limit or any other constraints impacting the feasibility of running 100-500mb model?
Topic:
Machine Learning & AI
SubTopic:
Core ML
Hi everyone,
I believe I’ve encountered a potential bug or a hardware alignment limitation in the Core ML Framework / ANE Runtime specifically affecting the new Stateful API (introduced in iOS 18/macOS 15).
The Issue:
A Stateful mlprogram fails to run on the Apple Neural Engine (ANE) if the state tensor dimensions (specifically the width) are not a multiple of 32. The model works perfectly on CPU and GPU, but fails on ANE both during runtime and when generating a Performance Report in Xcode.
Error Message in Xcode UI:
"There was an error creating the performance report Unable to compute the prediction using ML Program. It can be an invalid input data or broken/unsupported model."
Observations:
Case A (Fails): State shape = (1, 3, 480, 270). Prediction fails on ANE.
Case B (Success): State shape = (1, 3, 480, 256). Prediction succeeds on ANE.
This suggests an internal memory alignment or tiling issue within the ANE driver when handling Stateful buffers that don't meet the 32-pixel/element alignment.
Reproduction Code (PyTorch + coremltools):
import torch.nn as nn
import coremltools as ct
import numpy as np
class RNN_Stateful(nn.Module):
def __init__(self, hidden_shape):
super(RNN_Stateful, self).__init__()
# Simple conv to update state
self.conv1 = nn.Conv2d(3 + hidden_shape[1], hidden_shape[1], kernel_size=3, padding=1)
self.conv2 = nn.Conv2d(hidden_shape[1], 3, kernel_size=3, padding=1)
self.register_buffer("hidden_state", torch.ones(hidden_shape, dtype=torch.float16))
def forward(self, imgs):
self.hidden_state = self.conv1(torch.cat((imgs, self.hidden_state), dim=1))
return self.conv2(self.hidden_state)
# h=480, w=255 causes ANE failure. w=256 works.
b, ch, h, w = 1, 3, 480, 255
model = RNN_Stateful((b, ch, h, w)).eval()
traced_model = torch.jit.trace(model, torch.randn(b, 3, h, w))
mlmodel = ct.convert(
traced_model,
inputs=[ct.TensorType(name="input_image", shape=(b, 3, h, w), dtype=np.float16)],
outputs=[ct.TensorType(name="output", dtype=np.float16)],
states=[ct.StateType(wrapped_type=ct.TensorType(shape=(b, ch, h, w), dtype=np.float16), name="hidden_state")],
minimum_deployment_target=ct.target.iOS18,
convert_to="mlprogram"
)
mlmodel.save("rnn_stateful.mlpackage")
Steps to see the error:
Open the generated .mlpackage in Xcode 16.0+.
Go to the Performance tab and run a test on a device with ANE (e.g., iPhone 15/16 or M-series Mac).
The report will fail to generate with the error mentioned above.
Environment:
OS: macOS 15.2
Xcode: 16.3
Hardware: M4
Has anyone else encountered this 32-pixel alignment requirement for StateType tensors on ANE? Is this a known hardware constraint or a bug in the Core ML runtime?
Any insights or workarounds (other than manual padding) would be appreciated.
I don't know if these forums are any good for rumors or plans, but does anybody know whether or not Apple plans to release a library for training reinforcement learning? It would be handy, implementing games in Swift, for example, to be able to train the computer players on the same code.
Recursive and Self-Referential Data Structures
Combining recursive and self-referential data structures with frameworks like Accelerate, SwiftMacros, and utilizing SwiftUI hooks can offer significant benefits in terms of performance, maintainability, and expressiveness. Here is how Apple Intelligence breaks it down.
Benefits:
Natural Representation of Complex Data:
Recursive structures, such as trees and graphs, are ideal for representing hierarchical or interconnected data, like file systems, social networks, and DOM trees.
Simplified Algorithms:
Many algorithms, such as traversals, sorting, and searching, are more straightforward and elegant when implemented using recursion.
Dynamic Memory Management:
Self-referential structures can dynamically grow and shrink, making them suitable for applications with unpredictable data sizes.
Challenges:
Performance Overhead:
Recursive algorithms can lead to stack overflow if not properly optimized (e.g., using tail recursion).
Self-referential structures can introduce memory management challenges, such as retain cycles.
Accelerate Framework
Benefits:
High-Performance Computation:
Accelerate provides optimized libraries for numerical and scientific computing, including linear algebra, FFT, and image processing.
It can significantly speed up computations, especially for large datasets, by leveraging multi-core processors and GPU acceleration.
Parallel Processing:
Accelerate automatically parallelizes operations, making it easier to take advantage of modern hardware capabilities.
Integration with Recursive Data:
Matrix and Vector Operations:
Use Accelerate for operations on matrices and vectors, which are common in recursive algorithms like those used in machine learning and physics simulations.
FFT and Convolutions:
Accelerate's FFT functions can be used in recursive algorithms for signal processing and image analysis.
SwiftMacros
Benefits:
Code Generation and Transformation:
SwiftMacros allow you to generate and transform code at compile time, enabling the creation of DSLs, boilerplate reduction, and optimization.
Improved Compile-Time Checks:
Macros can perform complex compile-time checks, ensuring code correctness and reducing runtime errors.
Integration with Recursive Data:
DSL for Data Structures:
Create a DSL using SwiftMacros to define recursive data structures concisely and safely.
Optimization:
Use macros to generate optimized code for recursive algorithms, such as memoization or iterative transformations.
SwiftUI Hooks
Benefits:
State Management:
Hooks like @State, @Binding, and @Effect simplify state management in SwiftUI, making it easier to handle dynamic data.
Side Effects:
@Effect allows you to perform side effects in a declarative manner, integrating seamlessly with asynchronous operations.
Reusable Logic:
Custom hooks enable the reuse of stateful logic across multiple views, promoting code maintainability.
Integration with Recursive Data:
Dynamic Data Binding:
Use SwiftUI's data binding to manage the state of recursive data structures, ensuring that UI updates reflect changes in the underlying data.
Efficient Rendering:
SwiftUI's diffing algorithm efficiently updates the UI only for the parts of the recursive structure that have changed, improving performance.
Asynchronous Data Loading:
Combine @Effect with recursive data structures to fetch and process data asynchronously, such as loading a tree structure from a remote server.
Example: Combining All Components
Imagine you're building an app that visualizes a hierarchical file system using a recursive tree structure. Here's how you might combine these components:
Define the Recursive Data Structure:
Use SwiftMacros to create a DSL for defining tree nodes.
@macro
struct TreeNode {
var value: T
var children: [TreeNode]
}
Optimize with Accelerate:
Use Accelerate for operations like computing the size of the tree or performing transformations on node values.
func computeTreeSize(_ node: TreeNode) -> Int {
return node.children.reduce(1) { $0 + computeTreeSize($1) }
}
Manage State with SwiftUI Hooks:
Use SwiftUI hooks to load and display the tree structure dynamically.
struct FileSystemView: View {
@State private var rootNode: TreeNode = loadTree()
var body: some View {
TreeView(node: rootNode)
}
private func loadTree() -> TreeNode<String> {
// Load or generate the tree structure
}
}
struct TreeView: View {
let node: TreeNode
var body: some View {
List(node.children, id: \.value) {
Text($0.value)
TreeView(node: $0)
}
}
}
Perform Side Effects with @Effect:
Use @Effect to fetch data asynchronously and update the tree structure.
struct FileSystemView: View {
@State private var rootNode: TreeNode = TreeNode(value: "/")
@Effect private var loadTreeEffect: () -> Void = {
// Fetch data from a server or database
}
var body: some View {
TreeView(node: rootNode)
.onAppear { loadTreeEffect() }
}
}
By combining recursive data structures with Accelerate, SwiftMacros, and SwiftUI hooks, you can create powerful, efficient, and maintainable applications that handle complex data with ease.
Topic:
Machine Learning & AI
SubTopic:
Foundation Models