I’m seeing consistent failures using SoundAnalysis live classification when my app moves to the background.
Setup
iOS 17.x
AVAudioEngine mic capture
SNAudioStreamAnalyzer
SNClassifySoundRequest(classifierIdentifier: .version1)
UIBackgroundModes = audio
AVAudioSession .record / .playAndRecord, active
Audio capture + level metering continue working in background (mic indicator stays on)
Issue
As soon as the app enters background / screen locks:
SoundAnalysis starts failing every second with domain: com.apple.SoundAnalysis, code: 2 (SNErrorCode.operationFailed)
Audio capture itself continues normally
When the app returns to foreground, classification immediately resumes without restarting the engine/analyzer
Question
Is live background sound classification with the built-in SoundAnalysis classifier officially unsupported or known to fail in background?
If so, is a custom Core ML model the only supported approach for background detection?
Or is there a required configuration I’m missing to keep SNClassifySoundRequest(.version1) running in background?
Thanks for any clarification.
Explore the power of machine learning and Apple Intelligence within apps. Discuss integrating features, share best practices, and explore the possibilities for your app here.
When the system language and the Siri language are not the same, Apple Intelligence may not be usable.
For example, if the system is set to English and Siri is set to Chinese, Apple Intelligence may not work.
Are there other reasons why Apple Intelligence still cannot be used inside an app even after it has been enabled?
The documentation for the Create ML tool ("Building an object detector data source") mentions that there are options for using normalized values instead of pixels and also different anchor point origins ("MLBoundingBoxCoordinatesOrigin") instead of always using "center". However, the JSON format for these does not appear in any examples. Does anyone know the format for these options?
Topic: Machine Learning & AI
SubTopic: Create ML
Hello everyone,
I’m looking for guidance regarding my app review timeline, as things seem unusually delayed compared to previous submissions.
My iOS app was rejected on November 19th due to AI-related policy questions.
I immediately responded to the reviewer with detailed explanations covering:
Model used (Gemini Flash 2.0 / 2.5 Lite)
How the AI only generates neutral, non-directive reflective questions
How the system prevents any diagnosis, therapy-like behavior or recommendations
Crisis-handling limitations
Safety safeguards at generation and UI level
Internal red-team testing and results
Data retention, privacy, and non-use of data for model training
After sending the requested information, I resubmitted the build on November 19th at 14:40.
Since then:
November 20th (7:30) → Status changed to In Review.
November 21st, 22nd, 23rd, 24th, 25th → No movement, still In Review.
My open case on App Store Connect is still pending without updates.
Because of the previous rejection, I expected a short delay, but this is now 5 days total and 3 business days with no progress, which feels longer than usual for my past submissions.
I’m not sure whether:
My app is in a secondary review queue due to the AI-related rejection,
The reviewer is waiting for internal clarification,
Or if something is stuck and needs to be escalated.
I don’t want to resubmit a new build unless necessary, since that would restart the queue.
Could someone from the community (or Apple, if possible) confirm whether this waiting time is normal after an AI-policy rejection?
And is there anything I should do besides waiting — for example, contacting Developer Support again or requesting a follow-up?
Thank you very much for your help. I appreciate any insight from others who have experienced similar delays.
I'm using Vision framework (DetectFaceLandmarksRequest) with the same code and the same test image to detect face landmarks. On iOS 18 everything works as expected: detected face landmarks align with the face correctly.
But when I run the same code on devices with iOS 26, the landmark coordinates are outside the [0,1] range, which indicates they are out of face bounds.
Fun fact: the old VNDetectFaceLandmarksRequest API works very well and does not show this issue.
How I get face landmarks:
private let faceRectangleRequest = DetectFaceRectanglesRequest(.revision3)
private var faceLandmarksRequest = DetectFaceLandmarksRequest(.revision3)

func detectFaces(in ciImage: CIImage) async throws -> FaceTrackingResult {
    let faces = try await faceRectangleRequest.perform(on: ciImage)
    faceLandmarksRequest.inputFaceObservations = faces
    let landmarksResults = try await faceLandmarksRequest.perform(on: ciImage)
    ...
}
How I show face landmarks in SwiftUI View:
private func convert(
    point: NormalizedPoint,
    faceBoundingBox: NormalizedRect,
    imageSize: CGSize
) -> CGPoint {
    let point = point.toImageCoordinates(
        from: faceBoundingBox,
        imageSize: imageSize,
        origin: .upperLeft
    )
    return point
}
At the same time, the following works as expected and gives me the correct results (region is a FaceObservation.Landmarks2D.Region):
let points: [CGPoint] = region.pointsInImageCoordinates(
    imageSize,
    origin: .upperLeft
)
After that, I found that the landmarks are normalized relative to the unalignedBoundingBox. However, I can’t access it in code. Still, using these values for the bounding box works correctly.
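The relationship between a box-relative landmark and its image position can be sketched with plain arithmetic. This is only an illustration and not Vision's implementation: the helper name imagePoint(faceNormalized:faceBox:imageSize:) is hypothetical, and it assumes that both the landmark and the bounding box are normalized with a lower-left origin, while the output uses an upper-left origin as in the snippets above.

```swift
import Foundation

/// Hypothetical helper (not a Vision API): maps a landmark that is normalized
/// to a face bounding box into image pixel coordinates with an upper-left origin.
/// Assumption: `p` and `faceBox` both use a lower-left origin and a 0...1 range.
func imagePoint(faceNormalized p: CGPoint, faceBox: CGRect, imageSize: CGSize) -> CGPoint {
    // Step 1: box-relative -> image-normalized (still lower-left origin).
    let nx = faceBox.origin.x + p.x * faceBox.size.width
    let ny = faceBox.origin.y + p.y * faceBox.size.height
    // Step 2: scale to pixels and flip the y axis for an upper-left origin.
    return CGPoint(x: nx * imageSize.width, y: (1 - ny) * imageSize.height)
}

// The center of a box covering the middle half of a 100 x 200 image.
let center = imagePoint(
    faceNormalized: CGPoint(x: 0.5, y: 0.5),
    faceBox: CGRect(x: 0.25, y: 0.25, width: 0.5, height: 0.5),
    imageSize: CGSize(width: 100, height: 200)
)
print(center.x, center.y)
```

In this example the point lands at x = 50, y = 100. The same arithmetic also shows why the choice of box matters: if the landmarks are normalized to one box but converted with another, the result is shifted and scaled, and can end up outside the face.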
Things I've already tried:
Same image input
Tested multiple devices on iOS 26.2 -> always wrong.
Tested multiple devices on iOS 18.7.1 -> always correct.
Environment:
macOS 26.2
Xcode 26.2 (17C52)
Real devices, not simulator
[Image: Face Landmarks iOS 18]
[Image: Face Landmarks iOS 26]
I have seen inconsistent results for my Colab machine learning notebooks when running locally on a Mac M4, compared to running the same notebook code on either a T4 (in Colab) or an RTX 3090 locally.
To illustrate the problem, I have set up a notebook that implements two simple CNN models that solve the Fashion-MNIST problem. https://colab.research.google.com/drive/11BhtHhN079-BWqv9QvvcSD9U4mlVSocB?usp=sharing
For the good model with 2M parameters I get the following results:
T4 (Colab, JAX): Test accuracy: 0.925
3090 (Local PC via ssh tunnel, JAX): Test accuracy: 0.925
Mac M4 (Local, JAX): Test accuracy: 0.893
Mac M4 (Local, TensorFlow): Test accuracy: 0.893
That is, I see a significant drop in accuracy when I run on the Mac M4 compared to the NVIDIA machines, and it seems to be independent of the backend. However, I do not know how to pinpoint this to either Keras or Apple’s Metal implementation. I have reported this to Keras: https://colab.research.google.com/drive/11BhtHhN079-BWqv9QvvcSD9U4mlVSocB?usp=sharing but as this can be (likely is?) an Apple Metal issue, I wanted to report it here as well.
On the Mac I am running the following Python libraries:
keras 3.9.1
tensorflow 2.19.0
tensorflow-metal 1.2.0
jax 0.5.3
jax-metal 0.1.1
jaxlib 0.5.3
Topic: Machine Learning & AI
SubTopic: General
I'm on macOS Tahoe 26.1 on an M3 MacBook Air. I'm using VNDetectFaceRectanglesRequest as properly as possible, as in the minimal command-line program attached below. For some reason, I always get:
MLE5Engine is disabled through the configuration
printed. I couldn't find any notes in the developer docs saying that VNDetectFaceRectanglesRequest cannot use the Apple Neural Engine. I'm assuming there is something wrong with my code, but I wasn't able to find any remarks in the documentation about where the problem might be. I wasn't able to find the above error message online either. I would appreciate your help a lot, and thank you in advance.
The code below accesses the video from AVCaptureDevice.DeviceType.builtInWideAngleCamera. Currently it directly chooses the 0th format, which has the largest resolution (Full HD on my M3 MBA) and 4:2:0 chroma-subsampled, video-range encoding ("420v").
After accessing video, it performs a VNDetectFaceRectanglesRequest. It prints "VNDetectFaceRectanglesRequest completion Handler called" many times, then prints the error message above, then continues printing "VNDetectFaceRectanglesRequest completion Handler called" until the user quits it.
To run it in Xcode: File > New project > Mac command line tool. Paste the code below, then click on the root file > Targets > Signing & Capabilities > Hardened Runtime > Resource Access > Camera.
A possible explanation could be that either Apple's internal Core ML code for this function works on GPU/CPU only, or it doesn't accept 420v as supplied by the MacBook Air camera.
import AVKit
import Vision

var videoDataOutput: AVCaptureVideoDataOutput = AVCaptureVideoDataOutput()
var detectionRequests: [VNDetectFaceRectanglesRequest]?
var videoDataOutputQueue: DispatchQueue = DispatchQueue(label: "queue")

class XYZ: /*NSViewController or NSObject*/NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    func viewDidLoad() {
        //super.viewDidLoad()
        let session = AVCaptureSession()
        let inputDevice = try! self.configureFrontCamera(for: session)
        self.configureVideoDataOutput(for: inputDevice.device, resolution: inputDevice.resolution, captureSession: session)
        self.prepareVisionRequest()
        session.startRunning()
    }

    fileprivate func highestResolution420Format(for device: AVCaptureDevice) -> (format: AVCaptureDevice.Format, resolution: CGSize)? {
        let deviceFormat = device.formats[0]
        print(deviceFormat)
        let dims = CMVideoFormatDescriptionGetDimensions(deviceFormat.formatDescription)
        let resolution = CGSize(width: CGFloat(dims.width), height: CGFloat(dims.height))
        return (deviceFormat, resolution)
    }

    fileprivate func configureFrontCamera(for captureSession: AVCaptureSession) throws -> (device: AVCaptureDevice, resolution: CGSize) {
        let deviceDiscoverySession = AVCaptureDevice.DiscoverySession(deviceTypes: [AVCaptureDevice.DeviceType.builtInWideAngleCamera], mediaType: .video, position: AVCaptureDevice.Position.unspecified)
        let device = deviceDiscoverySession.devices.first!
        let deviceInput = try! AVCaptureDeviceInput(device: device)
        captureSession.addInput(deviceInput)
        let highestResolution = self.highestResolution420Format(for: device)!
        try! device.lockForConfiguration()
        device.activeFormat = highestResolution.format
        device.unlockForConfiguration()
        return (device, highestResolution.resolution)
    }

    fileprivate func configureVideoDataOutput(for inputDevice: AVCaptureDevice, resolution: CGSize, captureSession: AVCaptureSession) {
        videoDataOutput.setSampleBufferDelegate(self, queue: videoDataOutputQueue)
        captureSession.addOutput(videoDataOutput)
    }

    fileprivate func prepareVisionRequest() {
        let faceDetectionRequest: VNDetectFaceRectanglesRequest = VNDetectFaceRectanglesRequest(completionHandler: { (request, error) in
            print("VNDetectFaceRectanglesRequest completion Handler called")
        })
        // Start with detection
        detectionRequests = [faceDetectionRequest]
    }

    // MARK: AVCaptureVideoDataOutputSampleBufferDelegate
    // Handle delegate method callback on receiving a sample buffer.
    public func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
        var requestHandlerOptions: [VNImageOption: AnyObject] = [:]
        let cameraIntrinsicData = CMGetAttachment(sampleBuffer, key: kCMSampleBufferAttachmentKey_CameraIntrinsicMatrix, attachmentModeOut: nil)
        if cameraIntrinsicData != nil {
            requestHandlerOptions[VNImageOption.cameraIntrinsics] = cameraIntrinsicData
        }
        let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)!
        // No tracking object detected, so perform initial detection
        let imageRequestHandler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer,
                                                        orientation: CGImagePropertyOrientation.up, options: requestHandlerOptions)
        try! imageRequestHandler.perform(detectionRequests!)
    }
}

let X = XYZ()
X.viewDidLoad()
sleep(9999999)
I got 3203.23 GFLOPS (FP16) on the M3 MacBook Pro and only 2833.24 GFLOPS (FP16) on the M4 MacBook Air for 4096x4096 matrix multiplications in a PyTorch MPS FP16 benchmark. Wasn't the performance supposed to be twice as high on the M4 compared to the M3, even with the thermal throttling on the MacBook Air? What went wrong?
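For reference, here is a small sketch of how a GFLOPS figure for square matrix multiplication is usually derived, and what the two reported numbers imply per multiplication. It assumes the common convention of 2 * n^3 floating-point operations per n x n matmul; the post does not say how its benchmark counted operations, and the function name matmulGFLOPS is hypothetical.

```swift
import Foundation

/// Hypothetical helper: throughput in GFLOPS for `iterations` square matmuls
/// of size n x n that took `seconds` in total.
/// Assumption: one n x n matmul costs 2 * n^3 floating-point operations.
func matmulGFLOPS(n: Int, iterations: Int, seconds: Double) -> Double {
    let flopsPerMatmul = 2.0 * pow(Double(n), 3)
    return flopsPerMatmul * Double(iterations) / seconds / 1e9
}

// Work in one 4096 x 4096 matmul, expressed in GFLOP (one matmul in one second).
let gflopPerMatmul = matmulGFLOPS(n: 4096, iterations: 1, seconds: 1.0)

// Figures reported in the post.
let m3 = 3203.23
let m4 = 2833.24

print("GFLOP per matmul:", gflopPerMatmul)
print("Implied ms per matmul on M3:", gflopPerMatmul / m3 * 1000)
print("Implied ms per matmul on M4:", gflopPerMatmul / m4 * 1000)
print("M4 / M3 throughput ratio:", m4 / m3)
```

Under that assumption, one multiplication is about 137.4 GFLOP, so the reported figures correspond to roughly 42.9 ms per matmul on the M3 MacBook Pro and 48.5 ms on the M4 MacBook Air, a ratio of about 0.88.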
When I ran the following code on a physical iPhone device that supports Apple Intelligence, I encountered the following error log.
What does this internal error code mean?
Image generation failed with NSError in a different domain: Error Domain=ImagePlaygroundInternal.ImageGeneration.GenerationError Code=11 “(null)”, returning a generic error instead
let imageCreator = try await ImageCreator()
let style = imageCreator.availableStyles.first ?? .animation
let stream = imageCreator.images(for: [.text("cat")], style: style, limit: 1)
for try await result in stream { // error: ImagePlayground.ImageCreator.Error.creationFailed
    _ = result.cgImage
}
I’m trying to follow Apple’s “WWDC24: Bring your machine learning and AI models to Apple Silicon” session to convert the Mistral-7B-Instruct-v0.2 model into a Core ML package, but I’ve run into a roadblock that I can’t seem to overcome. I’ve uploaded my full conversion script here for reference:
https://pastebin.com/T7Zchzfc
When I run the script, it progresses through tracing and MIL conversion but then fails at the backend_mlprogram stage with this error:
https://pastebin.com/fUdEzzKM
The core of the error is:
ValueError: Op "keyCache_tmp" (op_type: identity) Input x="keyCache" expects list, tensor, or scalar but got state[tensor[1,32,8,2048,128,fp16]]
I’ve registered my KV-cache buffers in a StatefulMistralWrapper subclass of nn.Module, matching the keyCache and valueCache state names in my ct.StateType definitions, but Core ML’s backend pass reports the state tensor as an invalid input. I’m using Core ML Tools 8.3.0 on Python 3.9.6, targeting iOS 18, and forcing CPU conversion (MPS wasn’t available). Any pointers on how to satisfy the handle_unused_inputs pass or properly declare/cache state for GQA models in Core ML would be greatly appreciated!
Thanks in advance for your help,
Usman Khan
Topic: Machine Learning & AI
SubTopic: Core ML
Tags: Metal, Metal Performance Shaders, Core ML, tensorflow-metal
The specific context is that I would like to build an agent that monitors my phone call (with customer support, for example), simply identifies whether or not I'm still on hold, and notifies me when I'm not.
After reading the docs, I don't think it's possible yet, but I'm so annoyed by customer support calls that I'm willing to go the distance and see if there's any way.
I'm adding Visual Intelligence support to my app, and now want to add a Tip using TipKit to guide users to this feature from within my app. I want to add a Rule to my Tip so that it is shown only on devices where Visual Intelligence is supported (e.g., not on iPhone 14 Pro Max).
What is the best way for me to determine availability to set this TipKit rule?
Here's the documentation I'm following for Visual Intelligence: https://developer.apple.com/documentation/visualintelligence/integrating-your-app-with-visual-intelligence
Hi everyone
I'm currently developing an object detection model that should identify up to seven classes in an image. While I usually do development with plain Python and the ultralytics library, I thought I would give Create ML a shot. The experience is actually very nice, except that the model does not seem to be using the ANE or GPU (MPS) for accelerated training.
On https://developer.apple.com/machine-learning/create-ml/ it states: "On-device training Train models blazingly fast right on your Mac while taking advantage of CPU and GPU."
Am I doing something wrong?
I'm running the training on:
Apple M1 Pro 16GB
macOS 26.1 (Tahoe)
Xcode 26.1 (Build version 17B55)
It would be super nice to get some feedback or instructions.
Thank you in advance!
Hi team,
We have implemented a writing tool inside a WebView that allows users to type content in a textarea. When the "Show Writing Tools" button is clicked, an AI-powered editor opens. After clicking the "Rewrite" button, the AI modifies the text. However, when clicking the "Replace" button, the rewritten text does not update the original textarea.
Kindly check and help me.
showButton.addTarget(self, action: #selector(showWritingTools(_:)), for: .touchUpInside)
@available(iOS 18.2, *)
optional func showWritingTools(_ sender: Any)
Note: The same case works in a TextView.
Please find the attachment.
I'm trying the new RecognizeDocumentsRequest, which is supposed to detect paragraphs (among other things) in a document.
I tried many source images, and I don't see the slightest difference compared to the old API, (VN)RecognizeTextRequest.
Is it not supposed to work yet, or is it in beta?
Hi Apple Engineers,
I am experiencing a potential memory management bug with CoreML on M1 Mac (32GB Unified Memory).
When processing long video files (approx. 12,000 frames) using a CoreML execution provider, the system often completes the 'Analysing' phase but fails to transition into 'Processing'. It simply exits silently or hits an import error (scipy).
However, if I split the same task into small 20-frame segments, it works perfectly at high speeds (~40 FPS). This suggests the hardware is capable, but there is an issue with memory fragmentation or resource cleanup during long-running CoreML sessions.
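The workaround described above, splitting one long job into fixed-size frame segments, can be sketched as follows. This only illustrates the segmentation itself: the helper name frameSegments is hypothetical, and the actual per-segment processing (whatever tool or execution provider is used) is not shown.

```swift
/// Hypothetical helper: splits `totalFrames` frames into consecutive
/// half-open index ranges of at most `segmentLength` frames each.
func frameSegments(totalFrames: Int, segmentLength: Int) -> [Range<Int>] {
    precondition(totalFrames >= 0 && segmentLength > 0, "sizes must be valid")
    return stride(from: 0, to: totalFrames, by: segmentLength).map { start in
        start..<min(start + segmentLength, totalFrames)
    }
}

// Roughly the numbers from the post: 12,000 frames in 20-frame segments.
let segments = frameSegments(totalFrames: 12_000, segmentLength: 20)
print("Number of segments:", segments.count)

for segment in segments.prefix(3) {
    // Placeholder: run the per-segment work here and release its resources
    // before moving on to the next segment.
    print("Would process frames \(segment.lowerBound)..<\(segment.upperBound)")
}
```

With these numbers the job becomes 600 independent segments, so each run starts from a fresh state rather than accumulating resources over the whole video.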
Is there a way to force a VRAM/Unified Memory flush via CLI, or is this a known limitation for large frame indexing?
I watched this year's WWDC25 session "Read Documents using the Vision framework". At the end of the video there is a mention of a new DetectHandPoseRequest model for hand pose detection in the Vision API.
I looked at Apple's documentation and I don't see a new revision. There is also probably a typo in the video, because the only requests are DetectHumanHandPoseRequest (Swift-based) and VNDetectHumanHandPoseRequest (Objective-C-based); note the missing "Human" prefix in the WWDC video.
The first one only has a revision added in iOS 18+:
https://developer.apple.com/documentation/vision/detecthumanhandposerequest/revision-swift.enum/revision1
The second one only has a revision added in iOS 14+:
https://developer.apple.com/documentation/vision/vndetecthumanhandposerequestrevision1
I don't see any new revision targeting iOS 26+.
Hello,
Do you have any information on the handling of sparse matrices with MPS and PyTorch? Is there a release date? ...
Hi,
I am new to developing on Apple’s platforms, but I want to familiarize myself with Core ML and Core ML Tools. I was watching the WWDC24: Bring your machine learning and AI models to Apple Silicon video and trying to follow along. After multiple attempts and much reading of the documentation, I am still unable to get a coherent script running that converts the Mistral model the host used into a valid Core ML model.
Here is a pastebin of what I have currently:
https://pastebin.com/04cVjF1v
If you require the output as well, please let me know.
Hi all! Nice to meet you.
I am planning to build an iOS application that can:
Capture an image using the camera or select one from the gallery.
Remove the background and keep only the detected main object.
Add a border (outline) around the detected object’s shape.
Apply an animation along that border (e.g., moving light or glowing effect).
Include a transition animation when removing the background — for example, breaking the background into pieces as it disappears.
The app Capword has a similar feature for object isolation, and I’d like to build something like that.
Could you please provide any guidance, frameworks, or sample code related to:
Object segmentation and background removal in Swift (Vision or Core ML).
Applying custom borders and shape animations around detected objects.
Recognizing the object name (e.g., “person”, “cat”, “car”) after segmentation.
Thank you very much for your support.
Best regards,
SINN SOKLYHOR