How does a third-party developer go about supporting the new Enhanced Dialogue option for video apps in tvOS 18?
If an app uses the standard AVPlayerViewController, I had assumed it would be a simple-ish matter of building against the tvOS 18 SDK, but apparently not: the options don't appear, not even greyed out.
Audio
Dive into the technical aspects of audio on your device, including codecs, format support, and customization options.
Using the official SwiftTranscriptionSampleApp from WWDC 2025, speech transcription takes 14+ seconds from audio input to first result, making it unusable for real-time applications.
Environment
iOS: 26.0 Beta
Xcode: Beta 5
Device: iPhone 16 Pro
Sample App: Official Apple SwiftTranscriptionSampleApp from WWDC 2025
Configuration Tested
Locale: en-US (properly allocated with AssetInventory.allocate(locale:)) and es-ES
Setup: All optimizations applied (preheating, high priority, model retention)
I started testing in my own app, aiming to replace the SFSpeech API and add speech detection. After long fights with the documentation (which is quite terrible, TBH), I tested the official example (https://developer.apple.com/documentation/speech/bringing-advanced-speech-to-text-capabilities-to-your-app) and saw the same results.
I added some logs to check the specific time:
🎙️ [20:30:41.532] ✅ Analyzer started successfully - ready to receive audio!
🎙️ [20:30:41.532] Listening for transcription results...
🎙️ [20:30:56.342] 🚀 FIRST TRANSCRIPTION RESULT after 14.810s: 'Hello' (isFinal: false)
Questions
Is this expected performance for iOS 26 Beta? The old SFSpeech API is far faster.
Are there additional optimization steps for SpeechTranscriber?
Should we expect significant performance improvements in later betas?
Hi,
I am creating an app that can include videos or images in its data. While @Attribute(.externalStorage) helps with images, with AVAssets I would actually like access to the URL behind that data (it would be wasteful to load the data and then save it again just to get a URL).
One key component is to keep all of this clean enough so that I can use (private) CloudKit syncing with the resulting model.
All the best
Christoph
Dear Sirs,
I’ve written a virtual audio driver based on AudioDriverKit that runs as a dext in my macOS app. Sometimes, when waking from sleep, the recording side of my driver extension seems to hang and I don’t see any calls to my io_operation callback. A recording app such as a DAW then seems to hang when trying to start a recording. This doesn’t happen after short sleep states or after a complete restart of my MacBook.
I already opened a case in Feedback Assistant on the 5th of May (FB17503622), which also includes a sysdiagnose and a ktrace, but I haven't received any feedback so far. Meanwhile some of our customers are getting angry, and I'd like to know if there's anything I could do to fix this problem on my side.
We’re not sure whether this worked in previous macOS versions. We don’t think we observed it before 15.3.1, but we have seen it in every version from 15.3.1 onwards.
Best regards,
Johannes
Hello,
The search functionality of the coreaudio-api mailing list archive has been broken for a very long time. Several of the lower-level audio APIs have only been discussed on this mailing list, making it critical for those of us maintaining old audio code.
Steps to reproduce:
Open https://lists.apple.com/archives/list/coreaudio-api@lists.apple.com/ in your web browser.
Enter a search term in the "Search this list" field in the top-right corner of the page.
The search will eventually time out with "502 Bad Gateway"
Can somebody please forward this information to the current maintainer? I've tried to contact developer support but they weren't sure what to do.
Thanks!
Topic:
Media Technologies
SubTopic:
Audio
Hi everyone,
I’m testing audio recording on an iPhone 15 Plus using AVFoundation.
Here’s a simplified version of my setup:
let settings: [String: Any] = [
    AVFormatIDKey: Int(kAudioFormatLinearPCM),
    AVSampleRateKey: 8000,
    AVNumberOfChannelsKey: 1,
    AVLinearPCMBitDepthKey: 16,
    AVLinearPCMIsFloatKey: false
]
audioRecorder = try AVAudioRecorder(url: fileURL, settings: settings)
audioRecorder?.record()
When I check the recorded file’s sample rate, it logs:
Actual sample rate: 8000.0
However, when I inspect the hardware sample rate:
let session = AVAudioSession.sharedInstance()
try session.setCategory(.playAndRecord, mode: .default)
try session.setActive(true)
print("Hardware sample rate:", session.sampleRate)
I consistently get:
Hardware sample rate: 48000.0
My questions are:
Is the iPhone mic actually capturing at 8 kHz, or is it recording at 48 kHz and then downsampling to 8 kHz internally?
Is there any way to force the hardware to record natively at 8 kHz?
If not, what’s the recommended approach for telephony-quality audio (true 8 kHz) on iOS devices?
Thanks in advance for your guidance!
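To make the "record at 48 kHz, then downsample to 8 kHz" option from the first question concrete, here is a minimal C++ sketch of 6:1 decimation on mono Int16 samples. The helper name `downsample48kTo8k` is hypothetical, and the plain averaging filter is an assumption chosen for brevity; it is not a claim about what iOS does internally, and a real app would more likely rely on a proper resampler such as AVAudioConverter.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: reduce 48 kHz mono Int16 PCM to 8 kHz by averaging
// each group of 6 consecutive samples (a crude box low-pass plus decimation).
// A trailing group with fewer than 6 samples is dropped.
std::vector<int16_t> downsample48kTo8k(const std::vector<int16_t>& input)
{
    constexpr std::size_t factor = 6; // 48000 / 8000
    std::vector<int16_t> output;
    output.reserve(input.size() / factor);
    for (std::size_t i = 0; i + factor <= input.size(); i += factor) {
        int32_t sum = 0; // wide accumulator so six Int16 values cannot overflow
        for (std::size_t j = 0; j < factor; ++j) {
            sum += input[i + j];
        }
        output.push_back(static_cast<int16_t>(sum / static_cast<int32_t>(factor)));
    }
    return output;
}
```

Averaging is only a rough anti-aliasing filter; it is used here to keep the sketch short, not because it gives telephony-grade quality.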
I am working on an application that detects when an audio input device is in use. Basically, I want to know which application is using the microphone (built-in or external).
This app runs on macOS. For macOS versions starting from Sonoma I can use this code:
#include <CoreAudio/CoreAudio.h>

int getAudioProcessPID(AudioObjectID process)
{
    // Default to -1 so the caller sees a failure if no branch sets the PID.
    pid_t pid = -1;
    if (@available(macOS 14.0, *)) {
        constexpr AudioObjectPropertyAddress prop {
            kAudioProcessPropertyPID,
            kAudioObjectPropertyScopeGlobal,
            kAudioObjectPropertyElementMain
        };
        UInt32 dataSize = sizeof(pid);
        OSStatus error = AudioObjectGetPropertyData(process, &prop, 0, nullptr, &dataSize, &pid);
        if (error != noErr) {
            return -1;
        }
    } else {
        // Pre-Sonoma code goes here
    }
    return pid;
}
which works.
However, kAudioProcessPropertyPID was added in macOS SDK 14.0.
Does anyone know how to achieve the same functionality on previous versions?
I’m working with the Push-to-Talk (PTT) framework and observing a consistent delay when starting audio capture.
Scenario:
A PTT call is already active
The AVAudioSession is fully configured
I request beginTransmission on the PTT channel
I start my Audio Unit for recording (AudioOutputUnitStart)
Observed behavior:
AudioOutputUnitStart takes ~500 ms
This happens whether I start the Audio Unit:
after didBeginTransmission, or
after AVAudioSession didActivate
Comparison:
Using the same Audio Unit, same format, and same configuration
Without the PTT framework, AudioOutputUnitStart takes ~200 ms
Additional notes:
I am not modifying or reconfiguring AVAudioSession when requesting beginTransmission
The audio session is already set up when the PTT call starts
There are no interruptions or route changes at the time of starting the Audio Unit
Impact:
This extra latency is significant for Push-to-Talk use cases where fast transmit start is critical.
My app needs to convert iPhone 12 HDR video to SDR for editing.
I followed the Apple-HDR-Convert doc.
My code sets pixBuffAttributes as follows:
[pixBuffAttributes setObject:(id)(kCVImageBufferYCbCrMatrix_ITU_R_709_2) forKey:(id)kCVImageBufferYCbCrMatrixKey];
[pixBuffAttributes setObject:(id)(kCVImageBufferColorPrimaries_ITU_R_709_2) forKey:(id)kCVImageBufferColorPrimariesKey];
[pixBuffAttributes setObject:(id)kCVImageBufferTransferFunction_ITU_R_709_2 forKey:(id)kCVImageBufferTransferFunctionKey];
playerItemOutput = [[AVPlayerItemVideoOutput alloc] initWithPixelBufferAttributes:pixBuffAttributes];
But when I inspect the output buffer from playerItemOutput:
CFTypeRef colorAttachments = CVBufferGetAttachment(pixelBuffer, kCVImageBufferYCbCrMatrixKey, NULL);
CFTypeRef colorPrimaries = CVBufferGetAttachment(pixelBuffer, kCVImageBufferColorPrimariesKey, NULL);
CFTypeRef colorTransFunc = CVBufferGetAttachment(pixelBuffer, kCVImageBufferTransferFunctionKey, NULL);
NSLog(@"colorAttachments = %@", colorAttachments);
NSLog(@"colorPrimaries = %@", colorPrimaries);
NSLog(@"colorTransFunc = %@", colorTransFunc);
log output:
colorAttachments = ITU_R_2020
colorPrimaries = ITU_R_2020
colorTransFunc = ITU_R_2100_HLG
The output format requested through pixBuffAttributes appears to have no effect. Please help!
Hi,
On macOS I used to open MP3 and MP4 files with ExtAudioFile. For a few years now, this hasn't worked anymore.
So I decided to try a different macOS API, using the AudioFileID of the AudioToolbox framework.
I decided to write a test:
https://gist.github.com/joelkraehemann/7f5b241b52ca38c3a765c138fb647588
It fails right here:
AudioFileOpenWithCallbacks()
It returns OSStatus error 1954115647, which means kAudioFileUnsupportedFileTypeError.
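As a side note on reading that number: many Core Audio status codes are four ASCII characters packed into a 32-bit integer, and 1954115647 unpacks to 'typ?'. The following C++ sketch shows the decoding; the helper name `fourCharCode` is hypothetical and not part of any Apple API.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: interpret a 32-bit status code as four ASCII
// characters, most significant byte first.
std::string fourCharCode(int32_t status)
{
    const uint32_t value = static_cast<uint32_t>(status);
    std::string code(4, '?');
    for (int i = 0; i < 4; ++i) {
        // Shift the i-th byte (counting from the top) down and mask it off.
        code[i] = static_cast<char>((value >> (8 * (3 - i))) & 0xFF);
    }
    return code;
}
```

Not every OSStatus is a printable four-character code, so a result containing non-printable bytes should simply be reported as a number instead.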
The filename was set to an MP4 file:
~/Music/test.mp4
How can I fix this?
regards, Joël
In Instruments, I'm seeing "Zero Time Stamp" events in the "Audio Server" lane.
What does that mean?
There appears to be no method of going forward or backwards in Get Info in the Music application.
I am developing an app with transcription and I am exploring ways to improve the transcription from the SpeechAnalyzer/Transcriber for technical terms. SFSpeech... recognition had the capability of being augmented by contextualStrings. Does something similar exist for SpeechAnalyzer/Transcriber? If so please point me towards the documentation and any sample code that may exist for this. If there are other options, please let me know.
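Does anyone know how to play a specific instrument sound when a button or similar element on screen is tapped on iPhone or iPad?
I am currently building a music-learning app and am thinking of assigning single notes or chords to button-like frames for an on-screen keyboard or fretboard and having them sound.
Can this be achieved with SwiftUI code alone?
I recall that there used to be a feature for playing MIDI Level 1 instruments, but the current OS does not seem to implement an equivalent.
I would appreciate your advice.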
I need to extract PCM data from AAC data, but this API has confused me deeply.
I'm using an iPhone 12 Pro running iOS 26.0.1 with AirPods Pro 3. The Camera app does capture video with what seems to be "Studio Quality Recording".
I'm trying to replicate that SQR in my own Camera-like app. While I can pull audio in from the APP3 mic, and my video capture app is recording a 48,000 Hz high-bitrate video, the audio still sounds non-SQR.
I'm seeing bluetoothA2DP, bluetoothLE, and bluetoothHFP as portType, and I'm not sure whether SQR depends on one of those.
Is there sample code demonstrating an SQR capture? Never mind video and camera, even just audio?
Also, I don't understand what SQR is doing between the APP3 and the iPhone. What codec is that? What bitrate is that? If I capture video using Capture and inspect the audio stream I see mono 74.14 kbit/s MPEG-4 AAC, 48000 Hz. But I assume that's been recompressed and not really giving me any insight into the APP3 H2 transmission?
I have a PCM audio buffer (AVAudioPCMFormatInt16). When I try to play it using AVAudioPlayerNode / AVAudioEngine, an exception is thrown:
"[[busArray objectAtIndexedSubscript:(NSUInteger)element] setFormat:format error:&nsErr]: returned false, error Error Domain=NSOSStatusErrorDomain Code=-10868"
(related thread https://forums.developer.apple.com/forums/thread/700497?answerId=780530022#780530022)
If I convert the buffer to AVAudioPCMFormatFloat32 playback works.
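For reference, the sample-level arithmetic behind such an Int16 to Float32 conversion can be sketched in C++ as below. The helper name `int16ToFloat32` is hypothetical, and scaling by 32768 is one common convention assumed here; it maps the Int16 range onto [-1.0, 1.0) and is not taken from Apple's documentation.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: convert signed 16-bit PCM samples to 32-bit floats
// in the range [-1.0, 1.0) by dividing by 32768 (assumed convention).
std::vector<float> int16ToFloat32(const std::vector<int16_t>& samples)
{
    std::vector<float> out;
    out.reserve(samples.size());
    for (int16_t s : samples) {
        out.push_back(static_cast<float>(s) / 32768.0f);
    }
    return out;
}
```

In an actual app the same conversion would normally be done on the buffers themselves, for example with AVAudioConverter, rather than by hand.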
My questions are:
1. Does AVAudioEngine / AVAudioPlayerNode require AVAudioPCMBuffer to be in the Float32 format? Is there a way I can configure it to accept another format instead for my application?
2. If the answer to 1 is yes, is this documented anywhere?
3. If the answer to 1 is yes, is this required format subject to change at any point?
Thanks!
I was looking to watch the "AVAudioEngine in Practice" session video from WWDC 2014 but I can't find it anywhere (https://forums.developer.apple.com/forums/thread/747008).
Hi everyone,
I’m working on an iOS MusicKit app that overlays a metronome on top of Apple Music playback. To line the clicks up perfectly I’d like access to low-level audio analysis data (ideally a waveform, spectrogram, or beat grid) while the track is playing.
I’ve noticed that several approved DJ apps (e.g. djay, Serato, rekordbox) can already:
• Display detailed scrolling waveforms of Apple Music songs
• Scratch, loop or time-stretch those tracks in real time
That implies they receive decoded PCM frames or at least high-resolution analysis data from Apple Music under a special entitlement.
My questions:
1. Does MusicKit (or any public framework) expose real-time audio buffers, FFT bins, or beat markers for streaming Apple Music content?
2. If not, is there an Apple program or entitlement that developers can apply for, similar to the “DJ with Apple Music” initiative, to gain that deeper access?
3. Where can I find official documentation or a point of contact for this kind of request?
I’ve searched the docs and forums but only see standard MusicKit playback APIs, which don’t appear to expose raw audio for DRM-protected songs. Any guidance, links or insider tips on the proper application process would be hugely appreciated!
Thanks in advance.
I am developing an application with a UVC camera (a webcam) using the AVFoundation library. After running "[self.mCaptureSession startRunning]" I do not receive any buffers, even though I have already set the delegate. Any answer will help.
I work on an iOS app that records video and audio. We've been getting reports for a while from users who are experiencing their video recordings being cut off. After investigating, I found that many users are receiving the AVAudioSessionMediaServicesWereResetNotification (.mediaServicesWereResetNotification) notification while recording. It's associated with the AVFoundationErrorDomain[-11819] error, which seems to indicate that the system audio daemon crashed. We have a handler registered to end the recording, show the user a prompt, and restart our AV sessions. However, from our logs this looks to be happening to hundreds of users every day and it's not an ideal user experience, so I would like to figure out why this is happening and if it's due to something that we're doing wrong.
The debug menu option to trigger the audio session reset is not of much use, because it can't be triggered unless you leave the app and go to system settings, so our app can't be recording video when the debug reset is triggered. So far I haven't found a way to reproduce the issue locally, but I can see from logs that it's happening to users.
I've found some posts online from developers experiencing similar issues, but none of them seem to directly address our issue. The system error doesn't include a userInfo dictionary, and as far as I can tell it's a system daemon crash so any logs would need to be captured from the OS.
Is there any way that I could get more information about what may be causing this error that I may have missed?