Explore the integration of media technologies within your app. Discuss working with audio, video, camera, and other media functionalities.

All subtopics
Posts under Media Technologies topic

Post

Replies

Boosts

Views

Activity

EXT-X-DISCONTINUITY misalignment
We encounter issue with avplayer in case of EXT-X-DISCONTINUITY misalignment between audio and video produced after insertion of gaps. The initial objective is to introduce an EXT-X-DISCONTINUITY in audio playlist after some missing segments (EXT-X-GAP) which durations are aligned to video segments durations, to handle irregular audio durations. Please find below an example of corresponding video and audio playlists: video: #EXTM3U #EXT-X-VERSION:7 #EXT-X-MEDIA-SEQUENCE:872524632 #EXT-X-INDEPENDENT-SEGMENTS #EXT-X-TARGETDURATION:2 #USP-X-TIMESTAMP-MAP:MPEGTS=7096045027,LOCAL=2025-05-09T12:38:32.369100Z #EXT-X-MAP:URI="hls/StreamingBasic-video=979200.m4s" #EXT-X-PROGRAM-DATE-TIME:2025-05-09T12:38:32.369111Z #EXTINF:2.002, no desc hls/StreamingBasic-video=979200-872524632.m4s #EXTINF:2.002, no desc hls/StreamingBasic-video=979200-872524633.m4s #EXTINF:2.002, no desc hls/StreamingBasic-video=979200-872524634.m4s #EXTINF:2.002, no desc hls/StreamingBasic-video=979200-872524635.m4s #EXTINF:2.002, no desc hls/StreamingBasic-video=979200-872524636.m4s ## Media sequence discontinuity #EXT-X-GAP #EXTINF:2.002, no desc hls/StreamingBasic-video=979200-872524637.m4s ## Media sequence discontinuity #EXT-X-GAP #EXTINF:2.002, no desc hls/StreamingBasic-video=979200-872524638.m4s #EXT-X-PROGRAM-DATE-TIME:2025-05-09T12:38:46.383111Z #EXTINF:2.002, no desc hls/StreamingBasic-video=979200-872524639.m4s #EXTINF:2.002, no desc hls/StreamingBasic-video=979200-872524640.m4s audio: EXTM3U #EXT-X-VERSION:7 #EXT-X-MEDIA-SEQUENCE:872524632 #EXT-X-INDEPENDENT-SEGMENTS #EXT-X-TARGETDURATION:2 #USP-X-TIMESTAMP-MAP:MPEGTS=7096045867,LOCAL=2025-05-09T12:38:32.378400Z #EXT-X-MAP:URI="hls/StreamingBasic-audio_99500_eng=98800.m4s" #EXT-X-PROGRAM-DATE-TIME:2025-05-09T12:38:32.378444Z #EXTINF:2.0053, no desc hls/StreamingBasic-audio_99500_eng=98800-872524632.m4s #EXTINF:2.0053, no desc hls/StreamingBasic-audio_99500_eng=98800-872524633.m4s #EXTINF:2.0053, no desc hls/StreamingBasic-audio_99500_eng=98800-872524634.m4s #EXTINF:1.984, no desc hls/StreamingBasic-audio_99500_eng=98800-872524635.m4s #EXTINF:2.0053, no desc hls/StreamingBasic-audio_99500_eng=98800-872524636.m4s ## Media sequence discontinuity #EXT-X-GAP #EXTINF:2.002, no desc hls/StreamingBasic-audio_99500_eng=98800-872524637.m4s ## Media sequence discontinuity #EXT-X-GAP #EXTINF:2.002, no desc hls/StreamingBasic-audio_99500_eng=98800-872524638.m4s #EXT-X-DISCONTINUITY #EXT-X-PROGRAM-DATE-TIME:2025-05-09T12:38:46.778444Z #EXTINF:1.6213, no desc hls/StreamingBasic-audio_99500_eng=98800-872524639.m4s #EXTINF:2.0053, no desc hls/StreamingBasic-audio_99500_eng=98800-872524640.m4s In this case playback is broken with avplayer. Is it conformed to Http Live Streaming? Is it an avplayer bug? What are the guidelines to handle such gaps?
0
0
231
Jul ’25
How to synchronize the clock sources of two audio devices
I created a virtual audio device to capture system audio with a sample rate of 44.1 kHz. After capturing the audio, I forward it to the hardware sound card using AVAudioEngine, also with a sample rate of 44.1 kHz. However, due to the clock sources being unsynchronized, problems occur after a period of playback. How can I retrieve the clock source of the hardware device and set it for the virtual device?
2
0
301
May ’25
Clarification of use of `AVAssetDownloadConfiguration` and `AVAssetDownloadTask` to persist MP3 Audio downloads/streams
Hello! I have been following the UsingAVFoundationToPlayAndPersistHTTPLiveStreams sample code in order to test persisting streams to disk. In addition to support for m3u8, I have noticed in testing that this also seems to work for MP3 Audio, simply by changing the plist entries to point to remote URLs with audio/mpeg content. Is this expected, or are there caveats that I should be aware of? Thanks you!
0
0
75
Apr ’25
FCPX Workflow Extension: Drag & Drop FCPXML with Titles to Timeline Not Working Despite Following File Promise Guidelines
I'm developing a Final Cut Pro X workflow extension that transcribes audio and creates a text output. I need to allow users to drag this text directly from my extension into FCPX's timeline as titles. Current Implementation: Using NSFilePromiseProvider as per Apple's guidelines for drag and drop Generating valid FCPXML (v1.10) with proper structure: Complete resources section with format and asset references Event and project hierarchy Asset clip with connected title elements Proper timing and duration calculations Supporting multiple pasteboard types: com.apple.finalcutpro.xml.v1-10 com.apple.finalcutpro.xml.v1-9 com.apple.finalcutpro.xml What's Working: Drag operation initiates correctly File promise provider is set up properly FCPXML generation is successful (verified content) All required pasteboard types are registered Proper logging confirms data is being requested and provided Current Pasteboard Types Offered: com.apple.NSFilePromiseItemMetaData com.apple.pasteboard.promised-file-name com.apple.pasteboard.promised-suggested-file-name com.apple.pasteboard.promised-file-content-type Apple files promise pasteboard type com.apple.pasteboard.NSFilePromiseID com.apple.pasteboard.promised-file-url com.apple.finalcutpro.xml.v1-10 com.apple.finalcutpro.xml.v1-9 com.apple.finalcutpro.xml What additional requirements or considerations are needed to make FCPX accept the dragged FCPXML content? Are there specific requirements for workflow extensions regarding drag and drop operations with titles that aren't documented? Any insights, especially from those who have implemented similar functionality in FCPX workflow extensions, would be greatly appreciated. Technical Details: macOS Version: 15.5 (24F74) FCPX Version: 11.1.1 Extension built with SwiftUI and AppKit integration Using NSFilePromiseProvider and NSPasteboardItemDataProvider Full pasteboard type support for FCPXML versions
0
0
198
Jul ’25
HDR video & screen brightness
When I play an HDR video in the iPhone Photos app, I can see the HDR effect obviously. But if this HDR video is played continuously for more than 30-40 minutes, the HDR effect will disappear and the brightness will be compressed to the SDR range. This issue will appear on any iPhone. Depending on the phone, it may be 20-30 minutes, or 30-40 minutes, or even a few minutes, such as iPhone 12 mini. Similarly, if I use AVPlayer to play and preview an HDR video, if it plays more than 30-40 minutes, the HDR effect will disappear and the screen brightness will dim. Also the currentEDRHeadroom will gradually decrease to 1 Note, test it with an HDR video longer than 1 hour, and if the video is short, please loop it. My question is how to avoid losing the HDR effect after 30-40 minutes when I use CAMetalLayer to render any HDR video.
1
0
189
Jul ’25
Audio / Video sync issue on iOS using AVSampleBufferRenderSynchronizer
My current app implements a custom video player, based on a AVSampleBufferRenderSynchronizer synchronising two renderers: an AVSampleBufferDisplayLayer receiving decoded CVPixelBuffer-based video CMSampleBuffers, and an AVSampleBufferAudioRenderer receiving decoded lpcm-based audio CMSampleBuffers. The AVSampleBufferRenderSynchronizer is started when the first image (in presentation order) is decoded and enqueued, using avSynchronizer.setRate(_ rate: Float, time: CMTime), with rate = 1 and time the presentation timestamp of the first decoded image. Presentation timestamps of video and audio sample buffers are consistent, and on most streams, the audio and video are correctly synchronized. However on some network streams, on iOS, the audio and video aren't synchronized, with a time difference that seems to increase with time. On the other hand, with the same player code and network streams on macOS, the synchronization always works fine. This reminds me of something I've read, about cases where an AVSampleBufferRenderSynchronizer could not synchronize audio and video, causing them to run with independent and potentially drifting clocks, but I cannot find it again. So, any help / hints on this sync problem will be greatly appreciated! :)
2
0
1.3k
Apr ’25
Background recording app getting killed by watch dog.. how to avoid?
We have the necessary background recording entitlements, and for many users... do not run into any issues. However, there is a subset of users that routinely get recordings ending.. we have narrowed this down and believe it to be the work of the watch dog. First we removed the entire view hierarchy when app is backgrounded. There is just 'Text("Recording")' This got the CPU usage in profiler down to 0%. We saw massive improvements to recording success rate. We walked away assuming that was enough. However we are still seeing the same sort of crashes. All in the background. We're using Observation to drive audio state changes to a Live Activity. Are those Observations causing the problem? Why doesn't apple provide a better API to background audio? The internet is full of weird issues https://stackoverflow.com/questions/76010213/why-is-my-react-native-app-sometimes-terminated-in-the-background-while-tracking https://stackoverflow.com/questions/71656047/why-is-my-react-native-app-terminating-in-the-background-while-recording-ios-r https://github.com/expo/expo/issues/16807 This is such a terrible user experience. And we have very little visibility into what is happening and why. No where in apple documentation states that in order for background recording to work, the app can only be 'Text("Recording")' It does not outline a CPU or memory threshold. It just kills us.
2
0
426
Mar ’25
Visual isTranslatable: NO; reason: observation failure: noObservations, when trying to play custom compositor video with AVPlayer
I am trying to achieve an animated gradient effect that changes values over time based on the current seconds. I am also using AVPlayer and AVMutableVideoComposition along with custom instruction and class to generate the effect. I didn't want to load any video file, but rather generate a custom video with my own set of instructions. I used Metal Compute shaders to generate the effects and make the video to be 20 seconds. However, when I run the code, I get a frozen player with the gradient applied, but when I try to play the video, I get this warning in the console :- Visual isTranslatable: NO; reason: observation failure: noObservations Here is the screenshot :- My entire code :- import AVFoundation import Metal class GradientVideoCompositorTest: NSObject, AVVideoCompositing { var sourcePixelBufferAttributes: [String: Any]? = [ kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA ] var requiredPixelBufferAttributesForRenderContext: [String: Any] = [ kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA ] private var renderContext: AVVideoCompositionRenderContext? private var metalDevice: MTLDevice! private var metalCommandQueue: MTLCommandQueue! private var metalLibrary: MTLLibrary! private var metalPipeline: MTLComputePipelineState! override init() { super.init() setupMetal() } func setupMetal() { guard let device = MTLCreateSystemDefaultDevice(), let queue = device.makeCommandQueue(), let library = try? device.makeDefaultLibrary(), let function = library.makeFunction(name: "gradientShader") else { fatalError("Metal setup failed") } self.metalDevice = device self.metalCommandQueue = queue self.metalLibrary = library self.metalPipeline = try? device.makeComputePipelineState(function: function) } func renderContextChanged(_ newRenderContext: AVVideoCompositionRenderContext) { renderContext = newRenderContext } func startRequest(_ request: AVAsynchronousVideoCompositionRequest) { guard let outputPixelBuffer = renderContext?.newPixelBuffer(), let metalTexture = createMetalTexture(from: outputPixelBuffer) else { request.finish(with: NSError(domain: "com.example.gradient", code: -1, userInfo: nil)) return } var time = Float(request.compositionTime.seconds) renderGradient(to: metalTexture, time: time) request.finish(withComposedVideoFrame: outputPixelBuffer) } private func createMetalTexture(from pixelBuffer: CVPixelBuffer) -> MTLTexture? { var texture: MTLTexture? let width = CVPixelBufferGetWidth(pixelBuffer) let height = CVPixelBufferGetHeight(pixelBuffer) let textureDescriptor = MTLTextureDescriptor.texture2DDescriptor( pixelFormat: .bgra8Unorm, width: width, height: height, mipmapped: false ) textureDescriptor.usage = [.shaderWrite, .shaderRead] CVPixelBufferLockBaseAddress(pixelBuffer, .readOnly) if let textureCache = createTextureCache(), let cvTexture = createCVMetalTexture(from: pixelBuffer, cache: textureCache) { texture = CVMetalTextureGetTexture(cvTexture) } CVPixelBufferUnlockBaseAddress(pixelBuffer, .readOnly) return texture } private func renderGradient(to texture: MTLTexture, time: Float) { guard let commandBuffer = metalCommandQueue.makeCommandBuffer(), let commandEncoder = commandBuffer.makeComputeCommandEncoder() else { return } commandEncoder.setComputePipelineState(metalPipeline) commandEncoder.setTexture(texture, index: 0) var mutableTime = time commandEncoder.setBytes(&mutableTime, length: MemoryLayout<Float>.size, index: 0) let threadsPerGroup = MTLSize(width: 16, height: 16, depth: 1) let threadGroups = MTLSize( width: (texture.width + 15) / 16, height: (texture.height + 15) / 16, depth: 1 ) commandEncoder.dispatchThreadgroups(threadGroups, threadsPerThreadgroup: threadsPerGroup) commandEncoder.endEncoding() commandBuffer.commit() } private func createTextureCache() -> CVMetalTextureCache? { var cache: CVMetalTextureCache? CVMetalTextureCacheCreate(kCFAllocatorDefault, nil, metalDevice, nil, &cache) return cache } private func createCVMetalTexture(from pixelBuffer: CVPixelBuffer, cache: CVMetalTextureCache) -> CVMetalTexture? { var cvTexture: CVMetalTexture? let width = CVPixelBufferGetWidth(pixelBuffer) let height = CVPixelBufferGetHeight(pixelBuffer) CVMetalTextureCacheCreateTextureFromImage( kCFAllocatorDefault, cache, pixelBuffer, nil, .bgra8Unorm, width, height, 0, &cvTexture ) return cvTexture } } class GradientCompositionInstructionTest: NSObject, AVVideoCompositionInstructionProtocol { var timeRange: CMTimeRange var enablePostProcessing: Bool = true var containsTweening: Bool = true var requiredSourceTrackIDs: [NSValue]? = nil var passthroughTrackID: CMPersistentTrackID = kCMPersistentTrackID_Invalid init(timeRange: CMTimeRange) { self.timeRange = timeRange } } func createGradientVideoComposition(duration: CMTime, size: CGSize) -> AVMutableVideoComposition { let composition = AVMutableComposition() let instruction = GradientCompositionInstructionTest(timeRange: CMTimeRange(start: .zero, duration: duration)) let videoComposition = AVMutableVideoComposition() videoComposition.customVideoCompositorClass = GradientVideoCompositorTest.self videoComposition.renderSize = size videoComposition.frameDuration = CMTime(value: 1, timescale: 30) // 30 FPS videoComposition.instructions = [instruction] return videoComposition } #include <metal_stdlib> using namespace metal; kernel void gradientShader(texture2d<float, access::write> output [[texture(0)]], constant float &time [[buffer(0)]], uint2 id [[thread_position_in_grid]]) { float2 uv = float2(id) / float2(output.get_width(), output.get_height()); // Animated colors based on time float3 color1 = float3(sin(time) * 0.8 + 0.1, 0.6, 1.0); float3 color2 = float3(0.12, 0.99, cos(time) * 0.9 + 0.3); // Linear interpolation for gradient float3 gradientColor = mix(color1, color2, uv.y); output.write(float4(gradientColor, 1.0), id); }
1
0
378
Apr ’25
UIDocumentPickerViewController in Audiounit Extension unable to receive touches
Hello, I have an existing AUv3 instrument plugin. In the plug in, users can access files (audio files, song projects) via a UIDocumentPickerViewController In Logic Pro, (and some other hosts, but not all), the document picker is unable to receive touches, while a keyboard case is attached to the iPad. Removing the case (this is an Apple brand iPad case) allows the interactions to resume and allows me to pick files in the usual way. One of my users reports this non-responsive behavior occurs even after disconnecting their keyboard. I have fiddled with entitlements all day, and have determined that is not the issue, since the keyboard disconnection appears to fix it every time for me. Here is my, very boilerplate, presentation code : guard let type = UTType("com.my.type") else { return } let fileBrowser = UIDocumentPickerViewController(forOpeningContentTypes: [type]) fileBrowser.overrideUserInterfaceStyle = .dark fileBrowser.delegate = self fileBrowser.directoryURL = myFileFolderURL() self.present(fileBrowser, animated: true) {
2
0
566
Jul ’25
Essentials of macOS to read and write mp3 and mp4 audio files
Hi, On macOS I used to open MP3 and MP4 files with ExtAudioFile. For a few years it doesn't work anymore. So I decided to try different macOS API using the AudioFileID of AudioToolbox framework. I decided to write a test: https://gist.github.com/joelkraehemann/7f5b241b52ca38c3a765c138fb647588 It fails right here: AudioFileOpenWithCallbacks() By telling OSStatus error 1954115647, which means kAudioFileUnsupportedFileTypeError. The filename was set to an MP4 file: ~/Music/test.mp4 Howto fix this? regards, Joël
1
0
347
Jun ’25
AudioUnit (AUv2) Session Compatibility After Adding MIDI Support
Hi there! We have a suite of AudioUnit v2 plugins that have been shipped for some time as aufx plugins, and we are looking into MIDI-related platform upgrades, so we need a way to update these plugins to request MIDI from Logic (and other AU hosts) but avoid changing our AU type and subtype so we don't break existing sessions. Any ideas on how we can do this?
1
0
120
Mar ’25
WWDC25 Camera & Photos group lab summary (Part 1 of 3)
(Note: this is part 1 of a 3 part posting. See Part 2 or Part 3) At WWDC25 we launched a new type of Lab event for the developer community - Group Labs. A Group Lab is a panel Q&A designed for a large audience of developers. Group Labs are a unique opportunity for the community to submit questions directly to a panel of Apple engineers and designers. Here are the highlights from the WWDC25 Group Lab for Camera & Photos. WWDC25 Camera & Photos group lab ran for one hour at 6 PM PST on Tuesday June 10th, 2025 Introductory kick-off questions Question 1 Tell us a little about the new AVFoundation Capture APIs we've made available in the new iOS 26 developer preview? Cinematic Capture API (strong/weak focus, tracking focus)(scene monitoring)(simulated aperture)(dog/cat heads/groupIDs) Camera Controls and AirPod Stem Clicks Spatial Audio and Studio Quality AirPod Mics in Camera Lens Smudge Detection Exposure and Focus Rect of Interest Question 2 I built QR code scanning into my app, but on newer iPhones I have to hold the phone very far away from the QR code, otherwise the image is blurry and it doesn't scan. Why is this happening and how can I fix it? Every year, the cameras get better and better as we push the state of the art on iPhone photography and videography. This sometimes results in changes to the characteristics of the lenses. min focus distance newer phones have multiple lenses automatic switching behavior Use virtual device like the builtInDualWide or built in Triple, rather than just the builtInWide Set the videoZoomFactor to 2. You're done. Question 3 Last year, we saw some exciting new APIs introduced in AVFoundation in the health space. With Constant Color photography, developers can take pictures that have constant color regardless of ambient lighting. There are some further advancements this year. Davide, could you tell us about them? constant color photography is mean to remove the "tone mapping" applied to photograph captured with camera app, usually incldsuing artistic intent, and instead try to be a close as possible to the real color of the scene, regardless of the illumination constant color images could be captured in HEIF anf JPEG laste year. this year we are adding Support for the DICOM medical imaging photo format. It is a fomrat used by the health industry to store images related to medical subjects like MRI, skin problems, xray and so on. It's writable and also readable format on all OS26, supported through AVCapturePhotoOutput APIs and through the coregraphics api. for coregrapphics there is a new DICOM entry in the property dictionary which includes all the dicom availbale and defined propertie in a file. finder will also display all those in the info panel (Address why a developer would want to use it) - not for regualr picture taking apps. for those HEIF and JPEG are the preferred delivery format. use dicom if your app produces output that are health related, that you can also share with health providers or your doctors Main session developer questions Question 1 LiDAR vs. Dual Camera depth generation: Which resolution does the LiDAR sensor natively have (iPhone 16 Pro) and when to prefer LiDAR over Dual Camera? Both report formats with output resolutions (we don't advertise sensor resolution) Lidar vs Dual, etc: Lidar: Best for absolute depth, real world scale and computer vision Dual, etc: relative, disparity-based, less power, photo effects Also see: 2022 WWDC session "Discovery advancements in iOS camera capture: Depth, focus and multitasking" Question 2 Can true depth and lidar camera run at 60fps? Lidar can do 30fps (edited) Front true depth can do 60fps. Question 3 What’s the first class way to use PhotoKit to reimplement a high performance photo grid? We’ve been using a LazyVGrid and the photos caching manager, but are never able to hit the holy trinity (60hz, efficient memory footprint, minimal flashes of placeholder/empty cells) use the PHCachingImageManager to get media content delivered before you need to display it specify the size you need for grid sized display set the options PHVideoRequestOptionsDeliveryModeFastFormat, PHImageRequestOptionsDeliveryModeFastFormat and PHImageRequestOptionsResizeModeFast Question 4 For rending live preview of video stream, Is there performance overhead from using async and Swift UI for image updates vs UIViewRepresentable + AVCaptureVideoPreviewLayer.self? AVCaptureVideoPreviewLayer is the most efficient display path Use VDO + AVSampleBufferDisplayLayer if you need to modify the image data Swift UI image is optimized for static image content Question 5 Is there a way to configure the AVFoundation BuiltInLiDarDepthCamera mode to provide a depth map as accurate as ARKit at close range? The AVCaptureDepthDataOutput supports filtering that reduces noise and fills in invalid values. Consider using this for smoother depth maps Question 6 Pyramid-based photo editing in core image (such as adobe camera raw highlights and shadows)? First off you may want to look a the builtin filter called CIHighlightShadowAdjust Also the noise reduction in the CIRawFilter uses a pyramid-based algorithm. You can also write your own pyramid-based algorithms by taking an input image: down sample it by two multiply times using imageByApplyingAffineTransform apply additional CIKernels to each downsampled image as needed. use a custom CIKernel to combine the results. Question 7 Is the best way to integrate an in-app camera for a “non-camera” app UIImagePickerController? Yes, UIImagePickerController provides system-provided UI for capturing photos and movies. Question 8 Hello, my question is on Deferred Photo Processing? Say I have a photo capture app that adds a CIFilter to the capture. How can I take advantage of Deferred Photo Processing? Since I don’t know how to detect when the deferred captured photo is ready CIFilter can be called on the final at that point Photo will have to be re-inserted into the Photo library as adjustment Question 9 For shipping photo style assets in the app that need transparency what is the best format to use? JPEG2000? will moving to this save a lot of space comapred to PNG or other options? If you want lossless compression PNG is good and supports unpremutiplied alpha If you want lossy compression HEIF supports premutiplied or unpremutiplied alpha (Note: this is part 1 of a 3 part posting. See Part 2 or Part 3)
0
0
509
Jul ’25
Memory leak when performing DetectHumanBodyPose3DRequest request
Hi, I'm developing an application for macos and ios that has to run DetectHumanBodyPose3DRequest model in real time for retrieving the 3d skeleton from the camera. I'm experiencing a memory leak every time the model is used (when i comment that line, the memory stays constant). After a minute it uses about 1GB of ram running with mac catalyst. I attached a minimal project that has this problem Code Camera View import SwiftUI import Combine import Vision struct CameraView: View { @StateObject private var viewModel = CameraViewModel() var body: some View { HStack { ZStack { GeometryReader { geometry in if let image = viewModel.currentFrame { Image(decorative: image, scale: 1) .resizable() .scaledToFill() .frame(width: geometry.size.width, height: geometry.size.height) .clipped() } else { ProgressView() } } } } } } class CameraViewModel: ObservableObject { @Published var currentFrame: CGImage? @Published var frameRate: Double = 0 @Published var currentVisionBodyPose: HumanBodyPose3DObservation? // Store current body pose @Published var currentImageSize: CGSize? // Store current image size private var cameraManager: CameraManager? private var humanBodyPose = HumanBodyPose3DDetector() private var lastClassificationTime = Date() private var frameCount = 0 private var lastFrameTime = Date() private let classificationThrottleInterval: TimeInterval = 1.0 private var lastPoseSendTime: Date = .distantPast init() { cameraManager = CameraManager() startPreview() startClassification() } private func startPreview() { Task { guard let previewStream = cameraManager?.previewStream else { return } for await frame in previewStream { let size = CGSize(width: frame.width, height: frame.height) Task { @MainActor in self.currentFrame = frame self.currentImageSize = size self.updateFrameRate() } } } } private func startClassification() { Task { guard let classificationStream = cameraManager?.classificationStream else { return } for await pixelBuffer in classificationStream { self.classifyFrame(pixelBuffer: pixelBuffer) } } } private func classifyFrame(pixelBuffer: CVPixelBuffer) { humanBodyPose.runHumanBodyPose3DRequestOnImage(pixelBuffer: pixelBuffer) { [weak self] observation in guard let self = self else { return } DispatchQueue.main.async { if let observation = observation { self.currentVisionBodyPose = observation print(observation) } else { self.currentVisionBodyPose = nil } } } } private func updateFrameRate() { frameCount += 1 let now = Date() let elapsed = now.timeIntervalSince(lastFrameTime) if elapsed >= 1.0 { frameRate = Double(frameCount) / elapsed frameCount = 0 lastFrameTime = now } } } HumanBodyPose3DDetector import Foundation import Vision class HumanBodyPose3DDetector: NSObject, ObservableObject { @Published var humanObservation: HumanBodyPose3DObservation? = nil private let queue = DispatchQueue(label: "humanbodypose.queue") private let request = DetectHumanBodyPose3DRequest() private struct SendablePixelBuffer: @unchecked Sendable { let buffer: CVPixelBuffer } public func runHumanBodyPose3DRequestOnImage(pixelBuffer: CVPixelBuffer, completion: @escaping (HumanBodyPose3DObservation?) -> Void) { let sendableBuffer = SendablePixelBuffer(buffer: pixelBuffer) queue.async { [weak self] in Task { [weak self, sendableBuffer] in do { guard let self = self else { return } let result = try await self.request.perform(on: sendableBuffer.buffer) //process result DispatchQueue.main.async { if result.isEmpty { completion(nil) } else { completion(result[0]) } } } catch { DispatchQueue.main.async { completion(nil) } } } } } }
1
0
164
Jun ’25
Personas and Lighting in a visionOS 26 (or earlier) SharePlay Session
Use case: When SharePlay -ing a fully immersive 3D scene (e.g. a virtual stage), I would like to shine lights on specific Personas, so they show up brighter when someone in the scene is recording the feed (think a camera person in the scene wearing Vision Pro). Note: This spotlight effect only needs to render in the camera person's headset and does NOT need to be journaled or shared. Before I dive into this, my technical question: Can environmental and/or scene lighting affect Persona brightness in a SharePlay? If not, is there a way to programmatically make Personas "brighter" when recording? My screen recordings always seem to turn out darker than what's rendered in environment, and manually adjusting the contrast tends to blow out the details in a Persona's face (especially in visionOS 26).
0
0
116
Jun ’25
Capturing multiple screens no longer works with macOS Sequoia
Capturing more than one display is no longer working with macOS Sequoia. We have a product that allows users to capture up to 2 displays/screens. Our application is using gstreamer which in turn is based on AVFoundation. I found a quick way to replicate the issue by just running 2 captures from separate terminals. Assuming display 1 has device index 0, and display 2 has device index 1, here are the steps: install gstreamer with brew install gstreamer Then open 2 terminal windows and launch the following processes: terminal 1 (device-index:0): gst-launch-1.0 avfvideosrc -e device-index=0 capture-screen=true ! queue ! videoscale ! video/x-raw,width=640,height=360 ! videoconvert ! osxvideosink terminal 2 (device-index:1): gst-launch-1.0 avfvideosrc -e device-index=1 capture-screen=true ! queue ! videoscale ! video/x-raw,width=640,height=360 ! videoconvert ! osxvideosink The first process that is launched will show the screen, the second process launched will not. Testing this on macOS Ventura and Sonoma works as expected, showing both screens. I submitted the same issue on Feedback Assistant: FB15900976
2
0
390
Apr ’25
Level Networking on watchOS for Duplex audio streaming
I did watch WWDC 2019 Session 716 and understand that an active audio session is key to unlocking low‑level networking on watchOS. I’m configuring my audio session and engine as follows: private func configureAudioSession(completion: @escaping (Bool) -> Void) { let audioSession = AVAudioSession.sharedInstance() do { try audioSession.setCategory(.playAndRecord, mode: .voiceChat, options: []) try audioSession.setActive(true, options: .notifyOthersOnDeactivation) // Retrieve sample rate and configure the audio format. let sampleRate = audioSession.sampleRate print("Active hardware sample rate: \(sampleRate)") audioFormat = AVAudioFormat(standardFormatWithSampleRate: sampleRate, channels: 1) // Configure the audio engine. audioInputNode = audioEngine.inputNode audioEngine.attach(audioPlayerNode) audioEngine.connect(audioPlayerNode, to: audioEngine.mainMixerNode, format: audioFormat) try audioEngine.start() completion(true) } catch { print("Error configuring audio session: \(error.localizedDescription)") completion(false) } } private func setupUDPConnection() { let parameters = NWParameters.udp parameters.includePeerToPeer = true connection = NWConnection(host: "***.***.xxxxx.***", port: 0000, using: parameters) setupNWConnectionHandlers() } private func setupTCPConnection() { let parameters = NWParameters.tcp connection = NWConnection(host: "***.***.xxxxx.***", port: 0000, using: parameters) setupNWConnectionHandlers() } private func setupWebSocketConnection() { guard let url = URL(string: "ws://***.***.xxxxx.***:0000") else { print("Invalid WebSocket URL") return } let session = URLSession(configuration: .default) webSocketTask = session.webSocketTask(with: url) webSocketTask?.resume() print("WebSocket connection initiated") sendAudioToServer() receiveDataFromServer() sendWebSocketPing(after: 0.6) } private func setupNWConnectionHandlers() { connection?.stateUpdateHandler = { [weak self] state in DispatchQueue.main.async { switch state { case .ready: print("Connected (NWConnection)") self?.isConnected = true self?.failToConnect = false self?.receiveDataFromServer() self?.sendAudioToServer() case .waiting(let error), .failed(let error): print("Connection error: \(error.localizedDescription)") DispatchQueue.main.asyncAfter(deadline: .now() + 2) { self?.setupNetwork() } case .cancelled: print("NWConnection cancelled") self?.isConnected = false default: break } } } connection?.start(queue: .main) } Duplex in this context refers to two-way audio transmission simultaneously recording and sending audio while also receiving and playing back incoming audio, similar to a VoIP/SIP call. The setup works fine on the simulator, which suggests that the core logic is correct. However, since the simulator doesn’t fully replicate WatchOS hardware behavior especially for audio sessions and networking issues might arise when running on a real device. The problem likely lies in either the Watch’s actual hardware limitations, permission constraints, or specific audio session configurations. I am reaching out to seek further assistance regarding the challenges I've been experiencing with establishing a UDP, TCP & web socket connection on watchOS using NWConnection for duplex audio streaming. Despite implementing the recommendations provided earlier, I am still encountering difficulties From what I can see, your implementation is focused on streaming audio playback with the server. In my case, I'm looking for a slightly different approach: I want to capture audio and send buffers of a specific size to the server while playing audio simultaneously, essentially achieving full duplex streaming similar to a VOIP call. Additionally, I’d like to ensure that if no external audio route is connected, the Apple Watch speaker is used by default. Any thoughts or insights on adapting this setup for those requirements would be very welcome.
1
0
235
Apr ’25
Seeking Expert Clarification on MusicKit Usage and Compliance for a App in Saudi Arabia
Hello Apple Developer Community, We are developing a music management platform for restaurants and cafes in Saudi Arabia. Our app enables businesses to schedule playlists and allows visitors to request songs via barcodes. Music playback is powered by Apple Music, and users must have their own Apple Music subscriptions to access the music. Our service charges a monthly subscription fee for these management features, not for music access itself. Project Overview and MusicKit Role Our app integrates MusicKit to leverage Apple Music’s catalog and playback capabilities. Users log in with their Apple Music accounts, ensuring they have an active subscription for music playback. Our platform’s value lies in its tools—playlist scheduling and song requests—which are built on top of MusicKit’s APIs. We offer these features exclusively in Saudi Arabia. Legal Context in Saudi Arabia In Saudi Arabia, to our understanding, no special licenses are required for playing music in commercial venues like restaurants and cafes. This means our clients can use Apple Music subscriptions for playback without additional performance rights licenses. While this aligns with local laws, we recognize that Apple’s global policies may impose stricter requirements, prompting our need for clarification. Subscription Model and Monetization Concerns We charge a monthly subscription fee for access to our app’s features (e.g., scheduling playlists and managing song requests). This fee is separate from the Apple Music subscription, which users must maintain for playback. However, Apple’s MusicKit terms state: "You agree not to require payment for or indirectly monetize access to the Apple Music service." We’re concerned whether our subscription model might be interpreted as indirectly monetizing Apple Music access, given its reliance on MusicKit for functionality. Scheduling Feature and Synchronization Rights Our app allows businesses to schedule playlists for general time slots (e.g., “play this playlist from 6 PM to 8 PM”). It does not support precise scheduling, such as playing a specific song at an exact moment (e.g., “play this song at 7:30 PM”). Apple’s guidelines mention that “deeper or more complex music integration” may require additional licenses, like synchronization rights. We’re unsure if our general scheduling feature crosses this threshold or remains within MusicKit’s standard usage. Questions for Clarification We’d greatly appreciate expert input on the following: Monetization: Does our subscription fee for management features (scheduling and song requests) violate Apple’s policy against indirectly monetizing Apple Music access? Local Context: Given that Saudi Arabia requires no additional licenses for commercial music playback, does this impact our compliance with Apple’s global terms? Scheduling: Does our playlist scheduling for general time slots (not exact moments) fall within MusicKit’s permitted scope, or does it require further licensing? Thank you in advance for any insights or guidance to ensure our app aligns with Apple’s policies!
0
0
374
Mar ’25