The document discusses various techniques for manipulating and processing video and audio using the AV Foundation framework on iOS and Mac OS X. It begins with an overview of AV Foundation and describes common tasks like playback, capture, and editing. It then demonstrates tricks like animating AVPlayerLayers and recording the screen. The document dives deeper into techniques for reading and manipulating subtitle, audio, and video tracks using the Core Media, Core Audio, Core Video, and Core Image frameworks. It provides code samples for applying filters to video in real time and writing the modified data back out.
AV Foundation Tricks
1. Stupid Video Tricks
Chris Adamson • @invalidname
CocoaConf Las Vegas • August, 2014
2. AV Foundation
• Framework for working with time-based media
• Audio, video, timed text (captions / subtitles), timecode
• iOS 4.0 and up, Mac OS X 10.7 (Lion) and up
• Replacing QuickTime on Mac
3. Ordinary AV Foundation stuff
• Playback: AVAsset + AVPlayer
• Capture to file: AVCaptureSession + AVCaptureDeviceInput + AVCaptureMovieFileOutput
• Editing: AVComposition + AVAssetExportSession
4. But Why Be Ordinary?
http://www.crunchyroll.com/my-ordinary-life
5. Introductory Trick
• AVPlayerLayer and AVCaptureVideoPreviewLayer are subclasses of CALayer
• We can do lots of neat things with CALayers
14. Core Media
• Opaque types to represent time: CMTime, CMTimeRange
• Opaque types to represent media samples and their contents: CMSampleBuffer, CMBlockBuffer, CMFormatDescription
15. CMSampleBuffer
• Provides timing information for one or more samples: when does this play and for how long?
• Contains either
• CVImageBuffer – visual data (video frames)
• CMBlockBuffer – arbitrary data (sound, subtitles, timecodes)
16. Use & Abuse of CMSampleBuffers
• AVCaptureVideoDataOutput / AVCaptureAudioDataOutput provide CMSampleBuffers in the sample delegate callback
• AVAssetReader provides CMSampleBuffers read from disk
• AVAssetWriter accepts CMSampleBuffers to write to disk
18. How the Heck Does that Work?
• Movies have tracks, tracks have media, media have sample data
• All contents of a QuickTime file are defined in the QuickTime File Format documentation
21. Subtitle Sample Data!
Subtitle sample data consists of a 16-bit word that specifies the length (number of bytes) of the subtitle text, followed by the subtitle text and then by optional sample extensions. The subtitle text is Unicode text, encoded either as UTF-8 text or UTF-16 text beginning with a UTF-16 BYTE ORDER MARK (U+FEFF) in big or little endian order. There is no null termination for the text.
Following the subtitle text, there may be one or more atoms containing additional information for selecting and drawing the subtitle.
22. I Iz In Ur Subtitle Track…
AVAssetTrack *subtitleTrack = [[asset tracksWithMediaType:AVMediaTypeSubtitle]
                               firstObject];

AVAssetReaderTrackOutput *subtitleTrackOutput =
    [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:subtitleTrack
                                               outputSettings:nil];

// ...
while (reading) {
    CMSampleBufferRef sampleBuffer = [subtitleTrackOutput copyNextSampleBuffer];
    if (sampleBuffer == NULL) {
        AVAssetReaderStatus status = subtitleReader.status;
        if ((status == AVAssetReaderStatusCompleted) ||
            (status == AVAssetReaderStatusFailed) ||
            (status == AVAssetReaderStatusCancelled)) {
            reading = NO;
            NSLog (@"ending with reader status %ld", (long)status);
        }
    } else {
        CMTime presentationTime = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
        CMTime duration = CMSampleBufferGetDuration(sampleBuffer);
23. …Readin Ur CMBlockBuffers
        CMBlockBufferRef blockBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
        size_t dataSize = CMBlockBufferGetDataLength(blockBuffer);
        if (dataSize > 0) {
            UInt8 *data = malloc(dataSize);
            OSStatus cmErr = CMBlockBufferCopyDataBytes (blockBuffer,
                                                         0,
                                                         dataSize,
                                                         data);
25. Subtitle Summary
• AVAssetReaderOutput provides CMSampleBuffers
• Get timing info with CMSampleBufferGetPresentationTimeStamp() and CMSampleBufferGetDuration()
• Get raw data with CMBlockBufferGet…() functions
• Have at it
28. Screen Recording
• Run an NSTimer to get screenshots
• Many ways to do this, such as drawing your CALayer to a CGContext and making it a UIImage
• Convert the image data into a CVPixelBuffer
• Use AVAssetWriterInputPixelBufferAdaptor to write the pixel buffer and presentation time to an AVAssetWriterInput
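The timer-driven write loop might be sketched like this, assuming an already-started AVAssetWriter session with self.writerInput and self.pixelBufferAdaptor configured; -pixelBufferFromScreen is a hypothetical helper standing in for whichever screenshot technique you use:

```objc
// Sketch only: append one screen frame per timer tick.
- (void)recordingTimerFired:(NSTimer *)timer {
    if (!self.writerInput.isReadyForMoreMediaData) {
        return; // drop this frame rather than block
    }
    CVPixelBufferRef pixelBuffer = [self pixelBufferFromScreen]; // hypothetical
    CMTime presentationTime = CMTimeMakeWithSeconds(
        CACurrentMediaTime() - self.recordingStartTime, 600);
    [self.pixelBufferAdaptor appendPixelBuffer:pixelBuffer
                          withPresentationTime:presentationTime];
    CVPixelBufferRelease(pixelBuffer);
}
```

Checking isReadyForMoreMediaData first matters: the writer falls behind if the timer fires faster than frames can be encoded, and appending when it isn't ready is an error.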
37. Potential Uses
• Run captured / read-in audio through effects in an AUGraph
• See “AVCaptureAudioDataOutput To AudioUnit” examples (iOS & OS X) from WWDC 2012
• May make more sense for audio-oriented apps to do capture / file reads entirely from Core Audio
41. Core Media → Core Video
• CMSampleBuffers provide CVImageBuffers
• Two sub-types: CVPixelBufferRef, CVOpenGLESTextureRef
• Pixel buffers allow us to work with bitmaps, via CVPixelBufferGetBaseAddress()
• Note: Must wrap calls with CVPixelBufferLockBaseAddress() / CVPixelBufferUnlockBaseAddress()
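The lock/unlock requirement looks like this in practice — a sketch, where pixelBuffer stands for whatever CVPixelBufferRef you pulled out of a sample buffer:

```objc
// Lock before touching the pixels; kCVPixelBufferLock_ReadOnly is enough
// when you only read them (and lets Core Video skip a writeback).
CVPixelBufferLockBaseAddress(pixelBuffer, kCVPixelBufferLock_ReadOnly);
void  *baseAddress = CVPixelBufferGetBaseAddress(pixelBuffer);
size_t bytesPerRow = CVPixelBufferGetBytesPerRow(pixelBuffer);
size_t height      = CVPixelBufferGetHeight(pixelBuffer);
// ... read up to bytesPerRow * height bytes from baseAddress ...
CVPixelBufferUnlockBaseAddress(pixelBuffer, kCVPixelBufferLock_ReadOnly);
```

Note that bytesPerRow can be larger than width × bytes-per-pixel because rows may be padded for alignment, so always iterate row by row using bytesPerRow.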
42. Use & Abuse of CVImageBuffers
• Can be used to create Core Image CIImages
• iOS: +[CIImage imageWithCVPixelBuffer:]
• OS X: +[CIImage imageWithCVImageBuffer:]
• CIImages can be used to do lots of stuff…
44. Recipe
• Create CIContext from EAGLContext
• Create CIFilter
• During capture callback
• Convert pixel buffer to CIImage
• Run through filter
• Draw to CIContext
45. Set Up GLKView
if (self.glkView.context.API != kEAGLRenderingAPIOpenGLES2) {
    EAGLContext *eagl2Context = [[EAGLContext alloc]
        initWithAPI:kEAGLRenderingAPIOpenGLES2];
    self.glkView.context = eagl2Context;
}
self.glContext = self.glkView.context;
// we'll do the updating, thanks
self.glkView.enableSetNeedsDisplay = NO;
46. Make CIContext
// make CIContext from GL context, clearing out default color space
self.ciContext = [CIContext contextWithEAGLContext:self.glContext
options:@{kCIContextWorkingColorSpace :
[NSNull null]} ];
[self.glkView bindDrawable];
// from Core Image Fun House:
_glkDrawBounds = CGRectZero;
_glkDrawBounds.size.width = self.glkView.drawableWidth;
_glkDrawBounds.size.height = self.glkView.drawableHeight;
See also iOS Core Image Fun House from WWDC 2013
47. Ask for RGB in Callbacks
self.videoDataOutput = [[AVCaptureVideoDataOutput alloc] init];
[self.videoDataOutput setSampleBufferDelegate:self
queue:self.videoDataOutputQueue];
[self.captureSession addOutput: self.videoDataOutput];
NSDictionary *videoSettings = @{
(id) kCVPixelBufferPixelFormatTypeKey:
@(kCVPixelFormatType_32BGRA)};
[self.videoDataOutput setVideoSettings:videoSettings];
Note: 32BGRA and two flavors of 4:2:0 YCbCr are the only valid
pixel formats for video capture on iOS
51. Callback: Draw to GLKView
[self.glkView bindDrawable];
if (self.glContext != [EAGLContext currentContext]) {
[EAGLContext setCurrentContext: self.glContext];
}
// drawing code here is from WWDC 2013 iOS Core Image Fun House
// clear eagl view to grey
glClearColor(0.5, 0.5, 0.5, 1.0);
glClear(GL_COLOR_BUFFER_BIT);
// set the blend mode to "source over" so that CI will use that
glEnable(GL_BLEND);
glBlendFunc(GL_ONE, GL_ONE_MINUS_SRC_ALPHA);
CGRect drawRect = bufferCIImage.extent;
[self.ciContext drawImage:bufferCIImage
inRect:self.glkDrawBounds
fromRect:drawRect];
[self.glkView display];
CVPixelBufferUnlockBaseAddress(cvBuffer, 0);
52. Recap
camera → CVPixelBuffer → +[CIImage imageWithCVPixelBuffer:] → CIImage
→ CIFilter → CIImage → -[CIContext drawImage:inRect:fromRect:] → (OpenGL drawing)
60. Color Cube Data
const unsigned int size = 64;
size_t cubeDataSize = size * size * size * sizeof(float) * 4;
float *keyCubeData = (float *)malloc(cubeDataSize);
float rgb[3], hsv[3], *keyC = keyCubeData;
// Populate cube with a simple gradient going from 0 to 1
for (int z = 0; z < size; z++) {
    rgb[2] = ((double)z)/(size-1); // Blue value
    for (int y = 0; y < size; y++) {
        rgb[1] = ((double)y)/(size-1); // Green value
        for (int x = 0; x < size; x++) {
            rgb[0] = ((double)x)/(size-1); // Red value
            // Convert RGB to HSV
            // You can find publicly available rgbToHSV functions on the Internet
            RGBtoHSV(rgb[0], rgb[1], rgb[2],
                     &hsv[0], &hsv[1], &hsv[2]);
            // RGBtoHSV uses 0 to 360 for hue, while UIColor (used above) uses 0 to 1.
            hsv[0] /= 360.0;
            // Use the hue value to determine which to make transparent
            // The minimum and maximum hue angle depends on
            // the color you want to remove
            bool keyed = (hsv[0] > minHueAngle && hsv[0] < maxHueAngle) &&
                         (hsv[1] > minSaturation && hsv[1] < maxSaturation) &&
                         (hsv[2] > minBrightness && hsv[2] < maxBrightness);
            float alpha = keyed ? 0.0f : 1.0f;
            // Re-calculate the keyC pointer: 4 floats (RGBA) per cube entry
            keyC = keyCubeData + (((z * size * size) + (y * size) + x) * 4);
            // Calculate premultiplied alpha values for the cube
            keyC[0] = rgb[0] * alpha;
            keyC[1] = rgb[1] * alpha;
            keyC[2] = rgb[2] * alpha;
            keyC[3] = alpha;
        }
    }
}
See “Chroma Key Filter Recipe” in Core Image Programming Guide
61. Create CIColorCube from mapping data
// Create memory with the cube data
NSData *data = [NSData dataWithBytesNoCopy:keyCubeData
                                    length:cubeDataSize
                              freeWhenDone:YES];
self.colorCubeFilter = [CIFilter filterWithName:@"CIColorCube"];
[self.colorCubeFilter setValue:[NSNumber numberWithInt:size]
                        forKey:@"inputCubeDimension"];
// Set data for cube
[self.colorCubeFilter setValue:data forKey:@"inputCubeData"];
63. Apply Filters in Delegate Callback
CIImage *bufferCIImage = [CIImage imageWithCVPixelBuffer:cvBuffer];
[self.colorCubeFilter setValue:bufferCIImage forKey:kCIInputImageKey];
CIImage *keyedCameraImage = [self.colorCubeFilter valueForKey:kCIOutputImageKey];
[self.sourceOverFilter setValue:keyedCameraImage forKey:kCIInputImageKey];
CIImage *compositedImage = [self.sourceOverFilter valueForKey:kCIOutputImageKey];
Then draw compositedImage to CIContext as before
64. More Fun with Filters
• Alpha Matte: Use CIColorCube to map green to white
(or transparent), everything else to black
• Can then use this with other filters to do edge work
on the “foreground” object
• Be sure that any filters you use are of category
CICategoryVideo.
65. More Fun With CIContexts
• Can write effected pixels to a movie file with an AVAssetWriterInput
• Render the filtered CIImage into a new CVPixelBuffer, then use this and the timing information to create a CMSampleBuffer
• AVAssetWriterInputPixelBufferAdaptor makes this slightly easier
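A sketch of that write path, assuming a configured AVAssetWriterInputPixelBufferAdaptor ("adaptor") plus the compositedImage and presentationTime from the earlier slides:

```objc
// Pull an empty CVPixelBuffer from the adaptor's pool, render the filtered
// CIImage into it, and append it with its presentation time.
CVPixelBufferRef outputBuffer = NULL;
CVPixelBufferPoolCreatePixelBuffer(kCFAllocatorDefault,
                                   adaptor.pixelBufferPool,
                                   &outputBuffer);
[self.ciContext render:compositedImage toCVPixelBuffer:outputBuffer];
[adaptor appendPixelBuffer:outputBuffer
      withPresentationTime:presentationTime];
CVPixelBufferRelease(outputBuffer);
```

Using the adaptor's pixelBufferPool avoids allocating a fresh buffer per frame; the pool only becomes non-nil after the asset writer's session has started.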
66. Recap
• Most good tricks start with CMSampleBuffers
• Audio: convert to Core Audio types
• Video: convert to CIImage
• Other: get CMBlockBuffer and parse by hand
68. Q&A
Slides at http://www.slideshare.net/invalidname/
See comments there for link to source code
invalidname [at] gmail.com
@invalidname (Twitter, app.net)
http://www.subfurther.com/blog