CFF: One Format to Rule Them All?
The move towards MPEG DASH and the fragmented MP4 (fMP4) common file format may finally offer DVD-like interoperability for web video
Since the early days of online video, various battles have been waged among competing video codecs and formats, but those battles were merely indicative of the larger struggle between those who support a common file format for all online video delivery and those who support proprietary formats. The most recent push for a common file format is coming from proponents of MPEG DASH, and if the standing-room-only crowd at an MPEG DASH panel at Streaming Media West earlier this month is any indication, its time may have come.
UltraViolet, which uses both a common file format and common encryption, is supported by six major studios and the seventy-member Digital Entertainment Content Ecosystem (DECE) as a way to deliver premium content such as movies on both physical media (discs) and online (digital downloads). As described in detail in a "What Is...?" article by Jan Ozer, MPEG DASH is a way to standardize manifests (called Media Presentation Descriptions or MPDs) that is moving through the ISO ratification process piece by piece.
Microsoft, Apple, and Adobe
Let's take a look at how MPEG DASH and UltraViolet fit together with two currently used fragmented MP4 (fMP4) solutions, Adobe HTTP Dynamic Streaming and Microsoft IIS Smooth Streaming. Understanding this is important to understanding why we might be on the cusp of a common way to scale online video delivery to television-sized audiences. Rather than spelling out the technical details of fMP4, I'll point readers to a technically leaning white paper on fMP4 which I authored for Transitions, Inc. and a related post on my Workflowed blog. The paper, jointly sponsored by Adobe and Microsoft, contains sections on technical aspects of fMP4 as well as each company's approach to fragmented delivery over HTTP.
As Chris Knowlton pointed out in a recent post on Microsoft's IIS blog, Microsoft was the first to embrace the use of fragmented MP4 files, back in late 2008. At that time, the use of fragmented MP4 elementary streams was a new and non-standardized concept, but Microsoft rapidly implemented it in Smooth Streaming, which used AAC for audio and AVC/H.264 for video compression.
Around the same time, Apple was also beginning to push the concept of a modified MPEG-2 Transport Stream, which it has since dubbed Apple HTTP Live Streaming (HLS), as a way to deliver multiplexed (muxed) audio/video transport streams to its iPhone and iPod touch devices.
Microsoft's idea of using fragments of MP4 files for Smooth Streaming was brilliant in its simplicity, for a variety of reasons touched on in the white paper. The most basic benefit, though, might just be the way it solved the asset management nightmare looming with HLS: Rather than having to manage the location of thousands of tiny segments for each streaming video, as is the case with HLS, the fMP4 approach allows fragments of a large audio or video file to be identified based on timecode.
All the audio and video could reside in just a handful of files (one for each discrete video or audio bitrate), limiting the need to manage the thousands of files in competing approaches.
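As a rough illustration of the timecode-based addressing described above, the Python sketch below builds request URLs from a bitrate and a fragment start time. The URL pattern follows the publicly documented Smooth Streaming convention of QualityLevels(...)/Fragments(...); the server name, the two-second fragment length, and the timescale of 10,000,000 ticks per second are assumptions chosen for the example, not details taken from the white paper.

```python
TIMESCALE = 10_000_000  # assumed ticks per second (100-nanosecond units)


def fragment_start_times(duration_s, fragment_s=2):
    """Return the start time, in ticks, of each fragment in a stream.

    duration_s and fragment_s are whole seconds; the two-second default
    is an assumption for illustration.
    """
    count = -(-duration_s // fragment_s)  # ceiling division
    return [i * fragment_s * TIMESCALE for i in range(count)]


def fragment_url(base_url, bitrate_bps, stream, start_ticks):
    """Build a Smooth Streaming-style URL that names a fragment by timecode."""
    return f"{base_url}/QualityLevels({bitrate_bps})/Fragments({stream}={start_ticks})"


if __name__ == "__main__":
    # Hypothetical server and asset name.
    base = "http://example.com/movie.ism"
    for start in fragment_start_times(6):
        print(fragment_url(base, 400000, "video", start))
```

Under these assumptions, a two-hour title offered at five bitrates would need 18,000 separately stored segments in a file-per-segment scheme with the same fragment length, while the timecode approach addresses the same fragments inside just five files.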
Adobe rapidly moved to adopt an HTTP delivery approach, too, in its Adobe HTTP Dynamic Streaming (HDS). Adobe chose a different file format called F4F that is, as Adobe's Kevin Towes points out in a recent blog post, based on the fragmented MP4 file format. While Adobe used the same AAC and H.264 codecs for its fragmented MP4 approach, its use of the F4F file format meant that it could deliver Flash video content both via HTTP and RTMP.
Because Adobe and Microsoft chose to use the same fragmented MP4 concept, but to package them in different formats, some confusion ensued. We at StreamingMedia.com found ourselves asking the question as to whether there would be a common ground between these competing formats. Turns out that was also on the minds of Adobe and Microsoft, but it took the emergence of UltraViolet to bring the issue front and center.
UltraViolet
Prior to UltraViolet, Microsoft had proposed a format known as PIFF (Protected Interoperable File Format), as a way toward a common file format for fragmented MP4. The idea garnered interest, but it was not until UltraViolet settled on a common file format, now referred to as CFF, that the idea of an interoperable format gained traction. For all intents and purposes, the CFF specification that UltraViolet settled on is a published version of PIFF, and Microsoft has helped keep pace with the CFF by making the PIFF 1.3 version directly compatible with CFF.
Once the UltraViolet CFF was published and agreed upon by the studios and DECE members, the next step was to look at a common encryption scheme. UltraViolet settled on five digital rights management (DRM) schemes: one each from Adobe, Google, Marlin, Microsoft, and the Open Mobile Alliance.
For all the work by Adobe and Microsoft, and the additional work by DECE on UltraViolet, there had not been a collaborative effort to standardize an fMP4 approach. That changed, though, once the UltraViolet common file format and five-pronged DRM schemes were approved by DECE members.
Which leads us to MPEG DASH. The primary purpose of DASH is to standardize HTTP-based content delivery via XML-based manifests (the Media Presentation Description, or MPD, in DASH nomenclature). In doing so, DASH seeks to find common ground between the various approaches that Adobe, Microsoft, and others have created for fMP4, as well as incorporating a way to generate standardized XML-based MPDs for the M2TS segment files that Apple uses for HLS.
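To give a sense of what such an XML manifest looks like, here is a minimal Python sketch that writes a bare-bones MPD listing two video bitrates. The element names, namespace, and profile identifier follow the DASH MPD schema as published by ISO; the file names, bitrates, and duration are hypothetical, and a real MPD would carry many more attributes (codecs, resolution, segment indexing) than this simplified example shows.

```python
import xml.etree.ElementTree as ET

MPD_NS = "urn:mpeg:dash:schema:mpd:2011"
ON_DEMAND_PROFILE = "urn:mpeg:dash:profile:isoff-on-demand:2011"


def build_mpd(duration_s, representations):
    """Return a minimal MPD document as a string.

    representations is a list of (id, bandwidth_bps, url) tuples, one per
    bitrate. All values passed in are illustrative.
    """
    mpd = ET.Element("MPD", {
        "xmlns": MPD_NS,
        "type": "static",
        "mediaPresentationDuration": f"PT{duration_s}S",
        "profiles": ON_DEMAND_PROFILE,
    })
    period = ET.SubElement(mpd, "Period")
    adaptation_set = ET.SubElement(period, "AdaptationSet", {"mimeType": "video/mp4"})
    for rep_id, bandwidth, url in representations:
        rep = ET.SubElement(
            adaptation_set,
            "Representation",
            {"id": rep_id, "bandwidth": str(bandwidth)},
        )
        # Each bitrate points at a single fragmented MP4 file.
        ET.SubElement(rep, "BaseURL").text = url
    return ET.tostring(mpd, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical file names and bitrates.
    print(build_mpd(60, [
        ("video-low", 400000, "movie_400k.mp4"),
        ("video-high", 1500000, "movie_1500k.mp4"),
    ]))
```

A player reading this manifest would choose among the listed Representation entries according to available bandwidth, which is the adaptive behavior the manifest exists to describe.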
Alongside these media description manifests, however, MPEG felt it wise to adopt two other proposals that complement—but are not part of—the DASH standards: one is a common set of encryptions and the other a common file format (CFF).
MPEG wanted to base a standardized CFF on MPEG-4 Part 12, the ISO Base Media File Format that has as its original basis the QuickTime file format. Rather than re-invent the wheel, MPEG chose to consider existing common file formats, settling on the UltraViolet CFF. The adopted MPEG CFF now means that PIFF, UltraViolet, and MPEG-4 use a common file format, a confluence of fMP4 akin to the DVD Forum's 1995 interoperability specification, which allowed competing device manufacturers to create interoperable DVD players.
In addition, MPEG reviewed UltraViolet's five-pronged DRM approach, which will be known as the MPEG Common Encryption (ISO/IEC 23001-7 CENC) scheme if pending ratification is approved. The major benefit from a DRM standpoint is that CENC can be combined with the MPEG-4 Part 12 ISO Base Media File Format, enabling fragmented MP4 to take advantage of DRM and elementary stream encryption.
This accelerated move to a common file format between fragmented MP4 solutions doesn't quite have universal traction, but it sets the stage—along with MPEG DASH's profiles for both fMP4 and M2TS decoding—to allow broadcasters to choose from just a few proprietary and standards-based HTTP delivery options.
We've yet to hear from Apple—which contributed to the MPEG DASH working group—as to whether it will take advantage of the M2TS profiles in MPEG DASH to replace the .m3u8 manifests currently used in HLS, but it's become clear Adobe is willing to consider a balancing act between its traditional Flash video approach and a standards-based fMP4 approach.
In much the same way that Adobe has embraced a symbiotic relationship, when it comes to interactivity, between Flash and HTML5, it also appears Adobe sees the two-pronged approach for fMP4 as beneficial. And that's a positive step toward embracing a common file format in the near future.
"With any accelerated technology, a large group of people are motivated to consider different ideas and concepts that improve the technology," wrote Towes in his blog post about the jointly sponsored white paper. "When you are smack in the middle, there are a lot of options available to [broadcasters]. . . . We try to help broadcasters realize new revenue streams, regardless of platform being delivering to."
"Where Flash Player is available, we'll use RTMP or HDS to deliver world-class quality of experience, encryption and rights management," said Towes in an interview for the white paper. "With other platforms, such as iOS-based devices, we'll do an equally good job. Adobe also sees the benefit of standards-based approaches like the Common File Format, HTML5 and MPEG DASH.”