Thu, 29 Jul 2010

Adaptive Streaming with Windows Azure Blobs and CDN

In this post, I’ll show you how to use Windows Azure Blobs and the Windows Azure CDN to deliver adaptive streaming video content to your users in a format compatible with Silverlight’s Smooth Streaming player. For those who just want to try it out, head over to the Adaptive Streaming with Windows Azure Blobs Uploader project on Code Gallery. The instructions there will get you going.

[UPDATE 8/2/2010] I’ve added an Expression Encoder publishing plugin to the project too.

Understanding Smooth Streaming

Before we get into the details of how adaptive streaming works on top of Windows Azure Blobs, it’s necessary to understand what Smooth Streaming is and how it works.

Smooth Streaming is Microsoft’s HTTP-based adaptive streaming protocol. As Alex Zambelli wrote in his excellent “Smooth Streaming Technical Overview”:


Adaptive streaming is a hybrid delivery method that acts like streaming but is based on HTTP progressive download. It's an advanced concept that uses HTTP rather than a new protocol.

In a typical adaptive streaming implementation, the video/audio source is cut into many short segments ("chunks") and encoded to the desired delivery format… The encoded chunks are hosted on a HTTP Web server. A client requests the chunks from the Web server in a linear fashion and downloads them using plain HTTP progressive download.

The "adaptive" part of the solution comes into play when the video/audio source is encoded at multiple bit rates, generating multiple chunks of various sizes for each 2-to-4-seconds of video. The client can now choose between chunks of different sizes. Because Web servers usually deliver data as fast as network bandwidth allows them to, the client can easily estimate user bandwidth and decide to download larger or smaller chunks ahead of time. The size of the playback/download buffer is fully customizable.

In other words, HTTP-based adaptive streaming is about taking a source video, encoding it into lots of small chunks at various bitrates, and then letting the client play back the most appropriate chunks (based on available bandwidth).
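To make the “most appropriate chunk” decision concrete, here’s a minimal Python sketch of how a client might choose a bitrate. It is only an illustration: the function names, the 0.8 safety factor, and the sample bitrates are assumptions, and real Smooth Streaming clients use more sophisticated heuristics (buffer level, CPU load, screen size).

```python
def estimate_bandwidth(chunk_bytes, download_seconds):
    """Estimate bandwidth in bits per second from the last chunk download."""
    return chunk_bytes * 8 / download_seconds


def pick_bitrate(available_bitrates, measured_bps, safety_factor=0.8):
    """Return the highest bitrate that fits within a fraction of the measured bandwidth.

    Falls back to the lowest available bitrate when nothing fits.
    The safety factor is an assumed margin, not a value from any real player.
    """
    budget = measured_bps * safety_factor
    candidates = [b for b in sorted(available_bitrates) if b <= budget]
    return candidates[-1] if candidates else min(available_bitrates)


# Hypothetical encoding ladder, in bits per second.
bitrates = [350_000, 700_000, 1_500_000, 3_000_000]

# Suppose the last chunk was 375,000 bytes and took 1.5 seconds to download.
bandwidth = estimate_bandwidth(375_000, 1.5)
chosen = pick_bitrate(bitrates, bandwidth)
print(f"measured {bandwidth:.0f} bps, requesting the {chosen} bps chunk next")
```

Because every chunk is an independent HTTP request, the client can re-run this decision before each request and switch bitrates at any chunk boundary.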

If you’ve ever looked at IIS Smooth Streaming content, though, you’ll have noticed that there aren’t lots of tiny chunks. Instead, there are a few fairly large video files. Alex explains this too:


IIS Smooth Streaming uses the MPEG-4 Part 14 (ISO/IEC 14496-12) file format as its disk (storage) and wire (transport) format. Specifically, the Smooth Streaming specification defines each chunk/GOP as an MPEG-4 Movie Fragment and stores it within a contiguous MP4 file for easy random access. One MP4 file is expected for each bit rate. When a client requests a specific source time segment from the IIS Web server, the server dynamically finds the appropriate Movie Fragment box within the contiguous MP4 file and sends it over the wire as a standalone file, thus ensuring full cacheability downstream.

Smooth Streaming files are quite literally all those little chunks concatenated together. I encourage you to read Alex’s entire article to understand the exact file format and wire format.
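As a rough illustration of “chunks concatenated together,” the Python sketch below walks the top-level boxes of an MP4 byte stream and reports each movie fragment as a moof box plus the mdat box that follows it. The box header layout (a 4-byte big-endian size followed by a 4-byte type) comes from the ISO base media file format. The helper names and the tiny synthetic file are assumptions made for the example, and this is not necessarily how the uploader tool locates fragments.

```python
import io
import struct


def iter_boxes(stream):
    """Yield (box_type, offset, size) for each top-level box in an MP4 stream."""
    while True:
        offset = stream.tell()
        header = stream.read(8)
        if len(header) < 8:
            return
        size, box_type = struct.unpack(">I4s", header)
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            size = struct.unpack(">Q", stream.read(8))[0]
        elif size == 0:
            # A size of 0 means the box runs to the end of the stream.
            size = stream.seek(0, io.SEEK_END) - offset
        if size < 8:
            raise ValueError(f"malformed box at offset {offset}")
        yield box_type.decode("latin-1"), offset, size
        stream.seek(offset + size)


def iter_fragments(stream):
    """Yield (offset, length) spans covering each moof box and the mdat after it."""
    pending = None
    for box_type, offset, size in iter_boxes(stream):
        if box_type == "moof":
            pending = (offset, size)
        elif box_type == "mdat" and pending is not None:
            yield pending[0], pending[1] + size
            pending = None


def make_box(box_type, payload):
    """Build a box with a 32-bit size header (used only for the synthetic demo)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# A synthetic stand-in for a real .ismv file: one header box and two fragments.
data = (
    make_box(b"ftyp", b"isml")
    + make_box(b"moof", b"\x00" * 16)
    + make_box(b"mdat", b"\x01" * 100)
    + make_box(b"moof", b"\x00" * 16)
    + make_box(b"mdat", b"\x02" * 50)
)
fragments = list(iter_fragments(io.BytesIO(data)))
for offset, length in fragments:
    print(f"fragment at byte {offset}, {length} bytes")
```

Each (offset, length) span is a byte range that could be served on its own as one chunk.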

The key insight for our purpose is that to the client, Smooth Streaming content is just many small video chunks. The beauty of this model is that Smooth Streaming works great with CDNs and caches in between the client and the server. To the client, all that matters is that small chunks of video are being served from the appropriate URLs.

Using Windows Azure Blobs as an Adaptive Streaming Host

Windows Azure Blobs can serve specific content at configurable URLs, which, as we’ve seen, is the only requirement for providing clients with an adaptive streaming experience. To sweeten the deal, there’s built-in integration between Windows Azure Blobs and the Windows Azure CDN.

All that’s left for us to do is to figure out the set of URLs a Smooth Streaming client might request and store the appropriate video chunks at those URLs. There are two files that will help us do that:

  • The server manifest (.ism) – This is a SMIL file that maps video and audio tracks to the file that contains them and the bitrates at which they were encoded.
  • The client manifest (.ismc) – This is an XML file that specifies to the client which bitrates and timestamps are available. It also specifies the URL template clients should use to request chunks.
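To give a feel for the server manifest, here’s a small Python sketch that parses a SMIL document and builds the mapping from content type and bitrate to source file and track. The sample XML is a hand-written approximation: the element and attribute names (systemBitrate, a param named trackID) are assumptions based on typical .ism files, and the file names are invented.

```python
import xml.etree.ElementTree as ET

# Illustrative server manifest; names and values are assumptions, not real content.
SAMPLE_ISM = """<smil xmlns="http://www.w3.org/2001/SMIL20/Language">
  <head>
    <meta name="clientManifestRelativePath" content="BigBuckBunny.ismc" />
  </head>
  <body>
    <switch>
      <video src="BigBuckBunny_2962000.ismv" systemBitrate="2962000">
        <param name="trackID" value="2" valueType="data" />
      </video>
      <video src="BigBuckBunny_350000.ismv" systemBitrate="350000">
        <param name="trackID" value="2" valueType="data" />
      </video>
      <audio src="BigBuckBunny_2962000.ismv" systemBitrate="160000">
        <param name="trackID" value="1" valueType="data" />
      </audio>
    </switch>
  </body>
</smil>"""


def local_name(tag):
    """Strip the XML namespace prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def parse_server_manifest(xml_text):
    """Map (content_type, bitrate) to (source_file, track_id)."""
    tracks = {}
    for element in ET.fromstring(xml_text).iter():
        kind = local_name(element.tag)
        if kind not in ("video", "audio"):
            continue
        track_id = None
        for param in element:
            if local_name(param.tag) == "param" and param.get("name") == "trackID":
                track_id = int(param.get("value"))
        key = (kind, int(element.get("systemBitrate")))
        tracks[key] = (element.get("src"), track_id)
    return tracks


tracks = parse_server_manifest(SAMPLE_ISM)
for (kind, bitrate), (src, track_id) in sorted(tracks.items()):
    print(f"{kind} @ {bitrate} bps -> track {track_id} of {src}")
```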

The combination of these two files tells us everything we need to know to extract the video chunks and store them in Windows Azure Blobs.

The Adaptive Streaming with Windows Azure Blobs Uploader code first reads the server manifest and keeps track of the mapping of bitrate and content type (video or audio) to tracks within files. Then it reads the client manifest and generates all the permutations of bitrate, content type, and timestamp. For each of these, it looks up the appropriate track of the appropriate file, extracts that chunk from the file, and stores it in blob storage according to the URL template in the client manifest.
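The enumeration step described above can be sketched in Python as follows. This is not the code from the project, just an illustration under some assumptions: the sample client manifest is hand-written, the attribute names (Url, Bitrate, d, t) and the template placeholders ({bitrate}, {start time}) reflect common Smooth Streaming manifests, and chunk start times are computed by accumulating durations unless a chunk gives an explicit start.

```python
import xml.etree.ElementTree as ET

# Illustrative client manifest; durations are in 100-nanosecond units.
SAMPLE_ISMC = """<SmoothStreamingMedia MajorVersion="2" MinorVersion="0" Duration="60000000">
  <StreamIndex Type="video" Chunks="3" QualityLevels="2"
               Url="QualityLevels({bitrate})/Fragments(video={start time})">
    <QualityLevel Index="0" Bitrate="1500000" />
    <QualityLevel Index="1" Bitrate="350000" />
    <c n="0" d="20000000" />
    <c n="1" d="20000000" />
    <c n="2" d="20000000" />
  </StreamIndex>
</SmoothStreamingMedia>"""


def chunk_urls(xml_text):
    """Yield (content_type, bitrate, start_time, relative_url) for every chunk."""
    root = ET.fromstring(xml_text)
    for stream in root.iter("StreamIndex"):
        template = stream.get("Url")
        bitrates = [int(q.get("Bitrate")) for q in stream.iter("QualityLevel")]

        # Work out each chunk's start time from the running total of durations.
        start_times = []
        next_start = 0
        for chunk in stream.iter("c"):
            # An explicit t attribute overrides the running total.
            start = int(chunk.get("t", next_start))
            start_times.append(start)
            next_start = start + int(chunk.get("d", 0))

        # Every bitrate is available at every timestamp.
        for bitrate in bitrates:
            for start in start_times:
                url = template.replace("{bitrate}", str(bitrate))
                url = url.replace("{start time}", str(start))
                yield stream.get("Type"), bitrate, start, url


urls = list(chunk_urls(SAMPLE_ISMC))
for content_type, bitrate, start, url in urls:
    print(url)
```

Each generated relative URL is then a candidate blob name: the chunk for that bitrate and timestamp gets stored there, so a client following the template finds it with a plain HTTP GET.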

The code’s not too complicated, and you can find it in the Code Gallery project in SmoothStreamingAzure.cs.

Prior Work

After I patted myself on the back for coming up with this brilliant scheme, it was pointed out to me that Alden Torres blogged about this back in December 2009. He used a tool on CodePlex called MP4 Explorer, which has a feature that allows uploading to blob storage. That tool reads the source MP4 files themselves and derives the chunks from there (as opposed to my approach, which reads the client manifest).

The two big reasons I decided to write my own code for this were that I wanted a command-line tool and that I wanted to upload the blobs in parallel. I was able to cut down the upload time for the Big Buck Bunny video from around three hours (as Alden mentions in his post) to around thirty minutes simply by doing the uploads in parallel.
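The reason parallelism helps so much is that the content consists of many small blobs, so a sequential upload spends most of its time waiting on per-request latency. Here’s a minimal Python sketch of the idea using a thread pool. It is an illustration rather than the tool’s actual code: upload_chunk is a hypothetical callable standing in for whatever storage client you use, the worker count is an arbitrary assumption, and the demo writes to an in-memory dictionary instead of real blob storage.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def upload_all(chunks, upload_chunk, max_workers=16):
    """Upload (blob_name, data) pairs concurrently.

    upload_chunk is a caller-supplied function taking (blob_name, data).
    Returns a list of uploaded names and a dict of name -> exception.
    """
    uploaded, failed = [], {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            pool.submit(upload_chunk, name, data): name for name, data in chunks
        }
        for future in as_completed(futures):
            name = futures[future]
            try:
                future.result()
                uploaded.append(name)
            except Exception as error:
                # Record the failure so the caller can retry just these chunks.
                failed[name] = error
    return uploaded, failed


# Stand-in for blob storage so the sketch runs without any cloud account.
store = {}


def fake_upload(name, data):
    store[name] = data


chunks = [
    (f"QualityLevels(350000)/Fragments(video={start})", b"chunk bytes")
    for start in range(0, 60_000_000, 20_000_000)
]
uploaded, failed = upload_all(chunks, fake_upload)
print(f"uploaded {len(uploaded)} chunks, {len(failed)} failures")
```

Collecting failures instead of aborting on the first error is a design choice for long uploads: with thousands of chunks, it’s cheaper to retry the handful that failed than to start over.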

Shortcomings of This Approach

To a client doing simple playback, there’s no difference between IIS Smooth Streaming (hosted by IIS Media Services) and Adaptive Streaming with Windows Azure Blobs. However, to the content owner and to the server, there are significant differences:

  • With IIS on the server, scenarios that require server intelligence are possible (like real-time transcoding or encryption).
  • There are fewer files to manage with IIS (since it keeps all the content in a small number of files). This makes copying files around and renaming them much simpler.
  • As future features (like fast-forward and new targets like the iPad) come out, all you need to do is update IIS Media Services to get the new functionality. With a solution like the one described in this post, you’ll need to reprocess existing content.
  • Because the manifest formats for IIS Smooth Streaming are actively evolving, there’s no guarantee that my code will work correctly with future Smooth Streaming clients and content.

Specifically, there are a few features of IIS Smooth Streaming that my code doesn’t handle today:

  1. Trick play (fast-forward and rewind). This is supported under IIS by extracting keyframes from the video. My code doesn’t support extracting these keyframes.
  2. Live Smooth Streaming. Handling a live event (where the manifest is changing and the chunks include extra hints about future chunks) isn’t supported in my code.

The Windows Azure team is still committed to running full IIS Media Services within Windows Azure web roles in the future.

Get the Tool

If you’d like to host Smooth Streaming content in Windows Azure Blobs, please check out the Adaptive Streaming with Windows Azure Blobs Uploader project on Code Gallery, where you can download the command-line tool as well as the full source code.

Sample

Here's Big Buck Bunny served from cdn.blog.smarx.com:

 

[UPDATE 11:29pm PDT] The name of the tool has been changed to “Adaptive Streaming with Windows Azure Blobs Uploader”.