Sun, 31 May 2009

Sample Code for New Windows Azure Blob Features

[UPDATE 10/13/2010] This post is of course out of date by now. All these features are available in the supported StorageClient library, and that’s how you should use them. I’ve also made one fix to the “Copy blob” section: you can indeed copy blobs across containers.

[UPDATE 6/5/2009] This post is just about the new functionality in blob storage. See the next blog post (“Sample Code for Batch Transactions in Windows Azure Tables”) to learn about the new functionality in table storage.

I’m a little bit late on this, but as promised on Thursday on the Windows Azure blog, I’ve written up some code taking advantage of the new copy blob and get committed block list methods in Windows Azure storage. This code comes in the form of a modified copy of the sample storage client library that ships in the Windows Azure SDK. For the impatient, you can go ahead and download the code.

To use the new features, you’ll need this code or your own code like it, because the SDK sample hasn’t yet been updated to include the new functionality. You’ll also need to run against a storage account in the cloud, because the development storage in the SDK has also not yet been updated to match the latest bits in the cloud.

Disclaimer

The standard level of quality of sample code I write is not terribly high, and this is no exception. Please take this code for what it is, a sample of how to use the new functionality. An updated storage client library in the future is the right place to get higher quality code.

The New Features

Copy blob

This method is new. It lets you copy from one blob to another within the same account, including across containers. Over the wire, it takes the form of an HTTP PUT with no body and a header specifying the source blob name.

For full details, see the MSDN documentation for Copy Blob.
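
As a rough illustration of the request shape described above, here is a minimal sketch that builds (but does not send) a Copy Blob request. It is not the code from the download: the class name is hypothetical, it uses HttpRequestMessage rather than the sample library's helpers, it assumes the source header is x-ms-copy-source with a value of the form /account/container/blob, and it leaves out the Authorization header entirely.

```csharp
using System;
using System.Net.Http;

public static class CopyBlobRequestSketch
{
    // Builds (but does not send) a Copy Blob request: a PUT to the destination
    // blob with an empty body and a header naming the source blob.
    // Authentication is deliberately left out of this sketch.
    public static HttpRequestMessage Build(
        string account, string destContainer, string destBlob,
        string sourceContainer, string sourceBlob)
    {
        // Assumes container and blob names need no URL escaping.
        var destination = new Uri(
            $"http://{account}.blob.core.windows.net/{destContainer}/{destBlob}");

        var request = new HttpRequestMessage(HttpMethod.Put, destination);

        // Opt in to the storage version that introduced Copy Blob.
        request.Headers.Add("x-ms-version", "2009-04-14");

        // Assumed header name and value format: /account/container/blob.
        request.Headers.Add("x-ms-copy-source",
            $"/{account}/{sourceContainer}/{sourceBlob}");

        // No payload is sent with the request.
        request.Content = new ByteArrayContent(Array.Empty<byte>());
        return request;
    }
}
```

Because the source is identified by container as well as blob name, the same request shape covers a copy between two containers in one account.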

Get block list

This method was modified since earlier releases to include support for a blocklisttype query parameter. Setting different values for this parameter lets you get the list of committed blocks, uncommitted blocks, or all of them combined.

For full details, see the MSDN documentation for Get Block List.
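
To make the query parameter concrete, here is a small sketch that builds the URI for a Get Block List request. The enum and class names are my own for illustration, the values committed, uncommitted, and all are assumed to be the accepted settings for blocklisttype, and the blob name is assumed not to need URL escaping.

```csharp
using System;

// Hypothetical enum for the three kinds of block list described above.
public enum BlockListType { Committed, Uncommitted, All }

public static class GetBlockListUriSketch
{
    // Builds the URI for a Get Block List request on a single blob.
    public static Uri Build(
        string account, string container, string blob, BlockListType type)
    {
        // The query value is the lowercase form of the enum name.
        string value = type.ToString().ToLowerInvariant();

        return new Uri(
            $"http://{account}.blob.core.windows.net/{container}/{blob}" +
            $"?comp=blocklist&blocklisttype={value}");
    }
}
```

The request itself would be an authenticated GET against that URI, carrying the same x-ms-version header as the other new operations.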

The New Methods

To take advantage of these new features, I added the following five methods to the sample storage client library:

public abstract bool CopyBlob(string to, string from, bool overwrite);
public abstract List<string> GetBlockList(string blobName, string eTag);
public abstract List<string> GetCommittedBlockList(string blobName, string eTag);
public abstract List<string> GetUncommittedBlockList(string blobName, string eTag);
public abstract List<string> GetAllBlockList(string blobName, string eTag);

In the code, these are declared in BlobStorage.cs. They’re implemented in RestBlobStorage.cs with the help of a few additional constants in RestHelpers.cs.

There’s also a console application included that tests the new functionality. To use it, modify Program.cs to use a valid storage account and key. The output of the test program should look like this:

Creating container.
Creating big blob (5MB).
Done.
Committed blocks:
        AAAAAA==
        AQAAAA==
        AgAAAA==
        AwAAAA==
        BAAAAA==
Copying blob...
Done.
Committed blocks:
        AAAAAA==
        AQAAAA==
        AgAAAA==
        AwAAAA==
        BAAAAA==
Deleting test container.
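
The block names in this output are consistent with Base64-encoded, 32-bit little-endian block indexes 0 through 4. That encoding is inferred from the output rather than taken from the sample's source, so treat the following as a sketch of one scheme that reproduces those names; the class and method names are hypothetical.

```csharp
using System;

public static class BlockIdSketch
{
    // Encodes a block index as the Base64 form of its four bytes,
    // least significant byte first (little-endian).
    public static string FromIndex(int index)
    {
        byte[] bytes =
        {
            (byte)index,
            (byte)(index >> 8),
            (byte)(index >> 16),
            (byte)(index >> 24)
        };
        return Convert.ToBase64String(bytes);
    }
}
```

With this scheme, BlockIdSketch.FromIndex(0) gives AAAAAA== and BlockIdSketch.FromIndex(4) gives BAAAAA==, matching the first and last names listed above.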

[UPDATE 6/5/2009] The test program now generates some more output based on entity group transactions. See the next blog post (“Sample Code for Batch Transactions in Windows Azure Tables”) for details.

Miscellaneous Details

As part of the release of the new storage code, a versioning header has been introduced. Setting the header x-ms-version to 2009-04-14 unlocks the new functionality, so I added a line to Utilities.CreateHttpRequest in RestHelpers.cs that adds this header to all HTTP requests.
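
The idea of stamping every request with the version header can be sketched as below. This is not the actual body of Utilities.CreateHttpRequest; it uses HttpRequestMessage as a stand-in, and the class and member names are hypothetical.

```csharp
using System;
using System.Net.Http;

public static class VersionedRequestSketch
{
    // The version string that unlocks the new blob functionality.
    public const string StorageVersion = "2009-04-14";

    // Creates a request and adds the versioning header, so that every
    // caller of this one factory method gets the header automatically.
    public static HttpRequestMessage Create(HttpMethod method, Uri uri)
    {
        var request = new HttpRequestMessage(method, uri);
        request.Headers.Add("x-ms-version", StorageVersion);
        return request;
    }
}
```

Putting the header in a single factory method means individual operations never have to remember to add it.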

I also exposed the Get Block List functionality publicly for the first time; it previously existed in the storage client library only as a private method. I changed how it works: it used to return a list of integers, each representing the size of a block. Now it returns a list of strings, each of which is the name of a block. This seems more natural to me, but be aware of that change.

I did nothing smart in this code to take advantage of the new functionality to get the committed block list for a blob. The most obvious use of this would be to make a smart blob uploader that automatically resumes an upload without having to keep local state to know which blocks were successfully uploaded. (It can just query the server.)
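
The core of such an uploader is a small set difference: the blocks you plan to upload minus the blocks the server already reports. Here is a minimal sketch of just that step, with hypothetical names; it does no network I/O and is not part of the download.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ResumableUploadSketch
{
    // Returns the planned block names that the server does not yet report,
    // preserving the planned upload order.
    public static List<string> BlocksStillNeeded(
        IEnumerable<string> plannedBlocks, IEnumerable<string> blocksOnServer)
    {
        // Block names are compared exactly (ordinal, case-sensitive).
        var present = new HashSet<string>(blocksOnServer);
        return plannedBlocks.Where(name => !present.Contains(name)).ToList();
    }
}
```

In practice, blocksOnServer would come from one of the block list methods above. Which list is the right one to query depends on whether the interrupted upload had already committed its blocks, so treat that choice as an open design question rather than something this sketch settles.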

Download

You can find the full source code (updated storage client library and console test application) here: http://cdn.blog.smarx.com/files/StorageMayFeatures.zip. Enjoy!