Mon, 29 Aug 2011

Memcached in Windows Azure

I’ve just published two new NuGet packages: WazMemcachedServer and WazMemcachedClient. These make it drop-dead simple to add memcached to a Windows Azure application in a way that takes advantage of Windows Azure’s dynamic scaling, in-place upgrades, and fault tolerance.

Why Memcached?

Windows Azure has a built-in distributed cache solution (Windows Azure Caching), which is a great option for .NET developers who want to easily add a cache to their Windows Azure application. However, I’ve heard from some customers who would like to use memcached.

One scenario in particular that I think is a great fit for memcached is reusing existing RAM. For example, you may have spare RAM on your web role instances, and adding memcached to them could give you an in-memory cache without adding any VMs (and thus without adding any cost). Note that Windows Azure Caching has a fantastic “local cache” option, but that still requires that a remote cache is provisioned, and the local cache is not shared (it’s per-instance).

Another reason some people choose memcached is so they can hand-tune their cache. This isn’t for the faint of heart, but it’s a nice option for people who are already experts in tuning memcached for their particular workload (perhaps changing the minimum space allocated per key).

How Does it Work?

The server-side implementation is simple: it just launches memcached, listening on an internal endpoint. The client side is where a bit of work is done. I wanted a client that met two goals:

Use consistent hashing, which minimizes the disruption of adding and removing servers.
Respond automatically when servers are added to and removed from the cluster (during scaling, upgrades, or failures).
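
To make the first goal concrete, here is a minimal sketch of a consistent-hash ring. This is purely illustrative and is not Enyim's implementation: the HashRing class, the use of MD5, and the 100 points per server are all assumptions of mine. The point it shows is that removing a server only remaps the keys that lived on that server.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical illustration of consistent hashing (not Enyim's code).
public class HashRing
{
    // Points on the ring, sorted by hash value, each owned by one server.
    private readonly SortedDictionary<uint, string> ring = new SortedDictionary<uint, string>();
    private readonly int pointsPerServer;

    public HashRing(IEnumerable<string> servers, int pointsPerServer = 100)
    {
        this.pointsPerServer = pointsPerServer;
        foreach (var server in servers) Add(server);
    }

    // Place several points per server so keys spread more evenly.
    public void Add(string server)
    {
        for (int i = 0; i < pointsPerServer; i++) ring[Hash(server + "#" + i)] = server;
    }

    public void Remove(string server)
    {
        for (int i = 0; i < pointsPerServer; i++) ring.Remove(Hash(server + "#" + i));
    }

    // A key belongs to the first point at or after its hash, wrapping around.
    // (A linear scan keeps the sketch short; a real ring would binary-search.)
    public string GetServer(string key)
    {
        if (ring.Count == 0) throw new InvalidOperationException("No servers in the ring.");
        uint hash = Hash(key);
        foreach (var point in ring)
        {
            if (point.Key >= hash) return point.Value;
        }
        return ring.First().Value;
    }

    private static uint Hash(string text)
    {
        using (var md5 = MD5.Create())
        {
            return BitConverter.ToUInt32(md5.ComputeHash(Encoding.UTF8.GetBytes(text)), 0);
        }
    }
}
```

With a naive "hash modulo server count" scheme, removing one of three servers would move most keys. With the ring above, a key that was on a surviving server stays where it was.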

The first goal is met by basing the solution on the Enyim memcached client, which uses consistent hashing by default. The second goal meant extending Enyim in the form of a custom IServerPool implementation called WindowsAzureServerPool. This code regularly looks for newly added or removed Windows Azure instances and reconfigures the memcached client automatically. Importantly, it doesn’t just try to use new Windows Azure instances when they’re first added. It waits until the instance is accepting connections before trying to use it as a cache server.
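
The "wait until the instance is accepting connections" check can be sketched roughly like this. This is not the actual WindowsAzureServerPool code; EndpointProbe and IsAcceptingConnections are names I made up for illustration. In a role, the endpoint to probe would typically come from the service runtime (for example, an instance's InstanceEndpoints["Memcached"].IPEndpoint), but the sketch takes a plain IPEndPoint so it stands alone.

```csharp
using System;
using System.Net;
using System.Net.Sockets;

// Hypothetical helper: decides whether a discovered instance is ready to be used as a cache server.
public static class EndpointProbe
{
    // Returns true if a TCP connection to the endpoint succeeds within the timeout.
    public static bool IsAcceptingConnections(IPEndPoint endpoint, TimeSpan timeout)
    {
        using (var client = new TcpClient())
        {
            try
            {
                IAsyncResult pending = client.BeginConnect(endpoint.Address, endpoint.Port, null, null);
                if (!pending.AsyncWaitHandle.WaitOne(timeout))
                {
                    return false; // timed out: the instance is not ready yet
                }
                client.EndConnect(pending); // throws SocketException if the connection was refused
                return true;
            }
            catch (SocketException)
            {
                return false;
            }
        }
    }
}
```

A server pool could run a check like this on a timer and only add an instance to the pool once it returns true.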

The package is based on code that Channel 9 uses. Big thanks to Mike Sampson from the Channel 9 team for helping with this.

Setting up the Servers

You can run memcached on any role (web or worker). In a heavy-duty distributed cache, you’ll probably create a dedicated worker role just for caching, but in a lot of web applications, you might simply add memcached to your web role. In either case, there are three steps to getting memcached up and running:

1. Use NuGet to install the WazMemcachedServer package. (From the Package Manager Console, this is just install-package WazMemcachedServer.) This adds the memcached binaries (1.4.5 Windows binaries from Couchbase) and a small helper class for launching them.
2. Create an internal TCP endpoint for memcached to listen on. (I usually call this “Memcached”.) You can do this through the Visual Studio UI (double-click on the role and pick “Endpoints” on the left) or by adding it directly to ServiceDefinition.csdef.
3. Add code to your WebRole.cs or WorkerRole.cs to launch and monitor the memcached process:
```
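// Handle to the memcached process started in OnStart.
Process proc;

public override void Run()
{
    // Block until memcached exits; returning from Run causes the role instance to be recycled.
    proc.WaitForExit();
}

public override bool OnStart()
{
    // "Memcached" is the internal endpoint name; 512 is the cache size in megabytes.
    proc = WindowsAzureMemcachedHelpers.StartMemcached("Memcached", 512);
    return base.OnStart();
}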
```
The first parameter is the name of the endpoint you created in step #2, and the second parameter is the amount of RAM (in megabytes) you want to dedicate to memcached. Note that my Run method is just waiting (hopefully forever) for the memcached process to exit. This way, if memcached crashes, so will your role instance, allowing Windows Azure to restart everything for you. If you’re doing other things in your role’s Run method, you might want to instead use the process’s Exited event to react to the process crashing.
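
If your Run method is busy with other work, the Exited event approach might look something like the sketch below. ProcessWatcher and NotifyOnExit are hypothetical names, not part of the package. The handler is attached before EnableRaisingEvents is set so that a process that has already exited is not missed.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: react to a process exiting without blocking the calling thread.
public static class ProcessWatcher
{
    // Calls onExit with the exit code once the process has exited.
    public static void NotifyOnExit(Process process, Action<int> onExit)
    {
        process.Exited += (sender, e) => onExit(process.ExitCode);
        process.EnableRaisingEvents = true; // Exited only fires when this is true
    }
}

// Possible use in OnStart, right after StartMemcached (assumes a reference to
// Microsoft.WindowsAzure.ServiceRuntime for RoleEnvironment):
//
//     ProcessWatcher.NotifyOnExit(proc, exitCode => RoleEnvironment.RequestRecycle());
```

Requesting a recycle in the handler keeps the same "if memcached crashes, restart the instance" behavior as the blocking version.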

At this point, all instances of this role will be running memcached listening on an internal endpoint.
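
For reference, if you add the internal endpoint by hand rather than through the Visual Studio UI, the relevant fragment of ServiceDefinition.csdef would look roughly like this. The role name "WorkerRole" is an assumption that matches the client examples in this post, and the other elements of the role definition are omitted.

```xml
<WorkerRole name="WorkerRole">
  <Endpoints>
    <!-- Internal endpoints are reachable only from other instances in the same deployment. -->
    <InternalEndpoint name="Memcached" protocol="tcp" />
  </Endpoints>
</WorkerRole>
```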

Setting up the Client

To make use of your new cluster of memcached servers from your code, you’ll need a client that knows how to find the memcached server instances, even when they come and go due to scaling and upgrades. Setting that up is easy:

1. Install the WazMemcachedClient package via install-package WazMemcachedClient. This will add a couple of classes that extend the Enyim memcached client to discover and use the memcached servers you’ve set up.
2. Create a MemcachedClient in your code that you’ll reuse throughout the application’s lifecycle to talk to memcached. In a web app, you might put this in a static variable in your ASP.NET MVC controller:
```
static MemcachedClient client = WindowsAzureMemcachedHelpers.CreateDefaultClient(
    "WorkerRole", "Memcached");
```
The first parameter is the name of the role running memcached, and the second parameter is the name of the internal endpoint on which memcached is listening. Another great place to initialize the client is in Application_Start:
```
Application["memcache"] = WindowsAzureMemcachedHelpers.CreateDefaultClient("WorkerRole", "Memcached");
```
Then you can access it via Application["memcache"] from anywhere in your code, casting it back to MemcachedClient.

Once you’ve done the above two steps, you can use the MemcachedClient you’ve created to perform any memcached operations. For example:

```
string value = client.Get(key) as string;
if (value == null)
{
    value = FetchFromStorage(key);
    client.Store(StoreMode.Set, key, value);
}
return value;
```

Downloads

The NuGet packages ship as source code, so you can read all of the code (and make changes) by installing the two packages: WazMemcachedServer and WazMemcachedClient.