Tim Jeanes: TechEd 2008 - Day 5

DAT317 - Project "Velocity": A First Look at Microsoft's Distributed Caching Framework

Velocity is Microsoft's upcoming caching framework. It's a distributed in-memory cache for all kinds of data - CLR objects, rows, XML, binary data, etc. - so long as you can serialize it, you can cache it.

This allows you to share cached information over a number of web servers, providing fast access to the data held in the cache. The cache itself is also spread over a number of machines, but you never need to know which server is holding your cached item. Velocity supports failover too, so if one of the cache hosts goes down, your cached item is still available from another host.

Objects are stored in the cache as named items, so the cache acts like a dictionary. Simple Get and Put commands can be used for basic access. Optimistic concurrency can be used: Velocity tracks a version number for each item internally and only lets you update an item if there have been no other updates to it since it was retrieved. More pessimistically, you can use GetAndLock/PutAndUnlock/Unlock to prevent other threads or other servers updating items you're using. On top of the name of each item you can apply tags, letting you retrieve sets of data with the same tag all in one go.
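A rough sketch of what this looks like against the CTP API - the class and method names here may well change before release, and Customer is a made-up serializable type:

```csharp
// Sketch only: Velocity CTP API names, plus an assumed Customer type.
CacheFactory factory = new CacheFactory();
Cache cache = factory.GetCache("default");

// Basic dictionary-style access.
cache.Put("customer:42", customer);
Customer c = (Customer)cache.Get("customer:42");

// Pessimistic locking: nobody else can update the item until we unlock it.
LockHandle handle;
c = (Customer)cache.GetAndLock("customer:42", TimeSpan.FromSeconds(30), out handle);
cache.PutAndUnlock("customer:42", c, handle);

// Tags let you retrieve every item carrying the same tag in one go.
// (In the CTP this may require putting the items into a named region.)
cache.Put("customer:42", c, new[] { new Tag("gold-customers") });
```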

As you'd expect, you can configure the cache to specify how long it holds on to items before they expire and have to be retrieved from the backend database again. A nice feature is that it also supports notifications, so that if you update an item in the cache, that cache host notifies all others that the item has been changed. Of course this model has a bit more overhead, so which to use depends on how time-critical the accuracy of your data is.

Velocity also supports a SessionStoreProvider, making it easier to share session data over a number of web servers. This is about 200% faster than using SQL Server to store session state. Again this supports failover, so even if the primary machine that's storing your shopping cart goes up in smoke, that's not going to stop you spending money.
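Wiring the session provider up should just be a web.config change - something along these lines, though the exact type and attribute names are an assumption based on the CTP:

```xml
<!-- Hypothetical web.config snippet: provider type name may differ in the CTP. -->
<sessionState mode="Custom" customProvider="Velocity">
  <providers>
    <add name="Velocity"
         type="Microsoft.Data.Caching.DataCacheSessionStoreProvider"
         cacheName="session" />
  </providers>
</sessionState>
```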

PDC309 - Designing applications for Windows Azure

This was a great session because it filled in a lot of the detail of things we'd learnt about Azure earlier this week, and because it gave some coding best practices for working on the cloud. The presenter has already deployed a number of real-life Azure-based applications, so it's great to be able to pick up what he's learnt the hard way.

I was a little surprised at some of the limitations of table storage. You can only have 255 properties per entry, which seems a bit arbitrarily small. Properties on table entities can only be the basic types (int, string, DateTime, Guid, etc.), as you'd expect, but there's a field size limit of 64K for strings and byte arrays. Each entity is limited to 1MB in total, though I suppose that makes sense - once your objects are getting that large they probably belong in a blob anyway. On the other hand, I'm glad to hear that they will be adding index support to table entity properties.

For now you only get indexes on the row key and the partition key (both of which are strings - up to 64KB each). As a pair together, they make the primary key for the table, which I think is perhaps a little awkward: surely the row key by itself should be enough? It would make cross-partition joins much easier, and we all love GUIDs anyway. However, I suppose this does push you in the right direction: cross-partition joins can potentially be more expensive (because Azure will always ensure each partition is held entirely on one physical machine), so I guess you should avoid them where possible.
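As a sketch, a table entity under the CTP's StorageClient sample library looks something like this - the base class and constructor shape are from the sample bits and may change, and CustomerEntity itself is made up:

```csharp
// Sketch of a table storage entity. Every entity carries the
// PartitionKey/RowKey pair that together form its primary key.
public class CustomerEntity : TableStorageEntity
{
    // Partitioning by country here is just an illustration - Azure keeps
    // each partition on one physical machine, so choose the key to match
    // your query patterns.
    public CustomerEntity(string country, string customerId)
        : base(country, customerId) { }

    // Only simple types are allowed, and strings are capped at 64KB.
    public string Name { get; set; }
    public DateTime Joined { get; set; }
}
```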

As you don't get indexes on other properties (for the CTP, at least) this means you have to denormalise your data all over the place if you're going to want to do any kind of searching. More critically, as table storage has no support for transactions, it looks like we'll be waiting for SQL Data Services to come out before we try any real-life database-type storage in the cloud.

We also took quite an in-depth look at queues and how to process them in worker roles.

There is absolutely no limit to the number of queues you can have, so go crazy! One queue for each customer would be entirely feasible. The only restriction is the name of the queue: as queues can be accessed in a RESTful way, the name has to be something that you can use as part of a URI.

I quite like the way messages are handled as they are read from a queue. When a message is read, it is marked as invisible so that other worker roles don't pick it up. If the role that took it dies and never deletes the message, it becomes visible again after a configurable timeout period. The downside is that if a malformed message always causes an exception, it remains on the queue forever. The way to handle this is to check the time the message was created when you read it from the queue. If it's been there for a long time, delete it from the queue. What you do next depends on your application - you could log the problem, put the bad message onto a "bad messages" queue, or whatever.
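That poison-message check comes out as something like this - a hedged sketch, since the queue API and the InsertionTime property are assumptions based on the CTP sample library, and the one-hour cutoff is just an illustrative choice:

```csharp
// Guard against "poison" messages that always throw during processing.
Message msg = queue.GetMessage(visibilityTimeout);
if (msg != null)
{
    if (DateTime.UtcNow - msg.InsertionTime > TimeSpan.FromHours(1))
    {
        // Been on the queue far too long - assume it's malformed.
        badMessageQueue.PutMessage(msg);   // or just log the problem
        queue.DeleteMessage(msg);
    }
    else
    {
        Process(msg);
        queue.DeleteMessage(msg);
    }
}
```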

It's possible (and more efficient) to read a number of messages from a queue at once, then process and delete them one at a time. Once the messages have been processed, you loop the worker role back to the start again - just by using while(true) - but sleep the Thread for a bit first if the queue was empty.
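Put together, the worker role loop looks roughly like this - method names are from the CTP sample SDK and may differ, and the batch size and sleep interval are arbitrary:

```csharp
// Minimal worker role loop: read a batch, process and delete each
// message, and sleep briefly when the queue turns out to be empty.
public override void Start()
{
    MessageQueue queue = QueueStorage.Create(account).GetQueue("orders");
    while (true)
    {
        bool sawWork = false;
        foreach (Message msg in queue.GetMessages(10))
        {
            sawWork = true;
            Process(msg);
            queue.DeleteMessage(msg);
        }
        if (!sawWork)
            Thread.Sleep(TimeSpan.FromSeconds(5));  // don't spin on an empty queue
    }
}
```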

One point that was mentioned a few times was that you have to code to accommodate failure. Worker roles may die unexpectedly part way through processing a message, for example, leaving your table storage in a half-baked state. The fact that everything is stateless coupled with the lack of transaction support on the table storage means there's a whole bunch of craziness you'll have to code around.

The worker role has a property that returns its health status - you have to implement this yourself. An easy check is to see how long you've been processing the current message and flag yourself as unhealthy if it looks like you're stuck in an infinite loop. The health property does nothing in the current CTP release - in future Azure will monitor it to choose when to kill and restart the worker role.

One little gotcha: when calling DateTime.Now in the cloud, remember you don't know where your code is running, so you can assume nothing about what timezone you're in. Always use DateTime.UtcNow instead.

DVP315 - Do's and Don'ts in Silverlight 2 Applications

As we're just starting out in Silverlight 2, this was a really handy session. We got some tips from someone who's already made all our mistakes for us. Here they are:


- Don't leave the default "Install Silverlight" message.

It's rubbish and if your user hasn't heard of Silverlight then they're not going to install it. Instead display an image that shows the user what sort of thing they'll see in the real application, with a message telling them what they have to do.

- Don't take a long time to load.

Instead load assets after the main app has loaded. Also cache assets in isolated storage so they don't have to be downloaded again next time. That way if there's no internet access, we can look in the local cache. If it's there, your app can work offline.

- Don't leave the default splash screen.

- Don't resize videos.

Resizing videos is very costly and wasteful in terms of CPU time, so show videos at their natural size only. Similarly, animating the size of text or complicated paths is also CPU intensive. To animate the size of text, build a vector from the text and scale that.

- Don't use a transparent background for the Silverlight app.

It's very expensive. Admittedly it can look great though...

- Don't use opacity alone to hide something.

If you fade something out, remember to set Visibility=Visibility.Collapsed at the end.
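In code that's just a matter of hooking the end of the fade-out animation - a sketch, with made-up storyboard and element names:

```csharp
// Once the fade-out finishes, collapse the element so it no longer
// costs anything to render or hit-test.
fadeOutStoryboard.Completed += (s, e) =>
{
    myPanel.Visibility = Visibility.Collapsed;
};
fadeOutStoryboard.Begin();
```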


- Show the designer mock data for use in Blend.

You can detect where a user control is being rendered by examining DesignerProperties.GetIsInDesignMode(this). Elsewhere you can check HtmlPage.IsEnabled.
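For example - MockCustomerSource and CustomerSource here are hypothetical data-source classes, but the two checks are the real ones:

```csharp
// Serve static sample data to Blend / the Visual Studio designer,
// and live data when actually running in the browser.
if (DesignerProperties.GetIsInDesignMode(this))
{
    DataContext = new MockCustomerSource();   // design-time mock data
}
else if (HtmlPage.IsEnabled)
{
    DataContext = new CustomerSource();       // real data in the browser
}
```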

- Use EnableFrameRateCounter=true when debugging.

This shows the current framerate in the status bar of the browser. Set the maximum framerate to something very high so you get an accurate indication of how your application's performing.
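Both settings can be flipped on from the App startup handler - a sketch, with 1000 as an arbitrarily high cap:

```csharp
// Turn the framerate counter on while debugging, and raise the cap so
// the reading isn't clipped by the default maximum framerate.
private void Application_Startup(object sender, StartupEventArgs e)
{
    Host.Settings.EnableFrameRateCounter = true;
    Host.Settings.MaxFrameRate = 1000;
    RootVisual = new Page();
}
```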

- Use BackgroundWorker or asynchronous calls.

The UI stops redrawing whilst code is running on the same thread.
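A minimal BackgroundWorker sketch - LoadReportData and resultsList are made-up names:

```csharp
// Keep slow work off the UI thread so the UI keeps redrawing.
BackgroundWorker worker = new BackgroundWorker();
worker.DoWork += (s, e) =>
{
    e.Result = LoadReportData();              // runs on a background thread
};
worker.RunWorkerCompleted += (s, e) =>
{
    resultsList.ItemsSource = (IEnumerable)e.Result;  // back on the UI thread
};
worker.RunWorkerAsync();
```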

- Use xperf to monitor performance.

This is a really handy tool that monitors and records the CPU usage of all processes. You can drill right down to individual procedure calls within your Silverlight app, showing you exactly what part of your application is being resource-hungry. Remember to run it as administrator.

WIN312 - Deep Dive into "Geneva" Next Generation Identity Server and Framework

I'd heard references to Geneva and its claims-based identity model, so I thought I'd check it out in a bit more depth. At its most fundamental, claims-based identity supplies a series of flags that describe attributes of a user. Who provides the claims for a user can vary, but so long as your application gets the claims from a trusted source, you're good to go.

Under the Geneva framework, claims are passed around as security tokens, so are cryptographically protected by the claims issuer (normally Geneva Server). The Geneva framework takes care of all the cryptography for you.

The IClaimsIdentity and IClaimsPrincipal interfaces are available in .NET to code against, and are compatible with the existing role-based security model. The Geneva framework converts information such as roles from Active Directory into claims.

The framework is extensible, so you can communicate with any type of issuer that you like (or that you invent), just so long as you have some way to convert their security token to a string.

You can add your own Claims Authentication Manager, where you can transform one set of claims to another. For example, given a claim that the user is in the UK and a claim that states they're over 18, you can emit a claim that they are legally allowed to buy alcohol.
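That alcohol example comes out as something like the following - a hedged sketch, since the class and method shape are from the Geneva framework beta and may change, and the claim-type URIs are invented for illustration:

```csharp
// Transform incoming claims into an application-specific claim.
public class DrinkingAgeClaimsManager : ClaimsAuthenticationManager
{
    public override IClaimsPrincipal Authenticate(
        string resourceName, IClaimsPrincipal incoming)
    {
        IClaimsIdentity identity = (IClaimsIdentity)incoming.Identity;

        bool inUk = identity.Claims.Any(c =>
            c.ClaimType == "urn:example:country" && c.Value == "UK");
        bool over18 = identity.Claims.Any(c =>
            c.ClaimType == "urn:example:over18" && c.Value == "true");

        // Given "in the UK" and "over 18", emit "may buy alcohol".
        if (inUk && over18)
            identity.Claims.Add(new Claim("urn:example:canBuyAlcohol", "true"));

        return incoming;
    }
}
```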

This makes it really easy to define as many of your own application-specific claims as you like, and as Geneva already communicates with Active Directory, Live ID and CardSpace, quite a range of login routes are available right out of the box.

Geneva is a Windows component so is essentially free to use (or will be once it's fully live some time next year).