Thursday, March 15, 2012

Web Sockets and Statelessness

The web works better when it's stateless, or so the theory goes.  Stateless here means that the client and server are largely decoupled from one another.  When applications are stateless, I can start a session on one computer, do something, and start another session elsewhere without worry.  Put another way, in the context of web applications, we're talking about stateless clients: all the application data resides on the server.

Statelessness matters because it's how we're able to scale to many concurrent users.  Scaling would be a daunting task if the server had to keep track of every action the user performed.  And so the web scales through simplicity, by transferring the state of applications over HTTP.

Now we're seeing new protocols for how web application clients communicate with the server.  You want to use web sockets when interactivity is important, that is, when updates in application state are unpredictable and thus difficult to publish over HTTP.  Intuitively, we'd like to use an abstraction that keeps the communication channel with the server open and listens for incoming data.  What, if anything, does that mean for statelessness, and how well will it scale, if at all?

When I first started experimenting with web socket technology, I balked at the prospect of passing state back and forth rather than in a single direction.  But once you start using web sockets, it's really not that far a stretch from the tradition of HTTP.  It's just a communication protocol, and we can still apply RESTful principles to it.

Ask Now, Answer Later
A key advantage of web socket abstractions is the ability to ask a question and not have to wait around for the answer.  If you need to get a list of objects, you issue the command to fetch those objects over a web socket.  The web socket connection, at an indeterminate point in time after the command was issued, will receive the list of objects.  The client's web socket connection then passes this data into the JavaScript code, which can handle it however it wants.
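The ask-now, answer-later flow can be sketched roughly as follows.  This is only an illustration in Python, not real web socket code: `FakeSocket`, the message fields (`id`, `command`, `objects`) and the canned server reply are all assumptions made up for the example, with two in-memory queues standing in for the open connection.

```python
import asyncio
import json


class FakeSocket:
    """Hypothetical stand-in for a web socket: one in-memory queue per direction."""

    def __init__(self):
        self.to_server = asyncio.Queue()
        self.to_client = asyncio.Queue()


async def server(sock):
    # Hypothetical server: answers every command a little later with a fixed list.
    while True:
        message = json.loads(await sock.to_server.get())
        await asyncio.sleep(0.01)  # the answer arrives at some later point
        reply = {"id": message["id"], "objects": ["a", "b", "c"]}
        sock.to_client.put_nowait(json.dumps(reply))


async def ask_now_answer_later():
    sock = FakeSocket()
    server_task = asyncio.create_task(server(sock))

    received = []
    # Map each outstanding command id to the code that will handle its answer.
    handlers = {1: received.append}

    # Ask: issue the command and carry on without waiting for the result.
    sock.to_server.put_nowait(json.dumps({"id": 1, "command": "list"}))
    nothing_yet = received == []

    # Later: the listening side of the socket gets the data and dispatches it.
    reply = json.loads(await sock.to_client.get())
    handlers.pop(reply["id"])(reply)

    server_task.cancel()
    return nothing_yet, received
```

The command id is what ties a late answer back to the question that caused it; without something like it, the client would have no way to tell which handler an incoming message belongs to.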

What does this mean for the application running on the server that has to eventually respond to these socket requests?  Is this new technology somehow fixing a bottleneck in how requests block others from being fulfilled?  Not exactly.  A traditional HTTP request only makes it appear as though the server is devoting significant effort to satisfying your request, when in fact it's probably handling many requests simultaneously.  Granted, I'm sure there are plenty of cases where longer-running requests actually affect the duration of others in the system.  However, these HTTP requests need a response in order to complete the workflow.  HTTP requests expect to be acknowledged somehow.  Sockets, on the other hand, we can fire and forget.

Once a web socket request lands on its target server infrastructure, it goes about its chores like any other client-server request.  It even responds to the client.  How this is actually accomplished largely depends on the web application framework you're using.  For example, maybe the incoming web socket request spawns a new handler thread, within which the connection is available for writing back to the client.  Or maybe you're not given any sort of client connection you can update as simply as a file object.  Maybe you have to go find it based on the request's session.  The end result, of course, is that you're no longer dealing with the traditional request handler abstraction where you can forget about the client once you've honoured your duties.  And this can make complexity a real challenge.
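The two server-side situations described above might look something like this.  It is a sketch under assumptions, not any particular framework's API: `Connection`, `ConnectionRegistry` and both handler functions are hypothetical names invented for illustration.

```python
class Connection:
    """Hypothetical open web socket; it just records what gets written to it."""

    def __init__(self):
        self.sent = []

    def write(self, message):
        self.sent.append(message)


def handle_with_connection(connection, request):
    # Style 1: the framework hands the handler the connection directly,
    # so replying is as simple as writing to a file object.
    connection.write({"reply_to": request["id"], "status": "ok"})


class ConnectionRegistry:
    """Style 2: the server has to remember open connections, keyed by session."""

    def __init__(self):
        self._by_session = {}

    def register(self, session_id, connection):
        self._by_session[session_id] = connection

    def unregister(self, session_id):
        self._by_session.pop(session_id, None)

    def send(self, session_id, message):
        # Returns False when the client has gone away in the meantime.
        connection = self._by_session.get(session_id)
        if connection is None:
            return False
        connection.write(message)
        return True


def handle_with_session(registry, session_id, request):
    # The handler only knows the session, so it must go find the connection.
    return registry.send(session_id, {"reply_to": request["id"], "status": "ok"})
```

The registry is where the extra complexity shows up: the server now holds per-client state, and something has to call `unregister` when a connection closes.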

With the flexibility that web sockets afford an application come new obstacles that we're not overly concerned with, if at all, in our more traditional HTTP applications.  Once you've managed to wrap your head around the never-ending-connection philosophy, you'll start to realize that these sockets really do solve problems that would otherwise need ugly band-aid work-arounds in HTTP land.  For instance, a command issued by the client might want to send sparse feedback, with a second or two in between each update.  This type of thing one can do blindfolded with web sockets.  It's the other little hindrances that'll make for an interesting experience, like the softening of the command-response architecture rules.  Correspondingly, the stateless style of application we're used to building suddenly becomes foreign to us.
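Sparse feedback from a single command could be sketched like this.  The `send` callable stands in for writing to the client's open socket, and the message shapes (`progress`, `done`) and the shortened delay are assumptions chosen for the example.

```python
import asyncio


async def run_command(send, steps=3, delay=0.01):
    """Hypothetical long-running command that reports back as it goes."""
    for step in range(1, steps + 1):
        await asyncio.sleep(delay)  # stands in for a second or two of real work
        await send({"type": "progress", "step": step, "of": steps})
    await send({"type": "done"})


async def demo():
    messages = []

    async def send(message):
        # In a real application this would write to the client's web socket.
        messages.append(message)

    await run_command(send)
    return messages
```

One command produces several messages here, which is exactly the shape a single HTTP response struggles to express.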

State Delivery
RESTful architectures push for statelessness by transferring the current state of data over the wire and to the client.  There is no such thing as state that remains in the web browser, aside from what the user is currently viewing or interacting with.  So my nagging concern about web sockets has always been along the lines of: how can we achieve something similar to stateless user interfaces?  I'm guessing I'm not alone in not wanting to throw out my hard-earned experience with designing resilient systems that scale based on the principle of statelessness.

I've now come to understand that we don't have to throw anything out: web sockets aren't out to destroy our grounded beliefs in the command-response architecture.  No, web sockets are simply a delivery mechanism whereby state is transferred to some agent listening for it.  That's not so different from HTTP, only improved for certain situations, like when state changes rapidly, more than once during a single command.  We need an intuitive method of transferring these changes in state back to the client without shooting ourselves in the foot with traditional HTTP modes.

I would say the frequency of state changes in your application domain will serve nicely as an indicator of how necessary web sockets are.  If it is frequent enough, perhaps web sockets are the answer.  Perhaps they'll make developing the user interface code that much easier by not having to resort to polling techniques.  Because when you start implementing polling yourself, you're emulating the user hitting the refresh button.  If it seems silly to you that the user should sit in her chair and click refresh constantly, it may be time to consider building with an abstraction better aligned with the model your data is asking for.
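To make the comparison between polling and pushing concrete, here is a toy simulation.  Everything in it is an assumption for illustration: time is modelled as discrete ticks, the polling client asks once per tick, and the server's state is a simple counter that changes on the ticks listed in `change_at`.

```python
def simulate(ticks, change_at):
    """Compare a client that polls every tick with a server that pushes on change."""
    state = 0
    poll_requests = 0   # how many times the polling client "hit refresh"
    poll_seen = []      # the changes the polling client actually discovered
    pushed = []         # the messages a web socket server would have sent
    last_polled = state

    for tick in range(ticks):
        if tick in change_at:
            state += 1
            pushed.append(state)  # push: one message per change in state

        poll_requests += 1        # poll: one request per tick, changed or not
        if state != last_polled:
            poll_seen.append(state)
            last_polled = state

    return poll_requests, poll_seen, pushed


poll_requests, poll_seen, pushed = simulate(ticks=10, change_at={3, 7})
print(poll_requests, poll_seen, pushed)  # 10 [1, 2] [1, 2]
```

Both clients end up seeing the same two changes, but the polling client made ten requests to get there while the pushing server sent two messages.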