Tuesday, March 24, 2009

Why we need a thread-safe publish/subscribe event system

Publish-subscribe event systems are a fairly common design pattern in modern computing. The concept becomes increasingly powerful in distributed systems, where many nodes can subscribe to an event or topic emitted from a single node. The name publish-subscribe, or pub-sub, is used because it has a close analogue in the real world: people with magazine or newspaper subscriptions receive updates when something is published. Because of this analogue, developers can more easily reason about events and why they occurred in complex software systems. In any given software system, some code will need to react to one or more events. These events can range from something as simple as a mouse click to a complete database failure. The publish-subscribe pattern is highly extensible because any number of observers may subscribe to a single event. Subscriptions can also be canceled, so the architecture scales in both directions: up and down.

One bottleneck in a publish-subscribe framework occurs when the publishing object has to wait until all subscribers have finished reacting to the event. In some cases this is unavoidable, such as when the publisher expects a value to be returned from one of the subscribers. In other cases, however, the publisher doesn't care about the subscribers or how they react to a published event. In a localized publish-subscribe system, that is, not a distributed one, we could run subscribers in threads. If we were to build and use such a framework, where subscriptions react to events in separate threads of control, we would also need the ability to turn threading off and have the same code remain functional. This is because threading is simply not an option in every scenario.
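To make the idea concrete, here is a minimal sketch of a local publish-subscribe bus whose subscribers can run either in their own threads or in the publisher's thread. The class name EventBus, its methods, and the threaded flag are all hypothetical names invented for this illustration; this is not the API of any particular library.

```python
import threading


class EventBus:
    """Hypothetical minimal publish-subscribe bus, for illustration only."""

    def __init__(self, threaded=True):
        self.threaded = threaded      # switch threading on or off
        self._subscribers = {}        # topic -> list of callbacks
        self._lock = threading.Lock() # guards the subscriber table

    def subscribe(self, topic, callback):
        with self._lock:
            self._subscribers.setdefault(topic, []).append(callback)

    def unsubscribe(self, topic, callback):
        with self._lock:
            self._subscribers.get(topic, []).remove(callback)

    def publish(self, topic, payload):
        # Copy the list under the lock so callbacks run without holding it.
        with self._lock:
            callbacks = list(self._subscribers.get(topic, []))
        if not self.threaded:
            # Non-threaded mode: the publisher waits for every subscriber.
            for callback in callbacks:
                callback(payload)
            return []
        # Threaded mode: each subscriber reacts in its own thread and the
        # publisher returns immediately with the started threads.
        threads = [threading.Thread(target=callback, args=(payload,))
                   for callback in callbacks]
        for thread in threads:
            thread.start()
        return threads
```

The same calling code works in both modes: a caller that needs to wait can join the returned threads, and in non-threaded mode the returned list is simply empty.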

The boduch Python library offers a publish-subscribe event system such as this. The library is still in its infancy but has the ability to run subscription event handles in new threads of control. The threading capability can also be switched on and off, and the same code using the library can be run in either mode. Events are declared by specializing a base "Event" class. Likewise, event handles, or subscriptions, are declared by specializing a base "Handle" class. Developers can then subscribe to an event by passing an event class and a handle class to the subscribe function. Multiple handles may be listening to a given event type, and if running in threaded mode, each handle will start a new thread of control. There is a limit on the number of threads that are allowed to run at a given time, but this can be adjusted either manually or programmatically. In either mode, published events may be specified as atomic. This really only has an effect when the event system is running in threaded mode, because it forces all handles for that particular event to run in the publisher's thread. When running in non-threaded mode, marking a publication as atomic changes nothing, since the handles already run in the publisher's thread.
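The shape of such a design can be sketched in a few lines. Everything below is an assumption made for illustration: the module-level names THREADED and MAX_THREADS, the publish signature, and the atomic keyword are hypothetical and are not taken from the boduch source. Only the ideas of an "Event" base class, a "Handle" base class, and a subscribe function that takes both come from the description above.

```python
import threading


class Event:
    """Base event; subclass it to declare a kind of event (hypothetical)."""

    def __init__(self, **data):
        self.data = data


class Handle:
    """Base handle; subclasses override run() to react to an event."""

    def __init__(self, event):
        self.event = event

    def run(self):
        raise NotImplementedError


THREADED = True     # assumed global switch for threaded mode
MAX_THREADS = 4     # assumed cap on concurrently running handles
_slots = threading.BoundedSemaphore(MAX_THREADS)
_registry = {}      # event class -> list of handle classes
_registry_lock = threading.Lock()


def subscribe(event_cls, handle_cls):
    with _registry_lock:
        _registry.setdefault(event_cls, []).append(handle_cls)


def _run_in_slot(handle):
    try:
        handle.run()
    finally:
        _slots.release()  # free the slot even if the handle raised


def publish(event_cls, atomic=False, **data):
    event = event_cls(**data)
    with _registry_lock:
        handle_classes = list(_registry.get(event_cls, []))
    threads = []
    for handle_cls in handle_classes:
        handle = handle_cls(event)
        if atomic or not THREADED:
            handle.run()      # runs in the publisher's thread
        else:
            _slots.acquire()  # blocks while MAX_THREADS handles are running
            thread = threading.Thread(target=_run_in_slot, args=(handle,))
            thread.start()
            threads.append(thread)
    return threads
```

In this sketch, an atomic publication takes the same code path as non-threaded mode, which is why the flag makes no difference once threading is switched off.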

As mentioned earlier, there are several limitations to the boduch library since it is still in its infancy as a project. For instance, there is no way to specify a filter for event subscriptions, even though subscribers may want to react to event types based on data contained within the event instance. Likewise, there is no way to track the source object that emitted the event. Finally, there is no real guarantee that proper ordering will be preserved when running in threaded mode. However, this can be worked around. I haven't actually encountered a scenario where the ordering of instructions has been incorrect when running in threaded mode. This doesn't mean it isn't possible. I actually hope I do some day so I can incorporate more built-in safety in the library.
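One common way to work around the ordering problem, offered here as an assumption rather than as anything the library does, is to push work onto a queue that a single worker thread drains. Handlers still run outside the publisher's thread, but strictly in the order they were published. The class name OrderedDispatcher and its methods are hypothetical.

```python
import queue
import threading
import traceback


class OrderedDispatcher:
    """Hypothetical helper: runs callbacks off-thread, in publish order."""

    def __init__(self):
        self._queue = queue.Queue()  # FIFO, safe to share between threads
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def _drain(self):
        while True:
            callback, payload = self._queue.get()
            try:
                callback(payload)
            except Exception:
                # Report the failure but keep the worker alive.
                traceback.print_exc()
            finally:
                self._queue.task_done()

    def publish(self, callback, payload):
        # Returns immediately; the worker runs the callback later.
        self._queue.put((callback, payload))

    def wait(self):
        # Block until every queued callback has finished.
        self._queue.join()
```

The trade-off is that handlers no longer run concurrently with each other, so this only helps when ordering matters more than parallelism.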