Tuesday, March 17, 2009

ECP three-level machine abstraction

The Enomaly Elastic Computing Platform (ECP) is a platform for managing distributed virtual machines, so we need some kind of abstract representation of the machine concept. This requirement isn't really any different from any other software problem. Every problem domain contains concepts unique to that domain, and developers try to capture what each concept represents by creating an abstraction. By creating abstractions in this manner, we narrow the representational gap between the domain and how its concepts are realized in the solution. In the case of ECP, there is a real need to represent the idea of machines.

In any given software solution, the abstraction created by developers may be a very simple, single-layer architecture, or it may have several layers and become extremely complex. In the latter case, without a well-thought-out design, we start to lose the value that creating an abstraction brings in the first place. Sometimes, when dealing with a large abstraction, dividing it further into layers can help you as a developer better understand what you are actually implementing. Often, the design is further complicated by constraints imposed by the system or framework within which we are developing. Rationale, interfaces, and consistency in general all need to be taken into consideration when constructing a layered abstraction architecture.

To implement the machine abstraction, ECP uses a three-level approach. In this architecture, each level is a class that realizes a different level of the "machine" concept, for a different purpose than the other levels. In this implementation, the three levels are hierarchical. At the top level, we have a class called ActualMachine, which implements several methods for invoking machine behavior. The next level contains a class called DummyMachine, which inherits from ActualMachine and doesn't do much. Finally, we have a Machine class that can store persistent data in the database. In terms of class hierarchy, DummyMachine and Machine sit at the same level, since they both inherit from ActualMachine. In this discussion, however, the levels aren't based on the class hierarchy but rather on the rationale behind each class.

The ActualMachine class is meant to most closely represent the concept of "machine" in the context of ECP. The name symbolizes that this is the underlying machine, not a Python object. Obviously, instances of ActualMachine are Python objects, but when using these objects, we are more interested in the underlying technology. This class is where all the behavior for the machine concept is defined. It doesn't define any data attributes.

The DummyMachine class is exactly what the name implies: a dummy. The class simply defines a constructor that allows attributes to be set, and it inherits all of its behavior from ActualMachine. Instances of DummyMachine can set attributes in the constructor and invoke the behavior provided by ActualMachine.

The Machine class provides persistence for the machine abstraction in ECP. The class also inherits behavior from the ActualMachine class. Machine functions similarly to DummyMachine in that they both provide the same behavior. The difference between the two is that DummyMachine stores attributes in memory while Machine stores attributes in the database.
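To make the relationship between the three classes concrete, here is a minimal sketch in Python. It is not the actual ECP source: the method names (start, shutdown), the state attribute, and the MACHINE_TABLE dictionary standing in for the database are all assumptions made for illustration.

```python
# Hypothetical stand-in for a database table, keyed by machine uuid.
MACHINE_TABLE = {}


class ActualMachine:
    """Behavior only; the class declares no data attributes of its own."""

    def start(self):
        # Illustrative behavior: a real implementation would talk to a hypervisor.
        self.state = "running"
        return self.state

    def shutdown(self):
        self.state = "off"
        return self.state


class DummyMachine(ActualMachine):
    """In-memory machine: attributes are set through the constructor."""

    def __init__(self, **attributes):
        for name, value in attributes.items():
            setattr(self, name, value)


class Machine(ActualMachine):
    """Persistent machine: attribute reads and writes go to the store."""

    def __init__(self, uuid, **attributes):
        # Keep only the key on the instance; everything else lives in the store.
        object.__setattr__(self, "uuid", uuid)
        MACHINE_TABLE.setdefault(uuid, {}).update(attributes)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        try:
            return MACHINE_TABLE[object.__getattribute__(self, "uuid")][name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        MACHINE_TABLE[self.uuid][name] = value
```

Calling shutdown() on a DummyMachine only changes the object in memory, while the same call on a Machine changes the stored row. The behavior itself is written once, in ActualMachine.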

The rationale behind this architecture is that we want to be able to instantiate machine instances without affecting the database. The opposite is also true; we need to be able to instantiate machines that will have an immediate effect on the database. Within the context of the ECP RESTful API, machines that are not stored on the local host (they are retrieved from another ECP host) will need to be instantiated. That is, we want to have an abstraction available to use once the remote machine data has arrived. This could be done using some primitive data construct such as a list or a dictionary, but by doing so we lose the machine concept. The behavioral aspect of the machine concept is gone because you can't tell a dictionary to shut down.
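The difference between a dictionary and a DummyMachine can be sketched as follows. The JSON payload, its field names, and the shutdown method body are assumptions made for illustration; the real ECP API response may look quite different.

```python
import json


class ActualMachine:
    """Behavior for the machine concept (illustrative method only)."""

    def shutdown(self):
        self.state = "off"
        return self.state


class DummyMachine(ActualMachine):
    """In-memory machine built from whatever attributes are passed in."""

    def __init__(self, **attributes):
        for name, value in attributes.items():
            setattr(self, name, value)


# Hypothetical payload, as another ECP host might return it.
payload = '{"uuid": "abc-123", "name": "web01", "state": "running"}'
data = json.loads(payload)

# The dictionary carries the data but none of the machine behavior.
assert not hasattr(data, "shutdown")

# Wrapping the same data in a DummyMachine restores the behavior,
# and nothing is written to the local database.
remote = DummyMachine(**data)
remote.shutdown()
```

After the last line, remote.state is "off" while the original dictionary is untouched.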

There are still several limitations to this approach. For instance, not all ActualMachine behavior will be supported by the DummyMachine instances that are created. This is simply a limitation of the three classes and their inter-relationships. It is still an improvement over representing domain concepts using primitive types. The three-level architecture gives us more control over what happens when requested behavior cannot be fulfilled. The DummyMachine layer is an example of mixing the problem domain with the solution domain. The class came into existence because the solution demanded it. But this design allows machine instances to still behave like "machines" without conforming too much to the solution constraints.
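One way to exercise that control, sketched here under assumptions, is to have DummyMachine override a behavior it cannot support and raise a dedicated exception. The migrate method, the UnsupportedMachineOperation exception, and the try_migrate helper are all hypothetical names, not part of ECP.

```python
class UnsupportedMachineOperation(Exception):
    """Raised when a machine object cannot carry out a requested behavior."""


class ActualMachine:
    def shutdown(self):
        self.state = "off"
        return self.state

    def migrate(self, target_host):
        # Illustrative only: pretend the machine moves to another host.
        self.host = target_host
        return self.host


class DummyMachine(ActualMachine):
    def __init__(self, **attributes):
        for name, value in attributes.items():
            setattr(self, name, value)

    def migrate(self, target_host):
        # Hypothetical policy: in-memory machines refuse this operation.
        raise UnsupportedMachineOperation(
            "migrate is not supported on in-memory machines"
        )


def try_migrate(machine, target_host):
    """Return the new host, or None if the machine cannot migrate."""
    try:
        return machine.migrate(target_host)
    except UnsupportedMachineOperation:
        return None
```

With a dictionary, the caller would have to know up front which operations make sense. Here the object itself decides, and the caller handles one well-defined failure.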

A similar approach is taken in ECP with other abstractions such as packages. The architecture hasn't been fully implemented for every abstraction within the platform. Hopefully, it will strike a balance between the constraints we face and the functionality we offer.

I'm sure this approach will prove useful in many other application areas. As objects become more and more distributed, we'll need a better way to represent their data when used locally while preserving the behavior of those objects.