Tuesday, February 24, 2009

New ECP exception and enhanced state restoring behaviour

Over the past few days, the ECP team has made some notable enhancements in the trunk. The first is the addition of a new exception called E2LibvirtError. As the name suggests, this exception is raised for Libvirt-related issues. The Python Libvirt library already defines an exception class, but it is the only one, and many different kinds of errors can occur within Libvirt. The idea behind the new E2LibvirtError class is to provide better information when something goes wrong in Libvirt. When the library's exception is raised, only a short message is displayed: the default message that the Python base exception class is initialized with.

The problem is that Libvirt can manage several different hypervisors on any given system, so there are several layers within Libvirt in which something can go wrong. In ECP, the Libvirt exception is caught and only that generic message is recorded.

The new E2LibvirtError exception exploits additional exception information encapsulated within the basic Libvirt exception instance. I don't mean encapsulated in the traditional object-oriented sense. I mean the information is there, and ECP should use it for the benefit of the end user. The new ECP exception, when instantiated, will take several error codes from the original Libvirt exception and produce a much more meaningful error message.
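To give a rough idea of the approach, here is a minimal sketch and not the actual ECP code. It assumes the wrapped exception offers the accessor methods that the Python Libvirt exception class provides (get_error_code(), get_error_domain(), get_error_level() and get_error_message()). The message format, the attribute names and the FakeLibvirtError stand-in are my own assumptions for illustration; the stand-in is only there so the example runs without Libvirt installed.

```python
class E2LibvirtError(Exception):
    """Illustrative sketch: wrap a Libvirt exception with richer detail."""

    def __init__(self, original):
        # Pull the extra fields out of the original exception, if present.
        self.code = self._call(original, "get_error_code")
        self.domain = self._call(original, "get_error_domain")
        self.level = self._call(original, "get_error_level")
        # Fall back to the generic message when no detail is available.
        self.detail = self._call(original, "get_error_message") or str(original)
        message = "Libvirt error (code=%s, domain=%s, level=%s): %s" % (
            self.code, self.domain, self.level, self.detail)
        super().__init__(message)

    @staticmethod
    def _call(obj, name):
        """Call obj.name() if it exists, otherwise return None."""
        method = getattr(obj, name, None)
        return method() if callable(method) else None


class FakeLibvirtError(Exception):
    """Hypothetical stand-in for the Libvirt exception (numbers are arbitrary)."""

    def get_error_code(self):
        return 42

    def get_error_domain(self):
        return 10

    def get_error_level(self):
        return 2

    def get_error_message(self):
        return "Domain not found: no domain with matching name 'vm1'"


try:
    raise FakeLibvirtError("generic message")
except FakeLibvirtError as exc:
    wrapped = E2LibvirtError(exc)
```

Here str(wrapped) carries the error code, domain and level alongside the detailed message, instead of only the short generic text.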

This leads me to the changes made in the restore_machines_state() functionality. The rationale hasn't changed, only the implementation. The new code simply detects whether the table exists, and whether each machine exists locally, much better than the current version does. If the function finds a machine that is not running but should be (because that was the state the machine was in when ECP last shut down), it will attempt to start it. Since starting a machine is a Libvirt operation, we've already added the new E2LibvirtError exception handling to that part of the function. I've already been seeing much more useful error messages in the logs. These new error messages should also be viewable in the web front-end via the error dialog box when something Libvirt-related goes wrong.
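The restore logic described above could look something like the following. This is a hedged sketch under my own assumptions, not the real ECP function: the signature, the saved_states dictionary, the is_running and start_machine callables and the outcome strings are all hypothetical, and the E2LibvirtError class here is just a stand-in so the example runs on its own.

```python
import logging

log = logging.getLogger("ecp.restore")  # hypothetical logger name


class E2LibvirtError(Exception):
    """Stand-in for the ECP exception described in this post."""


def restore_machines_state(saved_states, is_running, start_machine):
    """Start every machine that was running at shutdown but is stopped now.

    saved_states  -- dict of machine name -> saved state, or None if the
                     state table does not exist (assumed representation)
    is_running    -- callable(name) -> bool; raises KeyError when the
                     machine does not exist locally (assumed convention)
    start_machine -- callable(name); may raise E2LibvirtError
    Returns a dict of machine name -> outcome string.
    """
    results = {}
    if saved_states is None:
        # No state table yet, so there is nothing to restore.
        return results
    for name, state in saved_states.items():
        if state != "running":
            results[name] = "skipped"
            continue
        try:
            running = is_running(name)
        except KeyError:
            # The machine was recorded but no longer exists locally.
            results[name] = "missing"
            continue
        if running:
            results[name] = "already running"
            continue
        try:
            start_machine(name)
            results[name] = "started"
        except E2LibvirtError as exc:
            # The richer message ends up in the log for the end user.
            log.error("Could not start %s: %s", name, exc)
            results[name] = "failed"
    return results
```

Passing the lookup and start operations in as callables keeps the sketch independent of Libvirt; in ECP itself these would be Libvirt calls.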

This does increase the Libvirt coupling in ECP considerably. However, given how little functionality ECP would have without Libvirt, I think it is a fair trade-off.