Friday, October 23, 2009

Python Transaction Objects

The transactional programming model allows changes to be made to data while preserving the earlier states of that same data. This lets us alter the data without worrying about losing critical changes. It can only go so far though, because eventually the stored transactions must be discarded. Otherwise, disk space quickly becomes very precious. Some sort of confirmation needs to be applied to the data after some number of transactions. If the confirmation fails, the data can be moved backward in time. If it passes, the previous transaction data is destroyed.
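
The confirm-or-revert cycle described above can be sketched in a few lines of Python 3. This is only an illustration under simple assumptions: the names settle, history, and is_valid are hypothetical, and the history holds full copies of earlier values purely to keep the sketch short.

```python
def settle(current, history, is_valid):
    """Return (value, history) after a confirmation check.

    `history` is a list of earlier values, oldest first.
    Assumes `history` is non-empty whenever the check fails.
    """
    if is_valid(current):
        # Confirmation passed: the earlier states are no longer needed.
        return current, []
    # Confirmation failed: step back to the most recent earlier value.
    return history[-1], history[:-1]


history = ["draft 1", "draft 2"]

# An empty string fails the bool() check, so we move back in time.
value, history = settle("", history, is_valid=bool)
print(value, history)   # draft 2 ['draft 1']

# "draft 2" passes the check, so the remaining history is destroyed.
value, history = settle(value, history, is_valid=bool)
print(value, history)   # draft 2 []
```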

Most database systems are largely transactional, because the main feature of any given database is to store and manipulate data. Providing transactional support is a huge requirement for systems that use databases. If the database doesn't provide the necessary transaction support, the applications that use the database need to implement it themselves. Transactional support isn't trivial to implement, especially the kind provided by production-grade databases.

Moving down to the individual transaction level, what data, exactly, does each transaction need to store? Do transactions need to make full copies of the data being operated on in order to restore previous states? That is one way to do it, but a really inefficient one, because the accumulated transaction data would grow uncontrollably large. The better way is to store only what is absolutely necessary to revert the current data to a previous state. Once the data is in a previous state, the same principle can be applied to move further back in time still.
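
As a rough illustration of reverting from a stored delta, Python's difflib module can record the difference between two strings and rebuild either one from it. The sample values below are made up for the sketch, which is written for Python 3.

```python
from difflib import ndiff, restore

old_value = "John Smith"
new_value = "Jon Smythe"

# ndiff() compares two sequences; passing plain strings compares them
# character by character. Each delta entry is a two-character tag
# ("  " unchanged, "- " only in old, "+ " only in new) plus the character.
delta = list(ndiff(old_value, new_value))

# restore() filters the delta to rebuild either side:
# 1 selects the first sequence (old), 2 selects the second (new).
recovered_old = "".join(restore(delta, 1))
recovered_new = "".join(restore(delta, 2))

print(recovered_old)
print(recovered_new)
```

Note that an ndiff delta also keeps every unchanged character, so for short strings it is larger than a plain copy. The sketch shows the mechanism only; a storage-conscious design would need a more compact delta format.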

Does the transactional model have a place inside application code? Well, maybe on a fractional scale in comparison to database system transactional support. Having simplistic transactional support that fits inside an object-oriented design could be well suited for small edits that need to be made to objects at runtime. In this case, there would be very few transactions at any given time, and they probably wouldn't exist for long. The real benefit here is simplicity. Even if the application you are building does use a database with transactional support, it is better to leave the heavy transactional lifting to the database rather than bother it with smaller edits.

An in-memory transaction class could be of use for this purpose. Subclasses could then inherit from this class in order to become transactional. Below is a simple example of such a class as implemented in Python.

# Example: Python transaction objects (Python 3).

# Do imports.
from difflib import ndiff, restore

# Types whose attributes are tracked by a transaction.
STRING_TYPES = (str,)


# A transactional class that should be subclassed.
class Transactional(object):

    # Constructor. Initialize the transaction list.
    def __init__(self):
        self.transactions = []

    # Start recording a transaction by copying every string attribute
    # of this instance into a new snapshot.
    def start(self):
        _attribute = {}
        for name, value in vars(self).items():
            if isinstance(value, STRING_TYPES):
                _attribute[name] = value
        self.transactions.append(_attribute)

    # Stop recording a transaction. Each snapshot value is replaced by a
    # character-level delta between the snapshot and the current value.
    def stop(self):
        _tran_current = self.transactions[-1]
        for name, value in vars(self).items():
            if isinstance(value, STRING_TYPES):
                # Attributes created after start() are diffed against "".
                # The delta is kept as a list so that values containing
                # newlines survive the round trip.
                _tran_current[name] = list(
                    ndiff(_tran_current.get(name, ""), value))

    # Roll back to the last stored transaction.
    def rollback(self):
        _tran_current = self.transactions.pop()
        for name, delta in _tran_current.items():
            # restore(delta, 2) rebuilds the value recorded at stop().
            setattr(self, name, "".join(restore(delta, 2)))

    # Commit all changes by discarding the stored transactions.
    def commit(self):
        self.transactions = []


# Simple class capable of storing transactions.
class Person(Transactional):

    # Constructor. Initialize the Transactional class.
    def __init__(self):
        super(Person, self).__init__()
        self.first_name = ""
        self.last_name = ""

    def set_first_name(self, first_name):
        self.first_name = first_name

    def set_last_name(self, last_name):
        self.last_name = last_name


# Main.
if __name__ == "__main__":

    # Instantiate a person.
    person_obj = Person()

    # Start recording a transaction.
    person_obj.start()

    # Manipulate the object.
    person_obj.set_first_name("John")
    person_obj.set_last_name("Smith")

    # Stop recording the transaction.
    person_obj.stop()

    # Manipulate the object outside of any transaction.
    person_obj.set_first_name("jOhN")
    person_obj.set_last_name("sMiTh")

    # Display object data.
    print("FIRST NAME:", person_obj.first_name)
    print("LAST NAME: ", person_obj.last_name)

    # Roll back to the latest stored transaction.
    person_obj.rollback()

    # Display object data.
    print("FIRST NAME:", person_obj.first_name)
    print("LAST NAME: ", person_obj.last_name)

In this example, the Transactional class is responsible for providing transaction support to its subclasses. The basic idea is that it provides very basic transaction support for any string attributes of the object. This means that subclasses can define any number of string attributes and each one will be transactional.
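
The string-only rule can be seen in isolation with a small Python 3 sketch. The Account class and the snapshot_strings() helper are hypothetical names used only for this illustration, and reading instance attributes with vars() is an assumption of the sketch.

```python
class Account:
    """Hypothetical class used only for this illustration."""

    def __init__(self):
        self.owner = "Ada"
        self.currency = "EUR"
        self.balance = 100      # not a string, so it is not captured


def snapshot_strings(obj):
    """Copy every instance attribute of obj whose value is a str."""
    return {name: value
            for name, value in vars(obj).items()
            if isinstance(value, str)}


print(snapshot_strings(Account()))   # {'owner': 'Ada', 'currency': 'EUR'}
```

Anything that is not a string, such as balance here, would be left untouched by a rollback.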

There are four basic methods on Transactional: start(), stop(), rollback(), and commit(). The Transactional.start() method starts recording a transaction. This means that any changes made after the method is invoked will be part of the transaction data. The Transactional.stop() method completes the transaction that is currently being recorded. It does this by using Python's difflib module to store a delta describing the changes that have been made to the data. The Transactional.rollback() method restores the string attributes to the most recently stored transaction state, that is, the values they had when stop() was called. Again, this is done using difflib, which rebuilds the values from the stored delta. Finally, Transactional.commit() simply purges all transaction data.
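
To show how the four methods interact when more than one transaction is stored, here is a condensed Python 3 restatement of the class with a hypothetical Note subclass. It is a sketch rather than a replacement for the listing above: it assumes instance attributes are read with vars() and that each delta is kept as a list.

```python
from difflib import ndiff, restore


class Transactional:
    """Condensed restatement of the class, for a runnable demo."""

    def __init__(self):
        self.transactions = []

    def _strings(self):
        # Current string attributes of this instance.
        return {k: v for k, v in vars(self).items() if isinstance(v, str)}

    def start(self):
        self.transactions.append(self._strings())

    def stop(self):
        snapshot = self.transactions[-1]
        for name, value in self._strings().items():
            snapshot[name] = list(ndiff(snapshot.get(name, ""), value))

    def rollback(self):
        for name, delta in self.transactions.pop().items():
            setattr(self, name, "".join(restore(delta, 2)))

    def commit(self):
        self.transactions = []


class Note(Transactional):        # hypothetical subclass
    def __init__(self):
        super().__init__()
        self.text = ""


note = Note()
for version in ("first", "second"):
    note.start()
    note.text = version
    note.stop()

note.text = "scratch"             # an edit made outside any transaction
note.rollback()
print(note.text)                  # second
note.rollback()
print(note.text)                  # first
note.commit()
print(len(note.transactions))     # 0
```

Each rollback() consumes one stored transaction, newest first, so repeated calls walk the object back through the states recorded at each stop().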