Wednesday, May 20, 2009

SQLAlchemy Bind and Result Processors

With any given object-relational mapper technology, the basic use case is the ability to map abstract data types to tables and instances of these types to rows in these tables. There is also a need to may primitive programming language types to primitive database column types. This is be no means easily achieved. Especially if the object-relational mapper in question supports multiple databases. Any popular object-relational mapper will support more than a single database. If it were the case that only a single database were supported by an ORM, there really wouldn't be a need for the ORM to begin with. In fact, if only a single database would need to be supported, it might be beneficial to not use an ORM because of the overhead that would be removed. This is rarely the case. Having said that, developers can generally assume that support for multiple database technologies is a given. Going back to the problem of dealing with primitive type mapping, how would an ORM accomplish this? One approach might be to implement the specific column types for each supported database. This would mean much duplicated functionality and would not be good object oriented practice. SQLAlchemy takes a different approach to handling the vary disparate column types between different database technologies.

SQLAlchemy defines two abstract type classes; AbstractType and TypeEngine. These two classes not only provide the base type interfaces used in SQLAlchemy, but also provide the default implementations for these methods. When mapping a primitive programming language type to a primitive database column type, there are two directions these types can be mapped. When supplying binding parameters to SQL queries that are initially primitive programming language types, and when retrieving query results from an executed query. In the latter case, the type mapping goes from the primitive database column type to the primitive programming language type.

The AbstractType class defines two methods to accomplish this; bind_processor() and result_processor(). Both methods accept a database dialect parameter so they know which database technology they are dealing with and the idea is to return a callable that performs the necessary transformations before using the data in a query or result set. In other words, bind_processor() returns a function to make the bind parameters used in a query actually usable. The result_processor() method uses the inverse concept. The SQLAlchemy String type is a good example of why these methods are useful. With strings, there are many questions of encoding before they can be used in a database.

One implementation note on the SQLAlchemy implementation of the AbstractType and TypeEngine classes. As illustrated below, the TypeEngine class directly inherits from AbstractType. TypeEngine unnecessarily overrides the bind_processor() and result_processor() methods. Both the signature and implementation of the two methods are the same in both classes. All that changes is the API documentation.