Friday, March 20, 2009

SQL engine dialect in SQLAlchemy

Object relational mapping technology is used in object-oriented systems to provide a layer between the objects in the system, and the underlying relational database system. ORMs help reduce the time investment, and thus the cost as well, during development because developers can still think in an object-oriented way while interacting with the database. Tables in the database are classes in the code. Instances of these classes are rows in the database tables. This is an over-simplification of the problem. It isn't quite so straightforward as to generate standardized SQL statements based on what the instance is doing in the code. Not all database systems use a standard set of SQL. To further complicate matters, the various popular database systems don't even support the same features.

In any given language, there are different dialects of that language. For instance, if two people were engaged in a conversation, one person from the west coast of the country, the other from the east, there is bound to be some ambiguity even though they are speaking the same language. This could arise from a number of factors such as politics, or culturally-related customs. The same effect happens in technology. The language is SQL and while some database systems are very similar in some ways, they can be just different enough to cause an application to require a huge amount of duplicated code just so that the application can support multiple database systems. These are the systems that are supposed to speak the same language. SQLAlchemy addresses this issue by introducing the concept of a dialect. Every database system supported by SQLAlchemy defines a dialect. This isn't just a dialect for the SQL language alone but rather for the entire database system. We can do a lot more than merely execute SQL statements with SQLAlchemy so this is a requirement of the Dialect interface. This interface specifies what each dialect for a specific database must look like. For instance, with a specific dialect, we can specify whether the database accepts unicode SQL statements or not. With SQLAlchemy, there are other classes that do work related to communication with the database such as preparing SQL statements and generating schemas. All these are specified in the Dialect interface.

SQLAlchemy also defines a DefaultDialect class which provides the Dialect interface. This class actually implements some of the methods and attributes specified by the Dialect interface. Some of these methods and attributes are common across all supported database systems. The attributes and methods that are specialized for any given database, are simply overridden by that particular dialect implementation.

When a connection is established in SQLAlchemy, it will construct a new engine instance. This engine will then use the dialect specified in the connection string. For example, mysql would be considered a dialect. Method calls made through the constructed database engine instance are then delegated to the dialect. The dialect will then act accordingly for the database system it is using.

The Dialect interface and the implemented dialect classes in SQLAlchemy serve as a good example of polymorphism in object-oriented programming. There is a single interface that is used to invoke behavior and that behavior varies by type. This is a resilient design because it is loosely coupled and highly replaceable. We can implement our own database dialect and plug in into SQLAlchemy if we are so inclined.