Thursday, April 23, 2009

TurboGears i18n catalog loading

In a perfect world, every single application would have full translations of every major spoken language. Applications having this ability is referred to as internationalization. Internationalization become increasingly more important when dealing with web applications. The whole idea behind web applications is portability amongst many disparate clients. Thus, it is safe to assume that not all of these clients are going to speak a universal language. Fortunately, there are many internationalization tools available at developers disposal. One of the more popular tools on Linux platforms is the GNU gettext utility. In fact, Python has a built-in module that is built on top of this technology. Taking this a step further, some Python developers have built abstractions on top of the gettext module, providing further internationalization capabilities for Python web applications. The gettext Python module uses message catalogs to store the translated messages used throughout any given application. These catalogs can then be compiled into a more efficient format for performance purposes. This method of using a single message catalog would suffice for monolithic applications that cannot be extended. However, all modern Python web application frameworks can be extended one way or another. These components that extend these frameworks are more than likely to need internationalization of messages. The problem of storing and using translated messages suddenly becomes much more difficult.

The TurboGears Python web application framework provides internationalization support. TurboGears provides a function that will translate the specified string input. Of course, the message must exist in a message catalog but that is outside the scope of this discussion. The tg_gettext() function is the default implementation provided by the framework. The reason it is the default implementation and not the only implementation is because a custom text translation function may be specified in the project configuration. Either way, custom or default, the text translation function becomes the global _() function that can then be used anywhere in the application. Below is illustration of how the tg_gettext() function works.

The get_catalog() function does as the name says and returns the message catalog of the specified locale. The function will search the locale directory specified in the i18n.locale_dir configuration value. Once a message catalog has been loaded by TurboGears, it is then stored in the _catalogs dictionary. The _catalogs dictionary acts as a cache for message translations. Illustrated below is the work-flow of this function.

As mentioned earlier, this method of using a single message catalog is all well and good until your application needs to be extended by third-party components. If these components have their own message catalogs, any text translation calls made to _() will use the base application catalog. The reason being, the first time a catalog is loaded, it stays loaded because of the _catalogs cache. This means that even if the extension to the web application framework were to update the i18n.locale_dir configuration value, it will make no difference if the specified locale has already been cached. Ideally, the get_catalogs() function could check the _catalogs cache based on the locale directory rather than the locale.