Friday, March 6, 2009

The dynamic configuration problem

Most applications, especially web applications, require some type of configuration management. The configuration data is usually specified in a .cfg file in the application directory. Most languages and web frameworks provide utilities to parse and read these configuration files. The parsed values then become variables in the application itself, possibly altering its control flow.
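As a rough illustration of that first step, here is a minimal sketch using Python's standard configparser module. The section name, the keys, and the log_level helper are made up for the example; they are not taken from ECP.

```python
import configparser

# Hypothetical contents of an application .cfg file.
CFG_TEXT = """
[server]
port = 8080
debug = false
"""

parser = configparser.ConfigParser()
parser.read_string(CFG_TEXT)

# Parsed values become ordinary variables in the application.
port = parser.getint("server", "port")
debug = parser.getboolean("server", "debug")


def log_level(debug_enabled):
    """A configuration value altering control flow (hypothetical helper)."""
    return "DEBUG" if debug_enabled else "INFO"


level = log_level(debug)
```

In a real application the parser would read the file from disk with parser.read() rather than from a string.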

In many cases, users of these applications do not want to edit configuration files. I know I don't want to half the time. The solution? Integrate the configuration into the GUI. Perfect. So now we have a pretty interface that users (often administrators) can use to update these configuration values. One problem here is where do these values get stored once they have been updated? It is probably not a good idea to alter the configuration file. Or is it?

In ECP the approach is to store the altered configuration values from the web front-end into the database. This is good because we don't need to mess around with writing code in the application to manipulate the configuration file every time a configuration value is changed. When a configuration value is needed somewhere in the application code, ECP will check the database for the configuration value. If the value is found, this value is used in the application.
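The lookup order described here could be sketched roughly as follows. This is not ECP's actual code: the table layout, the get_config name, and the "section.key" naming are assumptions for illustration, as is the fallback to the configuration file when the database has no value.

```python
import configparser
import sqlite3

# Hypothetical table holding values saved from the web front-end.
CREATE_TABLE = "CREATE TABLE config (name TEXT PRIMARY KEY, value TEXT)"


def get_config(conn, parser, name):
    """Return the database value if one exists, else the file value.

    `name` is assumed to look like "section.key".
    """
    row = conn.execute(
        "SELECT value FROM config WHERE name = ?", (name,)
    ).fetchone()
    if row is not None:
        # A value saved from the front-end always wins.
        return row[0]
    # Assumed fallback: read the value from the configuration file.
    section, key = name.split(".", 1)
    return parser.get(section, key)


conn = sqlite3.connect(":memory:")
conn.execute(CREATE_TABLE)
parser = configparser.ConfigParser()
parser.read_string("[server]\nport = 8080\n")

file_only = get_config(conn, parser, "server.port")

# Simulate an administrator saving a new value through the GUI.
conn.execute(
    "INSERT INTO config (name, value) VALUES (?, ?)", ("server.port", "9090")
)
after_gui_edit = get_config(conn, parser, "server.port")
```

With this ordering, once a row exists in the database the file is never consulted again for that name.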

The problem here may be obvious but I'll mention it anyway. What happens when a configuration value in the configuration file is updated and that same value already exists in the database? Well, as it turns out, that change will have no effect.

What if we were to change the order of precedence for the configuration value lookup, so that the configuration file is checked before the database? In this case, we lose the front-end GUI capability, because any value present in the file would always override whatever was saved through the GUI. The reason I don't like storing updated configuration values in the configuration files is that this isn't what the files are for. Configuration files exist for humans to edit, and humans are generally quite annoyed when a machine starts messing with their configuration.

So how do we fix this configuration race condition? We have a value that is stored in the database and that same value is stored in the configuration file. Ideally, it would make more sense to support both configuration storage locations: the file and the database. At the most basic level, we are trying to solve the problem of the most recent value. The most recent configuration value should be used in the application code. One step to take on the database side is to store the modification date for each configuration value. At least then we can determine how fresh the database value is. On the configuration file side, this isn't really feasible since the file is edited by humans. We can't expect them to enter the current timestamp every time they update a value just so their application will behave as expected.

We can't use the modification date of the configuration file because this date doesn't reflect which value was updated. What we could do is introduce some new configuration fields to the database. The table might look like the following.

Here, we have four fields: db_value, db_tstamp, file_value, and file_tstamp. The db_value field stores the value that is sent from the application front-end. The db_tstamp field is updated with the current timestamp whenever the db_value field is updated. When the application is started for the first time, all file configuration values would be copied into the database, into the file_value and file_tstamp fields. At this point, we would have several file values stored in the database but no front-end editor values.
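To make the four fields concrete, here is one way the table and the first-start import might be sketched with sqlite3. Only the four field names come from the description above. The name column, the column types, the "section.key" naming, and both function names are assumptions.

```python
import configparser
import sqlite3
import time

# Only db_value, db_tstamp, file_value and file_tstamp come from the text;
# the name column and the types are assumptions.
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS config (
    name        TEXT PRIMARY KEY,
    db_value    TEXT,
    db_tstamp   REAL,
    file_value  TEXT,
    file_tstamp REAL
)
"""


def import_file_values(conn, parser, now=None):
    """First start: copy every file value into file_value/file_tstamp."""
    now = time.time() if now is None else now
    for section in parser.sections():
        for key, value in parser.items(section):
            # OR IGNORE leaves rows from an earlier start untouched.
            conn.execute(
                "INSERT OR IGNORE INTO config (name, file_value, file_tstamp) "
                "VALUES (?, ?, ?)",
                (f"{section}.{key}", value, now),
            )
    conn.commit()


def save_front_end_value(conn, name, value, now=None):
    """Store a value sent from the front-end and stamp it."""
    now = time.time() if now is None else now
    conn.execute("INSERT OR IGNORE INTO config (name) VALUES (?)", (name,))
    conn.execute(
        "UPDATE config SET db_value = ?, db_tstamp = ? WHERE name = ?",
        (value, now, name),
    )
    conn.commit()
```

Right after import_file_values runs on a fresh database, every row has db_value and db_tstamp set to NULL, which matches the "no front-end editor values" state described above.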

The reason we would need the file_value and file_tstamp fields is to compare them with the db_value and db_tstamp fields. When a lookup happens, the lookup functionality would read the current value from the configuration file. If that value differs from the stored file_value, we update file_value and set file_tstamp to the current time. We would then return either db_value or file_value, whichever has the more recent timestamp.
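The lookup just described might look something like the sketch below. It is not a finished implementation: the lookup signature, the name column, and the choice to let the file value win when the timestamps are equal or when no front-end value exists are all assumptions.

```python
import sqlite3
import time

# Same assumed layout as before; only the four value/timestamp
# field names come from the description.
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS config (
    name        TEXT PRIMARY KEY,
    db_value    TEXT,
    db_tstamp   REAL,
    file_value  TEXT,
    file_tstamp REAL
)
"""


def lookup(conn, name, current_file_value, now=None):
    """Return whichever of db_value / file_value changed most recently.

    `current_file_value` is the value just read from the configuration file.
    """
    now = time.time() if now is None else now
    row = conn.execute(
        "SELECT db_value, db_tstamp, file_value, file_tstamp "
        "FROM config WHERE name = ?",
        (name,),
    ).fetchone()
    if row is None:
        # Never seen before: record the file value and use it.
        conn.execute(
            "INSERT INTO config (name, file_value, file_tstamp) "
            "VALUES (?, ?, ?)",
            (name, current_file_value, now),
        )
        return current_file_value

    db_value, db_tstamp, file_value, file_tstamp = row
    if current_file_value != file_value:
        # A human edited the file since the last lookup: refresh the copy.
        file_value, file_tstamp = current_file_value, now
        conn.execute(
            "UPDATE config SET file_value = ?, file_tstamp = ? WHERE name = ?",
            (file_value, file_tstamp, name),
        )

    # Assumption: the file wins on a tie or when there is no front-end value.
    if db_tstamp is not None and (file_tstamp is None or db_tstamp > file_tstamp):
        return db_value
    return file_value
```

One consequence of this approach is that file_tstamp records when the edit was first noticed by a lookup, not when the human actually saved the file, so a file edit followed by a front-end edit made before the next lookup would still be treated as the more recent change.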