Friday, June 25, 2010

Python Is Descendant

How do you check if a given Python object is a descendant of a given class? Here is one way to do it using the inspect.getmro() function. Given a class, this function returns a tuple containing that class followed by all of its base classes, including grandparent classes, and so on, all the way up the class hierarchy.

Here is my version of the technique put into a simple function that will test if the specified object is a descendant of the specified class name.
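from inspect import getmro

def istype(obj, typename):
    # Compare by name against every class in the object's MRO.
    return typename in [clsobj.__name__ for clsobj in getmro(obj.__class__)]

class A(object):
    pass

class B(A):
    pass

if __name__ == "__main__":
    # print() with a single argument works in both Python 2 and 3.
    print(istype(B(), "A"))
    print(istype(B(), "object"))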
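
For comparison, here is a small sketch of what getmro() returns and how a name-based check differs from the built-in isinstance(). The mro_names() and make_other_a() helpers are hypothetical names introduced only for this illustration. The point is that comparing names will match any class that happens to share the name, even one from an unrelated hierarchy, while isinstance() compares the class objects themselves.

```python
from inspect import getmro

class A(object):
    pass

class B(A):
    pass

def mro_names(obj):
    """Return the class names in the object's method resolution order."""
    return [cls.__name__ for cls in getmro(type(obj))]

def make_other_a():
    """Build an unrelated class that is also named "A" (hypothetical)."""
    class A(object):
        pass
    return A

OtherA = make_other_a()

names = mro_names(B())
print(names)

# The name-based check cannot tell the two "A" classes apart...
name_match = "A" in mro_names(OtherA())
# ...while isinstance() can, because it compares class objects.
real_match = isinstance(OtherA(), A)
print(name_match, real_match)
```

If the class object is available at the call site, isinstance() is usually the simpler choice. The name-based approach is mainly useful when all you have is a string.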

Wednesday, June 16, 2010

Configuration Design

How much thought goes into what your configuration file will look like when developing an application? That is, does the overall configuration structure have an impact on the quality of your finished product? This question can be asked from more than one perspective. For instance, from the end user's point of view, is the set of potential configuration options cohesive and easy to understand? The number of potential configuration options also plays a role in the overall manageability of the application because more configuration options mean more complexity. This same point needs to be considered from the application's point of view. Adding flexibility by means of configurable options has an impact on your application's design. Making any aspect of your application configurable adds variability to something that could have been constant, or hard-coded. Beyond the size and structure of an application configuration, there is meaning behind each configuration option. Thinking about what each configuration option means before adding it to the configuration design is important. Let's take a look at an example of how we might go about designing a configuration schema for a simple system.

The system under development is a simple desktop GUI application for viewing data. The application will support two data formats: JSON and XML. The GUI for this application is fairly straightforward in that it will have two tabs, each offering the end user a different view of the data. The first tab will use a tree widget to display a hierarchical view of the data. The second tab will use a table or grid widget to display the data. In the latter tab, clicking a row will reload the table or grid with that row's data if it can be displayed as a table or grid. Otherwise, a dialog is opened to display the details of the row data. Likewise, if a leaf node in the tree widget of the first tab is selected, the same dialog will appear. The application can load the data from a local file or from a URI on the web.

This intentionally simplistic system hardly justifies the need for having a configuration. It is ideal, however, for our purposes here because we want to elaborate on how the configuration values for the application come into being. Let's start by looking at the static aspects of our sample application. By static, I mean aspects of the application that remain constant throughout its lifetime. These aspects aren't configurable. It is worth looking at the fundamental features that will not change before designing configurable elements. This way, we can eliminate candidates from our list of potential configuration values. Looking at the sample application under development, there are two supported data formats, JSON and XML. One potential configuration value we could add is a list of supported data formats. Logically, we would want to have the ability to extend our application to support more than just JSON and XML formats. However, in our case, this is something that will remain constant. The system is a JSON/XML reader, so we don't want to change what the application does best.

Now that we've eliminated a potential configuration option, let's think about some configuration categories, or sections. A configuration section is a group of related configuration options. For instance, in our application, we would probably want a configuration section for GUI settings. The benefit of having configuration options grouped by sections is that the end user can get a better handle on what they're changing. Conceptually, sections, or categories, give the end user a little more confidence when manipulating the configuration. For instance, if a user sees a GUI section in the configuration file, they're going to feel at ease changing these options. This is because the intent of this particular group of options is clearly stated: change these values to alter the look and feel of the GUI. Additionally, from the developer's perspective, configuration sections give us an opportunity to map configuration options to concepts found in our code. GUI configuration options might map to only a handful of classes, or possibly a single base class.

Since our sample JSON/XML reader application is so simplistic, we're limited in the number of configuration sections we can create. Our application is a desktop GUI application so "gui" is a good candidate configuration section. We can also read JSON or XML from the web so "network" might be considered for a configuration section. Finally, we may also want a "core" section for configuration options that don't fit into either of the above sections. The "core" configuration section is intentionally general because we still need to leave room for changes. As our application code evolves and is re-factored, we may discover new concepts to map our configuration options to. The "gui" and "network" concepts, although broad, are clear enough for both users and developers upfront.
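
As a rough sketch of how these three sections might look on disk, the following uses Python's standard configparser module and an INI-style layout. The INI format itself is an assumption made for illustration. Nothing above dictates a particular file format, only the section names "core", "network", and "gui".

```python
from configparser import ConfigParser

# Hypothetical skeleton of the viewer's configuration file.
# Only the section names come from the discussion above.
SKELETON = """
[core]
# options that affect several parts of the application

[network]
# options for loading data from a URI

[gui]
# options that alter the look and feel
"""

parser = ConfigParser()
parser.read_string(SKELETON)

# Sections are reported in the order they appear in the file.
sections = parser.sections()
print(sections)
```

Starting with empty sections keeps the structure visible to the end user without committing to any options before they are needed.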

Let's take a step back from our sample application for a moment and take a look at some of the various types of configuration options. The most basic type of configuration option we can have is a boolean, on/off value. These are the configuration options that either enable or disable some aspect of the system. With boolean configuration options, you typically specify a true value to enable something. For instance, setting "ssh_enabled" to a true value enables SSH. We can also use boolean configuration options to negate a certain aspect of the system. For instance, setting "ssh_disabled" to a true value disables SSH. In the latter case, we've changed the boolean configuration option to take on a negative meaning. Most of the time, the former approach of true values acting as an enabler works best because the intent is always clear. Configuration options can also specify a literal value such as a string or integer. These types of configuration values are useful for default values used in the system. Finally, configuration options can specify a list of literal values. Lists are also typically used as defaults within the system or as a means to extend the system.
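
The three option types can be sketched with Python's standard configparser module. Only "ssh_enabled" comes from the text above. The "example" section and the "ssh_port" and "ssh_users" options are hypothetical, and storing a list as a comma-separated string is an assumption, since the INI format has no native list type.

```python
from configparser import ConfigParser

SAMPLE = """
[example]
ssh_enabled = true
ssh_port = 22
ssh_users = alice, bob
"""

parser = ConfigParser()
parser.read_string(SAMPLE)

# Boolean option: a true value enables the feature.
enabled = parser.getboolean("example", "ssh_enabled")

# Literal option: a single integer value.
port = parser.getint("example", "ssh_port")

# List option: split the comma-separated string ourselves.
users = [u.strip() for u in parser.get("example", "ssh_users").split(",")]

print(enabled, port, users)
```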

Going back to our JSON/XML example, let's come up with some configuration values. Is there any aspect of the application we may like to disable at some point? Perhaps the obvious choice is to have the ability to disable the JSON format or to disable the XML format. What type of configuration option should we use to accomplish this? Two boolean configuration options, "json_enabled" and "xml_enabled", could be added to the configuration file. By default, our application would probably have these two options set to true since these options make up the core of the system. Next, what configuration section do these configuration options belong to? For now, they go in the "core" section because changing these options has an impact on several concepts in our application. If we were to set "xml_enabled" to false, the application would no longer load XML data. An alternative to using boolean configuration options is a list. For instance, we could define a "data_formats" configuration option that defaults to a list of "JSON" and "XML". With this approach, there is a need for only a single configuration option instead of two. The problem with this approach is that we've introduced a notion of extensibility to both developers and the end user when this isn't the case. To disable one of these data formats, we'd have to remove it from the list. Once removed, it is as though the data format never existed.
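
Here is a minimal sketch of the two-boolean approach, again assuming an INI-style file read with Python's configparser. The enabled_formats() helper is a hypothetical name. Both options fall back to true when they are absent, which matches the idea that the two formats are on by default.

```python
from configparser import ConfigParser

def enabled_formats(parser):
    """Return the data formats that have not been switched off."""
    formats = []
    # fallback=True means a missing option leaves the format enabled.
    if parser.getboolean("core", "json_enabled", fallback=True):
        formats.append("JSON")
    if parser.getboolean("core", "xml_enabled", fallback=True):
        formats.append("XML")
    return formats

# A configuration that says nothing keeps both formats on.
defaults = enabled_formats(ConfigParser())

# A configuration that explicitly disables XML.
parser = ConfigParser()
parser.read_string("[core]\nxml_enabled = false\n")
without_xml = enabled_formats(parser)

print(defaults, without_xml)
```

Note that the disabled format is still visible in the file as "xml_enabled = false", which is the property the list-based alternative loses.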

What about our other two configuration sections, "network" and "gui"? In the network section we have a good opportunity to use a list configuration type. An "allowed_hosts" option would allow us to specify a list of hosts from which the end user is allowed to request data. The "network" section could also use a "network_enabled" boolean option that allows us to disable networking entirely. Our "gui" configuration section is largely undefined at the moment. This is fine because the last thing we want to do is throw configuration options into sections when they aren't needed. The "gui" section is a good place to put proxy configuration options, that is, options that are simply passed through to another component, if and when they are needed. Our application is going to use a GUI library of some sort and these libraries offer a plethora of parameters to change the look and feel of the application. The configuration options defined in the "gui" section can usually be passed directly through the application and into the GUI library.
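
The "network" and "gui" ideas could be sketched as follows, assuming an INI-style file and Python's configparser. The may_fetch() helper, the host names, and the "title" and "width" options are all hypothetical. Treating an empty "allowed_hosts" list as "no hosts allowed" is also an assumption; the opposite policy would be just as defensible.

```python
from configparser import ConfigParser
from urllib.parse import urlparse

SAMPLE = """
[network]
network_enabled = true
allowed_hosts = example.com, data.example.org

[gui]
title = Data Viewer
width = 800
"""

def may_fetch(parser, uri):
    """Decide whether data may be requested from the given URI."""
    if not parser.getboolean("network", "network_enabled", fallback=True):
        return False
    raw = parser.get("network", "allowed_hosts", fallback="")
    allowed = {h.strip().lower() for h in raw.split(",") if h.strip()}
    host = urlparse(uri).hostname
    return host is not None and host in allowed

parser = ConfigParser()
parser.read_string(SAMPLE)

# Pass-through options: collected as-is for the GUI library.
# configparser returns every value as a string.
gui_options = dict(parser["gui"])

print(may_fetch(parser, "http://example.com/data.json"))
print(gui_options)
```

The application never interprets the "gui" values itself here. It only hands them on, which is what makes them cheap to add later.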

What we have seen here with our simple example is that there is some thought involved with how a finished configuration file for an application should look. The options used in your configuration file should map well to the concepts found in your application design. If you're writing an application and something seems unclear about a configuration option, change it, and don't hesitate, because the sooner you correct the ambiguity the better. Problems with configuration options include badly named options and misused values. These issues aren't always trivial to spot, but putting yourself in the mindset of someone who has to use your configuration file for productive work will often help with the design.

Wednesday, June 9, 2010

Open Source UML

There are a variety of open source modeling tools in existence today. Some of these tools are more popular than others for various reasons. One reason for the variation in popularity is the support of the UML specification or lack thereof. Another reason is the overall user experience while building models with these tools. Proprietary UML tools are much more advanced than their open source counterparts in several dimensions. This is why they are the preferred choice by most developers who use the language. They are usually the right tool for the job. If an open source UML modeling tool does not implement part of the UML specification that is important to the developer, this is a show-stopper. Software developers want to use the right tool for the job. Anything that involves less modification and less duplication is always a bonus. However, open source UML modeling tools do exist and are in wide use today so there must be a reason for this. There are some advantages and disadvantages to choosing an open source modeling tool. Two questions to ask before choosing a UML modeling tool are as follows. If a tool doesn't support some part of the UML specification, how will that affect me? How many non-modeling features does the tool have that I will probably never use? I'd now like to explore these questions a little further.

One question to ask yourself that should probably precede the two above is: what am I using the UML for? This will obviously influence your choice of modeling tool because different features serve different needs. I'm not going to dig too deeply into what the various uses of the UML are and why they matter. Instead, I'd only like to consider two broad categories: UML as a sketch and UML as a specification. The former category is the more widely used of the two as it requires less of an investment. The latter has much more of an impact on the success of the project because the model has to be formally correct. Otherwise, cascading modeling errors cause major disruption. The specific purpose you want to use the UML for is irrelevant here. What is important is the style of modeling you want to use: sketch or specification.

Let's take a look at the UML specification itself. Some parts of it are essential for tool vendors to implement, such as classes and relationships. Other parts aren't as important, such as profiles and timing diagrams. This level of importance is relative to any given project domain. For instance, if we were modeling the specification of a real-time system, these sections of the UML specification would suddenly be much more important than if we were simply sketching ideas using UML notation.

Proprietary modeling tools have better support for the UML specification itself. In the rare case that you require a modeling tool to support the full UML language specification, the choice is easy. You need to invest in a proprietary tool. This is the exception rather than the rule. The majority of UML users do not need support of the full specification. Open source UML tools have good support for essential modeling elements such as classes, packages, relationships, use cases, interactions, and state machines. Some are better than others for modeling certain elements and they are all different from one another in terms of user experience.

What if you want to build UML models as a software system specification? Can you still do this if you only require a subset of the UML specification to be implemented? Indeed, you can. If you like a given tool, whatever the reason may be, you can't select it for use and hope that it will support the full specification in the future because that may never happen. Still, open source modeling tools are perfectly acceptable for using UML as a specification. Let's do a quick run-through of which UML elements can be used as a specification with open source modeling tools. Classes, packages and relationships? Check. These elements are ranked the highest because it would be next to impossible to model an object-oriented system without them. Actions, activities, control-flow, and data-flow? Check. We need these elements of the language when it comes time to build smaller, atomic computations of the system. Use cases? Check. These are a must for visualizing simple requirements. This only touches on the very fundamental elements of the language. Support for the UML specification in open source software goes well beyond what we need to cover here.

Open source tools have these areas pretty well covered without much deviation from the UML specification itself. Open source UML tool support in other areas of the UML specification, like interactions and state machines, is still lacking or inconsistent. Again, how important are these areas of the language to you? Even with incomplete or missing implementations, using UML as a sketch is possible with open source UML modeling tools. As an example, consider nesting modeling elements. With some open source tools, dragging one element into another to show a parent-child relationship has no effect in the semantic model. That is, internally, from the tool's perspective, the two elements are at the same level. From the end user's perspective, however, one element visibly sits inside the other, so the sketch still serves its purpose.

Up to this point, we've only touched upon modeling elements that exist within the UML specification and the tools that implement them. This is, after all, the primary goal of a UML modeling tool. There are, however, some features that fall outside the scope of the specification itself. The value of these additional features is something to consider when choosing a modeling tool. Many of these features are probably not needed.

Code generation is often presented as a must-have for any enterprise-grade UML modeling tool. Is this actually a must-have? Or is it a feature just for the sake of having a feature? Generating code is also supported by open source UML tools. The claimed benefit of generating code from a model is that the skeleton of the classes and their relationships can be built automatically. This saves us some tedious typing but it also introduces a level of coupling between the model and the code that may not be desired. This is because at the code level, there are going to be small hand-made changes that aren't reflected by the model. If you are building models as sketches, this is definitely not a feature you'd be willing to pay for.

XMI support enables the exchange of modeling meta-data. In theory this is a must-have feature for any modeling tool, open source or proprietary. It is a must-have because it is the standard format used to store and transfer semantic model data. If you are sketching UML diagrams, the underlying semantic model isn't as valuable to you. Therefore, the need for XMI importing and exporting isn't that great. Organizations tend to standardize on a modeling tool once one has been chosen. Even if you are modeling rigid, software system specifications, your need for XMI support may not be that great. In the spectrum of open source UML tools, support for XMI isn't quite there. Different tools support the interchange standard at different levels. It is comparable to the support for various web standards among various browser vendors. The little differences create more problems than the standard solves.

The overall user experience is a criterion that is sometimes overlooked in regard to UML software. Aside from what features are supported, what is needed, and what we can do without, usability has an impact on the quality of the models we produce. Usability isn't necessarily isolated from the feature set of the software. If any given UML modeling tool has too many features that are edging away from the UML specification, we're distracted. I would go so far as to say that this makes the software intimidating when it doesn't have to be. The number of steps required to construct a basic diagram should be small. If that number is too high, think about using something simpler. Building models is no different than writing code in that it is never going to be right the first time around. Modeling is an inherently iterative process. If your chosen modeling software lets you build models in a more timely manner thanks to a clean, simple user interface, give that software high marks.

As you can see, there are more attributes of a good UML modeling tool to consider than the number of features it has. Deciding on what you are using the UML for, sketching or specifying, changes the evaluation criteria when choosing software. Open source UML modeling tools are a good alternative to purchasing a proprietary tool in some cases, even though these tools haven't reached maturity yet. When open source UML software doesn't fit your needs, consider how many unnecessary features you will be paying for when choosing a proprietary tool.

Thursday, June 3, 2010

jQueryUI Tab By ID

The jQueryUI tabs widget allows you to select a tab programmatically by using a 0-based index. This is handy because sometimes you want to direct a user to a tab without them having to click it. The select() method of the tabs widget allows us to do just that. However, having to keep track of the indexes for each tab doesn't really make sense and hard-coding the index value for a specific tab makes even less sense.

It would be nice if we could select a specific tab by ID. The following function does just that.
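function select_tab_by_id(widget_id, tab_id) {
    // Find the position of the tab link whose href matches tab_id.
    // The href value is quoted so that the "#" is parsed correctly.
    var links = jQuery(widget_id + " ul a");
    var index = links.index(links.filter("[href='" + tab_id + "']"));
    // The "select" method was removed in jQuery UI 1.10; on newer
    // versions use .tabs("option", "active", index) instead.
    jQuery(widget_id).tabs("select", index);
}

select_tab_by_id("#my_tabs", "#tools");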