Monday, May 10, 2010

Flexible Python Conditionals

With dynamically typed languages such as Python, we can add flexibility in places where it reduces the complexity of the problem we are trying to solve. Take the if statement, for instance. We can pass any object to a function and use attributes of that object in the conditions of an if statement. Of course, you'll get an AttributeError exception if the object lacks the attribute, but that is easy to fix because the exception message names the missing attribute.
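
As a minimal sketch of this idea, the function below branches on an attribute without caring about the object's class. The Account and Plain classes and the active attribute are hypothetical names made up for this illustration; they do not come from the examples later in this post.

```python
class Account(object):
    """Hypothetical class that exposes an `active` attribute."""
    def __init__(self, active):
        self.active = active


class Plain(object):
    """Hypothetical class with no `active` attribute."""
    pass


def describe(obj):
    # Works for any object that has an `active` attribute;
    # raises AttributeError for objects that lack it.
    if obj.active:
        return "active"
    return "inactive"


def describe_safely(obj):
    # getattr() with a default sidesteps the AttributeError.
    if getattr(obj, "active", False):
        return "active"
    return "inactive"
```

Calling describe(Plain()) raises AttributeError because Plain has no active attribute, while describe_safely(Plain()) falls back to the default and returns "inactive".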

Even dynamically-typed languages aren't immune to complex conditionals, such as large if statements with several segments. During development, these can grow quite large. This isn't necessarily a problem of careless development. It is really more of a time constraint. The if statement is a great construct for executing alternative behavior; it is fast and the syntax is straightforward.

In Python, a dictionary offers an alternative conditional construct. A dictionary is a simple means of storing named data as key-value pairs. Since a key can be any hashable object and a value can be any object, including one that carries behavior, dictionaries can also execute behavior conditionally.
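
To make "storing behavior" concrete, here is a small sketch in which the dictionary values are functions and the key decides which one runs. The function names greet, dismiss and run, and the actions dictionary, are hypothetical and exist only for this illustration.

```python
def greet(name):
    return "Hello, " + name


def dismiss(name):
    return "Goodbye, " + name


# Keys select the behavior; values are the callables to execute.
actions = {"greet": greet, "dismiss": dismiss}


def run(action, name):
    # Look up the callable by key, then call it with the argument.
    return actions[action](name)
```

Here run("greet", "user1") does the same job as an if statement that compares action against each known string, but adding a new behavior only requires a new dictionary entry.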

So what is the performance difference between using if statements and using dictionaries to control behavior? Below is a small example that uses the two approaches.

import timeit

class User(object):
    def __init__(self, name):
        self.name = name

def test_dict(user):
    # The dictionary acts as the conditional: the key selects the object.
    user_dict = {"user1": User("user1"), "user2": User("user2")}
    return user_dict[user].name

def test_if(user):
    user1 = User("user1")
    user2 = User("user2")

    # Each segment is compared in turn until one matches.
    if user == "user1":
        return user1.name
    if user == "user2":
        return user2.name

if __name__ == "__main__":
    # The second argument is setup code that timeit runs once before timing.
    dict_timer = timeit.Timer('test_dict("user1")',
                              'from __main__ import test_dict')

    print("Dict", dict_timer.timeit())

    if_timer = timeit.Timer('test_if("user1")',
                            'from __main__ import test_if')

    print("If", if_timer.timeit())

In this example, we have two test functions, test_dict() and test_if(). The test_dict() function uses a dictionary as a conditional while the test_if() function uses an if statement. The test_if() function is faster than the test_dict() function. In the example above, test_if() evaluates to true in the first segment of the if statement. What happens if we change our test to look up "user2" instead of "user1"?

Well, the test_if() function is still faster than the test_dict() function, but not as fast as when searching for "user1". This is because both segments of the if statement are evaluated. What is worth noting here is that the test_dict() performance doesn't fluctuate based on what is being searched for. That is, the cost of test_dict() depends largely on the number of elements in the dictionary, since the dictionary is rebuilt on every call while the lookup itself takes roughly constant time on average. The cost of test_if() depends on the value being tested in addition to the number of conditions to evaluate.
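
One thing to keep in mind when reading these numbers: both test functions create their User objects on every call, so the timings include that setup work. The sketch below is a hypothetical variation, not part of the original benchmark, that builds the objects once at module level so that only the lookup is timed. The names USERS, USER1, USER2, lookup_dict and lookup_if are assumptions introduced here, and results will vary by machine and Python version.

```python
import timeit


class User(object):
    def __init__(self, name):
        self.name = name


# Built once, outside the timed functions (an assumption of this sketch).
USERS = {"user1": User("user1"), "user2": User("user2")}
USER1 = USERS["user1"]
USER2 = USERS["user2"]


def lookup_dict(user):
    # A single hash lookup, whichever key is requested.
    return USERS[user].name


def lookup_if(user):
    # Comparisons run in order, so later matches cost more.
    if user == "user1":
        return USER1.name
    if user == "user2":
        return USER2.name


if __name__ == "__main__":
    for key in ("user1", "user2"):
        for func in (lookup_dict, lookup_if):
            # timeit.timeit() accepts a callable as well as a string.
            elapsed = timeit.timeit(lambda: func(key), number=100000)
            print(func.__name__, key, elapsed)
```

Timing both keys with both functions makes it easier to see how much of the difference comes from the conditional construct and how much comes from object creation.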

The benefit of using dictionaries in this context is that they are flexible. Dictionaries can easily be computed and inserted into a conditional context like the example above. The test_dict() function does, however, have one drawback. What happens if we search for a non-existent user? The test_if() function already handles this scenario: neither condition matches, so it falls through and implicitly returns None. The test_dict() function will raise a KeyError exception. Below is the modified example that handles this scenario.

import timeit

class User(object):
    def __init__(self, name):
        self.name = name

def test_dict(user):
    user_dict = {"user1": User("user1"), "user2": User("user2")}
    try:
        return user_dict[user].name
    except KeyError:
        # Match test_if(): an unknown user yields None.
        return None

def test_if(user):
    user1 = User("user1")
    user2 = User("user2")

    if user == "user1":
        return user1.name
    if user == "user2":
        return user2.name

if __name__ == "__main__":
    # The second argument is setup code that timeit runs once before timing.
    dict_timer = timeit.Timer('test_dict("user1")',
                              'from __main__ import test_dict')

    print("Dict", dict_timer.timeit())

    if_timer = timeit.Timer('test_if("user1")',
                            'from __main__ import test_if')

    print("If", if_timer.timeit())
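
As an alternative to catching KeyError, the dictionary's own get() method returns None for a missing key. The sketch below is one possible variation, not the approach used in the example above; the function name lookup_with_get is hypothetical.

```python
class User(object):
    def __init__(self, name):
        self.name = name


def lookup_with_get(user):
    user_dict = {"user1": User("user1"), "user2": User("user2")}
    # dict.get() returns None when the key is absent instead of raising.
    found = user_dict.get(user)
    if found is None:
        return None
    return found.name
```

Both versions return None for an unknown user. Which one reads better is largely a matter of taste, and their relative speed would need to be measured rather than assumed.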