Thursday, March 11, 2010

Python Key Errors

When accessing elements of a Python dictionary, we have to make sure that the key actually exists in the dictionary. Otherwise, a KeyError is raised. This isn't a big deal: since we know that this is the consequence of attempting to access a non-existent key, we can handle the error.
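As a minimal sketch of what that looks like in practice, the snippet below looks up a key that isn't there and handles the resulting error. The dictionary contents and the variable names are made up for illustration.

```python
# Hypothetical dictionary used only for illustration.
dictobj = dict(a=1)

try:
    result = dictobj["b"]  # "b" is not a key, so this raises KeyError
except KeyError as error:
    # The missing key is available as the first exception argument.
    missing_key = error.args[0]
    result = None
```

After this runs, result is None and missing_key holds the key that could not be found.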

A common way to handle dictionary element access is to use a try-except block that catches the KeyError. Another way to make sure you are not requesting an element that doesn't exist is to use the has_key() dictionary method. This method returns True if the key in question exists, at which point it is safe to access the element. Note that has_key() exists only in Python 2; it was removed in Python 3, where the in operator performs the same check.
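The check-before-access approach might look something like the sketch below. It uses the in operator rather than has_key() so that it runs on both Python 2 and Python 3; the lookup() helper is a hypothetical name chosen for this example, not part of any library.

```python
# Hypothetical dictionary used only for illustration.
dictobj = dict(a=1)

def lookup(mapping, key):
    """Return mapping[key] if the key is present, otherwise None."""
    # Test membership first so the subscript below cannot raise KeyError.
    if key in mapping:
        return mapping[key]
    return None

found = lookup(dictobj, "a")    # key exists
missing = lookup(dictobj, "b")  # key does not exist
```

On Python 2, replacing the membership test with mapping.has_key(key) should behave the same way.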

Which dictionary access method is better? Neither, really. It depends on your coding style, although it is always better to be consistent.

From a performance perspective, we can see a minor difference. For instance, the following example will attempt to retrieve a non-existent dictionary element using both methods.
from timeit import Timer

# Note: has_key() and the print statement below are Python 2 only.

def haskey():
    dictobj = dict(a=1)

    # Check for the key first, then look it up.
    if dictobj.has_key("b"):
        result = dictobj["b"]
    else:
        result = None

def keyerror():
    dictobj = dict(a=1)

    # Look the key up directly and handle the failure.
    try:
        result = dictobj["b"]
    except KeyError:
        result = None

if __name__ == "__main__":
    haskey_timer = Timer("haskey()", "from __main__ import haskey")
    keyerror_timer = Timer("keyerror()", "from __main__ import keyerror")

    print "HASKEY:", haskey_timer.timeit()
    print "KEYERROR:", keyerror_timer.timeit()

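In this case, the has_key() method is noticeably faster than the KeyError method, most likely because raising and catching an exception costs more than a failed membership test. However, this example only looks up an element that doesn't exist. What if the element does exist? In that case, the KeyError method is slightly faster, since the try block performs a single lookup while the has_key() version performs two.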
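To compare the two cases side by side, a benchmark along these lines could be used. This is only a sketch: it is written for Python 3, so it uses the in operator in place of has_key(), and the function names and the default iteration count are arbitrary choices for this example. Actual timings will vary by machine and Python version.

```python
from timeit import timeit

# Hypothetical dictionary used only for illustration.
dictobj = dict(a=1)

def check_first(key):
    # Membership test, then lookup (two dictionary operations on a hit).
    if key in dictobj:
        return dictobj[key]
    return None

def catch_error(key):
    # Direct lookup; the cost of a miss is raising and catching KeyError.
    try:
        return dictobj[key]
    except KeyError:
        return None

def benchmark(number=100000):
    """Time both approaches for a missing key and for an existing key."""
    results = {}
    for label, key in (("missing", "b"), ("existing", "a")):
        results[label] = {
            "check_first": timeit(lambda: check_first(key), number=number),
            "catch_error": timeit(lambda: catch_error(key), number=number),
        }
    return results

if __name__ == "__main__":
    for label, timings in benchmark().items():
        print(label, timings)
```

Each entry in the returned dictionary holds the total time in seconds for the given number of calls, so the two approaches can be compared directly within each case.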

1 comment:

  1. You may want to look into the "get" method for dictionaries. It looks up a key in a dictionary and, if the key does not exist, returns a default value if one is provided. In the above example you would do dictobj.get('b', None), and in the case where b is not in the dictionary it would return None to you. Syntax-wise it is a lot nicer, and in your benchmark it is only slightly slower than the has_key method (in my test it was 2.0 sec vs 1.8 sec).
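A short sketch of the get method described in the comment above, using a made-up dictionary. When no default is passed, get returns None for a missing key.

```python
# Hypothetical dictionary used only for illustration.
dictobj = dict(a=1)

value_a = dictobj.get("a")           # key exists, so its value is returned
value_b = dictobj.get("b")           # key is missing, default is None
value_b_zero = dictobj.get("b", 0)   # key is missing, explicit default is used
```

Unlike the subscript form, get never raises KeyError, so no try-except block or membership check is needed.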
