Thursday, January 3, 2013

Indexes and Keys

In any programming language, the developer typically has a choice between two collection types — a list or a dictionary. They may be called something similar, depending on the language — array and hash-map for example. The list uses an index to access a given item in the collection while the dictionary uses a key. A key could be a number, or some other primitive construct in the language, but they're usually strings. There is one interesting distinction between indexes and keys. Indexes are allocated by the list whereas keys have to be dreamed up by the programmer. This is both good and bad. Good because keys are strings based on some concept. Indexes, by contrast aren't so intuitive to the developer when they need to look something up later. Should we just stick with dictionaries in our code, or our we introducing some hidden issues?

The very concept of a list/array is an important one. The indexes only appear to to be arbitrary integers when we try to look something up. But it turns out that the look up itself is arbitrary and the index makes perfect sense. It's there to preserve ordering. Think about page numbers in a book. The numbers themselves are seldom used. We could probably ready a book that didn't print numbers on each page without getting lost. Just like we can use lists in software without directly looking at the indexes. We're using the list to manage a collection, not necessarily look up the 43rd item because it so happens that we need the 43rd item. Outside of the list context, the number 43 doesn't hold any meaning. Inside the list, 43 means that it is after 42 and before 44. Which is very relevant when iterating over the list. The important concept for the programmer during these iterations is the next item.

Just as we look up object attributes by name, we can look up dictionary items by key. They both share the same fundamental advantage and problem. Looking things up by key works well when order isn't relevant. Keys are something representative of our knowledge about the problem domain. We come up with keys and attributes based on something conceptually familiar to us. If we want the first name, we want to look it up by the firstName attribute not by the index 4. The advantage with keys is that they're portable. We doesn't exclusively assign a key to any one dictionary. A key is descriptive and might apply to several dictionaries. But the challenge with keys is that they're introducing new language into the system. The benefit of looking up something familiar by remembering works only if there are a handful of possible keys. Otherwise, you need reference material while you're programming just to remember what they are.

There is no real advantage to using a list or a dictionary exclusively — they each have their strengths. Lists maintain order, but indexes appear arbitrary when looking up an item. Dictionaries help in representing conceptual attributes of things, but introducing more complexity in terms of remembering what is available. It's interesting, how to seemingly simple structures within every programming language have such influence over how we think about solving a particular problem.