Tuesday, September 25, 2012

Simplicity Through Primitiveness

Software development craftsmanship involves seeking simplicity. Not only because it's an art form, but because it's practical to do so. Simplicity directly translates into software with fewer moving parts. Fewer moving parts, in turn, mean less risk. I could go on and on about how great simplicity is and how we should all try to reach the simplest possible solutions. But most programmers know what simplicity is, and have grown to appreciate it for any number of reasons. It's generally thought of as a positive side-effect of writing good code. Or the inverse: good code is a side-effect of seeking simplicity.

These are all the good things simplicity brings to a software development project. Believe it or not, there is such a thing as over-simplification. You can miss important details, ones relevant to what your software is trying to accomplish. I see the term as a bit of a misnomer, though: you should try to over-simplify. You should make an active effort, as you're constructing components, to eliminate excess weight. When you inevitably realize that you've over-simplified, because you're missing key functionality, you can add it. You've got a simple foundation on which to add missing capabilities.

At this point, you then want to repeat the process. Newly added components could do no harm. Or they could require changes to your foundation. Or they could introduce inconsistencies with conventions, or just put a damper on the warm and fuzzy feeling you got from your once clean and tidy code. Again, this comes back to actively monitoring code as it gets added, as it changes, and even as it sits there and matures, always looking for the simplest possible outcome. This presents a problem: how do you define the simplest possible outcome? Is it the least amount of code for a given function? Is it the best documented code? The "best" abstractions, the "best" third-party libraries, the "best" package structure? The fact is, if you're not careful, you can do more harm than good in your quest for simplicity.

Consider layers of abstraction in a class hierarchy. At the top, you have the most abstract thing, which doesn't do much of anything besides define interfaces. In the middle, you're starting to drill down into meaningful structure and behaviour, maybe even classes that support direct instantiation. Finally, you reach the opposite end of the abstraction tree, where it's as real as it gets, and everything has been passed down, for better or for worse. I include "for worse" here because it helps to illustrate the key point I'm making about abstractions. The simplicity is artificial in the sense that this entire structure we've just created to help simplify things merely hides the trouble spots, or complex implementation details if you will. Encapsulation isn't a horrific problem that'll lead to nasty bugs. I'm not suggesting that at all. What I am suggesting, however, is that encapsulation, inheritance, and other object concepts like these move us away from primitiveness. That may sound like an overly obvious statement, but if you think about it in terms of layering, you can distance yourself from the language. Perhaps further than you intended.
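As a rough sketch of the three layers described above, here is a small Python hierarchy. The class names (`Shape`, `Rectangle`, `Square`) are hypothetical and chosen only for illustration. The point is to show how much sits between the concrete class at the bottom and the language itself.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    """Top layer (hypothetical): defines an interface and little else."""

    @abstractmethod
    def area(self) -> float:
        ...

    def describe(self) -> str:
        # Passed down to every subclass, for better or for worse.
        return f"{type(self).__name__} with area {self.area():.2f}"


class Rectangle(Shape):
    """Middle layer: meaningful structure, supports direct instantiation."""

    def __init__(self, width: float, height: float) -> None:
        self.width = width
        self.height = height

    def area(self) -> float:
        return self.width * self.height


class Square(Rectangle):
    """Bottom layer: as concrete as it gets."""

    def __init__(self, side: float) -> None:
        super().__init__(side, side)


square = Square(2)
print(square.describe())  # Square with area 4.00

# Every class a reader has to pass through before reaching the language.
layers = [cls.__name__ for cls in Square.__mro__]
print(layers)  # ['Square', 'Rectangle', 'Shape', 'ABC', 'object']
```

Nothing here is wrong, and `Square` looks simple in isolation. To understand what `describe()` actually does, though, you have to read two classes up.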

If you consider abstractions in terms of applicability, you'll strive to create generic abstractions. You can take a generic enough abstraction and apply it in a number of scenarios, as opposed to just the handful specific to the domain for which it was designed. But let's look at this idea from another angle. Think about abstractions in terms of portability. Imagine, with each component you design, that you're creating a miniature software system. You're trying to reach a broad audience with this component, and so you have to aim for portability.

What is the difference between generic solutions and portable solutions when it comes to designing abstractions? We often have competing principles when we're writing code. For example, writing something generic enough to be shared across the system means you're going to layer your abstractions on top of one another. If one level doesn't support the required API, we add another layer that hides some implementation details, builds on the existing layers, and ultimately gets us what we need. This approach is the common way of dealing with conflicting system components that must communicate with each other. It is our job to come up with a solution, not a hack job, but a proper object-oriented design. This pattern of software development tends to bury us in layers. Perhaps as an unintentional side-effect of quality code. If you take a step back and look at the resulting software system, you'll notice that any level of primitiveness is lost.
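Here is a minimal sketch of that layering pattern in Python. Every name in it (`LegacyStore`, `KeyValueAdapter`, `SettingsService`) is made up for illustration. It assumes an existing component whose API doesn't fit what callers need, so each new layer hides the one beneath it.

```python
class LegacyStore:
    """Hypothetical existing component; its API is not what callers want."""

    def __init__(self):
        self._rows = []

    def put_row(self, key, value):
        self._rows.append((key, value))

    def scan(self):
        return list(self._rows)


class KeyValueAdapter:
    """Layer 1: hides scan() behind lookup-by-key."""

    def __init__(self, store):
        self._store = store

    def set(self, key, value):
        self._store.put_row(key, value)

    def get(self, key, default=None):
        # Walk the rows newest-first so the last write wins.
        for row_key, row_value in reversed(self._store.scan()):
            if row_key == key:
                return row_value
        return default


class SettingsService:
    """Layer 2: builds on layer 1 to expose the API we finally need."""

    def __init__(self, kv):
        self._kv = kv

    def enable(self, flag):
        self._kv.set(flag, True)

    def is_enabled(self, flag):
        return bool(self._kv.get(flag, False))


settings = SettingsService(KeyValueAdapter(LegacyStore()))
settings.enable("dark_mode")
print(settings.is_enabled("dark_mode"))  # True
print(settings.is_enabled("beta"))       # False
```

Each layer is reasonable on its own. Taken together, a question as basic as "is this flag on?" now passes through three classes before it touches a list.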

Now, if you've done your job, you've worked around any difficulties in creating components for your system, and they're well documented and easy to understand. In fact, this isn't all that hard to do. You have a team of developers, a small team let's say. Each member is solely responsible for the development and maintenance of one or two components. That way, there is no overlap, no miscommunication, and everything works on its own just as it should. When you need to integrate another developer's component, you consult with the author, or maybe the documentation is enough and you don't need to. This all sounds very well organized, and it might work well for incredibly small software systems. But layers, ownership, documentation: these are all overheads in one sense or another.

Consider documentation. Imagine that instead of designing your own component to serve a particular purpose inside your software, you use something directly out of the language's library. In other words, something primitive. Now, a primitive piece of any programming language is going to be well tested and well documented. Primitive components are well understood by programmers because they're available wherever the language is, and are often a prerequisite for performing other useful actions. Compared with your own component, you no longer have to maintain specific documentation, the likelihood of component-specific bugs goes down, and there is less learning overhead. Now consider layers. There is no doubt that you'll face the layers that your team-mate has put together in order to manifest his or her idea. The component author isn't always going to be there to support you when something goes wrong. More often than not, component bugs are discovered when the component is used in the broader context of a software system. And so you'll need to diagnose the issue by navigating and understanding the layers specific to that component. Once you've started down that path, the whole idea of self-containment has lost much of its appeal.
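To make the comparison concrete, here is a small Python sketch. `WordTally` is a hypothetical hand-rolled component, invented for this example. `collections.Counter` is the standard-library piece that covers the same ground in this case.

```python
from collections import Counter


class WordTally:
    """Hypothetical home-grown component: ours to document, test, and debug."""

    def __init__(self):
        self._counts = {}

    def add(self, word):
        self._counts[word] = self._counts.get(word, 0) + 1

    def top(self, n):
        ranked = sorted(
            self._counts.items(), key=lambda item: item[1], reverse=True
        )
        return ranked[:n]


words = "the cat and the hat and the bat".split()

# The custom component: a teammate has to learn add() and top().
tally = WordTally()
for word in words:
    tally.add(word)
custom_result = tally.top(2)

# The primitive: already documented and available wherever Python is.
primitive_result = Counter(words).most_common(2)

print(custom_result)     # [('the', 3), ('and', 2)]
print(primitive_result)  # [('the', 3), ('and', 2)]
```

Both give the same answer here. The difference is everything around the code: the first version comes with documentation to write and bugs that only your team can have.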

Primitiveness isn't my argument for going backward through time and using only low-level constructs to build complex software systems. Not at all. In fact, we need extensive levels of abstraction, where it makes sense, if we're to have any level of comprehension in the face of overwhelming complexity. My argument for using primitive constructs is an extension of the idea that abstractions should take away the less important features of an object. Whether by hiding or by removing entirely, what programmers don't see or touch means one less moving part to tinker with. One less obstacle that prevents a coherent set of modules from interacting as they should, and producing the desired outcome. So let's chase after simplicity. Let's use primitive components where the only loss is complexity.