Friday, February 15, 2013

Complexity and Scale

What's the relationship between complexity and scale? Is there no relation between the two concepts when it comes to software architecture, so that your software either scales up or it doesn't? I think the opposite is true. There is no technology that encases our existing software systems and presents the owners with a scale button. That doesn't stop us from trying, and the results are mixed, which isn't surprising. The successful software scaling operations typically involve a simpler business domain. That's not to say it's impossible to scale up complex software systems. Facebook has an incredibly complex business domain, and I think it's safe to say they know how to scale up their systems. They're also a good example of how the two ideas are bound together: Facebook has a lot of knowledge about, and spends a lot of effort on, large-scale software.

If you work in a more realistically sized software shop, you probably work with software developers who also know a thing or two about scaling software systems to meet the demands of their users. But you probably don't work next to a dedicated team of scaling experts. Instead, I think the vast majority of development shops are focused on their product: making something usable that customers can actually get the hang of. So right up front, during the software design phase, the concepts of complexity and scale are tied to one another. Complexity, as the user perceives it in your product, will drive them away. Fewer users right off the bat reduce the likelihood of ever having to scale up your operation.

But let's think about how complexity impacts the practicality of scaling a software system to never-before-seen traffic volumes. Suppose the system we're scaling up isn't complex, meaning it doesn't have many moving parts. The team goes through the exercise of deploying the system to meet the demand of progressively higher levels of traffic. Things start to break, but the team is able to quickly identify and resolve the issues. It all goes well, and we're able to scale up our software system. What do you think: will it work as expected if we add another feature? How about two more?

I don't think the question is how many features it takes to bring down a previously scalable system, but rather how unpredictable that number is. The underlying infrastructure that enables our software to scale is bound to one or more generic usage patterns exhibited by the application. When you disrupt that pattern with complexity, you disrupt the scaling equilibrium.