Recently I had the chance to read through "The Art of Capacity Planning" by John Allspaw (Engineering Manager, Operations at Flickr). To be honest, I thought I would be poring over mathematical theory on the implementation of capacity planning, logging facts and high-level theory into my head but retaining half of it or less. I was happily surprised to find that I was completely wrong.
This book runs through the practical implementation of capacity planning from an operations point of view, pointing out commonsense solutions to real-world problems using practical examples from Flickr and other companies along the way. I want to reiterate the "commonsense" portion of that last statement. John makes it clear that the easiest solutions are often the best. This may be common sense to everyone, but how many of us actually use this approach? Most of the time, whether in operations, software engineering, or website development, I see engineers take an approach that brings complexity to a whole new level, even when a simple solution is staring them in the face. I must admit, I sometimes fall into this group as well, building out complex algorithms to solve a problem (using a jackhammer on a nail). That is the point John is trying to hammer home: when it comes to building a scalable, reusable model for any project, the tried-and-true simple methods are often the best.
In the preface, John states that the audience for this book includes "systems, storage, database and network administrators, engineering managers, and of course, capacity planners." I've worn a lot of hats over my years of employment, sometimes falling into those categories. To be honest, I believe the audience for this book extends to software engineers, web developers, and anyone building a product on top of these systems. It's important for all developers to understand how what they build will affect the systems they are building upon, whether that means understanding how the number and size of database queries affect the server in a high-load situation or how proper caching methodologies help reduce that same server load. I can definitely say that I am a better software engineer after reading this book.
John ties in concepts such as using database replication to assist in normal REST requests, determining and implementing a server capacity plan, using load-testing techniques to test and measure web-server ceilings, and planning for unknowns such as disaster recovery or additional load through API access.
One of my favorite sections was the discussion on cloud computing, which cut through all of the marketing buzzword hype to give readers real company implementations of this technology, weighing why this approach works in some applications and fails in others. This is all done in light language with practical examples and comparisons that kept this reader's interest throughout the entire book.
In short, the audience for this text reaches far beyond those who were originally assumed to be interested parties. If you take away one key idea from this book, it should be that everything you do in a development environment, whether in operations or web application development, affects your business in many ways you may not have thought of. Proper capacity planning is everyone's responsibility, and understanding what you can do to help will make you better as an engineer, a network administrator, or in any other hat you wish to don on any given day.
Senior Software Engineer
Yahoo! Developer Network