Why should you test in production? Bluntly, you’re already doing it. The real question is whether you become deliberate, expert and reliable at it, or whether it remains an accident and a crisis.
You’re already testing in production in at least the sense that you have procedures for handling incidents and outages that affect customers directly. With a little more investment, that workflow can become an ongoing, manageable asset for your software development lifecycle (SDLC). Here’s how.
For current purposes, testing in production (TiP) has to do with web applications; the story for embedded software, installable software, and so on is outside the scope of this introduction. Also, TiP isn’t a trick that leaves customers to find a mass of defects that the publisher should have found first. Most testing is still done as early in the SDLC as possible, and TiP complements those other forms of testing in the few situations where the production environment is effectively irreplaceable.
TiP offers several things that testing in a staging environment can’t match:
- Delivery at extreme scale
- Practice at error recovery
- Data that respects customer privacy while representing customer diversity
- Good return on investment
Highlights of these four themes follow.
TiP for Scale
Some applications are simply big, and to replicate their facilities or loads would be prohibitively expensive. It might be impractical to maintain more than one global network topology, for instance.
In principle, most development organizations maintain a staging environment where a product can be tested thoroughly before final deployment. But in practice, essentially all staging environments are smaller than the production environment in at least one dimension. This means that, even when all tests pass in the staging environment, installation in production introduces new challenges: an order of magnitude more co-operating nodes, hundreds of consecutive hours of traffic to shuffle caches far more thoroughly, a longer tail of unusual data, a thread pool that’s a multiple of the size of the staging environment’s, entirely different query optimizations, and so on.
The staging environment becomes a model for production, then: a site to verify that the application behaves predictably under light stress. Full-blown execution only happens in production, though.
TiP for Error Recovery
One of the benefits of a focus on TiP is practice in useful disaster recovery and business continuity techniques. A good TiP team knows how to diagnose defects quickly, judge the best remedies for them, and, when the situation calls for it, revert reliably to a known good release. TiP is an unmatched opportunity to verify an organization’s digital resiliency.
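To make the idea of reverting to a known good release a little more concrete, here is a minimal sketch of a rollback decision. It assumes the team already collects request and error counts per release. The names `ReleaseStats` and `should_roll_back`, and both thresholds, are hypothetical choices for illustration rather than a recommended policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReleaseStats:
    """Request counts observed for one release over a monitoring window."""
    label: str
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        # A release that has served no traffic has no observed errors.
        return self.errors / self.requests if self.requests else 0.0


def should_roll_back(candidate: ReleaseStats,
                     known_good: ReleaseStats,
                     min_requests: int = 500,
                     tolerance: float = 0.01) -> bool:
    """Return True when the candidate is measurably worse than the known good release.

    min_requests and tolerance are illustrative assumptions, not recommendations.
    """
    if candidate.requests < min_requests:
        # Too little traffic to judge yet; keep observing.
        return False
    return candidate.error_rate > known_good.error_rate + tolerance


# Hypothetical numbers, used only to exercise the function.
known_good = ReleaseStats("known-good", requests=10_000, errors=20)
candidate = ReleaseStats("candidate", requests=1_000, errors=45)
print(should_roll_back(candidate, known_good))
```

A real rollback procedure would also cover how the revert itself is carried out and rehearsed; this sketch only covers the decision to trigger it.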
TiP for Test Data
Another aspect of TiP’s usefulness is the way it supplies test data. Conventional testing uses data that is similar to customer data, but of course not too similar, because that would violate the privacy of individual customers.
Production data, though, doesn’t just approximate the target; it is the target. Generating realistic data is a surprisingly expensive chore, so being able to use real data, when it’s handled correctly, is sometimes a big advantage.
At the same time, rigorous controls need to be in place so that the data doesn’t leak out. For a programmer to disclose that “test-password” accesses a test database isn’t so bad; to share personal details about even one real customer, though, is a grave matter.
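One small example of such a control is redacting personal details before a production record reaches a log or a test report. The sketch below is only an illustration: the field names in `SENSITIVE_FIELDS` and the `redact` helper are hypothetical, the email pattern is deliberately simple, and nested records are not handled.

```python
import re

# Hypothetical field names treated as sensitive; a real list would come
# from the organization's own data inventory.
SENSITIVE_FIELDS = {"email", "name", "phone", "password"}

# Deliberately simple pattern; it will not match every valid address.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact(record: dict) -> dict:
    """Return a copy of a flat record that is safer to write to logs or reports."""
    cleaned = {}
    for key, value in record.items():
        if key.lower() in SENSITIVE_FIELDS:
            # Drop the whole value for fields known to be sensitive.
            cleaned[key] = "[REDACTED]"
        elif isinstance(value, str):
            # Scrub email addresses embedded in free-text fields.
            cleaned[key] = EMAIL_PATTERN.sub("[REDACTED]", value)
        else:
            cleaned[key] = value
    return cleaned
```

Redaction of this kind supplements access controls and review; it does not replace them.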
TiP for Return on Investment
Engineering is always about choices, that is, finding the right balance between costs and benefits. TiP shouldn’t be seen as a last resort only for those situations when no other test is possible. Instead, think in terms of situations where TiP pays off.
It might be feasible, for instance, to create test data that adequately models customer data for a specific load test. If running such a test as TiP liberates an experienced analyst from data synthesis so they can focus on analyses that directly boost revenue, then TiP immediately becomes profitable.
Make the Most of TiP
The first and most important step toward TiP is the flexibility to consider it. Even with that in place, a great deal of technique remains. In fact, TiP has grown so big and beneficial that it’s a specialty unto itself. In particular, TiP techniques in the three phases of deployment, release and post-release are arguably as refined and complex as all of classical pre-production software testing. Comprehensive treatment of this range of methods is far beyond the scope of this introduction.
A few pervasive tips about TiP are worth recording, though:
- Work with feature flags and make them second nature in your designs and implementations
- Schedule slack resources so you can focus your first TiP efforts on the times and places your application is least loaded
- Instrument the application so you can see what customers see
- Create a variety of auxiliary test accounts that obey the same rules that govern real customers
- Enlist a security specialist to review your TiP for hazards and insights
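As an illustration of how feature flags and auxiliary test accounts can work together, here is a minimal sketch of a percentage rollout. The flag name, the account identifier and the in-memory `FLAGS` table are all hypothetical; real systems usually load flag configuration from a flag service or a config store.

```python
import hashlib

# Hypothetical flag configuration, kept in memory for illustration only.
FLAGS = {
    "new-checkout": {
        "rollout_percent": 5,                   # share of accounts that see the feature
        "allow_accounts": {"test-account-1"},   # auxiliary test accounts always included
    },
}


def bucket(flag: str, account_id: str) -> int:
    """Map an account to a stable bucket in the range 0-99 for the given flag."""
    digest = hashlib.sha256(f"{flag}:{account_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag: str, account_id: str) -> bool:
    """Decide whether the flagged feature is on for this account."""
    config = FLAGS.get(flag)
    if config is None:
        return False  # unknown flags default to off
    if account_id in config["allow_accounts"]:
        return True
    return bucket(flag, account_id) < config["rollout_percent"]


print(is_enabled("new-checkout", "test-account-1"))
```

Hashing the flag name together with the account identifier keeps each account’s answer stable between requests, and raising `rollout_percent` only ever adds accounts to the enabled group.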
By using these techniques, you can make TiP a manageable asset in your testing arsenal.