
White-Box Testing in TDD


Test-driven development has accumulated plenty of esoteric techniques and beliefs. Here is a concrete example of a simplification and correction in testing.

This example is executable Python, but the ideas apply across most computing languages, and at least one of its generalizations is likely to apply in your situation:

While I doubt that this small function precisely quotes anything actually in use, it’s representative of many, many real-world definitions. Plenty of code in use that deserves testing differs from this example by only a line or two.

import os

def check_result(handle):
    """
    The context is that several long-running processes,
    indexed by user account, have been launched. The
    current function determines whether a result is ready,
    and, if so, what it is for $USER.
    If a result isn't ready--if the long-running process
    is still running--just return None as a sentinel.
    """
    outcome = done(handle)
    if outcome:
        return outcome[os.environ["USER"]]
    return None

Testability

One of the first reactions a tester should have to this tiny definition is that it’s hard to test. More precisely, when different developers, testers or continuous integration (CI) processes exercise a test of check_result, they’re likely to receive different results, because each one involves a different $USER.
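To make that concrete, here is a minimal sketch of a naive test, assuming a hypothetical fixture launch_finished_process() and a hypothetical EXPECTED value; whether it passes depends entirely on which account happens to run the suite:

import unittest

class TestCheckResult(unittest.TestCase):
    def test_ready_result(self):
        # launch_finished_process() and EXPECTED are hypothetical stand-ins,
        # not part of the original code.
        handle = launch_finished_process()
        # check_result() silently reads os.environ["USER"], so this assertion
        # can pass on one developer's machine and fail in CI.
        self.assertEqual(check_result(handle), EXPECTED)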

A natural response would be to add another argument to check_result(). As is frequently the case, a more general or properly abstract definition turns out to be more testable. The signature of the function then becomes:

def check_result(handle, account):

At this point, the function looks more testable, but it’s less useful. The updated definition requires all clients of check_result() to compute and manage account for themselves. While testability is desirable, it can’t come at the cost of making every caller more complex.
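For illustration, with a hypothetical handle variable already in hand, every call site would have to change from the first form to the second:

# Before: check_result() manages the account itself.
result = check_result(handle)

# After: every caller must compute and pass the account.
result = check_result(handle, os.environ["USER"])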

A better solution is possible. All check_result()’s current clients assume that check_result() manages $USER on their behalf. One possible response is to keep the original definition for check_result() but create test methods that save, alter and restore os.environ. Automatic testing of definitions that bind outside resources — for this implementation’s purposes, the user account is external — frequently resorts to such invasiveness. It’s more feasible in a white-box situation, of course, when the implementation is visible to the testers. At the same time, to bind test definitions to implementation details makes the tests themselves more fragile.
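As a rough sketch of that invasive style, a test might temporarily rebind $USER with unittest.mock.patch.dict, which saves and restores os.environ automatically; make_finished_handle() is a hypothetical fixture, not part of the original code:

import os
import unittest
from unittest import mock

class TestCheckResultViaEnvironment(unittest.TestCase):
    def test_ready_result_for_known_account(self):
        # Rebind $USER only for the duration of the with-block;
        # patch.dict restores the original environment afterward.
        with mock.patch.dict(os.environ, {"USER": "test-account"}):
            handle = make_finished_handle({"test-account": "expected-value"})
            self.assertEqual(check_result(handle), "expected-value")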

An even better solution is possible, though. Consider:

import os

def check_result(handle, account=os.environ["USER"]):
    """ The context is that several long-running processes,
        indexed by user account, have been launched. The
        current function determines whether a result is ready,
        and, if so, what it is for $USER."""
    outcome = done(handle)
    if outcome:
        return outcome[account]
    return None

Reliance on a default argument allows all the clients of check_result() to continue to invoke it unchanged; for them, check_result() still takes responsibility for computation of account. At the same time, parametrization of check_result() makes testing easier; test definitions can pass in specific values for account that yield better-controlled results.
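A couple of pytest-style tests illustrate the payoff; make_finished_handle() and make_running_handle() are hypothetical fixtures that control what done() reports:

def test_result_for_specific_account():
    # done(handle) is assumed to return {"alice": "expected-value"}.
    handle = make_finished_handle({"alice": "expected-value"})
    assert check_result(handle, account="alice") == "expected-value"

def test_no_result_while_still_running():
    # done(handle) is assumed to return a falsy sentinel while running.
    handle = make_running_handle()
    assert check_result(handle, account="alice") is None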

Some computing languages don’t directly support default arguments; C and Java are examples. From their perspective, this example looks suspiciously tricky. The larger principle — that a more general definition can sometimes be more testable — is so important, though, that it’s worth learning workarounds for these languages.

In Java, for instance, it’s natural to create the effect of default arguments through method overloading. In such a case, one overload takes all the parameters, and the overload with fewer parameters supplies the default procedurally:

// Assuming Handle and Result are the application's own types:
Result check_result(Handle handle, String account) {
    …
}

Result check_result(Handle handle) {
    Map<String, String> env = System.getenv();
    return check_result(handle, env.get("USER"));
}

Whatever the implementation situation, a savvy tester looks for a way to cooperate with development to create an implementation that is both general enough to be testable and specific enough to be useful.

Ambiguity

We introduced a generalization of the check_result definition to make the original definition more testable. That testability pays off in uncovering a gap in the specification.

Recall the context of check_result: the long-running process that handle indexes eventually completes, at which point it returns results for several different accounts. What if the process completes, though, but no result is available for account? What is supposed to happen then?

Our current version of check_result raises a KeyError exception. Is that the right action? From the information available to us, we don’t know. Maybe an exception is needed; maybe the function should silently return None; maybe some third alternative is right.

This is the point at which an experienced tester reports back an ambiguity: the test has failed on the grounds that the requirements have a gap. Notice that parametrization of account makes it easier to see that someone needs to answer: what happens when done() returns an answer with no entry for $USER?
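A test along these lines pins the current behavior down and forces the question; make_finished_handle() is again a hypothetical fixture, and whether KeyError is correct is exactly what the specification needs to say:

import pytest

def test_completed_but_no_entry_for_account():
    # done(handle) has completed, but only "bob" has a result.
    handle = make_finished_handle({"bob": "some-value"})
    # Today this raises KeyError for "alice"; the requirements never say
    # whether that, None, or something else is the intended outcome.
    with pytest.raises(KeyError):
        check_result(handle, account="alice")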

Stylistic Hazard

That’s not all. The current version of check_result is so small that we should be able to understand it completely. That’s certainly a desirable property for an example. Even this little example, though, hides at least one more subtlety.

Focus on the test:

if outcome:

From the materials available to us, we’ve already identified a gap. Without a formal specification of done(), we’re unsure whether check_result precisely matches the requirements it should meet.

It’s not just check_result’s behavior that’s in question, though; at least one more ambiguity about done() is lurking.

This one takes a bit more explanation. Some styles of Python return False as a sentinel, and a reader of the check_result above might well speculate that’s exactly the intent: If the computation hasn’t finished, then done() returns False. If the computation has finished, then done() returns a dictionary of results.

That’s a problem, though. Python style manuals prescribe that a correct expression for testing a value that might be False is indeed

if outcome:

The problem is that this test picks up other values as well. Suppose done() finishes but returns an empty dictionary. In Python, a simple

if outcome:

treats both False and the empty dictionary identically. Is that intended? Without a more detailed specification, we don’t know.
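A few lines at the interpreter show the collision; both the False sentinel and an empty result dictionary fall into the same branch:

for outcome in (False, {}):
    if outcome:
        print(repr(outcome), "-> treated as a ready result")
    else:
        print(repr(outcome), "-> treated as still running")
# Both print "treated as still running", even though an empty dictionary
# arguably means "finished, with no results" rather than "not finished".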

At least one improvement is likely. Whatever the decision about the empty dictionary, it’s slightly better style in Python to use None, rather than False, as a sentinel, and to test it with

if outcome is None:

The only value that matches this test is None. Coding this way helps distinguish the sentinel from an empty dictionary.
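Put together, a sketch of the revised definition might look like the following; it assumes done() is changed to return None, rather than False, as its sentinel, which is itself a decision the specification has to confirm:

import os

def check_result(handle, account=os.environ["USER"]):
    """Return the ready result for account, or None while still running."""
    outcome = done(handle)     # assumed to return None as its "still running" sentinel
    if outcome is None:        # only the sentinel means "still running"
        return None
    return outcome[account]    # an empty dictionary now surfaces as KeyError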

This illustrates that sometimes — in fact, often! — a valuable result of a testing cycle is not “The software passed” or “Here are specific failures,” but “Gaps in the specifications result in these potential problems.”

An alert and knowledgeable tester generates higher-value results that identify not only errors, but also opportunities to improve the long-term resilience of the implementation. Complete specifications and good style help make software that is simultaneously more readable and more testable.
