“What?” I replied.
“Every function in your classes is validating the data it’s getting. I can understand validating when the data comes into the web server, but for classes such as the ones you’re making, it’s just too much validation,” he said with a stern look.
“Oh, OK,” I said as he walked away. At the time, I really didn’t want to get into an argument, so I just did as he requested. From that point on, none of the functions in my classes validated the data being passed to them. I just trusted that the data was good.
It was a lot to ask.
When it comes to validating data, I am reminded of the Russian proverb, Доверяй, но проверяй, which means, trust, but verify. I’ve been bitten too many times to forgo thorough validation of the data my code is going to process. Still, my boss had a point. Data validation eats cycles and takes time that adds to the expense of running the code. My boss’s assertion forced me to move beyond the habits formed by my opinions and question if my validation practices really justified the expense they incurred.
Validate According to the Deployment Unit
So I gave it some thought. What I came up with is that reliable validation depends on the deployment unit. For example, imagine I’m writing a Node.js package, Who Lives There, that takes a street address and returns some profile information of the people living at that address. Of course, in order to get the package to provide the service expected, the information submitted needs to be a valid address, as does the format of the information itself. (See Figure 1.)
Figure 1: A Node.js package is an example of a deployment unit
The package accepts data at an entry point, index.js. The index.js code uses the code in addressValidator.js to validate the address and the code in profileAnalyzer.js to get the information about people living at the address. The profileAnalyzer.js does not need to validate the address because that work is done by addressValidator.js. Both addressValidator.js and profileAnalyzer.js reside in the same deployment unit, the Node.js package. This means that profileAnalyzer.js can have a good deal of trust that index.js, the controller object, has done the address validation already.
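As a sketch, the division of labor inside the package might look like the following. The module names come from the article; the implementations, and the field names on the address, are hypothetical, collapsed into one listing for readability:

```javascript
// Hypothetical sketch of the Who Lives There package internals.

// addressValidator.js: fine-grained validation with descriptive errors.
function validateAddress(address) {
  if (typeof address !== 'object' || address === null) {
    throw new TypeError('address must be an object');
  }
  for (const field of ['street', 'city', 'state', 'zip']) {
    if (typeof address[field] !== 'string' || address[field].length === 0) {
      throw new TypeError(`address.${field} must be a non-empty string`);
    }
  }
  return address;
}

// profileAnalyzer.js: does no validation of its own; it trusts that the
// controller has already run validateAddress on the incoming data.
function getProfiles(address) {
  return [{ name: 'A. Resident', street: address.street }];
}

// index.js: the entry point enforces the policy, so validation happens
// exactly once per request into the deployment unit.
function whoLivesThere(address) {
  return getProfiles(validateAddress(address));
}
```

Because both modules live in the same deployment unit, the single validation pass at the entry point is a guarantee the analyzer can rely on.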
Now let’s consider a scenario in which the deployment unit for Who Lives There is a Docker container, as shown below in Figure 2.
Figure 2: A Docker container is an example of a deployment unit
Granted, in order to avoid doing redundant work, there needs to be a conventional policy agreement among those working on the web application that all data validation is performed by the entity accepting the request, in this case server.js, the code that receives requests coming into the container.
However, there is a risk. Should logic change in profileAnalyzer.js in a way that changes the expected data structure of the address, the validation logic in addressValidator.js will need to be rewritten to match. There is no magic here. Somehow, the person making the change in profileAnalyzer.js needs to communicate with the person writing addressValidator.js so that a corresponding change is made. If addressValidator.js does not change, things can get weird.
Remember, profileAnalyzer.js is always at risk of being exposed to bad data. Without its own validation mechanisms, profileAnalyzer.js will just emit errors that are particular to its operation and are so far down in the code stack, that those experiencing the error will have little understanding as to the nature of the problem. The nice thing about well-written validation logic is that when things go wrong, the error messages usually describe how to fix the problem. Errors emitted outside of validation logic can be cryptic and hard to understand, let alone fix.
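A hypothetical contrast makes the point. Code with no validation of its own fails with a low-level error that says nothing about the real problem, while a validator can name the field, the expected type, and even a fix:

```javascript
// Hypothetical illustration of the two kinds of failure described above.

// No validation: with a missing street, this throws a TypeError about
// calling toUpperCase on undefined; nothing in the message points at the
// actual problem, which is the bad address.
function getProfiles(address) {
  return [{ street: address.street.toUpperCase() }];
}

// Validation: the error names the field, the expected type, and an example
// of a correct value, so the caller knows how to fix the problem.
function validateAddress(address) {
  if (typeof address.street !== 'string') {
    throw new TypeError('address.street must be a string such as "10 Main St"');
  }
  return address;
}
```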
The Importance of a Data Validation Policy
When it comes to testing data-driven code such as Who Lives There, understanding the data validation policy in force is an important part of the software development process. For developers, this means knowing where to test happy and sad paths in terms of data validation. In the scenarios above, it makes little sense to write sad-path unit tests for profileAnalyzer.js beyond checking that it fails in a general way when given bad data. However, sad-path testing addressValidator.js for more fine-grained error responses does make sense.
For test practitioners who implement functional tests, this means having a clear understanding of where data validation takes place relative to the associated entry point, and then executing tests that exercise that validation accordingly. Performance tests should also monitor validation activity to make sure it is efficient in terms of CPU consumption and time to execute.
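As a rough sketch of the kind of measurement a performance test might make, here is a hypothetical micro-benchmark of a validateAddress function; the absolute numbers will vary by machine, but the shape of the measurement is the point:

```javascript
// Hypothetical micro-benchmark: what does validation cost per call?
function validateAddress(address) {
  for (const field of ['street', 'city', 'state', 'zip']) {
    if (typeof address[field] !== 'string' || address[field].length === 0) {
      throw new TypeError(`address.${field} must be a non-empty string`);
    }
  }
  return address;
}

const address = { street: '10 Main St', city: 'Springfield', state: 'IL', zip: '62701' };

// Time 100,000 validations with Node's high-resolution clock.
const start = process.hrtime.bigint();
for (let i = 0; i < 100000; i++) validateAddress(address);
const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;

console.log(`100,000 validations: ${elapsedMs.toFixed(2)} ms`);
```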
Putting It All Together
Data validation is a critical aspect of application activity. If I had a dollar for every bug I had to fix that turned out to be about bad data, I’d be well on my way to an all-expenses-paid trip to an island resort. But there is a good argument to be made that a developer can be overzealous when writing validation code. Remember, each if/then statement you write is a programming expense. Over time, they add up. The trick is to use CPU cycles wisely. Thus, creating a data validation policy that is well known and easy to follow by everyone involved in a product’s software development life cycle is a good way to make fault-tolerant code that runs efficiently.