Why Your Test Efforts Should Tackle Data First
There is no question that automation poses technical challenges; you have to get your tool and application to play nice together before automation is even possible. But the most difficult obstacle to success is more mundane: It’s the data. If you can’t control, define, and predict the state of the data, you won’t have the repeatability that makes automation practical.
It starts with the data in the test environment. For a hospital, this includes which rooms are available, what doctors are on staff, and which medications are in inventory. For an airline, this means knowing what routes are flown, which flights are scheduled, and how many seats are available for each trip. You have to control the state of the data when you start execution because your test must know what to expect.
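One way to control the starting state is to rebuild the test data from a fixed baseline before every run. The following is only a minimal sketch, using an in-memory SQLite database as a stand-in for a real test environment; the `flights` table, the flight numbers, the seat counts, and the helper names are hypothetical and chosen purely to illustrate the airline example.

```python
import sqlite3

# Hypothetical baseline for the airline example: every run starts from these rows.
BASELINE_FLIGHTS = [
    ("GR101", "AUS-DEN", 12),
    ("GR202", "DEN-SEA", 0),
]


def reset_test_data(conn):
    """Drop and reseed the flights table so each run starts from the same state."""
    conn.execute("DROP TABLE IF EXISTS flights")
    conn.execute(
        "CREATE TABLE flights ("
        "flight_no TEXT PRIMARY KEY, route TEXT, seats_available INTEGER)"
    )
    conn.executemany("INSERT INTO flights VALUES (?, ?, ?)", BASELINE_FLIGHTS)
    conn.commit()


def seats_available(conn, flight_no):
    """Return the current seat count for one flight."""
    row = conn.execute(
        "SELECT seats_available FROM flights WHERE flight_no = ?", (flight_no,)
    ).fetchone()
    return row[0]
```

Calling `reset_test_data` at the start of each run means a test can rely on flight GR101 having 12 seats no matter what an earlier run booked or cancelled.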
Next comes the data you provide during the test. As you execute, you will create or transform the data: admit patients, book passengers, or make payments. Defining this data is an integral part of developing your test cases. For a manual test, you can adapt on the fly by looking for available rooms or experimenting with different flights, but an automated test must know the values in advance. Trying to adapt during automated execution is theoretically possible, but usually impractical; it leads to complex logic that introduces ambiguity and instability.
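Defining the input values in advance can be as simple as listing them next to the outcome each one should produce. This sketch assumes a toy `book` function standing in for the application under test; the case names, flight numbers, and seat counts are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BookingCase:
    name: str
    flight_no: str
    seats_requested: int
    expect_success: bool


# Seeded availability the cases rely on (hypothetical values).
SEATS = {"GR101": 12, "GR202": 0}

# Every input is fixed before execution; nothing is discovered at run time.
CASES = [
    BookingCase("books within capacity", "GR101", 2, True),
    BookingCase("rejects sold-out flight", "GR202", 1, False),
]


def book(seats, flight_no, requested):
    """Toy stand-in for the application under test."""
    if seats.get(flight_no, 0) >= requested:
        seats[flight_no] -= requested
        return True
    return False


def run_cases(cases, seats):
    """Run each case against a fresh copy of the seeded data; return failed case names."""
    failures = []
    for case in cases:
        result = book(dict(seats), case.flight_no, case.seats_requested)
        if result != case.expect_success:
            failures.append(case.name)
    return failures
```

Because each case names its flight and seat count up front, the runner needs no logic to hunt for an available flight, which is where the ambiguity and instability tend to creep in.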
Finally, you must be able to predict the result. This is harder than it sounds. You can’t get away with saying “verify that the value is correct” as you might in a manual test; you have to specify precisely what the correct response is. Manual testers have so much tribal knowledge that they are used to winging it, and reducing experience to exact values takes extra thought and effort.
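The difference between "verify that the value is correct" and an exact expectation shows up clearly in code. This is a small sketch of a payment check; `apply_payment` is a hypothetical stand-in for the application under test, and the amounts are made up.

```python
from decimal import Decimal


def apply_payment(balance, payment):
    """Toy stand-in for the application under test."""
    return balance - payment


# The starting balance and the payment are both known in advance,
# so the expected result can be stated as an exact value.
starting_balance = Decimal("250.00")
payment = Decimal("75.50")
expected_balance = Decimal("174.50")

actual = apply_payment(starting_balance, payment)
assert actual == expected_balance, f"expected {expected_balance}, got {actual}"
```

The expected balance can only be written down because the starting balance was controlled and the payment amount was defined beforehand; prediction depends on the two earlier steps.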
Too often, automation projects start by tackling the technical issues first, and only once execution becomes possible do they start to trip over the issues with the data. In my experience, the data issues should come first because they take longer to resolve and may involve other departments and resources. Subject matter experts, database administrators, network engineers, and hardware support teams may have to be coordinated before a robust and repeatable solution can emerge. Waiting until the test cycle starts is too late.
To those who argue that the technical issues have to be resolved before the data issues matter, I say that having a comprehensive data strategy enables both manual and automated testing. While manual testers can and do adapt on the fly, the truth is that this takes time. Most manual test effort is spent searching for or creating the conditions required for the test. If everyone can rely on a controlled, predictable data environment, then not only will manual testing be far more efficient, but the transition to automation will be significantly less painful.
In the end, if you have resolved the data challenges but find the technical issues are insurmountable, you have still gained productivity. But if you resolve the technical issues first and then cannot address the data issues, you have lost time and effort. It may not be as fun or as satisfying as slaying technical dragons, but addressing the data first is what lets both manual and automated testing succeed.