Why Is Estimating Software Testing Time So Difficult?
Management loves to ask testers to estimate how long their efforts will take. We can know how long individual test cases take to run, and we can assume there will be some bugs that require time to fix and retest. But beyond these certainties, there is much we do not know, and that makes it difficult to predict a timeframe.
Here are nine factors that significantly influence our ability to estimate testing time.
1. Although bugs are almost a certainty, how long their fixes will take is less certain. Most organizations lack sufficient historical data to build estimates from. Without the data of experience, it’s difficult to create accurate estimates.
2. The next key factor is the test team itself. How large is the team? What is everyone’s level of skill and experience? Does the team have well-defined testing processes that everyone understands and can choose among? How much time can the team focus on testing tasks without interruption? And how well do team members work with one another? These characteristics are all vital to the team’s performance and, thus, to the estimates for testing time, but we have few ways of measuring them.
3. System size, complexity, and risk also influence the amount of testing that “should” be performed. And again, we have few effective ways of measuring these components.
4. How stable are requirements? We don’t “freeze the requirements” these days; in today’s agile world we welcome change, and changes to requirements bring changes to the testing and to the estimates.
5. Another unknown is the defect density in the requirements, design, and code. Buggy requirements and design will result in buggy code. That factor has a substantial impact on the amount of time testing will require.
6. We also need to consider the developer “screw-up rate” when fixing defects. The general view is that about 5 percent of the “fixes” either will not fix the original problem or will break something else in the product, but we rarely know the actual rate for a given team and product.
7. The required thoroughness (coverage) of the testing plays a role, too. Is this a cribbage game app in which minor errors might be acceptable, or a drug infusion system where errors can be deadly? Does the system have zillions of paths that each require a unique test, or myriad combinations of data it must process correctly each time?
8. The availability and reuse of previous test assets and environments can significantly reduce the time required to test. Unfortunately, there are no generally accepted ways to measure that element.
9. Lastly, good test estimation is just plain hard work. Joel Spolsky’s evidence-based scheduling method has four steps: break the planned testing tasks down into small chunks, track the actual elapsed time, simulate the future using Monte Carlo methods, and manage your project actively. He reports good success using this method, but who really wants to go to all that work? Not many organizations I know of.
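To make the Monte Carlo step in item 9 a little more concrete, here is a minimal sketch in Python. It is not Spolsky's implementation, only one reading of the idea: turn past (estimated, actual) pairs into "velocities," then repeatedly resample those velocities to project a range of totals for the upcoming tasks. The function names, the historical numbers, and the task estimates are all hypothetical.

```python
import random


def simulate_totals(estimates, past_velocities, rounds=1000, seed=None):
    """Return a sorted list of simulated total hours for the given estimates.

    Each round divides every task estimate by a velocity drawn at random
    from history (velocity = estimated hours / actual hours) and sums them.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(rounds):
        total = sum(est / rng.choice(past_velocities) for est in estimates)
        totals.append(total)
    totals.sort()
    return totals


def percentile(sorted_totals, pct):
    """Pick the value at roughly the pct-th percentile of a sorted list."""
    index = min(len(sorted_totals) - 1, int(pct / 100 * len(sorted_totals)))
    return sorted_totals[index]


# Hypothetical history: (estimated hours, actual hours) for finished tasks.
past_tasks = [(4, 5), (8, 8), (2, 4), (6, 7.5), (3, 3)]
velocities = [est / act for est, act in past_tasks]

# Hypothetical estimates, in hours, for the upcoming testing tasks.
upcoming = [5, 3, 8, 2, 6]

totals = simulate_totals(upcoming, velocities, rounds=2000, seed=42)
for pct in (50, 80, 95):
    print(f"{pct}% of simulated runs finish within "
          f"{percentile(totals, pct):.1f} hours")
```

The output is a spread of outcomes rather than a single date, which is the point of the technique. It also shows why the method takes work: the simulation is only as good as the tracked history feeding it, and that history is exactly the data most organizations lack.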
Is it any wonder that test estimation is so difficult? So many important factors elude our measurement, and even if we knew them, it would require substantial effort to turn those metrics into a reasonable estimate. Perhaps we would be better off investing the time we would spend estimating into doing actual testing instead.