Is There a Recommended Duration of Time for a User-Facing Test?
Involving end users in product and application testing is no longer a rarity. Organizations are now open enough about their product releases to include end users in their quality assurance processes. This takes varied forms: beta programs before and after release, crowd-testing programs, and ongoing telemetry measurements that show how the product is used in a live environment.
The success of such end-user test programs depends on several factors: timing, the areas under test, buy-in from internal teams to act promptly on user-reported issues, and the duration of the tests.
Although all of these factors are significant, for tests run before release to production the duration is critical in determining whether the test program meets its goal. For instance, suppose an A/B test is conducted early in the development cycle. How long should it run before you can confidently say you have enough data points to make a decision?
Deepak Tiwari, the head of analytics at Google, recently described how a round of rapid-fire A/B testing nearly resulted in a major fiasco for Google, and how the results were drastically different when the test was run for an additional month.
While there is no single right answer for how many days to run such user-facing tests, including A/B tests, there are efforts in the industry to build tools that determine the duration statistically. Such tools at least give the team a starting guideline and the confidence to address this important question analytically, rather than running the program without any due-diligence planning.
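To make this concrete, here is a minimal sketch of the kind of calculation such duration tools perform, based on the standard sample-size formula for comparing two proportions. The baseline conversion rate, minimum detectable effect, and daily traffic figures are illustrative assumptions, not values from any particular tool.

import math
from statistics import NormalDist

def ab_test_days(baseline_rate, minimum_detectable_effect,
                 daily_visitors_per_variant, alpha=0.05, power=0.80):
    """Estimate how many days an A/B test needs to collect enough data."""
    p1 = baseline_rate
    p2 = baseline_rate + minimum_detectable_effect
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # desired statistical power
    # Standard sample-size formula for comparing two proportions.
    n_per_variant = ((z_alpha + z_beta) ** 2
                     * (p1 * (1 - p1) + p2 * (1 - p2))
                     / (p2 - p1) ** 2)
    # Translate the required sample size into calendar days of traffic.
    return math.ceil(n_per_variant / daily_visitors_per_variant)

# Example: 5% baseline conversion, detecting a 1-point lift,
# 1,000 visitors per variant per day -> roughly 9 days.
print(ab_test_days(0.05, 0.01, 1_000))

The output is only a starting point: seasonality, weekday-versus-weekend effects, and novelty effects can all argue for running longer than the formula suggests, as the Google example above illustrates.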
The same quandary extends to beta programs: how long should they run? Software practitioners often have varying answers here, but one useful suggestion is to explicitly ask the beta testers for their feedback on the product's release readiness. Although this is a qualitative question, it can give the product team very useful insights. Google, an active proponent of user-driven testing, especially through beta programs, also supports staged rollouts to better answer the question of duration.
At the end of the day, the duration of a user-facing test is a case-by-case decision that should be based on the kind of user testing being done, whether it is pre- or post-release, what past experience with such testing has shown, and the goal of the test itself. A thought-out test duration will at least put you in the sweet spot for getting the desired results from your end-user testing efforts.