How Exhaustive Testing Ensured a Successful Voyage for the Mars Rover
When our software products reside on Earth, we might say the worst time to find a bug is during a sales demonstration. When our software is designed to hurtle nuclear-powered machinery through space, however, the worst time to find a bug might be while the code is controlling a $2.5 billion rover landing 352 million miles from Earth, on a planet not particularly accommodating to the landing of heavy craft. It’s all about context.
Testers often worry about uncaught bugs silently stinking up dark, never-exposed corners of code, waiting there to be found by real users behaving as only users can. But few testers must worry about bugs on other planets or about wrecking the surface of Mars.
Such worries are all in a day's work for NASA's Mars Science Laboratory (MSL) mission, which landed the Curiosity rover on Mars in August 2012 after a 253-day trip through space. MSL handled the myriad risks through well-planned software architecture, tight coding standards, and exhaustive testing.
Development testing specialist Coverity was brought in to run static analysis on the source code while it was being written, examining every path through the millions of lines of C and C++ code to find potential defects. That was no easy feat, according to Andreas Kuehlmann, vice-president of research and development at Coverity: the software had to be rigorously tested for its ability to respond to a huge number of scenarios.
“There were so many pieces that we had to get right,” Kuehlmann said. “You have lots of sensors that are measuring a lot of things all the time and the computation is controlling every aspect of the landing. That's very complex software with many, many things that can go wrong.”
NASA learned its lesson about what can go wrong in 2004 when the Spirit rover experienced a “flash memory management anomaly” that put the rover in serious danger of overheating and the mission in serious danger of ending prematurely. After the ensuing panic and subsequent bug fix, it was determined that the bug could have been avoided with better software architecture and rigorous testing.
No software is ultimately bug-free, not even the software used to control Curiosity, but roughly 2,000 bugs were zapped in the rover’s code with Coverity’s static analysis tools. The tools examined the code for flawed logic, problematic programming, and silent but potentially disastrous bugs such as resource leaks, memory corruption, and null-pointer errors. “For typical software (which this clearly isn’t), it’s not unusual to find approximately 1 defect for every thousand lines of code,” said Andy Chou, chief technical officer of Coverity.
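To give a feel for what "examining the code" without running it means, here is a toy sketch in Python. It is not Coverity's method, and the rover's code was C and C++, not Python. The function name find_unclosed_opens and the sample snippet are hypothetical, made up for illustration. The sketch parses source text into a syntax tree and flags any open() call that is not managed by a with statement, a crude stand-in for a resource-leak check.

```python
import ast

def find_unclosed_opens(source):
    """Return line numbers of open() calls not managed by a `with` statement.

    Toy heuristic only: it also flags files that are closed by hand later,
    because it looks at syntax and does not trace execution paths.
    """
    tree = ast.parse(source)

    # Collect the expressions that a `with` statement manages directly.
    managed = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.With, ast.AsyncWith)):
            for item in node.items:
                managed.add(id(item.context_expr))

    # Flag every bare call to the name `open` outside that set.
    flagged = []
    for node in ast.walk(tree):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == "open"
                and id(node) not in managed):
            flagged.append(node.lineno)
    return sorted(flagged)

# Hypothetical code under analysis; it is parsed, never executed.
SAMPLE = '''\
def read_leaky(path):
    f = open(path)
    return f.read()

def read_safe(path):
    with open(path) as f:
        return f.read()
'''

print(find_unclosed_opens(SAMPLE))
```

A production analyzer goes much further, following every path through the program to decide whether a resource can actually escape unreleased or a pointer can actually be null when it is used. The basic idea is the same, though: the defect is reported from the source alone, before the code ever runs.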
The exhaustive testing paid off: Measurements taken by Curiosity are now helping NASA design systems to protect human explorers from radiation exposure on future deep-space expeditions.