On Fuzz Testing

Posted by Karl Auerbach

There's an old joke. It was said that English automobiles of the 1950's came equipped with a walnut-inlaid toolbox containing many hammers. These ranged in size from a small jeweler's hammer up through a heavy, concrete-shattering sledge. It was said that when something in the automobile stopped working, one should begin by pounding on it with the smallest hammer. If that didn't solve the problem then one should move up to the next larger size. And so on, using ever larger hammers, until the sledge, which would reduce the automobile to a heap of shattered parts that could easily be hauled away – because the original problem was obviously insolvable.

Fuzz testing of software is somewhat like that old English automotive technique, but often without the benefit of an orderly sequence.

Fuzz testing is a form of brute-force testing - every possibility is thrown at the target in hopes that eventually something bad will happen and a flaw will be revealed. Fuzz testing can take a great deal of time. Fuzz testing is a plausible technique if the number of variations is small enough that all the possibilities can be tried before the target product becomes obsolete. But with some modern network protocols the time to test all the combinations could run into years - or, in many cases, eons.
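A bit of back-of-the-envelope arithmetic makes the point concrete. The probe rate below is an assumption for illustration; real fuzzers vary widely:

```python
# Rough time to brute-force every value of a single 32-bit protocol field.
# The 1,000 probes/second rate is an illustrative assumption.
FIELD_BITS = 32
PROBES_PER_SECOND = 1_000

total_values = 2 ** FIELD_BITS                    # 4,294,967,296 possibilities
seconds = total_values / PROBES_PER_SECOND
days = seconds / 86_400                           # seconds per day
print(f"{total_values:,} values -> about {days:.1f} days")

# Two such fields tested in combination would mean 2**64 probes --
# on the order of hundreds of millions of years at the same rate.
```

Even one field takes on the order of fifty days at this rate; combining fields is what pushes the totals into the "eons" the text mentions.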

There's a counter joke about a company that had a huge dynamo that was going awry, shaking and buzzing. So they called in an expert. She listened to the noises and felt the vibrations for a minute. Then she pulled a small screwdriver out of her pocket, walked around to the back of the giant machine, and turned one small screw. The problem instantly went away. She presented an invoice for $10,000. The company objected, saying that the price was far too high for one minute of work. Her answer was that they were not paying for the one minute of work but for her 25 years of experience that informed her how to go from the symptoms to the solution.

At IWL our approach to testing is closer to the latter story than to the former. While there is sometimes merit in trying all possibilities, there is often more merit, and a lot more efficiency, in using expertise to focus on likely sources of problems and to begin by examining those points of likely sensitivity.

For example, a fuzz testing approach to a protocol might enumerate all 4,294,967,296 possible values of a 32-bit protocol field. IWL's approach would be to use expertise to know that a common source of error occurs when the high-order bit rolls over (typically caused by a programmer who failed to use an "unsigned" keyword when defining the 32-bit integer). So IWL's approach would be to begin testing across a small set of values that span from just below the rollover to just above the rollover. That avoids roughly four billion test probes that a fuzzing tool would make. Even at 1,000 tests a second this would save roughly 50 days of testing.
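A minimal sketch of that boundary-focused idea. The parser below is hypothetical (not from any real product): it reads the field as a *signed* 32-bit integer where the protocol defines it as unsigned, which is exactly the missing-"unsigned" bug described above:

```python
import struct

def parse_length(raw: bytes) -> int:
    # Hypothetical buggy parser: ">i" reads a signed 32-bit big-endian
    # integer, but the protocol field is defined as unsigned (">I").
    return struct.unpack(">i", raw)[0]

ROLLOVER = 2 ** 31  # the high-order-bit boundary for a 32-bit field

# A handful of probes spanning the rollover, instead of all 2**32 values.
for value in range(ROLLOVER - 2, ROLLOVER + 3):
    raw = struct.pack(">I", value)      # encode as the unsigned wire format
    parsed = parse_length(raw)
    if parsed != value:
        print(f"flaw at {value:#010x}: parsed as {parsed}")
```

Five probes straddling the boundary expose the sign flip (values at and above 2**31 come back negative), where an exhaustive fuzzer would have spent weeks wading through values that all parse correctly.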

Some members of IWL's technical and management team have been building internet protocols and software for more than forty years. At IWL we have seen just about every kind of implementation error, from simple high-order bit rollover issues (as mentioned above), to complex race conditions, to memory leaks. Based on that experience we have built tools (such as our KMAX and Maxwell Pro products) that will often reveal implementation problems more quickly than brute force fuzzers.

In addition, it is unlikely that fuzzers will ever discover implementation flaws that are triggered when packets arrive in the kind of unusual sequences or at slightly abnormal times that are becoming increasingly common on the internet.

For example, some of us here at IWL have worked on entertainment grade streaming audio/video distribution over the internet. Internet quality is always changing – rain adds noise to radio and satellite channels, other users induce congestion that causes delay and data loss, and the various media coder/decoders (codecs) may be highly variable in their demands for network quality (for instance, high motion action video tends to generate significantly more data than talking head shows.) The software at the receiving end is always in a race with time to get received data to the eyes and ears of the viewers. Any disturbance in the incoming flow of audio or video data can result in damage to the rendered sound or image. We've all experienced what happens: dropped sound and blotchy images. The software that handles this is usually quite complex, time sensitive, and very difficult to test. Fuzzers are not very useful for testing this kind of software. More useful are tools (such as the IWL KMAX and Maxwell Pro products) that allow implementors to expose their code to lost, dropped, delayed, mis-ordered, duplicated, rate limited, and altered data packets.
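The kind of impairment testing described above can be sketched in a few lines. This is an illustrative toy, not the KMAX or Maxwell Pro API; the probabilities and the adjacent-swap reordering model are assumptions chosen for simplicity:

```python
import random

def impair(packets, drop=0.05, dup=0.02, reorder=0.05, seed=None):
    """Apply drop / duplicate / mis-order impairments to a packet sequence,
    in the spirit of network-emulation testing (a toy model, not a real tool)."""
    rng = random.Random(seed)           # seeded for reproducible test runs
    out = []
    for pkt in packets:
        if rng.random() < drop:
            continue                    # packet lost in transit
        out.append(pkt)
        if rng.random() < dup:
            out.append(pkt)             # packet duplicated by the network
    # Swap occasional adjacent packets to simulate mis-ordering.
    for i in range(len(out) - 1):
        if rng.random() < reorder:
            out[i], out[i + 1] = out[i + 1], out[i]
    return out

# Feed the impaired stream to the receiver under test and watch for
# crashes, stalls, or corrupted audio/video output.
impaired_stream = impair(list(range(100)), seed=42)
```

Seeding the generator matters in practice: when an impaired sequence triggers a failure in the receiver, the same sequence can be replayed exactly while debugging.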

In conclusion: fuzzer tools are useful. But they are only one among many of the tools that ought to be in a software builder's toolkit. Fuzzing tools should not be the first ones deployed from that toolkit. More tightly focused tools ought to be used first, because those tools will be more likely to quickly reveal the kinds of flaws that programmers over the years and around the world have made over and over again. For testing new code, fuzzers ought to be the last tool used, not the first.
