Archive for the 'Fuzzing' Category

Fuzzers love assertions

Monday, February 3rd, 2014

Fuzzers make things go wrong.
Assertions make sure we find out.

Assertions can improve code quality in many ways, but they truly shine when combined with fuzzing. Fuzzing is normally limited to finding obvious symptoms like crashes, because it's rare to be able to tell correct behavior from incorrect behavior when the input is generated randomly. Assertions expand the scope of fuzzing to include everything they check.

Assertions can even help find crash bugs: some bugs are relatively easy for fuzzers to trigger, but only lead to crashes when additional conditions are met. A well-placed assertion can let us know every time we trigger the bug.

Fuzzing JS and DOM has found about 4000 assertion bugs, including about 300 security bugs.

Asserting safe use of generic data structures

Assertions in widely-used data structures can find bugs in many callers.

  • Array indices must be within bounds. This simple precondition assert in nsTArray has caught about 90 bugs.
  • Hash tables must not be modified during enumeration. If the modification happened to resize the hash table, it would leave stack pointers dangling. This PLDHashTable assertion has caught over 50 bugs.
  • Cached values should not be out of date. When a cache's get method takes a key and a closure for computing values in the case of a cache miss, debug builds can check whether the cached values are still correct. This is effectively a form of differential testing that notices bugs in cache-invalidation logic.

Asserting module invariants

When an entire module must maintain an invariant, a single assertion can catch dozens of bugs.

Making the frame arena safer

Gecko's CSS box objects, called "frames", are created and destroyed manually. They are allocated within an arena to reduce malloc overhead and fragmentation. The arena also made it possible to reduce the risk associated with manual memory management. A combination of assertions (in debug builds) and runtime mitigations (in all builds) mitigates dangling pointer bugs that involve frames.

Requests for Gecko developers

Please add assertions, especially when:

  • A bug would be a security hole
  • Crashing is not guaranteed
  • Many callers must fulfill a precondition
  • Complex, extensive code must maintain an invariant

Also consider:

Fuzzing for consistent rendering

Saturday, March 3rd, 2012

My DOM fuzzer can now find bugs where the layout of a DOM tree depends on its history.

In this example, forcing a re-layout swapped a “1” and “3” on the screen. My fuzzer didn’t know which rendering was correct, but it could tell that Firefox was being inconsistent.

Initial DOM tree
  • DIV
    • ت
    • SPAN
      • 1
      • SPAN
      • 3
31ت
Random change:
remove the inner span
  • DIV
    • ت
    • SPAN
      • 1
      • 3
31ت
Force re-layout
  • DIV
    • ت
    • SPAN
      • 1
      • 3
13ت

Gecko developer Simon Montagu quickly determined that 13ت is the correct rendering and attached a patch. Later, when a user reported that the bug affected Persian comments on Facebook, we were able to backport Simon’s fix to Firefox 11.

How it works

The fuzzer starts by making random dynamic changes to a page. Then it compares two snapshots: one taken immediately after the dynamic changes, and another taken after also forcing a relayout.

To force a relayout, it removes the root from the document and then adds it back:

  var r = document.documentElement; 
  document.removeChild(r);
  document.appendChild(r);

Like reftest, it uses drawWindow() to take snapshots and compareCanvases() to compare them.

In theory, I could also look for bugs where dynamic changes do not repaint enough of the window. But I've been told that testing for painting invalidation bugs is tricky, so I'll wait until most of the layout bugs are fixed.

Exceptions

Since the testcases are random, I have to be heavy-handed in ignoring known bugs. If I file a rendering bug where the weirdest part of the testcase is floats, I'll have the fuzzer ignore inconsistent rendering in testcases with floats until the bug is fixed.

The current list of exceptions is fairly large and includes key web technologies:

Fuzzing in the pool

Tuesday, November 23rd, 2010

In mid-2009, John O'Duinn offered to let my DOM fuzzer run on the same pool of machines as Firefox regression tests. I'd have an average of 20 computers running my fuzzer across a range of operating systems, and I wouldn't have to maintain the computers. All I had to do was tweak my script to play nicely with the scheduler, and not destroy the machines.

Playing nicely with the scheduler

Counter-intuitively, to maximize the amount of fuzzing, I had to minimize the duration of each fuzz job. The scheduler tries to avoid delays in the regression test jobs so developers don't go insane watching the tree. A low-priority job will be allowed to start much more often if it only takes 30 minutes.

Being limited to 30 minutes means the fuzz jobs don't have time to compile Firefox. Instead, fuzz jobs have to download Tinderbox builds like the regression test jobs do. I fixed several bugs in mozilla-central to make Tinderbox builds work for fuzzing.

I also modified the testcase reducer to split its work into 30-minute jobs. If the fuzzer finds a bug and the reducer takes longer than 30 minutes, it uploads the partially-reduced testcase, along with the reduction algorithm's state, for a subsequent job to continue reducing. To avoid race conditions between uploading and downloading, I use "ssh mv" synchronization.

Not destroying the test slaves

I wasn't trying to fill up the disks on the test slaves, really!

Early versions of my script filled up /tmp. I had incorrectly assumed that /tmp would be cleared on each reboot. Luckily, Nagios caught this before it caused serious damage.

Due to a security bug in some debug builds of Firefox, the fuzzer created randomly-named files in the home directory. This security bug has been fixed, but I'm afraid RelEng will be finding files named "undefined" and "[Object HTMLBodyElement]" for a while.

By restarting Firefox frequently, fuzzing accelerated the creation of gigantic console.log files on the slaves. We're trying to figure out whether to make debug-Firefox not create these files or make BuildBot delete them.

Results so far

Running in the test pool gets me a variety of operating systems. The fuzzer currently runs on Mac32 (10.5), Mac64 (10.6), Linux32, Linux64, and Win32. This allowed me to find a 64-bit-only bug and a Linux-only bug in October. Previously, I had mostly been testing on Mac.

The extra computational power also makes a difference. I can find regressions more quickly (which developers appreciate) and find harder-to-trigger bugs (which developers don't appreciate quite as much). I also get faster results when I change the fuzzer, such as the two convoluted testcases I got shortly after I added document.write fuzzing.

Unexpectedly, getting quick results from fuzzer changes makes me more inclined to tweak and improve to the fuzzer. I know that the change will still be fresh in my mind when I learn about its effects. This may turn out to be the most important win.

With cross-platform testing and the boost to agility, I suddenly feel a lot closer to being able to share and release the fuzzer.

How my DOM fuzzer ignores known bugs

Sunday, November 21st, 2010

When my DOM fuzzer finds a new bug, I want it to make a reduced testcase and notify me so I can file a bug report. To keep it from wasting time finding duplicates of known bugs, I maintain several ignore lists:

Some bugs are harder to distinguish based on output. In those cases, I use suppressions based on the fuzzer-generated input to Firefox:

Fixing any bug on those lists improves the fuzzer's ability to find additional bugs. But I'd like to point out a few that I'd especially like fixed:

In rare cases, I'll temporarily tell the fuzzer to skip a feature entirely:

Several bugs interfere with my ability to distinguish bugs. Luckily, they're all platform-specific, so they don't prevent me from finding cross-platform bugs.

  • Bug 610311 makes it difficult to distinguish crashes on Linux, so I ignore crashes there.
  • Bug 612093 makes it difficult to distinguish PR_Asserts and abnormal exits on Windows. (It's fixed in NSPR and needs to be merged to mozilla-central.)
  • Bug 507876 makes it difficult to distinguish too-much-recursion crashes on Mac. (But I don't currently know of any, so I'm not ignoring them at the moment!)

GCC correctness fuzzing

Wednesday, November 3rd, 2010

In 2008 I wrote about generating random JavaScript to find differences between optimization modes and differences between JavaScript engines (rough list of bugs).

How do you do this kind of testing on a language like C where the behavior of many programs is undefined per spec? John Regehr explains how in his talk Exposing Difficult Compilers Bugs With Random Testing at GCC Summit 2010.

Fuzzing talk at the Mozilla Summit

Wednesday, July 14th, 2010

At the 2010 Mozilla Summit, I talked about my JavaScript engine and DOM fuzzers, which have each found many hundreds of bugs. I also talked about the automations that keep me sane when I fuzz these complex components.

My slides are in the S5 web-based presentation format. You can click the Ø button to view the presentation in "handout mode" and see what I planned to say while each slide was up.

I shared a presentation slot with Mozilla contractor Paul Nickerson, who has a separate slide deck. He wisely saved the best part of his talk for the end: a demo of his font fuzzer causing Windows 7 to blue-screen.

CSS grammar fuzzer

Monday, March 16th, 2009

I wrote a CSS grammar fuzzer to test Gecko's CSS parser. This fuzzer's tricks:

Declarative context-free grammar. This makes it easy to add new CSS features to the fuzzer, or even use it to test grammars other than CSS. Each symbol can be a star, concatenation, or list of alternatives. Unlike a parser, a fuzzer has to make decisions about what to create, so alternatives can be given weights and stars have a suggested number of repetitions. This alone was enough to find at least one bug in Gecko.

Breaking rules. Like any good fuzzer, it doesn't always follow the given context-free grammar. Sometimes it does weird stuff, such as inserting a random symbol, to throw the parser off. I was surprised that this only found one additional bug in Gecko. Perhaps this reflects the comprehensive error handling requirements of the CSS specifications and the corresponding test suites.

Grammatical recursion. When the fuzzer notices that a symbol is the same as an ancestor, it can repeat the parts of the final string between the two symbols. This is effective at finding bugs where large input can cause a recursive algorithm to run out of stack space and crash. This found four grammar-recursion crashes in Gecko.

CSS serialization. The fuzzer makes sure that any text that comes out of the browser's stylesheet serializer survives another trip through the parser and serializer. This is the same trick jsfunfuzz uses to test the JavaScript engine decompiler. This helped fine four incorrect-serialization bugs in Gecko.

None of these Gecko bugs seem to be security holes. I shared the fuzzer with other browser vendors privately for over a month, and nobody asked me to delay the release, so I believe it didn't find security holes in other browsers either. But I think this has more to do with CSS parsing being fairly simple and self-contained than any weakness in the fuzzer.

After CanSecWest, I'm going to try using this fuzzer to generate JavaScript expressions that crash parsers that use recursion incautiously. Unless someone beats me to it, of course ;) (jsfunfuzz already creates many types of weird JavaScript, but can't look for grammatical recursion opportunities easily because it is written in a functional style rather than a declarative style.)

This fuzzer is written in JavaScript and is MIT licensed. I'd love to hear what other people manage to do with it. Get it here.

Some differences between JavaScript engines

Tuesday, December 23rd, 2008

I gave my new fuzzer a break from testing TraceMonkey by asking it to look for differences between SpiderMonkey and JavaScriptCore. I have listed them below, with SpiderMonkey output above JavaScriptCore output.

I have no idea how many of these are bugs (in SpiderMonkey or JavaScriptCore) and how many are ambiguous in the spec (intentionally or unintentionally).

Early error reporting

SpiderMonkey reports some errors at compile time that JavaScriptCore only reports at run time, if the code is actually hit. The difference is most obvious (and most likely to cause compatibility problems) if the code is skipped.

> if (false) { --1; }
S: SyntaxError: invalid decrement operand
J: (no error)
> if (false) { return; }
S: SyntaxError: return not in function
J: (no error)

instanceof

The two engines disagree about what objects are reasonable operands for the 'instanceof' operator.

> ({} instanceof {a:2})
S: typein:3: TypeError: invalid 'instanceof' operand ({a:2})
J: false
> ({} instanceof eval)
S: false
J: Exception: TypeError: instanceof called on an object with an invalid prototype property.

new with native functions

SpiderMonkey allows the "new" operator to be used with some native functions that JavaScriptCore considers non-constructors.

> new Math.sqrt(16)
S: 4
J: Exception: TypeError: Result of expression 'Math.sqrt' ... is not a constructor.
> new ({}.toString)
S: [object Object]
J: Exception: TypeError: Result of expression '({}.toString)' ... is not a constructor.
> new eval
S: typein:9: EvalError: function eval must be called directly, and not by way of a function of another name
J: Exception: TypeError: Result of expression 'eval' ... is not a constructor.

Converting between numbers and strings

> print(+'\00000027')
S: NaN
J: 0
> (1e-10).toString(16)
S: 0.000000006df37f675ef6ec
J: 0

const

There are subtle differences in handling of this new keyword.

> const d; const d;
S: TypeError: redeclaration of const d
J: (no error)
> const c = 0; print(++c);
S: 0
J: 1

Other differences

> print((function(){return arguments;})());
S: [object Object]
J: [object Arguments]
> typeof /x/
S: object
J: function

See Mozilla bug 61911, which changed this in SpiderMonkey in 2007.