Archive for the 'Fuzzing' Category

CSS grammar fuzzer

Monday, March 16th, 2009

I wrote a CSS grammar fuzzer to test Gecko's CSS parser. This fuzzer's tricks:

Declarative context-free grammar. This makes it easy to add new CSS features to the fuzzer, or even use it to test grammars other than CSS. Each symbol can be a star, concatenation, or list of alternatives. Unlike a parser, a fuzzer has to make decisions about what to create, so alternatives can be given weights and stars have a suggested number of repetitions. This alone was enough to find at least one bug in Gecko.

Breaking rules. Like any good fuzzer, it doesn't always follow the given context-free grammar. Sometimes it does weird stuff, such as inserting a random symbol, to throw the parser off. I was surprised that this only found one additional bug in Gecko. Perhaps this reflects the comprehensive error handling requirements of the CSS specifications and the corresponding test suites.

Grammatical recursion. When the fuzzer notices that a symbol is the same as an ancestor, it can repeat the parts of the final string between the two symbols. This is effective at finding bugs where large input can cause a recursive algorithm to run out of stack space and crash. This found four grammar-recursion crashes in Gecko.

CSS serialization. The fuzzer makes sure that any text that comes out of the browser's stylesheet serializer survives another trip through the parser and serializer. This is the same trick jsfunfuzz uses to test the JavaScript engine decompiler. This helped fine four incorrect-serialization bugs in Gecko.

None of these Gecko bugs seem to be security holes. I shared the fuzzer with other browser vendors privately for over a month, and nobody asked me to delay the release, so I believe it didn't find security holes in other browsers either. But I think this has more to do with CSS parsing being fairly simple and self-contained than any weakness in the fuzzer.

After CanSecWest, I'm going to try using this fuzzer to generate JavaScript expressions that crash parsers that use recursion incautiously. Unless someone beats me to it, of course ;) (jsfunfuzz already creates many types of weird JavaScript, but can't look for grammatical recursion opportunities easily because it is written in a functional style rather than a declarative style.)

This fuzzer is written in JavaScript and is MIT licensed. I'd love to hear what other people manage to do with it. Get it here.

Some differences between JavaScript engines

Tuesday, December 23rd, 2008

I gave my new fuzzer a break from testing TraceMonkey by asking it to look for differences between SpiderMonkey and JavaScriptCore. I have listed them below, with SpiderMonkey output above JavaScriptCore output.

I have no idea how many of these are bugs (in SpiderMonkey or JavaScriptCore) and how many are ambiguous in the spec (intentionally or unintentionally).

Early error reporting

SpiderMonkey reports some errors at compile time that JavaScriptCore only reports at run time, if the code is actually hit. The difference is most obvious (and most likely to cause compatibility problems) if the code is skipped.

> if (false) { --1; }
<span class="spidermonkey"  >S: SyntaxError: invalid decrement operand</span>
<span class="javascriptcore">J: (no error)</span>
&gt; if (false) { return; }
<span class="spidermonkey"  >S: SyntaxError: return not in function</span>
<span class="javascriptcore">J: (no error)</span>

instanceof

The two engines disagree about what objects are reasonable operands for the 'instanceof' operator.

&gt; ({} instanceof {a:2})
<span class="spidermonkey"  >S: typein:3: TypeError: invalid 'instanceof' operand ({a:2})</span>
<span class="javascriptcore">J: false</span>
&gt; ({} instanceof eval)
<span class="spidermonkey"  >S: false</span>
<span class="javascriptcore">J: Exception: TypeError: instanceof called on an object with an invalid prototype property.</span>

new with native functions

SpiderMonkey allows the "new" operator to be used with some native functions that JavaScriptCore considers non-constructors.

&gt; new Math.sqrt(16)
<span class="spidermonkey"  >S: 4</span>
<span class="javascriptcore">J: Exception: TypeError: Result of expression 'Math.sqrt' ... is not a constructor.</span>
&gt; new ({}.toString)
<span class="spidermonkey"  >S: [object Object]</span>
<span class="javascriptcore">J: Exception: TypeError: Result of expression '({}.toString)' ... is not a constructor.</span>
&gt; new eval
<span class="spidermonkey"  >S: typein:9: EvalError: function eval must be called directly, and not by way of a function of another name</span>
<span class="javascriptcore">J: Exception: TypeError: Result of expression 'eval' ... is not a constructor.</span>

Converting between numbers and strings

&gt; print(+'\00000027')
<span class="spidermonkey"  >S: NaN</span>
<span class="javascriptcore">J: 0</span>
&gt; (1e-10).toString(16)
<span class="spidermonkey"  >S: 0.000000006df37f675ef6ec</span>
<span class="javascriptcore">J: 0</span>

const

There are subtle differences in handling of this new keyword.

&gt; const d; const d;
<span class="spidermonkey"  >S: TypeError: redeclaration of const d</span>
<span class="javascriptcore">J: (no error)</span>
&gt; const c = 0; print(++c);
<span class="spidermonkey"  >S: 0</span>
<span class="javascriptcore">J: 1</span>

Other differences

&gt; print((function(){return arguments;})());
<span class="spidermonkey"  >S: [object Object]</span>
<span class="javascriptcore">J: [object Arguments]</span>
&gt; typeof /x/
<span class="spidermonkey"  >S: object</span>
<span class="javascriptcore">J: function</span>

See Mozilla bug 61911, which changed this in SpiderMonkey in 2007.

Fuzzing TraceMonkey

Tuesday, December 23rd, 2008

Making JavaScript faster is important for the future of computer security. Faster scripts will allow computationally intensive applications to move to the Web. As messy as the Web's security model is, it beats the most popular alternative, which is to give hundreds of native applications access to your files. Faster scripts will also allow large parts of Firefox to be written in JavaScript, a memory-safe programming language, rather than C++, a statically typed footgun.

Mozilla's ambitious TraceMonkey project adds a just-in-time compiler to Firefox's JavaScript engine, making many scripts 3 to 30 times faster. TraceMonkey takes a non-traditional approach to JIT compilation: instead of compiling a function at a time, it compiles only a path (such as the body of a loop) at a time. This makes it possible to optimize the native code based on the actual type of each variable, which is important for dynamic languages like JavaScript.

My existing JavaScript fuzzer, jsfunfuzz, found a decent number of crash and assertion bugs in early versions of TraceMonkey. I made several changes to jsfunfuzz to help it generate code to test the JIT infrastructure heavily. For example, it now generates mixed-type arrays in order to test how the JIT deals with unexpected type changes.

Andreas Gal commented that each fuzz-generated testcase saved him nearly a day of debugging: otherwise, he'd probably have to tease a testcase out of a misbehaving complex web page. Encouraged by his comment, I looked for additional ways to help the TraceMonkey team.

JIT correctness

Last month, I wrote a new fuzzer designed to find correctness bugs. It runs a randomly-generated script in two JavaScript engines (in this case, SpiderMonkey with and without the JIT) and complains if the output is different.

It quickly found 13 bugs where the JIT caused JavaScript code to produce incorrect results. These bugs range from obvious to obscure to evil.

It even found two security bugs that jsfunfuzz had missed. One was a crash that involved a combination of language features that jsfunfuzz doesn't test heavily. The other was an uninitialized-memory-read bug, which caused the output to be random when it should have been consistent. jsfunfuzz missed the bug because it ignores most output, but the new fuzzer interpreted it as a difference between non-JIT and JIT output and brought the bug to my attention.

JIT speed

I set up the new fuzzer to compare the time needed to execute scripts and complain whenever enabling the JIT made a script run more slowly. It measures speed by letting the script run for 500ms and reporting the number of loop iterations completed in that time.

So far, it has found 4 serious bugs where the JIT makes scripts several times slower. Two of these have already been fixed, but the other two may be difficult to fix.

It has also found 10 cases where the JIT makes scripts about 10% slower. Most of these minor slowdowns are due to "trace aborts", where a piece of JavaScript is not converted to native code and stays in the interpreter. Some trace aborts are due to bugs, while others are design decisions or cases for which conversion to native code simply hasn't been implemented yet.

There is some disagreement over which trace aborts are most likely to affect real web pages. I asked members of Mozilla's QA team to scan the web in a way that can answer this question.

Interpreter speed

Mostly for fun, I also looked to see which code the JIT speeds up the most. Here's a simplified version of its answer:

for (var i = 0; i &lt; 0x02000000; ++i) {
  d = 0x55555555;
  d++; d++; d++; d++; d++;
}

This code runs 250 times faster when the JIT is enabled. The JIT is able to achieve this gigantic speedup due to the interpreter being inefficient in dealing with undeclared variables and numbers that can't be represented as 30-bit ints.

Assertions

The JavaScript engine team has documented many of their assumptions as assertions in the code. Many of these assertions make it easier to spot dangerous bugs, because the script generated by the fuzzer doesn't have to be clever enough to actually cause a crash, only strange enough to violate an assumption. This is similar to my experience with other parts of Gecko that use assertions well.

Other JavaScript engine assertions make it easier to find severe performance bugs. Without these assertions, I'd only find these bugs when I measure speed directly, which requires drastically slowing down the tests.

More ideas

One testcase generated by my fuzzer demonstrated a combination of a JIT performance bug with a minor bytecode generation bug. I might be able to search for similar bytecode generation bugs the same way I searched for decompiler bugs: by ensuring that a function does not change when round-tripping through the decompiler. In order to do that, I'll need a new patch for making dis() return the disassembly instead of printing it.

I should be able to find some performance bugs by looking at which aborts and side exits are taken. This strategy would make some performance bugs (such as repeatedly taking a side exit) easier to spot.

Gecko assertions

Tuesday, November 27th, 2007

Assertions frequently help me find bugs that would be difficult to catch otherwise. I've filed 479 bugs on assertion failures, excluding bugs that also causes crashes or hangs, and 223 of those have already been fixed. (Thanks to Brendan, I can safely file bugs when I see assertion failures, knowing that my bugs aren't invalid.)

Many assertion failures indicate that assumptions were violated in a relatively harmless way, merely causing incorrect layout or slow performance in an edge case. But sometimes they indicate the presence of a bug that could lead to a serious memory safety violation. For example, "ASSERTION: Some objects allocated with AllocateFrame were not freed" in the FrameArena destructor has helped find dozens of bugs that lead to leaks and/or dangling pointers.

Gecko developers have added many useful assertions lately, and it looks like more are on the way. Regression tests for assertions are improving, too.

I have a list of assertions that I ignore during automated testing. Whenever a developer fixes a bug on that list, I remove the bug and the corresponding assertion from the file, so if I hit the assertion again after that point, I'll notice it.

In addition to helping to catch bugs, assertions also serve as "living documentation" about the code's invariants. We notice if they become out of date, unlike Wiki pages and comments. In my dream world, assertions would also serve as waypoints for an automated theorem prover ;)

Valgrind coming to Mac

Wednesday, September 19th, 2007

Apple employee Greg Parker has ported Valgrind to Mac, and plans to release his work soon after Leopard is released in October. He's been working on it for quite a while.

I'm excited about being able to use Valgrind on Mac. Valgrind's "Memcheck" is much better at catching dangling-pointer bugs and heap buffer-overflow bugs than simply watching for crashes (even with MallocScribble enabled). Running a fuzzer with Memcheck can reveal exploitable memory safety bugs that would not have triggered crashes otherwise.

Update 2008-09-29: Greg Parker has released his port as a patch.

Update 2009-02-06: Valgrind developer Nicholas Nethercote has imported Greg's patch into a branch of Valgrind's SVN repository.

Update 2009-06-02: Valgrind trunk now supports Mac.

Introducing Lithium, a testcase reduction tool

Saturday, September 15th, 2007

I wrote a tool called Lithium that automatically reduces large testcases, such as real-world web pages or testcases produced by jsfunfuzz. It can usually reduce a 3000-line jsfunfuzz crash testcase to 3-10 lines in several minutes, considerably faster than I can reduce by hand. Perhaps more importantly, I can do something else while it reduces the testcase.

There are two (related) reasons I'm not calling it "Lithium 1.0" yet. First, I'm hoping to improve the way "interestingness tests" are written. Currently, they're separate programs that communicate to Lithium using their exit code, which limits error handling and might slow Lithium down. I'd like to make the interestingness tests be Python files, but I'm not sure what the best way to do that is. (Should Lithium __import__ the interestingness test? Or should the interestingness test import Lithium and be renamed to e.g. "reduce_crash.py"?)

Second, it would be useful to be able to pass extra arguments to the program being tested. For example, it would be useful to be able to pass a profile name to Firefox, or to pass a Firefox path to Valgrind. One possibility is to put the program being tested last on the command line, so extra positional arguments become options to that program. This solution would only work for interestingness tests that launch a single program (so it wouldn't work for a "renders differently in these two Firefox builds" test, for example), but maybe that's okay. Another possibility is to require the use of a config file for passing arguments to programs being tested (so you don't end up typing all of ".../firefox-bin -P foo" on Lithium's command line).

I'll probably use the MIT license for Lithium (but not for timed_run.py, which was mostly written by Chris Cooper and Bob Clary).

Opera is finding jsfunfuzz useful

Saturday, August 4th, 2007

Opera has posted a build with fixes for several crashes found by jsfunfuzz. Cool!

Opera community members posted dozens of comments about it, and I replied to several.

jsfunfuzz in news and blogs

Friday, August 3rd, 2007

Before the presentation:

After the presentation: