In mid-2009, John O'Duinn offered to let my DOM fuzzer run on the same pool of machines as Firefox regression tests. I'd have an average of 20 computers running my fuzzer across a range of operating systems, and I wouldn't have to maintain the computers. All I had to do was tweak my script to play nicely with the scheduler, and not destroy the machines.
Playing nicely with the scheduler
Counter-intuitively, to maximize the amount of fuzzing, I had to minimize the duration of each fuzz job. The scheduler tries to avoid delays in the regression test jobs so developers don't go insane watching the tree. A low-priority job will be allowed to start much more often if it only takes 30 minutes.
Being limited to 30 minutes means the fuzz jobs don't have time to compile Firefox. Instead, fuzz jobs have to download Tinderbox builds like the regression test jobs do. I fixed several bugs in mozilla-central to make Tinderbox builds work for fuzzing.
I also modified the testcase reducer to split its work into 30-minute jobs. If the fuzzer finds a bug and the reducer takes longer than 30 minutes, it uploads the partially-reduced testcase, along with the reduction algorithm's state, for a subsequent job to continue reducing. To avoid race conditions between uploading and downloading, I use "ssh mv" synchronization.
Not destroying the test slaves
I wasn't trying to fill up the disks on the test slaves, really!
Early versions of my script filled up /tmp. I had incorrectly assumed that /tmp would be cleared on each reboot. Luckily, Nagios caught this before it caused serious damage.
Due to a security bug in some debug builds of Firefox, the fuzzer created randomly-named files in the home directory. This security bug has been fixed, but I'm afraid RelEng will be finding files named "undefined" and "[Object HTMLBodyElement]" for a while.
By restarting Firefox frequently, fuzzing accelerated the creation of gigantic console.log files on the slaves. We're trying to figure out whether to make debug-Firefox not create these files or make BuildBot delete them.
Results so far
Running in the test pool gets me a variety of operating systems. The fuzzer currently runs on Mac32 (10.5), Mac64 (10.6), Linux32, Linux64, and Win32. This allowed me to find a 64-bit-only bug and a Linux-only bug in October. Previously, I had mostly been testing on Mac.
The extra computational power also makes a difference. I can find regressions more quickly (which developers appreciate) and find harder-to-trigger bugs (which developers don't appreciate quite as much). I also get faster results when I change the fuzzer, such as the two convoluted testcases I got shortly after I added document.write fuzzing.
Unexpectedly, getting quick results from fuzzer changes makes me more inclined to tweak and improve to the fuzzer. I know that the change will still be fresh in my mind when I learn about its effects. This may turn out to be the most important win.
With cross-platform testing and the boost to agility, I suddenly feel a lot closer to being able to share and release the fuzzer.