Detecting leak-until-shutdown bugs

Most of Mozilla's leak-detection tools work on the premise that when the application exits, no objects should remain. This strategy finds many types of leak bugs: I've used tools such as trace-refcnt to find over a hundred. But it misses bugs where an object lives longer than it should.

The worst of these are bugs where an object lives until shutdown, but is destroyed during shutdown. These leaks affect users as much as any other leak, but most of our tools don't detect them.

After reading about an SVG leak-until-shutdown bug that the traditional tools missed, I wondered if I could find more bugs of that type.

A new detector

I started with the premise that if I close all my browser windows (but open a new one so Firefox doesn't exit), the number of objects held alive should not depend on what I did in the other windows. I retrofitted my DOM fuzzer with a special exit sequence:

  1. Open a new, empty window
  2. Close all other windows
  3. Wait until memory use stabilizes
  4. Count the remaining objects (should be constant)
  5. Continue with the normal shutdown sequence
  6. Count the remaining objects (should be 0)

If the first count of remaining objects depends on what I did earlier in the session, and the second count is 0, I've probably found a leak-until-shutdown bug.
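The waiting step and the decision rule above can be sketched in a few lines. This is only an illustration, not the fuzzer's actual code: `wait_until_stable`, `classify`, the `Verdict` names, and the idea of comparing against a baseline count taken from an idle session are all assumptions made for the example.

```python
import time
from enum import Enum


class Verdict(Enum):
    CLEAN = "clean"
    LEAK_UNTIL_SHUTDOWN = "leak-until-shutdown"
    LEAKED_PAST_SHUTDOWN = "leaked past shutdown"


def wait_until_stable(sample, required_stable=3, max_samples=50,
                      delay=0.0, sleep=time.sleep):
    """Poll sample() until `required_stable` consecutive readings are unchanged.

    `sample` is a hypothetical callable returning the current object count
    (or memory use). Returns the stable reading, or raises TimeoutError.
    """
    previous = sample()
    stable = 0
    for _ in range(max_samples):
        sleep(delay)
        current = sample()
        if current == previous:
            stable += 1
            if stable >= required_stable:
                return current
        else:
            stable = 0  # still changing, start counting again
        previous = current
    raise TimeoutError("reading never stabilized")


def classify(baseline_count, count_after_close, count_at_exit):
    """Apply the two-count rule from the exit sequence.

    baseline_count: objects normally alive with one empty window (assumed
    to come from a session where nothing else was done).
    """
    if count_at_exit > 0:
        # Objects survived shutdown: the kind of leak traditional tools catch.
        return Verdict.LEAKED_PAST_SHUTDOWN
    if count_after_close > baseline_count:
        # Freed at shutdown, but alive long after the windows were closed.
        return Verdict.LEAK_UNTIL_SHUTDOWN
    return Verdict.CLEAN
```

Keeping the two checks separate means a leak that survives shutdown is still reported as the ordinary kind, and only the "extra objects, but zero at exit" case is flagged as leak-until-shutdown.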

To reduce noise, I had to disable the XUL cache and restrict the counting to nsGlobalWindow and nsDocument objects. On Linux, I normally count 4 nsGlobalWindows and 4 nsDocuments.
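As a rough sketch of that filtering step, assume the live objects are available as a plain list of class names. That input format is hypothetical (it is not what trace-refcnt emits), and `WATCHED`, `LINUX_BASELINE`, and `extra_objects` are names invented for the example. The baseline numbers are the Linux counts mentioned above.

```python
from collections import Counter

# Only these classes are counted; smaller objects are ignored as noise.
WATCHED = ("nsGlobalWindow", "nsDocument")

# Counts normally seen on Linux with a single empty window open.
LINUX_BASELINE = {"nsGlobalWindow": 4, "nsDocument": 4}


def extra_objects(live_class_names, baseline=LINUX_BASELINE):
    """Return {class name: surplus} for watched classes above the baseline."""
    counts = Counter(live_class_names)
    return {
        name: counts[name] - baseline[name]
        for name in WATCHED
        if counts[name] > baseline[name]
    }
```

An empty result means the watched classes are at their usual counts. Anything else names the classes that outlived their windows and by how many.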

So far, I've found two bugs where additional objects remain:

I'm glad we found the <video> leak before shipping Firefox 4!

Note that this tool can't find all types of leaks. It won't catch leak-until-page-close bugs or other leaks with relatively short lifetimes. It can't tell you if a cache is misbehaving or if cycle collection isn't being run often enough.

Next steps

Depending on how promising we think this approach is, we could:

  • Use it in more types of testing
    • Package it into a more user-friendly extension for Firefox debug builds
    • Make it a regular part of fuzzing
    • Use it for regression tests
  • Add something to Gecko that's similar but less kludgy
  • Expand the classes it will complain about
  • Debug the flakiness with smaller objects
  • Make the XUL cache respond to memory-pressure notifications

It's also possible that DEBUG_CC, and in particular its "expected to be garbage" feature, will prove able to find a superset of the leaks that my tool can find.

7 Responses to “Detecting leak-until-shutdown bugs”

  1. James John Malcolm Says:

    Sounds like a fantastic method to flush out leaks!

  2. Jonathan Watt Says:

    Great stuff. :)

    What are the known reasons that the premise can’t go more like this: if you start with one tab containing about:blank, then when you open a second tab, load a page into it, and then close that second tab (with or without interaction first), no extra objects should be left alive over and above those that were alive prior to the opening of the second tab.

    There are things like history, etc. that are going to increase their memory use as a result of loading a page into the second tab, but I’m wondering if we couldn’t provide facilities for an extension to disable all these things so that our premise could be just as I’ve described.

  3. Jonathan Watt Says:

    Maybe less disabling would be required if the premise was: if you start with one tab containing about:blank, load a second page into it, and then go back in history to about:blank (with or without interaction first), no extra objects should be left alive over and above those that were alive prior to loading the second page.

  4. Wladimir Palant Says:

    Great work! Just in case you didn’t think of it yet – nsIAppStartup.enterLastWindowClosingSurvivalArea() allows you to close all windows without opening a new one, might help reducing noise.

  5. Johan Sundström Says:

    So under MacOS, where Firefox stays alive after closing the last open window, this method (if modded lightly not to open an extra window, which, in the case of extra add-ons, could throw results off) will work even better?

  6. Jesse Ruderman Says:

    Apparently the SVG leak was detected by sayrer’s list-shutdown-observers patch in https://bugzilla.mozilla.org/show_bug.cgi?id=578890

  7. Jürgen Möller Says:

    Yeah, whenever I quit the newest Firefox, it often lives longer than one minute (!) till it finally quits. But “quit” should always mean “quit instantly”. What have we come to?