Bookmarklets updated to comply with DOM 2 rule

November 2nd, 2006

DOM 2 does not allow nodes to be moved between documents -- in fact, it requires that implementations throw an error when code tries to do so. But for years, Gecko has not enforced this rule.

It's a bit embarrassing that Internet Explorer gets this right and we get it wrong. Someone might think Gecko is trying to embrace and extend the DOM.

Soon, Gecko will start enforcing the rule on trunk. But bringing Gecko in line with this aspect of the DOM spec risks breaking Gecko-specific code, such as code in extensions and bookmarklets written for Firefox. For example, my Search Keys extension used to create some nodes in the chrome document, and some in the foreground tab, before putting them into the tab that had just loaded. Search Keys 0.8 creates all elements in the correct document.

I also updated the following bookmarklets to create nodes in the correct document and/or use importNode when copying nodes between documents:

These bookmarklets previously only worked in browsers that violated the DOM spec by allowing nodes to be moved between documents without a call to importNode or adoptNode. Maybe some of them work in IE now.

If you use those bookmarklets, you should grab the new versions so they won't break when you update to next week's trunk build or to Firefox 3.

Determining whether a crash looks exploitable

November 2nd, 2006

If you use Mac OS X 10.4, you can usually determine whether crashes you encounter are severe security holes in seconds, even if you are not a C++ developer or do not have access to the source code of the application that crashed. Here's how.

Setting up Crash Reporter

To prepare, type "defaults write com.apple.CrashReporter DialogType developer" into a Terminal window. (Or, if you have CrashReporterPrefs installed, you can do this using a GUI.) This makes several changes to the dialog that appears when any application crashes. The most important change is the addition of a partial stack trace, which tells you which function the crash occurred in, which function called that function, and so on.

Another nice feature of "Developer" mode is that a crashing application's windows stick around until you click "Close" instead of disappearing immediately. This gives you a chance to salvage unsaved data that was visible when the application crashed.

To try out Crash Reporter, find a crash bug report in Bugzilla, such as this null dereference or this too-much-recursion crash, and point Firefox at the bug's testcase. Now, instead of seeing a Basic crash dialog, you should see a Developer crash dialog with the first ten lines of a stack trace and other debugging information.

Skimming a crash report

By looking at three things in the crash report in order, you can get a good idea of whether the crash is likely to be exploitable:

1. Look at the top line of the stack trace. If you see a hex address such as 0x292c2830 rather than a function name such as nsListBoxBodyFrame::GetRowCount at the top of the stack, a bug has caused the program to transfer control to a "random" part of memory that isn't part of the program. These crashes are almost always exploitable to run arbitrary code.

2. Look at the last line above the stack trace, which will usually be something like "KERN_PROTECTION_FAILURE (0x0002) at 0x00000000". The last number, in this case 0x00000000, is the memory address Firefox was prevented from accessing. If the address is always zero (or close to zero, such as 0x0000001c), it's probably a null dereference bug. These bugs cause the browser to crash, but they do so in a predictable way, so they are not exploitable. Most crashes fall into this category.

3. Check the length of the stack trace by clicking the "Report..." button. If it's over 300 functions long, it's likely to be a too-much-recursion crash. Like null dereferences, these crashes are not exploitable.

Any other crash where Firefox tries to use memory it does not have access to indicates some kind of memory safety bug. These crashes can often be exploited to run arbitrary code, but you can't be as certain as in the case where you see a bogus address at the top of the stack in step 1.

Reporting bugs

If you encounter a crash in Firefox that looks exploitable, please take the time to figure out how to reproduce the bug, create a reduced testcase if you can, and file a security-sensitive bug report in Bugzilla. After filing the bug, attach the crash report generated by Mac OS X, pointing out what makes the crash look like a security hole.

If a crash bug looks exploitable based on the stack trace, Mozilla's security group assumes it is exploitable. You don't have to learn machine language and construct a sophisticated demo that uses the bug to launch Calculator.app to convince us to take such a bug seriously and fix it. The same is true for Apple's Safari team in my experience.

Windows and Linux: using Talkback

If you use Windows or Linux, you can't use the Mac OS X Crash Reporter, but you can use Talkback instead if you want to see stack traces for Firefox crashes. Installing Nightly Tester Tools gives you a menu showing your recent crashes, but it's still not quite as efficient as the Mac OS X trick, and depends on the Talkback server being in a good mood.

Talkback was developed before developers knew so many types of crashes were exploitable, and its primary purpose is to determine which crashes are the most common, so it does not show you which memory address Firefox was denied access to. This prevents you from distinguishing likely null dereferences from some severe memory safety bugs (step 2 above).

Firefox lives up to its name

November 1st, 2006

Bug report of the week:

Firefox 2 runs 50 degrees F hotter than Safari or Firefox 1.5

Integer overflows

November 1st, 2006

"What is a string library? It's a way to pretend that computers can manipulate strings just as easily as they can manipulate numbers."

-- Joel Spolsky, The Law of Leaky Abstractions.

Most C++ code uses the integer mod 2^32 (or 2^64) type C++ calls "int" as if it were a true integer type. This is great for performance -- many operations on int32 are a single CPU instruction -- but dangerous for security and correctness when the numbers can be large. This can cause security holes in at least two ways.

First, code might use int32 arithmetic to decide how much memory to allocate. Consider an image decoder that allocates width * height * 4 bytes to store RGBA pixels and then decodes the image data into the structure. But since width and height are unsigned ints, it doesn't really allocate width*height*4 bytes; it allocates width*height*4 mod 2^32 bytes. If the integer used to decide how much memory to allocate has overflowed in such a way that it comes out as a small integer, the code is likely to overflow the buffer as it writes the decoded image into the structure.
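
To make the problem concrete, here is a minimal sketch (hypothetical code, not anything from Gecko; the function and variable names are made up for illustration) of a 32-bit size calculation wrapping around:

    #include <cstdint>
    #include <cstdio>
    #include <cstdlib>

    // Hypothetical decoder helper: allocate width * height * 4 bytes of RGBA data.
    unsigned char* allocate_pixels(uint32_t width, uint32_t height) {
        // The math is done in 32-bit unsigned arithmetic, so the real size
        // is (width * height * 4) mod 2^32, which can wrap to a tiny number.
        uint32_t size = width * height * 4;
        return static_cast<unsigned char*>(malloc(size));
    }

    int main() {
        // 0x10000 * 0x10000 * 4 == 2^34, which wraps to 0 in 32 bits, so this
        // "successfully" allocates a zero-byte buffer. Writing width * height
        // pixels into it would then overflow the heap.
        uint32_t width = 0x10000, height = 0x10000;
        printf("requested bytes (mod 2^32): %u\n", width * height * 4);
        unsigned char* buf = allocate_pixels(width, height);
        free(buf);
        return 0;
    }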

Second, code might use int32 arithmetic to decide when to deallocate an object. In code that uses reference counting, an extra call to "release" can obviously lead to a dangling pointer situation. But thanks to integer overflows, 2^32 unbalanced calls to "addref" followed by a normal "release" can have the same effect. (Luckily, you can't cause this situation by merely making 2^32 objects point to a specific object, because you'd run out of memory first. So this could be addressed by auditing for addref-without-release leak bugs rather than modifying the addref function to make it safer.)
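
For illustration, here is a toy refcounting class (my own sketch, not Gecko's actual refcounting implementation) showing how a 32-bit reference count can wrap:

    #include <cstdint>

    // Toy class: a 32-bit reference count wraps after 2^32 unbalanced AddRef calls.
    class RefCounted {
    public:
        RefCounted() : mRefCnt(1) {}                        // creator holds one reference
        void AddRef()  { ++mRefCnt; }                       // 0xFFFFFFFF wraps to 0
        void Release() { if (--mRefCnt == 0) delete this; }
    private:
        uint32_t mRefCnt;
    };

    // If a leak bug lets an attacker trigger 2^32 extra AddRef calls, the count
    // wraps back around; a later legitimate Release then destroys the object
    // while other code still holds pointers to it -- a dangling pointer.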

Explicit checks

Some code in Gecko has explicit checks to prevent overflows. (This must be done carefully -- "width*height*4 > 2^32" doesn't mean anything to a C++ compiler!) If you remember to think "integer mod 2^32 or 2^64" every time you see "int", you may be able to avoid introducing new security holes due to integer overflow when you write code.

Michael Howard at Microsoft advocates this approach, at least for C code that is used near things like allocation sizes and reference counts, and provides functions to do checked arithmetic operations. These functions return a boolean indicating whether the arithmetic operation succeeded. This leads to code where it is hard to see what calculation is being done but easy to see that each step of the calculation is done safely.
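
As a rough sketch of that style (my own helper, not Microsoft's actual API or function names), a checked multiply might widen to 64 bits and report failure through its return value:

    #include <cstdint>
    #include <cstdlib>

    // Hypothetical checked multiply: returns false instead of wrapping around.
    static bool checked_mul_u32(uint32_t a, uint32_t b, uint32_t* result) {
        uint64_t wide = static_cast<uint64_t>(a) * b;
        if (wide > UINT32_MAX)
            return false;
        *result = static_cast<uint32_t>(wide);
        return true;
    }

    // Usage: every step of the calculation is checked, at some cost to readability.
    void* alloc_rgba(uint32_t width, uint32_t height) {
        uint32_t pixels, bytes;
        if (!checked_mul_u32(width, height, &pixels)) return NULL;
        if (!checked_mul_u32(pixels, 4, &bytes))      return NULL;
        return malloc(bytes);
    }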

Safe integer classes

Another strategy is to avoid using "int", at least in code used to compute allocation sizes, and instead use a "safe" integer class. A safe class might do correct arithmetic on large numbers, allocating extra memory when needed, but perhaps that is overkill for keeping allocations safe. A proponent of this approach might say "int is the new char *", referring to how string buffer overflows have been nearly eliminated through the use of string classes, and make fun of Joel Spolsky for the quote at the beginning of this post.

David LeBlanc, also at Microsoft, advocates a slightly different approach: using a class that treats overflow as an error and can throw exceptions. This keeps arithmetic formulas readable at the expense of having to design the function to handle exceptions correctly.
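
Here is a stripped-down sketch of that idea (my own toy class, not the real SafeInt library): the arithmetic stays readable, and overflow becomes an exception the caller has to handle.

    #include <cstdint>
    #include <stdexcept>

    // Toy "safe" unsigned 32-bit integer: throws instead of silently wrapping.
    class SafeU32 {
    public:
        explicit SafeU32(uint32_t v) : mValue(v) {}

        SafeU32 operator*(SafeU32 rhs) const {
            uint64_t wide = static_cast<uint64_t>(mValue) * rhs.mValue;
            if (wide > UINT32_MAX)
                throw std::overflow_error("32-bit multiply overflowed");
            return SafeU32(static_cast<uint32_t>(wide));
        }

        uint32_t value() const { return mValue; }

    private:
        uint32_t mValue;
    };

    // The formula reads naturally; overflow surfaces as an exception:
    //   uint32_t bytes = (SafeU32(width) * SafeU32(height) * SafeU32(4)).value();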

Static analysis can be used to scan for calls to malloc that use "int" and need to be converted to using SafeInt.

Will Gecko soon have a multitude of integer classes, each with different performance characteristics, signedness, and overflow behavior? Probably not, because numbers used to decide how much memory to allocate are almost always unsigned integers where overflows can be treated as errors. But I wouldn't be surprised to see different parts of the code use different strategies, with C code using the "explicit checks with helpers" strategy and XPCOM C++ code using another strategy.

Other languages

Many languages share C++'s behavior of exposing "integer mod 2^32" types as "int", but JavaScript and Python are two major exceptions. JavaScript has a hybrid "number" type that is sometimes stored as an integer and sometimes stored as a floating-point number. Overflowing integer arithmetic turns your numbers into floating-point numbers, while treating a floating-point number as a bit field tries to turn it back into an integer by computing its value mod 2^32. While JavaScript's behavior is more useful in most situations than wrapping around, you wouldn't want to use it for memory allocation.

Python instead takes advantage of its dynamic type system to make integers safe. Overflowed integers are replaced with a "long integer" type that is slower to operate on but has safe, correct behavior for integers of any size (until you run out of memory).

Memory safety bugs in C++ code

November 1st, 2006

C++ lets developers work with raw pointers, allowing some performance tricks not available in higher-level languages. Because developers decide when objects are allocated and deallocated, they have the flexibility to choose between (or mix and match) reference counting, various forms of tracing garbage collection, and ad-hoc calls to "new" and "delete". Developers also have the ability to allocate some objects on the stack rather than the heap, improving performance. While C++ is not the only language that allows stack allocation, it is one of the few that lets you maintain linked lists of stack-allocated objects.

C++ also allows developers to do manual pointer arithmetic, making it possible to steal bits from pointers or implement XOR linked lists. Arrays use implicit pointer arithmetic without bounds-checking, which is nice for performance when bounds-checking would be redundant.
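
As a small illustration of the bit-stealing trick (hypothetical names, and very much the kind of thing a compiler cannot verify for you): because heap objects are normally aligned, the low bit of a pointer is zero and can be used to carry a flag.

    #include <cassert>
    #include <cstdint>

    struct Node { int value; };

    // Pack a boolean flag into the (normally zero) low bit of an aligned pointer.
    uintptr_t tag_pointer(Node* p, bool flag) {
        assert((reinterpret_cast<uintptr_t>(p) & 1) == 0);  // alignment assumption
        return reinterpret_cast<uintptr_t>(p) | (flag ? 1 : 0);
    }

    Node* untag_pointer(uintptr_t tagged) {
        return reinterpret_cast<Node*>(tagged & ~static_cast<uintptr_t>(1));
    }

    bool tag_of(uintptr_t tagged) {
        return (tagged & 1) != 0;
    }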

Unfortunately, most C++ compilers do not include theorem provers, so they cannot require you to declare invariants and provide enough proof hints to explain why your use of raw pointers is safe. As a result, it is easy to have bugs that lead to severe security holes.

Common types of memory safety bugs

These memory safety bugs usually manifest themselves as crashes. They're also usually exploitable to run arbitrary code.

  • Using a dangling pointer. A simple read from a dangling pointer usually won't cause too much damage, except perhaps to privacy. But writing to a dangling pointer can corrupt another data structure, and freeing a dangling pointer can leave another data structure open to future corruption. Worst of all, calling a virtual member function on a dangling pointer will jump to a memory location based on a vtable pointer that is likely to have been overwritten, easily leading to arbitrary code execution (see the sketch after this list). Most of the memory safety bugs I have found in Gecko involve dangling pointers.
  • Buffer overflows, also known as "writing past the end of a string or array". These are the best-known memory safety bugs, and among the first to be exploited to run arbitrary code. They're dangerous whether the array is on the heap or the stack, and whether the overflow is as long as the attacker wants or a single byte.
  • Integer overflows, bugs due to forgetting that what C++ calls "int" is really "integer mod 2^32" (or 2^64). If int computation is used to decide how much memory to allocate, overflow can lead to a buffer-overflow situation. If reference counting is implemented using an int counter, overflow can lead to an object being freed prematurely, creating a dangling-pointer situation.
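
To make the dangling-pointer case above concrete, here is a deliberately buggy sketch (hypothetical types, not real Gecko code) of why a virtual call through a dangling pointer is so dangerous:

    #include <cstdlib>
    #include <cstring>

    struct Listener {
        virtual void Notify() { /* ... */ }
        virtual ~Listener() {}
    };

    void demo() {
        Listener* listener = new Listener();
        delete listener;                         // listener is now dangling

        // Another allocation may reuse the freed memory...
        void* reused = malloc(sizeof(Listener));
        memset(reused, 0x41, sizeof(Listener));  // imagine attacker-chosen bytes

        listener->Notify();  // undefined behavior: an indirect call through a
                             // vtable pointer read from overwritten memory
        free(reused);
    }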

Safe crashes

Several common types of crashes are not security holes:

  • Dereferencing NULL. Most operating systems never allocate page 0, so userland programs can assume that dereferencing null is a safe crash. This is good because null dereferences are significantly harder to prevent than uses of dangling pointers.
  • Too much recursion. Most operating systems have a guard page at the stack limit to prevent your stack and heap from colliding. This is good because preventing too-much-recursion bugs is hard and has historically not been necessary.

Note that some operating systems have bugs or design flaws that turn these "safe" crashes into security holes. Until recently, Windows had a bug that turned null dereferences in some programs into security holes. And at least as of 2005, some operating systems do not guarantee that null dereferences and too-much-recursion are crashes. IMO, those operating systems need to be fixed, so developers can continue treating null-dereference crashes as having the same severity across operating systems.

Continuous Daylight Saving Time

November 1st, 2006

Daylight Saving Time seems to serve three major purposes:

  • Health: keeping sunrise roughly constant relative to when work or school starts makes modern routines easier on our circadian rhythms, improving our psychological health and perhaps also our physical health. In addition, the daylight "saved" by not "sleeping in" hours past sunrise during the summer makes more outdoor activity possible, increasing the amount of exercise we get without conscious effort.
  • Energy use: By using less artificial light and spending less time inside watching TV during the summer, America saves about 1% of its total energy use under Daylight Saving Time.
  • Safety: Daylight Saving Time tries to keep both morning and evening commutes in daylight when possible. But when that isn't possible, it tries to ensure that at least the morning commute is during daylight. This reduces car-accident injuries by thousands or tens of thousands per year.

I think a time system could improve health, energy use, and safety even more if it were to make small adjustments throughout the year instead of large adjustments twice a year. For example, a small amount of time might be added or taken away just before 2am every morning, in order to keep sunrises at 6am at a latitude of 40 degrees. The daily changes would be small enough for most people to ignore -- less than two minutes per day even around the equinoxes.
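
As a sanity check on the "less than two minutes" figure, here is a rough back-of-the-envelope calculation I put together using the standard declination and hour-angle approximation for sunrise (ignoring the equation of time), which puts the largest day-to-day shift at 40 degrees latitude at roughly a minute and a half:

    #include <cmath>
    #include <cstdio>

    const double kPi = 3.14159265358979;
    const double kLatitude = 40.0 * kPi / 180.0;

    // Approximate solar declination (radians) for day-of-year n.
    double declination(int n) {
        return (23.44 * kPi / 180.0) * sin(2 * kPi * (n - 81) / 365.0);
    }

    // Sunrise in hours of local solar time: 12 minus the sunrise hour angle,
    // converted from degrees to hours (15 degrees per hour).
    double sunrise_hours(int n) {
        double omega = acos(-tan(kLatitude) * tan(declination(n)));
        return 12.0 - (omega * 180.0 / kPi) / 15.0;
    }

    int main() {
        double worst = 0;
        for (int day = 1; day < 365; ++day) {
            double shift = fabs(sunrise_hours(day + 1) - sunrise_hours(day)) * 60;
            if (shift > worst) worst = shift;
        }
        printf("largest day-to-day sunrise shift: about %.2f minutes\n", worst);
        return 0;
    }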

Interestingly, switching to continuous time change would also address the main criticisms of DST:

  • Lost productivity and an increase in fatal auto accidents twice a year due to disruption of sleeping patterns.
  • Lost productivity fiddling with clocks.
  • Farmers are forced out of synchronization with the rest of society.

It seems like my favorite kind of compromise, one that reveals a false trade-off and makes both sides happier than they would have been with their previous preferred solutions.

Of course, there would be new drawbacks. Certain time calculations would be more difficult: night-shift workers might find themselves needing to keep track of the changing length of each day, instead of being confused only twice a year. Planning a weekly meeting involving people in different hemispheres (or DST regimes) would become more difficult, especially if the people in each hemisphere have tight schedules.

We would also have to replace our clocks and watches. I'm not about to pretend that forcing everyone to purchase new clocks would be a good thing by itself, but at least it would only be a one-time cost; computing power is cheap enough that the price of clocks would not increase permanently. When we upgrade our clocks to deal with days that vary slightly in length, we should also give them all the ability to update themselves; this would be more pleasant than requiring you to enter the date in addition to the time after each power outage. We could also dramatically improve the user interfaces of most alarm clocks with respect to how often they fail to wake people up, but that's the subject for another blog post.

This "Continuous DST" proposal is not to be confused with the proposal known as "Year-round DST". The advantages of DST arise from the twice-yearly changes to our clocks corresponding to the changes in the seasons. While "year-round DST" might make sense as a short-term response to an energy crisis such as World War II, in the long term it equivalent to not having DST at all: over a period of several years, everyone will shift their hours back to when they are comfortable being awake unless the government also legislates working hours, store hours, and prime-time television.

I'll admit to being atypical when it comes to sleeping schedules. I work from home and can keep almost any schedule I want. I tend to be most productive at nights, when there are few distractions, so I often sleep during the day. I prefer to be outside during the evening and night, when I don't have to wear sunglasses. (As an added bonus, when I go grocery shopping, my dairy products will take less damage from the walk home). On the other hand, in college, when many students wouldn't even consider taking a class before 10am, I didn't mind having an 8am MWF class as long as I also had an 8:10am class on Tuesday and Thursday.

I'm sure many readers do keep "normal hours", whether by coercion or choice, so what do you think of Continuous DST?

San Diego Firefox party

October 28th, 2006

I had a great time at the San Diego Firefox party, organized by numist. Most of the people at the party were UCSD computer science students, but there was also at least one Cog Sci major and an English major. Many of the computer science majors were juniors who had just finished struggling with difficult OCaml assignments in a Programming Languages course.

Not everyone who was at the party uses Firefox as their main browser. While some of them use nothing but Firefox trunk builds, the host uses Safari for most of his browsing.

Lawrence Eng, a market researcher at Opera Software's San Diego office, also joined the party. We discussed differences in anti-phishing approaches: Opera's default protection involves contacting the server with URLs you visit, but Opera promises to only use the URLs it collects due to the feature in specific ways. He also admitted to having tried out Thumbs, saying that "Firefox has Opera beat there".

Some people at the party were disappointed at the lack of Firefox t-shirts, but said they weren't going to switch to Opera or Safari as a result. I replied that it was a good thing Lawrence hadn't brought along any Opera shirts.

I brought my copy of Apples to Apples. It is one of my favorite party games, along with Taboo and Scattergories. About half an hour into the party, I tried to start the game. Not many of the partygoers knew the game, so we started with four players and let others join gradually.

Like any good party game, Apples to Apples is fun even if you're not winning; it's possible to play without keeping score at all. This was good for me because I'm not an especially strong player and many of the other players had the advantage of already knowing each other.

Perhaps in part due to my overall low score, I was very satisfied with how I won the last round. The adjective to match was "Frightening" and I played "A sunrise", initially hoping to win on irony. But after seeing that my "sunrise" was up against the Anne Frank card, I had a flash of insight. I explained: "You've been up all night working on a project, you're not even close to done, and you look out the window and see the sun rising." Another player had been in exactly that situation the morning before the party, and the judge picked my card.

Bundled software in security updates

October 28th, 2006

Today's Java security update includes a checked-by-default "Install Google Toolbar for Internet Explorer" option. Shame on you, Sun and Google. Automatic security updates are no place to push unrelated, bundled software. Making security updates annoying hurts security almost as much as making security updates complicated: users will be less inclined to update next time.

This is similar to how Flash updates attempt to install the Yahoo Toolbar. It's certainly not as bad as the frequently updated AOL Instant Messenger, which turns on the "Today window" popup on every AIM account and adds a "Netscape ISP" icon to the desktop with every security update. But I thought Google was trying to set a good example.