Last week, I asked Luke Wagner to explain some security bugs that he fixed in the past. I hoped to learn from each bug at multiple levels, in ways that could help prevent future security bugs from arising and persisting.
- What was the bug?
- What went wrong in the developer's thinking that caused the bug to be introduced?
- What made the bug exploitable?
- What caused us to use especially dangerous features of C++?
- Could a new abstraction make it possible to do this both fast and safe?
- What caused the bug to persist? Could we have caught this earlier with improved regression tests, fuzz testing, dynamic analysis, or static analysis?
Luke and I made trees for all ten bugs, at first on paper and later using EtherPad. Then I extracted and categorized what I thought were the most useful lessons and recommendations.
Recommendations for introducing fewer bugs
- Create centralized, type-restricted cast functions. This protects you when you change the representation of one of the types. It also protects against mistakes that cause the input type to be incorrect.
- Use tagged unions instead.
- Use a typed wrapper (a struct containing a single value). When assigning from the underlying numeric type, convert using one of two functions: one that checks for special values, and one that explicitly does not.
- Audit existing code paths to ensure they cannot generate the special value.
Clarity of invariants
- Increase use of methods named AssertInvariants
- Create an alias for JS_ASSERTION called JS_INVARIANT.
Interacting with other developers
- If you're about to do something gross because someone else doesn't expose the right API/helper, maybe you should get it exposed.
JS Engine specific
- Any patch that touches rooting should be reviewed by Igor.
- Interpreter could have better abstraction and encapsulation for its stack.
Recommendations for catching bugs earlier
- Find all casts (C-style casts, the reinterpret_cast keyword, and casts through unions) for a given type. Could be used to enforce centralization or to find things that should be centralized.
- Be suspicious of a function with multiple return statements, all of which return the same primitive value.
- Be suspicious of a function returning true/success in an OOM path.
- Ask Valgrind developers what they think of providing (in valgrind.h) a way to tie the addressability of "stacklike memory" to a variable that represents the end of the stack.
- We should fuzz worker threads somehow.
- In browser (slow and messy, but it's what users are running).
- In thread-safe shell (--enable-threadsafe?), which has "toy workers".
- We should fuzz compartments better.
- I should ask Blake and Andreas for help with testing compartments and wrappers.
- I should ask Gary to run jsfunfuzz in xpcshell, where I can test both same-origin and different-origin compartments, and thus get more interesting wrappers.
- We should give JS OOM fuzzing another shot.
I'm curious if others have additional ideas for what could have prevented the ten bugs we looked at. For example, someone like Jeff Walden, who loves to write exhaustive regression tests, might have ideas that Luke and I did not consider.
I'd also like to do this kind of analysis with a other developers on bugs they have fixed.