The #ifdef problem
In discussing DXR at the 2011 LLVM Developers' Conference, perhaps the most common question I had was asking what it did about the #ifdef problem: how does it handle code present in the source files but excluded via conditional compilation constructs? The answer then, as it is now, was "nothing:" at present, it pretends that code not compiled doesn't really exist beyond some weak attempts to lex it for the purposes of syntax highlighting. One item that has been very low priority for several years was an idea to fix this issue by essentially building the code in all of its variations and merging the resulting database to produce a more complete picture of the code. I don't think it's a hard problem at all, but rather just an engineering concern that needs a lot of little details to be worked out, which makes it impractical to implement while the codebase is undergoing flux.
Documentation is an intractable unsolved problem that makes me wonder why I bring it up here…oh wait, it's not. Still, from the poor quality of most documentation tools out there when it comes to grokking very large codebases (Doxygen, I'm looking at you), it's a wonder that no one has built a better one. Clang added a feature that lets it associate comments to AST elements, which means that DXR has all the information it needs to be able to build documentation from our in-tree documentation. With complete knowledge of the codebase and a C++ parser that won't get confused by macros, we have all the information we need to be able to make good documentation, and we also have a very good place to list all of this documentation.
Indexing dynamic languages