Thursday, June 2, 2011

Who's responsible for libxul

As you might have gathered from my last post, I spend a fair amount of time thinking up ways of visualizing software metrics. Today's visualization is inspired by a lunch conversation I had yesterday. Someone asked "how much space does SVG take up in libxul?" This isn't a terribly hard problem to solve: we have a tool that can tell us the size of symbols (i.e., codesighs), and I have a tool that gives me a database of where every symbol is defined in source code.

Actually, it's not so simple. codesighs, as I might have predicted, doesn't appear to work that well, so I regressed back to nm. And when I actually poked the generated DXR database for mozilla-central using dehydra (I haven't had the guts to try my dxr-clang integration on m-c yet), I discovered that it appears to only lack the information about functions. Both static functions and member C++ functions—my suspicions of breakage are now confirmed. So I resorted to one of my "one-liner" scripts, as follows:

cat <<HEREDOC
<!DOCTYPE html>
<html><head>
<title>Code size by file</title>
<script type="application/javascript" src="https://www.google.com/jsapi"></script>
<script type="application/javascript">
google.load("visualization", "1", {packages: ["treemap"]});
google.setOnLoadCallback(drawChart);
function drawChart() {
  var data = new google.visualization.DataTable();
  data.addColumn('string', 'File');
  data.addColumn('string', 'Path');
  data.addColumn('number', 'Code size (B)');
HEREDOC
nm --format=bsd --size-sort /src/build/trunk/browser/dist/bin/libxul.so | cut -d' ' -f 3 > /tmp/libxul.symbols.txt
find -name "*.o"| xargs nm --format=bsd --size-sort --print-file-name | sed -e 's/:/ /g' | python -c '
import sys
symbols = set(open("/tmp/libxul.symbols.txt").readlines())
files = {}
for line in sys.stdin:
  args = line.split(" ")
  if args[-2] in "tTvVwW" and args[-1] in symbols:
    files[args[0]] = files.get(args[0], 0) + int(args[1], 16)
out_lines = set()
for f in files:
  f2 = f.replace("./", "")
  path = f2.split("/")
  str = "src/"
  for p in path[:-1]:
    out_lines.add((str + p + "/", str, 0))
    str += p + "/"
  out_lines.add((f2, str, files[f]))
print "data.addRows([" + ",\n".join([repr(list(x)) for x in out_lines]) + "]);"
'

cat <<HEREDOC
  data.addRow(["src/", null, 0]);
  var tree = new google.visualization.TreeMap(document.getElementById('tmap'));
  tree.draw(data, {maxDepth: 3, headerHeight: 0, noColor: "#ddd", maxColor: "#ddd",
    minColor: "#ddd", midColor: "#ddd"});
}
</script></head>
<body>
  <div id="tmap" style="width: 100%; height: 1000px;"></div>
</body>
</html>
HEREDOC

It's really 4 commands, which just happens to include two HEREDOC cats and a multi-line python script. The output is a littlenot-so-little HTML file that loads a treemap visualization using the google API. The end result can be seen here. To answer the question posed yesterday: SVG accounts for about 5-6% of libxul, in terms of content, and about another 1% for the layout component; altogether, about as much as our xpconnect code accounts for libxul. For Thunderbird folks, I've taken the time to also similarly divvy up libxul, and that can be found here.

P.S., if you're complaining about the lack of hard numbers, blame Google for giving me a crappy API that doesn't allow me to make custom tooltips.

No comments: