Sunday, April 27, 2008

Software Bloat

Although belated, this is an attempt to respond to some of the ideas presented in this newsgroup message, along with a few others that were hinted at in subsequent postings or never publicly espoused at all.

One of the great culprits of modern computer society is the idea of software bloat: that an application has grown too large to be effective. Microsoft Windows and Office are regularly accused of this sin, but even Thunderbird and Firefox are accused; I'm sure that one could dredge up opinions as to why the Linux kernel or X is too bloated. But are these terms fairly applied?

An adage holds that 80% of users use only 20% of the features, but the problem is that the 20% actually used is different for each user. I personally don't use mail-merge in office applications at all, but I tend to heavily use the advanced outlining options, for example. Does this mean that the mail-merge feature should be stripped? What about macros--evil for most of the public but invaluable for those who heavily rely on them?

I recall reading Slashdot's response to the awesomebar in Firefox. The average response consisted of yelling about why it was such a bad idea to include it, most likely by those who had never used it. I had the same response at first, but quickly found it invaluable: I can go to a specific bugzilla bug by typing the bug number, or head to RFC 3977 by typing just that. And for that site where I remember the title but not the freaking URL, well, that explains itself. In fact, I now get annoyed whenever I realize that I am not using the awesomebar, because the regular URL bar is so cumbersome. So is the awesomebar bloat? To non-users, yes, but to users, no.

A more specific example to me is Thunderbird. The aforementioned post consisted of a rant (I'm not going to glorify it) as to why Thunderbird should stick to email and only email: no RSS, no calendar, no NNTP, no address book, no... half of it. Would it be better? The RSS component probably detracts from Thunderbird by being half-implemented (I will slyly add that news has had even worse problems, and few call for that to be stripped out unless prompted), but would anyone complain if it were well-implemented? I think not. And there are strong reasons, too.

But there is no clear line to draw when excluding features. News is a worthwhile component to include in an email reader: it is quite close to IMAP in many regards. If one includes news support, why not include uuencode? It's useful for multinational stuff or alt.binaries. Why not then yEnc? Combine-and-decode? X-Face? Advanced message scoring? Feature XYZ? At some point, someone can call it all "bloat", but where is the line you need to draw that makes it not bloat?

What's the alternative to bloat? Look at the ultimate example of bloat, Windows; the non-bloated alternative is Linux (Mac is somewhere in between). You have several varieties of Vista, but hundreds of Linux distributions: Debian, Ubuntu, Gentoo, Slackware, OpenSUSE, Red Hat, Fedora Core, etc. Your desktop? The big ones are GNOME and KDE, but FVWM, Fluxbox, Blackbox, XFCE, and many more exist; interoperability among them, especially between GNOME and KDE, is nowhere near what Windows provides. In place of a broken one-size-fits-all, we have a multitude of alternatives that can do a poor job of talking to each other.

The answer, many claim, is in the idea of pluggability. But, here too, the line is very blurry. At what point do you say that extension XYZ should be included in the core? This is what happened with RSS: it first existed as an extension and was later integrated. Although I am not well-versed in Firefox history, I would be willing to posit that extensions there have similarly become incorporated. The net result is that more plugins are folded into the core until someone, once again, screams about the bloat, forks, and creates a leaner version, which then becomes just as bloated, ad infinitum.

So, what can one do about bloat? In my opinion, the true, final answer is to let users decide for themselves which features they need. Being able to easily compile the source code with options to turn off feature X goes a long way toward this. Increasing modularity helps, but perhaps the biggest single help would be to have true, open, universally-supported standards, not just for online communication, but for interprocess and interplatform communication in all regards (i.e., standards closer to an ODF specification than to XML).

Sunday, April 6, 2008

The Great Addressbook Rewrite

As I write this, bug 413260's first part is almost complete. At a diff of -2639/+1747 (47 files changed), the first portion is quite large. That said, substantive changes take place in only two files, accounting for over half the patch. A small portion is given over to the removal of nsIAbMDBCard, while the rest of the patch centers on merely changing the access points. The patch's size is mostly due to the large number of places where cards are accessed, not because the feature set is complex.

My original plan for implementation called for three large patches, with errata in various other patches. The nsIAbCard is relatively self-contained and atomic, so that was how I came up with the plan. However, the startling results from attempting the nsIAbDirectory refactoring make this idea unfeasible.

My goal was to remove as much of nsIAbMDBDirectory as possible, which would require invalidating nsIAddrDatabase. That requires major changes to import and palmsync code. Throw in the difficulty of going from a file to an address book directory or a database to a directory and you get some hair-raising complexity. Then add in the fact that mailing lists are, well, black magic, and this process becomes especially complex. And large: my latest WIP touches some 68 files with a diff of a mere -693/+836.

What causes the sheer magnitude of difference in complexity? A card is quite lightweight: it is essentially a map of properties with a bit of extra stuff. There is therefore only one implementation of a card, with five sub-implementations that refine it somewhat. A directory is the opposite: a heavyweight object with 4-6 (depending on how you count them) different implementations, sharing little meat among them. Mailing lists are quite like black magic: they pop into existence when you need them, and creating one requires some non-intuitive steps.
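To make the "map of properties with a bit of extra stuff" idea concrete, here is a minimal Python sketch. It is only an illustration of the shape I have in mind: the class names, method names, and property keys are all hypothetical, and none of this is the actual nsIAbCard interface.

```python
class Card:
    """Hypothetical stand-in for a card: a property map plus a few helpers."""

    def __init__(self, **properties):
        # All card data lives in one plain dictionary of named properties.
        self._properties = dict(properties)

    def get_property(self, name, default=None):
        # Missing properties fall back to the caller's default.
        return self._properties.get(name, default)

    def set_property(self, name, value):
        self._properties[name] = value

    def display_name(self):
        # The "bit of extra stuff": a value derived from the property map.
        explicit = self.get_property("DisplayName")
        if explicit:
            return explicit
        parts = [self.get_property("FirstName", ""),
                 self.get_property("LastName", "")]
        return " ".join(p for p in parts if p)


class ReadOnlyCard(Card):
    """Hypothetical sub-implementation that refines the base card slightly."""

    def set_property(self, name, value):
        raise TypeError("this card cannot be modified")


# Illustrative usage with made-up data.
card = Card(FirstName="Ada", LastName="Lovelace",
            PrimaryEmail="ada@example.com")
card.set_property("NickName", "AAL")
```

The point of the sketch is that a sub-implementation only has to override a method or two, which is why the card side of the rewrite stays small while the directory side does not.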

So, what's my new plan? Well, lesson #1 is to stave off nsIAddrDatabase-gutting until after mailing lists are done. The next steps will be more atomic: the creation of nsIAbCollection and the refactoring it involves come first (probably split up into two or three patches), followed by mailing list sanity with nsIAbGroup (again, maybe a few patches itself). Only afterwards will I remove nsIAddrDatabase from import and palmsync (probably in two batches). Trivial removal of nsIAbMDBDirectory (i.e., where it is used for cardForEmail) will probably be in one of the first directory patches.

Of course, some of the more ambitious changes will need tests, but it looks like hwaara is being nice and writing tests for import (see bug 421050).