Friday, January 22, 2010

Developing new account types, part 0: An introduction

This series of blog posts discusses the creation of a new account type implemented in JavaScript. Over the course of these posts, I use the development of my Web Forums extension to explain the steps needed to create new account types. I hope to add a new post once every two weeks (I cannot guarantee it, though).

Before I begin the actual discussion, let me give some background. The ability to develop new account types has been my biggest extension goal for about two years now. Probably because of its difficulty, I know of only two extensions that have tried to do it: what is now the RSS code, and Webmail. In the first case, the implementer resorted to creating a binary component for the incoming server; in the latter, the implementer wrote a fake IMAP (and POP, SMTP) server to proxy the information to the web interface.

Some preliminary points: making a new account type is not a Good First Extension. You will need a fair amount of XPCOM experience, and probably decent experience at delving into implementations of undocumented interfaces. How much XUL and DOM (for stuff like webscraping) you use is up to you. MDC has a guide on building a Thunderbird extension from scratch. It is also probably not a bad idea to get comfortable with manual preference editing.

I am also trying a different form of development in this guide. This is not being done via my more common method of editing HTML by hand, but by writing in Kompozer. I'm also attempting to get more code included in my posts, and hopefully some images as well (the last part will be hardest). Like my first guide, I do expect that this will be adapted into a series of documents on MDC at some point. Some more reference-oriented documentation will be posted on MDC as I write this.

I personally use a debug build of Thunderbird on Linux, built from source very near the tip, as the basis for my extension development (my regular profile is some 370 MB of stuff I don't dare threaten with developmental work). This is the same build I do patch development on, so I will rely on patches in said tree from time to time. One patch in particular is required [1], but otherwise, it should work on 1.9.2 and probably 1.9.1 as well.

This guide is structured to first demonstrate the actual activity components (e.g., displaying messages) and only cover configuration (e.g., the wizard to create a new account) when the more basic stuff has been completed. Therefore, you will need to get comfortable with editing configuration files by hand if you follow these steps exactly.

Backend introduction

So, let's start with an overview of the backend interfaces in mailnews. A list of the interfaces a front-end widget might use to talk to an account is: nsIMsgAccount, nsIMsgDatabase, nsIMsgDBView [2], nsIMsgFolder, nsIMsgDBHdr, nsIMsgIdentity, nsIMsgIncomingServer, nsIMsgMailNewsUrl, nsIMsgMessageService, nsIMsgProtocolInfo, nsIChannel, nsIProtocolHandler, and nsIRDFResource. Many of these would need to be implemented, and a few of them are not in any way small; to implement nsIMsgFolder would require a total of 186 methods, setters, and getters (as of this writing), many of which are not well-documented.

In reality, implementations are not from scratch. Everything tends to boil down to two or five different implementing classes: the server, the service, the folder, the URL, and the database (there is also typically a protocol implementation). Of these, only the service is implemented from scratch, and it gets the simplest interfaces to implement. When I say "two or five," I am referring to the fact that there are actually two types of accounts. The first type, which only has to implement a server and service, can be called mailbox accounts: all of the messages are downloaded into local folders [3]. The second type implements all of the above, as the messages are generally stored on the server and downloaded on demand (or cached).

Of these two types, the less interesting is the first one. I will therefore generally ignore this account type. If you want to make such an account type, look at the RSS implementation for guidelines. The primary distinction is that mailbox accounts lack their own folder types, and therefore databases and URLs. In such a case, all you need to worry about is delivering the messages.
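
As a rough illustration of the "two or five" split, here is a minimal sketch in plain JavaScript. It is not mailnews code: the component names are just labels taken from the text above, and missingComponents is a hypothetical helper invented for this example.

```javascript
// Illustrative only: these strings are labels from the text, not XPCOM
// contract IDs or interface names.
const REQUIRED_COMPONENTS = {
  mailbox: ["server", "service"],
  full: ["server", "service", "folder", "url", "database"],
};

// A protocol implementation is typical but not mandatory for either kind.
const OPTIONAL_COMPONENTS = ["protocol"];

// Hypothetical helper: which required pieces are still unimplemented?
function missingComponents(kind, implemented) {
  const required = REQUIRED_COMPONENTS[kind];
  if (!required) {
    throw new Error("unknown account kind: " + kind);
  }
  return required.filter((name) => !implemented.includes(name));
}

// With only a server and service written, a full account type still
// lacks its folder, URL, and database implementations.
console.log(missingComponents("full", ["server", "service"]));
```

A mailbox account with the same two components implemented has nothing left on its list, which is why it is the less interesting case.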

Following is a description of the major implemented components:

Server
The server represents the source of messages for an account. It also serves as the per-account configuration information for implementers. For example, NNTP stores the maximum connection limit to a server off of this implementation.
Folder
The folder represents a container of messages. Ultimately, the UI interacts more with folders than with servers, at least on a regular basis. This is the most complex interface to deal with, primarily because it can be hard to tell precisely what you need to implement versus what (eventually) calls back on some other method.
Database
The database represents a store of a subset of message information. By default, it generally stores what NNTP would call overview information (enough to create a threaded message list), plus some flags like read status, as well as some information that extensions wish to preserve.
Service
The service is more of a "how-to" guide for accounts. This is the external endpoint for ultimately copying messages, viewing messages, etc. Note that this is the only service implementation, so the actual server communication code typically happens in a different implementation.
URL
URLs are what the name implies: they are how one refers to messages, folders, and servers, although only messages are typically instantiated with the object in question. They also tend to be used as the primary internal communication system.
Protocol
The protocol instance represents a connection to a server. Unlike the other implementations, this one is not mandatory and is typically not visible via the "main" interfaces (nsIChannel is perhaps the most useful one they export). I suspect this is primarily useful for binary protocols, but I have not yet delved far enough into creating a new account to say for certain.

Important interfaces and their interactions

The center of an account is represented by nsIMsgAccount. To get an idea of the number of interfaces involved, look at the collaboration diagrams for nsIMsgAccount and nsIMsgFolder.

nsIMsgAccount represents an account. The interface itself is not terribly useful—it's mostly just a step on the way to get to a server or an identity.

nsIMsgIdentity represents an identity. Identities are essentially a way of persisting compose settings; since their use is wholly related to compose code, I will not discuss them in detail until later parts of the guide.

nsIMsgIncomingServer, as mentioned earlier, represents a message source. This is one of the interfaces you will have to implement, although much of it is already done for you. Everything that is specific to a server hangs off of this interface; everything that is specific to a folder hangs off of nsIMsgFolder; folders are accessible via the root folder of a server.

nsIMsgFolder, as mentioned earlier, represents a container of messages. This is one of the interfaces that has to be implemented, unless you are using a mailbox account. All folders have a database.
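
To make the ownership relations concrete (a server holds per-account settings and a root folder, folders hang off the root folder, and every folder has a database), here is a minimal sketch in plain JavaScript. The Sketch* classes, the host name, and the maxConnections setting are all hypothetical stand-ins; none of this is the real XPCOM interface surface.

```javascript
// Hypothetical stand-ins for the components; none of these are real
// XPCOM classes, and the real interfaces are far larger.
class SketchDatabase {
  constructor() {
    this.headers = new Map(); // message key -> overview information
  }
  addHeader(key, overview) {
    this.headers.set(key, { ...overview, read: false });
  }
  markRead(key) {
    this.headers.get(key).read = true;
  }
  unreadCount() {
    let count = 0;
    for (const hdr of this.headers.values()) {
      if (!hdr.read) count++;
    }
    return count;
  }
}

class SketchFolder {
  constructor(name) {
    this.name = name;
    this.subFolders = [];
    this.database = new SketchDatabase(); // every folder has a database
  }
  addSubFolder(name) {
    const child = new SketchFolder(name);
    this.subFolders.push(child);
    return child;
  }
}

class SketchServer {
  constructor(hostName, settings) {
    this.hostName = hostName;
    this.settings = settings; // per-account configuration lives here
    this.rootFolder = new SketchFolder(hostName); // folders hang off the root
  }
}

const sketchServer = new SketchServer("forum.example.invalid", { maxConnections: 2 });
const generalFolder = sketchServer.rootFolder.addSubFolder("General");
generalFolder.database.addHeader(1, { subject: "Hello" });
generalFolder.database.addHeader(2, { subject: "Re: Hello" });
generalFolder.database.markRead(1);
console.log(generalFolder.database.unreadCount()); // prints 1
```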

nsIMsgDatabase represents the message store overview. This has to be implemented if you are implementing nsIMsgFolder (unless you want to be sneaky). Databases are used to get at thread and header information, via nsIMsgThread and nsIMsgDBHdr, respectively. Messages themselves have numerous representations: URIs, header objects, message keys, and (sometimes) message IDs. Conversion between these forms is common.
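
Since conversion between these forms is so common, a small sketch of what it can look like may help. This is plain JavaScript under loud assumptions: the webforum-message scheme, the folder-path-plus-key URI layout, and all three function names are hypothetical choices made for this example, not anything mailnews prescribes.

```javascript
// Hypothetical URI layout: webforum-message://<folder path>#<message key>
const MESSAGE_SCHEME = "webforum-message";

// Message key + folder -> URI string.
function makeMessageUri(folderPath, key) {
  return MESSAGE_SCHEME + "://" + folderPath + "#" + key;
}

// URI string -> folder path and numeric message key.
function parseMessageUri(uri) {
  const match = /^webforum-message:\/\/(.+)#(\d+)$/.exec(uri);
  if (!match) {
    throw new Error("not a message URI: " + uri);
  }
  return { folderPath: match[1], key: Number(match[2]) };
}

// URI string -> header object, by way of the folder's database.
// `databasesByFolder` maps folder path -> Map of key -> header object.
function headerForUri(uri, databasesByFolder) {
  const { folderPath, key } = parseMessageUri(uri);
  const database = databasesByFolder.get(folderPath);
  return database ? database.get(key) : undefined;
}

const sampleDatabases = new Map([
  ["forum.example.invalid/General", new Map([
    [42, { subject: "Hello", messageId: "<42@forum.example.invalid>" }],
  ])],
]);
const sampleUri = makeMessageUri("forum.example.invalid/General", 42);
console.log(sampleUri);
console.log(headerForUri(sampleUri, sampleDatabases).messageId);
```

The header object is the only form here that carries the message ID, which is one reason code ends up hopping between representations.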

nsIDBFolderInfo represents folder properties normally stored in the database. All of these properties are also stored in the folder cache (nsIMsgFolderCache) to avoid opening up all of the databases just to figure out how many unread messages are in each folder.
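
The point of the folder cache can be shown with a toy version in plain JavaScript. Everything here is a hypothetical stand-in (FAKE_DATABASES, openDatabaseInfo, getFolderInfo); the real nsIMsgFolderCache is not modeled. The sketch only demonstrates that summary numbers can be answered without reopening a database.

```javascript
// Hypothetical stand-in for on-disk databases; opening one is the
// expensive step the cache is meant to avoid.
const FAKE_DATABASES = new Map([
  ["General", { numUnread: 3, numTotal: 10 }],
  ["Off-topic", { numUnread: 0, numTotal: 4 }],
]);
let databaseOpens = 0;

function openDatabaseInfo(folderName) {
  databaseOpens++;
  const info = FAKE_DATABASES.get(folderName);
  if (!info) {
    throw new Error("no database for folder: " + folderName);
  }
  return info;
}

// Toy folder cache: folder name -> copy of the folder properties.
const folderCache = new Map();

function getFolderInfo(folderName) {
  let info = folderCache.get(folderName);
  if (info === undefined) {
    info = { ...openDatabaseInfo(folderName) }; // cache miss: open once
    folderCache.set(folderName, info);
  }
  return info;
}

getFolderInfo("General");
getFolderInfo("General"); // served from the cache, no second open
console.log(databaseOpens); // prints 1
```

A real cache also has to be kept in sync when the database changes; the sketch leaves that out.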

nsIMsgAccountManager and nsIMsgBiffManager are two managers that handle account creation and the periodic mail download (generally called biff), respectively. Expect to see these calling your code a lot.

nsIMsgDBView represents the thread pane view. This is going to be the primary consumer of nsIMsgDatabase, and this is where you should go to look to find out what happens if, e.g., you select a new message.

nsIMsgFilterList, nsIMsgFilterPlugin, nsIMsgFilterService, and nsIMsgFilter are the interfaces that deal with filtering. None of these will have to be implemented to support filtering [4].

The nsIMsgSearch* interfaces are those that deal with search (there are around 9 of them). Most of these will not have to be implemented to support searching. More on this when searching is discussed.

nsIMsgWindow represents the bridge to the front-end. It is passed into many functions, although it may be null, typically when being invoked from the backend.
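
Because the window may be null, any code that wants to talk to the front-end should guard for that. A minimal sketch in plain JavaScript follows; reportStatus and the fallback log are hypothetical, and the statusFeedback.showStatusString shape is assumed here to mirror what nsIMsgWindow offers, so check the IDL before relying on it.

```javascript
// `msgWindow` stands in for an nsIMsgWindow. It may be null when the call
// originates in the backend rather than from a UI action.
// Assumption: the window exposes statusFeedback.showStatusString(text).
function reportStatus(msgWindow, text, fallbackLog) {
  if (msgWindow && msgWindow.statusFeedback) {
    msgWindow.statusFeedback.showStatusString(text);
    return true; // shown in the front-end
  }
  fallbackLog.push(text); // no front-end available; record it elsewhere
  return false;
}

// Fake window object for demonstration purposes only.
const shownStrings = [];
const fakeWindow = {
  statusFeedback: {
    showStatusString(text) {
      shownStrings.push(text);
    },
  },
};
const fallbackLog = [];

reportStatus(fakeWindow, "Fetching forum index", fallbackLog);
reportStatus(null, "Background fetch", fallbackLog);
console.log(shownStrings.length, fallbackLog.length); // prints 1 1
```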

nsIMsgMailNewsUrl represents the URL object that loads a message. This will generally have to be implemented if nsIMsgFolder is.

nsIMsgProtocolInfo represents the basic information about an account type's capabilities. This interface is one that is required to be implemented. As the name implies, it is generally geared towards the capabilities of the connection to the server.

nsIMsgMessageService and nsIMsgMessageFetchPartService represent the ability to retrieve the message (and message parts, more often known as attachments [5]). This is another interface that one must implement if folders are being implemented.

The MIME, compose, and import interfaces are omitted from this list of backend interfaces, as these are topics that will not be discussed for a while, and I am not certain they are useful to know about making new account types at present.

Notes

  1. The purpose behind this patch is to enable extensions to reuse files from base/utils like C++ components can. If you were to adapt this to use C++ instead of JS, this patch would not be necessary. As the comments in the linked bug indicate, there is no guarantee that this will be implemented for Thunderbird 3.1; however, in such a scenario, the specifically required binary components would be available for reuse on some webpage. More on this when a decision is made.
  2. Strictly speaking, this interface uses other interfaces in the list to talk to you. That said, a lot of interaction with folders and databases happens through this interface.
  3. I say local folders—not Local Folders—here because Global Inbox settings actually rely on POP-specific attributes. It is still possible, via a reimplementation, to change the delivery settings. Such a mechanism is outside the scope of this guide.
  4. It's not strictly necessary to implement these, but if you want to add custom filter terms or actions or custom search terms, some interfaces will need to be implemented. Such actions are beyond the scope of this guide.
  5. Classifying all message parts as attachments is a pretty big oversimplification. In general, the only time specific parts are requested in Thunderbird and SeaMonkey is when attachments are involved. For more information on message parts, please see RFC 2045 and RFC 2046 (two of the five MIME specifications), as well as the IMAP FETCH subsection (for numbering).

Monday, January 4, 2010

Building packages: harder than they look

For a course I'm TAing, we (the other TAs and I) decided to revamp the tools so that students could more easily install them on their own computers. This was really my first look into actually producing packages for other people. Here is the long tale:

Step 1: Build simpl

Okay, the basic, core tools here compile and work easily. The more complicated tale is the GUI, built on qt. qt 3, to be precise. Except the autodiscovery thinks we want to try building qt 4. A single post-configure change gets this working. Only took a few hours here (trying to go the qt4 route didn't work so well, and we had interesting endeavors trying to figure out how to get KDE headers to work).

Step 2: Build binutils

This wasn't all that hard at first. Configure ran nicely and without problems, and building... oops, there's a warning and someone turned on -Werror. Another reconfigure gets this building quickly.

Step 3: Build (cross-compiling) gcc

Configure... build... fail... reconfigure... rebuild... fail... Repeat for several hours. Make that days. Do I want these options? Or those options? Still failing. Try editing files mid-build, to see if that gets it to work. And, no. Okay, let's try binutils again. Solution: make install binutils first, then build gcc. That works without problems.

Step 3.5: Test the build

I have a Makefile that just requires me to change a few lines to swap gcc versions and directories of everything. Do that, try it, and... it doesn't work. Something about libc not working correctly.

Step 4: Build newlib

By this point, I know the drill: copy the configure from elsewhere, configure, and build. Apparently there's a typo in one of the ARM assembly files. I teach myself a tiny bit more of ARM (this is turning out to be very educational!) and fix the file. Reconfigure, rebuild, install, and test again. This time, it's complaining about missing a few functions. I found some more documentation online, and wrote my own sbrk function (where "wrote" means copied from some file online and tweaked to make it build). Testing fails again, so I make myself a few more functions and everybody's happy.

Step 5: Build vba

As you might imagine, this one didn't work either. So many build errors. I look at what Debian did, ponder some more, talk it over with the other TAs, and give up. Skritch, skritch.

Step 5: Build vbam

This fork builds... oh, wait, I need cmake. Okay, this fork builds without problem. They don't have versioned package downloads for my build script to pull, so I just have it yank a specific svn revision. Nice, simple package to work with after the mess that is cross-compiling.

Step 6: Build gdb

No problems here. Worked fine the first time, no patching, no need to rebuild. Even the testing had no problems. Stunned me.

Step 7: Package and test on school computer

Problems:
  1. Can't find libmpfr.so
  2. cc1: /lib/libc.so.6: version `GLIBC_2.7' not found
  3. as: /lib/libc.so.6: version `GLIBC_2.7' not found
  4. (vbam) Segmentation fault
Solutions:
  1. Statically link libmpfr. Not too hard...
  2. Statically link gcc. Not very trivial. Eventually, LDFLAGS=-static in the configure arguments works.
  3. Statically link as (and other binutils). This requires manually copying the final line and adding in the -static argument. Every time I rebuild binutils.
  4. Debug, find backtrace. It's in pthreads, called from SDL. Try statically linking SDL (no luck). Try using different SDL versions. Rebuild vbam with debug. Notice that the primary reason for fault is... no sound device. Patch vbam. Test again, it works!

For the sake of clarity, every time I had to test in the final step, I had to reupload the tarball, which started out at 49 MB and grew to 55 MB (thanks to static compilation). Sometimes I had to reupload it again, if the connection died in the middle (my internet connection started getting flaky... possibly related to the 100s of MB I was uploading a day. Or maybe the 100s of MB I was downloading: every time I restarted the script, it downloaded 100 MB of source archives...).

So, in short, I had to override build scripts for 2 different packages, patch another 2, and build 3 out of 5 packages statically. One package doesn't have a point release; the other three are spread out among three separate servers to download. Running the build script from scratch requires nearly 2 GB of disk space and takes several hours. At least now I repackaged it in Makefile form so you don't have to start over from square one if you forgot to install cmake first. Building the final tarball requires a good minute on my system.

But, I've finally finished the experience. Plus, I won't have to do it again... after I build the 64-bit version.

Saturday, January 2, 2010

Predicted work on Thunderbird

It's a new year, so it's time for me to predict (and probably overestimate, who knows) what I would like to do and see in the realm of Thunderbird (and SeaMonkey) and other tidbits in the Mozilla realm.

News submodule

Thunderbird 3 improved the filter story dramatically here; the next two biggest itches are the complete inanity that is news URIs (too many bugs to count), and the venerable old crossposts bug. I still contend that the latter would best be served by per-account database functionality; in any case, it does require some database changes to work properly. I doubt I'll find time to look at that bug in particular this year.

The URI issues are more tractable, but I don't think I'll find time to hit them for 3.1; in any case, I now consider them to be the highest priority news bugs. So, to anyone with time on their hands: feel free to take one or two of these and start fixing. You'll get much kudos from Thunderbird Usenet users.

Other various "nice-to-haves" on my list: fixing subscribe, cleaning up some of the gunk in the code, adding support for RFC 3977 CAPABILITIES, possibly changing how news:a.group URIs work (open the folder view, not necessarily subscribing to them), among others. Combine-and-decode also falls under this list, but it's a lot less tractable than some of the other stuff.

Analysis tools

jshydra could use some more love: I hope to be able to get a converter to a more natural AST working by the end of the year, as well as an automated test suite to verify correctness whenever I change m-c versions. I've also been working on-and-off on getting symbols for DXR via MSVC, which should hopefully also be finished this year.

Other Mozilla/Mailnews work

As I've mentioned before, my biggest goal for 3.1 is to be able to specify new account types in JavaScript. I basically have the necessary framework completed locally; I just need to finish writing the tests and fix some bugs before getting it reviewed and committed. After that, I'll be writing a series of blog entries on developing an account type in JS, similar to (and hopefully better than) my pork guides. Speaking of which, I hope to finish those sometime this year as well. Possibly during summer again.

I've yet to see a roadmap for the address book in 3.1 and later, so I don't know what I'll be doing for the address book in this upcoming year. I expect, though, that I won't do anything near the scale of what I did for bug 413260. De-RDF and de-morkification are two more things I'd like to see worked on but don't expect to get to this year either.

Time to see how much I'll actually get done this year!