Monday, July 20, 2009

A guide to pork, part 1

As one of the first people to have actually used pork (apparently the third, after taras and cjones), I feel obliged to give a guide as to how to write an automatic patch generator, so as best to prevent people from asking the same question a fourth time. This also contains some ranting about some of my annoyances with pork (sm::string, I'm looking at you). So, without further ado, I present Part 1: It works!

A brief introduction

Pork essentially consists of three main areas of API (enumerated in order of my discovery): the patcher API, the C++ AST structure direct from the parser, and the annotated APIs that make finding information more than a bit easier. There is also a sort of fourth API: the utilities that partially replicate functionality in the STL.

My original interest in pork came from an idea to rewrite libmime, which is roughly a basic C++ implementation written in C, into the equivalent C++ code. Such a patch would be on the upper end of difficulty for a normal shell, Python, or awk script: I need to combine classes, rewrite function prototypes, rename variables, and refactor globs of code like
return ((MimeObjectClass*)&MIME_SUPERCLASS)->initialize(object);
into
return MimeLeaf::initialize();
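To make that rewrite a little more concrete, here is a minimal sketch of the two styles side by side. It is not libmime's actual code: the struct layouts, the MimeInlineText name, the initialized counter, and the runCStyle/runCxxStyle helpers are all hypothetical stand-ins I've made up to show the shape of the pattern. Only the two return statements are taken from the lines above.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-in for the C-style "classes".
// The real libmime structures carry far more state than this.
namespace c_style {
struct MimeObject;

struct MimeObjectClass {
  MimeObjectClass *superclass;
  int (*initialize)(MimeObject *);
};

struct MimeObject {
  MimeObjectClass *clazz;
  int initialized;  // counts how many initialize() calls ran
};

static int MimeLeaf_initialize(MimeObject *object) {
  object->initialized += 1;
  return 0;
}

static MimeObjectClass mimeLeafClass = { nullptr, MimeLeaf_initialize };
#define MIME_SUPERCLASS mimeLeafClass

static int MimeInlineText_initialize(MimeObject *object) {
  object->initialized += 1;
  // The superclass call goes through the hand-rolled class record.
  return ((MimeObjectClass*)&MIME_SUPERCLASS)->initialize(object);
}

static MimeObjectClass mimeInlineTextClass = { &mimeLeafClass,
                                               MimeInlineText_initialize };

int runCStyle() {
  MimeObject obj = { &mimeInlineTextClass, 0 };
  int status = obj.clazz->initialize(&obj);
  assert(status == 0);
  return obj.initialized;
}
}  // namespace c_style

// The same hierarchy expressed with real C++ classes.
namespace cxx_style {
class MimeLeaf {
public:
  virtual ~MimeLeaf() {}
  virtual int initialize() {
    initialized += 1;
    return 0;
  }
  int initialized = 0;
};

class MimeInlineText : public MimeLeaf {
public:
  int initialize() override {
    initialized += 1;
    // The superclass call is now an ordinary qualified call.
    return MimeLeaf::initialize();
  }
};

int runCxxStyle() {
  MimeInlineText obj;
  MimeLeaf &base = obj;
  int status = base.initialize();  // virtual dispatch to MimeInlineText
  assert(status == 0);
  return obj.initialized;
}
}  // namespace cxx_style
```

In both versions the subclass initializer runs once and then chains to the superclass. What the rewrite has to do is drop the explicit object argument and replace the cast through MIME_SUPERCLASS with a qualified call, which is the kind of change that a text-level script has a hard time getting right.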

Step 1: Building and running your tool

The first step is to build pork; Taras's guide will likely be more up-to-date than any instructions I give. Once you have an installation of pork, you can plug your own tool into the structure. I've personally handled this by making a tools/ subdirectory with a very neat Makefile that automatically adds files to be compiled into the tools themselves.

Your tool will eventually be invoked as tool <args> filename if you are using the pork-barrel script. All that pork-barrel does is run the programs one at a time and merge the resulting patches at the end; you don't need to use it (and I recommend you don't) while you are first getting your tool working. The files it runs on are preprocessed files, generally with the extension .i or .ii. Invoking gcc with -save-temps is a nice way of generating these files. You don't need to use mcpp if you're not overly concerned about stuff lurking in macros.

Step 1.1: Running the patcher

Once your tool processes its arguments, it will eventually be reading the C++ files and patching them. Here is some sample code to do that, which I provide without comment (it's just boilerplate):

#include "piglet.h"
#include "expr_visitor.h"
#include "patcher.h"

class MainVisitor: public ExpressionVisitor {
public:
  MainVisitor(Patcher &p): patcher(p) {}
private:
  Patcher &patcher;
};

int main(int argc, char **argv) {
  PigletParser parser;
  Patcher p;
  MainVisitor visitor(p);
  for (int i = 1; i < argc; i++) {
    TranslationUnit *unit = parser.getASTNoExc(argv[i]);
    unit->traverse(visitor);
  }
  return 0;
}

The necessary APIs for the utilities will eventually be covered in more detail. Unfortunately, it's late, so that will have to wait; for now, you have a working, if do-nothing, pork utility. Next time, I'll discuss the basics of Patcher and ExpressionVisitor.

1 comment:

Taras said...

Dave Mandelin did some Pork work, but he opted to avoid using patcher and instead built some python kludge (i.e., he mainly used elsa). But he did lay the foundations for pork.
So it's more like 2.999 people got there before you :)