Thursday, June 24, 2010

Emacs the Lifesaver

I was not thrilled by the task in front of me: refreshing an old mega-patch, revising virtually every hunk to use a different interface, and committing it piecemeal to the repository again. It sounded like a long, error-prone exercise in suffering.

Fortunately, this is where the power tools come in. I
  1. re-merged the backed-out revision, postponing conflicts;
  2. saved the diff away and reverted the repository;
  3. wrote a small awk script to massage the hunks in the diff into a new patch;
  4. fired up Emacs with the patch and applied each hunk after thorough review (occasionally with minor changes);
  5. set aside the hunks that are not yet ready to go in because they do not qualify as refactoring, keeping them for later;
  6. committed almost every file as a separate revision.
I spent most of the time in Emacs (the Mac OS X incarnation, Aquamacs, is fantastic). It provides all the comfort and productivity I need:
  • all the necessary hunk operations, such as apply, reverse, go to original, and drop;
  • transparent work on a remote machine via ssh, including editing, version control, and the diff operations above;
  • peace of mind, by being rock solid and autosaving everything.
The only inconvenience is the sheer number of keyboard shortcuts, but I am getting used to them too.

Thanks, Emacs; without you I would probably collapse!

Tuesday, June 22, 2010

Burning ISO CDs

I wasted some hours trying to burn ISO CDs on the Mac.

I tried various methods, such as converting the .dmg to .cdr (CD Master) in Disk Utility, but the resulting CD always mounted as HFS+.

Finally I googled a nice method:
hdiutil makehybrid -iso -o MyImage.iso /Volumes/SomeFolder/
will create an ISO filesystem, which can be burned with Disk Utility and shows up as such in the Finder. That is - well, I am pretty sure - readable on PCs.
Alternatively I may use
hdiutil burn MyImage.iso
I believe...

In retrospect, some of my earlier burns may have ended up PC-readable too, since hybrid filesystems may have been created. I'll test them on a PC tomorrow for sure.

Friday, June 18, 2010

Sized types

I have always liked the idea of assigning some notion of size to (tree-like) values and tracking how it changes under pattern matching and construction, so as to be able to reason about recursive calls that do not endanger termination.

Many years ago, when I read the Hughes-Pareto-Sabry paper, I did not yet see why termination is fundamental in so many respects.
Only while sitting on a park bench on Margitsziget (Budapest) and discussing sound logic in Ωmega with Tim did it dawn on me how termination checking with sized types can be exploited.

I developed the intuition of tree-like data floating head-down in the water, with us reasoning about the criteria under which it can still float without touching the bottom at depth n. Still, this metaphor was rather hazy.
In the meantime I have tried to digest the relevant papers by Barthe and Abel, brainstormed somewhat here, and let my brain work on it in the background.

Yesterday I found (on reddit) a link to Abel's new MiniAgda implementation and its description. It made clear to me that my early intuition was not bad at all: the water depth is the upper limit of the size, and recursion must reduce it to obtain a well-founded induction.
Now it is time to rethink my ideas about infinite function types and how they can be reconciled with sized types.
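
To make the upper-bound reading concrete, the standard example from the sized-types literature - transcribed roughly in Abel-style notation, my paraphrase, so take the details with a grain of salt - is division on sized naturals:

  minus : ∀i. Nat^i → Nat^∞ → Nat^i    -- result no larger than the first argument
  div   : ∀i. Nat^i → Nat^∞ → Nat^i

Because minus promises a result no bigger than its first argument, the recursive call in div (succ x) y = succ (div (minus x y) y) is made at a strictly smaller size: the water got shallower, so the induction is well-founded.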

But it looks like Abel has done the hard work, and his Haskell implementation of MiniAgda could be married with Ωmega in the following way: derive a sized variant of every (suitable) Ωmega datatype and try to check which functions on them terminate. Those functions can then be used as theorems in Ωmega.

Hopefully Tim is paying attention to this when implementing Trellys...

Tuesday, June 8, 2010

My grief with out-of-tree code

This post has been a long-standing todo item in my brain, but this checkin is what actually prompted me to write it.

A little bit of history first. As a software developer currently mostly active in the embedded space, I like solutions that save some CPU cycles or bytes of RAM here and there, as long as they let me keep the same interfaces. Exploiting the characteristics of the underlying hardware and algorithms is often low-hanging fruit when it comes to optimizations.
So I have this little agenda of about 10 items I wish to implement in the future to make the LLVM framework a little more efficient.

One of these was to reorder the operands inside the call instruction, to obtain faster access to the callee but mainly to allow fast visitation of all instructions that have a certain callee. I explained all my motives in a separate mail, so I will spare you the gory details here. To make a long story short, it took me several iterations to catch all the places in the optimizers that assumed the operand order to be (callee, arg1, arg2, ...) instead of the new (arg1, arg2, ..., callee), and some miscompilations were only revealed by running the nightly tests. It was a work of blood and sweat, because there are many intrinsics, and the transformations on them often manipulate operands via the low-level interface getOperand(n). Actually, there is a nice helper interface called CallSite which makes it easy to access a call instruction's arguments in a high-level fashion; this interface is probably the best choice for LLVM clients, since it also handles invoke instructions. However, I regard it as okay to use the low-level interface directly within the LLVM tree, since there it is possible to change everything consistently in one atomic commit.
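To illustrate the difference, here is a minimal sketch as hypothetical client code (doSomethingWith is a made-up placeholder). The low-level form hard-codes the operand layout, while CallSite does not care where the callee lives:

  #include "llvm/Instructions.h"      // CallInst (LLVM 2.7-era headers)
  #include "llvm/Support/CallSite.h"  // CallSite
  using namespace llvm;

  void doSomethingWith(Value *V);     // hypothetical placeholder

  void visitCall(CallInst *CI) {
    // Low-level: bakes in the layout.  Under the old
    // (callee, arg1, arg2, ...) order the callee is operand 0;
    // after the reordering it silently becomes the first argument.
    Value *Callee = CI->getOperand(0);

    // High-level: layout-independent, and covers invokes too.
    CallSite CS(CI);
    Callee = CS.getCalledValue();
    for (unsigned i = 0, e = CS.arg_size(); i != e; ++i)
      doSomethingWith(CS.getArgument(i));
  }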
Finally, the day when all regression and nightly tests succeeded dawned. My patch seemingly stuck, with all buildbots green. I left for downtown and returned late at night, only to discover that everything had been backed out, because my change wreaked havoc in an Apple project that evidently used the low-level interface. This was especially frustrating, since I cannot even submit a correcting patch against that project.

I received very little encouraging support, not even moral support. Some comments were even pretty dismissive, along the lines of: this patch has already caused many problems; it is not worth it for such a marginal gain. I have no problem with the comment itself, since I would utter such words in comparable situations too, but this time it was my investment that was at stake. I was pretty determined to keep fighting.

I wondered whether new measurements with higher-arity calls would show a significant speedup with my patch applied. So I did some benchmarking for cases where the change is expected to make a difference, and actually found a speedup of roughly 3%. Clearly this number is only achieved in specific situations, and the generic case would be well below it, but it could still compensate for many of the little time eaters that an advanced optimization pass or analysis necessarily incurs.

In my conversation with the engineer involved, I enumerated the following reasons why resorting to the low-level interface in out-of-tree projects is a bad idea:
  • they do not convey the intent
  • they depend on implementation details, reaching over abstraction barriers
  • they are an impediment to change
(these are mostly the same reasons you can find in the commit message above). He agreed with all of this and promised to nudge the OpenGL implementors. I also received a request to submit a patch that guarantees that no silent breakage can happen. Well, I acknowledged that this is a valid concern, so I did some brainstorming.

I managed to put together a small patch that detects all instances of get/setOperand(0), the major potential cause of breakage in external projects. Compiling with this patch would pinpoint all the places where getCalledValue() should be used instead.

But I cannot promise more than that!

Why is it impossible to guarantee that, with my proposed change, either everything keeps working or there is a compilation error pointing clearly at the fix? Because the User base class also provides the low-level getOperand interface, and I cannot disallow that. C++ only lets me protect parts of the CallInst class...
Would a patch to make getOperand private in CallInst be accepted? Probably not now, but read on.
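To see why, here is a self-contained toy (the names mirror LLVM's, but this is not LLVM code): shadowing getOperand in the derived class catches direct uses, yet the same call through the base class still compiles.

  struct Value {};

  struct User {                        // the base class owns the operands
    Value *getOperand(unsigned) const { return 0; }
    void setOperand(unsigned, Value *) {}
  };

  struct CallInst : User {
  private:
    using User::getOperand;            // shadowed: private within CallInst
    using User::setOperand;
  public:
    Value *getCalledValue() const { return 0; }   // the blessed accessor
  };

  int main() {
    CallInst CI;
    User &U = CI;
    // CI.getOperand(0);               // compile error: caught!
    U.getOperand(0);                   // still compiles: the escape hatch
    CI.getCalledValue();
    return 0;
  }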

What aggravates the problem with private trees is file ownership. The engineer who detects the breakage is not entitled to fix even simple cases, but needs to lobby the project/file owner first. This results in additional inertia. (Disclaimer: I am not sure whether Apple has a file-ownership model internally.)

Surprisingly, the same thing that happened to me could in theory happen to any Apple engineer too. Imagine that some checkin to LLVM broke the dragonegg GCC plugin, which is effectively licensed under the GPLv3, so that no Apple engineer is allowed to inspect its sources. What would happen if the dragonegg maintainer backed the change out on the grounds that it "broke an important external project"?

What to do next? Now, whining is not one of the things I like to do, so let's advance in some way. Bill's patch, which I mentioned at the beginning, is a possible first step: I could rework a large portion of my patch in terms of getArgOperand(n-1) instead of getOperand(n-1), without actually changing the operand order for now. Incremental refactorings like this, which do not change functionality, are generally welcome in the LLVM world. Then I would depend on the goodwill of some Apple engineer to make a similar change in that internal project too. Finally, the switch (i.e. the operand order) could be flipped.
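As a concrete example of the kind of rewrite I have in mind (assuming the getArgOperand accessor from that checkin; indices per the old layout, where argument i is operand i+1):

  #include "llvm/Instructions.h"
  using namespace llvm;

  Value *secondArgument(CallInst *CI) {
    // Before: hard-codes the old (callee, arg1, arg2, ...) layout.
    //   return CI->getOperand(2);

    // After: expresses the intent and survives a later reorder.
    return CI->getArgOperand(1);
  }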

Why am I reluctant to begin? Because it is a lot of work; many new intrinsics have been introduced, so I will definitely get a bunch of merge conflicts; and finally, who knows, there might be another internal project that chokes, and the whole misery enters a new iteration.

Why do I feel that the change is urgent? Because LLVM is gaining popularity at an extraordinary speed. As more and more external projects use LLVM as a foundation, more and more code will exhibit the bad habit of using low-level interfaces. The few months after v2.7 are probably the last chance to make the switch in argument order before things become cemented de facto. Maybe it is too late already. That would be a pity, though: LLVM as a compilation infrastructure should be as fast and nimble as possible, and every one of its clients would profit.

So, dear external tree developers, I hope you get rid of the low-level calls and use the high-level ones instead. It should not cost you more than touching a couple of lines and retesting. I would be happy to assist you.

Regarding development policy, I would welcome a clear statement about what amount of testing in the LLVM ecosystem is "sufficient" and excludes the risk of a patch being backed out.

Bottom line, I'd love to get this patch wrapped up, but I am dependent on the support of external tree owners. Are you willing to help me?