Improved Cabal dependency tracking and integration with existing package managers

Passalaqua

Short description: Improving the cabal user experience and usability (especially for new users) by implementing a better dependency tracking method for identifying external executables and libraries, and by presenting better warnings and suggested actions whenever a native package manager is available and supported through cabal plugins. This would hopefully lead to more frequent hands-free installs and to file tracking through existing tools, avoiding duplication of package management functions in cabal.

Additional info: http://hackage.haskell.org/trac/summer-of-code/...

Project proposal

Currently cabal manages the installation of projects listed on HackageDB relatively well, provided (among other things) that all the necessary external tools are available and installed.

However, when certain tools are not available, such as alex or happy or some external library that cabal checks for with pkg-config, we lose the automated aspect of compiling and installing packages, and users are left confused and frustrated. That is particularly bad for new users, or for people who just wish to use a certain Haskell program available on HackageDB (and who could be potential Haskell users) for which their OS does not have a pre-packaged version.

Main Goal 

This project is focused on improving this aspect of user interaction with Cabal by improving the dependency tracking when installing cabal programs and libraries.

Specific Goals

1. Improve the dependency tracking

The goal is to reduce frequent late installation failures due to unmet dependencies by performing an upfront dependency check and warning the user if needed. If cabal can install a package once the proper dependencies are met, it should also be able to quickly verify whether all (or at least most) of them are met and warn the user about any problems upfront. Any problems can then be solved before giving the user the impression that the installation will succeed.

As it is, cabal install only correctly performs an upfront dependency check when entirely relying upon libraries available on HackageDB. Otherwise, the installation will fail later in the process when attempting to build a package with unresolved and unchecked external dependencies.

It is possible to extend the package dependency checking to include any external executables used, as well as libraries checked through pkg-config fields. This extended dependency checking can then warn the user (with the option to override) so that the proper steps can be taken before actually installing. This is particularly important when compiling large projects, where some users prefer to issue the command and work on something else while cabal does its magic. Coming back after 30 minutes or so to find out that it stopped after two or three minutes instead can be particularly frustrating.
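The upfront check described above can be sketched as follows. This is only an illustration under stated assumptions, written in Python for brevity; an actual implementation would live in Haskell inside cabal. The names ExternalDeps and missing_dependencies, and the shape of the install plan, are hypothetical. The only external command used is pkg-config with its --exists flag.

```python
import shutil
import subprocess
from dataclasses import dataclass, field


@dataclass
class ExternalDeps:
    """External requirements of one package (hypothetical structure)."""
    tools: list = field(default_factory=list)      # e.g. ["alex", "happy"]
    pkgconfig: list = field(default_factory=list)  # e.g. ["gtk+-2.0"]


def tool_available(name):
    """True if an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


def pkgconfig_available(name):
    """True if pkg-config knows the library.

    Returns False when pkg-config itself is not installed.
    """
    if shutil.which("pkg-config") is None:
        return False
    result = subprocess.run(["pkg-config", "--exists", name])
    return result.returncode == 0


def missing_dependencies(plan, has_tool=tool_available,
                         has_lib=pkgconfig_available):
    """Check every package in the install plan before anything is built.

    `plan` maps package names to ExternalDeps. Returns a list of
    (package, kind, dependency) triples; the list is empty when every
    external dependency was found.
    """
    missing = []
    for package, deps in plan.items():
        for tool in deps.tools:
            if not has_tool(tool):
                missing.append((package, "tool", tool))
        for lib in deps.pkgconfig:
            if not has_lib(lib):
                missing.append((package, "pkg-config", lib))
    return missing
```

Because the whole plan is inspected before the first package is built, every problem is reported at once rather than one per failed build. The two check functions are passed as parameters so that the user override mentioned above (or a test) can replace them.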

Given some common use cases (such as happy, alex or even gtk2hs), I believe it is also worthwhile to present the user with suggested common solutions. This is similar in spirit to how GHC complains about missing extension flags but suggests them, which greatly speeds up the user's response.

2. Implement an interface for cabal to interact with package managers

During this step, I shall take advantage of all the dependency information above to suggest packages that meet the requirements. When issuing an install command, if a dependency is not met it should be possible to recommend the proper actions and act on the user's behalf, after asking for confirmation.

For this to work, cabal would need to be environment-aware. That is, with the proper plugin for a particular platform/package manager, cabal could suggest the proper actions to be taken.

For example, say the user is running Gentoo and has the (yet to be implemented) gentoo-plugin for cabal. Suppose this user requests an install through cabal of some library on HackageDB that depends on alex, but alex was not found on the path. Cabal can then verify whether it has installed alex itself in the past or whether alex has been installed through portage. If it has, then cabal can warn the user that there is probably some PATH-related issue for that executable that needs to be fixed. If it hasn't, then cabal can ask the user whether it should issue emerge alex. If that installation is successful, then cabal can continue installing the originally requested package automatically.

This step would be implemented as an interface between cabal and environment-aware plugins that would be responsible for interacting with native package managers. As a proof of concept I would also implement a few of those plugins with basic functionality. Initially I have Debian/Ubuntu, Gentoo and Windows + Cygwin (if even possible) in mind, although OS X + fink is also a possibility with a little help from some Mac-enabled friends.

Each plugin would be responsible for checking for dependencies on the system, determining which of them can be met by which native packages by keeping its own internal mapping, and issuing the proper install commands whenever possible.
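A minimal sketch of such a plugin interface, assuming the Gentoo scenario described earlier, could look as follows. It is written in Python for brevity and is not a proposed API: PackageManagerPlugin, GentooPlugin, suggest_action and the contents of the mapping table are all hypothetical, and the list of installed packages is passed in by hand as a stand-in for a real Portage query. The only command taken from the scenario is emerge.

```python
from abc import ABC, abstractmethod


class PackageManagerPlugin(ABC):
    """Hypothetical interface between cabal and one native package manager."""

    @abstractmethod
    def native_package(self, dependency):
        """Return the native package providing `dependency`, or None."""

    @abstractmethod
    def is_installed(self, native_name):
        """True if the package manager reports the package as installed."""

    @abstractmethod
    def install_command(self, native_name):
        """Return the command (as an argument list) that installs it."""


class GentooPlugin(PackageManagerPlugin):
    # Illustrative mapping only; a real plugin would keep a fuller table.
    MAPPING = {"alex": "alex", "happy": "happy"}

    def __init__(self, installed):
        # Stand-in for asking Portage which packages are installed.
        self._installed = set(installed)

    def native_package(self, dependency):
        return self.MAPPING.get(dependency)

    def is_installed(self, native_name):
        return native_name in self._installed

    def install_command(self, native_name):
        return ["emerge", native_name]


def suggest_action(dependency, plugin=None):
    """Decide what to tell the user about a tool that is not on PATH.

    Returns a (kind, detail) pair:
      "report": no plugin or no known native package, plain error message
      "warn":   installed natively, so PATH is probably misconfigured
      "ask":    detail is the install command to run after confirmation
    """
    if plugin is None:
        return ("report", f"{dependency} was not found")
    native = plugin.native_package(dependency)
    if native is None:
        return ("report", f"{dependency} was not found")
    if plugin.is_installed(native):
        return ("warn", f"{dependency} is installed but not on PATH")
    return ("ask", plugin.install_command(native))
```

For instance, suggest_action("alex", GentooPlugin(installed=[])) yields the "ask" case with the command emerge alex, while calling it without a plugin falls back to a plain report, so platforms without a plugin see the same message they would get without this feature.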

It is worth noting that I plan to implement this in such a way that if no such plugin is available, no loss of functionality should occur. Unsupported platforms or platforms with limited support should not be negatively affected by this feature. And naturally, overriding any suggestions should always be possible, for testing purposes or for when support for a particular platform becomes slightly outdated.

At this point I expect to have met my original usability improvement goals already.

3. Extend cabal so that extracting information for external packages gets easier

As described on the ticket listed in "Additional info", I took inspiration from the g-cpan project for Gentoo[1], which allows the automatic creation of packages (ebuild files) for Perl modules available on CPAN and their automatic registration into Gentoo's package manager.

All the information required for building packages should already be available at this point. It should therefore be relatively straightforward to expose it so that external tools can more or less automatically generate native packages with all the proper dependencies listed. Extra information, such as the licence and how to register or unregister packages with ghc, should also be exposed so that the resulting package can be installed or uninstalled with no (or as few as possible) system side effects.
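As an illustration of what exposing this information might look like, the sketch below serialises the data into JSON for an external packaging tool to consume. It is written in Python for brevity, and everything about the format is an assumption: the function name export_package_info, the field names and the .conf file name are hypothetical. Only the ghc-pkg register and ghc-pkg unregister subcommands are real.

```python
import json


def export_package_info(name, version, license_name,
                        haskell_deps, tools, pkgconfig_libs):
    """Serialise what a native-package generator would need to know.

    The field names are hypothetical; the point is that Haskell
    dependencies, external tools and pkg-config libraries are listed
    separately, next to the licence and the ghc registration commands.
    """
    package_id = f"{name}-{version}"
    info = {
        "name": name,
        "version": version,
        "license": license_name,
        "dependencies": {
            "haskell": sorted(haskell_deps),
            "tools": sorted(tools),
            "pkg-config": sorted(pkgconfig_libs),
        },
        # Hypothetical file name for the package registration file.
        "register": ["ghc-pkg", "register", f"{package_id}.conf"],
        "unregister": ["ghc-pkg", "unregister", package_id],
    }
    return json.dumps(info, indent=2, sort_keys=True)
```

A tool that generates ebuild, deb or RPM recipes could read this output and translate each dependency list into its own format, using the register and unregister entries for its post-install and pre-removal steps.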

The main point of this step is to allow users to quickly build packages for local use. It is usually a good idea not to circumvent package managers in the long run, and this could allow projects to be installed quickly without having to wait for potentially long testing and update procedures. It should also be possible to use this functionality for generating package recipes to be hand-tuned before actually being uploaded to official repositories.

I am aware of some work done on both Debian[2] and RPM[3] based systems on tools that automatically derive dependencies from cabal packages, and these could be useful in implementing this project on some platform families.

However, both projects seem to focus only on this third step. Neither implements most of the changes to cabal described in step 2 above. The dependency tracking is there, but it requires the user to invoke external tools instead of being integrated into cabal itself. And the second one (along with similar ones, though it is the most relevant for this project that I could find so far) seems to be limited to dependency list generation and has not been updated in quite a while.

By providing a unified API for this step I hope to reduce repeated work across platforms and make it easy to increase the number of supported platforms. Ultimately, I also expect to allow users to take advantage of both HackageDB and full-featured native package managers with no duplication of packages between them, for example by installing a native and tested alex instead of building it from source, unless otherwise requested. All this without actually turning cabal itself into yet another full-fledged package manager[4].

About the Student

I am a student at the Masters' Programme in Computing Science, Program Technology track, at Utrecht University, currently enrolled regularly and expecting to graduate by August 2013.

This project aligns very well with my main motivation for pursuing a Masters' degree (and beyond!), which is the development and improvement of advanced programming tools with better user interfaces. Better error reporting, fewer distractions when solving programming problems and better programming interfaces to allow for easier development are among my top concerns.

I believe that developing this project is a good way to reduce the difficulty and frustration that Haskell users, who are also potential new Haskell programmers, may experience.

I have prior experience working on and using open source software for development (often in the position of tester rather than developer, however) since around 2005 when I joined the aMule team as a tester. However, most of my code ever since has been for private use and not published or maintained publicly.

Since August 2011 I have been dedicating my time to actively integrating myself into the Haskell community by attending several HUG meetings, focusing my studies on the language whenever possible and interacting directly with colleagues and teachers who are already part of the community*. During this period I have also had daily contact with the language and its tools on both Windows and Linux, through mandatory courses as well as side projects. Side projects include updating some projects that had been developed at my current university, such as the Proxima Editor, and identifying shortcomings in Haskell tools. I also had some contact with cabal code through the uuagc-plugin code used by the UU Attribute Grammar Compiler.

Promoting the improved functionality described above is something I have been doing slowly in the form of informal conversations, both online and in person, with other Haskell users, some more experienced, some less. Although there is currently no published Haskell code written by me, I expect the motivation behind my Masters' studies and the references I may provide from people already in the community to show that I have both the motivation and the competence to complete the project described above, should I get the financial and moral backing.

* - As this proposal is publicly visible, I am not comfortable giving any names without first consulting those persons. I will do so in private if requested, however.

References 

  1. http://www.gentoo.org/proj/en/perl/g-cpan.xml
  2. http://hackage.haskell.org/package/debian
  3. http://hackage.haskell.org/package/cabalrpmdeps
  4. http://ivanmiljenovic.wordpress.com/2010/03/15/repeat-after-me-cabal-is-not-a-package-manager/