Gagallium : Testing OCaml releases with opamcheck

I (Florian Angeletti) started working at Inria Paris this August. Part of my new job is to help with the day-to-day care of the OCaml compiler, particularly during the release process. This blog post is a short glimpse into the life of an OCaml release.

OCaml and the opam repository

Currently, the song of the OCaml development process is a canon with two voices: a compiler release spends the first 6 months of its life as the “trunk” branch of the OCaml compiler git repository. Then, after those first 6 months, it is named and given a branch of its own. For instance, this happened on October 18, 2019 for OCaml 4.10. From this point on, the branch is frozen: only bug fixes are accepted, whereas new development happens in trunk again. Our objective is then to release the new version 3 months later. If we succeed, there are at most two branches active at the same time.

    Dev
    Dev
    Dev
    Dev
    Dev
    Dev
    Bug  Dev
    Bug  Dev
    RCs  Dev
         Dev
         Dev
         Dev
         Bug  Dev
         Bug  Dev
         RCs  Dev
              Dev
              Dev
              Dev
              Bug
              Bug
              RCs

However, the OCaml compiler does not live in isolation. It makes little sense to release a new version of OCaml that is not compatible with the other parts of the OCaml ecosystem.

The release cycle of OCaml 4.08 was particularly painful from this point of view: we refactored parts of the compiler API that were not previously versioned by ocaml-migrate-parsetree, which made ocaml-migrate-parsetree more difficult to update. In turn, without a working version of ocaml-migrate-parsetree, ppxs could not be built, breaking all packages that depend on ppxs. It took months to correct the issue. This schedule slip affected the 4.09.0 release and can still be felt on the 4.11 schedule.

Catching knives before the fall

Lesson learned: we need to test the packages in the opam repository more often. Two tools currently in use can automate such testing: opamcheck and opam-health-check.

The two tools attack the problem from different angles. The opam-health-check monitoring tool was developed to check the health of the opam repository for released OCaml versions.

In a complementary way, opamcheck was built by Damien Doligez to check how well new versions of the OCaml compiler fare in terms of building the opam repository.

A typical difference between opamcheck and opam-health-check is that opamcheck is biased towards newer versions of the compiler: if an opam package builds on the latest unreleased version of the compiler, we don’t need to test it with older compilers. After all, we are mostly interested in packages that are broken by the new release. The handful of packages that may be coincidentally fixed by an unreleased compiler are at most a curiosity; pruning those unlikely events saves us some precious time.
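The newest-first pruning described above can be sketched in a few lines of shell. This is only a toy model and not opamcheck's actual implementation: the function try_build, the package names and the table of build outcomes are all hypothetical, and the assumption that older compilers are tried one by one after a failure is mine.

```shell
# Hypothetical stand-in for a real build attempt: succeeds when package
# "$1" builds with compiler "$2". The outcomes are hard-coded here.
try_build() {
  case "$1:$2" in
    pkg_a:4.10.0 | pkg_b:4.09.0) return 0 ;;
    *) return 1 ;;
  esac
}

# Compilers ordered from newest to oldest.
compilers="4.10.0 4.09.0 4.08.1"

# Report the status of one package, trying older compilers only when the
# build fails on the newest one.
check_package() {
  pkg=$1
  set -- $compilers
  newest=$1
  shift
  if try_build "$pkg" "$newest"; then
    echo "$pkg: ok on $newest, older compilers skipped"
    return 0
  fi
  for c in "$@"; do
    if try_build "$pkg" "$c"; then
      echo "$pkg: broken on $newest, last known good is $c"
      return 0
    fi
  done
  echo "$pkg: fails on every tested compiler"
}

check_package pkg_a   # builds on the newest compiler: a single build attempt
check_package pkg_b   # broken by the newest compiler: worth investigating
check_package pkg_c   # never builds: probably unrelated to the compiler
```

The interesting case for a release manager is the second one, where a package builds on an older compiler but not on the newest.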

Since I started at Inria, in the midst of the first beta of OCaml 4.09.0, I have been working with opamcheck to monitor the health of the opam repository.

The aim here is twofold. First, we want to detect, in advance, expected breakages that are just a sign that a package needs to be updated. The earlier we catch those, the more time the maintainers have to patch their packages before the new release. Second, we want to detect unexpected compatibility issues and breakages.

One fun example of such an unexpected compatibility issue appeared in the 4.09.0 release cycle. When I first used opamcheck to test the state of the first 4.09.0 beta, there was one quite problematic broken package: dune. This was stunning at first, because the 4.09.0 version of OCaml contained mostly bug fixes and small quality-of-life improvements. That was at least what I had told a few worried people a few days before…

So what was happening here? The issue stemmed from a small change of behaviour in the presence of missing cmis: dune was relying on an unspecified behaviour of the OCaml compiler in such cases, and this behaviour had been altered by a mostly unrelated improvement in the typechecker.

This change of behaviour was patched and dune worked fine in the second beta release of 4.09. And this time, the next run of opamcheck confirmed that 4.09.0 was a quiet release.

This is currently the main use of opamcheck: checking the health of the opam repository on unreleased versions of OCaml, before the more extensive coverage of opam-health-check takes over. One of our objectives for the future 4.10.0 release is to keep a much more extensive test coverage before the first beta.

Opam and the PRs

There is another possible use that is probably much more useful to the anxious OCaml developer: opamcheck can be used to check that a PR or an experimental branch does not break opam packages. A good example is #8900: this PR proposes to remove the special handling of abstract types defined inside the current module. This special case looks nice locally, but it makes it possible to write code that is valid if and only if it is located in the right module, with no way to correct this behaviour by making module signatures more precise.

It is therefore quite tempting to try to remove this special case from the typechecker, but is it reasonable?

This was another task for opamcheck. First, I added a new opamcheck option to easily check any pull request on the OCaml compiler. After some work, there was some good news: this pattern is mostly unused in the current opam repository.

Knowing whether any opam packages rely on this feature is definitely a big help when making those decisions.

Using opamcheck

So if you are a worried OCaml developer and want to test your fancy compiler PR on the anvil of the opam repository, what are the magical incantations?

One option is to download the docker image octachron/opamcheck with

docker pull octachron/opamcheck

Beware that the image weighs around 7 GiB. If you want to build opamcheck locally, you first need to clone the current opamcheck repository

git clone https://github.com/Octachron/opamcheck.git

You probably need to install the following opam packages

opam install minisat opam-file-format

And run the common magic

cd opamcheck
make

Now, there are two modes of use: you can launch opamcheck directly (or inside a VM), or use the available dockerfiles. In this short blog post, I will present the latter option: it has the advantage of being relatively lightweight in terms of configuration, and makes it easier to test your legions of PRs simultaneously (you don’t have legions of PRs, do you?). If you took the manual route above, you first need to build the image with

make docker

This installs all external dependencies in the docker image. That may take a while (and a good amount of space).

Once the image is built or downloaded, there are three main options to run it. If you want to compare several versions of the compiler (given as switch names), let’s say 4.05.0 and 4.08.1+flambda, you can run:

docker run -v opamcheck:/app/log -p 8080:80 --name=opamcheck opamcheck run -online-summary=10 4.05.0 4.08.1+flambda

The --name option sets the name of the docker container. The -p option maps port 80 of the container to port 8080 of the host; this is used to connect to the HTTP server embedded in the image. Finally, the -v option specifies where the opamcheck log directory is mounted in the host file system. If you forget this option, a random docker volume will be used for the log. Here, it will be at /var/lib/docker/volumes/opamcheck.

During an opamcheck run, the progress can be checked with either

sudo tail -f /var/lib/docker/volumes/opamcheck/_data/results

or by pointing a web browser to localhost:8080/fullindex.html. Note that the first summary is only generated after the OCaml compiler is built and all uninstallable packages have been discovered. On my machine, this amounts to a wait of about 15 minutes before the first summary is generated. Later updates should be more frequent.
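If you are not sure where docker stored a named volume, you can ask docker instead of guessing the path. The sketch below assumes that the log volume is named opamcheck, as in the run command above, and that the progress file is called results; the helper volume_results_path is a hypothetical name of mine, not part of opamcheck.

```shell
# Print the host path of the "results" file inside a named docker volume.
volume_results_path() {
  name=$1
  # Ask docker where the volume lives; fall back to the default location
  # of local volumes if docker cannot answer (docker missing, permission
  # denied, or unknown volume).
  mountpoint=$(docker volume inspect --format '{{ .Mountpoint }}' "$name" 2>/dev/null) \
    || mountpoint="/var/lib/docker/volumes/$name/_data"
  echo "$mountpoint/results"
}

# Typical use while a run is in progress:
#   sudo tail -f "$(volume_results_path opamcheck)"
```

On machines where docker itself requires root, the function silently falls back to the default path, which is usually the right one anyway.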

The result should look like this summary run for OCaml 4.10.0. The integer parameter in -online-summary=n corresponds to the update period for this html summary. If the option is not provided, the html summary is only built at the end of the run.

If you are more interested in testing a specific PR, for instance #8900, the prmode will work better:

docker run --name=opamcheck opamcheck prmode -pr 8900 4.09.0

This command tries to rebase the given PR on top of the given OCaml version (switch name), and fails immediately if the PR cannot be rebased. In this case, you should use the latest ‘trunk’ switch as the base or use the branch option, described a bit below. When possible, it is a good idea to use a released version as the base, as it will be compatible with more opam packages than the current trunk.
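Since prmode gives up when the rebase fails, it can be worth checking beforehand, in a local clone of the compiler, that the PR rebases cleanly. The following is a minimal sketch using plain git, not a description of what opamcheck does internally. The function name rebases_cleanly and the scratch branch rebase-check are hypothetical, and I assume that the switch name matches a git tag of the compiler repository.

```shell
# Succeed when branch "$1" can be rebased onto "$2" without conflicts.
# The rebase happens on a scratch branch, so "$1" itself is left untouched.
rebases_cleanly() {
  branch=$1
  base=$2
  # Create (or reset) the scratch branch at the tip of "$branch".
  git checkout --quiet -B rebase-check "$branch" || return 2
  if git rebase --quiet "$base" >/dev/null 2>&1; then
    return 0
  else
    # Leave the repository in a clean state after a conflict.
    git rebase --abort >/dev/null 2>&1
    return 1
  fi
}

# Inside a clone of https://github.com/ocaml/ocaml:
#   git fetch origin pull/8900/head:pr-8900
#   rebases_cleanly pr-8900 4.09.0 && echo "4.09.0 can be used as base"
```

If the check fails, trunk or the branch option is the safer choice.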

If the branch that you want to test is not yet a PR, or needs some manual rebasing to be compared against a specific compiler version, there is a branch flag. For instance, let’s say that you have a branch “my_very_experimental_branch” at the location nowhere.org. You can run

docker run --name=opamcheck opamcheck prmode -branch https://nowhere.org:my_very_experimental_branch 4.09.0

This command downloads the branch at nowhere.org and compares it against the 4.09.0 switch.

Currently, a full run of opamcheck takes one or two days: you will likely get the results before your first PR review. A limitation is the false positive rate: most opam package descriptions are incomplete or out of date, so packages will fail for reasons unrelated to your PR. Unfortunately, this means that there is still some manual triage needed at the end of an opamcheck run.
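One way to shorten the manual triage is to set aside the packages that already fail without your change. The sketch below assumes two hand-made text files with one failing package per line, one for the base compiler and one for the PR; these files and the function new_failures are hypothetical, and opamcheck does not produce them in this form as far as this post is concerned.

```shell
# Print the packages listed in "$2" (failures with the PR) that are absent
# from "$1" (failures on the base compiler).
new_failures() {
  base_sorted=$(mktemp)
  pr_sorted=$(mktemp)
  sort -u "$1" > "$base_sorted"
  sort -u "$2" > "$pr_sorted"
  # comm -13 keeps only the lines that are unique to the second file.
  comm -13 "$base_sorted" "$pr_sorted"
  rm -f "$base_sorted" "$pr_sorted"
}

# Example with hypothetical file names:
#   new_failures failures-4.09.0.txt failures-pr8900.txt
```

The packages printed this way are the ones worth reading build logs for first.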

There are four main objectives for opamcheck in the coming months:

improve the usability
share more code with opam-health-check, at least on the frontend
reduce the false positive rate
reduce the time required by a CI run

If you want to check on future development for opamcheck, and a potentially more up-to-date readme, you can have a look at Octachron/opamcheck.