This series of blog posts aims to give a short weekly glimpse into my (Florian Angeletti) work on the OCaml compiler.

Reviewing GitHub pull requests

Last week, I spent a significant portion of my time reviewing pull requests on the compiler, and I was fortunate to merge three nice pull requests:

Don’t suggest a semicolon when the type is not unit

This pull request by Jules Aguillon improves the new error report for applying a function to too many arguments. Before the change, the report looked like this:

List.map ((+) 1) [0;1;2] [2;3;4]
Error: The function 'List.map' has type ('a -> 'b) -> 'a list -> 'b list
       It is applied to too many arguments
File "test.ml", line 1, characters 23-25:
1 | List.map ((+) 1) [0;1;2] [2;3;4]
                           ^^
  Hint: Did you forget a ';'?
File "test.ml", line 1, characters 25-32:
1 | List.map ((+) 1) [0;1;2] [2;3;4]
                             ^^^^^^^
  This extra argument is not expected.

The pull request removes the hint whenever the expected result type of the application is not unit. This leaves us with a shorter, to-the-point error message:

List.map ((+) 1) [0;1;2] [2;3;4]
Error: The function 'List.map' has type ('a -> 'b) -> 'a list -> 'b list
       It is applied to too many arguments
File "test.ml", line 1, characters 25-32:
1 | List.map ((+) 1) [0;1;2] [2;3;4]
                             ^^^^^^^
  This extra argument is not expected.

Note that if you don’t recognize the error message format, this is expected: the previous version had itself already been considerably improved by an earlier PR by Jules.
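
To see why the hint only makes sense for unit, here is a minimal sketch of the two situations. The `log` buffer and the `record` function are hypothetical helpers introduced for this illustration; they are not part of the pull request.

```ocaml
(* A small unit-returning function, used so that the example is easy to check.
   Both [log] and [record] are hypothetical names for this illustration. *)
let log = Buffer.create 16
let record (s : string) : unit = Buffer.add_string log s

(* Forgetting the semicolon between the two calls would read as a single
   application of [record] to three arguments and be rejected by the type
   checker. Since [record] returns unit, a missing semicolon is a plausible
   explanation. The intended sequence is: *)
let () = record "a"; record "b"

(* Here the application returns a list, not unit, so a forgotten semicolon
   is an unlikely explanation for an extra argument. *)
let incremented = List.map ((+) 1) [0; 1; 2]
```

In the first case the extra arguments start a new statement once the semicolon is restored; in the second case there is no statement to sequence, which is why the hint is dropped.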

This pull request by Stefan Muenzel makes the error messages concerning non-generalizable type variables more explicit. The new error message points explicitly to all non-generalizable type variables in the involved types.

For instance, writing an ml file containing the single line

let x = Fun.id Fun.id, ref None

now raises

1 | let x = Fun.id Fun.id, ref None
        ^
Error: The type of this expression,
       ('_weak1 -> '_weak1) * '_weak2 option ref,
       contains the non-generalizable type variable(s): '_weak1, '_weak2.
       (see manual section 6.1.2)

Moreover, whenever the error happens in a submodule, the error message now points to the first value with a non-generalizable type:

module M = struct let x = ref [] end
File "test.ml", line 1, characters 0-36:
1 | module M = struct let x = ref [] end
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error: The type of this module, sig val x : '_weak1 list ref end,
       contains non-generalizable type variable(s).
       (see manual section 6.1.2)
File "test.ml", line 1, characters 22-23:
1 | module M = struct let x = ref [] end
                          ^
  The type of this value, '_weak1 list ref,
  contains the non-generalizable type variable(s) '_weak1.
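
For readers who hit this error, the usual ways out are eta-expansion and explicit type annotations. The following is a minimal sketch under that assumption; the names `id2` and `cell`, and the choice of `int` as element type, are hypothetical and only serve the illustration.

```ocaml
(* Eta-expansion: a syntactic function is a value, so its type generalizes
   to 'a -> 'a instead of '_weak1 -> '_weak1. *)
let id2 = fun y -> Fun.id Fun.id y

(* Annotation: the reference is monomorphic on purpose, so we say which
   type it holds and no weak type variable remains. *)
let cell : int option ref = ref None

(* The same idea applies inside a submodule. *)
module M = struct
  let x : int list ref = ref []
end
```

With these definitions, `id2` can be used at several types, while `cell` and `M.x` have fully known types.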

Better manual reference in error message

In order to implement the improved error message above, a better way to cite subsections in the manual was needed. Stefan Muenzel took the time to improve the manual cross-reference checker tool to handle this case. The manual cross-reference test checks that references in error messages and warnings are consistent with the section numbering of the manual by parsing the LaTeX-generated aux file. Until now, on the OCaml side, the test only handled chapter and section numbers.

With the change in Stefan’s PR, it is now possible to uniformly cite chapters, sections, and subsections (and subsubsections) of the manual in error messages.
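
To give an idea of what reading section numbers from an aux file involves, here is a rough sketch. It is not the code of the actual checker: it assumes the plain LaTeX `\newlabel{label}{{number}{page}}` line format (packages such as hyperref add more fields), and the function names `parse_newlabel` and `kind_of_depth` as well as the label `ss:example` are hypothetical.

```ocaml
(* Extract the label and the numbering from a line of the form
     \newlabel{ss:example}{{6.1.2}{42}}
   Returns None for any line that does not match this assumed shape. *)
let parse_newlabel (line : string) : (string * int list) option =
  let prefix = "\\newlabel{" in
  let plen = String.length prefix in
  let len = String.length line in
  if len < plen || String.sub line 0 plen <> prefix then None
  else
    match String.index_from_opt line plen '}' with
    | None -> None
    | Some label_end ->
      let label = String.sub line plen (label_end - plen) in
      (* The number is expected right after the two opening braces. *)
      let num_start = label_end + 3 in
      if num_start > len || String.sub line (label_end + 1) 2 <> "{{" then None
      else
        match String.index_from_opt line num_start '}' with
        | None -> None
        | Some num_end ->
          let number = String.sub line num_start (num_end - num_start) in
          let parts = String.split_on_char '.' number in
          let ints = List.filter_map int_of_string_opt parts in
          (* Reject numberings with non-numeric components. *)
          if List.length ints = List.length parts then Some (label, ints)
          else None

(* The depth of the numbering tells which kind of unit is cited. *)
let kind_of_depth = function
  | 1 -> "chapter"
  | 2 -> "section"
  | 3 -> "subsection"
  | 4 -> "subsubsection"
  | _ -> "unknown"
```

Treating the numbering as a list of integers is what lets chapters, sections, subsections and subsubsections be handled uniformly: only the length of the list changes.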

My pull requests

Two weeks ago, I finally found the time to propose a small change to the OCaml AST node for value bindings.

Explicit type constraints in value bindings

Consider the following value binding:

let pat : typ = exp

Before my change, the type constraint was stored both in the pattern node and (sometimes) in the expression node with a rather complex desugaring. This duplication of nodes made handling such constraints in ppxs more complicated than it ought to be, and it introduced some irregular encoding of type expressions.

With my change, the type constraint is stored directly in the value binding node, and the elaboration has been moved to the typechecker. Hopefully, soon we will no longer build parsetree nodes in the typechecker. However, in the meantime, the parsetree is now closer to the source language and simpler to transform with ppxs.
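
The difference can be sketched with a toy AST. The types below are deliberately simplified and hypothetical; they are not the real Parsetree definitions, and all constructor, field and function names are made up for the illustration.

```ocaml
(* Toy types, not the real Parsetree definitions. *)
type typ = Tconstr of string
type pat = Pvar of string | Pconstraint of pat * typ
type exp = Eint of int | Econstraint of exp * typ

(* Before: the annotation hides inside the pattern (and sometimes the
   expression), so it appears twice. *)
type binding_before = { b_pat : pat; b_exp : exp }

(* After: the annotation is a field of the binding itself. *)
type binding_after = { a_pat : pat; a_exp : exp; a_annot : typ option }

(* The binding [let x : int = 1] in both encodings. *)
let before =
  { b_pat = Pconstraint (Pvar "x", Tconstr "int");
    b_exp = Econstraint (Eint 1, Tconstr "int") }

let after =
  { a_pat = Pvar "x"; a_exp = Eint 1; a_annot = Some (Tconstr "int") }

(* A ppx-like query: which type annotation did the user write? *)
let annot_before b =
  match b.b_pat with
  | Pconstraint (_, t) -> Some t
  | Pvar _ -> None

let annot_after b = b.a_annot
```

In the second encoding, a transformation that needs the annotation reads one field instead of pattern matching on an encoding that it must also keep in sync with the expression.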

On-going discussions

In terms of medium-term projects, I discussed two interesting projects last week.

I spent some time discussing with Sébastien Hinderer his plans to simplify the build and dependencies of the OCaml dynlink library. Right now, this library is built with its own version of the compiler library (to avoid module name collisions), which introduces a lot of complexity in the compiler build system. After some discussions with Sébastien, we decided to try to isolate a core linking library that would be shared by both dynlink and the rest of the compiler library.

OCamltest DSL

The ocamltest DSL for the compiler test suite is currently inspired by org mode as a generic way to write trees of tests. However, with some more distance, it has become clearer that such a representation is not optimized for most tests in the compiler test suite. In particular, tests often have very long sequences of actions that fit the current representation poorly. We have thus been discussing an improved DSL for ocamltest for some time, and recently converged towards a new version.