2012/08/10

Incorporating C code in an OCaml project using Ocamlbuild

I am currently developing the code generation phase of my L programming language, using LLVM. LLVM provides OCaml bindings for building LLVM code and executing it, but there is no easy way to inspect the result of execution when complex data structures are produced.

I therefore wrote some C stubs to do this. It turned out that writing them was not hard at all, and the OCaml documentation on the subject is quite good. The most difficult part was fighting Ocamlbuild to incorporate the C stubs into the project.

A simple example project

To get started, here is a small example project with two OCaml files and one C file.

  • File toto_c.c:
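#include <stdio.h>
/* Simplified stub: it ignores the unit argument and returns nothing.
   Conventional stubs take and return `value` (see <caml/mlvalues.h>). */
void toto(void)
{
  printf("Hello from C\n");
}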
  • File toto.ml:
external toto_a: unit -> unit = "toto";;

let toto_b () = toto_a(); toto_a();;

This file provides the "OCaml part" of the "toto" library: the additional definition toto_b is implemented in OCaml, not in C.

In simpler implementations where the library has no OCaml definitions, or to separate the interface from the implementation, the external declarations could be put in a standalone .mli file (e.g. toto.mli).

  • File main.ml:
Toto.toto_a();;
Toto.toto_b();;
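
The external mechanism used in toto.ml can be tried without writing any C, by binding to a primitive that the OCaml runtime already contains. The sketch below mirrors the toto_a/toto_b pattern under that assumption: format_int and format_twice are illustrative names, and "caml_format_int" is assumed to be the runtime primitive that the standard library itself binds for integer formatting.

```ocaml
(* Sketch: same shape as toto.ml, but the external is bound to a primitive
   that already lives in the OCaml runtime, so no C file and no -custom
   flag are needed. Assumption: "caml_format_int" is that primitive. *)
external format_int : string -> int -> string = "caml_format_int"

(* Like toto_b: an extra definition written in OCaml on top of the external. *)
let format_twice n = format_int "%d" n ^ format_int "%d" n

(* Prints 4242 *)
let () = print_endline (format_twice 42)
```

Because the primitive is part of the standard runtime, this file links with a plain ocamlc or ocamlopt invocation; the extra link steps described in this post are only needed for stubs that live in your own C files.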

The toto example can be compiled by hand, from a Makefile or a shell script, as follows:

ocamlc -c -o toto.cmo toto.ml
ocamlc -c -o main.cmo main.ml
gcc -c toto_c.c # or ocamlc -c toto_c.c
ocamlc -custom toto_c.o toto.cmo main.cmo -o main.byte 

Notes:

  • The -custom flag builds a custom runtime, i.e. it embeds an OCaml interpreter, linked with toto_c.o, in the main.byte file.
  • Here I compiled the C file using gcc, but it may also be built using ocamlc (the benefit being that ocamlc passes the correct include path options to gcc).

Using ocamlbuild flags

A very simple solution can be obtained by passing flags to ocamlbuild:

ocamlbuild toto_c.o
ocamlbuild main.byte -lflags -custom,toto_c.o

This approach could be used, for instance, in a project that uses a Makefile as a driver on top of ocamlbuild.

My current "pure ocamlbuild" solution

Building this with Ocamlbuild alone, without passing the link flags by hand, requires a custom plugin: regular OCaml code placed in a myocamlbuild.ml file, which ocamlbuild compiles automatically.

Note: the user manual of ocamlbuild does not explain how to write a plugin, but the wiki http://brion.inria.fr/gallium/index.php/Ocamlbuild has several pages that help.

After trying complicated alternatives, the solution I found consisted in writing a simple plugin adding a "linkdep" parameterized tag. When the operation is a link, this tag:

  • adds the file to the dependency list (so that it is also built)
  • adds it to the list of files to be linked (for some reason, when the "link" tag is active, i.e. the command is a link command, Ocamlbuild adds the files in the dependency list to the list of input files to be linked).

The ocamlbuild plugin is rather small: just put the following in the myocamlbuild.ml file.

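(* myocamlbuild.ml: compiled and loaded automatically by ocamlbuild *)
open Ocamlbuild_plugin;;

dispatch (function
  | After_rules ->
    (* linkdep(f): when linking, add the file f to the dependencies *)
    pdep ["link"] "linkdep" (fun param -> [param])
  | _ -> ())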

What this plugin does is "register that whenever the link and linkdep tags are set, the file passed as a parameter to the linkdep tag is added to the list of dependencies".

Then the _tags file only has to contain this:

<*.byte>: linkdep(toto_c.o), custom
<*.native>: linkdep(toto_c.o)

And the whole application can be simply built with:

ocamlbuild main.byte
ocamlbuild main.native

When the main.byte or main.native files are built (i.e. linked), the toto_c.o file is added as a dependency (and to the list of files that are linked). Additionally, the custom tag adds the -custom option to the OCaml linker command line.

Linking with a .a library

If several C object files have to be linked, the same solution also works with a library, as follows:

  • Create a file named libname.clib containing the list of object files: for instance my libtoto.clib contains toto_c.o
  • Change the _tags file to require libtoto.a instead of toto_c.o: the _tags file becomes:
<*.byte>: linkdep(libtoto.a), custom
<*.native>: linkdep(libtoto.a)

And that's all. Ocamlbuild knows how to build .a files from .clib files, and will add the contents of the .clib file as dependencies of the .a file.

Note that it is important that the name of the library file begins with "lib", or the ocamlbuild rules will not apply. The command ocamlbuild -documentation provides the list of rules.

Possible enhancements

There are several possible enhancements that could be brought:

  • The scheme presented here is sufficient for simple C stubs, but compiling more complex C files would require changing the CFLAGS of the C compiler (in particular the include directories), and adding new libraries to the lflags. http://mancoosi.org/~abate/ocamlbuild-stubs-and-dynamic-libraries and http://brion.inria.fr/gallium/index.php/Ocamlbuild_example_with_C_stubs deal with this problem by changing the plugin to add an "includename" flag. I think this idea could be generalized to add a parametrized include tag.
  • I think it is a pity that one has to tag the final executable with the linkdep. It would have been cleaner to tag the ml files that depend on the C stubs, so that linking the final executable is independent of whether one of the modules uses a C stub or not. OCaml provides a means to record the list of C libraries needed by an OCaml library (.cma), but not by individual objects (.cmo), so there is no simple way to do that.
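
As a rough illustration of the first idea, a parametrized include tag might be added next to linkdep along the following lines. This is an untested sketch, not a verified plugin: the "cinclude" tag name is hypothetical, and it assumes that ocamlbuild compiles C files through the OCaml compiler driver with the "c" and "compile" tags active, so that -ccopt reaches the C compiler.

```ocaml
(* myocamlbuild.ml: sketch only; the "cinclude" tag name is hypothetical. *)
open Ocamlbuild_plugin;;

dispatch (function
  | After_rules ->
    (* The linkdep tag from the plugin above, unchanged. *)
    pdep ["link"] "linkdep" (fun param -> [param]);
    (* cinclude(dir): when compiling a C file, forward -Idir to the C
       compiler through the OCaml driver's -ccopt option.
       Assumption: C files are compiled with the "c" and "compile" tags. *)
    pflag ["c"; "compile"] "cinclude"
      (fun dir -> S [A "-ccopt"; A ("-I" ^ dir)])
  | _ -> ())
```

With such a plugin, the _tags file would presumably tag the C source itself, e.g. <toto_c.c>: cinclude(some/include/dir), where the directory is a placeholder.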

Conclusion

After spending some time fighting with ocamlbuild, I ended up with a working but less-than-satisfactory result: for instance, the library is added to all targets. I also remember having a similar fight when building custom Camlp4 extensions.

I usually like using Ocamlbuild, which allows building medium-sized OCaml projects with little or no configuration, especially with the recent support for findlib packages. But I am not a fan of having to write OCaml code to build my project properly. I think that Ocamlbuild could get better if more parametrized tags were provided by default, such as tags for adding dependencies, compile flags, and so on. But maybe the _tags file format is not sufficient for that.

So maybe I should keep the dual make/ocamlbuild solution. It may be the simplest to maintain, as it does not require a knowledge of the internals of ocamlbuild that I am likely to forget.

Maybe I should also try using OMake instead, which would allow a unified solution. And Ocamlbuild will not be the adequate tool for building L projects anyway, once I start bootstrapping the compiler. But OMake seems not to be maintained anymore…
