Why b0 ?

Note. This was Written in 2017 and is history by now, but it has aged well and still reflects B0's vision.

A few high-level motivating arguments. To see how b0 supports these points see the manuals.

Goals

B0 is a system to assist the programmer during the whole software construction process, from development to deployment. It is a generic system that integrates, using modular descriptions written in OCaml: configuration, build, source or binary deployment and custom software life-cycle operations.

Why a generic and modular system ?

B0 provides both genericity and domain specific usability by rethinking how builds are to be described, organized and (re)executed.

Most software projects need to deal with special build cases, massage and integrate data sources, interact with other languages and adapt to specific systems or deployement environments. In these projects the closed world assumption on which language specific build systems rely quickly degrades the user experience to a series of painful and time consuming workarounds that are better solved by a generic system.

However a good generic system must be able to express easy to use domain specific descriptions that match the user experience of a language specific system. Besides these descriptions should be modularized, distributed and reused as we do with software: via libraries.

Why OCaml as a description language ?

B0 is an EDSL that provides build description expressiveness, reuse and distribution from the onset.

Build system DSLs tend to constrain their expressive power and rely on the brittleness of shell script language conventions. Most of the time this ends up not being sufficient: general computation, abstraction and datatypes beyond strings are needed. This results in users metaprogramming build DSLs using other tools and languages hereby adding a layer of indirection and complexity only to be able to deal with a defective one.

B0 does not follow the DSL road again but instead, treats the root cause: the user is provided with the full power of OCaml and its rich datatype and abstraction capabilities. This allows to devise high-level and abstract build descriptions that are directly "compiled" down to low-level build operations without going through an indirection layer.

At the same time this trivially solves the description reuse and distribution problem by piggybacking on the OCaml packaging infrastructure.

A simpler build model

B0 provides a simple build model that permits build description abstraction and directly matches both the operational and user mental model of the build system.

Build descriptions are at the center of the system. They must be easy to define, understand and reuse. Their executions and outcomes must be easy to debug, fast and correct.

In order to acheive this b0 departs from the prevalent rule-based build model which exposes concepts that are not needed and does not match the user's mental model (see the annex at the end of this file for a discussion). B0 sees a build system as a direct sequence of build operations that perform side effects on the file system. In order words: a build script written in OCaml.

This procedural approach makes it easy to expose at the API level, build fragments at various abstraction levels, e.g. the actual building blocks of "easy to use" domain specific descriptions.

Build operations are annoted with precise specification of their input (environment, file reads, tool binary) and output (allowed exit codes, file writes). These specifications allows to see them as pure and deterministic functions. In turn this makes it trivial for the build script to be automatically parallelized and incrementally re-executed by memoizing the "functions" via an on-disk cache.

Metadata and build outcome introspection

B0 provides support for build understanding and DRY custom software life-cycle operations by keeping track of low and high-level metadata about build outcomes.

Low-level metadata about the build is automatically serialized by the system. This includes the precise configuration that was used for the build and all the build operations performed and information about their execution. This provides a strong ground for build understanding and debugging. For example, any written file can be queried for the actual build operation that wrote it, build parallelism can be studied after the fact using trace analyzers, etc.

For high-level information b0 allows to attach custom typed metadata to the essential components it deals with: source and built files, statically known build units and packages. Part of this metadata is known before the build starts and part of it may depend on the build configuration. In the end all that information is serialized alongside the build outcomes. This is useful for driving the build (e.g. looking up and compiling against vendored library dependencies) and for further processing the source or build outcomes in a DRY manner (e.g. for deployment or packaging).

Deployment and software life-cycle operations

B0 provides custom and reusable support for massaging, distributing and deploying the project sources and build outcomes.

Software needs to be tested and distributed in order to be run. This means disseminating and running build outcomes or sources at different locations: locally, on servers, on devices, in package repositories, staging areas, etc. Traditionnaly this happens through an insipid soup of custom shell scripts loosely tied to (re)invocations of the build system.

In b0 custom deployment procedures and software life-cycle operations are fully integrated in the project description via the notion of deployment and hooks: customizable procedures to stage sources or build outcomes resulting from multiple build variants. The metadata mentioned in the last section allows to keep all this smooth and DRY.

B0's scope

Build is an important part of b0 and we believe it provides discriminating and outstanding support for it. The user can express builds and their configuration using distributable descriptions of various abstraction level in a natural and introspectable build model.

However b0's support does not stop at the end of a build. It provides additional description capabilities for all the life-cycle operations that are associated with software construction.

B0 is not just a build system: it is a system for software construction care. A framework in which general software life-cycle and construction problems can be expressed and solved.

ANNEX: The problem with rule-based build models

Rule-based build model expose build structure as a directed acyclic graph (DAG) of dependencies between file targets. This DAG is implicitly defined via a set of build rule patterns which match targets and hereby define their prerequisites and the recipe of commands needed to produce them once their prerequisites are ready.

We argue this is not a very good build model:

  1. Pattern rules lead to rule matching. In rule matching a more specific build rule can be selected to build a particular target. If new rules can be added to the set of global rules, this leads to a system that is very difficult to understand: any part of the system can affect the other. In other words the pattern rule approach is anti-modular and uncompositional.
  2. Pattern rules are effectively a form of rule meta-programming which is not needed if there are functions in the description language.
  3. The complete build description of a target is scattered among different build rules as needed by its prerequisites. In order to reconstruct and understand the actual sequence of operations that actually occurs, the programmer has to replay the rule matching algorithm in his head to find out the recipes that will be executed. This is similar to the hell of callback programming: the final description of a build product is non-obvious and scattered among different handlers whose invocation order is unclear and non-trivial.
  4. The rule-based model is at odds with what the user eventually sees and wants to specify: a sequence of commands that need to be issued in order to produce a given target. In other words this model neither matches the actual mental model of the user nor the effective visible behaviour of the build system.
  5. Build abstraction cannot be performed inside the rule-based system. As an escape hatch it can be performed by an external system via meta-programming but, this along with 1. and 2., further obscures build understanding and hampers usability.