The ProbCompCert compiler is a prototype compiler for a subset of the Stan programming that ties into the CompCert C compiler.
The distinguishing feature of ProbCompCert is that parts of it are being formally verified using the Coq proof assistant. The goal is for the generated assembly code is formally guaranteed to behave as prescribed by the semantics of the source Stan code.
So far, we have verified 3 passes in the density compilation pipeline. These passes are interesting from a compiler correctness point of view, because they involve changes that do not necessarily preserve the "extensional" operational semantics of the program, but nevertheless preserve the probability distribution that the programs represent.
Please see INSTALL.md for instructions on getting ProbCompCert built.
For a step-by-step walkthrough that explains how to compile a Stan model, run the resulting binary, and analyze the generated output, please see USAGE.md
ProbCompCert ties directly into CompCert by generating Clight code and makes heavy use of most of the generic compiler libraries of CompCert including: Coqlib, Maps, Integers, Floats, Errors, AST, Linking, Values, Memory, Globalenvs, Smallstep, Op, Ctypes, and Clight.
Documentation on these libraries can be found in CompCert's commented Coq development
-
The runtime directory contains the C implementation of the runtime and of the standard library.
-
The tests directory contains Stan example programs and development tools.
-
The driver contains plain OCaml code that is used during the elaboration phase that connects the Stan language to the Stanlight language.
Similar to CompCert, the compiler makes use of multiple languages. In CompCert, to a large extent, every syntax is asscociated with a single semantics, and a compiler pass that removes a feature maps to a new language. An exception is Clight which comes with two semantics.
In ProbCompCert, at least for now, several compiler pass remain in the same language despite removing a feature. As a result, some languages are associated with multiple semantics.
-
Stan: The Stan language, generated by the parser. This language is (or at least should be) faithful to the official Stan specification.
-
Stanlight: A subset of Stan describing valid Stan programs. In principle, any Stan program could be written in Stanlight. This language supports compiler transformation that are probabilistic in nature. Example include reparameterization.
-
CStan: Clight extended with an explicit target and a list of parameters and data variables. This languages supports compiler transformations that need to work at a lower level. Examples include the transformation of global variables into C structures or the initialization and return of the target.
Language | Syntax | Semantics | Notes |
---|---|---|---|
Stan | Stan | Does not currently have a formal semantics | |
Stanlight | Stanlight | Ssemantics.v | Stanlight programs are valid Stan programs |
CStan | CStan | CStanSemanticsTarget CStanSemanticsBackend | CStan programs use the C type system and global environment |
The parser uses Menhir's verified parsing tools. The verified parser was largely developed by Brian Ward as part of his honor's thesis at Boston College.
- Slexer.mll: lexer specification
- Sparser.vy: parser specification, in Coq Menhir
- error.messages: error messages
Pass | Source | Target | Code | Proof |
---|---|---|---|---|
Implement sampling statement | Stanlight | Stanlight | Sampling | Samplingproof |
Reparameterize constrained parameters | Stanlight | Stanlight | Reparameterization | Reparameterizationproof |
Optimize out additive constants | Stanlight | Stanlight | AdditiveConst | AdditiveConstproof |
Pull calls out of expressions, compile for loops, translate types to C types, translate operators | Stanlight | CStan | Clightification | Clightificationproof |
Replace gloval parameters and data by structures | CStan | CStan | AllocateVariables | AllocateVariablesproof |
Explicit target | CStan | CStan | Target | Targetproof |
Generate Clight | CStan | Clight | Sbackend | Sbackendproof |
- Scompiler: defines the overall compilation and proof of semantics preservation
Check out DEVELOPERS.md for additional information on some of the files, and tips about extending the compiler and run time.