update README and pyproject
GStechschulte committed Jan 7, 2025
1 parent 5dc4d78 commit 95dc014
Showing 2 changed files with 27 additions and 16 deletions.
42 changes: 26 additions & 16 deletions README.md
@@ -1,28 +1,38 @@
# bart-rs
# PyMC-BART-rs

Rust implementation of [PyMC-BART](https://github.com/pymc-devs/pymc-bart).
Rust implementation of [PyMC-BART](https://github.com/pymc-devs/pymc-bart). PyMC-BART extends the [PyMC](https://github.com/pymc-devs/pymc) probabilistic programming framework so that models containing a Bayesian Additive Regression Tree (BART) random variable can be defined and fitted. PyMC-BART also includes a few helper functions to aid with the interpretation of those models and to perform variable selection.

## Table of Contents

## Usage
- [Installation](#installation)
- [Usage](#usage)
- [Modifications](#modifications)

...
## Installation

## Modifications
PyMC-BART-rs is available on PyPI with pre-built wheels for Linux (x86_64, aarch64), Windows (x64), and macOS (x86_64, aarch64). To install using `pip`:

The core Particle Gibbs (PG) sampling algorithm for Bayesian Additive Regression Trees (BART) remains the same
in this Rust implementation. What differs is the choice of data structure to represent the Binary Decision Tree.
```bash
pip install pymc-bart-rs
```

A `DecisionTree` structure is implemented as a number of parallel vectors. The i-th element of each vector holds
information about node `i`. Node 0 is the tree's root. Some of the vectors only apply to either leaves or split
nodes. In this case, the values stored for nodes of the other type are arbitrary. For example, the `feature` and `threshold`
vectors only apply to split nodes. The values for leaf nodes in these vectors are therefore arbitrary.
## Usage

## Design
Get started by using PyMC-BART to set up a BART model:

In this section, the architecture of `bart-rs` is given.
```python
import pymc as pm
import pymc_bart_rs as pmb

TODO...
X, y = ... # Your data replaces "..."
with pm.Model() as model:
    bart = pmb.BART('bart', X, y)
    ...
    idata = pm.sample()
```

## Modifications

## Seeding RNGs
The core Particle Gibbs (PG) sampling algorithm for BART remains the same in this Rust implementation as the original Python implementation. What differs is the choice of data structure to represent the Binary Decision Tree.

The implementation of BART uses randomness when growing trees. The `thread_rng` function from the `randr_distr` crate provides a thread-local random number generator that is automatically seeded by the operating system or environment, so it is unique for each thread and each run of the program. Therefore, we do not set an explicit seed, and we expect different values (e.g. values sampled from a Normal distribution) each time the program is run.
A `DecisionTree` structure is implemented as a number of parallel arrays. The i-th element of each array holds information about node `i`. Node 0 is the tree's root. Some of the arrays only apply to either leaves or split nodes. In this case, the values stored for nodes of the other type are arbitrary. For example, the `feature` and `threshold` arrays only apply to split nodes. The values for leaf nodes in these arrays are therefore arbitrary.
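
The parallel-array layout described above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the crate's actual Rust code: only the `DecisionTree`, `feature`, and `threshold` names come from the text, while the `value`, `left`, and `right` arrays, the `LEAF` sentinel, and every method name are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Sequence

LEAF = -1  # hypothetical sentinel: "this node has no child"


@dataclass
class DecisionTree:
    """Binary decision tree stored as parallel arrays (illustrative only)."""

    feature: List[int] = field(default_factory=list)      # meaningful for split nodes only
    threshold: List[float] = field(default_factory=list)  # meaningful for split nodes only
    value: List[float] = field(default_factory=list)      # meaningful for leaves only (assumed array)
    left: List[int] = field(default_factory=list)         # index of left child, or LEAF (assumed array)
    right: List[int] = field(default_factory=list)        # index of right child, or LEAF (assumed array)

    def add_leaf(self, value: float) -> int:
        """Append a leaf and return its index; entry i of every array describes node i."""
        self.feature.append(0)      # arbitrary placeholder for a leaf
        self.threshold.append(0.0)  # arbitrary placeholder for a leaf
        self.value.append(value)
        self.left.append(LEAF)
        self.right.append(LEAF)
        return len(self.value) - 1

    def is_leaf(self, node: int) -> bool:
        return self.left[node] == LEAF

    def split(self, node: int, feature: int, threshold: float,
              left_value: float, right_value: float) -> None:
        """Turn leaf `node` into a split node with two new leaf children."""
        if not self.is_leaf(node):
            raise ValueError("only leaves can be split")
        self.feature[node] = feature
        self.threshold[node] = threshold
        self.left[node] = self.add_leaf(left_value)
        self.right[node] = self.add_leaf(right_value)
        # value[node] is left untouched: it is now arbitrary for this split node.

    def predict(self, x: Sequence[float]) -> float:
        """Walk from the root (node 0) down to a leaf and return its value."""
        node = 0
        while not self.is_leaf(node):
            if x[self.feature[node]] <= self.threshold[node]:
                node = self.left[node]
            else:
                node = self.right[node]
        return self.value[node]


tree = DecisionTree()
root = tree.add_leaf(0.0)  # node 0 is the root
tree.split(root, feature=0, threshold=0.5, left_value=-1.0, right_value=1.0)
print(tree.predict([0.2]), tree.predict([0.9]))
```

Growing the tree only appends to the arrays, so node indices stay stable and no per-node allocation is needed. The cost is the arbitrary placeholder entries the text mentions.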
1 change: 1 addition & 0 deletions pyproject.toml
@@ -4,6 +4,7 @@ build-backend = "maturin"

[project]
name = "pymc_bart_rs"
description = "Rust implementation of Bayesian Additive Regression Trees for Probabilistic programming with PyMC"
requires-python = ">=3.8, <3.13"
classifiers = [
"Programming Language :: Rust",