Feat/conformal prediction (#2552)

* naive conformal prediction
* first hist fc version works
* add component names
* add support for train length
* support for last points only
* add hist fc unit tests
* add first conformal unit tests
* overlap end checkpoint
* overlap end checkpoint 2
* ignore start
* finalize hist fc test
* start, train length tests
* finalize start train length tests
* fix residuals with overlap end
* refactor calibration for predict and hist fc
* base and child conformal
* checks for calibration set
* rename conformal naive model
* add additional forecasting model logic
* add more unit tests
* add output chunk shift support
* support train length with cal input
* support train length part 2
* restructure hist fc logic
* test with shorter covariates
* add checks for min lengths
* corrections for minimum input
* improve hist fc tests
* make naive conformal model accept quantiles
* add winkler score quantile interval metric
* update tests for quantile instead of alpha
* add coverage metric and improve residuals and backtest
* add save load as in ensemble mode
* quantile tests
* remove checks
* add non conformity scores for cqr
* add conformalized quantile regression
* allow all global prob models for ConformalQR
* add asymmetric naive model
* remove old code
* add tests for asymmetric naive model
* add tests for cqr
* add progress bars
* add quantile sampler
* add predict lkl params and num samples
* add random method for handling randomness of non-torch models
* fix all tests
* code cleanup
* add probabilistic test
* add conformal models to readme and covariates user guide
* fix failing tests
* improve docs
* add sketch of cp example notebook
* small update
* improve docs
* attempt to fix failing test on linux
* update start logic
* upgrade python target version
* improve stride handling
* remove optional input calibration set
* use cal stride
* make predict work with cal_stride
* add cal stride to historical forecasts
* hist fc optimized cal set selection
* add hist fc start test with different strides
* improve comments
* add more tests
* strided conformal model tests
* apply suggestions from pr review
* update docs
* cleanup
* update changelog
* update changelog
* update example notebook
* add conformal prediction notebook
* apply suggestions from PR review
* update notebook
* update changelog
unit8co · Dec 20, 2024 · fc244ac
1 parent 412d983
commit fc244ac
Showing 33 changed files with 7,558 additions and 792 deletions.
diff --git a/.github/workflows/merge.yml b/.github/workflows/merge.yml
@@ -80,7 +80,7 @@ jobs:
     runs-on: ubuntu-latest
     strategy:
       matrix:
-        example-name: [00-quickstart.ipynb, 01-multi-time-series-and-covariates.ipynb, 02-data-processing.ipynb, 03-FFT-examples.ipynb, 04-RNN-examples.ipynb, 05-TCN-examples.ipynb, 06-Transformer-examples.ipynb, 07-NBEATS-examples.ipynb, 08-DeepAR-examples.ipynb, 09-DeepTCN-examples.ipynb, 10-Kalman-filter-examples.ipynb, 11-GP-filter-examples.ipynb, 12-Dynamic-Time-Warping-example.ipynb, 13-TFT-examples.ipynb, 15-static-covariates.ipynb, 16-hierarchical-reconciliation.ipynb, 18-TiDE-examples.ipynb, 19-EnsembleModel-examples.ipynb, 20-RegressionModel-examples.ipynb, 21-TSMixer-examples.ipynb, 22-anomaly-detection-examples.ipynb]
+        example-name: [00-quickstart.ipynb, 01-multi-time-series-and-covariates.ipynb, 02-data-processing.ipynb, 03-FFT-examples.ipynb, 04-RNN-examples.ipynb, 05-TCN-examples.ipynb, 06-Transformer-examples.ipynb, 07-NBEATS-examples.ipynb, 08-DeepAR-examples.ipynb, 09-DeepTCN-examples.ipynb, 10-Kalman-filter-examples.ipynb, 11-GP-filter-examples.ipynb, 12-Dynamic-Time-Warping-example.ipynb, 13-TFT-examples.ipynb, 15-static-covariates.ipynb, 16-hierarchical-reconciliation.ipynb, 18-TiDE-examples.ipynb, 19-EnsembleModel-examples.ipynb, 20-RegressionModel-examples.ipynb, 21-TSMixer-examples.ipynb, 22-anomaly-detection-examples.ipynb, 23-Conformal-Prediction-examples.ipynb]
     steps:
       - name: "Clone repository"
         uses: actions/checkout@v4

diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -11,8 +11,26 @@ but cannot always guarantee backwards compatibility. Changes that may **break co
 
 **Improved**
 
-- Improvements to `ForecastingModel`: Improved `start` handling for historical forecasts, backtest, residuals, and gridsearch. If `start` is not within the trainable / forecastable points, uses the closest valid start point that is a round multiple of `stride` ahead of start. Raises a ValueError, if no valid start point exists. This guarantees that all historical forecasts are `n * stride` points away from start, and will simplify many downstream tasks. [#2560](https://github.com/unit8co/darts/issues/2560) by [Dennis Bader](https://github.com/dennisbader).
-- Added `data_transformers` argument to `historical_forecasts`, `backtest`, `residuals`, and `gridsearch` that allow to automatically apply `DataTransformer` and/or `Pipeline` to the input series without data-leakage (fit on historic window of input series, transform the input series, and inverse transform the forecasts). [#2529](https://github.com/unit8co/darts/pull/2529) by [Antoine Madrona](https://github.com/madtoinou) and [Jan Fidor](https://github.com/JanFidor)
+- 🚀🚀 Introducing Conformal Prediction to Darts: Add calibrated prediction intervals to any pre-trained global forecasting model with our first two conformal prediction models: [#2552](https://github.com/unit8co/darts/pull/2552) by [Dennis Bader](https://github.com/dennisbader).
+  - `ConformalNaiveModel`: It uses past point forecast errors to produce calibrated forecast intervals with a specified coverage probability.
+  - `ConformalQRModel`: It combines quantile regression (or any probabilistic model) with conformal prediction techniques. It adjusts quantile estimates to generate calibrated prediction intervals with a specified coverage probability.
+  - Both models offer the following support:
+    - use any pre-trained global forecasting model as the base forecaster
+    - uni and multivariate forecasts
+    - single and multiple series forecasts
+    - single and multi-horizon forecasts
+    - generate a single or multiple calibrated prediction intervals
+    - direct quantile value predictions (interval bounds) or sampled predictions from these quantile values
+    - covariates based on the underlying forecasting model
+  - Check out this [example notebook](https://unit8co.github.io/darts/examples/23-Conformal-Prediction-examples.html) for more information!
+- Improvements to `ForecastingModel.historical_forecasts()`, `backtest()`, `residuals()`, and `gridsearch()`:
+  - 🚀🚀 Added support for data transformers and pipelines. Use argument `data_transformers` to automatically apply any `DataTransformer` and/or `Pipeline` to the input series without data-leakage (fit on historic window of input series, transform the input series, and inverse transform the forecasts). [#2529](https://github.com/unit8co/darts/pull/2529) by [Antoine Madrona](https://github.com/madtoinou) and [Jan Fidor](https://github.com/JanFidor)
+  - Improved `start` handling. If `start` is not within the trainable / forecastable points, uses the closest valid start point that is a round multiple of `stride` ahead of start. Raises a ValueError, if no valid start point exists. This guarantees that all historical forecasts are `n * stride` points away from start, and will simplify many downstream tasks. [#2560](https://github.com/unit8co/darts/issues/2560) by [Dennis Bader](https://github.com/dennisbader).
+  - Added support for `overlap_end=True` to `residuals()`. This computes historical forecasts and residuals that can extend further than the end of the target series. Guarantees that all returned residual values have the same length per forecast (the last residuals will contain missing values, if the forecasts extended further into the future than the end of the target series). [#2552](https://github.com/unit8co/darts/pull/2552) by [Dennis Bader](https://github.com/dennisbader).
+- Improvements to `metrics`: Added three new quantile interval metrics (plus their aggregated versions): [#2552](https://github.com/unit8co/darts/pull/2552) by [Dennis Bader](https://github.com/dennisbader).
+  - Interval Winkler Score `iws()`, and Mean Interval Winkler Scores `miws()` (time-aggregated) ([source](https://otexts.com/fpp3/distaccuracy.html))
+  - Interval Coverage `ic()` (binary if observation is within the quantile interval), and Mean Interval Coverage `mic()` (time-aggregated)
+  - Interval Non-Conformity Score for Quantile Regression `incs_qr()`, and Mean ... `mincs_qr()` (time-aggregated) ([source](https://arxiv.org/pdf/1905.03222))
 - Added `series_idx` argument to `DataTransformer` that allows users to use only a subset of the transformers when `global_fit=False` and several series are used. [#2529](https://github.com/unit8co/darts/pull/2529) by [Antoine Madrona](https://github.com/madtoinou)
 - Updated the Documentation URL of `Statsforecast` models. [#2610](https://github.com/unit8co/darts/pull/2610) by [He Weilin](https://github.com/cnhwl).
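
The changelog entries above describe a naive conformal model that turns past point forecast errors into calibrated intervals, plus interval metrics (coverage, Winkler score, and a non-conformity score for quantile regression). The following is a minimal plain-Python sketch of those ideas under stated assumptions, not the Darts implementation: the function names `conformal_quantile`, `interval_coverage`, `winkler_score`, and `cqr_nonconformity` are hypothetical, the calibration data is made up, and the formulas follow the textbook split-conformal rule and the two sources linked in the changelog. The signatures and details of Darts' own `iws()`, `ic()`, and `incs_qr()` may differ.

```python
import math


def conformal_quantile(scores, coverage):
    """Return the calibration score at rank ceil((n + 1) * coverage).

    Textbook split-conformal rule. Capping the rank at n is a simplification:
    strictly, the interval is unbounded when the rank exceeds n.
    """
    n = len(scores)
    rank = min(n, math.ceil((n + 1) * coverage))
    return sorted(scores)[rank - 1]


def interval_coverage(y, lower, upper):
    """1.0 if the observation lies inside [lower, upper], else 0.0."""
    return 1.0 if lower <= y <= upper else 0.0


def winkler_score(y, lower, upper, alpha):
    """Interval width plus a penalty of (2 / alpha) times the miss distance.

    alpha is the nominal miscoverage, e.g. 0.2 for an 80% interval.
    """
    width = upper - lower
    if y < lower:
        return width + (2.0 / alpha) * (lower - y)
    if y > upper:
        return width + (2.0 / alpha) * (y - upper)
    return width


def cqr_nonconformity(y, q_low, q_high):
    """Signed distance to the nearest quantile bound (negative when inside)."""
    return max(q_low - y, y - q_high)


# Made-up calibration data: past actual values and the point forecasts for them.
actuals = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0, 13.0, 15.0, 14.0]
forecasts = [10.5, 11.0, 12.0, 12.5, 12.0, 13.0, 14.5, 14.0, 16.0]
errors = [abs(a - f) for a, f in zip(actuals, forecasts)]

# Symmetric interval around a new point forecast, targeting 80% coverage.
half_width = conformal_quantile(errors, coverage=0.8)
new_forecast = 15.0
lower, upper = new_forecast - half_width, new_forecast + half_width

# Score the interval against a hypothetical observation that falls outside it.
observed = 17.0
print("interval:", (lower, upper))
print("covered:", interval_coverage(observed, lower, upper))
print("winkler:", winkler_score(observed, lower, upper, alpha=0.2))
print("cqr score:", cqr_nonconformity(observed, lower, upper))
```

A narrower interval lowers the Winkler score only while observations stay inside it, since every miss is penalised in proportion to its distance from the bound. This is why the score is useful for comparing intervals that have similar coverage.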