Merge pull request #29 from greta-dev/add-getting-started-vignette-i22
Add "Getting Started" vignette
njtierney authored Dec 12, 2024
2 parents 6c1b0bd + cc57a36 commit 9cda259
Showing 8 changed files with 359 additions and 52 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -3,3 +3,4 @@
.RData
.Ruserdata
docs
inst/doc
3 changes: 3 additions & 0 deletions DESCRIPTION
@@ -38,6 +38,8 @@ Depends:
greta (>= 0.5.0),
R (>= 4.1.0)
Suggests:
knitr,
rmarkdown,
testthat (>= 3.0.0)
Encoding: UTF-8
LazyData: true
@@ -46,3 +48,4 @@ Config/testthat/edition: 3
URL: https://github.com/greta-dev/greta.gam, https://greta-dev.github.io/greta.gam/
BugReports: https://github.com/greta-dev/greta.gam/issues
Roxygen: list(markdown = TRUE)
VignetteBuilder: knitr
86 changes: 59 additions & 27 deletions README.Rmd
@@ -20,12 +20,10 @@ knitr::opts_chunk$set(
[![Codecov test coverage](https://codecov.io/gh/greta-dev/greta.gam/branch/main/graph/badge.svg)](https://app.codecov.io/gh/greta-dev/greta.gam?branch=main)
<!-- badges: end -->

greta.gam lets you use [mgcv](https://CRAN.R-project.org/package=mgcv)'s smoother functions and formula syntax to define smooth terms for use in a [greta](https://greta-stats.org/) model. You can then define your own likelihood to complete the model, and fit it by MCMC.

The design and architecture of the package was done by [Nick Golding](https://github.com/goldingn), and [David L Miller](https://github.com/dill).


# Installation

You can install the development version of `greta.gam` from the [R-universe](https://greta-dev.r-universe.dev/greta.gam) with:
@@ -40,57 +38,93 @@ Or you can install from CRAN with:
install.packages("greta.gam")
```


## Example

Here's a simple example adapted from the `mgcv` `?gam` help file. In `mgcv`:

```{r mgcv-generate-and-fit}
library(mgcv)
set.seed(2024 - 12 - 12)
# simulate some data...
dat <- gamSim(1, n = 400, dist = "normal", scale = 0.3)
head(dat)
# fit a model using gam()
mgcv_fit <- gam(y ~ s(x2), data = dat)
mgcv_fit
summary(mgcv_fit)
## show partial residuals
plot(mgcv_fit, scheme = 1, shift = coef(mgcv_fit)[1])
```

Now let's fit the same model in `greta`. We start by setting up the linear predictor for the smooth, that is, the right-hand side of the formula:

```{r greta-fit}
library(greta.gam)
set.seed(2024 - 02 - 09)
linear_predictor <- smooths(~ s(x2), data = dat)
linear_predictor
```

Now we specify the distribution of the response:

```{r greta-fit-add-distribution}
distribution(dat$y) <- normal(mean = linear_predictor, sd = 1)
```

Now let's make some prediction data:

```{r greta-fit-make-preds}
pred_dat <- data.frame(
x2 = seq(0, 1, length.out = 100)
)
head(pred_dat)
```

We run `evaluate_smooths()` on the linear predictor with the new prediction data:

```{r greta-fit-eval-preds}
linear_preds <- evaluate_smooths(linear_predictor, newdata = pred_dat)
linear_preds
```

Now we build a model object from these predictions and fit it with MCMC, as we normally do with greta:

```{r greta-fit-mcmc}
# build model
m <- model(linear_preds)
m
# draw from the posterior
draws <- mcmc(m, n_samples = 200, verbose = FALSE)
class(draws)
# 4 chains
length(draws)
# 200 draws, 100 predictors
dim(draws[[1]])
# look at the top corner
draws[[1]][1:5, 1:5]
```


Now let's compare the `mgcv` model fit to the `greta.gam` fit:

```{r greta-fit-show-preds}
plot(mgcv_fit, scheme = 1, shift = coef(mgcv_fit)[1])
# add in a line for each posterior sample
apply(draws[[1]], 1, lines, x = pred_dat$x2,
col = adjustcolor("firebrick", alpha.f = 0.1))
# plot the data
points(dat$x2, dat$y, pch = 19, cex = 0.2)
```

The `mgcv` predictions are shown in the grey ribbon and the `greta.gam` posterior draws in red; the greta.gam predictions fall within the range of the mgcv fit, which is good news!
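Beyond eyeballing the overlaid lines, one could also summarise the draws numerically before plotting. A minimal sketch in base R (the `draws_mat` below is a simulated stand-in for a posterior draws matrix like `draws[[1]]`, so the example is self-contained; the `post_mean` and `post_ci` names are ours, not greta.gam's):

```r
# Summarise a matrix of posterior draws (rows = samples, columns =
# prediction points) into a pointwise mean curve and 95% interval.
set.seed(1)
draws_mat <- matrix(rnorm(200 * 100), nrow = 200)  # stand-in for draws[[1]]
post_mean <- colMeans(draws_mat)
post_ci <- apply(draws_mat, 2, quantile, probs = c(0.025, 0.975))
# post_mean has one value per prediction point; post_ci is a 2 x 100 matrix
# holding the lower and upper interval limits
```

These summaries could then be drawn with `lines()` instead of plotting every posterior sample.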

## Brief technical details

`greta.gam` uses a few tricks from the `jagam` (Wood, 2016) routine in `mgcv` to get things to work. Here are some brief details for those interested in the internal workings.
@@ -103,12 +137,10 @@ GAMs are models with Bayesian interpretations (even when fitted using "frequenti

There is a slight difficulty in the Bayesian interpretation of the GAM: in their naïve form the priors are improper over the nullspace of the penalty (in the 1D case, usually the linear term). To get proper priors we can use one of the "tricks" employed in Marra & Wood (2011), namely penalising the parts of the parameter space that the penalty leaves unpenalised. We take the option provided by `jagam` and create an additional penalty matrix for these terms (from an eigen-decomposition of the penalty matrix; see Marra & Wood, 2011).
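A toy illustration of the null-space issue in base R (this is not greta.gam's internal code, just a sketch of the idea): a second-difference penalty leaves constant and linear components unpenalised, the eigen-decomposition exposes that null space, and an extra penalty built from the null-space eigenvectors makes the implied prior precision full rank.

```r
# Second-difference penalty on 5 coefficients: its 2-dimensional null
# space (constant and linear terms) makes the implied Gaussian prior
# improper in those directions.
D <- diff(diag(5), differences = 2)  # second-difference operator (3 x 5)
S <- crossprod(D)                    # penalty matrix, rank 3
eS <- eigen(S, symmetric = TRUE)
null_space <- eS$values < max(eS$values) * 1e-8
sum(null_space)                      # dimension of the null space: 2
# extra penalty acting only on the null space; adding it to S gives a
# full-rank precision matrix and hence a proper prior
U0 <- eS$vectors[, null_space, drop = FALSE]
S0 <- tcrossprod(U0)
qr(S + S0)$rank                      # 5: prior precision is now full rank
```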


# References

Marra, G and Wood, SN (2011) Practical variable selection for generalized additive models. Computational Statistics and Data Analysis, 55, 2372–2387.

Miller DL (2021). Bayesian views of generalized additive modelling. arXiv.

Wood, SN (2016) Just Another Gibbs Additive Modeler: Interfacing JAGS and mgcv. Journal of Statistical Software 75, no. 7
