literate-programming.qmd

---
execute:
  freeze: auto
---

# Literate programming {#literate-programming}

```{r, message = FALSE, warning = FALSE, echo = FALSE, eval = TRUE}
knitr::opts_knit$set(root.dir = fs::dir_create(tempfile()))
knitr::opts_chunk$set(collapse = TRUE, comment = "#>", eval = TRUE)
options(crayon.enabled = FALSE)
Sys.setenv(TAR_WARN = "false")
```

Literate programming is the practice of mixing code and descriptive writing in order to execute and explain a data analysis simultaneously in the same document. The `targets` package supports literate programming through tight integration with [Quarto](https://quarto.org/), [R Markdown](https://rmarkdown.rstudio.com/), and [`knitr`](https://yihui.org/knitr/). It is recommended to learn one of these three tools before proceeding.

## Scope

There are two kinds of literate programming in `targets`:

1. A literate programming source document (or Quarto project) that renders inside an individual target. Here, you define a special kind of target that runs a lightweight R Markdown report which depends on upstream targets.
2. Target Markdown, an overarching system in which one or more [Quarto](https://quarto.org/) or R Markdown files write the `_targets.R` file and encapsulate the pipeline.

We recommend (1) in order to fully embrace pipelines as a paradigm, and that is where this chapter will focus. However, (2) is still supported, and we include it [in an appendix](#markdown).

## R Markdown targets

Here, literate programming serves to display, summarize, and annotate results from upstream in the `targets` pipeline. The document(s) have little to no computation of their own, and they make heavy use of `tar_read()` and `tar_load()` to leverage output from other targets. 

As an example, let us extend the [walkthrough example](#walkthrough) chapter with the following [R Markdown](https://rmarkdown.rstudio.com/) source file `report.Rmd`.

![](./man/figures/knitr-source.png)

This document depends on targets `fit` and `hist`. If we previously ran the pipeline and the data store `_targets/` exists, then `tar_read()` and `tar_load()` will read those targets and show them in the rendered HTML output `report.html`.

![](./man/figures/knitr-ide.png)

With the `tar_render()` function in [`tarchetypes`](https://github.com/ropensci/tarchetypes), we can go a step further and include `report.Rmd` as a *target* in the pipeline. This new targets  re-renders `report.Rmd` whenever `fit` or `hist` changes, which means `tar_make()` brings the output file `report.html` up to date.

```{r, echo = FALSE, eval = TRUE}
lines <- c(
  "---",
  "output: html_document",
  "---",
  "",
  "```{r}",
  "tar_read(fit)",
  "tar_load(hist)",
  "```"
)
writeLines(lines, "report.Rmd")
```

```{r, eval = TRUE}
library(targets)
library(tarchetypes)
target <- tar_render(report, "report.Rmd") # Just defines a target object.
target$command$expr[[1]]
```

`tar_render()` is like `tar_target()`, except that you supply the file path to the R Markdown report instead of an R command. Here it is at the bottom of the example `_targets.R` file below:

```{r, eval = FALSE}
# _targets.R
library(targets)
library(tarchetypes)
source("R/functions.R")
list(
  tar_target(
    raw_data_file,
    "data/raw_data.csv",
    format = "file"
  ),
  tar_target(
    raw_data,
    read_csv(raw_data_file, col_types = cols())
  ),
  tar_target(
    data,
    raw_data %>%
      mutate(Ozone = replace_na(Ozone, mean(Ozone, na.rm = TRUE)))
  ),
  tar_target(hist, create_plot(data)),
  tar_target(fit, biglm(Ozone ~ Wind + Temp, data)),
  tar_render(report, "report.Rmd") # Here is our call to tar_render().
)
```

When we visualize the pipeline, we see that our `report` target depends on targets `fit` and `hist`. `tar_render()` automatically detects these upstream dependencies by statically analyzing `report.Rmd` for calls to `tar_load()` and `tar_read()`.

```{r, eval = FALSE}
# R console
tar_visnetwork()
```

![](./man/figures/knitr-graph.png)

## Quarto targets

`tarchetypes` >= 0.6.0.9000 supports a `tar_quarto()` function, which is like `tar_render()`, but for Quarto. For an individual source document, `tar_quarto()` works exactly the same way as `tar_render()`. However, `tar_quarto()` is more powerful: you can supply the path to an entire Quarto project, such as a book, blog, or website. `tar_quarto()` looks for target dependencies in all the source documents (e.g. listed in `_quarto.yml`), and it tracks the important files in the project for changes (run `tar_quarto_files()` to see which ones).

## Parameterized documents

[`tarchetypes`](https://docs.ropensci.org/tarchetypes) functions make it straightforward to use [parameterized R Markdown](https://rmarkdown.rstudio.com/developer_parameterized_reports.html) and [parameterized Quarto](https://quarto.org/docs/computations/parameters.html) in a `targets` pipeline. The next two subsections walk through the major use cases.

## Single parameter set

In this scenario, the pipeline renders your parameterized report one time using a single set of parameters. These parameters can be upstream targets, global objects, or fixed values. Simply pass a `params` argument to [`tar_render()`](https://docs.ropensci.org/tarchetypes/reference/tar_render.html) or an `execute_params` argument to [`tar_quarto()`](https://docs.ropensci.org/tarchetypes/reference/tar_quarto.html). Example:

```{r, eval = FALSE}
# _targets.R
library(targets)
library(tarchetypes)
list(
  tar_target(data, data.frame(x = seq_len(26), y = letters))
  tar_quarto(report, "report.qmd", execute_params = list(your_param = data))
)
```

Internally, the `report` target runs:

```{r, eval = FALSE}
# R console
quarto::quarto_render("report.qmd", params = list(your_param = your_target))
```

where `report.qmd` looks like this:

![](./man/figures/param-qmd.png)

See [`tar_quarto()` examples](https://docs.ropensci.org/tarchetypes/reference/tar_quarto.html#examples) and [`tar_render()` examples](https://docs.ropensci.org/tarchetypes/reference/tar_render.html#examples) for more.

## Multiple parameter sets

In this scenario, you still have a single report, but you render it multiple times over a grid of parameters. This time, use [`tar_quarto_rep()`](https://docs.ropensci.org/tarchetypes/reference/tar_quarto_rep.html) or [`tar_render_rep()`](https://docs.ropensci.org/tarchetypes/reference/tar_render_rep.html). Each of these functions takes as input a grid of parameters with one column per parameter and one row per parameter set, where each parameter set is used to render an instance of the document. In other words, the number of rows in the parameter grid is the number of output documents you will produce. Below is an example `_targets.R` file using `tar_render_rep()`. Usage with `tar_quarto_rep()` is the same^[except the parameter grid argument is called `execute_params` in `tar_quarto_rep()`.].

```{r, eval = FALSE}
# _targets.R
library(targets)
library(tarchetypes)
tar_option_set(packages = "tibble")
list(
  tar_target(x, "value_of_x"),
  tar_render_rep(
    report,
    "report.Rmd",
    params = tibble(
      par = c("par_val_1", "par_val_2", "par_val_3", "par_val_4"),
      output_file = c("f1.html", "f2.html", "f3.html", "f4.html")
    ),
    batches = 2
  )
)
```

where `report.Rmd` has the following YAML front matter:

```
title: report
output_format: html_document
params:
  par: "default value"
```

and the following R code chunk:

```{r, eval = FALSE}
print(params$par)
print(tar_read(x))
```

`tar_render_rep()` creates a target for the parameter grid and uses [dynamic branching](#dynamic) to render the output reports in batches. In this case, we have two batches (dynamic branches) that each produce two reports (four output reports total).

```{r, eval = FALSE}
# R console
tar_make()
#> ● run target x
#> ● run target report_params
#> ● run branch report_9e7470a1
#> ● run branch report_457829de
#> ● end pipeline
```

The third output file `f3.html` is below, and the rest look similar.

![](./man/figures/dynamic-rmarkdown-params.png)

For more information, see [these examples](https://docs.ropensci.org/tarchetypes/reference/tar_render_rep.html#examples).