26-SpatialResponseSurface.Rmd

# Spatial response surface sampling {#SpatialResponseSurface}

As with conditioned Latin hypercube sampling, spatial response surface sampling\index{Spatial response surface sampling} is an experimental design adapted for spatial surveys. Experimental response surface designs aim at finding an optimum of the response within specified ranges of the factors. There are many types of response surface designs, see @myers2002.  With response surface sampling one assumes that some type of low order regression model can be used to accurately approximate the relationship between the study variable and the covariates.  A commonly used design is the central composite design\index{Central composite design}. The data obtained with this design are used to fit a multiple linear regression model with quadratic terms, yielding a curved, quadratic surface of the response.

The response surface sampling approach is an adaptation of an experimental design, but at the same time it is an example of a model-based sampling design. Sampling units are selected to implicitly optimise the estimation of the quadratic regression model. However, this optimisation is done under one or more spatial constraints. Unconstrained optimisation of the sampling design under the linear regression model will not prevent the units from spatial clustering, see the optimal sample in Figure \@ref(fig:twosamples). The assumption of independent data might be violated when the sampling units are spatially clustered.  For this reason, the response surface sampling design selects samples with good spatial coverage, so that the design becomes robust against violation of the independence assumption.

@lesch95 adapted the response surface methodology for observational studies. Several problems needed to be tackled. First, when multiple covariates are used, the covariates must be decorrelated. Second, when sampling units are spatially clustered, the assumption in linear regression modelling of spatially uncorrelated model residuals can be violated. To address these two problems, @lesch95 proposed the following procedure; see also @lesch2005:   

1. Transform the covariate matrix into a scaled, centred, decorrelated matrix by principal components analysis (PCA).   
2. Choose the response surface design type.   
3. Select candidate sampling units based on the distance from the design points in the space spanned by the principal components. Select multiple sampling units per design point.   
4. Select the combination of candidate sampling units with the highest value for a criterion that quantifies how uniform the sample is spread across the study area.   

This design has been applied, among others, for mapping soil salinity (ECe), using electromagnetic (EM) induction measurements and surface array conductivity measurements as predictors in multiple linear regression models. For applications, see @corwin2005, @lesch2005, @fitzgerald2006, @Corwin2010, and @Fitzgerald2010.

Spatial response surface sampling  is illustrated with the EM measurements (mS m^-1^) of the apparent electrical conductivity on the 80 ha Cotton Research Farm in Uzbekistan. The EM measurements in vertical dipole mode, with transmitter at 1 m and 0.5 m from the receiver, are on transects covering the Cotton Research Farm (Figure \@ref(fig:EMdataUzbekistan)). As a first step, the natural log of the two EM measurements, denoted by lnEM, are interpolated by ordinary kriging to a fine grid (Figure \@ref(fig:EMdataUzbekistan2)). These ordinary kriging predictions of lnEM are used as covariates in response surface sampling. The two covariates are strongly correlated, $r=0.73$, as expected since they are interpolations of measurements of the same variable but of different overlapping layers.

```{r EMdataUzbekistan, echo = FALSE, out.width = "100%", fig.cap = "Natural log of EM measurements on the Cotton Research Farm (with transmitter at 1 m and 0.5 m from receiver)."}
#Krassovsky datum: epsg:4024
v_dat <- read_csv("data/TransectsData_EM_CRF_Uzbekistan.csv") %>%
    mutate(
        lnEM100cm = log(EMv1mt),
        lnEM50cm = log(EMv05mt)) %>%
    st_as_sf(coords = c("x", "y"), crs = st_crs(4024)) %>%
    st_transform(crs = 32641) %>%
    st_zm

v_dat %>%
    pivot_longer(cols = c("lnEM100cm", "lnEM50cm")) %>%
    st_set_crs(NA_crs_) %>%
    ggplot() +
        geom_sf(mapping = aes(colour = value), size = 0.3) +
        scale_colour_viridis_c(name = "lnEM") +
    scale_x_continuous(
        name = "Easting (km)",
        labels = function(x) {1.0e-3 * x},
        limits = c(305750, 308000)) +
    scale_y_continuous(
        name = "Northing (km)",
        labels = function(x) {1.0e-3 * x},
        limits = c(4600200, 4601400),
        breaks = seq(from = 4600400, to = 4601200, by = 200)) +
    facet_grid(~ name)
```


```{r, echo = FALSE, eval = FALSE}
v_crf <- read_sf(system.file("extdata/cottonresearchfarm_uzbekistan.gpkg", package = "sswr")) %>%
    st_set_agr("constant") %>%
    st_transform(crs = 32641)

# create grid points
g_crf <- st_make_grid(v_crf, cellsize = 25, what = "centers")
g_crf <- g_crf[v_crf]
g_crf <- st_intersection(v_crf, g_crf) #add id of polygons

# estimate sample semivariogram
variogram1m <- variogram(lnEM100cm ~ 1, data = v_dat)
variogram05m <- variogram(lnEM50cm ~ 1, data = v_dat)

# fit semivariogram model
vgm1m <- fit.variogram(variogram1m, model = vgm(psill = 0.15, model = "Exp", range = 200, nugget = 0.05))
vgm05m <- fit.variogram(variogram05m, model = vgm(psill = 0.02, model = "Exp", range = 200, nugget = 0.01))

#ordinary kriging
library(stars)
EMpred <- krige(
    lnEM100cm ~ 1,
    v_dat,
    newdata = g_crf,
    model = vgm1m,
    nmax = 100,
    debug.level = 0
)

grdCRF <- EMpred %>%
    dplyr::select(lnEM100cm = var1.pred)

EMpred  <- krige(
    lnEM50cm ~ 1,
    v_dat,
    newdata = g_crf,
    model = vgm05m,
    nmax = 100,
    debug.level = 0
)

library(sfheaders)

grdCRF <- grdCRF %>%
    mutate(
        lnEM50cm = EMpred$var1.pred,
        subarea = g_crf$id) %>%
    sf_to_df(fill = TRUE)

#write_rds(grdCRF, "data/CottonResearchFarm_Uzbekistan.rds")
```

```{r EMdataUzbekistan2, echo = FALSE, out.width = "100%", fig.cap = "Interpolated surfaces of natural log of EM measurements on the Cotton Research Farm, used as covariates in spatial response surface sampling."}
grdCRF_lf <- grdCRF %>% pivot_longer(cols = c("lnEM100cm", "lnEM50cm"))

ggplot(data = grdCRF_lf) +
  geom_raster(mapping = aes(x = x , y = y , fill = value)) +
  scale_fill_viridis_c(name = "lnEM") +
  scale_x_continuous(
    name = "Easting (km)",
    labels = function(x) {1.0e-3 * x},
    limits = c(305750, 308000)) +
  scale_y_continuous(
    name = "Northing (km)",
    labels = function(x) {1.0e-3 * x},
    limits = c(4600200, 4601400),
    breaks = seq(from = 4600400, to = 4601200, by = 200)) +
  coord_fixed() +
  facet_grid(~ name)
```

Function `prcomp` of the **stats** package [@R2020] is used to compute the principal component scores\index{Principal component score} for all units in the population (grid cells). The two covariates are centred and scaled, i.e., standardised principal components are computed. 

```{r}
pc <- grdCRF %>%
  dplyr::select(lnEM100cm, lnEM50cm) %>%
  prcomp(center = TRUE, scale = TRUE)
```

The means of the two principal component scores are 0; however, their standard deviations are not zero but `r formatC(pc$sdev[1], 3, format = "f")` and `r formatC(pc$sdev[2], 3, format = "f")`. Therefore, the principal component scores are divided by these standard deviations. They then will have the same weight in the following steps. 

```{r}
grdCRF <- grdCRF %>%
  mutate(
    PC1 = pc$x[, 1] / pc$sdev[1],
    PC2 = pc$x[, 2] / pc$sdev[2])
```

Function `ccd` of package **rsm** [@Lenth2009] is now used to generate a central composite response surface design\index{Central composite response surface design} (CCRSD). Argument `basis` specifies the number of factors, which is two in our case. Argument `n0` is the number of centre points, and argument `alpha` determines the position of the star points (explained hereafter).

```{r}
library(rsm)
set.seed(314)
print(ccdesign <- ccd(basis = 2, n0 = 1, alpha = "rotatable"))
```

The experiment consists of two blocks, each of five experimental units. Block 1, the so-called cube block, consists of one centre point and four cube points\index{Cube point}. In the experimental unit represented by the centre point, both factors have levels in the centre of the experimental range. In the experimental units represented by the cube points, the levels of both factors is either -1 or +1 unit in the design space. Block 2, referred to as the star block, consists of one centre point and four star points\index{Star point}. With `alpha = "rotatable"` the  star points are on the circle circumscribing the square (Figure \@ref(fig:ccdesign)).

```{r ccdesign, echo = FALSE, fig.width=5, fig.cap = "Rotatable central composite design for two factors."}
cube <- data.frame(x1 = c(-1, 1, -1, 1, 0), x2 = c(-1, -1, 1, 1, 0))
star <- data.frame(x1 = c(-sqrt(2), sqrt(2), 0, 0, 0), x2 = c(0, 0, -sqrt(2), sqrt(2), 0))
df <- rbind(cube, star)
df$block <- rep(c("cube", "star"), each = 5)
ggplot(df) +
  geom_point(mapping = aes(x = x1, y = x2, shape = block), size = 2) +
  scale_shape_manual(values = c(0, 8), name = "Block") +
  geom_path(data = data.frame(x = c(-1, -1, 1, 1, -1), y = c(-1, 1, 1, -1, -1)), aes(x = x, y = y), lty = 2) +
  geom_circle(aes(x0 = 0, y0 = 0, r = sqrt(2)), lty = 2) +
  coord_fixed()
```

To adapt this design for an observational study, we drop one of the centre points (0,0).
 
```{r}
ccd_df <- data.frame(x1 = ccdesign$x1, x2 = ccdesign$x2)
ccd_df <- ccd_df[-6, ]
```

The coordinates of the CCRSD points are multiplied by a factor so that a large proportion $p$ of the bivariate standardised principal component scores of the population units is covered by the circle that passes through the design points (Figure \@ref(fig:ccdesign)). The factor is computed as a sample quantile of the empirical distribution of the distances of the points in the scatter to the centre. For $p$, I chose 0.7.

```{r}
d <- sqrt(grdCRF$PC1^2 + grdCRF$PC2^2)
fct <- quantile(d, p = 0.7)
print(fct)
ccd_df <- ccd_df %>%
  mutate(x1 = x1 * fct, x2 = x2 * fct)
```

The next step is to select for each design point several candidate sampling points. For each of the nine design points\index{Design point}, eight points are selected that are closest to that design point. This results in 9 $\times$ 8 candidate sampling points.

```{r}
candi_all <- NULL
for (i in seq_len(nrow(ccd_df))) {
    d2dpnt <- sqrt((grdCRF$PC1 - ccd_df$x1[i])^2 +
      (grdCRF$PC2 - ccd_df$x2[i])^2)
    grdCRF <- grdCRF[order(d2dpnt), ]
    candi <- grdCRF[c(1:8), c("point_id", "x", "y", "PC1", "PC2")]
    candi$dpnt <- i
    candi_all <- rbind(candi_all, candi)
}
```

Figure \@ref(fig:candidatelocations) shows the nine clusters of candidate sampling points around the design points. Note that the location of the candidate sampling points associated with the design points with coordinates  (0,-2.13), (1.51,-1.51), and (2.13,0) are all far inside the circle that passes through the design points. So, for the optimised sample,  there will be three points with principal component scores that considerably differ from the ideal values according to the CCRSD design.

```{r candidatelocations, echo = FALSE, fig.width = 5, fig.cap = "Clusters of points (red points) around the design points (triangles) of a CCRSD (two covariates), serving as candidate sampling points."}
notcandidates <- grdCRF[-candi_all$point_id, ]
ggplot(notcandidates) +
  geom_point(mapping = aes(x = PC1, y = PC2), alpha = 0.5) +
  geom_point(data = candi_all, mapping = aes(x = PC1, y = PC2), colour = "red") +
  geom_point(data = ccd_df, aes(x = x1, y = x2), shape = 2) +
  scale_x_continuous(name = "PC1") +
  scale_y_continuous(name = "PC2") +
  coord_fixed()
```
Figure \@ref(fig:candidatesingeospace) shows that in geographical space for most design points there are multiple spatial clusters of candidate units. For instance, for design point nine, there are three clusters of candidate sampling units. Therefore, there is scope to optimise the sample computationally. 

```{r candidatesingeospace, echo = FALSE, out.width = "100%", fig.cap = "Candidate sampling points plotted on a map of the first standardised principal component (PC1)."}
ggplot(grdCRF) +
  geom_raster(aes(x = x / 1000, y = y / 1000, fill = PC1)) +
  geom_point(data = candi_all, mapping = aes(x = x / 1000, y = y / 1000, shape = as.factor(dpnt)), size = 2) +
  scale_shape_manual(values = c(0, 1, 2, 3, 4, 5, 6, 7, 8), name = "Design pnt") +
  scale_fill_viridis_c(name = "PC1") +
  scale_x_continuous(name = "Easting (km)") +
  scale_y_continuous(name = "Northing (km)") +
  coord_fixed()
```

As a first step, an initial subsample from the candidate sampling units is selected by stratified simple random sampling, using the levels of factor `dpnt` as strata. Function `strata` of package **sampling** is used for stratified random sampling [@Tille2016].

```{r}
library(sampling)
set.seed(314)
units_stsi <- sampling::strata(
  candi_all, stratanames = "dpnt", size = rep(1, 9))
mysample0 <- getdata(candi_all, units_stsi) %>%
  dplyr::select(-(ID_unit:Stratum))
```

The locations of the nine sampling units are now optimised by minimising a criterion that is a function of the distance between the nine sampling points. Two minimisation criteria are implemented, a geometric criterion and a model-based criterion. 

In the geometric criterion (as proposed by @lesch2005) for each sampling point the log of the shortest distance to the other points is computed. The minimisation criterion is the negative of the sample mean of these distances. 

The model-based minimisation criterion is the average correlation of the sampling points. This criterion requires as input the parameters of a residual correlogram (see Section \@ref(IntroKED)). I assume an exponential correlogram without nugget, so that the only parameter to be chosen is the distance parameter $\phi$ (Equation \@ref(eq:exponential)). Three times $\phi$ is referred to as the effective range\index{Effective range} of the exponential correlogram. The correlation of the random variables at two points separated by this distance is 0.05.

A penalty term is added to the geometric or the model-based minimisation criterion, equal to the average distance of the sampling points to the associated design points, multiplied by a weight. With weights $>0$, sampling points close to the design points are preferred over more distant points.

In the next code chunk, a function is defined for computing the minimisation criterion.  Given a chosen value for $\phi$, the 9 $\times$ 9 distance matrix of the sampling points can be converted into a correlation matrix, using function `variogramLine` of package **gstat** [@peb04]. Argument `weight` is an optional argument with default value 0.

```{r}
getCriterion <- function(mysample, dpnt, weight = 0, phi = NULL) {
  D2dpnt <- sqrt((mysample$PC1 - dpnt$x1)^2 + (mysample$PC2 - dpnt$x2)^2)
  D <- as.matrix(dist(mysample[, c("x", "y")]))
  if (is.null(phi)) {
    diag(D) <- NA
    logdmin <- apply(D, MARGIN = 1, FUN = min, na.rm = TRUE) %>% log
    criterion_cur <- mean(-logdmin) + mean(D2dpnt) * weight
  } else {
    vgmodel <- vgm(model = "Exp", psill = 1, range = phi)
    C <- variogramLine(vgmodel, dist_vector = D, covariance = TRUE)
    criterion_cur <- mean(C) + mean(D2dpnt) * weight
  }
  return(criterion_cur)
}
```

Function `getCriterion` is used to compute the geometric criterion for the initial sample.

```{r}
criterion_geo <- getCriterion(mysample = mysample0, dpnt = ccd_df)
```

The initial value of the geometric criterion  is `r formatC(criterion_geo, 3, format = "f")`. In the next code chunk, the initial value for the model-based criterion is computed for an effective range of 150 m.

```{block2, type = 'rmdnote'}
It does not make sense to make the effective range smaller than the size of the grid cells, which is 25 m in our case. For smaller ranges, the correlation matrix is for any sample a matrix with zeroes. If the effective range is smaller than the smallest distance between two points in a cluster, the mean correlation is equal for all samples. 
```

```{r}
phi <- 50
criterion_mb <- getCriterion(mysample = mysample0, dpnt = ccd_df, phi = phi)
```

The initial value of the model-based criterion is `r formatC(criterion_mb, 3, format = "f")`.

The objective function defining the minimisation criterion is minimised with simulated annealing\index{Simulated annealing} (@Kirkpatrick1983, @Aarts1987). One sampling point is randomly selected and replaced by another candidate sampling point from the same cluster. If the criterion of the new sample is smaller than that of the current sample, the new sample is accepted. If it is larger, it is accepted with a probability that is a function of the change in the criterion (the larger the increase, the smaller the acceptance probability) and of an annealing parameter named the temperature (the higher the temperature, the larger the probability of accepting a new, poorer sample, given an increase of the criterion). See Section \@ref(SSA) for a more detailed introduction to simulated annealing. 

The sampling pattern can be optimised with function `anneal` of  package **sswr**. The arguments of this function will be clear from the description of the sampling procedure above.

```{r, eval = FALSE}
set.seed(314)
mySRSsample <- anneal(
  mysample = mysample0, candidates = candi_all, dpnt = ccd_df, phi = 50,
  T_ini = 1, coolingRate = 0.9, maxPermuted = 25 * nrow(mysample0),
  maxNoChange = 20, verbose = TRUE)
```

```{r, eval = FALSE, echo = FALSE}
write_rds(mySRSsample, file = "results/SpatialResponseSurfaceSample_CRF_geo.rds")
write_rds(mySRSsample, file = "results/SpatialResponseSurfaceSample_CRF_mb.rds")
```

Figure \@ref(fig:CCRSDinPCspace) shows the optimised CCRSD samples plotted in the space spanned by the two principal components, obtained with the geometric and the model-based criterion, plotted together with the design points. The two optimised samples are very similar.

```{r CCRSDinPCspace, echo = FALSE, out.width = "100%", fig.cap = "Principal component scores of the spatial CCRSD sample (triangles), optimised with the geometric and the model-based criterion. Dots: design points of CCRSD."}
mySRSsample <- read_rds(file = "results/SpatialResponseSurfaceSample_CRF_geo.rds")
mysample_geo <- mySRSsample$mysample
critmin_geo <- tail(mySRSsample$trace, 1)

mySRSsample <- read_rds(file = "results/SpatialResponseSurfaceSample_CRF_mb.rds")
mysample_mb <- mySRSsample$mysample
critmin_mb <- tail(mySRSsample$trace, 1)

mysamples <- rbind(mysample_geo, mysample_mb)
mysamples$design <- rep(c("Geometric", "Model-based"), each = 9)

ggplot(mysamples) +
 geom_point(aes(x = PC1, y = PC2), shape = 2, size = 3) +
 geom_point(data = ccd_df, mapping = aes(x = x1, y = x2), size = 3) +
 scale_x_continuous(name = "PC1") +
 scale_y_continuous(name = "PC2") +
 facet_wrap(~ design) +
 coord_fixed()
```

Figure \@ref(fig:CCRSDSample) shows the two optimised CCRSD samples plotted in geographical space on the first standardised principal component scores.

```{r CCRSDSample, echo = FALSE, out.width = "100%", fig.cap = "CCRSD sample from the Cotton Research Farm, optimised with the geometric and the model-based criterion, plotted on a map of the first standardised principal component (PC1)."}

ggplot(mysamples) +
  geom_raster(data = grdCRF, mapping = aes(x = x / 1000, y = y / 1000, fill = PC1)) +
  geom_point(data = mysamples, mapping = aes(x = x / 1000, y = y / 1000), size = 2) +
  scale_fill_viridis_c(name = "PC1") +
  scale_x_continuous(name = "Easting (km)") +
  scale_y_continuous(name = "Northing (km)") +
  facet_wrap(~ design, ncol = 1, nrow = 2) +
  coord_fixed()
```

## Increasing the sample size {#IncreaseSampleSize}

Nine points are rather few for fitting a polynomial regression model, especially for a second-order polynomial with interaction. Therefore, in experiments often multiple observations are done for each design point. Increasing the sample size of a response surface sample in observational studies is not straightforward. The challenge is to avoid spatial clustering of sampling points. A simple solution is to select multiple points from each subset of candidate sampling units. The success of this solution depends on how strong candidate sampling units are spatially clustered. For the Cotton Research Farm for most design points the candidate sampling units are not in one spatial cluster; so in this case, the solution may work properly. I increased the number of candidate sampling units per design point to 16, so that there is a larger choice in the optimisation of the sampling pattern. 

```{r}
candi_all <- NULL
for (i in seq_len(nrow(ccd_df))) {
    d2dpnt <- sqrt((grdCRF$PC1 - ccd_df$x1[i])^2 +
      (grdCRF$PC2 - ccd_df$x2[i])^2)
    grdCRF <- grdCRF[order(d2dpnt), ]
    candi <- grdCRF[c(1:16), c("point_id", "x", "y", "PC1", "PC2")]
    candi$dpnt <- i
    candi_all <- rbind(candi_all, candi)
}
```

A stratified simple random subsample of two points per stratum is selected, which serves as an initial sample.

```{r}
set.seed(314)
units_stsi <- sampling::strata(
  candi_all, stratanames = "dpnt", size = rep(2, 9))
mysample0 <- getdata(candi_all, units_stsi) %>%
  dplyr::select(-(ID_unit:Stratum))
```

The data frame with the design points must be doubled. Note that the order of the design points must be equal to the order in the stratified subsample.

```{r}
tmp <- data.frame(ccd_df, dpnt = 1:9)
ccd_df2 <- rbind(tmp, tmp)
ccd_df2 <- ccd_df2[order(ccd_df2$dpnt), ]
```

```{r, echo = FALSE, eval = FALSE}
set.seed(314)
mySRSsample <- anneal(
  mysample = mysample0,
  candidates = candi_all,
  dpnt = ccd_df2,
  phi = 50,
  T_ini = 1,
  coolingRate = 0.9,
  maxPermuted = 25 * nrow(mysample0),
  maxNoChange = 20,
  verbose = TRUE
)
write_rds(mySRSsample, file = "results/SpatialResponseSurfaceSample_2n_CRF_mb.rds")
```

Figures \@ref(fig:CCRSDUzbekistan2n) and \@ref(fig:CCRSDinPCSpace2n) show the optimised CCRSD sample of 18 points in geographical and principal component space, respectively, obtained with the model-based criterion, an effective range of 150 m, and zero weight for the penalty term. Sampling points are not spatially clustered, so I do not expect violation of the assumption of independent residuals. In principal component space, all points are pretty close to the design points, except for the four design points in the lower right corner, where no candidate units near these design points are available.

```{r CCRSDUzbekistan2n, echo = FALSE, out.width = "100%", fig.cap = "CCRSD sample with two points per design point, from the Cotton Research Farm, plotted on a map of the first standardised principal component (PC1)."}
mySRSsample <- read_rds(file = "results/SpatialResponseSurfaceSample_2n_CRF_mb.rds")
mysample <- mySRSsample$mysample
ggplot(grdCRF) +
  geom_raster(aes(x = x / 1000, y = y / 1000, fill = PC1)) +
  geom_point(data = mysample, mapping = aes(x = x / 1000, y = y / 1000), size = 2) +
  scale_fill_viridis_c(name = "PC1") +
  scale_x_continuous(name = "Easting (km)") +
  scale_y_continuous(name = "Northing (km)") +
  coord_fixed()
```

```{r CCRSDinPCSpace2n, echo = FALSE, fig.width = 5, fig.cap = "CCRSD sample (triangles) with two points per design point (dots), optimised with model-based criterion, plotted in the space spanned by the two standardised principal components."}
mySRSsample <- read_rds(file = "results/SpatialResponseSurfaceSample_2n_CRF_mb.rds")
mysample <- mySRSsample$mysample
ggplot(mysample) +
 geom_point(data = ccd_df, aes(x = x1, y = x2), size = 3) +
 geom_point(aes(x = PC1, y = PC2), size = 3, shape = 2) +
 scale_x_continuous(name = "PC1") +
 scale_y_continuous(name = "PC2") +
 coord_fixed()
```

## Stratified spatial response surface sampling

The sample size can also be increased by stratified spatial response surface sampling\index{Stratified spatial response surface sampling}. The strata are subareas of the study area. When the subsets of candidate sampling units for some design points are strongly spatially clustered, the final optimised sample obtained with the method of the previous section may also show strong spatial clustering. An alternative is then to split the study area into two or more subareas (strata) and to select from each stratum candidate sampling units. This guarantees that for each design point we have at least as many spatial clusters of candidate units as we have strata. 

```{block2, type = 'rmdnote'}
The spatial strata are not used for fitting separate regression models. All data are used to fit one (second-order) polynomial regression model.
```

Figure \@ref(fig:StrataCRF4CCRSD) shows two subareas used as strata in stratified response surface sampling of the Cotton Research Farm.

```{r StrataCRF4CCRSD, echo = FALSE, out.width = "100%", fig.cap = "Two subareas of the Cotton Research Farm used as strata in stratified CCRSD sampling."}
ggplot(grdCRF) +
  geom_raster(aes(x = x / 1000, y = y / 1000, fill = as.factor(subarea))) +
  scale_fill_grey(name = "Stratum", start = 0.6, end = 0.8) +
  scale_x_continuous(name = "Easting (km)") +
  scale_y_continuous(name = "Northing (km)") +
  coord_fixed()
```

The candidate sampling units are selected in a double for-loop. The outer loop is over the strata, the inner loop over the design points. Note that variable `dpnt` continues to increase by 1 after the inner loop over the nine design points in subarea 1 is completed, so that variable `dpnt` (used as a stratification variable in subsampling the sample of candidate sampling points) now has values $1,2, \dots , 18$.  An equal number of candidate sampling points per design point in both strata (eight points) is selected by sorting the points of a stratum by the distance to a design point using function `order`. Figure \@ref(fig:candidatesSTResponse) shows the candidate sampling points for stratified CCRSD sampling.

```{r}
candi_all <- NULL
for (h in c(1, 2)) {
  data_stratum <- grdCRF %>%
    filter(subarea == h)
  candi_stratum <- NULL
  for (i in seq_len(nrow(ccd_df))) {
      d2dpnt <- sqrt((data_stratum$PC1 - ccd_df$x1[i])^2 +
                     (data_stratum$PC2 - ccd_df$x2[i])^2)
      data_stratum <- data_stratum[order(d2dpnt), ]
      candi <- data_stratum[c(1:8),
          c("point_id", "x", "y", "PC1", "PC2", "subarea")]
      candi$dpnt <- i + (h - 1) * nrow(ccd_df)
      candi_stratum <- rbind(candi_stratum, candi)
  }
  candi_all <- rbind(candi_all, candi_stratum)
}
```

```{r candidatesSTResponse, echo = FALSE, out.width = "100%", fig.cap = "Candidate sampling points for stratified CCRSD sampling, plotted on a map of the first principal component (PC1)."}
subarea <- as.factor(c(rep(1, nrow(candi_all) / 2), rep(2, nrow(candi_all) / 2)))
dpnt9 <- c(rep(1:9, each = 8), rep(1:9, each = 8))
ggplot(grdCRF) +
  geom_raster(aes(x = x / 1000, y = y / 1000, fill = PC1)) +
  geom_point(data = candi_all, mapping = aes(x = x / 1000, y = y / 1000, shape = as.factor(dpnt9)), size = 1.5) +
  scale_shape_manual(values = c(0, 1, 2, 3, 4, 5, 6, 7, 8), name = "Design pnt") +
  scale_fill_viridis_c(name = "PC1") +
  scale_x_continuous(name = "Easting (km)") +
  scale_y_continuous(name = "Northing (km)") +
  coord_fixed()
```

As before, `dpnt` is used as a stratum identifier to subsample the candidate sampling units. Finally, the number of rows in `data.frame` `ccd_df` with the design points is doubled.

```{r}
set.seed(314)
units_stsi <- sampling::strata(
  candi_all, stratanames = "dpnt", size = rep(1, 18))
mysample0 <- getdata(candi_all, units_stsi) %>%
  dplyr::select(-(ID_unit:Stratum))
ccd_df2 <- rbind(ccd_df, ccd_df)
```


```{r, echo = FALSE, eval = FALSE}
set.seed(314)
mySRSsample <- anneal(
  mysample = mysample0,
  candidates = candi_all,
  dpnt = ccd_df2,
  weight = 5,
  phi = 50,
  T_ini = 1,
  coolingRate = 0.9,
  maxPermuted = 25 * nrow(mysample0),
  maxNoChange = 20,
  verbose = TRUE
)
#write_rds(mySRSsample, file = "results/StratifiedSpatialResponseSurfaceSample_CRF_mb.rds")
write_rds(mySRSsample, file = "results/StratifiedSpatialResponseSurfaceSample_CRF_mb_w5.rds")
```

Figures \@ref(fig:StratifiedCCRSD) and \@ref(fig:StratifiedCCRSDinPCSpace) show the optimised sample of 18 points in geographical and principal component space, obtained with the model-based criterion with an effective range of 150 m. The pattern in the principal component space is worse compared to the pattern in Figure \@ref(fig:CCRSDinPCSpace2n). In stratum 1, the distance to the star point at the top and the upper left and upper right cube points is very large. In this stratum no population units are present that are close to these design points. By adding a penalty term to the minimisation criterion that is proportional to the distance to the design points, the distance is somewhat decreased, but not really for the three design points mentioned above (Figure \@ref(fig:CCRSDinPCSpace2n)). Also note the spatial cluster of three sampling units in Figure \@ref(fig:StratifiedCCRSD) obtained with a weight equal to 5.

```{r StratifiedCCRSD, echo = FALSE, out.width = "100%", fig.cap = "Stratified CCRSD samples from the Cotton Research Farm, optimised with the model-based criterion, obtained without (weight = 0) and with penalty (weight = 5) for a large average distance to design points."}
mySRSsample <- read_rds(file = "results/StratifiedSpatialResponseSurfaceSample_CRF_mb.rds")
mysample_w0 <- mySRSsample$mysample

mySRSsample <- read_rds(file = "results/StratifiedSpatialResponseSurfaceSample_CRF_mb_w5.rds")
mysample_w5 <- mySRSsample$mysample
mysamples <- rbind(mysample_w0, mysample_w5)
mysamples$weight <- rep(c("Weight: 0", "Weight: 5"), each = 18)

ggplot(mysamples) +
  geom_raster(data = grdCRF, mapping = aes(x = x / 1000, y = y / 1000, fill = PC1)) +
  geom_point(data = mysamples, mapping = aes(x = x / 1000, y = y / 1000), size = 2) +
  scale_fill_viridis_c(name = "PC1") +
  scale_x_continuous(name = "Easting (km)") +
  scale_y_continuous(name = "Northing (km)") +
  facet_wrap(~ weight, ncol = 1, nrow = 2) +
  coord_fixed()
```

```{r StratifiedCCRSDinPCSpace, echo = FALSE, out.width = "100%", fig.cap = "Principal component scores of the stratified CCRSD sample, optimised with the model-based criterion, obtained without (weight = 0) and with penalty (weight = 5) for a large average distance to design points (dots)."}
mySRSsample <- read_rds(file = "results/StratifiedSpatialResponseSurfaceSample_CRF_mb.rds")
mysample_w0 <- mySRSsample$mysample
mysample_w0$subarea <- as.factor(rep(c(1, 2), each = 9))

mySRSsample <- read_rds(file = "results/StratifiedSpatialResponseSurfaceSample_CRF_mb_w5.rds")
mysample_w5 <- mySRSsample$mysample
mysample_w5$subarea <- as.factor(rep(c(1, 2), each = 9))

mysamples <- rbind(mysample_w0, mysample_w5)
mysamples$weight <- rep(c("Weight: 0", "Weight: 5"), each = 18)

ggplot(mysamples) +
 geom_point(data = ccd_df, mapping = aes(x = x1, y = x2), size = 2) +
 geom_point(aes(x = PC1, y = PC2, shape = subarea), size = 2) +
 scale_shape_manual(values = c(2, 3), name = "Stratum") +
 scale_x_continuous(name = "PC1") +
 scale_y_continuous(name = "PC2") +
 facet_wrap(~ weight) +
 coord_fixed()
```

## Mapping

Once the data are collected, the study variable is mapped by fitting a multiple linear regression model using the two covariates, in our case the two EM measurements, as predictors. The fitted model is then used to predict the study variable for all unsampled population units. 

The value of the study variable at an unsampled prediction location $\mathbf{s}_0$ is predicted by
\begin{equation}
\widehat{Z}(\mathbf{s}_0) = \mathbf{x}_0 \hat{\pmb{\beta}}_{\text{OLS}}\;,
(\#eq:ZpredMLR)
\end{equation}
with $\mathbf{x}_0$ the ($p+1$)-vector with covariate values at prediction location $\mathbf{s}_0$ and 1 in the first entry ($p$ is the number of covariates) and $\hat{\pmb{\beta}}$ the vector with ordinary least squares estimates\index{Ordinary least squares} of the regression coefficients:
\begin{equation}
\hat{\pmb{\beta}}_{\text{OLS}} = (\mathbf{X}^{\mathrm{T}}\mathbf{X})^{-1} (\mathbf{X}^{\mathrm{T}}\mathbf{z})\;,
(\#eq:betaOLS)
\end{equation}
with $\mathbf{X}$ the $(n \times (p+1))$ matrix with covariate values and ones in the first column ($n$ is the sample size, and $p$ is the number of covariates) and $\mathbf{z}$ the $n$-vector with observations of the study variable.

```{block2, type = 'rmdnote'}
Although the principal component scores are used to select the sampling locations, there is no need to use these scores as predictors in the linear regression model. When all principal components derived from the covariates are used as predictors, the predicted values and standard errors obtained with the model using the principal components as predictors are equal to those obtained with the model using the covariates as predictors.
```

The variance of the prediction error can be estimated by
\begin{equation}
\widehat{V}(\widehat{Z}(\mathbf{s}_0)) = \hat{\sigma}^2_{\epsilon}(1+\mathbf{x}_0^{\mathrm{T}}(\mathbf{X}^{\mathrm{T}}\mathbf{X})^{-1}\mathbf{x}_0)\;,
(\#eq:VarZpredMLR)
\end{equation}
with $\hat{\sigma}^2_{\epsilon}$ the estimated variance of the residuals.

In **R** the model can be calibrated with function `lm`, and the predictions can be obtained with function `predict`. The standard errors of the estimated means can be obtained with argument `se.fit = TRUE`. The variances of Equation \@ref(eq:VarZpredMLR) can be computed by squaring these standard errors and adding the squared value of the estimated residual variance, which can be extracted with `sigma()`.

```{r, eval = FALSE}
mdl <- lm(lnECe ~ lnEM100cm + lnEM50cm, data = mysample)
zpred <- predict(mdl, newdata = grdCRF, se.fit = TRUE)
v_zpred <- zpred$se.fit^2+sigma(mdl)^2
```

The assumption underlying Equations \@ref(eq:betaOLS) and \@ref(eq:VarZpredMLR) is that the model residuals are independent. We assume that all the spatial structure of the study variable is explained by the covariates. Even the residuals at two locations close to each other are assumed to be uncorrelated. A drawback of the spatial response surface design is that it is hard or even impossible to check this assumption, as the sampling locations are spread  throughout the study area. If the residuals are not independent, the covariance of the residuals can be accounted for by generalised least squares estimation of the regression coefficients (Equation \@ref(eq:betaGLS)). The study variable can then be mapped by kriging with an external drift (Section \@ref(IntroKED)). However, this requires an estimate of the semivariogram of the residuals (Section \@ref(ResidualVariogram)). 


```{r, echo = FALSE}
rm(list = ls())
```