# DLNMs: hypothesis tests and p-values

This is part of series about distributed lag non-linear models. Please read the first post for an introduction and a disclaimer.

A major goal of my postdoc project is to determine whether drought has an effect on plant vital rates (growth, survival, reproduction, recruitment).
Getting some measure of statistical significance of drought history in these models is therefore really important for me.
Even with simple linear models, there are multiple ways of doing hypothesis testing, some more “correct” than others.
For example, this recent Twitter discussion about the default behavior of `anova()`

usually being innapropriate:

EEB folks: when did you realize that #rstats uses Type I sums of squares as a default? via @DanielBolnick

— Andrew Hendry (@EcoEvoEvoEco) January 31, 2021

"before" means that you ran analyses without realizing but then changed them (e.g., to Type III) before publishing.

"after" means you published with it & later realized

## P-values for GAMs

This StackExchange answer does a really good job of explaining hypothesis testing with GAMs, and I think this extends to DLNMs fit as GAMs.
Unlike `anova.lm()`

or `summary.lm()`

, which are generally **not** the ones you want, the p-values in `anova.gam()`

and `summary.gam()`

are generally safe to interpret (also, they are exactly the same).
Simon Wood (the author of `mgcv`

) has given a lot of thought and published multiple papers on the calculation of these p-values.^{1}
^{2} For an ordinary penalized smooth (like the default, thin plant regression splines, or the `"cr"`

cubic regression spline basis), the actual wiggliness is lower than the maximum wiggliness (defined by the number of knots). This shrinkage toward a straight line (or a plane in the case of a crossbasis function) is expressed by estimated degrees of freedom (edf). For example, if edf `\(\simeq\)`

1, then the smooth is approaching a straight line. Let’s look at an example.

`library(mgcv)`

`## Loading required package: nlme`

```
##
## Attaching package: 'nlme'
```

```
## The following object is masked from 'package:dplyr':
##
## collapse
```

`## This is mgcv 1.8-33. For overview type 'help("mgcv-package")'.`

`library(dlnm)`

`## This is dlnm 2.4.2. For details: help(dlnm) and vignette('dlnmOverview').`

```
growth <-
gam(log_size_next ~
s(log_size) +
s(spei_history, L, #crossbasis function
bs = "cb",
k = c(3, 24),
xt = list(bs = "cr")),
family = gaussian(link = "identity"),
method = "REML",
data = ha)
```

`anova(growth)`

```
##
## Family: gaussian
## Link function: identity
##
## Formula:
## log_size_next ~ s(log_size) + s(spei_history, L, bs = "cb", k = c(3,
## 24), xt = list(bs = "cr"))
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(log_size) 1.363 1.644 1441.842 <2e-16
## s(spei_history,L) 7.400 8.847 2.906 0.0019
```

The `edf`

for `s(log_size)`

is fairly close to 1, indicating that it might be better modeled as a parametric term (i.e. just a slope).
The edf for the crossbasis function is higher, indicating a more complex surface.
The reference degrees of freedom `Ref.df`

is, I think, another way of calculating the edf, but honestly, the explanation in the help file and associated paper is beyond my understanding.
The test is a modification of a Wald test that can take fractional degrees of freedom (the edf).
The help file indicates that p-values “may be somewhat too low when smoothing parameters are highly uncertain. High uncertainty happens in particular when smoothing parameters are poorly identified, which can occur with nested smooths or highly correlated covariates (high concurvity)”.
This sounds worrying, but I actually don’t think it’s that different than the situation with a linear model.
Highly correlated covariates will *also* give you untrustworthy p-values in an ordinary linear regression, so I’m not sure there’s anything super different here.

## Shrinkage

In the GAM I fit above, the most a term can be penalized to is linear, i.e. edf = 1 (ignore the random effect of plot as it is different).
If I set `select = TRUE`

in the `gam()`

call, it adds a second penalty on the “null space” and allows edf to go to 0, effectively dropping out of the model entirely.
According to the StackOverflow answer, this is currently the best way to get p-values for GAMs.

```
growth_shrink <-
gam(log_size_next ~
s(log_size) +
s(spei_history, L, #crossbasis function
bs = "cb",
k = c(3, 24),
xt = list(bs = "cr")),
family = gaussian(link = "identity"),
method = "REML",
select = TRUE,
data = ha)
```

`anova(growth_shrink)`

```
##
## Family: gaussian
## Link function: identity
##
## Formula:
## log_size_next ~ s(log_size) + s(spei_history, L, bs = "cb", k = c(3,
## 24), xt = list(bs = "cr"))
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(log_size) 1.348 9.000 262.525 < 2e-16
## s(spei_history,L) 4.558 23.000 1.132 2.33e-05
```

All of the edf are smaller, but the `Ref.df`

have gone up, and are now whole numbers.
This is to correct for having done variable selection, I think.
Usually it is a bad idea to do variable selection and then do `Anova()`

on the final model—the p-values will be biased since you’ve already pulled terms out of your model.
So instead of getting estimated reference degrees of freedom, we now get something like the number of knots - 1 (although that’s not exactly what it is for the crossbasis function).

The test is still a null hypothesis test (\(s(x) = 0\)), but now terms are allowed to be dropped from the model completely, if they are not supported by the data.

## Visualizing Shrinkage

I’m going to use the `gratia`

package to plot the smooths from the shrinkage and non-shrinkage versions.

```
library(gratia)
draw(growth)
```

`## Warning: Removed 423 rows containing non-finite values (stat_contour).`

`draw(growth_shrink)`

`## Warning: Removed 423 rows containing non-finite values (stat_contour).`

There’s really no difference in the shape of `s(log_size)`

meaning that the relationship really is log-linear, but that the term *should* stay in the model.
The surface for the lagged drought effect is similar in shape, but slightly *flatter* in the shrinkage penalized version, just as we’d expect from the edf being lower.

Wood SN (2013) On p-values for smooth components of an extended generalized additive model. Biometrika 100:221–228 . https://doi.org/10.1093/biomet/ass048↩︎

Marra G, Wood SN (2011) Practical variable selection for generalized additive models. Computational Statistics & Data Analysis 55:2372–2387 . https://doi.org/10.1016/j.csda.2011.02.004↩︎