Week 13 - Non-linear models 1

Tue, Nov 10, 2020

In this exercise you will fit a non-linear model to some data simulated from information in Cousens (1985) paper on wheat yields in competition with barley. Unfortunately Cousens only reported the mean values and the residual sums of squares from the non-linear fits. So I have attempted to simulate data that has (at least approximately) the right means and variances. The script that simulates the data can be downloaded here, and the necessary data files are here and here. You can skip all that and just get the file with the simulated results here.

library(tidyverse)
cousens <- read_csv("data/cousenssim.csv")
## lot of variables focus on wheat yield
ggplot(data = cousens, aes(x=barley_sr, y=wheat_yield)) + 
  geom_point() +facet_wrap(~wheat_sr)

There are missing values because the rows with wheat seed rate of 0 have no wheat yield. Cousens proposed this model for a constant crop density

\[ Y = Y_{wf}\left[1 - \frac{iD}{100(1 + iD/a)}\right] \]

where \(Y\) is yield per unit area, \(Y_{wf}\) is the weed free yield, \(D\) is the density of the weed, \(i\) and \(a\) are parameters to be estimated. First try this model for a single seed rate of wheat. The hard part is finding good starting values. I will use the function curve() to get a sense of how the curve changes as I change the parameters.

ywf = 2.5
i = 1
a = 100
curve(ywf * (1 - (i*x) / (100*(1 + i * x/a))),
      from = 0, to = 400)

Play around with each of the parameters, raising and lowering it. Describe how the curve changes with each parameter.

wheat_200 <- filter(cousens, wheat_sr == 200)
wheat_200_nl <- nls(wheat_yield ~ ywf * (1 - (i*barley_sr) / (100*(1 + i * barley_sr/a))), 
    data = wheat_200, 
    start = list(ywf = 2.5, i = 1, a = 100))
summary(wheat_200_nl)

Try at least two different starting values for one of the parameters. Compare the outputs. Does the fitting algorithm end up in the same place?

We can make predictions from these fitted models just as with normal models, and get residuals etc.

# residual plot
w_200_resids <- broom::augment(wheat_200_nl)
ggplot(w_200_resids, aes(.fitted, .resid)) + geom_point() + geom_smooth()

Now add the predicted line to the plot above.

nd = data_frame(barley_sr = 0:400)
wheat_200_pred <- broom::augment(wheat_200_nl, newdata = nd)
ggplot(data = wheat_200, aes(x=barley_sr, y=wheat_yield)) + 
  geom_point() +
  geom_line(data = wheat_200_pred, 
            mapping = aes(x = barley_sr, y=.fitted))

Fit this model to a different wheat seed rate. Compare the estimated coefficients. Did they change in the way you expected?

Cousens goes on to fit models directly to yield, rather than percentage loss, and add crop density to enable fitting the entire model in one hit. He fit several different models but the best one was

\[ Y = \frac{aC}{1 + bC + fD} \]

where \(C\) is the density of the crop, and \(a\), \(b\) and \(f\) are parameters to be estimated. Again the hurdle is figuring out good starting values. I’ll plot the yield against wheat seed rate to get an idea about \(a\), the asymptotic yield when there is no barley.

ggplot(data = cousens, aes(x=wheat_sr, y=wheat_yield)) + 
  geom_point() +facet_wrap(~barley_sr)

Those parameters are now per seed, so 3 / 400 = 0.0075 seems like a good starting point for a

wheat_nl <- nls(wheat_yield ~ a * wheat_sr / (1 + b * wheat_sr + f * barley_sr), 
    data = cousens, 
    start = list(a = 3/400, f = .001, b = 0.001))
summary(wheat_nl)

Figuring out what those parameters mean is always a good trick! I think we can interpret \(f\) and \(b\) as per-seed competition coefficients for inter- and intra specific competition, respectively. So an additional barley seed has about 3 x the competitive effect of adding another wheat seed. Taking the limit as C goes to infinity for a fixed value of D, we get \(Y_{max} = a/b\), or about 3 kg.

Make a plot of the raw data, add the predicted lines for the model with both wheat and barley.

Optional

Cousens also fit this model

\[ Y = \frac{aC}{(1 + bC + fD)^g} \]

which has one more parameter. When \(g=1\) you get the same model as before. Fitting this model is a bit tricky. I had to switch to a different algorithm and impose lower limits of 0 on all the parameters.

wheat_nl2 <- nls(wheat_yield ~ a * wheat_sr / (1 + b * wheat_sr + f * barley_sr)^g, 
    data = cousens, 
    start = list(a = 0.1, b = 0.1, f = .1, g=1), # trace = TRUE,
    algorithm = "port", lower = c(0,0,0,0))
summary(wheat_nl2)

Compare the parameters with the 3 parameter model. Take a look at the estimate for \(g\), can we conclude that it is different from 1? Compare the residual standard errors – is the extra parameter worth it? What about comparing by the use of AIC?

Week 13 - Non-linear models 1

Optional

Andrew Tyre

Professor of Wildlife Ecology