Generalized Linear Models
This week we’ll look at breaking the assumptions that the error distribution is normal and that the relationship between E(Y) and BX is linear.
Learning Objectives
Use the inverse link function to interpret coefficients in a GLM.
Recognize binomial and Poisson distributed responses.
Evaluate the goodness of fit of a GLM by examining residual plots and the residual deviance.
Monday
Optional online help session 10-11, read, watch videos, homework
In this exercise we’ll practice fitting GLMs and making predictions. The data is hanson_birds in package NRES803. This data set was collected by Andrea Hanson during her MSc project in the School of Natural Resources. She sampled 14 transects in CRP fields scattered across SE Nebraska 4 times throughout the summer. She measured Visual Obstruction Readings (VOR) at 12 spots along each transect; these were averaged to obtain a single number representing vegetation structure within the CRP field.
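As a warm-up for fitting a GLM and making predictions, here is a minimal sketch on simulated data. The data frame `sim` and its columns `vor` and `counts` are made up for illustration; they are not the columns of hanson_birds, and a Poisson response is only an assumption here.

```r
set.seed(803)

# Hypothetical data: a count response that increases with a vegetation score
n <- 200
vor <- runif(n, 0, 10)
counts <- rpois(n, lambda = exp(0.5 + 0.2 * vor))
sim <- data.frame(vor = vor, counts = counts)

# Poisson GLM with the default log link
fit <- glm(counts ~ vor, family = poisson(link = "log"), data = sim)

# Coefficients are on the link (log) scale; exponentiating gives the
# multiplicative change in the expected count per unit of vor
exp(coef(fit))

# Predictions on the link scale and on the response scale
newdat <- data.frame(vor = c(2, 5, 8))
eta <- predict(fit, newdata = newdat, type = "link")
mu <- predict(fit, newdata = newdat, type = "response")

# The inverse link (exp for a log link) maps one onto the other
all.equal(exp(eta), mu)

# Rough goodness-of-fit check: residual deviance relative to residual df
deviance(fit) / df.residual(fit)
```

A ratio of residual deviance to residual degrees of freedom well above 1 would hint at overdispersion, which is worth checking alongside the residual plots.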
Generalized Additive Models
This week we’ll look at breaking the assumption that the covariates are linearly related to the link function of E(Y).
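To see what relaxing that assumption looks like in practice, here is a rough sketch comparing a GLM with a straight-line term to a GAM with a penalized smooth from mgcv. The data are simulated, and the names `sim`, `x`, and `y` are hypothetical.

```r
library(mgcv)
set.seed(803)

# Hypothetical data: the log of the expected count is a sine curve in x,
# so a straight line on the link scale is the wrong shape
n <- 200
x <- runif(n, 0, 1)
y <- rpois(n, lambda = exp(1 + sin(2 * pi * x)))
sim <- data.frame(x = x, y = y)

# GLM: linear in x on the link scale
fit_glm <- glm(y ~ x, family = poisson, data = sim)

# GAM: penalized smooth of x, smoothness chosen by REML
fit_gam <- gam(y ~ s(x), family = poisson, data = sim, method = "REML")

# Effective degrees of freedom of the smooth; values near 1 mean the
# smooth has been penalized back to (almost) a straight line
summary(fit_gam)$edf

# Compare the two fits
AIC(fit_glm, fit_gam)
```

If the effective degrees of freedom came out close to 1, sticking with the GLM would be the simpler choice.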
Learning Objectives
Detect violations of the linear assumption in residual plots
Use a penalized smooth spline term to fit an arbitrary non-linear function
Decide what type of model to use based on the properties of the data
Monday
Here’s a handout that I call putting it all together.
In this lab we’re going to use a subset of data from Sikkink et al. (2007; Chapter 26 in AED) on grassland species richness in Yellowstone National Park. The data were measured on 8 different transects in 8 years between 1958 and 2002. Not every transect was measured in every year, so the intervals between samples vary from 4 to 11 years in duration. According to Zuur et al. (2009) the Richness variable is the Beta diversity - the number of species unique to that site.
The discipline of landscape ecology frequently postulates that the spatial pattern of habitat is important, in addition to local characteristics such as patch area, vegetation type, and climate. Westphal et al. (2003) analyzed data from the South Australian Bird Atlas using a series of landscape pattern metrics estimated at 3 spatial scales. They concluded that landscape structure had a positive effect on many bird species. However, this dataset was never designed to be analyzed using logistic regression, and consequently their conclusions were somewhat weak, and badly compromised by model selection uncertainty.
Putting it all together
Reinforcing and reviewing all the bits we’ve done up to now.
Learning Objectives
Monday
Here’s a handout that I call putting it all together. I’m not sure when the best time is to give you this handout, but now seems to be a good time. In the first part, which deals with deciding what sort of models to fit, when you get to step 6 just answer NO and you’ll be good.
We’ll play with a few datasets from Mick Crawley’s “The R Book” to see how to identify when a GAM can be useful, and when to stick with a GLM. The first dataset is of population sizes of Soay Sheep.
library(NRES803)
library(tidyverse)
library(mgcv)
library(broom)
library(GGally)
library(gridExtra)
# need this to make residual plots of gam models
bollocks.augment <- function(model) {
  r <- model.frame(model)
  r$.fitted <- fitted(model)
  r$.resid <- resid(model)
  r$.
This week we’ll start breaking the assumption that the observations are completely independent!
Learning Objectives Recognize design scenarios resulting in non-independent data
Classify possible covariates as fixed or random
Estimate and interpret a mixed effects model with normally distributed data.
Monday Optional online help session 10-11, read, watch video, homework
Wednesday
Before Wednesday’s class
Read AED Chapter 8
Watch mixed model video
In Wednesday’s class
Sleep Study
This is a great dataset to play with. From the help documentation: The average reaction time per day for subjects in a sleep deprivation study. On day 0 the subjects had their normal amount of sleep. Starting that night they were restricted to 3 hours of sleep per night. The observations represent the average reaction time on a series of tests given each day to each subject.
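A minimal sketch of a mixed effects model for these data, assuming the `sleepstudy` data frame that ships with the lme4 package (columns `Reaction`, `Days`, and `Subject`). The random intercept and slope structure is one reasonable choice for illustration, not necessarily the model used in the lab.

```r
library(lme4)

# Repeated measurements on the same subjects, so observations within a
# subject are not independent
data(sleepstudy, package = "lme4")

# Days is a fixed effect; each Subject gets its own random intercept
# and its own random slope for Days
fit <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

# Population-level intercept and slope
fixef(fit)

# Variance components: among-subject variation in intercepts and slopes,
# plus the residual variance
VarCorr(fit)

# Subject-specific deviations from the population-level coefficients
head(ranef(fit)$Subject)
```

Comparing this fit with one that has only a random intercept, `(1 | Subject)`, is one way to explore whether subjects really differ in how fast their reaction times change.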
Learning Objectives
Diagnose convergence failures
Fit and interpret Generalized Linear Mixed Models
Monday
Finishing lab from Week 10
Before Wednesday: Read AED Chapter 23 - example of LMM
Wednesday Look at GLMM output from chapter 23 and discuss GLMM model selection
Friday’s Lab Week 11 Lab instructions, turn it in on Canvas.