This is a subset of the data of the grassland birds I caught and banded in South Dakota.
Record of bird species caught, where they were caught, and their measurements.
Variables: species (using four letter code), age (binary, hatch-year/after hatch-year), mass (in grams), wing (length mm), tarsus (length mm), fat (score on scale of 1-5), location, taxon, foraging (how the bird forages), temp (temperature in F), season ( binary, spring/fall), year (binary 2018, 2019), time
install.packages("MuMIn")
#Stands for Multi-Model Inference
#Calculates AICc scores
#Automated model generation
install.packages("AICcmodavg")
#stands for AICc model averaging
#Does other things, but we will primarily use it to make pretty AIC tables
library(MuMIn)
library(AICcmodavg)
glm(formula, data, family = gaussian,…)
Formula:
follows Y=B0+B1X
in R: dependent variable ~ independent variables
Data:
name of data frame
Family:
in R: family(link= “link_name”)
Options: gaussian, binomial, gamma, poisson
#simple way to get rid of NA's
##glm will automatically drop NA's in independent variables, but not dependent
banding_na<-na.omit(banding)
glm_example<- glm( #glm function to run model
mass~ #dependent variables that you want to predict
species+fat+wing*tarsus, #independent variables that you want to use to predict
## * indicates interaction
data = banding_na, #reference data set
family=gaussian) #reference family name (here is where you would also put your link function)
#Trick: turn your binomial variable into 1's and 0's
banding$location_numeric<-ifelse( #name a new variable, and use if else function
banding$location=="Forb",0,1) #if location equals "Forb" turn into a 0 else turn it into a 1
#binomial glm, same as gaussian, but change family name
Band_binomial<- glm(location_numeric~temp, data= banding, family=binomial)
#not very useful, gives you intercepts for each variable
glm_example
##
## Call: glm(formula = mass ~ species + fat + wing * tarsus, family = gaussian,
## data = banding_na)
##
## Coefficients:
## (Intercept) speciesAMRO speciesATSP speciesCOYE speciesDICK speciesHASP
## 12.17082 38.12907 0.75373 -1.09128 9.66889 13.00497
## speciesLCSP speciesLISP speciesSAVS speciesSOSP speciesSWSP speciesYEWA
## 1.33343 2.79863 2.27142 5.96914 2.84455 -3.37180
## fat wing tarsus wing:tarsus
## 0.88363 -0.07958 -0.53147 0.01256
##
## Degrees of Freedom: 609 Total (i.e. Null); 594 Residual
## Null Deviance: 32420
## Residual Deviance: 810.7 AIC: 1939
#nicer, but this is not a t-test
summary(glm_example)
##
## Call:
## glm(formula = mass ~ species + fat + wing * tarsus, family = gaussian,
## data = banding_na)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -4.5545 -0.7099 -0.0249 0.6238 10.8800
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12.170819 6.211964 1.959 0.05055 .
## speciesAMRO 38.129072 3.506298 10.874 < 2e-16 ***
## speciesATSP 0.753735 0.459969 1.639 0.10181
## speciesCOYE -1.091278 0.677540 -1.611 0.10779
## speciesDICK 9.668891 0.816786 11.838 < 2e-16 ***
## speciesHASP 13.004969 0.809828 16.059 < 2e-16 ***
## speciesLCSP 1.333433 0.634567 2.101 0.03603 *
## speciesLISP 2.798630 0.543061 5.153 3.49e-07 ***
## speciesSAVS 2.271416 0.475647 4.775 2.26e-06 ***
## speciesSOSP 5.969137 0.584056 10.220 < 2e-16 ***
## speciesSWSP 2.844550 0.640856 4.439 1.08e-05 ***
## speciesYEWA -3.371800 0.541331 -6.229 8.91e-10 ***
## fat 0.883632 0.069686 12.680 < 2e-16 ***
## wing -0.079582 0.088304 -0.901 0.36783
## tarsus -0.531473 0.278193 -1.910 0.05656 .
## wing:tarsus 0.012556 0.003893 3.225 0.00133 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for gaussian family taken to be 1.364891)
##
## Null deviance: 32423.82 on 609 degrees of freedom
## Residual deviance: 810.75 on 594 degrees of freedom
## AIC: 1938.6
##
## Number of Fisher Scoring iterations: 2
#Heck yeah! P-values, but have to go to one of above summaries to get intercept estimates
anova(glm_example, test="F")
## Analysis of Deviance Table
##
## Model: gaussian, link: identity
##
## Response: mass
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev F Pr(>F)
## NULL 609 32424
## species 11 31170.6 598 1253 2076.131 < 2.2e-16 ***
## fat 1 200.7 597 1052 147.076 < 2.2e-16 ***
## wing 1 188.7 596 864 138.225 < 2.2e-16 ***
## tarsus 1 38.9 595 825 28.474 1.353e-07 ***
## wing:tarsus 1 14.2 594 811 10.403 0.001327 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
In regular R, you can get AIC scores, but not AICc scores :(
Both MuMIn and AICcmodavg have an AICc function
AIC(glm_example)
## [1] 1938.647
AICc(glm_example)
## [1] 1939.681
(Warning: may anger statisticians, see step command for step wise model selection)
We want to create an optimal model to predict mass in birds
#create a large model with thoughtful variables
band_model<- glm(mass~species+tarsus*wing+fat, data = banding_na, family=gaussian,
na.action = na.fail) #have to include line to knit, but it works better if you run code without na.action
#modeling dredging
AICc_band_models <- dredge( #construct all possible models
band_model, #use band model as a reference
rank = "AICc", #use AICc scores to compare
fixed = "species") #list variables that you want to include in every model
## use NULL if you don't want to include
#Make a list of all the models
model_list <- get.models(AICc_band_models, #retrieve models from dredged data
subset = TRUE) #select which models to retrieve
##1:10 gives you top 10 models
## subset = TRUE gives you all the models
model_list[1] #check out info from a single model
## $`16`
##
## Call: glm(formula = mass ~ fat + tarsus + wing + tarsus:wing + 1 +
## species, family = gaussian, data = banding_na, na.action = na.fail)
##
## Coefficients:
## (Intercept) fat tarsus wing speciesAMRO speciesATSP
## 12.17082 0.88363 -0.53147 -0.07958 38.12907 0.75373
## speciesCOYE speciesDICK speciesHASP speciesLCSP speciesLISP speciesSAVS
## -1.09128 9.66889 13.00497 1.33343 2.79863 2.27142
## speciesSOSP speciesSWSP speciesYEWA tarsus:wing
## 5.96914 2.84455 -3.37180 0.01256
##
## Degrees of Freedom: 609 Total (i.e. Null); 594 Residual
## Null Deviance: 32420
## Residual Deviance: 810.7 AIC: 1939
#Make a messy but informative AICc table
AICc_table_band<-model.sel(model_list)
#Trick: create a list of model names
model_name_list<-NULL #make an empty list
for (i in 1:10){
model_name_list = c(model_name_list, as.character(model_list[[i]][['formula']]))
} #loop through model output to extract formula for each model
model_name_listb <- model_name_list[seq(3, length(model_name_list), 3)] #select every third element from list and put it in a new list
modavg_table<-aictab(model_list, #make a table with models from your list
modnames = model_name_listb, #label the models with your names list
second.ord = TRUE, #Use AICc (FALSE gives you AIC)
sort = TRUE) #Order based on model weight
#View table
modavg_table
##
## Model selection based on AICc:
##
## K AICc Delta_AICc AICcWt
## fat + tarsus + wing + tarsus:wing + 1 + species 17 1939.68 0.00 0.99
## fat + tarsus + wing + 1 + species 16 1948.15 8.47 0.01
## fat + wing + 1 + species 15 1974.13 34.45 0.00
## fat + tarsus + 1 + species 15 2042.99 103.31 0.00
## tarsus + wing + tarsus:wing + 1 + species 16 2083.69 144.01 0.00
## tarsus + wing + 1 + species 15 2089.18 149.50 0.00
## fat + 1 + species 14 2092.53 152.85 0.00
## wing + 1 + species 14 2112.01 172.33 0.00
## tarsus + 1 + species 14 2154.49 214.81 0.00
## 1 + species 13 2196.92 257.24 0.00
## Cum.Wt LL
## fat + tarsus + wing + tarsus:wing + 1 + species 0.99 -952.32
## fat + tarsus + wing + 1 + species 1.00 -957.62
## fat + wing + 1 + species 1.00 -971.66
## fat + tarsus + 1 + species 1.00 -1006.09
## tarsus + wing + tarsus:wing + 1 + species 1.00 -1025.39
## tarsus + wing + 1 + species 1.00 -1029.19
## fat + 1 + species 1.00 -1031.91
## wing + 1 + species 1.00 -1041.65
## tarsus + 1 + species 1.00 -1062.89
## 1 + species 1.00 -1085.15
Create a glm model with at least three independent variables using the banding data set or another data set of your choice
Determine which independent variables contribute to an optimal model using AICc
Create a AICc table using AICctab to show your results
?AICtab
# gives 10 suggestions for model selection