Getting started with the glmmTMB package
Ben Bolker
October7,2023
1Introduction/quick start
glmmTMB is an R package built on the Template Model Builder automatic differentiation engine,for fitting generalized linear mixed models and exten-sions.
(Not-yet-implemented features are denoted like this)
response distributions:Gaussian,binomial,beta-binomial,Poisson, negative binomial(NB1and NB2parameterizations),Conway-Maxwell-Poisson,generalized Poisson,Gamma,Beta,Tweedie;as well as zero-truncated Poisson,Conway-Maxwell-Poisson,generalized Poisson,and negative binomial;Student t.See?family glmmTMB for a current list including details of parameterizations.
link functions:log,logit,probit,complementary log-log,inverse,iden-tity
zero-inflation with fixed and random-effects components;hurdle models via truncated Poisson/NB
single or multiple(nested or crossed)random effects
offsets
fixed-effects models for dispersion
diagonal,compound-symmetric,or unstructured random effects variance-covariance matrices;first-order autoregressive(AR1)variance struc-tures
1
In order to use glmmTMB effectively you should already be reasonably fa-miliar with generalized linear mixed models(GLMMs),which in turn requires familiarity with(i)generalized linear he special cases of logis-tic,binomial,and Poisson regression)and(ii)‘modern’mixed models(those working via maximization of the marginal likelihood rather than by manipu-lating sums of squares).Bolker et al.(2009)and Bolker(2015)are reasonable starting points in this area(especially geared to biologists and less-technical readers),as are Zuur et al.(2009),Millar(2011),and Zuur et al.(2013).
In order to fit a model in glmmTMB you need to:
specify a model for the conditional effects,in the standard R(Wilkinson-
Rogers)formula notation(see?formula or Section11.1of the Intro-
duction to R.Formulae can also include offsets.
specify a model for the random effects,in the notation that is common
block truncatedto the nlme and lme4packages.Random effects are specified as x|g,
where x is an effect and g is a grouping factor(which must be a factor
variable,or a nesting of/interaction among factor variables).For ex-
ample,the formula would be1|block for a random-intercept model or
time|block for a model with random variation in slopes through time
across groups specified by block.A model of nested random effects
(block within site)could be1|site/block if block labels are reused
across multiple sites,or(1|site)+(1|block)if the nesting structure
is explicit in the data and each level of block only occurs within one
site.A model of crossed random effects(block and year)would be
(1|block)+(1|year).
choose the error distribution by specifying the family(family argu-
ment).In general,you can specify the function(binomial,gaussian,
poisson,Gamma from base R,or one of the options listed at family glmmTMB [nbinom2,beta family(),betabinomial,...])).
choose the error distribution by specifying the family(family argu-
ment).You can choose among
–Distributions defined in base R and documented in?family,ex-
cept the inverse Gaussian and quasi-families
–Distributions defined in?glmmTMB::family glmmTMB
2
optionally specify a zero-inflation model(via the ziformula argument) with fixed and/or random effects
optionally specify a dispersion model with fixed effects
This document was generated using4.3.1and package versions:
##bbmle glmmTMB
## 1.0.25 1.1.8
The current citation for glmmTMB is:
Brooks ME,Kristensen K,van Benthem KJ,Magnusson A,Berg
CW,Nielsen A,Skaug HJ,Maechler M,Bolker BM(2017).“glmmTMB Balances Speed and Flexibility Among Packages for Zero-inflated
Generalized Linear Mixed Modeling.”The R Journal,9(2),378–
400.doi:10.32614/RJ-2017-066.
2Preliminaries:packages and data
Load required packages:
library("glmmTMB")
library("bbmle")##for AICtab
library("ggplot2")
##cosmetic
theme_set(theme_bw()+
theme(panel.spacing=grid::unit(0,"lines")))
The data,taken from Zuur et al.(2009)and ultimately from Roulin and Bersier(2007),quantify the number of negotiations among owlets(owl chicks)in different nests prior to the arrival of a provisioning parent as a function of food treatment(deprived or satiated),the sex of the parent,and arrival time.The total number of calls from the nest is recorded,along with the total brood size,which is used as an offset to allow the use of a Poisson response.
Since the same nests are measured repeatedly,the nest is used as a random effect.The model can be expressed as a zero-inflated generalized linear mixed model(ZIGLMM).
3
Various small manipulations of the data set:(1)reorder nests by mean negotiations per chick,for plotting purposes;(2)add log brood size variable
(for offset);(3)rename response variable and abbreviate one of the input variables.
Owls<-transform(Owls,
Nest=reorder(Nest,NegPerChick),
NCalls=SiblingNegotiation,
FT=FoodTreatment)
(If you were really using this data set you should start with summary(Owls)
to explore the data set.)
We should explore the data before we start to build by plotting it in various ways,but this vignette is about glmmTMB,not about
Now fit some models:
The basic glmmTMB fit—a zero-inflated Poisson model with a single zero-inflation parameter applying to all observations(ziformula~1).(Excluding
zero-inflation is glmmTMB’s default:to exclude it explicitly,use ziformula~0.)
fit_zipoisson<-glmmTMB(NCalls~(FT+ArrivalTime)*SexParent+
offset(log(BroodSize))+(1|Nest),
data=Owls,
ziformula=~1,
family=poisson)
summary(fit_zipoisson)
We can also try a standard zero-inflated negative binomial model;the default is the“NB2”parameterization(variance=µ(1+µ/k):Hardin
and Hilbe(2007)).To use families(Poisson,binomial,Gaussian)that are defined in R,you should specify them as in?glm(as a string referring
to the family function,as the family function itself,or as the result of
a call to the family family="poisson",family=poisson, family=poisson(),and family=poisson(link="log")are all allowed and
all equivalent(the log link is the default for the Poisson family).Some
of the additional families that are not defined in base R(at this point
4
nbinom2and nbinom1)can be specified using the same format.Other-wise,for families that are impleme
nted in glmmTMB but for which glmmTMB does not provide a function,you should specify the family argument as a list containing(at least)the(character)elements family and family=list(family="nbinom2",link="log").(In order to be able to re-trieve Pearson(variance-scaled)residuals from a fit,you also need to specify a variance component;see?family glmmTMB.)
fit_zinbinom<-update(fit_zipoisson,family=nbinom2) Alternatively,we can use an“NB1”fit(variance=ϕµ).
fit_zinbinom1<-update(fit_zipoisson,family=nbinom1) Relax the assumption that total number of calls is strictly proportional to brood using log(brood size)as an offset):
fit_zinbinom1_bs<-update(fit_zinbinom1,
.~(FT+ArrivalTime)*SexParent+
BroodSize+(1|Nest))
Every change we have made so far improves the fit—changing distri-butions improves it enormously,while changing the role of brood size makes only a modest(-1AIC unit)difference:
AICtab(fit_zipoisson,fit_zinbinom,fit_zinbinom1,fit_zinbinom1_bs)
2.1Hurdle models
In contrast to zero-inflated models,hurdle models treat zero-count and non-zero outcomes as two completely separate categories,rather than treating the zero-count outcomes as a mixture of structural and sampling zeros.
glmmTMB includes truncated Poisson and negative binomial familes and hence can fit hurdle models.
5
版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系QQ:729038198,我们将在24小时内删除。
发表评论