Poisson regression is a method for modeling event counts or event rates, such as the number of adverse events of a certain type or the frequency of epileptic seizures during a clinical trial, as a function of a set of covariates. The counts are assumed to follow a Poisson distribution whose mean is modeled as a function of the covariates. The Poisson regression model is a special case of a generalized linear model (GLM) with a log link, which is why Poisson regression may also be called a log-linear model. Consequently, it is often presented as an example in the broader context of GLM theory.
Poisson regression is the simplest regression model for count data. It assumes that each observed count Yi is drawn from a Poisson distribution with conditional mean μi given the covariate vector Xi for case i. The number of events follows the Poisson distribution described below:
P(Y = k) = λ^k e^(-λ) / k!
where
- e is the base of the natural logarithm (e = 2.71828...)
- k is the number of occurrences of an event - the probability of which is given by the function
- k! is the factorial of k
- λ is a positive real number equal to the expected number of occurrences during the given interval. The interval can be a time interval or another offset variable (denominator).
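The probability function defined by the terms above can be sketched in a few lines of Python. This is only an illustration: the function name `poisson_pmf` and the value λ = 2 are assumptions made for the example, not something taken from a particular package.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Probability of observing exactly k events when the expected count is lam (> 0)."""
    if k < 0:
        return 0.0
    # Evaluate lam**k * exp(-lam) / k! on the log scale so large k does not overflow.
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

# Hypothetical example: an expected 2 events per interval.
lam = 2.0
for k in range(6):
    print(k, round(poisson_pmf(k, lam), 4))
```

Working on the log scale is a design choice: it gives the same values as the direct formula but stays stable when k is large.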
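The Poisson distribution assumes that the variance equals the mean. When overdispersion occurs, that is, when the observed variance exceeds the mean, an alternative model with additional free parameters may provide a better fit. For count data, an alternative model based on the negative binomial distribution may be used.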
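A rough first check for overdispersion is to compare the sample variance of the counts with their sample mean, since the two are equal under a Poisson distribution. The sketch below does this for a small made-up data set. The function name `dispersion_index` and the counts are assumptions for illustration. The check also ignores covariates, so it is only a screening step and not a substitute for the dispersion statistics reported by a fitted regression model.

```python
from statistics import mean, variance

def dispersion_index(counts):
    """Sample variance divided by sample mean; close to 1 for Poisson-like data."""
    m = mean(counts)
    if m == 0:
        raise ValueError("all counts are zero; the dispersion index is undefined")
    return variance(counts) / m

# Made-up event counts for ten subjects (illustration only).
counts = [0, 0, 1, 0, 7, 2, 0, 12, 1, 0]
ratio = dispersion_index(counts)
print("variance / mean =", round(ratio, 2))  # well above 1 suggests overdispersion
```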
In practice, count data often contain an excess of zero counts (no events), which can cause departures from the Poisson distribution in the form of overdispersion or underdispersion. In this case, zero-inflated Poisson regression may be used.
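One simple way to see whether zeros are excessive is to compare the observed fraction of zeros with the fraction a Poisson distribution with the same mean would produce, exp(-λ). The sketch below does that and also writes out the zero-inflated Poisson probability function, in which a proportion `pi` of cases are structural zeros and the rest follow an ordinary Poisson distribution. The function names, the parameter name `pi`, and the data are assumptions made for this example.

```python
import math

def observed_zero_fraction(counts):
    """Share of observations that are exactly zero."""
    return sum(1 for c in counts if c == 0) / len(counts)

def expected_zero_fraction(counts):
    """Share of zeros implied by a Poisson distribution with the sample mean."""
    lam = sum(counts) / len(counts)
    return math.exp(-lam)

def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson: structural zero with probability pi, else Poisson(lam)."""
    poisson = math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))
    if k == 0:
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson

# Made-up event counts (illustration only).
counts = [0, 0, 1, 0, 7, 2, 0, 12, 1, 0]
print("observed zeros:", observed_zero_fraction(counts))
print("expected under Poisson:", round(expected_zero_fraction(counts), 3))
```

When the observed fraction is much larger than the expected one, a zero-inflated model is worth considering.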
In SAS, several procedures in the SAS/STAT and SAS/ETS modules can be used to fit a Poisson regression. GENMOD and GLIMMIX (from SAS/STAT) and COUNTREG (from SAS/ETS) are easy to use with a standard MODEL statement, while NLMIXED, MODEL, and NLIN provide great flexibility for modeling count data by specifying the log-likelihood function explicitly.
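To show what specifying the log-likelihood explicitly amounts to, here is a minimal Python sketch. It is not SAS code and does not reproduce what any SAS procedure does internally. It writes down the Poisson log-likelihood for a model with one covariate and an exposure offset, log(μi) = b0 + b1·xi + log(ti), and maximizes it with Newton's method. The function names, the single-covariate restriction, and the data are all assumptions made for the example.

```python
import math

def poisson_loglik(beta, x, y, exposure):
    """Poisson log-likelihood with a log link and log(exposure) as offset."""
    b0, b1 = beta
    ll = 0.0
    for xi, yi, ti in zip(x, y, exposure):
        eta = b0 + b1 * xi + math.log(ti)  # linear predictor = log of the mean
        ll += yi * eta - math.exp(eta) - math.lgamma(yi + 1)
    return ll

def fit_poisson(x, y, exposure, iterations=25, tol=1e-10):
    """Maximize the log-likelihood over (b0, b1) by Newton's method."""
    # Start from the intercept-only fit: overall event rate, zero slope.
    b0 = math.log(sum(y) / sum(exposure))
    b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi, ti in zip(x, y, exposure):
            mu = ti * math.exp(b0 + b1 * xi)
            g0 += yi - mu            # gradient with respect to b0
            g1 += (yi - mu) * xi     # gradient with respect to b1
            h00 += mu                # entries of the negative Hessian
            h01 += mu * xi
            h11 += mu * xi * xi
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det  # Newton step: solve H d = g
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Made-up data: x = 0 for control, 1 for treated; exposure is follow-up time.
x = [0, 0, 0, 1, 1, 1]
y = [2, 3, 1, 6, 8, 4]
exposure = [1.0, 2.0, 1.0, 1.0, 2.0, 1.0]
b0, b1 = fit_poisson(x, y, exposure)
print("rate ratio, treated vs control:", round(math.exp(b1), 3))
```

With a single binary covariate, the fitted rates equal the observed events divided by the total exposure in each group, which gives an easy way to sanity-check the result.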
- Excellent Discussion about Zero-Inflated Poisson Regression by UCLA Academic Technology Services
- Count Data Model using SAS by Liu and Cela
- Zero-Inflated Poisson and Zero-Inflated Negative Binomial Models Using the COUNTREG Procedure by Erdman et al.
- Design and Analysis of Count Data by Lakshminarayanan
- PROC COUNTREG user's guide from SAS
- Course notes on the negative binomial distribution by Dr. Preisser at UNC
Regression models for count data in R: http://www.jstatsoft.org/v27/i08/
The course notes on the negative binomial distribution require a log-in.