Application of Finite Mixture Models for Vehicle Crash Data Analysis

Developing sound or reliable statistical models for analyzing vehicle crashes is very
important in highway safety studies. A difficulty arises when crash data exhibit over-dispersion.
Over-dispersion caused by unobserved heterogeneity is a serious problem
and has been addressed in a variety of ways within the negative binomial (NB) modeling
framework. However, the true factors that affect heterogeneity are often unknown to
researchers, and failure to accommodate such heterogeneity in the model can undermine
the validity of the empirical results.
Given the limitations of the NB regression model for addressing over-dispersion of crash
data due to heterogeneity, this research examined an alternative model formulation that
could be used for capturing heterogeneity through the use of finite mixture regression
models. A finite mixture of Poisson or NB regression models is especially useful when
the count data are generated from a heterogeneous population. To evaluate these
models, Poisson and NB mixture models were estimated using both simulated and
empirical crash datasets, and the results were compared to those from a single NB
regression model. For model parameter estimation, a Bayesian approach was adopted,
since it provides much richer inference than the maximum likelihood approach.
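
To illustrate why a heterogeneous population produces over-dispersed counts, the following is a minimal sketch that simulates a two-component Poisson mixture and compares its variance with its mean. The component weight and rates are hypothetical values chosen for illustration and do not come from the thesis; the sketch uses no covariates and does not reproduce the Bayesian estimation described above.

```python
import math
import random
from statistics import fmean, pvariance


def sample_poisson(rng, lam):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_mixture(n, weight, lam_low, lam_high, seed=0):
    """Simulate n counts from a two-component Poisson mixture.

    Each observation comes from the low-rate component with probability
    `weight`, otherwise from the high-rate component. All parameter values
    are hypothetical and serve only as an illustration.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n):
        lam = lam_low if rng.random() < weight else lam_high
        counts.append(sample_poisson(rng, lam))
    return counts


# Hypothetical setting: 60% of sites follow Poisson(1), 40% follow Poisson(6).
counts = simulate_mixture(n=20000, weight=0.6, lam_low=1.0, lam_high=6.0)
mean = fmean(counts)
variance = pvariance(counts)

# A single Poisson process would give a ratio close to 1.
dispersion_ratio = variance / mean
print(f"mean={mean:.2f} variance={variance:.2f} ratio={dispersion_ratio:.2f}")
```

Under these assumed values, the mixture has a theoretical mean of 3 and a theoretical variance of 9, so the variance-to-mean ratio is about 3 even though each component on its own is equidispersed. This is the sense in which multiple counting processes can appear as over-dispersion in pooled data.
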
Using simulated datasets, it was shown that the single NB model is biased if the
underlying cause of heterogeneity is due to the existence of multiple counting processes.
The implications could be poor predictive performance and poor interpretation.
Using two empirical datasets, the results demonstrated that a two-component finite mixture of
NB regression models (FMNB-2) was sufficient to characterize the uncertainty about
crash occurrence, and it provided more opportunities for interpretation of the dataset
that are not available from the standard NB model. Based on the models from the
empirical dataset (i.e., FMNB-2 and NB models), their relative performances were also
examined in terms of hotspot identification and accident modification factors. Finally,
using a simulation study, bias properties of the posterior summary statistics for
dispersion parameters in the FMNB-2 model were characterized, and guidelines on the
choice of priors and the summary statistics to use were presented for different sample
sizes and sample-mean values.

Identifier: oai:union.ndltd.org:tamu.edu/oai:repository.tamu.edu:1969.1/ETD-TAMU-2010-05-7667
Date: 2010 May 1900
Creators: Park, Byung Jung
Contributors: Lord, Dominique
Source Sets: Texas A and M University
Language: English
Detected Language: English
Type: Book, Thesis, Electronic Dissertation, text
Format: application/pdf
