Evaluating Aging Pedestrian Crash Severity With Bayesian Complementary Log–Log Model for Improved Prediction Accuracy

Publication Title

Transportation Research Record


© 2017 National Academy of Sciences. Reliable prediction accuracy is an essential attribute of crash prediction models. Generally, crashes with more severe outcomes, such as fatalities, are rarer than less severe crashes, such as property-damage-only or minor-injury crashes. The complementary log–log (cloglog) model, commonly used in epidemiological research, is known for its accuracy in predicting rare events. This study applied the cloglog model to the analysis of pedestrian injury severity and compared its performance with that of the two conventional models used in injury severity research: the probit and logit models. The three models were developed with data from 1,397 crashes involving aging pedestrians that occurred in Florida from 2009 through 2013. The response variable, injury severity level, was binary: either fatal or severe injury, or minor or no injury. The study used three metrics (deviance information criterion, prediction accuracy, and receiver operating characteristic curves) to compare the performance of the models. The cloglog model outperformed the probit and logit models in overall goodness of fit and prediction accuracy. More important, the cloglog model considerably outperformed the other two models in predicting fatal and severe crashes, with a recall of 72% versus 43% and 41% for the probit and logit models, respectively. However, the other two models outperformed the cloglog model in predicting crashes with no or minor injuries. Of the predictor variables included in the model, six were found to significantly influence fatal or severe injuries for aging pedestrians at the 95% Bayesian credible interval: pedestrian age, alcohol involvement, first harmful event, vehicle movement, shoulder type, and posted speed.
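The difference between the three models compared in the abstract lies in the link function that maps a linear predictor to a probability. The following is a minimal sketch, not the authors' implementation: it only evaluates the standard inverse logit, probit, and cloglog links and computes recall from predicted labels. The function names, the sample linear-predictor values, and the toy labels are assumptions made for illustration, and the Bayesian estimation used in the study is not reproduced here.

```python
import math
from statistics import NormalDist


def inv_logit(eta: float) -> float:
    """Inverse logit link: symmetric around eta = 0."""
    return 1.0 / (1.0 + math.exp(-eta))


def inv_probit(eta: float) -> float:
    """Inverse probit link: standard normal CDF, also symmetric."""
    return NormalDist().cdf(eta)


def inv_cloglog(eta: float) -> float:
    """Inverse complementary log-log link: asymmetric around eta = 0."""
    return 1.0 - math.exp(-math.exp(eta))


def recall(y_true, y_pred) -> float:
    """Share of actual positives (label 1) that were predicted as positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else float("nan")


# Illustrative linear-predictor values (hypothetical, not from the study).
for eta in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(
        f"eta={eta:+.1f}  logit={inv_logit(eta):.4f}  "
        f"probit={inv_probit(eta):.4f}  cloglog={inv_cloglog(eta):.4f}"
    )

# Toy labels (hypothetical): 1 = fatal or severe injury, 0 = minor or no injury.
toy_true = [1, 1, 0, 1, 0]
toy_pred = [1, 0, 0, 1, 1]
print(f"recall on toy labels: {recall(toy_true, toy_pred):.3f}")
```

Unlike the logit and probit links, the cloglog link is not symmetric about a probability of 0.5: it approaches 1 more sharply than it approaches 0. That asymmetry is the usual reason given for preferring it when the two outcome classes are unbalanced.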