Advances and Applications in Statistics

Advances and Applications in Statistics is an internationally recognized journal indexed in the Emerging Sources Citation Index (ESCI). It provides a platform for original research papers and survey articles in all areas of statistics, both computational and experimental.

SUPERVISED MACHINE LEARNING: A COMPARISON OF POISSON AND NEGATIVE BINOMIAL REGRESSION FOR COUNT DATA ANALYSIS

Authors

  • Walaa Ahmed Hamdi

Keywords:

supervised machine learning, regression, count data, poisson regression, negative binomial regression

DOI:

https://doi.org/10.17654/0972361725040

Abstract

This study applies two supervised machine learning techniques, Poisson regression and negative binomial regression, to count data in order to forecast outgoing mail volume for the General Directorate of Posts of Saudi Arabia from 2002 to 2006. The dataset covers 13 administrative regions and consists of 65 observations on 3 variables: the dependent variable is the number of outgoing mail items, and the independent variables are year and region. Exploratory data analysis revealed significant overdispersion in the data, along with a large number of zero observations. An initial Poisson regression analysis highlighted that model's limitations in handling these data characteristics. In contrast, the negative binomial regression model performed better, achieving a lower Mean Absolute Prediction Error (MAPE) of 34,026.7 compared to 34,253.08 for the Poisson model. Likelihood-based criteria, namely the likelihood ratio test, AIC, and BIC, also consistently indicated that the negative binomial regression model fit the data better, reflecting the underlying overdispersion. Based on these findings, the negative binomial regression model is recommended as the primary approach for predicting outgoing mail volume for the General Directorate of Posts of Saudi Arabia.
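The comparison described in the abstract can be illustrated with a minimal sketch. It is not the author's code and does not use the study's data: the counts below are hypothetical, both models are intercept-only (the paper uses year and region as predictors), and the negative binomial is assumed to be the common NB2 form with variance mu + alpha * mu^2. The dispersion parameter is found with a crude grid search rather than a proper optimizer, and all function names are illustrative.

```python
import math

def poisson_loglik(y, mu):
    """Log-likelihood of counts y under a Poisson model with mean mu."""
    return sum(k * math.log(mu) - mu - math.lgamma(k + 1) for k in y)

def negbin_loglik(y, mu, alpha):
    """Log-likelihood under NB2: mean mu, variance mu + alpha * mu**2."""
    r = 1.0 / alpha          # shape ("size") parameter
    p = r / (r + mu)         # success probability in the NB parameterization
    return sum(
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(p) + k * math.log(1.0 - p)
        for k in y
    )

def aic(loglik, n_params):
    """Akaike information criterion (smaller is better)."""
    return 2 * n_params - 2 * loglik

def bic(loglik, n_params, n_obs):
    """Bayesian (Schwarz) information criterion (smaller is better)."""
    return n_params * math.log(n_obs) - 2 * loglik

# Hypothetical counts (NOT the study's data): many zeros, a few large values.
y = [0, 0, 0, 1, 0, 3, 0, 12, 0, 2, 25, 0, 1, 0, 7, 0, 0, 40, 2, 0]
n = len(y)

# Overdispersion check: a Poisson model implies variance == mean.
mu_hat = sum(y) / n
var_hat = sum((k - mu_hat) ** 2 for k in y) / (n - 1)
dispersion_ratio = var_hat / mu_hat  # well above 1 suggests overdispersion

# For an intercept-only model the sample mean maximizes both likelihoods in mu,
# so only alpha needs a search. Crude log-scale grid from 1e-3 to 1e2.
alphas = [10 ** (e / 50) for e in range(-150, 101)]
alpha_hat = max(alphas, key=lambda a: negbin_loglik(y, mu_hat, a))

ll_pois = poisson_loglik(y, mu_hat)
ll_nb = negbin_loglik(y, mu_hat, alpha_hat)

# Likelihood ratio statistic for H0: alpha = 0 (Poisson nested in NB2).
# alpha = 0 lies on the boundary of the parameter space, so the usual
# reference is a 50:50 mixture of chi-square(0) and chi-square(1);
# its 5% critical value is about 2.71.
lr_stat = 2 * (ll_nb - ll_pois)

print(f"variance/mean ratio: {dispersion_ratio:.2f}")
print(f"alpha_hat (grid):    {alpha_hat:.3f}")
print(f"Poisson  AIC={aic(ll_pois, 1):.1f}  BIC={bic(ll_pois, 1, n):.1f}")
print(f"NegBin   AIC={aic(ll_nb, 2):.1f}  BIC={bic(ll_nb, 2, n):.1f}")
print(f"LR statistic:        {lr_stat:.1f}")
```

On overdispersed counts like these, the negative binomial model should show the higher log-likelihood and the lower AIC and BIC despite its extra parameter, which is the pattern the abstract reports for the mail data. A regression version with covariates would typically rely on a GLM library rather than a hand-written likelihood.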

Received: October 27, 2024
Accepted: February 11, 2025

Published

13-05-2025

Section

Articles

How to Cite

SUPERVISED MACHINE LEARNING: A COMPARISON OF POISSON AND NEGATIVE BINOMIAL REGRESSION FOR COUNT DATA ANALYSIS. (2025). Advances and Applications in Statistics, 92(7), 949-961. https://doi.org/10.17654/0972361725040
