Distributions to model overdispersed count data


  • Sylvain Coly
  • Anne-Françoise Yao
  • David Abrial
  • Myriam Charras-Garrido


In the early twentieth century, only a few count distributions (the binomial and Poisson distributions) were commonly used in modeling. These distributions fail to model bimodal or overdispersed data, especially data arising from phenomena in which the occurrence of a given event increases the chance of additional events occurring. New count distributions, known as "contagious" distributions, have since been introduced to address such phenomena. This group, which includes the negative binomial, Neyman, Thomas and Pólya-Aeppli distributions, can be expressed as mixture distributions or as stopped-sum distributions. These distributions account for bimodality and overdispersion, and are more flexible with regard to the distribution of values. The aim of this literature review is to 1) explain why these distributions were introduced, 2) describe each of these overdispersed distributions, focusing in particular on their definitions, their basic properties, and their practical utility, and 3) compare their strengths and weaknesses by modeling real overdispersed count data (bovine tuberculosis cases).
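
As a rough illustration of the mixture and stopped-sum constructions mentioned above, the following sketch simulates a plain Poisson count, a negative binomial count built as a gamma-mixed Poisson, and a Pólya-Aeppli count built as a Poisson-stopped sum of geometric cluster sizes. The parameter values, the function names, and the geometric parameterization (support 1, 2, ...) are assumptions chosen for illustration and are not taken from the review; all three samples are tuned to have mean 4, so only the variance-to-mean ratio differs.

```python
import math
import random
from statistics import mean, variance


def poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's method (adequate for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def geometric(rng, p):
    """Number of trials up to and including the first success (support 1, 2, ...)."""
    k = 1
    while rng.random() >= p:
        k += 1
    return k


def negbin_mixture(rng, shape, scale):
    """Mixture form: the Poisson rate is itself drawn from a gamma distribution."""
    return poisson(rng, rng.gammavariate(shape, scale))


def polya_aeppli(rng, lam, p):
    """Stopped-sum form: a Poisson number of clusters, each of geometric size."""
    return sum(geometric(rng, p) for _ in range(poisson(rng, lam)))


def dispersion_index(sample):
    """Sample variance divided by sample mean; close to 1 for Poisson data."""
    return variance(sample) / mean(sample)


def simulate(seed=0, n=20000):
    """Return the dispersion index of each simulated sample (illustrative values)."""
    rng = random.Random(seed)
    samples = {
        "poisson": [poisson(rng, 4.0) for _ in range(n)],
        "negative_binomial": [negbin_mixture(rng, 2.0, 2.0) for _ in range(n)],
        "polya_aeppli": [polya_aeppli(rng, 2.0, 0.5) for _ in range(n)],
    }
    return {name: dispersion_index(s) for name, s in samples.items()}


if __name__ == "__main__":
    for name, index in simulate().items():
        print(f"{name}: variance/mean = {index:.2f}")
```

Under these assumed parameters the theoretical variance-to-mean ratio is 1 for the Poisson sample and 3 for the other two, which is one simple way to see how clustering ("contagion") inflates the variance relative to the mean.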