Distribution Fitting for Very Large Railway Delay Data Sets with Discrete Values

  • Steven Harrod, Department of Technology, Management, and Economics, Technical University of Denmark
  • Georgios Pournaras, Ansaldo STS - a Hitachi company group
  • Bo Friis Nielsen, Department of Applied Mathematics and Computer Science, Technical University of Denmark
Keywords: Big data, railway delays, delay distributions


Modern railway signal systems allow the collection of very large data sets (more than a thousand values). Because the signal technology often rounds the recorded values, these data sets are effectively discrete. This paper reviews the literature on fitting distributions to large data sets and then reports the experience of fitting distributions to a large data set from the Danish railways.