More accurate rail updates using big data analytics

We investigated how big data analytics could be used in the rail transport system to provide real-time service information.

A train with motion blur is passing through a station platform.


As train passengers, we’re on our way to important work meetings, off to visit family, having a day out, or simply nipping to the shops. When we have somewhere to be, delays can be frustrating. A key challenge for train companies is predicting and relaying delay information so they can provide immediate and accurate updates to minimise customer inconvenience.

Each of the 23 rail operators in the UK collects data and feeds it into a central system. Just one month of train scheduling data generates more than 50 million messages about alterations and delays, so processing and interpreting that data as quickly and efficiently as possible is more important than ever.


In a project funded by STFC Innovations Ltd., the Hartree Centre investigated how big data analytics could help share more accurate information about the length and frequency of train delays as they occur. The Rail Delivery Group assisted by giving researchers access to data resources and advising on how to interpret the data.

The IBM Big Insights distributed computing system at the Hartree Centre can convert and process the millions of messages generated within rail data into structured data at a rate of more than 4,500 per second. Researchers looked at specific sections of track to find the probability and duration of delays or alterations, and were able to update this information hour by hour in real time, rather than relying on potentially out-of-date averages. They were also able to incorporate available open data on factors such as weather, passenger numbers, vegetation, population and rail track infrastructure.
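
To make the idea of hourly, per-section delay estimates concrete, here is a minimal sketch in Python. It is not the Hartree Centre's pipeline and does not use IBM Big Insights; the message fields (`section`, `timestamp`, `delay_minutes`), the `summarise` function and the one-minute delay threshold are all assumptions made for illustration. It groups already-structured messages by track section and hour, then reports the share of trains delayed and their average delay.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass
class SectionHourStats:
    """Running totals for one track section in one hourly window."""
    trains: int = 0
    delayed: int = 0
    delay_minutes: float = 0.0

    @property
    def delay_probability(self) -> float:
        # Share of observed trains that counted as delayed.
        return self.delayed / self.trains if self.trains else 0.0

    @property
    def mean_delay_minutes(self) -> float:
        # Average delay, taken over delayed trains only.
        return self.delay_minutes / self.delayed if self.delayed else 0.0


def summarise(messages, threshold_minutes=1.0):
    """Group messages by (section, hour) and accumulate delay totals.

    The message layout is hypothetical: each message is a dict with
    'section', an ISO 8601 'timestamp' and 'delay_minutes'.
    """
    stats = defaultdict(SectionHourStats)
    for msg in messages:
        ts = datetime.fromisoformat(msg["timestamp"])
        hour = ts.replace(minute=0, second=0, microsecond=0)
        bucket = stats[(msg["section"], hour)]
        bucket.trains += 1
        if msg["delay_minutes"] >= threshold_minutes:
            bucket.delayed += 1
            bucket.delay_minutes += msg["delay_minutes"]
    return dict(stats)


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    sample = [
        {"section": "A-B", "timestamp": "2016-03-01T08:05:00", "delay_minutes": 0},
        {"section": "A-B", "timestamp": "2016-03-01T08:40:00", "delay_minutes": 6},
        {"section": "A-B", "timestamp": "2016-03-01T09:10:00", "delay_minutes": 4},
    ]
    for (section, hour), s in sorted(summarise(sample).items()):
        print(section, hour.isoformat(), s.delay_probability, s.mean_delay_minutes)
```

Keeping a separate bucket for each hour is what lets an estimate reflect recent conditions rather than a long-run average. A real system would also need to handle late or duplicate messages and the additional open data sources mentioned above, which this sketch leaves out.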


For an infrastructure that accommodates approximately 1.7 billion passenger journeys per year and generated £9.3 billion¹ in revenue in 2015-16, the ability to share and access data in real time has the potential to make a huge impact on UK travel in terms of customer satisfaction. More informative updates about train alterations would enable operators to make better decisions, faster, and improve their overall service. Perhaps more importantly, it could save precious minutes for the customers on the ground, thus encouraging increased uptake of public transport.

¹ Stats from Office of Rail & Road: Passenger Rail Usage 2016-17 Quarter 2 Statistical Release

“The IBM Big Insights distributed computing system at the Hartree Centre can process the millions of messages at a rate of more than 4,500 per second. This gives us real-time information and means that operators would no longer have to rely on out-of-date averages.”

Lee Hannis, Head of Business Development for the Hartree Centre
