Opening data fully to improve London’s transport network

Authorities and operators can drastically improve the experience of all those travelling in London by harnessing the potential of the copious amounts of data that exist within the city, says Ryan Sweeney, Data and Analytics Portfolio Manager, Transport for London…

Opening data to access its full capacity in the transport industry

2019 will mark the 30th anniversary of the classic Game Boy game Tetris. The game probably occupied many lives in the early nineties with its infectious theme tune and frustratingly addictive gameplay, yet all this came from a cartridge that held just one megabyte of data. In comparison, the data that Transport for London (TfL) now analyses every day, from journeys on buses, tubes and rail, would fill numerous Blu-ray discs.

Every day, TfL collects vast amounts of data about how people move across the network. Approximately 20 million ‘taps’ are captured through the ticketing system; the iBus location system provides accurate location and prediction information for 9,200 vehicles; and traffic flow is managed using 6,000 traffic signals and 1,400 cameras to keep London moving. This volume of data gives TfL a unique insight into how people travel across the city, but simply holding large quantities of data isn’t enough.

To produce value from it, big data techniques are needed to turn raw data into useful information that both staff and customers can use. TfL is doing this to improve the design of the network and operate more efficiently. For example, combining ticketing patterns with bus location data has created a comprehensive understanding of public transport travel, which assists in the planning of services and responses to events. Machine learning is also being implemented to identify early indicators of problems, helping to prevent trains from being removed from service.
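As a rough illustration of what combining ticketing and bus location data can involve, the sketch below matches each bus ‘tap’ to the most recent location record for the same vehicle in order to infer a boarding stop. It is a minimal sketch under stated assumptions, not TfL’s method: the record layouts, the function name `infer_boarding_stops` and the sample values are all hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def infer_boarding_stops(taps, locations):
    """Infer a boarding stop for each tap (illustrative only).

    taps:      list of (vehicle_id, tap_time_s)           -- hypothetical layout
    locations: list of (vehicle_id, time_s, stop_name)    -- hypothetical layout
    Returns a list of stop names (or None) in the same order as taps.
    """
    # Group location records by vehicle, ordered by time.
    by_vehicle = defaultdict(list)
    for vehicle_id, time_s, stop in sorted(locations, key=lambda r: (r[0], r[1])):
        by_vehicle[vehicle_id].append((time_s, stop))

    stops = []
    for vehicle_id, tap_time in taps:
        records = by_vehicle.get(vehicle_id, [])
        times = [t for t, _ in records]
        # Index of the first location record strictly after the tap.
        i = bisect_right(times, tap_time)
        # The record just before that is the latest one at or before the tap.
        stops.append(records[i - 1][1] if i else None)
    return stops


# Made-up sample data: times are seconds since the start of the day.
locations = [("bus1", 100, "A"), ("bus1", 200, "B"), ("bus2", 150, "C")]
taps = [("bus1", 120), ("bus1", 200), ("bus1", 50), ("bus2", 999), ("bus3", 10)]
print(infer_boarding_stops(taps, locations))
```

A tap made before a vehicle’s first location record, or on a vehicle with no location records at all, is left unmatched rather than guessed.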

Across TfL there are active experiments to investigate what can be learnt from data, with the results helping to improve the products and services delivered to customers. A good example of how analytics and big data have been used is the work to improve road safety across London. By using big data to analyse trends in deaths and serious injuries on the roads, the major contributory factors were identified, which allowed preventative action to be better targeted at both a micro and a policy level. This work led to the creation of an interactive digital collision map, meaning anyone can now see where collisions have taken place in London over the last 10 years. This tool has only been possible as a result of using big data and forms a key part of a continued drive to achieve the Mayor’s ‘Vision Zero’ aim of eradicating deaths and serious injuries from road collisions on London’s streets by 2041.

Going forward, TfL is looking at innovative new ways to use big data for more predictive travel information, which would greatly improve the operational management of networks. In 2017, TfL published the findings of a four-week pilot that tested whether depersonalised Wi-Fi connection data from customers’ mobile devices could be used to better understand how people navigate the London Underground network. The pilot focused on Central London stations and saw more than 509 million depersonalised ‘probing requests’, or pieces of data, collected from 5.6 million mobile devices making around 42 million journeys.

The data project was designed with customer privacy at the forefront. All the data collected was automatically depersonalised, so that no individuals could be identified, and then analysed by TfL’s in-house analytics team. Using a specially created algorithm, they broke the data into different aggregated ‘movement types’ to help understand what customers were doing at particular points of their journeys – such as entering or exiting a station, changing between lines or just passing through the station while on a train. Examples of these patterns are available in the published report. This data gave a much more accurate understanding of how people move through stations, how they interchange between services and how crowding develops.
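The pilot’s algorithm is not described here, so the following is only an illustrative sketch of how station observations for one depersonalised device might be sorted into movement types. The rule used (first station is an entry, last is an exit, a long dwell in between is an interchange), the 120-second threshold, the function name `classify_visits` and the sample journey are all assumptions made for illustration.

```python
def classify_visits(visits, interchange_dwell_s=120):
    """Label each station visit in one journey with a movement type.

    visits: time-ordered list of (station, first_seen_s, last_seen_s)
            for a single depersonalised device (hypothetical layout).
    interchange_dwell_s: assumed dwell time, in seconds, above which a
            mid-journey visit is treated as an interchange.
    """
    labels = []
    last_index = len(visits) - 1
    for i, (station, first_seen, last_seen) in enumerate(visits):
        if i == 0:
            kind = "entering"            # first station seen on the journey
        elif i == last_index:
            kind = "exiting"             # final station seen on the journey
        elif last_seen - first_seen >= interchange_dwell_s:
            kind = "interchanging"       # long dwell at a mid-journey station
        else:
            kind = "passing through"     # brief sighting while on a train
        labels.append((station, kind))
    return labels


# Made-up journey: station names and timings are placeholders.
journey = [("A", 0, 90), ("B", 300, 330), ("C", 420, 700), ("D", 800, 860)]
for station, kind in classify_visits(journey):
    print(station, kind)
```

Labelled visits like these could then be counted per station and time band, so that only aggregated totals are analysed rather than individual journeys.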

In the coming years, TfL plans to continue innovating with real-time and predictive data to increase the granularity and accuracy of its data, providing customers with the information they want and need to have the best journey experience possible.

The opportunities that arise from using and combining big data are constantly multiplying and evolving. It is for this reason that TfL also tries to make as much of its data as possible freely available via the website and through a Unified API for developers. Freely sharing transport data encourages the creation of new apps that make travel easier for customers, and gets more people working to solve the transport challenges of cities. It enables developers to think creatively and test their analytical skills, while giving Londoners up-to-date information about public transport and road networks.

Around 14,400 open data users are currently registered for TfL’s Unified API, from app developers to academic institutions and sat-nav providers. More than 650 apps are now directly powered by open data; they give customers more choice and convenience and are regularly used by 42 per cent of Londoners.

Offering open data boosts the economy. Last year, research commissioned by TfL and conducted by Deloitte found that the provision of free, accurate and real-time open data is helping London’s economy by up to £130 million a year.

Keen to keep improving its offering, TfL published data in 2017 showing, for the first time, the levels of crowding on a typical weekday on the London Underground network, as well as real-time information about the availability of electric vehicle charging points across London. It is also working to substantially improve the accessibility data provided through the open data feed, as comments from stakeholders highlight this as one area where TfL can work with third-party developers to provide clearer, more accessible data within a wide range of apps and websites.

We now have a thriving market in public transport apps, and many new businesses and jobs have been created in London’s booming tech sector because of data – all at low cost to TfL.

Our approach to data is simple. With over 31 million journeys made in London every day, it is vital that people have the right travel information readily available to help them travel around the city. Having good data sources is key, but only if they are used to their full potential. Much like in Tetris, you need to combine the separate pieces into meaningful patterns, allowing you to achieve much more than you would by letting the data pass by uninterrogated.