Will I find a bike?

The Churnbusters predictive model for Helsinki City bikes

Launched in 2016, the city bikes in Helsinki quickly became very popular. Just two years later, the service counts over 2500 bikes, distributed over more than 250 stations.

This rise in popularity has a downside: sometimes no city bike is available for the journey home or the trip to the supermarket. Predicting availability at the different stations could help mitigate this and allow cyclists to plan their journeys in advance.

For this demo, Churnbusters uses machine learning to predict the availability of city bikes for each bike station in Helsinki and Espoo. The demo uses open data from Helsinki Region Transport (HSL) and combines it with weather data from the Finnish Meteorological Institute (FMI; or Ilmatieteenlaitos). Equipped with this information, the model can predict the availability of city bikes based on weather and time of day.

By changing the sliders under Predict, users can estimate the number of bikes available at their favorite station for a given time and weather, based on the model’s prediction.

Built with open data

The demo uses two open datasets:

The bike data covers the availability of bikes at a 1-minute resolution for the period between August and October 2018. It contains the following features:

  • Date and time
  • Coordinates of the station
  • Station name and ID
  • Total slots in the station
  • Free slots in the station
  • Number of available bikes
  • Operative status of the station
  • Style

The weather data is available through FMI’s API and contains records from the Helsinki Kaisaniemi weather station. The data is recorded at 10-minute intervals and contains the following variables:

  • Temperature
  • Wind speed
  • Wind gusts
  • Wind direction
  • Relative humidity
  • Rain
  • Visibility
  • Air pressure
  • Dew point
  • Snowfall
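
Because the bike data arrives every minute and the weather data every 10 minutes, the two datasets have to be aligned before they can be combined. The following is a minimal sketch of one way to do this, not necessarily how the demo does it. It assumes that each weather record is keyed by the start of its 10-minute interval; the field names (time, station_id, bikes_available, temperature_c, rain_mm) and all sample values are hypothetical.

```python
from datetime import datetime


def floor_to_10_minutes(ts: datetime) -> datetime:
    """Round a timestamp down to the start of its 10-minute interval."""
    return ts.replace(minute=ts.minute - ts.minute % 10, second=0, microsecond=0)


def attach_weather(bike_rows, weather_by_time):
    """Add the matching 10-minute weather record to each 1-minute bike row.

    bike_rows: list of dicts with a "time" key (datetime).
    weather_by_time: dict mapping interval start (datetime) to a weather dict.
    Bike rows without a matching weather record are skipped.
    """
    joined = []
    for row in bike_rows:
        weather = weather_by_time.get(floor_to_10_minutes(row["time"]))
        if weather is not None:
            joined.append({**row, **weather})
    return joined


# Made-up sample values, for illustration only.
bikes = [
    {"time": datetime(2018, 9, 3, 8, 7), "station_id": "001", "bikes_available": 4},
    {"time": datetime(2018, 9, 3, 8, 12), "station_id": "001", "bikes_available": 2},
]
weather = {
    datetime(2018, 9, 3, 8, 0): {"temperature_c": 14.2, "rain_mm": 0.0},
    datetime(2018, 9, 3, 8, 10): {"temperature_c": 14.5, "rain_mm": 0.3},
}

joined = attach_weather(bikes, weather)
for row in joined:
    print(row)
```

Each bike row picks up the weather record of the interval it falls into, so ten consecutive bike rows share the same weather values.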

What happens under the hood

Churnbusters developed an artificial neural network to predict the number of available bikes in each bike station, based on the historical availability of the bikes, weather conditions, time of day and weekday.
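
Time of day and weekday are cyclic quantities: 23:59 is close to 00:00, and Sunday is followed by Monday. A common way to feed such values to a neural network is to map them onto a circle with sine and cosine. The sketch below illustrates that general technique; the source does not say how the demo encodes its inputs, so the function name and the encoding itself are assumptions.

```python
import math
from datetime import datetime


def encode_time_features(ts: datetime) -> list:
    """Encode time of day and weekday as points on a unit circle.

    Returns [sin(day), cos(day), sin(week), cos(week)], so that values
    near the wrap-around (midnight, Sunday to Monday) stay close together.
    """
    minutes_since_midnight = ts.hour * 60 + ts.minute
    day_angle = 2 * math.pi * minutes_since_midnight / (24 * 60)
    week_angle = 2 * math.pi * ts.weekday() / 7  # Monday is 0
    return [
        math.sin(day_angle),
        math.cos(day_angle),
        math.sin(week_angle),
        math.cos(week_angle),
    ]


# 3 September 2018 was a Monday.
late = encode_time_features(datetime(2018, 9, 3, 23, 59))
early = encode_time_features(datetime(2018, 9, 3, 0, 0))
print(late[:2], early[:2])
```

With this encoding, the two time-of-day vectors printed above are nearly identical, whereas the raw minute counts (1439 and 0) would be far apart.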

For the predictions, the demo relies on TensorFlow.js, a library that enables machine learning in the browser. Because predictions run locally, latency stays low, and no user data needs to be sent to any server, which adds a measure of privacy. This is particularly useful for models that handle sensitive data, or where Internet connections are unreliable.

The demo showcases the capabilities of machine learning, while the model is purposely kept simple. Some aspects of the model have therefore been simplified, such as the input resolution and the dependencies between the input parameters. As an example, the user can set relative humidity to 50% while setting cumulative rain within a 10-minute period to 6.5 mm, even though such a combination is unlikely in practice.
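
One way to handle such combinations would be a plausibility check on the slider values before they reach the model. The sketch below is purely illustrative and not part of the demo: the function name is hypothetical, and the 80% humidity threshold is an arbitrary assumption rather than a meteorological rule.

```python
def find_input_conflicts(relative_humidity_pct, rain_mm_10min,
                         min_humidity_when_raining=80.0):
    """Return warnings for slider values that contradict each other.

    The default threshold of 80% is an illustrative assumption, not a
    value taken from the demo or from FMI.
    """
    warnings = []
    if rain_mm_10min > 0 and relative_humidity_pct < min_humidity_when_raining:
        warnings.append(
            f"{rain_mm_10min} mm of rain with {relative_humidity_pct}% "
            "relative humidity is an unlikely combination"
        )
    return warnings


# The combination mentioned in the text above.
print(find_input_conflicts(relative_humidity_pct=50, rain_mm_10min=6.5))
```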

The model can be further improved by including time series predictions. This could improve accuracy, particularly for short-term availability predictions. The currently available dataset only covers the period from August to October 2018; data over a longer timespan, preferably the entire biking season, would make the model more robust.
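
A simple starting point for time series predictions is to use the most recent bike counts at a station as additional inputs. The sketch below shows how a series of counts could be turned into training samples with lagged values. It is an assumption about how such an extension might look, not a description of the current model; the function name, the parameters and the sample counts are made up.

```python
def make_lagged_samples(series, n_lags, horizon=1):
    """Turn a series into (inputs, target) pairs for supervised learning.

    Each input holds n_lags consecutive values; the target is the value
    `horizon` steps after the last input value.
    """
    samples = []
    for end in range(n_lags, len(series) - horizon + 1):
        inputs = series[end - n_lags:end]
        target = series[end + horizon - 1]
        samples.append((inputs, target))
    return samples


# Made-up bike counts at one station, one value per time step.
counts = [5, 4, 4, 2, 1, 3]
samples = make_lagged_samples(counts, n_lags=3, horizon=1)
for inputs, target in samples:
    print(inputs, "->", target)
```

A larger horizon predicts further ahead from the same history, at the cost of fewer usable samples from a series of the same length.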