Talking about the Weather

When chatting with a Brit, there are few better ways to break the ice than talking about the weather. According to one report, almost ALL Britons admit to having conversed about it in the past six hours. For Windward, though, the weather is something that, until recently, had been missing from our conversation.

This might seem strange. Our ability to understand ships’ behaviour rests on machine learning models that identify vessels’ actions. If we knew what the weather was like where a ship was operating, we’d know if the ship was stopping in calm seas or high waves; we’d know if it was drifting with the current or sailing slowly against it; and we’d improve our ability to understand why a ship acted as it did, and be better able to assess its risk level.

And so, Windward’s Product Management team launched The Weather Project. It began by attempting to answer two questions: Will weather data give us – and therefore our customers – a better understanding of ships’ behaviour? Which parameters would have the biggest impact – now and in the future – and provide our system with as much additional data as possible?

Selected Parameters

  • Waves (Height, Direction, Max Wave Height) – Gives us the direction and size of waves at sea.
  • Wind (Speed, Direction, Max Gust Speed) – Gives us the direction and speed of wind at sea.
  • Currents – Currents around the globe are mostly static, but it’s important to know them as well.
  • Temperature – Gives us the temperature at different locations.
  • Beaufort scale – Gives a convenient breakdown of different weather types.
  • Tides – Gives us the times and intensity of tides, which give better insight into conditions at sea.
  • Moonlight – Allows us to know visibility conditions at night.
  • Visibility – Gives us data about visibility conditions in daytime.
  • Fog – Tells us if there is any.
  • Weather symbol – A general interpretation of weather conditions.
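To illustrate why the Beaufort scale is a convenient breakdown, here is a minimal sketch of how a raw wind speed could be mapped to a Beaufort force. It assumes the speed is given in knots and uses the commonly published knot thresholds; the function name `beaufort_force` is hypothetical and is not part of any Windward or vendor API.

```python
from bisect import bisect_right

# Lower bound, in knots, of Beaufort forces 1 through 12 (commonly published
# thresholds). Assumption: the wind speed is supplied in knots.
FORCE_LOWER_BOUNDS_KNOTS = [1, 4, 7, 11, 17, 22, 28, 34, 41, 48, 56, 64]


def beaufort_force(wind_speed_knots: float) -> int:
    """Return the Beaufort force (0 to 12) for a wind speed in knots."""
    if wind_speed_knots < 0:
        raise ValueError("wind speed cannot be negative")
    # The number of lower bounds at or below the speed is the force itself.
    return bisect_right(FORCE_LOWER_BOUNDS_KNOTS, wind_speed_knots)


if __name__ == "__main__":
    for knots in (0, 10, 35, 70):
        print(knots, "knots -> force", beaufort_force(knots))
```

A single small integer like this is easier to feed into a model, or to show on a map, than separate speed and gust values.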

At this stage, we decided to look into different weather data sources, to see which would best fit our needs.

The first thing we discovered was that our needs were different from those of most other consumers of weather data. What most users need – and most companies provide – is forecast data, i.e. what the weather at location X will be in the next hours/days/weeks. And not just any old forecast: consumers want the most accurate forecasts. Yet a forecast is just an estimate; by its nature it cannot be completely accurate.

At Windward, we require accurate (measured, not estimated) weather data for the entire world – now and in the past – ideally going back years. These unique requirements present a unique challenge. Nevertheless, we found several companies that could help. After some research, we came to the following conclusions:

  • OpenWeatherMap – Lacked wave data, one of the most important constituents of marine weather.
  • WorldWeatherOnline – Had wave data, but not for all dates and locations.
  • Meteomatics – Had a significant number of different benchmarks available.
  • NOAA (National Oceanic and Atmospheric Administration) Blended Sea Winds – What’s interesting about NOAA is that it’s a governmental organization, so the data is openly available; at the same time, it’s a scientific institution, so it keeps historical records going back many years.

We decided to create a Proof of Concept (PoC) with the two most promising data sources for our purposes, Meteomatics and NOAA, and compare the results.

NOAA provides a batch of historical data (in NetCDF format) that goes back years and can be downloaded and used offline, but it has limited scope and granularity.

Meteomatics, on the other hand, provides an online API which allows queries on their data. Each query can span multiple dates, data types and locations.

A Few Words About NetCDF

An interesting point about NOAA data is that it’s kept in a scientific format called NetCDF. This will be familiar to the world of academia, but not to the world of tech. Working with this format brought an additional level of complexity: all the data is kept in a 4-dimensional array of values, indexed by:

  • Time, using six-hour intervals.
  • Height above sea level (constant).
  • Latitude and longitude (to a resolution of 0.25 degrees).

What’s interesting about NetCDF format is that to minimize the data footprint, the data is stored in a particular fashion.

For example, latitude and longitude, which are float values representing real positions around the globe, are stored as non-negative integer indices, from 0 to 718 (719 points) and from 0 to 1439 (1,440 points) respectively; each step of one index represents 0.25 degrees.

Another example is time, which is calculated in hours since 1978-01-01.
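Decoding these values back into real-world quantities could look something like the sketch below. The grid origins (-89.75 for latitude, 0.00 for longitude) and the 1978-01-01 epoch are the ones the file's metadata reports; treating the epoch as UTC is an assumption, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Values taken from the file's metadata. Assumption: the epoch is in UTC.
EPOCH = datetime(1978, 1, 1, tzinfo=timezone.utc)
GRID_STEP_DEG = 0.25
LAT_ORIGIN_DEG = -89.75  # latitude at index 0
LON_ORIGIN_DEG = 0.0     # longitude at index 0 (degrees east)


def decode_time(hours_since_epoch: int) -> datetime:
    """Convert an 'hours since 1978-01-01' value to a datetime."""
    return EPOCH + timedelta(hours=hours_since_epoch)


def index_to_latitude(lat_index: int) -> float:
    """Convert a latitude index (0 to 718) to degrees north."""
    return LAT_ORIGIN_DEG + lat_index * GRID_STEP_DEG


def index_to_longitude(lon_index: int) -> float:
    """Convert a longitude index (0 to 1439) to degrees east."""
    return LON_ORIGIN_DEG + lon_index * GRID_STEP_DEG


if __name__ == "__main__":
    print(decode_time(24))           # one day after the epoch
    print(index_to_latitude(718))    # northernmost grid row
    print(index_to_longitude(1439))  # last grid column
```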

Here is an excerpt from the file’s metadata:

variables:
   int time(time=4);
     :long_name = "Center Time of the Data";
     :units = "hours since 1978-01-01 00:00:00";
   float zlev(zlev=1);
     :long_name = "height above sea level";
     :units = "meters";
   float lat(lat=719);
     :long_name = "latitude";
     :units = "degrees_north";
     :grids = "uniform grids from -89.75 to 89.75 by 0.25";
   float lon(lon=1440);
     :long_name = "longitude";
     :units = "degrees_east";
     :grids = "uniform grids from 0.00 to 359.75 by 0.25";
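To look up the weather at a ship's position, the position first has to be mapped onto this grid. The sketch below finds the nearest grid indices for a latitude/longitude pair, assuming ship positions use the common -180 to 180 longitude convention while the file uses 0 to 359.75 degrees east, as the metadata states. The function names are hypothetical.

```python
# Grid description copied from the metadata excerpt above.
GRID_STEP_DEG = 0.25
LAT_ORIGIN_DEG = -89.75
LAT_COUNT = 719
LON_COUNT = 1440


def latitude_to_index(latitude: float) -> int:
    """Return the index of the grid row nearest to a latitude in degrees."""
    if not -90.0 <= latitude <= 90.0:
        raise ValueError("latitude must be between -90 and 90")
    index = round((latitude - LAT_ORIGIN_DEG) / GRID_STEP_DEG)
    # The poles fall half a step outside the grid, so clamp to its edges.
    return min(max(index, 0), LAT_COUNT - 1)


def longitude_to_index(longitude: float) -> int:
    """Return the index of the grid column nearest to a longitude.

    Accepts either -180..180 or 0..360 input; the grid itself runs from
    0.00 to 359.75 degrees east and wraps around at 360.
    """
    degrees_east = longitude % 360.0
    return round(degrees_east / GRID_STEP_DEG) % LON_COUNT


if __name__ == "__main__":
    # Illustrative position only: roughly 40.7 N, 74.0 W.
    print(latitude_to_index(40.7), longitude_to_index(-74.0))
```

With the time index found in a similar way, a single value can then be read from the 4-dimensional array at (time, height, latitude index, longitude index).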

After the PoC, which included a comprehensive comparison and analysis of both NOAA and Meteomatics data, we decided to use the Meteomatics API for a few reasons:

  1. The coverage and granularity of the available data were better.
  2. The API provides flexibility and allows us to query large amounts of data for both UI representation and modeling use cases.
  3. The number of queryable marine-related parameters is significantly higher than in any other source we checked.

In some cases, comparing the two sources showed that Meteomatics gives a stronger signal of extreme weather than NOAA, thanks to its higher geographical and temporal granularity.

In conclusion, we see weather data as an important addition to Windward’s capabilities, enabling us to:

  1. Enrich the world marine traffic map with weather data.
  2. Use Machine Learning to determine uneconomical behaviour by ships, which is useful for our Intelligence product.
  3. Use Machine Learning to determine risky behaviour by ships, which is useful for our Marine Insurance product.

We can’t promise to talk about the weather as often as our British friends, but you should expect it to appear a lot more regularly in our future conversations.

Alex Milman is a Senior Backend Developer at Windward