You remind me of a vessel I once knew

Picture this: you’re in a crowded bar. You see an old friend you haven’t seen in ages. You call out her name. You walk up to her. And then it hits you: this person is a complete stranger; she just looks like your old friend.


People looking similar to other people is not uncommon – there’s even a website called Twin Strangers that’ll help you find your unrelated doppelganger. Finding vessels that are similar – in terms of their behavior – is a little trickier, not least because it’s not just based on looks.


How does it work? First, we need to answer two questions:

  1. On what characteristics do we wish to measure similarity?
    This defines the features that we use.
  2. How do we measure the similarity distance between two objects?
    Once we have defined the object characteristics (features), we need a metric that measures the similarity (distance) between two objects.

When it comes to the maritime domain, we also need to decide – and define – how we’re going to measure vessel similarity.


The easiest – and least innovative – way is to use static characteristics: things like vessel type, size and deadweight. But that’s not what we do. Windward analyzes the movement of all vessels; we look at more than 1,500 characteristics, or features, at any given time. These features are dynamic, i.e. they change over the vessel’s lifetime. In other words, our calculations of vessel similarity are based on behavioural features.


Selected Operational Profile features. Source: Windward


Now we just need to define the best behavioural features, ones that give a good indication, or signature, of a vessel’s day-to-day operation. These may include:

  1. Trade Patterns: the vessel’s journeys from one port to another. This feature is worthy of its own post. For now, we’ll just say that we use a process called Singular Value Decomposition (SVD) to create a directed graph between all 8,000+ ports. This enables us to find each ship’s unique graph signature.
  2. Vessel Kinematic Profile: including the speed at which the vessel travels; its acceleration characteristics; and how fast it turns.
  3. Low speed operations: things like anchoring, drifting and mooring.
  4. Ship-to-Ship Operations: which ships this vessel meets on the high seas.
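
To make one of these features concrete, here is a minimal sketch of how a kinematic profile might be computed from a vessel track. It is an illustration only, not Windward’s implementation: the `TrackPoint` fields, the helper names and the three summary statistics are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TrackPoint:
    """One position report (hypothetical layout, for illustration only)."""
    t: float        # seconds since the start of the track
    speed: float    # speed over ground, in knots
    heading: float  # heading in degrees, 0 to 360


def heading_change(a: float, b: float) -> float:
    """Smallest signed change from heading a to heading b, in degrees."""
    return (b - a + 180.0) % 360.0 - 180.0


def kinematic_profile(track: list[TrackPoint]) -> dict[str, float]:
    """Summarise speed, acceleration and turning behaviour of a track."""
    accels, turn_rates = [], []
    for prev, cur in zip(track, track[1:]):
        dt = cur.t - prev.t
        if dt <= 0:
            continue  # skip duplicate or out-of-order reports
        accels.append(abs(cur.speed - prev.speed) / dt)
        turn_rates.append(abs(heading_change(prev.heading, cur.heading)) / dt)
    return {
        "mean_speed": mean(p.speed for p in track),               # knots
        "mean_abs_accel": mean(accels) if accels else 0.0,         # knots per second
        "mean_abs_turn_rate": mean(turn_rates) if turn_rates else 0.0,  # degrees per second
    }


# A made-up three-point track: the vessel speeds up and turns through north.
track = [
    TrackPoint(0, 10.0, 350.0),
    TrackPoint(60, 12.0, 10.0),
    TrackPoint(120, 12.0, 10.0),
]
profile = kinematic_profile(track)
```

The wrap-around in `heading_change` matters: a turn from 350 to 10 degrees is a 20 degree turn, not a 340 degree one.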


Now that we’ve answered the first question, we need to answer the second: how do we measure the similarity distance between two objects when their features are as different as, say, vessel speed and the number of anchoring events?


We created a normalised distance metric that takes all the different factors into account to give us a single measurement of distance between two vessels. In this way, we can put a number on how similar their behaviours are.
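
The exact metric isn’t spelled out here, so the following is only a sketch of one common way to build a normalised distance: rescale every feature to the range 0 to 1 across the fleet (min-max scaling), then take a Euclidean distance divided by the number of features so the result also falls between 0 and 1. The vessel names, feature values and function names are made up for illustration.

```python
import math


def min_max_scale(fleet: dict[str, list[float]]) -> dict[str, list[float]]:
    """Rescale each feature to [0, 1] across all vessels in the fleet."""
    names = list(fleet)
    n = len(fleet[names[0]])
    lo = [min(fleet[v][i] for v in names) for i in range(n)]
    hi = [max(fleet[v][i] for v in names) for i in range(n)]
    return {
        v: [
            (fleet[v][i] - lo[i]) / (hi[i] - lo[i]) if hi[i] > lo[i] else 0.0
            for i in range(n)
        ]
        for v in names
    }


def normalised_distance(a: list[float], b: list[float]) -> float:
    """Root-mean-square difference of scaled features; always in [0, 1]."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Hypothetical features: [mean speed in knots, anchoring events per year]
fleet = {
    "vessel_a": [12.0, 40.0],
    "vessel_b": [11.0, 38.0],
    "vessel_c": [20.0, 2.0],
}
scaled = min_max_scale(fleet)
d_ab = normalised_distance(scaled["vessel_a"], scaled["vessel_b"])
d_ac = normalised_distance(scaled["vessel_a"], scaled["vessel_c"])
```

Without the scaling step, the anchoring count (tens of events) would swamp the speed (a few knots of difference); after scaling, both features contribute on equal terms, and `d_ab` comes out much smaller than `d_ac`.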


When we apply this algorithm to a given vessel, we calculate its similarity to all half a million ships that are out there and rank them in order of similarity. On its own, a ranked list that long is too much information to be helpful. It may also be that the ship we’re looking at is so different from every other vessel out there that even its closest match is still completely different.


In order to return only the most relevant results, we give each one a similarity score running from 0 to 100. A perfect 100 means the two vessels are identical in all their features (a theoretical notion, since there are so many parameters as to make this impossible in reality). Our domain experts set a threshold on this score, meaning that we give back to our customers only vessels whose similarity score is above the set threshold. Windward calculates similarity to all vessels on an ongoing basis, since the vessel’s dynamic features are always changing.
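
As a rough sketch of the scoring and thresholding step: assuming the distance is already normalised to the range 0 to 1, one simple mapping to a 0 to 100 score is `100 * (1 - distance)`. Both that mapping and the threshold value of 80 below are assumptions for illustration; the real threshold is set by domain experts.

```python
def similarity_score(distance: float) -> float:
    """Map a normalised distance in [0, 1] to a score in [0, 100]."""
    return 100.0 * (1.0 - distance)


def similar_vessels(distances: dict[str, float], threshold: float) -> list[tuple[str, float]]:
    """Return (vessel, score) pairs scoring above the threshold, best first."""
    scored = [(name, similarity_score(d)) for name, d in distances.items()]
    kept = [(name, score) for name, score in scored if score > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical distances from one query vessel to three others.
distances = {"vessel_b": 0.12, "vessel_a": 0.05, "vessel_c": 0.60}
result = similar_vessels(distances, threshold=80.0)
```

Here `vessel_a` and `vessel_b` are returned, in that order, while `vessel_c` is dropped. If no vessel clears the threshold, the list is empty, which is the right answer for a vessel that behaves like no other.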


For example, take the 55m Japanese fishing vessel Wakashio Maru 118 (IMO 9167772). If we look at this vessel’s operating pattern for the last year, we can see it mainly fishes off the coasts of South Africa and Mozambique; it made multiple port calls to Cape Town and Maputo.



Using the similarity algorithm we find that the most similar vessel (out of 500K vessels) is the NO.639 Dongwon (IMO 9011167), a South Korean fishing vessel which also operated that year in the same waters (see picture below). Although the patterns are quite similar, a closer look reveals that the Dongwon also operated off the coast of Madagascar.



Examining the second most similar vessel is also interesting. It is called Dongwon 117 (IMO 8827595), also a South Korean fishing vessel. But unlike the first two vessels, which are each longer than 50 metres, the Dongwon 117 is a smaller vessel, only 38 metres in length. If we had used static features, such as length, in our similarity algorithm, this vessel wouldn’t have come up in our results.



As mentioned above, in this algorithm we only use Windward’s unique dynamic behavioural features to measure similarity. The vessel type (fishing in this case) wasn’t even given to the algorithm; it simply detected similar fishing patterns in all three vessels, based on their behaviour.


In this blog we discussed ways of detecting vessels that behave similarly over the same period of time. We can extend this notion by comparing one vessel’s historical behaviour with the behaviour of vessels operating today. For example, if we know of a vessel that was involved in smuggling a few years ago, we can detect vessels operating today that display similar behavioural patterns. More on that in my next Captain’s Blog…

Ido Sovran is a Senior Data Scientist at Windward

