K-means clustering for SAT-AIS data analysis

WMU Journal of Maritime Affairs, Jul 2021

The paper addresses the analysis of automatic identification system (AIS) data, in particular eliminating the impact of AIS packet collisions and detecting outliers in AIS data. A clustering-based approach is proposed to solve this problem. AIS supports the exchange of information between vessels about their trajectories, e.g. position, speed or course. However, SAT-AIS, which enables the system to work on a global scale, suffers from packet collisions because the satellite that receives AIS data from ships has a field of view covering multiple areas that are not synchronized with one another. As a result, the received data is difficult for AIS receivers to process, because most of the messages resemble noise. This paper presents the results of a computational experiment that uses the k-means algorithm for packet recovery and for dealing with noise. The outcome shows that a clustering-based approach could be used as an initial step in AIS packet reconstruction when the original data is incorrect.
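
To make the idea of clustering AIS messages and flagging outliers concrete, the following is a minimal sketch and not the authors' implementation. It assumes each message has already been reduced to a (longitude, latitude, speed) triple, uses invented sample values, seeds the centroids with a deterministic farthest-point rule, and treats a message as an outlier when its distance to its centroid exceeds three times the cluster's median distance. All of these choices, and the names `farthest_point_init`, `kmeans` and `flag_outliers`, are assumptions made for illustration.

```python
import math
import statistics


def farthest_point_init(points, k):
    """Deterministic seeding (an assumption of this sketch): start at
    points[0], then repeatedly add the point farthest from the centroids
    chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's k-means on equal-length tuples; returns (centroids, labels)."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return centroids, labels


def flag_outliers(points, centroids, labels, factor=3.0):
    """Return indices of points lying more than `factor` times the median
    member distance away from their own centroid (hypothetical rule)."""
    dists = [math.dist(p, centroids[lab]) for p, lab in zip(points, labels)]
    flagged = []
    for c in range(len(centroids)):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        if not idx:
            continue
        cutoff = factor * statistics.median(dists[i] for i in idx)
        flagged.extend(i for i in idx if dists[i] > cutoff)
    return sorted(flagged)


# Invented (longitude, latitude, speed in knots) triples for two vessels.
# Message 5 imitates a corrupted latitude field.
messages = [
    (18.50, 54.50, 10.0), (18.51, 54.51, 10.0), (18.52, 54.52, 10.0),
    (18.53, 54.53, 10.0), (18.54, 54.54, 10.0), (18.55, 54.95, 10.0),
    (19.50, 55.00, 14.0), (19.51, 55.01, 14.0), (19.52, 55.02, 14.0),
    (19.53, 55.03, 14.0), (19.54, 55.04, 14.0),
]

centroids, labels = kmeans(messages, k=2)
outliers = flag_outliers(messages, centroids, labels)
print("cluster labels:", labels)
print("suspect messages:", outliers)
```

The features are left unscaled here for brevity, so the speed difference dominates the distance. In practice the fields would probably be normalised first, and a cluster whose median distance is zero would need a separate guard.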


https://doi.org/10.1007/s13437-021-00241-3

IAMU SECTION ARTICLE

Marta Mieczyńska¹ · Ireneusz Czarnowski²

Received: 19 December 2019 / Accepted: 16 May 2021 / © The Author(s) 2021

¹ Department of Marine Telecommunications, Gdynia Maritime University, Morska 81-87, 81-225 Gdynia, Poland
² Department of Information Systems, Gdynia Maritime University, Morska 81-87, 81-225 Gdynia, Poland

Keywords K-means · Clustering · SAT-AIS · Data analysis · Maritime data analytics

1 Introduction

An automatic identification system (AIS) is an automatic tracking system that has been developed according to the International Maritime Organisation (IMO) regulations. The aim of creating such a system was to develop a technology that would automatically provide information about ships, including their unique identifier, type, position, speed, course and current state, to other vessels and shore stations (International Maritime Organisation (IMO) 2019). The dynamic information is obtained from the ship’s navigational sensors, such as its global navigation satellite system (GNSS) receiver and gyrocompass. Static information (e.g. the ship’s identifier, MMSI), on the other hand, is permanently programmed into the ship’s equipment. Both kinds of information are encoded in binary form to create AIS messages, which are transmitted regularly by dedicated transponders. AIS messages are received by either ships or land-based systems (e.g. vessel traffic systems) (exactEarth 2015).

Most AIS messages are transmitted on a regular basis. For instance, messages containing dynamic information are exchanged every 2 to 180 s (European Space Agency 2019). Hence, a significant amount of data can be received during a given recording period. Processing such a huge dataset and deriving meaningful information from it requires modern, advanced technology (Czarnowski 2019). Machine learning methods are one possible approach, since they provide algorithms that cope with, among other things, finding patterns in huge datasets (Mieczyńska and Czarnowski 2019).

The need to analyse AIS data arises more and more often, because such analysis is used by various applications. AIS data analysis is especially important for the maritime industry, since it may improve the monitoring and optimization of maritime processes. Some of these applications relate to maritime safety. For instance, a system that predicts vessels’ movement may enable early collision avoidance between ships (Zhang et al. 2015).
The same system may be indispensable when it comes to predicting a vessel’s location (Liang et al. 2019) in emergency situations, when the connection with that ship is lost. Another example of the analysis of both real-time and historical data is the identification of abnormal vessel activity, which may lead to the detection of an act of piracy (Lane et al. 2010). AIS data might also be useful in industrial research in the form of maritime traffic analysis, for example prediction and optimization of the load in seaports (Millefiori et al. 2016) or route planning (He et al. 2019).

The original, terrestrial AIS utilizes two VHF (very high frequency) channels (161.975 MHz and 162.025 MHz), each with a bandwidth of 25 kHz. To manage access to the wireless medium by multiple AIS transponders, the TDMA (time division multiple access) method is used. A single device is allowed to transmit only during a pre-determined period of time (called a slot). More specifically, each AIS transponder must preannounce the time slots it wants to use; this technique is called self-organizing TDMA (SOTDMA). Time slots filled with information from various devices form a time frame. Nine 1-min-long time frames (each consisting of 2250 time slots of approximately 26.67 ms per radio frequency channel) are then grouped into a communication cell. Within such a communication cell, slot selection is organized randomly. Devices choose their time slots so that they can transmit at a pre-assumed rate (which depends on such factors as the speed of the vessel or its heading). If an AIS transponder changes its slot assignment, it must transmit its new assignment and the timeout for that assignment.

Although the original (terrestrial) AIS has many advantages and potential applications, it also has some drawbacks.
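
The frame and slot figures given above for terrestrial AIS can be checked with a few lines of arithmetic. The sketch below is only an illustration: it takes the 2250 slots per one-minute frame from the text, and the helper names `slot_index` and `slot_start` are hypothetical, not part of any AIS library.

```python
SLOTS_PER_FRAME = 2250   # slots per frame and channel, as stated in the text
FRAME_SECONDS = 60.0     # one-minute frame, as stated in the text

# Length of a single slot in seconds (60 s divided into 2250 slots).
SLOT_SECONDS = FRAME_SECONDS / SLOTS_PER_FRAME


def slot_index(offset_s):
    """Return the slot (0 .. SLOTS_PER_FRAME - 1) that contains a time
    offset measured in seconds from the start of the frame."""
    if not 0.0 <= offset_s < FRAME_SECONDS:
        raise ValueError("offset must lie inside one frame")
    return int(offset_s * SLOTS_PER_FRAME / FRAME_SECONDS)


def slot_start(index):
    """Return the start time of a slot, in seconds from the frame start."""
    if not 0 <= index < SLOTS_PER_FRAME:
        raise ValueError("slot index out of range")
    return index * SLOT_SECONDS


print(f"slot length: {SLOT_SECONDS * 1000:.2f} ms")
print("slot containing t = 30 s:", slot_index(30.0))
```

Two transmitters that map to the same slot index on the same channel would overlap in time, which is the situation that slot preannouncement in SOTDMA is meant to avoid.
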
As mentioned before, AIS was originally developed to provide information about nearby vessels that could be used to prevent collisions. Information about ships’ movement (course, position, speed) is exchanged regularly between ships and shore stations, so they are able to recognize other vessels that may appear on their paths. However, the main limitation of this communication is its range. Due to the Earth’s curvature, the horizontal range of terrestrial AIS is about 74 km (40 nautical miles) from shore (European Space Agency 2019). Consequently, the original AIS works on a local scale only, i.e. on a ship-to-ship basis or around coastal zones.

To overcome this problem and enable AIS to work on a global scale, a SAT-AIS system has been proposed (European Space Agency 2019). In general, SAT-AIS utilizes satellites (e.g. AAUSAT3) in low Earth orbit to increase the range of transmission. Messages sent by ships are recorded by a satellite (which has a broader field of view due to its altitude) and then transmitted to ground stations for further processing and distribution (Wawrzaszek et al. 2019). Although it seems to solve man (...truncated)
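
The range limitation of terrestrial AIS and the wider view from orbit described above can be illustrated with simple geometry. This is a rough sketch under stated assumptions: a spherical Earth of radius 6371 km, the common 4/3 effective-radius factor for VHF refraction, and antenna heights and a 600 km satellite altitude that are hypothetical values chosen for illustration, not figures from the paper.

```python
import math

EARTH_RADIUS_KM = 6371.0        # mean Earth radius (spherical model)
KM_PER_NAUTICAL_MILE = 1.852    # exact by definition


def nmi_to_km(nmi):
    """Convert nautical miles to kilometres."""
    return nmi * KM_PER_NAUTICAL_MILE


def horizon_km(height_m, k_factor=1.0):
    """Line-of-sight distance to the horizon from a given height.

    k_factor scales the Earth radius; 4/3 is a common assumption for
    VHF propagation, 1.0 gives the purely geometric horizon.
    """
    r = k_factor * EARTH_RADIUS_KM
    h = height_m / 1000.0
    return math.sqrt(2.0 * r * h + h * h)


def link_range_km(height_a_m, height_b_m, k_factor=4.0 / 3.0):
    """Maximum line-of-sight range between two antennas near the surface."""
    return horizon_km(height_a_m, k_factor) + horizon_km(height_b_m, k_factor)


print(f"40 nautical miles = {nmi_to_km(40):.2f} km")
# Hypothetical antenna heights: 20 m on the ship, 100 m at the shore station.
print(f"ship-to-shore range: {link_range_km(20, 100):.1f} km")
# Hypothetical low-Earth-orbit altitude of 600 km.
print(f"horizon from orbit: {horizon_km(600_000):.0f} km")
```

With these assumed heights the surface link reaches only some tens of kilometres, while the horizon seen from orbit lies thousands of kilometres away, which is why a single satellite footprint can cover many independently organized areas at once.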


This is a preview of a remote PDF: https://link.springer.com/content/pdf/10.1007/s13437-021-00241-3.pdf
Article home page: https://link.springer.com/article/10.1007/s13437-021-00241-3

Mieczyńska, Marta, Czarnowski, Ireneusz. K-means clustering for SAT-AIS data analysis, WMU Journal of Maritime Affairs, 2021, pp. 1-24, DOI: 10.1007/s13437-021-00241-3