SEMANTIC SCENE UNDERSTANDING FOR THE AUTONOMOUS PLATFORM

Article PDF: https://www.int-arch-photogramm-remote-sens-spatial-inf-sci.net/XLIII-B2-2020/637/2020/isprs-archives-XLIII-B2-2020-637-2020.pdf

The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B2-2020, 2020 XXIV ISPRS Congress (2020 edition)

B. Vishnyakov *, Y. Blokhinov, I. Sgibnev, V. Sheverdin, A. Sorokin, A. Nikanorov, P. Masalov, K. Kazakhmedov, S. Brianskiy, E. Andrienko, Y. Vizilter

FGUP «State Research Institute of Aviation Systems», Russia, 125319, Moscow, Viktorenko street, 7 - (vishnyakov, yuri.blokhinov, sgibnev, sheverdin, ans, avnikanorov, masalov, kkirill, sbrianskiy, viz)@gosniias.ru

KEY WORDS: multi-sensor platform, autonomous vehicle, SLAM, CNN, dynamic scene analysis, semantic segmentation, off-road, autonomous driving, camera calibration, LiDAR calibration.

ABSTRACT: In this paper we describe a new multi-sensor platform for data collection and algorithm testing. We propose several methods for solving the semantic scene understanding problem for autonomous land vehicles. We describe our approaches to automatic camera and LiDAR calibration; three-dimensional scene reconstruction and odometry calculation; semantic segmentation that provides obstacle recognition and underlying surface classification; object detection; and point cloud segmentation. We also describe our virtual simulation complex based on Unreal Engine, which can be used for both data collection and algorithm testing. We collected a large database of field and virtual data: more than 1,000,000 real images with corresponding LiDAR data and more than 3,500,000 simulated images with corresponding LiDAR data. All proposed methods were implemented and tested on our autonomous platform; accuracy estimates were obtained on the collected database.

1. INTRODUCTION

The autonomous car market is currently growing at an exponential rate, and many companies are developing their own concepts of driverless vehicles.
A self-driving car, also called an autonomous vehicle, is a vehicle that uses a combination of sensors, cameras, radars and artificial intelligence to travel between destinations without any human effort. The scientific community publishes a huge number of papers on object detection, scene segmentation and 3D reconstruction using cameras, LiDARs and radars. Combining these algorithms makes it possible to develop high-level algorithms for autonomous driving. However, most driving algorithms rely on a vector map of the roads, so autonomous driving in off-road conditions and in the countryside is still a challenging problem. The solution requires robust algorithms for semantic segmentation, three-dimensional scene reconstruction and object detection. All these algorithms work much better in cities than in the countryside.

In this paper we describe our multi-sensor off-road platform for data collection and algorithm testing. We propose a new, fully automatic technique for mutual calibration of machine vision cameras and LiDARs, and discuss algorithms for real-time semantic 3D scene reconstruction.

2. AUTONOMOUS PLATFORM

The autonomous platform is a relatively large vehicle with dimensions close to those of a real car (1.8 m wide, 4.4 m long).

2.1 Sensors

The core of the autonomous platform is a computer vision hardware complex, which consists of ten short focus (5 mm lens) and two long focus (25 mm lens) Prosilica GT2050C machine vision cameras, four Goldeye G-032 SWIR TEC1 SWIR cameras and four Velodyne VLP-16 LiDARs. This system allows us to collect video and three-dimensional data and try out algorithms for three-dimensional reconstruction, semantic segmentation and obstacle classification. The vision system is mounted on a metal frame support that allows one to change the distance between the cameras and quickly install or remove other sensors if necessary. In addition, two AXIS M5525 PTZ cameras for object detection are placed on the platform. This computer vision subsystem is designed to collect data and try out algorithms for object detection and recognition, semantic segmentation and three-dimensional scene reconstruction. The sensor locations on the platform are shown in Figure 1.

Figure 1. Ten short focus cameras (purple), two long focus cameras (green), four LiDARs (grey circles), two PTZ cameras (orange), four SWIR cameras (light red)

The hardware processing part consists of seven computing units. Each is a Vecow RCS-9430FHR-RTX2080-256 industrial computer based on an Intel Core i7-7700 processor, with an Nvidia GeForce RTX 2080 graphics card and a PE-1004, a special four-channel gigabit network card with Power over Ethernet (PoE). All machine vision cameras are connected to the PE-1004 boards, since each camera generates a data stream of approximately 1 Gbit per second. Other devices are connected to a gigabit switch and are in the same local area network. A Delphi ESR-2.5 radar, a GPS receiver and an Xsens inertial system can also be mounted on the vehicle.

* Corresponding author

This contribution has been peer-reviewed. https://doi.org/10.5194/isprs-archives-XLIII-B2-2020-637-2020 | © Authors 2020. CC BY 4.0 License.

We use an APC Smart-UPS SRT 1000VA / 900W uninterruptible power supply with five 1500VA batteries, which allows the computing unit and the sensors of the vision system to run continuously for up to 8 hours.

2.2 Software

We developed dedicated software on the ROS2 platform under Ubuntu 18, which allows us to synchronously record data from all sensors in the system, including the cameras, LiDARs, radar, GPS receiver and inertial system, to a specialized storage called rosbag. Data streams from the cameras and LiDARs are synchronized at the hardware level with synchronization cables and over the PTP/PPS protocols.
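Once streams share a common hardware clock, recorded messages from sensors running at different rates can be paired offline by timestamp. The sketch below illustrates one common way to do this and is not the authors' software: the function name `pair_by_nearest_timestamp`, the use of plain float timestamps in seconds instead of ROS2 message types, and the tolerance value are all assumptions made for illustration.

```python
from bisect import bisect_left
from typing import List, Optional, Tuple


def pair_by_nearest_timestamp(
    camera_stamps: List[float],
    lidar_stamps: List[float],
    tolerance: float,
) -> List[Tuple[int, Optional[int]]]:
    """Pair each camera timestamp with the closest LiDAR timestamp.

    Hypothetical helper: both lists hold timestamps in seconds, sorted in
    ascending order. Returns (camera_index, lidar_index) pairs, where
    lidar_index is None if no LiDAR stamp lies within `tolerance` seconds.
    """
    pairs: List[Tuple[int, Optional[int]]] = []
    for cam_idx, stamp in enumerate(camera_stamps):
        # Position of the first LiDAR stamp that is >= the camera stamp.
        pos = bisect_left(lidar_stamps, stamp)
        # Only the neighbours just before and just after can be the closest.
        candidates = [i for i in (pos - 1, pos) if 0 <= i < len(lidar_stamps)]
        best = min(
            candidates,
            key=lambda i: abs(lidar_stamps[i] - stamp),
            default=None,
        )
        # Reject the match if it is further away than the allowed tolerance.
        if best is not None and abs(lidar_stamps[best] - stamp) > tolerance:
            best = None
        pairs.append((cam_idx, best))
    return pairs
```

Camera frames that come back with `None` have no LiDAR sweep close enough in time and can be skipped when building image and point cloud pairs. The binary search keeps the cost at O(n log m) for n camera frames and m LiDAR sweeps.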
A remote Wi-Fi connection from the operator to the computing unit is optionally provided for monitoring the data collection process or testing computer vision algorithms.

3. VIRTUAL SIMULATION

Many scientific labs and engineering groups use virtual simulation as the most affordable way to generate extra data for training neural networks. We also use virtual simulation to obtain image and LiDAR data under different conditions. We chose Unreal Engine, a game engine developed and supported by Epic Games, as the basic simulation tool. A game engine was chosen rather than a professional simulation tool (for example, Vega Prime) because game engines currently provide the most realistic scene visualization. Since 1998, when the first version of Unreal Engine was released, various versions of the engine have been used in more than a hundred games and a thousand other projects, including scientific projects and virtual simulation tools.

Figure 3. Second virtual scene sample

3.2 LiDAR modelling

When modeling the VLP-16 LiDAR, for a single measurement 16 rays are emitted at different polar angles from the point where the LiDAR is mounted, a (...truncated)