Performance analysis of optimized versatile video coding software decoders on embedded platforms

Journal of Real-Time Image Processing, Oct 2023

In recent years, the global demand for high-resolution video and the emergence of new multimedia applications have created the need for a new video coding standard. In response, the versatile video coding (VVC) standard was released in July 2020, providing up to 50% bit-rate savings for the same video quality compared to its predecessor, high-efficiency video coding (HEVC). However, these bit-rate savings come at the cost of high computational complexity, which is particularly problematic for live applications and resource-constrained embedded devices. This paper evaluates two optimized VVC software decoders, OpenVVC and Versatile Video deCoder (VVdeC), designed for low-resource platforms. These decoders exploit data-level parallelism through single instruction, multiple data (SIMD) instructions and functional-level parallelism through frame-, tile-, and slice-based parallelism. The decoding runtime, energy consumption, and memory consumption of the two decoders are compared on two different resource-constrained embedded devices. The results show that both decoders achieve real-time decoding of full high-definition (FHD) video on the first platform using 8 cores and real-time decoding of high-definition (HD) video on the second platform using only 4 cores, with comparable average energy consumption: around 26 J and 15 J for the 8-core and 4-core platforms, respectively. OpenVVC also uses less memory, with a lower average maximum memory consumption during runtime than VVdeC.


Journal of Real-Time Image Processing (2023) 20:120
https://doi.org/10.1007/s11554-023-01376-7

RESEARCH

Anup Saha¹ · Wassim Hamidouche²,³ · Miguel Chavarrías¹ · Fernando Pescador¹ · Ibrahim Farhat²

Received: 9 May 2023 / Accepted: 4 October 2023
© The Author(s) 2023
¹ CITSEM at Universidad Politécnica de Madrid, Madrid, Spain
² Univ. Rennes, INSA Rennes, CNRS, IETR-UMR 6164, Rennes, France
³ Technology Innovation Institute (TII), P.O. Box 9639, Masdar City, Abu Dhabi, UAE

This work was supported by both the Energy Efficient Enhanced Media Streaming (3EMS) project, funded by the Brittany Region, and the TALENT project (PID2020-116417RB-C41), funded by the Spanish Ministerio de Ciencia y Innovación.

1 Introduction

A new era of information and communication technologies is emerging, in which video communication plays an essential role in internet traffic. In particular, the significant increase in video traffic, driven by emerging video formats and applications, has led to the development of a new video coding standard named versatile video coding (VVC)/H.266. VVC was standardized in July 2020 by the Joint Video Experts Team (JVET) of the ITU-T Video Coding Experts Group (VCEG) and the Moving Picture Experts Group (MPEG) of ISO/IEC JTC 1/SC 29 [1]. VVC enables bit-rate savings of up to 50% [15] with respect to the previous standard, High Efficiency Video Coding (HEVC)/H.265 [2], for the same video quality. However, this achievement comes at the cost of 8× and 2× more complexity than HEVC for the reference encoder and decoder, respectively [3]. In this scenario, the main challenge is to develop real-time VVC codecs, whether hardware or software solutions for encoding or decoding, that can run on the resource-constrained embedded platforms commonly found in consumer electronics.

Each coding standard is released with a reference software implementation available to the scientific community. These implementations incorporate all the features of the standard but offer minimal speed performance. For VVC, the reference software is the VVC test model (VTM) [4].
Taking the reference software as a starting point, research groups and companies develop their own real-time software and hardware solutions. These solutions mainly exploit the intrinsic parallelism of the algorithms, at both the data and functional levels, to improve speed and energy consumption. At the data level, some data operations in the source code are optimized using single instruction, multiple data (SIMD) instructions [5], which apply the same arithmetic operation to several operands with a single processor instruction. At the functional level, the decoder takes advantage of the independent processing of pictures [6] or of smaller parts of a picture, such as slices [7] or tiles [8]. Slice- and tile-level parallelism requires that the bitstream be encoded with these normative tools enabled, since they break the dependencies between adjacent regions.

In this work, two open-source VVC decoders are evaluated and compared. These decoders, OpenVVC [9, 10] and Versatile Video deCoder (VVdeC) [11], are optimized using data- and functional-level parallelism techniques. This paper evaluates their decoding runtime, power consumption, and memory usage on two different embedded platforms. The results show that, on the first target platform using 8 cores, both decoders achieve 15 to 34 frames per second (fps) for ultra-high-definition (UHD) sequences with quantization parameters (QPs) 27 and 37, and real-time decoding of full high-definition (FHD) and high-definition (HD) sequences. On the second embedded platform with 4 cores, OpenVVC and VVdeC achieve 16 to 28 fps for FHD sequences with QPs 27 and 37 and real-time decoding of all HD sequences.
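As an illustration of the tile-based functional parallelism described in this section, the following minimal Python sketch splits a picture into independent rectangular tiles and processes them with a pool of worker threads. It is not taken from OpenVVC or VVdeC: the names `split_into_tiles`, `reconstruct_tile`, `decode_sequential`, and `decode_parallel` are hypothetical, and the per-tile operation (adding an offset and clipping to 8 bits) is only a placeholder for real tile decoding. Because of the Python interpreter lock, the sketch shows the structure of the technique rather than a speed-up; native decoders would typically map tiles onto operating-system threads.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_tiles(picture, tile_cols, tile_rows):
    """Split a 2-D list of samples into a grid of rectangular tiles (raster order)."""
    height, width = len(picture), len(picture[0])
    tile_h, tile_w = height // tile_rows, width // tile_cols
    tiles = []
    for ty in range(tile_rows):
        for tx in range(tile_cols):
            y0, x0 = ty * tile_h, tx * tile_w
            # The last row/column of tiles absorbs any remainder.
            y1 = height if ty == tile_rows - 1 else y0 + tile_h
            x1 = width if tx == tile_cols - 1 else x0 + tile_w
            tiles.append([row[x0:x1] for row in picture[y0:y1]])
    return tiles


def reconstruct_tile(tile):
    """Hypothetical stand-in for tile decoding: add an offset, clip to [0, 255]."""
    return [[min(255, max(0, sample + 16)) for sample in row] for row in tile]


def decode_sequential(tiles):
    """Process the tiles one after another on a single thread."""
    return [reconstruct_tile(tile) for tile in tiles]


def decode_parallel(tiles, workers=4):
    """Process the tiles concurrently; valid only because tiles are independent."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns the results in the original tile order.
        return list(pool.map(reconstruct_tile, tiles))


# Toy 16x8 "picture" split into a 4x2 tile grid (assumed sizes, for illustration).
picture = [[(x * y) % 256 for x in range(16)] for y in range(8)]
tiles = split_into_tiles(picture, tile_cols=4, tile_rows=2)
```

Calling `decode_parallel(tiles)` yields the same tiles as `decode_sequential(tiles)`, which is the property that makes this form of parallelism safe: no tile reads data produced by another tile.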
Regarding maximum memory usage and energy consumption, the experimental results show that VVdeC requires 2× more memory than OpenVVC on both target platforms. OpenVVC, in turn, consumes the same energy as VVdeC on the embedded system-on-chip (ESoC) platform ESoC1 with 8 cores and around 1.25× VVdeC's energy on the ESoC2 platform with 4 cores.

The remainder of the paper is structured as follows. Sect. 2 briefly introduces the VVC standard. Sect. 3 describes the optimizations included in VVC decoders using specific parallelization techniques, along with the state of the art of VVC decoders, followed by a brief description of open-source VVC decoders in Sect. 4. Sect. 5 details the proposed optimization techniques in the OpenVVC decoder. The results obtained and the comparison between the performance of OpenVVC and VVdeC are provided in Sect. 6. Finally, Sect. 7 concludes the paper.

2 Introduction to VVC

This section briefly describes the VVC decoder and related codec optimizations in scientific literature. Like its predecessors, (...truncated)


This is a preview of a remote PDF: https://link.springer.com/content/pdf/10.1007/s11554-023-01376-7.pdf
Article home page: https://link.springer.com/article/10.1007/s11554-023-01376-7

Saha, Anup, Hamidouche, Wassim, Chavarrías, Miguel, Pescador, Fernando, Farhat, Ibrahim. Performance analysis of optimized versatile video coding software decoders on embedded platforms, Journal of Real-Time Image Processing, 2023, pp. 1-13, Volume 20, Issue 6, DOI: 10.1007/s11554-023-01376-7