Understanding Shared Memory Bank Access Interference in Multi-Core Avionics

Andreas Löfwenmark (1) and Simin Nadjm-Tehrani (2)

1 Dept. of Computer and Information Science, Linköping University, Linköping, Sweden
2 Dept. of Computer and Information Science, Linköping University, Linköping, Sweden

Abstract

Deployment of multi-core platforms in safety-critical applications requires reliable estimation of worst-case response time (WCRT) for critical processes. Determining the WCRT requires accurate estimation and measurement of the interference arising from multiple processes and multiple cores. Earlier works have proposed frameworks in which CPU, shared cache, and shared memory (DRAM) interference can be estimated using application- and platform-dependent parameters. In this work we examine a recent work in which single core equivalent (SCE) worst-case execution time is used as a basis for deriving WCRT. We describe the specific requirements in an avionics context, including the sharing of memory banks by multiple processes on multiple cores, and adapt the SCE framework to account for them. We present the adaptations needed in a real-time operating system to enforce the requirements, and present a methodology for validating the theoretical WCRT through measurements on the resulting platform. The work reveals that the framework indeed creates a (pessimistic) bound on the WCRT. It also discloses that the maximum interference for memory accesses does not arise when all cores share the same memory bank.

1998 ACM Subject Classification D.4.7 [Organization and Design] Real-Time Systems and Embedded Systems

Keywords and phrases multi-core, avionics, shared memory systems, WCET

Digital Object Identifier 10.4230/OASIcs.WCET.2016.12

© Andreas Löfwenmark and Simin Nadjm-Tehrani; licensed under Creative Commons License CC-BY
16th International Workshop on Worst-Case Execution Time Analysis (WCET 2016). Editor: Martin Schoeberl; Article No. 12; pp. 12:1–12:11
Open Access Series in Informatics
Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl Publishing, Germany

1 Introduction

Future safety-critical avionic systems will use multi-core platforms, partly because increasingly complex systems require more computational capacity and partly because single-core processors are becoming less available. Challenges remain, however, in demonstrating the predictability needed for certification.

The memory hierarchy, and more specifically shared caches and dynamic random access memory (DRAM), is one of the major sources of timing variability in a multi-core system [9]. Parallel accesses by cores can lead to interference, and either resource can become saturated.

Shared caches introduce a number of problems when estimating worst-case execution time (WCET). Intra-task or inter-task interference may occur when tasks on the same core evict their own cache lines or another task’s cache lines, respectively. In addition, asynchronous operating system activities can result in cache pollution. Furthermore, inter-core interference is the result of a task evicting a cache line used by a task on another core.

The DRAM memory system is composed of a memory controller and memory devices that store the data.
The controller is a shared resource in most multi-core systems. When it is accessed simultaneously from multiple cores, it has to arbitrate between the accesses, and this arbitration can lead to non-determinism in the time domain. DRAM memory devices are organized into ranks containing banks. Each bank contains a number of rows and each row has a number of columns. For each bank there is a row buffer that is used to store the contents of one row in the bank. To read data from memory, the row containing that data must be opened and its contents read into the row buffer, from which the column containing the data can then be read. Subsequent requests to the same row can be serviced with low latency, as the row is already open. If a request requires another row to be opened, the latency increases, as the currently open row must be closed and its data written back to the row before the new row can be opened. This also affects the worst-case response time (WCRT) if different cores request data from different rows in the same bank.

To mitigate these effects when estimating the WCET, several methods have been proposed [9]. One approach targeting the problems outlined above is the Single Core Equivalence (SCE) framework proposed by Mancuso et al. [12]. This approach combines several of the previously proposed approaches and consists of three parts: Colored Lockdown [11] for managing the shared cache; MemGuard [24] for monitoring and limiting the number of DRAM requests; and PALLOC [23] for DRAM bank partitioning. Starting from single-core WCET estimates, they add interference bounds resulting from shared resource usage on a multi-core platform, with the aim of minimizing the effects from other cores.
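
As an illustration of the row-buffer behaviour described earlier in this section, the following minimal Python sketch models each bank as holding at most one open row. The latency constants, the `Bank` class, and the `total_latency` helper are hypothetical and chosen only for illustration; they are not taken from the SCE framework or from any particular DRAM device. The model serves requests strictly in order and ignores memory-controller arbitration and request reordering, so it illustrates only the cost of row conflicts and should not be read as a worst-case interference model.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple

# Hypothetical latencies in arbitrary time units (assumptions, not measured
# values and not taken from the paper).
ROW_HIT = 1       # requested row is already in the row buffer
ROW_EMPTY = 2     # no row is open: open the row, then read the column
ROW_CONFLICT = 3  # another row is open: close it, open the new row, read


@dataclass
class Bank:
    """A DRAM bank with a single row buffer."""
    open_row: Optional[int] = None  # row currently held in the row buffer

    def access(self, row: int) -> int:
        """Return the latency of accessing `row` and update the row buffer."""
        if self.open_row == row:
            return ROW_HIT
        latency = ROW_EMPTY if self.open_row is None else ROW_CONFLICT
        self.open_row = row
        return latency


def total_latency(requests: Iterable[Tuple[int, int]], n_banks: int) -> int:
    """Sum the latencies of an in-order sequence of (bank, row) requests."""
    banks = [Bank() for _ in range(n_banks)]
    return sum(banks[bank].access(row) for bank, row in requests)


if __name__ == "__main__":
    # One requester reading repeatedly from the same row of bank 0.
    print(total_latency([(0, 5)] * 4, n_banks=1))            # 5
    # Two requesters alternating between different rows of the same bank.
    print(total_latency([(0, 5), (0, 9)] * 2, n_banks=1))    # 11
    # The same two requesters, each using its own bank.
    print(total_latency([(0, 5), (1, 9)] * 2, n_banks=2))    # 6
```

With these assumed costs, two requesters alternating between rows of one bank pay the conflict penalty on nearly every access, whereas the same requests spread over two banks mostly hit rows that are already open.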

For some systems it may be possible to locate the data in such a way that each core can access its own private bank(s). In the general case, however, some data will be shared between applications, and these applications may reside on different cores, making shared banks a necessity. There may also be more cores than banks, which likewise makes bank sharing necessary. Currently, the number of cores in a multi-core chip is growing faster than the number of banks in the DRAM [8].

In this paper we consider integrating the SCE concepts in an ARINC 653 [1] real-time operating system (RTOS) designed for avionic systems. Specifically, we study the general case of bounding the interference delay when using shared DRAM banks. The contributions of this paper are:

- We adapt the SCE approach for WCRT estimation in avionics software by integrating assumptions valid for our context, namely cache partitioning and memory bank sharing.
- We adapt a custom RTOS to restrict memory accesses according to earlier works ([14, 5, 24, 10]).
- We present a methodology for validation of the WCRT estimates using the modified RTOS, COTS multi-core hardware, and repeatable measurements.
- We show that accessing the same bank from all cores does not necessarily represent the worst-case interference delay.

The remainder of this paper is structured as follows. Section 2 contains related work and Section 3 contains relevant background. We describe our SCE adaptation and the validation in Section 4 and Section 5, respectively. We conclude the paper in Section 6.

2 Related Work

The early work on utilizing multi-core processors for deterministic systems includes CPU scheduling. Anderson et al. [2] propose a hierarchical scheduling with different levels of execution time estimation requirements for the different criticality levels in RTCA/DO-178 [18]. Mol (...truncated)