Understanding Shared Memory Bank Access Interference in Multi-Core Avionics
Andreas Löfwenmark (1) and Simin Nadjm-Tehrani (2)
(1) Dept. of Computer and Information Science, Linköping University, Linköping, Sweden
(2) Dept. of Computer and Information Science, Linköping University, Linköping, Sweden
Abstract
Deployment of multi-core platforms in safety-critical applications requires reliable estimation
of worst-case response time (WCRT) for critical processes. Determining the WCRT requires
accurately estimating and measuring the interference arising from multiple processes and multiple
cores. Earlier works have proposed frameworks in which CPU, shared cache, and shared memory
(DRAM) interference can be estimated using application- and platform-dependent parameters.
In this work we examine a recent framework in which the single core equivalent (SCE) worst-case
execution time is used as a basis for deriving the WCRT. We describe the specific requirements in an
avionics context including the sharing of memory banks by multiple processes on multiple cores,
and adapt the SCE framework to account for them. We present the needed adaptations to a
real-time operating system to enforce the requirements, and present a methodology for validating
the theoretical WCRT through measurements on the resulting platform. The work reveals that
the framework indeed creates a (pessimistic) bound on the WCRT. It also discloses that the
maximum interference for memory accesses does not arise when all cores share the same memory
bank.
1998 ACM Subject Classification D.4.7 [Organization and Design] Real-Time Systems and
Embedded Systems
Keywords and phrases multi-core, avionics, shared memory systems, WCET
Digital Object Identifier 10.4230/OASIcs.WCET.2016.12
1 Introduction
Future safety-critical avionic systems will use multi-core platforms, partly because of the more
complex systems requiring more computational capacity and partly because of decreasing
availability of single-core processors; but there are still challenges remaining to demonstrate
the predictability needed for certification.
The memory hierarchy, and more specifically shared caches and dynamic random access
memory (DRAM), is one of the major sources of timing variability in a multi-core system [9].
Parallel accesses by cores can lead to interference and either of the resources can become
saturated.
Shared caches introduce a number of problems when estimating worst-case execution time
(WCET): an intra- or inter-task interference may occur when tasks on the same core evict
either their own cache lines or another task’s cache line respectively. In addition, asynchronous
operating system activities can result in cache pollution. Furthermore, inter-core interference
is the result of a task evicting a cache line used by a task on another core.
© Andreas Löfwenmark and Simin Nadjm-Tehrani;
licensed under Creative Commons License CC-BY
16th International Workshop on Worst-Case Execution Time Analysis (WCET 2016).
Editor: Martin Schoeberl; Article No. 12; pp. 12:1–12:11
Open Access Series in Informatics
Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl Publishing, Germany
The DRAM memory system is composed of a memory controller and memory devices
that store the data. The controller is a shared resource in most multi-core systems. When it is
accessed simultaneously from multiple cores it must arbitrate between the accesses, and this
arbitration can lead to non-determinism in the time domain. DRAM memory devices are
organized into ranks containing banks. Banks contain a number of rows and each row has a
number of columns. For each bank there is a row buffer that is used to store the contents
of one row in the bank. To read data from memory, the row containing that data must be
opened and the contents read to the row buffer and from there the column containing the
data can be read. Subsequent requests to the same row can be serviced with low latency, as
the row is already open. If a request requires another row to be opened, this will increase the
latency as the currently open row must be closed and the data written back to the row before
the new row can be opened. This will also affect the worst-case response time (WCRT) if
different cores request data from different rows in the same bank.
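The row-buffer behaviour described above can be illustrated with a small model. This is a minimal sketch, not a model of any real DRAM device: the three latency constants are hypothetical values in arbitrary time units, and the names Bank and total_latency are introduced here only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical latencies in arbitrary time units; not taken from any datasheet.
ROW_HIT = 1       # requested row is already in the row buffer
ROW_EMPTY = 2     # no row open: open the row, then read the column
ROW_CONFLICT = 3  # another row open: close it, open the new row, then read


@dataclass
class Bank:
    open_row: Optional[int] = None  # row currently held in the row buffer

    def access(self, row: int) -> int:
        """Return the latency of accessing `row` and update the row buffer."""
        if self.open_row == row:
            return ROW_HIT
        latency = ROW_EMPTY if self.open_row is None else ROW_CONFLICT
        self.open_row = row
        return latency


def total_latency(rows):
    """Total latency of a sequence of row accesses to one initially idle bank."""
    bank = Bank()
    return sum(bank.access(r) for r in rows)


# One core streaming through a single row: one row open, then row hits.
alone = total_latency([7, 7, 7, 7])
# Two cores interleaving accesses to different rows of the same bank:
# every access after the first forces the open row to be closed.
shared = total_latency([7, 9, 7, 9])
```

Under these assumed latencies, the interleaved sequence costs more than the single-row sequence even though both issue four requests, which is the effect on response time noted above.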
To mitigate these effects when estimating the WCET, several methods have been proposed [9]. One approach targeting the problems outlined above is the Single Core Equivalence
(SCE) framework proposed by Mancuso et al. [12]. This approach combines several of the
previously proposed approaches and consists of three parts: Colored Lockdown [11] for
managing the shared cache; MemGuard [24] for monitoring and limiting the number of
DRAM requests; PALLOC [23] for DRAM bank partitioning. Starting from single-core
WCET estimations, they are able to add interference bounds resulting from shared resource
usage on a multi-core platform to minimize the effects from other cores.
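The idea of limiting the number of DRAM requests can be sketched as a per-core budget that is replenished at the start of each regulation period. The sketch below is a simplified illustration of budget-based regulation in general, not the MemGuard implementation: the function name regulate, the tick granularity, and the numbers are assumptions, and requests that exceed the budget are simply counted as not granted rather than deferred.

```python
def regulate(requests_per_tick, budget, period):
    """Return how many requests are granted in each tick.

    requests_per_tick: requests one core wants to issue in each tick
    budget: maximum requests granted to the core per regulation period
    period: number of ticks in one regulation period
    """
    granted = []
    remaining = budget
    for tick, wanted in enumerate(requests_per_tick):
        if tick % period == 0:
            remaining = budget  # budget is replenished at each period start
        allowed = min(wanted, remaining)
        remaining -= allowed
        granted.append(allowed)
    return granted


# Hypothetical demand: the core wants 3 requests per tick for 6 ticks.
demand = [3, 3, 3, 3, 3, 3]
granted = regulate(demand, budget=4, period=3)
```

With a budget of 4 requests per 3-tick period, the core is cut off partway through each period, so the load it can place on the shared memory system per period is bounded regardless of its demand.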
For some systems it may be possible to locate the data in such a way that each core
can access its own private bank(s). In the general case, however, some data will be shared
between applications, and these applications may reside on different cores, so that
shared banks become a necessity. We may also have more cores
than banks, which likewise makes bank sharing unavoidable. Currently, the number
of cores in a multi-core chip is growing faster than the number of banks in the DRAM [8].
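The case of more cores than banks can be made concrete with a short sketch. The round-robin assignment and the core and bank counts below are hypothetical and chosen only to illustrate the argument; they do not describe any particular platform or the allocation policy used in this paper.

```python
from collections import defaultdict


def assign_banks(num_cores, num_banks):
    """Map each core to one bank round-robin; return bank -> list of cores."""
    sharing = defaultdict(list)
    for core in range(num_cores):
        sharing[core % num_banks].append(core)
    return dict(sharing)


# Fewer cores than banks: every core can have a private bank.
private = assign_banks(num_cores=4, num_banks=8)
# More cores than banks: some banks must serve more than one core.
shared = assign_banks(num_cores=12, num_banks=8)
shared_banks = {bank: cores for bank, cores in shared.items() if len(cores) > 1}
```

Whatever assignment policy is used, once the number of cores exceeds the number of banks at least one bank has to be shared, so the interference bound must cover the shared-bank case.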
In this paper we consider integrating the SCE concepts in an ARINC 653 [1] real-time
operating system (RTOS) designed for avionic systems. Specifically, we study the general
case of bounding the interference delay when using shared DRAM banks.
The contributions of this paper are:
- We adapt the SCE approach for WCRT estimation in avionics software by integrating
  assumptions valid for our context, namely cache partitioning and memory bank sharing.
- We adapt a custom RTOS to restrict memory accesses according to earlier works ([14, 5,
  24, 10]).
- We present a methodology for validation of the WCRT estimates using the modified
  RTOS, COTS multi-core hardware, and repeatable measurements.
- We show that accessing the same bank from all cores does not necessarily represent the
  worst-case interference delay.
The remainder of this paper is structured as follows. Section 2 contains related work and
Section 3 contains relevant background. We describe our SCE adaptation and the validation
in Section 4 and Section 5 respectively. We conclude the paper in Section 6.
2 Related Work
The early work on utilizing multi-core processors for deterministic systems includes CPU
scheduling. Anderson et al. [2] propose a hierarchical scheduling with different levels of
execution time estimation requirements for the different criticality levels in RTCA/DO178 [18]. Mol (...truncated)