How to Avoid Making a Billion-Dollar Mistake: Type-Safe Data Plane Programming with SafeP4
Matthias Eichholz
Technische Universität Darmstadt, Germany
Eric Campbell
Cornell University, Ithaca, NY, USA
Nate Foster
Cornell University, Ithaca, NY, USA
Guido Salvaneschi
Technische Universität Darmstadt, Germany
Mira Mezini
Technische Universität Darmstadt, Germany
Abstract
The P4 programming language offers high-level, declarative abstractions that bring the flexibility of
software to the domain of networking. Unfortunately, the main abstraction used to represent packet
data in P4, namely header types, lacks basic safety guarantees. Over the last few years, experience
with an increasing number of programs has shown the risks of the unsafe approach, which often
leads to subtle software bugs.
This paper proposes SafeP4, a domain-specific language for programmable data planes in
which all packet data is guaranteed to have a well-defined meaning and satisfy essential safety
guarantees. We equip SafeP4 with a formal semantics and a static type system that
guarantees header validity – a common source of safety bugs according to our analysis of real-world
P4 programs. Statically ensuring header validity is challenging because the set of valid headers can
be modified at runtime, making it a dynamic program property. Our type system achieves static
safety by using a form of path-sensitive reasoning that tracks dynamic information from conditional
statements, routing tables, and the control plane. Our evaluation shows that SafeP4’s type system
can effectively eliminate common failures in many real-world programs.
2012 ACM Subject Classification Software and its engineering → Formal language definitions;
Networks → Programming interfaces
Keywords and phrases P4, data plane programming, type systems
Digital Object Identifier 10.4230/LIPIcs.ECOOP.2019.12
Related Version https://arxiv.org/abs/1906.07223
Funding This work has been co-funded by the German Research Foundation (DFG) as part of the
Collaborative Research Center (CRC) 1053 MAKI and 1119 CROSSING, by the DFG projects
SA 2918/2-1 and SA 2918/3-1, by the Hessian LOEWE initiative within the Software-Factory 4.0
project, by the German Federal Ministry of Education and Research and by the Hessian Ministry of
Science and the Arts within CRISP, by the National Science Foundation under grants CNS-1413972
and CCF-1637532, and by gifts from InfoSys and Keysight.
© Matthias Eichholz, Eric Campbell, Nate Foster, Guido Salvaneschi, and Mira Mezini;
licensed under Creative Commons License CC-BY
33rd European Conference on Object-Oriented Programming (ECOOP 2019).
Editor: Alastair F. Donaldson; Article No. 12; pp. 12:1–12:28
Leibniz International Proceedings in Informatics
Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl Publishing, Germany
1 Introduction
I couldn’t resist the temptation to put in a null reference [...] This has led to
innumerable errors, vulnerabilities, and system crashes, which have probably
caused a billion dollars of pain and damage in the last forty years.
– Tony Hoare
Modern languages offer features such as type systems, structured control flow, objects,
modules, etc. that make it possible to express rich computations in terms of high-level
abstractions rather than machine-level code. Increasingly, many languages also offer fundamental safety guarantees – e.g., well-typed programs do not go wrong [23] – that make entire
categories of programming errors simply impossible.
Unfortunately, although computer networks are critical infrastructure, providing the
communication fabric that underpins nearly all modern systems, most networks are still
programmed using low-level languages that lack basic safety guarantees. Unsurprisingly,
networks are unreliable and remarkably insecure – e.g., the first step in a cyberattack often
involves compromising a router or other network device [26, 19].
Over the past decade, there has been a shift to more flexible platforms in which the
functionality of the network is specified in software. Early efforts related to software-defined
networking (SDN) [21, 6] focused on the control plane software that computes routes,
balances load, and enforces security policies, and modeled the data plane as a simple pipeline
operating on a fixed set of packet formats. However, there has been recent interest in allowing
the functionality of the data plane itself to be specified as a program – e.g., to implement
new protocols, make more efficient use of hardware resources, or even relocate application-level functionality into the network [15, 14]. In particular, the P4 language [4] enables the
functionality of a data plane to be programmed in terms of declarative abstractions such
as header types, packet parsers, match-action tables, and structured control flow that a
compiler maps down to an underlying target device.
Unfortunately, while a number of P4’s features were clearly inspired by designs found
in modern languages, the central abstraction for representing packet data – header types –
lacks basic safety guarantees. To a first approximation, a P4 header type can be thought of
as a record with a field for each component of the header. For example, the header type for
an IPv4 packet would have a 4-bit version field, an 8-bit time-to-live field, two 32-bit fields
for the source and destination addresses, and so on.
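The record view of a header type can be illustrated with a minimal sketch. It is written in Python rather than P4, models only the four fields named above, and uses hypothetical names (IPv4Header, parse_ipv4) that do not come from the paper or from any P4 toolchain. The byte offsets follow the standard IPv4 header layout without options.

```python
import struct
from dataclasses import dataclass


@dataclass
class IPv4Header:
    """Hypothetical record mirroring a few IPv4 fields (not a full header)."""
    version: int   # 4-bit field
    ttl: int       # 8-bit time-to-live field
    src_addr: int  # 32-bit source address
    dst_addr: int  # 32-bit destination address


def parse_ipv4(data: bytes) -> IPv4Header:
    """Extract the modeled fields from the first 20 bytes of an IPv4 header."""
    if len(data) < 20:
        raise ValueError("an IPv4 header without options is 20 bytes long")
    version = data[0] >> 4  # high nibble of byte 0
    ttl = data[8]           # byte 8
    # Bytes 12-19 hold the two addresses in network (big-endian) byte order.
    src_addr, dst_addr = struct.unpack("!II", data[12:20])
    return IPv4Header(version, ttl, src_addr, dst_addr)
```

In this model a parser produces one record per header it recognizes; the remaining IPv4 fields would be added to the record in the same way.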
According to the P4 language specification, an instance of a header type may either be
valid or invalid: if the instance is valid, then all operations produce a defined value, but if it
is invalid, then reading or writing a field yields an undefined result. In practice, programs
that manipulate invalid headers can exhibit a variety of faults including dropping the packet
when it should be forwarded, or even leaking information from one packet to the next. In
addition, such programs are also not portable, since their behavior can vary when executed
on different targets.
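The hazard can be made concrete with a small Python model. This is a sketch under stated assumptions, not the behavior of any particular target: the specification only says the result is undefined, and the model picks one plausible outcome, in which the storage of an invalidated header keeps the previous packet's values. The class and method names are hypothetical; is_valid, set_valid, and set_invalid loosely mirror the validity operations P4 provides on headers.

```python
class InvalidHeaderAccess(Exception):
    """Raised by the checked accessor when a header is not valid."""


class HeaderInstance:
    """Hypothetical model of a header instance with a validity bit."""

    def __init__(self, field_names):
        self._fields = {name: 0 for name in field_names}
        self._valid = False

    def is_valid(self) -> bool:
        return self._valid

    def set_valid(self) -> None:
        self._valid = True

    def set_invalid(self) -> None:
        # Assumption: the target reuses storage and does not clear it.
        self._valid = False

    def write(self, name: str, value: int) -> None:
        self._fields[name] = value

    def read_unchecked(self, name: str) -> int:
        # Models an unsafe read: returns whatever the storage holds.
        return self._fields[name]

    def read_checked(self, name: str) -> int:
        # Models a guarded read: refuses to touch an invalid header.
        if not self._valid:
            raise InvalidHeaderAccess(name)
        return self._fields[name]


# Packet A carries an IPv4 header, so the instance is valid and populated.
ipv4 = HeaderInstance(["ttl"])
ipv4.set_valid()
ipv4.write("ttl", 64)

# Packet B carries no IPv4 header, so the instance is marked invalid.
ipv4.set_invalid()
stale_ttl = ipv4.read_unchecked("ttl")  # value left over from packet A
```

Under this model the unchecked read silently returns data from the earlier packet, whereas the checked read raises an error. A static type system aims to rule out the unchecked case before the program runs instead of detecting it at run time.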
The choice to model the semantics of header types in an unsafe way was intended to make
the language easier to implement on high-speed routers, which often have limited amounts of
memory. A typical P4 program might specify behavior for several dozen different protocols,
but any particular packet is likely to contain only a small handful of headers. It follows
that if the compiler only needs to represent the valid headers at run-time, then memory
requirements can be reduced. However, while it may have benefits for language implementers,
the design is a disaster for programmers – it repeats Hoare’s “mistake,” and bakes an unsafe
feature deep into the design of a language that has the potential to become the de facto
standard in a multi-billion-dollar industry.
This paper investigates the design of a domain-specific language for programmable data
planes in which all packet (...truncated)