The BPF framework makes it possible to load programs into the Linux kernel at runtime, for both tracing and network programming. BPF code runs in a virtual machine inside the kernel. BPF has a small (around 100 opcodes), RISC-like, 64-bit Instruction Set Architecture (ISA). It uses eleven 64-bit registers and a 512-byte stack. BPF programs are usually written in a restricted subset of C, although other languages with an LLVM-based toolchain (Rust, for example) can be used as well, and are compiled with clang/LLVM to BPF bytecode.
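To make the instruction format concrete, here is a minimal user-space sketch in C. The struct mirrors the 8-byte layout of the kernel's struct bpf_insn, and the three opcode values are the real encodings, but the interpreter itself (the names run and return_two, and the choice to support only three opcodes) is an illustrative assumption and not how the kernel executes BPF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same field layout as the kernel's struct bpf_insn: 8 bytes per instruction. */
struct insn {
    uint8_t code;        /* opcode */
    uint8_t dst_reg : 4; /* destination register */
    uint8_t src_reg : 4; /* source register */
    int16_t off;         /* signed offset (jumps, memory access) */
    int32_t imm;         /* signed 32-bit immediate */
};
_Static_assert(sizeof(struct insn) == 8, "BPF instructions are 64 bits wide");

#define OP_ADD64_IMM 0x07 /* BPF_ALU64 | BPF_ADD | BPF_K */
#define OP_MOV64_IMM 0xb7 /* BPF_ALU64 | BPF_MOV | BPF_K */
#define OP_EXIT      0x95 /* BPF_JMP | BPF_EXIT */

#define NUM_REGS   11     /* R0..R10 */
#define STACK_SIZE 512

/* mov r0, 2 ; exit */
static const struct insn return_two[] = {
    { OP_MOV64_IMM, 0, 0, 0, 2 },
    { OP_EXIT,      0, 0, 0, 0 },
};

/*
 * Hypothetical toy interpreter. Returns 0 and stores R0 in *result when the
 * program reaches exit; returns -1 on an unsupported instruction or when
 * execution runs past the last instruction.
 */
static int run(const struct insn *prog, size_t len, uint64_t *result)
{
    uint64_t reg[NUM_REGS] = { 0 };
    uint8_t stack[STACK_SIZE] = { 0 };

    assert(prog != NULL && result != NULL);

    /* R10 is the read-only frame pointer; the stack grows down from it. */
    reg[10] = (uint64_t)(uintptr_t)(stack + STACK_SIZE);

    for (size_t pc = 0; pc < len; pc++) {
        const struct insn *in = &prog[pc];

        switch (in->code) {
        case OP_MOV64_IMM:
            if (in->dst_reg >= 10) /* R10 is read-only, R11..R15 do not exist */
                return -1;
            reg[in->dst_reg] = (uint64_t)(int64_t)in->imm;
            break;
        case OP_ADD64_IMM:
            if (in->dst_reg >= 10)
                return -1;
            reg[in->dst_reg] += (uint64_t)(int64_t)in->imm;
            break;
        case OP_EXIT:
            *result = reg[0]; /* R0 carries the return value */
            return 0;
        default:
            return -1;
        }
    }
    return -1;
}
```

Calling run(return_two, 2, &r0) leaves 2 in r0. The two instructions encode to the byte sequences b7 00 00 00 02 00 00 00 and 95 00 00 00 00 00 00 00, which is what a compiler would emit for a program that simply returns 2.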
BPF programs are subject to several limitations. When a program is loaded into the kernel, a verifier checks whether it is safe to run. The verifier rejects programs that contain, for example, loops (unless they can be unrolled or, on newer kernels, proven to be bounded), out-of-bounds memory accesses or jumps, calls to functions other than helper functions, or unreachable instructions. If the program passes the verifier, it can be JIT-compiled to native instructions for the hardware.
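The kind of structural check involved can be illustrated with a toy example. The sketch below is not the kernel verifier: it is a hypothetical user-space function, toy_verify, that looks only at control flow and rejects out-of-range jumps, backward jumps (a crude stand-in for loop detection), execution that falls off the end of the program, and unreachable instructions. It assumes that every instruction occupies one 8-byte slot and that only the BPF_JMP class contains jumps. The real verifier also tracks register and memory state, which this sketch ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same field layout as the kernel's struct bpf_insn. */
struct insn {
    uint8_t code;
    uint8_t dst_reg : 4;
    uint8_t src_reg : 4;
    int16_t off;
    int32_t imm;
};

#define CLASS(code) ((code) & 0x07)
#define OP(code)    ((code) & 0xf0)
#define CLASS_JMP   0x05 /* BPF_JMP */
#define OP_JA       0x00 /* unconditional jump */
#define OP_CALL     0x80
#define OP_EXIT     0x90
#define MAX_INSNS   256  /* arbitrary limit for this sketch */

enum verdict {
    ACCEPT = 0,
    REJECT_SIZE,
    REJECT_JUMP_RANGE,
    REJECT_BACK_EDGE,
    REJECT_FALL_OFF_END,
    REJECT_UNREACHABLE,
};

/* True for conditional and unconditional jumps, false for call and exit. */
static bool is_branch(uint8_t code)
{
    return CLASS(code) == CLASS_JMP && OP(code) != OP_CALL && OP(code) != OP_EXIT;
}

static enum verdict toy_verify(const struct insn *prog, size_t len)
{
    bool reachable[MAX_INSNS] = { false };

    assert(prog != NULL);
    if (len == 0 || len > MAX_INSNS)
        return REJECT_SIZE;

    /* Pass 1: every jump must land inside the program and point forward. */
    for (size_t i = 0; i < len; i++) {
        if (!is_branch(prog[i].code))
            continue;
        long target = (long)i + 1 + prog[i].off; /* offsets are relative to the next insn */
        if (target < 0 || target >= (long)len)
            return REJECT_JUMP_RANGE;
        if (target <= (long)i)
            return REJECT_BACK_EDGE;
    }

    /* Pass 2: with forward-only jumps, one in-order sweep settles reachability. */
    reachable[0] = true;
    for (size_t i = 0; i < len; i++) {
        uint8_t code = prog[i].code;

        if (!reachable[i])
            return REJECT_UNREACHABLE;
        if (CLASS(code) == CLASS_JMP && OP(code) == OP_EXIT)
            continue; /* exit has no successor */
        if (is_branch(code)) {
            reachable[i + 1 + prog[i].off] = true;
            if (OP(code) == OP_JA)
                continue; /* no fall-through after an unconditional jump */
        }
        if (i + 1 >= len)
            return REJECT_FALL_OFF_END;
        reachable[i + 1] = true;
    }
    return ACCEPT;
}
```

A two-instruction program consisting of mov r0, 2 followed by exit is accepted, while inserting a jump with offset -2 between the two instructions is reported as REJECT_BACK_EDGE.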
BPF programs can be attached to various parts of the kernel. They fall into two major categories: programs used for kernel tracing and programs used for networking. This chapter focusses on BPF programs for networking that attach to the eXpress Data Path (XDP). XDP sits between the network driver and the TCP/IP stack and provides high-performance packet processing. It is similar to DPDK (Data Plane Development Kit), except that DPDK runs in user space and XDP runs in kernel space (see figure 1).
DPDK has been around since 2010 and has become a very popular toolkit for high-rate packet processing. XDP/BPF is newer (since around 2016) and aims to offer high-rate packet processing inside the kernel. Network programming with XDP/BPF inside the kernel has the advantage that it can rely on the kernel's security features, such as application isolation; a DPDK application has to take care of that itself. Another advantage of XDP/BPF over DPDK is that it does not need dedicated cores, whereas DPDK programs use one or more dedicated cores for packet processing. As such, XDP/BPF scales better when the packet load increases, because the kernel brings in more cores to do the processing. There are still interesting use cases for DPDK, such as software routers like VPP (Vector Packet Processing) or high-rate traffic generators like TRex or pktgen. BPF and XDP, together with the BPF Compiler Collection (BCC), are being developed in the IO Visor Project (https://www.iovisor.org).
XDP/BPF programs have access to packet metadata that contains pointers to the start and the end of the packet. The BPF program can modify the packet (e.g. decrement the TTL or rewrite MAC addresses) and add or remove headers (e.g. VLAN push or pop). After processing the packet, the BPF program returns an action code. The currently defined codes are:
- XDP_ABORTED - program error, drop the packet
- XDP_DROP - drop the packet
- XDP_PASS - pass the packet on to the network stack
- XDP_TX - transmit the packet back out of the interface it arrived on
- XDP_REDIRECT - forward the packet to another interface
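The packet handling and action codes described above can be sketched in plain C. This is a user-space illustration under stated assumptions, not a loadable XDP program: the enum repeats the numeric values of the kernel's enum xdp_action so that the example builds without kernel headers, and the function reflect_ipv4 is a hypothetical name. In a real XDP program the two pointers would be derived from the data and data_end fields of the struct xdp_md context, and the bounds check is the part the verifier insists on before any packet byte is read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same numeric values as the kernel's enum xdp_action, redefined for user space. */
enum xdp_action {
    XDP_ABORTED = 0,
    XDP_DROP,
    XDP_PASS,
    XDP_TX,
    XDP_REDIRECT,
};

#define MAC_LEN        6
#define ETH_HDR_LEN    14     /* dst MAC + src MAC + EtherType */
#define ETHERTYPE_IPV4 0x0800

/*
 * Hypothetical packet handler: swaps the source and destination MAC
 * addresses of IPv4 frames and asks for them to be sent back out (XDP_TX).
 * Other EtherTypes go to the network stack; truncated frames are dropped.
 */
static enum xdp_action reflect_ipv4(uint8_t *data, uint8_t *data_end)
{
    uint8_t tmp[MAC_LEN];
    uint16_t ethertype;

    assert(data != NULL && data <= data_end);

    /* Bounds check first. XDP code usually writes this as
     * "data + sizeof(header) > data_end"; the length form is used here
     * to stay within defined pointer arithmetic in user space. */
    if ((size_t)(data_end - data) < ETH_HDR_LEN)
        return XDP_DROP;

    /* The EtherType is stored in network byte order at offset 12. */
    ethertype = (uint16_t)((data[12] << 8) | data[13]);
    if (ethertype != ETHERTYPE_IPV4)
        return XDP_PASS;

    /* Rewrite the MAC addresses: destination and source trade places. */
    memcpy(tmp, data, MAC_LEN);
    memcpy(data, data + MAC_LEN, MAC_LEN);
    memcpy(data + MAC_LEN, tmp, MAC_LEN);

    return XDP_TX;
}
```

Given a buffer holding an IPv4 frame, reflect_ipv4(frame, frame + len) swaps the two MAC addresses in place and returns XDP_TX; a frame shorter than 14 bytes yields XDP_DROP without touching the buffer.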
XDP/BPF has considerable potential for use cases that require network programming at the edge.