Exploration of the Dirty Pipe Vulnerability (CVE-2022-0847)

Intro

This blog post reflects our exploration of the Dirty Pipe vulnerability in the Linux kernel. The bug was discovered by Max Kellermann and described in his original publication. If you haven’t read that publication yet, we’d suggest that you read it first (maybe also twice ;)). While Kellermann’s post is a great resource that contains all the relevant information to understand the bug, it assumes some familiarity with the Linux kernel. To fully understand what’s going on, we’d like to shed some light on specific kernel internals. The aim of this post is to share our knowledge and to provide a resource for other interested individuals.

The idea of this post is as follows: we take a small proof-of-concept (PoC) program and divide it into several stages. Each stage issues a system call (or syscall for short), and we look inside the kernel to understand which actions and state changes occur in response to those calls. For this we use both the kernel source code (elixir.bootlin.com, version 5.17.9) and a kernel debugging setup (derived from linux-kernel-debugging). The Dirty Pipe-specific debugging setup and the PoC code are provided in a GitHub repository.

Our Goal / Disclaimer

It’s important to talk about the goal of our investigation first:

Do we want to understand how the Linux kernel works in general? Maybe not right now…
Do we want to know what the vulnerability is? Why it occurs? How it can be exploited? Yes!

It is important to keep in mind what we want to achieve. The Linux kernel is a very complex piece of software. We have to leave some blind spots, but that’s absolutely okay :)

Thus, when we show kernel source code, we will often hide parts that are not directly relevant to our discussion to improve readability. In general, those parts may very well be security-relevant, and we encourage you to follow the links to review the original code. In particular, if you want to find your own vulnerabilities or become a kernel hacker, you should spend more time understanding (all) the mechanisms and details! ;)

Page Cache

The page cache plays an important role in the Dirty Pipe vulnerability so let’s see what it is and how it works first.

The physical memory is volatile and the common case for getting data into the memory is to read it from files. Whenever a file is read, the data is put into the page cache to avoid expensive disk access on the subsequent reads. Similarly, when one writes to a file, the data is placed in the page cache and eventually gets into the backing storage device. The written pages are marked as dirty and when Linux decides to reuse them for other purposes, it makes sure to synchronize the file contents on the device with the updated data. source

In particular, the above means that if any process on the system (or the kernel itself) requests data from a file that is already cached, the cached data is used instead of accessing the disk. Of course, there are ways to influence this behavior, either by passing flags such as O_DIRECT or O_SYNC when opening a file, or by explicitly instructing the kernel to synchronize dirty pages. You could also discard the clean cached pages through the procfs pseudo file system: # echo 1 > /proc/sys/vm/drop_caches. However, in most situations the cached data is what is ultimately used by the kernel (and thus also by user processes).
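To make the behavior described above a bit more tangible, here is a minimal sketch in C. The helper name read_sees_unsynced_write and the idea of using two independent descriptors are our own illustration and not part of the PoC. It writes through one descriptor, reads the same file through a second one before anything has been explicitly synced, and only then calls fsync() to request write-back. The program cannot prove where the data came from, but the observation is consistent with both descriptors going through the same cached pages.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the PoC): write `msg` through one
 * descriptor, then read it back through a second, independent descriptor
 * before anything has been explicitly synced.
 * Returns 1 if the reader saw the new data, 0 if not, -1 on error.
 */
int read_sees_unsynced_write(const char *path, const char *msg)
{
    size_t len = strlen(msg);
    char buf[256];
    int result = -1;

    if (len == 0 || len > sizeof(buf))
        return -1;

    /* Writer: creates (or truncates) the file. */
    int wfd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (wfd < 0)
        return -1;

    /* Reader: a separate open file description with its own offset. */
    int rfd = open(path, O_RDONLY);
    if (rfd < 0) {
        close(wfd);
        return -1;
    }

    if (write(wfd, msg, len) == (ssize_t)len) {
        /* No fsync() has happened yet; the reader still sees the data. */
        ssize_t n = read(rfd, buf, sizeof(buf));
        result = (n == (ssize_t)len && memcmp(buf, msg, len) == 0);

        /* Explicitly ask the kernel to write the dirty page(s) back. */
        if (fsync(wfd) != 0)
            result = -1;
    }

    close(rfd);
    close(wfd);
    return result;
}
```

Calling read_sees_unsynced_write("/tmp/demo.txt", "hello") on an ordinary file system should return 1; the path is just a placeholder.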

At this point we can already tease what the Dirty Pipe vulnerability is all about: It will allow us to overwrite the cached data of any file that we are allowed to open (read-only access is sufficient), without the page cache actually marking the overwritten page as ‘dirty’. Thus, we can trick the system into thinking that the file contents changed (at least for a while) without leaving traces on disk.

But let’s not get ahead of ourselves; the goal is, after all, to understand why this happens. As we can see below, the first thing our PoC does is open a file for reading, without any additional flags.

int tfd;
...
pause_for_inspection("About to open() file");
tfd = open("./target_file", O_RDONLY);
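For contrast, it may help to see what a read-only descriptor normally allows. The following sketch is our own illustration (the helper errno_of_write_on_rdonly_fd is hypothetical and not part of the PoC). It opens a file with O_RDONLY, just like the PoC does, and then attempts a regular write() through that descriptor. On Linux this is expected to fail with EBADF, which is exactly the protection that Dirty Pipe ends up sidestepping.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: open `path` read-only and try to write one byte
 * through that descriptor. Returns the errno of the failed write,
 * 0 if the write unexpectedly succeeded, or -1 if open() itself failed.
 */
int errno_of_write_on_rdonly_fd(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    errno = 0;
    ssize_t n = write(fd, "X", 1);   /* should be rejected by the kernel */
    int err = (n < 0) ? errno : 0;

    close(fd);
    return err;
}
```

For an existing, readable file, the return value should equal EBADF.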

Table of Contents

Intro

Our Goal / Disclaimer

Page Cache

Pipes (general)

Pipes (initialization)

Overview

Code

Debugger

Pipes (reading/writing)

Writing

Reading

Summary

Pipes (splicing)

The splice System Call (user land)

The splice System Call (Implementation)

Debugger

What’s the Actual Problem?

Limitations

Approaches to Understand the Bug

Top Down vs. Bottom Up vs. Hybrid

Linux Kernel Source

Conclusion
