Stopping Kill Signals Against eBPF Programs


Most eBPF agents run as daemons that only authorized actors are allowed to shut down. But if a malicious process gains elevated privileges, it can shut down our eBPF agent as well. Most security tools would have to stop here, but eBPF does not: we can make our agents even more resilient.

If we want to do this, we need a way to stop malicious processes from killing our agents, no matter their access to the system.


That seems complicated, right? How can we stop any process from killing our agent, no matter their system privileges? With eBPF, we can hook into security_task_kill and stop shutdown signals from reaching our agent.

An lsm/task_kill hook can veto a signal before it is delivered to its target, so with a hook like this one you can keep your own process from being killed, no matter how privileged the sender is.

SEC("lsm/task_kill")
int BPF_PROG(lsm_task_kill, struct task_struct *p, struct kernel_siginfo *info,
             int sig, const struct cred *cred)
{
    u32 pid = BPF_CORE_READ(p, pid);

    /* Deny any signal aimed at the agent's PID. */
    if (pid == my_userspace_agent_pid) {
        return -EPERM;
    }
    return 0;
}

OK, great. The agent is now impossible to kill, and the job is done.

But then your boss adds a new requirement to your eBPF agent, so you build and release a new version. Now the question is: how do we upgrade the eBPF agent that is currently running?

If we can’t kill the process, we can’t shut it down and start an upgraded version. At this point, nothing can stop it. Your agent used to be useful, but now it’s like a wart that won’t go away.

In the end, to solve this issue, you restart every single machine in your infra, angering your boss (he was never happy, but now he really is pissed) and messing up everyone’s day (or week). Then you have to remove the shutdown security from your eBPF agent, leaving it wide open for any bad actor to kill it.

How did things go so wrong, so fast?

The problem is that the agent needs a way to tell whether a good guy or a bad guy is killing it. But according to our requirements, we can’t even trust root (sudo). What do we do?

We could send a signed message to the eBPF program’s user space agent. If an authorized user signs it, the eBPF program kills itself. However, even this has a problem.

We may be able to keep our key secret, but can we keep that signed message secret? Someone could perform a replay attack and sabotage our system.

If we shut down the agent once and somebody captured that signed message, they could send it again whenever they like.

To solve this issue, we could use a nonce. When an authorized (or unauthorized) user requests to kill the eBPF agent, the agent sends back a random string to be signed. The user must send this signed nonce back to the agent, which, if valid, will shut down the program. This requires the same authorized user to re-sign a new message for each shutdown, ensuring it is not an attacker with an old, used message.

REGULAR SIGNATURE-BASED SHUTDOWN SECURITY FLOW

Actor (stop.sh)                                Agent (shutdown server)
  │                                                       │
  │ Sign fixed message with private key                   │
  │ signature = RSA-SHA256("shutdown", private_key)       │
  │                                                       │
  │ 1. POST /stop                                         │
  │    {"signature": "base64..."}                         │
  │──────────────────────────────────────────────────────▶│
  │                                                       │ Verify signature with public
  │                                                       │ key matches fixed message
  │                                                       │ ✓ Valid signature
  │                                                       │
  │ 2. Response: {"status": "shutting down"}              │
  │◀──────────────────────────────────────────────────────│
  │                                                       │ Initiate graceful shutdown
  │                                                       ▼
                                                        [DEAD]

NONCE-BASED SHUTDOWN SECURITY FLOW

Actor (stop.sh)                                Agent (shutdown server)
  │                                                       │
  │ 1. POST /request                                      │
  │──────────────────────────────────────────────────────▶│
  │                                                       │ Generate random nonce
  │                                                       │ Store nonce temporarily
  │                                                       │ (Expire after 5 minutes)
  │                                                       │
  │ 2. Response: {"nonce": "abc123..."}                   │
  │◀──────────────────────────────────────────────────────│
  │                                                       │
  │ Sign nonce with private key                           │
  │ signature = RSA-SHA256(nonce, private_key)            │
  │                                                       │
  │ 3. POST /stop                                         │
  │    {"nonce": "abc123...",                             │
  │     "signature": "base64..."}                         │
  │──────────────────────────────────────────────────────▶│
  │                                                       │ Verify nonce exists & unused
  │                                                       │ Verify signature with public
  │                                                       │ key matches nonce
  │                                                       │ ✓ Valid signature
  │                                                       │
  │ 4. Response: {"status": "shutting down"}              │
  │◀──────────────────────────────────────────────────────│
  │                                                       │ Initiate graceful shutdown
  │                                                       ▼
                                                        [DEAD]
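
The agent-side nonce bookkeeping in the nonce-based flow could be sketched in C roughly like this. It is a minimal sketch under stated assumptions, not the implementation behind this post: the names nonce_issue and nonce_consume, the 16-byte nonce size, and the fixed-size table are all hypothetical, and the RSA-SHA256 signature check is left out entirely (the caller would still have to verify the signature over the nonce before shutting down). The current time is passed in as a parameter so that expiry is easy to exercise.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

#define NONCE_BYTES 16                  /* assumed size: 128 random bits */
#define NONCE_HEX_LEN (NONCE_BYTES * 2)
#define NONCE_TTL_SECONDS (5 * 60)      /* the 5 minute expiry from the diagram */
#define MAX_PENDING 8                   /* hypothetical cap on open requests */

struct pending_nonce {
    char hex[NONCE_HEX_LEN + 1];
    time_t issued_at;
    bool in_use;
};

static struct pending_nonce pending[MAX_PENDING];

/* Fill buf with bytes from the kernel CSPRNG; false on failure. */
static bool random_bytes(unsigned char *buf, size_t len)
{
    FILE *f = fopen("/dev/urandom", "rb");
    if (!f)
        return false;
    size_t got = fread(buf, 1, len, f);
    fclose(f);
    return got == len;
}

/* Steps 1-2: create a nonce, remember it, return it to the caller as hex. */
bool nonce_issue(char out_hex[NONCE_HEX_LEN + 1], time_t now)
{
    unsigned char raw[NONCE_BYTES];
    if (!random_bytes(raw, sizeof raw))
        return false;

    for (int i = 0; i < MAX_PENDING; i++) {
        struct pending_nonce *slot = &pending[i];
        bool expired = slot->in_use &&
                       now - slot->issued_at > NONCE_TTL_SECONDS;
        if (slot->in_use && !expired)
            continue;                   /* slot still holds a live nonce */
        for (int b = 0; b < NONCE_BYTES; b++)
            snprintf(slot->hex + b * 2, 3, "%02x", (unsigned)raw[b]);
        slot->issued_at = now;
        slot->in_use = true;
        memcpy(out_hex, slot->hex, NONCE_HEX_LEN + 1);
        return true;
    }
    return false;                       /* too many outstanding requests */
}

/* Step 3: accept a nonce at most once, and only before it expires. */
bool nonce_consume(const char *hex, time_t now)
{
    if (strlen(hex) != NONCE_HEX_LEN)
        return false;
    for (int i = 0; i < MAX_PENDING; i++) {
        struct pending_nonce *slot = &pending[i];
        if (!slot->in_use || strcmp(slot->hex, hex) != 0)
            continue;
        slot->in_use = false;           /* burn it whether or not it expired */
        return now - slot->issued_at <= NONCE_TTL_SECONDS;
    }
    return false;                       /* unknown or already used */
}
```

A shutdown server built on this would call nonce_issue while handling POST /request and nonce_consume while handling POST /stop. Because the nonce is burned the first time it is seen, a captured /stop request cannot be replayed later, even though its signature is still valid.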

Keys are more secure since we can keep them on separate, more secure systems. Whenever the agent needs to be shut down, it could send a request to another machine, where someone manually confirms that the shutdown is wanted and then signs the message. The key could even be a hardware key. Using a key lets us move our security to a more secure system (the most secure being hardware keys).

Well, you don’t exactly have to do this, but then again, somebody could just shut down your program, neutralizing your whole eBPF agent, without even finding a loophole in its logic.

Now, this doesn’t mean that you need to do this the moment you start writing and deploying eBPF agents, but writing a security tool with eBPF without some sort of shutdown security kind of defeats the purpose of kernel-level security.

If you want your eBPF agent to be truly secure, securing against shutdowns is not enough. Attackers can manipulate your eBPF maps, disabling policies and destroying logs. If you want to be secure, you should read this blog post: https://bomfather.dev/blog/attacking_and_securing_ebpf_maps/.
