I’m excited to see other people thinking about userspace-to-kernel attack surface reduction ideas. Theo de Raadt recently published slides describing Pledge. This uses the same ideas that seccomp implements, but with less granularity. While seccomp works at the individual syscall level and in addition to killing processes, it allows for signaling, tracing, and errno spoofing. As de Raadt mentions, Pledge could be implemented with seccomp very easily: libseccomp would just categorize syscalls.
I don’t really understand the presentation’s mention of “Optional Security”, though. Pledge, like seccomp, is an opt-in feature. Nothing in the kernel refuses to run “unpledged” programs. I assume his point was that when it gets ubiquitously built into programs (like stack protector), it’s effectively not optional (which is alluded to later as “comprehensive applicability ~= mandatory mitigation”). Regardless, this sensible (though optional) design gets me back to his slide on seccomp, which seems to have a number of misunderstandings:
- A Turing complete eBPF program watches your program Strictly speaking, seccomp is implemented using a subset of BPF, not eBPF. And since BPF (and eBPF) programs are guaranteed to halt, it makes seccomp filters not Turing complete.
- Who watches the watcher? I don’t even understand this. It’s in the kernel. The kernel watches your program. Just like always. If this is a question of BPF program verification, there is literally a program verifier that checks various properties of the BPF program.
- seccomp program is stored elsewhere This, with the next statement, is just totally misunderstood. Programs using seccomp define their program in their own code. It’s used the same way as the Pledge examples are shown doing.
- Easy to get desyncronized either program is updated As above, this just isn’t the case. The only place where this might be true is when using seccomp on programs that were not written natively with seccomp. In that case, yes, desync is possible. But that’s one of the advantages of seccomp’s design: a program launcher (like minijail or systemd) can declare a seccomp filter for a program that hasn’t yet been ported to use one natively.
- eBPF watcher has no real idea what the program under observation is doing… I don’t understand this statement. I don’t see how Pledge would “have a real idea” either: they’re both doing filtering. If we get AI out of our syscall filters, we’re in serious trouble. :)
OpenBSD has some interesting advantages in the syscall filtering department, especially around sockets. Right now, it’s hard for Linux syscall filtering to understand why a given socket is being used. Something like SOCK_DNS seems like it could be quite handy.
Another nice feature of Pledge is the path whitelist feature. As it’s still under development, I hope they expand this to include more things than just paths. Argument inspection is a weak point for seccomp, but under Linux, most of the arguments are ultimately exposed to the LSM layer. Last year I experimented with creating a “seccomp LSM” for path matching where programs could declare whitelists, similar to standard LSMs.
So, yes, Linux “could match this API on seccomp”. It’d just take some extensions to libseccomp to implement
pledge(), as I described at the top. With OpenBSD doing a bunch of analysis work on common programs, it’d be excellent to see this usable on Linux too. So far on Linux, only a few programs (e.g. Chrome, vsftpd) have bothered to do this using seccomp, and it could be argued that this is ultimately due to how fine grained it is.
© 2015, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.