codeblog code is freedom — patching my itch


security things in Linux v4.3

Filed under: Chrome OS,Debian,Kernel,Security,Ubuntu,Ubuntu-Server — kees @ 2:54 pm

When I gave my State of the Kernel Self-Protection Project presentation at the 2016 Linux Security Summit, I included some slides covering some quick bullet points on things I found of interest in recent Linux kernel releases. Since there wasn’t a lot of time to talk about them all, I figured I’d make some short blog posts here about the stuff I was paying attention to, along with links to more information. This certainly isn’t everything security-related or generally of interest, but they’re the things I thought needed to be pointed out. If there’s something security-related you think I should cover from v4.3, please mention it in the comments. I’m sure I haven’t caught everything. :)

A note on timing and context: the momentum for starting the Kernel Self Protection Project got rolling well before it was officially announced on November 5th last year. To that end, I included stuff from v4.3 (which was developed in the months leading up to November) under the umbrella of the project, since the goals of KSPP aren’t unique to the project nor must the goals be met by people that are explicitly participating in it. Additionally, not everything I think worth mentioning here technically falls under the “kernel self-protection” ideal anyway — some things are just really interesting userspace-facing features.

So, to that end, here are things I found interesting in v4.3:


Russell King implemented this feature for ARM which provides emulated segregation of user-space memory when running in kernel mode, by using the ARM Domain access control feature. This is similar to a combination of Privileged eXecute Never (PXN, in later ARMv7 CPUs) and Privileged Access Never (PAN, coming in future ARMv8.1 CPUs): the kernel cannot execute user-space memory, and cannot read/write user-space memory unless it was explicitly prepared to do so. This stops a huge set of common kernel exploitation methods, where either a malicious executable payload has been built in user-space memory and the kernel was redirected to run it, or where malicious data structures have been built in user-space memory and the kernel was tricked into dereferencing the memory, ultimately leading to a redirection of execution flow.

This raises the bar for attackers since they can no longer trivially build code or structures in user-space where they control the memory layout, locations, etc. Instead, an attacker must find areas in kernel memory that are writable (and in the case of code, executable), where they can discover the location as well. For an attacker, there are vastly fewer places where this is possible in kernel memory as opposed to user-space memory. And as we continue to reduce the attack surface of the kernel, these opportunities will continue to shrink.

While hardware support for this kind of segregation exists in s390 (natively separate memory spaces), ARM (PXN and PAN as mentioned above), and very recent x86 (SMEP since Ivy-Bridge, SMAP since Skylake), ARM is the first upstream architecture to provide this emulation for existing hardware. Everyone running ARMv7 CPUs with this kernel feature enabled suddenly gains the protection. Similar emulation protections (PAX_MEMORY_UDEREF) have been available in PaX/Grsecurity for a while, and I’m delighted to see a form of this land in upstream finally.

To test this kernel protection, the ACCESS_USERSPACE and EXEC_USERSPACE triggers for lkdtm have existed since Linux v3.13, when they were introduced in anticipation of the x86 SMEP and SMAP features.

Ambient Capabilities

Andy Lutomirski (with Christoph Lameter and Serge Hallyn) implemented a way for processes to pass capabilities across exec() in a sensible manner. Until Ambient Capabilities, any capabilities available to a process would only be passed to a child process if the new executable was correctly marked with filesystem capability bits. This turns out to be a real headache for anyone trying to build an even marginally complex “least privilege” execution environment. The case that Chrome OS ran into was having a network service daemon responsible for calling out to helper tools that would perform various networking operations. Keeping the daemon not running as root and retaining the needed capabilities in children required conflicting or crazy filesystem capabilities organized across all the binaries in the expected tree of privileged processes. (For example you may need to set filesystem capabilities on bash!) By being able to explicitly pass capabilities at runtime (instead of based on filesystem markings), this becomes much easier.

For more details, the commit message is well-written, almost twice as long as than the code changes, and contains a test case. If that isn’t enough, there is a self-test available in tools/testing/selftests/capabilities/ too.

PowerPC and Tile support for seccomp filter

Michael Ellerman added support for seccomp to PowerPC, and Chris Metcalf added support to Tile. As the seccomp maintainer, I get excited when an architecture adds support, so here we are with two. Also included were updates to the seccomp self-tests (in tools/testing/selftests/seccomp), to help make sure everything continues working correctly.

That’s it for v4.3. If I missed stuff you found interesting, please let me know! I’m going to try to get more per-version posts out in time to catch up to v4.8, which appears to be tentatively scheduled for release this coming weekend. Next: v4.4.

© 2016, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.
Creative Commons License


  1. Great write-up! I’m trying to write a linux kernel mitigation checklist and keep everything about KSPP in it:

    v4.9 is going to be a milestone if PAN emulation for armv8.0 and vmalloc stack can be merged. I’m happy to see attacker feels pain in their ass when they face the wrath of PaX/Grsecurity stuff. KSPP is making those important features into UPSTREAM, which is awesome. Thanks for the contribution for the FLOSS community!

    Comment by Shawn C — 9/26/2016 @ 9:15 pm

  2. Hi, small nitpick: SMAP is actually available since Broadwell, not Skylake

    Comment by Corsac — 9/27/2016 @ 10:43 pm

  3. It was very late Broadwell, true. Unfortunately, none of the Broadwell-generation Xeons were late enough to gain SMAP, so as it stands, there’s still no server-based SMAP; it’s just in laptops mostly.

    Comment by kees — 10/5/2016 @ 10:26 pm

Leave a Reply

Your email address will not be published. Required fields are marked *

Powered by WordPress