codeblog code is freedom — patching my itch

November 14, 2019

security things in Linux v5.3

Filed under: Blogging,Chrome OS,Debian,Kernel,Security,Ubuntu,Ubuntu-Server — kees @ 5:36 pm

Previously: v5.2.

Linux kernel v5.3 was released! I let this blog post get away from me, but it’s up now! :) Here are some security-related things I found interesting:

heap variable initialization
In the continuing work to remove “uninitialized” variables from the kernel, Alexander Potapenko added new “init_on_alloc” and “init_on_free” boot parameters (with associated Kconfig defaults) to perform zeroing of heap memory either at allocation time (i.e. all kmalloc()s effectively become kzalloc()s), at free time (i.e. all kfree()s effectively become kzfree()s), or both. The performance impact of the former under most workloads appears to be under 1%, if it’s measurable at all. The “init_on_free” option, however, is more costly but adds the benefit of reducing the lifetime of heap contents after they have been freed (which might be useful for frustrating some use-after-free attacks or side-channel attacks). Everyone should enable CONFIG_INIT_ON_ALLOC_DEFAULT_ON=y (or boot with “init_on_alloc=1”), and the more paranoid system builders should add CONFIG_INIT_ON_FREE_DEFAULT_ON=y (or “init_on_free=1” at boot). As workloads are found that cause performance concerns, tweaks to the initialization coverage can be added.
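As a rough userspace analogy for what the two options change (this is not the kernel's implementation), the sketch below wraps malloc() and free() with a pair of flags. The names demo_alloc(), demo_free(), demo_init_on_alloc and demo_init_on_free are hypothetical and exist only for illustration.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the init_on_alloc= / init_on_free= settings. */
static int demo_init_on_alloc = 1;
static int demo_init_on_free = 0;

/* Allocate size bytes; zero them when demo_init_on_alloc is set. */
void *demo_alloc(size_t size)
{
	void *p = malloc(size);

	if (p && demo_init_on_alloc)
		memset(p, 0, size);	/* like kmalloc() acting as kzalloc() */
	return p;
}

/* Free a buffer; wipe its contents first when demo_init_on_free is set. */
void demo_free(void *p, size_t size)
{
	if (p && demo_init_on_free) {
		/* Volatile stores so the compiler cannot drop the wipe. */
		volatile unsigned char *b = p;
		size_t i;

		for (i = 0; i < size; i++)
			b[i] = 0;
	}
	free(p);
}
```

With the allocation flag set, a caller never sees stale data from a previous user of the memory. With the free flag set, the old contents stop existing as soon as the buffer is released, at the cost of an extra pass over the memory on every free.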

pidfd_open() added
Christian Brauner has continued his pidfd work by creating the next needed syscall: pidfd_open(), which takes a pid and returns a pidfd. This is useful for cases where process creation isn’t yet using CLONE_PIDFD, and where /proc may not be mounted.
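A minimal sketch of calling the new syscall from userspace, assuming a libc without a dedicated wrapper so that syscall() is used directly. The wrapper name demo_pidfd_open() is hypothetical, and the fallback syscall number 434 is an assumption that holds on most architectures but should be checked against your kernel headers.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Older libc headers may not define the number (assumed 434 here). */
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434
#endif

/*
 * Thin wrapper around the raw syscall.
 * Returns a pidfd (>= 0) on success, or -1 with errno set.
 */
int demo_pidfd_open(pid_t pid)
{
	/* The second argument is the flags word; 0 means no flags. */
	return (int)syscall(SYS_pidfd_open, pid, 0U);
}
```

A caller would do something like `int fd = demo_pidfd_open(target_pid);`, check for -1 (for example ENOSYS on kernels older than v5.3), and close() the descriptor when done.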

-Wimplicit-fallthrough enabled globally
Gustavo A.R. Silva landed the last handful of implicit fallthrough fixes left in the kernel, which allows for -Wimplicit-fallthrough to be globally enabled for all kernel builds. This will keep any new instances of this bad code pattern from entering the kernel again. With several hundred implicit fallthroughs identified and fixed, something like 1 in 10 were missing breaks, which is way higher than I was expecting, making this work even more well justified.
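To illustrate the pattern the warning polices, here is a small sketch of a switch with one intentional fallthrough that is marked explicitly. The DEMO_* names and demo_permission_bits() are hypothetical. The attribute form is used here for compiler portability; kernel sources of this era typically marked such cases with a “fall through” comment instead.

```c
/* Use the compiler attribute where available; otherwise expand to a no-op. */
#if defined(__has_attribute)
# if __has_attribute(fallthrough)
#  define DEMO_FALLTHROUGH __attribute__((fallthrough))
# endif
#endif
#ifndef DEMO_FALLTHROUGH
# define DEMO_FALLTHROUGH do { } while (0)
#endif

#define DEMO_READ  0x1
#define DEMO_WRITE 0x2

/* Hypothetical helper: level 2 grants write and read, level 1 read only. */
int demo_permission_bits(int level)
{
	int bits = 0;

	switch (level) {
	case 2:
		bits |= DEMO_WRITE;
		DEMO_FALLTHROUGH;	/* intentional: level 2 also gets read */
	case 1:
		bits |= DEMO_READ;
		break;
	default:
		break;
	}
	return bits;
}
```

Deleting the DEMO_FALLTHROUGH line leaves the behavior unchanged but makes a build with -Wimplicit-fallthrough warn, which is exactly how an accidentally missing break gets noticed.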

x86 CR4 & CR0 pinning
In recent exploits, one of the steps for making the attacker’s life easier is to disable CPU protections like Supervisor Mode Access (and Execute) Prevention (SMAP and SMEP) by finding a way to write to CPU control registers to disable these features. For example, CR4 controls SMAP and SMEP, where disabling those would let an attacker access and execute userspace memory from kernel code again, opening up the attack to much greater flexibility. CR0 controls Write Protect (WP), which when disabled would allow an attacker to write to read-only memory like the kernel code itself. Attacks have been using the kernel’s CR4 and CR0 writing functions to make these changes (since it’s easier to gain that level of execute control), but now the kernel will attempt to “pin” sensitive bits in CR4 and CR0 to avoid them getting disabled. This forces attacks to do more work to enact such register changes going forward. (I’d like to see KVM enforce this too, which would actually protect guest kernels from all attempts to change protected register bits.)
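The pinning idea can be sketched in userspace as follows. This is a simplified model, not the kernel's code: a plain variable stands in for the real register, all demo_* names are hypothetical, and only the SMEP and SMAP bit positions (CR4 bits 20 and 21) come from the x86 architecture.

```c
#include <stdio.h>

/* CR4 feature bits as defined by the x86 architecture. */
#define DEMO_CR4_SMEP (1UL << 20)
#define DEMO_CR4_SMAP (1UL << 21)

static unsigned long demo_cr4;			/* stand-in for the register */
static unsigned long demo_cr4_pinned_bits;	/* bits that must stay set */

/* Pin whichever of the requested bits are currently enabled. */
void demo_pin_cr4(unsigned long mask)
{
	demo_cr4_pinned_bits = demo_cr4 & mask;
}

/*
 * Write the register, then restore any pinned bits the write cleared.
 * Returns the bits that had to be restored (0 if none went missing).
 */
unsigned long demo_write_cr4(unsigned long val)
{
	unsigned long missing;

	demo_cr4 = val;				/* the raw write */
	missing = ~val & demo_cr4_pinned_bits;
	if (missing) {
		demo_cr4 = val | missing;	/* write again, bits restored */
		fprintf(stderr, "pinned CR4 bits went missing: %#lx\n",
			missing);
	}
	return missing;
}
```

In this model the check runs after the write rather than before it, so code that reaches the write without going through the function entry still runs into the check afterward. An attacker who can execute an arbitrary register write elsewhere is not stopped, which is why this only raises the amount of work needed.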

additional kfree() sanity checking
In order to avoid corrupted pointers doing crazy things when they’re freed (as seen in recent exploits), I added additional sanity checks to verify kmem cache membership and to make sure that objects actually belong to the kernel slab heap. As a reminder, everyone should be building with CONFIG_SLAB_FREELIST_HARDENED=y.

KASLR enabled by default on arm64
Just as Kernel Address Space Layout Randomization (KASLR) was enabled by default on x86, now KASLR has been enabled by default on arm64 too. It’s worth noting, though, that in order to benefit from this setting, the bootloader used for such arm64 systems needs to either support the UEFI RNG function or provide entropy via the “/chosen/kaslr-seed” Device Tree property.

hardware security embargo documentation
As there continues to be a long tail of hardware flaws that need to be reported to the Linux kernel community under embargo, a well-defined process has been documented. This will let vendors unfamiliar with how to handle things follow the established best practices for interacting with the Linux kernel community in a way that lets mitigations get developed before embargoes are lifted. The latest (and HTML rendered) version of this process should always be available here.

Those are the things I had on my radar. Please let me know if there are other things I should add! Linux v5.4 is almost here…

© 2019 – 2020, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 License.



    Comment by Morfik — November 15, 2019 @ 2:00 am

  2. Is there a list of recommended CONFIG* settings for security somewhere on your blog, so that I can verify them and recompile my kernel? E.g. on the Ubuntu bug tracker you recommended setting CONFIG_PAGE_POISONING* to 1, while Arch (which I use) doesn’t enable them.

    Comment by Ren — November 20, 2019 @ 4:00 am

  3. Whoops! Thanks (to many people) for catching the CONFIG_SLAB_FREELIST_HARDENED typo! I’ve fixed it now. :)

    Comment by kees — November 21, 2019 @ 7:45 am

  4. For recommended CONFIGs, see here:

    Comment by kees — November 21, 2019 @ 7:48 am

  5. No mention to previous work implementing exactly the same such as (or the PaX project).
    Part of it was also formally copyrighted in 2015.

    How long is this kind of plagiarism going to go for? When is due credit going to happen? If not for moral reasons, for the sake of avoiding lawsuits.
    I find it heart warming that some 10 years late my work, or the work of others, finally makes it to the kernel mainline, but at the same time, I find it puerile and despicable that people’s work and contributions are not being lawfully credited.

    Comment by Larry H — February 16, 2020 @ 8:56 am

  6. I’m sorry you consider Alexander’s work to be plagiarism; I think it is clearly not. I agree it’s a similar idea, and even that idea already existed in the kernel (e.g. CONFIG_PAGE_POISONING), but this is more usable and complete (e.g. it covers more corner cases). Regardless, there was mention of PAX_MEMORY_SANITIZE (though that is only a partially overlapping feature: the “init_on_free=1” portion is very similar to an untuned version of PAX_MEMORY_SANITIZE) in and around the discussion of these patches (e.g.

    I’m grateful to you and PaX Team for trying to push these mitigations in earlier years — as you can see, it’s much more a social issue with the kernel community than strictly a technical one. Without the earlier efforts, landing this now would have been even harder than it already was.

    Comment by kees — February 18, 2020 @ 2:11 pm
