codeblog code is freedom — patching my itch

October 19, 2010

CVE-2010-2963 v4l compat exploit

Filed under: Blogging,Debian,Security,Ubuntu,Ubuntu-Server,Vulnerabilities — kees @ 3:41 pm

If you’re running a 64-bit system, and you’ve got users with access to a video device (/dev/video*), then be sure you update your kernels for CVE-2010-2963. While slowly working through an audit of the many uses of copy_from_user() in the Linux kernel, I ran into this vulnerability.

Here’s the kernel code from drivers/media/video/v4l2-compat-ioctl32.c:

static int get_microcode32(struct video_code *kp, struct video_code32 __user *up)
{
        if (!access_ok(VERIFY_READ, up, sizeof(struct video_code32)) ||
                copy_from_user(kp->loadwhat, up->loadwhat, sizeof(up->loadwhat)) ||
                get_user(kp->datasize, &up->datasize) ||
                copy_from_user(kp->data, up->data, up->datasize))
                        return -EFAULT;
        return 0;
}

Note that kp->data is being used as the target for up->data in the final copy_from_user() without actually verifying that kp->data is pointing anywhere safe. Here’s the caller of get_microcode32:

static long do_video_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
{
        union {
                struct video_tuner vt;
                struct video_code vc;
                ...
        } karg;
        void __user *up = compat_ptr(arg);
        ...
        switch (cmd) {
        ...
        case VIDIOCSMICROCODE:
                err = get_microcode32(&karg.vc, up);
        ...

So, the contents of up are totally under the control of the caller, and the contents of karg (in our case, the video_code structure) are not initialized at all. So, it seems like a call for VIDIOCSMICROCODE would copy up->datasize bytes from up->data to whatever address happens to be left over in the uninitialized video_code->data, which is some random kernel address, just causing an Oops, since we don’t control what is on the kernel’s stack.

But wait, who says we can’t control the contents of the kernel’s stack? In fact, this compat function makes it extremely easy. Let’s look back at the union. Notice the struct video_tuner? That gets populated from the caller’s up memory via this case of the switch (cmd) statement:

        case VIDIOCSTUNER:
        case VIDIOCGTUNER:
                err = get_video_tuner32(&karg.vt, up);

So, to control the kernel stack, we just need to call this ioctl twice in a row: once to populate the stack via VIDIOCSTUNER with the contents we want (including the future address for video_code->data, which starts at the same location as video_tuner->name[20]), and then again with VIDIOCSMICROCODE.

Tricks involved here are: the definition of the VIDIOCSMICROCODE case in the kernel is wrong, and calling the ioctls without any preparation can trigger other kernel work (memory faults, etc) that may destroy the stack contents. First, we need the real value for the desired case statement. This turns out to be 0x4020761b. Next, we just repeatedly call the setup ioctl in an attempt to get incidental kernel work out of the way so that our last ioctl doing the stack preparation will stick, and then we call the buggy ioctl to trigger the vulnerability.

Since the ioctl already does a multi-byte copy, we can now copy arbitrary lengths of bytes into kernel memory. One method of turning an arbitrary kernel memory write into a privilege escalation is to overwrite a kernel function pointer, and trigger that function. Based on the exploit for CVE-2010-3081, I opted to overwrite the security_ops function pointer table. Their use of msg_queue_msgctl wasn’t very good for the general case since it’s near the end of the table and its offset would depend on kernel versions. Initially I opted for getcap, but in the end used ptrace_traceme, both of which are very near the top of the security_ops structure. (Though I need to share credit here with Dan Rosenberg as we were working together on improving the reliability of the security_ops overwrite method. He used the same approach for his excellent RDS exploit.)

Here are the steps for one way of taking an arbitrary kernel memory write and turning it into a root escalation:

  • overwrite security_ops with default_security_ops, which will revert the LSM back to the capabilities-only security operations. This, however, means we can calculate where cap_ptrace_traceme is.
  • overwrite default_security_ops->ptrace_traceme to point to our supplied function that will actually perform the privilege escalation (thanks to Brad Spengler for his code from Enlightenment).
  • trigger the function (in this case, call ptrace(PTRACE_TRACEME, 0, NULL, NULL)).
  • restore default_security_ops->ptrace_traceme to point to cap_ptrace_traceme so the next caller doesn’t Oops the system (since userspace memory will be remapped).

Here’s the source for Vyakarana as seen running in Enlightenment using cap_getcap (which is pretty unstable, so you might want to switch it to use ptrace_traceme), and as a stand-alone memory writer.

Conclusions: Keep auditing the kernel for more arbitrary writes; I think there are still many left. Reduce the exploitation surface within the kernel itself (which PaX and grsecurity have been doing for a while now), specifically:

  • Block userspace memory access while in kernel mode. This would stop the ability to make the kernel start executing functions that live in userspace — a clear privilege violation. This protection would stop the current exploit above, but the exploit could be adjusted to use kernel memory instead.
  • Keep function pointers read-only. There is no reason for these function pointer tables (fops, IDT, security_ops, etc) to be writable. These should all be marked correctly, with inline code exceptions being made for updating the global pointers to those tables, leaving the pointer read-only after it gets set. This would stop this particular exploit above, but there are still plenty more targets.
  • Randomize the kernel stack location on a per-syscall basis. This will stop exploits that depend on a stable kernel stack location (as this exploit does).

© 2010, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 License.


  1. Hi,

    I’ve seen mention of this elsewhere.

    It would be nice if you could elaborate on how, exactly, to fix these vulnerabilities. I have a couple in mind which I’m responsible for, but I’m unsure exactly how to fix them — and be sure I’ve done it right.

    The other mention I saw of this also failed to explain how to fix it, other than mentioning access_ok() should be used. I could take a stab at it, but I’d like to know I was doing it right.

    Comment by SteveC — October 19, 2010 @ 4:28 pm

  2. In general, it’s best to just verify every single use of copy_to_user(), copy_from_user(), get_user(), and put_user(), etc (or any memory copying for that matter). Make sure you know where you’re reading and writing from, and how large those accesses are. (For example, are you sure a length can’t be negative, or larger than you’re expecting?) Calling access_ok() is already done inside the copy_*_user() functions, so it usually isn’t needed. You’ll note that in this particular case, it was the kernel destination that was unchecked. Everything was fine about the userspace reads — nothing was out of bounds. But nothing actually defined where or how large the kernel destination buffer was.

    Comment by kees — October 19, 2010 @ 4:38 pm

  3. Thanks for this detailed post!

    Comment by JohnTaylor — October 19, 2010 @ 11:49 pm

  4. Actually, it turns out I was thinking of a different issue now that I’m looking at it again, CVE-2010-3081,

    also related to the compat stuff, but to compat_alloc_user_space().

    Comment by SteveC — October 20, 2010 @ 5:14 am

  5. That should trigger a sparse warning shouldn’t it? If someone dereferences a __user pointer?

    I don’t have a 64 bit computer and so I don’t compile that file. But someone should have caught that.

    Comment by Dan Carpenter — October 23, 2010 @ 11:08 am

  6. @Dan which part should have? Everything was syntactically correct except for the part where ->data wasn’t initialized.

    Comment by kees — October 23, 2010 @ 1:56 pm

  7. up is declared with the __user attribute.

    So we’re not allowed to dereference it when we do the “up->datasize”. Calling access_ok() doesn’t mean we can dereference it, it just means that we use __get_user() instead of get_user(). User memory can be in a different address space or it can be swapped out.

    Btw up->loadwhat is an array so that doesn’t actually dereference “up”.

    Of course, the datasize isn’t capped either, as you point out. It seems like someone could write a Smatch script to detect these places automatically. Stuff like:
    get_user(size, &user_ptr);
    <-- no cap on size here.
    copy_from_user(dest, src, size);
    I’ll poke at this on Monday.

    Comment by Dan Carpenter — October 23, 2010 @ 8:57 pm

  8. Actually, I just read more thoroughly and that’s not what you were pointing out at all. :P Ha ha. It’s amazing this crap works at all.

    Comment by Dan Carpenter — October 23, 2010 @ 10:00 pm

  9. “Next, we just repeatedly call the setup ioctl in an attempt to get incidental kernel work out of the way so that our last ioctl doing the stack preparation will stick, and then we call the buggy ioctl to trigger the vulnerability.”

    Can you give me an example of when the stack preparation will not stick?
    I wrote my own exploit for this vulnerability, and I call the setup ioctl only once and it always works.

    Comment by madara — August 7, 2011 @ 7:03 am

  10. @madara I didn’t try it with just 1 stack prep, but I figured a bunch wouldn’t hurt. :)

    Comment by kees — August 8, 2011 @ 10:32 am
