Vulnerability

Passive System Fingerprinting using Network Client Applications

Affected

munices

Description

Jose Nazario posted the following white paper. Here is his low-tech approach to passive network analysis.

Passive target fingerprinting involves the utilization of network traffic between two hosts by a third system to identify the types of systems being used. Because no data is sent to either system by the monitoring party, detection approaches the impossible. Methods which rely solely on the IP options present in normal traffic are limited in how accurately they identify the targets, and further inspection is needed to determine avenues of vulnerability. We describe a method to rapidly identify target operating systems and versions, as well as vectors of attack, based on data sent by client applications. While simplistic, it is robust, and its accuracy is quite high in most cases. Four methods of fingerprinting a system are presented, with sample data provided.

Passive OS mapping has become a new area of research in both white hat and black hat arenas. For the white hat, it is a new method to map their network and monitor traffic for security. For example, a new and possibly subversive host can be identified quickly, often with great accuracy. For the black hat, it provides a nearly undetectable way to map a network and find vulnerable hosts.

To be sure, passive mapping can be a time consuming process. Even with automated tools like Siphon, a sufficient quantity of packets must arrive to build up a statistically significant reading of the subjects' operating systems. Active OS fingerprinting with tools like nmap and queso, by comparison, usually completes in under a minute, so only the more determined attackers, or the curious, will be attracted to the passive method.
Siphon, nmap and queso are available from:

    http://www.subterrain.net/projects/siphon/
    http://www.insecure.org/nmap/
    http://www.apostols.org/

Two major methods of operating system fingerprinting exist in varying degrees of use: active and passive. In active scanning, the scanner sends IP packets to the host and monitors the replies to guess the operating system. Passive scanning, in contrast, allows the scanning party to obtain information without sending any packets from the listening system to the targets. Each method has its advantages and its limitations.

Active Scanning
===============

By now nearly everyone is familiar with active scanning methods. The premier port scanning tool, nmap, has been equipped for some time now with accurate active scanning measures. This code is based on an earlier tool, queso, from the group The Apostols. Nmap's author, Fyodor, has written an excellent paper on this topic in the e-zine Phrack (issue 54 article 9).

Ofir Arkin has been using ICMP bit handling to differentiate between certain types of operating systems. Because ICMP usually slips below the threshold of analysis, and most of the ICMP messages used are legitimate, this kind of scanning can be more difficult to detect than, say, queso or nmap fingerprinting.

The problems with active scanning are mainly twofold: first, the packets used to fingerprint a system can readily be firewalled, obfuscating the information; secondly, the scan can be detected quite easily. Because of this, it is less attractive for a truly stealthy adversary.

Passive Scanning
================

In a message dated June 30, 1999, Photon posted to the nmap-hackers list with some ideas on passive operating system fingerprinting (this note is available from the MARC archives of the nmap-hackers list). He set up a webpage with some of his thoughts, which has since been taken down.
In short, by using default IP packet construction behavior, including default TTL values, the presence of the DF bit, and the like, one can identify a system's OS with some confidence. These ideas were quickly picked up by others and several lines of research have been active since then. Lance Spitzner's paper on passive fingerprinting, dated May 24 2000, included much of the data needed to build such a tool:

    http://www.enteract.com/~lspitz/pubs.html

In fact, two tools quickly appeared, one from Craig Smith and another called p0f from Michal Zalewski:

    http://www.enteract.com/~lspitz/passfing.tar.gz
    http://lcamtuf.hack.pl/p0f.tgz

One very interesting tool that is under active development, extending the earlier work, is Siphon. By utilizing not only IP stack behavior, but also routing information and spanning tree updates, a complete network map can be built over time. Passive port scans also take place, adding to the data. This tool promises to be truly useful for the white hat, and for a patient black hat.

One limitation of these methods, though, is that they only provide a measure of the operating system. Vulnerabilities may or may not exist, and further investigation must be undertaken to find out. While this is sufficient for most white hat purposes (like accounting), it is not sufficient for a would-be attacker. Simply put, more information is needed.

An Alternative Approach
=======================

An alternative to merely fingerprinting the operating system is to perform an identification by using client applications. Quite a number of network clients send revealing information about their host systems, either directly or indirectly, and we use this application level information to map back to the operating system. One very large advantage of the method described here is that in some situations much more accurate information can be gained about the client.
Because of stack similarities, most Windows systems, including 95, 98 and NT 4.0, look too similar to differentiate at the IP level. The client application, however, is willing to reveal this information.

This provides not only a measure of the target's likely operating system, but also a likely vector for entrance. Most of these client applications have numerous security holes at which one can aim malicious data. In some cases, this can provide the key information needed to begin infiltrating a network, and one can proceed more rapidly. In most cases it provides a starting point for the analysis of a network's vulnerabilities.

One major limitation of this method, however, comes when a system is emulating another to provide access to client software. This includes Solaris and SCO's support for Linux binaries. Under these circumstances, the data should be taken with some caution and evaluated in the presence of other information. This limitation is similar to the one that IP stack tweaking places on passive fingerprinting at the IP level, or the effect that such adjustments or firewalling have on active scanning.

Four different types of network clients which provide suitable fingerprinting information are discussed here: email clients, which in most cases leave telltale information on their messages; Usenet clients, which, like mail applications, litter their posts with client system information; web browsers, which send client information with each request; and even the ubiquitous telnet client, which sends such information more quietly, but can just as effectively fingerprint an operating system.

Knowing this, one now only needs to harvest the network for this information and map it to source addresses. Various tools, including sniffers, both generic and specialized, and even web searches will yield this information, so an analysis of many systems can be performed quickly. This works equally well for the white hat and the black hat.
This paper describes a low-tech approach to fingerprinting systems for both their operating system and a likely route to gaining entry. By using application level data sent from them over the network, we can quickly gather accurate data about a system. In some cases, one doesn't even have to be on the same network as the targets: the information can be gathered from afar, compiled, and used at the attacker's discretion at a later date.

Mail Clients
------------

One of the largest types of traffic the network sees is electronic mail. Nearly everyone who uses the Internet on a regular basis uses email. They not only receive mail, but also send a good amount of it. Because it is ubiquitous, it makes an especially attractive avenue for system fingerprinting and ultimately penetration.

Within the headers of nearly every mail message is some form of system identification, either through crafted message identification tags, as used by Eudora and Pine, or through explicit header information, such as the headers generated by Outlook clients or CDE mail clients.

The scope of this method, both in terms of information gained and the potential impact, should not be underestimated. If anything, viruses that spread by email, including ones that are used to steal passwords from systems, should illustrate the effectiveness of this method.

Pine, for example, is one of the worst offenders when it comes to revealing the system it runs on. It gives away a whole host of information useful to an attacker in one fell swoop. To wit:

    Message-ID: <Pine.LNX.4.10.9907191137080.14866-100000@somehost.example.ca>

It is clear it's Pine, we know the version (4.10), and we know the system type. Too much about it, in fact.
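As a rough illustration of how little work this takes, the following Python sketch pulls the system type, Pine version and host name out of a Message-ID of the form shown above. The function name and the regular expression are assumptions made for this example, not part of Pine or of the original paper, and the remaining fields of the ID are simply ignored.

```python
import re

# Assumed layout, inferred from the example in the text:
#   <Pine.PORT.MAJOR.MINOR.other-fields@host>
# PORT is the three-character system type (e.g. LNX, OSF, ULT).
PINE_ID = re.compile(
    r"<Pine\.(?P<port>[A-Za-z0-9-]{3})\.(?P<version>\d+\.\d+)\.[^@>]*@(?P<host>[^>]+)>"
)

def parse_pine_message_id(value):
    """Return port code, Pine version and host from a Pine Message-ID, or None."""
    match = PINE_ID.search(value)
    if match is None:
        return None
    return {
        "port": match.group("port").lower(),   # lower-cased to compare with Pine's port names
        "version": match.group("version"),
        "host": match.group("host"),
    }

if __name__ == "__main__":
    header = "Message-ID: <Pine.LNX.4.10.9907191137080.14866-100000@somehost.example.ca>"
    print(parse_pine_message_id(header))
```

A Message-ID from another client, such as the Eudora one quoted later, does not match the pattern and yields None, so the same function can be run over a whole archive of headers.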
This is a list of the main ports of Pine as of 4.30:

    a41   IBM RS/6000 running AIX 4.1 or 4.2
    a32   IBM RS/6000 running AIX 3.2 or earlier
    aix   IBM S/370 AIX
    aos   AOS for IBM RT (untested)
    mnt   FreeMint
    aux   Macintosh A/UX
    bsd   BSD 4.3
    bs3   BSDi BSD/386 Version 3 and Version 4
    bs2   BSDi BSD/386 Version 2
    bsi   BSDi BSD/386 Version 1
    dpx   Bull DPX/2 B.O.S.
    cvx   Convex
    d54   Data General DG/UX 5.4
    d41   Data General DG/UX 4.11 or earlier
    d-g   Data General DG/UX (even earlier)
    ult   DECstation Ultrix 4.1 or 4.2
    gul   DECstation Ultrix using gcc compiler
    vul   VAX Ultrix
    os4   Digital Unix v4.0
    osf   DEC OSF/1 v2.0 and Digital Unix (OSF/1) 3.n
    sos   DEC OSF/1 v2.0 with SecureWare
    epx   EP/IX System V
    bsf   FreeBSD
    gen   Generic port
    hpx   Hewlett Packard HP-UX 10.x
    hxd   Hewlett Packard HP-UX 10.x with DCE security
    ghp   Hewlett Packard HP-UX 10.x using gcc compiler
    hpp   Hewlett Packard HP-UX 8.x and 9.x
    shp   Hewlett Packard HP-UX 8.x and 9.x with Trusted Computer Base
    gh9   Hewlett Packard HP-UX 8.x and 9.x using gcc compiler
    isc   Interactive Systems Unix
    lnx   Linux using crypt from the C library
    lnp   Linux using Pluggable Authentication Modules (PAM)
    slx   Linux using -lcrypt to get the crypt function
    sl4   Linux using -lshadow to get the crypt() function
    sl5   Linux using shadow passwords, no extra libraries
    lyn   Lynx Real-Time System (Lynxos)
    mct   Tenon MachTen (Mac)
    osx   Macintosh OS X
    neb   NetBSD
    nxt   NeXT 68030's and 68040's Mach 2.0
    bso   OpenBSD with shared-lib
    sc5   SCO Open Server 5.x
    sco   SCO Unix
    pt1   Sequent Dynix/ptx v1.4
    ptx   Sequent Dynix/ptx
    dyn   Sequent Dynix (not ptx)
    sgi   Silicon Graphics Irix
    sg6   Silicon Graphics Irix >= 6.5
    so5   Sun Solaris >= 2.5
    gs5   Sun Solaris >= 2.5 using gcc compiler
    so4   Sun Solaris <= 2.4
    gs4   Sun Solaris <= 2.4 using gcc compiler
    sun   Sun SunOS 4.1
    ssn   Sun SunOS 4.1 with shadow password security
    gsu   SunOS 4.1 using gcc compiler
    s40   Sun SunOS 4.0
    sv4   System V Release 4
    uw2   UnixWare 2.x and 7.x
    wnt   Windows NT 3.51

Pine system types used in Message-ID tags as of Pine 4.30.
This table was gathered from the supported systems listed in the Pine source code documentation, in the file pine4.30/doc/pine-ports, and was edited for brevity.

Hence, with the above message ID, one knows the target's hostname, an account on that machine that reads mail using Pine, and that it's Linux without shadowed passwords (the LNX host type). Hang out on a mailing list, maybe something platform agnostic, and collect targets. In this case, one could use a well known exploit within the mail message, grab the system password file and have it sent back for analysis. This can easily be scaled to as many clients as have been fingerprinted: one mass mailing, then sit back and wait for the password files to come in.

This is not to say that other mail clients are free of such information leaks. Most mail clients give out similar information, either directly or indirectly. Direct information would be an entry in the message headers, such as an X-Mailer tag. Indirect information would be similar to that seen for Pine, a distinctive message ID tag. When this information is coupled to the information about the originating host, a fingerprint can be made rapidly. Some examples:

    User-Agent: Mutt/1.2.4i

    X-Mailer: Microsoft Outlook Express 5.00.3018.1300
    X-MimeOLE: Produced By Microsoft MimeOLE V5.00.3018.1300

    X-Mailer: dtmail 1.2.1 CDE Version 1.2.1 SunOS 5.6 sun4u sparc

    X-Mailer: PMMail 2000 Professional (2.10.2010) For Windows 2000 (5.0.2195)

    X-Mailer: QUALCOMM Windows Eudora Version 4.3.2
    Message-ID: <4.3.2.7.2.20001117142518.043ad100@mailserver3.somewhere.gov>

Not all clients give out their host system or processor (Mutt and Outlook Express do not), but the client information can still be used by itself to build a larger vulnerability assessment. For example, if we know which version strings appear only on Windows, as opposed to a MacOS system, we can determine the processor type.
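Collecting these fields from a saved message is straightforward. The sketch below uses Python's standard email module to pull out the client-identifying headers named above; the list of header names, the function name and the sample message are assumptions chosen for illustration.

```python
from email import message_from_string

# Headers discussed in the text that tend to identify the sending client.
CLIENT_HEADERS = ("User-Agent", "X-Mailer", "X-MimeOLE", "Message-ID")

def client_headers(raw_message):
    """Return the client-identifying headers present in a raw RFC 822 message."""
    msg = message_from_string(raw_message)
    found = {}
    for name in CLIENT_HEADERS:
        value = msg.get(name)          # header lookup is case-insensitive
        if value is not None:
            found[name] = value.strip()
    return found

# Hypothetical sample message built around one of the headers quoted in the text.
SAMPLE = (
    "From: user@example.org\n"
    "X-Mailer: dtmail 1.2.1 CDE Version 1.2.1 SunOS 5.6 sun4u sparc\n"
    "Subject: test\n"
    "\n"
    "body\n"
)

if __name__ == "__main__":
    print(client_headers(SAMPLE))
```

The same function works on messages saved from a mailing list archive or reassembled from sniffed SMTP traffic, since both end up as plain header and body text.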
The dtmail application is entirely too friendly to someone determining vulnerabilities, giving up the processor and OS revision. Given the problems that have appeared in the CDE suite, and in older versions of Solaris, an attack would be all too easy to construct.

There are two main avenues for finding this information for lots of clients quickly. First, we can sniff the network for it. Using a tool like mailsnarf, ngrep or any sniffer with some basic filtering, a modest collection of host to client application data can be gathered. The speed of collection and the ultimate size of this database depend chiefly on the amount of traffic your network segment sees. This is the main drawback of this method: a limited amount of data.

A much more efficient method, and one that can make use of the above information, is system fingerprinting done offline (for the target with respect to the potential attacker), with an exploit path included. How do we do this? We search the web, with its replete mailing list archives, and we turn up some boxes.

    Altavista: 2,033 pages found (for pine.ult)
    Google: results 1-10 of about 141,000 for pine.lnx
    Altavista: 16,870 pages found (for pine.osf)

You get the idea. Tens of thousands of hits, thousands of potentially exploitable boxes ready to be picked. Simply evaluate the source host information and map it to the client data, and a large database of vulnerable hosts is rapidly built.

The exploits are easy. Every week, new exploits are found in client software, either in mail applications like Pine, or in methods to deliver exploits using mail software. Examples include the various buffer overflows that have appeared (and persist) in Pine and Outlook, and the delivery of malicious DLL files using Eudora attachments.
We know from viruses like ILOVEYOU and Melissa that more people than not will open almost any mail message, and we know from spammers that it's trivial to bulk send messages with forged headers, making traceback difficult. These two items combine to make for a very readily available exploit.

In a manner similar to electronic mail, Usenet clients leave significant information in the headers of their posts which reveals information about their host operating systems. One great advantage of Usenet, as opposed to email or even web traffic, is that posts are distributed. As such, we can be remote and collect data on hosts without their knowledge and without ever having to gain entry into their network.

The newsreaders in common use include copious host information in their headers. The popular UNIX newsreader 'tin' is among the worst offenders in revealing host information. Operating system versions, processors and applications are all listed in the 'User-Agent' field, and when this is coupled to the NNTP-Posting-Host information, a remote host fingerprint has been performed:

    User-Agent: tin/1.5.2-20000206 ("Black Planet") (UNIX) (SunOS/5.6(sun4u))
    User-Agent: tin/pre-1.4-980226 (UNIX) (FreeBSD/2.2.7-RELEASE (i386))
    User-Agent: tin/1.4.2-20000205 ("Possession") (UNIX) (Linux/2.2.13(i686))
    NNTP-Posting-Host: host.university.edu

The standard web browsers also leave copious information about themselves and their host systems in news posts, as they do with HTTP requests and mail. We will elaborate on web clients in the next section, but they are a problem as Usenet clients too:

    X-Http-User-Agent: Mozilla/4.75 (Windows NT 5.0; U)
    X-Mailer: Mozilla 4.75 (X11; U; Linux 2.2.16-3smpi686)

Several other clients also leave verbose information about their hosts, to varying degrees.
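The tin strings shown above are regular enough to take apart mechanically. The following Python sketch extracts the tin version, operating system, release and processor from such a 'User-Agent' value; the pattern is inferred only from the three sample strings above and the function name is hypothetical, so other tin builds may well format the field differently.

```python
import re

# Pattern inferred from the three sample User-Agent strings in the text:
#   tin/<version> ... (<OS>/<release>(<arch>))
TIN_UA = re.compile(
    r"^tin/(?P<tin>\S+).*"
    r"\((?P<os>[A-Za-z]+)/(?P<release>[^\s()]+)\s*\((?P<arch>[^)]+)\)\)\s*$"
)

def parse_tin_user_agent(value):
    """Split a tin User-Agent value into version, OS, release and architecture."""
    match = TIN_UA.match(value.strip())
    return match.groupdict() if match else None

if __name__ == "__main__":
    samples = [
        'tin/1.5.2-20000206 ("Black Planet") (UNIX) (SunOS/5.6(sun4u))',
        "tin/pre-1.4-980226 (UNIX) (FreeBSD/2.2.7-RELEASE (i386))",
        'tin/1.4.2-20000205 ("Possession") (UNIX) (Linux/2.2.13(i686))',
    ]
    for sample in samples:
        print(parse_tin_user_agent(sample))
```

Pairing each parsed result with the NNTP-Posting-Host header of the same post gives one row of the host database the text describes.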
Again, when combined with the NNTP-Posting-Host or another identifying header, one can begin to amass information about hosts without too much work:

    Message-ID: <Pine.LNX.4.21.0010261126210.32652-100000@host.example.co.nz>
    User-Agent: MT-NewsWatcher/3.0 (PPC)
    X-Operating-System: GNU/Linux 2.2.16
    User-Agent: Gnus/5.0807 (Gnus v5.8.7) XEmacs/21.1 (Bryce Canyon)
    X-Newsreader: Microsoft Outlook Express 5.50.4133.2400
    X-Newsreader: Forte Free Agent 1.21/32.243
    X-Newsreader: WinVN 0.99.9 (Released Version) (x86 32bit)

Either directly or indirectly, we can fingerprint the operating system of the source host. Other programs are not so forthcoming, but still leak information about a host that can be used in vulnerability analysis:

    X-Newsreader: KNode 0.1.13
    User-Agent: Pan/0.9.1 (Unix)
    User-Agent: Xnews/03.02.04
    X-Newsreader: trn 4.0-test74 (May 26, 2000)
    X-Newsreader: knews 1.0b.0 (mrsam/980423)
    User-Agent: slrn/0.9.5.7 (UNIX)
    X-Newsreader: InterChange (Hydra) News v3.61.08

None of these header fields are required by the specifications for NNTP, as noted in RFC 2980. They provide only some additional information about the host which was the source of the data. However, given that most transactions that concern the servers are between servers, this data is entirely extraneous. It is, it appears, absent from RFC 977, the original specification for NNTP.

One interesting possibility for exploiting a user agent like Mozilla is to examine the accepted languages. In the example below, we see not only that English is supported, but that the browser is linked to Acrobat. Given potential holes in, and past problems with, malicious PDF files, this could be another avenue to gaining entry to a host.

    X-Mailer: Mozilla 4.75 (Win98; U)
    X-Accept-Language: en,pdf

It may seem that we're limited to fingerprinting hosts, or out of luck if they are using a proxy, but this is not the case.
We can also retrieve proxy info from the headers:

    X-Http-Proxy: 1.0 x72.deja.com:80 (Squid/1.1.22) for client 10.32.34.18

While in this case the proxy is disconnected from the client's network, if this were a border proxy, we could use this to gain information about a possible entry point to the network and, over time and with enough sample data, information about the network behind the protected border.

A remarkably simple and highly effective means of fingerprinting a target is to follow the web browsing that gets done from it. Nearly every system in use is a workstation, and nearly everyone spends part of their day in a web browser. And just about every browser sends too much information in its 'User-Agent' field.

RFC 1945 notes that the 'User-Agent' field is not required in an HTTP 1.0 request, but can be used. The authors state, "user agents should include this field with requests." They cite statistics as well as on the fly tailoring of data to meet features or limitations of browsers. The draft standard for HTTP version 1.1 requests, RFC 2616, notes similar usage of the 'User-Agent' field.

We can gather this information in two ways. First, we could run a website and turn on logging of the User-Agent field from the client (if it's not already on). Simply generate a lot of hits and watch the data come in. Get on Slashdot, advertise some pornographic material, or mirror some popular software (like warez) and you're ready to go.

Secondly, we can sniff web traffic on our visible segment. While almost any sniffer will work, one of the easiest for this type of work is urlsnarf from the dsniff package from Dug Song.
This package is available at:

    http://www.monkey.org/~dugsong/dsniff/

Examples of browsers that send not only their application information, such as the browser and the version, but also the operating system which the host runs include:

    - Netscape (UNIX, MacOS, and Windows)
    - Internet Explorer

One shining example of a browser that doesn't send extraneous information is Lynx. In both the 2.7 and 2.8 versions, only the browser information is sent, with no information about the host.

The User-Agent field can be important to the web server for legitimate reasons. Netscape and Explorer are implemented differently and are not equivalent on many items, including how they handle tables, scripting and style sheets. However, host information is not needed and is sent gratuitously. A typical request from a popular browser looks like this:

    GET / HTTP/1.0
    Connection: Keep-Alive
    User-Agent: Mozilla/4.08 (X11; I; SunOS 5.7 sun4u)
    Host: 10.10.32.1
    Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*
    Accept-Encoding: gzip
    Accept-Language: en
    Accept-Charset: iso-8859-1,*,utf-8

The User-Agent field is littered with extra information that we don't need to know: the operating system type, version and even the hardware being used. Instantly we know everything there is to know about compromising this host: the operating system, the host's architecture, and even a route we could use to gain entry, for example the recent problems in Netscape's JPEG handling.

Using urlsnarf to log these transactions is the easiest method to sniff this information from the network. A typical line of output is below:

    10.10.1.232 - - "GET http://www.latino.com/ HTTP/1.0" - - "http://www.latino.com/" "Mozilla/4.07 (Win95; I ;Nav)"

We can also use the tool ngrep to listen to this information on the wire. A simple filter to listen only to packets that contain the string 'User-Agent' can be set up and used to log information about hosts on the network.
ngrep can be obtained from the PacketFactory website:

    http://www.packetfactory.net/Projects/Ngrep/

A simple regular expression filter can do the trick:

    ngrep -qid ep1 'User-Agent' tcp port 80

This will print out all TCP packets which contain the case insensitive string User-Agent in them. And, within this field, for too many browsers, is too much information about the host. With the above options to ngrep, typical output will look like this:

    T 10.10.11.43:1860 -> 130.14.22.107:80
      GET /entrez/query/query.js HTTP/1.1..Accept: */*..Referer: http://www.
      ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=search DB=PubMed..Accept-Langua
      ge: en-us..Accept-Encoding: gzip, deflate..If-Modified-Since: Thu, 29
      Jun 2000 18:38:45 GMT; length=4558..User-Agent: Mozilla/4.0 (compatibl
      e; MSIE 5.5; Windows 98)..Host: www.ncbi.nlm.nih.gov..Connection: Keep
      -Alive..Cookie: WebEnv=FpEB]AfeA>>Hh^`Ba@<]^d]bCJfdADh@(j)@ =^a=T=EjIE=b<F
      bg<....

Even more information is contained within the request than urlsnarf showed us, information including cookies.

In much the same way as one can use the strings sent during requests by the clients to determine what system type is in use, one can follow the replies sent back by the server to determine what type it is. Again we will use ngrep, this time matching the expression 'server:' to gather the web server type:

    T 192.168.0.5:80 -> 192.168.0.1:1033
      HTTP/1.0 200 OK..Server: Netscape-FastTrack/2.01..Date: Mon, 30 Oct 20
      00 00:15:31 GMT..Content-type: text/html....

While specifics about the operating system information are lost, this works to passively gather vulnerability information about the target server. This can be coupled to other information to decide how best to proceed with an attack. This information will not be covered as this paper is limited to client applications and systems being fingerprinted.
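Once a request or reply has been captured, reading the 'User-Agent' or 'Server' value out of it takes only a few lines. The Python sketch below assumes the raw payload bytes of a single HTTP message are already in hand, with real CRLF line endings rather than the '..' that ngrep prints; the function names are made up for this example, and the sample payloads are trimmed from the traffic quoted earlier.

```python
import re

def http_headers(payload):
    """Split a raw HTTP request or reply into its first line and a header dict."""
    text = payload.decode("latin-1")              # header bytes map 1:1 to characters
    head = text.split("\r\n\r\n", 1)[0]           # drop any message body
    lines = head.split("\r\n")
    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")    # split at the first colon only
        if sep:
            headers[name.strip().lower()] = value.strip()
    return lines[0], headers

def platform_tokens(user_agent):
    """Return the semicolon-separated tokens from the first parenthesised group."""
    match = re.search(r"\(([^)]*)\)", user_agent)
    if match is None:
        return []
    return [token.strip() for token in match.group(1).split(";")]

REQUEST = (b"GET / HTTP/1.0\r\nConnection: Keep-Alive\r\n"
           b"User-Agent: Mozilla/4.08 (X11; I; SunOS 5.7 sun4u)\r\n"
           b"Host: 10.10.32.1\r\n\r\n")
REPLY = (b"HTTP/1.0 200 OK\r\nServer: Netscape-FastTrack/2.01\r\n"
         b"Content-type: text/html\r\n\r\n")

if __name__ == "__main__":
    _, request_headers = http_headers(REQUEST)
    print(platform_tokens(request_headers["user-agent"]))
    print(http_headers(REPLY)[1].get("server"))
```

Header names are lower-cased on the way in, so the lookup works regardless of how a given client or server capitalises them.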
While telnet is no longer in widespread use, because all of its data, including authentication data, is sent in plain text, it is still used widely enough to be of use in fingerprinting target systems. What is interesting is that it not only gives us a mechanism to gather operating system data, it also gives us the particular application in use, which can be of value in determining a mechanism of entry.

The specification for the telnet protocol describes a negotiation between the client and the host for information such as line speed, terminal type and echoing (for descriptive information on these options and their negotiations, please see RFCs 857, 858, 859, 860, 1091, 1073, 1079, 1184, 1372, and 1408; also see TCP Illustrated, Volume 1: The Protocols by W. Richard Stevens). What is interesting to note is that each client behaves in a unique way, even different client applications on the same host type.

Similarly, the telnet server, running a telnet daemon, can be fingerprinted by following the negotiations with the client. This information can be viewed from the telnet command line application on a UNIX host by issuing the 'toggle options' command at the telnet> prompt.

This information can be gathered directly, using a wedge application or a honey-pot as demonstrated on the network at Hope2k, or it can be sniffed off the network in a truly passive fashion. We discuss below gathering data about both the client system and the server being connected to. The same principles apply to both host identification methods.

The negotiations described above, and in the references listed, can be used to fingerprint the client based upon the options set and the order in which they are negotiated. Table 1 describes the behavior of several telnet clients in these respects. Their differences are immediately obvious, even for different clients on the same operating system, such as Tera Term Pro and Windows Telnet on a Windows 95 host.
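To make the idea of an option ordering concrete, the Python sketch below walks the bytes sent by one side of a telnet session and lists the WILL/WONT/DO/DONT negotiations in the order they appear. The command and option codes are the standard telnet values from the RFCs cited above; the function name and the sample byte string are invented for illustration and are not a capture from any real client.

```python
IAC, SB, SE = 255, 250, 240                      # telnet command bytes
COMMANDS = {251: "WILL", 252: "WONT", 253: "DO", 254: "DONT"}
OPTIONS = {
    1: "ECHO", 3: "SUPPRESS-GO-AHEAD", 5: "STATUS", 6: "TIMING-MARK",
    24: "TERMINAL-TYPE", 31: "NAWS", 32: "TERMINAL-SPEED",
    33: "TOGGLE-FLOW-CONTROL", 34: "LINEMODE", 36: "ENVIRON",
}

def negotiation_sequence(data):
    """Return (command, option) pairs in the order they occur in the byte stream."""
    sequence = []
    i, n = 0, len(data)
    while i < n:
        if data[i] != IAC:                       # ordinary session data, skip it
            i += 1
            continue
        if i + 1 >= n:
            break
        command = data[i + 1]
        if command in COMMANDS and i + 2 < n:
            option = data[i + 2]
            sequence.append((COMMANDS[command], OPTIONS.get(option, str(option))))
            i += 3
        elif command == SB:
            # Skip the subnegotiation body up to IAC SE. Simplification: an
            # escaped 0xFF directly before 0xF0 inside the body is not handled.
            end = data.find(bytes([IAC, SE]), i + 2)
            if end == -1:
                break
            i = end + 2
        else:
            i += 2                               # IAC IAC and other two-byte commands
    return sequence

# Hypothetical client bytes: WILL TERMINAL-TYPE, WILL NAWS, a subnegotiation, DO SGA.
SAMPLE = bytes([255, 251, 24, 255, 251, 31,
                255, 250, 24, 0, 120, 255, 240,
                255, 253, 3]) + b"login"

if __name__ == "__main__":
    print(negotiation_sequence(SAMPLE))
```

Comparing the resulting list against sequences recorded from known clients is the fingerprinting step; this sketch only produces the list and makes no claim about what any particular client sends.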
In this table, all server commands and negotiation options are ignored and only data originating from the client is shown. The table is omitted in this version; for the PDF and PostScript versions please see:

    http://www.crimelabs.net/docs/passive.html

Obviously, the most direct method to fingerprint a server would be to connect to it and examine the order of options and their values as a telnet session is negotiated. However, as this study is concerned with passive scanning of clients, we will leave it to the reader to map this information and learn what to do with it.

Solution

This paper has illustrated the effectiveness of target system identification using the information provided by network client applications. This provides a very efficient and precise measure of the client operating system, as well as identifying a vector for attack. This information is sent gratuitously and is not essential to the normal operation of many of these applications.

The main limitation of this information is found when a host is performing emulation of another operating system to run the client software. While this is rare, it could lead to a false system identification. This mainly falls in the open software world, however, and only for some operating systems.

For web browsers, which are ubiquitous and used by nearly everyone on the Internet, the host operating system should not be sent. Ideally, information about what protocols are spoken, what standards are met and what languages are supported (i.e. English, German, French) should suffice. Lynx behaves nearly ideally in this regard, and both Netscape and Explorer should follow this lead.

With respect to Usenet and electronic mail clients, again only what features are supported should be provided. Pine is an example of how bad it can get, providing too much information about a host too quickly. There is no reason why any legitimate client should know what processor and OS is being run on the sending host.
Telnet clients are far more difficult. It is tempting to say that all telnet applications should support the same set of features, but that is simply impossible.

Proxy hosts should be used, if possible, to strip off information about the originating system, including the workstation address and operating system information. This will help obscure the information needed to map a network from outside the perimeter. Coupled with strong measures to catch viruses and malicious code, such as in a web page script, the risks should be greatly reduced.

The best solution is for application authors to not send gratuitous information in their headers or requests. Furthermore, client applications should be scrutinized to the same degree as daemons that run with administrative privileges. The lessons of RFC 1123 most certainly apply at this level.

First: The admin/user must not be able to alter/remove the ident strings that are sent out by the application. This is the case for most Windows apps, and even where it is possible people usually do not take any measures in this department. So we can move on.

Second: The information displayed must actually be correct. This is when the fun begins. To take a really good example, Pine on most Linux systems *always* sends messages with a Message-Id that contains "LNX", although we think most are using shadow passwords. Also, most mail agents are quite good at rewriting headers if we ask them to, MTAs being another hidden champion of this. If you ever happen to receive a mail from me with a From: header that says root, do not even for a second think that it was actually sent from that account.

Also, you do not seem to adequately account for the fact that many people are not running mail servers on their systems. So if you e.g. see Outlook Express, then fine, you know that the sender machine was running this MUA.
But it also makes it more than likely that the email address you found is not actually one on the sending machine but rather one on a big mail server, which may be running anything. You have no knowledge of how mail collection at that site works, so you cannot be sure that your exploit will actually work. (E.g. a person can make an email enquiry with their browser's email client after clicking on a link but use something else for "normal" mail, and may not even be aware of the difference. And yes, we have seen a setup like this.)

Also, emulation is (or rather can be) quite a big issue with lesser known OSs for which not enough native applications exist. E.g. there is no Netscape binary of the current release available for any BSD operating system (BSDi support having been dropped after 4.75), so if you want to use Netscape on any BSD, you have to use emulation. But if you go after the presumably old, 2.0.x kernel based Linux system it reports itself as, you will be in for a surprise. But the real kicker is using Wine (Windows emulation package for UNIX) and a Windows-based web browser. (Yes, people have done things like this. Sometimes you are forced to, e.g. if there is not even a Linux port of the software you need to run.)

For proxies: It is known that there exist proxies that hide your real IP address and cannot be detected in any easy way (because they do not insert an X-Forwarded-IP field). The proxy may or may not be local, so you do not necessarily have the entrance to the network either.

Also, web search engines can be helpful for finding vulnerabilities in servers, but compiling lists of target hosts from mailing list archives is fragile: there may not be many live hits from those. (Even for server fingerprinting, some surprises are in the game: e.g. Walmart was suspected of forging their server signature because on at least one occasion they reported themselves as Microsoft-IIS/4.0 (Unix) mod_ssl/2.6.6 OpenSSL/0.9.5.. outright funny.)
So the point is: although the information may be there, it may already be forged intentionally or be otherwise incorrect. Also, the fact that you found e.g. Mutt does not tell you a lot unless you have a specific exploit, because many of these programs run on many UNIX/UNIX-like systems plus on DOS/Windows. So you do not know a lot.

And finally: with all this information you have to go out and do some actual scanning to verify it or gather more, and this is where you can already get caught.

But, yes, even with the above points made, considering the average Windows/Mac user and admin, information leakage can be a cause for many an interesting occurrence. Why, at this rate we got the idea for a paper titled: "utilizing information gleaned from Internet-accessible support pages of various big organizations and institutions in network incidents..." It is at least as interesting a topic as this one.