                             ==Phrack Inc.==

               Volume 0x0b, Issue 0x39, Phile #0x0c of 0x12

|=-----------------=[ Network Intrusion Detection System ]=--------------=|
|=--------------=[ On Mass Parallel Processing Architecture ]=-----------=|
|=------------=[ Wanderley J. Abreu Jr. <storm@stormdev.net> ]=---------=|


"Nam et Ipsa Scientia Potestas Est" - Francis Bacon


1 ----|Introduction:

One of the hardest challenges in the security field is to detect
malicious attacks with 100% certainty while they are occurring, and to
choose the most effective way to log them, block them and prevent them
from happening again.

The problem has been solved only partially. About 19 years ago the
Intrusion Detection System concept arrived to meet the market's wish to
handle security problems arising from internal and external attacks at
low or medium cost, without a major need for trained security personnel,
since any network administrator "seems" to manage such systems well.

Difficulties then appeared with three demands placed on anomaly-based and
policy-based IDS: effectiveness, efficiency and ease of use.

This paper focuses on raising the Bayesian detection rate by building an
IDS based on a Depth-First Search algorithm on a mass parallel processing
(MPP) environment, and gives a mathematical approach to the effectiveness
of this model in comparison with other NIDS.

One problem with building software for an environment as expensive as
most MPPs is that it is limited to a very small portion of the computer
community, so we'll focus on a High Performance Computer Cluster called
the "Class II - Beowulf Class Cluster", a set of tools developed by NASA.
These tools are used to emulate an MPP environment built from x86
computers running Linux-based operating systems.

The paper does not intend to offer the absolute solution for the false
positives and false negatives generated by Network-Based IDS, but it
takes one more step towards the utopia.


2 -----|Bayesian Detection Rate (BDR):

In 1761, Reverend Thomas Bayes brought us a concept to govern logical
inference: determining the degree of confidence we may have in various
possible conclusions, based on the body of evidence available. Therefore,
to arrive at a logically defensible prediction one must use Bayes'
theorem.

The Bayesian Detection Rate was first used to measure IDS effectiveness
in Stefan Axelsson's paper "The Base-Rate Fallacy and its Implications
for the Difficulty of Intrusion Detection", presented at RAID 99, which
gives a realistic perspective on how the "False Alarm" rate can limit the
performance of an IDS.

As said, this paper aims to increase the detection rate by reducing false
alarms in the IDS model, so we must know the principles of the Bayesian
Detection Rate (BDR):

                      P(D|H)P(H)
   P(H|D) = -------------------------
            P(D|H)P(H) + P(D|H')P(H')

Let's use a simple example to illustrate how Bayes' theorem works:

Suppose that 2% of people your age and heredity have cancer. Suppose that
a blood test has been developed that correctly gives a positive test
result in 90% of people with cancer, and gives a false positive in 10% of
the cases of people without cancer. Suppose you take the test, and it is
positive. What is the probability that you actually have cancer, given
the positive test result?

First, you must identify the Hypothesis, H, the Datum, D, the probability
of the Hypothesis prior to the test, and the hit rate and false alarm
rate of the test.

H = the hypothesis; in this case H is the hypothesis that you have
cancer, and H' is the hypothesis that you do not.

D = the datum; in this case D is the positive test result.

P(H) is the prior probability that you have cancer, which was given in
the problem as 0.02.

P(D|H) is the probability of a positive test result GIVEN that you have
cancer. This is also called the HIT RATE, and was given in the problem as
0.90.

P(D|H') is the probability of a positive test result GIVEN that you do
not have cancer. This is also called the FALSE ALARM rate, and was given
as 0.10.

P(H|D) is the probability that you have cancer, given that the test was
positive. This is also called the posterior probability or Bayesian
Detection Rate. In this case it is

   (0.90 * 0.02) / (0.90 * 0.02 + 0.10 * 0.98) = 0.018 / 0.116 = 0.155

(approx. 16%; I'd not bet the rest of my days on this test).

Applying it to Intrusion Detection

Let's say that:

Ii -> Intrusion behaviour
Ij -> Normal behaviour
Ai -> Intrusion Alarm
Aj -> No Alarm

What an IDS is meant to do is alarm us when a log pattern really
indicates an intrusion, so what we want is P(Ii|Ai), the Bayesian
Detection Rate.

                      P(Ii) P(Ai|Ii)
   P(Ii|Ai) = ---------------------------------
              P(Ii) P(Ai|Ii) + P(Ij) P(Ai|Ij)

Where:

True Positive Rate P(Ai|Ii):

              Real Attack-Packets Detected
   P(Ai|Ii) = ----------------------------
              Total Of Real Attack-Packets

False Positive Rate P(Ai|Ij):

                        False Attack-Packets Detected
   P(Ai|Ij) = ---------------------------------------------------
              (Total Of Packets) - (Total Of Real Attack-Packets)

Intrusive Behaviour P(Ii):

                                  1
   P(Ii) = ----------------------------------------------------
                           Total of Packets
           ----------------------------------------------------
           (Number of Packets Per Attack) * (Number of Attacks)

Non-Intrusive Behaviour P(Ij):

   P(Ij) = 1 - P(Ii)

By now you should realize that the Bayesian Detection Rate increases if
the False Positive Rate decreases.


3 -----|Normal Distribution:

To detect a rise in BDR we must know what the standard BDR of current
Intrusion Detection Systems is, so we'll use a method based on the
Normal Distribution.

Normal distributions are a family of distributions that have the same
general shape. They are symmetric, with scores more concentrated in the
middle than in the tails. Normal distributions are sometimes described as
bell shaped. The area under each curve is the same.

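Before the curve itself is defined, a quick numeric check of section 2
may help. The following is a minimal Python sketch, not the author's
code: the function name `posterior` and the rates used for the "IDS"
lines are assumptions made only for illustration. It evaluates Bayes'
theorem for the blood-test example and shows the same formula acting as
a BDR when the false positive rate shrinks.

```python
def posterior(prior, hit_rate, false_alarm_rate):
    """Bayes' theorem: P(H|D) from P(H), P(D|H) and P(D|H')."""
    true_alarms = hit_rate * prior
    false_alarms = false_alarm_rate * (1.0 - prior)
    return true_alarms / (true_alarms + false_alarms)


# Blood-test example from section 2: P(H)=0.02, P(D|H)=0.90, P(D|H')=0.10.
cancer = posterior(0.02, 0.90, 0.10)
print(f"P(H|D) = {cancer:.3f}")   # prints "P(H|D) = 0.155"

# Hypothetical IDS (illustrative numbers): same prior and hit rate,
# two different false positive rates.
noisy = posterior(2.5e-4, 0.75, 1e-4)
quiet = posterior(2.5e-4, 0.75, 1e-5)
print(f"BDR at false positive rate 1e-4: {noisy:.4f}")
print(f"BDR at false positive rate 1e-5: {quiet:.4f}")
```

With the prior and hit rate held fixed, the lower false positive rate
always yields the higher BDR, which is the property the rest of the
paper relies on.
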
The height of a normal distribution can be specified mathematically in
terms of two parameters: the mean (m) and the standard deviation (s).

The height (ordinate) of a normal curve is defined as:

                  1
   f(x) = --------------- * e^( -(x - m)^2 / (2 * s^2) )
           sqrt(2*p*s^2)

Where m is the mean, s is the standard deviation, p is the constant
3.14159, and e is the base of natural logarithms, equal to 2.718282.

x can take on any value from -infinity to +infinity.

3.1 ---------| The Mean:

The arithmetic mean is what is commonly called the average and it can be
defined as:

       x1 + x2 + x3 + ... + xn
   m = -----------------------
                  n

Where n is the number of scores entered.

3.2 ---------| The Standard Deviation:

The Standard Deviation is a measure of how spread out a distribution is.
Its square, the variance, is computed as the average squared deviation of
each number from the mean; the standard deviation s is the square root of
that value:

         (x1 - m)^2 + (x2 - m)^2 + (x3 - m)^2 + ... + (xn - m)^2
   s^2 = -------------------------------------------------------
                                    n

Where n is the number of scores entered.

We'll define an experimental method in which X is the BDR of the best
known IDS on the market, and we'll see how much our prototype based on
the MPP platform differs from their results, using the Normal
Distribution method and the Standard Deviation.


4 ------|Experimental Environment:

Now we should gather experimental information to establish a baseline
BDR for IDS.

Let's take the default installation of 10 IDS plus our prototype, 11 in
total, each running on this configuration:

*Pentium 866 MHz
*128 MBytes RAM
*100 Mb/s Fast Ethernet Adapter (Intel tulip based (2114X))
*1 Megabyte of synchronous cache
*Motherboard ASUS P3BF
*Total of 30 gigabytes of HD capacity, Transfer Rate of 15 Mb/s

The experiment will run for 22 days. Each IDS will run separately for
2 days.

We'll use 3 separate subnets here: 192.168.0.0/26 Netmask
255.255.255.192, 192.168.0.129/26 Netmask 255.255.255.192, and a Real IP
Network, 200.200.200.x.

The IDS may differ only in OS and detection method; each must keep the
same node configuration.

We'll simulate random network usage and 4 intrusion attacks (4 packets)
until the amount of traffic reaches around 100,000 packets from different
protocols.

The gateway (host node) keeps routing or seeing packets from the Internal
network, Internet, WAN, etc.

[Figure: test network topology. A SWITCH connects the DMZ (Firewall ->
Router -> Internet), the LAN, the HOST NODE (login node, with
Keyboard/Mouse and Monitor), node one (gateway) and node two (IDS).]

4.1 -----|MPP Environment:

Now we must define a network topology and a standard operating system
for our prototype.

The gateway host is on the three networks at the same time. It runs the
part of the software that gathers packet information, performs a
Depth-First Search and then transmits the suspicious packets to the
other hosts.

The hardware will be:

*3 Pentium II 400 MHz
*128 Megabytes RAM
----------------------
*1 Pentium III 550 MHz
*512 Megabytes RAM
----------------------
*Motherboard ASUS P3BF
*Total of 30 gigabytes of HD capacity, Transfer Rate of 15 Mb/s
*1 Megabyte of synchronous cache
*100 Mb/s Fast Ethernet Adapter (Intel tulip based (2114X))

The OS will be the Extreme Linux distribution CD, which comes with all
the components necessary to build a cluster.

Note that we have the same processing capability as the other NIDS
systems (866 MHz); we'll discuss the cost of all environments later.

[Figure: MPP network topology. A SWITCH connects the DMZ (Firewall ->
Router -> Internet), the LAN, the HOST NODE (login node, with
Keyboard/Mouse and Monitor), node one (gateway), node two, node three,
node four and node five.]


5 ------|The Experiment:

Tested NIDS were:

+SNORT
+Computer Associates Intrusion Detection System
+Real Secure
+Shadow
+Network Flight Recorder
+Cisco NetRanger
+EMERALD (Event Monitoring Enabling Response to Anomalous Live
 Disturbances)
+Network Associates CyberCop
+PENS Dragon Intrusion Detection System
+Network ICE
+MPP NIDS Prototype

5.1 ------|Results:

----|Snort

False Positives - 7
False Negatives - 3
True Positives  - 1

                  1
P(Ii) = -------------------- = 2.5 * 10^-4
               1*10^5
              --------
                1*4

P(Ij) = 1 - P(Ii) = 0.99975

P(Ai|Ii) = 1/4 = 0.25

P(Ai|Ij) = 7/99996 = 7.0 * 10^-5

                       (2.5 * 10^-4) * (2.5 * 10^-1)
BDR = ---------------------------------------------------------------- = 0.4718
      (2.5 * 10^-4) * (2.5 * 10^-1) + (9.9975 * 10^-1) * (7.0 * 10^-5)

----|Computer Associates Intrusion Detection System

False Positives - 5
False Negatives - 2
True Positives  - 2

                  1
P(Ii) = -------------------- = 2.5 * 10^-4
               1*10^5
              --------
                1*4

P(Ij) = 1 - P(Ii) = 0.99975

P(Ai|Ii) = 2/4 = 0.50

P(Ai|Ij) = 5/99996 = 5.0 * 10^-5

                       (2.5 * 10^-4) * (5.0 * 10^-1)
BDR = ---------------------------------------------------------------- = 0.7143
      (2.5 * 10^-4) * (5.0 * 10^-1) + (9.9975 * 10^-1) * (5.0 * 10^-5)

----|Real Secure

False Positives - 6
False Negatives - 2
True Positives  - 2

                  1
P(Ii) = -------------------- = 2.5 * 10^-4
               1*10^5
              --------
                1*4

P(Ij) = 1 - P(Ii) = 0.99975

P(Ai|Ii) = 2/4 = 0.50

P(Ai|Ij) = 6/99996 = 6.0 * 10^-5

                       (2.5 * 10^-4) * (5.0 * 10^-1)
BDR = ---------------------------------------------------------------- = 0.6757
      (2.5 * 10^-4) * (5.0 * 10^-1) + (9.9975 * 10^-1) * (6.0 * 10^-5)

----|Network Flight Recorder

False Positives - 5
False Negatives - 1
True Positives  - 3

                  1
P(Ii) = -------------------- = 2.5 * 10^-4
               1*10^5
              --------
                1*4

P(Ij) = 1 - P(Ii) = 0.99975

P(Ai|Ii) = 3/4 = 0.75

P(Ai|Ij) = 5/99996 = 5.0 * 10^-5

                       (2.5 * 10^-4) * (7.5 * 10^-1)
BDR = ---------------------------------------------------------------- = 0.7895
      (2.5 * 10^-4) * (7.5 * 10^-1) + (9.9975 * 10^-1) * (5.0 * 10^-5)

----|Cisco NetRanger

False Positives - 5
False Negatives - 3
True Positives  - 1

                  1
P(Ii) = -------------------- = 2.5 * 10^-4
               1*10^5
              --------
                1*4

P(Ij) = 1 - P(Ii) = 0.99975

P(Ai|Ii) = 1/4 = 0.25

P(Ai|Ij) = 5/99996 = 5.0 * 10^-5

                       (2.5 * 10^-4) * (2.5 * 10^-1)
BDR = ---------------------------------------------------------------- = 0.5556
      (2.5 * 10^-4) * (2.5 * 10^-1) + (9.9975 * 10^-1) * (5.0 * 10^-5)

----|EMERALD

False Positives - 7
False Negatives - 3
True Positives  - 1

                  1
P(Ii) = -------------------- = 2.5 * 10^-4
               1*10^5
              --------
                1*4

P(Ij) = 1 - P(Ii) = 0.99975

P(Ai|Ii) = 1/4 = 0.25

P(Ai|Ij) = 7/99996 = 7.0 * 10^-5

                       (2.5 * 10^-4) * (2.5 * 10^-1)
BDR = ---------------------------------------------------------------- = 0.4718
      (2.5 * 10^-4) * (2.5 * 10^-1) + (9.9975 * 10^-1) * (7.0 * 10^-5)

----|CyberCop

False Positives - 4
False Negatives - 2
True Positives  - 2

                  1
P(Ii) = -------------------- = 2.5 * 10^-4
               1*10^5
              --------
                1*4

P(Ij) = 1 - P(Ii) = 0.99975

P(Ai|Ii) = 2/4 = 0.50

P(Ai|Ij) = 4/99996 = 4.0 * 10^-5

                       (2.5 * 10^-4) * (5.0 * 10^-1)
BDR = ---------------------------------------------------------------- = 0.7576
      (2.5 * 10^-4) * (5.0 * 10^-1) + (9.9975 * 10^-1) * (4.0 * 10^-5)

----|PENS Dragon Intrusion Detection System

False Positives - 6
False Negatives - 2
True Positives  - 2

                  1
P(Ii) = -------------------- = 2.5 * 10^-4
               1*10^5
              --------
                1*4

P(Ij) = 1 - P(Ii) = 0.99975

P(Ai|Ii) = 2/4 = 0.50

P(Ai|Ij) = 6/99996 = 6.0 * 10^-5

                       (2.5 * 10^-4) * (5.0 * 10^-1)
BDR = ---------------------------------------------------------------- = 0.6757
      (2.5 * 10^-4) * (5.0 * 10^-1) + (9.9975 * 10^-1) * (6.0 * 10^-5)

----|Network ICE

False Positives - 5
False Negatives - 3
True Positives  - 1

                  1
P(Ii) = -------------------- = 2.5 * 10^-4
               1*10^5
              --------
                1*4

P(Ij) = 1 - P(Ii) = 0.99975

P(Ai|Ii) = 1/4 = 0.25

P(Ai|Ij) = 5/99996 = 5.0 * 10^-5

                       (2.5 * 10^-4) * (2.5 * 10^-1)
BDR = ---------------------------------------------------------------- = 0.5556
      (2.5 * 10^-4) * (2.5 * 10^-1) + (9.9975 * 10^-1) * (5.0 * 10^-5)

----|Shadow

False Positives - 3
False Negatives - 2
True Positives  - 2

                  1
P(Ii) = -------------------- = 2.5 * 10^-4
               1*10^5
              --------
                1*4

P(Ij) = 1 - P(Ii) = 0.99975

P(Ai|Ii) = 2/4 = 0.50

P(Ai|Ij) = 3/99996 = 3.0 * 10^-5

                       (2.5 * 10^-4) * (5.0 * 10^-1)
BDR = ---------------------------------------------------------------- = 0.8065
      (2.5 * 10^-4) * (5.0 * 10^-1) + (9.9975 * 10^-1) * (3.0 * 10^-5)

----|MPP NIDS Prototype

False Positives - 2
False Negatives - 1
True Positives  - 3

                  1
P(Ii) = -------------------- = 2.5 * 10^-4
               1*10^5
              --------
                1*4

P(Ij) = 1 - P(Ii) = 0.99975

P(Ai|Ii) = 3/4 = 0.75

P(Ai|Ij) = 2/99996 = 2.0 * 10^-5

                       (2.5 * 10^-4) * (7.5 * 10^-1)
BDR = ---------------------------------------------------------------- = 0.9036
      (2.5 * 10^-4) * (7.5 * 10^-1) + (9.9975 * 10^-1) * (2.0 * 10^-5)

5.2 -------|Normal Distribution

Using the normal distribution method, let us identify, on a scale from
1 to 10, the score of our NIDS Prototype:

---|The Average BDR for the NIDS test was:

         0.4718+0.7143+0.6757+0.7895+0.5556+0.4718+...+0.8065+0.9036
m(BDR) = ------------------------------------------------------------
                                     11

m(BDR) = 0.6707

---|The Standard Deviation for the NIDS test was:

           (0.4718 - 0.6707)^2+(0.7143 - 0.6707)^2+...+(0.9036 - 0.6707)^2
s(BDR)^2 = ----------------------------------------------------------------
                                         11

s(BDR) = 0.1420

---|The Score

Expressed as percentages, the mean is 67.07 (m) and the standard
deviation is 14.2 (s).

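The figures of sections 5.1 and 5.2 can be re-derived mechanically. The
sketch below is an illustration in Python, not the tooling used in the
experiment; the names `COUNTS`, `bdr` and `score` are made up here. Two
assumptions are worth stating loudly. First, P(Ii) = 2.5 * 10^-4 is taken
as printed and kept as an input, because evaluating the printed fraction
literally gives 4 * 10^-5. Second, both forms of the standard deviation
are reported, since the quoted 0.1420 corresponds to dividing by n - 1
rather than by n. The `score` helper applies the rule that the text goes
on to describe.

```python
import math
import statistics

TOTAL_PACKETS = 100_000   # traffic volume per run (section 4)
ATTACK_PACKETS = 4        # 4 attacks of 1 packet each
PRIOR = 2.5e-4            # P(Ii) as printed in section 5.1 (assumed input)

# (false positives, true positives) per system, copied from section 5.1.
COUNTS = {
    "Snort": (7, 1),
    "Computer Associates IDS": (5, 2),
    "Real Secure": (6, 2),
    "Network Flight Recorder": (5, 3),
    "Cisco NetRanger": (5, 1),
    "EMERALD": (7, 1),
    "CyberCop": (4, 2),
    "PENS Dragon": (6, 2),
    "Network ICE": (5, 1),
    "Shadow": (3, 2),
    "MPP NIDS Prototype": (2, 3),
}


def bdr(false_pos, true_pos, prior=PRIOR):
    """P(Ii|Ai) from the raw counts of one test run."""
    hit_rate = true_pos / ATTACK_PACKETS
    false_alarm_rate = false_pos / (TOTAL_PACKETS - ATTACK_PACKETS)
    hits = prior * hit_rate
    return hits / (hits + (1.0 - prior) * false_alarm_rate)


def score(x, mean, sdev):
    """z plus the 0.005 rounding term and the 5.0 offset, two decimals kept."""
    z = (x - mean) / sdev
    z += 0.005 if z >= 0 else -0.005
    return math.floor((z + 5.0) * 100) / 100


rates = {name: bdr(fp, tp) for name, (fp, tp) in COUNTS.items()}
values = list(rates.values())
mean = statistics.mean(values)
sdev_n = statistics.pstdev(values)    # divides by n
sdev_n1 = statistics.stdev(values)    # divides by n - 1

for name, value in rates.items():
    print(f"{name:26s} BDR={value:.4f} score={score(value, mean, sdev_n1):.2f}")
print(f"mean={mean:.4f} s(n)={sdev_n:.4f} s(n-1)={sdev_n1:.4f}")
```

Because the false positive rate is computed from the exact counts instead
of the rounded values shown in the text, the last printed digit of a BDR
may differ slightly from the one quoted.
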
Since 90.36 (X) is 23.29 points above the mean (X - m = 23.29) and a
standard deviation is 14.2 points, there is a distance of 1.640 (z)
standard deviations between 67.07 and 90.36 (z = 23.29/14.2). To this we
add 0.005 for rounding and 5.0 for our average standard score.

The score (z) can be computed using the following formula:

       X - m
   Z = -----
         s

If you get a positive number for Z then apply (z = z + 0.005 + 5.0)
If you get a negative number for Z then apply (z = z - 0.005 + 5.0)

You should consider just the first two decimal places. So for our
prototype we get:

   z = 1.640 + 0.005 + 5.0
   z = 6.64

Our prototype scored 6.64 in our test. At this point the reader is
encouraged to make the same calculation for all NIDS; you'll see that our
prototype achieved the best score of all the NIDS we tested.


6 -------|Why?

Why does our prototype differ so much from the rest of the NIDS, if it
was built on almost the same concepts?

6.1 ---|E,A,D,R AND "C" Boxes

Using the CIDF (Common Intrusion Detection Framework) we have 4 basic
boxes, which are:

E - Boxes, or event generators, are the sensors; their job is to detect
events and push out the reports.

A - Boxes receive reports and do analysis. They might offer a
prescription and recommend a course of action.

D - Boxes are database components; they can determine whether an IP
address or an attack has been seen before, and they can do trend
analysis.

R - Boxes can take the input of the E, A and D Boxes and respond to the
event.

Now what are the "C" - Boxes? They are Redundancy Check boxes: they use
CRC methods to check whether a True Positive really is a True Positive.
The C - Boxes can tell whether an E - Box generated a rightful report, or
whether an A - Box generated a real true positive based on that report.

Because we're dealing with an MPP environment, this box can run on every
machine, dividing the payload data among as many boxes as you have.

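How a C - Box applies CRC is not spelled out above, so what follows is
only one possible reading, sketched in Python under loud assumptions: the
payload is cut into as many contiguous chunks as there are boxes, each
chunk travels with a CRC-32 computed by `zlib.crc32`, and the receiving
box recomputes the checksum before trusting the data. The function names
and the sample payload are hypothetical.

```python
import zlib


def split_payload(payload, boxes):
    """Cut payload into `boxes` contiguous chunks of near-equal size."""
    size, extra = divmod(len(payload), boxes)
    chunks, start = [], 0
    for index in range(boxes):
        # The first `extra` chunks take one additional byte each.
        end = start + size + (1 if index < extra else 0)
        chunks.append(payload[start:end])
        start = end
    return chunks


def seal(chunk):
    """Pair a chunk with its CRC-32 so corruption can be detected later."""
    return chunk, zlib.crc32(chunk)


def verify(chunk, checksum):
    """Recompute the CRC-32 and compare it with the one received."""
    return zlib.crc32(chunk) == checksum


# Made-up sample payload, divided among four hypothetical boxes.
payload = b"GET /cgi-bin/phf?Qalias=x HTTP/1.0\r\n\r\n"
sealed = [seal(chunk) for chunk in split_payload(payload, 4)]

for chunk, checksum in sealed:
    print(f"{len(chunk):2d} bytes crc32={checksum:08x} ok={verify(chunk, checksum)}")
```

A CRC only detects accidental corruption of a chunk in transit. Deciding
whether an alarm is a real true positive would still need the analysis
logic of the A - Boxes, which this sketch does not attempt.
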
6.2 ---|CISL

Our prototype Boxes use a language called CISL (Common Intrusion
Specification Language) to talk with one another. It conveys the
following kinds of information:

+Raw event information: Audit Trail Records and Network Traffic
+Analysis Results: Description of System Anomalies and Detected Attacks
+Response Prescriptions: Halt Particular Activities or modify component
 security specifications

6.3 ---|Transparent NIDS Boxes

All but some E - Boxes will use a method commonly applied to firewalls
and proxies to control inbound and outbound network traffic to certain
machines. It is called "Box Transparency", and it reduces the need for
software replacement and user retraining. It can control who or what is
able to see the machine, so all unnecessary network traffic is reduced to
a minimum.

6.4 ---|Payload Distribution And E-Box to A-Box Tunneling

Under the MPI (Message Passing Interface) programming environment, using
Beowulf as the cluster platform, we can distribute the A - Boxes' parsing
of network payload traffic across every machine in the cluster,
maximizing A - Box performance, and C - Box performance as well. All
network traffic other than the report data that comes from the E - Boxes
through an encrypted tunneling protocol is blocked, in order to maximize
cluster data transfer and the DSM (Distributed Shared Memory).


7 -------|Conclusions

Although neither the attack method nor the NIDS detection model was
considered in this paper, it is necessary to add that no one keeps a NIDS
in its default configuration, so you can achieve better scores with a
well configured system. You can also score any NIDS with this method,
which gives you a glimpse of how your system is doing in comparison with
others.

As was said in the introduction, this paper is not a final solution for
NIDS performance measurement or a real panacea for false positive rates
(it is doubtful that any paper will be), but it gives the reader a
relatively easy way to measure the effectiveness of a NIDS environment
and it proposes one more way to perform this hard job.


8 -------|Bibliography

AMOROSO, Edward G. (1999), "Intrusion Detection", Intrusion NetBook, USA.

AXELSSON, Stefan (1999), "The Base-Rate Fallacy and its Implications for
the Difficulty of Intrusion Detection",
www.ce.chalmers.se/staff/sax/difficulty.ps, Sweden.

BUNDY, Alan (1997), "Artificial Intelligence Techniques", Springer-Verlag
Berlin Heidelberg, Germany.

BUYYA, Rajkumar (1999), "High Performance Cluster Computing:
Architectures and Systems", Prentice Hall, USA.

KAEO, Merike (1999), "Designing Network Security", Macmillan Technical
Publishing, USA.

LEONARD, Thomas (1999), "Bayesian Methods: An Analysis for Statisticians
and Interdisciplinary Researchers", Cambridge Univ Press, UK.

NORTHCUTT, Stephen (1999), "Network Intrusion Detection: An Analyst's
Handbook", New Riders Publishing, USA.

PATEL, Jagdish K. (1996), "Handbook of the Normal Distribution", Marcel
Dekker, USA.

STERLING, Thomas L. (1999), "How to Build a Beowulf: A Guide to the
Implementation and Application of PC Clusters", MIT Press, USA.


9 -------|Acknowledgments:

#Segfault at IRCSNET, thanks for all the fun and moral support
TICK, for the great hints on the NIDS field and for being the first one
      to believe in this paper's potential
VAX, great pal, for all those sleepless nights
Very Special Thanks to GAMMA, for the great Text & Math hints
SYD, for moral support and great jokes
All THC crew
Michal Zalewski, thank you for last night
My Girlfriend Carolina, you all know why :)
Storm Security Staff, for building the experimental environment

|=[ EOF ]=---------------------------------------------------------------=|