|
==Phrack Inc.== Volume 0x0b, Issue 0x3b, Phile #0x0f of 0x12 |=-------------=[ CRYPTOGRAPHIC RANDOM NUMBER GENERATORS ]=--------------=| |=-----------------------------------------------------------------------=| |=-----------------=[ DrMungkee <pub@drmungkee.com> ]=-------------------=| ----| Introduction Every component in a cryptosystem is critical to its security. A single failure in one could bring down all the others. Cryptographic random numbers are often used as keys, padding, salt and initialization vectors. Using a good RNG for each of these components is essential. There are many complications imposed by the predictability of computers, but there are means of extracting the few bits of entropy regardless of them being exponentially out-numbered by redundancy. This article's scope covers the design, implementation and analysis of RNGs. RNGs subject to exploration will be NoiseSpunge, Intel RNG, Linux' /dev/random, and Yarrow. ----| Glossary RNG - Random Number Generator PRNG - Pseudo Random Number Generator entropy - Unpredictable information redundancy - Predictable or probabilistic information ----| 1) Design Principles of RNGs 1.0) Overview A variety of factors come into play when designing an RNG. It's output must be undissernable from white noise, there must be no way of predicting any portion of it, and there can be no way of finding previous or future outputs based on any known outputs. If an RNG doesn't conform to this criteria, it is not cryptographicaly secure. 1.1) Entropy Gathering To meet the first and second criteria, finding good sources of entropy is an obligation. These sources must be unmoniterable by an attacker, and any attempts by an attacker to manipulate the entropy sources should not make them predictable or repetitive. Mouse movement is often used as entropy, but if the entropy is improperly interpreted by the RNG, there is a segnficant amount of redundancy. To demonstrate, I monitered mouse movement at an interval of 100 miliseconds. These positions were taken consecutively while the mouse was moved hecticaly in all directions. These results say it all: X-Position Y-Position 0000001011110101 0000000100101100 Only the last 9 bits of each 0000001000000001 0000000100001110 coordinate actualy appear 0000001101011111 0000001001101001 random. 0000001000100111 0000000111100100 0000001010101100 0000000011111110 0000000010000000 0000000111010011 0000001000111000 0000000100100111 0000000010001110 0000000100001111 0000000111010100 0000000011111000 0000000111100011 0000000100101010 The next demonstration shows a more realistic gathering of entropy by keeping only the 4 least significant bits of the X and Y positions and XORing them with a high-frequency counter, monitoring them at a random interval: X Y Timer XORed 1010 1001 00100110 01111111 0100 1100 00101010 00000110 0101 0010 01011111 01110101 1001 1100 10110000 11111100 0101 0100 11001110 11100010 0101 1100 01010000 01111100 1011 0000 01000100 00011100 0111 0111 00010111 00101000 0011 0101 01101011 01110110 0001 0001 11011000 11010001 Good entropy is gathered because 4bits from each coordinates represents a change in 16 pixels in each direction rather than assuming a motion of 65536 can occur in all directions. The high-resolution timer is used as well because although it is completly sequencial, it's last 8 bits will have been updated very often during a few CPU clock cycles, thus making those bits unmonitorable. An XOR is used to combine the entropy from the 2 sources because it has very the very good property of merging numbers in a way that preserves the dependency of every bit. The most common sources of entropy used all involve user interaction or high-frequency clocks in one way, shape, or form. A hybrid of both is always desirable. Latencies between user-triggered events (keystroke, disk I/O, IRQs, mouse clicks) measured at high-precisions are optimal because of the unpredictable nature of a user's behaviors and precise timing. Some sources may seem random enough but are in fact not. Network traffic is sometimes used but is unrecommended because it can be monitored and manipulated by an outside source. Another pittfall is millisecond precision clocks: they don't update frequently enough to be put to good use. A good example of entropy gathering shortcommings is Netscape's cryptographically _broken_ not-so-RNG. Netscape used the time and date with its process ID and its parent's process ID as it's only source of entropy. The process ID in Win9x is a value usualy below 100 (incremented once for each new process) that is XORed with the time of day Win9x first started. Even though the hashing function helped generate output that seemed random, it is easy to estimate feseable values for the entropy, hash them, and predict the RNG's output. It doesn't matter weather or not the output looks random if the source of entropy is poor. 1.2 Entropy Estimations Evaluating the quantity of entropy gathered should not be overlooked. It must be dones in order to prevent the RNG from attempting to output more entropy than it has gathered. Depending on system parameters, you can assign quality estimates for each of your entropy sources. For example, you can evaluate all keyboard generated entropy as being 4bits in size, regardless of how many bits of entropy you collect from it. If the RNG is on a file server and uses disk I/O as an entropy source, it could derrive an entropy estimate proportional to the number of users accessing the disk to prevent sequencial disk access from resulting in redundant entropy. The entropy estimates do not need to be the same size as the inputs or outputs of entropy gathering. They are meant as a safety precaution in further calculations. There are alternative methods for estimating the entropy. You could bias entropy from a source to be of better quality if that source has not supplied entropy for a period exceeding a certain interval. You can accumulate large amounts of entropy in a buffer, compress it, and derive an estimation from the compression ratio. Statistical tests comparing the last input entropy with a large quantity of previous inputs doesn't do much in terms of finding the current input's quality, but it gives the RNG an oppertunity to reject inputs that increase statistical probability of the group of entropy inputs. The best approach to this is also a hybrid. One method of estimating entropy quality usualy isn't enough. There are cases where an entropy source can be assumed to provide a consistant quality of entropy however. In these cases, a fixed size can be assigned to all entropy inputs from that source, but carefull analysis should be done before this assumption is made. It is wisest to calculate multiple estimates and assume the smallest value to be the most accurate. 1.3) Entropy Pools No entropy source should be assumed perfect. More specificaly, no entropy source should be assumed perfect on a computer. That is why entropy is gathered in a buffer (entropy pool) to undergo supplimentary processing. After entropy is gathered from a source, it is input into an entropy pool. The entropy pool must do several things with this input. It must keep track of the amount of entropy contained within it, mix the last input uniformaly with all the previous inputs contained within it, and provide an at least seamingly random state regardless of the quality of the entropy input (patternistic inputs should still look random in the pool). Mixing the contents of the entropy pool should neither sacrifice any of the entropy within it nor be considered to add entropy to its state. If the mixing function expands the pool, entropy estimation of its contents should not change. Only the entropy gathering functions are responsible for increasing entropy and are dealt with serperately. The best candidates for mixing functions are hashing algorithms. The hashing algorithm should accept any size input, and have a large sized output that reflects the speed at which entropy is gathered, and have a non-deterministic output. To preserve gathered entropy, the hashing function should not input more entropy than the size of it's output. With that said, if the hashing function outputs 160bits, it should not be input more than 160bits prior to output. If the hashing algorithm is cryptographically secure (which it should be) the output will yield the same amount of entropy as the input. If the output is larger than the input, the state of the pool cannot be assumed to have increased in entropy. There are several approaches to using large pools of entropy. One approach implments a pool that is hashed linearly. For this method, you would need a buffer that is concatinated with the last input of entropy. Hashing should be started at the end of the buffer. The rest of the buffer should be hashed, one chunk (the size of the output) at a time, each time XORing the output with the output of the last block's hash to ensure the entire pool is affected by the last input, without overwritting any previous entropy. This is only an examplar method. Whichever procedure you choose, it should meet all the criteria mentioned in the previous paragraphs. Another approach to maintaining a large entropy pool is using multiple hashed contexts which are used to affect each other. A common use is a pool that contains unmanipulated entropy. Once that pool is full, it is hashed and used to update another pool either by updating a hashing context or XORing. This is cascaded through as many pools as desired, but to avoid losing previous entropy, some pools should only be updated after it's parent pool (the one that updates it) has been updated a certain number of times. For example, once the first hashed pool has been updated 8 times, a second pool can be updated. Once the second hashed pool has been updated 3 times, it can update a third pool. With this method, the third pool contains entropy from the last 24 entropy updates. This conserves less entropy (limited by the size of the hashing contexts) but provides better quality entropy. Entropy is of better quality because the source of the entropy containted within the third pool is completly dependent on 24 entropy inputs. Inputing entropy into a pool is usualy called updating or seeding. Entropy pools combined with the output function by themselves are in fact PRNGs. What makes a RNG is the entropy gathering process which obtains truly random seeds. As long a good entropy is input, the RNG will have an infinite period (no output patterns) as oposed to PRNGs which have a semi-fixed point at whitch they will start to repeat all previous outputs in the same order. Entropy pools are the key to preventing any previous or future outputs of RNG from being predicted. Attacks against an RNG to determine previous and future outputs are either based on knowledge of the entropy pool, entropy inputs or previous outputs. The pool should be designed to prevent knowledge of its current state from compromising any or all future outputs. To do this, entropy pools should undergo a drastic change from time to time by removing protions or all of its entropy. This is called reseeding. Reseeding should _always_ replace the entropy that is removed with fresh entropy before outputing. If the entropy is not replaced, the pool will be in a severely weakened state. An RNG does not need to reseed, but if it doesn't, it must have entropy added at a rate greater than the RNG's output. Reseeding should only occur after sufficient unused entropy has been accumulated to fill a large portion of the pool, and the entropy estimation of the pool should be adjusted to the estimated size of the input entropy. Reseeding should not occur very often, and only based on the number of bits output by the RNG and the size of the pool. A safe estimation on the reseeding frequency of an RNG would be the after an 95% of the size of the entropy input has been output. This estimate assumes that entropy is added to the pool in between the RNG's outputs. If this is not the case, reseeding should occur more frequently. The less entropy is input between outputs, the better the chances that an attacker who has found one output will find the previous output (which can cascade backwards after each output is found). 1.4) Output Functions An RNG's output should be passed through a one-way function. A one-way function's output is derrived from its input, but that input is computationaly infeasable to derive from its output. One-way hash functions are perfect for this. More complex methods involve using portions of the pool as key data fed to a symmetric encryption algorithm that encrypts another portion of the pool and outputs the ciphertext. Expansion-compression is a very effective one-way function as well. To do this you can use portions of the pool as seeds to a PRNG and generate multiple outputs (each the size of the PRNG's seed) and then inputing all of these into a hash function and outputing its result. This is effective because many intermediate (expanded) states could result in the same hash output, but only one iniciate (before expansion) state can result in that intermediate state. Every time the RNG outputs, its entropy estimate should be decremented by the size of the output. This is done with the assumption that the output entirely consists of entropy. Because that output's entropy is still in the pool, it is now redundant and cannot be assumed as entropy (inside the pool) any longer. If the pool is 512bits in size, and 160bits of entropy is consumed on every output then almost all entropy hash been used after 3 outputs and the pool should be reseeded. There is a problem nearly impossible to overcome that occurs when implementing entropy pools: there is no way of determining what entropy bits were output, and which were not. The best way to nullify the symptomes of this problem is by making it impossible to know when entropy has been used more than once based on the the RNG's output. When an output occurs, the pool's state must be permuted so that consecutive outputs without any entropy added or reseeding will not result in identical RNG outputs. The pool's state permutation must be a one-way function and must apply the same concepts and criteria used in the output function. The pool's entropy size is always assumed to be identical after permutation as long as the procedure follows the criteria. 1.5) Implementation All the effort put into a well designed RNG is useless if it isn't properly implemented. Three layers of the implemetation will be covered: media, hardware/software, and usage of the output. Storage and communication media each represent a risk in an unencrypted state. The following lists various degrees of risk assigned to storage and communication media. Risks are assigned as such: 0 - no risk 1 - low risk 2 - medium risk 3 - high risk MEDIA RISK ------------------------------------ RAM <storage> 0 *& Hard Drive <storage> 1 *& Shared memory <transfer> 1 *& Removable disks <transfer> 2 LAN <communication> 2 & WAN <communication> 3 Any properly encrypted media's risk is 0. * If the storage media is on a computer connected to a network, risk is increased by 1. & If physical access is possible (computer/LAN)., risk is increased by 1. The highest risk of all medias should be interpreted as the implementation's risk (weakest link, good bye!). High risk is unacceptable. Medium risk is acceptable depending on the value of the RNG's output (what's it worth to an attacker?). A personal diary can easily cope with medium risk unless you have many skeletons in your closet. Industrial secrets should only use 0 risk RNGs. Acceptable risk is usualy up to the programmer, but the user should be aware of his choice. Hardware RNGs should be tamper-proof. If any physical modification is attempted, the RNG should no longer output. This precaution prevents manipulation of the entropy pool's state and output. There should be no way of monitoring hardware RNGs through frequencies, radiation, voltage, or any other emissions generated by the RNG. Any of these could be used as a source of information with whitch the RNG's entropy pool or output could be compromised. To prevent this, all hardware RNGs should be properly shielded. Software implementations can be very tricky. Reverse engineering will remain a problem until digital signing of executable files is implemented at the operating system level. Until then, any attempts made on the programmer's behalf to prevent reverse engineering of the RNG's software implementation will only delay the innevitable. It is still important that the programmer takes care in writting the software to have to lowest possible risk factor (the chart takes into account reverse engineering of software). // the following applies to RNGs seperate from their calling applications The RNG must take special care to ensure that only one program has access to each of the RNG's outputs. The method by which the data is transfered from the RNG to the program must not succomb to observation. Distinct outputs are usualy guarrentied by the output function, but sometimes the output is copied to a temporary buffer. It might be possible to trick an RNG into conserving that buffer, or copying it elsewhere providing easy observation. A quick solution is for an application to encrypt the RNG's output with a key it generates by its own means. However, you could go all out and implement a full key-escrow between the RNG and the calling applications and still be vulnerable to a hack. The kind of _prevention_ a programmer incorporates into software only serves as a road block, but this is often enough to discourage 99.9% of its users from attempting to compromise security. Not much can be done about 0.1% that can still manipulate the software because there will always be a way to crack software. 1.6) Analysis There are two important aspects to analysing an RNG: randomness and security. To evaluate an RNG's randomness, one usualy resorts to statistical analysis of the RNG's input (entropy gathering process) and output (output function). To evaluate it's security, one would look for flaws in its entropy gathering, entropy pool, mixing function, and output function that allow an attacker to find past, present, or future outputs by any means possible. There is no guarrentying the effectiveness of either of these aspects. The only certain thing is once the RNG is broken, it is broken; until then, you can only speculate. There are many statistical tests available on the internet suitable for testing randomness of data. Most require a large sample of data stored in a file to derive significant results. A Probabilistic value is obtained through statistical analysis of the sample. This value is usualy in the form of P, a floating point number between 0 and 1. Tests are done in various block sizes usualy between 8 and 32bits. P's precision varies from one test to the next. A P value close to 0.5 is what is usualy desired. When P is close to 0.5, probability is at it's midrange and there is no incline towards either 0 or 1. An RNG is not weak because it has a value close to 1 or 0. It can occur even with purely random data. If it were impossible to obtain a value close to 0 or 1, the RNG would be flawed anyway. This is because when data is completly random, all outputs are equaly likely. This is why patterned outputs are possible. When P is less then satisfactory, many new samples should be created and tested. If other samples result in bad Ps then the RNG most likely has deterministic output and should not be used. DieHard offers an armada of 15 tests that use P values. Other tests describe there results with an integer and it's target. The closer the integer is to its target the better. An example of this is the Maurer Universal Statistics Test. The problem with statistical tests is that any good PRNG or hashing function will pass them easily without any entropy. Even if the output is non-deterministic the RNG is only an RNG if it cannot be predicted. For that reason, the RNG's entropy must be non-deterministic as well. Unless the entropy source can be guarrentied to function properly, it is wise to use the same tests on the raw entropy itself. By doing this you can achieve a sufficient level of confidence about the randomness. A big speed-bump stares you right in the eyes when you're trying to do this, however. Entropy is often gathered at a very slow pace making the gathering of a sufficiently large data sample extremely tedius and in some circumstances it might not even be worthwhile. Whether this is the case or not, it is logical to intellegently scrutinise entropy sources, rather than depending on statistical tests (which cannot guarrenty anything) to find flaws (see 1.1). Evaluating an RNG's security is a complexe task with infinite means and only one end: a break. The odds are always well stacked against an RNG. No matter how many provisions are made to prevent breaks, new attacks will always eventualy emerge from that RNG or another. Every aspect of the RNG must be studied carefully, from entropy gathering right up to the delivery of the RNG's output. Every component should be tested individualy and then as a group. Tests include the possibility of hacks that can tamper with or monitor entropy gathering, and cryptanalysis of mixing and output functions. Most breaks are discovered under laboratory conditions. These are called academic breaks and they usualy require very specific conditions be met in order to function (usualy highly improbable). Finding these breaks is a broad topic on its own and is beyond of the scope in article. Successful breaks are usually the result of months (often years) of pain-staking work done by cryptanalysts with years of experience. The best thing to do is to carefully design the RNG from start to finish with security in mind. Even as the limits of mathematics and cryptanalysis are reached in testing, advancements in sience could reak havoc on your RNG. For example, Tempest scanning could be used by an attacker to follow keystrokes and mouse positions. Discoveries can even be made in the analysis of white noise, eventualy. These breaks are usualy found by scholars and professionals who seek only to make their knowledge available before damage occurs. Not much can be done to prevent attacks that are unknown. Finding an effective fix quickly and learning from the is what is expected from developers. Thankfully, these attacks emerge very rarely, but things are changing as research increases. Only the security analysis of the RNGs in section 2 will be discussed because each has already been tested for and passed randomness analysis. ----| 2 Description of specific RNGs 2.1) NoiseSpunge's Design Information Source: Uhhhh, I wrote it. 2.1.0) NoiseSpunge Overview NoiseSpunge was specifically written for generating random 256bit keys suitable for strong encryption. Gathering entropy for a single output (256bits) requires a few seconds of mouse movement on the user's part. Its structure is complex and computationaly expensive. NoiseSpunge is meant to be a component within cryptosystems, and for that reason, special consideration has to be made in order to prevent it from being a liability. The trade off in this implementation is it would be clumsy at best if large quantities of random data were needed regularly because it would require intense user-interaction and it would consume too many CPU cycles. 2.1.1) NoiseSpunge Entropy Gathering A PRNG is seeded with initial zeros. The PRNG then outputs a value used to calculate the length of the interval used. When the interval is triggered, the mouse position is checked for movement. If the mouse has moved since the last trigger the PC's high-frequency clock is queried for its current value. The 4 least significant bits are XORed with the 4 least significant bits of the mouse's x & y coordinates. A new interval is then calculated from the PRNG. The 4 bits produced are concatenated until 32 bits are gathered and output. The 32bits are concatenated to the an entropy buffer and also used to update the PRNG that sets the interval. The process is then repeated. If the mouse has not moved, a new interval is set and the process repeats until is has moved. There is also a function that allows the programmer to input 32bits of entropy at a time. This function is suitable if there is a hardware entropy device or another known secure source of entropy on a particular system. However, the use of another RNG's output would be redundant if it is good and useless if it is bad. 2.1.2) NoiseSpunge Entropy Estimation Entropy estimation is straight forward. The worst case scenario is assumed with each input. Only 4bits are gathered for every mouse capture. No further estimations are done because they would only yield results 4bits or greater. Entropy estimation for the supplementary function that allows the programmer to supply his own entropy requires the programmer to guarrantee his entropy is of good quality; estimation of this input's entropy is left in his hands. 2.1.3) NoiseSpunge Entropy Pool The internal state comprises 762bit. There is a 256bit seed, a 256bit primary hash, and a 256bit secondary hash. 256bit Haval is used as the hashing function. When a 32bit block of entropy is added, it is appended to a 256bit buffer. Once the buffer is full the primary hash is updated with it. The seed is XORed with The primary hash's output unless this is the 8th primary reseed. In that case, the primary hash's output is input into the secondary hash and that hash's output is permuted (see bellow) and replaces the seed. Seed permutation is accomplished by an expansion-compression. 32bit words of the seed are fed as a PRNG's random seed and used to output two 32bit words. All 512bits of the PRNG's output are hashed and replace the pool's seed. After every primary reseed, a KeyReserve counter is incremented and capped at 8. The KeyReserve reperesents the number of 256bit groups of entropy that have been added to the internal state. This KeyReserve is a rough estimate of when there is no longer any purpose to adding entropy into the pool and the entropy gathering thread can be paused (until the RNG outputs). 2.1.4) NoiseSpunge Output Function There are 2 methods provided for the RNG's output: safe and forced. A safe output makes sure the KeyReserve is not zeroed and decrements it after output. A forced output ignores the KeyReserve. To output, the seed is copied to a temporary buffer and is then permuted. The new seed is used a key to initialize Rijndael (symmetric block cipher). The temporary buffer is encrypted with Rijndael and then permuted with an expansion-compression (the same way the seed is). This is repeated for N rounds (chosen by the programmer) and the buffer is then output. 2.1.5) NoiseSpunge Analysis [1] The heavy relyance upon mouse movement could _starve_ the entropy pool if the mouse is not in use for an extended period of time. However, a counter prevents output when entropy is low. [2] The programmer could forcefully input poor quality entropy and weaken the RNG's internal state. [3] There are no provisions for systems without high-resolution timers. [4] Even though the pool's internal state is 762bits long, there is a maximum of 256bits of entropy at any state. (The other bits are only there to prevent back-tracking and to obfuscate the seed). That makes this RNG only suitable when small amounts of secure random data are needed. 2.2) Intel RNG's Design Information Source: Intel Random Number Generator White Paper *1 2.2.0) Intel RNG Overview The Intel RNG is system-wide. It is designed to provide good quality random data in massive quantities to any software that requires it. It's average throughput is 75Kb/s (bits). The Intel Security Driver provides a bridge between the middleware (CDSA, RSA-BSAFE, and Microsoft CryptoAPI) that will serve out the random numbers to requesting applications and the hardware. The hardware portion is in Intel's 810 chipset, and will be in the 82802 Firmware Hub Device for all future 8xx chipsets. {WARNING: these are some of my personal opinions; take them with a grain of salt} Intel has chosen to eloquantly label its RNG as a TRNG (True Random Number Generator), but then they go on to call it an RNG through the rest of the paper. Thechnicaly there is no fundamental difference that sets it asside from any other good RNG; it is a label for hype and has nothing to do with its ability to produce random numbers (RNG==TRNG & TRNG==RNG). As for your daily dose of corporate assurance: "The output of Intel RNG has completed post-design validation with Cryptography Research Inc. (CRI) and the Federal Information Processing (FIPS) Level 3 test for statistical randomness (FIPS 140-1)." I find it reassuring that a company (CRI) has analyzed and is supporting this RNG. That isn't something you see very often. On the other hand FIPS140-1 is just another hype generator. After reading FIPS140-1, one realises it has absolutely NOTHING to do with the quality of the RNG, but hey! Who cares? Microsoft seems to think it's good enough to use in their family of _high_quality_and_security_ products, so it must be great. All kidding asside, despite the corporate stench, this RNG is well designed and will prevent many RNG blunders such as Netscape's. I think this is a step in the right direction. Rather than letting Joe, Timmy his cousin, and Timmy's best friend's friend design their own RNGs, they provide a good solution for everyone without having them trip on their own feet like Netscape did. 2.2.1) Intel RNG Entropy Gathering Intel's Random Number Generator is to be integrated into PC motherboards. There are 2 resistors and 2 oscillators (one slow, the other fast). The voltage difference between the 2 resistors is amplified to sample thermal noise. This noise source is used to modulate the slow clock. This clock with variable modulation is used to set intervals between measurements of the fast clock. When the interval is triggered the frequency of the fast clock is then filtered through what Intel calls the von Neumann corrector (patent pending). The corrector compensates for the fast clocks bias towards staying in fixed bit states (regardless of the slow clock's variable modulation). It works by comparring pairs of bits and outputing only one or no bits ([1,0]=0; [0,1]=1; [0,0]or[1,1]=no output;). The output of the corrector is grouped in 32bit blocks and sent to the Intel Security Driver. 2.2.2) Intel RNG Entropy Estimation No estimations are done for a few reasons. Because the entropy source is hardware based, it cannot be manipulated unless it is put into temperatures far beyond or bellow resonable ambient conditions, or the computer's power is cut off (in which case the entropy gathering stops). Beyond that, all entropy is gathered in the same way and can be assumed of identical quality. 2.2.3) Intel RNG Entropy Pool The Intel Security Driver takes care of mixing the RNG's output. The pool is composed of 512bits of an SHA-1 hash contexts divided into two states. An 80bit hash of the first state is generated and appended with 32 bits of entropy (from the hardware) and the first 160bits from the first state to create the second state. When another 32bits of entropy are generated, the second state becomes the first state and the same process is repeated. 2.2.4) Intel RNG Output Function The last 16bits of the 80bit hash of the first state are output to the middleware. The Intel Security Driver ensures that each output is dispatched only once. If desired, additional processing of the output will have to be done by the program that requested the random data. 2.2.5) Intel RNG Analysis [1] The need to implement the von Neumann corrector is demonstration of the RNG's affinity for repetitive sequences. An attacker could calculate when 1s or 0s are disproportionatly output by estimating it's throughput in bits/sec, but this doesn't lead to any feasable attacks (yet). [2] The use of contracted middleware may lead to security holes. Before using a company's middleware, you may want to wait a few months just to see if a quick break is released. 2.3) Linux' /dev/random's Design Information Source: /dev/random source code *2 2.3.0) /dev/random Overview Linux provides the /dev/random character device as an interface for applications to recieve random data with good quality entropy. It provides a gernourously sized entropy pool (512 bytes) to accomodate the operating system and all software running on it. When quality entropy is not necessary, a second character device /dev/urandom is provided as a PRNG to avoid wastefully depleting /dev/random's entropy pool. 2.3.1) /dev/random Entropy Gathering External functions from the kernel trigger the addition of entropy into the pool. Events that trigger this are key presses, mouse movement, and IRQs. Uppon each trigger, 32bits of a high-frequency timer are copied, and another 32bits are derrived depending on the type of trigger (either the mouse coordinates, keybaord scancode, or IRQ number). 2.3.2) /dev/random Entropy Estimation Entropy estimation is calculated with the help of three deltas. Delta1 is the time elapsed since the last trigger of its type occured. Delta2 is the difference between Delta1 and the previous Delta1. Delta3 is the difference between Delta2 and the previous Delta2. The smallest of the three deltas calculated is chosen as Delta. The least significant bit of Delta is ignored and the next 12bits are used to increment the entropy counter. 2.3.3) /dev/random Entropy Pool This RNG uses an entropy pool of 4096bits. Prior to input, a marker denoting the current position along the pool is decremented by 2 32bit words. If the position is 0, the position is wrapped around backwards to the second last 32bit word. Entropy is added in two 32bit words: x & y. A variable, j determines how many bits to the left the entropy should be rotated. Before entropy is added, j is incremented by 14 (7 if the pool is in position 0). Entropy is rotated by j. Depending on the current position along the pool, y is XORed with 5 other fixed portions of the pool (the following positions are wrapped around from the current position: 103,76, 51,25,1 (for a 4096bit pool) and x is XORed with each next word. x is shifted to the right 3bits, XORed by a constant within a 1x7 table (0, 0x3b6e20c8, 0x76dc4190, 0x4db26158, 0xedb88320, 0xd6d6a3e8, 0x9b64c2b0, 0xa00ae278) the index of which is chosen by x AND 7 (bitwise, 3bits). x XOR y is then appended to the pool skipping one word. y is shifted to the right 3bits, XORed with the constant table the same way x was and then copied into the word that was skipped in the pool. The pool remains at this position (previous position - 2, possibly wrapped around the end). 2.3.4) /dev/random Output Function When output is requested from the RNG, the timer and the number of bytes requested is added to the pool as entropy. The pool is then hashed with SHA-1 and the first 2 words of the hash are fed as entropy into the pool; this is repeated 8 times, but each time the next 2 words of the hash are fed into the pool. The first half of the final hash is then XORed to its second half to produce the output. The output is either the requested size or 20 bytes (half the hash size); the smallest of these is chosen. 2.3.5) Linux' /dev/random Analysis [1] Monitoring and predicting of some IRQs is possible in a networked environment. [2] There is allot of redundancy in the lower 16bits of the entropy added. For example, when a keypress occurs a 32bit variable holds 16bits from a high-resolution timer, and the lower 16 bits are 0-255 for the keypress (256+ are used to designate interupts). This leaves 8bits of redundancy for every keypress. [3] The time elapsed since the last block of entropy was added is usually irrelevent to the quality of the entropy, unless that lapse is very short. This doesn't take into account sequencial entropy entries like continuous disk access while moving a file. [4] When output occurs, the mixing mechanism re-enters allot of hashed entropy which may or may not be of good quality. These re-entered words are added to the entropy count but should not. They are bits of entropy that have already been counted. After output, 512bits of entropy are redundantly entered. If this estimate is accurate, then after 8 calls to output there are 4096bits (the entire pool) of entropy of undifinable quality. Under these circumstances, if no entropy is input from user-interacting during the calls, the RNG becomes a PRNG. 2.4) Yarrow's Design information sources: Yarrow source code and White Papers *3,*4 2.4.0) Yarrow Overview Yarrow is designed by Bruce Schneier, auther of Applied Cryptography and designer of block ciphers Blowfish and AES finalist Twofish. Yarrow is Schneier's interpretation of the proper design of an RNG and is accompanied by a detailed paper descibing its inner-workings and analysis (see the second information source). It is the product of lengthy research and sets standard in properties expected to be found in a secure RNG. It is discussed here for comparisson between commonly trusted RNGs and one designed by a seasoned proffessional. 2.4.1) Yarrow Entropy Gathering System hooks wait for keyboard or mouse events. If a key has been pressed, the time elapsed since the last key-press is appended to an array. The same is done when a mouse button has been pressed. If the mouse has moved, the x and y coordinates are appended to a mouse movement array. Once an array is full is is passed to the entropy estimation function. 2.4.2) Yarrow Entropy Estimation The entropy estimation function is passed an estimated number of bits of entropy chosen by the programmer's bias towards it's source. One could decide that that mouse movement only represents 4 bits of entropy per movement, while keyboard latency is worth 8bits per key-press. Another measurement uses a small compression algorithm and measures the compressed size. The third and last measurement is half the size of the entropy sample. The smallest of these three measurements increments the entropy estimate. 2.4.3) Yarrow Entropy Pool When entropy is input, it is fed into a fast pool (SHA-1 context) and an entropy estimate is updated for that pool. Once the pool has accumulated 100bits of entropy, the hash output of this pool is fed into the slow pool and its entropy estimate is updated. When the slow pool has accumulated 160bits of entropy it's hash output becomes the current key. 2.4.4) Yarrow Output Function When output is required, the current key (derived from the slow pool) encrypts a counter (its number of bits is chosen by the programmer) and outputs the ciphertext; the counter is then incremented. After 10 outputs, the RNG reseeds the key by replacing it with another (forced) output. The key will next be reseeded either when the slow pool has accumulated 160bits or 10 outputs have occured. 2.4.5) Yarrow Analysis [1] Mouse movement on its own is very redundant, there is a very limited range of motion between the last postion and the current position after the OS has sent the message that the mouse has moved. Most of the bits representing the mouse's position are unlikely to change and throw-off the entropy estimates in this RNG. [2] Even though the pool's internal state is 320+n+kbits long, there is a maximum of 160bits of entropy during any state. "Yarrow-160, our current construction, is limited to at most 160 bits of security by the size of its entropy accumulation pools." *4 ----| 3) NoiseSpunge Source Code The Following source code is simply a brief example. Do whatever you want with it; even that thing you do with your tongue and the rubber ... never mind. It _WILL_NOT_COMPILE_ because about 1,200 lines have been omitted, consisting of Haval, Rijndael and the PRNG). Haval and Rijndael source code is readily available. Any PRNG will do, but make sure it works with 32bit inputs and outputs and has a period of at least 2^32 (4294967296). I've devided it into 3 chunks: entropy gathering, entropy pool, output functions. [ENTROPY GATHERING] This loop must run on a thread independent of the application's main thread. For OS dependancies, I've created dummy functions that should be replaced: int64 CounterFreq; //high-res counter's frequency/second int64 QueryCounter; //high-res counter's current value Delay(int ms); //1 milisecond precision delay int GetMouseX; //current mouse x coordinate int GetMouseY; // " y coordinate #define MOUSE_INTERVAL 10 { Prng_CTX PCtx; int x,y; unsigned long Block; unsigned long BitsGathered; int65 Interval,Frequency,ThisTime,LastTime; unsigned long BitsGathered=0; bool Idled=false; Frequency=CounterFreq; bool Terminated=false; //Set value to true to end the loop do { if (Idled==false) { Delay(MOUSE_INTERVAL); Idled=true; } ThisTime=QueryCounter; if ((ThisTime-LastTime)>Interval) { if ((x!=GetMouseX)&&(y!=GetMouseY) { x=mouse.cursorpos.x; y=mouse.cursorpos.y; Block|=((x^y^ThisTime)& 15)<<BitsGathered; BitsGathered+=4; if (BitsGathered==32) { PrngInit(&PCtx,Block); AddEntropy(Block); //this function is defined lower Block=0; BitsGathered=0; } Interval=((((Prng(@PCtx)%MOUSE_INTERVAL)>>2)+MOUSE_INTERVAL) * Frequency)/1000; } LastTime=QueryCounter; Idled=false; } } while (Terminated==false); } [ENTROPY POOL] #define SEED_SIZE 8 #define PRIMARY_RESEED 8 #define SECONDARY_RESEED 8 //parameters #define MAX_KEY_RESERVE 8 #define KEY_BUILD_ROUNDS 16 typedef unsigned long Key256[SEED_SIZE]; Key256 Seed; Key256 EntropyBuffer; Haval_CTX PrimaryPool; Haval_CTX SecondaryPool; unsigned char PrimaryReseedCount; unsigned char EntropyCount; unsigned char KeyReserve; //FUNCTIONS void NoiseSpungeInit { HavalInit(&PrimaryPool); HavalInit(&SecondaryPool); for (int i=0;i<8;i++) Seed[i]=0; EntropyCount=0; PrimaryReseedCount=0; KeyReserve=0; } void PermuteSeed { Key256 TempBuffer[2]; Prng_CTX PCtx; Haval_CTX HCtx; for (int i=0;i<SEED_SIZE;i++) //expand { PrngInit(&PCtx,Seed[i]); TempBuffer[0][i]=Prng(&PCtx); TempBuffer[1][i]=Prng(&PCtx); } HavalInit(&HCtx); HavalUpdate(&HCtx,&TempBuffer,64); //compress HavalOutput(&HCtx,&Seed); } void PrimaryReseed { Key256 TempSeed; HavalUpdate(&PrimaryPool,&EntropyBuffer,32); if (PrimaryReseedCount<SECONDARY_RESEED) { HavalOutput(&PrimaryPool,&TempSeed); for (int i=0;i<SEED_SIZE;i++) Seed[i]^=TempSeed[i]; PrimaryReseedCount++; } else SecondaryReseed; for (int i=0;i<SEED_SIZE;i++) EntropyBuffer[i]=0; if (KeyReserve<MAX_KEY_RESERVE) KeyReserve++; EntropyCount=0; } void SecondaryReseed { HavalOutput(&PrimaryPool,&Seed); HavalUpdate(&SecondaryPool,&Seed,32); HavalOutput(&SecondaryPool,&Seed); PermuteSeed; HavalInit(&PrimaryPool); PrimaryReseedCount=0; } void AddEntropy(unsigned long Block) { EntropyBuffer[EntropyCount++]=Block; if (EntropyCount==PRIMARY_RESEED) PrimaryReseed; } [OUTPUT FUNCTIONS] int SafeGetKey(Key256 *Key) { Key256 TempSeed; Key256 TempBuffer[2]; Rijndael_CTX RCtx; Prng_CTX PCtx; Haval_CTX HCtx; if (KeyReserve==0) Return 0; for (int i=0;i<SEED_SIZE;i++) TempSeed[i]=Seed[i]; PermuteSeed; RijndaelInit(&RCtx,&Seed); for (int i=0;i<KEY_BUILD_ROUNDS;i++) { RijndaelEncrypt(&RCtx,&TempSeed[0]); //encrypt RijndaelEncrypt(&RCtx,&TempSeed[4]); for (int j=0;j<SEED_SIZE;j++) //expand { PrngInit(&pctx,TempSeed[j]); TempBuffer[0,j]=Prng(&PCtx); TempBuffer[1,j]=Prng(&PCtx); } HavalInit(&HCtx); HavalUpdate(&HCtx,&TempBuffer,64); HavalOutput(&HCtx,&TempSeed); } for (int i=0;i<SEED_SIZE;i++) Key[i]=TempSeed[i]; if (KeyReserve>0) KeyReserve--; Return 1; } void ForcedGetKey(Key256 *Key) { Key256 TempSeed; Key256 TempBuffer[2]; Rijndael_CTX RCtx; Prng_CTX PCtx; Haval_CTX HCtx; for (int i=0;i<SEED_SIZE;i++) TempSeed[i]=Seed[i]; PermuteSeed; RijndaelInit(&RCtx,&Seed); for (int i=0;i<KEY_BUILD_ROUNDS;i++) { RijndaelEncrypt(&RCtx,&TempSeed[0]); //encrypt RijndaelEncrypt(&RCtx,&TempSeed[4]); for (int j=0;j<SEED_SIZE;j++) //expand { PrngInit(&pctx,TempSeed[j]); TempBuffer[0,j]=Prng(&PCtx); TempBuffer[1,j]=Prng(&PCtx); } HavalInit(&HCtx); HavalUpdate(&HCtx,&TempBuffer,64); HavalOutput(&HCtx,&TempSeed); } for (int i=0;i<SEED_SIZE;i++) Key[i]=TempSeed[i]; if (KeyReserve>0) KeyReserve--; } ----| 4) References *1 Intel Random Number Generator White Paper http://developer.intel.com/design/security/rng/CRIwp.htm *2 /dev/random source code http://www.openpgp.net/random/ *3 Yarrow source code http://www.counterpane.com/Yarrow0.8.71.zip *4 Yarrow-160: Notes on the Design and Analysis of the Yarrow Cryptographic Pseudorandom Number Generator http://www.counterpane.com/yarrow-notes.html