How to choose secure passwords

Newsgroups: sci.crypt
Path: netcom.com!grady
From: grady@netcom.com (Grady Ward)
Subject: Passphrase proto-FAQ
Message-ID: <gradyCBIx4n.6n8@netcom.com>
Organization: Moby lexicons
X-Newsreader: TIN [version 1.1 PL8]
Date: Tue, 10 Aug 1993 03:17:11 GMT
Lines: 224

FAQ: How do I choose a good password or phrase?
 
 
ANS: shocking nonsense makes the most sense
 
	With the intrinsic strength of some of the modern 
encryption, authentication, and message digest algorithms 
such as RSA, MD5, SHS and IDEA the user password or phrase 
is becoming more and more the focus of vulnerability.
 
	Considering even the early PGP 1.0 application for example, 
a Deputy with the Los Angeles County Sheriff's Department
admitted in early 1993 that both they and the FBI despaired 
of breaking the system except through a successful 
dictionary attack (trying many possible passwords or 
phrases from lists of probable choices and their 
variations) rather than "breaking" the underlying 
cryptographic algorithm mathematically.
 
	The fundamental reason why attacking or trying to 
guess the user's password or phrase will increasingly be 
the focus of cryptanalysis is that the user's choice of 
password may represent a much simpler cryptographic key 
than optimal for the encryption algorithm. This weakness of 
the user's password choice provides the cryptanalytic 
wedge.
 
	For example, suppose a user chooses the password 
'david.' On the surface the entropy of this key (or the 
number of different equiprobable key states) appears to be 
five characters chosen from a set of twenty-six with
replacement: 26^5 or 1.188 x 10^7. But since the user is
apparently biased toward common given names, the majority
of which appear in lists numbering only 6,000-7,000
entries, the true entropy is undoubtedly much closer to 6.5
x 10^3, or about three orders of magnitude smaller than the
raw length might suggest. (In fact this password probably
possesses a much smaller entropy than even this, for the
very common name "david" would be one of the first names to 
be checked by an optimized dictionary attack program.) In 
other words, "entropy" is not a fixed physical quantity: 
the cryptanalyst can exploit whole meanings and contexts, 
not just byte frequencies, digraphs, or even whole-word 
correlations to reduce the entropy of the key space he or 
she is trying to explore.
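
	The arithmetic above can be checked with a minimal
sketch. The function name entropy_bits and the figure of
6,500 names (a midpoint of the 6,000-7,000 range) are
assumptions made for illustration; both counts treat every
choice as equally likely, which overstates the entropy of a
very common name.

```python
import math

def entropy_bits(states: int) -> float:
    """Bits of entropy for `states` equally likely choices."""
    return math.log2(states)

# Five lowercase letters drawn with replacement from a set of 26.
raw_states = 26 ** 5

# Assumed midpoint of the 6,000-7,000 entry name lists (illustrative).
name_states = 6500

print(f"raw:   {raw_states} states, {entropy_bits(raw_states):.1f} bits")
print(f"names: {name_states} states, {entropy_bits(name_states):.1f} bits")
```

	Expressed in bits, the raw estimate is about 23.5 while
the name-list estimate is about 12.7.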
 
	To thwart this avenue of attack we would like to 
discover a method of selecting passwords or phrases that 
have at least as many bits of entropy (or "hard-to-
guessness") as the entropy of the cryptographic key of the 
underlying algorithm being used.
 
	To compare, DES (Data Encryption Standard) is believed
to have about 54-55 bits (~4 x 10^16) of entropy while the
IDEA algorithm is believed to have about 128 bits (~3.4 x
10^38) of entropy. The closer the entropy of the user's
password or phrase is to the intrinsic entropy of the
cryptographic key of the underlying algorithm being used,
the larger the portion of the algorithm's key space an
attacker must search in order to discover it.
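
	To give a feel for these key sizes, the sketch below
estimates how many uniformly random symbols are needed to
reach a given number of bits. The name symbols_needed and
the 4096-word vocabulary are hypothetical, and the result
is a lower bound: symbols picked by a person are not
uniformly random and carry less entropy.

```python
import math

def symbols_needed(key_bits: int, pool_size: int) -> int:
    """Smallest count of uniformly random symbols, drawn from a pool of
    `pool_size`, whose combined entropy reaches `key_bits` bits."""
    return math.ceil(key_bits / math.log2(pool_size))

for name, bits in (("DES", 55), ("IDEA", 128)):
    print(f"{name}: 2^{bits} = {float(2 ** bits):.1e} keys; "
          f"{symbols_needed(bits, 26)} random lowercase letters or "
          f"{symbols_needed(bits, 4096)} random words from a 4096-word list")
```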
 
	Unfortunately many documents suggest choosing 
passwords or phrases that are distinctly inferior to the 
latest methods. For example, one white paper widely 
archived on the internet suggests selecting an original 
password by constructing an acronym from a popular song 
lyric or from a line of script from, for example, the SF 
movie "Star Wars". Both of these ideas turn out to be weak 
because both the entire script to Star Wars and entire
sets of song lyrics to thousands of popular songs are
available on-line to everyone and, in some cases, are
already embedded into "crack" dictionary attack programs.
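
	The weakness is easy to see in a small sketch: once a
corpus is on-line, turning every line of it into an acronym
is trivial. The function names and the two-line corpus
below are made up for illustration and do not describe any
particular crack program.

```python
def acronym(line: str) -> str:
    """First character of each word, lowercased; skips words that
    start with punctuation."""
    return "".join(w[0].lower() for w in line.split() if w[0].isalnum())

def acronym_dictionary(corpus_lines):
    """Set of acronyms for every non-blank line of a corpus."""
    return {acronym(line) for line in corpus_lines if line.strip()}

# Hypothetical stand-in for an on-line script or lyric archive.
corpus = [
    "A long time ago in a galaxy far far away",
    "Twinkle twinkle little star how I wonder what you are",
]

guesses = acronym_dictionary(corpus)
print("altaiagffa" in guesses)  # True: the acronym is already a guess
```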
 
	However, resolving the conflict between choosing an
easy-to-remember key and choosing a key with a high level
of entropy is not a hopeless task if we exploit mnemonic
devices that have been known for a long time outside the
field of cryptography. With the goal of making up a
passphrase not included in any existing corpus yet very
easy to remember, an effective technique is the one known
as "shocking nonsense."
 
	"Shocking nonsense" means to make up a short phrase or 
sentence that is both nonsensical and shocking in the 
culture of the user; that is, it contains grossly obscene,
racist, impossible, or otherwise extreme juxtapositions of
ideas. This technique is permissible because the
passphrase, by its nature, ought never to be revealed to 
anyone with sensibilities to be offended.
 
	Further, shocking nonsense is unlikely to be
duplicated anywhere because it does not describe a
matter-of-fact that could be accidentally rediscovered by
anyone else, and the emotional evocation makes it difficult
for the creator to forget. A relatively mild example of
such shocking nonsense might be: "mollusks peck my
galloping genitals." The reader can undoubtedly make up
many far more shocking examples for himself or herself...
 
	Even relatively short phrases offer acceptable entropy
because the "alphabet" pool of word symbols that may be
chosen is far larger than the set of characters from the
Roman alphabet. Even choosing from a vocabulary of a few
thousand words, a five-word phrase might have on the order
of 58 to 60 bits of entropy -- more than what is needed for
the DES algorithm, for example. If an entire phrase cannot
be used because the password is restricted to, say, eight
alphanumeric characters, concatenating the first letters of
a suitable shocking nonsense passphrase should usually give
a better than reasonable starting point if followed by
adding numeric and non-alphabetic characters.
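
	The 58 to 60 bit figure can be reproduced with a short
sketch, assuming each word is an independent, uniformly
random pick from the vocabulary. The name passphrase_bits
and the vocabulary sizes of 3000 and 4096 are illustrative
choices for "a few thousand words"; a phrase composed by a
person will fall somewhat below this upper bound.

```python
import math

def passphrase_bits(num_symbols: int, pool_size: int) -> float:
    """Entropy in bits of `num_symbols` independent, uniformly random
    picks from a pool of `pool_size` symbols."""
    return num_symbols * math.log2(pool_size)

# Five-word phrases from vocabularies of a few thousand words.
for vocab in (3000, 4096):
    print(f"5 words from {vocab}: {passphrase_bits(5, vocab):.1f} bits")

# For comparison: eight random lowercase letters.
print(f"8 letters from 26: {passphrase_bits(8, 26):.1f} bits")
```

	Five words from 3000 give about 57.8 bits and five from
4096 give 60 bits, against about 37.6 bits for eight random
lowercase letters.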
 
	When you are permitted to use passphrases of arbitrary 
length (in PGP for example) it is not necessary to further 
perturb your 'shocking nonsense' passphrase to include 
numbers or special symbols because the pool of word choices
is already very large. Not needing those special symbols or
numbers (that are not intrinsically meaningful) makes the 
shocking nonsense passphrase that much easier to remember.
 
 
Appendix A.  For software developers
 
	For software developers designing "front-ends" or user 
interfaces to conventional short-password applications, 
very good results will come from permitting the user 
arbitrary length passphrases that are then "crunched" or 
processed using a strong digest algorithm such as the 160-
bit SHS (Secure Hash Standard) or the 128-bit MD5 (Message 
Digest rev. 5). The interface program then chooses the 
appropriate number of bits from the digest and supplies 
them to the engine enforcing a short password. This 'key 
crunching' technique will assure the developer that even 
the short password key space will have a far greater 
opportunity of being fully exploited by the user.
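
	A minimal sketch of such key crunching follows. It
uses Python's hashlib, with SHA-1 standing in for the SHS
digest named above (an assumption: hashlib does not offer
the 1993 SHS itself). The function name crunch and its
parameters are hypothetical, and taking the leading bits of
the digest is only one way of choosing "the appropriate
number of bits."

```python
import hashlib

def crunch(passphrase: str, nbits: int, algo: str = "sha1") -> bytes:
    """Hash an arbitrary-length passphrase and keep the leading `nbits`
    bits of the digest, padded with zero bits to a whole byte."""
    digest = hashlib.new(algo, passphrase.encode("utf-8")).digest()
    if not 0 < nbits <= len(digest) * 8:
        raise ValueError("nbits must be between 1 and the digest size")
    nbytes = (nbits + 7) // 8
    key = bytearray(digest[:nbytes])
    unused = nbytes * 8 - nbits
    if unused:
        # Clear the low-order bits of the last byte that were not requested.
        key[-1] &= (0xFF << unused) & 0xFF
    return bytes(key)

# 56 bits, e.g. for an engine that accepts a DES-sized key.
print(crunch("abc", 56).hex())
```

	The same passphrase always yields the same key, so the
user remembers only the phrase while the short-password
engine receives bits drawn from the whole digest.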
 
 
Appendix B. A tool to experimentally investigate entropy
 
	A practical Unix tool for investigating the entropy of 
typical user keys can be found in Wu and Manber's 'agrep' 
(approximate grep) similarity pattern matching tool 
available in C source from cs.arizona.edu [192.12.69.5]. 
This tool can determine the "edit distance," that is, the 
number of insertions, substitutions, or deletions that 
would be required of an arbitrary pattern in order for it 
to match any of a large corpus of words or phrases, say the
/usr/dict/words list or the set of Star Trek trivia
archives. The user can then adjust the pattern to give an
arbitrarily high threshold difference between it and common
words and phrases in the corpus to make crack programs that 
systematically vary known strings less likely to succeed. 
It is often surprising to discover that a substring pattern 
like "hxirtes" is only of edit distance two from as many as 
forty separate words ranging from "bushfires" to "whitest." 
Certainly no password or phrase ought to be chosen as a
working password or phrase that is within an edit distance
of two or less of a known string or substring in any
on-line collection.
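
	For readers without agrep at hand, the same kind of
check can be sketched in a few lines. This is not agrep's
algorithm; it is the textbook dynamic-programming method
for approximate substring matching, and the function names
and the three-word list are made up for illustration.

```python
def min_substring_distance(pattern: str, text: str) -> int:
    """Fewest insertions, deletions, or substitutions needed to make
    `pattern` equal to some substring of `text`."""
    # Row for the empty pattern: a match may start anywhere at no cost.
    prev = [0] * (len(text) + 1)
    for i, pc in enumerate(pattern, start=1):
        # Against empty text, i pattern characters must be deleted.
        cur = [i] + [0] * len(text)
        for j, tc in enumerate(text, start=1):
            cost = 0 if pc == tc else 1
            cur[j] = min(prev[j] + 1,         # drop a pattern character
                         cur[j - 1] + 1,      # absorb a text character
                         prev[j - 1] + cost)  # match or substitute
        prev = cur
    # The best match may end at any position in the text.
    return min(prev)

def near_matches(candidate: str, corpus, max_distance: int = 2):
    """Corpus entries containing a substring within `max_distance`
    edits of `candidate`."""
    return [w for w in corpus
            if min_substring_distance(candidate, w) <= max_distance]

# Hypothetical stand-in for a large word list.
words = ["bushfires", "whitest", "zebra"]
print(near_matches("hxirtes", words))  # ['bushfires', 'whitest']
```

	A candidate password for which near_matches returns
anything at a distance of two or less would fail the rule
stated above.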
 
 
select references
 
[selection of passwords in differing threat environments]
Department of Defense Password Management Guideline
CSC-STD-002-85
published by the Computer Security Center of the Department 
of Defense Fort George G. Meade, MD 20755
 
[discovering weak passwords]
The COPS Security Checker System by D. Farmer, E. Spafford
Purdue University Technical Report CSD-TR-993
West Lafayette, IN 47907
 
[an example of automated key cracking]
With Microscope and Tweezers:
An Analysis of the Internet Virus of 1988
by M. Eichin, J. Rochlis, Massachusetts Institute of 
Technology Cambridge, MA 02139
 
[password vulnerabilities in distributed systems]
Computer Emergency Response - An International Problem
by R. Pethia, K. van Wyk CERT/Software Engineering 
Institute
Carnegie Mellon University, Pittsburgh, PA 15213
 
[key metrics and the MD5 message digest algorithm]
Answers to Frequently Asked Questions About Today's 
Cryptography by Paul Fahn
RSA Laboratories, Redwood City, CA 94065
(available through anonymous FTP from rsa.com)
 
[implementation details of the MD5 message digest 
algorithm]
RFC-1321 ('request for comments') The MD5 algorithm
by R. Rivest, MIT Laboratory for Computer Science
(available on the internet from gatekeeper.dec.com)
 
[implementation details of the NIST Secure Hash Standard]
The Secure Hash Standard (SHS) Specification, Jan 1992 
DRAFT
Federal Information Processing Standards Publication YY
Director, Computer Systems Laboratory
National Institute of Standards and Technology
Gaithersburg, MD 20899
(The SHS was approved as a Federal Standard in May, 1993)
 
[other possible approaches to password generation]
Automated Password Generator, NIST publication ????
Director, Computer Systems Laboratory
National Institute of Standards and Technology
Gaithersburg, MD 20899
(a pronounceable password algorithm using DES)
 
v 1.0 alpha
comments on this FAQ are solicited; e-mail grady@netcom.com

-- 
grady@netcom.com  voice/fax (707) 826-7715 
compiler of Moby lexical databases, including
Moby Part-of-Speech, second edition: 230,000 entries, priority marked 
finger grady@netcom.com or e-mail for more information.

--