TUCoPS :: Crypto :: cssanlys.txt

Cryptanalysis of a Contents Scrambling System by Frank A. Stevenson

<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
   <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
   <meta name="Author" content="Frank A. Stevenson">
   <meta name="GENERATOR" content="Mozilla/4.6 [en] (Win98; U) [Netscape]">
   <title>Analysis of DVD ContentsScrambling System</title>
<body text="#000000" bgcolor="#FFFFFF" link="#0000EE" vlink="#551A8B" alink="#FF0000">
<p><font size=+3>Cryptanalysis of Contents Scrambling System,</font>
<br><i>Frank A. Stevenson ( frank@funcom.com )</i>
<p><i><b>Abstract: </b>CSS is a scrambling system used in the distribution
for movies on DVD ( Digital Versatile Disc ) a high capacity CD like storage
system. Its main purpose is to prevent the unauthorized duplication of
disc contents. This is achieved through encrypting the files, and storing
keys in hardware. Here we will describe the system, and show that even
if the keys can be securely stored in hardware, the data will not be protected
from unauthorized copying. Severe weaknesses in the ciphers effectively
voids the need for the hardware keys when decrypting the content.</i>
<p><b>8th November 1999</b>
<br><b>(updated 13th Nov.)</b></center>

<hr WIDTH="80%">
<br><font size=+2>0 General disclaimer.</font>
<blockquote><font size=+1>This information is provided as is, with no warranties
on its accuracy or usability. It is based on a piece of source code claiming
to be the css algorithms, and which have since been confirmed to interoperate
with the CSS system. The author has not read any official CSS documentation,
and any errors in the terminology is a result of this. This information
has not to the knowledge of the author been made available through breaches
of the DVD consortium Non Disclosure Agreement.</font></blockquote>

<p><br><font size=+2>1 System overview.</font>
<blockquote><font size=+1>Every DVD player is equipped with a small set
of player keys. When presented with a new disc, the player will attempt
to decrypt the contents with the set of keys it possesses. Every disk has
a disk key data block that is organized as follows:</font>
<font size=+1>5 bytes hash of decrypted disk key ( <i>hash</i> )</font></li>

<font size=+1>disk key encrypted with player key 1 (<i>dk<sub>1</sub> </i>)</font></li>

<font size=+1>disk key encrypted with player key 2 (<i>dk<sub>2</sub> </i>)</font></li>

<font size=+1>...</font></li>

<font size=+1>disk key encrypted with player key 409 (<i>dk<sub>409</sub></i>)</font></li>
<font size=+1>Suppose the player has a valid key for slot 213, it will
<br><font size=+1>(1)<i>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; K<sub>d</sub>
= D</i></font><i><sub><font size=-1>A</font></sub><font size=+1>( dk<sub>213</sub>
, K<sub>p213</sub> )</font></i>
<p><font size=+1>To verify that <i>K<sub>d</sub> </i>is correct, the following
check is done, if the check fails, it will try the next player key.</font>
<br><font size=+1>(2)<i>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; K<sub>d</sub>
= D</i></font><i><sub><font size=-1>A</font></sub><font size=+1>( hash
, K<sub>d</sub> )</font></i>
<p><font size=+1>An obvious weakness stems from this check, by trying all
2</font><sup>40</sup><font size=+1> possible <i>K<sub>d</sub></i>, disk
key<i><sub> </sub></i>can be deduced without knowing any valid player key.
As will be shown later, this attack can be carried out with a complexity
of&nbsp; 2</font><sup>25</sup><font size=+1>, making such an attack feasible
in runtime applications.&nbsp; Another obvious attack is that by having
1 working player key, other player keys can be derived&nbsp; through a
similar search. This can be done offline, also keys obtained from the former
attack can be used as a starting point.</font>
<p><font size=+1>To decrypt the contents an additional key <i>tk</i> -
the title key is decrypted with the now decrypted and verified disk key.</font>
<p><font size=+1>(3)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <i>K<sub>t</sub>
= D</i></font><i><sub><font size=-1>B</font></sub><font size=+1>( tk, K<sub>d</sub>)</font></i>
<p><font size=+1>Each sector of the data files is the optionally encrypted
by a key that is derived from <i>K<sub>t </sub></i>by exclusive or of specified
bytes from the unencrypted first 128 bytes of the 2048 bytes sector. The
decryption is done with the CSS stream cipher primitive described in section
<font size=+2>2 CSS streamcipher primitive:</font>
<blockquote><font size=+1>The CSS streamcipher is a very simplistic one,
based on 2 LFSRs being added together to produce output bytes. There is
no truncation, both LFSR are clocked 8 times for every byte output, and
there are 4 ways of combining the output of the LFSRs to an output byte.
These four modes are just settings on 2 inverter switches, and the modes
operation are used for the following purposes.</font>
<font size=+1>Authentication to DVD drive ( not discussed )</font></li>

<font size=+1>Decryption of Disk key (<i>D</i></font><i><sub><font size=-1>A</font></sub></i><font size=+1>)</font></li>

<font size=+1>Decryption of Title key (<i>D</i></font><i><sub><font size=-1>B</font></sub></i><font size=+1>)</font></li>

<font size=+1>Decryption of data blocks.</font></li>
<font size=+1>LFSR1: 17 bits ? taps, and is initialized by the 2 first
bytes of key, and setting the most significant bit to 1 to prevent null
<br><font size=+1>LFSR2: 25 bits 4 taps, is initialized with byte 3,4,5
of the key shifting all but the 3 least significant bits up 1 position,
and setting bit 4 to prevent null cycling.</font>
<p><font size=+1>As new bits are clocked into the LFSRs, the same bits
are clocked in with reversed order to the two LFSRs output bytes. ( With
optional inversion of bits. )</font>
<p><font size=+1>The output of LFSR1 is <i>O<sub>1</sub>(1), O<sub>1</sub>(2),
O<sub>1</sub>(3) ...</i></font>
<br><font size=+1>Likewise LFSR2 produces <i>O<sub>2</sub>(1), O<sub>2</sub>(2),
O<sub>2</sub>(3) ...</i></font>
<p><font size=+1>These two streams are combined through 8 bits addition
with carry carried over to the next output. The carry bit is zero at start
of stream.</font>
<br><font size=+1>(4)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <i>O(i)
= O<sub>1</sub>(i) + O<sub>2</sub>(i) + c&nbsp;&nbsp;&nbsp;&nbsp; </i>where<i>
c </i>is carry bit from<i> O(i-1)</i></font>
<p><font size=+1>This streamcipher is very weak, a trivial 2</font><sup>16</sup><font size=+1>
attack is possible with output bytes known for <i>i = {1,2,3,4,5,6}</i>.
Guess the initial state of LFSR1, and clock out 4 bytes. <i>O<sub>2</sub>(1),
O<sub>2</sub>(2), O<sub>2</sub>(3), O<sub>2</sub>(4) </i>can then be uniquely
determined, and from them the state at <i>i=4</i> is fully known. The guess
on LFSR1 can then be verified by clocking out 2 or more bytes of the cipher
and comparing the result.</font>
<p><font size=+1>Another important attack is the case when only <i>O(i)</i>
for <i>i = {1,2,3,4,5} </i>is known. Guess the initial state of LFSR1,
and clock out 3 bytes. Now <i>O<sub>2</sub>(1), O<sub>2</sub>(2) </i>and
</i>can be found as in the above attack. This will reveal all but the most
significant bit of LFSR2s state at <i>i=3</i>. If both possible settings
for MSB is tried, and LFSR2 is clocked backwards 24 steps, a state where
bit 4 is set at <i>i=1</i> can always be found. ( This is stated without
proof ). Select the setting of the most significant bit for LFSR2 such
that LFSR2 is in a legal state at <i>i=1</i>, and clock out two more bytes
to verify the guess of LFSR1. For some values of <i>O( i = {1,2,3,4,5}
)</i> multiple start states can be found, and for others none. Selecting
the correct start state is not a problem, as this attack is used in situations
where only the first five output bytes are of significance ( encryption
of keys ).</font>
<font size=+2>3 CSS mangling step:</font>
<blockquote><font size=+1>When the CSS streamcipher is used to encrypt
keys such as in <i>D</i></font><i><sub>A</sub><font size=+1>(data,key)</font></i><font size=+1>
and <i>D</i></font><i><sub>B</sub><font size=+1>(data,key)</font></i><font size=+1>,
an additional mangling step is performed on the data. This cipher is best
illustrated with the following block diagram:</font>
<font size=+1>A(1,2,3,4,5) are the input bytes (data)</font></li>

<font size=+1>C(1,2,3,4,5) are the output bytes (data)</font></li>

<i><font size=+1>k<sub>i</sub> = O(i)</font></i><font size=+1> where <i>O(i={1,2,3,4,5})</i>
is streamcipher output from key</font></li>

<font size=+1>B(1,2,3,4,5) are temporary stages</font></li>
<font size=+1>The cipher is evaluated top down, with exceptions indicated
by an arrow.</font></blockquote>

<p><img SRC="Mangle.png" height=400 width=600></center>

<br><font size=+1>Examples of evaluating cipher:</font>
<font size=+1><i>B(j) = </i>xor<i>( <b>F</b>( A(j) ) , A(j-1) , k<sub>j</sub>
) </i>for <i>j = {2,3,4,5}</i></font></li>

<font size=+1><i>B(1) = </i>xor <i>( <b>F</b>( A(1)) , B(5), k<sub>1</sub>

<font size=+1><i>C(j) = </i>xor<i>( <b>F</b>( B(j) ) , B(j-1) , k<sub>j</sub>
) </i>for <i>j = {2,3,4,5}</i></font></li>

<font size=+1><i>C(1) = </i>xor <i>( <b>F</b>( B(1)) , k<sub>1</sub> )</i></font></li>
<font size=+1><b><i>F</i></b> is a function, defined by a byte permutation
table. With known cipher and plaintext, the whole cipher unravels with
a minimal amount of work. Here is how:</font>
<font size=+1>Make a guess on k<sub>5</sub></font></li>

<font size=+1><i>B(5) = </i>xor<i>( <b>F</b>( A(5) ) , A(4) , k<sub>5</sub>

<font size=+1><i>B(4) = </i>xor<i>( <b>F</b>( B(5) ) , C(5), k<sub>5</sub>

<i><font size=+1>k<sub>4</sub> = </font></i><font size=+1>xor<i>( <b>F</b>(
A(4) ) , A(3) , B(4) )</i></font></li>

<font size=+1><i>B(3) = </i>xor<i>( <b>F</b>( B(4) ) , C(4), k<sub>4</sub>

<i><font size=+1>k<sub>3</sub> = </font></i><font size=+1>xor<i>( <b>F</b>(
A(3) ) , A(2) , B(3) )</i></font></li>

<font size=+1><i>B(2) = </i>xor<i>( <b>F</b>( B(3) ) , C(3), k<sub>3</sub>

<i><font size=+1>k<sub>2</sub> = </font></i><font size=+1>xor<i>( <b>F</b>(
A(2) ) , A(1) , B(2) )</i></font></li>

<font size=+1><i>B(1) = </i>xor<i>( <b>F</b>( B(2) ) , C(2), k<sub>2</sub>

<i><font size=+1>k<sub>1</sub> = </font></i><font size=+1>xor<i>( <b>F</b>(
A(1) ) , B(5) , B(1) )</i></font></li>

<font size=+1>verify by checking <i>C(1) = </i>xor <i>( <b>F</b>( B(1)
, k<sub>1</sub> )</i></font></li>
<font size=+1>Thus by trying 256 possibilities, we can recover 5 output
bytes from the CSS streamcipher, and so recover the key by using the five
known output bytes. This attack can be put to immediate use for recovering
other player keys as in the notes to eqn. 2,3. Even if the player key is
not recovered through the reversal of the stream cipher, the output of
the streamcipher is known, and will still be usefull for decrypting disks
that employ other player keys.</font></blockquote>
<font size=+2>4 Attacking the hash of the disk key.</font>
<blockquote><font size=+1>Reversing the hash at the start of the disc key
block is somewhat more complicated. From (2) we see that only the hash
value is known, the problem is finding a disk key such that the decrypted
hash equals the disk key itself. An attack of complexity 2</font><sup>25</sup><font size=+1>
proceeds as follows.</font>
<p><font size=+1>First some aspects on the value of <i>k<sub>2</sub></i>
will have to be considered. <i>A(1)</i> and <i>A(2)</i> is known, and a
table can be build by running through every possible combination of <i>k<sub>2
calculate the resulting <i>C(2)</i>. When trying to build a table of possible
values <i>k<sub>2 </sub></i>of indexed by <i>C(2)</i> and <i>B(1)</i> there
will be many values that map to the same set of indices, so a the table
must be able to hold several values of <i>k<sub>2 </sub></i>in each location.</font>
<p><font size=+1>Guess the start state of LFSR1, calculate <i>O<sub>1</sub>(
i = {1,2,3,4,5} ) </i>. Next guess <i>B(1)</i> and complete the following
<i><font size=+1>k<sub>1</sub> = </font></i><font size=+1>xor<i>( <b>F</b>(
B( 1 ) ) , C(1) )</i>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <i>C(1,2)</i>
is known, they are the start state of LFSR1</font></li>

<font size=+1><i>B(5)</i> = xor<i>( <b>F</b>( A(1) ) , B(1), k<sub>1</sub>)</i></font></li>

<i><font size=+1>k<sub>5</sub></font></i><font size=+1> <i>= </i>xor<i>(
A(5) ) , A(4),&nbsp; B(5) )</i></font></li>

<font size=+1>Through the table indexed by <i>C(2)</i> and <i>B(1)</i>
all permissible <i>k<sub>2 </sub></i>can be found, there can be from 0-8
, on average 1. For all permissible <i>k<sub>2</sub></i> calculate:</font></li>

<i><font size=+1>O<sub>2</sub>(1)</font></i><font size=+1> , <i>O<sub>2</sub>(2)</i>,
and 2 possible <i>O<sub>2</sub>(5)</i>.&nbsp;&nbsp; This is possible since
k<sub>1,2,5</sub> are found.</font></li>

<font size=+1>For every legal initial state of LFSR2 there exists a one
to one mapping to <i>O<sub>2</sub>(1,2,5)</i> , by generating a table with
2</font><sup>24</sup><font size=+1> entries the start state of LFSR2 can
be found. Thus <i>C(1,2,3,4,5)</i> is potentially known.</font></li>

<font size=+1><i>B(4) = </i>xor<i>( <b>F</b>( B(5) ) , C(5), k<sub>5</sub>

<i><font size=+1>k<sub>4</sub> = </font></i><font size=+1>xor<i>( <b>F</b>(
A(4) ) , A(3) , B(4) )</i></font></li>

<font size=+1><i>B(3) = </i>xor<i>( <b>F</b>( B(4) ) , C(4), k<sub>4</sub>

<i><font size=+1>k<sub>3</sub> = </font></i><font size=+1>xor<i>( <b>F</b>(
A(3) ) , A(2) , B(3) )</i></font></li>

<font size=+1><i>B(2) = </i>xor<i>( <b>F</b>( B(3) ) , C(3), k<sub>3</sub>

<font size=+1>verify <i>k<sub>2</sub> = </i>xor<i>( <b>F</b>( A(2) ) ,
A(1) , B(2) ) </i>, this holds for 1 / 256 tries ( 2</font><sup><font size=-1>17</font></sup><font size=+1>
altogether ) and if the test holds, the key <i>C(1,2,3,4,5)</i> can be
tested by eqn. (2). If eqn (2) holds, then a key has been found that will
satisfy the hash. From experience it is possible to find from zero to a
few such keys to any given hash value. When multiple disc keys are found
trial decryption of the files will eliminate the false keys.</font></li>
<font size=+1>This attack when implemented on a Pentium III running 450
MHz, will recover a disk key from the hash alone in less than 18 seconds.
This is clearly much less than what is to be expected of a 40 bits cipher.</font></blockquote>
<font size=+2>5 Conclusion</font>
<blockquote><font size=+1>The author has through email correspondence learned
that attacks as described at (2) have indeed been carried out by brute
force prior to this analysis. CSS was designed with a 40 bit keylength
to comply with US government export regulation, and as such it easily compromised
through brute force attacks ( such are the intentions of export control
). Moreover the 40 bits have not been put to good use, as the ciphers succumb
to attacks with much lower computational work than which is permitted in
the export control rules. Whether CSS is a serious cryptographic cipher
is debatable. It has been clearly been demonstrated that its strength does
not match the keylength. If the cipher was intended to get security by
remaining secret, this is yet another testament to the fact that security
through obscurity is an unworkable principle.</font></blockquote>
<font size=+2>6 Further information</font>
<blockquote><font size=+1>I have collected <a href="http://people.a2000.nl/mwielaar/dvd-css/csspaper/livid.html">links</a>
to posts that were made to the Livid project mailing list. These include
the original anonymous posting of the CSS algorithm, as well as full C
source code for the attacks I outline here.</font></blockquote>


TUCoPS is optimized to look best in Firefox® on a widescreen monitor (1440x900 or better).
Site design & layout copyright © 1986-2024 AOH