TUCoPS :: Crypto :: bt32.txt

Defeating HTML "Encryption"




Quite a few HTML "encryptors" can be found online that promise to
"encrypt" the user's HTML code, disable printing, block right mouse
clicks and provide other "protections".



Examples of such tools are:

	http://www.protecthtml.com/

	http://www.protware.com/

	

This exploit does not cover Microsoft Script Encoding (JScript.Encode,
http://www.microsoft.com/mind/0899/scriptengine/scriptengine.asp), since
it only works with Internet Explorer 5.0 and above. Unless the webmaster
is willing to shut every other browser out of his site, he will not use
this method to "encrypt" his pages.



The term "encrypt" is misleading, since no real encryption takes place,
only a file encoding method that varies from tool to tool. Some tools
even advertise different encryption methods with different levels of
protection!



These tools all work in roughly the same way: they generate an encoded
file from the original HTML and include a JavaScript decoding function in
it. As one would expect, the file contains all the information needed to
decode itself; otherwise the browser would not be able to show anything
to the user.



Some tools don't even go this far and just URI-escape the source code! To
"decrypt" such a page, all we need to do is call uri_unescape() if we're
using Perl.
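
The URI-escape-only case can be sketched in a few lines. The example
below uses JavaScript's standard decodeURIComponent() in place of Perl's
uri_unescape(); the sample page and the name escapedPage are made up for
illustration and do not come from any of the tools listed above.

```javascript
// Hypothetical output of a tool that only percent-escapes the page.
const escapedPage = "%3Chtml%3E%3Cbody%3EHello%3C%2Fbody%3E%3C%2Fhtml%3E";

// Undoing this kind of "encryption" is a single standard-library call.
const clearPage = decodeURIComponent(escapedPage);

console.log(clearPage); // <html><body>Hello</body></html>
```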



Usually the SCRIPT starts with an eval(unescape(...)); that defines the
decoding function. This function is then called with the appropriate
parameters. Most tools pass only one parameter to the decoding function:
a string holding the encoded source.

Once the decoding function has done its job, it calls document.write(...)
with the decoded (original) HTML so that the browser can render it for
the user. This happens regardless of the encoding algorithm.



Example output from one of the tools:



<SCRIPT LANGUAGE="JavaScript"><!--
eval(unescape("%66%75%6E%63%74%69%6F%6E%20%61%28%73% ... 
               %64%6F%63%75%6D%65%6E%74%2E%77%72%69%74%65%28%6F%29%7D"));
a("JPOgaINOINghakqohXXINOINghz~J}PH^&#$@#$%%^ ...");
eval(unescape("%64%6F%63%75%6D%65%6E%74%2E%77%72%69%74%65%3D%6E%75%6C%6C%3B"));
//--></SCRIPT>

The "..." marks where a line has been cut short to save space.

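The escaped fragments that survive in the sample can be decoded directly
to see what they hide. The snippet below is only a sketch: the variable
names are invented here, the first fragment is the visible start of the
first eval (without its dangling "%"), and the elided middle of that eval
is left out because it is not shown in the sample.

```javascript
// Escaped fragments copied from the sample output.
const headOfDecoder = "%66%75%6E%63%74%69%6F%6E%20%61%28%73";
const tailOfDecoder = "%64%6F%63%75%6D%65%6E%74%2E%77%72%69%74%65%28%6F%29%7D";
const lastEval = "%64%6F%63%75%6D%65%6E%74%2E%77%72%69%74%65%3D%6E%75%6C%6C%3B";

// unescape() is the same legacy function the protected page itself calls.
console.log(unescape(headOfDecoder)); // function a(s
console.log(unescape(tailOfDecoder)); // document.write(o)}
console.log(unescape(lastEval));      // document.write=null;
```

So the first eval defines a function named a() that ends in a
document.write(), and the last eval assigns null to document.write.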


If we write the output of the first unescape(...) to a browser window, we
get the source code of the decoding function, which usually has the
format:



function decode(encoded_str){
	<decoding algorithm>

	document.write(decoded_str)
}



So the script defines the function decode(), then calls it with the
encoded text to decode it, and the last instruction of the decode()
function is a document.write().

The last eval(unescape(...)); produces a document.write=null; which
undefines the document.write function to prevent it from being used
again.
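
The three-step flow can be reproduced with a toy page. Everything in this
sketch is an assumption made for illustration: the shift-by-one "cipher",
the helper escapeAll() and the stand-in fakeDocument object are invented
here and are not taken from any real tool. The script is run in Node-style
JavaScript against the stand-in object rather than in a browser.

```javascript
// Percent-escape every character, the way the tools hide their JavaScript.
function escapeAll(text) {
  return Array.from(text, (ch) =>
    "%" + ch.charCodeAt(0).toString(16).toUpperCase().padStart(2, "0")).join("");
}

// Toy decoder (not any real tool's algorithm): shift each character code
// down by one, then hand the result to document.write.
const decoderSource =
  "function a(s){var o='';for(var i=0;i<s.length;i++)" +
  "o+=String.fromCharCode(s.charCodeAt(i)-1);document.write(o)}";

// "<b>hi</b>" with every character code shifted up by one.
const encodedBody = "=c?ij=0c?";

// The three statements of the protected script, in the order described above.
const protectedScript = [
  'eval(unescape("' + escapeAll(decoderSource) + '"));',
  'a("' + encodedBody + '");',
  'eval(unescape("' + escapeAll("document.write=null;") + '"));',
].join("\n");

// Run the script against a stand-in document object and record each write.
const written = [];
const fakeDocument = { write: (html) => written.push(html) };
new Function("document", protectedScript)(fakeDocument);

console.log(written);            // [ '<b>hi</b>' ]
console.log(fakeDocument.write); // null
```

The original HTML reaches document.write() in the clear before the
function is nulled out, which is the weak point of every such scheme.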



So all we have to do to "decrypt" the HTML is find every document.write()
call and replace it with our own code, tricking the browser into showing
the source HTML instead of rendering it. This can be done by using the
<PLAINTEXT> tag. For example, replace document.write(decoded_str) with:

document.write("<PLAINTEXT>"+decoded_str+"</PLAINTEXT>");



Since most tools produce an escaped version of the JavaScript source, we
have to search for both the clear document.write and its escaped version:

%64%6F%63%75%6D%65%6E%74%2E%77%72%69%74%65
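
The rewrite itself can be sketched as two regular-expression
replacements. This is a minimal illustration, not a complete tool: the
names ESCAPED and wrapWrites() are invented here, the patterns assume the
document.write() argument contains no nested parentheses, and the escaped
branch emits its wrapper percent-escaped as well, a choice made for this
sketch so that no quoting is needed inside the unescape("...") string.

```javascript
// "document.write" with every character percent-escaped.
const ESCAPED = Array.from("document.write", (ch) =>
  "%" + ch.charCodeAt(0).toString(16).toUpperCase()).join("");

// Wrap the argument of every document.write(...) call, clear or escaped,
// in a <PLAINTEXT> tag so the browser shows the HTML instead of rendering it.
function wrapWrites(source) {
  return source
    // Clear-text calls: document.write(X)
    .replace(/document\.write\s*\((.*?)\)/gi,
      'document.write("<PLAINTEXT>"+$1+"</PLAINTEXT>")')
    // Escaped calls: %28 is "(", %29 is ")", %22 is a quote, %2B is "+"
    .replace(new RegExp(ESCAPED + "(?:%20)*%28(.*?)%29", "gi"),
      ESCAPED + "%28%22%3CPLAINTEXT%3E%22%2B$1%2B%22%3C%2FPLAINTEXT%3E%22%29");
}

console.log(ESCAPED);
// %64%6F%63%75%6D%65%6E%74%2E%77%72%69%74%65
console.log(wrapWrites("document.write(o)"));
// document.write("<PLAINTEXT>"+o+"</PLAINTEXT>")
console.log(unescape(wrapWrites(ESCAPED + "%28%6F%29")));
// document.write("<PLAINTEXT>"+o+"</PLAINTEXT>")
```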



Here is a Perl script that does just that. After making the replacements,
it writes a clear_text.html file to the current directory. To see the
source of the encoded HTML, just open this file in the browser!

This approach worked with every tool I tested, regardless of the
"encryption" algorithm used.



-RJfix



--- START script ---
#!/usr/bin/perl
use strict;
use warnings;

use URI::Escape;        # provides uri_unescape()
use HTTP::Request;
use LWP::UserAgent;

# The page whose HTML source we want to see
my $html_page = "http://www.protecthtml.com/product/wp/sample21.htm";

my $ua       = LWP::UserAgent->new;
my $request  = HTTP::Request->new(GET => $html_page);
my $response = $ua->request($request);

unless ($response->is_success) {
	print $response->error_as_HTML;
	exit(1);
}
my $encrypted_html = $response->content;

# "document.write" with every character %XX-escaped
my $dw_esc = '%64%6F%63%75%6D%65%6E%74%2E%77%72%69%74%65';

# Some tools try to overwrite document.write by doing something like
#	document.write = null;
# so we redirect any document.write= assignment, and its escaped
# version (%3D is "="), to a harmless dummy variable.
$encrypted_html =~ s/document\.write\s*=(?!=)/void_var=/gi;
$encrypted_html =~ s/$dw_esc(?:%20)*%3D/void_var=/gi;

# All scripts have to use document.write to write the decrypted HTML
# to the browser window, so all we do is add a <PLAINTEXT> tag to make
# sure the decrypted HTML is not rendered by the browser and we see
# the source code instead!
# Note: (.*?) stops at the first ")", so this assumes the argument
# contains no nested parentheses.
# Clear-text calls get plain quotes, as in the example in the text.
$encrypted_html =~
    s/document\.write\s*\((.*?)\)/document.write("<PLAINTEXT>"+$1+"<\/PLAINTEXT>")/gi;

# Escaped calls (%28 is "(", %29 is ")") sit inside a quoted
# unescape("...") string, so the quotes we insert are backslash-escaped.
$encrypted_html =~
    s/$dw_esc(?:%20)*%28(.*?)%29/document.write(\\"<PLAINTEXT>\\"+$1+\\"<\/PLAINTEXT>\\")/gi;

open(my $out, '>', 'clear_text.html')
    or die "Cannot write clear_text.html: $!";
print $out $encrypted_html;

# Some LAME tools don't even try to encrypt the pages, they just URL
# encode everything, so we also append the plainly unescaped source.
print $out "<p> Let us try just to Unescape the source! <PLAINTEXT>";
print $out uri_unescape($response->content);
close($out);
--- END script ---