Millions of PDFs invisibly embedded with your internal disk paths
-----------------------------------------------------------------
I found an interesting privacy issue while analyzing PDF files. The bug
occurs when you use Internet Explorer to print locally saved web pages to
PDF, and it affects all IE versions, including IE8. It does not matter which
PDF generation software you use (Adobe Acrobat Professional, CutePDF,
PrimoPDF, etc.) as long as you invoke it from the IE print function. In
Windows, even when your default browser is not IE, right-clicking a file and
selecting PRINT from the context menu invokes the IE print handler by
default. So you will still see this issue in the generated PDF.
This bug is NOT ABOUT the local disk path appearing in the FOOTER of your
PDF, which is clearly visible and already known to most people. That is
easy enough to hide: go to File -> Page Setup and change the Footer value
from “URL” to “-Empty-”. After doing that, you would not expect your
internal disk path to be placed anywhere else. However, that is not the case.
The privacy issue arises from the fact that your local disk path gets
invisibly embedded inside your PDF in the title attribute. You will see it
only when you open the file in an editor like Notepad. Currently, there
is no option in IE to disable this. The only workaround is to manually
nullify the value by editing the PDF file. Note that this problem does not
occur when using other browsers such as Firefox and Chrome. In fact, Chrome
also handles the footer issue intelligently, showing your disk path as “…”
rather than exposing it.
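The manual workaround can be sketched in a few lines of Python. This is only
an illustration under stated assumptions: blank_file_titles is a hypothetical
helper (not part of any PDF library), and it assumes the title is stored
uncompressed as a literal string, e.g. /Title (file://C:\\some\\page.htm).
The value is overwritten with spaces of the same length so that the byte
offsets recorded elsewhere in the PDF stay valid. It does not look for other
copies of the path, and it will not help if the title is compressed or stored
in another encoding.

```python
import re

# Assumption: the title is an uncompressed PDF literal string, "/Title (...)".
# Group 2 captures the string body, allowing backslash escapes such as "\\".
TITLE = re.compile(rb"(/Title\s*\()((?:\\.|[^\\()])*)(\))", re.DOTALL)


def blank_file_titles(data: bytes) -> bytes:
    """Overwrite every file:// title with spaces of the same byte length."""

    def repl(match):
        value = match.group(2)
        if not value.startswith(b"file://"):
            return match.group(0)  # leave ordinary titles untouched
        # Same length in, same length out: object offsets do not move.
        return match.group(1) + b" " * len(value) + match.group(3)

    return TITLE.sub(repl, data)
```

To use it, read the PDF in binary mode, pass the bytes through
blank_file_titles, and write the result to a new file. Keep the original
until you have confirmed the new file still opens.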
Proof of Concept:
-----------------
Steps to reproduce:
-------------------
1. Pick a .HTM, .HTML, or .MHT file on your local computer.
2. Open this file in IE and press Ctrl-P,
   OR right-click the file in Explorer and select PRINT from the context menu.
3. Select any PDF writer as the printer, such as Adobe PDF / CutePDF /
   PrimoPDF / etc.
4. Click Print. When the PDF writer asks for a filename, provide any name.
5. Open the generated PDF in Notepad and search for “file://” without
   quotes.
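The final search step can also be automated instead of using Notepad. The
following is a minimal sketch, not a full PDF parser: find_file_urls is a
hypothetical helper that scans the raw bytes for “file://” strings, so it
only finds paths that are stored uncompressed and in a single-byte encoding.

```python
import re

# "file://" followed by everything up to a PDF string delimiter or line end.
FILE_URL = re.compile(rb"file://[^()\r\n\x00]+")


def find_file_urls(data: bytes) -> list[str]:
    """Return every file:// string found in the raw bytes of a PDF."""
    return [hit.decode("ascii", errors="replace") for hit in FILE_URL.findall(data)]


def scan_pdf(path: str) -> list[str]:
    """Read a PDF from disk in binary mode and list the embedded file:// strings."""
    with open(path, "rb") as handle:
        return find_file_urls(handle.read())
```

Calling scan_pdf on the PDF generated in the steps above should return the
embedded path if the file is affected, and an empty list otherwise.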
Search for this on your favorite search engine (Google/Bing)
------------------------------------------------------------
filetype:pdf file c (htm OR html OR mhtml)
Google Search 1 (for drive C)
[http://www.google.com/search?hl=en&q=filetype%3Apdf+file+c+%28htm+OR+html+OR+mhtml%29&btnG=Search&aq=f&oq=&aqi=] – 4 million results
Google Search 2 (for drive D)
[http://www.google.com/search?hl=en&q=filetype%3Apdf+file+d+%28htm+OR+html+OR+mhtml%29&btnG=Search&aq=f&oq=&aqi=] – 13 million results
and so on. (I added up the results through drive letter J, and the total was
more than 50 million.)
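To repeat the search for each drive letter without typing every query by
hand, the URLs can be generated. This is a small sketch: search_url is a
hypothetical helper, and it keeps only the hl and q parameters from the
links above, on the assumption that the remaining parameters are optional.

```python
from urllib.parse import urlencode


def search_url(drive: str) -> str:
    """Build the Google search URL for one drive letter."""
    query = f"filetype:pdf file {drive} (htm OR html OR mhtml)"
    return "http://www.google.com/search?" + urlencode({"hl": "en", "q": query})


# One URL per drive letter, C through J, as counted in the totals above.
urls = {drive: search_url(drive) for drive in "cdefghij"}
```

The result counts still have to be read off the search engine's pages
manually; the sketch only builds the links.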
So, out of 280 million PDFs accessible on the internet, more than 20% appear
to be exposing internal disk paths, which is a huge number. I have contacted
the Microsoft and Adobe security teams about this issue. Microsoft plans
to fix this in IE9, while Adobe has opened a case but hasn’t set a
timeline yet.
Examples:
http://www.eda.gov/PDF/EDA_vol1;%20Issue10.pdf