Nov 14, 2009

Analyzing a malicious PDF

Maybe some of you have read about malicious PDFs and the harm they can do if we ignore the risk.

So today, let’s say you have caught a suspicious PDF file in the wild and you don’t really know what to do with it.

Two questions come up right away: “Does it contain malicious content?” and, if yes, “What will it try to do, and how?”.
The first thing we normally do is look into the PDF content structure and check for any hints. A PDF document is built from a defined structure, and much of that structure is readable as plain text. In this case, we’re going to use ‘cat’ to dump the file.
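The same first look can be scripted. Here is a minimal Python sketch that counts a few PDF names worth a closer look. The list of names and the function name count_pdf_names are my own assumptions for illustration, not a complete signature set, and a count above zero is only a hint, not proof of anything.

```python
import re

# Names that often deserve a closer look in a suspicious PDF.
# Illustrative assumption only; this is not an exhaustive list.
SUSPICIOUS_NAMES = [b"/JavaScript", b"/JS", b"/OpenAction", b"/AA", b"/Launch", b"/Filter"]


def count_pdf_names(data: bytes) -> dict:
    """Count whole-name occurrences of each entry in SUSPICIOUS_NAMES."""
    counts = {}
    for name in SUSPICIOUS_NAMES:
        # Require a non-alphanumeric character (or end of data) after the
        # name, so that /AA does not match a longer name such as /AAPL.
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        counts[name.decode("ascii")] = len(re.findall(pattern, data))
    return counts


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as handle:
        for name, count in count_pdf_names(handle.read()).items():
            print(f"{name:12} {count}")
```

Keep in mind that names inside a PDF can be written with hex escapes, so a plain byte search like this one can miss deliberately obfuscated names.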

Alright, everything seems normal except the unreadable content between ‘stream’ and ‘endstream’. We cannot yet tell whether this unreadable content is malicious or just ordinary PDF content. However, searching for the keyword ‘/Filter’ tells us that it is encoded with FlateDecode. Most normal PDF files have some of their content encoded with FlateDecode, and other filters such as JBIG2Decode and DCTDecode are common too. FlateDecode content can usually be decoded with pdf-parser or inflater.
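If neither tool is at hand, FlateDecode is zlib/deflate compression, so Python’s zlib module can inflate the streams. The sketch below is a rough stand-in and not how pdf-parser works internally. It assumes each stream uses FlateDecode as its only filter, with no predictor, and simply skips anything that does not inflate. The name inflate_streams is made up for this example.

```python
import re
import zlib

# Everything between "stream<EOL>" and the next "endstream".
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def inflate_streams(data: bytes) -> list:
    """Return the inflated body of every stream that zlib can decode."""
    results = []
    for match in STREAM_RE.finditer(data):
        raw = match.group(1)
        try:
            # decompressobj() tolerates the end-of-line bytes that sit
            # between the compressed data and the "endstream" keyword.
            results.append(zlib.decompressobj().decompress(raw))
        except zlib.error:
            # Not zlib data (another filter, or an uncompressed stream).
            continue
    return results


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as handle:
        for index, body in enumerate(inflate_streams(handle.read())):
            print(f"--- stream {index} ({len(body)} bytes) ---")
            print(body.decode("latin-1"))
```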

It turns out that the decoded content is JavaScript code obfuscated with Base64 encoding. At this point the PDF file has drawn our attention, as there are signs that the code might contain shellcode. Now we’ll dump the JavaScript snippet into a new text file for further analysis.

Next, we’ll use SpiderMonkey to interpret the JavaScript code and generate the output shellcode.

Having come this far, we can be sure that this PDF contains malicious content, and we can also identify what it tries to do and how. From the decoded output, we can see that it tries to exploit the util.printf vulnerability (CVE-2008-2992) in Adobe Reader 8.1.2 and below. If the exploit succeeds, it will execute the payload passed to the unescape() function.

Our next (and possibly last) focus is the payload itself. It is a string of UTF-16/UCS-2 escaped characters, which can be converted to hex or into a binary executable file. A small conversion tool can do the job.
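As an illustration of that conversion, here is a minimal Python sketch. It assumes the payload uses the usual JavaScript escape forms %uXXXX and %XX, and that each %uXXXX unit is stored little-endian in memory, as it would be on the x86 Windows machines this exploit targets. The function name unescape_to_bytes is my own, and any literal characters between the escapes are ignored.

```python
import re

# Matches either a %uXXXX unit (group 1) or a %XX byte (group 2).
ESCAPE_RE = re.compile(r"%u([0-9a-fA-F]{4})|%([0-9a-fA-F]{2})")


def unescape_to_bytes(escaped: str) -> bytes:
    """Convert a %uXXXX / %XX escaped string into raw bytes."""
    out = bytearray()
    for match in ESCAPE_RE.finditer(escaped):
        if match.group(1):
            # One 16-bit code unit, written low byte first.
            out += int(match.group(1), 16).to_bytes(2, "little")
        else:
            # One single byte.
            out.append(int(match.group(2), 16))
    return bytes(out)


if __name__ == "__main__":
    import sys

    # Usage: script.py payload.txt payload.bin
    with open(sys.argv[1], "r", encoding="ascii", errors="ignore") as src:
        raw = unescape_to_bytes(src.read())
    with open(sys.argv[2], "wb") as dst:
        dst.write(raw)
    print(raw.hex())
```

For example, unescape_to_bytes("%u7468%u7074") gives the bytes for "http", because each pair of bytes is swapped relative to how it is written in the escape.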

From the hexdump output, we can see the URL of potential malware that will be triggered if the exploitation succeeds.
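Instead of reading the hexdump by eye, the URL can also be pulled out with a short Python sketch like the one below. It assumes the URL sits in the payload as plain ASCII, so it will find nothing if the shellcode keeps the URL XOR-encoded or otherwise obfuscated. The name find_urls and the list of schemes are my own choices.

```python
import re

# A scheme followed by a run of printable, non-space ASCII bytes.
URL_RE = re.compile(rb"(?:https?|ftp)://[\x21-\x7e]+")


def find_urls(blob: bytes) -> list:
    """Return every plain-ASCII URL found in a binary blob."""
    return [match.decode("ascii") for match in URL_RE.findall(blob)]


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as handle:
        for url in find_urls(handle.read()):
            print(url)
```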