Dec 30, 2012

Some update on PageScan (v0.2)

It's been a while since I wrote more than 1,000 lines of code for a security tool, and I apologize for not announcing the release of PageScan earlier.

For those of you who haven't heard, I've released PageScan, a web content scraper for web-based malware analysis. It assists with static analysis by scraping and listing any redirections, iframes, JavaScript, and links found inside a web page. Below is some sample output from PageScan:

CLI output

HTML output

Features
- Scrape HTML content, JavaScript code (inline or external JS), iframes, and links
- Follow iframes and redirections (meta refresh and HTTP 301/302 redirection)
- TXT/HTML output
- User-defined Referer and User Agent
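
To give a rough idea of what this kind of static scraping looks like, here is a minimal sketch in Python using only the standard library. This is not PageScan's actual code; the class name `PageElements` and the function `scan_html` are made up for illustration, and it only parses HTML that has already been fetched (it does not follow iframes or HTTP redirects).

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageElements(HTMLParser):
    """Hypothetical collector for iframes, scripts, links, and meta refresh targets."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.iframes = []
        self.external_scripts = []
        self.inline_scripts = []
        self.links = []
        self.meta_redirects = []
        self._script_buf = None  # list of text chunks while inside an inline <script>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "iframe" and attrs.get("src"):
            self.iframes.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "script":
            if attrs.get("src"):
                self.external_scripts.append(urljoin(self.base_url, attrs["src"]))
            else:
                self._script_buf = []
        elif tag == "a" and attrs.get("href"):
            self.links.append(urljoin(self.base_url, attrs["href"]))
        elif tag == "meta" and (attrs.get("http-equiv") or "").lower() == "refresh":
            # The content attribute looks like "0; url=http://example.com/next"
            match = re.search(r"url\s*=\s*(.+)", attrs.get("content") or "", re.I)
            if match:
                target = match.group(1).strip().strip("'\"")
                self.meta_redirects.append(urljoin(self.base_url, target))

    def handle_data(self, data):
        if self._script_buf is not None:
            self._script_buf.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._script_buf is not None:
            code = "".join(self._script_buf).strip()
            if code:
                self.inline_scripts.append(code)
            self._script_buf = None


def scan_html(html, base_url):
    """Parse an HTML string and return the collected elements as a dict of lists."""
    parser = PageElements(base_url)
    parser.feed(html)
    parser.close()
    return {
        "iframes": parser.iframes,
        "external_scripts": parser.external_scripts,
        "inline_scripts": parser.inline_scripts,
        "links": parser.links,
        "meta_redirects": parser.meta_redirects,
    }
```

Relative addresses are resolved against the page URL with `urljoin`, so the listed targets can be fetched directly in a follow-up request.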

Future Development
- Scrape iframe/redirection addresses from JavaScript (in document.write() calls or conditional statements)
- Properly execute JavaScript code (for obfuscated redirection or content)
- YARA signature module for scraped content
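
As an illustration of the first item, one naive approach would be a regular expression that looks for string literals passed to document.write() and pulls any iframe src out of them. The sketch below is only an assumption about how this could be started, not something PageScan does today; the function name `iframe_urls_from_js` is hypothetical, and the approach breaks down as soon as the string is concatenated or obfuscated, which is exactly why proper JavaScript execution is also on the list.

```python
import re

# Matches document.write('...') or document.writeln("...") with a single string literal.
DOCUMENT_WRITE = re.compile(
    r"document\.write(?:ln)?\s*\(\s*(['\"])(.*?)\1\s*\)", re.S
)

# Pulls the src value out of iframe markup, tolerating escaped quotes (\").
IFRAME_SRC = re.compile(
    r"<iframe[^>]*?\bsrc\s*=\s*\\?['\"]?([^'\"\s>\\]+)", re.I
)


def iframe_urls_from_js(js_code):
    """Return iframe src values found in literal document.write() arguments."""
    urls = []
    for call in DOCUMENT_WRITE.finditer(js_code):
        written_markup = call.group(2)
        urls.extend(IFRAME_SRC.findall(written_markup))
    return urls
```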


Feel free to dig into the source code. The tool is licensed under the WTFPL, so do whatever you want with the code. You can get the latest version of PageScan at https://github.com/d3t0n4t0r/pagescan.