Archive for the ‘Blackhat’ Category
Noobies Guide on How to Scrape: Part 4 – cURL
Monday, May 11, 2009 13:01 3 Comments
Now we get the idea of POST and GET. We found our target, we know its URL structure, and we know where the data is, but how do we use PHP to fetch the webpages?
Luckily we have what is called cURL (from PHP.net):
PHP supports libcurl, a library created by Daniel Stenberg, that allows [...]
Noobies Guide on How to Scrape: Part 2 – URLs, URL Variables, and using Live HTTP Headers
Wednesday, April 8, 2009 21:11 1 Comment
Understanding the fundamentals of how sites communicate with themselves, and how we communicate with them, is crucial to being able to reverse engineer a site for our scraper. Luckily, it’s pretty easy for the most part.
Anatomy of a URL
The protocol you’re using.
The website you’re trying to get to. Although www is synonymous with the base [...]
Noobies Guide on How to Scrape: Part 1 – Intro & Tools
Monday, April 6, 2009 0:03 2 Comments
Welcome to the Noobies Guide to Scraping: Part 1. In this installment we are only going to focus on a few very basic things that we need to get started, and no code will be written.
What is scraping? Scraping is the process of getting / gathering data from some web source, whether [...]





