Archive for the ‘Blackhat’ Category
PHP proxy checker that also checks anonymity (using curl_multi)
Friday, April 16, 2010 16:04 9 Comments
At the request of a few readers, here is a script that not only checks for working proxies but also rates how anonymous they are. The first script is one you upload to a server; it rates the proxy based on the header information passed along, from 3 (highly anonymous) down to 1 (transparent proxy).
<?php
//proxy levels
//Level 3 [...]
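Since the excerpt above is cut off, here is a minimal sketch of how a header-based rating like this could work. It is not the original script: the function name, the header lists, and the idea of passing in the tester's real IP are all assumptions made for illustration.

```php
<?php
// Hypothetical rating rules (NOT the original script's): the header names
// below are common proxy giveaways picked for illustration.
// $realIp is assumed to be a non-empty IP string supplied by the tester.
function rate_proxy_anonymity(array $server, string $realIp): int
{
    // Headers that typically reveal a proxy is in use.
    $proxyHeaders = [
        'HTTP_VIA', 'HTTP_X_FORWARDED_FOR', 'HTTP_FORWARDED',
        'HTTP_CLIENT_IP', 'HTTP_PROXY_CONNECTION',
    ];

    // Level 1 (transparent): the connection or any request header
    // contains the tester's real IP.
    if (($server['REMOTE_ADDR'] ?? '') === $realIp) {
        return 1;
    }
    foreach ($server as $key => $value) {
        if (strpos($key, 'HTTP_') === 0 && is_string($value)
            && strpos($value, $realIp) !== false) {
            return 1;
        }
    }

    // Level 2 (anonymous): the IP is hidden, but proxy headers are present.
    foreach ($proxyHeaders as $header) {
        if (!empty($server[$header])) {
            return 2;
        }
    }

    // Level 3 (highly anonymous): nothing gives the proxy away.
    return 3;
}
```

On the server this would be called with `$_SERVER` as the first argument; how the real IP reaches the script (a query parameter, for instance) is left open here.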
PHP proxy checker using curl_multi
Thursday, April 15, 2010 15:31 2 Comments
I've been using a certain proxy finder lately, and after lengthy testing you end up with lots of dead or non-anonymous proxies. Testing these one by one with back-to-back curl calls is very slow: 150 proxies at even 1 second apiece is going to take 150 seconds or more. [...]
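To illustrate the idea of testing proxies in parallel rather than back to back, here is a rough sketch built on PHP's curl_multi functions. It is not the post's actual code: the function name `check_proxies`, the test URL, the 10-second timeout, and the "HTTP 200 means alive" rule are assumptions.

```php
<?php
// Sketch: run all proxy tests concurrently with curl_multi.
// $testUrl, $timeout and the "HTTP 200 = alive" rule are illustrative.
function check_proxies(array $proxies, string $testUrl, int $timeout = 10): array
{
    $proxies = array_unique($proxies);
    if ($proxies === []) {
        return [];
    }

    $multi = curl_multi_init();
    $handles = [];
    foreach ($proxies as $proxy) {            // each entry is "host:port"
        $ch = curl_init($testUrl);
        curl_setopt_array($ch, [
            CURLOPT_PROXY          => $proxy,
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_CONNECTTIMEOUT => $timeout,
            CURLOPT_TIMEOUT        => $timeout,
        ]);
        curl_multi_add_handle($multi, $ch);
        $handles[$proxy] = $ch;
    }

    // Drive every transfer until none is still running.
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running) {
            curl_multi_select($multi, 1.0);   // wait for socket activity
        }
    } while ($running && $status === CURLM_OK);

    $results = [];
    foreach ($handles as $proxy => $ch) {
        // A failed transfer leaves the HTTP code at 0.
        $results[$proxy] = curl_getinfo($ch, CURLINFO_HTTP_CODE) === 200;
        curl_multi_remove_handle($multi, $ch);
    }
    curl_multi_close($multi);

    return $results;                          // "host:port" => true/false
}
```

Because all requests run at once, the total time is roughly that of the slowest proxy (capped by the timeout) instead of the sum of every request.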
Noobies Guide on How to Scrape: Part 4 – cURL
Monday, May 11, 2009 13:01 3 Comments
Now we get the idea of POST and GET. We found our target, we know its URL structure, and we know where the data is, but how do we use PHP to fetch the webpages?
Luckily we have what is called cURL (from PHP.net):
PHP supports libcurl, a library created by Daniel Stenberg, that allows [...]
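As a rough illustration of fetching a page with PHP's cURL extension, something like the following would do. The function name `fetch_page`, the user-agent string, and the timeout are assumptions for the sketch, not part of the original guide.

```php
<?php
// Sketch: fetch a page body with cURL.
// The user-agent string and the timeout are illustrative values.
function fetch_page(string $url, int $timeout = 15): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,   // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,   // follow HTTP redirects
        CURLOPT_TIMEOUT        => $timeout,
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; ExampleScraper/1.0)',
    ]);

    $body = curl_exec($ch);
    if ($body === false) {
        // curl_error() describes what went wrong (DNS, timeout, ...).
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }

    return $body;
}
```

A call such as `$html = fetch_page('http://www.example.com/');` would then hand the raw HTML to whatever parsing step comes next.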
Noobies Guide on How to Scrape: Part 2 – URLs, URL Variables, and using Live HTTP Headers
Wednesday, April 8, 2009 21:11 1 Comment
Understanding the fundamentals of how sites communicate with themselves, and how we communicate with them, is crucial to being able to reverse engineer a site for our scraper. Luckily it’s pretty easy for the most part.
Anatomy of a URL
The protocol you’re using.
The website you’re trying to get to. Although www is synonymous with the base [...]
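The pieces of a URL listed above can also be pulled apart in code. Here is a small sketch using PHP's built-in `parse_url()` and `parse_str()`; the helper name `url_parts` and the example URL are made up for illustration.

```php
<?php
// Sketch: split a URL into the parts a scraper cares about.
function url_parts(string $url): array
{
    $parts = parse_url($url);             // scheme, host, path, query, ...

    $vars = [];
    if (isset($parts['query'])) {
        parse_str($parts['query'], $vars); // "q=php&page=2" => array
    }

    return [
        'protocol'  => $parts['scheme'] ?? null,
        'host'      => $parts['host'] ?? null,
        'path'      => $parts['path'] ?? null,
        'variables' => $vars,
    ];
}
```

For `http://www.example.com/search.php?q=php&page=2` this gives the protocol `http`, the host `www.example.com`, the path `/search.php`, and the URL variables `q` and `page`.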
Noobies Guide on How to Scrape: Part 1 – Intro & Tools
Monday, April 6, 2009 0:03 2 Comments
Welcome to the Noobies Guide to Scraping: Part 1. In this installment we are only going to focus on a few very basic things that we are going to need to get started, and no code will be written.
What is scraping? Scraping is the process of getting / gathering data from some web source, whether [...]





