Monday, September 29, 2014

Google Scraper

I need to analyze Google search results, and fortunately there are multiple open-source solutions out there. But Google hates scrapers and will block your IP if it determines that you are breaking its terms and conditions.

Possible Google scraper (play with the sleep timing between requests to prevent IP blocking):

//Example using MarioVilas's google scraper: 
python --stop=20 "inurl:console filetype:php" > test.txt
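
If you drive the requests from your own Python code, the sleep timing can be randomized so the requests do not arrive at a fixed rhythm. This is only a rough sketch: the helper name `throttled` and the 2 to 5 second delay range are my own assumptions, not part of any scraper library, and no delay is guaranteed to avoid a block.

```python
import random
import time


def throttled(items, min_delay=2.0, max_delay=5.0, sleep=time.sleep):
    """Yield items one at a time, pausing a random interval between them.

    Hypothetical helper: the delay bounds are guesses to tune, not known
    safe values. `sleep` is injectable so the pauses can be skipped in tests.
    """
    for index, item in enumerate(items):
        if index > 0:
            # Pause before every item except the first one.
            sleep(random.uniform(min_delay, max_delay))
        yield item
```

Wrap the list of queries (or result pages) in `throttled(...)` and do the actual request inside the loop body, so each request is preceded by a pause.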

//If you need to remove the URL parameters, a simple bash script is perfect:
while IFS= read -r p; do
echo "${p%%\?*}"
done < test.txt
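
The same cleanup can be done in Python with the standard library, which is handy if the results are already being processed there. A minimal sketch, assuming one URL per line; the function name `strip_query` is just for illustration. Note that it also drops any `#fragment`, which the bash version only does when the fragment comes after a `?`.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url):
    """Return the URL without its query string and fragment."""
    parts = urlsplit(url.strip())
    # Rebuild the URL with empty query and fragment components.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def strip_file(path):
    """Read URLs from a file, one per line, and return them cleaned."""
    with open(path) as handle:
        return [strip_query(line) for line in handle if line.strip()]
```

Calling `strip_file("test.txt")` returns the cleaned URLs as a list, ready to print or deduplicate.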
