Scraping Google for Fun and Profit

This information and source code are provided for free. A donation would nevertheless be appreciated.

written 2010 by Justone [justone(at)squabbel.com], updates 2011, rewrite 2012-2014
update from 16th Nov. 2010: The scraper source code is now compatible with the new Google design (Instant, previews, etc.)
update from 13th Dec. 2011: The scraper source code is now compatible with Google design changes (span removed)
update 2012: some bugfixes
update 2012: A better Google scraper was written this year; check out the Google Rank Checker


I have a great update for all my readers! I spent weeks developing a much more advanced project, and it is free again!
Instead of billing for my work (and I had more than a few requests to write custom code), I added donation buttons and hope they will be used.

Make sure to check out the successor of this code: the new (2012-2014) open-source Google Rank Checker

[PHP] Scraping Google search result pages (SERPs) is a frequent task for SEO experts and Internet professionals. Scraping makes it possible to monitor ranking positions, the PPC market, link popularity and much more. Whether you offer scraping as an SEO service, embed it into your website, or need it for your own projects, you need solid know-how to succeed. Here I provide the key know-how about SERP scraping, focused on Google, the largest search engine. You will find important hints and a complete multi-page Google search engine scraper written in PHP, with private proxy API support for proxy rotation!

What happens if you scrape Google?

Google is the largest scraper in the world, but they do not allow scraping of their own pages. Without a lot of experience and know-how it can be hard to get anything out of them. Google uses a number of techniques to detect and prevent automated access. When Google detects scraping activity, this is what happens:

1. When accessing Google, you can be warned about something "dangerous" going on. You will see a warning about a possible virus or trojan on your computer.
2. If you continue scraping Google, they throw in their first block. You will again see the virus message, but this time you need to enter a Captcha to continue. The Captcha creates an authentication cookie that allows you to continue.
3. Now Google uses larger weapons: they block your IP temporarily ("Google blocked your ip temporarily"). The block can last from minutes to hours. You need to stop your current scraping immediately and change your code or add IPs.
4. If you scrape Google again, you will be banned for a longer time.

How does Google detect scraping?

That's the key question, and the answer is not too hard to find out. Google mainly watches for:

* the IP address: the IP address is the only thing they use to identify a user
* keyword changes: normal users don't search for many different keywords in a short time
* frequency: every access to Google is matched against allowed access patterns
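The three signals above (IP address, keyword changes, frequency) suggest spreading requests over several IPs and pacing them irregularly. The following is only a minimal sketch of that idea, not code from scraper.php: the function names `next_delay_seconds` and `proxy_for_request`, the placeholder proxy addresses, the 20 to 60 second pause and the quota of 3 requests per proxy are all assumptions made for illustration.

```php
<?php
// Hypothetical helpers; names and numbers are illustrative assumptions,
// not taken from scraper.php or functions.php.

/**
 * Pick a randomized pause (in seconds) between two requests, so that
 * accesses do not arrive at a fixed, machine-like interval.
 */
function next_delay_seconds(int $min, int $max): int
{
    if ($min < 0 || $max < $min) {
        throw new InvalidArgumentException('expected 0 <= min <= max');
    }
    return random_int($min, $max);
}

/**
 * Choose the proxy for a given request number, moving on to the next
 * list entry after every $requestsPerProxy requests and wrapping around.
 */
function proxy_for_request(array $proxies, int $requestNumber, int $requestsPerProxy): string
{
    if ($proxies === [] || $requestsPerProxy < 1 || $requestNumber < 0) {
        throw new InvalidArgumentException('need proxies, a positive quota and a non-negative request number');
    }
    $proxies = array_values($proxies); // make sure the list is 0-indexed
    $index = intdiv($requestNumber, $requestsPerProxy) % count($proxies);
    return $proxies[$index];
}

// Print a plan for 6 requests over two placeholder proxies
// (192.0.2.x is a documentation-only address range).
$proxies = ['192.0.2.10:8080', '192.0.2.11:8080'];
for ($i = 0; $i < 6; $i++) {
    $proxy = proxy_for_request($proxies, $i, 3);
    $pause = next_delay_seconds(20, 60);
    echo "request $i via $proxy, then wait {$pause}s\n";
}
```

The sketch only plans the rotation and prints it. A real scraper would pass the chosen proxy to its HTTP client and call `sleep()` with the computed pause. Suitable pause lengths and quotas have to be found by experiment.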

Hints for scraping Google and avoiding detection

For your use and customization: an advanced Google scraper written in PHP for web or console usage

This source is free for your fun and profit. You can change everything except the commented header lines at the top. This script includes:

1. Automated proxy rotation (using the seo-proxies.com API, a reliable private proxy service). If you have your own reliable proxies you need to adapt that part; try to use clean and fast proxies for good results! If you have a license at www.seo-proxies.com, all you need to do is change the "USERID" and "API-PASSWORD" variables at the top of the scraper.php script to match your license.
2. Automated scraping of all Google result pages for a specific search
3. Usage of sub-keywords to increase the number of possible results
4. Automated detection and removal of advertisements
5. Storage of the scraped results in an array, displayed on demand as HTML or plain text

You should consider adding database support for storing results and managing keywords! PHP is well suited for professional projects, but you should run the scraper as a console script for best reliability.

Download the two source code files here: scraper.php functions.php

Make sure to also check out the highly advanced and new (2012) free Google Rank Checker, open-source PHP and much better than this project.
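Two of the listed features, sub-keywords and array storage with HTML or plain-text output, can be sketched without any network access. This is a hedged illustration only: the function names `expand_keyword` and `render_results`, the idea of building sub-keywords by appending suffixes, and the `title`/`url` array layout are assumptions, not the way scraper.php actually does it.

```php
<?php
// Hypothetical sketch; function names and the result layout are assumptions,
// not the ones used in scraper.php or functions.php.

/**
 * Expand one keyword into sub-keywords by appending each suffix.
 * The main keyword itself stays first in the list.
 */
function expand_keyword(string $keyword, array $suffixes): array
{
    $queries = [$keyword];
    foreach ($suffixes as $suffix) {
        $queries[] = $keyword . ' ' . $suffix;
    }
    return $queries;
}

/**
 * Render stored results either as plain text or as an HTML ordered list.
 * Each result is assumed to look like ['title' => ..., 'url' => ...].
 */
function render_results(array $results, bool $asHtml): string
{
    $lines = [];
    foreach (array_values($results) as $position => $result) {
        if ($asHtml) {
            // Escape scraped text before it is embedded in markup.
            $lines[] = sprintf(
                '<li><a href="%s">%s</a></li>',
                htmlspecialchars($result['url'], ENT_QUOTES),
                htmlspecialchars($result['title'], ENT_QUOTES)
            );
        } else {
            $lines[] = sprintf('%d. %s - %s', $position + 1, $result['title'], $result['url']);
        }
    }
    $body = implode("\n", $lines);
    return $asHtml ? "<ol>\n$body\n</ol>" : $body;
}

// Usage with made-up data.
print_r(expand_keyword('php scraper', ['a', 'b']));
$results = [
    ['title' => 'Example Domain', 'url' => 'http://example.com/'],
];
echo render_results($results, false), "\n";
echo render_results($results, true), "\n";
```

Keeping the results in a plain array until output time means the same data can feed the console view, the HTML view, or a database layer added later.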



Scraping Google autocomplete has been solved too: Google Suggest Scraper