Google has cracked down on its own web scrapers that harvest data from search results, causing global outages for many popular ranking tools such as SEMRush that depend on providing fresh data from search results pages.
What happens if Google’s SERPs get blocked completely? A certain amount of data provided by tracking services has long been extrapolated by algorithms from various data sources. It is possible that one way around the current block is to extrapolate data from other sources.
Google’s guidelines have long prohibited automated rank checking in search results, but apparently Google has also allowed many companies to scrape its search results and charge for access to keyword and ranking data.
According to Google guidelines:
“Machine-generated traffic (also called automated traffic) refers to the practice of sending automated queries to Google. This includes scanning results for ranking purposes or other automated access to Google search that is conducted without express permission. Machine-generated traffic consumes resources and interferes with our ability to best serve users. Such activities violate our spam policy and Google’s terms of service.”
Blocking scrapers is resource-intensive, especially since scrapers can respond to blocks by changing their IP addresses and user agents. Another way to block scrapers is to target specific behaviors, such as the number of pages a user requests: an excessive volume of page requests can trigger a block. The problem with this approach is that keeping track of all the blocked IP addresses becomes demanding, as they can quickly number in the millions.
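The behavior-based approach described above can be illustrated with a minimal sliding-window rate limiter. This is a hypothetical sketch, not Google's actual mechanism: the `RateLimiter` class, its thresholds, and the per-IP block set are all assumptions chosen for illustration, and the growing `blocked` set shows why tracking millions of banned IPs becomes a burden.

```python
import time
from collections import defaultdict, deque

class RateLimiter:
    """Illustrative sliding-window limiter (hypothetical sketch, not
    Google's real system): an IP that exceeds max_requests within
    window_seconds is added to a persistent block list."""

    def __init__(self, max_requests=100, window_seconds=60):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.requests = defaultdict(deque)  # ip -> recent request timestamps
        self.blocked = set()                # can grow into the millions

    def allow(self, ip, now=None):
        """Return True if the request is allowed, False if blocked."""
        if now is None:
            now = time.time()
        if ip in self.blocked:
            return False
        window = self.requests[ip]
        # Discard timestamps that fell outside the sliding window.
        while window and now - window[0] > self.window_seconds:
            window.popleft()
        window.append(now)
        if len(window) > self.max_requests:
            self.blocked.add(ip)  # one more entry to track forever
            return False
        return True
```

Note that this only counts raw request volume; a scraper that rotates IPs resets its window each time, which is exactly the evasion described above.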
A post in the private SEO Signals Lab Facebook group reported that Google is cracking down on web scrapers, with one member commenting that the Scrape Owl tool was not working for them, while others stated that SEMRush’s data was not up to date.
More posts, this time on LinkedIn, noted that multiple tools were not refreshing their data, but also that not all data providers were affected by the block; Sistrix and MonitorRank were said to still be working. Someone from a company called HaloScan reported that they made adjustments to continue scraping data from Google and had recovered, and someone else reported that another tool, MyRankingMetrics, was still reporting data.
So whatever Google is doing isn’t affecting all scrapers right now. It’s possible that Google is targeting certain scraping behaviors, learning from the responses, and improving its ability to block them. The coming weeks could reveal whether Google is broadly improving its ability to block scrapers or targeting only the biggest ones.
Another LinkedIn post speculated that the blocking could drive up costs that get passed on as higher fees for end users of SaaS SEO tools. The post stated:
“This move by Google makes data extraction more challenging and expensive. As a result, users may face higher subscription fees.”
Ryan Jones tweeted:
“Looks like Google made an update last night that blocks most scrapers and many APIs.
Google, just give us a paid API for search results. we’ll pay you instead.”
So far there have been no announcements from Google, but it may be that the internet chatter will make someone at Google consider making a statement.
Featured Image: Shutterstock/Krakenimages.com