What Determines Search Engine’s Crawl Rate?

Filed Under (Search Engine Crawling) by SEOmaster on December 27, 2007

The web is an ever-growing, constantly changing world. Given its gigantic scale and dynamic nature, maintaining an up-to-date search engine index of the entire web is a nearly insurmountable task. One of a search engine’s most important jobs is therefore to determine which parts of the web are important enough to crawl more often, and to reflect updated content in its index in a timely manner. From a webmaster’s point of view, our mission is then to become a search engine’s favorite and to make its robots visit our site more often than others. Here is a (non-exhaustive) list of factors that can affect a search engine’s crawl rate.

1. Relevant and Authoritative Backlinks
It’s a well-known fact that backlinks help major search engines’ crawlers find your site and can give it greater visibility in their search results. Links from relevant content and authoritative sources in particular are considered a more powerful vote by search engines, and are therefore more likely to bring search engine robots to your website. Submitting your site to reputable, well-categorized web directories or major social networking sites helps expose it to crawlers.

2. Content Update and Pinging

Regular, frequent content updates are another important factor that attracts search engine robots. For example, the purpose of Google’s fresh crawl is to detect content updates and reflect the changes in the search engine results immediately.

If your site is a blog, you can try existing pinging services such as pingomatic.com or Google’s Blog Search pinging service to proactively inform search engine robots of new posts and content changes.
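
As a rough illustration, a ping is usually a small XML-RPC call named weblogUpdates.ping that carries the blog's name and URL. The sketch below builds that request with Python's standard library. The endpoint address and the helper names build_ping_payload and send_ping are assumptions made for this example, so check your pinging service's documentation before relying on them.

```python
import xmlrpc.client

# Assumed endpoint for illustration; confirm it with your pinging service.
PING_ENDPOINT = "http://rpc.pingomatic.com/"


def build_ping_payload(blog_name: str, blog_url: str) -> str:
    """Return the XML-RPC request body for a weblogUpdates.ping call."""
    return xmlrpc.client.dumps(
        (blog_name, blog_url), methodname="weblogUpdates.ping"
    )


def send_ping(blog_name: str, blog_url: str, endpoint: str = PING_ENDPOINT):
    """Send the ping over the network and return the service's reply."""
    server = xmlrpc.client.ServerProxy(endpoint)
    return server.weblogUpdates.ping(blog_name, blog_url)
```

Calling build_ping_payload("My Blog", "http://example.com/") lets you inspect the request without touching the network; send_ping performs the actual call and is best triggered only when a post is published or changed.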

3. Internal Link Structure
Another factor that affects a search engine’s crawl rate is how a given page is linked from other pages within the same domain. Search engines determine the relative importance of a page based on the site’s overall internal link structure. Pages that are heavily linked to internally (e.g., pages linked site-wide) are considered important by search engines, and therefore receive more frequent visits from spiders.
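
To see which of your own pages are most heavily linked internally, you can count inbound links per page. The sketch below is a simplified stand-in for what a search engine does, not its actual algorithm. It assumes you already have a mapping from each page URL to the hrefs found on it (for example, from your own crawl), and the function name count_internal_inlinks is made up for this example.

```python
from collections import Counter
from urllib.parse import urljoin, urlparse


def count_internal_inlinks(pages: dict[str, list[str]]) -> Counter:
    """Count how many distinct same-host pages link to each URL."""
    inlinks = Counter()
    for source_url, hrefs in pages.items():
        host = urlparse(source_url).netloc
        targets = set()
        for href in hrefs:
            # Resolve relative links against the page they appear on.
            target = urljoin(source_url, href)
            # Keep only links within the same host, ignoring self-links.
            if urlparse(target).netloc == host and target != source_url:
                targets.add(target)
        # Each source page counts at most once per target.
        inlinks.update(targets)
    return inlinks
```

Pages at the top of count_internal_inlinks(pages).most_common() are the ones your own link structure marks as most important; a key page that is missing from the top may need more internal links.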

4. Sitemap and Robots.txt

Creating a search engine sitemap helps your site get indexed more deeply as well as more frequently. With Google, you can create an XML- or TXT-formatted sitemap and submit it through your Google webmaster tools account. A typical sitemap contains a list of URLs for the crawler to retrieve. If the sitemap is formatted in XML, you can specify extra information for crawlers, such as how frequently the content changes, the last modification date, or the relative importance of a page.
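
As an example, an XML sitemap can be generated with a few lines of Python. This is a minimal sketch that assumes the sitemaps.org 0.9 format, where each url entry has a required loc element and optional lastmod, changefreq, and priority elements. The function name build_sitemap and the sample values are illustrative only.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries: list[dict[str, str]]) -> str:
    """Build a sitemap document from dicts keyed by sitemap tag name."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # Emit tags in the conventional order, skipping any that are absent.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = entry[tag]
    return ET.tostring(urlset, encoding="unicode")


sitemap_xml = build_sitemap([
    {"loc": "http://example.com/", "changefreq": "daily", "priority": "1.0"},
    {"loc": "http://example.com/about", "lastmod": "2007-12-27"},
])
```

The resulting string can be saved as sitemap.xml at the root of the site and then submitted through your webmaster tools account.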

While a sitemap tells crawlers which pages to retrieve, robots.txt does the opposite: it tells spiders not to retrieve all or part of your website, which is otherwise publicly accessible to human visitors. As webmasters become more SEO-savvy, they start to use robots.txt more actively (e.g., to eliminate duplicate content). At the same time, this increases the chance that they will fumble robots.txt and unwittingly block search engine spiders. To prevent costly mistakes, always keep up with the robots.txt syntax recommended by major search engines such as Google and Yahoo, and look out for Google’s crawl error reports.
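
One way to catch a fumbled robots.txt before search engines do is to test your rules against the URLs you care about. The sketch below uses Python's standard urllib.robotparser module, which implements the basic robots.txt rules and may not match every extension a particular search engine supports. The function name blocked_urls and the sample rules are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt: str, urls: list[str],
                 user_agent: str = "Googlebot") -> list[str]:
    """Return the URLs that the given robots.txt text blocks for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Example rules that hide print-friendly duplicates of each post.
rules = "User-agent: *\nDisallow: /print/\n"
important_pages = [
    "http://example.com/",
    "http://example.com/post-1",
    "http://example.com/print/post-1",
]
blocked = blocked_urls(rules, important_pages)
```

Here blocked contains only the print-friendly URL, as intended. If a page you want indexed shows up in the result, the rule that catches it needs fixing.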

5. Server Speed

To avoid interfering with a search engine’s crawling, the web server that hosts your site should respond to requests in a reasonable time. Fast response times give visitors a good browsing experience, and the same logic applies to search engine robots. Given that a search engine’s primary role is to provide users with a good search experience, hosting your website on a fast web server helps your site get indexed faster and updated more frequently.
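
A quick way to keep an eye on response time is to time a few requests yourself. The following is a rough sketch rather than a proper benchmark: it measures the full download time from one location, and the function name measure_response_time and its fetch parameter are made up for this example.

```python
import time
from urllib.request import urlopen


def measure_response_time(url: str, fetch=urlopen, attempts: int = 3) -> float:
    """Return the fastest observed time, in seconds, to fetch the URL."""
    timings = []
    for _ in range(attempts):
        start = time.perf_counter()
        with fetch(url) as response:
            response.read()  # Include the body download in the timing.
        timings.append(time.perf_counter() - start)
    # The minimum is less sensitive to one-off network hiccups.
    return min(timings)
```

For instance, measure_response_time("http://example.com/") fetches the page three times and reports the best run. What counts as a reasonable time depends on your site, so compare against your own earlier measurements rather than a fixed threshold.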

6. Set Crawl Rate Feature in Google Webmaster Tools

In your Google webmaster account, you can choose among three crawl speeds for your website: faster, normal, or slower. The set crawl rate option is available only for top-level domains and sub-domains, not for individual pages or folders. Once requested, a crawl rate setting needs to be renewed every 90 days. However, it’s reported that this feature does not guarantee an immediate effect on Google’s crawl rate.

10 Comments »

It’s a shame this isn’t getting more attention at Sphinn. You’re clearly a good writer and this is a nice, informative piece. Taught me a thing or two about crawl rates and I thought I knew this stuff. Keep up the great blogging!

Thanks for the nice words, Gab. :)

very interesting.
i’m adding in RSS Reader