Crawl Budget 101 – The Basics You Should Know
Crawl budget refers to the number of pages Googlebot crawls on a particular website during a specific period of time.
Put another way, it is the maximum number of pages that the search engine can and wants to crawl on a given website. Google determines the crawl budget by weighing crawl demand against the crawl rate limit.
- Crawl demand – How popular your pages are, and how fresh or stale they are, affects your crawl demand.
- Crawl rate limit – Crawl errors, page speed, and the crawl limit set in Google Search Console (where site owners can choose to reduce how often Googlebot crawls their website) all affect the crawl rate limit.
Crawl Budget: A Short History
In 2009, Google acknowledged that it could only locate a certain percentage of online content and urged webmasters to optimize for crawl budget.
According to Google, the internet is a big place where new content is created constantly, and Google’s resources are limited. This is why, faced with a nearly infinite amount of content online, Googlebot can only locate and crawl a certain percentage of it. Out of all the content it crawls, Google can only index a part.
Webmasters and SEO companies in Bangkok like us kept discussing crawl budget, and in 2017 this prompted Google to publish a post on what crawl budget means for Googlebot. The post made clear how Google perceives crawl budget and how it is calculated.
Should You Be Concerned about Your Crawl Budget?
You may not need to be concerned about your crawl budget if you are working on a smaller website. Google states that most publishers don’t need to worry about it: if a website has fewer than a few thousand URLs, it will usually be crawled efficiently.
But if your website is on the larger end of the spectrum, especially if its pages are generated automatically based on URL parameters, it is recommended to prioritize activities that help Google understand what to crawl and when.
Importance of Crawl Budget for SEO
If Google fails to index a page, that page will not rank for anything. It is that simple.
This means that if the number of pages on your website exceeds its crawl budget, some of those pages will not be indexed.
That said, most websites don’t have to be concerned about crawl budget. After all, Google is very good at finding and indexing pages.
However, there are several cases in which it is worth paying attention to your crawl budget:
- You are running a big website.
If you have a website, such as an e-commerce site, with more than 10,000 pages, Google may have difficulty finding all of them.
- You have recently added a lot of pages.
If you have recently added a new section with hundreds of pages to your website, make sure you have enough crawl budget for all of them to be indexed quickly.
- You have numerous redirects.
Having a lot of redirects and redirect chains can easily eat up most of your crawl budget.
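To get a feel for how redirect chains add up, here is a minimal Python sketch. It assumes you have already exported your redirects into a simple source-to-target mapping; the URLs, the `REDIRECTS` data, and the function names are hypothetical and only serve as an illustration.

```python
from typing import Dict, List


def redirect_chain(start: str, redirects: Dict[str, str], max_hops: int = 10) -> List[str]:
    """Follow redirects from `start` and return every URL visited, in order."""
    chain = [start]
    seen = {start}
    current = start
    while current in redirects and len(chain) <= max_hops:
        current = redirects[current]
        chain.append(current)
        if current in seen:  # redirect loop: stop following
            break
        seen.add(current)
    return chain


def long_chains(redirects: Dict[str, str], min_hops: int = 2) -> Dict[str, List[str]]:
    """Return every chain with at least `min_hops` redirects, keyed by its starting URL."""
    result = {}
    for url in redirects:
        chain = redirect_chain(url, redirects)
        if len(chain) - 1 >= min_hops:
            result[url] = chain
    return result


# Hypothetical redirect map (source URL -> target URL).
REDIRECTS = {
    "/old-shoes": "/shoes-2019",
    "/shoes-2019": "/shoes-2021",
    "/shoes-2021": "/shoes",
    "/promo": "/sale",
}

for start, chain in long_chains(REDIRECTS).items():
    print(f"{start}: {len(chain) - 1} hops via {' -> '.join(chain)}")
```

In this made-up data, "/old-shoes" needs three hops to reach "/shoes". Pointing it straight at the final URL would leave a single redirect for a crawler to follow.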
Now that you know the role that crawl budget plays in your SEO efforts, it is time to learn a few simple ways to help you make the most out of your website’s crawl budget.
Best Practices for Crawl Budget
Here are a few things you can do to ensure that your crawl budget will work to your advantage:
1. Improve the speed of your website.
Improving the page speed of your website allows Googlebot to crawl more of its URLs.
In fact, according to Google, making your website faster improves user experience and increases crawl rate at the same time. In other words, pages that load slowly consume Googlebot’s precious time.
On the other hand, if your web pages load faster, Googlebot has more time to visit and crawl more of your pages.
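The arithmetic behind this can be sketched with a deliberately simplified model. It assumes a crawler that fetches pages one after another inside a fixed time window, which is not how Googlebot actually schedules its work; the `pages_fetched` function and the numbers are hypothetical.

```python
def pages_fetched(crawl_seconds: float, avg_response_seconds: float) -> int:
    """Pages a strictly sequential crawler could fetch in a fixed time window."""
    if avg_response_seconds <= 0:
        raise ValueError("avg_response_seconds must be positive")
    return int(crawl_seconds // avg_response_seconds)


# Hypothetical 10-minute crawl window.
window = 600
print(pages_fetched(window, 2.0))  # slow pages: 300
print(pages_fetched(window, 0.5))  # fast pages: 1200
```

Under these assumptions, cutting the average response time from 2 seconds to half a second lets the same window cover four times as many pages.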
2. Take advantage of internal links.
Googlebot prioritizes pages that have plenty of internal and external links pointing to them. Ideally, you would get backlinks pointing to every page on your website. More often than not, however, this is not realistic.
This is what makes internal linking necessary. Your internal links send Googlebot to all of the pages on your website that you want indexed.
3. Opt for flat website architecture.
Google states that URLs that are more popular on the internet tend to be crawled more often to keep them fresher in its index.
As far as Google is concerned, popular is equivalent to link authority.
For this reason, you want your website to use a flat architecture. A flat architecture sets things up so that every page on your website has some link authority flowing to it.
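One common way to reason about how "flat" a site is, is click depth: the number of clicks needed to reach a page from the homepage. The sketch below computes it with a breadth-first search over an internal link graph. The two tiny sites, `DEEP_SITE` and `FLAT_SITE`, are hypothetical examples, and click depth is only a rough stand-in for how link authority flows.

```python
from collections import deque
from typing import Dict, List


def click_depths(links: Dict[str, List[str]], home: str = "/") -> Dict[str, int]:
    """Return the minimum number of clicks from `home` to every reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path in BFS
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical deep structure: each page links only to the next level down.
DEEP_SITE = {
    "/": ["/category"],
    "/category": ["/category/sub"],
    "/category/sub": ["/category/sub/product"],
}

# Hypothetical flat structure: the homepage links to everything directly.
FLAT_SITE = {
    "/": ["/category", "/category/sub", "/category/sub/product"],
}

print(max(click_depths(DEEP_SITE).values()))  # 3
print(max(click_depths(FLAT_SITE).values()))  # 1
```

The same product page sits three clicks away in the deep structure and one click away in the flat one.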
4. Stay away from “orphan pages.”
An orphan page is a page that has no internal or external links pointing to it. Google has a hard time finding orphan pages. Thus, if you want to use your crawl budget to the fullest, make sure that at least one internal or external link points to each page on your website.
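As a rough illustration, orphan candidates can be found by comparing the list of pages you know exist (for example, from your sitemap) against the pages your internal links actually point to. This sketch only looks at internal links, so a page it flags may still have external links. The `SITEMAP_PAGES` and `INTERNAL_LINKS` data are hypothetical.

```python
from typing import Dict, Iterable, List


def find_orphans(all_pages: Iterable[str], links: Dict[str, List[str]], home: str = "/") -> List[str]:
    """Return pages that no internal link points to (the homepage is excluded)."""
    linked = {target for targets in links.values() for target in targets}
    return sorted(set(all_pages) - linked - {home})


# Hypothetical list of known pages, e.g. taken from a sitemap.
SITEMAP_PAGES = ["/", "/about", "/blog", "/blog/post-1", "/landing-2018"]

# Hypothetical internal link graph (page -> pages it links to).
INTERNAL_LINKS = {
    "/": ["/about", "/blog"],
    "/blog": ["/blog/post-1"],
}

print(find_orphans(SITEMAP_PAGES, INTERNAL_LINKS))  # ['/landing-2018']
```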
5. Limit your duplicate content.
There are many reasons why it is smart to limit duplicate content, and one of them is that duplicate content can negatively affect your crawl budget. This is because Google doesn’t want to waste resources indexing several pages with similar content.
Make sure that every page on your site has unique, quality content. This might not be easy for websites with more than 10,000 pages, yet you need to do it if you want to make the most of your crawl budget.
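For a large site, a first pass at spotting duplicates can be automated. The sketch below groups pages whose text is identical after lowercasing and collapsing whitespace. It is a simple fingerprinting approach chosen for illustration, it will not catch pages that are merely similar, and the `PAGES` data is hypothetical.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def fingerprint(text: str) -> str:
    """Hash the text after lowercasing it and collapsing runs of whitespace."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def duplicate_groups(pages: Dict[str, str]) -> List[List[str]]:
    """Return groups of URLs whose normalized text is identical."""
    groups = defaultdict(list)
    for url, text in pages.items():
        groups[fingerprint(text)].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Hypothetical page texts keyed by URL.
PAGES = {
    "/shoes": "Red running shoes.",
    "/shoes?sort=price": "Red  running SHOES.",
    "/hats": "Blue wool hats.",
}

print(duplicate_groups(PAGES))  # [['/shoes', '/shoes?sort=price']]
```

Here the sorted view of the shoes page is flagged because it serves the same text under a different URL.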
Crawl budget is a revenue matter and not just a technical one, so make sure you get the best out of it!