Crawl budget
Also known as: crawl allocation, crawl frequency
Crawl budget is the number of pages Google (and other crawlers) will fetch from your site within a given timeframe. It is a function of two factors: the crawl rate limit (how fast crawlers can hit your servers without degrading performance) and crawl demand (how often the crawler decides each URL is worth re-fetching). It becomes a material concern for sites with 100K+ URLs or rapidly changing content.
When crawl budget becomes a problem
Most sites under 50K URLs don’t have crawl budget issues. Problems emerge when:
- Faceted navigation creates millions of low-value URL permutations
- Infinite scroll / parameter URLs generate near-duplicate pages
- JavaScript-rendered sites need a separate rendering pass after the initial HTML fetch, roughly doubling the cost per URL
- Pagination explosion (page=1, page=2, … page=50,000) on listing sites
- Slow server response causes crawl rate throttling
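To get a feel for how quickly faceted navigation inflates the URL count, here is a rough back-of-envelope sketch in Python. The facet names and value counts are invented for illustration. It assumes every facet is optional and single-select, and that parameter order is fixed; multi-select facets or reorderable parameters would push the number higher.

```python
from math import prod


def facet_url_count(facet_sizes):
    """Count distinct filter URLs for one listing page.

    Assumes each facet is optional and single-select, so a facet with n
    values contributes (n + 1) states: one per value, plus "not set".
    """
    return prod(n + 1 for n in facet_sizes.values())


# Hypothetical facets for a single category page (illustrative numbers only).
facets = {"brand": 40, "color": 12, "size": 8, "price_band": 6, "sort": 4}

urls_per_category = facet_url_count(facets)
print(urls_per_category)        # filter states for one category
print(urls_per_category * 200)  # across 200 hypothetical categories
```

With these made-up numbers, one category page yields 167,895 crawlable filter states, and 200 such categories yield about 33.6 million, almost all of them near-duplicates of a few thousand real listings.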
How to manage it
- Block low-value paths in robots.txt (admin, search results, deep faceted nav)
- Use rel=canonical to consolidate near-duplicates
- Noindex thin pages (parameter combinations, low-value tag pages); crawlers must still fetch a page to see its noindex, so don't also block those URLs in robots.txt
- Improve server response time to lift the crawl rate limit
- Keep the XML sitemap current with accurate lastmod values (Google ignores priority and changefreq, but still uses lastmod when it is consistently accurate)
- Audit redirect chains: each hop costs an extra fetch and wastes crawl budget
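Before deploying robots.txt rules, it can help to check which URLs they would actually block. The sketch below uses Python's standard `urllib.robotparser` against an in-memory rule set. The rules, the `blocked_paths` helper, and the example.com URLs are all hypothetical. Note that this parser does simple prefix matching and does not implement the `*` and `$` wildcards that Google supports, so treat it as a sanity check for prefix rules only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules blocking admin, internal search, and deep faceted nav.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /search
Disallow: /shop/filter/
"""


def blocked_paths(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(user_agent, u)]


# Illustrative URLs on a placeholder domain.
sample_urls = [
    "https://example.com/shop/shoes/",
    "https://example.com/shop/filter/color-red/size-9/",
    "https://example.com/search?q=shoes",
    "https://example.com/admin/users",
]

for url in blocked_paths(ROBOTS_TXT, sample_urls):
    print("blocked:", url)
```

Running this over a sample of real URLs from a crawl export is a cheap way to confirm that high-value pages are not caught by a rule meant for low-value paths.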
Diagnostic signals
- Google Search Console → Settings → Crawl stats: declining crawl requests over time
- A widening gap between the URLs submitted in sitemaps and the URLs actually indexed
- New URLs taking weeks to be discovered and indexed
- Critical pages crawled less frequently than non-critical ones
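Server access logs are one way to check whether critical pages are crawled less often than non-critical ones. The following is a minimal sketch that tallies Googlebot requests by top-level path section. It assumes logs in the common "combined" format; the regular expression, the `googlebot_hits_by_section` helper, and the sample log lines are illustrative. It also trusts the user-agent string, which can be spoofed, so a real audit would verify Googlebot traffic (for example by reverse DNS) before drawing conclusions.

```python
import re
from collections import Counter

# Matches the request line and user-agent fields of a combined-format log line.
LINE_RE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def googlebot_hits_by_section(log_lines):
    """Count requests whose user agent mentions Googlebot, per first path segment."""
    hits = Counter()
    for line in log_lines:
        match = LINE_RE.search(line)
        if not match or "Googlebot" not in match.group("ua"):
            continue
        path = match.group("path").split("?", 1)[0]   # drop the query string
        segment = path.strip("/").split("/", 1)[0]    # first path segment
        hits["/" + segment] += 1
    return hits


# Made-up log lines for illustration.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
sample_log = [
    f'66.249.66.1 - - [10/May/2024:06:25:01 +0000] "GET /search?q=a HTTP/1.1" 200 512 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [10/May/2024:06:25:02 +0000] "GET /search?q=b HTTP/1.1" 200 498 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [10/May/2024:06:25:03 +0000] "GET /products/widget-1 HTTP/1.1" 200 5123 "-" "{GOOGLEBOT_UA}"',
    '203.0.113.7 - - [10/May/2024:06:25:04 +0000] "GET /products/widget-2 HTTP/1.1" 200 5001 "-" "Mozilla/5.0"',
]

for section, count in googlebot_hits_by_section(sample_log).most_common():
    print(section, count)
```

If sections such as internal search or filter pages dominate the tally while product or article sections trail, that points to crawl budget being spent on the wrong URLs.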
What it doesn’t affect
Crawl budget is not a direct ranking signal. It affects whether pages get indexed and how fresh the indexed copies stay, and both of those influence rankings indirectly.