X-Robots-Tag
Also known as: X-Robots-Tag header
X-Robots-Tag is an HTTP response header that lets server administrators apply robots meta directives (noindex, nofollow, noarchive, etc.) at the response level. It is mainly used for non-HTML resources (PDFs, images, videos), where there is no document head to hold a `<meta name='robots'>` tag. Same directives, different transport mechanism. It is useful for excluding non-HTML content from search results without blocking crawlers from accessing it.
When you need X-Robots-Tag (vs meta robots)
- HTML page → use `<meta name="robots" content="...">` in the head
- PDF, image, video, JSON file → these don’t have a `<head>` to add a meta tag to. Use X-Robots-Tag in the HTTP response header
- Server-side blanket directives → set X-Robots-Tag at the server level to apply to many URLs without touching their content
Example HTTP responses
For a PDF you don’t want indexed:
HTTP/1.1 200 OK
Content-Type: application/pdf
X-Robots-Tag: noindex
For multiple directives:
X-Robots-Tag: noindex, nofollow, noarchive
For user-agent-specific directives (more advanced):
X-Robots-Tag: googlebot: noindex
X-Robots-Tag: bingbot: noindex, nofollow
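To make the header forms above concrete, here is a minimal Python sketch of how a crawler-side tool might read them. The function name `parse_x_robots_tag` is hypothetical, and the parsing is deliberately simplified: it assumes a leading `name:` is a user agent unless it is one of the directives that take a value (such as `unavailable_after`), and it is not a full implementation of any search engine's parser.

```python
from collections import defaultdict

# Directives written as "name: value"; a leading token from this set is
# a directive, not a user agent.
VALUED_DIRECTIVES = {
    "unavailable_after",
    "max-snippet",
    "max-image-preview",
    "max-video-preview",
}


def parse_x_robots_tag(header_values):
    """Map each user agent ('*' means all crawlers) to its set of directives.

    header_values is a list with one string per X-Robots-Tag header line.
    """
    rules = defaultdict(set)
    for value in header_values:
        agent = "*"
        head, sep, rest = value.partition(":")
        # Treat "googlebot: noindex" as agent-scoped; leave plain
        # "noindex, nofollow" and valued directives unscoped.
        is_agent = (
            sep
            and "," not in head
            and head.strip().lower() not in VALUED_DIRECTIVES
        )
        if is_agent:
            agent = head.strip().lower()
            value = rest
        for directive in value.split(","):
            directive = directive.strip().lower()
            if directive:
                rules[agent].add(directive)
    return dict(rules)
```

Passing the two user-agent-specific lines from the example yields one entry for `googlebot` and one for `bingbot`, while an unscoped `noindex, nofollow, noarchive` value is stored under `*`.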
Common use cases
- Exclude internal PDF reports from search
- Block image directories from image search
- Prevent JSON / API endpoints from being indexed
- Block video files (raw MP4s) from video search
- Apply directives to entire directories via server config
Implementation by server
Apache (.htaccess, requires mod_headers):
<FilesMatch "\.pdf$">
    Header set X-Robots-Tag "noindex, nofollow"
</FilesMatch>
Nginx:
location ~* \.pdf$ {
    add_header X-Robots-Tag "noindex, nofollow";
}
Cloudflare Workers / edge functions: set the header in the response.
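The same idea applies when the header is set in application code instead of server config. The sketch below is not Cloudflare Workers code. It uses a Python WSGI middleware as a stand-in, and the class name `XRobotsTagMiddleware`, the default pattern, and the default header value are all assumptions chosen to mirror the PDF rules above.

```python
import re

# Assumed rule: the same case-insensitive ".pdf" match as the Nginx example.
NOINDEX_PATTERN = re.compile(r"\.pdf$", re.IGNORECASE)


class XRobotsTagMiddleware:
    """WSGI middleware that sets X-Robots-Tag on responses for matching paths."""

    def __init__(self, app, pattern=NOINDEX_PATTERN, value="noindex, nofollow"):
        self.app = app
        self.pattern = pattern
        self.value = value

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")

        def patched_start_response(status, headers, exc_info=None):
            if self.pattern.search(path):
                # Replace any existing X-Robots-Tag so only one value is sent.
                headers = [
                    (name, val)
                    for name, val in headers
                    if name.lower() != "x-robots-tag"
                ]
                headers.append(("X-Robots-Tag", self.value))
            return start_response(status, headers, exc_info)

        return self.app(environ, patched_start_response)
```

Wrapping an existing WSGI application as `XRobotsTagMiddleware(app)` leaves non-matching paths untouched, which keeps the rule as narrow as the file-extension patterns in the server configs.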
Verification
Check the header is being sent correctly:
curl -I https://yourdomain.com/document.pdf
The response should include:
X-Robots-Tag: noindex
If the header is missing, the server config was not applied to that URL (for example, the rule's pattern did not match, or the config was not reloaded).
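For checking many URLs, the curl check can be scripted. The following is a rough Python equivalent using only the standard library. The helper names `fetch_x_robots_tag` and `has_directive` are hypothetical, and `has_directive` ignores user-agent scoping, so it only answers whether the directive appears anywhere in the header.

```python
import re
import urllib.request


def fetch_x_robots_tag(url, timeout=10):
    """Send a HEAD request (like curl -I) and return all X-Robots-Tag values."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # get_all returns None when the header is absent.
        return response.headers.get_all("X-Robots-Tag") or []


def has_directive(header_values, directive):
    """True if any header value contains the directive (case-insensitive)."""
    wanted = directive.lower()
    for value in header_values:
        # Split on commas and colons so "googlebot: noindex" also matches.
        tokens = [token.strip().lower() for token in re.split(r"[,:]", value)]
        if wanted in tokens:
            return True
    return False
```

Calling `has_directive(fetch_x_robots_tag("https://yourdomain.com/document.pdf"), "noindex")` should return `True` once the server rule is in place, and `False` if the header is missing.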
Common mistakes
- Forgetting Content-Type: confirm the header is actually sent for the file types you intended, by checking Content-Type and X-Robots-Tag together in the response
- Conflicting with robots.txt: if robots.txt blocks crawling of a URL, the crawler never requests it and never sees the header. Allow crawling, then use X-Robots-Tag to control indexing.
- Setting on wrong path patterns: overly broad rules can accidentally noindex pages you wanted indexed
- Expecting browsers to act on it: X-Robots-Tag is read by crawlers only; browsers ignore it
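The robots.txt conflict in the list above can be checked before relying on the header. This is a small sketch using Python's standard `urllib.robotparser`. The function name `crawler_can_see_header` and the sample rules are assumptions. Python's parser also does not support wildcards and may resolve overlapping Allow/Disallow rules differently from a given search engine, so treat the result as a first-pass check only.

```python
from urllib.robotparser import RobotFileParser


def crawler_can_see_header(robots_txt, url, user_agent="Googlebot"):
    """True if robots.txt allows fetching the URL, so the crawler can read X-Robots-Tag."""
    parser = RobotFileParser()
    # parse() takes the robots.txt content as a list of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that blocks a reports directory.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /reports/
"""
```

With this sample, a PDF under `/reports/` is blocked from crawling, so a noindex header on it would never be seen. A PDF outside that directory can be fetched and its header read.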
Resocial perspective
X-Robots-Tag is a small but powerful tool. Most clients we work with have legacy PDF or asset URLs leaking into the index — applying X-Robots-Tag noindex via server config removes them en masse without per-file edits. Part of the Technical SEO cleanup playbook.
- Resocial service → /services/seo/technical-seo/
- Read on the blog → /blog/technical-seo-complete-guide/