X-Robots-Tag

Also known as: X-Robots-Tag header

X-Robots-Tag is an HTTP response header that lets server administrators apply robots meta directives (noindex, nofollow, noarchive, etc.) at the response level. It is mainly used for non-HTML resources (PDFs, images, videos), which have no document head to hold a `<meta name='robots'>` tag. Same directives, different transport mechanism. Useful for excluding non-HTML content from search results without blocking crawlers from accessing it.

When you need X-Robots-Tag (vs meta robots)

  • HTML page → use <meta name="robots" content="..."> in the head
  • PDF, image, video, JSON file → these don’t have a <head> to add a meta tag to. Use X-Robots-Tag in the HTTP response header
  • Server-side blanket directives → set X-Robots-Tag at the server level to apply to many URLs without touching their content

Example HTTP responses

For a PDF you don’t want indexed:

HTTP/1.1 200 OK
Content-Type: application/pdf
X-Robots-Tag: noindex

For multiple directives:

X-Robots-Tag: noindex, nofollow, noarchive

For user-agent-specific directives (more advanced):

X-Robots-Tag: googlebot: noindex
X-Robots-Tag: bingbot: noindex, nofollow
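
To make the header formats above concrete, here is a minimal Python sketch of how a tool might read them. The function name `parse_x_robots_tag` and the `VALUED_DIRECTIVES` set are illustrative assumptions, not part of any crawler's published parser; real crawlers may handle edge cases differently.

```python
# Directives that use their own "name: value" form, so a leading "name:"
# is not mistaken for a user-agent prefix. (Illustrative list, an assumption.)
VALUED_DIRECTIVES = {
    "unavailable_after",
    "max-snippet",
    "max-image-preview",
    "max-video-preview",
}


def parse_x_robots_tag(header_values):
    """Map each user agent ('*' means all crawlers) to its set of directives."""
    rules = {}
    for value in header_values:
        agent = "*"
        head, sep, rest = value.partition(":")
        # Treat "head:" as a user agent only if it is a single token that is
        # not itself a directive carrying a value.
        if sep and "," not in head and head.strip().lower() not in VALUED_DIRECTIVES:
            agent, value = head.strip().lower(), rest
        directives = {d.strip().lower() for d in value.split(",") if d.strip()}
        rules.setdefault(agent, set()).update(directives)
    return rules
```

Passing the two user-agent-specific lines above yields one entry for googlebot and one for bingbot, while a plain `noindex, nofollow, noarchive` value is stored under `*`.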

Common use cases

  • Exclude internal PDF reports from search
  • Block image directories from image search
  • Prevent JSON / API endpoints from being indexed
  • Block video files (raw MP4s) from video search
  • Apply directives to entire directories via server config

Implementation by server

Apache (.htaccess, requires mod_headers):

<FilesMatch "\.pdf$">
  Header set X-Robots-Tag "noindex, nofollow"
</FilesMatch>

Nginx:

location ~* \.pdf$ {
  add_header X-Robots-Tag "noindex, nofollow";
}

Cloudflare Workers / edge functions: set the header in the response.
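
The same "set the header in the response" idea also works at the application layer. The sketch below is not Cloudflare's API; it is a hypothetical WSGI middleware in Python, and the `RULES` table, `robots_header_for`, and `XRobotsTagMiddleware` names are assumptions chosen for illustration.

```python
import re

# Hypothetical rule table: URL path pattern -> X-Robots-Tag value.
RULES = [
    (re.compile(r"\.pdf$", re.IGNORECASE), "noindex, nofollow"),
    (re.compile(r"^/api/"), "noindex"),
]


def robots_header_for(path):
    """Return the header value for the first matching rule, or None."""
    for pattern, value in RULES:
        if pattern.search(path):
            return value
    return None


class XRobotsTagMiddleware:
    """Wrap a WSGI app and add X-Robots-Tag to responses for matching paths."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        value = robots_header_for(environ.get("PATH_INFO", ""))

        def patched_start_response(status, headers, exc_info=None):
            if value is not None:
                # Replace any existing value so the header is sent once.
                headers = [h for h in headers if h[0].lower() != "x-robots-tag"]
                headers.append(("X-Robots-Tag", value))
            return start_response(status, headers, exc_info)

        return self.app(environ, patched_start_response)
```

Keeping the rules in one table makes it easier to review exactly which paths receive a noindex before the change goes live.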

Verification

Check the header is being sent correctly:

curl -I https://yourdomain.com/document.pdf

The response should include:

X-Robots-Tag: noindex

If the header is missing, the server configuration did not apply to that URL.
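
For checking many URLs, the curl check can be scripted. This is a rough Python sketch using only the standard library; `fetch_x_robots_tag` and `has_directive` are hypothetical helper names, and the URL in the comment is the placeholder used above.

```python
import urllib.request


def fetch_x_robots_tag(url, timeout=10):
    """Send a HEAD request and return every X-Robots-Tag header value."""
    request = urllib.request.Request(url, method="HEAD")
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.headers.get_all("X-Robots-Tag") or []


def has_directive(header_values, directive):
    """True if any header value lists the directive (case-insensitive).

    Simplification: user-agent prefixes are ignored, so a directive that
    targets only one crawler still counts as present.
    """
    wanted = directive.lower()
    for value in header_values:
        tokens = [t.strip().lower() for t in value.replace(":", ",").split(",")]
        if wanted in tokens:
            return True
    return False


# Example (performs a real network request):
# values = fetch_x_robots_tag("https://yourdomain.com/document.pdf")
# print(has_directive(values, "noindex"))
```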

Common mistakes

  • Not checking by file type: confirm the header is actually sent for each file type you meant to cover (PDFs, images, videos), not just the one you tested
  • Conflicting with robots.txt: if robots.txt blocks crawling of a URL, the crawler never requests it and never sees the header. Allow crawling, then use X-Robots-Tag to control indexation.
  • Setting overly broad path patterns: a rule that matches too much can accidentally noindex pages you wanted indexed
  • Expecting browsers to act on it: X-Robots-Tag is an instruction for crawlers only; browsers ignore it, so it does not restrict who can open the file
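
The robots.txt conflict is easy to test for before relying on the header. A minimal sketch, assuming Python's standard `urllib.robotparser`; the helper name `header_is_reachable` and the sample rules and URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def header_is_reachable(robots_txt, url, user_agent="*"):
    """True if robots.txt lets the crawler fetch the URL.

    A crawler can only see an X-Robots-Tag header on URLs it is
    allowed to request.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that disallows the /reports/ directory.
SAMPLE_ROBOTS_TXT = "User-agent: *\nDisallow: /reports/\n"
```

With the sample rules, a PDF under /reports/ is not reachable, so a noindex header on it would never be read; a PDF under another path is reachable.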

Resocial perspective

X-Robots-Tag is a small but powerful tool. Most clients we work with have legacy PDF or asset URLs leaking into the index. Applying X-Robots-Tag noindex via server config drops them from the index en masse as they are recrawled, without per-file edits. Part of the Technical SEO cleanup playbook.
