Robots.txt
Definition
A text file that tells search engine crawlers which pages they can and can't access on your website. Important for controlling which parts of your site search engines crawl.
What is Robots.txt?
Robots.txt is a simple text file in your website's root folder (yoursite.com/robots.txt) that gives instructions to search engine crawlers about which parts of your site they can visit.
How It Works
When a crawler visits your site, it first checks robots.txt for instructions. You can tell it:
- Which pages/folders to avoid
- Which pages/folders to crawl
- Where your sitemap is located
Basic Syntax
User-agent: *
Disallow: /private/
Allow: /public/
Sitemap: https://yoursite.com/sitemap.xml
- User-agent: Which crawler the rules apply to (* means all)
- Disallow: Pages or folders the crawler should avoid
- Allow: Pages to crawl (overrides a broader Disallow rule)
- Sitemap: Your sitemap location
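You can check how a crawler would interpret rules like these with Python's standard-library parser. A minimal sketch, using the sample rules above (yoursite.com is the placeholder domain from the example, not a real site):

```python
from urllib.robotparser import RobotFileParser

# The sample robots.txt from above, as a list of lines.
lines = """\
User-agent: *
Disallow: /private/
Allow: /public/
Sitemap: https://yoursite.com/sitemap.xml
""".splitlines()

parser = RobotFileParser()
parser.parse(lines)

# can_fetch(user_agent, url) answers "may this crawler visit this URL?"
print(parser.can_fetch("*", "https://yoursite.com/private/page.html"))  # False
print(parser.can_fetch("*", "https://yoursite.com/public/page.html"))   # True
print(parser.can_fetch("*", "https://yoursite.com/other/"))             # True: no rule matches
print(parser.site_maps())  # ['https://yoursite.com/sitemap.xml']
```

Note how /other/ is allowed by default: robots.txt only restricts what you explicitly disallow.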
Common Uses
Block Admin Areas
Disallow: /admin/
Disallow: /wp-admin/
Block Search Results
Disallow: /search/
Block Staging Sites
User-agent: *
Disallow: /
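A bare "Disallow: /" shuts out every compliant crawler from every path. A quick sketch confirming this with Python's standard-library parser (staging.yoursite.com is a hypothetical staging host):

```python
from urllib.robotparser import RobotFileParser

# "Disallow: /" under "User-agent: *" blocks the whole site for all crawlers.
parser = RobotFileParser()
parser.parse(["User-agent: *", "Disallow: /"])

print(parser.can_fetch("Googlebot", "https://staging.yoursite.com/"))        # False
print(parser.can_fetch("Bingbot", "https://staging.yoursite.com/any/page"))  # False
```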
Important Warnings
Not Security
Robots.txt is a suggestion, not a lock. Well-behaved crawlers follow it, but malicious bots ignore it, and a blocked page can still appear in search results if other sites link to it. Don't use robots.txt to hide sensitive content; use authentication or a noindex directive instead.
Blocking Too Much
Accidentally blocking important pages is common. Always check that your robots.txt isn't blocking content you want indexed.
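One way to catch this early is a small pre-deploy check. A sketch using Python's standard urllib.robotparser (the rules, URLs, and find_blocked helper are hypothetical illustrations):

```python
from urllib.robotparser import RobotFileParser

def find_blocked(robots_lines, urls, user_agent="Googlebot"):
    """Return the URLs that the given crawler may not fetch under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return [url for url in urls if not parser.can_fetch(user_agent, url)]

# Hypothetical file meant to block /search/, but which also blocks /products/.
robots = """\
User-agent: *
Disallow: /search/
Disallow: /products/
""".splitlines()

# Pages that must stay crawlable.
important = [
    "https://yoursite.com/",
    "https://yoursite.com/products/widget",
]

print(find_blocked(robots, important))  # ['https://yoursite.com/products/widget']
```

Running a check like this against your list of key pages whenever robots.txt changes turns a silent indexing problem into an immediate failure.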
Checking Your Robots.txt
Visit yoursite.com/robots.txt to see your current file. Google Search Console also has a robots.txt tester tool.