Definition
Robots.txt is a plain-text file at the root of your website that tells search engine crawlers which areas of the site they may or may not crawl.
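A minimal robots.txt lives at the site root (e.g. https://example.com/robots.txt) and might look like the sketch below; the domain and paths are illustrative, not recommendations for any specific site:

```
User-agent: *
Disallow: /wp-admin/

Sitemap: https://example.com/sitemap.xml
```

The `Sitemap` line is optional but widely used, since it points crawlers at the pages you do want discovered.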
Key Takeaways
- Robots.txt controls crawling, not indexing; a blocked URL can still be indexed if other sites link to it.
- For treatment sites, robots.txt mistakes can cause major visibility loss.
- Always QA robots.txt during redesigns, migrations, and plugin changes.
Why It Matters for Treatment and Behavioral Health
A single misplaced line in robots.txt can block important pages from being crawled, cutting their search visibility. This matters most on the program and location pages that drive calls.
Treatment Lens: Common Use Cases
Common uses include blocking internal admin paths, on-site search results pages, and low-value parameter URLs. Avoid blocking pages that you want to rank or that support visitor trust.
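The use cases above might translate into rules like these; the paths and parameter name are hypothetical examples, not recommendations for any specific site:

```
User-agent: *
# Internal admin area
Disallow: /wp-admin/
# On-site search results pages
Disallow: /search/
# Low-value parameter URLs
Disallow: /*?sort=
```

Wildcard patterns such as `*` in paths are supported by the major search engines, but older or niche crawlers may not honor them.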
Operational Checklist
Audit robots.txt after every launch, verify the rules in Google Search Console, and confirm that critical pages remain crawlable. Coordinate with caching and security plugins, which can silently rewrite or regenerate the file.
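A QA check like "critical pages remain crawlable" can be scripted with Python's standard-library robots.txt parser. This is a minimal sketch; the rules and URLs are hypothetical examples:

```python
from urllib import robotparser

# Hypothetical robots.txt contents; in practice you would fetch the
# live file, e.g. with rp.set_url(...) and rp.read().
rules = """
User-agent: *
Disallow: /wp-admin/
Disallow: /search/
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# Critical program/location pages should remain crawlable...
print(rp.can_fetch("*", "https://example.com/programs/detox/"))  # True
# ...while blocked paths should stay blocked.
print(rp.can_fetch("*", "https://example.com/search/?q=rehab"))  # False
```

Running a script like this against a list of must-rank URLs after each deploy catches accidental blocks before search engines do.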
Common Mistakes
- Blocking the entire site during development and forgetting to remove the rule at launch.
- Blocking CSS and JavaScript assets that crawlers need to render and evaluate pages.
- Using robots.txt to hide sensitive information instead of proper authentication.
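The first mistake above usually looks like the two lines below: a staging rule that blocks every crawler from every URL. If it ships to production, it must be removed immediately:

```
User-agent: *
Disallow: /
```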
Related Terms
Indexing, Search Engine Guidelines, XML Sitemaps, Canonical Tag
FAQ
Does robots.txt remove pages from Google?
Not reliably. Robots.txt controls crawling, not indexing. Removing a page from the index usually requires a noindex directive or Google's URL removal tools.
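A noindex directive goes in the page's HTML head (or in an `X-Robots-Tag` HTTP header); the snippet below is a generic example. Note that crawlers must be able to fetch the page to see the directive, so the URL must not also be blocked in robots.txt:

```html
<!-- In the page's <head> -->
<meta name="robots" content="noindex">
```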
Should we block parameter URLs?
Often yes, if they create crawl waste and duplication. Do it carefully and validate impact.
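Parameter blocking might look like the sketch below; the parameter names are hypothetical, and patterns like these should be tested against your real URL inventory before deploying:

```
User-agent: *
Disallow: /*?filter=
Disallow: /*&sessionid=
```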
How often should robots.txt be checked?
After any site change and at least quarterly as part of technical SEO maintenance.
