Definition
Robots.txt is a plain-text file at the root of your website that tells search engine crawlers which areas of the site they may or may not crawl.
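A minimal robots.txt lives at the site root (e.g. https://example.com/robots.txt) and might look like the sketch below; the domain and paths are illustrative, not recommendations for any specific site:

```
User-agent: *
Disallow: /wp-admin/

Sitemap: https://example.com/sitemap.xml
```

The `Sitemap` line is optional but widely used, since it points crawlers at the pages you do want discovered.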
Key Takeaways
- Robots.txt controls crawling, not indexing; a blocked URL can still be indexed if other sites link to it.
- For treatment sites, robots.txt mistakes can cause major visibility loss.
- Always QA robots.txt during redesigns, migrations, and plugin changes.
Why It Matters for Treatment and Behavioral Health
A single misplaced line in robots.txt can block important pages from being crawled, cutting their search visibility. This matters most on the program and location pages that drive calls.
Treatment Lens: Common Use Cases
Common uses include blocking internal admin paths, on-site search results pages, and low-value parameter URLs. Avoid blocking pages that you want to rank or that support visitor trust.
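The use cases above might translate into rules like these; the paths and parameter name are hypothetical examples, not recommendations for any specific site:

```
User-agent: *
# Internal admin area
Disallow: /wp-admin/
# On-site search results pages
Disallow: /search/
# Low-value parameter URLs
Disallow: /*?sort=
```

Wildcard patterns such as `*` in paths are supported by the major search engines, but older or niche crawlers may not honor them.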
Operational Checklist
Audit robots.txt after every launch, verify the rules in Google Search Console, and confirm that critical pages remain crawlable. Coordinate with caching and security plugins, which can silently rewrite or regenerate the file.
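A QA check like "critical pages remain crawlable" can be scripted with Python's standard-library robots.txt parser. This is a minimal sketch; the rules and URLs are hypothetical examples:

```python
from urllib import robotparser

# Hypothetical robots.txt contents; in practice you would fetch the
# live file, e.g. with rp.set_url(...) and rp.read().
rules = """
User-agent: *
Disallow: /wp-admin/
Disallow: /search/
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# Critical program/location pages should remain crawlable...
print(rp.can_fetch("*", "https://example.com/programs/detox/"))  # True
# ...while blocked paths should stay blocked.
print(rp.can_fetch("*", "https://example.com/search/?q=rehab"))  # False
```

Running a script like this against a list of must-rank URLs after each deploy catches accidental blocks before search engines do.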
Common Mistakes
- Blocking the entire site during development and forgetting to remove the rule at launch.
- Blocking CSS and JavaScript assets that crawlers need to render and evaluate pages.
- Using robots.txt to hide sensitive information instead of proper authentication.
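The first mistake above usually looks like the two lines below: a staging rule that blocks every crawler from every URL. If it ships to production, it must be removed immediately:

```
User-agent: *
Disallow: /
```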
Related Terms
Indexing, Search Engine Guidelines, XML Sitemaps, Canonical Tag
FAQ
Does robots.txt remove pages from Google?
Not reliably. Robots.txt controls crawling, not indexing. Removing a page from the index usually requires a noindex directive or Google's URL removal tools.
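A noindex directive goes in the page's HTML head (or in an `X-Robots-Tag` HTTP header); the snippet below is a generic example. Note that crawlers must be able to fetch the page to see the directive, so the URL must not also be blocked in robots.txt:

```html
<!-- In the page's <head> -->
<meta name="robots" content="noindex">
```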
Should we block parameter URLs?
Often yes, if they create crawl waste and duplication. Do it carefully and validate impact.
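Parameter blocking might look like the sketch below; the parameter names are hypothetical, and patterns like these should be tested against your real URL inventory before deploying:

```
User-agent: *
Disallow: /*?filter=
Disallow: /*&sessionid=
```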
How often should robots.txt be checked?
After any site change and at least quarterly as part of technical SEO maintenance.
