{{Short description|Filename used to indicate portions for web crawling}}
{{Lowercase}}
{{Selfref|For Wikipedia's robots.txt file, see https://en.wikipedia.org/robots.txt.}}
{{Pp-pc1}}
{{Pp-pc|small=yes}}
{{Infobox technology standard
| title = robots.txt
| long_name = Robots Exclusion Protocol
| image = Robots txt.svg
| image_size =
| alt =
| caption = Example of a simple robots.txt file, indicating that a user-agent called "Mallorybot" is not allowed to crawl any of the website's pages, and that other user-agents cannot crawl more than one page every 20 seconds and are not allowed to crawl the "secret" folder
| abbreviation =
| native_name = <!-- Name in local language. If more than one, separate using {{plain list}} -->
| native_name_lang = <!-- ISO 639-1 code e.g. "fr" for French. If more than one, use {{lang}} inside native_name items instead -->
| status = Proposed Standard
| year_started = <!-- {{Start date|YYYY|MM|DD|df=y}} -->
| first_published = Published in 1994, formally standardized in 2022
| version =
| version_date =
| preview =
| preview_date =
| organization =
| committee =
| series =
| editors =
| authors = {{plain list|
* Martijn Koster (original author)
* Gary Illyes, Henner Zeller, Lizzi Sassman (IETF contributors)
}}
| base_standards =
| related_standards =
| predecessor =
| successor =
| domain =
| license =
| copyright =
| website = {{URL|https://robotstxt.org}}, {{URL|https://datatracker.ietf.org/doc/html/rfc9309|RFC 9309}}
}}

'''robots.txt''' is the [[filename]] used for implementing the '''Robots Exclusion Protocol''', a standard used by [[website]]s to indicate to visiting [[web crawler]]s and other [[Internet bot|web robots]] which portions of the website they are allowed to visit. The standard, developed in 1994, relies on [[voluntary compliance]]. Malicious bots can use the file as a directory of which pages to visit, though standards bodies discourage countering this with [[security through obscurity]].
Some archival sites ignore robots.txt. The standard was used in the 1990s to mitigate [[server (computing)|server]] overload. In the 2020s, websites began denying bots that collect information for [[generative artificial intelligence]]. The robots.txt file can be used in conjunction with [[sitemaps]], a robot inclusion standard for websites.
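A file matching the description in the infobox caption might look like the following (an illustrative sketch; note that <code>Crawl-delay</code> is a widely used but non-standard extension, not part of RFC 9309):

<pre>
User-agent: Mallorybot
Disallow: /

User-agent: *
Crawl-delay: 20
Disallow: /secret/
</pre>

Each record begins with one or more <code>User-agent</code> lines naming the crawlers it applies to (<code>*</code> matches any crawler), followed by directives such as <code>Disallow</code>, which lists path prefixes the named crawlers should not fetch.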