Home > Computers > Internet > Searching > Search Engines > Robots
Web robots (also known as crawlers or spiders) are programs that automatically traverse the Web; search engines use them to index the Web, or parts of it.
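
As a minimal illustration of what such a robot does, here is a short Python sketch, assuming a reachable start URL and omitting error handling, politeness delays, and the robots.txt checks a well-behaved robot would add (see the robots.txt example further down):

# Minimal crawler sketch: fetch a page, collect its links, queue them.
# Illustrative only; real robots also honor robots.txt and rate limits.
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

class LinkCollector(HTMLParser):
    # Collect href targets from <a> tags.
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(v for k, v in attrs if k == "href" and v)

def crawl(start_url, max_pages=3):
    queue, seen = [start_url], set()
    while queue and len(seen) < max_pages:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        with urlopen(url) as resp:  # fetch the page
            collector = LinkCollector()
            collector.feed(resp.read().decode("utf-8", errors="replace"))
        # Resolve relative links and queue them for later visits.
        queue.extend(urljoin(url, link) for link in collector.links)
    return seen

# Example: crawl("https://example.com/") visits up to three pages.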
http://www.searchtools.com/robots/
Search Tools Consulting explains how the search engine programs called "robots" or "spiders" work, and reviews related sites.
http://www.the-acap.org/
Standard being developed on behalf of content publishers to communicate permissions information more extensively than robots.txt allows. Project documents, implementation details, and background information.
http://www.botsvsbrowsers.com/
This large database lists user agents in categories and distinguishes between robots and browsers.
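
As a rough sketch of how such a database might be used programmatically, the following Python heuristic separates robots from browsers by scanning the user-agent string for common robot tokens; the token list is a small illustrative sample, not a complete ruleset:

import re

# Common substrings found in robot user agents (illustrative, not exhaustive).
BOT_TOKENS = re.compile(r"bot|crawler|spider|slurp|fetcher|archiver", re.IGNORECASE)

def looks_like_robot(user_agent):
    # True when the user-agent string contains a known robot token.
    return bool(BOT_TOKENS.search(user_agent))

print(looks_like_robot("Googlebot/2.1 (+http://www.google.com/bot.html)"))  # True
print(looks_like_robot("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))        # False

Token matching of this kind is only a heuristic; curated databases such as the one above exist precisely because real user-agent strings need case-by-case classification.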
http://www.siteware.ch/webresources/useragents/db.html
An alphabetical list of user agents and the deployers behind them, compiled by Christoph Rüegg.
http://www.user-agents.org/
A searchable database of user-agents with information about their type, purpose and origin.
http://www.iplists.com/
Lists the IP addresses of search engine spiders, searchable by IP address, with links to further resources on spiders.
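
One common use of such IP lists is forward-confirmed reverse DNS: resolve the visiting IP to a hostname, check that the hostname belongs to the claimed search engine, then resolve it back and confirm it maps to the same IP. A Python sketch, assuming live DNS and using an illustrative Googlebot address:

import socket

def verify_spider_ip(ip, expected_suffixes):
    # Forward-confirmed reverse DNS check for a claimed search engine IP.
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)      # reverse lookup
        if not hostname.endswith(expected_suffixes):
            return False
        return socket.gethostbyname(hostname) == ip    # forward confirmation
    except OSError:
        return False

# Example (result depends on live DNS):
print(verify_spider_ip("66.249.66.1", (".googlebot.com",)))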
http://www.jafsoft.com/searchengines/webbots.html
John A. Fotheringham tabulates the robots that search engines and other sites send to read and index Web pages, including their origins, names, and IP addresses.
http://www.robotstxt.org/
Information on the robots.txt Robots Exclusion Standard, along with articles about writing well-behaved Web robots.
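
For reference, a robots.txt file is a plain-text list of User-agent and Disallow rules, and Python's standard-library urllib.robotparser can check them. A minimal sketch with an illustrative ruleset:

from urllib import robotparser

# Illustrative rules, not taken from any listed site.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: BadBot
Disallow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(SAMPLE_ROBOTS_TXT.splitlines())

# A well-behaved robot asks before fetching each URL.
print(rp.can_fetch("*", "https://example.com/private/page.html"))  # False
print(rp.can_fetch("*", "https://example.com/index.html"))         # True
print(rp.can_fetch("BadBot", "https://example.com/index.html"))    # False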
http://user-agent-string.info/
Tool from ASAP Consulting s.r.o. for detailed user-agent string analysis via an online form. Includes databases of browsers and robots.
http://user-agents.my-addr.com/
A database of user agents for crawlers, spiders, and browsers, plus tools for user-agent lookup and user-agent string search.