Editing
Deep web
(section)
==Indexing methods==
Methods that prevent web pages from being indexed by traditional search engines may be categorized as one or more of the following:
# '''Contextual web''': pages with content varying for different access contexts (e.g., ranges of client IP addresses or previous navigation sequence).
# '''Dynamic content''': [[Dynamic Web page|dynamic pages]], which are returned in response to a submitted query or accessed only through a form, especially if open-domain input elements (such as text fields) are used; such fields are hard to navigate without [[domain knowledge]].
# '''Limited access content''': sites that limit access to their pages in a technical manner (e.g., using the [[Robots Exclusion Standard]], [[CAPTCHA]]s, or the no-store directive, which prohibit search engines from browsing them and creating [[web cache|cached]] copies).<ref>{{cite journal|title=Hypertext Transfer Protocol (HTTP/1.1): Caching|website=[[Internet Engineering Task Force]]|year=2014|doi=10.17487/RFC7234 |url=http://tools.ietf.org/html/rfc7234#section-5.2.2.3|access-date=July 30, 2014|editor-last1=Fielding |editor-last2=Nottingham |editor-last3=Reschke |editor-first1=R. |editor-first2=M. |editor-first3=J. |last1=Fielding |first1=R. |last2=Nottingham |first2=M. |last3=Reschke |first3=J. }}</ref> Sites may feature an internal search engine for exploring such pages.<ref>[[Special:Search]]</ref><ref>{{Cite web|url=https://archive.org/search.php|title=Internet Archive Search}}</ref>
# '''Non-HTML/text content''': textual content encoded in multimedia (image or video) files or specific [[file formats]] not recognised by search engines.
# '''Private web''': sites that require registration and login (password-protected resources).
# '''Scripted content''': pages that are accessible only through links produced by [[JavaScript]], as well as content dynamically downloaded from Web servers via [[Adobe Flash|Flash]] or [[Ajax (programming)|Ajax]] solutions.
# '''Software''': certain content is intentionally hidden from the regular Internet and is accessible only with special software, such as [[Tor (anonymity network)|Tor]], [[I2P]], or other darknet software. For example, Tor allows users to access websites anonymously using the [[.onion]] server address, hiding their IP address.
# '''Unlinked content''': pages that are not linked to by other pages, which may prevent [[web crawling]] programs from accessing the content. Such content is referred to as pages without [[backlink]]s (also known as inlinks). In addition, search engines do not always detect all backlinks from the web pages they search.
# '''Web archives''': Web archival services such as the [[Wayback Machine]] enable users to see archived versions of web pages across time, including websites that have become inaccessible and are not indexed by search engines such as Google.<ref name=":0" /> The Wayback Machine may be termed a program for viewing the deep web, because archived past versions of websites cannot be indexed or found through a conventional search. Because websites change over time, their earlier versions survive only in such archives, which is why web archives are considered deep web content.<ref>{{cite web|last1=Wiener-Bronner|first1=Danielle|title=NASA is indexing the 'Deep Web' to show mankind what Google won't|url=http://fusion.net/story/145885/nasa-is-indexing-the-deep-web-to-show-mankind-what-google-wont/|publisher=Fusion|date=June 10, 2015|access-date=June 27, 2015|quote=There are other simpler versions of Memex already available. "If you've ever used the Internet Archive's Wayback Machine", which gives you past versions of a website not accessible through Google, then you've technically searched the Deep Web, said [[Chris Mattmann]].|archive-date=June 30, 2015|archive-url=https://web.archive.org/web/20150630010143/http://fusion.net/story/145885/nasa-is-indexing-the-deep-web-to-show-mankind-what-google-wont/|url-status=dead}}</ref>
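The "limited access content" category can be illustrated with a minimal Python sketch of the two checks a well-behaved crawler might perform before indexing a page. The robots.txt body, the <code>ExampleBot</code> user agent, the example.org URLs, and the helper names are hypothetical and chosen only for illustration; real crawlers apply further rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body for an illustrative site (an assumption).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def forbids_caching(cache_control: str) -> bool:
    """Return True if a Cache-Control header value contains no-store."""
    directives = [d.strip().lower() for d in cache_control.split(",")]
    return "no-store" in directives


if __name__ == "__main__":
    for url in ("https://example.org/public/index.html",
                "https://example.org/private/report.html"):
        print(url, is_crawl_allowed(ROBOTS_TXT, "ExampleBot", url))
    print(forbids_caching("no-cache, no-store"))
```

Under these assumptions, a page beneath <code>/private/</code> is skipped by the crawler, and a response carrying <code>no-store</code> is not kept as a cached copy, so neither ends up in a conventional search index.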
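The "unlinked content" category can likewise be sketched as a small graph problem: a crawler that only follows links never reaches a page that no other page links to. The link graph, page paths, and function names below are hypothetical, and the breadth-first traversal is a simplification of how actual crawlers work.

```python
from collections import deque

# Hypothetical link graph: each page maps to the pages it links to.
LINKS = {
    "/": ["/about", "/news"],
    "/about": ["/"],
    "/news": ["/news/2015"],
    "/news/2015": [],
    "/orphan": ["/"],  # links out, but nothing links to it
}


def pages_without_backlinks(links):
    """Return the pages that no page in the graph links to."""
    linked_to = {target for targets in links.values() for target in targets}
    return set(links) - linked_to


def reachable_from(links, seeds):
    """Return every page a link-following crawler finds from the seeds."""
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


if __name__ == "__main__":
    crawled = reachable_from(LINKS, ["/"])
    print(sorted(pages_without_backlinks(LINKS)))
    print(sorted(set(LINKS) - crawled))
```

In this toy graph, <code>/orphan</code> has no backlinks and is never visited when crawling starts from <code>/</code>, which is the sense in which unlinked pages fall outside a search index.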