=== Read request message ===
Web server programs are able:<ref name="rfc7230-2.1">{{cite IETF |rfc=7230 |sectionname=Client/Server Messaging|section=2.1|title=RFC 7230, HTTP/1.1: Message Syntax and Routing|pages=7–8}}</ref> <ref name="rfc7230-3.4">{{cite IETF |rfc=7230 |sectionname=Handling Incomplete Messages|section=3.4|title=RFC 7230, HTTP/1.1: Message Syntax and Routing|page=34}}</ref> <ref name="rfc7230-3.5">{{cite IETF |rfc=7230 |sectionname=Message Parsing Robustness|section=3.5|title=RFC 7230, HTTP/1.1: Message Syntax and Routing|pages=34–35}}</ref>
* to read an HTTP request message;
* to interpret it;
* to verify its syntax;
* to identify known [[HTTP headers]] and to extract their values from them.

Once an HTTP request message has been decoded and verified, its values can be used to determine whether that request can be satisfied or not. This requires many other steps, including '''[[Computer security|security checks]]'''.

==== URL normalization ====
{{Main|URL normalization}}
Web server programs usually perform some type of [[URL normalization]] (on the [[Uniform Resource Locator|URL]] found in most HTTP request messages) in order to:
* make the resource path always a clean, uniform path from the root directory of the website;
* lower security risks (e.g. by making it easier to intercept attempts to access static resources outside the root directory of the website, or to access portions of the path below the website root directory that are forbidden or that require authorization);
* make the paths of web resources more recognizable by human beings and [[Web log analysis software|web log analysis programs]] (also known as log analyzers / statistical applications).

The term ''URL normalization'' refers to the process of modifying and standardizing a URL in a consistent manner. There are several types of normalization that may be performed, including the conversion of the scheme and host to lowercase. Among the most important normalizations are the removal of "." and ".."
path segments and adding trailing slashes to a non-empty path component.

==== URL mapping ====
{{Update section|date=June 2023}}{{Main|URL mapping}}
<blockquote>"URL mapping is the process by which a URL is analyzed to figure out what resource it is referring to, so that that resource can be returned to the requesting client. This process is performed with every request that is made to a web server, with some of the requests being served with a file, such as an HTML document, or a gif image, others with the results of running a CGI program, and others by some other process, such as a built-in module handler, a PHP document, or a Java servlet."<ref name="ws-url-mapping">{{Cite web|url=http://people.apache.org/~jim/ApacheCons/ApacheCon2002/pdf/Bowen-urlmap-ACUS02/bowen-urlmap-ACUS02.pdf|title=URL Mapping|author=R. Bowen|publisher=Apache software foundation|date=2002-09-29|access-date=2021-11-15|language=en|archive-date=15 November 2021|archive-url=https://web.archive.org/web/20211115181448/http://people.apache.org/~jim/ApacheCons/ApacheCon2002/pdf/Bowen-urlmap-ACUS02/bowen-urlmap-ACUS02.pdf|url-status=live}}</ref>{{Update inline|date=June 2023}}</blockquote>

In practice, web server programs that implement advanced features, beyond the simple ''static content serving'' (e.g. URL rewrite engine, dynamic content serving), usually have to figure out how that URL has to be handled, e.g. as a:
* [[#URL redirection|URL redirection]], a redirection to another URL;
* ''static request'' of [[Computer file|file]] content;
* ''dynamic request'' of:
** [[Directory (computing)|directory]] listing of files or other sub-directories contained in that directory;
** other types of dynamic request in order to identify the program / module processor able to handle that kind of URL path and to pass to it other [[URL parts]], i.e. usually path-info and [[query string]] variables.

One or more configuration files of the web server may specify the mapping of parts of the '''URL path''' (e.g.
initial parts of [[Path (computing)|file path]], [[filename extension]] and other path components) to a specific URL handler (file, directory, external program or internal module).<ref name="ws-static-rqs-root-dir">{{Cite web|url=https://httpd.apache.org/docs/2.4/urlmapping.html|title=Mapping URLs to Filesystem Locations|publisher=Apache: HTTPd server project|year=2021|access-date=2021-10-19|language=en|archive-date=20 October 2021|archive-url=https://web.archive.org/web/20211020053640/http://httpd.apache.org/docs/2.4/urlmapping.html|url-status=live}}</ref>

When a web server implements one or more of the above-mentioned advanced features, the path part of a valid URL may not always match an existing file system path under the website directory tree (a file or a directory in the [[file system]]), because it can refer to a virtual name of an internal or external module processor for dynamic requests.

==== URL path translation to file system ====
Web server programs are able to translate a URL path (all or part of it) that refers to a physical file system path to an [[Path (computing)|absolute path]] under the target website's root directory.<ref name="ws-static-rqs-root-dir" />

The website's root directory may be specified by a configuration file or by some internal rule of the web server, using the name of the website, which is the [[URL#Syntax|host]] part of the URL found in the HTTP client request.<ref name="ws-static-rqs-root-dir" />

Path translation to the file system is done for the following types of web resources:
* a local, usually non-executable, file (static request for file content);
* a local directory (dynamic request: directory listing generated on the fly);
* a program name (dynamic request that is executed using the CGI or SCGI interface and whose output is read by the web server and resent to the client that made the HTTP request).

The web server takes the path found in the requested URL (HTTP request message) and appends it to the path of the (Host) website root directory.
On an [[Apache HTTP Server|Apache server]], this is commonly <code>/home/www/website</code> (on [[Unix]] machines, usually it is: <code>/var/www/website</code>). See the following examples of how it may result.

'''URL path translation for a static file request'''

Example of a ''static request'' of an existing file specified by the following URL:
 <nowiki>http://www.example.com/path/file.html</nowiki>

The client's [[user agent]] connects to <code><nowiki>www.example.com</nowiki></code> and then sends the following [[HTTP]]/1.1 request:
 GET <nowiki>/path/file.html</nowiki> HTTP/1.1
 <nowiki>Host: www.example.com</nowiki>
 Connection: keep-alive

The result is the local file system resource:
 <nowiki>/home/www/www.example.com/path/file.html</nowiki>

The web server then reads the [[Computer file|file]], if it exists, and sends a response to the client's web browser. The response either describes the content of the file and contains the file itself, or it is an error message saying that the file does not exist or that access to it is forbidden.
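
The translation in the static file example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular web server implements it: the function name <code>translate_path</code> and the constant <code>BASE_DIR</code> are hypothetical, and the base directory is taken from the example above. A real server would also check the host name against its configured websites before using it in a path.

```python
import posixpath
from urllib.parse import unquote, urlsplit

# Assumed base directory, taken from the example above (/home/www/<host>).
BASE_DIR = "/home/www"


def translate_path(host: str, request_target: str, base_dir: str = BASE_DIR) -> str:
    """Map the path of a request target to an absolute path under the site root."""
    # Keep only the path component (drop any query string) and decode %XX escapes.
    url_path = unquote(urlsplit(request_target).path)
    # Anchor the path at "/" and collapse "." and ".." segments, so that
    # ".." can never climb above the website root directory.
    clean = posixpath.normpath("/" + url_path.lstrip("/"))
    # Hypothetical rule: the site root is the base directory plus the host name.
    site_root = posixpath.join(base_dir, host.lower())
    return site_root + clean


print(translate_path("www.example.com", "/path/file.html"))
```

Because the path is normalized before it is appended to the root directory, a request target such as <code>/a/./b/../../../etc/passwd</code> still resolves to a location under the website root in this sketch.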

'''URL path translation for a directory request (without a static index file)'''

Example of an implicit ''dynamic request'' of an existing directory specified by the following URL:
 <nowiki>http://www.example.com/directory1/directory2/</nowiki>

The client's [[user agent]] connects to <code><nowiki>www.example.com</nowiki></code> and then sends the following [[HTTP]]/1.1 request:
 GET <nowiki>/directory1/directory2/</nowiki> HTTP/1.1
 <nowiki>Host: www.example.com</nowiki>
 Connection: keep-alive

The result is the local directory path:
 <nowiki>/home/www/www.example.com/directory1/directory2/</nowiki>

The web server then verifies the existence of the [[Directory (computing)|directory]]. If it exists and can be accessed, the server tries to find an index file (which in this case does not exist), so it passes the request to an internal module or a program dedicated to directory listings, reads the data output, and finally sends a response to the client's web browser. The response either describes the content of the directory (a list of the contained subdirectories and files), or it is an error message saying that the directory does not exist or that access to it is forbidden.
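
The decision sequence for a directory request (check existence, check access, look for an index file, otherwise generate a listing) might look roughly like the following Python sketch. The function name <code>handle_directory_request</code>, the index file names in <code>INDEX_NAMES</code> and the shape of the generated HTML are assumptions made for illustration; actual servers make these configurable and produce richer listings.

```python
import html
import os
from urllib.parse import quote

# Assumed index file names; real servers read these from their configuration.
INDEX_NAMES = ("index.html", "index.htm")


def handle_directory_request(directory: str, url_path: str) -> tuple[int, str]:
    """Return (status code, response body) for a request that maps to a directory."""
    if not os.path.isdir(directory):
        return 404, "Not Found"
    if not os.access(directory, os.R_OK | os.X_OK):
        return 403, "Forbidden"
    # If a static index file exists, serve it instead of a generated listing.
    for name in INDEX_NAMES:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            with open(candidate, encoding="utf-8") as f:
                return 200, f.read()
    # No index file: generate the listing on the fly.
    items = []
    for name in sorted(os.listdir(directory)):
        # Mark subdirectories with a trailing slash, as listings commonly do.
        label = name + "/" if os.path.isdir(os.path.join(directory, name)) else name
        items.append(f'<li><a href="{quote(label)}">{html.escape(label)}</a></li>')
    title = html.escape("Index of " + url_path)
    return 200, f"<html><body><h1>{title}</h1><ul>{''.join(items)}</ul></body></html>"
```

The listing is built per request, which is why the article classifies a directory request without an index file as a dynamic request.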

'''URL path translation for a dynamic program request'''

For a ''dynamic request'' the URL path specified by the client should refer to an existing external program (usually an executable file run through a CGI interface) used by the web server to generate dynamic content.<ref name="ws-dynamic-rqs-root-dir">{{Cite web|url=https://httpd.apache.org/docs/2.4/howto/cgi.html|title=Dynamic Content with CGI|publisher=Apache: HTTPd server project|year=2021|access-date=2021-10-19|language=en|archive-date=15 November 2021|archive-url=https://web.archive.org/web/20211115181448/https://httpd.apache.org/docs/2.4/howto/cgi.html|url-status=live}}</ref>

Example of a ''dynamic request'' using a program file to generate output:
 <nowiki>http://www.example.com/cgi-bin/forum.php?action=view&orderby=thread&date=2021-10-15</nowiki>

The client's [[user agent]] connects to <code><nowiki>www.example.com</nowiki></code> and then sends the following [[HTTP]]/1.1 request:
 GET <nowiki>/cgi-bin/forum.php?action=view&orderby=thread&date=2021-10-15</nowiki> HTTP/1.1
 <nowiki>Host: www.example.com</nowiki>
 Connection: keep-alive

The result is the local file path of the program (in this example, a [[PHP]] program):
 <nowiki>/home/www/www.example.com/cgi-bin/forum.php</nowiki>

The web server executes that program, passing in the path-info and the [[query string]] <code>action=view&orderby=thread&date=2021-10-15</code> so that the program has the information it needs to run. (In this case, it will return an HTML document containing a view of forum entries ordered by thread from October 15, 2021.) In addition, the web server reads the data sent from the external program and resends that data to the client that made the request.
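
How the request target of the dynamic example is split into a program path, path-info and query string can be sketched as follows. The helper <code>build_cgi_variables</code> is hypothetical and assumes that the server has already determined which URL prefix names the program; the dictionary keys are standard CGI meta-variable names, but a real CGI environment contains many more entries.

```python
from urllib.parse import parse_qs, urlsplit


def build_cgi_variables(request_target: str, script_name: str) -> dict[str, str]:
    """Split a GET request target into a few CGI-style variables."""
    parts = urlsplit(request_target)
    # The script must be the whole path or a prefix followed by "/".
    if not (parts.path == script_name or parts.path.startswith(script_name + "/")):
        raise ValueError("request path does not refer to this script")
    return {
        "REQUEST_METHOD": "GET",
        "SCRIPT_NAME": script_name,
        # Whatever follows the script name in the path is the path-info.
        "PATH_INFO": parts.path[len(script_name):],
        # The query string is passed on undecoded; the program parses it itself.
        "QUERY_STRING": parts.query,
    }


env = build_cgi_variables(
    "/cgi-bin/forum.php?action=view&orderby=thread&date=2021-10-15",
    "/cgi-bin/forum.php",
)
# What the program might do with the query string it receives.
params = parse_qs(env["QUERY_STRING"])
print(env["SCRIPT_NAME"], params)
```

In this example the path-info is empty, because the URL path ends at the program name; only the query string carries parameters.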