Editing
Web server
(section)
==Technical overview== [[File:Client-server-model.svg|thumb|250x150px|right|PC clients connected to a web server via Internet]] The following technical overview should be considered only as an attempt to give a few very ''limited examples'' about ''some'' features that may be [[Implementation#Computer science|implemented]] in a web server and ''some'' of the tasks that it may perform in order to have a sufficiently wide scenario about the topic. A '''web server program''' plays the role of a server in a [[client–server model]] by implementing one or more versions of HTTP protocol, often including the HTTPS secure variant and other features and extensions that are considered useful for its planned usage. The complexity and the efficiency of a web server program may vary a lot depending on (e.g.):<ref name="web-server-technology" /> * [[#Common features|common features]] implemented; * [[#Common tasks|common tasks]] performed; * [[#Performances|performances]] and scalability level aimed as a goal; * [[#Software efficiency|software]] model and techniques adopted to achieve wished performance and scalability level; * target hardware and category of usage, e.g. embedded system, low-medium traffic web server, high traffic [[Internet]] web server. === Common features === Although web server programs differ in how they are implemented, most of them offer the following common features. These are '''basic features''' that most web servers usually have. * [[Static web page|Static content serving]]: to be able to serve static content (web files) to clients via HTTP protocol. * [[HTTP]]: support for one or more versions of HTTP protocol in order to send versions of HTTP responses compatible with versions of client HTTP requests, e.g. HTTP/1.0, HTTP/1.1 (eventually also with [[Encryption|encrypted]] connections [[HTTPS]]), plus, if available, [[HTTP/2]], [[HTTP/3]]. 
* [[Logging (software)|Logging]]: web servers usually can also log some information about client requests and server responses to [[Server log|log files]] for security and statistical purposes.

A few other more '''advanced''' and popular '''features''' (''only a very short selection'') are the following.
* [[Dynamic web page|Dynamic content serving]]: to be able to serve dynamic content (generated on the fly) to clients via HTTP protocol.
* [[Virtual hosting]]: to be able to serve many websites ([[domain name]]s) using only one [[IP address]].
* [[#URL authorization|Authorization]]: to be able to allow, forbid, or authorize access to portions of website paths (web resources).
* [[#Content cache|Content cache]]: to be able to cache static and/or dynamic content in order to speed up server responses.
* [[Large file support]]: to be able to serve files whose size is greater than 2 GB on a 32-bit [[Operating system|OS]].
* [[Bandwidth throttling]]: to limit the speed of content responses in order not to saturate the network and to be able to serve more clients.
* [[Rewrite engine]]: to map parts of [[clean URL]]s (found in client requests) to their real names.
* [[Custom error page]]s: support for customized HTTP error messages.
=== Common tasks === A web server program, when it is running, usually performs several general '''tasks''', (e.g.):<ref name="web-server-technology" /> * starts, optionally reads and applies settings found in its [[configuration file]](s) or elsewhere, optionally opens log file, starts listening to client connections / requests; * optionally tries to adapt its general behavior according to its settings and its current [[#Operating conditions|operating conditions]]; * manages '''client connection(s)''' (accepting new ones or closing the existing ones as required); * '''[[#Read request message|receives]]''' client requests (by reading HTTP messages): ** reads and verify each HTTP request message; ** usually performs [[#URL normalization|URL normalization]]; ** usually performs [[#URL mapping|URL mapping]] (which may default to URL path translation); ** usually performs [[#URL path translation to file system|URL path translation]] along with various security checks; * '''[[#Manage request message|executes]]''' or refuses requested HTTP method: ** optionally manages [[#URL authorization|URL authorization]]s; ** optionally manages [[#URL redirection|URL redirection]]s; ** optionally manages requests for '''[[#Serve static content|static resources]]''' (file contents): *** optionally manages [[#Directory index files|directory index files]]; *** optionally manages [[#Regular files|regular files]]; ** optionally manages requests for '''[[#Serve dynamic content|dynamic resources]]''': *** optionally manages [[#Directory listings|directory listings]]; *** optionally manages [[#Program or module processing|program or module processing]], checking the availability, the start and eventually the stop of the execution of external programs used to generate dynamic content; *** optionally manages the communications with external programs / internal modules used to generate dynamic content; * '''[[#Send response message|replies]]''' to client requests sending proper HTTP responses 
(e.g. requested resources or error messages), possibly verifying or adding [[HTTP headers]] to those sent by dynamic programs / modules;
* optionally '''logs''' (partially or totally) '''client requests and/or its responses''' to an external user log file or to a system log file by [[syslog]], usually using [[Common Log Format|common log format]];
* optionally '''logs process messages''' about '''detected anomalies or other notable events''' (e.g. in client requests or in its internal functioning) using syslog or some other system facilities; these log messages usually have a debug, warning, error, or alert level and can be filtered (not logged) depending on some settings, see also [[Syslog#Severity level|severity level]];
* optionally generates '''statistics''' about the web traffic managed and/or its performance;
* other custom tasks.

=== Read request message ===
Web server programs are able:<ref name="rfc7230-2.1">{{cite IETF |rfc=7230 |sectionname=Client/Server Messaging|section=2.1|title=RFC 7230, HTTP/1.1: Message Syntax and Routing|pages=7–8}}</ref> <ref name="rfc7230-3.4">{{cite IETF |rfc=7230 |sectionname=Handling Incomplete Messages|section=3.4|title=RFC 7230, HTTP/1.1: Message Syntax and Routing|page=34}}</ref> <ref name="rfc7230-3.5">{{cite IETF |rfc=7230 |sectionname=Message Parsing Robustness|section=3.5|title=RFC 7230, HTTP/1.1: Message Syntax and Routing|pages=34–35}}</ref>
* to read an HTTP request message;
* to interpret it;
* to verify its syntax;
* to identify known [[HTTP headers]] and to extract their values from them.

Once an HTTP request message has been decoded and verified, its values can be used to determine whether that request can be satisfied or not. This requires many other steps, including '''[[Computer security|security checks]]'''.
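
The reading and verification steps listed above can be illustrated with a minimal Python sketch. It is not how any particular web server is implemented; the function name <code>parse_request</code> and the choice of checks are assumptions made for illustration, and only the request head (request line and header fields) is handled.

```python
def parse_request(raw: bytes):
    """Parse the head of an HTTP/1.x request into (method, target, version, headers).

    Illustrative only: a real server also handles bodies, size limits,
    repeated header fields and many more syntax rules.
    """
    head, _, _body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Request line: METHOD SP request-target SP HTTP-version
    parts = lines[0].split(" ")
    if len(parts) != 3:
        raise ValueError("malformed request line")
    method, target, version = parts
    if not version.startswith("HTTP/"):
        raise ValueError("unsupported protocol version")

    # Header fields: "Name: value", with no whitespace around the field name
    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if not sep or not name or name != name.strip():
            raise ValueError("malformed header line")
        headers[name.lower()] = value.strip()

    # HTTP/1.1 requests must carry a Host header field
    if version == "HTTP/1.1" and "host" not in headers:
        raise ValueError("HTTP/1.1 request without Host header")
    return method, target, version, headers
```

A caller would typically turn a <code>ValueError</code> raised here into a client error response (4xx) instead of processing the request any further.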
==== URL normalization ==== {{Main|URL normalization}} Web server programs usually perform some type of [[URL normalization]] ([[Uniform Resource Locator|URL]] found in most HTTP request messages) in order to: * make resource path always a clean uniform path from root directory of website; * lower security risks (e.g. by intercepting more easily attempts to access static resources outside the root directory of the website or to access to portions of path below website root directory that are forbidden or which require authorization); * make path of web resources more recognizable by human beings and [[Web log analysis software|web log analysis programs]] (also known as log analyzers / statistical applications). The term ''URL normalization'' refers to the process of modifying and standardizing a URL in a consistent manner. There are several types of normalization that may be performed, including the conversion of the scheme and host to lowercase. Among the most important normalizations are the removal of "." and ".." path segments and adding trailing slashes to a non-empty path component. ==== URL mapping ==== {{Update section|date=June 2023}}{{Main|URL mapping}}<blockquote>"URL mapping is the process by which a URL is analyzed to figure out what resource it is referring to, so that that resource can be returned to the requesting client. This process is performed with every request that is made to a web server, with some of the requests being served with a file, such as an HTML document, or a gif image, others with the results of running a CGI program, and others by some other process, such as a built-in module handler, a PHP document, or a Java servlet."<ref name="ws-url-mapping">{{Cite web|url=http://people.apache.org/~jim/ApacheCons/ApacheCon2002/pdf/Bowen-urlmap-ACUS02/bowen-urlmap-ACUS02.pdf|title=URL Mapping|author=R. 
Bowen|publisher=Apache software foundation|date=2002-09-29|access-date=2021-11-15|language=en|archive-date=15 November 2021|archive-url=https://web.archive.org/web/20211115181448/http://people.apache.org/~jim/ApacheCons/ApacheCon2002/pdf/Bowen-urlmap-ACUS02/bowen-urlmap-ACUS02.pdf|url-status=live}}</ref>{{Update inline|date=June 2023}}</blockquote>In practice, web server programs that implement advanced features, beyond the simple ''static content serving'' (e.g. URL rewrite engine, dynamic content serving), usually have to figure out how that URL has to be handled, e.g. as a: * [[#URL redirection|URL redirection]], a redirection to another URL; * ''static request'' of [[Computer file|file]] content; * ''dynamic request'' of: ** [[Directory (computing)|directory]] listing of files or other sub-directories contained in that directory; ** other types of dynamic request in order to identify the program / module processor able to handle that kind of URL path and to pass to it other [[URL parts]], i.e. usually path-info and [[query string]] variables. One or more configuration files of web server may specify the mapping of parts of '''URL path''' (e.g. 
initial parts of [[Path (computing)|file path]], [[filename extension]] and other path components) to a specific URL handler (file, directory, external program or internal module).<ref name="ws-static-rqs-root-dir">{{Cite web|url=https://httpd.apache.org/docs/2.4/urlmapping.html|title=Mapping URLs to Filesystem Locations|publisher=Apache: HTTPd server project|year=2021|access-date=2021-10-19|language=en|archive-date=20 October 2021|archive-url=https://web.archive.org/web/20211020053640/http://httpd.apache.org/docs/2.4/urlmapping.html|url-status=live}}</ref>

When a web server implements one or more of the above-mentioned advanced features, the path part of a valid URL may not always match an existing file system path under the website directory tree (a file or a directory in the [[file system]]) because it can refer to a virtual name of an internal or external module processor for dynamic requests.

==== URL path translation to file system ====
Web server programs are able to translate a URL path (all or part of it) that refers to a physical file system path to an [[Path (computing)|absolute path]] under the target website's root directory.<ref name="ws-static-rqs-root-dir" />

The website's root directory may be specified by a configuration file or by some internal rule of the web server by using the name of the website, which is the [[URL#Syntax|host]] part of the URL found in the HTTP client request.<ref name="ws-static-rqs-root-dir" />

Path translation to file system is done for the following types of web resources:
* a local, usually non-executable, file (static request for file content);
* a local directory (dynamic request: directory listing generated on the fly);
* a program name (dynamic request that is executed using the CGI or SCGI interface and whose output is read by the web server and resent to the client that made the HTTP request).

The web server takes the path found in the requested URL (HTTP request message) and appends it to the path of the (Host) website root directory.
On an [[Apache HTTP Server|Apache server]], this is commonly <code>/home/www/website</code> (on [[Unix]] machines, usually it is: <code>/var/www/website</code>). The following examples show possible results.

'''URL path translation for a static file request'''

Example of a ''static request'' of an existing file specified by the following URL:
 <nowiki>http://www.example.com/path/file.html</nowiki>

The client's [[user agent]] connects to <code><nowiki>www.example.com</nowiki></code> and then sends the following [[HTTP]]/1.1 request:
 GET <nowiki>/path/file.html</nowiki> HTTP/1.1
 <nowiki>Host: www.example.com</nowiki>
 Connection: keep-alive

The result is the local file system resource:
 <nowiki>/home/www/www.example.com/path/file.html</nowiki>

The web server then reads the [[Computer file|file]], if it exists, and sends a response to the client's web browser. The response describes the content of the file and contains the file itself; otherwise, an error message is returned saying that the file does not exist or that access to it is forbidden.
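
The translation of a URL path into a path under the website's root directory, together with the removal of "." and ".." segments mentioned under URL normalization, can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm of any specific server: <code>translate_path</code> is a hypothetical helper and the root directory is the one used in the example above.

```python
import posixpath
from urllib.parse import unquote


def translate_path(target: str, site_root: str) -> str:
    """Map a request target to a path under site_root (illustrative sketch)."""
    # Drop the query string; only the path part is translated.
    path = unquote(target.partition("?")[0])
    if "\x00" in path:
        raise ValueError("NUL byte in URL path")

    # Remove "." and ".." segments. Anchoring the path at "/" means that
    # ".." can never climb above the website root.
    normalized = posixpath.normpath("/" + path.lstrip("/"))

    # normpath() drops a trailing slash; keep it, since it marks a directory.
    if path.endswith("/") and normalized != "/":
        normalized += "/"
    return site_root.rstrip("/") + normalized
```

For instance, <code>translate_path("/path/file.html", "/home/www/www.example.com")</code> yields the file system path shown in the example, while a target such as <code>/../../etc/passwd</code> stays confined below the root directory. Real servers add further checks (symbolic links, file attributes, access rules) after this step.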
'''URL path translation for a directory request (without a static index file)'''

Example of an implicit ''dynamic request'' of an existing directory specified by the following URL:
 <nowiki>http://www.example.com/directory1/directory2/</nowiki>

The client's [[user agent]] connects to <code><nowiki>www.example.com</nowiki></code> and then sends the following [[HTTP]]/1.1 request:
 GET <nowiki>/directory1/directory2/</nowiki> HTTP/1.1
 <nowiki>Host: www.example.com</nowiki>
 Connection: keep-alive

The result is the local directory path:
 <nowiki>/home/www/www.example.com/directory1/directory2/</nowiki>

The web server then verifies the existence of the [[Directory (computing)|directory]]. If it exists and can be accessed, the server tries to find an index file (which in this case does not exist), so it passes the request to an internal module or a program dedicated to directory listings, and finally reads the data output and sends a response to the client's web browser. The response describes the content of the directory (list of contained subdirectories and files); otherwise, an error message is returned saying that the directory does not exist or that access to it is forbidden.
'''URL path translation for a dynamic program request'''

For a ''dynamic request'' the URL path specified by the client should refer to an existing external program (usually an executable file with a CGI) used by the web server to generate dynamic content.<ref name="ws-dynamic-rqs-root-dir">{{Cite web|url=https://httpd.apache.org/docs/2.4/howto/cgi.html|title=Dynamic Content with CGI|publisher=Apache: HTTPd server project|year=2021|access-date=2021-10-19|language=en|archive-date=15 November 2021|archive-url=https://web.archive.org/web/20211115181448/https://httpd.apache.org/docs/2.4/howto/cgi.html|url-status=live}}</ref>

Example of a ''dynamic request'' using a program file to generate output:
 <nowiki>http://www.example.com/cgi-bin/forum.php?action=view&orderby=thread&date=2021-10-15</nowiki>

The client's [[user agent]] connects to <code><nowiki>www.example.com</nowiki></code> and then sends the following [[HTTP]]/1.1 request:
 GET <nowiki>/cgi-bin/forum.php?action=view&orderby=thread&date=2021-10-15</nowiki> HTTP/1.1
 <nowiki>Host: www.example.com</nowiki>
 Connection: keep-alive

The result is the local file path of the program (in this example, a [[PHP]] program):
 <nowiki>/home/www/www.example.com/cgi-bin/forum.php</nowiki>

The web server executes that program, passing in the path-info and the [[query string]] <code>action=view&orderby=thread&date=2021-10-15</code> so that the program has the information it needs to run. (In this case, it will return an HTML document containing a view of forum entries ordered by thread from October 15, 2021.) In addition, the web server reads the data sent by the external program and resends that data to the client that made the request.

=== Manage request message ===
Once a request has been read, interpreted, and verified, it has to be managed depending on its method, its URL, and its parameters, which may include values of HTTP headers.
In practice, the web server has to handle the request by using one of these response paths:<ref name="ws-static-rqs-root-dir" />
* if something in the request was not acceptable (in the request line or message headers), the web server has already sent an error response;
* if the request has a method (e.g. <code>OPTIONS</code>) that can be satisfied by general code of the web server, then a successful response is sent;
* if the URL requires authorization, then an [[#URL authorization|authorization error message]] is sent;
* if the URL maps to a redirection, then a [[#URL redirection|redirect message]] is sent;
* if the URL maps to a [[#Serve dynamic content|dynamic resource]] (a virtual path or a directory listing), then its handler (an internal module or an external program) is called and the request parameters (query string and path info) are passed to it in order to allow it to reply to that request;
* if the URL maps to a [[#Serve static content|static resource]] (usually a file on the file system), then the internal static handler is called to send that file;
* if the request method is not known or if there is some other unacceptable condition (e.g. resource not found, internal server error, etc.), then an [[#Error message|error response]] is sent.
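
The order of the decisions listed above can be sketched as a single dispatch function. Everything in this sketch is hypothetical: the function name, the parameters describing the configuration (protected prefixes, redirect table, dynamic prefixes, known static files) and the specific status codes chosen are assumptions for illustration only.

```python
KNOWN_METHODS = {"OPTIONS", "HEAD", "GET", "POST"}


def choose_response(method, path, *, protected=(), redirects=None,
                    dynamic_prefixes=(), static_files=()):
    """Return (status_code, handler_or_location) for a verified request.

    Hypothetical sketch: the configuration is passed in as plain Python
    containers instead of being read from configuration files.
    """
    redirects = redirects or {}
    # An unknown method is checked first here so that it can never reach
    # the static or dynamic handlers.
    if method not in KNOWN_METHODS:
        return 501, "error"
    # Methods that the general code of the server can answer by itself.
    if method == "OPTIONS":
        return 204, "general"
    # URL authorization (here: every protected prefix needs credentials).
    if any(path.startswith(prefix) for prefix in protected):
        return 401, "authorization"
    # URL redirection.
    if path in redirects:
        return 301, redirects[path]
    # Dynamic resources are handed to a module or an external program.
    if any(path.startswith(prefix) for prefix in dynamic_prefixes):
        return 200, "dynamic"
    # Static resources are sent by the internal static handler.
    if path in static_files:
        return 200, "static"
    # Anything else is an error (here: resource not found).
    return 404, "error"
```

A real server interleaves these decisions with URL mapping and with the credentials found in the request headers; the sketch only shows the precedence among the response paths.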
==== Serve static content ====
[[File:Web server serving static content.png|thumb|221x144px|right|PC clients communicating via network with a web server serving static content only]]
If a web server program is capable of '''serving static content''' and it has been configured to do so, then it is able to send file content whenever a request message has a valid URL path matching (after URL mapping, URL translation and URL redirection) that of an existing file under the root directory of a website and the file has attributes which match those required by the internal rules of the web server program.<ref name="ws-static-rqs-root-dir" />

That kind of content is called ''static'' because usually it is not changed by the web server when it is sent to clients and because it remains the same until it is modified (file modification) by some program.

NOTE: when serving '''static content only''', a web server program usually '''does not change file contents''' of served websites (as they are only read and never written) and so it suffices to support only these [[HTTP method]]s:
* <code>OPTIONS</code>
* <code>HEAD</code>
* <code>GET</code>

Responses with static file content can be sped up by a '''[[#File cache|file cache]]'''.

===== Directory index files =====
{{Main|Web server directory index}}
If a web server program receives a client request message with a URL whose path matches that of an existing ''directory'', that directory is accessible, and serving directory index file(s) is enabled, then the web server program may try to serve the first of the known (or configured) static index file names (a [[#Regular files|regular file]]) found in that directory; if no index file is found or other conditions are not met, then an error message is returned.

The most used names for static index files are: <code>index.html</code>, <code>index.htm</code> and <code>Default.htm</code>.
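
The search for the first available index file can be sketched in a few lines of Python. The helper name <code>find_index_file</code> is hypothetical, and the default list of names simply repeats the ones mentioned above; actual servers take this list from their configuration.

```python
import os

# Index file names tried in order; taken from the text above.
INDEX_NAMES = ("index.html", "index.htm", "Default.htm")


def find_index_file(directory, names=INDEX_NAMES):
    """Return the path of the first existing index file, or None."""
    for name in names:
        candidate = os.path.join(directory, name)
        # Only regular files qualify as static index files.
        if os.path.isfile(candidate):
            return candidate
    return None
```

When the function returns <code>None</code>, a server would either generate a directory listing, if that feature is enabled, or return an error message.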
===== Regular files ===== If a web server program receives a client request message with an URL whose path matches the file name of an existing ''file'' and that file is accessible by web server program and its attributes match internal rules of web server program, then web server program can send that file to client. Usually, for security reasons, most web server programs are pre-configured to serve only [[regular file]]s or to avoid to use ''special file types'' like [[device file]]s, along with [[symbolic link]]s or [[hard link]]s to them. The aim is to avoid undesirable side effects when serving static web resources.<ref name="web-server-http">{{Cite book|url=https://books.google.com/books?id=oxg8_i9dVakC&pg=PA38|title=HTTP developer's handbook|author=Chris Shiflett|year=2003|access-date=2021-12-09|language=en|publisher=Sams's publishing|isbn=0-672-32454-7|archive-date=20 January 2023|archive-url=https://web.archive.org/web/20230120185219/https://www.google.it/books/edition/HTTP_Developer_s_Handbook/oxg8_i9dVakC?hl=en&gbpv=1&pg=PA38&printsec=frontcover|url-status=live}}</ref> ==== Serve dynamic content ==== [[File:Web server serving static and dynamic content.png|thumb|221x144px|right|PC clients communicating via network with a web server serving static and dynamic content]] If a web server program is capable of '''serving dynamic content''' and it has been configured to do so, then it is able to communicate with the proper internal module or external program (associated with the requested URL path) in order to pass to it the parameters of the client request. 
After that, the web server program reads from it its data response (that it has generated, often on the fly) and then it resends it to the client program who made the request.{{citation needed|date=November 2021}} NOTE: when serving '''static and dynamic content''', a web server program usually has to support also the following HTTP method in order to be able to safely '''receive data''' from client(s) and so to be able to host also websites with interactive form(s) that may send large data sets (e.g. lots of [[data entry]] or [[file upload]]s) to web server / external programs / modules: * <code>POST</code> In order to be able to communicate with its internal modules and/or external programs, a web server program must have implemented one or more of the many available '''gateway interface(s)''' (see also [[#StandardCGIs|Web Server Gateway Interfaces used for dynamic content]]). The three '''standard''' and historical '''gateway interfaces''' are the following ones. ; [[Common Gateway Interface|CGI]] : An external CGI program is run by web server program for each dynamic request, then web server program reads from it the generated data response and then resends it to client. ; [[Simple Common Gateway Interface|SCGI]] : An external SCGI program (it usually is a process) is started once by web server program or by some other program / process and then it waits for network connections; every time there is a new request for it, web server program makes a new network connection to it in order to send request parameters and to read its data response, then network connection is closed. ; [[FastCGI]] : An external FastCGI program (it usually is a process) is started once by web server program or by some other program / process and then it waits for a network connection which is established permanently by web server; through that connection are sent the request parameters and read data responses. 
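
For the CGI case, the request parameters are handed to the external program mainly as environment variables (meta-variables such as <code>REQUEST_METHOD</code>, <code>SCRIPT_NAME</code>, <code>PATH_INFO</code> and <code>QUERY_STRING</code>). The following Python sketch shows how such an environment might be assembled; the function name <code>build_cgi_environ</code> is hypothetical, only a few meta-variables are covered, and starting the program itself is left out.

```python
def build_cgi_environ(method, target, headers, script_name):
    """Build a partial CGI environment for a request (illustrative sketch).

    script_name is the URL path that was mapped to the program; whatever
    follows it in the URL path becomes the path-info.
    """
    path, _, query = target.partition("?")
    if not path.startswith(script_name):
        raise ValueError("target does not refer to this program")
    env = {
        "GATEWAY_INTERFACE": "CGI/1.1",
        "SERVER_PROTOCOL": "HTTP/1.1",
        "REQUEST_METHOD": method,
        "SCRIPT_NAME": script_name,
        "PATH_INFO": path[len(script_name):],
        "QUERY_STRING": query,
    }
    for name, value in headers.items():
        key = name.upper().replace("-", "_")
        if key in ("CONTENT_TYPE", "CONTENT_LENGTH"):
            env[key] = value            # passed without the HTTP_ prefix
        else:
            env["HTTP_" + key] = value  # other request header fields
    return env
```

With the forum example used earlier, <code>QUERY_STRING</code> would be <code>action=view&orderby=thread&date=2021-10-15</code> and <code>PATH_INFO</code> would be empty. A server would then start the program with this environment, write any request body to its standard input and read the generated response from its standard output.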
===== Directory listings ===== [[File:Web server directory list.png|thumb|271x161px|right|Directory listing dynamically generated by a web server]] {{Main|Web server directory index}} A web server program may be capable to manage the dynamic generation (on the fly) of a '''[[Web server directory index|directory index list]]''' of files and sub-directories.<ref name="ws-directory-listings">{{Cite web|url=https://cwiki.apache.org/confluence/display/HTTPD/DirectoryListings|title=Directory listings|author=ASF Infrabot|publisher=Apache foundation: HTTPd server project|date=2019-05-22|access-date=2021-11-16|language=en|archive-date=7 June 2019|archive-url=https://web.archive.org/web/20190607234544/https://cwiki.apache.org/confluence/display/HTTPD/DirectoryListings|url-status=live}}</ref> If a web server program is configured to do so and a requested URL path matches an existing directory and its access is allowed and no static index file is found under that directory then a web page (usually in HTML format), containing the list of files and/or subdirectories of above mentioned directory, is ''dynamically generated'' (on the fly). If it cannot be generated an error is returned. Some web server programs allow the customization of directory listings by allowing the usage of a web page template (an HTML document containing placeholders, e.g. <code>$(FILE_NAME), $(FILE_SIZE)</code>, etc., that are replaced with the field values of each file entry found in directory by web server), e.g. <code>index.tpl</code> or the usage of HTML and embedded source code that is interpreted and executed on the fly, e.g. <code>index.asp</code>, and / or by supporting the usage of dynamic index programs such as CGIs, SCGIs, FCGIs, e.g. <code>index.cgi</code>, <code>index.php</code>, <code>index.fcgi</code>. 
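
The dynamic generation of a directory listing described above can be sketched as follows. This is only an illustration: the function name <code>render_directory_listing</code> and the very plain HTML layout are assumptions, and no template placeholders or file sizes are handled.

```python
import html
import os
from urllib.parse import quote


def render_directory_listing(directory, url_path):
    """Return an HTML page listing the entries of a directory (sketch)."""
    rows = []
    for name in sorted(os.listdir(directory)):
        full_path = os.path.join(directory, name)
        # Sub-directories are shown with a trailing slash.
        display = name + "/" if os.path.isdir(full_path) else name
        rows.append('<li><a href="%s">%s</a></li>'
                    % (quote(display), html.escape(display)))
    title = html.escape("Index of " + url_path)
    return ("<html><head><title>%s</title></head><body><h1>%s</h1><ul>\n%s\n"
            "</ul></body></html>" % (title, title, "\n".join(rows)))
```

Entry names are percent-encoded in the links and HTML-escaped in the visible text, because file names are data that the server does not control.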
Usage of dynamically generated ''directory listings'' is usually avoided or limited to a few selected directories of a website because that generation takes much more OS resources than sending a static index page. The main usage of ''directory listings'' is to allow the download of files (usually when their names, sizes, modification date-times or [[file attribute]]s may change randomly / frequently) ''as they are, without requiring to provide further information to requesting user''.<ref name="ws-apache-dir">{{Cite web|url=https://archive.apache.org/dist/httpd/|title=Apache: directory listing to download files|author=|publisher=Apache: HTTPd server|access-date=2021-12-16|archive-date=2 December 2021|archive-url=https://web.archive.org/web/20211202004258/http://archive.apache.org/dist/httpd/|url-status=live}}</ref> ===== Program or module processing ===== An external program or an internal module (''processing unit'') can execute some sort of application function that may be used to get data from or to store data to one or more [[Content repository|data repositories]], e.g.:{{citation needed|date=November 2021}} * files (file system); * [[database]]s (DBs); * other sources located in local computer or in other computers. A ''processing unit'' can return any kind of web content, also by using data retrieved from a data repository, e.g.:{{citation needed|date=November 2021}} * a document (e.g. [[HTML]], [[XML]], etc.); * an image; * a video; * structured data, e.g. that may be used to update one or more values displayed by a dynamic page ([[DHTML]]) of a [[web interface]] and that maybe was requested by an [[XMLHttpRequest]] [[API]] (see also: [[Dynamic web page|dynamic page]]). In practice whenever there is content that may vary, depending on one or more parameters contained in client request or in configuration settings, then, usually, it is generated dynamically. 
=== Send response message ===
Web server programs are able to send response messages as replies to client request messages.<ref name="rfc7230-2.1" />

An error response message may be sent because a request message could not be successfully read, decoded, analyzed, or executed.<ref name="rfc7230-3.4" />

NOTE: the following sections are given only as examples to help readers understand what a web server, more or less, does; they are by no means exhaustive or complete.

==== Error message ====
A web server program may reply to a client request message with many kinds of error messages; these errors are divided mainly into two categories:
* [[HTTP client errors]], due to the type of request message or to the availability of the requested web resource;<ref name="rfc7231-6.5">{{cite IETF |rfc=7231 |sectionname=Client Error 4xx|section=6.5|title=RFC 7231, HTTP/1.1: Semantics and Content|page=58}}</ref>
* [[HTTP server errors]], due to internal server errors.<ref name="rfc7231-6.6">{{cite IETF |rfc=7231 |sectionname=Server Error 5xx|section=6.6|title=RFC 7231, HTTP/1.1: Semantics and Content|pages=62-63}}</ref>

When a client browser receives an error response and it relates to the main user request (e.g. a URL of a web resource such as a web page), the error message is usually shown in some browser window or message.

==== URL authorization ====
A web server program may be able to verify whether the requested URL path:<ref name="rfc7235-1">{{cite IETF |rfc=7235 |sectionname=Introduction|section=1|title=RFC 7235, HTTP/1.1: Authentication|page=3}}</ref>
* can be freely accessed by everybody;
* requires user authentication (request of user credentials, e.g. [[user name]] and [[password]]);
* is forbidden to some or all kinds of users.
If the authorization / access rights feature has been implemented and enabled and access to the web resource is not granted, then, depending on the required access rights, a web server program:
* can deny access by sending a specific error message (e.g. access [[List of HTTP status codes#403|forbidden]]);
* may deny access by sending a specific error message (e.g. access [[List of HTTP status codes#401|unauthorized]]) that usually forces the client browser to ask the human user to provide the required user credentials; if authentication credentials are provided, then the web server program verifies and accepts or rejects them.

==== URL redirection ====
{{Main|URL redirection}}
A web server program ''may'' have the capability of doing URL redirections to new URLs (new locations), which consists of replying to a client request message with a response message containing a new URL suited to access a valid or an existing web resource (the client should redo the request with the new URL).<ref name="rfc7231-6.4">{{cite IETF |rfc=7231 |sectionname=Response Status Codes: Redirection 3xx |section=6.4|title=RFC 7231, HTTP/1.1: Semantics and Content|pages=53–54}}</ref>

URL redirection of location is used:<ref name="rfc7231-6.4" />
* to fix a directory name by adding a final slash '/';<ref name="ws-directory-listings" />
* to replace a URL path that no longer exists with a new path where that kind of web resource can be found;
* to give a new URL on another domain when the current domain has too much load.

Example 1: a URL path points to a '''directory''' name but it does not have a final slash '/' so the web server sends a redirect to the client in order to instruct it to redo the request with the fixed path name.<ref name="ws-directory-listings" />

From:<br />
<code>/directory1/directory2</code><br />
To:<br />
<code>/directory1/directory2/</code>

Example 2: a whole set of documents has been '''moved inside website''' in order to reorganize their file system paths.
From:<br />
<code>/directory1/directory2/2021-10-08/</code><br />
To:<br />
<code>/directory1/directory2/2021/10/08/</code>

Example 3: a whole set of documents has been '''moved to a new website''' and now it is mandatory to use secure HTTPS connections to access them.

From:<br />
<code><nowiki>http://www.example.com/directory1/directory2/2021-10-08/</nowiki></code><br />
To:<br />
<code><nowiki>https://docs.example.com/directory1/2021-10-08/</nowiki></code>

The above examples are only a few of the possible kinds of redirection.

==== Successful message ====
A web server program is able to reply to a valid client request message with a successful message, optionally containing the requested '''web resource data'''.<ref name="rfc7231-6.3">{{cite IETF |rfc=7231 |sectionname=Successful 2xx|section=6.3|title=RFC 7231, HTTP/1.1: Semantics and Content|pages=51-54}}</ref>

If web resource data is sent back to the client, then it can be '''static content''' or '''dynamic content''' depending on how it has been retrieved (from a file or from the output of some program / module).
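
The response messages discussed in this section (successful, redirection, error) share the same basic layout: a status line, header fields, an empty line and an optional body. The Python sketch below assembles such a message and uses it for the directory redirect of Example 1 in the URL redirection section; the helper names <code>build_response</code> and <code>redirect_to_directory</code> are hypothetical, and the choice of status code 301 for that redirect is an assumption.

```python
from http import HTTPStatus


def build_response(status, headers=None, body=b""):
    """Serialize an HTTP/1.1 response message (illustrative sketch)."""
    status = HTTPStatus(status)
    lines = ["HTTP/1.1 %d %s" % (status.value, status.phrase)]
    fields = dict(headers or {})
    # Content-Length tells the client where the body ends.
    fields["Content-Length"] = str(len(body))
    lines += ["%s: %s" % item for item in fields.items()]
    return "\r\n".join(lines).encode("iso-8859-1") + b"\r\n\r\n" + body


def redirect_to_directory(path):
    """Redirect a directory path that lacks its final slash."""
    return build_response(301, {"Location": path + "/"})
```

For example, <code>redirect_to_directory("/directory1/directory2")</code> produces a response whose <code>Location</code> header field is <code>/directory1/directory2/</code>, and <code>build_response(200, {"Content-Type": "text/html"}, body)</code> produces a successful message carrying the given body.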
=== Content cache === In order to speed up web server responses by lowering average HTTP response times and hardware resources used, many popular web servers implement one or more content [[Cache (computing)|cache]]s, each one specialized in a content category.<ref name="ws-content-cache-apache">{{Cite web|url=https://httpd.apache.org/docs/2.4/caching.html|title=Caching Guide|publisher=Apache: HTTPd server project|year=2021|access-date=2021-12-09|language=en|archive-date=9 December 2021|archive-url=https://web.archive.org/web/20211209211243/https://httpd.apache.org/docs/2.4/caching.html|url-status=live}}</ref> <ref name="ws-content-cache-nginx">{{Cite web|url=https://docs.nginx.com/nginx/admin-guide/content-cache/content-caching/|title=NGINX Content Caching|publisher=F5 NGINX|year=2021|access-date=2021-12-09|language=en|archive-date=9 December 2021|archive-url=https://web.archive.org/web/20211209211246/https://docs.nginx.com/nginx/admin-guide/content-cache/content-caching/|url-status=live}}</ref> Content is usually cached by its origin, e.g.: * static content: ** [[#File cache|file cache]]; * dynamic content: ** [[#Dynamic cache|dynamic cache]] (module / program output). ==== File cache ==== Historically, static contents found in [[Computer file|file]]s which had to be accessed frequently, randomly and quickly, have been stored mostly on electro-mechanical [[Hard disk drive|disk]]s since mid-late 1960s / 1970s; regrettably reads from and writes to those kind of [[Peripheral|device]]s have always been considered very slow operations when compared to [[RAM]] speed and so, since early [[Operative system|OS]]s, first disk caches and then also [[Operating system|OS]] file [[Cache (computing)|cache]] sub-systems were developed to speed up [[Input/output|I/O]] operations of frequently accessed data / files. 
Even with the aid of an OS file cache, the relative / occasional slowness of I/O operations involving directories and files stored on disks soon became a [[Bottleneck (software)|bottleneck]] in the increase of [[#Performances|performances]] expected from top level web servers, especially from the mid-to-late 1990s, when web traffic started to grow exponentially along with the constant increase in the speed of Internet / network lines.

The problem of how to further speed up the serving of static files, thus increasing the maximum number of requests/responses per second ([[#requests per second|RPS]]), has been studied since the mid-1990s, with the aim of proposing useful cache models that could be implemented in web server programs.<ref name="ws-file-cache-mmc">{{Cite web|url=https://www.ra.ethz.ch/cdstore/www5/www218/overview.htm|title=Main Memory Caching of Web Documents|publisher=Computer networks and ISDN Systems|author=Evangelos P. Markatos|year=1996|access-date=2021-12-09|language=en|archive-date=20 January 2023|archive-url=https://web.archive.org/web/20230120185224/https://www.ra.ethz.ch/cdstore/www5/www218/overview.htm|url-status=live}}</ref>

In practice, many popular / high performance web server programs now include their own ''[[Userland (computing)|userland]]'' '''file cache''', tailored to web server usage and with its own specific implementation and parameters.<ref name="ws-file-cache-iplanet">{{Cite web|url=https://docs.oracle.com/cd/E19146-01/821-1827/gaidp/index.html|title=IPlanet Web Server 7.0.9: file-cache|publisher=Oracle|author=|year=2010|access-date=2021-12-09|language=en|archive-date=9 December 2021|archive-url=https://web.archive.org/web/20211209175035/https://docs.oracle.com/cd/E19146-01/821-1827/gaidp/index.html|url-status=live}}</ref> <ref name="ws-file-cache-apache">{{Cite web|url=https://httpd.apache.org/docs/2.4/mod/mod_file_cache.html|title=Apache Module mod_file_cache|publisher=Apache: HTTPd server project|year=2021|access-date=2021-12-09|language=en|archive-date=9 December 2021|archive-url=https://web.archive.org/web/20211209194811/https://httpd.apache.org/docs/2.4/mod/mod_file_cache.html|url-status=live}}</ref> <ref name="ws-file-cache-servez">{{Cite web|url=https://www.gnu.org/software/serveez/manual/html_node/HTTP-Server.html|title=HTTP server: configuration: file cache|publisher=GNU|year=2021|access-date=2021-12-09|language=en|archive-date=9 December 2021|archive-url=https://web.archive.org/web/20211209173634/https://www.gnu.org/software/serveez/manual/html_node/HTTP-Server.html|url-status=live}}</ref>

The widespread adoption of [[Redundant array of independent disks|RAID]] and/or fast [[solid-state drive]]s (storage hardware with very high I/O speed) has slightly reduced, but not eliminated, the advantage of having a file cache incorporated in a web server.

==== Dynamic cache ====
Dynamic content, output by an internal module or an external program, may not change very frequently (for a given unique URL with keys / parameters), and so the resulting output can be cached for a while (e.g. from 1 second to several hours or more) in RAM or even on a fast [[Disk storage|disk]].<ref name="ws-disk-cache-apache">{{Cite web|url=https://httpd.apache.org/docs/2.4/mod/mod_cache_disk.html|title=Apache Module mod_cache_disk|publisher=Apache: HTTPd server project|year=2021|access-date=2021-12-09|language=en|archive-date=9 December 2021|archive-url=https://web.archive.org/web/20211209211241/https://httpd.apache.org/docs/2.4/mod/mod_cache_disk.html|url-status=live}}</ref>

The typical usage of a dynamic cache is when a website has [[dynamic web page]]s about news, weather, images, maps, etc. that do not change frequently (e.g.
every ''n'' minutes) and that are accessed by a huge number of clients per minute / hour; in those cases it is useful to return cached content (without calling the internal module or the external program), because clients often do not have an up-to-date copy of the requested content in their browser caches.<ref name="ws-dynamic-cache-edu">{{Cite web|url=https://www.educative.io/edpresso/what-is-dynamic-cache|title=What is dynamic cache?|publisher=Educative|author=|year=2021|access-date=2021-12-09|language=en|archive-date=9 December 2021|archive-url=https://web.archive.org/web/20211209234355/https://www.educative.io/edpresso/what-is-dynamic-cache|url-status=live}}</ref>

However, in most cases these kinds of caches are implemented by external servers (e.g. [[reverse proxy]]) or by storing dynamic data output on separate computers, managed by specific applications (e.g. [[memcached]]), so as not to compete for hardware resources (CPU, RAM, disks) with the web server(s).<ref name="ws-dynamic-cache-tut">{{Cite web|url=https://www.siteground.com/tutorials/supercacher/dynamic-cache/|title=Dynamic Cache Option Tutorial|publisher=Siteground|author=|year=2021|access-date=2021-12-09|language=en|archive-date=20 January 2023|archive-url=https://web.archive.org/web/20230120185251/https://www.siteground.com/tutorials/supercacher/dynamic-cache/|url-status=live}}</ref> <ref name="ws-dynamic-cache-std">{{Cite web|url=https://www.researchgate.net/publication/2585583|title=Improving Web Server Performance by Caching Dynamic Data|publisher=Usenix|author1=Arun Iyengar|author2=Jim Challenger|year=2000|access-date=2021-12-09|language=en}}</ref>

=== Kernel-mode and user-mode web servers ===
Web server software can either be incorporated into the [[Operating system|OS]] and executed in [[kernel (operating system)|kernel]] space, or be executed in [[user space]] (like other regular applications).
Web servers that run in [[Supervisor mode|kernel mode]] (usually called [[In-kernel web server|kernel space web servers]]) can have direct access to kernel resources and so they can be, in theory, faster than those running in user mode. However, running a web server in kernel mode has disadvantages, e.g. the software is harder to develop and [[debugging|debug]], and [[Runtime (program lifecycle phase)|run-time]] [[critical error]]s may lead to serious problems in the OS kernel.

Web servers that run in [[user-mode]] have to ask the system for permission to use more memory or more [[Central processing unit|CPU]] resources. Not only do these requests to the kernel take time, but they might not always be satisfied because the system reserves resources for its own usage and has the responsibility to share hardware resources with all the other running applications. Executing in user mode can also mean using more buffer/data copies (between user-space and kernel-space), which can lead to a decrease in the performance of a user-mode web server.

Nowadays almost all web server software is executed in user mode (because many of the aforementioned small disadvantages have been overcome by faster hardware, new OS versions, much faster OS [[system calls]] and new optimized web server software). See also [[comparison of web server software]] to discover which of them run in kernel mode or in user mode (also referred to as kernel space or user space).
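The extra buffer copies mentioned above can be illustrated with a small sketch that sends a file over a socket in two ways. It is an illustration only, not taken from any web server: the helper names <code>send_with_copies</code> and <code>send_with_sendfile</code> are hypothetical, and the second function relies on the <code>socket.sendfile()</code> method of Python's standard library, which uses the operating system's <code>sendfile</code> call where that is available and otherwise falls back to ordinary sends.

```python
import socket


def send_with_copies(sock: socket.socket, path: str, chunk_size: int = 65536) -> int:
    """Send a file by reading it into a user-space buffer, then writing it out."""
    sent = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)  # data is copied from kernel to user space
            if not chunk:
                break
            sock.sendall(chunk)         # and copied back from user space to kernel
            sent += len(chunk)
    return sent


def send_with_sendfile(sock: socket.socket, path: str) -> int:
    """Ask the OS to move the file to the socket, avoiding the user-space buffer."""
    with open(path, "rb") as f:
        # Returns the total number of bytes sent.
        return sock.sendfile(f)
```

Both functions deliver the same bytes to the peer; the difference is only in how many times the data crosses the boundary between kernel space and user space.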