HTTP (HyperText Transfer Protocol)

Introduction

The WEB

Internet (or The Web) is a massive distributed client/server information system as depicted in the following diagram.
Many applications are running concurrently over the Web, such as web browsing/surfing, e-mail, file transfer, audio & video streaming, and so on. For the client and the server to communicate properly, these applications must agree on a specific application-level protocol such as HTTP, FTP, SMTP, or POP.

HyperText Transfer Protocol (HTTP)

HTTP (Hypertext Transfer Protocol) is perhaps the most popular application protocol used on the Internet (or The WEB).
BrowserWhenever you issue a URL from your browser to get a web resource using HTTP, e.g. Uniform Resource Locator (URL)A URL (Uniform Resource Locator) is used to uniquely identify a resource over the web. URL has the following syntax: protocol://hostname:port/path-and-file-name There are 4 parts in a URL:
For example, in the URL Other examples of URL are: ftp://www.ftp.org/docs/test.txt mailto: news:soc.culture.Singapore telnet://www.nowhere123.com/ HTTP ProtocolAs mentioned, whenever you enter a URL in the address box of the browser, the browser translates the URL into a request message according to the specified protocol; and sends the request message to the server. For example, the browser translated the URL GET /docs/index.html HTTP/1.1 Host: www.nowhere123.com Accept: image/gif, image/jpeg, */* Accept-Language: en-us Accept-Encoding: gzip, deflate User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1) (blank line) When this request message reaches the server, the server can take either one of these actions:
An example of the HTTP response message is as shown: HTTP/1.1 200 OK Date: Sun, 18 Oct 2009 08:56:53 GMT Server: Apache/2.2.14 (Win32) Last-Modified: Sat, 20 Nov 2004 07:16:26 GMT ETag: "10000000565a5-2c-3e94b66c2e680" Accept-Ranges: bytes Content-Length: 44 Connection: close Content-Type: text/html X-Pad: avoid browser bug <html><body><h2>It works!</h2></body></html> The browser receives the response message, interprets the message and displays the contents of the message on the browser's window according to the media type of the response (as in the Content-Type response header). Common media type include " In its idling state, an HTTP server does nothing but listening to the IP address(es) and port(s) specified in the configuration for incoming request. When a request arrives, the server analyzes the message header, applies rules specified in the configuration, and takes the appropriate action. The webmaster's main control over the action of web server is via the configuration, which will be dealt with in greater details in the later sections. HTTP over TCP/IPHTTP is a client-server application-level protocol. It typically runs over a TCP/IP connection, as illustrated. (HTTP needs not run on TCP/IP. It only presumes a reliable transport. Any transport protocols that provide such guarantees can be used.) TCP/IP (Transmission Control Protocol/Internet Protocol) is a set of transport and network-layer protocols for machines to communicate with each other over the network. IP (Internet Protocol) is a network-layer protocol, deals with network
addressing and routing. In an IP network, each machine is assigned a unique IP address (e.g., 165.1.2.3), and the IP software is responsible for routing a message from the source IP to the destination IP. In IPv4 (IP version 4), the IP address consists of 4 bytes, each ranging from 0 to 255, separated by dots, which is called the dotted-quad form. This numbering scheme supports up to 4G (2^32, about 4.3 billion) addresses on the network. The latest IPv6 (IP version 6) uses 128-bit addresses and therefore supports far more addresses.
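The 4-byte layout described above can be sketched in Java as follows. This is a minimal illustration, not part of any server: the class name Ipv4Sketch and its two methods are made up for this example. It packs the four parts of a dotted-quad address into one 32-bit value and unpacks it again, which also shows where the 4G limit comes from.

```java
// Illustrative sketch only: Ipv4Sketch and its method names are hypothetical.
final class Ipv4Sketch {

    /** Packs "a.b.c.d" (each part 0..255) into a value in the range 0 .. 2^32-1. */
    static long toLong(String dottedQuad) {
        String[] parts = dottedQuad.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected 4 parts: " + dottedQuad);
        }
        long value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);   // rejects non-numeric parts
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("part out of range: " + part);
            }
            value = (value << 8) | octet;         // shift in one byte at a time
        }
        return value;
    }

    /** Unpacks a 32-bit value back into the dotted-quad form. */
    static String toDottedQuad(long value) {
        return ((value >> 24) & 0xFF) + "." + ((value >> 16) & 0xFF) + "."
             + ((value >> 8) & 0xFF) + "." + (value & 0xFF);
    }

    public static void main(String[] args) {
        long packed = toLong("165.1.2.3");
        System.out.println(packed);                // the address as one number
        System.out.println(toDottedQuad(packed));  // back to 165.1.2.3
        System.out.println(1L << 32);              // number of distinct IPv4 addresses
    }
}
```

Four bytes give 2^32 distinct values, which is the "4G addresses" figure quoted above.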
Since memorizing numbers is difficult for most people, an English-like domain name, such as

TCP (Transmission Control Protocol) is a transport-layer protocol, responsible for establishing a connection between two machines. The transport layer offers 2 protocols: TCP and UDP (User Datagram Protocol). TCP is reliable: each packet has a sequence number, and an acknowledgement is expected. A packet will be re-transmitted if it is not acknowledged by the receiver, so data is delivered reliably and in order for as long as the connection stays up. UDP does not guarantee packet delivery, and is therefore not reliable. However, UDP has less network overhead and can be used for applications such as video and audio streaming, where reliability is not critical.

TCP multiplexes applications within an IP machine. For each IP machine, TCP supports (multiplexes) up to 65536 ports (or sockets), from port number 0 to 65535. An application, such as HTTP or FTP, runs (or listens) at a particular port number for incoming requests. Ports 0 to 1023 are pre-assigned to popular protocols, e.g., HTTP at 80, FTP at 21, Telnet at 23, SMTP at 25, NNTP at 119, and DNS at 53. Ports 1024 and above are available to the users.

Although TCP port 80 is pre-assigned to HTTP as the default HTTP port number, this does not prohibit you from running an HTTP server at another user-assigned port number (1024-65535) such as 8000 or 8080, especially for a test server. You could also run multiple HTTP servers on the same machine on different port numbers. When a client issues a URL without explicitly stating the port number, e.g.,

In brief, to communicate over TCP/IP, you need to know (a) the IP address or hostname, (b) the port number.

HTTP Specifications

The HTTP specification is maintained by W3C (World-wide Web Consortium) and available at http://www.w3.org/standards/techs/http. This article covers two versions of HTTP, namely, HTTP/1.0 and HTTP/1.1.
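The port conventions described above can be summarized in a small Java sketch. The class PortSketch and its method names are invented for this illustration, and the protocol table only covers the six protocols listed in the text.

```java
import java.util.Locale;

// Illustrative sketch only: PortSketch is a hypothetical helper, not a library class.
final class PortSketch {

    /** A TCP port number must fit in the range 0..65535. */
    static boolean isValid(int port) {
        return port >= 0 && port <= 65535;
    }

    /** Ports 0..1023 are the pre-assigned ("well-known") ports. */
    static boolean isWellKnown(int port) {
        return port >= 0 && port <= 1023;
    }

    /** Returns the pre-assigned port for the protocols named in the text, or -1 if unknown. */
    static int defaultPort(String protocol) {
        switch (protocol.toLowerCase(Locale.ROOT)) {
            case "ftp":    return 21;
            case "telnet": return 23;
            case "smtp":   return 25;
            case "dns":    return 53;
            case "http":   return 80;
            case "nntp":   return 119;
            default:       return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(defaultPort("http"));  // the default HTTP port
        System.out.println(isWellKnown(8080));    // a user-assigned port, so false
        System.out.println(isValid(65536));       // one past the last port, so false
    }
}
```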
The original version, HTTP/0.9 (1991), written by Tim Berners-Lee, is a simple protocol for transferring raw data across the Internet. HTTP/1.0 (1996) (defined in RFC 1945) improved the protocol by allowing MIME-like messages. HTTP/1.0 does not address the issues of proxies, caching, persistent connections, virtual hosts, and range download. These features were provided in HTTP/1.1 (1999) (defined in RFC 2616).

Apache HTTP Server or Apache Tomcat Server

An HTTP server (such as Apache HTTP Server or Apache Tomcat Server) is needed to study the HTTP protocol. Apache HTTP server is a popular industrial-strength production server, produced by Apache Software Foundation (ASF) @ www.apache.org. ASF is an open-source software foundation. That is to say, Apache HTTP server is free, with source code.

The first HTTP server was written by Tim Berners-Lee at CERN (European Center for Nuclear Research) in Geneva, Switzerland, who also invented HTML. Apache was built on the NCSA (National Center for Supercomputing Applications, USA) "httpd 1.3" server, in early 1995. Apache probably gets its name from the fact that it consists of some original code (from an earlier NCSA httpd web server) plus some patches; or from the name of an American Indian tribe.

Read "Apache How-to" on how to install and configure Apache HTTP server; or "Tomcat How-to" to install and get started with Apache Tomcat Server.

HTTP Request and Response Messages

HTTP client and server communicate by sending text messages. The client sends a request message to the server. The server, in turn, returns a response message. An HTTP message consists of a message header and an optional message body, separated by a blank line, as illustrated below:

HTTP Request Message

The format of an HTTP request message is as follows:

Request Line

The first line of the header is called the request line, followed by optional request headers. The request line has the following syntax:

request-method-name request-URI HTTP-version
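As a rough sketch of how a receiver might take the request line apart, the following Java class splits it into its three fields. The class name RequestLineSketch and its parse method are made up for this illustration. A real server applies far stricter rules; this version only checks that there are exactly three space-separated fields and that the last one begins with "HTTP/".

```java
// Illustrative sketch only: RequestLineSketch is hypothetical and much simpler than a real server.
final class RequestLineSketch {
    final String method;   // request-method-name, e.g. GET (case-sensitive)
    final String uri;      // request-URI, e.g. /test.html
    final String version;  // HTTP-version, e.g. HTTP/1.1

    private RequestLineSketch(String method, String uri, String version) {
        this.method = method;
        this.uri = uri;
        this.version = version;
    }

    /** Splits "method request-URI HTTP-version" into its three fields. */
    static RequestLineSketch parse(String line) {
        String[] parts = line.trim().split(" +");
        if (parts.length != 3) {
            throw new IllegalArgumentException("malformed request line: " + line);
        }
        if (!parts[2].startsWith("HTTP/")) {
            throw new IllegalArgumentException("bad HTTP-version: " + parts[2]);
        }
        return new RequestLineSketch(parts[0], parts[1], parts[2]);
    }

    public static void main(String[] args) {
        RequestLineSketch r = parse("GET /test.html HTTP/1.1");
        System.out.println(r.method + " | " + r.uri + " | " + r.version);
    }
}
```

The method name is kept exactly as received, because request method names are case-sensitive: "get" is not the same method as "GET".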
Examples of request line are: GET /test.html HTTP/1.1 HEAD /query.html HTTP/1.0 POST /index.html HTTP/1.1 Request HeadersThe request headers are in the form of request-header-name: request-header-value1, request-header-value2, ... Examples of request headers are: Host: www.xyz.com Connection: Keep-Alive Accept: image/gif, image/jpeg, */* Accept-Language: us-en, fr, cn ExampleThe following shows a sample HTTP request message: HTTP Response MessageThe format of the HTTP response message is as follows: Status LineThe first line is called the status line, followed by optional response header(s). The status line has the following syntax: HTTP-version status-code reason-phrase
Examples of status line are: HTTP/1.1 200 OK HTTP/1.0 404 Not Found HTTP/1.1 403 Forbidden Response HeadersThe response headers are in the form response-header-name: response-header-value1, response-header-value2, ... Examples of response headers are: Content-Type: text/html Content-Length: 35 Connection: Keep-Alive Keep-Alive: timeout=15, max=100 The response message body contains the resource data requested. ExampleThe following shows a sample response message: HTTP Request MethodsHTTP protocol defines a set of request methods. A client can use one of these request methods to send a request message to an HTTP server. The methods are:
"GET" Request MethodGET is the most common HTTP request method. A client can use the GET request method to request (or "get") for a piece of resource from an HTTP server. A GET request message takes the following syntax: GET request-URI HTTP-version (optional request headers) (optional request body)
Testing HTTP Requests

There are many ways to test out HTTP requests. You can use a utility program such as "

Telnet

"Telnet" is a very useful networking utility. You can use telnet to establish a TCP connection with a server and issue raw HTTP requests. For example, suppose that you have started your HTTP server in the localhost (IP address 127.0.0.1) at port 8000:

> telnet
telnet> help
... telnet help menu ...
telnet> open 127.0.0.1 8000
Connecting To 127.0.0.1...
GET /index.html HTTP/1.0
(Hit enter twice to send the terminating blank line ...)
... HTTP response message ...

Telnet is a character-based protocol. Each character you enter on the telnet client will be sent to the server immediately. Therefore, you cannot correct a typo while entering your raw command, as delete and backspace will be sent to the server. You may have to enable the "local echo" option to see the characters you enter. Check the telnet manual (search Windows' help) for details on using telnet.

Network Program

You could also write your own network program to issue a raw HTTP request to an HTTP server. Your network program shall first establish a TCP/IP connection with the server. Once the TCP connection is established, you can issue the raw request.
An example of a network program written in Java is shown below (assuming that the HTTP server is running on the localhost (IP address 127.0.0.1) at port 8000):

import java.net.*;
import java.io.*;

public class HttpClient {
   public static void main(String[] args) throws IOException {
      String host = "127.0.0.1";   // IP address or hostname of the server
      int port = 8000;             // TCP port the server listens on
      // Establish the TCP connection; try-with-resources closes everything at the end
      try (Socket socket = new Socket(host, port);
           BufferedReader in = new BufferedReader(
                 new InputStreamReader(socket.getInputStream()));
           PrintWriter out = new PrintWriter(socket.getOutputStream())) {
         // Send the request line followed by the terminating blank line.
         // HTTP lines end with CR LF; println() would use the platform's line ending.
         out.print("GET /index.html HTTP/1.0\r\n");
         out.print("\r\n");
         out.flush();
         // Print the response until the server closes the connection
         String line;
         while ((line = in.readLine()) != null) {
            System.out.println(line);
         }
      }
   }
}

HTTP/1.0 GET Request

The following shows the response of an HTTP/1.0 GET request (issued via telnet or your own network program, assuming that you have started your HTTP server):

GET /index.html HTTP/1.0
(enter twice to create a blank line)

HTTP/1.1 200 OK
Date: Sun, 18 Oct 2009 08:56:53 GMT
Server: Apache/2.2.14 (Win32)
Last-Modified: Sat, 20 Nov 2004 07:16:26 GMT
ETag: "10000000565a5-2c-3e94b66c2e680"
Accept-Ranges: bytes
Content-Length: 44
Connection: close
Content-Type: text/html
X-Pad: avoid browser bug

<html><body><h2>It works!</h2></body></html>

Connection to host lost.

In this example, the client issues a GET request to ask for a document named "

The server receives the request message, interprets it, and maps the request-URI to a document under its document directory. If the requested document is available, the server returns the document with a response status code "200 OK".
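To make the structure of such a response concrete, here is a minimal Java sketch that splits a raw response into status line, headers, and body. The class ResponseSketch and its field names are invented for this example, and the parsing is deliberately naive: it assumes the whole message is already in one string, that headers are not folded across lines, and that the body is plain text.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: ResponseSketch is hypothetical and ignores many real-world cases.
final class ResponseSketch {
    final String version;       // e.g. HTTP/1.1
    final int statusCode;       // e.g. 200
    final String reasonPhrase;  // e.g. OK
    final Map<String, String> headers;  // header names kept exactly as received
    final String body;

    private ResponseSketch(String version, int statusCode, String reasonPhrase,
                           Map<String, String> headers, String body) {
        this.version = version;
        this.statusCode = statusCode;
        this.reasonPhrase = reasonPhrase;
        this.headers = headers;
        this.body = body;
    }

    static ResponseSketch parse(String raw) {
        String text = raw.replace("\r\n", "\n");
        // The first blank line separates the header from the body.
        int blank = text.indexOf("\n\n");
        String head = blank >= 0 ? text.substring(0, blank) : text;
        String body = blank >= 0 ? text.substring(blank + 2) : "";

        String[] lines = head.split("\n");
        // Status line: HTTP-version status-code reason-phrase
        String[] status = lines[0].split(" ", 3);
        if (status.length < 2) {
            throw new IllegalArgumentException("malformed status line: " + lines[0]);
        }
        int code = Integer.parseInt(status[1]);
        String reason = status.length == 3 ? status[2] : "";

        Map<String, String> headers = new LinkedHashMap<>();
        for (int i = 1; i < lines.length; i++) {
            int colon = lines[i].indexOf(':');
            if (colon > 0) {   // name: value
                headers.put(lines[i].substring(0, colon).trim(),
                            lines[i].substring(colon + 1).trim());
            }
        }
        return new ResponseSketch(status[0], code, reason, headers, body);
    }

    public static void main(String[] args) {
        ResponseSketch r = parse("HTTP/1.1 200 OK\r\nContent-Length: 44\r\n"
                + "Content-Type: text/html\r\n\r\n"
                + "<html><body><h2>It works!</h2></body></html>");
        System.out.println(r.statusCode + " " + r.reasonPhrase);
        System.out.println(r.headers.get("Content-Type"));
        System.out.println(r.body.length());   // matches Content-Length for this ASCII body
    }
}
```

Header names are case-insensitive in HTTP; a more careful version would normalize them before the lookup.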
The response headers provide the necessary description of the document returned, such as the last-modified date ( Notes:
Response Status CodeThe first line of the response message (i.e., the status line) contains the response status code, which is generated by the server to indicate the outcome of the request. The status code is a 3-digit number:
Some commonly encountered status codes are:
More HTTP/1.0 GET Request ExamplesExample: Misspelt Request MethodIn the request, "GET" is misspelled as "get". The server returns an error "501 Method Not Implemented". The response header " get /test.html HTTP/1.0 (enter twice to create a blank line) HTTP/1.1 501 Method Not Implemented Date: Sun, 18 Oct 2009 10:32:05 GMT Server: Apache/2.2.14 (Win32) Allow: GET,HEAD,POST,OPTIONS,TRACE Content-Length: 215 Connection: close Content-Type: text/html; charset=iso-8859-1 <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"> <html><head> <title>501 Method Not Implemented</title> </head><body> <h2>Method Not Implemented</h2> <p>get to /index.html not supported.<br /> </p> </body></html> Example: 404 File Not FoundIn this GET request, the request-URL " GET /t.html HTTP/1.0 (enter twice to create a blank line) HTTP/1.1 404 Not Found Date: Sun, 18 Oct 2009 10:36:20 GMT Server: Apache/2.2.14 (Win32) Content-Length: 204 Connection: close Content-Type: text/html; charset=iso-8859-1 <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"> <html><head> <title>404 Not Found</title> </head><body> <h2>Not Found</h2> <p>The requested URL /t.html was not found on this server.</p> </body></html> Example: Wrong HTTP Version NumberIn this GET request, the HTTP-version was misspelled, resulted in bad syntax. The server returns an error "400 Bad Request". HTTP-version should be either HTTP/1.0 or HTTP/1.1. 
GET /index.html HTTTTTP/1.0
(enter twice to create a blank line)

HTTP/1.1 400 Bad Request
Date: Sun, 08 Feb 2004 01:29:40 GMT
Server: Apache/1.3.29 (Win32)
Connection: close
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML><HEAD>
<TITLE>400 Bad Request</TITLE>
</HEAD><BODY>
<H1>Bad Request</H1>
Your browser sent a request that this server could not understand.<P>
The request line contained invalid characters following the protocol string.<P><P>
</BODY></HTML>

Note: Apache 2.2.14 ignores this error and returns the document with status code "200 OK".

Example: Wrong Request-URI

In the following GET request, the request-URI did not begin from the root "

GET test.html HTTP/1.0
(blank line)

HTTP/1.1 400 Bad Request
Date: Sun, 18 Oct 2009 10:42:27 GMT
Server: Apache/2.2.14 (Win32)
Content-Length: 226
Connection: close
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>400 Bad Request</title>
</head><body>
<h2>Bad Request</h2>
<p>Your browser sent a request that this server could not understand.<br />
</p>
</body></html>

Example: Keep-Alive Connection

By default, for an HTTP/1.0 GET request, the server closes the TCP connection once the response is delivered. You could request that the TCP connection be maintained (so as to send another request using the same TCP connection, to improve
on the network efficiency), via an optional request header " GET /test.html HTTP/1.0 Connection: Keep-Alive (blank line) HTTP/1.1 200 OK Date: Sun, 18 Oct 2009 10:47:06 GMT Server: Apache/2.2.14 (Win32) Last-Modified: Sat, 20 Nov 2004 07:16:26 GMT ETag: "10000000565a5-2c-3e94b66c2e680" Accept-Ranges: bytes Content-Length: 44 Keep-Alive: timeout=5, max=100 Connection: Keep-Alive Content-Type: text/html <html><body><h2>It works!</h2></body></html> Notes:
Example: Accessing a Protected ResourceThe following GET request tried to access a protected resource. The server returns an error "403
Forbidden". In this example, the directory " <Directory "C:/apache/htdocs/forbidden"> Order deny,allow deny from all </Directory> GET /forbidden/index.html HTTP/1.0 (blank line) HTTP/1.1 403 Forbidden Date: Sun, 18 Oct 2009 11:58:41 GMT Server: Apache/2.2.14 (Win32) Content-Length: 222 Keep-Alive: timeout=5, max=100 Connection: Keep-Alive Content-Type: text/html; charset=iso-8859-1 <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"> <html><head> <title>403 Forbidden</title> </head><body> <h2>Forbidden</h2> <p>You don't have permission to access /forbidden/index.html on this server.</p> </body></html> HTTP/1.1 GET RequestHTTP/1.1 server supports so-called virtual hosts. That is, the same physical server could house several virtual hosts, with different hostnames (e.g., Example: HTTP/1.1 RequestHTTP/1.1 maintains persistent (or keep-alive) connection by default to improve the network efficiency. You can use a request header " GET /index.html HTTP/1.1 Host: 127.0.0.1 (blank line) HTTP/1.1 200 OK Date: Sun, 18 Oct 2009 12:10:12 GMT Server: Apache/2.2.14 (Win32) Last-Modified: Sat, 20 Nov 2004 07:16:26 GMT ETag: "10000000565a5-2c-3e94b66c2e680" Accept-Ranges: bytes Content-Length: 44 Content-Type: text/html <html><body><h2>It works!</h2></body></html> Example: HTTP/1.1 Missing Host HeaderThe following example shows that " GET /index.html HTTP/1.1 (blank line) HTTP/1.1 400 Bad Request Date: Sun, 18 Oct 2009 12:13:46 GMT Server: Apache/2.2.14 (Win32) Content-Length: 226 Connection: close Content-Type: text/html; charset=iso-8859-1 <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"> <html><head> <title>400 Bad Request</title> </head><body> <h2>Bad Request</h2> <p>Your browser sent a request that this server could not understand.<br /> </p> </body></html> Conditional GET RequestsIn all the previous examples, the server returns the entire document if the request can be fulfilled (i.e. unconditional). You may use additional request header to issue a "conditional request". 
For example, to ask for the document based on the last-modified date (so as to decide whether to use the local cache copy), or to ask for a portion of the document (or range) instead of the entire document (useful for downloading large documents). The conditional request headers include:
Request HeadersThis section describes some of the commonly-used request headers. Refer to HTTP Specification for more details. The syntax of header name is words with initial-cap joined using dash
(
The following headers can be used for content negotiation by the client to ask the server to deliver the preferred type of the document (in terms of the media type, e.g. JPEG vs. GIF, or language used e.g. English vs. French) if the server maintain multiple versions for the same document.
GET Request for DirectorySuppose that a directory called " If a client issues a GET request to "
It is interesting to
take note that if a client issue a GET request to " GET /testdir HTTP/1.1 Host: 127.0.0.1 (blank line) HTTP/1.1 301 Moved Permanently Date: Sun, 18 Oct 2009 13:19:15 GMT Server: Apache/2.2.14 (Win32) Location: http://127.0.0.1:8000/testdir/ Content-Length: 238 Content-Type: text/html; charset=iso-8859-1 <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"> <html><head> <title>301 Moved Permanently</title> </head><body> <h2>Moved Permanently</h2> <p>The document has moved <a href="http://127.0.0.1:8000/testdir/">here</a>.</p> </body></html> Most of the browser will follow up with another request to " Issue a GET Request through a Proxy ServerTo send a GET request through a proxy server, (a) establish a TCP connection to the proxy server; (b) use an absolute request-URI The following trace was captured using telnet. A connection is established with the proxy server, and a GET request issued. Absolute request-URI is used in the request line. GET http://www.amazon.com/index.html HTTP/1.1 Host: www.amazon.com Connection: Close (blank line) HTTP/1.1 302 Found Transfer-Encoding: chunked Date: Fri, 27 Feb 2004 09:27:35 GMT Content-Type: text/html; charset=iso-8859-1 Connection: close Server: Stronghold/2.4.2 Apache/1.3.6 C2NetEU/2412 (Unix) Set-Cookie: skin=; domain=.amazon.com; path=/; expires=Wed, 01-Aug-01 12:00:00 GMT Connection: close Location: http://www.amazon.com:80/exec/obidos/subst/home/home.html Via: 1.1 xproxy (NetCache NetApp/5.3.1R4D5) ed <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"> <HTML><HEAD> <TITLE>302 Found</TITLE> </HEAD><BODY> <H1>Found</H1> The document has moved <A HREF="http://www.amazon.com:80/exec/obidos/subst/home/home.html"> here</A>.<P> </BODY></HTML> 0 Take note that the response is returned in "chunks". "HEAD" Request MethodHEAD request is similar to GET request. However, the server returns only the response header without the response body, which contains the actual document. 
HEAD request is useful for checking the headers, such as The syntax of the HEAD request is as follows: HEAD request-URI HTTP-version (other optional request headers) (blank line) (optional request body) ExampleHEAD /index.html HTTP/1.0 (blank line) HTTP/1.1 200 OK Date: Sun, 18 Oct 2009 14:09:16 GMT Server: Apache/2.2.14 (Win32) Last-Modified: Sat, 20 Nov 2004 07:16:26 GMT ETag: "10000000565a5-2c-3e94b66c2e680" Accept-Ranges: bytes Content-Length: 44 Connection: close Content-Type: text/html X-Pad: avoid browser bug Notice that the response consists of the header only without the body, which contains the actual document. "OPTIONS" Request MethodA client can use an OPTIONS request method to query the server which request methods are supported. The syntax for OPTIONS request message is: OPTIONS request-URI|* HTTP-version (other optional headers) (blank line) " ExampleFor example, the following OPTIONS request is sent through a proxy server: OPTIONS http://www.amazon.com/ HTTP/1.1 Host: www.amazon.com Connection: Close (blank line) HTTP/1.1 200 OK Date: Fri, 27 Feb 2004 09:42:46 GMT Content-Length: 0 Connection: close Server: Stronghold/2.4.2 Apache/1.3.6 C2NetEU/2412 (Unix) Allow: GET, HEAD, POST, OPTIONS, TRACE Connection: close Via: 1.1 xproxy (NetCache NetApp/5.3.1R4D5) (blank line) All servers that allow GET request will allow HEAD request. Sometimes, HEAD is not listed. "TRACE" Request MethodA client can send a TRACE request to ask the server to return a diagnostic trace. TRACE request takes the following syntax: TRACE / HTTP-version (blank line) ExampleThe following example shows a TRACE request issued through a proxy server. 
TRACE http://www.amazon.com/ HTTP/1.1 Host: www.amazon.com Connection: Close (blank line) HTTP/1.1 200 OK Transfer-Encoding: chunked Date: Fri, 27 Feb 2004 09:44:21 GMT Content-Type: message/http Connection: close Server: Stronghold/2.4.2 Apache/1.3.6 C2NetEU/2412 (Unix) Connection: close Via: 1.1 xproxy (NetCache NetApp/5.3.1R4D5) 9d TRACE / HTTP/1.1 Connection: keep-alive Host: www.amazon.com Via: 1.1 xproxy (NetCache NetApp/5.3.1R4D5) X-Forwarded-For: 155.69.185.59, 155.69.5.234 0 (To compare the TRACE request with trace route) Submitting HTML Form Data and Query StringIn many Internet applications, such as e-commerce and search engine, the clients are required to submit additional information to the server (e.g., the name, address, the search keywords). Based on the data submitted, the server takes an appropriate action and produces a customized response. The clients are usually presented with a form (produced using HTML The following is a sample HTML form, which is produced by the following HTML script: <html> <head><title>A Sample HTML Form</title></head> <body> <h2 align="left">A Sample HTML Data Entry Form</h2> <form method="get" action="/bin/process"> Enter your name: <input type="text" name="username"><br /> Enter your password: <input type="password" name="password"><br /> Which year? 
<input type="radio" name="year" value="1" />Yr 1
<input type="radio" name="year" value="2" />Yr 2
<input type="radio" name="year" value="3" />Yr 3<br />
Subject registered:
<input type="checkbox" name="subject" value="e101" />E101
<input type="checkbox" name="subject" value="e102" />E102
<input type="checkbox" name="subject" value="e103" />E103<br />
Select Day:
<select name="day">
  <option value="mon">Monday</option>
  <option value="wed">Wednesday</option>
  <option value="fri">Friday</option>
</select><br />
<textarea rows="3" cols="30">Enter your special request here</textarea><br />
<input type="submit" value="SEND" />
<input type="reset" value="CLEAR" />
<input type="hidden" name="action" value="registration" />
</form>
</body>
</html>

A form contains fields. The types of field include:
Each field has a name and can take on a specified value. Once the client fills in the fields and hits the submit button, the browser gathers each of the fields' name
and value, packed them into " name1=value1&name2=value2&name3=value3&... Special characters are not allowed inside the query string. They must be replaced by a " name=Peter+Lee&address=%23123+Happy+Ave&Language=C%2B%2B The query string can be sent to the server using either HTTP GET or POST request method,
which is specified in the <form method="get|post" action="url"> If GET request method is used, the URL-encoded query string will be appended behind the request-URI after a " GET request-URI?query-string HTTP-version (other optional request headers) (blank line) (optional request body) Using GET request to send the query string has the following drawbacks:
POST method overcomes these drawbacks. If POST request method is used, the query string will be sent in the body of the request message, where the amount is not limited. The request headers

Example

The following HTML form is used to gather the username and password in a login menu.

<html>
<head><title>Login</title></head>
<body>
<h2>LOGIN</h2>
<form method="get" action="/bin/login">
  Username: <input type="text" name="user" size="25" /><br />
  Password: <input type="password" name="pw" size="10" /><br /><br />
  <input type="hidden" name="action" value="login" />
  <input type="submit" value="SEND" />
</form>
</body>
</html>

The HTTP GET request method is used to send the query string. Suppose the user enters "Peter Lee" as the username and "123456" as the password, and clicks the submit button. The following GET request is generated:

GET /bin/login?user=Peter+Lee&pw=123456&action=login HTTP/1.1
Accept: image/gif, image/jpeg, */*
Referer: http://127.0.0.1:8000/login.html
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
Host: 127.0.0.1:8000
Connection: Keep-Alive

Note that although the password that you enter does not show on the screen, it is shown clearly in the address box of the browser. You should never send your password without proper encryption.

http://127.0.0.1:8000/bin/login?user=Peter+Lee&pw=123456&action=login

URL and URI

URL (Uniform Resource Locator)

A URL (Uniform Resource Locator), defined in RFC 2396, is used to uniquely identify a resource over the web. URL has the following syntax:

protocol://hostname:port/path-and-file-name

There are 4 parts in a URL:
For example, in the URL Other examples of URL are: ftp://www.ftp.org/docs/test.txt mailto: news:soc.culture.Singapore telnet://www.nowhere123.com/ Encoded URLURL cannot contain special characters, such as blank or URI (Uniform Resource Identifier)URI (Uniform Resource Identifier), defined in RFC3986, is more general than URL, which can even locate a fragment within a resource. The URI syntax for HTTP protocol is: http://host:port/path?request-parameters#nameAnchor
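The URI syntax and the URL-encoding rules can be tried out with the standard java.net classes. The following is a minimal sketch: the class UriSketch and its helper methods are invented for this illustration, and the sample URI (including its port, query string, and "#top" anchor) is made up to exercise every part of the syntax.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLEncoder;

// Illustrative sketch only: UriSketch and the sample URI are hypothetical.
final class UriSketch {

    /** URL-encodes one name or value the way an HTML form does (space becomes '+'). */
    static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the Java platform
            throw new IllegalStateException(e);
        }
    }

    /** Returns the explicit port, or 80 for an http URI that omits it, else -1. */
    static int effectivePort(URI uri) {
        if (uri.getPort() != -1) {
            return uri.getPort();
        }
        return "http".equalsIgnoreCase(uri.getScheme()) ? 80 : -1;
    }

    public static void main(String[] args) {
        URI uri = URI.create(
                "http://www.nowhere123.com:8000/docs/index.html?user=Peter+Lee#top");
        System.out.println(uri.getScheme());    // protocol
        System.out.println(uri.getHost());      // hostname
        System.out.println(effectivePort(uri)); // port
        System.out.println(uri.getPath());      // path and file name
        System.out.println(uri.getRawQuery());  // request parameters, still encoded
        System.out.println(uri.getFragment());  // name anchor

        // Building an encoded query string from name/value pairs
        String query = "name=" + encode("Peter Lee")
                + "&address=" + encode("#123 Happy Ave")
                + "&Language=" + encode("C++");
        System.out.println(query);
    }
}
```

The last line prints the same encoded string used as the query-string example earlier in this article: the space becomes "+", "#" becomes "%23", and "+" becomes "%2B".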
"POST" Request Method

POST request method is used to "post" additional data up to the server (e.g., submitting HTML form data or uploading a file). Issuing an HTTP URL from the browser always triggers a GET request. To trigger a POST request, you can use an HTML form with attribute

The POST request takes the following syntax:

POST request-URI HTTP-version
Content-Type: mime-type
Content-Length: number-of-bytes
(other optional request headers)

(URL-encoded query string)

Request headers

Example: Submitting Form Data using POST Request Method

We use the same HTML script as above, but change the request method to POST.

<html>
<head><title>Login</title></head>
<body>
<h2>LOGIN</h2>
<form method="post" action="/bin/login">
  Username: <input type="text" name="user" size="25" /><br />
  Password: <input type="password" name="pw" size="10" /><br /><br />
  <input type="hidden" name="action" value="login" />
  <input type="submit" value="SEND" />
</form>
</body>
</html>

Suppose the user enters "Peter Lee" as username and "123456" as password, and clicks the submit button. The following POST request would be generated by the browser:

POST /bin/login HTTP/1.1
Host: 127.0.0.1:8000
Accept: image/gif, image/jpeg, */*
Referer: http://127.0.0.1:8000/login.html
Accept-Language: en-us
Content-Type: application/x-www-form-urlencoded
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
Content-Length: 37
Connection: Keep-Alive
Cache-Control: no-cache

user=Peter+Lee&pw=123456&action=login

Note that the

POST vs GET for Submitting Form Data

As mentioned in the previous section, POST request has the following advantages compared with the GET request in sending the query string:
Note that although the password is not shown in the browser's address box, it is transmitted to the server in clear text, and is subject to network sniffing. Hence, sending a password using a POST request over an unencrypted connection is not secure.

File Upload using multipart/form-data POST Request

"RFC 1867: Form-based File upload in HTML" specifies how a file can be uploaded to the server using a POST request from an HTML form. A new attribute

Example

The following HTML form can be used for file upload:

<html>
<head><title>File Upload</title></head>
<body>
<h2>Upload File</h2>
<form method="post" enctype="multipart/form-data" action="servlet/UploadServlet">
  Who are you: <input type="text" name="username" /><br />
  Choose the file to upload: <input type="file" name="fileID" /><br />
  <input type="submit" value="SEND" />
</form>
</body>
</html>

When the browser encounters an

When the user clicks the submit button, the browser sends the form data and the content of the selected file(s). The old encoding type "

Each part identifies the input name within the original HTML form, and the content type if the media is known, or as

The original local file name could be supplied as a "

An example of the POST message for file upload is as follows:

POST /bin/upload HTTP/1.1
Host: test101
Accept: image/gif, image/jpeg, */*
Accept-Language: en-us
Content-Type: multipart/form-data; boundary=---------------------------7d41b838504d8
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
Content-Length: 342
Connection: Keep-Alive
Cache-Control: no-cache

-----------------------------7d41b838504d8
Content-Disposition: form-data; name="username"

Peter Lee
-----------------------------7d41b838504d8
Content-Disposition: form-data; name="fileID"; filename="C:\temp.html"
Content-Type: text/plain

<h2>Home page on main server</h2>
-----------------------------7d41b838504d8--

Servlet 3.0 provides built-in support for processing file upload. Read "Uploading Files in Servlet 3.0".

"CONNECT" Request Method

The HTTP CONNECT request is used to ask a proxy to make a connection to another host and simply relay the content, rather than attempting to parse or cache the message. This is often used to make a connection through a proxy.

(Under Construction)

Other Request Methods

PUT: Ask the server to store the data.

DELETE: Ask the server to delete the data.

For security reasons, PUT and DELETE are not supported by most production servers.

Extension methods (also error codes and headers) can be defined to extend the functionality of the HTTP protocol.

(Under Construction)

Content Negotiation

As mentioned earlier, HTTP supports content negotiation between the
client and the server. A client can use additional request headers (such as Content-Type NegotiationThe server uses a MIME configuration file (called " For content-type negotiation, suppose that the client
requests a file called "logo" without specifying an extension, and the server's MIME configuration contains the following entries:

    image/gif gif
    image/jpeg jpeg jpg jpe

The server will return "logo.gif". The message trace is shown:

    GET /logo HTTP/1.1
    Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, */*
    Accept-Language: en-us
    Accept-Encoding: gzip, deflate
    User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
    Host: test101:8080
    Connection: Keep-Alive
    (blank line)

    HTTP/1.1 200 OK
    Date: Sun, 29 Feb 2004 01:42:22 GMT
    Server: Apache/1.3.29 (Win32)
    Content-Location: logo.gif
    Vary: negotiate,accept
    TCN: choice
    Last-Modified: Wed, 21 Feb 1996 19:45:52 GMT
    ETag: "0-916-312b7670;404142de"
    Accept-Ranges: bytes
    Content-Length: 2326
    Keep-Alive: timeout=15, max=100
    Connection: Keep-Alive
    Content-Type: image/gif
    (blank line)
    (body omitted)

However, if the server
has 3 "logo" files of different types and the client sends only "Accept: */*", the server may choose a different file, in this trace "logo.html":

    GET /logo HTTP/1.1
    Accept: */*
    Accept-Language: en-us
    Accept-Encoding: gzip, deflate
    User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
    Host: test101:8080
    Connection: Keep-Alive
    (blank line)

    HTTP/1.1 200 OK
    Date: Sun, 29 Feb 2004 01:48:16 GMT
    Server: Apache/1.3.29 (Win32)
    Content-Location: logo.html
    Vary: negotiate,accept
    TCN: choice
    Last-Modified: Fri, 20 Feb 2004 04:31:17 GMT
    ETag: "0-10-40358d95;404144c1"
    Accept-Ranges: bytes
    Content-Length: 16
    Keep-Alive: timeout=15, max=100
    Connection: Keep-Alive
    Content-Type: text/html
    (blank line)
    (body omitted)

In Apache, directives such as "AddType" and "Options MultiViews" are relevant to content-type negotiation.
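One way to picture the selection seen in the two traces above is the following Python sketch. It is not Apache's actual algorithm: the rule it applies (highest q-value, then most specific matching media range, then smallest file) is a simplifying assumption chosen because it reproduces both traces. The names MIME_TYPES, VARIANTS, parse_accept and choose_variant are hypothetical, and so is the size of "logo.jpg"; the other two sizes are the Content-Length values from the traces.

```python
# Hypothetical extension-to-media-type table, in the spirit of a MIME configuration file.
MIME_TYPES = {
    "gif": "image/gif",
    "jpeg": "image/jpeg", "jpg": "image/jpeg", "jpe": "image/jpeg",
    "html": "text/html",
}


def parse_accept(header):
    """Parse an Accept header value into a list of (media_range, q) pairs."""
    ranges = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_range = parts[0].lower()
        if not media_range:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        ranges.append((media_range, q))
    return ranges


def specificity(media_range):
    """Rank */* lowest, type/* in the middle, type/subtype highest."""
    if media_range == "*/*":
        return 0
    if media_range.endswith("/*"):
        return 1
    return 2


def best_match(media_type, accept_ranges):
    """Return (q, specificity) of the most specific range matching media_type, or None."""
    main_type = media_type.split("/")[0]
    best = None
    for media_range, q in accept_ranges:
        if media_range in (media_type, main_type + "/*", "*/*"):
            spec = specificity(media_range)
            if best is None or spec > best[1]:
                best = (q, spec)
    return best


def choose_variant(accept_header, variants):
    """Pick a file name from variants (file name -> size in bytes), or None.

    Assumed ordering: highest q, then most specific match, then smallest size.
    """
    accept_ranges = parse_accept(accept_header)
    candidates = []
    for name, size in variants.items():
        extension = name.rsplit(".", 1)[-1].lower()
        media_type = MIME_TYPES.get(extension)
        if media_type is None:
            continue
        match = best_match(media_type, accept_ranges)
        if match is None or match[0] <= 0:
            continue  # not acceptable to the client
        q, spec = match
        candidates.append((-q, -spec, size, name))
    return min(candidates)[3] if candidates else None


# Sizes of logo.gif and logo.html come from the traces; logo.jpg is made up.
VARIANTS = {"logo.gif": 2326, "logo.html": 16, "logo.jpg": 5000}

print(choose_variant("image/gif, image/jpeg, */*", VARIANTS))  # logo.gif
print(choose_variant("*/*", VARIANTS))                         # logo.html
```

With an explicit "image/gif" in the header, the exact match outranks the "*/*" wildcard; with only "*/*", every file matches equally and the smallest one is picked.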
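Language Negotiation and "Options MultiViews"

The "Options MultiViews" directive enables negotiation among files that share a base name, and "AddLanguage" associates a file extension with a language. For example:

    AddLanguage en .en
    <Directory "C:/_javabin/Apache1.3.29/htdocs">
       Options Indexes MultiViews
    </Directory>

Suppose that the client requests "index.html" and sends "Accept-Language: en-us". If the server has "index.html.en", that file is returned, as the "Content-Location" header in the response shows. A message trace is as follows:

    GET /index.html HTTP/1.1
    Accept: */*
    Accept-Language: en-us
    Accept-Encoding: gzip, deflate
    User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
    Host: test101:8080
    Connection: Keep-Alive
    (blank line)

    HTTP/1.1 200 OK
    Date: Sun, 29 Feb 2004 02:08:29 GMT
    Server: Apache/1.3.29 (Win32)
    Content-Location: index.html.en
    Vary: negotiate
    TCN: choice
    Last-Modified: Sun, 29 Feb 2004 02:07:45 GMT
    ETag: "0-13-40414971;40414964"
    Accept-Ranges: bytes
    Content-Length: 19
    Keep-Alive: timeout=15, max=100
    Connection: Keep-Alive
    Content-Type: text/html
    Content-Language: en
    (blank line)
    (body omitted)

Note that "Options All" does not include "MultiViews"; it has to be turned on explicitly.

The "LanguagePriority" directive sets the order of preference among language variants when the client does not express a preference:

    <IfModule mod_negotiation.c>
       LanguagePriority en da nl et fr de el it ja kr no pl pt pt-br
    </IfModule>

Character Set Negotiation

A client can use the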
request header "Accept-Charset" to tell the server which character sets it prefers:

    Accept-Charset: charset-1, charset-2, ...

Commonly encountered character sets include ISO-8859-1 (Latin-1), ISO-8859-2, ISO-8859-5, Big5 (Traditional Chinese), GB2312 (Simplified Chinese), UCS-2 (2-byte Unicode), UCS-4 (4-byte Unicode) and UTF-8 (variable-length encoded Unicode).

Similarly, the "AddCharset" directive associates a file extension with a character set:

    AddCharset ISO-8859-8   .iso8859-8
    AddCharset ISO-2022-JP  .jis
    AddCharset Big5         .Big5 .big5
    AddCharset WINDOWS-1251 .cp-1251
    AddCharset CP866        .cp866
    AddCharset ISO-8859-5   .iso-ru
    AddCharset KOI8-R       .koi8-r
    AddCharset UCS-2        .ucs2
    AddCharset UCS-4        .ucs4
    AddCharset UTF-8        .utf8

Encoding Negotiation

A client
can use the "Accept-Encoding" request header to tell the server which content encodings it supports:

    Accept-Encoding: encoding-method-1, encoding-method-2, ...

Similarly, the "AddEncoding" directive associates a file extension with an encoding:

    AddEncoding x-compress .Z
    AddEncoding x-gzip .gz .tgz

Persistent (or Keep-alive) Connections

In HTTP/1.0, the server closes the TCP connection after delivering the response by default. The client can negotiate with the server and ask it not to close the connection after delivering the response, so that another request can be sent through the same connection. This is known as persistent
connection (or keep-alive connection). Persistent connections save the cost of opening a new TCP connection for every request.

For HTTP/1.0, the default connection is non-persistent. To ask for a persistent connection, the client must include the request header "Connection: Keep-Alive".

For HTTP/1.1, the default connection is persistent. The client does not have to send the "Connection: Keep-Alive" header; instead, it can send "Connection: close" to ask the server to close the connection after delivering the response.

Persistent connections are especially useful for web pages with many small inline images and other associated data, as all of these can be downloaded over the same connection, so the time and resources for opening and closing a TCP connection are spent once rather than once per item.
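The defaults described above can be summarized in a small Python sketch. The function name is_persistent and its arguments are hypothetical, and the sketch looks only at the protocol version and the "Connection" request header; a real server applies further limits of its own.

```python
def is_persistent(http_version, headers):
    """Return True if the connection should stay open after the response.

    http_version: "HTTP/1.0" or "HTTP/1.1"
    headers: dict mapping request header names to values
    """
    # Header names are case-insensitive, so normalize them before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    tokens = {t.strip().lower() for t in normalized.get("connection", "").split(",")}

    if "close" in tokens:
        return False           # an explicit close wins in either version
    if http_version == "HTTP/1.1":
        return True            # persistent by default
    return "keep-alive" in tokens  # HTTP/1.0 must ask for it


print(is_persistent("HTTP/1.0", {}))                            # False
print(is_persistent("HTTP/1.0", {"Connection": "Keep-Alive"}))  # True
print(is_persistent("HTTP/1.1", {}))                            # True
print(is_persistent("HTTP/1.1", {"Connection": "close"}))       # False
```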
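In Apache HTTP Server, several configuration directives are related to persistent connections:

The "KeepAlive" directive turns support for persistent connections on or off:

    KeepAlive On|Off

The "MaxKeepAliveRequests" directive sets the maximum number of requests allowed over one persistent connection:

    MaxKeepAliveRequests 200

The "KeepAliveTimeout" directive sets the number of seconds the server waits for the next request on a persistent connection before closing it:

    KeepAliveTimeout 10

Range Download

    Accept-Ranges: bytes
    Transfer-Encoding: chunked

(Under Construction)

Cache Control

The client can send the request header "Cache-Control: no-cache" to ask that a cached copy not be used without checking with the origin server. HTTP/1.0 caches use the older header "Pragma: no-cache" for the same purpose, so a client can send both:

    Pragma: no-cache
    Cache-Control: no-cache

(More, Under Construction)

REFERENCES & RESOURCES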
Latest version tested: HTTP 1.1, Apache HTTP Server 2.2.14

Which is the protocol used to transfer information between server and browser?

Hypertext Transfer Protocol (HTTP) is a method for encoding and transporting information between a client (such as a web browser) and a web server. HTTP is the primary protocol for transmission of information across the Internet.

Which protocol is used to transmit HTML files?

HTTP is a protocol that's built on top of the TCP/IP protocols. Each HTTP request is inside an IP packet, and each HTTP response is inside another IP packet, or more typically multiple packets, since the response data can be quite large.

Which protocol is used to transfer the HTML content from web server to client?

HTTP is a protocol for fetching resources such as HTML documents. It is the foundation of any data exchange on the Web and it is a client-server protocol, which means requests are initiated by the recipient, usually the Web browser.

Is HTML a transfer protocol?

Hypertext Transfer Protocol (HTTP) is an application-layer protocol for transmitting hypermedia documents, such as HTML. It was designed for communication between web browsers and web servers, but it can also be used for other purposes.