Whether you're a programmer or not, you have seen HTTP everywhere on the web. Even your first Hello World PHP script sent HTTP headers without you realizing it. In this article we are going to learn the basics of HTTP headers and how we can use them in our web applications.
What are HTTP Headers?
HTTP stands for "Hypertext Transfer Protocol". The entire World Wide Web uses this protocol. It was established in the early 1990s. Almost everything you see in your browser is transmitted to your computer over HTTP. For example, when you opened this article page, your browser probably sent over 40 HTTP requests and received an HTTP response for each.
HTTP headers are the core part of these HTTP requests and responses, and they carry information about the client browser, the requested page, the server and more.
Example
When you type a URL in your address bar, your browser sends an HTTP request and it may look like this:
GET /tutorials/other/top-20-mysql-best-practices/ HTTP/1.1
Host: code.tutsplus.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5 (.NET CLR 3.5.30729)
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Cookie: PHPSESSID=r2t5uvjq435r4q7ib3vtdjq120
Pragma: no-cache
Cache-Control: no-cache
The first line is the "Request Line", which contains some basic info on the request. The rest are the HTTP headers.
After that request, your browser receives an HTTP response that may look like this:
HTTP/1.x 200 OK
Transfer-Encoding: chunked
Date: Sat, 28 Nov 2009 04:36:25 GMT
Server: LiteSpeed
Connection: close
X-Powered-By: W3 Total Cache/0.8
Pragma: public
Expires: Sat, 28 Nov 2009 05:36:25 GMT
Etag: "pub1259380237;gz"
Cache-Control: max-age=3600, public
Content-Type: text/html; charset=UTF-8
Last-Modified: Sat, 28 Nov 2009 03:50:37 GMT
X-Pingback: https://code.tutsplus.com/xmlrpc.php
Content-Encoding: gzip
Vary: Accept-Encoding, Cookie, User-Agent

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "https://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Top 20+ MySQL Best Practices - Nettuts+</title>
<!-- ... rest of the html ... -->
The first line is the "Status Line", followed by "HTTP Headers", until the blank line. After that, the "content" starts (in this case, HTML output).
When you look at the source code of a web page in your browser, you will only see the HTML portion and not the HTTP headers, even though they actually have been transmitted together as you see above.
These HTTP requests are also sent and received for other things, such as images, CSS files, JavaScript files, etc. That is why I said earlier that your browser probably sent over 40 HTTP requests just to load this article page.
Now, let's start reviewing the structure in more detail.
How to See HTTP Headers
I used the Firebug extension for Firefox to analyze HTTP headers, but you can use the Developer Tools in Firefox, Chrome or any modern web browser to view them.
In PHP:
- getallheaders() gets the request headers. You can also use the $_SERVER array.
- headers_list() gets the response headers.
Further in the article, we will see some code examples in PHP.
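As a first taste, here is a minimal sketch of how request headers can be rebuilt from a $_SERVER-style array, which is handy where getallheaders() is not available (for example on the command line). The function name headers_from_server() and the sample array are assumptions made for illustration, not part of PHP.

```php
<?php
// Hypothetical helper: rebuild request header names from a $_SERVER-style array.
// PHP stores most request headers under keys such as HTTP_ACCEPT_LANGUAGE.
function headers_from_server(array $server): array
{
    $headers = [];
    foreach ($server as $key => $value) {
        if (strpos($key, 'HTTP_') === 0) {
            // HTTP_ACCEPT_LANGUAGE -> "accept language" -> "Accept-Language"
            $name = str_replace('_', ' ', strtolower(substr($key, 5)));
            $name = str_replace(' ', '-', ucwords($name));
            $headers[$name] = $value;
        }
    }
    return $headers;
}

// Sample data standing in for a real request.
$sample = [
    'HTTP_HOST'            => 'code.tutsplus.com',
    'HTTP_ACCEPT_LANGUAGE' => 'en-us,en;q=0.5',
    'REQUEST_METHOD'       => 'GET', // not a header, so it is skipped
];

print_r(headers_from_server($sample));
```

In a real web request you would pass $_SERVER itself. Note that this sketch only looks at HTTP_ keys, so headers that PHP exposes under other names (such as CONTENT_TYPE and CONTENT_LENGTH) are not included.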
HTTP Request Structure
The first line of the HTTP request is called the request line and consists of 3 parts:
- The "method" indicates what kind of request this is. Most common methods are GET, POST and HEAD.
- The "path" is generally the part of the URL that comes after the host (domain). For example, when requesting "https://code.tutsplus.com/tutorials/other/top-20-mysql-best-practices/" , the path portion is "/tutorials/other/top-20-mysql-best-practices/".
- The "protocol" part contains "HTTP" and the version, which is usually 1.1 in modern browsers.
The remainder of the request contains HTTP headers as Name: Value pairs on each line. These contain various information about the HTTP request and your browser. For example, the User-Agent line provides information on the browser version and the Operating System you are using. Accept-Encoding tells the server if your browser can accept compressed output like gzip.
You may have noticed that the cookie data is also transmitted inside an HTTP header. And if there was a referring URL, that would have been in the header too.
Most of these headers are optional. This HTTP request could have been as small as this:
GET /tutorials/other/top-20-mysql-best-practices/ HTTP/1.1
Host: code.tutsplus.com
And you would still get a valid response from the web server.
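To make the structure concrete, here is a rough sketch that splits such a minimal request into its request line and headers. The helper name parse_request_head() is made up for this example, and it assumes a well-formed request whose lines end in CRLF, as HTTP/1.1 messages do on the wire.

```php
<?php
// Hypothetical helper: split a raw HTTP request head into its parts.
function parse_request_head(string $raw): array
{
    // Lines are separated by CRLF; the first one is the request line.
    $lines = explode("\r\n", trim($raw));
    [$method, $path, $protocol] = explode(' ', array_shift($lines), 3);

    $headers = [];
    foreach ($lines as $line) {
        // Each header is a "Name: Value" pair.
        [$name, $value] = explode(':', $line, 2);
        $headers[trim($name)] = trim($value);
    }

    return compact('method', 'path', 'protocol', 'headers');
}

$raw = "GET /tutorials/other/top-20-mysql-best-practices/ HTTP/1.1\r\n"
     . "Host: code.tutsplus.com\r\n\r\n";

$request = parse_request_head($raw);
echo $request['method'], ' ', $request['path'], ' ', $request['protocol'], "\n";
echo 'Host is ', $request['headers']['Host'], "\n";
```

A real server does far more validation than this; the sketch is only meant to show where the method, path, protocol and headers sit in the text.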
Request Methods
The three most commonly used request methods are GET, POST and HEAD. You're probably already familiar with the first two from writing HTML forms.
GET: Retrieve a Document
This is the main method used for retrieving HTML, images, JavaScript, CSS, etc. Most data that loads in your browser was requested using this method.
For example, when loading a Nettuts+ article, the very first line of the HTTP request looks like so:
GET /tutorials/other/top-20-mysql-best-practices/ HTTP/1.1 ...
Once the HTML loads, the browser will start sending GET requests for images, which may look like this:
GET /wp-content/themes/tuts_theme/images/header_bg_tall.png HTTP/1.1 ...
Web forms can be set to use the method GET. Here is an example.
<form method="GET" action="foo.php">
    First Name: <input type="text" name="first_name" /> <br />
    Last Name: <input type="text" name="last_name" /> <br />
    <input type="submit" name="action" value="Submit" />
</form>
When that form is submitted, the HTTP request begins like this:
GET /foo.php?first_name=John&last_name=Doe&action=Submit HTTP/1.1 ...
You can see that each form input was added into the query string.
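On the receiving side, PHP decodes the query string for you and exposes it as $_GET. The following sketch does the same thing by hand with parse_url() and parse_str(), using the request target from the example; the variable names are only illustrative.

```php
<?php
// The request target from the example request line.
$target = '/foo.php?first_name=John&last_name=Doe&action=Submit';

// parse_url() separates the path from the query string...
$path  = parse_url($target, PHP_URL_PATH);
$query = parse_url($target, PHP_URL_QUERY);

// ...and parse_str() decodes the query string into an array,
// much like PHP does when it fills $_GET.
parse_str($query, $fields);

echo $path, "\n";
print_r($fields);
```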
POST: Send Data to the Server
Even though you can send data to the server using GET and the query string, in many cases POST will be preferable. Sending large amounts of data using GET is not practical and has limitations.
POST requests are most commonly sent by web forms. Let's change the previous form example to a POST method.
<form method="POST" action="foo.php">
    First Name: <input type="text" name="first_name" /> <br />
    Last Name: <input type="text" name="last_name" /> <br />
    <input type="submit" name="action" value="Submit" />
</form>
Submitting that form creates an HTTP request like this:
POST /foo.php HTTP/1.1
Host: localhost
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5 (.NET CLR 3.5.30729)
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://localhost/test.php
Content-Type: application/x-www-form-urlencoded
Content-Length: 43

first_name=John&last_name=Doe&action=Submit
There are three important things to note here:
- The path in the first line is simply /foo.php and there is no query string anymore.
- Content-Type and Content-Length headers have been added, which provide information about the data being sent.
- All the data is now sent after the headers, in the same format as the query string.
POST method requests can also be made via AJAX, applications, cURL, etc. And all file upload forms are required to use the POST method.
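To see where the Content-Length value comes from, here is a small sketch that encodes the same form fields with http_build_query() and assembles a stripped-down version of the request by hand. Building the request as a string is purely for illustration; in practice a browser or an HTTP client library does this for you.

```php
<?php
// Form fields from the example form.
$fields = [
    'first_name' => 'John',
    'last_name'  => 'Doe',
    'action'     => 'Submit',
];

// http_build_query() encodes the fields in the same format as a query string.
$body = http_build_query($fields);

// Assemble a minimal request to show where each piece goes.
$request = "POST /foo.php HTTP/1.1\r\n"
         . "Host: localhost\r\n"
         . "Content-Type: application/x-www-form-urlencoded\r\n"
         . "Content-Length: " . strlen($body) . "\r\n"
         . "\r\n"   // the blank line separates the headers from the data
         . $body;

echo $request, "\n";
```

The body is 43 bytes long, which matches the Content-Length header in the captured request.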
HEAD: Retrieve Header Information
HEAD is identical to GET, except the server does not return the content in the HTTP response. When you send a HEAD request, it means that you are only interested in the response code and the HTTP headers, not the document itself.
With this method the browser can check if a document has been modified, for caching purposes. It can also check if the document exists at all.
For example, if you have a lot of links on your website, you can periodically send HEAD requests to all of them to check for broken links. This will work much faster than using GET.
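A link checker along these lines could be sketched with the cURL extension, which sends a HEAD request when CURLOPT_NOBODY is set. The function names head_status() and is_broken(), the 10 second timeout and the rule for what counts as broken are all assumptions made for this example.

```php
<?php
// Hypothetical helper: send a HEAD request and return the status code.
// Requires the cURL extension; returns 0 if the request fails entirely.
function head_status(string $url): int
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, true);         // HEAD: ask for headers only
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // do not print anything
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // give up after 10 seconds
    curl_exec($ch);
    return (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
}

// Hypothetical rule: a failed request or any 4xx/5xx code counts as broken.
function is_broken(int $status): bool
{
    return $status === 0 || $status >= 400;
}

// Usage with placeholder URLs (uncomment to run against real links):
// foreach (['https://example.com/', 'https://example.com/missing'] as $url) {
//     echo $url, is_broken(head_status($url)) ? " is broken\n" : " is fine\n";
// }
```

Some servers do not handle HEAD well, so a more careful checker might retry with GET before reporting a link as broken.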
HTTP Response Structure
After the browser sends the HTTP request, the server responds with an HTTP response. Excluding the content, it consists of a status line followed by headers, as in the example response shown earlier.
The first piece of data is the protocol. This is again usually HTTP/1.1 on modern servers; the example response shows it as HTTP/1.x.
The next part is the status code followed by a short message. Code 200 means that our GET request was successful and the server will return the contents of the requested document, right after the headers.
We have all seen 404 pages. This number actually comes from the status code part of the HTTP response. If a GET request is made for a path that the server cannot find, it responds with a 404 instead of 200.
The rest of the response contains headers, just like the HTTP request. These values can contain information about the server software, when the page or file was last modified, the MIME type, etc.
Again, most of those headers are actually optional.
HTTP Status Codes
- 200's are used for successful requests.
- 300's are for redirections.
- 400's are used if there was a problem with the request.
- 500's are used if there was a problem with the server.
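The grouping above depends only on the first digit of the code, which a few lines of PHP can illustrate. The function name status_group() and the wording of the labels are made up for this sketch.

```php
<?php
// Hypothetical helper: map a status code to one of the groups listed above.
function status_group(int $code): string
{
    switch (intdiv($code, 100)) {
        case 2:  return 'success';
        case 3:  return 'redirection';
        case 4:  return 'problem with the request';
        case 5:  return 'problem with the server';
        default: return 'other';
    }
}

foreach ([200, 301, 404, 500] as $code) {
    echo $code, ': ', status_group($code), "\n";
}
```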
200 OK
As mentioned before, this status code is sent in response to a successful request.
206 Partial Content
If an application requests only a range of the requested file, the 206 code is returned.
It's most commonly used with download managers that can stop and resume a download, or split the download into pieces.
404 Not Found
When the requested page or file was not found, a 404 response code is sent by the server.
401 Unauthorized
Password-protected web pages send this code. If you don't enter a login correctly, the browser shows an error page instead of the protected content.
Note that this only applies to pages protected by HTTP authentication, which make the browser pop up a login prompt.
403 Forbidden
If you are not allowed to access a page, this code may be sent to your browser. This often happens when you try to open a URL for a folder, that contains no index page. If the server settings do not allow the display of the folder contents, you will get a 403 error.
For example, on my local server I created an images folder. Inside this folder I put an .htaccess file with this line: "Options -Indexes". Now when I try to open http://localhost/images/ I get a 403 Forbidden error page.
There are other ways in which access can be blocked, and 403 can be sent. For example, you can block by IP address, with the help of some htaccess directives.
order allow,deny
deny from 192.168.44.201
deny from 224.39.163.12
deny from 172.16.7.92
allow from all
302 (or 307) Moved Temporarily & 301 Moved Permanently
These two codes are used for redirecting a browser. For example, when you use a URL shortening service, such as bit.ly, that's exactly how they forward the people who click on their links.
Both 302 and 301 are handled very similarly by the browser, but they can have different meanings to search engine spiders. For instance, if your website is down for maintenance, you may redirect to another location using 302. The search engine spider will keep checking your original page later. But if you redirect using 301, it tells the spider that your website has moved to that location permanently. For example https://net.tutsplus.com redirects to https://code.tutsplus.com, which is the new canonical URL.
500 Internal Server Error
This code is usually seen when a web script crashes. Most CGI scripts do not output errors directly to the browser, unlike PHP. If there are any fatal errors, they will just send a 500 status code. The programmer then needs to search the server error logs to find the error messages.
Complete List
You can find the complete list of HTTP status codes with their explanations here.
HTTP Headers in HTTP Requests
Now, we'll review some of the most common HTTP headers found in HTTP requests.
Almost all of these headers can be found in the $_SERVER array in PHP. You can also use the getallheaders() function to retrieve all headers at once.
Host
An HTTP request is sent to a specific IP address. But since most servers are capable of hosting multiple websites under the same IP, they must know which domain name the browser is looking for.
Host: code.tutsplus.com
This is basically the host name, including the domain and the subdomain.
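In PHP, it can be found as $_SERVER['HTTP_HOST']. $_SERVER['SERVER_NAME'] usually holds the same name, but depending on the server configuration it may come from the virtual host settings rather than from the header.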
User-Agent
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5 (.NET CLR 3.5.30729)
This header can carry several pieces of information such as:
- Browser name and version.
- Operating System name and version.
- Default language.
This is how websites can collect certain general information about their surfers' systems. For example, they can detect if the surfer is using a cell phone browser and redirect them to a mobile version of their website which works better with low resolutions.
In PHP, it can be found with: $_SERVER['HTTP_USER_AGENT'].
// the header is optional, so check that it exists first
if (isset($_SERVER['HTTP_USER_AGENT']) && strstr($_SERVER['HTTP_USER_AGENT'], 'MSIE 6')) {
    echo "Please stop using IE6!";
}
Accept-Language
Accept-Language: en-us,en;q=0.5
This header displays the default language setting of the user. If a website has different language versions, it can redirect a new surfer based on this data.
It can carry multiple languages, separated by commas. The first one is the preferred language, and each other listed language can carry a "q" value, which is an estimate of the user's preference for the language (min. 0 max. 1).
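In PHP, it can be found as: $_SERVER["HTTP_ACCEPT_LANGUAGE"].
// the header is optional, so check that it exists first
if (isset($_SERVER['HTTP_ACCEPT_LANGUAGE']) && substr($_SERVER['HTTP_ACCEPT_LANGUAGE'], 0, 2) == 'fr') {
    header('Location: http://french.mydomain.com');
    // stop the script so nothing else is sent after the redirect
    exit;
}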
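Checking only the first two characters ignores the q values. A slightly more thorough approach, sketched below, parses the whole header into language and q pairs and sorts them by preference. The function name parse_accept_language() is an assumption for this example, and the parsing is simplified compared to what the HTTP specification allows.

```php
<?php
// Hypothetical helper: parse an Accept-Language value into
// language => q pairs, sorted from most to least preferred.
function parse_accept_language(string $header): array
{
    $languages = [];
    foreach (explode(',', $header) as $item) {
        $parts = explode(';', trim($item));
        $tag = strtolower(trim($parts[0]));
        if ($tag === '') {
            continue;
        }
        $q = 1.0; // a language without a q value counts as 1
        foreach (array_slice($parts, 1) as $param) {
            $param = trim($param);
            if (strpos($param, 'q=') === 0) {
                $q = (float) substr($param, 2);
            }
        }
        $languages[$tag] = $q;
    }
    arsort($languages); // highest q first
    return $languages;
}

$languages = parse_accept_language('en-us,en;q=0.5');
print_r($languages);
echo 'Preferred: ', array_key_first($languages), "\n";
```

In a web script you would pass $_SERVER['HTTP_ACCEPT_LANGUAGE'] (after checking that it is set) and pick the first entry that your site actually supports. array_key_first() needs PHP 7.3 or later.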
Accept-Encoding
Accept-Encoding: gzip,deflate
Most modern browsers support gzip, and will send this in the header. The web server then can send the HTML output in a compressed format. This can reduce the size by up to 80% to save bandwidth and time.
In PHP, it can be found as: $_SERVER["HTTP_ACCEPT_ENCODING"]. However, when you use the ob_gzhandler() callback function, it will check this value automatically, so you don't need to.
// enables output buffering
// and all output is compressed if the browser supports it
ob_start('ob_gzhandler');
If-Modified-Since
If a web document is already cached in your browser, and you visit it again, your browser can check if the document has been updated by sending this:
If-Modified-Since: Sat, 28 Nov 2009 06:38:19 GMT
If it was not modified since that date, the server will send a "304 Not Modified" response code and no content, and the browser will load the content from the cache.
In PHP, it can be found as: $_SERVER['HTTP_IF_MODIFIED_SINCE'].
// assume $last_modify_time is the last time the output was updated

// did the browser send the If-Modified-Since header?
if (isset($_SERVER['HTTP_IF_MODIFIED_SINCE'])) {
    // if the browser cache matches the modify time
    if ($last_modify_time == strtotime($_SERVER['HTTP_IF_MODIFIED_SINCE'])) {
        // send a 304 header, and no content
        header("HTTP/1.1 304 Not Modified");
        exit;
    }
}
There is also an HTTP header named Etag, which can be used to make sure the cache is current. We'll talk about this shortly.
Cookie
As the name suggests, this sends the cookies stored in your browser for that domain.
Cookie: PHPSESSID=r2t5uvjq435r4q7ib3vtdjq120; foo=bar
These are name=value pairs separated by semicolons. Cookies can also contain the session id.
In PHP, individual cookies can be accessed with the $_COOKIE array. You can directly access the session variables using the $_SESSION array, and if you need the session id, you can use the session_id() function instead of the cookie.
echo $_COOKIE['foo'];
// output: bar
echo $_COOKIE['PHPSESSID'];
// output: r2t5uvjq435r4q7ib3vtdjq120
session_start();
echo session_id();
// output: r2t5uvjq435r4q7ib3vtdjq120
Referer
As the name suggests, this HTTP header contains the referring URL.
For example, if I visit the Envato Tuts+ Code homepage and click on an article link, this header is sent by my browser:
Referer: https://code.tutsplus.com/
In PHP, it can be found as $_SERVER['HTTP_REFERER'].
if (isset($_SERVER['HTTP_REFERER'])) {
    $url_info = parse_url($_SERVER['HTTP_REFERER']);

    // is the surfer coming from Google?
    if ($url_info['host'] == 'www.google.com') {
        parse_str($url_info['query'], $vars);
        echo "You searched on Google for this keyword: " . $vars['q'];
    }
}

// if the referring URL was:
// http://www.google.com/search?source=ig&hl=en&rlz=&=&q=http+headers&aq=f&oq=&aqi=g-p1g9
// the output will be:
// You searched on Google for this keyword: http headers
You may have noticed the word "referrer" is misspelled as "referer". Unfortunately, it made it into the official HTTP specifications like that and stuck.
Authorization
When a web page asks for authorization, the browser opens a login window. When you enter a username and password in this window, the browser sends another HTTP request, but this time it contains this header.
Authorization: Basic bXl1c2VyOm15cGFzcw==
The data inside the header is base64 encoded. For example, base64_decode('bXl1c2VyOm15cGFzcw==') would return 'myuser:mypass'.
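To show what sits behind that header value, here is a short sketch that decodes it and separates the username from the password. The function name parse_basic_auth() is hypothetical; PHP normally does this work for you.

```php
<?php
// Hypothetical helper: extract the username and password from a
// "Basic" Authorization header value. Returns null if it cannot be parsed.
function parse_basic_auth(string $header): ?array
{
    if (stripos($header, 'Basic ') !== 0) {
        return null; // some other authentication scheme
    }
    $decoded = base64_decode(substr($header, 6), true);
    if ($decoded === false || strpos($decoded, ':') === false) {
        return null;
    }
    // Split on the first colon only: the password may contain colons.
    [$user, $password] = explode(':', $decoded, 2);
    return ['user' => $user, 'password' => $password];
}

print_r(parse_basic_auth('Basic bXl1c2VyOm15cGFzcw=='));
```

Keep in mind that base64 is an encoding, not encryption: anyone who can read the request can decode the credentials, so this kind of authentication should be used over HTTPS.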
In PHP, these values can be found as $_SERVER['PHP_AUTH_USER'] and $_SERVER['PHP_AUTH_PW'].
More on this when we talk about the WWW-Authenticate header.
HTTP Headers in HTTP Responses
Now we are going to look at some of the most common HTTP headers found in HTTP responses.
In PHP, you can set response headers using the header() function. PHP already sends certain headers automatically, for loading the content, setting cookies, etc. You can see the headers that are sent, or will be sent, with the headers_list() function. You can check if the headers have been sent already with the headers_sent() function.
Cache-Control
Definition from w3.org:
The Cache-Control general-header field is used to specify directives which MUST be obeyed by all caching mechanisms along the request/response chain.
These "caching mechanisms" include gateways and proxies that your ISP may be using.
Example:
Cache-Control: max-age=3600, public
public means that the response may be cached by anyone. max-age indicates how many seconds the cache is valid for. Allowing your website to be cached can reduce server load and bandwidth, and also improve load times at the browser.
Caching can also be restricted by using the no-cache directive, which makes caches check with the server before reusing a stored response. (The no-store directive forbids storing the response at all.)
Cache-Control: no-cache
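In PHP these values are sent with the header() function. The sketch below only builds the header line, so that it can be tried out anywhere; the function name cache_control_header() and its parameters are assumptions made for this example.

```php
<?php
// Hypothetical helper: build a Cache-Control header line.
// Pass null as $maxAge to emit the no-cache directive instead.
function cache_control_header(?int $maxAge, bool $public = true): string
{
    if ($maxAge === null) {
        return 'Cache-Control: no-cache';
    }
    // "private" limits caching to the user's own browser.
    $scope = $public ? 'public' : 'private';
    return "Cache-Control: max-age={$maxAge}, {$scope}";
}

echo cache_control_header(3600), "\n"; // Cache-Control: max-age=3600, public
echo cache_control_header(null), "\n"; // Cache-Control: no-cache

// In a web script, the line would be sent before any output, for example:
// header(cache_control_header(3600));
```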
For more detailed info, see w3.org.
Content-Type
This header indicates the "MIME type" of the document. The browser then decides how to interpret the contents based on this. For example, an HTML page (or a PHP script with HTML output) may return this:
Content-Type: text/html; charset=UTF-8
text is the type and html is the subtype of the document. The header can also contain more info such as charset.
For a GIF image, this may be sent:
Content-Type: image/gif
The browser can decide to use an external application or browser extension based on the mime type. For example this will cause the Adobe Reader or browser built-in PDF reader to be loaded:
Content-Type: application/pdf
When a file is loaded directly, Apache can usually detect the MIME type of the document and send the appropriate header. Also, most browsers have some amount of fault tolerance and auto-detection of MIME types, in case the headers are wrong or not present.
You can find a list of common mime types here.
In PHP, you can use the finfo_file() function to detect the MIME type of a file.
Content-Disposition
This header instructs the browser to open a file download box, instead of trying to parse the content. Example:
Content-Disposition: attachment; filename="download.zip"
That will cause the browser to offer the file for download as download.zip instead of displaying it.
Note that the appropriate Content-Type header should also be sent along with this:
Content-Type: application/zip
Content-Disposition: attachment; filename="download.zip"
Content-Length
When content is going to be transmitted to the browser, the server can indicate the size of it (in bytes) using this header.
Content-Length: 89123
This is especially useful for file downloads. That's how the browser can determine the progress of the download.
For example, here is a dummy script I wrote, which simulates a large download.
// it's a zip file
header('Content-Type: application/zip');
// 1 million bytes (about 1 megabyte)
header('Content-Length: 1000000');
// load a download dialogue, and save it as download.zip
header('Content-Disposition: attachment; filename="download.zip"');
// 1000 times 1000 bytes of data
for ($i = 0; $i < 1000; $i++) {
    echo str_repeat(".", 1000);
    // sleep to slow down the download
    usleep(50000);
}
The result is that the browser knows the total size and can show the progress of the download.
Now I am going to comment out the Content-Length header:
// it's a zip file
header('Content-Type: application/zip');
// the browser won't know the size
// header('Content-Length: 1000000');
// load a download dialogue, and save it as download.zip
header('Content-Disposition: attachment; filename="download.zip"');
// 1000 times 1000 bytes of data
for ($i = 0; $i < 1000; $i++) {
    echo str_repeat(".", 1000);
    // sleep to slow down the download
    usleep(50000);
}
Now the browser can only tell you how many bytes have been downloaded, but it does not know the total amount, so the progress bar cannot show the progress.
Etag
This is another header that is used for caching purposes. It looks like this:
Etag: "pub1259380237;gz"
The web server may send this header with every document it serves. The value can be based on the last modify date, file size or even the checksum value of a file. The browser then saves this value as it caches the document. Next time the browser requests the same file, it sends this in the HTTP request:
If-None-Match: "pub1259380237;gz"
If the Etag value of the document matches that, the server will send a 304 code instead of 200, and no content. The browser will load the contents from its cache.
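The server side of this exchange can be sketched in a few lines. The helper names make_etag(), etag_matches() and status_for(), and the way the Etag is derived from the modification time and size, are assumptions for this example. Real If-None-Match headers may also list several values, which this simplified comparison ignores.

```php
<?php
// Hypothetical helper: build a quoted Etag value from a file's
// modification time and size.
function make_etag(int $mtime, int $size): string
{
    return '"' . dechex($mtime) . '-' . dechex($size) . '"';
}

// Hypothetical helper: does the If-None-Match request header match the Etag?
function etag_matches(?string $ifNoneMatch, string $etag): bool
{
    return $ifNoneMatch !== null && trim($ifNoneMatch) === $etag;
}

// Decide which status code to answer with.
function status_for(?string $ifNoneMatch, string $etag): int
{
    return etag_matches($ifNoneMatch, $etag) ? 304 : 200;
}

$etag = make_etag(1259380237, 89123);

echo status_for(null, $etag), "\n";         // first visit: no header sent yet
echo status_for($etag, $etag), "\n";        // cached copy is still current
echo status_for('"outdated"', $etag), "\n"; // cached copy is stale
```

In a web script, the incoming value would come from $_SERVER['HTTP_IF_NONE_MATCH'] when it is set. For a 304 you would send the status line with header() and exit without output; otherwise you would send the Etag header followed by the full document.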
Last-Modified
As the name suggests, this header indicates the last modify date of the document, in GMT format:
Last-Modified: Sat, 28 Nov 2009 03:50:37 GMT
$modify_time = filemtime($file);
header("Last-Modified: " . gmdate("D, d M Y H:i:s", $modify_time) . " GMT");
It offers another way for the browser to cache a document. The browser may send this in the HTTP request:
If-Modified-Since: Sat, 28 Nov 2009 06:38:19 GMT
We already talked about this earlier in the If-Modified-Since section.
Location
This header is used for redirections. If the response code is 301 or 302, the server must also send this header. For example, when you go to http://net.tutsplus.com your browser will receive this:
HTTP/1.x 301 Moved Permanently
...
Location: https://code.tutsplus.com/
...
In PHP, you can redirect a surfer like so:
header('Location: https://code.tutsplus.com/');
By default, that will send a 302 response code. If you want to send 301 instead:
header('Location: https://code.tutsplus.com/', true, 301);
Set-Cookie
When a website wants to set or update a cookie in your browser, it will use this header.
Set-Cookie: skin=noskin; path=/; domain=.amazon.com; expires=Sun, 29-Nov-2009 21:42:28 GMT
Set-Cookie: session-id=120-7333518-8165026; path=/; domain=.amazon.com; expires=Sat Feb 27 08:00:00 2010 GMT
Each cookie is sent as a separate header. Note that the cookies set via JavaScript do not go through HTTP headers.
In PHP, you can set cookies using the setcookie() function, and PHP sends the appropriate HTTP headers.
setcookie("TestCookie", "foobar");
Which causes this header to be sent:
Set-Cookie: TestCookie=foobar
If the expiration date is not specified, the cookie is deleted when the browser window is closed.
WWW-Authenticate
A website may send this header to authenticate a user through HTTP. When the browser sees this header, it will open up a login dialogue window.
WWW-Authenticate: Basic realm="Restricted Area"
There is a section in the PHP manual that has code samples on how to do this in PHP.
if (!isset($_SERVER['PHP_AUTH_USER'])) {
    header('WWW-Authenticate: Basic realm="My Realm"');
    header('HTTP/1.0 401 Unauthorized');
    echo 'Text to send if user hits Cancel button';
    exit;
} else {
    echo "<p>Hello {$_SERVER['PHP_AUTH_USER']}.</p>";
    echo "<p>You entered {$_SERVER['PHP_AUTH_PW']} as your password.</p>";
}
Content-Encoding
This header is usually set when the returned content is compressed.
Content-Encoding: gzip
In PHP, if you use the ob_gzhandler() callback function, it will be set automatically for you.
How to Send HTTP Headers
After reading the tutorial up to this point, you should have a good idea of what HTTP headers are and what their different values mean. Some headers are sent and received automatically when you make a request to a server and get a response back.
However, there will be situations where you would want to send your own custom headers besides the ones sent by the client or server.
One of the most common ways of sending your own headers in a request is by using the cURL library in PHP. The library comes with a bunch of functions to handle all your needs. There are four basic steps involved:
- You use curl_init() to start your cURL session. You can pass it the URL which you want to request.
- The curl_setopt() function is used to configure the request according to your needs. This is where you can set your own headers by using the CURLOPT_HTTPHEADER option.
- After you have set all the options, you can execute the request by calling curl_exec().
- Finally, you can close the session by calling the curl_close() function.
Here is a basic example that sends a request to https://code.tutsplus.com/tutorials.
<?php
$ch = curl_init("https://code.tutsplus.com/tutorials");
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
    "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:88.0) Gecko/20100101 Firefox/88.0",
    "Accept-Language: en-US,en;q=0.5"
));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$output = curl_exec($ch);
curl_close($ch);
echo $output;
?>
You can learn more about cURL by reading these two tutorials. They cover all the basics of the library to help you get started.
If you want to send response headers in PHP, then you should use the header() function. Among other things, one of its common uses is redirecting visitors to other pages. This can be done by using the Location header. Here is an example:
<?php
header('Location: https://code.tutsplus.com/tutorials');
// Other PHP or HTML code.
?>
You have to remember to call the header() function before any kind of output, either in HTML or in PHP. Even blank output, such as whitespace before the opening PHP tag, is not permitted. Otherwise, you will get the "Headers already sent" error.
Conclusion
Thanks for reading. I hope this article was a good starting point to learn about HTTP Headers. Please leave your comments and questions below, and I will try to respond as much as I can.
This post has been updated with contributions from Monty Shokeen. Monty is a full-stack developer who also loves to write tutorials, and to learn about new JavaScript libraries.