Nginx file caching. Cache WordPress using Nginx

The faster a site loads, the greater the chance of retaining its visitors. Sites with many images and interactive content that runs scripts in the background are difficult to load quickly. Loading such a site involves a large number of requests for different files. The fewer requests the browser has to send to the server, the faster the site loads.

There are many ways to speed up website loading, and browser caching is one of the most important. This allows the browser to reuse local copies of previously downloaded files. To do this, you need to introduce new HTTP response headers.

The headers module of the Nginx web server will help you with this. This module can add arbitrary headers to the response, but its main role is to define caching headers. This tutorial shows how to use the headers module to configure browser caching.

Requirements
  • Ubuntu 16.04 server.
  • A user with access to the sudo command.
  • Pre-installed Nginx web server.
  • Map module.

1: Create test files

To get started, create some test files in the standard Nginx directory. These files can later be used to check browser caching.

To determine the type of a file being transferred over the network, Nginx does not parse its content (that would be too slow). Instead, it looks at the file extension to determine the MIME type, which indicates the purpose of the file.

Therefore, it does not matter at all what the test files will contain. Just give them the appropriate names and extensions, and Nginx will treat the empty file as an image or stylesheet.
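The extension-based lookup described above can be illustrated with a minimal Python sketch. The table below is a small, hypothetical subset written for this example (it is not Nginx's actual mime.types file), and the fallback type is an assumption:

```python
import os

# A tiny, hypothetical extension-to-MIME table, similar in spirit to the
# mime.types file used by Nginx. Only the extensions from this guide are listed.
MIME_TYPES = {
    ".html": "text/html",
    ".css": "text/css",
    ".js": "application/javascript",
    ".jpg": "image/jpeg",
}


def mime_type_for(path, default="application/octet-stream"):
    """Pick a MIME type from the file extension only; the content is never read."""
    _, ext = os.path.splitext(path)
    return MIME_TYPES.get(ext.lower(), default)


for name in ("test.html", "test.jpg", "test.css", "test.js"):
    print(name, "->", mime_type_for(name))
```

Because only the name is inspected, an empty file called test.jpg is reported as an image, which is why the test files in this step do not need real content.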

In the default Nginx directory, create a test.html file using truncate. As you can see from the extension, this will be an HTML file.

sudo truncate -s 1k /var/www/html/test.html

In the same way, create a few more test files with jpg (image), css (style sheet) and js (JavaScript) extensions:

sudo truncate -s 1k /var/www/html/test.jpg
sudo truncate -s 1k /var/www/html/test.css
sudo truncate -s 1k /var/www/html/test.js

2: Checking Nginx Standard Behavior

By default, all files are cached equally. To verify this, use a test HTML file.

Send a request to test.html from your local Nginx server and view the response headers:

curl -I http://localhost/test.html

This command will return response headers like these:

HTTP/1.1 200 OK
Server: nginx/1.10.0 (Ubuntu)
Date: Sat, 10 Sep 2016 13:12:26 GMT
Content-Type: text/html
Content-Length: 1024
Last-Modified: Sat, 10 Sep 2016 13:11:33 GMT
Connection: keep-alive
ETag: "57d40685-400"
Accept-Ranges: bytes

The ETag header contains a unique identifier for this version of the requested file. If you run the curl command again, you will see exactly the same ETag value.
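The ETag values in this guide look like the file's modification time and size written in hexadecimal. The sketch below rebuilds the value from the Last-Modified and Content-Length headers shown for test.html. The function name is made up for this example, and the format should be treated as an observation about these responses rather than a documented guarantee:

```python
from datetime import datetime, timezone


def static_etag(last_modified, size):
    """Build an ETag of the form "<hex mtime>-<hex size>" (assumed format)."""
    mtime = int(last_modified.timestamp())
    return '"{:x}-{:x}"'.format(mtime, size)


# Values taken from the test.html response headers in this guide:
# Last-Modified: Sat, 10 Sep 2016 13:11:33 GMT, Content-Length: 1024
modified = datetime(2016, 9, 10, 13, 11, 33, tzinfo=timezone.utc)
print(static_etag(modified, 1024))
```

Under this scheme the identifier changes whenever the file is modified or its size changes, which is what makes it usable for cache validation.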

The browser stores the ETag value and sends it back to the server in the If-None-Match request header when it needs to re-request the file (for example, when the page is refreshed).

You can simulate this behavior with the following command:

curl -I -H 'If-None-Match: "57d40685-400"' http://localhost/test.html

Note: Enter your own ETag value instead of 57d40685-400.

The command will now return:

HTTP/1.1 304 Not Modified
Server: nginx/1.10.0 (Ubuntu)
Date: Sat, 10 Sep 2016 13:20:31 GMT
Last-Modified: Sat, 10 Sep 2016 13:11:33 GMT
Connection: keep-alive
ETag: "57d40685-400"

This time Nginx will return 304 Not Modified. The web server will not forward the file again, it will simply tell the browser that it can reuse the previously downloaded file.
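The decision the server makes here can be sketched in a few lines of Python. This is a simplified illustration, not Nginx's implementation: the function name is hypothetical, and a real If-None-Match header may also carry several values or "*", which the sketch ignores:

```python
def respond(current_etag, if_none_match=None):
    """Return (status, body_sent) for a simplified conditional GET."""
    if if_none_match is not None and if_none_match == current_etag:
        # The client's copy is still current: send headers only.
        return 304, False
    # No validator, or a stale one: send the full file.
    return 200, True


print(respond('"57d40685-400"'))                    # first visit
print(respond('"57d40685-400"', '"57d40685-400"'))  # revalidation
```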

This reduces network traffic, but it is not quite enough to achieve high caching performance. The problem with ETag is that the browser must still send a request to the server before it can reuse its cached file. The server responds with a 304 instead of the file, but each such check still costs a full network round trip.

3: Setting up the Cache-Control and Expires headers

In addition to ETag, there are two more response headers to control caching: Cache-Control and Expires. Cache-Control is a newer header that has more features than Expires and is generally more useful in setting up caching.

These headers tell the browser that the requested file can be stored locally for a certain period (including forever) without requesting it again. If these headers are not configured, the browser will be forced to constantly request files from the server and expect a 200 OK or 304 Not Modified response.

These HTTP headers can be configured using the headers module. The headers module is built into Nginx, which means it does not need to be installed.

To use this module, open the default Nginx virtual host file in a text editor:

sudo nano /etc/nginx/sites-available/default

Find the server block:

. . .

#
server {
    listen 80 default_server;

. . .

Place two new sections in the file: one before the server block (setting the duration of caching of files of various types), and the second inside this block (setting caching headers).

. . .
# Default server configuration
#
# Expires map
map $sent_http_content_type $expires {
    default                    off;
    text/html                  epoch;
    text/css                   max;
    application/javascript     max;
    ~image/                    max;
}
server {
    listen 80 default_server;
    listen [::]:80 default_server;
    expires $expires;
. . .

The section before the server block is a new map block that defines the correspondence between the file type and the period of its storage in the cache.

  • off is the default and adds no caching control headers. This is a precaution for content that has no specific caching requirements.
  • text/html is set to epoch. This is a special value that disables caching, as a result of which the browser will always request the current state of the site.
  • text/css (style sheets) and application/javascript (Javascript files) are set to max. This means that the browser will cache these files for as long as possible, significantly reducing the number of requests (given that there are usually a lot of such files).
  • ~image/ is a regular expression that matches every MIME type containing image/ (for example, image/jpeg and image/png). It also has the value max, since sites usually contain a lot of images, just like style sheets. By caching them, the browser will reduce the number of requests.

The expires directive (included in the headers module) configures headers to control caching. It uses the value of the $expires variable specified in the map block, causing the response headers to differ depending on the file type.
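To make the lookup order of the map block easier to follow, here is a rough Python emulation of it. It is only an illustration under simplifying assumptions: the function and variable names are invented, and a Content-Type that carries extra parameters (such as a charset) is not handled:

```python
import re

# Exact string entries from the map block are checked first.
EXACT = {
    "text/html": "epoch",
    "text/css": "max",
    "application/javascript": "max",
}
# Regular-expression entries (prefixed with ~ in the map block) come next.
PATTERNS = [(re.compile(r"image/"), "max")]
DEFAULT = "off"


def expires_for(content_type):
    """Return the value the $expires variable would get for a content type."""
    if content_type in EXACT:
        return EXACT[content_type]
    for pattern, value in PATTERNS:
        if pattern.search(content_type):
            return value
    return DEFAULT


for ctype in ("text/html", "image/png", "application/pdf"):
    print(ctype, "->", expires_for(ctype))
```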

Save and close the file.

To update the settings, restart Nginx:

sudo systemctl restart nginx

4: Testing Browser Caching

Run the same query as at the beginning of the guide:

curl -I http://localhost/test.html

This time the response will be different. You will see two new response headers in the output:

HTTP/1.1 200 OK
Server: nginx/1.10.0 (Ubuntu)
Date: Sat, 10 Sep 2016 13:48:53 GMT
Content-Type: text/html
Content-Length: 1024
Last-Modified: Sat, 10 Sep 2016 13:11:33 GMT
Connection: keep-alive
ETag: "57d40685-400"
Expires: Thu, 01 Jan 1970 00:00:01 GMT
Cache-Control: no-cache
Accept-Ranges: bytes

The Expires header shows a date in the past, and Cache-Control is set to no-cache, which means the browser must constantly request the latest version of the file (using the ETag header).

Request another file:

curl -I http://localhost/test.jpg
HTTP/1.1 200 OK
Server: nginx/1.10.0 (Ubuntu)
Date: Sat, 10 Sep 2016 13:50:41 GMT
Content-Type: image/jpeg
Content-Length: 1024
Last-Modified: Sat, 10 Sep 2016 13:11:36 GMT
Connection: keep-alive
ETag: "57d40688-400"
Expires: Thu, 31 Dec 2037 23:55:55 GMT
Cache-Control: max-age=315360000
Accept-Ranges: bytes

As you can see, the result is different. Expires contains a date in the distant future, and Cache-Control has a max-age value that tells the browser how long it can cache the file (in seconds). In this case, the browser will cache the downloaded file for as long as possible, so future requests for this image will be served from the local cache.
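To get a feel for the max-age value in this response, the short sketch below extracts it from a Cache-Control header and converts it to days. The helper name is hypothetical and the parsing is deliberately minimal:

```python
def max_age_seconds(cache_control):
    """Return the max-age value from a Cache-Control header, or None."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    return None


seconds = max_age_seconds("max-age=315360000")
print(seconds // 86400, "days")  # 86400 seconds in a day
```

The value works out to 3650 days, that is, roughly ten years.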

Try querying the test.js and test.css files; you should get a similar result.

The curl command output shows that browser caching has been successfully configured. Page caching will increase site performance and reduce the number of requests.

Note: This guide offers convenient caching settings that will suit the average website. To improve your site's performance, analyze your site's content and adjust your caching settings based on which files are more abundant on your site.

Caching is the technology or process of creating a copy of data on fast-access storage (a cache). Applied to website building, this can mean creating a static HTML copy of a page, or part of one, that is generated by PHP scripts (or scripts in another language, such as Perl or ASP.NET, depending on what the site's CMS is written in) and saving it on disk, in RAM, or even partially in the browser (we'll look at this in more detail below). When a client (browser) requests the page, instead of it being reassembled by scripts, the browser receives a ready-made copy. This is much more economical in terms of hosting resources, and faster, since transmitting the finished page takes less time (sometimes significantly less) than creating it anew.

Why use caching on your website?
  • To reduce the load on hosting
  • To quickly display site content to the browser

Both arguments, I think, require no comment.

Disadvantages and negative effects of website caching

Oddly enough, website caching also has its downsides. First of all, this applies to sites whose content changes dynamically when interacting with it. Often, these are sites that serve content or part of it using AJAX. In general, AJAX caching is also possible and even necessary, but this is a topic for another discussion and does not concern traditionally used technologies, which will be discussed later.
Also, problems may arise for registered users, for whom a persistent cache may become a problem when interacting with site elements. Here, as a rule, the cache is disabled, or object caching of individual site elements is used: widgets, menus, and the like.

How to set up caching on your website

First, we need to figure out what technologies are traditionally used to cache website content.
All possible methods can be divided into 3 groups: server-side caching, site-side caching, and browser (client) side caching.

Server side caching

Caching with NGINX

Caching with htaccess (Apache)

If you only have access to .htaccess and the production server is Apache only, then you can use techniques like gzip compression and Expires headers to make use of the browser cache.

Enable gzip compression for appropriate MIME file types

AddOutputFilterByType DEFLATE text/html
AddOutputFilterByType DEFLATE text/css
AddOutputFilterByType DEFLATE text/javascript application/javascript application/x-javascript
AddOutputFilterByType DEFLATE text/xml application/xhtml+xml application/rss+xml
AddOutputFilterByType DEFLATE application/json
AddOutputFilterByType DEFLATE application/vnd.ms-fontobject application/x-font-ttf font/opentype image/svg+xml image/x-icon

We enable Expires headers for static files for a period of 1 year (365 days)

ExpiresActive on
ExpiresDefault "access plus 365 days"

Caching with Memcached

Caching with php accelerator

If the site engine is written in PHP, then every time any page of the site is loaded, PHP scripts are executed: the code interpreter reads the scripts written by the programmer, generates bytecode from them that is understandable to the machine, executes it and produces the result. The PHP accelerator eliminates the constant generation of bytecode by caching the compiled code in memory or on disk, thereby increasing performance and reducing the time spent executing PHP. Among the currently supported accelerators there are:

  • Windows Cache Extension for PHP
  • XCache
  • Zend OPcache

PHP versions 5.5 and above already have the Zend OPcache accelerator built in, so to enable the accelerator you just need to update your PHP version.

Site-side caching

As a rule, this refers to the ability of the site’s CMS to create static HTML copies of pages. Most popular engines and frameworks have this capability. Personally, I have worked with Smarty, WordPress, so I can assure you that they do an excellent job. The original WordPress out of the box does not have caching capabilities, which are necessary for any slightly loaded project, but there are many popular plugins for caching:

  • , which is responsible for generating static website pages;
  • Hyper Cache, which essentially works the same as the previous plugin;
  • DB Cache. The essence of the work is caching queries to the database. Also a very useful feature. Can be used in conjunction with the two previous plugins;
  • W3 Total Cache. Saved it for dessert, this is my favorite plugin in WordPress. With it, the site is transformed, turning from a hulking bus into a racing car. Its huge advantage is a huge range of capabilities, such as several caching options (statics, accelerators, Memcached, database queries, object and page caching), code concatenation and minification (merging and compressing CSS, Javascript files, HTML compression by removing spaces ), using CDN and much more.

What can I say: use the right CMS, and high-quality caching will be available almost out of the box.

    Browser (client) side caching, caching headers

    Browser caching is possible because any self-respecting browser allows and encourages it. It is controlled by the HTTP headers that the server sends to the client, namely:

    • Expires;
    • Cache-Control: max-age;
    • Last-Modified;
    • ETag.

    Thanks to them, users who repeatedly visit the site spend very little time loading pages. Caching headers should be applied to all cached static resources: template files, image files, javascript and css files if available, PDF, audio and video, and so on.
    It is recommended to set headers so that static data is stored for at least a week and no more than a year, preferably a year.

    Expires

    The Expires header controls how long the cache remains current, during which the browser can use cached resources without asking the server for a new version of them. It is a strong header, and using it is highly desirable. It is recommended to specify a period from a week to a year in the header. It is better not to specify more than a year, as this violates the RFC recommendations.

    For example, to set Expires to one year (365 days) for all static files in NGINX, the NGINX configuration file must contain the following code:

    location ~* ^.+\.(jpg|jpeg|gif|png|svg|js|css|mp3|ogg|mpe?g|avi|zip|gz|bz2?|rar|swf)$ {
        expires 365d;
    }

    Cache-Control: max-age;

    Cache-Control: max-age does the same thing.
    Using Expires is preferable to Cache-Control because it is more widely supported. However, if Expires and Cache-Control: max-age are both present in the headers, Cache-Control takes priority.
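The priority rule between the two headers can be expressed as a small Python sketch. It is a simplified model of what a cache does, with a hypothetical function name and made-up sample values; real caches also consider other directives (no-cache, no-store, s-maxage and so on):

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers, response_time):
    """Seconds a response may be reused; max-age wins over Expires."""
    for directive in headers.get("Cache-Control", "").split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    expires = headers.get("Expires")
    if expires:
        delta = parsedate_to_datetime(expires) - response_time
        return max(0, int(delta.total_seconds()))
    return None  # no explicit lifetime given


# Sample values invented for this illustration.
received = datetime(2016, 9, 10, 13, 0, 0, tzinfo=timezone.utc)
both = {"Cache-Control": "max-age=60",
        "Expires": "Sun, 11 Sep 2016 13:00:00 GMT"}
only_expires = {"Expires": "Sun, 11 Sep 2016 13:00:00 GMT"}

print(freshness_lifetime(both, received))          # max-age is used
print(freshness_lifetime(only_expires, received))  # falls back to Expires
```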

    In NGINX, Cache-Control is enabled in the same way as Expires, with the expires 365d; directive.

    Last-Modified and ETag

    These headers work on the principle of digital fingerprints. This means that each URL in the cache will have its own unique id. Last-Modified creates it based on the last modification date. The ETag header uses any unique resource identifier (most often a file version or a content hash). Last-Modified is a "weak" header because the browser uses heuristics to determine whether to request the element from the cache.

    In NGINX, ETag and Last-Modified are enabled by default for static files. For dynamic pages, it is either better not to specify them, or the script that generates the page should do this, or, best of all, use a properly configured cache, then NGINX will take care of the headers itself. For example, for WordPress, you can use .

    These headers allow the browser to efficiently update cached resources by sending conditional GET requests each time the user explicitly reloads the page. Conditional GET requests do not return a full response unless the resource has changed on the server, so they have lower latency than full requests, which reduces hosting load and response time.
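For Last-Modified, the conditional check boils down to a date comparison. The following Python sketch shows the idea under simplifying assumptions (hypothetical function name, well-formed HTTP dates, no ETag involved):

```python
from email.utils import parsedate_to_datetime


def status_for(last_modified, if_modified_since=None):
    """Return 304 if the resource has not changed since the client's copy."""
    if if_modified_since is None:
        return 200  # unconditional request: send the full response
    changed_at = parsedate_to_datetime(last_modified)
    client_copy = parsedate_to_datetime(if_modified_since)
    return 304 if changed_at <= client_copy else 200


resource = "Sat, 10 Sep 2016 13:11:33 GMT"
print(status_for(resource))
print(status_for(resource, "Sat, 10 Sep 2016 13:11:33 GMT"))
print(status_for(resource, "Fri, 09 Sep 2016 08:00:00 GMT"))
```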

    The simultaneous use of Expires and Cache-Control: max-age is redundant, just as the simultaneous use of Last-Modified and ETag is redundant. Use Expires + ETag or Expires + Last-Modified in combination.

    Enable GZIP compression for static files

    Of course, GZIP compression is not directly related to caching as such, however, it greatly saves traffic and increases page loading speed.
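As a rough illustration of why compression saves traffic, the sketch below gzips a repetitive, made-up stylesheet with Python's standard gzip module and compares the sizes. The actual ratio depends entirely on the content, so no specific figure should be read into it:

```python
import gzip

# A made-up, highly repetitive stylesheet; real files compress less evenly.
css = ("body { margin: 0; padding: 0; }\n" * 200).encode("utf-8")
compressed = gzip.compress(css)

print(len(css), "bytes before,", len(compressed), "bytes after")

# Compression is lossless: the browser gets the original bytes back.
restored = gzip.decompress(compressed)
```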

    How to enable GZIP for static files in NGINX

    server {
        ....
        gzip on;
        gzip_disable "msie6";
        gzip_types text/plain text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript application/javascript;
    }

    How to enable GZIP for static files in .htaccess

    To enable gzip compression in .htaccess, you need to insert the following code at the beginning of the file:

    AddOutputFilterByType DEFLATE text/plain
    AddOutputFilterByType DEFLATE text/html
    AddOutputFilterByType DEFLATE text/xml
    AddOutputFilterByType DEFLATE text/css
    AddOutputFilterByType DEFLATE application/xml
    AddOutputFilterByType DEFLATE application/xhtml+xml
    AddOutputFilterByType DEFLATE application/rss+xml
    AddOutputFilterByType DEFLATE application/javascript
    AddOutputFilterByType DEFLATE application/x-javascript

    Nginx includes a FastCGI module that allows you to use directives for caching dynamic content in the PHP interface. FastCGI eliminates the need to find additional page caching solutions (such as reverse proxies or custom application plugins). Content can also be excluded from caching based on the request method, URL, cookies, or any other server variable.

    Enabling FastCGI Caching

    To follow this guide, you must first have Nginx and PHP-FPM installed. You also need to edit the virtual host configuration file:

    nano /etc/nginx/sites-enabled/vhost

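    Add the following lines to the top of the file, outside the server { } block:

    fastcgi_cache_path /etc/nginx/cache levels=1:2 keys_zone=MYAPP:100m inactive=60m;
    fastcgi_cache_key "$scheme$request_method$host$request_uri";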

    The fastcgi_cache_path directive specifies the path to the cache (/etc/nginx/cache), the subdirectory levels (levels=1:2), the name and size of the shared memory zone (MYAPP, 100m), and the inactive timer.

    The cache can be placed at any convenient location on the hard drive. The size of the memory zone should not exceed the server's RAM plus swap file size; otherwise, a Cannot allocate memory error will be displayed. If a cached item has not been accessed for the period of time specified by the inactive option (60 minutes in this case), Nginx deletes it.

    The fastcgi_cache_key directive specifies the key from which cache file names are derived. Nginx hashes this key with MD5 and uses the result as the file name; MD5 is used here for hashing, not encryption.
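How a cache key turns into a file path can be sketched in Python. This is an approximation written for illustration: the function name is invented, and it assumes the key "$scheme$request_method$host$request_uri" with levels=1:2 as used in this guide, where the subdirectory names are taken from the end of the MD5 hash:

```python
import hashlib


def cache_file_path(cache_dir, key, levels=(1, 2)):
    """Approximate the on-disk path of a cache entry for a given key."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    parts = []
    end = len(digest)
    for width in levels:
        # Each level uses the next `width` characters from the end of the hash.
        parts.append(digest[end - width:end])
        end -= width
    return "/".join([cache_dir.rstrip("/")] + parts + [digest])


# $scheme + $request_method + $host + $request_uri for a local test request
key = "http" + "GET" + "localhost" + "/time.php"
print(cache_file_path("/etc/nginx/cache", key))
```

With levels=1:2 the result has the shape /etc/nginx/cache/<1 character>/<2 characters>/<32-character hash>, which is why the cache directory contains short, nested subdirectory names.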

    Now we can move on to the location directive, which passes PHP requests to the php5-fpm module. In location ~ \.php$ { }, add the following lines:

    fastcgi_cache MYAPP;
    fastcgi_cache_valid 200 60m;

    The fastcgi_cache directive refers to a memory zone that was already specified in the fastcgi_cache_path directive.

    By default, Nginx keeps cached objects for a period of time specified using one of these headers:

    X-Accel-Expires
    Expires
    Cache-Control.

    The fastcgi_cache_valid directive specifies the default cache lifetime if none of these headers are present. With the setting above, only responses with status code 200 are cached (of course, other status codes can be specified).

    Check your FastCGI settings

    service nginx configtest

    Then reload Nginx if the settings are ok.

    service nginx reload

    At this stage, the vhost file should look like this:

    fastcgi_cache_path /etc/nginx/cache levels=1:2 keys_zone=MYAPP:100m inactive=60m;
    fastcgi_cache_key "$scheme$request_method$host$request_uri";
    server {
        listen 80;
        root /usr/share/nginx/html;
        index index.php index.html index.htm;
        server_name example.com;
        location / {
            try_files $uri $uri/ /index.html;
        }
        location ~ \.php$ {
            try_files $uri =404;
            fastcgi_pass unix:/var/run/php5-fpm.sock;
            fastcgi_index index.php;
            include fastcgi_params;
            fastcgi_cache MYAPP;
            fastcgi_cache_valid 200 60m;
        }
    }

    Now we need to check if caching is working.

    Checking FastCGI caching

    Create a PHP file that outputs the UNIX timestamp.

    /usr/share/nginx/html/time.php

    Add to the file:

    <?php
    echo time();
    ?>

    Then request this file several times via curl or a web browser.

    root@server:~# curl http://localhost/time.php;echo
    1382986152
    root@server:~# curl http://localhost/time.php;echo
    1382986152
    root@server:~# curl http://localhost/time.php;echo
    1382986152

    If caching is done properly, the timestamp of all requests will match (since the response has been cached).
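The behavior observed here (the same timestamp for every request within the validity period) can be modeled with a small time-based cache in Python. This is a conceptual sketch, not how Nginx is implemented; the class name, the fake clock, and the render function standing in for time.php are all assumptions made for the example:

```python
class TtlCache:
    """Keep generated responses for a fixed number of seconds."""

    def __init__(self, ttl_seconds, clock):
        self.ttl = ttl_seconds
        self.clock = clock    # callable returning the current UNIX time
        self.entries = {}     # key -> (stored_at, value)

    def get(self, key, generate):
        now = self.clock()
        entry = self.entries.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]   # still valid: reuse the stored response
        value = generate()    # missing or expired: run the "script" again
        self.entries[key] = (now, value)
        return value


# A fake clock makes the example deterministic.
current_time = [1382986152]
cache = TtlCache(ttl_seconds=60 * 60, clock=lambda: current_time[0])


def render():
    """Stands in for time.php: outputs the current UNIX timestamp."""
    return str(current_time[0])


first = cache.get("/time.php", render)
current_time[0] += 10          # ten seconds later
second = cache.get("/time.php", render)
current_time[0] += 60 * 60     # more than an hour later
third = cache.get("/time.php", render)
print(first, second, third)
```

The first two values are identical because the second request is answered from the cache; the third differs because the entry has outlived its 60-minute lifetime and the page is generated again.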

    To find the cache entry for this request, list the cache directory recursively:

    root@server:~# ls -lR /etc/nginx/cache/
    /etc/nginx/cache/:
    total 0
    drwx------ 3 www-data www-data 60 Oct 28 18:53 e
    /etc/nginx/cache/e:
    total 0
    drwx------ 2 www-data www-data 60 Oct 28 18:53 18
    /etc/nginx/cache/e/18:
    total 4
    -rw------- 1 www-data www-data 117 Oct 28 18:53

    You can also add an X-Cache header, which will indicate that the request was processed from the cache (X-Cache HIT) or directly (X-Cache MISS).

    Above the server { } block, enter:

    add_header X-Cache $upstream_cache_status;

    Restart the Nginx service and issue a verbose request using curl to see the new header.

    root@server:~# curl -v http://localhost/time.php
    * About to connect() to localhost port 80 (#0)
    * Trying 127.0.0.1...
    * connected
    * Connected to localhost (127.0.0.1) port 80 (#0)
    > GET /time.php HTTP/1.1
    > User-Agent: curl/7.26.0
    > Host: localhost
    > Accept: */*
    >
    * HTTP 1.1 or later with persistent connection, pipelining supported
    < HTTP/1.1 200 OK
    < Server: nginx
    < Date: Tue, 29 Oct 2013 11:24:04 GMT
    < Content-Type: text/html
    < Transfer-Encoding: chunked
    < Connection: keep-alive
    < X-Cache: HIT

    To purge a cached page, send a POST request to the purge script (purge.php in this example) with the URL you want to purge.

    curl -d "url=http://www.example.com/time.php" http://localhost/purge.php

    The script will output true or false depending on whether the cache was cleared or not. Be sure to exclude this script from caching, and also do not forget to restrict access to it.

    Client-side data caching is the ability to configure a one-time download of data of a certain type and then keep it in the client's memory. Browser caching configured in nginx or another server reduces the number of requests from the client machine and, as a result, the load, and also increases the loading speed of sites.

    That is, the client accesses a site page, the server processes the request, and the generated page is sent to the client along with a certain header. The browser stores the information locally and reuses it when the page is requested again.

    Images, CSS styles, and JavaScript are cached. Nginx browser caching is implemented by adding the Cache-Control header.

    The headers carry service information from the server to the client browser, from which the browser learns when it needs to save data of a certain type and how long to keep it in memory.

    Nginx Browser Caching

    In the Nginx configuration file, JS/CSS caching is enabled as follows (other extensions have been added - in practice it is better to cache them all):

    server {

        location ~* \.(jpg|jpeg|gif|png|ico|css|bmp|swf|js|html|txt)$ {
            expires max;
            root /home/website/example.com/;
        }

    }

    expires max means that the TTL is set to infinity and if the files on the server are changed, the client will never know about it since a repeat request will not be sent.

    expires (this header will be discussed below) determines when the browser will update the cache; the value is set in seconds.

    Usually, the expires max value is set in the server config, then in the application, when connecting css and js files, their versions are determined, which should change every time the content is updated.

    Specifying application-level caching headers

    The browser in this case will perceive each new version as a newly added file and will cache it.
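One common way to make the version change whenever the content changes is to derive it from the file's content. The Python sketch below shows that idea; the function name, the v query parameter, and the sample paths are assumptions for illustration, not something prescribed by Nginx or by this guide:

```python
import hashlib


def versioned_url(url, content):
    """Append a short content-derived version so changed files get a new URL."""
    version = hashlib.md5(content).hexdigest()[:8]
    separator = "&" if "?" in url else "?"
    return f"{url}{separator}v={version}"


old = versioned_url("/css/style.css", b"body { color: black; }")
new = versioned_url("/css/style.css", b"body { color: navy; }")
print(old)
print(new)
```

As long as the file is unchanged, the URL stays the same and the browser keeps using its cached copy; after an edit the URL differs, so the long expires max lifetime no longer hides the update.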

    Along with Cache-Control, the Expires header is often specified. It sets the date and time when the browser will discard the existing cache; the next time the user visits, the updated data will be loaded into the cache again.

    The additional HTTP Expires header specifies the date and time when the browser should update the cache (the headers can be used together; Expires has lower priority when both headers are used):

    Both headers can be set in application-level code.
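As a language-neutral illustration of what such application code has to produce, here is a minimal Python sketch that builds both header values for a given lifetime. The function name is hypothetical, and the current time is passed in explicitly so the result is reproducible:

```python
from email.utils import formatdate


def caching_headers(max_age, now):
    """Build matching Cache-Control and Expires values.

    max_age is a lifetime in seconds; now is the current UNIX time.
    """
    return {
        "Cache-Control": f"max-age={max_age}",
        # Expires must be an HTTP date in GMT.
        "Expires": formatdate(now + max_age, usegmt=True),
    }


# now=0 (the UNIX epoch) is used only to keep the example deterministic;
# real code would pass time.time().
headers = caching_headers(max_age=3600, now=0)
for name, value in headers.items():
    print(f"{name}: {value}")
```

Keeping both values derived from the same lifetime avoids the situation where the two headers disagree about when the cached copy expires.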

    Enabling caching in PHP

    Most web projects are written in PHP; in PHP, the Cache-Control and Expires HTTP headers are set as follows: