Nginx – Log Analysis, Monitoring, and Automation

If you are a web hosting provider, setting up web servers is such a repetitive task that you might want to automate the whole process of creating and configuring a website. However, if you have only a few websites to manage, setting up a web server and hosting your application will often be relatively straightforward. Once you have set up the web server, configuration changes will be rare and made only as needed. A typical web administrator spends far more time maintaining the web farm than configuring it. This chapter focuses on maintaining the web server. You will learn about log gathering, analysis, monitoring, and automation.

Error Log

While processing requests, it is possible that a request cannot be honored, for any of several reasons. To a visitor, the resulting error messages can look pretty generic. Most browsers have a built-in error message template that is used regardless of the web server that handled the request. So, if your (or any) web server responds with status code 404, the browser may show its own built-in error message. This is done for consistency, so that even a layman can understand what the error message means.

As an administrator, though, the generic error message doesn’t help much. If you find errors while browsing your website, you will have to troubleshoot using the error logs. Nginx writes information about the issues it encountered while processing a request to an error log. The logging mechanism has various levels, which ensure that you log only as much as needed on a regular basis. Keep in mind that logging has a cost: it uses resources.

It is a good idea to log at a higher level in production and at lower levels during development. Take a look at Figure 9-1 to understand the levels. Emergency (emerg) is the highest level, and such messages should never be ignored. Most emerg-level errors will cause service issues, and the Nginx service may not even start. Alert (alert) is the second-highest level, followed by critical (crit), error, warning (warn), notice, information (info), and debugging (debug).

Figure 9-1. Error levels in Nginx

If you set the error logging level to warn, you are telling Nginx to log any message that has a level of warning or above. In essence, warn, error, crit, alert, and emerg messages will all be logged if you choose warn as the error logging level.
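The threshold rule can be sketched in a few lines of Python. This only illustrates the ordering of the levels; it is not how Nginx implements it, and the function name levels_logged is made up for this example.

```python
# Nginx error levels, ordered from most to least severe.
LEVELS = ["emerg", "alert", "crit", "error", "warn", "notice", "info", "debug"]


def levels_logged(threshold):
    """Return the levels that are written when error_log is set to `threshold`."""
    # Everything at the threshold's severity or higher is logged.
    return LEVELS[: LEVELS.index(threshold) + 1]


print(levels_logged("warn"))
```

Calling levels_logged("info") adds notice and info to the list, which is why the reload walkthrough that follows needs the info level to show [notice] entries.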

To configure the error log, use the error_log directive in the main, http, mail, stream, server, or location block as follows:

error_log  /var/log/nginx/error.log info;

If you restart the Nginx service now, you should be able to see logs in /var/log/nginx/error.log.

Try the following steps to learn about the entries that get logged while a configuration is reloaded (sudo nginx -s reload).

  1. Edit your Nginx configuration file and ensure that the error_log directive is set to info level.

  2. Read the log file using sudo tail /var/log/nginx/error.log.

  3. Note the time stamp so that you can compare the additional logs that have appeared after you have entered the following command:

    sudo nginx -s reload
  4. Read the log again, using sudo tail /var/log/nginx/error.log -n 50:

    [notice] 13300#0: signal process started
    [notice] 13281#0: signal 1 (SIGHUP) received, reconfiguring
    [notice] 13281#0: reconfiguring
    [notice] 13281#0: using the "epoll" event method
    [notice] 13281#0: start worker processes
    [notice] 13281#0: start worker process 13301
    [notice] 13285#0: gracefully shutting down
    [notice] 13285#0: exiting
    [notice] 13285#0: exit
    [notice] 13281#0: signal 17 (SIGCHLD) received
    [notice] 13281#0: worker process 13285 exited with code 0
    [notice] 13281#0: signal 29 (SIGIO) received

Notice that each log entry is marked with its log level, [notice], and that Nginx emits a series of lines telling you exactly what it has done. Try running different commands, and check the logs again to see various entries being logged. It is a good way of learning and understanding the underlying concepts.

Note

By default, you can use all levels except debug. For the debug level to work, you must use --with-debug while building Nginx. Refer to chapter 2 for more information about setting up Nginx using different switches. You can use nginx -V to determine whether your Nginx binary was built with debug support.

The configuration file can have multiple error_log directives, and the one declared at the lowest level of hierarchy overrides the configuration on the higher level. If there are multiple error_log directives at a level, the logs are written to all specified log files.

Access Log

The access log is another log that Nginx creates while serving requests. Where error logs contain service-related information, access logs contain information about client requests. Every request is logged right after it is processed.

  1. Start by typing nginx -V to view the paths of the access and error logs; look for the values of --error-log-path (for error logs) and --http-log-path (for access logs):

    $ nginx -V
    nginx version: nginx/1.8.1
    built by gcc 4.8.3 20140911 (Red Hat 4.8.3-9) (GCC)
    built with OpenSSL 1.0.1e-fips 11 Feb 2013
    TLS SNI support enabled
    configure arguments: --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --user=nginx --group=nginx --with-http_ssl_module --with-http_realip_module --with-http_addition_module --with-http_sub_module --with-http_dav_module --with-http_flv_module --with-http_mp4_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_random_index_module --with-http_secure_link_module --with-http_stub_status_module --with-http_auth_request_module --with-mail --with-mail_ssl_module --with-file-aio --with-ipv6 --with-http_spdy_module --with-cc-opt='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic'
  2. Open nginx.conf file, and remove log_format related lines if any.

  3. Reload the configuration and execute curl http://localhost.

  4. View the access logs, by using the following command:

    $ sudo tail /var/log/nginx/access.log
    127.0.0.1 - - [actual_request_time] "GET / HTTP/1.1" 200 17 "-" "curl/7.29.0"
  5. This is the default format, referred to as the combined format, which is used automatically if you do not specify a log format. The combined format contains $remote_addr, $remote_user, [$time_local], $request, $status, $body_bytes_sent, $http_referer, and $http_user_agent. In the previous output you can see that the request is a GET request for the root (/) location and was made using curl.
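To make the field layout concrete, here is a minimal Python sketch that splits one combined-format line into named fields. The regular expression and the parse_line helper are assumptions written for this illustration, not part of Nginx, and the sample line is the one from the listing with its placeholder time stamp.

```python
import re

# Pattern for the combined format:
# $remote_addr - $remote_user [$time_local] "$request"
# $status $body_bytes_sent "$http_referer" "$http_user_agent"
COMBINED = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) \[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<body_bytes_sent>\d+) '
    r'"(?P<http_referer>[^"]*)" "(?P<http_user_agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None


sample = '127.0.0.1 - - [actual_request_time] "GET / HTTP/1.1" 200 17 "-" "curl/7.29.0"'
fields = parse_line(sample)
print(fields["request"], fields["status"], fields["http_user_agent"])
```

If you define your own log_format, the pattern has to be adjusted to match it; the sketch only covers the default layout.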

If you need to, you can change the format using log_format and specify more or fewer variables based on your requirements. A list of all variables in Nginx can be viewed at http://nginx.org/en/docs/varindex.html. The following snippet shows how you can declare a log_format as per your need, followed by the access_log directive. Note that the log_format has a name, main, and this is the name provided in the access_log directive in addition to the file name.

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

Some of the important variables used in log_format are as follows:

  • $bytes_sent: Total number of bytes sent to a client.

  • $request_time: Total time taken to process a request, in seconds with millisecond resolution. It covers the time elapsed from when the first bytes were read from the client until the log entry is written after the last bytes were sent to the client. If a request is taking a long time, you should try to figure out the root cause.

  • $status: Contains the status code of the response. It is important that you scan your log files periodically to check whether the requests are being served as per your application design. 5xx errors are the ones that should be fixed as soon as possible, since they imply that your server (or application) was not able to handle the request gracefully. In general:

    • 2xx means success

    • 3xx means redirection

    • 4xx means errors due to client

    • 5xx means errors due to the server
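The grouping above is easy to apply when scanning a log. The following Python sketch counts responses per status class; the status codes in the example are hypothetical, and the helper names are chosen only for this illustration.

```python
from collections import Counter


def status_class(status):
    """Map a status code such as 404 to its class label, e.g. '4xx'."""
    return f"{int(status) // 100}xx"


def summarize(statuses):
    """Count how many responses fall into each status class."""
    return Counter(status_class(s) for s in statuses)


# Hypothetical status codes pulled from an access log.
summary = summarize([200, 200, 301, 404, 502, 500])
print(dict(summary))
```

A rising 5xx count in such a summary is the signal to look at the error log first.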

What to Log?

The logging configuration is so simple that it is easy to goof up. It might seem productive to log as much as possible, but in production scenarios where thousands of connections are handled every second, logging more can slow your web server down. This slowness is not because of the actual writing process, but more because of the evaluation of variables. Bear in mind that the variables in Nginx are evaluated at runtime for every request.

Your application, budget, and other requirements determine what to log and what not to. For example, if bandwidth is costly and of concern to you, it would be prudent to log the compression ratio (the $gzip_ratio variable) so that you can analyze it later.

In simple words, every field that is logged using log_format will have a cost and you should think about how you plan to consume the log in future. If you can’t imagine a reason for logging a counter, you should not log it to begin with. This will save you both disk space and processing time.

Because you can configure logging in http, server and location blocks, you can fine tune it at various levels as per your requirement.

A common mistake is to log at a much lower level than necessary. If you are running an application in production after thorough testing, you might want to log errors at a higher level, like error or crit. That way, you can safely ignore the logs at lower levels. During the development and troubleshooting process, it makes a lot of sense to keep the logging level low, so that you can minutely analyze the logs.

To summarize, be wise while logging and don’t log for logging’s sake.

Log Buffers

You can buffer the logs in memory before they are written to the disk. The access_log directive allows you to set the buffer size. If the logs are buffered, data is written to the file when one of the following conditions is true:

  • The next line doesn’t fit the buffer.

  • The buffered data is older than the flush parameter. (The flush parameter of the access_log directive specifies how frequently the logs should be flushed to the disk.)

  • The worker process is shutting down or reopening the log files.
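The three conditions can be modeled with a toy Python class. This is a simplified sketch, not Nginx's implementation: the BufferedLog class and its attributes are invented for the illustration, the "disk" is just a list, and the age check runs only when a new line arrives, whereas a real server would rely on a timer.

```python
class BufferedLog:
    """Toy model of a buffered log; writes are collected in `flushed`."""

    def __init__(self, size, flush_after):
        self.size = size                # buffer capacity in bytes
        self.flush_after = flush_after  # maximum age of buffered data, in seconds
        self.buffer = []
        self.used = 0
        self.oldest = None              # time the oldest buffered line arrived
        self.flushed = []               # stands in for the file on disk

    def flush(self):
        self.flushed.extend(self.buffer)
        self.buffer, self.used, self.oldest = [], 0, None

    def write(self, line, now):
        # Condition 2: buffered data is older than the flush interval.
        if self.oldest is not None and now - self.oldest > self.flush_after:
            self.flush()
        # Condition 1: the next line does not fit into the buffer.
        if self.used + len(line) > self.size:
            self.flush()
        self.buffer.append(line)
        self.used += len(line)
        if self.oldest is None:
            self.oldest = now

    def close(self):
        # Condition 3: the worker shuts down or reopens its log files.
        self.flush()


log = BufferedLog(size=20, flush_after=5)
log.write("aaaaaaaaaa", now=0)   # 10 bytes, stays in the buffer
log.write("bbbbbbbbbb", now=1)   # 20 bytes in total, still fits
log.write("cccccccccc", now=2)   # does not fit, so the first two lines are flushed
print(log.flushed)
```

The trade-off the model makes visible is that anything still sitting in the buffer has not reached the disk yet, so a larger buffer means fewer writes but a longer delay before entries show up in the file.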

Conditional Log

If the traffic to your website is huge, you may not want to log the successes at all. It is possible in Nginx to do conditional logging. Once enabled, the request will not be logged if the condition evaluates to 0 or an empty string. The following configuration snippet is pretty smart if you think about it.

map $status $loggable {
        ~^[23]   0;
        default 1;
}
access_log /var/log/nginx/access.log  combined if=$loggable;

First, the map directive creates a variable whose value depends on another variable. In this case, $status is the input and $loggable is the output. The regular expression matches $status and checks whether the status starts with 2 or 3 (2xx or 3xx), in which case $loggable is set to 0. Otherwise, $loggable is set to 1. In the access_log directive, the $loggable variable is evaluated by the if parameter. If the value is 0, the request is not logged; otherwise it is.
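The same decision can be written out in Python to check your understanding of the snippet. This is only a model of the logic described above; the function names are made up for the example.

```python
import re


def loggable(status):
    """Mimic the map block: "0" for 2xx/3xx responses, "1" for everything else."""
    return "0" if re.match(r"^[23]", str(status)) else "1"


def should_log(status):
    # Like the if= parameter: skip the entry when the value is "0" or empty.
    return loggable(status) not in ("0", "")


for code in (200, 304, 404, 502):
    print(code, should_log(code))
```

Only the 404 and 502 requests would reach the access log, which is the point of the configuration: successes are dropped and problems are kept.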

Log Compression

The access_log directive provides a very neat way to compress the logs. It is as easy as appending another parameter to the directive like so:

access_log /log_path/log.gz combined gzip=1 flush=10m;

In this case, the logs are compressed before being written to the file. gzip=1 sets the compression level to 1, which is the fastest. You can raise the level to a maximum of 9, which gives the best compression but is the slowest. Generally, the higher the compression level, the more CPU Nginx consumes. The compressed log is written to the path given in the directive and can be viewed with zcat like so:

sudo zcat /var/log/nginx/access.log-xxxxxx.gz

If the file is large, you can pipe the output to more or less as follows:

sudo zcat /var/log/nginx/access.log-xxxxxx.gz | more
sudo zcat /var/log/nginx/access.log-xxxxxx.gz | less
Tip

more is an older command that allows only forward scrolling and is available on most platforms. less, on the other hand, is a newer command with a lot of functionality, including backward scrolling. Read more about them in the man pages.
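If you prefer to process compressed logs in a script rather than with zcat, Python's standard gzip module can read them directly. The following is a minimal sketch; read_gzip_log is a hypothetical helper, and the small demo file is created on the fly so that the example runs without a real log.

```python
import gzip
import os
import tempfile


def read_gzip_log(path):
    """Yield the lines of a gzip-compressed log, one at a time."""
    # Mode "rt" decompresses and decodes to text on the fly.
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            yield line.rstrip("\n")


# Build a tiny compressed file so the sketch runs without a real log.
demo_path = os.path.join(tempfile.mkdtemp(), "access.log.gz")
with gzip.open(demo_path, "wt", encoding="utf-8") as handle:
    handle.write("first entry\nsecond entry\n")

print(list(read_gzip_log(demo_path)))
```

Because the function yields one line at a time, it does not need to hold a large log in memory.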

Syslog

If you have multiple servers emitting logs, it can be quite painful to individually log in on every server and analyze the logs. Syslog is a widely accepted utility that provides a way for networked servers to send event messages to a central logging server. Nginx can take advantage of syslog by using the syslog: prefix in the error_log and access_log directives.

In the following example, the access log is written to a syslog server using an IPv6 address on port 8080. The entries are tagged with the text nginx_fe1 so that every line has this text. This will help you isolate logs from various servers even though they are stored in the same file.

access_log syslog:server=[xxx:xx::1]:8080,facility=local7,tag=nginx_fe1,severity=info;

The syslog protocol uses the numerical facility codes listed below (the default value is local7, as used in the previous configuration block):

 0 kernel messages
 1 user-level messages
 2 mail system
 3 system daemons
 4 security/authorization messages
 5 messages generated internally by syslogd
 6 line printer subsystem
 7 network news subsystem
 8 UUCP subsystem
 9 clock daemon
10 security/authorization messages
11 FTP daemon
12 NTP subsystem
13 log audit
14 log alert
15 clock daemon
16 local use 0 (local0)    
17 local use 1 (local1)    
18 local use 2 (local2)    
19 local use 3 (local3)    
20 local use 4 (local4)    
21 local use 5 (local5)    
22 local use 6 (local6)    
23 local use 7 (local7)    
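On the wire, the facility is not sent on its own: the syslog protocol combines it with the severity into a single priority value (facility × 8 + severity) that prefixes each message. The Python sketch below computes that value for the facility and severity used in the access_log example; the dictionaries list only a few facilities for brevity, and the names follow common syslog keywords.

```python
# Facility and severity codes as defined by the syslog protocol (RFC 5424).
FACILITIES = {"kern": 0, "user": 1, "mail": 2, "daemon": 3, "local0": 16, "local7": 23}
SEVERITIES = {"emerg": 0, "alert": 1, "crit": 2, "err": 3,
              "warning": 4, "notice": 5, "info": 6, "debug": 7}


def priority(facility, severity):
    """Return the PRI value that prefixes a syslog message."""
    return FACILITIES[facility] * 8 + SEVERITIES[severity]


# facility=local7 and severity=info, as in the access_log example.
print(f"<{priority('local7', 'info')}>")
```

Knowing this helps when you inspect raw syslog traffic: a message from the example configuration starts with the priority computed here.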

Analyze Logs

Analyzing a log is a time-consuming process, and there are multiple tools and utilities that can help you do it faster. The simplest way is to cat the log file, but it is also the least effective, simply because log files are usually big. There are various free and commercial tools available for this job. You will learn about some of the free tools in this section.

tail

You have already been using tail in this book, so you should be familiar with the basic syntax. By default, the tail command prints the last 10 lines of the file to standard output. There are some interesting parameters that you should be aware of:

  • tail /access.log -c 500: will read the last 500 bytes of the log file

  • tail /access.log -n 50: will read the last 50 lines of the log file

  • tail /access.log -f: will keep listening to the log file and emit the latest lines as they appear. This comes in handy when you want to troubleshoot an issue in production. You can run this command before you try to reproduce the issue. Since the command keeps emitting the output as it appears, it will make your troubleshooting experience a lot smoother because you will not have to repeat the command again and again to view the latest entries.
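When you need the same behavior inside a script, the -n option is simple to reproduce. The following Python sketch is one way to do it, assuming a text log; the tail function and the demo file are written for this illustration only.

```python
import os
import tempfile
from collections import deque


def tail(path, lines=10):
    """Return the last `lines` lines of a text file, like `tail -n`."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        # A bounded deque keeps only the most recent lines in memory.
        return [line.rstrip("\n") for line in deque(handle, maxlen=lines)]


# Create a small demo log so the sketch runs without a real access log.
demo_log = os.path.join(tempfile.mkdtemp(), "access.log")
with open(demo_log, "w", encoding="utf-8") as handle:
    handle.writelines(f"entry {n}\n" for n in range(1, 26))

print(tail(demo_log, 3))
```

This version reads the whole file once, which is fine for moderate sizes; the real tail command seeks from the end of the file and is the better choice for very large logs.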

ngxtop

Available as an open source project, ngxtop parses your access log and outputs useful metrics from your web server. It is very similar to top command.

A sample output of top command:

$ top
top - 05:31:06 up 14 days, 19:48,  2 users,  load average: 0.00, 0.01, 0.05
Tasks:  91 total,   2 running,  89 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  1017160 total,   140920 free,   309768 used,   566472 buff/cache
KiB Swap:   839676 total,   839252 free,      424 used.   483480 avail Mem

  PID USER  PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
    1 root  20   0  208216   6948   2540 S  0.0  0.7   0:43.27 systemd
    2 root  20   0       0      0      0 S  0.0  0.0   0:00.14 kthreadd
    3 root  20   0       0      0      0 S  0.0  0.0   0:04.41 ksoftirqd/0
    5 root   0 -20       0      0      0 S  0.0  0.0   0:00.00 kworker/0:0H
    6 root  20   0       0      0      0 S  0.0  0.0   0:02.72 kworker/u2:0
    7 root  rt   0       0      0      0 S  0.0  0.0   0:00.00 migration/0
    8 root  20   0       0      0      0 S  0.0  0.0   0:00.00 rcu_bh
    9 root  20   0       0      0      0 S  0.0  0.0   0:00.00 rcuob/0

If you have been following along in this book so far, you are probably working on CentOS, which doesn’t have pip installed by default. pip is a package management system (just like yum) used to install and manage software packages written in Python. ngxtop is written in Python, so you need pip to install it.

  1. Use the following command to download it:

    sudo curl "https://bootstrap.pypa.io/get-pip.py" -o "get-pip.py" -k
  2. Install it:

    sudo python get-pip.py
  3. Check if it is installed properly by executing pip --help.

  4. You can view the version by executing the following:

    pip -V
    pip 8.1.2 from /usr/lib/python2.7/site-packages (python 2.7)
  5. Now that pip is installed, use it to install ngxtop:

    sudo pip install ngxtop

Time to execute ngxtop!

If you execute it locally, you will see the output that follows in Figure 9-2.

Figure 9-2. Nginx top output

In a production box, you will see a more comprehensive output. To get a feel for it, let it run locally and hit the website using a browser from your host machine (http://127.0.0.1:8006).

Soon, you will start seeing additional information as you can see in Figure 9-3. This can be very helpful in production scenarios if you have long-running requests. You can quickly fire up the ngxtop command and get a gist of what’s going on. Keep in mind though that this utility is designed to run for shorter periods of time just like the top command for troubleshooting and monitoring purposes. You will need to use other software in case you want to do detailed analysis for a longer period of time.

Figure 9-3. Nginx top output with data

Official home page: https://github.com/lebinh/ngxtop

Here is some output taken directly from the project’s home page, to give you a gist of what kind of information you can get using this tool.

  • View top source IPs of clients:

    $ ngxtop top remote_addr
    running for 20 seconds, 3215 records processed: 159.62 req/sec
    top remote_addr
    | remote_addr     |   count |
    |-----------------+---------|
    | 118.173.177.161 |      20 |
    | 110.78.145.3    |      16 |
    | 171.7.153.7     |      16 |
    | 180.183.67.155  |      16 |
    | 183.89.65.9     |      16 |
    | 202.28.182.5    |      16 |
    | 1.47.170.12     |      15 |
    | 119.46.184.2    |      15 |
    | 125.26.135.219  |      15 |
    | 125.26.213.203  |      15 |
  • List 4xx or 5xx responses with HTTP referrer:

    $ ngxtop -i 'status >= 400' print request status http_referer
    running for 2 seconds, 28 records processed: 13.95 req/sec
    
    request, status, http_referer:
    | request   |   status | http_referer   |
    |-----------+----------+----------------|
    | -         |      400 | -              |
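To see what a query such as top remote_addr boils down to, here is a small Python sketch that counts requests per client address. It is not how ngxtop is implemented; it simply assumes that $remote_addr is the first field of each line, as in the combined format, and the sample lines are hypothetical.

```python
from collections import Counter


def top_remote_addr(lines, limit=10):
    """Count requests per client IP, assuming the IP is the first field."""
    counts = Counter(line.split(" ", 1)[0] for line in lines if line.strip())
    return counts.most_common(limit)


# Hypothetical log lines; only the leading $remote_addr matters here.
sample_lines = [
    '10.0.0.1 - - [t] "GET / HTTP/1.1" 200 17 "-" "curl/7.29.0"',
    '10.0.0.2 - - [t] "GET /a HTTP/1.1" 404 0 "-" "curl/7.29.0"',
    '10.0.0.1 - - [t] "GET /b HTTP/1.1" 200 5 "-" "curl/7.29.0"',
]
for addr, count in top_remote_addr(sample_lines):
    print(f"{addr:<15} {count:>5}")
```

A script like this works on a log file after the fact, whereas ngxtop follows the log live; the two complement each other.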

GoAccess

GoAccess is another web log analyzer, and it is much more functional than ngxtop. It is an open source project that allows you to view logs interactively and runs directly in the terminal. Apart from viewing the logs in the terminal, it also allows you to create HTML reports for the logs. You can read more about it at https://github.com/allinurl/goaccess. The list of features it provides is pretty impressive:

  • General statistics, bandwidth, etc.

  • Time taken to serve the request (useful to track pages that are slowing down your site)

  • Metrics for cumulative, average, and slowest running requests

  • Top visitors

  • Requested files and static files

  • 404 or Not Found

  • Hosts, Reverse DNS, IP Location

  • Operating Systems

  • Browsers and Spiders

  • Referring Sites and URLs

  • Key Phrases

  • Geo Location – Continent/Country/City

  • Visitors Time Distribution

  • HTTP Status Codes

  • Metrics per Virtual Host

  • Ability to output HTML, JSON, and CSV

  • Tailor GoAccess to suit your own color taste/schemes

  • Incremental log processing

  • Support for large datasets and data persistence

  • Support for HTTP/2 and IPv6

  • Output statistics to HTML

It supports nearly all web log formats:

  • Amazon CloudFront (Download Distribution)

  • AWS Elastic Load Balancing

  • Combined Log Format (XLF/ELF) Apache | Nginx

  • Common Log Format (CLF) Apache

  • Google Cloud Storage

  • Apache virtual hosts

  • Squid Native Format

  • W3C format (IIS)

You can view the details in a colored format directly on the terminal or create HTML reports as can be seen in Figures 9-4 and 9-5 respectively.

Figure 9-4. goaccess output in a terminal (courtesy: https://github.com/allinurl/goaccess)

Figure 9-5. goaccess HTML report sample (courtesy: https://goaccess.io/goaccess_html_report.html)

Custom Error Pages

If your visitors request a page that doesn’t exist, by default, a plain looking and boring error message is displayed like the one you see in Figure 9-6.

Figure 9-6. Default error messages

There are other scenarios where the message could be a lot more cryptic. To see it in practice, follow these steps to learn how you can customize the error pages and show much more informative error messages when something goes wrong.

  1. Start by modifying the config file so that you can reproduce errors easily. In the following config, a path is set up (http://localhost/testing) that errors out on purpose, since the path given to fastcgi_pass doesn’t exist.

    server {
        listen       80;
        server_name  localhost;
    
        location / {
            root   /usr/share/nginx/html;
            index  index.html index.htm;
        }
    
        location /testing {
            fastcgi_pass unix:ooops;
        }
    }
  2. Reload the configuration and execute curl localhost/testing.

    $ curl localhost/testing
    <html>
    <head><title>502 Bad Gateway</title></head>
    <body bgcolor="white">
    <center><h1>502 Bad Gateway</h1></center>
    <hr><center>nginx/1.8.1</center>
    </body>
    </html>
  3. If this error were shown to your visitors, they would hardly understand what it means. Try another request with curl localhost/nopage, and you will see a similar message with a different code.

  4. In the following configuration, the error_page directive has been used. Each directive maps one or more error codes to the route that should handle them. For instance, when a page cannot be found, a 404 error is raised and handled by the /custom_4xx.html route. Similarly, the listed server-related errors (500, 502, 503, and 504) are handled by the /custom_5xx.html route.

    server {
        listen       80;
        server_name  localhost;
    
        location / {
            root   /usr/share/nginx/html;
            index  index.html index.htm;
        }
    
        location /testing {
            fastcgi_pass unix:ooops;
        }
    
        error_page  404              /custom_4xx.html;
        error_page 500 502 503 504 /custom_5xx.html;
    
        location = /custom_4xx.html {
            root /usr/share/nginx/html;
        }
    
        location = /custom_5xx.html {
            root /usr/share/nginx/html;
        }
    }
  5. Once this configuration is saved, you must create the files custom_4xx.html and custom_5xx.html in the root specified (the files must have read permissions for the Nginx process account). Here is what the text looks like:

    $ sudo cat /usr/share/nginx/html/custom_4xx.html
    <h1>Sorry, the page could not be found</h1>
    <p>Please ensure that you have typed the address correctly.</p>
    $ sudo cat /usr/share/nginx/html/custom_5xx.html
    <h1>Sorry, we couldn't process the request</h1>
    <p>There seems to be an error. Please report it at contact@oursite.com if you continue to see this.</p>
  6. The error details are still bland and you can definitely customize it further. Be creative! With the error pages in place, try to repeat the following commands:

    $ curl localhost/foo
    <h1>Sorry, the page could not be found</h1>
    <p>Please ensure that you have typed the address correctly.</p>
    $ curl localhost/testing
    <h1>Sorry, we couldn't process the request</h1>
    <p>There seems to be an error. Please report it at contact@oursite.com if you continue to see this.</p>
  7. This is pretty great. But there is a small catch. Try hitting the following URI:

    $ curl localhost/custom_5xx.html
    <h1>Sorry, we couldn't process the request</h1>
    <p>There seems to be an error. Please report it at contact@oursite.com if you continue to see this.</p>
  8. This route now functions as if it were a normal page. This behavior is normally not desired. To prevent these routes from being requested directly, mark these locations with an additional directive called internal, like so:

        location = /custom_4xx.html {
            root /usr/share/nginx/html;
            internal;
        }
    
        location = /custom_5xx.html {
            root /usr/share/nginx/html;
            internal;
        }
  9. Reload the configuration and try again:

    curl localhost/custom_5xx.html
    <h1>Sorry, the page could not be found</h1>
    <p>Please ensure that you have typed the address correctly.</p>
  10. Wait. The page requested was custom_5xx.html, but the result is from custom_4xx.html. Don’t let this confuse you. Because of the internal directive, Nginx refused to return the page custom_5xx.html directly and responded with error 404, telling the visitor that the page doesn’t exist. Since the 404 status code is mapped to custom_4xx.html, you saw the result from that page instead.

Benchmark

If you run a web server, it helps to know how it is performing. Even more, you need to know how much load it can possibly handle under stress. If your website appears on the news headlines for good reasons, you will definitely not want it to crash when all eyes are on it. To put the problem in simpler words, how would you plan the deployment in a predictable way?

Benchmarking tests help you derive conclusions and make business decisions. They should provide you with performance-related numbers from multiple perspectives. A benchmarking exercise comprises various tests that yield results that can later be analyzed. It can be a pretty extensive process, depending on how complex your application is. At the minimum, you should be informed about the following:

  • Average Number of Users: You must find out the average number of users the servers can handle easily. If you expect 1000+ concurrent users, you should ensure that the tests put the appropriate load and the server stays up for a considerable amount of time.

  • Performance Under Load: You must find out how your server behaves under stress. Are there areas of your application that show slowness? If the service goes down, does it come back up automatically? Do the requests hang and not respond at all? What kind of errors do the end users see when there is a lot of load on the server?

  • Hard Limits: These are better known in advance than discovered late in production. For example, if you have an application that deals with file paths, it is better to know well in advance that there is a limit to the path name length. Not knowing the limitations of the hardware or software may become a recipe for disaster. You may never hit the hard limits, but it helps to be aware of them so that you can plan the architecture appropriately. Load testing increases the load until the servers start failing. The numbers you obtain at that point tell you the hard limits of your server (or application), so that you can plan well in advance, should such a need arise.

Apache Benchmark

Apache Benchmark (ab) is a nifty utility used for benchmarking. It is free, open source, and quite powerful. Follow these steps in order to use it:

  1. Install it using yum install httpd-tools (on CentOS, ab is shipped in the httpd-tools package).

  2. It is better to run ab from a different server than the one being tested, although you can run it on your host server as well. A sample command and its output can be seen in the following code listing. (The k parameter tells ab to use keep-alive connections, c sets the number of requests to make concurrently, and n sets the total number of requests. In this case, 90 concurrent keep-alive connections are used for a total of 10000 requests.)

    $ ab -kc 90 -n 10000 http://127.0.0.1:8006/index.htm
    This is ApacheBench, Version 2.3 <$Revision: 1663405 $>
    Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
    Licensed to The Apache Software Foundation, http://www.apache.org/
    
    Benchmarking 127.0.0.1 (be patient)
    Completed 1000 requests
    Completed 2000 requests
    Completed 3000 requests
    Completed 4000 requests
    Completed 5000 requests
    Completed 6000 requests
    Completed 7000 requests
    Completed 8000 requests
    Completed 9000 requests
    Completed 10000 requests
    Finished 10000 requests
    
    Server Software:        nginx/1.8.1
    Server Hostname:        127.0.0.1
    Server Port:            8006
    
    Document Path:          /index.htm
    Document Length:        108 bytes
    
    Concurrency Level:      90
    Time taken for tests:   1.900 seconds
    Complete requests:      10000
    Failed requests:        0
    Non-2xx responses:      10000
    Keep-Alive requests:    9902
    Total transferred:      2829510 bytes
    HTML transferred:       1080000 bytes
    Requests per second:    5262.02 [#/sec] (mean)
    Time per request:       17.104 [ms] (mean)
    Time per request:       0.190 [ms] (mean, across all concurrent requests)
    Transfer rate:          1454.00 [Kbytes/sec] received
    
    Connection Times (ms)
                  min  mean[+/-sd] median   max
    Connect:        0    1  25.1      0    1135
    Processing:     0   15 120.6      1    1471
    Waiting:        0   15 120.6      0    1471
    Total:          0   16 133.6      1    1892
    
    Percentage of the requests served within a certain time (ms)
      50%      1
      66%      1
      75%      1
      80%      1
      90%      1
      95%      1
      98%     35
      99%    781
     100%   1892 (longest request)

Some observations that you should notice:

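  • About 5262 requests were served every second.

  • The Non-2xx responses line shows 10000, which means every response in this run carried a status code outside the 2xx range. Verify that the URL returns the page you expect before you rely on numbers like these.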

  • The output shows connection times split into four areas: connect, processing, waiting, and total.

  • There is no good or bad result, since it is primarily based on your requirements.

  • You should repeat this test with different parameters to see how the results change.

  • It is a good practice to test various pages under different loads.

  • When in doubt, test again.

  • Test, test, test… is the basic mantra when it comes to benchmarks. Test as much as possible and make judicious decisions based on your requirements.
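
Repeating the test with different parameters can be scripted. The following is a minimal sketch, not a definitive harness: the function names extract_rps and run_sweep, and the fixed total of 10000 requests, are assumptions made for illustration. It relies only on the -k, -c, and -n options used in the listing above.

```shell
#!/usr/bin/env bash
# Sketch: repeat an ab run at several concurrency levels and report requests/second.
# The function names and the 10000-request total are illustrative assumptions.

# Read ab output on stdin and print the mean "Requests per second" value.
extract_rps() {
    awk '/Requests per second:/ { print $4 }'
}

# Usage: run_sweep URL CONCURRENCY...
run_sweep() {
    local url="$1"
    shift
    local c rps
    for c in "$@"; do
        rps=$(ab -k -c "$c" -n 10000 "$url" 2>/dev/null | extract_rps)
        # Print "failed" when ab produced no result line (for example, ab is missing).
        printf '%s\t%s\n' "$c" "${rps:-failed}"
    done
}

# Example (same URL as the listing above):
# run_sweep http://127.0.0.1:8006/index.htm 10 50 90
```

Each output line pairs a concurrency level with the measured requests per second, which makes it easier to see where throughput stops improving.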

JMeter

The ab utility simply downloads the file. It is good for testing pages in isolation and tells you how many requests could be served from a page download perspective. You can run the test for static files, images, PHP, and pretty much any URI. However, if the HTML page contains scripts, ab will not be able to tell you how long the page took to render.

Although this book will not cover load testing, it is worth mentioning that JMeter is another fantastic tool that can help you a lot. It is a pure Java application designed to load test functional behavior and measure performance. You can record tests and execute them later with various load parameters. You can learn more about it from http://jmeter.apache.org.

Cloud-Based Benchmarking

During load testing, you will often find that the server is far more powerful than the client, so the client cannot generate as many requests as the server is able to serve. To test such massive web servers and farms, you need equally powerful test servers. With the advent of cloud computing, this has become a lot simpler. Many service providers offer cloud-based testing if your website is public. You schedule a test and the cloud service takes care of the rest. It makes a number of requests to your servers from different locations and returns informative results. http://loader.io is one good example of such a service and has a free option as well. Quite a few cloud-based testing services have come up to make your job easier from a load testing perspective.

Baseline

People often confuse benchmarking with baselining, since they are similar but distinct activities. You can consider baselining as an activity that yields a result that you can refer to at a later point. Let’s assume your server takes 110 seconds to boot up on a regular day. After a patch, you reboot it and it takes much longer, say, 200 seconds. If you hadn’t baselined the server on a regular day and didn’t already know that 110 seconds is normal, it would be difficult for you to say that 200 seconds is longer. The baseline provides that reference point.

In contrast, you benchmark your servers to compare results. For instance, you can use the results from a benchmark on one server to compare it with another server and comment “this server is slower than the other by factor X.”

In essence, baselining is about identifying an approved state, whereas benchmarking is about assessing relative performance.
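
The comparison against a baseline is simple arithmetic. As a minimal sketch, the hypothetical helper compare_to_baseline below (the name and output format are assumptions) divides a current measurement by the stored baseline, using the boot times from the example above.

```shell
# Print how a current measurement compares with a stored baseline.
# Both arguments are numbers in the same unit (for example, boot time in seconds).
compare_to_baseline() {
    awk -v base="$1" -v cur="$2" 'BEGIN {
        if (base <= 0) { print "invalid baseline"; exit 1 }
        printf "%.2fx baseline\n", cur / base
    }'
}

compare_to_baseline 110 200   # prints "1.82x baseline"
```

A ratio close to 1 means the server behaves as it did when the baseline was taken; the 200-second boot in the example is roughly 1.8 times the baseline.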

Monitoring

Even after all the preparation, gathering logs, baselining, and benchmarking, your application might still crash in production. Monitoring a web server is paramount, and you can never monitor enough. A web server is like an aircraft in mid-flight. If you don’t monitor the vitals constantly, major accidents can happen. Aggressive monitoring and alerting mechanisms help you fix issues during the flight. It is impossible to monitor the servers manually, so you should choose your tools wisely.

Nginx PLUS

While configuring load balancing in chapter 8, you have already learned about Nginx PLUS and its monitoring capabilities. The Nginx team has provided a sample configuration that can be downloaded directly using curl. Run the following commands:

curl https://www.nginx.com/resource/conf/status.conf | sudo tee status.conf
sudo mv status.conf /etc/nginx/conf.d/

The commands will create a status.conf file that will work only if you have Nginx PLUS binaries. The status.conf that was downloaded looks as follows and is well commented (for further details refer to https://www.nginx.com/blog/live-activity-monitoring-nginx-plus-3-simple-steps). You will learn about basic authentication in chapter 10:

# This is an example of Live Activity Monitoring (extended status) feature configuration
# Created by NGINX, Inc. for nginx-plus-r6

# Documentation: http://nginx.org/r/status

# In order to enable this configuration please move this file to /etc/nginx/conf.d
# and reload nginx:
# mv /etc/nginx/conf.d/status.conf.example /etc/nginx/conf.d/status.conf
# nginx -s reload

# Note #1: enable status_zone directive for http and tcp servers.
# For more information please see http://nginx.org/r/status_zone

# Note #2: enable zone directive for http and tcp upstreams.
# For more information please see http://nginx.org/r/zone

server {                                                        
        # Status page is enabled on port 8080 by default.
        listen 8080;

        # Status zone allows the status page to display statistics for the whole server block.
        # It should be enabled for every server block in other configuration files.
        status_zone status-page;

        # In case of nginx process listening on multiple IPs you can restrict status page
        # to single IP only
        # listen 10.2.3.4:8080;                                                                              

        # HTTP basic Authentication is enabled by default.
        # You can add users with any htpasswd generator.
        # Command line and online tools are very easy to find.
        # You can also reuse your htpasswd file from Apache web server installation.
        #auth_basic on;
        #auth_basic_user_file /etc/nginx/users;

        # It is recommended to limit the use of status page to admin networks only
        # Uncomment and change the network accordingly.
        #allow 10.0.0.0/8;
        #deny all;

        # NGINX provides a sample HTML status page for easy dashboard view
        root /usr/share/nginx/html;
        location = /status.html { }

        # Standard HTTP features are fully supported with the status page.
        # An example below provides a redirect from "/" to "/status.html"
        location = / {
                return 301 /status.html;
        }

        # Main status location. HTTP features like authentication, access control,
        # header changes, logging are fully supported.
        location /status {
                status;
                status_format json;
        }
}
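
Once the status page is reachable, a script can poll it. The sketch below is an illustration under stated assumptions rather than part of the Nginx PLUS tooling: check_status is a hypothetical helper, and its default URL simply combines the listen 8080 and location /status lines from the configuration above. It uses only standard curl options.

```shell
# Return success only when the status endpoint answers with HTTP 200.
# The default URL is an assumption derived from the configuration above.
check_status() {
    local url="${1:-http://127.0.0.1:8080/status}"
    local code
    # -s: silent, -o /dev/null: discard the body, -w: print only the HTTP status code.
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url")
    [ "$code" = "200" ]
}

# Example:
# check_status || echo "status endpoint is not responding" >&2
```

A check like this can be scheduled and wired to whatever alerting you already use.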

Automation

There are many manual tasks that have to be executed repeatedly (or at a fixed frequency) in a web farm. They say to err is human, and human errors can only be avoided if you can avoid the use of humans! As a web administrator, you wouldn’t want to err by forgetting to back up periodically, archive the logs, or perform any similar mundane but important task. If you know something is important, it is a good idea to automate it.

Let’s say you want to delete log files older than 10 days. To achieve this, you can use the following command:

find /var/log/nginx -type f -mtime +10

The command lists all files that were last modified more than 10 days ago:

/var/log/nginx/access.log-20160306.gz
/var/log/nginx/error.log-20160306.gz
/var/log/nginx/access.log-20160307.gz
/var/log/nginx/error.log-20160307.gz
/var/log/nginx/access.log-20160308.gz
/var/log/nginx/error.log-20160308.gz
/var/log/nginx/access.log-20160309.gz
/var/log/nginx/error.log-20160309.gz                      

If you are satisfied with the output, you can add an -exec parameter to process each file. Hence, to delete these files, you can use the following command:

sudo find /var/log/nginx -type f -mtime +10 -exec rm {} \;

In this command, you find files (-type f) that were last modified more than 10 days ago and remove each of them. You can do additional things, such as wrapping the command in a bash script, if you like. The core idea is to finalize the activity that you would like to perform every day (or at any specific interval, for that matter).
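
The command can be wrapped in a small script. The following is a minimal sketch of what a script such as remove_old_logs.sh might contain; the function name, its default arguments, and the *.gz filter are assumptions added for illustration (the filter keeps the live access.log and error.log out of reach).

```shell
#!/usr/bin/env bash
# Sketch: delete rotated Nginx logs last modified more than N days ago.
# Defaults mirror the command above; the '*.gz' filter is an added assumption.

# Usage: remove_old_logs [LOG_DIR] [MAX_AGE_DAYS]
remove_old_logs() {
    local log_dir="${1:-/var/log/nginx}"
    local max_age_days="${2:-10}"
    # "-exec rm -- {} +" passes many files to a single rm invocation.
    find "$log_dir" -type f -name '*.gz' -mtime +"$max_age_days" -exec rm -- {} +
}

# Call the function at the end of the script, for example:
# remove_old_logs /var/log/nginx 10
```

Run the find command without -exec first, as shown earlier, to confirm which files would be removed.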

Once the script is ready, you will need to schedule it so that it is executed automatically as per your requirements. crontab can be of great help here, since it holds the list of commands that you want executed on a schedule. crontab stands for cron table, because it is read by the scheduler called cron. Do the following to register the command so that it is executed automatically:

  1. Type crontab -e. This opens your crontab in the default editor (often vi), and you can edit it as you would edit any other file. Every line is an additional task, and lines prefixed with # are considered comments.

  2. Every line must contain six distinct pieces of information, separated by spaces. The first five tell cron when to run, and the last one tells it what to run. The pieces are as follows, in order (an asterisk in a field means every value):

    1. A number, a comma-separated list of numbers (e.g., 10,20,30), or a range of numbers (e.g., 10-20) that represents minutes of the hour.

    2. A number, a list of numbers, or a range that represents hour of the day.

    3. A number, a list of numbers, or a range that represents days of the month.

    4. A number, a list of numbers, or a range that represents months of the year.

    5. A number, a list of numbers, or a range that represents days of the week.

    6. Actual command or bash script that needs to be executed.

  3. If you write the following line as one of the entries, cron will execute the script and delete the older logs every night at 1 a.m.:

    0 1 * * * /path/to/script/remove_old_logs.sh

  4. Save the file and exit the editor. Then create the script at the path that you specified in the crontab entry and make it executable; cron will take care of the rest.

Some additional sample entries:

# Run something every minute
* * * * * /path/to/script/every_minute.sh

# Run something every hour starting 9AM and ending at 6PM every day
0 9-18 * * * /path/to/script/every_hour.sh

# Run something every night at 11:30PM
30 23 * * * /path/to/script/every_night_at_specific_time.sh

# Run something every Monday at 1 AM
0 1 * * Mon /path/to/script/every_monday_morning.sh
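
To double-check how an entry will be read, it can help to label its six fields. The helper below, describe_cron_entry, is a hypothetical illustration and not part of cron; it only splits a line on whitespace in the order described above.

```shell
# Split one crontab line into its six fields and label each one.
describe_cron_entry() {
    local minute hour dom month dow command
    # read assigns the first five words to the schedule fields and the rest to command.
    read -r minute hour dom month dow command <<EOF
$1
EOF
    printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s command=%s\n' \
        "$minute" "$hour" "$dom" "$month" "$dow" "$command"
}

describe_cron_entry '30 23 * * * /path/to/script/every_night_at_specific_time.sh'
# prints: minute=30 hour=23 day-of-month=* month=* day-of-week=* command=/path/to/script/every_night_at_specific_time.sh
```

Quote the entry when you pass it to the function so that the shell does not expand the asterisks.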

Summary

In this chapter you have learned about the usage of error logs and access logs. You should be able to comfortably analyze what kind of information you need to log and how much to log. You have also learned about log analysis using tools. Do ensure that you baseline your servers and benchmark new servers appropriately whenever you need to scale up or scale out. Don’t forget to customize your error pages and use the monitoring tools pragmatically in production. Automation is good, and you should try to automate as much as possible to reduce human errors.
