Web server log (Apache, NGINX) monitoring with Netdata
This module parses Apache and NGINX web server logs.
Charts
The module produces the following charts:
- Total Requests in requests/s
- Excluded Requests in requests/s
- Requests By Type in requests/s
- Responses By Status Code Class in responses/s
- Responses By Status Code in responses/s
- Informational Responses By Status Code in responses/s
- Successful Responses By Status Code in responses/s
- Redirects Responses By Status Code in responses/s
- Client Errors Responses By Status Code in responses/s
- Server Errors Responses By Status Code in responses/s
- Bandwidth in kilobits/s
- Request Processing Time in milliseconds
- Requests Processing Time Histogram in requests/s
- Upstream Response Time in requests/s
- Upstream Responses Time Histogram in responses/s
- Current Poll Unique Clients in clients
- Requests By Vhost in requests/s
- Requests By Port in requests/s
- Requests By Scheme in requests/s
- Requests By HTTP Method in requests/s
- Requests By HTTP Version in requests/s
- Requests By IP Protocol in requests/s
- Requests By SSL Connection Protocol in requests/s
- Requests By SSL Connection Cipher Suite in requests/s
- URL Field Requests By Pattern in requests/s
For every Custom field:
- Requests By Pattern in requests/s
For every URL pattern:
- Responses By Status Code in responses/s
- Requests By HTTP Method in requests/s
- Bandwidth in kilobits/s
- Request Processing Time in milliseconds
Log Parsers
Weblog supports 4 different log parsers:
- CSV
- JSON
- LTSV
- RegExp
Avoid RegExp where possible because it is much slower than the other parsers. Prefer the LTSV or CSV parser.
There is an example job for every log parser.
Log Parser Auto-Detection
If the log_type parameter is set to auto (the default), weblog tries to auto-detect the appropriate log parser and log format using the last line of the log file. It:
- checks if the format is LTSV (using regexp).
- checks if the format is JSON (using regexp).
- assumes the format is CSV and tries to find an appropriate CSV log format using a predefined list of formats. It tries to parse the line with each of them in turn.
The first format that matches is used from then on. If you use the default Apache/NGINX log format, auto-detection will work for you. If it doesn't, you need to set the format manually.
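The last detection step (try each predefined CSV format in order and keep the first that parses the line) can be sketched as below. This is only an illustration in Python, not weblog's implementation: the `CANDIDATES` list, the format names, and the validity check on the status column are all hypothetical, and the real predefined list is not reproduced here.

```python
import csv

# Hypothetical candidates: each entry names the space-separated columns of one
# format, with None for columns to skip. The time field occupies two columns.
CANDIDATES = [
    ("vhost_common", ["host", "remote_addr", None, None, None, None,
                      "request", "status", "body_bytes_sent"]),
    ("common", ["remote_addr", None, None, None, None,
                "request", "status", "body_bytes_sent"]),
]


def split_columns(line):
    """Split a line on spaces while keeping double-quoted fields whole."""
    return next(csv.reader([line], delimiter=" ", quotechar='"'))


def matches(columns, values):
    """Assumed check: same column count and a plausible HTTP status code."""
    if len(columns) != len(values):
        return False
    record = {name: value for name, value in zip(columns, values) if name}
    status = record.get("status", "")
    return status.isdigit() and 100 <= int(status) <= 599


def detect_csv_format(last_line):
    """Return the name of the first candidate that parses the line, else None."""
    values = split_columns(last_line)
    for name, columns in CANDIDATES:
        if matches(columns, values):
            return name
    return None
```

When no candidate matches, the sketch returns `None`, which corresponds to the case where the format has to be set manually.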
Known Fields
These are NGINX and Apache log format variables.
Weblog knows how to parse and interpret the following fields:
| nginx | apache | description |
|---|---|---|
| $host ($http_host) | %v | Name of the server which accepted a request. |
| $server_port | %p | Port of the server which accepted a request. |
| $scheme | - | Request scheme. "http" or "https". |
| $remote_addr | %a (%h) | Client address. |
| $request | %r | Full original request line. The line is "$request_method $request_uri $server_protocol". |
| $request_method | %m | Request method. Usually "GET" or "POST". |
| $request_uri | %U | Full original request URI. |
| $server_protocol | %H | Request protocol. Usually "HTTP/1.0", "HTTP/1.1", or "HTTP/2.0". |
| $status | %s (%>s) | Response status code. |
| $request_length | %I | Bytes received from a client, including request and headers. |
| $bytes_sent | %O | Bytes sent to a client, including the response headers. |
| $body_bytes_sent | %B (%b) | Bytes sent to a client, not counting the response header. |
| $request_time | %D | Request processing time. |
| $upstream_response_time | - | Time spent on receiving the response from the upstream server. |
| $ssl_protocol | - | Protocol of an established SSL connection. |
| $ssl_cipher | - | String of ciphers used for an established SSL connection. |
In addition, weblog understands user-defined fields.
Notes:
- Apache %h logs the IP address if HostnameLookups is Off. The web log collector counts hostnames as IPv4 addresses. We recommend either disabling HostnameLookups or using %a instead of %h.
- Since httpd 2.0, unlike 1.3, the %b and %B format strings do not represent the number of bytes sent to the client, but simply the size in bytes of the HTTP response. It will differ, for instance, if the connection is aborted, or if SSL is used. The %O format provided by mod_logio will log the actual number of bytes sent over the network.
- To get %I and %O working you need to enable mod_logio on Apache.
- NGINX logs the URI with query parameters, Apache doesn't.
- $request is parsed into $request_method, $request_uri and $server_protocol. If you have $request in your log format, there is no point in including the others.
- Don't use both $bytes_sent and $body_bytes_sent (%O and %B or %b). The module does not distinguish between these parameters.
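To illustrate why $request makes the three separate fields redundant, here is a minimal Python sketch of splitting a request line into its parts. The function name `split_request` and the choice to return `None` for malformed lines are assumptions for illustration, not weblog's actual behaviour.

```python
def split_request(request):
    """Split '$request_method $request_uri $server_protocol' into a dict.

    Returns None when the line does not have exactly three space-separated parts.
    """
    parts = request.split(" ")
    if len(parts) != 3:
        return None
    method, uri, protocol = parts
    return {
        "request_method": method,
        "request_uri": uri,
        "server_protocol": protocol,
    }
```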
Custom Log Format
Defining a custom log format is easy. Use known fields to construct your log format.
- If using the CSV parser
Since weblog understands NGINX and Apache variables, all you need to do is copy your log format, and that is it!
If there is a field that weblog does not know, it's not a problem. Weblog will skip it during parsing. We suggest
replacing all unknown fields with - for optimization purposes.
Let's take a non-default format as an example.
To get it working, we need to copy the format without any changes (for NGINX, join it into a single line). Replacing unknown fields is optional but recommended.
Special case:
Both the %t and $time_local fields represent time in Common Log Format. This is a special case
because the value is in fact 2 fields after CSV parsing (e.g. [22/Mar/2009:09:30:31 +0100]). Weblog understands this, so you don't
need to replace it with - (if you do want to replace it, use - -).
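The two-field behaviour is easy to see by splitting a sample line on spaces. The sketch below uses Python's `csv` module with a space delimiter as a stand-in for the CSV parser. The sample line and the helper name `split_columns` are made up for illustration.

```python
import csv


def split_columns(line):
    """Split a line on spaces while keeping double-quoted fields whole."""
    return next(csv.reader([line], delimiter=" ", quotechar='"'))


line = '127.0.0.1 - - [22/Mar/2009:09:30:31 +0100] "GET / HTTP/1.1" 200 2326'
columns = split_columns(line)

# The bracketed timestamp contains a space, so it lands in two columns.
time_columns = columns[3:5]

# A format that replaces the time field must therefore use "- -" to keep
# the same number of columns as the log line.
format_columns = split_columns('$remote_addr - - - - "$request" $status $body_bytes_sent')
```

Here `time_columns` holds the two halves of the timestamp, and `format_columns` has the same length as `columns`.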
- If using the JSON parser
Provide a fields mapping if needed. Don't use the $ and % prefixes for mapped field names. They are only
needed in CSV format.
- If using the LTSV parser
Provide a fields mapping if needed. Don't use the $ and % prefixes for mapped field names. They are only
needed in CSV format.
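As a rough illustration of what a fields mapping does, the Python sketch below parses one LTSV line (tab-separated label:value pairs) and renames labels to known field names without the $ prefix. The labels `ip`, `code` and `time` and the function `parse_ltsv` are hypothetical. The same idea applies to mapping JSON keys.

```python
def parse_ltsv(line, mapping=None):
    """Parse a tab-separated 'label:value' line, renaming labels via mapping."""
    mapping = mapping or {}
    record = {}
    for pair in line.rstrip("\n").split("\t"):
        # Split on the first colon only, so values may contain colons.
        label, sep, value = pair.partition(":")
        if not sep:
            continue  # not a label:value pair, skip it
        record[mapping.get(label, label)] = value
    return record


# Hypothetical mapping from the log's own labels to known field names.
MAPPING = {"ip": "remote_addr", "code": "status", "time": "request_time"}
record = parse_ltsv("ip:127.0.0.1\tcode:200\ttime:0.005", MAPPING)
```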
- If using the RegExp parser
Use a pattern with named subexpressions. The names must be known to weblog.
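A pattern with named subexpressions might look like the sketch below, written with Python's `re` module. It assumes the subexpression names are the known field names without the $ prefix. The pattern itself is only an example for a common-log-style line, not one shipped with weblog.

```python
import re

# Named groups reuse known field names: remote_addr, request, status, body_bytes_sent.
PATTERN = re.compile(
    r'^(?P<remote_addr>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<body_bytes_sent>\d+|-)'
)


def parse_line(line):
    """Return a dict of named subexpressions, or None if the line does not match."""
    match = PATTERN.match(line)
    return match.groupdict() if match else None
```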
Custom Fields Feature
Weblog can extract user-defined fields and count pattern matches against these fields.
This feature needs:
- a custom log format with user-defined fields
- a list of patterns to match against the appropriate fields
Pattern syntax: matcher.
Consider an example with 2 custom fields: $http_referer and $http_user_agent. Weblog is unaware of these fields,
but we can still get some info from them.
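Conceptually, counting pattern matches against a custom field works like the Python sketch below. The pattern names and regular expressions are hypothetical, plain `re` patterns stand in for the matcher syntax, and counting every pattern that matches a value (rather than only the first) is an assumption of this sketch.

```python
import re

# Hypothetical patterns to match against $http_user_agent values.
PATTERNS = {
    "chrome": re.compile(r"Chrome"),
    "firefox": re.compile(r"Firefox"),
    "curl": re.compile(r"^curl/"),
}


def count_matches(values, patterns):
    """Count, per pattern name, how many field values the pattern matches."""
    counts = {name: 0 for name in patterns}
    for value in values:
        for name, regex in patterns.items():
            if regex.search(value):
                counts[name] += 1
    return counts
```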
Custom Time Fields Feature
The web log collector can also extract user-defined time fields and calculate min/avg/max values and a histogram for them.
This feature needs:
- A custom log format with user-defined time fields.
- Optionally, a histogram to show response time in seconds.
As an example, Apache mod_logio adds a ^FB logging
directive. This value shows the delay in microseconds between when the request arrived and when the first byte of the response
headers is written.
As with the custom fields feature, Netdata's web log collector is unaware of these fields, but we can still get some info from them.
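The kind of summary described here can be sketched in Python as follows. Everything in it is an assumption for illustration: the function names, the conversion from microseconds (as ^FB reports them) to seconds, and the use of cumulative buckets in which each bucket counts samples less than or equal to its upper bound.

```python
def to_seconds(microseconds):
    """Convert a list of microsecond values to seconds."""
    return [value / 1_000_000 for value in microseconds]


def summarize(samples, buckets):
    """Return min, avg, max and cumulative bucket counts for samples in seconds."""
    if not samples:
        return None
    histogram = {upper: sum(1 for s in samples if s <= upper) for upper in buckets}
    histogram["+Inf"] = len(samples)  # catch-all bucket
    return {
        "min": min(samples),
        "avg": sum(samples) / len(samples),
        "max": max(samples),
        "histogram": histogram,
    }
```

For instance, `summarize(to_seconds([250000, 500000, 750000, 2500000]), [0.5, 1.0])` summarizes four time-to-first-byte samples against two bucket bounds given in seconds.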
Configuration
Edit the go.d/web_log.conf configuration file using edit-config from the
Netdata config directory, which is typically at /etc/netdata.
This module needs only the path to the log file. If it fails to auto-detect your log format, you
need to set it manually.
For all available options, please see the module configuration file.
Troubleshooting
To troubleshoot issues with the web_log collector, run the go.d.plugin with the debug option enabled. The output
should give you clues as to why the collector isn't working.
First, navigate to your plugins directory, usually at /usr/libexec/netdata/plugins.d/. If that's not the case on your
system, open netdata.conf and look for the setting plugins directory. Once you're in the plugins directory, switch
to the netdata user.
You can now run the go.d.plugin to debug the collector: