Share



Check out Abot
anonymous messaging for Slack

Setup ELK for NGINX logs with Elasticsearch, Logstash, and Kibana


ElasticSearch, Logstash and Kibana represented by Elk

ELK Elastic stack is a popular open-source solution for analyzing weblogs. In this blog post, I describe how to setup Elasticsearch, Logstash and Kibana on a barebones VPS to analyze NGINX access logs. I don’t dwell on details but instead focus on things you need to get up and running with ELK-powered log analysis quickly.

Comparing to other tools available ELK gives you extreme flexibility in terms of ways to analyze and present your logs data. Hosted solutions are a bit pricey with monthly costs starting around $50 for a reasonable features set. By following this tutorial you can setup your own log analysis machine for a cost of a simple VPS server. No need to be a dev-ops pro to do it yourself.

ELK stack will reside on a server separate from your application. NGINX logs will be sent to it via an SSL protected connection using Filebeat. We will also setup GeoIP data and Let’s Encrypt certificate for Kibana dashboard access.

This step by step tutorial covers the 6.2.4 version of the ELK stack components.

Just to show you a sneak peak of what we will be building in this tutorial series:

Kibana dashboard charts

Kibana charts generated from NGINX access logs of this blog


Let’s get started.

Setup VPS

You need to start with purchasing a barebones VPS and adding SSH access to it. I don’t elaborate on how to do it in this tutorial.

You will also need a domain or a subdomain you will config with your VPS server IP using an A DNS entry. You can check out my other blog post for tips on how to save money on domain names. If you use Cloudflare for your DNS remember not to use their CDN for this domain because it changes IP domain resolves to and can cause trouble with setup.

For my ELK stack server, I use a 4GB Digital Ocean VPS with Ubuntu 16.04. It is running Elasticsearch, Kibana and Logstash processes. With my current amount of traffic log data 4GB RAM is enough so far.

Install ELK dependencies

Access your VPS and run the following commands as a sudo user to install required dependencies:

Java

Java is required for both Elasticsearch and Logstash. Make sure to install Java 8. At the time of writing Logstash is not yet compatible with the newest Java version.

sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
sudo apt-get install oracle-java8-set-default

You also need to set a correct JAVA_HOME variable. Add the following contents to the /etc/environment file:

JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64

You can verify that installation was successful by typing:

java -version

Result should look similar to:

java version "1.8.0_171"
Java(TM) SE Runtime Environment (build 1.8.0_171-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.171-b11, mixed mode)

Elasticsearch

Elasticsearch is a database where logs are stored after Logstash processes them. It can be quite memory hungry so make sure to monitor your RAM usage when working with it on a low-end VPS.

Let’s install it by running:

wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
sudo apt-get install apt-transport-https
echo "deb https://artifacts.elastic.co/packages/6.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-6.x.list
sudo apt-get update
sudo apt-get install elasticsearch

Now uncomment the following lines in /etc/elasticsearch/elasticsearch.yml

http.port: 9200
network.host: localhost

start the Elasticsearch process:

sudo service elasticsearch start

and verify that it is running by making a cURL request:

curl -v http://localhost:9200

JSON response should look something like:

{
  "name" : "lEOavW3",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "8iVBXLx-RC2uNiGXJeWd5w",
  "version" : {
    "number" : "6.2.4",
    "build_hash" : "ccec39f",
    "build_date" : "2018-04-12T20:37:28.497551Z",
    "build_snapshot" : false,
    "lucene_version" : "7.2.1",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}

Kibana

Kibana is a visual layer of an ELK stack. It queries an ElasticSearch for log data and offers a multitude of ways to analyze and present it.

First, let’s install it:

sudo apt-get install kibana

Then add the following lines to /etc/kibana/kibana.yml

server.port: 5601
server.host: "localhost"
elasticsearch.url: "http://localhost:9200"

Now you can start a Kibana process by typing:

sudo service kibana start

Just like in the case of Elasticsearch you can verify that it is running by using a cURL command:

curl -v http://localhost:5601

Now let’s expose an access to our Kibana dashboard to an external world using NGINX.

NGINX

This is not the NGINX we will be analyzing logs from. This one will be used to provide password-protected access to Kibana instance running on our ELK server. We will use a Let’s Encrypt SSL certificate for secure access. We can do it by typing:

sudo add-apt-repository ppa:certbot/certbot
sudo apt-get update
sudo apt-get install nginx apache2-utils
sudo apt-get install python-certbot-nginx
sudo certbot --nginx -d my-elk-stack-vps.com

To automatically renew your certificate add this line to /etc/crontab/ file:

@monthly root certbot -q renew

Now you need to set a password for your Kibana user:

sudo htpasswd -c /etc/nginx/htpasswd.users kibanaadmin

Next, configure NGINX to use generated certificates and proxy pass traffic from your VPS root path to Kibana:

  server {
      listen 443 default_server ssl;

      server_name my-elk-stack-vps.com
      ssl_certificate /etc/letsencrypt/live/my-elk-stack-vps.com/fullchain.pem;
      ssl_certificate_key /etc/letsencrypt/live/my-elk-stack-vps.com/privkey.pem;

      auth_basic "Restricted Access";
      auth_basic_user_file /etc/nginx/htpasswd.users;

      location / {
          proxy_pass http://localhost:5601;
          proxy_http_version 1.1;
          proxy_set_header Upgrade $http_upgrade;
          proxy_set_header Connection 'upgrade';
          proxy_set_header Host $host;
          proxy_cache_bypass $http_upgrade;
    }
}

This config assumes that there is an A DNS entry for my-elk-stack-vps.com domain pointing to your VPS server IP.

You should now be able to see your Kibana dashboard by going to my-elk-stack-vps.com and entering your credentials.

Empty Kibana dashboard

Kibana dashboard, no logs here yet.

Logstash and SSL certificates

Logstash is used to accept logs data sent from your client application by Filebeat then transform and feed them into an Elasticsearch database.

Install it by running:

sudo apt-get install logstash

Configure SSL certificates

Because you will be sending your logs from a separate server, you should do it via a secure connection. Generating self-signed certificates will be necessary to do it:

  sudo mkdir -p /etc/elk-certs
  cd /etc/elk-certs
  sudo openssl req -subj '/CN=my-elk-stack-vps.com/' -x509 -days 3650 -batch -nodes -newkey rsa:2048 -keyout elk-ssl.key -out elk-ssl.crt
  chown logstash elk-ssl.crt
  chown logstash elk-ssl.key

Remember to substitute my-elk-stack-vps.com with your domain name in the command generating a self-signed certificate. Later we will have to copy resulting files to your client-server Filebeat configuration.

Configure GeoIP data

In order to map visitor IP adresses to geographical locations we need to download GeoIP database:

 cd /etc/logstash/
 wget http://geolite.maxmind.com/download/geoip/database/GeoLite2-City.mmdb.gz
 gunzip GeoLite2-City.mmdb.gz

Configure Logstash

Now you need to configure Logstash with the following files:

/etc/logstash/logstash.yml

path.data: /var/lib/logstash
path.config: /etc/logstash/conf.d
path.logs: /var/log/logstash

/etc/logstash/conf.d/logstash-nginx-es.conf

input {
    beats {
        port => 5400
        ssl => true
        ssl_certificate_authorities => ["/etc/elk-certs/elk-ssl.crt"]
        ssl_certificate => "/etc/elk-certs/elk-ssl.crt"
        ssl_key => "/etc/elk-certs/elk-ssl.key"
        ssl_verify_mode => "force_peer"
    }
}

filter {
 grok {
   match => [ "message" , "%{COMBINEDAPACHELOG}+%{GREEDYDATA:extra_fields}"]
   overwrite => [ "message" ]
 }
 mutate {
   convert => ["response", "integer"]
   convert => ["bytes", "integer"]
   convert => ["responsetime", "float"]
 }
 geoip {
   source => "clientip"
   target => "geoip"
   add_tag => [ "nginx-geoip" ]
 }
 date {
   match => [ "timestamp" , "dd/MMM/YYYY:HH:mm:ss Z" ]
   remove_field => [ "timestamp" ]
 }
 useragent {
   source => "agent"
 }
}

output {
 elasticsearch {
   hosts => ["localhost:9200"]
   index => "weblogs-%{+YYYY.MM.dd}"
   document_type => "nginx_logs"
 }
 stdout { codec => rubydebug }
}


This config specifies input and output for out logs and how they will be formatted before sending them to Elasticsearch. GeoIP data is configured here as well. It also enforces a secure SSL connection signed by a correct certificate for logs sent by a Filebeat.

Now let’s start Logstash process and verify that it is listening on a correct port:

systemctl enable logstash
service restart logstash
netstat -tulpn | grep 5400

Output of the last command should be similar to:

tcp6  0 0 :::5400 :::* LISTEN  21329/java

If it does not work, you can check out the troubleshooting guide at the end of the post.

Filebeat

Fielbeat is the only part of the infrastructure that needs to be installed on a client server. You should login to the server of your NGINX application and copy the self-signed SSL certificate files to the correct folder:

/etc/elk-certs/elk-ssl.crt
/etc/elk-certs/elk-ssl.key

You can use SCP to do it or just copy/paste the contents of files.

Now, install Java using the same commands as for the main ELK host server. Then install rest of the dependencies:

wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
echo "deb https://artifacts.elastic.co/packages/6.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-6.x.list
sudo apt-get update
sudo apt-get install filebeat

Now configure Filebeat by modifying this file: /etc/filebeat/filebeat.yml

filebeat.prospectors:
- input_type: log
  paths:
    - /var/log/nginx/*.log
  exclude_files: ['\.gz$']

output.logstash:
  hosts: ["my-elk-stack-vps.com:5400"]
  ssl.certificate_authorities: ["/etc/elk-certs/elk-ssl.crt"]
  ssl.certificate: "/etc/elk-certs/elk-ssl.crt"
  ssl.key: "/etc/elk-certs/elk-ssl.key"

This config tells Filebeat where to send our logs and which SSL certificates to use for authentication. paths option points to a default NGINX logs folder.

If you have an NGINX running for a while, you probably have a bunch of GZipped logs in /var/log/nginx/. To send them to Kibana you should unzip them using gunzip and change their resulting filenames to match the *.log wildcard expression.

Raw logs are here

If everything went fine you should go to Kibana dashboard and create an index pattern called weblogs-*. You can do it in a Management menu tab. Now you can go to Discover and see your raw logs data there:

Kibana dashboard with raw NGINX logs

Raw NGINX access logs in Kibana dashboard

This how a raw JSON entry stored in Elasticsearch for a single NGINX log event after being parsed by Logstash looks like:

{
  "_index": "weblogs-2018.06.01",
  "_type": "nginx_logs",
  "_id": "UQX6wGMBnPkSpi68QAq6",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2018-06-01T15:25:47.000Z",
    "source": "/var/log/nginx/pawel-blog-access-1.log",
    "input_type": "log",
    "os_name": "Ubuntu",
    "request": "/profitable-slack-bot-rails",
    "agent": "\"Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:61.0) Gecko/20100101 Firefox/61.0\"",
    "message": "157.234.132.47 - - [01/Jun/2018:17:25:47 +0200] \"GET /profitable-slack-bot-rails HTTP/1.1\" 200 12104 \"https://news.ycombinator.com/\" \"Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:61.0) Gecko/20100101 Firefox/61.0\"",
    "device": "Other",
    "clientip": "157.234.132.47",
    "response": 200,
    "type": "log",
    "httpversion": "1.1",
    "host": "pawel-blog",
    "referrer": "\"https://news.ycombinator.com/\"",
    "build": "",
    "tags": [
      "beats_input_codec_plain_applied",
      "nginx-geoip"
    ],
    "beat": {
      "hostname": "pawel-blog",
      "version": "6.2.4",
      "name": "pawel-blog"
    },
    "os": "Ubuntu",
    "@version": "1",
    "offset": 571753,
    "geoip": {
      "latitude": -26.2309,
      "longitude": 28.0583,
      "country_code3": "ZA",
      "timezone": "Africa/Johannesburg",
      "region_name": "Gauteng",
      "ip": "157.234.132.47",
      "postal_code": "2000",
      "continent_code": "AF",
      "city_name": "Johannesburg",
      "country_name": "South Africa",
      "region_code": "GT",
      "country_code2": "ZA",
      "location": {
        "lon": 28.0583,
        "lat": -26.2309
      }
    },
    "ident": "-",
    "verb": "GET",
    "name": "Firefox",
    "minor": "0",
    "bytes": 12104,
    "major": "61",
    "auth": "-"
  },
  "fields": {
    "@timestamp": [
      "2018-06-01T15:25:47.000Z"
    ]
  }
}

Troubleshooting

As you can see you need to make various components play together in order to get the ELK stack running. Here’s a list of commands which can help you debug when things go wrong:

Filebeat logs:

tail -f /var/log/filebeat/filebeat

Logstash logs:

tail -f /var/log/logstash/logstash-plain.log

Start a Filebeat process in the foreground to see if it can connect to Logstash on the host ELK server:

/usr/share/filebeat/bin/filebeat -c /etc/filebeat/filebeat.yml -e -v

Start a Logstash process in the foreground to check why it’s not listening on a port:

/usr/share/logstash/bin/logstash --debug

Summary

I am just gettings started to play with ELK Elastic stack and discovering options it has to offer. I hope that this tutorial will help you get up and running with it quickly even if you don’t have much dev ops experience up your sleeve.

Pawel Urbanek Full Stack Ruby on Rails developer avatar

Abot - Anonymous Slack messaging bot has recently received a major update. I invite you to check it out.

I'm not giving away a free ebook, but you're still welcome to subscribe.

Back to top