The ELK (Elastic) stack is a popular open-source solution for analyzing weblogs. In this tutorial, I describe how to set up Elasticsearch, Logstash and Kibana on a barebones VPS to analyze NGINX access logs. I don’t dwell on details but instead focus on what you need to get ELK-powered log analysis up and running quickly.
Compared to other available tools, ELK gives you extreme flexibility in how you analyze and present your log data. Hosted solutions are a bit pricey, with monthly costs starting around $50 for a reasonable feature set. By following this tutorial you can set up your own log analysis machine for the cost of a simple VPS. You don’t need to be a DevOps pro to do it yourself.
The ELK stack will reside on a server separate from your application. NGINX logs will be sent to it by Filebeat over an SSL-protected connection. We will also set up GeoIP data and a Let’s Encrypt certificate for Kibana dashboard access.
This step-by-step tutorial covers version 6.2.4 of the ELK stack components.
Here is a sneak peek of what we will be building in this tutorial series:
Currently, I am using Kibana to analyze the traffic logs of this blog and the Abot landing page.
Let’s get started.
You need to start by purchasing a barebones VPS and adding SSH access to it. I don’t elaborate on how to do that in this tutorial.
You will also need a domain or a subdomain that points to your VPS server IP through an A DNS entry. If you use Cloudflare for your DNS, remember not to use their CDN for this domain, because it changes the IP the domain resolves to and can cause trouble with the setup.
For my ELK stack server, I use a 4GB DigitalOcean VPS with Ubuntu 16.04. It runs the Elasticsearch, Kibana and Logstash processes. With my current volume of traffic log data, 4GB of RAM has been enough so far.
Install ELK dependencies
Access your VPS and run the following commands as a sudo user to install required dependencies:
Java is required for both Elasticsearch and Logstash. Make sure to install Java 8. At the time of writing Logstash is not yet compatible with the newest Java version.
You also need to set the correct JAVA_HOME variable. Add the following contents to the
You can verify that the installation was successful by typing:
The result should look similar to:
Elasticsearch is a database where logs are stored after Logstash processes them. It can be quite memory hungry so make sure to monitor your RAM usage when working with it on a low-end VPS.
Let’s install it by running:
Now uncomment the following lines in
start the Elasticsearch process:
and verify that it is running by making a cURL request:
JSON response should look something like:
Kibana is the visual layer of the ELK stack. It queries Elasticsearch for log data and offers a multitude of ways to analyze and present it.
First, let’s install it:
Then add the following lines to
Now you can start a Kibana process by typing:
Just like in the case of Elasticsearch you can verify that it is running by using a cURL command:
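A minimal sketch of such checks for both services, assuming they listen on their default ports on localhost (9200 for Elasticsearch, 5601 for Kibana). The check_service helper is a hypothetical name used only for illustration, and it only tests that something answers over HTTP, not that the response is healthy:

```shell
# check_service NAME URL
# Prints a status line and returns 0 if anything answers at URL.
check_service() {
  # -s hides progress output, -o /dev/null discards the body,
  # --max-time keeps the check from hanging on a dead port.
  if curl -s -o /dev/null --max-time 5 "$2"; then
    echo "$1 is responding at $2"
  else
    echo "$1 is NOT responding at $2"
    return 1
  fi
}

# Default ports are an assumption; adjust if you changed them.
check_service "Elasticsearch" "http://localhost:9200" || true
check_service "Kibana" "http://localhost:5601" || true
```

Both services can take a while to boot on a small VPS, so a failed check right after starting them does not necessarily mean something is broken.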
Now let’s expose our Kibana dashboard to the outside world using NGINX.
This is not the NGINX we will be analyzing logs from. This one will provide password-protected access to the Kibana instance running on our ELK server. We will use a Let’s Encrypt SSL certificate for secure access. We can do it by typing:
To automatically renew your certificate add this line to
Now you need to set a password for your Kibana user:
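One way to do this without installing extra packages is to generate the password file with openssl. This is only a sketch: the user name kibanaadmin, the placeholder password and the file name htpasswd.users are assumptions, and the file is written to a scratch directory rather than to /etc/nginx:

```shell
# Hypothetical names; override them through the environment.
SKETCH_DIR="${SKETCH_DIR:-$(mktemp -d)}"
KIBANA_USER="${KIBANA_USER:-kibanaadmin}"
KIBANA_PASSWORD="${KIBANA_PASSWORD:-change-me}"

# openssl passwd -apr1 produces an Apache MD5 (apr1) hash,
# one of the formats NGINX basic auth accepts.
HASH="$(openssl passwd -apr1 "$KIBANA_PASSWORD")"

# The file format is one "user:hash" entry per line.
printf '%s:%s\n' "$KIBANA_USER" "$HASH" > "$SKETCH_DIR/htpasswd.users"
echo "wrote $SKETCH_DIR/htpasswd.users"
```

After checking the result, copy the file to wherever your NGINX config expects it and make sure the NGINX user can read it.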
Next, configure NGINX to use generated certificates and proxy pass traffic from your VPS root path to Kibana:
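A minimal sketch of such a server block, written to a scratch directory so nothing on the system is overwritten. It assumes certbot stored the certificate under its default /etc/letsencrypt/live/ path, that the password file lives at /etc/nginx/htpasswd.users (a hypothetical location) and that Kibana listens on its default port 5601:

```shell
SKETCH_DIR="${SKETCH_DIR:-$(mktemp -d)}"
ELK_DOMAIN="${ELK_DOMAIN:-my-elk-stack-vps.com}"

# Unquoted heredoc: $ELK_DOMAIN is expanded by the shell,
# while \$host is left for NGINX to interpret.
cat > "$SKETCH_DIR/kibana.nginx.conf" <<EOF
server {
    listen 443 ssl;
    server_name $ELK_DOMAIN;

    # Default certbot output paths for this domain (assumption).
    ssl_certificate     /etc/letsencrypt/live/$ELK_DOMAIN/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/$ELK_DOMAIN/privkey.pem;

    # Password file location is an assumption.
    auth_basic           "Restricted";
    auth_basic_user_file /etc/nginx/htpasswd.users;

    location / {
        # Kibana listens on port 5601 by default.
        proxy_pass http://localhost:5601;
        proxy_set_header Host \$host;
    }
}
EOF
echo "wrote $SKETCH_DIR/kibana.nginx.conf"
```

Running nginx -t after copying the file into place is a cheap way to catch syntax errors before reloading.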
This config assumes that there is an A DNS entry for the my-elk-stack-vps.com domain pointing to your VPS server IP.
You should now be able to see your Kibana dashboard by going to my-elk-stack-vps.com and entering your credentials.
Logstash and SSL certificates
Logstash accepts the log data sent by Filebeat from your client application, transforms it and feeds it into the Elasticsearch database.
Install it by running:
Configure SSL certificates
Because you will be sending your logs from a separate server, you should do it over a secure connection. That requires generating self-signed certificates:
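A minimal sketch of generating such a certificate with openssl. The file names logstash.key and logstash.crt, the 365-day validity and the scratch output directory are all assumptions; only the domain in the subject comes from this tutorial:

```shell
CERT_DIR="${CERT_DIR:-$(mktemp -d)}"
ELK_DOMAIN="${ELK_DOMAIN:-my-elk-stack-vps.com}"

# -x509 emits a self-signed certificate instead of a signing request,
# -nodes leaves the private key unencrypted so a service can read it,
# -subj sets the certificate subject to the ELK server's domain.
openssl req -x509 -nodes -newkey rsa:2048 -days 365 \
  -subj "/CN=$ELK_DOMAIN" \
  -keyout "$CERT_DIR/logstash.key" \
  -out "$CERT_DIR/logstash.crt"

# Print the subject to confirm the domain ended up in the certificate.
openssl x509 -in "$CERT_DIR/logstash.crt" -noout -subject
echo "certificate files are in $CERT_DIR"
```

The private key stays on the ELK server, so keep its permissions tight.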
Remember to substitute my-elk-stack-vps.com with your domain name in the command that generates the self-signed certificate. Later we will copy the resulting files to the Filebeat configuration on your client server.
Configure GeoIP data
In order to map visitor IP addresses to geographical locations, we need to download a GeoIP database:
Now you need to configure Logstash with the following files:
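A minimal sketch of a single pipeline file covering those pieces, written to a scratch directory. Port 5044, the certificate and GeoIP database paths and the file name are assumptions, and the grok pattern assumes NGINX writes its default combined log format. The weblogs- index prefix matches the index pattern used later in this tutorial:

```shell
SKETCH_DIR="${SKETCH_DIR:-$(mktemp -d)}"

# Quoted heredoc: the shell copies the content verbatim.
cat > "$SKETCH_DIR/nginx-weblogs.conf" <<'EOF'
input {
  beats {
    # 5044 is the conventional Beats port (assumption).
    port => 5044
    ssl => true
    # Certificate paths are assumptions.
    ssl_certificate => "/etc/logstash/certs/logstash.crt"
    ssl_key => "/etc/logstash/certs/logstash.key"
  }
}

filter {
  # Split each access log line into named fields.
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  # Use the request time from the log line as the event time.
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
  # Look up the visitor IP; the database path is an assumption.
  geoip {
    source => "clientip"
    database => "/etc/logstash/GeoLite2-City.mmdb"
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "weblogs-%{+YYYY.MM.dd}"
  }
}
EOF
echo "wrote $SKETCH_DIR/nginx-weblogs.conf"
```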
This config specifies the input and output for our logs and how they are formatted before being sent to Elasticsearch. GeoIP data is configured here as well. It also enforces a secure SSL connection, signed by the correct certificate, for logs sent by Filebeat.
Now let’s start the Logstash process and verify that it is listening on the correct port:
The output of the last command should be similar to:
If it does not work, you can check out the troubleshooting guide at the end of the post.
Filebeat is the only part of the infrastructure that needs to be installed on the client server. Log in to the server running your NGINX application and copy the self-signed SSL certificate files to the correct folder:
You can use SCP to do it or just copy and paste the contents of the files.
Now, install Java using the same commands as for the main ELK host server. Then install the rest of the dependencies:
Now configure Filebeat by modifying this file:
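A minimal sketch of such a config for Filebeat 6.2, written to a scratch directory instead of the real config location. The Logstash port 5044 and the certificate path are assumptions; the log path follows the default NGINX logs folder:

```shell
SKETCH_DIR="${SKETCH_DIR:-$(mktemp -d)}"
ELK_DOMAIN="${ELK_DOMAIN:-my-elk-stack-vps.com}"

cat > "$SKETCH_DIR/filebeat.yml" <<EOF
filebeat.prospectors:
  - type: log
    paths:
      # Default NGINX logs folder.
      - /var/log/nginx/*.log

output.logstash:
  # Port 5044 is an assumption; it must match the Logstash beats input.
  hosts: ["$ELK_DOMAIN:5044"]
  # Path to the copied self-signed certificate (assumption).
  ssl.certificate_authorities: ["/etc/filebeat/certs/logstash.crt"]
EOF
echo "wrote $SKETCH_DIR/filebeat.yml"
```

The host name here should match the domain used when generating the certificate, otherwise the SSL handshake is likely to fail.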
This config tells Filebeat where to send our logs and which SSL certificates to use for authentication.
The paths option points to the default NGINX logs folder.
If NGINX has been running for a while, you probably have a bunch of gzipped logs in /var/log/nginx/. To send them to Kibana, unzip them using gunzip and rename the resulting files so they match the *.log wildcard expression.
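The unzip-and-rename step can be sketched as a small helper. The function name expand_rotated_logs is hypothetical, and appending a .log suffix is just one naming scheme that satisfies the wildcard:

```shell
# expand_rotated_logs DIR
# Decompresses every *.gz file in DIR into a sibling file ending in .log,
# e.g. access.log.2.gz -> access.log.2.log. The archives are kept.
expand_rotated_logs() {
  for archive in "$1"/*.gz; do
    # When nothing matches, the loop sees the literal pattern; skip it.
    [ -e "$archive" ] || continue
    target="${archive%.gz}.log"
    # gunzip -c writes to stdout, so the original archive is untouched.
    gunzip -c "$archive" > "$target"
    echo "expanded $archive -> $target"
  done
}

# Usage (run with enough permissions to write to the folder):
#   expand_rotated_logs /var/log/nginx
```

Keeping the archives means you can delete the expanded copies once they have been shipped, without losing the originals.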
Raw logs are here
If everything went fine, go to the Kibana dashboard and create an index pattern called weblogs-*. You can do it in the Management menu tab. Now you can go to Discover and see your raw log data there:
This is what a raw JSON entry stored in Elasticsearch looks like for a single NGINX log event after Logstash has parsed it:
As you can see you need to make various components play together in order to get the ELK stack running. Here’s a list of commands which can help you debug when things go wrong:
Start a Filebeat process in the foreground to see if it can connect to Logstash on the host ELK server:
Start a Logstash process in the foreground to check why it’s not listening on a port:
I am just getting started with the ELK Elastic stack and discovering the options it has to offer. I hope this tutorial helps you get up and running with it quickly, even if you don’t have much DevOps experience up your sleeve.