Creating Log Infrastructure with Elastic Stack and Docker Compose (Part 1)

Jagad Nabil Tuah Imanda
12 min read · Oct 8, 2023


Elasticsearch logo

Logging plays a crucial role in any system. It enables error tracing, troubleshooting, performance analysis, and much more. As the codebase or system grows in complexity, you can no longer rely on manual tracing or debugging methods such as fmt.Println() or console.log(). Large codebases and systems also tend to suffer from intermittent errors (errors that occur seemingly at random) or intermittent spikes in latency or execution time. To identify and trace these issues, we need to navigate through the logs and find out when they occurred.

Elastic Stack is one of the options for building a logging infrastructure. The Elastic Stack usually consists of Elasticsearch, Kibana, Beats, and Logstash, each with a different purpose: Elasticsearch is a NoSQL database that is commonly used as a search engine. Kibana is a data visualization and exploration tool that is commonly used to visualize data from Elasticsearch. Beats are agents that ship data to Elasticsearch or Logstash. Lastly, Logstash is a log aggregator that accepts logs from Beats or other sources, processes them, and sends them on to Elasticsearch.

In this article, I will demonstrate how to set up an Elastic Stack log infrastructure with Docker Compose. The reason I used Docker Compose is simply that I don’t want my virtual machine/server to get dirty (removing dependencies and software is a pain for me). So, let’s get started!

Setting up the Application

To simulate an application that generates a log entry each time it’s accessed, I’ve developed a simple API using the Go programming language. To access the code of the API, please see this Repository. The repository also includes a Dockerfile, making it effortless to build the Docker image and have the application up and running in no time!

To build the Docker image, use the docker build . command. You can add an image name and tag by using the -t <name>:<tag> argument. After the image is successfully built, you can check it with the docker images command.
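For example, the build and verification steps look roughly like this (the image name and tag here are just placeholders; use whatever naming you prefer):

# Build the image from the repository root and tag it
docker build -t go-log-example:latest .

# Verify that the image now shows up locally
docker images go-log-example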

List of Docker images

Next, we need to run the application image that has been created. This can be accomplished either with the docker run command or with docker compose. Personally, I prefer the docker compose approach as it allows me to prepare the Elastic Stack setup in the same place.

If you prefer using docker run, you can use this command to run the image:

docker run --name <your desired container name> -p <port>:8080 -d <image name>:<tag>
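Filled in with the image from the repository, the command could look something like this (the container name and host port are arbitrary choices):

# Run the API container in the background and publish port 8080
docker run --name api -p 8080:8080 -d arceister/go-log-example:latest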

Or, if you want to set it up using docker compose, this is the docker compose YAML file:

version: "3"

services:
  api:
    container_name: api
    image: arceister/go-log-example:latest
    ports:
      - "8080:8080"
    networks:
      - elastic

networks:
  elastic:
Then run the docker compose up -d command.

If you follow either of those approaches, the container should now be running. You can check the container status with docker ps.

Docker container status

Next, we need to verify whether the application is actually running. To do this, we will use cURL to connect to our localhost (or local server) on the exposed port. The expected response should be:

Expected response from the server
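For reference, the request is as simple as pointing cURL at the published port (the root path here is an assumption; check the repository for the actual routes):

# Hit the API on the port published by Docker
curl http://localhost:8080/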

After connecting to our localhost using cURL, the server should log the connections we initiate. You can view the container logs by using the docker logs <container name> command. The log entry we want to see is highlighted in the red box.

Docker container log

Setting up Elasticsearch

Next, we want to set up and configure Elasticsearch as our log database. The Elasticsearch instance that we will configure will only have a single node. Additionally, I will not enable SSL for this Elasticsearch instance. So, let’s start setting up and configuring the Elasticsearch instance right now!

You might want to create a .env file first. I will create a .env file to contain “secret” values such as the Elasticsearch username, Elasticsearch password, and so on. This is optional, but I highly recommend it. This is the .env file for the Elasticsearch instance that will be set up and configured:

ELASTICSEARCH_PASSWORD=<enter your desired password>

I’m using Elasticsearch version 8.8.1 for this. You can obtain the image here, but I’m specifying the image name in my docker-compose.yaml file, as Docker will automatically fetch the image if a matching image name and tag exist on Docker Hub. Therefore, here is the docker compose YAML file for setting up and configuring the Elasticsearch instance:

...
services:

  ...

  elasticsearch:
    container_name: elasticsearch
    image: elasticsearch:8.8.1
    ports:
      - "9200:9200"
    environment:
      - ELASTIC_PASSWORD=${ELASTICSEARCH_PASSWORD}
      - xpack.security.enabled=false
      - discovery.type=single-node
      - "ES_JAVA_OPTS=-Xmx1g -Xms1g"
    healthcheck:
      test:
        [
          "CMD-SHELL",
          "curl -f localhost:9200"
        ]
      interval: 5s
      timeout: 10s
      retries: 120
    depends_on:
      - api
    networks:
      - elastic

...

I want to explain a few things about the Elasticsearch configuration in the YAML file. I assign a name to the container because it makes the container easier to debug and troubleshoot, and I also find it easier to manage communication between containers. Regarding port configuration, I exposed only 9200 because I didn’t set up multiple nodes. Port 9200 handles HTTP requests for CRUD actions on the Elasticsearch database, so whenever we want to get, create, update, or delete any data in Elasticsearch, we use this port.

Now moving on to the environment settings, I already set the Elasticsearch user password in the .env file, so I’m just referring to the value that is already set in the environment variable. Next, I set the xpack.security.enabled=false option to disable the security features on Elasticsearch, so that I don’t need to set up certificates. I also set discovery.type=single-node to specify that only a single node should be used for the Elasticsearch instance. Lastly, I defined ES_JAVA_OPTS=-Xmx1g -Xms1g to limit the JVM heap size allocation for Elasticsearch (I encountered a memory issue where the JVM heap was using half of my machine’s memory, hence I limited the allocation).

For health checking, I perform a GET request to localhost:9200 at specific intervals and enforce a timeout limit to ensure the Elasticsearch instance’s health. Also, this container has a dependency on the api container, so docker compose will wait for the api container to start before starting the Elasticsearch container.

After docker compose has been executed and the Elasticsearch container has started, the container will begin its health check. The health check process usually takes ~1 minute until the Elasticsearch container is healthy, because the Elasticsearch instance is still booting inside the container.
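To check on its progress, you can inspect the health status that Docker reports, or simply cURL port 9200 (once Elasticsearch is up, it answers with basic cluster information):

# Show only the health status reported by the Docker health check
docker inspect --format '{{.State.Health.Status}}' elasticsearch

# Once healthy, Elasticsearch responds with its cluster info
curl http://localhost:9200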

This is the Elasticsearch container while it’s still performing its health check; you can see the indicator beside Up x seconds:

Elasticsearch container still performing its health check

And this is the Elasticsearch container when it’s healthy:

Elasticsearch container is already healthy

Now, we can try a simple CRUD transaction on the Elasticsearch container that has been set up. First, we will insert a document into Elasticsearch using cURL. Here is the command to insert a document into the Elasticsearch database:

curl --location 'http://localhost:9200/<index_name>/_doc' \
--header 'Content-Type: application/json' \
--data '{
"title": "One",
"tags": ["ruby"]
}'

This command creates a document with the attributes title: "One" and tags: ["ruby"]; you can replace <index_name> with your desired index. This command also creates a new index if the specified index name does not exist yet. To get the data that has been created, use this command:

curl --location 'http://localhost:9200/<index_name>/_search'

You may replace <index_name> with your desired index name. The command should produce this output:

Elasticsearch GET data output
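To round out the CRUD operations, you can also update or delete the document using its _id from the _search response (the <document_id> placeholder below comes from that response):

# Partially update the document
curl --location --request POST 'http://localhost:9200/<index_name>/_update/<document_id>' \
--header 'Content-Type: application/json' \
--data '{ "doc": { "title": "Two" } }'

# Delete the document
curl --location --request DELETE 'http://localhost:9200/<index_name>/_doc/<document_id>'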

Setting up Kibana

After setting up the Elasticsearch instance, the next step is to configure a graphical user interface (GUI) to visualize the Elasticsearch data. This is where Kibana comes into play. Similar to the Elasticsearch instance that has already been set up and configured, we won’t enable SSL for this Kibana instance. So, let’s start setting up and configuring the Kibana instance right now!

Remember that we already set up a .env file earlier? Well, we need to add a few new variables to it. Here are the variables that must be added to the existing .env file:

KIBANA_USERNAME=kibana_system
KIBANA_PASSWORD=<your kibana password>
ELASTIC_HOSTS=http://<elasticsearch container name>:9200

An additional note regarding the variables in the .env file: I use kibana_system as the value for KIBANA_USERNAME because Kibana needs to perform background tasks that require the kibana_system user.

I used Kibana version 8.8.1 because the Elasticsearch version is also 8.8.1. You can obtain the Docker image here, but I’m specifying the image name in my docker-compose.yaml file, as Docker will automatically fetch the image if a matching image name and tag exist on Docker Hub. Therefore, here is the docker-compose YAML file for setting up and configuring the Kibana instance:

...
services:

  ...

  kibana:
    container_name: kibana
    image: kibana:8.8.1
    ports:
      - "5601:5601"
    environment:
      - ELASTICSEARCH_HOSTS=${ELASTIC_HOSTS}
      - ELASTICSEARCH_USERNAME=${KIBANA_USERNAME}
      - xpack.security.enabled=false
    depends_on:
      elasticsearch:
        condition: service_healthy
    networks:
      - elastic

...

Let me briefly explain some things from the Kibana configuration in the YAML file. As above, I assign a name to the container because it’s easier to debug and troubleshoot, and it makes managing communication between containers easier. I exposed port 5601 because that port serves the Kibana web application.

Now, moving on to the environment settings, ELASTICSEARCH_HOSTS is an environment variable that points to the Elasticsearch endpoint, because Kibana will fetch its data from Elasticsearch. I use ${ELASTIC_HOSTS} as its value because I already defined it in .env. For ELASTICSEARCH_USERNAME, I likewise reference an environment variable and set it to ${KIBANA_USERNAME}. Lastly, I set the xpack.security.enabled=false option to disable the need for certificates and SSL settings on Kibana.

This container has a dependency on the elasticsearch container because it fetches the required data from Elasticsearch. Additionally, the dependency requires the elasticsearch container to be healthy. This container is also on the same network as the other containers.

Now, we can execute the docker compose command again. The annoying thing is that we have to wait until the elasticsearch container is healthy before this container starts. It took ~1 minute for this docker compose configuration to start (and another 2 or 3 minutes for the Kibana web application to be ready to serve).

An additional note: in some cases, if you open localhost:5601 too early, you will probably see this error message. It is caused by the Kibana server not yet being ready to serve the web application.

The error message usually happens if you open the Kibana web application too early

To anticipate this, you can check the logs of the current Kibana instance with the docker logs kibana command. If there’s no [http,server,kibana] http server running at http://0.0.0.0:5601 log line, the Kibana web application is not yet ready to serve.

The specified log
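If you prefer checking from the command line, a simple grep over the container logs works (this just filters for the log line mentioned above):

# Keep checking until the "http server running" line shows up
docker logs kibana 2>&1 | grep "http server running"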

After the whole docker compose execution is complete, you can open localhost:5601 in your browser. The UI should look like this:

Kibana appearance
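Kibana also exposes a status endpoint on the same port, which is handy if you want to check readiness without a browser (a minimal check; the response is a JSON document describing the server status):

# Query Kibana's status API
curl http://localhost:5601/api/status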

Setting up Filebeat

This may be the last piece of the logging infrastructure if we want a bare-minimum setup that works. After setting up Elasticsearch as the database and Kibana as a GUI tool for inspecting the Elasticsearch data, we will set up an agent that harvests the application’s log files and sends them to Elasticsearch. Let’s get started with Filebeat right now!

For the Filebeat configuration, we don’t need to add any new variables to our .env file because we can reuse the environment variables that have already been set. So without further ado, let’s continue setting up our Filebeat instance.

In order to set up the Filebeat instance, we need to create filebeat.yml first, since Filebeat’s configuration lives in this file. In this part, we will only configure which logs should be harvested and the output settings. Here is the filebeat.yml that is used:

filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        - condition:
            contains:
              docker.container.name: api
          config:
            - type: container
              paths:
                - /var/lib/docker/containers/${data.docker.container.id}/*.log

setup.template.name: "filebeat-logs"
setup.template.pattern: "filebeat-logs"

output.elasticsearch:
  hosts: ${ELASTICSEARCH_HOSTS}
  index: "filebeat-logs"

Let me explain the filebeat.yml file for a bit. filebeat.autodiscover is a setting that tracks changes in the system; for further details, you can refer here. I set docker as the provider for Filebeat autodiscover, so Filebeat will only watch Docker containers. Additionally, I set the condition to contains: docker.container.name: api, because I only want to watch changes in the api Docker container. Finally, config specifies the files that Filebeat will watch; whenever those files change, Filebeat ships the changes to the specified output.

As for the output, it’s fairly simple. A Filebeat instance has many options for output, but a single Filebeat instance can’t have multiple outputs. You can refer here regarding Filebeat outputs. We will only output to Elasticsearch, hence the output.elasticsearch setting. Within that setting there are hosts and index. hosts is the URL that Filebeat will ship its data to. index is the name of the Elasticsearch index that will contain the data from Filebeat; if we don’t set index, the index name defaults to filebeat-[version]. If you want to change the index name, you’ll have to set setup.template.name and setup.template.pattern too. I set setup.template.name and setup.template.pattern to the same name as the index.

Next, I will continue setting up the Filebeat instance in docker-compose.yaml. I used Filebeat version 8.8.1, the same as the Kibana and Elasticsearch versions, to avoid version mismatch errors. You can obtain the Docker image for Filebeat 8.8.1 here, but I’m specifying the image name in my docker-compose.yaml file, as Docker will automatically fetch the image if a matching image name and tag exist on Docker Hub. Therefore, here is the docker-compose YAML file for setting up and configuring the Filebeat instance:

services:
  ...
  filebeat:
    container_name: filebeat
    image: elastic/filebeat:8.8.1
    user: root
    environment:
      - ELASTICSEARCH_HOSTS=${ELASTIC_HOSTS}
    volumes:
      - "./filebeat.yml:/usr/share/filebeat/filebeat.yml:ro"
      - "/var/lib/docker/containers:/var/lib/docker/containers:ro"
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
    depends_on:
      elasticsearch:
        condition: service_healthy
    networks:
      - elastic
  ...

Allow me to explain some things from the Filebeat configuration in the docker-compose.yaml file. As with Elasticsearch and Kibana earlier, I assign a name to the container because it’s easier to debug and troubleshoot, and it makes communication between containers easier. This container doesn’t need any port to be exposed, so unlike Elasticsearch or Kibana, I didn’t expose one.

I set user: root because I want to run Filebeat as the root user; Filebeat must run as root to be able to read all the logs generated by the application. Moving on to the environment section, I set ELASTICSEARCH_HOSTS=${ELASTIC_HOSTS} so that filebeat.yml can refer to this environment variable. For the Docker volumes, I bind the filebeat.yml configuration file to /usr/share/filebeat/filebeat.yml:ro, where the ro at the end of the mount means read-only. I also bind my Docker containers directory and the Docker socket into the Filebeat container, because Filebeat needs to read and watch changes in the Docker system. To read more about bind mounts on Docker, please refer to this. Lastly, I also set the dependency, requiring the elasticsearch container to be healthy, and this container lives in the same network as the other containers.

Now, the folder structure of this project should be like this:

.
|- docker-compose.yaml
|- filebeat.yml
|- .env

And the full docker-compose.yaml is as follows:

version: "3"

services:
  api:
    container_name: api
    image: arceister/go-log-example:latest
    ports:
      - "8080:8080"
    networks:
      - elastic

  elasticsearch:
    container_name: elasticsearch
    image: elasticsearch:8.8.1
    ports:
      - "9200:9200"
    environment:
      - ELASTIC_PASSWORD=${ELASTICSEARCH_PASSWORD}
      - xpack.security.enabled=false
      - discovery.type=single-node
      - "ES_JAVA_OPTS=-Xmx1g -Xms1g"
    healthcheck:
      test:
        [
          "CMD-SHELL",
          "curl -f localhost:9200"
        ]
      interval: 5s
      timeout: 10s
      retries: 120
    depends_on:
      - api
    networks:
      - elastic

  kibana:
    container_name: kibana
    image: kibana:8.8.1
    ports:
      - "5601:5601"
    environment:
      - ELASTICSEARCH_HOSTS=${ELASTIC_HOSTS}
      - ELASTICSEARCH_USERNAME=${KIBANA_USERNAME}
      - xpack.security.enabled=false
    depends_on:
      elasticsearch:
        condition: service_healthy
    networks:
      - elastic

  filebeat:
    container_name: filebeat
    image: elastic/filebeat:8.8.1
    user: root
    environment:
      - ELASTICSEARCH_HOSTS=${ELASTIC_HOSTS}
    volumes:
      - "./filebeat.yml:/usr/share/filebeat/filebeat.yml:ro"
      - "/var/lib/docker/containers:/var/lib/docker/containers:ro"
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
    depends_on:
      elasticsearch:
        condition: service_healthy
    networks:
      - elastic

networks:
  elastic:

Additionally, the full .env file is as follows:

ELASTICSEARCH_PASSWORD=<your-elasticsearch-password>

KIBANA_USERNAME=kibana_system
KIBANA_PASSWORD=<your-kibana-password>
ELASTIC_HOSTS=http://elasticsearch:9200

So, let’s execute our docker-compose.yaml file. Again, it takes ~1 minute for all containers to be healthy and started (2 or 3 minutes if we also wait for the Kibana web application to be ready). After all containers are healthy and running, open Kibana in your browser and go to the Discover section; it should show this:

Discover section on Kibana

Click on Create Data View, fill in filebeat-logs for both Index pattern and Name (the name is optional, but I prefer it to match the Index pattern), and click Save data view to Kibana. Now, try hitting localhost:8080; the logs of the application should show up in Kibana.

Logs of the application
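You can also confirm from the command line that the filebeat-logs index is receiving documents (the index name matches what we set in filebeat.yml):

# List the index and its document count
curl 'http://localhost:9200/_cat/indices/filebeat-logs?v'

# Fetch a single log document from the index
curl 'http://localhost:9200/filebeat-logs/_search?size=1'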

I think that’s it. This article only covers the setup and configuration of the log infrastructure up to the point where Elasticsearch, Kibana, and Filebeat are configured. In the next part, I will continue with Logstash! If there are any questions or suggestions, or if there’s something wrong with this article, feel free to comment. Hope you have a nice day!


Jagad Nabil Tuah Imanda

Software Engineer @ Tokopedia. Interested in server-side and software infrastructure topics.