Creating Log Infrastructure with Elastic Stack and Docker Compose (Part 1)
Logging plays a crucial role in any system. It enables error tracing, troubleshooting, performance analysis, and much more. As a codebase or system grows in complexity, you can’t rely on manual tracing or debugging methods such as fmt.Println() or console.log() anymore. Large codebases and systems also tend to have intermittent errors (errors that occur seemingly at random) or intermittent spikes in latency or execution time. To identify and trace these issues, we need to navigate through the logs and find out when they occurred.
Elastic Stack is one of the options for building a logging infrastructure. The Elastic Stack usually contains Elasticsearch, Kibana, Beats, and Logstash, each with a different role: Elasticsearch is a NoSQL database commonly used as a search engine; Kibana is a data visualization and exploration tool, commonly used to visualize data from Elasticsearch; Beats are lightweight agents that ship data to Elasticsearch or Logstash; and Logstash is a log aggregator that accepts logs from Beats or other sources, processes them, and sends them to Elasticsearch.
In this article, I will demonstrate how to set up an Elastic Stack logging infrastructure with Docker Compose. The reason I use Docker Compose is simply that I don’t want my virtual machine/server to get dirty (removing dependencies and software is a pain for me). So, let’s get started!
Setting up the Application
To simulate an application that generates a log entry each time it’s accessed, I’ve developed a simple API in Go. To access the code of the API, please see this repository. The repository also includes a Dockerfile, making it effortless to build the Docker image and have the application up and running in no time!
To build the Docker image, use the docker build . command. You can name and tag the image with the -t <name>:<tag> (or --tag) flag. After the image is successfully built, you can check it with the docker images command.
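For example, a complete build and check might look like this (the image name go-log-example is just an illustration; use whatever name you prefer):
docker build -t go-log-example:latest .
docker images go-log-example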
Next, we need to run the application image that has been created. This can be accomplished either with the docker run command or with docker compose. Personally, I prefer the docker compose approach as it allows me to prepare the Elastic Stack instance settings alongside it.
If you prefer using docker run, you can use this command to run the image:
docker run --name <your desired container name> -p <port>:8080 -d <image name>:<tag>
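With the placeholders filled in, it could look like this (the container name and host port here are just examples):
docker run --name api -p 8080:8080 -d arceister/go-log-example:latest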
Or, if you want to set it up using docker compose, this is the docker compose YAML file:
version: "3"
services:
api:
container_name: api
image: arceister/go-log-example:latest
ports:
- "8080:8080"
networks:
- elastic
networks:
elastic:
Then run the docker compose up -d command. If you followed either approach, the container should be running now. You can check the container status with docker ps.
Next, we need to verify whether the application is actually running. To do this, we will use cURL to connect to our localhost (or local server) on the exposed port. The expected response should be:
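A minimal sketch of that check, assuming the API responds on its root path (the exact endpoint depends on the code in the repository):
curl -i http://localhost:8080/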
After connecting to our localhost using cURL, the server should log the connections we initiate. You can view the container logs with the docker logs <container name> command. The log entry we want to see is highlighted in the red box.
Setting up Elasticsearch
Next, we want to set up and configure Elasticsearch as our log database. The Elasticsearch instance we are going to configure will have only a single node, and I will not enable SSL for it. So, let’s set up and configure the Elasticsearch instance right now!
You might want to create a .env file first. I will create a .env file to hold “secret” values such as the Elasticsearch username, the Elasticsearch password, and so on. This is optional, but I highly recommend doing it. This is the .env file for the Elasticsearch instance that will be set up and configured:
ELASTICSEARCH_PASSWORD=<enter your desired password>
I’m using Elasticsearch version 8.8.1 for this. You can obtain the image here, but I’m specifying the image name in my docker-compose.yaml file, as Docker will automatically fetch the image if a matching image name and tag exist on Docker Hub. Here is the docker compose YAML file for setting up and configuring the Elasticsearch instance:
...
services:
  ...
  elasticsearch:
    container_name: elasticsearch
    image: elasticsearch:8.8.1
    ports:
      - "9200:9200"
    environment:
      - ELASTIC_PASSWORD=${ELASTICSEARCH_PASSWORD}
      - xpack.security.enabled=false
      - discovery.type=single-node
      - "ES_JAVA_OPTS=-Xmx1g -Xms1g"
    healthcheck:
      test:
        [
          "CMD-SHELL",
          "curl -f localhost:9200"
        ]
      interval: 5s
      timeout: 10s
      retries: 120
    depends_on:
      - api
    networks:
      - elastic
...
I want to explain a few things from the Elasticsearch configuration in the YAML file. I assign a name to the container because it makes the container easier to debug and troubleshoot, and it also makes communication between containers easier to manage. Regarding the port configuration, I exposed only 9200 because I didn’t set up multiple nodes. Port 9200 handles HTTP requests for CRUD actions on the Elasticsearch database, so whenever we want to get, create, update, or delete data in Elasticsearch, we use this port.
Now, moving on to the environment settings: I already set the Elasticsearch user password in the .env file, so I’m just referring to the value that is already set in that environment variable. Next, I set the xpack.security.enabled=false option to disable security (and SSL) on Elasticsearch, so I don’t need to set up certificates. I also set discovery.type=single-node to specify that the Elasticsearch instance runs as a single node. Lastly, I defined ES_JAVA_OPTS=-Xmx1g -Xms1g to limit the JVM heap size allocated to Elasticsearch (I ran into a memory issue where the JVM heap was using half of my machine’s memory, hence the limit).
For health checking, I perform a GET request to localhost:9200 at regular intervals and enforce a timeout to ensure the Elasticsearch instance is healthy. This container also depends on the api container, so docker compose will wait for the api container to start before starting this Elasticsearch container.
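You can run the same probe by hand once the container is up; it is simply the command from the healthcheck section issued against the exposed port:
curl -f http://localhost:9200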
After docker compose has been executed and the Elasticsearch container has started, the container performs its health check. This usually takes about a minute until the Elasticsearch container is reported healthy, because the Elasticsearch instance is still booting inside the container.
This is the Elasticsearch container while it’s still performing its health check; you can see the health status beside Up x seconds:
And this is the Elasticsearch container when it’s healthy:
Now, we can try a simple CRUD transaction on the Elasticsearch container we just set up. First, we will insert a document into Elasticsearch with cURL. Here is the command to insert a document into the Elasticsearch database:
curl --location 'http://localhost:9200/<index_name>/_doc' \
  --header 'Content-Type: application/json' \
  --data '{
    "title": "One",
    "tags": ["ruby"]
  }'
This command creates a document with the attributes title: "One" and tags: ["ruby"]; you can replace <index_name> with your desired index name. The command also creates a new index if the specified index does not exist yet. To get the data that has been created, use this command:
curl --location 'http://localhost:9200/<index_name>/_search'
You may replace <index_name> with your desired index name. The command should produce this output:
Setting up Kibana
After setting up the Elasticsearch instance, the next step is to configure a graphical user interface (GUI) to visualize the Elasticsearch data. This is where Kibana comes into play. As with the Elasticsearch instance we already set up, we won’t enable SSL for this Kibana instance. So, let’s set up and configure the Kibana instance right now!
Remember the .env file we created earlier? We need to add a few new variables to it. Here are the variables that must be added to the existing .env file:
KIBANA_USERNAME=kibana_system
KIBANA_PASSWORD=<your kibana password>
ELASTIC_HOSTS=http://<elasticsearch container name>:9200
A note regarding the variables in the .env file: I use kibana_system as the value for KIBANA_USERNAME because Kibana needs to perform background tasks that require the kibana_system user.
I used Kibana version 8.8.1 because the Elasticsearch version is also 8.8.1. You can obtain the Docker image here, but I’m specifying the image name in my docker-compose.yaml file, as Docker will automatically fetch the image if a matching image name and tag exist on Docker Hub. Here is the docker-compose YAML file for setting up and configuring the Kibana instance:
...
services:
  ...
  kibana:
    container_name: kibana
    image: kibana:8.8.1
    ports:
      - "5601:5601"
    environment:
      - ELASTICSEARCH_HOSTS=${ELASTIC_HOSTS}
      - ELASTICSEARCH_USERNAME=${KIBANA_USERNAME}
      - xpack.security.enabled=false
    depends_on:
      elasticsearch:
        condition: service_healthy
    networks:
      - elastic
...
Let me briefly explain a few things from the Kibana configuration in the YAML file. Same as above, I assign a name to the container because it makes debugging and troubleshooting easier, and, of course, it makes communication between containers easier to manage. I exposed port 5601 because that port serves the Kibana web application.
Now, moving on to the environment settings: ELASTICSEARCH_HOSTS is an environment variable that points to the Elasticsearch endpoint, because Kibana will fetch its data from Elasticsearch. I set ${ELASTIC_HOSTS} as its value because that value is already defined in .env. For ELASTICSEARCH_USERNAME, I likewise reference an environment variable and set it to ${KIBANA_USERNAME}. Lastly, I set the xpack.security.enabled=false option to disable the need for certificates and SSL settings on Kibana.
This container depends on the elasticsearch container because it fetches the required data from elasticsearch; additionally, the dependency requires the elasticsearch container to be healthy. This container is also on the same network as the other containers.
Now, we can execute the docker compose command again. The annoying part is that we have to wait for the elasticsearch container to become healthy before this container starts. It took about a minute for this docker compose configuration to start (and another 2 or 3 minutes for the Kibana web application to be ready to serve).
An additional note: in some cases, if you open localhost:5601 too early, you will probably see this error message. The error occurs because the Kibana server is not yet ready to serve the web application.
To anticipate this, you can check the logs of the current Kibana instance with the docker logs kibana command. If there’s no [http,server,kibana] http server running at http://0.0.0.0:5601 log line, the Kibana web application is not yet ready to serve.
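Alternatively (an extra check that isn’t part of the original setup), Kibana exposes a status endpoint you can poll until the server reports that it is available:
curl -s http://localhost:5601/api/status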
After the whole docker compose execution is complete, you can open localhost:5601 in your browser. The UI of the website should look like this:
Setting up Filebeat
This may be the last piece of the puzzle if we want a bare-minimum logging infrastructure that works. After setting up Elasticsearch as the database and Kibana as a GUI tool to inspect the Elasticsearch data, we will set up an agent that harvests the application’s log files and ships them to Elasticsearch. Let’s get started with Filebeat right now!
For the Filebeat configuration, we don’t have to add any new variables to our .env file because we can reuse the environment variables that have already been set. So, without further ado, let’s continue setting up our Filebeat instance.
To set up the Filebeat instance, we first need to write filebeat.yml, since Filebeat’s configuration relies on it. In this article, we will only configure which logs are harvested and the output settings. Here is the filebeat.yml that is used:
filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        - condition:
            contains:
              docker.container.name: api
          config:
            - type: container
              paths:
                - /var/lib/docker/containers/${data.docker.container.id}/*.log

setup.template.name: "filebeat-logs"
setup.template.pattern: "filebeat-logs"

output.elasticsearch:
  hosts: ${ELASTICSEARCH_HOSTS}
  index: "filebeat-logs"
Let me explain the filebeat.yml file for a bit. filebeat.autodiscover is a setting that tracks changes in the system; for further details, you can refer here. I set docker as the provider for Filebeat autodiscover, so Filebeat will only watch Docker containers. Additionally, I set the condition to contains: docker.container.name: api, because I only want to watch changes from the api Docker container. Finally, config specifies the files that Filebeat will watch; whenever one of those files changes, Filebeat ships the changes to the specified output.
As for the output, it’s fairly simple. A Filebeat instance has many output options, but a single Filebeat instance can’t have multiple outputs; you can refer here regarding Filebeat outputs. We will only send the output to Elasticsearch, hence the output.elasticsearch setting. Within that setting, there are hosts and index. hosts is the URL that Filebeat will ingest data into, and index is the name of the Elasticsearch index that will contain the data from Filebeat; if we don’t set index, the index name defaults to filebeat-[version]. If you want to change the index name, you also have to set setup.template.name and setup.template.pattern. I set setup.template.name and setup.template.pattern to the same name as the index.
Next, I will set up the Filebeat instance in docker-compose.yaml. I used Filebeat version 8.8.1, the same as the Kibana and Elasticsearch versions, to avoid version-mismatch errors. You can obtain the Docker image for Filebeat 8.8.1 here, but I’m specifying the image name in my docker-compose.yaml file, as Docker will automatically fetch the image if a matching image name and tag exist on Docker Hub. Here is the docker-compose YAML file for setting up and configuring the Filebeat instance:
services:
  ...
  filebeat:
    container_name: filebeat
    image: elastic/filebeat:8.8.1
    user: root
    environment:
      - ELASTICSEARCH_HOSTS=${ELASTIC_HOSTS}
    volumes:
      - "./filebeat.yml:/usr/share/filebeat/filebeat.yml:ro"
      - "/var/lib/docker/containers:/var/lib/docker/containers:ro"
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
    depends_on:
      elasticsearch:
        condition: service_healthy
    networks:
      - elastic
...
Allow me to explain some things from the Filebeat instance configuration in the docker-compose.yaml file. As with Elasticsearch and Kibana earlier, I assign a name to the container because it makes debugging and troubleshooting easier, and it also simplifies communication between containers. This container doesn’t need any port to be exposed, so unlike Elasticsearch or Kibana, I didn’t expose one.
I set user: root because I want to run Filebeat as the root user; Filebeat must run as root to be able to read all the logs generated by the application. Moving on to the environment section, I set ELASTICSEARCH_HOSTS=${ELASTIC_HOSTS} so that filebeat.yml can refer to this environment variable. For the Docker volumes, I bind the filebeat.yml configuration file to /usr/share/filebeat/filebeat.yml:ro, where the ro at the end means read-only. I also bind my Docker containers directory and the Docker socket into the Filebeat container, because Filebeat needs to read and watch Docker system changes. To read more about bind mounts in Docker, please refer to this. Lastly, I also set the dependency requiring the elasticsearch container to be healthy, and this container is on the same network as the other containers.
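As an optional sanity check (not required for the setup to work), Filebeat ships test subcommands you can run inside the container to verify the mounted configuration and the connection to Elasticsearch:
docker exec filebeat filebeat test config
docker exec filebeat filebeat test output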
Now, the folder structure of this project should be like this:
--
|- docker-compose.yaml
|- filebeat.yml
|- .env
And the full docker-compose.yaml is as follows:
version: "3"
services:
api:
container_name: api
image: arceister/go-log-example:latest
ports:
- "8080:8080"
networks:
- elastic
elasticsearch:
container_name: elasticsearch
image: elasticsearch:8.8.1
ports:
- "9200:9200"
environment:
- ELASTIC_PASSWORD=${ELASTICSEARCH_PASSWORD}
- xpack.security.enabled=false
- discovery.type=single-node
- "ES_JAVA_OPTS=-Xmx1g -Xms1g"
healthcheck:
test:
[
"CMD-SHELL",
"curl -f localhost:9200"
]
interval: 5s
timeout: 10s
retries: 120
depends_on:
- api
networks:
- elastic
kibana:
container_name: kibana
image: kibana:8.8.1
ports:
- "5601:5601"
environment:
- ELASTICSEARCH_HOSTS=${ELASTIC_HOSTS}
- ELASTICSEARCH_USERNAME=${KIBANA_USERNAME}
- xpack.security.enabled=false
depends_on:
elasticsearch:
condition: service_healthy
networks:
- elastic
filebeat:
container_name: filebeat
image: elastic/filebeat:8.8.1
user: root
environment:
- ELASTICSEARCH_HOSTS=${ELASTIC_HOSTS}
volumes:
- "./filebeat.yml:/usr/share/filebeat/filebeat.yml:ro"
- "/var/lib/docker/containers:/var/lib/docker/containers:ro"
- "/var/run/docker.sock:/var/run/docker.sock:ro"
depends_on:
elasticsearch:
condition: service_healthy
networks:
- elastic
networks:
elastic:
Additionally, the full .env file is as follows:
ELASTICSEARCH_PASSWORD=<your-elasticsearch-password>
KIBANA_USERNAME=kibana_system
KIBANA_PASSWORD=<your-kibana-password>
ELASTIC_HOSTS=http://elasticsearch:9200
So, let’s execute our docker-compose.yaml file. Again, it takes about a minute for all containers to be healthy and started (2 or 3 minutes if we wait for the Kibana web application to be ready). After all the containers are healthy and running (optionally, wait for the Kibana web application to be ready), open Kibana in your browser and go to the Discover section; it should show this:
Click on Create data view, fill in filebeat-logs for both Index pattern and Name (the name is optional, but I prefer to have it match the Index pattern), and click Save data view to Kibana. Now, try to hit localhost:8080; the application logs should show up in Kibana.
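If nothing shows up, a quick way to confirm that logs are reaching Elasticsearch is to generate a few requests and query the filebeat-logs index directly (the API’s root path is an assumption here):
curl http://localhost:8080/
curl 'http://localhost:9200/filebeat-logs/_search?pretty'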
I think that’s it; this article only covers the setup and configuration of the logging infrastructure up to the point where Elasticsearch, Kibana, and Filebeat are configured. In the next part, I will continue with Logstash! If you have any questions or suggestions, or if there’s something wrong with this article, feel free to comment. Hope you have a nice day!