Monitoring Kafka with Prometheus

Introduction

Monitoring Kafka using Prometheus is straightforward. In this blog entry, we will see how to monitor the state of Kafka brokers and how to monitor consumer lag on Kafka topics.

So far we have used pre-built exporters for Linux and Docker, and the JMX exporter for Cassandra. In this blog, we will use a combination of the JMX exporter and a pre-built exporter to monitor Kafka.

Kafka’s metrics are well documented and are available on this link. This blog assumes that you have a working Kafka cluster. In case you want to install Kafka, head over to this link, which has clear step-by-step instructions. For the purpose of this blog, we have a three-node cluster on AWS.

It has two topics. See below.

The blog is divided into the following sections:

  • Download & Install JMX exporter
  • Configure JMX exporter for Kafka
    • Configure JMX exporter
    • Configure Kafka
    • Check metrics
  • Configure Prometheus server
  • Query Prometheus using PromQL
  • Grafana Dashboard for Kafka Brokers
  • Download Kafka Lag exporter
  • Configure Kafka Lag exporter
  • Query Kafka lag metrics using PromQL
  • Grafana Dashboard for Kafka Lag exporter

Download & Install JMX exporter

Step 1 – The JMX exporter can be downloaded from the Maven repository. Use this link. On Linux, use wget as shown below.

wget https://search.maven.org/remotecontent?filepath=io/prometheus/jmx/jmx_prometheus_ja

Step 2 – Once downloaded, the jar needs to be placed alongside the other Kafka jars. The usual place for this is $KAFKA_HOME/libs
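
As a rough sketch of Step 2, the copy can be wrapped in a small shell function that refuses to run if the jar or the libs directory is missing. The function name install_agent is my own and not part of Kafka or the JMX exporter; the jar name in the usage line is taken from the KAFKA_OPTS setting used later in this post.

```shell
# install_agent JAR KAFKA_HOME
# Hypothetical helper: copies the JMX exporter jar next to the other Kafka jars.
install_agent() {
  jar="$1"
  kafka_home="$2"
  # Fail early with a clear message instead of a bare cp error.
  [ -f "$jar" ] || { echo "jar not found: $jar" >&2; return 1; }
  [ -d "$kafka_home/libs" ] || { echo "no libs directory under: $kafka_home" >&2; return 1; }
  cp "$jar" "$kafka_home/libs/"
}

# Usage (assumes KAFKA_HOME is set and the jar is in the current directory):
#   install_agent jmx_prometheus_javaagent-0.13.0.jar "$KAFKA_HOME"
```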

Configure JMX exporter for Kafka

This is probably the most important part of this blog. But don’t worry, it is not difficult! You just need to know which files to copy or modify. There are just two of them, as always 🙂

Note: The steps for JMX exporter need to be performed on all the brokers.

Configure JMX exporter

Kafka exposes a lot of metrics and they are really well-documented here. For the JMX exporter to expose these metrics to Prometheus, it needs to know a few things:

  • Which metrics to scrape and which “NOT” to scrape
  • Rules around the naming of those metrics

Naming and filtering of the metrics is done with regular expressions, configured in a YAML file. There is already a sample configuration file to get us started, and it is available on this link.

I downloaded the sample and stored my JMX exporter configuration in $KAFKA_HOME/config/jmx_exporter.yml
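
To give an idea of what such a file looks like, here is a minimal sketch. It is not the sample file from the link. It writes a small jmx_exporter.yml into the current directory, from where it can be moved to $KAFKA_HOME/config. The two rules are modelled on the generic rules in the JMX exporter's example Kafka configuration; the whitelist and blacklist entries are illustrative assumptions and should be adapted to the metrics you actually need.

```shell
# Write a minimal JMX exporter configuration (a sketch, not the official sample).
# The quoted 'EOF' keeps $1, $2, $3 literal so the exporter sees them as regex groups.
cat > jmx_exporter.yml <<'EOF'
lowercaseOutputName: true

# Which MBeans to scrape (illustrative choice).
whitelistObjectNames:
  - "kafka.server:*"
  - "kafka.controller:*"

# Which MBeans "NOT" to scrape (illustrative choice).
blacklistObjectNames:
  - "kafka.server:type=FetcherLagMetrics,*"

# Naming rules: map JMX bean names to Prometheus metric names.
rules:
  # Rate-style beans become counters, e.g. ...MessagesInPerSec -> ..._messagesin_total
  - pattern: kafka.(\w+)<type=(.+), name=(.+)PerSec\w*><>Count
    name: kafka_$1_$2_$3_total
    type: COUNTER
  # Plain "Value" attributes become gauges.
  - pattern: kafka.(\w+)<type=(.+), name=(.+)><>Value
    name: kafka_$1_$2_$3
    type: GAUGE
EOF
```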

Configure Kafka broker

Integrating Kafka and the JMX exporter is easy and requires adding only one line! 😉 Here goes

Step 1 – Go to the directory that contains the Kafka start scripts
In the standard Kafka distribution it is $KAFKA_HOME/bin

Step 2 – Edit kafka-server-start.sh

Add the following line to the shell script, before the final exec line

export KAFKA_OPTS=' -javaagent:/home/ec2-user/kafka_2.11-2.4.1/libs/jmx_prometheus_javaagent-0.13.0.jar=7071:/home/ec2-user/kafka_2.11-2.4.1/config/jmx_exporter.yml'

Note: If you already have the KAFKA_OPTS environment variable defined, append the Java agent setting to the existing value instead of overwriting it.
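
One way to do that, as a sketch: build the agent argument from variables and append it to whatever KAFKA_OPTS already contains. The paths and the port are the ones from the export line above; the variable names AGENT_JAR, AGENT_CONFIG and AGENT_PORT are my own.

```shell
# Paths and port copied from the export line above; adjust to your installation.
AGENT_JAR=/home/ec2-user/kafka_2.11-2.4.1/libs/jmx_prometheus_javaagent-0.13.0.jar
AGENT_CONFIG=/home/ec2-user/kafka_2.11-2.4.1/config/jmx_exporter.yml
AGENT_PORT=7071

# ${KAFKA_OPTS:-} expands to the existing value, or to nothing if the variable is unset,
# so options that were already defined are preserved.
export KAFKA_OPTS="${KAFKA_OPTS:-} -javaagent:${AGENT_JAR}=${AGENT_PORT}:${AGENT_CONFIG}"

echo "$KAFKA_OPTS"
```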

See the screenshot below

That’s it! As promised, a one-line change. Now start (or restart) Kafka so that the Java agent is loaded.

Check metrics

Go to the following URL – http://<Kafka-Broker>:7071 – It should now show you the metrics. See below.
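
The same check can be done from the command line. The sketch below defines a small filter that drops the "# HELP" and "# TYPE" comment lines and keeps only the samples whose name starts with a given prefix. The function name filter_metrics and the hostname in the usage line are placeholders of my own.

```shell
# filter_metrics PREFIX
# Reads Prometheus text-format metrics on stdin and prints only the sample
# lines (no "#" comment lines) whose metric name starts with PREFIX.
filter_metrics() {
  grep -v '^#' | grep "^$1"
}

# Usage against a live broker (hostname is a placeholder):
#   curl -s "http://kafka-broker-1:7071/metrics" | filter_metrics kafka_server
```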

Configure Prometheus Server

For the Prometheus server to scrape metrics from the Kafka brokers, additional scrape configuration needs to be added to prometheus.yml. See below
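
A minimal sketch of the scrape configuration for a three-node cluster could look like the following. The job name kafka and the broker hostnames are placeholders of my own; port 7071 is the JMX exporter port configured earlier. The snippet is written to a separate file so that it does not overwrite an existing prometheus.yml; merge the job into the scrape_configs section of your own file.

```shell
# Write an example scrape job for the three brokers (hostnames are placeholders).
cat > prometheus-kafka-job.yml <<'EOF'
scrape_configs:
  - job_name: "kafka"
    static_configs:
      - targets:
          - "kafka-broker-1:7071"
          - "kafka-broker-2:7071"
          - "kafka-broker-3:7071"
EOF
```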

It’s time to start the Prometheus server. If you need more information on how to install/run/configure Prometheus server please refer to this blog entry.

./prometheus --config.file="prometheus.yml" --storage.tsdb.retention.time=400d --storage.tsdb.path="data/"

You can check whether the Prometheus server is able to scrape the metrics by navigating to the Prometheus UI on http://<Prometheus-Host>:<Prometheus-Port>/graph

Prometheus can also show you whether it is scraping metrics from the Kafka brokers

On the Prometheus UI, go to Status->Targets

This brings up something similar to the screenshot below

Query Prometheus using PromQL

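The metrics from the JMX exporter can be queried like any other metrics. Type in the up metric. Prometheus records this metric for every scrape target, so it is available for the JMX exporter targets by default; a value of 1 means the last scrape succeeded.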
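
The same query can also be run from the shell with promtool, which ships in the Prometheus download next to the prometheus binary. This is only a sketch: the function name check_up, the server URL and the job label value kafka are assumptions and must match your own setup.

```shell
# Assumption: Prometheus listens on localhost:9090; override PROM_URL otherwise.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# check_up JOB
# Runs an instant query for the up metric of one scrape job.
# Each healthy target should report the value 1.
check_up() {
  ./promtool query instant "$PROM_URL" "up{job=\"$1\"}"
}

# Usage (run from the Prometheus directory; the job name is a placeholder):
#   check_up kafka
```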

Grafana Dashboard for Kafka Brokers

So our Prometheus server is now able to scrape Kafka broker metrics. It’s time to import a Grafana dashboard for Kafka brokers. For the purpose of this blog entry, I am going to import a dashboard on this link

To import a Grafana dashboard, follow these steps

Step 1 – Press the + button as shown below

Step 2 – You can import either by typing the ID that the Grafana website assigns to the dashboard or by pasting the JSON directly. I have decided to just type in the ID. See below

Step 3 – Select the data source and folder name. Press import.

The dashboard is ready!

This works across multiple nodes. You can now add/change/remove charts to suit your requirements. You can explore various individual metrics and come up with something new!

Download Kafka Lag exporter

Now let’s turn our attention to monitoring Kafka topic lag. Monitoring lag is just as important, because it measures the performance of your Kafka consumer applications and of your data pipeline.

In Kafka, every consumer group ingests data at a certain rate, measured in messages/second. Lag is the number of messages that have been written to a topic partition but not yet consumed by the group, and it may go up or down for various reasons. Kafka Lag exporter monitors this metric so that it can be used as a health indicator of how quickly or slowly the data in a Kafka topic is being consumed.
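
As a small worked example with made-up numbers: for a single partition, lag is the offset of the newest message minus the last offset committed by the consumer group, and lag grows whenever producers write faster than the group consumes. All values below are hypothetical.

```shell
# Hypothetical offsets for one partition.
log_end_offset=15000     # offset of the newest message written to the partition
committed_offset=14250   # last offset committed by the consumer group

lag=$(( log_end_offset - committed_offset ))
echo "lag=${lag} messages"            # prints: lag=750 messages

# Hypothetical rates in messages/second.
produce_rate=100
consume_rate=80

lag_growth=$(( produce_rate - consume_rate ))
echo "lag grows by ${lag_growth} messages/second"   # prints: lag grows by 20 messages/second
```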

Kafka Lag exporter can be found on GitHub at this link.

wget https://github.com/lightbend/kafka-lag-exporter/releases/download/v0.6.1/kafka-lag-exporter-0.6.1.zip

Configure Kafka Lag exporter

Kafka Lag exporter is non-intrusive in nature, meaning it does not require any changes to your Kafka setup. The main configuration parameters needed to get going are given below.

  • Kafka Broker addresses – Required
  • Endpoint port number – Default – 8080. Prometheus server will scrape this port.
  • Poll interval – Default – 30 seconds. How often the lag exporter polls the Kafka cluster.
  • Whitelist of metrics – If you want to scrape only specific metrics.

In addition to these simple configuration parameters, there is a longer list of parameters available here.
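
Put together, an application.conf covering the parameters above might look roughly like this sketch. The key names (port, poll-interval, metric-whitelist, clusters, name, bootstrap-brokers) are as I recall them from the project README for the 0.6.x releases, so please check them against the documentation of the version you downloaded. The cluster name and broker addresses are placeholders.

```shell
# Write an example application.conf (HOCON) for Kafka Lag exporter.
cat > application.conf <<'EOF'
kafka-lag-exporter {
  # Port that the Prometheus server will scrape (set explicitly in this sketch).
  port = 8080
  # How often the exporter polls the Kafka cluster.
  poll-interval = 30 seconds
  # Regular expressions of metric names to expose; ".*" exposes everything.
  metric-whitelist = [".*"]
  clusters = [
    {
      # Placeholder cluster name and broker addresses.
      name = "my-kafka-cluster"
      bootstrap-brokers = "kafka-broker-1:9092,kafka-broker-2:9092,kafka-broker-3:9092"
    }
  ]
}
EOF
```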

You can now start the Lag monitor using the following command

./kafka-lag-exporter -Dconfig.file=/home/ec2-user/kafka-lag-exporter-0.6.1/bin/application.conf
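
For the Prometheus server to pick up the lag metrics, the exporter endpoint also has to be listed as a scrape target in prometheus.yml. A minimal sketch, assuming the exporter listens on port 8080; the job name and the hostname are placeholders of my own. As before, merge the job into the scrape_configs section of your own prometheus.yml rather than replacing the file.

```shell
# Write an example scrape job for Kafka Lag exporter (hostname is a placeholder).
cat > prometheus-kafka-lag-job.yml <<'EOF'
scrape_configs:
  - job_name: "kafka-lag-exporter"
    static_configs:
      - targets:
          - "lag-exporter-host:8080"
EOF
```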

Query Kafka lag metrics using PromQL

The metrics from Kafka Lag exporter can be queried like any other metrics. Type in the kafka_consumergroup_group_lag metric.
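
Because this metric is reported per partition, it is often more readable when summed per consumer group and topic. The sketch below sends such a query to the Prometheus HTTP API with curl. The server URL and the function name query_lag are assumptions, and the group and topic label names are the ones I understand the exporter to attach to this metric; verify them in the Prometheus UI first.

```shell
# Assumption: Prometheus listens on localhost:9090; override PROM_URL otherwise.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Total lag per consumer group and topic, summed over all partitions.
LAG_QUERY='sum by (group, topic) (kafka_consumergroup_group_lag)'

# query_lag
# Sends LAG_QUERY as an instant query and prints the JSON response.
query_lag() {
  curl -sG "${PROM_URL}/api/v1/query" --data-urlencode "query=${LAG_QUERY}"
}

# Usage:
#   query_lag
```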

Grafana Dashboard for Kafka lag monitor

So our Prometheus server is now able to scrape Kafka Lag exporter for metrics. It’s time to import a Grafana dashboard for Kafka Lag exporter. For the purpose of this blog entry, I am going to import a dashboard on this link

To import a Grafana dashboard, follow these steps

Step 1 – Press the + button as shown below

Step 2 – You can import either by typing the ID that the Grafana website assigns to the dashboard or by pasting the JSON directly. I have decided to paste the JSON from the link. See below

Step 3 – Select the data source and folder name. Press import.

The dashboard is ready!

Phew! That was a long post. If you have made it this far, congratulations! This brings us to the end of this entry. I hope you have found it useful. If you like it, share it. Maybe leave a comment! 🙂
