Prometheus monitoring – getting started

This is the start of a mini-series on monitoring infrastructure with Prometheus, an open-source systems monitoring tool. The aim is to give you an insight into Prometheus and to show how quick and easy it is to set up and get value out of. A combination of Prometheus and Grafana can be considered an alternative to paid tools like New Relic and Datadog.

Each blog entry will show you how to incrementally add monitoring to your infrastructure with minimal to no changes. We will also see how to integrate with communication tools like Slack and MS Teams to provide robust alerting.

This blog entry covers:

  • Introduction to Prometheus
  • Getting Prometheus up and running
  • Exploring Prometheus UI / Demo

Introduction

Prometheus was developed at SoundCloud in 2012, and its adoption has grown steadily since then. It is also a Cloud Native Computing Foundation project.

At its heart, Prometheus is a time-series database. Having said that, the Prometheus server has significant capabilities:

  • Pulls metrics over HTTP
  • PromQL to query the collected metrics
  • Supported by various dashboarding technologies, e.g. Grafana
  • Accepts metrics pushed by short-lived jobs via the Push Gateway, which Prometheus then scrapes

The Prometheus ecosystem has various components. Here is a fantastic diagram from the Prometheus website, which pretty much sums up everything.

The orange boxes are the core components of Prometheus.

  • Prometheus Server – scrapes metrics and stores them as time-series data.
  • Prometheus Web UI – queries the Prometheus server using PromQL.
  • Alert Manager – provides alerting capabilities.
  • Push Gateway – lets short-lived jobs push their metrics so that Prometheus can scrape them.

In addition to the core components, there are various client libraries which can be used to add metrics to your own applications. There will be a blog entry specifically on this.

Let’s talk about some of the Prometheus jargon, because it is going to stick around.

  • Target – application or machine to be monitored.
  • Pulling/Scraping metrics – the Prometheus server pulls metrics from each target, based on its configuration.
  • Exporters – deployed on servers to expose metrics for specific targets. There are loads of pre-built exporters already available on this link.

Getting Prometheus Up!

Getting Prometheus up and running is quite easy. Easier than you think! It can be downloaded from this link.

For this blog entry, I am using Ubuntu Linux to run Prometheus. Extract the tar file to a directory using the following command:

tar -xvf prometheus-2.16.0.linux-amd64.tar.gz

Once you have extracted the file, navigate to the Prometheus directory and run the following command.

./prometheus --config.file="prometheus.yml" --storage.tsdb.retention.time=400d --storage.tsdb.path="data/"

The parameters passed are pretty much self-explanatory, but here goes:

  • config.file – Path to the Prometheus config file.
  • storage.tsdb.retention.time – how long data should be retained in the Prometheus server, given with a unit such as days (d), hours (h) or minutes (m)
  • storage.tsdb.path – directory in which to store the metrics data collected.

You should see something similar to the screenshot below.

Prometheus is now up!

Oh BTW, a full list of parameters can be obtained using the following command:

./prometheus --help

Prometheus Configuration

Before we move on to more interesting things, let’s take a look at our default configuration file, prometheus.yml. We will always be coming back to it. In general, the configuration file covers the following:

  • How often to scrape
  • Which targets to scrape
  • HTTP path on which to look for scraping

Below is a screenshot of prometheus.yml.

Let’s look at the Prometheus UI now.
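To make the three settings above concrete, here is a minimal sketch that renders a small prometheus.yml as text. The helper name render_prometheus_config is made up for illustration and is not part of Prometheus. The YAML keys (scrape_interval, job_name, metrics_path, static_configs, targets) are standard configuration fields, but the 15s interval is an assumption and the file shipped with your download may contain more sections than this.

```python
def render_prometheus_config(scrape_interval, job_name, targets, metrics_path="/metrics"):
    """Render a minimal prometheus.yml (hypothetical helper, for illustration only)."""
    target_list = ", ".join(f"'{t}'" for t in targets)
    return (
        "global:\n"
        f"  scrape_interval: {scrape_interval}\n"   # how often to scrape
        "scrape_configs:\n"
        f"  - job_name: '{job_name}'\n"
        f"    metrics_path: {metrics_path}\n"       # HTTP path to scrape
        "    static_configs:\n"
        f"      - targets: [{target_list}]\n"       # which targets to scrape
    )


# Assumed values: Prometheus scraping itself on its default port.
config_text = render_prometheus_config("15s", "prometheus", ["localhost:9090"])
print(config_text)
```

The printed text can be saved as prometheus.yml and passed to the server with --config.file.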

Prometheus Web UI

By default, the web UI runs on port 9090. Let’s open it in a browser at http://localhost:9090.

Let’s quickly look at the Status menu. It has very useful information, and it would be a good idea to explore it a bit on your own.

When the Prometheus server starts, it also exposes some metrics about itself. They are published on the ‘/metrics’ HTTP path, and Prometheus scrapes itself as a target automatically. You will see this set up in the configuration file.

Below is a screenshot of how the metrics are exposed via HTTP.
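The /metrics page is plain text, one sample per line. As a rough sketch of what a scraper reads, the snippet below parses a small sample in that text format. The sample text is illustrative: its metric name, labels and value mirror the example used later in this post, and the parser is deliberately simplified (it ignores escaped quotes inside label values and skips lines that carry a trailing timestamp).

```python
import re

# Illustrative sample of the text a /metrics endpoint returns.
SAMPLE = '''\
# HELP prometheus_http_requests_total Counter of HTTP requests.
# TYPE prometheus_http_requests_total counter
prometheus_http_requests_total{code="200",handler="/metrics"} 26
'''

# metric_name, optional {labels}, then the value.
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and HELP/TYPE comments
            continue
        match = LINE_RE.match(line)
        if match is None:  # simplified parser: skip anything it does not understand
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


samples = parse_metrics(SAMPLE)
print(samples)
```

Note that the instance and job labels do not appear in this text. Prometheus attaches them when it scrapes the target.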

PromQL via Prometheus Web UI

PromQL lets you query the Prometheus data across various dimensions and is quite a powerful tool. Expressions can be quickly built in the expression builder field. See an example below.

Let’s look at one metric in detail – prometheus_http_requests_total

In the screenshot, all the dimensions (called labels in Prometheus) against which the metric value was collected are highlighted. These are in addition to the time dimension. Let’s analyse the highlighted entry.

  • Element – prometheus_http_requests_total{code="200", handler="/metrics", instance="localhost:9090", job="prometheus"}
  • Value – 26

The dimensions are marked in red above. That means code, handler, instance & job can be used to filter.
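As a mental model only (this is not how Prometheus is implemented), each series can be thought of as a metric name plus a set of label/value pairs, and a filter keeps the series whose labels match. In the toy sketch below, the first series mirrors the entry above and the second one is made up for contrast. The select helper is hypothetical.

```python
# Toy model: every series is a dict of labels; __name__ holds the metric name.
series = [
    {"__name__": "prometheus_http_requests_total", "code": "200",
     "handler": "/metrics", "instance": "localhost:9090", "job": "prometheus"},
    # Made-up second series, only here so the filter has something to drop.
    {"__name__": "prometheus_http_requests_total", "code": "302",
     "handler": "/", "instance": "localhost:9090", "job": "prometheus"},
]


def select(all_series, **matchers):
    """Keep the series whose labels equal every given matcher (hypothetical helper)."""
    return [
        s for s in all_series
        if all(s.get(label) == value for label, value in matchers.items())
    ]


ok_only = select(series, code="200")
print(ok_only)
```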

For example, to filter for all the data with HTTP code equal to 200, enter the following PromQL in the expression builder and press Execute.

prometheus_http_requests_total{code="200"}

See the result of the PromQL query below. We only see rows of data where code=200. The relevant portions are highlighted.

Keep in mind that if the results keep changing, it is because the console shows you the latest values. Let’s see how to graph them now, which is a lot easier.

Just press the Graph link. See below.

You will see a graph of the data we extracted above, covering a period of 1 hour by default. You can obviously play with the time range.
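The same filter can also be run outside the browser through the Prometheus HTTP API. Here is a minimal sketch, assuming the server is reachable at http://localhost:9090. The two function names are made up for this post, while /api/v1/query is the API's instant-query endpoint.

```python
import json
import urllib.parse
import urllib.request


def instant_query_url(expr, base_url="http://localhost:9090"):
    """Build the URL for the instant-query endpoint, /api/v1/query."""
    return base_url + "/api/v1/query?" + urllib.parse.urlencode({"query": expr})


def run_instant_query(expr, base_url="http://localhost:9090"):
    """Send the query to a running server and return the matching series."""
    with urllib.request.urlopen(instant_query_url(expr, base_url), timeout=10) as resp:
        body = json.load(resp)
    if body.get("status") != "success":
        raise RuntimeError(f"query failed: {body}")
    return body["data"]["result"]


# Building the URL needs no server; run_instant_query does.
url = instant_query_url('prometheus_http_requests_total{code="200"}')
print(url)
```

With Prometheus running, run_instant_query('prometheus_http_requests_total{code="200"}') should return one entry per matching series, each with its labels and latest value.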
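Data for a time window like this can also be fetched through the HTTP API's range-query endpoint, /api/v1/query_range. A minimal sketch follows, assuming a server at http://localhost:9090. The function name and the 15 second step are assumptions for illustration, not what the web UI necessarily uses.

```python
import time
import urllib.parse


def range_query_url(expr, end=None, window_seconds=3600, step_seconds=15,
                    base_url="http://localhost:9090"):
    """Build a /api/v1/query_range URL covering the window that ends at `end`."""
    end = time.time() if end is None else end  # default: now, as a Unix timestamp
    params = {
        "query": expr,
        "start": end - window_seconds,  # one hour back by default
        "end": end,
        "step": step_seconds,           # spacing between returned points, in seconds
    }
    return base_url + "/api/v1/query_range?" + urllib.parse.urlencode(params)


# Fixed end time so the printed URL is reproducible.
range_url = range_query_url('prometheus_http_requests_total{code="200"}', end=1700000000)
print(range_url)
```

Changing window_seconds corresponds to picking a different time range in the Graph tab.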

If you have come this far, that is awesome. You can now get Prometheus up and running and start playing with it. Hope you have found this helpful.

The next entries are all about how to scrape metrics from different targets, starting with Linux servers. Till then, byeeeee!!
