Monitoring Linux with Prometheus

In our previous post, we saw how to set up a simple Prometheus server. In this post, we extend it to monitor Linux servers. Monitoring Linux servers with Prometheus is easy thanks to a simple exporter called the node exporter (also known as the system metrics exporter). You can find a list of exporters on the Prometheus site. This blog entry is divided into:

- Set up and configure the node exporter on a Linux server
- Configure the Prometheus server to get data from the node exporter
- Query the data using PromQL
- Create a dashboard using Grafana

Setup node exporter on Linux server

Setting up the node exporter can be done in a few ways. You could do the following:

- Download the binaries from this link
- Download the source and try to build it. It's not as difficult as it seems.
- Use a Docker image (though not a recommended way)

For our blog … Read more
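As a taste of what the finished setup gives you, here is a minimal Python sketch (not from the original post) that pulls raw metrics straight from a node exporter and then runs a PromQL query through the Prometheus HTTP API. It assumes the node exporter listens on its default port 9100 and Prometheus on 9090, both on localhost.

```python
import requests

# Assumed defaults: node exporter on :9100, Prometheus on :9090.
NODE_EXPORTER_URL = "http://localhost:9100/metrics"
PROMETHEUS_URL = "http://localhost:9090/api/v1/query"

# 1. Raw metrics as exposed by the node exporter (plain-text exposition format).
raw = requests.get(NODE_EXPORTER_URL, timeout=5).text
print(raw.splitlines()[:5])  # first few metric lines

# 2. The same data, once scraped by Prometheus, queried with PromQL.
resp = requests.get(
    PROMETHEUS_URL,
    params={"query": "node_memory_MemAvailable_bytes"},
    timeout=5,
)
for result in resp.json()["data"]["result"]:
    print(result["metric"].get("instance"), result["value"])
```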

Grafana dashboards – getting started!

Collecting metrics data is of limited use if you cannot analyse it. One way to analyse the collected data is by creating dashboards. Grafana is one of the best open-source dashboarding tools. It is fast, simple, extendable and supports many data sources. In this blog entry we will do the following:

- Install Grafana
- Configure Prometheus as a data source
- Create a chart and put it on a dashboard

Install Grafana

Grafana is available for various Linux flavours. For this blog entry, we will be installing Grafana on Ubuntu. Once installed, it can be started using the following command. The Grafana server starts on port 3000 by default. All Grafana settings can easily be changed; they are available in /etc/grafana/grafana.ini and can be edited. More on that in a later blog post.

Configure data source

We can now log in to our Grafana server via a browser url – … Read more
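The post walks through the UI, but the same "add Prometheus as a data source" step can also be scripted against Grafana's HTTP API. A hedged sketch, assuming Grafana's default port 3000, the default admin/admin credentials (which you should change) and Prometheus on localhost:9090:

```python
import requests

# Assumptions: Grafana on localhost:3000 with the default admin/admin login,
# Prometheus on localhost:9090.
GRAFANA_URL = "http://localhost:3000"
AUTH = ("admin", "admin")

datasource = {
    "name": "Prometheus",
    "type": "prometheus",
    "url": "http://localhost:9090",
    "access": "proxy",  # Grafana proxies queries to Prometheus
    "isDefault": True,
}

resp = requests.post(f"{GRAFANA_URL}/api/datasources",
                     json=datasource, auth=AUTH, timeout=5)
print(resp.status_code, resp.json())
```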

Prometheus monitoring – getting started

This is the start of a mini-series on monitoring infrastructure using Prometheus. The aim is to give an insight into Prometheus, an open-source systems monitoring tool, and to show how quickly and easily it can be set up and start delivering value. A combination of Prometheus and Grafana can be considered an alternative to paid tools like New Relic and Datadog. Each blog entry will show you how to incrementally add monitoring to your infrastructure with minimal to no changes. We will also see how we can integrate with communication tools like Slack and MS Teams to provide robust alerting. This blog entry covers:

- Introduction to Prometheus
- Getting Prometheus up and running
- Exploring the Prometheus UI / demo

Introduction

Prometheus was developed at SoundCloud in 2012. Since then its adoption has increased steadily. It is also a project on Cloud Native Computing … Read more
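Once Prometheus is up and running, a quick way to confirm it is healthy and see what it is scraping is its HTTP API. A small sketch, assuming the default port 9090 on localhost:

```python
import requests

PROM = "http://localhost:9090"  # assumed default port

# Readiness check: returns 200 once the server is ready to serve traffic.
print(requests.get(f"{PROM}/-/ready", timeout=5).status_code)

# List the scrape targets Prometheus currently knows about and their health.
targets = requests.get(f"{PROM}/api/v1/targets", timeout=5).json()
for t in targets["data"]["activeTargets"]:
    print(t["scrapeUrl"], t["health"])
```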

Airflow RBAC – Role-Based Access Control

Airflow version 1.10.0 onwards introduced Role-Based Access Control (RBAC) as part of its security landscape. RBAC is the quickest way to secure Airflow. It comes with pre-built roles, which makes it easy to implement, and you can add your own roles as well. In addition to securing various features of the Airflow web UI, RBAC can be used to secure access to DAGs, though this feature is only available from 1.10.2. As with any software, a few bugs popped up, but these were fixed in 1.10.7; there is a bug report for this, AIRFLOW-2694. In this blog entry, we will touch upon DAG-level access as well. The blog entry is divided into a few parts:

- Enable RBAC
- Create users using standard roles
- Roles & permissions
- Secure DAGs with RBAC

Needless to say, I am assuming you have a working Airflow. If not, please head … Read more
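To give a flavour of the DAG-level security mentioned above, here is a minimal sketch of the `access_control` argument that RBAC-enabled Airflow (1.10.2+) accepts on a DAG. The role name `data_team` is purely illustrative; you would create it yourself in the Airflow UI or CLI.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator

# access_control maps a custom RBAC role to DAG-level permissions.
# "data_team" is a hypothetical role created separately via the UI or CLI.
dag = DAG(
    dag_id="rbac_example",
    start_date=datetime(2020, 1, 1),
    schedule_interval=None,
    access_control={"data_team": {"can_dag_read", "can_dag_edit"}},
)

start = DummyOperator(task_id="start", dag=dag)
```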

Airflow – XCOM

Introduction

Airflow XCom is used for inter-task communication. It sounds a bit complex, but it is really very simple: its implementation inside Airflow is straightforward, it is easy to use, and it has numerous use cases. Inter-task communication is achieved by passing key-value pairs between tasks. Tasks can run on any Airflow worker and need not run on the same worker. To pass information, a task pushes a key-value pair; the key-value pair is then pulled by another task and used. This blog entry requires some knowledge of Airflow. If you are just starting out, I would suggest you first get familiar with Airflow. You can try this link. This blog entry is divided into:

- Pushing values to XCom
- Viewing XCom values in the Airflow UI
- Pulling XCom values

Pushing values to XCom

Before we dive headlong into XCom, let's see where to … Read more
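As a quick illustration of the push/pull pattern described above, a minimal sketch with two PythonOperators; the DAG and task names are my own.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator


def push(**context):
    # Push a key-value pair to XCom from this task.
    context["ti"].xcom_push(key="row_count", value=42)


def pull(**context):
    # Pull the value pushed by the "push_task" task.
    row_count = context["ti"].xcom_pull(task_ids="push_task", key="row_count")
    print(f"row_count received via XCom: {row_count}")


dag = DAG("xcom_example", start_date=datetime(2020, 1, 1), schedule_interval=None)

push_task = PythonOperator(task_id="push_task", python_callable=push,
                           provide_context=True, dag=dag)
pull_task = PythonOperator(task_id="pull_task", python_callable=pull,
                           provide_context=True, dag=dag)

push_task >> pull_task
```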

Redis, Docker & Raspberry PI

Happy New Year! Time for the first post of 2020. I have been wanting to write a post on Raspberry Pi and Redis for some time now; finally, here it is. I also added Docker to the mix just to make it interesting. Just to be clear, you do need an internet connection. To get all this going I used a Raspberry Pi 3 Model B (quad-core 1.2 GHz CPU, 1 GB RAM), so it is about four-year-old hardware. It runs the Raspbian Stretch distribution, which is an upgrade from Raspbian Jessie. The upgrade is pretty simple, though it took a few hours. Below is the Raspberry Pi which I used for this blog entry. As you can see, it has been well-loved 😉 I have divided the entry into three parts:

- Installing Docker on the Raspberry Pi
- Running a Redis container on the Raspberry Pi
- Redis operations on … Read more
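Once the Redis container is up on the Pi, talking to it from Python takes only a few lines. A minimal sketch, assuming the container publishes the default port 6379 on the Pi's address (the host name below is a placeholder for your own):

```python
import redis

# "raspberrypi.local" is a placeholder for your Pi's host name or IP address.
r = redis.Redis(host="raspberrypi.local", port=6379, db=0)

r.set("greeting", "hello from the pi")
print(r.get("greeting"))   # b'hello from the pi'
print(r.ping())            # True if the container is reachable
```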

Spark & Redis

One of the fantastic use-cases of Redis is using it alongside the Apache Spark in-memory computation engine. In some sense, you can use it as a backend to persist Spark objects – data frames, datasets or RDDs – in the Redis cache alongside other cached objects. To enable this, there is a very handy library called Spark-Redis, which has both Scala and Python APIs. Redis can be used to persist data and act as a backend – common data can be shared between various jobs rather than being loaded again and again. This makes Redis an invaluable tool for big data developers. In this blog post, we will use both the Scala and Python APIs to read and write data frames and RDDs to/from Redis.

Using Scala API

In this section, we will read and write to a Redis cluster using Scala and … Read more
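For a sense of what the Python side of Spark-Redis looks like, here is a hedged sketch of writing a DataFrame to Redis and reading it back. It assumes the spark-redis package is on the Spark classpath and that `spark.redis.host`/`spark.redis.port` point at your Redis instance; the table name "people" is illustrative.

```python
from pyspark.sql import SparkSession

# Assumes the spark-redis package is available and Redis runs on localhost:6379.
spark = (SparkSession.builder
         .appName("spark-redis-sketch")
         .config("spark.redis.host", "localhost")
         .config("spark.redis.port", "6379")
         .getOrCreate())

df = spark.createDataFrame([("alice", 34), ("bob", 45)], ["name", "age"])

# Persist the DataFrame to Redis under the hypothetical table name "people".
(df.write.format("org.apache.spark.sql.redis")
   .option("table", "people")
   .option("key.column", "name")
   .mode("overwrite")
   .save())

# Read it back from Redis, e.g. in a different job that shares the same data.
people = (spark.read.format("org.apache.spark.sql.redis")
          .option("table", "people")
          .option("key.column", "name")
          .load())
people.show()
```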

Redis – Data Structures – Introduction

Now that you have seen how easy it is to set up Redis, let's start our exploration journey. Redis is a data structure server, which means you can store data in various data structures depending upon your requirements. Some of the data structures it supports are listed below:

- Simple strings
- Lists
- Sets
- Sorted sets
- Hashes
- Bit arrays
- HyperLogLogs
- Streams

To get started quickly with these data structures via the Redis CLI, I suggest you have a look at this link. It is a very nice read and a quick way to get introduced to Redis data structures via the CLI interface. Personally, I read it to get started as well! It introduces all the data structures – if you just want a quick introduction, read the sections on Redis strings, lists, sets and hashes; the rest you can leave until you need them. In this blog, in addition to looking … Read more
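Alongside the CLI, the same data structures are easy to reach from Python. A small sketch using the redis-py client, assuming a local Redis on the default port; the key names are illustrative.

```python
import redis

r = redis.Redis(host="localhost", port=6379, db=0)

# Strings
r.set("page:views", 1)
r.incr("page:views")

# Lists – push to the left, read a range back
r.lpush("recent:posts", "redis-intro", "spark-redis")
print(r.lrange("recent:posts", 0, -1))

# Sets – unique members only
r.sadd("tags", "redis", "docker", "redis")
print(r.smembers("tags"))

# Sorted sets – members ordered by score
r.zadd("leaderboard", {"alice": 120, "bob": 95})
print(r.zrange("leaderboard", 0, -1, withscores=True))

# Hashes – field/value pairs under one key
r.hset("user:1", "name", "alice")
print(r.hgetall("user:1"))
```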

Redis – Getting Started

Redis is an open-source, in-memory, distributed datastore. It is heavily used as a distributed cache and as a NoSQL store for keeping data in memory. Redis can also be used as a message broker and for serving fast in-memory analytics. But to get started, let's just use Redis as a cache. Redis is not as hard as it looks or may seem – you can be on your way to using it in a very short time. The idea for a Redis mini-series started when I began writing about scaling Airflow using Redis. The Redis blog entries cover the following aspects:

- Data structures – lists, sets, maps etc. – using the Redis CLI, Python and Scala
- Using Redis with Apache Spark
- Configuring a Redis cluster
- Using AWS ElastiCache for Redis with AWS EMR

Let's get started! As is the case with any software, we need to download and install it. Well, with Redis, … Read more
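To illustrate the "Redis as a cache" idea, a tiny sketch with the redis-py client: store a value with a time-to-live, then read it back. It assumes a freshly installed server on localhost and the default port 6379.

```python
import redis

r = redis.Redis(host="localhost", port=6379)

# Cache a value for 60 seconds – after that Redis evicts it automatically.
r.setex("cache:user:42", 60, "expensive-result")

print(r.get("cache:user:42"))   # b'expensive-result'
print(r.ttl("cache:user:42"))   # seconds remaining before expiry
```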

Airflow – Scale-out with Redis and Celery

Introduction

This post uses Redis and Celery to scale out Airflow. Redis is a simple caching server that scales out quite well and can be made resilient by deploying it as a cluster. In my previous post, the Airflow scale-out was done using Celery with RabbitMQ as the message broker. On the whole, I found maintaining RabbitMQ a bit fiddly unless you happen to be a RabbitMQ expert. Redis seems to be a better solution: it is a lot easier to deploy and maintain compared with the various steps needed to deploy a RabbitMQ broker. In a nutshell, I like it more than RabbitMQ! To create an infrastructure like this, we need to do the following steps:

- Install & configure a Redis server on a separate host – 1 server
- Install & configure Airflow with Redis and the Celery executor … Read more
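To give an idea of the Airflow side of this setup, here is a hedged sketch that switches airflow.cfg over to the Celery executor with Redis as the broker, using Python's configparser. The path, host names and result backend URL are placeholders, and exact key names can vary slightly across 1.10.x releases.

```python
import configparser

AIRFLOW_CFG = "/home/airflow/airflow/airflow.cfg"  # placeholder path

# airflow.cfg contains '%' signs in log-format values, so disable interpolation
# to read and write them back verbatim.
config = configparser.ConfigParser(interpolation=None)
config.read(AIRFLOW_CFG)

# Switch from the default executor to Celery.
config["core"]["executor"] = "CeleryExecutor"

# Point Celery at the Redis broker ("redis-host" is a placeholder) and a
# database-backed result backend.
config["celery"]["broker_url"] = "redis://redis-host:6379/0"
config["celery"]["result_backend"] = "db+postgresql://airflow:airflow@db-host/airflow"

with open(AIRFLOW_CFG, "w") as f:
    config.write(f)
```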