Apache Airflow – First DAG

Now that we have a working Airflow installation, it is time to look at DAGs in detail. In the previous post, we saw how to execute DAGs from the UI. In this post, we will talk more about DAGs, which are the core concept of Airflow.

Simple DAG

Simply put, a DAG is a Python script. Here is the code of a hello-world DAG. It executes a command to print "helllooooo world". It may not do much, but it illustrates a lot about how to write an Airflow DAG.

Understanding an Airflow DAG

Remember that this code is stored in the $DAGS_FOLDER; please refer to the previous blog post, which has the details on its location. An important thing to note, quoting from the Airflow website: "One thing to wrap your head around (it may not be very intuitive for everyone at first) is that this Airflow Python script …" Read more
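A hello-world DAG of the kind described above might look like the following sketch. This is an illustrative file (the `dag_id`, `task_id`, and dates are assumptions, not taken from the post), written against the Airflow 2.x API; it would live in your $DAGS_FOLDER and requires an Airflow installation to run.

```python
# Hypothetical hello-world DAG (illustrative names); place in $DAGS_FOLDER.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="hello_world",
    start_date=datetime(2023, 1, 1),
    schedule=None,   # no schedule: run only when triggered manually
    catchup=False,
) as dag:
    # A single task that runs a shell command via the Bash operator
    say_hello = BashOperator(
        task_id="say_hello",
        bash_command='echo "helllooooo world"',
    )
```

The `with DAG(...)` block registers every operator created inside it as a task of that DAG; the scheduler picks the file up from $DAGS_FOLDER and runs `say_hello` when the DAG is triggered.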

Apache Airflow – Getting Started

I recently finished a project where Apache Airflow (just Airflow for short) was being touted as the next-generation Workflow Management System, and the whole place was going gaga over it. Well, that got me thinking about how I could understand and learn it; hence this blog post. Here are some things you may want to know before getting your hands dirty with Apache Airflow.

What is Airflow?

The definition of Apache Airflow goes like this: "Airflow is a platform to programmatically author, schedule and monitor workflows. Use airflow to author workflows as directed acyclic graphs (DAGs) of tasks. The airflow scheduler executes your tasks on an array of workers while following the specified dependencies." So Airflow executes a series of tasks which, when executed together, accomplish a business outcome. For those folks who are working with the likes of Informatica, Airflow is similar to Workflow Designer; for those working in Oracle Data … Read more
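The idea of a "directed acyclic graph of tasks executed while following the specified dependencies" can be sketched in plain Python, independent of Airflow itself. The task names below (extract, transform, load) are hypothetical, chosen only to illustrate a dependency chain; the standard library's `graphlib` (Python 3.9+) computes an order that respects the dependencies, which is essentially what a workflow scheduler does.

```python
from graphlib import TopologicalSorter

# Hypothetical task graph: each key depends on the tasks in its set.
# extract -> transform -> load
deps = {
    "transform": {"extract"},
    "load": {"transform"},
}

# static_order() yields tasks so that every task comes after its dependencies
order = list(TopologicalSorter(deps).static_order())
print(order)  # ['extract', 'transform', 'load']
```

A real scheduler like Airflow's does the same dependency resolution, but runs each task on a worker and tracks its state instead of just printing the order.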