Spark SQL & Datasets – Hello World
This post introduces a simple Spark SQL and Datasets example. It assumes that you are comfortable with the Spark Core API.

Before we start writing the program, here are the tools we will use:

- IntelliJ Community Edition (IDE)
- Scala
- SBT (Scala Build Tool)
- Apache Spark

For this example we will use Ubuntu Desktop. I already have an Ubuntu desktop running in VirtualBox, but you can use a MacBook and the process is the same.

1. Launch the IntelliJ IDE.
2. Click Create New Project.
3. Select SBT and click Next.
4. Provide the following information and then click Finish:
   - Project Name: SparkHelloWorldDataSet
   - sbt version: 0.13.17
   - Scala version: 2.11.8

This will create an sbt project.

Add the Spark libraries to the project. Open build.sbt, which is in the root of the project (visible in the screenshot). Add the following entry to build.sbt. This will import all …
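
A minimal sketch of what a build.sbt for this project could look like. The project name and Scala version come from the settings above. The Spark version (2.3.0), the project version (0.1), and the choice of the spark-core and spark-sql modules are assumptions made for illustration, not the post's own entry.

```scala
// build.sbt (sketch; values marked "assumed" are not from the original post)
name := "SparkHelloWorldDataSet"

// Assumed project version.
version := "0.1"

scalaVersion := "2.11.8"

// Assumed Spark version; it must be a release published for Scala 2.11.
val sparkVersion = "2.3.0"

// %% appends the Scala binary version (_2.11) to each artifact name.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion,
  "org.apache.spark" %% "spark-sql"  % sparkVersion
)
```

After saving the file, IntelliJ should offer to refresh the sbt project so that the dependencies are downloaded and become available on the classpath.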