Scala Collection – Operators

Strictly speaking there are no operators in Scala :). Anything that looks like an operator is really a method, and any method can be used like an operator. But for simplicity let’s just call them collection operators.

Below are some of the operators which are common to all Scala collections. The examples mostly use Arrays, but the same operators can be applied to other collections such as List.

length operator

//Array of 5 integers with elements 6,7,8,9,10
val num1 = Array(6,7,8,9,10)

//length of array - should return 5
println(num1.length)

distinct removes duplicate elements. More importantly, it returns a new array and leaves the original unchanged.

//Array of 5 integers with elements 6,7,6,9,10 (6 appears twice)
val num1 = Array(6,7,6,9,10)
val num2 = num1.distinct
//Prints 6,7,9,10(only 4 elements)
num2.foreach(println)

//Another way of doing this would be
num1.distinct.foreach(println)

range, fill

//Array of 5 integers with elements 1,2,3,4,5
//The end value of range is exclusive, so use 6 to include 5
val num1 = Array.range(1, 6)

//Prints 1,2,3,4,5
num1.foreach(println)

//Array of 5 elements filled with 0
val num2 = Array.fill(5)(0)

head, tail, last

head operator returns the first element of the array. tail returns all the elements except the first. last returns the last element of the array.

//Array of 5 integers with elements 1,2,3,4,5
val num1 = Array.range(1, 6)

println(num1.head) //Prints 1
//tail returns an array, so use mkString to print its elements
println(num1.tail.mkString(",")) //Prints 2,3,4,5
println(num1.last) //Prints 5
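
One thing worth knowing is that head and last throw an exception on an empty array. The sketch below shows the safer headOption and lastOption variants from the standard library. The names emptyNums, someNums and risky are made up for this illustration.

```scala
// Illustrative names only: emptyNums, someNums and risky are not from the examples above
val emptyNums = Array.empty[Int]
val someNums = Array(1, 2, 3)

// headOption and lastOption return None instead of throwing on an empty array
println(emptyNums.headOption)   // Prints None
println(someNums.headOption)    // Prints Some(1)
println(someNums.lastOption)    // Prints Some(3)

// head on an empty array throws, so wrap it in Try if the array may be empty
val risky = scala.util.Try(emptyNums.head)
println(risky.isFailure)        // Prints true
```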

Let’s look at some more interesting ones

filter is a transformer and returns a new collection. It requires a predicate function which is applied to every element in the collection; an element is kept only if the predicate evaluates to true. filterNot is exactly the opposite and keeps the elements for which the predicate evaluates to false.

//Array of 5 integers with elements 1,2,3,4,5
val num1 = Array.range(1, 6)

//Prints 1,2,3,4,5
num1.foreach(println)

//Applying a predicate function to the array returns a new array
val num2 = num1.filter(_%2==0)
//Prints 2,4
num2.foreach(println)
val num3 = num1.filter(_%2!=0)
//Prints 1,3,5
num3.foreach(println)

In the above example “_%2==0” and “_%2!=0” are the two predicate functions.
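
Since filterNot was mentioned but not shown, here is a minimal sketch of it next to filter. The names numbers, isEven, evens and odds are made up for this illustration.

```scala
// Illustrative names only: numbers, isEven, evens and odds
val numbers = Array.range(1, 6)          // 1,2,3,4,5 (the end value is exclusive)
val isEven = (n: Int) => n % 2 == 0      // the predicate function

// filter keeps the elements for which the predicate is true
val evens = numbers.filter(isEven)
// filterNot keeps the elements for which the predicate is false
val odds = numbers.filterNot(isEven)

println(evens.mkString(","))  // Prints 2,4
println(odds.mkString(","))   // Prints 1,3,5
```

Using filterNot with the same predicate avoids writing a second, negated predicate by hand.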

union, intersect and diff are three operators which can be used for performing set-like operations. union concatenates both arrays into a new array and keeps duplicates (in Scala 2.13 and later it is deprecated in favour of concat). intersect creates an array of only the common elements. diff provides all elements in collection1 that are not in collection2.

//Array of 5 integers with elements 1,2,3,4,5
val num1 = Array.range(1, 6)
//Array of 5 integers with elements 4,5,6,7,8
val num2 = Array.range(4, 9)

//Apply union operator
val num3 = num1.union(num2)
//Prints 1,2,3,4,5,4,5,6,7,8
num3.foreach(println)

//Apply intersect operator
val num4 = num1.intersect(num2)
//Prints 4,5
num4.foreach(println)

//Apply diff operator on array1 - elements in num1 not in num2
val num5 = num1.diff(num2)
//Prints 1,2,3
num5.foreach(println)

//Apply diff operator on array2 - elements in num2 not in num1
val num6 = num2.diff(num1)
//Prints 6,7,8
num6.foreach(println)

There are various operators for sorting. They are grouped here: sorted, sortBy and sortWith.

sorted performs a simple sort using the natural order, which is ascending. For descending order pass Ordering[Type].reverse, where Type may be Int, Float, String etc. The Ordering trait is part of scala.math. See below

//Unsorted array of 5 integers with elements 4,5,1,3,2
val num1 = Array(4,5,1,3,2)

//Prints 1,2,3,4,5
num1.sorted.foreach(println)

//Prints 5,4,3,2,1. Using Ordering[Type].reverse
num1.sorted(Ordering[Int].reverse).foreach(println)

sortBy performs a sort using an attribute of the elements. For example, you may have a list of person objects and want to sort on the last name rather than the first name, or you may want to sort a list of strings based on their length. See below

val cities = List("London", "New York", "Paris")
//Prints London, New York, Paris
cities.foreach(println)

//Prints Paris, London, New York
cities.sortBy(city=>city.length).foreach(println)

//Shorter syntax. But gives the same result.
//Prints Paris, London, New York
cities.sortBy(_.length).foreach(println)

//Prints New York, London, Paris
cities.sortBy(_.length)(Ordering[Int].reverse).foreach(println)
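
The person example mentioned above could look roughly like the sketch below. Person and its fields are hypothetical and exist only for this illustration.

```scala
// Person is a hypothetical class used only for this illustration
case class Person(firstName: String, lastName: String)

val people = List(
  Person("Grace", "Hopper"),
  Person("Alan", "Turing"),
  Person("Ada", "Lovelace")
)

// Sort on the last name attribute
val byLastName = people.sortBy(_.lastName)
//Prints Person(Grace,Hopper), Person(Ada,Lovelace), Person(Alan,Turing)
byLastName.foreach(println)

// Sort on last name first and first name second by returning a tuple
val byBothNames = people.sortBy(p => (p.lastName, p.firstName))
```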

sortWith performs a sort on elements or objects based on a user-defined comparator function which returns a boolean. Essentially the function takes two elements as parameters and returns true if the first should come before the second.

//Unsorted array of 5 integers with elements 4,5,1,3,2
val num1 = Array(4,5,1,3,2)

//Prints 1,2,3,4,5
num1.sortWith((x,y)=> x < y).foreach(println)
//Shorter syntax. But gives the same result.
//Prints 1,2,3,4,5
num1.sortWith(_ < _).foreach(println)

If the requirement is to keep the elements sorted at all times, then it may make sense to explore a sorted collection such as SortedSet as well. Keep in mind that a set does not hold duplicates.
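
As a rough sketch of a collection that stays sorted, the standard library offers SortedSet. Picking SortedSet here is an assumption about what fits the requirement, and the names sortedNums and moreNums are made up for this illustration.

```scala
import scala.collection.immutable.SortedSet

// Illustrative names only: sortedNums and moreNums
// Elements are kept in ascending order no matter how they are inserted
val sortedNums = SortedSet(4, 5, 1, 3, 2)
println(sortedNums.mkString(","))  // Prints 1,2,3,4,5

// Adding elements keeps the order; the duplicate 3 is dropped because it is a set
val moreNums = sortedNums + 0 + 3
println(moreNums.mkString(","))    // Prints 0,1,2,3,4,5
```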

Let’s look at zip, zipAll and zipWithIndex. These operators are very helpful when you want to combine lists. The output of each of them is a list of tuples. These functions are quite useful in machine learning. Let’s look at them

zip operator combines two lists to give a list of tuples. If the lists are not of the same length, the result is only as long as the shorter list. See below

//Two list of 5 elements each
val list1 = List(1,2,3,4,5)
val list2 = List("a","b","c","d","e")

//zipping lists of same size
val ziplist1 = list1.zip(list2)
//Prints list of tuples-(1,a) , (2,b) , (3,c) , (4,d) , (5,e)
ziplist1.foreach(println)

//zipping the same lists in the opposite order
val ziplist2 = list2.zip(list1)
//Prints list of tuples-(a,1) , (b,2) , (c,3) , (d,4) , (e,5)
ziplist2.foreach(println)

//Zipping lists of different size
val list3 = List(1,2,3)
val list4 = List("a","b","c","d","e")
//Prints a shorter list of tuples - (1,a) , (2,b) , (3,c)
list3.zip(list4).foreach(println)

zipAll operator combines two lists to give a list of tuples. Unlike zip, it does not cut the result down to the shorter list: if the lists are not of the same length, the missing elements are filled in with default values. The first default stands in for the list the method is called on and the second for the list passed as an argument, so be careful with the order in which you apply the method. See below

//Zipping lists of different size
val list1 = List(1,2,3)
val list2 = List("a","b","c","d","e")

//Prints list of tuples-(a,1) , (b,2) , (c,3) , (d,0) , (e,0)
list2.zipAll(list1,"",0).foreach(println)

//Prints list of tuples-(1,a) , (2,b) , (3,c) , (,d) , (,e)
list1.zipAll(list2,"",0).foreach(println)

zipWithIndex works on a single list and provides a list of tuples. Each tuple is made of the element and its index, and the index starts at 0. See below

val list = List("a","b","c","d","e")

//Prints list of tuples-(a,0) , (b,1) , (c,2) , (d,3) , (e,4)
list.zipWithIndex.foreach(println)
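
zipWithIndex always counts from 0. If numbering from 1 is needed, one possible approach is to adjust the index with map, as in the sketch below. The names letters, zeroBased and oneBased are made up for this illustration.

```scala
// Illustrative names only: letters, zeroBased and oneBased
val letters = List("a", "b", "c")

// The index produced by zipWithIndex starts at 0
val zeroBased = letters.zipWithIndex
println(zeroBased)  // Prints List((a,0), (b,1), (c,2))

// Shift the index by one to number the elements from 1
val oneBased = letters.zipWithIndex.map { case (elem, idx) => (elem, idx + 1) }
println(oneBased)   // Prints List((a,1), (b,2), (c,3))
```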

unzip is the reverse of the zipping functions. It splits a list of tuples into two lists.

val list = List(("a",1) , ("b",2) , ("c",3) , ("d",4) , ("e",5))

val (list1,list2) = list.unzip
//Prints list of elements - a,b,c,d,e
list1.foreach(println)

//Prints list of elements - 1,2,3,4,5
list2.foreach(println)

From this point on we will look at some more interesting constructs which help in functional programming and also reduce the amount of code you write, which in turn makes it easier to maintain. We will be looking at the reduce, fold, scan and map series of operators.

reduce, reduceLeft and reduceRight traverse the collection and apply a function to pairs of values. As a starting point two elements are passed to the function. Its result is then passed back in as the accumulator together with the next element, and this continues until all the elements have been processed. See below

reduce operator goes through all the elements of the collection without guaranteeing the order in which they are combined. For that reason the function passed to it should be associative, as multiplication is here. See below

val list = List(1,2,3,4,5)
//Use the reduce function
//x is the accumulator and y is the element from list
val factorial1 = list.reduce((x,y) => x*y)
//Prints 120
println(factorial1)

//Same method - shorter syntax
val factorial2 = list.reduce(_ * _)
//Prints 120
println(factorial2)

reduceLeft operator goes through all the elements of the collection from the first element to the last element. Simply speaking it goes from left to right. To prove the point see below

val list = List(10,20,30,40,50)
//Define a function value that prints its arguments
val computeSum = (x:Int, y:Int) => {
  val sum = x + y
  printf("Value of accumulator is %s, Element is %s \n",x,y)
  sum
}
val totalSum = list.reduceLeft(computeSum)
printf("Sum is %d", totalSum )

//The above gives the following output
//Value of accumulator is 10, Element is 20
//Value of accumulator is 30, Element is 30
//Value of accumulator is 60, Element is 40
//Value of accumulator is 100, Element is 50
//Sum is 150

reduceRight operator goes through all the elements of the collection from the last element to the first element. Simply speaking it goes from right to left. Note that here the element is the first parameter of the function and the accumulator is the second. To prove the point see below

val list = List(10,20,30,40,50)
//Define a function value that prints its arguments
//x is the element from list and y is the accumulator
val computeSum = (x:Int, y:Int) => {
  val sum = x + y
  printf("Value of accumulator is %s, Element is %s \n",y,x)
  sum
}
val totalSum = list.reduceRight(computeSum)
printf("Sum is %d", totalSum )

//The above gives the following output
//Value of accumulator is 50, Element is 40
//Value of accumulator is 90, Element is 30
//Value of accumulator is 120, Element is 20
//Value of accumulator is 140, Element is 10
//Sum is 150
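
With addition the direction does not change the result. The sketch below uses subtraction, which is not associative, to make the difference between reduceLeft and reduceRight visible. The names nums, leftResult and rightResult are made up for this illustration.

```scala
// Illustrative names only: nums, leftResult and rightResult
val nums = List(1, 2, 3, 4)

// Left to right: ((1 - 2) - 3) - 4
val leftResult = nums.reduceLeft(_ - _)
println(leftResult)   // Prints -8

// Right to left: 1 - (2 - (3 - 4))
val rightResult = nums.reduceRight(_ - _)
println(rightResult)  // Prints -2
```

This is also why plain reduce is best kept for functions like addition or multiplication, where the grouping does not matter.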

fold, foldLeft and foldRight are similar to reduce, reduceLeft and reduceRight. The only additional thing is that they start with a seeded value. See below

fold operator goes through all the elements of the collection without guaranteeing the order in which they are combined, so the function should be associative. An initial seeded value is provided. See below

val list = List(1,2,3,4,5)
//Use the fold function with a seed of 10
//x is the accumulator and y is the element from list
val totalSum1 = list.fold(10)((x,y) => x+y)
//Prints 25
println(totalSum1)

//Same method - shorter syntax
val totalSum2 = list.fold(10)(_ + _)
//Prints 25
println(totalSum2)

foldLeft operator goes through all the elements of the collection from the first element to the last element. Simply speaking it goes from left to right. An initial seeded value is provided. See below

val list = List(10,20,30,40,50)
//Define a function value that prints its arguments
val computeSum = (x:Int, y:Int) => {
  val sum = x + y
  printf("Value of accumulator is %s, Element is %s \n",x,y)
  sum
}
val totalSum = list.foldLeft(100)(computeSum)
printf("Sum is %d", totalSum )

//The above gives the following output
//Value of accumulator is 100, Element is 10
//Value of accumulator is 110, Element is 20
//Value of accumulator is 130, Element is 30
//Value of accumulator is 160, Element is 40
//Value of accumulator is 200, Element is 50
//Sum is 250

foldRight operator goes through all the elements of the collection from the last element to the first element. Simply speaking it goes from right to left. An initial seeded value is provided. See below

val list = List(10,20,30,40,50)
//Define a function value; here x is the element and y is the accumulator
val computeSum = (x:Int, y:Int) => {
  val sum = x + y
  printf("Value of accumulator is %s, Element is %s \n",y,x)
  sum
}
val totalSum = list.foldRight(100)(computeSum)
printf("Sum is %d", totalSum )

//The above gives the following output
//Value of accumulator is 100, Element is 50
//Value of accumulator is 150, Element is 40
//Value of accumulator is 190, Element is 30
//Value of accumulator is 220, Element is 20
//Value of accumulator is 240, Element is 10
//Sum is 250
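
The seeded value matters in two situations that the examples above do not cover: empty collections, and results whose type differs from the element type. Below is a small sketch of both. The names noNums, foldedEmpty, reducedEmpty, safeReduce and digits are made up for this illustration.

```scala
// Illustrative names only: noNums, foldedEmpty, reducedEmpty, safeReduce and digits
val noNums = List.empty[Int]

// fold on an empty list simply returns the seed
val foldedEmpty = noNums.fold(0)(_ + _)
println(foldedEmpty)             // Prints 0

// reduce has no seed to fall back on, so it throws on an empty list
val reducedEmpty = scala.util.Try(noNums.reduce(_ + _))
println(reducedEmpty.isFailure)  // Prints true

// reduceOption returns None instead of throwing
val safeReduce = noNums.reduceOption(_ + _)
println(safeReduce)              // Prints None

// With foldLeft the seed can have a different type from the elements
val digits = List(1, 2, 3).foldLeft("")((acc, n) => acc + n)
println(digits)                  // Prints 123
```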

scan, scanLeft and scanRight are similar to fold, foldLeft and foldRight. However, instead of a single result they return a list of cumulative results, starting with the seed. See below

scan operator goes through all the elements of the collection without guaranteeing the order in which they are combined, so the function should be associative.

val list = List(1,2,3,4,5)
//Use the scan function with a seed of 10
//x is the accumulator and y is the element from list
val totalSum1 = list.scan(10)((x,y) => x+y)
//Prints List(10, 11, 13, 16, 20, 25)
println(totalSum1)

//Same method - shorter syntax
val totalSum2 = list.scan(10)(_ + _)
//Prints List(10, 11, 13, 16, 20, 25)
println(totalSum2)

scanLeft operator goes through all the elements of the collection from the first element to the last element. Simply speaking it goes from left to right. See below

val list = List(1,2,3,4,5)
//Define a function value that prints its arguments
val computeSum = (x:Int, y:Int) => {
  val sum = x + y
  printf("Value of accumulator is %s, Element is %s \n",x,y)
  sum
}
val newList = list.scanLeft(10)(computeSum)
println(newList )

//The above gives the following output
//Value of accumulator is 10, Element is 1
//Value of accumulator is 11, Element is 2
//Value of accumulator is 13, Element is 3
//Value of accumulator is 16, Element is 4
//Value of accumulator is 20, Element is 5
//List(10, 11, 13, 16, 20, 25)

scanRight operator goes through all the elements of the collection from the last element to the first element. Simply speaking it goes from right to left. See below

val list = List(1,2,3,4,5)
//Define a function value; here x is the element and y is the accumulator
val computeSum = (x:Int, y:Int) => {
  val sum = x + y
  printf("Value of accumulator is %s, Element is %s \n",y,x)
  sum
}
val newList = list.scanRight(10)(computeSum)
println(newList)

//The above gives the following output
//Value of accumulator is 10, Element is 5
//Value of accumulator is 15, Element is 4
//Value of accumulator is 19, Element is 3
//Value of accumulator is 22, Element is 2
//Value of accumulator is 24, Element is 1
//List(25, 24, 22, 19, 15, 10)
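
To tie scan back to fold, the sketch below checks that the final cumulative value of a scan is the same value the matching fold returns. The names values, running, total and runningRight are made up for this illustration.

```scala
// Illustrative names only: values, running, total and runningRight
val values = List(1, 2, 3, 4, 5)

val running = values.scanLeft(10)(_ + _)
val total = values.foldLeft(10)(_ + _)

println(running)                // Prints List(10, 11, 13, 16, 20, 25)
// The seed is included, so the result has one more element than the input
println(running.length)         // Prints 6
// The last cumulative result equals the result of foldLeft
println(running.last == total)  // Prints true

// For scanRight the final result sits at the head of the list
val runningRight = values.scanRight(10)(_ + _)
println(runningRight.head == values.foldRight(10)(_ + _))  // Prints true
```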

Finally, we come to two of the most talked about operators, map and flatMap. Both are very heavily used and are well suited to heavy data processing. Keep in mind that the map operator should not be confused with the Map collection type.

map is an operator which applies a function to every element of a collection and returns the results in a new collection. See below

val list = List(1,2,3,4,5)
val multiplyBy2 = (x:Int) =>{
  x * 2
}

val newList1 = list.map(multiplyBy2)
val newList2 = list.map(x => x * 2)
val newList3 = list.map(_ * 2)
//All three give the same output List(2, 4, 6, 8, 10)
println(newList1)

val cities = List("London","Paris","New York")
//Returns the lengths of the strings - List(6, 5, 8)
val strLen = cities.map(_.length)
println(strLen)

//Creates list of tuples - (1,2), (2,4), (3,6), (4,8), (5,10)
val newList4 = list.map(x => (x,x*2))
println(newList4)

flatMap is similar to the map function. In addition to what map does, flatMap also flattens the result into a single list. For example, take a list of lists as shown below

List( List(1,2,3) ,  List(4,5,6))

If map is used, the function is applied once to each of the two inner lists and the result keeps the nesting, so it is again a list of lists. If flatMap is used, the function is applied in the same way, but the lists it returns are joined together, so the result is a single flattened list of all six elements.

val list = List( List(1,2,3), List(4,5,6))
//The inner map doubles each element of an inner list
val newList1 = list.map(x=>x.map(_*2))
//Prints - List(List(2, 4, 6), List(8, 10, 12))
println(newList1)

val newList2 = list.flatMap(x=>x.map(_*2))
//Prints - List(2, 4, 6, 8, 10, 12)
println(newList2)
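
Another common use of flatMap is a function which returns several results per element. The sketch below splits lines of text into words and also shows that flatMap gives the same result as map followed by flatten. The names textLines, nestedWords and words are made up for this illustration.

```scala
// Illustrative names only: textLines, nestedWords and words
val textLines = List("hello world", "scala collections")

// map keeps one list of words per line
val nestedWords = textLines.map(line => line.split(" ").toList)
println(nestedWords)  // Prints List(List(hello, world), List(scala, collections))

// flatMap joins the per-line lists into one list of words
val words = textLines.flatMap(line => line.split(" ").toList)
println(words)        // Prints List(hello, world, scala, collections)

// flatMap is equivalent to map followed by flatten
println(words == nestedWords.flatten)  // Prints true
```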
