Tuesday, September 8, 2009

Collection/Iterable/Iterator foldRight and foldLeft

Folding and reducing are two types of operations that are both very important with dealing with collections in Scala (and other functional languages). This topic covers the fold operations. The reduce topic was previously covered.

A simplistic explanation is they allow processing of the collections. I think about foldRight and foldLeft to be very similar to a visitor that visits each element and performs a calculation. It does not change the values of the collection instead it calculates a result from visiting each element. In Java you might see the following pattern:
  1. class Collection {
  2.   pubic Result op( Result startingValue, Op operation){...}
  3. }
  4. interface Op{
  5.   public Result op( Object nextCollectionElem, Result lastCalculatedResult)
  6. }

To a Java programmer this is pretty obvious how you use it. This is essentially the fold operation. The difference is we are using closures so there is almost no boilerplate and all collection/Iteratore/Iterable interfaces have the fold methods.

One last point before examples. For fold and reduce operations there are both foldLeft and foldRight and similarly reduceLeft and reduceRight. These indicate the order that a collection is processed. Note that for certain collections one direction is more performant that the other. Lists, for example, are better to process left to right because of the datastructure used.

Some tasks that are suited to fold are:
  • Find the 5 smallest elements in a list
  • Sum all the values of a Map together
  • Sum the lengths of the Strings contained in the collection

There are a number of ways that fold can be written syntactically. The following examples are ways to sum together the integers of a Map. Note that /: is an alias for foldLeft and :/ is an alias for foldRight. Normally foldLeft and foldRight should be used unless the rest of the developers who will be reading the code are quite confortable reading Scala.

  1. // fold right processes from left to right
  2. scala> val m = Map(1 -> "one", 2 -> "two", 3 -> "three")
  3. m: scala.collection.immutable.Map[Int,java.lang.String] = Map(1 -> one, 2 -> two, 3 -> three)
  4. scala> m.foldRight(0)((kv: (Int, String), a : Int) => a+kv._2.length)
  5. res1: Int = 11
  6. scala> (m :\ 0)((kv: (Int, String), a : Int) => a+kv._2.length)
  7. res5: Int = 11
  8. scala> m.foldRight(0){ case ((key, value), a) => a+value.length}
  9. res7: Int = 11
  10. scala> m.foldRight(0){ case (kv, a) => a+kv._2.length} }}
  11. res8: Int = 11
  12. // fold left processes from right to left (usually slower than foldRight)
  13. scala> val m = Map("a" -> 1, "b" -> 2, "c" -> 3)
  14. m: scala.collection.immutable.Map[java.lang.String,Int] = Map(a -> 1, b -> 2, c -> 3)
  15. scala> m.foldLeft(0)( (accum, kv) => accum + kv._2 )
  16. res9: Int = 6
  17. scala> m.foldLeft(0){case (accum, (key, value)) => accum + value}
  18. res5: Int = 6
  19. scala> (0 /: m){case (accum, (key, value)) => accum + value}
  20. res4: Int = 6

4 comments:

  1. You wrote that lists are better to process right to left but I believe that its the other way around. foldRight will in fact blow your stack because every object in your list will require a stack frame before processing can even start. See this stack overflow question for a complete explanation:
    http://stackoverflow.com/questions/1446419

    - David

    ReplyDelete
  2. You are right. I researched the issue to make sure I had it correct and then went a head and put the wrong solution.

    I have updated the post with this fix

    ReplyDelete
  3. Fold left process from left to right
    http://www.scala-lang.org/api/current/index.html#scala.collection.Map

    "Applies a binary operator to a start value and all elements of this map, going left to right."

    ReplyDelete
  4. Jesse. Thanks for your posts which are generally highly useful. I appreciate how you put the scala-ese into understandable terms. Would you be up to adding an example or two about fold itself?

    ReplyDelete