Showing posts with label traversable. Show all posts
Showing posts with label traversable. Show all posts

Monday, April 19, 2010

Scala 2.8 Arrays are not Traversables

One performance/consistency change that has been make in Scala 2.8 is to make Scala Array always be a Java Array. This has some consequences which we will examine in this post. The biggest one is that Array is not a Scala Collection/Traversable. It is implicitly converted to one but it is not an instance of a Traversable. There are several reasons this was done. Probably the biggest is for performance. Because a Scala array is a Java array there is no overhead when using a Scala array.

Thanks to implicit type conversion all the normal collection methods are useable with an array. Even better, after running a method like map the result will again be a Java array. So the API is much more consistent.

An example illustrating that an Array is not a Traversable:
  1. // This does not compile (which is good) 
  2. // because Traversable[Int] can never be an array
  3. scala> def x(t:Traversable[Int]) = t match {
  4.      | case x : Array[Int] => true          
  5.      | }
  6. < console>:13: error: pattern type is incompatible with expected type;
  7.  found   : Array[Int]
  8.  required: Traversable[Int]
  9.        case x : Array[Int] => true
  10.                 ^
  11. < console>:13: error: type mismatch;
  12.  found   : Array[Int]
  13.  required: Traversable[Int]
  14.        case x : Array[Int] => true
  15.               ^

Another example:
  1. scala> def x(t:Traversable[Int]) = t.isInstanceOf[Array[_]]
  2. x: (t: Traversable[Int])Boolean
  3. /* this evaluates to false because Array is converted
  4.  * to WrappedArray because it has to be implicitly converted
  5.  * to a Traversable.  Since Array is not a Traversable the resulting 
  6.  * object is not an Array
  7.  */
  8. scala> x(Array(1,2,3))                                     
  9. res24: Boolean = false
  10. scala> def x(t:Traversable[Int]) = println(t)
  11. x: (t: Traversable[Int])Unit
  12. // This method call demonstrates the previous assertion
  13. scala> x(Array(1,2,3))                                            
  14. WrappedArray(1, 2, 3)

So suppose you want to be able to accept and use arrays and Traversables in a method but you want to be able to
check that the parameter is an Array. Why not match against WrappedArray. You probably can, but you may get performance improvements in some cases if you don't require wrapping the array.

For a more concrete example of why you may want to do this. In a Input/Output routine I wrote I would write the data one way if the input was an Array: stream.write(array). But if the input was a traversable then I would have to handle it differently. My particular issue was more complicated than that but that is the basic issue.

So the work around is to define a Generic parameter for the method:
  1. scala> def x[T <% Traversable[Int]](t:T) = t match { 
  2.      | case x : Array[Int] => true                                
  3.      | }
  4. x: [T](t: T)(implicit evidence$1: (T) => Traversable[Int])Boolean
  5. scala> x(Array(1,2,3))                               
  6. res27: Boolean = true

Tuesday, April 13, 2010

Creating Custom Traversable implementations

One of the most talked about features of Scala 2.8 is the improved Collections libraries. Creating your own implementation is trivial, however if you want your new collection to behave the same way as all the included libraries there are a few tips you need to be aware of.

Note: All of these examples can either be ran in the REPL or put in a file and ran

Starting with the simple implementation:
  1. import scala.collection._
  2. import scala.collection.generic._
  3. class MyColl[A](seq : A*) extends Traversable[A] {
  4.     // only abstract method in traversable is foreach... easy :) 
  5.   def foreach[U](f: A => U) = util.Random.shuffle(seq.toSeq).foreach(f)
  6. }

This is a silly collection I admit but it is custom :).

This example works but if you test the result of a map operation (or any other operation that returns a new instance of the collection) you will notice it is not an instance of MyColl. This is expected because unless otherwise defined Traversable will return a new instance of Traversable.

To demonstrate run the following tests:
  1. val c = new MyColl(1, 2, 3)
  2. println (c mkString ",")
  3. println(c mkString ",")
  4. println(c drop 1 mkString ",")
  5. // this two next assertions fail (see following explanation)
  6. assert(c.drop(1).isInstanceOf[MyColl[_]])
  7. assert((c map {_ + 1}).isInstanceOf[MyColl[_]])

Both assertions will fail. The reason for these failures is because the collection is immutable which dictates by necessity that a new object must be returned from filter/map/etc... Since the Traversable trait returns instances of Traversable these two assertions fail. The easiest way to make these methods return an instance of MyColl is to make the following changes/additions.
  1. import scala.collection._
  2. import scala.collection.generic._
  3. /*
  4. Adding GenericTraversableTemplate will delegate the creation of new
  5. collections to the companion object.  Adding the trait and
  6. companion object causes all the new collections to be instances of MyColl
  7. */
  8. class MyColl[A](seq : A*) extends Traversable[A] 
  9.                              with GenericTraversableTemplate[A, MyColl] {
  10.   override def companion = MyColl
  11.   def foreach[U](f: A => U) = util.Random.shuffle(seq.toSeq).foreach(f)
  12. }
  13. // The TraversableFactory trait is required by GenericTraversableTemplate
  14. object MyColl extends TraversableFactory[MyColl] {
  15. /* 
  16. If you look at the signatures of many methods in TraversableLike they have an
  17. implicit parameter canBuildFrom.  This allows one to define how the returned collections
  18. are built.  For example one could make a list's map method return a Set
  19. In this case we define the default canBuildFrom for MyColl
  20. */
  21.   implicit def canBuildFrom[A]: CanBuildFrom[Coll, A, MyColl[A]] = new GenericCanBuildFrom[A]
  22. /*  
  23. The method that builds the new collection.  This is a simple implementation
  24. but it works.  There are other implementations to assist with implementation if
  25. needed
  26. */
  27.   def newBuilder[A] = new scala.collection.mutable.LazyBuilder[A,MyColl[A]] {
  28.     def result = {
  29.       val data = parts.foldLeft(List[A]()){(l,n) => l ++ n}
  30.       new MyColl(data:_*)
  31.     }
  32.   }
  33. }

Now instances of MyColl will be created by the various filter/map/etc... methods and that is fine as long as the new object is not required at compile-time. But suppose we added a method to the class and want that accessible after applying methods like map and filter.

Adding val o : MyColl[Long] = c map {_.toLong} to the assertions will cause a compilation error since statically the class returned is Traversable[Long]. The fix is easy.

All that needs to be done is to add with TraversableLike[A, MyColl[A]] to MyColl and we are golden. There may be other methods as well but this works and is simple.

Note that the order in which the traits are mixed in is important. TraversableLike[A, MyColl[A]] must be mixed in after Traversable[A]. The reason is that we want methods like map and drop to return instances of MyColl (statically as well as dynamically). If the order was reversed then those methods would return Traversable event though statically the actual instances would still be MyColl.
  1. import scala.collection._
  2. import scala.collection.generic._
  3. class MyColl[A](seq : A*) extends Traversable[A]
  4.                              with GenericTraversableTemplate[A, MyColl] 
  5.                              with TraversableLike[A, MyColl[A]] {
  6.   override def companion = MyColl
  7.   def foreach[U](f: A => U) = util.Random.shuffle(seq.toSeq).foreach(f)
  8. }
  9. object MyColl extends TraversableFactory[MyColl] {  
  10.   implicit def canBuildFrom[A]: CanBuildFrom[Coll, A, MyColl[A]] = new GenericCanBuildFrom[A]
  11.   def newBuilder[A] = new scala.collection.mutable.LazyBuilder[A,MyColl[A]] {
  12.     def result = {
  13.       val data = parts.foldLeft(List[A]()){(l,n) => l ++ n}
  14.       new MyColl(data:_*)
  15.     }
  16.   }
  17. }

Now add in a new method to demonstrate that the new collection works as desired and we are done.

The following is the complete implementation with the tests. You can put it in a file and run scala <filename> or paste it into a REPL
  1. import scala.collection._
  2. import scala.collection.generic._
  3. import scala.collection.mutable.{ Builder, ListBuffer }
  4. class MyColl[A](seq : A*) extends Traversable[A]
  5.                              with GenericTraversableTemplate[A, MyColl] 
  6.                              with TraversableLike[A, MyColl[A]] {
  7.   override def companion = MyColl
  8.   def foreach[U](f: A => U) = util.Random.shuffle(seq.toSeq).foreach(f)
  9.   def sayhi = println("hi!")
  10. }
  11. object MyColl extends TraversableFactory[MyColl] {  
  12.   implicit def canBuildFrom[A]: CanBuildFrom[Coll, A, MyColl[A]] = new GenericCanBuildFrom[A]
  13.   def newBuilder[A] = new ListBuffer[A] mapResult (x => new MyColl(x:_*))
  14. }
  15. val c = new MyColl(1, 2, 3)
  16. println (c mkString ",")
  17. println(c mkString ",")
  18. assert(c.drop(1).isInstanceOf[MyColl[_]])
  19. assert((c map {_ + 1}).isInstanceOf[MyColl[_]])
  20. val o : MyColl[Int] = c filter {_ < 2}
  21. println(o mkString "," )
  22. o.sayhi

Monday, March 15, 2010

Unzip

_ Scala 2.8 only tip _

Unzip is a handy companion to partition.
- Partition divides a traversable into two traversables by a boolean predicate.
- Unzip divides a traversable into two by dividing each element into two parts (each becomes an element in one traversable). If an element is a Tuple2 then each tuple is divided into two otherwise a function is required to divide an element into two.

  1. // basic usage
  2. scala> List((1,2),(2,3)).unzip
  3. res2: (List[Int], List[Int]) = (List(1, 2),List(2, 3))
  4. /* 
  5. tuples can be of different types 
  6. and the resulting traversables reflect the differing types
  7. */
  8. scala> List((2,"a"),(3,"b")).unzip
  9. res3: (List[Int], List[java.lang.String]) = (List(2, 3),List(a, b))
  10. // Maps are Traversable[Collection] so unzip works with them
  11. scala> Map(1 -> 2, 3 -> 4).unzip
  12. res1: (scala.collection.immutable.Iterable[Int], scala.collection.immutable.Iterable[Int]) = (List(1, 3),List(2, 4))
  13. // Of course sets result in sets and duplicates are collected to a single element
  14. scala> Set((1,2),(2,2)).unzip
  15. res7: (scala.collection.immutable.Set[Int], scala.collection.immutable.Set[Int]) = (Set(1, 2),Set(2))
  16. /*
  17. Arbitrary elements can be unziped if a method is provided to decompose each element
  18. */
  19. scala> List("one word""another word").unzip {e => (e takeWhile {_ != ' '}, e dropWhile {_ != ' '})} 
  20. res6: (List[String], List[String]) = (List(one, another),List( word,  word))
  21. /*
  22. The following shows the same function 
  23. applied with map.  It results in a single
  24.  list of Tuples rather than two lists of single elements
  25.  */
  26. scala> List("one word""another word").map {e => (e takeWhile {_ != ' '}, e dropWhile {_ != ' '})}  
  27. res8: List[(String, String)] = List((one, word), (another, word))

Monday, December 7, 2009

Scala 2.8 Traversable and IndexedSequence

This topic is really just a collection of examples using the Scala 2.8 API. The examples are mostly based on Strings. But because Strings are in fact an IndexedSeq all of these examples also work with Lists arrays and the first several work with all collections.

These examples are all based on first Traversable. The basis of all collections and therefore will work with all collections. And the next set of examples work with IndexedSeq and most will work with Seq as well. Sets, Maps and others still need to be examined.

This is not an exhaustive set of examples and not all are even that interesting (sorry :-) ). But I let out the most trivial unless theay are being compared to others.

Traversable:
  1. scala> val license = """Subject of Agreement. ?Product?, as referred to in this Agreement, shall be the binary software package ?Sun VirtualBox,? which Product allows for creating multiple virtual computers, each with different operating systems (?Guest Computers?), on a physical computer with a specific operating system (?Host Computer?), to allow for installing and executing these Guest Computers simultaneously. The Product consists of executable files in machine code for the Solaris, Windows, Linux, and Mac?OS?X operating systems as well as other data files as required by the executable files at run-time and documentation in electronic form. The Product includes all documentation and updates provided to You by Sun under this Agreement and the terms of this Agreement will apply to all such documentation and updates unless a different license is provided with an update or documentation."""
  2. res0: RandomAccessSeq[Char] = 
  3. license: java.lang.String = ...
  4. scala> license count (_ == 'u')
  5. res1: Int = 37
  6. scala> license split ' ' count {_ == "and"
  7. res3: Int = 6
  8. scala> license ++ " this is silly!"
  9. res0: RandomAccessSeq[Char] = ...
  10. scala> license dropWhile { _ != 'y'} drop 1 takeWhile { _ != 'y'}
  11. res6: Seq[Char] = RandomAccessSeqTW( , s, o, ...
  12. scala> license forall {_ != 'z'}
  13. res8: Boolean = true
  14. scala> (1 until 5).init
  15. res0: scala.collection.immutable.Range = Range(1, 2, 3)
  16. scala> (1 until 5).head
  17. res1: Int = 1
  18. scala> (1 until 5).tail
  19. res2: scala.collection.immutable.IndexedSeq[Int] = Range(2, 3, 4)
  20. scala> (1 until 5).last
  21. res3: Int = 4
  22. scala> (1 until 5).headOption
  23. res4: Option[Int] = Some(1)
  24. scala> (1 until 5).lastOption
  25. res6: Option[Int] = Some(4)
  26. scala> license max   
  27. res8: Char = y
  28. scala> license min
  29. res9: Char =  
  30. scala> List() nonEmpty 
  31. res10: Boolean = false
  32. scala> List() isEmpty 
  33. res11: Boolean = true
  34. scala> license partialMap {case c if ('a' to 'z' contains c) => c}
  35. res13: scala.collection.immutable.IndexedSeq[Any] = IndexedSeq(u, b, j, e, c ...)
  36. scala> 1 to 10000 by 2 product 
  37. res19: Int = 481029393
  38. scala> 1 to 10000 by 2 sum                   
  39. res23: Int = 25000000
  40. scala> license partition {"aeiou" contains _}
  41. res20: (StringString) = (ueoeeeouaeeeoiieeeaeeia ....
  42. scala> license span {_ != 'a'}
  43. res22: (StringString) = (Subject of Agreement. ?Product?, ,as referred to in this Agreement, shall be the binary software package ?Sun VirtualBox,? which Product allows for creating multiple virtual computers, each with differ ...
  44. scala> val consonants = license withFilter {c => !("aeiuo" contains c)}
  45. consonants: scala.collection.immutable.StringOps#WithFilter = scala.collection.TraversableLike$WithFilter@78e8a591
  46. scala> consonants foreach print _                                      
  47. Sbjct f Agrmnt. ?Prdct?, s rfrrd t n ths Agrmnt, shll b th bnry sftwr pckg ?Sn VrtlBx,? whch Prdct llws fr crtng mltpl vrtl cmptrs, ch wth dffrnt prtng systms (?Gst Cmptrs?), n  physcl cmptr wth  spcfc prtng systm (?Hst Cmptr?), t llw fr nstllng nd xctng ths Gst Cmptrs smltnsly. Th Prdct cnssts f xctbl fls n mchn cd fr th Slrs, Wndws, Lnx, nd Mc?OS?X prtng systms s wll s thr dt fls s rqrd by th xctbl fls t rn-tm nd dcmnttn n lctrnc frm. Th Prdct nclds ll dcmnttn nd pdts prvdd t Y by Sn ndr ths Agrmnt nd th trms f ths Agrmnt wll pply t ll sch dcmnttn nd pdts nlss  dffrnt lcns s prvdd wth n pdt r dcmnttn.


IndexedSequense additions to Traversable:
  1. scala> '!' +: license    
  2. res33: String = !Subject of Agreement
  3. scala> "hello" +: license
  4. res32: scala.collection.immutable.IndexedSeq[Any] = IndexedSeq(hello, S, u, b, j, e, c, t,  , o, f,  , A, g, r, e, e, m, e, n, t, .,  , ?, P, r, o, d, u, c, t, ?, ,,  , a, s,  , r, e, f, e, r, r, e, d,  , t, o,  , i, n,  , t, h, i, s,  , A, g, r, e, e, m, e, n, t, ,,  , s, h, a, l, l,  , b, e,  , t, h, e,  , b, i, n, a, r, y,  , s, o, f, t, w, a, r, e,  , p, a, c, k, a, g, e,  , ?, S, u, n,  , V, i, r, t, u, a, l, B, o, x, ,, ?,  , w, h, i, c, h,  , P, r, o, d, u, c, t,  , a, l, l
  5. scala> license(11)
  6. res35: Char = A
  7. scala> license diff ("aeiou")
  8. res47: String = Sbjct f Agreement. ?Product?, s referred to n this Agreement, shall be the binary software package ?Sun VirtualBox,? which Product allows for creating multiple virtual computers, each with different operating systems (?Guest Computers?), on a physical computer with a specific operating system (?Host Computer?), to allow for installing and executing these Guest Computers simultaneously. The Product consists of executable files in machine code for the Solaris, Windows, Linux, and Mac?OS?X operating systems as well as other data files as required by the executable files at run-time and documentation in electronic form. The Product includes all documentation and updates provided to You by Sun under this Agreement and the terms of this Agreement will apply to all such documentation and updates unless a different license is provided with an update or documentation.
  9. scala> license dropRight(30)
  10. res48: String = Subject of Agreement. ?Product?, as referred to in this Agreement, shall be the binary software package ?Sun VirtualBox,? which Product allows for creating multiple virtual computers, each with different operating systems (?Guest Computers?), on a physical computer with a specific operating system (?Host Computer?), to allow for installing and executing these Guest Computers simultaneously. The Product consists of executable files in machine code for the Solaris, Windows, Linux, and Mac?OS?X operating systems as well as other data files as required by the executable files at run-time and documentation in electronic form. The Product includes all documentation and updates provided to You by Sun under this Agreement and the terms of this Agreement will apply to all such documentation and updates unless a different license is provided wi
  11. scala> license takeRight(10)
  12. res49: String = mentation.
  13. scala> license indexWhere { "aeiou" contains _} 
  14. res4: Int = 1
  15. scala> license lastIndexWhere { "aeiou" contains _}
  16. res6: Int = 873
  17. scala> List(1,2,3) flatten {0 to _}
  18. res7: List[Int] = List(0, 1, 0, 1, 2, 0, 1, 2, 3)
  19. scala> ' ' +: license zip license indexWhere {case (last, c) => "" + last + c == "an"}       
  20. res14: Int = 343
  21. scala> license indexOfSlice ("and")
  22. res17: Int = 342
  23. scala> ('a' to 'z') ++ ('A' to 'Z'
  24. res25: scala.collection.IndexedSeqView[Char,IndexedSeq[_]] = IndexedSeqViewA(a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z)
  25. scala> license intersect res25
  26. res26: String = SubjectofAgrmnPdasihlywpkVBxvGCHTWLMOXqY
  27. scala> license diff res25     
  28. res27: String =   eeet. ?rouct?,  referred to n tis Agreement, shal be the binar softare acage ?Sun irtualo,? which Product allows for creating multiple irtual computers, each with different operating systems (?uest omputers?), on a physical computer with a specific operating system (?ost Computer?), to allow for installing and executing these Guest Computers simultaneously. he Product consists of executable files in machine code for the Solaris, indows, inux, and ac?S? operating systems as well as other data files as reuired by the executable files at run-time and documentation in electronic form. The Product includes all documentation and updates provided to ou by Sun under this Agreement and the terms of this Agreement will apply to all such documentation and updates unless a different license is provided with an update or documentation.
  29. scala> license filter res25
  30. < console>:7: error: type mismatch;
  31.  found   : scala.collection.IndexedSeqView[Char,IndexedSeq[_]]
  32.  required: (Char) => Boolean
  33.        license filter res25
  34.                       ^
  35. scala> license filter {res25 contains _}
  36. res29: String = SubjectofAgreementProductasreferredtointhisAgreementshallbethebinarysoftwarepackageSunVirtualBoxwhichProductallowsforcreatingmultiplevirtualcomputerseachwithdifferentoperatingsystemsGuestComputersonaphysicalcomputerwithaspecificoperatingsystemHostComputertoallowforinstallingandexecutingtheseGuestComputerssimultaneouslyTheProductconsistsofexecutablefilesinmachinecodefortheSolarisWindowsLinuxandMacOSXoperatingsystemsaswellasotherdatafilesasrequiredbytheexecutablefilesatruntimeanddocumentationinelectronicformTheProductincludesalldocumentationandupdatesprovidedtoYoubySununderthisAgreementandthetermsofthisAgreementwillapplytoallsuchdocumentationandupdatesunlessadifferentlicenseisprovidedwithanupdateordocumentation
  37. scala> license filterNot {res25 contains _}
  38. res30: String =   . ??,      ,       ? ,?        ,      (? ?),          (? ?),          .            , , ,  ??                -     .                                        .
  39. scala> license removeDuplicates
  40. res31: String = Subject ofAgrmn.?Pd,asihlywpkVBxv(GC)HTWLMOXq-Y
  41. scala> license segmentLength ({_ > 'A'},2) 
  42. res37: Int = 5
  43. scala> license drop 2 span {_ > 'A'} _1
  44. res41: String = bject
  45. scala> license sortWith Ordering.Char          
  46. res45: String =                                                                                                                                   (()),,,,,,,,,-....??????????AAAABCCCGGHLMOPPPPSSSSSTTVWXYaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbccccccccccccccccccccccccccccccddddddddddddddddddddddddddddddeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeefffffffffffffffffggggggggggghhhhhhhhhhhhhhhhhhhhhiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiijkllllllllllllllllllllllllllllllllllmmmmmmmmmmmmmmmmmmmmmmnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnoooooooooooooooooooooooooooooooooooooooooooooooopppppppppppppppppppqrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrsssssssssssssssssssssssssssssssssssssssssssssssttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuvvvwwwwwwwwwwxxxxxyyyyyyyyy
  47. scala> 'a' to 'z' union ('A' to 'Z')
  48. scala> "hello world" patch (6, "goof", 5)
  49. res51: String = hello goof
  50. scala> "hello world" updated(0,'H')
  51. res54: String = Hello world