Showing posts with label extractor. Show all posts

Tuesday, February 16, 2010

And Case Statements

Recently I encountered a good question on Stack Overflow about matching.
http://stackoverflow.com/questions/2261358/pattern-matching-with-conjunctions-patterna-and-patternb.

As mentioned in an earlier post, Matching with Or, case expressions support 'or' combinations of patterns using the '|' character. However, 'and' combinations are not possible.

One solution is to build an && extractor object as follows:
  scala> case object && { def unapply[A](a: A) = Some((a, a)) }
  defined module $amp$amp
  scala> object StartsWith { def unapply(s: String) = s.headOption }
  defined module StartsWith
  scala> object EndsWith { def unapply(s: String) = s.reverse.headOption }
  defined module EndsWith
  scala> "foo" match { case StartsWith('f') && EndsWith('o') => println("f*o") }
  f*o

Note: this is a Scala 2.7 solution. In Scala 2.8 the EndsWith extractor can be improved by using the lastOption method instead of s.reverse.headOption.
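
As a minimal sketch of the improvement mentioned in the note, assuming Scala 2.8 or later (where lastOption is available on strings), the extractors could look like the following. The AndDemo object and its describe method are hypothetical names added only to make the example self-contained; they are not part of the original solution.

```scala
// Always succeeds and hands the same value to both sub-patterns.
object && {
  def unapply[A](a: A): Some[(A, A)] = Some((a, a))
}

object StartsWith {
  def unapply(s: String): Option[Char] = s.headOption
}

object EndsWith {
  // lastOption avoids building a reversed copy of the string
  def unapply(s: String): Option[Char] = s.lastOption
}

// Hypothetical helper, not part of the original post
object AndDemo {
  def describe(s: String): String = s match {
    case StartsWith('f') && EndsWith('o') => "f*o"
    case _                                => "no match"
  }
}
```

An empty string yields None from both extractors, so it simply falls through to the default case.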

Thursday, January 28, 2010

Overcoming Type Erasure in matching 1

Since Scala runs on the JVM (it also runs on .NET, but this tip is aimed at Scala on the JVM), much of the type information that is available at compile time is lost at runtime. This means certain types of matching are not possible. For example, it would be nice to be able to do the following:
  scala> val x : List[Any] = List(1.0,2.0,3.0)
  x: List[Any] = List(1.0, 2.0, 3.0)
  scala> x match { case l : List[Boolean] => l(0) }

If you run this code, the list of doubles matches the List[Boolean] pattern, which is incorrect (I ran the following with -unchecked so that it prints the warning about erasure):
  scala> val x : List[Any] = List(1.0,2.0,3.0)
  x: List[Any] = List(1.0, 2.0, 3.0)
  scala> x match { case l : List[Boolean] => l(0) }
  <console>:6: warning: non variable type-argument Boolean in type pattern List[Boolean] is unchecked since it is eliminated by erasure
         x match { case l : List[Boolean] => l(0) }
                            ^
  java.lang.ClassCastException: java.lang.Double cannot be cast to java.lang.Boolean
    at scala.runtime.BoxesRunTime.unboxToBoolean(Unknown Source)
    at .<init>(<console>:6)

Another example is trying to match on structural types. Wouldn't it be nice to be able to do the following (We will solve this in a future post):
  scala> "a string" match {
       | case l : {def length : Int} => l.length
       | }
  <console>:6: warning: refinement AnyRef{def length: Int} in type pattern AnyRef{def length: Int} is unchecked since it is eliminated by erasure
         case l : {def length : Int} => l.length
                  ^
  res5: Int = 8

So let's see what we can do about this. My proposed solution is to create a class whose instances can be used as extractors that perform the check.

This is a fairly advanced tip, so make sure you have read up on Matchers and Manifests.

The key parts of the next examples are the Def class and the function 'func'. 'func' is defined in the comments in the code block.

The Def class is the definition of what we want to match. Once we have the definition we can use it as an Extractor to match List[Int] instances.

The Def solution is quite generic, so it will satisfy many cases where you might want to do matching but erasure is getting in the way.

The secret is to use manifests:
  • When the Def instance is created, the manifest for the type we want is captured.
  • Each time a match is attempted, the manifest of the value being matched is captured.
  • In the unapply method, the two manifests are compared; when they match, we can return the matched element, cast to the desired type for the compiler.

It is critical to notice the use of the typeArguments of the manifest. This returns a list of the manifests of each type argument. You cannot simply compare desired == m because manifest comparisons are not deep. There is a weakness in this code: it only handles generics that are one level deep. For example, List[Int] can be matched with the following code, but List[List[Int]] will match anything that is List[List[_]]. Making the method more generic is an exercise left to the reader.
  scala> import reflect._
  import reflect._
  /*
  This is the key class
  */
  scala> class Def[C](implicit desired : Manifest[C]) {
       | def unapply[X](c : X)(implicit m : Manifest[X]) : Option[C] = {
       |   def sameArgs = desired.typeArguments.zip(m.typeArguments).forall {case (desired,actual) => desired >:> actual}
       |   if (desired >:> m && sameArgs) Some(c.asInstanceOf[C])
       |   else None
       | }
       | }
  defined class Def
  // first define what we want to match
  scala> val IntList = new Def[List[Int]]
  IntList: Def[List[Int]] = Def@6997f7f4
  /*
  Now use the object as an extractor. The variable l will be a typesafe List[Int]
  */
  scala> List(1,2,3) match { case IntList(l) => l(1) : Int ; case _ => -1 }
  res36: Int = 2
  scala> List(1.0,2,3) match { case IntList(l) => l(1) : Int ; case _ => -1 }
  res37: Int = -1
  // 32 is not a list so it will fail to match
  scala> 32 match { case IntList(l) => l(1) : Int ; case _ => -1 }
  res2: Int = -1
  scala> Map(1 -> 3) match { case IntList(l) => l(1) : Int ; case _ => -1 }
  res3: Int = -1
  /*
  The Def class can be used with any generic type; it is not restricted to collections
  */
  scala> val IntIntFunction = new Def[Function1[Int,Int]]
  IntIntFunction: Def[(Int) => Int] = Def@5675e2b4
  // this will match because it is a function
  scala> ((i:Int) => 10) match { case IntIntFunction(f) => f(3); case _ => -1}
  res38: Int = 10
  // no match because 32 is not a function
  scala> 32 match { case IntIntFunction(f) => f(3); case _ => -1}
  res39: Int = -1
  // Cool: Map is a function so it will match
  scala> Map(3 -> 100) match { case IntIntFunction(f) => f(3); case _ => -1}
  res6: Int = 100
  /*
  Now we see both the power and the limitations of this solution.
  One might expect that:
     def func(a:Any) = {...}
  would work.
  However, if the function is defined with 'Any', the compiler does not pass in
  the required information, so the manifest that is created will be an Any
  manifest. Because of this, the more convoluted method declaration must be used
  so that the type information is passed in.
  I will discuss implicit parameters in some other post
  */
  scala> def func[A](a:A)(implicit m:Manifest[A]) = {
       |   a match {
       |     case IntList(l) => l.head
       |     case IntIntFunction(f) => f(32)
       |     case i:Int => i
       |     case _ => -1
       |   }
       | }
  func: [A](a: A)(implicit m: Manifest[A])Int
  scala> func(List(1,2,3))
  res16: Int = 1
  scala> func('i')
  res17: Int = -1
  scala> func(4)
  res18: Int = 4
  scala> func((i:Int) => i+2)
  res19: Int = 34
  scala> func(Map(32 -> 2))
  res20: Int = 2
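
One possible way to approach the exercise of handling nested generics is to compare the manifests recursively. The following is only a sketch under several assumptions: it targets Scala 2.x, where Manifest is available; it compares erased runtime classes with isAssignableFrom rather than using >:>; and it ignores declared variance and boxing of primitives. The names DeepManifest, DeepDef and DeepDefDemo are hypothetical and do not come from the original code.

```scala
import scala.reflect.Manifest

object DeepManifest {
  // Compares the erased classes first, then recurses into every type argument.
  // Assumption: declared variance is ignored and primitives are not boxed.
  def conforms(desired: Manifest[_], actual: Manifest[_]): Boolean =
    desired.runtimeClass.isAssignableFrom(actual.runtimeClass) &&
      desired.typeArguments.length == actual.typeArguments.length &&
      desired.typeArguments.zip(actual.typeArguments).forall {
        case (d, a) => conforms(d, a)
      }
}

// Hypothetical variant of Def that also checks nested type arguments
class DeepDef[C](implicit desired: Manifest[C]) {
  def unapply[X](c: X)(implicit m: Manifest[X]): Option[C] =
    if (DeepManifest.conforms(desired, m)) Some(c.asInstanceOf[C]) else None
}

object DeepDefDemo {
  val NestedIntList = new DeepDef[List[List[Int]]]

  // As with func above, the manifest must be passed in implicitly
  def firstInt[A](a: A)(implicit m: Manifest[A]): Int = a match {
    case NestedIntList(l) => l.head.head
    case _                => -1
  }
}
```

With this sketch a List[List[String]] should no longer be accepted by the List[List[Int]] extractor, because the comparison reaches the innermost type argument. As with Def, the check relies on the static type known at the call site, not on the runtime contents of the collection.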

Thursday, October 29, 2009

Boolean Extractors

As discussed in other topics, there are several ways to create custom extractors for objects. There is one more kind of custom extractor that can be defined: the simple boolean extractor. Boolean extractors are used as follows:
  scala> "hello world" match {
       | case HasVowels() => println("Vowel found")
       | case _ => println("No Vowels")
       | }
  Vowel found

A boolean extractor is an object that returns Boolean from the unapply method rather than Option[_].

Examples:
  scala> object HasVowels { def unapply(in:String):Boolean = in.exists( "aeiou" contains _ ) }
  defined module HasVowels
  // Note that HasVowels() has ().
  // This is critical because otherwise the match checks whether
  // the input is the HasVowels object.
  // The () forces the unapply method to be used for matching
  scala> "hello world" match {
       | case HasVowels() => println("Vowel found")
       | case _ => println("No Vowels")
       | }
  Vowel found
  // Don't forget the ()
  scala> "kkkkkk" match {
       | case HasVowels() => println("Vowel found")
       | case _ => println("No Vowels")
       | }
  No Vowels
  scala> class HasChar(c:Char) {
       | def unapply(in:String) = in.contains(c)
       | }
  defined class HasChar
  scala> val HasC = new HasChar('c')
  HasC: HasChar = HasChar@3f2f529b
  // Don't forget the (); it is required here as well
  scala> "It actually works!" match {
       | case HasC() => println("a c was found")
       | case _ => println("no c found")
       | }
  a c was found
  // Don't forget the ()
  scala> "hello world" match {
       | case HasC() => println("a c was found")
       | case _ => println("no c found")
       | }
  no c found
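
Because a boolean extractor binds nothing, the matched value is only available inside the case body if it is bound explicitly with @. The following minimal sketch illustrates this; BooleanExtractorDemo and classify are hypothetical names introduced for the example.

```scala
object HasVowels {
  // Returns Boolean rather than Option, so the pattern takes no arguments
  def unapply(in: String): Boolean = in.exists(c => "aeiou".contains(c))
}

// Hypothetical helper, not part of the original post
object BooleanExtractorDemo {
  def classify(s: String): String = s match {
    // word @ ... binds the whole input when HasVowels.unapply returns true
    case word @ HasVowels() => word + " has vowels"
    case other              => other + " has no vowels"
  }
}
```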

Wednesday, October 28, 2009

Extractors 3 (Operator Style Matching)

One of the blessings and curses of Scala is its several rules for creating expressive code (a curse because they can be used for evil). One such rule is related to extractors and allows the following style of match pattern:
  List(1,2,3) match {
    case 1 :: _ => println("found a list that starts with a 1")
    case _ => println("boo")
  }

The rule is very simple. An extractor object that returns Option[Tuple2[_,_]] (or equivalently Option[(_,_)]) can be expressed in this form.

In other words: object X { def unapply(in:String):Option[(String,String)] = ... } can be used in a case statement like: case first X head => ... or case "a" X head => ....

Example to extract out the vowels from a string:
  scala> object X { def unapply(in:String):Option[(RichString,RichString)] = Some(in.partition( "aeiou" contains _ )) }
  defined module X
  scala> "hello world" match { case head X tail => println(head, tail) }
  (eoo,hll wrld)
  // This is equivalent but a different way of expressing it
  scala> "hello world" match { case X(head, tail) => println(head, tail) }
  (eoo,hll wrld)
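
The session above uses RichString, which is specific to Scala 2.7 and was removed in Scala 2.8. As a sketch for later versions, assuming a Scala release in which partition on a String returns a pair of Strings, the same extractor can declare String in its result type. VowelDemo and split are hypothetical names used only for illustration.

```scala
object X {
  // Splits a string into (vowels, everything else)
  def unapply(in: String): Option[(String, String)] =
    Some(in.partition(c => "aeiou".contains(c)))
}

// Hypothetical helper, not part of the original post
object VowelDemo {
  def split(s: String): (String, String) = s match {
    // Infix form of X(vowels, rest)
    case vowels X rest => (vowels, rest)
  }
}
```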


Example for matching the last element in a list. Thanks to 3-things-you-didnt-know-scala-pattern.html:
  scala> object ::> { def unapply[A] (l: List[A]) = Some( (l.init, l.last) ) }
  defined module $colon$colon$greater
  scala> List(1, 2, 3) match {
       | case _ ::> last => println(last)
       | }
  3
  scala> (1 to 9).toList match {
       | case List(1, 2, 3, 4, 5, 6, 7, 8) ::> 9 => "woah!"
       | }
  res12: java.lang.String = woah!
  scala> (1 to 9).toList match {
       | case List(1, 2, 3, 4, 5, 6, 7) ::> 8 ::> 9 => "w00t!"
       | }
  res13: java.lang.String = w00t!

Tuesday, September 29, 2009

Extract sequences (unapplySeq)

This topic continues the previous topic on matching and Extractors. Make sure you look at Extractors 1.

The first extractor topic covered the unapply method and how it is used during matching. Today I want to visit a similar method, unapplySeq, which is used to match sequences. The method def unapplySeq(param):Option[Seq[T]] can be used instead of unapply.

Note: if both unapply and unapplySeq are defined, only unapply is used.

When matching on sequences, the _* symbol means to match an arbitrary sequence. We use this several times in the examples below:
  scala> object FindAs {
       | def unapplySeq(string:String):Option[List[String]] = {
       |    def containsA (word:String) = word.toLowerCase contains "a"
       |
       |    if (string.toLowerCase contains "a") {
       |      val words = string.split ("\\s+").
       |                         filter (containsA _)
       |      Some(words.toList)
       |    } else {
       |      None
       |    }
       | }
       | }
  defined module FindAs
  // as usual you can use extractors to assign variables
  scala> val FindAs(a,b) = "This sentence contains 2 a-s"
  a: String = contains
  b: String = a-s
  // If you only care about the first variable you can use _* to
  // reference the rest of the sequence that you don't care about
  scala> val FindAs(a, _*) = "A crazy a sentence ack!"
  a: String = A
  // using b @ _* we can get the rest of the sequence assigned to b
  scala> val FindAs(a, b@_*) = "A crazy a sentence ack!"
  a: String = A
  b: Seq[String] = List(crazy, a, ack!)
  // standard matching pattern
  scala> "This sentence contains 2 a-s" match {
       | case FindAs(a,b) => println(a,b)
       | case _ => println("whoops")
       | }
  (contains,a-s)
  // In this example we only care that it can match, not about the values,
  // so we ignore all of the actual sequence by using _* as the parameters
  scala> "This sentence contains 2 a-s" match {
       |  case FindAs(_*) => println("a match")
       | }
  a match
  scala> "This sentence contains 2 a-s" match {
       | case FindAs( first, _*) => println("first word = "+first)
       | }
  first word = contains
  scala> "A crazy a sentence ack!" match {
       | case FindAs( first, next, rest @ _*) => println("1=%s, 2=%s, rest=%s".format(first, next, rest) )
       | }
  1=A, 2=crazy, rest=List(a, ack!)

Monday, September 28, 2009

Extractors 1 (Unapply)

When defining a match such as case Tuple2(one, two), the Tuple2.unapply or Tuple2.unapplySeq method is called to see if that case can match the input. If the method returns a Some(...) object, then the case is considered to be a match. These methods are called extractor methods because they essentially decompose the object into several parameters.

I will cover unapplySeq later.

Examples are the best way to illustrate the issue:
  // The unapply method of this object takes a string and returns an Option[String]
  //   This means that the value being matched must be a string and the value extracted is also a string
  scala> object SingleParamExtractor {
       | def unapply(v:String):Option[String] = if(v.contains("Hello")) Some("Hello") else None
       | }
  defined module SingleParamExtractor
  // this will match since the extractor will return a Some object
  scala> "Hello World" match { case SingleParamExtractor(v) => println(v) }
  Hello
  // this will not match and therefore an exception will be thrown
  scala> "Hi World" match { case SingleParamExtractor(v) => println(v) }
  scala.MatchError: Hi World
                    at .(:7)
                    at .()
                    at RequestResult$.(:3)
                    at RequestResult$.()
                    at RequestResult$result()
                    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
                    at sun.reflect.DelegatingMethodAccessorImpl.invok...
  // This extractor converts the string to an int if possible.
  // Only NumberFormatException is caught so that unrelated errors are not swallowed.
  scala> object ConvertToInt {
       | def unapply(v:String):Option[Int] = try { Some(v.toInt) } catch { case _: NumberFormatException => None }
       | }
  defined module ConvertToInt
  scala> "10" match { case ConvertToInt(i) => println(i)}
  10
  // If you want to extract multiple elements you return an Option that contains a Tuple.
  //   In this example we divide the string into two parts if it has a space
  scala> object MultipleParamExtractor {
       |  def unapply(v:String):Option[(String,String)] = (v indexOf ' ') match {
       | case x if (x>0) => Some ((v take x, v drop x+1))
       | case _ => None
       | }
       | }
  defined module MultipleParamExtractor
  scala> "hello everyone :)" match { case MultipleParamExtractor(one, two) => println(one,two) }
  (hello,everyone :))
  // Any value with an unapply method can be used; it does not have to be defined as an object.
  // So if you have a class of extractors that needs to be parameterized you can
  // create a class and use instances of that class for matching
  scala> class Splitter(sep:Char){
       | def unapply(v:String):Option[(String,String)] = (v indexOf sep) match {
       | case x if (x>0) => Some ((v take x, v drop x+1))
       | case _ => None
       | }
       | }
  defined class Splitter
  // Remember that the name of the matching object must start with an uppercase letter
  // See http://daily-scala.blogspot.com/2009/09/case-sensitive-matching.html
  // for details
  scala> val SplitOnComma = new Splitter (',')
  SplitOnComma: Splitter = Splitter@15eb9b0d
  // How cool: now we can create splitters for all sorts of things
  scala> "1,2" match { case SplitOnComma(one,two) => println(one,two)}
  (1,2)
  // All extractors can also be used in assignments
  scala> val SplitOnComma(one,two) = "1,2"
  one: String = 1
  two: String = 2
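
Since extractors decompose a value into parts, they can also be nested so that each part is matched by a further extractor. The following sketch combines the Splitter and ConvertToInt extractors from the session above; NestedExtractorDemo and sum are hypothetical names added for the example.

```scala
class Splitter(sep: Char) {
  // Splits at the first separator, provided it is not the first character
  def unapply(v: String): Option[(String, String)] = v.indexOf(sep) match {
    case x if x > 0 => Some((v.take(x), v.drop(x + 1)))
    case _          => None
  }
}

object ConvertToInt {
  def unapply(v: String): Option[Int] =
    try Some(v.toInt) catch { case _: NumberFormatException => None }
}

// Hypothetical helper, not part of the original post
object NestedExtractorDemo {
  val SplitOnComma = new Splitter(',')

  def sum(s: String): Option[Int] = s match {
    // Each half produced by SplitOnComma is fed to ConvertToInt
    case SplitOnComma(ConvertToInt(a), ConvertToInt(b)) => Some(a + b)
    case _                                              => None
  }
}
```

The case only matches when the outer extractor and both inner extractors succeed, so a non-numeric half falls through to the default case.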

Wednesday, September 9, 2009

Companion Object

A companion object is an object with the same name as a class or trait that is defined in the same source file as the associated class or trait. A companion object differs from other objects in that it has access rights to the class/trait that other objects do not. In particular, it can access methods and fields that are private in the class/trait.

An analog to a companion object in Java is having a class with static methods. In Scala you would move the static methods to a Companion object.

One of the most common uses of a companion object is to define factory methods for the class. An example is case classes. When a case class is declared, a companion object is created for it with a factory method that has the same signature as the primary constructor of the case class. That is why one can create an instance of a case class like this: MyCaseClass(param1, param2). No new keyword is required for case-class instantiation.

A second common use-case for companion objects is to create extractors for the class. I will mention extractors in a future topic. Basically extractors allow matching to work with arbitrary classes.

NOTE: Because the companion object and the class must be defined in the same source file you cannot create them in the interpreter. So copy the following example into a file and run it in script mode:

scala mysourcefile.scala


Example:

  class MyString(val jString:String) {
    private var extraData = ""
    override def toString = jString+extraData
  }
  object MyString {
    def apply(base:String, extras:String) = {
      val s = new MyString(base)
      s.extraData = extras
      s
    }
    def apply(base:String) = new MyString(base)
  }
  println(MyString("hello"," world"))
  println(MyString("hello"))
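
The second use case, an extractor defined in the companion object, can be sketched by adding an unapply method to the example. This is only an illustration: the unapply method and the choice to expose the private extraData field through it are assumptions, not part of the original example.

```scala
class MyString(val jString: String) {
  private var extraData = ""
  override def toString = jString + extraData
}

object MyString {
  def apply(base: String, extras: String): MyString = {
    val s = new MyString(base)
    s.extraData = extras // allowed: the companion can access private members
    s
  }
  def apply(base: String): MyString = new MyString(base)

  // Hypothetical extractor: decomposes a MyString into (base, extras)
  def unapply(s: MyString): Option[(String, String)] =
    Some((s.jString, s.extraData))
}
```

With this in place, a pattern such as case MyString(base, extras) can be used in a match, mirroring the way the apply methods are used for construction.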