Thursday, January 28, 2010

Overcoming Type Erasure in matching 1

Since Scala runs on the JVM (it also runs on .NET, but this tip is aimed at Scala on the JVM), much of the type information that is available at compile time is lost at runtime. This means certain kinds of matching are not possible. For example, it would be nice to be able to do the following:
  scala> val x : List[Any] = List(1.0,2.0,3.0)
  x: List[Any] = List(1.0, 2.0, 3.0)
  scala> x match { case l : List[Boolean] => l(0) }

If you run this code, the list of Doubles matches the List[Boolean] pattern, which is incorrect (I ran the following with -unchecked so that it prints the warning about erasure):
  scala> val x : List[Any] = List(1.0,2.0,3.0)
  x: List[Any] = List(1.0, 2.0, 3.0)
  scala> x match { case l : List[Boolean] => l(0) }
  <console>:6: warning: non variable type-argument Boolean in type pattern List[Boolean] is unchecked since it is eliminated by erasure
         x match { case l : List[Boolean] => l(0) }
                            ^
  java.lang.ClassCastException: java.lang.Double cannot be cast to java.lang.Boolean
  at scala.runtime.BoxesRunTime.unboxToBoolean(Unknown Source)
  at .<init>(<console>:6)
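
To make the effect of erasure concrete, here is a minimal sketch, assuming Scala 2.10 or later. The object name ErasureDemo and its members are hypothetical names made up for illustration. It shows that the runtime check behind a List[Boolean] pattern only looks at List, so a list of Doubles passes it:

```scala
object ErasureDemo {
  // Same value as in the REPL session above.
  val doubles: List[Any] = List(1.0, 2.0, 3.0)

  // After erasure this pattern only tests "is it a List?".
  // @unchecked silences the erasure warning; it does not add a check.
  def matchesBooleanList(x: Any): Boolean = x match {
    case _: List[Boolean @unchecked] => true
    case _                           => false
  }

  // Lists with different element types share one runtime class.
  val sameRuntimeClass: Boolean = List(1.0).getClass == List(true).getClass
}
```

The ClassCastException in the session above only appears later, when l(0) is unboxed as a Boolean.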

Another example is trying to match on structural types. Wouldn't it be nice to be able to do the following without the erasure warning (we will solve this in a future post):
  scala> "a string" match {
       | case l : {def length : Int} => l.length
       | }
  <console>:6: warning: refinement AnyRef{def length: Int} in type pattern AnyRef{def length: Int} is unchecked since it is eliminated by erasure
         case l : {def length : Int} => l.length
                  ^
  res5: Int = 8

So let's see what we can do about this. My proposed solution is to create a class whose instances can be used as extractors that perform the check.

This is a fairly advanced tip, so make sure you have read up on Matchers and Manifests.

The key parts of the next examples are the Def class and the function 'func'. 'func' is defined in the comments in the code block.

The Def class is the definition of what we want to match. Once we have the definition we can use it as an Extractor to match List[Int] instances.

The Def solution is quite generic, so it will satisfy many cases where you might want to do matching but erasure is getting in the way.

The secret is to use manifests:
  • When the Def instance is created, the manifest for the type we want is captured.
  • Each time a match is attempted, the manifest of the value being matched is captured.
  • In the unapply method, the two manifests are compared. When they match, we return the matched element, cast to the desired type for the compiler.
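
The capture step can be sketched as follows. This is a minimal illustration, assuming Scala 2.10 to 2.13 (where Manifest has runtimeClass); the object ManifestCapture and the helper manifestOf are hypothetical names, not part of the post's code. It shows that the manifest comes from the static type at the call site, that the erased class alone cannot tell List[Int] from List[Double], and that a value typed as Any yields only the Any manifest:

```scala
object ManifestCapture {
  // The compiler fills in `m` from the static type of the argument.
  def manifestOf[X](value: X)(implicit m: Manifest[X]): Manifest[X] = m

  val ints    = manifestOf(List(1, 2, 3))        // manifest for List[Int]
  val doubles = manifestOf(List(1.0, 2.0, 3.0))  // manifest for List[Double]

  // The same list, but its static type is Any, so no element type is recorded.
  val asAny: Any = List(1, 2, 3)
  val erased     = manifestOf(asAny)

  // The erased classes agree; only the type arguments differ.
  val sameErasedClass = ints.runtimeClass == doubles.runtimeClass
  val sameTypeArgs    = ints.typeArguments == doubles.typeArguments
}
```

Here sameErasedClass is true and sameTypeArgs is false, which is why the comparison in unapply has to look at typeArguments as well.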

It is critical to notice the use of the manifest's typeArguments, which returns a list of manifests, one for each type argument. You cannot simply compare desired == m because manifest comparisons are not deep. There is a weakness in this code: it only handles generics that are one level deep. For example:

List[Int] can be matched with the following code, but List[List[Int]] will match anything that is List[List[_]]. Making the method more generic is an exercise left to the reader.
  scala> import reflect._
  import reflect._
  /*
  This is the key class
  */
  scala> class Def[C](implicit desired : Manifest[C]) {
       | def unapply[X](c : X)(implicit m : Manifest[X]) : Option[C] = {
       |   def sameArgs = desired.typeArguments.zip(m.typeArguments).forall {case (desired,actual) => desired >:> actual}
       |   if (desired >:> m && sameArgs) Some(c.asInstanceOf[C])
       |   else None
       | }
       | }
  defined class Def
  // first define what we want to match
  scala> val IntList = new Def[List[Int]]
  IntList: Def[List[Int]] = Def@6997f7f4
  /*
  Now use the object as an extractor. The variable l will be a typesafe List[Int]
  */
  scala> List(1,2,3) match { case IntList(l) => l(1) : Int ; case _ => -1 }
  res36: Int = 2
  scala> List(1.0,2,3) match { case IntList(l) => l(1) : Int ; case _ => -1 }
  res37: Int = -1
  // 32 is not a list so it will fail to match
  scala> 32 match { case IntList(l) => l(1) : Int ; case _ => -1 }
  res2: Int = -1
  scala> Map(1 -> 3) match { case IntList(l) => l(1) : Int ; case _ => -1 }
  res3: Int = -1
  /*
  The Def class can be used with any generic type; it is not restricted to collections
  */
  scala> val IntIntFunction = new Def[Function1[Int,Int]]
  IntIntFunction: Def[(Int) => Int] = Def@5675e2b4
  // this will match because it is a function
  scala> ((i:Int) => 10) match { case IntIntFunction(f) => f(3); case _ => -1}
  res38: Int = 10
  // no match because 32 is not a function
  scala> 32 match { case IntIntFunction(f) => f(3); case _ => -1}
  res39: Int = -1
  // Cool, Map is a function so it will match
  scala> Map(3 -> 100) match { case IntIntFunction(f) => f(3); case _ => -1}
  res6: Int = 100
  /*
  Now we see both the power and the limitations of this solution.
  One might expect that:
     def func(a:Any) = {...}
  would work.
  However, if the function is defined with 'Any', the compiler does not pass in the required information, so the manifest that is created will be an Any manifest object. Because of this, the more convoluted method declaration must be used so that the type information is passed in.
  I will discuss implicit parameters in some other post
  */
  scala> def func[A](a:A)(implicit m:Manifest[A]) = {
       |   a match {
       |     case IntList(l) => l.head
       |     case IntIntFunction(f) => f(32)
       |     case i:Int => i
       |     case _ => -1
       |   }
       | }
  func: [A](a: A)(implicit m: Manifest[A])Int
  scala> func(List(1,2,3))
  res16: Int = 1
  scala> func('i')
  res17: Int = -1
  scala> func(4)
  res18: Int = 4
  scala> func((i:Int) => i+2)
  res19: Int = 34
  scala> func(Map(32 -> 2))
  res20: Int = 2
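
As one possible direction for the exercise mentioned earlier (handling generics nested more than one level deep), here is a hedged sketch, assuming Scala 2.10 to 2.13. It is not the post's Def: the names NestedMatch, conforms, DeepDef, NestedIntList and firstInt are hypothetical, and the check swaps >:> for a comparison of erased classes that recurses into every type argument. It treats all type arguments as covariant and assumes a subclass lists its type arguments in the same order as its parent, so it is an illustration rather than a sound subtype test:

```scala
object NestedMatch {
  // Hypothetical helper: compare erased classes, then recurse into each
  // type argument so that nesting depth no longer matters.
  // Assumption: variance is ignored (everything is treated as covariant).
  def conforms(desired: Manifest[_], actual: Manifest[_]): Boolean = {
    val desiredArgs = desired.typeArguments
    val actualArgs  = actual.typeArguments
    desired.runtimeClass.isAssignableFrom(actual.runtimeClass) &&
      desiredArgs.length == actualArgs.length &&
      desiredArgs.zip(actualArgs).forall { case (d, a) => conforms(d, a) }
  }

  // Same shape as Def above, but delegating to the recursive check.
  class DeepDef[C](implicit desired: Manifest[C]) {
    def unapply[X](c: X)(implicit m: Manifest[X]): Option[C] =
      if (conforms(desired, m)) Some(c.asInstanceOf[C]) else None
  }

  val NestedIntList = new DeepDef[List[List[Int]]]

  // As with func above, the manifest must be passed in for the match to work.
  def firstInt[A](a: A)(implicit m: Manifest[A]): Int = a match {
    case NestedIntList(l) => l.head.head
    case _                => -1
  }
}
```

With this sketch, firstInt(List(List(1, 2), List(3))) takes the first branch, while a List[List[String]] or a plain List[Int] falls through to the default case.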

6 comments:

  1. Is there a way to deal with some type arguments being contravariant? Try the following:

    class A

    class B extends A

    val AAFunction = new Def[Function1[A,A]]

    ((a:A) => a) match {case AAFunction(f) => Some(f(new A)); case _ => None} // this is OK

    ((a:A) => new B) match {case AAFunction(f) => Some(f(new A)); case _ => None} // this is OK

    ((b:B) => b) match {case AAFunction(f) => Some(f(new A)); case _ => None} // gives a ClassCastException, since new A is not a B

  2. Awesome point. I have a new post addressing this

  3. Hi,
    Looks like it doesn't work with
    new Def[Function1[Int,Unit]]

  4. I would like to match objects that are a Tuple2 of the form (String, List[MyCustomTrait]) and then use both ._1 and ._2. Would it be possible to use this approach?

    case (k : String, IntList(l)) =>

    did not prove fruitful. Thanks.

  5. I think I have a similar problem to @wethewolves, and I have code to test against. See the gist. Do you have a solution?

    It prints out three WTF's, I expected STRING! STRING! LONG! How to deal with type erasure in this case?

    Thanks.
