I didn’t think I will ever write about homework assignments, but this one was too cool. The task: “Write a wrapper class for collections, that reifies all operations on it.” In other words, queries on the collection should be available as AST in order to perform optimizations on them. This post is about one small problem related to the task.
Let’s take a look at a small piece of code, that illustrates the problem.
abstract class Rep[T]
trait Collection[T] extends Rep[List[T]] {
def map[S](f: Rep[T] => Rep[S]): Rep[List[S]] = ...
}
trait Numerics extends Rep[Int] {
def +(other: Rep[Int]) = ...
}
We have to work on representations of values. Thus, Rep[T]
is a node representing a
value of type T
. It is necessary to define different methods for Rep
depending on the
exact type of T
. For collections we might want to define map
, as for numbers a +
is
required.
Up to now, everything looks fine. How about adding identifiers, that can be both, either
a collection or numeric, depending on it’s type T
.
case class Identifier[T] extends Collection[T] with Numerics
This isn’t correct for multiple reasons:
Collection[T]
imposes Rep[List[T]]
on the type of identifier
whereas Numerics
requires the identifier to be of type Rep[Int]
.So how can this be achieved?
The first solution I want to present is based on Scala’s type constraints. They are part
of the Predef and called
<:<
and =:=
. Without digging too deep into the inner workings of those classes, let’s
take a look at the solution first:
abstract class Rep[T] extends Collection[Rep[T]] with Numerics[Rep[T]]
trait Collection[Self] { self: Self =>
def map[T,S](f: Rep[T] => Rep[S])(implicit ev: Self <:< Rep[List[T]]): Rep[List[S]] = ...
}
trait Numerics[Self] { self: Self =>
def +(other: Rep[Int])(implicit ev: Self <:< Rep[Int]) = ...
}
case class Identifier[T](name: Symbol) extends Rep[T]
The first line is changed just a little. The representation mixes in both traits Collection
and Numerics
. Both times it provides it’s own type as argument to the type constructors.
In addition to the type argument, a self type annotation is added to the traits. It allows to use this
as instance of Self
within the implementation.
The interesting part is the added parameter list
(implicit ev: Self <:< Rep[List[T]])
It says: “Please provide an (implicit) argument, that proves evidence of the subtype relationship between Self
and Rep[List[T]]
”. In other words:
Self
has to be a subtype ofRep[List[T]]
in order to be able to apply the method.
This way map
can only be used on Identifier[List[T]]
and +
can only be used on Identifier[Int]
. Cool.
The downside is the additional type parameter on map
(T
). It makes inferring the types for Scala much harder and thus this encoding did not work with for comprehensions.
The same could also be achieved with a different encoding. The “Pimp my Library” pattern has been invented by Martin Odersky. The pattern is so cool and widely used, that with Scala 2.10 it has built-in language support.
abstract class Rep[T]
implicit class Collection[T](that: Rep[List[T]]) {
def map[S](f: Rep[T] => Rep[S]): Rep[List[S]] = ...
}
implicit class Numerics(that: Rep[Int]) {
def +(other: Rep[Int]): Rep[Int] = ...
}
case class Identifier[T](name: Symbol) extends Rep[T]
Now, the definition of Rep[T]
is independent of the methods defined on it. This is perfect
for extensibility, since new methods can be added step by step without touching the old
source ever again. Implicit classes are syntactic sugar for defining a class Wrapper
and then adding an implicit definition like
class Wrapper(n: Int) { ... define new methods ... }
implicit def int2wrapper(n: Int): Wrapper = new Wrapper(n)
And this is the downside of this approach. Many implicit conversions can make code at user site harder to understand. And it gets worse. Adding an automatic conversion to numerals
implicit class Numeral(n: Int) extends Rep[Int]
does not work well with the other implicit conversions. So we are forced to add the “transitive closure” of implicit conversions by hand:
implicit def intToNumerics(n: Int): Numerics = Numerics(Numeral(n))
If you are faced with the problem of adding methods to a type constructor based on it’s type
argument you are free to choose from the solutions presented above. The first one, based on type constraints is not very extensible, since the methods are more or less defined on Rep
itself. It is easier to understand to the user of your API, since the user does not need to see the implicit evidence parameter.
The second encoding, based on the “pimp my library pattern” is more extensible. New methods can
be added without touching old source code. On the other hand it might be hard to understand the implicit conversion magic going on under the hood.
In the end I had to go with the latter one, since it played well with for comprehension syntax and it’s desugaring.