We can use `unapply` in Scala to recover some of the convenience.
Let’s assume in your API you want to have users write functions of type:

The user code then looks like:

Is there a way we can still use Scala’s support for lambda-syntax? That is, can we somehow manage to write the following?

Turns out, we can recover the concise notation if the following is true:

Our rank-2 interface `F` only mentions the type `X` in a contravariant position. That is, roughly, it has the form:

Now, if this restriction applies, we can rewrite our program as follows. First, we change the type of `F` to:

Then we define the new type `FArg`:

Now comes the trick: the only way to deconstruct a value of type `FArg` is by the following `unapply` method:
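The original code blocks did not survive this extract; here is a minimal, self-contained sketch of the trick. The member names `value` and `use` and the concrete shape of `F` are assumptions, not necessarily the post's original code:

```scala
// The rank-2 interface we start from, with X only in contravariant position:
//   trait F[R] { def apply[X](value: X, use: X => Int): R }

// Step 1: pack the universally quantified X into a first-class argument type.
trait FArg {
  type X
  def value: X
  def use: X => Int
}

// Step 2: F becomes an ordinary function type, so lambda syntax is available.
type F[R] = FArg => R

// Step 3: deconstructing an FArg via unapply brings the abstract X back into scope.
object FArg {
  def unapply(arg: FArg): Some[(arg.X, arg.X => Int)] =
    Some((arg.value, arg.use))
}

// User code can now be written with plain pattern-matching lambda syntax:
val prog: F[Int] = { case FArg(v, f) => f(v) }

val arg = new FArg {
  type X = String
  val value = "hi"
  val use = (s: String) => s.length
}
assert(prog(arg) == 2)
```

The dependent result type of `unapply` is what smuggles the existential `X` back into the pattern, so `v` and `f` are typed consistently with each other.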

This way our user program becomes:

Actually, I came up with this roundabout way of defining functions when rethinking the API of a freer effects library for Scala (similar to Atnos Eff).

The code for this shortened example can be found in a Scastie.

The library roughly consists of the following types:

Type `Op` is a marker trait for effect operations and `Eff` is the usual implementation of freer monads (but without effect safety!). A user program looks like:
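The library's code did not survive this extract; the following is a hedged reconstruction of the types together with a user program. The operation `Get` and the environment-based interpreter are assumptions for illustration:

```scala
trait Op[X]                                   // marker trait for effect operations
case class Get(key: String) extends Op[Int]   // an example operation (assumed)

// the usual freer monad, without effect safety
sealed trait Eff[A] {
  def flatMap[B](f: A => Eff[B]): Eff[B]
  def map[B](f: A => B): Eff[B] = flatMap(a => Pure(f(a)))
}
case class Pure[A](value: A) extends Eff[A] {
  def flatMap[B](f: A => Eff[B]): Eff[B] = f(value)
}
case class Impure[X, A](op: Op[X], k: X => Eff[A]) extends Eff[A] {
  def flatMap[B](f: A => Eff[B]): Eff[B] = Impure(op, (x: X) => k(x).flatMap(f))
}

def send[X](op: Op[X]): Eff[X] = Impure(op, (x: X) => Pure(x))

// A user program:
val prog: Eff[Int] = for {
  x <- send(Get("a"))
  y <- send(Get("b"))
} yield x + y

// A tiny interpreter; the cast reflects the missing effect safety.
def run(env: Map[String, Int])(e: Eff[Int]): Int = e match {
  case Pure(a)           => a
  case Impure(g: Get, k) => run(env)(k.asInstanceOf[Int => Eff[Int]](env(g.key)))
}

assert(run(Map("a" -> 1, "b" -> 41))(prog) == 42)
```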

I wanted to define effect handlers (like for the operation `Get`) as partial functions. The first draft was

which made user code a bit too clumsy for my taste:

We uncurry and first rewrite the `Handler` interface to

Now we can apply the unapply trick from above and rewrite `Handler` to:

This way the type `X`, which was universally quantified in `Handler`, is brought into scope when pattern matching. We can now define handlers as:

Much better, if you ask me :)

*Note:* This post is automatically generated from a github gist. You may want to checkout the original sources and consider commenting there.

TL;DR: The core idea is to index a monadic type by the set of regions it requires to be “alive”. This is not a new idea, but typically the set is represented by some typelevel list. Region subtyping then requires implicit evidence that one typelevel list is a sublist of another one. As a more lightweight approach, I propose to use Scala’s support for intersection types to represent the set of regions. Every region can be uniquely identified by its (path-dependent) singleton type, and nesting of regions translates to the intersection of these singleton types. The nice thing about intersection types is that region subtyping now is just normal (contravariant) subtyping.

As it turns out, the idea of using a (contravariant) intersection type to track regions is not completely new and has been used by Parreaux et al. (2018) to guarantee macro hygiene.

This post is only about type / region safety. That is, we won’t actually implement functional local heaps or the like.

The code of this post and more variants of it can be found in this Scastie.

Launchbury & Sabry (1997) introduce a typed version of the ST monad. They use rank-2 types to describe regions in which resources (like references) are valid.

The following example uses the ST-monad to create mutable reference cells `Ref`.

Let’s look at the interface of the ST-monad translated to Scala and specialized to references that can only store `Int`s:
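The interface itself was lost in this extract; here is a hedged sketch, together with a toy implementation (the post only shows the types — the thunk-based runtime here is an assumption so that the example actually runs):

```scala
class Ref[S](private var value: Int) {
  def get: Int = value
  def set(v: Int): Unit = { value = v }
}

// ST[S, A]: a computation in scope S, producing an A
case class ST[S, A](run: () => A) {
  def flatMap[B](f: A => ST[S, B]): ST[S, B] = ST(() => f(run()).run())
  def map[B](f: A => B): ST[S, B] = ST(() => f(run()))
}

def newRef[S](init: Int): ST[S, Ref[S]]         = ST(() => new Ref[S](init))
def readRef[S](r: Ref[S]): ST[S, Int]           = ST(() => r.get)
def writeRef[S](r: Ref[S], v: Int): ST[S, Unit] = ST(() => r.set(v))

// the rank-2 type: a program must work for every scope S
trait Program[A] { def apply[S]: ST[S, A] }
def runST[A](p: Program[A]): A = p.apply[Unit].run()

// an example program, fully parametric in its scope S
def prog[S]: ST[S, Int] = for {
  r <- newRef[S](1)
  _ <- writeRef(r, 41)
  x <- readRef(r)
} yield x + 1

assert(runST(new Program[Int] { def apply[S] = prog[S] }) == 42)
```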

One essential requirement for the ST-trick is that programs which use `ST` have to be fully parametric in `Scope`. This is captured in the type `Program`:

The type `Program` encodes a rank-2 type. Our example program `prog` has the right type and is parametric enough, so we can call `runST`:

## Remark

In fact, since SAMs don’t work for rank-2 types like `Program`, we actually need to write `runST { new Program[Int] { def apply[S] = prog[S] } }`.

However, for conciseness, I sometimes pretend that we can write `runST { prog }`.

The idea behind the ST-trick now is the following: since programs have to work for *all scopes* (notice the type parameter `Scope` on `Program.apply`), references introduced in one scope cannot be used in a different scope. For example, we can try to leak a reference like:

However, the type variable `S` is not in scope outside of `runST`, and thus we can only use an existential `Ref[_]` to express “it is a reference for *some* scope which we don’t know”. References thus can escape, but can never be used again outside of the call to `runST`. That is, the following does not type check:

The existential type cannot be unified with `S`! The rank-2 type thus ensures that we can free the allocated resources after executing `runST`, since they can never be used in any other call to `runST`.

## Remark

In the setting of delimited control, Dybvig et al. (2007) use the same trick to prevent prompts from escaping the region of one `run`.

The same region safety can be achieved with a slightly different encoding using Scala’s support for singleton types. We can express the interfaces of `ST` and `Ref` exactly as before, but use a different definition for `Program`:

This way, the rank-2 universal quantification effectively moves from the type-level to the term-level.

Using `ST` now looks like

Additionally, we can improve type inference a bit by making `Ref` a member of `Scope`:

The user program then changes to
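Since the original snippets are missing here, the following is a hedged, self-contained sketch of the singleton-type encoding with `Ref` as a member of `Scope`. The helper names `newRef`/`readRef` and the thunk-based implementation are assumptions:

```scala
class Scope {
  class Ref(private var value: Int) {
    def get: Int = value
    def set(v: Int): Unit = { value = v }
  }
}

case class ST[S, A](run: () => A) {
  def flatMap[B](f: A => ST[S, B]): ST[S, B] = ST(() => f(run()).run())
  def map[B](f: A => B): ST[S, B] = ST(() => f(run()))
}

def newRef(s: Scope)(init: Int): ST[s.type, s.Ref] = ST(() => new s.Ref(init))
def readRef(s: Scope)(r: s.Ref): ST[s.type, Int]   = ST(() => r.get)

// the quantification moved to the term level: apply takes a scope *value*,
// and the region is its singleton type scope.type
trait Program[A] { def apply(scope: Scope): ST[scope.type, A] }
def runST[A](p: Program[A]): A = { val s = new Scope; p(s).run() }

val prog = new Program[Int] {
  def apply(scope: Scope) = for {
    r <- newRef(scope)(42)
    x <- readRef(scope)(r)
  } yield x
}
assert(runST(prog) == 42)
```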

We are now ready to generalize the previous encodings to multiple, potentially nested regions:

Here, we use intersection types to express that a program uses references of multiple nested scopes. Every call to `scoped` introduces one scope and removes the corresponding `scope.type` from the intersection. To run the program, we define `type Global = Any` for the top-level scope (without any references). We then introduce two nested scopes with `scoped` to finally run the computation with `runST`:

It might look a bit strange to have an intersection type of singleton types. In fact, most of those types don’t have inhabitants. However, we just use the intersection as a set of scope labels. It is a phantom type and does not have any operational relevance.

Here are the definitions of `ST`, `Scope` and `Program`:

In fact, since Dotty supports dependent function types, we can write `Program` as

The final version of the library can be found in the file 02.Library.scala.

*Note:* This post is automatically generated from a github gist. You may want to checkout the original sources and consider commenting there.

After completing a function definition, I often convert back to pointfree style for conciseness. The end of the story: to understand the code again a couple of weeks later, I start expanding it back to annotated pointful style.

The small ten-line library at the end of this post allows defining pointfree functions with intermediate type annotations.

Credits: Agda and EqReasoning for syntactical inspiration.
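The library itself did not survive this extract; here is a minimal sketch of what such an annotated-pointfree helper could look like. All names (`begin`, `and`, `having`) are invented for illustration and are not the author's actual API:

```scala
case class Chain[A, B](run: A => B) {
  // continue the pipeline with the next function
  def and[C](g: B => C): Chain[A, C] = Chain(run andThen g)
  // record an intermediate type annotation; compiles only if B really is B2
  def having[B2](implicit ev: B <:< B2): Chain[A, B2] = Chain(a => ev(run(a)))
}
def begin[A]: Chain[A, A] = Chain(identity)

// usage: a pointfree pipeline with intermediate type annotations
val wordLengths: List[String] => Int =
  begin[List[String]]
    .and(_.map(_.length)).having[List[Int]]
    .and(_.sum).having[Int]
    .run

assert(wordLengths(List("hi", "there")) == 7)
```

The `having` steps are pure documentation at runtime; they only make the intermediate types visible (and checked) in the source.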

Basically, this is just syntactic sugar for the (almost equally verbose / concise):

The latter only uses functions from the Scala Predef and thus requires no “library”. Still, I have never encountered this style of writing annotated pointfree programs in the wild.

You can safely ignore the details of this example, it is just to show the syntax.

Compare with the type-annotated (almost ANF) pointful version:

And the (almost) pointfree “one-liner”:

In particular, notice how we need to introduce the auxiliary definition `trans` and refrain from fully using pointfree style in order to assist type inference.

*Note:* This post is automatically generated from a github gist. You may want to checkout the original sources and consider commenting there.

Download the full poster as a PDF.

The remainder of this post is just repeating the contents of the poster for easy web-reference. I am happy for every comment that links to either easy accessible material on the topic, or good examples of how to use the various morphisms.

Instances of μf are “f-data-structures”, or “f-structures” for short.

A finite f-structure that can contain *a*s. It is a functor and a monad. Monadic bind corresponds to substitution: it substitutes *a*s by terms that can contain *b*s.

Also known as “fold”. Deconstructs an f-structure level-by-level and applies the algebra [13, 5, 14, 6].

A.k.a. “the Tupling-Trick”. Like cata, but allows access to the full subtree during teardown. A special case of zygo, with the helper being the initial algebra [16].

Allows depending on a helper algebra for deconstructing an f-structure. A generalisation of para.

Deconstructs the f-structure with the help of all previous computations for the substructures (the trace). Difference to para: the subcomputations are already available and need not be recomputed.

Applies the natural transformation at every level, before deconstructing with the algebra. Can be seen as a one-level rewrite. This extension can be combined with other deconstruction morphisms [4].

Instances of νf are “f-codata-structures”, or “f-costructures” for short.

A possibly infinite f-structure, full of *a*s. It is a functor and a comonad. Comonadic extend corresponds to computing a new f-structure full of *b*s. At every level the *a* and the full trace are available for computing the *b*.

Also known as “unfold”. Constructs an f-structure level-by-level, starting with a seed and repeatedly applying the coalgebra [13, 5].

A.k.a. “the Co-Tupling-Trick”™. Like ana, but also allows returning an entire substructure instead of one level only. A special case of g-apo, with the helper being the final coalgebra [17, 16].

Allows depending on a helper coalgebra for constructing an f-structure. A generalisation of apo.

Constructs an f-structure stepwise, but the coalgebra can return multiple layers of *a*-valued substructures at once. Difference to apo: the subtrees can again contain *a*s [16].

Applies the natural transformation at every level, after construction with the coalgebra. Can be seen as a one-level rewrite. This extension can be combined with other construction morphisms.

Omits creating the intermediate structure and immediately applies the algebra to the results of the coalgebra† [13, 2, 5, 14].

Constructs a structure and immediately destructs it while keeping intermediate results†. Can be used to implement dynamic-programming algorithms [9, 10].

Can at the same time “look back” at previous results and “jump into the future” by returning seeds that are multiple levels deep† [11].

Constructs a g-structure from an f-structure while changing the internal representation in-between [7].

Most of the above morphisms can be modified to accept generalized algebras (with w being a comonad)

or generalised coalgebras (with m being a monad), respectively:

Also a multitude of other morphisms exist [12, 3, 1], and the combination of morphisms and distributive laws has been studied [8, 15].

† Can also be enhanced by a representation change (natural transformation f ⇒ g) before deconstructing with a corresponding g-algebra.
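As a concrete illustration (not part of the poster), cata, ana and hylo for a simple list functor can be sketched in Scala like this; the functor `ListF` and all helper names are chosen here for the example:

```scala
// μf for the functor f(r) = 1 + Int × r, i.e. lists of Int
sealed trait ListF[+R]
case object NilF extends ListF[Nothing]
case class ConsF[R](head: Int, tail: R) extends ListF[R]

case class Fix(unfix: ListF[Fix])

def mapF[A, B](fa: ListF[A])(f: A => B): ListF[B] = fa match {
  case NilF        => NilF
  case ConsF(h, t) => ConsF(h, f(t))
}

// cata: deconstructs level-by-level with an algebra ListF[A] => A
def cata[A](alg: ListF[A] => A)(t: Fix): A =
  alg(mapF(t.unfix)(cata(alg)))

// ana: constructs level-by-level from a coalgebra A => ListF[A]
def ana[A](coalg: A => ListF[A])(seed: A): Fix =
  Fix(mapF(coalg(seed))(ana(coalg)))

// hylo: ana followed by cata, without the intermediate structure
def hylo[A, B](alg: ListF[B] => B)(coalg: A => ListF[A])(seed: A): B =
  alg(mapF(coalg(seed))(hylo(alg)(coalg)))

// example: sum the numbers n, n-1, ..., 1
val sumAlg: ListF[Int] => Int = { case NilF => 0; case ConsF(h, t) => h + t }
val countdown: Int => ListF[Int] = n => if (n <= 0) NilF else ConsF(n, n - 1)

assert(cata(sumAlg)(ana(countdown)(4)) == 10)
assert(hylo(sumAlg)(countdown)(4) == 10)
```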

[1] Adámek, Jiří, Stefan Milius, and Jiří Velebil. “Elgot algebras.” *Electronic Notes in Theoretical Computer Science*, 2006.

[2] Augusteijn, Lex. “Sorting morphisms.” *Advanced Functional Programming*. Springer Berlin Heidelberg, 1998.

[3] Erwig, Martin. *Random access to abstract data types*. Springer Berlin Heidelberg, 2000.

[4] Fokkinga, Maarten M. “Law and order in algorithmics.” PhD thesis, 1992.

[5] Gibbons, Jeremy. “Origami programming.” 2003.

[6] Gibbons, Jeremy. “Design patterns as higher-order datatype-generic programs.” *Proceedings of the Workshop on Generic Programming*. ACM, 2006.

[7] Gibbons, Jeremy. “Metamorphisms: Streaming representation-changers.” *Science of Computer Programming*, 2007.

[8] Hinze, Ralf, et al. “Sorting with bialgebras and distributive laws.” *Proceedings of the Workshop on Generic Programming*. ACM, 2012.

[9] Hinze, Ralf, and Nicolas Wu. “Histo- and dynamorphisms revisited.” *Proceedings of the Workshop on Generic Programming*. ACM, 2013.

[10] Kabanov, Jevgeni, and Varmo Vene. “Recursion schemes for dynamic programming.” *Mathematics of Program Construction*. Springer Berlin Heidelberg, 2006.

[11] Kmett, Edward. “Time for Chronomorphisms.” 2008. http://comonad.com/reader/2008/time-for-chronomorphisms/

[12] Kmett, Edward. “Recursion Schemes: A Field Guide (Redux).” 2009. http://comonad.com/reader/2009/recursion-schemes/

[13] Meijer, Erik, Maarten Fokkinga, and Ross Paterson. “Functional programming with bananas, lenses, envelopes and barbed wire.” *Functional Programming Languages and Computer Architecture*. Springer Berlin Heidelberg, 1991.

[14] Oliveira, Bruno, and Jeremy Gibbons. “Scala for generic programmers.” *Proceedings of the Workshop on Generic Programming*. ACM, 2008.

[15] Turi, Daniele, and Gordon Plotkin. “Towards a mathematical operational semantics.” *Logic in Computer Science*. IEEE, 1997.

[16] Uustalu, Tarmo, and Varmo Vene. “Primitive (co)recursion and course-of-value (co)iteration, categorically.” *Informatica*, 1999.

[17] Vene, Varmo, and Tarmo Uustalu. “Functional programming with apomorphisms (corecursion).” *Proceedings of the Estonian Academy of Sciences: Physics, Mathematics*. Vol. 47. No. 3. 1998.

`F1[O] = Int × O` and `F2[I, O] = (I => Int) × (I => O)`. Both look very similar, but not similar enough – so I had to repeat the functor with the slight changes over and over again. Then I came up with a solution…

The idea is strikingly simple: since the differences are uniformly applied to every factor of the product, we can abstract over this pattern. To see how this can be achieved in Scala, let’s first encode both signatures naively:

Now we can abstract over the two special instances by defining a generic functor, parametrized over a type function that encapsulates the difference.

where `T` is for “transformation”, since it is some type-level function that transforms the input. The two instances can be recovered simply by applying `UberFunctor` with type-level functions:
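The concrete code is missing from this extract; a hedged sketch of the abstraction (field names are assumptions) might look like:

```scala
// the two similar-but-not-identical shapes, encoded naively:
case class F1[O](count: Int, out: O)                      // Int × O
case class F2[I, O](count: I => Int, out: I => O)         // (I => Int) × (I => O)

// abstract over the difference: every factor is wrapped by a type function T
case class UberFunctor[T[_], O](count: T[Int], out: T[O])

// recover the two instances by choosing T:
type Id[X] = X
type F1u[O]    = UberFunctor[Id, O]
type F2u[I, O] = UberFunctor[({ type L[X] = I => X })#L, O]

val a: F1u[String]      = UberFunctor[Id, String](1, "one")
val b: F2u[Int, String] = UberFunctor[({ type L[X] = Int => X })#L, String](i => i, i => "x" * i)

assert(a.count == 1)
assert(b.out(3) == "xxx")
```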

That was easy, wasn’t it? The cool thing is that, this way, transformations between the different shapes of related functors like `F1` and `F2` can be expressed as transformations between two instances of `UberFunctor`, highlighting the difference on the type-level.

*Your help is needed*: `UberFunctor` is of course just a working title. I am not sure how to name this concept. I only have a very basic understanding of category theory yet. Maybe this concept already exists and has a terrific name? Maybe it is related to some other concept from which the name could draw inspiration?

I would love to hear some feedback on where you think this technique could be applied and what it should be called.

Paul Phillips argues against using `Option`, since it is “tedious and syntax heavy”. He suggests using the “null object pattern” – creating a “distinguishable object to serve the role of none”.
A problem with this approach is that one loses all the nice methods `Option` provides, such as `getOrElse` or `map`. In this post we will see how we can get the best of both worlds by using implicits.

Let’s start with a small motivating example, inspired by the scala compiler implementation.
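The example code is missing from this extract; a hedged reconstruction of the `Symbol`/`NoSymbol` pattern (the member `name` and helper `lookup` are assumptions) could be:

```scala
trait Symbol { def name: String }
case class MySymbol(name: String) extends Symbol
case object NoSymbol extends Symbol { def name = "<none>" }

// a lookup that uses the null object instead of Option
def lookup(table: Map[String, Symbol])(key: String): Symbol =
  table.getOrElse(key, NoSymbol)

val table = Map("x" -> MySymbol("x"))
val x = lookup(table)("x")

// check against the null object and use x directly, no unwrapping needed
assert((if (x == NoSymbol) "?" else x.name) == "x")
assert(lookup(table)("missing") == NoSymbol)
```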

Using this pattern allows us to check for `x == NoSymbol` and use `x` otherwise, without having to unwrap it via `get` or an extractor.

But what if we would like to use `orElse` on a `Symbol` as defined above:

This problem could of course be fixed by adding a method `orElse` to `Symbol` and implementing it in `MySymbol` and in `NoSymbol`:

Hey, this works. But it replicates code from `scala.Option`. Maybe it is a better idea to inherit from `Option`, then?

Nope. `Option` is sealed, so there is no way to inherit from it. We need to find another way…

Implicits can be used to retroactively add methods to an existing library. Why not pimp symbols with the methods of `Option`?

The singleton type of `NoSymbol` is more specific than `Symbol`, and so the first implicit conversion is preferred over the second – if the singleton type is known at compile time. In all other cases we have to add a runtime check in order to convert to `Option[Symbol]`. Of course, the first conversion could also be omitted, since it may not often be known at compile time whether `sym` is of singleton type `NoSymbol.type` or not. It is just a little optimization trying to use compile-time information.

The above example usage now works without modification. Without changing or duplicating existing code, we are able to handle `Symbol` the same way as `Option`, while at the same time staying as concise as Paul Phillips likes it :)

Let’s first consider how the API should be used.

The small language we are trying to implement consists of *measures* and their unit suffix. Let’s inspect the design of the already existing wrapper library scaladin:

In Vaadin, units are represented as an enum, but the Java names are rather clumsy, so new enums are introduced, reusing the integer values of the Java enums. The implicit class `MeasureExtent` enables suffix use of unit extensions to create instances of `Measure`. `Measure` then represents the compound of a length value and its unit.

Up to now measures can be created using suffix syntax, but doing math with them requires manual destructuring. So let’s add a simple addition method `+` on class `Measure`.

You’ll notice that the implementation of `+` includes a call to `conv`, which is not yet implemented. Since we cannot add apples and oranges, we need to convert between them before doing the math. Following the CSS specification, conversion can be performed on absolute values. Implementing the conversion naively, we would end up with k² cases to handle. In our example there are k = 3 absolute units (`em` is a relative unit), resulting in 9 different conversions to address.

This quadratic blowup can be avoided easily by converting to some intermediate base unit, reducing the implementation cost to 2·k.

The following code directly results from choosing `pt` as the base unit:
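The code block itself is missing here; a hedged term-level sketch (the CSS factors 1px = 0.75pt and 1in = 25.4mm = 72pt are standard, everything else is assumed) might be:

```scala
object Units extends Enumeration { val pt, px, mm, em = Value }
import Units._

// factors to and from the base unit pt: 2·k entries instead of k²
val toBase: Map[Units.Value, Double => Double] = Map(
  pt -> identity[Double],
  px -> (v => v * 0.75),          // 1px = 0.75pt
  mm -> (v => v * 72.0 / 25.4)    // 1mm = 72/25.4 pt
)
val fromBase: Map[Units.Value, Double => Double] = Map(
  pt -> identity[Double],
  px -> (v => v / 0.75),
  mm -> (v => v * 25.4 / 72.0)
)

def conv(from: Units.Value, to: Units.Value)(value: Double): Double =
  if (from == to) value           // trivial conversion handled first
  else (for {
    f <- toBase.get(from)
    g <- fromBase.get(to)
  } yield g(f(value))).getOrElse(
    sys.error(s"cannot convert $from to $to")  // runtime failure, e.g. for em
  )

assert(conv(px, px)(10.0) == 10.0)
assert(conv(px, pt)(10.0) == 7.5)
```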

The implementation of `conv` first handles the trivial conversion, where the source is the same unit as the target, and then tries to obtain a conversion function from source to base and from base to target. While this works pretty well, we will run into conversion errors at runtime:

The second part of this post will explore how to shift parts of the implementation to the typelevel in order to fail earlier.

The basic idea is to move the selection of the *conversion function* to the typelevel. If no conversion function can be found, a type error will be raised.

In order to be able to use the information at the type level, some preparation needs to be made. First of all, we need to add a type parameter to `Measure`:

This allows matching on the type of `unit` when it comes to comparing two different measures. Secondly, when searching for the right conversion, there is not much information on the type of the unit available. Asking for the type tag of an enum value results in `Units.Value`:

So, what can we do about it? Since values of an enum already fulfill the role of a singleton, why not turn them into objects?

(The last line is necessary to force strict evaluation of the objects, otherwise the lookup `Units(0)` will fail.)

Asking for the type tag of an enum value now results in the singleton type of the corresponding object – great!

Now that all preparations are done, let’s get to the real stuff. The conversion lookup will be implemented using typeclasses, just like the Scala collection library does with `CanBuildFrom`. (If you are not familiar with this encoding, I warmly recommend reading “Type Classes as Objects and Implicits” by Bruno Oliveira et al.)

The encoding immediately follows from the above term-level implementation (the names of the typeclasses are chosen to emphasize this correspondence).

The typeclass `CanConvert[From, To]` represents evidence that type `From` can be converted to type `To`. This evidence manifests in the implementation of the conversion `impl`. Following the above implementation of `conv`, the first thing to do is to check for equivalence of `From` and `To`:

`id` will be chosen by implicit lookup only if `From` and `To` are the same type `U`. It then provides the identity function as conversion. (Since the scope of implicit lookup is extended to the companion object, it is a good place to store all the implicit methods and values.)

This one was easy, but how can we encode “first try to find a conversion (a) to `Base`; only if this succeeds, try to find a conversion (b) from `Base` to `To`”?

If both conversions (a) and (b) are implemented as typeclasses, we can just use the `ToBase` conversion (a) as a premise to the `BaseTo` implicit. Thus, the lookup will fail if there exists no conversion (a).

This might look a little bit complicated, but it just reads as “if you can provide me evidence that `From` can be converted to base, I can provide evidence that `From` can be converted to X”, where X is the corresponding `To` type, i.e. `Units.pt.type` in the first case.

For the corresponding implementation of `toBase` we thus need a second typeclass:

This one is easier again; it just reads as “to convert `From` to base, just use the provided conversion function”.

Everything is wired together by object `CanConvert` inheriting from `BaseTo`. This way, implicit lookup will prefer the identity implicit – and only if that fails, continue the lookup in the parent trait `BaseTo`.

Using this framework to implement addition in `Measure` results in the following code:

The usage of the API did not change, but `14.pt + 1.em` will now result in a compile-time error. This error can even be customized by using `scala.annotation.implicitNotFound` on `CanConvert`.
.
The best thing about this encoding is that it is external to `Measure`. This way users can add their own `CanConvert` instances to allow additional conversions. For instance, a typeclass `ConvertRelative` could be defined that searches for an implicit value of the font size and allows conversion from `em` to absolute measures.

Perfect, isn’t it?

Sadly, the answer for this use case is “no”. The typing of the underlying framework Vaadin is too weak. Methods like `ComponentPosition.getTop` can return an arbitrary measure, so not all the necessary compile-time information is available to make implicit lookup work. Of course, one can pattern match on the result of the method call to restore type information, but this is tedious and too verbose.

I would love to hear some suggestions on how this situation can be improved. In fact, this post is presented in reverse order: I started with the typed version and only after encountering these usability issues I implemented the “runtime version”.

Given the following example object containing a property `obj.expensive`:

Our goal is to only evaluate the expression for `obj.expensive` if it is really necessary. The most common way to achieve this is using a function and memoization.

This way the value is calculated just once, at the time `obj.expensive` is actually used.

This is a lot of boilerplate, so let’s factor most things out into a function `lazy`.

Much better. But still, the interface for the user of `obj` changes from `obj.expensive` to `obj.expensive()`.

Using getters the code can be improved even more in order to preserve the interface to the user.
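The original snippet is missing from this extract; a hedged sketch of the getter-based version (the object and property names are assumptions; `calls` exists only to observe the laziness) could be:

```javascript
let calls = 0;

function lazy(compute) {
  let cached, done = false;
  return function () {
    if (!done) { cached = compute(); done = true; }
    return cached;
  };
}

const obj = Object.create(Object.prototype, {
  expensive: {
    // a getter preserves the plain `obj.expensive` interface
    get: lazy(function () {
      calls += 1;       // stands in for an expensive computation
      return 6 * 7;
    }),
    enumerable: true
  }
});
```

Accessing `obj.expensive` now computes the value on first use and returns the cached result on every later access.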

You may have noticed that property descriptors and `Object.create` are used to define the property. Since descriptors allow specifying additional meta information on the mutability and visibility of a property, I try to use them whenever possible (and appropriate).

Here are just two of the helper functions I usually use to define property descriptors

Using those an object definition can look like

This even looks nicer using ES6 Arrow Functions

Let’s take a look at a small piece of code that illustrates the problem.

We have to work on representations of values. Thus, `Rep[T]` is a node representing a value of type `T`. It is necessary to define different methods for `Rep` depending on the exact type of `T`. For collections we might want to define `map`, whereas for numbers a `+` is required. Up to now, everything looks fine. How about adding identifiers that can be either a collection or numeric, depending on their type `T`?

This isn’t correct for multiple reasons:

- Even if this would compile, the result is not the desired one. An identifier could be used both as a collection and as a numeric, regardless of its type.
- It does not compile, since `Collection[T]` imposes `Rep[List[T]]` on the type of the identifier, whereas `Numerics` requires the identifier to be of type `Rep[Int]`.

So how can this be achieved?

The first solution I want to present is based on Scala’s type constraints. They are part of the Predef and called `<:<` and `=:=`. Without digging too deep into the inner workings of those classes, let’s take a look at the solution first:
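The solution itself did not survive this extract; a hedged sketch of the evidence-based encoding (identifiers are modelled directly as `Rep` here; exact names are assumptions) could be:

```scala
class Rep[T](val value: T) extends Collection[Rep[T]] with Numerics[Rep[T]]

trait Collection[Self] { self: Self =>
  // the implicit evidence restricts map to Self <: Rep[List[T]]
  def map[T, U](f: T => U)(implicit ev: Self <:< Rep[List[T]]): Rep[List[U]] =
    new Rep(ev(self).value.map(f))
}

trait Numerics[Self] { self: Self =>
  // + is only applicable if Self is a Rep[Int]
  def +(that: Self)(implicit ev: Self <:< Rep[Int]): Rep[Int] =
    new Rep(ev(self).value + ev(that).value)
}

val xs = new Rep(List(1, 2, 3))
val n  = new Rep(40)
assert(xs.map[Int, Int](_ + 1).value == List(2, 3, 4))
assert((n + new Rep(2)).value == 42)
// xs + xs does not compile: no evidence that Rep[List[Int]] <:< Rep[Int]
```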

The first line is changed just a little. The representation mixes in both traits, `Collection` and `Numerics`. Both times it provides its own type as argument to the type constructors. In addition to the type argument, a self-type annotation is added to the traits. It allows using `this` as an instance of `Self` within the implementation. The interesting part is the added parameter list

It says: “Please provide an (implicit) argument that proves evidence of the subtype relationship between `Self` and `Rep[List[T]]`”. In other words:

`Self` has to be a subtype of `Rep[List[T]]` in order to be able to apply the method.

This way, `map` can only be used on `Identifier[List[T]]` and `+` can only be used on `Identifier[Int]`. Cool.

The downside is the additional type parameter `T` on `map`. It makes inferring the types much harder for Scala, and thus this encoding did not work with *for comprehensions*.

The same could also be achieved with a different encoding. The “Pimp my Library” pattern has been invented by Martin Odersky. The pattern is so cool and widely used, that with Scala 2.10 it has built-in language support.

Now, the definition of `Rep[T]` is independent of the methods defined on it. This is perfect for extensibility, since new methods can be added step by step without ever touching the old source again. Implicit classes are syntactic sugar for defining a class `Wrapper` and then adding an implicit definition like
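Since the code blocks are missing from this extract, here is a hedged sketch of the implicit-class encoding (wrapper names are assumptions):

```scala
class Rep[T](val value: T)

implicit class CollectionOps[T](self: Rep[List[T]]) {
  def map[U](f: T => U): Rep[List[U]] = new Rep(self.value.map(f))
}

implicit class NumericOps(self: Rep[Int]) {
  def +(that: Rep[Int]): Rep[Int] = new Rep(self.value + that.value)
}

val xs = new Rep(List(1, 2, 3))
val n  = new Rep(40)
// no extra type parameter needed on map: inference just works
assert(xs.map(_ + 1).value == List(2, 3, 4))
assert((n + new Rep(2)).value == 42)
```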

And this is the downside of this approach: many implicit conversions can make code at the use site harder to understand. And it gets worse. Adding an automatic conversion to numerals

does not work well with the other implicit conversions. So we are forced to add the “transitive closure” of implicit conversions by hand:

If you are faced with the problem of adding methods to a type constructor based on its type argument, you are free to choose from the solutions presented above. The first one, based on type constraints, is not very extensible, since the methods are more or less defined on `Rep` itself. It is easier to understand for the user of your API, since the user does not need to see the implicit evidence parameter. The second encoding, based on the “pimp my library” pattern, is more extensible. New methods can be added without touching old source code. On the other hand, it might be hard to understand the implicit conversion magic going on under the hood.

In the end I had to go with the latter one, since it played well with for-comprehension syntax and its desugaring.

In order to see why all this is helpful, let’s start with some simple functions that return either a result or `undefined`.

Chaining multiple of these functions and checking for undefined can lead to rather verbose code.

Of course there are multiple ways around this tour-de-france of nested if-statements including early returns, exception handling and a functional encoding using the Maybe monad or Option types (for the Scala people out there).

Today we will look at some encoding that is similar to the last mentioned solution – with the goal to support a nice syntax.

To give a preview of how the solution to the example looks, we start by rewriting the two functions `f1` and `f2`.

As can easily be seen, we just used the function `some` to flag the result as correct, and otherwise return the value `none` (we will see how both are defined in a second).

The chaining of the method calls now can be expressed in terms of for-loops:

Pure awesomeness. Let’s see how this can be achieved.

First we create a constructor function `Option` that takes an optional value to store. Whether or not a value has been passed is saved in the property `this.some`. The true “magic” then happens in the prototype function `iterator`, which is called internally by for-of loops and spread operators.

It basically works the same way as returning a singleton array `[x]` for `some(x)` and an empty array `[]` for `none`.

So let’s dig into this a little bit deeper and redefine `some` and `none` in terms of arrays:
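The original snippet is missing from this extract; a hedged sketch of the array encoding (the bodies of `f1` and `f2` are made up for illustration) could be:

```javascript
function some(x) { return [x]; }
var none = [];

// functions that may fail return an "optional" array
function f1(x) { return x > 0 ? some(x * 2) : none; }
function f2(x) { return x < 100 ? some(x + 1) : none; }

// chaining via for-of: on `none` the loop body simply never runs
var results = [];
for (const a of f1(21)) {
  for (const b of f2(a)) {
    results.push(b);
  }
}
```

`results` ends up as `[43]`; had any step failed, it would simply stay `[]`.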

The amazing fact: the above version is of course backwards compatible with ES3. And it also works out of the box with the ES5/6 collections and libraries such as underscore.js.

Refactoring code like `return is_true ? result : undefined` into `return is_true ? [result] : []` allows callers to write programs in a more functional way without losing backwards compatibility.


The solution is an encoding inspired by Scala’s **for-comprehensions**. Sadly, ES6 does not desugar for-loops into plain method calls and anonymous functions like Scala does, so hacking the for syntax is rather restricted. The easiest encoding, using empty and singleton arrays for `none` and `some`, offers the advantage that, using list comprehensions, the result is automatically wrapped up in the correct type.

whereas

yields nothing, since `for` is a statement and not an expression.

Nevertheless, I think both ways (the `Option` function and the array encoding) offer great advantages over the common imperative idiom.