Friday, November 2, 2012

Typed approach for object ids in Scala

Short example

Imagine that you are writing an application in Scala, using case classes to store your data and something like Salat to serialize them automatically. Occasionally, two of your objects are linked together by id.
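The snippet that accompanied this paragraph is not preserved here, so what follows is only a minimal sketch of such a setup. The names `Key`, `User` and `Post` are hypothetical, and a `String` key is assumed purely for illustration.

```scala
object Untyped {
  // Assumption: every id is stored as the same plain key type.
  type Key = String

  case class User(id: Key, name: String)

  // A post points at its author by storing the author's id.
  case class Post(id: Key, author: Key)

  val alice = User("u1", "Alice")
  val post  = Post("p1", alice.id)
}
```

Nothing in the types says that `author` must hold the id of a `User`; any `Key` will do.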

At some point you need to add a link to yet another object.
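Again with hypothetical names (`Blog` is an assumed extra entity), the change might look like the sketch below. The new field sits before `author`, and a call site updated by hand passes the two ids in the wrong order.

```scala
object UntypedV2 {
  // Assumption: every id is still the same plain key type.
  type Key = String

  case class User(id: Key, name: String)
  case class Blog(id: Key, title: String)

  // New link: `blog` was inserted before `author`.
  case class Post(id: Key, blog: Key, author: Key)

  val alice = User("u1", "Alice")
  val blog  = Blog("b1", "Types")

  // The ids are swapped, but both are Keys, so the compiler accepts this.
  val post = Post("p1", alice.id, blog.id)
}
```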

Hey, now it compiles, but it doesn't work! If you have tests, you are in luck: you can run them, fix a broken line, run them again, fix another line... No fun at all. And if some method is not covered by tests, you have a time bomb ticking somewhere. Worst case ever!

Type system to the rescue!

What is a type system? A type system is a tractable syntactic method for proving the absence of certain program behaviors by classifying phrases according to the kinds of values they compute. In other words, a type system lets the compiler prove that certain kinds of mistakes cannot happen in your program! So where is the fun in writing in a language with a powerful type system and not using its powers?

At this point we already have a fair amount of code and tons of data, so we don't want to break serialization. We also don't want additional boxing for every id. What can we do here? Haskell, the Language of Types, already has a concept for this! It is called newtype. With it, you can introduce a new type on top of an existing one and convert between the two, but without additional boxing: values of the source type and of the newtype are represented identically in memory. So you will not lose memory and CPU time on useless boxing, and Salat will serialize your data just as it did the old Keys!

And of course, some Scala libraries give you this for free! shapeless is one of them. Using it, we can define a type representing an object's id.
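The original snippets are not preserved, and the exact shapeless API depends on its version, so here is a dependency-free sketch of the tagged-type encoding in the style shapeless uses, not the library's own code. `Tagged`, `@@`, `Id`, the `User` marker class and the `String` representation are all assumptions made for illustration, written for Scala 2.

```scala
object TypedIds {
  // Phantom tag: it exists only at compile time.
  type Tagged[U] = { type Tag = U }
  type @@[T, U] = T with Tagged[U]

  // Assumption: ids are Strings underneath.
  type Id[T] = String @@ T

  object Id {
    // The cast does not wrap anything: an Id[T] is the very same String object.
    def apply[T](raw: String): Id[T] = raw.asInstanceOf[Id[T]]
  }

  final class User // hypothetical entity, used only as a tag here

  val userId: Id[User] = Id[User]("u1")

  // An Id[User] can be used wherever a String is expected.
  val raw: String = userId
}
```

Because the tag lives only in the type, nothing extra is allocated per id, which is the property the newtype idea is after.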

With it, our first example becomes typed, and from this moment we can change our definitions as we like, with the compiler checking every use of an id!
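As a rough illustration of that compile-time check, here is the earlier hypothetical model rewritten with typed ids. The tagged-type helper is repeated so the sketch stands on its own; `mkId` and the entity names are assumptions, not the post's original code.

```scala
object Typed {
  // Same assumed encoding as before: a String carrying a phantom tag.
  type Tagged[U] = { type Tag = U }
  type @@[T, U] = T with Tagged[U]
  type Id[T] = String @@ T

  // Hypothetical helper that marks a raw key as the id of a T.
  def mkId[T](raw: String): Id[T] = raw.asInstanceOf[Id[T]]

  case class User(id: Id[User], name: String)
  case class Blog(id: Id[Blog], title: String)
  case class Post(id: Id[Post], blog: Id[Blog], author: Id[User])

  val alice = User(mkId[User]("u1"), "Alice")
  val blog  = Blog(mkId[Blog]("b1"), "Types")
  val post  = Post(mkId[Post]("p1"), blog.id, alice.id)

  // Swapping the ids is now rejected by the compiler with a type mismatch:
  // Post(mkId[Post]("p2"), alice.id, blog.id)
}
```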

1 comment:

  1. Quite an inspiring post, @Nick! I really liked the way you implemented Id[T]. That being said, I believe the problem you described in mess.scala comes from relying on the order of positional parameters. To avoid similar problems, I encourage everyone (including myself) to always use named parameters.

    Let me put it this way:

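    // before
    case class Foo(id: Id[Foo], name: String = "Default")
    Foo(genId, "Luke") // ok: "Luke" is the name

    // after adding surname in front of name
    case class Foo(id: Id[Foo], surname: String = null, name: String = "Default")
    Foo(genId, "Luke") // not ok: still compiles, but "Luke" is now the surname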

    Per se, you would now need to define Name[Foo] and Surname[Foo] types as well, and the list goes on. IMHO, the optimal approach to this problem is to always stick with named parameters instead.