Scala 3 released not too long ago, and it comes packed with lots of tools to introspect your code and generate new code. One of these tools is Mirror, a new given type that can give you information about case classes. Mirror can tell you the name of a case class, the names of its fields, and the types of its fields. This is very powerful considering that you can use it without dipping your toes into scala 3's compile time reflection, and it hints at being able to generate new types programatically that have similar structure to your case class. In order to generate these new types and use them like a normal case class, we need two things: Heterogenous Maps and Dynamic types.
Heterogenous Maps
A heterogenous map is one that contains keys and values of myriad types. With a heterogenous map, you can give the map a key with type K, and if a matching entry is found in the map, a value of type V will be returned. A heterogenous map can store myriad key value type pairs without requiring they share a common subtype, and without losing type information. With a heterogenous map, code like below is possible:
- Keys must be Singleton types
- An HMap that already has a key has to refuse to store new data associated with said key
- All operations on the HMap that result in failure must do so at compile time
Our HMap will be backed with a Map[Any, Any] and will store information about its contents in a generic T that's a subtype of Tuple.
The first piece: Improved tuples
For our HMap to be of any use, we need to be able to add data to it. Scala 3 has improved tuples a great amount, and these improvements allow us to write a function for adding new data quite easily:
The second piece: Match types
The add function added to the HMap works well for adding data, but it has no guards to meet constraint 2: We do not want a key that's been stored in the HMap to be allowed to be readded with new data. If you run HMap.empty.add("hello", "world").add("hello", "goodbye"), there is no compile time error at all. Worse yet, the original data associated with "hello" has been thrown out. In order to meet constraint 2 and 3, we'll need a new capability of scala 3: Match Types.
We use the transformation of the NoMatch type and the using clause together to limit at compile time whether or not a key can be added to the HMap. The compiler will throw an error if evidence cannot be found that NoMatch[T,K] =:= true (for example, because NoMatch resolved to false).
These two additions to our code makes our add function complete. It obeys all 3 constraints: The keys have to be singleton types, they cannot be readded to an HMap, and trying to do so results in a compile-time error.
The third piece: More Match types, and our get functio
We are now able to add data to our HMap, and adding data the wrong way results in compile time errors. Everything is looking great, but we're about to hit our first roadblock: the get function.
The start of our get function is pretty simple. We take a k with type K, and K should be a singleton. However, how do we define the result type? The result type should be the V type associated with our key K, but that can be anywhere inside the tuple type that defines what is and isn't in our HMap.
Furthermore, what should happen if the HMap has no data associated with the key used to query it? In a normal map we'd return an Option type, but we don't want to do that. We want compile time errors!
Our answer lies with our NoMatch type and a new, special match type:
Something extra: The delete function
We have our HMap defined and we can add and query it now. However, constraint 2 can be a bit pesky. Imagine you were passing a fully formed HMap and you wanted to replace some data? We can't do that because constraint 2 prevents add from replacing data automatically. However, we can add a delete function and still meet our constraints. A delete function is a bit weird though. For add, we just appended our new data to the front of our tuple of type info, but a delete operation might remove data from any part of this tuple. How do we represent such a return type? A match type once again:
More extras: Opaque types and erased
Our HMap is in a useable state now, and we could move on from here. Or we could try to make the type as lightweight as possible. Our HMap only stores one piece of runtime data, the Map m. In scala 2, this class could be made a value class, and then we wouldn't construct an HMap instance all of the time, scala would just use the Map type at runtime. Value types had a number of shortcomings however (had to actually be instantiated in many surprising instances), and so Scala 3 has a concept that replaces Value types in many cases: Opaque types. Changing the code to use opaque types looks like below:
Considering our opaque HMap will not result in extra object instantiation, we could call it a day here, but we could also use the experimental scala keyword erased. In add, get, and delete, we use given instances as evidence that users are doing the right things with our functions. This evidence is instantiated at runtime and passed into our functions. Because of the generic nature of this evidence, it will almost certainly result in extra object instantiation for each function call. Since we don't actually use the givens though, we can use the erased keyword. This makes the data unavailable at runtime and should in theory mean we do not need to instantiate evidence like NoMatch[T, K] =:= true.
Putting it all together: Dynamic
Now that we have our HMap, it's time to create a pseudo-case class. The Dynamic trait has existed for a long time in Scala 2, but it's not seen much use. We're in a static, strongly-typed language, and the Dynamic trait seems to throw away static typing. This is not the case however.
This was a very informative article, thanks a lot! One question however, regarding the second constraint:
ReplyDelete"An HMap that already has a key has to refuse to store new data associated with said key"
Not having run this myself, so from just reading the code, it seems to me that HMap is a bit more restrictive than that in that it doesn't seem it will allow more than one key of the same type, not just the same value. Is that correct? If so, is there a way to overcome that? If not, how is NoMatch comparing specific values of a type to prevent that? Or what am I missing?
Thanks!
"it doesn't seem it will allow more than one key of the same type"
DeleteYou're right that the HMap only allows one key of the same type, but it's a bit more restrictive than even that. This HMap design only allows singleton types and values to be keys. 5 is a singleton type and value, and can be used as a key in this HMap. "hello" is a singleton type and value and can be used as a key in this HMap. An instance of Any would not be useable as a key, as Any is not a Singleton type.
"If so, is there a way to overcome that?" it depends. Part of the strength of this design is that we can determine what the result type for a get or drop at compile time. This is achieved by mapping from type to type. If we allowed non-singleton keys, then determining what the return type of get or drop should be would be impossible except at runtime, since determining which instance of Any you're dealing with would require a runtime equality check.
Ah right, the Singleton distinction. That clears it up, thanks!
DeleteThis comment has been removed by the author.
ReplyDelete