The good and the bad of Typed Racket

~/blog/thoughts-typed-racket

Published:


There's a lot to like about Typed Racket, which adds a powerful yet flexible type system to the Racket language. In my last post, I wrote about how I used Typed Racket (TR from now on) to implement a tree-walking interpreter for the Lox language from Crafting Interpreters.

Along the way, I gathered some thoughts about my chosen implementation language that I'd like to divulge here. Despite some good aspects of TR, even in this medium-sized project, I became frustrated with parts of TR's UX through its tooling and documentation. I'll break this up into two sections: the good and the bad parts of my experience. Let's start with the good.

The Good

Building on Racket

Typed Racket builds on the Racket platform through the #lang feature. This means that Typed Racket code is transformed into plain Racket after type-checking, and therefore can make use of any Racket libraries or interface with existing Racket code. Racket is a mature and stable project with excellent documentation and an extensive standard library, and though it may have fewer third-party libraries than other more popular languages, I still think it's a great foundation to build a language on.

Migration and Interoperability

With that in mind, converting from Racket to Typed Racket isn't difficult. Essentially all you need to do is change the #lang line to typed/racket and then add type annotations to function signatures, struct definitions, and a few other places like mutable values.

The story for typed-untyped interoperability is also good. TR comes built-in with typed versions of most (all?) of the standard library. Additionally, any untyped code can be imported with the require/typed form, where you essentially just need to add type signatures to the data types and functions you import.

Algebraic Data Types and Polymorphism

The main reason why I chose TR is that on top of being a typed Lisp, it crucially has support for algebraic data types (ADTs) and parametric polymorphism (generics). These two features are relatively small compared to a full OOP/interface setup, yet are powerful enough to allow you to write extensible code for a variety of applications. ADTs specifically, and having proper sum/union types, are something I find to be hard to give up once you have used them in any language.

Occurrence Typing and Inference

Typed Racket is distinguished by its relatively uncommon type system based on occurrence typing. Essentially this allows the TR to resolve types based on whether a predicate passes or fails on a particular value. Here's an example directly from their documentation. Suppose we want a function flexible-length that returns the length of either a string or a list. We can write

(: flexible-length (-> (U String (Listof Any)) Integer))
(define (flexible-length str-or-lst)
  (if (string? str-or-lst)
      (string-length str-or-lst)
      (length str-or-lst)))

Here, we annotate str-or-list has being a union type of either a String or (Listof Any). Then in the body we check if the value is a string with the predicate string?. Based on that check, TR knows that in the truthy branch of the if expression the value must be of type String, and in the falsy branch, it must have type (Listof Any), so the uses of string-length and length each type-check respectively.

This ability to distinguish between types at runtime, yet still rely on type-checking at compile time is super powerful, and allows TR code to be quite dynamic yet still type-safe.

Built on top of this occurrence typing, type inference for TR is usually good, meaning that type annotations outside function signatures are needed only for mutable values (whose type could change), or some higher-order polymorphic functions.

The Bad

That all being said, there are still some difficulties and frustration I had with TR that I found while using it to write the interpreter.

Slow Compilation Times

With TR's type-checking, I of course expect there to be reasonable compile times, but the TR compiler is still slow. On my laptop with a Ryzen 4500u, compiling about 1500 lines of TR from my interpreter to bytecode with raco make takes 14.2 seconds. TR does cache some compilation results, so subsequent calls are faster, but from unscientific benchmarking, just changing a comment in one file still takes 7.7 seconds to recompile after caching.

In comparison, on my machine, compiling the around 2000 lines of Java that make up Nystrom's original implementation of Lox takes just 2.4 seconds. Of course a comparison like this isn't at all one-to-one, and I don't expect a small compiler like TR to have the same optimizations or speed as the Java compiler, but it's nonetheless something to keep in mind when choosing an implementation language, especially for a large project.

Option vs. Opt

In Racket, there is only one falsy value, i.e only one that forces the else-branch of an if expression: the literal #f. In contrast, every other Racket value is truthy. In dynamic Racket code, we often use #f as a stand-in for empty values, similar to returning null in Java. For example, the list find procedure findf returns #f if the desired value isn't in the list.

In TR this concept is cemented with the built-in Option type, which is defined as the union of a type and False:

(define-type (Option a) (U a False))

where a is a generic type parameter.

Although it can be convenient, this pattern sometimes lead to type-unsafe code, since we're 'doubling up' on the usage of the False type as both an empty value and a boolean. For example, say you have a hash table that associates strings to booleans, like so:

(define names : (HashTable String Boolean) (make-hash))
(hash-set! names "John" #f)

In Racket, you can look up a value in this table with hash-ref, and then specify #f as an empty value to return if the key isn't in the table, such as in the following:

(hash-ref names "John" #f)
(hash-ref names "Jane" #f)

The problem with this approach is that we actually can't tell the difference between whether hash-ref succeeded (as it would for "John"), and the value associated with the key is #f, or if the key wasn't in the table, and thus returned #f, as it would for "Jane". This example might seem silly, but it's actually a bug that I ran into while writing the interpreter when I was careless about handling a hash table with boolean values.

A solution to this and similar issues would be to define the optional type differently that doesn't conflict with booleans. Indeed, in the TR guide, they show how to define an Opt type similar to the Maybe type in Haskell:

(struct None ())
(struct (a) Some ([v : a]))
(define-type (Opt a) (U None (Some a)))

This creates a union type Opt which is either an empty value None, or a generic type wrapped in the type constructor Some. By using Opt to specify empty values, we can avoid the bug described above by, say, returning None as the default value passed to hash-ref.

The problem I have with TR, is that the confusingly named Option, the one where #f is the empty value, is the type provided in the standard library. This other Opt type is given as a prominent example of how to write polymorphic data types in the Typed Racket guide, but its use (or at least documentation about the unsafety of Option) isn't codified in the language.

I think that TR should clarify this confusing relationship between Option and a user-defined type like Opt more clearly and give some guidance on when to choose one or the other.

Typing Hash Tables

This is small, and I'm not sure if I'm just missing something, but speaking of confusion with hash tables, I can't seem to get the hash-ref procedure to type-check correctly. Consider the following code snippet where we lookup a integer value in our names table with string keys:

(define names : (HashTable String Boolean) (make-hash))
(hash-ref names 5)

This just returns a runtime error, no value found for the key 5, but it should instead be a compile time type error, since 5 is clearly not a string. Again, if I'm doing something wrong here please reach out, because otherwise I don't see why this isn't a bug in the type of hash-ref.

Inconsistent Documentation

The Racket language has excellent documentation, both in the form of the extensive Racket Guide and the even more detailed Racket Reference. Typed Racket's guide and reference are good, but more inconsistent. I already talked a bit about the confusion between the Option type and similar user-defined types, but there are other issues as well.

For one, there is inconsistency in whether polymorphic type variables should be named as uppercase or lowercase letters. As we saw earlier, in the section of the guide Polymorphic Data Structures, Opt is defined with lowercase type variables:

(define-type (Opt a) (U None (Some a)))

but in the very next section, Polymorphic Functions), a function type is written with uppercase type variables:

(: list-length (All (A) (-> (Listof A) Integer)))

I don't care which is used but the documentation should stick to one style or the other.

There's a similar issue in whether to use the procedure type constructor -> as infix or prefix. In most parts of the documentation, the prefix notation is used, so the type of a procedure from String to Number would be (-> String Number). In other parts however, the same type is written using infix notation as (String -> Number). The fact that both of these create the same type and are used interchangeably is a bit confusing, given that everything else in Racket is generally written with prefix.

Lastly, it would be nice if the guide talked more about how to define one's own type predicates in the Occurrence Typing section. For example, at one point, I wanted to write a predicate to assert that a list contained only strings. I couldn't find an example anywhere about how to actually do this, but helpfully found out from this StackOverflow answer.

This assertion can be defined as such:

(: (-> Any Boolean : (Listof String)))
(define (listof-string? (l : (Listof Any)))
  (andmap string? l))

I think a simple example of something like this ought to be found somewhere in the documentation, given that it wasn't entirely clear that I could define and compose my own predicates in this way.

Editor Integration

The choice of editors for TR isn't great. There is of course DrRacket, the built-in IDE, but DrRacket is designed to be used by students who are new to programming, and is just not as smooth or polished of an editing experience as I'm used to with VSCode.

The other option is Emacs, which has long been the go-to editor for lisps. Emacs has racket-mode as well, which provides smart editor features for Racket. But I have no idea how to use Emacs proficiently, and when I looked into it, I realized I would need about as much time to learn Emacs as I would to just write the interpreter, so I didn't bother. Maybe I should just bite the bullet if I want to write more Racket in the future though.

So, instead of those options I used VSCode with the Magic Racket extension and the Racket language server. This setup works, it gets you syntax highlighting, and the language server will usually tell you when there is a syntax or type error. But it's still not at all a great editing experience for a bunch of minor reasons. Magic Racket's syntax highlighting isn't entirely consistent, and the language server basically crashes whenever I add a new import. There is also currently no support for displaying TR type signatures on hover.

None of these are deal-breakers, but they add up to an editing experience in VSCode that's well behind other functional languages like Clojure with Calva or Haskell with HLS.

Of course the community for Typed Racket is much smaller than for either of those languages, but personally I think that ergonomic tooling and editor integration is the kind of work that isn't always sexy, but will help retain users of the language.

Conclusion

All in all, I liked learning and using Typed Racket in my Lox interpreter. It takes an already good language in Racket and adds robust static typing. However, that didn't mean there weren't some frustrations along the way. I hope that some of these issues can be improved on and that Racket and its typed sibling find its way to more programmers in the future.

If you are a more experienced user of Typed (or untyped) Racket and you have question or comment, please feel free to reach out via Twitter or email.