Types and Systems

by

Michael Mroz

   |    

September 15, 2022

Types and Systems
Types and Systems

A fundamental principle of mine is that I will spare little expense to avoid stress. This manifests in a few ways, but in my work, it’s primarily my:

  1. Enthusiasm for designing soundly constrained systems,
  2. Aggressive use of type systems to guarantee those constraints, and
  3. Composing those systems to form larger ones while keeping things manageable and modular.

In this blog series, I'll focus on how I approach system design and encoding, both macro and micro.

A prerequisite is that we spend some time familiarizing ourselves with advanced type system features, and techniques for organizing data and behavior and guaranteeing correctness in them. To that end, I’m afraid we’ll also have to cover some ideas from mathematics. I’ll try my best to ensure that we always contextualize these and look at them as tools rather than entirely abstract concepts, but I can’t promise a 100% success rate. Some ideas are famously challenging to communicate. My deepest apologies in advance.

In this first post, though, let's talk broad strokes.

“Sound”?

When I talk about sound system design, I mean “sound” in the same sense people do when applying the word to type systems. Let’s spend a moment on that idea.

Usually, “soundness” is spoken about in the context of niche, purely functional languages like Haskell, OCaml or Idris. It refers to the property of type systems that can provably reject code at compile time that might, in unsound languages, have led to unpleasant runtime rejections like ClassCastException and NullReferenceException (Java), TypeError (JS), or undefined behavior in general.

The last of those might sound the airiest, but – especially in the purely logical domains often associated with SaaS – is often the most dangerous. In these contexts, the worst-case outcome of a crash is usually not too bad. There's always some version of "error budgets" decided by business tolerance for downtime and poor UX. But undefined behaviour means unknown, unintended system events as a function of inputs we promised we could manage.

When logic is the product and the product isn't logical, there's a problem. So in business domains where correctness is paramount, there's a lot to be gained by increasing the strictness of one's encoded system.

This strictness usually generates a need for increased use of types and – in programming languages well-suited to it – is generally delivered alongside other traits to compensate for that:

  • Types are usually far cheaper (in terms of code maintenance, but sometimes also in runtime memory/CPU cost) to declare and compose,
  • Inference is usually excellent (modern, principled systems allow value types to be inferred by usage, sometimes informing static dispatch many lines back),
  • Support for tagged union (also called “sum”) types is usually present, as are features that work well with them, like pattern matching.

Other trends in these languages are that they tend to encourage or demand the decoupling of data structure from program behavior, have great support for working with functions as values, and avoid ideas that encourage coupling, like inheritance. Later in this series, we'll get into these, why they’re appealing, and why they're not as limiting as they might seem prima facie.

These traits generally permit engineering teams to achieve higher-quality results with a lot less code, so in recent years as typical system complexity has massively increased, interest in the languages that support it has as well:

  • Jane St famously builds on OCaml to deliver high-performing quantitative trading systems
  • Mozilla built the foundations of Rust to help keep Firefox safe from (usually memory-related) attacks via its Javascript interpreter
  • Blockchains like Cardano and Solana build on Haskell and Rust respectively, since they need to run in zero-trust environments

There is a natural intuition that strictly encoding logical system constraints should benefit correctness, and an increasingly strong correlation between domains where correctness is paramount (e.g. the finance and security/identity spaces) and adoption of the practice.

So why not do it? Anecdotally, larger companies have faced issues adopting some of these powerful languages because they’re very different to the C-like ones most engineers are familiar with. And this can bring challenges around hiring and upskilling.

Seemingly in response to this tension, we’re now seeing a wave of modern, C-like languages that adopt some or many core ideas from functional ones.

Rust may be the best example in this category:

  • It has a design goal to be read and maintained easily by programmers familiar with C and its derivatives.
  • It offers a sound and flexible typesystem with powerful inference.
  • The direct (but checked and assisted) memory management system makes it an excellent choice for terse code serving complex use cases where performance is a concern.

That extreme pursuit of efficiency does have one significant detriment to those of us coming from managed runtimes: no garbage collector. Memory is managed semi-automatically, so while you don’t have to think about it when writing most code, familiarity with its “borrow” system and “lifetimes” is occasionally needed and can take time to develop.

Weaker examples, too, offer some improvement on established languages, but avoid demanding that we massively shift the way we reason about program behavior or turn our "interest in performance" dial from 3 to 11.

Scala was early here. It started as a terse, expressive, compatible alternative to Java, and (or so I once heard claimed at a conference) more or less accidentally delivered a combination of features that allowed programmers to encode type classes, which made it appealing to a chunk of the programming community.

I put Kotlin and Typescript in the same broad category. Neither is as powerful as Scala (though one might argue that TS is just differently powerful). Still, they’re similar in that they seek to improve the languages they build on (Java and Javascript, respectively) by offering terse type declarations, null-safety, sum types, and more. They both have issues that prevent them from being described as “sound”, but they certainly both make it easier to write high-quality code and inspire greater confidence than the languages they built upon.

In Typescript’s case, the degree to which this is true is frankly awe-inspiring, and not just in contrast with its famously treacherous Javascript foundation.

Features like conditional and mapped types made possible the development of libraries like Zod and tRPC, which allow one to write code that interoperates well with the type system to deliver utility like strict codec and API generation without reflection or a macro system. Template literal types enable well-typed wrappers around JS libraries using "stringly" typed values like Convict (code).‎ The power afforded by these features in combination is perhaps best (if not very practically) illustrated by this implementation of SQL, purely on the type system.

More established and OO-aligned languages like C# and Java are catching up too. C# looked promising early with its forward-thinking support for type parameters in its intermediate language (MSIL) before offering userland support for generics in v2 (2005), lambdas and LINQ in v3 (2007), and pattern matching features in v7 (2017). Contrast this with Java, which only got lambdas in v8 (2014) and progressed slowly in the years following. Thanks largely to Brian Goetz, Java has since accelerated, delivering previews of sealed classes in v15 (2020) and pattern matching in v17 (2021).

Back to system design

That kind of morphed into a modern history of mainstream enterprise programming languages, but I think it’s essential to cover this stuff and offer two reasons for it:

  1. System design is closely tied to system encoding, so it’s valuable to understand the available tools to constrain, communicate, and offer a framework for thinking about our models and machines.
  2. It’s also helpful to recognize deficiencies in our older tools that new ones can bring to light, and to acknowledge that we stick our heads in the sands of the past at our peril.

I hope that mainstream enterprise languages' increasingly rapid adoption of features historically associated with the PL research space convinces you that these ideas are worth exploring. My next post in this series will dive a little deeper into a few of the ideas I’ve touched on in more concrete, real-world contexts.

Share this article

Subscribe to our blog

Subscribe for Blog & Product Updates

Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.

Subscribe to our blog

Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.