Typing the stack

This page is a work in progress. Here be dragons.

A function call wears its shape on its sleeve. Seeing f(x, y) we know that the function f takes two arguments. Accidentally writing f(x) or f(x, y, z) is a visible mistake that can be caught without running the program.

Concatenative syntax doesn't offer the same affordance, a program foo bar baz provides no information at a glance of what's the arity or shape of each operation. How many values does foo consume or leave behind? More often than not we depend on naming to make these relationships clearer between words, but otherwise the syntax does not give us any help in this regard.

Static typing allows us to recover this information that the syntax discards, present it to the user, and catch mistakes before the program is executed.

Stack effects

Instead of giving a word a type like Int -> Int would exist in ML, we describe what it does to the stack in the form of ( inputs -- outputs ): to the left of the dashes is what the word consumes from the top of the stack, to the right is what it leaves behind.

We can see the stack effects of a few select words below:

+    ( int int -- int )
dup  ( a -- a a )
drop ( a -- )
swap ( a b -- b a )

The definitions for dup, drop and swap introduce type variables, similar to generics. dup, for example, doesn't care about what it's duplicating, so it uses a type variable to describe the input.

Discovering a type system

The point of a type system is to check that programs fit together. In our concatenative context, to "fit together" means that when you compose two words, the outputs of the first have to match the inputs of the second.

Our stack effect notation is almost there in allowing us to do this: to infer the type of a program like 2 + we can follow the composition as follows:

2   : ( -- int )
+   : ( int int -- int )

2 + : ( int -- int )

But stack effect notation purposefully elides that's important to us for static typing: codifying that there's a part of the stack we don't touch. With the examples shown it's not important, but once quotations come into the mix they're needed to express the type of certain words.

This means that all of our words presented so far are polymorphic over this "part of the stack we don't touch". We can express that by adding stack variables to our model:

+    : ( ..R int int -- ..R int )
dup  : ( ..R a -- ..R a a )
drop : ( ..R a -- ..R )
swap : ( ..R a b -- ..R b a )

In these examples, we have codified into the type that there's a part of the stack that passes through during the execution of the word.