# Entries tagged correctness

## Typed embedding of STLC into Haskell

5 February 2015 (programming haskell language correctness) Someone posted to the Haskell subreddit this blog post of Lennart's where he goes step by step through implementing an evaluator and type checker for CoC. I don't know why this post from 2007 showed up on Reddit this week, but it's a very good post, pedagogically speaking. Go and read it.

In this post, I'd like to elaborate on the simply-typed lambda calculus part of his blog post. His typechecker defines the following types for representing STLC types, terms, and environments:

```haskell
data Type = Base
          | Arrow Type Type
          deriving (Eq, Show)

type Sym = String

data Expr = Var Sym
          | App Expr Expr
          | Lam Sym Type Expr
          deriving (Eq, Show)
```

The signature of the typechecker presented in his post is as follows:

```haskell
type ErrorMsg = String
type TC a = Either ErrorMsg a

newtype Env = Env [(Sym, Type)] deriving (Show)

tCheck :: Env -> Expr -> TC Type
```
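For reference, the checker itself is a straightforward recursion over the term. Roughly (this is my paraphrase, assuming the definitions above are in scope, not Lennart's exact code):

```haskell
tCheck :: Env -> Expr -> TC Type
tCheck (Env env) (Var s) =
    maybe (Left $ "unbound variable: " ++ s) Right (lookup s env)
tCheck env (App f e) = do
    tf <- tCheck env f
    te <- tCheck env e
    case tf of
        Arrow t1 t2 | t1 == te -> return t2
        _                      -> Left "ill-typed application"
tCheck (Env env) (Lam s t e) = do
    t' <- tCheck (Env ((s, t) : env)) e
    return $ Arrow t t'
```

Note that the result is merely a `Type`: the checker computes a type, but nothing ties that type back to the term it came from.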

My approach is to instead create a representation of terms of STLC in such a way that only well-scoped, well-typed terms can be represented. So let's turn on a couple of heavy-weight language extensions from GHC 7.8 (we'll see how each of them is used), and define a typed representation of STLC terms:

```haskell
{-# LANGUAGE GADTs, StandaloneDeriving #-}
{-# LANGUAGE DataKinds, KindSignatures, TypeFamilies, TypeOperators #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TemplateHaskell #-} -- sigh...

import Data.Singletons.Prelude
import Data.Singletons.TH
import Data.Type.Equality
import Data.Proxy -- for the 'Proxy' used in the typechecker below

-- | A (typed) variable is an index into a context of types
data TVar (ts :: [Type]) (a :: Type) where
    Here  :: TVar (t ': ts) t
    There :: TVar ts a -> TVar (t ': ts) a
deriving instance Show (TVar ctx a)

-- | Typed representation of STLC: well-scoped and well-typed by construction
data TTerm (ctx :: [Type]) (a :: Type) where
    TConst :: TTerm ctx Base
    TVar   :: TVar ctx a -> TTerm ctx a
    TLam   :: TTerm (a ': ctx) b -> TTerm ctx (Arrow a b)
    TApp   :: TTerm ctx (Arrow a b) -> TTerm ctx a -> TTerm ctx b
deriving instance Show (TTerm ctx a)
```

The idea is to represent the context of a term as a list of
types of variables in scope, and index into that list, de
Bruijn-style, to refer to variables. This indexing operation
maintains the necessary connection between the pointer and the
type that it points to. Note the type of the `TLam`
constructor, where we extend the context at the front for the
inductive step.
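For instance (assuming the definitions above are in scope), the identity function on `Base` and its application to the constant look like this; anything ill-scoped or ill-typed is simply rejected by GHC:

```haskell
-- λx. x at type Base → Base: the bound variable is de Bruijn index zero, i.e. 'Here'
tId :: TTerm '[] (Arrow Base Base)
tId = TLam (TVar Here)

-- (λx. x) applied to the constant: a closed term of base type
tIdConst :: TTerm '[] Base
tIdConst = TApp tId TConst

-- tBad = TApp TConst TConst   -- rejected at compile time: 'Base' is not an arrow type
```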

To give a taste of how convenient it is to work with this representation programmatically, here's a total evaluator:

```haskell
-- | Interpretation (semantics) of our types
type family Interp (t :: Type) where
    Interp Base = ()
    Interp (Arrow t1 t2) = Interp t1 -> Interp t2

-- | An environment gives the value of all variables in scope in a given context
data Env (ts :: [Type]) where
    Nil  :: Env '[]
    Cons :: Interp t -> Env ts -> Env (t ': ts)

lookupVar :: TVar ts a -> Env ts -> Interp a
lookupVar Here      (Cons x _)  = x
lookupVar (There v) (Cons _ xs) = lookupVar v xs

-- | Evaluate a term of STLC. This function is total!
eval :: Env ctx -> TTerm ctx a -> Interp a
eval env TConst     = ()
eval env (TVar v)   = lookupVar v env
eval env (TLam lam) = \x -> eval (Cons x env) lam
eval env (TApp f e) = eval env f $ eval env e
```
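Since `eval` maps a closed term of type `a` straight into the Haskell type `Interp a`, evaluating a lambda yields an honest Haskell function (my illustration, assuming the definitions above are in scope):

```haskell
-- Evaluating λx. x in the empty environment gives a Haskell function of type () -> ()
idFun :: () -> ()
idFun = eval Nil (TLam (TVar Here))

-- Applying it under 'eval' gives back the unit value
unit :: ()
unit = eval Nil (TApp (TLam (TVar Here)) TConst)
```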

Of course, the problem is that this representation is not at all convenient for other purposes. For starters, it is certainly not how we would expect human beings to type in their programs.

My version of the typechecker, instead of merely computing the
type of a term (when it is well-typed), transforms
the loose representation (`Term`) into the tight one
(`TTerm`). A `Term` is well-scoped and
well-typed (under some binders) iff there is a
`TTerm` corresponding to it. Let's use singletons
to store type information in existential positions:

```haskell
$(genSingletons [''Type])
$(singDecideInstance ''Type)

-- | The loose representation is Lennart's 'Expr'; we'll call it 'Term' below
type Term = Expr

-- | Existential version of 'TTerm'
data SomeTerm (ctx :: [Type]) where
    TheTerm :: Sing a -> TTerm ctx a -> SomeTerm ctx

-- | Existential version of 'TVar'
data SomeVar (ctx :: [Type]) where
    TheVar :: Sing a -> TVar ctx a -> SomeVar ctx

-- | A typed binder of variable names
data Binder (ctx :: [Type]) where
    BNil  :: Binder '[]
    BCons :: Sym -> Sing t -> Binder ts -> Binder (t ': ts)
```

Armed with these definitions, we can finally define the type inference function. I would argue that it is no more complicated than Lennart's version. In fact, it has the exact same shape, with value-level equality tests replaced by `Data.Type.Equality`-based checks.

```haskell
-- | Type inference for STLC
infer :: Binder ctx -> Term -> Maybe (SomeTerm ctx)
infer bs (Var v) = do
    TheVar t v' <- inferVar bs v
    return $ TheTerm t $ TVar v'
infer bs (App f e) = do
    TheTerm (SArrow t0 t) f' <- infer bs f
    TheTerm t0' e' <- infer bs e
    Refl <- testEquality t0 t0'
    return $ TheTerm t $ TApp f' e'
infer bs (Lam v ty e) = case toSing ty of
    SomeSing t0 -> do
        TheTerm t e' <- infer (BCons v t0 bs) e
        return $ TheTerm (SArrow t0 t) $ TLam e'

inferVar :: Binder ctx -> Sym -> Maybe (SomeVar ctx)
inferVar (BCons u t bs) v
  | v == u = return $ TheVar t Here
  | otherwise = do
      TheVar t' v' <- inferVar bs v
      return $ TheVar t' $ There v'
inferVar _ _ = Nothing
```

Note that pattern matching on `Refl` in the
`App` case brings in scope type equalities that are
crucial to making `infer` well-typed.
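Here it is in action in GHCi (my illustration, assuming everything above is in scope): a well-typed term is elaborated into its `TTerm` form, while an ill-scoped one yields `Nothing`:

```haskell
λ> (\(TheTerm _ e) -> show e) <$> infer BNil (Lam "x" Base (Var "x"))
Just "TLam (TVar Here)"
λ> infer BNil (Var "x")  -- unbound variable
Nothing
```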

Of course, because of the existential nature of
`SomeTerm`, we should also provide a typechecker,
which is a much more convenient interface to work with:

```haskell
-- | Typechecker for STLC
check :: forall ctx a. (SingI a) => Binder ctx -> Term -> Maybe (TTerm ctx a)
check bs e = do
    TheTerm t' e' <- infer bs e
    Refl <- testEquality t t'
    return e'
  where
    t = singByProxy (Proxy :: Proxy a)

-- | Typechecker for closed terms of STLC
check_ :: (SingI a) => Term -> Maybe (TTerm '[] a)
check_ = check BNil
```

(The `SingI a` constraint is an unfortunate
implementation detail; the kind of `a` is
`Type`, which is closed, so GHC should be able to
know there is always going to be a `SingI a`
instance).
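Usage is now just a matter of picking the expected type via an annotation (my illustration, again in GHCi):

```haskell
λ> check_ (Lam "x" Base (Var "x")) :: Maybe (TTerm '[] (Arrow Base Base))
Just (TLam (TVar Here))
λ> check_ (Lam "x" Base (Var "x")) :: Maybe (TTerm '[] Base)
Nothing
```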

To review, we've written a **typed** embedding of
STLC into Haskell, with a total evaluator and a typechecker, in
about 110 lines of code.

If we were doing this in something more like Agda, one possible
improvement would be to define a function `untype :: TTerm
ctx a -> Term` and use that to give `check`
basically a type of `Binder ctx -> (e :: Term) -> Either
((e' :: TTerm ctx a) -> untype e' == e -> Void) (TTerm ctx
a)`, i.e. to give a proof in the non-well-typed case as well.

## Simply Typed Lambda Calculus in Agda, Without Shortcuts

1 May 2013 (programming agda correctness language) There are many ways to introduce dependent type systems, depending on which side of the Curry-Howard lens you look through. But if your interest is more in programming languages than proof assistants, then length-indexed vectors are your Hello, World!, and an interpreter for the simply-typed lambda calculus is your FizzBuzz. I think the usual presentation of the latter is a bit of a cheat, for reasons which I will explain below. I'll also show a "non-cheating" version.

Continue reading »

## Mod-N counters in Agda

19 February 2012 (programming agda correctness haskell) (6 comments) First of all, before I start on the actual blog post, let me put this in context. I remember a couple of years ago when I developed an interest in functional programming languages, and Haskell in particular. There was a phase when I was able to use Haskell to solve problems in the small. I understood most of the basics of pure functional programming; then there were things I regarded as magic; and of course there were a lot of things I didn't even know that I didn't know about. But none of it did I grok.

I feel like I'm starting to get to the same level with Agda now. So this is going to be one of those "look at this cool thing I made" posts where the actual result is probably going to be trivial for actual experts of the field; but it's an important milestone for my own understanding of the subject.

I wanted to play around with simple but Turing-complete
languages, and I started implementing an interpreter for a counter
machine. More on that in a later post; this present post
describes just the representation of register values. In the
model that I implemented, values of registers are byte counters,
meaning they have 256 different values, and two operations
`+1` and `-1` that are inverses of each
other. Incrementing/decrementing should roll over: `255 +1 = 0` and `0 -1 = 255`.
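The intended behaviour can be sketched in Haskell (my illustration, not the Agda code from the post) using `Word8`, whose arithmetic is already modulo 256:

```haskell
import Data.Word (Word8)

-- A register value is a byte; (+ 1) and subtract 1 wrap around,
-- so they are inverses of each other on all 256 values.
inc, dec :: Word8 -> Word8
inc = (+ 1)       -- 255 +1 = 0
dec = subtract 1  -- 0 -1 = 255
```

The interesting part, of course, is doing this in Agda while *proving* the rollover and inverse properties rather than inheriting them from machine arithmetic.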

## Unary arithmetic is even slower than you'd expect

19 April 2010 (haskell programming math agda correctness) My coworker Encsé posted a challenge to use the 9-digit number problem to demonstrate interesting programming techniques. So I sat down and wrote my first practical program using dependent types, in Agda. I had played around with Agda before, but this seemed like a good opportunity to try to write an actual, self-contained, provably correct program with it.

I'm not going to get into details now about the resulting Agda code, because I'm planning to present it in detail in a later post. In its current form, my program is formally proven to produce only good nine-digit numbers; the only property that still needs proving is that it finds all of them.

But the sad surprise came when I tried to actually run it. It was unbearably slow to just get to all possible combinations for the first three digits. I'm talking about 12 minutes just to list them from 123 to 423 (at which point I killed the process). For comparison, the following Haskell program, which is an implementation of the same naïve algorithm, finds the (unique) solution in 4 milliseconds:

```haskell
import Control.Monad.List

fromDigits = foldr shift 0
  where shift d s = 10 * s + d

p `divides` q = q `mod` p == 0

encse :: [Int]
encse = map fromDigits $ encse' 0 []
  where
    encse' 9 ds = return ds
    encse' n ds = do
        d <- [1..9]
        let ds' = d:ds
            n' = n + 1
        guard $ not (d `elem` ds)
        guard $ n' `divides` fromDigits ds'
        encse' n' ds'
```

### So where's that slowdown coming from?

The first intuition would be that the code generated by Agda is slow because in parallel to the actual computation, it is also evaluating all kinds of proofs. But the proofs exist only in the world of types, so they shouldn't matter once the program is compiled.

The real answer is that calculating in unary representation is slow. Very, very slow. Even slower than you'd imagine.
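To see why, consider what the compiled program actually has to do. With the unary (Peano) representation of naturals, even the basic arithmetic operations are linear-time; a sketch in Haskell (my illustration):

```haskell
-- Unary (Peano) naturals: the number n takes n+1 constructors of space
data Nat = Z | S Nat

add :: Nat -> Nat -> Nat
add Z     n = n
add (S m) n = S (add m n)      -- O(m) steps

mul :: Nat -> Nat -> Nat
mul Z     _ = Z
mul (S m) n = add n (mul m n)  -- O(m * n) steps

-- 'mod', which the search calls on every candidate prefix, is in effect
-- repeated subtraction: the number of steps grows with the number itself.
```

Since the algorithm repeatedly computes `mod` on up-to-nine-digit numbers, each divisibility check costs on the order of the number's *value*, not the length of its representation.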

Continue reading »

## The B Method for Programmers (Part 2)

22 February 2010 (programming language math correctness)
In
my previous post, I introduced the B method and showed the
steps of writing a simple program for finding the
*n*-th element of a sequence
satisfying a given predicate `p`. While you may think
the resulting program is correct, we can't just say so and be
done with it. The whole point of the B method is that the
resulting program can be formally proven correct.

The B software generates 69 so-called proof obligations for the
code from the first part. These are assertions about the program
actually behaving as specified. For example, let's look at PO69
which asserts that `ll` is correctly set. Recall first
the relevant portion of the specification:

`ll = bool(card({kk | kk ∈ aa..bb ∧ pp(kk) = TRUE}) ≥ nn)`

And the invariant of the implementation:

`ll = bool(card({kk | kk ∈ aa..ii ∧ pp(kk) = TRUE}) = nn)`
So what we have to prove is that given the preconditions, by the
time the loop in the implementation terminates, the invariant
makes sure `ll` is equal to its specified value. This is
what's described (somewhat more verbosely) by the actual proof
obligation below.
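In general, the obligation for a while loop has the familiar Hoare-logic shape (this schematic is my gloss, not B's actual syntax):

```
precondition ∧ invariant ∧ ¬(loop condition)  ⇒  postcondition
```

That is, once the loop has exited, the invariant above must entail the specified value of `ll`.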

# Older entries:

- 16 February 2010: The B Method for Programmers (Part 1)
- 4 September 2008: Programozás visszavezetéssel ("Programming by reduction") (4 comments)