# Haskell代写 | COMP3161/9164 20T3 Assignment 2

这个作业是用Haskell实现MinHS的类型推断过程

COMP3161/9164 20T3 Assignment 2

Type Inference for Polymorphic MinHs

1 Task 1

Task 1 is worth 100% of the marks of this assignment. You are to implement type inference for

MinHS with aggregate data structures. The following cases must be handled:

• the MinHS language of the first task of assignment 1 (without n-ary functions or letrecs, or

lists)

• product types: the 0-tuple and 2-tuples.

• sum types

1

• polymorphic functions

These cases are explained in detail below. The abstract syntax defining these syntactic entities is

in Syntax.hs. You should not need to modify the abstract syntax definition in any way.

Your implementation is to follow the definition of algebraic data types found in the lecture on

data structures, and the rules defined in the lectures on type inference. Additional material can be

found in the lecture notes on polymorphism, and the reference materials. The full set of rules is

outlined in this specification.

2 Bonus Tasks

These tasks are all optional, and are worth a total of an additional 10%.

2.1 Bonus Task 1: Simple Syntax Extensions

This bonus task is worth an additional 3%. In this task, you should implement type inference for

n-ary functions, and multiple bindings in the one let expression, with the same semantics as the

extension tasks for Assignment 1.

You will need to develop the requisite extensions to the type inference algorithm yourself, but

the extensions are very similar to the existing rules.

2.2 Bonus Task 2: Recursive Bindings

This bonus task is worth an additional 3%. In this task you must implement type inference for

letrec constructs.

Each binding will require a fresh unification variable, similar to multiple-binding let expressions,

however for letrec we do not wish types to be generalised.

2.3 Bonus Task 3: User-provided type signatures

This bonus task is worth an additional 4%. In this task you are to extend the type inference pass

to accept programs containing some type information. You need to combine this with the results

of your type inference pass to produce the final type for each declaration. That is, you need to be

able to infer correct types for programs like:

main = let f :: (Int -> Int)

= recfun g x = x;

in f 2;

You must ensure that the type signatures provided are not overly general. For example, the following

program should be a type error, as the user-provided signature is too general:

main :: (forall a. a) = 3;

3 Algebraic Data Types

This section covers the extensions to the language of the first assignment. In all other respects

(except lists) the language remains the same, so you can consult the reference material from the

first assignment for further details on the language.

2

3.1 Product Types

We only have 2-tuples in MinHS, and the Unit type, which could be viewed as a 0 type.

Types τ → τ 1 * τ 2

| 1

Expressions exp → (e1, e2)

| fst e | snd e

| ()

3.2 Sum Types

Types τ → τ + τ

Expressions exp → Inl e1

| Inr e2

| case e of

Inl x -> e1 ;

Inr y -> e2 ;

3.3 Polymorphism

The extensions to allow polymorphism are relatively minor. Two new type forms have been introduced: the TypeVar t form, and the Forall t e form.

Types τ → forall τ. ..τ..

For example, consider the following code fragment before and after type inference:

main =

let f = recfun g x = x;

in if f True

then f (Inl 1)

else f (Inr ());

main :: (Int + 1) =

let f :: forall a. (a -> a) = recfun g :: (a -> a) x = x;

in if f True

then f (Inl 1)

else f (Inr ());

3

4 Type Inference Rules

Sections coloured in blue can be viewed as inputs to the algorithm, while red text can be viewed

as outputs.

Constants and Variables

Γ ` n : Int

x : ∀a1 . . . ∀an.τ ∈ Γ

Γ ` x : τ [a1 := β1, . . . , an := βn]

βi fresh

(unquantify in TyInfer.hs provides an easy way to implement this)

Constructors and Primops

constructorType(C) = ∀a1 . . . ∀an.τ

Γ ` C : τ [a1 := β1, . . . , an := βn]

primopType(o) = ∀a1 . . . ∀an.τ

Γ ` o : τ [a1 := β1, . . . , an := βn]

βi fresh

(constructorType and primopType are defined in TyInfer.hs)

Application

TΓ ` e1 : τ1 T

0TΓ ` e2 : τ2 T

0

τ1

U∼ (τ2 → α)

UT0TΓ ` Apply e1 e2 : Uα

α fresh

If-Then-Else

TΓ ` e : τ τ

U∼ Bool T1UTΓ ` e1 : τ1 T2T1UTΓ ` e2 : τ2 T2τ1

U

0

∼ τ2

U0T2T1UTΓ ` If e e1 e2 : U0τ2

Case

TΓ ` e : τ

T1T(Γ ∪ {x : αl}) ` e1 : τl

T2T1T(Γ ∪ {y : αr}) ` e2 : τr

T2T1T(αl + αr)

U∼ T2T1τ UT2τl

U

0

∼ Uτr

U0UT2T1TΓ ` Case e x.e1 y.e2 : U0Uτr

αl

, αr fresh

Recursive Functions

T(Γ ∪ {x : α1} ∪ {f : α2}) ` e : τ T α2

U∼ T α1 → τ

UTΓ ` Recfun f.x.e : U(T α1 → τ )

α1, α2 fresh

Let Bindings

TΓ ` e1 : τ T

0

(TΓ ∪ {x : Generalise(TΓ, τ )}) ` e2 : τ

0

T0TΓ ` Let e1 x.e2 : τ

0

TΓ ` e : τ

Γ ` Main e : Generalise(Γ, τ )

where Generalise(Γ, τ ) = ∀ (TV (τ ) \ TV (Γ)) . τ

4.1 Implementing the algorithm

The type inference rules imply an algorithm, where the expression and the environment can be seen

as input, and the substitution and the type of the expression can be seen as output, as seen from

the color coding above.

Such an algorithm would probably have a type signature like:

4

inferExp :: Gamma -> Exp -> TC (Type, Subst)

However our goal isn’t just to infer the type of the expression, but to also elaborate the expression to

its explicitly-typed equivalent. In other words, we want to add typing annotations to our expressions,

meaning we must return a new expression as well as the type and substitution.

inferExp :: Gamma -> Exp -> TC (Exp, Type, Subst)

The cases for let and recfun (and letrec, in the bonus) must add a (correct) type signature to

the resultant expression, and all other cases must be sure to return a new expression consisting of

the updated subexpressions.

Because we add annotations to the expressions on the way, those type annotations wouldn’t

contain the information we get from typing successive expressions. Consider the following example:

{+ : Int → Int → Int, x : a} ` ( (let z = x in x) , x+1)

When typing let z = x in x, we only know that x is of type a, so we would add the type annotation let z :: a = x in x. Only when we type the second expression, x+1, we know that a has

to be Int (which will be reflected in T

0

). That’s why, when we’re done typing the top-level binding,

we have to traverse the whole binding again, applying the substitution to each type annotation

anywhere in the binding.

The function allTypes, defined in Syntax.hs, allows you to update each type annotation in an

expression, making it easy to perform this substitution.

5 Substitution

Substitutions are implemented as an abstract data type, defined in Subst.hs. Subst is an instance

of the Monoid type class, which is defined in the standard library as follows:

class Monoid a where

mappend :: a -> a -> a — also written as the infix operator <>

mempty :: a

For the Subst instance, mempty corresponds to the empty substitution, and mappend is substitution composition. That is, applying the substitution a <> b is the same as applying a and b

simultaneously. It should be reasonably clear that this instance obeys the monoid laws:

mempty <> x == x — left identity

x <> mempty == x — right identity

x <> (y <> z) == (x <> y) <> z — associativity

It is also commutative (x <> y == y <> x) assuming that the substitutions are disjoint (i.e that

dom(x) ∩ dom(y) = ∅). In the type inference algorithm, your substitutions are all applied in order

and thus should be disjoint, therefore this property should hold.

You can use this <> operator to combine multiple substitutions into your return substitution.

You can construct a singleton substitution, which replaces one variable, with the =: operator,

so the substitution (“a” =: TypeVar “b”) <> (“b” =: TypeVar “c”) is a substitution which

replaces a with c and b with c.

The Subst module also includes a variety of functions for running substitutions on types, quantified types, and environments.

5

6 Unification

The unification algorithm is the structural matching one originally due to Robinson and discussed

in lectures:

• input: two type terms t1 and t2, where forall quantified variables have been replaced by fresh,

unique variables

• output: the most general unifier of t1 and t2 (if it exists). The unifier is a data structure or

function that specifies the substitutions to take place to unify t1 and t2.

6.1 Unification Cases

For t1 and t2

1. both are type variables v1 and v2:

– if v1 = v2, return the empty substitution

– otherwise, return [v1 := v2]

2. both are primitive types

– if they are the same, return the empty substitution

– otherwise, there is no unifier

3. both are product types, with t1 = (t11 ∗ t12), t1 = (t21 ∗ t22)

– compute the unifier S of t11 and t21

– compute the unifier S

0 of S t12 and S t22

– return S <> S

0

4. function types and sum types (as for product types)

5. only one is a type variable v , the other an arbitrary type term t

– if v occurs in t, there is no unifier

– otherwise, return [v := t]

6. otherwise, there is no unifier

Functions in the Data.List library are useful for implementing the occurs check.

Once you generate a unifier (also a substitution), you then need to apply that unifier to your

types, to produce the unified type.

7 Errors and Fresh Names

Thus far, the following type signature would be sufficient for implementing our type inference

function:

inferExp :: Gamma -> Exp -> (Exp, Type, Subst)

Unification is a partial function, however, so we need well-founded way to handle the error cases,

rather than just bail out with error calls.

To achieve this, we’ll adjust the basic, pure signature for type inference to include the possibility

of a TypeError:

6

inferExp :: Gamma -> Exp -> Either TypeError (Exp, Type, Subst)

Even this, though, is not sufficient, as we cannot generate fresh, unique type variables for use as

unification variables:

fresh :: Type — it is impossible for fresh to return a different

fresh = ? — value each time!

To solve this problem, we could pass an (infinite) list of unique names around our program, and

fresh could simply take a name from the top of the list, and return a new list with the name

removed:

fresh :: [Id] -> ([Id],Type)

fresh (x:xs) = (xs,TypeVar x)

This is quite awkward though, as now we have to manually thread our list of identifiers throughout

the entire inference algorithm:

inferExp :: Gamma -> Exp -> [Id] -> Either TypeError ([Id], (Exp, Type, Subst))

To resolve this, we bundle both the [Id] state transformer and the Either TypeError x error

handling into one abstract type, called TC (defined in TCMonad.hs)

newtype TC a = TC ([Id] -> Either TypeError ([Id], a))

One can think of TC a abstractly as referring to a stateful action that will, if executed, return a

value of type a, or throw an exception.

As the name of the module implies, TC is a Monad, meaning that it exposes two functions

(return and >>=) as part of its interface.

return :: a -> TC a

return = …

(>>) :: TC a -> TC b -> TC b

a >> b = a >>= const b

(>>=) :: TC a -> (a -> TC b) -> TC b

a >>= b = …

The function return is, despite its name, just an ordinary function which lifts pure values into a TC

action that returns that value. The function (>>) (read then), is a kind of composition operator,

which produces a TC action which runs two TC actions in sequence, one after the other, returning

the value of the last one to be executed. Lastly, the function (>>=), more general than (>>), allows

the second executed action to be determined by the return value of the first.

The TCMonad.hs module also includes a few built-in actions:

typeError :: TypeError -> TC a — throw an error

fresh :: TC Type — return a fresh type variable

Haskell includes special syntactic sugar for monads, which allow for programming in a somewhat

imperative style. Called do notation, it is simple sugar for (>>) and (>>=).

do e — e

do e; v — e >> do v

do p <- e; v — e >>= \p -> do v

do let x = y; v — let x = y in do v

This lets us write unification and type inference quite naturally. A simple example of the use of the

TC monad is already provided to you, with the unquantify function, which takes a type with some

number of quantifiers, and replaces all quantifiers with fresh variables (very useful for inference

cases for variables, constructors, and primops):

7

unquantify :: QType -> TC Type

unquantify (Ty t) = return t

unquantify (Forall x t) = do x’ <- fresh unquantify (substQType (x =:x’) t) To run a TC action, in your top level infer function, the runTC function can be used: runTC :: TC a -> Either TypeError a

Please note: This function runs the TC action with the same source of fresh names each time!

Using it more than once in your program is not likely to give correct results.