\newif\ifcomments %% include author discussion
\commentsfalse %% toggle comments here
\documentclass{article}
\usepackage{bigfoot}
\usepackage{amsmath,amssymb,amsthm}
\usepackage[dvipsnames]{xcolor}
\usepackage{verbatim}
\usepackage[utf8]{inputenc}
\usepackage[hyphens]{url}
\usepackage[hidelinks,bookmarksnumbered,pdfencoding=auto,psdextra]{hyperref}
% From https://tex.stackexchange.com/a/430877/133551 and Heiko's comment
\pdfstringdefDisableCommands{%
\def\({}%
\def\){}%
}
\usepackage{supertabular}
\usepackage{listings}
\usepackage{textcomp}
\usepackage{xspace}
\usepackage{ottalt}
\usepackage[T1]{fontenc}
\newcommand\scw[1]{\ifcomments\emph{\textcolor{violet}{#1}}\fi}
% requires dvipsnames
\usepackage{lstpi}
\usepackage{lsthaskell}
\newcommand\cd[1]{\lstinline[language=Haskell]{#1}}
\newcommand\pif{\texttt{pi-forall}\xspace}
\newcommand\unbound{\texttt{Unbound}\xspace}
\theoremstyle{definition}
\newtheorem{definition}{Definition}[section]
\newtheorem{lemma}{Lemma}[section]
% force footnotes to be on a single page
\interfootnotelinepenalty=10000
\title{Implementing Dependent Types in \pif}
\author{Stephanie Weirich}
\begin{document}
\maketitle
% load Ott generated latex macros, and modify them with ottalt.sty
\inputott{pi-rules}
\section{Overview and Goals}
These lecture notes describe the design of a minimal dependently-typed
language called ``\pif'' and walk through the implementation of its type
checker. They are based on lectures given at the \emph{Oregon Programming
Languages Summer School} during July 2023 and derived from earlier lectures
from summer schools during 2022, 2014 and 2013.
\paragraph{What do I expect from you?} This discussion assumes a
familiarity with the basics of the lambda calculus, including its standard
operations (alpha-equivalence, substitution, evaluation) and the basics of
type systems (especially their specification using inference rules). For
background on this material, I recommend the textbooks~\cite{tapl, pfpl}.
Furthermore, these notes also refer to an implementation of a demo type
checker and assume basic knowledge of the Haskell programming language. This
implementation is available at \url{https://github.com/sweirich/pi-forall} on
branch \texttt{2023}. As you study these notes, I encourage you to download
this repository, read through its source code, and experiment with
it. Installation instructions are available with the repository.
\paragraph{What do these notes cover?}
These notes are broken into several sections that incrementally build up the
design and implementation of a type checker for a dependently-typed
programming language. This implementation itself is available in separate
versions, each found in separate subdirectories of the repository.
\begin{figure}[ht]
\begin{center}
\begin{tabular}{llll}
Key feature & \pif subdirectory & Section\\
\hline
Core system & \texttt{version1} & Sections~\ref{sec:simple}, \ref{sec:bidirectional}, and \ref{sec:implementation} \\
Equality & \texttt{version2} & Sections~\ref{sec:equality} and ~\ref{sec:pattern-matching}\\
Irrelevance & \texttt{version3} & Section ~\ref{sec:irrelevance} \\
Datatypes & \texttt{full} & Sections~\ref{sec:examples} and ~\ref{sec:datatypes} \\
\end{tabular}
\end{center}
\caption{Connection between sections and \pif versions}
\label{fig:impls}
\end{figure}
These implementations build on each other (each is an extension of the
previous version) and are summarized in Figure~\ref{fig:impls}. As you read
each chapter, refer to its corresponding implementation to see how the
features described in that chapter can be implemented. The directory
\texttt{main} is the source of all of these implementations and contains the
markup needed to generate the four versions.
\begin{itemize}
\item Section~\ref{sec:examples} starts with some examples of the
\texttt{full} \pif language so that you can see how the pieces fit together.
\item Section~\ref{sec:simple} presents the mathematical specification of the
core \pif type system including its concrete syntax (as found in \pif source
files), abstract syntax (represented with a Haskell datatype), and core
typing rules (written using standard mathematical notation). This initial
type system is simple and declarative. In other words, it \emph{specifies}
what terms should type check, but cannot be directly implemented.
\item Section~\ref{sec:bidirectional} reformulates the typing rules so that
they are \emph{syntax-directed} and correspond to a type-checking
algorithm. The key idea of this section is to recast the typing rules as a
\emph{bidirectional type system}.
% As you read this section, ask yourself ``How much information do
% we need to include in terms to make type checking algorithmic?''
\item Section~\ref{sec:implementation} introduces the core \pif implementation
and walks through the type checker found in \texttt{version1}, the Haskell
implementation of the typing rules discussed in
Section~\ref{sec:bidirectional}. This section also shows how the \unbound library
can assist with the implementation of variable binding and automatically derive operations for
capture-avoiding substitution and alpha-equivalence. Finally, the section
describes a monadic structure for the type checker, and how it can help with
the production of error messages, the primary purpose of a type checker.
% Ask yourself ``How can we represent the
% abstract syntax of the language, including variable binding?''
\item Section~\ref{sec:equality} discusses the role of definitional equality
in dependently typed languages. After motivating examples, it presents both
a specification of when terms are equal and a semi-decidable algorithm that
can be incorporated into the type checker.
% Ask yourself ``How can we know when types (and terms) are equal?''
\item An important feature of dependently-typed languages is the ability for
run-time tests to be reflected into the type
system. Section~\ref{sec:pattern-matching} shows how to extend \pif with a
simple form of dependent pattern matching. We look at this feature in two
different ways: first, how to reflect the gain in information about boolean
values in each branch of an $\mathsf{if}$ expression; and second, how to
reflect equalities expressed via propositional equality.
\item Section~\ref{sec:irrelevance} introduces the idea of tracking the
\emph{relevance} of arguments. Relevance tracking enables parts of the
program to be identified as ``compile-time only'' and thus erased before
execution. It also identifies parts of terms that can be ignored when
deciding equivalence.
% Ask yourself ''How can we decide what parts
% of the term are irrelevant during computation? Can be ignored when checking
% for equality?''
\item Finally, Section~\ref{sec:datatypes} introduces arbitrary datatypes and
generalizes dependent pattern matching. This section combines the features
introduced in the previous sections (Sections \ref{sec:equality},
\ref{sec:pattern-matching}, and \ref{sec:irrelevance}) into a single unified
language feature.
\end{itemize}
\paragraph{What do these notes \textbf{not} cover?}
The goal of these notes is to provide an introductory overview of the
implementation of (dependent) type theories and type checkers. As a result,
there are many related topics that are not included here. Furthermore, because
these notes come with a reference implementation, they emphasize only a single
point in a rich design space.
For a broader view, Section~\ref{sec:related-work} includes a discussion of
alternative tutorials and describes how this one differs from other
approaches. The key differences are summarized below:
\begin{itemize}
\item For simplicity, the \pif language does not enforce termination through
type checking. Implementing a proof system like Agda or Coq requires
additional structure, including universe levels and bounded recursion.
\item Many implementations of dependent type theories use
\emph{normalization-by-evaluation} to decide whether types are equivalent
during type checking. \pif uses an alternative approach based on
weak-head-normalization, defined using substitution. This approach is closer
to $\lambda$-calculus theory but can be less efficient in practice.
\item For simplicity, \pif does not attempt to infer missing arguments using
unification or other means. As a result, example programs in \pif are
significantly more verbose than in other languages.
\item This implementation relies on the \unbound library for variable binding,
alpha-equivalence, and substitution. As a result, these operations can be
automatically derived. A more common approach is to use de Bruijn indices.
\item Recent work on cubical type theory, higher-inductive types, and
univalence is not covered here. Indeed, the rules for dependent pattern matching
that we present are incompatible with such extensions.
\end{itemize}
\paragraph{What do I want you to get out of all of this?}
\begin{enumerate}
\item An understanding of how to translate mathematical specifications of type
systems and logics into code, i.e., how to represent the syntax of a
programming language and how to implement a type checker. More generally,
this involves techniques for turning a declarative specification of a system
of judgments into an algorithm that determines whether the judgment holds.
\item Exposure to the design choices of dependently-typed languages. In this
respect, the goal is breadth, not depth. As a result, I will provide
\emph{simple} solutions to some of the problems that we face and sidestep
other problems entirely. Because solutions are chosen for simplicity,
Section~\ref{sec:related-work} includes pointers to related work if you want to go
deeper.
\item Experience with the Haskell programming language. I think Haskell is an
awesome tool for this sort of work and I want to demonstrate how its
features (monads, generic programming) are well-suited for this task.
\item A tool that you can use as a basis for experimentation. When you design
your language, how do you know what programs you can and cannot express?
Having an implementation lets you work out (smallish) examples and will help
convince you (and your reviewers) that you are developing something useful.
Please use \pif as a starting point for your ideas.
\item Templates and tools for writing about type systems. The source files for
these lecture notes are available in the same repository as the demo
implementation and I encourage you to check them out.\footnote{See the
\texttt{doc} subdirectory in \url{https://github.com/sweirich/pi-forall}.}
Building these notes requires Ott~\cite{ott}, a tool specifically tailored
for typesetting type systems and mathematical specifications of programming
languages.
\end{enumerate}
\section{A taste of \pif}
\label{sec:examples}
Before diving into the design of the \pif language, we will start with a few
motivating examples for dependent types. This way, you will get to learn a bit
about the syntax of \pif and anticipate the definitions that are to come.
One example that dependently-typed languages are really good at is tracking
the length of lists.
In \pif, we can define a datatype for length-indexed lists (often called
vectors) as follows.\footnote{In the \texttt{full} implementation, you can find these
definitions in the file \texttt{pi/Vec.pi}.}
\begin{piforall}
data Vec (A : Type) (n : Nat) : Type where
  Nil of [n = Zero]
  Cons of [m:Nat] (A) (Vec A m) [n = Succ m]
\end{piforall}
The type \cd{Vec A n} has two parameters: \cd{A}, the type of elements in the
list, and \cd{n}, a natural number indicating the length of the list. This
datatype also has two constructors: \cd{Nil} and \cd{Cons}, corresponding to
empty and non-empty lists. The \cd{Cons} constructor has three arguments: a
number \cd{m}, the element at the head of the list (of type \cd{A}), and the
tail of the list (of type \cd{Vec A m}).
Both constructors also include \emph{constraints} that must be satisfied when
they are used. The \cd{Nil} constructor constrains the length
parameter to be \cd{Zero} and the \cd{Cons} constructor
constrains it to be one more than the length of the tail of the list.
In the definition of \cd{Cons}, the \cd{m} parameter is the length of the tail
of the list. This parameter is written in square brackets to indicate that it
should be \emph{irrelevant}: this value is for type checking only and
functions should not use it.
For example, we can use \pif to check that the list below contains exactly
three boolean values.
\begin{piforall}
v3 : Vec Bool 3
v3 = Cons [2] True (Cons [1] False (Cons [0] False Nil))
\end{piforall}
Note that \pif never infers arguments, so above we must always supply the
length of the tail whenever we use \cd{Cons}.
We can also map over length-indexed lists.
\begin{piforall}
map : [A:Type] -> [B:Type] -> [n:Nat] -> (A -> B) -> Vec A n -> Vec B n
map = \ [A][B][n] f v.
  case v of
    Nil -> Nil
    Cons [m] x xs -> Cons [m] (f x) (map [A][B][m] f xs)
\end{piforall}
As \pif doesn't infer arguments, the map function needs to take \cd{A}, \cd{B}
and \cd{n} explicitly at the beginning of the function definition, and they
need to be provided as arguments to the recursive call.
The type checker helps us avoid errors in the definition of \cd{map}. As in
most functional languages, if we had forgotten to call \cd{f} and instead
included \cd{x} in the \cd{Cons}, then \cd{map} would produce a vector
containing elements of type \cd{A} instead of \cd{B}. The type checker would
flag this as an error. Similarly, if we had replaced the \cd{Cons} branch with
\cd{Nil}, or had forgotten to \cd{Cons} \cd{f x} onto the result of the
recursive call in this branch, the resulting vector would have the wrong
length. The type checker would catch this error too!
The \pif language does not include the sophisticated type inference algorithms
found in many other languages, which can figure out some of the arguments
provided to a function automatically. As a result,
when we call map, we also need to supply the appropriate parameters for
the function and the vector that we are giving to \cd{map}. For example, to map
the boolean \cd{not} function, of type \cd{Bool -> Bool}, we need to provide
these two \cd{Bool}s along with the length of the vector.
\begin{piforall}
v4 : Vec Bool 3
v4 = map [Bool][Bool][3] not v3
\end{piforall}
Because we are statically tracking the length of vectors, we can write
a safe indexing operation. We index into the vector using an index whose type
bounds its range. Below, the \cd{Fin} type has a parameter that states what
length of vector it can safely index.
\begin{piforall}
data Fin (n : Nat) : Type where
  Zero of [m:Nat][n = Succ m]
  Succ of [m:Nat](Fin m)[n = Succ m]
\end{piforall}
For example, the type \cd{Fin 3} has exactly three values:
\begin{piforall}
x1 : Fin 3
x1 = Succ [2] (Succ [1] (Zero [0]))
x2 : Fin 3
x2 = Succ [2] (Zero [1])
x3 : Fin 3
x3 = Zero [2]
\end{piforall}
With the \cd{Fin} type, we can safely index vectors. The following
function is guaranteed to return a result because the index is within
range.
\begin{piforall}
nth : [a:Type] -> [n:Nat] -> Vec a n -> Fin n -> a
nth = \[a][n] v f. case f of
  Zero [m] -> case v of
    Cons [m'] x xs -> x
  Succ [m] f' -> case v of
    Cons [m'] x xs -> nth [a][m] xs f'
\end{piforall}
Note how, in the case analysis above, neither \cd{case} requires a branch for
the \cd{Nil} constructor. The \pif type checker can verify that the \cd{Cons}
case is exhaustive and that the \cd{Nil} case would be inaccessible. In these
cases, the type checker notes that the patterns for the missing branches don't
make sense: constructing the \cd{Nil} pattern would require satisfying the
constraint that \cd{Zero = Succ m}, which is impossible.
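The same reasoning lets us write other total operations on non-empty vectors.
For example, a safe \cd{head} function can omit the \cd{Nil} branch entirely,
because matching \cd{Nil} against a vector of length \cd{Succ n} would require
satisfying \cd{Succ n = Zero}. (The following is only a sketch; the
corresponding definition in the repository may differ in its details.)
\begin{piforall}
head : [A:Type] -> [n:Nat] -> Vec A (Succ n) -> A
head = \[A][n] v. case v of
  Cons [m] x xs -> x
\end{piforall}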
You may have noticed that some of the arguments to functions and constructors in the examples above are enclosed in square brackets. These arguments are \emph{irrelevant} in \pif. What this means is that the compiler can erase these arguments before running the program and can ignore these arguments when comparing types for equality. (When arguments are marked in this way, the type checker must ensure that this erasure is sound and the argument is never actually used in a meaningful way.)
Similarly, the constraints that are part of datatype definitions, such as \cd{n = Succ m} above, are enclosed in
square brackets because there is no runtime proof of the equality stored with the data constructor.
\subsection{Homework: Check out \pif}
The files in the directory \texttt{full/pi/} demonstrate the full power of the
\pif language. Take a little time to familiarize yourself with some of these
modules (such as \cd{Fin.pi} and \cd{Vec.pi}) and compare them to similar
ones that you might have seen in Agda, Haskell, or elsewhere.
To run the type checker on these modules, make sure that you have first
compiled the implementation using \texttt{stack build} at the terminal. Then,
to run \pif on a source file, such as \cd{Fin.pi}, you can issue the command
\texttt{stack exec -{}- pi-forall Fin.pi} in the \cd{full/pi} subdirectory. If
the file type checks, you will see the contents of the file displayed in the
terminal window. Otherwise, you will see an error message from the type checker.
For a more extensive series of examples, start with \cd{pi/Lambda0.pi} for an
interpreter for a small lambda calculus and then compare it with the
implementations in \cd{pi/Lambda1.pi} and \cd{pi/Lambda2.pi}. These two
versions index the expression datatype by the scope depth (a natural number)
and by the expression's type, in order to eliminate run-time scope and type
errors from the execution of the interpreter.
\subsubsection{Homework: Church and Scott encodings}
The file \texttt{full/pi/NatChurch.pi} is a start at a Church encoding of
natural numbers. Replace the \cd{TRUSTME}s in this file so that it
compiles. For example, one task in this file is to define a predecessor
function. By replacing the \cd{TRUSTME} with \cd{Refl} below, you will be able
to force the type checker to normalize both sides of the equality. The code
will only type check if you have defined \cd{pred} correctly (on this
example).
\begin{piforall}
test_pred : pred two = one
test_pred = TRUSTME -- replace with Refl
\end{piforall}
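As a starting point, recall that the Church encoding represents a natural
number by its own iterator: a number takes a result type, a value for zero,
and a function for successor. The basic definitions might look something like
the following sketch (not necessarily identical to the definitions in
\texttt{NatChurch.pi}).
\begin{piforall}
nat : Type
nat = (A:Type) -> A -> (A -> A) -> A

zero : nat
zero = \A z s. z

succ : nat -> nat
succ = \n. \A z s. s (n A z s)
\end{piforall}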
\section{A Simple Core Language}
\label{sec:simple}
Now let's turn to the design and implementation of the \pif language
itself. We'll start with a small core, and then incrementally add features
throughout the rest of this tutorial.
Consider this simple dependently-typed lambda calculus, shown below. What should it
contain? At the bare minimum we start with the following five forms:
\[
\begin{array}{rcll}
[[a]],[[b]],[[A]],[[B]] & ::=& [[x]] &\mbox{ variables }\\
&&[[\x. a]] &\mbox{ lambda expressions (anonymous functions)} \\
&&[[a b]] &\mbox{ function applications }\\
&&[[(x:A) -> B]] &\mbox{ dependent function type, aka $\Pi$-types }\\
&&[[Type]] &\mbox{ the `type' of types}\\
\end{array}
\]
As in many dependently-typed languages, we have the \emph{same} syntax for
both expressions and types. For clarity, the convention is to use lowercase
letters ($a$,$b$) for expressions and uppercase letters ($A$,$B$) for types.
Note that $\lambda$ and $\Pi$ above are \emph{binding forms}. They bind the
variable $x$ in $a$ and $B$ respectively. In dependent function types
$[[(x:A) -> B]]$, if $x$ does not appear free in $B$, it is conventional
to write the type as $[[A -> B]]$.
\subsection{When do expressions in this language type check?}
We define the type system for this language using an inductive
relation shown in Figure~\ref{fig:typing}. This relation is between an
expression $[[a]]$, its type $[[A]]$, and a typing context $[[G]]$.
\[ \fbox{$[[ G |- a : A ]]$} \]
A typing context $[[G]]$ is an ordered list of assumptions about variables and
their types.
\[ [[ G]] ::= \emptyset\ |\ [[G, x : A]] \]
We will assume that the variables in this list are distinct from each other,
so that there is always at most one assumption
about any variable's type.\footnote{On paper, this assumption is reasonable
because we always extend the context with variables that come from binders,
and we can always rename bound variables so that they differ from other
variables in scope. Any implementation of the type system will need to somehow
make sure that this invariant is maintained. We will do this in Section~\ref{sec:implementation} using
a freshness monad from the \unbound library.}
\paragraph{An initial set of typing rules: Variables and Functions}
\begin{figure}[t]
\drules[t]{$[[G|-a:A]]$}{Core type system}{var,lambda,app,pi,type}
\caption{Typing rules for core system}
\label{fig:typing}
\end{figure}
Consider the first two rules shown in Figure~\ref{fig:typing}, \rref{t-var}
and \rref{t-lambda}. The first rule states that the types of variables are
recorded in the typing context. The premise $[[x:A elem G]]$ requires us to
find an association between $x$ and the type $A$ in the list $[[G]]$.
The second rule, for type checking functions, introduces a new variable into
the context when we type check the body of a $\lambda$-expression. It also
requires $[[A]]$, the type of the function's argument, to be a valid type.
\paragraph{Example: Polymorphic identity functions}
Note that in \rref{t-lambda}, the parameter $x$ is allowed to appear in the
result type of the function, $[[B]]$. Why is this useful? Well, it gives us
\emph{parametric polymorphism} automatically. In Haskell, we write the
identity function as follows, annotating it with a polymorphic type.
\begin{haskell}
id :: forall a. a -> a
id = \y -> y
\end{haskell}
\noindent
Because the type of \cd{id} is generic we can apply this function to any type of
argument.
We can also write a polymorphic identity function in \pif, as follows.
\begin{piforall}
id : (x:Type) -> (y : x) -> x
id = \x. \y. y
\end{piforall}
This definition is similar to Haskell, but with a few modifications. First,
\pif{} uses a single colon for type annotations and a \cd{.} for function
binding instead of Haskell's quirky \cd{::} and \cd{->}. Second, the \pif
definition of \texttt{id} uses two lambdas: one for the ``type'' argument $x$
and one for the ``term'' argument $y$. In \pif, there is no syntactic
difference between these arguments: both are arguments to the identity
function.
Finally, the \pif{} type of \cd{id} uses the dependent function type, with
general form $[[(x:A) -> B]]$. This form is like Haskell's usual function type
$[[A -> B]]$, except that we can name the argument $[[x]]$ so that it can be
referred to in the body of the type $[[B]]$. In \cd{id}, dependency allows
the type argument $[[x]]$ to be used later as the type of $[[y]]$. We call
types of this form $\Pi$-types. (Often dependent function types are written as
$\Pi x\!:\!A. B$ in formalizations of dependent type theory.\footnote{The
terminology for these types is muddled: sometimes they are called dependent
function types and sometimes they are called dependent product types. We use
the non-$\Pi$ notation to emphasize the connection to functions.})
The fact that the type of $x$ is $[[Type]]$ means that $x$ plays the role of a
type variable, such as \texttt{a} in the Haskell type. Because we don't have a
syntactic distinction between types and terms, in \pif we say that ``types''
are anything of type $[[Type]]$ and ``terms'' are things of type $A$ where $A$
has type $[[Type]]$.
We can use the typing rules to construct a typing derivation for the identity
function as follows.
\[
\inferrule*
{
\inferrule*
{
\inferrule*{ [[y:x elem (x:Type,y:x)]]}{[[x:Type, y:x |- y : x]]}
\qquad \inferrule*{[[ x:Type elem (x:Type) ]] }{[[ x:Type |- x : Type ]]}
}{
[[x:Type |- \y. y : (y : x) -> x]]
} \qquad \inferrule*{}{[[ |- Type : Type ]]}
}{
[[ |- \x. \y. y : (x:Type) -> (y : x) -> x ]]
}
\]
This derivation uses \rref{t-lambda} to type check the two
$\lambda$-expressions, before using the variable rule to ensure that both the
body of the function $[[y]]$ and its type $[[x]]$ are well-typed. Finally, the
derivation also uses \rref{t-type} to show that $[[Type]]$ itself is a valid
type, as discussed below.
Note that \rref{t-lambda} makes subtle use of the fact that bound variables can
be freely renamed to alpha-equivalent versions. In the conclusion of the rule,
the $[[x]]$ in the body of the lambda abstraction $[[a]]$ is a different binder
from the $[[x]]$ in the body of the function type $[[B]]$. Both of these terms
can be renamed so that the two occurrences of $[[x]]$ do not have to match.
But, above the line, there is only a single $[[x]]$ that appears free in both
$[[a]]$ and $[[B]]$! If we wanted to be explicit about what is going on, we
could also write the rule with an explicit substitution:
\[ \drule{t-altlambda} \]
Most type theorists do not bother with this version and prefer the one that
reuses the same binder. We will do the same here, and in other similar
rules. However, we will also have to be careful about how we implement this
rule in the implementation of our type checker!
\paragraph{More typing rules: Types}
Observe in the typing derivation above that the rule for typing
$\lambda$-expressions has a second precondition: we need to make sure that
when we add an assumption $x:A$ to the context, $A$ really is a type.
Without this precondition, the rules would allow us to derive this nonsensical
type for the polymorphic identity function.
\[ [[ |- \x.\y. y : (x: True) -> (y:x) -> x ]] \]
This precondition means that we need some rules that conclude that types are
actually types. For example, the type of a function is itself a type, so we
will declare it so with \rref{t-pi}. This rule also ensures that the domain
and range of the function type are themselves types.
Likewise, for polymorphism, we need the somewhat perplexing \rref{t-type}, which
declares, by fiat, that $[[Type]]$ is a valid $[[Type]]$.\footnote{Note that
this rule makes our language inconsistent as a logic, as it can encode
a logical paradox. The \cd{full} implementation includes a demonstration of
this paradox in the file \cd{Hurkens.pi}.}
\paragraph{More typing rules: Applications}
The application rule, \rref{t-app}, requires that the type of the argument
matches the domain type of the function. However, note that because the body
of the function type $B$ could have $x$ free in it, we also need to
substitute the argument $[[b]]$ for $x$ in the result.
\paragraph{Example: applying the polymorphic identity function}
In \pif we should be able to apply the polymorphic identity function to
itself. When we do this, we need to first provide the type of \cd{id},
producing an expression of type
$[[(y: ((x:Type) -> (y:x) -> x)) -> ((x:Type) -> (y:x) -> x)]]$. This
function can take \cd{id} as an argument.
\begin{piforall}
idid : (x:Type) -> (y : x) -> x
idid = id ((x:Type) -> (y : x) -> x) id
\end{piforall}
\paragraph{Example: Church booleans}
Because we have (impredicative) polymorphism, we can \emph{encode} familiar
types, such as booleans. The idea behind this encoding, called the Church
encoding, is to represent terms by their eliminators. In other words, what is
important about the value true? The fact that when you get two choices, you
pick the first one. Likewise, false ``means'' that with the same two choices,
you should pick the second one. With parametric polymorphism, we can give the
two terms the same type, which we'll call ``bool''.
\begin{piforall}
true : (A:Type) -> A -> A -> A
true = \x. \y. \z. y
false : (A:Type) -> A -> A -> A
false = \x. \y. \z. z
\end{piforall}
\noindent
Thus, a conditional expression just takes a boolean and returns it.
\begin{piforall}
cond : ((A:Type) -> A -> A -> A) -> ((x:Type) -> x -> x -> x)
cond = \b. b
\end{piforall}
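With this encoding, other boolean operations are just rearrangements of the two
choices. For example, negation swaps them. (This is a small illustration only;
it is not one of the repository's examples.)
\begin{piforall}
not : ((A:Type) -> A -> A -> A) -> ((A:Type) -> A -> A -> A)
not = \b. \A. \x. \y. b A y x
\end{piforall}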
\paragraph{Example: void type}
The type \cd{(x:Type) -> x}, called ``void'', is usually an \emph{empty type}
in polymorphic languages. Observe that a function of this type takes a type
\cd{x:Type}, but never takes any expressions of type \cd{x}. Thus, there is no
way for it to return a value of that type without some sort of type case (not
part of \pif).
However, because \pif does not enforce termination, we can define a value that
has this type.
\begin{piforall}
loop : (x:Type) -> x
loop = \x . loop x
\end{piforall}
\paragraph{Typing invariant}
Overall, a property that we want for our type system is that if a term type
checks, then its \emph{type} will also type check. More formally, to state
this property, we first need to say what it means to check all types in a
context.
\begin{definition}[Context type checks]
Define \fbox{$[[|-G]]$} with the following rules. \[ \drule{g-nil} \qquad \drule{g-cons} \]
\end{definition}
\noindent
With this judgement, we can state the following property of our type system.
\begin{lemma}[Regularity]
If $[[G |- a : A ]]$ and $[[|- G]]$, then $[[ G |- A : Type]]$.
\end{lemma}
\noindent
Making sure that this property holds for the type system is the motivation for the
$[[G |- A : Type]]$ premise in \rref{t-lambda}.
\section{From typing rules to a typing algorithm}
\label{sec:bidirectional}
The rules that we have developed so far are great for saying \emph{what} terms
should type check, but they don't say \emph{how} type checking works. We have
developed these rules without thinking about how we would implement them.
A set of typing rules, or \emph{type system}, is called \emph{syntax-directed}
if it is readily apparent how to interpret the collection of typing rules as
code. In other words, for some type systems, we can directly translate the
rules into a Haskell function.
For this type system, we would like to implement the following Haskell
function, which when given a term and a typing context, represented as a list
of pairs of variables and their types, produces the type of the term if it
exists.\footnote{Note: If you are looking at the \pif implementation, note
that this is not the final type of \cd{inferType}.}
\begin{haskell}
type Ctx = [(Var,Type)]
inferType :: Term -> Ctx -> Maybe Type
\end{haskell}
Let us look at our rules in Figure~\ref{fig:typing}. Is the definition of this
function straightforward? For example, in the variable rule, as long as we
can look up the type of a variable in the context, we can produce its
type. That means that, assuming that there is some function \cd{lookupTy},
with type \cd{Ctx -> Var -> Maybe Type}, this rule corresponds to the
following case of the \cd{inferType} function. So far so good!
\begin{haskell}
inferType (Var x) ctx = lookupTy ctx x
\end{haskell}
\noindent
Likewise, the case of the typing function for the $[[Type]]$ term is also
straightforward. When we see the term $[[Type]]$, we know immediately that it
is its own type.
% \begin{haskell}
% inferType TyType ctx = Just TyType
% \end{haskell}
The only stumbling block for the algorithm is \rref{t-lambda}. To type check a
function, we need to type check its body when the context has been extended
with the type of the argument. But, as in Haskell, the type of the argument
$[[A]]$ is not annotated on the function in \pif. So where does it come from?
There is actually an easy fix to turn our current system into an algorithmic
one. We could just annotate $\lambda$-expressions with the types of the
abstracted variables. But, to be ready for future extension, we will do
something else.
Look at our example \pif code above: the only types that we wrote were the
types of definitions. It is good style to do that. Furthermore, there is
enough information there for type checking---wherever we define a function, we
can look at those types to know what type its argument should have. So, by
changing our point of view, we can get away without annotating lambdas with
those argument types.
\subsection{A bidirectional type system}
Let us redefine the system using two judgments. The first one is similar to the
judgment that we saw above, and we will call it type
\emph{inference}.\footnote{The term \emph{type inference} usually refers to
much more sophisticated deduction of an expression's type, in the context of
much less information encoded in the syntax of the language. We are not doing
anything difficult here, just noting that we can read the judgment with
$[[A]]$ as an output. } This judgment will be paired with (and will depend
on) a second judgment, called type \emph{checking}, that takes advantage of
known type information, such as the annotations on top-level definitions.
We express these judgments using the notation defined in
Figure~\ref{fig:bidirectional} and implement them in Haskell using the
mutually-recursive functions \texttt{inferType} and
\texttt{checkType}. Furthermore, to keep track of which rule is in which
judgment, rules that have inference as conclusion start with
\textsc{i-} and rules that have checking as conclusion start with
\textsc{c-}.
\begin{figure}
\drules[i]{$[[G |- a => A]]$}{in context $[[G]]$, infer that term $a$ has type $A$}
{var,app-simple,pi,type,annot}
\drules[c]{$[[G |- a <= A]]$}{in context $[[G]]$, check that term $a$ has type $A$}
{lambda,infer-simple}
\caption{(Simple) Bidirectional type system}
\label{fig:bidirectional}
\end{figure}
Let us compare these rules with our original typing rules. For \rref{i-var},
we only need to change the colon to an inference arrow. The context tells us
the type to infer.
On the other hand, in \rref{c-lambda} we should check $\lambda$-expressions
against a known type. If that type is provided, we can propagate it to the
body of the lambda expression. (If we assume that the type provided to
the checking judgment has already been checked to be a good type, then
we can skip checking that $A$ is a $[[Type]]$.)
The rule for applications, \rref{i-app-simple}, is in inference mode. Here, we
first infer the type of the function, but once we have that type, we may use
it to check the type of the argument. This mode change means that
$\lambda$-expressions that are arguments to other functions (like
\texttt{map}) do not need any annotation.
For example, if map has type
\begin{haskell}
map : (x : Type) -> (y: Type) -> (x -> y) -> List x -> List y
\end{haskell}
then in an application, we can synthesize the type of the expression
\begin{haskell}
map Nat Nat (\x.x)
\end{haskell}
as \cd{List Nat -> List Nat}.
After the two type applications, we will end up checking the function
argument against the type \cd{Nat -> Nat}.
For types, it is apparent that their type is $[[Type]]$, so \rref{i-pi,i-type}
simply infer that.
Notice that this system is incomplete. There are inference rules for every
form of expression except for lambda. On the other hand, only lambda
expressions can be checked against types. We can make the checking judgment
more applicable with \rref{c-infer-simple}, which allows us to use inference
whenever a checking rule doesn't apply. This rule only applies when the
term is not a lambda expression.
Now, let us think about the reverse problem a bit. There are programs that the
checking system won't admit but would have been accepted by our first
system. What do they look like?
Well, they involve applications of explicit lambda terms:
\[
\inferrule*[Right=t-app]
{
[[ |- \x.x : Bool -> Bool]] \qquad [[|- True : Bool ]]
}{
[[ |- (\x.x) True : Bool ]]
}
\]
This term doesn't type check in the bidirectional system because application
requires the function to have an inferable type, but lambdas don't.
%
However, there is not that much need to write such terms. We can always
replace them with something equivalent by doing a $\beta$-reduction of the
application (in this case, just replace the term with $[[True]]$).
In fact, the bidirectional type system has the property that it only checks
terms in \emph{normal} form, i.e., those that do not contain any
$\beta$-reductions.
\paragraph{Type annotations}
To type check non-normal forms, we add type annotations as a new form of
expression to \pif, written $[[(a : A)]]$, and add \rref{i-annot} to
the type system.
Type annotations allow us to supply known type information anywhere, not just
at the top level. For example, we can construct this derivation.
\[
\inferrule*[Right=i-app]
{
[[ |- (\x.x : Bool -> Bool) => Bool -> Bool]] \qquad [[|- True <= Bool ]]
}{
[[ |- (\x.x : Bool -> Bool) True => Bool ]]
}
\]
The nice thing about the bidirectional system is that it reduces the number of
annotations that are necessary in programs that we want to write. As we will
see, checking mode will be even more important as we add more terms to the
language.
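To make the mode discipline concrete, here is one way the judgments of
Figure~\ref{fig:bidirectional} could be transcribed into the mutually
recursive functions \cd{inferType} and \cd{checkType}. This is only a sketch,
written against a hypothetical first-order representation of terms with string
names, naive substitution, and purely syntactic type equality; the actual
\texttt{version1} type checker, described in
Section~\ref{sec:implementation}, uses the \unbound library and a
type-checking monad instead.
\begin{lstlisting}[language=Haskell]
-- A sketch only: string names, capture-unsafe substitution, and purely
-- syntactic type equality, just to expose the mode structure of the rules.
type Var  = String
type Type = Term
type Ctx  = [(Var, Type)]

data Term
  = Var Var              -- x
  | Lam Var Term         -- \x. a   (note: no argument annotation)
  | App Term Term        -- a b
  | TyPi Var Type Type   -- (x : A) -> B
  | TyType               -- Type
  | Ann Term Type        -- (a : A)
  deriving (Eq, Show)

lookupTy :: Ctx -> Var -> Maybe Type
lookupTy ctx x = lookup x ctx

-- substitute b for x (ignoring variable capture, for brevity)
subst :: Var -> Term -> Term -> Term
subst x b (Var y)          = if x == y then b else Var y
subst x b (Lam y a)        = if x == y then Lam y a else Lam y (subst x b a)
subst x b (App a1 a2)      = App (subst x b a1) (subst x b a2)
subst x b (TyPi y tyA tyB) =
  TyPi y (subst x b tyA) (if x == y then tyB else subst x b tyB)
subst _ _ TyType           = TyType
subst x b (Ann a tyA)      = Ann (subst x b a) (subst x b tyA)

inferType :: Term -> Ctx -> Maybe Type
inferType (Var x)          ctx = lookupTy ctx x            -- rule i-var
inferType TyType           _   = Just TyType               -- rule i-type
inferType (TyPi x tyA tyB) ctx = do                        -- rule i-pi
  checkType tyA TyType ctx
  checkType tyB TyType ((x, tyA) : ctx)
  Just TyType
inferType (Ann a tyA)      ctx = do                        -- rule i-annot
  checkType a tyA ctx
  Just tyA
inferType (App a b)        ctx = do                        -- rule i-app-simple
  tyFun <- inferType a ctx
  case tyFun of
    TyPi x tyA tyB -> do checkType b tyA ctx
                         Just (subst x b tyB)
    _              -> Nothing
inferType (Lam _ _)        _   = Nothing  -- bare lambdas cannot be inferred

checkType :: Term -> Type -> Ctx -> Maybe ()
checkType (Lam x a) (TyPi y tyA tyB) ctx =                 -- rule c-lambda
  checkType a (subst y (Var x) tyB) ((x, tyA) : ctx)
checkType (Lam _ _) _ _ = Nothing
checkType a tyA ctx = do                                   -- rule c-infer-simple
  tyA' <- inferType a ctx
  if tyA == tyA' then Just () else Nothing  -- syntactic equality only, for now
\end{lstlisting}
Note how the \cd{App} case switches from inference to checking mode for the
argument, and how \cd{checkType} falls back to \cd{inferType} (followed by an
equality test) whenever the term is not a lambda expression.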
Furthermore, we want to convince ourselves that the bidirectional system
checks the same terms as the original type system. This means that we want to
prove a property like this one\footnote{This result requires the addition of a
typing rule for annotated terms $[[(a:A)]]$ to the original system.}:
\begin{lemma}[Correctness of the bidirectional type system]
If $[[G |- a => A]]$ and $[[|- G]]$ then $[[G |- a : A]]$.
If $[[G |- a <= A]]$ and $[[|- G]]$ and $[[G |- A : Type]]$ then $[[G |- a : A]]$.
\end{lemma}
On the other hand, the reverse property is not true. Even if there exists some
typing derivation $[[G |- a : A]]$ for some term $[[a]]$, we may not be able
to infer or check that term using the algorithm. However, all is not lost:
there will always be some term $[[a']]$ that differs from $[[a]]$ only in the
addition of typing annotations that can be inferred or checked instead.
One issue with this bidirectional system is that it is not closed under
substitution. What this means is that given some term $[[b]]$ with a free
variable, $[[ G, x : A |- b <= B]]$, and another term $[[a]]$ that checks at
the type of that variable, $[[ G |- a <= A]]$, we \emph{do not} necessarily
have a derivation $[[ G |- b[a/x] <= B[a/x] ]]$. The reason is that the types
of variables are always inferred, but the term $[[a]]$ may need the type
$[[A]]$ to be supplied to the type checker. This issue is particularly
annoying in \rref{i-app} when we replace a variable (inference mode) with a
term that was validated in checking mode.
As a result, our type system infers types, but the types that are inferred do
not have enough annotations to be checked themselves. We can express the
property that does hold of our system with the following lemma:
\begin{lemma}[Regularity]
If $[[G |- a => A]]$ and $[[|- G]]$ then $[[G |- A : Type]]$.
\end{lemma}
This issue is not significant, and we could resolve it by adding an annotation
before substitution. However, to reduce clutter, our implementation of \pif
does not do so.
\subsection{Homework: Add Booleans and
\(\Sigma\)-types (Part I)}
Some fairly standard typing rules for booleans are shown below.
\drules{$[[G|-a:A]]$}{Booleans}{t-bool,t-true,t-false}
\[ \drule[width=4in]{t-if-simple} \]
Likewise, we can also extend the language with $\Sigma$-types, or
products, where the type of the second component of the product can depend on the value of the first component.
\drules{$[[G|-a:A]]$}{$\Sigma$-types}{t-sigma,t-pair}
\[ \drule[width=4in]{t-letpair-weak} \]
We destruct $\Sigma$-types using pattern matching. The simple rule for pattern
matching introduces variables into the context when type checking the body of
the \textsf{let} expression. These variables are not allowed to appear free
in the result type of the pattern match.
For your exercise, rewrite the rules above in bidirectional style. Which rules
should be inference rules? Which ones should be checking rules? If you are
familiar with other systems, how do these rules compare?
\section{Putting it all together in a Haskell
implementation}
\label{sec:implementation}
In the previous section, we defined a bidirectional type system for a small
core language. Here, we will start talking about what the implementation of this
language looks like in Haskell.
First, an overview of the main files of the implementation. There are a few
more source files in the repository (in the subdirectory \cd{src/}), but these
are the primary ones.
\begin{verbatim}
Syntax.hs - specification of the AST
Parser.hs - turn strings into AST
PrettyPrint.hs - displays AST in a (somewhat) readable form
Main.hs - runs the parser and type checker
Environment.hs - defines the type checking monad
TypeCheck.hs - implementation of the bidirectional type checker
\end{verbatim}
\subsection{Variable binding using the \unbound library
{[}Syntax.hs{]}}
One difficulty with implementing any sort of lambda calculus is the treatment
of variable binding. Functions ($\lambda$-expressions) and their types
($\Pi$-types) \emph{bind} variables. In the implementation of our type
checker, we will need to be able to determine whether two terms are
\emph{alpha-equivalent}, calculate the \emph{free variables} of a term, and
perform \emph{capture-avoiding substitution.} When we work with a
$\lambda$-expression, we will want to be sure that the binding variable is
\emph{fresh}, that is, distinct from all other variables in the program or in
the typing context.
In the \pif implementation, we use the \unbound library to get all of these
operations (mostly) for free. This works because we use type constructors from
this library in the definition of the abstract syntax of \pif, and those types
specify the binding structure of \pif expressions.
First, the \unbound library defines a type for variable names, called
\cd{Name}, and we use this type to define \cd{TName}, the type of term
names in our AST.
%
\begin{verbatim}
-- | Type used for variable names in Terms
type TName = Unbound.Name Term
\end{verbatim}
%
The \cd{Unbound.Name} type is indexed by \cd{Term}, the AST type that it is a
name for, whose definition we will see shortly. That way \unbound can make
sure that we only substitute the right form of syntax for the right name,
i.e., we can only replace a \cd{TName} with a \cd{Term}. The library
includes an overloaded function \cd{subst\ x\ a\ b}, which corresponds to
the substitution we have been writing as $[[ b[a/x] ]]$ in our inference
rules,\footnote{See Guy Steele's talk about the notation for
substitution~\cite{steele:ppop17}. This is the most common mathematical
notation for this operation.} i.e., substitute $[[a]]$ for $[[x]]$ in
$[[b]]$.
In general, we want to apply a substitution to many different sorts of
syntax. For example, we may want to substitute a term $[[a]]$ for a term name
$[[x]]$ in all of the terms that appear in the typing context. Or there may be
other sorts of syntax that terms can be part of, and we want to be able to
substitute for term variables through that syntax too. Therefore, \unbound's
substitution function has a type with the following general pattern for
\cd{subst a x b}.
\begin{lstlisting}[language=Haskell]
class Subst a b where
  subst :: Name a -> a -> b -> b
\end{lstlisting}
The type of the \cd{subst} function in this class ensures that \cd{a} is the
right sort of thing to stick in for the name \cd{x}. The library
can automatically generate instances of the \cd{Subst} class.
With term names, we can define the abstract syntax that corresponds to our
language above, using the following datatype.
\begin{lstlisting}[language=Haskell]
data Term
  = -- | type of types `Type`
    TyType
  | -- | variables `x`
    Var TName
  | -- | abstractions `\x. a`
    Lam (Unbound.Bind TName Term)
  | -- | applications `a b`
    App Term Term
  | -- | function types `(x : A) -> B`
    TyPi Type (Unbound.Bind TName Type)
  | -- | Annotated terms `( a : A )`
    Ann Term Type
  ...
  deriving (Show, Generic)
\end{lstlisting}
As you can see, variables are represented by names. \unbound's \cd{Bind}
type constructor declares the scope of the bound variables. Both \cd{Lam}
and \cd{TyPi} bind a single variable in a \cd{Term}. However, in a
\cd{TyPi} term, we also want to store the type $[[A]]$, the type of the bound
variable $[[x]]$.
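For instance, the polymorphic identity function $[[\x. \y. y]]$ from
Section~\ref{sec:simple} could be built as a \cd{Term} along the following
lines. (This is only a sketch: in the real implementation such terms are
produced by the parser, and fresh names are generated by a freshness monad
rather than with \cd{string2Name}.)
\begin{lstlisting}[language=Haskell]
-- 'string2Name' makes a Name from a string; 'bind' pairs a name with
-- the body in which that name is in scope.
idTm :: Term
idTm = Lam (Unbound.bind x (Lam (Unbound.bind y (Var y))))
  where
    x, y :: TName
    x = Unbound.string2Name "x"
    y = Unbound.string2Name "y"
\end{lstlisting}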
Because the syntax is all shared, a \cd{Type} is just another name for a
\cd{Term}. We will use this name in the source code just for documentation.
%
\begin{lstlisting}[language=Haskell]
type Type = Term
\end{lstlisting}
%
Among other things, the \cd{Alpha} class enables functions for alpha
equivalence and free variable calculation, with the types shown below.
Usually \unbound can derive instances of this class for us so that we don't
have to worry about defining these functions manually.