Other than the type universes and Pi types, inductively defined types provide the only means of defining new types in the Calculus of Inductive Constructions. We have also seen that, fundamentally, the constructors and the recursors provide the only means of defining functions on these types. By the propositions-as-types correspondence, this means that induction is the fundamental method of proof for these types.
Working with induction and recursion is therefore fundamental to working in the Calculus of Inductive Constructions. For that reason Lean provides more natural ways of defining recursive functions, performing pattern matching and writing inductive proofs. Behind the scenes, these are “compiled” down into recursors and auxiliary definitions we covered in previous chapters.
The cases_on
recursor can be used to define functions and prove
theorems by cases. Complicated definitions may use several nested
cases_on
applications, and may be hard to read and understand.
Pattern-matching is a more convenient and standard description technique
or defining functions and proving theorems. Actually, Lean supports an extension of
pattern-matching called dependent pattern-matching. Internally,
dependent pattern-matching is compiled using cases_on
, no_confusion
and eq.rec
. Thus, the compiler is not part of the trusted code
base. A pattern matching definition is of the following form
definition [name] [parameters] : [domain] → [codomain],
[name] [patterns_1] := [value_1],
...
[name] [patterns_n] := [value_n]
The parameters are fixed, and each assignment defines the value of the function
for a different case specified by the given pattern. As a first example, we
define the function sub2
for natural numbers:
open nat
definition sub2 : nat → nat,
sub2 0 := 0,
sub2 1 := 0,
sub2 (a+2) := a
example : sub2 5 = 3 :=
rfl
The default compilation method guarantees that the pattern matching “equations” hold definitionally.
open nat
definition sub2 : nat → nat,
sub2 0 := 0,
sub2 1 := 0,
sub2 (a+2) := a
-- BEGIN
example : sub2 0 = 0 :=
rfl
example : sub2 1 = 0 :=
rfl
example (a : nat) : sub2 (a + 2) = a :=
rfl
-- END
We can use the command print definition
to inspect how our definition was compiled into
recursors.
open nat
definition sub2 : nat → nat,
sub2 0 := 0,
sub2 1 := 0,
sub2 (a+2) := a
-- BEGIN
print definition sub2
-- END
We say a term is a constructor application if it is of the form c a_1 ... a_n
where
c
is the constructor of some inductive datatype.
Note that in the definition sub2
, the terms 1
and a+2
are not constructor applications.
However, the compiler normalizes them at compilation time, and obtains the constructor applications
succ zero
and succ (succ a)
respectively. This normalization step is just a simple convenience
that allows us to write definitions resembling the ones found in textbooks.
Note that, there is no “magic”, the compiler is just using the kernel normalizer/evaluator.
If we had written 2+a
, the definition would be rejected since 2+a
does not normalize into a
constructor application.
Next, we use pattern-matching for defining Boolean negation neg
, and proving that neg (neg b) = b
.
open bool
definition neg : bool → bool,
neg tt := ff,
neg ff := tt
theorem neg_neg : ∀ (b : bool), neg (neg b) = b,
neg_neg tt := rfl, -- proof for neg (neg tt) = tt
neg_neg ff := rfl -- proof for neg (neg ff) = ff
As described in previous chapters, Lean inductive datatypes can be parametric.
The following example defines the tail
function using pattern matching.
The argument A : Type
is a parameter and occurs before :
to indicate it
does not participate in the pattern matching. Lean allows parameters to occur
after :
, but it cannot pattern match on them.
import data.list
open list
definition tail {A : Type} : list A → list A,
tail nil := nil,
tail (h :: t) := t
-- Parameter A may occur after ':'
definition tail2 : Π {A : Type}, list A → list A,
tail2 (@nil A) := (@nil A),
tail2 (h :: t) := t
-- @ is allowed on the left-hand-side
definition tail3 : Π {A : Type}, list A → list A,
@tail3 A nil := nil,
@tail3 A (h :: t) := t
-- A is explicit parameter
definition tail4 : Π (A : Type), list A → list A,
tail4 A nil := nil,
tail4 A (h :: t) := t
The default compilation method supports structural recursion:
recursive applications where one of the arguments is a subterm of the
corresponding term on the left-hand-side. Later, we describe
how to compile recursive equations using well-founded recursion.
The main advantage of the default compilation method is that the
recursive equations hold definitionally.
Our first recursive example is the Fibonacci function fib
, and the fib_pos
theorem which combines pattern-matching, recursive equations, and calculational
proofs. The theorem fib_pos
makes it clear again that there is no
difference between recursion and induction in Lean.
import data.nat
open nat
definition fib : nat → nat,
fib 0 := 1,
fib 1 := 1,
fib (a+2) := fib (a+1) + fib a
-- The defining equations hold definitionally
example : fib 0 = 1 :=
rfl
example : fib 1 = 1 :=
rfl
example (a : nat) : fib (a+2) = fib (a+1) + fib a :=
rfl
-- fib is always positive
theorem fib_pos : ∀ n, 0 < fib n,
fib_pos 0 := show 0 < 1, from zero_lt_succ 0,
fib_pos 1 := show 0 < 1, from zero_lt_succ 0,
fib_pos (a+2) := calc
0 = 0 + 0 : rfl
... < fib (a+1) + 0 : add_lt_add_right (fib_pos (a+1)) 0
... < fib (a+1) + fib a : add_lt_add_left (fib_pos a) (fib (a+1))
... = fib (a+2) : rfl
Another classical example is the list append
function.
import data.list
open list
definition append {A : Type} : list A → list A → list A,
append nil l := l,
append (h::t) l := h :: append t l
example : append [1, 2, 3] [4, 5] = [1, 2, 3, 4, 5] :=
rfl
All the examples we have seen so far can be easily written
using cases_on
and rec_on
. However, this is not the case
for indexed inductive families such as vector A n
.
A lot of boiler plate code needs to be written to define
very simple functions such as map
, zip
, unzip
using
recursors.
In the next example, we define the indexed inductive family
vector, the tail function and leave as exercise the function map
which maps a function onto each pair of elements coming from
input vectors. We encourage you to try to define map
using rec_on
,
cases_on
and no_confusion
.
namespace hide
-- BEGIN
open nat
inductive vector (A : Type) : nat → Type :=
nil {} : vector A zero,
cons : Π {n}, A → vector A n → vector A (succ n)
open vector
notation h :: t := cons h t
check @vector.cases_on
-- Π {A : Type}
-- {C : Π (a : ℕ), vector A a → Type}
-- {a : ℕ}
-- (n : vector A a),
-- (e1 : C 0 nil)
-- (e2 : Π {n : ℕ} (a : A) (a_1 : vector A n), C (succ n) (cons a a_1)),
-- C a n
definition tail {A : Type} {n : nat} (v : vector A (succ n)) : vector A n :=
vector.cases_on v
(fun (e : zero = succ n), nat.no_confusion e)
(fun (n1 : nat) (h : A) (t : vector A n1) (e : succ n1 = succ n),
nat.no_confusion e (fun n1_eq_n : n1 = n, eq.rec_on n1_eq_n t))
(eq.refl (succ n))
definition map {A B C : Type} (f : A → B → C)
: Π {n : nat}, vector A n → vector B n → vector C n :=
sorry
-- END
end hide
The main difficulty is to maintain the relationship between the indices.
The extra parameter e
in tail
is used to “communicate” the relationship
between n
and index associated with each minor premise.
Moreover, some cases are “unreachable” (e.g., zero = succ n
in the first case at
tail
), and the default way to discard them is using no_confusion
.
The map
function is even more tedious to define. All these functions are trivial
to define using recursive equations. The compiler generates all boiler plate code
automatically for us.
namespace hide
open nat
inductive vector (A : Type) : nat → Type :=
nil {} : vector A zero,
cons : Π {n}, A → vector A n → vector A (succ n)
open vector prod
notation h :: t := cons h t
-- BEGIN
definition head {A : Type} : Π {n}, vector A (succ n) → A,
head (h :: t) := h
definition tail {A : Type} : Π {n}, vector A (succ n) → vector A n,
tail (h :: t) := t
theorem eta {A : Type} : ∀ {n} (v : vector A (succ n)), head v :: tail v = v,
eta (h::t) := rfl
definition map {A B C : Type} (f : A → B → C)
: Π {n : nat}, vector A n → vector B n → vector C n,
map nil nil := nil,
map (a::va) (b::vb) := f a b :: map va vb
-- The automatically generated definitions for indexed families are not straightforward
print definition map
definition zip {A B : Type} : Π {n}, vector A n → vector B n → vector (A × B) n,
zip nil nil := nil,
zip (a::va) (b::vb) := (a, b) :: zip va vb
-- END
end hide
Note that we can omit recursive equations for “unreachable” cases such as head nil
.
We say a set of recursive equations overlap when there is an input that more than
one left-hand-side can match. In the following definition the input 0 0
matches
the left-hand-side of the first two equations. Should the function return 1
or 2
?
open nat
-- BEGIN
definition f : nat → nat → nat,
f 0 y := 1,
f x 0 := 2,
f (x+1) (y+1) := 3
-- END
Overlapping patterns are often used to succinctly express complex patterns in data. Thus, they are allowed in Lean. Lean eliminates the ambiguity by using the first applicable equation. In the example above, the following equations hold definitionally.
open nat
definition f : nat → nat → nat,
f 0 y := 1,
f x 0 := 2,
f (x+1) (y+1) := 3
-- BEGIN
variables (a b : nat)
example : f 0 0 = 1 := rfl
example : f 0 (a+1) = 1 := rfl
example : f (a+1) 0 = 2 := rfl
example : f (a+1) (b+1) = 3 := rfl
-- END
Lean also supports wildcard patterns aka anonymous variables _
. They are useful to create
patterns where we don’t care about the value of a specific argument.
In the function f
defined in the previous section, the values of x
and y
are not used
in the right-hand-side. Here is the same example using wildcards.
open nat
definition f : nat → nat → nat,
f 0 _ := 1,
f _ 0 := 2,
f _ _ := 3
variables (a b : nat)
example : f 0 0 = 1 := rfl
example : f 0 (a+1) = 1 := rfl
example : f (a+1) 0 = 2 := rfl
example : f (a+1) (b+1) = 3 := rfl
Some functional languages support incomplete patterns. In these
languages, the interpreter produces an exception or returns an
arbitrary value for incomplete cases. We can simulate the arbitrary
value approach using inhabited types.
An element of inhabited A
is simply a
witness to the fact that there is an element of A
. Later, we will
see that inhabited
is an instance of a type class
in Lean: Lean
can be instructed that suitable base types are inhabited, and can
automatically infer that other constructed types are inhabited on that
basis. The standard library provides the opaque definition arbitrary
A
for inhabited types. The function arbitrary A
just returns the
witness for A
, but since arbitrary A
is opaque, we cannot rely on
the witness chosen.
We can also use the type option A
to simulate incomplete patterns.
The idea is to return some a
for the provided patterns, and use none
for the incomplete cases.
In the following example we demonstrate both approaches.
open nat option
definition f1 : nat → nat → nat,
f1 0 _ := 1,
f1 _ 0 := 2,
f1 _ _ := arbitrary nat -- "incomplete" case
variables (a b : nat)
example : f1 0 0 = 1 := rfl
example : f1 0 (a+1) = 1 := rfl
example : f1 (a+1) 0 = 2 := rfl
example : f1 (a+1) (b+1) = arbitrary nat := rfl
definition f2 : nat → nat → option nat,
f2 0 _ := some 1,
f2 _ 0 := some 2,
f2 _ _ := none -- "incomplete" case
example : f2 0 0 = some 1 := rfl
example : f2 0 (a+1) = some 1 := rfl
example : f2 (a+1) 0 = some 2 := rfl
example : f2 (a+1) (b+1) = none := rfl
Another complication in dependent pattern matching is that some parts require constructor matching, and others are just report specialization. Lean allows users to mark subterms are inaccessible for parttern matching. These annotations are essential, for example, when a term occurring in the left-hand-side is not a variable nor a constructor application. We can view inaccessible terms as “don’t care” patterns.
An inaccessible subterm can be declared using one of the following two
notations: ⌞t⌟
or ?(t)
. The unicode version is inputed by entering
\cll
(corner-lower-left) and \clr
(corner-lower-right).
In the following example due to Goguen-McBride-McKinna, we declare
an inductive type that defines the property of “being in the image of f
”.
Then, we equip f
with an “inverse”. The typing rules forces us to
write f a
for the first argument, this term is not a variable nor
a constructor application. We can view elements of the type image_of f b
as
evidence that b
is in the image of f
. The constructor imf
is used to
build such evidence.
variables {A B : Type}
inductive image_of (f : A → B) : B → Type :=
imf : Π a, image_of f (f a)
open image_of
definition inv {f : A → B} : Π b, image_of f b → A,
inv ⌞f a⌟ (imf f a) := a
Inaccessible terms can also be used to reduce the complexity of the generated
definition. Depedent pattern matching is compiled using basically the
cases_on
and no_confusion
constructions. The number of cases_on
introduced
by the compiler can be reduced by marking parts that just report specialization.
In the next example, we define the type of finite ordinals fin n
, this type has
n
inhabitants. We also define the function to_nat
that maps a fin n
into a
nat
. If we do not mark n+1
as inaccessible, the compiler will generate a definition
containing two cases_on
expressions. We encourage you to replace
⌞n+1⌟
with (n+1)
and inspect the generated definition using print definition to_nat
.
namespace hide
-- BEGIN
open nat
inductive fin : nat → Type :=
fz : Π n, fin (succ n),
fs : Π {n}, fin n → fin (succ n)
open fin
definition to_nat : Π {n : nat}, fin n → nat,
@to_nat ⌞n+1⌟ (fz n) := zero,
@to_nat ⌞n+1⌟ (fs f) := succ (to_nat f)
-- END
end hide
Lean also provides a compiler for match-with expressions found in many functional languages. It uses essentially the same infrastructure used to compile recursive equations.
import data.list
open nat bool list
-- BEGIN
definition is_not_zero (a : nat) : bool :=
match a with
zero := ff,
succ _ := tt
end
-- We can use recursive equations and match
variable {A : Type}
variable p : A → bool
definition filter : list A → list A,
filter nil := nil,
filter (a :: l) :=
match p a with
tt := a :: filter l,
ff := filter l
end
example : filter is_not_zero [1, 0, 0, 3, 0] = [1, 3] :=
rfl
-- END
In some definitions, we have to help the compiler by providing some implicit arguments explicitly in the left-hand-side of recursive equations. If we don’t provide the implicit arguments, the elaborator is unable to solve some placeholders (aka meta-variables) in the nested match expression.
namespace hide
open nat
inductive vector (A : Type) : nat → Type :=
nil {} : vector A zero,
cons : Π {n}, A → vector A n → vector A (succ n)
open vector prod
notation h :: t := cons h t
-- BEGIN
variables {A B : Type}
definition unzip : Π {n : nat}, vector (A × B) n → vector A n × vector B n,
@unzip zero nil := (nil, nil),
@unzip (succ n) ((a, b)::v) :=
match unzip v with
(va, vb) := (a :: va, b :: vb)
end
example : unzip ((1, 10) :: (2, 20) :: nil) = (1 :: 2 :: nil, 10 :: 20 :: nil) :=
rfl
-- END
end hide
Next, we define the function diag
which extracts the diagonal of a square matrix vector (vector A n) n
.
Note that, this function is defined by structural induction. However, the term map tail v
is not a subterm
of ((a :: va) :: v)
. Could you explain what is going on?
namespace hide
open nat
inductive vector (A : Type) : nat → Type :=
nil {} : vector A zero,
cons : Π {n}, A → vector A n → vector A (succ n)
open vector
notation h :: t := cons h t
-- BEGIN
variables {A B : Type}
definition tail : Π {n}, vector A (succ n) → vector A n,
tail (h :: t) := t
definition map (f : A → B)
: Π {n : nat}, vector A n → vector B n,
map nil := nil,
map (a::va) := f a :: map va
definition diag : Π {n : nat}, vector (vector A n) n → vector A n,
diag nil := nil,
diag ((a :: va) :: v) := a :: diag (map tail v)
-- END
end hide
[TODO: write this section.]