A proof assistant, or theorem prover, is a computer program that checks a mathematical proof for correctness and in some cases generates a proof, or useful steps toward a proof, by itself. In a previous post, we explored how the proof of the Law of Quadratic Reciprocity (henceforth referred to as QR) can be formalised and checked for correctness with proof assistant Mizar. Based on the article “Formal Proof—Getting Started” (Wiedijk [2007]), and expanding the explanation of the worked example presented there, we now explore how mathematical formalisation works in an alternative theorem prover, HOL Light.
1. A bite-sized lemma
In the introductory article, “Formal Proof—Getting Started” (Wiedijk [2007]), Freek Wiedijk selects a mini-lemma related to QR to illustrate interactive proof with the HOL Light theorem prover. In HOL Light, this example lemma is called CONG_MINUS1_SQUARE
and can be found in the file 100/reciprocity.ml
[available online]. For reasons of space, the example lemma is indeed bite-sized—in fact, compared with everyday published versions of mathematical proofs, formalised proofs require far more resources in both space and time (for example, see this article discussing the The De Bruijn Factor; also Wiedijk [2007], p.1408). Nevertheless, our lemma gives a core result of modular arithmetic:
Lemma (bite-sized). For , ,
Example of an everyday proof. Since, by definition of mod, for any integer , by multiplying out we obtain: .
This is how a proof of the lemma might go in everyday maths. However, what seems intuitive to a mathematician is not necessarily suitable for proof formalisation, because a formal proof that can be checked by a proof assistant will need to build on axioms and pre-proved theorems that are accessible to the proof assistant. These will have been chosen for theoretical and practical reasons related to the software design of the proof assistant, and may be very different from everyday background mathematical knowledge that mathematicians implicitly share.
After a brief introduction to basic characteristics of the HOL Light theorem prover, we will go on to see how this bite-sized lemma can be formally proved with HOL Light. For now we just summarise the approach taken: using the equational theorem that , we rewrite the lemma into an equivalent statement and provide existential witnesses for and , hence proving the result.
2. HOL Light: basics
Installation: Detailed instructions on how to install HOL Light are available on its main website and Github page. Unfortunately, none of these approaches worked on my MacBook Pro running OS X 10.11.6 ¹. I am very grateful to the researchers on the HOL theorem-proving system mailing list who pointed out an alternative method using the Nix package manager, which was quick and easy, requiring only the following commands:
$ curl https://nixos.org/nix/install | sh
$ . $HOME/.nix-profile/etc/profile.d/nix.sh
$ nix-env -i hol_light
$ hol_light
This last command takes a few minutes to get HOL Light running, and prints a number of warnings and possibly errors in the meanwhile, but all is when if it ends by presenting the OCaml prompt symbol #
.
Objective Caml (OCaml): OCaml is a statically typed functional programming language that also supports imperative and object-oriented programming. It is the main implementation of the Caml programming language, which itself is a dialect of ML. The recommended package manager for OCaml is OPAM. The OCaml website has resources for learning and practicing OCaml, including an interactive tutorial that runs on a browser. The basics of OCaml as required for HOL Light are also introduced in Section 2 of the HOL Light tutorial. OCaml is the “meta-language” for HOL Light, which means it is used to reason about the logic contained in the theorems that HOL Light is set to prove. Historically, ML itself was developed to program the LCF theorem prover circa 1972. Here is an example of OCaml used to define a simple function at the interactive prompt:
# let successor x = x + 1;;
val successor : int -> int =
# successor 0;;
val it : int = 1
HOL Light basics: The HOL Light tutorial is an excellent starting point for using HOL Light. The HOL Light manual presents an introduction with more detailed explanations. We summarise some central points here:
- Terms and types: Being a theorem prover, HOL Light manipulates and evaluates mathematical expressions and logical assertions. These are called terms. Terms are internally interpreted by HOL Light into a tree-like syntax which is difficult for humans to read—hence the need for an OCaml pretty-printer, such as camlp5, to output mathematical expressions in a human-readable way. Terms in HOL Light fall into four categories: variables; constants; applications; and abstractions. Moreover, very term has a well-defined type; common types include
:num
for natural numbers,:real
for real numbers, and:bool
for assertions that may be true or false. HOL Light performs type-checking to ensure that the types of terms used in mathematical expressions are consistent. (If, during evaluation, a term is encountered that is type-able but whose type is not uniquely determined, HOL Light will assign it a general type automatically). For example, the termx + 1
is an application, of type:num
(terms are entered into the system between backquotes):
# `x + 1`;;
val it : term = `x + 1`
# type_of it;;
val it : hol_type = :num
- Theorems: In HOL Light, a theorem is a type of term that is a logical assertion and furthermore has been proved. Specifically, a term of type
:bool
is called a formula, and a term of typethm
(a theorem) is a formula that has already been proved using methods acceptable to HOL Light. To create a new object of typethm
, one must apply a limited set of primitive rules of inference to the HOL Light axioms or to previously-proved theorems plus possible additional assumptions (for a comprehensive list, see the HOL Light reference). For example, if we assumex = y
andy = z
, then the transitive rule of inference for boolean equalityTRANS
can be used to prove that thenx = z
(where “|-
” symbolises the usual turnstile of syntactic consequence or deducibility):
# TRANS (ASSUME `x:bool = y`) (ASSUME `y:bool = z`);;
val it : thm = x <=> y, y <=> z |- x <=> z
- Quantification as lambda abstraction: Fundamentally, HOL Light’s logic is based on Alonzo Church’s lambda calculus. HOL Light binds variables in mathematical expressions and logical assertions through the notion of lambda abstraction. This will be familiar to any student of the lambda calculus: given variable x and term t, which may or may not contain x, one may construct the new term λx.t. This is now a function that can be applied to an input. For example, if t = 2 * x, then λx.t is the term that represents the function that doubles its argument (in the HOL Light syntax, “
\
” represents λ):
# `\x.2 * x`;;
val it : term = `\x. 2 * x`
# type_of it;;
val it : hol_type = `:num->num`
HOL contains a primitive inference rule BETA
that specifies that abstraction and application are precisely inverse operations², as seen in the example below: (λx.2 * x)x = 2 * x. Furthermore, a variable x inside a term “λx. …” is considered bound, and cannot be replaced by any function that effects variable substitution or instantiation, although free instances of the same variable will be replaced:
# let th = BETA `(\x.2 * x) x`;;
val th : thm = |- (\x. 2 * x) x = 2 * x
# INST [`y:num`,`x:num`] th;;
val it : thm = |- (\x. 2 * x) y = 2 * y
The concept of lambda abstraction is so powerful that it can be used to define the logical quantifiers in HOL Light. For example, to apply the universal quantifier to predicate P is to say that predicate P holds for all arguments, which in HOL Light is the same as saying that P is equal to the constant function with value “true”: P = λx.. Hence, HOL Light defines universal quantifier , which is symbolised in the syntax by “!
“, as λP.(P = λx.) (additional brackets mine for clarity):
# FORALL_DEF;;
val it : thm = |- (!) = (\P. P = (\x. T))
- Proving theorems: HOL Light is in interactive theorem prover: proof “goals” are fed into the system, and reduced by “tactics” into a series of subgoals. By applying primitive inferences to the proofs of the subgoals, HOL Light reconstructs a proof of the original goal. “Tactics” are HOL Light functions that are applied to move from goal g to subgoal s, with the claim that g follows from s although of course not necessarily conversely; indeed, sometimes tactics reduces goals to subgoals that are logically stronger, but procedurally simpler to prove. It is not clear whether there is a checker that ensures a goal is not reduced to an unprovable subgoal that! Applying a tactic is something like the inverse of applying an inference rule—if a tactic returns subgoal s from goal g, then there is an inference rule that gives theorem g from s. The command “
g
” sets up the initial goal, and the “expand” command “e
” takes tactics as arguments to reduce the goal to subgoals. Looking at how HOL Light checks and proves theorems is the focus of the next two sections.
3. HOL Light: theorem checking and theorem proving
Naturally, the rules of inference implemented in HOL Light have been chosen such that anything provable is also true, although the converse does not hold, of course, due to Gödel’s Incompleteness Theorems (but this limitation does not as of yet seem to affect everyday mathematics). Furthermore, HOL Light has tools that help speed up interactive theorem proving by implementing automatic reasoning with boolean variables, quantifiers, arithmetic algebra, and more. These tools can prove some classes of theorems automatically, hence reducing goals to suitable subgoals more quickly. Since they are constructed from a series of applications of the primitive inference rules, these automated reasoning tools still guarantee correctness. Thus, HOL Light begins to blur the lines between automated theorem checking and theorem proving, as the automatic tools can sometimes quickly prove the logical validity of statements for which human mathematicians would take longer to find a proof (for example, see the HOL Light tutorial, p. 36). Although these automatic reasoning tools do have their limitations, as we will detail in the examples considered below, it is interesting to consider that automated theorem provers like HOL Light could reduce some seemingly intractable mathematical problems to subgoals that human insight can solve.
Here are some examples of automated reasoning tools available in HOL Light. More information can be found in the HOL Light reference:
TAUT
: This tool uses a naÏve algorithm to prove propositional tautologies t, and returns a failure if t is not a propositional tautology. This is equivalent to solving the Boolean satisfiability problem, which is decidable but NP-complete (for a proof, see Hopcroft, Motwani and Ullman [2000]). Therefore we would expectTAUT
to run in time on input of length —thus it is not efficient on large formulae but is still faster than humans on shorter formulae. Example usage:
# TAUT `(p ==> q) \/ (q ==> p)`;;
val it : thm = |- (p ==> q) \/ (q ==> p)
MESON
: This more powerful tool attempts to prove a theorem of first order predicate logic by automated proof search. The undecidability of first-order logic means there is no decision procedure that can whether arbitrary formulae are logically valid, as demonstrated by Alonzo Church (1936) and Alan Turing (1937) in response to the Entscheidungsproblem posed by David Hilbert in 1928³. Therefore,MESON
cannot be expected to succeed on arbitrary formulae and indeed it has some limitations: for example, it will fail if the term is not provable but might take an unfeasible amount of time to do, by hitting a depth limit; it may also loop indefinitely. Nevertheless, it often succeeds on valid formulae.MESON
uses a method called “model elimination” to efficiently search for first-order proofs: this involves searching for a truth-functionally contradictory statement but reducing the number of candidates by defining and retaining candidates that are only “partially contradictory” at intermediate levels (Loveland [1968]). Here is a non-trivial set-theoretic example thatMESON
can prove valid (taken from the HOL Light tutorial, p. 36; remember, “!
” in HOL Light syntax means ):
# MESON[] `(!x y z. P x y /\ P y z ==> P x z) /\ (!x y z. Q x y /\ Q y z ==> Q x z) /\ (!x y. P x y ==> P y x) /\ (!x y. P x y \/ Q x y) ==> (!x y. P x y \/ !x y. Q x y)`;;
Warning: inventing type variables
0..0..2..7..16..30..48..72..108..168..236..340..516
..702..918..1260..1660..2098..2716..3438..4298..
5528..6944..8594..11052..13780..16742..20862..solved
at 23107
val it : thm =
|- (!x y z. P x y /\ P y z ==> P x z) /\
(!x y z. Q x y /\ Q y z ==> Q x z) /\
(!x y. P x y ==> P y x) /\
(!x y. P x y \/ Q x y)
==> (!x y. P x y \/ (!x y. Q x y))
ARITH_RULE
: This tool verifies simple theorems about natural numbers by applying basic algebraic normalisation or linear inequality reasoning. It will fail if the input given is not of type:bool
, or if the input cannot be proved using the basic methods listed above. Hence,ARITH_RULE
proves basic facts about natural numbers but sadly fails on Fermat’s Last Theorem, as this requires more powerful tools from algebraic number theory:
# ARITH_RULE `(2 + x) - 1 = x + 1`;;
val it : thm = |- (2 + x) - 1 = x + 1
# ARITH_RULE `x EXP n + y EXP n = z EXP n /\ n >= 3 ==> x = 0 /\ y = 0`;;
Exception: Failure "linear_ineqs: no contradiction".
4. Proof of a bite-sized lemma in HOL Light
Now that we know a little bit about HOL Light, how it works, and how to use it, let us formally prove the bite-sized lemma by working through the example presented in Formal Proof—Getting Started (Wiedijk [2007]). Here is the statement of the lemma again:
Lemma (bite-sized). For , ,
A HOL Light proof is developed by interacting with the proof assistant via statements on the command line. This reduces the main goal to be proved (in this case, the lemma) to ever-simpler subgoals until it is reduced to a pre-proved theorem, an axiom or a tautology, at which point the formalised proof is complete. In what follows, we will go through the six command line interactions necessary to complete this proof in HOL Light, as presented in “Formal Proof—Getting Started” (Wiedijk [2007]). After each command, the system reply is reported, followed by an explanation of the proof step and providing the equivalent mathematical statement in everyday notation.
Step 1
# g `2 <= p ==> ((p - 1) * (p - 1) == 1) (mod p)`;;
Warning: Free variables in goal: p
val it : goalstack = 1 subgoal (1 total)
`2 <= p ==> ((p - 1) * (p - 1) == 1) (mod p)`
The g
command has set our lemma as HOL Light’s proof goal.
Step 2
# e (SIMP_TAC[LE_EXISTS; LEFT_IMP_EXISTS_THM]);;
val it : goalstack = 1 subgoal (1 total)
`!d. p = 2+d ==> (((2+d) - 1) * ((2+d) - 1) == 1) (mod (2+d))`
Using the e
command (known as the “expand” function), the simplification tactic SIMP_TAC
has been applied to the goal with pre-proved HOL Light library theorems LE_EXISTS
and LEFT_IMP_EXISTS_THM
as arguments. This has changed the goal to the mathematical statement: such that , .
Step 3
# e (REPEAT STRIP_TAC);;
val it : goalstack = 1 subgoal (1 total)
0 [`p = 2 + d`]
`(((2 + d) - 1) * ((2 + d) - 1) == 1) (mod (2 + d))`
Here, by repeating applying tactic STRIP_TAC
to break down the subgoal until it can be broken down no more, HOL Light says that if we assume , the new subgoal to be proved is: .
Step 4
# e (REWRITE_TAC[cong; nat_mod; ARITH_RULE `(2+x)-1 = x+1`]);;
val it : goalstack = 1 subgoal (1 total)
0 [`p = 2 + d`]
`?q1 q2. (d + 1) * (d + 1) + (2 + d) * q1 = 1 + (2 + d) * q2`
The tactic REWRITE_TAC
uses pre-proved equational theorems to rewrite the goals and subgoals. In this step, the equational theorems cong
, nat_mod
, and (2+x)-1 = x+1
are applied to rewrite the statement. The last of these is proved at runtime by HOL Light’s automatic linear arithmetic prover (over ), ARITH_RULE
(already described in Section 3 above). In particular, the equational theorem nat_mod
states that . HOL Light thus returns the statement to be proved as: .
Step 5
# e (MAP_EVERY EXISTS_TAC [`0`; `d:num`]);;
val it : goalstack = 1 subgoal (1 total)
0 [`p = 2 + d`]
`(d + 1) * (d + 1) + (2 + d) * 0 = 1 + (2 + d) * d`
In this key step, the first tactic MAP_EVERY
is used to map the second tactic EXISTS_TAC
over a list of input arguments. Importantly, EXISTS_TAC
allows the user to provide a witness to an existential statement; in the present case, … Hence, the overall command maps witness 0
to q1
and witness d
(which has type natural number) to q2
in the proof subgoal. The subgoal is therefore transformed to: .
Step 6
# e ARITH_TAC;;
val it : goalstack = No subgoals
The final step applies tactic ARITH_TAC
, which can solve linear arithmetic over . This checks that the equality in the subgoal is correct according to the rules of arithmetic—and indeed it is. Thus the formal proof is finished because there are no subgoals left to prove.
Once the formal proof a theorem is found via interactive dialogue with HOL Light, one can write all the steps in a proof file so that in future the formalisation can be run through automatically instead of requiring an interactive user. The bite-sized example lemma from Formal Proof—Getting Started [Wiedijk (2007)] is part of the full proof of the law of quadratic reciprocity formalised in HOL Light, which can be found in the file 100/reciprocity.ml
. To run the entire proof in HOL Light and check it for correctness, one simply types loadt "100/reciprocity.ml"
after the usual #
prompt. In the proof file, the lemma, which is called CONG_MINUS1_SQUARE
, is represented as below, with each interactive step printed out separated by a THEN
.
let CONG_MINUS1_SQUARE = prove
(‘2 <= p ==> ((p - 1) * (p - 1) == 1) (mod p)‘,
SIMP_TAC[LE_EXISTS; LEFT_IMP_EXISTS_THM] THEN
REPEAT STRIP_TAC THEN
REWRITE_TAC[cong; nat_mod;
ARITH_RULE ‘(2 + x) - 1 = x + 1‘] THEN
MAP_EVERY EXISTS_TAC [‘0‘; ‘d:num‘] THEN
ARITH_TAC);;
5. Conclusions
Using HOL Light gives a clear sense of its simplicity and also its power. Having a background in lambda calculus and functional programming (as I do) helps one to understand its underlying workings, for example binding variables using lambda abstraction and defining quantifications in terms of primitive notions of abstraction. There is something to be said for software that is so straightforwardly installed, and that works in almost the same way 6-8 years after the publication of the article whose worked example we followed here (Wiedijk [2007]). This surely reflects the fact that HOL Light has a simple logical core and little legacy code, as well the possibility that current developments in HOL Light are in areas unrelated to elementary number theory. Nevertheless, even if the logical core is small and simple, we still must take on trust that the implementations of the primitive inference rules, for example in building the tool ARITH_RULE
, have been done correctly (Beeson [2016]).
The interactive aspect of proving a proof goal in HOL Light is fascinating—and fun! As a part of this the ability of HOL Light’s internal functions, such as TAUT
, MESON
and ARITH_RULE
to “reason” or “search for proofs” themselves, sometimes more quickly than a mathematician could, blurs the lines between automatically checking and finding a solution. This is revealed by the interactive game aspect of procedural proof checking, whereas for other proof assistants that use a declarative style (such as Mizar) the formalisation of the proof must be written by hand.
REFERENCES
Beeson, M. Mixing computation and proofs. Journal of Formal Reasoning, Vol. 9, No. 1, pp. 71-99. 2016.
Hopcroft, J.E., Motwani, R., Ullman, J.D. “Introduction to Automata Theory, Languages, and Computation”. Addison Wesley:2nd Edition.
Loveland, D. Mechanical theorem-proving by model elimination. Journal of the ACM, Vol. 15, pp. 236-251. 1968.
Wiedijk, F. Formal Proof – Getting Started. Notices of the AMS, Vol. 55, No. 1, pp. 1408-1414. 2007.
~ Footnotes ~
(1) The problem came when installing camlp5 (v 6.17), which is a preprocessor-pretty-printer of OCaml, the programming language in which HOL Light is written. The make world.opt
command generated two warnings (warning: trigraph ignored
) although the installation did complete. However, when HOL Light was then installed from the Github source (using the first “option” specified in the instructions), the make
command failed with the following error:
File "pa_j.ml", line 1918, characters 35-43:
While expanding quotation "str_item":
Parse error: antiquot "_" or antiquot "" or [ident] expected after 'type' (in
[str_item])
File "pa_j.ml", line 1:
Error: Error while running external preprocessor
Command line: camlp5r pa_lexer.cmo pa_extend.cmo q_MLast.cmo 'pa_j.ml' > /var/folders/j8/h9klnryj7n3gwx0_03x40qkh0000gn/T/ocamlpp5fcbb9
(2) A lambda calculus with the equality λx.sx = s (where x is not a free variable of s) is known as extensional lambda calculus. This equality, known as η-conversion, is motivated by the philosophy that all terms are functions, and that all functions are identified with their extensions, that is, their (usually set-theoretic) input-output mapping. For HOL Light and other theories of formalised mathematics, set relations (including functions) commonly identified in this way.
(3) Church’s and Turing’s solutions assume the Church-Turing Thesis, that is, any “effectively calculable’ decision procedure is captured by functions computable by the lambda calculus or, equivalently, the Turing Machine.