Cantor's theorem
Before studying this essay, you should have read:
implicit typing in first order logic
implicit typing in lambda logic
as otherwise it will be difficult to make sense of the essay.
One of the most celebrated examples of the use of higher-order unification in theorem-proving is the automatic generation of the "diagonal'' proof of Cantor's theorem (see e.g. [1] or [2]). Since this theorem and its proof seem at first glance to be heavily dependent, on type theory, we want to show that Otter-λ also can find the proof. This is shown in file cantor2.in.
In this essay, we discuss this file and proof in the light of the method of "implicit typing" Intuitively, we have a set (or type) α and its power set P(α), and we suppose for proof by contradiction that we have a map c from α onto P(α). Then the statement to be proved contradictory is
forall x ε P(α) exists j ε α forall z ε α (Ap(x,z) = Ap(Ap(c,j),z)).
As usual we introduce a Skolem function J and replace j by J(x). Then the negated goal becomes
Ap(x,z) = Ap(Ap(c,J(x)),z).We also need the following two axioms:
w != not(w). (definition of negation)
u != v | v = u. (symmetry of equality)
The (implicit) typings in this problem are as follows:
type(α, J(P(α))). J maps P(\α) into α
type(i(α,P(α)),c). c maps α into P(α)
type( bool, Ap(P(α), α). Ap is proposition-valued on P(α)
type(P(α), Ap(β, α). where β is the type i(α,P(α) maps from α to P(α)
type(bool, not(bool)). The negation of a proposition is a proposition.
type(bool,=(bool,bool)).
Nevertheless the second argument of Ap always has type \α. We do not, then, know a priori that an Otter-λ proof of contradiction from the goal given above will be correctly typable, but this can easily be verified by inspection if it is true for a given proof. Otter-λ finds such a proof instantaneously: It resolves the equation Ap(x,z) = Ap(Ap(c,J(x)),z) with u != v | v = u, producing the new conclusion
Ap(Ap(c,J(x)),y)=Ap(x,y).
This is then resolved with w != not(w); the first step assigns w the value Ap(Ap(c,J(x)),y), and then we have the unification problem
Ap(x,y) = not(Ap(Ap(c,J(x)),y)).
The ``masking term'' on the right, i.e. the smallest term that contains all occurrences of x, and occurring as a second argument of Ap, is J(x). Therefore our unification algorithm selects J(x) and y to be replaced by the new lambda-bound variable, and takes
x = \lambda u. not(Ap(Ap(c,u)),u)) and y = J(x), or explicitly,
y = J(\lambda u. not(Ap(Ap(c,u)),u)).
We check this solution: Ap(x,y) then beta reduces to not(Ap(Ap(c,y)),y)), but y = J(x), so we have
not(Ap(Ap(c,J(x)),y)),
which is the right-hand side of the unification problem. Yes, they are equal.
The proof that Otter-λ finds is only four lines long. Let's check that it is in fact correctly typed.
1 [] w!=not(w).
2 [] u!=v|v=u.
4 [] Ap(x,z)=Ap(Ap(c,J(x)),z)|$ANS(x)|$ANS(z).
5 [binary,4.1,2.1] $ANS(x)|$ANS(y)|Ap(Ap(c,J(x)),y)=Ap(x,y).
6 [binary,5.1,1.1] $ANS(lambda(x,not(Ap(Ap(c,x),x))))|$ANS(J(lambda(x,not(Ap(Ap(c,x),x))))).
When interpreting this output as a proof, answer literals are ignored, so the last line is really the empty clause (the final contradiction). We need to assign the variables types in such a way that each variable gets the same type at every occurrence in the same clause, and the type specifications given above are respected. In clause 1, w gets type Prop. In clause 2, u and v get type bool. In clause 4, we assign z the type α and x gets type P(α); then J(x) has type α. Since c has type β = i(α P(α)), using the type specification type(P(&alpha);,Ap(&beta,α)), then Ap(c,J(x)) is assigned type P(α). The remaining two Ap terms, at top level on the left and right of clause 4, are correctly typed according to the other specification for Ap, namely type(bool,Ap(P(α),&alpha). That completes the check that the proof is correctly typed.
What can be concluded from this correct typing of the proof? As with any correct typing, we can conclude that Cantor's theorem is provable in various theories. All we need is a theory that supports the explicit definition of the types in the list of type specifications above. For example, ZF set theory supports the definition of the set i(X,Y) of functions from set X to set Y, so we can interpret the above axioms and proof in ZF as a proof of the usual form of Cantor's theorem as formalized in ZF, interpreting Ap(f,x) as the unique y such that the set-theoretic ordered pair <x,y> belongs to the function f (in set theory functions are the same as their graphs). Or, we could interpret the axioms and proof in various forms of type theory. Once we have checked the correct typing, we have a "polymorphic proof".
Without getting involved in theories other than lambda logic, we could of course simply make the type specifications explicit, relativizing the variables to the specified types. Our untyped (or implicitly typed) proof would then convert directly to a proof of this relativized version in lambda logic.
Russell's paradox
It is quite amusing that this very same proof is also a proof of Russell's paradox. To see this, we use the usual representation of sets in lambda calculus, namely we regard u ε v as an abbreviation for Ap(v,u), where v is a boolean-valued mapping. Then if we add the axioms Ap(c,x) = x and Ap(J,x) = x to the input file, the term x = lambda(u,not(Ap(Ap(c,u)),u)) is just another notation for
{ u: u is not in Ap(c,u) } = {u : u is not in u }.
That is just the Russell set R, automatically defined by untyped lambda unification. Note that the Russell paradox
cannot even be formulated in type theory--that was, after all, the point of Russell's invention of type theory. Of course,
we could remove c and J entirely, rather than adding extra axioms saying they are the identity, but leaving them in shows
the relation between Cantor's theorem and Russell's paradox more clearly.
We might try to escape Russell's paradox in two different ways, informally. We might claim that the problem is the axiom w != not(w); we could say that this doesn't hold for arbitrary objects, but only for "boolean" objects, or "propositions", whatever they are. This axiom was important in the proof and if we take it out of the file, no proof is found. Alternately, we might try to escape Russell's paradox by claiming that the problem is that Ap is taken to be total; what the paradox "really" shows is that Ap(R,R) is undefined.
This first suggestion, that not does not have a fixed point, is problematic in lambda logic, since every function has a fixed point. In fact, the Russell paradox is nothing but the standard fixed point construction for not: if R = lambda(x, not(Ap(x,x))) then Ap(R,R) is a fixed point of not. If we want to take the view that the solution is that not doesn't have a fixed point, we have to use partial lambda logic instead of lamba logic, in which case we
can consistently postulate that not has no fixed point. Then the paradox can be taken only to show that RR must be undefined, since otherwise not would have RR as a fixed point. Thus the first apparent solution really implies the second; if we want to regard sets as boolean-valued functions, we can't cling to the notion that all functions are total.
[1] Peter B. Andrews, Matthew Bishop, Sunil Issar, Dan Nesmith, Frank Pfenning, Hongwei Xi, TPS: A Theorem Proving System for Classical Type Theory, Carnegie Mellon University Department of Mathematics Research Report 94-166A, February, 1995. Available on the Weg in pdf format , while the published version [2] is not available electronically.
[2] Peter B. Andrews, Matthew Bishop, Sunil Issar, Dan Nesmith, Frank Pfenning, Hongwei Xi, TPS: A Theorem Proving System for Classical Type Theory, Journal of Automated Reasoning 16, 1996, 321-353.