I've been following Buzzard's evangelism of proof assistants for some time.
A few months ago I decided to try formalising some old mathematics olympiad problems in Coq. Part of my motivation was to get a sense for how much work would be involved in proving something slightly non-trivial but still very elementary. I managed it, but it was _a lot_ more work than expected (results here: https://github.com/ocfnash/imo-coq).
Partly based on this exercise, I think that while it is possible that mathematics may go the way of chess, so to speak, that point is a lot more distant than the ten years mentioned in this article.
I strongly support, very much hope, and even expect that proof assistants will become mainstream in mathematics within a generation or so, but I think it is impossible to accurately guess the exact role they will play.
I like to think that I'm good at these kinds of combinatorics questions, but I can't wrap my head around this one. I can follow the construction of the matrix, but I don't understand how you came up with the formula.
Oh I see. Here's how I got there (still using the terminology / notation of my original post).
Without knowing anything about k except that k-1 < r/m, I counted how many cells are column-bad for this value of k. By the pigeon-hole-principle argument in the post this is at most c(m-1)(k-1) cells. Similarly, and still knowing nothing about k except that k-1 < c/n, I count how many cells are row-bad for this value of k and see that it is at most r(n-1)(k-1). Combining these two I know that as long as we have k-1 < r/m and k-1 < c/n the number of cells that are either column-bad or row-bad (or both) is at most:
* c(m-1)(k-1) + r(n-1)(k-1)
Since I want to be able to conclude that there is a cell that is neither column-bad nor row-bad, and since there are rc cells in total, I then further require that k is such that:
* c(m-1)(k-1) + r(n-1)(k-1) < rc
(The point is that this is a strict inequality and so there must be a cell that is neither column-bad nor row-bad.)
And then solve this to get the formula for k. Of course when expositing it in the post, I _start_ with the formula, which perhaps makes things seem mysterious at first, but I thought it was efficient.
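Spelling out the solve step (just the algebra; I won't restate the post's closed form here): factoring out (k-1), the requirement becomes
* (k-1)(c(m-1) + r(n-1)) < rc
* k-1 < rc / (c(m-1) + r(n-1)) (assuming c(m-1) + r(n-1) > 0)
so any k satisfying this, together with k-1 < r/m and k-1 < c/n, guarantees a cell that is neither column-bad nor row-bad.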
Could someone with knowledge of both Coq and Lean provide more info/comparison? Are they based on the same foundation or are there differences? Might it be possible to automatically convert proofs between the two? What about "strategies"?
I was able to find a basic example of the natural numbers and the operation `plus` in both languages, and it seems like the structure is quite similar.
Coq:
Inductive nat : Type :=
| O : nat
| S : nat → nat. (* S stands for successor *)
Fixpoint plus (n : nat) (m : nat) : nat :=
match n with
| O ⇒ m
| S n' ⇒ S (plus n' m)
end.
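Lean (Lean 3 syntax; my own rough counterpart for comparison rather than something copied from a tutorial, with the type renamed to mynat so it doesn't clash with Lean's built-in nat):
inductive mynat : Type
| zero : mynat
| succ : mynat → mynat -- succ stands for successor

def plus : mynat → mynat → mynat
| mynat.zero     m := m
| (mynat.succ n) m := mynat.succ (plus n m)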
There are subtle differences. While ostensibly they are both based on the calculus of inductive constructions, there are incompatible extensions to the logic in the kernel.
The majority is convertible, though. If I had to make a programming analogy, it would be like converting between C++ and D, or different Lisp dialects. The difference is bigger than Python 3 vs Python 2, but less than F# vs OCaml. Clearly possible, in a sense, and if you can read one language, you can read the other, but automatic conversion is just out of reach due to edge cases.
The biggest problem in mathematics has never been proving theorems. Mathematics was never focused on proof until the formalization of math became popular approximately 100 years ago. Mathematics is used as a tool to explain the world and to create the least-complex model which predicts the data.
So computers can aid us in verifying our work--but ultimately we aren't interested in determining all the possible logical theorems. Instead, we're only interested in finding the logically valid theorems which serve the above purpose of explaining the patterns in the world concisely.
It's a subtle difference, but for computers to solve that problem requires them to have a certain understanding of humans and our goals with science.
Mathematicians choose what to keep according to aesthetics. Mathematics can only be defined as what mathematicians like.
For a long time, they didn't like irrationals, so anything involving those didn't count as mathematics. Zero took a long time to be accepted, starting in India. Negative solutions of quadratics were illegitimate until astonishingly recent days. Complex numbers were accepted even more recently. Greek geometers knew they lived on a sphere, but spherical geometry was too unpleasant to contemplate until quite recently, when it turned out interesting theorems could live there.
People worked outside these boundaries all along, but what they wrote didn't catch on. Laplace transforms were ignored and forgotten until they were needed to shore up Heaviside's extremely practical D operator. Complex numbers turned out to be needed for electromagnetics. Once people got deeply into the topic, they discovered beauty, and then mathematics accepted them.
Mathematics is the world's largest and longest-running effort to produce a collective work of sublime beauty. What is beautiful goes in, what isn't dies with its creator. New forms come to be seen as beautiful as they are shown to open new vistas to explore, but very slowly.
> For a long time, they didn't like irrationals, so anything involving those didn't count as mathematics.
Note that this is more or less a myth.
> Greek geometers knew they lived on a sphere, but spherical geometry was too unpleasant to contemplate until quite recently
Astronomers did a huge amount of sophisticated spherical geometry, from Mesopotamians through e.g. Hipparchus and later Ptolemy, then Arabs/Persians, Indians, medieval Europeans, right down to the present.
>Instead, we're only interested in finding the logically valid theorems which serve the above purpose of explaining the patterns in the world concisely.
That's just empirically false though. Take the "perfectoid spaces" mentioned in the article. Do you think mathematicians are interested in them because they "explain the patterns in the world concisely"? No, they're interested in them because of their mathematical significance, and their applications to problems within mathematics. Mathematicians are not physicists.
> mathematical significance, and their applications to problems within mathematics
These external problems ultimately are about explaining patterns in the world.
I'm just saying you'll never replace the relationship between the mathematician and the physicist by simply replacing the mathematician. Instead, you'd have to replace the whole academic process, making fully autonomous universities. In which case they're still not interested in exactly the same problems unless they are constructed in the same way (which they are not, since they are not human).
I'm not saying that computers won't be able to aid us with these meta-theories which help push our main theories forward. So I do agree that they can come up with some theory, but not all of it, and not most of it.
>Mathematics was never focused on proof until the formalization of math became popular approximately 100 years ago.
We are transitioning surprisingly fast from kinda explaining stuff to automatically and systematically advancing exclusively through rigorously formalized proofs.
The entire article makes it sound like Lean is a significant step in usability and power over other systems. That seems like an important thing and makes me interested to download it and play with it.
Microsoft Research seems to have done some exciting things with provers over the years - Z3 is another significant program. All under the direction of Leonardo de Moura, notably.
Power, perhaps. But I'm a bit skeptical about usability. Lean doesn't even use one of the most obvious things that make interactive proof systems far more usable - a declarative mode instead of the usual tactics-based scripts. (Yes, you can kinda sorta fake the former with "structuring" tactics, except not really - declarative proofs are really their own kind of thing.) There even used to be systems that automatically rendered input definitions and declarative proofs in natural language (given that the basic terms and symbols were previously defined, of course), which does enable even an average mathematician to easily figure out what the system is up to. You just can't do this properly if all you have is a list of "tactics" fiddling with the prover state.
"Lean doesn't even use one of the most obvious things that make interactive proof systems far more usable - a declarative mode instead of the usual tactics-based scripts."
Citation needed. I can make a perfectly reasonable Isar-style declarative proof in Lean. Just because most users of Lean choose not to do this doesn't mean it can't be done. I should mention that the reason users are more willing to write imperative proof code instead of declarative proofs in Lean is that 1) the interactive debugger is responsive and easy to use, and 2) writing a nicely structured declarative proof is more work than an imperative one.
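For instance, here is a toy sketch of mine (not something from mathlib) of a structured, Isar-ish proof in Lean 3, written with calc rather than a bare tactic script:
-- each calc line states the intermediate fact in full,
-- so a reader can follow it without replaying any tactic state
example (a b c : ℕ) (hab : a = b) (hbc : b = c) : a + 1 = c + 1 :=
calc a + 1 = b + 1 : by rw hab
       ... = c + 1 : by rw hbc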
I agree with this wholeheartedly. The down side is that it's going to take forever to get everyone on board. Mathematicians are very hostile to outside influences.
Constructive mathematics should be more of a thing too.
Yes. Been there, done that, a long time ago, and had the same political problems.
Many years ago, around 1980, I headed a project to build one of the first program proof-of-correctness systems. (It's on GitHub now, about two-thirds working with modern compilers.)[1] We succeeded, but it was just too hard to use. And, back then, really slow. As in 15 minutes to hours.
We had two theorem provers. One was what's now called an SMT solver - it could prove theorems with addition, subtraction, multiplication by constants, equalities, inequalities, and the Boolean operators. It could also accept rewrite rules, which it treated as true.
Because allowing users to input rewrite rules made it easy to create false proofs, we required that the rewrite rules be proven using the Boyer-Moore theorem prover, which is a pure constructive system.[2] Then the theorems could be transferred to the SMT solver and used there.
One of the things we set out to prove was the "axioms" for arrays. There are four, credited to McCarthy in his classic 1962 paper. But we wanted to prove them from a theory of constructive arrays. In Boyer-Moore theory, we defined arrays as sorted lists of (subscript, value) tuples. And then we set out to prove the classic four axioms for arrays with machine proofs.
Which we did [3], starting at line 3372. That file is the Boyer-Moore theorem prover grinding its way up from a very low base of number theory to the theorems we needed for our program proving system. (On a modern computer, in seconds. Took 3 hours on a VAX in 1980. We just didn't have enough engine for this stuff back then.) In that notation, "(selecta A I)" means A[I], and "(storea A I V)" is array A with A[I] replaced with V. After some struggles, we got the prover to find a way to prove that.
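For readers who haven't seen them, the two best-known select/store laws look roughly like this (a sketch in Lean 3, nothing to do with our Boyer-Moore development, and using a far simpler functions-as-arrays model than our sorted lists of tuples):
-- toy model: an array is just a function from subscripts to values
def selecta (A : ℕ → ℕ) (I : ℕ) : ℕ := A I

def storea (A : ℕ → ℕ) (I V : ℕ) : ℕ → ℕ :=
λ J, if J = I then V else A J

-- reading back the subscript you just wrote gives the stored value
lemma select_store_same (A : ℕ → ℕ) (I V : ℕ) :
  selecta (storea A I V) I = V :=
by simp [selecta, storea]

-- reading any other subscript is unaffected by the write
lemma select_store_other (A : ℕ → ℕ) (I J V : ℕ) (h : J ≠ I) :
  selecta (storea A I V) J = selecta A J :=
by simp [selecta, storea, h]
In this trivial model the laws are nearly definitional; the point of our exercise was precisely that they are much harder to establish over a concrete ordered-list representation.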
But it's ugly. There's lots of case analysis, which mathematicians hate. We didn't get to redefine "equal", so we couldn't have abstract sets. We had to have ordered lists of tuples, which meant all the fussiness of updating an ordered-list representation and proving that sound. But it all worked.
So I submitted this to JACM, back when that mattered. It was rejected. The comments were along the lines of "I don't like having something that ugly at the foundations of computer science". The reviewers all agreed it was valid, but they just didn't like doing it that way, in an automatic but inelegant way.
Looks like the author of the original article here is still having that problem.
Good times back then. But we were way ahead of our time.
I don’t think it’s just politics. It’s just a lot of work to build up a proof system to the point where it is useful for doing modern mathematics. Take the example from this article:
“True or false – if x is a real number and x² − 3x + 2 = 0, then x = 1.”
My answer: “False – set x = 2.”
Lean: “OK, so it now suffices to prove that
(a) 2² − 3×2 + 2 = 0 and that
(b) 2 ≠ 1.”
Lean nowadays apparently is better at this, but there are millennia of ‘trivial’ theorems to add before it is truly useful to research mathematicians, and doing that isn’t mathematics, but still has to be done by mathematicians, because nobody else can do it.
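For what it's worth, here is a sketch of how one might state and refute that claim today in Lean 3 with mathlib (my own reconstruction, not the code from the article or slides):
import data.real.basic
import tactic.norm_num

-- "if x is a real number and x² − 3x + 2 = 0, then x = 1" is false;
-- the counterexample x = 2 is discharged by mathlib's norm_num
example : ¬ (∀ x : ℝ, x^2 - 3*x + 2 = 0 → x = 1) :=
begin
  intro h,
  have h2 : (2 : ℝ) = 1 := h 2 (by norm_num),
  norm_num at h2,
end
Even here, the user still has to supply the witness x = 2; the automation only handles the arithmetic legwork.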
Sorry for the dim question, but could you explain what is wrong with Lean's method? I was wondering this when looking at the slides, too. (We know 2 != 1, therefore False - right?)
We know, but Lean (initially) didn’t. When writing a computer theorem prover/checker, you want the part that isn’t proven by the system itself to be bug-free first, and expressive second. The only way to get the first is by giving up that second goal.
It’s a long road from there to high school math and an even longer one to mathematics research, so users not only will be able to do the above, but they will have to, if they want to do something at the front of mathematics.
I don't think it's that mathematicians themselves didn't like this idea. In fact, Hilbert tried it (https://en.wikipedia.org/wiki/Hilbert%27s_program) one hundred years ago, and it failed and was proved to be impossible. Well, maybe the Hilbert program was too aggressive, and this Lean thing is much more moderate. But at the end of the day, the boundary of what automated solvers can do is already established, hence no one is paying too much attention.
The problem might be that once you try to get your theorem into a proof system, you're not doing maths anymore: now you're programming.
Programming is a whole discipline apart from mathematics. It's grounded in mathematics, but in practice it really is engineering. There are myriad choices to be made, most of little or no mathematical significance. Just managing names becomes a task; using the same Hebrew letter to mean five (more or less analogous) things in one proof becomes impossible. Maybe even using a Hebrew letter is hard.
Mathematicians who wanted to be programmers might be programmers already. Mathematicians willing to do engineering might be engineers already.
We have lots of experience with physics grad students writing unmaintainable spaghetti code to analyze experimental result data. It is tolerable because the code is more-or-less discarded after the paper is published. But this code is supposed to support all of the future of mathematical development, forever.
Who will rewrite the shoddy parts when their (practical) flaws become intolerable? Will it even be possible, with the weight of decades depending on it? What abstraction methods does the language offer to help decouple the statement proven from the expression of its proof?
Very often the lemma your proof depends on isn't exactly what somebody else proved, but something analogous. Mutatis mutandis, it's the same, conceptually, but that's not good enough for your proof system.
Do you want dozens of almost-identical proof statements in the system? Is the language strong enough to unify them? Does unifying them count as doing maths? It will usually feel more like engineering.
Other software engineering considerations become important, notably runtime performance. Is a proof acceptable if it would take a hundred years to evaluate, so hasn't been? Can it be parallelized? Who provides the equipment to do it?
> Lots of PhD students and post-docs working in the area? - No - Yes
> Talks happening about these things all over the world? - No - Yes
> Mathematicians interested in 2019? - No - Yes
> Earlier this year, Patrick Massot, Johan Commelin and myself formalised the definition of a perfectoid space in Lean. I am getting invitations from across the EU to speak in mathematics departments about the work. Serious piece of research, or elaborate PR stunt? Maybe both.
They may be learning more than programming from their contacts in the computer department. It is always much easier to get something out there when it is fully buzzword-compatible.
> Possible: tools such as Lean will begin to do research semi-autonomously, perhaps uncover problems in the literature. Maybe these tools will replace research mathematicians.
> In April, Christian Szegedy from Google told me that he believes that computers will be beating humans at math within ten years.
I wonder whether automated mathematics software would really be able to compete with humans at explaining how areas of mathematics fit together and what roles different mathematical objects play.
I'm not a mathematician so sorry for the simplistic example, but for example, would automated mathematics software be able to do something like (a) invent the complex numbers (i.e. as an object in algebra -- a "field extension" of the reals), and (b) also make a statement like "these are useful for modeling cyclical/oscillatory behavior"?
I think the image is the computer acting as the world's most brilliant grad student, solving lemmas instantly while the mathematician makes value judgements while driving.
> In April, Christian Szegedy from Google told me that he believes that computers will be beating humans at math within ten years.
In my experience, the track record of Outlandish Claims Made by People With a Vested Interest in Making Them is rather poor. In fact, I cannot recall a single one that was accurate.
I was at Buzzard's talk at Microsoft last week. Kevin spent a lot of time in the live talk explaining what he thinks Szegedy actually meant (there should eventually be video on youtube of this). Kevin clarified that in his opinion what Szegedy means by "math" isn't at all what Kevin means by "math", since it takes so long to get to the frontiers of research math. In particular, Kevin speculated that for Szegedy "math" is some specific research questions in combinatorics, which isn't at all what Kevin's view of "math" is. So Kevin's conclusion was that Szegedy really means that computers will beat humans at "something specific involving combinatorics" in the next ten years. Kevin wasn't optimistic about the same claim for number theory, unless 10 years is replaced by "some day".
Kevin also explained live in his talk that his main motivation for his recent interest in interactive theorem provers is a very bad experience he had a few years ago refereeing an important research paper by some famous mathematicians. That experience very much shook his faith in the ability of humans to do correct deep mathematics... I suppose from that perspective "computers will be beating humans" might just mean that humans are convinced of a (false!) result, but computers can't be convinced of that same false result (in the sense that nobody can formalize it).
(Incidentally, some of the people at Kevin's talk were at Richard Stallman's talk in the same place the day before. Evidently RMS didn't allow any pictures or video, so not much will appear from that.)
> computers will be beating humans at math within ten years.
I don't think anyone knows how long that's going to take. It's clear at this point that it's going to take something more than Deep Learning has to offer.
One could cluster theorems and types by “connectedness”, but even then, I don’t think “algebra” would stand out as separate from, say, “geometry” or “number theory”. Those are just stickers ignorant humans put on things they thought to be significant entities centuries ago.
Chances are not even “arithmetic” would stand out as a separate entity. Even if it did, the computer couldn’t give it a name that makes sense to humans.
And b) ‘useful’ isn’t mathematics, it’s physics/biology/economics/whatever.
> And b) ‘useful’ isn’t mathematics, it’s physics/biology/economics/whatever.
I don’t think that’s really true. There are cyclical/oscillatory behaviors within pure mathematics, for which complex numbers are useful. Just look at the other math article that’s on HN front page currently, on the Riemann Zeta function (John Baez blog post).
No one is going to care until an important proof can't be verified and a counterexample to some step is found. At that point, you'll see a sea change. This could take twenty years, or it could happen next year.