_International Journal of Bifurcation and Chaos_,
Vol. 4, No. 1, February 1994

Randomness & Complexity in Pure Mathematics¹

G.J. Chaitin
IBM Research Division
P.O. Box 704
Yorktown Heights, NY 10598, USA
Abstract
One normally thinks that everything that is true is true for a reason.
I've found mathematical truths that are true for no reason at all. These
__________________________
¹Lecture given Thursday 22 October 1992 at a Mathematics--Computer Science
Colloquium at the University of New Mexico. The lecture was videotaped;
this is an edited transcript. Originally published under the title
"Randomness in arithmetic and the decline and fall of reductionism in
pure mathematics" in the June 1993 issue of the Bulletin of the European
Association for Theoretical Computer Science.
mathematical truths are beyond the power of mathematical reasoning
because they are accidental and random.
Using software written in Mathematica that runs on an IBM
RS/6000 workstation, I constructed a perverse 200-page algebraic
equation with a parameter N and 17,000 unknowns:
Left-Hand-Side(N) = Right-Hand-Side(N).
For each whole-number value of the parameter N, ask whether this
equation has a finite or an infinite number of whole number solutions.
The answers escape the power of mathematical reason because they are
completely random and accidental.
This work is an extension of famous results of Gödel and Turing
using ideas from a new field called algorithmic information theory.
1. Hilbert on the axiomatic method
Last month I was a speaker at a symposium on reductionism at Cam-
bridge University where Turing did his work. I'd like to repeat the talk
I gave there and explain how my work continues and extends Turing's.
Two previous speakers had said bad things about David Hilbert. So I
started by saying that in spite of what you might have heard in some
of the previous lectures, Hilbert was not a twit!
Hilbert's idea is the culmination of two thousand years of math-
ematical tradition going back to Euclid's axiomatic treatment of ge-
ometry, to Leibniz's dream of a symbolic logic, and to Russell and
Whitehead's monumental Principia Mathematica. Hilbert's dream
was to once and for all clarify the methods of mathematical reasoning.
Hilbert wanted to formulate a formal axiomatic system which would
encompass all of mathematics.
Formal Axiomatic System
--------->
--------->
--------->
Hilbert emphasized a number of key properties that such a formal
axiomatic system should have. It's like a computer programming lan-
guage. It's a precise statement about the methods of reasoning, the
postulates and the methods of inference that we accept as mathemati-
cians. Furthermore Hilbert stipulated that the formal axiomatic system
encompassing all of mathematics that he wanted to construct should
be "consistent" and it should be "complete."
Formal Axiomatic System
---------> consistent
---------> complete
--------->
Consistent means that you shouldn't be able to prove an assertion
and the contrary of the assertion.
Formal Axiomatic System
---------> consistent A ~A
---------> complete
--------->
You shouldn't be able to prove A and ~A (not A). That would be very
embarrassing.
Complete means that if you make a meaningful assertion you should
be able to settle it one way or the other. It means that either A or not
A should be a theorem, should be provable from the axioms using the
rules of inference in the formal axiomatic system.
Formal Axiomatic System
---------> consistent A ~A
---------> complete A ~A
--------->
Consider a meaningful assertion A and its contrary not A. Exactly
one of the two should be provable if the formal axiomatic system is
consistent and complete.
A formal axiomatic system is like a programming language. There's
an alphabet and rules of grammar, in other words, a formal syntax. It's
the kind of thing that we are familiar with now. Look back at Russell
and Whitehead's three enormous volumes full of symbols and you'll feel
you're looking at a large computer program in some incomprehensible
programming language.
Now there's a very surprising fact. Consistent and complete means
only truth and all the truth. They seem like reasonable requirements.
There's a funny consequence, though, having to do with something
called the decision problem. In German it's the Entscheidungsproblem.
Formal Axiomatic System
---------> consistent A ~A
---------> complete A ~A
---------> decision problem
Hilbert ascribed a great deal of importance to the decision problem.
HILBERT
Formal Axiomatic System
---------> consistent A ~A
---------> complete A ~A
---------> decision problem
Solving the decision problem for a formal axiomatic system is giving
an algorithm that enables you to decide whether any given meaningful
assertion is a theorem or not. A solution of the decision problem is
called a decision procedure.
HILBERT
Formal Axiomatic System
---------> consistent A ~A
---------> complete A ~A
---------> decision procedure
This sounds weird. The formal axiomatic system that Hilbert wanted
to construct would have included all of mathematics: elementary
arithmetic, calculus, algebra, everything. If there's a decision proce-
dure, then mathematicians are out of work. This algorithm, this
mechanical procedure, can check whether something is a theorem or not,
can check whether it's true or not. So to require that there be a decision
procedure for this formal axiomatic system sounds like you're asking
for a lot.
However it's very easy to see that if it's consistent and it's complete
that implies that there must be a decision procedure. Here's how you do
it. You have a formal language with a finite alphabet and a grammar.
And Hilbert emphasized that the whole point of a formal axiomatic sys-
tem is that there must be a mechanical procedure for checking whether
a purported proof is correct or not, whether it obeys the rules or not.
That's the notion that mathematical truth should be objective so that
everyone can agree whether a proof follows the rules or not.
So if that's the case you run through all possible proofs in size order,
and look at all sequences of symbols from the alphabet one character
long, two, three, four, a thousand, a thousand and one... a hundred
thousand characters long. You apply the mechanical procedure, which
is the essence of the formal axiomatic system, to check whether each
proof is valid. Most of the time, of course, it'll be nonsense, it'll be
ungrammatical. But you'll eventually find every possible proof. It's
like a million monkeys typing away. You'll find every possible proof,
though only in principle of course. The number grows exponentially
and this is something that you couldn't do in practice. You'd never get
to proofs that are one page long.
But in principle you could run through all possible proofs, check
which ones are valid, see what they prove, and that way you can sys-
tematically find all theorems. In other words, there is an algorithm,
a mechanical procedure, for generating one by one every theorem that
can be demonstrated in a formal axiomatic system. So if for every
meaningful assertion within the system, either the assertion is a the-
orem or its contrary is a theorem, only one of them, then you get a
decision procedure. To see whether an assertion is a theorem or not
you just run through all possible proofs until you find the assertion
coming out as a theorem or you prove the contrary assertion.
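To make the enumeration concrete, here is a minimal sketch in Python (not anything Hilbert had, of course). The functions `checks`, `conclusion` and `negate` are hypothetical stand-ins for the mechanical proof-checker of some formal axiomatic system:

```python
from itertools import count, product

ALPHABET = "01"  # stand-in for the formal system's finite alphabet

def candidate_proofs():
    """Yield every finite string over ALPHABET in size order."""
    for n in count(1):
        for chars in product(ALPHABET, repeat=n):
            yield "".join(chars)

def decide(assertion, checks, conclusion, negate):
    """Search all purported proofs until one settles `assertion`.
    checks(p) mechanically verifies a proof, conclusion(p) reads off
    what it proves, negate(a) forms the contrary assertion.  This
    loop terminates on every assertion only if the system is
    consistent and complete, which is exactly what turns out to fail."""
    for proof in candidate_proofs():
        if checks(proof):
            if conclusion(proof) == assertion:
                return True    # the assertion is a theorem
            if conclusion(proof) == negate(assertion):
                return False   # its contrary is a theorem
```

The search is hopelessly slow, but speed is beside the point; completeness is what makes it a decision procedure at all.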
So it seems that Hilbert actually believed that he was going to solve
once and for all, all mathematical problems. It sounds amazing, but
apparently he did. He believed that he would be able to set down a
consistent and complete formal axiomatic system for all of mathematics
and from it obtain a decision procedure for all of mathematics. This is
just following the formal, axiomatic tradition in mathematics.
But I'm sure he didn't think that it would be a practical decision
procedure. The one I've outlined would only work in principle. It's
exponentially slow, it's terribly slow! Totally impractical. But the idea
was that if all mathematicians could agree whether a proof is correct
and be consistent and complete, in principle that would give a decision
procedure for automatically solving any mathematical problem. This
was Hilbert's magnificent dream, and it was to be the culmination of
Euclid and Leibniz, and Boole and Peano, and Russell and Whitehead.
Of course the only problem with this inspiring project is that it
turned out to be impossible!
2. G"odel, Turing and Cantor's diagonal argument
Hilbert is indeed inspiring. His famous lecture in the year 1900 is a
call to arms to mathematicians to solve a list of twenty-three difficult
problems. As a young kid becoming a mathematician you read that list
of twenty-three problems and Hilbert is saying that there is no limit to
what mathematicians can do. We can solve a problem if we are clever
enough and work at it long enough. He didn't believe that in principle
there was any limit to what mathematics could achieve.
I think this is very inspiring. So did John von Neumann. When he
was a young man he tried to carry through Hilbert's ambitious program.
Because Hilbert couldn't quite get it all to work, in fact he started off
just with elementary number theory, 1, 2, 3, 4, 5, ..., not even with
real numbers at first.
And then in 1931 to everyone's great surprise (including von Neu-
mann's), Gödel showed that it was impossible, that it couldn't be done,
as I'm sure you all know.
G"odel 1931
This was the opposite of what everyone had expected. Von Neu-
mann said it never occurred to him that Hilbert's program couldn't be
carried out. Von Neumann admired Gödel enormously, and helped him
to get a permanent position at the Institute for Advanced Study.
What G"odel showed was the following. Suppose that you have a
formal axiomatic system dealing with elementary number theory, with
1, 2, 3, 4, 5 and addition and multiplication. And we'll assume that it's
consistent, which is a minimum requirement---if you can prove false
results it's really pretty bad. What Gödel showed was that if you assume
that it's consistent, then you can show that it's incomplete. That was
Gödel's result, and the proof is very clever and involves self-reference.
Gödel was able to construct an assertion about the whole numbers that
says of itself that it's unprovable. This was a tremendous shock. Gödel
has to be admired for his intellectual imagination; everyone else thought
that Hilbert was right.
However I think that Turing's 1936 approach is better.
Gödel 1931
Turing 1936
Gödel's 1931 proof is very ingenious, it's a real tour de force. I have to
confess that when I was a kid trying to understand it, I could read it
and follow it step by step but somehow I couldn't ever really feel that
I was grasping it. Now Turing had a completely different approach.
Turing's approach I think it's fair to say is in some ways more
fundamental. In fact, Turing did more than Gödel. Turing not only
got as a corollary Gödel's result, he showed that there could be no
decision procedure.
You see, if you assume that you have a formal axiomatic system
for arithmetic and it's consistent, from Gödel you know that it can't
be complete, but there still might be a decision procedure. There still
might be a mechanical procedure which would enable you to decide if a
given assertion is true or not. That was left open by Gödel, but Turing
settled it. The fact that there cannot be a decision procedure is more
fundamental and you get incompleteness as a corollary.
How did Turing do it? I want to tell you how he did it because
that's the springboard for my own work. The way he did it, and I'm
sure all of you have heard about it, has to do with something called the
halting problem. In fact if you go back to Turing's 1936 paper you will
not find the words "halting problem." But the idea is certainly there.
People also forget that Turing was talking about "computable num-
bers." The title of his paper is "On computable numbers, with an ap-
plication to the Entscheidungsproblem." Everyone remembers that the
halting problem is unsolvable and that comes from that paper, but not
as many people remember that Turing was talking about computable
real numbers. My work deals with computable and dramatically un-
computable real numbers. So I'd like to refresh your memory how
Turing's argument goes.
Turing's argument is really what destroys Hilbert's dream, and it's
a simple argument. It's just Cantor's diagonal procedure (for those of
you who know what that is) applied to the computable real numbers.
That's it, that's the whole idea in a nutshell, and it's enough to show
that Hilbert's dream, the culmination of two thousand years of what
mathematicians thought mathematics was about, is wrong. So Turing's
work is tremendously deep.
What is Turing's argument? A real number, you know 3.1415926...,
is a length measured with arbitrary precision, with an infinite number of
digits. And a computable real number said Turing is one for which there
is a computer program or algorithm for calculating the digits one by
one. For example, there are programs for pi, and there are algorithms
for solutions of algebraic equations with integer coefficients. In fact
most of the numbers that you actually find in analysis are computable.
However they're the exception, if you know set theory, because the
computable reals are denumerable and the reals are nondenumerable
(you don't have to know what that means). That's the essence of
Turing's idea.
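Here is a tiny Python illustration of Turing's notion, for the simplest case of a rational number (programs for pi work the same way, just with more machinery inside the loop):

```python
from itertools import islice

def digits_of(num, den):
    """A computable real in Turing's sense: an algorithm that yields
    the decimal digits of num/den one by one, forever (long division)."""
    while True:
        num = num % den * 10
        yield num // den

print(list(islice(digits_of(1, 7), 12)))  # [1, 4, 2, 8, 5, 7, 1, 4, 2, 8, 5, 7]
```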
The idea is this. You list all possible computer programs. At that
time there were no computer programs, and Turing had to invent the
Turing machine, which was a tremendous step forward. But now you
just say, imagine writing a list with every possible computer program.
p1 G"odel 1931
p2 Turing 1936
p3
p4
p5
p6
.
.
.
If you consider computer programs to be in binary, then it's natural
to think of a computer program as a natural number. And next to
each computer program, the first one, the second one, the third one,
write out the real number that it computes if it computes a real (it may
not). But if it prints out an infinite number of digits, write them out.
So maybe it's 3.1415926 and here you have another and another and
another:
p1 3.1415926 . . . Gödel 1931
p2 . . . Turing 1936
p3 . . .
p4 . . .
p5 . . .
p6 . . .
.
.
.
So you make this list. Maybe some of these programs don't print
out an infinite number of digits, because they're programs that halt or
that have an error in them and explode. But then there'll just be a
blank line in the list.
p1 3.1415926 . . . Gödel 1931
p2 . . . Turing 1936
p3 . . .
p4 . . .
p5
p6 . . .
.
.
.
It's not really important---let's forget about this possibility.
Following Cantor, Turing says go down the diagonal and look at
the first digit of the first number, the second digit of the second, the
third. . .
p1 .[d11] d12  d13  d14  d15  d16 . . . Gödel 1931
p2 . d21 [d22] d23  d24  d25  d26 . . . Turing 1936
p3 . d31  d32 [d33] d34  d35  d36 . . .
p4 . d41  d42  d43 [d44] d45  d46 . . .
p5
p6 . d61  d62  d63  d64  d65 [d66] . . .
.
.
.
Well actually it's the digits after the decimal point. So it's the
first digit after the decimal point of the first number, the second
digit after the decimal point of the second, the third digit of the third
number, the fourth digit of the fourth, the fifth digit of the fifth. And
it doesn't matter if the fifth program doesn't put out a fifth digit, it
really doesn't matter.
What you do is you change these digits. Make them different.
Change every digit on the diagonal. Put these changed digits together
into a new number with a decimal point in front, a new real number.
That's Cantor's diagonal procedure. So you have a digit which
you choose to be different from the first digit of the first number, the
second digit of the second, the third of the third, and you put these
together into one number.
p1 .[d11] d12  d13  d14  d15  d16 . . . Gödel 1931
p2 . d21 [d22] d23  d24  d25  d26 . . . Turing 1936
p3 . d31  d32 [d33] d34  d35  d36 . . .
p4 . d41  d42  d43 [d44] d45  d46 . . .
p5
p6 . d61  d62  d63  d64  d65 [d66] . . .
.
.
.
. ~d11 ~d22 ~d33 ~d44 ~d55 ~d66 . . .
This new number cannot be in the list because of the way it was
constructed. Therefore it's an uncomputable real number. How does
Turing go on from here to the halting problem? Well, just ask yourself
why can't you compute it? I've explained how to get this number and
it looks like you could almost do it. To compute the Nth digit of this
number, you get the Nth computer program (you can certainly do that)
and then you start it running until it puts out an Nth digit, and at
that point you change it. Well what's the problem? That sounds easy.
The problem is, what happens if the Nth computer program never
puts out an Nth digit, and you sit there waiting? And that's the halting
problem---you cannot decide whether the Nth computer program will
ever put out an Nth digit! This is how Turing got the unsolvability of
the halting problem. Because if you could solve the halting problem,
then you could decide if the Nth computer program ever puts out an
Nth digit. And if you could do that then you could actually carry out
Cantor's diagonal procedure and compute a real number which has to
differ from any computable real. That's Turing's original argument.
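The nub of the argument fits in a few lines of Python. Here `nth_digit(n)` is a hypothetical subroutine that runs the Nth program until it emits its Nth digit after the decimal point; whether that call ever returns is precisely the halting problem:

```python
def diagonal_digit(n, nth_digit):
    """nth digit of the Cantor/Turing diagonal real.  The call
    nth_digit(n) may run forever if the nth program never produces
    an nth digit, and there is no general way to know in advance."""
    d = nth_digit(n)      # the step that can hang
    return (d + 1) % 10   # change it, so the result differs from the
                          # nth computable real in the nth digit
```

With an oracle for the halting problem you could guard the call and output, say, 0 whenever the Nth program never emits an Nth digit; the diagonal real would then be computable yet differ from every computable real, a contradiction.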
Why does this explode Hilbert's dream? What has Turing proved?
That there is no algorithm, no mechanical procedure, which will decide
if the Nth computer program ever outputs an Nth digit. Thus there
can be no algorithm which will decide if a computer program ever halts
(finding the Nth digit put out by the Nth program is a special case).
Well, what Hilbert wanted was a formal axiomatic system from which
all mathematical truth should follow, only mathematical truth, and
all mathematical truth. If Hilbert could do that, it would give us a
mechanical procedure to decide if a computer program will ever halt.
Why?
You just run through all possible proofs until you either find a proof
that the program halts or you find a proof that it never halts. So if
Hilbert's dream of a finite set of axioms from which all of mathematical
truth should follow were possible, then by running through all possible
proofs checking which ones are correct, you would be able to decide if
any computer program halts. In principle you could. But you can't by
Turing's very simple argument which is just Cantor's diagonal argument
applied to the computable reals. That's how simple it is!
G"odel's proof is ingenious and difficult. Turing's argument is so
fundamental, so deep, that everything seems natural and inevitable.
But of course he's building on Gödel's work.
3. The halting probability and algorithmic randomness
The reason I talked to you about Turing and computable reals is that
I'm going to use a different procedure to construct an uncomputable
real, a much more uncomputable real than Turing does.
And that's how we're going to get into much worse trouble.
How do I get a much more uncomputable real? (And I'll have to
tell you how uncomputable it is.) Well, not with Cantor's diagonal
argument. I get this number, which I like to call Ω (omega), like this:
Ω = Σ_{p halts} 2^{-|p|}
This is just the halting probability. It's sort of a mathematical pun.
Turing's fundamental result is that the halting problem is unsolvable---
there is no algorithm that'll settle the halting problem. My fundamental
result is that the halting probability is algorithmically irreducible or
algorithmically random.
What exactly is the halting probability? I've written down an ex-
pression for it:
Ω = Σ_{p halts} 2^{-|p|}
Instead of looking at individual programs and asking whether they
halt, you put all computer programs together in a bag. If you generate
a computer program at random by tossing a coin for each bit of the pro-
gram, what is the chance that the program will halt? You're thinking
of programs as bit strings, and you generate each bit by an independent
toss of a fair coin, so if a program is N bits long, then the probability
that you get that particular program is 2^{-N}. Any program p that halts
contributes 2^{-|p|}, two to the minus its size in bits, to this halting
probability.
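Although Ω itself is uncomputable, it can be approximated from below by dovetailing. Here is a sketch in Python under two assumed stand-ins: `program(i)` enumerates the bit strings that are valid self-delimiting programs, and `halts_within(p, t)` runs p for at most t steps, which is computable for each fixed t:

```python
from fractions import Fraction

def omega_lower_bounds(program, halts_within, stages=1000):
    """Yield a nondecreasing sequence of rational lower bounds on the
    halting probability: each program seen to halt contributes
    2**-(its size in bits).  The bounds converge to Omega, but so
    slowly that they never certify even one further bit of it."""
    bound = Fraction(0)
    halted = set()
    for t in range(1, stages + 1):   # dovetail, stage by stage
        for i in range(t):           # run the first t programs t steps
            if i not in halted and halts_within(program(i), t):
                halted.add(i)
                bound += Fraction(1, 2 ** len(program(i)))
        yield bound
```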
By the way there's a technical detail which is very important and
didn't work in the early version of algorithmic information theory. You
couldn't write this:
Ω = Σ_{p halts} 2^{-|p|}
It would give infinity. The technical detail is that no extension of a
valid program is a valid program. Then this sum
Σ_{p halts} 2^{-|p|}
turns out to be between zero and one. Otherwise it turns out to be
infinity. It only took ten years until I got it right. The original 1960s
version of algorithmic information theory is wrong. One of the reasons
it's wrong is that you can't even define this number
Ω = Σ_{p halts} 2^{-|p|}
In 1974 I redid algorithmic information theory with "self-delimiting"
programs and then I discovered the halting probability Ω (omega).
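The self-delimiting requirement is exactly the hypothesis of the Kraft inequality: for a prefix-free set of bit strings, the weights 2^{-|p|} sum to at most 1. A small self-contained check in Python, on a toy prefix-free set (an illustration only, not the programs of a real universal machine):

```python
from fractions import Fraction

def is_prefix_free(progs):
    """True if no program is an extension of another program."""
    return not any(p != q and q.startswith(p)
                   for p in progs for q in progs)

progs = ["0", "10", "110", "111"]  # toy bit-string programs
assert is_prefix_free(progs)

# Kraft's inequality: the weights of a prefix-free set sum to at
# most 1, so summing over just the halting programs gives a genuine
# probability between zero and one.
print(sum(Fraction(1, 2 ** len(p)) for p in progs))  # prints 1
```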
Okay, so this is a probability between zero and one
0 < Ω = Σ_{p halts} 2^{-|p|} < 1
like all probabilities. The idea is you generate each bit of a program
by tossing a coin and ask what is the probability that it halts. This
number Ω, this halting probability, is not only an uncomputable real---
Turing already knew how to do that. It is uncomputable in the worst
possible way. Let me give you some clues how uncomputable it is.
Well, one thing is it's algorithmically incompressible. If you want
to get the first N bits of Ω out of a computer program, if you want a
computer program that will print out the first N bits of Ω and then halt,
that computer program has to be N bits long. Essentially you're only
printing out constants that are in the program. You cannot squeeze
the first N bits of Ω into fewer than N bits. This
0 < Ω = Σ_{p halts} 2^{-|p|} < 1
is a real number, you could write it in binary. And if you want to get
out the first N bits from a computer program, essentially you just have
to put them in. The program has to be N bits long. That's irreducible
algorithmic information. There is no concise description.
Now that's an abstract way of saying things. Let me give a more
concrete example of how random Ω is. Emile Borel at the turn of this
century was one of the founders of probability theory.
0 < Ω = Σ_{p halts} 2^{-|p|} < 1
Emile Borel
Question: Can I ask a very simple question before you
go ahead?
Answer: Sure.
Question: I can't see why Ω should be a probability.
What if the two one-bit programs both halt? I mean, what
if the two one-bit programs both halt and then some other
program halts. Then Ω is greater than one and not a
probability.
Answer: I told you no extension of a valid program is
a valid program.
Question: Oh right, no other programs can halt.
Answer: The two one-bit programs would be all the
programs there are. That's the reason this number
0 < Ω = Σ_{p halts} 2^{-|p|} < 1
can't be defined if you think of programs in the normal way.
So here we have Emile Borel, and he talked about something he
called a normal number.
0 < Ω = Σ_{p halts} 2^{-|p|} < 1
Emile Borel --- normal reals
What is a normal real number? People have calculated pi out to a
billion digits, maybe two billion. One of the reasons for doing this,
besides that it's like climbing a mountain and having the world record,
is the question of whether each digit occurs the same number of times.
It looks like the digits 0 through 9 each occur 10% of the time in the
decimal expansion of pi. It looks that way, but nobody can prove it. I
think the same is true for the square root of 2, although that's not
as popular a number to ask this about.
Let me describe some work Borel did around the turn of the century
when he was pioneering modern probability theory. Pick a real number
in the unit interval, a real number with a decimal point in front, with
no integer part. If you pick a real number in the unit interval, Borel
showed that with probability one it's going to be "normal." Normal
means that when you write it in decimal each digit will occur in the limit
exactly 10% of the time, and this will also happen in any other base.
For example in binary 0 and 1 will each occur in the limit exactly 50% of
the time. Similarly with blocks of digits. This was called an absolutely
normal real number by Borel, and he showed that with probability one
if you pick a real number at random between zero and one it's going to
have this property. There's only one problem. He didn't know whether
pi is normal, he didn't know whether the square root of 2 is normal. In fact, he couldn't
exhibit a single individual example of a normal real number.
The first example of a normal real number was discovered by a friend
of Alan Turing's at Cambridge called David Champernowne, who is still
alive and who's a well-known economist. Turing was impressed with
him---I think he called him "Champ"---because Champ had published
this in a paper as an undergraduate. This number is known as
Champernowne's number. Let me show you Champernowne's number.
0 < Ω = Σ_{p halts} 2^{-|p|} < 1
Emile Borel --- normal reals
Champernowne
.01234567891011121314...99100101...
It goes like this. You write down a decimal point, then you write 0,
1, 2, 3, 4, 5, 6, 7, 8, 9, then 10, 11, 12, 13, 14 until 99, then 100, 101.
And you keep going in this funny way. This is called Champernowne's
number and Champernowne showed that it's normal in base ten, only
in base ten. Nobody knows if it's normal in other bases, I think it's still
open. In base ten though, not only will the digits 0 through 9 occur
exactly 10% of the time in the limit, but each possible block of two
digits will occur exactly 1% of the time in the limit, each block of three
digits will occur exactly .1% of the time in the limit, etc. That's called
being normal in base ten. But nobody knows what happens in other
bases.
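Champernowne's construction is easy to play with on a computer. A short Python sketch that generates the digits and tallies their frequencies; in any finite prefix the ratios merely drift toward 10%, since normality is a statement about the limit:

```python
from collections import Counter
from itertools import count, islice

def champernowne():
    """Digits of .0123456789101112...: write 0, 1, 2, ... in base
    ten and concatenate all their decimal digits."""
    for n in count(0):
        yield from str(n)

digits = list(islice(champernowne(), 1_000_000))
freq = Counter(digits)
for d in "0123456789":
    print(d, freq[d] / len(digits))  # each ratio tends to 0.10
```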
The reason I'm saying all this is because it follows from the fact that
the halting probability Ω is algorithmically irreducible information that
this
0 < Ω = Σ_{p halts} 2^{-|p|} < 1
is normal in any base. That's easy to prove using ideas about coding
and compressing information that go back to Shannon. So here we
finally have an example of an absolutely normal number. I don't know
how natural you think it is, but it is a specific real number that comes
up and is normal in the most demanding sense that Borel could think
of. Champernowne's number couldn't quite do that.
This number Ω is in fact random in many more senses. I would say
it this way. It cannot be distinguished from the result of independent
tosses of a fair coin. In fact this number
0 < Ω = Σ_{p halts} 2^{-|p|} < 1
shows that you have total randomness and chaos and unpredictability
and lack of structure in pure mathematics! The same way that all it
took for Turing to destroy Hilbert's dream was the diagonal argument,
you just write down this expression
0 < Ω = Σ_{p halts} 2^{-|p|} < 1
and this shows that there are regions of pure mathematics where rea-
soning is totally useless, where you're up against an impenetrable wall.
This is all it takes. It's just this halting probability.
Why do I say this? Well, let's say you want to use axioms to prove
what the bits of this number Ω are. I've already told you that it's
uncomputable---right?---like the number that Turing constructs using
Cantor's diagonal argument. So we know there is no algorithm which
will compute digit by digit or bit by bit this number Ω. But let's try to
prove what individual bits are using a formal axiomatic system. What
happens?
The situation is very, very bad. It's like this. Suppose you have a
formal axiomatic system which is N bits of formal axiomatic system
(I'll explain what this means more precisely later). It turns out that
with a formal axiomatic system of complexity N, that is, N bits in size,
you can prove what the positions and values are of at most N + c bits
of Ω.
Now what do I mean by formal axiomatic system N bits in size?
Well, remember that the essence of a formal axiomatic system is a
mechanical procedure for checking whether a formal proof follows the
rules or not. It's a computer program. Of course in Hilbert's days there
were no computer programs, but after Turing invented Turing machines
you could finally specify the notion of computer program exactly, and
of course now we're very familiar with it.
So the proof-checking algorithm which is the essence of any formal
axiomatic system in Hilbert's sense is a computer program, and just
see how many bits long this computer program is.² That's essentially
how many bits it takes to specify the rules of the game, the axioms and
postulates and the rules of inference. If that's N bits, then you may be
able to prove say that the first bit of Ω in binary is 0, that the second
bit is 1, that the third bit is 0, and then there might be a gap, and you
might be able to prove that the thousandth bit is 1. But you're only
going to be able to settle N cases if your formal axiomatic system is an
N-bit formal axiomatic system.
Let me try to explain better what this means. It means that you
can only get out as much as you put in. If you want to prove whether
an individual bit in a specific place in the binary expansion of the real
number Ω is a 0 or a 1, essentially the only way to prove that is to
take it as a hypothesis, as an axiom, as a postulate. It's irreducible
__________________________
²Technical Note: Actually, it's best to think of the complexity of a formal
axiomatic system as the size in bits of the computer program that
enumerates the set of all theorems.
mathematical information. That's the key phrase that really gives the
whole idea.
Irreducible Mathematical Information
0 < Ω = Σ_{p halts} 2^{-|p|} < 1
Emile Borel --- normal reals
Champernowne
.01234567891011121314...99100101...
Okay, so what have we got? We have a rather simple mathematical
object that completely escapes us. Ω's bits have no structure. There is
no pattern, there is no structure that we as mathematicians can com-
prehend. If you're interested in proving what individual bits of this
number at specific places are, whether they're 0 or 1, reasoning is com-
pletely useless. Here mathematical reasoning is irrelevant and can get
nowhere. As I said before, the only way a formal axiomatic system can
get out these results is essentially just to put them in as assumptions,
which means you're not using reasoning. After all, anything can be
demonstrated by taking it as a postulate that you add to your set of
axioms. So this is a worst possible case---this is irreducible mathemat-
ical information. Here is a case where there is no structure, there are
no correlations, there is no pattern that we can perceive.
4. Randomness in arithmetic
Okay, what does this have to do with randomness in arithmetic? Now
we're going back to Gödel---I skipped over him rather quickly, and now
let's go back.
Turing says that you cannot use proofs to decide whether a program
will halt. You can't always prove that a program will halt or not. That's
how he destroys Hilbert's dream of a universal mathematics. I get us
into more trouble by looking at a different kind of question, namely,
can you prove that the fifth bit of this particular real number
0 < Ω = Σ_{p halts} 2^{-|p|} < 1
is a 0 or a 1, or that the eighth bit is a 0 or a 1. But these are
strange-looking questions. Who had ever heard of the halting problem
in 1936? These are not the kind of things that mathematicians normally
worry about. We're getting into trouble, but with questions rather far
removed from normal mathematics.
Even though you can't have a formal axiomatic system which can
always prove whether a program halts or not, it might be good for
everything else and then you could have an amended version of Hilbert's
dream. And the same with the halting probability Ω. If the halting
problem looks a little bizarre, and it certainly did in 1936, well, Ω is
brand new and certainly looks bizarre. Who ever heard of a halting
probability? It's not the kind of thing that mathematicians normally
do. So what do I care about all these incompleteness results!
Well, G"odel had already faced this problem with his assertion which
is true but unprovable. It's an assertion which says of itself that it's
unprovable. That kind of thing also never comes up in real mathe-
matics. One of the key elements in Gödel's proof is that he managed
to construct an arithmetical assertion which says of itself that it's
unprovable. It was getting this self-referential assertion to be in ele-
mentary number theory which took so much cleverness.
There's been a lot of work building on Gödel's work, showing that
problems involving computations are equivalent to arithmetical prob-
lems involving whole numbers. A number of names come to mind. Julia
Robinson, Hilary Putnam and Martin Davis did some of the important
work, and then a key result was found in 1970 by Yuri Matijasevic.
He constructed a diophantine equation, which is an algebraic equa-
tion involving only whole numbers, with a lot of variables. One of
the variables, K, is distinguished as a parameter. It's a polynomial
equation with integer coefficients and all of the unknowns have to be
whole numbers---that's a diophantine equation. As I said, one of the
unknowns is a parameter. Matijasevic's equation has a solution for a
particular value of the parameter K if and only if the Kth computer
program halts.
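The logical shape of this is easy to see in miniature: solvability of a diophantine equation is semi-decidable. A brute-force search can confirm that a solution exists, but no finite search can certify that none does. A toy sketch in Python, using Pell's equation x^2 - k*y^2 = 1 as a stand-in (Matijasevic's actual equation is vastly more elaborate):

```python
def find_solution(k, bound):
    """Search whole numbers x, y below `bound` for a solution of
    x^2 - k*y^2 = 1.  Finding one settles the question with a yes;
    exhausting the bound settles nothing.  That one-sidedness is
    what ties solvability to the halting problem."""
    for x in range(1, bound):
        for y in range(1, bound):
            if x * x - k * y * y == 1:
                return (x, y)
    return None

print(find_solution(2, 100))  # (3, 2), since 9 - 2*4 = 1
```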
In the year 1900 Hilbert had asked for an algorithm which will de-
cide whether a diophantine equation, an algebraic equation involving
only whole numbers, has a solution. This was Hilbert's tenth problem.
It was tenth in his famous list of twenty-three problems. What Mati-
jasevic showed in 1970 was that this is equivalent to deciding whether
an arbitrary computer program halts. So Turing's halting problem is
exactly as hard as Hilbert's tenth problem. It's exactly as hard to de-
cide whether an arbitrary program will halt as to decide whether an
arbitrary algebraic equation in whole numbers has a solution. There-
fore there is no algorithm for doing that and Hilbert's tenth problem
cannot be solved---that was Matijasevic's 1970 result.
Matijasevic has gone on working in this area. In particular there is
a piece of work he did in collaboration with James Jones in 1984. I can
use it to follow in Gödel's footsteps, to follow Gödel's example. You
see, I've shown that there's complete randomness, no pattern, lack of
structure, and that reasoning is completely useless, if you're interested
in the individual bits of this number
0 < Ω = Σ_{p halts} 2^{-|p|} < 1
Following G"odel, let's convert this into something in elementary
number theory. Because if you can get into all this trouble in elementary
number theory, that's the bedrock. Elementary number theory, 1, 2, 3,
4, 5, addition and multiplication, that goes back to the ancient Greeks
and it's the most solid part of all of mathematics. In set theory you're
dealing with strange objects like large cardinals, but here you're not
even dealing with derivatives or integrals or measure, only with whole
numbers. And using the 1984 results of Jones and Matijasevic I can
indeed dress up Ω arithmetically and get randomness in elementary
number theory.³
What I get is an exponential diophantine equation with a param-
eter. "Exponential diophantine equation" just means that you allow
variables in the exponents. In contrast, what Matijasevic used to show
that Hilbert's tenth problem is unsolvable is just a polynomial dio-
phantine equation, which means that the exponents are always natural
number constants. I have to allow X^Y. It's not known yet whether I
actually need to do this. It might be the case that I can manage with a
polynomial diophantine equation. It's an open question, I believe that
__________________________
³To obtain the Mathematica software for doing this, send e-mail to
it's not settled yet. But for now, what I have is an exponential dio-
phantine equation with seventeen thousand variables. This equation is
two-hundred pages long and again one variable is the parameter.
This is an equation where every constant is a whole number, a
natural number, and all the variables are also natural numbers, that
is, positive integers. (Actually non-negative integers.) One of the
variables is a parameter, and you change the value of this parameter---
take it to be 1, 2, 3, 4, 5. Then you ask, does the equation have a
finite or infinite number of solutions? My equation is constructed so
that it has a finite number of solutions if a particular individual bit
of Ω is a 0, and it has an infinite number of solutions if that bit is a
1. So deciding whether my exponential diophantine equation in each
individual case has a finite or infinite number of solutions is exactly the
same as determining what an individual bit of this
0 < Ω = Σ_{p halts} 2^{-|p|} < 1
halting probability is. And this is completely intractable because Ω is
irreducible mathematical information.
Let me emphasize the difference between this and Matijasevic's work
on Hilbert's tenth problem. Matijasevic showed that there is a polyno-
mial diophantine equation with a parameter with the following proper-
ty: You vary the parameter and ask, does the equation have a solution?
That turns out to be equivalent to Turing's halting problem, and there-
fore escapes the power of mathematical reasoning, of formal axiomatic
reasoning.
How does this differ from what I do? I use an exponential diophan-
tine equation, which means I allow variables in the exponent. Matija-
sevic only allows constant exponents. The big difference is that Hilbert
asked for an algorithm to decide if a diophantine equation has a so-
lution. The question I have to ask to get randomness in elementary
number theory, in the arithmetic of the natural numbers, is slightly
more sophisticated. Instead of asking whether there is a solution, I
ask whether there are a finite or infinite number of solutions---a more
abstract question. This difference is necessary.
My two-hundred page equation is constructed so that it has a finite
or infinite number of solutions depending on whether a particular bit
of the halting probability is a 0 or a 1. As you vary the parameter, you
get each individual bit of Ω. Matijasevic's equation is constructed so
that it has a solution if and only if a particular program ever halts. As
you vary the parameter, you get each individual computer program.
Thus even in arithmetic you can find Ω's absolute lack of structure,
Ω's randomness and irreducible mathematical information. Reasoning
is completely powerless in those areas of arithmetic. My equation shows
that this is so. As I said before, to get this equation I use ideas that
start in G"odel's original 1931 paper. But it was Jones and Matijasevic's
1984 paper that finally gave me the tool that I needed.
So that's why I say that there is randomness in elementary number
theory, in the arithmetic of the natural numbers. This is an impenetra-
ble stone wall, it's a worst case. From Gödel we knew that we couldn't
get a formal axiomatic system to be complete. We knew we were in
trouble, and Turing showed us how basic it was, but Ω is an extreme
case where reasoning fails completely.
I won't go into the details, but let me talk in vague information-
theoretic terms. Matijasevic's equation gives you N arithmetical ques-
tions with yes/no answers which turn out to be only log N bits of algo-
rithmic information. (Merely knowing how many of the first N programs
halt, a number of size at most N and hence about log N bits, lets you in
principle run them all and determine which ones halt.) My equation gives
you N arithmetical questions with yes/no answers which are irreducible,
incompressible mathematical information.
5. Experimental mathematics
Okay, let me say a little bit in the minutes I have left about what this
all means.
First of all, the connection with physics. There was a big controversy
when quantum mechanics was developed, because quantum theory is
nondeterministic. Einstein didn't like that. He said, "God doesn't play
dice!" But as I'm sure you all know, with chaos and nonlinear dynamics
we've now realized that even in classical physics we get randomness
and unpredictability. My work is in the same spirit. It shows that pure
mathematics, in fact even elementary number theory, the arithmetic
of the natural numbers, 1, 2, 3, 4, 5, is in the same boat. We get
randomness there too. So, as a newspaper headline would put it, God
not only plays dice in quantum mechanics and in classical physics, but
even in pure mathematics, even in elementary number theory. So if a
new paradigm is emerging, randomness is at the heart of it. By the
way, randomness is also at the heart of quantum field theory, as virtual
particles and Feynman path integrals (sums over all histories) show
very clearly. So my work fits in with a lot of work in physics, which is
why I often get invited to talk at physics meetings.
However the really important question isn't physics, it's mathemat-
ics. I've heard that Gödel wrote a letter to his mother who stayed in
Europe. You know, Gödel and Einstein were friends at the Institute
for Advanced Study. You'd see them walking down the street together.
Apparently G"odel wrote a letter to his mother saying that even though
Einstein's work on physics had really had a tremendous impact on how
people did physics, he was disappointed that his work had not had the
same effect on mathematicians. It hadn't made a difference in how
mathematicians actually carried on their everyday work. So I think
that's the key question: How should you really do mathematics?
I'm claiming I have a much stronger incompleteness result. If so
maybe it'll be clearer whether mathematics should be done the ordinary
way. What is the ordinary way of doing mathematics? In spite of the
fact that everyone knows that any finite set of axioms is incomplete, how
do mathematicians actually work? Well suppose you have a conjecture
that you've been thinking about for a few weeks, and you believe it
because you've tested a large number of cases on a computer. Maybe it's
a conjecture about the primes and for two weeks you've tried to prove
it. At the end of two weeks you don't say, well obviously the reason
I haven't been able to show this is because of Gödel's incompleteness
theorem! Let us therefore add it as a new axiom! But if you took
G"odel's incompleteness theorem very seriously this might in fact be
the way to proceed. Mathematicians will laugh but physicists actually
behave this way.
Look at the history of physics. You start with Newtonian physics.
You cannot get Maxwell's equations from Newtonian physics. It's a
new domain of experience---you need new postulates to deal with it.
As for special relativity, well, special relativity is almost in Maxwell's
equations. But Schrödinger's equation does not come from Newtonian
physics and Maxwell's equations. It's a new domain of experience and
again you need new axioms. So physicists are used to the idea that when
you start experimenting at a smaller scale, or with new phenomena, you
may need new principles to understand and explain what's going on.
Now in spite of incompleteness mathematicians don't behave at all
like physicists do. At a subconscious level they still assume that the
small number of principles, of postulates and methods of inference, that
they learned early as mathematics students, are enough. In their hearts
they believe that if you can't prove a result it's your own fault. That's
probably a good attitude to take rather than to blame someone else,
but let's look at a question like the Riemann hypothesis. A physicist
would say that there is ample experimental evidence for the Riemann
hypothesis and would go ahead and take it as a working assumption.
What is the Riemann hypothesis? There are many unsolved ques-
tions involving the distribution of the prime numbers that can be settled
if you assume the Riemann hypothesis. Using computers people check
these conjectures and they work beautifully. They're neat formulas but
nobody can prove them. A lot of them follow from the Riemann
hypothesis. To a physicist this would be enough: It's useful, it explains
a lot of data. Of course a physicist then has to be prepared to say
"Oh oh, I goofed!" because an experiment can subsequently contradict
a theory. This happens very often.
In particle physics you throw up theories all the time and most of
them quickly die. But mathematicians don't like to have to backpedal.
But if you play it safe, the problem is that you may be losing out, and
I believe you are.
I think it should be obvious where I'm leading. I believe that ele-
mentary number theory and the rest of mathematics should be pursued
more in the spirit of experimental science, and that you should be will-
ing to adopt new principles. I believe that Euclid's statement that an
axiom is a self-evident truth is a big mistake. The Schrödinger equation
certainly isn't a self-evident truth! And the Riemann hypothesis isn't
self-evident either, but it's very useful.
So I believe that we mathematicians shouldn't ignore incomplete-
ness. It's a safe thing to do but we're losing out on results that we
could get. It would be as if physicists said, okay no Schrödinger equa-
tion, no Maxwell's equations, we stick with Newton, everything must
be deduced from Newton's laws. (Maxwell even tried it. He had a
mechanical model of an electromagnetic field. Fortunately they don't
teach that in college!)
I proposed all this twenty years ago when I started getting these
information-theoretic incompleteness results. But independently a new
school of thought on the philosophy of mathematics has emerged, called
the "quasi-empirical" view of the foundations of mathematics.
There's a book of Tymoczko's called New Directions in the Philosophy
of Mathematics (Birkhäuser, Boston, 1986). It's a good collection of
articles. Another place to look is Searching for Certainty by John Casti
(Morrow, New York, 1990) which has a good chapter on mathematics.
The last half of the chapter talks about this quasi-empirical view.
By the way, Lakatos, who was one of the people involved in this
new movement, happened to be at Cambridge at that time. He'd left
Hungary.
The main schools of mathematical philosophy at the beginning of
this century were Russell and Whitehead's view that logic was the
basis for everything, the formalist school of Hilbert, and an "intuitionist"
constructivist school of Brouwer. Some people think that Hilbert
believed that mathematics is a meaningless game played with marks of ink
on paper. Not so! He just said that to be absolutely clear and precise
what mathematics is all about, we have to specify the rules determining
whether a proof is correct so precisely that they become mechanical.
Nobody who thought that mathematics is meaningless would have been
so energetic and done such important work and been such an inspiring
leader.
Originally most mathematicians backed Hilbert. Even after Gödel
and even more emphatically Turing showed that Hilbert's dream didn't
work, in practice mathematicians carried on as before, in Hilbert's spirit.
Brouwer's constructivist attitude was mostly considered a nuisance.
As for Russell and Whitehead, they had a lot of problems getting all of
mathematics from logic. If you get all of mathematics from set theory
you discover that it's nice to define the whole numbers in terms of sets
(von Neumann worked on this). But then it turns out that there's all
kinds of problems with sets. You're not making the natural numbers
more solid by basing them on something which is more problematical.
Now everything has gone topsy-turvy. It's gone topsy-turvy, not
because of any philosophical argument, not because of Gödel's results
or Turing's results or my own incompleteness results. It's gone topsy-
turvy for a very simple reason---the computer!
The computer as you all know has changed the way we do every-
thing. The computer has enormously and vastly increased mathemati-
cal experience. It's so easy to do calculations, to test many cases, to run
experiments on the computer. The computer has so vastly increased
mathematical experience, that in order to cope, people are forced to
proceed in a more pragmatic fashion. Mathematicians are proceeding
more pragmatically, more like experimental scientists do. This new ten-
dency is often called "experimental mathematics." This phrase comes
up a lot in the field of chaos, fractals and nonlinear dynamics.
It's often the case that when doing experiments on the computer,
numerical experiments with equations, you see that something happens,
and you conjecture a result. Of course it's nice if you can prove it.
Especially if the proof is short. I'm not sure that a thousand page
proof helps too much. But if it's a short proof it's certainly better
than not having a proof. And if you have several proofs from different
viewpoints, that's very good.
But sometimes you can't find a proof and you can't wait for someone
else to find a proof, and you've got to carry on as best you can. So now
mathematicians sometimes go ahead with working hypotheses on the
basis of the results of computer experiments. Of course if it's physicists
doing these computer experiments, then it's certainly okay; they've
always relied heavily on experiments. But now even mathematicians
sometimes operate in this manner. I believe that there's a new journal
called the Journal of Experimental Mathematics. They should've put
me on their editorial board, because I've been proposing this for twenty
years based on my information-theoretic ideas.
So in the end it wasn't Gödel, it wasn't Turing, and it wasn't my
results that are making mathematics go in an experimental mathemat-
ics direction, in a quasi-empirical direction. The reason that mathe-
maticians are changing their working habits is the computer. I think
it's an excellent joke! (It's also funny that of the three old schools of
mathematical philosophy, logicist, formalist, and intuitionist, the most
neglected was Brouwer, who had a constructivist attitude years before
the computer gave a tremendous impulse to constructivism.)
Of course, the mere fact that everybody's doing something doesn't
mean that they ought to be. The change in how people are behaving
isn't because of Gödel's theorem or Turing's theorems or my theorems,
it's because of the computer. But I think that the sequence of work
that I've outlined does provide some theoretical justification for what
everybody's doing anyway without worrying about the theoretical jus-
tification. And I think that the question of how we should actually
do mathematics requires at least another generation of work. That's
basically what I wanted to say---thank you very much!
Bibliography
[1] G. J. Chaitin, Algorithmic Information Theory, revised third
printing, Cambridge University Press, 1990.
[2] G. J. Chaitin, Information, Randomness & Incompleteness,
second edition, World Scientific, 1990.
[3] G. J. Chaitin, Information-Theoretic Incompleteness, World
Scientific, 1992.
[4] G. J. Chaitin, "Exhibiting randomness in arithmetic using Math-
ematica and C," IBM Research Report RC-18946, 94 pp., June
1993. (To obtain this report in machine readable form, send e-