This document presents one of the oldest results of computer science. This result dates back to a time when computers did not exist yet, and when computer science was starting its existence, as a branch of mathematical logic now known as computability theory.
The result presented here was published by Alonzo Church in The calculi of lambda-conversion, 1941.
Primitive recursive (PR) functions are a subset of computable functions, defined as follows, with the "obvious" conditions on function arity.
I, S, C, and Z are PR, with:
If G, H1, ..., Hn are PR,
then F = G*(H1, ..., Hn) is PR, with:
F(x11, ..., x1p1, ..., xn1, ..., xnpn) =
G (H1 (x11, ..., x1p1), ..., Hn (xn1, ..., xnpn))
If G is PR, then F = G[s1, ..., sn]p is PR, with:
F(x_1, ..., x_p) = G(x_s1, ..., x_sn)
Examples:
C can be defined as Z[]1
Projection (extracting the n-th argument out of m) is
P_n,m = I[n]m
If G1 and G2 are PR,
then F = f(G1, G2) is PR, with:
F(x1, ..., xn, 0) = G1 (x1, ..., xn)
and
F(x1, ..., xn, k+1)
= G2 (x1, ..., xn, k, F(x1, ..., xn, k))
Note: I used "f" as the name of this operator because this is nothing more than a "for" loop.
Example: I can be defined as f(Z,S[1]2)
We represent "false" as 0, and "true" as any non-zero number.
divides(x,y,z) is true if and only if there exists t < z such that x = y.t.
Here, the function G2 is:
G2(x,y,z,t) = if (equal (x, mul (y,z)), one, t)
G2(x,y,z,t) = if (x = y*z) then 1 else t
Easy enough, no ? Let's go on...
No ? Maybe now is a good time to have a look at the first few theorems
in the proof, just to get the hang of it.
For y > 1, multiple(x,y) is true if and only if x is a multiple of y.
primaux(x,y) is false if and only if there is an n in [2,y+1] that divides x.
isprime(x) is true if and only if x is prime.
case (e, t3, c3, t2, c2, t1, c1) is
t1 if c1 != 0
t2 if c1 = 0 and c2 != 0
t3 if c1 = 0 and c2 = 0 and c3 != 0
e otherwise
This function will be useful for making the next one more readable.
nextprime(x) is the least prime number greater than x.
This function needs a little explaining:
It is composed of two parts. The right-hand part is
*(I,add)[1,1,1]1. It takes one argument (x) and
gives two arguments (x and 2x) to the left-hand part.
The left-hand part is a loop (f) with
G2 = case*(Z,I,isprime,I,I,Z,not*(sub))[2,2,3,3,2,1]3)
G2 takes three arguments: x, k, and r. k is the loop
index. In readable notation, G2 is:
G2(x,k,r) = 0 when k <= x
G2(x,k,r) = r when k > x and r != 0
G2(x,k,r) = k when k > x and r = 0 and isprime(k)
G2(x,k,r) = 0 otherwise
r is the result of G2 on the (k-1)th turn of the loop. If it
is not zero, then it is the least prime number greater than x, and
G2 passes it along. If it is zero and k is prime, then k must
be the least prime number greater than x.
At last, here is the interesting function:
nthprime(0) = 2
nthprime(1) = 3
nthprime(2) = 5
etc.
Let us inline all intermediate definitions:
nthprime =
f(S*(S*(Z)),f(C,f(f(f(I[1]2,I[2]4)[1,2,3]4,I[4]6) [1,2,3,4,5]6,I [6] 8)*(Z,I ,f( S*( C), f(I [2]2,I[1]4)[2,3,1]3 *(f(Z[]2,f(I[2 ]2, I[ 1]4)[2 ,3, 1] 3* (f (f( S* (Z) ,Z[]2),Z[]3)*(f(I, f(Z,I[1]2)[3]3 ),f (I,f(Z ,I [ 1 ]2 )[ 3]3))[1,2,2,1]2*(I, f(C,f(I,S[3]3) [1,3]3 )),S*( Z), I) )[ 1, 2,1 ]2 *( I,S*(S)),Z,I))*(I,f (Z,I[1]2)*(f(Z ,I[1]2 )) )[1 ,1 ]1, I, I,Z ,f (S* (Z),Z[]2)*(f(I,f(Z ,I[1]2)[3]3)))[2,2,3,3,2,1]3)*(I,f(I,S[3]3))[2,2,2]2)
It's not easy to trust the correctness of such an unreadable program, so I tested it, with an interpreter, written in Caml Light:
$ ./evalrp nthprime 0 >>>>>>> 2 $ ./evalrp nthprime 1 >>>>>>> 3 $ ./evalrp nthprime 2 >>>>>>> 5 $ ./evalrp nthprime 3 >>>>>>> 7 $ ./evalrp nthprime 4 >>>>>>> 11 $ ./evalrp nthprime 5 >>>>>>> 13 $ ./evalrp nthprime 6 >>>>>>> 17 $ ./evalrp nthprime 7 >>>>>>> 19 $ time ./evalrp nthprime 8 >>>>>>> 23 real 91.6 user 91.2 sys 0.1
After that, it becomes unreasonably slow.
I have written a (very detailed) proof of correctness for the above formula.