Computing $\pi(x)$, the number of primes $p \leq x$ is a classic problem in algorithmic number theory. The prime number theorem describes the asymptotic growth of this function, and states that
The theorem was independently conjectured by Legendre and Gauss, and proved by Hadamard and de la Vallée Pussin around 100 years later in 1896. The theorem can also be written as follows, which provides a refined numerical approximation:
where $\text{Li}(x)$ denotes the logarithmic integral.
While robust numerical approximations are available, it is particularly tedious to calculate $\pi(x)$ exactly. I was working on Project Euler’s Problem 543, and found that an efficient solution requires a fast algorithm to compute $\pi(x)$. I originally just generated the primes using a sieve and counted them, which is perhaps the most straightforward way to approach the problem. To my surprise, I found that implementing the MeisselLehmer algorithm resulted in a ~120x speedup in runtime.
In this article, I will discuss the mathematical basis of the MeisselLehmer algorithm and provide an implementation in Python.
Algorithm Overview
In the 1870s, the German astronomer Ernst Meissel discovered a combinatorial method to compute $\pi(x)$, which was extended and simplified by Derrick H. Lehmer in 1959 ^{1}.
The MeisselLehmer algorithm allows us to compute $\pi(x)$ in $O\left (\frac{x}{(\ln x)^4} \right )$ time using $O\left (\frac{x^{1/3}}{\ln x} \right )$ space. Let $p_1, p_2, \dots$ denote the prime numbers. The algorithm allows us to compute $\pi(x)$ as follows:
where
Before diving into the mathematics, I will first discuss other simpler approaches to compute $\pi(x)$, due to Eratosthenes and Legendre.
Sieve of Eratosthenes – A Naive Algorithm
The simplest way to go about computing $\pi(x)$ is to just generate the primes and count them. A wellknown and ancient algorithm for doing so is the sieve due to the Greek mathematician Eratosthenes. The algorithm works by filtering out multiples of each prime, which must be composite. This algorithm has a time complexity of $O (x \log \log x)$ and generally requires $O(x)$ space.
Here is a basic implementation in Python:
The primesieve software package provides an optimized C++ implementation of the Eratosthenes sieve.
Legendre’s Formula
Legendre’s formula to compute $\pi(x)$ is as follows:
It is an improvement over sieving, as it does not require the calculation of all of the primes $\leq x$. Still, it is not very computationally efficient, as using it involves the calculation of approximately $\frac{6x(1\log 2)}{\pi^2}$ terms^{2}. It is nevertheless useful to examine, as it is similar to the MeisselLehmer algorithm we will discuss next.
This formula is a direct consequence of the inclusionexclusion principle. It uses the observation that the number of primes in $[1, x]$ plus 1 (the quantity on the left of the equation) is equal to the number of integers minus the number of composite numbers in the interval.
Let’s take a closer look. Over the interval $[1, x]$, the quantity $\left \lfloor \frac{x}{p} \right \rfloor$ counts the integers divisible by $p.$ Noting that every composite number in the interval must have some prime factor $\leq \sqrt{x}$, we start by subtracting $\sum_{p_i \leq \sqrt{x}} \left \lfloor \frac{x}{p_i} \right \rfloor$ multipes of primes $\leq \sqrt{x}$ in the interval. But this will subtract the multiples $1 \cdot p_i$, which are actually prime, so we must compensate by adding the term $\pi(\sqrt{x})$.
To account for the rest of the terms, observe that some of the composite numbers are divisible by two primes $\leq \sqrt{x}$, call them $p_i$ and $p_j$. These numbers will be double counted as multiples of both $p_i$ and $p_j$. Hence, we must adjust the total by adding the number of integers of this type, which is $\sum \left \lfloor \frac{x}{p_i p_j} \right \rfloor$. But adding this term will now remove the count of integers that are divisible by three different primes, which explains the next term. Continuing this reasoning, Legendre’s result follows.
MeisselLehmer Algorithm
We will now turn to the main subject of the article, the MeisselLehmer algorithm. For convenience, we will first introduce some notation. Let $\phi(x, a)$ denote the partial sieve function, defined as the number of integers $\leq x$, with no prime factor less than or equal to $p_a$. In set notation, we can expression this as
The name comes from the fact that $\phi(x, a)$ counts the numbers $\leq x$ that are not struck off when sieving with the first $a$ primes. Note that this allows us to rewrite Legendre’s formula as
where $a = \pi(\sqrt{x})$.
Now, let $P_k(x, a)$ denote the $k$th partial sieve function, which is defined as the number of integers $\leq x$ with exactly $k$ prime factors, with none less than or equal to $p_a$. For convenience, we define $P_0(x, a) = 1$. It then follows that
Note that the right hand side’s sum will contain a finite number of nonzero terms. Observing that
it follows that if we can compute $\phi(x, a)$, $P_2(x, a)$, $P_3(x, a)$, and so on, we can obtain the value of $\pi(x)$. We will now consider the two subproblems of computing $P_i(x, a)$ and $\phi(x, a)$.
Computing $P_i(x, a)$
We will begin by considering $P_2(x, a)$, defined as the number of integers in the interval $[1, x]$ which are products of two primes $p_i, p_j > p_a$. It follows that
Let $b = \pi(\sqrt{x})$. If we choose $a < b$, the above sum is equivalent to
where we have applied the arithmetic series formula. In a similar fashion, we can compute $P_3(x, a)$ as follows:
Let $b_i = \pi \left ( \sqrt{\frac{x}{p_i}} \right )$, and let $c = \pi(x^{1/3})$. Assuming that $a < c$ (so that $P_3$ is nonzero), it follows that:
Finally, if we choose $a = \pi(x^{1/4})$, $P_i(x, a) = 0$ for $i > 3$. Now, we need only compute $\phi(x, a)$ to obtain the MeisselLehmer formula in its full form.
Computing $\phi(x, a)$
The key to computing $\phi$ is the observation that
which follows from the definition of $\phi$: the integers not divisible by any of the primes $p_1, \dots, p_a$ are exactly those integers which are not divisible by any of $p_1, p_2, \dots, p_{a1}$, excluding those that are not divisible by $p_a$. Repeatedly applying this identity will eventually lead to $\phi(x, 1)$, which is just the number of odd numbers $\leq x$.
In the implementation below, $\phi$ is computed using a memoized recursive procedure. It turns out that one can make this computation more efficient by applying a truncation rule during the recursive chain. The details of how to do this are somewhat involved, but the interested reader can refer to ^{2} and ^{3}.
Further Reading
While the MeisselLehmer algorithm is quite fast for most practical purposes, there are algorithms that are known to be more efficient. Based on the work on Meissel and Lehmer, in 1985 Lagarias, Miller and Odlyzko found an algorithm^{2} requiring $O( \frac{x^{2/3}}{\log x} )$ time and $O(x^{1/3} \log^2 x)$ space. Lagarias and Odlyzko have also published a method describing how to compute $\pi(x)$ using an analytic approach^{4}.
In 1996, Deléglise and Rivat^{5} ^{6} refined the LagariasMillerOdlyzko method allowing one to save a factor of $\log x$ in the time complexity, at the cost of an increase by a factor of $\log x \log \log x$ in space. In 2001, Gourdon^{7} published refinements to the DelégliseRivat method that saves constant factors in time and space complexity.
Kim Walisch’s excellent primecount
software package provides highly optimized C++ implementations of many of these algorithms, with support for OpenMP parallelization. In September of 2015, this software was used to produce a recordbreaking computation of $\pi(10^{27})$, using 16core EC2 r3.8xlarge instances and a 36core Xeon server. Using the DelégliseRivat method, the computation took 23.03 CPU core years, and the peak memory usage was 235 gigabytes!
References

D. H. Lehmer, “On the exact number of primes less than a given limit”, Illinois Journal of Mathematics, vol. 3, pp. 381–388, 1959. ↩

Lagarias, J. C., V. S. Miller, and A. M. Odlyzko. “Computing $\pi(x)$: The MeisselLehmer Method.” Mathematics of Computation. 44.170 (1985): 537. Web. ↩ ↩^{2} ↩^{3}

Riesel, Hans. Prime Numbers and Computer Methods for Factorization. Boston: Birkhäuser, 1985. ↩

Lagarias, J.C., and A. M. Odlyzko. “Computing $\pi(x)$: An Analytic Method.” Journal of Algorithms 8.2 (1987): 17391. Web. ↩

Deleglise, M., and J. Rivat. “Computing $\pi(x)$: The Meissel, Lehmer, Lagarias, Miller, Odlyzko Method.” Mathematics of Computation. 65.213 (1996): 23546. Web. ↩

Silva, T. “Computing $\pi(x)$: the combinatorial method.” Revista do Detua. 4.6 (2006): 759. ↩

X. Gourdon, “Computation of $\pi(x)$: Improvements to the Meissel, Lehmer, Lagarias, Miller, Odlyzko, Deléglise and Rivat method.” Available from numbers.computation.free.fr/Constants/Primes/Pix/piNalgorithm.ps. ↩