Prime Numbers Algorithms – A beginner’s guide

Hello guys, welcome back to “code with asharam”. I will continue with my mathematics for Competitive Programming Series in this post. Prime Numbers is I guess most frequent topics in competitive programming. Almost in every contest, you can find something related to primality testing, seive of eratosthenes or finding prime factors. Therefore, it becomes very important for us to learn about prime numbers algorithms. In this post, I will discuss various methods of primality testing and algorithms for finding prime numbers. So, without any further delay, gear up to take a dive into deep and insightful oceans of mathematics.

Prime Numbers – A basic Introduction

Any positive integer which have exactly two distinct factors is called a prime number. One of the factor is $1$ and other factor is the number itself. For example, $5, 3, 11$ are prime numbers while $1, 4, 8$ are not prime numbers. Positive integers like $4, 8, 892$ which has more than two factors are called composite numbers. Note that $1$ is neither a prime nor a composite number.

Some Interesting Properties of Primes

  1. All the primes except $2$ are odd positive integers. This is due to the fact that every even integer other than 2 always have at least 3 factors – $1, 2$ and the number itself.
  2. There are infinitely many primes.
  3. The number of primes till $n$ are approximately equal to $n/\log{n}$ when $n$ is large. You can get more accurate information by Offset logarithmic integral but it is rarely used. That’s why, you don’t need to learn it.
  4. Every composite number can be written as product of prime numbers. This is called prime factorization and those primes are called prime factors of that number. For example, $8 = 2*2*2$, $6 = 2*3$.
  5. GCD of 2 primes is always 1.
  6. If a prime $p$ divides $a*b$, then, $p$ divides at least one of $a$ and $b$. For example, $2 | (6*9) \implies 2 | 6$.

That’s all for properties of prime numbers. It’s now time to discuss various algorithms for checking prime numbers, commonly known as primality testing algorithms. I will discuss them in the increasing order of their efficiency.

Primality Testing using Wilson’s theorem

According to Wilson’s theorem, a positive integer $p$ is prime if and only if $$(p-1)! \equiv 1 \mod p$$ Above result looks very beautiful at first sight but it requires $O(p)$ time for testing. What makes it more worse is the fact that it involve $(p-2)$ modular multiplication which increases the constant factor in the time complexity expression.

Primality Testing using School Method

As per definition, prime numbers have only 2 distinct factors. Therefore, one can simply run a loop from $1$ to $p$ and count the number of factors. Again, this algorithm for prime numbers testing requires $O(p)$ time and is highly inefficient.

Primality Testing using Optimized School Method

All of you know that if an integer $x$ divides $n$, then, $n/x$ also divides $n$. Therefore, we just need to run the loop from $1$ to $\sqrt{p}$ to count the number of factors of $p$. This will decrease time complexity to $O(\sqrt{p})$ which is really good compared to last two methods.

Primality Testing using Fermat’s Thoerem

According to Fermat’s theorem, if $n$ is a prime number, then, following condition holds for every positive integer $a < n$ coprime to $p$ $$a^{p-1} \equiv 1 \mod p$$
Note that some composite numbers also passes this test. But don’t be afraid because even after this flaw, accuracy of Fermat’s test is more than $99\%$. We can increase the accuracy of our algorithm by performing the test for multiple values of $a$. However, there are some stubborn composite numbers which passes the test for all the values of $a < n$. These numbers are called Carmichael number. This is the one and only major weakness of this algorithm. But don’t worry because the number of Carmichael numbers is very less. For extra knowledge, the number of Carmichael numbers upto $n$ can be approximated by
$$\exp{(\frac{\log{n}\log{\log{\log{n}}}}{\log{\log{n}}})}$$
We can easily compute $a^{p-1}\%p$ in $O(\log{n})$ time using binary exponentiation technique. If we perform Fermat’s test $k$ times, then, time complexity of our algorithm will be $O(k*\log{n})$ time which is a huge improvement compared to last few methods. Now, let’s discuss an improvement of Fermat’s test which will work even for Carmichael numbers with very high accuracy.

Primality Testing using Miller-Rabin test

Before discussing the test, let me tell you an important property.
Let’s say for an integer $x$ and prime number $p$, we have $x^2 \equiv 1 \mod p$
\[
\begin{array}{c}
x^2 \equiv 1 \mod p \\
\implies x^2 – 1 \equiv 0 \mod p \\
\implies (x-1)*(x+1) \equiv 0 \mod p \\
\end{array}
\]
Hence, we see that $p$ divides $(x-1)*(x+1)$. From property $6$ we discussed before, we can say that $p$ should divide at least one of $(x-1)$ or $(x+1)$. Therefore, we can conclude that $x\%p$ should be equal to $1$ or $-1$ ($-1$ can be interpreted as $(p-1)$). Let’s get back to our test now.
So, we don’t need to do Primality testing for even numbers and $2$ is a prime number is an universal fact. From now on we will focus on only odd primes.
Since $p$ is odd, $(p-1)$ will be even and hence, we can write $(p-1)$ as $2^s*d$ where $s$ is a positive integer and $d$ is a positive odd integer. For example, you can write $8 = 2^3*1$, $6 = 2^1*3$. Now, let’s go back to Fermat’s test.
\[
\begin{array}{c}
a^{p-1} \equiv 1 \mod p \\
a^{2^s*d} \equiv 1 \mod p \\
(a^{2^s*d} – 1) \equiv 0 \mod p \\
(a^{2^{(s-1)}*d} – 1) * (a^{2^{(s-1)}*d} + 1) \equiv 0 \mod p \\
(a^{2^{(s-2)}*d} – 1) * (a^{2^{(s-2)}*d} – 1)* (a^{2^{(s-1)}*d} + 1) \equiv 0 \mod p \\
(a^{d}-1)*(a^{2d}+1)*(a^{4d}+1)*…*(a^{2^{(s-1)}*d} + 1) \equiv 0 \mod p
\end{array}
\]
Since $p$ divides the final product we got, $p$ should divide at least one of the terms of product. It means that one of the following condition should hold:
$$a^d \equiv 1 \mod p$$
or
$$a^{2^r*d} \equiv -1 \mod p\text{ } \forall r\epsilon(0, s)$$
First condition accounts for divisibility of first term of product and second condition accounts for remaining terms.

Why Miller-Rabin test is better than Fermat’s test?

This is a natural question that arises in mind because we have only modified the equation given by Fermat. What have we achieved by modifying it? So, let me tell you about something which is hidden in these beautiful equations.
Let’s say $p$ is composite. Since, $p$ is composite, it is possible that $p$ divides the product by dividing factors from two or more terms in the product. Hence, in this way $p$ will pass Fermat’s test even when it is composite. But in Miller-Rabin test, we are forcing divisibility of at least single term. That’s why Miller-Rabin test is more strong than Fermat’s test.
Probability of Miller-Rabin test accepting Carmichael numbers as prime is proved to be equal to $1/4$. Hence, if we perform Miller-Rabin test $k$ times, then, probability will decrease to $1/4^{-k}$.
Time complexity of Miller-Rabin test is $O(k*(\log{n})^3))$ which can be improved but I am not discussing the details in this post.
IMPORTANT NOTE FOR USING MILLER-RABIN IN CP
Testing for $a = 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37$ will give $100\%$ accurate results for all $n<2^{64}$.
There are some more efficient prime numbers testing algorithms such as Solovay Strassen test but they are not of much use in CP. That’s why, I am leaving them for you to explore.
There is one more famous problem related to prime numbers, i.e., finding all prime numbers less than $n$. You can solve it in $O(n^2)$ or $O(n*\sqrt{n})$ or $O(n*k*(\log{n})^3))$ by using different methods discussed above but there is one more efficient method which I am going to discuss now.

Seive of Eratosthenes

Let’s say we want to find prime numbers till $n$, then, the algorithm will be as follows:

  1. Create a boolean array isPrime[] of size $n$ and initialize all the elements except First element to be True.
  2. Now, start iterating the array. When we encounter an element with True value, it will be a prime number. Now, you have to assign False to all of its multiples except itself.
  3. When iteration is over, you will be left with some elements still having true value in isPrime[] array and they will be the prime numbers from $1$ to $n$.

GIF taken from Wikipedia explains the algorithm in beautiful manner.

Explaining Seive Of Eratoshtenes (Algorithms for finding prime numbers)
By SKopp at German WikipediaOwn work, Original image at Image:Animation_Sieve_of_Eratosth.gif, CC BY-SA 3.0, Link

The time complexity of simple seive is $O(n*\log{n}\log{log{n}})$. There are some more powerful version of seive which I am not discussing in this post.
That’s all for theory part. Now, let’s write code to implement these algorithms which you can directly use in your CP contest.

#include <bits/stdc++.h>

using namespace std;

vector<int> a_for_miller_rabin_and_fermat = 
            {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37};		
vector<int> prime;

bool wilson_test(int n) {
    if (n==1) return false;
    int prod = 1;
    for (int i=2; i<n; i++) prod = (prod*i)%n;
    return prod==1;
}

bool school_test(int n) {
    int count_factors = 0;
    for (int i=1; i<=n; i++) count_factors += (n%i==0);
    return count_factors==2;
}

bool optimized_school_test(int n) {
    int count_factors = 0;
    for (int i=1; i*i<=n; i++) {
        if (n%i==0) {
            ++count_factors;
            if (i*i != n) ++count_factors;
        }
    }
    return count_factors==2;
}

bool expo_mod(int x, int n, int m) {
    if (n==0) return 1;
    int y = expo_mod(x, n/2, m)%m;
    y = (y*y)%m;
    if (n%2) return (y*x)%m;
    return y;
}

/* You can take random values of a but I am taking 
   values which will be give more accurate results */
   
bool fermat_test(int n) {
    if (n==1) return false;
    if (n==2) return true;
    
    for (auto a : a_for_miller_rabin_and_fermat) {
        // assert a<n
        if (a>=n) break;
        // assert (n, a are coprime)
        if (__gcd(n, a) > 1) return false;
        // assert Fermat's theorem
        if (expo_mod(a, n-1, n) != 1) return false;
    }
    // passes all test
    return true;
}

bool miller_rabin_test(int n) {
    // Finding d
    int d = n-1;
    while (d%2 == 0)  d/=2;
    
    for (auto a : a_for_miller_rabin_and_fermat) {
        // assert a<n
        if (a>=n) break;
        // assert (n, a are coprime)
        if (__gcd(a, n) > 1) return false;
        
        int expo_d = expo_mod(a, d, n);
        // check for this a is over if condition 1 is true
        if (expo_d==1) continue;
        // check for condition 2
        bool ok = false;
        while (d != n-1) {
            expo_d = (expo_d * expo_d) % n, d *= 2;
            // Condition 2 satisfied
            if (expo_d == n-1) {
                ok = true;
                break;
            }
        }
        // Miller-rabin failed for this a
        if (!ok) return false;
    }
    // All passed
    return true;
} 

void sieve(int n) {
    bool isPrime[n+1];
    for (int i=1; i<=n; i++) isPrime[i] = 1;
    isPrime[1] = 0;
    for (int i=2; i<=n; i++) {
        // assert isPrime == True
        if (!isPrime[i]) continue;
        /* j*(i-1) is a factor of (i-1) and was covered before
           Therefore start from i*i */
        for (int j=i*i; j<=n; j+=i) isPrime[j] = 0;
    }
    for (int i=1; i<=n; i++) if (isPrime[i]) prime.push_back(i);
}

int main() {
    
    cout << wilson_test(17) << " " <<  wilson_test(100) << "\n";
    cout << school_test(17) << " " << school_test(100) << "\n";
    cout << optimized_school_test(17) << " " << optimized_school_test(100) << "\n";
    /* Note that some of carmichael numbers are not passing fermat's test due to 
       appropriate choice of a and the condition gcd(a, n) = 1*/
    cout << fermat_test(17) << " " << fermat_test(100) << " " << fermat_test(561) << "\n";
    cout << miller_rabin_test(17) << " " << miller_rabin_test(100) <<  " " << 
                                            miller_rabin_test(561) << "\n";
    sieve(20);
    for (auto p : prime) cout << p << " ";
}

 
So, that’s all for prime numbers algorithms from my side. Now, it is your turn to go to codechef or codeforces and search for problems with prime, seive, primality tag and start solving them. Thank you guys for reading so far.
You can subscribe to my YouTube channel for video tutorials on competitive programming.
You can connect with me on LinkedIn. To get all of your queries answered, you can message me on Quora. Follow me on medium for more of my writings.

2 thoughts on “Prime Numbers Algorithms – A beginner’s guide”

  1. Can you help me? I read through what I could understand about prime numbers. I was doing some research for prime numbers for a children’s book I was writing.

    I think I have found some interesting concepts about Prime numbers relating to 6k + or – 1 (all the possible prime numbers greater than 3).

    That equation actually creates 2 sequences of possible prime numbers:
    Group A: 5 11 17 23 29 35 41 47 53 59 65 71 77 83 89 95 101 107 113 119
    Group B: 7 13 19 25 31 37 43 49 55 61 67 73 79 85 81 97 103 109 115 121

    To my knowledge no one has ever been able to identify the composite numbers in this list.
    to identify the composite numbers in A and B: Tables can be created which create Columns of Composite Numbers for each specific PN while also eliminating the multiples of 2 and 3 altogether. PN + 6(PN)

    PN 5 Table (30)
    5 7 11 13 17 19 23 25 29 31
    35 37 41 43 47 49 53 55 59 61
    65 67 71 73 77 79 83 85 89 91
    95 97 101 103 107 109 113 115 119 121
    CC CC

    1st Composite Column: A x B 2nd Composite Column: A x A

    PN 7 Table (42)
    7 11 13 17 19 23 25 29 31 35 37 41 43
    49 53 55 59 61 65 67 71 73 77 79 83 85
    91 95 97 101 103 107 109 113 115 119 121 125 127
    CC CC

    1st Composite Column: B x B 2nd Composite Column: B x A (Duplicate Column)

    These Composite Columns identify all the composite in the list of Possible Prime Numbers
    Example: Identifying the composite numbers in this list to 200:
    5 (30): 25, 35, 55, 65, 85, 95, 115, 125, 145, 155, 175, 185
    7 (42:): 49, 91, 133, 175 The second column in Group B sequence are B x A which are all duplicate calculations.
    11 (66): 77, 121, 143, 187
    13 (78): 91, 169,
    19 (102): 121

    To reduce duplication calculation can start at the square of the PN in its table and the second Column of Composite Numbers in Group B can be eliminated

    PN + 30(PN) can go further and eliminate the multiples of 5 in this table:
    For Group A : List The PN from 7 to 31, multiply these x’s the PN then add 30PN to create the Composite Columns
    Example: 11
    7 11 13 17 19 23 29 31 x’s the PN =
    77 121 143 187 209 253 319 341
    407 451 473 517 539 583 649 671

    For Group B: List Only the Group B Prime Numbers up to 31, and repeat the same calculations as in Group A
    Example: 7
    7 13 19 31 x’s the PN =
    49 91 119 217
    259 301 329 427
    469 511 539 637

    The process can proceed by multiplying the multiple by the next prime number 2*3*5*7*11*13*17…
    This could create a linear calculation of True prime numbers up to the square of the next PN and there would be no need to calculate further (I believe). So
    2*3*5: PNs up 49
    2*3*5*7: PNs up to 121
    2*3*5*7*11*13: PNs up to 289
    With only 233 (PNs up to 997) calculation you would identify all the PNs up to 1,018,080

    Another way would be to use the 2 composite column multiples as repeated patterns of sequence for each PN as the grow exponentially. Starting at, and the gap in sequence of Possible Prime Numbers including multiples of 5.
    5: 7 and 3
    7: 9 and 5
    11: 15 and 7
    13: 17 and 9
    17: 23 and 11
    19: 25 and 13
    23: 31 and 15
    29: 39 and 19

    I am trying to go as far as I can with my limited knowledge of math. All this is very basic but as far as I can tell I have not seen this anywhere else. I am having a difficult time in finding others who can do more with this. If nothing else it is interesting and the tables look fantastic.

    I hope to some day get someone’s attention. I even think for students this is much more interesting than factoring and shows a beauty and symmetry that we usually don’t see in PNs.

    Some of my ideas may be off but I have done my best to double check my math. Even with this email I had to do most of this in my head with just a couple of reference to my Excel sheets.
    Please help me.
    Barry Brun

    1. Hey Barry! the symmetry you have observed is pretty good. It will be interesting for students but it doesn’t sound much interesting from mathematics perspective. We can explain this symmetry as follows:
      Let’s say p is a prime.
      p = 6*m +/- 1,
      Now, consider the following expression you are using to reject all composite numbers in the table.
      p+k*6*p
      => 6*m +/- 1 + k*6*p
      => 6*(p*k+m) +/- 1
      Now, replace p*k+m with n, we will get
      6*n +/- 1
      This is the reason for the symmetry.
      Also, something similar is on the wikipedia.

Leave a Reply

%d bloggers like this: