Aug
13
2009

## Calculate Prime Numbers: Sieve of Eratosthenes

ColdFusion, Mathmatics, Performance
Ahh... the quintessential math problem-- finding prime numbers. Last week while tinkering with a math challenge I needed to find all of the primes up to a given number. There was a version on cflib.org, but I thought I could do it in less code, so I dug in myself.I turned to Google for an algorithm and found a simple one called the Sieve of Eratosthenes. It works by making a list of all the numbers up to your max and systematically eliminating all the numbers which have have divisors other than 1 and themselves (primes). This is copied from Wikipedia:
To find all the prime numbers less than or equal to a given integer n by Eratosthenes' method:

Please check out this post where I made some more performance modifications to my function at a later date:

Generating Primes Revisited: My Modifications To The Sieve of Eratosthenes

- Create a list of consecutive numbers from two to n, (2, 3, 4, ..., n).
- Initially, let p equal 2, the first prime number.
- Strike from the list all multiples of p less than or equal to n.
- Find the first number remaining on the list greater than p (this number is the next prime); replace p with this number.
- Repeat steps 3 and 4 until p2 is greater than n.
- All the remaining numbers on the list are prime.

[code]function getPrimes(num) { var i = 1; var p = 1; var m = 0; var t = 0; var l = 0; var arr = []; var arr2 = []; while (++i <= num) arr[i-1] = i; while(p<arrayLen(arr)) { i = 0; arr2 = arr; while (++i <= arrayLen(arr2)) { m = arr[p]*arr2[i]; t = arr.indexOf(m); if(t>-1) arrayDeleteAt(arr,t+1); if(m>=num) break;} p++;} return arr;} [/code]This method was pretty effective, but performance absolutely sucked once you got past an upper bound of about 10,000. The function took roughly 30 seconds to find all the primes below 100,000. Firing up SeeFusion and watching stack traces showed 99% of the processing time was spent on this:

[code]java.util.Vector.indexOf(Vector.java:335)[/code]Apparently searching an array many, many times is a very time consuming thing. Unfortunately I had to search the array if I expected to delete items out of it. My second version of the function left all the array indexes in and simply marked the non-primes as "0". This allowed me to always access specific array elements without searching. The downside is, the array doesn't shrink as you go, and the same non-primes will get marked over and over. At the end of the function it deletes out all 0's from the array and returns what's left (the primes).

[code]function getPrimes2(num) { var i = 0; var p = 1; var m = 0; var l = 0; var arr = []; while (++i <= num) arr[i] = i; while(++p<arrayLen(arr)) { i = 1; if(!arr[p]) continue; while (++i <= num) { m = p*i; if(m<=num) arr[m] = 0; else break; }} i = arrayLen(arr)+1; while (--i > 0) { if(!arr[i]) arrayDeleteAt(arr,i);} return arr;} [/code]My second version was quite a bit faster. It could generate all the primes up to 1 million in about 30 seconds. The stack traces now showed a large amount of processing time doing this:

[code]coldfusion.runtime.CFPage.ArrayDeleteAt(CFPage.java:491) [/code]Apparently it is also very costly to delete hundreds of thousands of array indexes. My final version only differed from the second in that I simply looped over the array at the end and built a new array out of the prime numbers in the first array.

[code]function getPrimes3(num) { var i = 0; var p = 1; var m = 0; var l = 0; var arr = []; var arr2 = []; while (++i <= num) arr[i] = i; while(++p<arrayLen(arr)) { i = 1; if(!arr[p]) continue; while (++i <= num) { m = p*i; if(m<=num) arr[m] = 0; else break; }} i = 0; while (++i <= arrayLen(arr)) { if(arr[i] > 1 && arr[i] < num) arr2[arrayLen(arr2)+1] = arr[i];} return arr2;} [/code]Ahh, much better. Now I can generate all primes under 1 million in about 3 seconds. Performance is exponential though-- all the primes up to 10 million takes about 30 seconds still. I was surprised at the cost of searching and deleting from an array when performed millions of times. Who would have guessed it ended up being faster leaving the array in tact and copying out the parts you needed? Of course, these performance differences are completely negligible with small numbers under a few thousand. Regardless, I feel confident I've squeezed out about as much performance as I'm going to get with this algorithm in ColdFusion. It is a little tempting to compile a C++ or Java version just to compare raw performance. Also, in case you were wondering my prime calculator used roughly half the code as the example on cflib.org and was twice as fast at cranking out primes up to 10,000,000. (There's 664,579 of them) Update: I made it even faster!

Please check out this post where I made some more performance modifications to my function at a later date:

Generating Primes Revisited: My Modifications To The Sieve of Eratosthenes

Comments are currently closed

Sean Guthriesaidat 12/02/2009 09:51:49 PM

Wed Dec 2 20:43:20 MST 2009 (start time)

-------------------------------------------------

Cacluations Complete

Wed Dec 2 20:43:21 MST 2009 (end time)

-------------------------------------------------

Answers Written to File

//code section------

// Name : bitvector.cpp

// Author : Sean Guthrie

// Version : 3

// Description : Prime Calcuator using a bolean vector array

#include <iostream>

#include <fstream>

#include <vector>

#include <stdlib.h>

using namespace std;

int main(int argc, char *argv[])

{

int x=5000000;

vector<bool>v1(x);

int i=0;

int c=0;

unsigned long long xz=3;

system("date");

cout << "-------------------------------------------------" << endl;

for (int b=3;b+b<=x;)

{i=c;

for (int u=0;u!=x;u++)

{i=i+b;

v1[i]=1;

if (i+b>x)

{u=x-1;}

}

b=b+2;

c++;

}

cout << "Cacluations Complete" << endl;

system("date");

cout << "-------------------------------------------------" << endl;

ofstream outFile;

outFile.open("dat1.dat", ios::app);

if (!outFile.fail()) //Verify File Open

{outFile << "1" << endl;

outFile << "2" << endl;

for (int u=0;u!=x;u++)

{if (v1.at(u)==1)

{xz=xz+2;}

else

{outFile << xz << endl;

xz=xz+2;}

}

}

else //Open File Failed

cout << "There was and error opening dat1.dat." << endl;

cout << "Answers Written to File" << endl;

return 0;

}

Sean Guthriesaidat 12/02/2009 10:06:11 PM

Brad Woodsaidat 12/03/2009 01:47:54 PM

Skipping the evens isn't nearly that big of a deal then-- since when I go through and mark all the multiples of two as not primes, my short circuit will prevent all those from being evaluated again anyway. Basically, the number of usable indexes in the array shrinks the more multiples you eliminate.

I'm afraid ColdFusion will never be quite as fast for this sort of stuff as C++ or Java. Looking at the stack traces, most of the processing going on is casting srings to doubles over and over again for all the numeric operations and searching the page for scopes. neither of those things would be an issue in a strongly typed language that didn't do scope hunting.

Sean Guthriesaidat 12/04/2009 12:53:09 AM