Primality Tests – Solutions

On October 29 I provided a challenge to write a T-SQL function that
implements a primality test for an input value (of a BIGINT datatype).
You can find the challenge here.

Thanks to all those who sent solutions. I know that many of you gave it
a try but didn’t send a solution since you couldn’t come up with a faster
solution than the one posted by Steve Kass. ;-)

I’ll only provide a brief overview of primality tests here since the subject
is covered thoroughly in Wikipedia
(http://en.wikipedia.org/wiki/Primality_test). I’ll focus on the T-SQL
solutions assuming you are familiar with the concepts.

Primality testing is a well-researched subject in computer science, but
as far as I can tell, all existing efficient algorithms are suited to
iterative/recursive implementations (e.g., the solution posted by Steve
Kass, based on the Miller-Rabin test for strong pseudoprimality). I
haven't found any efficient algorithm that lends itself to a set-based
implementation, or at least I couldn't find or think of any set-based
adaptations of the existing efficient algorithms. One of the reasons for
posing this T-SQL challenge was that in my research, I couldn't find
any attempts (at least successful ones) to address the problem with
set-based logic in a way that would make a SQL-based solution very fast.
You never know: maybe if enough attention is given to attacking the
problem with set-based logic, some brilliant mind will come up with a
very fast solution based on a fresh set-based approach. I'm not losing
hope… I'll leave the puzzle open, so if you're on to something, I'd
love to hear about it.

Primality tests fall into three categories: naïve methods, probabilistic
techniques, and fast deterministic techniques.
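As context for the probabilistic category, the Miller-Rabin strong-pseudoprime test that Steve Kass's solution builds on can be sketched in Python (an illustration of the algorithm, not Steve's T-SQL code; the fixed witness set below is my assumption, chosen because testing against the first twelve primes is known to be deterministic well beyond the BIGINT range):

```python
def is_probable_prime(n):
    """Miller-Rabin strong-pseudoprime test with fixed witnesses.

    With the first twelve primes as witnesses, the test is deterministic
    for every 64-bit input, so no probabilistic error remains here.
    """
    if n < 2:
        return False
    small_primes = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    if n in small_primes:
        return True
    if any(n % p == 0 for p in small_primes):
        return False
    # Write n - 1 as d * 2^s with d odd
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in small_primes:          # each witness a gets one round
        x = pow(a, d, n)            # a^d mod n
        if x in (1, n - 1):
            continue                # n passes this witness
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break               # n passes this witness
        else:
            return False            # a proves n composite
    return True
```

Each round costs one modular exponentiation, which is why the test is so much faster than trial division for large inputs.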

Naïve Methods

Naïve algorithms are the slowest of the three. The most basic naïve method is:

Given an integer input n:

If n < 2, it is not a prime. If any integer m between 2 and n-1 divides n,
n is composite; otherwise, n is prime.
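This slowest method can be sketched in Python (an illustration of the algorithm only; the challenge itself is about T-SQL):

```python
def is_prime_naive(n):
    """Slowest naive test: try every m in 2..n-1 as a divisor."""
    if n < 2:
        return False
    for m in range(2, n):
        if n % m == 0:
            return False   # found a divisor, n is composite
    return True            # no divisor found, n is prime
```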

Of course, you can do much better with a naïve method. The first
improvement can be achieved by lowering the upper boundary value of
m. If n is composite, it can be expressed as a product ab, and either a
or b must be smaller than or equal to the square root of n (if both
exceeded sqrt(n), the product ab would exceed n). So it's sufficient
to set the upper boundary of m to floor(sqrt(n)). All of the solutions I
got to the puzzle besides Steve's implemented such a naïve method,
but all of them used iterative logic. Unlike the other existing methods,
naïve methods are in fact well suited to set-based implementations.
Though still slow compared to probabilistic and fast deterministic
techniques, naïve methods can be implemented with set-based
solutions that are much faster than iterative ones.
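The square-root bound, and the declarative flavor of the set-based EXISTS check used later, can be sketched in Python; `math.isqrt` plays the role of floor(sqrt(n)) (again, an illustration of the idea, not the T-SQL solution):

```python
from math import isqrt

def is_prime_sqrt(n):
    """Naive test with the upper boundary lowered to floor(sqrt(n))."""
    if n < 2:
        return False
    # One declarative predicate over the whole range, analogous to a
    # set-based EXISTS query: is there any m in 2..floor(sqrt(n))
    # that divides n?
    return not any(n % m == 0 for m in range(2, isqrt(n) + 1))
```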

I’ll create all objects in my examples in a database called primesdb:

set nocount on;
use master;
go
if db_id('primesdb') is null create database primesdb;
go
use primesdb;
go

First, you need a fast way to produce a range of integers (BIGINT in
our case). The following code creates a function called fn_nums that
returns a set of integers in the range @min, @max:

-- function returns a nums table in the range @min - @max
if object_id('dbo.fn_nums') is not null
  drop function dbo.fn_nums;
go
create function dbo.fn_nums(@min as bigint, @max as bigint) returns table
as
return
  with
  l0 as(select 0 as c union all select 0),
  l1 as(select 0 as c from l0 as a, l0 as b),
  l2 as(select 0 as c from l1 as a, l1 as b),
  l3 as(select 0 as c from l2 as a, l2 as b),
  l4 as(select 0 as c from l3 as a, l3 as b),
  l5 as(select 0 as c from l4 as a, l4 as b),
  l6 as(select 0 as c from l5 as a, l5 as b),
  nums as(select row_number() over(order by c) as n from l6)
  select @min + n - 1 as n from nums where n <= @max - @min + 1;
go
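To see why fn_nums scales, note that each CTE cross-joins the previous level with itself, squaring the row count: l0 has 2 rows (from the UNION ALL), l1 has 4, l2 has 16, and so on up to l6, which can number up to 2^64 rows; in practice the optimizer generates only as many rows as the filter on n requires. A quick Python check of the doubling (illustration only):

```python
rows = 2            # l0: two rows produced by UNION ALL
levels = {0: rows}
for level in range(1, 7):
    rows = rows * rows          # a cross join of a level with itself
    levels[level] = rows        # squares the row count

# l6 can number up to 2**64 rows -- enough to cover any BIGINT range
assert levels[6] == 2 ** 64
```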

You can now implement the first version of the fn_isprime function by
checking whether any integer up to the square root of the input @n
divides @n:

-- function fn_isprime, version 1
if object_id('dbo.fn_isprime') is not null
  drop function dbo.fn_isprime;
go
create function dbo.fn_isprime(@n as bigint) returns bit
with returns null on null input
as
begin
  declare @explicitprime as bigint;
  set @explicitprime = 23;

  if @n < @explicitprime
    if @n in (2, 3, 5, 7, 11, 13, 17, 19) return 1 else return 0;

  if @n%2 = 0 or @n%3 = 0 or @n%5 = 0 or @n%7 = 0
    or @n%11 = 0 or @n%13 = 0 or @n%17 = 0 or @n%19 = 0 return 0;

  if @n < cast(square(@explicitprime) as bigint) return 1;

  declare
    @sqrt as bigint,
    @from as bigint,
    @to   as bigint;

  set @sqrt = cast(sqrt(@n) as bigint);
  set @from = @explicitprime;
  set @to   = @sqrt;

  return
    case when exists
      (select * from dbo.fn_nums(@from, @to) where @n%n = 0)
    then 0 else 1 end;
end;
go

The function first checks whether the input is an obvious prime or
nonprime, handling values explicitly below some prime number you
choose (call it @explicitprime) instead of using the fn_nums function.
For example, say you set @explicitprime to 23. If @n < 23, the
function checks explicitly whether @n is one of the known primes
below 23. If @n >= 23, the function checks explicitly whether any of
the primes below 23 divides @n. If the previous tests did not
determine the answer and @n < 529 (square(23)), @n must be prime,
since any composite below 529 has a prime factor below 23. Only if
none of these tests for obvious primes or nonprimes determine the
answer does the function query the fn_nums function with the range
23 through floor(sqrt(@n)) and check whether any integer in that range
divides @n.
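The staged logic of the function can be mirrored in Python (a sketch of the same three stages, not the T-SQL code itself; `is_prime_staged` is a hypothetical name):

```python
from math import isqrt

EXPLICIT_PRIMES = (2, 3, 5, 7, 11, 13, 17, 19)   # primes below @explicitprime = 23

def is_prime_staged(n):
    if n < 23:                      # stage 1: explicit handling below 23
        return n in EXPLICIT_PRIMES
    if any(n % p == 0 for p in EXPLICIT_PRIMES):
        return False                # stage 2: divisible by a small prime
    if n < 23 * 23:                 # stage 3: below square(23), no small factor
        return True
    # final stage: mirrors the EXISTS query over fn_nums(23, floor(sqrt(n)))
    return not any(n % m == 0 for m in range(23, isqrt(n) + 1))
```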

You can further optimize the solution by checking only divisors that
have the potential to be prime. For example, the preliminary tests
already checked whether 2 divides @n, so you can query about half the
rows from fn_nums and produce only odd divisors with the expression
n*2+1: