Puzzled By T-SQL Blog

# TSQL Proximity Puzzle

I got this puzzle from my good friend Herbert Albert. Herbert addressed a more detailed version of the problem with a customer of his; here I’ll present a simplified form of the problem.

Consider the following tables and sample data:

set nocount on;

use tempdb;

if object_id('dbo.t1', 'u') is not null drop table dbo.t1;

if object_id('dbo.t2', 'u') is not null drop table dbo.t2;

go

create table dbo.t1

(

col1 int        not null primary key,

col2 varchar(1) not null

);

create table dbo.t2

(

col1 int not null primary key,

col2 varchar(1) not null

);

-- small sample data to check validity of solution

insert into dbo.t1(col1, col2) values

(10, 'a'),(20, 'b'),(30, 'c'),(40, 'd'),(50, 'e');

insert into dbo.t2(col1, col2) values

(9, 'z'),(12, 'y'),(20, 'w'),(35, 'v'),(47, 'u'),(51, 't'),(1759, 's');

go

The challenge is to write a query that matches to each row from t1 the row in t2 with the “closest” value in t2.col1 to the one in t1.col1. In “closest” I mean the smallest absolute difference. In case of ties, simply the smallest t2.col1 value wins. For the given sample data, the desired result is:

t1_col1     t1_col2 t2_col1     t2_col2

----------- ------- ----------- -------

10          a       9           z

20          b       20          w

30          c       35          v

40          d       35          v

50          e       51          t

Use the given sample data and desired result to verify the validity of the solution. But to check the performance of the solutions, I will use the following sample data:

-- Virtual Auxiliary Table of Numbers

IF OBJECT_ID('dbo.GetNums') IS NOT NULL DROP FUNCTION dbo.GetNums;

GO

CREATE FUNCTION dbo.GetNums(@n AS BIGINT) RETURNS TABLE

AS

RETURN

WITH

L0   AS(SELECT 1 AS c UNION ALL SELECT 1),

L1   AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),

L2   AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),

L3   AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),

L4   AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),

L5   AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B),

Nums AS(SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS n FROM L5)

SELECT TOP (@n) n FROM Nums ORDER BY n;

GO

-- Populate Tables

truncate table dbo.t1;

truncate table dbo.t2;

insert into dbo.t1 with (tablock) (col1, col2)

select

n * 10000 + abs(checksum(newid())) % 10000,

char(ascii('a') + abs(checksum(newid())) % (ascii('z') - ascii('a') + 1))

from dbo.getnums(10000);

insert into dbo.t2 with (tablock) (col1, col2)

select

n * 100 + abs(checksum(newid())) % 100,

char(ascii('a') + abs(checksum(newid())) % (ascii('z') - ascii('a') + 1))

from dbo.getnums(1000000);

I’ll cover the solutions in a post next week.

Good luck!

BG