SQL Select Distinct Column wise (with GUIDs)

12/01/2010

I had some SQL trouble selecting  distinct values from a single column while also selecting the remaining columns from the table.

After trying  without any real breakthrough I turn to stackoverflow and luckily Peter Lang was the man with an fast answer.

The question was (http://stackoverflow.com/questions/2052158/ordered-sql-select-columnwise-distinct-but-return-of-all-columns)


I have a little SQL Distinct puzzle that i cannot solve (or at least not in an very elegant way).

I have two tables (try to ignore the simplicity of the example). I’m using MSSQL 2008 if that makes much of a difference.

Table: Category | categoryId (uniqueidentifier) PK | | Name varchar(50) |

Table: Download | downloadId (uniqueidentifier) PK | | categoryId (uniqueidentifier) FK | | url (varchar(max)) | | createdate (datetime) |

I have a few categories in the Category table and potentially a lot of download URLs in the Download table. I’m interested in selecting the newest using the createdate (or a top 5 if that is possible) download url for each category from the Download table.

Currently I’m doing the following, but that is not very nice, and can hardly be the correct way to do it. SELECT categoryId, max(convert(BINARY(16),downloadId)) as downloadId, max(createdate) as createdate INTO tmp FROM Download GROUP BY categoryId ORDER BY createdate

SELECT url FROM Download WHERE downloadId IN (SELECT CONVERT(uniqueidentifier, downloadId) FROM tmp)

DROP Table tmp

Any suggestions would be much appreciated.

The ingenious solution is to use ROW_NUMBER and Partition i had never figured that out myself.
So  my stupid tmp table attempt should be reformed into:


SELECT categoryId, downloadId, createdate, url
FROM (
SELECT
categoryId, downloadId, createdate, url,
ROW_NUMBER() OVER(PARTITION BY categoryId ORDER BY createdate DESC) rownum
FROM Download
) d
WHERE d.rownum <= 1

More on the topic can be found here: http://weblogs.sqlteam.com/jeffs/archive/2007/03/28/60146.aspx