(reprinted from a previous blog)
Here's the dilemma. Let's say you have a log table where each row is a record of some action taken by a user. Perhaps the user logs in, updates his profile, searches for products and makes purchase. The table has the following fields:
The result of this query is a string built from the record that represents the oldest date in the record set for each user. Something like this:
Ok, that's nifty (says ralph), but how do I get the "action" they took? Well, if you try to add that to your Group by clause you will end up with the oldest record per user for each action. Obviously that's not what you want either. Plus (and this is probably obvious to you by now) a string like 72_20040329 isn't terribly friendly to work with. I suppose you can unpack both the date and the user ID from that string and then query the db again for the action in question - but that is no better than the loop-d-loop issue we are trying to avoid.
You might not have thought of it, but this is a spot where a sub-query can really be useful. Using a subquery and a fancy where clause you can actually extract the whole row (primary key and all) in question from the log table. Here's how it's done:SELECT log_id, User_id, Action_Type, UpdateTimeThe trick is the MAX( ) function. When the root query concatenates the columns in the where clause it does it without regard to all rows in the table, but when the sub query runs it creates a string from only the maximum small date time for each user. The result is a subset of the records matching the "last action" that each user took before leaving the site. Ralph is impressed with you (of course) and immediately raises your salary - once again proving the old adage that if you build a better mouse trap people will beat a path to your door - usually trying to help you extricate your foot from your excellent device - but that is topic for another blog.
WHERE CONVERT(varchar,user_id) + '_' +
( SELECT CONVERT(varchar,user_id) + '_' +
GROUP BY user_id