45 pts.
 Block Of Activity
I have timestamps for events stored as datetime's. Given a starting datetime, is there a way to find the block of timestamps where: 1. At least one timestamp is within 1 hour of the given datetime. 2. No individual timestamp is greater than 1 hour after the previous timestamp (or possibly before the next timestamp if that's easier...or both if necessary). Basically, I'm trying to find a block of activity where the exact start and end time is uncertain.

Software/Hardware used:
ASKED: December 7, 2007  7:45 AM
UPDATED: December 11, 2007  10:51 PM

Answer Wiki:
1. Sure. Take the datetime that you are looking for and use the dateadd to add one hour and then use between to find the records between the two values. 2. This can be done with a self join again using between with a dateadd this time on the value of the outer table.
Last Wiki Answer Submitted:  December 7, 2007  8:45 am  by  Denny Cherry   64,520 pts.
All Answer Wiki Contributors:  Denny Cherry   64,520 pts.
To see all answers submitted to the Answer Wiki: View Answer History.


Discuss This Question:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _


 

Just for reference, I’m coming from a programming background, so I can’t help think in terms of if/else and loops. I’m trying not to, but it’s hard to break those habits.

I think the key to this is being able to find the first datetime that is less than 1 hour before the second datetime. I could include this as a subquery, do something similar to find the last datetime, and just list all the datetimes between the two.

If I were programming this, I would loop backwards through the records until I found one where the difference between this datetime and the last datetime > 1 hour. Similarly I would loop forward to find the last datetime. How would (or could) I do something similar in SQL?

 45 pts.

 

Hi,

You’re absolutely correct that progamming in SQL is different than programming in procedural languages. The following page may shed some more light on the differences:

<a href=”http://www.databasejournal.com/sqletc/article.php/1408491″ rel=”nofollow”>http://www.databasejournal.com/sqletc/article.php/1408491</a&gt;

especially the taxi analogy toward the middle of the page.

To answer your question, “If I were programming this, I would … How would (or could) I do something similar in SQL?”. The answer is: You wouldn’t. You’d do something completely different. Instead of describing how to get the results, you describe what results you want (in this case, rows where timestamp is between t-1hr and t+1hr).

 1,240 pts.

 

Unfortunately, I don’t know how to accurately describe exactly what I’m looking for. Let me try this. Say I have the following:

CREATE TABLE Events ( UserID int, Timestamp datetime )

INSERT INTO Events VALUES ( 280, '12/03/2007 3:24:41 AM' )
INSERT INTO Events VALUES ( 345, '12/03/2007 3:40:11 AM' )
INSERT INTO Events VALUES ( 280, '12/03/2007 4:18:32 AM' )
INSERT INTO Events VALUES ( 280, '12/03/2007 4:56:18 AM' )
INSERT INTO Events VALUES ( 345, '12/03/2007 5:29:13 AM' )
INSERT INTO Events VALUES ( 280, '12/03/2007 5:34:28 AM' )
INSERT INTO Events VALUES ( 345, '12/03/2007 5:59:11 AM' )
INSERT INTO Events VALUES ( 345, '12/03/2007 6:26:01 AM' )
INSERT INTO Events VALUES ( 345, '12/03/2007 7:19:38 AM' )
INSERT INTO Events VALUES ( 345, '12/03/2007 7:43:48 AM' )
INSERT INTO Events VALUES ( 345, '12/03/2007 9:44:17 AM' )

Given this as input (345, ’12/03/2007 7:00:00 AM’), I would like a sql statement that returns the following:
345, ’12/03/2007 5:29:13 AM’
345, ’12/03/2007 5:59:11 AM’
345, ’12/03/2007 6:26:01 AM’
345, ’12/03/2007 7:19:38 AM’
345, ’12/03/2007 7:43:48 AM’

You’ll note that each returned datetime is no more than 1 hour different from at least one other record in the set, and at least 1 record is no more than 1 hour different from the input time. I think I know how to check for each being within an hour of at least one other:

SELECT Current.UserID, Current.Timestamp, Previous.Timestamp
FROM Events as Current INNER JOIN Events as Previous
ON Current.UserID = Previous.UserID AND DateDiff('h',Current.Timestamp, Previous.Timestamp) = 0
WHERE Current.UserID = @UserID
AND Current.Timestamp  Previous.Timestamp
ORDER BY Current.Timestamp DESC;

This will essentially list all the events for that user, since every event will be within an hour of at least one other event. So how do I limit it to that block of time for which at least one record is within an hour of a given datetime? Is that even possible?

 45 pts.

 

Er…ignore that Previous.Timestamp in the opening of the select.

 45 pts.

 

Hi,

I’m having trouble posting this answer; there’s some trouble with less-than signs. When you see < below, I really wanted to use a less-than sign.

Thanks for posting the data and your expected results. Now I have a much better idea of what you want.

The crucial part of this job is identifying which events are in the same block. It would be so easy if your data already had a column that told which block a row was in. Since there is no such column in your table, let’s create such a column in a sub-query, and then do the query you really want on the result set of that sub-query.

We can identify the start of each block like this:

SELECT	*
FROM	events	AS s
WHERE	userid		= m.userid
AND	timestamp	<= m.timestamp
AND	NOT EXISTS
	(	-- Begin sub-query to find earlier events in same block
	SELECT	0
	FROM	events
	WHERE	userid	= s.userid
	AND	timestamp	< s.timestamp
	AND	DATEDIFF (hh, timestamp, s.timestamp) <= 1
	)	-- End sub-query to find earlier events in same block

Labeling each row of the table with one of these starting points is merely a matter of finidng the latest one (before or equal to timestamp) for the same userid:

SELECT	userid
,	timestamp
,	(	-- Begin sub-query to compute block_start
	SELECT	MAX (timestamp)
	FROM	events	AS s
	WHERE	userid	= m.userid
	AND	timestamp	<= m.timestamp
	AND	NOT EXISTS
		(	-- Begin sub-query to find earlier events in same block
		SELECT	0
		FROM	events
		WHERE	userid	= s.userid
		AND	timestamp	< s.timestamp
		AND	DATEDIFF (hh, timestamp, s.timestamp) <= 1
		)	-- End sub-query to find earlier events in same block
	)	-- End  sub-query to compute block_start
	AS block_start
FROM	events

So which are the rows you ultimately want to select? They are the ones in the same block as any events within an hour of the target time:

WITH	events_b
AS
(	-- Begin events_b
	SELECT	userid
	,	timestamp
	,	(	-- Begin sub-query to compute block_start
		SELECT	MAX (timestamp)
		FROM	events	AS s
		WHERE	userid	= m.userid
		AND	timestamp	<= m.timestamp
		AND	NOT EXISTS
			(	-- Begin sub-query to find earlier events in same block
			SELECT	0
			FROM	events
			WHERE	userid	= s.userid
			AND	timestamp	< s.timestamp
			AND	DATEDIFF (hh, timestamp, s.timestamp) <= 1
			)	-- End sub-query to find earlier events in same block
		)	-- End  sub-query to compute block_start
		AS block_start
	FROM	Events	AS m
)	-- End events_b
SELECT	userid
,	timestamp
FROM	events_b	AS m
WHERE	userid	= 345
AND	block_start	IN
	(	-- Begin sub-query to find close block starts
	SELECT	block_start
	FROM	events_b
	WHERE	userid	= m.userid
	AND	DATEDIFF (hh, timestamp, '12/03/2007 07:00:00 AM') BETWEEN -1 AND 1
	)	-- End sub-query to find close block starts
ORDER BY	timestamp;

I assume that, if you happen to pick a target time that is not within a block, but within an hour of two different blocks, you want to choose both ot them. (For example, if the parameters were userid=345 and timestamp=’12/03/2007 04:34:42 AM’, then you would want both the one-event block starting at 3:40:11 and the block starting at 5:29:13. If you want only the earlier, then change

BETWEEN -1 AND 1

to

BETWEEN 0 AND 1

; for only the later one, say

BETWEEN -1 AND 0
 1,240 pts.