近期面试遇到这样一个问题:
有一张用户呼叫表,表中有两列,用户ID和呼叫时间,现在需要找出一分钟内呼叫次数超过10次的用户,标记为黑名单。该如何写SQL得到这种数据。
要求:SQL中不能出现排序操作。
已找到解决方法:
WITH CTE_BASE AS(
SELECT A.USER_ID, A.CALL_TIME
, CASE WHEN TIME_DIFF(A.CALL_TIME, B.CALL_TIME) BEWTEEN 1 AND 60 SECONDS THEN 1 ELSE 0 END AS CONTI_CNT
FROM CALL_RECORD A
FULL JOIN CALL_RECORD B
ON A.USER_ID = B.USER_ID
)
, CTE_ABNORMAL_CALLS AS(
SELECT USER_ID, CALL_TIME
FROM CTE_BASE
GROUP BY USER_ID, CALL_TIME
HAVING SUM(CONTI_CNT) > 10
)
SELECT DISTINCT USER_ID
FROM CTE_ABNORMAL_CALLS
但是该方法出现表自关联,现在基表数据量很大,不知道性能会如何?
SELECT 用户ID
FROM dbo.Schedules
GROUP BY 用户ID, DATEPART(MINUTE, 呼叫时间)
HAVING count(*) >10
好像是这样吧。。。
这样跨分钟貌似有问题,比如:2019-02-22 20:49:20 和 2019-02-22 20:50:10
@jello chen: 那好办啦,你在以分分组前再加上以年、月、日、小时分组啦
@jello chen:
SELECT 用户ID
FROM dbo.呼叫表
GROUP BY 用户ID,
DATEPART(YEAR, 呼叫时间),
DATEPART(MONTH, 呼叫时间),
DATEPART(DAY, 呼叫时间),
DATEPART(HOUR, DT.[Date]),
DATEPART(MINUTE, 呼叫时间)
HAVING count(*) >10
@Jeffcky: 不是,我的问题在“一分钟内”这个点上
@jello chen:
你没根本理解意思,你看如下数据,你运行。
declare @calls table
(
userId VARCHAR(36),
callTime datetime
)
INSERT INTO @calls (callTime,userId) values ('2019-02-22 12:23:01','1FA3F63F-1F5A-4825-AB68-ED70365F85DA')
INSERT INTO @calls (callTime,userId) values ('2017-02-22 12:23:01','1FA3F63F-1F5A-4825-AB68-ED70365F85DA')
INSERT INTO @calls (callTime,userId) values ('2019-02-22 12:35:01','1FA3F63F-1F5A-4825-AB68-ED70365F85DA')
INSERT INTO @calls (callTime,userId) values ('2015-02-22 12:25:01','1FA3F63F-1F5A-4825-AB68-ED70365F85DA')
INSERT INTO @calls (callTime,userId) values ('2013-02-22 12:02:01','1FA3F63F-1F5A-4825-AB68-ED70365F85DA')
INSERT INTO @calls (callTime,userId) values ('2019-02-22 12:09:01','1FA3F63F-1F5A-4825-AB68-ED70365F85DA')
INSERT INTO @calls (callTime,userId) values ('2019-02-22 12:35:01','1FA3F63F-1F5A-4825-AB68-ED70365F85DA')
SELECT userId,DATEPART(MINUTE, callTime),COUNT()
FROM @calls
GROUP BY userId,
DATEPART(YEAR, callTime),
DATEPART(MONTH, callTime),
DATEPART(DAY,callTime),
DATEPART(HOUR, callTime),
DATEPART(MINUTE, callTime)
HAVING count() >1
@jello chen: 一分钟无非就是比如上述例子35分钟时,就有两条,应该懂我意思了把
@Jeffcky: 我不这么认为,我觉得2019-02-22 20:49:20 和 2019-02-22 20:50:10虽然分钟不一致,但是是在一分钟内的
@jello chen: oh,是的,这个确实不行,这个该如何解决?有点意思。我再想想,哈哈
@jello chen: 对的,你的理解是对的。我想出了一种SQL写法,但是用到了排序操作,现在要求不能使用排序,就不知道怎么写了。
@davawang: 有点难度。。
select * from (
select a.UserId,
(select COUNT() from T_Booking b where DateDiff(MINUTE,a.CreateTime,b.CreateTime)=0 and a.UserId=b.UserId ) as bnum
from T_Booking a
) t where bnum>10
----------------------------------------------------------------------------------------
select from (
select a.UserId,
(select COUNT(*) from T_Booking b where DateDiff(SECOND,a.CreateTime,b.CreateTime) between 0 and 60 and a.UserId=b.UserId ) as bnum
from T_Booking a
) t where bnum>10
确实很刁钻,但是现实中在表中多加一两个字段也没啥事啊,为何要绕大湾子啊
问题漏洞,一分钟内是连续一分钟还是自然一分钟?
如果不强调连续,那就当是自然一分钟,但这个在实际业务中没有意义.
但如果是连续一分钟,就要求数据记录是时序排列好的,否则就必须用到排序.
不连续的话就直接用时间戳除以60取整进行分组就好了.
是连续一分钟
借用楼上@Jeffcky的数据,稍加改动,因为数据行数有限,故把题目改为“找出一分钟内呼叫次数超过2次的用户”,尝试如下:
declare @calls table
(
userId VARCHAR(36),
callTime datetime
)
INSERT INTO @calls (callTime,userId) values ('2019-02-22 12:33:01','1FA3F63F')
INSERT INTO @calls (callTime,userId) values ('2019-02-22 12:33:21','1FA3F63F')
INSERT INTO @calls (callTime,userId) values ('2019-02-22 12:34:30','1FA3F63F')
INSERT INTO @calls (callTime,userId) values ('2019-02-22 12:35:01','1FA3F63F')
INSERT INTO @calls (callTime,userId) values ('2019-02-22 12:35:21','1FA3F63F')
INSERT INTO @calls (callTime,userId) values ('2019-02-22 12:42:01','1FA3F63F')
INSERT INTO @calls (callTime,userId) values ('2019-02-22 12:49:01','1FA3F63F')
select distinct userId
from (
select a.userId, b.callTime_begin, b.callTime_end, COUNT() cishu
from @calls a,
( select userId, callTime as callTime_begin, dateadd(ss,60,callTime) as callTime_end
from @calls
) b
where a.userId = b.userId
and a.callTime between b.callTime_begin and b.callTime_end
group by a.userId, b.callTime_begin, b.callTime_end
having COUNT() > 2
) t
SELECT DISTINCT
userId
FROM
(
SELECT COUNT(1) AS cou,
a.userId,
a.callTime
FROM #tb a
INNER JOIN #tb b
ON a.userId = b.userId
AND DATEDIFF(SECOND, a.callTime, b.callTime)
BETWEEN 0 AND 60
GROUP BY a.userId,
a.callTime
HAVING COUNT(1) > 2
) t;