Question

如何在SQL Server中根据多个条件计算时间差

2023-05-29

79

sql sql-server

我有一张如下所示的表格。我正在尝试根据人员的状态计算人员不可用的时间

情况 1 ：如果 start_date 和 end_date 相同且 status='UNKNOWN' 和 workstatus='ERROR' ，那么我必须将 end_date 视为下一个相同的状态，例如未知和错误（rowid 3 和 8）。我必须计算第一条记录的 end_date 和下一条记录的 start_date 之间的时间差（以蓝色突出显示“2023-04-24 10:15:24”至“2023-04-24 10:21:24”）。如果人员一天内多次无法联系，则应针对下一个未知人员执行此操作，即 rowid 14,15“2024-04-24 12:45:29”至“2023-04-24 13:45:29”。

情况 2 ：如果没有其他状态为 UNKNOWN 和 ERROR 的记录（当天），则我们必须考虑前一条记录（空闲或活动，即当天任何非未知的记录，在本例中为人员 1 rowid 11），即对于记录 13，记录 11 的结束日期是开始时间。 “2023-04-24 15:12:09” 到 “2023-04-24 15:47:00” 之间的时间差（以黄色突出显示）

最终，预期结果将如下所示：

这是我使用的脚本：

CREATE TABLE dbo.human_sample 
(
    rowid int, 
    person_id int, 
    start_date datetime, 
    end_date datetime, 
    status varchar(100), 
    work_status varchar(100)
);

INSERT INTO dbo.human_sample VALUES (1, 1,  '2023-04-24 10:10:15',  '2023-04-24 10:15:24',  'IDLE', 'OK')
INSERT INTO dbo.human_sample VALUES (2, 1,  '2023-04-24 10:15:24'   ,'2023-04-24 10:19:45', 'IDLE', 'OK')
INSERT INTO dbo.human_sample VALUES (3, 2,  '2023-04-24 10:15:24',  '2023-04-24 10:15:24',  'UNKNOWN',  'ERROR')
INSERT INTO dbo.human_sample VALUES (4, 1,  '2023-04-24 10:45:45',  '2023-04-24 11:35:21',  'IDLE', 'OK')
INSERT INTO dbo.human_sample VALUES (5, 1,  '2023-04-24 11:35:21',  '2023-04-24 11:35:21',  'IDLE', 'OK')
INSERT INTO dbo.human_sample VALUES (6, 1,  '2023-04-24 11:35:21',  '2023-04-24 11:55:41',  'IDLE', 'OK')
INSERT INTO dbo.human_sample VALUES (7, 1,  '2023-04-24 11:55:41',  '2023-04-24 13:25:11',  'ACTIVE',   'OK')
INSERT INTO dbo.human_sample VALUES (8, 2,  '2023-04-24 10:21:24',  '2023-04-24 10:21:24',  'UNKNOWN',  'ERROR')
INSERT INTO dbo.human_sample VALUES (9, 2,  '2023-04-24 10:21:24',  '2023-04-24 11:45:49',  'IDLE', 'OK')
INSERT INTO dbo.human_sample VALUES (10,    1,  '2023-04-24 13:25:11',  '2023-04-24 14:12:41',  'ACTIVE',   'OK')
INSERT INTO dbo.human_sample VALUES (11,    1,  '2023-04-24 14:12:41',  '2023-04-24 15:12:09',  'IDLE', 'OK')
INSERT INTO dbo.human_sample VALUES (12,    2,  '2023-04-24 11:45:49',  '2023-04-24 12:45:29',  'IDLE', 'OK')
INSERT INTO dbo.human_sample VALUES (13,    1,  '2023-04-24 15:47:00',  '2023-04-24 15:47:00',  'UNKNOWN',  'ERROR')
INSERT INTO dbo.human_sample VALUES (14,    2,  '2023-04-24 12:45:29',  '2023-04-24 12:45:29',  'UNKNOWN',  'ERROR')
INSERT INTO dbo.human_sample VALUES (15,    2,  '2023-04-24 13:45:29',  '2023-04-24 13:45:29',  'UNKNOWN',  'ERROR')

Answer 1

无法解决从 IDLE 到 UNKNOWN 的情况，因为它需要指定哪个 IDLE 时间（或 ACTIVE 之后的时间戳），这是一个不同的情况，但这为您提供了 UNKNOWN/ERROR 代码的时间差异的部分解决方案

select *, time(start_date - lag(end_date,1)
over(partition by person_id order by rowid))
as time_unavailable from (select * from human_sample
where status = 'UNKNOWN' and work_status = 'ERROR') as test;

fiddle here

Answer 2

要处理两种不同的情况，您可以在单独的查询中执行此操作，然后使用 UNION ALL 将结果合并为一个。

首先，计算 UNKNOWN 状态的数量，当数量超过 2 时，为 case 1 ，否则为 case 2 。请参阅 case_flag

对于 case 1 ，您只对那些具有 UNKNOWN 状态的行感兴趣。使用 row_number() 生成一个运行数字，该数字允许您将 2 行分组为一行。然后只需找到 datediff() 和 min( date ) 和 max ( date )

对于 case 2 ，使用 lag() 获取前一行 end_datetime ，然后计算 datediff()

with
cte as
(
  select rowid, person_id, start_date, end_date, status, work_status, 
         [date] = convert(date, start_date), 
         case_flag = case when sum (case when status = 'UNKNOWN' then 1 else 0 end) 
                               over (partition by person_id, convert(date, start_date))
                               > 1
                          then 1
                          else 2
                          end
  from   human_sample
),
-- case 1 logic
case_1_cte as
(
  select person_id, [date],
         rn = row_number() over (partition by person_id, [date] order by start_date),
         start_date, end_date
  from   cte
  where  case_flag = 1
  and    status = 'UNKNOWN'
),
case_1 as
(  
  select person_id, [date], 
         Unavailable_time_in_seconds = datediff(second, min(end_date), max(start_date))
  from   case_1_cte
  group by person_id, [date], (rn - 1) / 2 
),
-- case 2 logic
case_2_cte as
(
   select person_id, [date], start_date, end_date, status,
          prev_end_date = lag(end_date) over (partition by person_id, [date]
                                  order by start_date)
   from   cte
   where  case_flag = 2
),
case_2 as
(
  select person_id, [date], 
         Unavailable_time_in_seconds = datediff(second, prev_end_date, start_date)
  from   case_2_cte
  where  status = 'UNKNOWN'
)
-- final result
select person_id, [date], Unavailable_time_in_seconds
from   case_1
union all
select person_id, [date], Unavailable_time_in_seconds
from   case_2

db<>fiddle demo