[Question title]: Retrieving first and last records of each group
[Posted]: 2017-03-13 22:38:53
[Question]:

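Distance between two points:

I have a set of GPS points recorded for each vehicle along its route. I am trying to retrieve the first and last record of each trip.

Data: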

  VehicleId       TripId          Latitude            Longitude
    121             131             33.645              -84.424
    121             131             33.452              -84.409
    121             131             33.635              -84.424
    121             131             35.717              -85.121
    121             131             35.111              -85.111

From the dataset above, I need a result set containing the first and last point of each trip:

  VehicleId       TripId          StartLat            StartLong          EndLat          EndLong
    121             131             33.645              -84.424         35.111          -85.111

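I tried the following query, but I get the error "Correlated subqueries that reference other tables are not supported unless they can be de-correlated, such as by transforming them into an efficient JOIN". Any help would be appreciated.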

    SELECT
      a.VehicleId,
      a.Tripid,
      a.Latitude AS StartLat,
      a.Longitude AS StartLong,
      b.Latitude AS EndLat,
      b.Longitude AS EndLong,
      a.DateTime
    FROM
      `Vehicles` AS a
    JOIN
      `Vehicles` AS b
    ON
      a.VehicleId = b.VehicleId
      AND a.Tripid = b.Tripid
    WHERE
      a.DateTime IN (
      SELECT
        MIN(DateTime)
      FROM
        `Vehicles`
      WHERE
        VehicleId = a.VehicleId
        AND Tripid = a.Tripid)
      AND b.DateTime IN (
      SELECT
        MAX(DateTime)
      FROM
        `Vehicles`
      WHERE
        VehicleId = a.VehicleId
        AND Tripid = a.Tripid)

[Comments]:

    Tags: sql google-bigquery


    [Solution 1]:

    The first thing that comes to mind is row_number():

    select v.*
    from (select v.*,
                 row_number() over (partition by vehicleid, tripid order by datetime asc) as seqnum_asc,
                 row_number() over (partition by vehicleid, tripid order by datetime desc) as seqnum_desc
          from vehicles v
         ) v
    where seqnum_asc = 1 or seqnum_desc = 1;
    

    If you want them on the same row:

    select vehicleid, tripid,
           min(datetime) as start_datetime, max(datetime) as end_datetime,
           min(case when seqnum_asc = 1 then latitude end) as start_latitude,
           min(case when seqnum_asc = 1 then longitude end) as start_longitude,
           min(case when seqnum_desc = 1 then latitude end) as end_latitude,
           min(case when seqnum_desc = 1 then longitude end) as end_longitude
    from (select v.*,
                 row_number() over (partition by vehicleid, tripid order by datetime asc) as seqnum_asc,
                 row_number() over (partition by vehicleid, tripid order by datetime desc) as seqnum_desc
          from vehicles v
         ) v
    where seqnum_asc = 1 or seqnum_desc = 1
    group by vehicleid, tripid;
    

    [Comments]:

      [Solution 2]:

      Here is another option using aggregate functions:

      #standardSQL
      WITH Vehicles AS (
       SELECT 121 AS VehicleId, 131 AS TripId, 33.645 AS Latitude, -84.424 AS Longitude, DATETIME "2017-03-12 12:00:00" AS DateTime UNION ALL
       SELECT 121, 131, 33.452, -84.409, DATETIME "2017-03-12 12:01:00" UNION ALL
       SELECT 121, 131, 33.635, -84.424, DATETIME "2017-03-12 12:01:32" UNION ALL
       SELECT 121, 131, 35.717, -85.121, DATETIME "2017-03-12 13:00:56" UNION ALL
       SELECT 121, 131, 35.111, -85.111, DATETIME "2017-03-12 20:30:47"
      )
      SELECT
        VehicleId,
        TripId,
        ARRAY_AGG(STRUCT(Latitude, Longitude)
                  ORDER BY DateTime ASC LIMIT 1)[OFFSET(0)] AS start_location,
        ARRAY_AGG(STRUCT(Latitude, Longitude)
                  ORDER BY DateTime DESC LIMIT 1)[OFFSET(0)] AS end_location
      FROM Vehicles
      GROUP BY
        VehicleId,
        TripId;
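
To get the flat StartLat / StartLong / EndLat / EndLong columns the question asks for, the struct fields can be unpacked in an outer query. A minimal sketch, assuming the same sample rows as above (the timestamps are illustrative, not from the question, and the `Trips` CTE name is made up for this example):

```sql
#standardSQL
WITH Vehicles AS (
  -- Sample rows; the DateTime values are assumed for illustration only.
  SELECT 121 AS VehicleId, 131 AS TripId, 33.645 AS Latitude, -84.424 AS Longitude, DATETIME "2017-03-12 12:00:00" AS DateTime UNION ALL
  SELECT 121, 131, 33.452, -84.409, DATETIME "2017-03-12 12:01:00" UNION ALL
  SELECT 121, 131, 33.635, -84.424, DATETIME "2017-03-12 12:01:32" UNION ALL
  SELECT 121, 131, 35.717, -85.121, DATETIME "2017-03-12 13:00:56" UNION ALL
  SELECT 121, 131, 35.111, -85.111, DATETIME "2017-03-12 20:30:47"
),
Trips AS (
  -- One row per trip, with the earliest and latest point kept as structs.
  SELECT
    VehicleId,
    TripId,
    ARRAY_AGG(STRUCT(Latitude, Longitude)
              ORDER BY DateTime ASC LIMIT 1)[OFFSET(0)] AS start_location,
    ARRAY_AGG(STRUCT(Latitude, Longitude)
              ORDER BY DateTime DESC LIMIT 1)[OFFSET(0)] AS end_location
  FROM Vehicles
  GROUP BY VehicleId, TripId
)
-- Unpack the struct fields into plain columns.
SELECT
  VehicleId,
  TripId,
  start_location.Latitude  AS StartLat,
  start_location.Longitude AS StartLong,
  end_location.Latitude    AS EndLat,
  end_location.Longitude   AS EndLong
FROM Trips;
```

With these sample rows the query should return a single row for vehicle 121, trip 131, starting at (33.645, -84.424) and ending at (35.111, -85.111).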
      

      [Comments]:

        [Solution 3]:

        With SQL Server 2012 or later, you can also use:

        SELECT DISTINCT VehicleId, TripId,
            FIRST_VALUE(Latitude) OVER (PARTITION BY VehicleId, TripId ORDER BY [Datetime]) AS StartLatitude,
            LAST_VALUE(Latitude) OVER (PARTITION BY VehicleId, TripId ORDER BY [Datetime] ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS EndLatitude,
            FIRST_VALUE(Longitude) OVER (PARTITION BY VehicleId, TripId ORDER BY [Datetime]) AS StartLongitude,
            LAST_VALUE(Longitude) OVER (PARTITION BY VehicleId, TripId ORDER BY [Datetime] ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS EndLongitude
        FROM    dbo.Vehicles
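
The explicit `ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING` frame matters here: when a window has an ORDER BY but no frame clause, the default frame ends at the current row, so LAST_VALUE would simply return each row's own value. One way to avoid the frame clause altogether is FIRST_VALUE over a descending order. A minimal sketch, assuming [Datetime] is unique within a trip; the inline rows stand in for dbo.Vehicles and their timestamps are illustrative, not from the question:

```sql
-- Sample rows standing in for dbo.Vehicles; timestamps are assumed values.
WITH Vehicles AS (
    SELECT *
    FROM (VALUES
        (121, 131, 33.645, -84.424, CAST('2017-03-12T12:00:00' AS datetime2)),
        (121, 131, 33.452, -84.409, CAST('2017-03-12T12:01:00' AS datetime2)),
        (121, 131, 33.635, -84.424, CAST('2017-03-12T12:01:32' AS datetime2)),
        (121, 131, 35.717, -85.121, CAST('2017-03-12T13:00:56' AS datetime2)),
        (121, 131, 35.111, -85.111, CAST('2017-03-12T20:30:47' AS datetime2))
    ) AS t (VehicleId, TripId, Latitude, Longitude, [Datetime])
)
SELECT DISTINCT VehicleId, TripId,
    -- Earliest row of the trip: first value in ascending time order.
    FIRST_VALUE(Latitude)  OVER (PARTITION BY VehicleId, TripId ORDER BY [Datetime] ASC)  AS StartLatitude,
    FIRST_VALUE(Longitude) OVER (PARTITION BY VehicleId, TripId ORDER BY [Datetime] ASC)  AS StartLongitude,
    -- Latest row of the trip: first value in descending time order.
    FIRST_VALUE(Latitude)  OVER (PARTITION BY VehicleId, TripId ORDER BY [Datetime] DESC) AS EndLatitude,
    FIRST_VALUE(Longitude) OVER (PARTITION BY VehicleId, TripId ORDER BY [Datetime] DESC) AS EndLongitude
FROM Vehicles;
```

If two points in a trip share the same timestamp, which of them counts as first or last is not deterministic, so a tie-breaking column in the ORDER BY would be needed in that case.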
        

        [Comments]:
