【Question title】: Finding the maximum distance travelled
【Posted】: 2017-08-03 14:26:56
【Question】:

I have a set of GPS points recorded for each vehicle. I am trying to retrieve the maximum distance each vehicle travelled on each trip.

Data:

    VehicleId       TripId          Latitude            Longitude
    121             131             33.645              -84.424
    121             131             33.452              -84.409
    121             131             33.635              -84.424
    121             131             35.717              -85.121
    121             131             35.111              -85.111

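From the dataset above, my result set should look like the following: StartLat and StartLong stay the same for each combination of VehicleId and TripId, while EndLat and EndLong change from row to row, so that I can find the maximum distance each vehicle travelled from its starting point.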

    VehicleId       TripId          StartLat            StartLong       EndLat          EndLong
    121             131             33.645              -84.424         33.645              -84.424
    121             131             33.645              -84.424         33.452              -84.409
    121             131             33.645              -84.424         33.635              -84.424
    121             131             33.645              -84.424         35.717              -85.121
    121             131             33.645              -84.424         35.111              -85.111

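I tried the query below, but when I generalize it I get the error "Correlated subqueries that reference other tables are not supported unless they can be de-correlated, such as by transforming them into an efficient JOIN." The version shown here works for a specific VehicleId and TripId, but I have not been able to generalize it to all combinations. Any help would be greatly appreciated.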

    SELECT
      a.VehicleId,
      a.Tripid,
      a.Latitude AS StartLat,
      a.Longitude AS StartLong,
      b.Latitude AS EndLat,
      b.Longitude AS EndLong,
      a.DateTime
    FROM
      `Vehicles` AS a
    JOIN
      `Vehicles` AS b
    ON
      a.VehicleId = b.VehicleId
      AND a.Tripid = b.Tripid
    WHERE
      a.VehicleId = 550340912
      AND a.Tripid = 18006167 AND
      a.DateTime IN (
      SELECT
        MIN(DateTime)
      FROM
        `Vehicles`
      WHERE
        VehicleId = 550340912
        AND Tripid = 18006167)

【Comments】:

    Tags: sql google-bigquery


    【Solution 1】:

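    "Distance travelled" is somewhat ambiguous for latitude/longitude pairs, but I'll assume Haversine distance for this solution. Here is the full query, including setup, based on my answer to a previous SO post about Haversine distance.

    The idea is to get the start and end of each trip associated with a vehicle ID (building an array of all entries), and then use a subquery over the array to pick the entry with the greatest distance. If you need a different metric, you can substitute it for the HAVERSINE function that I used.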

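    #standardSQL
    -- Degrees to radians (ACOS(-1) is pi).
    CREATE TEMP FUNCTION RADIANS(x FLOAT64) AS (
      ACOS(-1) * x / 180
    );
    -- Central angle in radians to kilometers, at 111.045 km per degree.
    CREATE TEMP FUNCTION RADIANS_TO_KM(x FLOAT64) AS (
      111.045 * 180 * x / ACOS(-1)
    );
    -- Great-circle distance in km between two lat/long points given in degrees.
    -- The cosine is clamped to [-1, 1] because rounding can push it slightly
    -- out of range for identical points, which makes ACOS raise an error.
    CREATE TEMP FUNCTION HAVERSINE(lat1 FLOAT64, long1 FLOAT64,
                                   lat2 FLOAT64, long2 FLOAT64) AS (
      RADIANS_TO_KM(
        ACOS(LEAST(1.0, GREATEST(-1.0,
             COS(RADIANS(lat1)) * COS(RADIANS(lat2)) *
             COS(RADIANS(long1) - RADIANS(long2)) +
             SIN(RADIANS(lat1)) * SIN(RADIANS(lat2))))))
    );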
    
    WITH Vehicles AS (
     SELECT 121 AS VehicleId, 131 AS TripId, 33.645 AS Latitude, -84.424 AS Longitude, DATETIME "2017-03-12 12:00:00" AS DateTime UNION ALL
     SELECT 121, 131, 33.452, -84.409, DATETIME "2017-03-12 12:01:00" UNION ALL
     SELECT 121, 131, 33.635, -84.424, DATETIME "2017-03-12 12:01:32" UNION ALL
     SELECT 121, 131, 35.717, -85.121, DATETIME "2017-03-12 13:00:56" UNION ALL
     SELECT 121, 131, 35.111, -85.111, DATETIME "2017-03-12 20:30:47"
    )
    SELECT
      (SELECT vehicle_and_distance
       FROM UNNEST(vehicles_and_distances) AS vehicle_and_distance
       ORDER BY vehicle_and_distance.distance DESC LIMIT 1).*
    FROM (
      SELECT
        ARRAY_AGG(
          STRUCT(VehicleId,
                 HAVERSINE(start_location.Latitude, start_location.Longitude,
                           end_location.Latitude, end_location.Longitude) AS distance)
        ) AS vehicles_and_distances
      FROM (
        SELECT
          VehicleId,
          TripId,
          ARRAY_AGG(STRUCT(Latitude, Longitude)
                    ORDER BY DateTime ASC LIMIT 1)[OFFSET(0)] AS start_location,
          ARRAY_AGG(STRUCT(Latitude, Longitude)
                    ORDER BY DateTime DESC LIMIT 1)[OFFSET(0)] AS end_location
        FROM Vehicles
        GROUP BY
          VehicleId,
          TripId
      )
      GROUP BY TripId
    );
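
As a quick cross-check outside BigQuery, the distance formula used by the HAVERSINE UDF above can be reproduced in a few lines of Python. This is only a sketch: the function name `great_circle_km` is made up for illustration, the 111.045 km-per-degree constant is copied from the query, and the clamp is an added safeguard against rounding. Note that the expression inside ACOS is the spherical law of cosines, which mathematically yields the same great-circle distance as the haversine formula.

```python
import math

KM_PER_DEGREE = 111.045  # constant taken from RADIANS_TO_KM in the query


def great_circle_km(lat1, long1, lat2, long2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    delta_long = math.radians(long1) - math.radians(long2)
    cos_angle = (math.cos(phi1) * math.cos(phi2) * math.cos(delta_long)
                 + math.sin(phi1) * math.sin(phi2))
    # Clamp to [-1, 1]: rounding can push the value just past 1 for
    # identical points, and acos would then raise a domain error.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return KM_PER_DEGREE * math.degrees(math.acos(cos_angle))


# First and last sample points of trip 131 (ordered by DateTime).
start = (33.645, -84.424)
end = (35.111, -85.111)
print(round(great_circle_km(*start, *end), 1), "km")
```

Comparing this value with the `distance` column returned by the query is a cheap way to confirm that the UDFs were copied correctly.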
    

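    Edit: for completeness, it is also interesting to consider the total distance travelled along the route, rather than just the straight-line distance between the start and end points. Here is another query that computes the sum of the Haversine distances along the route by looking at consecutive pairs of points: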

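    #standardSQL
    -- Degrees to radians (ACOS(-1) is pi).
    CREATE TEMP FUNCTION RADIANS(x FLOAT64) AS (
      ACOS(-1) * x / 180
    );
    -- Central angle in radians to kilometers, at 111.045 km per degree.
    CREATE TEMP FUNCTION RADIANS_TO_KM(x FLOAT64) AS (
      111.045 * 180 * x / ACOS(-1)
    );
    -- Great-circle distance in km between two lat/long points given in degrees.
    -- The cosine is clamped to [-1, 1] because consecutive GPS fixes at the
    -- same position can round slightly out of range and make ACOS raise an error.
    CREATE TEMP FUNCTION HAVERSINE(lat1 FLOAT64, long1 FLOAT64,
                                   lat2 FLOAT64, long2 FLOAT64) AS (
      RADIANS_TO_KM(
        ACOS(LEAST(1.0, GREATEST(-1.0,
             COS(RADIANS(lat1)) * COS(RADIANS(lat2)) *
             COS(RADIANS(long1) - RADIANS(long2)) +
             SIN(RADIANS(lat1)) * SIN(RADIANS(lat2))))))
    );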
    
    WITH Vehicles AS (
     SELECT 121 AS VehicleId, 131 AS TripId, 33.645 AS Latitude, -84.424 AS Longitude, DATETIME "2017-03-12 12:00:00" AS DateTime UNION ALL
     SELECT 121, 131, 33.452, -84.409, DATETIME "2017-03-12 12:01:00" UNION ALL
     SELECT 121, 131, 33.635, -84.424, DATETIME "2017-03-12 12:01:32" UNION ALL
     SELECT 121, 131, 35.717, -85.121, DATETIME "2017-03-12 13:00:56" UNION ALL
     SELECT 121, 131, 35.111, -85.111, DATETIME "2017-03-12 20:30:47"
    )
    SELECT
      TripId,
      vehicle_and_distance.*
    FROM (
      SELECT
        TripId,
        ARRAY_AGG(STRUCT(VehicleId, total_distance)
                  ORDER BY total_distance DESC)[OFFSET(0)] AS vehicle_and_distance
      FROM (
        SELECT
          VehicleId,
          TripId,
          (SELECT
             SUM(HAVERSINE(
                   Latitude, Longitude,
                   vehicle_locations[OFFSET(off - 1)].Latitude,
                   vehicle_locations[OFFSET(off - 1)].Longitude))
           FROM UNNEST(vehicle_locations) WITH OFFSET off
           WHERE off > 0) AS total_distance
        FROM (
          SELECT
            VehicleId,
            TripId,
            ARRAY_AGG(STRUCT(Latitude, Longitude)
                      ORDER BY DateTime ASC) AS vehicle_locations
          FROM Vehicles
          GROUP BY
            VehicleId,
            TripId
        )
      )
      GROUP BY TripId
    );
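
The consecutive-pair summation in the second query can be mirrored in Python to check results on a small sample. This is a minimal sketch that assumes the points are already sorted by DateTime; `great_circle_km` and `route_length_km` are illustrative names, not part of the original answer.

```python
import math


def great_circle_km(lat1, long1, lat2, long2):
    """Same formula as the HAVERSINE UDF: distance in km, inputs in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    cos_angle = (math.cos(phi1) * math.cos(phi2)
                 * math.cos(math.radians(long1 - long2))
                 + math.sin(phi1) * math.sin(phi2))
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    return 111.045 * math.degrees(math.acos(cos_angle))


def route_length_km(points):
    """Sum the leg distances between consecutive (lat, long) points."""
    return sum(great_circle_km(*prev, *cur)
               for prev, cur in zip(points, points[1:]))


# Sample trip 131, ordered by DateTime as in the Vehicles CTE.
trip = [(33.645, -84.424), (33.452, -84.409), (33.635, -84.424),
        (35.717, -85.121), (35.111, -85.111)]
print(round(route_length_km(trip), 1), "km along the route")
```

Pairing `points` with `points[1:]` plays the same role as `vehicle_locations[OFFSET(off - 1)]` with `WHERE off > 0` in the query: each point is matched with its predecessor, and the first point contributes no leg.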
    

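    【Discussion】:

    • Thanks for the solution. I already have a UDF that calculates the distance between two GPS points. The problem I am facing is that I cannot generalize the query above. I tried your query, but it does not give the maximum distance each vehicle travelled on a trip.
    • Can you be more specific? What result are you getting? Are you computing the distance between the start and end points, or between all consecutive points?
    • I am computing the distance between the starting point of each group and all of the subsequent points. In my example, that is the distance between (33.645, -84.424) and every other point in sequence for each VehicleId and TripId group.
    • Can you try running just the inner query from the second example (the part with SELECT VehicleId, TripId, (SELECT SUM(HAVERSINE(...) and check the output? In theory, that should give you the distance for each trip/vehicle pair. If your distance function can return negative values, you may need to use ABS.
    • Thanks, Elliott. I apologize for not stating the problem properly. I also tried running the inner query but did not get the expected result, so I have opened a separate thread that states my problem clearly.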
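
The last comments clarify that the asker wants the distance from each trip's first point to every later point, with the maximum taken per VehicleId/TripId pair. That logic is sketched below in Python for illustration only. It is not from the thread: the row layout and the function names are assumptions, and in BigQuery the same idea would presumably be written with a window function or a join on the per-trip minimum DateTime.

```python
import math
from collections import defaultdict


def great_circle_km(lat1, lon1, lat2, lon2):
    """Same formula as the HAVERSINE UDF: distance in km, inputs in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    cos_angle = (math.cos(phi1) * math.cos(phi2)
                 * math.cos(math.radians(lon1 - lon2))
                 + math.sin(phi1) * math.sin(phi2))
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    return 111.045 * math.degrees(math.acos(cos_angle))


def max_distance_from_start(rows):
    """rows: iterable of (vehicle_id, trip_id, timestamp, lat, lon).

    Returns {(vehicle_id, trip_id): max km from the trip's first point}.
    """
    trips = defaultdict(list)
    for vehicle_id, trip_id, timestamp, lat, lon in rows:
        trips[(vehicle_id, trip_id)].append((timestamp, lat, lon))

    result = {}
    for key, points in trips.items():
        points.sort()  # earliest timestamp first
        _, start_lat, start_lon = points[0]
        result[key] = max(great_circle_km(start_lat, start_lon, lat, lon)
                          for _, lat, lon in points)
    return result


# Sample data from the answer's Vehicles CTE; ISO-style timestamp strings
# sort chronologically, so no date parsing is needed for this sketch.
rows = [
    (121, 131, "2017-03-12 12:00:00", 33.645, -84.424),
    (121, 131, "2017-03-12 12:01:00", 33.452, -84.409),
    (121, 131, "2017-03-12 12:01:32", 33.635, -84.424),
    (121, 131, "2017-03-12 13:00:56", 35.717, -85.121),
    (121, 131, "2017-03-12 20:30:47", 35.111, -85.111),
]
print(max_distance_from_start(rows))
```

For the sample trip, the farthest point from the start is the fourth one (35.717, -85.121), not the final point, which is why a start-to-end distance alone does not answer this version of the question.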