【Question title】: Group similar URLs
【Posted】: 2019-06-21 04:43:50
【Question】:

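I want to fetch all requests to xmlrpc.php and wp-login.php, so I use wildcards in the WHERE clause.

This causes a problem, though: instead of returning just two rows, one for xmlrpc and one for wp-login, the output also lists every URL that has a query string appended as a separate row. I want every matching request to be counted, but grouped and displayed under either xmlrpc.php or wp-login.php.

I'm a MySQL n00b and have been playing with SUBSTR, REPLACE, and GROUP_CONCAT, but can't get it to work.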

WITH 
  subq AS (
    SELECT url, COUNT(url) AS count
    FROM `flywheel-production.fastly_logs.ingress_logs`
    WHERE timestamp > TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL -1 DAY)
      AND (url LIKE "/wp-login.php%" OR url LIKE "/xmlrpc.php%")
      AND site_hash = "btmpuroizf"
    GROUP BY url
  )

SELECT 
  url,
  count,
  ROUND(count / (SELECT SUM(count) FROM subq) * 100, 2) AS percent
FROM subq
ORDER BY count DESC

Any help would be greatly appreciated. Thanks!

【Discussion】:

    Tags: mysql sql google-bigquery


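    【Solution 1】:

    For BigQuery Standard SQL

    The adjusted query below should do the "trick". It uses REGEXP_EXTRACT to keep only the part of each URL before the first "?", so that all variants of a URL are grouped and counted together.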

    #standardSQL
    WITH subq AS (
      -- Capture everything before the first "?" (or the whole URL if it has no query string)
      SELECT REGEXP_EXTRACT(url, r'(.*?)(?:\?|$)') AS url, COUNT(url) AS COUNT
      FROM `flywheel-production.fastly_logs.ingress_logs`
      WHERE timestamp > TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL -1 DAY) 
      AND (url LIKE "/wp-login.php%" OR  url LIKE "/xmlrpc.php%")
      AND site_hash = "btmpuroizf"
      GROUP BY url
    )
    SELECT 
      url,
      COUNT,
      ROUND(COUNT / (SELECT SUM(COUNT) FROM subq) * 100, 2) AS percent
    FROM subq
    ORDER BY COUNT DESC  
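
To see what the regular expression in the query does without running anything in BigQuery, here is a minimal Python sketch. It applies the same pattern with Python's `re` module (not the RE2 engine BigQuery uses, though this particular pattern should behave the same way for single-line URLs), then counts requests per base URL and computes each share of the total. The function names and the sample URLs are hypothetical and made up for illustration; they do not come from the real log table.

```python
import re
from collections import Counter

# Same pattern as in the REGEXP_EXTRACT call: lazily capture everything
# up to the first "?" or, if there is none, up to the end of the string.
BASE_URL = re.compile(r'(.*?)(?:\?|$)')


def base_url(url: str) -> str:
    """Return the URL with any query string removed."""
    return BASE_URL.match(url).group(1)


def grouped_percentages(urls):
    """Count requests per base URL and compute each share as a percentage."""
    counts = Counter(base_url(u) for u in urls)
    total = sum(counts.values())
    # most_common() sorts by count, highest first, like ORDER BY COUNT DESC.
    return [(u, n, round(n / total * 100, 2)) for u, n in counts.most_common()]


# Hypothetical sample requests (assumption: not taken from the real logs).
sample = [
    "/wp-login.php",
    "/wp-login.php?action=register",
    "/xmlrpc.php",
    "/wp-login.php?redirect_to=%2Fwp-admin%2F",
    "/xmlrpc.php?rsd",
]

rows = grouped_percentages(sample)
for row in rows:
    print(row)
```

With this sample, the five requests collapse into two rows, one for /wp-login.php and one for /xmlrpc.php, which is the shape of result the question asks for.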
    

    【Comments】:

    • You rock! Works perfectly. Thank you!