【Question title】:How to count the most frequent CloseReasonTypes per post in the Data Explorer?
【Posted】:2014-06-29 19:32:59
【Question】:

I started writing this query, and I find it hard to understand why a given question should be closed.

select
   TOP ##Limit:int?38369## -- The maximum value the hardware can handle.
   Posts.Id as [Post Link], -- Question title.
   Count(PendingFlags.PostId) as [Number of pending flags], -- Number of pending flags per question.
   Posts.OwnerUserId as [User Link], -- Lets you click the column to see whether the same user often asks off-topic questions.
   Reputation as [User Reputation], -- Interesting to see that such questions are sometimes asked by high-rep users.
   Posts.Score as [Votes], -- Interesting to see that some questions have more than 100 upvotes.
   Posts.AnswerCount as [Number of Answers], -- I thought we shouldn't answer off-topic posts.
   Posts.FavoriteCount as [Number of Stars], -- Some questions seem to be very helpful :) .
   Posts.CreationDate as [Asked on], -- The older the question, the higher the chance that its flags never get reviewed.
   Posts.LastActivityDate as [last activity], -- Similar effect as with Posts.CreationDate.
   Posts.LastEditDate as [modified on],
   Posts.ViewCount
from posts
   LEFT OUTER JOIN Users on Users.id = posts.OwnerUserId
   INNER JOIN PendingFlags on PendingFlags.PostId = Posts.Id
where ClosedDate IS NULL -- The question is not closed.
group by Posts.id, Posts.OwnerUserId, Reputation, Posts.Score, Posts.FavoriteCount, Posts.AnswerCount, Posts.CreationDate, Posts.LastActivityDate, Posts.LastEditDate, Posts.ViewCount
order by Count(PendingFlags.PostId) desc; -- Questions with more flags are more likely to get them handled, and more likely to be off-topic (since several users have already reviewed the question).

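Since each question has several flags, I can't use a simple table to show the reason given by each flag. Instead, I think it makes sense to show the most common CloseReasonTypes.Id value for each post. This leads me to two problems:

  • First: after looking at this query, I should JOIN CloseReasonTypes to PendingFlags to show the reason names instead of their numbers. Since there is no common field between Posts and PendingFlags, and since I use from posts as the base for joining the tables, I don't know how to write this join.

  • Second: I don't know how to select the most frequent close reason on each row. Several questions seem to discuss similar cases, but I can't use their answers: they ask how to find the most common value over a whole table, which yields a result with a single column and a single row, whereas I need this per post, alongside the flag count for each post.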

【Comments】:

    Tags: sql sql-server join sql-server-2014 dataexplorer


    【Solution 1】:

    It's not exactly what you want, but I believe this query will give you a good start.

    select
        PostId as [Post Link], 
        duplicate = sum(case when closereasontypeid = 101 then 1 else 0 end), 
        offtopic = sum(case when closereasontypeid = 102 then 1 else 0 end),
        unclear = sum(case when closereasontypeid = 103 then 1 else 0 end),
        toobroad = sum(case when closereasontypeid = 104 then 1 else 0 end),
        opinion = sum(case when closereasontypeid = 105 then 1 else 0 end),
        ot_superuser = sum(case when CloseAsOffTopicReasonTypeId = 4 then 1 else 0 end),
        ot_findexternal = sum(case when CloseAsOffTopicReasonTypeId = 8 then 1 else 0 end),
        ot_serverfault = sum(case when CloseAsOffTopicReasonTypeId = 7 then 1 else 0 end),
        ot_lackinfo = sum(case when CloseAsOffTopicReasonTypeId = 12 then 1 else 0 end),
        ot_typo = sum(case when CloseAsOffTopicReasonTypeId = 11 then 1 else 0 end)
    from pendingflags
    where 
        flagtypeid in (13,14)   -- Close flags
        and creationdate > '2014-04-15'
    group by PostId
    

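    This only looks at close flags created since 15 April 2014 (the creationdate filter), and returns about 23,500 records.

    I believe the Data Explorer does not contain deleted posts, so those are not included in the results.

    The query will need to be modified if/when close reasons are added or removed, because the reason ids are hard-coded.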
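
To get a single "most frequent reason" per post instead of one column per reason, one possible approach is to count flags per (post, reason) pair and then rank the reasons within each post with ROW_NUMBER(). The sketch below is untested against the live Data Explorer schema. It reuses the PendingFlags columns from the query above and assumes that CloseReasonTypes has Id and Name columns; the Name column and the output aliases are assumptions, not something shown in this thread.

```sql
-- Sketch only: CloseReasonTypes.Name is an assumed column;
-- the PendingFlags columns are the ones used in the query above.
WITH ReasonCounts AS (
    SELECT
        pf.PostId,
        pf.CloseReasonTypeId,
        COUNT(*) AS FlagCount,
        -- Rank the reasons inside each post, most frequent first.
        -- Ties are broken arbitrarily by the lower reason id.
        ROW_NUMBER() OVER (
            PARTITION BY pf.PostId
            ORDER BY COUNT(*) DESC, pf.CloseReasonTypeId
        ) AS ReasonRank
    FROM PendingFlags AS pf
    WHERE pf.FlagTypeId IN (13, 14)        -- close flags, as in the query above
      AND pf.CloseReasonTypeId IS NOT NULL -- skip flags without a close reason
    GROUP BY pf.PostId, pf.CloseReasonTypeId
)
SELECT
    rc.PostId    AS [Post Link],
    crt.Name     AS [Most frequent close reason],
    rc.FlagCount AS [Flags with that reason]
FROM ReasonCounts AS rc
    INNER JOIN CloseReasonTypes AS crt ON crt.Id = rc.CloseReasonTypeId
WHERE rc.ReasonRank = 1                    -- keep only the top reason per post
ORDER BY rc.FlagCount DESC;
```

Because the result has one row per PostId, it could be joined to Posts on Posts.Id = rc.PostId to add the reason as an extra column next to the other per-post columns. If tied reasons should all be shown, RANK() could replace ROW_NUMBER(), at the cost of more than one row for some posts.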

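    【Comments】:

    • It's a good start, but it's not what I'm looking for :) ... so upvoted but not accepted. It doesn't address my first or second problem at all (I know I could do this, but I already have other rows: it doesn't show the syntax for selecting the most common result).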