【Question Title】:Getting rid of duplicate values - SQL Server
【Posted】:2013-07-10 23:23:13
【Question】:

I have the following 5 tables -

CREATE TABLE [dbo].[MSP_EpmProject](
    [ProjectUID] [uniqueidentifier] NOT NULL,
    [ProjectName] [nvarchar](255) NOT NULL,
    [ProjectAuthorName] [nvarchar](255) NULL,
 CONSTRAINT [PK_MSP_EpmProject] PRIMARY KEY CLUSTERED 
([ProjectUID] ASC)) 

CREATE TABLE [dbo].[Project_CI_Mapping](
    [ProjectName] [nvarchar](255) NOT NULL,
    [CI] [nvarchar](100) NOT NULL)

CREATE TABLE [dbo].[ca_owned_resource]( 
    [resource_name] [nvarchar](100) NOT NULL,
    [resource_description] [nvarchar](255) NULL,
    [resource_family] [int] NULL,
    [resource_class] [int] NOT NULL,
    [resource_status] [int] NULL,   
    [resource_tag] [nvarchar](64) NULL)

CREATE TABLE [dbo].[DimTeamProject](
    [ProjectNodeSK] [int] IDENTITY(1,1) NOT NULL,
    [ProjectNodeGUID] [uniqueidentifier] NOT NULL,
    [ProjectNodeName] [nvarchar](256) NULL,
PRIMARY KEY CLUSTERED 
([ProjectNodeSK] ASC))

CREATE TABLE [dbo].[DimIteration](
    [IterationSK] [int] IDENTITY(1,1) NOT NULL,
    [IterationName] [nvarchar](256) NULL,
    [IterationGUID] [nvarchar](256) NOT NULL,   
PRIMARY KEY CLUSTERED 
([IterationSK] ASC))

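I have a simple query that tries to fetch columns from all of these tables, but it returns duplicate values. Using an INNER JOIN returns duplicates, and using a LEFT OUTER JOIN gives NULL values for DimIteration.IterationName.

The query is -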

select m.ProjectName,m.ProjectAuthorName "Project Manager", p.CI,c.resource_tag "Alt CI ID", i.IterationName 
from MSP_EpmProject m, Project_CI_Mapping p, ca_owned_resource c, DimTeamProject t, DimIteration i
where i.ProjectGUID = UPPER(CAST(t.ProjectNodeGUID AS NVARCHAR(256)))
and p.CI = c.resource_name
and m.ProjectName = p.ProjectName
order by m.ProjectName,m.ProjectAuthorName, p.CI,c.resource_tag, i.IterationName

The possible mappings are -

MSP_EpmProject.ProjectName =  Project_CI_Mapping.ProjectName 
Project_CI_Mapping.CI = ca_owned_resource.resource_name
ca_owned_resource.resource_tag = DimTeamProject.ProjectNodeName
DimIteration.ProjectGUID = UPPER(CAST(DimTeamProject.ProjectNodeGUID AS NVARCHAR(256)))

What would be a suitable solution for this?

Thanks.

【Comments】:

  • Welcome to the reason most database tables have an ID :)

Tags: sql sql-server sql-server-2008


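【Solution 1】:

You have a CROSS JOIN in your query. If you rewrite it using the newer ANSI-92 join syntax (which I would recommend anyway, for reasons explained here), you can see where the cross join is: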

select  m.ProjectName,
        m.ProjectAuthorName "Project Manager", 
        p.CI,c.resource_tag "Alt CI ID", 
        i.IterationName 
from    MSP_EpmProject m
        INNER JOIN Project_CI_Mapping p
            ON m.ProjectName = p.ProjectName
        INNER JOIN ca_owned_resource c
            ON p.CI = c.resource_name
        CROSS JOIN DimTeamProject t
        INNER JOIN DimIteration i
            ON i.ProjectGUID = UPPER(CAST(t.ProjectNodeGUID AS NVARCHAR(256)))
order by m.ProjectName,m.ProjectAuthorName, p.CI,c.resource_tag, i.IterationName;

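Basically, nothing relates DimTeamProject to any of the preceding tables, so every row it contributes is paired with every row from the tables before it. Given that you list

ca_owned_resource.resource_tag = DimTeamProject.ProjectNodeName

as a possible relationship, yet it does not appear in your query at all, I would suggest that your query needs to be: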

select m.ProjectName,m.ProjectAuthorName "Project Manager", p.CI,c.resource_tag "Alt CI ID", i.IterationName 
from MSP_EpmProject m, Project_CI_Mapping p, ca_owned_resource c, DimTeamProject t, DimIteration i
where i.ProjectGUID = UPPER(CAST(t.ProjectNodeGUID AS NVARCHAR(256)))
and p.CI = c.resource_name
and m.ProjectName = p.ProjectName
and c.resource_tag = t.ProjectNodeName -- NEW Clause
order by m.ProjectName,m.ProjectAuthorName, p.CI,c.resource_tag, i.IterationName

However, as I have already said, I would recommend using ANSI-92 explicit joins, in which case your query becomes:

SELECT  m.ProjectName,
        m.ProjectAuthorName "Project Manager", 
        p.CI,c.resource_tag "Alt CI ID", 
        i.IterationName 
FROM    MSP_EpmProject m
        INNER JOIN Project_CI_Mapping p
            ON m.ProjectName = p.ProjectName
        INNER JOIN ca_owned_resource c
            ON p.CI = c.resource_name
        INNER JOIN DimTeamProject t
            ON t.ProjectNodeName = c.resource_tag
        INNER JOIN DimIteration i
            ON i.ProjectGUID = UPPER(CAST(t.ProjectNodeGUID AS NVARCHAR(256)))
ORDER BY m.ProjectName,m.ProjectAuthorName, p.CI,c.resource_tag, i.IterationName;
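
To make the effect of the missing predicate concrete, here is a minimal sketch that runs both forms of the join against a tiny in-memory database. It uses Python's sqlite3 module as a stand-in for SQL Server, so the following are assumptions and not part of the original question: the sample rows, plain text GUIDs in place of UPPER(CAST(... AS NVARCHAR(256))), a ProjectGUID column on DimIteration (the query references it although the posted CREATE TABLE does not list it), and the omission of MSP_EpmProject and Project_CI_Mapping to keep the example short.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ca_owned_resource (resource_name TEXT, resource_tag TEXT);
CREATE TABLE DimTeamProject (ProjectNodeGUID TEXT, ProjectNodeName TEXT);
CREATE TABLE DimIteration (IterationName TEXT, ProjectGUID TEXT);

-- Hypothetical sample data: two resources, two team projects, three iterations.
INSERT INTO ca_owned_resource VALUES ('CI-1', 'Alpha'), ('CI-2', 'Beta');
INSERT INTO DimTeamProject VALUES ('GUID-A', 'Alpha'), ('GUID-B', 'Beta');
INSERT INTO DimIteration VALUES
    ('Sprint 1', 'GUID-A'), ('Sprint 2', 'GUID-A'), ('Sprint 1', 'GUID-B');
""")

# Old-style join with nothing tying c to t: every resource is paired
# with every (team project, iteration) combination.
without_link = conn.execute("""
    SELECT c.resource_tag, i.IterationName
    FROM ca_owned_resource c, DimTeamProject t, DimIteration i
    WHERE i.ProjectGUID = t.ProjectNodeGUID
""").fetchall()

# Explicit joins including the resource_tag = ProjectNodeName predicate.
with_link = conn.execute("""
    SELECT c.resource_tag, i.IterationName
    FROM ca_owned_resource c
    INNER JOIN DimTeamProject t ON t.ProjectNodeName = c.resource_tag
    INNER JOIN DimIteration i ON i.ProjectGUID = t.ProjectNodeGUID
""").fetchall()

print(len(without_link))  # 6 rows: 2 resources x 3 iteration rows
print(len(with_link))     # 3 rows: each resource only gets its own iterations
```

With this sample data the unlinked query returns twice as many rows as the linked one, because each of the two resources is combined with all three iteration rows rather than only with the iterations of its own team project.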

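【Comments】:

    【Solution 2】:

    Without looking at the problem in too much detail, one way to get rid of the duplicates is to insert a GROUP BY clause before the ORDER BY, like this: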

    select m.ProjectName,m.ProjectAuthorName "Project Manager", p.CI,c.resource_tag "Alt CI ID", i.IterationName 
    from MSP_EpmProject m, Project_CI_Mapping p, ca_owned_resource c, DimTeamProject t, DimIteration i
    where i.ProjectGUID = UPPER(CAST(t.ProjectNodeGUID AS NVARCHAR(256)))
    and p.CI = c.resource_name
    and m.ProjectName = p.ProjectName
    GROUP BY m.ProjectName,m.ProjectAuthorName, p.CI,c.resource_tag, i.IterationName
    order by m.ProjectName,m.ProjectAuthorName, p.CI,c.resource_tag, i.IterationName
    

    Another way is to insert DISTINCT after SELECT and before the first column you want returned,

    for example SELECT DISTINCT m.ProjectName...

    【Comments】:
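
As a rough check of what GROUP BY and DISTINCT do here, the sketch below applies both to a query that still lacks the predicate linking ca_owned_resource to DimTeamProject. It again uses Python's sqlite3 as a stand-in for SQL Server, with hypothetical sample rows, plain text GUIDs, an assumed ProjectGUID column on DimIteration, and only three of the five tables. Under those assumptions, both clauses collapse identical rows but keep pairings that a properly linked join would not return.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ca_owned_resource (resource_name TEXT, resource_tag TEXT);
CREATE TABLE DimTeamProject (ProjectNodeGUID TEXT, ProjectNodeName TEXT);
CREATE TABLE DimIteration (IterationName TEXT, ProjectGUID TEXT);

-- Hypothetical sample data.
INSERT INTO ca_owned_resource VALUES ('CI-1', 'Alpha'), ('CI-2', 'Beta');
INSERT INTO DimTeamProject VALUES ('GUID-A', 'Alpha'), ('GUID-B', 'Beta');
INSERT INTO DimIteration VALUES
    ('Sprint 1', 'GUID-A'), ('Sprint 2', 'GUID-A'), ('Sprint 1', 'GUID-B');
""")

# FROM/WHERE part that is still missing the c-to-t predicate.
unlinked = """
    FROM ca_owned_resource c, DimTeamProject t, DimIteration i
    WHERE i.ProjectGUID = t.ProjectNodeGUID
"""

# DISTINCT and GROUP BY over the same column list remove exact duplicates.
distinct_rows = set(conn.execute(
    "SELECT DISTINCT c.resource_tag, i.IterationName" + unlinked))
grouped_rows = set(conn.execute(
    "SELECT c.resource_tag, i.IterationName" + unlinked
    + "GROUP BY c.resource_tag, i.IterationName"))

# Reference result with the linking predicate in place.
joined_rows = set(conn.execute("""
    SELECT c.resource_tag, i.IterationName
    FROM ca_owned_resource c
    INNER JOIN DimTeamProject t ON t.ProjectNodeName = c.resource_tag
    INNER JOIN DimIteration i ON i.ProjectGUID = t.ProjectNodeGUID
"""))

print(distinct_rows == grouped_rows)        # True
print(sorted(distinct_rows - joined_rows))  # [('Beta', 'Sprint 2')]
```

In this sample, the deduplicated result still contains ('Beta', 'Sprint 2'), an iteration that belongs only to the Alpha team project, so it is worth confirming the join conditions before relying on either clause.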
