【Question title】: One-to-many join with three tables
【Posted】: 2015-11-18 17:07:03
【Question】:

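I have three tables, sites, campaigns, and out, for a campaign tracking system I'm building. When a link is clicked, the out table updates its hit count for the row where the site ID and campaign ID match.

The out table has a campaign_id and a site_id, which correspond to the campaigns and sites tables respectively. To complicate things, each site can have 4 campaigns (campaign_id_a, campaign_id_b, campaign_id_reviews, campaign_id_reviews_phone). I want to join the three tables so that, for each site, the following ends up on a single row: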

site.site_name, site.campaign_id_a, campaigns.campaign_name, out.hits, 
site.campaign_id_b, campaigns.campaign_name, out.hits, 
site.campaign_id_reviews, campaigns.campaign_name, out.hits, 
site.campaign_id_reviews_phone, campaigns.campaign_name, out.hits

Here is my attempt. It does not bring back all site_id/campaign_id combinations; it returns only one record per site_id:

SELECT s.*, c.*, o.* FROM sites s
INNER JOIN campaigns c ON s.campaign_id_a=c.campaign_id
INNER JOIN campaigns ON s.campaign_id_b=campaigns.campaign_id
INNER JOIN `out` o ON s.campaign_id_a=o.campaign_id AND s.site_id=o.site_id
WHERE s.site_id NOT IN(100,101)
ORDER BY o.site_id ASC

My CREATE TABLE statements, with a dump of a few records each:

CREATE TABLE IF NOT EXISTS `sites` (
  `site_id` mediumint(4) NOT NULL AUTO_INCREMENT,
  `site_name` varchar(70) NOT NULL,
  `campaign_id_a` tinyint(4) NOT NULL,
  `campaign_id_b` tinyint(4) NOT NULL,
  `a_display_name` varchar(50) NOT NULL,
  `b_display_name` varchar(50) NOT NULL,
  `campaign_id_reviews` tinyint(4) NOT NULL,
  `campaign_id_reviews_phone` tinyint(3) NOT NULL DEFAULT '4',
  PRIMARY KEY (`site_id`),
  UNIQUE KEY `site_id` (`site_id`),
  UNIQUE KEY `site_name` (`site_name`)
) ENGINE=InnoDB  DEFAULT CHARSET=latin1 AUTO_INCREMENT=102 ;

INSERT INTO `sites` (`site_id`, `site_name`, `campaign_id_a`, `campaign_id_b`, `a_display_name`, `b_display_name`, `campaign_id_reviews`, `campaign_id_reviews_phone`) VALUES
(1, 'example.com', 1, 8, 'hard456', 'easy123', 3, 4),
(2, 'example.org', 1, 8, 'hard456', 'easy123', 3, 4),
(3, 'example.net', 8, 8, 'easy123', 'easy123', 3, 4);



CREATE TABLE IF NOT EXISTS `out` (
  `out_id` mediumint(7) NOT NULL AUTO_INCREMENT,
  `site_id` tinyint(4) NOT NULL,
  `campaign_id` tinyint(4) NOT NULL DEFAULT '0',
  `hits` int(11) NOT NULL,
  `date_last_hit` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`out_id`),
  UNIQUE KEY `site2campaign` (`site_id`,`campaign_id`)
) ENGINE=MyISAM  DEFAULT CHARSET=latin1 AUTO_INCREMENT=101 ;

INSERT INTO `out` (`out_id`, `site_id`, `campaign_id`, `hits`, `date_last_hit`) VALUES
(19, 60, 3, 418, '2015-11-16 22:52:33'),
(10, 2, 1, 1135, '2015-11-15 04:51:32'),
(20, 60, 1, 1710, '2015-11-14 13:52:20');




CREATE TABLE IF NOT EXISTS `campaigns` (
  `campaign_id` tinyint(4) NOT NULL AUTO_INCREMENT,
  `campaign_name` varchar(60) NOT NULL,
  `network` varchar(60) NOT NULL,
  `url` varchar(400) NOT NULL,
  PRIMARY KEY (`campaign_id`)
) ENGINE=MyISAM  DEFAULT CHARSET=latin1 AUTO_INCREMENT=10 ;

INSERT INTO `campaigns` (`campaign_id`, `campaign_name`, `network`, `url`) VALUES
(1, 'Hard456', 'Hard Network', 'exampleURL'),
(3, 'medium678', 'Medium Network', 'examplewithURL'),
(8, 'easy123', 'Easy Network', 'exampleURLLoaction'),
(4, 'none23', 'None Network', 'urlExample');

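【Comments】:

  • You could try using temporary tables. They would make your query look simpler.
  • I think temporary tables would complicate things on a high-traffic database: a high cost just to make the query look simple. I also don't see how I would use a temporary table to get back all the data I need.
  • Of course, that isn't needed in production. I was just trying to understand your use case better.

Tags: mysql join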


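【Solution 1】:

The problem is that you are overwriting column names. I haven't tested it, but this should work. You can use a similar pattern to get out.hits for each campaign ID.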

SELECT
    r3.*, c4.campaign_name c_name_rev_phone
FROM
    (SELECT
            r2.*, c3.campaign_name c_name_rev
    FROM
        (SELECT 
            r1.*, c2.campaign_name c_name_b
        FROM
            (SELECT 
                s1.site_id s_id, 
                s1.site_name sname, 
                s1.campaign_id_a c_id_a, 
                s1.campaign_id_b c_id_b,
                s1.campaign_id_reviews c_id_rev,
                s1.campaign_id_reviews_phone c_id_rev_phone,
                c1.campaign_name c_name_a 
            FROM sites s1 JOIN campaigns c1 ON s1.campaign_id_a = c1.campaign_id ) r1
        JOIN campaigns c2 ON c2.campaign_id=r1.c_id_b ) r2
    JOIN campaigns c3 ON c3.campaign_id=r2.c_id_rev ) r3
JOIN campaigns c4 on c4.campaign_id=r3.c_id_rev_phone;

Edit 1:

With the sample data provided in the question, the following query

SELECT s.*, c.* FROM sites s
INNER JOIN campaigns c ON s.campaign_id_a=c.campaign_id
INNER JOIN campaigns ON s.campaign_id_b=campaigns.campaign_id
WHERE s.site_id NOT IN(100,101);

returns this result:

site_id |site_name      |campaign_id_a  |campaign_id_b  |a_display_name |b_display_name |campaign_id_reviews    |campaign_id_reviews_phone  |campaign_id    |campaign_name  |network        |url
--------|---------------|---------------|---------------|---------------|---------------|-----------------------|---------------------------|---------------|---------------|---------------|------------------
1       |example.com    |1              |8              |hard456        |easy123        |3                      |4                          |1              |Hard456        |Hard Network   |exampleURL
2       |example.org    |1              |8              |hard456        |easy123        |3                      |4                          |1              |Hard456        |Hard Network   |exampleURL
3       |example.net    |8              |8              |easy123        |easy123        |3                      |4                          |8              |easy123        |Easy Network   |exampleURLLoaction

Result of the query in this answer:

s_id    |sname          |c_id_a |c_id_b |c_id_rev   |c_id_rev_phone |c_name_a   |c_name_b   |c_name_rev |c_name_rev_phone
--------|---------------|-------|-------|-----------|---------------|-----------|-----------|-----------|----------------
1       |example.com    |1      |8      |3          |4              |Hard456    |easy123    |medium678  |none23
2       |example.org    |1      |8      |3          |4              |Hard456    |easy123    |medium678  |none23
3       |example.net    |8      |8      |3          |4              |easy123    |easy123    |medium678  |none23

Note how the campaign names for campaign_id_a, campaign_id_b, campaign_id_reviews, and so on are each fetched separately.

The following query should give you the complete answer:

SELECT
r3.*, 
c4.campaign_name c_name_rev_phone,
o4.hits hits_rev_phone
FROM
    (SELECT
            r2.*, 
            c3.campaign_name c_name_rev,
            o3.hits hits_rev
    FROM
        (SELECT 
            r1.*, 
            c2.campaign_name c_name_b, 
            o2.hits hits_b
        FROM
            (SELECT 
                s1.site_id s_id, 
                s1.site_name sname, 
                s1.campaign_id_a c_id_a, 
                s1.campaign_id_b c_id_b,
                s1.campaign_id_reviews c_id_rev,
                s1.campaign_id_reviews_phone c_id_rev_phone,
                c1.campaign_name c_name_a,
                o1.hits hits_a
            FROM sites s1 JOIN campaigns c1 ON s1.campaign_id_a = c1.campaign_id JOIN `out` o1 ON (c1.campaign_id=o1.campaign_id AND o1.site_id=s1.site_id)) r1
        JOIN campaigns c2 ON c2.campaign_id=r1.c_id_b JOIN `out` o2 ON (c2.campaign_id=o2.campaign_id AND o2.site_id=r1.s_id)) r2
    JOIN campaigns c3 ON c3.campaign_id=r2.c_id_rev JOIN `out` o3 ON (c3.campaign_id=o3.campaign_id AND o3.site_id=r2.s_id)) r3
JOIN campaigns c4 on c4.campaign_id=r3.c_id_rev_phone JOIN `out` o4 ON (c4.campaign_id=o4.campaign_id AND o4.site_id=r3.s_id);

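This returns an empty result for the sample data set you provided, because many campaign_id/site_id combinations do not exist in the out table. Since we are doing INNER JOINs, any site with a missing combination is dropped. If you want hits to be reported as 0 when a site_id/campaign_id combination does not exist, you need to use LEFT JOIN.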
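
As a rough illustration of the LEFT JOIN variant mentioned above, here is an untested sketch written as a flat query instead of nested subqueries. The aliases (ca, oa, cb, ob, ...) and the output column names are my own choices, not something from the question. COALESCE turns a missing out row into 0 hits.

```sql
-- Sketch only: one campaigns alias and one `out` alias per campaign slot.
-- LEFT JOIN keeps every site even when a (site_id, campaign_id) pair
-- has no row in `out`; COALESCE then reports its hits as 0.
SELECT
    s.site_id,
    s.site_name,
    s.campaign_id_a,
    ca.campaign_name          AS c_name_a,
    COALESCE(oa.hits, 0)      AS hits_a,
    s.campaign_id_b,
    cb.campaign_name          AS c_name_b,
    COALESCE(ob.hits, 0)      AS hits_b,
    s.campaign_id_reviews,
    cr.campaign_name          AS c_name_rev,
    COALESCE(orv.hits, 0)     AS hits_rev,
    s.campaign_id_reviews_phone,
    cp.campaign_name          AS c_name_rev_phone,
    COALESCE(op.hits, 0)      AS hits_rev_phone
FROM sites s
LEFT JOIN campaigns ca ON ca.campaign_id = s.campaign_id_a
LEFT JOIN `out` oa     ON oa.campaign_id = s.campaign_id_a
                      AND oa.site_id     = s.site_id
LEFT JOIN campaigns cb ON cb.campaign_id = s.campaign_id_b
LEFT JOIN `out` ob     ON ob.campaign_id = s.campaign_id_b
                      AND ob.site_id     = s.site_id
LEFT JOIN campaigns cr ON cr.campaign_id = s.campaign_id_reviews
LEFT JOIN `out` orv    ON orv.campaign_id = s.campaign_id_reviews
                      AND orv.site_id     = s.site_id
LEFT JOIN campaigns cp ON cp.campaign_id = s.campaign_id_reviews_phone
LEFT JOIN `out` op     ON op.campaign_id = s.campaign_id_reviews_phone
                      AND op.site_id     = s.site_id
WHERE s.site_id NOT IN (100, 101)
ORDER BY s.site_id;
```

Because out has a unique key on (site_id, campaign_id), each of these joins matches at most one row, so the result should stay at one row per site.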

If this is not what you are looking for, feel free to get back to me.

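【Comments】:

  • I think the concept is right, but the query you created returns the same result as the query in my question with the "INNER JOIN out o ON s.campaign_id_a=o.campaign_id AND s.site_id=o.site_id" part removed. My main problem is getting back the hits from the out table for each campaign ID in the sites table. I have made countless attempts at adding the out table, but can't seem to figure it out.
  • It will not return the same result. In your case, if you remove the join with out, the output contains only one campaign_name, the one for campaign_id_a in the sites table. If you understand that, you can easily extend the idea to out.hits. It happens because the value of the campaign_name column is taken from the first reference, the one introduced by your first inner join. I'll also update the answer to include out.hits, but it is important that you understand what is wrong with the query in the question.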
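【Solution 2】:

You seem to join campaigns twice, but you only return whatever matches campaign_id_a. campaign_id_b is ignored in the results, and the other 2 campaign IDs are not handled at all.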
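
To make that concrete, here is a minimal sketch (untested against your real data) in which every join to campaigns gets its own alias and the name from each alias is selected explicitly. The aliases ca and cb and the output names c_name_a and c_name_b are just illustrative.

```sql
-- Sketch only: each join to campaigns has its own alias, and the
-- SELECT list names the column from each alias so neither is lost.
SELECT
    s.site_id,
    s.site_name,
    ca.campaign_name AS c_name_a,   -- name for campaign_id_a
    cb.campaign_name AS c_name_b    -- name for campaign_id_b
FROM sites s
INNER JOIN campaigns ca ON ca.campaign_id = s.campaign_id_a
INNER JOIN campaigns cb ON cb.campaign_id = s.campaign_id_b
WHERE s.site_id NOT IN (100, 101)
ORDER BY s.site_id;
```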

Split the query up so that there is one SELECT per campaign ID column, and UNION the results together:

(SELECT s.site_id,
        s.site_name,
        s.campaign_id_a,
        s.campaign_id_b,
        s.a_display_name,
        s.b_display_name,
        s.campaign_id_reviews,
        s.campaign_id_reviews_phone,
        o.out_id,
        o.hits,
        o.date_last_hit, 
        c.campaign_id,
        c.campaign_name,
        c.network,
        c.url 
FROM sites s
INNER JOIN campaigns c ON s.campaign_id_a = c.campaign_id
INNER JOIN `out` o ON c.campaign_id = o.campaign_id AND s.site_id = o.site_id
WHERE s.site_id NOT IN (100,101))
UNION
(SELECT s.site_id,
        s.site_name,
        s.campaign_id_a,
        s.campaign_id_b,
        s.a_display_name,
        s.b_display_name,
        s.campaign_id_reviews,
        s.campaign_id_reviews_phone,
        o.out_id,
        o.hits,
        o.date_last_hit, 
        c.campaign_id,
        c.campaign_name,
        c.network,
        c.url 
FROM sites s
INNER JOIN campaigns c ON s.campaign_id_b = c.campaign_id
INNER JOIN `out` o ON c.campaign_id = o.campaign_id AND s.site_id = o.site_id
WHERE s.site_id NOT IN (100,101))
UNION
(SELECT s.site_id,
        s.site_name,
        s.campaign_id_a,
        s.campaign_id_b,
        s.a_display_name,
        s.b_display_name,
        s.campaign_id_reviews,
        s.campaign_id_reviews_phone,
        o.out_id,
        o.hits,
        o.date_last_hit, 
        c.campaign_id,
        c.campaign_name,
        c.network,
        c.url 
FROM sites s
INNER JOIN campaigns c ON s.campaign_id_reviews = c.campaign_id
INNER JOIN `out` o ON c.campaign_id = o.campaign_id AND s.site_id = o.site_id
WHERE s.site_id NOT IN (100,101))
UNION
(SELECT s.site_id,
        s.site_name,
        s.campaign_id_a,
        s.campaign_id_b,
        s.a_display_name,
        s.b_display_name,
        s.campaign_id_reviews,
        s.campaign_id_reviews_phone,
        o.out_id,
        o.hits,
        o.date_last_hit, 
        c.campaign_id,
        c.campaign_name,
        c.network,
        c.url 
FROM sites s
INNER JOIN campaigns c ON s.campaign_id_reviews_phone = c.campaign_id
INNER JOIN `out` o ON c.campaign_id = o.campaign_id AND s.site_id = o.site_id
WHERE s.site_id NOT IN (100,101))
ORDER BY site_id ASC
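
One thing to watch for: plain UNION removes duplicate rows, so a site whose campaign_id_a and campaign_id_b are the same (site 3 in the sample data) would contribute only one row for both. A more compact variant, offered as an untested sketch, first unpivots the four columns with UNION ALL and tags each row with the column it came from, then joins campaigns and out once. The derived-table alias x and the slot column are names I made up for illustration.

```sql
-- Sketch only: unpivot the four campaign columns of sites into
-- (site_id, slot, campaign_id) rows, then join campaigns and `out` once.
-- UNION ALL keeps every row; the slot label records its source column.
SELECT
    s.site_id,
    s.site_name,
    x.slot,
    c.campaign_id,
    c.campaign_name,
    o.hits,
    o.date_last_hit
FROM sites s
INNER JOIN (
    SELECT site_id, 'a' AS slot, campaign_id_a AS campaign_id FROM sites
    UNION ALL
    SELECT site_id, 'b', campaign_id_b FROM sites
    UNION ALL
    SELECT site_id, 'reviews', campaign_id_reviews FROM sites
    UNION ALL
    SELECT site_id, 'reviews_phone', campaign_id_reviews_phone FROM sites
) x ON x.site_id = s.site_id
INNER JOIN campaigns c ON c.campaign_id = x.campaign_id
INNER JOIN `out` o     ON o.campaign_id = x.campaign_id
                      AND o.site_id     = s.site_id
WHERE s.site_id NOT IN (100, 101)
ORDER BY s.site_id, x.slot;
```

Like the UNION query above, this returns one row per site/campaign pair that has a row in out; switch the last two joins to LEFT JOIN if pairs without hits should also appear.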

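【Comments】:

  • Much as I wanted this to work, I kept getting an error on the ORDER BY, so I took it out. Then I kept getting an error on the inner join for "c.campaign_id_a". I also don't fully understand how these unions work.
  • I fixed a couple of typos, so it should work now. I also listed the columns explicitly in the SELECT statements to avoid duplicate column names.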