【Question title】: Writing a SQL query referring to the same table twice
【Posted】: 2013-12-20 23:40:34
【Question】:

Consider the following table definitions:

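# SQLAlchemy 1.4+; older versions import declarative_base from sqlalchemy.ext.declarative
from sqlalchemy import Column, ForeignKey, Integer, String, UniqueConstraint
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()

class MCastSession(Base):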
  __tablename__ = 'mcast_session'
  id = Column(Integer, primary_key=True)
  ip = Column(Integer)
  port = Column(Integer)
  __table_args__ = ( UniqueConstraint('ip', 'port'), )

class Topic(Base):
  __tablename__ = 'topic'
  id = Column(Integer, primary_key=True)
  name = Column(String, unique=True)
  mcast_session_id = Column(Integer, ForeignKey('mcast_session.id'))
  mcast_session = relationship('MCastSession')

class Host(Base):
  __tablename__ = 'host'
  id = Column(Integer, primary_key=True)
  name = Column(String, unique=True)

class Subscriber(Base):
  __tablename__ = 'subscriber'
  id = Column(Integer, primary_key=True)
  topic_id = Column(Integer, ForeignKey('topic.id'))
  topic = relationship('Topic')
  host_id = Column(Integer, ForeignKey('host.id'))
  host = relationship('Host')
  __table_args__ = ( UniqueConstraint('topic_id', 'host_id'), )

Example data:
Topic Session
T1    IP1:port1
T2    IP1:port2
T3    IP1:port2
T4    IP2:port1

Topic Host
T1    H1
T2    H1
T4    H2

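I want to write a query that returns every host that subscribes to a multicast IP but does not handle all of the topics on that IP. In the example above, H1 has T1 and is therefore subscribed to IP1, but it does not have T3, which is also on IP1, so the query should return H1. H2 handles all of the topics (T4) on the IP it subscribes to (IP2, via T4), so H2 should not appear in the result. How do I write this query?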

【Question comments】:

  • If someone can help with a raw SQL query, that would help too.
  • Referring to the same table twice: select first_reference.* from my_tab as first_reference join my_tab as second_reference on first_reference.ID = second_reference.ID + 1

Tags: python sqlite database-design orm sqlalchemy


【Solution 1】:

The query below achieves the goal:

select distinct host1.name host_name
  from Subscriber Subscriber1
 inner join host host1
    on host1.id = Subscriber1.Host_Id
 inner join topic topic1
    on topic1.id = Subscriber1.Topic_Id
 inner join mcast_session mcast_session1
    on mcast_session1.id = topic1.mcast_session_id
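 where (select count(*) -- topics this host subscribes to on the same ip
          from Subscriber
         inner join topic
            on topic.id = Subscriber.Topic_Id
         inner join mcast_session
            on mcast_session.id = topic.mcast_session_id
         where Subscriber.Host_Id = Subscriber1.Host_Id
           and mcast_session.ip = mcast_session1.ip) !=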
       (select count(*)
          from topic
         inner join mcast_session
            on topic.mcast_session_id = mcast_session.id
         where mcast_session.ip = mcast_session1.ip)

To explain the logic, this query, which shows both counts for every subscription row, may help:

select  host1.name host_name,
       topic1.name topic_name,
       mcast_session1.ip,
       mcast_session1.port,
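       (select count(*)
          from Subscriber
         inner join topic
            on topic.id = Subscriber.Topic_Id
         inner join mcast_session
            on mcast_session.id = topic.mcast_session_id
         where Subscriber.Host_Id = Subscriber1.Host_Id
           and mcast_session.ip = mcast_session1.ip) host_to_topic_registration,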
       (select count(*)
          from topic
         inner join mcast_session
            on topic.mcast_session_id = mcast_session.id
         where mcast_session.ip = mcast_session1.ip
           ) ip_topic_count
  from Subscriber Subscriber1
 inner join host host1
    on host1.id = Subscriber1.Host_Id
 inner join topic topic1
    on topic1.id = Subscriber1.Topic_Id
 inner join mcast_session mcast_session1
    on mcast_session1.id = topic1.mcast_session_id

sqlfiddle sample
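
As an alternative to comparing counts, the same question can be phrased as an anti-join: a host qualifies if some topic on one of its subscribed IPs has no matching subscriber row for that host. The following is only a sketch using Python's sqlite3 module on the sample data from the question. It assumes IP1/IP2 and port1/port2 are stored as the integers 1 and 2, and the aliases (s_other, t_other, sub2) are illustrative names, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mcast_session (id INTEGER PRIMARY KEY, ip INTEGER, port INTEGER,
                            UNIQUE (ip, port));
CREATE TABLE topic (id INTEGER PRIMARY KEY, name TEXT UNIQUE,
                    mcast_session_id INTEGER REFERENCES mcast_session(id));
CREATE TABLE host (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE subscriber (id INTEGER PRIMARY KEY,
                         topic_id INTEGER REFERENCES topic(id),
                         host_id INTEGER REFERENCES host(id),
                         UNIQUE (topic_id, host_id));
-- Sample data from the question (assumed encoding: IP1=1, IP2=2, port1=1, port2=2).
INSERT INTO mcast_session VALUES (1, 1, 1), (2, 1, 2), (3, 2, 1);
INSERT INTO topic VALUES (1, 'T1', 1), (2, 'T2', 2), (3, 'T3', 2), (4, 'T4', 3);
INSERT INTO host VALUES (1, 'H1'), (2, 'H2');
INSERT INTO subscriber (id, topic_id, host_id) VALUES (1, 1, 1), (2, 2, 1), (3, 4, 2);
""")

rows = conn.execute("""
SELECT DISTINCT h.name
  FROM subscriber sub
  JOIN host h ON h.id = sub.host_id
  JOIN topic t ON t.id = sub.topic_id
  JOIN mcast_session s ON s.id = t.mcast_session_id
  -- second reference to mcast_session: every session sharing the subscribed IP
  JOIN mcast_session s_other ON s_other.ip = s.ip
  -- second reference to topic: every topic broadcast on that IP
  JOIN topic t_other ON t_other.mcast_session_id = s_other.id
 WHERE NOT EXISTS (SELECT 1
                     FROM subscriber sub2
                    WHERE sub2.host_id = sub.host_id
                      AND sub2.topic_id = t_other.id)
 ORDER BY h.name
""").fetchall()

incomplete_hosts = [name for (name,) in rows]
print(incomplete_hosts)  # ['H1']
```

Here mcast_session, topic and subscriber each appear twice under different aliases, which is the "same table twice" pattern the question title asks about.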

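    【Solution 2】:

    Another way to look at the desired result is to apply the following logic:

    1. For each IP address, count the total number of Topics broadcast on it.
    2. For each Host, count the number of Topics it subscribes to, grouped by IP address.
    3. The desired Hosts are those whose number of Topics for an IP address is not equal to (in fact, less than) the total number of Topics for that IP address.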
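
The three steps above can be sketched in plain Python on the sample data from the question before turning them into a query. The names topic_ip and subscriptions are illustrative containers, not part of the schema.

```python
from collections import Counter

# Sample data from the question (container names are illustrative).
topic_ip = {"T1": "IP1", "T2": "IP1", "T3": "IP1", "T4": "IP2"}
subscriptions = [("T1", "H1"), ("T2", "H1"), ("T4", "H2")]

# Step 1: total number of topics broadcast on each IP.
topics_per_ip = Counter(topic_ip.values())

# Step 2: number of subscribed topics per (host, IP) pair.
topics_per_host_ip = Counter(
    (host, topic_ip[topic]) for topic, host in subscriptions
)

# Step 3: keep hosts whose count for some IP differs from that IP's total.
desired_hosts = sorted({
    host
    for (host, ip), count in topics_per_host_ip.items()
    if count != topics_per_ip[ip]
})
print(desired_hosts)  # ['H1']
```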

    The following SA (SQLAlchemy) code should give you the desired Host instances:

    from sqlalchemy import and_, func

    # subquery to get number of topics per IP
    subq_ip_topics = (session.query(
            MCastSession.ip.label("mcast_session_ip"),
            func.count(Topic.id).label("num_topics")
        )
        .join(Topic)
        .group_by(MCastSession.ip)
        ).subquery().alias("ip_topics")
    
    # subquery to get number of topics per host per ip
    subq_host_ip_topics = (session.query(
            Host.id.label("host_id"),
            MCastSession.ip.label("mcast_session_ip"),
            func.count(Topic.id).label("num_topics")
        )
        .join(Subscriber)
        .join(Topic)
        .join(MCastSession)
        .group_by(Host.id, MCastSession.ip)
        ).subquery().alias("host_ip_topics")
    
    # final query: get those Hosts where results on both sub-queries do not match
    query = (session.query(Host)
            .join(subq_host_ip_topics, Host.id == subq_host_ip_topics.c.host_id)
            .join(subq_ip_topics, and_(
                subq_host_ip_topics.c.mcast_session_ip == subq_ip_topics.c.mcast_session_ip,
                subq_host_ip_topics.c.num_topics != subq_ip_topics.c.num_topics
                ))
            )
    

    This produces the following SQL (for SQLite):

    SELECT  host.id AS host_id, host.name AS host_name
    
    FROM    host
    
    JOIN   (SELECT  host.id AS host_id,
                    mcast_session.ip AS mcast_session_ip,
                    count(topic.id) AS num_topics
            FROM    host
            JOIN    subscriber
                ON  host.id = subscriber.host_id
            JOIN    topic
                ON  topic.id = subscriber.topic_id
            JOIN    mcast_session
                ON  mcast_session.id = topic.mcast_session_id
            GROUP BY host.id, mcast_session.ip
           ) AS host_ip_topics
        ON  host.id = host_ip_topics.host_id
    
    JOIN   (SELECT  mcast_session.ip AS mcast_session_ip,
                    count(topic.id) AS num_topics
            FROM    mcast_session
            JOIN    topic
                ON  mcast_session.id = topic.mcast_session_id
            GROUP BY mcast_session.ip
           ) AS ip_topics
        ON  host_ip_topics.mcast_session_ip = ip_topics.mcast_session_ip
        AND host_ip_topics.num_topics != ip_topics.num_topics
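
To check this SQL against the sample data from the question, it can be run directly with Python's sqlite3 module. This is a minimal sketch: the tables are created without constraints, and IP1/IP2 and port1/port2 are assumed to be stored as the integers 1 and 2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mcast_session (id INTEGER PRIMARY KEY, ip INTEGER, port INTEGER);
CREATE TABLE topic (id INTEGER PRIMARY KEY, name TEXT, mcast_session_id INTEGER);
CREATE TABLE host (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE subscriber (id INTEGER PRIMARY KEY, topic_id INTEGER, host_id INTEGER);
-- Sample data from the question (assumed encoding: IP1=1, IP2=2, port1=1, port2=2).
INSERT INTO mcast_session VALUES (1, 1, 1), (2, 1, 2), (3, 2, 1);
INSERT INTO topic VALUES (1, 'T1', 1), (2, 'T2', 2), (3, 'T3', 2), (4, 'T4', 3);
INSERT INTO host VALUES (1, 'H1'), (2, 'H2');
INSERT INTO subscriber (id, topic_id, host_id) VALUES (1, 1, 1), (2, 2, 1), (3, 4, 2);
""")

hosts = conn.execute("""
SELECT host.id AS host_id, host.name AS host_name
FROM host
JOIN (SELECT host.id AS host_id,
             mcast_session.ip AS mcast_session_ip,
             count(topic.id) AS num_topics
      FROM host
      JOIN subscriber ON host.id = subscriber.host_id
      JOIN topic ON topic.id = subscriber.topic_id
      JOIN mcast_session ON mcast_session.id = topic.mcast_session_id
      GROUP BY host.id, mcast_session.ip
     ) AS host_ip_topics
  ON host.id = host_ip_topics.host_id
JOIN (SELECT mcast_session.ip AS mcast_session_ip,
             count(topic.id) AS num_topics
      FROM mcast_session
      JOIN topic ON mcast_session.id = topic.mcast_session_id
      GROUP BY mcast_session.ip
     ) AS ip_topics
  ON host_ip_topics.mcast_session_ip = ip_topics.mcast_session_ip
 AND host_ip_topics.num_topics != ip_topics.num_topics
""").fetchall()

print(hosts)  # [(1, 'H1')]
```

H1 subscribes to 2 of the 3 topics on IP1, so its row survives the join; H2 subscribes to the only topic on IP2, so the counts match and it is filtered out.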
    

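    Now, if you want to use the same table more than once in a query, you can use aliased. The code below returns a list of tuples (MCastSession, NNN), where NNN is the number of MCastSession objects that have the same IP: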

    from sqlalchemy.orm import aliased

    # second reference to the same table under a different name
    aliased_MCastSession = aliased(MCastSession, name="MCastSession2")
    qry = (session.query(
            MCastSession,
            func.count(aliased_MCastSession.id).label("number_with_same_ip")
        )
        .filter(MCastSession.ip == aliased_MCastSession.ip)
        .group_by(MCastSession)
        )
    

    But I did not need to do this for the proposed solution, because I used subqueries there instead.
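
In raw SQL, the aliased idea corresponds to a self-join in which the same table appears under two names. The sketch below is hand-written SQL run through Python's sqlite3 module, not the exact statement SQLAlchemy would emit; the aliases m1 and m2 and the integer encoding of IPs and ports are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mcast_session (id INTEGER PRIMARY KEY, ip INTEGER, port INTEGER);
-- Sessions from the question: IP1:port1, IP1:port2, IP2:port1 (encoded as integers).
INSERT INTO mcast_session VALUES (1, 1, 1), (2, 1, 2), (3, 2, 1);
""")

# m1 is the session being reported; m2 ranges over all sessions with the same ip.
same_ip_counts = conn.execute("""
SELECT m1.id, m1.ip, m1.port, count(m2.id) AS number_with_same_ip
  FROM mcast_session AS m1
  JOIN mcast_session AS m2 ON m1.ip = m2.ip
 GROUP BY m1.id, m1.ip, m1.port
 ORDER BY m1.id
""").fetchall()

print(same_ip_counts)  # [(1, 1, 1, 2), (2, 1, 2, 2), (3, 2, 1, 1)]
```

Each session is counted against itself as well, so a session whose IP is unique gets a count of 1.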
