【Question title】: What is the correct SQLAlchemy equivalent to this SQLite statement?
【Posted】: 2021-06-01 03:05:11
【Question】:

I am trying to get the latest change for each team.

SQLite statement
Works as expected.

SELECT * FROM (
  SELECT * FROM team_history ORDER BY changed_at DESC
) sub GROUP BY team

SQLAlchemy implementation
For whatever reason, I have to sort with asc() instead of desc() to get the same result, which is why I doubt that my implementation is correct.

session.query(TeamHistory)\
    .select_entity_from(
        session.query(TeamHistory).order_by(asc(TeamHistory.changed_at)).subquery()
    ).group_by(TeamHistory.team)\
    .all()

Environment

Python: 3.8.0
SQLAlchemy: 1.3.23

Reproduction

Schema:

CREATE TABLE "team_history" (ID integer PRIMARY KEY, changed_at TEXT, team TEXT);

Records:

[{"ID":1,"changed_at":"2021-03-02 10:00:00","team":"B"},
 {"ID":2,"changed_at":"2021-03-02 10:00:00","team":"A"},
 {"ID":3,"changed_at":"2021-03-02 10:30:00","team":"B"},
 {"ID":4,"changed_at":"2021-03-02 10:00:00","team":"A"},
 {"ID":5,"changed_at":"2021-03-02 11:30:00","team":"B"},
 {"ID":6,"changed_at":"2021-03-02 10:00:00","team":"A"},
 {"ID":7,"changed_at":"2021-03-02 11:00:00","team":"B"},
 {"ID":8,"changed_at":"2021-03-02 10:00:00","team":"A"},
 {"ID":9,"changed_at":"2021-03-02 10:00:00","team":"B"},
 {"ID":10,"changed_at":"2021-03-02 10:00:00","team":"A"},
 {"ID":11,"changed_at":"2021-03-02 10:00:00","team":"B"},
 {"ID":12,"changed_at":"2021-03-02 10:00:00","team":"A"},
 {"ID":13,"changed_at":"2021-03-02 10:00:00","team":"B"},
 {"ID":14,"changed_at":"2021-03-02 12:30:00","team":"A"},
 {"ID":15,"changed_at":"2021-03-02 10:00:00","team":"B"},
 {"ID":16,"changed_at":"2021-03-02 12:00:00","team":"A"},
 {"ID":17,"changed_at":"2021-03-02 10:00:00","team":"B"},
 {"ID":18,"changed_at":"2021-03-02 13:30:00","team":"A"},
 {"ID":19,"changed_at":"2021-03-02 10:00:00","team":"B"},
 {"ID":20,"changed_at":"2021-03-02 10:00:00","team":"A"}]

Solution
Thanks, everyone!

session.query(TeamHistory)\
    .group_by(TeamHistory.team)\
    .having(func.max(TeamHistory.changed_at))\
    .all()

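【Comments on the question】:

  • Thanks, @rfkortekaas. If the code above is correct, then it must be my setup. Do you have any idea what it could be? A misconfiguration, perhaps?
  • You can also pass echo=True to the create_engine() call to see the SQL statements being emitted.
  • @rfkortekaas Thank you very much for your help! I noticed one difference: in my model I am using changed_at = Column(DateTime). I will try your example as soon as possible.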
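  • @rfkortekaas I ran your example and got IDs 20 and 19. After replacing desc with asc, I got 18 and 5. Are we running the same code on the same data and getting different results?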

Tags: python sql sqlite sqlalchemy sql-subselect


【Solution 1】:

When you use:

SELECT *
FROM tablename
GROUP BY somecolumn

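SQLite returns 1 row for each distinct value of somecolumn, but which row?
The documentation states that the row is undefined, which means it is chosen arbitrarily, although in my experience it seems that the first row of each group in the result set is returned.
This is not guaranteed, though, so queries like the one above, and like yours, should be avoided.

There are several ways to get the row with the latest changed_at for each team.
One that works in SQLite (although it does not work in other databases) is: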

SELECT * FROM team_history GROUP BY team HAVING MAX(changed_at)

See the demo.
So this is the query that you should translate to SQLAlchemy (I can't help you with that part).
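A minimal sketch of the behaviour this relies on, using only Python's standard sqlite3 module rather than SQLAlchemy. SQLite documents that when an aggregate query contains a single MAX() (or MIN()), the bare columns take their values from the row that holds that maximum. The sketch puts MAX() in the select list, which is the form the SQLite documentation describes; the six-row subset of the question's records and the variable names are chosen here purely for illustration.

```python
import sqlite3

# A subset of the question's sample records: (ID, changed_at, team).
ROWS = [
    (1, "2021-03-02 10:00:00", "B"),
    (5, "2021-03-02 11:30:00", "B"),
    (7, "2021-03-02 11:00:00", "B"),
    (14, "2021-03-02 12:30:00", "A"),
    (18, "2021-03-02 13:30:00", "A"),
    (20, "2021-03-02 10:00:00", "A"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE team_history (ID integer PRIMARY KEY, changed_at TEXT, team TEXT)"
)
conn.executemany("INSERT INTO team_history VALUES (?, ?, ?)", ROWS)

# With a single MAX() in the query, SQLite fills the bare columns (ID, team)
# from the row that contains the maximum changed_at of each group.
latest = conn.execute(
    "SELECT ID, team, MAX(changed_at) FROM team_history "
    "GROUP BY team ORDER BY team"
).fetchall()

for row in latest:
    print(row)
```

This is SQLite-specific behaviour: most other databases reject bare columns that are neither grouped nor aggregated.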

There are other ways too, using window functions or EXISTS.
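The two alternatives could look roughly like the sketch below, again with the standard sqlite3 module and a subset of the question's records. The table aliases (t, newer, ranked) and the column name rn are illustrative, and the window-function variant assumes the bundled SQLite is version 3.25 or newer.

```python
import sqlite3

# A subset of the question's sample records: (ID, changed_at, team).
ROWS = [
    (1, "2021-03-02 10:00:00", "B"),
    (5, "2021-03-02 11:30:00", "B"),
    (7, "2021-03-02 11:00:00", "B"),
    (14, "2021-03-02 12:30:00", "A"),
    (18, "2021-03-02 13:30:00", "A"),
    (20, "2021-03-02 10:00:00", "A"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE team_history (ID integer PRIMARY KEY, changed_at TEXT, team TEXT)"
)
conn.executemany("INSERT INTO team_history VALUES (?, ?, ?)", ROWS)

# NOT EXISTS: keep a row only if the same team has no row with a later timestamp.
NOT_EXISTS_SQL = """
SELECT ID, team, changed_at
FROM team_history AS t
WHERE NOT EXISTS (
    SELECT 1
    FROM team_history AS newer
    WHERE newer.team = t.team AND newer.changed_at > t.changed_at
)
ORDER BY team
"""

# Window function: number each team's rows from newest to oldest, keep number 1.
WINDOW_SQL = """
SELECT ID, team, changed_at
FROM (
    SELECT *, ROW_NUMBER() OVER (
        PARTITION BY team ORDER BY changed_at DESC
    ) AS rn
    FROM team_history
) AS ranked
WHERE rn = 1
ORDER BY team
"""

via_not_exists = conn.execute(NOT_EXISTS_SQL).fetchall()
via_window = conn.execute(WINDOW_SQL).fetchall()

print(via_not_exists)
print(via_window)
```

The two differ when several rows of a team share the latest timestamp: NOT EXISTS returns all of them, while ROW_NUMBER() keeps exactly one, chosen arbitrarily unless the ORDER BY gets a tie-breaker such as ID.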

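【Discussion】:

  • Thanks for the explanation and the solution. The following SQLAlchemy implementation works: session.query(TeamHistory).group_by(TeamHistory.team).having(func.max(TeamHistory.changed_at)).all()
【Solution 2】:

Basically, the query you proposed should work. SQLite stores dates as ISO-8601 formatted strings, a format in which lexicographic order and chronological order are the same. Declaring the datetime column as TEXT has the same property.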
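A quick way to convince yourself of the ordering property, as a sketch in plain Python: sort a few timestamps once as text and once as parsed datetime values, then compare. The sample values are made up for illustration, and the property only holds while every value uses the same fixed-width, zero-padded format.

```python
from datetime import datetime

# Timestamps in the same "YYYY-MM-DD HH:MM:SS" layout used in the question.
stamps = [
    "2021-03-02 13:30:00",
    "2021-03-02 10:00:00",
    "2021-12-01 00:00:00",
    "2021-03-01 23:59:59",
    "2021-03-02 11:30:00",
]

# Sort once as plain strings and once by the parsed datetime value.
by_text = sorted(stamps)
by_time = sorted(stamps, key=lambda s: datetime.strptime(s, "%Y-%m-%d %H:%M:%S"))

# Both orderings agree, so MAX() and ORDER BY on the text column are chronological.
print(by_text == by_time)
```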

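So sorting with desc in the subquery should produce the following result:

ID  DateTime             Team
18  2021-03-02 13:30:00  A
14  2021-03-02 12:30:00  A
16  2021-03-02 12:00:00  A
5   2021-03-02 11:30:00  B
..  ..                   ..

As stated in the other answer, the problem with GROUP BY is that the row returned for each group is chosen arbitrarily, although it appears to always be the first row. Knowing this, there are different ways to determine which rows it should return: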

# Using an aggregate function in select
session.query(TeamHistory.team, func.max(TeamHistory.changed_at)).group_by(TeamHistory.team).all()

# Using an aggregate function with `having`
session.query(TeamHistory).group_by(TeamHistory.team).having(func.max(TeamHistory.changed_at)).all()

This gives the following working example:

from sqlalchemy import Column, Integer, create_engine, Text, DateTime, func
from sqlalchemy.orm import sessionmaker, deferred, column_property
from sqlalchemy.ext.declarative import declarative_base
import datetime

Base = declarative_base()

class TeamHistory(Base):
    __tablename__ = 'team_history'
    id = Column(Integer, primary_key=True)
    changed_at = Column(DateTime)
    team = Column(Text)

if __name__ == '__main__':
    engine = create_engine('sqlite://')
    Base.metadata.create_all(engine)
    Session = sessionmaker(engine)

    db = Session()
    
    lst = [{"ID":1,"changed_at":"2021-03-02 10:00:00","team":"B"},{"ID":2,"changed_at":"2021-03-02 10:00:00","team":"A"},{"ID":3,"changed_at":"2021-03-02 10:30:00","team":"B"},{"ID":4,"changed_at":"2021-03-02 10:00:00","team":"A"},{"ID":5,"changed_at":"2021-03-02 11:30:00","team":"B"},{"ID":6,"changed_at":"2021-03-02 10:00:00","team":"A"},{"ID":7,"changed_at":"2021-03-02 11:00:00","team":"B"},{"ID":8,"changed_at":"2021-03-02 10:00:00","team":"A"},{"ID":9,"changed_at":"2021-03-02 10:00:00","team":"B"},{"ID":10,"changed_at":"2021-03-02 10:00:00","team":"A"},{"ID":11,"changed_at":"2021-03-02 10:00:00","team":"B"},{"ID":12,"changed_at":"2021-03-02 10:00:00","team":"A"},{"ID":13,"changed_at":"2021-03-02 10:00:00","team":"B"},{"ID":14,"changed_at":"2021-03-02 12:30:00","team":"A"},{"ID":15,"changed_at":"2021-03-02 10:00:00","team":"B"},{"ID":16,"changed_at":"2021-03-02 12:00:00","team":"A"},{"ID":17,"changed_at":"2021-03-02 10:00:00","team":"B"},{"ID":18,"changed_at":"2021-03-02 13:30:00","team":"A"},{"ID":19,"changed_at":"2021-03-02 10:00:00","team":"B"},{"ID":20,"changed_at":"2021-03-02 10:00:00","team":"A"}]
    for dct in lst:
        dt = datetime.datetime.strptime(dct.get('changed_at'), '%Y-%m-%d %H:%M:%S')
        nth = TeamHistory(id=dct.get('ID'), changed_at=dt, team=dct.get('team'))
        db.add(nth)

    db.commit()

    res = db.query(TeamHistory)\
        .select_entity_from(
            db.query(TeamHistory).order_by(TeamHistory.changed_at.desc()).subquery()
        ).group_by(TeamHistory.team)\
        .all()

    for r in res:
        print(r.id, r.changed_at, r.team)
    print()


    res = db.query(TeamHistory)\
        .group_by(TeamHistory.team)\
        .having(func.max(TeamHistory.changed_at))\
        .all()

    for r in res:
        print(r.id, r.changed_at, r.team)
    print()

    res = db.query(TeamHistory.id, func.max(TeamHistory.changed_at), TeamHistory.team)\
        .group_by(TeamHistory.team)\
        .all()

    for r in res:
        print(r[0], r[1], r[2])

If you are mainly interested in the last change of each team, you can also use deferred and column_property to get only MAX(changed_at):

from sqlalchemy import Column, Integer, create_engine, Text, DateTime, func, select
from sqlalchemy.orm import sessionmaker, deferred, column_property
from sqlalchemy.ext.declarative import declarative_base
import datetime

Base = declarative_base()

class TeamHistory(Base):
    __tablename__ = 'team_history_def'
    id = Column(Integer, primary_key=True)
    changed_at = deferred(Column(DateTime))
    last_changed = column_property(func.max(changed_at))
    team = Column(Text)

if __name__ == '__main__':
    engine = create_engine('sqlite://')
    Base.metadata.create_all(engine)
    Session = sessionmaker(engine)

    db = Session()
    
    lst = [{"ID":1,"changed_at":"2021-03-02 10:00:00","team":"B"},{"ID":2,"changed_at":"2021-03-02 10:00:00","team":"A"},{"ID":3,"changed_at":"2021-03-02 10:30:00","team":"B"},{"ID":4,"changed_at":"2021-03-02 10:00:00","team":"A"},{"ID":5,"changed_at":"2021-03-02 11:30:00","team":"B"},{"ID":6,"changed_at":"2021-03-02 10:00:00","team":"A"},{"ID":7,"changed_at":"2021-03-02 11:00:00","team":"B"},{"ID":8,"changed_at":"2021-03-02 10:00:00","team":"A"},{"ID":9,"changed_at":"2021-03-02 10:00:00","team":"B"},{"ID":10,"changed_at":"2021-03-02 10:00:00","team":"A"},{"ID":11,"changed_at":"2021-03-02 10:00:00","team":"B"},{"ID":12,"changed_at":"2021-03-02 10:00:00","team":"A"},{"ID":13,"changed_at":"2021-03-02 10:00:00","team":"B"},{"ID":14,"changed_at":"2021-03-02 12:30:00","team":"A"},{"ID":15,"changed_at":"2021-03-02 10:00:00","team":"B"},{"ID":16,"changed_at":"2021-03-02 12:00:00","team":"A"},{"ID":17,"changed_at":"2021-03-02 10:00:00","team":"B"},{"ID":18,"changed_at":"2021-03-02 13:30:00","team":"A"},{"ID":19,"changed_at":"2021-03-02 10:00:00","team":"B"},{"ID":20,"changed_at":"2021-03-02 10:00:00","team":"A"}]
    for dct in lst:
        dt = datetime.datetime.strptime(dct.get('changed_at'), '%Y-%m-%d %H:%M:%S')
        nth = TeamHistory(id=dct.get('ID'), changed_at=dt, team=dct.get('team'))
        db.add(nth)

    db.commit()

    res = db.query(TeamHistory)\
        .group_by(TeamHistory.team)\
        .all()

    for r in res:
        print(r.id, r.changed_at, r.team)
    print()

【Discussion】:
