Querying two tables using SQLAlchemy and PostgreSQL (answers)

【Question title】: Querying two tables using SQLAlchemy and PostgreSQL
【Posted】: 2021-04-10 10:37:54
【Question description】:

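I need help improving my SQLAlchemy query. I am using Python 3.7, SQLAlchemy 1.3.15, and PostgreSQL 9.4.3 as the database. I am trying to return the count of appointments for a specific date and timeslot. However, my appointments and appointment slots tables are separate, and I have to query both models/tables to get the desired result. Here is what I have:

Appointment model

The appointments table has several columns, including a foreign key to the appointment slots table.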

class Appointment(ResourceMixin, db.Model): 
    __tablename__ = 'appointments'          

    id = db.Column(db.Integer, primary_key=True)
    user_id = db.Column(db.Integer, db.ForeignKey('users.id', onupdate='CASCADE', ondelete='CASCADE'), index=True, nullable=True)
    slot_id = db.Column(db.Integer, db.ForeignKey('appointment_slots.id', onupdate='CASCADE', ondelete='CASCADE'), index=True, nullable=False)
    appointment_date = db.Column(db.DateTime(), nullable=False)
    appointment_type = db.Column(db.String(128), nullable=False, default='general')

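Appointment slots table

The appointment slots table holds the timeslots for appointments. The model includes a relationship back to the appointments table.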

class AppointmentSlot(ResourceMixin, db.Model):
    __tablename__ = 'appointment_slots'
    id = db.Column(db.Integer, primary_key=True)
    # Relationships.
    appointments = db.relationship('Appointment', uselist=False,
                                   backref='appointments', lazy='joined', passive_deletes=True)
    start_time = db.Column(db.String(5), nullable=False, server_default='08:00')
    end_time = db.Column(db.String(5), nullable=False, server_default='17:00')

SQLAlchemy query

Currently I am running the following SQLAlchemy query to get the count of appointments for a specific date and timeslot:

appointment_count = db.session.query(func.count(Appointment.id)).join(AppointmentSlot)\
        .filter(and_(Appointment.appointment_date == date, AppointmentSlot.id == Appointment.id,
                     AppointmentSlot.start_time == time)).scalar()

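The query above returns the correct result, a single-digit value, but I am worried that it is not optimized. It currently returns in 380ms, yet there are only 8 records in the appointments and appointment_slots tables. These tables will eventually hold hundreds, if not thousands, of records. I am concerned that, although the query works now, it will struggle as the number of records grows.

How can I improve or optimize this query for better performance? I have been looking at SQLAlchemy subqueries that use the appointments relationship on the appointment_slots table, but I could not get one working to confirm the performance. I think there must be a better way to run this query, especially by using the appointments relationship on the appointment_slots table, but I am currently stumped. Any suggestions?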

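【Question discussion】:

If you enable debug logging, or simply print(...) the query object before calling scalar(), it will show you the query being issued. You can then check whether it is executing an optimized query, check which columns are or are not indexed, and fix that. Where are you measuring the time: on the database itself, or on another machine that connects to the database and runs the Python script?

Tags: python postgresql sqlalchemy

【Solution 1】: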
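
As a rough illustration of the comment above, the sketch below prints the SQL that a query would emit before scalar() is called. It assumes SQLAlchemy 1.4 or later (for the sqlalchemy.orm.declarative_base import), uses plain SQLAlchemy instead of the Flask-SQLAlchemy db object, trims the models to the columns the query touches, and uses an in-memory SQLite engine as a stand-in for PostgreSQL. None of that setup comes from the original post.

```python
from datetime import datetime

from sqlalchemy import Column, DateTime, ForeignKey, Integer, String, create_engine, func
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class AppointmentSlot(Base):
    # Trimmed-down stand-in for the model in the question.
    __tablename__ = "appointment_slots"
    id = Column(Integer, primary_key=True)
    start_time = Column(String(5), nullable=False)


class Appointment(Base):
    # Trimmed-down stand-in for the model in the question.
    __tablename__ = "appointments"
    id = Column(Integer, primary_key=True)
    slot_id = Column(Integer, ForeignKey("appointment_slots.id"), index=True, nullable=False)
    appointment_date = Column(DateTime, nullable=False)


# Assumption: SQLite in memory instead of PostgreSQL.
# Passing echo=True here would log every statement the engine executes.
engine = create_engine("sqlite://", echo=False)
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# Build the query but do not call scalar() yet.
query = (
    session.query(func.count(Appointment.id))
    .join(AppointmentSlot, AppointmentSlot.id == Appointment.slot_id)
    .filter(
        Appointment.appointment_date == datetime(2021, 4, 10),
        AppointmentSlot.start_time == "08:00",
    )
)

# str() on a Query compiles it to SQL with bind-parameter placeholders.
sql_text = str(query)
print(sql_text)
```

The printed statement can then be pasted into the database's own tooling to see which indexes it uses.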

I was wrong about the query's load time. The 380ms I was looking at was actually the page load. I also changed some fields on the models, removing slot_id from the appointments model and adding an appointment_id foreign key to the appointment_slots model. The page load for the following query:

appointment_count = db.session.query(func.count(Appointment.id)).join(AppointmentSlot)\
        .filter(and_(Appointment.appointment_date == date,
                     AppointmentSlot.appointment_id == Appointment.id,
                     AppointmentSlot.start_time == time)).scalar()

ended up being 0.4637ms.
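
Because the 380ms figure turned out to be page load rather than query time, it can help to time the statements on their own. The sketch below is one possible way to do that with SQLAlchemy's before_cursor_execute and after_cursor_execute engine events. The in-memory SQLite engine, the timings list, and the SELECT 1 statement are placeholders for illustration, not part of the original answer.

```python
import time

from sqlalchemy import create_engine, event, text

# Assumption: SQLite in memory as a stand-in for the PostgreSQL engine.
engine = create_engine("sqlite://")

# Collected (statement, elapsed milliseconds) pairs; a hypothetical helper list.
timings = []


@event.listens_for(engine, "before_cursor_execute")
def _start_timer(conn, cursor, statement, parameters, context, executemany):
    # Remember when this statement was handed to the DBAPI cursor.
    conn.info.setdefault("query_start", []).append(time.perf_counter())


@event.listens_for(engine, "after_cursor_execute")
def _stop_timer(conn, cursor, statement, parameters, context, executemany):
    # Record how long the cursor execution took, in milliseconds.
    elapsed_ms = (time.perf_counter() - conn.info["query_start"].pop()) * 1000
    timings.append((statement, elapsed_ms))


with engine.connect() as conn:
    conn.execute(text("SELECT 1"))

for statement, elapsed_ms in timings:
    print(f"{elapsed_ms:.4f} ms  {statement}")
```

This measures only the time spent executing each statement on the cursor, so it excludes template rendering and other page-load work.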

However, I still tried to improve the query, and was able to do so with a SQLAlchemy subquery. The following subquery:

subquery = db.session.query(Appointment.id).filter(Appointment.appointment_date == date).subquery()
query = db.session.query(func.count(AppointmentSlot.id))\
        .filter(and_(AppointmentSlot.appointment_id.in_(subquery),
                     AppointmentSlot.start_time == time)).scalar()

returns a load time of 0.3700ms, about 20% lower than the 0.4637ms of the join query.
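
To check that the join form and the IN-subquery form count the same rows, a minimal sketch along the following lines could be used. It assumes SQLAlchemy 1.4 or later, plain SQLAlchemy rather than Flask-SQLAlchemy, an in-memory SQLite database, trimmed-down versions of the revised models (appointment_id on appointment_slots), and invented sample rows. All of these are illustrative assumptions rather than the author's actual setup.

```python
from datetime import datetime

from sqlalchemy import Column, DateTime, ForeignKey, Integer, String, create_engine, func
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Appointment(Base):
    __tablename__ = "appointments"
    id = Column(Integer, primary_key=True)
    appointment_date = Column(DateTime, nullable=False)


class AppointmentSlot(Base):
    # Revised layout from the answer: the slot points at the appointment.
    __tablename__ = "appointment_slots"
    id = Column(Integer, primary_key=True)
    appointment_id = Column(Integer, ForeignKey("appointments.id"), index=True)
    start_time = Column(String(5), nullable=False)


engine = create_engine("sqlite://")  # assumption: SQLite stand-in for PostgreSQL
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# Invented sample data: two appointments on the target day, one on another day.
day = datetime(2021, 4, 10)
session.add_all([
    Appointment(id=1, appointment_date=day),
    Appointment(id=2, appointment_date=day),
    Appointment(id=3, appointment_date=datetime(2021, 4, 11)),
])
session.flush()
session.add_all([
    AppointmentSlot(appointment_id=1, start_time="08:00"),
    AppointmentSlot(appointment_id=2, start_time="09:00"),
    AppointmentSlot(appointment_id=3, start_time="08:00"),
])
session.commit()

# Join form, with the join condition spelled out explicitly.
join_count = (
    session.query(func.count(Appointment.id))
    .join(AppointmentSlot, AppointmentSlot.appointment_id == Appointment.id)
    .filter(Appointment.appointment_date == day, AppointmentSlot.start_time == "08:00")
    .scalar()
)

# IN form. The inner Query is passed to in_() directly, which is a
# variation on the answer's .subquery() call.
ids_on_day = session.query(Appointment.id).filter(Appointment.appointment_date == day)
in_count = (
    session.query(func.count(AppointmentSlot.id))
    .filter(AppointmentSlot.appointment_id.in_(ids_on_day), AppointmentSlot.start_time == "08:00")
    .scalar()
)

print(join_count, in_count)
```

With only a handful of rows, timings may vary from run to run, so comparing the two forms against a realistic data volume would likely be more telling.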

【Discussion】: