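【Posted】: 2020-12-17 04:42:16
【Question】:
I am running into performance problems when writing to a SQL database with SQLAlchemy. We have thousands of records to write, and each record has many relationships. On investigation, we found that a separate INSERT statement is issued for every record. Here is a small data model as an example:
Model declaration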
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Table, Column, Integer, String, MetaData, ForeignKey
from sqlalchemy.orm import relationship, backref, sessionmaker
from sqlalchemy import create_engine

engine = create_engine('sqlite:///model.db', echo=True)
Session = sessionmaker(bind=engine)
session = Session()
Base = declarative_base()
metagate = Base.metadata

class Parent(Base):
    __tablename__ = 'PARENT'
    PK_ID = Column(Integer, primary_key=True)
    attribute = Column(String(10))
    relation1 = relationship('Child', cascade="all, delete-orphan", single_parent=True, back_populates="relation2")

    def __init__(self, attribute=None):
        self.attribute = attribute

class Child(Base):
    __tablename__ = 'CHILD'
    PK_ID = Column(Integer, primary_key=True)
    attribute = Column(String(10))
    relation2_ID = Column(Integer, ForeignKey('PARENT.PK_ID'))
    relation2 = relationship("Parent", cascade="all, delete-orphan", single_parent=True, back_populates="relation1")

    def __init__(self, attribute=None):
        self.attribute = attribute

Base.metadata.create_all(engine)
Snippet that was run
obj1 = Parent('foo')
for attribute in range(10):
    obj2 = Child(str(attribute))
    obj1.relation1.append(obj2)
session.add(obj1)
session.commit()
Generated SQL
SELECT CAST('test plain returns' AS VARCHAR(60)) AS anon_1
()
SELECT CAST('test unicode returns' AS VARCHAR(60)) AS anon_1
()
PRAGMA table_info("PARENT")
()
PRAGMA table_info("CHILD")
()
SELECT CAST('test plain returns' AS VARCHAR(60)) AS anon_1
()
SELECT CAST('test unicode returns' AS VARCHAR(60)) AS anon_1
()
BEGIN (implicit)
INSERT INTO "PARENT" (attribute) VALUES (?)
('foo',)
INSERT INTO "CHILD" (attribute, "relation2_ID") VALUES (?, ?)
('0', 3)
INSERT INTO "CHILD" (attribute, "relation2_ID") VALUES (?, ?)
('1', 3)
INSERT INTO "CHILD" (attribute, "relation2_ID") VALUES (?, ?)
('2', 3)
INSERT INTO "CHILD" (attribute, "relation2_ID") VALUES (?, ?)
('3', 3)
INSERT INTO "CHILD" (attribute, "relation2_ID") VALUES (?, ?)
('4', 3)
INSERT INTO "CHILD" (attribute, "relation2_ID") VALUES (?, ?)
('5', 3)
INSERT INTO "CHILD" (attribute, "relation2_ID") VALUES (?, ?)
('6', 3)
INSERT INTO "CHILD" (attribute, "relation2_ID") VALUES (?, ?)
('7', 3)
INSERT INTO "CHILD" (attribute, "relation2_ID") VALUES (?, ?)
('8', 3)
INSERT INTO "CHILD" (attribute, "relation2_ID") VALUES (?, ?)
('9', 3)
COMMIT
What we expected is something like this:
SELECT CAST('test plain returns' AS VARCHAR(60)) AS anon_1
()
SELECT CAST('test unicode returns' AS VARCHAR(60)) AS anon_1
()
PRAGMA table_info("PARENT")
()
PRAGMA table_info("CHILD")
()
SELECT CAST('test plain returns' AS VARCHAR(60)) AS anon_1
()
SELECT CAST('test unicode returns' AS VARCHAR(60)) AS anon_1
()
BEGIN (implicit)
INSERT INTO "PARENT" (attribute) VALUES (?)
('foo',)
INSERT INTO "CHILD" (attribute, "relation2_ID") VALUES
('0', 3),
('1', 3),
...
('9', 3)
COMMIT
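We initially called Session.add for each child, then tried Session.bulk_save_objects(), and now rely on the "save-update" cascade, but none of these gave any performance benefit. Is there a way to insert all the related rows in a single query?
If it helps, the final database will be SQL Server 2012. Our first attempt took more than 1 hour to save 1 record:
- Data size: about 800 kB
- Number of tables involved: 15
- Number of related records: about 20,000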
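For illustration, here is a minimal sketch of how the multi-row INSERT we expected could be expressed with SQLAlchemy Core instead of the ORM unit of work. The Core Table definitions, the in-memory SQLite engine, and the names rows, parent_id, and child_count are assumptions made for this sketch. It is not the relationship-based code from the model above, and it has not been measured against SQL Server.

```python
from sqlalchemy import (Column, ForeignKey, Integer, MetaData, String, Table,
                        create_engine)

# Assumption: in-memory SQLite keeps the sketch self-contained.
engine = create_engine("sqlite://")
metadata = MetaData()

# Core tables mirroring the PARENT / CHILD models from the question.
parent = Table(
    "PARENT", metadata,
    Column("PK_ID", Integer, primary_key=True),
    Column("attribute", String(10)),
)
child = Table(
    "CHILD", metadata,
    Column("PK_ID", Integer, primary_key=True),
    Column("attribute", String(10)),
    Column("relation2_ID", Integer, ForeignKey("PARENT.PK_ID")),
)
metadata.create_all(engine)

with engine.begin() as conn:
    # Insert the parent first so its generated key is available to the children.
    result = conn.execute(parent.insert().values(attribute="foo"))
    parent_id = result.inserted_primary_key[0]

    rows = [{"attribute": str(i), "relation2_ID": parent_id} for i in range(10)]

    # Passing a list of dicts to values() renders one
    # INSERT ... VALUES (...), (...), ... statement for all children.
    conn.execute(child.insert().values(rows))

    # Read the rows back to confirm that all children were written.
    child_count = len(conn.execute(child.select()).fetchall())
```

An alternative form is conn.execute(child.insert(), rows), which hands the whole list to the driver's executemany instead of rendering one large statement. On SQL Server a single statement is limited to 2100 parameters and a VALUES list to 1000 rows, so with around 20,000 related records the rows would presumably have to be sent in chunks.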
Thanks in advance,
BLH
【Discussion】:
Tags: python sql sql-server orm sqlalchemy