【Question title】: sqlite performance issue: one index per table is somewhat painful
【Posted】: 2012-05-30 18:04:35
【Question description】:

Here is my schema (give or take):

cmds.Add(@"CREATE TABLE [Services] ([Id] INTEGER PRIMARY KEY, [AssetId] INTEGER NULL, [Name] TEXT NOT NULL)");
cmds.Add(@"CREATE INDEX [IX_Services_AssetId] ON [Services] ([AssetId])");
cmds.Add(@"CREATE INDEX [IX_Services_Name] ON [Services] ([Name])");

cmds.Add(@"CREATE TABLE [Telemetry] ([Id] INTEGER PRIMARY KEY, [ServiceId] INTEGER NULL, [Name] TEXT NOT NULL)");
cmds.Add(@"CREATE INDEX [IX_Telemetry_ServiceId] ON [Telemetry] ([ServiceId])");
cmds.Add(@"CREATE INDEX [IX_Telemetry_Name] ON [Telemetry] ([Name])");

cmds.Add(@"CREATE TABLE [Events] ([Id] INTEGER PRIMARY KEY, [TelemetryId] INTEGER NOT NULL, [TimestampTicks] INTEGER NOT NULL, [Value] TEXT NOT NULL)");
cmds.Add(@"CREATE INDEX [IX_Events_TelemetryId] ON [Events] ([TelemetryId])");
cmds.Add(@"CREATE INDEX [IX_Events_TimestampTicks] ON [Events] ([TimestampTicks])");

Here are my queries, with their strange timer results:

sqlite> SELECT MIN(e.TimestampTicks) FROM Events e INNER JOIN Telemetry ss ON ss.ID = e.TelemetryID INNER JOIN Services s ON s.ID = ss.ServiceID WHERE s.AssetID = 1;

634678974004420000 CPU Time: user 0.296402 sys 0.374402

sqlite> SELECT MIN(e.TimestampTicks) FROM Events e INNER JOIN Telemetry ss ON ss.ID = e.TelemetryID INNER JOIN Services s ON s.ID = ss.ServiceID WHERE s.AssetID = 2;

634691940264680000 CPU Time: user 0.062400 sys 0.124801

sqlite> SELECT MIN(e.TimestampTicks) FROM Events e INNER JOIN Telemetry ss ON ss.ID = +e.TelemetryID INNER JOIN Services s ON s.ID = ss.ServiceID WHERE s.AssetID = 1;

634678974004420000 CPU Time: user 0.000000 sys 0.000000

sqlite> SELECT MIN(e.TimestampTicks) FROM Events e INNER JOIN Telemetry ss ON ss.ID = +e.TelemetryID INNER JOIN Services s ON s.ID = ss.ServiceID WHERE s.AssetID = 2;

634691940264680000 CPU Time: user 0.265202 sys 0.078001

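Now, I can understand why adding the "+" might change the timing, but why is the effect so inconsistent when the AssetId changes? Should I create some other index for these MIN queries? There are 900000 rows in the Events table.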

Query plans (the first one is with the "+"):

0|0|0|SEARCH TABLE Events AS e USING INDEX IX_Events_TimestampTicks (~1 rows)
0|1|1|SEARCH TABLE Telemetry AS ss USING INTEGER PRIMARY KEY (rowid=?) (~1 rows)
0|2|2|SEARCH TABLE Services AS s USING INTEGER PRIMARY KEY (rowid=?) (~1 rows)

0|0|2|SEARCH TABLE Services AS s USING COVERING INDEX IX_Services_AssetId (AssetId=?) (~1 rows)
0|1|1|SEARCH TABLE Telemetry AS ss USING COVERING INDEX IX_Telemetry_ServiceId (ServiceId=?) (~1 rows)
0|2|0|SEARCH TABLE Events AS e USING INDEX IX_Events_TelemetryId (TelemetryId=?) (~1 rows)
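
Plans like the two above can be reproduced with EXPLAIN QUERY PLAN. The following is only a minimal sketch, assuming Python's standard sqlite3 module and an empty in-memory copy of the schema; the helper name `query_plan` is made up for illustration, and the exact plan text depends on the SQLite version and on the data in the tables.

```python
import sqlite3

SCHEMA = """
CREATE TABLE Services (Id INTEGER PRIMARY KEY, AssetId INTEGER NULL, Name TEXT NOT NULL);
CREATE INDEX IX_Services_AssetId ON Services (AssetId);
CREATE TABLE Telemetry (Id INTEGER PRIMARY KEY, ServiceId INTEGER NULL, Name TEXT NOT NULL);
CREATE INDEX IX_Telemetry_ServiceId ON Telemetry (ServiceId);
CREATE TABLE Events (Id INTEGER PRIMARY KEY, TelemetryId INTEGER NOT NULL,
                     TimestampTicks INTEGER NOT NULL, Value TEXT NOT NULL);
CREATE INDEX IX_Events_TelemetryId ON Events (TelemetryId);
CREATE INDEX IX_Events_TimestampTicks ON Events (TimestampTicks);
"""

# {col} is filled with either "e.TelemetryId" or "+e.TelemetryId".
QUERY = """
SELECT MIN(e.TimestampTicks)
FROM Events e
INNER JOIN Telemetry ss ON ss.Id = {col}
INNER JOIN Services s ON s.Id = ss.ServiceId
WHERE s.AssetId = ?
"""


def query_plan(conn, sql, params=()):
    """Return the 'detail' text (last column) of each EXPLAIN QUERY PLAN row."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

plain_plan = query_plan(conn, QUERY.format(col="e.TelemetryId"), (1,))
plus_plan = query_plan(conn, QUERY.format(col="+e.TelemetryId"), (1,))

for label, plan in (("without +", plain_plan), ("with +", plus_plan)):
    print(label)
    for step in plan:
        print("   ", step)
```

With the "+" in place, the join term can no longer drive a lookup through IX_Events_TelemetryId, so the planner has to visit Events some other way. Which alternative it picks is up to the planner.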

EDIT: In summary, given the tables above, which indexes would you create if these were the only queries ever to be executed:

SELECT MIN/MAX(e.TimestampTicks) FROM Events e INNER JOIN Telemetry t ON t.ID = e.TelemetryID INNER JOIN Services s ON s.ID = t.ServiceID WHERE s.AssetID = @AssetId;

SELECT e1.* FROM Events e1 INNER JOIN Telemetry t1 ON t1.Id = e1.TelemetryId INNER JOIN Services s1 ON s1.Id = t1.ServiceId WHERE t1.Name = @TelemetryName AND s1.Name = @ServiceName;

SELECT * FROM Events e INNER JOIN Telemetry t ON t.Id = e.TelemetryId INNER JOIN Services s ON s.Id = t.ServiceId WHERE s.AssetId = @AssetId AND e.TimestampTicks >= @StartTimeTicks ORDER BY e.TimestampTicks LIMIT 1000;

SELECT e.Id, e.TelemetryId, e.TimestampTicks, e.Value FROM (
    SELECT e2.Id AS [Id], MAX(e2.TimestampTicks) AS [TimestampTicks]
    FROM Events e2 INNER JOIN Telemetry t ON t.Id = e2.TelemetryId INNER JOIN Services s ON s.Id = t.ServiceId
    WHERE s.AssetId = @AssetId AND e2.TimestampTicks <= @StartTimeTicks
    GROUP BY e2.TelemetryId) AS grp
INNER JOIN Events e ON grp.Id = e.Id;

【Question comments】:

    Tags: sql performance sqlite indexing


    【Answer 1】:

    Brannon,

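    Regarding the time differences as AssetID changes: Maybe you have already tried this, but have you run each query several times in succession? Thanks to the memory caching done by your operating system and by sqlite, the second run of a query within a session is often much faster than the first. I would run a given query four times in a row and see whether the timings of runs 2-4 are more consistent.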
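
The repeated-run check suggested here could be scripted roughly as follows. This is a sketch under assumptions: it uses Python's sqlite3 module, a hypothetical helper named `time_runs`, and a small in-memory stand-in for the Events table, so the operating system's file cache plays no part in the demo. Against the real database file, the first run would be the one to watch.

```python
import sqlite3
import time


def time_runs(conn, sql, params=(), runs=4):
    """Run the same query `runs` times and return wall-clock seconds per run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        conn.execute(sql, params).fetchall()  # fetch all rows so the query fully executes
        timings.append(time.perf_counter() - start)
    return timings


# Stand-in data only; the real table in the question has 900000 rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Events (Id INTEGER PRIMARY KEY, TelemetryId INTEGER NOT NULL, "
    "TimestampTicks INTEGER NOT NULL, Value TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO Events (TelemetryId, TimestampTicks, Value) VALUES (?, ?, ?)",
    ((i % 10 + 1, 634678974004420000 + i, "v") for i in range(10000)),
)

timings = time_runs(
    conn, "SELECT MIN(TimestampTicks) FROM Events WHERE TelemetryId = ?", (1,)
)
for run, seconds in enumerate(timings, start=1):
    print(f"run {run}: {seconds:.6f}s")
```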

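    Regarding the use of "+": (For those who may not know, prefixing a column with "+" in a SELECT hints to sqlite that it should not use an index on that column for that term. The unary "+" is otherwise a no-op, although it also removes the column's type affinity, which can subtly change how a comparison behaves in some cases.) Have you run the ANALYZE command? It helps the sqlite optimizer quite a bit when it makes its decisions.
    http://sqlite.org/lang_analyze.html Once your schema is stable and your tables are populated, you may only need to run it once; there is no need to run it every day.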
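
To see what ANALYZE actually records, one can look at the sqlite_stat1 table it fills in. The sketch below assumes Python's sqlite3 module and 1000 made-up rows in a cut-down Events table; it is an illustration, not the asker's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Events (Id INTEGER PRIMARY KEY, TelemetryId INTEGER NOT NULL,
                     TimestampTicks INTEGER NOT NULL, Value TEXT NOT NULL);
CREATE INDEX IX_Events_TelemetryId ON Events (TelemetryId);
CREATE INDEX IX_Events_TimestampTicks ON Events (TimestampTicks);
""")
# Made-up data: 1000 events spread evenly over 10 telemetry ids.
conn.executemany(
    "INSERT INTO Events (TelemetryId, TimestampTicks, Value) VALUES (?, ?, ?)",
    ((i % 10 + 1, i, "v") for i in range(1000)),
)

# Gather statistics for the query planner.
conn.execute("ANALYZE")

# sqlite_stat1 holds one row per analyzed index: the first number in `stat`
# is the row count, followed by the average rows per distinct key prefix.
stats = {
    idx: stat
    for idx, stat in conn.execute(
        "SELECT idx, stat FROM sqlite_stat1 WHERE tbl = 'Events'"
    )
}
for idx, stat in sorted(stats.items()):
    print(idx, "->", stat)
```

The planner reads these numbers when estimating how selective each index is, which is why the statistics are worth refreshing after the data distribution changes substantially.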

    Regarding INDEXED BY: this is a feature that the sqlite authors discourage for typical use, but you may find it helpful for your evaluation. http://www.sqlite.org/lang_indexedby.html
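
For evaluation purposes, an INDEXED BY clause can be attached to the Events table in the asker's MIN query. The following is a sketch that assumes Python's sqlite3 module and an empty in-memory copy of the schema. The index name IX_Missing is deliberately nonexistent, to show that sqlite rejects the statement rather than silently falling back to another plan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Services (Id INTEGER PRIMARY KEY, AssetId INTEGER NULL, Name TEXT NOT NULL);
CREATE INDEX IX_Services_AssetId ON Services (AssetId);
CREATE TABLE Telemetry (Id INTEGER PRIMARY KEY, ServiceId INTEGER NULL, Name TEXT NOT NULL);
CREATE INDEX IX_Telemetry_ServiceId ON Telemetry (ServiceId);
CREATE TABLE Events (Id INTEGER PRIMARY KEY, TelemetryId INTEGER NOT NULL,
                     TimestampTicks INTEGER NOT NULL, Value TEXT NOT NULL);
CREATE INDEX IX_Events_TelemetryId ON Events (TelemetryId);
CREATE INDEX IX_Events_TimestampTicks ON Events (TimestampTicks);
""")

# Force the planner to reach Events through IX_Events_TelemetryId.
FORCED = """
SELECT MIN(e.TimestampTicks)
FROM Events e INDEXED BY IX_Events_TelemetryId
INNER JOIN Telemetry ss ON ss.Id = e.TelemetryId
INNER JOIN Services s ON s.Id = ss.ServiceId
WHERE s.AssetId = ?
"""
forced_plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + FORCED, (1,))]
for step in forced_plan:
    print(step)

# Naming an index that does not exist makes the statement fail to prepare.
error = None
try:
    conn.execute("SELECT * FROM Events INDEXED BY IX_Missing")
except sqlite3.OperationalError as exc:
    error = str(exc)
print("error:", error)
```

Unlike the "+" prefix, which only rules one term out of index use, INDEXED BY names the single index that must be used for that table, and the query fails if no plan can use it.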

    I'd be interested to hear what you find out, Donald Griggs, Columbia, SC, USA

    【Comments】:

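    • I have run the queries multiple times in succession. I have also used ANALYZE. I have looked at the INDEXED BY feature but have not tried it yet. How does it differ from the "+"? What I really need is to switch index usage dynamically, based on the probability that a matching item is near or far (which could be determined by comparing the number of rows with a certain FK against the count of all rows in the table).
    • I have started using INDEXED BY. The "+" handling seems to be non-deterministic.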
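
The idea from the first comment, switching indexes based on how large a share of the Events rows belongs to the asset, might be sketched like this. Everything here is an assumption made for illustration: the helper names (`asset_fraction`, `choose_index`, `min_timestamp`), the 0.5 threshold, and the sample data are hypothetical, and the code uses Python's sqlite3 module.

```python
import sqlite3

MIN_QUERY = """
SELECT MIN(e.TimestampTicks)
FROM Events e INDEXED BY {index}
INNER JOIN Telemetry t ON t.Id = e.TelemetryId
INNER JOIN Services s ON s.Id = t.ServiceId
WHERE s.AssetId = ?
"""


def asset_fraction(conn, asset_id):
    """Share of all Events rows that belong to the given asset (0.0 to 1.0)."""
    total = conn.execute("SELECT COUNT(*) FROM Events").fetchone()[0]
    if total == 0:
        return 0.0
    matching = conn.execute(
        "SELECT COUNT(*) FROM Events e "
        "INNER JOIN Telemetry t ON t.Id = e.TelemetryId "
        "INNER JOIN Services s ON s.Id = t.ServiceId WHERE s.AssetId = ?",
        (asset_id,),
    ).fetchone()[0]
    return matching / total


def choose_index(conn, asset_id, threshold=0.5):
    """Pick the timestamp index when matches are common, else the FK index.

    The threshold is an arbitrary placeholder, not a measured value.
    """
    if asset_fraction(conn, asset_id) >= threshold:
        return "IX_Events_TimestampTicks"  # a match is likely near the start of the scan
    return "IX_Events_TelemetryId"  # few matching rows: look them up directly


def min_timestamp(conn, asset_id):
    sql = MIN_QUERY.format(index=choose_index(conn, asset_id))
    return conn.execute(sql, (asset_id,)).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Services (Id INTEGER PRIMARY KEY, AssetId INTEGER NULL, Name TEXT NOT NULL);
CREATE TABLE Telemetry (Id INTEGER PRIMARY KEY, ServiceId INTEGER NULL, Name TEXT NOT NULL);
CREATE TABLE Events (Id INTEGER PRIMARY KEY, TelemetryId INTEGER NOT NULL,
                     TimestampTicks INTEGER NOT NULL, Value TEXT NOT NULL);
CREATE INDEX IX_Events_TelemetryId ON Events (TelemetryId);
CREATE INDEX IX_Events_TimestampTicks ON Events (TimestampTicks);
INSERT INTO Services VALUES (1, 1, 'a'), (2, 2, 'b');
INSERT INTO Telemetry VALUES (1, 1, 'x'), (2, 2, 'y');
""")
# Hypothetical split: asset 1 owns 900 events, asset 2 owns 100.
conn.executemany(
    "INSERT INTO Events (TelemetryId, TimestampTicks, Value) VALUES (?, ?, 'v')",
    [(1, 1000 + i) for i in range(900)] + [(2, 5000 + i) for i in range(100)],
)

for asset_id in (1, 2):
    print(asset_id, choose_index(conn, asset_id), min_timestamp(conn, asset_id))
```

Counting the matching rows on every call costs about as much as the query itself, so in practice the per-asset counts would need to be cached or maintained separately for this to pay off.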