Importing a CSV file into TDengine

【Question title】: TDengine import from csv file
【Posted】: 2022-04-30 16:22:47
【Question】:

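I just noticed that importing a CSV file sorted by timestamp into a TDengine database is faster than importing an unsorted one. Each CSV file has 1,000,000 rows; the only difference is that one file is sorted by timestamp and the other is not.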

Can anyone explain why importing the sorted CSV file is faster?

taos> create table if not exists t1(ts timestamp, c1 int, c2 float, c3 int, c4 int);
Query OK, 0 of 0 row(s) in database (0.001659s)

taos> insert into t1 file 'unsorted.csv';
Query OK, 1000000 of 1000000 row(s) in database (2.025508s)

taos> create table if not exists t2(ts timestamp, c1 int, c2 float, c3 int, c4 int);
Query OK, 0 of 0 row(s) in database (0.001335s)

taos> insert into t2 file 'sorted.csv';
Query OK, 1000000 of 1000000 row(s) in database (0.994504s)

【Comments】:

Tags: sql database time-series iot tdengine

【Answer 1】:

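My guess is that it comes from TDengine's storage using an LSM-tree structure. The imported data is time-series data, and records are kept ordered by the primary timestamp key. Writing data that is already in order plays to the strengths of the LSM, because the rows are simply appended to disk blocks. Random (out-of-order) access, however, carries a penalty.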
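To make the append-versus-reorder intuition concrete, here is a toy model in Python. It is only a sketch and does not reflect TDengine's actual storage code: it keeps a timestamp-ordered in-memory list and counts how many existing entries have to move for each incoming row. The function name `count_shifts` and the 10,000-row size are assumptions chosen for illustration.

```python
import bisect
import random


def count_shifts(timestamps):
    """Insert timestamps one by one into a sorted list.

    Returns the total number of existing entries that had to move
    to make room (0 means every row was a pure append).
    """
    buf = []
    shifts = 0
    for ts in timestamps:
        pos = bisect.bisect_right(buf, ts)  # where this row belongs
        shifts += len(buf) - pos            # entries after pos must move
        buf.insert(pos, ts)
    return shifts


ordered = list(range(10_000))
shuffled = ordered[:]
random.Random(0).shuffle(shuffled)  # fixed seed for repeatability

sorted_shifts = count_shifts(ordered)
unsorted_shifts = count_shifts(shuffled)
print("in-order input, entries moved:", sorted_shifts)
print("shuffled input, entries moved:", unsorted_shifts)
```

With in-order input every row lands at the end, so nothing moves; with shuffled input most rows land somewhere in the middle. A real engine would buffer and merge rather than shift a list, but the extra work for out-of-order rows has the same origin.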

【Comments】:

【Answer 2】:

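For a time-based database or data structure, sorted records are always better. I think it mostly depends on your business scenario: if sorted records are easy to produce, use them; if not, let the time-series database (such as TDengine) handle it.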
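If sorting is cheap on your side, one way to do it before the import is a small Python script like the sketch below. It makes several assumptions that are not stated in the question: the CSV has no header row, the timestamp is the first column, and it is either an integer epoch value or a fixed-width datetime string (optionally quoted) that sorts correctly as text. The names `ts_key` and `sort_csv_by_timestamp` are hypothetical helpers, and the whole file is loaded into memory, which should be fine at 1,000,000 rows.

```python
import csv


def ts_key(row):
    """Sort key built from the first column (assumed to be the timestamp)."""
    value = row[0].strip().strip("'\"")
    try:
        # Integer epoch timestamps compare numerically.
        return (0, int(value), "")
    except ValueError:
        # Fixed-width datetime strings compare correctly as text.
        return (1, 0, value)


def sort_csv_by_timestamp(src_path, dst_path):
    """Write a copy of src_path ordered by timestamp; return the row count."""
    with open(src_path, newline="") as src:
        rows = [row for row in csv.reader(src) if row]  # skip blank lines
    rows.sort(key=ts_key)
    with open(dst_path, "w", newline="") as dst:
        csv.writer(dst, lineterminator="\n").writerows(rows)
    return len(rows)
```

Calling `sort_csv_by_timestamp("unsorted.csv", "sorted.csv")` and then running `insert into t2 file 'sorted.csv';` would reproduce the faster path from the question. Whether the sorting time is worth it depends on how the roughly one second saved per million rows compares with the cost of the sort itself.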

【Comments】: