InfluxDB 2.0 has slow write performance with Python client compared to MariaDB: answers

【Question title】: InfluxDB 2.0 has slow write performance with Python client compared to MariaDB
【Posted】: 2021-04-27 19:04:33
【Question】:

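I am new to InfluxDB and I am trying to compare the performance of MariaDB and InfluxDB 2.0. To do so, I run a benchmark with about 350,000 rows stored in a txt file (30 MB).

With MariaDB, I use "executemany" to write multiple rows to the database, which takes about 20 seconds for all rows (using Python).

So I tried the same with InfluxDB using the Python client; my main steps are below.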

# Import the client classes used below
from influxdb_client import Point, WriteOptions

# Configure the write API for batching
write_api = client.write_api(write_options=WriteOptions(batch_size=10_000, flush_interval=5_000))

# Create the Point (7 fields in total)
p = Point("Test").field("column_1", value_1).field("column_2", value_2)

# Append the point to a list
data.append(p)

# Write the list to the database after collecting 200,000 points (this gave the best performance), then clear "data" and start again
write_api.write("bucket", "org", data)

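Running this takes about 40 seconds, twice as long as MariaDB.

I have been stuck on this for quite a while, because the documentation suggests writing in batches, which I do, and in theory that should be faster than MariaDB.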
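
For reference, the "collect a fixed number of points, write them, clear the list, repeat" loop described in this question can be sketched as a small generator. This is only a minimal sketch: `batched` and `rows` are hypothetical names that do not appear in the original code, and the write call is shown as a comment because it needs a running InfluxDB server.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `size` items taken from `items`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


# Hypothetical stand-in for the parsed rows of the txt file.
rows = range(10)

for chunk in batched(rows, 4):
    # In the question's setup, each chunk would be turned into Point objects
    # and passed to write_api.write("bucket", "org", ...) at this point.
    print(len(chunk))
```

Using a generator like this means only one chunk is held in memory at a time, and the chunk size can be varied while benchmarking without touching the rest of the loop.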

But I am probably missing something.

Thanks in advance!

【Comments】:

Tags: mariadb benchmarking influxdb influxdb-python

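【Solution 1】:

Shoveling 20MB of data to disk takes some time.

executemany probably does batching. (I don't know the details.)

It sounds like InfluxDB does not do as good a job.

To shovel lots of data into a table: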

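Given a CSV file, LOAD DATA INFILE is the fastest. But if you must first create that file, it may not win the race.
"Batched" INSERTs are very fast: INSERT ... VALUE (1,11), (2, 22), ... With 100 rows per statement, this runs about 10 times as fast as single-row INSERTs. Beyond 100 rows or so, it gets into "diminishing returns".
Combining separate INSERTs into a single "transaction" avoids paying the transaction overhead for every statement. (Again, "diminishing returns" applies.)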
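
The last two options in the list above can be sketched in Python. This is a minimal illustration under stated assumptions, not a benchmark: it uses the standard-library `sqlite3` module as a stand-in for MariaDB so that it runs without a server, and the table `t`, its columns `a` and `b`, and the helper names are all hypothetical. With a MariaDB driver, the placeholder style and connection handling would differ.

```python
import sqlite3


def insert_single_rows(conn, rows):
    """One INSERT statement per row, all inside a single transaction."""
    with conn:  # commits once when the block exits without error
        for row in rows:
            conn.execute("INSERT INTO t (a, b) VALUES (?, ?)", row)


def insert_multi_row(conn, rows, batch_size=100):
    """One multi-row INSERT ... VALUES (..), (..), ... per batch of rows."""
    with conn:
        for start in range(0, len(rows), batch_size):
            batch = rows[start:start + batch_size]
            # Build "(?, ?), (?, ?), ..." with one group per row in the batch.
            placeholders = ", ".join(["(?, ?)"] * len(batch))
            flat = [value for row in batch for value in row]
            conn.execute(f"INSERT INTO t (a, b) VALUES {placeholders}", flat)


conn = sqlite3.connect(":memory:")  # assumption: in-memory stand-in database
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")

rows = [(i, i * 11) for i in range(1, 251)]
insert_multi_row(conn, rows)    # 3 statements: 100 + 100 + 50 rows
insert_single_rows(conn, rows)  # 250 statements, 1 commit

count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(count)
```

The batch size of 100 follows the figure quoted in the answer; timing both helpers against a real MariaDB instance would be needed to confirm where the diminishing returns start for a given table.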

There are a hundred packages that sit between the user and the database; InfluxDB is yet another one. I don't know the details.

【Comments】: