How to customize error bars in matplotlib: answers

【Question title】: How can I customize the error bars in matplotlib
【Posted】: 2021-11-19 03:56:30
【Question】:

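I want to show all the available information in my figure, including the mean, the standard deviation, and the MSE, and to label every point in the figure.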

from sklearn.metrics import mean_absolute_error as mae
from sklearn.metrics import mean_squared_error as mse
import numpy as np
import matplotlib.pyplot as plt

For simplicity, suppose I have only three points.

true = np.array([[1047.],
 [ 953.],
 [1073.]])
pred = np.array([[ -69.921265],
 [-907.8611  ],
 [ 208.98877 ]])

my_mae= mae(true, pred) #mean absolute error
my_mse= mse(true,pred) #mean squared error

err = abs(true - pred) #get the error per point
mean_err = np.mean(err) #calculate the mean
sd_err  = np.std(err) #calculate the standard deviation
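
A side note on these quantities, as a minimal sketch in plain NumPy rather than scikit-learn: for a single output column, the mean absolute error is by definition the mean of the per-point absolute errors, so `my_mae` and `mean_err` above should hold the same number. The three remaining values are also linked, since the variance of the absolute error equals MSE minus MAE squared. The names `mae_np` and `mse_np` below are illustrative and do not appear in the original post.

```python
import numpy as np

true = np.array([[1047.0], [953.0], [1073.0]])
pred = np.array([[-69.921265], [-907.8611], [208.98877]])

err = np.abs(true - pred)              # per-point absolute error, shape (3, 1)
mae_np = err.mean()                    # mean absolute error (same as mean_err)
mse_np = ((true - pred) ** 2).mean()   # mean squared error
sd_err = err.std()                     # population standard deviation (ddof=0)

# Identity linking the three values: var(|e|) = mean(e^2) - mean(|e|)^2
print(np.isclose(sd_err ** 2, mse_np - mae_np ** 2))
```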

Then I plot my error bars.

dy= 100

plt.errorbar(true,pred, yerr=dy, fmt='o', color='black',ecolor='red', elinewidth=3, capsize=0);
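
For reference, `yerr` does not have to be a single constant like `dy`: `errorbar` also accepts one value per point. The sketch below assumes, purely for illustration, that the per-point absolute error `err` is the quantity to draw, and it flattens the column arrays to 1-D first. Whether that is the right bar length for this plot is a judgment call the post does not settle.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

true = np.array([1047.0, 953.0, 1073.0])            # flattened to 1-D
pred = np.array([-69.921265, -907.8611, 208.98877])
err = np.abs(true - pred)                            # one error value per point

fig, ax = plt.subplots()
# yerr as an array: each bar extends err[i] above and below pred[i]
container = ax.errorbar(true, pred, yerr=err, fmt="o", color="black",
                        ecolor="red", elinewidth=3, capsize=0)
```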

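First, I would like to label each error bar somehow, so that I can see which data point it belongs to. Second, I would like to add all four pieces of information to the plot. Any help would be appreciated.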

【Comments】:

What are the values of "a" and "b"?
That was a typo; I have corrected it now.
Is anything missing from my code?

Tags: python numpy matplotlib plot

【Solution 1】:

I replaced the data with random values to simulate your 400 rows. Plotting the 400 data points takes about 0.1 seconds.

from sklearn.metrics import mean_absolute_error as mae
from sklearn.metrics import mean_squared_error as mse
import numpy as np
import matplotlib.pyplot as plt
import time

true = np.arange(400)
pred = np.random.rand(400) * 1000  # 1-D like true, so true - pred stays elementwise
number = np.arange(1, 401)

my_mae= mae(true, pred) #mean absolute error
my_mse= mse(true,pred) #mean squared error

err = abs(true - pred) #get the error per point
mean_err = np.mean(err) #calculate the mean
sd_err  = np.std(err) #calculate the standard deviation

dy= 100

plt.close()
fig, ax = plt.subplots(figsize = (13,5))
ax.set_xlim(0, 400)
ax.grid()

######
t1 = time.time()
######
plt.errorbar(true,pred, yerr=dy, fmt='o', color='black',ecolor='red', elinewidth=3, capsize=0)
for i in range(400):
    ax.annotate(f'{i+1}', (true[i]+10, pred[i]))
ax.annotate(f'mae = {round(my_mae, 2)}, mse = {round(my_mse, 2)}, mean error = {round(mean_err, 2)}, standard deviation = {round(sd_err, 2)}', (190, -130))
#######
t2 = time.time()
#######
plt.tight_layout()
print('time needed:', t2-t1, 's')

Output:

time needed: 0.11669778823852539 s
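
One shape detail to watch when adapting this code: if `pred` is an (n, 1) column, as the question's arrays are, while `true` is a 1-D array of length n, then `true - pred` broadcasts to an (n, n) matrix instead of n per-point errors, and any mean or standard deviation taken from it is computed over all pairs. A small sketch of the effect, with the illustrative names `pred_col`, `wrong`, and `right`:

```python
import numpy as np

true = np.arange(400)                       # shape (400,)
pred_col = np.random.rand(400, 1) * 1000    # shape (400, 1), a column

wrong = np.abs(true - pred_col)             # broadcasts to every (i, j) pair
right = np.abs(true - pred_col.ravel())     # elementwise after flattening

print(wrong.shape, right.shape)             # (400, 400) (400,)
```

Flattening both arrays with `ravel()` before computing the statistics keeps them per-point.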

【Comments】:

【Solution 2】:

Here you go. If this solves your problem, consider accepting the answer:

from sklearn.metrics import mean_absolute_error as mae
from sklearn.metrics import mean_squared_error as mse
import numpy as np
import matplotlib.pyplot as plt

true = np.array([[1047.],
 [ 953.],
 [1073.]])
pred = np.array([[ -69.921265],
 [-907.8611  ],
 [ 208.98877 ]])

my_mae= mae(true, pred) #mean absolute error
my_mse= mse(true,pred) #mean squared error

err = abs(true - pred) #get the error per point
mean_err = np.mean(err) #calculate the mean
sd_err  = np.std(err) #calculate the standard deviation

dy= 100

for i, z in enumerate (pred,1):
    plt.errorbar(true,pred, yerr=dy, fmt='o', color='black',ecolor='red', elinewidth=3, capsize=0, zorder=3);
    plt.annotate(i, (true[i-1], pred[i-1]),fontsize=20, color='blue')
    
label_1=['my_mae','my_mse', 'mean_err', 'sd_err']
label_2=[my_mae,my_mse, mean_err,sd_err]

for q,w in zip(label_1, label_2): 
    plt.plot([], [],'o', label=(f'{q}: {w}'))

plt.legend(loc='lower right')

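This is what you should get:

【Comments】:

It does work, but when I run it on my real data (400 rows), it takes 40 seconds to generate the plot.
The for loop really slows things down. Is there any way to improve the performance?
Oh, that is quite a long time! Sadly, I don't know how; I hope other users can suggest a more efficient approach.
One thing you could do: after accepting this solution, repost this code as a new question asking for a more efficient approach for larger data.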
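
On the 40-second run time: in the Solution 2 loop, `plt.errorbar` is called once per point but draws all points every time, so n points produce n complete sets of markers and bars. A sketch of one way around this, assuming the data can be flattened to 1-D: call `errorbar` once, keep only the `annotate` call in the loop, and place the summary in axes-fraction coordinates so it stays visible for any data range. The function name `plot_with_labels` and the random test data are illustrative and not from the thread.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np


def plot_with_labels(true, pred, dy=100):
    """Draw all error bars in one call, then number each point."""
    true = np.ravel(true)
    pred = np.ravel(pred)
    err = np.abs(true - pred)

    fig, ax = plt.subplots(figsize=(13, 5))
    # One errorbar call for the whole data set (not one per point).
    ax.errorbar(true, pred, yerr=dy, fmt="o", color="black",
                ecolor="red", elinewidth=3, capsize=0, zorder=3)

    # Only the per-point label stays in the loop.
    for i, (x, y) in enumerate(zip(true, pred), start=1):
        ax.annotate(str(i), (x, y), color="blue")

    summary = (f"mae = {err.mean():.2f}, mse = {(err ** 2).mean():.2f}, "
               f"sd = {err.std():.2f}")
    # Axes-fraction coordinates: (0.5, 0.02) is bottom centre of the axes.
    ax.annotate(summary, xy=(0.5, 0.02), xycoords="axes fraction", ha="center")
    return fig, ax


rng = np.random.default_rng(0)
fig, ax = plot_with_labels(np.arange(400), rng.random(400) * 1000)
```

With this structure the number of plotted artists grows linearly with the number of points rather than quadratically.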