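- The implementation in the OP is not the correct way to determine, or plot, a linear model. As such, the question about determining the angle at which to plot the line is bypassed, and a more rigorous approach to plotting the regression line is shown.
- A regression line can be added by converting the datetime dates to ordinal values. The model can be calculated with sklearn, or added to the plot with seaborn.regplot, as shown below.
- Plot the full data with pandas.DataFrame.plot
- Tested in python 3.8.11, pandas 1.3.2, matplotlib 3.4.3, seaborn 0.11.2, sklearn 0.24.2
Imports and Data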
import yfinance as yf
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LinearRegression
# download the data
df = yf.download('aapl', '2015-01-01', '2021-01-01')
# convert the datetime index to ordinal values, which can be used to plot a regression line
df.index = df.index.map(pd.Timestamp.toordinal)
# display(df.iloc[:5, [4]])
Adj Close
Date
735600 24.782110
735603 24.083958
735604 24.086227
735605 24.423975
735606 25.362394
# convert the regression line start date to ordinal
x1 = pd.to_datetime('2019-01-02').toordinal()
# data slice for the regression line
data = df.loc[x1:].reset_index()
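As a quick illustration of the ordinal conversion used above, here is a minimal sketch on a few hand-picked dates (the dates and the variable names `dates`, `ordinals`, `restored`, and `day_gap` are illustrative assumptions, not part of the downloaded data). It shows that the ordinal is a plain day count, so gaps between dates are preserved, and that the conversion can be reversed.

```python
import pandas as pd

# a few sample dates (illustrative only, not the downloaded data)
dates = pd.to_datetime(['2015-01-02', '2015-01-05', '2019-01-02'])

# same conversion as applied to df.index: one integer per calendar day
ordinals = dates.map(pd.Timestamp.toordinal)

# the difference between ordinals is the number of calendar days between dates
day_gap = ordinals[1] - ordinals[0]

# the conversion is reversible with Timestamp.fromordinal
restored = pd.to_datetime([pd.Timestamp.fromordinal(int(o)) for o in ordinals])

print(list(ordinals))
print(day_gap)
print(list(restored))
```

Because the values are evenly spaced per day, weekends and holidays remain as gaps on the x-axis, which is what a regression against time needs.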
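Plot the Regression Line with seaborn
- With seaborn.regplot, no calculations are required to add the regression line to the line plot of the data.
- Convert the x-axis labels to datetime format.
- Adjust the xticks and labels if the endpoints need to be changed.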
# plot the Adj Close data
ax1 = df.plot(y='Adj Close', c='k', figsize=(15, 6), grid=True, legend=False,
title='Adjusted Close with Regression Line from 2019-01-02')
# add a regression line
sns.regplot(data=data, x='Date', y='Adj Close', ax=ax1, color='magenta', scatter_kws={'s': 7}, label='Linear Model', scatter=False)
ax1.set_xlim(df.index[0], df.index[-1])
# convert the axis back to datetime
xticks = ax1.get_xticks()
labels = [pd.Timestamp.fromordinal(int(label)).date() for label in xticks]
ax1.set_xticks(xticks)
ax1.set_xticklabels(labels)
ax1.legend()
plt.show()
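The tick relabeling above can also be handled with a formatter, so labels stay correct if the ticks change later. The following is a minimal sketch of that alternative, assuming synthetic data and a hypothetical helper named `ordinal_to_date_label`; it is not what the code above does, only another way to get date labels on an ordinal axis.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter
import pandas as pd


def ordinal_to_date_label(value, pos=None):
    """Hypothetical helper: turn an ordinal tick value into a YYYY-MM-DD label."""
    return pd.Timestamp.fromordinal(int(value)).strftime('%Y-%m-%d')


# synthetic data on an ordinal x-axis (illustrative only)
dates = pd.date_range('2019-01-02', periods=5, freq='D')
ordinals = dates.map(pd.Timestamp.toordinal)

fig, ax = plt.subplots()
ax.plot(ordinals, [1, 2, 3, 4, 5], c='k')

# matplotlib calls the formatter for every major tick it draws
ax.xaxis.set_major_formatter(FuncFormatter(ordinal_to_date_label))
```

With this approach the tick positions are left to matplotlib, and only the text of each label is computed from the ordinal value.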
Calculate the Linear Model
# create the model
model = LinearRegression()
# extract x and y from dataframe data
x = data[['Date']]
y = data[['Adj Close']]
# fit the model
model.fit(x, y)
# print the slope and intercept if desired
print('intercept:', model.intercept_)
print('slope:', model.coef_)
intercept: [-90078.45713565]
slope: [[0.1222514]]
# calculate y1, given x1
y1 = model.predict(np.array([[x1]]))
print(y1)
array([[28.27904095]])
# calculate y2, given the last date in data
x2 = data.Date.iloc[-1]
y2 = model.predict(np.array([[x2]]))
print(y2)
array([[117.40030862]])
# this can be added to `ax1` with
ax1 = df.plot(y='Adj Close', c='k', figsize=(15, 6), grid=True, legend=False,
title='Adjusted Close with Regression Line from 2019-01-02')
ax1.plot([x1, x2], [y1[0][0], y2[0][0]], label='Linear Model', c='magenta')
ax1.legend()
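As a cross-check on a fit like the one above, a straight line can also be fitted with numpy.polyfit. This is a minimal sketch on synthetic data, not the AAPL prices: the values of `true_slope` and `true_intercept`, the noise level, and the 100-day window are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# synthetic ordinal x values: 100 consecutive days (illustrative only)
x = np.arange(737061, 737061 + 100, dtype=float)

# hypothetical line parameters, plus a little noise
true_slope, true_intercept = 0.12, -88000.0
y = true_slope * x + true_intercept + rng.normal(scale=0.5, size=x.size)

# degree-1 least-squares fit; coefficients are returned highest power first
slope, intercept = np.polyfit(x, y, deg=1)

# endpoints of the fitted line, analogous to (x1, y1) and (x2, y2) above
x_start, x_end = x[0], x[-1]
y_start = slope * x_start + intercept
y_end = slope * x_end + intercept

print('slope:', slope)
print('endpoints:', (x_start, y_start), (x_end, y_end))
```

On the same data, an ordinary least-squares fit from sklearn and numpy.polyfit with `deg=1` should agree to within floating point error, since both solve the same least-squares problem.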
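Angle of the Slope
- The apparent angle is an artifact of the aspect of the axes, which is not equal for x and y. When the aspect is equal, the slope is seen to be 7.0 degrees.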
x = x2 - x1
y = y2[0][0] - y1[0][0]
slope = y / x
print(round(slope, 7) == round(model.coef_[0][0], 7))
[out]:
True
angle = round(np.rad2deg(np.arctan2(y, x)), 1)
print(angle)
[out]:
7.0
# given the existing plot
ax1 = df.plot(y='Adj Close', c='k', figsize=(15, 6), grid=True, legend=False,
title='Adjusted Close with Regression Line from 2019-01-02')
ax1.plot([x1, x2], [y1[0][0], y2[0][0]], label='Linear Model', c='magenta')
# make the aspect equal
ax1.set_aspect('equal', adjustable='box')
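To make the aspect effect concrete, the on-screen angle of a segment can be measured by transforming its endpoints from data coordinates to display coordinates. This is a minimal sketch under stated assumptions: the endpoints `p1` and `p2` are hypothetical values chosen to give a slope close to the fitted one, and `displayed_angle` is an illustrative helper, not part of matplotlib.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def displayed_angle(ax, p1, p2):
    """Hypothetical helper: angle in degrees of segment p1 -> p2 as drawn on screen."""
    ax.figure.canvas.draw()  # ensure limits and aspect have been applied
    (x1d, y1d), (x2d, y2d) = ax.transData.transform([p1, p2])
    return float(np.rad2deg(np.arctan2(y2d - y1d, x2d - x1d)))


# hypothetical endpoints with a slope of about 0.122 (illustrative only)
p1, p2 = (0.0, 28.0), (729.0, 117.0)

# angle computed from the data alone
data_angle = float(np.rad2deg(np.arctan2(p2[1] - p1[1], p2[0] - p1[0])))

fig, ax = plt.subplots(figsize=(15, 6))
ax.plot([p1[0], p2[0]], [p1[1], p2[1]], c='magenta')

# with the default aspect, the drawn angle depends on figure size and axis limits
angle_default = displayed_angle(ax, p1, p2)

# with an equal aspect, one x unit and one y unit cover the same screen distance
ax.set_aspect('equal', adjustable='box')
angle_equal = displayed_angle(ax, p1, p2)

print(round(data_angle, 1), round(angle_default, 1), round(angle_equal, 1))
```

The drawn angle only matches the angle computed from the slope once the aspect is equal, which is why the line in the earlier plots looks much steeper than 7.0 degrees.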