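【Question Title】: Negative Shannon Entropy
【Posted】: 2021-12-02 17:53:39
【Question】:

I wrote a short script that computes a stock's log returns and the Shannon entropy of the data. However, I get a negative value for the Shannon entropy, which is very strange. I am using S = -p log p. Is the problem that p is not split into discrete intervals? How can I divide p into intervals so that the entropy is computed as S = - SUM_k(pk log pk)?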

import yfinance as yf
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy.stats import norm


plot_lreturnshist = False
plot_lreturns = True

#Import the data from yfinance. What Ticker, what period of time we want
AAPL = yf.Ticker("AAPL")
history = AAPL.history(period = "5y")
#Extract only the close data
Close = history["Close"]



#Set up a recurrence to add a column in our dataframe for the logarithmic returns of the stock
#Log returns are calculated as log_2(Close(day x)/Close(day x-1))  

logreturn = []
for i in range(len(Close)):
    if i == 0:
        logreturn.append(0) 
    else:
        # .iloc makes the positional lookup explicit (Close is indexed by date)
        x = np.log2(abs(Close.iloc[i]))-np.log2(abs(Close.iloc[i-1]))
        logreturn.append(x)
#Now we have an array with the logarithmic returns, we add it to the pandas dataframe
history["logreturn"] = logreturn
#We then pull it out for ease of use
lreturn = history["logreturn"]

if plot_lreturns == True:
    fig,ax = plt.subplots()
    ax.plot(lreturn, color = "dodgerblue")  


#We plot the data in a histogram, by 
if plot_lreturnshist == True:
    mu, std = norm.fit(lreturn)
    plt.hist(lreturn, bins=50, density=True, alpha=0.6, color='g', ec = 'black')
    
    xmin, xmax = plt.xlim()
    x = np.linspace(xmin, xmax, 100)
    p = norm.pdf(x, mu, std)
    plt.plot(x, p, 'k', linewidth=2)
    title = r"Fit results: $\mu$ = $%.2f$,  $\sigma$ = $%.2f$" % (mu, std)
    plt.title(title)
    plt.xlabel(r"$\ln(Y_{t+1}/Y_t$)")

    plt.show()

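mu, std = norm.fit(lreturn)
#x is only defined inside the histogram branch above, so rebuild the grid here
x = np.linspace(lreturn.min(), lreturn.max(), 100)
p = norm.pdf(x, mu, std)
S = np.sum(-p*np.log(p))
print(S)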
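A minimal sketch of the binning the question asks about, using only NumPy. It rests on assumptions that are not in the original post: the helper name `histogram_entropy`, the choice of 50 bins, and synthetic normal "returns" standing in for the AAPL data. The idea is that `norm.pdf` returns density values, which can exceed 1 when the spread is small, so `-p*log(p)` can be negative term by term; bin probabilities from a histogram lie in [0, 1] and sum to 1, so the discrete sum cannot go below zero.

```python
import numpy as np


def histogram_entropy(samples, bins=50):
    """Shannon entropy (in bits) of samples after binning them into a histogram."""
    counts, _ = np.histogram(samples, bins=bins)
    pk = counts / counts.sum()   # bin probabilities, they sum to 1
    pk = pk[pk > 0]              # drop empty bins, treating 0*log(0) as 0
    return -np.sum(pk * np.log2(pk))


# Synthetic stand-in for daily log returns (assumed values, not real AAPL data)
rng = np.random.default_rng(0)
fake_returns = rng.normal(loc=0.0, scale=0.02, size=1250)

S_hist = histogram_entropy(fake_returns, bins=50)

# For comparison: summing -p*log(p) over density values, as in the question
mu, sigma = fake_returns.mean(), fake_returns.std()
x = np.linspace(fake_returns.min(), fake_returns.max(), 100)
pdf = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
S_pdf = np.sum(-pdf * np.log(pdf))

print("histogram entropy (bits):", S_hist)
print("sum over density values:", S_pdf)
```

The histogram value is bounded by 0 and log2(bins), and it depends on the number of bins chosen, so values are only comparable when the binning is held fixed.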

【Comments】:

    Tags: python pandas finance entropy yfinance


    【Solution 1】:

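    I built an entropy indicator that uses a rolling volume histogram as the probability input, and I also got negative values. In thermodynamics, negative entropy means you are gaining heat, so perhaps this indicates increased market activity, although it does not tell you in which direction.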
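One other possible reading of a negative value, offered here as an assumption and not as something established in this thread: when the probabilities come from a continuous density, the quantity being estimated is a differential entropy, and that can legitimately be negative. For a normal distribution the closed form is 0.5*ln(2*pi*e*sigma^2), which drops below zero once sigma is smaller than about 0.242. The function name and the example sigma values below are illustrative only.

```python
import math


def gaussian_differential_entropy(sigma):
    """Differential entropy (in nats) of a normal distribution with std sigma."""
    return 0.5 * math.log(2 * math.pi * math.e * sigma ** 2)


# Standard deviation at which the differential entropy crosses zero
threshold = 1 / math.sqrt(2 * math.pi * math.e)

for sigma in (0.02, threshold, 1.0):
    print(sigma, gaussian_differential_entropy(sigma))
```

A spread of a few percent, which is an assumed order of magnitude for daily log returns, sits well below that threshold.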

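    You can find my attempt at the indicator in my lib of indicators @ github. It is simply called "entropy".

    Edit: Following your comment, I modified the entropy function, and it now gives positive values.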

    import numpy as np

    def entropy(c_close, c_volume, period, bins=2):
        size = len(c_close)
        out = np.array([np.nan] * size)
        # ROLLING WINDOW
        for i in range(period - 1, size):
            e = i + 1
            s = e - period
            close_w = c_close[s:e]
            volume_w = c_volume[s:e]
            # HISTO BASED ON CLOSE / VOLUME
            min_w = np.min(close_w)
            norm = 1.0 / (np.max(close_w) - min_w)
            sum_h = np.array([0.0] * bins)
            for j in range(period):
                # CLAMP SO THE WINDOW MAXIMUM FALLS IN THE LAST BIN, NOT PAST IT
                k = min(int((close_w[j] - min_w) * bins * norm), bins - 1)
                sum_h[k] += volume_w[j] ** 2
            count = np.sqrt(sum_h)
            # NORMALIZE HISTO COUNT (CONVERT TO PROBA)
            count = count / sum(count)
            # DELETE PROBAS = 0 TO AVOID GAPS
            count = count[np.nonzero(count)]
            # ENTROPY
            out[i] = -sum(count * np.log2(count))
        return out
    

    【Comments】:
