Probability time series, observed data probabilities (deja vu): answers

【Question title】: Probability time series, observed data probabilities (deja vu)
【Posted】: 2010-11-04 12:51:38
【Question】:

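OK guys... thanks for looking at this question. I remember doing the following in college, but I have forgotten the exact solution. Any pointers in the right direction would be appreciated.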

I have N time series of data (we will use three). The series are aligned in time (e.g. obsOne[1] occurs together with obsTwo[1] and obsThree[1]).

obsOne[47, 136, -108, -15, 22, ...], obsTwo[448, 321, 122, -207, 269, ...], obsThree[381, 283, 429, -393, 242, ...]

Step 2. From the data series, I create X range bins of width Z for each series. (e.g. for obsOne: bin1 = [ 136]
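As a rough sketch of this binning step: the post does not say how the bin edges are chosen, so the snippet below assumes equal-width bins spanning each series' minimum to maximum. `bin_index` is a hypothetical helper name, and bins are numbered from 0.

```python
def bin_index(sample, lo, hi, n_bins):
    """Return the 0-based equal-width bin of `sample` within [lo, hi]."""
    width = (hi - lo) / n_bins           # Z in the post's notation
    idx = int((sample - lo) // width)
    return min(idx, n_bins - 1)          # the maximum falls in the last bin

obs_one = [47, 136, -108, -15, 22]
lo, hi = min(obs_one), max(obs_one)
bins_one = [bin_index(s, lo, hi, 4) for s in obs_one]
print(bins_one)  # [2, 3, 0, 1, 2]
```

Clamping to `n_bins - 1` keeps the largest sample inside the last bin instead of opening an extra bin for it.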

Step 3. Now create a table containing every possible combination of bins across the data series. So with 4 bins and 3 data series, there are 4x4x4 = 64 possible outcomes in total. (e.g. row1 = obsOne bin1 + obsTwo bin1 + obsThree bin1, row2 = obsOne bin1 + obsTwo bin1 + obsThree bin2, ... row5 = obsOne bin1 + obsTwo bin1 + obsThree binX, row6 = obsOne bin1 + obsTwo bin2 + obsThree bin1, row7 = obsOne bin1 + obsTwo bin2 + obsThree bin2, row9 = obsOne bin1 + obsTwo bin2 + obsThree binX, ...)
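One possible way to build such a table in Python is `itertools.product`. This is only a sketch: it numbers bins from 0 rather than 1, and the variable names are illustrative.

```python
from itertools import product

n_bins = 4
n_series = 3

# Each row is one combination: (bin of obsOne, bin of obsTwo, bin of obsThree).
table = list(product(range(n_bins), repeat=n_series))

print(len(table))   # 64
print(table[:5])    # [(0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 0, 3), (0, 1, 0)]
```

The last series varies fastest, which matches the row order described in the step.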

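Step 4. I now go back to the data series, find where each row of the data falls in the table, and count the observations. (e.g. obsOne[2] obsTwo[2] obsThree[2] = row 30 of the table, obsOne[X] obsTwo[X] obsThree[X] = row 52 of the table.)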
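The table position can also be computed directly from the bin indices without searching the table. A minimal sketch, assuming 0-based bins and rows ordered with the last series varying fastest; `row_number` is a hypothetical helper name:

```python
def row_number(bin_indices, n_bins):
    """Map 0-based bin indices to the 0-based row of the combination table."""
    row = 0
    for b in bin_indices:
        row = row * n_bins + b   # treat the indices as digits in base n_bins
    return row

print(row_number((0, 0, 0), 4))  # 0
print(row_number((0, 1, 0), 4))  # 4
print(row_number((3, 3, 3), 4))  # 63
```

Add 1 to the result to get the 1-based row numbering used in the post.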

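Step 5. I then take only the rows of the table that have at least one match, count how many observations fall on each such row, and divide by the total number of observations in the data series. That gives me the probability of that range combination in the observed data.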
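The counting and dividing can be sketched with `collections.Counter`. The bin triples below are hypothetical, illustrative values, not results computed from the post's data.

```python
from collections import Counter

# Hypothetical (obsOne bin, obsTwo bin, obsThree bin) triples, one per time step.
cells = [(2, 3, 3), (3, 3, 3), (0, 2, 3), (1, 0, 0), (2, 3, 3)]

counts = Counter(cells)          # only combinations that actually occur
total = len(cells)               # total number of observations
probabilities = {cell: n / total for cell, n in counts.items()}

for cell, p in probabilities.items():
    print(cell, p)
```

Because only observed combinations appear in `counts`, the unmatched rows of the full table never need to be stored.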

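I apologize for the basic question; I am not a math expert. I did this many years ago and have forgotten which method I used, but it was much faster than this long (old-fashioned "by hand") approach. I was not using Python at the time; it was some proprietary package in C++. I would like to know whether something in Python can solve this (we are a Python shop now). We can always extend it, so Python is a soft constraint.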

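【Comments】:

I cannot follow your steps at all. "Create X range bins of width Z"? "Create a table containing every possible combination of the data series"? The procedure is not clear to me. What is the goal of the analysis?
I think reputation should be awarded automatically to anyone who reads the whole question. Joking aside, could you simplify it: define the purpose first, then explain?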

Tags: python probability time-series data-analysis

【Solution 1】:

Are you talking about something like this?

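from __future__ import division, print_function
from collections import defaultdict

obsOne = [47, 136, -108, -15, 22]
obsTwo = [448, 321, 122, -207, 269]
obsThree = [381, 283, 429, -393, 242]

class BinParams(object):
    """Split the range of one time series into X equal-width bins."""
    def __init__(self, timeSeries, X):
        self.X = X
        self.mx = max(timeSeries)
        self.mn = min(timeSeries)
        self.Z = (self.mx - self.mn) / X  # bin width
    def index(self, sample):
        # Clamp so the maximum sample lands in the last bin (X - 1), not bin X.
        return min(int((sample - self.mn) // self.Z), self.X - 1)

binsOne = BinParams(obsOne, 4)
binsTwo = BinParams(obsTwo, 4)
binsThree = BinParams(obsThree, 4)

# Count how many observations fall into each combination of bins.
counts = defaultdict(int)
for s1, s2, s3 in zip(obsOne, obsTwo, obsThree):
    posn = binsOne.index(s1), binsTwo.index(s2), binsThree.index(s3)
    counts[posn] += 1

# Probability = observations in the cell / total number of observations.
total = len(obsOne)
for k in counts:
    print(k, counts[k], counts[k] / total)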
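
If NumPy is available, the same joint histogram can probably be computed in one call with `numpy.histogramdd`. This is a sketch of an alternative, not the method the asker remembers; it assumes equal-width bins between each series' minimum and maximum, which is what `bins=4` gives by default.

```python
import numpy as np

obsOne = [47, 136, -108, -15, 22]
obsTwo = [448, 321, 122, -207, 269]
obsThree = [381, 283, 429, -393, 242]

# One row per time step, one column per series: shape (5, 3).
samples = np.column_stack([obsOne, obsTwo, obsThree])

# hist[i, j, k] = number of time steps falling in bins (i, j, k).
hist, edges = np.histogramdd(samples, bins=4)
probs = hist / hist.sum()

# Report only the bin combinations that were actually observed.
for cell in zip(*np.nonzero(probs)):
    print(tuple(int(i) for i in cell), probs[cell])
```

`edges` holds the bin boundaries for each series, which is useful for mapping a cell back to its value ranges.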

【Comments】: