Fisher's exact test analysis and analysing time period: answers

【Question title】: Fisher's exact test analysis and analysing time period
【Posted】: 2014-01-26 12:20:01
【Question description】:

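I want to analyse two variables to test for an association within a data set. One variable is a "string" and the other is a "date" (a time period). As far as I understand, the appropriate test for my purpose is Fisher's exact test.

Because some categories contain many zeros, a chi-square test is not possible. I am considering running Fisher's exact test, but I don't know how, since I am very new to R.

Data sample: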

        Parking locations           Time sequence
        Other locations             9:30-13:00
        Bicycle shed (Ground floor) 17:00-20:00
        Bicycle parking (East side) 6:00-9:30
        Bicycle shed (Ground floor) 13:00-17:00
        Bicycle shed (First floor)  9:30-13:00
        Bicycle shed (First floor)  13:00-17:00
        Bicycle shed (Ground floor) 13:00-17:00
        Bicycle shed (Ground floor) 13:00-17:00
        Supervised bicycle parking  6:00-9:30
        Bicycle shed (Ground floor) 6:00-9:30

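My question is whether this analysis can be run in SPSS, or whether I should use R.
Also, what data type should the Time sequence column have, given that it holds time periods (e.g. 9:30 to 13:00)?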

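【Comments】:

Which hypothesis do you want to test?
stat.ethz.ch/R-manual/R-patched/library/stats/html/…
@SvenHohenstein, I want to know whether there is a relationship between the preferred parking location and the time at which the bicycle is parked.
Maybe use a regex to extract the start time (assuming you want to analyse location against arrival time). Or, if you want the length of the parking time, split the time string so that you can compute the elapsed time, then run a statistical test on the times grouped by location. Use plyr or the aggregate function.
I see that each time sequence occurs multiple times. Is that your raw data? I assume you want the four time intervals (6:00-9:30, 9:30-13:00, 13:00-17:00, 17:00-20:00) rather than the length of the parking time. Is that right?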
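The two readings of the Time sequence column discussed in the comments above can be sketched in R as follows. This is only an illustration: `to_minutes` is a hypothetical helper written for this example, and the four interval labels are taken from the data sample. For Fisher's exact test the interval is simply a categorical variable, so a factor is a natural data type; the start time and duration are only needed for the alternative analysis of parking length.

```r
# Hypothetical helper: convert "H:MM" strings to minutes since midnight.
to_minutes <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(parts, function(p) as.numeric(p[1]) * 60 + as.numeric(p[2]), numeric(1))
}

time_sequence <- c("9:30-13:00", "17:00-20:00", "6:00-9:30", "13:00-17:00")

# Split each interval at the hyphen into a start and an end time.
bounds <- do.call(rbind, strsplit(time_sequence, "-", fixed = TRUE))
start_min    <- to_minutes(bounds[, 1])
end_min      <- to_minutes(bounds[, 2])
duration_min <- end_min - start_min

# Categorical reading: a factor whose levels follow the order of the day.
interval <- factor(
  time_sequence,
  levels = c("6:00-9:30", "9:30-13:00", "13:00-17:00", "17:00-20:00")
)
```

Setting the levels explicitly keeps the columns of any later table in chronological rather than alphabetical order.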

Tags: r spss

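【Solution 1】:

I entered your data into a csv file. (Note: judging by the alignment of the second column, your data looks tab-separated, which would also work.)

Then you can do this in R: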

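# Read the two-column data set (parking location, time interval)
data <- read.csv("~/bikes.csv", header = TRUE)
# Cross-tabulate the two columns; note that the name t shadows base R's t()
t <- table(data)
# Fisher's exact test on the resulting contingency table
fisher.test(t)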

The contents of t and the result of the Fisher test can be seen in this screenshot.

Here is the output, copied:

> t
                         Time.sequence
Parking.locations             13:00-17:00 17:00-20:00 6:00-9:30 9:30-13:00
  Bicycle parking (East side)           0           0         1          0
  Bicycle shed (First floor)            1           0         0          1
  Bicycle shed (Ground floor)           3           1         1          0
  Other locations                       0           0         0          1
  Supervised bicycle parking            0           0         1          0
> fisher.test(t)

    Fisher's Exact Test for Count Data

data:  t 
p-value = 0.419
alternative hypothesis: two.sided

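This is a very basic example. With the command

?fisher.test

you can see that there are some settings for tables larger than 2 x 2. If any of my assumptions is wrong (e.g. how Parking.locations is separated), I will update my answer.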
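As a hedged illustration of those settings: for tables larger than 2 x 2, `fisher.test` accepts a `workspace` argument (memory for the exact network algorithm) and can fall back to a Monte Carlo p-value via `simulate.p.value` and `B`. The sketch below re-enters the contingency table from the output above by hand; the object names `tab`, `exact`, and `simulated` are chosen for this example only.

```r
# Contingency table copied from the output above
# (rows: parking locations, columns: time intervals; filled column by column).
tab <- matrix(
  c(0, 1, 3, 0, 0,   # 13:00-17:00
    0, 0, 1, 0, 0,   # 17:00-20:00
    1, 0, 1, 0, 1,   # 6:00-9:30
    0, 1, 0, 1, 0),  # 9:30-13:00
  nrow = 5,
  dimnames = list(
    Parking.locations = c("Bicycle parking (East side)",
                          "Bicycle shed (First floor)",
                          "Bicycle shed (Ground floor)",
                          "Other locations",
                          "Supervised bicycle parking"),
    Time.sequence = c("13:00-17:00", "17:00-20:00", "6:00-9:30", "9:30-13:00")
  )
)

# Exact test; a larger workspace can help when bigger tables run out of memory.
exact <- fisher.test(tab, workspace = 2e6)

# Monte Carlo alternative with B simulated tables; the seed makes it repeatable.
set.seed(1)
simulated <- fisher.test(tab, simulate.p.value = TRUE, B = 10000)

print(exact$p.value)
print(simulated$p.value)
```

With only ten observations the exact computation is cheap, so the simulated variant matters mainly for much larger tables where the exact algorithm becomes slow or fails.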

【Comments】:

You should copy/paste your code and output instead of posting a picture.
It is only the output, but I can copy it if that helps.

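【Solution 2】:

If I were you, I would make sure your data is in comma-separated format (csv). That way you can simply read it into R with read.csv.

If you want to use both columns as categorical variables, you can simply run this in R: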

fisher.test(parking_location, time_sequence)

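I will update the answer as more specific information becomes available. This approach assumes that the strings (e.g. Bicycle shed (First floor) and Bicycle shed (Ground floor)) are distinct categories and that the time intervals are fixed.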
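A self-contained sketch of that call, under the assumptions just stated, could look like this. The ten rows are typed in from the data sample in the question, and the vector names `parking_location` and `time_sequence` follow the call above; they are not column names confirmed by the asker.

```r
# The ten observations from the data sample in the question.
parking_location <- factor(c(
  "Other locations",
  "Bicycle shed (Ground floor)",
  "Bicycle parking (East side)",
  "Bicycle shed (Ground floor)",
  "Bicycle shed (First floor)",
  "Bicycle shed (First floor)",
  "Bicycle shed (Ground floor)",
  "Bicycle shed (Ground floor)",
  "Supervised bicycle parking",
  "Bicycle shed (Ground floor)"
))

# Fixed intervals, with levels in chronological order.
time_sequence <- factor(
  c("9:30-13:00", "17:00-20:00", "6:00-9:30", "13:00-17:00", "9:30-13:00",
    "13:00-17:00", "13:00-17:00", "13:00-17:00", "6:00-9:30", "6:00-9:30"),
  levels = c("6:00-9:30", "9:30-13:00", "13:00-17:00", "17:00-20:00")
)

# Inspect the cross-tabulation before testing.
tab <- table(parking_location, time_sequence)
print(tab)

# Passing two factors is equivalent to passing their contingency table.
res <- fisher.test(parking_location, time_sequence)
print(res$p.value)
```

Every distinct string becomes its own factor level here, so the two bicycle sheds are treated as separate locations; merging them would mean recoding the factor before building the table.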

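【Comments】:

But parking_location is a string, and it looks like not all of the values are the same.
Yes, but as you pointed out, it depends on whether he wants to treat them as distinct. If they are distinct, then this approach works fine.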