Calculating the Mean of 2nd Column Values for Each Text File: Answers

【Question Title】: Calculating Mean of 2nd Column Values for Each Text File
【Posted】: 2019-01-16 01:14:00
【Question】:

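I have 10 text files, each with multiple rows and 3 columns separated by commas (','). My goal is to compute, for each row, the mean across the 10 text files, using only the second-column values.

For example: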

1.txt: [1,2,3; 4,5,6; 7,8,9; ...]
2.txt: [10,11,12; 13,14,15; 16,17,18; ...]
3.txt: [19,20,21; 22,23,24; 25,26,27; ...]

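I want the mean of the second-column values, say: A=(2+11+20)/3, then B=(5+14+23)/3, then C=(8+17+26)/3

So I would get [A;B;C] => a 3x1 matrix

So far I can only read all the files, but I can't arrange their contents into the array I want.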

import glob
import os

file_list = glob.glob(os.path.join(os.getcwd(), "Chl_96", "*.txt"))

corpus = []

for file_path in file_list:
    with open(file_path) as f_input:
        corpus.append(f_input.read())
print(corpus)

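【Question Discussion】:

Use the split method on the file contents. Split on semicolons to get the rows, then split each row on commas to get the individual elements.
Please edit your question and show the actual format of the data in the text files; it matters.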
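
The split-based approach suggested in the first comment could look roughly like the sketch below. It assumes a file literally contains one bracketed string such as `[1,2,3; 4,5,6; 7,8,9]`, which the question does not confirm; `parse_matrix` is a hypothetical helper name, not something from the original post.

```python
def parse_matrix(text):
    """Parse a string like '[1,2,3; 4,5,6]' into a list of rows of floats.

    Assumes semicolons separate rows and commas separate values.
    """
    body = text.strip().strip("[]")  # drop the surrounding brackets
    rows = []
    for chunk in body.split(";"):    # one chunk per row
        chunk = chunk.strip()
        if not chunk:                # ignore empty pieces, e.g. a trailing ';'
            continue
        rows.append([float(value) for value in chunk.split(",")])
    return rows


rows = parse_matrix("[1,2,3; 4,5,6; 7,8,9]")
second_column = [row[1] for row in rows]  # index 1 is the second column
print(second_column)  # [2.0, 5.0, 8.0]
```

If the files turn out to hold one comma-separated row per line instead, the semicolon split is unnecessary and iterating over the file's lines is enough.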

Tags: python statistics

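【Solution 1】:

Since you haven't clearly described the format of the input data (in my opinion), here is something that assumes each file looks like this:

For example, 2.txt: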

10,11,12
13,14,15
16,17,18

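Code to compute the mean of the second column over the rows of each input file. There are only three sample files, so that is how many means are computed.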

import glob
import os

COL = 1  # Zero-based index of the column (the second) whose values are averaged.
means = []  # One mean per file, for the column specified above.

# Sort the paths so the order of the results is deterministic.
file_list = sorted(glob.glob(os.path.join(os.getcwd(), "Chl_96", "*.txt")))

for file_path in file_list:
    with open(file_path) as f_input:
        # Collect the chosen column from every non-blank line (float also accepts decimals).
        values = [float(line.split(',')[COL]) for line in f_input if line.strip()]
    if values:  # Skip empty files to avoid dividing by zero.
        means.append(sum(values) / len(values))

# Print calculated mean of second column of rows in each file.
print(means)  # -> [5.0, 14.0, 23.0]
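
Note that the code in this answer averages within each file, whereas the question asks for one mean per row position across the files (A, B, C). A minimal sketch of that row-wise variant follows, assuming every file has one comma-separated row per line as in the 2.txt sample; `read_column` and `row_means` are hypothetical helper names introduced here for illustration.

```python
import glob
import os
from statistics import mean

COL = 1  # zero-based index of the second column


def read_column(file_path, col=COL):
    """Return one column of a comma-separated text file as a list of floats."""
    values = []
    with open(file_path) as f_input:
        for line in f_input:
            line = line.strip()
            if line:  # skip blank lines
                values.append(float(line.split(",")[col]))
    return values


def row_means(file_paths, col=COL):
    """Average the chosen column across files, giving one mean per row position."""
    columns = [read_column(path, col) for path in file_paths]
    # zip stops at the shortest file, so rows missing from any file are dropped.
    return [mean(row_values) for row_values in zip(*columns)]


# "Chl_96" is the directory name used in the question.
file_list = sorted(glob.glob(os.path.join(os.getcwd(), "Chl_96", "*.txt")))
print(row_means(file_list))
```

For the three sample files in the question this would give [11.0, 14.0, 17.0], matching A=(2+11+20)/3, B=(5+14+23)/3 and C=(8+17+26)/3. The order of the files does not affect the result, since each mean is taken over all files at a given row position.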

【Discussion】: