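【Posted】:2016-07-04 07:24:18
【Question】:
I have files with columns like the ones below. The sample input shown here is only part of the input.
Please see the link to the full file below. Each file has only two lines.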
Gene 0.4% 0.7% 1.1% 1.4% 1.8% 2.2% 2.5% 2.9% 3.3% 3.6% 4.0% 4.3% 4.7% 5.1% 5.4% 5.8% 6.2% 6.5% 6.9% 7.2% 7.6% 8.0% 8.3% 8.7% 9.1% 9.4% 9.8% 10.1% 10.5% 10.9% 11.2% 11.6% 12.0% 12.3% 12.7% 13.0% 13.4% 13.8% 14.1% 14.5% 14.9% 15.2% 15.6% 15.9% 16.3% 16.7% 17.0% 17.4% 17.8% 18.1% 18.5% 18.8% 19.2% 19.6% 19.9% 20.3% 20.7% 21.0% 21.4% 21.7% 22.1% 22.5% 22.8% 23.2% 23.6% 23.9% 24.3% 24.6% 25.0% 25.4% 25.7% 26.1% 26.4% 26.8% 27.2% 27.5% 27.9% 28.3% 28.6% 29.0% 29.3% 29.7% 30.1% 30.4% 30.8% 31.2% 31.5% 31.9% 32.2% 32.6% 33.0% 33.3% 33.7% 34.1% 34.4% 34.8% 35.1% 35.5% 35.9% 36.2% 36.6% 37.0% 37.3% 37.7% 38.0% 38.4% 38.8% 39.1% 39.5% 39.9% 40.2% 40.6% 40.9% 41.3% 41.7% 42.0% 42.4% 42.8% 43.1% 43.5% 43.8% 44.2% 44.6% 44.9% 45.3% 45.7% 46.0% 46.4% 46.7% 47.1% 47.5% 47.8% 48.2% 48.6% 48.9% 49.3% 49.6% 50.0% 50.4% 50.7% 51.1% 51.4% 51.8% 52.2% 52.5% 52.9% 53.3% 53.6% 54.0% 54.3% 54.7% 55.1% 55.4% 55.8% 56.2% 56.5% 56.9% 57.2% 57.6% 58.0% 58.3% 58.7% 59.1% 59.4% 59.8% 60.1% 60.5% 60.9% 61.2% 61.6% 62.0% 62.3% 62.7% 63.0% 63.4% 63.8% 64.1% 64.5% 64.9% 65.2% 65.6% 65.9% 66.3% 66.7% 67.0% 67.4% 67.8% 68.1% 68.5% 68.8% 69.2% 69.6% 69.9% 70.3% 70.7% 71.0% 71.4% 71.7% 72.1% 72.5% 72.8% 73.2% 73.6% 73.9% 74.3% 74.6% 75.0% 75.4% 75.7% 76.1% 76.4% 76.8% 77.2% 77.5% 77.9% 78.3% 78.6% 79.0% 79.3% 79.7% 80.1% 80.4% 80.8% 81.2% 81.5% 81.9% 82.2% 82.6% 83.0% 83.3% 83.7% 84.1% 84.4% 84.8% 85.1% 85.5% 85.9% 86.2% 86.6% 87.0% 87.3% 87.7% 88.0% 88.4% 88.8% 89.1% 89.5% 89.9% 90.2% 90.6% 90.9% 91.3% 91.7% 92.0% 92.4% 92.8% 93.1% 93.5% 93.8% 94.2% 94.6% 94.9% 95.3% 95.7% 96.0% 96.4% 96.7% 97.1% 97.5% 97.8% 98.2% 98.6% 98.9% 99.3% 99.6% 100.0% 0.4% 0.7% 1.1% 1.4% 1.8% 2.2% 2.5% 2.9% 3.3% 3.6% 4.0% 4.3% 4.7% 5.1% 5.4% 5.8% 6.2% 6.5% 6.9% 7.2% 7.6% 8.0% 8.3% 8.7% 9.1% 9.4% 9.8% 10.1% 10.5% 10.9% 11.2% 11.6% 12.0% 12.3% 12.7% 13.0% 13.4% 13.8% 14.1% 14.5% 14.9% 15.2% 15.6% 15.9% 16.3% 16.7% 17.0% 17.4% 17.8% 18.1% 18.5% 18.8% 19.2% 19.6% 19.9% 20.3% 20.7% 21.0% 21.4% 21.7% 22.1% 22.5% 22.8% 23.2% 23.6% 
23.9% 24.3% 24.6% 25.0% 25.4% 25.7% 26.1% 26.4% 26.8% 27.2% 27.5% 27.9% 28.3% 28.6% 29.0% 29.3% 29.7% 30.1% 30.4% 30.8% 31.2% 31.5% 31.9% 32.2% 32.6% 33.0% 33.3% 33.7% 34.1% 34.4% 34.8% 35.1% 35.5% 35.9% 36.2% 36.6% 37.0% 37.3% 37.7% 38.0% 38.4% 38.8% 39.1% 39.5% 39.9% 40.2% 40.6% 40.9% 41.3% 41.7% 42.0% 42.4% 42.8% 43.1% 43.5% 43.8% 44.2% 44.6% 44.9% 45.3% 45.7% 46.0% 46.4% 46.7% 47.1% 47.5% 47.8% 48.2% 48.6% 48.9% 49.3% 49.6% 50.0% 50.4% 50.7% 51.1% 51.4% 51.8% 52.2% 52.5% 52.9% 53.3% 53.6% 54.0% 54.3% 54.7% 55.1% 55.4% 55.8% 56.2% 56.5% 56.9% 57.2% 57.6% 58.0% 58.3% 58.7% 59.1% 59.4% 59.8% 60.1% 60.5% 60.9% 61.2% 61.6% 62.0% 62.3% 62.7% 63.0% 63.4% 63.8% 64.1% 64.5% 64.9% 65.2% 65.6% 65.9% 66.3% 66.7% 67.0% 67.4% 67.8% 68.1% 68.5% 68.8% 69.2% 69.6% 69.9% 70.3% 70.7% 71.0% 71.4% 71.7% 72.1% 72.5% 72.8% 73.2% 73.6% 73.9% 74.3% 74.6% 75.0% 75.4% 75.7% 76.1% 76.4% 76.8% 77.2% 77.5% 77.9% 78.3% 78.6% 79.0% 79.3% 79.7% 80.1% 80.4% 80.8% 81.2% 81.5% 81.9% 82.2% 82.6% 83.0% 83.3% 83.7% 84.1% 84.4% 84.8% 85.1% 85.5% 85.9% 86.2% 86.6% 87.0% 87.3% 87.7% 88.0% 88.4% 88.8% 89.1% 89.5% 89.9% 90.2% 90.6% 90.9% 91.3% 91.7% 92.0% 92.4% 92.8% 93.1% 93.5% 93.8% 94.2% 94.6% 94.9% 95.3% 95.7% 96.0% 96.4% 96.7% 97.1% 97.5% 97.8% 98.2% 98.6% 98.9% 99.3% 99.6% 100.0%
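Basically, this is what I need to do.
a. Start from the second column, which here is 0.4%.
b. Keep going until you hit "10" in the header name. If a header is exactly 10.0%, include that column as well. If not, include only the columns before it. In this example, since we have 10.1% (column 29), we include the columns from 0.4% (column 2) through 9.8% (column 28). If column 29 were 10.0%, it would be included too.
c. Average the values of those columns in the second row (the data is not shown here; please follow this link for the full dataset: https://goo.gl/W8jND7). In this example, that is from 0.4% (column 2) through 9.8% (column 28).
d. In the output, print the first column, "Gene", followed by this average, with the column headers as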
Gene Average_10%
e. Then start checking from 10.1% (column 29) until you hit "20" in the header name. Repeat steps b through d, and print the output as
Gene Average_10% Average_20%
Repeat this until you have
Gene Average_10% Average_20% Average_30% Average_40% Average_50% Average_60% Average_70% Average_80% Average_90% Average_100%
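f. Once you reach 100%, one dataset is complete.
g. If you look closely at my column headers here, there is another set of 0.4%-100% columns after the first 100%. The input file at the link above contains 13 of these 0.4%-100% sets.
i. I have multiple files, and the headers can be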
1% 2% 3%....100%
1.5% 2.5% 3.5%....100%
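It varies from file to file. But the averaging logic (when you hit "10", "20", and so on) is always the same. The number of samples, 13, is also the same, which means each file contains 13 sets ending in 100%.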
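To make the grouping rule in steps a to i concrete, here is a minimal sketch of it in Python (not awk, and not a finished solution). It assumes whitespace-separated fields, that a new 0.4%-100% set starts whenever the header percentage drops, and that each header belongs to the smallest multiple of 10 that is greater than or equal to it. The function names `bucket_columns` and `average_row`, and the tiny header and row used in the demo, are made up for illustration.

```python
def bucket_columns(header):
    """Group column indices by sample and by 10%-step threshold.

    header: list of header fields, e.g. ["Gene", "0.4%", "0.7%", ...].
    Returns one dict per sample, mapping a threshold (10, 20, ..., 100)
    to the list of column indices whose header falls under it.
    """
    samples = []
    current = None
    previous = None
    for index, name in enumerate(header[1:], start=1):
        pct = float(name.rstrip("%"))
        # Assumption: a drop in percentage (e.g. 100.0% followed by 0.4%)
        # marks the start of the next sample.
        if previous is None or pct < previous:
            current = {}
            samples.append(current)
        # Smallest multiple of 10 that is >= pct, so 10.0% lands in the
        # 10% group while 10.1% lands in the 20% group.
        threshold = next(t for t in range(10, 101, 10) if pct <= t)
        current.setdefault(threshold, []).append(index)
        previous = pct
    return samples


def average_row(header, row):
    """Return (gene name, list of {threshold: average} dicts, one per sample)."""
    averages = []
    for buckets in bucket_columns(header):
        averages.append({
            t: sum(float(row[i]) for i in cols) / len(cols)
            for t, cols in sorted(buckets.items())
        })
    return row[0], averages


# Made-up miniature input: two samples, each ending at 100.0%.
header = "Gene 9.8% 10.0% 10.1% 100.0% 9.8% 10.1% 100.0%".split()
row = "geneA 1 3 5 7 2 4 6".split()

name, averages = average_row(header, row)
print(name, averages)
# geneA [{10: 2.0, 20: 5.0, 100: 7.0}, {10: 2.0, 20: 4.0, 100: 6.0}]
```

In the demo, 9.8% and 10.0% are averaged together as Average_10%, while 10.1% starts the Average_20% group, which is the behaviour described in step b. How the 13 samples should be laid out in the final output is not covered by this sketch.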
【Comments】:
Tags: linux awk average multiple-columns