Save output from the 'stats' command in gnuplot

【Question title】: Save output from 'stats' command in gnuplot
【Posted】: 2019-05-06 10:54:06
【Question】:

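I want to run a statistical analysis on the output files of a benchmark that ran on 600 nodes. In particular, I need the maximum, upper quartile, median, lower quartile, minimum, and mean. My output is in the files testrun16-[1-600].

Here is the code: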

ListofFiles = system('dir testrun16-*')

set print 'MaxValues.dat'
do for [file in ListofFiles]{
stats file using 1 nooutput
print STATS_max
}

set print 'upquValues.dat'
do for [file in ListofFiles]{
stats file using 1 nooutput
print STATS_up_quartile
}

set print 'MedianValues.dat'
do for [file in ListofFiles]{
stats file using 1 nooutput
print STATS_median
}

set print 'loquValues.dat'
do for [file in ListofFiles]{
stats file using 1 nooutput
print STATS_lo_quartile
}

set print 'MinValues.dat'
do for [file in ListofFiles]{
stats file using 1 nooutput
print STATS_min
}

set print 'MeanValues.dat'
do for [file in ListofFiles]{
stats file using 1 nooutput
print STATS_mean
}

unset print
set term x11
set title 'CLAIX2016 distribution of OSnoise using FWQ'
set xlabel "Number of Nodes"
set ylabel "Runtime [ns]"
plot 'MaxValues.dat' using 1 title 'maximum value', 'upquValues.dat' title 'upper quartile', 'MedianValues.dat' using 1 title 'median value', 'loquValues.dat' title 'lower quartile', 'MinValues.dat' title 'minimum value', 'MeanValues.dat' using 1 title 'mean value';
set term png
set output 'noises.png'
replot

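I get these values and can plot them. However, the tuples of each run get mixed up. The mean of testrun16-17.dat is plotted at x=317, and its minimum ends up somewhere else again.

How can I save the output while keeping the tuples together, and plot each node at its actual position?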

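【Question comments】:

Does dir testrun16-* return the file names in the order you want? That is, is testrun16-17.dat the 17th output of that command?
I just tested it by adding a sort option, dir testrun16-* -v, which at least sorts them in the console output. However, gnuplot still puts the 17th file at position 317.
Apparently I can't edit comments? Anyway. I also renamed the files with numbers below 10 to the format testrun16-001.dat and so on. This now pushes the 17th entry to position 65.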

Tags: linux statistics gnuplot

【Solution 1】:

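Windows (and Linux?) may have their own peculiar ways of sorting (or not sorting) the entries of a directory listing. To eliminate this uncertainty, you can loop over the files by number. However, this assumes that every number from 1 to the maximum (=FilesCount, 600 in your case) actually exists. Sorry, you tagged Linux but I only know Windows; the Windows command for getting a list of bare file names is 'dir /B testrun16-*'.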
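A minimal sketch of looping by number, which also skips numbers whose file is missing. It assumes the files are named testrun16-<n>.dat without zero padding and that gnuplot's system() runs a POSIX shell (the question is tagged Linux). FileExists is a hypothetical helper written for this sketch, not a gnuplot built-in.

```gnuplot
### sketch: loop over files by number, skipping missing ones
FileRootName = 'testrun16'
FilesCount   = 600      # highest node number, taken from the question

# hypothetical helper: asks the shell whether the file exists (POSIX 'test' assumed)
FileExists(f) = int(system("test -f '".f."' && echo 1 || echo 0"))

do for [i=1:FilesCount] {
    # assumed naming scheme: testrun16-1.dat ... testrun16-600.dat
    FILE = sprintf('%s-%d.dat', FileRootName, i)
    if (FileExists(FILE)) {
        stats FILE u 1 nooutput
        # the loop index is the node number, so every value stays tied to its node
        print sprintf("%d %g %g", i, STATS_min, STATS_mean)
    }
}
### end of sketch
```

Because the node number comes from the loop index, the order in which the operating system lists the directory no longer matters.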

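Is there a particular reason why you write the statistics into 7 different files? Why not put them into a single file?

Something like this (modified after the OP's comment):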

### batch statistics
reset session

FileRootName = 'testrun16'
FileList = system('dir /B '.FileRootName.'-*')
FilesCount =  words(FileList)
print "Files found: ", FilesCount

# function for extracting the number from the filename 
GetFileNumber(s) = int(s[strstrt(s,"-")+1:strstrt(s,".dat")-1])

set print FileRootName.'_Statistics.dat'
    print "File Max UpQ Med LoQ Min Mean"
    do for [FILE in FileList] {
        stats FILE u 1 nooutput
        print sprintf("%d %g %g %g %g %g %g", \
        GetFileNumber(FILE), \
        STATS_max, STATS_up_quartile, STATS_median, \
        STATS_lo_quartile, STATS_min, STATS_mean)
    }
set print

plot FileRootName.'_Statistics.dat' \
       u 1:2 title 'maximum value', \
    '' u 1:3 title 'upper quartile', \
    '' u 1:4 title 'median value', \
    '' u 1:5 title 'lower quartile', \
    '' u 1:6 title 'minimum value', \
    '' u 1:7 title 'mean value'
### end of code
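Since the question is tagged Linux, here is a hedged sketch of how the listing and a line plot might look there. It assumes GNU ls (for the -v option) and that testrun16_Statistics.dat has already been written by the script above. Only the median column is plotted, to keep the sketch short.

```gnuplot
### sketch: Linux listing and line plot (assumes GNU ls and the statistics file from above)
FileRootName = 'testrun16'

# -1: one name per line; -v: natural sort, so testrun16-2.dat comes before testrun16-10.dat
FileList = system('ls -1v '.FileRootName.'-*')
print "Files found: ", words(FileList)

# 'smooth unique' sorts the points by x (column 1) before connecting them,
# so the curve does not zigzag if the rows were written out of numeric order
plot FileRootName.'_Statistics.dat' u 1:4 smooth unique with lines title 'median value'
### end of sketch
```

The sort order of the listing is not critical here, because the x value of every point is the number extracted from the file name rather than the row position.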

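【Comments】:

Thanks for your reply! Unfortunately, not all numbers actually exist, because some nodes were not scheduled or were unavailable due to maintenance. I created 7 files because my approach overwrote the file every time I tried to produce a single one, so thanks for fixing that! This should also greatly improve the runtime of the script. I will try your code with the sort option when loading the files, and without the explicit for loop. Or do you have a better idea? (Also upvoted, but it doesn't show because of my reputation limit.)
OK. Then you have to extract the number from the file name... see the modified code.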