I'm assuming a Unix shell (i.e. bash).
Read the man page for the sort command:
man sort
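From the man page:
The locale specified by the environment affects sort order. Set LC_ALL=C to get the traditional sort order that uses native byte values.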
export LC_ALL=C
sort -t , -k 10,10 -n censusBlockDensities.csv
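Explanation of the flags:
-t ,: use a comma as the field separator.
-k 10,10: sort on the 10th field only (start, stop); fields are numbered from 1, not 0.
KEYDEF is F[.C][OPTS][,F[.C][OPTS]] for start and stop position, where F is a field number and C a character position in the field; both are origin 1, and the stop position defaults to the line's end. If neither -t nor -b is in effect, characters in a field are counted from the beginning of the preceding whitespace. OPTS is one or more single-letter ordering options [bdfgiMhnRrV], which override global ordering options for that key. If no key is given, use the entire line as the key.
-n: perform a numeric sort instead of the default alphanumeric sort (alternatively, append "n" to the -k argument, as mentioned in the comments below).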
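The two ways of asking for a numeric comparison described in the flag list can be compared side by side. This is a minimal sketch, assuming a POSIX-style sort; the file sample.csv and its three rows are made up for illustration and are not the census data.

```shell
# Hypothetical three-row sample; field 10 holds the value to sort on.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.csv"
printf '%s\n' \
  'a,1,1,x,0,0,0,0,5,100.5' \
  'b,1,1,x,0,0,0,0,5,9.25' \
  'c,1,1,x,0,0,0,0,5,20' > "$sample"

# Global -n flag: the key inherits the numeric comparison.
global_n=$(LC_ALL=C sort -t , -k 10,10 -n "$sample")

# Per-key option: the trailing "n" makes only this key numeric.
per_key_n=$(LC_ALL=C sort -t , -k 10,10n "$sample")

# Without -n the key is compared byte by byte, so "100.5" sorts before "20".
no_n=$(LC_ALL=C sort -t , -k 10,10 "$sample")

printf '%s\n' "$global_n"
rm -r "$tmpdir"
```

The per-key form is the more useful one when a command has several -k keys and only some of them are numeric.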
censusBlockDensities.csv
17001,1,1010,Adams IL,39.960197,-91.373363,0.08861,00.037495258,23,613.41090336
17001,1,1020,Adams IL,39.955861,-91.354113,0.19038,0.493081936,2,4.05612100686
17001,1,1031,Adams IL,39.956978,-91.369,0.002268,0.005874093,0,0,22.8543955664
17001,1,1041,Adams IL,39.94333,-91.345319,0.000358,0.0009236128,0,0480.4506562
17001,1,1051,Adams IL,39.948201,-91.352052,0.213797,0.553731688,64,115.5794427
Output:
17001,1,1020,Adams IL,39.955861,-91.354113,0.19038,0.493081936,2,4.05612100686
17001,1,1031,Adams IL,39.956978,-91.369,0.002268,0.005874093,0,0,22.8543955664
17001,1,1051,Adams IL,39.948201,-91.352052,0.213797,0.553731688,64,115.5794427
17001,1,1041,Adams IL,39.94333,-91.345319,0.000358,0.0009236128,0,0480.4506562
17001,1,1010,Adams IL,39.960197,-91.373363,0.08861,00.037495258,23,613.41090336
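EDIT: Helpful comments pointed out that my answer was wrong. You also need the -n flag to perform a numeric sort (the default is alphanumeric). I have amended my answer to include it. You can also verify that it works by sorting in reverse order with the -r flag. I also added the stop field index to the -k 10 argument, as described in another post.
In addition, you should check the input file to make sure every line has the same number of fields. An extra comma shifts the later values, so field 10 of that record is not the one you expect. The command below prints the number of commas on each line: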
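The reverse-order check can be sketched as follows. It assumes a small made-up file (sample.csv, not the census data) whose keys are all distinct, so that the descending output should be exactly the ascending output read bottom to top.

```shell
# Hypothetical three-row sample with distinct values in field 10.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.csv"
printf '%s\n' \
  'a,1,1,x,0,0,0,0,5,100.5' \
  'b,1,1,x,0,0,0,0,5,9.25' \
  'c,1,1,x,0,0,0,0,5,20' > "$sample"

ascending=$(LC_ALL=C sort -t , -k 10,10 -n "$sample")
descending=$(LC_ALL=C sort -t , -k 10,10 -n -r "$sample")

# Flip the ascending result line by line with awk (avoids relying on tac).
flipped=$(printf '%s\n' "$ascending" |
  awk '{ line[NR] = $0 } END { for (i = NR; i >= 1; i--) print line[i] }')

if [ "$flipped" = "$descending" ]; then
  echo "reverse order matches"
else
  echo "reverse order differs"
fi
rm -r "$tmpdir"
```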
awk '{print gsub(/,/,"")}' censusBlockDensities.csv
9
9
10 <-- the third record has an additional field
9
9
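Instead of reading the comma counts by eye, you can have awk report only the offending records. This is a sketch under the assumption that every record should have 10 fields and that no field contains a quoted comma; sample.csv and its rows are hypothetical.

```shell
# Hypothetical sample in which the second record has one field too many.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.csv"
printf '%s\n' \
  'a,1,1,x,0,0,0,0,5,100.5' \
  'b,1,1,x,0,0,0,0,0,0,9.25' \
  'c,1,1,x,0,0,0,0,5,20' > "$sample"

expected=10  # assumed number of fields per record

# NF is the field count of the current line, NR its line number.
bad_rows=$(awk -F , -v want="$expected" \
  'NF != want { print NR ": " NF " fields" }' "$sample")

printf '%s\n' "$bad_rows"
rm -r "$tmpdir"
```

An empty result means every line has the expected number of fields.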