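【Posted】: 2011-05-11 11:08:26
【Question】:
I have a CSV file in which each row defines one room in a given building. Besides the room, each row also has a floor field. What I want to extract is every floor in every building.
My file looks like this...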
"u_floor","u_room","name"
0,"00BDF","AIRPORT TEST "
0,0,"BRICKER HALL, JOHN W "
0,3,"BRICKER HALL, JOHN W "
0,5,"BRICKER HALL, JOHN W "
0,6,"BRICKER HALL, JOHN W "
0,7,"BRICKER HALL, JOHN W "
0,8,"BRICKER HALL, JOHN W "
0,9,"BRICKER HALL, JOHN W "
0,19,"BRICKER HALL, JOHN W "
0,20,"BRICKER HALL, JOHN W "
0,21,"BRICKER HALL, JOHN W "
0,25,"BRICKER HALL, JOHN W "
0,27,"BRICKER HALL, JOHN W "
0,29,"BRICKER HALL, JOHN W "
0,35,"BRICKER HALL, JOHN W "
0,45,"BRICKER HALL, JOHN W "
0,59,"BRICKER HALL, JOHN W "
0,60,"BRICKER HALL, JOHN W "
0,61,"BRICKER HALL, JOHN W "
0,63,"BRICKER HALL, JOHN W "
0,"0006M","BRICKER HALL, JOHN W "
0,"0008A","BRICKER HALL, JOHN W "
0,"0008B","BRICKER HALL, JOHN W "
0,"0008C","BRICKER HALL, JOHN W "
0,"0008D","BRICKER HALL, JOHN W "
0,"0008E","BRICKER HALL, JOHN W "
0,"0008F","BRICKER HALL, JOHN W "
0,"0008G","BRICKER HALL, JOHN W "
0,"0008H","BRICKER HALL, JOHN W "
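What I want is all the floors of all the buildings.
I am using cat, awk, sort and uniq to get this list, but the "," inside the building name field (e.g. "BRICKER HALL, JOHN W") is causing problems and throwing off my whole CSV generation.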
cat Buildings.csv | awk -F, '{print $1","$2}' | sort | uniq > Floors.csv
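How can I get awk to split on commas but ignore the commas inside quoted ("") fields? Or does anyone have a better solution?
Based on the answer that suggested using an awk CSV parser, I was able to arrive at a solution: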
cat Buildings.csv | awk -f csv.awk | awk -F" -> 2|" '{print $2}' | awk -F"|" '{print $2","$3}' | sort | uniq > floors.csv
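First the file goes through the csv awk program, and from there I split on "-> 2|", which is based on that program's output format. The print $2 there prints only the parsed CSV content, because the program prints the original line followed by "-> #", where # is the count of fields parsed from the CSV (i.e. the columns). From there I can split this awk csv result on "|", which is what the program uses in place of the commas. Then sort, uniq, pipe the output to a file, and done!
Thanks for your help.
【Comments】: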
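For comparison, here is a minimal sketch that avoids the external parser. It is not the csv.awk approach described above. It assumes the desired output is the unique floor/building-name pairs, that the floor column is always unquoted, that the building name is always the last field and always quoted, and that no field contains an embedded double quote. The function name extract_floors is hypothetical. Splitting each line on the double quote instead of the comma keeps the commas inside the name intact:

```shell
#!/bin/sh
# Hypothetical helper: reads the CSV on stdin and writes unique
# floor,"building name" lines. Relies on the assumptions listed above.
extract_floors() {
    awk -F'"' '
        NR > 1 {                                # skip the header row
            split($1, head, ",")                # text before the first quote begins with the floor
            print head[1] ",\"" $(NF - 1) "\""  # the last quoted field is the building name
        }' | sort -u
}

# A small sample in the same shape as Buildings.csv
result=$(extract_floors <<'EOF'
"u_floor","u_room","name"
0,"00BDF","AIRPORT TEST "
0,0,"BRICKER HALL, JOHN W "
0,3,"BRICKER HALL, JOHN W "
1,"0008A","BRICKER HALL, JOHN W "
EOF
)

printf '%s\n' "$result"
```

Against the real file this would be run as extract_floors < Buildings.csv > Floors.csv. If any of the stated assumptions does not hold for the data, a real CSV parser remains the safer choice.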