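Regarding "Additionally, for whatever reason, said Imperial measurements contain an excess of quotation marks": when you write a measurement in feet and inches, ' stands for feet and " stands for inches, so 5 feet 11 inches is written 5' 11". In a CSV whose fields are quoted, like "foo", you need some way to include a literal " inside a field, and one way to do that in some CSV formats (e.g. as exported from Excel) is to double the " to escape it. So foo"bar inside a quoted field is written "foo""bar". Now back to 5' 11": the same logic applies, so to include it in a quoted field you write "5' 11""", where the "" before the final " is the escaped representation of the " that belongs to the field's contents. See "What's the most robust way to efficiently parse CSV using awk?" for references to the applicable CSV "standards" and more information on parsing CSV with the standard UNIX tool awk.
Regarding your specific problem: rather than converting one number at a time using hard-coded values, convert them all at once algorithmically. Using GNU awk for FPAT: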
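To make the quote-doubling rule concrete, here is a minimal sketch using POSIX shell and awk. The function names csv_quote and csv_unquote are hypothetical helpers made up for illustration, and the sketch assumes each value is a single line of text:

```shell
# csv_quote: wrap a value in double quotes, doubling any embedded " characters.
csv_quote() {
    printf '%s\n' "$1" | awk '{ gsub(/"/, "\"\""); print "\"" $0 "\"" }'
}

# csv_unquote: strip the enclosing quotes, then collapse each "" back to ".
csv_unquote() {
    printf '%s\n' "$1" | awk '{ sub(/^"/, ""); sub(/"$/, ""); gsub(/""/, "\""); print }'
}

csv_quote "5' 11\""        # prints "5' 11"""
csv_unquote '"foo""bar"'   # prints foo"bar
```

The three quotes at the end of the first result are the doubled inch mark followed by the quote that closes the field.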
$ cat tst.awk
BEGIN {
FPAT = "([^,]*)|(\"[^\"]+\")"
OFS = ","
}
{
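# $4 looks like "5' 11""", so splitting on runs of non-digits gives
# feetinches[1] = "" (before the opening quote), [2] = feet, [3] = inches
split($4,feetinches,/[^0-9]+/)
ft = feetinches[2] + (feetinches[3] / 12)
# 3.28084 feet per metre, times 100 for centimetres; %.2f keeps two decimals
$4 = sprintf("\"%.2f\"", 100 * ft / 3.28084)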
print
}
$ cat file
"Female","Hispanic",25,"5' 11"""
"Male","Scottish",54,"6' 1"""
"Female","English",12,"4' 7"""
"TBD","Martian",935,"8' 5"""
$ awk -f tst.awk file
"Female","Hispanic",25,"180.34"
"Male","Scottish",54,"185.42"
"Female","English",12,"139.70"
"TBD","Martian",935,"256.54"
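As a cross-check on the numbers above, the arithmetic can also be done with the exact definition of 1 inch = 2.54 cm instead of the 3.28084 feet-per-metre factor. This is a rough sketch rather than part of the script above: ft_in_to_cm is a hypothetical helper name, and it assumes the argument is the already unquoted value with feet first and inches second:

```shell
# ft_in_to_cm: convert a measurement such as 5' 11" to centimetres.
# Assumes the argument is the unquoted value: feet first, then inches.
ft_in_to_cm() {
    printf '%s\n' "$1" | awk '{
        split($0, part, /[^0-9]+/)          # part[1] = feet, part[2] = inches
        inches = 12 * part[1] + part[2]     # total length in inches
        printf "%.2f\n", inches * 2.54      # 1 inch is exactly 2.54 cm
    }'
}

ft_in_to_cm "5' 11\""    # prints 180.34
ft_in_to_cm "4' 7\""     # prints 139.70
```

Both approaches agree to two decimal places for the sample data, since 3.28084 is the rounded value of 1 / 0.3048.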