Get a file pointer in bash and read lines arbitrarily: answers

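【Question title】: Get a file pointer in bash and read lines arbitrarily
【Posted】: 2025-12-14 07:50:01
【Question description】:

I have a bash script in which I need to read a file line by line. I know how the usual while read line works, but my program does not fit well into a while loop. I have two files and need to compare them line by line under certain conditions (not a diff: the condition is whether a line in one file starts with a line in the other). Currently I have a Java version of the program, which has three nested loops; the loops over the two files are interleaved, and I need to break out of the nested loops (I know how to do that). So I would like an elegant way to do the following basic tasks in bash (the code below is from my Java program):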

BufferedReader reader = new BufferedReader(new FileReader(inputFile)); // initialize a file pointer
reader.ready();                                                        // whether the pointer is at the end of the file (used in while and if conditions)
lineStr = reader.readLine();                                           // read next line

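All the solutions I found online use the canonical while read line construct, but my program cannot be adapted to it. So I want to manipulate the file with more control.

【Comments】:

Tags: linux bash file-io io

【Solution 1】:

Here is another approach that reads each file into an array and processes them (make sure you don't exhaust memory with multi-GB files):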

#!/bin/bash

## test input
test -r "$1" && test -r "$2" || {
    printf "\n  %s error: file not readable, usage: %s <file1> <file2>\n\n" "${0##*/}" "${0##*/}" >&2
    exit 1
}

## save/set internal field separator to break on newlines
oldIFS=$IFS
IFS=$'\n'
let linecnt=0

## declare arrays to hold lines from each file and read files
declare -a f1
declare -a f2

f1=( `< "$1"` )
f2=( `< "$2"` )

lines_f1=${#f1[@]}
lines_f2=${#f2[@]}

## test number of lines in each, iterate over lesser of two
if test "$lines_f1" -le "$lines_f2"; then
    lines=$lines_f1
else
    lines=$lines_f2
fi

## iterate over lines in each file doing something with them
for ((i = 0; i < $lines; i++)); do

    ## do something with the lines
    printf "f1 [%d] : %s\n" $i "${f1[$i]}"
    printf "f2 [%d] : %s\n\n" $i "${f2[$i]}"

done

IFS="$oldIFS"

exit 0

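【Discussion】:

This looks wrong: f1=( `< "$1"` ) reads the words of the file into the array. To read lines into an array, you need a while read loop or the bash v4 command mapfile.
It is correct; note the IFS=$'\n' above. See man bash and search for IFS (internal field separator). With it set to $'\n', you read whole lines, up to the next newline. Try it with any 2 files. Say cp ~/.bashrc bashrc.txt, then bash cmpfiles.sh ~/.bashrc bashrc.txt
Right, I missed that. It would help to assign IFS closer to where its value is actually used. Your style takes a little getting used to.
Yes, you can move the declare -a statements above the IFS setting, and that would do it.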
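
The mapfile builtin mentioned in the discussion can be sketched as follows. This is only an illustration, not the answer author's code: it assumes bash 4 or later, and the temporary files and their contents are made up for the demo.

```shell
#!/bin/bash
# Sketch only: demo files and their contents are assumptions for illustration.
tmpdir=$(mktemp -d)
printf 'alpha beta\n\n*\n' > "$tmpdir/f1"   # includes an empty line and a bare *
printf 'one\ntwo\n'        > "$tmpdir/f2"

# mapfile stores one line per array element; -t strips the trailing newline.
mapfile -t f1 < "$tmpdir/f1"
mapfile -t f2 < "$tmpdir/f2"

# Iterate over the shorter of the two arrays.
lines=$(( ${#f1[@]} < ${#f2[@]} ? ${#f1[@]} : ${#f2[@]} ))
for (( i = 0; i < lines; i++ )); do
    printf 'f1 [%d] : %s\n' "$i" "${f1[i]}"
    printf 'f2 [%d] : %s\n' "$i" "${f2[i]}"
done

rm -r "$tmpdir"
```

Unlike an unquoted command substitution split on IFS=$'\n', mapfile keeps empty lines as empty elements and does not apply pathname expansion to a line such as *, and IFS does not need to be changed.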

【Solution 2】:

To compare two files line by line in a loop, you can do this:

while read -r -u 4 A && read -r -u 5 B; do
    : # do something with "$A" and "$B"
done 4< file1.txt 5< file2.txt

or

for (( ;; )); do
    read -r -u 4 A || {
        # read error or EOF: report it here if needed, then leave the loop
        break
    }
    read -r -u 5 B || {
        # handle this the same way
        break
    }
    : # do something with "$A" and "$B"
done 4< file1.txt 5< file2.txt
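
For something closer to the Java reader in the question, a descriptor can also be opened once with exec and read from wherever the logic needs the next line. The following is a minimal sketch under stated assumptions: it relies on bash 4.1 or later for the {name} descriptor allocation, and the variable names (fp1, fp2, matches) and demo file contents are hypothetical.

```shell
#!/bin/bash
# Sketch only: file names, contents, and variable names are made up.
tmpdir=$(mktemp -d)
printf 'foo\nbar\n'         > "$tmpdir/prefixes.txt"
printf 'foobar\nbaz\nqux\n' > "$tmpdir/lines.txt"

# Open both files; bash picks free descriptors and stores them in fp1 and fp2.
exec {fp1}< "$tmpdir/prefixes.txt"
exec {fp2}< "$tmpdir/lines.txt"

matches=0
# Each read advances that descriptor by one line and fails at end of file.
while IFS= read -r -u "$fp1" prefix; do
    IFS= read -r -u "$fp2" line || break      # second file ran out first
    if [[ $line == "$prefix"* ]]; then         # does line start with prefix?
        printf '%s starts with %s\n' "$line" "$prefix"
        matches=$(( matches + 1 ))
    fi
done

# Close the descriptors when done.
exec {fp1}<&-
exec {fp2}<&-
rm -r "$tmpdir"
```

Because the descriptors stay open until they are closed explicitly, the read calls do not have to sit in a loop condition; they can appear in nested loops or if tests. Note that read returns a non-zero status on a final line that lacks a trailing newline, even though it still fills the variable.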

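【Discussion】:

Can you tell me what 4 and 5 are? Are they just arbitrary file variable names?
By default, opening a file with a redirection replaces standard input (fd 0), but you can give it a custom fd instead, such as 4 and 5. The -u option of read specifies which fd to read from. (fd = file descriptor)
@konsolebox Very elegant answer. These are going into my toolbox :p
Is there any advantage to writing the less portable form for (( ;; )); do instead of while true ; do?
for (( )) has its own handler. while true or while : first calls true or :, which returns 0, then checks $? to decide whether the loop continues; I find that loop slower or less efficient.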