Linux: Block until a string is matched in a file ("tail + grep with blocking") - Answers

【Question title】: Linux: Block until a string is matched in a file ("tail + grep with blocking")
【Posted】: 2024-01-11 03:11:01
【Question】:

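Is there a one-liner in bash/GNU tools that blocks until a string is matched in a file? Ideally with a timeout. I would like to avoid a multi-line loop.

Update: It seems I should have emphasized that I want the process to end once the string is matched.

【Question comments】:

Tags: linux grep block gnu

【Solution 1】: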

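Thanks to both of you for the answers, but the important part is that the process blocks until the string is found and then ends. I found this: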

grep -q 'PATTERN' <(tail -f file.log)

-q is not very portable, but I will only be using Red Hat Enterprise Linux, so that is fine. And with a timeout:

timeout 180 grep -q 'PATTERN' <(tail -f file.log)

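【Comments】:

Not sure why you say -q is not portable; it is specified by POSIX.
@devnull The GNU grep man page advises that portable scripts avoid -q because some older versions of Unix do not have it. It suggests redirecting to /dev/null instead; if that is a concern, you can use grep -m1 'PATTERN' <(tail -f file.log) >/dev/null.
The problem with this solution is that the 'tail' process stays alive even after grep has exited.
@Zskdan How can that be avoided?
Using grep -q 'PATTERN' <(timeout 10 tail -f file.log) makes the tail process terminate and includes the timeout as well.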
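
The last comment can be wrapped in a small bash function. This is only a sketch: the name wait_for_pattern and its argument order are made up for illustration, and it assumes GNU timeout and tail are installed. On a match, grep returns at once while tail may linger until the timeout ends it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the thread): block until PATTERN appears in
# FILE, giving up after SECS seconds.
# Returns 0 on a match and non-zero if the time runs out first.
wait_for_pattern() {
    local file=$1 pattern=$2 secs=$3
    # timeout ends tail after SECS seconds; grep then sees EOF and returns 1.
    # On a match grep -q exits immediately; tail is cleaned up by timeout later.
    grep -q -- "$pattern" <(timeout "$secs" tail -f -- "$file" 2>/dev/null)
}
```

Usage would look like: wait_for_pattern file.log 'PATTERN' 180 && echo found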

【Solution 2】:

I made a variant with sed instead of grep, which prints every line it parses.

sed '/PATTERN/q' <(tail -n 0 -f file.log)

The script is at https://gist.github.com/2377029

【Comments】:

【Solution 3】:

Take a look at the --max-count option:

tail -f file.log | grep -m 1 'PATTERN'

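It will exit after the first line that matches PATTERN.

EDIT: Note @Karoly's comment below. If file.log grows slowly, the grep process may block until more content is appended to the file after the matching line.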

echo 'context PATTERN line' >> file.log  ## grep shows the match but doesn't exit

will print the matching line, but it will not exit until more content is appended to the file (even if that content has no newline yet):

echo -n ' ' >> file.log  ## Now the grep process exits

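In some cases (for example, high-velocity log files) this is not a big deal, because new content will probably be appended to the file soon anyway.

Also note that this behavior does not occur when reading from a console as standard input, so there seems to be a difference in how grep reads from a pipe: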

$ grep -m1 'PATTERN' -      # manually type PATTERN and enter, exits immediately
$ cat | grep -m1 'PATTERN'  # manually type PATTERN and enter, and it hangs
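
A likely explanation, offered here as an assumption rather than something the answer states, is that grep does exit after the first match and the shell is then waiting for tail, which only notices the closed pipe the next time it writes. Under that assumption, one workaround is to feed grep through a named pipe and stop tail explicitly as soon as grep returns. The function name wait_and_stop_tail and the temporary FIFO are illustrative, and the sketch has no timeout of its own.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the thread): print the first line of FILE that
# matches PATTERN, then stop the background tail instead of waiting for it.
wait_and_stop_tail() {
    local file=$1 pattern=$2 dir tail_pid status=0
    dir=$(mktemp -d) || return 2
    mkfifo "$dir/pipe" || { rm -rf "$dir"; return 2; }

    # tail writes into the FIFO in the background so that we keep its PID.
    tail -f -- "$file" > "$dir/pipe" 2>/dev/null &
    tail_pid=$!

    # grep exits after the first matching line (-m 1).
    grep -m 1 -- "$pattern" < "$dir/pipe" || status=$?

    # Stop tail ourselves rather than waiting for its next write to fail.
    kill "$tail_pid" 2>/dev/null || true
    wait "$tail_pid" 2>/dev/null || true
    rm -rf "$dir"
    return "$status"
}
```

Wrapping the call in timeout is not possible for a shell function, so a deadline would have to be added separately.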

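【Comments】:

It doesn't exit - you first have to wait until something else is written to the file...
Yes, that is the requirement in the original question: "block until a string is matched"... The -m 1 argument makes it exit after the first match of PATTERN
Yes, intuitively that is how it should work. Unfortunately, an additional line has to appear in the log file first to trigger the actual exit. Test it yourself. I see PATTERN highlighted in the grep output, but the process does not exit. Note: this is bash; your shell may be smarter.
Indeed, it looks like you are right. I was testing with something that had a high-velocity log (android logcat, IIRC), so I did not notice the hang. I am not sure why grep works this way, since the man page says it should stop as soon as the max count is reached.
@Joe The nuance described in your answer/comments had me confused for days. Is there any other way to exit the stream immediately once the keyword is grepped, with sleep infinity at the end of the script?

【Solution 4】: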

tail -f file | grep word | head -n1

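Will post a snippet with an asynchronous timeout

For now: How to include a timer in Bash Scripting?

The linked answer defines a 'run_or_timeout' function that does what you are looking for in a very bash-savvy way

【Comments】:

The -1 syntax is discouraged, IIRC. Use -n 1.
This way, tail does not end when the string is found.
That surprises me. Will need to check later
@Ondra Žižka Well... grep --line-buffered is part of the equation. Also, it now looks as though running head -n0 will abort on the next line after the first match. I will try to figure out more later

【Solution 5】: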

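$ tail -f path | sed /pattern/q

Or, if you want to suppress the output of non-matching lines:

$ tail -f path | sed -n '/pattern/{p; q;}'

A simple way to add a timeout is:

$ cmd& sleep 10; kill $! 2> /dev/null

(The errors from kill are suppressed so that you do not get a "No such process" warning if the process terminates before the time expires.) Note that this is not robust at all: cmd may terminate, the pid counter may wrap around, and some other command may own that pid by the time the timer expires.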
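
One way around the pid-reuse caveat, sketched here as an alternative rather than as this answer's method, is to let the coreutils timeout command used in Solution 1 own the child process, so that no stale pid is ever signalled. The wrapper name run_with_deadline is hypothetical; the only behavior relied on is that timeout exits with status 124 when the time limit is hit.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not from the thread): run a command under GNU timeout
# and report when the deadline was hit.
run_with_deadline() {
    local secs=$1 status=0
    shift
    # timeout starts the command itself, so it signals only its own child.
    timeout "$secs" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        echo "timed out after ${secs}s" >&2
    fi
    return "$status"
}
```

For example, run_with_deadline 10 some_command would give some_command (a placeholder) ten seconds to finish.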

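【Comments】:

This works, but sed's buffering (or something else) means it needs one extra line to trigger, and that line may never come (the matching line tends to be the last one in the log).
The sed in the second command gives me an "unterminated address regex" error.
@Ondra Žižka: So the extra line that is needed is the same effect I was seeing in mine.

【Solution 6】:

Wait for the file to appear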

while [ ! -f /path/to/the.file ] 
do sleep 2; done

Wait for the string to appear in the file

while ! grep "the line you're searching for" /path/to/the.file  
do sleep 10; done

https://superuser.com/a/743693/129669
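
The two loops above have no upper bound on how long they wait. A minimal sketch that combines them and adds a deadline could look like this; the function name poll_for_pattern, its argument order, and the default one-second interval are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the thread): poll FILE for PATTERN every
# INTERVAL seconds and give up after SECS seconds.
poll_for_pattern() {
    local file=$1 pattern=$2 secs=$3 interval=${4:-1}
    local deadline=$(( $(date +%s) + secs ))
    # grep fails both when the file is missing and when nothing matches,
    # so the same loop also covers waiting for the file to appear.
    until grep -q -- "$pattern" "$file" 2>/dev/null; do
        if [ "$(date +%s)" -ge "$deadline" ]; then
            return 1
        fi
        sleep "$interval"
    done
    return 0
}
```

Unlike the tail-based answers, this rescans the whole file on every pass, which is simple but can be slow for large logs.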

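【Comments】:

【Solution 7】:

I had a similar requirement and came up with the following.

The one-liner you are after is the line that starts with "timeout ....". The rest of the code is the preparation needed to give the one-liner the information it requires and to clean up afterwards.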

##
## Start up the process whose log file we want to monitor for a specific pattern.
##
touch file_to_log_nohup_output.log
nohup "some_command" "some_args" >> file_to_log_nohup_output.log 2>&1 &
my_cmd_pid=$!


## Specify what our required timeout / pattern and log file to monitor is
my_timeout=10m
my_logfile="/path/to/some_command's/log/file.txt"
my_pattern="Started all modules."


## How does this work?
## - In a bash sub shell, started in the background, we sleep for a second and
##   then execute tail to monitor the application's log file.
## - Via the arguments passed to it, tail has been configured to exit if the
##   process whose log file it is monitoring dies.
## - The above sub shell, is executed within another bash sub shell in which
##   we identify the process id of the above sub shell and echo it to stdout.
## - Lastly, in that sub shell we wait for the sub shell with tail running in
##   it as a child process, to terminate and if it does terminate, we redirect
##   any output from its stderr stream to /dev/null.
## - The stdout output of the above sub shell is piped into another sub shell
##   in which we setup a trap to watch for an EXIT event, use head -1 to read
##   the process id of the tail sub shell and finally start a grep process
##   to grep the stdout for the requested pattern. Grep will quit on the first
##   match found. The EXIT trap will kill the process of the tail sub shell
##   if the sub shell running grep quits.
##
## All of this is needed to tidy up the monitoring child processes for
## tail'ing + grep'ing the application log file.
##
## Logic of implementing the above sourced from: http://superuser.com/a/1052328


timeout ${my_timeout} bash -c '((sleep 1; exec tail -q -n 0 --pid=$0 -F "$1" 2> /dev/null) & echo $! ; wait $! 2>/dev/null ) | (trap "kill \${my_tail_pid} 2>/dev/null" EXIT; my_tail_pid="`head -1`"; grep -q "$2")' "${my_cmd_pid}" "${my_logfile}" "${my_pattern}" 2>/dev/null &
my_timeout_pid=$!


##
## We trap SIGINT (i.e. when someone presses ctrl+c) to clean up child processes.
##
trap 'echo "Interrupt signal caught. Cleaning up child processes: [${my_timeout_pid} ${my_cmd_pid}]." >> "file_to_log_nohup_output.log"; kill ${my_timeout_pid} ${my_cmd_pid} 2> /dev/null' SIGINT
wait ${my_timeout_pid}
my_retval=$?
trap - SIGINT


## If the time out expires, then 'timeout' will exit with status 124 otherwise
## it exits with the status of the executed command (which is grep in this
## case).
if [ ${my_retval} -eq 124 ]; then
    echo "Waited for [${my_timeout}] and the [${my_pattern}] pattern was not encountered in application's log file."
    exit 1
else
    if [ ${my_retval} -ne 0 ]; then
        echo "An issue occurred whilst starting process. Check log files:"
        echo "  * nohup output log file: [file_to_log_nohup_output.log]"
        echo "  * application log file: [${my_logfile}]"
        echo "  * application's console log file (if applicable)"
        exit 1
    else
        echo "Success! Pattern was found."
        exit 0
    fi
fi

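I have implemented the above as a standalone script that can be used to run a command and wait, with a timeout, for its log file to contain the required pattern.

Available here: run_and_wait.sh

【Comments】: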