Writing a shell script: answers

【Question title】: writing shell script
【Posted】: 2023-07-05 01:38:02
【Question description】:

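I want to write a shell script that reads a file from standard input, removes every <html> string and every empty line, and writes the result to standard output. The file looks like this: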

#some lines that do not contain <html> in here
<html>a<html>
<tr><html>b</html></tr>
#some lines that do not contain <html> in here
<html>c</html>

So the output file should contain:

#some lines that do not contain <html> in here
a
<tr>b</html></tr>
#some lines that do not contain <html> in here
c</html>

I tried writing this shell script:

read INPUT #read file from std input
tr -d '[:blank:]'
grep "<html>" | sed -r 's/<html>//g'
echo $INPUT

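But this script does not work at all. Any ideas? Thanks

【Question discussion】:

If possible, you might want to try this in Perl (or something other than shell): check out the answer(s) on this other question
@summea I can't. I have to use #!/usr/bin/bash
Should the comments be preserved?
I guess I don't understand why you would have multiple <html></html> pairs in one document...
I don't know either. It's just some random file my teacher gave us

Tags: shell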

【Solution 1】:

Pure bash:

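#!/bin/bash

# Read the file named by the first argument, one line at a time.
while read -r line
do
    # Meant to skip comments, but this compares the whole line with the
    # literal two-character string \# and so never matches a real comment line.
    [[ "$line" = "\#" ]] && continue
    # Skip empty lines.
    [[ $line =~ ^$ ]] && continue
    # Delete every <html>; the unquoted expansion also collapses runs of spaces.
    echo ${line//\<html\>/}
done < "$1"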

Output:

$ ./replace.sh input
#some lines that do not contain in here
a
<tr>b</html></tr>
#some lines that do not contain in here
c</html>

Pure sed:

sed 's/<html>//g' input | sed '/^$/d'
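
If the comment lines have to come through unchanged, as in the expected output in the question, a sed address can restrict the substitution to lines that do not start with `#`. The following is only a sketch under that assumption; `strip_html` is a hypothetical name, and the demo input is made up.

```shell
#!/bin/bash
# strip_html is a hypothetical name chosen for this sketch.
strip_html() {
    # /^#/!s/<html>//g : on lines NOT starting with '#', delete every <html>
    # /^$/d            : then drop any line that is (or has become) empty
    sed -e '/^#/!s/<html>//g' -e '/^$/d'
}

# Small demo on made-up input; the function reads stdin and writes stdout.
printf '%s\n' '#keep <html> here' '<html>a<html>' '' '<tr><html>b</html></tr>' | strip_html
```

To use it as the script the question asks for, the demo line could be replaced by a bare `strip_html` call, so that `./replace.sh < input` filters whatever arrives on standard input.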

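【Discussion】:

What does [[ "$line" = "\#" ]] mean? And can't I just use grep and sed?
See the comments in the source code above
So the first sed removes <html>, but what does the second sed do?
The second one deletes empty lines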
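
On the `[[ "$line" = "\#" ]]` question: with a quoted right-hand side, `=` inside `[[ ]]` is a plain string comparison, so that test is true only for a line that is exactly `\#`. A prefix test needs an unquoted glob pattern instead. A small sketch of the difference, where `is_comment` is a hypothetical helper and the three sample lines are made up:

```shell
#!/bin/bash
# is_comment is a hypothetical helper, used only for illustration.
is_comment() {
    # An unquoted right-hand side of == in [[ ]] is a glob pattern:
    # \# is a literal '#', and * matches the rest of the line.
    [[ $1 == \#* ]]
}

for line in '#a comment' '<html>a<html>' '\#'; do
    if [[ "$line" = "\#" ]]; then
        # Quoted right-hand side: true only for the exact string \#
        printf 'literal match: %s\n' "$line"
    elif is_comment "$line"; then
        printf 'comment: %s\n' "$line"
    else
        printf 'other: %s\n' "$line"
    fi
done
```

Only the third sample line takes the first branch; the real comment line is caught by the glob test.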

【Solution 2】:

Awk can do this easily:

awk '/./ {gsub("<html>","");print}' INPUTFILE

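It first selects every line that contains at least one character (so empty lines are discarded), then globally replaces "<html>" on that line with the empty string, and finally prints the line.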
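
The one-liner above also strips `<html>` from lines that start with `#`. If those lines should stay as they are, one possible variant guards the `gsub` call. This is a sketch only; `strip_html_awk` is a hypothetical name, and the function reads standard input rather than a named file.

```shell
#!/bin/bash
# strip_html_awk is a hypothetical name chosen for this sketch.
strip_html_awk() {
    awk '/./ {                       # only lines with at least one character
        if ($0 !~ /^#/)              # leave comment lines as they are
            gsub("<html>", "")       # delete every <html> on the line
        print
    }'
}

# Small demo on made-up input.
printf '%s\n' '#keep <html> here' '' '<html>c</html>' | strip_html_awk
```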

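【Discussion】:

The OP needs to keep the comments
I can only use grep and sed. But what does /./ mean? Does it refer to the current directory?
@HannaGabby - /./ is a regular expression that matches a single character [any character]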
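
The same `.` pattern works in grep, which may help if only grep and sed are allowed. A minimal sketch with made-up input (the `sample` variable is just for illustration):

```shell
#!/bin/bash
# Made-up three-line input: the middle line is empty.
sample=$(printf 'a\n\n#c')

# "." matches any single character, so grep keeps only non-empty lines.
printf '%s\n' "$sample" | grep .

# The same filter feeding sed, staying within grep and sed only.
printf '%s\n' '<html>a<html>' '' '#c' | grep . | sed 's/<html>//g'
```

Here `.` is a regular expression, not a path, so nothing in the current directory is involved.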