Allowing one character to differ:
$ cat tst.awk
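BEGIN {
    lgth = length(str)
    # Build one alternative per position, replacing that character with "."
    for (i=1; i<=lgth; i++) {
        head = esc(substr(str,1,i-1))
        tail = esc(substr(str,i+1))
        part = head "." tail
        reg  = (i>1 ? reg "|" : "") part
    }
    reg = "(" tolower(reg) ")"
    # Diagnostics go to stderr so they stay separate from the matching lines
    printf "Searching for string \"%s\"\n", str | "cat>&2"
    printf "Searching for regexp \"%s\"\n", reg | "cat>&2"
}
# Case-insensitive match: print each line whose lowercased text matches reg
tolower($0) ~ reg
# Make str literal: bracket every char except ^ and \, backslash-escape those two
function esc(str) {
    gsub(/[^^\\]/,"[&]",str)
    gsub(/\^|\\/,"\\\\&",str)
    return str
}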
$ awk -v str='tataag' -f tst.awk file
>1 agctcaTATAAGtataagctagaagta
>4 gctagcaTATCAGgatgtagtagta
Searching for string "tataag"
Searching for regexp "(.[a][t][a][a][g]|[t].[t][a][a][g]|[t][a].[a][a][g]|[t][a][t].[a][g]|[t][a][t][a].[g]|[t][a][t][a][a].)"
Allowing one character to be missing:
$ cat tst.awk
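BEGIN {
    lgth = length(str)
    # Build one alternative per position, making that character optional with "?"
    for (i=1; i<=lgth; i++) {
        head = esc(substr(str,1,i))
        tail = esc(substr(str,i+1))
        part = head "?" tail
        reg  = (i>1 ? reg "|" : "") part
    }
    reg = "(" tolower(reg) ")"
    # Diagnostics go to stderr so they stay separate from the matching lines
    printf "Searching for string \"%s\"\n", str | "cat>&2"
    printf "Searching for regexp \"%s\"\n", reg | "cat>&2"
}
# Case-insensitive match: print each line whose lowercased text matches reg
tolower($0) ~ reg
# Make str literal: bracket every char except ^ and \, backslash-escape those two
function esc(str) {
    gsub(/[^^\\]/,"[&]",str)
    gsub(/\^|\\/,"\\\\&",str)
    return str
}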
$ awk -v str='tataag' -f tst.awk file
>1 agctcaTATAAGtataagctagaagta
>3 atatagcgctagagccgtagta
Searching for string "tataag"
Searching for regexp "([t]?[a][t][a][a][g]|[t][a]?[t][a][a][g]|[t][a][t]?[a][a][g]|[t][a][t][a]?[a][g]|[t][a][t][a][a]?[g]|[t][a][t][a][a][g]?)"
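All of the escaping above is there to ensure your string is treated as a literal string even if/when it contains regexp metacharacters.
Once you've finished testing, you can remove the 2 printf statements.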
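To see what esc() does on its own, here is a small sketch that runs it on two short strings containing metacharacters. The wrapper name esc_demo and the sample strings "a.c" and "a^b" are assumptions made up for this illustration; only the esc() body comes from the scripts above.

```shell
# esc_demo is a hypothetical wrapper used only for this illustration.
esc_demo() {
    awk '
    # Same esc() as in tst.awk: bracket every char except ^ and \,
    # then backslash-escape those two.
    function esc(str) {
        gsub(/[^^\\]/, "[&]", str)
        gsub(/\^|\\/, "\\\\&", str)
        return str
    }
    BEGIN {
        print esc("a.c")      # the "." ends up inside a bracket expression
        print esc("a^b")      # the "^" gets a backslash instead of brackets
        reg = esc("a.c")
        # The escaped "." only matches a literal dot, not any character
        print (("a.c" ~ reg) ? "match" : "no match")
        print (("abc" ~ reg) ? "match" : "no match")
    }'
}
esc_demo
```

Under the POSIX replacement rules for gsub() this should print [a][.][c], then [a]\^[b], then "match" and "no match". The ^ and \ characters are handled separately because, inside a bracket expression, a leading ^ negates the set and a backslash is not portable across awk implementations.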