How to extract the text between the first two dashes in a string using sed or grep in shell

【Question title】: How to extract text between first 2 dashes in the string using sed or grep in shell
【Posted】: 2021-07-18 15:59:56
【Question description】:

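I have a string like feature/test-111-test-test. I need to extract the string up to the second dash and change the forward slash to a dash.

I have to do this in a Makefile using shell syntax, and the regular expressions I have tried for this case do not work for me.

In the end I need to get something like this:
Input - feature/test-111-test-test
Output - feature-test-111- or at least feature-test-111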

feature/test-111-test-test | grep -oP '\A(?:[^-]++-??){2}' | sed -e 's/\//-/g')

But grep -oP does not work in my case. This regex does not work well either: (.*?-.*?)-.*

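【Comments on the question】:

"grep -oP doesn't work". Are you sure it is only grep that doesn't work? The command you posted looks broken: the first part is just a string. I would have expected something like echo to print that string. There is also a stray ) at the end.

Tags: regex shell sed makefile grep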

【Solution 1】:

You can use awk. Keep in mind that in a Makefile the $ character in the awk command must be doubled:

url=$(shell echo 'feature/test-111-test-test' | awk -F'-' '{gsub(/\//, "-", $$1);print $$1"-"$$2"-"}')
echo "$url"
# => feature-test-111-

See the online demo. Here, -F'-' sets the field separator to -, gsub(/\//, "-", $1) replaces every / in field 1 with -, and print $1"-"$2"-" prints fields 1 and 2, each followed by a -.
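Outside a Makefile the same awk program takes single $ signs, which makes it easy to check the command in an interactive shell first. A minimal sketch, assuming a POSIX-like shell; the variable name branch is only illustrative and does not come from the answer:

```shell
# Plain-shell version of the command above: single $ because make is not involved.
branch='feature/test-111-test-test'   # illustrative variable name

url=$(printf '%s\n' "$branch" |
  awk -F'-' '{
    gsub(/\//, "-", $1)      # replace every / in field 1 with -
    print $1 "-" $2 "-"      # fields 1 and 2, each followed by -
  }')

echo "$url"
# => feature-test-111-
```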

Alternatively, use a regular expression as the field separator:

url=$(shell echo 'feature/test-111-test-test' | awk -F'[-/]' '{print $$1"-"$$2"-"$$3"-"}')
echo "$url"
# => feature-test-111-

The -F'[-/]' option sets the field separator to either - or /.

The '{print $1"-"$2"-"$3"-"}' part prints the first, second and third fields, each followed by a hyphen.

See the online demo.
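The regex-separator variant can be tried in a plain shell as well. The sketch below wraps it in a helper named branch_prefix, a hypothetical name introduced here for illustration; because it simply counts fields, it assumes the input has exactly one / before the first dash:

```shell
# Hypothetical helper (not from the answer): print the first three fields,
# where fields are split on either - or /, each followed by a dash.
branch_prefix() {
  printf '%s\n' "$1" | awk -F'[-/]' '{ print $1 "-" $2 "-" $3 "-" }'
}

branch_prefix 'feature/test-111-test-test'   # feature-test-111-
branch_prefix 'bugfix/login-42-retry'        # bugfix-login-42-
```

For a branch name without a slash, the third field would be whatever follows the second dash, so the first awk command may be the safer choice when the prefix is optional.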

【Comments】:

Thanks for the reply, @wiktor-stribiżew. It does not work correctly for me. s='feature/test-111-test-test' url=$(shell awk -F'[-/]' '{print $1"-"$2"-"$3"-"}'
@VitaliiSotnichenko In a Makefile, $ is special. You need to double it, so use url=$(shell awk -F'[-/]' '{print $$1"-"$$2"-"$$3"-"}' <<< "$s")
@VitaliiSotnichenko Does my updated solution work now?
@VitaliiSotnichenko You can add http:// in the print command: awk -F'-' '{gsub(/\//, "-", $$1);print "http://"$$1"-"$$2"-"}' or awk -F'[-/]' '{print "http://"$$1"-"$$2"-"$$3"-"}'
Works great for me, thank you very much.

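【Solution 2】:

To get everything up to the nth occurrence of a character C, you don't need fancy perl regexes. Instead, build a regex of the form "(anything that is not C, followed by C)" repeated n times: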

grep -Eo '([^-]*-){2}' | tr / -
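The command above reads from standard input. A small sketch of how it might be wired up follows. The ^ anchor is an addition that is not in the answer: grep -o prints every non-overlapping match, so without the anchor an input with four or more dashes would produce a second output line.

```shell
s='feature/test-111-test-test'

# ^ anchors the match to the start of the line, so only the first
# two "non-dashes followed by a dash" groups are printed.
url=$(printf '%s\n' "$s" | grep -Eo '^([^-]*-){2}' | tr / -)

echo "$url"
# => feature-test-111-
```

Note that -o is supported by GNU and BSD grep but is not part of the POSIX grep specification.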

【Comments】:

【Solution 3】:

Using sed and cut:

echo feature/test-111-test-test| cut -d'-' -f-2 |sed 's/\//-/'

Output

feature-test-111

echo feature/test-111-test-test| cut -d'-' -f-2 |sed 's/\//-/;s/$/-/'

Output

feature-test-111-

【Comments】:

【Solution 4】:

Another sed solution, using a capture group and pattern repetition (the same approach Socowi used):

$ s='feature/test-111-test-test'
$ sed -E 's/\//-/;s/^(([^-]*-){3}).*$/\1/' <<< "${s}"
feature-test-111-

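Where:

-E - enables extended regex support
s/\//-/ - replaces the first / with -
s/^....*$/ - matches the input line from start to end
(([^-]*-){3}) - capture group #1, made up of 3 sets of "anything that is not -" followed by -
\1 - prints only capture group #1 (this discards everything else on the line that is not part of the capture group)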
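To see what each substitution contributes, the two s commands can be run one at a time. This is only an illustrative breakdown; the variable names step1 and step2 are made up for the example:

```shell
s='feature/test-111-test-test'

# Step 1 only: the first / becomes -
step1=$(printf '%s\n' "$s" | sed -E 's/\//-/')
echo "$step1"    # feature-test-111-test-test

# Step 2 on that result: keep the first three "non-dashes then dash" groups
step2=$(printf '%s\n' "$step1" | sed -E 's/^(([^-]*-){3}).*$/\1/')
echo "$step2"    # feature-test-111-
```

The count is 3 rather than 2 because the slash has already been turned into a dash by the time the second substitution runs.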

Storing the result in a variable:

$ url=$(sed -E 's/\//-/;s/^(([^-]*-){3}).*$/\1/' <<< "${s}")
$ echo $url
feature-test-111-

【Comments】:

【Solution 5】:

You can use a simple BRE regex of the form "not something, then that something", i.e. [^-]*-, to match all characters other than - up to and including the next -.

This works:

echo 'feature/test-111-test-test' | sed -nE 's/^([^/]*)\/([^-]*-[^-]*-).*/\1-\2/p'
feature-test-111-

【Comments】:

【Solution 6】:

Another idea, using parameter expansion/substitution:

s='feature/test-111-test-test'
tail="${s//\//-}"                   # replace '/' with '-'

# split first field from rest of fields ('-' delimited); do this 3x times

head="${tail%%-*}"                  # pull first field
tail="${tail#*-}"                   # drop first field

head="${head}-${tail%%-*}"          # pull first field; append to previous field
tail="${tail#*-}"                   # drop first field

head="${head}-${tail%%-*}-"         # pull first field; append to previous fields; add trailing '-'

$ echo "${head}"
feature-test-111-
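The three repeated steps above could also be written as a loop. The sketch below is one possible generalization, not part of the answer: first_fields is a hypothetical helper name, and tr stands in for ${s//\//-} so that the function does not depend on bash. The variables it sets are global, since plain sh has no local.

```shell
# Hypothetical helper: keep the first n dash-separated fields of a string
# after turning / into -, with each field followed by a dash.
first_fields() {
  rest=$(printf '%s' "$1" | tr / -)   # portable stand-in for ${s//\//-}
  n=$2
  out=''
  while [ "$n" -gt 0 ]; do
    out="${out}${rest%%-*}-"          # append the first remaining field
    case $rest in
      *-*) rest=${rest#*-} ;;         # drop that field
      *)   break ;;                   # no dash left: nothing more to take
    esac
    n=$((n - 1))
  done
  printf '%s\n' "$out"
}

first_fields 'feature/test-111-test-test' 3   # feature-test-111-
```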

【Comments】:

【Solution 7】:

A short sed solution, without extended regex:

sed 's|\(.*\)/\([^-]*-[^-]*\).*|\1-\2|'
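The answer gives only the sed command, so here is a short usage sketch that feeds it the string from the question. The variable names are illustrative:

```shell
s='feature/test-111-test-test'

# \(.*\) is greedy, so \1 extends to the last / that is still followed by
# a dash; \2 is "field, dash, field" and the trailing .* is discarded.
url=$(printf '%s\n' "$s" | sed 's|\(.*\)/\([^-]*-[^-]*\).*|\1-\2|')

echo "$url"
# => feature-test-111
```

This produces the form without the trailing dash. Changing the replacement to \1-\2- should give feature-test-111- instead.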

【Comments】: