Since your format is fixed, you can match and capture in the same regular expression.
var str1 = 'I am a "somevalue" of "anothervalue"',
    str2 = 'I am a "somevalue"',
    str3 = 'I am a "value with \\"escaped\\" quotes"',
    regex = /^I am a "((?:\\"|[^"])*)"(?: of "((?:\\"|[^"])*)")?/;

function match(str) {
    var matches = str.match(regex);
    if (matches !== null) {
        // Drop matches[0] (the whole match) and keep only the captured groups.
        console.log(matches.slice(1));
    }
}

match(str1); // ['somevalue', 'anothervalue']
match(str2); // ['somevalue', undefined]
match(str3); // ['value with \\"escaped\\" quotes', undefined] (the backslashes stay in the capture)
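The slice call removes the first element of the match array, which holds the entire matched string. If the optional second part is not present, you get undefined as the second value.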
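If the caller would rather not deal with undefined, the captures can be filtered before use. This is only a minimal sketch that assumes the same regex as above; `extractValues` is a hypothetical helper name, not something from the code above.

```javascript
// Same pattern as in the answer above.
var regex = /^I am a "((?:\\"|[^"])*)"(?: of "((?:\\"|[^"])*)")?/;

// Hypothetical helper: returns only the groups that actually matched.
function extractValues(str) {
    var matches = str.match(regex);
    if (matches === null) {
        return []; // the string does not follow the expected format
    }
    // slice(1) drops the whole-match element; filter drops unmatched groups.
    return matches.slice(1).filter(function (value) {
        return value !== undefined;
    });
}

console.log(extractValues('I am a "somevalue" of "anothervalue"')); // two values
console.log(extractValues('I am a "somevalue"'));                   // one value
console.log(extractValues('something else'));                       // empty array
```

An empty quoted value (`""`) is kept as an empty string, since only groups that did not participate in the match are removed.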
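Depending on how quotes inside the quoted values are escaped, you may need to adjust the regex slightly. I assumed that \ is the escape character.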
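As an illustration of that kind of adjustment, here is a sketch for a different, purely assumed convention in which a quote inside a value is written as two quotes (`""`) instead of `\"`. The names `doubledQuoteRegex` and `matchDoubled` are made up for this example.

```javascript
// Assumed alternative convention: a quote inside a value is written as "".
// Only the escape alternative changes: \\" becomes "".
var doubledQuoteRegex = /^I am a "((?:""|[^"])*)"(?: of "((?:""|[^"])*)")?/;

function matchDoubled(str) {
    var matches = str.match(doubledQuoteRegex);
    return matches === null ? null : matches.slice(1);
}

console.log(matchDoubled('I am a "value with ""escaped"" quotes"'));
console.log(matchDoubled('I am a "somevalue" of "anothervalue"'));
```

As with the backslash version, the captured text still contains the escape sequence (here the doubled quotes), so unescaping would be a separate step.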
NODE                     EXPLANATION
--------------------------------------------------------------------------------
  ^                        the beginning of the string
--------------------------------------------------------------------------------
  I am a "                 'I am a "'
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    (?:                      group, but do not capture (0 or more
                             times (matching the most amount
                             possible)):
--------------------------------------------------------------------------------
      \\                       '\'
--------------------------------------------------------------------------------
      "                        '"'
--------------------------------------------------------------------------------
     |                        OR
--------------------------------------------------------------------------------
      [^"]                     any character except: '"'
--------------------------------------------------------------------------------
    )*                       end of grouping
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  "                        '"'
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (optional
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
     of "                     ' of "'
--------------------------------------------------------------------------------
    (                        group and capture to \2:
--------------------------------------------------------------------------------
      (?:                      group, but do not capture (0 or more
                               times (matching the most amount
                               possible)):
--------------------------------------------------------------------------------
        \\                       '\'
--------------------------------------------------------------------------------
        "                        '"'
--------------------------------------------------------------------------------
       |                        OR
--------------------------------------------------------------------------------
        [^"]                     any character except: '"'
--------------------------------------------------------------------------------
      )*                       end of grouping
--------------------------------------------------------------------------------
    )                        end of \2
--------------------------------------------------------------------------------
    "                        '"'
--------------------------------------------------------------------------------
  )?                       end of grouping
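The breakdown shows that the same quoted-value sub-pattern appears twice. One way to make that visible in code is to build the regex from a single reusable piece. This is a sketch, not the original answer's approach; `quotedValue` and `builtRegex` are names invented here.

```javascript
// One quoted value: an opening quote, then any run of \" pairs or non-quote
// characters (captured), then the closing quote. Backslashes are doubled
// once more because this is a string literal, not a regex literal.
var quotedValue = '"((?:\\\\"|[^"])*)"';

// "^I am a <value>" followed by an optional " of <value>".
var builtRegex = new RegExp('^I am a ' + quotedValue + '(?: of ' + quotedValue + ')?');

console.log(builtRegex.source);
console.log('I am a "somevalue" of "anothervalue"'.match(builtRegex).slice(1));
```

The trade-off is the extra level of backslash escaping in the string, which is easy to get wrong; the regex literal avoids that at the cost of repeating the sub-pattern.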
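That said, this solution only holds up to a certain level of "high usage". If we are talking about millions of queries at a very high rate, you would be better off with a technology better suited to parsing text (which probably will not be JavaScript/Node).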