Javascript Regexp loop all matches: answers

【Question title】: Javascript Regexp loop all matches
【Posted】: 2011-08-15 16:17:20
【Question description】:

I'm trying to do something similar to the Stack Overflow rich text editor. Given this text:

[Text Example][1]

[1][http://www.example.com]

I want to loop through each [string][int] that is found this way:

var Text = "[Text Example][1]\n[1][http://www.example.com]";
// Find resource links
var arrMatch = null;
var rePattern = new RegExp(
  "\\[(.+?)\\]\\[([0-9]+)\\]",
  "gi"
);
while (arrMatch = rePattern.exec(Text)) {
  console.log("ok");
}

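This works fine; it logs "ok" for each [string][int]. What I need to do, though, is for each match found, replace the initial match with components of a second match.

So in the loop, $2 would represent the int part of the original match, and I would run this regexp (pseudocode):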

while (arrMatch = rePattern.exec(Text)) {
    var FindIndex = $2; // This would be 1 in our example
    new RegExp("\\[" + FindIndex + "\\]\\[(.+?)\\]", "g")

    // Replace original match now with hyperlink
}

This would match

[1][http://www.example.com]

The end result for the first example would be:

<a href="http://www.example.com" rel="nofollow">Text Example</a>

Edit

I've now got this far:

var Text = "[Text Example][1]\n[1][http://www.example.com]";
// Find resource links
var reg = new RegExp(
  "\\[(.+?)\\]\\[([0-9]+)\\]",
  "gi");
var result;
while ((result = reg.exec(Text)) !== null) {
  var LinkText = result[1];
  var Match = result[0];
  // Replace the literal match (a plain string, so its brackets need no escaping)
  Text = Text.replace(Match, '<a href="#">' + LinkText + '</a>');
}
console.log(Text);

【Question comments】:

Tags: javascript regex loops match

【Solution 1】:

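I know this is old, but since I stumbled upon this post, I want to set things straight.

First, your way of thinking about this problem is too complicated. When the solution to a supposedly simple problem becomes too complicated, it is time to stop and think about what went wrong. Second, your solution is rather inefficient: you first look for what to replace, and then search the same text again for the referenced link information. So the computational complexity ends up being O(n^2).

It is very disappointing to see so many upvotes for a flawed approach, because people who come here mostly learn from the accepted solution, assume it is a legitimate answer, use the concept in their own projects, and end up with a poorly performing product.

The way to solve this is quite simple. All you need to do is find all the referenced links in the text, save them in a dictionary, and then use the dictionary when replacing the placeholders. That's it. It's that simple! In this case you get a complexity of only O(n).

Here is how it goes: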

const text = `
 [2][https://en.wikipedia.org/wiki/Scientific_journal][5][https://en.wikipedia.org/wiki/Herpetology]

The Wells and Wellington affair was a dispute about the publication of three papers in the Australian Journal of [Herpetology][5] in 1983 and 1985. The publication was established in 1981 as a [peer-reviewed][1] [scientific journal][2] focusing on the study of [3][https://en.wikipedia.org/wiki/Amphibian][amphibians][3] and [reptiles][4] ([herpetology][5]). Its first two issues were published under the editorship of Richard W. Wells, a first-year biology student at Australia's University of New England. Wells then ceased communicating with the journal's editorial board for two years before suddenly publishing three papers without peer review in the journal in 1983 and 1985. Coauthored by himself and high school teacher Cliff Ross Wellington, the papers reorganized the taxonomy of all of Australia's and New Zealand's [amphibians][3] and [reptiles][4] and proposed over 700 changes to the binomial nomenclature of the region's herpetofauna.
[1][https://en.wikipedia.org/wiki/Academic_peer_review]    
[4][https://en.wikipedia.org/wiki/Reptile]          
`;

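const linkRefs = {};
// Reference definitions such as [1][https://example.com]
const linkRefPattern = /\[(?<id>\d+)\]\[(?<link>[^\]]+)\]/g;
// Placeholders such as [link text][1]
const linkPlaceholderPattern = /\[(?<text>[^\]]+)\]\[(?<refid>\d+)\]/g;

const parsedText = text
    // Pass 1: record every reference in the dictionary and strip it from the text
    .replace(linkRefPattern, (match, id, link) => {
        linkRefs[id] = link;
        return '';
    })
    // Pass 2: swap each placeholder for an anchor built from the recorded URL
    .replace(linkPlaceholderPattern, (match, linkText, refId) =>
        `<a href="${linkRefs[refId]}">${linkText}</a>`)
    .trim();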

console.log(parsedText);
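
One case the snippet above does not cover is a placeholder whose id was never defined, which would come out as href="undefined". A possible variation is sketched here, assuming it is acceptable to leave such placeholders untouched. The function name renderLinks and the sample input are made up for this illustration and are not part of the original answer.

```javascript
// Hypothetical helper: same two-pass idea, but unknown ids are left as-is.
function renderLinks(source) {
  const refs = new Map();
  const refPattern = /\[(\d+)\]\[([^\]]+)\]/g;
  const placeholderPattern = /\[([^\]]+)\]\[(\d+)\]/g;

  return source
    .replace(refPattern, (match, id, url) => {
      refs.set(id, url); // pass 1: remember the URL, drop the definition
      return "";
    })
    .replace(placeholderPattern, (match, label, id) =>
      // pass 2: only rewrite placeholders whose id was actually defined
      refs.has(id) ? `<a href="${refs.get(id)}">${label}</a>` : match)
    .trim();
}

console.log(renderLinks("[Text Example][1] and [Other][9]\n[1][http://www.example.com]"));
// <a href="http://www.example.com">Text Example</a> and [Other][9]
```

Each pass still walks the text once, so the overall cost stays linear in the length of the input.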

【Comments】:

【Solution 2】:

Use a backreference to constrain the match, so that the code matches when your text is:

[Text Example][1]\n[1][http://www.example.com]

and does not match when your text is:

[Text Example][1]\n[2][http://www.example.com]

// \2 must repeat the id captured by group 2, so only the matching reference is used
var re = /\[(.+?)\]\[([0-9]+)\]\s*.*\s*\[(\2)\]\[(.+?)\]/gi;
var str = '[Text Example][1]\n[1][http://www.example.com]';
var subst = '<a href="$4">$1</a>';

var result = str.replace(re, subst);
console.log(result);

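\number is used inside a regular expression to refer back to what a numbered group matched, and $number is used in the same way by the replace function to refer to a group's result.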
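
As a small side illustration of the two notations (not from the original answer; the sample string and variable names are made up), the pattern below uses \1 inside the regex to require a repeated word, and $1 in the replacement string to keep a single copy:

```javascript
// \1 inside the pattern must match the same text that group 1 captured.
const doubledWord = /\b(\w+) \1\b/g;
const sample = "this is is a test test";

// $1 in the replacement string inserts what group 1 captured.
const cleaned = sample.replace(doubledWord, "$1");
console.log(cleaned); // "this is a test"
```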

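【Comments】:

【Solution 3】:

Another way to iterate over all matches without relying on the details of exec and match is to use the string replace function, with the regular expression as the first argument and a function as the second. When used this way, the function receives the entire match as its first argument, the group matches as the following arguments, and then the match index (followed by the full input string):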

var text = "[Text Example][1]\n[1][http://www.example.com]";
// Find resource links
var arrMatch = null;
var rePattern = new RegExp("\\[(.+?)\\]\\[([0-9]+)\\]", "gi");
text.replace(rePattern, function(match, g1, g2, index){
    // Do whatever
})

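You can even use the function's arguments object to iterate over all the groups of each match, skipping the first argument (the whole match) and the trailing index and input-string arguments.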
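
A sketch of that idea follows. It uses rest parameters rather than arguments (a choice made for this example, since arrow functions have no arguments object), and it assumes the pattern has no named groups, because those append an extra groups object after the input string. The found array is a made-up name for collecting the results.

```javascript
const text = "[Text Example][1]\n[1][http://www.example.com]";
const rePattern = /\[(.+?)\]\[([0-9]+)\]/g;

const found = [];
text.replace(rePattern, (match, ...rest) => {
  // rest = [...groups, offset, inputString] when the pattern has no named groups
  const groups = rest.slice(0, -2);
  const offset = rest[rest.length - 2];
  found.push({ match, groups, offset });
  return match; // leave the text unchanged; we only want to visit each match
});

console.log(found);
```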

【Comments】:

【Solution 4】:

Here we use the exec method, which (with the help of a while loop) gets all the matches along with the position of each matched string.

    var input = "A 3 numbers in 333";
    var regExp = /\b(\d+)\b/g, match;
    while (match = regExp.exec(input))
      console.log("Found", match[1], "at", match.index);
    // → Found 3 at 2
    // → Found 333 at 15
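
In engines that support ES2020, String.prototype.matchAll gives the same information without managing the regex's lastIndex by hand. This is a different technique from the exec loop above, offered only as a sketch; the hits array is a made-up name, and note that matchAll throws a TypeError if the pattern lacks the g flag.

```javascript
const input = "A 3 numbers in 333";
const hits = [];

// matchAll returns an iterator of match arrays, each with .index set
for (const match of input.matchAll(/\b(\d+)\b/g)) {
  hits.push([match[1], match.index]);
}

console.log(hits); // [["3", 2], ["333", 15]]
```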

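【Comments】:

This is really useful.

【Solution 5】:

I agree with Jason that using an existing Markdown library would be faster/safer, but what you are looking for is String.prototype.replace (also, use RegExp literals!):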

var Text = "[Text Example][1]\n[1][http://www.example.com]";
var rePattern = /\[(.+?)\]\[([0-9]+)\]/gi;

console.log(Text.replace(rePattern, function(match, text, urlId) {
  // return an appropriately-formatted link
  return `<a href="${urlId}">${text}</a>`;
}));

【Comments】:

【Solution 6】:

I ended up doing this:

var Text = "[Text Example][1]\n[1][http://www.example.com]";
// Find resource links
var reg = new RegExp(
  "\\[(.+?)\\]\\[([0-9]+)\\]",
  "gi");
var result;
while (result = reg.exec(Text)) {
  var LinkText = result[1];
  var Match = result[0];
  var LinkID = result[2];
  var FoundURL = new RegExp("\\[" + LinkID + "\\]\\[(.+?)\\]", "g").exec(Text);
  Text = Text.replace(Match, '<a href="' + FoundURL[1] + '" rel="nofollow">' + LinkText + '</a>');
}
console.log(Text);

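【Comments】:

【Solution 7】:

This format is based on Markdown. There are several JavaScript ports available. If you don't want the whole syntax, then I'd suggest lifting just the parts related to links.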

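【Comments】:

Thanks, but I'd like to learn how to do this myself; I've been googling how to loop over each match and run another match for it.
Fair enough. It looks like some of the other answers provide code.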