Regexp parsing problem: answers

【Question title】: Regexp parsing problem
【Posted】: 2013-09-30 20:13:57
【Question description】:

There is a block div containing an unknown number of links of the form "a href onclick". If there is more than one link, they are separated by a comma and a space.

var reg = /<div class="labeled fl_l"><a href="[^"]*" onclick="[^"]*">(.+?)<\/a>(, <a href="[^"]*" onclick="[^"]*">(.+?)<\/a>{1,})?<\/div>/mg;
var arr;
while ((arr = reg.exec(data)) != null) {
    console.log(arr[0]); // contains the entire matched text (because it is JavaScript)
    console.log(arr[1]); // contains the name of the first link
    console.log(arr[2]); // contains the following "a href" in its entirety (if I use (?: , <a ... /a>), the nested brackets do not work)
    console.log(arr[3]); // contains the name of the second link, **and then all of the code**
}

I think I should use ([^<]*) instead of (.+?), but that does not work at all.

【Question comments】:

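I'll stop you right there. It's time you knew the truth... stackoverflow.com/questions/1732348/…
In addition to plalx's comment: if you are going to ask a question about regular expressions, at least give a sample of the source and say what you want to extract from it. Then forget regular expressions and use the DOM manipulation functions in JavaScript, which are designed for this purpose.
Don't parse HTML with regular expressions. You are in JavaScript, so can't you access the DOM elements themselves? The markup has already been parsed; you only need to access it.
Start with getElementsByClassName to get the parent div, then getElementsByTagName to get all the A tags. You can then use getAttributeNode to get any attribute you are interested in.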
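
The DOM approach suggested in the comments above can be sketched roughly as follows. This is only an illustration, not code from the thread: the function name getLinkNames and its root parameter are hypothetical, and it assumes root is a Document or Element that already contains the markup (for example the global document in a browser).

```javascript
// Sketch: collect the text of every link inside each <div class="labeled fl_l">.
// `root` is assumed to be a Document or Element; `getLinkNames` is a hypothetical name.
function getLinkNames(root) {
  var result = [];
  // A space-separated argument matches elements that carry all listed classes.
  var divs = Array.from(root.getElementsByClassName('labeled fl_l'));
  divs.forEach(function (div) {
    // Every <a> descendant of this div, in document order.
    var links = Array.from(div.getElementsByTagName('a'));
    result.push(links.map(function (a) { return a.textContent; }));
  });
  return result; // one array of link names per matching div
}
```

In a browser this would be called as getLinkNames(document). If the HTML is only available as a string, as data is in the question, it could probably be parsed first with new DOMParser().parseFromString(data, 'text/html') and the resulting document passed in.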

Tags: javascript regex parsing

【Solution 1】:

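If regular expressions were the ideal tool here (they aren't), I would use two separate expressions: one to find everything between <div class="labeled fl_l"> and </div>, and another to find each link within that.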
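
A minimal sketch of that two-expression idea is shown below. It is an illustration under assumptions, not the answerer's code: the function name extractLinkNames is hypothetical, and it assumes the divs are not nested and that each link has exactly the attributes href and onclick, in that order, as in the question's pattern.

```javascript
// Sketch of the two-pass idea; `extractLinkNames` is a hypothetical name.
// Assumes no nested <div> inside the block and links written exactly as
// <a href="..." onclick="...">name</a>.
function extractLinkNames(html) {
  // Pass 1: capture everything between the opening and closing div tags.
  var divRe = /<div class="labeled fl_l">([\s\S]*?)<\/div>/g;
  // Pass 2: capture the text of each link inside one captured block.
  var linkRe = /<a href="[^"]*" onclick="[^"]*">(.+?)<\/a>/g;
  var result = [];
  for (var divMatch of html.matchAll(divRe)) {
    var names = [];
    for (var linkMatch of divMatch[1].matchAll(linkRe)) {
      names.push(linkMatch[1]);
    }
    result.push(names);
  }
  return result; // one array of link names per div
}
```

Splitting the work this way avoids the repeated capture group from the question: a capture group inside a repetition only keeps its last match, whereas running the link pattern separately yields every link.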

However, regular expressions aren't the right tool for the job. You may want to consider using XPath to iterate over the links.

【Comments】: