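【Posted】: 2020-04-11 00:25:57
【Question】:
I want to match these strings with a regular expression and get all the data between <div> and </div>. I have tried everything I can think of but still cannot get it to work...
Here is my regex on regex101: https://regex101.com/r/w7et0M/2
The code is as follows: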
<div> Sentinel. Winged Guardian cannot have restricted attachments.
Forced: After an attack in which Winged Guardian defends resolves, pay 1 Tactics resource or discard Winged Guardian from play.
</div>
<div>Attach to a hero.
Attached hero gains +1 Attack.
Action: Pay 1 resource from attached hero's pool to attach Dunedain Mark to another hero.
</div>
<div>Action: Search the top 5 cards of your deck for any number of Eagle cards and add them to your hand. Shuffle the other cards back into your deck.
</div>
<div>Attach to a hero.
Attached hero gains +1 Attack.
Action: Pay 1 resource from attached hero's pool to attach Dunedain Mark to another hero.
</div>
<div>Guarded.
Response: After the players quest successfully, the players may claim Signs of Gollum if it has no attached encounters. When claimed, attach Signs of Gollum to any hero committed to the quest. (Counts as a Condition attachment with: 'Forced: After attached hero is damaged or leaves play, return this card to the top of the encounter deck.')
</div>
Can anyone help me find a solution?
【Discussion】:
-
"Can anyone help me" Sure: do not use regex for this. Have you tried using an HTML parser? More about why regex isn't the right tool to parse HTML.
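For illustration, the HTML-parser route suggested here could look roughly like the sketch below. It uses Python's standard-library html.parser rather than simple.html.dom, and the class name DivTextCollector and the shortened sample string are made up for this example; they are not from the original post.

```python
from html.parser import HTMLParser


class DivTextCollector(HTMLParser):
    """Collects the text found inside each top-level <div> element."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # how many <div> elements are currently open
        self.blocks = []    # one stripped text entry per top-level <div>
        self._buf = []      # text pieces of the <div> being read

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            if self.depth == 0:
                self._buf = []  # start a fresh buffer for a new top-level <div>
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.blocks.append("".join(self._buf).strip())

    def handle_data(self, data):
        # Text of nested <div> elements is folded into the enclosing block.
        if self.depth > 0:
            self._buf.append(data)


# Hypothetical, shortened sample in the same shape as the question's input.
sample = (
    "<div>Guarded.\nResponse: claim Signs of Gollum.\n</div>\n"
    "<div>Attach to a hero.\n</div>"
)

collector = DivTextCollector()
collector.feed(sample)
collector.close()
div_texts = collector.blocks
```

Unlike a regex, this keeps working if the <div> tags carry attributes or are nested, which is the usual argument for a parser.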
-
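Well, Ahmed, that is exactly the problem. This is the link: http://hallofbeorn.com/LotR?CardSet=The+Hunt+for+Gollum. The code above with the div tags is how the page is written. I have also tried doing it with simple.html.dom, but still nothing, which is why I posted the code. I think the page is hard to scrape.
-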
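You can extract the text that matches the regular expression
(?s)(?<=<div>).*?(?=<\/div>)
Demo. (?s) enables single-line mode, which makes . match newline characters as well; not enabling single-line mode is one of the problems with your regex. (?<=<div>) is a positive lookbehind and (?=<\/div>) is a positive lookahead, so no capture group is needed.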
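As a rough sketch of how this pattern behaves, here it is applied with Python's re module. The thread does not say which language the asker uses, so Python is an assumption, and the sample text is a shortened, hypothetical version of the input from the question. The \/ escape is only needed where / is the regex delimiter, so it is written as a plain / here.

```python
import re

# Hypothetical, shortened sample in the same shape as the question's input.
html = """<div> Sentinel. Winged Guardian cannot have restricted attachments.
Forced: pay 1 Tactics resource or discard Winged Guardian from play.
</div>
<div>Attach to a hero.
Attached hero gains +1 Attack.
</div>"""

# (?s)          single-line (DOTALL) mode: . also matches newlines
# (?<=<div>)    positive lookbehind: match must start right after <div>
# .*?           lazy match, stops at the first closing tag
# (?=</div>)    positive lookahead: match must be followed by </div>
pattern = re.compile(r"(?s)(?<=<div>).*?(?=</div>)")

# With no capture groups, findall returns the full match for each <div>.
blocks = [text.strip() for text in pattern.findall(html)]

for block in blocks:
    print(block)
    print("---")
```

The lookbehind only matches a bare <div>, so tags with attributes such as <div class="..."> would be missed; that case is where an HTML parser is the safer choice.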