How to get the unescaped inner_html of a Ruby Nokogiri NodeSet? Answers

【Question title】: How to get inner_html of a Ruby Nokogiri NodeSet unescaped?
【Posted】: 2009-11-19 11:33:55
【Question description】:

I want to get the unescaped inner HTML from a Nokogiri NodeSet. Does anyone know how to do this?

【Question comments】:

Tags: ruby nokogiri

【Solution 1】:

Is there anything wrong with simply calling this?

nodeset.inner_html

【Discussion】:

【Solution 2】:

The loofah gem helped me a lot here.

【Discussion】:

【Solution 3】:

Wrap your nodes in CDATA:

def wrap_in_cdata(node)
  # Uses Nokogiri::XML::Node#content rather than #inner_html (which
  # escapes HTML entities), so nested nodes are flattened to plain text.
  node.inner_html = node.document.create_cdata(node.content)
  node
end

Nokogiri::XML::Node#inner_html escapes HTML entities, except inside CDATA sections.

require 'nokogiri'

fragment = Nokogiri::HTML.fragment "<div>Here is an unescaped string: <span>Turn left > right > straight & reach your destination.</span></div>"
puts fragment.inner_html
# <div>Here is an unescaped string: <span>Turn left &gt; right &gt; straight &amp; reach your destination.</span></div>

# Replace each span's content with a CDATA node so it is serialized as-is
fragment.xpath(".//span").each {|node| node.inner_html = node.document.create_cdata(node.content) }
puts fragment.inner_html
# <div>Here is an unescaped string: <span>Turn left > right > straight & reach your destination.</span>\n</div>
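If wrapping nodes in CDATA is not an option, another possibility (not from the answer above, only a sketch) is to decode the entities in the string that inner_html returns. The example below uses CGI.unescapeHTML from Ruby's standard library on the escaped output shown earlier; unescape_inner_html is a hypothetical helper name, not a Nokogiri method. Note that the result may no longer be valid HTML, since a bare > or & ends up in the markup.

```ruby
require "cgi/escape"

# Escaped markup, copied from the output of fragment.inner_html shown above.
escaped = "<div>Here is an unescaped string: <span>Turn left &gt; right &gt; straight &amp; reach your destination.</span></div>"

# Hypothetical helper (not part of Nokogiri): decode HTML entities such as
# &gt; and &amp; in an already serialized string. Tags are left untouched.
def unescape_inner_html(html)
  CGI.unescapeHTML(html)
end

unescaped = unescape_inner_html(escaped)
puts unescaped
```

This works on the serialized string only, so it would also decode entities that were deliberately escaped in the source document; whether that is acceptable depends on where the markup comes from.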

【Discussion】:

【Solution 4】:

Older versions of libxml2 can cause Nokogiri to return some escaped characters. I ran into this problem recently.

【Discussion】: