【Question title】: How to replace overlapping strings in JavaScript without destroying the HTML structure
【Posted】: 2022-02-12 05:54:07
【Question】:

I have a string and an array of N items:

<div>
sometimes the fox can fly really high
</div>
const arr = ['the fox can', 'fox can fly', 'really high']

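I want to find a way to replace the text in the div with HTML that highlights those specific phrases from the array, without breaking the HTML. This is problematic because I can't do a simple loop and replace: after one replacement the other phrases no longer match, since the highlight spans break things like `indexOf` or `includes` on the `innerHTML`. I could of course read the text with `innerText`, but that gives me nothing I can use to add the "next" span without breaking the original HTML highlight. Ideally, I'd also like to be able to customize the class name depending on the phrase, rather than using one generic highlight class.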
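To make the problem concrete, here is a minimal sketch of the naive loop-and-replace approach, run on a plain string rather than a real DOM. The variable names are only illustrative. After the first replacement, the inserted `</span>` sits between "can" and "fly", so the second phrase is never found:

```javascript
// Naive approach: replace each phrase in turn on the HTML string.
const phrases = ['the fox can', 'fox can fly', 'really high'];
let html = 'sometimes the fox can fly really high';

phrases.forEach((phrase, i) => {
  // Once a span has been inserted, later phrases that overlap it no longer match.
  html = html.replace(phrase, `<span class="highlight-${i + 1}">${phrase}</span>`);
});

console.log(html);
// 'fox can fly' is left unhighlighted: no highlight-2 span appears in the output.
```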

The result should be:

<div>
sometimes
<span class="highlight-1">the <span class="highlight-2">fox can</span></span><span class="highlight-2"> fly</span> <span class="highlight-3">really high</span>
</div>

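What have I tried?

I've really thought about this and couldn't find any resource online that helps with this scenario. At the moment I also need additional values, such as the `charStart` and `charEnd` of the phrase. I don't like this solution because it relies on the `DOMParser()` API; it feels really hacky and definitely isn't performant. I just get the "vibe" that I shouldn't be doing it this way and that there must be a better solution, so I'm reaching out to SO for ideas on how to accomplish this.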

      let text = `<p id="content">${content}</p>`
      let parser = new DOMParser().parseFromString(text, "text/html")

      for (const str of strings) {
        const content = parser.querySelector("#content")
        let descLength = 0

        for (const node of content.childNodes) {
          const text = node.textContent

          let newTextContent = ""

          for (const letter in text) {
            let newText = text[letter]
            if (descLength === str.charStart) {
              newText = `<em class="highlight ${str.type}" data-id="${str.id}">${text[letter]}`
            } else if (descLength === str.charEnd) {
              newText = `${text[letter]}</em>`
            }

            newTextContent += newText
            descLength++
          }

          node.textContent = newTextContent
        }

        // Replace the &lt; with `<` and replace &gt; with `>` to construct the HTML as text inside lastHtml
        const lastHtml = parser
          .querySelector("#content")
          .outerHTML.split("&lt;")
          .join("<")
          .split("&gt;")
          .join(">")

        // Redefine the parser variable with the updated HTML and let it automatically correct the element structure
        parser = new DOMParser().parseFromString(lastHtml, "text/html")

        /**
         * Replace the placeholder `<em>` element with the span elements to prevent future issues. We need the HTML
         * to be invalid for it to be correctly fixed by DOMParser, otherwise the HTML would be valid and *not* render how we'd like it to
         * Invalid => `<span>test <em>title </span>here</em>
         * Invalid (converted) => `<span>test <em>title </em></span><em>here</em>
         * Valid => `<span>test <span>title </span>here</span>
         */

        parser.querySelector("#content").innerHTML = parser
          .querySelector("#content")
          .innerHTML.replaceAll("<em ", "<span ")
          .replaceAll("</em>", "</span>")
      }

【Comments】:

    Tags: javascript html dom


    【Solution 1】:

    I'll walk through your example just to give the idea. The code below is not a clean function; adapt it to your needs.

    const str = "sometimes the fox can fly really high";
    const arr = ['the fox can', 'fox can fly', 'really high'];
    
    // First, find the indices of start and end positions for your substrings.
    // Call them event points and push them to an array.
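    // Lowercase copy of the string so that matching is case-insensitive.
    const strLower = str.toLowerCase();
    const eventPoints = [];
    arr.forEach((a, i) => {
      const needle = a.toLowerCase();
      let index = strLower.indexOf(needle);
      // Loop so that every occurrence of the phrase is recorded, not just the first.
      while (index !== -1) {
        const tagClass = `highlight-${i + 1}`; // 1-based, matching the class names in the question
        eventPoints.push({ pos: index, className: tagClass, eventType: "start" });
        eventPoints.push({ pos: index + a.length, className: tagClass, eventType: "end" });
        index = strLower.indexOf(needle, index + 1);
      }
    });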
    
    // Sort the event points based on the position properties
    eventPoints.sort((a, b) => a.pos < b.pos ? -1 : a.pos > b.pos ? 1 : 0);
    
    // Init the final string, a stack and an index to keep track of the current position on the full string
    let result = "";
    let stack = [];
    let index = 0;
    // Loop over eventPoints
    eventPoints.forEach(e => {
        // concat the substring between index and e.pos to the result
        result += str.substring(index, e.pos);
        if (e.eventType === "start") {
            // when there is a start event, open a span
            result += `<span class="${e.className}">`;
            // keep track of which span is opened
            stack.push(e.className);
        }
        else {
            // when there is an end event, close tags opened after this one, keep track of them, reopen them afterwards
            let tmpStack = [];
            while (stack.length > 0) {
                result += "</span>";
                let top = stack.pop();
                if (top === e.className) {
                    break;
                }
                tmpStack.push(top);
            }
            while (tmpStack.length > 0) {
                let tmp = tmpStack.pop();
                result += `<span class="${tmp}">`;
                stack.push(tmp);
            }
        }
        index = e.pos;
    });
    
    result += str.substring(index, str.length)
    
    
    console.log(result);
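The same event-point idea can be wrapped in a reusable function. What follows is only a sketch under stated assumptions, not the answer's own code: the names `highlightOverlaps` and `escapeHtml` and the `{ phrase, className }` rule shape are hypothetical, chosen so that each phrase can carry its own class name as the question asks. It also escapes the text segments and, when two events share a position, closes spans before opening new ones so that no empty spans are produced.

```javascript
// Hypothetical helpers; the names and the { phrase, className } rule shape are assumptions.
function escapeHtml(s) {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function highlightOverlaps(text, rules) {
  // Assumes toLowerCase() preserves string length (true for ASCII text).
  const lower = text.toLowerCase();
  const events = [];

  for (const { phrase, className } of rules) {
    const needle = phrase.toLowerCase();
    if (needle === "") continue; // an empty phrase would match everywhere
    let at = lower.indexOf(needle);
    while (at !== -1) {
      events.push({ pos: at, className, type: "start" });
      events.push({ pos: at + needle.length, className, type: "end" });
      at = lower.indexOf(needle, at + 1);
    }
  }

  // Order by position; at equal positions, close spans before opening new ones.
  events.sort((a, b) =>
    a.pos - b.pos || (a.type === b.type ? 0 : a.type === "end" ? -1 : 1)
  );

  let out = "";
  let cursor = 0;
  const open = []; // class names of the currently open spans, innermost last

  for (const e of events) {
    out += escapeHtml(text.slice(cursor, e.pos));
    cursor = e.pos;
    if (e.type === "start") {
      out += `<span class="${escapeHtml(e.className)}">`;
      open.push(e.className);
    } else {
      // Close spans down to the matching one, then reopen those closed early.
      const reopen = [];
      while (open.length > 0) {
        out += "</span>";
        const top = open.pop();
        if (top === e.className) break;
        reopen.push(top);
      }
      while (reopen.length > 0) {
        const cls = reopen.pop();
        out += `<span class="${escapeHtml(cls)}">`;
        open.push(cls);
      }
    }
  }
  return out + escapeHtml(text.slice(cursor));
}

const html = highlightOverlaps("sometimes the fox can fly really high", [
  { phrase: "the fox can", className: "highlight-1" },
  { phrase: "fox can fly", className: "highlight-2" },
  { phrase: "really high", className: "highlight-3" },
]);
console.log(html);
```

For the example from the question, this should produce the nesting shown under "The result should be". In a browser, the returned string could be assigned to the target element's `innerHTML`, with the input read from its `textContent`.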
    

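    【Comments】:

    • Hey! Thanks for taking the time to answer this. I like your idea of using a stack plus a temporary stack to reopen the appropriate spans. The only change I had to make was adding `result += str.substring(index, str.length)` before returning `result`; otherwise the rest of the string is dropped unless the string ends with one of the array items. Thanks!
    • I did some more testing and also realized it doesn't detect a phrase that occurs more than once, but that is easy to handle by looping with `indexOf`.
    • You're welcome, glad to hear it helped. You're right on both points, thanks for the edit :)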