【Question title】: Find and replace a href value with PowerShell?
【Posted】: 2021-11-03 23:34:38
【Question description】:

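I have an HTML file containing a large number of links. They have the form http:/oldsite/showFile.asp?doc=1234&lib=lib1 and I want to replace them with http://newsite/?lib=lib1&doc=1234

(1234 and lib1 are variable.)

Do you know how to do this?

Thanks, P

【Question discussion】:

    Tags: powershell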


    【Solution 1】:

    I don't think your examples are correct.

    http:/oldsite/showFile.asp?doc=1234&lib=lib1 should be
    http:/oldsite/showFile.asp?doc=1234&lib=lib1

    http://newsite/?lib=lib1&doc=1234 should be http://newsite?lib=lib1&doc=1234

    To do the replacement on these, you can use:

    'http:/oldsite/showFile.asp?doc=1234&lib=lib1' -replace 'http:/oldsite/showFile\.asp\?(doc=\d+)&(lib=\w+)', 'http://newsite?$2&$1'
    

    This returns http://newsite?lib=lib1&doc=1234
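
Because the question says the file holds many links and that 1234 and lib1 vary, it may help to check the same pattern against more than one link. The following is a minimal sketch; the sample markup and the values 42 and archive are made up for illustration.

```powershell
# Hypothetical sample input: two links with different doc and lib values.
$html = '<a href="http:/oldsite/showFile.asp?doc=1234&lib=lib1">one</a>' +
        '<a href="http:/oldsite/showFile.asp?doc=42&lib=archive">two</a>'

$pattern     = 'http:/oldsite/showFile\.asp\?(doc=\d+)&(lib=\w+)'
$replacement = 'http://newsite?$2&$1'

# -replace rewrites every match in the string, not only the first one.
$result = $html -replace $pattern, $replacement
$result
```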

    To replace these in a file, you can use:

    (Get-Content -Path 'X:\TheHtmlFile.html' -Raw) -replace 'http:/oldsite/showFile\.asp\?(doc=\d+)&(lib=\w+)', 'http://newsite?$2&$1' |
     Set-Content -Path 'X:\TheNewHtmlFile.html'
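
HTML files often write the ampersand inside an href as &amp;. Whether that applies to this file is an assumption, since the question does not say. If it does, the pattern could be loosened along these lines; Convert-OldLink is a hypothetical helper name used only for this sketch.

```powershell
# Convert-OldLink is a hypothetical helper name, not part of the original answer.
function Convert-OldLink {
    param([string]$Text)

    # Group 2 captures the separator as written ("&" or "&amp;") so it is reused unchanged.
    $pattern = 'http:/oldsite/showFile\.asp\?(doc=\d+)(&(?:amp;)?)(lib=\w+)'
    $Text -replace $pattern, 'http://newsite?${3}${2}${1}'
}

$plain   = Convert-OldLink 'http:/oldsite/showFile.asp?doc=1234&lib=lib1'
$encoded = Convert-OldLink 'http:/oldsite/showFile.asp?doc=1234&amp;lib=lib1'
$plain
$encoded
```

For a file, the same helper could be fed the output of Get-Content -Raw and piped to Set-Content, as in the command above.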
    

    Regex details:

    http:/oldsite/showFile        Match the characters “http:/oldsite/showFile” literally
    \.                            Match the character “.” literally
    asp                           Match the characters “asp” literally
    \?                            Match the character “?” literally
    (                             Match the regular expression below and capture its match into backreference number 1
       doc=                       Match the characters “doc=” literally
       \d                         Match a single digit 0..9
          +                       Between one and unlimited times, as many times as possible, giving back as needed (greedy)
    )                            
    &                             Match the character “&” literally
    (                             Match the regular expression below and capture its match into backreference number 2
       lib=                       Match the characters “lib=” literally
       \w                         Match a single character that is a “word character” (letters, digits, etc.)
          +                       Between one and unlimited times, as many times as possible, giving back as needed (greedy)
    )
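
The numbered backreferences described above can also be written as named groups, which some readers find easier to follow in the replacement string. This is only an alternative spelling of the same pattern; the group names doc and lib are chosen here for illustration.

```powershell
# Same pattern as above, with named groups instead of numbered ones.
$pattern = 'http:/oldsite/showFile\.asp\?(?<doc>doc=\d+)&(?<lib>lib=\w+)'

# Single quotes keep PowerShell from expanding ${lib} and ${doc} as variables;
# the regex engine resolves them as group references.
$result = 'http:/oldsite/showFile.asp?doc=1234&lib=lib1' -replace $pattern, 'http://newsite?${lib}&${doc}'
$result
```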
    

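    【Discussion】:

    • Thank you very much for the detailed explanation.
    【Solution 2】:

    Read in the file, loop over each line, replace the old value with the new value, and send the output to a new file: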

    Get-Content file.html | ForEach-Object { $_.Replace('oldsite...','newsite...') } | Out-File new-file.html
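
Note that the string method .Replace() does a literal substitution and does not interpret regular expressions, so with variable doc and lib values the per-line approach would probably need the -replace operator instead. A minimal sketch follows, with hypothetical sample lines standing in for the file contents and the pattern borrowed from Solution 1.

```powershell
# Hypothetical sample lines standing in for the output of Get-Content.
$lines = @(
    '<a href="http:/oldsite/showFile.asp?doc=7&lib=libA">first</a>',
    '<p>no link here</p>'
)

$pattern = 'http:/oldsite/showFile\.asp\?(doc=\d+)&(lib=\w+)'

# Each line goes through the regex-aware -replace operator; lines without a match pass through unchanged.
$converted = @($lines | ForEach-Object { $_ -replace $pattern, 'http://newsite?$2&$1' })
$converted
```

In a real run, $lines would come from Get-Content and $converted would be piped to Set-Content or Out-File. Setting -Encoding explicitly may be worth considering, since the default output encoding differs between Windows PowerShell and PowerShell 7.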
    

    【Discussion】:

    • Thanks. I wasn't sure how to do the replacement. I think a regex is what I need.