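【Question title】: Python Regex - Replacing IP addresses
【Posted】: 2020-08-09 07:17:24
【Question】:

I'm new to Python and regular expressions, and I've been trying to mask the IP addresses in a log stored in a txt file. I should avoid for loops and if checks where possible, because the txt file is large (158 MB).

(All the IP addresses start with 172.)

Here is the code I tried: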

import re
txt = "test"
x = re.sub(r"^172\.*", "XXX.\", txt)
print(x)

Sample txt file:

ABCDEFGHIJKLMNOPRST172.12.65.10RSTUVYZ
ASDG172.56.23.14FSDGHSFSDFDSFHSF
!'^%%&!'+!'+^%&!ÂSDBSDF172.23.23.23SADASFSA
ASGFGD 172.12.23.56 ASDSAFASFDASSADSA

Expected output:

ABCDEFGHIJKLMNOPRSTXXX.XXX.XXX.XXXRSTUVYZ
ASDGXXX.XX.XX.XXFSDGHSFSDFDSFHSF
!'^%%&!'+!'+^%&!ÂSDBSDFXXX.XXX.XXX.XXXSADASFSA
ASGFGD XXX.XXX.XXX.XXX ASDSAFASFDASSADSA

【Comments】:

  • re.sub(r'(172\.\d{1,3}\.\d{1,3}\.\d{1,3})', "XXX.XXX.XXX.XXX", text)
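  • Another question: in the declaration I assigned "test" as a string to the txt variable, but I want to read the text from a file. What should I do? I wrote it like this: txt = open("test.txt", "r+") x = re.sub(r'(172\.\d{1,3}\.\d{1,3}\.\d{1,3})', "XXX.XXX.XXX.XXX", txt) but it raises a type error: TypeError: expected string or bytes-like object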
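
On the TypeError in the last comment: open() returns a file object, while re.sub expects a string (or bytes), so the file's text has to be read first. A minimal sketch, assuming a UTF-8 log and that an address never spans a line break; the function name mask_file and the file names in the usage note are made up for illustration:

```python
import re

# Pattern from the comment above, written more compactly:
# 172 followed by three ".<1-3 digits>" groups.
IP_PATTERN = re.compile(r"172(?:\.\d{1,3}){3}")

def mask_file(src_path, dst_path, mask="XXX.XXX.XXX.XXX"):
    """Copy src_path to dst_path, masking every matching address.

    Hypothetical helper. It streams line by line, so memory use stays
    flat even for a large file. The encoding is an assumption.
    """
    with open(src_path, "r", encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            # re.sub receives a str here (one line), not the file object.
            dst.write(IP_PATTERN.sub(mask, line))
```

It would be called as mask_file("test.txt", "masked.txt"). The loop here only iterates over lines; the matching itself is done by the regex engine. Reading everything at once with src.read() and a single sub call should also work if the whole file fits comfortably in memory.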

Tags: python regex


【Solution 1】:

Use: 172(?:\.\d{1,3}){3}

Code:

import re

string = r'''ABCDEFGHIJKLMNOPRST172.12.65.10RSTUVYZ
ASDG172.56.23.14FSDGHSFSDFDSFHSF
!'^%%&!'+!'+^%&!SDBSDF172.23.23.23SADASFSA
ASGFGD 172.12.23.56 ASDSAFASFDASSADSA'''

# Match 172 followed by three ".<1-3 digits>" groups and mask the whole address.
print(re.sub(r'172(?:\.\d{1,3}){3}', "XXX.XXX.XXX.XXX", string))

Output:

ABCDEFGHIJKLMNOPRSTXXX.XXX.XXX.XXXRSTUVYZ
ASDGXXX.XXX.XXX.XXXFSDGHSFSDFDSFHSF
!'^%%&!'+!'+^%&!SDBSDFXXX.XXX.XXX.XXXSADASFSA
ASGFGD XXX.XXX.XXX.XXX ASDSAFASFDASSADSA
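
The second line of the question's expected output reads XXX.XX.XX.XX for 172.56.23.14, which may mean that each digit should become one X (the question is not consistent on this, so treat it as an assumption). If that is the intent, re.sub also accepts a function as the replacement. A small sketch, where mask_digits is a made-up helper name:

```python
import re

IP_PATTERN = re.compile(r"172(?:\.\d{1,3}){3}")

def mask_digits(match):
    # Replace every digit of the matched address with X and keep the dots,
    # so the masked address has the same length as the original.
    return re.sub(r"\d", "X", match.group(0))

line = "ASDG172.56.23.14FSDGHSFSDFDSFHSF"
print(IP_PATTERN.sub(mask_digits, line))  # ASDGXXX.XX.XX.XXFSDGHSFSDFDSFHSF
```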


【Comments】:

    【Solution 2】:

    You should indeed use re.sub:

    re.sub(r"(172)(\.(?:[0-9]{1,3}\.){2}[0-9]{1,3})", "XXX.XXX.XXX.XXX", tested_addr)
    

    An explanation of the regex (you don't need the groups for your requirement, but they are a good way to understand the parts of the regex):

    ^(172)(\.(?:[0-9]{1,3}\.){2}[0-9]{1,3})$

    ^ asserts position at start of a line
    1st Capturing Group (172)
    172 matches the characters 172 literally (case sensitive)
    2nd Capturing Group (\.(?:[0-9]{1,3}\.){2}[0-9]{1,3})
    \. matches the character . literally (case sensitive)
    Non-capturing group (?:[0-9]{1,3}\.){2}
    {2} Quantifier — Matches exactly 2 times
    Match a single character present in the list below [0-9]{1,3}
    {1,3} Quantifier — Matches between 1 and 3 times, as many times as possible, giving back as needed (greedy)
    0-9 a single character in the range between 0 (index 48) and 9 (index 57) (case sensitive)
    \. matches the character . literally (case sensitive)
    Match a single character present in the list below [0-9]{1,3}
    {1,3} Quantifier — Matches between 1 and 3 times, as many times as possible, giving back as needed (greedy)
    0-9 a single character in the range between 0 (index 48) and 9 (index 57) (case sensitive)
    $ asserts position at the end of a line
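
The breakdown above describes the anchored form of the pattern (with ^ and $), while the re.sub call in this answer uses it without anchors. A small check, using a sample line from the question, of why the anchors have to be dropped when the address is embedded in other text (the variable names are illustrative only):

```python
import re

octets = r"(172)(\.(?:[0-9]{1,3}\.){2}[0-9]{1,3})"
anchored = re.compile("^" + octets + "$")
unanchored = re.compile(octets)

embedded = "ASGFGD 172.12.23.56 ASDSAFASFDASSADSA"
bare = "172.12.23.56"

# The anchored form only matches when the address is the entire string.
print(anchored.search(embedded))       # None
print(anchored.search(bare).groups())  # ('172', '.12.23.56')

# Without the anchors the address is found anywhere in the line.
print(unanchored.sub("XXX.XXX.XXX.XXX", embedded))  # ASGFGD XXX.XXX.XXX.XXX ASDSAFASFDASSADSA
```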
    

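    【Comments】:

    • That's a nice regex, but how do you replace with xxx?
    • @Toto Updated with a complete solution
    • That doesn't give the expected result, (re)read the question.
    • @Toto It works now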