[Posted]: 2019-04-10 14:11:41
[Question]:
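I have a list of website links that are identical except for the year, and the year is the part I want to extract. I'm using re.match to find it, since the strings are exactly the same apart from 4 characters (20xx). For some reason it only returns None, and I can't figure out why.
I have also tried other re methods such as findall and fullmatch, but those didn't work either.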
import re

state_links = ["https://2009-2017.state.gov/r/pa/prs/ps/2009/index.htm",
               "https://2009-2017.state.gov/r/pa/prs/ps/2010/index.htm",
               "https://2009-2017.state.gov/r/pa/prs/ps/2011/index.htm",
               "https://2009-2017.state.gov/r/pa/prs/ps/2012/index.htm",
               "https://2009-2017.state.gov/r/pa/prs/ps/2013/index.htm",
               "https://2009-2017.state.gov/r/pa/prs/ps/2014/index.htm",
               "https://2009-2017.state.gov/r/pa/prs/ps/2015/index.htm",
               "https://2009-2017.state.gov/r/pa/prs/ps/2016/index.htm"]

for link in state_links:
    # re.match returns a match object, or None when the pattern does not match
    year = re.match(r"https://2009-2017.state.gov/r/pa/prs/ps/(.*)/index.htm", link)
    print(year)
[Comments]:
-
It works fine for me; please check again.
-
You should escape all the . characters in the regex. In this case, though, it shouldn't make any difference.
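To illustrate the escaping suggestion, here is a minimal sketch of one way to pull the year out of each link. It assumes the year is always four digits, so it swaps the question's (.*) for (\d{4}). The names YEAR_PATTERN and extract_year are made up for this example, and the list is shortened to two of the links from the question.

```python
import re

# Shortened sample; both URLs are taken from the question's list.
state_links = [
    "https://2009-2017.state.gov/r/pa/prs/ps/2009/index.htm",
    "https://2009-2017.state.gov/r/pa/prs/ps/2016/index.htm",
]

# Escaped dots match a literal "." only; (\d{4}) captures the year.
# Assumption: the year is always exactly four digits.
YEAR_PATTERN = re.compile(
    r"https://2009-2017\.state\.gov/r/pa/prs/ps/(\d{4})/index\.htm"
)


def extract_year(link):
    """Return the captured year as a string, or None if the link does not match."""
    match = YEAR_PATTERN.fullmatch(link)
    return match.group(1) if match else None


years = [extract_year(link) for link in state_links]
print(years)
```

Printing the match object itself shows the whole match; calling group(1) on it returns only the captured year. Checking for None before calling group avoids an AttributeError on links that do not fit the pattern.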
Tags: regex python-3.x