【Posted】: 2020-06-17 22:28:27
【Question】:
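I have 400 files, each containing many lines. I want to find one specific line in each file and extract/print only part of it.
The line I want to find is:
Full seesion name: T27I5E8_S1_N005_V004
and I want to print only:
S1_V004
I tried: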
import os
import re

for filename in os.listdir(data_directory):
    with open(data_directory + "/" + filename) as file:
        for line in file:
            # re.search only tests whether the pattern occurs in the line
            if re.search(r'([S][\d])|([V][\d]{3})', line):
                print(line)
But that prints the whole line. I also tried:
subjID = re.compile(r'([S][\d])|([V][\d]{3})')
for filename in os.listdir(data_directory):
    with open(data_directory + "/" + filename) as file:
        for line in file:
            print(subjID.findall(line))
But the output looks like:
[]
[]
[]
[]
[('S1', ''), ('', 'V094')]
[]
[]
[]
[]
[]
[]
[]
[('S1', ''), ('', 'V094')]
[]
[]
[]
[]
[]
[]
[]
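The tuples and empty lists in this output follow from how `re.findall` treats capture groups: with two groups joined by `|`, each match is returned as a tuple with an empty string for the group that did not take part, and a line with no match yields `[]`. A minimal sketch of that behaviour, using the sample line from the question (the variable names are illustrative only):

```python
import re

# Sample line from the question, with the trailing newline a file would supply.
line = "Full seesion name: T27I5E8_S1_N005_V004\n"

# Two capture groups joined by "|": one tuple per match, with an empty
# string for the group that did not participate in that match.
grouped = re.findall(r'(S\d)|(V\d{3})', line)

# Same alternation without capture groups: findall returns the matched text.
flat = re.findall(r'S\d|V\d{3}', line)

print(grouped)
print(flat)
print("_".join(flat))
```

Dropping the parentheses (or making the groups non-capturing) is one way to get plain strings that can then be joined with `"_"`.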
【Comments】:
- It prints the whole line because of print(line). Try print("_".join(re.findall(r'(?<![^_])[SV]\d+(?![^_])', line)))
- That only prints the S part, not the V part.
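A likely reason the V part goes missing: lines read from a file keep their trailing newline, so `V004` is followed by `"\n"` rather than by `_` or the end of the string, and the lookahead `(?![^_])` rejects it. A minimal sketch that strips the line before matching, reusing the pattern from the comment; the helper names `extract_session_id` and `print_session_ids` are hypothetical, and it assumes no other line in the files contains a `_`-delimited `S<digits>` or `V<digits>` token:

```python
import os
import re

# Pattern from the comment: S or V followed by digits, delimited by "_"
# or by the start/end of the string.
TOKEN = re.compile(r'(?<![^_])[SV]\d+(?![^_])')


def extract_session_id(line):
    """Return e.g. 'S1_V004' for a session line, or None if nothing matches."""
    # strip() removes the trailing newline so the last token ends the string.
    tokens = TOKEN.findall(line.strip())
    return "_".join(tokens) if tokens else None


def print_session_ids(data_directory):
    """Print the extracted ID for every matching line in every file (hypothetical helper)."""
    for filename in os.listdir(data_directory):
        with open(os.path.join(data_directory, filename)) as file:
            for line in file:
                session_id = extract_session_id(line)
                if session_id is not None:
                    print(session_id)
```

Calling `print_session_ids(data_directory)` would then print one ID per matching line instead of a list for every line in the file.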
Tags: python regex pandas extract