【Posted】:2016-04-10 10:04:27
【Question】:
I have a text that contains lines like these:
(some text)
libncursesw5-dev:amd64 depends on libc6-dev | libc-dev;(some text)
libx32ncursesw5 depends on libc6-x32 (>= 2.16);(some text)
libx32ncurses5-dev depends on libncurses5-dev (= 5.9+20150516-2ubuntu1);(some text)
libx32ncursesw5-dev depends on libc6-dev-x32;(some text)
lib32tinfo-dev depends on lib32c-dev;(some text)
Here is a complete example of the context around one of these sentences:
dpkg: error processing package lib32tinfo5 (--install):
dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of libncurses5-dev:amd64:
libncurses5-dev:amd64 depends on libc6-dev | libc-dev; however:
Package libc6-dev is not installed.
Package libc-dev is not installed.
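The whole text is divided into paragraphs like the one above, and each paragraph contains one of these sentences.
I would like a regular expression for Python's re library that, when used with findall, gives me something like this: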
('libc6-dev', '', 'libc-dev', '')
('libc6-x32','2.16')
('libncurses5-dev','5.9+20150516-2ubuntu1')
('libc6-dev-x32','')
('lib32c-dev','')
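In other words, I would like help extracting, from this kind of text, tuples that contain each package and its version (if one is specified).
I wrote this regular expression: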
(?<=depends on )([a-zA-Z0-9\-]*)(?: \([=> ]*([a-zA-Z0-9-+.]*)(?:\)))?|(?: \| )([a-zA-Z0-9\-]*)(?: \([=> ]*([a-zA-Z0-9-+.]*)(?:\)))?(?=;)
and I got this result:
('libc6-dev', '', '', '')
('', '', 'libc-dev', '')
('libc6-x32', '2.16', '', '')
('libncurses5-dev', '5.9+20150516-2ubuntu1', '', '')
('libc6-dev-x32', '', '', '')
('lib32c-dev', '', '', '')
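For reference, a minimal script that should reproduce the result above. The sample text is just the example lines from the top of the question, and the names PATTERN and sample_text are chosen here for illustration only.

```python
import re

# The pattern from the question, unchanged (split over two lines for readability).
PATTERN = re.compile(
    r"(?<=depends on )([a-zA-Z0-9\-]*)(?: \([=> ]*([a-zA-Z0-9-+.]*)(?:\)))?"
    r"|(?: \| )([a-zA-Z0-9\-]*)(?: \([=> ]*([a-zA-Z0-9-+.]*)(?:\)))?(?=;)"
)

# Example lines copied from the question; "(some text)" is a placeholder.
sample_text = """\
(some text)
libncursesw5-dev:amd64 depends on libc6-dev | libc-dev;(some text)
libx32ncursesw5 depends on libc6-x32 (>= 2.16);(some text)
libx32ncurses5-dev depends on libncurses5-dev (= 5.9+20150516-2ubuntu1);(some text)
libx32ncursesw5-dev depends on libc6-dev-x32;(some text)
lib32tinfo-dev depends on lib32c-dev;(some text)
"""

# findall returns one 4-tuple per match; groups that did not take part
# in a match are reported as empty strings.
matches = PATTERN.findall(sample_text)
for groups in matches:
    print(groups)
```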
As you can see, for the sentence:
libncursesw5-dev:amd64 depends on libc6-dev | libc-dev;
I got this answer:
('libc6-dev', '', '', '')
('', '', 'libc-dev', '')
instead of this:
('libc6-dev', '', 'libc-dev', '')
Thanks for your help.
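A likely reason for the split result: findall produces one tuple per match, and each branch of the top-level | in my pattern is matched separately, so the two alternatives of a single "depends on" clause cannot end up in the same tuple. One possible workaround, sketched here rather than offered as a definitive solution, is to work in two steps: first capture the whole clause up to the semicolon, then split it on " | " and parse each alternative on its own. The names CLAUSE_RE, ALTERNATIVE_RE and parse_dependencies are hypothetical, and the sketch assumes that every clause ends at the first ";" and that a version, when present, has the form "(operator version)".

```python
import re

# Step 1: capture everything between "depends on " and the first ";".
CLAUSE_RE = re.compile(r"depends on ([^;\n]+);")

# Step 2: one alternative is a package name, optionally followed by
# " (operator version)". The allowed characters are an assumption.
ALTERNATIVE_RE = re.compile(r"([a-zA-Z0-9.+-]+)(?: \([<>=]+ ([^)\s]+)\))?")


def parse_dependencies(text):
    """Return one flat tuple (name, version, name, version, ...) per clause."""
    results = []
    for clause in CLAUSE_RE.findall(text):
        flat = []
        for alternative in clause.split(" | "):
            match = ALTERNATIVE_RE.fullmatch(alternative.strip())
            if match is None:
                continue  # skip anything that does not look like a dependency
            # group(2) is None when no version was given; report it as "".
            flat.extend([match.group(1), match.group(2) or ""])
        results.append(tuple(flat))
    return results


# Example lines copied from the question; "(some text)" is a placeholder.
sample_text = """\
libncursesw5-dev:amd64 depends on libc6-dev | libc-dev;(some text)
libx32ncursesw5 depends on libc6-x32 (>= 2.16);(some text)
libx32ncurses5-dev depends on libncurses5-dev (= 5.9+20150516-2ubuntu1);(some text)
libx32ncursesw5-dev depends on libc6-dev-x32;(some text)
lib32tinfo-dev depends on lib32c-dev;(some text)
"""

for dependency in parse_dependencies(sample_text):
    print(dependency)
```

With this approach the length of each tuple follows the number of alternatives in the clause, which a single findall pattern with a fixed number of groups cannot do.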
【Discussion】:
Tags: python regex string python-3.x