Using regular expression in Python to generate a list: answers

【Question title】: Using regular expression in Python to generate a list
【Posted】: 2012-03-16 08:52:43
【Question】:

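I have a small script that generates an image with a predefined font and text. I want to change it so that it renders the same text in several fonts, for example the letter A in 5 fonts. I define my font list as: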

fonts = [ 'Georgia', 'Consolas', 'Arial']

Then I use it in:

for item in enumerate(fonts) :
 ...

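I would like to generate a list containing every font of, for example, the Times New Roman family. I tried to generate the list with a regular expression, but had no luck. I really don't know how to embed it in the list (with quotes, with / at the start and end, and so on).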

I tried things like fonts = [ '/^.Times.*$/' ] and fonts = [ '/Times.*/g' ], without success.

A second problem appears when I want to use a three-word font name such as Luicida Console Regular. Then I get this error:

C:\Users\xxx\Desktop\test.py:46: PangoWarning: couldn't load font "Luicida Console 40", falling back to "Sans 40", expect ugly output.
  pangctx.show_layout (layout)

It looks like the script takes only two words from the font name.

Edit

def main ():
    surface = cairo.ImageSurface (cairo.FORMAT_ARGB32, WIDTH, HEIGHT)
    context = cairo.Context (surface) 
    source  = context.get_source ()
    font    = sys.argv[1]

    fonts    = [ 'Georgia', 'Consolas', 'Arial',  'Lucida Console', 'Times New Roman' ]
    output  = sys.argv[2]
    text    = sys.argv[3]

    background = cairo.SolidPattern (255, 255, 255)
    context.rectangle (0, 0, WIDTH, HEIGHT)
    context.set_source (background)
    context.fill ()

    pangctx = pangocairo.CairoContext (context)

    layout  = pangctx.create_layout () 
    layout.set_width ((WIDTH - 2 * PADDING) * pango.SCALE)
    layout.set_single_paragraph_mode (True)
    layout.set_wrap (pango.WRAP_CHAR)

    size    = 40 * pango.SCALE
    spacing = 10 * pango.SCALE
    markup = ''
    for index, item in enumerate(fonts):
        print index, item
        markup  += '<span font="'+ item +'" size="' + str(size) + '" letter_spacing="' + str(spacing) + '">' + text +'</span>'
    layout.set_markup (markup)
    pangctx.update_layout (layout)

    context.new_path ()
    context.move_to (PADDING, PADDING)
    context.set_source (source)
    context.set_source_rgb (0, 0, 0)

    pangctx.show_layout (layout)
    surface.write_to_png (output)

Edit: it seems this is still a bug in pango (launchpad link).

【Comments】:

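Why do you think a regex is valid there, and what exactly is "Luicida Console"?
@IgnacioVazquez-Abrams "Luicida Console Regular" is a font, at least on my system (Win7). What do you mean by 'regex is valid there'?
No, it isn't. What makes you believe that a regex will be used by whatever you are passing it to?
@IgnacioVazquez-Abrams Sorry, then. My mistake.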

Tags: python regex list variables python-2.7

【Solution 1】:

Get a list of all available font names:

fonts = [f.get_name() for f in layout.get_context().list_families()]

Keep only the fonts that match a regular expression, for example those whose names contain mono or space (case-insensitive):

mono_fonts = filter(re.compile(r'(?i)mono|space').search, fonts)
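The filtering step can be tried without Pango. The following is a minimal sketch, assuming a hard-coded list of font names in place of the result of list_families(); the names in it are hypothetical. It uses a list comprehension so that the result is a real list on both Python 2 and Python 3.

```python
import re

# Hypothetical font names standing in for the output of list_families().
fonts = ["DejaVu Sans Mono", "Georgia", "Monospace", "Times New Roman", "Arial"]

# (?i) makes the pattern case-insensitive; search() matches anywhere in the name.
pattern = re.compile(r"(?i)mono|space")

# Keep only the names in which the pattern is found.
mono_fonts = [name for name in fonts if pattern.search(name)]

print(mono_fonts)  # ['DejaVu Sans Mono', 'Monospace']
```

The regular expression is applied by the Python code to each name; the list itself still holds plain strings, which is what the markup expects.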

By the way, string formatting may be more readable than the + operator:

markup += '<span font="{}" size="{}" letter_spacing="{}">{}</span>'.format(
                       item, size, spacing, text)
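As a rough illustration of the formatting approach, the sketch below wraps it in a helper. The name build_span is hypothetical, and the escaping of the text and of the attribute value is an addition that the original answer does not include; it is there only because markup is XML-like and characters such as & would otherwise break it.

```python
from xml.sax.saxutils import escape, quoteattr


def build_span(font, size, spacing, text):
    """Return one markup <span> for `text` (hypothetical helper)."""
    # quoteattr() adds the surrounding quotes and escapes the attribute value;
    # escape() protects &, < and > inside the text itself.
    return '<span font={} size="{}" letter_spacing="{}">{}</span>'.format(
        quoteattr(font), size, spacing, escape(text))


fonts = ["Georgia", "Times New Roman"]

# Illustrative numbers; the question uses 40 * pango.SCALE and 10 * pango.SCALE.
size, spacing = 40960, 10240

# Joining the pieces once avoids repeated += on a growing string.
markup = "".join(build_span(f, size, spacing, "A & B") for f in fonts)
print(markup)
```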

The font attribute seems to work for me:

>>> import cairo
>>> import pango
>>> cairo.version
'1.8.8'
>>> pango.version_string()
'1.29.3'

【Comments】:

【Solution 2】:

First, there seems to be a typo in your attempt. It is "Lucida", not "Luicida".

Second, you seem to be using Pango? Why not use it to list all available font variants? See, for example, the pygtk tutorial.

Edit: looking at your code and the Pango Markup Language reference, it seems that "font" is not a valid attribute. Use "font_family" instead.

【Comments】:

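Thanks for pointing that out. I still have the Times New Roman problem, though: the script does not read the word Roman (pastebin).
Could you edit the question to show the full code, or at least the Pango calls you use?
You probably mean font_desc, not font_family. That said, font and font_desc produce the same result.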

【Solution 3】:

Filter the list fonts based on a regex match.

import re
r = re.compile("Times.*")
for item in enumerate(f for f in fonts if r.match(f)):
    ...

In your case, you can simply check whether the string contains the substring:

for item in enumerate(f for f in fonts if "Times" in f):
    ...
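The difference between re.match, re.search and a plain substring test is easy to miss, so here is a small sketch of all three, assuming a hypothetical list of font names. match() only succeeds at the start of the string, while search() and the in operator accept the text at any position.

```python
import re

# Hypothetical font names used only for illustration.
fonts = ["Times", "Times New Roman", "New Times", "Arial"]

r = re.compile("Times.*")

starts_with = [f for f in fonts if r.match(f)]    # anchored at the start
anywhere = [f for f in fonts if r.search(f)]      # matches at any position
substring = [f for f in fonts if "Times" in f]    # plain containment, no regex

print(starts_with)  # ['Times', 'Times New Roman']
print(anywhere)     # ['Times', 'Times New Roman', 'New Times']
print(substring)    # ['Times', 'Times New Roman', 'New Times']
```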

You need to provide more details about the second error.

【Comments】:

【Solution 4】:

This produces a list of all fonts in the fonts list whose names start with "Times":

timesFonts = filter(lambda x: re.match(r'^Times.*', x), fonts)
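One caveat when running this outside Python 2: on Python 3, filter() returns a lazy iterator rather than a list. A minimal sketch, assuming a hypothetical list of font names, that yields a list on either version and also shows a regex-free prefix test:

```python
import re

# Hypothetical font names used only for illustration.
fonts = ["Times", "Times New Roman", "Georgia"]

# list() forces the iterator on Python 3 and is harmless on Python 2.
times_fonts = list(filter(lambda name: re.match(r"^Times", name), fonts))

# Equivalent prefix test without a regular expression.
times_fonts_plain = [name for name in fonts if name.startswith("Times")]

print(times_fonts)
```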

【Comments】: