【Question title】: How can I manipulate a list into a column?
【Posted】: 2020-01-23 22:55:03
【Question description】:

I have some output from a Word file, shown below:

from docx2python import docx2python

Doc = docx2python('C:/Users/Sam/Data/Information.docx')
print(Doc.body[0])


[[['Event Info', '1)\tHalf (1 or 2)', '2)\tMinutes (on video)', '3)\tSeconds (on video)', '4)/tStaff, 0 = N/A)',]]]

I would like to know how to put these list items into a single column, producing the following output:

Event
Half
Minutes
Seconds
Staff

【Question comments】:

  • Is that an actual tab character (\t), or a literal backslash followed by a t?
  • What have you tried so far, and what was the result?
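
The first comment's distinction can be checked directly. The snippet below is only an illustrative sketch: the variable names `real_tab` and `literal` are made up here, and the strings are modeled on one item from the question's output.

```python
# A normal string literal: "\t" is interpreted as ONE tab character.
real_tab = "1)\tHalf (1 or 2)"

# A raw string literal: "\t" stays as TWO characters, a backslash and a 't'.
literal = r"1)\tHalf (1 or 2)"

# repr() makes the difference visible when printing.
print(repr(real_tab))
print(repr(literal))

# Only the first string can be split on a tab character.
print(real_tab.split("\t"))
print(literal.split("\t"))
```

If `repr()` of an item shows a doubled backslash, the text contains a literal backslash and splitting on `"\t"` will leave the item unchanged.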

Tags: python list doc


【Solution 1】:

Something like this?

# We'll need docx2python to read the file and regular expressions to parse it.
import re
from docx2python import docx2python

Doc = docx2python('C:/Users/Sam/Data/Information.docx')
d = Doc.body[0]

# Putting some data into d for testing.
# Remove this for actual production.
d = [[['Event Info', '1)\tHalf (1 or 2)', '2)\tMinutes (on video)', '3)\tSeconds (on video)', '4)\tStaff, 0 = N/A)',]]]

# Helper functions.

def startsWithADigit(x):
    return re.match(r"^[0-9]", x)

def getStuffAfterPotentialTabCharacter(x):
    return x.split("\t")[-1] 

def getFirstWord(x):
    return re.sub(r"([a-zA-Z]+).*", r'\1', x)


# Unwrap the two outer levels of nested lists.
l=d[0][0]

# Get stuff after potential tab characters.
p=[getStuffAfterPotentialTabCharacter(x) for x in l]

# Get the first word in each record, as that seems to be requested.
q=[getFirstWord(x) for x in p]

# Print the result.
for x in q:
    print(x)
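
A more compact variant of the same idea, offered as a rough sketch rather than the answer's own method. The names `body`, `PREFIX`, and `first_word` are made up for illustration, and the sample list is copied from the question as a stand-in for `Doc.body[0]`. It assumes each numbered item starts with `N)` followed by either a real tab or the literal `/t` seen in the question's last item.

```python
import re

# Sample data copied from the question (stand-in for Doc.body[0]).
body = [[['Event Info', '1)\tHalf (1 or 2)', '2)\tMinutes (on video)',
          '3)\tSeconds (on video)', '4)/tStaff, 0 = N/A)']]]

# Optional "N)" numbering, then an optional tab or literal "/t" separator.
PREFIX = re.compile(r"^\d+\)(?:\t|/t)?\s*")


def first_word(item):
    """Strip the numbering prefix and return the first run of letters."""
    text = PREFIX.sub("", item)
    match = re.match(r"[A-Za-z]+", text)
    # Fall back to the stripped text if it does not start with a letter.
    return match.group(0) if match else text


# body[0][0] unwraps the two outer list levels.
column = [first_word(item) for item in body[0][0]]
print("\n".join(column))
```

With the sample data this prints Event, Half, Minutes, Seconds, and Staff, one per line. Anchoring the prefix pattern at the start of the string keeps it from touching digits or parentheses later in the item.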

【Comments】:

  • Variable and function names should generally follow the lower_case_with_underscores style.