Break string into list of characters in Python [duplicate]: answers

【Question title】: Break string into list of characters in Python [duplicate]
【Posted】: 2012-03-23 02:32:20
【Question description】:

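Basically, I want to pull a line of text from a file, assign its characters to a list, and build a list of all the separate characters in a list: a list of lists.

At the moment, I've tried this: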

fO = open(filename, 'rU')
fL = fO.readlines()

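That's all I have. I don't really know how to extract the individual characters and assign them to a new list.

The line I get from the file will look something like this: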

fL = 'FHFF HHXH XXXX HFHX'

I want to turn it into this list, with each character as its own element:

['F', 'H', 'F', 'F', 'H', ...]

【Question discussion】:

Tags: python list readlines

【Solution 1】:

You can use list:

new_list = list(fL)

Note that, as far as I know, any spaces in the line will be included in this list.
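
To make the point about spaces concrete, here is a minimal sketch using the sample line from the question. The names all_chars and letters are only illustrative, and dropping whitespace is an assumption about what the asker wants, not something the answer itself does.

```python
# Sample line taken from the question
fL = 'FHFF HHXH XXXX HFHX'

# list() keeps every character, including the spaces between the groups
all_chars = list(fL)

# Assumption: the spaces are unwanted, so filter out whitespace characters
letters = [c for c in fL if not c.isspace()]

print(len(all_chars))  # 19 (16 letters + 3 spaces)
print(letters[:5])     # ['F', 'H', 'F', 'F', 'H']
```

If the spaces should stay, list(fL) alone is enough.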

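【Discussion】:

This does not work as expected with UTF-8 characters. For the string "zyć" I expected a list of 3 characters, but I got this list instead: ['z', 'y', '\xc4', '\x87']. Could you advise how to fix this? Thanks.
I found the answer: I forgot to put 'u' in front of my string, so it was not being treated as unicode. Thanks.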
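
For readers on Python 3, where str is already Unicode and no 'u' prefix is needed, the difference described in these comments can be sketched roughly as follows. The variable names are illustrative only; the byte values correspond to the '\xc4' and '\x87' seen above.

```python
# "zyć", written with an escape so the example does not depend on source encoding
text = "zy\u0107"

# In Python 3, str holds Unicode code points, so list() yields 3 characters
chars = list(text)

# Encoding to UTF-8 gives bytes; "ć" occupies two bytes (0xC4, 0x87)
byte_values = list(text.encode("utf-8"))

print(len(chars))    # 3
print(byte_values)   # [122, 121, 196, 135]
```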

【Solution 2】:

I seem to be a bit late, but...

a = 'hello'
print(list(a))
# ['h', 'e', 'l', 'l', 'o']

【Discussion】:

【Solution 3】:

Strings are iterable (just like lists).

My interpretation is that you really want something like this:

fd = open(filename, 'rU')
chars = []
for line in fd:
    for c in line:
        chars.append(c)

or

fd = open(filename, 'rU')
chars = []
for line in fd:
    chars.extend(line)

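or

chars = []
with open(filename, 'rU') as fd:
    # map() is eager only in Python 2; in Python 3 it is lazy,
    # so this line needs list(map(...)) to take effect
    map(chars.extend, fd)

chars will contain all of the characters in the file.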

【Discussion】:

@FlexedCookie itertools.chain is really the simplest: chars = list(itertools.chain.from_iterable(open(filename, 'rU'))).
The code above does not account for spaces, i.e. " "
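
The itertools.chain suggestion from the comment above can be sketched as follows. As an assumption for the sake of a self-contained example, io.StringIO stands in for the opened file; a real file object iterates line by line in the same way.

```python
import io
from itertools import chain

# Assumption: io.StringIO stands in for open(filename); both yield lines when iterated
fd = io.StringIO("FH FX\nHX\n")

# chain.from_iterable flattens the lines (each an iterable of characters) into one stream
chars = list(chain.from_iterable(fd))

print(chars)
# ['F', 'H', ' ', 'F', 'X', '\n', 'H', 'X', '\n']
```

Note that both spaces and newline characters end up in the result, which is what the second comment is pointing at.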

【Solution 4】:

python >= 3.5

Version 3.5 and later allow the use of PEP 448 - Additional Unpacking Generalizations:

>>> string = 'hello'
>>> [*string]
['h', 'e', 'l', 'l', 'o']

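Because this is part of the language syntax rather than a function call, it comes out faster than calling list in these timings: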

>>> from timeit import timeit
>>> timeit("list('hello')")
0.3042821969866054
>>> timeit("[*'hello']")
0.1582647830073256
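
To repeat this comparison on your own machine, a small sketch along these lines should work. The iteration count and variable names are arbitrary choices, and the absolute numbers will differ from those shown above.

```python
from timeit import timeit

s = 'hello'

# Both spellings build the same list
assert [*s] == list(s)

# Time each form; globals= makes the name s visible to the timed statement
t_list = timeit("list(s)", globals={"s": s}, number=100_000)
t_unpack = timeit("[*s]", globals={"s": s}, number=100_000)

print(f"list(s): {t_list:.4f}s   [*s]: {t_unpack:.4f}s")
```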

【Discussion】:

【Solution 5】:

So, to add the string hello to a list as individual characters, try this:

newlist = []
newlist[:0] = 'hello'
print (newlist)

  ['h','e','l','l','o']

However, it is easier to do this:

splitlist = list(newlist)
print (splitlist)

【Discussion】:

But even simpler is: newlist = list('hello')
@tim Yes, I just noticed I hadn't put that in :)
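
For anyone wondering why the slice assignment in this answer works: assigning an iterable to a slice splices its items into the list one by one. A rough illustration, with made-up values:

```python
newlist = ['!']

# Assigning a string to the empty slice [:0] inserts each character at the front
newlist[:0] = 'hi'
print(newlist)  # ['h', 'i', '!']

# The same idea at the end of the list behaves like extend()
newlist[len(newlist):] = 'ok'
print(newlist)  # ['h', 'i', '!', 'o', 'k']
```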

【Solution 6】:

fO = open(filename, 'rU')
lst = list(fO.read())

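【Discussion】:

【Solution 7】:

Or use a fancy list comprehension, which is supposed to be "computationally more efficient" when working with very large files/lists: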

fd = open(filename, 'r')
# Compare with != rather than "is not": identity checks on strings are unreliable
chars = [c for line in fd for c in line if c != " "]
fd.close()

By the way: the accepted answer does not account for the whitespace...
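
One thing worth noting about the comprehension above: it filters only the space character, so newline characters from each line are still kept. The sketch below contrasts that with filtering all whitespace; io.StringIO is an assumption standing in for the opened file.

```python
import io

# Assumption: io.StringIO stands in for open(filename, 'r')
fd = io.StringIO("FH FX\nHX\n")

# Dropping only " " still leaves the trailing "\n" of every line
only_spaces_removed = [c for line in fd for c in line if c != " "]

fd.seek(0)  # rewind so the same data can be read again

# str.isspace() also catches newlines and tabs
no_whitespace = [c for line in fd for c in line if not c.isspace()]

print(only_spaces_removed)  # ['F', 'H', 'F', 'X', '\n', 'H', 'X', '\n']
print(no_whitespace)        # ['F', 'H', 'F', 'X', 'H', 'X']
```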

【Discussion】:

【Solution 8】:

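a = 'hello world'
# list() is needed in Python 3, where map() returns a lazy map object
list(map(lambda x: x, a))

['h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd']

A simple way is to use the function "map()".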

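【Discussion】:

【Solution 9】:

In Python, a lot of things are iterable, including files and strings. Iterating over a file handle gives you each line of that file in turn. Iterating over a string gives you each character of that string in turn.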

charsFromFile = []
filePath = r'path\to\your\file.txt' #the r before the string lets us use backslashes

for line in open(filePath):
    for char in line:
        charsFromFile.append(char) 
        #apply code on each character here

Or, if you want a one-liner:

#the [0] at the end is the line you want to grab.
#the [0] can be removed to grab all lines
[list(a) for a in list(open('test.py'))][0]


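Edit: as agf said, you can use itertools.chain.from_iterable

His approach is better, unless you want to specify which lines to grab: list(itertools.chain.from_iterable(open(filename, 'rU')))

However, this does require familiarity with itertools, so it loses some readability.

If you only want to iterate over the characters and don't care about storing a list, then I would use the nested for loops. That approach is also the most readable.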
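
Since the question asks for "a list of lists", it may help to see a variant that keeps one inner list per line instead of flattening everything. This is only a sketch: io.StringIO is an assumption standing in for the opened file, and stripping the trailing newline is a choice the answer does not make.

```python
import io

# Assumption: io.StringIO stands in for open(filePath)
fd = io.StringIO("FHFF HHXH\nXXXX HFHX\n")

# One inner list of characters per line, with the trailing newline removed
rows = [list(line.rstrip("\n")) for line in fd]

print(len(rows))    # 2
print(rows[0][:4])  # ['F', 'H', 'F', 'F']
```

Indexing rows[n] then gives the characters of line n, which is what the [0] in the one-liner above selects.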

【Discussion】:

【Solution 10】:

Because strings are (immutable) sequences, they can be unpacked just like lists:

with open(filename, 'rU') as fd:
    multiLine = fd.read()
    *lst, = multiLine

Running map(lambda x: x, multiLine) is clearly more efficient on its own, but it actually returns a map object rather than a list.

with open(filename, 'rU') as fd:
    multiLine = fd.read()
    list(map(lambda x: x, multiLine))

Converting the map object to a list takes longer than the unpacking approach.
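
As a quick check that the two approaches in this answer give the same result, here is a small sketch with a literal string in place of the file contents (an assumption made so the example runs on its own):

```python
# Assumption: a literal string stands in for fd.read()
multiLine = "ab\ncd"

# Starred assignment unpacks the string into a list of characters
*lst, = multiLine

# map() is lazy in Python 3, so list() is needed to materialise the result
mapped = list(map(lambda x: x, multiLine))

print(lst)            # ['a', 'b', '\n', 'c', 'd']
print(lst == mapped)  # True
```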

【Discussion】: