【Posted】: 2018-06-27 18:42:53
【Question】:
I have the following code, taken from this question, "Df groupby set comparison":
import pandas as pd
wordlist = pd.read_csv('data/example.txt', sep='\r', header=None, index_col=None, names=['word'])
wordlist = wordlist.drop_duplicates(keep='first')
# wordlist['word'] = wordlist['word'].astype(str)
wordlist['split'] = ''
wordlist['anagrams'] = ''
for index, row in wordlist.iterrows() :
    row['split'] = list(row['word'])
anaglist = wordlist['anagrams'] = wordlist['word'].apply(lambda x: ''.join(sorted(list(x))))
wordlist['anagrams'] = anaglist
wordlist = wordlist.drop(['split'], axis=1)
wordlist = wordlist['anagrams'].drop_duplicates(keep='first')
print(wordlist)
print(wordlist.dtypes)
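Some of the entries in my example.txt file seem to be read as integers, especially when the strings have different character lengths. I can't seem to force pandas to treat the data as strings using .astype(str).
What is going on?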
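To make the symptom concrete, here is a minimal sketch that assumes the file contains lines that look numeric. The contents of example.txt are not shown, so the sample data below is hypothetical. Without a dtype, read_csv infers a numeric column, and converting with .astype(str) afterwards cannot bring back what was lost at parse time (for example leading zeros). Passing dtype=str to read_csv keeps every entry as text from the start.

```python
import io
import pandas as pd

# Hypothetical stand-in for data/example.txt: every line looks numeric.
raw = "123\n0042\n7\n"

# Default behaviour: pandas infers an integer column, so '0042' becomes 42.
inferred = pd.read_csv(io.StringIO(raw), header=None, names=['word'])

# Converting afterwards only stringifies the already-parsed integers.
converted_late = inferred['word'].astype(str)

# Asking for strings at read time keeps the original text.
as_text = pd.read_csv(io.StringIO(raw), header=None, names=['word'], dtype=str)

# The anagram key from the question now sorts characters, not digits of an int.
as_text['anagrams'] = as_text['word'].apply(lambda w: ''.join(sorted(w)))

print(converted_late.tolist())
print(as_text['word'].tolist())
print(as_text['anagrams'].tolist())
```

If the real file mixes words and numbers, the result of type inference may differ from this sketch, but dtype=str should still give a column of plain strings.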
【Comments】: