Insert value in pandas DataFrame: answers

【Question title】: Insert value in panda dataframe
【Posted】: 2017-06-13 18:16:39
【Question description】:

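I have data in an Excel sheet. I want to check the values of one column against a range, and if a value falls within that range (5000-15000), insert a value (Correct or Flag) into another column.

I have three columns: city, rent, status.

I tried the append and insert methods, but they did not work. How should I do this?

Here is my code: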

for index, row in df.iterrows():
    if row['city']=='mumbai':
        if 5000<= row['rent']<=15000:
            pd.DataFrame.append({'Status': 'Correct'})

It shows this error:

TypeError: append() missing 1 required positional argument: 'other'

What procedure should I follow to insert data into a column row by row?

【Question discussion】:

Tags: python excel pandas

【Solution 1】:

I think you can use numpy.where with a boolean mask built by comparing city and calling between on rent:

mask = (df['city']=='mumbai') & df['rent'].between(5000,15000)
df['status'] = np.where(mask, 'Correct', 'Flag')

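Example:

import numpy as np
import pandas as pd

df = pd.DataFrame({'city':['mumbai','mumbai','mumbai', 'a'],
                   'rent':[1000,6000,10000,10000]})
# True only where the city matches and the rent is inside the inclusive range
mask = (df['city']=='mumbai') & df['rent'].between(5000,15000)
df['status'] = np.where(mask, 'Correct', 'Flag')
print(df)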
     city   rent   status
0  mumbai   1000     Flag
1  mumbai   6000  Correct
2  mumbai  10000  Correct
3       a  10000     Flag

Another solution, with loc:

mask = (df['city']=='mumbai') & df['rent'].between(5000,15000)
df['status'] = 'Flag'
df.loc[mask, 'status'] =  'Correct'
print (df)
     city   rent   status
0  mumbai   1000     Flag
1  mumbai   6000  Correct
2  mumbai  10000  Correct
3       a  10000     Flag
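
As for the error in the question: append is called there on the class pd.DataFrame rather than on a DataFrame instance, so the dict is taken as self and other is missing. DataFrame.append also added rows rather than cell values, and it was removed in pandas 2.0. If a row-by-row loop is really wanted, a minimal sketch (assuming the same sample data as above, with a lowercase status column) could write each cell with DataFrame.at; the vectorized mask versions above are usually preferable for speed:

```python
import pandas as pd

df = pd.DataFrame({'city': ['mumbai', 'mumbai', 'mumbai', 'a'],
                   'rent': [1000, 6000, 10000, 10000]})

# Start with the default label for every row.
df['status'] = 'Flag'

for index, row in df.iterrows():
    # Same condition as the loop in the question.
    if row['city'] == 'mumbai' and 5000 <= row['rent'] <= 15000:
        # Write into the DataFrame itself; `row` is only a copy of the data.
        df.at[index, 'status'] = 'Correct'

print(df)
```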

To write to Excel use to_excel; if you need to drop the index column, add index=False:

df.to_excel('file.xlsx', index=False)
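
Note that to_excel writes only the columns present in df, so the other columns of the workbook survive only if the whole sheet was loaded first. A rough sketch of that round trip follows; the helper names add_status and update_workbook, the file paths, and the extra owner column are hypothetical, and the rewritten file will not keep cell formatting or additional sheets:

```python
import numpy as np
import pandas as pd


def add_status(df):
    """Return a copy of df with a 'status' column filled from the mask."""
    out = df.copy()
    mask = (out['city'] == 'mumbai') & out['rent'].between(5000, 15000)
    out['status'] = np.where(mask, 'Correct', 'Flag')
    return out


def update_workbook(src_path, dst_path):
    """Read the whole first sheet, fill 'status', write every column back."""
    df = pd.read_excel(src_path)  # loads all columns of the first sheet
    add_status(df).to_excel(dst_path, index=False)


# In-memory check with a hypothetical extra column ('owner').
sample = pd.DataFrame({'city': ['mumbai', 'a'],
                       'rent': [6000, 10000],
                       'owner': ['x', 'y']})
result = add_status(sample)
print(result.columns.tolist())
```

Calling update_workbook('file.xlsx', 'file_with_status.xlsx') would then produce a file holding every original column plus the filled status column; writing to a separate destination path leaves the source file untouched.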

Edit:

For multiple masks you can use:

df = pd.DataFrame({'city':['Mumbai','Mumbai','Delhi', 'Delhi', 'Bangalore', 'Bangalore'],
                   'rent':[1000,6000,10000,1000,4000,5000]})
print (df)
        city   rent
0     Mumbai   1000
1     Mumbai   6000
2      Delhi  10000
3      Delhi   1000
4  Bangalore   4000
5  Bangalore   5000

m1 = (df['city']=='Mumbai') & df['rent'].between(5000,15000)
m2 = (df['city']=='Delhi') & df['rent'].between(1000,5000)
m3 = (df['city']=='Bangalore') & df['rent'].between(3000,5000)

m = m1 | m2 | m3
print (m)
0    False
1     True
2    False
3     True
4     True
5     True
dtype: bool

from functools import reduce
mList = [m1,m2,m3]
m = reduce(lambda x,y: x | y, mList)
print (m)
0    False
1     True
2    False
3     True
4     True
5     True
dtype: bool

print (df[m])
        city  rent
1     Mumbai  6000
3      Delhi  1000
4  Bangalore  4000
5  Bangalore  5000
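
When the number of cities grows, the masks can also be generated from a mapping instead of being written out one by one. This is only a sketch: the ranges dict is an assumed way to hold the same bounds as m1, m2 and m3, and np.logical_or.reduce stands in for the functools.reduce call above:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'city': ['Mumbai', 'Mumbai', 'Delhi', 'Delhi', 'Bangalore', 'Bangalore'],
                   'rent': [1000, 6000, 10000, 1000, 4000, 5000]})

# Accepted rent range per city (same bounds as m1, m2, m3).
ranges = {'Mumbai': (5000, 15000),
          'Delhi': (1000, 5000),
          'Bangalore': (3000, 5000)}

# One boolean mask per city.
masks = [(df['city'] == city) & df['rent'].between(low, high)
         for city, (low, high) in ranges.items()]

# OR all masks together element-wise; the result is a NumPy boolean array.
m = np.logical_or.reduce(masks)

df['status'] = np.where(m, 'Correct', 'Flag')
print(df)
```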

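【Discussion】:

It shows the correct result, but it does not write the data to my Excel sheet.
It deletes my previous sheet data. It only inserts one column, status. Please help with this case.
Hmm, are the data and df different? If the original data is overwritten with the same data and only the new column is added, what is the problem? Can you explain more?
Actually, I have an Excel file with 50 columns, and status is one of the empty columns. I want to insert values into this column of the existing file, but it deletes all the other columns.
Hmm, that is strange. With mask = (df['city']=='mumbai') & df['rent'].between(5000,15000) and `df['status'] = np.where(mask, 'Correct', 'Flag')` there is no reason for the other columns to be deleted...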