It seems you only need to multiply the columns:
df['totalbytes'] = df['bytesbytes']*df['bytesfrequency']
Or use mul:
df['totalbytes'] = df['bytesbytes'].mul(df['bytesfrequency'])
Sample:
import pandas as pd

df = pd.DataFrame({'bytesbytes':[3985,1420,0,0],
                   'bytesfrequency':[2,6,2,2]})
df['totalbytes'] = df['bytesbytes']*df['bytesfrequency']
print (df)
   bytesbytes  bytesfrequency  totalbytes
0        3985               2        7970
1        1420               6        8520
2           0               2           0
3           0               2           0
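On complete data the * operator and mul give the same result. As a minimal sketch of where they can differ, assuming some bytesbytes values are missing (the NaN and the names df_nan, plain and filled below are made up for illustration and are not part of the original data), mul accepts a fill_value argument that replaces missing entries before multiplying:

```python
import numpy as np
import pandas as pd

# Illustrative data: the NaN is an assumption added to show the difference.
df_nan = pd.DataFrame({'bytesbytes': [3985, 1420, np.nan, 0],
                       'bytesfrequency': [2, 6, 2, 2]})

# Plain multiplication propagates the missing value as NaN.
plain = df_nan['bytesbytes'] * df_nan['bytesfrequency']

# mul with fill_value=0 treats the missing value as 0 before multiplying.
filled = df_nan['bytesbytes'].mul(df_nan['bytesfrequency'], fill_value=0)

print(plain)
print(filled)
```

Whether 0 is a sensible replacement depends on what a missing byte count means in your data, so treat the fill value as a choice rather than a default.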
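But maybe you need to groupby the first column, request, and use transform to create new Series with the same length as the original (here both columns are transformed; transforming only one may be enough):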
df = pd.DataFrame({ 'request':['a','a','b','b'],
'bytesbytes':[3985,1420,1420,0],
'bytesfrequency':[2,6,6,2]})
g = df.groupby('request')
print (g['bytesbytes'].transform('first'))
0    3985
1    3985
2    1420
3    1420
Name: bytesbytes, dtype: int64
print (g['bytesfrequency'].transform('first'))
0    2
1    2
2    6
3    6
Name: bytesfrequency, dtype: int64
df['totalbytes'] = g['bytesbytes'].transform('first')*g['bytesfrequency'].transform('first')
print (df)
   bytesbytes  bytesfrequency request  totalbytes
0        3985               2       a        7970
1        1420               6       a        7970
2        1420               6       b        8520
3           0               2       b        8520
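To make it clearer why transform is used here, the following small sketch compares it with the plain groupby aggregation first. The variable names per_row and per_group are only illustrative. transform returns one value per original row, so the result can be assigned straight back to df, while first returns one value per group:

```python
import pandas as pd

df = pd.DataFrame({'request': ['a', 'a', 'b', 'b'],
                   'bytesbytes': [3985, 1420, 1420, 0],
                   'bytesfrequency': [2, 6, 6, 2]})
g = df.groupby('request')

# transform broadcasts the first value of each group to every row of that group,
# keeping the original index (0..3).
per_row = g['bytesbytes'].transform('first')

# first() aggregates instead: one row per group, indexed by request.
per_group = g['bytesbytes'].first()

print(per_row)
print(per_group)
```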
EDIT:
If you need to remove duplicates by the request column:
df = pd.DataFrame({ 'request':['a','a','b','b'],
'bytesbytes':[3985,1420,1420,0],
'bytesfrequency':[2,6,6,2]})
print (df)
   bytesbytes  bytesfrequency request
0        3985               2       a
1        1420               6       a
2        1420               6       b
3           0               2       b
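One-line solution: drop_duplicates, multiply and finally drop the columns (the chain is wrapped in parentheses so it can span several lines):
df = (df.drop_duplicates('request')
        .assign(totalbytes=df['bytesbytes']*df['bytesfrequency'])
        .drop(['bytesbytes','bytesfrequency'], axis=1))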
print (df)
  request  totalbytes
0       a        7970
2       b        8520
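In the one-liner, assign receives a product computed on the full original df, and pandas aligns it by index to the deduplicated rows. A possible variant, shown here only as a sketch, passes a callable to assign so the product is computed on the already deduplicated frame (the name out is illustrative):

```python
import pandas as pd

df = pd.DataFrame({'request': ['a', 'a', 'b', 'b'],
                   'bytesbytes': [3985, 1420, 1420, 0],
                   'bytesfrequency': [2, 6, 6, 2]})

out = (df.drop_duplicates('request')
         # x is the frame produced by the previous step in the chain,
         # so only the kept rows are multiplied.
         .assign(totalbytes=lambda x: x['bytesbytes'] * x['bytesfrequency'])
         .drop(['bytesbytes', 'bytesfrequency'], axis=1))

print(out)
```

Both forms give the same result here; the callable just avoids depending on the outer df variable inside the chain.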
Or the same in separate steps, starting again from the original df:
df = df.drop_duplicates('request')
df['totalbytes'] = df['bytesbytes']*df['bytesfrequency']
df = df.drop(['bytesbytes','bytesfrequency'], axis=1)
print (df)
  request  totalbytes
0       a        7970
2       b        8520
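For comparison, a similar table can be built with groupby instead of drop_duplicates. This is a different technique from the one above, sketched under the assumption that there are no missing values (groupby's first takes the first non-null value per column, so results could differ from drop_duplicates if NaN is present). The names first_rows and out are illustrative:

```python
import pandas as pd

df = pd.DataFrame({'request': ['a', 'a', 'b', 'b'],
                   'bytesbytes': [3985, 1420, 1420, 0],
                   'bytesfrequency': [2, 6, 6, 2]})

# One row per request, keeping request as a regular column.
first_rows = df.groupby('request', as_index=False).first()

# Multiply the two columns and keep only the columns of interest.
out = first_rows.assign(
    totalbytes=first_rows['bytesbytes'] * first_rows['bytesfrequency']
)[['request', 'totalbytes']]

print(out)
```

Note that the index here is renumbered from 0, whereas drop_duplicates keeps the original row labels (0 and 2).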