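【Posted】:2018-01-08 11:33:14
【Question】:
I'm trying to automate my code, because manually changing the column names in it every time means a lot of repetitive work. I'd like to understand how to create macro variables in Python the way I can in SAS. Any help is much appreciated!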
##I'm creating my cutoff points first
#### I need to assign a value to col1 in a macro in order not to hardcode it all the time!
cutoff1 = my_data['col1'].describe([.1,.2,.3,.4,.5,.6,.7, .8, 0.9])['10%'].astype('float64')
cutoff2 = my_data['col1'].describe([.1,.2,.3,.4,.5,.6,.7, .8, 0.9])['20%'].astype('float64')
cutoff3 = my_data['col1'].describe([.1,.2,.3,.4,.5,.6,.7, .8, 0.9])['30%'].astype('float64')
##Then I'm assigning the new values to my continuous variables by using the thresholds I've determined above
#### I also need to assign a value to COL1_RANK such as %s='COL1' i.e. %s&'_RANK'
def f(row):
    if row['col1'] <= cutoff1:
        COL1_RANK = 1
    elif row['col1'] <= cutoff2:
        COL1_RANK = 2
    elif row['col1'] <= cutoff3:
        COL1_RANK = 3
    else:
        COL1_RANK = 4
    return COL1_RANK
my_data['COL1_RANK'] = my_data.apply(f, axis=1)
my_data.head(5)
【Discussion】:
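Python has no macro variables in the SAS sense. The usual equivalent is to pass the column name as an ordinary string argument to a function and build the output column name from it. The sketch below is one possible way to do that for the code in the question, not a definitive implementation. The function name `add_rank_column` and its `quantiles` and `suffix` parameters are hypothetical, and the sample `my_data` is made up for illustration. It also swaps the row-wise `apply` for `np.searchsorted`, which should give the same 1 to 4 ranks as the `if`/`elif` chain.

```python
import numpy as np
import pandas as pd


def add_rank_column(df, col, quantiles=(0.1, 0.2, 0.3), suffix="_RANK"):
    """Add a rank column for `col`, named e.g. 'COL1_RANK' for 'col1'.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Cutoffs at the requested quantiles (same values describe() reports
    # under '10%', '20%', '30%').
    cutoffs = df[col].quantile(sorted(quantiles)).to_numpy()

    # Build the output column name from the input name,
    # playing the role of &col._RANK in SAS.
    rank_col = col.upper() + suffix

    # side="left" returns the index of the first cutoff >= value, so
    # value <= cutoff1 -> 0, value <= cutoff2 -> 1, ..., above all -> 3.
    # Adding 1 reproduces the 1..4 ranks of the if/elif chain.
    df[rank_col] = np.searchsorted(cutoffs, df[col].to_numpy(), side="left") + 1
    return df


# Made-up sample data, only to show the call pattern.
my_data = pd.DataFrame({"col1": range(1, 11), "col2": range(10, 0, -1)})

# The column name is now just a variable, so it can be looped over.
for name in ["col1", "col2"]:
    add_rank_column(my_data, name)

print(my_data.head(5))
```

Because the column name is a plain string, the same call works for any number of columns without editing the function body.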