【Posted】: 2016-10-24 17:34:32
【Question】:
I have a piece of Python code, shown below:
# Main loop that takes the values stored row by row and sorts
# them into the corresponding columns by matching 'Name' against the newly
# generated column names.
listed_names = list(df_cv)  # List of column names to reference later.
variable = listed_names[3:]  # Columns from index 3 to the last; the leading columns are irrelevant.
for i in df_cv.index:  # For each index label in the DataFrame (DF)
    for m in variable:  # For each name in the list of variable column names
        if df_cv.loc[i, 'Name'] == m:  # If the 'Name' in this row equals the column name...
            df_cv.loc[i, m] = df_cv.loc[i, 'Value']  # ...copy this row's 'Value' into that column
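Basically, it takes a 3 x n list of Time/Name/Value and sorts it into a pandas df of size n by unique(n), i.e. one row per record and one column per distinct name.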
Time  Name   Value
1     Color  Red
2     Age    6
3     Temp   25
4     Age    1
into this:
Time  Color  Age  Temp
1     Red
2            6
3                 25
4            1
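My code takes a long time to run, and I am wondering whether there is a better way to set up my loops. I come from a MATLAB background, so the Python style (i.e. not using rows/columns for everything) is still foreign to me.
How can I make this part of the code run faster?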
【Discussion】:
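One possible vectorized alternative to the nested loops is sketched below. It is not the asker's method: it assumes the input is a DataFrame with exactly the columns Time, Name and Value from the example table, and that each (Time, Name) pair occurs at most once. The names df and wide are made up for illustration. Under those assumptions, pandas' DataFrame.pivot does the long-to-wide reshape in a single call:

```python
import pandas as pd

# Sample data copied from the example table in the question.
# (df and wide are illustrative names, not from the original code.)
df = pd.DataFrame({
    "Time": [1, 2, 3, 4],
    "Name": ["Color", "Age", "Temp", "Age"],
    "Value": ["Red", 6, 25, 1],
})

# Reshape long -> wide: one row per Time, one column per distinct Name.
# Cells with no matching (Time, Name) pair are left as NaN.
wide = df.pivot(index="Time", columns="Name", values="Value")

# pivot sorts the new columns; reorder them by first appearance in 'Name'.
wide = wide[df["Name"].unique()]

print(wide)
```

If the same (Time, Name) pair can appear more than once, pivot raises a ValueError; in that case DataFrame.pivot_table with an explicit aggfunc may be a better fit, at the cost of having to decide how duplicates are combined.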
Tags: python for-loop pandas optimization