【Question title】: Concatenating data frames into a pandas dataframe with pre set row names python
【Posted】: 2016-11-18 12:57:13
【Question】:

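I am trying to refactor code that used to be very manual and involved setting the index on every new dataframe I created, in order to produce essentially this desired output: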

    f1          precision   recall
A   0.600315956 0.72243346  0.513513514
B   0.096692112 0.826086957 0.051351351
C   0.085642317 0.62962963  0.045945946
D   0.108641975 0.628571429 0.059459459

Here is my current code:

summaryDF = pd.DataFrame().set_index(['A','B','C','D'])

def evaluation(trueLabels, evalLabels):

    precision = precision_score(trueLabels, evalLabels)
    recall = precision_score(trueLabels, evalLabels)
    f1 = precision_score(trueLabels, evalLabels)
    accuracy = accuracy_score(trueLabels, evalLabels)

    data = {'precision': precision,
               'recall': recall,
               'f1': f1}

    DF = pd.DataFrame(data)

    summaryDF.concat(DF,ignore_index=True)


results = [y_randpred,y_cat_random_to_binary,y_cat_random_to_binary_threshold,y_closed_random_to_binary]

for result in results:
    evaluation(y_true_claim, result)

Here is my error traceback:

Traceback (most recent call last):
  File "/Users/dhruv/Documents/bla/bla/src/main/bla.py", line 419, in <module>
    summaryDF = pd.DataFrame().set_index(['A','B','C','D'])
  File "/Users/dhruv/anaconda/lib/python2.7/site-packages/pandas/core/frame.py", line 2607, in set_index
    level = frame[col].values
  File "/Users/dhruv/anaconda/lib/python2.7/site-packages/pandas/core/frame.py", line 1797, in __getitem__
    return self._getitem_column(key)
  File "/Users/dhruv/anaconda/lib/python2.7/site-packages/pandas/core/frame.py", line 1804, in _getitem_column
    return self._get_item_cache(key)
  File "/Users/dhruv/anaconda/lib/python2.7/site-packages/pandas/core/generic.py", line 1084, in _get_item_cache
    values = self._data.get(item)
  File "/Users/dhruv/anaconda/lib/python2.7/site-packages/pandas/core/internals.py", line 2851, in get
    loc = self.items.get_loc(item)
  File "/Users/dhruv/anaconda/lib/python2.7/site-packages/pandas/core/index.py", line 1572, in get_loc
    return self._engine.get_loc(_values_from_object(key))
  File "pandas/index.pyx", line 134, in pandas.index.IndexEngine.get_loc (pandas/index.c:3824)
  File "pandas/index.pyx", line 154, in pandas.index.IndexEngine.get_loc (pandas/index.c:3704)
  File "pandas/hashtable.pyx", line 686, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12280)
  File "pandas/hashtable.pyx", line 694, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12231)
KeyError: 'A'

Any idea what I am doing wrong?

【Comments】:

    Tags: python pandas dataframe row concat


    【Solution 1】:

    I solved my problem.

    Using this answer, my code became:

    import pandas as pd
    from sklearn.metrics import precision_score, recall_score, f1_score

    summaryDF = pd.DataFrame(columns=('precision','recall','f1'))
    
    def evaluation(trueLabels, evalLabels):
    
        global summaryDF
    
        precision = precision_score(trueLabels, evalLabels)
        recall = recall_score(trueLabels, evalLabels)
        f1 = f1_score(trueLabels, evalLabels)
    
        data = {'precision': [precision],
                   'recall': [recall],
                   'f1': [f1]
                }
    
        DF = pd.DataFrame(data)
    
        summaryDF = pd.concat([summaryDF,DF])
    
    results = [y_randpred,
               y_cat_random_to_binary,
               y_cat_random_to_binary_threshold,
               y_closed_random_to_binary,
               y_closedCat_random_to_binary_threshold]
    
    for result in results:
        evaluation(y_true_claim, result)
    
    # Label the rows once, after every result has been concatenated.
    summaryDF.index = ['A', 'B', 'C', 'D', 'E']
    

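    The key points are that I needed to wrap the precision, recall and F1 values in square brackets, because pd.DataFrame cannot build a frame from a dict of bare scalars unless an index is passed, and that I set the index by assigning to summaryDF.index rather than with the set_index method, which expects the names of existing columns (hence the KeyError: 'A').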
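
A minimal sketch of those two points, assuming a recent pandas on Python 3. The numbers in `scores` are made-up stand-ins for the sklearn metric values, and the variable names are only illustrative:

```python
import pandas as pd

# Hypothetical metric values standing in for the sklearn scores.
scores = {'precision': 0.72, 'recall': 0.51, 'f1': 0.60}

# A dict of bare scalars gives pandas no row count, so this raises ValueError.
try:
    pd.DataFrame(scores)
    scalar_error = None
except ValueError as exc:
    scalar_error = exc

# Wrapping each value in a list produces a one-row frame.
row = pd.DataFrame({name: [value] for name, value in scores.items()})

# set_index looks up existing columns, so labels that are not columns raise KeyError.
try:
    pd.DataFrame().set_index(['A', 'B', 'C', 'D'])
    set_index_error = None
except KeyError as exc:
    set_index_error = exc

# Assigning to .index replaces the row labels directly.
row.index = ['A']
print(row)
```

The number of labels assigned to `.index` has to match the number of rows, which is why the labels can only be applied once all rows exist.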

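    So I only concatenate onto the dataframe and set the index at the end, rather than setting it up front before concatenating, because any dataframe you create already has some kind of index from the start.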
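
As a possible alternative to the global dataframe and repeated pd.concat (this is not what the code above does), the rows could be collected in a plain list and the frame built once with its index. The sketch below assumes Python 3 and binary 0/1 labels. The hand-written `evaluation` is a hypothetical stand-in for the sklearn calls so that the example runs on its own, and the labels and predictions are invented:

```python
import pandas as pd

def evaluation(true_labels, eval_labels):
    """Return precision, recall and F1 for binary labels as a plain dict."""
    pairs = list(zip(true_labels, eval_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {'f1': f1, 'precision': precision, 'recall': recall}

# Made-up labels and predictions, purely for illustration.
y_true = [1, 0, 1, 1, 0, 0]
predictions = {
    'A': [1, 0, 1, 0, 0, 0],
    'B': [1, 1, 1, 1, 0, 0],
}

# One dict per result, then a single DataFrame with the row names set up front.
rows = [evaluation(y_true, y_pred) for y_pred in predictions.values()]
summary = pd.DataFrame(rows, index=list(predictions))
print(summary)
```

Because the row labels are passed to the constructor together with the data, no separate index assignment is needed afterwards.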

    【Comments】:
