【Question title】: Normalizing a pandas DataFrame by row
【Posted】: 2013-09-06 19:36:40
【Question】:

What is the most idiomatic way to normalize each row of a pandas DataFrame? Normalizing the columns is easy, so one (very ugly!) option is:

(df.T / df.T.sum()).T

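pandas' broadcasting rules prevent df / df.sum(axis=1) from doing this: the Series of row sums is aligned against the DataFrame's column labels, not its row index.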
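
To illustrate the failure described above, here is a minimal sketch. The DataFrame, its labels ("a", "b", "r1", "r2") and the variable names are made up for illustration; they are not from the original question.

```python
import pandas as pd

# Hypothetical 2x2 frame with string labels on both axes.
df = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 6.0]}, index=["r1", "r2"])

row_sums = df.sum(axis=1)  # Series indexed by the row labels r1, r2

# The / operator matches the Series index against the DataFrame's COLUMN
# labels. No label is shared here, so every cell of the result is NaN and
# the result has the union of both label sets as its columns.
naive = df / row_sums
print(naive)
```

With default integer labels on both axes the division would not produce all-NaN output but silently divide each column by the wrong row's sum, which is arguably worse.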

【Comments】:

    Tags: python pandas normalization dataframe


    【Answer 1】:

    To work around the broadcasting problem, you can use the div method:

    df.div(df.sum(axis=1), axis=0)
    

    pandas User Guide: Matching / broadcasting behavior
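
As a small worked example of the div approach (the sample data and variable names are assumptions made for illustration), the following checks that every row sums to 1 afterwards and that the result matches the transpose-based version from the question:

```python
import pandas as pd

# Hypothetical sample data, not from the original thread.
df = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 6.0]}, index=["r1", "r2"])

# axis=0 tells div to align the Series of row sums with the row index,
# so each row is divided by its own sum.
normalized = df.div(df.sum(axis=1), axis=0)

# The transpose-based version from the question, for comparison.
transposed = (df.T / df.T.sum()).T

print(normalized)
print(normalized.sum(axis=1))  # each row sums to 1
```

Note that a row whose sum is zero would yield NaN or inf values, so such rows may need separate handling.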

    【Comments】:

      【Answer 2】:

      I suggest using scikit-learn's preprocessing module and transposing your DataFrame as needed:

      '''
      Created on 05/11/2015
      
      @author: rafaelcastillo
      '''
      
      import matplotlib.pyplot as plt
      import pandas
      import random
      import numpy as np
      from sklearn import preprocessing
      
      def create_cos(number_graphs,length,amp):
          # This function is used to generate cos-kind graphs for testing
          # number_graphs: to plot
          # length: number of points included in the x axis
          # amp: Y domain modifications to draw different shapes
          x = np.arange(length)
          amp = np.pi*amp
          xx = np.linspace(np.pi*0.3*amp, -np.pi*0.3*amp, length)
          for i in range(number_graphs):
              iterable = (2*np.cos(x) + random.random()*0.1 for x in xx)
              y = np.fromiter(iterable, float)  # np.float was removed in NumPy 1.24
              if i == 0: 
                  yfinal =  y
                  continue
              yfinal = np.vstack((yfinal,y))
          return x,yfinal
      
      x,y = create_cos(70,24,3)
      data = pandas.DataFrame(y)
      
      x_values = data.columns.values
      num_rows = data.shape[0]
      
      fig, ax = plt.subplots()
      for i in range(num_rows):
          ax.plot(x_values, data.iloc[i])
      ax.set_title('Raw data')
      plt.show() 
      
      std_scale = preprocessing.MinMaxScaler().fit(data.transpose())
      df_std = std_scale.transform(data.transpose())
      data = pandas.DataFrame(np.transpose(df_std))
      
      
      fig, ax = plt.subplots()
      for i in range(num_rows):
          ax.plot(x_values, data.iloc[i])
      ax.set_title('Data Normalized')
      plt.show()                                   
      

      【Comments】:

      • Apart from the three lines involving preprocessing.MinMaxScaler and the corresponding import, all of this plotting code is irrelevant. Could you trim your answer?
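
For reference, the scaling step of this answer (min-max scaling each row to the range [0, 1]) can be sketched without the plotting code. This is not the answer's scikit-learn call but a pandas-only equivalent under the assumption that MinMaxScaler is used with its default range; the sample data and variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data, not from the original answer.
df = pd.DataFrame(
    {"a": [1.0, 10.0], "b": [3.0, 20.0], "c": [5.0, 40.0]},
    index=["r1", "r2"],
)

row_min = df.min(axis=1)
row_range = df.max(axis=1) - row_min

# Subtract each row's minimum, then divide by that row's range.
# axis=0 aligns both Series with the row index.
scaled = df.sub(row_min, axis=0).div(row_range, axis=0)
print(scaled)
```

Unlike the sum-based normalization in the first answer, this maps each row's minimum to 0 and its maximum to 1. A constant row has a range of zero and would produce NaN here, so it may need special treatment.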