【Question title】: InvalidIndexError: Reindexing only valid with uniquely valued Index objects
【Posted】: 2015-09-15 03:18:07
【Question】:

Using Python, I am trying to merge 208 CSV files into a single DataFrame. (My files are named Customer_1.csv, Customer_2.csv, and so on up to Customer_208.csv.)

Here is my code:

%matplotlib inline
import pandas as pd
df_merged = pd.concat([pd.read_csv('data_TVV1/Customer_{0}.csv'.format(i), names = ['Time', 'Energy_{0}'.format(i)], parse_dates=['Time'], index_col=['Time'], skiprows=1) for i in range(1, 209)], axis=1)

I get the following error:

    InvalidIndexError                         Traceback (most recent call last)
<ipython-input-4-a4d19b3c2a3e> in <module>()
----> 1 df_merged = pd.concat([pd.read_csv('data_TVV1/Customer_{0}.csv'.format(i), names = ['Time', 'Energy_{0}'.format(i)], parse_dates=['Time'], index_col=['Time'], skiprows=1) for i in range(1, 209)], axis=1)

/Users/Suzuki/Envs/DataVizProj/lib/python2.7/site-packages/pandas/tools/merge.pyc in concat(objs, axis, join, join_axes, ignore_index, keys, levels, names, verify_integrity, copy)
    752                        keys=keys, levels=levels, names=names,
    753                        verify_integrity=verify_integrity,
--> 754                        copy=copy)
    755     return op.get_result()
    756 

/Users/Suzuki/Envs/DataVizProj/lib/python2.7/site-packages/pandas/tools/merge.pyc in __init__(self, objs, axis, join, join_axes, keys, levels, names, ignore_index, verify_integrity, copy)
    884         self.copy = copy
    885 
--> 886         self.new_axes = self._get_new_axes()
    887 
    888     def get_result(self):

/Users/Suzuki/Envs/DataVizProj/lib/python2.7/site-packages/pandas/tools/merge.pyc in _get_new_axes(self)
    944                 if i == self.axis:
    945                     continue
--> 946                 new_axes[i] = self._get_comb_axis(i)
    947         else:
    948             if len(self.join_axes) != ndim - 1:

/Users/Suzuki/Envs/DataVizProj/lib/python2.7/site-packages/pandas/tools/merge.pyc in _get_comb_axis(self, i)
    970                 raise TypeError("Cannot concatenate list of %s" % types)
    971 
--> 972         return _get_combined_index(all_indexes, intersect=self.intersect)
    973 
    974     def _get_concat_axis(self):

/Users/Suzuki/Envs/DataVizProj/lib/python2.7/site-packages/pandas/core/index.pyc in _get_combined_index(indexes, intersect)
   5730             index = index.intersection(other)
   5731         return index
-> 5732     union = _union_indexes(indexes)
   5733     return _ensure_index(union)
   5734 

/Users/Suzuki/Envs/DataVizProj/lib/python2.7/site-packages/pandas/core/index.pyc in _union_indexes(indexes)
   5759 
   5760         if hasattr(result, 'union_many'):
-> 5761             return result.union_many(indexes[1:])
   5762         else:
   5763             for other in indexes[1:]:

/Users/Suzuki/Envs/DataVizProj/lib/python2.7/site-packages/pandas/tseries/index.pyc in union_many(self, others)
    847             else:
    848                 tz = this.tz
--> 849                 this = Index.union(this, other)
    850                 if isinstance(this, DatetimeIndex):
    851                     this.tz = tz

/Users/Suzuki/Envs/DataVizProj/lib/python2.7/site-packages/pandas/core/index.pyc in union(self, other)
   1400                 result.extend([x for x in other.values if x not in value_set])
   1401         else:
-> 1402             indexer = self.get_indexer(other)
   1403             indexer, = (indexer == -1).nonzero()
   1404 

/Users/Suzuki/Envs/DataVizProj/lib/python2.7/site-packages/pandas/core/index.pyc in get_indexer(self, target, method, limit)
   1685 
   1686         if not self.is_unique:
-> 1687             raise InvalidIndexError('Reindexing only valid with uniquely'
   1688                                     ' valued Index objects')
   1689 

InvalidIndexError: Reindexing only valid with uniquely valued Index objects

Do you have any idea how to solve this problem?

【Comments】:

    Tags: python csv pandas


    【Answer 1】:

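    Your code works on a small sample of five files that I used for testing (each file contains two columns and three rows). Purely for debugging, try rewriting it as a for loop. First, before the loop, read all the files into a list. Then loop over them and concatenate each one inside a try/except block to catch the errors. Finally, print the problem files and investigate them.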

    import pandas as pd

    # First, read all the files into a list.
    files_in = [pd.read_csv('data_TVV1/Customer_{0}.csv'.format(i),
                            names=['Time', 'Energy_{0}'.format(i)],
                            parse_dates=['Time'],
                            index_col=['Time'],
                            skiprows=1)
                for i in range(1, 209)]

    df = pd.DataFrame()
    errors = []

    # Try to append each file to the dataframe.
    for i in range(1, 209):
        try:
            df = pd.concat([df, files_in[i - 1]], axis=1)
        except Exception:
            # Record the 1-based number of the file that failed to concatenate.
            errors.append(i)

    # Print files containing errors (file number i sits at list position i - 1).
    for error in errors:
        print(files_in[error - 1])
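
    The last frame of the traceback raises the error when `self.is_unique` is false, so one thing worth checking in the problem files is whether the Time column repeats a value. The following is a minimal sketch of that check, assuming duplicate timestamps are the cause. The in-memory CSV text and the `read_customer` helper are made up for illustration and stand in for the real Customer_N.csv files; keeping the first row of each duplicate is only one possible choice and may not suit your data.

```python
import io

import pandas as pd

# Hypothetical stand-ins for two Customer_N.csv files (not the real data).
# The second one repeats a timestamp, which makes its index non-unique.
csv_texts = {
    1: "Time,Energy\n2015-01-01 00:00,1.0\n2015-01-01 00:30,2.0\n",
    2: ("Time,Energy\n2015-01-01 00:00,3.0\n"
        "2015-01-01 00:00,4.0\n2015-01-01 00:30,5.0\n"),
}


def read_customer(i, source):
    """Read one file with the same options as the read_csv call in the question."""
    return pd.read_csv(source,
                       names=['Time', 'Energy_{0}'.format(i)],
                       parse_dates=['Time'],
                       index_col=['Time'],
                       skiprows=1)


frames = {i: read_customer(i, io.StringIO(text))
          for i, text in csv_texts.items()}

# File numbers whose Time index contains repeated values.
non_unique = [i for i, frame in frames.items() if not frame.index.is_unique]
print(non_unique)

# One possible fix (an assumption about what the data should mean):
# keep only the first row for each repeated timestamp, then concatenate.
deduplicated = [frame[~frame.index.duplicated(keep='first')]
                for frame in frames.values()]
df_merged = pd.concat(deduplicated, axis=1)
print(df_merged)
```

    If the repeated timestamps carry real readings, aggregating them (for example with `groupby(level=0)`) may be more appropriate than dropping rows; which is right depends on how the files were produced.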
    
