【Posted】: 2025-12-10 18:50:01
【Problem description】:
I have imported a csv as a multi-indexed DataFrame. Here is a mock-up of the data:
import pandas as pd
df = pd.read_csv("coursedata2.csv", index_col=[0, 2])
print(df)
                                         COURSE
ID    Course List
12345 Interior Environments           DESN10000
      Rendering & Present Skills      DESN20065
      Lighting                        DESN20025
22345 Drawing Techniques              DESN10016
      Colour Theory                   DESN14049
      Finishes & Sustainable Issues   DESN12758
      Lighting                        DESN20025
32345 Window Treatments&Soft Furnish  DESN27370
42345 Introduction to CADD            INFO16859
      Principles of Drafting          DESN10065
      Drawing Techniques              DESN10016
      The Fundamentals of Design      DESN15436
      Colour Theory                   DESN14049
      Interior Environments           DESN10000
      Drafting                        DESN10123
      Textiles and Applications       DESN10199
      Finishes & Sustainable Issues   DESN12758
[17 rows x 1 columns]
I can easily slice it by label using .xs, e.g.:
selected = df.xs(12345, level='ID')
print(selected)
                               COURSE
Course List
Interior Environments       DESN10000
Rendering & Present Skills  DESN20065
Lighting                    DESN20025
[3 rows x 1 columns]
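But what I want to do is step through the DataFrame and perform an operation on each block of courses, one ID at a time. The ID values in the real data are fairly random integers, sorted in ascending order.
df.index shows:
df.index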
MultiIndex(levels=[[12345, 22345, 32345, 42345], [u'Colour Theory', u'Colour Theory ', u'Drafting', u'Drawing Techniques', u'Finishes & Sustainable Issues', u'Interior Environments', u'Introduction to CADD', u'Lighting', u'Principles of Drafting', u'Rendering & Present Skills', u'Textiles and Applications', u'The Fundamentals of Design', u'Window Treatments&Soft Furnish']],
labels=[[0, 0, 0, 1, 1, 1, 1, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3], [5, 9, 7, 3, 1, 4, 7, 12, 6, 8, 3, 11, 0, 5, 2, 10, 4]],
names=[u'ID', u'Course List'])
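It seems to me that I should be able to use the first-level index labels to step through the DataFrame, i.e. get all the courses for label 0, then 1, then 2, then 3, ... but it looks like .xs will not slice by label.
Am I missing something?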
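For context, the integers in `labels` (called `codes` since pandas 0.24) are positions into `levels`, whereas .xs expects the actual ID values. A minimal sketch of stepping through the blocks with .xs, assuming a small hand-built frame in place of the real csv (the rows and the names `ids` and `block_sizes` are made up for illustration):

```python
import pandas as pd

# Hypothetical miniature of the course data, built in memory instead of read from csv.
index = pd.MultiIndex.from_tuples(
    [
        (12345, "Lighting"),
        (22345, "Colour Theory"),
        (22345, "Lighting"),
        (42345, "Drafting"),
    ],
    names=["ID", "Course List"],
)
df = pd.DataFrame(
    {"COURSE": ["DESN20025", "DESN14049", "DESN20025", "DESN10123"]},
    index=index,
)

# The real ID values, in order of first appearance (not the 0, 1, 2, ... codes).
ids = df.index.get_level_values("ID").unique()

block_sizes = {}
for position, id_value in enumerate(ids):
    # .xs selects by ID value; `position` plays the role of the integer code.
    block = df.xs(id_value, level="ID")
    block_sizes[id_value] = len(block)
    print(position, id_value, len(block))
```

This keeps the .xs call from the question and only changes what is fed to it: the values from the `ID` level rather than the positional codes.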
【Discussion】:
- Try `df.groupby(level='ID').apply(func)`; see here: pandas.pydata.org/pandas-docs/stable/…
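A rough sketch of what the groupby suggestion might look like in practice, assuming a trimmed, hand-built copy of the data. The function `count_courses` is a hypothetical placeholder for whatever per-ID operation is actually needed:

```python
import pandas as pd

# Hypothetical stand-in for coursedata2.csv, trimmed to a few rows.
df = pd.DataFrame(
    {
        "ID": [12345, 12345, 12345, 22345, 22345, 32345],
        "Course List": [
            "Interior Environments",
            "Rendering & Present Skills",
            "Lighting",
            "Drawing Techniques",
            "Colour Theory",
            "Window Treatments&Soft Furnish",
        ],
        "COURSE": [
            "DESN10000", "DESN20065", "DESN20025",
            "DESN10016", "DESN14049", "DESN27370",
        ],
    }
).set_index(["ID", "Course List"])

# Option 1: iterate explicitly, one block of courses per ID.
blocks = {}
for student_id, block in df.groupby(level="ID"):
    # Drop the now-constant ID level so each block is indexed by course name.
    blocks[student_id] = block.droplevel("ID")
    print(student_id, len(block))

# Option 2: hand each block to a function and collect the results.
def count_courses(block):
    """Placeholder operation: number of courses in one ID block."""
    return len(block)

counts = df.groupby(level="ID").apply(count_courses)
print(counts)
```

Grouping on the index level avoids looking up the ID values by hand, and the explicit loop is handy when the per-block work has side effects or does not reduce to a single return value.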
Tags: python csv pandas indexing slice