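[Posted]: 2020-10-20 11:20:30
[Question]:
I want to see the data type of every column stored in my DataFrame without iterating over the columns. Is there a way to do that?
[Comments]: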
- Use df.dtypes
- Wow, that was really helpful, thanks jerzael
Use df.dtypes.
The "10 minutes to pandas" guide has a good example of DataFrame.dtypes:
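import numpy as np
import pandas as pd

# Each column is built from a different kind of value, so each gets its own dtype.
df2 = pd.DataFrame({
    'A': 1.,
    'B': pd.Timestamp('20130102'),
    'C': pd.Series(1, index=list(range(4)), dtype='float32'),
    'D': np.array([3] * 4, dtype='int32'),
    'E': pd.Categorical(["test", "train", "test", "train"]),
    'F': 'foo'})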
print(df2)
     A          B    C  D      E    F
0  1.0 2013-01-02  1.0  3   test  foo
1  1.0 2013-01-02  1.0  3  train  foo
2  1.0 2013-01-02  1.0  3   test  foo
3  1.0 2013-01-02  1.0  3  train  foo
print(df2.dtypes)
A           float64
B    datetime64[ns]
C           float32
D             int32
E          category
F            object
dtype: object
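Because dtypes returns a Series indexed by column name, the usual Series tools work on it. Here is a minimal sketch, assuming a made-up frame df_demo that mirrors a few columns of df2 (the names df_demo, dtype_names and numeric_cols are illustrative only and do not come from the guide):

```python
import numpy as np
import pandas as pd

# Hypothetical frame mirroring a few columns of df2 above.
df_demo = pd.DataFrame({
    "A": 1.0,
    "C": pd.Series(1, index=list(range(4)), dtype="float32"),
    "D": np.array([3] * 4, dtype="int32"),
    "F": "foo",
})

# dtypes is a Series keyed by column name, so it converts to a plain dict.
dtype_names = df_demo.dtypes.astype(str).to_dict()
print(dtype_names)

# Count how many columns share each dtype.
print(df_demo.dtypes.astype(str).value_counts())

# Pick columns by dtype instead of by name.
numeric_cols = df_demo.select_dtypes(include="number").columns.tolist()
print(numeric_cols)
```

select_dtypes is handy when the goal is not just to look at the dtypes but to act on, say, all numeric columns at once.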
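The object dtype is trickier, though. It usually means strings, but an object column can hold any Python object.
Example (the output below comes from an older pandas version that sorted dict keys alphabetically; recent versions keep the columns in insertion order):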
df = pd.DataFrame({'strings': ['a', 'd', 'f'],
                   'dicts': [{'a': 4}, {'c': 8}, {'e': 9}],
                   'lists': [[4, 8], [7, 8], [3]],
                   'tuples': [(4, 8), (7, 8), (3,)],
                   'sets': [set([1, 8]), set([7, 3]), set([0, 1])]})
print(df)
      dicts   lists    sets strings  tuples
0  {'a': 4}  [4, 8]  {8, 1}       a  (4, 8)
1  {'c': 8}  [7, 8]  {3, 7}       d  (7, 8)
2  {'e': 9}     [3]  {0, 1}       f    (3,)
Every column has the same dtype:
print(df.dtypes)
dicts      object
lists      object
sets       object
strings    object
tuples     object
dtype: object
But the Python types of the values differ. If you need to check them, loop over the columns:
for col in df:
    print(df[col].apply(type))
0 <class 'dict'>
1 <class 'dict'>
2 <class 'dict'>
Name: dicts, dtype: object
0 <class 'list'>
1 <class 'list'>
2 <class 'list'>
Name: lists, dtype: object
0 <class 'set'>
1 <class 'set'>
2 <class 'set'>
Name: sets, dtype: object
0 <class 'str'>
1 <class 'str'>
2 <class 'str'>
Name: strings, dtype: object
0 <class 'tuple'>
1 <class 'tuple'>
2 <class 'tuple'>
Name: tuples, dtype: object
Or check only the first value of each column with iat:
print (type(df['strings'].iat[0]))
<class 'str'>
print (type(df['dicts'].iat[0]))
<class 'dict'>
print (type(df['lists'].iat[0]))
<class 'list'>
print (type(df['tuples'].iat[0]))
<class 'tuple'>
print (type(df['sets'].iat[0]))
<class 'set'>
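The two checks above can be combined into one summary per column. This is only a sketch: python_types_by_column is a hypothetical helper, not a pandas function, and df_obj is a rebuilt copy of the example frame.

```python
import pandas as pd

# Rebuilt copy of the example frame above.
df_obj = pd.DataFrame({
    "strings": ["a", "d", "f"],
    "dicts": [{"a": 4}, {"c": 8}, {"e": 9}],
    "lists": [[4, 8], [7, 8], [3]],
    "tuples": [(4, 8), (7, 8), (3,)],
    "sets": [{1, 8}, {7, 3}, {0, 1}],
})


def python_types_by_column(frame):
    """Map each column name to the sorted names of the Python types it holds."""
    return {
        col: sorted({type(value).__name__ for value in frame[col]})
        for col in frame.columns
    }


type_summary = python_types_by_column(df_obj)
for col, names in type_summary.items():
    print(col, names)
```

Unlike iat[0], this looks at every value, so a column that mixes, for example, str and list values shows up with two type names instead of one.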
[Comments]:
Use the DataFrame.info() method:
>>> df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 3 columns):
 #   Column     Non-Null Count  Dtype
---  ------     --------------  -----
 0   int_col    5 non-null      int64
 1   text_col   5 non-null      object
 2   float_col  5 non-null      float64
dtypes: float64(1), int64(1), object(1)
memory usage: 248.0+ bytes
Documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.info.html
[Comments]:
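info() prints its report rather than returning it, so to keep the text (for logging, say) you can pass a buffer through the buf parameter. A minimal sketch: the values in df_info are made up, and only the column names and dtypes follow the output shown above.

```python
import io

import pandas as pd

# Hypothetical data; only the column names and dtypes match the output above.
df_info = pd.DataFrame({
    "int_col": [1, 2, 3, 4, 5],
    "text_col": ["alpha", "beta", "gamma", "delta", "epsilon"],
    "float_col": [0.0, 0.25, 0.5, 0.75, 1.0],
})

# info() writes to sys.stdout by default; buf redirects it to any writable buffer.
buffer = io.StringIO()
df_info.info(buf=buffer)
info_text = buffer.getvalue()
print(info_text)
```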