[Posted]: 2021-08-02 18:31:13
[Question]:
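I'm trying to convert the values in the company_size column and have tried several approaches, but I keep getting a ValueError. Does anyone have any suggestions?
The code is as follows: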
handle_df.loc[handle_df['company_size'].isin(['I prefer not to answer',"I don't know"]),['company_size']]=np.NaN
handle_df.loc[handle_df['company_size'] == 'Fortune 1000 (1,000+)' ,['company_size']]='1000'
handle_df.loc[handle_df['company_size'] == '1/25/2013' ,['company_size']]='1-25'
handle_df.loc[handle_df['company_size'] == '1/5/2014' ,['company_size']]='1-5'
handle_df.loc[handle_df['company_size'] == '6/15/2014' ,['company_size']]='6-15'
handle_df.loc[handle_df['company_size'].isin(['Student',"Other (not working, consultant, etc.)"]),['company_size']]='1'
handle_df['company_size'] = handle_df['company_size'].map(lambda x:str(x).replace(" to ",'-'))
handle_df = process_data_range(handle_df,'company_size',dropna=False)
Full error traceback:
ValueError: could not convert string to float: ''
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-22-84fdd9e45198> in <module>
9 handle_df['company_size'] = handle_df['company_size'].map(lambda x:str(x).replace(" to ",'-'))
10
---> 11 handle_df = process_data_range(handle_df,'company_size',dropna=False)
12
<ipython-input-11-510bc7e7040d> in process_data_range(df, feature, dropna)
28 # for na value,I mark them as 'NaN'
29 change_to_midpoint = lambda x: np.average([float(i) for i in x.split("-")]) if "-" in x else np.NaN if x == "" else float(x)
---> 30 df[feature] = df[feature].map(change_to_midpoint)
31
32 if dropna:
~\anaconda3\lib\site-packages\pandas\core\series.py in map(self, arg, na_action)
3907 January History Final exam A
3908 February Geography Final exam B
-> 3909 March History Coursework A
3910 April Geography Coursework C
3911 dtype: object
~\anaconda3\lib\site-packages\pandas\core\base.py in _map_values(self, mapper, na_action)
935 **bins**
936
--> 937 Bins can be useful for going from a continuous variable to a
938 categorical variable; instead of counting unique
939 apparitions of values, divide the index in the specified
~\anaconda3\lib\site-packages\pandas\_libs\lib.pyx in pandas._libs.lib.map_infer()
<ipython-input-11-510bc7e7040d> in <lambda>(x)
27 # if age range ,change it to the midpoint
28 # for na value,I mark them as 'NaN'
---> 29 change_to_midpoint = lambda x: np.average([float(i) for i in x.split("-")]) if "-" in x else np.NaN if x == "" else float(x)
30 df[feature] = df[feature].map(change_to_midpoint)
31
<ipython-input-11-510bc7e7040d> in <listcomp>(.0)
27 # if age range ,change it to the midpoint
28 # for na value,I mark them as 'NaN'
---> 29 change_to_midpoint = lambda x: np.average([float(i) for i in x.split("-")]) if "-" in x else np.NaN if x == "" else float(x)
30 df[feature] = df[feature].map(change_to_midpoint)
31
ValueError: could not convert string to float: ''
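Reading the traceback, the failure comes from the list comprehension inside change_to_midpoint: the `"-" in x` branch is tested first, so the `x == ""` check never sees the pieces produced by `split("-")`. A minimal sketch of how that can yield an empty string follows; the inputs "1000-" and "-" are hypothetical examples, not values known to be in the asker's data, and `midpoint_parts` is a helper name made up for illustration.

```python
def midpoint_parts(value):
    """Return the pieces the original lambda would hand to float()."""
    return value.split("-")


# A well-formed range splits into two numeric strings.
print(midpoint_parts("26-100"))

# A trailing or lone hyphen leaves an empty string among the pieces.
print(midpoint_parts("1000-"))
print(midpoint_parts("-"))

# float() rejects the empty string, which matches the reported message.
try:
    float("")
except ValueError as exc:
    print(exc)
```

Any cell that contains a hyphen without a number on both sides of it would hit this path, so inspecting the unique values of the column right before the call to process_data_range is a reasonable first step.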
[Discussion]:
-
On which line do you get the ValueError?
-
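Well, the error message indicates that whichever column you are trying to convert to float contains some empty strings ''. I can't see any float conversion in the code you posted. It would help if you edited the question to include the full code, input data, expected output, and the full error traceback (i.e. a minimal reproducible example).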
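One way to act on the comment above is to make the conversion tolerant of unparsable pieces and then list the values that failed. This is only a sketch under assumptions: `to_midpoint` is a hypothetical replacement for the lambda in the traceback, not the asker's process_data_range, and the sample Series is invented data.

```python
import numpy as np
import pandas as pd


def to_midpoint(value):
    """Midpoint of an 'a-b' range, the number itself, or NaN if unparsable."""
    parts = [p.strip() for p in str(value).split("-")]
    try:
        return float(np.mean([float(p) for p in parts]))
    except ValueError:
        # Covers '' and ranges with a missing side, such as '1000-'.
        return np.nan


# Hypothetical sample; the real column contents are not shown in the question.
sample = pd.Series(["1-25", "1000", "26-100", "1000-", "", "nan"])
converted = sample.map(to_midpoint)

# Values that ended up as NaN are the ones worth inspecting by hand.
bad_values = sample[converted.isna()].unique()
print(converted.tolist())
print(list(bad_values))
```

Note that this sketch treats a leading minus sign as a range separator, so a negative number would come out as NaN; that seems acceptable for a company size column but is a design choice rather than a general rule.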
Tags: python pandas dataframe data-cleaning data-preprocessing