[Posted]: 2021-05-05 09:09:29
[Question]:
The following line of code raises the error shown below.
d3["WOE"] = np.where(((d3.DIST_EVENT==0) | (d3.DIST_NON_EVENT ==0)) ,np.nan ,np.log(d3.DIST_EVENT/d3.DIST_NON_EVENT))
If either the numerator or the denominator is 0, the condition should be satisfied and d3["WOE"] should be nan. Why do I get the following error?
---------------------------------------------------------------------------
FloatingPointError Traceback (most recent call last)
<ipython-input-56-a9b015683238> in <module>
----> 1 final_iv, IV = data_vars(df_leads_short,df_leads_short.close_flag)
2 IV.sort_values('IV')
<ipython-input-55-5530ad13fa5a> in data_vars(df1, target)
122 count = count + 1
123 else:
--> 124 conv = char_bin(target, df1[i])
125 conv["VAR_NAME"] = i
126 count = count + 1
<ipython-input-55-5530ad13fa5a> in char_bin(Y, X)
92 d3["DIST_EVENT"] = d3.EVENT/d3.sum().EVENT
93 d3["DIST_NON_EVENT"] = d3.NONEVENT/d3.sum().NONEVENT
---> 94 d3["WOE"] = np.where(((d3.DIST_EVENT==0) | (d3.DIST_NON_EVENT ==0)) ,np.nan ,np.log(d3.DIST_EVENT/d3.DIST_NON_EVENT))
95 #d3["WOE"] = np.log(d3.DIST_EVENT/d3.DIST_NON_EVENT)
96 d3["IV"] = np.where((d3.DIST_EVENT==0) | (d3.DIST_NON_EVENT ==0 ),np.nan ,(d3.DIST_EVENT-d3.DIST_NON_EVENT)*np.log(d3.DIST_EVENT/d3.DIST_NON_EVENT))
/opt/conda/lib/python3.7/site-packages/pandas/core/generic.py in __array_ufunc__(self, ufunc, method, *inputs, **kwargs)
1934 self, ufunc: Callable, method: str, *inputs: Any, **kwargs: Any
1935 ):
-> 1936 return arraylike.array_ufunc(self, ufunc, method, *inputs, **kwargs)
1937
1938 # ideally we would define this to avoid the getattr checks, but
/opt/conda/lib/python3.7/site-packages/pandas/core/arraylike.py in array_ufunc(self, ufunc, method, *inputs, **kwargs)
356 # ufunc(series, ...)
357 inputs = tuple(extract_array(x, extract_numpy=True) for x in inputs)
--> 358 result = getattr(ufunc, method)(*inputs, **kwargs)
359 else:
360 # ufunc(dataframe)
FloatingPointError: divide by zero encountered in log
[Comments]:
-
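np.where is an ordinary Python function, so its arguments are fully evaluated before they are passed in. np.log(d3.DIST_EVENT/d3.DIST_NON_EVENT) therefore runs on every row, including the rows where a value is 0, no matter what the condition says.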
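One way to act on this comment is to take the logarithm only on the rows where both distributions are non-zero, so that log never sees a zero or an infinite ratio. The sketch below is an illustration, not the asker's code: the helper name safe_woe and the sample values are hypothetical, and it assumes the inputs are one-dimensional numeric columns such as d3.DIST_EVENT and d3.DIST_NON_EVENT. NumPy normally only emits a RuntimeWarning for log(0), so the FloatingPointError in the traceback suggests error handling was switched to "raise" somewhere (for example via np.seterr); that is an inference, since the question does not show it.

```python
import numpy as np


def safe_woe(dist_event, dist_non_event):
    """Return log(dist_event / dist_non_event), with NaN where either input is 0.

    Hypothetical helper: the log is evaluated only on the valid rows,
    so it works even when NumPy is configured to raise on divide-by-zero.
    """
    e = np.asarray(dist_event, dtype=float)
    n = np.asarray(dist_non_event, dtype=float)

    # Rows where the ratio and its logarithm are both well defined.
    valid = (e != 0) & (n != 0)

    # Start from all-NaN and fill in only the valid rows.
    woe = np.full(e.shape, np.nan)
    woe[valid] = np.log(e[valid] / n[valid])
    return woe


# Made-up sample values; in the question this would be
# d3["WOE"] = safe_woe(d3.DIST_EVENT, d3.DIST_NON_EVENT)
example = safe_woe([0.5, 0.0, 0.5], [0.25, 0.75, 0.0])
```

The IV column can be built the same way, by multiplying (d3.DIST_EVENT - d3.DIST_NON_EVENT) by the result. If computing the log on every row is acceptable, wrapping the original np.where line in `with np.errstate(divide="ignore", invalid="ignore"):` is another option, since it suppresses the floating-point error for that block only.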