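【Posted】: 2020-06-23 13:52:23
【Question】:
I have a taxi fare prediction dataset, and I need to convert one of its variables, "pickup_datetime", from the object dtype to a datetime dtype. I am using pandas to do the conversion. Code: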
data['pickup_datetime']=pd.to_datetime(data['pickup_datetime'],format='%Y-%m-%d %H:%M:%S UTC')
This raises the error below. Can someone suggest how to find the incorrect rows in this column?
> TypeError Traceback (most recent call last)
> ~\Anaconda3\lib\site-packages\pandas\core\tools\datetimes.py in
> _convert_listlike_datetimes(arg, box, format, name, tz, unit, errors, infer_datetime_format, dayfirst, yearfirst, exact)
> 290 try:
> --> 291 values, tz = conversion.datetime_to_datetime64(arg)
> 292 return DatetimeIndex._simple_new(values, name=name, tz=tz)
>
> pandas/_libs/tslibs/conversion.pyx in
> pandas._libs.tslibs.conversion.datetime_to_datetime64()
>
> TypeError: Unrecognized value type: <class 'str'>
>
> During handling of the above exception, another exception occurred:
>
> ValueError Traceback (most recent call last)
> <ipython-input-59-c94549aa5074> in <module>
> ----> 1 data['pickup_datetime']=pd.to_datetime(data['pickup_datetime'],format='%Y-%m-%d %H:%M:%S UTC')
>
> ~\Anaconda3\lib\site-packages\pandas\core\tools\datetimes.py in
> to_datetime(arg, errors, dayfirst, yearfirst, utc, box, format, exact,
> unit, infer_datetime_format, origin, cache)
> 590 else:
> 591 from pandas import Series
> --> 592 values = convert_listlike(arg._values, True, format)
> 593 result = Series(values, index=arg.index, name=arg.name)
> 594 elif isinstance(arg, (ABCDataFrame, compat.MutableMapping)):
>
> ~\Anaconda3\lib\site-packages\pandas\core\tools\datetimes.py in
> _convert_listlike_datetimes(arg, box, format, name, tz, unit, errors, infer_datetime_format, dayfirst, yearfirst, exact)
> 292 return DatetimeIndex._simple_new(values, name=name, tz=tz)
> 293 except (ValueError, TypeError):
> --> 294 raise e
> 295
> 296 if result is None:
>
> ~\Anaconda3\lib\site-packages\pandas\core\tools\datetimes.py in
> _convert_listlike_datetimes(arg, box, format, name, tz, unit, errors, infer_datetime_format, dayfirst, yearfirst, exact)
> 259 try:
> 260 result, timezones = array_strptime(
> --> 261 arg, format, exact=exact, errors=errors)
> 262 if '%Z' in format or '%z' in format:
> 263 return _return_parsed_timezone_results(
>
> pandas/_libs/tslibs/strptime.pyx in
> pandas._libs.tslibs.strptime.array_strptime()
>
> ValueError: time data '43' does not match format '%Y-%m-%d %H:%M:%S UTC' (match)
【Comments】:
-
Try this:
data['pickup_datetime']=pd.to_datetime(data['pickup_datetime'], errors='coerce', format='%Y-%m-%d %H:%M:%S UTC'); incorrect_rows = data['pickup_datetime'].isna(). If you provide a sample, it will be easier to check whether this solution works for your data. -
This code solved the problem. Thanks, Vipool!!
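For anyone who wants to see the offending values themselves (such as the '43' in the traceback) rather than just a mask, here is a minimal sketch of the same `errors='coerce'` idea that parses into a separate Series, so the original strings are not overwritten. The sample frame, the `fare_amount` column, and the names `parsed`, `bad_mask`, `bad_rows` and `clean` are assumptions for illustration, not taken from the asker's data.

```python
import pandas as pd

# Hypothetical sample; the value '43' mimics the bad entry reported in the traceback.
data = pd.DataFrame({
    "pickup_datetime": [
        "2009-06-15 17:26:21 UTC",
        "43",
        "2011-08-18 00:35:00 UTC",
    ],
    "fare_amount": [4.5, 16.9, 5.7],
})

fmt = "%Y-%m-%d %H:%M:%S UTC"

# Parse into a separate Series so the original strings stay available.
# errors="coerce" turns anything that does not match fmt into NaT.
parsed = pd.to_datetime(data["pickup_datetime"], format=fmt, errors="coerce")

# Rows that failed to parse, excluding values that were already missing.
bad_mask = parsed.isna() & data["pickup_datetime"].notna()
bad_rows = data[bad_mask]
print(bad_rows)

# Keep only the rows that parsed, then store the datetime values.
clean = data[~bad_mask].copy()
clean["pickup_datetime"] = parsed[~bad_mask]
```

Assigning the coerced result straight back to `data['pickup_datetime']`, as in the comment above, also works for locating the rows, but the original malformed strings are replaced by NaT at that point.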
Tags: python python-3.x pandas dataframe datetime