[Question title]: Converting object datatype to datetime datatype in Python using pandas
[Posted]: 2020-06-23 13:52:23
[Question description]:

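I have a taxi fare prediction dataset, and I need to convert one of its variables, "pickup_datetime", from the object datatype to the datetime datatype. I am using pandas to do the conversion. Code: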

data['pickup_datetime']=pd.to_datetime(data['pickup_datetime'],format='%Y-%m-%d %H:%M:%S UTC')

I am getting the error below. Can someone suggest how to find the incorrect rows in my variable?

> TypeError                                 Traceback (most recent call last)
> ~\Anaconda3\lib\site-packages\pandas\core\tools\datetimes.py in
> _convert_listlike_datetimes(arg, box, format, name, tz, unit, errors, infer_datetime_format, dayfirst, yearfirst, exact)
>     290             try:
> --> 291                 values, tz = conversion.datetime_to_datetime64(arg)
>     292                 return DatetimeIndex._simple_new(values, name=name, tz=tz)
> 
> pandas/_libs/tslibs/conversion.pyx in
> pandas._libs.tslibs.conversion.datetime_to_datetime64()
> 
> TypeError: Unrecognized value type: <class 'str'>
> 
> During handling of the above exception, another exception occurred:
> 
> ValueError                                Traceback (most recent call
> last) <ipython-input-59-c94549aa5074> in <module>
> ----> 1 data['pickup_datetime']=pd.to_datetime(data['pickup_datetime'],format='%Y-%m-%d
> %H:%M:%S UTC')
> 
> ~\Anaconda3\lib\site-packages\pandas\core\tools\datetimes.py in
> to_datetime(arg, errors, dayfirst, yearfirst, utc, box, format, exact,
> unit, infer_datetime_format, origin, cache)
>     590         else:
>     591             from pandas import Series
> --> 592             values = convert_listlike(arg._values, True, format)
>     593             result = Series(values, index=arg.index, name=arg.name)
>     594     elif isinstance(arg, (ABCDataFrame, compat.MutableMapping)):
> 
> ~\Anaconda3\lib\site-packages\pandas\core\tools\datetimes.py in
> _convert_listlike_datetimes(arg, box, format, name, tz, unit, errors, infer_datetime_format, dayfirst, yearfirst, exact)
>     292                 return DatetimeIndex._simple_new(values, name=name, tz=tz)
>     293             except (ValueError, TypeError):
> --> 294                 raise e
>     295 
>     296     if result is None:
> 
> ~\Anaconda3\lib\site-packages\pandas\core\tools\datetimes.py in
> _convert_listlike_datetimes(arg, box, format, name, tz, unit, errors, infer_datetime_format, dayfirst, yearfirst, exact)
>     259                 try:
>     260                     result, timezones = array_strptime(
> --> 261                         arg, format, exact=exact, errors=errors)
>     262                     if '%Z' in format or '%z' in format:
>     263                         return _return_parsed_timezone_results(
> 
> pandas/_libs/tslibs/strptime.pyx in
> pandas._libs.tslibs.strptime.array_strptime()
> 
> ValueError: time data '43' does not match format '%Y-%m-%d %H:%M:%S
> UTC' (match)

[Comments]:

  • Try this: data['pickup_datetime']=pd.to_datetime(data['pickup_datetime'], errors='coerce', format='%Y-%m-%d %H:%M:%S UTC'); incorrect_rows = data['pickup_datetime'].isna(). If you provide a sample, it will be easier to check whether this solution works for your data.
  • This code solved the problem. Thanks, Vipool!
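
To actually see which rows are incorrect, it can help to parse into a separate Series so the raw strings are not overwritten. The following is a minimal sketch of the approach suggested in the comments, not a definitive implementation. The sample values and the names `fmt`, `parsed` and `bad_mask` are assumptions made for illustration; the question does not show the real data.

```python
import pandas as pd

# Hypothetical sample data; the real dataset's values are not shown in the question.
data = pd.DataFrame({
    "pickup_datetime": [
        "2009-06-15 17:26:21 UTC",
        "43",                       # malformed value, like the one in the traceback
        "2011-08-18 00:35:00 UTC",
    ],
    "fare_amount": [4.5, 16.9, 5.7],
})

fmt = "%Y-%m-%d %H:%M:%S UTC"

# Parse into a separate Series so the original strings stay available for inspection.
parsed = pd.to_datetime(data["pickup_datetime"], format=fmt, errors="coerce")

# A row counts as incorrect if it held a value but parsing produced NaT.
bad_mask = parsed.isna() & data["pickup_datetime"].notna()
incorrect_rows = data[bad_mask]

print(incorrect_rows)
```

Combining `isna()` on the parsed values with `notna()` on the raw column separates malformed strings from values that were missing to begin with.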

Tags: python python-3.x pandas dataframe datetime


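[Solution 1]:

Pandas to_datetime() has an errors parameter that lets you control what happens when a value cannot be parsed, which is usually caused by a formatting problem. By default it is set to 'raise'. You can have unparseable values coerced instead of raising an exception, like this: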

data['pickup_datetime'] = pd.to_datetime(data['pickup_datetime'], format='%Y-%m-%d %H:%M:%S UTC', errors='coerce')

This returns NaT for every value that cannot be parsed as a datetime.
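
As a rough illustration of what the coerced column looks like afterwards, the sketch below converts a small hypothetical column in place, counts the NaT entries, and drops them. The sample values and the names `n_bad` and `clean` are assumptions, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample: one valid timestamp, one malformed string, one missing value.
data = pd.DataFrame({
    "pickup_datetime": ["2009-06-15 17:26:21 UTC", "43", None],
})

data["pickup_datetime"] = pd.to_datetime(
    data["pickup_datetime"],
    format="%Y-%m-%d %H:%M:%S UTC",
    errors="coerce",  # unparseable values become NaT instead of raising
)

# The column now has a datetime dtype rather than object.
print(data["pickup_datetime"].dtype)

# Count the rows that could not be parsed (missing values are counted too).
n_bad = int(data["pickup_datetime"].isna().sum())
print(n_bad)

# Keep only the rows with a valid timestamp.
clean = data.dropna(subset=["pickup_datetime"])
print(clean)
```

Note that once the column has been overwritten, malformed strings and originally missing values both appear as NaT, so keep a copy of the raw column if the two need to be told apart.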

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.to_datetime.html

[Discussion]:
