【Question title】: Cannot compare type 'Timestamp' with type 'str' (Pandas, Python)
【Posted】: 2020-08-15 03:50:45
【Question】:

I have two dataframes that contain datetimes. For the first one:

df["datetime"] = df[["date","time"]].apply(lambda row: ' '.join(row.values.astype(str)), axis=1)
df["datetime"] = pd.to_datetime(df["datetime"], format='%Y-%m-%d  %H:%M:%S')

For the other one:

df_labels.columns = ["start_date","start_time","end_date","end_time","mode"]
df_labels["start_datetime"] = df_labels[["start_date","start_time"]].apply(lambda row: ' '.join(row.values.astype(str)), axis=1)
df_labels["end_datetime"] = df_labels[["end_date","end_time"]].apply(lambda row: ' '.join(row.values.astype(str)), axis=1)
df_labels["start_datetime"] = df_labels["start_datetime"].str.replace("/","-")
df_labels["end_datetime"] = df_labels["end_datetime"].str.replace("/","-")
df_labels["start_datetime"] = pd.to_datetime(df_labels["start_datetime"], format='%Y-%m-%d  %H:%M:%S')
df_labels["end_datetime"] = pd.to_datetime(df_labels["end_datetime"], format='%Y-%m-%d  %H:%M:%S')

All of the code above runs successfully.

Sample of df:

    lat         long        u1  alt d               date        time       datetime              mode
0   39.921712   116.472343  0   13  39298.146204    2007-08-04  03:30:32    2007-08-04 03:30:32 
1   39.921705   116.472343  0   13  39298.146215    2007-08-04  03:30:33    2007-08-04 03:30:33 
2   39.921695   116.472345  0   13  39298.146227    2007-08-04  03:30:34    2007-08-04 03:30:34 
3   39.921683   116.472342  0   13  39298.146238    2007-08-04  03:30:35    2007-08-04 03:30:35 
4   39.921672   116.472342  0   13  39298.146250    2007-08-04  03:30:36    2007-08-04 03:30:36 

Sample of df_labels:

    start_date  start_time  end_date    end_time    mode    start_datetime  end_datetime
0   2007/06/26  11:32:29    2007/06/26  11:40:29    bus 2007-06-26 11:32:29 2007-06-26 11:40:29
1   2008/03/28  14:52:54    2008/03/28  15:59:59    train   2008-03-28 14:52:54 2008-03-28 15:59:59
2   2008/03/28  16:00:00    2008/03/28  22:02:00    train   2008-03-28 16:00:00 2008-03-28 22:02:00
3   2008/03/29  01:27:50    2008/03/29  15:59:59    train   2008-03-29 01:27:50 2008-03-29 15:59:59
4   2008/03/29  16:00:00    2008/03/30  15:59:59    train   2008-03-29 16:00:00 2008-03-30 15:59:59

However, when I run this:

for index, row in df_labels.iterrows():
    df.loc[(df["datetime"] >= row["start_datetime"]) & (df["datetime"] < row["end_datetime"])] = row["mode"]

I get the following error:

TypeError: Cannot compare type 'Timestamp' with type 'str'

Please advise.

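【Comments】:

  • I believe some of the columns were converted to strings; you just need to make sure they all have a datetime dtype.
  • Try running df.dtypes and df_labels.dtypes. This shows which dtypes each dataframe has and helps you check whether the conversion succeeded.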
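The dtype check suggested in the comments can be sketched as follows. The two-row frame is a made-up stand-in for the asker's data, and `is_datetime_column` is a hypothetical helper introduced only for illustration.

```python
import pandas as pd

# Hypothetical miniature stand-in for the asker's frame.
df = pd.DataFrame({
    "date": ["2007-08-04", "2007-08-04"],
    "time": ["03:30:32", "03:30:33"],
})
df["datetime"] = pd.to_datetime(
    df["date"] + " " + df["time"], format="%Y-%m-%d %H:%M:%S"
)

# A successfully parsed column reports a datetime64 dtype;
# the raw "date" and "time" columns remain text.
print(df.dtypes)


def is_datetime_column(frame, name):
    """Return True if the column holds real datetimes rather than strings."""
    return pd.api.types.is_datetime64_any_dtype(frame[name])


parsed_ok = is_datetime_column(df, "datetime")
still_text = is_datetime_column(df, "date")
```

Running the same check on `start_datetime` and `end_datetime` in `df_labels` shows whether both sides of the comparison are real datetimes.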

Tags: python pandas dataframe datetime compare


【Solution 1】:

Suppose the datetime values are in dd/mm/yy hh:mm:ss format.

df['datetime'] = pd.to_datetime(df['datetime'], format='%d/%m/%y %H:%M:%S')
df_labels["start_datetime"] = pd.to_datetime(df_labels["start_datetime"], format='%d/%m/%y %H:%M:%S')
df_labels["end_datetime"] = pd.to_datetime(df_labels["end_datetime"], format='%d/%m/%y %H:%M:%S')

Check the dtypes of both dataframes:

df.dtypes
df_labels.dtypes

After a correct conversion, the datetime columns should show datetime64[ns].
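If the dtypes are already datetime64[ns] before the loop, one possible cause (not confirmed in this thread) is the assignment itself. `df.loc[mask] = row["mode"]` has no column selector, so it writes the mode string into every column of the matching rows, including `datetime`. On the next iteration that column contains strings, and the comparison against a Timestamp can fail. A minimal sketch on made-up data, restricting the assignment to the `mode` column:

```python
import pandas as pd

# Made-up data shaped like the question's frames.
df = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2007-06-26 11:35:00",
        "2008-03-28 15:00:00",
        "2009-01-01 00:00:00",
    ]),
})
df["mode"] = None  # empty label column, filled in below

df_labels = pd.DataFrame({
    "start_datetime": pd.to_datetime(["2007-06-26 11:32:29", "2008-03-28 14:52:54"]),
    "end_datetime": pd.to_datetime(["2007-06-26 11:40:29", "2008-03-28 15:59:59"]),
    "mode": ["bus", "train"],
})

for _, row in df_labels.iterrows():
    in_interval = (df["datetime"] >= row["start_datetime"]) & (
        df["datetime"] < row["end_datetime"]
    )
    # Select the "mode" column explicitly so "datetime" is never overwritten.
    df.loc[in_interval, "mode"] = row["mode"]

print(df)
```

The `datetime` column keeps its datetime dtype throughout, so every iteration compares Timestamps with Timestamps.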

Additional (efficiency):

import sqlite3

import pandas as pd

# Assumes df and df_labels are already defined and converted as above.
conn = sqlite3.connect(':memory:')

# Write both tables to the in-memory database under distinct names.
df.to_sql('df', conn, index=False)
df_labels.to_sql('df_labels', conn, index=False)

# Range join: attach the label whose interval contains each timestamp.
# The half-open interval [start, end) matches the loop in the question.
qry = '''
    select df.*, df_label.df_label_mode
    from df
    inner join
    (select mode as df_label_mode, start_datetime, end_datetime from df_labels) df_label
    on (df.datetime >= df_label.start_datetime and df.datetime < df_label.end_datetime)
    '''
df_x = pd.read_sql_query(qry, conn)
df_x.head()
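For comparison, the same interval lookup can stay inside pandas. The sketch below is not part of the original answer. It uses `pd.merge_asof` on made-up data and assumes the label intervals do not overlap; the names `points`, `labels`, and `merged` are illustrative.

```python
import pandas as pd

# Made-up timestamps to label; merge_asof requires sorted keys.
points = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2007-01-01 00:00:00",   # before every interval
        "2007-06-26 11:35:00",   # inside the bus interval
        "2007-06-26 11:45:00",   # between the two intervals
        "2008-03-28 15:00:00",   # inside the train interval
    ]),
}).sort_values("datetime")

labels = pd.DataFrame({
    "start_datetime": pd.to_datetime(["2007-06-26 11:32:29", "2008-03-28 14:52:54"]),
    "end_datetime": pd.to_datetime(["2007-06-26 11:40:29", "2008-03-28 15:59:59"]),
    "mode": ["bus", "train"],
}).sort_values("start_datetime")

# The default backward search pairs each point with the latest interval
# whose start is <= the point's timestamp.
merged = pd.merge_asof(
    points, labels, left_on="datetime", right_on="start_datetime"
)

# Drop labels for points that fall at or after the matched interval's end.
past_end = merged["datetime"] >= merged["end_datetime"]
merged["mode"] = merged["mode"].where(~past_end)

print(merged[["datetime", "mode"]])
```

This avoids both the Python-level loop and the round trip through SQLite, at the cost of the non-overlap assumption.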

Reference: Converting date column

【Comments】:
