【发布时间】:2017-05-19 01:48:21
【问题描述】:
在ETL 过程中,我需要从一个 Postgres 数据库中提取 JSON 列并将其加载到另一个。我们为此使用 Pandas,因为它有很多方法可以从不同的源/目标读取和写入数据,并且所有转换都可以使用 Python 和 Pandas 编写。老实说,我们对这种方法感到非常满意。但我们遇到了问题。
通常读取和写入数据非常容易。您只需使用pandas.read_sql_table 从源读取数据并使用pandas.to_sql 将其写入目标。但是,由于其中一个源表具有 JSON 类型的列(来自 Postgres),to_sql 函数崩溃并显示以下错误消息。
df.to_sql(table_name, analytics_db)
File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/pandas/core/generic.py", line 1201, in to_sql
chunksize=chunksize, dtype=dtype)
File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/pandas/io/sql.py", line 470, in to_sql
chunksize=chunksize, dtype=dtype)
File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/pandas/io/sql.py", line 1147, in to_sql
table.insert(chunksize)
File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/pandas/io/sql.py", line 663, in insert
self._execute_insert(conn, keys, chunk_iter)
File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/pandas/io/sql.py", line 638, in _execute_insert
conn.execute(self.insert_statement(), data)
File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 945, in execute
return meth(self, multiparams, params)
File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/sqlalchemy/sql/elements.py", line 263, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1053, in _execute_clauseelement
compiled_sql, distilled_params
File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1189, in _execute_context
context)
File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1393, in _handle_dbapi_exception
exc_info
File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/sqlalchemy/util/compat.py", line 202, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb, cause=cause)
File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1159, in _execute_context
context)
File "/home/ec2-user/python-virtual-environments/etl/local/lib64/python2.7/site-packages/sqlalchemy/engine/default.py", line 459, in do_executemany
cursor.executemany(statement, parameters)
sqlalchemy.exc.ProgrammingError: (psycopg2.ProgrammingError) can't adapt type 'dict'
【问题讨论】:
标签: json postgresql pandas sqlalchemy etl