【Question title】: PyGreSQL AWS Glue Python
【Posted】: 2020-08-10 22:55:20
【Question description】:

I am trying to use the PyGreSQL package in an AWS Glue Python job.

I uploaded the wheel file from here to an S3 bucket:

https://pypi.org/project/PyGreSQL/#files

specifically the build for Python 3.6 on x64.

Then, in the job, I use:

import pg

With this configuration, I get the following error when running the job:


WARNING: The directory '/.cache/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.

2020-08-08T20:22:47.845+02:00
Traceback (most recent call last):
  File "/tmp/runscript.py", line 123, in <module>
    runpy.run_path(temp_file_path, run_name='__main__')
  File "/usr/local/lib/python3.6/runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "/usr/local/lib/python3.6/runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "/usr/local/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/tmp/glue-python-scripts-vbox2q05/postloading3.py", line 7, in <module>
  File "/glue/lib/installation/pg.py", line 1436, in <module>
    set_query_helpers(_dictiter, _namediter, _namednext, _scalariter)
NameError: name 'set_query_helpers' is not defined

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/tmp/runscript.py", line 142, in <module>
    raise e_type(e_value).with_traceback(new_stack)
  File "/tmp/glue-python-scripts-vbox2q05/postloading3.py", line 7, in <module>
  File "/glue/lib/installation/pg.py", line 1436, in <module>
    set_query_helpers(_dictiter, _namediter, _namednext, _scalariter)
NameError: name 'set_query_helpers' is not defined

Do you know whether I am missing a dependency that needs to be uploaded? According to AWS, PyGreSQL is compatible with Glue.

【Comments】:

Tags: python amazon-web-services aws-glue pygresql


【Solution 1】:

It worked after adding the following code:

def get_connection(host):
    # Build a libpq-style connection string; the host comes from the argument.
    rs_conn_string = "host=%s port=%s dbname=%s user=%s password=%s" % (host, 5439, "dev", "awsuser", "sfg.")
    # A dbname containing "=" is treated by libpq as a full connection string.
    rs_conn = pg.connect(dbname=rs_conn_string)
    # Statement timeout in milliseconds (20 minutes).
    rs_conn.query("set statement_timeout = 1200000")
    return rs_conn

############################MAIN###################################################
con1 = get_connection("aredshift-c1....")

and then importing the module:

import pg
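
A variant of the helper that takes every connection setting as an argument can be sketched as follows. The names `build_conn_string`, `timeout_ms` and the parameter list are illustrative assumptions, not part of the original answer; the only PyGreSQL calls used are `pg.connect` and `query`, as in the code above.

```python
def build_conn_string(host, port, dbname, user, password):
    """Return a libpq-style key=value connection string (hypothetical helper)."""
    return "host=%s port=%s dbname=%s user=%s password=%s" % (
        host, port, dbname, user, password)


def get_connection(host, port, dbname, user, password, timeout_ms=1200000):
    """Open a connection and set the statement timeout (in milliseconds)."""
    # Imported here so build_conn_string can be used without PyGreSQL installed.
    import pg
    conn = pg.connect(dbname=build_conn_string(host, port, dbname, user, password))
    conn.query("set statement_timeout = %d" % timeout_ms)
    return conn
```

Values that contain spaces would need single quotes in a libpq connection string; this sketch does not handle that case.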

Consulting the AWS Glue PDF guide helped me find a simple way to make it work.
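
For anyone hitting the same NameError, one possible check is to see which files Python resolves `pg` and its compiled companion `_pg` from. As far as I know, `pg.py` expects `set_query_helpers` to come from `_pg`, so a `pg.py` from one PyGreSQL version paired with a `_pg` from another could produce this error. That is an assumption, not something the traceback confirms. The helper name `locate_module` is made up for this sketch.

```python
import importlib.util


def locate_module(name):
    """Return the file a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


# Print where each module would come from; None means it is not importable.
for name in ("pg", "_pg"):
    print(name, "->", locate_module(name))
```

If the two paths point to different installation directories, the mismatch would be worth investigating first.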

【Comments】:
