【Question Title】:Connecting to AWS Elasticsearch instance using Python
【Posted】:2016-11-26 11:58:07
【Question Description】:

I have an Elasticsearch instance hosted on AWS. I can connect to it from my terminal with curl. I am now trying to use the Python elasticsearch wrapper. I have:

from elasticsearch import Elasticsearch

client = Elasticsearch(host='https://ec2-xx-xx-xxx-xxx.us-west-2.compute.amazonaws.com', port=9200)

The query is:

data = client.search(index="mynewindex", body={"query": {"match": {"email": "gmail"}}})
    for hit in data:
        print(hit.email)
    print data

The full traceback from Heroku is:

2016-07-22T14:06:06.031347+00:00 heroku[router]: at=info method=GET path="/" host=elastictest.herokuapp.com request_id=9a96d447-fe02-4670-bafe-efba842927f3 fwd="88.106.66.168" dyno=web.1 connect=1ms service=393ms status=500 bytes=456
2016-07-22T14:09:18.035805+00:00 heroku[slug-compiler]: Slug compilation started
2016-07-22T14:09:18.035810+00:00 heroku[slug-compiler]: Slug compilation finished
2016-07-22T14:09:18.147278+00:00 heroku[web.1]: Restarting
2016-07-22T14:09:18.147920+00:00 heroku[web.1]: State changed from up to starting
2016-07-22T14:09:20.838784+00:00 heroku[web.1]: Starting process with command `gunicorn application:application --log-file=-`
2016-07-22T14:09:20.834521+00:00 heroku[web.1]: Stopping all processes with SIGTERM
2016-07-22T14:09:17.850918+00:00 heroku[api]: Deploy b7187d3 by hector@fastmail.se
2016-07-22T14:09:17.850993+00:00 heroku[api]: Release v21 created by hector@fastmail.se
2016-07-22T14:09:21.372589+00:00 app[web.1]: [2016-07-22 14:09:21 +0000] [3] [INFO] Handling signal: term
2016-07-22T14:09:21.383946+00:00 app[web.1]: [2016-07-22 14:09:21 +0000] [3] [INFO] Shutting down: Master
2016-07-22T14:09:21.367656+00:00 app[web.1]: [2016-07-22 14:09:21 +0000] [9] [INFO] Worker exiting (pid: 9)
2016-07-22T14:09:21.366309+00:00 app[web.1]: [2016-07-22 14:09:21 +0000] [10] [INFO] Worker exiting (pid: 10)
2016-07-22T14:09:22.286766+00:00 heroku[web.1]: Process exited with status 0
2016-07-22T14:09:23.344822+00:00 app[web.1]: [2016-07-22 14:09:23 +0000] [3] [INFO] Starting gunicorn 19.6.0
2016-07-22T14:09:23.345481+00:00 app[web.1]: [2016-07-22 14:09:23 +0000] [3] [INFO] Using worker: sync
2016-07-22T14:09:23.351173+00:00 app[web.1]: [2016-07-22 14:09:23 +0000] [9] [INFO] Booting worker with pid: 9
2016-07-22T14:09:23.370580+00:00 app[web.1]: [2016-07-22 14:09:23 +0000] [10] [INFO] Booting worker with pid: 10
2016-07-22T14:09:23.345376+00:00 app[web.1]: [2016-07-22 14:09:23 +0000] [3] [INFO] Listening at: http://0.0.0.0:59867 (3)
2016-07-22T14:09:24.536725+00:00 heroku[web.1]: State changed from starting to up
2016-07-22T14:09:39.043240+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/flask/app.py", line 1544, in handle_user_exception
2016-07-22T14:09:39.043239+00:00 app[web.1]:     rv = self.handle_user_exception(e)
2016-07-22T14:09:39.043241+00:00 app[web.1]:     reraise(exc_type, exc_value, tb)
2016-07-22T14:09:39.043233+00:00 app[web.1]: Traceback (most recent call last):
2016-07-22T14:09:39.043238+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/flask/app.py", line 1641, in full_dispatch_request
2016-07-22T14:09:39.043236+00:00 app[web.1]:     response = self.full_dispatch_request()
2016-07-22T14:09:39.043235+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/flask/app.py", line 1988, in wsgi_app
2016-07-22T14:09:39.043214+00:00 app[web.1]: [2016-07-22 14:09:39,041] ERROR in app: Exception on / [GET]
2016-07-22T14:09:39.043241+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/flask/app.py", line 1639, in full_dispatch_request
2016-07-22T14:09:39.043242+00:00 app[web.1]:     rv = self.dispatch_request()
2016-07-22T14:09:39.043242+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/flask/app.py", line 1625, in dispatch_request
2016-07-22T14:09:39.043243+00:00 app[web.1]:     return self.view_functions[rule.endpoint](**req.view_args)
2016-07-22T14:09:39.043243+00:00 app[web.1]:   File "/app/application.py", line 23, in index
2016-07-22T14:09:39.043246+00:00 app[web.1]:     return func(*args, params=params, **kwargs)
2016-07-22T14:09:39.043245+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/elasticsearch/client/utils.py", line 69, in _wrapped
2016-07-22T14:09:39.043246+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/elasticsearch/client/__init__.py", line 548, in search
2016-07-22T14:09:39.043247+00:00 app[web.1]:     doc_type, '_search'), params=params, body=body)
2016-07-22T14:09:39.043250+00:00 app[web.1]:     status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
2016-07-22T14:09:39.043250+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/elasticsearch/connection/http_urllib3.py", line 105, in perform_request
2016-07-22T14:09:39.043244+00:00 app[web.1]:     data = client.search(index="mynewindex", body={"query": {"match": {"email": "gmail"}}})
2016-07-22T14:09:39.043251+00:00 app[web.1]:     raise ConnectionError('N/A', str(e), e)
2016-07-22T14:09:39.043249+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/elasticsearch/transport.py", line 329, in perform_request
2016-07-22T14:09:39.043253+00:00 app[web.1]: ConnectionError: ConnectionError(<urllib3.connection.HTTPConnection object at 0x7f185a94d8d0>: Failed to establish a new connection: [Errno -2] Name or service not known) caused by: NewConnectionError(<urllib3.connection.HTTPConnection object at 0x7f185a94d8d0>: Failed to establish a new connection: [Errno -2] Name or service not known)
2016-07-22T14:09:42.692817+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/flask/app.py", line 1641, in full_dispatch_request
2016-07-22T14:09:42.692816+00:00 app[web.1]:     response = self.full_dispatch_request()
2016-07-22T14:09:42.692795+00:00 app[web.1]: [2016-07-22 14:09:42,691] ERROR in app: Exception on / [GET]
2016-07-22T14:09:42.692820+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/flask/app.py", line 1639, in full_dispatch_request
2016-07-22T14:09:42.692819+00:00 app[web.1]:     reraise(exc_type, exc_value, tb)
2016-07-22T14:09:42.692819+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/flask/app.py", line 1544, in handle_user_exception
2016-07-22T14:09:42.692827+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/elasticsearch/transport.py", line 329, in perform_request
2016-07-22T14:09:42.692828+00:00 app[web.1]:     status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
2016-07-22T14:09:42.692828+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/elasticsearch/connection/http_urllib3.py", line 105, in perform_request
2016-07-22T14:09:42.692829+00:00 app[web.1]:     raise ConnectionError('N/A', str(e), e)
2016-07-22T14:09:42.692831+00:00 app[web.1]: ConnectionError: ConnectionError(<urllib3.connection.HTTPConnection object at 0x7f185a946d10>: Failed to establish a new connection: [Errno -2] Name or service not known) caused by: NewConnectionError(<urllib3.connection.HTTPConnection object at 0x7f185a946d10>: Failed to establish a new connection: [Errno -2] Name or service not known)
2016-07-22T14:09:42.692821+00:00 app[web.1]:     rv = self.dispatch_request()
2016-07-22T14:09:42.692821+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/flask/app.py", line 1625, in dispatch_request
2016-07-22T14:09:42.692822+00:00 app[web.1]:     return self.view_functions[rule.endpoint](**req.view_args)
2016-07-22T14:09:42.692823+00:00 app[web.1]:   File "/app/application.py", line 23, in index
2016-07-22T14:09:42.692823+00:00 app[web.1]:     data = client.search(index="mynewindex", body={"query": {"match": {"email": "gmail"}}})
2016-07-22T14:09:42.692824+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/elasticsearch/client/utils.py", line 69, in _wrapped
2016-07-22T14:09:42.692814+00:00 app[web.1]: Traceback (most recent call last):
2016-07-22T14:09:42.692818+00:00 app[web.1]:     rv = self.handle_user_exception(e)
2016-07-22T14:09:42.692815+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/flask/app.py", line 1988, in wsgi_app
2016-07-22T14:09:42.692825+00:00 app[web.1]:     return func(*args, params=params, **kwargs)
2016-07-22T14:09:42.692826+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/elasticsearch/client/__init__.py", line 548, in search
2016-07-22T14:09:42.692826+00:00 app[web.1]:     doc_type, '_search'), params=params, body=body)
2016-07-22T14:09:42.685540+00:00 heroku[router]: at=info method=GET path="/" host=elastictest.herokuapp.com request_id=87ae9ec2-edb6-4e58-b9d6-89709b883091 fwd="88.106.66.168" dyno=web.1 connect=1ms service=11ms status=500 bytes=456

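I think the error is related to the "connection string", because the main error appears to be a ConnectionError.

So, two questions:

1) How do I connect correctly? The inbound security rules are currently configured to accept all incoming traffic.

2) Is there an error in the query code?

Many thanks, as always.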

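【Comments】:

  • Which port has its security setting set to 0.0.0.0/0?
  • Sorry, that may have been misleading. I meant that the inbound rules accept all incoming traffic, so as far as I understand, that is not why the connection fails.

Tags: python amazon-web-services heroku elasticsearch amazon-ec2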


【Solution 1】:

Here is a small Python script that helps create a connection to an AWS Elasticsearch instance.

from elasticsearch import Elasticsearch, RequestsHttpConnection
from requests_aws4auth import AWS4Auth


host = '' # For example, my-test-domain.us-east-1.es.amazonaws.com
region = '' # e.g. us-west-1

service = 'es'

credentials = {
    'access_key': '',
    'secret_key': ''
}

awsauth = AWS4Auth(credentials['access_key'], credentials['secret_key'], region, service)

es = Elasticsearch(
    hosts=[{'host': host, 'port': 443}],
    http_auth=awsauth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection
)

print(es.info())

Reference: AWS Elasticsearch Signing Requests
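
On the second question (the query code): once the connection works, the search call returns a dict-like response rather than a list of hit objects, so `for hit in data: print(hit.email)` would walk over the top-level keys instead of the documents. Below is a minimal sketch of reading the hits. The `sample_response` is hand-written to show the response shape only (its values are invented), and `extract_emails` is a hypothetical helper name.

```python
# Hand-written sample in the shape of an Elasticsearch search response;
# the field values are invented for illustration.
sample_response = {
    "took": 3,
    "timed_out": False,
    "hits": {
        "total": 2,
        "hits": [
            {"_index": "mynewindex", "_id": "1", "_source": {"email": "alice@gmail.com"}},
            {"_index": "mynewindex", "_id": "2", "_source": {"email": "bob@gmail.com"}},
        ],
    },
}


def extract_emails(response):
    """Return the `email` field of every hit that has one."""
    emails = []
    # The matching documents live under response["hits"]["hits"].
    for hit in response["hits"]["hits"]:
        source = hit.get("_source", {})
        if "email" in source:
            emails.append(source["email"])
    return emails


print(extract_emails(sample_response))
```

With a live client, the same helper could be called on the result of `es.search(index="mynewindex", body=...)` from the question.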

【Comments】:

    【Solution 2】:
    from elasticsearch import Elasticsearch, RequestsHttpConnection
    from requests_aws4auth import AWS4Auth

    host = 'ec2-xx-xx-xxx-xxx.us-west-2.compute.amazonaws.com'  # without 'https://'
    YOUR_ACCESS_KEY = ''
    YOUR_SECRET_KEY = ''
    REGION = 'us-west-2'  # change to your region
    # Sign every request with AWS Signature Version 4 for the 'es' service
    awsauth = AWS4Auth(YOUR_ACCESS_KEY, YOUR_SECRET_KEY, REGION, 'es')

    es = Elasticsearch(
        hosts=[{'host': host, 'port': 443}],
        http_auth=awsauth,
        use_ssl=True,
        verify_certs=True,
        connection_class=RequestsHttpConnection
    )
    print(es.info())
    

    【Comments】:

    • If this runs on an ECS cluster, is there a way to do it using the instance role?
    • @Jhirschibar Maybe try boto3: credentials = boto3.Session().get_credentials() (there is a more detailed example in the AWS docs)
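
The boto3 suggestion in the comment above can be sketched as follows. This assumes the boto3, requests_aws4auth and elasticsearch packages are installed, and an elasticsearch client version that still accepts `connection_class` as the answers here do. `aws4auth_arguments`, `build_client` and `FakeCredentials` are hypothetical names introduced for this illustration. Role-based credentials are temporary and come with a session token, which is passed to `AWS4Auth` as `session_token`.

```python
from collections import namedtuple


def aws4auth_arguments(credentials, region, service="es"):
    """Build positional and keyword arguments for AWS4Auth.

    `credentials` is any object with access_key, secret_key and token
    attributes, which is what boto3's credentials object exposes.
    """
    args = (credentials.access_key, credentials.secret_key, region, service)
    kwargs = {}
    # Role-based credentials are temporary and carry a session token.
    if getattr(credentials, "token", None):
        kwargs["session_token"] = credentials.token
    return args, kwargs


def build_client(host, region):
    """Create a client signed with whatever credentials boto3 resolves."""
    # Third-party imports stay local so the helper above has no dependencies.
    import boto3
    from elasticsearch import Elasticsearch, RequestsHttpConnection
    from requests_aws4auth import AWS4Auth

    credentials = boto3.Session().get_credentials()
    args, kwargs = aws4auth_arguments(credentials, region)
    return Elasticsearch(
        hosts=[{"host": host, "port": 443}],
        http_auth=AWS4Auth(*args, **kwargs),
        use_ssl=True,
        verify_certs=True,
        connection_class=RequestsHttpConnection,
    )


# Stand-in credentials to show the helper's output without touching AWS.
FakeCredentials = namedtuple("FakeCredentials", "access_key secret_key token")
print(aws4auth_arguments(FakeCredentials("AKIDEXAMPLE", "secret", "token123"), "us-west-2"))
```

Because temporary credentials expire, a long-running process would probably need to call `build_client` again from time to time rather than keep one client forever.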
    【Solution 3】:

    This is the correct way to connect to an Elasticsearch server using Python:

    es = Elasticsearch(['IP:PORT',])
    

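    The Elasticsearch constructor has no host or port parameters. The first argument should be a list, and each item in the list can be either a string representing a host: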

    'schema://ip:port'
    

    or a dictionary with extended parameters for that host:

    {'host': 'ip/hostname', 'port': 443, 'url_prefix': 'es', 'use_ssl': True}
    

    In your case, you probably want to use:

     client = Elasticsearch(['https://ec2-xx-xx-xxx-xxx.us-west-2.compute.amazonaws.com:9200'])
    

    If your server listens on the client's default port, the port is redundant and you can drop it:
    client = Elasticsearch(['https://ec2-xx-xx-xxx-xxx.us-west-2.compute.amazonaws.com'])
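
To see how the string form and the dictionary form relate, here is a standard-library sketch that splits a host string into a host dictionary. `host_string_to_dict` is a hypothetical helper written for this illustration, not a function of the elasticsearch package, and the defaults it applies (443 for https, 9200 otherwise) are assumptions that may not match every client version.

```python
from urllib.parse import urlparse


def host_string_to_dict(host):
    """Split 'scheme://hostname:port' into a host dictionary."""
    # urlparse only recognises the hostname when the string has a '//' prefix.
    if "://" not in host:
        host = "//" + host
    parsed = urlparse(host)
    use_ssl = parsed.scheme == "https"
    # Assumed defaults for this sketch: 443 for https, 9200 otherwise.
    port = parsed.port or (443 if use_ssl else 9200)
    return {"host": parsed.hostname, "port": port, "use_ssl": use_ssl}


print(host_string_to_dict("https://ec2-xx-xx-xxx-xxx.us-west-2.compute.amazonaws.com:9200"))
print(host_string_to_dict("localhost"))
```

The point of the sketch is that the scheme, the hostname and the port are separate pieces: the `host` value itself should never contain `https://`. Under the defaults assumed here, dropping `:9200` from an `https://` URL would switch the port to 443, so keeping the port explicit is the safer choice when the server listens on 9200.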

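    【Comments】:

    • Thanks. I tried both the IP and the URL, but neither works. In each case the response is connection refused. You can see what I tried in the question above. Is it a problem with the EC2 server?
    • @user1903663, there is no host= in the Elasticsearch constructor. I updated the answer to emphasize this.
    • Another possibility is that your server (Heroku) is blocking your outgoing connections. Can you try running import urllib2; urllib2.urlopen('https://ec2-xx-xx-xxx-xxx.us-west-2.compute.amazonaws.com:9200') from that server?
    • Hold on, I'll try it
    • Check the Python syntax :)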
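
The `[Errno -2] Name or service not known` in the traceback is a name-resolution failure, which is most likely what happens when the scheme is left inside the hostname. Before testing a full HTTP request from the server, a lookup check along these lines may help narrow things down. This is a minimal standard-library sketch, and `can_resolve` is a hypothetical helper name.

```python
import socket


def can_resolve(hostname, port=9200):
    """Return True if the hostname resolves to at least one address."""
    try:
        return len(socket.getaddrinfo(hostname, port)) > 0
    except (socket.gaierror, UnicodeError):
        # gaierror is the lookup failure behind "Name or service not known".
        return False


# The resolver expects a bare hostname, with no scheme and no port.
print(can_resolve("localhost"))
```

Calling it with the string from the question, `'https://ec2-xx-xx-xxx-xxx.us-west-2.compute.amazonaws.com'`, would be expected to return False because a scheme is not part of a hostname, while the bare EC2 hostname should resolve if DNS is working on the server.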