【Question title】: Python BigQuery really strange timeout
【Posted】: 2014-05-15 14:12:01
【Question】:

I'm building a service that streams data to BigQuery. The following code works perfectly if I remove the part that takes 4-5 minutes to load (I'm pre-caching some mappings):

import httplib2

from googleapiclient import discovery
from oauth2client import file
from oauth2client import client
from oauth2client import tools

from oauth2client.client import SignedJwtAssertionCredentials

## load email and key
credentials = SignedJwtAssertionCredentials(email, key, scope='https://www.googleapis.com/auth/bigquery')

if credentials is None or credentials.invalid:
    raw_input('invalid key')
    exit(0)

http = httplib2.Http()
http = credentials.authorize(http)

service = discovery.build('bigquery', 'v2', http=http)


## this does not hang, because it is before the long operation
service.tabledata().insertAll(...)


## some code that takes 5 minutes to execute
r = load_mappings()
## aka long operation

## this hangs
service.tabledata().insertAll(...)

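If I leave in the part that takes 5 minutes to execute, the Google API stops responding to the requests I make afterwards. It just hangs there and doesn't even return an error. I left it for 10 to 20 minutes to see what would happen, and it just sat there. If I press ctrl+c, I get this: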

^CTraceback (most recent call last):
  File "./to_bigquery.py", line 116, in <module>
    main(sys.argv)
  File "./to_bigquery.py", line 101, in main
    print service.tabledata().insertAll(projectId=p_n, datasetId="XXX", tableId="%s_XXXX" % str(shop), body=_mybody).execute()
  File "/usr/local/lib/python2.7/dist-packages/oauth2client/util.py", line 132, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/googleapiclient/http.py", line 716, in execute
    body=self.body, headers=self.headers)
  File "/usr/local/lib/python2.7/dist-packages/oauth2client/util.py", line 132, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/oauth2client/client.py", line 490, in new_request
    redirections, connection_type)
  File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1593, in request
    (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey)
  File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1335, in _request
    (response, content) = self._conn_request(conn, request_uri, method, body, headers)
  File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1291, in _conn_request
    response = conn.getresponse()
  File "/usr/lib/python2.7/httplib.py", line 1030, in getresponse
    response.begin()
  File "/usr/lib/python2.7/httplib.py", line 407, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.7/httplib.py", line 365, in _read_status
    line = self.fp.readline()   
  File "/usr/lib/python2.7/socket.py", line 430, in readline
    data = recv(1)
  File "/usr/lib/python2.7/ssl.py", line 241, in recv
    return self.read(buflen)
  File "/usr/lib/python2.7/ssl.py", line 160, in read
    return self._sslobj.read(len)

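I've managed to work around it temporarily by placing the big load operation before the credential authorization, but this looks like a bug to me. What am I missing?

EDIT: I got an error while waiting: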

  File "/usr/local/lib/python2.7/dist-packages/oauth2client/util.py", line 132, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/googleapiclient/http.py", line 716, in execute
    body=self.body, headers=self.headers)
  File "/usr/local/lib/python2.7/dist-packages/oauth2client/util.py", line 132, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/oauth2client/client.py", line 490, in new_request
    redirections, connection_type)
  File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1593, in request
    (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey)
  File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1335, in _request
    (response, content) = self._conn_request(conn, request_uri, method, body, headers)
  File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1291, in _conn_request
    response = conn.getresponse()
  File "/usr/lib/python2.7/httplib.py", line 1030, in getresponse
    response.begin()
  File "/usr/lib/python2.7/httplib.py", line 407, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.7/httplib.py", line 365, in _read_status
    line = self.fp.readline()
  File "/usr/lib/python2.7/socket.py", line 430, in readline
    data = recv(1)
  File "/usr/lib/python2.7/ssl.py", line 241, in recv
    return self.read(buflen)
  File "/usr/lib/python2.7/ssl.py", line 160, in read
    return self._sslobj.read(len)
socket.error: [Errno 110] Connection timed out

It says the connection timed out. This seems to happen with cold tables.
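
Both tracebacks above end in a blocking socket read, and the error only surfaced once the operating system gave up on the connection. As a minimal sketch of the general idea, using only the standard library, the snippet below shows how an explicit socket timeout turns a silent hang into an exception that can be handled. The helper names `start_silent_server` and `read_with_timeout` are hypothetical and are not part of the question's code; the local server is just a stand-in for a peer that never answers. `httplib2.Http` also accepts a `timeout` argument, which may achieve the same effect for the client in the question, although the thread does not confirm that.

```python
import socket


def start_silent_server():
    """Hypothetical stand-in for an unresponsive peer: listens but never replies."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    return server


def read_with_timeout(host, port, timeout_seconds):
    """Return one byte from the peer, or None if nothing arrives in time."""
    conn = socket.create_connection((host, port), timeout=timeout_seconds)
    try:
        # Without a timeout this recv() would block for as long as the peer stays silent.
        return conn.recv(1)
    except socket.timeout:
        return None
    finally:
        conn.close()
```

With a timeout in place, the caller gets control back after a bounded wait and can decide whether to retry or rebuild the connection.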

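【Comments】:

  • Did you try moving the credential authorization and the service build to after the long operation?
  • I tried to reproduce this locally (using time.sleep for the long operation) without success. A few questions: (1) Does this happen consistently or only occasionally? (2) Could load_mappings be doing anything related to your network connection? (3) Which versions of oauth2client and googleapiclient are you using? (4) How big is _mybody? Two possibly tangential questions: (1) Do you have .execute() after the insertAll call? (Your stack trace suggests you do.) (2) What is the raw_input for, instead of print?
  • It happens consistently. The long operation loads some data from mongodb. oauth2client 1.2, googleapiclient 1.2. _mybody contains only one row. I am running execute(). The raw_input is a trap to stop the program; it never gets there. I fixed it by moving the load before building the service. :/
  • I suspect the load from mongo leaves some open connection in a state that confuses httplib2. Does the mongo library you are using use httplib2? Are you sharing the http instance with that call?
  • Def. not, I just use the typical import pymongo and c = pymongo.MongoClient(). I'm going to split the code into several applications that talk to each other. I have also seen this happen after I create new tables and try to insert into them very quickly.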

Tags: python google-api google-oauth google-bigquery


【Solution 1】:
def refresh_bq(self):
    credentials = SignedJwtAssertionCredentials(email, key, scope='https://www.googleapis.com/auth/bigquery')

    if credentials is None or credentials.invalid:
        raw_input('invalid key')
        exit(0)

    http = httplib2.Http()
    http = credentials.authorize(http)

    service = discovery.build('bigquery', 'v2', http=http)
    self.service = service

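Every time I do inserts that don't need the preprocessing, I run self.refresh_bq() and it works perfectly. It's a messy hack, but I needed to get it working ASAP. There's def. a bug somewhere.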
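
The workaround above rebuilds the service before every insert. A possible refinement, sketched here under assumptions, is to rebuild only when the client has sat idle for longer than some threshold. The class `IdleRefresher`, its `factory` argument, and the 60-second value in the usage note are hypothetical and do not come from the answer; the thread does not establish how long a connection can stay idle before requests start to hang.

```python
import time


class IdleRefresher(object):
    """Rebuilds a client when it has been idle longer than max_idle_seconds."""

    def __init__(self, factory, max_idle_seconds, clock=time.time):
        self._factory = factory            # callable that builds a fresh client
        self._max_idle = max_idle_seconds  # assumed threshold, tune as needed
        self._clock = clock                # injectable so the logic can be tested
        self._client = None
        self._last_used = None

    def get(self):
        """Return a client, rebuilding it first if it is missing or stale."""
        now = self._clock()
        stale = self._client is None or (now - self._last_used) > self._max_idle
        if stale:
            self._client = self._factory()
        self._last_used = now
        return self._client
```

In the answer's setting, `factory` would be a function that performs the same steps as refresh_bq (create the credentials, authorize a new httplib2.Http(), call discovery.build) and returns the service, so that something like IdleRefresher(factory, 60).get().tabledata() is called in place of self.service.tabledata().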
