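【Posted】: 2014-05-15 14:12:01
【Question】:
I'm building a service that streams data to BigQuery. The following code works perfectly if I remove the part that takes 4-5 minutes to load (I'm pre-caching some mappings):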
import httplib2
from googleapiclient import discovery
from oauth2client import file
from oauth2client import client
from oauth2client import tools
from oauth2client.client import SignedJwtAssertionCredentials
## load email and key
credentials = SignedJwtAssertionCredentials(email, key, scope='https://www.googleapis.com/auth/bigquery')
if credentials is None or credentials.invalid:
    raw_input('invalid key')
    exit(0)
http = httplib2.Http()
http = credentials.authorize(http)
service = discovery.build('bigquery', 'v2', http=http)
## this does not hang, because it is before the long operation
service.tabledata().insertAll(...)
## some code that takes 5 minutes to execute
r = load_mappings()
## aka long operation
## this hangs
service.tabledata().insertAll(...)
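If I leave in the part that takes 5 minutes to execute, the Google API stops responding to the requests I make afterward. It just hangs there and doesn't even return an error. I left it for 10 to 20 minutes to see what would happen, and it just sat there. If I press Ctrl+C, I get this: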
^CTraceback (most recent call last):
  File "./to_bigquery.py", line 116, in <module>
    main(sys.argv)
  File "./to_bigquery.py", line 101, in main
    print service.tabledata().insertAll(projectId=p_n, datasetId="XXX", tableId="%s_XXXX" % str(shop), body=_mybody).execute()
  File "/usr/local/lib/python2.7/dist-packages/oauth2client/util.py", line 132, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/googleapiclient/http.py", line 716, in execute
    body=self.body, headers=self.headers)
  File "/usr/local/lib/python2.7/dist-packages/oauth2client/util.py", line 132, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/oauth2client/client.py", line 490, in new_request
    redirections, connection_type)
  File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1593, in request
    (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey)
  File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1335, in _request
    (response, content) = self._conn_request(conn, request_uri, method, body, headers)
  File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1291, in _conn_request
    response = conn.getresponse()
  File "/usr/lib/python2.7/httplib.py", line 1030, in getresponse
    response.begin()
  File "/usr/lib/python2.7/httplib.py", line 407, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.7/httplib.py", line 365, in _read_status
    line = self.fp.readline()
  File "/usr/lib/python2.7/socket.py", line 430, in readline
    data = recv(1)
  File "/usr/lib/python2.7/ssl.py", line 241, in recv
    return self.read(buflen)
  File "/usr/lib/python2.7/ssl.py", line 160, in read
    return self._sslobj.read(len)
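I've managed to work around it for now by running the large load operation before the credential authorization, but this looks like a bug to me. What am I missing?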
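The workaround above amounts to never using a client that sat idle through the long load. A minimal sketch of that idea, assuming (unconfirmed) that the idle connection is what goes stale: the names FreshService, factory, max_idle_seconds and clock are hypothetical, and factory stands in for the authorize-and-build lines from the question.

```python
import time


class FreshService(object):
    """Hands out a client, rebuilding it when it has sat idle too long."""

    def __init__(self, factory, max_idle_seconds=60, clock=time.time):
        self._factory = factory            # hypothetical: callable returning a new client
        self._max_idle = max_idle_seconds  # assumed threshold, tune for your setup
        self._clock = clock                # injectable so the idle check can be tested
        self._service = None
        self._last_used = None

    def get(self):
        now = self._clock()
        # Rebuild on first use, or when the last use is older than the threshold.
        stale = (self._service is None
                 or now - self._last_used > self._max_idle)
        if stale:
            self._service = self._factory()
        self._last_used = now
        return self._service
```

With this, a call made right after load_mappings() would go through get() and so run on a newly built client rather than the one created five minutes earlier.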
Edit: I got an error after waiting:
  File "/usr/local/lib/python2.7/dist-packages/oauth2client/util.py", line 132, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/googleapiclient/http.py", line 716, in execute
    body=self.body, headers=self.headers)
  File "/usr/local/lib/python2.7/dist-packages/oauth2client/util.py", line 132, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/oauth2client/client.py", line 490, in new_request
    redirections, connection_type)
  File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1593, in request
    (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey)
  File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1335, in _request
    (response, content) = self._conn_request(conn, request_uri, method, body, headers)
  File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1291, in _conn_request
    response = conn.getresponse()
  File "/usr/lib/python2.7/httplib.py", line 1030, in getresponse
    response.begin()
  File "/usr/lib/python2.7/httplib.py", line 407, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.7/httplib.py", line 365, in _read_status
    line = self.fp.readline()
  File "/usr/lib/python2.7/socket.py", line 430, in readline
    data = recv(1)
  File "/usr/lib/python2.7/ssl.py", line 241, in recv
    return self.read(buflen)
  File "/usr/lib/python2.7/ssl.py", line 160, in read
    return self._sslobj.read(len)
socket.error: [Errno 110] Connection timed out
It says the connection timed out. This seems to happen with cold tables..
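Since the failure eventually surfaces as a socket error, one possible mitigation is to retry the request on a freshly built client. The sketch below is an illustration only, not a confirmed fix: execute_with_rebuild, build_client and do_request are hypothetical names, and it assumes the request is safe to send twice. Separately, httplib2.Http accepts a timeout argument, which should make a hang like this fail sooner than the OS-level timeout.

```python
import socket


def execute_with_rebuild(build_client, do_request, attempts=2):
    """Run do_request(client); on a socket-level error, rebuild the client and retry.

    build_client: hypothetical callable returning a new API client.
    do_request:   hypothetical callable taking that client and performing the call.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        client = build_client()  # a new client (and connection) for every attempt
        try:
            return do_request(client)
        except socket.error as exc:  # socket.timeout is a subclass of socket.error
            last_error = exc
    # Every attempt failed: surface the most recent socket error.
    raise last_error
```

Here build_client would wrap the authorize and discovery.build lines, and do_request would wrap the insertAll(...).execute() call. Rebuilding on every attempt is the simplest variant; it trades some startup cost for not having to decide whether an existing connection is still usable.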
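【Comments】:
-
Did you try moving the credential authorization and service build to after the long operation?
-
I tried to replicate this locally (using time.sleep for the long operation) but had no luck. A few questions: (1) Does this happen consistently or only occasionally? (2) Could load_mappings be doing anything related to your network connection? (3) Which versions of oauth2client and googleapiclient are you using? (4) How big is _mybody? Two possibly tangential questions: (1) Do you call .execute() after the insertAll call? (Your stack trace suggests you do.) (2) Why raw_input instead of print?
-
It happens consistently. The long operation loads some data from MongoDB. oauth2client 1.2, googleapiclient 1.2, and _mybody contains only one row. I am calling execute(). The raw_input is a trap to stop the program; it never gets there. I fixed it by moving the load before building the service. :/
-
I suspect the load from mongo leaves some open connections in a state that confuses httplib2. Does the mongo library you're using use httplib2? Are you sharing the http instance with that call?
-
Definitely not. I'm just using the typical import pymongo; c = pymongo.MongoClient(). I'm going to split the code into a few applications that talk to each other. I've also seen this happen after creating new tables and trying to insert into them very quickly.
Tags: python google-api google-oauth google-bigquery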