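As a disclaimer: when working from the command line, the bq tool is usually sufficient, and for more complex use cases the BigQuery client libraries let you program against BigQuery from multiple languages. That said, it can still be useful to make plain requests to the REST API to see how certain APIs work at a low level.
First, make sure that you have installed the Google Cloud SDK. This should include the gcloud and bq command-line tools. If you haven't already, authorize your account by running the following command from a terminal: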
gcloud auth login
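This should prompt you to log in and then give you an access code that you can paste into your terminal. (The exact process may change over time.)
Now let's try a query using the BigQuery REST API, calling the jobs.query method. Replace YOUR_PROJECT_NAME in this script with your own project ID, which you can find in the Google Cloud Console, then paste the script into your terminal: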
PROJECT="YOUR_PROJECT_NAME"
QUERY="\"SELECT 1 AS x, 'foo' AS y;\""
REQUEST="{\"kind\":\"bigquery#queryRequest\",\"useLegacySql\":false,\"query\":$QUERY}"
echo "$REQUEST" | \
  curl -X POST -d @- -H "Content-Type: application/json" \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://www.googleapis.com/bigquery/v2/projects/$PROJECT/queries"
If it works, you should see output like this:
{
  "kind": "bigquery#queryResponse",
  "schema": {
    "fields": [
      {
        "name": "x",
        "type": "INTEGER",
        "mode": "NULLABLE"
      },
      {
        "name": "y",
        "type": "STRING",
        "mode": "NULLABLE"
      }
    ]
  },
  "jobReference": {
    "projectId": "<your project ID>",
    "jobId": "<your job ID>"
  },
  "totalRows": "1",
  "rows": [
    {
      "f": [
        {
          "v": "1"
        },
        {
          "v": "foo"
        }
      ]
    }
  ],
  "totalBytesProcessed": "0",
  "jobComplete": true,
  "cacheHit": false
}
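Hand-escaping the QUERY string, as the script above does, gets awkward once a query itself contains double quotes or backslashes. The following is a minimal sketch of one way to build the request body; json_escape and build_query_request are hypothetical helper names, not part of gcloud or bq, and the escaping covers only backslashes and double quotes.

```shell
# Hypothetical helper: escape backslashes and double quotes so a SQL string
# can be embedded in a JSON string literal. Newlines, tabs, and other control
# characters are NOT handled here.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Hypothetical helper: build a jobs.query request body for a standard SQL query.
build_query_request() {
  printf '{"useLegacySql":false,"query":"%s"}' "$(json_escape "$1")"
}

# Print the request body for a query that mixes single and double quotes.
build_query_request "SELECT 'foo' AS y, \"bar\" AS z"
```

The printed body can be piped to the same curl command in place of echo "$REQUEST". For queries that span multiple lines, a real JSON tool is a safer choice than this sed-based escaping.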
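If you haven't set up the bq command-line tool yet, you can do so by running bq init from your terminal. Once that's done, you can try running the same query with it: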
bq query --use_legacy_sql=False "SELECT 1 AS x, 'foo' AS y;"
You can also see the REST API requests that the bq tool makes by passing the --apilog= option:
bq --apilog= query --use_legacy_sql=False "SELECT [1, 2, 3] AS x;"
Now let's try an example that uses the jobs.insert method instead of the query API. Run this script, replacing YOUR_PROJECT_NAME with your project ID:
PROJECT="YOUR_PROJECT_NAME"
QUERY="\"SELECT 1 AS x, 'foo' AS y;\""
REQUEST="{\"configuration\":{\"query\":{\"useLegacySql\":false,\"query\":${QUERY}}}}"
echo "$REQUEST" | \
  curl -X POST -d @- -H "Content-Type: application/json" \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://www.googleapis.com/bigquery/v2/projects/$PROJECT/jobs"
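Unlike the query API, which returned the query results right away, jobs.insert returns a job resource that describes the job, similar to this: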
{
  "kind": "bigquery#job",
  "etag": "\"<etag string>\"",
  "id": "<project name>:<job ID>",
  "selfLink": "https://www.googleapis.com/bigquery/v2/projects/<project name>/jobs/<job ID>",
  "jobReference": {
    "projectId": "<project name>",
    "jobId": "<job ID>"
  },
  "configuration": {
    "query": {
      "query": "SELECT 1 AS x, 'foo' AS y;",
      "destinationTable": {
        "projectId": "<project name>",
        "datasetId": "<anonymous dataset>",
        "tableId": "<anonymous table>"
      },
      "createDisposition": "CREATE_IF_NEEDED",
      "writeDisposition": "WRITE_TRUNCATE",
      "useLegacySql": false
    }
  },
  "status": {
    "state": "RUNNING"
  },
  "statistics": {
    "creationTime": "<timestamp millis>",
    "startTime": "<timestamp millis>"
  },
  "user_email": "<your email address>"
}
Notice the status:
"status": {
  "state": "RUNNING"
},
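The later steps need the job ID from this response. If you would rather not copy it by hand, something along these lines can pull it out of the JSON. This is a rough sketch: extract_job_id is a hypothetical helper name, the sample values my-project and job_abc123 are made up, and the text matching assumes the pretty-printed layout shown above (a proper JSON tool such as jq would be more robust).

```shell
# Hypothetical helper: print the first "jobId" value found in JSON on stdin.
# Crude text matching, not a JSON parser.
extract_job_id() {
  grep -o '"jobId": *"[^"]*"' | head -n 1 | sed 's/.*: *"\([^"]*\)"$/\1/'
}

# Example with a trimmed-down response (made-up values); in practice, pipe
# the output of the curl command into extract_job_id instead.
extract_job_id <<'EOF'
{
  "jobReference": {
    "projectId": "my-project",
    "jobId": "job_abc123"
  }
}
EOF
```

To capture the value in a variable, add -s to the curl command to silence its progress output and wrap the pipeline in a command substitution, for example JOB_ID=$(... | extract_job_id).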
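If you want to check on the job right away, you can use the jobs.get method. As before, run this from your terminal, using the job ID from the output of the previous step: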
PROJECT="YOUR_PROJECT_NAME"
JOB_ID="YOUR_JOB_ID"
curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://www.googleapis.com/bigquery/v2/projects/$PROJECT/jobs/$JOB_ID"
If the query has finished, you'll get a response like this:
...
"status": {
  "state": "DONE"
},
...
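Since a job can still be RUNNING when you call jobs.get, a script usually has to poll until the state becomes DONE. The sketch below shows one way to do that under some assumptions: job_state, get_job_json, and wait_for_job are hypothetical helper names, the retry count and delay defaults are arbitrary, and the state is extracted by crude text matching on the first "state" field rather than by parsing the JSON.

```shell
# Hypothetical helper: read a job resource (JSON) on stdin and print the first
# "state" value found. Crude text match, not a JSON parser.
job_state() {
  grep -o '"state": *"[A-Z]*"' | head -n 1 | sed 's/.*"\([A-Z]*\)"$/\1/'
}

# Fetch the job resource with jobs.get, using the same request as above.
get_job_json() {
  curl -s -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://www.googleapis.com/bigquery/v2/projects/$1/jobs/$2"
}

# Poll until the job reports DONE. Gives up (returns 1) after max_tries
# attempts, sleeping delay seconds between attempts.
wait_for_job() {
  project=$1; job_id=$2; max_tries=${3:-30}; delay=${4:-2}
  i=0
  while [ "$i" -lt "$max_tries" ]; do
    state=$(get_job_json "$project" "$job_id" | job_state)
    if [ "$state" = "DONE" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage: wait_for_job "$PROJECT" "$JOB_ID" && echo "job finished"
```

Note that DONE only means the job has stopped running. A failed job also ends in DONE, with an errorResult field in its status, so check for that before fetching results.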
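Finally, we can make a request to fetch the query results, again using the REST API (the jobs.getQueryResults method). This reuses the PROJECT and JOB_ID variables from the previous step: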
curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://www.googleapis.com/bigquery/v2/projects/$PROJECT/queries/$JOB_ID"
The output will be similar to what we saw when using the jobs.query method above:
{
  "kind": "bigquery#getQueryResultsResponse",
  "etag": "\"<etag string>\"",
  "schema": {
    "fields": [
      {
        "name": "x",
        "type": "INTEGER",
        "mode": "NULLABLE"
      },
      {
        "name": "y",
        "type": "STRING",
        "mode": "NULLABLE"
      }
    ]
  },
  "jobReference": {
    "projectId": "<project ID>",
    "jobId": "<job ID>"
  },
  "totalRows": "1",
  "rows": [
    {
      "f": [
        {
          "v": "1"
        },
        {
          "v": "foo"
        }
      ]
    }
  ],
  "totalBytesProcessed": "0",
  "jobComplete": true,
  "cacheHit": true
}
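In both responses, the data sits under rows, where each row has an f array of cells and each cell carries its value in v. Values arrive as strings (the INTEGER column x comes back as "1"), so the schema tells you how to interpret them. As a rough illustration, the following sketch prints the cell values one per line. cell_values is a hypothetical helper name, and the text matching assumes simple values without escaped double quotes, NULLs, or nested fields.

```shell
# Hypothetical helper: print every cell value ("v") from a query response on
# stdin, one per line, in row-major order. Crude text matching, not a JSON
# parser: it assumes values contain no escaped double quotes and skips NULLs.
cell_values() {
  grep -o '"v": *"[^"]*"' | sed 's/^"v": *"\(.*\)"$/\1/'
}

# Example using the rows section of the response shown above.
cell_values <<'EOF'
{
  "rows": [
    { "f": [ { "v": "1" }, { "v": "foo" } ] }
  ]
}
EOF
```

For anything beyond a quick look at the data, a JSON-aware tool or one of the BigQuery client libraries is a better fit than this kind of text matching.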