【Posted】: 2021-01-20 22:57:04
【Question】:
Summary
When I run a SELECT query with a WHERE clause against a partitioned table, Athena returns an error.
My logs table has 4 partition columns:
- year (string)
- month (string)
- day (string)
- hour (string)
I ran a SELECT query against the partitioned table, but received the following error message.
Error message
GENERIC_INTERNAL_ERROR: No value present
This query ran against the "default" database, unless qualified by the query.
SELECT queries I tried
SELECT *
FROM logs
WHERE year='2020'
AND month='10'
AND day ='05';
and
SELECT *
FROM "default"."logs"
WHERE year='2020'
AND month='10'
AND day ='05';
Because the error message says No value present, I checked the table's partitions.
SHOW PARTITIONS logs;
Result
year=2020/month=10/day=05/hour=17
year=2020/month=10/day=05/hour=11
year=2020/month=10/day=05/hour=19
year=2020/month=10/day=05/hour=04
year=2020/month=10/day=05/hour=18
year=2020/month=10/day=05/hour=15
year=2020/month=10/day=05/hour=14
year=2020/month=10/day=05/hour=16
year=2020/month=10/day=05/hour=13
year=2020/month=10/day=05/hour=21
year=2020/month=10/day=05/hour=05
year=2020/month=10/day=05/hour=08
year=2020/month=10/day=05/hour=20
year=2020/month=10/day=05/hour=12
year=2020/month=10/day=05/hour=03
year=2020/month=10/day=05/hour=01
year=2020/month=10/day=05/hour=10
year=2020/month=10/day=05/hour=02
year=2020/month=10/day=05/hour=09
year=2020/month=10/day=05/hour=22
year=2020/month=10/day=05/hour=23
year=2020/month=10/day=05/hour=06
year=2020/month=10/day=05/hour=07
year=2020/month=10/day=05/hour=00
year=2020/month=10/day=04/hour=00
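To double-check that the listing above really contains partitions for the filter used in the WHERE clause, the output can be parsed and filtered locally. This is only a minimal Python sketch: the helper names parse_partitions and matching are made up for illustration, and the sample string is a shortened copy of the listing above.

```python
def parse_partitions(listing):
    """Turn SHOW PARTITIONS output lines into {column: value} dicts."""
    partitions = []
    for line in listing.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each line looks like "year=2020/month=10/day=05/hour=17".
        partitions.append(dict(part.split("=", 1) for part in line.split("/")))
    return partitions


def matching(partitions, **wanted):
    """Keep partitions whose values equal every keyword filter (string comparison)."""
    return [p for p in partitions if all(p.get(k) == v for k, v in wanted.items())]


# Shortened copy of the SHOW PARTITIONS result (assumption: same line format).
sample = """\
year=2020/month=10/day=05/hour=17
year=2020/month=10/day=05/hour=00
year=2020/month=10/day=04/hour=00
"""

hits = matching(parse_partitions(sample), year="2020", month="10", day="05")
print(len(hits), "matching partitions")
```

Values are compared as strings on purpose, because all four partition columns are declared as string, so '05' and '5' would not be treated as equal.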
Any help would be much appreciated.
More information
The CREATE TABLE command I used
CREATE EXTERNAL TABLE `logs`(
`date` date,
`time` string,
`location` string,
`bytes` bigint,
`request_ip` string,
`method` string,
`host` string,
`uri` string,
`status` int,
`referrer` string,
`user_agent` string,
`query_string` string,
`cookie` string,
`result_type` string,
`request_id` string,
`host_header` string,
`request_protocol` string,
`request_bytes` bigint,
`time_taken` float,
`xforwarded_for` string,
`ssl_protocol` string,
`ssl_cipher` string,
`response_result_type` string,
`http_version` string,
`fle_status` string,
`fle_encrypted_fields` int)
PARTITIONED BY (
`year` string,
`month` string,
`day` string,
`hour` string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
WITH SERDEPROPERTIES (
'input.regex'='^(?!#)([^ \\\\t]+)\\\\s+([^ \\\\t]+)\\\\s+([^ \\\\t]+)\\\\s+([^ \\\\t]+)\\\\s+([^ \\\\t]+)\\\\s+([^ \\\\t]+)\\\\s+([^ \\\\t]+)\\\\s+([^ \\\\t]+)\\\\s+([^ \\\\t]+)\\\\s+([^ \\\\t]+)\\\\s+([^ \\\\t]+)\\\\s+([^ \\\\t]+)\\\\s+([^ \\\\t]+)\\\\s+([^ \\\\t]+)\\\\s+([^ \\\\t]+)\\\\s+([^ \\\\t]+)\\\\s+([^ \\\\t]+)\\\\s+([^ \\\\t]+)\\\\s+([^ \\\\t]+)\\\\s+([^ \\\\t]+)\\\\s+([^ \\\\t]+)\\\\s+([^ \\\\t]+)\\\\s+([^ \\\\t]+)\\\\s+([^ \\\\t]+)\\\\s+([^ \\\\t]+)\\\\s+([^ \\\\t]+)$')
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
's3://mybucket/path'
TBLPROPERTIES (
'projection.date.format'='yyyy/MM/dd',
'projection.date.interval'='1',
'projection.date.interval.unit'='DAYS',
'projection.date.range'='2019/11/27, NOW-1DAYS',
'projection.date.type'='date',
'projection.day.type'='string',
'projection.enabled'='true',
'projection.hour.type'='string',
'projection.month.type'='string',
'projection.year.type'='string',
'skip.header.line.count'='2',
'storage.location.template'='s3://mybucket/path/distributionID/${year}/${month}/${day}/${hour}/',
'transient_lastDdlTime'='1575005094')
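Since 'projection.enabled' is 'true', the projection properties may also be worth a look. As far as I understand the Athena partition projection documentation, the documented projection types are enum, integer, date and injected, and projection.<column>.* keys are meant to name partition columns. The sketch below is a hypothetical local check (check_projection is not an Athena or AWS API) that compares the properties above with the PARTITIONED BY columns. It is a hint about where to look, not a confirmed diagnosis of the error.

```python
# Projection types as I understand them from the Athena documentation (assumption).
SUPPORTED_TYPES = {"enum", "integer", "date", "injected"}


def check_projection(partition_columns, tblproperties):
    """Return a list of findings; an empty list means nothing was flagged."""
    findings = []
    typed_columns = set()
    unknown_columns = set()
    for key, value in sorted(tblproperties.items()):
        parts = key.split(".", 2)
        # Only keys shaped like "projection.<column>.<setting>" are inspected.
        if len(parts) != 3 or parts[0] != "projection":
            continue
        _, column, setting = parts
        if column not in partition_columns:
            unknown_columns.add(column)
            continue
        if setting == "type":
            typed_columns.add(column)
            if value not in SUPPORTED_TYPES:
                findings.append(f"projection.{column}.type={value!r} is not a documented type")
    for column in sorted(unknown_columns):
        findings.append(f"projection.{column}.* names a column that is not a partition column")
    for column in partition_columns:
        if column not in typed_columns:
            findings.append(f"partition column {column!r} has no projection.{column}.type")
    return findings


# Values copied from the CREATE TABLE statement above.
partition_columns = ["year", "month", "day", "hour"]
tblproperties = {
    "projection.date.format": "yyyy/MM/dd",
    "projection.date.interval": "1",
    "projection.date.interval.unit": "DAYS",
    "projection.date.range": "2019/11/27, NOW-1DAYS",
    "projection.date.type": "date",
    "projection.day.type": "string",
    "projection.enabled": "true",
    "projection.hour.type": "string",
    "projection.month.type": "string",
    "projection.year.type": "string",
}

findings = check_projection(partition_columns, tblproperties)
for finding in findings:
    print(finding)
```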
【Discussion】:
-
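Can you show us a sample of the data, along with the CREATE TABLE command used to create the table, so that we can try to reproduce the situation?
-
Sure, @JohnRotenstein. As requested, I have added the CREATE TABLE information. Please check the question above.
-
Ah! It looks like you are querying logs from Amazon CloudFront. This might help: Querying Amazon CloudFront Logs - Amazon Athena. Are you somehow placing the logs into subdirectories based on the dates in the files?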
-
@JohnRotenstein OK, I'll take a look. Thanks.
-
What do you get if you just run select * from logs?
Tags: amazon-web-services amazon-athena partition