【Question title】: Amazon Athena: create request with partitions
【Posted】: 2018-08-06 19:00:00
【Question】:

I created a table partitioned by year, then month, then day.

Question: I want to get the data for 12/2017 and 03/2018. How can I do that? My idea:

where (year='2017' and month='12') and ( year ='2018' and month='03')

Is that correct? Or will Amazon Athena get confused and fetch the following data:

12/2017 and 03/2018 and 03/2017 and 12/2018 

because of the and operator?

PS: I can't test this myself; I only have a free account. Thanks.

【Comments】:

    Tags: sql amazon-s3 amazon-athena


    【Solution 1】:

    Anyway, I tried it with a mini data set and found that Amazon Athena respects the parentheses.

    My test is as follows. The DDL of the generated table:

    CREATE EXTERNAL TABLE `manyands`(
      `years` int COMMENT 'from deserializer', 
      `months` int COMMENT 'from deserializer', 
      `days` int COMMENT 'from deserializer')
    PARTITIONED BY ( 
      `year` string, 
      `month` string)
    ROW FORMAT SERDE 
      'org.openx.data.jsonserde.JsonSerDe' 
    STORED AS INPUTFORMAT 
      'org.apache.hadoop.mapred.TextInputFormat' 
    OUTPUTFORMAT 
      'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
    LOCATION
      's3://mybucket/'
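
    Before queries against this table return rows, each partition has to be registered in the table metadata. The post does not show how this was done, so the following is only a minimal sketch: the S3 prefix under `s3://mybucket/` is a hypothetical layout, not the author's.

    ```sql
    -- Run each statement separately in the Athena query editor.
    -- Hypothetical partition location; the post does not show the real S3 layout.
    ALTER TABLE manyands ADD IF NOT EXISTS
      PARTITION (year = '2017', month = '1')
      LOCATION 's3://mybucket/2017/1/';

    -- List the registered partitions to see the exact string values
    -- (for example '3' versus '03') that a WHERE clause has to match.
    SHOW PARTITIONS manyands;
    ```

    Because `year` and `month` are declared as strings, the values in a filter have to match the registered partition values character for character.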
    

    My test data set:

    My tests:

    1-SELECT * FROM "atlasdatabase"."manyands" where month='1'; I got the following, in CSV format:

    "years","months","days","year","month"
    "2017","1","21","2017","1"
    "2018","1","81","2018","1"
    

    2-SELECT * FROM "atlasdatabase"."manyands" where month='1' and year='2017';

    "years","months","days","year","month"
    "2017","1","21","2017","1"
    

    3-SELECT * FROM "atlasdatabase"."manyands" where (month='1' and year='2018') and (month='3' and year='2017') ;

    empty (zero records returned)
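
    Query 3 returns nothing because no single row can satisfy both parenthesized groups: every row has exactly one `month` value and one `year` value. As a sketch against the same table, here is the same predicate with the parentheses removed, which makes the contradiction visible:

    ```sql
    -- Equivalent to query 3: AND is associative, so the parentheses change nothing.
    -- A row cannot have month = '1' and month = '3' at the same time,
    -- so this predicate is false for every row and the result is empty.
    SELECT *
    FROM "atlasdatabase"."manyands"
    WHERE month = '1' AND year = '2018'
      AND month = '3' AND year = '2017';
    ```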
    

    4-SELECT * FROM "atlasdatabase"."manyands" where (month='1' and year='2018') or (month='3' ) ;

    "years","months","days","year","month"
    "2018","1","81","2018","1"
    "2017","3","73","2017","3"
    "2018","3","73","2018","3"
    

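    Conclusion: to select several partitions, wrap each partition's conditions in parentheses and join the groups with the OR operator, not AND.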
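
    Applied to the original question (12/2017 and 03/2018), a minimal sketch could look like the query below. It reuses the test table from this answer, and it assumes the partition values are stored as the strings '12' and '03'; if the months were registered without zero padding, as in the test data above, the second filter would need '3' instead.

    ```sql
    -- Each parenthesized group selects one (year, month) partition;
    -- OR combines the two groups, so rows from either partition are returned.
    -- 03/2017 and 12/2018 are not matched, because each group pins both columns.
    SELECT *
    FROM "atlasdatabase"."manyands"
    WHERE (year = '2017' AND month = '12')
       OR (year = '2018' AND month = '03');
    ```

    Since both columns are partition keys, a filter like this should also let Athena restrict the scan to the matching partitions.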
