【Question title】: Pig - How to Join and Define Schema in One Step
【Posted】: 2014-04-15 09:53:43
【Question】:

I currently do the following:

A = LOAD 'a.txt' USING PigStorage('\\u001') AS (
    foo:int
    ,bar:chararray
);
B = LOAD 'b.txt' USING PigStorage('\\u001') AS (
    foo:int
    ,baz:long
);
C = JOIN A BY foo, B BY foo;
D = FOREACH C GENERATE
    A::foo AS foo
    ,A::bar AS bar
    ,B::baz AS baz
;

How can I join the relations and define the schema in a single step?

【Comments】:

    Tags: hadoop apache-pig bigdata cloudera


    【Answer 1】:

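    According to the documentation, you cannot define a schema while joining relations.
    Note: syntactically, you can nest the statements, which makes it look as if you have saved some steps, for example: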

    D = foreach
        (join (LOAD 'a.txt' USING PigStorage('\\u001') AS (foo:int ,bar:chararray)) by foo,
              (LOAD 'b.txt' USING PigStorage('\\u001') AS (foo:int ,baz:long)) by foo
        ) generate $0 as foo, $1 as bar, $3 as baz;
    

    I would avoid doing this, though. It is messy to read, and it generates the same explain plan as the original script.
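
    One way to check this yourself is sketched below. It is a minimal example that reuses the file names and schemas from the question; DESCRIBE and EXPLAIN are standard Pig diagnostic operators, and collapsing each schema onto one line is only a formatting choice. Running EXPLAIN on D from the multi-step script and on D from the nested script lets you compare the two plans.

    ```pig
    -- Multi-step version from the question, with diagnostics added.
    A = LOAD 'a.txt' USING PigStorage('\\u001') AS (foo:int, bar:chararray);
    B = LOAD 'b.txt' USING PigStorage('\\u001') AS (foo:int, baz:long);

    C = JOIN A BY foo, B BY foo;
    -- After the join, each field is prefixed with its source relation (A:: or B::).
    DESCRIBE C;

    -- The FOREACH is the step that gives the joined fields their final names.
    D = FOREACH C GENERATE A::foo AS foo, A::bar AS bar, B::baz AS baz;
    DESCRIBE D;

    -- Print the logical, physical, and MapReduce plans used to compute D.
    EXPLAIN D;
    ```

    Because the join itself only prefixes field names, the renaming has to live in a FOREACH either way; nesting changes where that FOREACH is written, not whether it runs.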

    【Comments】:
