[Question title]: Create a table with two columns of type RECORD
[Posted]: 2016-03-21 08:26:32
[Question]:

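I am using BigQuery and want to create a job that fills a table containing columns of type RECORD. The data will be populated by a query. Is it possible to create a table with two RECORD-type columns?

Something similar to the table [bigquery-public-data:samples.trigrams] in the BigQuery public datasets.

Thanks!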

[Comments]:

    Tags: google-bigquery


    [Answer 1]:

    The simplest way to shape query output as records is a JavaScript UDF (this example uses BigQuery legacy SQL).

    For example:

    SELECT *
    FROM js(
    (
      SELECT item
      FROM [fh-bigquery:wikidata.latest_raw] 
    ),
    item,
    "[{name: 'id', type:'string'},
      {name: 'sitelinks', type:'record', mode:'repeated', fields: [{name: 'site', type: 'string'},{name: 'title', type: 'string'},{name: 'encoded', type: 'string'}]},
      ]",
      "function(r, emit) {
        [...]
        emit({
          id: obj.id,
          sitelinks: sitelinks,
        });
      }")
    

    See the full example at https://github.com/fhoffa/code_snippets/blob/master/wikidata/create_wiki_en_table.sql.

    [Comments]:

    • Thanks! Is it possible to use the NEST function?
    [Answer 2]:

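    With the introduction of BigQuery Standard SQL, there is a simpler way to work with records.
    Try the query below, and remember to uncheck the Use Legacy SQL checkbox under Show Options.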

    WITH YourTable AS (
      SELECT 1 AS a, 2 AS b, 3 AS c, 11 AS x, 12 AS y, 13 AS z UNION ALL
      SELECT 1 AS a, 2 AS b, 3 AS c, 11 AS x, 12 AS y, 13 AS z UNION ALL
      SELECT 2 AS a, 2 AS b, 3 AS c, 11 AS x, 12 AS y, 13 AS z UNION ALL
      SELECT 2 AS a, 2 AS b, 3 AS c, 12 AS x, 12 AS y, 13 AS z UNION ALL
      SELECT 3 AS a, 2 AS b, 3 AS c, 12 AS x, 12 AS y, 13 AS z UNION ALL
      SELECT 3 AS a, 2 AS b, 3 AS c, 12 AS x, 12 AS y, 13 AS z UNION ALL
      SELECT 3 AS a, 2 AS b, 3 AS c, 13 AS x, 12 AS y, 13 AS z
    )
    SELECT 
      a, ARRAY_AGG(STRUCT(b, c)) AS aa, 
      x, ARRAY_AGG(STRUCT(y, z)) AS xx
    FROM YourTable
    GROUP BY a, x
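
To make the output shape of the query above easier to picture, here is a minimal local sketch in plain JavaScript. It is not BigQuery code: `groupIntoRecords` is a hypothetical helper that imitates `GROUP BY a, x` with the two `ARRAY_AGG(STRUCT(...))` columns, using the same sample rows. BigQuery itself does not guarantee row order or array element order without an explicit `ORDER BY`, so the ordering shown here is only an artifact of the sketch.

```javascript
// Sample rows copied from the YourTable CTE in the query above.
const rows = [
  { a: 1, b: 2, c: 3, x: 11, y: 12, z: 13 },
  { a: 1, b: 2, c: 3, x: 11, y: 12, z: 13 },
  { a: 2, b: 2, c: 3, x: 11, y: 12, z: 13 },
  { a: 2, b: 2, c: 3, x: 12, y: 12, z: 13 },
  { a: 3, b: 2, c: 3, x: 12, y: 12, z: 13 },
  { a: 3, b: 2, c: 3, x: 12, y: 12, z: 13 },
  { a: 3, b: 2, c: 3, x: 13, y: 12, z: 13 },
];

// Hypothetical helper (not a BigQuery API): groups rows by (a, x) and
// collects {b, c} and {y, z} pairs into two repeated-record columns.
function groupIntoRecords(input) {
  const groups = new Map();
  for (const r of input) {
    const key = `${r.a}|${r.x}`;
    if (!groups.has(key)) {
      groups.set(key, { a: r.a, aa: [], x: r.x, xx: [] });
    }
    const g = groups.get(key);
    g.aa.push({ b: r.b, c: r.c });
    g.xx.push({ y: r.y, z: r.z });
  }
  return Array.from(groups.values());
}

const result = groupIntoRecords(rows);
console.log(JSON.stringify(result, null, 2));
```

Each output row carries two repeated RECORD columns, `aa` and `xx`, which is the table layout the question asks for.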
    

    A similar result can be achieved in BigQuery Legacy SQL with the following code:

    SELECT *
    FROM JS( 
      ( // input table 
      SELECT 
        a, GROUP_CONCAT(CONCAT(STRING(b), ';', STRING(c))) AS aa, 
        x, GROUP_CONCAT(CONCAT(STRING(y), ';', STRING(z))) AS xx
      FROM 
        (SELECT 1 AS a, 2 AS b, 3 AS c, 11 AS x, 12 AS y, 13 AS z),
        (SELECT 1 AS a, 2 AS b, 3 AS c, 11 AS x, 12 AS y, 13 AS z),
        (SELECT 2 AS a, 2 AS b, 3 AS c, 11 AS x, 12 AS y, 13 AS z),
        (SELECT 2 AS a, 2 AS b, 3 AS c, 12 AS x, 12 AS y, 13 AS z),
        (SELECT 3 AS a, 2 AS b, 3 AS c, 12 AS x, 12 AS y, 13 AS z),
        (SELECT 3 AS a, 2 AS b, 3 AS c, 12 AS x, 12 AS y, 13 AS z),
        (SELECT 3 AS a, 2 AS b, 3 AS c, 13 AS x, 12 AS y, 13 AS z)
      GROUP BY a,x
      ), 
      a, aa, x, xx, // input columns 
      "[ // output schema 
      {name: 'a', type:'integer'},
      {name: 'aa', type:'record', mode:'repeated', 
      fields: [
        {name: 'b', type: 'integer'},
        {name: 'c', type: 'integer'}
        ]},
      {name: 'x', type:'integer'},
      {name: 'xx', type:'record', mode:'repeated', 
      fields: [
        {name: 'y', type: 'integer'},
        {name: 'z', type: 'integer'}
        ]}
       ]", 
      "function(row, emit) { // function 
        // rebuild the repeated record aa from the b;c pairs
        var aa = [];
        var aa1 = row.aa.split(',');
        for (var i = 0; i < aa1.length; i++) {
          var aa2 = aa1[i].split(';');
          aa.push({b:parseInt(aa2[0]), c:parseInt(aa2[1])});
        }
        // rebuild the repeated record xx from the y;z pairs
        var xx = [];
        var xx1 = row.xx.split(',');
        for (var j = 0; j < xx1.length; j++) {
          var xx2 = xx1[j].split(';');
          xx.push({y:parseInt(xx2[0]), z:parseInt(xx2[1])});
        }
        emit({
          a: row.a, 
          aa: aa, 
          x: row.x,
          xx: xx
          }); 
      }"
    )  
    

    For this to work (in Legacy SQL), you need to set a destination table, check the Allow Large Results checkbox, and uncheck the Flatten Results checkbox (all under Show Options).
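
The string-splitting step inside the legacy SQL UDF can be tried outside BigQuery. The sketch below is an assumption-laden local stand-in, not the answer's exact code: `toRecords`, `parsePairs`, and the `emitted` array are hypothetical names, and the input row imitates what `GROUP_CONCAT(CONCAT(STRING(b), ';', STRING(c)))` would produce for the group `a = 1, x = 11` of the sample data.

```javascript
// Same idea as the UDF body above, written as a named function for local testing.
function toRecords(row, emit) {
  // Hypothetical helper: turns "2;3,2;3" into [{b: 2, c: 3}, {b: 2, c: 3}].
  function parsePairs(text, first, second) {
    return text.split(',').map(function (pair) {
      var parts = pair.split(';');
      var record = {};
      record[first] = parseInt(parts[0], 10);
      record[second] = parseInt(parts[1], 10);
      return record;
    });
  }
  emit({
    a: row.a,
    aa: parsePairs(row.aa, 'b', 'c'),
    x: row.x,
    xx: parsePairs(row.xx, 'y', 'z')
  });
}

// Stand-in for the emit callback BigQuery passes in: collects rows in an array.
var emitted = [];
toRecords({ a: 1, aa: '2;3,2;3', x: 11, xx: '12;13,12;13' }, function (r) {
  emitted.push(r);
});
console.log(JSON.stringify(emitted));
```

Checking the function locally like this makes it easier to confirm that the emitted object matches the declared output schema before running the query with a destination table.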
