【Question title】: Converting a MongoDB aggregate into an ArangoDB COLLECT
【Posted】: 2018-06-06 17:07:45
【Question】:

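I am migrating data from Mongo to Arango and need to reproduce a $group aggregation. I have successfully reproduced the results, but I am concerned that my approach may be sub-optimal. Can the AQL be improved?

I have a collection of data that looks like this: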

{
    "_id" : ObjectId("5b17f9d85b2c1998598f054e"),
    "department" : [ 
        "Sales", 
        "Marketing"
    ],
    "region" : [ 
        "US", 
        "UK"
    ]
}

{
    "_id" : ObjectId("5b1808145b2c1998598f054f"),
    "department" : [ 
        "Sales", 
        "Marketing"
    ],
    "region" : [ 
        "US", 
        "UK"
    ]
}

{
    "_id" : ObjectId("5b18083c5b2c1998598f0550"),
    "department" : "Development",
    "region" : "Europe"
}

{
    "_id" : ObjectId("5b1809a75b2c1998598f0551"),
    "department" : "Sales"
}

Note that each value can be a string, an array, or absent.

In Mongo, I use the following code to aggregate the data:

db.test.aggregate([
{
    $unwind:{
        path:"$department",
        preserveNullAndEmptyArrays: true
    }
},
{
    $unwind:{
        path:"$region",
        preserveNullAndEmptyArrays: true
    }
},
{
    $group:{
        _id:{
            department:{ $ifNull: [ "$department", "null" ] },
            region:{ $ifNull: [ "$region", "null" ] },
        },
        count:{$sum:1}
    }
}
])

In Arango, I use the following AQL:

FOR i IN test
    LET FIELD1=(FOR a IN APPEND([],NOT_NULL(i.department,"null")) RETURN a)
    LET FIELD2=(FOR a IN APPEND([],NOT_NULL(i.region,"null")) RETURN a)

    FOR f1 IN FIELD1
        FOR f2 IN FIELD2
            COLLECT id={department:f1,region:f2} WITH COUNT INTO counter

            RETURN {_id:id,count:counter}

Edit: APPEND is used to convert string values into arrays.

Both produce results that look like this:

{
    "_id" : {
        "department" : "Marketing",
        "region" : "US"
    },
    "count" : 2.0
}

{
    "_id" : {
        "department" : "Development",
        "region" : "Europe"
    },
    "count" : 1.0
}

{
    "_id" : {
        "department" : "Sales",
        "region" : "null"
    },
    "count" : 1.0
}

{
    "_id" : {
        "department" : "Marketing",
        "region" : "UK"
    },
    "count" : 2.0
}

{
    "_id" : {
        "department" : "Sales",
        "region" : "UK"
    },
    "count" : 2.0
}

{
    "_id" : {
        "department" : "Sales",
        "region" : "US"
    },
    "count" : 2.0
}

【Comments】:

    Tags: arangodb aql


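    【Answer 1】:

    Your approach seems fine. I would suggest using TO_ARRAY() instead of APPEND(), though, to make it easier to understand.

    Both functions skip null values, so you either have to provide a placeholder, or test for null explicitly and return an array containing a null value (or whatever works best for you):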

    FOR doc IN test
        FOR field1 IN doc.department == null ? [ null ] : TO_ARRAY(doc.department)
        FOR field2 IN doc.region == null ? [ null ] : TO_ARRAY(doc.region)
        COLLECT department = field1, region = field2
        WITH COUNT INTO count
            RETURN { _id: { department, region }, count }
    

    Collection test:

    [
      {
        "_key": "5b17f9d85b2c1998598f054e",
        "department": [
          "Sales",
          "Marketing"
        ],
        "region": [
          "US",
          "UK"
        ]
      },
      {
        "_key": "5b18083c5b2c1998598f0550",
        "department": "Development",
        "region": "Europe"
      },
      {
        "_key": "5b1808145b2c1998598f054f",
        "department": [
          "Sales",
          "Marketing"
        ],
        "region": [
          "US",
          "UK"
        ]
      },
      {
        "_key": "5b1809a75b2c1998598f0551",
        "department": "Sales"
      }
    ]
    

    Result:

    [
      {
        "_id": {
          "department": "Development",
          "region": "Europe"
        },
        "count": 1
      },
      {
        "_id": {
          "department": "Marketing",
          "region": "UK"
        },
        "count": 2
      },
      {
        "_id": {
          "department": "Marketing",
          "region": "US"
        },
        "count": 2
      },
      {
        "_id": {
          "department": "Sales",
          "region": null
        },
        "count": 1
      },
      {
        "_id": {
          "department": "Sales",
          "region": "UK"
        },
        "count": 2
      },
      {
        "_id": {
          "department": "Sales",
          "region": "US"
        },
        "count": 2
      }
    ]
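
    To check the grouping logic outside the database, the query can be modelled in plain JavaScript. This is only a sketch of the semantics, not ArangoDB or MongoDB driver code: the helper names toArrayOrNull and countCombinations are made up for illustration, and the helper handles only strings, arrays and missing values (AQL's TO_ARRAY() also accepts other types, which are not modelled here).

```javascript
// Plain-JavaScript model of the AQL query above (hypothetical helper names).
// Mirrors: value == null ? [ null ] : TO_ARRAY(value)
function toArrayOrNull(value) {
  if (value === null || value === undefined) return [null];
  return Array.isArray(value) ? value : [value];
}

// Counts every (department, region) combination, like
// COLLECT department = ..., region = ... WITH COUNT INTO count.
function countCombinations(docs) {
  const counts = new Map();
  for (const doc of docs) {
    for (const department of toArrayOrNull(doc.department)) {
      for (const region of toArrayOrNull(doc.region)) {
        // Serialize the pair so it can be used as a Map key.
        const key = JSON.stringify([department, region]);
        counts.set(key, (counts.get(key) || 0) + 1);
      }
    }
  }
  return [...counts].map(([key, count]) => {
    const [department, region] = JSON.parse(key);
    return { _id: { department, region }, count };
  });
}

// Same four documents as in the test collection, without the keys.
const docs = [
  { department: ["Sales", "Marketing"], region: ["US", "UK"] },
  { department: ["Sales", "Marketing"], region: ["US", "UK"] },
  { department: "Development", region: "Europe" },
  { department: "Sales" },
];

const result = countCombinations(docs);
console.log(JSON.stringify(result, null, 2));
```

    The groups come out in insertion order rather than sorted, whereas COLLECT normally returns them sorted, so compare the counts and not the ordering.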
    

    【Discussion】:

    • Thanks, good to know I was on the right track. Your version is much cleaner, though!