ArangoDB AQL recursive graph traversal: answers

【Question title】: ArangoDB AQL recursive graph traversal
【Posted】: 2016-10-31 23:24:49
【Question】:

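I have a graph with three collections whose items can be connected by edges. ItemA is the parent of itemB, which in turn is the parent of itemC. Elements can only be connected by directed edges of the form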

"_from : child, _to : parent"

Currently I can only get a "linear" result with this AQL query:

LET contains = (FOR v IN 1..? INBOUND 'collectionA/itemA' GRAPH 'myGraph' RETURN v)

     RETURN {
        "root": {
            "id": "ItemA",
            "contains": contains
       }
   }

The result looks like this:

"root": {
    "id": "itemA",
    "contains": [
        {
            "id": "itemB"
        },
        {
            "id": "itemC"
        }
    ]
}

But I need a "hierarchical" result of the graph traversal, like this:

"root": {
    "id": "itemA",
    "contains": [
        {
            "id": "itemB",
            "contains": [
                {
                    "id": "itemC"
                }
            ]
        }
    ]
}

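So, can I get this "hierarchical" result by running an AQL query?

One more thing: the traversal should run until it reaches the leaf nodes, so the traversal depth is not known in advance.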

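【Comments】:

Some related techniques that might help: use FOR v, e, p IN 1..3 INBOUND and return p. If you want more specificity, you can use p.vertices[0], p.vertices[1], p.vertices[2]. p is already in a hierarchical format, so you can build your return value from it to show the values you want.
Is the maximum nesting depth known? Or is it recursive, with no predictable depth?
Why does the result have to be hierarchical? Should duplicates be prevented in the result set?
@DavidThomas It is recursive, with no predictable depth.
@CoDEmanX, yes, it should (I think I should use the uniqueVertices : global option in the traversal for that)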
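
The first comment's idea, returning the path and working from p.vertices, can be sketched in plain JavaScript. This is only an illustration and not part of the original thread: pathsToTree and the sample paths array are hypothetical, and each path is assumed to be the list of vertex ids from the root to the visited vertex.

```javascript
// Hypothetical helper: merge root-to-vertex id paths into one nested tree.
function pathsToTree(rootId, paths) {
  const root = { id: rootId };
  for (const path of paths) {
    let node = root;
    // path[0] is the root itself, so start at the second entry.
    for (const id of path.slice(1)) {
      if (!node.contains) node.contains = [];
      let child = node.contains.find((c) => c.id === id);
      if (!child) {
        child = { id };
        node.contains.push(child);
      }
      node = child;
    }
  }
  return root;
}

// Assumed sample data: one path per visited vertex.
const paths = [
  ["itemA", "itemB"],
  ["itemA", "itemB", "itemC"],
];
const treeFromPaths = pathsToTree("itemA", paths);
console.log(JSON.stringify(treeFromPaths));
```

Because shared prefixes are merged, the order of the paths does not matter in this sketch; the cost is a linear search among siblings for every step.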

Tags: recursion arangodb graph-traversal aql

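【Solution 1】:

I found a solution. We decided to use UDFs (user-defined functions).

Here are the steps to build the proper hierarchy:

Register the function in ArangoDB.
Run your AQL query, which builds a flat structure (each vertex and the corresponding path to that vertex), and pass the result as the input parameter of the UDF. Here my function simply appends each element to its parent.

In my case: 1) Register the function in ArangoDB.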

db.createFunction(
        'GO::LOCATED_IN::APPENT_CHILD_STRUCTURE',
            String(function (root, flatStructure) {
                if (root && root.id) {
                    var elsById = {};
                    elsById[root.id] = root;

                    flatStructure.forEach(function (element) {
                        elsById[element.id] = element;
                        var parentElId = element.path[element.path.length - 2];
                        var parentEl = elsById[parentElId];

                        if (!parentEl.contains)
                            parentEl.contains = new Array();

                        parentEl.contains.push(element);
                        delete element.path;
                    });
                }
                return root;
            })
        );

2) Run the AQL query with the UDF:

    LET flatStructure = (FOR v,e,p IN 1..? INBOUND 'collectionA/itemA' GRAPH 'myGraph' 
       LET childPath = (FOR pv IN p.vertices RETURN pv.id_source)
    RETURN MERGE(v, {"path": childPath}))

    LET root = {"id": "ItemA"} 

    RETURN GO::LOCATED_IN::APPENT_CHILD_STRUCTURE(root, flatStructure)

Note: when implementing your own function, don't forget the naming convention.
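
The attachment step can be tried outside ArangoDB. The following sketch reuses the logic of the UDF above as an ordinary JavaScript function; the name appendChildStructure, the sample flat array, and the extra guard for a missing parent are assumptions made for illustration. It relies on the flat list arriving in traversal order, so that every parent appears before its children.

```javascript
// Same idea as the UDF above, written as a plain function (name is hypothetical).
function appendChildStructure(root, flatStructure) {
  if (!root || !root.id) return root;
  const elsById = { [root.id]: root };

  flatStructure.forEach((element) => {
    elsById[element.id] = element;
    // The second-to-last path entry is the direct parent of this element.
    const parentEl = elsById[element.path[element.path.length - 2]];
    delete element.path;
    if (!parentEl) return; // assumption: skip elements whose parent was not seen
    if (!parentEl.contains) parentEl.contains = [];
    parentEl.contains.push(element);
  });
  return root;
}

// Assumed shape of the flat query result: each vertex plus its path of ids.
const flat = [
  { id: "itemB", path: ["ItemA", "itemB"] },
  { id: "itemC", path: ["ItemA", "itemB", "itemC"] },
];
const nested = appendChildStructure({ id: "ItemA" }, flat);
console.log(JSON.stringify(nested));
```

Note that the ids stored in each path must match the id of the root and of the elements, otherwise the parent lookup finds nothing.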

【Discussion】:

Are user functions / Foxx endpoints still the recommended approach in 2019?

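【Solution 2】:

I also needed an answer to this question, so here is a working solution.

I'm sure the code will need to be customized for your case and could be improved; if you have improvements that apply to this sample answer, please comment accordingly.

The solution is to use a Foxx microservice that supports recursion and builds the tree. The problem I ran into was circular paths, so I implemented a maximum depth limit to stop them, hard-coded to 10 in the example below.

Create the Foxx microservice: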

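Create a new folder (e.g. recursive-tree)
Create the directory scripts inside it
Place the files manifest.json and index.js in the root directory
Place the file setup.js in the scripts directory
Then create a new zip file containing these three files (e.g. Foxx.zip)
Navigate to the ArangoDB admin console
Click Services | Add Service
Enter an appropriate mount point, e.g. /my/tree
Click the Zip tab
Drag in the Foxx.zip file you created; it should install without issues
If you get an error, make sure the collections myItems and myConnections do not exist and that no graph named myGraph exists, because the setup script will try to create them with sample data.
Then navigate to the ArangoDB admin console, Services | /my/tree
Click API
Expand /tree/{rootId}
Provide ItemA as the rootId parameter and click "Try It Out"
You should see the result for the provided root id.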

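If the rootId doesn't exist, it returns nothing.
If the rootId has no children, it returns an empty array for "contains".
If the rootId has circular "contains" values, it returns nesting up to the depth limit; I wish there were a cleaner way to stop this.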
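
One possible alternative to a fixed depth limit is to remember which vertices are already on the current branch and stop when one repeats. The sketch below shows that idea on an in-memory adjacency map rather than on ArangoDB; buildTree, edges, and the sample data are hypothetical, and in the Foxx service the child lookup would still be an AQL query.

```javascript
// Hypothetical in-memory stand-in for the edge collection: parent id -> child ids.
const edges = {
  ItemA: ["ItemB"],
  ItemB: ["ItemC", "ItemD"],
  ItemD: ["ItemE", "ItemA"], // ItemD -> ItemA closes a cycle
};

// Build the subtree below id; ancestors holds the ids on the current branch.
function buildTree(id, ancestors = new Set()) {
  const node = { _id: id, contains: [] };
  const branch = new Set(ancestors).add(id);
  for (const childId of edges[id] || []) {
    // Skip children already on this branch so a cycle cannot recurse forever.
    if (branch.has(childId)) continue;
    node.contains.push(buildTree(childId, branch));
  }
  return node;
}

const cycleSafeTree = buildTree("ItemA");
console.log(JSON.stringify(cycleSafeTree, null, 2));
```

Tracking only the current branch, rather than every vertex seen so far, still lets a vertex appear under two different parents while cutting genuine cycles.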

Here are the three files. setup.js (goes in the scripts folder):

'use strict';
const db = require('@arangodb').db;
const graph_module = require("@arangodb/general-graph");

const itemCollectionName = 'myItems';
const edgeCollectionName = 'myConnections';
const graphName = 'myGraph';

if (!db._collection(itemCollectionName)) {
  const itemCollection = db._createDocumentCollection(itemCollectionName);
  itemCollection.save({_key: "ItemA" });
  itemCollection.save({_key: "ItemB" });
  itemCollection.save({_key: "ItemC" });
  itemCollection.save({_key: "ItemD" });
  itemCollection.save({_key: "ItemE" });

  if (!db._collection(edgeCollectionName)) {
    const edgeCollection = db._createEdgeCollection(edgeCollectionName);
    edgeCollection.save({_from: itemCollectionName + '/ItemA', _to: itemCollectionName + '/ItemB'});
    edgeCollection.save({_from: itemCollectionName + '/ItemB', _to: itemCollectionName + '/ItemC'});
    edgeCollection.save({_from: itemCollectionName + '/ItemB', _to: itemCollectionName + '/ItemD'});
    edgeCollection.save({_from: itemCollectionName + '/ItemD', _to: itemCollectionName + '/ItemE'});
  }

  const graphDefinition = [ 
    { 
      "collection": edgeCollectionName, 
      "from":[itemCollectionName], 
      "to":[itemCollectionName]
    }
  ];

  const graph = graph_module._create(graphName, graphDefinition);
}

manifest.json (goes in the root folder):

{
  "engines": {
    "arangodb": "^3.0.0"
  },
  "main": "index.js",
  "scripts": {
    "setup": "scripts/setup.js"
  }
}

index.js (goes in the root folder):

'use strict';
const createRouter = require('@arangodb/foxx/router');
const router = createRouter();
const joi = require('joi');

const db = require('@arangodb').db;
const aql = require('@arangodb').aql;

// Recursively builds the subtree below itemId. The depth is capped at 10
// so that circular paths cannot recurse forever.
const recursionQuery = function(itemId, tree, depth) {
  const result = db._query(aql`
    FOR d IN myItems
    FILTER d._id == ${itemId}
    LET contains = (
      FOR c IN 1..1 OUTBOUND ${itemId} GRAPH 'myGraph' RETURN { "_id": c._id }
    )
    RETURN MERGE({"_id": d._id}, {"contains": contains})
  `);

  // The node is undefined when itemId does not exist.
  tree = result.toArray()[0];

  if (depth < 10 && tree && tree.contains) {
    for (let i = 0; i < tree.contains.length; i++) {
      tree.contains[i] = recursionQuery(tree.contains[i]._id, tree.contains[i], depth + 1);
    }
  }
  return tree;
}

router.get('/tree/:rootId', function(req, res) {
  let myResult = recursionQuery('myItems/' + req.pathParams.rootId, {}, 0);
  res.send(myResult);
})
  .response(joi.object().required(), 'Tree of child nodes.')
  .summary('Tree of child nodes')
  .description('Tree of child nodes underneath the provided node.');

module.context.use(router);

Now you can call the Foxx microservice API endpoint with a rootId and it will return the complete tree. It is very fast.

Sample output for ItemA:

{
  "_id": "myItems/ItemA",
  "contains": [
    {
      "_id": "myItems/ItemB",
      "contains": [
        {
          "_id": "myItems/ItemC",
          "contains": []
        },
        {
          "_id": "myItems/ItemD",
          "contains": [
            {
              "_id": "myItems/ItemE",
              "contains": []
            }
          ]
        }
      ]
    }
  ]
}

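You can see that ItemB contains two children, ItemC and ItemD, and that ItemD in turn contains ItemE.

I can't wait for ArangoDB AQL to improve the handling of variable-depth paths in FOR v, e, p IN 1..100 OUTBOUND 'abc/def' GRAPH 'someGraph' style queries. Custom visitors are not recommended in 3.x, but they haven't really been replaced with something as powerful for wildcard queries on the depth of a vertex in a path, or for prune- or exclude-style commands during path traversal.

I would welcome comments/feedback if this can be simplified.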

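【Discussion】:

AQL does support variable-depth paths - you can retrieve any depth level with path.vertices[n] or path.edges[n], where n is the depth.
Yes, it does, but unfortunately you have to specify n; you can't use a wildcard. For example, say you want to query all outbound paths from a node at depth 1..10, and then do something special if one of the edges in the path has a particular attribute, or a vertex in the path has a particular value. You end up writing code that specifies n ten times, with one huge IF statement. I wish it were as simple as IF path.vertices[*].myKey == 'trigger', handled dynamically at every possible depth of that graph query. It would also be good to be able to cancel processing of a path once the trigger is met.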
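
The behaviour wished for here, testing a condition against every vertex of a path at any depth and abandoning the path once it matches, can be illustrated on in-memory data. This sketch is not ArangoDB code: walk, graph, vertices, and the myKey values are assumptions made for the example.

```javascript
// Hypothetical sample data: vertex documents and outbound adjacency.
const vertices = {
  a: { myKey: "plain" },
  b: { myKey: "trigger" },
  c: { myKey: "plain" },
  d: { myKey: "plain" },
};
const graph = { a: ["b", "d"], b: ["c"], c: [], d: [] };

// Depth-first walk that returns every path up to maxDepth edges long.
// When prune(path) is true the path is still returned, but not extended.
function walk(start, maxDepth, prune) {
  const found = [];
  const visit = (path) => {
    if (path.length > 1) found.push(path);
    if (path.length - 1 >= maxDepth || prune(path)) return;
    for (const next of graph[path[path.length - 1]] || []) {
      visit([...path, next]);
    }
  };
  visit([start]);
  return found;
}

// The predicate covers the whole path, whatever its depth.
const hasTrigger = (path) => path.some((id) => vertices[id].myKey === "trigger");
const walked = walk("a", 10, hasTrigger);
console.log(walked.map((p) => p.join(">")));
```

With this sample data the walk stops at b, so the path through c is never explored, while the untriggered branch to d is followed normally.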