【Posted】: 2019-01-21 21:13:59
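【Question】:
Please see the picture of the CSV file. I am using Cypher with Neo4j. As you can see, each timestamped activity belongs to a case_id. Many activities share the same case_id (here you see case_id 3, 2 and 1, but imagine there are many more). I want to group the activities that belong to the same case ID and run the same query on each group (the grouping is essential).
Is there a way to do this other than rewriting the same query for each group, as done here in three steps?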
1.
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:///XY" AS row
WITH toInteger(row.case_id) AS cid, row
WHERE cid=3
CREATE (act:Activity {caseId: cid, activityName: row.activity, time: row.timestamp})
'QUERY'
2.
LOAD CSV WITH HEADERS FROM "file:///XY" AS row
WITH toInteger(row.case_id) AS cid, row
WHERE cid=2
CREATE (act:Activity {caseId: cid, activityName: row.activity, time: row.timestamp})
'QUERY'
3.
LOAD CSV WITH HEADERS FROM "file:///XY" AS row
WITH toInteger(row.case_id) AS cid, row
WHERE cid=1
CREATE (act:Activity {caseId: cid, activityName: row.activity, time: row.timestamp})
'QUERY'
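So basically I want to generalize WHERE cid=3 (or 2 or 1) and iterate over all distinct case IDs without naming them explicitly. Something like a for-each in Java: for each element in array (array content: group by case_id) do QUERY.
Any idea how to do this?
Thanks in advance. If this sounds too cryptic, I am happy to provide a better description.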
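A possible starting point, sketched under the assumption that the three separate loads exist only to filter by case ID: dropping the WHERE clause loads every row in a single pass, and each Activity node still carries its caseId for later grouping. The path "file:///XY" and the property names are taken unchanged from the statements above.

```cypher
// Load all rows once; no per-case WHERE filter is applied at import time.
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:///XY" AS row
WITH toInteger(row.case_id) AS cid, row
// Every node keeps its case ID, so grouping can happen at query time.
CREATE (act:Activity {caseId: cid, activityName: row.activity, time: row.timestamp})
```

With this approach the grouping by case is left to the follow-up query instead of the import.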
Update: here is the query:
MATCH (act:Activity)
WHERE act.caseId = 1 // and here I want to be able to simplify for EVERY caseId
WITH act ORDER BY act.time ASC
WITH apoc.coll.frequencies(apoc.coll.pairsMin(COLLECT(act.activityName))) AS g
UNWIND g AS p
RETURN *
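One way this might be generalized to every caseId, as a sketch rather than a confirmed solution: Cypher groups implicitly by the non-aggregated expressions in a WITH clause, so collecting the activity names alongside act.caseId yields one list per case. The sketch assumes APOC is installed, that the stored time values sort correctly as they are, and that collect() keeps the order set by the preceding ORDER BY. The aliases caseId, activities, pair and frequency are illustrative names, not part of the original query.

```cypher
MATCH (act:Activity)
// Sort by case first, then by time within each case.
WITH act ORDER BY act.caseId ASC, act.time ASC
// act.caseId is the implicit grouping key: one ordered list per case.
WITH act.caseId AS caseId, collect(act.activityName) AS activities
// Count consecutive activity pairs separately for each case.
WITH caseId, apoc.coll.frequencies(apoc.coll.pairsMin(activities)) AS g
UNWIND g AS p
RETURN caseId, p.item AS pair, p.count AS frequency
ORDER BY caseId
```

Each result row then carries the case it belongs to, so no per-case WHERE clause has to be written out.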
【Comments】:
Tags: loops csv filter neo4j cypher