【发布时间】:2023-08-20 03:04:01
【问题描述】:
总结
尝试了解OrientDB ETL配置json文件。
假设是一个 CSV 文件,其中:
- 每一行都是一个顶点
- “类”列给出了顶点的预期类
- 顶点有多个类(Foo、Bar、Baz)
如何将顶点的类设置为“类”列的值?
排除故障的努力
我在 OrientDB ETL 文档中花了很多时间试图解决这个问题。我尝试了let 和block 和code 组件的许多不同组合。我试过像className和$className和${classname}这样的变量名。
当前结果:
-
code组件能够正确打印 `className' 的值,所以我知道它设置正确。 -
vertex组件未正确引用变量,因此将每个顶点的类设置为null。
上下文
我在 localhost 上有一个新创建的数据库 (PLOCAL GRAPH),名为“deleteme”。
我有一个如下所示的顶点 CSV 文件 (nodes.csv):
id,name,class
1,Jack,Foo
2,Jill,Bar
3,Gephri,Baz
还有一个如下所示的 ETL 配置文件 (test.json):
{
"config": {
"log": "DEBUG"
},
"source": {"file": {"path": "nodes.csv"}},
"extractor": {"csv": {}},
"transformers": [
{"block": {"let": {"name": "$className",
"value": "$input.class"}}},
{"code": {"language": "Javascript",
"code": "print(className + '\\n'); input;"}},
{"vertex": {"class": "$className"}}
],
"loader": {
"orientdb": {
"dbURL": "remote:localhost:2424/deleteme",
"dbUser": "admin",
"dbPassword": "admin",
"dbType": "graph",
"tx": false,
"wal": false,
"batchCommit": 1000,
"classes": [
{"name": "Foo", "extends": "V"},
{"name": "Bar", "extends": "V"},
{"name": "Baz", "extends": "V"}
]
}
}
}
当我运行 ETL 作业时,我的输出如下所示:
aj@host:~/bin/orientdb-community-2.1.13/bin$ ./oetl.sh test.json
OrientDB etl v.2.1.13 (build 2.1.x@r9bc1a54a4a62c4de555fc5360357f446f8d2bc84; 2016-03-14 17:00:05+0000) www.orientdb.com
BEGIN ETL PROCESSOR
[file] INFO Reading from file nodes.csv with encoding UTF-8
[orientdb] DEBUG - OrientDBLoader: created vertex class 'Foo' extends 'V'
[orientdb] DEBUG orientdb: found 0 vertices in class 'null'
+ extracted 0 rows (0 rows/sec) - 0 rows -> loaded 0 vertices (0 vertices/sec) Total time: 1001ms [0 warnings, 0 errors]
[orientdb] DEBUG - OrientDBLoader: created vertex class 'Bar' extends 'V'
[orientdb] DEBUG orientdb: found 0 vertices in class 'null'
[orientdb] DEBUG - OrientDBLoader: created vertex class 'Baz' extends 'V'
[orientdb] DEBUG orientdb: found 0 vertices in class 'null'
[csv] DEBUG document={id:1,class:Foo,name:Jack}
[1:block] DEBUG Transformer input: {id:1,class:Foo,name:Jack}
[1:block] DEBUG Transformer output: {id:1,class:Foo,name:Jack}
[1:code] DEBUG Transformer input: {id:1,class:Foo,name:Jack}
Foo
[1:code] DEBUG executed code=OCommandExecutorScript [text=print(className); input;], result={id:1,class:Foo,name:Jack}
[1:code] DEBUG Transformer output: {id:1,class:Foo,name:Jack}
[1:vertex] DEBUG Transformer input: {id:1,class:Foo,name:Jack}
[1:vertex] DEBUG Transformer output: v(null)[#3:0]
[csv] DEBUG document={id:2,class:Bar,name:Jill}
[2:block] DEBUG Transformer input: {id:2,class:Bar,name:Jill}
[2:block] DEBUG Transformer output: {id:2,class:Bar,name:Jill}
[2:code] DEBUG Transformer input: {id:2,class:Bar,name:Jill}
Bar
[2:code] DEBUG executed code=OCommandExecutorScript [text=print(className); input;], result={id:2,class:Bar,name:Jill}
[2:code] DEBUG Transformer output: {id:2,class:Bar,name:Jill}
[2:vertex] DEBUG Transformer input: {id:2,class:Bar,name:Jill}
[2:vertex] DEBUG Transformer output: v(null)[#3:1]
[csv] DEBUG document={id:3,class:Baz,name:Gephri}
[3:block] DEBUG Transformer input: {id:3,class:Baz,name:Gephri}
[3:block] DEBUG Transformer output: {id:3,class:Baz,name:Gephri}
[3:code] DEBUG Transformer input: {id:3,class:Baz,name:Gephri}
Baz
[3:code] DEBUG executed code=OCommandExecutorScript [text=print(className); input;], result={id:3,class:Baz,name:Gephri}
[3:code] DEBUG Transformer output: {id:3,class:Baz,name:Gephri}
[3:vertex] DEBUG Transformer input: {id:3,class:Baz,name:Gephri}
[3:vertex] DEBUG Transformer output: v(null)[#3:2]
END ETL PROCESSOR
+ extracted 3 rows (4 rows/sec) - 3 rows -> loaded 3 vertices (4 vertices/sec) Total time: 1684ms [0 warnings, 0 errors]
哦,那DEBUG orientdb: found 0 vertices in class 'null' 是什么意思?
【问题讨论】:
标签: orientdb