[Posted]: 2020-09-12 13:44:38
[Question]:
I have data that was extracted from an Excel file into a JSON structure using Java, Apache POI, and Jackson. The resulting JSON structure looks like this:
{
"fileName" : "C:\\Users\\jgagnon\\sample_data\\PDM_BOM.xlsx",
"sheets" : [ {
"name" : "PDM_BOM",
"data" : [ [ "BRANCH", "PARENT ITEM NUMBER", "2ND ITEM NUMBER", "QUANTITY REQUIRED", "UNIT OF MEASURE", "ISSUE TYPE CODE", "LINE TYPE", "STOCK TYPE", "TYPE BOM", "LINE NUMBER", "OPERATING SEQUENCE", "EFFECTIVE FROM DATE", "EFFECTIVE THRU DATE", "DRAWING NUMBER", "UNIT COST", "SCRAP PERCENT" ],
[ "B20", "208E8840040", "5P884LPFSR2", 0.32031, "LB", "I", "S", "M", "M", 1.0, 10.0, "09/11/13", "12/31/40", null, 0.0, 0.0 ],
[ "B20", "208E8840168", "5P884LPFSR2", 1.36, "LB", "I", "S", "M", "M", 1.0, 10.0, "02/26/08", "12/31/40", null, 0.0, 0.0 ],
[ "B20", "208E8840172", "5P884LPFSR2", 1.3924, "LB", "I", "S", "M", "M", 1.0, 10.0, "02/26/08", "12/31/40", null, 0.0, 0.0 ],
[ "B20", "208E8840180", "5P884LPFSR2", 1.4565, "LB", "I", "S", "P", "M", 1.0, 10.0, "03/04/09", "12/31/40", null, 0.0, 0.0 ],
[ "B20", "21PPH150166", "8P315TPMRG", 1.39629, "LB", "I", "S", "M", "M", 1.0, 10.0, "03/05/18", "12/31/40", null, 0.0, 0.0 ] ],
"maxCols" : 16,
"maxRows" : 14996
} ]
}
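The data element is essentially an array of arrays representing all the rows in the sheet. The first row array holds the column headers for the data rows that follow.
I would like to restructure data so that it takes the form of a map of objects, where each key (for this example) is the value in the PARENT ITEM NUMBER column. The value mapped to each key would be a JSON object made up of key/value pairs of column header and column value for the current data row.
So, using the example above, I would end up with something like this (my JSON syntax/structure may not be correct):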
{
"data": {
"208E8840040": {
"BRANCH": "B20",
"PARENT ITEM NUMBER": "208E8840040",
"2ND ITEM NUMBER": "5P884LPFSR2",
"QUANTITY REQUIRED": 0.32031,
"UNIT OF MEASURE": "LB",
"ISSUE TYPE CODE": "I",
"LINE TYPE": "S",
"STOCK TYPE": "M",
"TYPE BOM": "M",
"LINE NUMBER": 1.0,
"OPERATING SEQUENCE": 10.0,
"EFFECTIVE FROM DATE": "09/11/13",
"EFFECTIVE THRU DATE": "12/31/40",
"DRAWING NUMBER": null,
"UNIT COST": 0.0,
"SCRAP PERCENT": 0.0
},
"208E8840168": {
"BRANCH": "B20",
"PARENT ITEM NUMBER": "208E8840168",
"2ND ITEM NUMBER": "5P884LPFSR2",
"QUANTITY REQUIRED": 1.36,
...
},
...
}
}
I am looking for a way to convert the former into the latter.
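A minimal sketch of that conversion, using only java.util collections. It assumes the data array has already been read into a List<List<Object>> (Jackson can bind a JSON array of arrays to that type, for example); the class name RowsToKeyedMap and the method keyByColumn are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RowsToKeyedMap {

    // Turns a header row plus data rows into a map keyed by the value of keyColumn.
    // Each value is a map of column header -> cell value for that row.
    static Map<String, Map<String, Object>> keyByColumn(List<List<Object>> data, String keyColumn) {
        List<Object> headers = data.get(0);
        int keyIndex = headers.indexOf(keyColumn);
        if (keyIndex < 0) {
            throw new IllegalArgumentException("No such column: " + keyColumn);
        }
        Map<String, Map<String, Object>> result = new LinkedHashMap<>();
        for (List<Object> row : data.subList(1, data.size())) {
            Map<String, Object> rowMap = new LinkedHashMap<>();
            for (int i = 0; i < headers.size(); i++) {
                // Cells may be null (e.g. DRAWING NUMBER); LinkedHashMap accepts null values.
                rowMap.put(String.valueOf(headers.get(i)), i < row.size() ? row.get(i) : null);
            }
            // put() overwrites: a later row with the same key replaces the earlier one.
            result.put(String.valueOf(row.get(keyIndex)), rowMap);
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Object>> data = Arrays.asList(
            Arrays.asList("BRANCH", "PARENT ITEM NUMBER", "2ND ITEM NUMBER", "QUANTITY REQUIRED"),
            Arrays.asList("B20", "208E8840040", "5P884LPFSR2", 0.32031),
            Arrays.asList("B20", "208E8840168", "5P884LPFSR2", 1.36));
        System.out.println(keyByColumn(data, "PARENT ITEM NUMBER"));
    }
}
```

Note that this shape only holds one row per key, so it fits the sample only as long as the key column is unique.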
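Update:
I just realized that I left out an important detail.
The data in this table (sheet) is essentially keyed by the PARENT ITEM NUMBER column. However, although that column is the primary identifier, it is not always unique within the table.
In many cases there are multiple rows with the same PARENT ITEM NUMBER value. Each of those rows holds information about an element that makes up the "parent item" (think of them as child items). The child items are identified by the 2ND ITEM NUMBER column.
In addition, many of the child items have rows of their own in the table, where the PARENT ITEM NUMBER is the number of the child item that was identified by 2ND ITEM NUMBER under its parent. As you may have guessed, those child items can have child items of their own, and so on.
Essentially, this is a tabular representation of multiple hierarchies of related data. Some child items appear under (are reused by) more than one parent.
I don't know whether this complicates what I am trying to do.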
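One way to picture the hierarchy described above is as an adjacency map from each parent item number to its direct child item numbers, which can then be walked recursively. The following is only a sketch under that assumption: the class BomHierarchy, its methods, and the item numbers A to D are all hypothetical, and each input row is reduced to a (PARENT ITEM NUMBER, 2ND ITEM NUMBER) pair.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class BomHierarchy {

    // Builds parent -> direct children from (parent, child) pairs, keeping row order.
    static Map<String, List<String>> childrenByParent(List<String[]> parentChildRows) {
        Map<String, List<String>> children = new LinkedHashMap<>();
        for (String[] row : parentChildRows) {
            children.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
        }
        return children;
    }

    // Collects every item reachable below root, depth first.
    static Set<String> descendants(String root, Map<String, List<String>> children) {
        Set<String> seen = new LinkedHashSet<>();
        collect(root, children, seen);
        return seen;
    }

    private static void collect(String item, Map<String, List<String>> children, Set<String> seen) {
        for (String child : children.getOrDefault(item, Collections.emptyList())) {
            // add() returns false for an item already visited, which also stops cycles.
            if (seen.add(child)) {
                collect(child, children, seen);
            }
        }
    }

    public static void main(String[] args) {
        // Hypothetical item numbers, not taken from the spreadsheet.
        List<String[]> rows = Arrays.asList(
            new String[] {"A", "B"},
            new String[] {"A", "C"},
            new String[] {"B", "D"},
            new String[] {"C", "D"});
        Map<String, List<String>> children = childrenByParent(rows);
        System.out.println(descendants("A", children)); // prints [B, D, C]
    }
}
```

Here D is reused by both B and C, as the question describes for shared child items; the visited set means it is reported once.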
Update:
Thanks to https://stackoverflow.com/users/51591/micha%c5%82-ziober for the initial idea. I tweaked it slightly to produce a map of lists of child items. The modified code follows:
public String convertToJson(File jsonFile) throws IOException {
    // readDataArray and convertArrayToMaps are helper methods that are not shown here.
    ArrayNode arrayNode = readDataArray(jsonFile);
    List<Map<String, JsonNode>> rowMaps = convertArrayToMaps(arrayNode);
    // Group the row maps by parent item number; each key maps to the list of its child rows.
    Map<Object, List<Map<String, JsonNode>>> dataMap = rowMaps.stream()
        .collect(Collectors.groupingBy(map -> map.get("PARENT ITEM NUMBER").textValue()));
    return jsonMapper.writeValueAsString(Collections.singletonMap("data", dataMap));
}
Here is a sample of the output:
{
"data" : {
"MTDMN97PJ1A9" : [ { <- 1 child
"BRANCH" : "B70",
"PARENT ITEM NUMBER" : "MTDMN97PJ1A9",
"2ND ITEM NUMBER" : "MTDMN970144XO",
"QUANTITY REQUIRED" : 12.0,
"UNIT OF MEASURE" : "SY",
"ISSUE TYPE CODE" : "I",
"LINE TYPE" : "S",
"STOCK TYPE" : "M",
"TYPE BOM" : "M",
"LINE NUMBER" : 1.0,
"OPERATING SEQUENCE" : 10.0,
"EFFECTIVE FROM DATE" : "01/18/19",
"EFFECTIVE THRU DATE" : "12/31/40",
"DRAWING NUMBER" : null,
"UNIT COST" : 0.0,
"SCRAP PERCENT" : 0.0
} ],
"ZCP723A1152" : [ { <- 4 children
"BRANCH" : "B70",
"PARENT ITEM NUMBER" : "ZCP723A1152",
"2ND ITEM NUMBER" : "5P587UMFSD2",
"QUANTITY REQUIRED" : 2.32222,
"UNIT OF MEASURE" : "LB",
"ISSUE TYPE CODE" : "I",
"LINE TYPE" : "S",
"STOCK TYPE" : "M",
"TYPE BOM" : "M",
"LINE NUMBER" : 3.0,
"OPERATING SEQUENCE" : 10.0,
"EFFECTIVE FROM DATE" : "05/15/17",
"EFFECTIVE THRU DATE" : "12/31/40",
"DRAWING NUMBER" : null,
"UNIT COST" : 0.0,
"SCRAP PERCENT" : 0.0
}, {
"BRANCH" : "B70",
"PARENT ITEM NUMBER" : "ZCP723A1152",
"2ND ITEM NUMBER" : "8P550ZPPOOLE",
"QUANTITY REQUIRED" : 2.32222,
"UNIT OF MEASURE" : "LB",
"ISSUE TYPE CODE" : "I",
"LINE TYPE" : "S",
"STOCK TYPE" : "M",
"TYPE BOM" : "M",
"LINE NUMBER" : 1.0,
"OPERATING SEQUENCE" : 10.0,
"EFFECTIVE FROM DATE" : "05/15/17",
"EFFECTIVE THRU DATE" : "12/31/40",
"DRAWING NUMBER" : null,
"UNIT COST" : 0.0,
"SCRAP PERCENT" : 0.0
}, {
"BRANCH" : "B70",
"PARENT ITEM NUMBER" : "ZCP723A1152",
"2ND ITEM NUMBER" : "8P906WPPA3077",
"QUANTITY REQUIRED" : 4.64444,
"UNIT OF MEASURE" : "LB",
"ISSUE TYPE CODE" : "I",
"LINE TYPE" : "S",
"STOCK TYPE" : "M",
"TYPE BOM" : "M",
"LINE NUMBER" : 2.0,
"OPERATING SEQUENCE" : 10.0,
"EFFECTIVE FROM DATE" : "05/15/17",
"EFFECTIVE THRU DATE" : "12/31/40",
"DRAWING NUMBER" : null,
"UNIT COST" : 0.0,
"SCRAP PERCENT" : 0.0
}, {
"BRANCH" : "B70",
"PARENT ITEM NUMBER" : "ZCP723A1152",
"2ND ITEM NUMBER" : "8U910LKSHBL",
"QUANTITY REQUIRED" : 2.32222,
"UNIT OF MEASURE" : "LB",
"ISSUE TYPE CODE" : "I",
"LINE TYPE" : "S",
"STOCK TYPE" : "M",
"TYPE BOM" : "M",
"LINE NUMBER" : 4.01,
"OPERATING SEQUENCE" : 10.0,
"EFFECTIVE FROM DATE" : "12/13/17",
"EFFECTIVE THRU DATE" : "12/31/40",
"DRAWING NUMBER" : null,
"UNIT COST" : 0.0,
"SCRAP PERCENT" : 0.0
} ],
... many more entries
}
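The single-argument Collectors.groupingBy makes no guarantee about the order of the resulting map's keys. If the parents should appear in the order they are first seen in the sheet, the three-argument overload with a LinkedHashMap supplier is one option. This is a sketch on plain Map<String, Object> rows rather than JsonNode values; the class GroupRowsByParent and its helper row are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class GroupRowsByParent {

    // Groups row maps by their PARENT ITEM NUMBER value, keeping first-seen key order.
    static Map<String, List<Map<String, Object>>> groupByParent(List<Map<String, Object>> rows) {
        return rows.stream().collect(Collectors.groupingBy(
            row -> String.valueOf(row.get("PARENT ITEM NUMBER")),
            LinkedHashMap::new, // supplies an insertion-ordered result map
            Collectors.toList()));
    }

    // Hypothetical helper that builds a two-column row for the demo.
    static Map<String, Object> row(String parent, String child) {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put("PARENT ITEM NUMBER", parent);
        m.put("2ND ITEM NUMBER", child);
        return m;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = Arrays.asList(
            row("ZCP723A1152", "5P587UMFSD2"),
            row("MTDMN97PJ1A9", "MTDMN970144XO"),
            row("ZCP723A1152", "8P550ZPPOOLE"));
        groupByParent(rows).forEach((parent, children) ->
            System.out.println(parent + " -> " + children.size() + " child row(s)"));
    }
}
```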
[Discussion]: