【Question title】: jq: Group by nested structures and flatten JSON
【Posted】: 2018-12-05 10:42:44
【Question】:

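I'm new to jq and to command-line tools in general, but I need to group by a nested structure in a JSON file and flatten the nested structures, and after several days I still haven't found a workable solution. Here is a sample of my JSON: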

[
  {
    "Value1": "0",
    "Conversions": "0",
    "Revenue": "0.00",
    "serverTimestamp": 84615198,
    "pluginsIcons": [
      {
        "pluginName": "pdf",
        "pluginIcon": "pdf1"
      },
      {
        "pluginName": "java",
        "pluginIcon": "java1"
      }
    ],
    "plugins": "pdf, java",
    "customVariables": {
      "3": {
        "customVariableValue3": "F",
        "customVariableName3": "Gender"
      },
      "2": {
        "customVariableValue2": "Person",
        "customVariableName2": "Role"
      },
      "1": {
        "customVariableValue1": "Partner1",
        "customVariableName1": "Partner"
      }
    },
    "interactions": "7",
    "actions": "3",
    "actionDetails": [
      {
        "timestamp": 84615195,
        "interactionPosition": "1",
        "type": "action"
      },
      {
        "timestamp": 84615145,
        "interactionPosition": "2",
        "type": "action"
      },
      {
        "timestamp": 84615693,
        "interactionPosition": "3",
        "type": "action",
        "customVariables": {
          "2": {
            "customVariablePageValue2": "value2",
            "customVariablePageName2": "name2"
          },
          "1": {
            "customVariablePageValue1": "value1",
            "customVariablePageName1": "name1"
          }
        }
      }
    ],
    "operatingSystem": "Windows 10"
  },
  {
    "Value1": "18",
    "Conversions": "1",
    "Revenue": "0.00",
    "serverTimestamp": 84615189,
    "pluginsIcons": [
      {
        "pluginName": "pdf",
        "pluginIcon": "pdf1"
      }
    ],
    "plugins": "pdf",
    "customVariables": {
      "3": {
        "customVariableValue3": "M",
        "customVariableName3": "Gender"
      },
      "2": {
        "customVariableValue2": "Admin",
        "customVariableName2": "Role"
      },
      "1": {
        "customVariableValue1": "Place",
        "customVariableName1": "Subdomain"
      }
    },
    "interactions": "6",
    "actions": "3",
    "actionDetails": [
      {
        "timestamp": 84635189,
        "timeSpent": "11",
        "interactionPosition": "1",
        "type": "action"
      },
      {
        "timestamp": 846351834,
        "timeSpent": "11",
        "interactionPosition": "2",
        "type": "search"
      },
      {
        "timestamp": 846351832,
        "timeSpent": "1",
        "interactionPosition": "3",
        "type": "action",
        "customVariables": {
          "2": {
            "customVariablePageValue2": "value2",
            "customVariablePageName2": "name2"
          },
          "1": {
            "customVariablePageValue3": "value3",
            "customVariablePageName3": "name3"
          }
        },
        "generationTime": "890"
      }
    ],
    "operatingSystem": "Windows 10"
  }
]

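The end result should contain one flat entry for each "action" in the nested array under "actionDetails".

I have been able to flatten the structure, but the grouping (and duplicating the other columns for each action) then becomes complicated. Grouping by "action" before flattening doesn't work for me because the actions are nested.

Here is what the output for the first entry of the original JSON should look like: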

[
  {
    "timestamp": 84615195,
    "interactionPosition": "1",
    "type": "action",
    "Value1": "0",
    "Conversions": "0",
    "Revenue": "0.00",
    "pluginName1": "pdf",
    "pluginIcon1": "pdf",
    "pluginName2": "java",
    "pluginIcon2": "java",
    "plugins": "pdf, java",
    "Gender": "F",
    "Role": "Person",
    "Partner": "Partner1",
    "interactions": "7",
    "actions": "3",
    "operatingSystem": "Windows 10"
  },
  {
    "timestamp": 84615145,
    "interactionPosition": "2",
    "type": "action",
    "Value1": "0",
    "Conversions": "0",
    "Revenue": "0.00",
    "pluginName1": "pdf",
    "pluginIcon1": "pdf",
    "pluginName2": "java",
    "pluginIcon2": "java",
    "plugins": "pdf, java",
    "Gender": "F",
    "Role": "Person",
    "Partner": "Partner1",
    "interactions": "7",
    "actions": "3",
    "operatingSystem": "Windows 10"
  },
  {
    "timestamp": 84615693,
    "interactionPosition": "3",
    "type": "action",
    "Value1": "0",
    "Conversions": "0",
    "Revenue": "0.00",
    "pluginName1": "pdf",
    "pluginIcon1": "pdf",
    "pluginName2": "java",
    "pluginIcon2": "java",
    "plugins": "pdf, java",
    "Gender": "F",
    "Role": "Person",
    "Partner": "Partner1",
    "interactions": "7",
    "actions": "3",
    "operatingSystem": "Windows 10",
    "name1": "value1",
    "name2": "value2"
  }
]

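As you may notice above, some of the flattened key names have been replaced by their associated values (from within the same nested structure). This isn't strictly necessary, but it would be a nice bonus. Also worth noting: my JSON is large (800MB), which I would also like to handle, but I think that point is best raised in a separate question.

Thanks in advance for any help or advice!

【Comments】: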

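  • You could use something like this: map(. as $parent | .actionDetails | map({ timestamp, interactionPosition, type, Value1:$parent.Value1, Conversions:$parent.Conversions }))
  • @peak I'm not sure I fully understand what you mean. I left out the brackets around the code snippet, but assumed they were implied. Otherwise, the format matches that of the flattened JSON.
  • Your edit resolved the issue, so I have deleted my comment. For future reference, if you haven't read the relevant guideline yet, see minimal reproducible example.
  • @Aaron That was actually an easy way to start... Given the number of columns, the command became relatively absurd, but it ultimately worked and handled the grouping aspect better than any other solution I had tried. Thanks!
  • If there is any way to edit @Aaron's command so that it includes all the $parent fields without specifying them individually, that would help me a lot.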
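
One way to carry over every parent field without listing them, sketched here as an assumption rather than as @Aaron's own command, is to bind the parent with the nested array removed and add it to each action. Only the field name .actionDetails is taken from the sample data.

```jq
# Sketch: copy all top-level fields of each entry onto each of its actions.
map(
  del(.actionDetails) as $parent   # every parent field except the nested array
  | .actionDetails[]               # emit one result per nested action
  | . + $parent                    # merge; on a key clash the parent's value wins
)
```

Because map(f) collects every output of f, the result is a single flat array rather than an array of arrays. Note that an action with its own "customVariables" key has it overwritten by the parent's here, since the right-hand operand of + takes precedence.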

Tags: json jq


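【Solution 1】:

The following answer does not address all the requirements you mention, but it will hopefully help you get past the main hurdle you are evidently facing.

Since your requirements regarding "customVariables" are not clear to me, I will ignore .customVariables entirely, and I hope you will be able to deal with .pluginsIcons yourself once you are past the main hurdle. So, for clarity, I will simply delete these keys.

As I understand it, you want to do some grouping after flattening based on .actionDetails. Those requirements are also unclear to me, so let's focus on the flattening: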

.[]
| .actionDetails[] + (del(.actionDetails) | del(.customVariables) | del(.pluginsIcons))

This produces a stream of JSON objects, the first two of which are:

{
  "timestamp": 84615195,
  "interactionPosition": "1",
  "type": "action",
  "Value1": "0",
  "Conversions": "0",
  "Revenue": "0.00",
  "serverTimestamp": 84615198,
  "plugins": "pdf, java",
  "interactions": "7",
  "actions": "3",
  "operatingSystem": "Windows 10"
}
{
  "timestamp": 84615145,
  "interactionPosition": "2",
  "type": "action",
  "Value1": "0",
  "Conversions": "0",
  "Revenue": "0.00",
  "serverTimestamp": 84615198,
  "plugins": "pdf, java",
  "interactions": "7",
  "actions": "3",
  "operatingSystem": "Windows 10"
}

This closely resembles the expected output you showed, so hopefully you can take it from here.

【Comments】:
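
If the numbered pluginName1/pluginIcon1 keys from the question are wanted as well, the flattening could be extended roughly as follows. This is only a sketch: numbered_plugins is a hypothetical helper name, and it assumes .pluginsIcons is an array of small objects as in the sample. The outer [ ... ] collects the stream into one array.

```jq
# Hypothetical helper: turns [{"pluginName":"pdf","pluginIcon":"pdf1"}, ...]
# into {"pluginName1":"pdf","pluginIcon1":"pdf1", ...} by appending the
# 1-based position of each element to its key names.
def numbered_plugins:
  . as $icons
  | reduce range(0; length) as $i ({};
      . + ($icons[$i] | with_entries(.key += ($i + 1 | tostring))));

[ .[]
  # numbered plugin keys for this entry (empty object if the key is missing)
  | (.pluginsIcons // [] | numbered_plugins) as $plugins
  # parent fields without the nested structures, plus the numbered keys
  | (del(.actionDetails, .customVariables, .pluginsIcons) + $plugins) as $parent
  | .actionDetails[]
  # the action-level customVariables are dropped here, as in the answer above
  | del(.customVariables) + $parent
]
```

Saved as program.jq, this would be run the same way as the filter above: jq -f program.jq input.json.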

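  • I got multiple compilation errors from the jq command as written and from minor variations of it, and unfortunately I could not get to a workable solution from it.
  • I have verified that it works with jq 1.3, jq 1.4, jq 1.5, and jq 1.6 (jq -f program.jq input.json).
  • You're right, thank you for the reply! It did clear the main hurdle of my challenge, and I will try to handle renaming and restructuring customVariables.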
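
For the customVariables renaming mentioned in the last comment, a possible starting point is sketched below. name_value_pairs is a hypothetical helper, and it assumes that every inner object holds exactly one key containing "Name" and one containing "Value", as in the sample data; other shapes would need different handling.

```jq
# Hypothetical helper: turns
#   {"3": {"customVariableValue3": "F", "customVariableName3": "Gender"}, ...}
# into {"Gender": "F", ...}. A missing (null) input yields {}.
def name_value_pairs:
  [ (. // {})[]
    | to_entries
    | { key:   (map(select(.key | contains("Name")))  | .[0].value),
        value: (map(select(.key | contains("Value"))) | .[0].value) }
  ] | from_entries;

[ .[]
  # parent fields plus the renamed entry-level custom variables
  | (del(.actionDetails, .customVariables, .pluginsIcons)
     + (.customVariables | name_value_pairs)) as $parent
  | .actionDetails[]
  # action fields, then parent fields, then the renamed action-level variables
  | del(.customVariables) + $parent + (.customVariables | name_value_pairs)
]
```

With the sample input, the third action of the first entry should gain keys such as "Gender", "Role", "Partner", "name1", and "name2", which is the shape asked for in the question.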