【Question title】: Reading and formatting multilevel, uneven JSON
【Posted】: 2021-01-19 07:59:34
【Question】:

I have a JSON file that looks like this:

{
  "timestamps": [
    "2020-12-17T20:05:00Z",
    "2020-12-17T20:10:00Z",
    "2020-12-17T20:15:00Z",
    "2020-12-17T20:20:00Z",
    "2020-12-17T20:25:00Z",
    "2020-12-17T20:30:00Z"
  ],
  "properties": [
    {
      "values": [
        -20.58975828559592,
        -19.356728999226693,
        -19.808982964173023,
        -19.673928070777993,
        -19.712275037138411,
        -19.48422739982918
      ],
      "name": "Neg Flow",
      "type": "Double"
    },
    {
      "values": [
        2,
        20,
        19,
        20,
        19,
        16
      ],
      "name": "Event Count",
      "type": "Long"
    }
  ],
  "progress": 100.0
}

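How can I convert it into a data frame like the one below? I can loop over the individual data items, but I would like to know whether there is a more concise way to do this.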

+----------------------+---------------------+-------------+
|Time Stamps           | Neg Flow            | Event Count |
+----------------------+---------------------+-------------+
|2020-12-17T20:05:00Z  |-20.58975828559592   | 2           |
+----------------------+---------------------+-------------+
|2020-12-17T20:10:00Z  |-19.356728999226693  | 20          |
+----------------------+---------------------+-------------+

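【Comments】:

  • Could you provide the data in a reproducible form that we can copy and paste?
  • @RonakShah, the JSON file above is the data. It is copyable!
  • When I copy it into R, it returns many error messages, such as Error: unexpected '}' in " }" and Error in "type":"Long" : NA/NaN argument. I don't know how to copy it.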

Tags: r tidyverse jsonlite


【Solution 1】:

Here is one way to do it.

library(jsonlite) # read json
library(dplyr) # manipulate data frames
library(magrittr) # for the use of %<>%

# temp.json is my file using the content you provided
json_data <- read_json("temp.json")

# initial data with timestamp
data <- tibble(`Time Stamps` = unlist(json_data[["timestamps"]]))

# properties process
for (property in json_data[["properties"]]) {
  property_name <- property[["name"]]
  # dynamic column naming; for more details, see the link at the end of this post
  data %<>% mutate({{property_name}} := unlist(property[["values"]]))
}

Output:

# A tibble: 6 x 3
  `Time Stamps`        `Neg Flow` `Event Count`
  <chr>                     <dbl>         <int>
1 2020-12-17T20:05:00Z      -20.6             2
2 2020-12-17T20:10:00Z      -19.4            20
3 2020-12-17T20:15:00Z      -19.8            19
4 2020-12-17T20:20:00Z      -19.7            20
5 2020-12-17T20:25:00Z      -19.7            19
6 2020-12-17T20:30:00Z      -19.5            16

Learn more about programming with dplyr here:

https://cran.r-project.org/web/packages/dplyr/vignettes/programming.html
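Since the question asks for something more concise than an explicit loop, here is a minimal loop-free sketch of the same idea using only jsonlite and base R. It is an illustration rather than part of the original answer. It assumes every "values" array has the same length as "timestamps", and it parses a shortened inline copy of the JSON (two rows instead of six) so that it runs without temp.json. The names json_text, value_columns and result are made up for this example.

```r
library(jsonlite)

# Shortened inline copy of the JSON from the question (assumption: two rows only)
json_text <- '{
  "timestamps": ["2020-12-17T20:05:00Z", "2020-12-17T20:10:00Z"],
  "properties": [
    {"values": [-20.58975828559592, -19.356728999226693],
     "name": "Neg Flow", "type": "Double"},
    {"values": [2, 20], "name": "Event Count", "type": "Long"}
  ],
  "progress": 100.0
}'

# Keep the nested list structure, as read_json() does by default
json_data <- fromJSON(json_text, simplifyVector = FALSE)

# One vector per property, named after the property's "name" field
value_columns <- lapply(json_data$properties, function(p) unlist(p$values))
names(value_columns) <- vapply(json_data$properties,
                               function(p) p$name, character(1))

# Put the timestamps first, then build the data frame in one call;
# check.names = FALSE keeps the spaces in the column names
all_columns <- c(list(`Time Stamps` = unlist(json_data$timestamps)),
                 value_columns)
result <- do.call(
  data.frame,
  c(all_columns, list(check.names = FALSE, stringsAsFactors = FALSE))
)

print(result)
```

To run it against the full file, replacing the fromJSON() call with read_json("temp.json") should give the same list structure, because read_json() also leaves simplification off by default.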
