【Question title】: List of lists to data frame in R [duplicate]
【Posted】: 2016-09-10 16:41:23
【Question】:

I have a somewhat messy nested list in R, in the format below, and I am struggling to convert it to a data frame.

[[1]]
[[1]]$id
[1] 101

[[1]]$resource_state
[1] 'ON'

[[1]]$athlete
[[1]]$athlete$id
[1] 10001

[[1]]$athlete$resource_state
[1] 2

[[2]]
[[2]]$id
[1] 102

[[2]]$resource_state
[1] 'OFF'

[[2]]$athlete
[[2]]$athlete$id
[1] 10001

[[2]]$athlete$resource_state
[1] 1

I have tried sapply, lapply and data.table, but I can't get the desired output (see below). Any ideas? Thanks.

id   resource_state  athlete     athlete_resource_state
101  ON              10001       2
102  OFF             10001       1

I've added the dput output below to help reproduce the error (sorry, it's long!):

    Warning message:
In (function (..., deparse.level = 1)  :
  number of columns of result is not a multiple of vector length (arg 2)


    > dput(starred)
list(structure(list(id = 3096023L, resource_state = 2L, name = "Bench 1 to Bench 2", 
    activity_type = "Run", distance = 72.8203, average_grade = 3.5, 
    maximum_grade = 2.7, elevation_high = 104.8, elevation_low = 102.3, 
    start_latlng = list(53.8344181, -1.4967539), end_latlng = list(
        53.8338743, -1.4961714), start_latitude = 53.8344181, 
    start_longitude = -1.4967539, end_latitude = 53.8338743, 
    end_longitude = -1.4961714, climb_category = 0L, city = "Leeds", 
    state = "West Yorkshire", country = "United Kingdom", private = FALSE, 
    hazardous = FALSE, starred = TRUE, pr_time = 19L, athlete_pr_effort = structure(list(
        id = 13733712180, elapsed_time = 19L, distance = 68.6, 
        start_date = "2016-05-05T18:07:18Z", start_date_local = "2016-05-05T19:07:18Z", 
        is_kom = FALSE), .Names = c("id", "elapsed_time", "distance", 
    "start_date", "start_date_local", "is_kom")), starred_date = "2016-05-14T22:43:07Z"), .Names = c("id", 
"resource_state", "name", "activity_type", "distance", "average_grade", 
"maximum_grade", "elevation_high", "elevation_low", "start_latlng", 
"end_latlng", "start_latitude", "start_longitude", "end_latitude", 
"end_longitude", "climb_category", "city", "state", "country", 
"private", "hazardous", "starred", "pr_time", "athlete_pr_effort", 
"starred_date")), structure(list(id = 10490299L, resource_state = 2L, 
    name = "Regent Street - North", activity_type = "Run", distance = 408.4, 
    average_grade = 0, maximum_grade = 1.9, elevation_high = 35.1, 
    elevation_low = 32.3, start_latlng = list(53.799975, -1.533407), 
    end_latlng = list(53.80355, -1.532706), start_latitude = 53.799975, 
    start_longitude = -1.533407, end_latitude = 53.80355, end_longitude = -1.532706, 
    climb_category = 0L, city = "Leeds", state = NULL, country = "United Kingdom", 
    private = FALSE, hazardous = FALSE, starred = TRUE, pr_time = 80L, 
    athlete_pr_effort = structure(list(id = 9432540436, elapsed_time = 80L, 
        distance = 408.4, start_date = "2015-09-16T19:08:31Z", 
        start_date_local = "2015-09-16T20:08:31Z", is_kom = FALSE), .Names = c("id", 
    "elapsed_time", "distance", "start_date", "start_date_local", 
    "is_kom")), starred_date = "2016-05-14T22:40:09Z"), .Names = c("id", 
"resource_state", "name", "activity_type", "distance", "average_grade", 
"maximum_grade", "elevation_high", "elevation_low", "start_latlng", 
"end_latlng", "start_latitude", "start_longitude", "end_latitude", 
"end_longitude", "climb_category", "city", "state", "country", 
"private", "hazardous", "starred", "pr_time", "athlete_pr_effort", 
"starred_date")))

    > str(starred)
List of 2
 $ :List of 25
  ..$ id               : int 3096023
  ..$ resource_state   : int 2
  ..$ name             : chr "Bench 1 to Bench 2"
  ..$ activity_type    : chr "Run"
  ..$ distance         : num 72.8
  ..$ average_grade    : num 3.5
  ..$ maximum_grade    : num 2.7
  ..$ elevation_high   : num 105
  ..$ elevation_low    : num 102
  ..$ start_latlng     :List of 2
  .. ..$ : num 53.8
  .. ..$ : num -1.5
  ..$ end_latlng       :List of 2
  .. ..$ : num 53.8
  .. ..$ : num -1.5
  ..$ start_latitude   : num 53.8
  ..$ start_longitude  : num -1.5
  ..$ end_latitude     : num 53.8
  ..$ end_longitude    : num -1.5
  ..$ climb_category   : int 0
  ..$ city             : chr "Leeds"
  ..$ state            : chr "West Yorkshire"
  ..$ country          : chr "United Kingdom"
  ..$ private          : logi FALSE
  ..$ hazardous        : logi FALSE
  ..$ starred          : logi TRUE
  ..$ pr_time          : int 19
  ..$ athlete_pr_effort:List of 6
  .. ..$ id              : num 1.37e+10
  .. ..$ elapsed_time    : int 19
  .. ..$ distance        : num 68.6
  .. ..$ start_date      : chr "2016-05-05T18:07:18Z"
  .. ..$ start_date_local: chr "2016-05-05T19:07:18Z"
  .. ..$ is_kom          : logi FALSE
  ..$ starred_date     : chr "2016-05-14T22:43:07Z"
 $ :List of 25
  ..$ id               : int 10490299
  ..$ resource_state   : int 2
  ..$ name             : chr "Regent Street - North"
  ..$ activity_type    : chr "Run"
  ..$ distance         : num 408
  ..$ average_grade    : num 0
  ..$ maximum_grade    : num 1.9
  ..$ elevation_high   : num 35.1
  ..$ elevation_low    : num 32.3
  ..$ start_latlng     :List of 2
  .. ..$ : num 53.8
  .. ..$ : num -1.53
  ..$ end_latlng       :List of 2
  .. ..$ : num 53.8
  .. ..$ : num -1.53
  ..$ start_latitude   : num 53.8
  ..$ start_longitude  : num -1.53
  ..$ end_latitude     : num 53.8
  ..$ end_longitude    : num -1.53
  ..$ climb_category   : int 0
  ..$ city             : chr "Leeds"
  ..$ state            : NULL
  ..$ country          : chr "United Kingdom"
  ..$ private          : logi FALSE
  ..$ hazardous        : logi FALSE
  ..$ starred          : logi TRUE
  ..$ pr_time          : int 80
  ..$ athlete_pr_effort:List of 6
  .. ..$ id              : num 9.43e+09
  .. ..$ elapsed_time    : int 80
  .. ..$ distance        : num 408
  .. ..$ start_date      : chr "2015-09-16T19:08:31Z"
  .. ..$ start_date_local: chr "2015-09-16T20:08:31Z"
  .. ..$ is_kom          : logi FALSE
  ..$ starred_date     : chr "2016-05-14T22:40:09Z"

【Comments】:

    Tags: r


    【Solution 1】:

    You can combine lapply with unlist:

    do.call(rbind, lapply(myList, unlist))
    

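    Here myList is your nested list.

    【Discussion】:

    • Thanks, but when I try this on my full list I get an error: Warning message: In (function (..., deparse.level = 1) : number of columns of result is not a multiple of vector length (arg 1)
    • Note that this solution does not preserve column types; it coerces everything into a character matrix (not a data.frame).
    • This answer only works if your list looks exactly like what you showed in the question. If it doesn't, please show more about the list. 'dput(myList)' and 'str(myList)' would be very helpful.
    • I've added those outputs, which will hopefully reproduce the error. I think I'm being tripped up by the nested lists, which aren't always the same length. Could you take a look?
    • There is no resource_state variable in your list. Do you still want it? Which variables do you need now?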
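As a minimal sketch, here is the one-liner applied to the small example from the question. The list is rebuilt by hand from the printed output, and the last two steps (converting to a data.frame and calling type.convert) are an addition that is not part of the answer. unlist() joins nested names with ".", so the athlete fields come out as athlete.id and athlete.resource_state.

```r
# Rebuild the small example list from the question (values copied from the printout).
myList <- list(
  list(id = 101, resource_state = "ON",
       athlete = list(id = 10001, resource_state = 2)),
  list(id = 102, resource_state = "OFF",
       athlete = list(id = 10001, resource_state = 1))
)

# unlist() flattens each record into a named vector; because one field is
# text, every value in the vector is coerced to character.
rows <- lapply(myList, unlist)

# rbind() stacks the vectors row by row into a character matrix.
m <- do.call(rbind, rows)

# Addition to the answer: turn the matrix into a data.frame and let
# type.convert() guess a type for each column again.
df <- as.data.frame(m, stringsAsFactors = FALSE)
df[] <- lapply(df, type.convert, as.is = TRUE)

str(df)
```

This only works when every record flattens to the same number of fields; otherwise rbind() recycles values and warns.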
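One likely cause of the warning is visible in the dput output: the second record has state = NULL, and unlist() drops NULL entries, so the two flattened vectors differ in length and rbind() recycles values. The sketch below is one possible base-R workaround, not the answerer's method. The helpers null_to_na and flatten_record are hypothetical names introduced here, starred_small is a cut-down stand-in that uses a few fields from the dput output, and every leaf value is assumed to have length 1.

```r
# Hypothetical helper: recursively replace NULL with NA so no field is dropped.
null_to_na <- function(x) {
  if (is.null(x)) return(NA)
  if (is.list(x)) return(lapply(x, null_to_na))
  x
}

# Hypothetical helper: flatten a nested record into a flat named list,
# joining nested names with "_" and keeping each value's original type.
flatten_record <- function(rec) {
  nms <- names(rec)
  if (is.null(nms)) nms <- as.character(seq_along(rec))  # unnamed sub-lists
  flat <- list()
  for (i in seq_along(rec)) {
    val <- rec[[i]]
    if (is.list(val)) {
      sub <- flatten_record(val)
      if (length(sub) > 0) {
        names(sub) <- paste(nms[i], names(sub), sep = "_")
        flat <- c(flat, sub)
      }
    } else {
      flat[[nms[i]]] <- val
    }
  }
  flat
}

# Cut-down stand-in for `starred` (a subset of fields from the dput output).
starred_small <- list(
  list(id = 3096023L, name = "Bench 1 to Bench 2", state = "West Yorkshire",
       athlete_pr_effort = list(id = 13733712180, elapsed_time = 19L)),
  list(id = 10490299L, name = "Regent Street - North", state = NULL,
       athlete_pr_effort = list(id = 9432540436, elapsed_time = 80L))
)

# One one-row data.frame per record, then stack them.
rows <- lapply(starred_small, function(rec) {
  as.data.frame(flatten_record(null_to_na(rec)), stringsAsFactors = FALSE)
})
df <- do.call(rbind, rows)

str(df)
```

Because each record becomes its own one-row data.frame before binding, integer, numeric and character columns keep their types instead of collapsing into a character matrix. Records whose field sets differ would still need aligning before rbind().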