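【Posted】: 2021-08-17 18:49:13
【Question】:
This is one row of the data we fetch from a sports API, which arrives as a nested list. Our fetch_results$data is a list containing one such nested list per match, since the data covers many soccer matches. The list-of-lists nesting can go 3-4 levels deep, with inner lists for scores, time, visitorTeam, and so on.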
> dput(fetch_results$data[1])
list(list(id = 11984409L, league_id = 1326L, season_id = 15733L,
stage_id = 77442469L, round_id = 186274L, group_id = 225400L,
aggregate_id = NULL, venue_id = 7189L, referee_id = NULL,
localteam_id = 18716L, visitorteam_id = 18658L, winner_team_id = NULL,
weather_report = NULL, commentaries = FALSE, attendance = NULL,
pitch = NULL, details = "Match 1", neutral_venue = FALSE,
winning_odds_calculated = FALSE, formations = list(localteam_formation = NULL,
visitorteam_formation = NULL), scores = list(localteam_score = 0L,
visitorteam_score = 0L, localteam_pen_score = NULL, visitorteam_pen_score = NULL,
ht_score = NULL, ft_score = NULL, et_score = NULL, ps_score = NULL),
time = list(status = "NS", starting_at = list(date_time = "2021-06-11 19:00:00",
date = "2021-06-11", time = "19:00:00", timestamp = 1623438000L,
timezone = "UTC"), minute = NULL, second = NULL, added_time = NULL,
extra_minute = NULL, injury_time = NULL), coaches = list(
localteam_coach_id = 455836L, visitorteam_coach_id = 784486L),
standings = list(localteam_position = 3L, visitorteam_position = 1L),
assistants = list(first_assistant_id = NULL, second_assistant_id = NULL,
fourth_official_id = NULL), leg = "1/1", colors = NULL,
deleted = FALSE, is_placeholder = FALSE, localTeam = list(
data = list(id = 18716L, legacy_id = 213L, name = "Turkey",
short_code = "TUR", twitter = NULL, country_id = 404L,
national_team = TRUE, founded = 1923L, logo_path = "https://cdn.sportmonks.com/images//soccer/teams/28/18716.png",
venue_id = 9634L, current_season_id = 15733L, is_placeholder = NULL)),
visitorTeam = list(data = list(id = 18658L, legacy_id = 205L,
name = "Italy", short_code = "ITA", twitter = NULL, country_id = 251L,
national_team = TRUE, founded = 1898L, logo_path = "https://cdn.sportmonks.com/images//soccer/teams/2/18658.png",
venue_id = 7189L, current_season_id = 15733L, is_placeholder = NULL))))
To flatten this into a data frame, we use:
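library(magrittr)  # provides the %>% pipe

zed <- fetch_results$data %>%
  purrr::map(unlist) %>%             # flatten each match to a named vector (NULLs are dropped here)
  purrr::map(t) %>%                  # transpose into a 1-row matrix
  purrr::map(tibble::as_tibble) %>%  # one 1-row tibble per match
  dplyr::bind_rows() %>%             # stack matches; columns missing from a match become NA
  readr::type_convert()              # re-guess the types of character columns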
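One row of the resulting data frame looks like this:
If you look closely at the list of lists, many of the objects whose value is NULL are dropped from the resulting data frame. The entire score list and all of its keys are dropped. According to this stackoverflow post, it looks like unlist() discarding NULL values is the culprit...
The solutions posted in that thread only handle NULL values nested one level deep, but the list above contains many nested lists, as you can see by searching for list() in the output above.
What is the best way to flatten this list of lists without dropping any columns that have NULL values? If the best approach is to replace NULL with NA first, what is the best way to do that? Our existing code does the flattening and gets close, but it does not keep the columns that contain NULL.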
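To make the symptom concrete, here is a minimal base R sketch of how unlist() treats NULL at different depths. The `match` object is a hypothetical, cut-down stand-in for one element of fetch_results$data, not real API output.

```r
# Hypothetical miniature of one match record (not real API output).
match <- list(
  id = 1L,
  referee_id = NULL,                                       # NULL at the top level
  scores = list(localteam_score = 0L, ht_score = NULL)     # NULL one level down
)

# unlist() flattens the nesting, but NULL elements contribute nothing,
# so their names never appear in the result.
flat <- unlist(match)
print(names(flat))
```

Only the names of the non-NULL leaves survive, which is consistent with the missing columns described above.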
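One possible approach to the "replace NULL with NA first" idea is a small recursive helper that walks the nested list at any depth. The following is only a sketch under assumptions: `null_to_na` is a hypothetical helper name, and `match` is a made-up, cut-down record rather than real API output.

```r
# Hypothetical helper: recursively replace every NULL in a nested list with NA.
null_to_na <- function(x) {
  if (is.null(x)) return(NA)                         # a NULL leaf becomes NA
  if (is.list(x)) return(lapply(x, null_to_na))      # recurse; lapply keeps names
  x                                                  # any other leaf is unchanged
}

# Hypothetical miniature of one match record (not real API output).
match <- list(
  id = 1L,
  referee_id = NULL,
  scores = list(localteam_score = 0L, ht_score = NULL)
)

# With NULLs replaced by NA, unlist() keeps every key.
flat <- unlist(null_to_na(match))
print(names(flat))
```

In the pipeline above, this would presumably slot in as an extra purrr::map(null_to_na) step before purrr::map(unlist). Note that an empty list() has no leaves to replace, so it would still vanish when unlisted.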
【Discussion】: