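【Posted】: 2021-12-08 16:31:57
【Question】:
I am working through the documentation and tutorials for building a Sankey plot in R with networkD3, using networkD3::sankeyNetwork().
I can get this to work with someone else's code (from here: sankey diagram in R - data preparation - see CJ Yetman's tidyverse approach for networkD3).
When I try to implement it myself, my nodes end up in the wrong order along the x-axis, which makes the flow impossible to follow.
However, I cannot work out where sankeyNetwork gets its information about x-axis position.
Here is my implementation, which does not produce the expected result: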
library(tidyverse)
library(networkD3)
#Create the data
df <- data.frame('one' = c('a', 'b', 'b', 'a'),
                 'two' = c('c', 'd', 'e', 'c'),
                 'three' = c('f', 'g', 'f', 'f'))
#My code
#Create the links
links <- df %>%
  mutate(row = row_number()) %>% #Get row for grouping and pivoting
  pivot_longer(-row) %>% #pivot to long format
  group_by(row) %>%
  mutate(source_c = lead(value)) %>% #Get flow
  filter(!is.na(source_c)) %>% #Get rid of NA
  rename(target_c = value) %>% #Correct names
  group_by(target_c, source_c) %>% #Count frequencies
  summarize(value = n()) %>%
  ungroup() %>%
  mutate(target = as.integer(factor(target_c)), #Convert to numeric values
         source = as.integer(factor(source_c))) %>%
  mutate(source = source - 1, #zero index
         target = target - 1) %>%
  data.frame()
#create the nodes
nodes <- data.frame(name = factor(unique(c(links$target_c, links$source_c))))
#plot the network
sankeyNetwork(Links = links, Nodes = nodes, Source = 'source',
              Target = 'target', Value = 'value', NodeID = 'name')
This yields a plot with the nodes in the wrong order along the x-axis.
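One thing that may be worth checking in the snippet above is how the numeric indices are assigned. The following is a minimal base-R sketch, not a confirmed diagnosis. It assumes the six distinct edges that the example data produces, and the names `from`, `to`, `node_names` and the `*_idx_*` vectors are illustrative only. It compares coding each column with its own factor() call against coding both columns against one shared list of node names.

```r
# Distinct edges in the example data: (earlier column, later column).
# `from` and `to` are illustrative names, not taken from the original code.
from <- c("a", "b", "b", "c", "d", "e")
to   <- c("c", "d", "e", "f", "g", "f")

# Separate factor() calls: each column gets its own, unrelated level set.
from_idx_separate <- as.integer(factor(from)) - 1
to_idx_separate   <- as.integer(factor(to)) - 1

# Shared lookup: both columns are indexed against the same node list.
node_names      <- unique(c(from, to))
from_idx_shared <- match(from, node_names) - 1
to_idx_shared   <- match(to, node_names) - 1

# With separate factors, index 0 stands for "a" in one column and "c" in
# the other; with the shared lookup every index maps to exactly one label.
print(data.frame(from, from_idx_separate, to, to_idx_separate,
                 from_idx_shared, to_idx_shared))
```

If the indices are meant to be zero-based row positions in the nodes data frame, the shared lookup is the variant that lines up with it. Separately, the pipeline above stores the earlier column in target_c and the later one in source_c, so the link direction may also be reversed.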
Using the working code from the linked answer:
links <-
  df %>%
  mutate(row = row_number()) %>% # add a row id
  gather('col', 'source', -row) %>% # gather all columns
  mutate(col = match(col, names(df))) %>% # convert col names to col nums
  mutate(source = paste0(source, '_', col)) %>% # add col num to node names
  group_by(row) %>%
  arrange(col) %>%
  mutate(target = lead(source)) %>% # get target from following node in row
  ungroup() %>%
  filter(!is.na(target)) %>% # remove links from last column in original data
  select(source, target) %>%
  group_by(source, target) %>%
  summarise(value = n()) # aggregate and count similar links
# create nodes data frame from unique nodes found in links data frame
nodes <- data.frame(id = unique(c(links$source, links$target)),
                    stringsAsFactors = FALSE)
# remove column id from names
nodes$name <- sub('_[0-9]*$', '', nodes$id)
# set links data to the 0-based index of the nodes in the nodes data frame
links$source <- match(links$source, nodes$id) - 1
links$target <- match(links$target, nodes$id) - 1
sankeyNetwork(Links = links, Nodes = nodes, Source = 'source',
              Target = 'target', Value = 'value', NodeID = 'name')
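I appreciate that the working code and my code are different, but I cannot see where sankeyNetwork picks up the row number (i.e. x-axis) data: no variable containing that information is passed to it. I think I could get my own data-preparation code to work once I know what the data needs to look like.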
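For reference, here is a hedged sketch of the data shape that sankeyNetwork appears to expect. As far as I can tell, the function has no argument for x-position. The Source and Target columns are zero-based row indices into the Nodes data frame, and the horizontal placement is worked out by the layout from the links themselves. The base-R code below builds such a pair of data frames from the example data, then derives a column number for each node purely from the links. The `depth` loop is my own illustration of that idea and is not networkD3's actual code. The names `pairs` and `depth` are hypothetical.

```r
df <- data.frame(one   = c("a", "b", "b", "a"),
                 two   = c("c", "d", "e", "c"),
                 three = c("f", "g", "f", "f"),
                 stringsAsFactors = FALSE)

# One row per step between adjacent columns.
pairs <- rbind(
  data.frame(from = df$one, to = df$two,   stringsAsFactors = FALSE),
  data.frame(from = df$two, to = df$three, stringsAsFactors = FALSE)
)

# Count identical steps to get the link weights.
links <- aggregate(value ~ from + to, data = cbind(pairs, value = 1), FUN = sum)

# A single node list; both link columns are indexed against it (zero-based).
nodes <- data.frame(
  name = unique(c(as.character(links$from), as.character(links$to))),
  stringsAsFactors = FALSE
)
links$source <- match(links$from, nodes$name) - 1
links$target <- match(links$to,   nodes$name) - 1

# Illustration only: a node's column can be recovered from the links alone,
# as the longest chain of links leading into it.
depth <- rep(0, nrow(nodes))
for (pass in seq_len(nrow(nodes))) {
  for (i in seq_len(nrow(links))) {
    s <- links$source[i] + 1
    t <- links$target[i] + 1
    depth[t] <- max(depth[t], depth[s] + 1)
  }
}
nodes$depth <- depth
print(nodes)

# Build the widget only if networkD3 is installed; print(p) would display it.
if (requireNamespace("networkD3", quietly = TRUE)) {
  p <- networkD3::sankeyNetwork(Links = links, Nodes = nodes,
                                Source = "source", Target = "target",
                                Value = "value", NodeID = "name")
}
```

This simplified version only works because no label occurs in more than one column of the example data. The linked answer appends the column number to each node name, which keeps nodes distinct when the same label appears in several columns.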
【Discussion】:
Tags: networkd3 r javascript d3.js htmlwidgets