【Question title】: Creating treechart from tabbed text in R
【Posted】: 2014-07-23 17:20:24
【Question description】:

I would like to make a tree/flow chart of the following data, which is properly indented with tabs:

Vertebrates
    fish
        goldfish
        clownfish
    amphibian
        frog
        toad
    reptiles
        snake
        lizard
        turtle
        tortoise
    birds
        sparrow
        crow
        parrot
    mammals
        dog
        cat
        horse
        whale

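How can I convert this tree data into a flow chart (arrows running top to bottom or left to right), with the correct position of each item determined by counting the number of tabs on its line? I believe it can be done with the "diagram" package (Graph flow chart of transition from states, http://cran.r-project.org/web/packages/diagram/index.html), but I cannot figure out the exact steps. Thanks for your help.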

A rough example of the desired output is given below. There may be boxes around the text.


Edit: Ideally it should be a flexible solution that still works when levels are added or removed. For example, adding 2 kinds of sparrow:

Vertebrates         
    fish        
        goldfish    
        clownfish   
    amphibian       
        frog    
        toad    
    reptiles        
        snake   
        lizard  
        turtle  
        tortoise    
    birds       
        sparrow 
            house
            factory
        crow    
        parrot  
        crane   
    mammals     
        dog 
        cat 
        horse   
        whale   

dat  = structure(list(V1 = c("Vertebrates", NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA), V2 = c(NA, "fish", NA, NA, "amphibian", NA, NA, "reptiles", 
NA, NA, NA, NA, "birds", NA, NA, NA, NA, NA, NA, "mammals", NA, 
NA, NA, NA), V3 = c(NA, NA, "goldfish", "clownfish", NA, "frog", 
"toad", NA, "snake", "lizard", "turtle", "tortoise", NA, "sparrow", 
NA, NA, "crow", "parrot", "crane", NA, "dog", "cat", "horse", 
"whale"), V4 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, "house", "factory", NA, NA, NA, NA, NA, NA, NA, NA
)), .Names = c("V1", "V2", "V3", "V4"), class = "data.frame", row.names = c(NA, 
-24L))

【Comments】:

  • Isn't this more of a tree than a flow chart? Perhaps something like a dendrogram?
  • I agree. I have added that term above. Thanks.

Tags: r flowchart


【Solution 1】:

Here is a rather convoluted way using igraph. We need to arrange your data in two columns, from and to, indicating that an arrow runs from -> to.

library(zoo)
library(igraph)

# read tab-delimited data - keep the structure by setting "" to missing
# (it would have been great if you had given this in a format easier to use)

dat <- read.table("test.txt", sep="\t", header=FALSE, fill=TRUE, 
                  na.strings="", strip.white=TRUE, stringsAsFactors=FALSE)

head(dat, 7)
#             V1        V2        V3
#1   Vertebrates      <NA>      <NA>
#2          <NA>      fish      <NA>
#3          <NA>      <NA>  goldfish
#4          <NA>      <NA> clownfish
#5          <NA> amphibian      <NA>
#6          <NA>      <NA>      frog
#7          <NA>      <NA>      toad

Prepare the data for plotting

# carry forward the last value in first two columns to impute missing
dat[1:2] <- sapply(dat[1:2], na.locf, na.rm=FALSE)
dat <- na.omit(dat)

# get edges for graph - we want two columns (from and to) for each edge;
# unique() drops the repeated parent rows created by the carry-forward step
edges <- unique(rbind(dat[1:2], setNames(dat[2:3], names(dat[1:2]))))

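# create graph
# (graph_from_data_frame() is the current igraph name for graph.data.frame())
g <- graph_from_data_frame(edges)

# Plot graph
# (layout_as_tree() is the current igraph name for layout.reingold.tilford();
#  swapping and negating its two columns turns the tree on its side)
E(g)$curved <- 0
plot(g, vertex.size=0, edge.arrow.size=0,
     layout=-layout_as_tree(g)[,2:1])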

The data, as there will be better ways to do this!

dat <- structure(list(V1 = c("Vertebrates", NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), V2 = c(NA, 
"fish", NA, NA, "amphibian", NA, NA, "reptiles", NA, NA, NA, 
NA, "birds", NA, NA, NA, "mammals", NA, NA, NA, NA), V3 = c(NA, 
NA, "goldfish", "clownfish", NA, "frog", "toad", NA, "snake", 
"lizard", "turtle", "tortoise", NA, "sparrow", "crow", "parrot", 
NA, "dog", "cat", "horse", "whale")), .Names = c("V1", "V2", 
"V3"), class = "data.frame", row.names = c(NA, -21L))
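
The question suggests finding each item's position by counting the tabs on its line. A minimal sketch of that idea in base R follows, as an alternative to reshaping the data into columns. It assumes the lines are indented with real tab characters and that labels are unique across the tree; tabbed_to_edges is a hypothetical helper name, not part of igraph or zoo.

```r
# Hypothetical helper: turn tab-indented lines into a from/to edge list.
# Assumes indentation uses tab characters and each child sits exactly
# one level below its parent.
tabbed_to_edges <- function(lines) {
  lines <- lines[nzchar(trimws(lines))]            # drop blank lines
  depth <- attr(regexpr("^\t*", lines), "match.length")  # leading tabs
  label <- trimws(lines)

  last_at_depth <- character(0)  # most recent label seen at each depth
  from <- character(0)
  to <- character(0)
  for (i in seq_along(lines)) {
    d <- depth[i] + 1
    last_at_depth[d] <- label[i]
    if (d > 1) {                 # every non-root line has a parent
      from <- c(from, last_at_depth[d - 1])
      to <- c(to, label[i])
    }
  }
  data.frame(from = from, to = to, stringsAsFactors = FALSE)
}

# Small usage example (a subset of the question's data)
txt <- c("Vertebrates", "\tfish", "\t\tgoldfish", "\t\tclownfish",
         "\tbirds", "\t\tsparrow", "\t\t\thouse")
edges_from_tabs <- tabbed_to_edges(txt)
```

The resulting two-column data frame has the same from/to shape as edges above, so it could be passed to the graph-building step in the same way. If the same label appears under two different parents, both occurrences would collapse into one vertex, so this sketch is only suitable when labels are unique.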


EDIT: updated below for the new data

Calling your updated data dat2:

# To prepare the data

# carry forward the last value in columns if lower level (col to the right)
# is non-missing
dat2[1] <- na.locf(dat2[1], na.rm=FALSE)

for (i in ncol(dat2):2) {
  dat2[[i-1]] <- ifelse(!is.na(dat2[[i]]),
                        na.locf(dat2[[i-1]], na.rm=FALSE),
                        dat2[[i-1]])
}

# get edges for graph
edges <- rbind(na.omit(dat2[1:2]),
               do.call('rbind',
                       lapply(1:(ncol(dat2)-2), function(i)
                         na.omit(setNames(dat2[(1+i):(2+i)],
                                          names(dat2[1:2])))))
               )

Then proceed as before to create and plot the graph.
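
The fill and edge-extraction steps above can be wrapped in a function so that they apply to any number of levels. The following is a minimal sketch, assuming the data frame has at least two character columns (read with stringsAsFactors=FALSE). Both locf and wide_to_edges are hypothetical names; locf is a base-R stand-in for zoo's na.locf(x, na.rm=FALSE) so that the block runs without extra packages.

```r
# Hypothetical helper: last observation carried forward, base R only.
locf <- function(x) {
  idx <- cummax(seq_along(x) * (!is.na(x)))  # index of latest non-NA so far
  idx[idx == 0] <- NA                        # leading NAs stay NA
  x[idx]
}

# Hypothetical wrapper: wide, NA-padded columns -> unique from/to edges.
wide_to_edges <- function(dat) {
  dat[[1]] <- locf(dat[[1]])
  # Fill a parent column only on rows where the child column has a value.
  for (i in ncol(dat):2) {
    dat[[i - 1]] <- ifelse(!is.na(dat[[i]]), locf(dat[[i - 1]]), dat[[i - 1]])
  }
  # Stack each pair of adjacent columns as from/to, dropping incomplete rows.
  pairs <- lapply(seq_len(ncol(dat) - 1), function(i) {
    na.omit(setNames(dat[i:(i + 1)], c("from", "to")))
  })
  unique(do.call(rbind, pairs))
}

# Small usage example with three levels
demo <- data.frame(V1 = c("A", NA, NA, NA),
                   V2 = c(NA, "b", NA, "c"),
                   V3 = c(NA, NA, "x", NA),
                   stringsAsFactors = FALSE)
demo_edges <- wide_to_edges(demo)
```

Calling unique() at the end removes the repeated parent rows that the carry-forward step produces, so each edge appears once in the graph.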

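【Comments】:

  • Thanks, it works for this data, but if I add another level, say two kinds of sparrow (house, factory), it no longer works. Can we create a function that adjusts automatically?
  • Could you edit your question to add a small example of the more complex data? If you use dput(yourdata), you will get the data in the same format as at the end of my answer, which makes things easier. Cheers
  • Done. Thanks for your help.
  • @mso; updated - it should be easy to make an (ugly) function from this - HTH
  • Works great. However, if I change the order of the entries, the order shown on the chart is sometimes different.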