【Question title】: How can I combine all lists and make unique names
【Posted】: 2018-03-24 16:06:11
【Question description】:

I have data like this:


 ldf <- list(structure(list(Abund = c("BROS", "KIS", "TTHS", 
"MKS"), `Value: F111: cold, Sample1` = c("1.274e7", "", 
"", "2.301e7"), `Value: F111: warm, Sample1` = c("", "", 
"", "")), .Names = c("Abund", "Value: F111: cold, Sample1", 
"Value: F111: warm, Sample1"), row.names = c(NA, 4L), class = "data.frame"), 
    structure(list(Abund = c("BROS", "TMS", "KIS", 
    "HERS"), `Value: F216: cold, Sample2` = c("1.670e6", 
    "4.115e7", "", "1.302e7"), `Value: F216: warm, Sample2` = c("", 
    "2.766e7", "", "1.396e7")), .Names = c("Abund", "Value: F216: cold, Sample2", 
    "Value: F216: warm, Sample2"), row.names = c(NA, 4L), class = "data.frame"), 
    structure(list(Abund = c("BROS", "TMS", "KIS", 
    "HERS"), `Value: F655: cold, Sample3` = c("7.074e4", 
    "1.038e7", "", "7.380e5"), `Value: F655: warm, Sample3` = c("", 
    "6.874e6", "", "7.029e5")), .Names = c("Abund", "Value: F655: cold, Sample3", 
    "Value: F655: warm, Sample3"), row.names = c(NA, 4L), class = "data.frame")) 

In this case I want to take the unique names in Abund and then line the data up next to them, so the desired output looks like this:


Abund   coldsample1 Sample1 coldSample2 warmSample2 coldSample3 warmSample3
BROS    1.27E+07        1.67E+06        7.07E+04    
TMS                     4.12E+07        2.77E+07    1.04E+07    6.87E+06
HERS                    1.30E+07        1.40E+07    7.38E+05    7.03E+05
MKS     2.30E+07                    
KIS                     
TTHS

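【Question comments】:

  • The header for warmSample1 is missing the "warm" prefix. Also, the coldSample3 value 7.07E+04 appears to have shifted one column to the left, into the warmSample2 column.

Tags: r


【Solution 1】:

In base R, you can do something like this...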

# If ldf is a single wide data frame (as in the original version of this question),
# split it into a list of data frames, three columns at a time, and set the column names as required
dflist <- lapply(1:3, function(i) {
      df <- ldf[, (3*i-2):(3*i)]
      names(df) <- c("Abund", paste0("ColdSample", i), paste0("WarmSample", i))
      return(df)})

# Merge these together, keeping every Abund value (all=TRUE gives a full outer join)
dfout <- Reduce(function(x,y) merge(x,y,all=TRUE), dflist)

# If ldf is already a list of data frames (as in the modified version of the question), you can just do
dfout <- Reduce(function(x,y) merge(x,y,all=TRUE), ldf)

# and perhaps tidy up the names with
names(dfout) <- make.names(names(dfout))

dfout
  Abund ColdSample1 WarmSample1 ColdSample2 WarmSample2 ColdSample3 WarmSample3
1  BROS     1.274e7                 1.670e6                 7.074e4            
2  HERS        <NA>        <NA>     1.302e7     1.396e7     7.380e5     7.029e5
3   KIS                                                                        
4   MKS     2.301e7                    <NA>        <NA>        <NA>        <NA>
5   TMS        <NA>        <NA>     4.115e7     2.766e7     1.038e7     6.874e6
6  TTHS                                <NA>        <NA>        <NA>        <NA>
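
The merged columns are still character vectors, with "" where a sample was blank and NA where an Abund value was absent from one of the data frames. A minimal sketch of one way to turn both into numeric NA, using two small made-up data frames (`a`, `b` and their column names are hypothetical stand-ins for the elements of ldf, not the real data):

```r
# Hypothetical toy inputs, shaped like the elements of ldf
a <- data.frame(Abund = c("BROS", "KIS"), cold1 = c("1.274e7", ""),
                stringsAsFactors = FALSE)
b <- data.frame(Abund = c("BROS", "TMS"), cold2 = c("1.670e6", "4.115e7"),
                stringsAsFactors = FALSE)

# Full outer join on Abund, as in the answer above
merged <- Reduce(function(x, y) merge(x, y, by = "Abund", all = TRUE), list(a, b))

# Treat "" like a missing value, then convert every value column to numeric
value_cols <- setdiff(names(merged), "Abund")
merged[value_cols] <- lapply(merged[value_cols], function(v) {
  v[which(v == "")] <- NA_character_  # which() skips entries that are already NA
  as.numeric(v)
})

merged
```

Whether blanks and absent rows should really be treated the same depends on what an empty cell means in the real data, so this is only one possible clean-up.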

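【Comments】:

  • (An obvious case where Reduce ought to allow something like mapply's MoreArgs.)
  • @Andrew Gustar When I apply it to my real data I get Error in lst2[, (3 * i - 2):(3 * i)] : incorrect number of dimensions. Do you know why?
  • That depends on what lst2 is. The ldf you specified above is actually a data frame, although I notice your question title uses the word "list", so that may be part of the problem.
  • @Andrew Gustar Could you check the sample data above?
  • In that case you can just apply the merge directly... dfout <- Reduce(function(x,y) merge(x,y,all=TRUE),ldf) - I have updated the answer above.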
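
The error discussed in these comments comes from using two-index subsetting (`[, cols]`) on a plain list, which has no dimensions. As a rough sketch of how both input shapes could be handled in one place, the helper below (`merge_by_abund` is a hypothetical name, and the three-columns-per-block layout is an assumption taken from the answer) checks the input with `is.data.frame()` before splitting:

```r
# Hypothetical helper: accepts either a wide data frame made of blocks of
# three columns, or a list of data frames that each have an Abund column
merge_by_abund <- function(x) {
  pieces <- x
  if (is.data.frame(x)) {
    n_blocks <- ncol(x) %/% 3
    pieces <- lapply(seq_len(n_blocks), function(i) {
      piece <- x[, (3 * i - 2):(3 * i)]
      names(piece)[1] <- "Abund"  # the key column must share one name to merge on
      piece
    })
  }
  Reduce(function(a, b) merge(a, b, by = "Abund", all = TRUE), pieces)
}

# Made-up example data in both shapes
wide <- data.frame(Abund = c("BROS", "KIS"), cold1 = c("1", "2"), warm1 = c("3", "4"),
                   Abund2 = c("BROS", "TMS"), cold2 = c("5", "6"), warm2 = c("7", "8"),
                   stringsAsFactors = FALSE)
as_list <- list(wide[1:3], setNames(wide[4:6], c("Abund", "cold2", "warm2")))

from_wide <- merge_by_abund(wide)
from_list <- merge_by_abund(as_list)
```

Both calls should give the same three-row result (BROS, KIS, TMS), whereas calling `as_list[, 1:3]` directly reproduces the "incorrect number of dimensions" error.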
【Solution 2】:

With the help of data.table, you can do this (edited after the question changed to a list of data frames of different lengths):

library(data.table)

# Build one long data table: for each data frame in ldf, pair Abund with its stacked value columns
x <- rbindlist(lapply(ldf, function(i) cbind(i["Abund"], stack(i[2:3]), row.names = NULL)))
# Cast the long data back into a usable wide form, one row per unique Abund
y <- dcast(x, Abund~ind, value.var="values")
# Get nicer variable names
names(y) <- gsub(".*:", "", names(y)); names(y) <- gsub(", ", "", names(y))
# Replace empty strings with NA and print the final table
(y <- y[,lapply(.SD,function(j){ifelse(j=="", NA, j)})])
#   Abund  coldSample1  warmSample1  coldSample2  warmSample2  coldSample3  warmSample3
#1:  BROS      1.274e7           NA      1.670e6           NA      7.074e4           NA
#2:  HERS           NA           NA      1.302e7      1.396e7      7.380e5      7.029e5
#3:   KIS           NA           NA           NA           NA           NA           NA
#4:   MKS      2.301e7           NA           NA           NA           NA           NA
#5:   TMS           NA           NA      4.115e7      2.766e7      1.038e7      6.874e6
#6:  TTHS           NA           NA           NA           NA           NA           NA
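
To see what the two name-cleaning `gsub()` calls do, here they are applied to three of the column names from the question. This is only an illustration in base R; the extra `trimws()` step is an addition that is not part of the answer above, included because the greedy pattern `.*:` leaves the space that follows the last colon:

```r
nm <- c("Abund", "Value: F111: cold, Sample1", "Value: F111: warm, Sample1")

step1 <- gsub(".*:", "", nm)    # drop everything up to and including the last colon
step2 <- gsub(", ", "", step1)  # drop the comma and space before "Sample1"
clean <- trimws(step2)          # added step: strip the leftover leading space

step2
# [1] "Abund"        " coldSample1" " warmSample1"
clean
# [1] "Abund"       "coldSample1" "warmSample1"
```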

【Comments】:

  • What is the output? I can't see it.
  • This uses your most recently updated data, and the result is included above. I hope it helps.