【Question title】: Error in apply(counts, 2, function(x) rpkm(x, lengths)) : dim(X) must have a positive length
【Posted】: 2021-03-23 11:37:29
【Question description】:

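I am trying to use the tpm_rpkm.R script, but I get this error:

Error in apply(counts, 2, function(x) rpkm(x, lengths)) : dim(X) must have a positive length

(The data table should not contain any errors, because it was generated with the same program the script's author used.)

Here is the script: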

#! /usr/bin/env Rscript

# Author: Andy Saurin (andrew.saurin@univ-amu.fr)
#
# Simple RScript to calculate RPKMs and TPMs
# based on method for RPKM/TPM calculations shown in http://www.rna-seqblog.com/rpkm-fpkm-and-tpm-clearly-explained/
#
# The input file is the output of featureCounts
#

rpkm <- function(counts, lengths) {
  pm <- sum(counts) /1e6
  rpm <- counts/pm
  rpm/(lengths/1000)
}

tpm <- function(counts, lengths) {
  rpk <- counts/(lengths/1000)
  coef <- sum(rpk) / 1e6
  rpk/coef
}


## read table from featureCounts output
args <- commandArgs(T)

tag <- tools::file_path_sans_ext(args[1])


cat('Reading in featureCounts data...')
ftr.cnt <- read.table(args[1], sep="\t", header=T, quote="") #Important to disable default quote behaviour or else genes with apostrophes will be taken as strings
cat(' Done\n')

if ( ncol(ftr.cnt) < 7 ) { 
    cat(' The input file is not the raw output of featureCounts (number of columns > 6) \n')
    quit('no')
}

lengths = ftr.cnt[,6]

counts <- ftr.cnt[,7:ncol(ftr.cnt)]

cat('Performing RPKM calculations...')

rpkms <- apply(counts, 2, function(x) rpkm(x, lengths) )
ftr.rpkm <- cbind(ftr.cnt[,1:6], rpkms)

write.table(ftr.rpkm, file=paste0(tag, "_rpkm.txt"), sep="\t", row.names=FALSE, quote=FALSE)
cat(' Done.\n\tSaved as ')
cat ( paste0(tag, "_rpkm.txt", '\n') )

cat('Performing TPM calculations...')

tpms <- apply(counts, 2, function(x) tpm(x, lengths) )

ftr.tpm <- cbind(ftr.cnt[,1:6], tpms)

write.table(ftr.tpm, file=paste0(tag, "_tpm.txt"), sep="\t", row.names=FALSE, quote=FALSE)
cat(' Done.\n\tSaved as ')
cat ( paste0(tag, "_tpm.txt", '\n') )


quit('no')

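Command output:

Rscript tpm_rpkm.R 450-3-hard_filtered.featureCounts
Reading in featureCounts data... Done
Performing RPKM calculations...Error in apply(counts, 2, function(x) rpkm(x, lengths)) : dim(X) must have a positive length
Execution halted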

My featureCounts table looks like this:

Geneid | Chr | Start | End | Strand | Length |
1_1 | NODE_1_length_59711_cov_84.026979_g0_i0 | 116 | 904 | + | 789 | 198
1_2 | NODE_1_length_59711_cov_84.026979_g0_i0 | 1178 | 3514 | - | 2337 | 2294
1_3 | NODE_1_length_59711_cov_84.026979_g0_i0 | 3618 | 4319 | + | 702 | 502
1_4 | NODE_1_length_59711_cov_84.026979_g0_i0 | 4337 | 4921 | + | 585 | 320
1_5 | NODE_1_length_59711_cov_84.026979_g0_i0 | 4953 | 5906 | + | 954 | 799
1_6 | NODE_1_length_59711_cov_84.026979_g0_i0 | 5920 | 7056 | + | 1137 | 532
1_7 | NODE_1_length_59711_cov_84.026979_g0_i0 | 7061 | 8071 | + | 1011 | 761
1_8 | NODE_1_length_59711_cov_84.026979_g0_i0 | 8068 | 8766 | + | 699 | 188
1_9 | NODE_1_length_59711_cov_84.026979_g0_i0 | 8766 | 9656 | + | 891 | 217
1_10 | NODE_1_length_59711_cov_84.026979_g0_i0 | 9640 | 10710 | + | 1071 | 408
1_11 | NODE_1_length_59711_cov_84.026979_g0_i0 | 10692 | 11348 | + | 657 | 162
1_12 | NODE_1_length_59711_cov_84.026979_g0_i0 | 11359 | 12282 | + | 924 | 342

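Does anyone know how to deal with this?

【Comments】:

  • Please read [How do I ask a good question?](stackoverflow.com/help/how-to-ask) and the Markdown help to learn how to format your question properly. In particular, please reformat your table and show the exact code you used to invoke the script.

Tags: r rscript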


【Solution 1】:

Update the definition of counts to:

counts <- ftr.cnt[,7:ncol(ftr.cnt), drop=FALSE]

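This ensures that counts stays a two-dimensional structure that apply can work on. When the featureCounts file holds only one sample, 7:ncol(ftr.cnt) selects a single column, and by default R simplifies a one-column selection from a data frame to a plain vector, whose dim() is NULL.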
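The difference can be reproduced on a small stand-in table. This is only a sketch: the data frame below is built by hand from the first three rows shown in the question, and the count column name sample1 is hypothetical, since the question does not show the real header of that column. The rpkm function is copied from the script.

```r
# Toy stand-in for a single-sample featureCounts table.
# Values come from the first three rows in the question;
# the column name "sample1" is an assumption made for illustration.
ftr.cnt <- data.frame(
  Geneid  = c("1_1", "1_2", "1_3"),
  Chr     = "NODE_1_length_59711_cov_84.026979_g0_i0",
  Start   = c(116, 1178, 3618),
  End     = c(904, 3514, 4319),
  Strand  = c("+", "-", "+"),
  Length  = c(789, 2337, 702),
  sample1 = c(198, 2294, 502)
)

# Same RPKM function as in the script.
rpkm <- function(counts, lengths) {
  pm <- sum(counts) / 1e6
  rpm <- counts / pm
  rpm / (lengths / 1000)
}

lengths <- ftr.cnt[, 6]

# Default indexing: a single selected column is simplified to a vector.
dropped <- ftr.cnt[, 7:ncol(ftr.cnt)]
is.null(dim(dropped))  # TRUE, so apply() has no dimensions to iterate over

# drop = FALSE keeps a one-column data frame.
counts <- ftr.cnt[, 7:ncol(ftr.cnt), drop = FALSE]
dim(counts)

# apply() now runs column by column and returns a one-column matrix.
rpkms <- apply(counts, 2, function(x) rpkm(x, lengths))
dim(rpkms)
```

With two or more sample columns the original line already returns a data frame, so adding drop = FALSE changes nothing in that case and is safe to leave in place.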

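【Comments】:

  • Thank you. That worked. Thanks for the quick reply.
  • @Shail If the answer helped, feel free to accept it by clicking the check mark on its left. Only one answer can be accepted per post. Reference: stackoverflow.com/help/someone-answers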