Missing output file(s) expected by nextflow process: answers

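【Question title】: Missing output file(s) expected by nextflow process
【Posted】: 2021-08-04 18:04:48
【Question description】:

I have a nextflow process that takes multiple files as input, does something with them, and then outputs some files. Within the process, I conditionally delete empty files.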

    process imputation {
    input:
    set val(chrom),val(chunk_array),val(chunk_start),val(chunk_end),path(in_haps),path(refs),path(maps) from imp_ch
    output:
    tuple val("${chrom}"),path("${chrom}.*") into imputed
    script:
    def (haps,sample)=in_haps
    def (haplotype, legend, samples)=refs
    """
    impute4 -g "${haps}" -h "${haplotype}" -l "${legend}" -m "${maps}" -o "${chrom}.imputed.chunk${chunk_array}" -no_maf_align -o_gz -int "${chunk_start}" "${chunk_end}" -Ne 20000 -buffer 1000 -seed 54321
    if [[ \$(gunzip -c "${chrom}.imputed.chunk${chunk_array}.gen.gz" | head -c1 | wc -c) == "0" ]]
    then
     rm "${chrom}.imputed.chunk${chunk_array}.gen.gz"
    else
     qctools -g "${chrom}.imputed.chunk${chunk_array}.gen.gz" -snp-stats -osnp "${chrom}.imputed.chunk${chunk_array}.snp.stats"
    fi
    """
    }

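The process runs fine. The impute4 program writes *gen.gz files as output, some of which may be empty. I therefore added the if statement to delete those empty files, because qctools cannot read empty files and the process crashes. The problem is that I now get this error: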

Missing output file(s) `chr16*` expected by process `imputation (165)` (note: input files are not included in the default matching set)

How can I fix this? Any help is appreciated.

【Discussion】:

Tags: nextflow

【Solution 1】:

Does the nextflow optional-output pattern help?

Short version:

process foo {
  output:
  file 'foo.txt' optional true into foo_ch

  script:
  '''
  your_command
  '''
}

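Basically, by declaring the output as optional, the process will not fail if none of the declared output globs is found.

However, depending on the number of output files, you may want to be more specific in the output declaration about which output files are required and which are optional, to make sure that your process still fails if all the commands fail (for whatever reason).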
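To see why the original process can trip this check: once the empty `.gen.gz` file has been removed and qctools has been skipped, there may be nothing left in the task directory that matches the output glob. The following is only a rough shell sketch of that situation, not how Nextflow itself collects outputs. The directory, the file name, the `chr16` value and the `has_outputs` helper are all made up for illustration.

```shell
taskdir=$(mktemp -d)
chrom=chr16   # illustrative value

# Stand-in for an impute4 result whose decompressed content is empty
gzip < /dev/null > "$taskdir/${chrom}.imputed.chunk1.gen.gz"

# Hypothetical helper: succeeds if anything matches "<chrom>.*" in the task directory
has_outputs() {
    ls "$taskdir/${chrom}."* > /dev/null 2>&1
}

if has_outputs; then before=present; else before=missing; fi

# Same cleanup step as in the process script
rm "$taskdir/${chrom}.imputed.chunk1.gen.gz"

if has_outputs; then after=present; else after=missing; fi

echo "before cleanup: $before, after cleanup: $after"
```

After the cleanup nothing matches the pattern, which is the case a non-optional output declaration reports as missing.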

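【Discussion】:

【Solution 2】:

Using the optional pattern, as suggested by user jfy133, would be one way to solve your problem. In any case, you may want to split the two commands into separate processes.

You could also store the count or test result that you use in the if clause, and apply the nextflow filter or branch operator to the output channel of your first process before running qctools.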

Filter:

Channel
    .from( 1, 2, 3, 4, 5 )
    .filter { it % 2 == 1 }

Branch:

Channel
    .from(1,2,3,40,50)
    .branch {
        small: it < 10
        large: it > 10
    }
    .set { result }

result.small.view { "$it is small" }
result.large.view { "$it is large" }
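The value to store can be tried out on its own outside nextflow first. This is a minimal sketch, assuming a POSIX-like shell with gzip available; the helper name `first_byte_count` and the file names are invented for the example. Inside a nextflow script block the `$` of a command substitution would additionally need to be escaped as `\$`.

```shell
# Hypothetical helper: prints 0 if the gzip file decompresses to nothing, 1 otherwise.
first_byte_count() {
    gunzip -c "$1" | head -c1 | wc -c | tr -d '[:space:]'
}

workdir=$(mktemp -d)
gzip < /dev/null > "$workdir/empty.gen.gz"            # valid gzip, no content
printf 'rs1 A G\n' | gzip > "$workdir/full.gen.gz"    # valid gzip, one line

empty_count=$(first_byte_count "$workdir/empty.gen.gz")
full_count=$(first_byte_count "$workdir/full.gen.gz")

echo "empty: $empty_count, full: $full_count"
```

A count of 0 marks a file that should be filtered out, and a count of 1 marks a file that can go on to qctools.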

Your solution might look like this:

process imputation {
    input:
        ...
    output:
        tuple env(isempty), file(other), file(output) into imputed

    script:
        def (haps,sample)=in_haps
        def (haplotype, legend, samples)=refs
        """
        impute4 <your parameters>
        isempty=\$(gunzip -c "${chrom}.imputed.chunk${chunk_array}.gen.gz" | head -c1 | wc -c)
        """
}

// env outputs are strings, so convert to an integer before comparing
filtered_imputed = imputed.filter { it[0].toInteger() > 0 }

process qctools {
    input:
        tuple val(isempty), <your input> from filtered_imputed
    output:
        <your desired output> into qctools_output

    script:
    """
    qctools <your parameters>
    """
}
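One detail worth checking when the byte count is stored in a variable such as `isempty` and compared later: some `wc` implementations (the BSD one shipped with macOS, for example) pad the number with leading spaces, so an exact string comparison against `"0"` can fail even for an empty file. The sketch below simulates such padded output instead of relying on a particular platform; the variable names are illustrative.

```shell
# Simulated output of `wc -c` on a platform that pads the count with spaces
padded="       0"

# Exact string comparison: the padding makes this fail
if [ "$padded" = "0" ]; then string_match=yes; else string_match=no; fi

# Strip whitespace first, then compare numerically
count=$(printf '%s' "$padded" | tr -d '[:space:]')
if [ "$count" -eq 0 ]; then numeric_match=yes; else numeric_match=no; fi

echo "string comparison: $string_match, numeric comparison: $numeric_match"
```

Stripping whitespace in the script, or converting the value to an integer on the nextflow side, keeps the comparison independent of how the count is formatted.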

【Discussion】: