How to move files into directories based on filename and count files: answers

【Question title】: How To Move Files Into Directories based on filename and Count Files
【Posted】: 2015-03-18 13:28:35
【Question】:

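I have seen other threads that are somewhat similar to my question, but I am still a beginner and can't really follow some of the code posted there.

I have a list of directories that each follow one of the formats below. The numbers at the end differ, and the abbreviation (UVM here) can be any entry from a list of abbreviations.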

jhu-usc.edu_UVM.HumanMethylation450.Level_1.1.0.0/
jhu-usc.edu_UVM.HumanMethylation450.aux.1.0.0/
jhu-usc.edu_UVM.HumanMethylation450.mage-tab.1.0.0/

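I would like to write a script that moves each directory and its files, recursively, from the current directory into a new directory chosen by the abbreviation (e.g. UVM), creating that directory if it does not exist yet.

Then I would like to count the files ending in *idat in each directory and write a .txt file that says "For <abbreviation> there are <this many> idat files".

I haven't had much time to work on this lately, and my deadline is coming up soon. If anyone can help me with this, I would really appreciate it.

Please forgive me if the question is badly worded or formatted. This is my first post, so I did my best.

Thank you!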

【Comments】:

Does anyone know a way to write out, for each abbreviation folder itself (i.e. UVMFOLDER), the number of idat files inside it?

Tags: bash command-line ipython-notebook

【Solution 1】:

Something like this?

#!/bin/bash

#Definition of where you want the directory to be moved
destination_root="/tmp/stuff"

function abbreviation() {

    #Default
    destination="${destination_root}/UNKNOWN"

    if [[ $1 == UVM* ]]; then
            destination="${destination_root}/UVMFOLDER"
    fi

    if [[ $1 == BRCA* ]]; then
            destination="${destination_root}/BRCAFOLDER"
    fi

    #Ensure folder is present
    mkdir -p "${destination}"
}


#Loop through all folders matching the prefix
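for instance in jhu-usc.edu_*
do
    #Skip anything that is not a directory (also covers a glob that matched nothing)
    [ -d "${instance}" ] || continue

    #Count instances of *idat files and put the result in a.txt file
    echo "${instance} - $(find "${instance}" -type f -name "*idat" | wc -l)" | tee -a a.txt

    #Strip the jhu-usc.edu_ prefix and let abbreviation() set ${destination}
    abbreviation "${instance#jhu-usc.edu_}"

    #Move the folder to the destination
    mv "${instance}" "${destination}"
done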

This creates a .txt file with contents like the following:

jhu-usc.edu_UVM.HumanMethylation450.Level_1.1.0.0 - 3
jhu-usc.edu_UVM.HumanMethylation450.aux.1.0.0 - 2

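Here 3 and 2 are the numbers of files ending in idat found in each folder.

Edit

Changed the output folder for the moved files: I stripped the jhu-usc.edu_ prefix. So, for example, jhu-usc.edu_UVM.HumanMethylation450.Level_1.1.0.0 will be moved to /tmp/stuff/UVM.HumanMethylation450.Level_1.1.0.0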

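【Comments】:

Thank you! I believe the counting will work, but I want the directories moved to a new destination based on the abbreviation that follows jhu-usc.edu_, which comes from a list of about 30 abbreviations (UVM, BRCA, BLCA, etc.). So basically, if it is BRCA, the script creates a BRCA directory if needed and sends the folder there; if it is UVM, it creates a UVM directory and sends it there.
Thanks. I added sudo before mv because I was getting a permission error. It moves the folders, but I would also like everything with the same abbreviation to end up in one folder (anything with UVM goes into one UVM folder, anything with BRCA into one BRCA folder), and then to count how many idat files are in each abbreviation folder (e.g. BRCA has 10000 idat files). Is there also a way to print the result as sentences (e.g. "BRCA has 1000 idat files. UVM has 5000 ..." and so on, each on a new line)? Sorry for asking so many questions, but I really appreciate the help!
See my updated answer :) You now have to define the paths for all the different abbreviations. Alternatively, you could use some algorithm that sets the path from the input inside the abbreviation() function.
Thank you so much for your help. Sorry, I have been trying a few things.. I just posted an updated version of the script.. I would like to see whether the idat files can be counted per abbreviation folder rather than per individual directory, i.e. UVMFOLDER has 10000 idat files. I rearranged the code so that all the files can live in the current directory.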
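
The idea raised in the comments, setting the path from the input rather than listing every abbreviation, could look roughly like the sketch below. It assumes every directory is named jhu-usc.edu_<ABBR>.<rest>, so the abbreviation is whatever sits between the prefix and the first dot. The helper name destination_for is made up for illustration, and the <ABBR>FOLDER naming and the /tmp/stuff root are simply copied from the answer.

```shell
#!/bin/bash
# Sketch only: derive the destination from the directory name.
# Assumption: names look like jhu-usc.edu_<ABBR>.<anything>

destination_root="${destination_root:-/tmp/stuff}"   # example root taken from the answer

# Hypothetical helper: prints the destination folder for one directory name
destination_for() {
    local name abbr
    name="${1#jhu-usc.edu_}"   # strip the fixed prefix
    abbr="${name%%.*}"         # keep everything before the first dot
    printf '%s/%sFOLDER\n' "${destination_root}" "${abbr}"
}

for instance in jhu-usc.edu_*; do
    [ -d "${instance}" ] || continue   # skip non-directories and an unmatched glob
    destination="$(destination_for "${instance}")"
    mkdir -p "${destination}"
    mv "${instance}" "${destination}/"
done
```

With this approach a new abbreviation needs no code change, but a directory whose name does not follow the assumed pattern gets a folder named after whatever precedes its first dot instead of landing in UNKNOWN.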

【Solution 2】:

#!/bin/bash
#Definition of where you want the directory to be moved
destination_root="/data/nrnb01_nobackup/agross/TCGA_methylation"

#Delete tar.gz files
find . -type f -name '*tar.gz' -exec rm {} +

#Move all files within cancersubtype folders into current directory to allow ease of moving to specific directories
find . -mindepth 2 -type f -print -exec mv {} . \;

function abbreviation() {
    #Default
    destination="${destination_root}/UNKNOWN"

    if [[ $1 == UVM* ]]; then
            destination="${destination_root}/UVMFOLDER"
    fi

    if [[ $1 == BRCA* ]]; then
            destination="${destination_root}/BRCAFOLDER"
    fi

    if [[ $1 == ACC* ]]; then
            destination="${destination_root}/ACCFOLDER"
    fi

    if [[ $1 == BLCA* ]]; then
            destination="${destination_root}/BLCAFOLDER"
    fi

    if [[ $1 == CESC* ]]; then
            destination="${destination_root}/CESCFOLDER"
    fi

    if [[ $1 == CHOL* ]]; then
            destination="${destination_root}/CHOLFOLDER"
    fi

    if [[ $1 == COAD* ]]; then
            destination="${destination_root}/COADFOLDER"
    fi

    if [[ $1 == DLBC* ]]; then
            destination="${destination_root}/DLBCFOLDER"
    fi

    if [[ $1 == ESCA* ]]; then
            destination="${destination_root}/ESCAFOLDER"
    fi

    if [[ $1 == GBM* ]]; then
            destination="${destination_root}/GBMFOLDER"
    fi

    if [[ $1 == HNSC* ]]; then
            destination="${destination_root}/HNSCFOLDER"
    fi

    if [[ $1 == KICH* ]]; then
            destination="${destination_root}/KICHFOLDER"
    fi

    if [[ $1 == KIRC* ]]; then
            destination="${destination_root}/KIRCFOLDER"
    fi

    if [[ $1 == KIRP* ]]; then
            destination="${destination_root}/KIRPFOLDER"
    fi

    if [[ $1 == LAML* ]]; then
            destination="${destination_root}/LAMLFOLDER"
    fi

    if [[ $1 == LGG* ]]; then
            destination="${destination_root}/LGGFOLDER"
    fi

    if [[ $1 == LIHC* ]]; then
            destination="${destination_root}/LIHCFOLDER"
    fi

    if [[ $1 == LUAD* ]]; then
            destination="${destination_root}/LUADFOLDER"
    fi

    if [[ $1 == LUSC* ]]; then
            destination="${destination_root}/LUSCFOLDER"
    fi

    if [[ $1 == MESO* ]]; then
            destination="${destination_root}/MESOFOLDER"
    fi

    if [[ $1 == OV* ]]; then
            destination="${destination_root}/OVFOLDER"
    fi

    if [[ $1 == PAAD* ]]; then
            destination="${destination_root}/PAADFOLDER"
    fi

    if [[ $1 == PCPG* ]]; then
            destination="${destination_root}/PCPGFOLDER"
    fi

    if [[ $1 == PRAD* ]]; then
            destination="${destination_root}/PRADFOLDER"
    fi

    if [[ $1 == READ* ]]; then
            destination="${destination_root}/READFOLDER"
    fi

    if [[ $1 == SARC* ]]; then
            destination="${destination_root}/SARCFOLDER"
    fi

    if [[ $1 == SKCM* ]]; then
            destination="${destination_root}/SKCMFOLDER"
    fi

    if [[ $1 == STAD* ]]; then
            destination="${destination_root}/STADFOLDER"
    fi

    if [[ $1 == TGCT* ]]; then
            destination="${destination_root}/TGCTFOLDER"
    fi

    if [[ $1 == THCA* ]]; then
            destination="${destination_root}/THCAFOLDER"
    fi

    if [[ $1 == THYM* ]]; then
            destination="${destination_root}/THYMFOLDER"
    fi

    if [[ $1 == UCEC* ]]; then
            destination="${destination_root}/UCECFOLDER"
    fi

    if [[ $1 == UCS* ]]; then
            destination="${destination_root}/UCSFOLDER"
    fi


    #Ensure folder is present
    mkdir -p "${destination}"
}

#Loop through all folders matching the jhu prefix
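for instance in jhu-usc.edu_*
do
    #Skip anything that is not a directory (also covers a glob that matched nothing)
    [ -d "${instance}" ] || continue

    #Count instances of *idat files and put the result in idat_count.txt file
    echo "${instance} - $(find "${instance}" -type f -name "*idat" | wc -l)" | tee -a idat_count.txt

    #Strip the jhu-usc.edu_ prefix and let abbreviation() set ${destination}
    abbreviation "${instance#jhu-usc.edu_}"

    #Move the folder to the destination
    mv "${instance}" "${destination}"
done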

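【Comments】:

I updated my answer. I would like to see whether the code can be changed to count the total number of idat files in each abbreviation folder, i.e. BLCAFOLDER has 1000 idat files.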
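
One way to get a total per abbreviation folder is a second pass that runs after the move. This is a minimal sketch, not part of either posted script. It assumes the folders were created as <ABBR>FOLDER directly under destination_root, as in the scripts above. The function name summarize_idat and the output file idat_summary.txt are invented for illustration.

```shell
#!/bin/bash
# Sketch only: one summary sentence per abbreviation folder.

destination_root="${destination_root:-/tmp/stuff}"   # assumption: same root used for the move
summary_file="${summary_file:-idat_summary.txt}"     # hypothetical output file name

# Hypothetical helper: prints "<ABBR> has <N> idat files." for each *FOLDER directory
summarize_idat() {
    local folder abbr count
    for folder in "${destination_root}"/*FOLDER/; do
        [ -d "${folder}" ] || continue    # the glob matched nothing
        abbr="$(basename "${folder}")"    # e.g. UVMFOLDER
        abbr="${abbr%FOLDER}"             # e.g. UVM
        # Count recursively; tr drops the padding some wc implementations add
        count="$(find "${folder}" -type f -name '*idat' | wc -l | tr -d '[:space:]')"
        printf '%s has %s idat files.\n' "${abbr}" "${count}"
    done
}

summarize_idat | tee "${summary_file}"
```

Because the count comes from find, it includes idat files in every subdirectory of the abbreviation folder, so it works whether the moved directories keep their original layout or not. File names containing newlines would be miscounted by wc -l.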