How to untar a TAR file using Apache Commons

【Question title】: How to untar a TAR file using Apache Commons
【Posted】: 2012-07-10 23:33:57
【Question description】:

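I'm using the Apache Commons 1.4.1 library to compress and uncompress ".tar.gz" files.

I'm having trouble with the last step: converting a TarArchiveInputStream into a FileOutputStream.

Oddly enough, it breaks on this line: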

FileOutputStream fout = new FileOutputStream(destPath);

destPath is a File whose canonical path is: C:\Documents and Settings\Administrator\My Documents\JavaWorkspace\BackupUtility\untarred\Test\subdir\testinsub.txt

Error produced:

Exception in thread "main" java.io.IOException: The system cannot find the path specified

Any idea what it could be? Why can't it find the path?

I'm attaching the whole method below (most of it is taken from here).

private void untar(File dest) throws IOException {
    dest.mkdir();
    TarArchiveEntry tarEntry = tarIn.getNextTarEntry();
    // tarIn is a TarArchiveInputStream
    while (tarEntry != null) {// create a file with the same name as the tarEntry
        File destPath = new File(dest.toString() + System.getProperty("file.separator") + tarEntry.getName());
        System.out.println("working: " + destPath.getCanonicalPath());
        if (tarEntry.isDirectory()) {
            destPath.mkdirs();
        } else {
            destPath.createNewFile();
            FileOutputStream fout = new FileOutputStream(destPath);
            tarIn.read(new byte[(int) tarEntry.getSize()]);
            fout.close();
        }
        tarEntry = tarIn.getNextTarEntry();
    }
    tarIn.close();
}

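【Comments on the question】:

Sorry to ask, but I tried your code sample and saw it work with the particular gzip file I was using. Given that the content is only read from the input stream, how does it work without a call to fout.write(...)? In the answer @user1894600 suggests, write(...) has to be called explicitly, passing the byte array that has already been read into memory.

Tags: java file-io tar apache-commons compression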

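【Solution 1】:

A couple of general points. First, why do voodoo with the File constructor when there is a perfectly usable constructor that takes the name of the File you want to create together with a parent File?

Secondly, I am not quite sure how spaces in Windows paths are handled. That might be the cause of your problem. Try using the constructor I mentioned above and see whether it makes a difference: File destPath = new File(dest, tarEntry.getName()); (assuming that File dest is a proper file that exists and that you can access).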
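A minimal sketch of the constructor suggestion, using a hypothetical helper class DestPathSketch (the class and method names are illustrative and appear neither in the original post nor in Apache Commons). It resolves an entry name with the two-argument File constructor and, for comparison, with the string concatenation from the question. Both should yield the same path even when a directory name contains a space, so the constructor is mainly a readability improvement and not a guaranteed fix for the reported error.

```java
import java.io.File;

final class DestPathSketch {
    // Hypothetical helper: resolve an archive entry name against the extraction directory.
    static File resolve(File destDir, String entryName) {
        // File(File parent, String child) joins the two parts with the
        // platform separator, so no manual concatenation is needed.
        return new File(destDir, entryName);
    }

    // The hand-built variant from the question, kept only for comparison.
    static File resolveByConcatenation(File destDir, String entryName) {
        return new File(destDir.toString() + System.getProperty("file.separator") + entryName);
    }
}
```

For example, DestPathSketch.resolve(new File("untarred"), "Test/sub dir/testinsub.txt") keeps the space in "sub dir" unchanged.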

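Thirdly, before you do anything with a File object, you should check that it exists and that it is accessible. That will ultimately help you pinpoint the problem.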
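One way to perform such a check, sketched with plain java.io. PathCheckSketch and its describe method are hypothetical names introduced here for illustration. Neither createNewFile() nor FileOutputStream creates missing parent directories, so a missing parent directory is a plausible cause of "The system cannot find the path specified" when the archive lists a file before its directory entry.

```java
import java.io.File;

final class PathCheckSketch {
    // Hypothetical helper: report why opening destPath for writing might fail.
    // Returns "ok" when none of the checked conditions applies.
    static String describe(File destPath) {
        File parent = destPath.getAbsoluteFile().getParentFile();
        if (parent == null) {
            return "path has no parent directory: " + destPath;
        }
        if (!parent.exists()) {
            return "parent directory does not exist: " + parent;
        }
        if (!parent.isDirectory()) {
            return "parent is not a directory: " + parent;
        }
        if (!parent.canWrite()) {
            return "parent directory is not writable: " + parent;
        }
        if (destPath.isDirectory()) {
            return "target is a directory: " + destPath;
        }
        return "ok";
    }
}
```

Printing PathCheckSketch.describe(destPath) just before the failing line would show which of these conditions applies, if any.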

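【Discussion】:

Thanks for the reply. I decided to rewrite the module, and it works fine. I have taken your advice about the File object on board, so I am marking your answer as the correct one (on principle).
Glad it helped, and I hope it all works out in the end. Good luck :)
I use the same code to untar a .tar file rather than a .tar.gz. From the line 'new File(dest, tarEntry.getName())' I get the file contents instead of the file name. What should I do to get the file names in the .tar?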

【Solution 2】:

Your program has a Java heap space error, so I think a small change is needed. Here is the code...

public static void uncompressTarGZ(File tarFile, File dest) throws IOException {
    dest.mkdir();
    TarArchiveInputStream tarIn = null;

    tarIn = new TarArchiveInputStream(
                new GzipCompressorInputStream(
                    new BufferedInputStream(
                        new FileInputStream(
                            tarFile
                        )
                    )
                )
            );

    TarArchiveEntry tarEntry = tarIn.getNextTarEntry();
    // tarIn is a TarArchiveInputStream
    while (tarEntry != null) {// create a file with the same name as the tarEntry
        File destPath = new File(dest, tarEntry.getName());
        System.out.println("working: " + destPath.getCanonicalPath());
        if (tarEntry.isDirectory()) {
            destPath.mkdirs();
        } else {
            destPath.createNewFile();
            //byte [] btoRead = new byte[(int)tarEntry.getSize()];
            byte [] btoRead = new byte[1024];
            //FileInputStream fin 
            //  = new FileInputStream(destPath.getCanonicalPath());
            BufferedOutputStream bout = 
                new BufferedOutputStream(new FileOutputStream(destPath));
            int len = 0;

            while((len = tarIn.read(btoRead)) != -1)
            {
                bout.write(btoRead,0,len);
            }

            bout.close();
            btoRead = null;

        }
        tarEntry = tarIn.getNextTarEntry();
    }
    tarIn.close();
}

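Good luck.

【Discussion】:

So the heap space error occurs because the byte array may be too large when it is declared as byte [] btoRead = new byte[(int)tarEntry.getSize()];?
Good answer. But the destPath.createNewFile(); line should be modified to create the parent directories first: if (!destPath.getParentFile().exists()) { destPath.getParentFile().mkdirs(); } destPath.createNewFile();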
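The two points raised in this discussion (a fixed-size buffer instead of one array per entry, and creating parent directories before writing) can be combined in a small helper. This is a sketch under stated assumptions, not the answerer's code: EntryWriterSketch and writeEntry are hypothetical names, and the helper accepts any InputStream whose read returns -1 once the data for the current file is exhausted, which is what the copy loop in Solution 2 relies on for tarIn.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class EntryWriterSketch {
    // Hypothetical helper: copy everything readable from 'in' into destPath
    // and return the number of bytes written.
    static long writeEntry(InputStream in, File destPath) throws IOException {
        // Create missing parent directories first; FileOutputStream will not.
        File parent = destPath.getParentFile();
        if (parent != null && !parent.isDirectory() && !parent.mkdirs()) {
            throw new IOException("Could not create directory " + parent);
        }
        // A small fixed buffer keeps memory use independent of the entry size.
        byte[] buffer = new byte[1024];
        long total = 0;
        // try-with-resources closes (and flushes) the output even if a read fails.
        try (OutputStream out = new BufferedOutputStream(new FileOutputStream(destPath))) {
            int len;
            while ((len = in.read(buffer)) != -1) {
                out.write(buffer, 0, len);
                total += len;
            }
        }
        return total;
    }
}
```

Inside the loop of Solution 2, the else branch could then be reduced to a single call such as EntryWriterSketch.writeEntry(tarIn, destPath), leaving tarIn open for the next entry.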