
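【Question title】: LZW Compression
【Posted】: 2017-08-06 02:33:31
【Question description】:

My LZW compression algorithm is increasing the size in bits after compression:

Here is the code for the compression function: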

// compression
void compress(FILE *inputFile, FILE *outputFile) {    
    int prefix;
    int character;

    int nextCode;
    int index;

    // LZW starts out with a dictionary of 256 single-byte entries (for 8-bit input symbols) and uses those as the
    //  "standard" character set; codes 0..255 stand for the bytes themselves.
    nextCode = 256; // next code is the next available string code
    dictionaryInit();

    // the first byte seeds the prefix; without this, prefix is read uninitialized
    if ((prefix = getc(inputFile)) == EOF) { // empty input: nothing to encode
        dictionaryDestroy();
        return;
    }

    // while (there is still data to be read)
    while ((character = getc(inputFile)) != EOF) { // getc returns int, so compare with EOF directly

        // if (dictionary contains prefix+character)
        if ((index = dictionaryLookup(prefix, character)) != -1) prefix = index; // prefix = prefix+character
        else { // ...no, try to add it
            // write the code for the current prefix to the output file
            writeBinary(outputFile, prefix);

            // add prefix+character to dictionary
            if (nextCode < dictionarySize) dictionaryAdd(prefix, character, nextCode++);

            // prefix = character
            prefix = character; // start the next string with the unmatched character
        }
    }
    // write the code for the final prefix
    writeBinary(outputFile, prefix); // output the last code

    if (leftover > 0) fputc(leftoverBits << 4, outputFile);

    // free the dictionary here
    dictionaryDestroy();
}

The writeBinary function (which acts like a buffer in the program) is as follows:

void writeBinary(FILE * output, int code);

int leftover = 0;
int leftoverBits;

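// Packs fixed 12-bit codes: every two codes are written as three bytes.
void writeBinary(FILE * output, int code) {
    if (leftover > 0) {
        // low 4 bits of the previous code, then the high 4 bits of this one
        int previousCode = (leftoverBits << 4) + (code >> 8);

        fputc(previousCode, output);
        fputc(code, output); // fputc keeps only the low 8 bits of code

        leftover = 0; // no leftover now
    } else {
        leftoverBits = code & 0xF; // save the low 4 bits (mask 00001111)
        leftover = 1;

        fputc(code >> 4, output); // high 8 bits of the 12-bit code
    }
}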

Can you spot the error? I would really appreciate it!

【Discussion】:

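Is this a guessing game? Where's Waldo? What error? Please describe the problem...
The compressed file is larger than the uncompressed file. The algorithm should produce a compressed version that is smaller in bits. There is a logic error that I cannot pinpoint. I am asking whether you can point out that logic error.
Not every data set will compress with a particular algorithm. Do you have reference code confirming that this data should compress?
Why the 4 in leftoverBits << 4? fputc(previousCode, output); fputc(code, output); writes 16 bits. It should only be writing 9, 10, 11... IMO writeBinary() is completely broken and not salvageable if the code is truly attempting LZW.
Yes, for example the string 'TOBEORNOTTOBEORTOBEORNOT#': I saved it in a .txt file that takes up 25 bytes. After compression, the compressed file takes up 27 bytes, but it should be reduced to 22% of the original file size.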

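Tags: c compression lzw

【Solution 1】:

chux has already pointed you to the solution: you need to start with 9-bit codes and increase the code size, up to 12 bits, whenever the codes available at the current bit size are used up. If you write 12-bit codes from the very beginning, of course there is no compression effect.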

【Discussion】: