【Question title】: Outputting raw bytes from a file, first byte is corrupted
【Posted】: 2015-01-17 12:35:30
【Question】:

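I wrote a small program that reads the contents of a file into a char array (because fstream seems to support only char pointers). What I want to do is send the raw bytes to the console. AFAIK char is an 8-bit data type, so it shouldn't be too hard. However, if I just print the members of the array, I get the characters corresponding to the ASCII values, so I am using a static cast. This works fine, except that the first byte doesn't seem to be cast properly. I am using a PNG file as the test.bin file. PNG files always begin with the byte sequence 137,80,78,71,13,10,26,10, but the first byte is printed incorrectly. I have a feeling it has something to do with the value being over 127. However, I can't change the read buffer's data type to anything else (like unsigned char or unsigned short int), because foo.read() from fstream only supports char destination buffers. How do I get fstream to read the raw bytes into a usable unsigned type?

My code: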

#include <cstdlib>   // abort
#include <fstream>
#include <iostream>
#include <string>    // std::string, std::getline
#include <sys/stat.h>

#define filename "test.bin"

void pause(){
    std::string dummy;
    std::cout << "Press enter to continue...";
    std::getline(std::cin, dummy);
}


int main(int argc, char** argv) {
    using std::cout;
    using std::endl;
    using std::cin;
    // opening file
    std::ifstream fin(filename, std::ios::in | std::ios::binary);
    if (!fin.is_open()) {
       cout << "error: open file for input failed!" << endl;
       pause();
       abort();
    }
    //getting the size of the file
    struct stat statresults;
    if (stat(filename, &statresults) == 0){
        cout<<"File size:"<<statresults.st_size<<endl;
    }
    else{
        cout<<"Error determining file size."<<endl;
        pause();
        abort();
    }
    //setting up read buffer and reading the entire file into the buffer
    char* rBuffer = new char[statresults.st_size];
    fin.read(rBuffer, statresults.st_size);

    //print the first 8 bytes
    for (int i = 0; i < 8; i++) {
        cout<<static_cast<unsigned short>(rBuffer[i])<<";";
    }



    pause();
    fin.clear();
    fin.close();
    delete [] rBuffer;
    pause();
    return 0;
}

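【Comments】:

Can you see with a debugger what the actual value is?
The value of rBuffer[0] is -119.
What if you seek to the beginning of the file? fin.seekg(0, fin.beg);
Can you verify (with a hex editor) that your file has the correct values?
There is no need to pass the in flag to an ifstream, nor to clear it before closing it.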

Tags: c++ arrays file-io casting unsigned

【Solution 1】:

How about trying something other than fin.read()?

Instead of:

char* rBuffer = new char[statresults.st_size];
fin.read(rBuffer, statresults.st_size);

You could use:

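unsigned char* rBuffer = new unsigned char[statresults.st_size];
for(int i = 0; i < statresults.st_size; i++)
{
    // get() with no arguments returns the byte as an int in 0..255 (or EOF);
    // the get(char&) overload cannot bind to an unsigned char.
    rBuffer[i] = static_cast<unsigned char>(fin.get());
}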

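【Comments】:

Thanks, this works fine, but I'm a bit worried about the speed of this approach. It doesn't matter in this particular case, but reading a file 1 byte at a time seems a bit hackish to me. I have a feeling this would become a bottleneck if I needed to do a lot of IO.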

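【Solution 2】:

-119 signed is 137 unsigned (both are 1000 1001 in binary).
When the signed char is cast to unsigned short, it is sign-extended to 1111 1111 1000 1001, which is 65,417 as an unsigned value.
I assume that is the value you are seeing.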
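
The conversion described above can be reproduced in isolation. This is only a minimal sketch: the helper names `cast_directly` and `cast_via_unsigned_char` are made up for illustration, `signed char` is used so that the result does not depend on whether plain `char` is signed on your platform, and the 65,417 figure assumes a 16-bit `unsigned short`.

```cpp
// Hypothetical helper: mimics the cast used in the question.
// A negative input wraps modulo 2^16 (assuming a 16-bit unsigned short),
// so -119 comes out as 65417.
unsigned short cast_directly(signed char c) {
    return static_cast<unsigned short>(c);
}

// Hypothetical helper: converts to unsigned char first, which yields the
// byte value in 0..255, and only then widens to a larger unsigned type.
unsigned int cast_via_unsigned_char(signed char c) {
    return static_cast<unsigned char>(c);
}
```

In the question's loop, the same idea would be to print `static_cast<unsigned>(static_cast<unsigned char>(rBuffer[i]))`, which keeps the char buffer as it is and only changes the cast.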

To read into an unsigned buffer:

unsigned char* rBuffer = new unsigned char[statresults.st_size];
fin.read(reinterpret_cast<char*>(rBuffer), statresults.st_size);

【Comments】:

【Solution 3】:

You may want to use unsigned char as your "byte". You could try something like this:

using byte = unsigned char;

...

byte* buffer = new byte[statresults.st_size];
fin.read( reinterpret_cast<char*>( buffer ), statresults.st_size );

...

delete[] buffer;
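
For completeness, here is how the same single read() call might look when wrapped in a function. This is a sketch under a few assumptions rather than part of the answer above: `read_file_bytes` is a hypothetical name, a `std::vector<unsigned char>` stands in for the `new[]`/`delete[]` pair, and the file size comes from `tellg()` instead of `stat()`.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: reads a whole file into a vector of unsigned bytes.
// Returns an empty vector if the file cannot be opened or is empty.
std::vector<unsigned char> read_file_bytes(const std::string& path) {
    // Open in binary mode, positioned at the end, so tellg() gives the size.
    std::ifstream fin(path, std::ios::binary | std::ios::ate);
    if (!fin) return {};

    const std::streamoff size = fin.tellg();
    if (size <= 0) return {};
    fin.seekg(0, std::ios::beg);

    std::vector<unsigned char> bytes(static_cast<std::size_t>(size));
    // read() only accepts char*, so the unsigned buffer is reinterpreted.
    fin.read(reinterpret_cast<char*>(bytes.data()),
             static_cast<std::streamsize>(size));
    // Keep only the bytes that were actually read.
    bytes.resize(static_cast<std::size_t>(fin.gcount()));
    return bytes;
}
```

The whole file is read in one call, so this avoids the byte-at-a-time loop, and the vector frees its memory on its own. Printing `static_cast<unsigned>(bytes[i])` for the first eight elements of a PNG file should then give the 137,80,78,71,13,10,26,10 sequence mentioned in the question.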

【Comments】: