Having issue when reading binary data of float

【Question title】: Having issue when reading binary data of float
【Posted】: 2018-05-10 02:39:32
【Question description】:

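I want to use the os module exclusively for reading and writing binary files. I run into a problem when reading values of data types that occupy more than 1 byte, such as int64, float32, and so on. To illustrate the problem, consider the following example I wrote. I generate random values of type np.float64, each of which is 8 bytes: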

import os
import numpy as np

# Write
n = 10
dim = 2
fd = os.open('test.dat', os.O_CREAT | os.O_WRONLY)
data_w = np.random.uniform(low=0.5, high=13.3, size=(n,dim)).astype(np.float64)
print("Written Data are:\n%s\n" % data_w)
os.write(fd, data_w.tobytes())
os.close(fd)
print("------------------ \n")

# Read
start_read = 0  # 0 for now. Later I can read from any row!
total_num_to_read = n*dim
fd = os.open('test.dat', os.O_RDONLY)
os.lseek(fd, start_read, 0)  # start_read from the beginning 0
raw_data = os.read(fd, total_num_to_read)  # How many values to be read
data_r = np.fromiter(raw_data, dtype=np.float64).reshape(-1, dim)
print("Data Read are:\n%s\n" % data_r)
os.close(fd)

The values read back are incorrect. Here is what it returns:

Written Data are:
[[ 2.75763292  9.87883101]
 [ 1.73752327  9.9633879 ]
 [ 1.01616811  1.81174597]
 [ 9.93904659 10.6757686 ]
 [ 7.02452029  2.68652109]
 [ 5.29766028 11.15384409]
 [ 4.12499766 10.37214532]
 [11.75811252  3.30378401]
 [ 1.72738203  2.11228277]
 [ 7.7321937  11.64298051]]

------------------ 

Data Read are:
[[250.  87.]
 [227. 216.]
 [161.  15.]
 [  6.  64.]
 [162. 178.]
 [ 59.  35.]
 [246. 193.]
 [ 35.  64.]
 [218.  97.]
 [ 81.  50.]]

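I can't retrieve the data correctly! I thought np.fromiter(raw_data, dtype=np.float64).reshape(-1, dim) would handle it, but I don't know where the problem is. Given that I know the binary data is of a specific data type (namely np.float64), how do I read the binary data in this case?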
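A note on why the values read back are small whole numbers: iterating over a bytes object in Python 3 yields one integer between 0 and 255 per byte, so np.fromiter converts each byte to its own float instead of decoding groups of 8 bytes. Since os.read was asked for n*dim = 20 bytes, exactly 20 such per-byte values came back. A minimal sketch of the effect (the value 2.5 and the variable names are illustrative and not from the original post):

```python
import numpy as np

# One float64 value occupies 8 bytes.
raw = np.array([2.5], dtype=np.float64).tobytes()
print(len(raw))  # 8

# Iterating over bytes yields one int (0-255) per byte, so fromiter
# produces 8 floats, one per byte, rather than a single float64.
per_byte = np.fromiter(raw, dtype=np.float64)
print(per_byte.shape)  # (8,)

# frombuffer reinterprets each group of 8 bytes as one float64.
decoded = np.frombuffer(raw, dtype=np.float64)
print(decoded)  # [2.5]
```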

【Question discussion】:

"I want to use the os module exclusively..." Just out of curiosity, why os "exclusively"? Why not use numpy's built-in methods?
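For comparison, the NumPy built-in route the comment alludes to might look roughly like the sketch below. It is not part of the original thread: the file name builtin_demo.dat and the np.arange test data are placeholders, and the offset argument of np.fromfile assumes NumPy 1.17 or later.

```python
import numpy as np

n, dim = 10, 2
data_w = np.arange(n * dim, dtype=np.float64).reshape(n, dim)

# tofile writes the raw bytes with no header, like os.write(fd, data_w.tobytes()).
data_w.tofile('builtin_demo.dat')

# count is given in items (not bytes); offset is given in bytes.
start_row = 2
data_r = np.fromfile(
    'builtin_demo.dat',
    dtype=np.float64,
    count=3 * dim,
    offset=start_row * dim * data_w.itemsize,
).reshape(-1, dim)

print(np.array_equal(data_r, data_w[2:5]))  # True
```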

Tags: python python-3.x

【Solution 1】:

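You should use np.fromstring(raw_data) instead of fromiter(). Check the documentation to see what each function is for. Also, when reading from the file, read the correct number of bytes: 8 * total_num_to_read.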

In [103]: # Write
     ...: n = 10
     ...: dim = 2
     ...: fd = os.open('test.dat', os.O_CREAT | os.O_WRONLY)
     ...: data_w = np.random.uniform(low=0.5, high=13.3, size=(n,dim)).astype(np.float64)
     ...: print("Written Data are:\n%s\n" % data_w)
     ...: os.write(fd, data_w.tobytes())
     ...: os.close(fd)
     ...: print("------------------ \n")
     ...: 
     ...: # Read
     ...: start_read = 0  # 0 for now. Later I can read from any row!
     ...: total_num_to_read = n*dim
     ...: fd = os.open('test.dat', os.O_RDONLY)
     ...: os.lseek(fd, start_read, 0)  # start_read from the beginning 0
     ...: raw_data = os.read(fd, 8*total_num_to_read)  # How many values to be read
     ...: data_r = np.fromstring(raw_data, dtype=np.float64).reshape(-1, dim)
     ...: print("Data Read are:\n%s\n" % data_r)
     ...: os.close(fd)
     ...: 
     ...: 
Written Data are:
[[ 11.2465988    5.45304778]
 [ 12.06466331   9.95717255]
 [  7.35402895   1.68972606]
 [  0.7259652    1.01265826]
 [  3.11340311   2.44725153]
 [  2.82109715   5.02768335]
 [ 12.69054614   9.26028537]
 [  5.13785639   2.0780649 ]
 [  4.6796513    4.24710598]
 [  2.34859141   8.87224674]]

------------------ 

Data Read are:
[[ 11.2465988    5.45304778]
 [ 12.06466331   9.95717255]
 [  7.35402895   1.68972606]
 [  0.7259652    1.01265826]
 [  3.11340311   2.44725153]
 [  2.82109715   5.02768335]
 [ 12.69054614   9.26028537]
 [  5.13785639   2.0780649 ]
 [  4.6796513    4.24710598]
 [  2.34859141   8.87224674]]
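
Binary-mode np.fromstring is deprecated in newer NumPy releases in favor of np.frombuffer. The sketch below is a variant of the approach above rather than the answerer's own code: it uses np.frombuffer, derives the byte count from the dtype's itemsize instead of hard-coding 8, and seeks to a row offset, which the question's comment about reading from any row hints at. The helper name read_rows, the file name rows_demo.dat, and the np.arange test data are assumptions made for illustration.

```python
import os
import numpy as np

# O_BINARY exists only on Windows, where it disables newline translation.
O_BINARY = getattr(os, 'O_BINARY', 0)

def read_rows(path, start_row, num_rows, dim, dtype=np.float64):
    """Read num_rows rows of dim values each, starting at start_row."""
    itemsize = np.dtype(dtype).itemsize   # 8 bytes for float64
    row_bytes = dim * itemsize
    fd = os.open(path, os.O_RDONLY | O_BINARY)
    try:
        # Offsets and lengths passed to lseek/read are in bytes, not values.
        os.lseek(fd, start_row * row_bytes, os.SEEK_SET)
        raw = os.read(fd, num_rows * row_bytes)
    finally:
        os.close(fd)
    # os.read may return fewer bytes than requested (for example at end
    # of file); frombuffer raises ValueError if the length is not a
    # multiple of itemsize. The returned array is read-only.
    return np.frombuffer(raw, dtype=dtype).reshape(-1, dim)

n, dim = 10, 2
data_w = np.arange(n * dim, dtype=np.float64).reshape(n, dim)

# O_TRUNC discards leftover bytes from any earlier, longer file.
fd = os.open('rows_demo.dat', os.O_CREAT | os.O_WRONLY | os.O_TRUNC | O_BINARY)
os.write(fd, data_w.tobytes())
os.close(fd)

data_r = read_rows('rows_demo.dat', start_row=3, num_rows=4, dim=dim)
print(np.array_equal(data_r, data_w[3:7]))  # True
```

If a writable array is needed, calling .copy() on the result is one option, since np.frombuffer over a bytes object returns a read-only view.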

【Discussion】: