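[Posted]: 2019-10-03 17:09:55
[Question]:
I am reading the MNIST image files into R with a function that relies on readBin(). However, when I run the function line by line, I see that readBin() returns a different value each time for the same line of code (with no arguments changed). How can that be?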
#Getting the data
> download.file("http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz",
+ "t10k-images-idx3-ubyte.gz")
#Unzipped the .gz file manually, outside of R. The extracted file is 'train-images.idx3-ubyte'
#Using file() to read the 'train-images.idx3-ubyte' file
> f = file("train-images.idx3-ubyte", 'rb')
#this is what 'f' is:
> f
A connection with
description "train-images.idx3-ubyte"
class "file"
mode "rb"
text "binary"
opened "opened"
can read "yes"
can write "no"
#The following lines show the execution of readBin with the same parameters, though giving a different value each time
> readBin(f, 'integer', n = 1, size = 4, endian = 'big')
[1] 2051
> readBin(f, 'integer', n = 1, size = 4, endian = 'big')
[1] 60000
> readBin(f, 'integer', n = 1, size = 4, endian = 'big')
[1] 28
> readBin(f, 'integer', n = 1, size = 4, endian = 'big')
[1] 28
> readBin(f, 'integer', n = 1, size = 4, endian = 'big')
[1] 0
[Comments]:
-
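You keep reading through the file. The first call reads the first 4 bytes; the second call then reads from the position where the first call left off, and so on.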
-
Thanks @nicola! But why aren't the chunks of bytes read in each call equal? What causes the inconsistency?
-
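Evidently I didn't make myself clear.
readBin reads from the position where the previous call left off. In each call, you ask it to read 4 bytes. So the first call reads the first 4 bytes of the file, the second call reads bytes 5-8, the third reads bytes 9-12, and so on. The chunks are all the same size; only their contents differ. It's like reading a sentence one word at a time. Say you have the sentence "Hello everybody! How are you?". That sentence is like your file. Your call to readBin is like saying "read one word at a time". The first call returns "Hello", the second returns "everybody", and so on.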
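To make this concrete, here is a minimal sketch that does not need the MNIST download. It assumes a small stand-in file (a hypothetical temp file, not the real dataset) holding only the four header values seen in the question, written as 4-byte big-endian integers. It shows the connection's position advancing with each readBin() call, and how seek() can move it back to the start so the same values are read again.

```r
# Stand-in file (assumption: not the real MNIST file) containing the four
# header values from the question as 4-byte big-endian integers.
tmp <- tempfile(fileext = ".idx3-ubyte")
out <- file(tmp, "wb")
writeBin(c(2051L, 60000L, 28L, 28L), out, size = 4, endian = "big")
close(out)

f <- file(tmp, "rb")

# Each call consumes 4 bytes and leaves the position just past them.
magic <- readBin(f, "integer", n = 1, size = 4, endian = "big")  # bytes 1-4
count <- readBin(f, "integer", n = 1, size = 4, endian = "big")  # bytes 5-8

# seek() with no 'where' reports the current byte offset of the connection.
pos_after_two <- seek(f)

# Rewind to the start, then read all four header fields in a single call.
seek(f, where = 0, origin = "start")
header <- readBin(f, "integer", n = 4, size = 4, endian = "big")

close(f)
unlink(tmp)

print(magic)
print(count)
print(pos_after_two)
print(header)
```

In the real file, the pixel data that follows the 16-byte header is stored as single unsigned bytes, so a fifth 4-byte integer read (the `0` in the question) combines four pixels into one number. Reading that part with `size = 1` and `signed = FALSE` would be the more natural choice.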
Tags: r binaryfiles mnist