iOS: How to read an audio file into a float buffer

【Question title】: iOS: How to read an audio file into a float buffer
【Posted】: 2011-09-24 06:45:50
【Question description】:

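I have a very short audio file, say a tenth of a second, in (say) .PCM format.

I want to use RemoteIO to loop through the file repeatedly to produce a continuous musical tone. So how do I read it into an array of floats?

EDIT: I could probably dig out the file format, load the file into an NSData and process it by hand, but I'm guessing there is a more sensible, generic approach (one that copes with different formats, for example).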

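【Question comments】:

Why is an NSData of the file not sufficient?
I'm guessing every audio file format has some header information. Otherwise how would it know the sample rate, data format, and so on?

Tags: ios file core-audio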

【Solution 1】:

You can use ExtAudioFile to read data from any supported data format in numerous client formats. Here is an example that reads a file as 16-bit integers:

CFURLRef url = /* ... */;
ExtAudioFileRef eaf = NULL;
OSStatus err = ExtAudioFileOpenURL(url, &eaf);
if (noErr != err) {
  /* handle error */
}

/* Client format: 44.1 kHz, stereo, interleaved, packed signed 16-bit integers */
AudioStreamBasicDescription format = {0};
format.mSampleRate = 44100;
format.mFormatID = kAudioFormatLinearPCM;
format.mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
format.mBitsPerChannel = 16;
format.mChannelsPerFrame = 2;
format.mBytesPerFrame = format.mChannelsPerFrame * 2;
format.mFramesPerPacket = 1;
format.mBytesPerPacket = format.mFramesPerPacket * format.mBytesPerFrame;

err = ExtAudioFileSetProperty(eaf, kExtAudioFileProperty_ClientDataFormat, sizeof(format), &format);
if (noErr != err) {
  /* handle error */
}

/* Read the file contents using ExtAudioFileRead */

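If you want Float32 data, you would set up format like this (the per-frame and per-packet byte counts have to change along with the sample size):

format.mFormatID = kAudioFormatLinearPCM;
format.mFormatFlags = kAudioFormatFlagsNativeFloatPacked;
format.mBitsPerChannel = 32;
format.mBytesPerFrame = format.mChannelsPerFrame * sizeof(Float32);
format.mBytesPerPacket = format.mFramesPerPacket * format.mBytesPerFrame;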

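【Comments】:

It looks like you found a great API for reading audio files, but part of the code confuses me. Specifically, can you set the format with ExtAudioFileSetProperty(...)? If you do, does it perform some kind of data conversion for you, since the WAV's format might differ from the one specified? Honestly, I'd like to know. I didn't know about Extended Audio File Services before you posted, so I have no experience with it.
ExtAudioFile is a wrapper around a combination of AudioFile and AudioConverter. Whatever client format is set will be used in calls to ExtAudioFileRead and ExtAudioFileWrite, with the internal AudioConverter converting between the file's data format and the client format (file format to client format on read, the reverse on write).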
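
To make the idea of a client-format conversion concrete, here is a minimal portable C sketch of one such conversion: signed 16-bit PCM samples turned into floats. It only illustrates the concept. The thread does not say what scaling AudioConverter applies internally, so dividing by 32768 is an assumed (though common) convention, and `pcm16_to_float` is a hypothetical helper, not part of Core Audio.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert signed 16-bit PCM samples to floats in the range [-1.0, 1.0).
 * Assumption: scale by 1/32768, so INT16_MIN maps to exactly -1.0f. */
static void pcm16_to_float(const int16_t *in, float *out, size_t count)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        out[i] = (float)in[i] / 32768.0f;
    }
}
```

With ExtAudioFile you would not write this loop yourself; setting a Float32 client format asks the library to do the equivalent work during ExtAudioFileRead.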

【Solution 2】:

This is the code I use to convert audio data (an audio file) into a floating-point representation and save it into an array.

// Requires: #import <AudioToolbox/AudioToolbox.h>
- (void)PrintFloatDataFromAudioFile {

    NSString *name = @"Filename";  // your file name
    NSString *source = [[NSBundle mainBundle] pathForResource:name ofType:@"m4a"]; // your file extension
    if (source == nil) {
        NSLog(@"Audio file not found in the bundle");
        return;
    }

    // 1) Open the file
    CFURLRef inputFileURL = (__bridge CFURLRef)[NSURL fileURLWithPath:source];
    ExtAudioFileRef fileRef = NULL;
    OSStatus err = ExtAudioFileOpenURL(inputFileURL, &fileRef);
    if (err != noErr) {
        NSLog(@"ExtAudioFileOpenURL failed: %d", (int)err);
        return;
    }

    // 2) Describe the client format: mono, packed, native-endian Float32
    AudioStreamBasicDescription audioFormat = {0};
    audioFormat.mSampleRate = 44100;   // your sampling rate
    audioFormat.mFormatID = kAudioFormatLinearPCM;
    audioFormat.mFormatFlags = kAudioFormatFlagsNativeFloatPacked;
    audioFormat.mBitsPerChannel = sizeof(Float32) * 8;
    audioFormat.mChannelsPerFrame = 1; // mono
    audioFormat.mBytesPerFrame = audioFormat.mChannelsPerFrame * sizeof(Float32);
    audioFormat.mFramesPerPacket = 1;
    audioFormat.mBytesPerPacket = audioFormat.mFramesPerPacket * audioFormat.mBytesPerFrame;

    // 3) Apply the audio format to the Extended Audio File
    err = ExtAudioFileSetProperty(fileRef,
                                  kExtAudioFileProperty_ClientDataFormat,
                                  sizeof(audioFormat),
                                  &audioFormat);
    if (err != noErr) {
        NSLog(@"Setting the client data format failed: %d", (int)err);
        ExtAudioFileDispose(fileRef);
        return;
    }

    // 4) Allocate the read buffer and the destination array on the heap.
    //    A 882000-element double array is about 7 MB, too large for the stack on iOS.
    const UInt32 numSamples = 1024; // frames to read per call
    const UInt32 outputBufferSize = numSamples * audioFormat.mBytesPerPacket; // 1024 * 4 bytes
    const size_t maxSamples = 882000; // capacity; must be >= the number of samples in the file
    float *outputBuffer = (float *)malloc(outputBufferSize);
    double *floatDataArray = (double *)malloc(maxSamples * sizeof(double));
    if (outputBuffer == NULL || floatDataArray == NULL) {
        free(outputBuffer);
        free(floatDataArray);
        ExtAudioFileDispose(fileRef);
        return;
    }

    AudioBufferList convertedData;
    convertedData.mNumberBuffers = 1;    // one interleaved buffer (mono here)
    convertedData.mBuffers[0].mNumberChannels = audioFormat.mChannelsPerFrame;
    convertedData.mBuffers[0].mData = outputBuffer;

    // 5) Read until ExtAudioFileRead reports 0 frames (end of file)
    size_t j = 0;
    UInt32 frameCount = numSamples;
    while (frameCount > 0 && j < maxSamples) {
        frameCount = numSamples;                                    // request a full buffer
        convertedData.mBuffers[0].mDataByteSize = outputBufferSize; // reset before each read
        err = ExtAudioFileRead(fileRef, &frameCount, &convertedData);
        if (err != noErr) {
            NSLog(@"ExtAudioFileRead failed: %d", (int)err);
            break;
        }
        // frameCount now holds the number of frames actually read
        for (UInt32 i = 0; i < frameCount && j < maxSamples; i++) {
            floatDataArray[j] = (double)outputBuffer[i]; // sample values range from -1 to +1
            printf("%f\n", floatDataArray[j]);
            j++;
        }
    }

    free(floatDataArray);
    free(outputBuffer);
    ExtAudioFileDispose(fileRef);
}

【Comments】:

Thanks for the snippet. I had to use malloc to initialize floatDataArray on iOS. Other than that, it works great.

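【Solution 3】:

I'm not familiar with RemoteIO, but I am familiar with WAVs and thought I'd post some format information on them. If you need to, you should be able to easily parse out information such as duration, bit rate, and so on.

First, here is a good website that details the WAVE PCM soundfile format. The site also does an excellent job of illustrating what the different byte addresses inside the “fmt” sub-chunk refer to.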

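WAVE file format

A WAVE file is composed of a “RIFF” chunk and subsequent sub-chunks
Every chunk is at least 8 bytes
The first 4 bytes are the chunk ID
The next 4 bytes are the chunk size (the chunk size gives the size of the remainder of the chunk, excluding the 8 bytes used for the chunk ID and chunk size)
Every WAVE has the following chunks/sub-chunks
- “RIFF” (the first and only chunk; all the rest are technically sub-chunks)
- “fmt ” (the 4-byte ID ends with a space; usually the first sub-chunk after “RIFF”, but it can be anywhere between “RIFF” and “data”; this chunk holds information about the WAV such as the number of channels, sample rate, and byte rate)
- “data” (normally the last sub-chunk; contains all of the sound data)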
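
The chunk layout described above can be sketched as a small portable C routine that walks the sub-chunks of a WAVE image already loaded into memory. This is an illustrative sketch rather than a complete parser: `read_le32` and `find_wave_chunk` are hypothetical helper names, and the code assumes the usual RIFF conventions of little-endian sizes, a 4-byte “WAVE” form type after the RIFF header, and a pad byte after odd-sized chunks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a little-endian 32-bit value, which is how RIFF stores chunk sizes. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Find the sub-chunk with the given 4-character ID.
 * Returns the offset of the chunk's payload and stores the payload size in
 * *size_out, or returns -1 if the buffer is not RIFF/WAVE or the chunk is absent.
 * The caller should still check that offset + size fits inside the buffer. */
static long find_wave_chunk(const uint8_t *buf, size_t len,
                            const char *id, uint32_t *size_out)
{
    assert(buf != NULL && id != NULL && size_out != NULL);

    /* 12 bytes: "RIFF", the RIFF chunk size, then the form type "WAVE". */
    if (len < 12 || memcmp(buf, "RIFF", 4) != 0 || memcmp(buf + 8, "WAVE", 4) != 0) {
        return -1;
    }

    size_t pos = 12;
    while (len - pos >= 8) {                 /* room for an ID and a size */
        uint32_t size = read_le32(buf + pos + 4);
        if (memcmp(buf + pos, id, 4) == 0) {
            *size_out = size;
            return (long)(pos + 8);          /* payload starts after the 8-byte header */
        }
        /* Skip the header and payload, plus one pad byte when the size is odd. */
        uint64_t advance = 8u + (uint64_t)size + (size & 1u);
        if (advance > (uint64_t)(len - pos)) {
            break;                           /* truncated file */
        }
        pos += (size_t)advance;
    }
    return -1;
}
```

For a canonical 44-byte header, `find_wave_chunk(buf, len, "fmt ", &size)` returns 20 and `find_wave_chunk(buf, len, "data", &size)` returns 44.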

Common WAVE audio formats:

PCM
IEEE_Float
PCM_EXTENSIBLE (with a sub-format of PCM or IEEE_FLOAT)

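WAVE duration and size

The duration of a WAVE file can be calculated as follows:

seconds = DataChunkSize / ByteRate

where

ByteRate = SampleRate * NumChannels * BitsPerSample/8

and DataChunkSize does not include the 8 bytes reserved for the ID and size of the “data” sub-chunk.

Knowing this, DataChunkSize can be calculated if you know the duration of the WAV and the ByteRate:

DataChunkSize = seconds * ByteRate

This can be useful for calculating the size of the WAV data when converting from formats like mp3 or wma. Note that a typical WAV header is 44 bytes, followed by DataChunkSize bytes of sound data (this is always the case if the WAV was converted using the Normalizer tool, at least as of this writing).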
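
The two formulas above can be written as a short C sketch. The function names are hypothetical, and the code assumes BitsPerSample is a multiple of 8 and that rounding the computed size to the nearest byte is acceptable.

```c
#include <assert.h>
#include <stdint.h>

/* ByteRate = SampleRate * NumChannels * BitsPerSample / 8 */
static uint32_t wav_byte_rate(uint32_t sample_rate, uint16_t num_channels,
                              uint16_t bits_per_sample)
{
    assert(bits_per_sample % 8 == 0);  /* assumption: whole bytes per sample */
    return sample_rate * num_channels * (bits_per_sample / 8u);
}

/* seconds = DataChunkSize / ByteRate */
static double wav_duration_seconds(uint32_t data_chunk_size, uint32_t byte_rate)
{
    assert(byte_rate > 0);
    return (double)data_chunk_size / (double)byte_rate;
}

/* DataChunkSize = seconds * ByteRate, rounded to the nearest byte */
static uint32_t wav_data_chunk_size(double seconds, uint32_t byte_rate)
{
    assert(seconds >= 0.0);
    return (uint32_t)(seconds * (double)byte_rate + 0.5);
}
```

For example, 16-bit stereo audio at 44100 Hz has a byte rate of 176400, so the tenth-of-a-second clip from the question would occupy 17640 data bytes, or 17684 bytes on disk with a typical 44-byte header.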

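【Comments】:

【Solution 4】:

Update for Swift 5

This is a simple function that helps get your audio file into an array of floats. It works for both mono and stereo audio; to get the second channel of stereo audio, just uncomment the samples2 lines.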


import AVFoundation

//..

do {
    guard let url = Bundle.main.url(forResource: "audio_example", withExtension: "wav") else { return }
    let file = try AVAudioFile(forReading: url)
    if let format = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: file.fileFormat.sampleRate, channels: file.fileFormat.channelCount, interleaved: false), let buf = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: AVAudioFrameCount(file.length)) {

        try file.read(into: buf)
        guard let floatChannelData = buf.floatChannelData else { return }
        let frameLength = Int(buf.frameLength)
        
        let samples = Array(UnsafeBufferPointer(start:floatChannelData[0], count:frameLength))
//        let samples2 = Array(UnsafeBufferPointer(start:floatChannelData[1], count:frameLength))
        
        print("samples")
        print(samples.count)
        print(samples.prefix(10))
//        print(samples2.prefix(10))
    }
} catch {
    print("Audio Error: \(error)")
}

【Comments】: