Audio samples mixing or changing volume causes saturation and white noise

【Question title】: Audio samples mixing or changing volume causes saturation and white noise
【Posted】: 2020-08-04 00:01:08
【Question】:

I have a multichannel input (I'm using Soundflower 64ch on a Mac) and I'm trying to mix 4 of the 64 channels into a stereo output.

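What I'm doing is reading chunks of 1024 frames, where each frame has 64 channels, and then converting the byte buffer to a short array (values between -32,768 and 32,767, since the samples are 16-bit).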

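This way I add, for example, channel1[sample] + channel2[sample], and I get the mix of the two channels. But here is the problem: the sum can overflow the short (16-bit) range, introducing saturation in the sound. So what I'm doing is (channel1[sample] + channel2[sample]) / 2, but when I divide by 2 I hear a lot of white noise.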

Also, if I try to reduce the volume of a channel by doing channel1[sample] * 0.5, there is a lot of saturation. Why is this happening?

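Here is my full code. Note that I convert the bytes to shorts so they are easier to handle, and then convert back to bytes to write the mix to the stereo output: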

public static void main(String[] args) throws LineUnavailableException {

    int inputChannels = 64;

    AudioFormat inputFormat = new AudioFormat(48000, 16, inputChannels, true, false);
    AudioFormat outputFormat = new AudioFormat(48000, 16, 2, true, false);

    TargetDataLine mic = AudioSystem.getTargetDataLine(inputFormat);
    SourceDataLine speaker = AudioSystem.getSourceDataLine(outputFormat);

    mic.open(inputFormat);
    speaker.open(outputFormat);
    mic.start();
    speaker.start();


    AudioInputStream audioInputStream = new AudioInputStream(mic);

    int bytesPerFrame = audioInputStream.getFormat().getFrameSize();

    // Set an arbitrary buffer size of 1024 frames.
    int CHUNK = 1024 ;
    int numBytes = CHUNK * bytesPerFrame;
    byte[] audioBytes = new byte[numBytes];

    try {
        byte[][] frames = new byte[CHUNK][bytesPerFrame];
        int i = 0, j = 0;
        while (true) {
            // read to audioBytes.
            audioInputStream.read(audioBytes);

            // split audioBytes in _CHUNK_ frames (1024 frames)
            for(j=0; j<CHUNK; j++) {
                frames[j] = Arrays.copyOfRange(audioBytes, j * bytesPerFrame, j * bytesPerFrame + bytesPerFrame);
            }

            // convert bytearray to shortarray
            short[][] shortFrames = new short[CHUNK][inputChannels];
            for(i=0; i < frames.length; i++) {
                ByteBuffer.wrap(frames[i]).order(ByteOrder.BIG_ENDIAN).asShortBuffer().get(shortFrames[i]);
            }

            short[] leftOutput = new short[CHUNK*2];
            short[] rightOutput = new short[CHUNK*2];

            for (i=0; i<CHUNK; i++) {
                short channel1 = shortFrames[i][0];
                short channel2 = shortFrames[i][1];
                short channel3 = shortFrames[i][2];
                short channel4 = shortFrames[i][3];

                leftOutput[i] = (short)(channel4);
                rightOutput[i] = (short)(channel4);
            }


            //convert shortarray in byte buffer
            ByteBuffer byteBuf = ByteBuffer.allocate(CHUNK * 2 * 2); // 2 bytes * 2 output channels
            for (i=0; i<CHUNK; i++) {

                byteBuf.putShort(leftOutput[i]);
                byteBuf.putShort(rightOutput[i]);
            }

            speaker.write(byteBuf.array(),0,byteBuf.array().length);

        }
    } catch (Exception ex) {
        // Handle the error...
        System.out.println("exception");
        System.out.println(ex.toString());
    }
}

【Comments】:

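I did not know these functions existed in ByteBuffer. I will have to play with them before I can answer definitively. I have always done the assembly by hand (taking the high byte, shifting it, and adding it to the low byte). As a guess, make sure that BIG_ENDIAN is correct for both the input and the output, and that you are actually doing this correctly on the output side. If playback requires LITTLE_ENDIAN, or if ByteBuf defaults to LITTLE_ENDIAN on your putShort but needs to be BIG_ENDIAN, that might account for what you are hearing.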
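Thanks for your answer. I tried both LITTLE and BIG: if I put LITTLE I only hear noise, and if I use the code above (BIG by default) I can perceive the sound, but with a lot of white noise on top of it. So I think the endianness is fine. Sorry, I'm pretty new to Java; I converted the bytes to shorts as an easier way to handle the values, but how would I work with the bytes directly?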
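
To make the byte-order suggestion concrete: the question's code declares both formats with bigEndian = false (the last argument of the AudioFormat constructor), i.e. little-endian, while ByteBuffer defaults to BIG_ENDIAN. A plain pass-through can still sound right in that situation, because the bytes are swapped on read and swapped back on write, but any arithmetic in between acts on byte-swapped values. The sketch below illustrates that effect with a made-up sample value. The class and method names are hypothetical, and whether this is the actual cause of the noise in the question is an assumption.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class; the sample value is made up for illustration.
final class EndianHalvingDemo {

    // Decode one 16-bit sample from two bytes using the given byte order.
    static short decode(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getShort();
    }

    // Encode one 16-bit sample into two bytes using the given byte order.
    static byte[] encode(short sample, ByteOrder order) {
        return ByteBuffer.allocate(2).order(order).putShort(sample).array();
    }

    // Halve a sample that travels as little-endian bytes, but decode and
    // re-encode it with `order`. Returns what a little-endian output hears.
    static short halveVia(short sample, ByteOrder order) {
        byte[] wire = encode(sample, ByteOrder.LITTLE_ENDIAN); // bytes from the input line
        short decoded = decode(wire, order);                   // possibly byte-swapped
        short halved = (short) (decoded / 2);
        byte[] out = encode(halved, order);
        return decode(out, ByteOrder.LITTLE_ENDIAN);           // bytes as the output line reads them
    }

    public static void main(String[] args) {
        short sample = 1000;
        System.out.println(halveVia(sample, ByteOrder.LITTLE_ENDIAN)); // prints 500
        System.out.println(halveVia(sample, ByteOrder.BIG_ENDIAN));    // prints 756
    }
}
```

With the matching order the result is the expected 500; with the mismatched order it is 756, which is unrelated to half the input. One way to avoid guessing is to derive the order from the format itself, e.g. format.isBigEndian() ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN, and to use that same order for both the read and the write conversion.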

Tags: java api audio signal-processing

【Solution 1】:

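I don't know whether the problem lies in how the bytes are converted to shorts and back, but since you asked about it in the comments, I'll post it. This assumes buffer holds consecutive little-endian bytes of 16-bit encoded samples. For big-endian, just reverse the byte indexes.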

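// The low byte is masked to undo sign extension; the cast is needed if pcmShort is declared as a short.
pcmShort = (short) (( buffer[i] & 0xff ) | ( buffer[i+1] << 8 ));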

The conversion from PCM to bytes that I use is the following (this is for little-endian; reverse the indexes for big-endian):

outBuffer[i] = (byte)pcmShort[0];
outBuffer[i+1] = (byte)((int)pcmShort[0] >> 8);
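
The two snippets above can be put together into a round trip. The following is only a sketch: the class name, method names, and the bigEndian flag are hypothetical and simply encode the "reverse the indexes" remark.

```java
// Hypothetical helper class illustrating the manual conversions above.
final class PcmBytes {

    // Read one signed 16-bit sample whose two bytes start at index i.
    static short readSample(byte[] buffer, int i, boolean bigEndian) {
        int lo = bigEndian ? buffer[i + 1] : buffer[i];
        int hi = bigEndian ? buffer[i] : buffer[i + 1];
        // Mask the low byte to undo sign extension; the high byte keeps its sign.
        return (short) ((lo & 0xff) | (hi << 8));
    }

    // Write one signed 16-bit sample into two bytes starting at index i.
    static void writeSample(short sample, byte[] out, int i, boolean bigEndian) {
        byte lo = (byte) sample;
        byte hi = (byte) (sample >> 8);
        out[i] = bigEndian ? hi : lo;
        out[i + 1] = bigEndian ? lo : hi;
    }

    public static void main(String[] args) {
        byte[] buf = new byte[2];
        writeSample((short) -2, buf, 0, false);        // little-endian: 0xFE, 0xFF
        System.out.println(readSample(buf, 0, false)); // prints -2
    }
}
```

Reading a sample back with the same flag it was written with should always return the original value; reading it with the opposite flag gives the byte-swapped value.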

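Perhaps you could run the two methods side by side on the same data (the one you are trying with ByteBuffer and getShort, and the one above) and check whether the resulting arrays hold the same values?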
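
A side-by-side check along those lines might look like this. It is a sketch with hypothetical names and hand-picked test bytes, not code from the question; the manual decoder assumes little-endian input as in the snippet above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical comparison of ByteBuffer decoding against the manual shift-and-mask decoding.
final class DecodeComparison {

    // Decode all 16-bit samples in the buffer through a ShortBuffer view.
    static short[] decodeWithByteBuffer(byte[] buffer, ByteOrder order) {
        short[] out = new short[buffer.length / 2];
        ByteBuffer.wrap(buffer).order(order).asShortBuffer().get(out);
        return out;
    }

    // Decode all 16-bit samples by hand, assuming little-endian byte pairs.
    static short[] decodeManually(byte[] buffer) {
        short[] out = new short[buffer.length / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (short) ((buffer[2 * i] & 0xff) | (buffer[2 * i + 1] << 8));
        }
        return out;
    }

    public static void main(String[] args) {
        // Little-endian encodings of 1, 32767 and -32768.
        byte[] data = {0x01, 0x00, (byte) 0xFF, 0x7F, 0x00, (byte) 0x80};
        short[] manual = decodeManually(data);
        System.out.println(Arrays.equals(manual, decodeWithByteBuffer(data, ByteOrder.LITTLE_ENDIAN))); // true
        System.out.println(Arrays.equals(manual, decodeWithByteBuffer(data, ByteOrder.BIG_ENDIAN)));    // false
    }
}
```

If the arrays only match for one of the two byte orders, that order is the one the data is actually stored in.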

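Another thing I would try is getting a single track to work. If that sounds fine, then check the mixing. It is unlikely that the signals are so hot that they overflow, so there is probably something else going on.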
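
Once a single track sounds clean, the mixing arithmetic itself could be checked with something like the sketch below. It assumes the samples have already been decoded with the correct byte order, and the class and method names are hypothetical. The sum and the gain are computed in a wider type and clamped back to the 16-bit range, so an over-range result clips instead of wrapping around.

```java
// Hypothetical helpers for mixing and scaling signed 16-bit PCM samples.
final class SampleMath {

    // Limit a value to the signed 16-bit range instead of letting it wrap.
    static short clamp(long value) {
        return (short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, value));
    }

    // Sum two samples; the addition happens in int, so it cannot overflow.
    static short mix(short a, short b) {
        return clamp(a + b);
    }

    // Average two samples; the result always fits in 16 bits.
    static short average(short a, short b) {
        return (short) ((a + b) / 2);
    }

    // Scale a sample by a gain factor, e.g. 0.5 to halve the amplitude.
    static short gain(short sample, double factor) {
        return clamp(Math.round(sample * factor));
    }

    public static void main(String[] args) {
        short a = 30000, b = 30000;
        System.out.println((short) (a + b));         // prints -5536: wrapped around
        System.out.println(mix(a, b));               // prints 32767: clipped
        System.out.println(average(a, b));           // prints 30000
        System.out.println(gain((short) 1000, 0.5)); // prints 500
    }
}
```

With correctly decoded samples, average(a, b) cannot leave the 16-bit range, so noise that appears after dividing by 2 suggests looking at the decoding step rather than at the arithmetic.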

I should try this myself, though I'm not sure when I'll get to it. It may be an improvement on what I have been doing.

【Comments】:

You can replace (int)pcmShort[0] >> 8 with pcmShort[0] >>> 8.
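
A quick way to convince yourself of this is sketched below (hypothetical class name). The two shifts differ as int values for negative samples, but the difference sits in the upper bits, which the cast to byte discards.

```java
// Hypothetical check that >> 8 and >>> 8 yield the same high byte after a cast to byte.
final class ShiftCheck {

    // Compare both shifts for every possible 16-bit sample value.
    static boolean highBytesAgree() {
        for (int v = Short.MIN_VALUE; v <= Short.MAX_VALUE; v++) {
            short s = (short) v;
            if ((byte) (s >> 8) != (byte) (s >>> 8)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        short s = -256;                       // 0xFF00
        System.out.println(s >> 8);           // prints -1
        System.out.println(s >>> 8);          // prints 16777215
        System.out.println(highBytesAgree()); // prints true
    }
}
```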