JTransforms: output frequency not accurate (answers)

【Question title】: JTransforms, output freq not accurate
【Posted】: 2014-09-18 08:46:22
【Question】:

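I'm working on an app for Google Glass that displays the current peak frequency of recorded audio in real time (ish). My current problem is that the reported frequency changes very rapidly, so it is hard to determine the frequency. I'm also not sure my NumberFormat output format is correct, since it only ever goes to "00.000". I may need some help with windowing, but my understanding of it is about there.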

Thanks!

public class RTAactivity extends Activity {

private static final int SAMPLING_RATE = 44100;

private TextView tvfreq;
private TextView tvdb;

private RecordingThread mRecordingThread;
private int mBufferSize;
private short[] mAudioBuffer;
private String mDecibelFormat;
private double  mFreqFormat = 0.0;
private int blockSize = 1024;  //4096
private DoubleFFT_1D fft;
private int[] bufferDouble, bufferDouble2;



@Override
protected void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    setContentView(R.layout.rta_view);
    getWindow().addFlags(WindowManager.LayoutParams.FLAG_KEEP_SCREEN_ON);

    tvfreq = (TextView) findViewById(R.id.tv_freq);
    tvdb = (TextView) findViewById(R.id.tv_decibels);

    // Compute the minimum required audio buffer size and allocate the buffer.
    mBufferSize = AudioRecord.getMinBufferSize(SAMPLING_RATE, AudioFormat.CHANNEL_IN_MONO,
            AudioFormat.ENCODING_PCM_16BIT);
    mAudioBuffer = new short[mBufferSize / 2];
    bufferDouble2 = new int[mBufferSize /2];
    bufferDouble = new int[(blockSize-1) * 2 ];

    mDecibelFormat = getResources().getString(R.string.decibel_format);
}

@Override
protected void onResume() {
    super.onResume();

    mRecordingThread = new RecordingThread();
    mRecordingThread.start();
}

@Override
protected void onPause() {
    super.onPause();

    if (mRecordingThread != null) {
        mRecordingThread.stopRunning();
        mRecordingThread = null;
    }
}
private class RecordingThread extends Thread{

    private boolean mShallContinue = true;

    @Override
    public void run() {
        android.os.Process.setThreadPriority(Process.THREAD_PRIORITY_AUDIO);

        AudioRecord record = new AudioRecord(AudioSource.MIC, SAMPLING_RATE, AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT, mBufferSize);

        short[] buffer = new short[blockSize];
        double[] audioDataDoubles = new double[(blockSize * 2)];
        double[] re = new double[blockSize];
        double[] im = new double[blockSize];
        double[] magnitude = new double[blockSize];

        //start collecting data
        record.startRecording();



        DoubleFFT_1D fft = new DoubleFFT_1D(blockSize);

        while (shallContinue()) {

            /**decibels */
            record.read(mAudioBuffer, 0, mBufferSize / 2);
            updateDecibelLevel();

            /**frequency */
                ///windowing!?
            for(int i=0;i<mAudioBuffer.length;i++) {
                bufferDouble2[i] = (int) mAudioBuffer[i];
            }

            for(int i=0;i<blockSize-1;i++){
                double x=-Math.PI+2*i*(Math.PI/blockSize);
                double winValue=(1+Math.cos(x))/2.0;
                bufferDouble[i]= (int) (bufferDouble2[i]*winValue); }

               // bufferDouble[2*i]=bufferDouble2[i];
               // bufferDouble[2*i+1] = (int) 0.0;}


            int bufferReadResult = record.read(buffer, 0, blockSize);

            // Read in the data from the mic to the array
            for (int i = 0; i < blockSize && i < bufferReadResult; i++) {
                audioDataDoubles[2 * i] = (double) buffer[i] / 32768.0; // signed 16 bit
                audioDataDoubles[(2 * i) + 1] = 0.0;
            }

        //audiodataDoubles now holds data to work with
        fft.complexForward(audioDataDoubles);   //complexForward


        // Calculate the Real and imaginary and Magnitude.

        for (int i = 0; i < blockSize; i++) {
            double real = audioDataDoubles[2 * i];
            double imag = audioDataDoubles[2 * i + 1];
            magnitude[i] = Math.sqrt((real * real) + (imag * imag));
        }
        for (int i = 0; i < blockSize; i++) {
            // real is stored in first part of array
            re[i] = audioDataDoubles[i * 2];
            // imaginary is stored in the sequential part
            im[i] = audioDataDoubles[(i * 2) + 1];
            // magnitude is calculated by the square root of (imaginary^2 + real^2)
            magnitude[i] = Math.sqrt((re[i] * re[i]) + (im[i] * im[i]));
        }

        double peak = -1.0;
        // Get the largest magnitude peak
        for (int i = 0; i < blockSize; i++) {
            peak = magnitude[i];
        }

        // calculated the frequency
        mFreqFormat = (SAMPLING_RATE * peak) / blockSize;
        updateFrequency();

    }

        record.stop();   //stop recording please.
        record.release();  // Destroy the recording, PLEASE!
    }

    /**true if the thread should continue running or false if it should stop
    */
    private synchronized boolean shallContinue() {return mShallContinue; }

    /** Notifies the thread that it should stop running at the next opportunity. */
    private synchronized void stopRunning() { mShallContinue = false; }


    private void updateDecibelLevel() {
        // Compute the root-mean-squared of the sound buffer and then apply the formula for
        // computing the decibel level, 20 * log_10(rms). This is an uncalibrated calculation
        // that assumes no noise in the samples; with 16-bit recording, it can range from
        // -90 dB to 0 dB.
        double sum = 0;

        for (short rawSample : mAudioBuffer) {
            double sample = rawSample / 32768.0;
            sum += sample * sample;
        }

        double rms = Math.sqrt(sum / mAudioBuffer.length);
        final double db = 20 * Math.log10(rms);

        // Update the text view on the main thread.
        tvdb.post(new Runnable() {
            @Override
            public void run() {
                tvdb.setText(String.format(mDecibelFormat, db));
            }
        });
    }

  }
           /// post the output frequency to TextView
private void updateFrequency() {
    tvfreq.post(new Runnable() {
        @Override
        public void run() {
            NumberFormat nM = NumberFormat.getNumberInstance();
            tvfreq.setText(nM.format(mFreqFormat) + " hz");
        }
    });


}

}

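【Comments】:

You need to review your code - for some reason you calculate the magnitude twice (harmless but pointless), but more importantly your peak-finding loop is completely broken.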

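Tags: android audio signal-processing fft

【Solution 1】:

There are several problems with your code, but the most important one is that your peak-finding loop is completely broken - change: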

    double peak = -1.0;
    // Get the largest magnitude peak
    for (int i = 0; i < blockSize; i++) {
        peak = magnitude[i];
    }

to:

    double peak_val = magnitude[0];   // init magnitude of peak
    double peak = 0;                  // init index of peak (declared here because the old declaration was replaced)
    for (int i = 1; i < blockSize; i++) {
        double val = magnitude[i];
        if (val > peak_val) {
            peak_val = val;           // update magnitude of peak
            peak = i;                 // update index of peak
        }
    }

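【Comments】:

Thanks Paul R! I must say I've become a big fan of yours while building this, and I've learned the most from your posts on SO. I've implemented your fix as described above and it seems to have solved my problem: I now read 440 Hz when playing it through my Klipsch reference speakers! I did notice one odd thing, though: the reading sometimes jumps to 43,281 Hz? Also, if you'd like to point out the other problems I've given myself, it would be much appreciated. Thanks again for your reply.
Sorry, the reading is actually about 10 Hz off: when I play 440 Hz I read 430 Hz. I'll double-check my calculations and try another set of speakers tomorrow. Any more ideas would be great. Thanks!
Glad it's at least partially working now. Note that the resolution of your FFT is only 44100/1024 = 43 Hz, so you will probably see the peak for 440 Hz in bin 10, which gives you an estimated frequency of 430 Hz. As for the other problems with the code, I already mentioned the redundant magnitude calculation in the comments, but I'll take another look later and see if there are any other problems.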
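
The arithmetic in the comment above can be checked with a small sketch. The class and method names here (`BinFrequency`, `binToFrequency`, `nearestBin`) are made up for illustration and are not part of JTransforms or the asker's code; the only assumption is that bin `k` of an `N`-point FFT is centred on `k * sampleRate / N`.

```java
final class BinFrequency {

    /** Centre frequency in Hz of bin {@code bin} for an FFT of length {@code fftSize}. */
    static double binToFrequency(int bin, int sampleRate, int fftSize) {
        // Cast first so the division is done in floating point.
        return (double) bin * sampleRate / fftSize;
    }

    /** Index of the bin whose centre frequency is nearest to {@code frequencyHz}. */
    static int nearestBin(double frequencyHz, int sampleRate, int fftSize) {
        return (int) Math.round(frequencyHz * fftSize / sampleRate);
    }

    public static void main(String[] args) {
        int sampleRate = 44100;
        // Compare the two FFT sizes mentioned in the thread for a 440 Hz tone.
        for (int fftSize : new int[] {1024, 4096}) {
            int bin = nearestBin(440.0, sampleRate, fftSize);
            System.out.println("N=" + fftSize
                    + " bin=" + bin
                    + " estimate=" + binToFrequency(bin, sampleRate, fftSize) + " Hz"
                    + " spacing=" + binToFrequency(1, sampleRate, fftSize) + " Hz");
        }
    }
}
```

With a 1024-point FFT the nearest bin to 440 Hz is bin 10, centred on roughly 430.7 Hz; with 4096 points it is bin 41, centred on roughly 441.4 Hz, with a bin spacing of about 10.8 Hz.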
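OK, I changed the FFT size to 4096, and at 44.1 kHz it seems accurate enough without changing the sample rate or buffer size. My last concern is the random spikes I get at around 43,000 Hz: whatever my test frequency is, it now reports the frequency to within about +/-1-4 Hz, then half a second later it spikes to the large frequency and goes back to the correct reading.
I also removed the first magnitude calculation, thanks for the reminder; I don't know how I missed that...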
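
The thread never resolves the spikes near 43,000 Hz, but one likely explanation is the symmetry of the spectrum. When real-valued samples are fed to a complex FFT, bin `k` and bin `N - k` have the same magnitude, so a peak search over all `N` bins can land on the mirror image: 43,281 Hz divided by the 43.07 Hz bin spacing is about bin 1005, the mirror of bin 19. The sketch below restricts the search to the lower half. `HalfSpectrumPeak` and `findPeakBin` are hypothetical names, and skipping bin 0 (DC) is an extra assumption rather than something the answers suggest.

```java
final class HalfSpectrumPeak {

    /**
     * Returns the index of the largest magnitude among bins 1 .. fftSize/2 - 1.
     * Bin 0 (DC) and the mirrored upper half of the spectrum are not searched.
     */
    static int findPeakBin(double[] magnitude, int fftSize) {
        int peakBin = 1;
        for (int i = 2; i < fftSize / 2; i++) {
            if (magnitude[i] > magnitude[peakBin]) {
                peakBin = i;
            }
        }
        return peakBin;
    }

    public static void main(String[] args) {
        int fftSize = 1024;
        int sampleRate = 44100;
        double[] magnitude = new double[fftSize];

        // Synthetic spectrum: a peak in bin 10 and its mirror image in bin 1014.
        // The mirror is made slightly larger to imitate floating-point rounding.
        magnitude[10] = 5.0;
        magnitude[fftSize - 10] = 5.0000001;

        int bin = findPeakBin(magnitude, fftSize);
        System.out.println("peak bin " + bin + " = "
                + ((double) bin * sampleRate / fftSize) + " Hz");
    }
}
```

In this synthetic case a search over all 1024 bins would pick bin 1014 (about 43,669 Hz), while the half-spectrum search returns bin 10.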

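【Solution 2】:

Added: if you use only the frequency of the peak magnitude bin of an FFT, the frequency resolution is set (quantized) to the sample rate divided by the length of the FFT (44100/1024 Hz with your parameters). For a short FFT, 430 Hz may be the FFT result bin closest to 440. To do better, you need to interpolate, use a longer FFT, or use some other frequency estimation algorithm.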
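
The answer does not say which interpolation to use. One common choice is parabolic (quadratic) interpolation through the peak bin and its two neighbours, sketched below under that assumption. `PeakInterpolation`, `interpolateOffset` and `refinedFrequency` are illustrative names only.

```java
final class PeakInterpolation {

    /**
     * Fits a parabola through three neighbouring magnitudes and returns the
     * position of its vertex relative to the centre bin, in bins. For a true
     * local maximum the result lies between -0.5 and +0.5.
     */
    static double interpolateOffset(double left, double centre, double right) {
        double denominator = left - 2.0 * centre + right;
        if (denominator == 0.0) {
            return 0.0; // flat neighbourhood: nothing to refine
        }
        return 0.5 * (left - right) / denominator;
    }

    /** Refined frequency estimate in Hz around {@code peakBin}. */
    static double refinedFrequency(double[] magnitude, int peakBin,
                                   int sampleRate, int fftSize) {
        double offset = 0.0;
        // Only interpolate when both neighbours exist.
        if (peakBin > 0 && peakBin < magnitude.length - 1) {
            offset = interpolateOffset(
                    magnitude[peakBin - 1], magnitude[peakBin], magnitude[peakBin + 1]);
        }
        return (peakBin + offset) * sampleRate / fftSize;
    }

    public static void main(String[] args) {
        // Synthetic magnitudes sampled from a parabola whose vertex is at bin 10.3.
        double[] magnitude = new double[1024];
        magnitude[9] = 3.31;
        magnitude[10] = 4.91;
        magnitude[11] = 4.51;
        System.out.println(refinedFrequency(magnitude, 10, 44100, 1024) + " Hz");
    }
}
```

A real FFT peak is not an exact parabola, so the refinement is approximate. Interpolating on log magnitudes, or applying a window before the FFT, usually changes how accurate it is.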

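If you are trying to display a pitch frequency (musical pitch or voice pitch), this is often different from the peak spectral frequency in an FFT result. Look up pitch detection/estimation methods (there are many academic papers on the topic), as this usually requires a more sophisticated and robust algorithm than computing the FFT magnitude peak.

【Comments】:

While this is true, and the OP probably needs to do more research, it doesn't address the immediate problem (the bug in the code), so it should really be a comment rather than an answer.
My goal is to find resonant frequencies. I'm an audio engineer, and setting up sound systems and ringing them out is something I do every day. I spent a lot of time on research to get this far; if you have any papers you could recommend to help me understand further, it would be much appreciated!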