【Question Title】: Confusion with data after scipy.fftconvolve
【Posted】: 2016-12-20 12:13:18
【Question Description】:

This question has probably been asked before, but I couldn't find my specific answer anywhere. So here it is: I am trying to find the time difference (lag) between two audio signals (Gaussian white noise) captured by two USB microphones. No problems there; I have recorded the audio from two different streams, saved the files, and read the data back into arrays. Next, I tried several different methods of finding the delay: an FFT followed by cross-correlation, scipy.fftconvolve, zero-padding, and seemingly countless other methods/strategies. As a test to confirm the delay I compute, I tried using that time delay to calculate the speed of sound. The microphones are spaced 0.6 m apart, and I should get a time delay of 1.728 ms (v = dist / time). I convolve the microphone recordings with the source signal, but I am not sure what to do with the peaks I get from fftconvolve. From those values I should be able to calculate my time difference? Any help you can provide would be great.

Reading the data:

samprat_1, data_1 = scipy.io.wavfile.read(sin_tone_1)   
samprat_2, data_2 = scipy.io.wavfile.read(sin_tone_2)
samrat_tone_played, data_tone_played = scipy.io.wavfile.read(sin_tone_played)

Convolution:

_1_ = data_1
_2_ = data_2
_tp_ = data_tone_played
convol_1_wsig = signal.fftconvolve(_1_, _tp_[::-1], "full")
convol_2_wsig = signal.fftconvolve(_2_, _tp_[::-1], "full")

Here is also a plot of the fftconvolve results:
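For context, the peak index of a "full"-mode fftconvolve against a time-reversed template maps to a lag in samples as argmax minus (template length - 1). Here is a minimal self-contained sketch of that bookkeeping with synthetic white noise standing in for the recordings; the 44.1 kHz sample rate and the 100/176-sample delays are assumptions for illustration, not values from the actual setup:

```python
import numpy as np
from scipy import signal

fs = 44100                                  # assumed sample rate, Hz
rng = np.random.default_rng(0)
tone = rng.standard_normal(fs)              # 1 s of white-noise "source"
delay_1, delay_2 = 100, 176                 # hypothetical delays, in samples
data_1 = np.concatenate([np.zeros(delay_1), tone])
data_2 = np.concatenate([np.zeros(delay_2), tone])

# fftconvolve against a time-reversed template is cross-correlation
convol_1_wsig = signal.fftconvolve(data_1, tone[::-1], "full")
convol_2_wsig = signal.fftconvolve(data_2, tone[::-1], "full")

# peak index minus (template length - 1) gives the lag in samples
lag_1 = np.argmax(convol_1_wsig) - (tone.size - 1)
lag_2 = np.argmax(convol_2_wsig) - (tone.size - 1)
dt = (lag_2 - lag_1) / fs                   # inter-microphone time delay, s
v = 0.6 / dt                                # implied speed of sound, m/s
```

With these assumed numbers, dt comes out to 76/44100 s, i.e. about 1.72 ms, and v lands near 348 m/s.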

【Question Comments】:

  • Just to make sure: if you look at a small portion of the signals (e.g., in a plot), can you see that one is a time-delayed version of the other? Since you are using two different physical microphones, the correspondence will not be exact; each microphone introduces its own random noise. So unless there is a very clear distinguishing feature in the signal (like loud and soft spots), there is no hope of finding the time delay in the first place.
  • There are several problems here: 1) you should be using "cross-correlation", not convolution (or is that what the [::-1] is for here - hard to tell?); 2) what are _1_ and _tp_; and so on. It looks like you are trying to do six steps in a row, looking at the result, and then guessing which step didn't work. Instead, you need to make sure the first step works before moving on to the second. For example, before doing the cross-correlation (or convolution), you should subtract the mean, but you will only catch that by paying attention to that step.
  • This question might be a better fit for dsp.stackexchange.com. Once you understand the mathematical algorithm, you may find it easy to implement. If you run into problems with the implementation, ask here.
  • @WarrenWeckesser, that's what I thought too (about dsp.stackexchange), but after working out a solution, there is one major code-related stumbling block, so I'm glad the OP came here first.
  • Added the signal-processing tag.

Tags: python scipy signal-processing fft


【Solution 1】:

Some of the more confusing examples from other websites, evaluated with the process summarized in the function cnn. The function rounded may be thought of as equivalent to the command round.


'''
    Convolution, using scipy.signal.fftconvolve.

    As this routine has various options, I shall check the results
    using information from various sources.
'''

import numpy as np
import scipy.signal as sg

def rounded(xv):
    # Round each element to the nearest integer (ties away from zero);
    # cleans up the small floating-point residue left by the FFT.
    sv = np.sign(xv)
    xv = xv * 1000
    xv = np.abs(xv)
    xv = xv + 500
    xv = xv.astype(int)
    xv = xv * sv
    xv = xv / 1000
    xv = xv.astype(int)
    return xv


def cnn(vol, volf):
    # Transpose the (column-entered) inputs, convolve with fftconvolve
    # in 'valid' mode, and round the result back to integers.
    vol = np.transpose(vol)
    volf = np.transpose(volf)
    volg = sg.fftconvolve(vol, volf, mode='valid')
    volg = rounded(volg)
    # volg = np.transpose(volg)
    return volg
    


# ------------------------------------------------------------------


# 10 10 10 0 0 0
# 10 10 10 0 0 0
# 10 10 10 0 0 0
# 10 10 10 0 0 0
# 10 10 10 0 0 0
# 10 10 10 0 0 0


# 1 0 -1
# 1 0 -1
# 1 0 -1

# 0 30 30 0
# 0 30 30 0
# 0 30 30 0
# 0 30 30 0

#



vol=[[10,10,10,10,10,10],[10,10,10,10,10,10],[10,10,10,10,10,10],[0,0,0,0,0,0],[0,0,0,0,0,0],[0,0,0,0,0,0]]
vol=np.transpose(vol)
print("  ")
print(vol)
print('   ')

volf=[[-1,-1,-1],[0,0,0],[1,1,1]]
volf=np.transpose(volf)
print(volf)
print('   ')


# mode = full valid  same

volg=sg.fftconvolve(vol,volf, mode = 'valid')
#volg=np.transpose(volg) # usually not required.

volg = rounded(volg)

print(volg)
# With the slightly altered kernel, the result is correct.
# should be
#[[0,30,30,0],[0,30,30,0],[0,30,30,0],[0,30,30,0]]

# original volf  =[[-1,-1,-1],[0,0,0],[1,1,1]] , prior to adjustment or translate.

# https://www.analyticsvidhya.com/blog/2018/12/guide-convolutional-neural-network-cnn/


# From same website .

# 3 0 1 2 7 4
# 1 5 8 9 3 1
# 2 7 2 5 1 3
# 0 1 3 1 7 8
# 4 2 1 6 2 8
# 2 4 5 2 3 9

# 1 0 -1
# 1 0 -1
# 1 0 -1

# result
# -5  -4  0  8 
# -10 -2  2  3
# 0   -2 -4 -7
# -3  -2 -3 -16


vol=[[3,1,2,0,4,2],[0,5,7,1,2,4],[1,8,2,3,1,5],[2,9,5,1,6,2],[7,3,1,7,2,3],[4,1,3,8,8,9]]
vol=np.transpose(vol)

print(" ....................... ")
print("  ")
print(vol)
print('   ')
volf=[[-1,-1,-1],[0,0,0],[1,1,1]]
volf=np.transpose(volf)
print(volf)
print('   ')


# mode = full valid  same

volg=sg.fftconvolve(vol,volf, mode = 'valid')


volg = rounded(volg)

print(volg)
# The result is correct; however, the kernel differs from the one at the website.

# quit()

# ---------------------------------------------------------------------------

# another example , from .
# https://insightsimaging.springeropen.com/articles/10.1007/s13244-018-0639-9

# 1 2 1 0 2
# 2 0 0 1 0
# 1 0 2 1 0
# 0 1 0 2 1
# 0 2 1 0 2

# 1 0 1
# 0 1 0
# 1 0 1

# result.
# 1 4 2 0 3
# 4 5 3 6 1
# 2 2 6 2 3
# 2 5 3 7 2
# 1 2 4 1 4


vol=[[0,0,0,0,0,0,0],[0,1,2,1,0,0,0],[0,2,0,0,1,2,0],[0,1,0,2,0,1,0],[0,0,1,1,2,0,0],[0,2,0,0,1,2,0],[0,0,0,0,0,0,0]]
vol=np.transpose(vol)
print("  ")
print(vol)
print('   ')
volf=[[1,0,1],[0,1,0],[1,0,1]]

volf=np.transpose(volf)
print(volf)
print('   ')

volg = sg.fftconvolve(vol,volf, mode = 'valid')

volg = rounded(volg)

print(volg)
print(" ..........<<  >>.......... ")

#  Same result after applying rounded .

#quit()

# without padding .

# 1 2 1 0 2
# 2 0 0 1 0
# 1 0 2 1 0
# 0 1 0 2 1
# 0 2 1 0 2     

# 1 0 1
# 0 1 0
# 1 0 1

#  result
# 5 3 6
# 2 6 2
# 5 3 7    




vol=[[1,2,1,0,0],[2,0,0,1,2],[1,0,2,0,1],[0,1,1,2,0],[2,0,0,1,2]]
vol=np.transpose(vol)
print("  ")
print(vol)
print('   ')
volf=[[1,0,1],[0,1,0],[1,0,1]]
volf=np.transpose(volf)
print(volf)
print('   ')

volg = sg.fftconvolve(vol,volf, mode = 'valid')


volg = rounded(volg)

print(volg)

#  Same result after applying rounded .



#quit()





# ----------------------------- valid ----------------------------------

#  Trying another example, from
# https://medium.com/analytics-vidhya/convolutional-neural-networks-cnn-explained-step-by-step-69137a54e5e7

# 1 1 1 0 0
# 0 1 1 1 0
# 0 0 1 1 1
# 0 0 1 1 0
# 0 1 1 0 0

# 1 0 1
# 0 1 0
# 1 0 1  

 
# 4 3 4
# 2 4 3 
# 2 3 4 
 

 
vol=[[1,1,1,0,0],[0,1,1,1,0],[0,0,1,1,1],[0,0,1,1,0],[0,1,1,0,0]]
vol=np.transpose(vol)
print("  ")
print(vol)
print('   ')
volf=[[1,0,1],[0,1,0],[1,0,1]]
volf=np.transpose(volf)
print(volf)
print('   ')

volg = sg.fftconvolve(vol,volf, mode = 'valid')
volg=np.transpose(volg) # Notice syntax for this procedure . 
volg = rounded(volg)
print(volg)
#  This is correct; however, having to take the transpose of volg is
# unexpected and differs from most of the other examples.

#quit()


# ....................................................



# Yet another example, using previous method .

# https://www.deeplearningwizard.com/deep_learning/practical_pytorch/pytorch_convolutional_neuralnetwork/

# 0 0 0 0 0 0 0
# 0 2 2 0 0 0 0
# 0 0 0 2 2 2 0
# 0 2 0 0 2 0 0
# 0 0 0 0 2 0 0
# 0 0 2 2 0 0 0
# 0 0 0 0 0 0 0

# 1 1 1
# 0 0 0
# 0 0 0  

 
# 0 0 0 0 0 
# 4 4 2 0 0  
# 0 2 4 6 4 
# 2 2 2 2 2
# 0 0 2 2 2

vol=[[0,0,0,0,0,0,0],[0,2,0,2,0,0,0],[0,2,0,0,0,2,0],[0,0,2,0,0,2,0],[0,0,2,2,2,0,0],[0,0,2,0,0,0,0],[0,0,0,0,0,0,0]]
vol=np.transpose(vol)
print("  ")
print(vol)
print('   ')
volf=[[0,0,1],[0,0,1],[0,0,1]]
volf=np.transpose(volf)
print(volf)
print('   ')

volg = sg.fftconvolve(vol,volf, mode = 'valid')
 
volg = rounded(volg)
print(volg)
#  This is correct; however, volf is reversed.


#quit()
# .....................................................

# https://victorzhou.com/blog/intro-to-cnns-part-1/


# 0  50 0  29
# 0  80 31 2
# 33 90 0  75
# 0  9  0  95

# -1 0 1
# -2 0 2
# -1 0 1

#  29 -192
# -35 -22

vol=[[0,0,33,0],[50,80,90,9],[0,31,0,0],[29,2,75,95]]
vol=np.transpose(vol)
print(" ...............<<<<< ")
print(vol)
print('   ')
volf=[[1,2,1],[0,0,0],[-1,-2,-1]]
volf=np.transpose(volf)
print(volf)
print('   ')
volg = sg.fftconvolve(vol,volf, mode = 'valid')

volg = rounded(volg)
print(volg.astype(int))
#  This is correct; however, volf's top row is swapped with its bottom row.
# --------------------------------------------------------
#
#  To summarize, the function cnn is mostly correct for the examples
# that I found on the web, when one enters the data values in the
# format shown.
#
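The kernel swaps noted throughout come down to one fact: true convolution flips the kernel along both axes, while the CNN articles actually compute cross-correlation. A quick check (not part of the original script) using the second example's data shows scipy.signal.correlate2d matching fftconvolve with a doubly flipped kernel:

```python
import numpy as np
from scipy import signal as sg

# Grid and kernel from the analyticsvidhya example, in row order as shown
img = np.array([[3, 0, 1, 2, 7, 4],
                [1, 5, 8, 9, 3, 1],
                [2, 7, 2, 5, 1, 3],
                [0, 1, 3, 1, 7, 8],
                [4, 2, 1, 6, 2, 8],
                [2, 4, 5, 2, 3, 9]], dtype=float)
kern = np.array([[1, 0, -1],
                 [1, 0, -1],
                 [1, 0, -1]], dtype=float)

# CNN-style "convolution" is really cross-correlation ...
xcorr = sg.correlate2d(img, kern, mode='valid')
# ... which equals true convolution with the kernel flipped on both axes.
conv = np.round(sg.fftconvolve(img, kern[::-1, ::-1], mode='valid'))
```

Both computations reproduce the website's result matrix, which is why the kernels above had to be "adjusted" before being fed to fftconvolve.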

【Discussion】:

    【Solution 2】:

    In a case like this, it's best to first make sure your algorithm works on perfect data before trying it on real data. And it turns out that Scipy, silly thing, doesn't provide an easy way to get the lags corresponding to the correlation, even once you understand the algorithm.

    Here's my little script: I generate two signal vectors of unequal length, both containing a specific signal (a linear chirp) at known delays (0.1 and 0.2 seconds respectively). I then use both scipy.signal.correlate and scipy.signal.fftconvolve to make sure I can recover the correct delay between the two signals. It comes down to generating the lag vector (i.e., the vector of times corresponding to the correlation indexes) correctly, and knowing how to interpret it.

    import numpy as np
    import scipy.signal as signal
    import numpy.fft as fft
    try:
        import pylab
    except ImportError:
        print("Can't import pylab, do you have matplotlib installed?")
    
    ## Generator for the ideal underlying signal
    # Parameters for a linear chirp that lasts 0.5 seconds, over which the frequency
    # is swept from -10 Hz to 5 Hz.
    signalLength = 0.5 # seconds
    fStart = -10.0 # Hz
    fEnd = +5.0 # Hz
    fRate = (fEnd - fStart) / signalLength
    # `x` is a function that evaluates this underlying signal for any given time
    # vector `t`.
    x = lambda t: (np.sin(2 * np.pi * (fStart + fRate * 0.5 * t) * t) *
                   np.logical_and(t >= 0, t<signalLength))
    
    ## Generate some test data
    fs = 100.0 # Sample rate, Hz
    
    # First signal: lasts 2 seconds, sampled at `fs` Hz
    Tend = 2.0 # seconds
    t1 = np.arange(0, Tend, 1/fs)
    y1 = x(t1 - 0.1)
    # Plot?
    try:
        pylab.figure()
        pylab.subplot(211)
        pylab.plot(t1, y1)
    except NameError:
        pass
    
    # Second signal: lasts 1 second, also sampled at `fs` Hz
    t2 = np.arange(0, Tend/2, 1/fs)
    y2 = x(t2 - 0.2)
    try:
        pylab.plot(t2, y2)
        pylab.legend(['y1', 'y2'])
    except NameError:
        pass
    
    ## Correlate
    # Evaluate the correlation
    z = signal.correlate(y1, y2)
    # And this is crucial: the vector of lags to which the above `z`
    # corresponds.
    tz = (np.arange(z.size) - (y2.size - 1)) / fs
    # Here's how to evaluate the relative delay between y1 and y2
    print('y1 is y2 delayed by {} seconds'.format(tz[np.argmax(np.abs(z))]))
    try:
        pylab.subplot(212)
        pylab.plot(tz, z)
    except NameError:
        pass
    
    ## Correlate with flipped y1-vs-y2 to make sure it still works
    z = signal.correlate(y2, y1)
    # Note that now we subtract `y1.size - 1`, because `y1` is the second
    # argument to `correlate`
    tz = (np.arange(z.size) - (y1.size - 1)) / fs
    print('y2 is y1 delayed by {} seconds'.format(tz[np.argmax(np.abs(z))]))
    try:
        pylab.subplot(212)
        pylab.plot(tz, z)
        pylab.legend(['correlate(y1,y2)', 'correlate(y2,y1)'])
    except NameError:
        pass
    

    This produces (see the plot below):

    y1 is y2 delayed by -0.1 seconds
    y2 is y1 delayed by 0.1 seconds
    

    Hooray! OK! So we know: if you correlate(y1, y2), then to generate the lag vector tz, you subtract the length of y2. And we know that in this case the maximum-magnitude correlation happens at a lag that means 'by how much y2 must be delayed to get y1'.
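    A tiny impulse-based sanity check of that convention (not in the original script): put unit impulses at known offsets and confirm the recovered lag.

```python
import numpy as np
import scipy.signal as signal

fs = 100.0
y1 = np.zeros(50); y1[30] = 1.0   # impulse 0.30 s into y1
y2 = np.zeros(20); y2[10] = 1.0   # impulse 0.10 s into y2

z = signal.correlate(y1, y2)
tz = (np.arange(z.size) - (y2.size - 1)) / fs
lag = tz[np.argmax(np.abs(z))]    # +0.2 s: y1 is y2 delayed by 0.2 s
```

    The recovered lag is 0.30 - 0.10 = +0.2 s, matching the sign convention above.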

    Let's wrap this into some functions for general reuse. The function doesn't know the sample rate, so it just returns integer lags, but we can divide by the sample rate outside.

    ## Wrap this in a function
    def correlateWithLags(y1, y2, *args, **kwargs):
        "Returns `scipy.signal.correlate` output as well as a vector of lags"
        z = signal.correlate(y1, y2, *args, **kwargs)
        lags = np.arange(z.size) - (y2.size - 1)
        return (z, lags)
    
    # Make sure all works as above
    (z, nz) = correlateWithLags(y1, y2)
    tz = nz / fs
    print('y1 is y2 delayed by {} seconds: `correlateWithLags`'.format(tz[np.argmax(np.abs(z))]))
    

    This produces:

    y1 is y2 delayed by -0.1 seconds: `correlateWithLags`
    

    Still works!

    Now the fun part: let's swap scipy.signal.correlate for scipy.signal.fftconvolve, which can greatly reduce runtime for long signals. Correlation is the same as convolution, except that you have to time-reverse one of the signals; once that little detail is dealt with, the results should be exactly the same, and the mechanism for producing the correct delay is the same as above.

    ## Use `fftconvolve` instead of `correlate` for fast-convolution
    def fftCorrelateWithLags(y1, y2, *args, **kwargs):
        "Equivalent to correlateWithLags but uses `scipy.signal.fftconvolve`"
        # NOTA BENE: reverse `y2`! And if complex, need `np.conj(y2)` too!
        z = signal.fftconvolve(y1, y2[::-1], *args, **kwargs)
        lags = np.arange(z.size) - (y2.size - 1)
        return (z, lags)
    
    # Make sure it still works
    (z, nz) = fftCorrelateWithLags(y1, y2)
    tz = nz / fs
    print('y1 is y2 delayed by {} seconds: `fftCorrelateWithLags`'.format(tz[np.argmax(np.abs(z))]))
    

    Still works:

    y1 is y2 delayed by -0.1 seconds: `fftCorrelateWithLags`
    

    Scipy's correlate really should come with a function that returns the vector of lags corresponding to the correlation. For that reason I think it's OK to post this on StackOverflow rather than http://dsp.stackexchange.com, because even if you understand the algorithm, you still have to deal with the code cruft.
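    As it happens, later SciPy releases (1.6.0 and newer) added exactly such a helper, scipy.signal.correlation_lags; if your SciPy is new enough, it reproduces the hand-built lag vector above:

```python
import numpy as np
import scipy.signal as signal

y1 = np.random.default_rng(0).standard_normal(200)
y2 = y1[10:110]                   # y2 starts 10 samples into y1

z = signal.correlate(y1, y2)
# Requires scipy >= 1.6; for 'full' mode this equals
# np.arange(z.size) - (y2.size - 1)
lags = signal.correlation_lags(y1.size, y2.size, mode='full')
lag = lags[np.argmax(np.abs(z))]  # +10: y1 is y2 delayed by 10 samples
```

    Divide the returned integer lag by the sample rate to get seconds, exactly as in the functions above.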

    Postscript: here's the full code in a Gist. And here's the generated image, showing the two signals (top) and the two ways of doing the correlation (correlate(y1, y2) and correlate(y2, y1), bottom).

    【Discussion】:

    • Thank you very much for the answer, very helpful. So I worked on my code, and it reads artificial data I made, delayed by the number of frames I added (76 for this test). What should I think about first when I want to transition to real recorded data? What filtering should I implement? When I ran this experiment with the microphones, playing a tone from a speaker, I got different numbers on every run, with some huge outliers. Any advice is much appreciated - kyle
    • In your question you said you are trying to find the distance between two microphones. In your final application, will you still have control over the audio the two microphones hear? I mean - if you know you will be broadcasting a tone, then by all means you should correlate against a tone (or whatever signal you broadcast) - that's how radar works, and it works very well! The longer the tone, the more accurate and better-resolved your delay estimate will be. (Or you can use a short signal if it has a lot of bandwidth, such as a chirp or a square wave like BPSK.)
    • But if you can't control the audio the two microphones will pick up, and the distance estimate has to be done with ambient audio, hmm. I don't know how to do that.
    • You may see spurious spikes in the correlation between the two audio channels because sound bounces off the walls. What people sometimes do is look for the first (earliest) peak in the correlation above some threshold and say that's their direct path (speaker to microphone), while all the other peaks are multipath reflections.
    • Ahmed Fasih, in the final application I will have control over the tone at the source. Ultimately I am trying to use the delay between microphones (possibly more than two) to compute the angle of arrival of the source. I thought this speed-of-sound test would be a good start. The white noise produced a nice peak in the correlated data; should I use a square wave and run a longer test?
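    The first-peak-above-threshold idea from the comments can be sketched as follows; the 0.5 relative threshold, the helper name, and the toy data are all assumptions for illustration:

```python
import numpy as np

def first_peak_lag(z, lags, rel_threshold=0.5):
    # Earliest local maximum of |z| above rel_threshold * max|z|:
    # a crude direct-path pick when later multipath echoes are stronger.
    mag = np.abs(z)
    thresh = rel_threshold * mag.max()
    for i in range(1, len(mag) - 1):
        if mag[i] >= thresh and mag[i - 1] <= mag[i] > mag[i + 1]:
            return lags[i]
    return lags[np.argmax(mag)]       # fallback: global peak

# Toy correlation: direct path at lag 5, stronger wall echo at lag 12
lags = np.arange(-3, 20)
z = np.zeros(lags.size)
z[lags == 5] = 0.8
z[lags == 12] = 1.0
direct = first_peak_lag(z, lags)      # picks 5, not the global argmax at 12
```

    A plain argmax would report the echo at lag 12; scanning for the earliest qualifying peak recovers the direct path instead.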