【Question Title】: Confusion with data after scipy.fftconvolve
【Posted】: 2016-12-20 12:13:18
【Question Description】:

This question has probably been asked before, but I couldn't find my specific answer anywhere. So here it is: I am trying to find the time difference (lag) between two audio signals (Gaussian white noise) captured by two USB microphones. No problems there; I have recorded the audio from two different streams, saved the files, and read the data back into arrays. Next, I tried several different methods of finding the delay: an FFT followed by cross-correlation, scipy.fftconvolve, zero-padding, and seemingly countless other methods/strategies. As a test to confirm the delay I compute, I tried using that time delay to calculate the speed of sound. The microphones are spaced 0.6 m apart, and I should get a time delay of 1.728 ms (v = dist / time). I convolve the microphone recordings with the source signal, but I am not sure what to do with the peaks I get from fftconvolve. From those values I should be able to calculate my time difference? Any help you can provide would be great.

Reading the data:

samprat_1, data_1 = scipy.io.wavfile.read(sin_tone_1)   
samprat_2, data_2 = scipy.io.wavfile.read(sin_tone_2)
samrat_tone_played, data_tone_played = scipy.io.wavfile.read(sin_tone_played)

Convolution:

_1_ = data_1
_2_ = data_2
_tp_ = data_tone_played
convol_1_wsig = signal.fftconvolve(_1_, _tp_[::-1], "full")
convol_2_wsig = signal.fftconvolve(_2_, _tp_[::-1], "full")

Here is also a plot of the fftconvolve results:
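For context, the peak index of a "full"-mode fftconvolve against a time-reversed template maps to a lag in samples as argmax minus (template length - 1). Here is a minimal self-contained sketch of that bookkeeping with synthetic white noise standing in for the recordings; the 44.1 kHz sample rate and the 100/176-sample delays are assumptions for illustration, not values from the actual setup:

```python
import numpy as np
from scipy import signal

fs = 44100                                  # assumed sample rate, Hz
rng = np.random.default_rng(0)
tone = rng.standard_normal(fs)              # 1 s of white-noise "source"
delay_1, delay_2 = 100, 176                 # hypothetical delays, in samples
data_1 = np.concatenate([np.zeros(delay_1), tone])
data_2 = np.concatenate([np.zeros(delay_2), tone])

# fftconvolve against a time-reversed template is cross-correlation
convol_1_wsig = signal.fftconvolve(data_1, tone[::-1], "full")
convol_2_wsig = signal.fftconvolve(data_2, tone[::-1], "full")

# peak index minus (template length - 1) gives the lag in samples
lag_1 = np.argmax(convol_1_wsig) - (tone.size - 1)
lag_2 = np.argmax(convol_2_wsig) - (tone.size - 1)
dt = (lag_2 - lag_1) / fs                   # inter-microphone time delay, s
v = 0.6 / dt                                # implied speed of sound, m/s
```

With these assumed numbers, dt comes out to 76/44100 s, i.e. about 1.72 ms, and v lands near 348 m/s.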

【Question Comments】:

  • Just to make sure: if you look at a small portion of the signals (e.g., in a plot), can you see that one is a time-delayed version of the other? Since you are using two different physical microphones, the correspondence will not be exact; each microphone introduces its own random noise. So unless there is a very clear distinguishing feature in the signal (like loud and soft spots), there is no hope of finding the time delay in the first place.
  • There are several problems here: 1) you should be using "cross-correlation", not convolution (or is that what the [::-1] is for here - hard to tell?); 2) what are _1_ and _tp_; and so on. It looks like you are trying to do six steps in a row, looking at the result, and then guessing which step didn't work. Instead, you need to make sure the first step works before moving on to the second. For example, before doing the cross-correlation (or convolution), you should subtract the mean, but you will only catch that by paying attention to that step.
  • This question might be a better fit for dsp.stackexchange.com. Once you understand the mathematical algorithm, you may find it easy to implement. If you run into problems with the implementation, ask here.
  • @WarrenWeckesser, that's what I thought too (about dsp.stackexchange), but after working out a solution, there is one major code-related stumbling block, so I'm glad the OP came here first.
  • Added the signal-processing tag.

Tags: python scipy signal-processing fft


【Solution 1】:

Some of the more confusing examples from other websites, evaluated with the process summarized in the function cnn. The function rounded may be thought of as equivalent to the command round.


'''
    Convolution, using scipy.signal.fftconvolve.

    As this routine has various options, I shall check the results
    using information from various sources.
'''

import numpy as np
import scipy.signal as sg

def rounded(xv):
    # Round each element to the nearest integer (ties away from zero);
    # cleans up the small floating-point residue left by the FFT.
    sv = np.sign(xv)
    xv = xv * 1000
    xv = np.abs(xv)
    xv = xv + 500
    xv = xv.astype(int)
    xv = xv * sv
    xv = xv / 1000
    xv = xv.astype(int)
    return xv


def cnn(vol, volf):
    # Transpose the (column-entered) inputs, convolve with fftconvolve
    # in 'valid' mode, and round the result back to integers.
    vol = np.transpose(vol)
    volf = np.transpose(volf)
    volg = sg.fftconvolve(vol, volf, mode='valid')
    volg = rounded(volg)
    # volg = np.transpose(volg)
    return volg
    


# ------------------------------------------------------------------


# 10 10 10 0 0 0
# 10 10 10 0 0 0
# 10 10 10 0 0 0
# 10 10 10 0 0 0
# 10 10 10 0 0 0
# 10 10 10 0 0 0


# 1 0 -1
# 1 0 -1
# 1 0 -1

# 0 30 30 0
# 0 30 30 0
# 0 30 30 0
# 0 30 30 0

#



vol=[[10,10,10,10,10,10],[10,10,10,10,10,10],[10,10,10,10,10,10],[0,0,0,0,0,0],[0,0,0,0,0,0],[0,0,0,0,0,0]]
vol=np.transpose(vol)
print("  ")
print(vol)
print('   ')

volf=[[-1,-1,-1],[0,0,0],[1,1,1]]
volf=np.transpose(volf)
print(volf)
print('   ')


# mode = full valid  same

volg=sg.fftconvolve(vol,volf, mode = 'valid')
#volg=np.transpose(volg) # usually not required.

volg = rounded(volg)

print(volg)
# With the slightly altered kernel, the result is correct.
# should be
#[[0,30,30,0],[0,30,30,0],[0,30,30,0],[0,30,30,0]]

# original volf  =[[-1,-1,-1],[0,0,0],[1,1,1]] , prior to adjustment or translate.

# https://www.analyticsvidhya.com/blog/2018/12/guide-convolutional-neural-network-cnn/


# From same website .

# 3 0 1 2 7 4
# 1 5 8 9 3 1
# 2 7 2 5 1 3
# 0 1 3 1 7 8
# 4 2 1 6 2 8
# 2 4 5 2 3 9

# 1 0 -1
# 1 0 -1
# 1 0 -1

# result
# -5  -4  0  8 
# -10 -2  2  3
# 0   -2 -4 -7
# -3  -2 -3 -16


vol=[[3,1,2,0,4,2],[0,5,7,1,2,4],[1,8,2,3,1,5],[2,9,5,1,6,2],[7,3,1,7,2,3],[4,1,3,8,8,9]]
vol=np.transpose(vol)

print(" ....................... ")
print("  ")
print(vol)
print('   ')
volf=[[-1,-1,-1],[0,0,0],[1,1,1]]
volf=np.transpose(volf)
print(volf)
print('   ')


# mode = full valid  same

volg=sg.fftconvolve(vol,volf, mode = 'valid')


volg = rounded(volg)

print(volg)
# The result is correct; however, the kernel differs from the one at the website.

# quit()

# ---------------------------------------------------------------------------

# another example , from .
# https://insightsimaging.springeropen.com/articles/10.1007/s13244-018-0639-9

# 1 2 1 0 2
# 2 0 0 1 0
# 1 0 2 1 0
# 0 1 0 2 1
# 0 2 1 0 2

# 1 0 1
# 0 1 0
# 1 0 1

# result.
# 1 4 2 0 3
# 4 5 3 6 1
# 2 2 6 2 3
# 2 5 3 7 2
# 1 2 4 1 4


vol=[[0,0,0,0,0,0,0],[0,1,2,1,0,0,0],[0,2,0,0,1,2,0],[0,1,0,2,0,1,0],[0,0,1,1,2,0,0],[0,2,0,0,1,2,0],[0,0,0,0,0,0,0]]
vol=np.transpose(vol)
print("  ")
print(vol)
print('   ')
volf=[[1,0,1],[0,1,0],[1,0,1]]

volf=np.transpose(volf)
print(volf)
print('   ')

volg = sg.fftconvolve(vol,volf, mode = 'valid')

volg = rounded(volg)

print(volg)
print(" ..........<<  >>.......... ")

#  Same result after applying rounded .

#quit()

# without padding .

# 1 2 1 0 2
# 2 0 0 1 0
# 1 0 2 1 0
# 0 1 0 2 1
# 0 2 1 0 2     

# 1 0 1
# 0 1 0
# 1 0 1

#  result
# 5 3 6
# 2 6 2
# 5 3 7    




vol=[[1,2,1,0,0],[2,0,0,1,2],[1,0,2,0,1],[0,1,1,2,0],[2,0,0,1,2]]
vol=np.transpose(vol)
print("  ")
print(vol)
print('   ')
volf=[[1,0,1],[0,1,0],[1,0,1]]
volf=np.transpose(volf)
print(volf)
print('   ')

volg = sg.fftconvolve(vol,volf, mode = 'valid')


volg = rounded(volg)

print(volg)

#  Same result after applying rounded .



#quit()





# ----------------------------- valid ----------------------------------

#  Trying another example, from
# https://medium.com/analytics-vidhya/convolutional-neural-networks-cnn-explained-step-by-step-69137a54e5e7

# 1 1 1 0 0
# 0 1 1 1 0
# 0 0 1 1 1
# 0 0 1 1 0
# 0 1 1 0 0

# 1 0 1
# 0 1 0
# 1 0 1  

 
# 4 3 4
# 2 4 3 
# 2 3 4 
 

 
vol=[[1,1,1,0,0],[0,1,1,1,0],[0,0,1,1,1],[0,0,1,1,0],[0,1,1,0,0]]
vol=np.transpose(vol)
print("  ")
print(vol)
print('   ')
volf=[[1,0,1],[0,1,0],[1,0,1]]
volf=np.transpose(volf)
print(volf)
print('   ')

volg = sg.fftconvolve(vol,volf, mode = 'valid')
volg=np.transpose(volg) # Notice syntax for this procedure . 
volg = rounded(volg)
print(volg)
#  This is correct; however, having to take the transpose of volg is
# unexpected and differs from most of the other examples.

#quit()


# ....................................................



# Yet another example, using previous method .

# https://www.deeplearningwizard.com/deep_learning/practical_pytorch/pytorch_convolutional_neuralnetwork/

# 0 0 0 0 0 0 0
# 0 2 2 0 0 0 0
# 0 0 0 2 2 2 0
# 0 2 0 0 2 0 0
# 0 0 0 0 2 0 0
# 0 0 2 2 0 0 0
# 0 0 0 0 0 0 0

# 1 1 1
# 0 0 0
# 0 0 0  

 
# 0 0 0 0 0 
# 4 4 2 0 0  
# 0 2 4 6 4 
# 2 2 2 2 2
# 0 0 2 2 2

vol=[[0,0,0,0,0,0,0],[0,2,0,2,0,0,0],[0,2,0,0,0,2,0],[0,0,2,0,0,2,0],[0,0,2,2,2,0,0],[0,0,2,0,0,0,0],[0,0,0,0,0,0,0]]
vol=np.transpose(vol)
print("  ")
print(vol)
print('   ')
volf=[[0,0,1],[0,0,1],[0,0,1]]
volf=np.transpose(volf)
print(volf)
print('   ')

volg = sg.fftconvolve(vol,volf, mode = 'valid')
 
volg = rounded(volg)
print(volg)
#  This is correct; however, volf is reversed.


#quit()
# .....................................................

# https://victorzhou.com/blog/intro-to-cnns-part-1/


# 0  50 0  29
# 0  80 31 2
# 33 90 0  75
# 0  9  0  95

# -1 0 1
# -2 0 2
# -1 0 1

#  29 -192
# -35 -22

vol=[[0,0,33,0],[50,80,90,9],[0,31,0,0],[29,2,75,95]]
vol=np.transpose(vol)
print(" ...............<<<<< ")
print(vol)
print('   ')
volf=[[1,2,1],[0,0,0],[-1,-2,-1]]
volf=np.transpose(volf)
print(volf)
print('   ')
volg = sg.fftconvolve(vol,volf, mode = 'valid')

volg = rounded(volg)
print(volg.astype(int))
#  This is correct; however, volf's top row is swapped with its bottom row.
# --------------------------------------------------------
#
#  To summarize, the function cnn is mostly correct for the examples
# that I found on the web, when one enters the data values in the
# format shown.
#
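The kernel swaps noted throughout come down to one fact: true convolution flips the kernel along both axes, while the CNN articles actually compute cross-correlation. A quick check (not part of the original script) using the second example's data shows scipy.signal.correlate2d matching fftconvolve with a doubly flipped kernel:

```python
import numpy as np
from scipy import signal as sg

# Grid and kernel from the analyticsvidhya example, in row order as shown
img = np.array([[3, 0, 1, 2, 7, 4],
                [1, 5, 8, 9, 3, 1],
                [2, 7, 2, 5, 1, 3],
                [0, 1, 3, 1, 7, 8],
                [4, 2, 1, 6, 2, 8],
                [2, 4, 5, 2, 3, 9]], dtype=float)
kern = np.array([[1, 0, -1],
                 [1, 0, -1],
                 [1, 0, -1]], dtype=float)

# CNN-style "convolution" is really cross-correlation ...
xcorr = sg.correlate2d(img, kern, mode='valid')
# ... which equals true convolution with the kernel flipped on both axes.
conv = np.round(sg.fftconvolve(img, kern[::-1, ::-1], mode='valid'))
```

Both computations reproduce the website's result matrix, which is why the kernels above had to be "adjusted" before being fed to fftconvolve.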

【Discussion】:

    【Solution 2】:

    In a case like this, it's best to first make sure your algorithm works on perfect data before trying it on real data. And it turns out that Scipy, silly thing, doesn't provide an easy way to get the lags corresponding to the correlation, even once you understand the algorithm.

    Here's my little script: I generate two signal vectors of unequal length, both containing a specific signal (a linear chirp) at known delays (0.1 and 0.2 seconds respectively). I then use both scipy.signal.correlate and scipy.signal.fftconvolve to make sure I can recover the correct delay between the two signals. It comes down to generating the lag vector (i.e., the vector of times corresponding to the correlation indexes) correctly, and knowing how to interpret it.

    import numpy as np
    import scipy.signal as signal
    import numpy.fft as fft
    try:
        import pylab
    except ImportError:
        print("Can't import pylab, do you have matplotlib installed?")
    
    ## Generator for the ideal underlying signal
    # Parameters for a linear chirp that lasts 0.5 seconds, over which the frequency
    # is swept from -10 Hz to 5 Hz.
    signalLength = 0.5 # seconds
    fStart = -10.0 # Hz
    fEnd = +5.0 # Hz
    fRate = (fEnd - fStart) / signalLength
    # `x` is a function that evaluates this underlying signal for any given time
    # vector `t`.
    x = lambda t: (np.sin(2 * np.pi * (fStart + fRate * 0.5 * t) * t) *
                   np.logical_and(t >= 0, t<signalLength))
    
    ## Generate some test data
    fs = 100.0 # Sample rate, Hz
    
    # First signal: lasts 2 seconds, sampled at `fs` Hz
    Tend = 2.0 # seconds
    t1 = np.arange(0, Tend, 1/fs)
    y1 = x(t1 - 0.1)
    # Plot?
    try:
        pylab.figure()
        pylab.subplot(211)
        pylab.plot(t1, y1)
    except NameError:
        pass
    
    # Second signal: lasts 1 second, also sampled at `fs` Hz
    t2 = np.arange(0, Tend/2, 1/fs)
    y2 = x(t2 - 0.2)
    try:
        pylab.plot(t2, y2)
        pylab.legend(['y1', 'y2'])
    except NameError:
        pass
    
    ## Correlate
    # Evaluate the correlation
    z = signal.correlate(y1, y2)
    # And this is crucial: the vector of lags to which the above `z`
    # corresponds.
    tz = (np.arange(z.size) - (y2.size - 1)) / fs
    # Here's how to evaluate the relative delay between y1 and y2
    print('y1 is y2 delayed by {} seconds'.format(tz[np.argmax(np.abs(z))]))
    try:
        pylab.subplot(212)
        pylab.plot(tz, z)
    except NameError:
        pass
    
    ## Correlate with flipped y1-vs-y2 to make sure it still works
    z = signal.correlate(y2, y1)
    # Note that now we subtract `y1.size - 1`, because `y1` is the second
    # argument to `correlate`
    tz = (np.arange(z.size) - (y1.size - 1)) / fs
    print('y2 is y1 delayed by {} seconds'.format(tz[np.argmax(np.abs(z))]))
    try:
        pylab.subplot(212)
        pylab.plot(tz, z)
        pylab.legend(['correlate(y1,y2)', 'correlate(y2,y1)'])
    except NameError:
        pass
    

    This produces (see the plot below):

    y1 is y2 delayed by -0.1 seconds
    y2 is y1 delayed by 0.1 seconds
    

    Hooray! OK! So we know: if you correlate(y1, y2), then to generate the lag vector tz, you subtract the length of y2. And we know that in this case the maximum-magnitude correlation happens at a lag that means 'by how much y2 must be delayed to get y1'.
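    A tiny impulse-based sanity check of that convention (not in the original script): put unit impulses at known offsets and confirm the recovered lag.

```python
import numpy as np
import scipy.signal as signal

fs = 100.0
y1 = np.zeros(50); y1[30] = 1.0   # impulse 0.30 s into y1
y2 = np.zeros(20); y2[10] = 1.0   # impulse 0.10 s into y2

z = signal.correlate(y1, y2)
tz = (np.arange(z.size) - (y2.size - 1)) / fs
lag = tz[np.argmax(np.abs(z))]    # +0.2 s: y1 is y2 delayed by 0.2 s
```

    The recovered lag is 0.30 - 0.10 = +0.2 s, matching the sign convention above.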

    Let's wrap this into some functions for general reuse. The function doesn't know the sample rate, so it just returns integer lags, but we can divide by the sample rate outside.

    ## Wrap this in a function
    def correlateWithLags(y1, y2, *args, **kwargs):
        "Returns `scipy.signal.correlate` output as well as a vector of lags"
        z = signal.correlate(y1, y2, *args, **kwargs)
        lags = np.arange(z.size) - (y2.size - 1)
        return (z, lags)
    
    # Make sure all works as above
    (z, nz) = correlateWithLags(y1, y2)
    tz = nz / fs
    print('y1 is y2 delayed by {} seconds: `correlateWithLags`'.format(tz[np.argmax(np.abs(z))]))
    

    This produces:

    y1 is y2 delayed by -0.1 seconds: `correlateWithLags`
    

    Still works!

    Now the fun part: let's swap scipy.signal.correlate for scipy.signal.fftconvolve, which can greatly reduce runtime for long signals. Correlation is the same as convolution, except that you have to time-reverse one of the signals; once that little detail is dealt with, the results should be exactly the same, and the mechanism for producing the correct delay is the same as above.

    ## Use `fftconvolve` instead of `correlate` for fast-convolution
    def fftCorrelateWithLags(y1, y2, *args, **kwargs):
        "Equivalent to correlateWithLags but uses `scipy.signal.fftconvolve`"
        # NOTA BENE: reverse `y2`! And if complex, need `np.conj(y2)` too!
        z = signal.fftconvolve(y1, y2[::-1], *args, **kwargs)
        lags = np.arange(z.size) - (y2.size - 1)
        return (z, lags)
    
    # Make sure it still works
    (z, nz) = fftCorrelateWithLags(y1, y2)
    tz = nz / fs
    print('y1 is y2 delayed by {} seconds: `fftCorrelateWithLags`'.format(tz[np.argmax(np.abs(z))]))
    

    Still works:

    y1 is y2 delayed by -0.1 seconds: `fftCorrelateWithLags`
    

    Scipy's correlate really should come with a function that returns the vector of lags corresponding to the correlation. For that reason I think it's OK to post this on StackOverflow rather than http://dsp.stackexchange.com, because even if you understand the algorithm, you still have to deal with the code cruft.
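    As it happens, later SciPy releases (1.6.0 and newer) added exactly such a helper, scipy.signal.correlation_lags; if your SciPy is new enough, it reproduces the hand-built lag vector above:

```python
import numpy as np
import scipy.signal as signal

y1 = np.random.default_rng(0).standard_normal(200)
y2 = y1[10:110]                   # y2 starts 10 samples into y1

z = signal.correlate(y1, y2)
# Requires scipy >= 1.6; for 'full' mode this equals
# np.arange(z.size) - (y2.size - 1)
lags = signal.correlation_lags(y1.size, y2.size, mode='full')
lag = lags[np.argmax(np.abs(z))]  # +10: y1 is y2 delayed by 10 samples
```

    Divide the returned integer lag by the sample rate to get seconds, exactly as in the functions above.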

    Postscript: here's the full code in a Gist. And here's the generated image, showing the two signals (top) and the two ways of doing the correlation (correlate(y1, y2) and correlate(y2, y1), bottom).

    【Discussion】:

    • Thank you very much for the answer, very helpful. So I worked on my code, and it reads artificial data I made, delayed by the number of frames I added (76 for this test). What should I think about first when I want to transition to real recorded data? What filtering should I implement? When I ran this experiment with the microphones, playing a tone from a speaker, I got different numbers on every run, with some huge outliers. Any advice is much appreciated - kyle
    • In your question you said you are trying to find the distance between two microphones. In your final application, will you still have control over the audio the two microphones hear? I mean - if you know you will be broadcasting a tone, then by all means you should correlate against a tone (or whatever signal you broadcast) - that's how radar works, and it works very well! The longer the tone, the more accurate and better-resolved your delay estimate will be. (Or you can use a short signal if it has a lot of bandwidth, such as a chirp or a square wave like BPSK.)
    • But if you can't control the audio the two microphones will pick up, and the distance estimate has to be done with ambient audio, hmm. I don't know how to do that.
    • You may see spurious spikes in the correlation between the two audio channels because sound bounces off the walls. What people sometimes do is look for the first (earliest) peak in the correlation above some threshold and say that's their direct path (speaker to microphone), while all the other peaks are multipath reflections.
    • Ahmed Fasih, in the final application I will have control over the tone at the source. Ultimately I am trying to use the delay between microphones (possibly more than two) to compute the angle of arrival of the source. I thought this speed-of-sound test would be a good start. The white noise produced a nice peak in the correlated data; should I use a square wave and run a longer test?
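    The first-peak-above-threshold idea from the comments can be sketched as follows; the 0.5 relative threshold, the helper name, and the toy data are all assumptions for illustration:

```python
import numpy as np

def first_peak_lag(z, lags, rel_threshold=0.5):
    # Earliest local maximum of |z| above rel_threshold * max|z|:
    # a crude direct-path pick when later multipath echoes are stronger.
    mag = np.abs(z)
    thresh = rel_threshold * mag.max()
    for i in range(1, len(mag) - 1):
        if mag[i] >= thresh and mag[i - 1] <= mag[i] > mag[i + 1]:
            return lags[i]
    return lags[np.argmax(mag)]       # fallback: global peak

# Toy correlation: direct path at lag 5, stronger wall echo at lag 12
lags = np.arange(-3, 20)
z = np.zeros(lags.size)
z[lags == 5] = 0.8
z[lags == 12] = 1.0
direct = first_peak_lag(z, lags)      # picks 5, not the global argmax at 12
```

    A plain argmax would report the echo at lag 12; scanning for the earliest qualifying peak recovers the direct path instead.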