How to create a TargetDataLine using a binary array WebSocket? (Answers)

【Question title】: How to create a TargetDataLine using a binary array WebSocket?
【Posted】: 2015-10-13 20:56:12
【Question】:

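I created a byte-array WebSocket that receives audio chunks in real time from the client's microphone (navigator.getUserMedia). I already record this stream to a WAV file on the server once the WebSocket has stopped receiving new byte arrays for some time. The following code shows the current situation.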

WebSocket

@OnMessage
public void message(byte[] b) throws IOException{
    // Create the buffer lazily on the first message, then append every chunk.
    if (byteOutputStream == null) {
        byteOutputStream = new ByteArrayOutputStream();
    }
    byteOutputStream.write(b);
}

Thread that stores the WAV file

public void store(){
    byte b[] = byteOutputStream.toByteArray();
    try {
        AudioFormat audioFormat = new AudioFormat(44100, 16, 1, true, true);
        ByteArrayInputStream byteStream = new ByteArrayInputStream(b);
        // The third argument is a length in sample frames, not in bytes.
        AudioInputStream audioStream = new AudioInputStream(byteStream, audioFormat, b.length / audioFormat.getFrameSize());
        DateTime date = new DateTime();
        File file = new File("/tmp/"+date.getMillis()+ ".wav");
        AudioSystem.write(audioStream, AudioFileFormat.Type.WAVE, file);
        audioStream.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
}

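However, my goal with this WebSocket is not to record a WAV file but to process the audio in real time, using the YIN pitch detection algorithm implemented in the TarsosDSP library. In other words, this basically means running PitchDetectorExample with data from the WebSocket instead of the default audio device (the OS mic). The following code shows how PitchDetectorExample currently initializes real-time audio processing using the mic line provided by the operating system.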

private void setNewMixer(Mixer mixer) throws LineUnavailableException, UnsupportedAudioFileException {      
    if(dispatcher!= null){
        dispatcher.stop();
    }
    currentMixer = mixer;
    float sampleRate = 44100;
    int bufferSize = 1024;
    int overlap = 0;
    final AudioFormat format = new AudioFormat(sampleRate, 16, 1, true, true);
    final DataLine.Info dataLineInfo = new DataLine.Info(TargetDataLine.class, format);
    TargetDataLine line;
    line = (TargetDataLine) mixer.getLine(dataLineInfo);
    final int numberOfSamples = bufferSize;
    line.open(format, numberOfSamples);
    line.start();
    final AudioInputStream stream = new AudioInputStream(line);
    JVMAudioInputStream audioStream = new JVMAudioInputStream(stream);
    // create a new dispatcher
    dispatcher = new AudioDispatcher(audioStream, bufferSize, overlap);
    // add a processor
    dispatcher.addAudioProcessor(new PitchProcessor(algo, sampleRate, bufferSize, this));
    new Thread(dispatcher,"Audio dispatching").start();
}

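Is there a way to treat the WebSocket data as a TargetDataLine, so that it can be hooked up to the AudioDispatcher and PitchProcessor? Somehow I need to send the byte arrays received from the WebSocket to the audio processing thread.

Other ideas on how to achieve this are welcome. Thanks!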
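
One possible way to get WebSocket chunks onto the audio processing thread without a real TargetDataLine is to queue them and expose the queue as a blocking InputStream, which can then be wrapped in an AudioInputStream of unspecified length. The sketch below uses only the JDK; the class ChunkQueueInputStream and its methods push, finish and asAudioStream are hypothetical names made up for illustration, and the audio format is assumed to be known in advance. In principle the resulting AudioInputStream could be passed to JVMAudioInputStream the same way the line-backed stream is in setNewMixer above, but that combination has not been verified here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;

/** Hypothetical helper: byte chunks pushed by one thread are read as a stream by another. */
class ChunkQueueInputStream extends InputStream {
    // Sentinel compared by identity; push() never enqueues this instance.
    private static final byte[] END = new byte[0];

    private final BlockingQueue<byte[]> queue = new LinkedBlockingQueue<>();
    private byte[] current = new byte[0];
    private int pos = 0;
    private boolean finished = false;

    /** Called from the WebSocket handler for every binary message. */
    void push(byte[] chunk) {
        if (chunk.length > 0) {
            queue.add(chunk.clone());
        }
    }

    /** Called when the WebSocket closes, so the reader sees end of stream. */
    void finish() {
        queue.add(END);
    }

    @Override
    public int read() throws IOException {
        byte[] one = new byte[1];
        int n = read(one, 0, 1);
        return n == -1 ? -1 : one[0] & 0xFF;
    }

    @Override
    public int read(byte[] dst, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        // Block until a chunk with unread bytes is available or the stream ends.
        while (pos == current.length) {
            if (finished) {
                return -1;
            }
            try {
                byte[] next = queue.take();
                if (next == END) {
                    finished = true;
                    return -1;
                }
                current = next;
                pos = 0;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IOException("interrupted while waiting for audio", e);
            }
        }
        int n = Math.min(len, current.length - pos);
        System.arraycopy(current, pos, dst, off, n);
        pos += n;
        return n;
    }

    /** Wraps this stream as audio of unknown total length. */
    AudioInputStream asAudioStream(AudioFormat format) {
        return new AudioInputStream(this, format, AudioSystem.NOT_SPECIFIED);
    }
}
```

With this shape, the @OnMessage method would only call push(b), and the reading thread simply blocks while no audio is arriving. The queue is unbounded here, so a slow consumer would let memory grow; a bounded queue is an obvious variation.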

【Question comments】:

Tags: java websocket bytearray audio-processing

【Answer 1】:

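I'm not sure you need the AudioDispatcher at all. If you know how the bytes are encoded (PCM, 16 bits, little-endian, mono?), you can convert them to floats in real time and feed them to the pitch detection algorithm. In your WebSocket you could do something like this (forget the input stream and the audio dispatcher):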

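 int index;
 byte[] buffer = new byte[2048];
 float[] floatBuffer = new float[1024];
 FastYin detector = new FastYin(44100, 1024);
 // 44.1 kHz, 16-bit, mono, little-endian; "signed" is an assumption, the original left the remaining arguments unspecified
 AudioFloatConverter converter = AudioFloatConverter.getConverter(new AudioFormat(44100, 16, 1, true, false));

 public void message(byte[] b) {
   for (int i = 0; i < b.length; i++) {
     buffer[index] = b[i];
     index++;
     // 2048 bytes = 1024 16-bit samples, one full analysis window
     if (index == 2048) {
       // converts the byte buffer to float
       converter.toFloatArray(buffer, floatBuffer);
       // depending on the TarsosDSP version, getPitch may return a PitchDetectionResult rather than a float
       float pitch = detector.getPitch(floatBuffer);
       // here you have your pitch info that you can use
       index = 0;
     }
   }
 }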

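You do need to keep track of how many bytes have passed: with 16-bit PCM encoding two bytes make up one sample (one float after conversion), so every sample has to start on an even byte offset. Endianness and sample rate matter as well.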
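
To illustrate the alignment point, here is a minimal plain-Java sketch of decoding signed 16-bit little-endian mono PCM that arrives in chunks of arbitrary size. The class name Pcm16LeDecoder is hypothetical, the signed little-endian layout is an assumption about what the client sends, and dividing by 32768 is just one common scaling convention; a library converter may scale slightly differently.

```java
/** Hypothetical helper: decodes signed 16-bit little-endian mono PCM split across chunks. */
final class Pcm16LeDecoder {
    // Low byte (0..255) of a sample whose high byte has not arrived yet, or -1 if none.
    private int pendingLowByte = -1;

    /** Returns the samples completed by this chunk, scaled to the range [-1, 1). */
    float[] decode(byte[] chunk) {
        int total = chunk.length + (pendingLowByte >= 0 ? 1 : 0);
        float[] out = new float[total / 2];
        int o = 0;
        int i = 0;

        // Finish the sample that was cut in half by the previous chunk boundary.
        if (pendingLowByte >= 0 && chunk.length > 0) {
            out[o++] = toFloat(pendingLowByte, chunk[i++]);
            pendingLowByte = -1;
        }

        // Decode all complete byte pairs in this chunk.
        for (; i + 1 < chunk.length; i += 2) {
            out[o++] = toFloat(chunk[i] & 0xFF, chunk[i + 1]);
        }

        // Keep a trailing odd byte for the next call instead of dropping it.
        if (i < chunk.length) {
            pendingLowByte = chunk[i] & 0xFF;
        }
        return out;
    }

    private static float toFloat(int low, byte high) {
        // Little-endian: the second byte carries the sign and the upper 8 bits.
        short sample = (short) ((high << 8) | low);
        return sample / 32768f;
    }
}
```

If a chunk ends in the middle of a sample, the leftover byte is carried into the next call, so the stream never drifts onto odd offsets. Swapping the roles of the two bytes would give the big-endian variant.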

Regards,

Joren

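【Comments】:

Apart from the fact that I forgot to express my gratitude a few months ago: your suggestion saved my project, and I was able to go on and extract all the information I needed from the audio signal. One quick piece of feedback: I had to use TarsosDSPAudioFloatConverter instead of the AudioFloatConverter you suggested, and the detector.getPitch method returns a PitchDetectionResult object rather than a float. I'm not sure whether that is due to the version of the TarsosDSP lib I'm using (2.3), but it worked either way. Thanks a lot, Joren, and congratulations on creating this amazing DSP solution! :)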