I want to do voice recognition.

hyuk · Aug 25, 2023

Hi,

I want to trigger some actions by voice recognition during a live stream.
For example, I want to detect a specific spoken word.
Does Flashphoner support anything like voice recognition?
If there is no such function, I would like to use the stream's audio data from Flashphoner directly.

With plain WebRTC, for example:

var pc = new RTCPeerConnection();
pc.ontrack = function(event) {
    // Receive voice data.
    var audioTrack = event.track;
};

I want to acquire an audioTrack like this.
Where can I get it?

Max · Aug 25, 2023

Good day.
On the server side, you can implement a Java class that receives the audio tracks in PCM format and redirects them to a third-party voice recognition tool: Server audio processing
On the client side, you can access the audio tracks via the video tag:

Code:

    stream = session.createStream(options).on(STREAM_STATUS.PENDING, function (stream) {
        ...
    }).on(STREAM_STATUS.PLAYING, function (stream) {
        let video = document.getElementById(stream.id());
        let audioTracks = video.audioTracks;
        // Then you can handle the audioTracks received
        doSomethingWithAudioTracks(audioTracks);
        ...
    }).on(STREAM_STATUS.STOPPED, function () {
        ...
    });

However, you can't read audio data from an AudioTrack directly. Use a MediaElementAudioSourceNode to do this:

Code:

let ac = new AudioContext();
// The video element must be reachable from this scope, so keep a
// reference to it outside the PLAYING handler.
let s = new MediaElementAudioSourceNode(ac, {mediaElement: video});
// 8192 is just an example buffer size. You could choose other values
let spn = ac.createScriptProcessor(8192, <number of channels>, 0);
spn.onaudioprocess = (e) => {
   let input = e.inputBuffer;
   // Iterate over channels, not samples: input.length is the frame count
   for (let k = 0; k < input.numberOfChannels; ++k) {
      let channel = input.getChannelData(k);
      // channel is a Float32Array with the samples of one channel.
      // Process it as you see fit.
   }
}

s.connect(spn).connect(ac.destination);
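
The samples you get from getChannelData() are 32-bit floats, nominally in the range -1 to 1. If the recognition tool you pass them to expects signed 16-bit PCM (this is an assumption, check what your tool requires), a conversion along these lines may help. This is only a minimal sketch; floatTo16BitPCM is a hypothetical helper name, not part of Flashphoner or the Web Audio API.

```javascript
// Hypothetical helper: converts Float32 samples to signed 16-bit PCM.
// Assumes the consumer wants Int16 samples; adjust if yours differs.
function floatTo16BitPCM(samples) {
    const pcm = new Int16Array(samples.length);
    for (let i = 0; i < samples.length; i++) {
        // Clamp, since decoded audio can slightly exceed [-1, 1]
        const s = Math.max(-1, Math.min(1, samples[i]));
        // The negative Int16 range is one step wider than the positive one
        pcm[i] = s < 0 ? s * 0x8000 : s * 0x7FFF;
    }
    return pcm;
}
```

Inside onaudioprocess you could then call floatTo16BitPCM(channel) and hand the result to whatever recognizer you use.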
