The Near-LATTE (Lock And Track Type Engine) variant for AVR-listening within the Yobe SDK performs Automatic Voice Recognition (AVR) on the voice of a pre-enrolled user. It separates that user's voice from surrounding noise and human crosstalk, determines when the user is silent, and mutes the entire signal during those intervals. The software can also enroll a user on the basis of 10-20 seconds of unscripted speech. The Voice Template is used to determine the intervals in which the pre-enrolled user is not talking; during those intervals, the audio signal is muted. Typical use cases for the Near-LATTE variant for AVR-listening are a registered user engaged in a two-way conversation who needs the signal entirely muted whenever they are not speaking, and a solution that must capture or record only the authorized speaker at all times, even when other people are speaking and other sources of noise are present.
The Near-LATTE variant for AVR-listening has the following capabilities:
Note: Only one user can be enrolled at a time.
Place the provided libraries and header files in a location that can be discovered by your application's build system.
LATTE's main functionality is accessed via the Yobe::IDListener class.
Yobe::Create::NewIDListener is used to obtain a shared pointer to a new Yobe::IDListener instance. The instance must then be initialized using the license provided by Yobe, as well as two configuration arguments: the Microphone Orientation and the Output Buffer Type.
A Yobe::BiometricTemplate must be created using the desired user's voice so the IDListener can select that template and identify the voice.
Register the user by inputting their voice audio data using Yobe::IDListener::RegisterTemplate. This is done using a continuous array of audio samples.
It is recommended to first process the audio data so that only speech is present in the audio; this will yield better identification results. To achieve this, the IDListener can be placed into Enrollment Mode. In this mode, the audio that should be used to create a BiometricTemplate is processed, buffer-by-buffer, using Yobe::IDListener::ProcessBuffer. ProcessBuffer returns a status of Yobe::Status::ENROLLING for as long as buffers are being processed in Enrollment Mode. Enrollment Mode is started by calling Yobe::IDListener::StartEnrollment. It stops either when Yobe::IDListener::StopEnrollment is called manually or automatically, once an internal counter has seen enough buffers (currently the equivalent of 20 seconds of audio).
Enrollment Mode can be started at any point after initial calibration. Any samples processed while in Enrollment Mode will not be matched for identification to a selected template, if there is one.
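The Enrollment Mode flow described above can be sketched as a small loop. This is only an illustration under stated assumptions: MockIDListener and MockStatus are hypothetical stand-ins written for this sketch, not the SDK's types, and the 16 kHz sample rate and 1600-sample buffers used in the usage note are arbitrary. Only the method names and the 20-second automatic stop come from the text above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins, NOT the Yobe SDK types: they only mimic the
// behaviour described above so the control flow compiles on its own.
enum class MockStatus { OK, ENROLLING };

class MockIDListener {
public:
    explicit MockIDListener(std::size_t sample_rate_hz)
        : auto_stop_samples_(20 * sample_rate_hz) {}  // 20 s, per the text above

    void StartEnrollment() {
        enrolling_ = true;
        enrolled_samples_ = 0;
    }

    void StopEnrollment() { enrolling_ = false; }

    // Reports ENROLLING for every buffer consumed while in Enrollment Mode.
    MockStatus ProcessBuffer(const std::vector<double>& input_buffer) {
        if (!enrolling_) return MockStatus::OK;
        enrolled_samples_ += input_buffer.size();
        if (enrolled_samples_ >= auto_stop_samples_) enrolling_ = false;  // automatic stop
        return MockStatus::ENROLLING;
    }

private:
    std::size_t auto_stop_samples_;
    std::size_t enrolled_samples_ = 0;
    bool enrolling_ = false;
};

// Feeds buffers until the listener leaves Enrollment Mode (or input runs out)
// and returns how many buffers were consumed as enrollment audio.
std::size_t EnrollFromBuffers(MockIDListener& listener,
                              const std::vector<std::vector<double>>& buffers) {
    listener.StartEnrollment();
    std::size_t enrolled = 0;
    for (const auto& buffer : buffers) {
        if (listener.ProcessBuffer(buffer) != MockStatus::ENROLLING) break;
        ++enrolled;
    }
    listener.StopEnrollment();  // harmless if the automatic stop already fired
    return enrolled;
}
```

With a mock built for 16 kHz audio and 1600-sample buffers, 250 input buffers yield 200 enrollment buffers (20 seconds) before the automatic stop, while a 50-buffer recording is consumed in full and then stopped manually.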
Select the user using the template returned by the registration.
Any new audio buffers passed to Yobe::IDListener::ProcessBuffer while not in Enrollment Mode will be processed with respect to the selected user's voice.
Audio data is passed into the Yobe::IDListener one buffer at a time. See Audio Buffers for more details on their format. As the method signatures for the Yobe::IDListener::ProcessBuffer functions show, the audio can be encoded as Double or PCM 16-bit Integer. The size of the output buffer can also vary from call to call, so check its size after each call and prepare your application to handle the buffers accordingly. In this variant, the size changes only when there is a transition from the unauthorized to the authorized state.
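A minimal sketch of the per-buffer processing loop follows. MockProcessBuffer is a hypothetical stand-in for the SDK call, with an invented placeholder rule for detection, so that the example compiles without the library; the real Yobe::IDListener::ProcessBuffer signatures should be taken from the SDK headers. The names input_buffer, out_buffer, and is_user_detected are chosen to match the discussion of this step.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the SDK call, NOT Yobe::IDListener::ProcessBuffer.
// Placeholder rule: the "user" counts as detected when the buffer holds any
// non-zero sample; otherwise the output is muted (all zeros).
void MockProcessBuffer(const std::vector<double>& input_buffer,
                       std::vector<double>& out_buffer,
                       bool& is_user_detected) {
    is_user_detected = std::any_of(input_buffer.begin(), input_buffer.end(),
                                   [](double s) { return s != 0.0; });
    if (is_user_detected) {
        out_buffer = input_buffer;
    } else {
        out_buffer.assign(input_buffer.size(), 0.0);
    }
}

struct ProcessedStream {
    std::vector<double> samples;       // all output audio, in order
    std::size_t detected_buffers = 0;  // buffers in which the user was detected
};

ProcessedStream ProcessAll(const std::vector<std::vector<double>>& input_buffers) {
    ProcessedStream stream;
    std::vector<double> out_buffer;
    bool is_user_detected = false;  // initialized before the first call
    for (const auto& input_buffer : input_buffers) {
        MockProcessBuffer(input_buffer, out_buffer, is_user_detected);
        // Append by out_buffer.size(), never by the input size, because the
        // output length is allowed to differ from call to call.
        stream.samples.insert(stream.samples.end(),
                              out_buffer.begin(), out_buffer.end());
        if (is_user_detected) ++stream.detected_buffers;
    }
    return stream;
}
```
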
out_buffer in the above example now contains the processed version of the audio that is contained in input_buffer. An example of what to do with this out_buffer is to append its contents to a stream or larger buffer. An out-parameter is used to store whether the selected user was detected in the last buffer of audio via a boolean. In this example, the variable is_user_detected is assumed to be initialized before this call.
Note: You can find the library's built-in buffer size using Yobe::Info::InputBufferSize.
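Once the buffer size is known, a longer recording has to be cut into buffers of that size before it can be processed one buffer at a time. The helper below is one possible sketch: SplitIntoBuffers is a hypothetical name, the buffer size is passed in as a plain parameter rather than read from the SDK, and zero-padding the final short buffer is an assumption of this example, not documented SDK behaviour.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Splits a recording into fixed-size buffers. `buffer_size` is whatever the
// library reports; it is a plain parameter here so the sketch has no SDK
// dependency. Zero-padding the final buffer is an assumption of this sketch.
std::vector<std::vector<double>> SplitIntoBuffers(const std::vector<double>& samples,
                                                  std::size_t buffer_size) {
    std::vector<std::vector<double>> buffers;
    if (buffer_size == 0) return buffers;  // nothing sensible to produce
    for (std::size_t start = 0; start < samples.size(); start += buffer_size) {
        const std::size_t end = std::min(start + buffer_size, samples.size());
        const auto first = samples.begin() + static_cast<std::ptrdiff_t>(start);
        const auto last = samples.begin() + static_cast<std::ptrdiff_t>(end);
        std::vector<double> buffer(first, last);
        buffer.resize(buffer_size, 0.0);  // pad a short final buffer with silence
        buffers.push_back(std::move(buffer));
    }
    return buffers;
}
```

For example, ten samples with a buffer size of four produce three buffers, the last of which holds two real samples followed by two zeros.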
To ensure the IDListener is properly deinitialized, simply call Yobe::IDListener::Deinit.
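One way to make sure Deinit is not forgotten on early returns or exceptions is a small scope guard. The sketch below assumes Deinit can be called with no arguments, which the sentence above suggests but does not spell out. DeinitGuard and MockListener are hypothetical names introduced for this example and are not part of the SDK.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Calls Deinit() on the wrapped object when the guard goes out of scope.
// Assumes Deinit() takes no arguments; any return value is ignored.
template <typename Listener>
class DeinitGuard {
public:
    explicit DeinitGuard(std::shared_ptr<Listener> listener)
        : listener_(std::move(listener)) {}

    ~DeinitGuard() {
        if (listener_) listener_->Deinit();
    }

    // Non-copyable so Deinit() runs exactly once per guard.
    DeinitGuard(const DeinitGuard&) = delete;
    DeinitGuard& operator=(const DeinitGuard&) = delete;

private:
    std::shared_ptr<Listener> listener_;
};

// Hypothetical stand-in used only to exercise the guard without the SDK.
struct MockListener {
    int deinit_calls = 0;
    void Deinit() { ++deinit_calls; }
};
```

Because the guard is a template, the same pattern would apply to any object exposing a compatible Deinit method, provided the guard is created only after initialization has succeeded.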