Working with audio
MindDev can record audio tracks from the computer's microphone. Unless a device name is specified, it records from the default Windows microphone.
Audio management
Audio recording requires two entities:
- The microphone manager
- The microphone recorder
The microphone manager has the following parameters:
- A device identifier
- A file name prefix
- A bit rate (default 44100; this value matches the standard 44.1 kHz audio sampling rate)
- A maximum recording time in seconds (default 3599)
- A device name
- An option to use the voice processor
!!! Tip ‘Complete microphone recording’
The microphone manager generates an audio file of the complete session.
!!! Tip ‘Microphone device name’
The microphone name can be left blank, in which case MindDev uses the computer's default microphone.
!!! Note ‘Maximum recording time of the microphone’
The maximum microphone recording time is 1 hour minus 1 second (3599 seconds).
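The parameters and defaults above can be summarised in a small Python sketch. This is not a MindDev API: the class, field names and identifiers are hypothetical, the value 44100 is read as samples per second, and the sample width (16-bit) and channel count (mono) used for the size estimate are assumptions, as is the default of the voice processor option.

```python
from dataclasses import dataclass


@dataclass
class MicrophoneManagerSettings:
    """Hypothetical mirror of the microphone manager parameters (not a MindDev API)."""
    device_id: str
    file_prefix: str
    sample_rate_hz: int = 44100       # documented default, read here as samples per second
    max_recording_s: int = 3599       # documented default: 1 hour minus 1 second
    device_name: str = ""             # blank means the computer's default microphone
    use_voice_processor: bool = False  # assumed default, not stated in the documentation


def max_pcm_bytes(settings, bytes_per_sample=2, channels=1):
    """Upper bound on raw PCM size; sample width and channel count are assumptions."""
    return (settings.sample_rate_hz * settings.max_recording_s
            * bytes_per_sample * channels)


settings = MicrophoneManagerSettings(device_id="mic1", file_prefix="session")
print(max_pcm_bytes(settings))  # bytes of audio data for a full-length recording
```

Under these assumptions a full-length recording stays a little above 300 MB, which gives an idea of the disk space to plan for.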
Recording an audio stream
Audio is recorded in the same way as data from any other device. Add a microphone recording entity, specifying the identifier of the microphone manager to use and the file name prefix.
An audio file is generated for each trial, in addition to the complete recording produced by the microphone manager.
!!! Tip ‘Audio compilation’
MindDev offers an audio compilation tool that builds a single audio file for a complete experiment from the recordings of all its trials. Recording trial by trial is therefore the preferred approach.
!!! Note ‘File format’
The audio file format is .wav.
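Because the recordings are plain .wav files, they can be inspected outside MindDev. The following sketch uses Python's standard `wave` module. The file name is hypothetical, and the demo writes one second of silence itself so that it runs without a real recording.

```python
import os
import tempfile
import wave


def describe_wav(path):
    """Return the basic properties of a PCM .wav file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }


# Demo file: one second of 16-bit mono silence at 44100 Hz (assumed format).
demo_path = os.path.join(tempfile.mkdtemp(), "trial_demo.wav")
with wave.open(demo_path, "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)
    wav.setframerate(44100)
    wav.writeframes(b"\x00\x00" * 44100)

info = describe_wav(demo_path)
print(info)
```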
Replaying a sound file
MindDev stores sound in a dedicated DataRecord named SoundDataRecord. This container holds no audio data, only a reference to the audio file in .wav format. The purpose of the SoundDataRecord is to display the audio curve graphically and to replay the file directly in the editor.
Storing audio data
The audio data is stored only in the audio file linked to the DataRecord. If the .wav file is lost, all audio data is lost.
Audio compilation tool
MindDev provides an audio compilation tool that generates one audio file per experiment for a given protocol. The tool places the trial audio files end to end and marks the start of each trial with a short signal, which makes the compiled file easier to analyze. If some trial audio files are missing, the compiled file contains signal sequences matching the length of those trials.
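The end-to-end principle can be illustrated with a minimal Python sketch. It is not MindDev's implementation: the marker tone (1000 Hz, 100 ms), the 16-bit mono 44100 Hz format and all function names are assumptions, every trial file is assumed to share that format, and missing trial files are not handled.

```python
import math
import os
import struct
import tempfile
import wave

RATE = 44100  # assumed sample rate shared by all trial files


def tone(freq_hz, duration_ms, rate=RATE, amplitude=0.3):
    """Return 16-bit mono PCM bytes for a sine tone used as the trial marker."""
    n = rate * duration_ms // 1000
    return b"".join(
        struct.pack("<h", int(amplitude * 32767 * math.sin(2 * math.pi * freq_hz * i / rate)))
        for i in range(n)
    )


def compile_trials(trial_paths, out_path, rate=RATE, marker_ms=100):
    """Write each trial after a short marker tone into one .wav file."""
    marker = tone(1000, marker_ms, rate)
    with wave.open(out_path, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)
        out.setframerate(rate)
        for path in trial_paths:
            out.writeframes(marker)  # marks the start of the trial
            with wave.open(path, "rb") as trial:
                out.writeframes(trial.readframes(trial.getnframes()))


# Demo: two half-second silent "trials" written to a temporary folder.
folder = tempfile.mkdtemp()
trial_paths = []
for index in range(2):
    path = os.path.join(folder, f"trial_{index}.wav")
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(RATE)
        wav.writeframes(b"\x00\x00" * (RATE // 2))
    trial_paths.append(path)

compiled_path = os.path.join(folder, "experiment.wav")
compile_trials(trial_paths, compiled_path)
with wave.open(compiled_path, "rb") as wav:
    compiled_frames = wav.getnframes()
print(compiled_frames / RATE, "seconds")
```

The marker tones make each trial boundary visible in a waveform view and audible during playback.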
Working with audio recognition
MindDev relies on the VOSK AI to provide real-time audio recognition. A few preparations are needed for it to work correctly. First, you need a trained AI model. Models can be downloaded here: https://alphacephei.com/vosk/models
To integrate a VOSK AI model into MindDev, use the VOSK tool built into MindDev.
Install the model with the associated button. MindDev must then be restarted.
Using audio recognition AI in real time
To use the audio recognition AI in real time, add a node of the ‘VOSK Speech Engine’ type, which converts the audio stream into text. The text can then be compared with an expected value, for example to decide whether a trial succeeded.
The audio-to-text conversion manager has several parameters:
- Its own identifier (allowing several models to be used)
- The microphone identifier
- The maximum number of alternatives
- The model to be used (must be downloaded and installed beforehand)
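For readers who want to try a downloaded model outside MindDev, the sketch below shows how comparable settings map onto the Vosk Python package (`Model`, `KaldiRecognizer`, `SetMaxAlternatives`, `AcceptWaveform`, `Result`, `FinalResult`). It is an illustration, not how MindDev wires the node internally: the settings class, its field names and the paths are hypothetical, and the sketch reads a .wav file instead of a live microphone stream.

```python
import json
import wave
from dataclasses import dataclass


@dataclass
class SpeechEngineSettings:
    """Hypothetical mirror of the node parameters listed above (not a MindDev API)."""
    engine_id: str
    microphone_id: str
    model_path: str            # folder of an unpacked, previously downloaded model
    max_alternatives: int = 0  # 0 keeps Vosk's single-result output


def transcribe_wav(settings, wav_path, chunk_frames=4000):
    """Feed a 16-bit mono PCM .wav file to Vosk and return the parsed results."""
    # Imported lazily so the settings class is usable without the vosk package.
    from vosk import KaldiRecognizer, Model

    results = []
    with wave.open(wav_path, "rb") as wav:
        recognizer = KaldiRecognizer(Model(settings.model_path), wav.getframerate())
        recognizer.SetMaxAlternatives(settings.max_alternatives)
        while True:
            data = wav.readframes(chunk_frames)
            if not data:
                break
            # AcceptWaveform reports when an utterance has been completed.
            if recognizer.AcceptWaveform(data):
                results.append(json.loads(recognizer.Result()))
        results.append(json.loads(recognizer.FinalResult()))
    return results


settings = SpeechEngineSettings(
    engine_id="speech1", microphone_id="mic1", model_path="path/to/model"
)
```

Calling `transcribe_wav(settings, "some_recording.wav")` requires the `vosk` package and a real model folder, so it is left out of the sketch.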
The VOSK conversion manager requires an entity that manages the computer's microphone, and that entity's identifier must be specified correctly. Only the MindDev version 2 microphone manager is compatible. This manager has the following parameters:
With this configuration, audio recognition can be used as a success condition. A ‘Success On speech’ type node can be added to a trial; it stops the trial in progress when a recognised word is encountered.
The properties of this node are as follows:
- Audio manager identifier (must correspond to a VOSK Speech Engine node identifier)
- A word to be recognised
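The kind of check such a node performs can be sketched in Python as follows. This is an assumption about the logic, not MindDev's code: the function names are hypothetical, and the JSON shapes are those returned by the Vosk Python package (a `text` field, or an `alternatives` list when a maximum number of alternatives is set).

```python
import json


def recognised_texts(result_json):
    """Extract the candidate transcriptions from a Vosk result string."""
    result = json.loads(result_json)
    if "alternatives" in result:
        return [alt.get("text", "") for alt in result["alternatives"]]
    return [result.get("text", "")]


def success_on_speech(result_json, target_word):
    """Return True when the target word appears as a whole word in any candidate."""
    target = target_word.strip().lower()
    return any(
        target in text.lower().split() for text in recognised_texts(result_json)
    )


single = '{"text": "the red ball"}'
several = '{"alternatives": [{"confidence": 210.5, "text": " the bread"}, {"confidence": 208.1, "text": " the red"}]}'
print(success_on_speech(single, "red"))
print(success_on_speech(several, "red"))
```

Matching on whole words avoids a false success when the target is only part of a longer word, such as "red" inside "bread".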