pip install -r requirements.txt
Download the speech_commands_v0.02 dataset from Warden P. (2018) and unpack it in the Dataset folder.
Dataset
└── data-speech_commands_v0.02
    ├── _background_noise_
    ├── backward
    ...
    └── zero
Warden, P.: Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition (April 2018), https://arxiv.org/abs/1804.03209
We split the dataset into training and test sets with an 80:20 ratio.
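The 80:20 split can be sketched as follows (a minimal illustration with the standard library; the actual notebook may use a different helper, and the file names below are placeholders):

```python
# Hypothetical sketch of the 80:20 train/test split described above.
import random

def split_dataset(files, train_ratio=0.8, seed=42):
    """Shuffle file paths and cut them into train and test lists."""
    files = list(files)
    random.Random(seed).shuffle(files)  # fixed seed for reproducibility
    cut = int(len(files) * train_ratio)
    return files[:cut], files[cut:]

train, test = split_dataset([f"clip_{i}.wav" for i in range(100)])
print(len(train), len(test))  # 80 20
```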
To train the model we use mel-spectrograms instead of raw audio for better feature extraction.
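The project uses librosa (e.g. `librosa.feature.melspectrogram`) for this step; as an illustration of the underlying idea, here is a plain-NumPy log power spectrogram. The frame size and hop length are assumptions, not the project's settings:

```python
# Illustrative spectrogram computation with plain NumPy; the project
# itself uses librosa for the mel scaling. Frame sizes are assumptions.
import numpy as np

def power_spectrogram(y, n_fft=512, hop=128):
    """Frame the signal, apply a Hann window, take the FFT magnitude."""
    window = np.hanning(n_fft)
    frames = [y[i:i + n_fft] * window
              for i in range(0, len(y) - n_fft + 1, hop)]
    spec = np.abs(np.fft.rfft(frames, axis=1)) ** 2   # (frames, n_fft//2+1)
    return 10 * np.log10(spec + 1e-10)                # log scale in dB

sr = 16000
t = np.linspace(0, 1, sr, endpoint=False)
y = np.sin(2 * np.pi * 440 * t)                       # 1 s, 440 Hz test tone
S = power_spectrogram(y)
print(S.shape)
```

The peak of each frame lands near FFT bin 440 * 512 / 16000 ≈ 14, which is how the tone shows up as a horizontal line in the spectrogram image.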
Run the Jupyter notebook train.ipynb. The "Params" section at the top of the notebook lets you configure settings such as the model type and the optimizer type. After the notebook has finished, the Weights folder contains the trained weights (.pth) and the model with trained weights in ONNX format.
We compared VGG19BN and ResNet34 as well as the SGD and Adam optimizers. All plots are based on the test dataset.
VGG19 + Adam-Optimizer (best result)
Download pretrained models from here.
[Verified for 2019.4.1f - Windows only]
- See our video tutorial Unicornn - HowTo for a quick 10-minute introduction
- It contains the same information as the following Readme.md
- Start Play Mode and wait until the monitor says "Python running" and the circle turns green
- A background python process is initiated (see requirements.txt)
- librosa
- numpy
- datetime
- pylab
- PIL
- numba 0.48
- You can activate a console window by checking useShell [x] in GameManager -> PythonInterface.cs to see the process output
- Process is terminated automatically after leaving PlayMode
- Start voice recording by pressing the Start button
- Say two words with silence (± 1 sec) in between
- Check your mic level with the VU meter on the right side after the first recording
- Check the Threshold if there is background noise (GameManager -> MicrophoneInput.cs)
- Possible words, e.g. Zero [...] Left:
| Objects | Actions |
|---|---|
| Zero | Forward |
| One | Backward |
| Two | Left |
| Three | Right |
| Four | Up |
| Five | Down |
| Six | |
| Seven | |
| Eight | |
| Nine | |
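Since a command is spoken as two words (an object followed by an action), the recognition of a valid pair can be sketched like this. The actual mapping to scene behaviour lives in SpeechCommands.cs, so everything below is illustrative:

```python
# Illustrative two-word command check mirroring the table above; the real
# logic lives in SpeechCommands.cs, so these names are assumptions.
OBJECTS = {"zero", "one", "two", "three", "four",
           "five", "six", "seven", "eight", "nine"}
ACTIONS = {"forward", "backward", "left", "right", "up", "down"}

def parse_command(word1, word2):
    """Return (object, action) if the pair is a valid command, else None."""
    w1, w2 = word1.lower(), word2.lower()
    if w1 in OBJECTS and w2 in ACTIONS:
        return (w1, w2)
    return None

print(parse_command("Two", "Left"))   # ('two', 'left')
print(parse_command("Left", "Two"))   # None (wrong order)
```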
- Stop recording with the Stop button and wait
- Word splitting and processing are done automatically
- You can see the detected words and their probability next to our Unicornn
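The threshold-based word splitting can be sketched as follows (the frame size and threshold value here are illustrative; the real logic lives in MicrophoneInput.cs):

```python
# Illustrative energy-threshold word slicing; frame size and threshold
# are assumptions, the project's slicing happens in MicrophoneInput.cs.
import numpy as np

def slice_words(samples, threshold=0.02, frame=1600):
    """Split a signal into segments wherever frame energy exceeds threshold."""
    segments, current = [], []
    for i in range(0, len(samples), frame):
        chunk = samples[i:i + frame]
        if np.abs(chunk).mean() > threshold:
            current.append(chunk)        # loud frame: part of a word
        elif current:
            segments.append(np.concatenate(current))  # word just ended
            current = []
    if current:
        segments.append(np.concatenate(current))
    return segments

sr = 16000
silence = np.zeros(sr)
word = 0.5 * np.sin(2 * np.pi * 300 * np.arange(sr) / sr)  # 1 s "word"
signal = np.concatenate([silence, word, silence, word, silence])
print(len(slice_words(signal)))  # 2
```

This also shows why a noisy environment needs a higher threshold: if the "silent" frames carry enough energy, everything merges into one long segment.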
Unicornn
├── GameManager
│   ├── SpeechCommands.cs: translates predictions to actions in the scene
│   ├── MicrophoneInput.cs: processes the microphone input, slices words, applies the threshold
│   └── PythonInterface.cs: runs the background process (librosa) to create spectrograms
├── Agent
│   └── Agent.cs: takes the .onnx model and the spectrograms from sliced words, finds the prediction
└── SceneStuff
    └── UI elements, buttons and visual elements
- Barracuda can be installed via the Unity Package Manager. For further information see: https://docs.unity3d.com/Packages/[email protected]/manual/index.html
- Project was tested with Barracuda 1.0.0
- You can switch between our trained models using the button in the lower-left corner
- If you want to use your own models, drag them to the corresponding field in Agent.cs
- By default we chose the model with the best results to start with (VGG + Adam)
- By default we chose a very low threshold to detect silence between words
- If you are in a louder environment, it can happen that a "silent" moment is still above our threshold
- In that case, change GameManager -> MicrophoneInput -> Threshold to a bigger value
- We process the audio input in Unity3D, save the sliced float arrays as .wav files and use librosa to generate mel-spectrograms
- The background Python process watches the project folder for known filenames and processes the files when they appear
- After processing, the process deletes the .wav files for the next iteration
- The script writes its own process ID (PID) to a text file because it is started via cmd.exe
- To terminate all processes we need the process IDs of all child processes
- By checking whether the process has exited we can monitor the status of our background process
process.HasExited
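The watcher described above can be sketched as follows. The PID file name, glob pattern and polling interval are assumptions for illustration; the actual script ships with the project and also calls librosa on each file:

```python
# Hypothetical sketch of the background watcher: poll for .wav files,
# "process" them, then delete them for the next iteration. The real
# script also generates a spectrogram from each file via librosa.
import glob
import os
import time

def write_pid(path="python_pid.txt"):
    """Record our PID so the parent (Unity) can terminate us later."""
    with open(path, "w") as f:
        f.write(str(os.getpid()))

def watch_once(pattern="*.wav"):
    """One polling iteration: handle and remove any waiting .wav files."""
    handled = []
    for wav in glob.glob(pattern):
        # here the real script would create and save a mel-spectrogram
        handled.append(wav)
        os.remove(wav)           # clear the file for the next recording
    return handled

write_pid()
while glob.glob("*.wav"):        # demo loop: runs until no files remain
    watch_once()
    time.sleep(0.1)              # polling interval (assumption)
```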