Generating human faces with a conditional GAN conditioned on emotions identified from human speech using Speech Emotion Recognition (SER)
An image showing the overall pipeline
Below is a short demo of the web app, showing human faces generated from the emotion identified in human speech.
pandas==1.0.4
Keras==2.3.1
librosa==0.7.2
streamlit==0.61.0
tensorflow==2.0.0
numpy==1.18.1
tqdm==4.42.0
scipy==1.4.1
tensorflow_hub==0.8.0
matplotlib==3.1.3
Flask==1.1.2
ipython==7.17.0
Pillow==7.2.0
pyaudio==0.2.11
scikit_learn==0.23.2
Project
├── speech_emotion_recognition
│ ├── code
│ │ ├── ser_training.ipynb
│ │ ├── ser_prediction.ipynb
│ ├── data
│ │ ├── Audio_Speech_Actors_01-24
│ │ │ ├── Actor_01
│ │ │ │ ├── 03-01-01-01-01-01-01.wav
│ │ │ │ ├── 03-01-01-01-01-02-01.wav
│ │ │ │ ...
│ │ │ ├── Actor_02
│ │ │ ...
│ │ │ ├── Actor_24
│ ├── weights
├── conditional_gan
│ ├── code
│ │ ├── cgan_training.ipynb
│ │ ├── cgan_prediction.ipynb
│ ├── data
│ │ ├── fer2013.csv
│ ├── weights
├── streamlit_webapp
The dataset can be downloaded at:
https://www.kaggle.com/uwrfkaggler/ravdess-emotional-speech-audio
and should be placed in the location
./speech_emotion_recognition/data/
It consists of speech audio recordings in the voices of 24 actors. Five sample audio files from the first actor are included at the above location as an example.
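RAVDESS encodes the labels in the filename itself: each name has seven dash-separated two-digit fields (modality, vocal channel, emotion, intensity, statement, repetition, actor), and the third field is the emotion code. A minimal helper for recovering the emotion when building training labels, following the published RAVDESS naming convention:

```python
# Decode a RAVDESS filename such as 03-01-01-01-01-01-01.wav
# into its emotion label. The third dash-separated field holds
# the emotion code (per the RAVDESS naming convention).
EMOTIONS = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}

def emotion_from_filename(name: str) -> str:
    """Return the emotion encoded in a RAVDESS .wav filename."""
    parts = name.replace(".wav", "").split("-")
    if len(parts) != 7:
        raise ValueError(f"unexpected RAVDESS filename: {name}")
    return EMOTIONS[parts[2]]

print(emotion_from_filename("03-01-01-01-01-01-01.wav"))  # neutral
```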
The dataset can be downloaded at:
https://www.kaggle.com/c/challenges-in-representation-learning-facial-expression-recognition-challenge/data
and should be placed in the location
./conditional_gan/data/
We are interested in the "fer2013.csv" file from the data bundle. A sample file containing data for only 5 faces is included as an example.
[Note 1: Please host and run these notebooks on Google Colab]
[Note 2: Please mount the drive where the data files are present (follow the directory structure)]
For each of SER and cGAN, there are two separate Jupyter Notebook files, one for training and one for prediction.
./speech_emotion_recognition/code/ser_training.ipynb
The weights obtained are stored in ./speech_emotion_recognition/weights
The pretrained weights corresponding to the best model are already put at this location.
./speech_emotion_recognition/code/ser_prediction.ipynb
./conditional_gan/code/cgan_training.ipynb
./conditional_gan/code/cgan_prediction.ipynb
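At prediction time, a conditional generator receives a random noise vector together with the target emotion label (typically one-hot encoded). A minimal numpy sketch of how those inputs are assembled; LATENT_DIM and NUM_CLASSES here are illustrative assumptions, not values taken from the notebooks:

```python
import numpy as np

NUM_CLASSES = 7   # assumption: one class per FER2013 emotion
LATENT_DIM = 100  # assumption: size of the generator's noise vector

def generator_inputs(emotion_id, batch_size=1, rng=None):
    """Build the (noise, one-hot label) pair fed to a conditional generator."""
    rng = rng if rng is not None else np.random.default_rng()
    noise = rng.normal(size=(batch_size, LATENT_DIM)).astype(np.float32)
    labels = np.zeros((batch_size, NUM_CLASSES), dtype=np.float32)
    labels[:, emotion_id] = 1.0  # condition every sample on the same emotion
    return noise, labels

noise, labels = generator_inputs(emotion_id=3, batch_size=4)
print(noise.shape, labels.shape)  # (4, 100) (4, 7)
```

The same pair would then be passed to the trained generator (e.g. `generator.predict([noise, labels])` in Keras) to produce faces for that emotion.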
- Mirza, M. and Osindero, S., 2014. Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784.
- Livingstone, S.R. and Russo, F.A., 2018. The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): A dynamic, multimodal set of facial and vocal expressions in North American English. PloS one, 13(5), p.e0196391.
- Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A. and Bengio, Y., 2014. Generative adversarial nets. In Advances in neural information processing systems (pp. 2672-2680).
- Francois Chollet. 2017. Deep Learning with Python (1st. ed.). Manning Publications Co., USA.
- https://www.kaggle.com/c/challenges-in-representation-learning-facial-expression-recognition-challenge
- https://medium.com/@ma.bagheri/a-tutorial-on-conditional-generative-adversarial-nets-keras-implementation-694dcafa6282
- https://machinelearningmastery.com/how-to-develop-a-conditional-generative-adversarial-network-from-scratch/