Open source code and data

Below you can find a number of open source tools and libraries in the domain of multimodal signal processing and machine learning that have been (co-)developed by members of our group.

Audio / Speech and Language

pyAudioAnalysis A Python library covering a wide range of audio analysis tasks, such as feature extraction, classification, segmentation and clustering of audio signals (see the example after this list) GitHub
deep-audio-features A Python library for training Convolutional Neural Networks as audio classifiers. The library provides wrappers around PyTorch for training CNNs on audio classification tasks and for using the trained CNNs as feature extractors. GitHub
amvoc A Python Tool for Analysis of Mouse Vocal Communication. Developed for (and with the help of) the Erich Jarvis Lab at The Rockefeller University, NY GitHub
paura A Python AUdio Recording and Analysis Tool that lets you record sounds from your microphone and perform basic audio analysis in real time GitHub
readys A Speech Analytics Python Tool for Speech Quality Assessment GitHub
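
As a quick illustration of how these libraries are typically used, here is a minimal sketch of short-term feature extraction with pyAudioAnalysis. It assumes the library's current module layout (audioBasicIO and ShortTermFeatures); the file name is a placeholder, and the 50 ms window / 25 ms step values are only common defaults, not requirements.

    from pyAudioAnalysis import audioBasicIO, ShortTermFeatures

    # Read a WAV file (placeholder name) and convert it to a single channel
    fs, x = audioBasicIO.read_audio_file("example.wav")
    x = audioBasicIO.stereo_to_mono(x)

    # Extract short-term features using 50 ms windows with a 25 ms step
    features, feature_names = ShortTermFeatures.feature_extraction(
        x, fs, 0.050 * fs, 0.025 * fs)

    print(features.shape)      # (num_features, num_windows)
    print(feature_names[:3])   # names of the first few extracted features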

Video / Image analysis

multimodal movie analysis A set of scripts to extract features from a movie using audio and visual analysis GitHub
video_annotator A simple video annotation tool. It can be used for annotating video summarization metadata. GitHub

General Machine Learning

Multilabel SMOTE A Python library for multi-label SMOTE dataset upsampling. The implemented approach resamples only from representative data points that belong exclusively to the minority class (an illustrative sketch follows below). GitHub
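
To make the idea concrete, below is a short illustrative sketch of SMOTE-style oversampling restricted to samples that carry only the minority label. It is written with generic numpy/scikit-learn code and does not reflect the repository's actual API, function names or internals.

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def oversample_minority_only(X, Y, minority_label, n_synthetic, k=5, seed=0):
        # Generic SMOTE-style sketch (not the repository's implementation):
        # interpolate between samples that carry only the minority label.
        rng = np.random.default_rng(seed)
        # keep samples that have the minority label and no other label
        mask = (Y[:, minority_label] == 1) & (Y.sum(axis=1) == 1)
        X_min = X[mask]
        # nearest neighbours among the minority-only samples
        nn = NearestNeighbors(n_neighbors=min(k + 1, len(X_min))).fit(X_min)
        _, neighbors = nn.kneighbors(X_min)
        synthetic = []
        for _ in range(n_synthetic):
            i = rng.integers(len(X_min))
            j = rng.choice(neighbors[i][1:])  # random neighbour of i, skipping i itself
            synthetic.append(X_min[i] + rng.random() * (X_min[j] - X_min[i]))
        return np.vstack(synthetic)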

Datasets

GreThE A Dataset for Speech Emotion Recognition in Greek Theatrical Plays GitHub
PuSQ A Public Speaking Quality dataset for assessing speakers' skills GitHub
archeo Python code and a dataset for sound event detection in areas of touristic interest GitHub