Deep Audio API

Deep Audio API provides access to a set of models that analyze audio signals in terms of:

General auditory analysis (discriminate between music, speech, other sounds and silence)
Musical classification (musical genres, moods and styles)
Speech to Text (ASR)
Speaker characteristics (gender, speaking style, etc.)
Environmental sound analysis (recognize the quality of the "soundscape")

Access to the API is provided through a simple Python client that sends audio data over a gRPC connection. Audio predictions are returned in a simple JSON format.
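As an illustration of working with JSON predictions, here is a minimal sketch that sums the time assigned to each predicted label. The response schema is not specified here, so the payload and its field names (`segments`, `start`, `end`, `label`, `confidence`) are hypothetical placeholders rather than the actual Deep Audio API format.

```python
import json

# Hypothetical response payload: these field names are illustrative
# assumptions, not the documented Deep Audio API schema.
EXAMPLE_RESPONSE = """
{
  "segments": [
    {"start": 0.0, "end": 4.5, "label": "speech", "confidence": 0.93},
    {"start": 4.5, "end": 9.0, "label": "music", "confidence": 0.88},
    {"start": 9.0, "end": 10.0, "label": "silence", "confidence": 0.99}
  ]
}
"""


def total_duration_per_label(response_text):
    """Parse a JSON response and sum segment durations (seconds) per label."""
    payload = json.loads(response_text)
    totals = {}
    for segment in payload["segments"]:
        duration = segment["end"] - segment["start"]
        totals[segment["label"]] = totals.get(segment["label"], 0.0) + duration
    return totals


durations = total_duration_per_label(EXAMPLE_RESPONSE)
for label, seconds in durations.items():
    print(f"{label}: {seconds:.1f} s")
```

The same pattern should carry over to the real response once its field names are known; only the key lookups would need to change.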

Response speed depends on the internet connection, but at an average upload speed of 5 Mbps the real-time ratio exceeds 30x (e.g., analyzing 60 minutes of audio data takes less than 2 minutes).
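The real-time ratio quoted above is the audio duration divided by the time needed to analyze it. The short sketch below reproduces that arithmetic; the function names are illustrative and not part of the API client.

```python
def realtime_ratio(audio_seconds, processing_seconds):
    """Return how many times faster than real time the analysis ran."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


def processing_seconds_at_ratio(audio_seconds, ratio):
    """Return the processing time implied by a given real-time ratio."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return audio_seconds / ratio


# 60 minutes of audio analyzed in 2 minutes corresponds to a 30x ratio.
ratio = realtime_ratio(60 * 60, 2 * 60)

# Conversely, at a 30x ratio, 60 minutes of audio takes 120 seconds.
expected_wait = processing_seconds_at_ratio(60 * 60, 30)

print(f"ratio: {ratio:.0f}x, wait: {expected_wait:.0f} s")
```

Since the quoted ratio is a lower bound ("more than 30x"), the computed wait is an upper bound under the same connection assumptions.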

Ask for a demo by contacting us here