Deep Audio API
Deep Audio API provides access to a set of models that analyze audio signals in terms of:
- General auditory analysis (discriminate between music, speech, other sounds and silence)
- Musical classification (musical genres, moods and styles)
- Speech to Text (ASR)
- Speaker characteristics (gender, speaking style, etc.)
- Environmental sound analysis (recognize the quality of the "soundscape")
Access to the API is provided through a simple Python client that sends audio data over a gRPC connection. Audio predictions are returned in a simple JSON format.
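As a rough illustration of the two ends of that exchange, the sketch below splits audio bytes into chunks (the form a streaming gRPC request typically consumes) and parses a JSON reply. It does not use the actual client: `iter_audio_chunks`, `parse_predictions`, the 64 KiB chunk size and the `segments` response layout are all hypothetical, chosen only for illustration.

```python
import io
import json
from typing import BinaryIO, Iterator, List

# Assumed chunk size; the API's real message size limit is not stated here.
CHUNK_SIZE = 64 * 1024


def iter_audio_chunks(stream: BinaryIO, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield successive byte chunks from an audio stream until it is exhausted."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


# Hypothetical response body; the real JSON schema may differ.
EXAMPLE_RESPONSE = '{"segments": [{"start": 0.0, "end": 4.5, "label": "speech"}]}'


def parse_predictions(payload: str) -> List[dict]:
    """Decode a JSON payload and return its (assumed) list of segments."""
    return json.loads(payload)["segments"]


if __name__ == "__main__":
    audio = io.BytesIO(b"\x00" * 150_000)  # stand-in for real audio bytes
    print(sum(1 for _ in iter_audio_chunks(audio)), "chunks")
    for segment in parse_predictions(EXAMPLE_RESPONSE):
        print(segment["start"], segment["end"], segment["label"])
```

With the real client, the chunk generator would be replaced by whatever upload call the client exposes, and the parsing step would follow the documented response fields.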
Response speed depends on the internet connection, but with an average upload speed of 5 Mbps the real-time ratio exceeds 30x (e.g. analyzing 60 minutes of audio takes less than 2 minutes).
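The real-time ratio is simply audio duration divided by processing time, so the expected turnaround for a recording can be estimated as in the small sketch below. The helper name `processing_minutes` is illustrative only, and the 30x figure is the lower bound quoted above, not a guaranteed rate.

```python
def processing_minutes(audio_minutes: float, realtime_ratio: float) -> float:
    """Estimate processing time from audio length and a real-time ratio."""
    if realtime_ratio <= 0:
        raise ValueError("realtime_ratio must be positive")
    return audio_minutes / realtime_ratio


if __name__ == "__main__":
    # 60 minutes of audio at a 30x real-time ratio
    print(processing_minutes(60, 30))  # 2.0
```

Since the ratio is quoted as "more than 30x", the value returned for a ratio of 30 is an upper bound on the expected time.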
Ask for a demo by contacting us here