Deep Audio API

Deep Audio API provides access to a set of models that analyze audio signals in terms of:

  • General auditory analysis (discriminate between music, speech, other sounds and silence)
  • Musical classification (musical genres, moods and styles)
  • Speech to Text (ASR)
  • Speaker characteristics (gender, speaking style, etc.)
  • Environmental sound analysis (recognize the quality of the "soundscape")

Access to the API is provided through a simple Python client that sends audio data over a gRPC connection. Audio predictions are returned in a simple JSON format.

Response speed depends on the internet connection. With an average upload speed of 5 Mbps, the real-time ratio is more than 30x (e.g. it takes less than 2 minutes to analyze 60 minutes of audio data).
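
The real-time ratio above can be turned into a rough processing-time estimate. The following is a minimal sketch, not part of the Deep Audio client: the function name and its parameters are hypothetical, and the default ratio of 30 is simply the lower bound quoted above for a 5 Mbps upload speed.

```python
def estimated_processing_seconds(audio_seconds: float,
                                 realtime_ratio: float = 30.0) -> float:
    """Estimate how long analysis takes for a recording.

    audio_seconds  -- duration of the audio in seconds
    realtime_ratio -- audio duration divided by processing time
                      (hypothetical default: the 30x figure quoted above)
    """
    if realtime_ratio <= 0:
        raise ValueError("realtime_ratio must be positive")
    return audio_seconds / realtime_ratio


# 60 minutes of audio at a 30x real-time ratio
minutes = estimated_processing_seconds(60 * 60) / 60
print(minutes)  # 2.0
```

Because the quoted ratio is a lower bound ("more than 30x"), the result is best read as an upper bound on processing time; a slower connection would lower the ratio and lengthen the estimate.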

Ask for a demo by contacting us here