Conversational Metrics API:
Insights from audio
Comprehensive speech analysis of conversations - meetings, customer support calls, interviews, earnings calls. Measure intent, cross-talk, the number of questions asked, talk-to-listen ratio, pitch, tone, speech disfluency, and more to generate actionable insights from multi-speaker conversations.
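As a rough illustration of one of these metrics, the sketch below computes a talk-to-listen ratio from diarized speech segments. The segment format and function name are assumptions for this example, not the API's actual interface; in practice the segments would come from a diarization step.

```python
# Hypothetical sketch: computing a talk-to-listen ratio from
# diarized segments of the form (speaker, start_sec, end_sec).

def talk_to_listen_ratio(segments, speaker):
    """Time `speaker` spends talking divided by time others talk."""
    talk = sum(end - start for spk, start, end in segments if spk == speaker)
    listen = sum(end - start for spk, start, end in segments if spk != speaker)
    return talk / listen if listen else float("inf")

# Toy support call: the agent talks 40 s total, the customer 80 s.
segments = [
    ("agent", 0.0, 25.0),
    ("customer", 25.0, 70.0),
    ("agent", 70.0, 85.0),
    ("customer", 85.0, 120.0),
]
print(talk_to_listen_ratio(segments, "agent"))  # 0.5
```

A ratio well below 1.0, as here, would suggest the agent is mostly listening rather than talking.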
Speaker Diarization API:
Who spoke when?
Speaker recognition/diarization is the identification of individual speakers based on the characteristics of their unique voice qualities. In an audio recording with multiple speakers (a conference call, a dialogue, etc.), the Diarization API identifies each speaker at precisely the times they spoke during the conversation. On the left is an audio recording of a debate; the image shows the clusters generated from each speaker's speech patterns and the precise times they participated in the conversation.
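The clustering idea behind diarization can be sketched in miniature: group per-segment voice features so that segments from the same speaker land in the same cluster. The features below (mean pitch in Hz) and the tiny one-dimensional k-means are illustrative assumptions only; production systems cluster learned speaker embeddings, not raw pitch.

```python
# Toy diarization-style clustering: per-segment mean pitch values
# (made up for illustration) grouped into two speaker clusters
# with a minimal 1-D k-means.

def kmeans_1d(values, k=2, iters=20):
    centroids = [min(values), max(values)]  # simple spread-out init
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            idx = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[idx].append(v)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    # Final assignment: nearest centroid per segment.
    return [min(range(k), key=lambda i: abs(v - centroids[i])) for v in values]

# Two alternating speakers: one lower-pitched, one higher-pitched.
segment_pitches = [110, 210, 115, 205, 108, 215]
print(kmeans_1d(segment_pitches))  # [0, 1, 0, 1, 0, 1]
```

Each cluster label then stands for one speaker, and the segment boundaries give "who spoke when".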
Emotion Recognition API:
If Emotions Could Talk
The Emotion Recognition API identifies emotions from the paralinguistic properties of speech (without any text-based references). Among the emotions extracted are anger, stress, and disgust. On the right are the audio emotion metrics extracted from a CNN video clip.
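To make "paralinguistic properties" concrete, the sketch below computes two classic features often fed to emotion classifiers: short-term energy and zero-crossing rate. The synthetic waveforms are stand-ins for real speech frames, and the "calm"/"agitated" labels are illustrative assumptions, not the API's actual feature set.

```python
import math

# Two paralinguistic features commonly used in speech emotion work:
# short-term energy (loudness proxy) and zero-crossing rate
# (a rough proxy for spectral content / vocal agitation).

def energy(frame):
    return sum(s * s for s in frame) / len(frame)

def zero_crossing_rate(frame):
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0)
    return crossings / (len(frame) - 1)

# Synthetic 50 ms frames at an 8 kHz sample rate:
# "calm" = quiet, low-frequency; "agitated" = loud, higher-frequency.
n, sr = 400, 8000
calm = [0.2 * math.sin(2 * math.pi * 120 * t / sr) for t in range(n)]
agitated = [0.8 * math.sin(2 * math.pi * 300 * t / sr) for t in range(n)]

print(energy(agitated) > energy(calm))                          # True
print(zero_crossing_rate(agitated) > zero_crossing_rate(calm))  # True
```

A real classifier would combine many such features (pitch contour, jitter, spectral shape) across frames before predicting an emotion label.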
Signal vs. Noise
Media recordings are susceptible to noise. Noise embedded in audio files can be random or white noise, and denoising algorithms are used to remove it. Listen to the sample audio clips and compare them with the corresponding outputs displayed below:
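The effect of denoising white noise can be sketched with the simplest possible filter: a moving average that smooths random noise out of a clean tone. This is an illustrative toy, not the denoising algorithm the product uses; real denoisers rely on far more sophisticated methods such as spectral subtraction or learned models.

```python
import math
import random

# Toy denoising demo: white Gaussian noise added to a sine wave,
# then attenuated with a moving-average filter.

def moving_average(signal, window=9):
    half = window // 2
    # Pad edges by repeating the boundary samples.
    padded = signal[:1] * half + signal + signal[-1:] * half
    return [sum(padded[i:i + window]) / window for i in range(len(signal))]

def rms_error(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

random.seed(0)
n, sr, freq = 1000, 8000, 60
clean = [math.sin(2 * math.pi * freq * t / sr) for t in range(n)]
noisy = [s + random.gauss(0, 0.3) for s in clean]
denoised = moving_average(noisy)

# Averaging over 9 samples shrinks the noise roughly threefold while
# barely attenuating the slow 60 Hz tone, so the error drops.
print(rms_error(denoised, clean) < rms_error(noisy, clean))  # True
```

The same before/after comparison is what the audio samples below demonstrate audibly.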