Mass-scale, user-independent, device-independent, voice messaging system
Abstract
A mass-scale, user-independent, device-independent voice messaging system that converts unstructured voice messages into text for display on a screen is disclosed. The system comprises (i) computer implemented sub-systems and (ii) a network connection to human operators who provide transcription and quality control. The system is adapted to optimize the effectiveness of the human operators by further comprising three core sub-systems: (a) a pre-processing front end that determines an appropriate conversion strategy; (b) one or more conversion resources; and (c) a quality control sub-system.
24 Claims
1. A voice messaging system for converting an audio message from a caller into text, the voice messaging system comprising:
- at least one automatic speech recognition (ASR) system to automatically recognize at least some of the audio message;
- a computer implemented preprocessing front-end to process the audio message from the caller and to detect if the audio message contains no voice content, wherein:
- if the preprocessing front-end detects that the audio message contains no voice content, the preprocessing front-end does not provide the audio message to the ASR component; and
- if the preprocessing front-end detects that the audio message contains voice content, the front-end provides the audio message to the ASR component, and
- wherein the computer implemented preprocessing front-end comprises a computer implemented speech quality detector to determine at least one measure of speech quality of the voice content of the audio message, and wherein the speech quality detector detects drop-outs, estimates noise levels and/or calculates an overall measure of voice quality using an adaptive threshold to reject lowest quality messages.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A method comprising:
- receiving an audio message from a caller;
- determining at least one measure of speech quality of the voice content of the audio message including detecting drop-outs, estimating noise levels and/or calculating an overall measure of voice quality using an adaptive threshold to reject lowest quality messages;
- processing the audio message to determine if the audio message contains voice content;
- providing the audio message to an automatic speech recognition component to convert, at least in part, the voice content to text if the audio message is determined to contain voice content;
- not providing the audio message to an automatic speech recognition component if the audio message is determined to contain no voice content.
Dependent claims: 14, 15, 16, 17, 18
19. An apparatus comprising:
- at least one input to receive an audio message from a caller; and
- at least one processor capable of receiving the audio message, the at least one processor configured to:
- determine at least one measure of speech quality of the voice content of the audio message including detecting drop-outs, estimating noise levels and/or calculating an overall measure of voice quality using an adaptive threshold to reject lowest quality messages;
- process the audio message to determine if the audio message contains voice content;
- provide the audio message to an automatic speech recognition component to convert, at least in part, the voice content to text if the audio message is determined to contain voice content; and
- not provide the audio message to an automatic speech recognition component if the audio message is determined to contain no voice content.
Dependent claims: 20, 21, 22, 23, 24
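The gating behavior recited in the independent claims can be illustrated with a minimal Python sketch. It is not the patented implementation: the claims do not specify how voice content is detected, how drop-outs are measured, or how the threshold adapts. Everything below is an assumption made for illustration, including the frame size, the energy floor, the energy-based stand-in for voice detection, the drop-out-based quality score, the percentile-style `AdaptiveThreshold`, and the names `front_end`, `has_voice`, and `quality_score`. What happens to a rejected message is also assumed; here it is simply not passed to ASR.

```python
from collections import deque
from typing import Callable, List, Optional, Sequence, Tuple

FRAME = 160          # samples per frame (20 ms at 8 kHz); assumed value
ENERGY_FLOOR = 1e-4  # frame energy above this counts as "active"; assumed value


def frame_energies(samples: Sequence[float], frame: int = FRAME) -> List[float]:
    """Mean squared amplitude of each non-overlapping frame."""
    out = []
    for start in range(0, len(samples) - frame + 1, frame):
        chunk = samples[start:start + frame]
        out.append(sum(s * s for s in chunk) / frame)
    return out


def has_voice(energies: Sequence[float], min_active: int = 5) -> bool:
    """Crude stand-in for voice detection: enough frames above the floor."""
    return sum(e > ENERGY_FLOOR for e in energies) >= min_active


def quality_score(energies: Sequence[float]) -> float:
    """Toy quality measure in [0, 1]: share of active frames between the
    first and last active frame. Silent gaps (treated as drop-outs) lower it."""
    active = [i for i, e in enumerate(energies) if e > ENERGY_FLOOR]
    if not active:
        return 0.0
    return len(active) / (active[-1] - active[0] + 1)


class AdaptiveThreshold:
    """Rejects roughly the lowest `reject_fraction` of recently seen scores."""

    def __init__(self, reject_fraction: float = 0.1, window: int = 100,
                 initial: float = 0.5, warmup: int = 10) -> None:
        self.reject_fraction = reject_fraction
        self.initial = initial
        self.warmup = warmup
        self.history: deque = deque(maxlen=window)

    def threshold(self) -> float:
        # Use a fixed threshold until enough scores have been observed.
        if len(self.history) < self.warmup:
            return self.initial
        ordered = sorted(self.history)
        return ordered[int(self.reject_fraction * len(ordered))]

    def accept(self, score: float) -> bool:
        ok = score >= self.threshold()
        self.history.append(score)  # the threshold tracks recent traffic
        return ok


def front_end(samples: Sequence[float], asr: Callable[[Sequence[float]], str],
              gate: AdaptiveThreshold) -> Tuple[str, Optional[str]]:
    """Return (status, text). ASR is only called for voiced, accepted audio."""
    energies = frame_energies(samples)
    if not has_voice(energies):
        return ("no_voice", None)   # audio is not provided to ASR
    if not gate.accept(quality_score(energies)):
        return ("rejected", None)   # lowest-quality messages are rejected
    return ("transcribed", asr(samples))
```

In this sketch a silent recording returns `("no_voice", None)` without the `asr` callable ever being invoked, which mirrors the "does not provide the audio message to the ASR component" limitation. A production system would replace the energy test with a real voice activity detector and combine noise estimates with drop-out detection in the quality measure.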
Specification