Mass-scale, user-independent, device-independent voice messaging system

US 8,976,944 B2
Filed: 10/31/2007
Issued: 03/10/2015
Est. Priority Date: 02/10/2006
Status: Active Grant

4 Assignments

0 Petitions

Abstract

A mass-scale, user-independent, device-independent, voice messaging system that converts unstructured voice messages into text for display on a screen is disclosed. The system comprises (i) computer implemented sub-systems and also (ii) a network connection to human operators providing transcription and quality control; the system being adapted to optimize the effectiveness of the human operators by further comprising 3 core sub-systems, namely (i) a pre-processing front end that determines an appropriate conversion strategy; (ii) one or more conversion resources; and (iii) a quality control sub-system.
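
The three core sub-systems named in the abstract can be pictured with a minimal sketch. Everything in it is an illustrative assumption rather than the patented implementation: the `Message` features, the thresholds, the stub `asr_resource` and `human_resource` functions, and the rule that a low-confidence result is escalated to a human operator are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Message:
    audio_id: str
    duration_s: float
    noise_level: float  # hypothetical feature: 0.0 (clean) to 1.0 (very noisy)


@dataclass
class Conversion:
    text: str
    confidence: float
    resource: str  # which conversion resource produced the text


def choose_strategy(msg: Message) -> str:
    """Pre-processing front end: pick a conversion strategy (assumed rule)."""
    if msg.noise_level > 0.6 or msg.duration_s > 120:
        return "human"
    return "asr"


def asr_resource(msg: Message) -> Conversion:
    # Stub standing in for an automatic recognizer; confidence is made up.
    return Conversion(f"<asr transcript of {msg.audio_id}>", 0.9 - msg.noise_level, "asr")


def human_resource(msg: Message) -> Conversion:
    # Stub standing in for the network connection to a human operator.
    return Conversion(f"<operator transcript of {msg.audio_id}>", 0.99, "human")


RESOURCES = {"asr": asr_resource, "human": human_resource}


def quality_control(msg: Message, result: Conversion, threshold: float = 0.7) -> Conversion:
    """Quality control: escalate low-confidence automatic output (assumed policy)."""
    if result.resource != "human" and result.confidence < threshold:
        return RESOURCES["human"](msg)
    return result


def convert(msg: Message) -> Conversion:
    strategy = choose_strategy(msg)
    return quality_control(msg, RESOURCES[strategy](msg))
```

In this sketch a clean, short message stays with the automatic resource, while a noisy one is either routed to an operator up front or escalated by the quality-control step.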

129 Citations

16 Claims

1. A voice messaging system for converting an audio voice message from a caller to a recipient to text, the voice messaging system comprising:
- an automatic speech recognition (ASR) system to automatically recognize at least some of the audio voice message, the ASR system comprising:
  
  a plurality of ASR components, each specially configured to recognize a respective type of content; and
  
  a computer implemented boundary selection sub-system to process the audio voice message to identify at least one portion of the audio voice message which contains content of the type for which one of the plurality of ASR components is specially configured, wherein the identified at least one portion of the audio message is sent to the one of the plurality of ASR components specially configured for the type of content identified in the at least one portion to be automatically recognized.
- Dependent claims (2, 3, 4, 5, 6)
  - 2. The system of claim 1 in which the computer implemented boundary selection sub-system identifies a portion of the audio message that contains a telephone number, and wherein the portion is sent to an ASR component specially configured to automatically recognize telephone numbers.
  - 3. The system of claim 1 in which the computer implemented boundary selection sub-system identifies a portion of the audio message that contains an appointment, and wherein the portion is sent to an ASR component specially configured to automatically recognize appointments.
  - 4. The system of claim 1 in which the computer implemented boundary selection sub-system identifies a portion of the audio message that contains an address, and wherein the portion is sent to an ASR component specially configured to automatically recognize addresses.
  - 5. The system of claim 1, wherein the computer implemented boundary selection sub-system detects at least one of a greeting portion, a body portion and a tail portion of the audio voice message.
  - 6. The system of claim 1, wherein the computer implemented boundary selection sub-system identifies a portion of the audio message that contains a real noun, and wherein the portion is sent to an ASR component specially configured to automatically recognize real nouns.
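
The routing idea in claims 1 and 2 (find the portion holding a particular content type, then hand it to the component built for that type) can be sketched as follows. This is a toy under stated assumptions, not the claimed system: a rough first-pass transcript stands in for the audio, a run of seven or more spoken digits is assumed to be a telephone number, and `Portion`, `select_boundaries`, `COMPONENTS` and the two stub recognizers are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List

DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}
# Assumption: seven or more consecutive spoken digits mark a telephone number.
DIGIT_RUN = re.compile(r"(?:\b(?:%s)\b\s*){7,}" % "|".join(DIGIT_WORDS))


@dataclass
class Portion:
    content_type: str  # "phone_number" or "general"
    rough_text: str    # stand-in for the audio samples of this portion


def select_boundaries(rough_transcript: str) -> List[Portion]:
    """Toy boundary selection: split the message around digit runs."""
    portions: List[Portion] = []
    pos = 0
    for match in DIGIT_RUN.finditer(rough_transcript):
        before = rough_transcript[pos:match.start()].strip()
        if before:
            portions.append(Portion("general", before))
        portions.append(Portion("phone_number", match.group().strip()))
        pos = match.end()
    tail = rough_transcript[pos:].strip()
    if tail:
        portions.append(Portion("general", tail))
    return portions


def recognize_phone(portion: Portion) -> str:
    # Stub for a component specially configured for telephone numbers.
    return "".join(DIGIT_WORDS[word] for word in portion.rough_text.split())


def recognize_general(portion: Portion) -> str:
    # Stub for a general-purpose component.
    return portion.rough_text


COMPONENTS: Dict[str, Callable[[Portion], str]] = {
    "phone_number": recognize_phone,
    "general": recognize_general,
}


def route(portions: List[Portion]) -> List[str]:
    """Send each portion to the component registered for its content type."""
    return [COMPONENTS[p.content_type](p) for p in portions]
```

Supporting the content types of claims 3, 4 and 6 in this sketch would mean adding a detector for that type and registering one more entry in `COMPONENTS`.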

7. A method for converting an audio voice message from a caller to a recipient to text, the method comprising:
- processing the audio voice message to identify at least one portion of the audio voice message which contains content of a type for which one of a plurality of ASR components is specially configured to automatically recognize;
  
  sending the identified at least one portion of the audio voice message to the one of the plurality of ASR components specially configured for the type of content identified in the at least one portion of the audio voice message to be automatically recognized;
  
  receiving, from the ASR component to which the at least one portion of the audio voice message was sent, a text portion corresponding to the automatic recognition of the at least one portion of the audio;
  
  assembling the text portion into the text; and
  
  outputting the text to the recipient.
- Dependent claims (9, 10, 11, 12)
  - 9. The method of claim 7, comprising identifying a portion of the audio message that contains a telephone number, and sending the portion to an ASR component specially configured to automatically recognize telephone numbers.
  - 10. The method of claim 7, comprising identifying a portion of the audio message that contains an appointment, and sending the portion to an ASR component specially configured to automatically recognize appointments.
  - 11. The method of claim 7, comprising identifying a portion of the audio message that contains an address, and sending the portion to an ASR component specially configured to automatically recognize addresses.
  - 12. The method of claim 7, comprising identifying a portion of the audio message that contains a real noun, and sending the portion to an ASR component specially configured to automatically recognize real nouns.
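
The five steps of claim 7 (identify, send, receive, assemble, output) map naturally onto a small driver function. The sketch below is one possible reading, not the patented method: the timed `Portion` record, the injected `identify`, `recognizers` and `deliver` callables, and the choice to assemble text in start-time order are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Portion:
    start_s: float
    end_s: float
    content_type: str
    audio: bytes


def convert_message(
    audio: bytes,
    identify: Callable[[bytes], List[Portion]],
    recognizers: Dict[str, Callable[[bytes], str]],
    deliver: Callable[[str], None],
) -> str:
    # Step 1: identify portions and the content type each one contains.
    portions = identify(audio)

    timed_texts: List[Tuple[float, str]] = []
    for portion in portions:
        # Step 2: send the portion to the component configured for its type.
        recognizer = recognizers[portion.content_type]
        # Step 3: receive the text portion back from that component.
        timed_texts.append((portion.start_s, recognizer(portion.audio)))

    # Step 4: assemble the text portions (assumed here: in time order).
    text = " ".join(t for _, t in sorted(timed_texts))

    # Step 5: output the text to the recipient.
    deliver(text)
    return text
```

A usage example with stand-in callables, where the bytes simply carry the expected text:

```python
def fake_identify(audio: bytes) -> List[Portion]:
    # Deliberately returned out of order to exercise the assembly step.
    return [
        Portion(2.0, 5.0, "phone_number", b"5550100"),
        Portion(0.0, 2.0, "general", b"call me back on"),
    ]

outbox: List[str] = []
stub_recognizers = {"general": bytes.decode, "phone_number": bytes.decode}
convert_message(b"", fake_identify, stub_recognizers, outbox.append)
```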

8. At least one non-transitory computer readable storage device for storing instructions that, when executed on at least one computer, cause the at least one computer to perform a method for converting an audio voice message from a caller to a recipient to text, the method comprising:
- processing the audio voice message to identify at least one portion of the audio voice message which contains content of a type for which one of a plurality of ASR components is specially configured to automatically recognize;
  
  sending the identified at least one portion of the audio voice message to the one of the plurality of ASR components specially configured for the type of content identified in the at least one portion of the audio voice message to be automatically recognized;
  
  receiving, from the ASR component to which the at least one portion of the audio voice message was sent, a text portion corresponding to the automatic recognition of the at least one portion of the audio;
  
  assembling the text portion into the text; and
  
  outputting the text to the recipient.
- Dependent claims (13, 14, 15, 16)
  - 13. The at least one non-transitory computer readable storage device of claim 8, comprising identifying a portion of the audio message that contains a telephone number, and sending the portion to an ASR component specially configured to automatically recognize telephone numbers.
  - 14. The at least one non-transitory computer readable storage device of claim 8, comprising identifying a portion of the audio message that contains an appointment, and sending the portion to an ASR component specially configured to automatically recognize appointments.
  - 15. The at least one non-transitory computer readable storage device of claim 8, comprising identifying a portion of the audio message that contains an address, and sending the portion to an ASR component specially configured to automatically recognize addresses.
  - 16. The at least one non-transitory computer readable storage device of claim 8, comprising identifying a portion of the audio message that contains a real noun, and sending the portion to an ASR component specially configured to automatically recognize real nouns.

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Inventors: Doulton, Daniel Michael
Primary Examiner(s): Tsang, Fan
Assistant Examiner(s): Huynh, Van D
Application Number: US11/931,736
Publication Number: US 20080049908A1
Time in Patent Office: 2,687 Days
Field of Search: 704/3, 704/4, 704/246, 704/251, 704/235, 379/88.18, 379/88.17, 379/88.22, 379/88.05, 379/265.12, 379/88.14, 369/25.01
US Class Current: 379/88.14
CPC Class Codes:
G10L 15/26   Speech to text systems G10L...
H04M 2201/60   Medium conversion
H04M 3/4936   Speech interaction details ...
H04M 3/5183   Call or contact centers wit...
H04M 3/53333   Message receiving aspects
