Method of and system for improving accuracy in a speech recognition system

US 7,624,010 B1
Filed: 07/31/2001
Issued: 11/24/2009
Est. Priority Date: 07/31/2000
Status: Expired due to Term

Abstract

A method for transcribing an audio response includes:

- A. constructing an application including a plurality of queries and a set of expected responses for each query, the set including a plurality of expected responses to each query in a textual form;
- B. posing each of the queries to a respondent with a querying device;
- C. receiving an audio response to each query from the respondent;
- D. performing a speech recognition function on each audio response with an automatic speech recognition device to transcribe each audio response to a textual response to each query;
- E. recording each audio response with a recording device; and
- F. comparing, with the automatic speech recognition device, each textual response to the set of expected responses for each corresponding query to determine if each textual response corresponds to any of the expected responses in the set of expected responses for the corresponding query.

The method includes flagging each audio response corresponding to a textual response that does not correspond to one of the expected responses in the set of expected responses to the corresponding query, reviewing each flagged audio response to determine if a corresponding expected response is included in the set of expected responses for the query associated with each audio response, and entering a text response if no such match exists.
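
The comparison and flagging steps of the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `Response`, `normalize`, and `flag_unmatched` are hypothetical, the transcription is assumed to be available already as a string, and matching is reduced to a case-insensitive exact comparison.

```python
from dataclasses import dataclass


@dataclass
class Response:
    """One respondent answer (hypothetical structure, for illustration only)."""
    query_id: str
    text: str          # textual response produced by the recognizer
    audio_path: str    # where the recorded audio response is stored
    flagged: bool = False


def normalize(text: str) -> str:
    """Lower-case and collapse whitespace so trivial differences do not matter."""
    return " ".join(text.lower().split())


def flag_unmatched(responses, expected_by_query):
    """Flag every response whose text is not in the expected set for its query."""
    for r in responses:
        expected = {normalize(e) for e in expected_by_query.get(r.query_id, [])}
        r.flagged = normalize(r.text) not in expected
    # Only the flagged responses go on to human review.
    return [r for r in responses if r.flagged]


expected_by_query = {
    "q1": ["yes", "no"],
    "q2": ["morning", "afternoon", "evening"],
}
responses = [
    Response("q1", "Yes", "audio/q1.wav"),
    Response("q2", "after new", "audio/q2.wav"),  # misrecognized answer
]
for_review = flag_unmatched(responses, expected_by_query)
print([r.audio_path for r in for_review])  # prints ['audio/q2.wav']
```

Because the audio is recorded alongside the transcription, a flagged item still carries everything a reviewer needs: the stored audio and the text the recognizer produced.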

14 Claims

1. A speech recognition system comprising:
- a querying device for posing at least one query over a telephone to a telephone respondent;
- a speech recognition device that is configured and arranged to receive an audio response from said respondent over the telephone and to conduct a speaker-independent speech recognition analysis of said audio response to automatically produce a corresponding text response;
- a storage device for recording and storing said audio response as it is received by said speech recognition device;
- an accuracy determination device for automatically comparing said text response to a text set of expected responses and determining whether said text response corresponds to one of said expected responses, wherein said accuracy determination device is configured and arranged to determine whether said text response corresponds to one of said expected responses within a predetermined accuracy confidence parameter and to automatically flag said audio response so as to produce a flagged audio response for further review by a human operator, wherein the human operator is different from the telephone respondent, when said text response does not correspond to one of said expected responses within said predetermined accuracy confidence parameter; and
- a human interface device for enabling said human operator to hear said flagged audio response and review the corresponding text response for the flagged audio response to determine the actual text response for the flagged audio response, either by selecting from a pre-determined list of text responses or typing the actual text response if no such match exists in the pre-determined list of text responses.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10):
- - 2. The speech recognition system of claim 1, wherein said human interface device comprises a personal computer including a monitor for enabling the human operator to view said text responses and an audio speaker device for enabling the operator to listen to said flagged audio responses.
  - 3. The speech recognition system of claim 2, wherein said querying device includes a program having an application file, said application file including code which causes the at least one query to be posed to the respondent, a list of expected responses and an address at which a file containing the received audio response will be stored in the storage device.
  - 4. The speech recognition system of claim 1, wherein said human interface device includes a graphical user interface on which the human operator views said text set of expected responses, wherein after listening to said audio response, the human operator is able to select one of said expected responses from said text set of expected responses if the human operator determines that the response corresponds to one of said expected responses.
  - 5. The speech recognition system of claim 4, wherein said graphical user interface comprises an application navigation window for enabling the human operator to navigate through said text set of expected responses, and an audio navigation window for enabling the human operator to control playback of said audio response.
  - 6. The speech recognition system of claim 1, wherein said querying device includes a program having an application file, said application file including code which causes the at least one query to be posed to the respondent, a list of expected responses and an address at which a file containing the received audio response will be stored in the storage device.
  - 7. The speech recognition system of claim 6, wherein said human interface device includes a graphical user interface on which the human operator views said text set of expected responses, wherein after listening to said audio response, the human operator is able to select one of said expected responses from said text set of expected responses.
  - 8. The speech recognition system of claim 7, wherein said graphical user interface includes a text entry window which enables the human operator to enter a text response if none of said expected responses from said text set of expected responses corresponds to said audio response.
  - 9. The speech recognition system of claim 7 wherein said graphical user interface comprises an application navigation window for enabling the human operator to navigate through said text set of expected responses, and an audio navigation window for enabling the human operator to control playback of said audio response.
  - 10. The speech recognition system of claim 9, wherein said graphical user interface includes a text entry window which enables the human operator to enter a text response if none of said expected responses from said text set of expected responses corresponds to said audio response.
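
Claims 3 and 6 describe an application file that holds the query, a list of expected responses, and an address at which the received audio will be stored. One way to picture that structure is the sketch below. The class names `QueryEntry` and `ApplicationFile`, the `add_query` helper, the sample prompts, and the path scheme are all hypothetical; the claims do not specify a file format or naming convention.

```python
from dataclasses import dataclass, field


@dataclass
class QueryEntry:
    """One query with its expected responses and audio storage address."""
    prompt: str
    expected_responses: list
    audio_address: str


@dataclass
class ApplicationFile:
    """Hypothetical in-memory stand-in for the application file of claims 3 and 6."""
    name: str
    queries: list = field(default_factory=list)

    def add_query(self, prompt, expected_responses, audio_dir="recordings"):
        # Assumed naming scheme: one audio file per query, numbered from 1.
        index = len(self.queries) + 1
        address = f"{audio_dir}/{self.name}_q{index}.wav"
        entry = QueryEntry(prompt, list(expected_responses), address)
        self.queries.append(entry)
        return entry


app = ApplicationFile("survey")
app.add_query("Did you receive your statement?", ["yes", "no"])
entry = app.add_query(
    "What time of day is best to call?", ["morning", "afternoon", "evening"]
)
print(entry.audio_address)  # prints recordings/survey_q2.wav
```

Keeping the expected responses and the audio address on the same record means a review interface can show the candidate answers and play back the recording from a single lookup.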

11. A method of transcribing an audio response comprising:
- A. posing a query over a telephone to a telephone respondent;
- B. receiving an audio response from said respondent over the telephone;
- C. performing a speaker-independent speech recognition function on said audio response to automatically convert said audio response to a textual response;
- D. recording said audio response;
- E. comparing said textual response to a set of expected responses to said query, said set including a plurality of expected responses to said query in a textual form; and
- F. flagging said audio response so as to produce a flagged audio response for further review by a human operator if the corresponding textual response does not correspond to one of said expected responses in said set of expected responses within a predetermined accuracy confidence parameter;
- G. a human operator listening to the actual audio response corresponding to said flagged audio response, wherein the human operator is different than the telephone respondent; and
- H. a human operator determining if one of said expected responses corresponds to said actual audio response, wherein the human operator is different than the telephone respondent; and
- I. if such determination of step H. is in the affirmative, selecting, from said set of expected responses, a textual response that corresponds to said audio response.
- Dependent claims (12):
- - 12. The method of claim 11, further comprising:
    - J. manually transcribing a textual response that corresponds to said audio response if such determination of step H is negative.
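
The operator's decision in steps H through J of claims 11 and 12 amounts to a two-way choice: pick an expected response, or type a transcription when none fits. The function below is a rough sketch of that choice only. `resolve_flagged` and its parameters are hypothetical names, and the listening step itself (step G) is outside the code.

```python
def resolve_flagged(expected_responses, selected_index=None, typed_text=None):
    """Return the final text for one flagged audio response.

    selected_index: position the operator picked in the expected list (step I).
    typed_text: manual transcription used when nothing in the list fits (step J).
    """
    if selected_index is not None:
        return expected_responses[selected_index]
    if typed_text is not None and typed_text.strip():
        return typed_text.strip()
    raise ValueError("operator must select an expected response or type one")


options = ["morning", "afternoon", "evening"]
print(resolve_flagged(options, selected_index=1))               # prints afternoon
print(resolve_flagged(options, typed_text=" any time after six "))  # prints any time after six
```

Raising an error when neither input is supplied keeps a flagged response from being closed without a final text.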

13. A method of transcribing an audio response comprising:
- A. constructing a speaker-independent speech recognition application including a plurality of queries and a set of expected responses for each query, said set including a plurality of expected responses to each query in a textual form;
- B. posing each of said queries to a telephone respondent over the telephone;
- C. receiving an audio response to each query over the telephone from said respondent;
- D. performing a speaker-independent speech recognition function on each said audio response to automatically convert each said audio response to a textual response to each query;
- E. recording and storing each audio response;
- F. automatically comparing each textual response to said set of expected responses for each corresponding query to determine if each textual response corresponds to any of said expected responses in said set of expected responses for the corresponding query;
- G. flagging an audio response so as to produce a flagged audio response for further review by a human operator if the corresponding textual response does not correspond to one of said expected responses in said set of expected responses within a predetermined accuracy confidence parameter as determined by said speaker-independent speech recognition analysis;
- H. a human operator listening to the actual audio response corresponding to said flagged audio response, wherein the human operator is different than the telephone respondent;
- I. a human operator determining if one of said expected responses corresponds to said actual audio response, wherein the human operator is different than the telephone respondent; and
- J. if such determination of step I. is in the affirmative, the human operator selecting, from said set of expected responses, a textual response that corresponds to said audio response, and flagging each audio response that does not correspond to one of said expected responses in said set of expected responses to the corresponding query.
- Dependent claims (14):
- - 14. The method of claim 13, further comprising manually transcribing a textual response that corresponds to each flagged audio response if such determination of step J is negative.
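
Step G of claim 13 ties flagging to a predetermined accuracy confidence parameter determined by the recognition analysis. A simple reading is sketched below, on the assumption that the recognizer reports a confidence score between 0 and 1 with each transcription. The `Recognition` class, the `needs_review` function, and the 0.8 threshold are hypothetical; the claims do not state a scale or a value.

```python
from dataclasses import dataclass


@dataclass
class Recognition:
    """Recognizer output (hypothetical shape; the score scale is an assumption)."""
    text: str
    confidence: float  # assumed to lie in [0, 1]


def needs_review(result, expected_responses, threshold=0.8):
    """Flag unless the text matches an expected response with enough confidence."""
    expected = {e.lower() for e in expected_responses}
    matches = result.text.strip().lower() in expected
    return not (matches and result.confidence >= threshold)


expected = ["yes", "no"]
print(needs_review(Recognition("yes", 0.95), expected))    # prints False
print(needs_review(Recognition("yes", 0.55), expected))    # prints True
print(needs_review(Recognition("maybe", 0.99), expected))  # prints True
```

Under this reading a response is sent to the operator in two cases: the text is outside the expected set, or the text is inside it but the recognizer was not confident enough.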

Current Assignee: Eliza Corporation (Gainwell Technologies LLC)
Original Assignee: Eliza Corporation (Gainwell Technologies LLC)
Inventors: Kroeker, John; Boulanov, Oleg
Primary Examiner(s): Opsasnick, Michael N
Application Number: US09/918,733
Time in Patent Office: 3,038 Days
Field of Search: 704/235, 704/260, 704/257, 704/270, 704/275, 379/88.01
US Class Current: 704/235
CPC Class Codes: G10L 15/22 Procedures used during a sp...
