Speech and signal digitization by using recognition metrics to select from multiple techniques

US 7,016,835 B2
Filed: 12/19/2002
Issued: 03/21/2006
Est. Priority Date: 10/29/1999
Status: Expired due to Term

Abstract

A characteristic-specific digitization method and apparatus are disclosed that reduce the error rate in converting input information into a computer-readable format. The input information is analyzed, and subsets of the input information are classified according to whether they exhibit a specific physical parameter affecting recognition accuracy. If a subset exhibits that parameter, the characteristic-specific digitization system recognizes it using a characteristic-specific recognizer that demonstrates improved performance for the given physical parameter. Otherwise, the system recognizes it using a general recognizer that performs well for typical input information. In one implementation, input speech having very low recognition accuracy as a result of a physical speech characteristic is automatically identified and recognized using a characteristic-specific speech recognizer.
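
The routing described in the abstract can be illustrated with a short sketch. It is only an illustration under stated assumptions, not the patented implementation: the `Segment` type, the precomputed `has_characteristic` flag, and both stand-in recognizer functions are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Segment:
    data: str                 # stand-in for the raw input (e.g. an audio chunk)
    has_characteristic: bool  # assumed to be decided by an earlier analysis step


Recognizer = Callable[[str], str]


def digitize(segments: List[Segment],
             general: Recognizer,
             specific: Recognizer) -> List[str]:
    """Send each segment to the recognizer suited to it."""
    outputs = []
    for seg in segments:
        # Segments showing the impairing characteristic go to the
        # characteristic-specific recognizer; all others go to the general one.
        recognizer = specific if seg.has_characteristic else general
        outputs.append(recognizer(seg.data))
    return outputs


# Toy recognizers (hypothetical) that only tag which path was taken.
def general_recognizer(data: str) -> str:
    return f"general({data})"


def specific_recognizer(data: str) -> str:
    return f"specific({data})"


result = digitize(
    [Segment("a", False), Segment("b", True)],
    general_recognizer,
    specific_recognizer,
)
```

In this toy run the first segment takes the general path and the second takes the characteristic-specific path.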

30 Claims

1. A system for digitizing input information from a first information type to a digital format, comprising:
- a memory that stores computer readable code; and
  
  a processor operatively coupled to said memory, said processor configured to:
  
  analyze said input information to determine if at least one of at least first and second characteristics is present, wherein said processor is further configured to:
  
  recognize said input information with a plurality of prioritized recognition methods that exhibit varying degrees of improved recognition for a certain characteristic of the input information; and
  
  determine whether a general recognizer or an intermediate recognizer produces a script closer to a reference output;
  
  select output information recognized with a first recognition method if said analyzing step determines that said input information includes said first characteristic; and
  
  select output information recognized with a second recognition method if said analyzing step determines that said input information includes said second characteristic.

2. A method for digitizing input information from a first information type to a digital format, said method comprising the steps of:
- analyzing said input information to determine if at least one of at least first and second characteristics is present, wherein said analyzing step further comprises the steps of:
  
  recognizing said input information with a plurality of prioritized recognition methods that exhibit varying degrees of improved recognition for a certain characteristic of the input information; and
  
  determining whether a general recognizer or an intermediate recognizer produces a script closer to a reference output;
  
  selecting output information recognized with a first recognition method if said analyzing step determines that said input information includes said first characteristic; and
  
  selecting output information recognized with a second recognition method if said analyzing step determines that said input information includes said second characteristic.
- Dependent claims (3, 4, 5, 6, 7, 8):
  - 3. The method of claim 2, wherein said reference output is produced by a characteristic-specific recognizer.
  - 4. The method of claim 2, wherein said reference output is obtained using a voting method.
  - 5. The method of claim 2, wherein said input information type is speech.
  - 6. The method of claim 2, wherein said input information type is handwriting.
  - 7. The method of claim 2, wherein said input information type is printed text.
  - 8. The method of claim 2, wherein said input information type is pictures.
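
Claims 2 through 4 turn on two operations: obtaining a reference output (by a voting method in claim 4) and deciding which recognizer's script is closer to it. The sketch below is one possible reading, not the method disclosed in the specification. It assumes word-level edit distance as the closeness measure and a position-wise majority vote over scripts that are already aligned to equal length; all function names are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def word_edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance counted in whole words."""
    prev = list(range(len(b) + 1))
    for i, word_a in enumerate(a, start=1):
        curr = [i]
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def vote_reference(scripts: List[List[str]]) -> List[str]:
    """Position-wise majority vote (assumes pre-aligned, equal-length scripts)."""
    if len({len(s) for s in scripts}) != 1:
        raise ValueError("scripts must be aligned to equal length")
    return [Counter(words).most_common(1)[0][0] for words in zip(*scripts)]


def closer_to_reference(general: Sequence[str],
                        intermediate: Sequence[str],
                        reference: Sequence[str]) -> str:
    """Name the recognizer whose script is closer; ties go to the general one."""
    d_general = word_edit_distance(general, reference)
    d_intermediate = word_edit_distance(intermediate, reference)
    return "general" if d_general <= d_intermediate else "intermediate"


general = "the cat sat on a mat".split()
intermediate = "the cat sat on the mat".split()
specific = "the cat sat on the mat".split()

reference = vote_reference([general, intermediate, specific])
winner = closer_to_reference(general, intermediate, reference)
```

A production system would need a real alignment step before voting, since recognizers rarely emit scripts of equal length; the equal-length check here only makes that assumption explicit.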

9. A method for digitizing input information from a first information type to a digital format, said method comprising the steps of:
- identifying portions of said input information having a characteristic that impairs recognition accuracy with a plurality of prioritized recognition methods that exhibit varying degrees of improved recognition for said characteristic and determining whether a general recognizer or an intermediate recognizer produces a script closer to a reference output; and
  
  recognizing said input information portions with a recognizer that exhibits improved performance for said characteristic.

10. A system for digitizing input information from a first information type to a digital format, comprising:
- a memory that stores computer readable code; and
  
  a processor operatively coupled to said memory, said processor configured to:
  
  identify portions of said input information having a characteristic that impairs recognition accuracy with a plurality of prioritized recognition methods that exhibit varying degrees of improved recognition for said characteristic and determining whether a general recognizer or an intermediate recognizer produces a script closer to a reference script; and
  
  recognize said input information portions with a recognizer that exhibits improved performance for said characteristic.

11. A method for transcribing speech, said method comprising the steps of:
- analyzing a speech sample to determine if at least one of at least first and second types of speech is present, wherein said analyzing step further comprises the steps of:
  
  recognizing said speech with a plurality of prioritized speech recognition methods that exhibit varying degrees of improved speech recognition for a certain characteristic of the input speech; and
  
  determining whether a general speech recognizer or an intermediate speech recognizer produces a script closer to a reference script;
  
  selecting speech recognized with a first speech recognition method if said analyzing step determines that said speech sample includes said first type of speech; and
  
  selecting speech recognized with a second speech recognition method if said analyzing step determines that said speech sample includes said second type of speech.
- Dependent claims (12, 13, 14, 15):
  - 12. The method of claim 11, further comprising the step of generating an index of said speech samples with said first type of speech and an index of said speech samples with said second type of speech.
  - 13. The method of claim 12, wherein said index is obtained using a voting method.
  - 14. The method of claim 11, wherein said first type of speech is fast rate speech and said second type of speech is normal rate speech.
  - 15. The method of claim 11, wherein said first type of speech is speech with background noise and said second type of speech is speech without background noise.
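
The index generation of claim 12 can be pictured as grouping already-labeled speech samples by type. The following is a minimal sketch under assumptions: each sample is represented by hypothetical start and end times in seconds, and the type labels (here "fast" and "normal", following claim 14) are taken as given rather than computed.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (start_seconds, end_seconds, speech_type); the layout is an assumption.
LabeledSample = Tuple[float, float, str]


def build_indexes(samples: List[LabeledSample]) -> Dict[str, List[Tuple[float, float]]]:
    """Build one index per speech type, each listing the sample time spans."""
    indexes: Dict[str, List[Tuple[float, float]]] = defaultdict(list)
    for start, end, speech_type in samples:
        indexes[speech_type].append((start, end))
    return dict(indexes)


indexes = build_indexes([
    (0.0, 2.5, "normal"),
    (2.5, 4.0, "fast"),
    (4.0, 7.0, "normal"),
])
```

Each index can then be used to pull the matching spans out of the recording, for example to send only the "fast" spans to a recognizer tuned for fast speech.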

16. A system for transcribing speech, comprising:
- a memory that stores computer readable code; and
  
  a processor operatively coupled to said memory, said processor configured to:
  
  analyze a speech sample to determine if at least one of at least first and second types of speech is present, wherein said processor is further configured to:
  
  recognize said speech with a plurality of prioritized speech recognition methods that exhibit varying degrees of improved speech recognition for a certain characteristic of the input speech; and
  
  determine whether a general speech recognizer or an intermediate speech recognizer produces a script closer to a reference script;
  
  select speech recognized with a first speech recognition method if said analyzing step determines that said speech sample includes said first type of speech; and
  
  select speech recognized with a second speech recognition method if said analyzing step determines that said speech sample includes said second type of speech.

17. A method for transcribing speech, said method comprising the steps of:
- identifying portions of said speech having a characteristic that impairs speech recognition accuracy with a plurality of prioritized speech recognition methods that exhibit varying degrees of improved speech recognition for a certain characteristic of the input speech and determining whether a general speech recognizer or an intermediate speech recognizer produces a script closer to a reference script; and
  
  recognizing said identified speech portions with a speech recognizer that exhibits improved performance for said characteristic.
- Dependent claims (18, 19, 20):
  - 18. The method of claim 17, further comprising the step of generating an index of said speech having said characteristic that impairs speech recognition accuracy and an index of said speech not having said characteristic that impairs speech recognition accuracy.
  - 19. The method of claim 17, wherein said characteristic that impairs speech recognition accuracy is speech with background noise.
  - 20. The method of claim 17, wherein said characteristic that impairs speech recognition accuracy is a speech rate that is faster than a normal speech rate.

21. A system for transcribing speech, comprising:
- a memory that stores computer readable code; and
  
  a processor operatively coupled to said memory, said processor configured to:
  
  identify portions of said speech having a characteristic that impairs speech recognition accuracy with a plurality of prioritized speech recognition methods that exhibit varying degrees of improved speech recognition for a certain characteristic of the input speech and determining whether a general speech recognizer or an intermediate speech recognizer produces a script closer to a reference script; and
  
  recognize said identified speech portions with a speech recognizer that exhibits improved performance for said characteristic.

22. A method for transcribing speech, said method comprising the steps of:
- recognizing a speech sample with at least three speech recognition methods that are prioritized according to performance for a given characteristic of input speech, at least one of said speech recognition methods producing a reference script;
  
  comparing said recognized speech samples with said reference script to identify portions of said speech having a characteristic that impairs speech recognition accuracy; and
  
  selecting speech portions recognized with a speech recognizer that exhibits improved performance for said characteristic relative to a general speech recognizer.
- Dependent claims (23, 24, 25):
  - 23. The method of claim 22, further comprising the step of generating an index of said speech having said characteristic that impairs speech recognition accuracy and an index of said speech not having said characteristic that impairs speech recognition accuracy.
  - 24. The method of claim 22, wherein said characteristic that impairs speech recognition accuracy is a speech rate that is faster than a normal speech rate.
  - 25. The method of claim 22, wherein said characteristic that impairs speech recognition accuracy is speech with background noise.
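
The three-recognizer comparison of claim 22 can be sketched as follows. This is one interpretation offered for illustration, not the procedure in the specification. It assumes the characteristic-specific recognizer supplies the reference script, that a portion is flagged as impaired when the intermediate recognizer's script is more similar to the reference than the general recognizer's script is, and that similarity is measured with Python's `difflib.SequenceMatcher` over word lists. The function names and the per-portion data layout are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

Script = List[str]  # one recognized portion, as a list of words


def similarity(a: Script, b: Script) -> float:
    """Similarity ratio in [0, 1] between two word sequences."""
    return SequenceMatcher(None, a, b).ratio()


def select_portions(general: List[Script],
                    intermediate: List[Script],
                    specific: List[Script]) -> Tuple[List[int], List[Script]]:
    """Flag impaired portions and choose an output script for each portion.

    The three lists hold the scripts for the same portions, in the same order.
    """
    flagged: List[int] = []
    output: List[Script] = []
    for i, (g, m, ref) in enumerate(zip(general, intermediate, specific)):
        # Assumed test: the intermediate recognizer beating the general one
        # against the reference suggests the impairing characteristic.
        impaired = similarity(m, ref) > similarity(g, ref)
        if impaired:
            flagged.append(i)
        output.append(ref if impaired else g)
    return flagged, output


general_scripts = [["hello", "world"], ["please", "call", "me"]]
intermediate_scripts = [["hello", "world"],
                        ["please", "call", "me", "back", "soon"]]
specific_scripts = [["hello", "world"],
                    ["please", "call", "me", "back", "soon", "thanks"]]

flagged, output = select_portions(general_scripts,
                                  intermediate_scripts,
                                  specific_scripts)
```

In this toy input all three recognizers agree on the first portion, so the general script is kept; on the second portion the general recognizer drops words, so the portion is flagged and the characteristic-specific script is selected.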

26. A system for transcribing speech, comprising:
- a memory that stores computer readable code; and
  
  a processor operatively coupled to said memory, said processor configured to:
  
  recognize a speech sample with at least three speech recognition methods that are prioritized according to performance for a given characteristic of input speech, at least one of said speech recognition methods producing a reference script;
  
  compare said recognized speech samples with said reference script to identify portions of said speech having a characteristic that impairs speech recognition accuracy; and
  
  select speech portions recognized with a speech recognizer that exhibits improved performance for said characteristic relative to a general speech recognizer.

27. An article of manufacture for digitizing input information from a first information type to a digital format, comprising:
- a step to analyze said input information to determine if at least one of at least first and second types of information is present, wherein said step to analyze further comprises the steps of:
  
  a step to recognize said input information with a plurality of prioritized recognition methods that exhibit varying degrees of improved recognition for a certain characteristic of the input information; and
  
  a step to determine whether a general recognizer or an intermediate recognizer produces a script closer to a reference output;
  
  a step to select output information recognized with a first recognition method if said analyzing step determines that said input information includes said first characteristic; and
  
  a step to select output information recognized with a second recognition method if said analyzing step determines that said input information includes said second characteristic.

28. An article of manufacture for digitizing input information from a first information type to a digital format, comprising:
- a step to identify portions of said input information having a characteristic that impairs recognition accuracy with a plurality of prioritized recognition methods that exhibit varying degrees of improved recognition for said characteristic and determining whether a general recognizer or an intermediate recognizer produces a script closer to a reference script; and
  
  a step to recognize said input information portions with a recognizer that exhibits improved performance for said characteristic.

29. An article of manufacture for transcribing speech, comprising:
- a step to analyze a speech sample to determine if at least one of at least first and second types of speech is present, wherein said step to analyze further comprises the steps of:
  
  a step to recognize said speech with a plurality of prioritized speech recognition methods that exhibit varying degrees of improved speech recognition for a certain characteristic of the input speech; and
  
  a step to determine whether a general speech recognizer or an intermediate speech recognizer produces a script closer to a reference script;
  
  a step to select speech recognized with a first speech recognition method if said analyzing step determines that said speech sample includes said first type of speech; and
  
  a step to select speech recognized with a second speech recognition method if said analyzing step determines that said speech sample includes said second type of speech.

30. An article of manufacture for transcribing speech, comprising:
- a step to identify portions of said speech having a characteristic that impairs speech recognition accuracy with a plurality of prioritized speech recognition methods that exhibit varying degrees of improved speech recognition for a certain characteristic of the input speech and determining whether a general speech recognizer or an intermediate speech recognizer produces a script closer to a reference script; and
  
  a step to recognize said identified speech portions with a speech recognizer that exhibits improved performance for said characteristic.

Current Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee: International Business Machines Corporation
Inventors: Kanevsky, Dimitri; Olsen, Peder Andreas; Eide, Ellen Marie; Gopinath, Ramesh Ambat
Primary Examiner(s): ARMSTRONG, ANGELA A
Application Number: US10/323,549
Publication Number: US 20030115053A1
Time in Patent Office: 1,188 Days
Field of Search: 704/229-233, 704/238-256, 704/271-275, 704/235, 704/236, 704/270, 382/229, 345/156, 341/50, 356/406, 379/406.08, 379/388.04
US Class Current: 704/231
CPC Class Codes:
G10L 15/32 Multiple recognisers used i...
G10L 17/26 Recognition of special voic...
