Methods and apparatus for improving automatic digitization techniques using recognition metrics

US 20030115053A1
Filed: 12/19/2002
Published: 06/19/2003
Est. Priority Date: 10/29/1999
Status: Active Grant


Abstract

A characteristic-specific digitization method and apparatus are disclosed that reduce the error rate in converting input information into a computer-readable format. The input information is analyzed and subsets of the input information are classified according to whether the input information exhibits a specific physical parameter affecting recognition accuracy. If the input information exhibits the specific physical parameter affecting recognition accuracy, the characteristic-specific digitization system recognizes the input information using a characteristic-specific recognizer that demonstrates improved performance for the given physical parameter. If the input information does not exhibit the specific physical parameter affecting recognition accuracy, the characteristic-specific digitization system recognizes the input information using a general recognizer that performs well for typical input information. In one implementation, input speech having very low recognition accuracy as a result of a physical speech characteristic is automatically identified and recognized using a characteristic-specific speech recognizer that demonstrates improved performance for the given speech characteristic.
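
The routing described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the names `digitize`, `has_characteristic`, `general`, and `specific` are hypothetical, and the recognizers are assumed to be plain callables that map an input segment to a transcript.

```python
from typing import Callable, List, Sequence

# Hypothetical type: a recognizer maps one input segment to a transcript.
Recognizer = Callable[[str], str]


def digitize(
    segments: Sequence[str],
    has_characteristic: Callable[[str], bool],
    general: Recognizer,
    specific: Recognizer,
) -> List[str]:
    """Send each segment to the characteristic-specific recognizer when the
    characteristic is detected, and to the general recognizer otherwise."""
    results = []
    for segment in segments:
        recognizer = specific if has_characteristic(segment) else general
        results.append(recognizer(segment))
    return results
```

How `has_characteristic` decides is left open here; the claims describe one way to make that decision by comparing recognizer outputs against a reference.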

69 Citations

34 Claims

1. A method for digitizing input information, said method comprising the steps of:
- analyzing said input information to determine if at least one of at least first and second types of information is present;
  
  recognizing said input information with a first recognition method if said input information includes said first type of information; and
  
  recognizing said input information with a second recognition method if said input information includes said second type of information.
  - 2. The method of claim 1, wherein said analyzing step further comprises the steps of recognizing said input information with a plurality of prioritized recognition methods that exhibit varying degrees of improved recognition for a certain characteristic of the input information and determining whether a general recognizer or an intermediate recognizer produces a script closer to a reference output.
  - 3. The method of claim 2, wherein said reference output is produced by a characteristic-specific recognizer.
  - 4. The method of claim 2, wherein said reference output is obtained using a voting method.
  - 5. The method of claim 1, wherein said input information consists of speech.
  - 6. The method of claim 1, wherein said input information consists of handwriting.
  - 7. The method of claim 1, wherein said input information consists of printed text.
  - 8. The method of claim 1, wherein said input information consists of pictures.
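
Claim 2 turns on deciding which of two scripts is "closer" to a reference output. The claims do not name a distance measure, so the sketch below assumes a word-level edit (Levenshtein) distance; the function names are hypothetical, and breaking ties in favor of the general recognizer is likewise an assumption.

```python
def word_edit_distance(hypothesis: str, reference: str) -> int:
    """Word-level Levenshtein distance between two scripts."""
    hyp, ref = hypothesis.split(), reference.split()
    # prev[j] holds the distance between the hypothesis words consumed
    # so far and the first j reference words.
    prev = list(range(len(ref) + 1))
    for i, hyp_word in enumerate(hyp, start=1):
        curr = [i]
        for j, ref_word in enumerate(ref, start=1):
            cost = 0 if hyp_word == ref_word else 1
            curr.append(min(prev[j] + 1,          # drop a hypothesis word
                            curr[j - 1] + 1,      # add a reference word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def closer_recognizer(general_script: str,
                      intermediate_script: str,
                      reference_script: str) -> str:
    """Return which script is closer to the reference (ties go to 'general')."""
    d_general = word_edit_distance(general_script, reference_script)
    d_intermediate = word_edit_distance(intermediate_script, reference_script)
    return "intermediate" if d_intermediate < d_general else "general"
```

Under one plausible reading of claims 2 and 3, an input for which the intermediate recognizer lands closer to the characteristic-specific reference is a candidate for the first type of information.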

9. A method for digitizing input information, said method comprising the steps of:
- identifying portions of said input information having a characteristic that impairs recognition accuracy; and
  
  recognizing said input information portions with a recognizer that exhibits improved performance for said characteristic.
  - 10. The method of claim 9, wherein said identifying step further comprises the steps of recognizing said input information with a plurality of prioritized recognition methods that exhibit varying degrees of improved recognition for said characteristic and determining whether a general recognizer or an intermediate recognizer produces a script closer to a reference output.

11. A method for transcribing speech, said method comprising the steps of:
- analyzing a speech sample to determine if at least one of at least first and second types of speech is present;
  
  recognizing said speech sample with a first speech recognition method if said speech sample includes said first type of speech; and
  
  recognizing said speech sample with a second speech recognition method if said speech sample includes said second type of speech.
  - 12. The method of claim 11, wherein said analyzing step further comprises the steps of recognizing said speech with a plurality of prioritized speech recognition methods that exhibit varying degrees of improved speech recognition for a certain characteristic of the input speech and determining whether a general speech recognizer or an intermediate speech recognizer produces a script closer to a reference script.
  - 13. The method of claim 11, further comprising the step of generating an index of said speech samples with said first type of speech and an index of said speech samples with said second type of speech.
  - 14. The method of claim 11, wherein said first type of speech is fast rate speech and said second type of speech is normal rate speech.
  - 15. The method of claim 13, wherein said index is obtained using a voting method.
  - 16. The method of claim 11, wherein said first type of speech is speech with background noise and said second type of speech is speech without background noise.
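
Claims 13 and 15 describe per-type indexes of speech samples obtained with a voting method. The claims do not say what is voted on, so the following sketch assumes that several detectors each cast a type label for a sample and the majority label decides which index the sample joins. The name `build_type_indexes` and the input layout are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, List, Sequence


def build_type_indexes(
    votes_per_sample: Dict[Hashable, Sequence[str]],
) -> Dict[str, List[Hashable]]:
    """Group sample ids by the speech type that wins a majority vote.

    Assumes every sample has at least one vote. When labels tie,
    Counter.most_common keeps the label that was seen first.
    """
    indexes = defaultdict(list)
    for sample_id, votes in votes_per_sample.items():
        winner, _count = Counter(votes).most_common(1)[0]
        indexes[winner].append(sample_id)
    return dict(indexes)
```

With the speech types of claim 14, the result would hold one list of sample ids for fast rate speech and one for normal rate speech.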

17. A method for transcribing speech, said method comprising the steps of:
- identifying portions of said speech having a characteristic that impairs speech recognition accuracy; and
  
  recognizing said identified speech portions with a speech recognizer that exhibits improved performance for said characteristic.
  - 18. The method of claim 17, wherein said identifying step further comprises the steps of recognizing said speech with a plurality of prioritized speech recognition methods that exhibit varying degrees of improved speech recognition for a certain characteristic of the input speech and determining whether a general speech recognizer or an intermediate speech recognizer produces a script closer to a reference script.
  - 19. The method of claim 17, further comprising the step of generating an index of said speech having said characteristic that impairs speech recognition accuracy and an index of said speech not having said characteristic that impairs speech recognition accuracy.
  - 20. The method of claim 17, wherein said characteristic that impairs speech recognition accuracy is fast rate speech.
  - 21. The method of claim 17, wherein said characteristic that impairs speech recognition accuracy is speech with background noise.

22. A method for transcribing speech, said method comprising the steps of:
- recognizing a speech sample with at least three speech recognition methods that are prioritized according to performance for a given characteristic of input speech, at least one of said speech recognition methods producing a reference script;
  
  comparing said recognized speech samples with said reference script to identify portions of said speech having a characteristic that impairs speech recognition accuracy; and
  
  recognizing said identified speech portions with a speech recognizer that exhibits improved performance for said characteristic relative to a general speech recognizer.
  - 23. The method of claim 22, further comprising the step of generating an index of said speech having said characteristic that impairs speech recognition accuracy and an index of said speech not having said characteristic that impairs speech recognition accuracy.
  - 24. The method of claim 22, wherein said characteristic that impairs speech recognition accuracy is fast rate speech.
  - 25. The method of claim 22, wherein said characteristic that impairs speech recognition accuracy is speech with background noise.
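
The three-recognizer flow of claim 22 can be sketched as follows. Several choices are assumptions rather than claim language: the most characteristic-specific recognizer supplies the reference script, closeness is measured with `difflib.SequenceMatcher` over words, and a segment is flagged as impaired when the intermediate script is strictly closer to the reference than the general script. All names are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Callable, List, Sequence, Tuple

# Hypothetical type: a recognizer maps one speech segment to a script.
Recognizer = Callable[[str], str]


def similarity(script: str, reference: str) -> float:
    """Word-level similarity in [0, 1]; 1.0 means identical word sequences."""
    return SequenceMatcher(None, script.split(), reference.split()).ratio()


def transcribe(
    segments: Sequence[str],
    general: Recognizer,
    intermediate: Recognizer,
    specific: Recognizer,
) -> List[Tuple[str, bool]]:
    """Return (script, impaired) pairs, one per segment."""
    transcript = []
    for segment in segments:
        reference = specific(segment)  # assumed source of the reference script
        general_script = general(segment)
        intermediate_script = intermediate(segment)
        # Flag the segment when the intermediate recognizer tracks the
        # reference more closely than the general recognizer does.
        impaired = (similarity(intermediate_script, reference)
                    > similarity(general_script, reference))
        # Flagged segments keep the characteristic-specific output;
        # the rest keep the general recognizer's output.
        transcript.append((reference if impaired else general_script, impaired))
    return transcript
```

The boolean in each pair could also feed the two indexes of claim 23, one for speech with the impairing characteristic and one for speech without it.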

26. A system for digitizing input information, comprising:
- a memory that stores computer readable code; and
  
  a processor operatively coupled to said memory, said processor configured to:
  
  analyze said input information to determine if at least one of at least first and second types of information is present;
  
  recognize said input information with a first recognition method if said input information includes said first type of information; and
  
  recognize said input information with a second recognition method if said input information includes said second type of information.

27. A system for digitizing input information, comprising:
- a memory that stores computer readable code; and
  
  a processor operatively coupled to said memory, said processor configured to:
  
  identify portions of said input information having a characteristic that impairs recognition accuracy; and
  
  recognize said input information portions with a recognizer that exhibits improved performance for said characteristic.

28. A system for transcribing speech, comprising:
- a memory that stores computer readable code; and
  
  a processor operatively coupled to said memory, said processor configured to:
  
  analyze a speech sample to determine if at least one of at least first and second types of speech is present;
  
  recognize said speech sample with a first speech recognition method if said speech sample includes said first type of speech; and
  
  recognize said speech sample with a second speech recognition method if said speech sample includes said second type of speech.

29. A system for transcribing speech, comprising:
- a memory that stores computer readable code; and
  
  a processor operatively coupled to said memory, said processor configured to:
  
  identify portions of said speech having a characteristic that impairs speech recognition accuracy; and
  
  recognize said identified speech portions with a speech recognizer that exhibits improved performance for said characteristic.

30. A system for transcribing speech, comprising:
- a memory that stores computer readable code; and
  
  a processor operatively coupled to said memory, said processor configured to:
  
  recognize a speech sample with at least three speech recognition methods that are prioritized according to performance for a given characteristic of input speech, at least one of said speech recognition methods producing a reference script;
  
  compare said recognized speech samples with said reference script to identify portions of said speech having a characteristic that impairs speech recognition accuracy; and
  
  recognize said identified speech portions with a speech recognizer that exhibits improved performance for said characteristic relative to a general speech recognizer.

31. An article of manufacture for digitizing input information, comprising:
- a step to analyze said input information to determine if at least one of at least first and second types of information is present;
  
  a step to recognize said input information with a first recognition method if said input information includes said first type of information; and
  
  a step to recognize said input information with a second recognition method if said input information includes said second type of information.

32. An article of manufacture for digitizing input information, comprising:
- a step to identify portions of said input information having a characteristic that impairs recognition accuracy; and
  
  a step to recognize said input information portions with a recognizer that exhibits improved performance for said characteristic.

33. An article of manufacture for transcribing speech, comprising:
- a step to analyze a speech sample to determine if at least one of at least first and second types of speech is present;
  
  a step to recognize said speech sample with a first speech recognition method if said speech sample includes said first type of speech; and
  
  a step to recognize said speech sample with a second speech recognition method if said speech sample includes said second type of speech.

34. An article of manufacture for transcribing speech, comprising:
- a step to identify portions of said speech having a characteristic that impairs speech recognition accuracy; and
  
  a step to recognize said identified speech portions with a speech recognizer that exhibits improved performance for said characteristic.

Current Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee: International Business Machines Corporation
Inventors: Kanevsky, Dimitri; Olsen, Peder Andreas; Eide, Ellen Marie; Gopinath, Ramesh Ambat
Granted Patent: US 7,016,835 B2
US Class Current: 704/231
CPC Class Codes:
G10L 15/32 Multiple recognisers used i...
G10L 17/26 Recognition of special voic...
