Learning a verification model for speech recognition based on extracted recognition and language feature information

US 8,751,226 B2
Filed: 06/18/2007
Issued: 06/10/2014
Est. Priority Date: 06/29/2006
Status: Active Grant

1 Assignment

0 Petitions

Abstract

A speech processing apparatus 101 includes a recognition feature extracting unit 12 that extracts recognition feature information which is a characteristic of a speech recognition result 15 obtained by performing a speech recognition process on an inputted speech from the speech recognition result 15; a language feature extracting unit 11 that extracts language feature information which is a characteristic of a pre-registered language resource 14 from the language resource 14; and a model learning unit 13 that obtains a verification model 16 by a learning process based on the extracted recognition feature information and language feature information.
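The learning flow in the abstract (extract recognition features, extract language features, learn a verification model from both) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the feature names, the bigram-based language feature, the toy training data, and the choice of logistic regression as the discriminative model.

```python
import math
from collections import Counter

def recognition_features(hypothesis):
    """Features drawn from one recognition hypothesis (a list of words)."""
    feats = Counter()
    for word in hypothesis:
        feats["rec_word=" + word] += 1.0
    return feats

def language_features(hypothesis, resource_bigrams):
    """Features comparing the hypothesis with a pre-registered language resource."""
    feats = Counter()
    for left, right in zip(hypothesis, hypothesis[1:]):
        seen = (left, right) in resource_bigrams
        feats["lang_bigram_seen" if seen else "lang_bigram_unseen"] += 1.0
    return feats

def extract(hypothesis, resource_bigrams):
    """Combine both feature groups into one sparse feature vector."""
    feats = recognition_features(hypothesis)
    feats.update(language_features(hypothesis, resource_bigrams))
    return feats

def probability_correct(weights, feats):
    """Logistic model: probability that the hypothesis is labeled correct."""
    score = sum(weights.get(name, 0.0) * value for name, value in feats.items())
    return 1.0 / (1.0 + math.exp(-score))

def learn_verification_model(examples, resource_bigrams, epochs=50, rate=0.5):
    """Fit the weights by stochastic gradient ascent on the log-likelihood."""
    weights = {}
    for _ in range(epochs):
        for hypothesis, is_correct in examples:
            feats = extract(hypothesis, resource_bigrams)
            error = (1.0 if is_correct else 0.0) - probability_correct(weights, feats)
            for name, value in feats.items():
                weights[name] = weights.get(name, 0.0) + rate * error * value
    return weights

# Toy language resource and labeled recognition hypotheses (hypothetical data).
resource_sentences = [["recognize", "speech"], ["speech", "recognition"]]
resource_bigrams = {pair for s in resource_sentences for pair in zip(s, s[1:])}
examples = [
    (["recognize", "speech"], True),
    (["wreck", "a", "nice", "beach"], False),
]
model = learn_verification_model(examples, resource_bigrams)
```

The learned `model` is a plain dictionary of feature weights, so scoring a new hypothesis only requires calling `extract` and `probability_correct` on it.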

13 Citations

26 Claims

1. A speech processing apparatus comprising:
- a recognition feature information extracting unit which extracts recognition feature information having a characteristic of recognition result data obtained by performing a speech recognition process on an inputted speech, from said recognition result data, said recognition feature information being speech recognition result data for learning which includes plural recognition hypotheses;
  
  a language feature information extracting unit which extracts language feature information having a characteristic of a pre-registered language resource from said pre-registered language resource, said pre-registered language resource including document data, sentences, text data, word sequences, or dictionaries, the extracted language feature information including linguistic characteristics included in an existing word sequence, or importance or similarity of a document; and
  
  a verification model obtaining unit which obtains a verification model by a learning process based on the extracted recognition feature information and language feature information, the obtained verification model being used to verify a speech recognition result data which is inputted as a verification target to a speech recognition system, wherein said verification model obtaining unit obtains a discriminative model as said verification model, said discriminative model being indicative of a correct and false label or degree of importance according to use.
  - 2. The speech processing apparatus as set forth in claim 1, further comprising a selecting unit which selects the type of said recognition feature information and language feature information to be used for said learning process.
  - 3. The speech processing apparatus as set forth in claim 2, wherein said selecting unit selects the type of said recognition feature information and language feature information to be used for said learning process based on appearance frequency of said recognition result data and language resource.
  - 4. The speech processing apparatus as set forth in claim 1, further comprising a weighted value setting unit which sets a weighted value of said recognition feature information and language feature information to be used for said learning process.
  - 5. The speech processing apparatus as set forth in claim 4, wherein said weighted value setting unit sets said weighted value according to the type of said recognition feature information and language feature information.
  - 6. The speech processing apparatus as set forth in claim 4, wherein said weighted value setting unit calculates a weighting sum for each of said recognition feature information and language feature information using said weighted value set to each of said recognition feature information and language feature information, and adopts the calculated weighting sum as a feature value.
  - 7. The speech processing apparatus as set forth in claim 1, further comprising a verification unit which verifies recognition result data newly inputted as a verification target using said verification model.
  - 8. The speech processing apparatus as set forth in claim 1, wherein said language feature information extracting unit extracts said language feature information based on a use history with respect to a pre-registered language resource.
  - 9. The speech processing apparatus as set forth in claim 1, wherein said verification model obtaining unit obtains a conditional random field model as said discriminative model.
  - 10. The speech processing apparatus as set forth in claim 1, wherein said language feature information includes a relationship between words having a large distance, a sentence structure, and meaning contents of a sentence.
  - 11. The speech processing apparatus as set forth in claim 1, wherein said language feature information extracting unit obtains characteristics including importance or similarity of a document based on a use history including a user's reference frequency or reference history with respect to a group of the document, said verification model obtaining unit uses the obtained characteristics for learning as said language feature information.
  - 12. The speech processing apparatus as set forth in claim 1, wherein said language feature information extracting unit extracts said language feature information from information including importance of a word or type of a sentence which is preset with respect to said pre-registered language resource or said speech recognition result for learning according to a user's use purpose.
  - 13. The speech processing apparatus as set forth in claim 1, further comprising:
    - a feature extracting unit which extracts a predetermined recognition feature and language feature with respect to recognition hypothesis obtained by performing the speech recognition process on the inputted speech;
      
      a verification processing unit which calculates a verification result and its confidence measure score with reference to said feature extracted by said feature extracting unit and said verification model previously obtained by said verification model obtaining unit; and
      
      an information integrating unit which integrates said recognition hypothesis and the result of the verification process by said verification processing unit and outputs it as a verification complete speech recognition result.
  - 14. The speech processing apparatus as set forth in claim 1, wherein said language feature information extracting unit calculates value of a corresponding feature or processes frequency of appearance of feature.
  - 15. A non-transitory computer-readable medium that causes a computer to function as the speech processing apparatus as set forth in claim 1.
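Claims 2 through 6 describe selecting which feature information to use based on appearance frequency, and collapsing the features into a weighted sum per kind of feature information. A minimal sketch of one possible reading follows. The `name=value` naming convention, the `min_count` threshold, the example weights, and counting frequency per individual feature rather than per type are all assumptions made for illustration.

```python
from collections import Counter

def select_features(feature_sets, min_count=2):
    """Keep feature names that appear in at least min_count examples."""
    counts = Counter()
    for feats in feature_sets:
        counts.update(set(feats))  # count each name once per example
    return {name for name, count in counts.items() if count >= min_count}

def feature_type(name):
    """Hypothetical convention: the type is the prefix before the first '='."""
    return name.split("=", 1)[0]

def weighted_sums(feats, type_weights, selected):
    """Collapse the selected features into one weighted sum per feature type."""
    sums = {}
    for name, value in feats.items():
        if name not in selected:
            continue
        kind = feature_type(name)
        sums[kind] = sums.get(kind, 0.0) + type_weights.get(kind, 1.0) * value
    return sums

# Two toy feature vectors, one per recognition hypothesis (hypothetical data).
feature_sets = [
    {"rec_word=speech": 1.0, "lang_bigram=seen": 1.0},
    {"rec_word=speech": 1.0, "rec_word=beach": 1.0, "lang_bigram=seen": 2.0},
]
selected = select_features(feature_sets, min_count=2)
type_weights = {"rec_word": 0.5, "lang_bigram": 2.0}
summed = weighted_sums(feature_sets[1], type_weights, selected)
```

Here `rec_word=beach` occurs in only one example, so it is dropped before the sums are computed, and each remaining feature contributes its value scaled by the weight of its type.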

16. A speech processing method comprising:
- extracting recognition feature information having a characteristic of recognition result data obtained by performing a speech recognition process on an inputted speech, from said recognition result data by a speech processing apparatus, said recognition feature information being speech recognition result data for learning which includes plural recognition hypotheses;
  
  extracting language feature information having a characteristic of a pre-registered language resource from said pre-registered language resource by said speech processing apparatus, said pre-registered language resource including document data, sentences, text data, word sequences, or dictionaries, the extracted language feature information including linguistic characteristics included in an existing word sequence, or importance or similarity of a document; and
  
  obtaining a verification model by a learning process based on the extracted recognition feature information and language feature information by said speech processing apparatus, the obtained verification model being used to verify a speech recognition result data which is inputted as a verification target to a speech recognition system, wherein said obtained verification model is a discriminative model, said discriminative model being indicative of a correct and false label or degree of importance according to use.
  - 17. The speech processing method as set forth in claim 16, further comprising selecting the type of recognition feature information and language feature information to be used for said learning process by said speech processing apparatus.
  - 18. The speech processing method as set forth in claim 17, wherein said type selecting by said speech processing apparatus selects the type of said recognition feature information and language feature information based on appearance frequency of said recognition result data and language resource.
  - 19. The speech processing method as set forth in claim 16, further comprising setting a weighted value of the recognition feature information and language feature information to be used for said learning process by said speech processing apparatus.
  - 20. The speech processing method as set forth in claim 19, wherein said setting the weighted value by said speech processing apparatus sets said weighted value according to the type of said recognition feature information and language feature information.
  - 21. The speech processing method as set forth in claim 19, wherein said setting the weighted value by said speech processing apparatus calculates a weighting sum for each of said recognition feature information and language feature information using said weighted value set to each of said recognition feature information and language feature information, and adopts the calculated weighting sum as a feature value.
  - 22. The speech processing method as set forth in claim 16, further comprising verifying recognition result data newly inputted as a verification target using said verification model by the speech processing apparatus.
  - 23. The speech processing method as set forth in claim 16, wherein said extracting the language feature information by said speech processing apparatus extracts said language feature information based on a use history with respect to a pre-registered language resource.
  - 24. The speech processing method as set forth in claim 16, wherein said obtaining said verification model by said speech processing apparatus obtains a conditional random field model as said discriminative model.
  - 25. The speech processing method as set forth in claim 16, wherein said extracting the recognition feature information by said speech processing apparatus extracts an attribute regarding transcription represented by said recognition result data, part of speech of said recognition result data, and pronunciation of said recognition feature information.
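Claims 9 and 24 name a conditional random field as the discriminative model but do not say how it is applied. As a rough illustration only, the sketch below shows Viterbi decoding for a linear-chain model that tags each word as "correct" or "false". The emission and transition scores are hand-set assumptions rather than learned parameters, and training is not shown.

```python
LABELS = ("correct", "false")

def viterbi(words, emission, transition):
    """Return the highest-scoring label sequence under a linear-chain model.

    emission(word, label) and transition(prev_label, label) return additive scores.
    """
    if not words:
        return []
    best = [{label: emission(words[0], label) for label in LABELS}]
    back = []
    for word in words[1:]:
        scores, pointers = {}, {}
        for label in LABELS:
            # Best previous label for reaching `label` at this position.
            prev = max(LABELS, key=lambda p: best[-1][p] + transition(p, label))
            scores[label] = best[-1][prev] + transition(prev, label) + emission(word, label)
            pointers[label] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final label.
    path = [max(LABELS, key=lambda l: best[-1][l])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return list(reversed(path))

# Hand-set scores (hypothetical): words found in the language resource look correct.
KNOWN_WORDS = {"recognize", "speech"}

def emission(word, label):
    return 1.0 if (word in KNOWN_WORDS) == (label == "correct") else -1.0

def transition(prev_label, label):
    return 0.5 if prev_label == label else 0.0

labels = viterbi(["recognize", "speech", "beach"], emission, transition)
```

With these scores the decoder labels the first two words "correct" and the unknown word "false"; in a trained model the two scoring functions would be computed from learned feature weights.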

26. A speech processing apparatus comprising:
- means for extracting recognition feature information having a characteristic of recognition result data obtained by performing a speech recognition process on an inputted speech, from said recognition result data, said recognition feature information being speech recognition result data for learning which includes plural recognition hypotheses;
  
  means for extracting language feature information having a characteristic of a pre-registered language resource from said pre-registered language resource, said pre-registered language resource including document data, sentences, text data, word sequences, or dictionaries, the extracted language feature information including linguistic characteristics included in an existing word sequence, or importance or similarity of a document; and
  
  means for obtaining a verification model by a learning process based on the extracted recognition feature information and language feature information, the obtained verification model being used to verify a speech recognition result data which is inputted as a verification target to a speech recognition system, wherein said obtained verification model is a discriminative model, said discriminative model being indicative of a correct and false label or degree of importance according to use.
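Claims 7, 13, and 22 describe the verification stage: features are extracted from each new recognition hypothesis, the previously obtained model produces a verification result with a confidence measure score, and the result is integrated with the hypothesis. A minimal sketch under stated assumptions follows. The logistic scoring, the 0.5 threshold, the output dictionary layout, and the toy weights are hypothetical choices for illustration.

```python
import math

def verify(features, model_weights, threshold=0.5):
    """Return a (label, confidence) pair for one hypothesis."""
    score = sum(model_weights.get(name, 0.0) * value for name, value in features.items())
    confidence = 1.0 / (1.0 + math.exp(-score))
    label = "correct" if confidence >= threshold else "false"
    return label, confidence

def integrate(hypothesis, label, confidence):
    """Attach the verification result to the recognition hypothesis."""
    return {"words": list(hypothesis), "label": label, "confidence": confidence}

def verify_hypotheses(hypotheses, extract_features, model_weights):
    """Run feature extraction, verification, and integration for each hypothesis."""
    results = []
    for hypothesis in hypotheses:
        label, confidence = verify(extract_features(hypothesis), model_weights)
        results.append(integrate(hypothesis, label, confidence))
    return results

# Toy model and feature extractor (hypothetical stand-ins for a learned model).
toy_weights = {"word=speech": 2.0, "word=beach": -2.0}

def toy_features(hypothesis):
    return {"word=" + word: 1.0 for word in hypothesis}

verified = verify_hypotheses(
    [["recognize", "speech"], ["nice", "beach"]], toy_features, toy_weights
)
```

Each entry in `verified` carries the original words together with the label and confidence, which mirrors the "verification complete speech recognition result" wording of claim 13.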

Current Assignee: NEC Corporation
Original Assignee: NEC Corporation
Inventors: Yamamoto, Hitoshi; Miki, Kiyokazu
Primary Examiner(s): PULLIAS, JESSE SCOTT
Application Number: US12/306,632
Publication Number: US 20090204390A1
Time in Patent Office: 2,549 Days
Field of Search: None
US Class Current: 704/231
CPC Class Codes:
- G10L 15/065 Adaptation
- G10L 15/183 using context dependencies,...
