SPEECH RECOGNITION APPARATUS

US 20070100636A1
Filed: 10/30/2006
Published: 05/03/2007
Est. Priority Date: 11/02/2005
Status: Active Grant



Abstract

A speech recognition apparatus that enables efficient multimodal input in setting a plurality of items by one utterance is provided. An input unit inputs a setting instruction by speech. A speech interpretation unit recognizes and interprets the contents of the setting instruction by speech to generate first structured data containing candidates of the interpretation result. An instruction input detecting unit detects a setting instruction input by a user. An instruction input interpretation unit interprets the contents of the setting instruction input to generate second structured data. A selection unit selects one of the interpretation candidates contained in the first structured data based on the second structured data.
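
The selection step described in the abstract can be illustrated with a small sketch. It is not the patented implementation: the `Candidate` fields, the dictionary shape of the second structured data, and the example values are assumptions chosen for illustration, loosely following the item name, setting value, and likelihood that the dependent claims mention.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """One interpretation candidate in the first structured data (hypothetical shape)."""
    item: str          # setting item name, e.g. "paper_size"
    value: str         # setting value, e.g. "A4"
    likelihood: float  # recognizer score; higher means more likely


def select_candidate(first: List[Candidate], second: dict) -> Optional[Candidate]:
    """Pick the most likely speech candidate whose item matches the other input."""
    matching = [c for c in first if c.item == second["item"]]
    if not matching:
        return None  # no speech candidate agrees with the second input
    return max(matching, key=lambda c: c.likelihood)


# Speech result for an ambiguous utterance; the top-ranked guess is about copies.
first_structured = [
    Candidate("copies", "4", 0.48),
    Candidate("paper_size", "A4", 0.45),
    Candidate("paper_size", "A3", 0.07),
]
# Assumed second structured data: the user touched the paper-size control.
second_structured = {"item": "paper_size"}

chosen = select_candidate(first_structured, second_structured)
print(chosen)
```

In this sketch the touch input overrides the recognizer's top-ranked guess, because only candidates for the touched item remain eligible.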


22 Claims

1. A speech recognition apparatus for allowing setting by speech, comprising:
- an input unit configured to input a setting instruction by speech;
  
  a speech interpretation unit configured to recognize and interpret contents of the setting instruction by speech to generate first structured data containing candidates of the interpretation result;
  
  an instruction input detecting unit configured to detect a setting instruction input by a user;
  
  an instruction input interpretation unit configured to interpret contents of the setting instruction input to generate second structured data; and
  
  a selection unit configured to select one of the interpretation candidates contained in the first structured data on the basis of the second structured data.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 17, 18):
- - 2. The apparatus according to claim 1, wherein said instruction input detecting unit detects a setting instruction input for an object displayed on a display screen.
  - 3. The apparatus according to claim 1, wherein each interpretation candidate contained in the first structured data contains information of a setting item name and a setting value, and the second structured data contains information of a setting item name.
  - 4. The apparatus according to claim 3, wherein said selection unit selects, from the interpretation candidates contained in the first structured data, interpretation candidates containing a setting item name matching the setting item name contained in the second structured data.
  - 5. The apparatus according to claim 4, wherein each interpretation candidate contained in the first structured data further contains likelihood information of the interpretation result, and said selection unit selects, from the interpretation candidates in the first structured data, which contain the setting item name matching the setting item name contained in the second structured data, an interpretation candidate of a highest rank of the likelihood information.
  - 6. The apparatus according to claim 1, wherein each of the first and second structured data contains a start time and an end time of the setting instruction input.
  - 7. The apparatus according to claim 6, wherein said instruction input interpretation unit holds a plurality of second structured data generated, and said selection unit selects the second structured data corresponding to the first structured data on the basis of a start time and an end time of the setting instruction input contained in the first structured data.
  - 8. The apparatus according to claim 5, wherein said selection unit selects the interpretation candidate of the highest rank of the likelihood information when no interpretation candidate can be selected from the first structured data on the basis of the second structured data.
  - 9. The apparatus according to claim 1, wherein said selection unit rejects input by said speech input unit and notifies a user when no interpretation candidate can be selected from the first structured data on the basis of the second structured data.
  - 10. The apparatus according to claim 1, further comprising a setting unit configured to set the speech recognition apparatus on the basis of the interpretation candidate selected by said selection unit.
  - 17. The apparatus according to claim 1, further comprising a setting window control unit configured to display a setting window corresponding to the setting instruction input when said instruction input detecting unit detects the setting instruction input, said setting window control unit inhibiting display of the setting window when said speech input unit inputs the setting instruction.
  - 18. The apparatus according to claim 1, further comprising a setting window control unit configured to display a setting window corresponding to the setting instruction input when said instruction input detecting unit detects the setting instruction input, and a speech input switching unit configured to switch ON/OFF of the setting instruction input by said speech input unit, said setting window control unit inhibiting display of the setting window when said speech input switching unit indicates speech input ON.
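
Claims 6 to 9 can be read together as a time-based pairing step followed by a policy for the no-match case. The sketch below is one possible reading, not the method disclosed in the specification: the claims do not say how times are compared, so pairing by largest interval overlap is an assumption, and `TouchInput`, `pair_by_time`, and `resolve` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TouchInput:
    """Second structured data with timestamps (hypothetical shape)."""
    item: str
    start: float  # seconds
    end: float


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length of the intersection of two time intervals, or 0.0 if disjoint."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def pair_by_time(speech_start: float, speech_end: float,
                 held: List[TouchInput]) -> Optional[TouchInput]:
    """Return the held input that overlaps the utterance the most, if any (assumed rule)."""
    best = max(held,
               key=lambda t: overlap(speech_start, speech_end, t.start, t.end),
               default=None)
    if best is None or overlap(speech_start, speech_end, best.start, best.end) == 0.0:
        return None
    return best


def resolve(candidates: List[dict], touch: Optional[TouchInput],
            fallback_to_top: bool) -> Optional[dict]:
    """Select by item name; on failure fall back (as in claim 8) or reject (as in claim 9)."""
    matching = [c for c in candidates if touch is not None and c["item"] == touch.item]
    if matching:
        return max(matching, key=lambda c: c["likelihood"])
    if fallback_to_top and candidates:
        return max(candidates, key=lambda c: c["likelihood"])
    return None  # caller is expected to notify the user that the input was rejected


held = [TouchInput("paper_size", 0.2, 0.6), TouchInput("copies", 5.0, 5.3)]
candidates = [
    {"item": "paper_size", "value": "A4", "likelihood": 0.6},
    {"item": "copies", "value": "4", "likelihood": 0.4},
]
touch = pair_by_time(4.8, 6.0, held)        # utterance spans 4.8 s to 6.0 s
result = resolve(candidates, touch, fallback_to_top=False)
print(touch, result)
```

The `fallback_to_top` flag is only a convenient way to show the two alternatives side by side; an actual apparatus would presumably implement one behavior or the other.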

11. A speech recognition apparatus for allowing setting by speech, comprising:
- an input unit configured to input a setting instruction by speech;
  
  a feature extraction unit configured to extract a feature parameter string from the speech input by said input unit;
  
  a search unit configured to search for a pattern resembling to the feature parameter string extracted by said feature extraction unit most from predetermined phoneme sequence pattern candidates; and
  
  an instruction input detecting unit configured to detect a setting instruction input by a user, said search unit narrowing down the phoneme sequence pattern candidates on the basis of the setting instruction input detected by said instruction input detecting unit.
- Dependent claims (12, 13, 14, 15, 16):
- - 12. The apparatus according to claim 11, wherein said search unit narrows down the phoneme sequence pattern candidates on the basis of information of a setting item indicated by the setting instruction input.
  - 13. The apparatus according to claim 11, further comprising a grammar storage unit configured to store a speech recognition grammar, wherein the phoneme sequence pattern candidates are generated on the basis of the speech recognition grammar stored in said grammar storage unit.
  - 14. The apparatus according to claim 11, wherein said instruction input detecting unit detects a setting instruction input for an object displayed on a display screen.
  - 15. The apparatus according to claim 14, wherein said instruction input detecting unit detects, as the setting instruction input, an instruction input for a specific screen area indicating a setting item.
  - 16. The apparatus according to claim 11, further comprising an output unit configured to output a search result by said search unit.
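
The narrowing described in claims 11 to 13 can be sketched with a toy search. Everything here is an illustrative assumption: `GRAMMAR` is a made-up grammar, the phoneme symbols are informal, and edit distance over an already-decoded phoneme list stands in for the acoustic scoring of a feature parameter string that a real recognizer would perform.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical grammar: setting item -> {setting value: phoneme sequence pattern}.
GRAMMAR: Dict[str, Dict[str, List[str]]] = {
    "paper_size": {"A4": ["ey", "f", "ao", "r"], "A3": ["ey", "th", "r", "iy"]},
    "copies": {"4": ["f", "ao", "r"], "3": ["th", "r", "iy"]},
}


def edit_distance(a: List[str], b: List[str]) -> int:
    """Levenshtein distance between two phoneme sequences (toy similarity measure)."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (pa != pb)))   # substitution or match
        prev = curr
    return prev[-1]


def search(observed: List[str], touched_item: Optional[str] = None) -> Tuple[str, str]:
    """Return the (item, value) whose pattern is closest to the observed phonemes.

    When another input indicated a known setting item, only that item's
    patterns are searched; otherwise the whole grammar is searched.
    """
    items = [touched_item] if touched_item in GRAMMAR else list(GRAMMAR)
    patterns = [(item, value, seq)
                for item in items
                for value, seq in GRAMMAR[item].items()]
    item, value, _ = min(patterns, key=lambda p: edit_distance(observed, p[2]))
    return item, value


observed = ["f", "ao", "r"]  # the leading "A" was clipped or misheard
print(search(observed))                  # searches every item in the grammar
print(search(observed, "paper_size"))    # restricted to the touched item
```

With the full grammar the clipped utterance matches the copies pattern exactly, while restricting the search to the touched item leaves only paper-size patterns to compare against.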

19. A method for setting a device by using speech recognition, comprising the steps of:
- inputting a setting instruction by speech;
  
  recognizing and interpreting contents of the setting instruction by speech to generate first structured data containing candidates of the interpretation result;
  
  detecting a setting instruction input by a user;
  
  interpreting contents of the detected setting instruction input to generate second structured data; and
  
  selecting one of the interpretation candidates contained in the first structured data on the basis of the second structured data.

20. A method for setting a device by using speech recognition, comprising the steps of:
- inputting a setting instruction by speech;
  
  extracting a feature parameter string from the input speech;
  
  searching for a pattern resembling to the extracted feature parameter string most from predetermined phoneme sequence pattern candidates; and
  
  detecting a setting instruction input by a user, wherein in the search step, the phoneme sequence pattern candidates are narrowed down on the basis of the detected setting instruction input.

21. A computer program stored on a computer-readable medium for setting device options using speech recognition, the program comprising code for performing the following steps of:
- inputting a setting instruction by speech;
  
  recognizing and interpreting contents of the setting instruction by speech to generate first structured data containing candidates of the interpretation result;
  
  detecting a setting instruction input by a user;
  
  interpreting contents of the detected setting instruction input to generate second structured data; and
  
  selecting one of the interpretation candidates contained in the first structured data on the basis of the second structured data.

22. A computer program stored on a computer-readable medium for setting device options using speech recognition, the program comprising code for performing the following steps of:
- inputting a setting instruction by speech;
  
  extracting a feature parameter string from the input speech;
  
  searching for a pattern resembling to the extracted feature parameter string most from predetermined phoneme sequence pattern candidates; and
  
  detecting a setting instruction input by a user, wherein in the search step, the phoneme sequence pattern candidates are narrowed down on the basis of the detected setting instruction input.

Current Assignee: Canon Kabushiki Kaisha (Canon Inc.)
Original Assignee: Canon Kabushiki Kaisha (Canon Inc.)
Inventors: Hirota, Makoto; Yamamoto, Hiroki
Granted Patent: US 7,844,458 B2
US Class Current: 704/276
CPC Class Codes: G10L 15/1822 Parsing for meaning underst...
