Speech recognition based interactive information retrieval scheme using dialogue control to reduce user stress

US 20040015365A1
Filed: 07/15/2003
Published: 01/22/2004
Est. Priority Date: 05/31/1999
Status: Active Grant

Abstract

In the disclosed speech recognition based interactive information retrieval scheme, the recognition target words in the speech recognition database are divided into prioritized recognition target words that constitute a number of data that can be processed by the speech recognition processing in the prescribed processing time and that have relatively higher importance levels based on statistical information, and the other non-prioritized recognition target words. Then, the speech recognition processing for the speech input with respect to the prioritized recognition target words is carried out at higher priority, and a confirmation process is carried out when the recognition result satisfies a prescribed condition for judging that the retrieval key can be determined only by a confirmation process with the user. On the other hand, a related information query to request the user to enter another speech input for a related information of the retrieval key is carried out when the recognition result does not satisfy the prescribed condition, and the original recognition result is adjusted according to the recognition result for another speech input. In this way, the retrieval key determination is realized through natural speech dialogues with the user.
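
The control flow described in the abstract can be sketched as a small loop. This is a minimal illustration, not the patented implementation: the function names, the `threshold`, `max_confirmations` and `max_rounds` parameters, the rule that negated candidates are dropped, and the city and prefecture data are all assumptions introduced for the example. Recognition itself is replaced by callbacks that return candidate likelihoods.

```python
from typing import Callable, Dict, List, Optional

Scores = Dict[str, float]  # candidate -> recognition likelihood


def leading_candidates(scores: Scores, threshold: float) -> List[str]:
    """Candidates whose likelihood exceeds the threshold, best first."""
    above = [c for c, p in scores.items() if p > threshold]
    return sorted(above, key=lambda c: scores[c], reverse=True)


def determine_retrieval_key(
    first_result: Scores,
    confirm: Callable[[str], bool],
    ask_related: Callable[[], Scores],
    adjust: Callable[[Scores, Scores], Scores],
    threshold: float = 0.3,
    max_confirmations: int = 2,
    max_rounds: int = 5,  # assumption: give up after a few rounds
) -> Optional[str]:
    scores = dict(first_result)
    for _ in range(max_rounds):
        top = leading_candidates(scores, threshold)
        # Prescribed condition: few enough leading candidates to confirm one by one.
        if 0 < len(top) <= max_confirmations:
            for candidate in top:
                if confirm(candidate):
                    return candidate
            # Assumption: candidates the user negated are dropped.
            scores = {c: p for c, p in scores.items() if c not in top}
        # Otherwise ask for related information and adjust the result.
        related = ask_related()
        scores = adjust(scores, related)
    return None


# Hypothetical data: city names as retrieval keys, prefectures as related information.
PREFECTURE_OF = {"Yokohama": "Kanagawa", "Yokosuka": "Kanagawa", "Yotsukaido": "Chiba"}


def keep_best_prefecture(scores: Scores, related: Scores) -> Scores:
    """Stand-in adjustment: keep only keys in the most likely prefecture."""
    best = max(related, key=related.get)
    return {c: p for c, p in scores.items() if PREFECTURE_OF[c] == best}


key = determine_retrieval_key(
    first_result={"Yokohama": 0.60, "Yokosuka": 0.55, "Yotsukaido": 0.52},
    confirm=lambda candidate: candidate == "Yotsukaido",  # stands in for a yes/no answer
    ask_related=lambda: {"Kanagawa": 0.3, "Chiba": 0.9},
    adjust=keep_best_prefecture,
)
print(key)
```

In this run the first result has three leading candidates, which is more than `max_confirmations`, so the loop asks for the prefecture before attempting any confirmation.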

42 Claims

1. A method of speech recognition based interactive information retrieval for ascertaining and retrieving a target information of a user by determining a retrieval key entered by the user using a speech recognition processing, comprising the steps of:
- (a) storing retrieval key candidates that constitute a number of data that cannot be processed by the speech recognition processing in a prescribed processing time, as recognition target words in a speech recognition database, the recognition target words being divided into prioritized recognition target words that constitute a number of data that can be processed by the speech recognition processing in the prescribed processing time and that have relatively higher importance levels based on statistical information defined for the recognition target words, and non-prioritized recognition target words other than the prioritized recognition target words;
  
  (b) requesting the user by a speech dialogue with the user to enter a speech input indicating the retrieval key, and carrying out the speech recognition processing for the speech input with respect to the prioritized recognition target words to obtain a recognition result;
  
  (c) carrying out a confirmation process using a speech dialogue with the user according to the recognition result to determine the retrieval key, when the recognition result satisfies a prescribed condition for judging that the retrieval key can be determined only by a confirmation process with the user;
  
  (d) carrying out a related information query using a speech dialogue with the user to request the user to enter another speech input indicating a related information of the retrieval key, when the recognition result does not satisfy the prescribed condition;
  
  (e) carrying out the speech recognition processing for the another speech input to obtain another recognition result, and adjusting the recognition result according to the another recognition result to obtain adjusted recognition result; and
  
  (f) repeating the step (c) or the steps (d) and (e) using the adjusted recognition result in place of the recognition result, until the retrieval key is determined.
- - 2. The method of claim 1, wherein the step (d) also carries out the speech recognition processing for the speech input with respect to as many of the non-prioritized recognition target words as a number of data that can be processed by the speech recognition processing in the prescribed processing time to obtain additional recognition result, while carrying out the related information query using the speech dialogue with the user, and the step (e) also adjusts the recognition result by adding the additional recognition result.
  - 3. The method of claim 2, wherein the non-prioritized recognition target words are subdivided into a plurality of sets each containing a number of recognition target words that can be processed by the speech recognition processing in the prescribed processing time, and the step (d) carries out the speech recognition processing for the speech input with respect to the plurality of sets in an order of the importance levels of the recognition target words contained in each set.
  - 4. The method of claim 1, wherein the recognition result indicates recognition retrieval key candidates and their recognition likelihoods and the another recognition result indicates recognition related information candidates and their recognition likelihoods, and the step (e) adjusts the recognition result by calculating new recognition likelihoods for the recognition retrieval key candidates according to recognition likelihoods for the recognition retrieval key candidates indicated in the recognition result and recognition likelihoods for the recognition related information candidates indicated in the another recognition result.
  - 5. The method of claim 4, wherein the step (e) calculates the new recognition likelihoods for the recognition retrieval key candidates by multiplying a recognition likelihood of each recognition retrieval key candidate with a recognition likelihood of a corresponding recognition related information candidate.
  - 6. The method of claim 1, wherein the recognition result indicates recognition retrieval key candidates and their recognition likelihoods, and the step (c) judges that the recognition result satisfies the prescribed condition, when a number of recognition retrieval key leading candidates which have recognition likelihoods that are exceeding a prescribed likelihood threshold is less than or equal to a prescribed number but not zero.
  - 7. The method of claim 1, wherein the statistical information used at the step (a) is access frequencies of the retrieval key candidates.
  - 8. The method of claim 1, wherein the prescribed processing time used at the step (a) is a real dialogue processing time specified in advance.
  - 9. The method of claim 1, wherein the retrieval key indicates an attribute value of one attribute of the target information, and the related information requested by the related information query of the step (d) is an attribute value of another attribute of the target information other than the one attribute.
  - 10. The method of claim 9, wherein attributes of the target information are hierarchically ordered, and the another attribute is a hierarchically adjacent one of the one attribute.
  - 11. The method of claim 9, wherein the another attribute is selected to be an attribute having attribute value candidates that constitute a number of data that can be processed by the speech recognition processing in the prescribed processing time.
  - 12. The method of claim 1, wherein the step (a) stores the retrieval key candidates indicating attribute values of a plurality of attributes of the target information, such that the retrieval key entered by the user can indicate an attribute value of any one of the plurality of attributes.
  - 13. The method of claim 1, wherein the step (a) stores the retrieval key candidates as lower level data, and also stores higher level data that constitute a number of data that can be processed by the speech recognition processing in the prescribed processing time, where each lower level data is dependent on one higher level data and lower level data that are dependent on one higher level data constitute a number of data that can be processed by the speech recognition processing in the prescribed processing time.
  - 14. The method of claim 13, wherein the step (c) judges that the recognition result satisfies the prescribed condition when the retrieval key can be determined by a number of confirmation queries less than or equal to a prescribed number.
  - 15. The method of claim 13, wherein the step (d) judges that the recognition result does not satisfy the prescribed condition when the user negated the prescribed number of the confirmation queries.
  - 16. The method of claim 13, wherein the related information requested by the related information query of the step (d) is a higher level data indicating a generic concept to which a specific concept indicated by the retrieval key belongs.
  - 17. The method of claim 16, wherein the step (e) adjusts the recognition result by carrying out another confirmation process using a speech dialogue with the user according to the another recognition result to determine the higher level data, extracting the lower level data that are dependent on determined higher level data as new recognition target data, carrying out the speech recognition processing for the speech input with respect to the new recognition target data to obtain the another recognition result.
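
The division of recognition target words in claims 1(a), 3 and 7 can be illustrated with a short sketch. It assumes that the number of words the recognizer can process within the prescribed processing time is known as a fixed integer (`capacity`, a hypothetical parameter) and that importance is given by access counts. The function name and the sample counts are invented for the example.

```python
from typing import Dict, List, Tuple


def partition_vocabulary(
    access_counts: Dict[str, int], capacity: int
) -> Tuple[List[str], List[List[str]]]:
    """Split words into a prioritized set and ordered non-prioritized sets.

    capacity is the assumed number of words that can be recognized
    within the prescribed processing time.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    # Most frequently accessed words first; ties broken alphabetically.
    ranked = sorted(access_counts, key=lambda w: (-access_counts[w], w))
    prioritized = ranked[:capacity]
    rest = ranked[capacity:]
    # Each non-prioritized set also fits within the capacity (claim 3).
    batches = [rest[i:i + capacity] for i in range(0, len(rest), capacity)]
    return prioritized, batches


# Hypothetical access counts for seven city names.
counts = {
    "Tokyo": 900, "Osaka": 700, "Nagoya": 400, "Sapporo": 300,
    "Sendai": 120, "Naha": 80, "Otaru": 15,
}
prioritized, batches = partition_vocabulary(counts, capacity=3)
print(prioritized)
print(batches)
```

Because the sets are ordered by importance, a recognizer working through `batches` while the related information query is in progress would reach the more frequently requested words first.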

18. A method of speech recognition based interactive information retrieval for ascertaining and retrieving a target information of a user by determining a retrieval key entered by the user using a speech recognition processing, comprising the steps of:
- (a) storing retrieval key candidates that are classified according to attribute values of an attribute item in a speech recognition database;
  
  (b) requesting the user by a speech dialogue with the user to enter a speech input indicating an attribute value of the attribute item for the retrieval key, and carrying out the speech recognition processing for the speech input to obtain a recognition result indicating attribute value candidates and their recognition likelihoods;
  
  (c) selecting those attribute value candidates which have recognition likelihoods that are exceeding a prescribed likelihood threshold as attribute value leading candidates, and extracting those retrieval key candidates that belong to the attribute value leading candidates as new recognition target data;
  
  (d) requesting the user by a speech dialogue with the user to enter another speech input indicating the retrieval key, and carrying out the speech recognition processing for the another speech input with respect to the new recognition target data to obtain another recognition result; and
  
  (e) carrying out a confirmation process using a speech dialogue with the user according to the another recognition result to determine the retrieval key.
- - 19. The method of claim 18, wherein the attribute item is selected to be an attribute having attribute value candidates that constitute a number of data that can be processed by the speech recognition processing in a prescribed processing time.
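
Steps (c) and (d) of claim 18 narrow the recognition targets before the retrieval key itself is recognized. The following sketch shows only that narrowing step, under the assumption that the database can be read as a mapping from attribute values to the retrieval key candidates classified under them. The function name, the mapping and the likelihood values are hypothetical.

```python
from typing import Dict, List


def narrow_by_attribute(
    attribute_result: Dict[str, float],
    keys_by_attribute: Dict[str, List[str]],
    threshold: float,
) -> List[str]:
    """Return retrieval key candidates belonging to the leading attribute values."""
    # Attribute value leading candidates: likelihood above the threshold.
    leading = [a for a, p in attribute_result.items() if p > threshold]
    targets: List[str] = []
    for attribute_value in leading:
        targets.extend(keys_by_attribute.get(attribute_value, []))
    return targets


# Hypothetical database: prefectures as attribute values, cities as retrieval keys.
keys_by_attribute = {
    "Kanagawa": ["Yokohama", "Yokosuka"],
    "Kagawa": ["Takamatsu"],
    "Chiba": ["Yotsukaido"],
}
# Hypothetical recognition result for the spoken prefecture name.
attribute_result = {"Kanagawa": 0.8, "Kagawa": 0.7, "Chiba": 0.1}

new_targets = narrow_by_attribute(attribute_result, keys_by_attribute, threshold=0.5)
print(new_targets)
```

Keeping every attribute value above the threshold, rather than only the best one, means that an acoustically similar pair such as the two prefecture names in the example does not need to be resolved before the retrieval key is requested.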

20. A speech recognition based interactive information retrieval apparatus for ascertaining and retrieving a target information of a user by determining a retrieval key entered by the user using a speech recognition processing, comprising:
- a speech recognition database configured to store retrieval key candidates that constitute a number of data that cannot be processed by the speech recognition processing in a prescribed processing time, as recognition target words, the recognition target words being divided into prioritized recognition target words that constitute a number of data that can be processed by the speech recognition processing in the prescribed processing time and that have relatively higher importance levels based on statistical information defined for the recognition target words, and non-prioritized recognition target words other than the prioritized recognition target words;
  
  a speech recognition unit configured to carry out the speech recognition processing; and
  
  a dialogue control unit configured to carry out speech dialogues with the user;
  
  wherein the dialogue control unit carries out a speech dialogue for requesting the user to enter a speech input indicating the retrieval key, such that the speech recognition unit carries out the speech recognition processing for the speech input with respect to the prioritized recognition target words to obtain a recognition result;
  
  the dialogue control unit carries out a speech dialogue for a confirmation process according to the recognition result to determine the retrieval key, when the recognition result satisfies a prescribed condition for judging that the retrieval key can be determined only by a confirmation process with the user;
  
  the dialogue control unit carries out a speech dialogue for a related information query to request the user to enter another speech input indicating a related information of the retrieval key, when the recognition result does not satisfy the prescribed condition, such that the speech recognition unit carries out the speech recognition processing for the another speech input to obtain another recognition result and the dialogue control unit adjusts the recognition result according to the another recognition result to obtain adjusted recognition result, and the dialogue control unit controls the speech dialogues to repeat the confirmation process or the related information query using the adjusted recognition result in place of the recognition result, until the retrieval key is determined.
- - 21. The apparatus of claim 20, wherein the speech recognition unit also carries out the speech recognition processing for the speech input with respect to as many of the non-prioritized recognition target words as a number of data that can be processed by the speech recognition processing in the prescribed processing time to obtain additional recognition result, while the dialogue control unit is carrying out the related information query using the speech dialogue with the user, and the dialogue control unit also adjusts the recognition result by adding the additional recognition result.
  - 22. The apparatus of claim 21, wherein the speech recognition database stores the non-prioritized recognition target words that are subdivided into a plurality of sets each containing a number of recognition target words that can be processed by the speech recognition processing in the prescribed processing time, and the speech recognition unit carries out the speech recognition processing for the speech input with respect to the plurality of sets in an order of the importance levels of the recognition target words contained in each set.
  - 23. The apparatus of claim 20, wherein the speech recognition unit obtains the recognition result that indicates recognition retrieval key candidates and their recognition likelihoods and the another recognition result that indicates recognition related information candidates and their recognition likelihoods, and the dialogue control unit adjusts the recognition result by calculating new recognition likelihoods for the recognition retrieval key candidates according to recognition likelihoods for the recognition retrieval key candidates indicated in the recognition result and recognition likelihoods for the recognition related information candidates indicated in the another recognition result.
  - 24. The apparatus of claim 23, wherein the dialogue control unit calculates the new recognition likelihoods for the recognition retrieval key candidates by multiplying a recognition likelihood of each recognition retrieval key candidate with a recognition likelihood of a corresponding recognition related information candidate.
  - 25. The apparatus of claim 20, wherein the speech recognition unit obtains the recognition result that indicates recognition retrieval key candidates and their recognition likelihoods, and the dialogue control unit judges that the recognition result satisfies the prescribed condition, when a number of recognition retrieval key leading candidates which have recognition likelihoods that are exceeding a prescribed likelihood threshold is less than or equal to a prescribed number but not zero.
  - 26. The apparatus of claim 20, wherein the statistical information used in the speech recognition database is access frequencies of the retrieval key candidates.
  - 27. The apparatus of claim 20, wherein the prescribed processing time used in the speech recognition database is a real dialogue processing time specified in advance.
  - 28. The apparatus of claim 20, wherein the retrieval key indicates an attribute value of one attribute of the target information, and the related information requested by the related information query carried out by the dialogue control unit is an attribute value of another attribute of the target information other than the one attribute.
  - 29. The apparatus of claim 28, wherein attributes of the target information are hierarchically ordered, and the another attribute is a hierarchically adjacent one of the one attribute.
  - 30. The apparatus of claim 28, wherein the another attribute is selected to be an attribute having attribute value candidates that constitute a number of data that can be processed by the speech recognition processing in the prescribed processing time.
  - 31. The apparatus of claim 20, wherein the speech recognition database stores the retrieval key candidates indicating attribute values of a plurality of attributes of the target information, such that the retrieval key entered by the user can indicate an attribute value of any one of the plurality of attributes.
  - 32. The apparatus of claim 20, wherein the speech recognition database stores the retrieval key candidates as lower level data, and also stores higher level data that constitute a number of data that can be processed by the speech recognition processing in the prescribed processing time, where each lower level data is dependent on one higher level data and lower level data that are dependent on one higher level data constitute a number of data that can be processed by the speech recognition processing in the prescribed processing time.
  - 33. The apparatus of claim 32, wherein the dialogue control unit judges that the recognition result satisfies the prescribed condition when the retrieval key can be determined by a number of confirmation queries less than or equal to a prescribed number.
  - 34. The apparatus of claim 32, wherein the dialogue control unit judges that the recognition result does not satisfy the prescribed condition when the user negated the prescribed number of the confirmation queries.
  - 35. The apparatus of claim 32, wherein the related information requested by the related information query carried out by the dialogue control unit is a higher level data indicating a generic concept to which a specific concept indicated by the retrieval key belongs.
  - 36. The apparatus of claim 35, wherein the dialogue control unit adjusts the recognition result by carrying out another confirmation process using a speech dialogue with the user according to the another recognition result to determine the higher level data, extracting the lower level data that are dependent on determined higher level data as new recognition target data, carrying out the speech recognition processing for the speech input with respect to the new recognition target data to obtain the another recognition result.
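
The likelihood adjustment of claims 23 and 24 and the confirmation condition of claim 25 (claims 4 to 6 on the method side) reduce to a few lines of arithmetic. This is a sketch under stated assumptions: `related_of` is a hypothetical mapping from each retrieval key candidate to its corresponding related information candidate, a missing related candidate is treated as likelihood 0.0, and all numbers are invented.

```python
from typing import Dict

Scores = Dict[str, float]


def adjust_likelihoods(
    key_scores: Scores, related_scores: Scores, related_of: Dict[str, str]
) -> Scores:
    """Multiply each key likelihood by that of its related information candidate."""
    return {
        key: p * related_scores.get(related_of[key], 0.0)  # assumption: missing -> 0.0
        for key, p in key_scores.items()
    }


def can_confirm(scores: Scores, threshold: float, max_candidates: int) -> bool:
    """True when the number of leading candidates is between 1 and max_candidates."""
    n_leading = sum(1 for p in scores.values() if p > threshold)
    return 0 < n_leading <= max_candidates


# Hypothetical first recognition result and related information result.
key_scores = {"Yokohama": 0.6, "Yokosuka": 0.5, "Takamatsu": 0.45}
related_of = {"Yokohama": "Kanagawa", "Yokosuka": "Kanagawa", "Takamatsu": "Kagawa"}
related_scores = {"Kanagawa": 0.9, "Kagawa": 0.2}

before = can_confirm(key_scores, threshold=0.4, max_candidates=2)
adjusted = adjust_likelihoods(key_scores, related_scores, related_of)
after = can_confirm(adjusted, threshold=0.4, max_candidates=2)
print(before, after)
```

With these numbers three candidates exceed the threshold before the adjustment and two after it, so the condition only holds once the related information has been taken into account.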

37. A speech recognition based interactive information retrieval apparatus for ascertaining and retrieving a target information of a user by determining a retrieval key entered by the user using a speech recognition processing, comprising:
- a speech recognition database configured to store retrieval key candidates that are classified according to attribute values of an attribute item;
  
  a speech recognition unit configured to carry out the speech recognition processing; and
  
  a dialogue control unit configured to carry out speech dialogues with the user;
  
  wherein the dialogue control unit carries out a speech dialogue for requesting the user to enter a speech input indicating an attribute value of the attribute item for the retrieval key, such that the speech recognition unit carries out the speech recognition processing for the speech input to obtain a recognition result indicating attribute value candidates and their recognition likelihoods;
  
  the dialogue control unit selects those attribute value candidates which have recognition likelihoods that are exceeding a prescribed likelihood threshold as attribute value leading candidates, and extracts those retrieval key candidates that belong to the attribute value leading candidates as new recognition target data;
  
  the dialogue control unit carries out a speech dialogue for requesting the user to enter another speech input indicating the retrieval key, such that the speech recognition unit carries out the speech recognition processing for the another speech input with respect to the new recognition target data to obtain another recognition result; and
  
  the dialogue control unit carries out a speech dialogue for a confirmation process according to the another recognition result to determine the retrieval key.
- - 38. The apparatus of claim 37, wherein the attribute item is selected to be an attribute having attribute value candidates that constitute a number of data that can be processed by the speech recognition processing in a prescribed processing time.

39. A computer usable medium having computer readable program codes embodied therein for causing a computer to function as a speech recognition based interactive information retrieval system for ascertaining and retrieving a target information of a user by determining a retrieval key entered by the user using a speech recognition processing and a speech recognition database for storing retrieval key candidates that constitute a number of data that cannot be processed by the speech recognition processing in a prescribed processing time, as recognition target words in a speech recognition database, the recognition target words being divided into prioritized recognition target words that constitute a number of data that can be processed by the speech recognition processing in the prescribed processing time which have relatively higher importance levels based on statistical information defined for the recognition target words, and non-prioritized recognition target words other than the prioritized recognition target words, the computer readable program codes include:
- a first computer readable program code for causing said computer to request the user by a speech dialogue with the user to enter a speech input indicating the retrieval key, and carry out the speech recognition processing for the speech input with respect to the prioritized recognition target words to obtain a recognition result;
  
  a second computer readable program code for causing said computer to carry out a confirmation process using a speech dialogue with the user according to the recognition result to determine the retrieval key, when the recognition result satisfies a prescribed condition for judging that the retrieval key can be determined only by a confirmation process with the user;
  
  a third computer readable program code for causing said computer to carry out a related information query using a speech dialogue with the user to request the user to enter another speech input indicating a related information of the retrieval key, when the recognition result does not satisfy the prescribed condition;
  
  a fourth computer readable program code for causing said computer to carry out the speech recognition processing for the another speech input to obtain another recognition result, and adjust the recognition result according to the another recognition result to obtain adjusted recognition result; and
  
  a fifth computer readable program code for causing said computer to repeat processing of the second computer readable program code or the third and fourth computer readable program codes using the adjusted recognition result in place of the recognition result, until the retrieval key is determined.

40. A computer usable medium storing a data structure to be used as a speech recognition database in a speech recognition based interactive information retrieval system for ascertaining and retrieving a target information of a user by determining a retrieval key entered by the user using a speech recognition processing, the data structure comprising:
- retrieval key candidates that constitute a number of data that cannot be processed by the speech recognition processing in a prescribed processing time, as recognition target words, the recognition target words being divided into prioritized recognition target words that constitute a number of data that can be processed by the speech recognition processing in the prescribed processing time which have relatively higher importance levels based on statistical information defined for the recognition target words, and non-prioritized recognition target words other than the prioritized recognition target words.
- - 41. The computer usable medium of claim 40, wherein the data structure stores the retrieval key candidates as lower level data, and also stores higher level data that constitute a number of data that can be processed by the speech recognition processing in the prescribed processing time, where each lower level data is dependent on one higher level data and lower level data that are dependent on one higher level data constitute a number of data that can be processed by the speech recognition processing in the prescribed processing time.
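
The two-level data structure of claim 41 places size limits on both levels. One way to picture it, as a sketch only, is a mapping from higher level data to the lower level data that depend on it, together with a check of the three constraints. The class name, the method names and the fixed integer `capacity` (the assumed number of data that can be processed in the prescribed processing time) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HierarchicalVocabulary:
    capacity: int  # assumed number of data recognizable in the prescribed time
    lower_by_higher: Dict[str, List[str]] = field(default_factory=dict)

    def violations(self) -> List[str]:
        """List every way the structure breaks the constraints of the claim."""
        problems: List[str] = []
        if len(self.lower_by_higher) > self.capacity:
            problems.append("too many higher level data")
        seen: Dict[str, str] = {}
        for higher, lower in self.lower_by_higher.items():
            if len(lower) > self.capacity:
                problems.append(f"too many lower level data under {higher}")
            for item in lower:
                # Each lower level data must depend on exactly one higher level data.
                if item in seen and seen[item] != higher:
                    problems.append(f"{item} depends on both {seen[item]} and {higher}")
                seen[item] = higher
        return problems

    def recognition_targets(self, higher: str) -> List[str]:
        """Lower level data to recognize once the higher level data is determined."""
        return list(self.lower_by_higher.get(higher, []))


# Hypothetical content: prefectures as higher level data, cities as lower level data.
vocab = HierarchicalVocabulary(
    capacity=2,
    lower_by_higher={"Kanagawa": ["Yokohama", "Yokosuka"], "Chiba": ["Yotsukaido"]},
)
print(vocab.violations())
print(vocab.recognition_targets("Kanagawa"))
```

Because every group fits within the capacity, recognition against `recognition_targets(...)` after the higher level data is determined stays within the prescribed processing time under the stated assumption.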

42. A computer usable medium having computer readable program codes embodied therein for causing a computer to function as a speech recognition based interactive information retrieval system for ascertaining and retrieving a target information of a user by determining a retrieval key entered by the user using a speech recognition processing and a speech recognition database for storing retrieval key candidates that are classified according to attribute values of an attribute item, the computer readable program codes include:
- a first computer readable program code for causing said computer to request the user by a speech dialogue with the user to enter a speech input indicating an attribute value of the attribute item for the retrieval key, and carry out the speech recognition processing for the speech input to obtain a recognition result indicating attribute value candidates and their recognition likelihoods;
  
  a second computer readable program code for causing said computer to select those attribute value candidates which have recognition likelihoods that are exceeding a prescribed likelihood threshold as attribute value leading candidates, and extract those retrieval key candidates that belong to the attribute value leading candidates as new recognition target data;
  
  a third computer readable program code for causing said computer to request the user by a speech dialogue with the user to enter another speech input indicating the retrieval key, and carry out the speech recognition processing for the another speech input with respect to the new recognition target data to obtain another recognition result; and
  
  a fourth computer readable program code for causing said computer to carry out a confirmation process using a speech dialogue with the user according to the another recognition result to determine the retrieval key.

Current Assignee: Kumiko Ohmori, Masanobu Higashida, Noriko Mizusawa
Original Assignee: Kumiko Ohmori, Masanobu Higashida, Noriko Mizusawa
Inventors: Kumiko Ohmori, Noriko Mizusawa, Masanobu Higashida
Granted Patent: US 7,286,988 B2
US Class Current: 704/276
CPC Class Codes:
G10L 15/22   Procedures used during a sp...
G10L 2015/223   Execution procedure of a sp...
G10L 2015/226   using non-speech characteri...
Y10S 707/99933   Query processing, i.e. sear...
