Hierarchical methods and apparatus for extracting user intent from spoken utterances

US 8,560,325 B2
Filed: 08/01/2012
Issued: 10/15/2013
Est. Priority Date: 08/31/2005
Status: Expired due to Fees

Abstract

Improved techniques are disclosed for permitting a user to employ more human-based grammar (i.e., free form or conversational input) while addressing a target system via a voice system. For example, a technique for determining intent associated with a spoken utterance of a user comprises the following steps/operations. Decoded speech uttered by the user is obtained. An intent is then extracted from the decoded speech uttered by the user. The intent is extracted in an iterative manner such that a first class is determined after a first iteration and a sub-class of the first class is determined after a second iteration. The first class and the sub-class of the first class are hierarchically indicative of the intent of the user, e.g., a target and data that may be associated with the target.
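The two-iteration extraction described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the keyword-overlap scoring, the `HIERARCHY` and `TOP_LEVEL_CUES` tables, and the function names are all hypothetical, chosen only to show a first class being determined from the decoded text and a sub-class of that class then being determined from the same decoded text.

```python
from typing import Dict, List, Tuple

# Hypothetical first-stage cue words for each top-level class (the "target").
TOP_LEVEL_CUES: Dict[str, List[str]] = {
    "radio": ["radio", "music", "station", "volume", "louder"],
    "climate": ["air", "heat", "temperature", "fan", "warmer", "cooler"],
}

# Hypothetical second-stage cue words: each top-level class owns its sub-classes.
HIERARCHY: Dict[str, Dict[str, List[str]]] = {
    "radio": {
        "tune_station": ["tune", "station", "channel"],
        "change_volume": ["volume", "louder", "quieter"],
    },
    "climate": {
        "set_temperature": ["degrees", "warmer", "cooler", "temperature"],
        "set_fan_speed": ["fan", "blower"],
    },
}


def score(tokens: List[str], cues: List[str]) -> int:
    """Count how many tokens of the decoded utterance appear among the cue words."""
    return sum(1 for token in tokens if token in cues)


def best_label(tokens: List[str], candidates: Dict[str, List[str]]) -> str:
    """Return the candidate label whose cue words best match the tokens."""
    return max(candidates, key=lambda label: score(tokens, candidates[label]))


def extract_intent(decoded_text: str) -> Tuple[str, str]:
    """Run two hierarchically dependent stages over the same decoded text."""
    tokens = decoded_text.lower().split()
    # First iteration: choose the top-level class.
    first_class = best_label(tokens, TOP_LEVEL_CUES)
    # Second iteration: choose only among sub-classes of the first class.
    sub_class = best_label(tokens, HIERARCHY[first_class])
    return first_class, sub_class


if __name__ == "__main__":
    print(extract_intent("please make the radio a bit louder"))
    print(extract_intent("I would like it a little warmer in here"))
```

The second stage never sees sub-classes of other top-level classes, which is the sense in which the stages are hierarchically dependent in this sketch. A real system would presumably replace the keyword counts with statistical classifiers.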

20 Claims

1. A method for determining an intended action of a user of a computing system environment, the computing system environment comprising a voice system, the intended action being specified via a spoken input of the user, the method comprising:
- obtaining a decoding of the spoken input of the user; and
- extracting the intended action from the decoding of the spoken input using an iterative hierarchical extraction process comprising analyzing the decoding of the spoken input in multiple hierarchically dependent semantic stages, comprising:
  - determining a first level of classification of the intended action from the decoding of the spoken input during a first semantic stage of the iterative hierarchical extraction process, the first level of classification having a plurality of sub-classifications associated with the first level of classification; and
  - determining, from among the plurality of sub-classifications associated with the first level of classification, a second level of classification of the intended action from the same decoding of the spoken input during a second semantic stage of the iterative hierarchical extraction process, wherein determining the intended action further comprises utilizing information about the user or the user's environment.
- Dependent Claims (2, 3, 4, 5, 6, 7)
  - 2. The method of claim 1, wherein utilizing information about the user or the user's environment comprises utilizing information about the user's environment.
  - 3. The method of claim 2, wherein utilizing information about the user's environment comprises utilizing information about the user's environment including location of the user.
  - 4. The method of claim 2, wherein utilizing information about the user's environment comprises utilizing information about the user's environment selected from the group consisting of: weather, speed, humidity, and temperature.
  - 5. The method of claim 1, wherein utilizing information about the user or the user's environment comprises utilizing information about the user.
  - 6. The method of claim 5, wherein utilizing information about the user comprises utilizing information about the user including a biometric of the user.
  - 7. The method of claim 1, wherein the method comprises extracting a value for at least one attribute at each of the first semantic stage and the second semantic stage of the iterative hierarchical extraction process.

8. At least one computer readable storage device encoded with a plurality of instructions that, when executed, cause at least one processor to perform a method for determining an intended action of a user of a computing system environment, the computing system environment comprising a voice system, the intended action being specified via a spoken input of the user, wherein the method comprises acts of:
- obtaining a decoding of the spoken input of the user; and
- extracting the intended action from the decoding of the spoken input using an iterative hierarchical extraction process comprising analyzing the decoding of the spoken input in multiple hierarchically dependent semantic stages, comprising:
  - determining a first level of classification of the intended action from the decoding of the spoken input during a first semantic stage of the iterative hierarchical extraction process, the first level of classification having a plurality of sub-classifications associated with the first level of classification; and
  - determining, from among the plurality of sub-classifications associated with the first level of classification, a second level of classification of the intended action from the same decoding of the spoken input during a second semantic stage of the iterative hierarchical extraction process, wherein determining the intended action further comprises utilizing information about the user or the user's environment.
- Dependent Claims (9, 10, 11, 12, 13, 14)
  - 9. The at least one computer readable storage device of claim 8, wherein determining the intended action comprises utilizing information about the user's environment.
  - 10. The at least one computer readable storage device of claim 9, wherein utilizing information about the user's environment comprises utilizing information about the user's environment including location of the user.
  - 11. The at least one computer readable storage device of claim 9, wherein utilizing information about the user's environment comprises utilizing information about the user's environment selected from the group consisting of: weather, speed, humidity, and temperature.
  - 12. The at least one computer readable storage device of claim 8, wherein utilizing information about the user or the user's environment comprises utilizing information about the user.
  - 13. The at least one computer readable storage device of claim 12, wherein utilizing information about the user comprises utilizing information about the user including a biometric of the user.
  - 14. The at least one computer readable storage device of claim 8, wherein the method comprises extracting a value for at least one attribute at each of the first semantic stage and the second semantic stage of the iterative hierarchical extraction process.

15. An apparatus comprising:
- at least one processor programmed to determine an intended action specified via a spoken input of a user of a computing system environment comprising a voice system by:
  - obtaining a decoding of the spoken input of the user; and
  - extracting the intended action from the decoding of the spoken input using an iterative hierarchical extraction process comprising analyzing the decoding of the spoken input in multiple hierarchically dependent semantic stages, comprising:
    - determining a first level of classification of the intended action from the decoding of the spoken input during a first semantic stage of the iterative hierarchical extraction process, the first level of classification having a plurality of sub-classifications associated with the first level of classification; and
    - determining, from among the plurality of sub-classifications associated with the first level of classification, a second level of classification of the intended action from the same decoding of the spoken input during a second semantic stage of the iterative hierarchical extraction process, wherein determining the intended action further comprises utilizing information about the user or the user's environment.
- Dependent Claims (16, 17, 18, 19, 20)
  - 16. The apparatus of claim 15, wherein utilizing information about the user or the user's environment comprises utilizing information indicative of external situations provided by one or more sensors.
  - 17. The apparatus of claim 16, wherein utilizing information about the user or the user's environment comprises utilizing information about the user's environment including location of the user.
  - 18. The apparatus of claim 16, wherein utilizing information about the user or the user's environment comprises utilizing information about the user's environment selected from the group consisting of: weather, speed, humidity, and temperature.
  - 19. The apparatus of claim 15, wherein utilizing information about the user or the user's environment comprises utilizing information about the user.
  - 20. The apparatus of claim 19, wherein utilizing information about the user comprises utilizing information about the user including a biometric of the user.
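
Two recurring claim elements, using information about the user's environment and extracting an attribute value at each semantic stage, can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the `Context` and `StageResult` types, the cue-word tables, the 15 °C threshold, and the 0.5 score bonus are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class Context:
    """Hypothetical snapshot of environment information (e.g. from sensors)."""
    location: Optional[str] = None
    temperature_c: Optional[float] = None


@dataclass
class StageResult:
    """Classification label plus any attribute values found at that stage."""
    label: str
    attributes: Dict[str, str] = field(default_factory=dict)


# Hypothetical cue words for the first-level classes.
TARGET_CUES: Dict[str, Set[str]] = {
    "radio": {"radio", "station", "volume"},
    "climate": {"heat", "temperature", "degrees", "fan"},
}

# Hypothetical cue words for the sub-classifications of each first-level class.
ACTION_CUES: Dict[str, Dict[str, Set[str]]] = {
    "radio": {
        "change_volume": {"volume", "up", "down", "louder"},
        "tune_station": {"station", "tune"},
    },
    "climate": {
        "set_temperature": {"temperature", "degrees", "up", "down", "warmer"},
        "set_fan_speed": {"fan"},
    },
}


def first_stage(tokens: List[str], ctx: Context) -> StageResult:
    """First semantic stage: pick the target and extract a 'direction' attribute."""
    scores = {
        target: float(sum(tok in cues for tok in tokens))
        for target, cues in TARGET_CUES.items()
    }
    # Assumed rule: a cold cabin nudges ambiguous requests toward climate control.
    if ctx.temperature_c is not None and ctx.temperature_c < 15.0:
        scores["climate"] += 0.5
    label = max(scores, key=lambda target: scores[target])
    attributes = {"direction": tok for tok in tokens if tok in ("up", "down")}
    return StageResult(label, attributes)


def second_stage(tokens: List[str], first: StageResult) -> StageResult:
    """Second semantic stage: pick a sub-class of the first label, extract a value."""
    candidates = ACTION_CUES[first.label]
    label = max(candidates, key=lambda a: sum(tok in candidates[a] for tok in tokens))
    attributes = {"value": tok for tok in tokens if tok.isdigit()}
    return StageResult(label, attributes)


def extract(decoded_text: str, ctx: Context) -> Tuple[StageResult, StageResult]:
    """Both stages read the same decoded text; the second depends on the first."""
    tokens = decoded_text.lower().split()
    first = first_stage(tokens, ctx)
    return first, second_stage(tokens, first)


if __name__ == "__main__":
    print(extract("turn it up to 22", Context(temperature_c=10.0)))
    print(extract("turn it up to 22", Context()))
```

With the same decoded text, the sketch resolves "turn it up to 22" to the climate target when the context reports a cold cabin and to the radio target when no context is available, which shows how non-speech information could disambiguate an otherwise ambiguous utterance.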

Current Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Inventors: Kanevsky, Dimitri; Reisinger, Joseph Simon; Sicconi, Roberto; Viswanathan, Mahesh
Primary Examiner(s): He, Jialong
Application Number: US13/564,596
Publication Number: US 20130006637A1
Time in Patent Office: 440 Days
Field of Search: 704/275, 704/251, 704/270, 704/257
US Class Current: 704/275
CPC Class Codes:
G10L 15/1815   Semantic context, e.g. disa...
G10L 15/1822   Parsing for meaning underst...
G10L 2015/226   using non-speech characteri...
