Method and apparatus for command recognition using data-driven semantic inference
Abstract
A method and apparatus for command recognition using data-driven semantic inference includes recognizing a sequence of words received as the voice command. Data-driven semantic inference is then used with the recognized sequence of words to recognize the voice command. Thus, the command is identified on the basis of the semantics of the words in the spoken command, rather than on a particular grammar covering each of the predetermined ways the command could be worded.
24 Claims
1. A method for recognizing a voice command, the method comprising:
recognizing a sequence of words received as the voice command; and
using data-driven semantic inference with the recognized sequence of words to recognize the voice command.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
generating a vector representation of the recognized sequence of words; and
comparing the vector representation to a plurality of semantic anchors, wherein each of the plurality of semantic anchors corresponds to one of a plurality of voice commands.
4. The method of claim 3, wherein the using further comprises:
choosing a semantic anchor of the plurality of semantic anchors that is most similar to the vector representation; and
classifying the sequence of words as the command that corresponds to the chosen semantic anchor.
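The choosing and classifying steps of claim 4 can be sketched as a nearest-anchor lookup. This is a minimal sketch only: claim 4 does not name a similarity measure, so plain cosine similarity is assumed here, and the anchor values, command names, and function names are hypothetical.

```python
from math import sqrt
from typing import Dict, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Plain cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def classify(vector: Sequence[float], anchors: Dict[str, Sequence[float]]) -> str:
    """Return the command whose semantic anchor is most similar to `vector`."""
    return max(anchors, key=lambda command: cosine(vector, anchors[command]))


# Hypothetical anchors in a 3-dimensional space, one per voice command.
anchors = {
    "open_file": [0.9, 0.1, 0.0],
    "close_window": [0.1, 0.8, 0.2],
    "check_time": [0.0, 0.2, 0.9],
}

# Hypothetical vector representation of a recognized sequence of words.
command = classify([0.7, 0.3, 0.1], anchors)
```

Because every input is mapped to its closest anchor, this sketch always returns some command; a rejection threshold for out-of-vocabulary input would be a separate design decision that the claim does not address.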
5. The method of claim 4, wherein the choosing comprises:
for each of the plurality of semantic anchors, identifying the similarity between the vector representation and the semantic anchor by calculating the cosine of the angle between the product of the vector representation and a diagonal matrix of singular values and the product of the semantic anchor and the diagonal matrix of singular values; and
choosing the semantic anchor of the plurality of semantic anchors that corresponds to the largest cosine value as the semantic anchor that is most similar to the vector representation.
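The similarity measure of claim 5 can be illustrated as follows. Multiplying a row vector by a diagonal matrix of singular values scales each component by the corresponding singular value, so the measure is the cosine between the two scaled vectors. The sketch below assumes that reading; the singular values, vectors, and names are hypothetical and chosen only for illustration.

```python
from math import sqrt
from typing import Dict, Sequence


def scaled_cosine(v: Sequence[float], anchor: Sequence[float],
                  singular_values: Sequence[float]) -> float:
    """Cosine of the angle between v*S and anchor*S, where S = diag(singular_values)."""
    vs = [x * s for x, s in zip(v, singular_values)]
    scaled_anchor = [x * s for x, s in zip(anchor, singular_values)]
    dot = sum(p * q for p, q in zip(vs, scaled_anchor))
    norm = sqrt(sum(p * p for p in vs)) * sqrt(sum(q * q for q in scaled_anchor))
    return dot / norm if norm else 0.0


def choose_anchor(v: Sequence[float], anchors: Dict[str, Sequence[float]],
                  singular_values: Sequence[float]) -> str:
    """Return the name of the anchor with the largest scaled cosine value."""
    return max(anchors, key=lambda name: scaled_cosine(v, anchors[name], singular_values))


# Hypothetical values: two dimensions, two commands.
singular_values = [4.0, 0.5]
anchors = {"play": [1.0, 0.2], "stop": [0.1, 1.0]}
v = [0.5, 1.0]

best = choose_anchor(v, anchors, singular_values)
```

The scaling gives more weight to dimensions with larger singular values, so the chosen anchor can differ from the one an unscaled cosine would select.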
6. The method of claim 3, wherein the vector representation is an indication of how frequently each of a plurality of words occurs within the recognized sequence of words.
7. The method of claim 6, wherein each of the plurality of semantic anchors is an indication of how frequently each of the plurality of words occurs with respect to the corresponding command.
8. The method of claim 3, wherein each of the plurality of semantic anchors represents a plurality of different ways of speaking the corresponding command.
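Claims 6 through 8 describe the vectors only as indications of word frequency. One minimal reading, assuming raw word counts over a fixed vocabulary, is sketched below. The vocabulary, the example phrasings, and the function names are hypothetical and do not come from the claims, which leave open any weighting or dimensionality reduction applied to such counts.

```python
from collections import Counter
from typing import List, Sequence

# Hypothetical fixed vocabulary; each position in a vector counts one of these words.
VOCABULARY = ["what", "time", "is", "it", "the", "open", "file"]


def count_vector(words: Sequence[str], vocabulary: Sequence[str] = VOCABULARY) -> List[int]:
    """Count how often each vocabulary word occurs in the recognized word sequence."""
    counts = Counter(words)
    return [counts[word] for word in vocabulary]


def anchor_from_phrasings(phrasings: Sequence[str],
                          vocabulary: Sequence[str] = VOCABULARY) -> List[int]:
    """Pool word counts over several different ways of speaking one command."""
    counts = Counter(word for phrase in phrasings for word in phrase.split())
    return [counts[word] for word in vocabulary]


# Two hypothetical phrasings of the same command yield a single anchor.
time_anchor = anchor_from_phrasings(["what time is it", "what is the time"])

# A recognized sequence that matches neither phrasing word for word.
spoken = count_vector("is it time".split())
```

Pooling counts across phrasings is one way a single anchor can stand for many wordings of a command, which is the property claim 8 describes.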
9. A machine-readable medium having stored thereon a plurality of instructions that, when executed by a processor, cause the processor to recognize a voice command by:
recognizing a sequence of words received as the voice command; and
using data-driven semantic inference with the recognized sequence of words to recognize the voice command.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
generating a vector representation of the recognized sequence of words; and
comparing the vector representation to a plurality of semantic anchors, wherein each of the plurality of semantic anchors corresponds to one of a plurality of voice commands.
12. The machine-readable medium of claim 11, wherein the using further comprises:
choosing a semantic anchor of the plurality of semantic anchors that is most similar to the vector representation; and
classifying the sequence of words as the command that corresponds to the chosen semantic anchor.
13. The machine-readable medium of claim 12, wherein the choosing comprises:
for each of the plurality of semantic anchors, identifying the similarity between the vector representation and the semantic anchor by calculating the cosine of the angle between the product of the vector representation and a diagonal matrix of singular values and the product of the semantic anchor and the diagonal matrix of singular values; and
choosing the semantic anchor of the plurality of semantic anchors that corresponds to the largest cosine value as the semantic anchor that is most similar to the vector representation.
14. The machine-readable medium of claim 11, wherein the vector representation is an indication of how frequently each of a plurality of words occurs within the recognized sequence of words.
15. The machine-readable medium of claim 14, wherein each of the plurality of semantic anchors is an indication of how frequently each of the plurality of words occurs with respect to the corresponding command.
16. The machine-readable medium of claim 11, wherein each of the plurality of semantic anchors represents a plurality of different ways of speaking the corresponding command.
17. An apparatus for recognizing a voice command, the apparatus comprising:
a speech recognizer to recognize a sequence of words received as the voice command; and
a semantic classifier, coupled to the speech recognizer, to use data-driven semantic inference with the recognized sequence of words to recognize the voice command.
Dependent claims: 18, 19, 20
an action generator, coupled to the semantic classifier, to use the recognized voice command to determine an action to be performed.
19. The apparatus of claim 17, wherein the semantic classifier is further to generate a vector representation of the recognized sequence of words, and compare the vector representation to a plurality of semantic anchors, wherein each of the plurality of semantic anchors corresponds to one of a plurality of voice commands.
20. The apparatus of claim 19, wherein the semantic classifier is further to choose a semantic anchor of the plurality of semantic anchors that is most similar to the vector representation, and to classify the sequence of words as the command that corresponds to the chosen semantic anchor.
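The coupling of the speech recognizer, semantic classifier, and action generator recited in the apparatus claims can be sketched as a three-stage pipeline. This is an illustrative sketch only: the class name, the stub recognizer (which treats the input bytes as already-transcribed text), the keyword-based stand-in classifier, and the action table are all hypothetical placeholders, not components described in the claims.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CommandApparatus:
    recognize_words: Callable[[bytes], List[str]]  # speech recognizer
    classify: Callable[[List[str]], str]           # semantic classifier
    actions: Dict[str, Callable[[], str]]          # action generator lookup table

    def handle(self, audio: bytes) -> str:
        """Pass the input through recognizer, classifier, and action generator in turn."""
        words = self.recognize_words(audio)
        command = self.classify(words)
        return self.actions[command]()


# Placeholder recognizer: pretends the audio is UTF-8 text and splits it into words.
def stub_recognizer(audio: bytes) -> List[str]:
    return audio.decode("utf-8").split()


# Placeholder classifier: a real one would compare a vector to semantic anchors.
def stub_classifier(words: List[str]) -> str:
    return "check_time" if "time" in words else "open_file"


apparatus = CommandApparatus(
    recognize_words=stub_recognizer,
    classify=stub_classifier,
    actions={
        "check_time": lambda: "showing the clock",
        "open_file": lambda: "opening a file",
    },
)

result = apparatus.handle(b"what time is it")
```

Keeping the three stages behind separate callables mirrors the claim structure, in which each component is coupled only to the one before it.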
21. An apparatus for recognizing a voice command, the apparatus comprising:
means for recognizing a sequence of words received as the voice command; and
means, coupled to the means for recognizing, for using data-driven semantic inference with the recognized sequence of words to recognize the voice command.
Dependent claims: 22, 23, 24
means for generating a vector representation of the recognized sequence of words; and
means, coupled to the means for generating, for comparing the vector representation to a plurality of semantic anchors, wherein each of the plurality of semantic anchors corresponds to one of a plurality of voice commands.
24. The apparatus of claim 23, wherein the means for using further comprises:
means, coupled to the means for generating, for choosing a semantic anchor of the plurality of semantic anchors that is most similar to the vector representation; and
means, coupled to the means for generating, for classifying the sequence of words as the command that corresponds to the chosen semantic anchor.
Specification