Interactive computer system recognizing spoken commands
First Claim
1. An interactive computer system comprising:
a processor executing a target computer program having a series of active program states occurring over a succession of time periods, said target computer program generating active state image data signals representing an active state image for an active state of the target computer program occurring during each time period, each active state image containing one or more objects;
means for displaying at least a first active-state image for a first active state occurring during a first time period;
means for identifying an object displayed in the first active-state image, and for generating from an identified object displayed in the first active-state image a list of one or more first active-state commands identifying a first active-state function which can be performed in the first active state of the target computer program;
means for storing a system vocabulary of acoustic command models, each acoustic command model representing one or more series of acoustic feature values representing an utterance of one or more words associated with the acoustic command model;
means for identifying a first active-state vocabulary of acoustic command models for the first active state, the first active-state vocabulary comprising the acoustic command models from the system vocabulary representing the first active-state commands, wherein the first active-state vocabulary changes dynamically as a function of both the identity of the target computer program and the active state image data signals which identify an active state of the target computer program; and
a speech recognizer for measuring a value of at least one feature of an utterance during each of a first sequence of time intervals within the first time period to produce a first series of feature signals, said speech recognizer comparing the first series of feature signals to each of the acoustic command models in the first active-state vocabulary to generate a match score for the utterance and each acoustic command model, and said speech recognizer outputting a command signal corresponding to the acoustic command model from the first active-state vocabulary having a best match score.
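The vocabulary-selection elements of the claim can be illustrated with a minimal sketch. It assumes a hypothetical lookup table that maps each program's displayable objects to command words; the names `AcousticCommandModel`, `commands_from_objects`, `active_state_vocabulary`, and the table layout are illustrative assumptions, not taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcousticCommandModel:
    """Hypothetical container: a command word plus its stored feature series."""
    command: str
    feature_series: tuple  # one or more series of acoustic feature values


def commands_from_objects(displayed_objects, object_commands):
    """List the commands implied by the objects identified in the active-state image."""
    commands = []
    for obj in displayed_objects:
        for cmd in object_commands.get(obj, []):
            if cmd not in commands:  # keep first-seen order, drop repeats
                commands.append(cmd)
    return commands


def active_state_vocabulary(system_vocabulary, program_id,
                            displayed_objects, object_commands_by_program):
    """Pick the subset of the system vocabulary that matches the current state.

    The result depends on both the program identity and the objects currently
    displayed, so it changes whenever either one changes.
    """
    object_commands = object_commands_by_program.get(program_id, {})
    commands = commands_from_objects(displayed_objects, object_commands)
    return {cmd: system_vocabulary[cmd]
            for cmd in commands if cmd in system_vocabulary}
```

Under these assumptions, a window that shows only a file menu yields a vocabulary limited to that menu's commands, and the same call with a different program identifier or a different set of displayed objects yields a different subset.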
1 Assignment
0 Petitions
Abstract
An interactive computer system having a processor executing a target computer program, and having a speech recognizer for converting an utterance into a command signal for the target computer program. The target computer program has a series of active program states occurring over a series of time periods. At least a first active-state image is displayed for a first active state occurring during a first time period. At least one object displayed in the first active-state image is identified, and a list of one or more first active-state commands identifying functions which can be performed in the first active state of the target computer program is generated from the identified object. A first active-state vocabulary of acoustic command models for the first active state comprises the acoustic command models from a system vocabulary representing the first active-state commands. A speech recognizer measures the value of at least one feature of an utterance during each of a series of successive time intervals within the first time period to produce a series of feature signals. The measured feature signals are compared to each of the acoustic command models in the first active-state vocabulary to generate a match score for the utterance and each acoustic command model. The speech recognizer outputs a command signal corresponding to the command model from the first active-state vocabulary having the best match score.
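The matching step described in the abstract can be sketched as follows. The abstract does not say how a match score is computed, so this sketch substitutes dynamic time warping over Euclidean frame distances as a stand-in; the function names and the convention that a lower score is a better match are assumptions made for illustration.

```python
import math


def dtw_distance(features, model_series):
    """Dynamic-time-warping distance between two sequences of feature vectors."""
    n, m = len(features), len(model_series)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(features[i - 1], model_series[j - 1])
            # Extend the cheapest of the three allowed warping moves.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(features, active_vocabulary):
    """Score the utterance against every model in the active vocabulary.

    active_vocabulary maps a command to a list of feature series (a model may
    hold more than one series). Returns (best_command, scores); best_command
    is None when the vocabulary is empty. Lower score means closer match.
    """
    scores = {
        cmd: min(dtw_distance(features, series) for series in series_list)
        for cmd, series_list in active_vocabulary.items()
    }
    if not scores:
        return None, scores
    return min(scores, key=scores.get), scores
```

Because only the active-state vocabulary is scored, the number of comparisons per utterance grows with the number of commands valid in the current state rather than with the size of the full system vocabulary.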
131 Citations
25 Claims
1. An interactive computer system comprising:
a processor executing a target computer program having a series of active program states occurring over a succession of time periods, said target computer program generating active state image data signals representing an active state image for an active state of the target computer program occurring during each time period, each active state image containing one or more objects;
means for displaying at least a first active-state image for a first active state occurring during a first time period;
means for identifying an object displayed in the first active-state image, and for generating from an identified object displayed in the first active-state image a list of one or more first active-state commands identifying a first active-state function which can be performed in the first active state of the target computer program;
means for storing a system vocabulary of acoustic command models, each acoustic command model representing one or more series of acoustic feature values representing an utterance of one or more words associated with the acoustic command model;
means for identifying a first active-state vocabulary of acoustic command models for the first active state, the first active-state vocabulary comprising the acoustic command models from the system vocabulary representing the first active-state commands, wherein the first active-state vocabulary changes dynamically as a function of both the identity of the target computer program and the active state image data signals which identify an active state of the target computer program; and
a speech recognizer for measuring a value of at least one feature of an utterance during each of a first sequence of time intervals within the first time period to produce a first series of feature signals, said speech recognizer comparing the first series of feature signals to each of the acoustic command models in the first active-state vocabulary to generate a match score for the utterance and each acoustic command model, and said speech recognizer outputting a command signal corresponding to the acoustic command model from the first active-state vocabulary having a best match score.
Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13)
14. A method of computer interaction comprising:
executing, on a processor, a target computer program having a series of active program states occurring over a succession of time periods, said target computer program generating active state image data signals representing an active state image for an active state of the target computer program occurring during each time period, each active state image containing one or more objects;
displaying at least a first active-state image for a first active state occurring during a first time period;
identifying an object displayed in the first active-state image, and generating from an identified object displayed in the first active-state image a list of one or more first active-state commands identifying a first active-state function which can be performed in the first active state of the target computer program;
storing a system vocabulary of acoustic command models, each acoustic command model representing one or more series of acoustic feature values representing an utterance of one or more words associated with the acoustic command model;
identifying a first active-state vocabulary of acoustic command models for the first active state, the first active-state vocabulary comprising the acoustic command models from the system vocabulary representing the first active-state commands wherein the first active-state vocabulary changes dynamically as a function of both the identity of the target computer program and the active state image data signals which identify an active state of the target computer program; and
measuring a value of at least one feature of an utterance during each of first sequence of time intervals within the first time period to produce a first series of feature signals;
comparing the first series of feature signals to each of the acoustic command models in the first active-state vocabulary to generate a match score for the utterance and each acoustic command model; and
outputting a command signal corresponding to the acoustic command model from the first active-state vocabulary having a best match score.
Dependent Claims (15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25)
Specification