Information processing apparatus, information processing method, and program
First Claim
1. An information processing apparatus comprising a microprocessor comprising a plurality of portions configured to be executed by the microprocessor, comprising:
a pre-score adjustment portion which calculates a pre-score based on context information obtained as observation information, for an intention model as a unit corresponding to each of a plurality of types of intention information registered in advance;
a multi-matching portion which determines the most suitable word group for an input speech based on a user's utterance and calculates an acoustic score and a linguistic score to be given to the word group for the intention model as a unit; and
an intention determination portion which determines intention information corresponding to an intention model achieving the highest total score as an intention corresponding to the user's utterance by comparing total scores calculated from the pre-score, the acoustic score, and the linguistic score of the intention model as a unit,
a pre-score storing portion in which a pre-score corresponding to a context with respect to each context information corresponding to a plurality of different types of observation information has been registered, wherein
the observation information includes the plurality of types of observation information,
the pre-score adjustment portion selects the pre-score corresponding to a context that has been registered in the pre-score storing portion based on the context information and calculates a pre-score for the intention model as a unit by applying the selected pre-score corresponding to a context, and
the context information as the observation information includes at least information of speech input person identification input from an image processing portion.
1 Assignment
0 Petitions
Abstract
An apparatus, method and program for performing a speech recognition process utilizing contextual information that comprises an estimation of the intention of an utterance of a user. The recognition process includes calculating a pre-score based on observed contextual information according to intention models which correspond to a plurality of types of intention information, and combining the pre-scoring results with acoustic and linguistic scores to obtain an improved recognition or comprehension of the intent of a user utterance.
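The scoring flow summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the table contents, the intention names, the function names, and the choice to combine scores by adding log values are all assumptions made for illustration, since the text only states that a total score is calculated from the pre-score, the acoustic score, and the linguistic score.

```python
import math

# Hypothetical pre-score table (all values invented for illustration):
# (observation type, observed context) -> pre-score per intention.
PRE_SCORE_TABLE = {
    ("speaker", "person_A"): {"play_music": 0.6, "set_alarm": 0.2, "show_weather": 0.2},
    ("speaker", "person_B"): {"play_music": 0.1, "set_alarm": 0.6, "show_weather": 0.3},
    ("time_of_day", "night"): {"play_music": 0.2, "set_alarm": 0.6, "show_weather": 0.2},
}


def pre_score(intention, context):
    """Select the registered pre-score for each observed context item.

    Assumption: pre-scores from several observation types are combined
    by summing their logarithms. Unregistered contexts contribute nothing.
    """
    total = 0.0
    for key in context.items():  # key is (observation type, observed value)
        table = PRE_SCORE_TABLE.get(key)
        if table is not None:
            total += math.log(table[intention])
    return total


def determine_intention(matches, context):
    """Return the intention with the highest total score, plus all totals.

    matches maps each intention to (acoustic_log_score, linguistic_log_score),
    standing in for the output of the matching step.
    Assumption: total = pre-score + acoustic score + linguistic score.
    """
    totals = {
        intention: pre_score(intention, context) + acoustic + linguistic
        for intention, (acoustic, linguistic) in matches.items()
    }
    return max(totals, key=totals.get), totals


# Invented matching results: "play_music" is slightly ahead on speech alone.
matches = {
    "play_music": (-10.0, -5.0),
    "set_alarm": (-10.2, -5.1),
    "show_weather": (-14.0, -7.0),
}
# Speaker identity would come from image processing in the claimed apparatus.
context = {"speaker": "person_B", "time_of_day": "night"}

best, totals = determine_intention(matches, context)
print(best)  # set_alarm
```

In this example the acoustic and linguistic scores alone slightly favor "play_music", but the context-selected pre-scores shift the decision to "set_alarm", which is the kind of effect the pre-score is meant to have.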
18 Citations
9 Claims
1. An information processing apparatus comprising a microprocessor comprising a plurality of portions configured to be executed by the microprocessor, comprising:
a pre-score adjustment portion which calculates a pre-score based on context information obtained as observation information, for an intention model as a unit corresponding to each of a plurality of types of intention information registered in advance;
a multi-matching portion which determines the most suitable word group for an input speech based on a user's utterance and calculates an acoustic score and a linguistic score to be given to the word group for the intention model as a unit; and
an intention determination portion which determines intention information corresponding to an intention model achieving the highest total score as an intention corresponding to the user's utterance by comparing total scores calculated from the pre-score, the acoustic score, and the linguistic score of the intention model as a unit,
a pre-score storing portion in which a pre-score corresponding to a context with respect to each context information corresponding to a plurality of different types of observation information has been registered, wherein
the observation information includes the plurality of types of observation information,
the pre-score adjustment portion selects the pre-score corresponding to a context that has been registered in the pre-score storing portion based on the context information and calculates a pre-score for the intention model as a unit by applying the selected pre-score corresponding to a context, and
the context information as the observation information includes at least information of speech input person identification input from an image processing portion.
Dependent claims: 2, 3, 4, 5, 6, 7
8. An information processing method executed in an information processing apparatus comprising a microprocessor, the method comprising:
causing a pre-score adjustment portion to calculate a pre-score based on context information obtained as observation information, for an intention model as a unit corresponding to each of a plurality of types of intention information registered in advance;
causing a multi-matching portion to determine the most suitable word group for an input speech based on a user's utterance and to calculate an acoustic score and a linguistic score to be given to the word group for the intention model as a unit;
causing an intention determination portion to determine intention information corresponding to an intention model achieving the highest total score as an intention corresponding to the user's utterance by comparing the total scores calculated from the pre-score, acoustic score, and the linguistic score of the intention model as a unit; and
causing a pre-score storing portion to register a pre-score corresponding to a context with respect to each context information corresponding to a plurality of different types of observation information, wherein
the observation information includes the plurality of types of observation information,
causing the pre-score storing portion to register further comprises causing the pre-score storing portion to select the pre-score corresponding to a context that has been registered in the pre-score storing portion based on the context information and calculates a pre-score for the intention model as a unit by applying the selected pre-score corresponding to a context, and
the context information as the observation information includes at least information of speech input person identification input from an image processing portion.
9. A non-transitory, computer readable medium configured to cause an information processing apparatus to perform information processing comprising:
causing a pre-score adjustment portion to calculate a pre-score based on context information obtained as observation information, for an intention model as a unit corresponding to each of a plurality of types of intention information registered in advance;
causing a multi-matching portion to determine the most suitable word group for an input speech based on a user's utterance and to calculate an acoustic score and a linguistic score to be given to the word group for the intention model as a unit;
causing an intention determination portion to determine intention information corresponding to an intention model achieving the highest total score as an intention corresponding to the user's utterance by comparing the total scores calculated from the pre-score, the acoustic score, and the linguistic score of the intention model as a unit; and
causing a pre-score storing portion to register a pre-score corresponding to a context with respect to each context information corresponding to a plurality of different types of observation information, wherein
the observation information includes the plurality of types of observation information,
causing the pre-score storing portion to register further comprises causing the pre-score storing portion to select the pre-score corresponding to a context that has been registered in the pre-score storing portion based on the context information and calculates a pre-score for the intention model as a unit by applying the selected pre-score corresponding to a context, and
the context information as the observation information includes at least information of speech input person identification input from an image processing portion.
Specification