Speech recognition using nonparametric speech models
Abstract
The content of a speech sample is recognized using a computer system by evaluating the speech sample against a nonparametric set of training observations, for example, utterances from one or more human speakers. The content of the speech sample is recognized based on the evaluation results. The speech recognition process also may rely on a comparison between the speech sample and a parametric model of the training observations.
24 Claims
1. A method of evaluating a speech sample using a computer, the method comprising:
collecting training observations, each training observation representing a single utterance by a single speaker;
partitioning the training observations into groups of related training observations;
receiving a speech sample; and
assessing a degree to which the speech sample resembles a group of training observations by evaluating the speech sample relative to particular training observations in the group of training observations.
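The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not fix a feature representation or a resemblance measure, so the sketch assumes each training observation is a labeled feature vector, that observations sharing a label form a group, and that resemblance is the negative mean Euclidean distance to the `k` nearest observations in the group. The names `partition_by_label` and `resemblance` are hypothetical.

```python
import math
from collections import defaultdict


def partition_by_label(observations):
    """Group (label, feature_vector) training observations by label.

    Assumption: sharing a label is what makes observations "related".
    """
    groups = defaultdict(list)
    for label, vector in observations:
        groups[label].append(vector)
    return dict(groups)


def resemblance(sample, group, k=2):
    """Score how closely a sample resembles a group (higher is closer).

    Assumption: the score is the negative mean Euclidean distance from
    the sample to its k nearest individual observations in the group.
    """
    distances = sorted(math.dist(sample, obs) for obs in group)
    nearest = distances[:k]
    return -sum(nearest) / len(nearest)


# Hypothetical two-dimensional training observations.
training = [
    ("a", (0.0, 0.0)),
    ("a", (0.0, 1.0)),
    ("b", (5.0, 5.0)),
    ("b", (6.0, 5.0)),
]
groups = partition_by_label(training)
sample = (0.0, 0.5)
scores = {label: resemblance(sample, group) for label, group in groups.items()}
```

Because the score is computed against individual observations rather than a fitted distribution, the group is never summarized by parameters such as a mean and variance, which is the sense in which the evaluation is nonparametric.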
12. A computer-implemented method of recognizing content in a speech sample based on a multi-dimensional speech model derived from training observations, the method comprising:
receiving a speech sample;
identifying a portion of the speech model based on a comparison between the speech sample and the speech model;
evaluating the speech sample against particular training observations on a subset of the training observations that corresponds to the identified portion of the speech model; and
recognizing a content of the speech sample based on the evaluating.
dividing the speech sample into a series of frames;
evaluating each frame relative to each portion of the speech model;
assigning a score to each portion of the speech model for each frame; and
determining that a portion of the speech model is to be identified if the score for the portion differs from a threshold value in a desired direction.
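The frame-scoring steps above can be sketched as follows. This is an illustration under stated assumptions, not the method as actually implemented: each portion of the speech model is assumed to be summarized by a mean vector, the frame score is assumed to be a negative squared distance, and "differs from a threshold value in a desired direction" is assumed to mean "exceeds the threshold". All function names are hypothetical.

```python
def split_into_frames(samples, frame_len):
    """Divide a sequence of samples into consecutive, non-overlapping frames."""
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, frame_len)]


def score_frame(frame, portion_mean):
    """Assumed score: negative squared distance to the portion's mean vector."""
    return -sum((x - m) ** 2 for x, m in zip(frame, portion_mean))


def identify_portions(samples, portions, frame_len, threshold):
    """Return, per portion, the (frame index, score) pairs that pass the threshold."""
    identified = {}
    for index, frame in enumerate(split_into_frames(samples, frame_len)):
        for name, mean in portions.items():
            score = score_frame(frame, mean)
            if score > threshold:  # assumed "desired direction": higher is better
                identified.setdefault(name, []).append((index, score))
    return identified


# Hypothetical model with two portions and a four-sample speech signal.
portions = {"lo": [0.0, 0.0], "hi": [1.0, 1.0]}
identified = identify_portions([0.0, 0.1, 1.0, 0.9], portions, frame_len=2, threshold=-0.1)
```

Recording which frames passed for each portion also gives the designated frames that a later rescoring step could evaluate against individual training observations.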
17. The method of claim 16, wherein identifying a portion of the speech sample comprises designating at least one frame as corresponding to the identified portion, and in which the recognizing comprises for each identified portion of the speech model:
evaluating the at least one designated frame relative to each training observation for the identified portion of the speech model;
modifying the score for the identified portion based on a result of the evaluation relative to training observations; and
identifying the content of the speech sample as corresponding to the identified portion based on the modified score.
18. The method of claim 17 in which the modifying comprises smoothing the score using a weighting factor.
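One way to picture the score modification of claims 17 and 18 is a linear interpolation between the score from the speech model and a score obtained from the training observations. The claims do not state the smoothing formula, so the interpolation below, the higher-is-better score convention, and the names `smooth_score` and `best_portion` are all assumptions made for illustration.

```python
def smooth_score(model_score, observation_score, weight):
    """Blend two scores; weight=1.0 keeps only the model score (assumed form)."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * model_score + (1.0 - weight) * observation_score


def best_portion(model_scores, observation_scores, weight):
    """Smooth every identified portion's score and pick the highest one."""
    smoothed = {
        name: smooth_score(model_scores[name], observation_scores[name], weight)
        for name in model_scores
    }
    return max(smoothed, key=smoothed.get), smoothed


# Hypothetical scores for two identified portions (higher is better).
model_scores = {"yes": -1.0, "no": -0.8}
observation_scores = {"yes": -0.2, "no": -1.5}
winner, smoothed = best_portion(model_scores, observation_scores, weight=0.5)
```

In this example the model score alone would favor "no", while the blended score favors "yes", showing how evidence from individual training observations can change the recognized content.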
19. A speech recognition system comprising:
an input device configured to receive a speech sample to be recognized;
a stored nonparametric vocabulary representing utterances from one or more human speakers, the vocabulary including discrete training observations, each of which represents a single utterance by a single speaker; and
a processor coupled to the input device and to the nonparametric vocabulary and configured to evaluate the speech sample against the nonparametric vocabulary.
21. A computer program, residing on a computer readable medium, for a speech recognition system comprising a processor and an input device, the computer program comprising instructions to perform the following operations:
evaluate a speech sample against a nonparametric speech model, the speech model including discrete training observations, each of which represents a single utterance by a single speaker; and
recognize a speech content of the speech sample based on a result of the evaluation.
Specification