Criteria for usable repetitions of an utterance during speech reference enrollment
First Claim
1. A speech reference enrollment method, comprising the steps of:
(a) receiving a first utterance of a vocabulary word;
(b) extracting a plurality of features from the first utterance;
(c) receiving a second utterance of the vocabulary word;
(d) determining a duration of the second utterance;
(e) when the duration is less than a minimum duration, requesting a user speak a third utterance of the vocabulary word and proceeding to step (i);
(f) extracting the plurality of features from the second utterance;
(g) determining a first similarity between the plurality of features from the first utterance and the plurality of features from the second utterance;
(h) when the first similarity is less than a predetermined similarity, requesting a user to speak a third utterance of the vocabulary word;
(i) extracting the plurality of features from the third utterance;
(j) determining a second similarity between the plurality of features from the first utterance and the plurality of features from the third utterance; and
(k) when the second similarity is greater than or equal to the predetermined similarity, forming a reference for the vocabulary word.
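The control flow of steps (a) through (k) can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the patented implementation: the claim does not say which features are extracted, how similarity is measured, or how a reference is formed. The segment-energy features, cosine similarity, feature averaging, the `Utterance` class, and the default thresholds are all hypothetical stand-ins. The claim also does not state what happens when the second utterance passes both checks; the sketch assumes a reference is formed from the first two utterances in that case.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Callable, List, Optional, Sequence

Features = List[float]


@dataclass
class Utterance:
    """Hypothetical container for one recorded utterance."""
    samples: Sequence[float]
    sample_rate: int = 8000  # assumed sampling rate in Hz

    @property
    def duration(self) -> float:
        """Length of the recording in seconds."""
        return len(self.samples) / self.sample_rate


def extract_features(utt: Utterance, n_segments: int = 4) -> Features:
    """Toy feature extractor: mean absolute amplitude per time segment."""
    size = max(1, len(utt.samples) // n_segments)
    feats = []
    for i in range(n_segments):
        seg = utt.samples[i * size:(i + 1) * size]
        feats.append(sum(abs(s) for s in seg) / len(seg) if seg else 0.0)
    return feats


def cosine_similarity(a: Features, b: Features) -> float:
    """Assumed similarity measure; returns 0.0 for an all-zero vector."""
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def form_reference(a: Features, b: Features) -> Features:
    """Assumed reference: element-wise average of two feature vectors."""
    return [(x + y) / 2 for x, y in zip(a, b)]


def enroll(first: Utterance,
           second: Utterance,
           request_third: Callable[[], Utterance],
           min_duration: float = 0.25,
           min_similarity: float = 0.9) -> Optional[Features]:
    """Return a reference for the vocabulary word, or None if enrollment fails."""
    first_feats = extract_features(first)                      # steps (a), (b)
    if second.duration >= min_duration:                        # steps (d), (e)
        second_feats = extract_features(second)                # step (f)
        if cosine_similarity(first_feats, second_feats) >= min_similarity:  # (g)
            # Assumption: this branch is not spelled out in the claim.
            return form_reference(first_feats, second_feats)
    third = request_third()                                    # step (e) or (h)
    third_feats = extract_features(third)                      # step (i)
    if cosine_similarity(first_feats, third_feats) >= min_similarity:  # (j), (k)
        return form_reference(first_feats, third_feats)
    return None
```

A second utterance that is too short skips feature extraction entirely and goes straight to the third-utterance request, which mirrors the jump from step (e) to step (i).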
Abstract
A speech reference enrollment method involves the following steps: (a) requesting a user speak a vocabulary word; (b) detecting a first utterance (354); (c) requesting the user speak the vocabulary word; (d) detecting a second utterance (358); (e) determining a first similarity between the first utterance and the second utterance (362); (f) when the first similarity is less than a predetermined similarity, requesting the user speak the vocabulary word; (g) detecting a third utterance (366); (h) determining a second similarity between the first utterance and the third utterance (370); and (i) when the second similarity is greater than or equal to the predetermined similarity, creating a reference (364).
20 Claims
10. A speech reference enrollment method, comprising the steps of:
(a) requesting a user speak a vocabulary word;
(b) detecting a first utterance;
(c) determining if the first utterance exceeds an amplitude threshold;
(d) when the first utterance does not exceed the amplitude threshold, returning to step (a);
(e) requesting the user speak the vocabulary word;
(f) detecting a second utterance;
(g) determining a first similarity between the first utterance and the second utterance;
(h) when the first similarity is less than a predetermined similarity, requesting the user speak the vocabulary word;
(i) detecting a third utterance;
(j) determining a second similarity between the first utterance and the third utterance; and
(k) when the second similarity is greater than or equal to the predetermined similarity, creating a reference.
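Steps (a) through (d) of claim 10 amount to a prompt-and-retry loop gated on an amplitude check. The sketch below shows one way this might look. It assumes that "exceeds an amplitude threshold" means the peak absolute sample value exceeds the threshold, and it adds a `max_attempts` cap that the claim does not mention. The names `exceeds_amplitude`, `capture_first_utterance`, and `prompt_and_record` are hypothetical.

```python
from typing import Callable, Optional, Sequence


def exceeds_amplitude(samples: Sequence[float], threshold: float) -> bool:
    """Assumed test: peak absolute amplitude is above the threshold."""
    return bool(samples) and max(abs(s) for s in samples) > threshold


def capture_first_utterance(prompt_and_record: Callable[[], Sequence[float]],
                            threshold: float,
                            max_attempts: int = 3) -> Optional[Sequence[float]]:
    """Prompt until a sufficiently loud utterance is detected.

    Returns the accepted samples, or None once max_attempts is used up.
    The attempt cap is an assumption; the claim simply returns to step (a).
    """
    for _ in range(max_attempts):
        samples = prompt_and_record()             # steps (a), (b)
        if exceeds_amplitude(samples, threshold):  # step (c)
            return samples
        # step (d): too quiet, so ask again
    return None
```

Once a first utterance is accepted, the remaining steps (e) through (k) follow the same two-stage similarity comparison as claim 1.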
15. A computer readable storage medium containing computer readable instructions that, when executed by a computer, perform the following steps:
(a) requesting a user speak a vocabulary word;
(b) receiving a first digitized utterance;
(c) extracting a plurality of features from the first digitized utterance;
(d) determining a signal to noise ratio;
(e) when the signal to noise ratio is less than a predetermined signal to noise ratio, returning to step (a);
(f) requesting the user speak the vocabulary word;
(g) receiving a second digitized utterance of the vocabulary word;
(h) extracting the plurality of features from the second digitized utterance;
(i) determining a first similarity between the plurality of features from the first digitized utterance and the plurality of features from the second digitized utterance;
(j) when the first similarity is less than a predetermined similarity, requesting the user to speak a third utterance of the vocabulary word;
(k) extracting the plurality of features from a third digitized utterance;
(l) determining a second similarity between the plurality of features from the first digitized utterance and the plurality of features from the third digitized utterance; and
(m) when the second similarity is greater than or equal to the predetermined similarity, forming a reference for the vocabulary word.
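Claim 15 gates the first utterance on a signal to noise ratio in steps (d) and (e), without saying how the ratio is measured. As a rough sketch, the code below assumes the first `noise_len` samples of the recording contain only background noise and computes the ratio in decibels from mean-square power. The function names, the leading-noise assumption, and the 15 dB default are hypothetical illustrations, not values from the patent.

```python
import math
from typing import Sequence


def mean_power(samples: Sequence[float]) -> float:
    """Mean-square power of a block of samples (0.0 for an empty block)."""
    return sum(s * s for s in samples) / len(samples) if samples else 0.0


def snr_db(samples: Sequence[float], noise_len: int) -> float:
    """Estimate SNR in dB, assuming the first noise_len samples are noise only."""
    noise_power = mean_power(samples[:noise_len])
    speech_power = mean_power(samples[noise_len:])
    if speech_power == 0.0:
        return -math.inf   # nothing but silence after the noise window
    if noise_power == 0.0:
        return math.inf    # perfectly quiet background
    return 10.0 * math.log10(speech_power / noise_power)


def is_usable(samples: Sequence[float], noise_len: int,
              min_snr_db: float = 15.0) -> bool:
    """Step (e): reject the utterance when the SNR is below the minimum."""
    return snr_db(samples, noise_len) >= min_snr_db
```

When `is_usable` returns `False`, the flow in the claim returns to step (a) and prompts for the vocabulary word again.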
Specification