Method and system for deriving a large-span semantic language model for large-vocabulary recognition systems
Abstract
A system and method for deriving a large-span semantic language model for a large vocabulary recognition system is disclosed. The method and system map words from a vocabulary into a vector space, where each word is represented by a vector. After the words are mapped to the space, the vectors are clustered into a set of clusters, where each cluster represents a semantic event. After clustering, the probability that a first word will occur given a history of prior words is computed by (i) calculating the probability that the vector representing the first word belongs to each of the clusters; (ii) calculating the probability of each cluster occurring in the history of prior words; and (iii) weighting (i) by (ii) to provide the probability.
32 Claims
1. A method for deriving a large-span semantic language model for a large vocabulary recognition system, the method comprising the steps of:
(a) mapping words into a vector space, where each word is represented by a vector;

(b) clustering the vectors into a set of clusters, where each cluster represents a semantic event;

(c) computing a first probability that a first word will occur given a history of prior words by,

(i) calculating a second probability that a vector representing the first word belongs to each of the clusters, the second probability capable of being independent of a location of the first word in a sentence;

(ii) calculating a third probability of each cluster occurring in a history of prior words; and

(iii) weighting the second probability by the third probability.

Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 22, 23, 24, 25, 26, 27.
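The computation in step (c) is, in effect, a mixture model: the probability of the next word combines a cluster-membership distribution for the candidate word with a distribution over clusters derived from the history. A minimal sketch, assuming an exponential-of-negative-distance soft membership and a nearest-centroid assignment for the history words (the function name, the membership formula, and the `temperature` parameter are illustrative choices, not taken from the patent):

```python
import numpy as np

def word_probability(word_vec, history_vecs, centroids, temperature=1.0):
    """Sketch of step (c): weight cluster membership of the candidate
    word (step i) by cluster frequency in the history (step ii)."""
    # (i) soft cluster membership of the candidate word, from its
    #     distance to each centroid; note this depends only on the
    #     word vector, not on the word's position in the sentence
    d = np.linalg.norm(centroids - word_vec, axis=1)
    p_word_in_cluster = np.exp(-d / temperature)
    p_word_in_cluster /= p_word_in_cluster.sum()

    # (ii) how often each cluster occurs in the history: assign each
    #      history word to its nearest centroid and count
    assign = np.argmin(
        np.linalg.norm(history_vecs[:, None, :] - centroids[None, :, :],
                       axis=2),
        axis=1)
    counts = np.bincount(assign, minlength=len(centroids)).astype(float)
    p_cluster_in_history = counts / counts.sum()

    # (iii) weight (i) by (ii) and sum over clusters
    return float(p_word_in_cluster @ p_cluster_in_history)
```

Both factors are normalized distributions over the clusters, so the weighted sum in step (iii) always lies between 0 and 1.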
10. A system for deriving a large-span semantic language model comprising:
a training database;

a memory;

a processor coupled to the memory; and

a pattern recognition system executed by the processor, the pattern recognition system including,

means for mapping words from the training database into a vector space, where each word is represented by a vector,

means for clustering the vectors into a set of clusters, where each cluster represents a semantic event, and

means for computing a first probability that a first word will occur given a history of prior words by calculating a second probability that a vector representing the first word belongs to each of the clusters, calculating a third probability of each cluster occurring in a history of prior words, and by weighting the second probability by the third probability;

wherein the second probability is capable of being independent of a location of the first word in a sentence.

Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18.
19. A computer-readable medium containing program instructions for deriving a large-span semantic language model for a large vocabulary recognition system, the program instructions for:
(a) mapping words into a vector space, where each word is represented by a vector;

(b) clustering the vectors into a set of clusters, where each cluster represents a semantic event;

(c) computing a first probability that a first word will occur given a history of prior words by,

(i) calculating a second probability that a vector representing the first word belongs to each of the clusters, the second probability capable of being independent of a location of the first word in a sentence;

(ii) calculating a third probability of each cluster occurring in a history of prior words; and

(iii) providing the first probability by weighting the second probability by the third probability.

Dependent claims: 20, 21.
28. A method for deriving a large-span semantic language model for a large vocabulary recognition system, the method comprising the steps of:
(a) tabulating the number of times each word occurs in a set of N training documents using a word-document matrix, wherein entries from the matrix form word vectors of dimension N;

(b) reducing the vectors to a dimension R, where R is significantly less than N;

(c) clustering the vectors into a set of clusters, where each cluster represents a semantic event;

(d) calculating a distance between a vector representing a first word and each of the clusters to provide a probability that the vector representing the first word belongs to each of the clusters, the probability that the vector representing the first word belongs to each of the clusters capable of being independent of a location of the first word in a sentence;

(e) calculating how frequently each cluster occurs in a history of prior words to provide a probability of each cluster occurring in the history of prior words; and

(f) computing a probability that the vector representing the first word will occur given the history of prior words by weighting the probability that the vector representing the first word belongs to each of the clusters by the probability of each cluster occurring in the history of prior words.

Dependent claims: 29, 30, 31, 32.
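Steps (a) through (c) of this claim describe an LSA-style pipeline: build a word-document count matrix, reduce each word vector to rank R, and cluster in the reduced space. A minimal sketch on a toy corpus, assuming a truncated SVD for the dimension reduction and plain k-means for the clustering (the corpus, the choices of R and K, and the k-means details are illustrative, not specified by the claim):

```python
import numpy as np

# Hypothetical toy corpus: N = 4 training documents
docs = [
    "stocks fell as markets closed lower",
    "the team won the final game",
    "markets rallied and stocks rose",
    "players scored in the second game",
]

# (a) tabulate word counts in a word-document matrix W (V x N);
#     each row of W is an N-dimensional word vector
vocab = sorted({w for d in docs for w in d.split()})
W = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        W[vocab.index(w), j] += 1

# (b) reduce the word vectors to dimension R << N via truncated SVD
R = 2
U, S, Vt = np.linalg.svd(W, full_matrices=False)
word_vecs = U[:, :R] * S[:R]        # R-dimensional word representations

# (c) cluster the reduced vectors into K "semantic event" clusters
#     (plain k-means; K and the iteration count are arbitrary here)
K, rng = 2, np.random.default_rng(0)
centroids = word_vecs[rng.choice(len(word_vecs), K, replace=False)]
for _ in range(20):
    assign = np.argmin(
        np.linalg.norm(word_vecs[:, None] - centroids[None], axis=2),
        axis=1)
    centroids = np.array([
        word_vecs[assign == k].mean(axis=0) if np.any(assign == k)
        else centroids[k]
        for k in range(K)])
```

The resulting `word_vecs` and `centroids` are the inputs steps (d) through (f) would consume: distances from a word vector to each centroid yield the membership probabilities, and centroid assignments over the history yield the cluster-frequency weights.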