Method and apparatus for speech recognition using latent semantic adaptation
2 Assignments
0 Petitions
Abstract
A method and apparatus for speech recognition using latent semantic adaptation is described herein. According to one aspect of the present invention, a method for recognizing speech comprises using latent semantic analysis (LSA) to generate an LSA space for a collection of documents and to continually adapt the LSA space with new documents as they become available. Adaptation of the LSA space is optimally two-sided, taking into account the new words in the new documents. Alternatively, adaptation is one-sided, taking into account the new documents but discarding any new words appearing in those documents.
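As a minimal sketch of the LSA space the abstract refers to, the snippet below builds a word-document count matrix from a toy corpus and takes a truncated singular value decomposition. Raw counts, the rank of 2, the toy sentences, and the helper names `build_lsa_space` and `count_known_words` are all assumptions made for illustration. Document vectors are taken as rows of VS (as in claim 16); treating word vectors as rows of US is assumed by symmetry. The second helper shows only the input side of one-sided adaptation: words absent from the training vocabulary are set aside rather than counted.

```python
import numpy as np

def build_lsa_space(docs, rank):
    """Build a word-document count matrix and a rank-`rank` LSA space."""
    vocab = sorted({word for doc in docs for word in doc.split()})
    index = {word: i for i, word in enumerate(vocab)}
    W = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for word in doc.split():
            W[index[word], j] += 1.0          # raw counts (an assumed weighting)
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    U, s, V = U[:, :rank], s[:rank], Vt[:rank].T
    word_vectors = U * s                      # rows of US (assumed convention)
    doc_vectors = V * s                       # rows of VS
    return vocab, W, word_vectors, doc_vectors

def count_known_words(doc, vocab):
    """Count training-vocabulary words in `doc`; report unseen words separately."""
    index = {word: i for i, word in enumerate(vocab)}
    counts = np.zeros(len(vocab))
    new_words = set()
    for word in doc.split():
        if word in index:
            counts[index[word]] += 1.0
        else:
            new_words.add(word)               # discarded in the one-sided case
    return counts, sorted(new_words)

training = ["stocks fell sharply", "bond markets rallied", "stocks and bond rallied"]
vocab, W, word_vectors, doc_vectors = build_lsa_space(training, rank=2)
counts, new_words = count_known_words("stocks rallied blockchain", vocab)
```

In a two-sided setting the unseen words would instead extend the vocabulary; this sketch does not attempt that step.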
55 Claims
1. A method for generating a speech recognition database comprising:
generating a latent semantic analysis (LSA) space from a training corpus of documents representative of a language, wherein the LSA space includes one or more document vectors;
receiving a new document that represents a change in the language; and
adapting the LSA space to reflect the change in the language, wherein the adapting includes changing a position of the one or more document vectors in the LSA space by the change in the language.
Dependent claims: 2, 3, 10, 11

4. A method for generating a speech recognition database comprising:
generating a latent semantic analysis (LSA) space from a training corpus of documents representative of a language, wherein the LSA space includes one or more document vectors;
receiving a new document that represents a change in the language; and
adapting the LSA space to reflect the change in the language, wherein the change in the language includes changing a position of the one or more document vectors, wherein adapting the LSA space to reflect the change in the language comprises transforming the LSA space to take into account the new document's influence on the LSA space without re-computing the LSA space, wherein the transforming the LSA space comprises
obtaining a training document vector that characterizes a semantic position of the training document within the LSA space;
computing a new document vector that characterizes a semantic position of the new document within the LSA space;
deriving a document vector transformation matrix; and
applying the document vector transformation matrix to the training document vector and the new document vector to shift a position of each document vector in the LSA space, where the shift in the position reflects the change in the language;
obtaining a training word vector that characterizes a semantic position of the training word within the LSA space;
computing a new word vector that characterizes a semantic position of the new word within the LSA space;
deriving a word vector transformation matrix; and
applying the word vector transformation matrix to the training word vector and the new word vector to shift a position of each word vector in the LSA space, where the shift in the position reflects the change in the language.
Dependent claims: 5, 6, 7, 8, 9

12. A computer-readable medium having executable instructions to cause a computer to perform a method for generating a speech recognition database comprising:
generating a latent semantic analysis (LSA) space from a training corpus of documents representative of a language, wherein the LSA space includes one or more document vectors;
receiving a new document that represents a change in the language; and
adapting the LSA space to reflect the change in the language, wherein the adapting includes changing a position of the one or more document vectors in the LSA space by the change in the language.
Dependent claims: 13, 14, 21, 22

15. A computer-readable medium having executable instructions to cause a computer to perform a method for generating a speech recognition database comprising:
generating a latent semantic analysis (LSA) space from a training corpus of documents representative of a language, wherein the LSA space includes one or more document vectors;
receiving a new document that represents a change in the language; and
adapting the LSA space to reflect the change in the language, wherein the change in the language includes changing a position of the one or more document vectors, wherein adapting the LSA space to reflect the change in the language further comprises transforming the LSA space to take into account the new document's influence on the LSA space without re-computing the LSA space, wherein the transforming the LSA space comprises
obtaining a training document vector that characterizes a semantic position of the training document within the LSA space;
computing a new document vector that characterizes a semantic position of the new document within the LSA space;
deriving a document vector transformation matrix; and
applying the document vector transformation matrix to the training document vector and the new document vector to shift a position of each document vector in the LSA space, where the shift in the position reflects the change in the language;
obtaining a training word vector that characterizes a semantic position of the training word within the LSA space;
computing a new word vector that characterizes a semantic position of the new word within the LSA space;
deriving a word vector transformation matrix; and
applying the word vector transformation matrix to the training word vector and the new word vector to shift a position of each word vector in the LSA space, where the shift in the position reflects the change in the language.

16. A computer-readable medium having executable instructions to cause a computer to perform a method for generating a speech recognition database comprising:
generating a latent semantic analysis (LSA) space from a training corpus of documents representative of a language, wherein the LSA space includes one or more document vectors;
receiving a new document that represents a change in the language; and
adapting the LSA space to reflect the change in the language, wherein the change in the language includes changing a position of the one or more document vectors, wherein adapting the LSA space to reflect the change in the language further comprises transforming the LSA space to take into account the new document's influence on the LSA space without re-computing the LSA space, wherein transforming the LSA space comprises
obtaining a training document vector that characterizes a semantic position of the training document within the LSA space;
computing a new document vector that characterizes a semantic position of the new document within the LSA space;
deriving a document vector transformation matrix; and
applying the document vector transformation matrix to the training document vector and the new document vector to shift a position of each document vector in the LSA space, where the shift in the position reflects the change in the language,
wherein the training document vector is VS where VS is computed from a right singular matrix V and a diagonal matrix S, each of which was obtained from a previous singular value decomposition (SVD) of a training word-document matrix constructed during the generation of the LSA space, the training word-document matrix representing the extent to which each of the words appears in each of the documents of the training corpus;
the new document vector is ZS where ZS is computed from the diagonal matrix S and an extension matrix Z, wherein Z is an extension of the right singular matrix V obtained by folding in a new word-document matrix, the new word-document matrix representing the extent to which a new word appears in the new document; and
the document vector transformation matrix is J, wherein J is obtained from a Choleski decomposition of a matrix derived from an extension matrix Y, wherein Y is an extension of a left singular matrix U obtained by folding in the new word-document matrix, and wherein U was obtained from the previous SVD of the training word-document matrix constructed during the generation of the LSA space.
Dependent claims: 17, 18, 19, 20

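The matrix steps named in claim 16 can be sketched numerically as follows. Several details are assumptions, because the claim does not state them: folding in is taken to be the standard LSA projection (Z = DᵀUS⁻¹ for new documents D, Y = BVS⁻¹ for new words B), the "matrix derived from Y" is taken to be I + YᵀY, the block of new words within new documents is ignored, and random matrices stand in for real counts. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, R = 8, 6, 3                    # training words, training documents, LSA rank
W = rng.random((M, N))               # stand-in training word-document matrix

# Previous SVD of the training matrix, truncated to rank R.
U_full, s_full, Vt_full = np.linalg.svd(W, full_matrices=False)
U, V = U_full[:, :R], Vt_full[:R].T
S = np.diag(s_full[:R])
S_inv = np.diag(1.0 / s_full[:R])

D = rng.random((M, 2))               # two new documents over the training words
B = rng.random((1, N))               # one new word across the training documents

# Folding in (assumed to be the standard LSA projection).
Z = D.T @ U @ S_inv                  # extension of V, one row per new document
Y = B @ V @ S_inv                    # extension of U, one row per new word

# Assumed "matrix derived from Y": I + Y^T Y, which is positive definite.
J = np.linalg.cholesky(np.eye(R) + Y.T @ Y)   # lower triangular, J @ J.T = I + Y^T Y

# Apply J to the training (VS) and new (ZS) document vectors.
adapted_training_docs = V @ S @ J
adapted_new_docs = Z @ S @ J
```

Under these assumptions, inner products between adapted document vectors match those between columns of the extended rank-R reconstruction, so the shift accounts for the new words without a second SVD. Whether this is the exact derivation intended by the claim is not established by the claim text.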
23. An apparatus for generating a speech recognition database, the apparatus comprising:
a latent semantic analysis (LSA) space generator to generate an LSA space from a training corpus of documents representative of a language, wherein the LSA space includes one or more document vectors;
a document receiver to receive a new document that represents a change in the language; and
an LSA space adapter to adapt the LSA space to reflect the change in the language, wherein adapting includes changing a position of the one or more document vectors in the LSA space by the change in the language.
Dependent claims: 24, 25, 32, 33

26. An apparatus for generating a speech recognition database, the apparatus comprising:
a latent semantic analysis (LSA) space generator to generate an LSA space from a training corpus of documents representative of a language, wherein the LSA space includes one or more document vectors;
a document receiver to receive a new document that represents a change in the language; and
an LSA space adapter to adapt the LSA space to reflect the change in the language, wherein the change in the language includes changing a position of the one or more document vectors, wherein the LSA space adapter transforms the LSA space to take into account the new document's influence on the LSA space without recomputing the LSA space, wherein the LSA space adapter transforms the LSA space by
obtaining a training document vector that characterizes a semantic position of the training document within the LSA space;
computing a new document vector that characterizes a semantic position of the new document within the LSA space;
deriving a document vector transformation matrix; and
applying the document vector transformation matrix to the training document vector and the new document vector to shift a position of each document vector in the LSA space, where the shift in the position reflects the change in the language;
obtaining a training word vector that characterizes a semantic position of the training word within the LSA space;
computing a new word vector that characterizes a semantic position of the new word within the LSA space;
deriving a word vector transformation matrix; and
applying the word vector transformation matrix to the training word vector and the new word vector to shift a position of each word vector in the LSA space, where the shift in the position reflects the change in the language.
Dependent claims: 27, 28, 29, 30, 31

34. An apparatus for recognizing speech, the apparatus comprising:
means for recognizing an audio input as a new document; and
means for processing the new document using latent semantic adaptation, wherein the means for processing include
means for generating a latent semantic analysis (LSA) space from a training corpus of documents representative of a language, wherein the LSA space includes one or more document vectors;
means for receiving the new document that represents a change in the language; and
means for adapting the LSA space to reflect the change in the language, wherein the means for adapting includes means for changing a position of the one or more document vectors in the LSA space by the change in the language; and
means, coupled to the means for processing, for semantically inferring from a vector representation of the new document which of a plurality of known words and known documents correlate to the new document.
Dependent claims: 35, 36, 43, 44

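The semantic inference element of claim 34 compares a vector representation of the new document against known documents. One way this might look is a nearest-neighbor ranking in the LSA space. Cosine similarity is an assumed closeness measure here, since the claim does not name one, and the helper `cosine_rank` and the two-dimensional toy vectors are purely illustrative.

```python
import numpy as np

def cosine_rank(query, vectors):
    """Rank rows of `vectors` by cosine similarity to `query`, best first."""
    q = query / np.linalg.norm(query)
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    scores = v @ q                    # cosine of the angle to each known vector
    order = np.argsort(-scores)       # indices sorted by descending similarity
    return order, scores

# Toy stand-ins for known document vectors and a new document vector.
known_doc_vectors = np.array([[1.0, 0.0],
                              [0.0, 1.0],
                              [1.0, 1.0]])
new_doc_vector = np.array([2.0, 0.1])

order, scores = cosine_rank(new_doc_vector, known_doc_vectors)
```

The same function could be applied to known word vectors to rank words against the new document, provided the word and document vectors are expressed in the same space.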
37. An apparatus for recognizing speech, the apparatus comprising:
means for recognizing an audio input as a new document; and
means for processing the new document using latent semantic adaptation, wherein the means for processing include
means for generating a latent semantic analysis (LSA) space from a training corpus of documents representative of a language, wherein the LSA space includes one or more document vectors;
means for receiving the new document that represents a change in the language; and
means for adapting the LSA space to reflect the change in the language, wherein the change in the language includes changing a position of the one or more document vectors, wherein the means for adapting the LSA space to reflect the change in the language comprises a means for transforming the LSA space to take into account the new document's influence on the LSA space without re-computing the LSA space, wherein the means for transforming the LSA space comprises
means for obtaining a training document vector that characterizes a semantic position of the training document within the LSA space;
means for computing a new document vector that characterizes a semantic position of the new document within the LSA space;
means for deriving a document vector transformation matrix; and
means for applying the document vector transformation matrix to the training document vector and the new document vector to shift a position of each document vector in the LSA space, where the shift in the position reflects the change in the language;
means, coupled to the means for processing, for semantically inferring from a vector representation of the new document which of a plurality of known words and known documents correlate to the new document,
means for obtaining a training word vector that characterizes a semantic position of the training word within the LSA space;
means for computing a new word vector that characterizes a semantic position of the new word within the LSA space;
means for deriving a word vector transformation matrix; and
means for applying the word vector transformation matrix to the training word vector and the new word vector to shift a position of each word vector in the LSA space, where the shift in the position reflects the change in the language.
Dependent claims: 38, 39

40. An apparatus for recognizing speech, the apparatus comprising:
means for recognizing an audio input as a new document; and
means for processing the new document using latent semantic adaptation, wherein the means for processing include
means for generating a latent semantic analysis (LSA) space from a training corpus of documents representative of a language, wherein the LSA space includes one or more document vectors;
means for receiving the new document that represents a change in the language; and
means for adapting the LSA space to reflect the change in the language, wherein the change in the language includes changing a position of the one or more document vectors, wherein the means for adapting the LSA space to reflect the change in the language comprises a means for transforming the LSA space to take into account the new document's influence on the LSA space without re-computing the LSA space, wherein the means for transforming the LSA space comprises
means for obtaining a training document vector that characterizes a semantic position of the training document within the LSA space;
means for computing a new document vector that characterizes a semantic position of the new document within the LSA space;
means for deriving a document vector transformation matrix; and
means for applying the document vector transformation matrix to the training document vector and the new document vector to shift a position of each document vector in the LSA space, where the shift in the position reflects the change in the language; and
means, coupled to the means for processing, for semantically inferring from a vector representation of the new document which of a plurality of known words and known documents correlate to the new document,
wherein the means for transforming the LSA space further comprises means for applying the document vector transformation matrix and the word vector transformation matrix simultaneously.

41. An apparatus for recognizing speech, the apparatus comprising:
means for recognizing an audio input as a new document; and
means for processing the new document using latent semantic adaptation, wherein the means for processing include
means for generating a latent semantic analysis (LSA) space from a training corpus of documents representative of a language, wherein the LSA space includes one or more document vectors;
means for receiving the new document that represents a change in the language; and
means for adapting the LSA space to reflect the change in the language, wherein the change in the language includes changing a position of the one or more document vectors, wherein the means for adapting the LSA space to reflect the change in the language comprises a means for transforming the LSA space to take into account the new document's influence on the LSA space without re-computing the LSA space, wherein the means for transforming the LSA space comprises:
means for obtaining a training document vector that characterizes a semantic position of the training document within the LSA space;
means for computing a new document vector that characterizes a semantic position of the new document within the LSA space;
means for deriving a document vector transformation matrix; and
means for applying the document vector transformation matrix to the training document vector and the new document vector to shift a position of each document vector in the LSA space, where the shift in the position reflects the change in the language; and
means, coupled to the means for processing, for semantically inferring from a vector representation of the new document which of a plurality of known words and known documents correlate to the new document,
wherein when the new document matrix contains more new documents than new words, then the means for transforming the LSA space further comprises
means for applying the word vector transformation matrix K first; and
means for applying the document vector transformation matrix J second,
wherein the means for obtaining the extension matrix Y is not by folding in the new word-document matrix, but is rather by deriving extension matrix Y from the extension matrix Z.

42. An apparatus for recognizing speech, the apparatus comprising:
means for recognizing an audio input as a new document; and
means for processing the new document using latent semantic adaptation, wherein the means for processing include
means for generating a latent semantic analysis (LSA) space from a training corpus of documents representative of a language, wherein the LSA space includes one or more document vectors;
means for receiving the new document that represents a change in the language; and
means for adapting the LSA space to reflect the change in the language, wherein the change in the language includes changing a position of the one or more document vectors, wherein the means for adapting the LSA space to reflect the change in the language comprises a means for transforming the LSA space to take into account the new document's influence on the LSA space without re-computing the LSA space, wherein the means for transforming the LSA space comprises:
means for obtaining a training document vector that characterizes a semantic position of the training document within the LSA space;
means for computing a new document vector that characterizes a semantic position of the new document within the LSA space;
means for deriving a document vector transformation matrix; and
means for applying the document vector transformation matrix to the training document vector and the new document vector to shift a position of each document vector in the LSA space, where the shift in the position reflects the change in the language; and
means, coupled to the means for processing, for semantically inferring from a vector representation of the new document which of a plurality of known words and known documents correlate to the new document,
wherein when the new document matrix contains more new words than new documents, then the means for transforming the LSA space further comprises
means for applying the document vector transformation matrix J first; and
means for applying the word vector transformation matrix K second,
wherein the means for obtaining the extension matrix Z is not by folding in the new word-document matrix, but is rather by deriving the extension matrix Z from the extension matrix Y.

45. A system for processing speech, the system comprising:
a speech recognition database comprising a latent semantic analysis (LSA) space generated from a training corpus of documents representative of a language, wherein the LSA space includes one or more document vectors;
an input receiver to receive a new document that represents a change in the language; and
a processing system to adapt the LSA space to reflect the change in the language, wherein the adapting includes changing a position of the one or more document vectors in the LSA space by the change in the language.
Dependent claims: 46, 47, 54, 55

48. A system for processing speech, the system comprising:
a speech recognition database comprising a latent semantic analysis (LSA) space generated from a training corpus of documents representative of a language, wherein the LSA space includes one or more document vectors;
an input receiver to receive a new document that represents a change in the language; and
a processing system to adapt the LSA space to reflect the change in the language, wherein the change in the language includes changing a position of the one or more document vectors, wherein the processing system adapts the LSA space by transforming the LSA space to take into account the new document's influence on the LSA space without re-computing the LSA space, wherein the processing system transforms the LSA space by
obtaining a training document vector that characterizes a semantic position of the training document within the LSA space;
computing a new document vector that characterizes a semantic position of the new document within the LSA space;
deriving a document vector transformation matrix; and
applying the document vector transformation matrix to the training document vector and the new document vector to shift a position of each document vector in the LSA space, where the shift in the position reflects the change in the language;
obtaining a training word vector that characterizes a semantic position of the training word within the LSA space;
computing a new word vector that characterizes a semantic position of the new word within the LSA space;
deriving a word vector transformation matrix; and
applying the word vector transformation matrix to the training word vector and the new word vector to shift a position of each word vector in the LSA space, where the shift in the position reflects the change in the language.
Dependent claims: 49, 50, 51, 52, 53

Specification