Perturbing latent semantic indexing spaces
Abstract
A text processing method is provided that includes the following steps. First, an abstract mathematical vector space is generated based on a collection of documents. Respective documents in the collection of documents have a representation in the abstract mathematical vector space and respective terms contained in the collection of documents have a representation in the abstract mathematical vector space. Then, the abstract mathematical vector space is perturbed to produce a perturbed abstract mathematical vector space that is stored in an electronic format accessible to a user. Perturbing the abstract mathematical vector space may include modifying the representation of a document with a newly computed representation for that document, or modifying the representation of a term with a newly computed representation for that term.
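The two steps in the abstract can be illustrated with a minimal sketch. It assumes the abstract mathematical vector space is a latent semantic indexing (LSI) space obtained from a truncated singular value decomposition of a term-document matrix, as the title suggests. The abstract does not say how the "newly computed representation" is derived, so the weighted average of term vectors used here is purely an assumption, and the names `build_lsi_space`, `perturb_document`, and `TERM_DOC` are hypothetical.

```python
import numpy as np


def build_lsi_space(term_doc, k):
    """Build a rank-k LSI space; returns (term_vectors, doc_vectors)."""
    u, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    term_vectors = u[:, :k] * s[:k]    # one row per term
    doc_vectors = vt[:k, :].T * s[:k]  # one row per document
    return term_vectors, doc_vectors


def perturb_document(term_doc, term_vectors, doc_vectors, doc_index):
    """Return a copy of doc_vectors with one document's row replaced.

    Assumption: the new representation is the count-weighted average
    of the vectors of the terms the document contains.
    """
    weights = term_doc[:, doc_index]
    total = weights.sum()
    if total == 0:
        raise ValueError("document contains no indexed terms")
    new_vector = weights @ term_vectors / total
    perturbed = doc_vectors.copy()
    perturbed[doc_index] = new_vector
    return perturbed


# Toy term-document counts: rows are terms, columns are documents.
TERM_DOC = np.array([
    [2.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
    [0.0, 3.0, 1.0],
    [0.0, 1.0, 2.0],
])

term_vecs, doc_vecs = build_lsi_space(TERM_DOC, k=2)
perturbed_doc_vecs = perturb_document(TERM_DOC, term_vecs, doc_vecs, doc_index=0)
```

Only the selected document's row changes; the term vectors and the other document vectors are left as they were, which matches the "at least one of" wording for modifying a document or a term representation.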
26 Claims
1. A computer-based text processing method, comprising:
(a) generating an abstract mathematical vector space based on a collection of documents, wherein respective documents in the collection of documents have a representation in the abstract mathematical vector space and respective terms contained in the collection of documents have a representation in the abstract mathematical vector space; and
(b) perturbing the abstract mathematical vector space to produce a perturbed abstract mathematical vector space that is stored in an electronic format accessible to a user, wherein perturbing the abstract mathematical vector space comprises at least one of (i) modifying the representation of a document with a newly computed representation for that document, or (ii) modifying the representation of a term with a newly computed representation for that term.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A computer-based text processing method, comprising:
(a) providing a first abstract mathematical vector space and a second abstract mathematical vector space, wherein the first abstract mathematical vector space is based on a first collection of documents and the second abstract mathematical vector space is based on a second collection of documents; and
(b) merging the first abstract mathematical vector space with the second abstract mathematical vector space to produce a merged abstract mathematical vector space that is stored in an electronic format accessible to a user, wherein merging is based on a vector averaging of vectors in the first abstract mathematical vector space with vectors in the second abstract mathematical vector space.
Dependent claims: 10, 11, 12, 13
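The merging step in claim 9 can be sketched as follows. The claim states only that merging is based on vector averaging; how vectors from the two spaces are paired is not specified here. This sketch assumes that both spaces have the same dimensionality and already share a coordinate system, that vectors are paired by a common term key, and that a term present in only one space is carried over unchanged. The names `merge_spaces`, `SPACE_A`, and `SPACE_B` are hypothetical.

```python
def merge_spaces(space_a, space_b):
    """Merge two {term: vector} spaces by averaging vectors of shared terms.

    Assumption: both spaces use the same dimensionality and coordinates.
    """
    merged = {}
    for key in sorted(set(space_a) | set(space_b)):
        if key in space_a and key in space_b:
            # Component-wise average of the two representations.
            merged[key] = [
                (x + y) / 2.0 for x, y in zip(space_a[key], space_b[key])
            ]
        else:
            # Term appears in only one space: copy its vector as-is.
            source = space_a if key in space_a else space_b
            merged[key] = list(source[key])
    return merged


SPACE_A = {"bank": [1.0, 0.0], "river": [0.0, 2.0]}
SPACE_B = {"bank": [3.0, 4.0], "loan": [2.0, 2.0]}

merged_space = merge_spaces(SPACE_A, SPACE_B)
# merged_space["bank"] is [2.0, 2.0], the average of the two "bank" vectors
```

In practice two independently built spaces generally do not share axes, so some alignment step would be needed before averaging is meaningful; that step is outside what the claim text describes.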
14. A computer program product comprising a computer usable medium having control logic stored therein for processing text, the control logic comprising:
computer readable first program code that causes the computer to generate an abstract mathematical vector space based on a collection of documents, wherein respective documents in the collection of documents have a representation in the abstract mathematical vector space and respective terms contained in the collection of documents have a representation in the abstract mathematical vector space; and
computer readable second program code that causes the computer to perturb the abstract mathematical vector space to produce a perturbed abstract mathematical vector space that is stored in an electronic format accessible to a user, wherein perturbing the abstract mathematical vector space comprises at least one of (i) modifying the representation of a document with a newly computed representation for that document, or (ii) modifying the representation of a term with a newly computed representation for that term.
Dependent claims: 15, 16, 17, 18, 19, 20, 21
22. A computer program product comprising a computer usable medium having control logic stored therein for processing text, the control logic comprising:
computer readable first program code that causes the computer to provide a first abstract mathematical vector space and a second abstract mathematical vector space, wherein the first abstract mathematical vector space is based on a first collection of documents and the second abstract mathematical vector space is based on a second collection of documents; and
computer readable second program code that causes the computer to merge the first abstract mathematical vector space with the second abstract mathematical vector space to produce a merged abstract mathematical vector space that is stored in an electronic format accessible to a user, wherein merging is based on a vector averaging of vectors in the first abstract mathematical vector space with vectors in the second abstract mathematical vector space.
Dependent claims: 23, 24, 25, 26
Specification