Perturbing latent semantic indexing spaces
Abstract
A text processing method is provided that includes the following steps. First, an abstract mathematical vector space is generated based on a collection of documents. Respective documents in the collection of documents have a representation in the abstract mathematical vector space and respective terms contained in the collection of documents have a representation in the abstract mathematical vector space. Then, the abstract mathematical vector space is perturbed to produce a perturbed abstract mathematical vector space that is stored in an electronic format accessible to a user. Perturbing the abstract mathematical vector space may include modifying the representation of a document with a newly computed representation for that document, or modifying the representation of a term with a newly computed representation for that term.
25 Claims
1. A computer-based text processing method, comprising:
(a) generating, using a processor, an abstract mathematical vector space based on a collection of documents, wherein respective documents in the collection of documents have a representation in the abstract mathematical vector space and respective terms contained in the collection of documents have a representation in the abstract mathematical vector space; and
(b) perturbing, using a processor, the abstract mathematical vector space to produce a perturbed abstract mathematical vector space that is stored in an electronic format accessible to a user, wherein perturbing the abstract mathematical vector space comprises at least one of (i) modifying the representation of a document with a newly computed representation for that document, and (ii) modifying the representation of a term with a newly computed representation for that term.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
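The two steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes the vector space is a latent semantic indexing space built by a rank-k truncated singular value decomposition, and that the "newly computed representation" of a document is obtained by projecting its term counts onto the term basis. The toy matrix and the names `build_space`, `recompute_doc`, and `perturb_document` are hypothetical.

```python
import numpy as np

# Toy term-by-document count matrix (hypothetical data):
# rows are terms, columns are documents.
A = np.array([
    [2.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
    [0.0, 2.0, 1.0],
    [0.0, 1.0, 2.0],
])

def build_space(matrix, k):
    """Step (a): rank-k truncated SVD yielding term and document vectors."""
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    U_k, s_k, V_k = U[:, :k], s[:k], Vt[:k, :].T
    term_vecs = U_k * s_k   # one row per term (scaling is an assumed convention)
    doc_vecs = V_k * s_k    # one row per document (same assumed convention)
    return U_k, term_vecs, doc_vecs

def recompute_doc(U_k, counts):
    """Compute a document vector by projecting its term counts onto U_k."""
    return counts @ U_k

def perturb_document(doc_vecs, index, new_vec):
    """Step (b)(i): copy the space and replace one document's representation."""
    perturbed = doc_vecs.copy()
    perturbed[index] = new_vec
    return perturbed

U_k, term_vecs, doc_vecs = build_space(A, k=2)

# Suppose document 0 was edited; recompute its vector without rebuilding the space.
new_counts = np.array([2.0, 0.0, 0.0, 3.0])
perturbed = perturb_document(doc_vecs, 0, recompute_doc(U_k, new_counts))
```

Under this scaling, projecting a document's original counts reproduces its original vector, so only a changed document moves. Case (ii) of the claim would replace a row of `term_vecs` in the same way.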
9. A computer-based text processing method, comprising:
(a) providing, using a processor, a first abstract mathematical vector space and a second abstract mathematical vector space, wherein each document of a first collection of documents is represented in the first abstract mathematical vector space and each document of a second collection of documents is represented in the second abstract mathematical vector space; and
(b) merging the first abstract mathematical vector space with the second abstract mathematical vector space to produce a merged abstract mathematical vector space that is stored in an electronic format accessible to a user, wherein the merging comprises at least one of (i) averaging a representation of a first document in the first abstract mathematical vector space with a representation of the first document in the second abstract mathematical vector space to generate a new representation of the first document in the merged abstract mathematical vector space, and (ii) averaging a representation of a first term included in the first abstract mathematical vector space with a representation of the first term in the second abstract mathematical vector space to generate a new representation of the first term in the merged abstract mathematical vector space.
Dependent claims: 10, 11, 12
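The averaging in step (b) of claim 9 can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation. It assumes both spaces have the same dimensionality with comparable axes, and that a document or term present in only one space is carried over unchanged; the claim specifies neither. The names `average_vectors` and `merge_spaces` and the sample vectors are hypothetical.

```python
def average_vectors(u, v):
    """Component-wise mean of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimensionality")
    return [(a + b) / 2.0 for a, b in zip(u, v)]

def merge_spaces(space_a, space_b):
    """Merge two {item_id: vector} maps.

    Items present in both spaces get the average of their two vectors.
    Items present in only one space are copied as-is (an assumption).
    """
    merged = {}
    for key in sorted(space_a.keys() | space_b.keys()):
        if key in space_a and key in space_b:
            merged[key] = average_vectors(space_a[key], space_b[key])
        else:
            source = space_a if key in space_a else space_b
            merged[key] = list(source[key])
    return merged

# Hypothetical document vectors in two 2-dimensional spaces.
space_a = {"doc1": [1.0, 0.0], "doc2": [0.0, 2.0]}
space_b = {"doc1": [3.0, 4.0], "doc3": [1.0, 1.0]}

merged = merge_spaces(space_a, space_b)
```

Here `doc1` appears in both spaces, so its merged vector is the mean of `[1.0, 0.0]` and `[3.0, 4.0]`. The same function applies to term vectors, which corresponds to case (ii).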
13. A tangible computer program product comprising a computer readable storage medium having control logic stored therein for causing a computer to process text, the control logic comprising:
computer readable first program code that causes the computer to generate an abstract mathematical vector space based on a collection of documents, wherein respective documents in the collection of documents have a representation in the abstract mathematical vector space and respective terms contained in the collection of documents have a representation in the abstract mathematical vector space; and
computer readable second program code that causes the computer to perturb the abstract mathematical vector space to produce a perturbed abstract mathematical vector space that is stored in an electronic format accessible to a user, wherein perturbing the abstract mathematical vector space comprises at least one of (i) modifying the representation of a document with a newly computed representation for that document, and (ii) modifying the representation of a term with a newly computed representation for that term.
Dependent claims: 14, 15, 16, 17, 18, 19, 20
21. A tangible computer program product comprising a computer readable storage medium having control logic stored therein for causing a computer to process text, the control logic comprising:
computer readable first program code that causes the computer to provide a first abstract mathematical vector space and a second abstract mathematical vector space, wherein the first abstract mathematical vector space is based on a first collection of documents and the second abstract mathematical vector space is based on a second collection of documents; and
computer readable second program code that causes the computer to merge the first abstract mathematical vector space with the second abstract mathematical vector space to produce a merged abstract mathematical vector space that is stored in an electronic format accessible to a user, wherein merging is based on a vector averaging of vectors in the first abstract mathematical vector space with vectors in the second abstract mathematical vector space.
Dependent claims: 22, 23, 24, 25
Specification