Computer information retrieval using latent semantic structure
First Claim
1. An information retrieval method comprising the steps of generating term-by-data object matrix data to represent information files stored in a computer system, said matrix data being indicative of the frequency of occurrence of selected terms contained in the data objects stored in the information files, decomposing said matrix into a reduced singular value representation composed of distinct term and data object files, in response to a user query, generating a pseudo-object utilizing said selected terms and inserting said pseudo-object into said matrix data, and examining the similarity between said pseudo-object and said term and data object files to generate an information response and storing said response in the system in a form accessible by the user.
Abstract
A methodology for retrieving textual data objects is disclosed. The information is treated in the statistical domain by presuming that there is an underlying, latent semantic structure in the usage of words in the data objects. Estimates to this latent structure are utilized to represent and retrieve objects. A user query is recouched in the new statistical domain and then processed in the computer system to extract the underlying meaning to respond to the query.
773 Citations
11 Claims
1. An information retrieval method comprising the steps of
generating term-by-data object matrix data to represent information files stored in a computer system, said matrix data being indicative of the frequency of occurrence of selected terms contained in the data objects stored in the information files,
decomposing said matrix into a reduced singular value representation composed of distinct term and data object files,
in response to a user query, generating a pseudo-object utilizing said selected terms and inserting said pseudo-object into said matrix data, and
examining the similarity between said pseudo-object and said term and data object files to generate an information response and storing said response in the system in a form accessible by the user.
11. A method for retrieving information from an information file stored in a computer system comprising the steps of
generating term-by-data object matrix data by processing the information file,
performing a singular value decomposition on said matrix data to obtain the reduced term and data object vectors and diagonal values,
in response to a user query, generating a pseudo-object vector and augmenting said matrix data with said pseudo-vector using reduced forms of said term vector and said diagonal values and storing said augmented data in the system, and
examining the similarities between said pseudo-object vector and said reduced term vector and a reduced form of said data object vector to generate the information and storing the information in a response file accessible to the user.
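The steps recited in claims 1 and 11 can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the whitespace tokenizer, the use of NumPy, and the choice of cosine similarity in the singular-value-scaled space are all assumptions made for illustration. The sketch builds a term-by-data object frequency matrix, truncates its singular value decomposition to k dimensions, folds a query in as a pseudo-object vector using the reduced term vectors and diagonal values, and ranks the data objects by similarity to that pseudo-object.

```python
import numpy as np

def build_term_object_matrix(objects, terms):
    """Count occurrences of each selected term in each data object.

    Rows are terms, columns are data objects. Splitting on whitespace
    is an illustrative simplification, not taken from the source.
    """
    index = {term: i for i, term in enumerate(terms)}
    matrix = np.zeros((len(terms), len(objects)))
    for j, text in enumerate(objects):
        for word in text.lower().split():
            if word in index:
                matrix[index[word], j] += 1.0
    return matrix

def reduced_svd(matrix, k):
    """Keep the k largest singular values and their vectors.

    Returns (term vectors, diagonal values, data object vectors).
    """
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k], s[:k], vt[:k, :].T

def fold_in_query(query, terms, u_k, s_k):
    """Build a pseudo-object vector from the query terms.

    Uses the reduced term vectors and diagonal values: q_hat = q^T U_k S_k^-1.
    """
    q = build_term_object_matrix([query], terms)[:, 0]
    return (q @ u_k) / s_k

def rank_objects(q_hat, v_k, s_k):
    """Rank data objects by cosine similarity to the pseudo-object.

    Both sides are scaled by the singular values before comparison;
    this scaling is one common choice and is assumed here.
    """
    query_vec = q_hat * s_k
    object_vecs = v_k * s_k
    eps = 1e-12  # guards against division by zero for empty vectors
    scores = object_vecs @ query_vec / (
        np.linalg.norm(object_vecs, axis=1) * np.linalg.norm(query_vec) + eps
    )
    order = np.argsort(-scores)
    return order, scores

# Hypothetical toy collection: two objects about interfaces, two about graphs.
objects = [
    "human computer interface",
    "computer system user interface",
    "graph trees minors",
    "graph minors survey trees",
]
terms = ["human", "computer", "interface", "system", "user",
         "graph", "trees", "minors", "survey"]

matrix = build_term_object_matrix(objects, terms)
u_k, s_k, v_k = reduced_svd(matrix, k=2)
q_hat = fold_in_query("human computer", terms, u_k, s_k)
order, scores = rank_objects(q_hat, v_k, s_k)
print(order, np.round(scores, 3))
```

In this toy collection the second object does not contain the word "human", yet it scores as highly as the first because both load on the same reduced dimension. That is the kind of match on latent structure, rather than on literal term overlap, that the abstract describes.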
Specification