Search engine with increased performance and specificity
First Claim
1. A search engine for searching and retrieving information from a data repository comprising:
- a pre-processing component that modifies the records of the data repository wherein:
i. the concepts of the language are identified in each record;
ii. term ambiguity and term synonymy are resolved;
iii. compound or complex semantic units are processed and simplified;
iv. presence and type of relationships between terms are detected; and
v. anaphoric terms and sortal anaphoric noun phrases are resolved, and the actual entities they refer to are identified;
a new, second data repository where the data is stored in the pre-processed, modified representation, containing all the concept IDs and the relation types;
a user interface wherein the user enters a query, and where the user query is translated to concept IDs of the language;
a data engine wherein concept IDs of the user query are matched against the concept IDs of the data records of said second data repository, and where the matching records are returned according to a relevance metric calculated for each data record; and
a multitude of computing hardware wherein said pre-processing component, second data repository, said user interface, and said data engine operate simultaneously and in parallel in response to a single user query.
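The query-translation and matching elements of the claim can be illustrated with a minimal sketch. It is not the patented implementation: the lexicon, the concept IDs (`C001` and so on), the sample records, and the function names `to_concept_ids`, `build_index`, and `search` are all hypothetical, and simple substring lookup stands in for the disambiguation and anaphora resolution steps, which are not modeled here.

```python
from collections import defaultdict

# Hypothetical lexicon: surface terms -> concept IDs. Synonyms share one ID,
# which is one simple way to model "term synonymy is resolved".
LEXICON = {
    "heart attack": "C001",
    "myocardial infarction": "C001",
    "aspirin": "C002",
    "acetylsalicylic acid": "C002",
    "stroke": "C003",
}


def to_concept_ids(text):
    """Translate free text to a set of concept IDs, trying longer terms first."""
    remaining = text.lower()
    found = set()
    for term in sorted(LEXICON, key=len, reverse=True):
        if term in remaining:
            found.add(LEXICON[term])
            # Blank out the matched term so shorter terms cannot re-match it.
            remaining = remaining.replace(term, " ")
    return found


def build_index(records):
    """Build a stand-in for the second repository: concept ID -> record IDs."""
    index = defaultdict(set)
    for rec_id, text in records.items():
        for cid in to_concept_ids(text):
            index[cid].add(rec_id)
    return index


def search(query, index):
    """Return the IDs of records that contain every concept in the query."""
    concept_ids = to_concept_ids(query)
    if not concept_ids:
        return set()
    return set.intersection(*(index.get(cid, set()) for cid in concept_ids))


# Illustrative data only.
RECORDS = {
    1: "Aspirin lowers the risk of myocardial infarction.",
    2: "Acetylsalicylic acid was given after the stroke.",
    3: "Heart attack symptoms vary between patients.",
}
INDEX = build_index(RECORDS)
```

With this sample data, `search("aspirin heart attack", INDEX)` matches record 1 even though that record says "myocardial infarction", because both phrases map to the same assumed concept ID.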
Abstract
The present invention discloses a system and methods for retrieving the most relevant information from a given digital data repository. In the first step, two conditions of relevance are verified: the presence of the query words, and the presence of at least one type of relationship between those words in the data record. Additionally, a numeric relevance score is computed for each relevant record, so that the records can be sorted in descending order of this relevance metric. The most relevant results are shown first, while irrelevant records are eliminated, which substantially reduces the volume of results. The information retrieval system according to this invention includes: a data pre-processing component where multiple processing steps are performed, a second, new data repository where the modified data is stored, a user interface capable of real-time translation of the user's query, a search engine, and computing hardware in a distributed architecture.
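The two relevance conditions and the descending sort described in the abstract could be sketched as follows. The abstract does not say how the score is calculated, so the metric used here (a count of relations linking query concepts) is purely an assumption, as are the `Record` layout, the relation triples, and the function names.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    rec_id: int
    concepts: set
    # Hypothetical (concept_a, relation_type, concept_b) triples that the
    # pre-processing step is assumed to have stored with the record.
    relations: list = field(default_factory=list)


def relevance(record, query_concepts):
    """Return a score, or None when either relevance condition fails."""
    # Condition 1: every query concept is present in the record.
    if not query_concepts <= record.concepts:
        return None
    # Condition 2: at least one relation links two of the query concepts.
    linking = [r for r in record.relations
               if r[0] in query_concepts and r[2] in query_concepts]
    if not linking:
        return None
    # Assumed metric: the number of linking relations.
    return float(len(linking))


def rank(records, query_concepts):
    """Drop irrelevant records and sort the rest by descending score."""
    scored = [(relevance(r, query_concepts), r.rec_id) for r in records]
    kept = [(score, rec_id) for score, rec_id in scored if score is not None]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)


# Illustrative data only.
SAMPLE = [
    Record(1, {"C001", "C002"}, [("C002", "reduces", "C001")]),
    Record(2, {"C001", "C002"}),  # both concepts present, but no relation
    Record(3, {"C001", "C002", "C003"},
           [("C002", "reduces", "C001"), ("C001", "associated_with", "C002")]),
]
```

Here `rank(SAMPLE, {"C001", "C002"})` keeps records 3 and 1, in that order, and eliminates record 2: it contains both concepts, but no relationship between them was detected.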
18 Claims
Specification