System for processing at least partially structured data
Abstract
An improved system for processing at least partially structured data includes a method for comparing a population of terms in a term repository, including standardizing each term within the population of terms based on at least one standardization rule. The method also includes comparing at least a pair of terms including determining a match between the terms if, once standardized, they are substantially identical.
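The standardize-then-compare method can be illustrated with a minimal Python sketch. The three rules shown (Unicode normalization with lowercasing, punctuation removal, whitespace collapsing) are assumptions chosen for illustration; the abstract only requires "at least one standardization rule". The sketch also reads "substantially identical" as exact equality of the standardized forms, which is one possible interpretation. The names `standardize` and `find_matches` are hypothetical.

```python
import re
import unicodedata
from itertools import combinations


def standardize(term: str) -> str:
    """Apply illustrative standardization rules (assumed, not taken from the patent)."""
    # Rule 1 (assumption): Unicode-normalize and lowercase.
    text = unicodedata.normalize("NFKC", term).lower()
    # Rule 2 (assumption): replace punctuation with spaces.
    text = re.sub(r"[^\w\s]", " ", text)
    # Rule 3 (assumption): collapse runs of whitespace and trim the ends.
    return " ".join(text.split())


def find_matches(terms):
    """Return pairs of terms whose standardized forms are equal.

    Exact equality stands in for "substantially identical" here.
    """
    standardized = {term: standardize(term) for term in terms}
    return [
        (a, b)
        for a, b in combinations(terms, 2)
        if standardized[a] == standardized[b]
    ]


if __name__ == "__main__":
    population = ["Hewlett-Packard", "hewlett packard", "IBM"]
    print(find_matches(population))
```

A looser reading of "substantially identical" could replace the equality test with a similarity threshold; that choice is left open in this sketch.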
152 Citations
29 Claims
1. A method for comparing a population of terms in a term repository comprising:
- standardizing each term within the population of terms based on at least one standardization rule; and
- comparing at least a pair of terms including determining a match between the terms if, once standardized, they are substantially identical.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10

11. A system for processing at least partially structured data comprising:
- a topic repository;
- a topic repository populator operative at least partially automatically to employ structure in at least partially structured data to organize said data in association with said topic repository to facilitate access to said data by topic; and
- a topic oriented user interface employed by a user to access said data in association with said topic repository by topic.
Dependent claims: 17, 18, 19, 20, 23, 28, 29

12. For use in a system for processing at least partially structured data comprising a topic repository and a topic oriented user interface employed by a user to access said data in said topic repository by topic,
a topic repository populator operative at least partially automatically to employ structure in at least partially structured data to organize said data in said topic repository to facilitate access to said data by topic.

13. A method for processing at least partially structured data comprising:
- at least partially automatically employing structure in at least partially structured data to organize said data in association with a topic repository to facilitate access to said data by topic; and
- employing a topic oriented user interface to access said data in association with said topic repository by topic.

14. For use in a method for processing at least partially structured data comprising a topic repository and a topic oriented user interface employed by a user to access said data in said topic repository by topic,
at least partially automatically employing structure in at least partially structured data to organize said data in said topic repository to facilitate access to said data by topic.

15. Topic server apparatus comprising:
- a topic extractor operative to automatically prepare at least one data source, the data source including a multiplicity of data entries, for topic-based retrieval, by automatically extracting a multiplicity of topics from the data entries such that each data entry is assigned to at least one topic;
- a disambiguator operative to automatically rank the relevancy of each of the multiplicity of topics to a given query including defining at least one most relevant topics; and
- a data accessor operative to automatically access at least some of the data entries assigned to at least one of the most relevant topics.

16. An assembly of intercommunicating topic servers, each topic server comprising:
- a topic extractor operative to prepare at least one data source, the data source including a multiplicity of data entries, for topic-based retrieval, by extracting a multiplicity of topics from the data entries such that each data entry is assigned to at least one topic;
- a disambiguator operative to rank the relevancy of each of the multiplicity of topics to a given query including defining at least one most relevant topics; and
- a data accessor operative to access at least some of the data entries assigned to at least one of the most relevant topics, wherein each topic server includes a topic server intercommunicator operative to receive a user's query, to present the user's query to at least one additional topic server, and to receive at least one data entries assigned to at least one topic extracted by said at least one additional topic server and ranked by said at least one additional topic server as relevant to the user's query, from at least one data source associated with the additional topic server.

21. Apparatus for automatically generating an information retrieval system for retrieval of information from at least one at least partially structured data sources, the apparatus comprising:
- rule input apparatus accepting from a user at least one data source-specific topic-generation rules corresponding respectively to at least one at least partially structured data sources; and
- a topic generator operative to employ the rules corresponding to each individual data source from among the at least one data sources to generate at least one topic for each data entry within that individual data source.
Dependent claims: 22, 25, 26, 27

24. A data retrieval system comprising:
- a first plurality of data sources storing a multiplicity of data entries;
- a second plurality of topic repositories storing topic information characterizing the multiplicity of data entries; and
- a third plurality of topic servers associated with the second plurality of topic repositories and performing, responsive to a topic-defining query, topic-based retrieval of selected data entries from among said multiplicity of data entries based on said topic information.

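The three components recited in claim 15 (topic extractor, disambiguator, data accessor) can be sketched as a small Python pipeline. This is only an illustration under stated assumptions: the claims do not say how topics are extracted or how relevancy is ranked, so the caller-supplied `topic_rule`, the word-overlap score, the fallback topic label, and all function names below are hypothetical.

```python
from collections import defaultdict


def extract_topics(entries, topic_rule):
    """Topic extractor: assign every data entry to at least one topic.

    `topic_rule` is a hypothetical callable mapping an entry to a list of topics.
    """
    index = defaultdict(list)
    for entry in entries:
        # Assumption: entries with no topic go under a fallback label,
        # so that each entry is assigned to at least one topic.
        topics = topic_rule(entry) or ["(untopiced)"]
        for topic in topics:
            index[topic].append(entry)
    return dict(index)


def rank_topics(index, query):
    """Disambiguator: rank topics by word overlap with the query (assumed metric)."""
    query_words = set(query.lower().split())
    scored = [
        (len(query_words & set(topic.lower().split())), topic) for topic in index
    ]
    # Keep only topics sharing at least one word with the query,
    # highest score first, ties broken alphabetically.
    relevant = sorted((s, t) for s, t in scored if s > 0)
    relevant.sort(key=lambda pair: (-pair[0], pair[1]))
    return [topic for _, topic in relevant]


def access_entries(index, ranked_topics, top_n=1):
    """Data accessor: return entries assigned to the `top_n` most relevant topics."""
    results = []
    for topic in ranked_topics[:top_n]:
        for entry in index[topic]:
            if entry not in results:
                results.append(entry)
    return results


if __name__ == "__main__":
    entries = [
        {"id": 1, "category": "jazz music"},
        {"id": 2, "category": "classical music"},
        {"id": 3, "category": "jazz music"},
    ]
    index = extract_topics(entries, lambda e: [e["category"]])
    ranked = rank_topics(index, "jazz music history")
    print(access_entries(index, ranked))
```

Passing the extraction rule in as a parameter loosely mirrors the data source-specific topic-generation rules of claim 21, although the claims do not prescribe this design.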
Specification