Data store for knowledge-based data mining system
Abstract
In a data mining system, data is gathered into a data store using, e.g., a Web crawler. The data is classified into entities and stored into underlying vertical and horizontal tables respectively representing miner outputs and entities that can be the subjects of indexing. Data miners use rules to process the entities and append respective keys to the entities representing characteristics of the entities as derived from rules embodied in the miners, with the keys being associated with the entities in the tables. With these keys, characteristics of entities as defined by disparate expert authors of the data miners are identified for use in responding to complex data requests from customers.
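As an illustration of the storage layout the abstract describes, here is a minimal Python sketch; it is not the patented implementation. It assumes the horizontal table holds one row per entity and the vertical table holds one (entity, key, value) row per miner output. The names `DataStore` and `language_miner`, the toy rule, and the sample URL are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataStore:
    # Horizontal table (assumed layout): one row per entity, keyed by entity id.
    horizontal: dict = field(default_factory=dict)
    # Vertical table (assumed layout): one (entity_id, key, value) row per miner output.
    vertical: list = field(default_factory=list)

    def add_entity(self, entity_id, content):
        self.horizontal[entity_id] = {"content": content}

    def append_key(self, entity_id, key, value):
        self.vertical.append((entity_id, key, value))

    def keys_for(self, entity_id):
        return {k: v for (e, k, v) in self.vertical if e == entity_id}


def language_miner(store):
    """Hypothetical miner: a toy rule that tags an entity as English or unknown."""
    for entity_id, row in store.horizontal.items():
        words = row["content"].lower().split()
        language = "en" if "the" in words else "unknown"
        store.append_key(entity_id, "language", language)


store = DataStore()
store.add_entity("http://example.com/a", "The quick brown fox")
language_miner(store)
print(store.keys_for("http://example.com/a"))  # {'language': 'en'}
```

Because each miner only appends rows to the vertical table, miners written by different authors can annotate the same entity without changing the entity row itself.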
54 Claims
1. A system, comprising:
at least one data store containing entities;
at least one lower level analysis engine communicating with the data store and generating an output using a first set of rules;
at least one higher level analysis engine receiving the output of the lower level analysis engine and generating an output using a second set of rules, wherein the outputs are associated with entities in the data store; and
an indexer associated with the data store, wherein tokenization is decoupled from indexing in the indexer.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 13, 15, 16
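One way to read claim 1 is sketched below in Python, under stated assumptions: the lower level engine produces tokens, the higher level engine derives a key from those tokens, and the indexer only stores (key, value) pairs handed to it, so tokenization happens outside the indexer. The function names, the `Indexer` class, and the toy rules are hypothetical and are not taken from the patent.

```python
from collections import defaultdict


def lower_level_engine(entity_text):
    """First rule set (assumed): split text into lowercase tokens."""
    return entity_text.lower().split()


def higher_level_engine(tokens):
    """Second rule set (assumed): derive a topic key from the lower level output."""
    sports_terms = {"goal", "match", "league"}
    return {"topic": "sports" if sports_terms & set(tokens) else "other"}


class Indexer:
    """Indexes (key, value) pairs it is handed; it never tokenizes text itself."""

    def __init__(self):
        self.postings = defaultdict(set)

    def index(self, entity_id, pairs):
        for key, value in pairs:
            self.postings[(key, value)].add(entity_id)

    def lookup(self, key, value):
        return self.postings.get((key, value), set())


entities = {"e1": "Late goal decides the match", "e2": "Quarterly earnings report"}
indexer = Indexer()
for entity_id, text in entities.items():
    tokens = lower_level_engine(text)   # tokenization happens outside the indexer
    keys = higher_level_engine(tokens)  # the higher level consumes the lower level output
    indexer.index(entity_id, [("token", t) for t in tokens])
    indexer.index(entity_id, keys.items())

print(indexer.lookup("topic", "sports"))  # {'e1'}
```

Keeping the tokenizer out of `Indexer` means a different tokenizer could be swapped in without touching the index code, which is one plausible benefit of the decoupling the claim recites.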
11. A system, comprising:
at least one data store containing entities;
at least one lower level analysis engine communicating with the data store and generating output using a first set of rules;
at least one higher level analysis engine receiving the output of the lower level analysis engine and generating an output using a second set of rules, wherein the outputs are associated with entities in the data store; and
an indexer associated with the data store, wherein the indexer includes indices of keys and key values found in the data store and further wherein the indexer contains Boolean indices storing “yes” or “no” values to queries of the form, “does key k have value v?”.
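The Boolean index recited in claim 11 can be pictured with a short Python sketch. It assumes the question “does key k have value v?” is asked per entity and that the index stores, for each (key, value) pair, the set of entities for which the answer is “yes”. The class `BooleanIndex` and its method names are hypothetical.

```python
class BooleanIndex:
    def __init__(self):
        # (key, value) -> set of entity ids for which the answer is "yes"
        self._yes = {}

    def record(self, entity_id, key, value):
        self._yes.setdefault((key, value), set()).add(entity_id)

    def answer(self, entity_id, key, value):
        """Answer "does key k have value v?" for one entity."""
        return "yes" if entity_id in self._yes.get((key, value), set()) else "no"


boolean_index = BooleanIndex()
boolean_index.record("e1", "language", "en")
print(boolean_index.answer("e1", "language", "en"))  # yes
print(boolean_index.answer("e1", "language", "fr"))  # no
```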
12. A system, comprising:
at least one data store containing entities;
at least one lower level analysis engine communicating with the data store and generating an output using a first set of rules;
at least one higher level analysis engine receiving the output of the lower level analysis engine and generating an output using a second set of rules, wherein the outputs are associated with entities in the data store; and
an indexer associated with the data store, wherein the indexer includes indices of keys and key values found in the data store and further wherein the indexer contains range indices storing ranges of key values.
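The claim does not say how a range index is organized. The Python sketch below shows one possible interpretation, assuming numeric key values kept in sorted order so that all entities with a value in a closed interval can be found by binary search. `RangeIndex` and the sample values are hypothetical.

```python
import bisect


class RangeIndex:
    def __init__(self):
        self._entries = []  # sorted list of (value, entity_id)

    def add(self, entity_id, value):
        bisect.insort(self._entries, (value, entity_id))

    def query(self, low, high):
        """Return ids of entities whose value v satisfies low <= v <= high."""
        # (low,) sorts before every (low, entity_id), so this finds the first value >= low.
        start = bisect.bisect_left(self._entries, (low,))
        result = []
        for value, entity_id in self._entries[start:]:
            if value > high:
                break
            result.append(entity_id)
        return result


range_index = RangeIndex()
range_index.add("e1", 1200)
range_index.add("e2", 300)
range_index.add("e3", 4500)
print(range_index.query(1000, 5000))  # ['e1', 'e3']
```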
14. A system, comprising:
at least one data store containing entities;
at least one lower level analysis engine communicating with the data store and generating an output using a first set of rules;
at least one higher level analysis engine receiving the output of the lower level analysis engine and generating an output using a second set of rules, wherein the outputs are associated with entities in the data store; and
an indexer associated with the data store, wherein the indexer includes indices and the data store includes tables that do not indicate where in an entity a particular name or text occurs, but only that an entity has a particular characteristic.
17. A system, comprising:
at least one data store containing entities;
at least one lower level analysis engine communicating with the data store and generating an output using a first set of rules;
at least one higher level analysis engine receiving the output of the lower level analysis engine and generating an output using a second set of rules, wherein the outputs are associated with entities in the data store; and
an indexer associated with the data store, wherein the indexer contains graph data to support inlink and outlink queries.
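Graph data that supports inlink and outlink queries can be sketched in Python as a pair of adjacency maps, one per direction. This is an illustrative assumption about the data structure, not the layout the patent specifies. `LinkGraph` and the entity ids are hypothetical.

```python
from collections import defaultdict


class LinkGraph:
    def __init__(self):
        self._out = defaultdict(set)  # source -> targets it links to
        self._in = defaultdict(set)   # target -> sources that link to it

    def add_link(self, source, target):
        self._out[source].add(target)
        self._in[target].add(source)

    def outlinks(self, entity_id):
        return sorted(self._out.get(entity_id, ()))

    def inlinks(self, entity_id):
        return sorted(self._in.get(entity_id, ()))


graph = LinkGraph()
graph.add_link("a", "b")
graph.add_link("c", "b")
graph.add_link("b", "d")
print(graph.inlinks("b"))   # ['a', 'c']
print(graph.outlinks("b"))  # ['d']
```

Storing both directions doubles the memory used for edges but lets either query be answered with a single lookup.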
18. A method for storing data to support a knowledge-based data mining system, comprising:
storing entities in at least one data store associated with an index;
communicating with the data store using at least a first analysis engine;
generating an output using a first set of rules associated with the first analysis engine;
sending the output to at least a second analysis engine;
generating an output using a second set of rules associated with the second engine; and
associating the outputs with entities, wherein the indexer contains graph data to support inlink and outlink queries.
Dependent claims: 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 32, 34, 35
30. A method for storing data to support a knowledge-based data mining system, comprising:
storing entities in at least one data store associated with an index;
communicating with the data store using at least a first analysis engine;
generating an output using a first set of rules associated with the first analysis engine;
sending the output to at least a second analysis engine;
generating an output using a second set of rules associated with the second engine; and
associating the outputs with entities, wherein the indexer includes indices of keys and key values found in the data store and further wherein the indexer contains Boolean indices storing “yes” or “no” values to queries of the form, “does key k have value v?”.
31. A method for storing data to support a knowledge-based data mining system, comprising:
storing entities in at least one data store associated with an index;
communicating with the data store using at least a first analysis engine;
generating an output using a first set of rules associated with the first analysis engine;
sending the output to at least a second analysis engine;
generating an output using a second set of rules associated with the second engine; and
associating the outputs with entities, wherein the indexer includes indices of keys and key values found in the data store and further wherein the indexer contains range indices storing ranges of key values.
33. A method for storing data to support a knowledge-based data mining system, comprising:
storing entities in at least one data store associated with an index;
communicating with the data store using at least a first analysis engine;
generating an output using a first set of rules associated with the first analysis engine;
sending the output to at least a second analysis engine;
generating an output using a second set of rules associated with the second engine; and
associating the outputs with entities, wherein the indexer includes indices and the data store includes tables that do not indicate where in an entity a particular name or text occurs, but only that an entity has a particular characteristic.
36. A method for storing data to support a knowledge-based data mining system, comprising:
storing entities in at least one data store associated with an index;
communicating with the data store using at least a first analysis engine;
generating an output using a first set of rules associated with the first analysis engine;
sending the output to at least a second analysis engine;
generating an output using a second set of rules associated with the second engine; and
associating the outputs with entities, wherein tokenization is decoupled from indexing in the indexer.
37. A system for data mining, comprising:
means for storing entities in at least one data store in both horizontal tables and vertical tables, the data store being associated with an index;
means for communicating with the data store using at least a first analysis engine;
means for generating an output using a first set of rules associated with the first analysis engine;
means for sending the output to at least a second analysis engine;
means for generating an output using a second set of rules associated with the second engine; and
means for associating the outputs with entities, wherein the indexer includes indices and the data store includes tables that do not indicate where in an entity a particular name or text occurs, but only that an entity has a particular characteristic.
Dependent claims: 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 50, 51, 52, 53
49. A system for data mining, comprising:
means for storing entities in at least one data store in both horizontal tables and vertical tables, the data store being associated with an index;
means for communicating with the data store using at least a first analysis engine;
means for generating an output using a first set of rules associated with the first analysis engine;
means for sending the output to at least a second analysis engine;
means for generating an output using a second set of rules associated with the second engine; and
means for associating the outputs with entities, wherein the indexer includes indices of keys and key values found in the tables and further wherein the indexer contains Boolean indices storing “yes” or “no” values to queries of the form, “does key k have value v?”.
54. A system for data mining, comprising:
means for storing entities in at least one data store in both horizontal tables and vertical tables, the data store being associated with an index;
means for communicating with the data store using at least a first analysis engine;
means for generating an output using a first set of rules associated with the first analysis engine;
means for sending the output to at least a second analysis engine;
means for generating an output using a second set of rules associated with the second engine; and
means for associating the outputs with entities, wherein the indexer contains graph data to support inlink and outlink queries.
Specification