KNOWLEDGE DISCOVERY AGENT SYSTEM AND METHOD
Abstract
A system and method for processing information in unstructured or structured form, comprising a computer running in a distributed network with one or more data agents. Associations of natural language artifacts may be learned from natural language artifacts in unstructured data sources, and semantic and syntactic relationships may be learned in structured data sources, using grouping based on a criteria of shared features that are dynamically determined without the use of a priori classifications, by employing conditional probability constraints.
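The abstract describes grouping natural language artifacts that share an interaction window and relating them through conditional probabilities. As a rough illustration only, the following sketch estimates such probabilities from co-occurrence counts in a sliding window of tokens. The function name `build_conditional_network`, the `window` parameter, and the window-counting estimate are assumptions made for this example; the patent text here does not specify an algorithm, and this is not presented as the claimed method.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_conditional_network(tokens, window=3):
    """Estimate P(b | a) as the share of windows containing a that also contain b.

    Hypothetical helper: the sliding window stands in for the "interaction
    window", and the accumulated counts stand in for the "history of
    measurements". Both readings are assumptions for illustration.
    """
    occurrences = Counter()                # windows in which each artifact appears
    co_occurrences = defaultdict(Counter)  # windows in which a pair appears together

    # Slide the window over the token stream; a short input yields one window.
    for start in range(max(1, len(tokens) - window + 1)):
        seen = set(tokens[start:start + window])
        for artifact in seen:
            occurrences[artifact] += 1
        for a, b in combinations(sorted(seen), 2):
            co_occurrences[a][b] += 1
            co_occurrences[b][a] += 1

    # Convert counts into a nested mapping: network[a][b] approximates P(b | a).
    return {
        a: {b: count / occurrences[a] for b, count in neighbours.items()}
        for a, neighbours in co_occurrences.items()
    }


if __name__ == "__main__":
    network = build_conditional_network("the cat sat on the mat".split(), window=3)
    for artifact, neighbours in sorted(network.items()):
        print(artifact, neighbours)
```

No categories are supplied in advance in this sketch: associations emerge only from which artifacts appear together, which loosely mirrors the "without the use of a priori classifications" language. Pairs that never share a window simply have no edge in the resulting network.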
43 Claims
1. A system for processing information, comprising:
a computer for running software programs in a distributed network, said computer including a processing means in communication with said distributed network;
at least one storage means in communication with said processing means for storing programs and information; and
at least one data agent in said distributed network for conducting a specific response to commands generated by said processing means;
wherein said computer, said at least one storage means, and said at least one data agent are operable for;
learning associations of natural language artifacts in unstructured data sources, wherein said artifacts include at least one of words, phrases, subjects, predicates, modifiers, and other syntactic forms; and
learning semantic and syntactic relationships in structured data sources, wherein said structured data sources are in conventional formats used by relational database systems;
further wherein learned associations resulting from said learning associations of natural language artifacts are formed using grouping of one natural language artifact in an interaction window with another at least one natural language artifact in said interaction window, based on a criteria of shared features of one or more sets from said grouping constituting measurements from said data sources;
further wherein said criteria of shared features are dynamically determined without the use of a priori classifications and using conditional probability constraints between sets of learned associations; and
further wherein said grouping creates a network of conditional probabilities between all encountered natural language artifacts or a subset thereof, determined by consideration of conditional interaction probabilities based on a history of measurements from said data sources.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A method, comprising the steps of:
providing a computer for running software programs in a distributed network, said computer including a processing means in communication with said distributed network;
providing at least one storage means in communication with said processing means for storing programs and information;
providing at least one data agent in said distributed network for conducting a specific response to commands generated by said processing means; and
configuring said computer, said at least one storage means, and said at least one data agent to be operable for;
learning associations of natural language artifacts in unstructured data sources, wherein said artifacts include at least one of words, phrases, subjects, predicates, modifiers, and other syntactic forms; and
learning semantic and syntactic relationships in structured data sources, wherein said structured data sources comprise entities in conventional formats used by relational database systems;
wherein learned associations resulting from said learning associations of natural language artifacts are formed using grouping of one natural language artifact in an interaction window with another at least one natural language artifact in said interaction window, based on a criteria of shared features of one or more sets from said grouping constituting measurements from said data sources;
further wherein said criteria of shared features are dynamically determined without the use of a priori classifications, using conditional probability constraints between sets of learned associations; and
further wherein said grouping creates a network of conditional probabilities between all encountered natural language artifacts or a subset thereof, determined by consideration of conditional interaction probabilities based on a history of measurements from said data sources.
Dependent claims: 14, 15, 16, 17, 18, 19, 20, 21, 22, 23
24. A dynamic conceptual network system, comprising:
a computer for running software programs in a distributed network, said computer including a processing means in communication with said distributed network;
at least one storage means in communication with said processing means for storing programs and information; and
at least one data agent in said distributed network for conducting a specific response to commands generated by said processing means;
wherein said computer, said at least one storage means, and said at least one data agent are constructed and adapted for learning associations of natural language artifacts in unstructured data sources, wherein said artifacts include at least one of words, phrases, subjects, predicates, modifiers, and other syntactic forms.
Dependent claims: 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36
37. A method for processing information, comprising the steps of:
providing a computer for running software programs in a distributed network, said computer including a processing means in communication with said distributed network;
providing at least one storage means in communication with said processing means for storing programs and information;
providing at least one data agent in said distributed network for conducting a specific response to commands generated by said processing means;
configuring said computer, said at least one storage means, and said at least one data agent to be operable for;
learning associations of natural language artifacts in unstructured data sources, wherein said artifacts include at least one of words, phrases, subjects, predicates, modifiers, and other syntactic forms; and
learning semantic and syntactic relationships in structured data sources, wherein said structured data sources comprise entities in conventional formats used by relational database systems;
wherein learned associations resulting from said learning associations of natural language artifacts are formed using grouping of one natural language artifact in an interaction window with another at least one natural language artifact in said interaction window, based on a criteria of shared features of one or more sets from said grouping constituting measurements from said data sources;
further wherein said criteria of shared features are dynamically determined without the use of a priori classifications, using conditional probability constraints between sets of learned associations; and
further wherein said grouping creates a network of conditional probabilities between all encountered natural language artifacts or a subset thereof, determined by consideration of conditional interaction probabilities based on a history of measurements from said data sources.
Dependent claims: 38, 39, 40, 41, 42, 43
Specification