Knowledge discovery agent system and method
Abstract
A system and method for processing information in unstructured or structured form, comprising a computer running in a distributed network with one or more data agents. Associations of natural language artifacts may be learned from unstructured data sources, and semantic and syntactic relationships may be learned from structured data sources, using grouping based on criteria of shared features that are dynamically determined, without the use of a priori classifications, by employing conditional probability constraints.
38 Claims
1. A system for processing information, comprising:
a computer for running software programs in a distributed network, said computer including a processing means in communication with said distributed network;
at least one storage means in communication with said processing means for storing programs and information; and
at least one data agent in said distributed network for conducting a specific response to commands generated by said processing means;
wherein said computer, said at least one storage means, and said at least one data agent are operable for:
learning associations of natural language artifacts in unstructured data sources, wherein said artifacts include at least one of words, phrases, subjects, predicates, modifiers, and other syntactic forms; and
learning semantic and syntactic relationships in structured data sources, wherein said structured data sources are in conventional formats used by relational database systems;
further wherein learned associations resulting from said learning associations of natural language artifacts are formed using grouping of one natural language artifact in an interaction window with another at least one natural language artifact in said interaction window, based on criteria of shared features of one or more sets from said grouping constituting measurements from said data sources;
further wherein said criteria of shared features are dynamically determined without the use of a priori classifications and using conditional probability constraints between sets of learned associations; and
further wherein said grouping creates a network of conditional probabilities between all encountered natural language artifacts or a subset thereof, determined by consideration of conditional interaction probabilities based on a history of measurements from said data sources;
means for constructing hierarchies of association across a state space of term usage compatible for interpolation of mapping functions between sets of terms, wherein said mapping functions include one or more of fuzzy-type, weighted-type, or other types of mapping functions; and
means for generating a structure of mapping functions comprising sets of terms in particular semantic positions, wherein said structure comprises a formal semantic structure of one or more of programming languages, modal logics, frame systems, and ontologies of objects and relationships.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
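The grouping step recited in claim 1 (artifacts sharing an interaction window, yielding a network of conditional probabilities built from a history of measurements) can be illustrated with a minimal sketch. This is not the patented implementation: the function name `conditional_network`, the symmetric sliding window, and the estimate of P(b | a) as the fraction of windows centred on `a` that contain `b` are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def conditional_network(tokens, window=3):
    """Estimate P(b | a) from co-occurrence inside a sliding window.

    Hypothetical helper: the window shape and the estimator are
    illustrative assumptions, not taken from the patent text.
    """
    seen = Counter()             # number of windows centred on each token
    pair = defaultdict(Counter)  # pair[a][b]: windows centred on a that contain b
    for i, a in enumerate(tokens):
        seen[a] += 1
        # Distinct neighbours within `window` positions on either side.
        neighbours = set(tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window])
        neighbours.discard(a)
        for b in neighbours:
            pair[a][b] += 1
    # Each window counts a neighbour at most once, so every value lies in (0, 1].
    return {a: {b: n / seen[a] for b, n in row.items()} for a, row in pair.items()}


text = "the agent reads the report and the agent files the report".split()
net = conditional_network(text, window=2)
print(net["agent"])  # conditional probabilities of neighbours given "agent"
```

No term classes are supplied in advance; the only inputs are the token stream and the window size, which loosely mirrors the "without the use of a priori classifications" limitation.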
11. A method, comprising the steps of:
providing a computer for running software programs in a distributed network, said computer including a processing means in communication with said distributed network;
providing at least one storage means in communication with said processing means for storing programs and information;
providing at least one data agent in said distributed network for conducting a specific response to commands generated by said processing means;
configuring said computer, said at least one storage means, and said at least one data agent to be operable for:
learning associations of natural language artifacts in unstructured data sources, wherein said artifacts include at least one of words, phrases, subjects, predicates, modifiers, and other syntactic forms; and
learning semantic and syntactic relationships in structured data sources, wherein said structured data sources comprise entities in conventional formats used by relational database systems;
wherein learned associations resulting from said learning associations of natural language artifacts are formed using grouping of one natural language artifact in an interaction window with another at least one natural language artifact in said interaction window, based on criteria of shared features of one or more sets from said grouping constituting measurements from said data sources;
further wherein said criteria of shared features are dynamically determined without the use of a priori classifications, using conditional probability constraints between sets of learned associations; and
further wherein said grouping creates a network of conditional probabilities between all encountered natural language artifacts or a subset thereof, determined by consideration of conditional interaction probabilities based on a history of measurements from said data sources;
constructing hierarchies of association across a state space of term usage compatible for interpolation of mapping functions between sets of terms, wherein said mapping functions include one or more of fuzzy-type, weighted-type, or other types of mapping functions, and said sets of terms have corresponding particular syntactic positions; and
generating a structure of mapping functions composed of sets of terms in particular semantic positions, wherein said structure is a formal semantic structure of one or more of programming languages, modal logics, frame systems, and ontologies of objects and relationships.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19
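The step of "constructing hierarchies of association" in claim 11 is not tied to a particular algorithm in the claim language. One way to picture it, offered only as a sketch, is to link terms whose mutual conditional probabilities exceed a threshold and then lower the threshold, so that the groups become progressively coarser. The names `groups_at` and `hierarchy`, the mutual-probability linking rule, the threshold values, and the toy network are all assumptions.

```python
def groups_at(network, threshold):
    """Connected groups of terms linked when both P(b|a) and P(a|b) >= threshold."""
    terms = set(network) | {b for row in network.values() for b in row}
    parent = {t: t for t in terms}

    def find(t):
        # Union-find root lookup with path halving.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for a, row in network.items():
        for b, p_b_given_a in row.items():
            p_a_given_b = network.get(b, {}).get(a, 0.0)
            if min(p_b_given_a, p_a_given_b) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for t in terms:
        groups.setdefault(find(t), set()).add(t)
    return sorted(sorted(g) for g in groups.values())


def hierarchy(network, thresholds=(0.9, 0.5, 0.1)):
    """Levels from tightest to loosest association (illustrative thresholds)."""
    return [(t, groups_at(network, t)) for t in sorted(thresholds, reverse=True)]


# Toy conditional-probability network, invented for the example.
toy = {
    "doctor": {"nurse": 0.9, "hospital": 0.6},
    "nurse": {"doctor": 0.9, "hospital": 0.5},
    "hospital": {"doctor": 0.6, "nurse": 0.5, "bank": 0.1},
    "bank": {"hospital": 0.2, "loan": 0.8},
    "loan": {"bank": 0.8},
}
levels = hierarchy(toy)
for threshold, groups in levels:
    print(threshold, groups)
```

Lowering the threshold can only merge groups, never split them, so the levels nest into a hierarchy. This is single-linkage clustering, swapped in here as a stand-in for whatever construction the specification describes.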
20. A dynamic conceptual network system, comprising:
a computer for running software programs in a distributed network, said computer including a processing means in communication with said distributed network;
at least one storage means in communication with said processing means for storing programs and information;
at least one data agent in said distributed network for conducting a specific response to commands generated by said processing means; and
means for generating a structure of mapping functions composed of sets of terms in particular semantic positions, wherein said structure is a formal semantic structure of one or more of programming languages, modal logics, frame systems, and ontologies of objects and relationships;
wherein said computer, said at least one storage means, and said at least one data agent are further constructed and adapted for learning semantic and syntactic relationships in structured data sources, wherein said structured data sources are in conventional formats used by relational database systems;
further wherein said computer, said at least one storage means, and said at least one data agent are constructed and adapted for learning associations of natural language artifacts in unstructured data sources, said natural language artifacts including at least one of words, phrases, subjects, predicates, modifiers, and other syntactic forms;
further wherein the learned associations resulting from said learning associations of natural language artifacts are formed using grouping of one natural language artifact in an interaction window with another at least one natural language artifact in said interaction window, based on criteria of shared features of one or more sets from said grouping constituting measurements from said data sources;
further wherein said criteria of shared features are dynamically determined without the use of a priori classifications and using conditional probability constraints between sets of learned associations; and
further wherein said grouping creates a network of conditional probabilities between all encountered natural language artifacts or a subset thereof, determined by consideration of conditional interaction probabilities based on a history of measurements from said data sources.
Dependent claims: 21, 22, 23, 24, 25, 26, 27, 28, 29
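Claim 20 also recites learning relationships in structured data sources held in relational database formats. As a rough sketch of what that could look like, the example below estimates the conditional probability of one column's value given another's from the rows of a table, using Python's standard `sqlite3` module. The table `orders`, its columns, the sample rows, and the helper `column_dependency` are hypothetical and do not come from the patent.

```python
import sqlite3
from collections import Counter


def column_dependency(conn, table, given, target):
    """Estimate P(target = v | given = g) from the rows of one table.

    Identifiers are interpolated into the SQL text, so this sketch is only
    suitable for trusted, hard-coded table and column names.
    """
    rows = conn.execute(f'SELECT "{given}", "{target}" FROM "{table}"').fetchall()
    given_counts = Counter(g for g, _ in rows)  # how often each `given` value occurs
    pair_counts = Counter(rows)                 # how often each (given, target) pair occurs
    return {(g, v): n / given_counts[g] for (g, v), n in pair_counts.items()}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, product TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("acme", "bolt"), ("acme", "bolt"), ("acme", "nut"), ("zenith", "nut")],
)
probs = column_dependency(conn, "orders", "customer", "product")
print(probs)
```

The resulting dictionary has the same shape of information as a conditional-probability network over text, which is one plausible reason a single system could treat structured and unstructured sources uniformly.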
30. A method for processing information, comprising the steps of:
providing a computer for running software programs in a distributed network, said computer including a processing means in communication with said distributed network;
providing at least one storage means in communication with said processing means for storing programs and information;
providing at least one data agent in said distributed network for conducting a specific response to commands generated by said processing means;
configuring said computer, said at least one storage means, and said at least one data agent to be operable for:
learning associations of natural language artifacts in unstructured data sources, wherein said artifacts include at least one of words, phrases, subjects, predicates, modifiers, and other syntactic forms; and
learning semantic and syntactic relationships in structured data sources, wherein said structured data sources comprise entities in conventional formats used by relational database systems;
wherein learned associations resulting from said learning associations of natural language artifacts are formed using grouping of one natural language artifact in an interaction window with another at least one natural language artifact in said interaction window, based on criteria of shared features of one or more sets from said grouping constituting measurements from said data sources;
further wherein said criteria of shared features are dynamically determined without the use of a priori classifications, using conditional probability constraints between sets of learned associations; and
further wherein said grouping creates a network of conditional probabilities between all encountered natural language artifacts or a subset thereof, determined by consideration of conditional interaction probabilities based on a history of measurements from said data sources;
constructing hierarchies of association across a state space of term usage compatible for interpolation of mapping functions between sets of terms, wherein said mapping functions include one or more of fuzzy-type, weighted-type, or other types of mapping functions, wherein said sets of terms have corresponding particular syntactic positions; and
generating a structure of mapping functions composed of sets of terms in particular semantic positions, wherein said structure is a formal semantic structure of one or more of programming languages, modal logics, frame systems, and ontologies of objects and relationships.
Dependent claims: 31, 32, 33, 34
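The claim refers to "fuzzy-type" and "weighted-type" mapping functions and to their interpolation, without defining them in the claim itself. The sketch below shows one possible reading: a fuzzy-type mapping returns a degree of membership of a term in a set of terms, a weighted-type mapping looks up an explicit weight, and interpolation is a convex blend of two mappings. All three helpers, the membership rule, and the sample values are assumptions for illustration.

```python
def fuzzy_map(network, members):
    """Fuzzy-type mapping (assumed form): degree to which a term belongs to a set.

    Members of the set get degree 1.0; any other term gets the largest
    P(member | term) found in the conditional-probability network.
    """
    def membership(term):
        if term in members:
            return 1.0
        row = network.get(term, {})
        return max((row.get(m, 0.0) for m in members), default=0.0)
    return membership


def weighted_map(weights):
    """Weighted-type mapping (assumed form): explicit per-term weights."""
    return lambda term: weights.get(term, 0.0)


def interpolate(f, g, alpha):
    """Convex blend of two mapping functions; alpha=0 gives f, alpha=1 gives g."""
    return lambda term: (1.0 - alpha) * f(term) + alpha * g(term)


# Invented values, for the example only.
network = {"surgeon": {"doctor": 0.75, "scalpel": 0.5}}
medical = fuzzy_map(network, {"doctor", "nurse"})
prior = weighted_map({"surgeon": 0.25})
blend = interpolate(medical, prior, alpha=0.5)
print(medical("surgeon"), prior("surgeon"), blend("surgeon"))
```

Because every mapping here is a plain function from a term to a number in [0, 1], any two of them can be blended, which is the property the phrase "compatible for interpolation" seems to require.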
35. A method for processing information, comprising the steps of:
providing a computer for running software programs in a distributed network, said computer including a processing means in communication with said distributed network;
providing at least one storage means in communication with said processing means for storing programs and information;
providing at least one data agent in said distributed network for conducting a specific response to commands generated by said processing means;
configuring said computer, said at least one storage means, and said at least one data agent to be operable for:
learning associations of natural language artifacts in unstructured data sources, wherein said artifacts include at least one of words, phrases, subjects, predicates, modifiers, and other syntactic forms; and
learning semantic and syntactic relationships in structured data sources, wherein said structured data sources comprise entities in conventional formats used by relational database systems;
wherein learned associations resulting from said learning associations of natural language artifacts are formed using grouping of one natural language artifact in an interaction window with another at least one natural language artifact in said interaction window, based on criteria of shared features of one or more sets from said grouping constituting measurements from said data sources;
further wherein said criteria of shared features are dynamically determined without the use of a priori classifications, using conditional probability constraints between sets of learned associations; and
further wherein said grouping creates a network of conditional probabilities between all encountered natural language artifacts or a subset thereof, determined by consideration of conditional interaction probabilities based on a history of measurements from said data sources;
collecting data input generated by interaction with a human user;
learning from said input;
utilizing results of said learning to reorganize and alter mapping functions induced from analyzed input; and
aligning said mapping functions according to human natural usage information in said results of said learning;
wherein said input data is in the form of structured information or unstructured information.
Dependent claims: 36, 37, 38
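The final steps of claim 35 (collecting user input, learning from it, and altering mapping functions so that they align with human usage) could take many forms. A minimal sketch, assuming the mapping is stored as weights on (source term, target term) pairs and that the user signals acceptance with 1.0 or rejection with 0.0, is a simple step of each weight toward the user's signal. The function `apply_feedback`, the learning rate, and the sample data are hypothetical.

```python
def apply_feedback(weights, feedback, rate=0.5):
    """Return new weights nudged toward user signals (assumed update rule).

    weights:  {(source, target): weight in [0, 1]}
    feedback: iterable of ((source, target), signal), signal 1.0 = accept, 0.0 = reject
    """
    updated = dict(weights)  # leave the caller's mapping untouched
    for pair, signal in feedback:
        old = updated.get(pair, 0.0)
        updated[pair] = old + rate * (signal - old)
    return updated


weights = {("car", "vehicle"): 0.5, ("car", "carpet"): 0.5}
feedback = [(("car", "vehicle"), 1.0), (("car", "carpet"), 0.0)]
aligned = apply_feedback(weights, feedback, rate=0.5)
print(aligned)
```

Repeated feedback moves a weight geometrically toward the user's signal, so a mapping that users keep rejecting fades without ever being deleted outright.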
Specification