System and method of grouping and extracting information from data corpora
Abstract
A system for annotating words of a data corpus based upon their particular concept and their corresponding grammatical sense with Conceptual Numerical Identifiers (CNIs) from a Conceptual Dictionary, pairing the words based on conceptual inter-relating network (CIRN) rules, and determining if a selected plurality of paired words are grammatically, syntactically, and linguistically correct by matching CNIs from each pair of words.
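As an illustration of the abstract, the following is a minimal sketch of annotating words with CNIs and checking a word pair against pairing rules. The numeric values, the dictionary entries, the rule table, and the names `CNI`, `CONCEPTUAL_DICTIONARY`, `CIRN_RULES`, `annotate`, and `pair_is_valid` are all assumptions made for this example; the abstract does not specify the identifier format or the content of the CIRN rules.

```python
from typing import NamedTuple


class CNI(NamedTuple):
    """Illustrative identifier with a word portion and a meaning portion."""
    word: int     # assumed: which dictionary entry the word belongs to
    meaning: int  # assumed: which concept/sense of that entry is meant


# Hypothetical conceptual dictionary: (word, grammatical sense) -> CNI.
CONCEPTUAL_DICTIONARY = {
    ("dog", "noun"): CNI(101, 1),
    ("barks", "verb"): CNI(202, 1),
    ("bark", "noun"): CNI(202, 2),   # same word portion, different meaning
    ("loud", "adjective"): CNI(303, 1),
}

# Hypothetical stand-in for CIRN rules: the set of CNI pairs that may combine.
CIRN_RULES = {
    (CNI(101, 1), CNI(202, 1)),  # "dog" + "barks" (verb sense)
    (CNI(303, 1), CNI(101, 1)),  # "loud" + "dog"
}


def annotate(tagged_words):
    """Replace each (word, grammatical sense) with its CNI."""
    return [CONCEPTUAL_DICTIONARY[tw] for tw in tagged_words]


def pair_is_valid(left, right):
    """A pair is accepted when its two CNIs match an entry in the rule table."""
    return (left, right) in CIRN_RULES


cnis = annotate([("dog", "noun"), ("barks", "verb")])
print(pair_is_valid(cnis[0], cnis[1]))            # pair present in the rules
print(pair_is_valid(CNI(101, 1), CNI(202, 2)))    # noun sense of "bark": no rule
```

In this sketch the meaning portion is what separates the verb and noun senses of the same word, so a pair can be rejected even when its surface words look compatible.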
31 Claims
1. A method of extracting information from data corpora, the method comprising:
identifying a data corpus accessible by a computer comprising grammatical sentences made up of a plurality of word elements;
assigning conceptual numerical identifiers to the plurality of word elements by a processor, where the conceptual numerical identifiers each has a word portion and a meaning portion;
grouping the conceptual numerical identifier based upon predetermined rules that group the conceptual numerical identifiers to form sets with conceptual sets logic, where the groupings of the conceptual numerical identifiers are stored in memory;
applying additional rules to the groupings of the conceptual numerical identifiers by the processor that result in indexed groups of conceptual numerical identifier; and
storing the indexed groupings of conceptual numerical identifier in a database, where the indexed groupings of conceptual numerical identifier may be queried with a grammatical sentence.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 15
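The steps of claim 1 (assign identifiers, group them, index the groups, store them in a database) can be sketched as follows. This is only one possible reading: the dictionary contents, the adjacent-pair grouping rule, the string key used as the index, and the names `DICTIONARY`, `assign_cnis`, `group`, `encode`, and `build_index` are assumptions for illustration, not details taken from the claim. An in-memory SQLite database stands in for the claimed database.

```python
import sqlite3

# Hypothetical dictionary: word -> (word portion, meaning portion).
DICTIONARY = {
    "dogs": (101, 1),
    "chase": (202, 1),
    "cats": (102, 1),
    "sleep": (203, 1),
}


def assign_cnis(sentence):
    """Split a sentence into word elements and look up an identifier for each."""
    words = sentence.lower().rstrip(".").split()
    return [DICTIONARY[w] for w in words if w in DICTIONARY]


def group(cnis):
    """Assumed grouping rule: each pair of adjacent identifiers forms a group."""
    return list(zip(cnis, cnis[1:]))


def encode(grp):
    """Assumed indexing rule: an order-preserving text key for a group."""
    return "|".join(f"{word}.{meaning}" for word, meaning in grp)


def build_index(corpus):
    """Store every indexed group together with the sentence it came from."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE groups (key TEXT, sentence_id INTEGER)")
    for sentence_id, sentence in enumerate(corpus):
        for grp in group(assign_cnis(sentence)):
            db.execute("INSERT INTO groups VALUES (?, ?)",
                       (encode(grp), sentence_id))
    db.commit()
    return db


db = build_index(["Dogs chase cats.", "Cats sleep."])
for row in db.execute("SELECT key, sentence_id FROM groups"):
    print(row)
```

Keeping the sentence id next to each key is a design choice of this sketch; it lets a later query map matching groups back to the sentences of the corpus.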
9. A method of extracting information from data corpora, the method comprising:
converting a query made up of a grammatical string into a plurality of word elements;
assigning conceptual numerical identifiers to word elements in the plurality of word elements, such that the conceptual numerical identifiers identifies a specific word meaning to the word elements, where the conceptual numerical identifiers each has a word portion and a meaning portion;
grouping the conceptual numerical identifiers based upon predetermined rules that group the conceptual numerical identifiers to form sets with conceptual sets logic, where the groupings of the conceptual numerical identifiers are stored in memory;
applying additional rules to the groupings of the conceptual numerical identifiers by a processor that result in indexed grouping of numerical identifier; and
accessing a database with the indexed groupings of numerical identifiers, where the database contains groupings of conceptual numerical identifiers associated with the data corpora in order to identify matches between a portion of the indexed groupings of numerical identifiers with the groupings of conceptual numerical identifiers associated with the data corpora.
Dependent claims: 10, 11, 12, 13, 14
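The query side described in claim 9 can be sketched in the same spirit: the query string is converted to identifiers, grouped, and the resulting keys are matched against groups already stored for the corpus. The table layout, key format, dictionary, and the names `query_groups` and `find_matches` are assumptions for this example, and the sample rows are made up; the claim does not prescribe any of them.

```python
import sqlite3

# Hypothetical dictionary: word -> (word portion, meaning portion).
DICTIONARY = {"dogs": (101, 1), "chase": (202, 1), "cats": (102, 1)}


def query_groups(query):
    """Convert a query string into the set of group keys it contains."""
    words = query.lower().rstrip("?.").split()
    cnis = [DICTIONARY[w] for w in words if w in DICTIONARY]
    # Assumed grouping rule: adjacent identifiers form a pair.
    return {f"{a[0]}.{a[1]}|{b[0]}.{b[1]}" for a, b in zip(cnis, cnis[1:])}


def find_matches(db, query):
    """Return {sentence_id: number of query groups found in that sentence}."""
    keys = sorted(query_groups(query))
    if not keys:
        return {}
    placeholders = ",".join("?" * len(keys))
    rows = db.execute(
        "SELECT sentence_id, COUNT(DISTINCT key) FROM groups "
        f"WHERE key IN ({placeholders}) GROUP BY sentence_id",
        keys,
    )
    return dict(rows.fetchall())


# Made-up stored groups standing in for a previously indexed corpus.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE groups (key TEXT, sentence_id INTEGER)")
db.executemany(
    "INSERT INTO groups VALUES (?, ?)",
    [("101.1|202.1", 0), ("202.1|102.1", 0), ("102.1|203.1", 1)],
)

print(find_matches(db, "Do dogs chase cats?"))
```

Counting matched groups per sentence reflects the claim's wording that only a portion of the query's groupings needs to match; a ranking step could be built on these counts, though the claim itself does not describe one.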
16. A method of extracting information from data corpora, the method comprising:
identifying a data corpus accessible by a computer comprising grammatical sentences made up of a plurality of word elements;
assigning conceptual numerical identifiers to the plurality of word elements by a processor, where the conceptual numerical identifiers each has a word portion and a meaning portion;
grouping the conceptual numerical identifiers into pairings of conceptual numerical identifiers based upon predetermined rules that group the conceptual numerical identifiers to form sets with conceptual sets logic, where the pairings are stored in memory;
applying additional rules to the pairings of the conceptual numerical identifiers by the processor that result in indexed pairs of conceptual numerical identifiers; and
storing the indexed pairings of conceptual numerical identifiers in a database, where the indexed pairings of conceptual numerical identifier may be queried with a grammatical sentence.
Dependent claims: 17, 18, 19, 20, 21, 22, 23
24. A non-transitory machine-readable medium with machine readable instructions, that when executed result in a method of extracting information from data corpora, the instructions for the steps comprising:
identifying a data corpus accessible by a computer comprising grammatical sentences made up of a plurality of word elements;
assigning conceptual numerical identifiers to the plurality of word elements by a processor, where the conceptual numerical identifiers each has a word portion and a meaning portion;
grouping of the conceptual numerical identifiers based upon predetermined rules that group the conceptual numerical identifiers to form sets with conceptual sets logic, where the groupings of the conceptual numerical identifiers are stored in memory;
applying additional rules to the groupings the conceptual numerical identifiers by the processor that result in indexed groups of conceptual numerical identifiers; and
storing the indexed groupings of conceptual numerical identifiers in a database, where the indexed groupings of conceptual numerical identifiers may be queried with a grammatical sentence.
Dependent claims: 25, 26, 27, 28, 29, 30, 31
Specification