Organizing structured and unstructured database columns using corpus analysis and context modeling to extract knowledge from linguistic phrases in the database
First Claim
1. A method for extracting contextual information about a plurality of objects and a plurality of activities from a plurality of tables comprising a plurality of columns and a plurality of rows in a database comprising:
selecting a number P partitioning columns, said partitioning columns to be used for partitioning said database rows into partitions;
each said partition comprising zero or more rows, wherein if one or more rows exist, each said row having the same values, or value ranges, in each row and partitioning column in said partition;
selecting a number R processing columns of said database to be used in extracting information from said database;
said processing columns comprising structured columns and unstructured columns;
said unstructured columns including one or more columns containing words or phrases expressed in a language;
modeling classes and relationships among said plurality of objects and said plurality of activities described by entries in said database;
searching the said partitions for said contextual information based on the modeling; and
presenting said contextual information to a user;
wherein each of P and R is greater than zero.
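The partitioning and searching steps of the claim can be illustrated with a minimal Python sketch. It is not the patented method. The function names `partition_rows` and `search_partitions`, the sample rows, and the column names `dept`, `year` and `note` are all hypothetical. A plain substring match stands in for the model-based search, and the modeling of classes and relationships is not covered.

```python
from collections import defaultdict


def partition_rows(rows, partitioning_columns):
    """Group rows so that every row in a partition shares the same
    values in all partitioning columns (exact values only, no ranges)."""
    if not partitioning_columns:
        raise ValueError("P must be greater than zero")
    partitions = defaultdict(list)
    for row in rows:
        key = tuple(row[col] for col in partitioning_columns)
        partitions[key].append(row)
    return dict(partitions)


def search_partitions(partitions, processing_columns, term):
    """Return, per partition, the rows whose processing columns mention
    `term`. Substring matching is an assumption made for illustration."""
    if not processing_columns:
        raise ValueError("R must be greater than zero")
    needle = term.lower()
    hits = {}
    for key, part_rows in partitions.items():
        matched = [
            row for row in part_rows
            if any(needle in str(row[col]).lower() for col in processing_columns)
        ]
        if matched:
            hits[key] = matched
    return hits


# Hypothetical table: two structured columns and one unstructured column.
rows = [
    {"dept": "cardiology", "year": 2004, "note": "patient reports chest pain"},
    {"dept": "cardiology", "year": 2004, "note": "follow-up visit, no pain"},
    {"dept": "radiology", "year": 2004, "note": "chest x-ray ordered"},
]

partitions = partition_rows(rows, ["dept", "year"])   # P = 2
hits = search_partitions(partitions, ["note"], "pain")  # R = 1

for key, matched in hits.items():
    print(key, [row["note"] for row in matched])
```

Keying each partition on the tuple of partitioning-column values keeps the sketch close to the claim's requirement that all rows in a partition agree on those columns. Supporting value ranges would need a bucketing step before the key is built.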
Abstract
Corpus analysis methods have previously been applied to text, typically to annotated text. The invention shows how to apply corpus analysis methods to information captured in databases, where the database columns include a mixture of both structured domains and unstructured domains containing text. It uses case-based methods to automatically organize cases for periodic review. The invention can help to identify opportunities for increasing knowledge about databases. By organizing a database around common lexical, semantic, pragmatic and syntactic relationships, the invention can be used to increase the effectiveness of previous corpus analysis methods, and to apply them to a diversity of commercial applications. The invention applies contextual constraints to focus the application of linguistic methods. This invention can provide a component for medical records, enterprise databases, information retrieval, question answering systems, interactive robots, interactive appliances, linguistically competent speech recognition, speech understanding and many other useful devices and applications that require a high level of linguistic competence within operational contexts.
110 Citations
19 Claims
1. A method for extracting contextual information about a plurality of objects and a plurality of activities from a plurality of tables comprising a plurality of columns and a plurality of rows in a database comprising:
selecting a number P partitioning columns, said partitioning columns to be used for partitioning said database rows into partitions;
each said partition comprising zero or more rows, wherein if one or more rows exist, each said row having the same values, or value ranges, in each row and partitioning column in said partition;
selecting a number R processing columns of said database to be used in extracting information from said database;
said processing columns comprising structured columns and unstructured columns;
said unstructured columns including one or more columns containing words or phrases expressed in a language;
modeling classes and relationships among said plurality of objects and said plurality of activities described by entries in said database;
searching the said partitions for said contextual information based on the modeling; and
presenting said contextual information to a user;
wherein each of P and R is greater than zero.
an assignment of at least one of said concepts to each said partition wherein each said partition is an instance of each concept to which said partition has been assigned.
9. The method of claim 8 wherein at least one said assignment of concepts to each said partition results from a function applied to said partition.
10. The method of claim 8 wherein at least one said assignment of concepts to each said partition results from an action made by a human operator.
11. The method of claim 9 wherein said function modifies the structure of said database.
12. The method of claim 9 wherein said function modifies the contents of said database.
13. A method for extracting contextual information about a plurality of objects and a plurality of activities from a plurality of texts comprising:
annotating each said text with metadata information;
transforming each said text into one or more database rows that reflect the said annotations through tables and columns of the said database;
selecting a number P partitioning columns, said partitioning columns to be used for partitioning said database rows into partitions;
each said partition comprising zero or more rows, wherein if one or more rows exist, each said row having the same values, or value ranges, in each row and partitioning column in said partition;
selecting a number R processing columns of said database to be used in extracting information from said database;
said processing columns comprising structured columns and unstructured columns;
said unstructured columns including one or more columns containing words or phrases expressed in a language;
modeling classes and relationships among said plurality of objects and said plurality of activities described by entries in said database;
searching the said partitions for said contextual information based on the modeling; and
presenting said contextual information to a user;
wherein each of P and R is greater than zero.
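The first two steps of claim 13, annotating a text with metadata and transforming it into database rows, can be sketched in Python as follows. This is an illustrative assumption and not the patent's procedure. The helper names `annotate` and `to_rows`, the metadata fields `source` and `author`, and the naive split on periods are all hypothetical choices.

```python
def annotate(text, **metadata):
    """Attach metadata annotations (hypothetical fields) to a text."""
    return {"text": text, "metadata": dict(metadata)}


def to_rows(annotated):
    """Turn one annotated text into rows: each metadata annotation becomes
    a structured column, each sentence fills an unstructured column.
    Splitting on '.' is a deliberately naive stand-in for real
    sentence segmentation."""
    sentences = [s.strip() for s in annotated["text"].split(".") if s.strip()]
    rows = []
    for number, sentence in enumerate(sentences):
        row = dict(annotated["metadata"])  # structured columns
        row["sentence_no"] = number        # structured column
        row["sentence"] = sentence         # unstructured column
        rows.append(row)
    return rows


doc = annotate(
    "The pump failed. The operator replaced the valve.",
    source="maintenance_log",
    author="jdoe",
)
text_rows = to_rows(doc)

for row in text_rows:
    print(row)
```

Once the texts are in row form, columns such as `source` or `author` could serve as partitioning columns and `sentence` as a processing column, so the remaining steps of the claim would proceed as they do for an ordinary database table.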
the contextual information is formatted into a linguistically acceptable response; and the text and the contextual information form an interaction with the user.
18. The method of claim 17 wherein said database is updated after a user's linguistic phrase is entered by said method.
19. The method of claim 17 wherein said interactions performed in response to said user's linguistic phrase is evaluated and possibly said evaluation is stored into said database.
Specification