Methods, apparatus, and data structures for annotating a database design schema and/or indexing annotations
First Claim
1. A computer implemented method of indexing annotations to an information storage to generate a list of indexed annotations, the method comprising steps of:
- a) classifying words of the annotations based on a concordance and a dictionary so that the words have an associated classification, wherein;
classifications comprise;
a unique classification, wherein the unique classification comprises;
possessives; and
words not in the dictionary;
a proper classification, wherein the proper classification comprises words defined as proper nouns in the dictionary;
a stop classification, wherein the stop classification comprises words having no semantic meaning;
a class classification, wherein the class classification comprises words that have been used to annotate rows in a table; and
at least four distinct classifications comprising words derived from frequency of each word in the concordance such that;
words occurring least frequently in the concordance comprise a rare classification;
words occurring next-least frequently in the concordance comprise an infrequent classification;
words occurring next-most frequently in the concordance comprise a frequent classification; and
words occurring most frequently in the concordance comprise a common classification; and
the step of classifying words includes determining the classification to which the words should be assigned; and
b) assigning a normalized weight to words of each annotation, based on the associated classification of the words of each annotation, wherein a total normalized weight for the words of each annotation is equal to 1.00, wherein, the normalized weight assigned to words of each annotation based on the classifications associated with the words is determined by;
determining whether the annotation contains any words in the unique classification, wherein;
in an event that more than one word in the unique classification exists in the annotation, each of the unique classification words is assigned a normalized weight of 0.75 divided by the number of unique classification words in the annotation;
in an event that only one word in the unique classification exists in the annotation, that word is assigned a normalized weight of 0.50;
based on the number of words in the unique classification determining the weight to be apportioned to the remaining words occurs such that;
in an event that there are no unique words in the annotation, the remaining weight to be apportioned is 1.00;
in an event that there was one unique word in the annotation, the remaining weight to be apportioned is 0.50;
in an event that there was more than one unique word in the annotation, the remaining weight to be apportioned is 0.25;
of the weight apportioned to the remaining words, determining a share of the apportioned weight for each word such that;
each word in the class classification gets 1 share of the remaining weight,
each word in the stop classification gets 1 share of the remaining weight,
each word in the common classification gets 2 shares of the remaining weight,
each word in the frequent classification gets 3 shares of the remaining weight,
each word in the infrequent classification gets 4 shares of the remaining weight,
each word in the rare classification gets 5 shares of the remaining weight; and
each word in the proper classification gets 5 shares of the remaining weight,
in an event that the weight added together for number of shares of class, common, and stop words in the annotation is more than 0.10, setting the normalized weight for the class, common, and stop words to equal 0.10.
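The weighting rules in step b) can be illustrated with a minimal Python sketch. This is only one reading of the claim, not the patentee's implementation: the function name `assign_weights`, the constant names, and the input format (a list of `(word, classification)` pairs) are hypothetical. The claim does not say where the weight freed by the 0.10 cap goes, so the sketch assumes it is redistributed by shares among the frequent, infrequent, rare, and proper words, and that the cap is skipped when no such words exist.

```python
# Shares per classification, taken from the claim text.
SHARES = {"class": 1, "stop": 1, "common": 2, "frequent": 3,
          "infrequent": 4, "rare": 5, "proper": 5}
LOW_VALUE = {"class", "stop", "common"}   # classes subject to the cap
LOW_VALUE_CAP = 0.10


def assign_weights(classified):
    """classified: list of (word, classification) pairs for one annotation.
    Returns a list of (word, weight) pairs in the same order."""
    unique_idx = [i for i, (_, c) in enumerate(classified) if c == "unique"]
    n_unique = len(unique_idx)
    if n_unique == 0:
        unique_each, remaining = 0.0, 1.00
    elif n_unique == 1:
        unique_each, remaining = 0.50, 0.50
    else:
        unique_each, remaining = 0.75 / n_unique, 0.25

    weights = [0.0] * len(classified)
    for i in unique_idx:
        weights[i] = unique_each

    low = [i for i, (_, c) in enumerate(classified) if c in LOW_VALUE]
    high = [i for i, (_, c) in enumerate(classified)
            if c != "unique" and c not in LOW_VALUE]
    shares = {i: SHARES[classified[i][1]] for i in low + high}
    total_shares = sum(shares.values())
    if total_shares == 0:
        # Only unique words: the claim does not say what happens to the rest.
        return [(w, wt) for (w, _), wt in zip(classified, weights)]

    # Split the remaining weight in proportion to shares.
    for i, s in shares.items():
        weights[i] = remaining * s / total_shares

    # ASSUMPTION: cap the combined class/common/stop weight at 0.10 and hand
    # the freed weight to the other non-unique words, again by shares.
    low_total = sum(weights[i] for i in low)
    if low_total > LOW_VALUE_CAP and high:
        low_shares = sum(shares[i] for i in low)
        high_shares = sum(shares[i] for i in high)
        for i in low:
            weights[i] = LOW_VALUE_CAP * shares[i] / low_shares
        for i in high:
            weights[i] = (remaining - LOW_VALUE_CAP) * shares[i] / high_shares

    return [(w, wt) for (w, _), wt in zip(classified, weights)]


example_one_unique = assign_weights(
    [("Smith's", "unique"), ("the", "stop"), ("salary", "rare")])
example_capped = assign_weights(
    [("the", "stop"), ("of", "stop"), ("employee", "class"),
     ("name", "common"), ("salary", "rare")])
```

In the first example the single unique word takes 0.50 and the other two words split the remaining 0.50 in a 1:5 ratio. In the second, the class, common, and stop words would otherwise hold half the weight, so under the assumption above they are scaled down to 0.10 in total and the rare word receives 0.90.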
Abstract
An authoring tool (or process) to facilitate the performance of an annotation function and an indexing function. The annotation function may generate informational annotations and word annotations to a database design schema (e.g., an entity-relationship diagram or “ERD”). The indexing function may analyze the words of the annotations by classifying the words in accordance with a concordance and dictionary, and assign a normalized weight to each word of each of the annotations based on the classification(s) of the word(s) of the annotation. A query translator (or query translation process) to (i) accept a natural language query from a user interface process, (ii) convert the natural language query to a formal command query (e.g., an SQL query) using the indexed annotations generated by the authoring tool and the database design schema, and (iii) present the formal command query to a database management process for interrogating the relational database.
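The classification step of the indexing function can be sketched in Python as follows. Everything here beyond the classification names is an assumption made for illustration: the function name `classify_word`, the dictionary format (word mapped to a part-of-speech tag), the order in which the classifications are tested, and the numeric frequency cutoffs are all hypothetical, since the source says only that the four frequency bands are derived from how often each word occurs in the concordance.

```python
def classify_word(word, dictionary, stop_words, class_words, concordance,
                  cutoffs=(5, 50, 500)):
    """Return one of: unique, proper, stop, class, rare, infrequent,
    frequent, common.

    dictionary:  dict mapping lowercase word -> part-of-speech tag (assumed format)
    stop_words:  set of words treated as having no semantic meaning
    class_words: set of words already used to annotate rows in a table
    concordance: dict mapping lowercase word -> occurrence count
    cutoffs:     hypothetical upper bounds for rare, infrequent, frequent
    """
    w = word.lower().replace("\u2019", "'")   # normalize curly apostrophes
    # Unique: possessives and words not in the dictionary.
    if w.endswith("'s") or w.endswith("s'") or w not in dictionary:
        return "unique"
    if dictionary[w] == "proper_noun":
        return "proper"
    if w in stop_words:
        return "stop"
    if w in class_words:
        return "class"
    # Remaining words fall into one of four frequency bands.
    count = concordance.get(w, 0)
    rare_max, infrequent_max, frequent_max = cutoffs
    if count <= rare_max:
        return "rare"
    if count <= infrequent_max:
        return "infrequent"
    if count <= frequent_max:
        return "frequent"
    return "common"


# Toy data, invented for this example only.
DICTIONARY = {"the": "article", "paris": "proper_noun", "employee": "noun",
              "salary": "noun", "bonus": "noun", "address": "noun",
              "name": "noun"}
STOP_WORDS = {"the"}
CLASS_WORDS = {"employee"}
CONCORDANCE = {"salary": 3, "bonus": 40, "address": 200, "name": 900}


def classify(word):
    return classify_word(word, DICTIONARY, STOP_WORDS, CLASS_WORDS, CONCORDANCE)
```

With this toy data, `classify("Smith's")` yields the unique classification because the word is a possessive, while `classify("salary")` falls into the rare band because its concordance count is below the first cutoff.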
30 Citations
10 Claims
1. Set out in full under First Claim above. Dependent claims: 2, 3, 4, 5, 10
6. A system for indexing annotations to information, the system comprising:
- a processor for processing annotations to information via an annotating facility adapted to annotate a database design schema with annotations comprising;
word annotations; and
informational annotations;
- a memory operatively coupled to the processor for storing annotations to information;
- classification means for classifying words of the annotations based on a dictionary and a concordance such that each individual word of the words is associated with a classification, wherein;
classifications comprise;
a unique classification, wherein the unique classification comprises;
possessives; and
words not in the dictionary;
a proper classification, wherein the proper classification comprises words defined as proper nouns in the dictionary;
a stop classification, wherein the stop classification comprises words having no semantic meaning;
a class classification, wherein the class classification comprises words that have been used to annotate rows in a table; and
at least four distinct classifications comprising words derived from frequency of each word in the concordance such that;
words occurring least frequently in the concordance comprise a rare classification;
words occurring next-least frequently in the concordance comprise an infrequent classification;
words occurring next-most frequently in the concordance comprise a frequent classification; and
words occurring most frequently in the concordance comprise a common classification; and
the step of classifying words includes determining the classification to which the words should be assigned; and
- assignment means for assigning a normalized weight to each individual word of the annotations responsive to the associated classification of each individual word by the classification means, wherein a total normalized weight for the words of each annotation is equal to 1.00, and the normalized weight assigned to words of each annotation based on the classifications associated with the words is determined by;
determining whether the annotation contains any words in the unique classification, wherein;
in an event that more than one word in the unique classification exists in the annotation, each of the unique classification words is assigned a normalized weight of 0.75 divided by the number of unique classification words in the annotation;
in an event that only one word in the unique classification exists in the annotation, that word is assigned a normalized weight of 0.50;
based on the number of words in the unique classification determining the weight to be apportioned to the remaining words occurs such that;
in an event that there are no unique words in the annotation, the remaining weight to be apportioned is 1.00;
in an event that there was one unique word in the annotation, the remaining weight to be apportioned is 0.50;
in an event that there was more than one unique word in the annotation, the remaining weight to be apportioned is 0.25;
of the weight apportioned to the remaining words, determining a share of the apportioned weight for each word such that;
each word in the class classification gets 1 share of the remaining weight,
each word in the stop classification gets 1 share of the remaining weight,
each word in the common classification gets 2 shares of the remaining weight,
each word in the frequent classification gets 3 shares of the remaining weight,
each word in the infrequent classification gets 4 shares of the remaining weight,
each word in the rare classification gets 5 shares of the remaining weight; and
each word in the proper classification gets 5 shares of the remaining weight,
in an event that the weight added together for number of shares of class, common, and stop words in the annotation is more than 0.10, setting the normalized weight for the class, common, and stop words to equal 0.10.
7. At least one machine readable storage medium storing processor-executable instructions, the processor-executable instructions comprising:
- an annotating facility, the annotating facility adapted to annotate a database design schema with annotations comprising;
word annotations; and
informational annotations; and
- an indexing facility, the indexing facility adapted to;
classify individual words of the word annotations based on a concordance and a dictionary such that the individual words of the word annotations have an associated classification, wherein;
classifications comprise;
a unique classification, wherein the unique classification comprises;
possessives; and
words not in the dictionary;
a proper classification, wherein the proper classification comprises words defined as proper nouns in the dictionary;
a stop classification, wherein the stop classification comprises words having no semantic meaning;
a class classification, wherein the class classification comprises words that have been used to annotate rows in a table; and
at least four distinct classifications comprising words derived from frequency of each word in the concordance such that;
words occurring least frequently in the concordance comprise a rare classification;
words occurring next-least frequently in the concordance comprise an infrequent classification;
words occurring next-most frequently in the concordance comprise a frequent classification, and
words occurring most frequently in the concordance comprise a common classification; and
the step of classifying words includes determining the classification to which the words should be assigned; and
assign a normalized weight to the classified individual words of the word annotations, based on the associated classification of the words of each annotation, wherein a total normalized weight for the words of each annotation is equal to 1.00 and the normalized weight assigned to words of each annotation based on the classifications associated with the words is determined by;
determining whether the annotation contains any words in the unique classification, wherein;
in an event that more than one word in the unique classification exists in the annotation, each of the unique classification words is assigned a normalized weight of 0.75 divided by the number of unique classification words in the annotation;
in an event that only one word in the unique classification exists in the annotation, that word is assigned a normalized weight of 0.50;
based on the number of words in the unique classification determining the weight to be apportioned to the remaining words occurs such that;
in an event that there are no unique words in the annotation, the remaining weight to be apportioned is 1.00;
in an event that there was one unique word in the annotation, the remaining weight to be apportioned is 0.50;
in an event that there was more than one unique word in the annotation, the remaining weight to be apportioned is 0.25;
of the weight apportioned to the remaining words, determining a share of the apportioned weight for each word such that;
each word in the class classification gets 1 share of the remaining weight,
each word in the stop classification gets 1 share of the remaining weight,
each word in the common classification gets 2 shares of the remaining weight,
each word in the frequent classification gets 3 shares of the remaining weight,
each word in the infrequent classification gets 4 shares of the remaining weight,
each word in the rare classification gets 5 shares of the remaining weight; and
each word in the proper classification gets 5 shares of the remaining weight,
in an event that the weight added together for number of shares of class, common, and stop words in the annotation is more than 0.10, setting the normalized weight for the class, common, and stop words to equal 0.10.
Dependent claims: 8, 9
Specification