Methods, apparatus, and data structures for annotating a database design schema and/or indexing annotations

US 7,512,609 B2
Filed: 07/22/2005
Issued: 03/31/2009
Est. Priority Date: 05/03/2000
Status: Expired due to Fees

Abstract

An authoring tool (or process) to facilitate the performance of an annotation function and an indexing function. The annotation function may generate informational annotations and word annotations to a database design schema (e.g., an entity-relationship diagram or “ERD”). The indexing function may analyze the words of the annotations by classifying the words in accordance with a concordance and dictionary, and assign a normalized weight to each word of each of the annotations based on the classification(s) of the word(s) of the annotation. A query translator (or query translation process) to (i) accept a natural language query from a user interface process, (ii) convert the natural language query to a formal command query (e.g., an SQL query) using the indexed annotations generated by the authoring tool and the database design schema, and (iii) present the formal command query to a database management process for interrogating the relational database.
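
As a rough illustration of how indexed annotations might feed the query translation step, the sketch below scores each annotated table against the words of a natural language query by summing the normalized weights of matching annotation words, then emits a trivial SQL statement for the best match. The table names, the layout of `indexed_annotations`, and the `score` and `translate` functions are all hypothetical. The abstract does not specify the translation algorithm, so this is only one plausible reading.

```python
# Hypothetical index: table name -> {annotation word: normalized weight}.
# The weights of each annotation sum to 1.00, as the indexing function requires.
indexed_annotations = {
    "movies": {"film": 0.45, "movie": 0.45, "the": 0.10},
    "actors": {"actor": 0.50, "star": 0.40, "a": 0.10},
}


def score(query_words, annotation):
    """Sum the weights of annotation words that also appear in the query."""
    return sum(weight for word, weight in annotation.items() if word in query_words)


def translate(query, index):
    """Return a formal command query for the best-matching table, or None."""
    words = set(query.lower().split())
    best_table, best_score = None, 0.0
    for table, annotation in index.items():
        s = score(words, annotation)
        if s > best_score:
            best_table, best_score = table, s
    if best_table is None:
        return None
    # ASSUMPTION: a real translator would also choose columns, joins, and filters.
    return f"SELECT * FROM {best_table}"


print(translate("show me a film", indexed_annotations))
```

Because distinctive words carry most of an annotation's weight, the stop word "a" in the query contributes little to the `actors` score, and `movies` wins on the strength of "film".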

10 Claims

1. A computer implemented method of indexing annotations to an information storage to generate a list of indexed annotations, the method comprising steps of:
- a) classifying words of the annotations based on a concordance and a dictionary so that the words have an associated classification, wherein;
  
  classifications comprise;
  
  a unique classification, wherein the unique classification comprises;
  
  possessives; and
  
  words not in the dictionary;
  
  a proper classification, wherein the proper classification comprises words defined as proper nouns in the dictionary;
  
  a stop classification, wherein the stop classification comprises words having no semantic meaning;
  
  a class classification, wherein the class classification comprises words that have been used to annotate rows in a table; and
  
  at least four distinct classifications comprising words derived from frequency of each word in the concordance such that;
  
  words occurring least frequently in the concordance comprise a rare classification;
  
  words occurring next-least frequently in the concordance comprise an infrequent classification;
  
  words occurring next-most frequently in the concordance comprise a frequent classification; and
  
  words occurring most frequently in the concordance comprise a common classification; and
  
  the step of classifying words includes determining the classification to which the words should be assigned; and
  
  b) assigning a normalized weight to words of each annotation, based on the associated classification of the words of each annotation, wherein a total normalized weight for the words of each annotation is equal to 1.00, wherein the normalized weight assigned to words of each annotation based on the classifications associated with the words is determined by;
  
  determining whether the annotation contains any words in the unique classification, wherein;
  
  in an event that more than one word in the unique classification exists in the annotation, each of the unique classification words is assigned a normalized weight of 0.75 divided by the number of unique classification words in the annotation;
  
  in an event that only one word in the unique classification exists in the annotation, that word is assigned a normalized weight of 0.50;
  
  based on the number of words in the unique classification determining the weight to be apportioned to the remaining words occurs such that;
  
  in an event that there are no unique words in the annotation, the remaining weight to be apportioned is 1.00;
  
  in an event that there was one unique word in the annotation, the remaining weight to be apportioned is 0.50;
  
  in an event that there was more than one unique word in the annotation, the remaining weight to be apportioned is 0.25;
  
  of the weight apportioned to the remaining words, determining a share of the apportioned weight for each word such that;
  
  each word in the class classification gets 1 share of the remaining weight;

  each word in the stop classification gets 1 share of the remaining weight;

  each word in the common classification gets 2 shares of the remaining weight;

  each word in the frequent classification gets 3 shares of the remaining weight;

  each word in the infrequent classification gets 4 shares of the remaining weight;

  each word in the rare classification gets 5 shares of the remaining weight; and

  each word in the proper classification gets 5 shares of the remaining weight;

  in an event that the weight added together for number of shares of class, common, and stop words in the annotation is more than 0.10, setting the normalized weight for the class, common, and stop words to equal 0.10.
- 2. The computer implemented method of claim 1 wherein the step of classifying words is based on a degree to which the words are distinct.
- 3. The computer implemented method of claim 2 wherein more distinct words are weighted more heavily than less distinct words.
- 4. The computer implemented method of claim 1 wherein, in an annotation, words classified as “unique” share a fixed amount of a normalized weight, and remaining words of the annotation share a remaining amount of the normalized weight.
- 5. The computer implemented method of claim 4 wherein the remaining words of the annotation share the remaining amount of the normalized weight such that the more distinct a remaining word is, the larger its share of the remaining amount of the normalized weight.
- 10. A processor readable storage medium having processor executable instructions embodied thereon, the processor executable instructions which when executed facilitate performance of the method of claim 1.
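
The classifying step of claim 1 can be sketched as a single lookup function. This is a minimal sketch under several assumptions that the claim does not settle: the dictionary is modeled as a mapping from lowercase word to a part-of-speech tag, with the tag `"proper"` marking proper nouns; the order in which the tests are applied is a guess; and the frequency `cutoffs` separating rare, infrequent, frequent, and common words are hypothetical values, since the claim says only that the four classes are derived from frequency in the concordance.

```python
def is_possessive(word):
    """Treat words ending in 's or s' as possessives (an assumed test)."""
    return word.endswith("'s") or word.endswith("s'")


def classify(word, dictionary, stop_words, class_words, concordance,
             cutoffs=(10, 100, 1000)):
    """Return one of the eight classifications named in claim 1.

    dictionary:  word -> tag, where the tag "proper" marks a proper noun
    stop_words:  words having no semantic meaning
    class_words: words that have been used to annotate rows in a table
    concordance: word -> number of occurrences
    cutoffs:     HYPOTHETICAL upper bounds for rare, infrequent, frequent
    """
    w = word.lower()
    # ASSUMPTION: the order of these tests is not given by the claim.
    if is_possessive(w):
        return "unique"
    if w in stop_words:
        return "stop"
    if w in class_words:
        return "class"
    if w not in dictionary:
        return "unique"
    if dictionary[w] == "proper":
        return "proper"
    rare_max, infrequent_max, frequent_max = cutoffs
    freq = concordance.get(w, 0)
    if freq <= rare_max:
        return "rare"
    if freq <= infrequent_max:
        return "infrequent"
    if freq <= frequent_max:
        return "frequent"
    return "common"


# Toy data, invented for illustration only.
dictionary = {"the": "article", "paris": "proper", "film": "noun",
              "director": "noun", "year": "noun", "time": "noun",
              "genre": "noun"}
stop_words = {"the"}
class_words = {"genre"}
concordance = {"film": 5, "director": 50, "year": 500, "time": 5000}

for w in ["Smith's", "zyzzyva", "the", "genre", "Paris", "film", "time"]:
    print(w, classify(w, dictionary, stop_words, class_words, concordance))
```

Fixed cutoffs keep the sketch short. An implementation could just as well derive them from the concordance itself, for example by splitting its words into four frequency bands.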

6. A system for indexing annotations to information, the system comprising:
- a processor for processing annotations to information via an annotating facility adapted to annotate a database design schema with annotations comprising;
  
  word annotations; and
  
  informational annotations;
  
  a memory operatively coupled to the processor for storing annotations to information;
  
  classification means for classifying words of the annotations based on a dictionary and a concordance such that each individual word of the words is associated with a classification, wherein;
  
  classifications comprise;
  
  a unique classification, wherein the unique classification comprises;
  
  possessives; and
  
  words not in the dictionary;
  
  a proper classification, wherein the proper classification comprises words defined as proper nouns in the dictionary;
  
  a stop classification, wherein the stop classification comprises words having no semantic meaning;
  
  a class classification, wherein the class classification comprises words that have been used to annotate rows in a table; and
  
  at least four distinct classifications comprising words derived from frequency of each word in the concordance such that;
  
  words occurring least frequently in the concordance comprise a rare classification;
  
  words occurring next-least frequently in the concordance comprise an infrequent classification;
  
  words occurring next-most frequently in the concordance comprise a frequent classification; and
  
  words occurring most frequently in the concordance comprise a common classification; and
  
  the step of classifying words includes determining the classification to which the words should be assigned; and
  
  assignment means for assigning a normalized weight to each individual word of the annotations responsive to the associated classification of each individual word by the classification means, wherein a total normalized weight for the words of each annotation is equal to 1.00, and the normalized weight assigned to words of each annotation based on the classifications associated with the words is determined by;
  
  determining whether the annotation contains any words in the unique classification, wherein;
  
  in an event that more than one word in the unique classification exists in the annotation, each of the unique classification words is assigned a normalized weight of 0.75 divided by the number of unique classification words in the annotation;
  
  in an event that only one word in the unique classification exists in the annotation, that word is assigned a normalized weight of 0.50;
  
  based on the number of words in the unique classification determining the weight to be apportioned to the remaining words occurs such that;
  
  in an event that there are no unique words in the annotation, the remaining weight to be apportioned is 1.00;
  
  in an event that there was one unique word in the annotation, the remaining weight to be apportioned is 0.50;
  
  in an event that there was more than one unique word in the annotation, the remaining weight to be apportioned is 0.25;
  
  of the weight apportioned to the remaining words, determining a share of the apportioned weight for each word such that;
  
  each word in the class classification gets 1 share of the remaining weight;

  each word in the stop classification gets 1 share of the remaining weight;

  each word in the common classification gets 2 shares of the remaining weight;

  each word in the frequent classification gets 3 shares of the remaining weight;

  each word in the infrequent classification gets 4 shares of the remaining weight;

  each word in the rare classification gets 5 shares of the remaining weight; and

  each word in the proper classification gets 5 shares of the remaining weight;

  in an event that the weight added together for number of shares of class, common, and stop words in the annotation is more than 0.10, setting the normalized weight for the class, common, and stop words to equal 0.10.
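
The weight assignment recited above can be worked through in code. The sketch below follows the stated numbers (0.50 for a single unique word, 0.75 split among several, the 1/1/2/3/4/5/5 share table, and the 0.10 cap), but two points are assumptions rather than claim language: when the cap applies, the weight taken from class, common, and stop words is handed to the other non-unique words in proportion to their shares so that the total stays at 1.00, and the cap is skipped when no such words exist to absorb it. The function name `assign_weights` and the input layout are hypothetical.

```python
SHARES = {"class": 1, "stop": 1, "common": 2, "frequent": 3,
          "infrequent": 4, "rare": 5, "proper": 5}
LOW_VALUE = {"class", "common", "stop"}
LOW_VALUE_CAP = 0.10


def assign_weights(classified):
    """classified: list of (word, classification) pairs for one annotation.

    Returns a list of (word, normalized weight) pairs in the same order.
    """
    unique_idx = [i for i, (_, c) in enumerate(classified) if c == "unique"]
    rest_idx = [i for i, (_, c) in enumerate(classified) if c != "unique"]

    # Fixed portion for unique words, and what is left for the rest.
    n_unique = len(unique_idx)
    if n_unique == 0:
        unique_each, remaining = 0.0, 1.00
    elif n_unique == 1:
        unique_each, remaining = 0.50, 0.50
    else:
        unique_each, remaining = 0.75 / n_unique, 0.25

    weights = [0.0] * len(classified)
    for i in unique_idx:
        weights[i] = unique_each

    # Split the remaining weight by shares.
    total_shares = sum(SHARES[classified[i][1]] for i in rest_idx)
    for i in rest_idx:
        weights[i] = remaining * SHARES[classified[i][1]] / total_shares

    # Cap the combined weight of class, common, and stop words at 0.10.
    low = [i for i in rest_idx if classified[i][1] in LOW_VALUE]
    high = [i for i in rest_idx if classified[i][1] not in LOW_VALUE]
    low_total = sum(weights[i] for i in low)
    if low_total > LOW_VALUE_CAP and high:
        # ASSUMPTION: freed weight goes to the other non-unique words.
        shrink = LOW_VALUE_CAP / low_total
        for i in low:
            weights[i] *= shrink
        grow = (remaining - LOW_VALUE_CAP) / sum(weights[i] for i in high)
        for i in high:
            weights[i] *= grow

    return [(classified[i][0], weights[i]) for i in range(len(classified))]


for word, weight in assign_weights([("the", "stop"), ("of", "stop"), ("film", "rare")]):
    print(f"{word}: {weight:.2f}")
```

In the example, the two stop words would initially hold 2/7 of the weight, so the cap reduces them to 0.05 each and the rare word ends with 0.90. An annotation made up only of unique words is not addressed by the claim language, and this sketch leaves such a case summing to less than 1.00.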

7. At least one machine readable storage medium storing processor-executable instructions, the processor-executable instructions comprising:
- an annotating facility, the annotating facility adapted to annotate a database design schema with annotations comprising;
  
  word annotations; and
  
  informational annotations; and
  
  an indexing facility, the indexing facility adapted to;
  
  classify individual words of the word annotations based on a concordance and a dictionary such that the individual words of the word annotations have an associated classification, wherein;
  
  classifications comprise;
  
  a unique classification, wherein the unique classification comprises;
  
  possessives; and
  
  words not in the dictionary;
  
  a proper classification, wherein the proper classification comprises words defined as proper nouns in the dictionary;
  
  a stop classification, wherein the stop classification comprises words having no semantic meaning;
  
  a class classification, wherein the class classification comprises words that have been used to annotate rows in a table; and
  
  at least four distinct classifications comprising words derived from frequency of each word in the concordance such that;
  
  words occurring least frequently in the concordance comprise a rare classification;
  
  words occurring next-least frequently in the concordance comprise an infrequent classification;
  
  words occurring next-most frequently in the concordance comprise a frequent classification; and

  words occurring most frequently in the concordance comprise a common classification; and
  
  the step of classifying words includes determining the classification to which the words should be assigned; and
  
  assign a normalized weight to the classified individual words of the word annotations, based on the associated classification of the words of each annotation, wherein a total normalized weight for the words of each annotation is equal to 1.00 and the normalized weight assigned to words of each annotation based on the classifications associated with the words is determined by;
  
  determining whether the annotation contains any words in the unique classification, wherein;
  
  in an event that more than one word in the unique classification exists in the annotation, each of the unique classification words is assigned a normalized weight of 0.75 divided by the number of unique classification words in the annotation;
  
  in an event that only one word in the unique classification exists in the annotation, that word is assigned a normalized weight of 0.50;
  
  based on the number of words in the unique classification determining the weight to be apportioned to the remaining words occurs such that;
  
  in an event that there are no unique words in the annotation, the remaining weight to be apportioned is 1.00;
  
  in an event that there was one unique word in the annotation, the remaining weight to be apportioned is 0.50;
  
  in an event that there was more than one unique word in the annotation, the remaining weight to be apportioned is 0.25;
  
  of the weight apportioned to the remaining words, determining a share of the apportioned weight for each word such that;
  
  each word in the class classification gets 1 share of the remaining weight;

  each word in the stop classification gets 1 share of the remaining weight;

  each word in the common classification gets 2 shares of the remaining weight;

  each word in the frequent classification gets 3 shares of the remaining weight;

  each word in the infrequent classification gets 4 shares of the remaining weight;

  each word in the rare classification gets 5 shares of the remaining weight; and

  each word in the proper classification gets 5 shares of the remaining weight;

  in an event that the weight added together for number of shares of class, common, and stop words in the annotation is more than 0.10, setting the normalized weight for the class, common, and stop words to equal 0.10.
- 8. The at least one machine readable storage medium storing processor-executable instructions of claim 7 wherein the word annotations include at least one of manually-generated word annotations and automatically-generated word annotations.
- 9. The at least one machine readable storage medium storing processor-executable instructions of claim 7 wherein:
    - word annotations facilitate attaching related words to tables, rows, columns, and relationships of the database design schema; and
      
      informational annotations facilitate;
      
      distinguishing tables corresponding to entities from tables corresponding to properties in a database to which the database design schema is applied;
      
      attaching to rows of tables in the database, a probability that the row will be referenced; and
      
      describing entities in a way that is meaningful to humans.
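
The two kinds of annotation described in claims 7 through 9 could be held in simple record types. The sketch below is one possible layout, not a structure taken from the patent: the class names, field names, and example values are all hypothetical, chosen only to mirror the roles the claims describe (words attached to schema elements, an entity-versus-property flag, per-row reference probabilities, and a human-readable description).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class WordAnnotation:
    """Related words attached to one element of the database design schema."""
    target_kind: str          # "table", "row", "column", or "relationship"
    target_name: str
    words: List[str]
    source: str = "manual"    # "manual" or "automatic", per claim 8


@dataclass
class InformationalAnnotation:
    """Facts about a table that are not themselves search words."""
    table: str
    is_entity: bool           # True for an entity table, False for a property table
    description: str = ""     # description meaningful to humans
    # Row key -> probability that the row will be referenced.
    row_reference_probability: Dict[str, float] = field(default_factory=dict)


# Invented example values.
movies_words = WordAnnotation("table", "movies", ["film", "movie", "picture"])
movies_info = InformationalAnnotation(
    table="movies",
    is_entity=True,
    description="A motion picture",
    row_reference_probability={"Casablanca": 0.02},
)
print(movies_words)
print(movies_info)
```

Keeping the two record types separate reflects the claims' distinction: only word annotations pass through the indexing facility, while informational annotations describe the schema itself.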

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: McConnell, Christopher Clayton
Primary Examiner(s): Alam; Hosain T
Assistant Examiner(s): Chempakaseril; Ann J
Application Number: US11/188,058
Publication Number: US 20050256889A1
Time in Patent Office: 1,348 Days
Field of Search: None
US Class Current: 1/1

CPC Class Codes

G06F 16/20   of structured data, e.g. re...
G06F 16/24522   Translation of natural lang...
G06F 16/288   Entity relationship models
G06F 16/353   into predefined classes
G06F 16/90   Details of database functio...
Y10S 707/99932   Access augmentation or opti...
Y10S 707/99933   Query processing, i.e. sear...
Y10S 707/99935   Query augmenting and refini...
Y10S 707/99937   Sorting
Y10S 707/99943   Generating database or data...
Y10S 707/99953   Recoverability
