System, method and apparatus for generating phrases from a database

US 6,697,793 B2
Filed: 03/02/2001
Issued: 02/24/2004
Est. Priority Date: 03/02/2001
Status: Expired due to Fees

First Claim

1. A method of generating phrases from a database comprising:

providing a database;

providing one or more stopterms;

creating a relational model of the database by a process comprising determining a plurality of relations, wherein each of the plurality of relations includes at least one term pair and one or more directional metric values;

outputting the relational model for the database;

inputting a query for the database, wherein the query includes one or more base phrases, each base phrase including at least one of a group of one or more terms and one or more phrases;

determining a plurality of phrases from the relational model of the database, wherein each of the plurality of phrases is contextually related to the query;

sorting the plurality of phrases;

outputting the sorted plurality of phrases;

wherein each of the plurality of phrases is contextually related to the query by a process comprising;

(1) creating an empty phrase list (PL), wherein a phrase list is a list of base phrases;

(2) setting a weight of each base phrase of the query to a threshold level, and replacing the PL with the query;

(3) selecting one of the plurality of relations from the model of the database;

(4) selecting a first term from the selected relation;

(5) identifying the selected term as a contained term;

(6) identifying a second term of the selected relation as an appended term;

(7) determining if the contained term is included in the one or more base phrases in the PL;

(8) when the contained term is included in the PL;

(8-i) selecting one of the one or more base phrases from the PL, wherein the selected base phrase includes the contained term;

(8-ii) concatenating the selected base phrase and the appended term into a first candidate phrase and a second candidate phrase, wherein the first candidate phrase includes the selected base phrase followed by the appended term and the second candidate phrase includes the appended term followed by the selected base phrase, and determining for each of the candidate phrases a link count consisting of a count of known relations associated with each of the candidate phrases, and associating with each of the candidate phrases one or more link weights, each link weight consisting of one of the one or more directional metric values included in the selected relation whose magnitude represents a degree of contextual association between the contained term and the appended term;

(8-iii) updating a conditional list of phrases (CLP);

(8-iv) selecting the first candidate phrase; and

(8-v) determining number of stopterms in the selected candidate phrase;

(9) determining if number of the stopterms is greater than a first pre-selected number;

(10) when (i) the number of the stopterms is greater than the first pre-selected number or (ii-a) the number of stopterms is not greater than the first pre-selected number and (ii-b) the link count is equal to a number of terms in the base phrase included in the selected candidate phrases and (ii-c) at least one link weight is non-positive, deleting the selected candidate phrase and continuing to step (13);

(11) when (i) the number of the stopterms is not greater than the first pre-selected number and (ii) the link count is not equal to number of terms in the base phrase, continuing to step (13);

(12) when (i) the number of the stopterms is not greater than the first pre-selected number and (ii) the link count is equal to number of terms in the base phrase included in the selected candidate phrase and (iii) all the link weights are positive, including the selected candidate phrase in an interim phrase list (IPL) and continuing to step (13);

(13) determining if the second candidate phrase has been processed;

(14) when the second candidate phrase has not been processed, selecting the second candidate phrase and returning to step (8-v);

(15) when the second candidate phrase has been processed, determining if a subsequent phrase in the PL contains the contained term; and

(16) when a subsequent phrase in the PL contains the contained term, selecting a subsequent base phrase containing the contained term and returning to step (8-ii).
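
Steps (8-ii) and (9) through (12) can be read as a small decision procedure applied to each candidate phrase. The Python sketch below is one illustrative reading of those steps, not the patented implementation. It assumes phrases are modeled as tuples of terms; the names `make_candidates`, `classify_candidate`, and `max_stopterms` are hypothetical and do not appear in the claim.

```python
def make_candidates(base_phrase, appended_term):
    """Step (8-ii): build both concatenation orders of a base phrase and a term.

    Phrases are tuples of terms (an assumption made for this sketch).
    """
    first = base_phrase + (appended_term,)   # base phrase followed by appended term
    second = (appended_term,) + base_phrase  # appended term followed by base phrase
    return first, second


def classify_candidate(candidate, base_len, link_count, link_weights,
                       stopterms, max_stopterms):
    """Steps (9)-(12): decide what happens to one candidate phrase.

    Returns "delete", "defer", or "accept" (labels chosen for this sketch).
    """
    stop_count = sum(1 for term in candidate if term in stopterms)
    if stop_count > max_stopterms:
        return "delete"                      # step (10)(i)
    if link_count != base_len:
        return "defer"                       # step (11): not yet fully linked
    if all(weight > 0 for weight in link_weights):
        return "accept"                      # step (12): goes to the IPL
    return "delete"                          # step (10)(ii): a non-positive link weight


first, second = make_candidates(("engine", "failure"), "the")
print(first, second)
print(classify_candidate(first, 2, 2, [0.5, 0.3], {"the"}, max_stopterms=1))
```

A candidate is accepted only once its link count matches the number of terms in its base phrase, which this sketch reads as meaning that the appended term has a known relation to every term of the base phrase.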

Abstract

Phrase generation is a method of generating sequences of terms, such as phrases, that may occur within a database of subsets containing sequences of terms, such as text. A database is provided and a relational model of the database is created. A query is then input. The query includes a term, a sequence of terms, multiple individual terms, multiple sequences of terms, or combinations thereof. Next, several sequences of terms that are contextually related to the query are assembled from contextual relations in the model of the database. The sequences of terms are then sorted and output. Phrase generation can also be an iterative process used to produce sequences of terms from a relational model of a database.
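
To make the idea of a relational model with directional metric values more concrete, the following Python sketch builds one from a few short texts. This is purely an illustration under stated assumptions: the text above does not say how the metric values are computed, so the sketch substitutes a simple count of ordered co-occurrences within a sliding window. The function name `build_relations` and the `window` parameter are hypothetical.

```python
from collections import defaultdict


def build_relations(texts, window=3):
    """Map each unordered term pair to metric values for both term orders.

    Assumption: the directional metric is a plain ordered co-occurrence
    count, used here only as a stand-in for the unspecified metric.
    """
    ordered_counts = defaultdict(float)
    for text in texts:
        terms = text.lower().split()
        for i, left in enumerate(terms):
            # Count every term that follows `left` within the window.
            for right in terms[i + 1 : i + 1 + window]:
                ordered_counts[(left, right)] += 1.0

    relations = {}
    for (left, right), count in ordered_counts.items():
        pair = tuple(sorted((left, right)))
        # One relation per term pair, holding a value for each direction.
        relation = relations.setdefault(pair, {pair: 0.0, pair[::-1]: 0.0})
        relation[(left, right)] += count
    return relations


model = build_relations(["engine failure on takeoff", "failure of engine"], window=2)
print(model[("engine", "failure")])
```

Each relation in this sketch holds one term pair and two values, one per term order, which mirrors the phrase "at least one term pair and one or more directional metric values" in claim 1.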

246 Citations

6 Claims

2. The method as recited in claim 1, further comprising inputting said query by a process comprising selecting a value for an initial threshold weight.
3. The method as recited in claim 1, further comprising inputting said query by a process comprising setting an initial weight for each of said base phrases of said query.
4. The method as recited in claim 1, further comprising inputting said query by a process comprising setting a pre-selected number of phrases to be output.
5. The method as recited in claim 1, further comprising:
6. The method as recited in claim 1, wherein updating said conditional list of phrases (CLP) in said step (8-iii) further comprises:
- (28) selecting said first candidate phrase;
  
  (29) determining if said selected candidate phrase is contained in said CLP;
  
  (30) when said selected candidate phrase is contained in said CLP, (i) incrementing said count of known relations associated with said selected candidate phrase in said CLP, and (ii) continuing to step (31);
  
  (31) determining if a weight associated with said selected candidate phrase in said CLP is greater than said directional metric value of said selected relation corresponding to an order of said contained term and said appended term in said selected candidate phrase;
  
  (32) when the weight associated with said selected candidate phrase in said CLP is greater than a corresponding directional metric value of said selected relation, (i) setting the weight associated with said selected candidate phrase in said CLP equal to the corresponding directional metric value in said selected relation and (ii) continuing to step (33);
  
  (33) determining if said second candidate phrase has been processed;
  
  (34) when said second candidate phrase has not been processed, selecting said second candidate phrase and returning to step (29);
  
  (35) when said selected candidate phrase is not contained in said CLP, (i) including said selected candidate phrase in said CLP and (ii) setting equal to 1 said count of known relations associated with said selected candidate phrase in said CLP;
  
  (36) determining if said weight of said base phrase included in said selected candidate phrase is greater than said corresponding directional metric value of said selected relation;
  
  (37) when said weight of said base phrase included in said selected candidate phrase is not greater than said corresponding directional metric value of said selected relation, (i) setting the weight associated with said selected candidate phrase in said CLP equal to said weight of said base phrase included in said selected candidate phrase and (ii) returning to step (33);
  
  (38) when said weight of said base phrase included in said selected candidate phrase is greater than the corresponding directional metric value of said selected relation, returning to step (32-i);
  
  (39) when the weight associated with said selected candidate phrase in said CLP is not greater than said corresponding directional metric value of said selected relation, returning to step (33); and
  
  (40) when said second candidate phrase has been processed, ending a sub-process associated with said step (8-iii) of claim 1.
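
Taken together, steps (28) through (40) amount to keeping, for each candidate phrase, a count of known relations and the smallest weight seen so far. The Python sketch below is one hedged reading of that sub-process, not the patented implementation. It assumes the CLP is a dictionary keyed by phrase tuples; the names `update_clp`, `metric_after`, and `metric_before` are hypothetical, and which directional metric value belongs to which term order is an assumption of the sketch.

```python
def update_clp(clp, base_phrase, base_weight, appended_term,
               metric_after, metric_before):
    """Update the conditional list of phrases (CLP) for one selected relation.

    metric_after:  assumed metric for the appended term following the base phrase.
    metric_before: assumed metric for the appended term preceding the base phrase.
    """
    candidates = [
        (base_phrase + (appended_term,), metric_after),    # first candidate
        ((appended_term,) + base_phrase, metric_before),   # second candidate
    ]
    for candidate, metric in candidates:
        if candidate in clp:
            entry = clp[candidate]
            entry["count"] += 1                  # step (30)
            if entry["weight"] > metric:         # steps (31)-(32)
                entry["weight"] = metric
        else:
            # Steps (35)-(38): new entry with count 1 and the smaller of
            # the base phrase weight and the directional metric value.
            clp[candidate] = {"count": 1, "weight": min(base_weight, metric)}
    return clp


clp = {}
update_clp(clp, ("engine",), 1.0, "failure", metric_after=0.6, metric_before=0.2)
update_clp(clp, ("engine",), 1.0, "failure", metric_after=0.4, metric_before=0.9)
print(clp)
```

Under this reading, a candidate's weight never exceeds the weight of its base phrase or any metric value applied to it, so it behaves like the weakest link in the chain of relations that produced the phrase.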

Current Assignee: U.S.A. as represented by the Administrator of the National Aeronautics and Space Administration
Original Assignee: United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration
Inventors: McGreevy, Michael W.
Primary Examiner(s): Le, Uyen
Assistant Examiner(s): Thai, Hanh B
Application Number: US09/800,313
Publication Number: US 20020188587A1
Time in Patent Office: 1,089 Days
Field of Search: 707/1-7, 704/9
US Class Current: 1/1
CPC Class Codes:
G06F 16/3335   Syntactic pre-processing, e...
G06F 16/334   Query execution G06F16/335 ...
Y10S 707/99931   Database or file accessing
Y10S 707/99933   Query processing, i.e. sear...
Y10S 707/99937   Sorting
