System, method and apparatus for generating phrases from a database
Abstract
Phrase generation is a method of generating sequences of terms, such as phrases, that may occur within a database of subsets containing sequences of terms, such as text. A database is provided and a relational model of the database is created. A query is then input. The query includes a term, a sequence of terms, multiple individual terms, multiple sequences of terms, or combinations thereof. Next, several sequences of terms that are contextually related to the query are assembled from contextual relations in the model of the database. The sequences of terms are then sorted and output. Phrase generation can also be an iterative process used to produce sequences of terms from a relational model of a database.
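As a rough illustration of the "relational model" mentioned in the abstract, the sketch below stores each relation as a term pair with two directional metric values. The function name `build_relations`, the `window` parameter, and the distance-weighted co-occurrence scoring are all assumptions made for this example; the text above does not say how the metric values are computed.

```python
from collections import defaultdict


def build_relations(records, window=3):
    """Hypothetical builder: map (term_a, term_b) -> [forward, backward].

    Keys are stored in alphabetical order. `forward` accumulates evidence
    that term_a precedes term_b; `backward` that term_b precedes term_a.
    The 1/distance weighting is an assumption, not taken from the source.
    """
    model = defaultdict(lambda: [0.0, 0.0])
    for text in records:
        terms = text.lower().split()
        for i, left in enumerate(terms):
            # Look at the terms that follow `left` within the window.
            for j in range(i + 1, min(i + 1 + window, len(terms))):
                right = terms[j]
                if left == right:
                    continue
                weight = 1.0 / (j - i)
                if left < right:
                    model[(left, right)][0] += weight
                else:
                    model[(right, left)][1] += weight
    return dict(model)


relations = build_relations(["flight crew fatigue"])
```

Keeping both directions on one record mirrors the claim language of a term pair carrying "one or more directional metric values", so a later step can pick the value that matches the word order of a candidate phrase.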
246 Citations
6 Claims
1. A method of generating phrases from a database comprising:
providing a database;
providing one or more stopterms;
creating a relational model of the database by a process comprising determining a plurality of relations, wherein each of the plurality of relations includes at least one term pair and one or more directional metric values;
outputting the relational model for the database;
inputting a query for the database, wherein the query includes one or more base phrases, each base phrase including at least one of a group of: one or more terms; and one or more phrases;
determining a plurality of phrases from the relational model of the database, wherein each of the plurality of phrases is contextually related to the query;
sorting the plurality of phrases;
outputting the sorted plurality of phrases;
wherein each of the plurality of phrases is contextually related to the query by a process comprising:
(1) creating an empty phrase list (PL), wherein a phrase list is a list of base phrases;
(2) setting a weight of each base phrase of the query to a threshold level, and replacing the PL with the query;
(3) selecting one of the plurality of relations from the model of the database;
(4) selecting a first term from the selected relation;
(5) identifying the selected term as a contained term;
(6) identifying a second term of the selected relation as an appended term;
(7) determining if the contained term is included in the one or more base phrases in the PL;
(8) when the contained term is included in the PL:
(8-i) selecting one of the one or more base phrases from the PL, wherein the selected base phrase includes the contained term;
(8-ii) concatenating the selected base phrase and the appended term into a first candidate phrase and a second candidate phrase, wherein the first candidate phrase includes the selected base phrase followed by the appended term and the second candidate phrase includes the appended term followed by the selected base phrase, and determining for each of the candidate phrases a link count consisting of a count of known relations associated with each of the candidate phrases, and associating with each of the candidate phrases one or more link weights, each link weight consisting of one of the one or more directional metric values included in the selected relation whose magnitude represents a degree of contextual association between the contained term and the appended term;
(8-iii) updating a conditional list of phrases (CLP);
(8-iv) selecting the first candidate phrase; and
(8-v) determining number of stopterms in the selected candidate phrase;
(9) determining if number of the stopterms is greater than a first pre-selected number;
(10) when (i) the number of the stopterms is greater than the first pre-selected number or (ii-a) the number of stopterms is not greater than the first pre-selected number and (ii-b) the link count is equal to a number of terms in the base phrase included in the selected candidate phrases and (ii-c) at least one link weight is non-positive, deleting the selected candidate phrase and continuing to step (13);
(11) when (i) the number of the stopterms is not greater than the first pre-selected number and (ii) the link count is not equal to number of terms in the base phrase, continuing to step (13);
(12) when (i) the number of the stopterms is not greater than the first pre-selected number and (ii) the link count is equal to number of terms in the base phrase included in the selected candidate phrase and (iii) all the link weights are positive, including the selected candidate phrase in an interim phrase list (IPL) and continuing to step (13);
(13) determining if the second candidate phrase has been processed;
(14) when the second candidate phrase has not been processed, selecting the second candidate phrase and returning to step (8-v);
(15) when the second candidate phrase has been processed, determining if a subsequent phrase in the PL contains the contained term; and
(16) when a subsequent phrase in the PL contains the contained term, selecting a subsequent base phrase containing the contained term and returning to step (8-ii).
(17) when a subsequent phrase in said PL does not contain said contained term, continuing to step (19);
(18) when said contained term is not included in said PL, continuing to step (19);
(19) determining if said second term in said selected relation has been processed as said contained term;
(20) when said second term in said selected relation has not been processed as said contained term, (i) identifying the second term from said selected relation as said contained term (ii) identifying said first term from said selected relation as said appended term and (iii) returning to said step (7) in claim 1;
(21) when said second term in said selected relation has been processed as said contained term, determining if a subsequent one of said relations exists within said relational model of said database;
(22) when a subsequent relation exists within said relational model of said database, (i) selecting the subsequent relation and (ii) returning to said step (4) in claim 1;
(23) when a subsequent relation does not exist within said relational model of said database, (i) filtering said phrases in said IPL, based upon a weight of each of said phrases, (ii) eliminating each duplicate phrase from said IPL, and (iii) determining if a number of said phrases within said IPL is greater than 0;
(24) when number of said phrases within said IPL is greater than 0, (i) adding phrases within said IPL to an interim buffer, (ii) replacing said base phrases within said PL with said phrases within said IPL, and (iii) returning to said step (3) in claim 1;
(25) when the number of said phrases within said IPL is not greater than 0, determining if the number of phrases in the interim buffer is greater than or equal to a second pre-selected number;
(26) when the number of phrases in the interim buffer is not greater than or equal to a second pre-selected number, reducing said threshold weight and returning to said step (2) of claim 1; and
(27) when the number of phrases in the interim buffer is greater than or equal to a second pre-selected number, (i) sorting said phrases in the interim buffer and (ii) outputting said phrases in the interim buffer.
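Steps (8-ii) and (9) through (12) of claim 1 can be read as a small decision procedure applied to each candidate phrase. The following is a minimal sketch of that reading, not the patented implementation: phrases are assumed to be tuples of terms, and the names `make_candidates`, `classify_candidate`, and the return labels are hypothetical.

```python
def make_candidates(base_phrase, appended_term):
    """Step (8-ii): build both orderings of base phrase and appended term."""
    first = base_phrase + (appended_term,)
    second = (appended_term,) + base_phrase
    return first, second


def count_stopterms(phrase, stopterms):
    """Step (8-v): count the stopterms in a candidate phrase."""
    return sum(1 for term in phrase if term in stopterms)


def classify_candidate(candidate, base_len, link_count, link_weights,
                       stopterms, max_stopterms):
    """Return 'delete', 'pending', or 'accept' per steps (10) to (12).

    base_len is the number of terms in the base phrase, and max_stopterms
    stands in for the "first pre-selected number".
    """
    if count_stopterms(candidate, stopterms) > max_stopterms:
        return "delete"    # step (10)(i): too many stopterms
    if link_count != base_len:
        return "pending"   # step (11): not all relations seen yet
    if all(weight > 0 for weight in link_weights):
        return "accept"    # step (12): goes into the IPL
    return "delete"        # step (10)(ii): a non-positive link weight


first, second = make_candidates(("flight", "crew"), "fatigue")
decision = classify_candidate(first, 2, 2, [0.5, 0.4], {"the", "of"}, 1)
```

Under this reading a candidate is only accepted once every term of its base phrase has been linked to the appended term with a positive weight; a candidate with fewer links is left untouched so that later relations can still complete it.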
6. The method as recited in claim 1, wherein updating said conditional list of phrases (CLP) in said step (8-iii) further comprises:
(28) selecting said first candidate phrase;
(29) determining if said selected candidate phrase is contained in said CLP;
(30) when said selected candidate phrase is contained in said CLP, (i) incrementing said count of known relations associated with said selected candidate phrase in said CLP, and (ii) continuing to step (31);
(31) determining if a weight associated with said selected candidate phrase in said CLP is greater than said directional metric value of said selected relation corresponding to an order of said contained term and said appended term in said selected candidate phrase;
(32) when the weight associated with said selected candidate phrase in said CLP is greater than a corresponding directional metric value of said selected relation, (i) setting the weight associated with said selected candidate phrase in said CLP equal to the corresponding directional metric value in said selected relation and (ii) continuing to step (33);
(33) determining if said second candidate phrase has been processed;
(34) when said second candidate phrase has not been processed, selecting said second candidate phrase and returning to step (29);
(35) when said selected candidate phrase is not contained in said CLP, (i) including said selected candidate phrase in said CLP and (ii) setting equal to 1 said count of known relations associated with said selected candidate phrase in said CLP;
(36) determining if said weight of said base phrase included in said selected candidate phrase is greater than said corresponding directional metric value of said selected relation;
(37) when said weight of said base phrase included in said selected candidate phrase is not greater than said corresponding directional metric value of said selected relation, (i) setting the weight associated with said selected candidate phrase in said CLP equal to said weight of said base phrase included in said selected candidate phrase and (ii) returning to step (33);
(38) when said weight of said base phrase included in said selected candidate phrase is greater than the corresponding directional metric value of said selected relation, returning to step (32-i);
(39) when the weight associated with said selected candidate phrase in said CLP is not greater than said corresponding directional metric value of said selected relation, returning to step (33); and
(40) when said second candidate phrase has been processed, ending a sub-process associated with said step (8-iii) of claim 1.
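Read together, steps (28) through (40) appear to keep, for each candidate phrase, a count of known relations and the smallest weight seen so far. The sketch below follows that interpretation. The dictionary layout and the names `update_clp_entry`, `update_clp`, `metric_contained_first`, and `metric_appended_first` are assumptions made for illustration.

```python
def update_clp_entry(clp, candidate, base_weight, metric):
    """Update one candidate in the conditional list of phrases (CLP)."""
    entry = clp.get(candidate)
    if entry is None:
        # Steps (35) to (38): a new entry gets count 1 and the smaller of
        # the base phrase weight and the directional metric value.
        clp[candidate] = {"count": 1, "weight": min(base_weight, metric)}
    else:
        # Steps (30) to (32) and (39): increment the count and lower the
        # stored weight only if the metric value is smaller.
        entry["count"] += 1
        entry["weight"] = min(entry["weight"], metric)


def update_clp(clp, base_phrase, base_weight, appended_term,
               metric_contained_first, metric_appended_first):
    """Process the first and then the second candidate phrase."""
    first = base_phrase + (appended_term,)
    second = (appended_term,) + base_phrase
    # Each candidate uses the metric value matching its term order.
    for candidate, metric in ((first, metric_contained_first),
                              (second, metric_appended_first)):
        update_clp_entry(clp, candidate, base_weight, metric)
    return clp


clp = update_clp({}, ("crew",), 1.0, "fatigue", 0.8, -0.2)
```

Expressing the branches as `min` is a simplification: the claim spells out each comparison separately, but every path ends with the stored weight equal to the lowest value encountered.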
Specification