System, method and apparatus for discovering phrases in a database

US 6,721,728 B2
Filed: 03/02/2001
Issued: 04/13/2004
Est. Priority Date: 03/02/2001
Status: Expired due to Fees


Abstract

Phrase discovery is a method of identifying sequences of terms in a database. First, a selection of one or more relevant sequences of terms, such as relevant text, is provided. Next, several shorter sequences of terms, such as phrases, are extracted from the provided relevant sequences of terms. The extracted sequences of terms are then reduced through a culling process. A gathering process then emphasizes the more relevant of the extracted and culled sequences of terms and de-emphasizes the more generic of the extracted and culled sequences of terms. The gathering process can also include iteratively retrieving additional selections of relevant sequences (e.g., text), extracting and culling additional sequences of terms (e.g., phrases), emphasizing and de-emphasizing extracted and culled sequences of terms, and accumulating all gathered sequences of terms. The resulting gathered sequences of terms are then output.

390 Citations

50 Claims

1. A method of discovering phrases from a database comprising:
- providing a selection of text;
  
  extracting a plurality of phrases from the provided text, by a process comprising;
  
  determining a plurality of phrase processing positions within the selection of text, by a process comprising;
  
  determining a plurality of phrase processing starting positions within the selection of text, by a process comprising;
  
  identifying a phrase starting position (T1);
  
  initializing values for an iterative process by a process comprising setting an interior stopterm counter to zero and setting a tuple size to two;
  
  determining a plurality of phrase processing ending positions within the selection of relevant text, by a process comprising;
  
  identifying a phrase ending position (T2);
  
  identifying a position immediately subsequent to the phrase ending position T2 as the phrase ending position T2;
  
  extracting a plurality of phrases, wherein the first position of each of the plurality of phrases is one of the plurality of phrase processing starting positions (T1) and the last position of each of the plurality of phrases is a one of the plurality of phrase processing ending positions (T2), by a process comprising;
  
  identifying an indicated phrase, wherein an indicated phrase is a sequence of positions starting at T1 and ending at T2;
  
  determining a tuple size, wherein tuple size is a count of positions within the indicated phrase;
  
  determining if the tuple size is greater than a maximum phrase length, and when the tuple size is not greater than the maximum phrase length, outputting the indicated phrase as an extracted phrase;
  
  culling the extracted plurality of phrases;
  
  gathering a plurality of phrases, wherein the gathered plurality of phrases are related by relevance to a plurality of contextual patterns included within and among the culled and extracted plurality of phrases; and
  
  outputting the plurality of gathered phrases.
2. The method as recited in claim 1, further comprising:
3. The method as recited in claim 2, further comprising determining if there is a term immediately subsequent to T1.
4. The method as recited in claim 3, further comprising, when there is a term immediately subsequent to T1, determining if T1 is a stopterm.
5. The method as recited in claim 4, further comprising, when T1 is a stopterm, identifying said subsequent term as T1 and T2.
6. The method as recited in claim 4, further comprising choosing said stopterm to include at least one of a group consisting of a starting stopterm, an interior stopterm, and an ending stopterm.
7. The method as recited in claim 3, further comprising, when there is not a term immediately subsequent to T1, determining if T1 is a stopterm.
8. The method as recited in claim 7, further comprising choosing said stopterm to include at least one of a group consisting of a starting stopterm, an interior stopterm, and an ending stopterm.
9. The method as recited in claim 7, further comprising, when T1 is not a stopterm, determining if a single term phrase is acceptable, wherein a single term phrase is acceptable when an option is selected to enable inclusion of each single term phrase among said extracted plurality of phrases.
10. The method as recited in claim 9, further comprising, when said single term phrase is acceptable, determining if a current phrase is included in a phrase list (PL), wherein the current phrase begins at term T1 and ends at term T2.
11. The method as recited in claim 10, further comprising, when said current phrase is included in said phrase list (PL), incrementing a selected frequency counter corresponding to said phrase list.
12. The method as recited in claim 10, further comprising, when said current phrase is not included in a phrase list (PL), including said current phrase in said PL and setting a selected frequency counter, corresponding to said current phrase, to 1.
13. The method as recited in claim 4, further comprising, when T1 is not a stopterm, determining if a single term phrase is acceptable, wherein a single term phrase is acceptable when an option is selected to enable inclusion of each single term phrase among said extracted plurality of phrases.
14. The method as recited in claim 13, further comprising, when said single term phrase is acceptable, determining if a current phrase is included in a phrase list (PL), wherein the current phrase begins at term T1 and ends at term T2.
15. The method as recited in claim 14, further comprising, when said current phrase is included in said phrase list (PL), incrementing a selected frequency counter corresponding to said current phrase.
16. The method as recited in claim 14, further comprising, when said current phrase is not included in said phrase list (PL), including said current phrase in the PL and setting a selected frequency counter, corresponding to said current phrase, to 1.
17. The method as recited in claim 3, further comprising, when there is not a term immediately subsequent to T1, outputting a phrase list.
18. The method as recited in claim 1, further comprising, when said tuple size is greater than said maximum phrase length:
- not identifying said sequence of positions starting at T1 and ending at T2 as an extracted phrase; and
  
  not outputting an extracted phrase; and
  
  identifying a term immediately subsequent to T1 as T1 and T2.
19. The method as recited in claim 1, further comprising, when said tuple size is not greater than a maximum phrase length, determining if T2 is a stopterm.
20. The method as recited in claim 19, further comprising choosing said stopterm to include at least one of a group consisting of a starting stopterm, an interior stopterm, and an ending stopterm.
21. The method as recited in claim 19, further comprising, when T2 is a stopterm:
- incrementing an interior stopterm counter; and
  
  determining if the interior stopterm counter is greater than a pre-selected number of interior stopterms.
22. The method as recited in claim 21, further comprising, when said interior stopterm counter is greater than said pre-selected number of interior stopterms:
- not identifying said sequence of positions starting at T1 and ending at T2 as an extracted phrase; and
  
  not outputting an extracted phrase; and
  
  identifying a term immediately subsequent to T1 as T1 and T2.
23. The method as recited in claim 21, further comprising, when said interior stopterm counter is not greater than said pre-selected number of interior stopterms:
- incrementing said tuple size; and
  
  determining if there is a term immediately subsequent to T2 in said text.
24. The method as recited in claim 23, further comprising, when there is a term immediately subsequent to T2 in said relevant text:
- identifying the term immediately subsequent to T2 as T2; and
  
  determining if the tuple size is greater than a maximum phrase length.
25. The method as recited in claim 23, further comprising, when there is not a term immediately subsequent to T2 in said relevant text:
- not identifying said sequence of positions starting at T1 and ending at T2 as an extracted phrase; and
  
  not outputting an extracted phrase; and
  
  identifying a term immediately subsequent to T1 as T1 and T2.
26. The method as recited in claim 19, further comprising, when T2 is not a stopterm:
- determining if a current phrase is included in a phrase list (PL), wherein the current phrase begins at term T1 and ends at term T2;
  
  when the current phrase is included in a phrase list (PL), incrementing a selected frequency counter corresponding to the current phrase; and
  
  when the current phrase is not included in a phrase list (PL), including the current phrase in the PL and setting a selected frequency counter, corresponding to the current phrase, to 1.
27. The method as recited in claim 20, further comprising, when T2 is not said interior stopterm, determining if T2 is said ending stopterm.
28. The method as recited in claim 27, further comprising, when T2 is not said ending stopterm:
- determining if said current phrase is included in said phrase list (PL) wherein said current phrase begins at term T1 and ends at term T2;
  
  when said current phrase is included in said PL, incrementing a selected frequency counter corresponding to said current phrase; and
  
  when said current phrase is not included in said PL, including said current phrase in said PL and setting a selected frequency counter, corresponding to said current phrase, to 1.
29. The method as recited in claim 27, further comprising, when T2 is an ending stopterm:
- determining if T2 is an interior-only term;
  
  when T2 is an interior-only term, repeating said process of claim 20; and
  
  when T2 is not an interior-only term, identifying a term immediately subsequent to T1 as T1 and T2.
30. The method as recited in claim 1, further comprising culling said extracted plurality of phrases by a process comprising identifying a first phrase from a candidate phrase list (CPL) as P1.
31. The method as recited in claim 30, further comprising:
- identifying a plurality of phrases from said CPL, wherein said P1 is a proper subset of each one of said plurality of phrases; and
  
  identifying a first phrase, from said plurality of phrases having P1 as a proper subset, as P2.
32. The method as recited in claim 31, further comprising comparing a frequency of occurrence of P1 and P2.
33. The method as recited in claim 32, further comprising, when said frequency of P1 is not equal to said frequency of P2, determining if there is a phrase subsequent to P2 in said plurality of phrases having P1 as a proper subset.
34. The method as recited in claim 33, further comprising, when there is a phrase subsequent to P2 in said plurality of phrases having P1 as a proper subset, identifying said subsequent phrase as P2.
35. The method as recited in claim 33, further comprising, when there is not a phrase subsequent to P2 in said plurality of phrases having P1 as a proper subset, determining if there is a phrase subsequent to P1 in said CPL.
36. The method as recited in claim 35, further comprising, when there is a phrase subsequent to P1 in said CPL, identifying said subsequent phrase as P1.
37. The method as recited in claim 35, further comprising, when there is not a phrase subsequent to P1 in said CPL, outputting said plurality of phrases in said CPL.
38. The method as recited in claim 32, further comprising, when said frequency of P1 is equal to said frequency of P2:
- eliminating P1 from said CPL; and
  
  determining if there is a phrase subsequent to P2 in said CPL.
39. The method as recited in claim 38, further comprising, when there is a phrase subsequent to P1 in said CPL, identifying said subsequent phrase as P1.
40. The method as recited in claim 38, further comprising, when there is not a phrase subsequent to P1 in said CPL, outputting said plurality of phrases in said CPL.
41. The method as recited in claim 1, further comprising, when said phrase search counter is greater than a pre-selected number of phrase searches, said GPL is output.
42. The method as recited in claim 1, further comprising, when said phrase search counter is not greater than a pre-selected number of phrase searches:
- performing a phrase search on said database, wherein said GPL is input as a single query having a plurality of phrases and wherein said database includes said relevant data;
  
  outputting a ranked list of subsets of said database;
  
  extracting a plurality of phrases from the ranked list of subsets; and
  
  culling the extracted plurality of phrases.
43. The method as recited in claim 42, further comprising:
- ranking said plurality of phrases output from said extracting and culling processes.
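
The culling steps of claims 30 to 40 can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that the candidate phrase list (CPL) is a dictionary mapping phrase tuples to frequencies, and that "proper subset" means a shorter contiguous run of terms inside a longer phrase. The names `cull` and `is_proper_subphrase` are hypothetical.

```python
def is_proper_subphrase(short, long):
    """True when `short` occurs as a contiguous run of terms inside the longer `long`.

    Assumption: this is how "P1 is a proper subset of P2" is interpreted here.
    """
    n, m = len(short), len(long)
    if n >= m:
        return False
    return any(long[i:i + n] == short for i in range(m - n + 1))


def cull(cpl):
    """Return a copy of the CPL without phrases absorbed by a longer phrase.

    A phrase P1 is dropped when some P2 in the CPL contains it and has the
    same frequency of occurrence (claim 38). Otherwise P1 is kept.
    """
    kept = {}
    for p1, f1 in cpl.items():
        absorbed = any(
            f1 == f2 and is_proper_subphrase(p1, p2)
            for p2, f2 in cpl.items()
        )
        if not absorbed:
            kept[p1] = f1
    return kept


# Hypothetical candidate phrase list: phrase tuple -> frequency.
cpl = {
    ("fuel",): 3,
    ("fuel", "pump"): 3,
    ("pump",): 5,
    ("fuel", "pump", "failure"): 1,
}
culled = cull(cpl)  # ("fuel",) is dropped; it only ever appears inside ("fuel", "pump")
```

The sketch compares every phrase against the unmodified CPL rather than deleting entries while iterating, which keeps the loop simple. The claims instead step through P1 and P2 one phrase at a time.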

44. A method of discovering phrases from a database comprising:
- providing a set of text;
  
  extracting a plurality of phrases from the provided text;
  
  culling the extracted plurality of phrases;
  
  gathering a plurality of phrases, wherein the gathered plurality of phrases are related to the culled and extracted plurality of phrases, by a process comprising;
  
  (A) initiating a gathered phrase list (GPL), wherein the GPL is empty upon initiation, and setting a phrase search counter to zero;
  
  (B) ranking the plurality of phrases output from the extracting and culling processes;
  
  (C) creating a combined plurality of phrases including the ranked plurality of phrases from the extracting and culling processes and any phrases contained in the GPL;
  
  (D) storing the combined plurality of phrases in the GPL;
  
  (E) incrementing a phrase search counter;
  
  (F) determining if the phrase search counter is greater than a pre-selected number of phrase searches;
  
  (F-1) when the phrase search counter is greater than a pre-selected number of phrase searches, exiting the process of gathering a plurality of phrases;
  
  (F-2) when the phrase search counter is not greater than a pre-selected number of phrase searches, (F-2-a) performing a phrase search in a database using the GPL as a single query having a plurality of phrases;
  
  (F-2-b) outputting a plurality of ranked subsets from the database;
  
  (F-2-c) extracting a plurality of phrases from a selected plurality of the plurality of ranked subsets;
  
  (F-2-d) culling the extracted plurality of phrases, and continuing the process of gathering a plurality of phrases at step (B); and
  
  outputting the plurality of gathered phrases.
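
The gathering loop of claim 44, steps (A) through (F-2-d), can be sketched as follows. This is an illustrative sketch only. The callables `rank`, `search`, `extract`, and `cull` are hypothetical stand-ins supplied by the caller, and removing duplicates when combining phrases in step (C) is an assumption the claim does not state.

```python
def gather(phrases, search, extract, cull, rank, max_searches):
    """Iteratively build a gathered phrase list (GPL).

    phrases      -- phrases output by the initial extracting and culling
    search       -- hypothetical: takes the GPL as one query, returns ranked subsets
    extract      -- hypothetical: takes ranked subsets, returns extracted phrases
    cull         -- hypothetical: takes extracted phrases, returns culled phrases
    rank         -- hypothetical: takes phrases, returns them in ranked order
    max_searches -- the pre-selected number of phrase searches
    """
    gpl = []            # (A) the GPL starts empty
    search_count = 0    # (A) phrase search counter starts at zero
    while True:
        ranked = rank(phrases)                # (B)
        combined = list(gpl)                  # (C) GPL contents plus ranked phrases
        for phrase in ranked:
            if phrase not in combined:        # assumption: no duplicates in the GPL
                combined.append(phrase)
        gpl = combined                        # (D)
        search_count += 1                     # (E)
        if search_count > max_searches:       # (F), (F-1)
            return gpl
        subsets = search(gpl)                 # (F-2-a), (F-2-b)
        phrases = cull(extract(subsets))      # (F-2-c), (F-2-d), then back to (B)
```

With `max_searches` set to 1, the loop performs exactly one database search and then exits on the second pass through step (F).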

45. A method of discovering phrases from a database comprising:
- (A) providing a set of text;
  
  (B) extracting a plurality of phrases from the provided text by a process comprising determining a plurality of phrase processing starting positions within a selection of text, and extracting the plurality of phrases wherein, each of the plurality of extracted phrases begins at one of the plurality of starting positions by a process comprising;
  
  (B-1) identifying a first term in the text as a term T1 and a term T2;
  
  (B-2) determining if at least one term subsequent to T1 exists in the text;
  
  (B-3) when at least one term subsequent to T1 exists in the text, determining if T1 is a stopterm;
  
  (B-4) when T1 is a stopterm, identifying the at least one subsequent term as a term T1 and term T2; and
  
  returning to step (B-2);
  
  (B-5) when T1 is not a stopterm, (a) saving T1 as a single term phrase in a phrase list, (b) extracting a plurality of multi-term phrases wherein, each of the plurality of extracted multi-term phrases begins at starting position T1, (c) identifying the at least one term subsequent to T1 as term T1 and term T2, and (d) returning to step (B-2);
  
  (B-6) when at least one term subsequent to T1 does not exist in the text, determining if T1 is a stopterm;
  
  (B-7) when T1 is a stopterm, continuing to step (B-9);
  
  (B-8) when T1 is not a stopterm, saving T1 as a single term phrase in the phrase list, and continuing to step (B-9); and
  
  (B-9) outputting the phrase list containing the extracted plurality of phrases;
  
  (C) culling the extracted plurality of phrases;
  
  (D) gathering a plurality of phrases, wherein the gathered plurality of phrases are related to the culled and extracted plurality of phrases; and
  
  (E) outputting the plurality of gathered phrases.
46. The method of claim 45, wherein said process of saving said T1 as a single term phrase comprises:
47. The method of claim 46, wherein said process of saving said single term phrase T1 comprises:
- (B-8-d) determining if said single term phrase T1 is already in said phrase list;
  
  (B-8-e) when said single term phrase T1 is already in said phrase list, incrementing by 1 a frequency counter corresponding to number of single terms in said phrase list;
  
  (B-8-f) when said single term phrase T1 is not already in said phrase list, adding said single term phrase T1 to said phrase list and setting the frequency counter to the value 1.

48. A method of discovering phrases from a database comprising:
- (A) providing a set of text;
  
  (B) extracting a plurality of phrases from the provided text by a process comprising determining a plurality of phrase processing starting positions within a selection of text, and extracting the plurality of phrases wherein, each of the plurality of extracted phrases begins at one of the plurality of starting positions by a process comprising;
  
  (B-1) identifying a first term in the text as a term T1 and a term T2;
  
  (B-2) determining if at least one term subsequent to T1 exists in the text;
  
  (B-3) when at least one term subsequent to T1 exists in the text, determining if T1 is a stopterm by a process comprising;
  
  (B-3-a) when T1 is a starting stopterm, identifying the at least one subsequent term as a term T1 and term T2; and
  
  returning to step (B-2); and
  
  (B-3-b) when T1 is not a starting stopterm, (a) saving T1 as a single term phrase in a phrase list, (b) extracting a plurality of multi-term phrases wherein, each of the plurality of extracted multi-term phrases begins at starting position T1, (c) identifying the at least one term subsequent to T1 as term T1 and term T2, and (d) returning to step (B-2);
  
  (B-4) when T1 is a stopterm, identifying the at least one subsequent term as a term T1 and term T2; and
  
  returning to step (B-2);
  
  (B-5) when T1 is not a stopterm, (a) saving T1 as a single term phrase in a phrase list, (b) extracting a plurality of multi-term phrases wherein, each of the plurality of extracted multi-term phrases begins at starting position T1, (c) identifying the at least one term subsequent to T1 as term T1 and term T2, and (d) returning to step (B-2);
  
  (B-6) when at least one term subsequent to T1 does not exist in the text, determining if T1 is a stopterm;
  
  (B-7) when T1 is a stopterm, continuing to step (B-9);
  
  (B-8) when T1 is not a stopterm, saving T1 as a single term phrase in the phrase list, and continuing to step (B-9); and
  
  (B-9) outputting the phrase list containing the extracted plurality of phrases;
  
  (C) culling the extracted plurality of phrases;
  
  (D) gathering a plurality of phrases, wherein the gathered plurality of phrases are related to the culled and extracted plurality of phrases; and
  
  (E) outputting the plurality of gathered phrases.

49. A method of discovering phrases from a database comprising:
- (A) providing a set of text;
  
  (B) extracting a plurality of phrases from the provided text by a process comprising determining a plurality of phrase processing starting positions within a selection of text, and extracting the plurality of phrases wherein, each of the plurality of extracted phrases begins at one of the plurality of starting positions by a process comprising;
  
  (B-1) identifying a first term in the relevant text as a term T1 and a term T2;
  
  (B-2) determining if at least one term subsequent to T1 exists in the relevant text;
  
  (B-3) when at least one term subsequent to T1 exists in the relevant text, determining if T1 is a stopterm;
  
  (B-4) when T1 is a stopterm, identifying the at least one subsequent term as a term T1 and term T2; and
  
  returning to step (B-2);
  
  (B-5) when T1 is not a stopterm, (a) saving T1 as a single term phrase in a phrase list, (b) extracting a plurality of multi-term phrases wherein, each of the plurality of extracted multi-term phrases begins at starting position, (c) identifying the at least one term subsequent to T1 as term T1 and term T2, and (d) returning to step (B-2), wherein the process of extracting the plurality of multi-term phrases comprises;
  
  (B-5-b-i) initializing a counter of interior stopterms to 0, and setting a tuple size equal to 2;
  
  (B-5-b-ii) identifying the term subsequent to said T2 as T2;
  
  (B-5-b-iii) determining if the tuple size is greater than a selected maximum phrase length;
  
  (B-5-b-iv) when the tuple size is not greater than the selected maximum phrase length, determining if said T2 is a stopterm;
  
  (B-5-b-v) when the tuple size is not greater than the selected maximum phrase length, and said T2 is not a stopterm, saving said phrase and continuing to step (B-5-b-x);
  
  (B-5-b-vi) when the tuple size is not greater than the selected maximum phrase length, and said T2 is a stopterm, incrementing by 1 the counter of interior stopterms and determining if the interior stopterm counter value is greater than a selected number of interior stopterms;
  
  (B-5-b-vii) when the tuple size is not greater than the selected maximum phrase length, and said T2 is a stopterm and the interior stopterm counter value is not greater than a selected number of stopterms, continuing to step (B-5-b-x);
  
  (B-5-b-viii) when the tuple size is not greater than the selected maximum phrase length, and said T2 is a stopterm and the interior stopterm counter value is greater than a selected number of stopterms, ending said sub-process of said step (B-5-b);
  
  (B-5-b-ix) when the tuple size is greater than the selected maximum phrase length, ending said sub-process of said step (B-5-b);
  
  (B-5-b-x) incrementing by 1 the tuple size, and determining if there is a term subsequent to T2 in said text;
  
  (B-5-b-xi) when there is no term subsequent to T2 in said relevant text, ending the sub-process of said step (B-5-b); and
  
  (B-5-b-xii) when there is a term subsequent to T2 in said relevant text, returning to step (B-5-b-ii);
  
  (B-6) when at least one term subsequent to T1 does not exist in the text, determining if T1 is a stopterm;
  
  (B-7) when T1 is a stopterm, continuing to step (B-9);
  
  (B-8) when T1 is not a stopterm, saving T1 as a single term phrase in the phrase list, and continuing to step (B-9); and
  
  (B-9) outputting the phrase list containing the extracted plurality of phrases;
  
  (C) culling the extracted plurality of phrases;
  
  (D) gathering a plurality of phrases, wherein the gathered plurality of phrases are related to the culled and extracted plurality of phrases; and
  
  (E) outputting the plurality of gathered phrases.
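
The extraction process of claim 49, step (B), can be sketched in Python as below. This is a minimal illustration, not the patented implementation. It assumes the text is already split into a list of terms, that stopterms are given as a set, and that the phrase list is a `Counter` keyed by phrase tuples. The function name and the default parameter values are hypothetical.

```python
from collections import Counter


def extract_phrases(terms, stopterms, max_len=4, max_interior=1):
    """Extract single-term and multi-term phrases with frequency counts.

    terms        -- the text as a list of terms (assumed pre-tokenized)
    stopterms    -- set of terms that may not start or end a phrase
    max_len      -- the selected maximum phrase length
    max_interior -- the selected number of interior stopterms allowed
    """
    phrase_list = Counter()
    for t1 in range(len(terms)):                 # each candidate start position T1
        if terms[t1] in stopterms:               # (B-4), (B-7): skip stopterm starts
            continue
        phrase_list[(terms[t1],)] += 1           # (B-5)(a), (B-8): single-term phrase
        interior = 0                             # (B-5-b-i)
        tuple_size = 2
        t2 = t1
        while t2 + 1 < len(terms):               # (B-5-b-xi), (B-5-b-xii)
            t2 += 1                              # (B-5-b-ii)
            if tuple_size > max_len:             # (B-5-b-ix)
                break
            if terms[t2] in stopterms:           # (B-5-b-vi)
                interior += 1
                if interior > max_interior:      # (B-5-b-viii)
                    break
            else:                                # (B-5-b-v): save phrase T1..T2
                phrase_list[tuple(terms[t1:t2 + 1])] += 1
            tuple_size += 1                      # (B-5-b-x)
    return phrase_list


# Hypothetical input text and stopterm list.
terms = "loss of engine power on takeoff".split()
phrases = extract_phrases(terms, stopterms={"of", "on"}, max_len=3, max_interior=1)
```

In this example the result holds seven phrases, each with frequency 1, including `("loss", "of", "engine")` and `("engine", "power")`. No phrase starts or ends on a stopterm, so `("engine", "power", "on")` is never saved.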

50. A method of discovering phrases from a database comprising:
- (A) providing a set of text;
  
  (B) extracting a plurality of phrases from the provided text by a process comprising determining a plurality of phrase processing starting positions within a selection of text, and extracting the plurality of phrases wherein, each of the plurality of extracted phrases begins at one of the plurality of starting positions by a process comprising;
  
  (B-1) identifying a first term in the text as a term T1 and a term T2;
  
  (B-2) determining if at least one term subsequent to T1 exists in the text;
  
  (B-3) when at least one term subsequent to T1 exists in the text, determining if T1 is a stopterm;
  
  (B-4) when T1 is a stopterm, identifying the at least one subsequent term as a term T1 and term T2; and
  
  returning to step (B-2);
  
  (B-5) when T1 is not a stopterm, (a) saving T1 as a single term phrase in a phrase list, (b) extracting a plurality of multi-term phrases wherein, each of the plurality of extracted multi-term phrases begins at starting position T1, (c) identifying the at least one term subsequent to T1 as term T1 and term T2, and (d) returning to step (B-2), wherein the process of extracting the plurality of multi-term phrases at the phrase starting position comprises;
  
  (B-5-b-i) initializing a counter of interior stopterms to 0, and setting a tuple size equal to 2;
  
  (B-5-b-ii) identifying the term subsequent to the T2 as T2;
  
  (B-5-b-iii) determining if the tuple size is greater than a selected maximum phrase length;
  
  (B-5-b-iv) when the tuple size is greater than the selected maximum phrase length, ending the sub-process of step (B-5-b-i);
  
  (B-5-b-v) when the tuple size is not greater than the selected maximum phrase length, determining if the T2 is an interior stopterm;
  
  (B-5-b-vi) when the tuple size is not greater than the selected maximum phrase length, and T2 is not an interior stopterm, determining if T2 is an ending stopterm;
  
  (B-5-b-vii) when the tuple size is not greater than the selected maximum phrase length, and the T2 is not an interior stopterm, and T2 is not an ending stopterm, saving the phrase and continuing to step (B-5-b-xiv);
  
  (B-5-b-viii) when the tuple size is not greater than the selected maximum phrase length and T2 is not an interior stopterm and T2 is an ending stopterm, determining if T2 is an interior-only term;
  
  (B-5-b-ix) when the tuple size is not greater than the selected maximum phrase length and T2 is not an interior stopterm and T2 is an ending stopterm and T2 is an interior-only term, continuing to step (B-5-b-xiv);
  
  (B-5-b-x) when the tuple size is not greater than the selected maximum phrase length and T2 is not an interior stopterm and T2 is an ending stopterm and T2 is not an interior-only term, ending the sub-process of step (B-5);
  
  (B-5-b-xi) when the tuple size is not greater than the selected maximum phrase length and T2 is an interior stopterm, incrementing by 1 the interior stopterm counter and determining if the interior stopterm count is greater than a selected number of interior stopterms;
  
  (B-5-b-xii) when the tuple size is not greater than the selected maximum phrase length and T2 is an interior stopterm and the interior stopterm count is greater than the selected number of interior stopterms, ending the sub-process of step (B-5);
  
  (B-5-b-xiii) when the tuple size is not greater than the selected maximum phrase length and T2 is an interior stopterm and the interior stopterm count is not greater than the selected number of interior stopterms, continuing to step (B-5-b-xiv);
  
  (B-5-b-xiv) incrementing by 1 the tuple size, and determining if there is a term subsequent to T2 in the text;
  
  (B-5-b-xv) when there is no term subsequent to T2 in the text, ending the sub-process of step (B-5); and
  
  (B-5-b-xvi) when there is a term subsequent to T2 in the text, returning to step (B-5-b-ii);
  
  (B-6) when at least one term subsequent to T1 does not exist in the text, determining if T1 is a stopterm;
  
  (B-7) when T1 is a stopterm, continuing to step (B-9);
  
  (B-8) when T1 is not a stopterm, saving T1 as a single term phrase in the phrase list, and continuing to step (B-9); and
  
  (B-9) outputting the phrase list containing the extracted plurality of phrases;
  
  (C) culling the extracted plurality of phrases;
  
  (D) gathering a plurality of phrases, wherein the gathered plurality of phrases are related to the culled and extracted plurality of phrases; and
  
  (E) outputting the plurality of gathered phrases.
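
Claim 50 differs from claim 49 in how the term at T2 is classified once the tuple size check has passed. That per-term decision can be sketched as a small Python function. It is an illustration under stated assumptions: the three term classes are passed in as sets, and the function name and the action labels `"save"`, `"extend"`, and `"stop"` are hypothetical.

```python
def next_action(t2, interior_count, interior_stopterms, ending_stopterms,
                interior_only_terms, max_interior):
    """Decide what to do with the term at T2; returns (action, interior_count).

    "save"   -- save the phrase T1..T2, then try to extend it by one term
    "extend" -- do not save, but try to extend the phrase by one term
    "stop"   -- end the multi-term extraction for this start position
    Assumes the caller has already checked tuple size against the maximum.
    """
    if t2 in interior_stopterms:
        interior_count += 1                    # count the interior stopterm
        if interior_count > max_interior:
            return "stop", interior_count      # too many interior stopterms
        return "extend", interior_count
    if t2 in ending_stopterms:
        if t2 in interior_only_terms:
            return "extend", interior_count    # allowed inside, never at the end
        return "stop", interior_count
    return "save", interior_count              # ordinary term: phrase may end here


# Hypothetical term classes for illustration only.
action, count = next_action(
    "of", 0,
    interior_stopterms={"of"},
    ending_stopterms={"and", "the"},
    interior_only_terms={"and"},
    max_interior=1,
)
```

Here `action` is `"extend"` and `count` is 1, because "of" is an interior stopterm and the counter has not yet exceeded the limit.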

Current Assignee: U.S.A. as represented by the Administrator of the National Aeronautics and Space Administration
Original Assignee: United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration
Inventors: McGreevy, Michael W.
Primary Examiner(s): Le, Uyen
Assistant Examiner(s): Thai, Hanh B
Application Number: US09/800,310
Publication Number: US 20020188599A1
Time in Patent Office: 1,138 Days
Field of Search: 707/1-7
US Class Current: 1/1
CPC Class Codes:
G06F 16/2465   Query processing support fo...
G06F 16/3337   Translation of the query la...
Y10S 707/99933   Query processing, i.e. sear...
Y10S 707/99936   Pattern matching access
