Systems and methods for organizing text
Abstract
Systems and methods are provided for organizing text content of one or more text passages, such as text passages obtained in response to a search query, and/or other text passages, using an organization based on concept terms obtained from the one or more text passages. A hierarchical structure is used to organize the documents in a way that informs the user about co-occurrence relations among terms that represent concepts, indicating the relative degree of occurrence and context of discussion of the terms within the search results. One or more candidate hierarchies may be generated, each with a different term in the most-dominant position. The one or more candidate hierarchies can be evaluated, and a hierarchy to be displayed can be selected based on the evaluation.
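As an illustration of the co-occurrence relations the abstract refers to, the following minimal sketch counts how often pairs of candidate terms appear together. It rests on assumptions not stated in the source: a "co-occurrence" is taken to mean two terms appearing in the same passage, passages are tokenized on whitespace, and the function name `cooccurrence_matrix` and the sample passages are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_matrix(passages, candidate_terms):
    """Count, for each pair of candidate terms, the passages containing both.

    Assumption: co-occurrence is measured per passage, and the diagonal
    holds the number of passages in which a term appears at all.
    """
    terms = sorted(candidate_terms)
    counts = Counter()
    for passage in passages:
        words = set(passage.lower().split())  # naive whitespace tokenization
        present = [t for t in terms if t in words]
        for t in present:
            counts[(t, t)] += 1
        for a, b in combinations(present, 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1  # keep the matrix symmetric
    return {a: {b: counts[(a, b)] for b in terms} for a in terms}


# Hypothetical sample passages, for illustration only.
passages = [
    "solar panel cost fell",
    "solar panel efficiency rose",
    "solar subsidy debate",
    "wind turbine cost fell",
]
matrix = cooccurrence_matrix(passages, {"solar", "panel", "cost", "wind"})
```

Here `matrix["solar"]["panel"]` counts the passages that mention both terms, and `matrix["solar"]["solar"]` counts the passages that mention "solar" at all.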
134 Citations
26 Claims
-
1. A computer-implemented method of organizing text content of at least one text passage, comprising:
-
automatically selecting a plurality of terms from the at least one text passage;
obtaining a plurality of candidate terms from the plurality of terms, the plurality of candidate terms being less than the plurality of terms;
organizing at least some of the plurality of candidate terms into a hierarchy according to co-occurrence relationships among the some of the plurality of candidate terms, including arranging the plurality of candidate terms into a co-occurrence matrix showing a number of times each candidate term co-occurs with each other candidate term in the at least one text passage;
selecting one of the candidate terms as a first dominant hierarchical position candidate term; and
generating a first candidate hierarchy, comprising:
arranging the first dominant hierarchical position candidate term in a dominant hierarchical position, selecting at least one other candidate term, based on the co-occurrence matrix and a predetermined overlap criterion, and arranging the at least one other candidate term in a hierarchical position that is subordinate to the dominant hierarchical position; and
displaying the hierarchy.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
evaluating the first candidate hierarchy and determining a first evaluation score;
selecting a candidate term other than the first dominant hierarchical position candidate term as a second dominant hierarchical position candidate term;
generating a second candidate hierarchy, comprising:
arranging the second dominant hierarchical position candidate term in the dominant hierarchical position, selecting at least one other candidate term, based on the co-occurrence matrix and the predetermined overlap criterion, and arranging the at least one other candidate term in the hierarchical position that is subordinate to the dominant hierarchical position;
evaluating the second candidate hierarchy and determining a second evaluation score;
comparing the first and second evaluation scores; and
retaining the first candidate hierarchy if the first evaluation score is better than the second evaluation score, and retaining the second candidate hierarchy if the second evaluation score is better than the first evaluation score.
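The generate, evaluate, and compare steps recited above can be sketched as follows. This is only one possible reading, under stated assumptions: the overlap criterion is taken to be "the term co-occurs with the dominant term in at least half of its own occurrences", the evaluation score is taken to be the summed co-occurrence between the dominant term and its subordinates, and the first hierarchy is kept on a tie. The claims do not fix any of these choices, and the names `build_hierarchy` and `score` and the sample matrix are hypothetical.

```python
def build_hierarchy(matrix, dominant, min_overlap=0.5):
    """Place `dominant` on top and pick subordinate terms from the matrix.

    Assumed overlap criterion: a term is subordinate if it co-occurs with
    the dominant term in at least `min_overlap` of its own occurrences.
    """
    children = [
        t for t in matrix
        if t != dominant
        and matrix[t][t] > 0
        and matrix[t][dominant] / matrix[t][t] >= min_overlap
    ]
    return {"dominant": dominant, "children": sorted(children)}


def score(matrix, hierarchy):
    """Assumed evaluation score: total co-occurrence of dominant with children."""
    d = hierarchy["dominant"]
    return sum(matrix[d][c] for c in hierarchy["children"])


# Hypothetical co-occurrence matrix; the diagonal holds each term's frequency.
matrix = {
    "solar": {"solar": 3, "panel": 2, "cost": 1, "wind": 0},
    "panel": {"solar": 2, "panel": 2, "cost": 1, "wind": 0},
    "cost":  {"solar": 1, "panel": 1, "cost": 2, "wind": 1},
    "wind":  {"solar": 0, "panel": 0, "cost": 1, "wind": 1},
}

first = build_hierarchy(matrix, "solar")   # first dominant candidate term
second = build_hierarchy(matrix, "cost")   # a different dominant candidate term

# Retain whichever candidate hierarchy scores better (first wins ties here).
retained = first if score(matrix, first) >= score(matrix, second) else second
```

With this sample matrix, the hierarchy headed by "solar" scores 3 against 2 for the one headed by "cost", so the first candidate hierarchy is retained.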
-
-
7. The method according to claim 1, wherein the first dominant hierarchical position candidate term is a most-frequently-occurring term among the candidate terms.
-
8. The method according to claim 1, wherein each unique term is selected no more than a predetermined number of times.
-
9. The method according to claim 1, wherein at least one of the plurality of terms in the hierarchy occurs in a plurality of locations within the hierarchical organization.
-
10. The method according to claim 1, wherein at least two of the at least some of the plurality of terms appear together in a same position in the displayed hierarchy.
-
11. A data carrier carrying a program capable of performing the steps of the method according to claim 1.
-
12. A computer-implemented method of organizing text content of at least one text passage, comprising:
-
automatically selecting a plurality of terms from the at least one text passage;
organizing at least some of the plurality of terms into a hierarchy according to co-occurrence relationships among the some of the plurality of terms;
displaying the hierarchy; and
associating a plurality of selectable elements with a respective plurality of displayed terms of the displayed hierarchy;
wherein if one of the plurality of selectable elements is selected, at least one text passage including the respective displayed term and at least one of
1) text sequentially before the displayed term and
2) text sequentially after the at least one displayed term, is displayed; and
if another of the plurality of selectable elements is selected, the at least one text passage is displayed again.
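The behavior tied to a selectable element in this claim, showing a passage with the text before and after the displayed term, resembles a keyword-in-context lookup. A minimal sketch follows, assuming whitespace tokenization and case-insensitive matching on the first occurrence only; the function name `passages_for_term` is hypothetical.

```python
def passages_for_term(passages, term):
    """Return (text_before, term_as_written, text_after) for each passage
    that contains `term`. Only the first occurrence per passage is used."""
    results = []
    for passage in passages:
        words = passage.split()
        lowered = [w.lower() for w in words]
        if term in lowered:
            i = lowered.index(term)
            before = " ".join(words[:i])
            after = " ".join(words[i + 1:])
            results.append((before, words[i], after))
    return results


# Simulates selecting the element associated with the displayed term "solar".
shown = passages_for_term(
    ["Cheap Solar panel arrives", "wind turbine cost fell"], "solar"
)
```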
-
-
13. A computer-implemented method of organizing text content of at least one text passage, comprising:
-
automatically selecting a plurality of terms from the at least one text passage;
organizing at least some of the plurality of terms into a hierarchy according to co-occurrence relationships among the some of the plurality of terms;
displaying the hierarchy; and
associating a plurality of selectable elements with a respective plurality of displayed terms of the displayed hierarchy, wherein a first selectable element comprises a first displayed term, and a second selectable element comprises an element separate from the displayed terms.
Dependent claims: 14
if one of the first and second selectable elements is selected, at least one first text passage including the at least one displayed term and at least one of
1) text sequentially before the at least one displayed term and
2) text sequentially after the at least one displayed term, is displayed, the at least one first text passage including at least one displayed term from a position subordinate to a hierarchical position of the at least one displayed term; and
if the other of the first and second selectable elements is selected, at least one second text passage including the at least one displayed term and at least one of
1) text sequentially before the at least one displayed term and
2) text sequentially after the at least one displayed term, is displayed, the at least one second text passage not including any terms displayed in the hierarchy at a position subordinate to a hierarchical position of the at least one displayed term.
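The two selections described above differ in whether the displayed passages also mention a term from a subordinate position in the hierarchy. One way to picture that split, assuming whitespace tokenization and a known set of subordinate terms (the function name `split_by_subordinates` is hypothetical):

```python
def split_by_subordinates(passages, term, subordinates):
    """Partition the passages containing `term` into those that also contain
    at least one subordinate term and those that contain none."""
    subordinate_set = set(subordinates)
    with_sub, without_sub = [], []
    for passage in passages:
        words = set(passage.lower().split())
        if term not in words:
            continue  # passage does not mention the displayed term at all
        if words & subordinate_set:
            with_sub.append(passage)
        else:
            without_sub.append(passage)
    return with_sub, without_sub


passages = [
    "solar panel cost fell",
    "solar panel efficiency rose",
    "solar subsidy debate",
    "wind turbine cost fell",
]
# One selectable element would show `with_sub`, the other `without_sub`.
with_sub, without_sub = split_by_subordinates(passages, "solar", {"panel", "cost"})
```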
-
-
15. A computer-implemented method of organizing text content of at least one text passage, comprising:
-
automatically selecting a plurality of terms from the at least one text passage;
organizing at least some of the plurality of terms into a hierarchy according to co-occurrence relationships among the some of the plurality of terms;
displaying the hierarchy; and
associating at least one selectable element with at least one displayed term of the displayed hierarchy, wherein, when the at least one selectable element is selected, a new hierarchy is generated based on the at least one displayed term.
-
-
16. A computer-implemented method of organizing text content of at least one text passage, comprising:
-
automatically selecting a plurality of terms from the at least one text passage;
organizing at least some of the plurality of terms into a hierarchy according to co-occurrence relationships among the some of the plurality of terms, wherein organizing at least some of the plurality of terms into a hierarchy comprises:
generating a plurality of candidate hierarchies; and
assessing a score to each candidate hierarchy based on at least one predetermined constraint; and
displaying a best-scoring one of the candidate hierarchies.
-
-
17. A computer-implemented method of organizing text content of at least one text passage, comprising:
-
automatically selecting a plurality of terms from the at least one text passage;
organizing at least some of the plurality of terms into a hierarchy according to co-occurrence relationships among the some of the plurality of terms by evaluating individual ones of the at least some of the plurality of terms based on at least one predetermined co-occurrence constraint.
Dependent claims: 18, 19
relaxing the predetermined co-occurrence constraint after evaluating individual ones of the at least some of the plurality of terms; and
re-evaluating individual ones of the at least some of the plurality of terms based on the relaxed at least one co-occurrence constraint.
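The relax-and-re-evaluate steps above can be pictured as repeated passes over the unplaced terms with a progressively looser threshold. The sketch below is illustrative only: it assumes the co-occurrence constraint is a minimum overlap ratio with a dominant term and that relaxation means lowering that ratio through a fixed schedule. The function name `select_with_relaxation`, the threshold values, and the sample matrix are all hypothetical.

```python
def select_with_relaxation(matrix, dominant, thresholds=(0.9, 0.6, 0.3)):
    """Evaluate terms against a constraint, then relax it and re-evaluate
    the terms that failed. Returns (placed, remaining)."""
    placed = []
    remaining = [t for t in matrix if t != dominant]
    for threshold in thresholds:  # each pass uses a more relaxed constraint
        still_unplaced = []
        for t in remaining:
            freq = matrix[t][t]
            overlap = matrix[t][dominant] / freq if freq else 0.0
            if overlap >= threshold:
                placed.append((t, threshold))  # record the pass that admitted t
            else:
                still_unplaced.append(t)
        remaining = still_unplaced
    return placed, remaining


# Hypothetical co-occurrence matrix; the diagonal holds each term's frequency.
matrix = {
    "solar": {"solar": 3, "panel": 2, "cost": 1, "wind": 0},
    "panel": {"solar": 2, "panel": 2, "cost": 1, "wind": 0},
    "cost":  {"solar": 1, "panel": 1, "cost": 2, "wind": 1},
    "wind":  {"solar": 0, "panel": 0, "cost": 1, "wind": 1},
}
placed, remaining = select_with_relaxation(matrix, "solar")
```

In this example "panel" satisfies the strictest constraint, "cost" is admitted only after the constraint has been relaxed twice, and "wind" never qualifies.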
-
-
20. A computer-implemented data organization system, comprising:
-
a term extractor that extracts a plurality of terms from at least one text passage;
a co-occurrence determination system that determines co-occurrence relationships between at least some of the plurality of terms;
a co-occurrence matrix generator that generates a co-occurrence matrix based on the co-occurrence relationships, an individual score being assigned to each co-occurrence relationship within the co-occurrence matrix;
a data grouping system that generates a plurality of hierarchies by organizing at least some of the at least some of the plurality of terms based on the co-occurrence relationship; and
a scoring system that assigns a total evaluation score to each hierarchy based on the individual scores within the co-occurrence matrix;
wherein the data organization system retains a best-scoring one of the plurality of hierarchies.
Dependent claims: 21, 22
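To make the scoring system of claim 20 concrete, the sketch below sums the individual scores of the co-occurrence relationships that a hierarchy actually uses and keeps the hierarchy with the highest total. It assumes, without support from the claim text, that a hierarchy can be represented as a list of (parent, child) edges and that raw co-occurrence counts serve as the individual scores. All names and data here are hypothetical.

```python
def total_score(individual_scores, hierarchy_edges):
    """Sum the individual score of every parent-child relationship used."""
    return sum(individual_scores[frozenset(edge)] for edge in hierarchy_edges)


# Assumed individual scores, one per co-occurrence relationship.
individual_scores = {
    frozenset({"solar", "panel"}): 2,
    frozenset({"solar", "cost"}): 1,
    frozenset({"panel", "cost"}): 1,
    frozenset({"cost", "wind"}): 1,
}

# Two hypothetical hierarchies, each written as (parent, child) edges.
hierarchies = {
    "solar-first": [("solar", "panel"), ("solar", "cost"), ("cost", "wind")],
    "cost-first": [("cost", "panel"), ("cost", "wind")],
}

totals = {
    name: total_score(individual_scores, edges)
    for name, edges in hierarchies.items()
}
best = max(totals, key=totals.get)  # the hierarchy that would be retained
```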
-
-
23. A computer-implemented data organization system, comprising:
-
a term extractor that extracts a plurality of terms from at least one text passage;
a co-occurrence determination system that determines co-occurrence relationships between at least some of the plurality of terms;
a data grouping system that generates a hierarchy by organizing at least some of the at least some of the plurality of terms based on the co-occurrence relationship;
a display that displays the hierarchy;
a selectable element generator that generates at least one selectable element associated with at least one of the terms in the displayed hierarchy; and
a controller that, when the at least one selectable element is selected, alters the display.
Dependent claims: 24, 25, 26
-
Specification