Text processing and retrieval system and method
Abstract
A content-based system and method for text processing and retrieval is provided wherein a plurality of pieces of text are processed based on content to generate an index for each piece of text, the index comprising a list of phrases that represent the content of the piece of text. The phrases are grouped together to generate clusters based on a degree of relationship of the phrases, and a hierarchical structure is generated, the hierarchical structure comprising a plurality of maps, each map corresponding to a predetermined degree of relationship, the map graphically depicting the clusters at the predetermined degree of relationship, and comprising a plurality of nodes, each node representing a cluster, and a plurality of links connecting nodes that are related. The map is displayed to a user, a user selects a particular cluster on the map, and a portion of text is extracted from said pieces of text based on the cluster selected by the user. The system may also generate scenarios, based on said maps, that indicate changes in the relationships shown by the maps.
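The flow described in the abstract (index, cluster, build one map per degree of relationship, select a cluster, extract text) can be illustrated with a minimal sketch. It is not the patented implementation. It assumes that phrases are matched against a fixed, hypothetical vocabulary, that the "degree of relationship" is a co-occurrence count threshold, that clusters are connected components of phrases at or above that threshold, and that links join clusters whose phrases co-occur below it. All function names and sample data are illustrative.

```python
from collections import defaultdict
from itertools import combinations


def build_index(texts, vocabulary):
    """Index each piece of text by the vocabulary phrases found in it."""
    return {
        text_id: [p for p in vocabulary if p in body.lower()]
        for text_id, body in texts.items()
    }


def cooccurrence(index):
    """Count how many pieces of text contain both phrases of each pair."""
    counts = defaultdict(int)
    for phrases in index.values():
        for a, b in combinations(sorted(set(phrases)), 2):
            counts[(a, b)] += 1
    return counts


def build_map(index, threshold):
    """Build one map: clusters are nodes, weaker cross-cluster ties are links."""
    counts = cooccurrence(index)
    phrases = sorted({p for ps in index.values() for p in ps})
    parent = {p: p for p in phrases}

    def find(p):
        # Union-find lookup with path halving.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    # Assumption: phrases co-occurring at least `threshold` times share a cluster.
    for (a, b), n in counts.items():
        if n >= threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for p in phrases:
        groups[find(p)].add(p)
    nodes = sorted((frozenset(g) for g in groups.values()), key=sorted)

    # Assumption: two clusters are linked if any of their phrases co-occur at all.
    links = set()
    for a, b in counts:
        ra, rb = find(a), find(b)
        if ra != rb:
            links.add(frozenset((frozenset(groups[ra]), frozenset(groups[rb]))))
    return {"threshold": threshold, "nodes": nodes, "links": links}


def build_hierarchy(index, thresholds):
    """One map per degree of relationship."""
    return {t: build_map(index, t) for t in thresholds}


def extract(texts, index, cluster):
    """Return the pieces of text whose index shares a phrase with the cluster."""
    return {tid: texts[tid] for tid, ps in index.items() if cluster & set(ps)}


# Illustrative data only.
texts = {
    "a": "Solar panels and wind turbines cut carbon emissions.",
    "b": "Wind turbines need steel; solar panels need silicon.",
    "c": "Carbon emissions rose as steel output grew.",
}
vocabulary = ["solar panels", "wind turbines", "carbon emissions", "steel"]

index = build_index(texts, vocabulary)
hierarchy = build_hierarchy(index, thresholds=[1, 2])
selected = frozenset({"steel"})          # stands in for a user clicking a node
print(sorted(extract(texts, index, selected)))
```

A lower threshold merges more phrases into fewer, broader clusters, while a higher threshold yields finer clusters connected by links, which is one way to read the "plurality of maps" in the hierarchy.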
34 Claims
1. A content-based text processing and retrieval system, comprising:
means for processing a plurality of pieces of text based on content to generate an index for each piece of text, the index comprising a list of phrases that represent the content of the piece of text;
means for grouping phrases together to generate clusters based on a predetermined degree of relationship between the phrases;
means for generating a hierarchical structure, the hierarchical structure comprising a plurality of maps, each map corresponding to a predetermined degree of relationship, the map graphically depicting the clusters at the predetermined degree of relationship and comprising a plurality of nodes, each node representing a cluster, and a plurality of links connecting nodes that are related;
means for selecting a predetermined map;
means for displaying said selected map to a user;
means for selecting a particular cluster displayed on said selected map; and
means for extracting a portion of text from said pieces of text based on the selected cluster.
Dependent claims: 2, 3, 4, 5, 6, 7, 8

9. A method for content-based text processing and retrieval, comprising:
processing a plurality of pieces of text based on content to generate an index for each piece of text, the index comprising a list of phrases that represent the content of the piece of text;
grouping phrases together to generate clusters based on a predetermined degree of relationship between the phrases;
generating a hierarchical structure, the hierarchical structure comprising a plurality of maps, each map corresponding to a predetermined degree of relationship, the map graphically depicting the clusters at the predetermined degree of relationship and comprising a plurality of nodes, each node representing a cluster, and a plurality of links connecting nodes that are related;
selecting a predetermined map;
displaying said selected map to a user;
selecting a particular cluster displayed on said selected map; and
extracting a portion of text from said pieces of text based on the selected cluster.
Dependent claims: 10, 11, 12, 13, 14, 15, 16

17. A content-based text processing and retrieval system, comprising:
means for processing a plurality of pieces of text based on content to generate an index for each piece of text, the index comprising a list of phrases that represent the content of the piece of text;
means for grouping phrases together to generate clusters based on a predetermined degree of relationship between the phrases; and
means for generating a hierarchical structure, the hierarchical structure comprising a plurality of maps, each map corresponding to a predetermined degree of relationship, the map graphically depicting the clusters at the predetermined degree of relationship and comprising a plurality of nodes, each node representing a cluster, and a plurality of links connecting nodes that are related.
Dependent claims: 18, 19, 20, 21, 22, 23, 24

25. A method for content-based text processing and retrieval system, comprising:
processing a plurality of pieces of text based on content to generate an index for each piece of text, the index comprising a list of phrases that represent the content of the piece of text;
grouping phrases together to generate clusters based on a predetermined degree of relationship between the phrases; and
generating a hierarchical structure, the hierarchical structure comprising a plurality of maps, each map corresponding to a predetermined degree of relationship, the map graphically depicting the clusters at the predetermined degree of relationship and comprising a plurality of nodes, each node representing a cluster, and a plurality of links connecting nodes that are related.
Dependent claims: 26, 27, 28, 29, 30, 31, 32

33. A content-based text processing and retrieval system, comprising:
means for processing a plurality of pieces of text based on content to generate an index for each piece of text, the index comprising a list of phrases that represent the content of the piece of text;
means for grouping phrases together to generate clusters based on a predetermined degree of relationship between the phrases;
means for generating a hierarchical structure, the hierarchical structure comprising a plurality of maps, each map corresponding to a predetermined degree of relationship, the map graphically depicting the clusters at the predetermined degree of relationship and comprising a plurality of nodes, each node representing a cluster, and a plurality of links connecting nodes that are related;
means for generating a semiotic data structure from said plurality of pieces of text, the semiotic data structure comprising a list of phrases that indicate the content of said pieces of text, and a tag that is associated with each phrase in said semiotic data structure to classify which word by its content; and
means for comparing a plurality of maps to each other to generate a scenario, said scenario indicating changes in the relationship graphically depicted by said maps.
Dependent claims: 34

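Claim 33 adds two elements beyond the maps themselves: a semiotic data structure of tagged phrases, and a scenario produced by comparing maps. The sketch below is one possible reading, not the claimed implementation. It assumes a scenario can be represented as the set differences between two maps' nodes and links. The class names, tag values, and sample phrases are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedPhrase:
    """One entry of a semiotic data structure: a phrase plus its content tag."""
    phrase: str
    tag: str  # hypothetical content class, e.g. "technology"


@dataclass(frozen=True)
class Map:
    nodes: frozenset  # each node is a frozenset of phrases (a cluster)
    links: frozenset  # each link is a frozenset holding two nodes


def link(a, b):
    """Build an undirected link between two clusters."""
    return frozenset((frozenset(a), frozenset(b)))


def scenario(earlier, later):
    """Assumption: a scenario is what appeared or vanished between two maps."""
    return {
        "new_clusters": later.nodes - earlier.nodes,
        "lost_clusters": earlier.nodes - later.nodes,
        "new_links": later.links - earlier.links,
        "lost_links": earlier.links - later.links,
    }


# Illustrative data only.
semiotic = [
    TaggedPhrase("solar panels", "technology"),
    TaggedPhrase("steel", "material"),
    TaggedPhrase("carbon emissions", "environment"),
]

nodes = frozenset(frozenset({t.phrase}) for t in semiotic)
earlier = Map(nodes, frozenset({link({"solar panels"}, {"steel"})}))
later = Map(
    nodes,
    frozenset({
        link({"solar panels"}, {"steel"}),
        link({"steel"}, {"carbon emissions"}),
    }),
)

changes = scenario(earlier, later)
for pair in changes["new_links"]:
    print("new link:", sorted(sorted(cluster) for cluster in pair))
```

Using immutable sets for nodes and links keeps the comparison a plain set difference, so a scenario reports only what changed between the two maps.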
Specification