Apparatus for automatically generating index
First Claim
1. An apparatus for creating an index of coded textual data, comprising:
textual data storing means for storing coded textual data;
text analyzing means for dividing said coded textual data into meaningful strings of words, characters, symbols, or control codes;
specialized word storing means for storing a set of specialized words pertaining to a field of knowledge related to said coded textual data;
entry selecting means for selecting as index entries only those strings that match a word or its substantial equivalent in said set of specialized words and creating index entry data by associating and storing each of the selected strings with each position in said coded textual data where that selected string occurs; and
index outputting means for outputting in visible form an index of said coded textual data by arranging said index entries in a prescribed order and outputting the arranged index entries together with symbols indicative of their associated positions of occurrence in said coded textual data.
Abstract
A system for creating an index of textual data stores textual data in memory, and a text analyzing module analyzes the textual data and divides it into a plurality of meaningful strings of characters, punctuation marks, symbols, control codes, etc. A dictionary stores sets of specialized words particular to a field of knowledge related to the textual data in a particular language. An entry selecting module selects as index entries only those strings which match one of those specialized words and notes the location(s) of each occurrence of each index entry in the text. A printer outputs the selected index entries together with their occurrence positions. Each entry of the dictionary in the specialized field includes information concerning inflections and variants of that entry. The index is quickly and accurately generated by selecting index entries using a specialized dictionary relevant to a particular, specialized field. Since the selection of index entries is made by referring to such a dictionary, differences in selection criteria between operators can be prevented. Since a specialized dictionary is prepared and updated for each field, the knowledge for generating an index is collected and shared by all the operators.
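The flow described in the abstract (split the text into strings, keep only those that match a specialized dictionary entry or one of its variants, record where each occurs, then output the entries in order) can be illustrated with a minimal sketch. This is not the patented implementation: the dictionary contents, the use of page numbers as "positions", the alphabetical ordering, and the names `DICTIONARY`, `build_lookup`, `create_index`, and `format_index` are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical specialized dictionary: headword -> inflections and variants.
DICTIONARY = {
    "index": {"index", "indexes", "indices"},
    "parser": {"parser", "parsers"},
}


def build_lookup(dictionary):
    """Map every variant (lowercased) back to its headword."""
    lookup = {}
    for headword, variants in dictionary.items():
        for variant in variants | {headword}:
            lookup[variant.lower()] = headword
    return lookup


def create_index(pages, dictionary):
    """Return a sorted list of (headword, [page numbers]).

    `pages` is a list of page texts; using the page number as the
    position of occurrence is an assumption of this sketch.
    """
    lookup = build_lookup(dictionary)
    positions = defaultdict(set)
    for page_no, text in enumerate(pages, start=1):
        # Text analysis step: split the page into alphabetic strings.
        for token in re.findall(r"[A-Za-z]+", text):
            headword = lookup.get(token.lower())
            if headword is not None:  # keep only dictionary matches
                positions[headword].add(page_no)
    # Arrange entries in a prescribed order (alphabetical here).
    return [(hw, sorted(pgs)) for hw, pgs in sorted(positions.items())]


def format_index(index):
    """Render each entry followed by its page numbers."""
    return "\n".join(f"{hw}  {', '.join(map(str, pgs))}" for hw, pgs in index)
```

Calling `create_index(pages, DICTIONARY)` on a list of page strings and passing the result to `format_index` gives one line per headword. Mapping variants back to a headword is one simple way to honor the "inflections and variants" information the abstract mentions; a real system could use morphological analysis instead.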
266 Citations
20 Claims
1. An apparatus for creating an index of coded textual data, comprising:
textual data storing means for storing coded textual data;
text analyzing means for dividing said coded textual data into meaningful strings of words, characters, symbols, or control codes;
specialized word storing means for storing a set of specialized words pertaining to a field of knowledge related to said coded textual data;
entry selecting means for selecting as index entries only those strings that match a word or its substantial equivalent in said set of specialized words and creating index entry data by associating and storing each of the selected strings with each position in said coded textual data where that selected string occurs; and
index outputting means for outputting in visible form an index of said coded textual data by arranging said index entries in a prescribed order and outputting the arranged index entries together with symbols indicative of their associated positions of occurrence in said coded textual data.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 20
12. An apparatus for creating an index of coded textual data, comprising:
textual data storing means for storing coded textual data;
text analyzing means for dividing said coded textual data into meaningful strings of words, characters, symbols or control codes;
rule storing means for storing rules for selecting index entries from said textual data;
entry selecting means for selecting strings as index entries, creating index entry data, and storing said index entry data along with each position in said coded textual data where an associated one of said selected strings occurs in accordance with said rules; and
index outputting means for outputting in visible form an index of said coded textual data by arranging said index entries in a prescribed order and outputting the arranged index entries together with symbols indicative of their associated positions of occurrence in said textual data.
Dependent claims: 18, 19
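Claim 12 replaces the specialized dictionary with stored selection rules. The claim text here does not say what those rules are, so the sketch below uses two hypothetical rules (a minimum length and an initial capital) purely to show the shape of rule-driven selection. The names `RULES` and `select_by_rules`, and the use of character offsets as positions, are likewise assumptions.

```python
import re
from collections import defaultdict

# Hypothetical rules: each is a predicate over a candidate string.
RULES = [
    lambda tok: len(tok) >= 4,     # skip very short strings
    lambda tok: tok[0].isupper(),  # keep capitalized strings only
]


def select_by_rules(text, rules):
    """Return {string: [character offsets]} for strings satisfying all rules."""
    entries = defaultdict(list)
    for match in re.finditer(r"\w+", text):
        token = match.group()
        if all(rule(token) for rule in rules):
            # Store the entry together with its position in the text.
            entries[token].append(match.start())
    # Arrange entries in a prescribed order (alphabetical here).
    return dict(sorted(entries.items()))
```

Keeping the rules as a plain list of predicates means they can be stored and updated separately from the selection logic, which loosely mirrors the separation between "rule storing means" and "entry selecting means" in the claim.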
Specification