Abstracting system for index search machine
First Claim
1. Apparatus for generating at least one abstract that is useful during information searching and retrieval procedures, comprising:
1. means for sensing information comprising individual words in a selected language, each word comprising one or more individual characters;
2. means for categorizing selected ones only of the characters in said words into predefined character groups that are based on a probability distribution of characters in the language selected;
3. means maintaining a count of the number of characters categorized into each of said predefined character groups; and
4. means for storing said count as an abstract of said information.
Abstract
The invention concerns a system of English language abstracting used to increase the search rate for an Index Search machine that, as an example, makes use of magnetic record cards, each typically having 50 tracks for recording information. To increase the rate at which groups of words can be compared, an abstract of each group of words is generated. On the magnetic card, the text or groups of words are recorded on tracks 2 through 50, and the abstracts of these groups are recorded on track 1. When searching for a particular group of words, an abstract is generated for the group being sought. This abstract is then compared to the abstracts on track 1 of each card. If any of the abstracts match, the corresponding group of words on the card is searched in detail. For abstracts that do not match, there is no need to search the corresponding group of words. The need to search every group of words is therefore eliminated, which increases the search rate.
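The track 1 prefilter described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the machine's actual logic: the `Card` record, its field names, the `search` function, and the use of exact tuple equality as the match test are all hypothetical, and the abstracts are shown as made-up count tuples.

```python
from dataclasses import dataclass


@dataclass
class Card:
    # Hypothetical stand-in for one magnetic record card (assumption).
    abstracts: list  # plays the role of track 1: one abstract per group of words
    groups: list     # plays the role of tracks 2-50: the groups of words themselves


def search(cards, inquiry_abstract, inquiry_words):
    """Return matching (card, group) positions and the number of detailed searches."""
    hits = []
    detailed = 0
    for ci, card in enumerate(cards):
        for gi, abstract in enumerate(card.abstracts):
            if abstract != inquiry_abstract:
                continue  # abstracts differ: this group is never read in detail
            detailed += 1  # abstracts agree: compare the words themselves
            if card.groups[gi] == inquiry_words:
                hits.append((ci, gi))
    return hits, detailed


# Made-up abstracts for illustration; two groups happen to share one abstract.
cards = [
    Card([(4, 2, 0), (1, 3, 2)],
         [["magnetic", "card", "index"], ["search", "rate", "track"]]),
    Card([(4, 2, 0)],
         [["digital", "code", "heading"]]),
]
hits, detailed = search(cards, (4, 2, 0), ["magnetic", "card", "index"])
print(hits, detailed)
```

In this toy data only two of the three groups are examined in detail, and one of those two is a false match on the abstract alone. That is the trade-off the abstract makes: a cheap comparison that can rule groups out but cannot by itself confirm a hit.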
The abstracts are generated from the first and third characters of each word in the group of words. Each character used in generating the abstract is encoded into one of three groups that are equally probable over the distribution of characters in the English language. The abstract for a group of words consists of the number of characters that fall into each of the three groups.
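A minimal Python sketch of this counting scheme follows. The text above does not give the actual letter-to-group assignment, so the `GROUP_OF` table is a hypothetical even split of the alphabet by position, not the equiprobable partition the invention relies on. The function name `make_abstract`, the handling of words shorter than three characters (only the first character is counted), and the skipping of non-letters are likewise assumptions.

```python
import string

# Hypothetical grouping (assumption): a-i -> 0, j-r -> 1, s-z -> 2.
# The real groups are chosen to be equally probable in English text.
GROUP_OF = {ch: i // 9 for i, ch in enumerate(string.ascii_lowercase)}


def make_abstract(words):
    """Count the first and third characters of each word, per character group."""
    counts = [0, 0, 0]
    for word in words:
        for pos in (0, 2):  # first and third characters only
            if pos < len(word):  # assumption: short words contribute what they have
                ch = word[pos].lower()
                if ch in GROUP_OF:  # assumption: non-letters are ignored
                    counts[GROUP_OF[ch]] += 1
    return tuple(counts)


print(make_abstract(["magnetic", "card", "index"]))  # (4, 2, 0) with this grouping
```

With the placeholder grouping, the selected characters are m, g, c, r, i, d, which gives four characters in group 0, two in group 1, and none in group 2.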
147 Citations
71 Claims
1. Apparatus for generating at least one abstract that is useful during information searching and retrieval procedures, comprising:
1. means for sensing information comprising individual words in a selected language, each word comprising one or more individual characters;
2. means for categorizing selected ones only of the characters in said words into predefined character groups that are based on a probability distribution of characters in the language selected;
3. means maintaining a count of the number of characters categorized into each of said predefined character groups; and
4. means for storing said count as an abstract of said information.
Dependent claims: 2, 4, 5, 6, 7
3. count maintaining means for maintaining a count of the number of characters in said inquiry word categorized into each of said predefined character groups;
8. means for comparing said inquiry count with said abstract count; and
9. means providing an indication of a match or mismatch of said abstract and inquiry counts.
8. Apparatus for generating at least one abstract that is useful during information searching and retrieval procedures, comprising:
1. means for sensing an inquiry of information comprising at least an individual inquiry word in a selected language, said inquiry word comprising one or more individual characters;
2. means for categorizing selected ones only of the characters in said inquiry word into predefined character groups that are based on a probability distribution of characters in the language selected;
3. means for maintaining a count of the number of characters categorized into each of said predefined character groups; and
4. means for utilizing said count as an inquiry during searching of said information.
Dependent claims: 11, 12, 15, 16, 19, 20, 21, 22, 23, 25, 26, 27, 28, 29, 31, 32, 33, 34, 37, 39, 40, 41, 42, 43, 45, 46, 47, 49, 50, 51
9. Apparatus for generating at least one abstract that is useful during information searching and retrieval procedures, comprising:
1. means for sensing information comprising individual words in a selected language, each word comprising one or more individual characters;
2. means for categorizing selected ones only of the characters in said words into predefined character groups that are based on a probability distribution of characters in the language selected;
3. means for maintaining a count of the number of characters categorized into each of said predefined character groups;
4. means for converting said count to a compressed apparatus-compatible abstract code form; and
5. means for storing said compressed abstract code as an abstract of said information.
14. Apparatus for searching and retrieving information wherein said information is stored on a record medium in index groups each having index words representative of original documents and wherein a separate area is allotted for storage of abstracts, and an abstract count is stored for each said index group, said abstract count being based on individual characters categorized into predefined character groups based on a probability distribution of said characters in a selected language;
said apparatus comprising:
1. means for sensing said stored abstract counts from said record medium;
2. means for comparing an inquiry count with said sensed abstract counts; and
3. means for providing an indication of a match or mismatch of said abstract and inquiry counts.
17. Apparatus for generating at least one abstract that is useful during searching and retrieval procedures involving information stored on a record medium comprising:
1. means for sensing information comprising individual words in a selected language stored on said record medium, each word comprising one or more individual characters;
2. means for categorizing selected ones only of the characters in said words into predefined character groups that are based on a probability distribution of characters in the language selected;
3. means maintaining a count of the number of characters categorized into each of said predefined character groups; and
4. means for storing said count on said record medium as an abstract of said information.
24. means for maintaining a count of the number of characters in said inquiry word categorized into each of said predefined character groups;
8. means for sensing said abstract count stored on said record medium;
9. means for comparing said inquiry count with said sensed abstract count; and
10. means providing an indication of a match or mismatch of said abstract and inquiry counts.
30. Apparatus for searching and retrieving information stored on a record medium wherein said information is stored on said record medium in index groups, each having index words representative of original documents and wherein a separate area is allotted on said medium for storage of abstracts, and said record medium storing an abstract count for each said index group, said abstract count being based on individual characters in an index group that comply with predefined character groups based on a probability distribution of said characters in a selected language;
said apparatus comprising:
1. means for sensing said abstract count stored on said record medium;
2. means for comparing an inquiry count with said sensed abstract count; and
3. means providing an indication of a match or mismatch of said abstract and inquiry counts.
35. A method for generating in an information processing machine at least one abstract that is useful during information searching and retrieval procedures, comprising:
1. sensing by said machine information signals representative of information comprising individual words in a selected language, each word comprising one or more individual characters;
2. developing in said machine, signals representative of characters in said system;
3. categorizing in said machine, selected ones only of the individual character signals into predefined character groups that are based on a probability distribution of the characters represented by said signals in the language selected;
4. maintaining an abstract count by said machine of the number of character signals categorized into each of said predefined character groups; and
5. storing said abstract count in said machine as an abstract of said information.
38. categorizing in said machine selected ones only of the individual character inquiry signals into predefined character groups that are based on a probability distribution of the characters represented by said signals in the language selected;
9. maintaining in said machine an inquiry count of the number of character inquiry signals categorized into each of said predefined character groups;
10. comparing in said machine said inquiry count with said abstract count; and
11. providing from said machine indication signals representative of a match or mismatch of said abstract and inquiry counts.
44. generating and recording in said machine a force search indication rather than an abstract when count capacity is exceeded in order to signify that a search of the related information is required.
48. locating in said machine each index storage area for accessing of the index group stored therein;
9. locating in said machine said particular storage area; and
10. sensing in said machine each abstract count in said particular storage area.
Dependent claims: 59
52. determining in said machine when an index group on said record medium exceeds one track of storage; and
12. recording on said record medium in said machine a track skip code for tracks storing the excess of information in any index group.
Dependent claims: 66
53. A method for searching and retrieving in an information processing machine, information stored on a record medium in response to an inquiry represented by an inquiry count wherein said information is stored on said record medium in index groups, each having index words representative of original documents and wherein a separate area is allotted on said medium for storage of abstracts, and said record medium storing an abstract count for each said index group, said abstract count being based on individual characters in an index group that comply with predefined character groups based on a probability distribution of said characters in a selected language;
said apparatus comprising:
1. sensing in said machine an abstract count stored on said record medium;
2. comparing in said machine said inquiry count with a sensed abstract count; and
3. providing from said machine an indication of a match or mismatch of said abstract and inquiry counts.
57. A method for generating in an information processing machine at least one abstract that is useful during searching and retrieval procedures involving information stored on a record medium, comprising:
1. sensing by said machine information signals representative of information comprising individual words in a selected language stored on said record medium, each word comprising one or more individual characters;
2. developing in said machine character signals representative of characters in said system;
3. categorizing in said machine selected ones of the individual character signals into predefined character groups that are based on a probability distribution of the characters represented by said signals in the language selected;
4. maintaining in said machine an abstract count of the number of character signals categorized into each of said predefined character groups; and
5. storing said count by said machine on said record medium as an abstract of said information.
60. categorizing in said machine selected ones of the individual character inquiry signals into predefined character groups that are based on a probability distribution of the characters represented by said character signals in the language selected;
8. maintaining in said machine an inquiry count of the number of character inquiry signals categorized into each of said predefined character groups;
9. sensing by said machine said abstract count stored on said record medium;
10. comparing by said machine said inquiry count with said sensed abstract count; and
11. providing from said machine indication signals representative of a match or mismatch of said abstract and inquiry count signals.
61. A method for generating in an information processing machine at least one abstract that is useful during searching and retrieval procedures involving information stored on a record medium, comprising:
1. providing by said machine information inquiry signals representative of information comprising at least an individual word in a selected language, said word comprising one or more individual characters;
2. developing character inquiry signals representative of selected ones only of said individual inquiry characters;
3. categorizing in said machine individual character inquiry signals into predefined character groups that are based on a probability distribution of the characters represented by said inquiry signals on said record medium in the language selected;
4. maintaining in said machine an inquiry count of the number of character inquiry signals categorized into each of said predefined character groups; and
5. referencing in said machine said inquiry count as an inquiry during searching of information stored on said record medium by said machine.
63. A method for generating in an information processing machine at least one abstract that is useful during information searching and retrieval procedures, comprising:
1. sensing by said machine information inquiry signals representative of information comprising at least an individual word in a selected language, said word comprising one or more individual characters;
2. developing in said machine character inquiry signals representative of selected ones of said individual inquiry characters;
3. categorizing in said machine individual character inquiry signals into predefined character groups that are based on a probability distribution of the characters represented by said inquiry signals in the language selected;
4. maintaining in said machine an inquiry count of the number of character inquiry signals categorized into each of said predefined character groups; and
5. referencing in said machine said inquiry count as an inquiry during searching of said information by said machine.
65. A method for searching and retrieving information in an information processing machine wherein said information is stored in index groups with individual characters categorized into predefined character groups and with each index group being represented by an abstract count that is based on a probability distribution of said characters in a selected language, comprising:
1. sensing by said machine abstract count signals representative of said stored abstract counts;
2. comparing in said machine an inquiry count signal with said abstract count signals; and
3. providing from said machine indication signals representative of a match or mismatch of said abstract and inquiry count signals.
67. A method for generating in an information processing machine at least one abstract that is useful during information searching and retrieval procedures, comprising:
1. sensing by said machine information signals representative of information comprising individual words in a selected language, each word comprising at least an individual character;
2. developing in said machine signals representative of characters in said system;
3. categorizing in said machine individual character signals into predefined character groups that are based on a probability distribution of the characters represented by said signals in the language selected;
4. maintaining an abstract count by said machine of the number of character signals categorized into each of said predefined character groups;
5. converting in said machine said abstract count to a compressed abstract count form; and
6. storing said compressed abstract count in said machine as an abstract of said information.
Specification