Document retrieval system for displaying document image data with inputted bibliographic items and character string selected from multiple character candidates
First Claim
1. A document storage and retrieval system comprising:
image file means for storing documents which are converted into digital document image data by a photo-electric conversion means and compression processed by an image processor;
document recognition means, coupled to said image file means, for recognizing said documents and for generating full text data of said documents, said full text data including character code strings identified as a result of recognition of characters in said document, wherein said document recognition means outputs multiple candidates of character codes for a character not identified as a result of character recognition and stores said multiple candidates of character codes between predetermined special character codes in said character code strings between correctly identified characters at the location of said character not identified;
text file means, coupled to said document recognition means, for storing said full text data;
data base file means for storing bibliographic items and information which identifies said document image data stored by said image file means and said full text data of said documents stored by said text file means thereby correlating said bibliographic items to said information, said bibliographic items each including a title, an author's name or classification of a document; and
retrieval means, coupled to said data base file means, said text file means, and said image file means, for searching whether a bibliographic item and a character code string input as a request for text content by an operator exists in said full text data of said documents stored by said text file means for outputting document image data corresponding to a document including said bibliographic item and said character code string requested by said operator, wherein said retrieval means searches said multiple candidates of character codes of each said character not identified in said full text data to locate a character code from said input character code string among said multiple candidates of character codes.
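The candidate-aware search recited in the last clause of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the delimiter values `OPEN` and `CLOSE` and the function names `parse_full_text` and `contains` are assumptions made for illustration, since the claim only speaks of "predetermined special character codes". The sketch treats every stored position as a set of acceptable characters, so an unidentified character matches the query if any of its candidates does.

```python
# Hypothetical delimiters; the claim only says "predetermined special character codes".
OPEN, CLOSE = "\x02", "\x03"


def parse_full_text(text):
    """Split stored full text into one candidate set per character position."""
    positions = []
    i = 0
    while i < len(text):
        if text[i] == OPEN:
            # Everything up to the closing code is a candidate for this position.
            end = text.index(CLOSE, i + 1)
            positions.append(set(text[i + 1:end]))
            i = end + 1
        else:
            # A correctly identified character is its own single candidate.
            positions.append({text[i]})
            i += 1
    return positions


def contains(text, query):
    """Return True if the query matches at some position, trying every candidate."""
    positions = parse_full_text(text)
    n, m = len(positions), len(query)
    for start in range(n - m + 1):
        if all(query[k] in positions[start + k] for k in range(m)):
            return True
    return False


# "c", then an unidentified character with candidates "a" and "o", then "t".
stored = "c" + OPEN + "ao" + CLOSE + "t"
print(contains(stored, "cat"), contains(stored, "cot"), contains(stored, "cut"))
```

With this representation a query for either "cat" or "cot" finds the document, while "cut" does not, which is the behavior the claim describes for a character the recognizer could not settle on.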
Abstract
A document storage and retrieval system comprising means for storing a document body in the form of an image, means for storing text information in the form of a character code string for retrieval, apparatus for executing a retrieval with reference to the text information, and apparatus for displaying the related document image on a retrieval terminal according to the retrieval result. Such a system can search the full contents of a document and display the document body as an image, in its original easy-to-read printed format. Users can retrieve documents with arbitrary words and can read, through a terminal, even documents complicated by mathematical expressions and charts in image form, just as on paper. The text information for retrieval is extracted automatically from the document image through character recognition. Because the precision of character recognition has hitherto been unsatisfactory, operators have always had to visually check and correct the recognition results. In this system, operators need not attend to that task.
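The division of labor described in the abstract (text is searched, the image is displayed) can be sketched as follows. This is a simplified illustration under stated assumptions: the `Record` fields, the dictionary-based stores, and the `retrieve` function are hypothetical names, and plain substring matching stands in for the system's handling of recognition candidates.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical database entry tying bibliographic items to stored files."""
    title: str
    author: str
    classification: str
    image_id: str  # key into the image store
    text_id: str   # key into the full-text store


def retrieve(records, text_file, image_file, biblio, query):
    """Return image data for records matching the bibliographic items and the query."""
    hits = []
    for rec in records:
        # Every requested bibliographic field must match exactly (an assumption).
        biblio_ok = all(getattr(rec, field) == value for field, value in biblio.items())
        # Search runs on the text; the image is only fetched for output.
        if biblio_ok and query in text_file[rec.text_id]:
            hits.append(image_file[rec.image_id])
    return hits


# Toy data, invented purely for the example.
text_file = {"t1": "integral of a rational function", "t2": "notes on rational design"}
image_file = {"i1": b"<image bytes 1>", "i2": b"<image bytes 2>"}
records = [
    Record("Calculus", "Sato", "math", "i1", "t1"),
    Record("Design", "Kato", "engineering", "i2", "t2"),
]
result = retrieve(records, text_file, image_file, {"classification": "math"}, "rational")
print(result)
```

The point of the design is that the reader always sees the original page image, so mathematical expressions and charts never have to survive character recognition; only the search index depends on it.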
71 Citations
4 Claims
4. A document storage and retrieval system comprising:
image file means for storing documents which are converted into digital document image data by a photo-electrical conversion means and compression processed by an image processor;
document recognition means, coupled to said image file means, for recognizing said documents and for generating full text data of said documents, said full text data including character code strings identified as a result of recognition of characters in said documents, wherein said document recognition means outputs multiple candidates of character codes for a character not identified as a result of character recognition and stores said multiple candidates of character codes between predetermined special character codes in said character code strings between correctly identified characters at the location of said character not identified;
text file means, coupled to said document recognition means, for storing full text data of said documents, said full text data including character code string representative of characters which exist in said documents as character codes, wherein said full text data is used for retrieving and said document image data is used for outputting;
data base file means for storing bibliographic items and information which identifies said document image data stored by said image file means and said full text data of said documents stored by said text file means thereby correlating said bibliographic items to said information, wherein said bibliographic items each include a title, an author's name or classification of a document; and
retrieval means, coupled to said data base file means, said text file means, and said image file means, for searching whether a bibliographic item and a character code string input as a request for text content by an operator exists in said full text data of said documents stored by said text file means and for outputting document image data corresponding to a document including said bibliographic item and said character code string requested by said operator, wherein said retrieval means searches said multiple candidates of character codes of each said character not identified in said full text data to locate a character code from said input character code string among said multiple candidates of character codes.
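On the storage side, the document recognition means writes the candidates for an unidentified character inline, bracketed by special codes. The sketch below shows one way that could look. The confidence threshold used to decide that a character is "not identified", the delimiter values, and the name `encode_recognition` are all assumptions; the claim does not say how the recognizer makes that decision.

```python
# Hypothetical delimiters standing in for the "predetermined special character codes".
OPEN, CLOSE = "\x02", "\x03"


def encode_recognition(results, threshold=0.8):
    """Build full text from per-character candidate lists.

    results: one list per character, each a list of (char, score) pairs sorted
    best first. The score and threshold are assumed for illustration only.
    """
    out = []
    for candidates in results:
        best_char, best_score = candidates[0]
        if best_score >= threshold:
            # Treated as correctly identified: store the single character code.
            out.append(best_char)
        else:
            # Not identified: store every candidate between the special codes.
            out.append(OPEN + "".join(ch for ch, _ in candidates) + CLOSE)
    return "".join(out)


recognized = [
    [("c", 0.99)],
    [("a", 0.50), ("o", 0.40)],  # ambiguous character
    [("t", 0.95)],
]
encoded = encode_recognition(recognized)
print(repr(encoded))
```

Because the ambiguity is preserved in the stored text rather than resolved by an operator, the decision is deferred to search time, where the query itself selects the matching candidate.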
Specification