Document retrieving method and apparatus
Abstract
In the proposed document retrieving apparatus, text feature data based on text data included in a document and image feature data based on an image of the document are stored in a memory. Image data of a search document is subjected to character recognition processing, text feature data is acquired from the resulting text data, and image feature data (layout data) is acquired from the image data of the search document. The memory is then searched using the text feature data and image feature data acquired for the search document, and a document corresponding to the search document is retrieved from among the plural stored documents.
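The store-then-search flow summarized in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the names `DocumentStore`, `cosine` and `layout_similarity`, the use of term counts as text feature data, a fixed-length numeric vector as layout feature data, and an equally weighted sum of the two similarities. Character recognition is assumed to have already produced the text passed in.

```python
from collections import Counter
from dataclasses import dataclass
import math


@dataclass
class Entry:
    doc_id: str
    text_features: Counter   # term -> count, built from recognized text
    layout_features: list    # fixed-length descriptor, values in [0, 1]


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def layout_similarity(a, b):
    """One minus the mean absolute difference of two equal-length descriptors."""
    if not a or len(a) != len(b):
        return 0.0
    return 1.0 - sum(abs(x - y) for x, y in zip(a, b)) / len(a)


class DocumentStore:
    """Hypothetical in-memory stand-in for the memory described in the abstract."""

    def __init__(self, text_weight=0.5):
        self.entries = []
        self.text_weight = text_weight  # assumed weighting, not from the patent

    def add(self, doc_id, recognized_text, layout_features):
        """Store both kinds of feature data in association with the document."""
        terms = Counter(recognized_text.lower().split())
        self.entries.append(Entry(doc_id, terms, list(layout_features)))

    def search(self, recognized_text, layout_features):
        """Rank stored documents by combined text and layout similarity."""
        query_terms = Counter(recognized_text.lower().split())

        def score(entry):
            text_part = cosine(query_terms, entry.text_features)
            layout_part = layout_similarity(layout_features, entry.layout_features)
            return (self.text_weight * text_part
                    + (1 - self.text_weight) * layout_part)

        return sorted(((score(e), e.doc_id) for e in self.entries), reverse=True)


store = DocumentStore()
store.add("invoice", "invoice total amount due", [1.0, 0.0, 0.5, 0.5])
store.add("memo", "meeting notes agenda", [0.0, 1.0, 0.0, 1.0])

# Features of the search document, assumed to come from the same extraction steps.
results = store.search("invoice amount due", [1.0, 0.0, 0.5, 0.25])
best_match = results[0][1]
```

In this sketch the stored "invoice" entry ranks first because it is closer to the search document in both vocabulary and layout descriptor. A real system could combine or sequence the two comparisons differently; the weighted sum is only one possible choice.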
65 Citations
24 Claims
1. A document retrieving method comprising:
a first acquisition step of acquiring text data based on image data of a document and acquiring text feature data based on the acquired text data;
a second acquisition step of acquiring layout feature data based on the image data of the document;
a storing step of storing in storage means the text feature data and the layout feature data, respectively acquired in said first and second acquisition steps, in association with the document; and
a retrieving step of retrieving a document stored in the storage means using the text feature data and the layout feature data acquired by executing said first and second acquisition steps on a search document.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 23, 24
12. A document retrieving apparatus comprising:
a first acquisition unit configured to acquire text data based on image data of a document and acquire text feature data based on the acquired text data;
a second acquisition unit configured to acquire layout feature data based on the image data of the document;
a storage unit configured to store the text feature data and the layout feature data, respectively acquired by said first and second acquisition units, in association with the document; and
a retrieving unit configured to retrieve a document stored in said storage unit using the text feature data and the layout feature data acquired by executing said first and second acquisition units on a search document.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20, 21, 22
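As an illustration of the two acquisition operations recited in the independent claims, the following Python sketch derives one possible text feature and one possible layout feature from a page. The specific features are assumptions chosen for brevity, not the ones defined in the specification: text feature data is taken to be term counts over already-recognized text, and layout feature data is taken to be the fraction of ink pixels in each cell of a coarse grid over a binary page image. The function names `text_features` and `layout_features` are hypothetical.

```python
from collections import Counter


def text_features(recognized_text):
    """Term counts from text already produced by character recognition."""
    words = (w.strip(".,;:()").lower() for w in recognized_text.split())
    return Counter(w for w in words if w)


def layout_features(page, rows=2, cols=2):
    """Fraction of ink pixels in each cell of a rows x cols grid.

    `page` is a binary image given as a list of equal-length rows,
    where 1 marks ink and 0 marks background (an assumed representation).
    """
    height, width = len(page), len(page[0])
    features = []
    for r in range(rows):
        y0, y1 = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            x0, x1 = c * width // cols, (c + 1) * width // cols
            cell = [page[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            features.append(sum(cell) / len(cell) if cell else 0.0)
    return features


# A 4x4 toy page: a dense block at top left, a thin column at bottom right.
page = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]

layout = layout_features(page)                       # [1.0, 0.0, 0.0, 0.5]
terms = text_features("Invoice total: invoice due.")
```

Both feature sets would then be stored in association with the document, and the same two functions would be applied to a search document before comparison.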
Specification