Information searching system
Abstract
A method and computer system for searching a collection of documents. The method first identifies text in a document and then identifies an image in the document. Features of the image are then extracted, and image terms are identified from those features. Finally, the collection of documents is searched using the image terms in conjunction with text terms.
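The step of identifying image terms from extracted features can be illustrated with a minimal sketch, assuming a nearest-neighbor comparison by Euclidean distance. The dictionary contents, the term names (`imgterm_0` and so on), the three-dimensional descriptors, and the function names are all hypothetical; the source states only that descriptor vectors for the features are compared to descriptor vectors in a dictionary, not which distance measure or descriptor type is used.

```python
import math

# Hypothetical dictionary: each image term maps to a reference descriptor vector.
# Real feature descriptors would typically have far more dimensions.
DICTIONARY = {
    "imgterm_0": [0.0, 0.0, 1.0],
    "imgterm_1": [1.0, 0.0, 0.0],
    "imgterm_2": [0.0, 1.0, 0.0],
}


def nearest_image_term(descriptor, dictionary=DICTIONARY):
    """Return the term whose reference vector is closest to the descriptor.

    Euclidean distance is an assumption made for this sketch.
    """
    return min(dictionary, key=lambda term: math.dist(descriptor, dictionary[term]))


def image_to_terms(descriptors, dictionary=DICTIONARY):
    """Transform an image (given as its feature descriptors) into image terms."""
    return [nearest_image_term(d, dictionary) for d in descriptors]


# Made-up descriptors standing in for features extracted from one image.
descriptors = [[0.9, 0.1, 0.0], [0.1, 0.0, 0.8], [0.8, 0.2, 0.1]]
image_terms = image_to_terms(descriptors)
print(image_terms)  # ['imgterm_1', 'imgterm_0', 'imgterm_1']
```

Because each feature is reduced to a plain token, the image can afterwards be handled like a short piece of text.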
24 Claims
1. A computer-implemented method for searching a collection of documents comprising:
identifying, by one or more processors, text in a document and text terms from the text;
identifying, by one or more processors, an image in the document;
extracting, by one or more processors, features in the image in the document, wherein the features are described using descriptor vectors;
identifying, by one or more processors, image terms for the features by comparing descriptor vectors for the features to descriptor vectors representing features in a dictionary;
transforming, by one or more processors, the image into image terms;
combining, by one or more processors, the text terms and the image terms to form search terms; and
searching, by one or more processors, the collection of documents for images in conjunction with text using text-only searching processes.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 22
11. A computer system for searching a collection of documents, the computer system comprising:
a bus system;
a storage device connected to the bus system, wherein the storage device stores program instructions;
a processor unit; and
a search engine, running on the processor unit, wherein the search engine:
identifies text in a document and text terms from the text;
identifies an image in the document;
extracts features in an image in the document, wherein the features are described using descriptor vectors;
identifies image terms for the features by comparing descriptor vectors for the features to descriptor vectors representing features in a dictionary;
transforms the image into image terms;
combines the text terms and the image terms to form search terms; and
searches a collection of documents utilizing the search terms to search for images in conjunction with text using text-only searching processes.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 23
19. A computer program product for searching a collection of documents, the computer program product comprising:
a non-transitory computer-readable storage media;
a first program code, stored on the computer-readable storage media, for identifying text in a document and text terms from the text;
a second program code, stored on the computer-readable storage media, for identifying an image in the document;
a third program code, stored on the computer-readable storage media, for extracting features in the image in the document, wherein the features are described using descriptor vectors;
a fourth program code, stored on the computer-readable storage media, for identifying image terms for the features by comparing descriptor vectors for the features to descriptor vectors representing features in a dictionary;
a fifth program code, stored on the computer-readable storage media, for transforming the image into image terms;
a sixth program code, stored on the computer-readable storage media, for combining, by one or more processors, the text terms and the image terms to form search terms; and
a seventh program code, stored on the computer-readable storage media, for searching the collection of documents utilizing the search terms to search for images in conjunction with text using text-only searching processes.
Dependent claims: 20, 21, 24
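
The combining and searching steps recited in the independent claims can be sketched as follows, assuming a simple inverted index with match-count ranking as the text-only searching process. The sample documents, the image term tokens, and all function names are hypothetical and chosen only for illustration; the claims do not specify an index structure or a ranking scheme.

```python
import re
from collections import defaultdict


def text_terms(text):
    """Split text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search_terms(text, image_terms):
    """Combine text terms and image terms into one list of search terms."""
    return text_terms(text) + list(image_terms)


def build_index(documents):
    """Build an inverted index: term -> set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, (text, image_terms) in documents.items():
        for term in search_terms(text, image_terms):
            index[term].add(doc_id)
    return index


def search(index, query_terms):
    """Rank documents by how many query terms they match (ties broken by id).

    Image terms are looked up exactly like words, so no image-specific
    search logic is needed at query time.
    """
    scores = defaultdict(int)
    for term in query_terms:
        for doc_id in index.get(term, ()):
            scores[doc_id] += 1
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical collection: each document is (text, image terms of its image).
documents = {
    "doc_a": ("Wing flap actuator maintenance", ["imgterm_1", "imgterm_0"]),
    "doc_b": ("Landing gear inspection", ["imgterm_2"]),
    "doc_c": ("Actuator wiring diagram", ["imgterm_1"]),
}
index = build_index(documents)
query = search_terms("actuator", ["imgterm_1", "imgterm_0"])
results = search(index, query)
print(results)  # ['doc_a', 'doc_c']
```

In this sketch a document that matches both the query text and the query image terms ranks above one that matches only some of them, which is one plausible reading of searching for images in conjunction with text.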
Specification