Personal information retrieval using knowledge bases for optical character recognition correction
First Claim
1. A system for updating a contacts database, the system comprising:
a portable imager configured to acquire a digital business card image;
an image segmenter configured to extract text image segments from the digital business card image;
an optical character recognizer (OCR) configured to generate one or more textual content candidates for each text image segment;
a scoring processor configured to assign scores to two or more textual content candidates generated by the OCR for an ambiguous text image segment based on results of database queries respective to the two or more textual content candidates, wherein the scoring processor comprises:
a local scoring processor component configured to compute a score for each of the two or more textual content candidates based on results of database queries respective to that textual content candidate; and
a global scores adjustment component configured to selectively adjust scores of two or more textual content candidates based on results of database queries respective to textual content candidates generated by the OCR for text image segments other than the ambiguous text image segment;
a content selector that selects one of the two or more textual content candidates based at least on the assigned scores; and
an interface configured to update the contacts database based on the selected one of the two or more textual content candidates.
Abstract
In a system for updating a contacts database (42, 46), a portable imager (12) acquires a digital business card image (10). An image segmenter (16) extracts text image segments from the digital business card image. An optical character recognizer (OCR) (26) generates one or more textual content candidates for each text image segment. A scoring processor (36) scores each textual content candidate based on results of database queries respective to the textual content candidates. A content selector (38) selects a textual content candidate for each text image segment based at least on the assigned scores. An interface (50) is configured to update the contacts list based on the selected textual content candidates.
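The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Segment` class, the `query_hits` helper, the in-memory `database`, and the use of a plain hit count as the score are all assumptions made for illustration, and image acquisition, segmentation, and OCR are replaced by hard-coded candidates.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One text image segment with its OCR candidates (hypothetical structure)."""
    name: str
    candidates: list
    scores: dict = field(default_factory=dict)


def query_hits(database, candidate):
    """Count records with a field equal to the candidate (stand-in for a database query)."""
    return sum(candidate in record.values() for record in database)


def score_and_select(segments, database):
    """Score each candidate by its query hits, then keep the top candidate per segment."""
    selected = {}
    for seg in segments:
        seg.scores = {c: float(query_hits(database, c)) for c in seg.candidates}
        selected[seg.name] = max(seg.candidates, key=lambda c: seg.scores[c])
    return selected


# Assumed example data: a tiny knowledge base and OCR output with misread variants.
database = [
    {"name": "John Smith", "company": "Acme Corp"},
    {"name": "Jane Doe", "company": "Acme Corp"},
]
segments = [
    Segment("name", ["John Smith", "Iohn Smith"]),
    Segment("company", ["Acme Corp", "Acrne Corp"]),
]

contact = score_and_select(segments, database)
print(contact)  # the selected candidates would then be written to the contacts list
```

In this sketch a misread candidate such as "Iohn Smith" returns no records and therefore loses to the candidate that the database confirms.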
15 Claims
1. A system for updating a contacts database, the system comprising:
a portable imager configured to acquire a digital business card image;
an image segmenter configured to extract text image segments from the digital business card image;
an optical character recognizer (OCR) configured to generate one or more textual content candidates for each text image segment;
a scoring processor configured to assign scores to two or more textual content candidates generated by the OCR for an ambiguous text image segment based on results of database queries respective to the two or more textual content candidates, wherein the scoring processor comprises:
a local scoring processor component configured to compute a score for each of the two or more textual content candidates based on results of database queries respective to that textual content candidate; and
a global scores adjustment component configured to selectively adjust scores of two or more textual content candidates based on results of database queries respective to textual content candidates generated by the OCR for text image segments other than the ambiguous text image segment;
a content selector that selects one of the two or more textual content candidates based at least on the assigned scores; and
an interface configured to update the contacts database based on the selected one of the two or more textual content candidates.
7. A method for acquiring personal information, the method comprising:
acquiring a business card image wherein the acquiring of the business card image comprises photographing the business card using a built in camera of a cellular telephone;
extracting text image segments including an ambiguous text image segment from the business card image;
applying optical character recognition (OCR) to the text image segments including the ambiguous text image segment to generate textual content candidates for the text image segments including two or more textual content candidates for the ambiguous text image segment;
querying at least one database respective to each of the textual content candidates;
assigning scores to the two or more textual content candidates generated by the OCR for the ambiguous text image segment by:
computing a score for each of the two or more textual content candidates generated by the OCR for the ambiguous text image segment based on results of database queries respective to that textual content candidate; and
performing a global adjustment of the scores of the two or more textual content candidates generated by the OCR for the ambiguous text image segment based on results of database queries respective to textual content candidates generated by the OCR for text image segments other than the ambiguous text image segment;
selecting a most likely one of the two or more textual content candidates generated by the OCR for the ambiguous text image segment based at least on the assigned scores;
wherein the extracting, applying, querying, assigning, and selecting are performed by a processor.
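The two-stage scoring recited in claim 7 (a per-candidate score followed by a global adjustment driven by queries for the other segments) might be sketched as follows. The claim does not fix any scoring formula, so the record count used as the local score, the additive `bonus`, and the helper names `query`, `local_scores`, and `global_adjust` are assumptions chosen only to make the idea concrete.

```python
def query(database, candidate):
    """Return records with a field equal to the candidate (stand-in for a real database query)."""
    return [record for record in database if candidate in record.values()]


def local_scores(database, candidates):
    """Local score (assumed): number of records returned for each candidate's own query."""
    return {c: float(len(query(database, c))) for c in candidates}


def global_adjust(database, scores, other_candidates, bonus=1.0):
    """Boost a candidate whenever a record returned for another segment's candidate contains it."""
    adjusted = dict(scores)
    for other in other_candidates:
        for record in query(database, other):
            for candidate in adjusted:
                if candidate in record.values():
                    adjusted[candidate] += bonus
    return adjusted


# Assumed example: both name readings exist in the database, so local scores tie.
database = [
    {"name": "John Smith", "company": "Acme Corp"},
    {"name": "John Smyth", "company": "Globex"},
]
ambiguous = ["John Smith", "John Smyth"]   # candidates for the ambiguous segment
others = ["Acme Corp"]                      # candidate from a different segment

local = local_scores(database, ambiguous)
adjusted = global_adjust(database, local, others)
best = max(adjusted, key=adjusted.get)
print(local, adjusted, best)
```

Here the local scores alone cannot separate the two names; the record returned for the company segment contains only one of them, which breaks the tie.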
10. A system for generating a textual contact record from a business card image, the system comprising:
a cellular telephone including a built-in camera configured to (i) acquire an image of a business card using the built-in camera and (ii) extract textual content candidates from the image of the business card by optical character recognition (OCR) and (iii) extract at least one logo image segment from the image of the business card;
a content candidates scoring processor that queries at least one database respective to the textual content candidates and collects records returned responsive to the queries and assigns scores to the textual content candidates based on the collected records, the content candidates scoring processor including a local scoring processor that assigns a score to each textual content candidate based on records returned by queries respective to the textual content candidate;
wherein the content candidates scoring processor also image queries at least one image database respective to the logo image segment and collects textual metadata returned responsive to the image query, the content candidates scoring processor further including a global adjuster that modifies the score of a selected textual content candidate when the collected textual metadata includes the selected textual content candidate; and
a content selector that selects a textual content candidate for each text image segment based at least on the assigned scores.
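The logo-based global adjuster of claim 10 can be illustrated in the same hedged way. The image query is not specified in the claim, so it is stubbed here as a dictionary lookup keyed by a hypothetical logo identifier; `lookup_logo_metadata`, `adjust_with_logo_metadata`, the case-insensitive match, and the additive `bonus` are all assumptions of this sketch.

```python
def lookup_logo_metadata(image_database, logo_key):
    """Stub for an image query: return textual metadata stored for a logo (assumed lookup)."""
    return image_database.get(logo_key, [])


def adjust_with_logo_metadata(scores, logo_metadata, bonus=1.0):
    """Raise the score of each candidate that appears in the logo's textual metadata."""
    terms = {term.casefold() for term in logo_metadata}
    return {
        candidate: score + bonus if candidate.casefold() in terms else score
        for candidate, score in scores.items()
    }


# Assumed example: the logo segment resolves to metadata naming the company.
image_database = {"logo-123": ["Acme Corp", "acme.example"]}
scores = {"Acme Corp": 1.0, "Acrne Corp": 1.0}   # tied local scores for the company segment

metadata = lookup_logo_metadata(image_database, "logo-123")
adjusted = adjust_with_logo_metadata(scores, metadata)
best = max(adjusted, key=adjusted.get)
print(adjusted, best)
```

A real system would replace the dictionary lookup with an actual image-matching query; the sketch only shows how returned metadata could feed back into the candidate scores.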
Specification