Processing an electronic document for information extraction
First Claim
1. A method of automatically processing an electronic document for routing over a computer network, comprising:
recognizing text in the document to identify a candidate address indicative of a portion of the document containing a potential destination;
accessing a collection of potential destinations;
computing a first weighted score and a contiguous weighted score for at least one potential destination, wherein the first weighted score is indicative of a difference between a word in the candidate address and a word associated with the at least one potential destination, and wherein the contiguous weighted score is indicative of a difference between contiguous words in the candidate address and words associated with the at least one potential destination;
computing a combined score for the at least one potential destination based on the first weighted score and the contiguous weighted score;
determining a recipient from the collection of potential destinations based on the combined score; and
routing the document to the recipient.
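The scoring steps recited in this claim can be illustrated with a minimal Python sketch. The claim does not specify how the "difference" between words is measured or how the two scores are weighted and combined, so everything concrete here is an assumption: the use of `difflib.SequenceMatcher` as the difference measure, the weights `1.0` and `2.0`, the simple sum used as the combined score, and the hypothetical function names. Lower scores mean a closer match.

```python
from difflib import SequenceMatcher


def word_difference(a: str, b: str) -> float:
    """Difference in [0, 1]; 0.0 means identical (assumed measure, not from the claim)."""
    return 1.0 - SequenceMatcher(None, a.lower(), b.lower()).ratio()


def _average_best_difference(candidate_units, destination_units, weight):
    """For each candidate unit, take its smallest difference to any destination unit."""
    if not candidate_units or not destination_units:
        return weight  # nothing to compare: treat as maximally different
    total = sum(
        min(word_difference(c, d) for d in destination_units)
        for c in candidate_units
    )
    return weight * total / len(candidate_units)


def first_weighted_score(candidate: list[str], destination: list[str],
                         weight: float = 1.0) -> float:
    """Word-by-word difference between the candidate address and a destination."""
    return _average_best_difference(candidate, destination, weight)


def contiguous_weighted_score(candidate: list[str], destination: list[str],
                              weight: float = 2.0) -> float:
    """Same comparison, but on pairs of adjacent (contiguous) words."""
    cand_pairs = [" ".join(p) for p in zip(candidate, candidate[1:])]
    dest_pairs = [" ".join(p) for p in zip(destination, destination[1:])]
    return _average_best_difference(cand_pairs, dest_pairs, weight)


def combined_score(candidate: list[str], destination: list[str]) -> float:
    """Assumed combination: a plain sum of the two weighted scores."""
    return (first_weighted_score(candidate, destination)
            + contiguous_weighted_score(candidate, destination))


def determine_recipient(candidate_address: str, destinations: dict[str, str]) -> str:
    """Pick the destination whose associated words differ least from the candidate."""
    words = candidate_address.split()
    return min(destinations,
               key=lambda r: combined_score(words, destinations[r].split()))
```

With a hypothetical directory such as `{"jsmith": "John Smith Accounting", "mjones": "Mary Jones Legal"}`, a misrecognized candidate like "Jahn Smith" still resolves to `jsmith`, because both the single-word and the adjacent-pair comparisons stay close to "John Smith".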
Abstract
The present invention relates to a method of automatically processing an electronic document for routing over a computer network. The method includes recognizing text in the document to identify a candidate address, accessing a collection of potential destinations and comparing the candidate address to the collection of potential destinations to determine a destination for the document.
18 Citations
19 Claims
1. A method of automatically processing an electronic document for routing over a computer network, comprising:
recognizing text in the document to identify a candidate address indicative of a portion of the document containing a potential destination;
accessing a collection of potential destinations;
computing a first weighted score and a contiguous weighted score for at least one potential destination, wherein the first weighted score is indicative of a difference between a word in the candidate address and a word associated with the at least one potential destination, and wherein the contiguous weighted score is indicative of a difference between contiguous words in the candidate address and words associated with the at least one potential destination;
computing a combined score for the at least one potential destination based on the first weighted score and the contiguous weighted score;
determining a recipient from the collection of potential destinations based on the combined score; and
routing the document to the recipient.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A computer storage medium having instructions that, when implemented on a computer, cause the computer to process a document for routing to a selected address, the instructions comprising:
a recognition module adapted to recognize words within the document, the recognized words being indicative of a portion of the document containing a potential destination;
an identification module adapted to identify features of the recognized words within the document and assign a relevance score to each recognized word based on the features, wherein the features relate to at least one of a location of a word in the text, a distance from a word in the text to a nearby word in the text and text of a word, and wherein the identification module is further adapted to select relevant word candidates from the recognized words based on the relevance score;
a comparison module adapted to compare the relevant word candidates to words associated with a collection of addresses and determine a recipient based on the text and the features; and
a routing module adapted to route the document to the recipient.
Dependent claims: 10, 11, 12
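The identification module of claim 9 can be pictured with a short Python sketch. The claim names three feature families (location of a word, distance to a nearby word, and the text of the word) but gives no formulas, so the cue vocabulary, the feature definitions, the weights, the threshold and all names below are illustrative assumptions rather than the patented implementation.

```python
from dataclasses import dataclass

# Assumed cue vocabulary; the claim only speaks of "a nearby word in the text".
CUE_WORDS = {"to", "attn", "attention", "recipient"}


@dataclass
class RecognizedWord:
    text: str
    position: int  # index of the word in reading order


def is_cue(word: RecognizedWord) -> bool:
    return word.text.lower().strip(":") in CUE_WORDS


def relevance_score(word: RecognizedWord, words: list[RecognizedWord]) -> float:
    """Combine the three feature families with assumed weights."""
    # Location feature: earlier words score slightly higher.
    location = 1.0 / (1 + word.position)
    # Distance feature: closeness to the nearest cue word, within a 3-word window.
    cue_positions = [w.position for w in words if is_cue(w)]
    if cue_positions:
        distance = min(abs(word.position - p) for p in cue_positions)
        proximity = max(0.0, 1.0 - distance / 3)
    else:
        proximity = 0.0
    # Text feature: capitalized alphabetic tokens look like names.
    looks_like_name = 1.0 if word.text.istitle() and word.text.isalpha() else 0.0
    return 0.1 * location + 0.6 * proximity + 0.3 * looks_like_name


def select_candidates(words: list[RecognizedWord],
                      threshold: float = 0.5) -> list[RecognizedWord]:
    """Keep words whose relevance score clears the (assumed) threshold."""
    return [w for w in words
            if not is_cue(w) and relevance_score(w, words) >= threshold]
```

For recognized text such as "Fax cover sheet To: John Smith Re: invoice", this sketch keeps "John" and "Smith" as relevant word candidates; a comparison module would then match them against the words associated with the collection of addresses.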
13. A method of processing an electronic document for routing over a computer network, comprising:
recognizing text in the document to identify a plurality of candidate addresses, wherein each of the plurality of candidate addresses is indicative of a portion of the document containing a potential destination;
accessing a collection of potential destinations;
identifying features associated with the plurality of candidate addresses;
computing weighted relevance scores for the plurality of candidate addresses, wherein the scores are weighted based on the identified features;
generating a list of the candidate addresses;
sorting the list of candidate addresses based on the weighted relevance scores;
determining a recipient from the collection of potential destinations based on the list; and
routing the document to the recipient.
Dependent claims: 14, 15, 16, 17, 18, 19
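The list-building and sorting steps of claim 13 can be sketched in Python as follows. The feature names, their weights, and the rule of taking the highest-ranked candidate that appears in the collection are assumptions made for illustration; the claim only requires that scores be weighted by identified features and that the recipient be determined from the sorted list.

```python
from typing import Optional

# Assumed feature names and weights; the claim does not enumerate them.
FEATURE_WEIGHTS = {"near_cue_word": 0.5, "top_of_page": 0.2, "looks_like_name": 0.3}


def weighted_relevance(features: dict[str, float]) -> float:
    """Weighted sum of the identified feature values for one candidate address."""
    return sum(FEATURE_WEIGHTS.get(name, 0.0) * value
               for name, value in features.items())


def rank_candidates(candidates: dict[str, dict[str, float]]) -> list[str]:
    """Generate the list of candidate addresses, sorted by descending relevance."""
    return sorted(candidates,
                  key=lambda c: weighted_relevance(candidates[c]),
                  reverse=True)


def first_matching_destination(ranked: list[str],
                               destinations: dict[str, str]) -> Optional[str]:
    """Walk the sorted list and return the first candidate found in the collection."""
    lookup = {name.lower(): address for name, address in destinations.items()}
    for candidate in ranked:
        if candidate.lower() in lookup:
            return lookup[candidate.lower()]
    return None  # no candidate matched any known destination
```

Exact, case-insensitive lookup is used here only to keep the sketch short; a fuzzier comparison against the collection of potential destinations could be substituted without changing the ranking step.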
Specification