Automatic method of identifying drop words in a document image without performing character recognition

  • US 5,850,476 A
  • Filed: 12/14/1995
  • Issued: 12/15/1998
  • Est. Priority Date: 12/14/1995
  • Status: Expired due to Term
First Claim

1. A method of identifying drop words in a document image without performing character recognition, the document image including a first multiplicity of sentences and a second multiplicity of word occurrences, a processor implementing the method by executing instructions stored in electronic form in a memory coupled to the processor, the method comprising the steps of:

    a) analyzing the document image to identify word equivalence classes, each word equivalence class including at least one word occurrence of the second multiplicity of word occurrences;

    b) for each word equivalence class determining the likelihood that word equivalence class is a drop word;

    c) designating a number of the word equivalence classes as drop words based upon the likelihood that the word equivalence classes are drop words.
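
Steps a) through c) can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not say how word images are matched or how the likelihood is computed, so everything specific here is an assumption. The `WordOccurrence` type, its `shape_key` field (a stand-in for whatever image-derived signature groups similar-looking word images), and the scoring heuristic (frequent, narrow words score higher) are all hypothetical and chosen only to make the three steps concrete.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class WordOccurrence:
    """One word image found in the page (hypothetical representation)."""
    shape_key: tuple      # assumed image-derived signature; no characters are recognized
    width_px: int         # width of the word's bounding box in pixels
    sentence_index: int   # sentence in which the word occurs


def group_into_classes(occurrences):
    """Step a): group occurrences whose signatures match into equivalence classes."""
    classes = defaultdict(list)
    for occ in occurrences:
        classes[occ.shape_key].append(occ)
    return dict(classes)


def drop_word_likelihood(members, total_occurrences, max_width):
    """Step b): assumed heuristic score, higher for frequent and narrow words."""
    frequency = len(members) / total_occurrences
    mean_width = sum(m.width_px for m in members) / len(members)
    shortness = 1.0 - mean_width / (max_width + 1)
    return frequency * shortness


def designate_drop_words(occurrences, count):
    """Step c): return the `count` classes with the highest likelihood score."""
    if not occurrences:
        return []
    classes = group_into_classes(occurrences)
    total = len(occurrences)
    max_width = max(o.width_px for o in occurrences)
    scores = {
        key: drop_word_likelihood(members, total, max_width)
        for key, members in classes.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:count]


# Toy input: signature "a" is frequent and narrow, "b" is rare and wide.
sample = (
    [WordOccurrence(("a",), 20, i) for i in range(5)]
    + [WordOccurrence(("b",), 80, 0)]
    + [WordOccurrence(("c",), 30, i) for i in range(2)]
)
drop_classes = designate_drop_words(sample, count=2)
```

In this toy input, the class with signature `("a",)` ranks first because it is both the most frequent and the narrowest. A real system would derive signatures and likelihoods from the image itself, which this sketch leaves out.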
