Multiple hypothesis testing for word detection

  • US 9,152,871 B2
  • Filed: 05/02/2014
  • Issued: 10/06/2015
  • Est. Priority Date: 09/02/2013
  • Status: Expired due to Fees
First Claim

1. A processor implemented method for determining words in a character sequence output during Optical Character Recognition (OCR), the method comprising:

    determining a set of one or more bifurcation points for the character sequence, wherein each bifurcation point identifies a location to split the character sequence into two or more words and wherein the one or more bifurcation points are determined based on a separation between adjacent characters in the character sequence;

    generating a plurality of hypotheses, each hypothesis comprising one or more words formed by the character sequence, at least one of the hypotheses being generated based on the one or more bifurcation points;

    computing a plurality of normalized scores, each normalized score corresponding to a hypothesis, wherein the normalized score for a corresponding hypothesis is based, in part, on a length of each word in a set of the one or more words associated with the corresponding hypothesis; and

    selecting a hypothesis from the plurality of hypotheses based on a corresponding normalized score associated with the selected hypothesis.
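
The four steps of the claim can be illustrated with a minimal sketch. It is not the patented implementation: the gap threshold, the lexicon lookup, the length-weighted scoring rule, and every function name below are assumptions chosen for illustration only. The claim itself requires only that the normalized score depend in part on word length.

```python
from itertools import combinations


def bifurcation_points(gaps, threshold):
    """Return candidate split positions.

    gaps[i] is the measured separation between character i and character i+1.
    A position is a candidate when that separation exceeds the (assumed) threshold.
    """
    return [i + 1 for i, gap in enumerate(gaps) if gap > threshold]


def hypotheses(chars, points):
    """Yield every word segmentation that uses any subset of the candidate points."""
    for r in range(len(points) + 1):
        for subset in combinations(points, r):
            cuts = [0, *subset, len(chars)]
            yield [chars[a:b] for a, b in zip(cuts, cuts[1:])]


def normalized_score(words, lexicon):
    """Assumed scoring rule: fraction of characters that belong to lexicon words.

    Weighting each word by its length keeps scores comparable between
    hypotheses that contain different numbers of words.
    """
    total = sum(len(w) for w in words)
    matched = sum(len(w) for w in words if w.lower() in lexicon)
    return matched / total if total else 0.0


def select_hypothesis(chars, gaps, threshold, lexicon):
    """Pick the hypothesis with the highest normalized score."""
    points = bifurcation_points(gaps, threshold)
    return max(hypotheses(chars, points), key=lambda h: normalized_score(h, lexicon))


if __name__ == "__main__":
    # Six characters, five inter-character gaps; the wide gap sits after "the".
    print(select_hypothesis("thecat", [1, 1, 4, 1, 1], 3, {"the", "cat"}))
```

In this sketch the unsplit sequence is always one of the hypotheses, so a spuriously wide gap inside a real word does not force a split when the whole word scores better.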
