Method for recognizing handwritten characters using shape and context analysis

US 5,151,950 A
Filed: 10/31/1990
Issued: 09/29/1992
Est. Priority Date: 10/31/1990
Status: Expired due to Fees


Abstract

An improved pattern recognition system that uses an improved method for merging low-level recognition information with auxiliary contextual information, such as a Deterministic Finite Automaton (DFA). The system comprises a low-level shape recognizer for handwriting input, an English-language dictionary organized as a Trie (a special type of DFA), and software to merge the results of the two. An input of digitized handwriting strokes is translated into characters using the shape recognizer and the Trie in tandem, allowing the system to reject nonsense translations at the earliest possible stage of the process and without the overhead of traversing the Trie from the top with each translation.
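
The tandem use of a shape recognizer and a Trie can be illustrated with a small sketch. This is not the patented implementation. It assumes a hypothetical recognizer whose output is a table `proposals`, where `proposals[i]` lists `(char, strokes_used, probability)` tuples for characters that could start at stroke `i`. The names `build_trie` and `translate` are invented for the example. Each hypothesis on the priority queue carries its own Trie node, so extending it takes one dictionary lookup instead of a fresh traversal from the root.

```python
import heapq

def build_trie(words):
    """Build a dict-based trie; the key "$" marks the end of a word."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node["$"] = True
    return root

def translate(proposals, trie):
    """Best-first search over hypotheses, most probable first."""
    # Heap entries: (-probability, tie-breaker, text, next stroke index, trie node).
    heap = [(-1.0, 0, "", 0, trie)]
    counter = 1
    while heap:
        neg_p, _, text, pos, node = heapq.heappop(heap)
        if pos >= len(proposals):
            if pos == len(proposals) and "$" in node:
                return text, -neg_p          # all strokes used and a whole word
            continue
        for ch, used, p in proposals[pos]:
            child = node.get(ch)             # one step from the stored state
            if child is None:
                continue                     # prefix left the dictionary: reject now
            heapq.heappush(heap, (neg_p * p, counter, text + ch, pos + used, child))
            counter += 1
    return None, 0.0

# Hypothetical recognizer output for three strokes.
proposals = [
    [("c", 1, 0.6), ("d", 2, 0.4)],
    [("l", 1, 0.7), ("a", 1, 0.3)],
    [("t", 1, 0.9)],
]
word, probability = translate(proposals, build_trie(["cat", "dot"]))
print(word)
```

With this input the sketch selects "cat" even though the recognizer scored "l" above "a" for the second stroke, because "cl" is discarded as soon as it fails to match any dictionary prefix.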

13 Claims

1. A method for handwriting translation including a deterministic finite automaton traversal subroutine, and a dynamic programming subroutine, comprising the steps of:
- deriving character proposals and corresponding probabilities from an input of digitized strokes and storing a pointer to the strokes not yet analyzed;
  
  deriving a separate list of character proposals, probabilities, and new states from the deterministic finite automaton traversal subroutine;
  
  merging the two character proposal lists into a single list of hypotheses, that are sorted by probability;
  
  expanding each of those hypotheses in order from most to least probable;
  
  deriving a new list of proposals from the strokes not yet analyzed with a shape matching subroutine and storing a pointer to the strokes still not yet analyzed;
  
  deriving a new list of proposals from the DFA traverser using the state previously stored with the hypothesis being expanded; and
  
  repeatedly expanding said hypotheses until a list of translations is produced.
- - 2. The handwriting translation method of claim 1, further including the step of generating hypotheses from multiple experts with veto and propose power, the generating step including the steps of:
    - defining a common proposal table with an entry for every possible character and containing a common probability value, a common veto count, a common proposal count, and a separate state value for every expert;
      
      initializing the table probabilities to 1.0 and the common veto count to zero and the common proposal count to zero;
      
      directing each expert to increment the veto count for characters it rejects, increment the propose count for characters it proposes, multiply its judgement of the probabilities of each character by the cumulative probability so far, and set its state pointer to the state it would be in if this character were accepted; and
      
      hypothesizing all characters with non-zero probability, non-zero proposal counts, and zero veto counts.
  - 3. The method of claim 1, including a shape matching subroutine, wherein the step of deriving character proposals and corresponding probabilities is carried out utilizing the shape matching subroutine.
  - 4. The handwriting translation method of claim 2, wherein the step of defining a common proposal table is carried out by using a plurality of experts.
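
The common proposal table of claim 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's code: the `Entry` class, the `merge_experts` function, the verdict strings ("propose", "veto", "abstain") and the two toy experts are all hypothetical. Each expert is modeled as a function that maps a character to a verdict, a probability, and the state the expert would move to if the character were accepted.

```python
from dataclasses import dataclass, field

@dataclass
class Entry:
    probability: float = 1.0                     # initialized to 1.0
    vetoes: int = 0                              # common veto count
    proposals: int = 0                           # common proposal count
    states: dict = field(default_factory=dict)   # expert name -> next state

def merge_experts(alphabet, experts):
    """Fill one table entry per character, then keep the surviving characters."""
    table = {ch: Entry() for ch in alphabet}
    for name, judge in experts.items():
        for ch, entry in table.items():
            verdict, prob, state = judge(ch)
            if verdict == "veto":
                entry.vetoes += 1
            elif verdict == "propose":
                entry.proposals += 1
            entry.probability *= prob            # cumulative probability so far
            entry.states[name] = state
    # Hypothesize characters with non-zero probability, at least one proposal,
    # and no vetoes.
    return {ch: e for ch, e in table.items()
            if e.probability > 0 and e.proposals > 0 and e.vetoes == 0}

def shape_expert(ch):
    scores = {"c": 0.6, "e": 0.4}                # hypothetical shape scores
    if ch in scores:
        return "propose", scores[ch], None
    return "abstain", 0.0, None

def dictionary_expert(ch):
    transitions = {"c": "state_c", "d": "state_d"}   # hypothetical DFA moves
    if ch in transitions:
        return "propose", 1.0, transitions[ch]
    return "veto", 0.0, None

survivors = merge_experts("cde", {"shape": shape_expert,
                                  "dictionary": dictionary_expert})
print(sorted(survivors))
```

Here only "c" survives: "d" has zero shape probability and "e" is vetoed by the dictionary expert. The per-expert `states` entry is what lets a later expansion resume each expert from where it left off.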

5. A method for recognition of sequences of characters, including the steps of:
- (1) storing a plurality of input strokes which are in a sequence;
  
  (2) generating a first set of recognition hypotheses for a first subset of said strokes, beginning at one end of the sequence of strokes, each hypothesis including:
  
  (i) a hypothesized character, (ii) the number of strokes required to construct the hypothesized character, and (iii) a probability assigned to the hypothesized character;
  
  (3) comparing each hypothesized character with a predetermined database of character sequences;
  
  (4) for each hypothesized character which is found as a first character in the predetermined database of character sequences, generating a pointer which is correlated with that hypothesis and which points to that character's position in the database;
  
  (5) for each character which is not found as a first character in the predetermined database of character sequences, assigning a lower probability to that character's recognition hypothesis;
  
  (6) designating the recognition hypothesis having the highest probability as the current active hypothesis in the set of recognition hypotheses of which that recognition hypothesis is a member;
  
  (7) placing that set of recognition hypotheses as the first element on a stack of sets of recognition hypotheses;
  
  (8) from a plurality of strokes following the last stroke which is a part of the current active hypothesis from the set of recognition hypotheses at the top of the stack, generating a new set of new recognition hypotheses, each said new recognition hypothesis including:
  
  (i) a hypothesized character, (ii) the number of strokes required to construct the hypothesized character, and (iii) a probability assigned to the hypothesized character;
  
  (9) replacing the probability assigned to each new recognition hypothesis with a value derived from the current value of the probability for the new recognition hypothesis and the value of the probability of the current active hypothesis from the set of recognition hypotheses at the top of the stack;
  
  (10) if the current active hypothesis from the set of recognition hypotheses at the top of the stack has an assigned pointer to the database, then for each new hypothesis which is found in the predetermined database of character sequences as a new character following the character corresponding to said current active hypothesis, generating a pointer which is correlated with that hypothesis and which points to the new character's position in the database;
  
  (11) if the current active hypothesis from the set of the recognition hypotheses at the top of the stack does not have an assigned pointer to the database, then for each new hypothesis, reduce the probability of every recognition hypothesis in the new set by a predetermined factor;
  
  (12) designating the recognition hypothesis which is a member of the new set of recognition hypotheses and which has the highest probability in that set as the current active hypothesis in that set;
  
  (13) placing the new set of recognition hypotheses on top of the stack;
  
  (14) repeating steps 8 through 13, until no additional strokes are found in step 8 following the last stroke which is a part of the current active hypothesis from the set of recognition hypotheses at the top of the stack;
  
  (15) generating a first hypothesis string from the current active hypothesis of each set of recognition hypotheses, in order from the bottom of the stack to the top;
  
  (16) generating a probability for the first hypothesis string, based upon the value of the probability for the current active hypothesis from the set of recognition hypotheses at the top of the stack;
  
  (17) storing the first hypothesis string as the best hypothesis string and storing its probability as the best probability, and proceeding to step 20;
  
  (18) generating a new hypothesis string from the current active hypothesis of each set of recognition hypotheses, in order from the bottom of the stack to the top;
  
  (19) generating a probability for the new hypothesis string, based upon the value of the probability for the current active hypothesis from the set of recognition hypotheses at the top of the stack;
  
  (20) if the probability of the new hypothesis string is higher than the stored best probability, storing the new hypothesis string as the best probability string and storing its probability as the best probability;
  
  (21) if the set of recognition hypotheses at the top of the stack includes a subset of at least one other recognition hypothesis having a probability lower than the current active hypothesis, then designating the recognition hypothesis of said subset which has the highest probability value as the current active hypothesis and repeating steps 8-14 and 18-20;
  
  (22) removing the set of recognition hypotheses at the top of the stack, and if the stack is not empty as a result of such removal, proceeding to step 21, but if it is empty then proceeding to step 23; and
  
  (23) outputting the best stored string.
- - 6. The method of claim 5, wherein step 21 is carried out only if the probability of the new subset is greater than the probability of the current string.
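
The stack-driven search of claims 5 and 6 can be approximated by the sketch below. It is a simplification under stated assumptions rather than a step-for-step rendering of the claim: `proposals` is a hypothetical recognizer table mapping a stroke index to `(char, strokes_used, probability)` tuples, the dictionary is a dict-based trie, and the `penalty` factor is applied to any hypothesis that has no dictionary pointer. The function names are invented for the example. The early exit on `best_prob` plays the role of the pruning condition in claim 6 and relies on probabilities and the penalty being at most 1.

```python
def build_trie(words):
    """Dict-based trie used as the predetermined database of sequences."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
    return root

def new_set(proposals, pos, parent_prob, parent_node, penalty):
    """Build one hypothesis set: (probability, char, end position, trie node)."""
    hyps = []
    for ch, used, p in proposals.get(pos, []):
        prob = parent_prob * p               # combine with the active hypothesis
        node = parent_node.get(ch) if parent_node is not None else None
        if node is None:
            prob *= penalty                  # no dictionary pointer: lower it
        hyps.append((prob, ch, pos + used, node))
    hyps.sort(key=lambda h: h[0], reverse=True)
    return hyps

def best_string(proposals, n_strokes, trie, penalty=0.1):
    # Each stack frame is [hypothesis set, index of the current active hypothesis].
    stack = [[new_set(proposals, 0, 1.0, trie, penalty), 0]]
    best, best_prob = None, 0.0
    while stack:
        hyps, i = stack[-1]
        # Sets are sorted, so once the active hypothesis cannot beat the best
        # string, nothing after it in the same set can either.
        if i >= len(hyps) or hyps[i][0] <= best_prob:
            stack.pop()
            if stack:
                stack[-1][1] += 1            # activate the next-best hypothesis below
            continue
        prob, _, end, node = hyps[i]
        if end >= n_strokes:                 # every stroke consumed: complete string
            best = "".join(h[k][1] for h, k in stack)   # bottom of stack to top
            best_prob = prob
            stack[-1][1] += 1
        else:
            stack.append([new_set(proposals, end, prob, node, penalty), 0])
    return best, best_prob

# Hypothetical recognizer output for three strokes.
proposals = {
    0: [("c", 1, 0.6), ("d", 2, 0.4)],
    1: [("l", 1, 0.7), ("a", 1, 0.3)],
    2: [("t", 1, 0.9)],
}
print(best_string(proposals, 3, build_trie(["cat", "dot"]))[0])
```

Unlike a hard dictionary filter, this variant still returns a string when nothing matches the dictionary; out-of-dictionary strings are only made less probable.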

7. A method for generating a sequence of characters from a set of input strokes, including the steps of:
- (1) generating at least one shape proposal corresponding to a first subset of the plurality of input strokes;
  
  (2) generating a probability for each of the shape proposals generated in step 1;
  
  (3) generating a plurality of hypothesis lists including the shape proposals;
  
  (4) maintaining a stack of hypothesis lists, with one hypothesis in each hypothesis list being designated as a current active hypothesis, the stack being ordered from a beginning of the plurality of input strokes to an end thereof, wherein each list of hypotheses is determined from the current active hypothesis in the list directly below it on the stack, and wherein the current active hypothesis from the hypothesis list at the top of the stack utilizes all input strokes to be analyzed;
  
  (5) generating a string including the shape proposals corresponding to each of the current active hypotheses from the hypothesis lists on the stack, in order from bottom to top;
  
  (6) generating a probability for each such string from the probabilities of the individual shape proposals corresponding to the current active hypotheses which form the string, wherein the probability for each string which is not represented in a predetermined database is reduced in a predefined manner;
  
  (7) outputting a string as generated in step 5 having the highest probability as generated in step 6.
- - 8. The method of claim 7, wherein maintaining the stacks as specified in step 4 includes the step of generating all possible variations of current active hypotheses from the hypothesis lists in the stack.
  - 9. The method of claim 8, wherein the step of reducing the probability as in step 6 includes the step of multiplying the generated probability by a predetermined factor for each shape proposal in the new hypothesis list.
  - 10. The method of claim 7, wherein the shape proposals generated in step 1 are selected from a predefined set of known shape proposals and a character signifying that the subset of the strokes does not correspond to any of the known shape proposals.
  - 11. The method of claim 10, wherein step 6 includes the steps of:
    - (8) determining whether any of the shape proposals generated in step 1 is a character signifying that the subset of the strokes does not correspond to any known shape proposal; and
      
      (9) if the determination of step 8 is positive, modifying the generated probability for each hypothesis list including said character signifying that the subset of the strokes does not correspond to any of the known shape proposals.
  - 12. The method of claim 11, wherein step 9 includes the steps of:
    - (10) determining whether any of the shape proposals generated in step 1 is a character signifying that the subset of the strokes does not correspond to any known shape proposal; and
      
      (11) if the determination of step 10 is positive, then generating at least one alternative shape proposal to replace that character, where the alternative shape proposal is selected such that a resulting hypothesis list matches a sequence of characters in a predefined dictionary.
  - 13. The method of claim 7, wherein the probabilities of step 2 are generated according to the likelihood of the correctness of each shape proposal.
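
The string scoring described in claims 7 to 9 can be illustrated with a short sketch. It assumes, for simplicity, a fixed segmentation, so that each hypothesis list holds `(char, probability)` pairs for one character position; the claims themselves let each list depend on the active hypothesis below it. The function name `score_strings`, the set-based dictionary, and the `factor` value are assumptions made for the example.

```python
from itertools import product
from math import prod

def score_strings(hypothesis_lists, dictionary, factor=0.5):
    """Score every combination of hypotheses and return the most probable string."""
    results = []
    for combo in product(*hypothesis_lists):     # all variations of active hypotheses
        text = "".join(ch for ch, _ in combo)
        p = prod(pr for _, pr in combo)          # product of the shape probabilities
        if text not in dictionary:
            p *= factor ** len(combo)            # one factor per shape proposal
        results.append((text, p))
    return max(results, key=lambda r: r[1])

# Hypothetical shape proposals for three character positions.
lists = [
    [("c", 0.6), ("e", 0.4)],
    [("l", 0.7), ("a", 0.3)],
    [("t", 0.9)],
]
print(score_strings(lists, {"cat", "eat"})[0])
```

In this example "clt" has the highest raw shape probability, but after the per-character reduction it falls below "cat", which is in the dictionary and is therefore left unchanged.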

Current Assignee: Go PLC (Emirates International Telecommunications Limited)
Original Assignee: Go PLC (Emirates International Telecommunications Limited)
Inventors: Hullender, Gregory N.
Primary Examiner(s): Razavi, Michael
Assistant Examiner(s): Cammarata, Michael
Application Number: US07/607,125
Time in Patent Office: 699 Days
Field of Search: 382/3, 382/13, 382/37, 382/9, 382/14, 382/15, 382/25, 382/40, 382/39
US Class Current: 382/187
CPC Class Codes: G06V 30/2276 with probabilistic networks...
