METHOD FOR RECOGNIZING CHARACTERS
First Claim
1. A method to be practiced on a machine for identifying a character on a document as being one of a pre-determined set comprising the steps of:
- 1. using apparatus to scan said document in the area of the charac-ter to generate electrical signals corresponding to the image of the character on the document, 2. using apparatus responsive to the electrical signals generated in step (1) to generate a sequence of signals composed of two different signal types, saId sequence corresponding to a binary raster representation of said character, 3. using apparatus to convert said binary raster representation to a set of numbers representative of respective features of said binary raster representation, 4. using apparatus to perform a plurality of tests on said set of numbers, each of said tests serving to discriminate between a respective pair of characters in said predetermined set for determining if one of the characters of the pair is more likely to be the character to be identified than the other character of the pair, and 5. using apparatus to identify the character in accordance with the results of the pairwise tests performed in step (4).
0 Assignments
0 Petitions
Accused Products
Abstract
A method for recognizing a digitized character. The shape of the character is represented by the number, positions and shapes of alternating contour convexities, as viewed from two sides of the character. The number and positions of the convexities define the sort group of the character, there being nine sort groups in the systems described. Each sort group has associated with it a separate linear discriminant logic test for every pair of characters which share the sort group. Depending on the sort group of the character to be recognized, the associated pairwise discriminant tests are performed, and the character class which passes a specified number of the tests is identified as the class of the character to be recognized.
-
Citations
131 Claims
-
1. A method to be practiced on a machine for identifying a character on a document as being one of a pre-determined set comprising the steps of:
- 1. using apparatus to scan said document in the area of the charac-ter to generate electrical signals corresponding to the image of the character on the document, 2. using apparatus responsive to the electrical signals generated in step (1) to generate a sequence of signals composed of two different signal types, saId sequence corresponding to a binary raster representation of said character, 3. using apparatus to convert said binary raster representation to a set of numbers representative of respective features of said binary raster representation, 4. using apparatus to perform a plurality of tests on said set of numbers, each of said tests serving to discriminate between a respective pair of characters in said predetermined set for determining if one of the characters of the pair is more likely to be the character to be identified than the other character of the pair, and 5. using apparatus to identify the character in accordance with the results of the pairwise tests performed in step (4).
-
2. controlling said machine to perform pairwise discriminant tests on said vector for recognizing said digitized character based on the results of the tests.
-
3. A method in accordance with claim 2 wherein said predetermined number is equal to the number of the tests in each of which the particular character was one of two in the test pair.
-
4. A method in accordance with claim 3 wherein the features of said binary number representation which are represented by said set of numbers include the numbers, shapes and locations of alternating bumps of opposite convexities as seen looking from at least two different directions.
-
5. using apparatus to identify the character in accordance with the results of the pairwise tests performed in step (4).
-
6. A method in accordance with claim 5 wherein in step (2) the binary raster representation is operated upon to correct breaks in said one direction.
-
7. A method in accordance with claim 1 wherein the features of said binary raster representation which are represented by said set of numbers include the numbers, shapes and locations of alternating bumps of opposite convexities as seen looking from outside said binary raster representation.
-
8. A method in accordance with claim 7 wherein the pairwise tests are included in a plurality of groups, the groups being associated with respective numbers of alternating bumps of opposite convexities and the pairwise tests included in the respective groups being those for discriminating between characters whose features correspond to the respective numbers of alternating bumps of opposite convexities, and in step (4) the only pairwise tests which are performed are those in the group for discriminating between characters whose features correspond to the same number of alternating bumps of opposite convexities as the number corresponding to the features determined in step (3).
-
9. A method in accordance with claim 8 wherein each of said groups of tests includes a test for discriminating between each possible pair of characters in said predetermined set whose features correspond to the number of alternating bumps of opposite convexities associated with the group.
-
10. A method in accordance with claim 9 wherein in step (5) the character is identified as a particular character only if during the performance of pairwise tests in step (4) the particular character was determined to be the more likely identity of the character to be identified in a predetermined number of the tests in each of which the particular character was one of the two in the test pair, and the pairwise tests are performed in step (4) in an order determined by the probabilities of occurrence of the characters to be discriminated to reduce the average number of pairwise tests which otherwise would be performed to identify a character.
-
11. A method in accordance with claim 7 wherein the featUres of said binary raster representation which are represented by said set of numbers further include a number which is dependent upon the difference between (a) the sum of numbers proportional to lengths on the two side regions of the binary raster representation which correspond to the absence of parts of the scanned character above a horizontal row positioned in the lower half of the binary raster representation, and (b) a number proportional to a length in the central region of the binary raster representation which corresponds to the absence of a part of the scanned character above said horizontal row.
-
12. A method in accordance with claim 7 wherein the features of said binary raster representation which are represented by said set of numbers further include a number which is dependent upon a length in the binary raster representation which corresponds to the absence of a part of the scanned character above a horizontal row positioned in the lower half of the binary raster representation, which length is measured in the vertical direction immediately to the left of the leftmost portion of said horizontal row which corresponds to a part of the scanned character.
-
13. A method in accordance with claim 7 wherein the features of said binary raster representation which are represented by said set of numbers further include a number which is dependent upon the difference between (a) a number proportional to a length in the central region of the binary raster representation which corresponds to the absence of a part of the scanned character at the top of the binary raster representation, and (b) the sum of numbers proportional to lengths on the two sides of the binary raster representation which correspond to the absence of parts of the scanned character at the top of the binary raster representation.
-
14. A method in accordance with claim 7 wherein the features of said binary raster representation which are represented by said set of numbers further include a number which is dependent upon the average horizontal width between the leftmost and rightmost portions of the binary raster representation which represents parts of the scanned character taken along horizontal rows of the binary raster representation in the bottom portion thereof.
-
15. A method in accordance with claim 7 wherein the features of said binary raster representation which are represented by said set of numbers further include a number which is dependent upon the average horizontal width between the leftmost and rightmost portions of the binary raster representation which represents parts of the scanned character taken along horizontal rows of the binary raster representation in the central region thereof, which central region includes less than half of the total number of rows of the binary raster representation.
-
16. A method in accordance with claim 7 wherein the features of said binary raster representation which are represented by said set of numbers further include a number which is dependent upon the average horizontal width between the leftmost and rightmost portions of the binary raster representation which represents parts of the scanned character taken along horizontal rows of the binary raster representation in the central region thereof, which central region includes more than half of the total number of rows of the binary raster representation.
-
17. A method in accordance with claim 7 wherein the features of said binary raster representation which are represented by said set of numbers further include a number which is dependent upon the total number of continuous line segments represented by said binary raster representation along a group of rows thereof, said group consisting of rows in the central region of the upper half of the binary raster representation.
-
18. A method in accordance with claim 7 wherein the features of said binary raster representation which are represented by said set of numbers further include a number which is dependent upon the total number of continuous linE segments represented by said binary raster representation along a group of rows thereof, said group consisting of rows in the central region of the lower half of the binary raster representation.
-
19. A method in accordance with claim 7 wherein step (3) includes the sub-steps of:
- (3a) computing at least two differently directed histograms for said binary raster representation, (3b) computing a pair of difference strings for said binary raster representation by subtracting each element in each of said differently directed histograms from an adjacent element, (3c) changing the values of pairs of successive elements in each of said difference strings to minimize the effects of noise in said binary raster representation, thereby producing edited differently directed difference strings, (3d) deriving a list of magnitude and direction codes for a sequence of straight-line segments for each of the edited differently directed difference strings in accordance with the element values thereof, the direction of each straight-line segment being one of a predetermined relatively small number, (3e) inserting magnitude and direction codes for additional straight-line segments in each of said lists in accordance with the magnitudes and direction codes for the straight-line segments derived in step (3d) to derive a composite list of straight-line segments whose direction codes change in a predetermined order which causes the successive straight-line segments in each list to represent bumps of alternating opposite convexities, and (3f) combining said lists to derive said set of numbers representative of the features of said binary raster representation.
-
20. A method in accordance with claim 19 wherein step (3) further includes the sub-step of:
- (3g) computing each of a group of special feature numbers from said binary raster representation in accordance with a respective formula, said group of special feature numbers being combined with said lists in sub-step (3f) to derive said set of numbers representative of the features of said binary raster representation.
-
21. A method in accordance with claim 7 wherein step (3) includes the sub-steps of:
- (3a) computing at least two differently directed histograms for said binary raster representation, (3b) computing a pair of lists of straight-line segments from respective ones of said differently directed histograms, the straight-line segments in said lists representing bumps of alternating opposite convexities conforming to the contour of said binary raster representation, (3c) computing each of a group of special feature numbers from said binary raster representation in accordance with a respective formula, and (3d) combining the lists computed in step (3b) and the special feature numbers computed in step (3c) to derive said set of numbers representative of the features of said binary raster representation.
-
22. A method in accordance with claim 21 wherein the pairwise tests are included in a plurality of groups, the groups being associated with respective numbers of alternating bumps of opposite convexities and the pairwise tests included in the respective groups being those for discriminating between characters whose features correspond to the respective number of alternating bumps of opposite convexities, and in step (4) the only pairwise tests which are performed are those in the group for discriminating between characters whose features correspond to the same number of alternating bumps of opposite convexities as the number corresponding to the features determined in step (3).
-
23. A method in accordance with claim 22 wherein each of said groups of tests includes a test for discriminating between each possible pair of characters in said predetermined set whose features correspond to the number of alternating bumps of opposite convexities associated with the gRoup.
-
24. A method in accordance with claim 23 wherein in step (5) the character is identified as a particular character only if during the performance of pairwise tests in step (4) the particular character was determined to be the more likely identity of the character to be identified in a predetermined number of the tests in each of which the particular character was one of the two in the test pair, and the pairwise tests are performed in step (4) in an order determined by the probabilities of occurrence of the characters to be discriminated to reduce the average number of pairwise tests which otherwise would be performed to identify a character.
-
25. A method in accordance with claim 24 wherein each of the pairwise tests performed in step (4) is the computation of an optimal linear discriminant designed to distinguish between the two characters of the respective pair.
-
26. A method in accordance with claim 25 wherein in step (5) the character is identified as a particular character only if during the performance of pairwise tests in step (4) the particular character was determined to be the more likely identity of the character to be identified in a predetermined number of the tests in each of which the particular character was one of the two in the test pair.
-
27. A method in accordance with claim 26 wherein the pairwise tests are included in a plurality of groups, the groups being associated with respective numbers of alternating bumps of opposite convexities and the pairwise tests included in the respective groups being those for discriminating between characters whose features correspond to the respective numbers of alternating bumps of opposite convexities, and in step (4) the only pairwise tests which are performed are those in the group for discriminating between characters whose features correspond to the same number of alternating bumps of opposite convexities as the number corresponding to the features determined in step (3).
-
28. A method in accordance with claim 8 wherein for a group of pairwise tests the tests are performed in a sequence such that TIJ precedes TRQ if and only if PI >
- PR for I not = R and PJ >
PQ for I R, where Tij represents a test for discriminating between characters i and j, and PK represents the probability of character K being identified from among all of the characters which are scanned and are discriminated by the pairwise tests in said group.
- PR for I not = R and PJ >
-
29. A method in accordance with claim 28 wherein in step (5) the character is identified as a particular character only if during the performance of pairwise tests in step (4) the particular character was determined to be the more likely identity of the character to be identified in a predetermined number of the tests in each of which the particular character was one of the two in the test pair.
-
30. A method in accordance with claim 29 wherein said predetermined number is equal to the number of the tests in each of which the particular character was one of two in the test pair.
-
31. A method in accordance with claim 28 wherein the data for each pairwise test includes a plurality of weights to be used in computing a respective optimal linear discriminant, threshold values for enabling a character decision to be made after the optimal linear discriminant is computed, and pointer values for indicating the data to be used for the next pairwise test in accordance with the character decision made at the end of the current test.
-
32. A method in accordance with claim 8 wherein the data for each pairwise test includes a plurality of weights to be used in computing a respective optimal linear discriminant, threshold values for enabling a character decision to be made after the optimal linear discriminant is computed, and pointer values for indicating the data to be used for the next pairwise test in accordance with the character decision made aT the end of the current test.
-
33. A method in accordance with claim 7 wherein during the performance of each of the pairwise tests of step (4) the set of numbers representative of respective features of the binary raster representation which are used represent the contour of the binary raster representation as seen in directions from outside the binary raster representation, the particular directions being dependent upon the pair of characters to be discriminated by the pairwise test to be performed.
-
34. A method in accordance with claim 2 wherein the pairwise tests are included in a plurality of groups, each group being associated with a respective group of characters which are known to have some features in common, the pairwise tests included in each group being those for discriminating between the characters having said common features, and in step (4) the pairwise tests in only one group are performed, said one group being that whose characters have the common features represented by the set of numbers derived in step (3).
-
35. A method in accordance with claim 34 wherein each of said groups of tests includes a test for discriminating between all possible pairs of characters associated with the group.
-
36. A method in accordance with claim 35 wherein in step (5) the character is identified as a particular character only if during the performance of pairwise tests in step (4) the particular character was determined to be the more likely identity of the character to be identified in a predetermined number of the tests in each of which the particular character was one of the two in the test pair.
-
37. A method in accordance with claim 36 wherein said predetermined number is equal to the number of the tests in each of which the particular character was one of two in the test pair.
-
38. A method in accordance with claim 34 wherein in step (5) the character is identified as a particular character only if during the performance of pairwise tests in step (4) the particular character was determined to be the more likely identity of the character to be identified in a predetermined number of the tests in each of which the particular character was one of the two in the test pair, and the pairwise tests are performed in step (4) in an order determined by the probabilities of occurrence of the characters to be discriminated to reduce the average number of pairwise tests which otherwise would be performed to identify a character.
-
39. A method in accordance with claim 34 wherein for a group of pairwise tests the tests are performed in a sequence such that TIJ precedes TRQ if and only if PI >
- PR for I not = R and PJ >
PQ for I R, where Tij represents a test for discriminating between characters i and j, and PK represents the probability of character K being identified from among all of the characters which are scanned and are discriminated by the pairwise tests in said group.
- PR for I not = R and PJ >
-
40. A method in accordance with claim 39 wherein the data for each pairwise test includes a plurality of weights to be used in computing a respective optimal linear discriminant, threshold values for enabling a character decision to be made after the optimal linear discriminant is computed, and pointer values for indicating the data to be used for the next pairwise test in accordance with the character decision made at the end of the current test.
-
41. A method to be practiced on a machine for identifying a character on a document as being one of a predetermined set comprising the steps of:
-
42. A method in accordance with claim 41 wherein said set of numbers represents the numbers and shapes of alternating bumps of opposite convexities as seen looking from at least two different directions outside said binary raster representation.
-
43. A method in accordance with claim 41 wherein the features of said binary raster representation which are represented by said set of numbers further include a number which is dependent upon the difference between (a) the sum of numbers proportional to lengths on the two side regions of the binary raster representation which correspond to the absence of parts of the scanned character above a horizontal row positioned in the lower half of the binary raster representation, and (b) a number proportional to a length in the central region of the binary raster representation which corresponds to the absence of a part of the scanned character above said horizontal line.
-
44. A method in accordance with claim 41 wherein the features of said binary raster representation which are represented by said set of numbers further include a number which is dependent upon a length in the binary raster representation which corresponds to the absence of a part of the scanned character above a horizontal row positioned in the lower half of the binary raster representation, which length is measured in the vertical direction immediately to the left of the leftmost portion of said horizontal row which corresponds to a part of the scanned character.
-
45. A method in accordance with claim 41 wherein the features of said binary raster representation which are represented by said set of numbers further include a number which is dependent upon the difference between (a) a number proportional to a length in the central region of the binary raster representation which corresponds to the absence of a part of the scanned character at the top of the binary raster representation, and (b) the sum of numbers proportional to lengths on the two sides of the binary raster representation which corresponds to the absence of parts of the scanned character at the top of the binary raster representation.
-
46. A method in accordance with claim 41 wherein the features of said binary raster representation which are represented by said set of numbers further include a number which is dependent upon the average horizontal width between the leftmost and rightmost portions of the binary raster representation which represents parts of the scanned character taken along horizontal rows of the binary raster representation in the bottom portion thereof.
-
47. A method in accordance with claim 41 wherein the features of said binary raster representation which are represented by said set of numbers further include a number which is dependent upon the average horizontal width between the leftmost and rightmost portions of the binary raster representation which represents parts of the scanned character taken along horizontal rows of the binary raster representation in the central region thereof, which central region includes less than half of the total number of rows of the binary raster representation.
-
48. A method in accordance with claim 41 wherein the features of said binary raster representation which are represented by said set of numbers further include a number which is dependent upon the average horizontal width between the leftmost and rightmost portions of the binary raster representation which represents parts of the scanned character taken along horizontal rows of the binary raster representation in the central region thereof, Which central region includes more than half of the total number of rows of the binary raster representation.
-
49. A method in accordance with claim 41 wherein the features of said binary raster representation which are represented by said set of numbers further include a number which is dependent upon the total number of continuous line segments represented by said binary raster representation along a group of rows thereof, said group consisting of rows in the central region of the upper half of the binary raster representation.
-
50. A method in accordance with claim 41 wherein the features of said binary raster representation which are represented by said set of numbers further include a number which is dependent upon the total number of continuous line segments represented by said binary raster representation along a group of rows thereof, said group consisting of rows in the central region of the lower half of the binary raster representation.
-
51. A method in accordance with claim 41 wherein step (3) includes the sub-steps of:
- (3a) computing at least two differently directed histograms for said binary raster representation, (3b) computing a pair of difference strings for said binary raster representation by subtracting each element in each of said differently directed histograms from an adjacent element, (3c) changing the values of pairs of successive elements in each of said difference strings to minimize the effects of noise in said binary raster representation, thereby producing edited differently directed difference strings, (3d) deriving a list of pairwise and direction codes for a sequence of straight-line segments for each of the edited differently directed difference strings in accordance with the element values thereof, the direction of each straight-line segment being one of a predetermined relatively small number, (3e) inserting magnitude and direction codes for additional straight-line segments in each of said lists in accordance with the magnitudes and direction codes for the straight-line segments derived in step (3d) to derive a composite list of straight-line segments whose direction codes change in a predetermined order which causes the successive straight-line segments in each list to represent bumps of alternating opposite convexities, and (3f) combining said lists to derive said set of numbers representative of the features of said binary raster representation.
-
52. A method in accordance with claim 51 wherein step (3) further includes the sub-step of:
- (3g) computing each of a group of special feature numbers from said binary raster representation in accordance with a respective formula, said group of special feature numbers being combined with said lists in sub-step (3f) to derive said set of numbers representative of the features of said binary raster representation.
-
53. A method in accordance with claim 41 wherein step (3) includes the sub-steps of:
- (3a) computing at least two differently directed histograms for said binary raster representation, (3b) computing a pair of lists of straight-line segments from respective ones of said differently directed histograms, the straight-line segments in said lists representing bumps of alternating opposite convexities conforming to the contour of said binary raster representation, (3c) computing each of a group of special feature numbers from said binary raster representation in accordance with a respective formula, and (3d) combining the lists computed in step (3b) and the special feature numbers computed in step (3c) to derive said set of numbers representative of the features of said binary raster representation.
-
54. A method to be practiced on a machine for recognizing a previously scanned character which is represented as a digitized character as being one of a predetermined set of characters comprising the steps of:
-
55. A method in accordance with claim 54 wherein in step (3) the character is recognized as being a particular character in said set only if during the performance of pairwise tests in step (2) the particular character passed a predetermined number of the tests in which it was one of the two in the test pair.
-
56. A method in accordance with claim 55 wherein said predetermined number is equal to the number of the tests in each of which the particular character was one of two in the test pair.
-
57. A method in accordance with claim 56 wherein the features of said digitized character which are represented by said vector include contour data for said digitized character as seen looking in at least two different directions from outside the digitized character.
-
58. A method in accordance with claim 54 wherein prior to step (1) the digitized character is operated upon to stretch it in at least one direction such that the stretched digitized character has a predetermined length in said at least one direction.
-
59. A method in accordance with claim 58 wherein prior to step (1) the digitized character is operated upon to correct breaks in said one direction.
-
60. A method in accordance with claim 54 wherein the features of said digitized character which are represented by said vector include contour data for said digitized character as seen looking from outside said digitized character.
-
61. A method in accordance with claim 60 wherein the pairwise tests are included in a plurality of groups, the groups being associated with respective contour data sets and the pairwise tests included in the respective groups being those for discriminating between characters whose contour data features correspond to respective contour data sets, and in step (2) the only pairwise tests which are performed are those in the group for discriminating between characters whose contour data features correspond to the contour data set which is applicable to the contour data features represented by said vector.
-
62. A method in accordance with claim 61 wherein each of said groups of tests includes a test for discriminating between each possible pair of characters in said predetermined set whose contour data features correspond to the contour data set which is associated with the group.
-
63. A method in accordance with claim 62 wherein in step (3) the digitized character is recognized as being a particular character in said set only if during the performance of pairwise tests in step (2) the particular character passed a predetermined number of the tests in which it was one of the two in the test pair, and the pairwise tests are performed in step (2) in an order determined by the probabilities of occurrence of the characters to be discriminated to reduce the average number of pairwise tests which otherwise would be performed to recognize a character.
-
64. A method in accordance with claim 60 wherein the features of said digitized character which are represented by said vector further include a number which is dependent upon the difference between (a) the sum of numbers proportional to lengths on the two side regions of the digitized character which correspond to the absence of parts of the digitized character above a horizontal row positioned in the lower half of the digitized character, and (b) a number proportional to a length in the central region of the digitized character which corresponds to the absence of a part of the digitized character above said horizontal row.
-
65. A method in accordance with claim 60 wherein the features of said digitized character which are represented by said vector further include a number which is dependent upon a length in the digitized character which corresponds to the absence of a part of the digitized character above a horizontal row positioned in the lower half of the digitized character, which length is measured in the vertical direction immediately to the left of the leftmost portion of said horizontal row which corresponds to a part of the digitized character.
-
66. A method in accordance with claim 60 wherein the features of said digitized character which are represented by said vector further include a number which is dependent upon the difference between (a) a number proportional to a length in the central region of the digitized character which corresponds to the absence of a part of the digitized character at the top thereof, and (b) the sum of numbers proportional to lengths on the two sides of the digitized character which correspond to the absence of parts of the digitized character at the top thereof.
-
67. A method in accordance with claim 60 wherein the features of said digitized character which are represented by said vector further include a number which is dependent upon the average horizontal width between the leftmost and rightmost portions of the digitized character which represents part of the digitized character taken along horizontal rows of the digitized character in the bottom portion thereof.
-
68. A method in accordance with claim 60 wherein the features of said digitized character which are represented by said vector further include a number which is dependent upon the average horizontal width between the leftmost and rightmost portions of the digitized character which represents parts of the digitized character taken along horizontal rows of the digitized character in the central region thereof, which central region includes less than half of the total number of rows of the digitized character.
-
69. A method in accordandance with claim 60 wherein the features of said digitized character which are represented by said vector further include a number which is dependent upon the average horizontal width between the leftmost and rightmost portions of the digitized character which represents parts of the digitized character taken along horizontal rows of the digitized character in the central region thereof, which central region includes more than half of the total number of rows of the digitized character.
-
70. A method in accordance with claim 60 wherein the features of said digitized character which are represented by said vector further include a number which is dependent upon the total number of continuous line segments represented by said digitized character along a group of rows thereof, said group consisting of rows in the central region of the upper half of the digitized character.
-
71. A method in accordance with claim 60 wherein the features of said digitized character which are represented by said vector further include a number which is dependent upon the total number of continuous line segments represented by said digitized character along a group of rows thereof, said group consisting of rows in the central region of the lower half of the digitized character.
-
72. A method in accordance with claim 60 wherein step (1) includes the sub-steps of:
- (1a) computing at least two differently directed histograms for said digitized character, (1b) computing a pair of difference strings for said digitized character by subtracting each element in each of said differently directed histograms from an adjacent element, (1c) changing the values of pairs of successive elements in each of said difference strings to minimize the effects of noise in said digitized character, thereby producing edited differently directed difference strings, (1d) deriving a list of magnitude and direction codes for a sequence of straight-Line segments for each of the edited differently directed difference strings in accordance with the element values thereof, the direction of each straight-line segment being one of a predetermined relatively small number, (1e) inserting magnitude and direction codes for additional straight-line segments in each of said lists in accordance with the magnitude and direction codes for the straight-line segments derived in step (2d) to derive a composite list of straight-line segments whose direction codes change in a predetermined order which causes the successive straight-line segments in each list to represent bumps of alternating opposite convexities, and (1f) combining said lists to derive said set of numbers representative of the features of said digitized character.
-
73. A method in accordance with claim 72 wherein step (1) further includes the sub-step of:
- (1g) computing each of a group of special feature numbers from said differently directed histograms in accordance with a respective formula, said group of special feature numbers being combined with said lists in sub-step (1f) to derive said set of numbers representative of the features of said digitized characters.
-
74. A method in accordance with claim 60 wherein step (1) includes the sub-steps of:
- (1a) computing at least two differently directed histograms for said digitized character, (1b) computing a pair of lists of straight-line segments from respective ones of said differently directed histograms, the straight-line segments in said lists representing bumps of alternating opposite convexities conforming to the contour of said digitized character, (1c) computing each of a group of special feature numbers from said differently directed histograms in accordance with a respective formula, and (1d) combining the lists computed in step (1b) and the special feature numbers computed in step (1c) to derive said set of numbers representative of the features of said digitized character.
-
75. A method in accordance with claim 74 wherein the pairwise tests are included in a plurality of groups, the groups being associated with respective contour data sets and the pairwise tests included in the respective groups being those for discriminating between characters whose contour data features correspond to respective contour data sets, and in step (2) the only pairwise tests which are performed are those in the group for discriminating between characters whose contour data features correspond to the contour data set which is applicable to the contour data features represented by said vector.
-
76. A method in accordance with claim 75 wherein each of said groups of tests includes a test for discriminating between each possible pair of characters in said predetermined set whose contour data features correspond to the contour data set which is associated with the group.
-
77. A method in accordance with claim 76 wherein in step (3) the digitized character is recognized as being a particular character in said set only if during the performance of pairwise tests in step (2) the particular character passed a predetermined number of the tests in which it was one of the two in the test pair, and the pairwise tests are performed in step (2) in an order determined by the probabilities of occurrence of the characters to be discriminated to reduce the average number of pairwise tests which otherwise would be performed to recognize a character.
-
78. A method in accordance with claim 77 wherein each of the pairwise tests performed in step (2) is the computation of an optimal linear discriminant designed to distinguish between the two characters of the respective pair.
-
79. A method in accordance with claim 78 wherein in step (3) the character is recognized as being a particular character in said set only if during the performance of pairwise tests in step (2) the particular chaRacter passed a predetermined number of the tests in which it was one of the two in the test pair.
-
80. A method in accordance with claim 61 wherein for a group of pairwise tests the tests are performed in a sequence such that TIJ precedes TRQ if and only if PI>
- PR for I not = R and PJ>
PQ for I R, where Tij represents a test for discriminating between character i and j, and PK represents the probability of character K being recognized from among all of the characters which are digitized and are discriminated by the pairwise tests in said group.
- PR for I not = R and PJ>
-
81. A method in accordance with claim 80 wherein in step (3) the character is recognized as being a particular character in said set only if during the performance of pairwise tests in step (2) the particular character passed a predetermined number of the tests in which it was one of the two in the test pair.
-
82. A method in accordance with claim 81 wherein said predetermined number is equal to the number of the tests in each of which the particular character was one of two in the test pair.
-
83. A method in accordance with claim 80 wherein the data for each pairwise test includes a plurality of weights to be used in computing a respective optimal linear discriminant, threshold values for enabling a character decision to be made after the optimal linear discriminant is computed, and pointer valves for indicating the data to be used for the next pairwise test in accordance with the character decision made at the end of the current test.
-
84. A method in accordance with claim 61 wherein the data for each pairwise test includes a plurality of weights to be used in computing a respective optimal linear discriminant, threshold values for enabling a character decision to be made after the optimal linear discriminant is computed, and pointer values for indicating the data to be used for the next pairwise test in accordance with the character decision made at the end of the current test.
-
85. A method in accordance with claim 60 wherein during the performance of each of the pairwise tests of step (2) only some of the elements of said vector are utilized, the elements representing contour data features as seen in directions from outside the dizitized character, the particular directions being dependent upon the pair of characters to be discriminated by the pairwise test to be performed.
-
86. A method in accordance with claim 55 wherein the pairwise tests are included in a plurality of groups, each group being associated with a respective group of characters which are known to have some features in common, the pairwise tests included in each group being those for discriminating between the characters having said common features, and in step (2) the pairwise tests in only one group are performed, said one group being that whose characters have the common features represented by the vector constructed in step (1).
-
87. A method in accordance with claim 86 wherein each of said groups of tests includes a test for discriminating between all possible pairs of characters associated with the group.
-
88. A method in accordance with claim 87 wherein in step (3) the character is recognized as being a particular character in said set only if during the performance of pairwise tests in step (2) the particular character passed a predetermined number of the tests in which it was one of the two in the test pair.
-
89. A method in accordance with claim 88 wherein said predetermined number is equal to the number of the tests in each of which the particular character was one of two in the test pair.
-
90. A method in accordance with claim 86 wherein in step (3) the digitized character is recognized as being a particular character in said set only if during the performance of pairwise tests in step (2) the particular character passed a predetermined number of the tests in which it was one of the two in the teSt pair, and the pairwise tests are performed in step (2) in an order determined by the probabilities of occurrence of the characters to be discriminated to reduce the average number of pairwise tests which otherwise would be performed to recognize a character.
-
91. A method in accordance with claim 86 wherein for a group of pairwise tests the tests are performed in a sequence such that TIJ precedes TRQ if and only if PI>
- PR for I not = R and PJ>
PQ for I R, where Tij represents a test for discriminating between characters i and j, and PK represents the probability of character K being recognized from among all of the characters which are digitized and are discriminated by the pairwise tests in said group.
- PR for I not = R and PJ>
-
92. A method in accordance with claim 91 wherein the data for each pairwise test includes a plurality of weights to be used in computing a respective optimal linear discriminant, threshold values for enabling a character decision to be made after the optimal linear discriminant is computed, and pointer values for indicating the data to be used for the next pairwise test in accordance with the character decision made at the end of the current test.
-
93. A method in accordance with claim 55 wherein each of the pairwise tests performed in step (2) is the computation of an optimal linear discriminant designed to distinguish between the two characters of the respective pair.
-
94. A method in accordance with claim 55 wherein the data for each pairwise test includes a plurality of weights to be used in computing a respective optimal linear discriminant, threshold values for enabling a character decision to be made after the optimal linear discriminant is computed, and pointer values for indicating the data to be used for the next pairwise test in accordance with the character decision made at the end of the current test.
-
95. A method in accordance with claim 54 wherein each of the pairwise tests performed in step (2) is the computation of an optimal linear discriminant designed to distinguish between the two characters of the respective pair.
-
96. A method in accordance with claim 54 wherein the data for each pairwise test includes a plurality of weights to be used in computing a respective optimal linear discriminant, threshold values for enabling a character decision to be made after the optimal linear discriminant is computed, and pointer values for indicating the data to be used for the next pairwise test in accordance with the character decision made at the end of the current test.
-
97. A method in accordance with claim 54 wherein in step (3) the character is recognized as being a particular character in said set only if during the performance of pairwise tests in step (2) the particular character passed more of the tests in which it was one of the two in the test pair than any other character.
-
98. A method in accordance with claim 87 wherein the features of said digitized character which are represented by said vector include contour data for said digitized character as seen looking in at least two different directions from outside the digitized character.
-
99. A method in accordance with claim 98 wherein the pairwise tests are included in a plurality of groups, the groups being associated with respective contour data sets and the pairwise tests included in the respective groups being those for discriminating between characters whose contour data features correspond to respective contour data sets, and in step (2) the only pairwise tests which are performed are those in the group for discriminating between characters whose contour data features correspond to the contour data set which is applicable to the contour data features represented by said vector.
-
100. A method for using apparatus to design a machine program for recognizing a digitized character as being one of a predetermined group of characters comprising the steps of:
-
101. A method in accordance with claim 100 wherein prior to the execution of step (3) a plurality of sets of characteristics descriptive of a feature set are identified, and in step (3) a set of discriminants and associated threshold values is computed for each of the characteristic sets in said plurality for discriminating between the character classes whose feature sets exhibit the respective set of characteristics.
-
102. A method in accordance with claim 101 wherein said set of features includes a representation of contour data for a character, and said sets of characteristics are descriptive of contour data represented by a set of features.
-
103. A method to be practiced on a machine for recognizing a character as one of a predetermined set comprising the steps of:
-
104. A method in accordance with claim 103 wherein said pairwise tests are performed in a sequence such that TIJ precedes TRQ if and only if PI>
- PR for I not = R and PJ>
PQ for I R, where Tij represents a test for discriminating between character classes i and j, and PK represents the probability of character class K, as opposed to all other character classes, containing the character to be recognized.
- PR for I not = R and PJ>
-
105. A method in accordance with claim 104 wherein each of the tests performed in step (1) is the computation of a linear discriminant designed to distinguish between two character classes.
-
106. A method in accordance with claim 105 wherein the linear discriminant computed during each test performed in step (1) is a function of data representing external contour patterns of the character to be recognized.
-
107. A method in accordance with claim 103 wherein in step (1) two lists are maintained, the first being a list containing an entry for each character class, which entry is the number of pairwise tests performed in which said character class was the one of the two in the pair which was determined to have the greater probability of containing the character to be recognized, and the second being a list containing an entry for each character class, which entry is an indication of the performance of at least one test in which said character clasS was one of the two in the test pair and was not determined to have the greater probability of containing the character to be recognized, and said two lists are updated following the performance of each pairwise test, the presence of condition (a) is detected by observing an indication in said second list of an entry for each character class, and the presence of condition (b) is detected by observing a number for the entry for any character class in said first list which is equal to the number of pairwise tests which include said any character class as one of the two in the test pair.
-
108. A method in accordance with claim 107 wherein the tests performed in step (2) serves to discriminate between respective pairs of characters in said predetermined set relative to a character to be recognized.
-
109. A method in accordance with claim 108 wherein each of the tests performed in step (2) is the computation of a linear discriminant.
-
110. A method in accordance with claim 109 wherein in step (2) the character is recognized as being a particular character in said set if during the performance of the pairwise tests the associated character class passed a predetermined number of the tests in which it was one of the two in the test pair.
-
111. A method in accordance with claim 110 wherein the pairwise tests are performed in step (2) in an order determined by the probabilities of occurrence of the characters in said set to reduce the average number of pairwise tests which otherwise would be performed to recognize a character.
-
112. A method in accordance with claim 103 wherein the tests performed in step (2) serve to discriminate between respective pairs of characters in said predetermined set relative to said character to be recognized.
-
113. A method in accordance with claim 112 wherein each of the tests performed in step (2) is the computation of a linear discriminant.
-
114. A method in accordance with claim 113 wherein the pairwise tests are performed in step (2) in an order determined by the probabilities of occurrence of the characters in said set to reduce the average number of pairwise tests which otherwise would be performed to recognize a character.
-
115. A method in accordance with claim 103 wherein each of the tests performed in step (2) is the computation of a linear discriminant.
-
116. A method in accordance with claim 115 wherein the pairwise tests are performed in step (2) in an order determined by the probabilities of occurrence of the characters in said set to reduce the average number of pairwise tests which otherwise would be performed to recognize a character.
-
117. A method in accordance with claim 103 wherein the pairwise tests are performed in step (2) in an order determined by the probabilities of occurrence of the characters in said set to reduce the average number of pairwise tests which otherwise would be performed to recognize a character.
-
118. A method in accordance with claim 117 wherein in step (1) two lists are maintained, the first being a list containing an entry for each character class, which entry is the number of pairwise tests performed in which said character class was the one of the two in the pair which was determined to have the greater probability of containing the character to be recognized, and the second being a list containing an entry for each character class, which entry is an indication of the performance of at least one test in which said character class was one of the two in the test pair and was not determined to have the greater probability of containing the character to be recognized, and said two lists are updated following the performance of each pairwise test, the presence of condition (a) is detected by observing an indication in said second list of an entry for each character class, and the presence of condition (b) is detected by observing a number for the entry for any character class in said first list which Is equal to the number of pairwise tests which include said any character class as one of the two in the test pair.
-
119. A method to be practiced on a machine for recognizing a character in digitized form as being one of a predetermined set of characters comprising the steps of:
-
120. A method in accordance with claim 119 wherein said tests discriminate respective pairs of characters in the respective sub-set of characters.
-
121. A method in accordance with claim 120 wherein each of said tests is the computation of a linear discriminant.
-
122. A method in accordance with claim 120 wherein the pairwise tests are performed in step (3) in an order determined by the probabilities of occurrence of the characters in the sub-set associated with the selected test group to reduce the average number of pairwise tests which otherwise would be performed to recognize a character.
-
123. A method in accordance with claim 120 wherein the elements of the vector constructed in step (1) are non-binary, continuous measures of features of the character.
-
124. A method in accordance with claim 119 wherein the elements of the vector constructed in step (1) are non-binary, continuous measures of features of the character.
-
125. A method in accordance with claim 119 wherein the tests are performed in step (3) in an order determined by the probabilities of occurrence of the characters in the sub-set associated with the selected test group to reduce the average number of tests which otherwise would be performed to recognize a character.
-
126. A method in accordance with claim 119 wherein each of said tests is the computation of a linear discriminant.
-
127. A method in accordance with claim 119 wherein the elements of the vector constructed in step (1) are non-binary, continuous measures of features of the character.
-
128. A method to be practiced on a machine for recognizing a digitized character as being one of a predetermined set of characters comprising the steps of:
-
129. A method in accordance with claim 128 wherein said vector elements represent the numbers, shapes and locations of alternating bumps of opposite convexities as seen looking from outside said digitized character.
-
130. A method in accordance with claim 128 wherein each of the tests performed in step (2) is the computation of a linear discriminant designed to distinguish between two characters.
-
131. A method in accordance with claim 130 wherein the linear discriminant computed during each test performed in step (2) is a function of data representing external contour patterns of the character to be recognized.
Specification