System and method for automated interpretation of input expressions using novel a posteriori probability measures and optimally trained information processing networks
First Claim
1. A system for analyzing an input expression and scoring possible interpretations of said input expression, said system comprising:
segment producing means for analyzing an input data set representative of said input expression and dividing said input data set into a plurality of segments, each said segment having specifiable boundaries and being classifiable as possibly representing any one of a plurality of symbols in a predetermined symbol set, said input data set comprising a set of pixels associated with an acquired image of a graphically represented sequence of symbols, and said segment producing means analyzing said set of pixels and dividing said set of pixels into a plurality of image segments, such that each said image segment has specified boundaries and is classifiable as possibly representing any one or more of said plurality of symbols in a predetermined symbol set;
segment scoring means for analyzing each segment in said plurality of segments, and assigning a score to each possible classification of said segment associated with a particular symbol in said predetermined symbol set;
representation means for representing a plurality of possible interpretations for said input expression, and a plurality of image consegmentations, wherein each said possible interpretation consists of a different sequence of symbols selected from said plurality of symbols, and each said consegmentation consists of a different sequence of said plurality of segments;
consegmentation scoring means for assigning scores to said plurality of consegmentations based on the scores assigned to said segments;
candidate interpretation identifying means for identifying one or more candidate symbol interpretations from said plurality of possible interpretations based on the scores assigned to said plurality of segments;
symbol sequence scoring means for assigning scores to said one or more candidate interpretations based on the scores assigned to one or more of said plurality of segments;
first score evaluation means for evaluating the scores assigned to said one or more candidate interpretations;
second score evaluation means for evaluating the scores assigned to said plurality of possible interpretations; and
normalized score producing means for producing a normalized score for each candidate interpretation using the evaluated score for said plurality of possible interpretations.
Abstract
A method and system for forming an interpretation of an input expression, where the input expression is expressed in a medium, the interpretation is a sequence of symbols, and each symbol is a symbol in a known symbol set. In general, the system processes an acquired input data set representative of the input expression, to form a set of segments, which are then used to specify a set of consegmentations. Each consegmentation and each possible interpretation for the input expression is represented in a data structure. The data structure is graphically representable by a graph comprising a two-dimensional array of nodes arranged in rows and columns and selectively connected by directed arcs. Each path, extending through the nodes and along the directed arcs, represents one consegmentation and one possible interpretation for the input expression. All of the consegmentations and all of the possible interpretations for the input expression are represented by the set of paths extending through the graph. For each row of nodes in the graph, a set of scores is produced for the known symbol set, using a complex of optimally trained neural information processing networks. Thereafter the system computes an a posteriori probability for one or more symbol sequence interpretations. By deriving each a posteriori probability solely through analysis of the acquired input data set, highly reliable probabilities are produced for competing interpretations for the input expression.
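The a posteriori probability described in the abstract can be illustrated with a small Python sketch. It is not taken from the patent: the segment boundaries, the per-symbol scores in `SEGMENT_SCORES`, the product rule for path scores, and all function names are assumptions chosen for illustration. The sketch enumerates every consegmentation of a tiny input, labels each segment with every symbol, sums the path scores per symbol sequence, and divides by the total over all paths.

```python
from collections import defaultdict
from itertools import product

# Hypothetical data: segment (start_cut, end_cut) -> {symbol: score}.
# In the described system these scores would come from trained networks.
SEGMENT_SCORES = {
    (0, 1): {"1": 0.6, "7": 0.4},
    (1, 2): {"1": 0.5, "7": 0.5},
    (0, 2): {"1": 0.1, "7": 0.9},
}
END = 2  # last cut position; a consegmentation must cover [0, END]


def consegmentations(start=0):
    """Yield every ordered list of segments that tiles [start, END]."""
    if start == END:
        yield []
        return
    for (s, e) in SEGMENT_SCORES:
        if s == start:
            for rest in consegmentations(e):
                yield [(s, e)] + rest


def interpretation_posteriors():
    """Return {symbol sequence: normalized score} over all paths."""
    totals = defaultdict(float)
    for segs in consegmentations():
        choices = [SEGMENT_SCORES[seg].items() for seg in segs]
        # One path = one consegmentation plus one symbol per segment.
        for labelled in product(*choices):
            path_score = 1.0
            for _, score in labelled:
                path_score *= score  # assumed: path score is a product
            totals["".join(sym for sym, _ in labelled)] += path_score
    z = sum(totals.values())  # evaluated score for all interpretations
    return {interp: t / z for interp, t in totals.items()}


if __name__ == "__main__":
    for interp, p in sorted(interpretation_posteriors().items()):
        print(interp, round(p, 3))
```

With these made-up scores the normalized values sum to one, and the single-segment reading "7" receives the largest share. Brute-force enumeration is only practical for toy inputs; the graph structure described in the abstract is what would let the same sums be accumulated without listing every path.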
37 Claims
1. A system for analyzing an input expression and scoring possible interpretations of said input expression, said system comprising:
segment producing means for analyzing an input data set representative of said input expression and dividing said input data set into a plurality of segments, each said segment having specifiable boundaries and being classifiable as possibly representing any one of a plurality of symbols in a predetermined symbol set, said input data set comprising a set of pixels associated with an acquired image of a graphically represented sequence of symbols, and said segment producing means analyzing said set of pixels and dividing said set of pixels into a plurality of image segments, such that each said image segment has specified boundaries and is classifiable as possibly representing any one or more of said plurality of symbols in a predetermined symbol set;
segment scoring means for analyzing each segment in said plurality of segments, and assigning a score to each possible classification of said segment associated with a particular symbol in said predetermined symbol set;
representation means for representing a plurality of possible interpretations for said input expression, and a plurality of image consegmentations, wherein each said possible interpretation consists of a different sequence of symbols selected from said plurality of symbols, and each said consegmentation consists of a different sequence of said plurality of segments;
consegmentation scoring means for assigning scores to said plurality of consegmentations based on the scores assigned to said segments;
candidate interpretation identifying means for identifying one or more candidate symbol interpretations from said plurality of possible interpretations based on the scores assigned to said plurality of segments;
symbol sequence scoring means for assigning scores to said one or more candidate interpretations based on the scores assigned to one or more of said plurality of segments;
first score evaluation means for evaluating the scores assigned to said one or more candidate interpretations;
second score evaluation means for evaluating the scores assigned to said plurality of possible interpretations; and
normalized score producing means for producing a normalized score for each candidate interpretation using the evaluated score for said plurality of possible interpretations.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A method for forming an interpretation of an input expression, where said input expression is expressed in a medium, said interpretation is a sequence of symbols, and each symbol being an element in a predetermined symbol set, said method comprising the steps:
(a) acquiring an input data set representative of said input expression, said input data set comprising a set of pixels associated with an acquired image of a graphically represented sequence of symbols;
(b) processing said input data set so as to form a set of image segments, each said image segment having specified boundaries and being classifiable as possibly representing any one or more of said plurality of symbols in said predetermined symbol set;
(c) forming a data structure that represents a set of consegmentations and a set of possible interpretations for said input expression, each said consegmentation consisting of a set of said segments which collectively represent said input data set and being arranged in an order that substantially preserves the sequential structure of said input data set, each said possible interpretation for said input expression consisting of a possible symbol sequence, and each symbol in said possible symbol sequence being selected from said predetermined symbol set and occupying a symbol position in said possible symbol sequence, said data structure being graphically representable by a graph comprising a two-dimensional array of nodes arranged in rows and columns and selectively connected by directed arcs, each said column of nodes being indexable by one said symbol position, and each said row of nodes being indexable by one said image segment in an order that corresponds to the logical structure of said acquired input data set, and each path extending through said nodes and along said directed arcs representing one said consegmentation and one said possible interpretation for said input expression, and all of said consegmentations and all of said possible interpretations for said input expression being represented by the set of paths extending through said graph;
(d) for each row of nodes in said graph, producing a set of scores for said predetermined symbol set represented by each node in said row, wherein the production of each said set of scores includes analyzing the segment indexing the row of nodes for which said set of scores is produced;
(e) implicitly or explicitly attributing a path score to paths through said graph; and
(f) analyzing the path scores attributed to the paths through said graph in step (e) in order to select one or more possible interpretations for said input expression.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23
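Steps (c) through (f) of claim 12 can be sketched in Python as a search over a small graph whose rows are indexed by segments and whose columns are indexed by symbol positions. This is a minimal illustration under stated assumptions, not the claimed method: the `SEGMENTS` data, the rule that an arc joins a segment to any segment starting where it ends, the product path score, and the Viterbi-style selection in `best_interpretation` are all hypothetical choices made for the example.

```python
# Hypothetical segments: (start_cut, end_cut, {symbol: score}).
# Each list index is one row of the graph.
SEGMENTS = [
    (0, 1, {"1": 0.6, "7": 0.4}),
    (0, 2, {"1": 0.1, "7": 0.9}),
    (1, 2, {"1": 0.5, "7": 0.5}),
]
LAST_CUT = 2  # a complete path must end on a segment reaching this cut


def best_interpretation(segments, last_cut):
    """Return (path score, symbol string) of the highest-scoring path.

    Nodes are (row, col) pairs: row = segment index, col = symbol position.
    """
    best = {}  # (row, col) -> (score, symbols) of the best path ending there
    max_cols = len(segments)  # a path uses each segment at most once
    for col in range(max_cols):
        for row, (start, _end, scores) in enumerate(segments):
            # No language model here, so the best symbol at a node is
            # simply the one with the highest score for that segment.
            sym, s = max(scores.items(), key=lambda kv: kv[1])
            if col == 0:
                if start == 0:  # paths may only begin at the first cut
                    best[(row, col)] = (s, sym)
                continue
            # Assumed arc rule: arrive from a node in the previous column
            # whose segment ends exactly where this segment starts.
            preds = [
                best[(r, col - 1)]
                for r, (_s, e, _sc) in enumerate(segments)
                if e == start and (r, col - 1) in best
            ]
            if preds:
                p_score, p_syms = max(preds)
                best[(row, col)] = (p_score * s, p_syms + sym)
    # Only nodes whose segment reaches the last cut end a complete path.
    finals = [v for (row, _c), v in best.items() if segments[row][1] == last_cut]
    return max(finals)


if __name__ == "__main__":
    print(best_interpretation(SEGMENTS, LAST_CUT))
```

The sketch attributes scores to paths implicitly, by carrying only the best partial path per node, and then selects a single interpretation. Selecting several interpretations, or summing over paths instead of maximizing, would need a different accumulation rule at each node.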
24. A system for forming an interpretation of an input expression, where said input expression is expressed in a medium, said interpretation is a sequence of symbols, and each said symbol is an element in a predetermined symbol set, said system comprising:
data set acquisition means for acquiring input data set representative of said input expression;
data processing means for processing said acquired data set so as to produce a plurality of segments, each said segment having specifiable boundaries and being classifiable as possibly representing any one of a plurality of symbols in a predetermined symbol set;
consegmentation specifying means for producing data specifying a set of consegmentations, each said segmentation consisting of a set of said segments collectively representing said acquired input data set and being arranged in an order that substantially preserves the sequential structure of said acquired input data set;
symbol sequence interpretation specifying means for producing data specifying a set of possible interpretations for said input expression, each said possible interpretation for said input expression consisting of a possible sequence of symbols and each said symbol in said possible sequence of symbols being selected from said predetermined symbol set and occupying a symbol position in said possible sequence of symbols;
data storing means for storing in a data structure, the produced data representative of each said consegmentation and each said possible interpretation for said input expression, wherein said data structure is graphically representable by a graph comprising a two-dimensional array of nodes arranged in rows and columns and selectively connected by directed arcs, and wherein each said column of nodes is indexable by one said symbol position and each said row of nodes is indexable by one said segment in an order that corresponds to the sequential structure of said acquired input data set, wherein each path extending through said nodes and along said directed arcs represents one said consegmentation and one said possible interpretation for said input expression, wherein said set of consegmentations and said set of possible interpretations for said input expression are represented by the set of paths extending through said graph;
segment analyzing means for analyzing the data in each said segment, and producing, for each row of nodes in said graph, a set of scores for said symbol set represented by each node in said row;
path score computing means for computing a path score for each said path through said graph; and
path score analyzing means for analyzing the computed path scores in order to select one or more said possible interpretations for said input expression.
Dependent claims: 25, 26, 27, 28, 29, 30, 31, 32, 33
34. A system for forming an interpretation of an input expression, where said input expression is expressed in a medium, said interpretation is a sequence of symbols, and each said symbol is an element in a predetermined symbol set, said system comprising:
image acquisition means for acquiring an image of said input expression;
image processing means for processing said image so as to form a set of image segments, each said image segment being a sub-image of said acquired image;
image consegmentation specifying means for producing data specifying a set of image consegmentations, each said image consegmentation consisting of a set of said image segments collectively representing said acquired image and being arranged in an order that substantially preserves the spatial structure of said acquired image;
symbol sequence interpretation specifying means for producing data specifying a set of possible interpretations for said input expression, each said possible interpretation for said input expression consisting of a sequence of symbols, each said symbol in said sequence of symbols being selected from said predetermined symbol set and occupying a symbol position in said sequence of symbols;
data storage means for storing in a data structure, the produced data representative of each said image consegmentation and each said possible interpretation for said input expression, wherein said data structure is graphically representable by a directed acyclic graph comprising a two-dimensional array of nodes arranged in rows and columns and selectively connected by directed arcs, and wherein each said column of nodes is indexable by one said symbol position, and each said row of nodes is indexable by one said image segment in an order that corresponds to the spatial structure of said acquired image, and wherein each path extending through said nodes and along said directed arcs represents one said image consegmentation and one said possible interpretation for said input expression, and all of said image consegmentations and all of said possible interpretations for said input expression are represented by the set of paths extending through said graph;
image segment analyzing means for analyzing each said image segment, and producing, for each row of nodes in said graph, a set of scores for said predetermined symbol set represented by each node in said row;
path score computing means for computing a path score for each said path through said graph; and
path score analyzing means for analyzing the computed path scores in order to select one or more of said possible interpretations for said input expression.
Dependent claims: 35, 36, 37
Specification