Methods and systems for processing of images of mathematical expressions
First Claim
1. A system comprising:
one or more electronic memories that store instructions for automated optical symbol recognition; and
one or more processors to execute the instructions to:
apply blocking to an image stored in at least one of the memories to decompose the image into an ordered set of symbol variants, wherein the image depicts a mathematical expression, and wherein, to apply blocking to the image, the processors are further to execute the instructions to:
set a blocking-direction indication to indicate one of a horizontal blocking-direction or a vertical blocking-direction;
set a current-level indication to indicate a first level;
block the image into sub-images at a level according to the current-level indication and in a direction according to the blocking-direction indication; and
recursively for each sub-image in the sub-images at the level, apply one or more symbol-recognition methods to the sub-image;
select a most probable path from among candidate paths corresponding to the ordered set of symbol variants;
use the most probable path and the ordered set of symbol variants to generate an encoded mathematical expression equivalent to the mathematical expression; and
store the encoded mathematical expression in one or more of the memories.
Abstract
The current document is directed to methods and systems that convert document images containing mathematical expressions into corresponding electronic documents. In one implementation, an image or sub-image containing a mathematical expression is recursively partitioned into blocks separated by white-space stripes. Horizontal and vertical partitioning are applied alternately and recursively to the image or sub-image until the lowest-level blocks obtained by partitioning correspond to symbols recognizable by character-recognition methods. Graph-based analysis of the recognized symbols provides a basis for encoding an equivalent representation of the mathematical expression contained in the image or sub-image.
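The recursive partitioning described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes a binarized image held as a list of pixel rows (1 = ink, 0 = white space), and it assumes that "horizontal" blocking means cutting at blank columns and "vertical" blocking means cutting at blank rows. The names `split` and `block` are hypothetical, and the symbol-recognition step is omitted: the sketch only returns the lowest-level blocks in order.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of pixels; 1 = ink, 0 = white space (assumed encoding)


def _runs(has_ink: List[bool]) -> List[Tuple[int, int]]:
    """Return [start, end) index ranges of consecutive inked rows or columns."""
    runs, start = [], None
    for i, ink in enumerate(has_ink):
        if ink and start is None:
            start = i
        elif not ink and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(has_ink)))
    return runs


def split(img: Image, horizontal: bool) -> List[Image]:
    """Cut img at white-space stripes: blank columns if horizontal, else blank rows."""
    if horizontal:
        cols = [any(row[c] for row in img) for c in range(len(img[0]))]
        return [[row[a:b] for row in img] for a, b in _runs(cols)]
    rows = [any(row) for row in img]
    return [img[a:b] for a, b in _runs(rows)]


def block(img: Image, horizontal: bool = True, level: int = 1,
          stalled: bool = False) -> List[Tuple[int, Image]]:
    """Alternate blocking directions recursively; return (level, block) leaves in order."""
    parts = split(img, horizontal)
    if not parts:                      # the image contains no ink at all
        return []
    if len(parts) == 1:
        if stalled:                    # neither direction splits: lowest-level block
            return [(level, parts[0])]
        # This direction found no stripe; try the other one at the same level.
        return block(parts[0], not horizontal, level, stalled=True)
    leaves: List[Tuple[int, Image]] = []
    for part in parts:                 # descend one level and switch direction
        leaves.extend(block(part, not horizontal, level + 1))
    return leaves
```

For example, `block([[1, 0, 1, 1], [0, 0, 0, 0], [1, 0, 1, 1]])` first cuts at the blank column and then cuts each half at the blank row, yielding four leaf blocks. In a full system each leaf would be passed to a symbol recognizer to produce the symbol variants.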
19 Claims
1. A system comprising:
one or more electronic memories that store instructions for automated optical symbol recognition; and
one or more processors to execute the instructions to:
apply blocking to an image stored in at least one of the memories to decompose the image into an ordered set of symbol variants, wherein the image depicts a mathematical expression, and wherein, to apply blocking to the image, the processors are further to execute the instructions to:
set a blocking-direction indication to indicate one of a horizontal blocking-direction or a vertical blocking-direction;
set a current-level indication to indicate a first level;
block the image into sub-images at a level according to the current-level indication and in a direction according to the blocking-direction indication; and
recursively for each sub-image in the sub-images at the level, apply one or more symbol-recognition methods to the sub-image;
select a most probable path from among candidate paths corresponding to the ordered set of symbol variants;
use the most probable path and the ordered set of symbol variants to generate an encoded mathematical expression equivalent to the mathematical expression; and
store the encoded mathematical expression in one or more of the memories.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A method comprising:
applying, by one or more processors that execute instructions stored in one or more memories for automated optical symbol recognition, blocking to an image stored in at least one of the memories to decompose the image into an ordered set of symbol variants, wherein the image depicts a mathematical expression;
selecting, by the processors, a most probable path from among candidate paths corresponding to the ordered set of symbol variants, wherein the candidate paths corresponding to the ordered set of symbol variants each comprise one or more arcs, wherein each of the arcs encompasses an ordered subset of the ordered set of symbol variants, wherein each of the arcs is associated with an arc weight, and wherein the most probable path is selected based on the arc weight of one or more of the arcs;
using the most probable path and the ordered set of symbol variants to generate an encoded mathematical expression equivalent to the mathematical expression; and
storing the encoded mathematical expression in one or more of the memories.
Dependent claims: 12, 13, 14, 15, 16, 17, 18
19. A non-transitory computer-readable storage medium having stored therein instructions that, when executed by one or more processors, cause the processors to:
apply, by the processors that execute the instructions for automated optical symbol recognition, blocking to an image stored in at least one of one or more memories to decompose the image into an ordered set of symbol variants, wherein the image depicts a mathematical expression;
select, by the processors, a most probable path from among candidate paths corresponding to the ordered set of symbol variants, wherein the candidate paths corresponding to the ordered set of symbol variants each comprise one or more arcs, wherein each of the arcs encompasses an ordered subset of the ordered set of symbol variants, wherein each of the arcs is associated with an arc weight, and wherein the most probable path is selected based on the arc weight of one or more of the arcs;
use the most probable path and the ordered set of symbol variants to generate an encoded mathematical expression equivalent to the mathematical expression; and
store the encoded mathematical expression in one or more of the memories.
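The path selection recited in claims 11 and 19 can be illustrated with a small dynamic-programming sketch in Python. The claims state only that each arc encompasses an ordered subset of the symbol variants, carries a weight, and that the most probable path is selected based on arc weights. The sketch therefore makes several assumptions that are not taken from the claims: each arc covers a contiguous span `[start, end)` of the ordered variants, each weight is a probability in (0, 1], and a path's probability is the product of its arc weights. The `Arc` layout and the function name `most_probable_path` are hypothetical.

```python
import math
from typing import Dict, List, Tuple

# (start, end, label, weight): the arc spans symbol variants [start, end).
# This layout is an assumption made for illustration only.
Arc = Tuple[int, int, str, float]


def most_probable_path(n: int, arcs: List[Arc]) -> Tuple[float, List[Arc]]:
    """Return (probability, arcs) of the best path covering variants 0..n."""
    # best[p] = (log-probability, arcs) of the best path that ends at position p
    best: Dict[int, Tuple[float, List[Arc]]] = {0: (0.0, [])}
    for end in range(1, n + 1):
        for arc in arcs:
            start, arc_end, _, weight = arc
            if arc_end != end or not 0 <= start < arc_end:
                continue               # arc does not end here, or is malformed
            if start not in best or weight <= 0.0:
                continue               # arc is unreachable or has zero probability
            score = best[start][0] + math.log(weight)
            if end not in best or score > best[end][0]:
                best[end] = (score, best[start][1] + [arc])
    if n not in best:
        raise ValueError("no candidate path covers all symbol variants")
    log_prob, path = best[n]
    return math.exp(log_prob), path
```

With three symbol variants and the invented arcs `(0, 1, "l", 0.6)`, `(1, 2, "n", 0.7)`, `(0, 2, "ln", 0.9)`, and `(2, 3, "x", 0.8)`, the path through `"ln"` and `"x"` is preferred over the path through `"l"`, `"n"`, and `"x"`, because its product of weights is larger. Summing log-weights rather than multiplying probabilities avoids numerical underflow on long expressions.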
Specification