Method and apparatus for producing a hybrid data structure for displaying a raster image
First Claim
1. A method for producing a hybrid data structure from an input raster image which has been scanned and converted to an input bitmap, the hybrid data structure including coded portions which represent lexical units contained within a first part of the input bitmap, the lexical units being organized into hierarchical levels selected from the class consisting of a blob level, a character level, a word level, a text line level, a text block level, a page level and a document level, and a non-coded second part of the input bitmap, the coded portions themselves being capable of conversion to bitmap representations of the lexical units, the method comprising:
performing a recognition process on the input bitmap, thereby recognizing the lexical units;
assigning a confidence level to each lexical unit indicating how confidently it has been recognized;
assigning a data code to each lexical unit to which a confidence level has been assigned at or above a predetermined confidence level; and
creating the hybrid data structure including the assigned data codes, the input bitmap for any lexical units below the predetermined confidence level and the non-coded second part of the input bitmap.
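The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes a recognizer has already produced lexical units with bounding boxes, labels and confidence values. The names `RecognizedUnit`, `CodedEntry`, `BitmapEntry`, `crop` and `build_hybrid` are hypothetical, as is the representation of a bitmap as nested lists of 0/1 pixels and of the non-coded second part as a list of bounding boxes.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

Bitmap = List[List[int]]          # rows of 0/1 pixels; a stand-in for real image data
BBox = Tuple[int, int, int, int]  # (left, top, right, bottom); right/bottom exclusive


@dataclass
class RecognizedUnit:
    """One lexical unit as reported by a hypothetical recognizer."""
    bbox: BBox
    label: str         # candidate data code, e.g. a character or word
    confidence: float  # how confidently the unit was recognized


@dataclass
class CodedEntry:
    """Coded portion: the data code stands in for the pixels."""
    bbox: BBox
    code: str


@dataclass
class BitmapEntry:
    """Non-coded portion: the original pixels are kept."""
    bbox: BBox
    pixels: Bitmap


HybridEntry = Union[CodedEntry, BitmapEntry]


def crop(bitmap: Bitmap, bbox: BBox) -> Bitmap:
    """Copy the pixels inside bbox out of the input bitmap."""
    left, top, right, bottom = bbox
    return [row[left:right] for row in bitmap[top:bottom]]


def build_hybrid(bitmap: Bitmap,
                 units: List[RecognizedUnit],
                 non_coded_regions: List[BBox],
                 threshold: float) -> List[HybridEntry]:
    """Assemble a hybrid structure from recognized units and non-coded regions."""
    entries: List[HybridEntry] = []
    for unit in units:
        if unit.confidence >= threshold:
            # At or above the predetermined level: store the data code.
            entries.append(CodedEntry(unit.bbox, unit.label))
        else:
            # Below the level: keep the input bitmap for this unit.
            entries.append(BitmapEntry(unit.bbox, crop(bitmap, unit.bbox)))
    # The non-coded second part of the input bitmap is always kept as pixels.
    for bbox in non_coded_regions:
        entries.append(BitmapEntry(bbox, crop(bitmap, bbox)))
    return entries
```

In this sketch a unit recognized with confidence 0.95 against a threshold of 0.8 is stored as a code, while a unit at 0.40 is stored as its original pixels, so a low-confidence guess never replaces the scanned image.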
Abstract
A system for producing a raster image derived from coded and non-coded portions of a hybrid data structure from an input bitmap, including (1) a data processing apparatus, (2) a recognizer which performs recognition on a bitmap input to the data processing apparatus to detect identifiable objects within the input bitmap, (3) a mechanism for producing a hybrid data structure including coded data corresponding to the identifiable objects and non-coded data derived from portions of the input bitmap which do not correspond to the identifiable objects, and (4) an output device capable of developing a visually perceptible raster image derived from the hybrid data structure. The raster image includes raster images of the identifiable objects and raster images derived from portions of the input bitmap that do not correspond to the identifiable objects. Also included is a method for producing a hybrid data structure for a bitmap of an image, having the steps of: (1) inputting a signal comprising a bitmap into a digital processing apparatus, (2) partitioning the bitmap into a hierarchy of lexical units, (3) assigning labels to a label list for each lexical unit of a predetermined hierarchical level, where labels in the label list have an associated confidence level, and (4) storing each lexical unit in a hybrid data structure as either an identifiable object or a non-identifiable object.
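Steps (3) and (4) of the method in the abstract can be sketched as follows. The abstract does not say how a label list determines whether a unit is identifiable, so the rule used here (take the highest-confidence label and compare it with a threshold) is an assumption borrowed from the wording of claim 1. The names `LabelList`, `best_label` and `classify_unit` are hypothetical.

```python
from typing import List, Optional, Tuple

# A label list: candidate labels for one lexical unit, each with a confidence level.
LabelList = List[Tuple[str, float]]


def best_label(labels: LabelList) -> Optional[Tuple[str, float]]:
    """Return the (label, confidence) pair with the highest confidence, if any."""
    if not labels:
        return None
    return max(labels, key=lambda pair: pair[1])


def classify_unit(labels: LabelList, threshold: float) -> Tuple[str, Optional[str]]:
    """Decide how a lexical unit would be stored in the hybrid data structure.

    Assumed rule: the unit is an identifiable object when its best label
    reaches the threshold; otherwise it is a non-identifiable object.
    """
    best = best_label(labels)
    if best is not None and best[1] >= threshold:
        return ("identifiable", best[0])
    return ("non-identifiable", None)
```

For example, `classify_unit([("cat", 0.91), ("cot", 0.55)], 0.8)` yields `("identifiable", "cat")`, while a unit whose best candidate scores 0.6 is left non-identifiable and would be stored as bitmap data.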
48 Citations
4 Claims
Specification