Method and apparatus for producing a hybrid data structure for displaying a raster image
Abstract
A system for producing a raster image derived from coded and non-coded portions of a hybrid data structure from an input bitmap including (1) a data processing apparatus, (2) a recognizer which performs recognition on an input bitmap to the data processing apparatus to detect identifiable objects within the input bitmap, (3) a mechanism for producing a hybrid data structure including coded data corresponding to the identifiable objects and non-coded data derived from portions of the input bitmap which do not correspond to the identifiable objects, and (4) an output device capable of developing a visually perceptible raster image derived from the hybrid data structure. The raster image includes raster images of the identifiable objects and raster images derived from portions of the input bitmap that do not correspond to the identifiable objects. The invention includes a method for producing a hybrid data structure for a bitmap of an image having the steps of: (1) inputting a signal comprising a bitmap into a digital processing apparatus, (2) partitioning the bitmap into a hierarchy of lexical units, (3) assigning labels to a label list for each lexical unit of a predetermined hierarchical level, where labels in the label list have an associated confidence level, and (4) storing each lexical unit in a hybrid data structure as either an identifiable object or a non-identifiable object.
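As an illustration of steps (3) and (4) of the abstract, the following minimal Python sketch attaches a label list with confidence levels to a lexical unit and then classifies the unit as identifiable or non-identifiable. The class names `Label` and `LexicalUnit`, the function `classify`, the 0.0 to 1.0 confidence scale, and the 0.9 threshold are all assumptions made for this example; the abstract does not specify them.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Label:
    """One candidate reading of a lexical unit (hypothetical type)."""
    text: str
    confidence: float  # assumed scale: 0.0 (no confidence) to 1.0 (certain)


@dataclass
class LexicalUnit:
    """A region of the input bitmap at some hierarchical level, e.g. a word."""
    bbox: Tuple[int, int, int, int]  # (x, y, width, height) in the input bitmap
    labels: List[Label] = field(default_factory=list)

    def best_label(self) -> Optional[Label]:
        # Highest-confidence candidate, or None when the label list is empty.
        return max(self.labels, key=lambda lab: lab.confidence, default=None)


def classify(unit: LexicalUnit, threshold: float = 0.9) -> Tuple[str, Optional[str]]:
    """Decide how the unit would be stored (threshold value is an assumption)."""
    best = unit.best_label()
    if best is not None and best.confidence >= threshold:
        return ("identifiable", best.text)
    return ("non-identifiable", None)
```

A unit whose best label reaches the threshold would be stored as coded data; any other unit would keep its pixels from the input bitmap.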
10 Claims
1. A method for producing an output raster image from a hybrid data structure without access to external data, the method comprising:
receiving an input bitmap representing detected objects in a document;
performing a recognition process on the input bitmap in order to recognize identifiable lexical objects as particular characters;
representing recognized lexical objects as coded data corresponding to particular characters, the coded data including a character code having a standard format, font and point size for each of one or more characters;
representing detected lexical objects that are not confidently identifiable as particular characters as non-coded bitmap data derived from unidentifiable lexical objects of the input bitmap;
creating a hybrid data structure including the coded data and the non-coded data derived from the input bitmap, the hybrid data structure having a first part incorporating only the coded data, said first part capable of being converted to bitmap representations of the recognized lexical objects and a second part incorporating only the non-coded bitmap data, said second part capable of being rendered as bitmap representations of the unidentifiable lexical objects; and
creating an output raster image from the hybrid data structure by converting the first part to the bitmap representations of the recognized lexical objects and combining the bitmap representations of the recognized lexical objects with the bitmap representations of the unidentifiable lexical objects rendered from the second part of the hybrid data structure.
Dependent claims: 2, 4, 5
3. A method for producing a hybrid data structure from an input bitmap, the method comprising:
receiving an input bitmap representing detected objects in a document;
performing a recognition process on the input bitmap in order to recognize identifiable lexical objects as particular characters;
representing recognized lexical objects as coded data corresponding to particular characters, the coded data including a character code having a standard format, font and point size for each of one or more characters;
assigning a confidence level to each recognized lexical object;
assigning a data code to each recognized lexical object for which a confidence level at or above a predetermined confidence level has been assigned, the data code capable of being converted to a bitmap representation of the recognized lexical object;
representing detected lexical objects that are not confidently identifiable as particular characters as non-coded bitmap data derived from unidentifiable lexical objects of the input bitmap, the bitmap data capable of being rendered as bitmap representations of the unidentifiable lexical objects; and
creating a hybrid data structure capable of reproducing the entire input raster image without access to external data, the hybrid data structure including a first part incorporating only the coded data and a second part incorporating only the non-coded bitmap data, the first part including assigned data codes for recognized lexical objects at or above the confidence level and input bitmap data for any detected lexical object below the predetermined confidence level, the second part including the non-coded data for the unidentifiable lexical objects.
Dependent claims: 6
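The construction recited in claim 3 can be sketched roughly as follows. This is an illustrative reading, not the patented implementation: the types `Detection`, `CodedEntry`, `BitmapEntry`, and `HybridStructure`, the function `build_hybrid`, the use of Unicode code points as the "standard format" character code, and the 0.9 threshold are all assumptions. The sketch also simplifies the claim wording by routing every below-threshold object to the non-coded part.

```python
from dataclasses import dataclass, field
from typing import List

Bitmap = List[List[int]]  # rows of 0/1 pixels (assumed representation)


@dataclass
class Detection:
    """Hypothetical recognizer output for one detected lexical object."""
    x: int
    y: int
    w: int
    h: int
    char: str
    font: str
    point_size: float
    confidence: float


@dataclass
class CodedEntry:
    x: int
    y: int
    char_code: int  # assumption: a Unicode code point as the standard format
    font: str
    point_size: float


@dataclass
class BitmapEntry:
    x: int
    y: int
    pixels: Bitmap  # copied from the input bitmap, so no external data is needed


@dataclass
class HybridStructure:
    coded: List[CodedEntry] = field(default_factory=list)       # first part
    non_coded: List[BitmapEntry] = field(default_factory=list)  # second part


def crop(bitmap: Bitmap, x: int, y: int, w: int, h: int) -> Bitmap:
    """Copy the w-by-h region whose top-left corner is (x, y)."""
    return [row[x:x + w] for row in bitmap[y:y + h]]


def build_hybrid(bitmap: Bitmap, detections: List[Detection],
                 threshold: float = 0.9) -> HybridStructure:
    hybrid = HybridStructure()
    for d in detections:
        if d.confidence >= threshold:
            # Confident recognition: store only the code, font and point size.
            hybrid.coded.append(
                CodedEntry(d.x, d.y, ord(d.char), d.font, d.point_size))
        else:
            # Not confidently identifiable: keep the original pixels.
            hybrid.non_coded.append(
                BitmapEntry(d.x, d.y, crop(bitmap, d.x, d.y, d.w, d.h)))
    return hybrid
```

Keeping the two parts in separate lists mirrors the claim language of a first part holding only coded data and a second part holding only non-coded bitmap data.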
7. A method for recreating a visually perceptible raster image of a document from a hybrid data structure, the method comprising:
receiving an input bitmap representing detected objects in a document;
performing recognition on the input bitmap to recognize characters within a first part of the input bitmap;
representing recognized characters as coded data, the coded data including a character code having a standard format, font and point size for each of the characters, the coded data capable of being converted to an output bitmap of the recognized characters;
creating a hybrid data structure capable of reproducing a visually perceptible raster image without accessing external data, the hybrid data structure including a first part incorporating only the coded data and a non-coded second part incorporating only non-coded bitmap data, the non-coded second part representing detected lexical objects that are not confidently identifiable as particular characters and including bitmap data derived from unidentifiable detected lexical objects of the input bitmap, the bitmap data capable of being rendered as bitmap representations of the unidentifiable lexical objects; and
recreating the visually perceptible raster image from the hybrid data structure by converting the first part to bitmap representations of the recognized objects and combining the bitmap representations of the recognized objects with the bitmap representations of the unrecognized objects rendered from the non-coded second part of the hybrid data structure, the raster image having an appearance virtually identical to the document.
Dependent claims: 8
9. A method for producing an output raster image without accessing external data, the method comprising:
receiving a hybrid data structure derived from an input bitmap, the input bitmap representing detected objects in a document, the hybrid data structure including a first part incorporating only coded data and a second part incorporating only non-coded data derived from the input bitmap, the first part representing all lexical objects classified as recognized, the coded data including a character code having a standard format, font and point size for each of one or more characters, the second part representing lexical objects that are not confidently identifiable as particular characters, the non-coded data including bitmap data derived from unidentifiable lexical objects in the input bitmap, the first part of the hybrid data structure having coded portions capable of being converted to bitmap representations of the recognized lexical objects and the second part of the hybrid data structure having non-coded portions capable of being rendered as bitmap representations of the unidentifiable lexical objects; and
creating an output raster image from the hybrid data structure by converting the coded portions of the first part to bitmap representations of the recognized lexical objects and combining the bitmap representations of the recognized lexical objects with bitmap representations of the unidentifiable lexical objects rendered from the non-coded portions of the second part of the hybrid data structure.
Dependent claims: 10
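The rendering step of claim 9 might look like the sketch below, which converts coded entries to glyph bitmaps and combines them on one canvas with the stored non-coded bitmaps. Everything named here is hypothetical: the tiny `GLYPHS` table stands in for a real font rasterizer, the tuple layouts `(x, y, char)` and `(x, y, pixels)` are an assumed encoding of the two parts, and font and point size are ignored to keep the example short.

```python
from typing import Dict, List, Tuple

Bitmap = List[List[int]]  # rows of 0/1 pixels (assumed representation)

# Hypothetical 3x3 glyph table standing in for a font rasterizer.
GLYPHS: Dict[str, Bitmap] = {
    "I": [[1, 1, 1], [0, 1, 0], [1, 1, 1]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
}


def blit(canvas: Bitmap, tile: Bitmap, x: int, y: int) -> None:
    """OR the set pixels of tile onto canvas at (x, y), clipping at the edges."""
    for dy, row in enumerate(tile):
        for dx, pixel in enumerate(row):
            inside = 0 <= y + dy < len(canvas) and 0 <= x + dx < len(canvas[0])
            if pixel and inside:
                canvas[y + dy][x + dx] = 1


def render(width: int, height: int,
           coded: List[Tuple[int, int, str]],
           non_coded: List[Tuple[int, int, Bitmap]]) -> Bitmap:
    canvas = [[0] * width for _ in range(height)]
    # First part: convert each character code to a bitmap via the glyph table.
    for x, y, char in coded:
        blit(canvas, GLYPHS[char], x, y)
    # Second part: place the stored bitmaps unchanged.
    for x, y, pixels in non_coded:
        blit(canvas, pixels, x, y)
    return canvas
```

For example, `render(7, 3, [(0, 0, "I")], [(4, 0, [[1, 0, 1], [0, 1, 0], [1, 0, 1]])])` draws the glyph for "I" at the left and the unrecognized mark at the right of a single 7-by-3 raster.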
Specification