Method and apparatus for selecting text and image data from video images
Abstract
A method carried out in an image processing system in which images of documents are captured by an image capture device, such as a video camera, comprising: (a) displaying successive images captured by the video camera, each image being defined by grayscale image data and containing text matter, (b) receiving a first user input (mouse button click) defining the start of a selection and a first position within the displayed image, (c) in response to the first user input, freezing the displayed image, (d) determining the skew angle of text matter with respect to the field of view of the video camera, (e) receiving at least one further user input (further button click; drag of cursor), including a final user input (mouse button release), defining the end of a selection, and for the or each further user input, (f) determining, using the skew angle determined in step (d), the position, shape and dimensions of a selection element in dependence upon at least the first position, and (g) displaying the selection element superimposed on the frozen displayed image. The selection element may be a rectangle, or a selection block highlighting one or more words of text.
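The rectangular selection element mentioned in the abstract can be illustrated with a minimal sketch. It assumes the skew angle is already known (in radians) and that the two user positions mark opposite corners of a rectangle whose sides run parallel and perpendicular to the text flow. The function name `skewed_rectangle` and the corner ordering are illustrative choices, not taken from the patent.

```python
import math

def skewed_rectangle(p1, p2, theta):
    """Return the four corners of a rectangle aligned with skewed text.

    p1, p2: (x, y) positions of the first and second user inputs, taken
            here as opposite corners (an assumption of this sketch).
    theta:  skew angle of the text flow, in radians.
    """
    # Unit vector along the direction of text flow.
    ux, uy = math.cos(theta), math.sin(theta)
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    # Signed extent of the diagonal along the text flow.
    along = dx * ux + dy * uy
    a = p1
    b = (p1[0] + along * ux, p1[1] + along * uy)   # p1 moved along the flow
    c = p2
    d = (p2[0] - along * ux, p2[1] - along * uy)   # p2 moved back along the flow
    return [a, b, c, d]
```

With `theta = 0` the result reduces to an ordinary axis-aligned rectangle; for a non-zero skew the sides tilt with the text rather than with the camera's field of view.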
116 Citations
16 Claims
-
1. A method for selecting text and image data from documents with an image processing system in which images of documents are captured by an image capture device having a field of view, comprising the steps of:
-
(a) displaying an image captured by the image capture device;
the displayed image containing text matter being defined by one of grayscale image data and color image data;
(b) receiving a first user input defining both a start of a selection and a first position within the displayed image;
(c) responsive to said step (b), freezing the displayed image to define a frozen displayed image;
(d) determining a skew angle θ of the text matter with respect to the field of view of the image capture device;
(e) receiving a second user input defining both an end of the selection and a second position within the displayed image;
the second user input defining a final user input;
(f) determining, using the skew angle θ determined in said step (d), a selection element;
the selection element having a position, a shape and dimensions that are, at least, dependent upon the first position; and
(g) displaying the selection element superimposed on the frozen displayed image;
wherein step (f) further comprises the steps of:
selecting a word of the text matter within the displayed image;
said step (f) determining the selection element to include a selection block overlaying the word;
(f1) determining a word separation value (Sw)min from measured values of separation between adjacent pairs of characters in the text matter; and
(f2) determining dimensions of the selection block in a direction of flow of the text matter as a function of the word separation value (Sw)min determined in said step (f1).
-
-
2. The method according to claim 1, wherein said step (f1) further comprises the steps of:
-
(f1i) forming a histogram of frequency versus inter-character spacing for each pair of adjacent characters within a portion of the text matter near the first position;
(f1ii) determining a best-fitting curve using a plurality of distinct Gaussian curves;
the best-fitting curve forming a best fit with a predetermined mode of the histogram formed in step (f1i); and
(f1iii) determining an estimate point on an inter-character spacing axis of the histogram at which the best-fitting curve satisfies a predetermined criterion.
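The histogram and curve-fitting steps above can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: the candidate Gaussians are all centred on the most frequent spacing and differ only in width, the fit is a least-squares comparison over bins near that mode, and the "predetermined criterion" is assumed to be the point where the fitted curve falls to a fixed fraction (`cutoff`) of its peak. The names `word_separation_threshold`, `sigmas`, `window` and `cutoff` are hypothetical.

```python
import math
from collections import Counter

def word_separation_threshold(gaps, sigmas=(0.5, 1.0, 1.5, 2.0, 3.0),
                              window=3, cutoff=0.01):
    """Estimate a spacing above which a gap is treated as a word break.

    gaps: integer pixel gaps between adjacent characters near the first
          position (measured elsewhere; assumed given).
    """
    # Histogram of frequency versus inter-character spacing.
    hist = Counter(gaps)
    # Dominant mode: taken here to be the typical within-word gap.
    mode, peak = max(hist.items(), key=lambda kv: kv[1])

    def gauss(x, sigma):
        return peak * math.exp(-((x - mode) ** 2) / (2 * sigma ** 2))

    def squared_error(sigma):
        # Compare the candidate curve with the histogram near the mode only.
        bins = range(mode - window, mode + window + 1)
        return sum((hist.get(x, 0) - gauss(x, sigma)) ** 2 for x in bins)

    # Pick the best-fitting curve among several distinct Gaussians.
    best_sigma = min(sigmas, key=squared_error)
    # Assumed criterion: spacing at which the curve drops to cutoff * peak.
    return mode + best_sigma * math.sqrt(2 * math.log(1 / cutoff))
```

Gaps larger than the returned value would then be treated as separating words, which is what the width of a word-level selection block depends on.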
-
-
3. The method according to claim 2, wherein said step (f1ii) determines the estimate point using the equation given by:
-
4. The method according to claim 1, wherein said step (f) further comprises the steps of:
-
(f3) determining a line spacing (Sl) between adjacent lines of the text matter; and
(f4) determining dimensions of the selection block in a direction perpendicular to the flow of the text matter as a function of the line spacing (Sl).
-
-
5. The method according to claim 4, wherein said step (f) further comprises the steps of:
-
(f5) determining a first horizontal limit and a second horizontal limit of the text matter;
(f6) determining whether the first position and the second position are on different lines of the text matter; and
responsive to said step (f6) determining the first position and the second position are on different lines of the text matter, performing the steps of:
(f2i) for an upper portion of the selection block, overlaying the text matter between the first position and the first horizontal limit of the text matter;
(f2ii) for a lower portion of the selection block, overlaying the text matter between the second position and the second horizontal limit of the text matter; and
(f2iii) for any portion of the selection block interposed between the lower portion of the selection block and the upper portion of the selection block, overlaying the text matter between the first horizontal limit and the second horizontal limit.
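The multi-line case in claim 5 can be sketched as below. The example works in deskewed text coordinates, identifies each position by a line index and an offset along the text flow, and assumes left-to-right text, so that the upper portion runs to the right-hand limit and the lower portion starts at the left-hand limit. That mapping of the "first" and "second" horizontal limits, and the name `selection_spans`, are assumptions of this sketch.

```python
def selection_spans(first, second, left_limit, right_limit):
    """Return (line_index, x_start, x_end) spans covering the selection.

    first, second: (line_index, x) for the first and second positions,
                   with x measured along the direction of text flow.
    left_limit, right_limit: horizontal limits of the text matter.
    """
    (l1, x1), (l2, x2) = first, second
    if l1 == l2:
        # Same line: a single span between the two positions.
        return [(l1, min(x1, x2), max(x1, x2))]
    if l1 > l2:
        # Normalise so that the first position is on the upper line.
        (l1, x1), (l2, x2) = (l2, x2), (l1, x1)
    # Upper portion: from the first position to the right-hand limit.
    spans = [(l1, x1, right_limit)]
    # Interposed lines: the full width between the two limits.
    spans += [(line, left_limit, right_limit) for line in range(l1 + 1, l2)]
    # Lower portion: from the left-hand limit to the second position.
    spans.append((l2, left_limit, x2))
    return spans
```

Each span's height would come from the line spacing (Sl) of claim 4, and the spans would be rotated back by the skew angle before being drawn over the frozen image.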
-
-
6. An apparatus for selecting text and image data from documents, comprising:
-
an image capture device having a field of view;
a memory for storing instructions, and the text and the image data from the documents;
a processor for communicating with said image capture device and said memory;
said processor for executing the instructions stored in said memory;
a display, controlled by said processor, for displaying an image captured by said image capture device;
the displayed image containing text matter being defined by one of grayscale image data and color image data; and
a user input device, coupled to said processor, for receiving a first user input and a second user input;
the first user input defining both a start of a selection and a first position within the displayed image;
the second user input defining both an end of the selection and a second position within the displayed image;
wherein the instructions stored in said memory further comprise:
means for freezing the displayed image to define a frozen displayed image;
means for determining a skew angle θ of the text matter with respect to the field of view of said image capture device; and
means for determining, using the skew angle θ, a selection element;
the selection element having a position, a shape, and dimensions that are, at least, dependent upon the first position; and
means for superimposing the selection element on the frozen displayed image displayed on said display;
wherein the instructions stored in said memory further comprise:
means for selecting a word of the text matter within the displayed image;
said selecting means determining the selection element to include a selection block overlaying the word;
means for determining a word separation value (Sw)min from measured values of separation between adjacent pairs of characters in the text matter; and
means for determining dimensions of the selection block in a direction of flow of the text matter as a function of the word separation value (Sw)min.
-
-
7. The apparatus according to claim 6, wherein the instructions stored in said memory further comprise:
-
means for forming a histogram of frequency versus inter-character spacing for each pair of adjacent characters within a portion of the text matter near the first position;
means for determining a best-fitting curve using a plurality of distinct Gaussian curves;
the best-fitting curve forming a best fit with a predetermined mode of the histogram; and
means for determining an estimate point on an inter-character spacing axis of the histogram at which the best-fitting curve satisfies a predetermined criterion.
-
-
8. The apparatus according to claim 7, wherein the estimate point is determined using the equation given by:
-
9. The apparatus according to claim 7, wherein the instructions stored in said memory further comprise:
-
means for determining a line spacing (Sl) between adjacent lines of the text matter; and
means for determining dimensions of the selection block in a direction perpendicular to the flow of the text matter as a function of the line spacing (Sl).
-
-
10. A method for selecting text and image data from documents with an image processing system in which images of documents are captured by an image capture device having a field of view, comprising the steps of:
-
(a) displaying an image captured by the image capture device;
the displayed image containing text matter being defined by one of grayscale image data and color image data;
(b) receiving a first user input defining both a start of a selection and a first position within the displayed image;
(c) responsive to said step (b), freezing the displayed image to define a frozen displayed image;
(d) determining a skew angle θ of the text matter with respect to the field of view of the image capture device;
(e) receiving a second user input defining both an end of the selection and a second position within the displayed image;
the second user input defining a final user input;
(f) determining, using the skew angle θ determined in said step (d), a selection element;
the selection element having a position, a shape, and dimensions that are, at least, dependent upon the first position;
(g) displaying the selection element superimposed on the frozen displayed image; and
(h) receiving a third user input after receiving the second user input;
the third user input indicating selection of a sentence containing a word identified by the first user input and the second user input.
-
-
11. An apparatus for selecting text and image data from documents, comprising:
-
an image capture device having a field of view;
a memory for storing instructions, and the text and the image data from the documents;
a processor for communicating with said image capture device and said memory;
said processor for executing the instructions stored in said memory;
a display, controlled by said processor, for displaying an image captured by said image capture device;
the displayed image containing text matter being defined by one of grayscale image data and color image data; and
a user input device, coupled to said processor, for receiving a first user input and a second user input;
the first user input defining both a start of a selection and a first position within the displayed image;
the second user input defining both an end of the selection and a second position within the displayed image;
wherein the instructions stored in said memory further comprise:
means for freezing the displayed image to define a frozen displayed image;
means for determining a skew angle θ of the text matter with respect to the field of view of said image capture device; and
means for determining, using the skew angle θ, a selection element;
the selection element having a position, a shape, and dimensions that are, at least, dependent upon the first position; and
means for superimposing the selection element on the frozen displayed image displayed on said display;
wherein said user input device receives a third user input after receiving the second user input;
the third user input indicating selection of a sentence containing a word identified by the first user input and the second user input.
-
-
12. A method for selecting text and image data from documents with an image processing system in which images of documents are captured by an image capture device having a field of view, comprising the steps of:
-
(a) displaying an image captured by the image capture device;
the displayed image containing text matter being defined by one of grayscale image data and color image data;
(b) receiving a first user input defining both a start of a selection and a first position within the displayed image;
(c) responsive to said step (b), freezing the displayed image to define a frozen displayed image;
(d) determining a skew angle θ of the text matter with respect to the field of view of the image capture device;
(e) receiving a second user input defining both an end of the selection and a second position within the displayed image;
the second user input defining a final user input;
(f) determining, using the skew angle θ determined in said step (d), a selection element;
the selection element having a position, a shape, and dimensions that are, at least, dependent upon the first position;
(g) displaying the selection element superimposed on the frozen displayed image;
(h) extracting image data defining the selection element from the image; and
(i) rotating the extracted image data through a negative angle of the determined skew angle θ.
- View Dependent Claims (13, 14, 15)
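The final rotation step of claim 12 can be sketched in plain Python. The example assumes the selected region has already been extracted as a 2-D list of pixel values, rotates it about its centre with nearest-neighbour sampling, and fills uncovered pixels with a background value. The names `rotate_image` and `deskew`, the choice of interpolation, and the sign convention (positive angles turn towards increasing y, which is downwards on a typical screen) are assumptions of this sketch.

```python
import math

def rotate_image(pixels, angle, fill=0):
    """Rotate a 2-D list of pixel values by `angle` radians about its centre."""
    h, w = len(pixels), len(pixels[0])
    cx, cy = (w - 1) / 2, (h - 1) / 2
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Inverse mapping: find the source pixel that lands on (x, y).
            dx, dy = x - cx, y - cy
            sx = cx + dx * cos_a + dy * sin_a
            sy = cy - dx * sin_a + dy * cos_a
            ix, iy = round(sx), round(sy)
            if 0 <= ix < w and 0 <= iy < h:
                out[y][x] = pixels[iy][ix]
    return out

def deskew(pixels, theta):
    """Undo a skew of theta by rotating through the negative angle."""
    return rotate_image(pixels, -theta)
```

A production system would more likely use an image library with bilinear or bicubic interpolation and an enlarged output canvas; the nearest-neighbour loop here only shows the geometry of rotating through -θ.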
-
-
16. An apparatus for selecting text and image data from documents, comprising:
-
an image capture device having a field of view;
a memory for storing instructions, and the text and the image data from the documents;
a processor for communicating with said image capture device and said memory;
said processor for executing the instructions stored in said memory;
a display, controlled by said processor, for displaying an image captured by said image capture device;
the displayed image containing text matter being defined by one of grayscale image data and color image data; and
a user input device, coupled to said processor, for receiving a first user input and a second user input;
the first user input defining both a start of a selection and a first position within the displayed image;
the second user input defining both an end of the selection and a second position within the displayed image;
wherein the instructions stored in said memory further comprise:
means for freezing the displayed image to define a frozen displayed image;
means for determining a skew angle θ of the text matter with respect to the field of view of said image capture device; and
means for determining, using the skew angle θ, a selection element;
the selection element having a position, a shape, and dimensions that are, at least, dependent upon the first position;
means for superimposing the selection element on the frozen displayed image displayed on said display;
means for extracting image data defining the selection element from the image; and
means for rotating the extracted image data through a negative angle of the determined skew angle θ.
-
Specification