Zone segmentation for image display

US 6,195,459 B1
Filed: 12/17/1996
Issued: 02/27/2001
Est. Priority Date: 12/21/1995
Status: Expired due to Fees

Abstract

The present invention provides a method and apparatus for detecting text-like portions and non-text-like portions in an image. The method includes the steps of establishing a set of fuzzy detection rules for distinguishing text-like portions of an image from non-text-like portions of an image, dividing the test image into a plurality of segments, and applying the set of fuzzy detection rules to each segment of the test image to thereby classify each segment as text-like or non-text-like. Preferably, the establishing step includes the sub-steps of identifying a plurality of image features that distinguish different portions of an image, generating a plurality of fuzzy detection rules by applying different combinations of the features to a text-like learning image and to a non-text-like learning image, minimizing the rules to exclude those rules not supported by a predetermined amount of the learning images, and allocating the non-excluded rules to the set. Optionally, the generating sub-step includes the sub-steps of normalising each image feature to have a value in the range 0 to 1, partitioning each input feature space into a plurality of equally spaced regions, assigning each input feature to a label of one of the regions to maximize a membership value of the label in the one region, and selecting for each region the maximized label for each feature to thus form a respective fuzzy rule.

66 Claims

1. A method for classifying segments of a digital image into text-like portions and non-text-like portions, said method comprising the steps of:
- (a) establishing a set of fuzzy detection rules for distinguishing text-like portions of said image from said non-text-like portions of said image;
  
  said establishing step including the steps of:
  
  (aa) identifying a plurality of image features that distinguish different portions of an image;
  
  (ab) generating a plurality of fuzzy detection rules by applying different combinations of said features to a text-like learning image and to a non-text-like learning image;
  
  (b) dividing the image into a plurality of segments; and
  
  (c) applying said set of fuzzy detection rules to each segment of said image to thereby classify each said segment as being one of a text-like portion and a non-text-like portion.
2. The method as recited in claim 1, wherein said establishing step (a) comprises the further step of:
3. The method as recited in claim 1, wherein said generating step (ab) comprises the sub-steps of:
- (aba) normalising each image feature to have a value in the range 0 to 1;
  
  (abb) partitioning each input feature space into a plurality of equally spaced regions;
  
  (abc) assigning each input feature to a label of one of said regions to maximize a membership value of said label in said one region;
  
  (abd) selecting for each said region the maximized label for each said feature to thus form a respective fuzzy rule.
4. The method as recited in claim 3, wherein adjacent ones of said equally spaced regions overlap.
5. The method as recited in claim 3, wherein each said fuzzy rule comprises a logical ANDed combination of said image features.
6. The method as recited in claim 3, wherein step (abd) comprises determining an output value $O_{p}$ for a pth input pattern:
- $O_{p} = \frac{\sum_{i = 1}^{K} D_{p}^{i} O^{i}}{\sum_{i = 1}^{K} D_{p}^{i}}$ where K is the number of rules, $O^{i}$ is the class generated by rule i, and $D_{p}^{i}$ measures how the pth pattern fits an IF condition of the ith rule, wherein $D_{p}^{i}$ is given by the product of membership values of the feature vector for the labels used in the ith rule, such that $D_{p}^{i} = \prod_{j = 1}^{n} m_{ji}$ where n is the number of features, and $m_{ji}$ is the membership value of feature j for the labels that the ith rule uses.
7. The method as recited in claim 3, wherein said regions correspond to said segments of said test image.
8. The method as recited in claim 1, wherein said image features comprise spatial domain features.
9. A method as recited in claim 8, wherein said image features are selected from the group consisting of:
- (i) mean gray level in a region;
  
  (ii) gray-level variance (or standard deviation) in a region;
  
  (iii) absolute value of the gradient;
  
  (iv) mean absolute value of the non-zero gradient in a region;
  
  (v) maximum absolute value of the non-zero gradient in a region;
  
  (vi) standard deviation of the absolute value of the non-zero gradient in a region;
  
  (vii) absolute value of local contrast;
  
  (viii) mean of the absolute value of non-zero local contrast;
  
  (ix) maximum absolute value of the non-zero local contrast in a region;
  
  (x) standard deviation of the absolute value of the non-zero contrast in a region;
  
  (xi) contrast of a darker pixel against its background;
  
  (xii) dominant local orientation;
  
  (xiii) number of gray levels within a region;
  
  (xiv) number of pixels in the block with maximum gray level in a region;
  
  (xv) number of pixels in the block with gray level larger than mean gray level in a region;
  
  (xvi) number of pixels in block with gray level smaller than mean gray level in a region;
  
  (xvii) directional gradients;
  
  (xviii) transform domain features; and
  
  (xix) x,y direction projections.
10. The method as recited in claim 1, wherein said image features are dependent upon frequency characteristic information of a portion of said image contained in each segment.
11. The method as recited in claim 10, wherein said image features comprise energy features obtained by decomposing said each segment.
12. The method as recited in claim 11, wherein decomposing said each segment is carried out by applying a wavelet transformation at least once to said each segment.
13. The method as recited in claim 1, wherein said segments form a regular array over said image and adjacent ones of segments overlap.
14. The method as recited in claim 1, wherein said segments comprise blocks and are sized in the range of 4×4 pixels to 32×32 pixels, and are preferably 9×9 pixels.
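
The rule-generation sub-steps of claims 3 to 5 can be illustrated with a minimal Python sketch. It assumes triangular membership functions centred on equally spaced points (the claims do not fix the membership shape), five regions per feature, and known minimum and maximum values for each feature. The names `membership`, `best_label` and `make_rule` are hypothetical and are not taken from the patent.

```python
from typing import Sequence, Tuple

Rule = Tuple[Tuple[int, ...], int]  # (one label per feature, output class)


def membership(x: float, label: int, n_labels: int = 5) -> float:
    """Triangular membership of x (already in [0, 1]) in one region.

    Region centres are equally spaced; each triangle reaches the centres
    of its neighbours, so adjacent regions overlap (assumed shape).
    """
    step = 1.0 / (n_labels - 1)
    centre = label * step
    return max(0.0, 1.0 - abs(x - centre) / step)


def best_label(x: float, n_labels: int = 5) -> int:
    """Pick the region label whose membership value for x is largest."""
    return max(range(n_labels), key=lambda k: membership(x, k, n_labels))


def make_rule(features: Sequence[float],
              ranges: Sequence[Tuple[float, float]],
              target: int,
              n_labels: int = 5) -> Rule:
    """Form one rule: IF f1 is L1 AND f2 is L2 ... THEN class is target."""
    labels = []
    for value, (lo, hi) in zip(features, ranges):
        # Normalise to [0, 1]; assumes hi > lo for every feature.
        x = min(1.0, max(0.0, (value - lo) / (hi - lo)))
        labels.append(best_label(x, n_labels))
    return tuple(labels), target


# One training block with two features, taken from a text-like image (class 1).
print(make_rule([128, 10], [(0, 255), (0, 100)], target=1))  # ((2, 0), 1)
```

Each training block yields one candidate rule in this sketch; the tuple of labels plays the role of the ANDed IF condition of claim 5.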

15. An apparatus for classifying segments of a digital image into text-like portions and non-text-like portions, said apparatus comprising:
- (a) means for establishing a set of fuzzy detection rules for distinguishing text-like portions of an image from non-text-like portions of an image;
  
  said establishing means comprising:
  
  means for identifying a plurality of image features that distinguish different portions of an image;
  
  means for generating a plurality of fuzzy detection rules by applying different combinations of said features to a text-like learning image and to a non-text-like learning image;
  
  (b) means for dividing a test image into a plurality of segments; and
  
  (c) means for applying said set of fuzzy detection rules to each segment of said test image to thereby classify each said segment as being one of a text-like portion and a non-text-like portion.
16. The apparatus as recited in claim 15, wherein said establishing means further comprises:
17. The apparatus as recited in claim 15, wherein said generating means further comprises:
- means for normalising each image feature to have a value in the range 0 to 1;
  
  means for partitioning each input feature space into a plurality of equally spaced regions;
  
  means for assigning each input feature to a label of one of said regions to maximize a membership value of said label in said one region;
  
  means for selecting for each said region the maximized label for each said feature to thus form a respective fuzzy rule.
18. The apparatus as recited in claim 17, wherein adjacent ones of said equally spaced regions overlap.
19. The apparatus as recited in claim 17, wherein each said fuzzy rule comprises a logical ANDed combination of said image features.
20. The apparatus as recited in claim 17, wherein said selecting means comprises means for determining an output value $O_{p}$ for a pth input pattern:
- $O_{p} = \frac{\sum_{i = 1}^{K} D_{p}^{i} O^{i}}{\sum_{i = 1}^{K} D_{p}^{i}}$ where K is the number of rules, $O^{i}$ is the class generated by rule i, and $D_{p}^{i}$ measures how the pth pattern fits an IF condition of the ith rule, wherein $D_{p}^{i}$ is given by the product of membership values of the feature vector for the labels used in the ith rule, such that $D_{p}^{i} = \prod_{j = 1}^{n} m_{ji}$ where n is the number of features, and $m_{ji}$ is the membership value of feature j for the labels that the ith rule uses.
21. The apparatus as recited in claim 15, wherein said image features comprise spatial domain features.
22. The apparatus as recited in claim 21, wherein said image features are selected from the group consisting of:
- (i) mean gray level in a region;
  
  (ii) gray-level variance (or standard deviation) in a region;
  
  (iii) absolute value of the gradient;
  
  (iv) mean absolute value of the non-zero gradient in a region;
  
  (v) maximum absolute value of the non-zero gradient in a region;
  
  (vi) standard deviation of the absolute value of the non-zero gradient in a region;
  
  (vii) absolute value of local contrast;
  
  (viii) mean of the absolute value of non-zero local contrast;
  
  (ix) maximum absolute value of the non-zero local contrast in a region;
  
  (x) standard deviation of the absolute value of the non-zero contrast in a region;
  
  (xi) contrast of a darker pixel against its background;
  
  (xii) dominant local orientation;
  
  (xiii) number of gray levels within a region;
  
  (xiv) number of pixels in the block with maximum gray level in a region;
  
  (xv) number of pixels in the block with gray level larger than mean gray level in a region;
  
  (xvi) number of pixels in block with gray level smaller than mean gray level in a region;
  
  (xvii) directional gradients;
  
  (xviii) transform domain features; and
  
  (xix) x,y direction projections.
23. The apparatus as recited in claim 15, wherein said image features are dependent on frequency characteristic information of a portion of said image contained in each segment.
24. The apparatus as recited in claim 23, wherein said image features comprise energy features obtained by decomposing said each segment.
25. The apparatus as recited in claim 24, wherein decomposing said each segment is carried out by applying a wavelet transformation at least once to said each segment.
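
The output value defined in claim 20 (and in the parallel claim 6) is a weighted average of rule classes, with each rule weighted by the product of its membership values. A small Python sketch of that arithmetic follows; the function name `fuzzy_output` and the example numbers are illustrative assumptions only.

```python
from math import prod
from typing import Sequence


def fuzzy_output(memberships: Sequence[Sequence[float]],
                 classes: Sequence[float]) -> float:
    """Compute O_p = sum_i(D_p^i * O^i) / sum_i(D_p^i).

    memberships[i][j] is m_ji, the membership value of feature j for the
    label used by rule i; classes[i] is O^i, the class of rule i.
    """
    # D_p^i: how well pattern p fits the IF part of rule i.
    degrees = [prod(m) for m in memberships]
    total = sum(degrees)
    if total == 0:
        raise ValueError("no rule fires for this pattern")
    return sum(d * o for d, o in zip(degrees, classes)) / total


# Two rules over two features: rule 0 says class 1 (text-like),
# rule 1 says class 0 (non-text-like).
o_p = fuzzy_output([[0.5, 0.5], [1.0, 0.5]], [1.0, 0.0])
print(round(o_p, 4))  # 0.3333
```

In this example the second rule fits the pattern twice as well as the first (0.5 against 0.25), so the output sits closer to class 0. How the continuous output is thresholded into a final class is not stated in the claims and is left out of the sketch.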

26. A method for classifying segments of a digital image for display on display means, wherein said digital image is processed as a plurality of blocks each having a predetermined number of pixels, said method comprising the steps of:
- extracting a set of features from each block to generate a feature vector for said block; and
  
  classifying said block using a set of fuzzy rules as either a text-type image or a natural-type image dependent on said feature vector for said block, said rules being generated by applying different combinations of said features to a text-like learning image and to a non-text-like learning image.
27. The method according to claim 26, wherein said set of features comprises spatial domain features extracted from pixel values of each block.
28. The method according to claim 27, further comprising the step of:
29. The method according to claim 28, further comprising, to generate said fuzzy rules using training image data, the steps of:
- extracting said N features from each block of said training image data;
  
  assigning a respective label to each of said N features dependent upon the value of said each of said N features;
  
  determining Q fuzzy rules dependent on labels of said N possible features, wherein each of said Q fuzzy rules has a corresponding amount of support based on said blocks of said training image data;
  
  selecting P fuzzy rules of said Q possible fuzzy rules as said set of fuzzy rules, where P and Q are integers with P≦M, dependent upon the corresponding amount of support of each of said P fuzzy rules exceeding a predetermined threshold value.
30. The method according to claim 26, wherein said set of features comprise energy measure features extracted from coefficients in a region of interest for each block.
31. The method according to claim 30, wherein said coefficients are obtained by wavelet transforming each block at least once.
32. The method according to claim 31, further comprising the step of tile integrating classified blocks so as to reduce the number of misclassified blocks.
33. The method according to claim 30, wherein said energy measure features comprise the variance of said coefficients over said region of interest for each block.
34. The method according to claim 33, wherein energy measure features are derived based on two or more scales of resolution of said coefficients in said region of interest.
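
Claims 30 to 33 describe energy features taken as the variance of wavelet coefficients over a region of each block. The sketch below shows one way this could look in Python. It assumes a single-level, unnormalised Haar transform (the claims do not name a wavelet), even block dimensions, and the three detail sub-bands as the regions of interest. All function names and dictionary keys are hypothetical.

```python
from statistics import pvariance
from typing import Dict, List, Sequence


def haar_1d(row: Sequence[float]) -> List[float]:
    """One Haar step: pairwise averages followed by pairwise differences."""
    half = len(row) // 2
    avg = [(row[2 * i] + row[2 * i + 1]) / 2 for i in range(half)]
    diff = [(row[2 * i] - row[2 * i + 1]) / 2 for i in range(half)]
    return avg + diff


def haar_2d(block: Sequence[Sequence[float]]) -> List[List[float]]:
    """Apply the 1-D step to every row, then to every column."""
    rows = [haar_1d(r) for r in block]
    cols = [haar_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]  # transpose back to row-major


def energy_features(block: Sequence[Sequence[float]]) -> Dict[str, float]:
    """Variance of the coefficients in each detail sub-band."""
    coeffs = haar_2d(block)
    h, w = len(coeffs) // 2, len(coeffs[0]) // 2
    bands = {
        # differences taken along rows (top-right quadrant)
        "row_detail": [coeffs[r][c] for r in range(h) for c in range(w, 2 * w)],
        # differences taken along columns (bottom-left quadrant)
        "col_detail": [coeffs[r][c] for r in range(h, 2 * h) for c in range(w)],
        # differences in both directions (bottom-right quadrant)
        "diag_detail": [coeffs[r][c] for r in range(h, 2 * h) for c in range(w, 2 * w)],
    }
    return {name: pvariance(values) for name, values in bands.items()}


# A 4x4 block with a single bright vertical stroke.
stroke = [[0, 255, 0, 0] for _ in range(4)]
print(energy_features(stroke))
```

A multi-scale variant in the spirit of claim 34 could reapply `haar_2d` to the top-left (average) quadrant and collect the variances from each level; that extension is not shown here.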

35. An apparatus for classifying segments of a digital image for display on display means, wherein said digital image is processed as a plurality of blocks each having a predetermined number of pixels, said apparatus comprising the steps of:
- means for extracting a set of features from each block to generate a feature vector for said block;
  
  means for classifying said block using a set of fuzzy rules as either a text-type image or a natural-type image dependent on said feature vector for said block, said rules being generated by applying different combinations of said features to a text-like learning image and to a non-text-like learning image.
36. The apparatus according to claim 35, wherein said set of features comprises spatial domain features extracted from pixel values of each block.
37. The apparatus according to claim 36, further comprising means for selecting N features of M possible features, where N and M are integers with N≦M.
38. The apparatus according to claim 37, further comprising means for generating said fuzzy rules using training image data, wherein said generating means comprises:
39. The apparatus according to claim 35, wherein said set of features comprise energy measure features extracted from coefficients in a region of interest for each block.
40. The apparatus according to claim 39, wherein said coefficients are obtained by wavelet transforming each block at least once.
41. The apparatus according to claim 40, further comprising means for tile integrating classified blocks so as to reduce the number of misclassified blocks.
42. The apparatus according to claim 39, wherein said energy measure features comprise the variance of said coefficients over said region of interest for each block.
43. The apparatus according to claim 42, wherein said energy measure features are derived based on two or more scales of resolution of said coefficients in said region of interest.
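
Claim 41 recites tile integration of classified blocks without defining it in the claim text. One plausible reading, assumed here purely for illustration, is a majority vote over small groups (tiles) of neighbouring block labels, so that an isolated misclassified block takes the label of its neighbours. The function name `tile_integrate`, the tile size and the tie-breaking choice are all assumptions.

```python
from typing import List, Sequence


def tile_integrate(labels: Sequence[Sequence[int]], tile: int = 2) -> List[List[int]]:
    """Replace each block label (1 = text, 0 = natural) by its tile's majority.

    Ties are resolved to 0; this is an arbitrary choice for the sketch.
    """
    rows, cols = len(labels), len(labels[0])
    out = [list(row) for row in labels]
    for r0 in range(0, rows, tile):
        for c0 in range(0, cols, tile):
            # Tiles at the right and bottom edges may be smaller.
            cells = [(r, c)
                     for r in range(r0, min(r0 + tile, rows))
                     for c in range(c0, min(c0 + tile, cols))]
            votes = sum(labels[r][c] for r, c in cells)
            majority = 1 if 2 * votes > len(cells) else 0
            for r, c in cells:
                out[r][c] = majority
    return out


block_labels = [[1, 1, 0, 0],
                [1, 0, 0, 0]]
print(tile_integrate(block_labels))  # [[1, 1, 0, 0], [1, 1, 0, 0]]
```

The single 0 inside the left tile is outvoted three to one and becomes 1, which is the kind of correction the claim appears to target.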

44. A computer program product including a computer readable medium having recorded thereon a computer program for detecting in an image text-like portions and non-text-like portions, the computer program comprising:
- (a) establishment steps for establishing a set of fuzzy detection rules for distinguishing text-like portions of said image from said non-text-like portions of said image;
  
  said establishing steps comprising:
  
  (aa) identifying a plurality of image features that distinguish different portions of an image;
  
  (ab) generating a plurality of fuzzy detection rules by applying different combinations of said features to a text-like learning image and to a non-text-like learning image;
  
  (b) dividing steps for dividing the image into a plurality of segments; and
  
  (c) application steps for applying said set of fuzzy detection rules to each segment of said image to thereby classify each said segment as being one of a text-like portion and a non-text-like portion.
45. The computer program product as recited in claim 44, wherein said establishing step (a) comprises the further step of:
46. The computer program product as recited in claim 45, wherein said image features comprise spatial domain features.
47. A computer program product as recited in claim 45, wherein said image features are selected from the group consisting of:
- (i) mean gray level in a region;
  
  (ii) gray-level variance (or standard deviation) in a region;
  
  (iii) absolute value of the gradient;
  
  (iv) mean absolute value of the non-zero gradient in a region;
  
  (v) maximum absolute value of the non-zero gradient in a region;
  
  (vi) standard deviation of the absolute value of the non-zero gradient in a region;
  
  (vii) absolute value of local contrast;
  
  (viii) mean of the absolute value of non-zero local contrast;
  
  (ix) maximum absolute value of the non-zero local contrast in a region;
  
  (x) standard deviation of the absolute value of the non-zero contrast in a region;
  
  (xi) contrast of a darker pixel against its background;
  
  (xii) dominant local orientation;
  
  (xiii) number of gray levels within a region;
  
  (xiv) number of pixels in the block with maximum gray level in a region;
  
  (xv) number of pixels in the block with gray level larger than mean gray level in a region;
  
  (xvi) number of pixels in block with gray level smaller than mean gray level in a region;
  
  (xvii) directional gradients;
  
  (xviii) transform domain features; and
  
  (xix) x, y direction projections.
48. The computer program product as recited in claim 44, wherein said generating step (ab) comprises the sub-steps of:
- (aba) normalizing each image feature to have a value in the range 0 to 1;
  
  (abb) partitioning each input feature space into a plurality of equally spaced regions;
  
  (abc) assigning each input feature to a label of one of said regions to maximize a membership value of said label in said one region; and
  
  (abd) selecting for each said region the maximized label for each said feature to thus form a respective fuzzy rule.
49. The computer program product as recited in claim 48, wherein adjacent ones of said equally spaced regions overlap.
50. The computer program product as recited in claim 48, wherein each said fuzzy rule comprises a logical ANDed combination of said image features.
51. The computer program product as recited in claim 48, wherein step (abd) comprises determining an output value O_pfor a pth input pattern:
- $O_{p} = \frac{\sum_{i = 1}^{K} D_{p}^{i} O^{i}}{\sum_{i = 1}^{K} D_{p}^{i}}$ where K is the number of rules, Oⁱis the class generated by rule i, and Dⁱ_pmeasures how the pth pattern fits an IF condition of the ith rule, wherein Dⁱ_pis given by the product of membership values of the feature vector for the labels used in the ith rule, such that, $D_{p}^{i} = \prod_{j = 1}^{n} m_{ji}$ where n is the number of features, and m_jiis the membership value of feature j for the labels that the ith rule uses.
52. The computer program product as recited in claim 48, wherein said regions correspond to said segments of said test image.
53. The computer program product as recited in claim 44, wherein said image features are dependent upon frequency characteristic information of a portion of said image contained in each segment.
54. The computer program product as recited in claim 53, wherein said image features comprise energy features obtained by decomposing said each segment.
55. The computer program product as recited in claim 54, wherein decomposing said each segment is carried out by applying a wavelet transformation at least once to said each segment.
56. The computer program product as recited in claim 48, wherein said segments form a regular array over said image and adjacent ones of segments overlap.
57. The computer program product as recited in claim 48, wherein said segments comprise blocks and are sized in the range of 4×4 pixels to 32×32 pixels, and preferably 9×9 pixels.
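
The block division of claims 56 and 57 and a few of the spatial domain features listed in claim 47 can be sketched together in Python. The example assumes a grayscale image stored as a list of rows, the preferred 9×9 block size, and a stride parameter that produces overlapping blocks when it is smaller than the block size. Only features (i), (ii), (xiii) and (xv) are computed; the function names and dictionary keys are hypothetical.

```python
from statistics import fmean, pvariance
from typing import Dict, List, Sequence, Tuple

Block = List[List[int]]


def split_into_blocks(image: Sequence[Sequence[int]],
                      size: int = 9,
                      stride: int = 9) -> List[Tuple[Tuple[int, int], Block]]:
    """Cut the image into size x size blocks on a regular grid.

    A stride smaller than size makes adjacent blocks overlap.
    Partial blocks at the right and bottom edges are skipped.
    """
    rows, cols = len(image), len(image[0])
    blocks = []
    for top in range(0, rows - size + 1, stride):
        for left in range(0, cols - size + 1, stride):
            block = [list(row[left:left + size]) for row in image[top:top + size]]
            blocks.append(((top, left), block))
    return blocks


def spatial_features(block: Block) -> Dict[str, float]:
    """A small subset of the spatial domain features for one block."""
    pixels = [p for row in block for p in row]
    mean = fmean(pixels)
    return {
        "mean_gray": mean,                                   # feature (i)
        "gray_variance": pvariance(pixels),                  # feature (ii)
        "gray_level_count": len(set(pixels)),                # feature (xiii)
        "pixels_above_mean": sum(p > mean for p in pixels),  # feature (xv)
    }


# An 18x18 checkerboard of black (0) and white (255) pixels.
image = [[((r + c) % 2) * 255 for c in range(18)] for r in range(18)]
blocks = split_into_blocks(image, size=9, stride=3)
print(len(blocks), spatial_features(blocks[0][1])["gray_level_count"])  # 16 2
```

With a stride of 9 the same image gives four non-overlapping blocks; the feature dictionary of each block would then feed the fuzzy rules as its feature vector.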

58. A computer program product including a computer readable medium having recorded thereon a computer program for zone segmenting a digital image for display on display means, wherein said digital image is processed as a plurality of blocks each having a predetermined number of pixels, said computer program comprising:
- extracting steps for extracting a set of features from each block to generate a feature vector of said block; and
  
  classifying steps for classifying said block using a set of fuzzy rules as either a text-type image or a natural-type image dependent on said feature vector for said block, said rules being generated by applying different combinations of said features to a text-like learning image and to a non-text-like learning image.
59. The computer program product according to claim 58, wherein said set of features comprises spatial domain features extracted from pixel values of each block.
60. The computer program product according to claim 59, further comprising the step of:
61. The computer program product according to claim 60, further comprising, to generate said fuzzy rules using training image data:
- extracting steps for extracting said N features from each block of said training image data;
  
  assigning steps for assigning a respective label to each of said N features dependent upon the value of each of said N features;
  
  determining steps for determining Q fuzzy rules dependent on labels of said N possible features, wherein each of said Q fuzzy rules has a corresponding amount of support based on said blocks of said training image data; and
  
  selecting steps for selecting P fuzzy rules of said Q possible fuzzy rules as said set of fuzzy rules, where P and Q are integers with P≦M, dependent upon the corresponding amount of support of each of said fuzzy rules exceeding a predetermined threshold value.
62. The computer program product according to claim 58, wherein said set of features comprise energy measure features extracted from coefficients in a region of interest for each block.
63. The computer program product according to claim 62, wherein said coefficients are obtained by wavelet transforming each block at least once.
64. The computer program product according to claim 63, further comprising the step of tile integrating classified blocks so as to reduce the number of misclassified blocks.
65. The computer program product according to claim 62, wherein said energy measure features comprise the variance of said coefficients over said region of interest for each block.
66. The computer program product according to claim 65, wherein energy features are derived based on two or more scales of resolution of said coefficients in said region of interest.
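
The rule selection recited in claim 61 keeps only those candidate rules whose support exceeds a threshold. A minimal Python sketch follows. It assumes that support is measured as the number of training blocks that produce the same rule, which the claim does not specify, and it represents a rule as a hypothetical `(labels, class)` tuple. The function name `select_rules` is likewise an assumption.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Rule = Tuple[Tuple[int, ...], int]  # (one label per feature, output class)


def select_rules(candidates: Iterable[Rule], threshold: int) -> Dict[Rule, int]:
    """Keep the rules whose support (occurrence count) exceeds threshold."""
    support = Counter(candidates)  # Q distinct rules with their support
    return {rule: n for rule, n in support.items() if n > threshold}


# Candidate rules generated from six training blocks.
candidates = [
    ((2, 0), 1), ((2, 0), 1), ((2, 0), 1),  # seen three times
    ((1, 1), 0),                            # seen once
    ((0, 4), 0), ((0, 4), 0),               # seen twice
]
print(select_rules(candidates, threshold=1))
# {((2, 0), 1): 3, ((0, 4), 0): 2}
```

Here Q is 3 distinct candidate rules and P is the 2 rules that survive the threshold. Resolving rules that share an IF part but disagree on the class is outside the scope of this sketch.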

Current Assignee: Canon Kabushiki Kaisha (Canon Inc.)
Original Assignee: Canon Kabushiki Kaisha (Canon Inc.)
Inventors: Zhu, Julie Yan
Primary Examiner(s): Tran, Phuoc
Application Number: US08/767,804
Time in Patent Office: 1,533 Days
Field of Search: 382/176, 382/155, 382/161, 382/157, 382/173
US Class Current: 382/176
CPC Class Codes:
G06T 2207/10016   Video; Image sequence
G06T 2207/20021   Dividing image into blocks,...
G06T 2207/20064   Wavelet transform [DWT]
G06T 2207/20081   Training; Learning
G06T 7/11   Region-based segmentation
G06T 7/143   involving probabilistic app...
G06T 7/168   involving transform domain ...
G06V 30/413   Classification of content, ...
