Passive and interactive real-time image recognition software method

US 20070292033A1
Filed: 06/19/2006
Published: 12/20/2007
Est. Priority Date: 06/19/2006
Status: Abandoned Application

First Claim

Patent Images

1. Whereas, the passive real-time image recognition method is described as follows:

Step 1;

Capture an image projected by an image projection apparatus to image areas as reference images (5×

5 grey-level value) by using a video camera;

Step 2;

Continuously capture real-time images (5×

5 grey-level value) projected by an image projection apparatus to image areas by using a video camera, and check if any foreign object touches the reactive area.The difference value between the reference image from step 1 and the real-time image from step 2 can be denoted as follows (1);

DIFF(x,y)=|REF(x,y)−

NEW(x,y)|

(1)Step 3;

Difference of the grey-level value of real-time image in step 2 and the grey-level value of reference image in step 1 to have the grey-level distribution of remaining images, which means foreign objects exist.Step 4;

The image which is subject to differencing through step 3 usually has noises, which can be present as in formula (2) $\begin{matrix} BIN (x, y) = {\begin{matrix} 255 & DIFF (x, y) \geq T^{*} \\ 0 & DIFF (x, y) < T^{*} \end{matrix} & (2) \end{matrix}$ The binarization method eliminates the noises;

in which, T* represents a threshold, in 8 bit grey-scale image and the threshold ranges from 0 to 255. The optimal threshold can be decided by a statistical method. The optimal threshold is on the wave trough of the grey-level value;

when T* is decided, the image can be segmented into two sections. The requirement for the optimal threshold T* is when the sum of variances in C₁and the variances in C₂has the minimum value. It is assumed that the size of the image is N=5×

5, and the grey-level value number of 8 bit grey-level image is I=256. Then the probability of grey-level value is I can be denoted as;

$\begin{matrix} P (i) = \frac{n_{i}}{N} & (3) \end{matrix}$ Wherein n_iindicates the appearance number of grey-level value I, and the range of I is 0≦

i≦

I−

1. According to the probability principle, the following can be obtained;

$\begin{matrix} \sum_{i = 0}^{I - 1} P (i) = 1 & (4) \end{matrix}$ Suppose the ratio of the pixel number in C₁is;

$\begin{matrix} W_{1} = \Pr (C_{1}) = \sum_{i = 0}^{T^{*}} P (i) & (5) \end{matrix}$ While the ratio of the pixel number in C₂is;

$\begin{matrix} W_{2} = \Pr (C_{2}) = \sum_{i = T^{*} + 1}^{I - 1} P (i) & (6) \end{matrix}$ Here W₁+W₂=1 can be satisfied. The expect value of C₁can be calculated as;

$\begin{matrix} U_{1} = \sum_{i = 0}^{T^{*}} \frac{P (i)}{W_{1}} \times i & (7) \end{matrix}$ The expect value of C₂is;

$\begin{matrix} U_{2} = \sum_{i = T^{*} + 1}^{I - 1} \frac{P (i)}{W_{2}} \times i & (8) \end{matrix}$ The variance of C₁and C₂can be obtained by using the formula (7) and (8). $\begin{matrix} σ_{1}^{2} = \sum_{i = 0}^{T^{*}} {(i - U_{1})}^{2} \frac{P (i)}{W_{1}} & (9) \\ σ_{2}^{2} = \sum_{i = T^{*} + 1}^{I - 1} {(i - U_{2})}^{2} \frac{P (i)}{W_{2}} & (10) \end{matrix}$ The sum of variance in C₁and C₂are;

σ

_w²=W₁σ

₁²+W₂σ

₂²

(11) Substitute the value 0-255 for formula (11). When the formula (11) has the minimum value, then the optimal threshold T* can be obtained. Step 5;

Although the residual noises have been removed through binarization in step 4, however, the moving object becomes dilapidated. This can be removed by using four connected masks and the inflation and erosion algorithm. The inflation algorithm is described as follows;

when M_b(i,j)=255, set the mask of the 4-neighbor points as
M_b(i,j−

1)=M_b(i,j+1)=M_b(i−

1,i)=M_b(i+1,j)=255

(12) The erosion algorithm is described as follows;

when M_b(i,j)=0, set the mask of the 4 neighbor points as
M_b(i,j−

1)=M_b(i,j+1)=M_b(i−

1,j)=M_b(i+1,j)=0

(13) Convoluting the above-mentioned mask and binarized image can eliminate the dilapidation. Step 6;

Next, the lateral mask can be used to obtain the contours of the moving object. Where, the Sobel (the image contour operation mask) is used to obtain the object contours. Convolute the Sobel (the image contour operation mask) mask and the real-time image, which can be denoted by formula (14) and (15);

G_x(x,y)=(NEW(x−

1,y+1)+2×

NEW(x,y+1)+NEW(x+1,y+1))−

(NEW(x−

1,y−

1)+2×

NEW(x,y−

1)+NEW(x+1,y−

1))

(14)
G_y(i,j)=(NEW(x+1,y−

1)+2×

NEW(x+1,y)+NEW(x+1,y+1))−

(NEW(x−

1,y−

1)+2×

NEW(x−

1,y)+NEW(x−

1,y+1)) The rim of the acquired image can be obtained by using formula (16).
G(x,y)=√

{square root over (G_x(x,y)²+G_y(x,y)²)}{square root over (G_x(x,y)²+G_y(x,y)²)}

(16) Then the above rim image is binarized. $\begin{matrix} E (x, y) = {\begin{matrix} 255 & G (x, y) \geq T_{e}^{*} \\ 0 & G (x, y) < T_{e}^{*} \end{matrix} & (17) \end{matrix}$ Wherein T_e* represents the optimal threshold, the optimal threshold can be obtained using the prior method;

then, after mixing the binarization contour pattern of the real-time image and the differentiated binary image BIN(x,y), the periphery contour of the moving object can be obtained. Step 7;

Check if the contour point coordinates of the moving object is touched by the reactive area and run the corresponding movement. Step 8;

Repeat all the steps above;

View all claims

0 Assignments

Timeline View

Assignment View

0 Petitions

Accused Products

Abstract

This invention relates to a passive and interactive real-time image recognition software method, particularly to a real-time image recognition software method without the effects of the ambient light sources and noises, which includes passive and interactive recognition methods.

Citations

3 Claims

1. Whereas, the passive real-time image recognition method is described as follows:
- Step 1;
  
  Capture an image projected by an image projection apparatus to image areas as reference images (5×
  
  5 grey-level value) by using a video camera;
  
  Step 2;
  
  Continuously capture real-time images (5×
  
  5 grey-level value) projected by an image projection apparatus to image areas by using a video camera, and check if any foreign object touches the reactive area.The difference value between the reference image from step 1 and the real-time image from step 2 can be denoted as follows (1);
  
  DIFF(x,y)=|REF(x,y)−
  
  NEW(x,y)|
  
  (1)Step 3;
  
  Difference of the grey-level value of real-time image in step 2 and the grey-level value of reference image in step 1 to have the grey-level distribution of remaining images, which means foreign objects exist.Step 4;
  
  The image which is subject to differencing through step 3 usually has noises, which can be present as in formula (2) $\begin{matrix} BIN (x, y) = {\begin{matrix} 255 & DIFF (x, y) \geq T^{*} \\ 0 & DIFF (x, y) < T^{*} \end{matrix} & (2) \end{matrix}$ The binarization method eliminates the noises;
  
  in which, T* represents a threshold, in 8 bit grey-scale image and the threshold ranges from 0 to 255. The optimal threshold can be decided by a statistical method. The optimal threshold is on the wave trough of the grey-level value;
  
  when T* is decided, the image can be segmented into two sections. The requirement for the optimal threshold T* is when the sum of variances in C₁and the variances in C₂has the minimum value. It is assumed that the size of the image is N=5×
  
  5, and the grey-level value number of 8 bit grey-level image is I=256. Then the probability of grey-level value is I can be denoted as;
  
  $\begin{matrix} P (i) = \frac{n_{i}}{N} & (3) \end{matrix}$ Wherein n_iindicates the appearance number of grey-level value I, and the range of I is 0≦
  
  i≦
  
  I−
  
  1. According to the probability principle, the following can be obtained;
  
  $\begin{matrix} \sum_{i = 0}^{I - 1} P (i) = 1 & (4) \end{matrix}$ Suppose the ratio of the pixel number in C₁is;
  
  $\begin{matrix} W_{1} = \Pr (C_{1}) = \sum_{i = 0}^{T^{*}} P (i) & (5) \end{matrix}$ While the ratio of the pixel number in C₂is;
  
  $\begin{matrix} W_{2} = \Pr (C_{2}) = \sum_{i = T^{*} + 1}^{I - 1} P (i) & (6) \end{matrix}$ Here W₁+W₂=1 can be satisfied. The expect value of C₁can be calculated as;
  
  $\begin{matrix} U_{1} = \sum_{i = 0}^{T^{*}} \frac{P (i)}{W_{1}} \times i & (7) \end{matrix}$ The expect value of C₂is;
  
  $\begin{matrix} U_{2} = \sum_{i = T^{*} + 1}^{I - 1} \frac{P (i)}{W_{2}} \times i & (8) \end{matrix}$ The variance of C₁and C₂can be obtained by using the formula (7) and (8). $\begin{matrix} σ_{1}^{2} = \sum_{i = 0}^{T^{*}} {(i - U_{1})}^{2} \frac{P (i)}{W_{1}} & (9) \\ σ_{2}^{2} = \sum_{i = T^{*} + 1}^{I - 1} {(i - U_{2})}^{2} \frac{P (i)}{W_{2}} & (10) \end{matrix}$ The sum of variance in C₁and C₂are;
  
  σ
  
  _w²=W₁σ
  
  ₁²+W₂σ
  
  ₂²
  
  (11) Substitute the value 0-255 for formula (11). When the formula (11) has the minimum value, then the optimal threshold T* can be obtained. Step 5;
  
  Although the residual noises have been removed through binarization in step 4, however, the moving object becomes dilapidated. This can be removed by using four connected masks and the inflation and erosion algorithm. The inflation algorithm is described as follows;
  
  when M_b(i,j)=255, set the mask of the 4-neighbor points as
  M_b(i,j−
  
  1)=M_b(i,j+1)=M_b(i−
  
  1,i)=M_b(i+1,j)=255
  
  (12) The erosion algorithm is described as follows;
  
  when M_b(i,j)=0, set the mask of the 4 neighbor points as
  M_b(i,j−
  
  1)=M_b(i,j+1)=M_b(i−
  
  1,j)=M_b(i+1,j)=0
  
  (13) Convoluting the above-mentioned mask and binarized image can eliminate the dilapidation. Step 6;
  
  Next, the lateral mask can be used to obtain the contours of the moving object. Where, the Sobel (the image contour operation mask) is used to obtain the object contours. Convolute the Sobel (the image contour operation mask) mask and the real-time image, which can be denoted by formula (14) and (15);
  
  G_x(x,y)=(NEW(x−
  
  1,y+1)+2×
  
  NEW(x,y+1)+NEW(x+1,y+1))−
  
  (NEW(x−
  
  1,y−
  
  1)+2×
  
  NEW(x,y−
  
  1)+NEW(x+1,y−
  
  1))
  
  (14)
  G_y(i,j)=(NEW(x+1,y−
  
  1)+2×
  
  NEW(x+1,y)+NEW(x+1,y+1))−
  
  (NEW(x−
  
  1,y−
  
  1)+2×
  
  NEW(x−
  
  1,y)+NEW(x−
  
  1,y+1)) The rim of the acquired image can be obtained by using formula (16).
  G(x,y)=√
  
  {square root over (G_x(x,y)²+G_y(x,y)²)}{square root over (G_x(x,y)²+G_y(x,y)²)}
  
  (16) Then the above rim image is binarized. $\begin{matrix} E (x, y) = {\begin{matrix} 255 & G (x, y) \geq T_{e}^{*} \\ 0 & G (x, y) < T_{e}^{*} \end{matrix} & (17) \end{matrix}$ Wherein T_e* represents the optimal threshold, the optimal threshold can be obtained using the prior method;
  
  then, after mixing the binarization contour pattern of the real-time image and the differentiated binary image BIN(x,y), the periphery contour of the moving object can be obtained. Step 7;
  
  Check if the contour point coordinates of the moving object is touched by the reactive area and run the corresponding movement. Step 8;
  
  Repeat all the steps above;

2. The said interactive real-time image recognition software method is described as follows:
- Step 1;
  
  Capture the image projected to the image region by an image projection apparatus as reference images by using video camera;
  
  Step 2;
  
  Capture the real-time image continuously projected by an image projection apparatus to the image region by using a video camera, wherein images have active images. Then, check if the reactive area is touched by any foreign object.The difference value between reference images in step 1 and real-time images in step 2 can be defined by the following formula (1);
  
  DIFF(x,y)=|REF(x,y)−
  
  NEW(x,y)|
  
  (1)Step 3;
  
  Difference of the grey-level values of said reference image from step 1 with grey-level values of real-time images from step 2 and get the remaining image, which is denoted by formula (2) $\begin{matrix} BIN (x, y) = {\begin{matrix} 255 & DIFF (x, y) \geq T^{*} \\ 0 & DIFF (x, y) < T^{*} \end{matrix} & (2) \end{matrix}$ The binarization method removes the effect of noises. Step 4;
  
  After binarization, the white segments refer to the active images and within the images. The active images and can be segmented by using the Line Segment Coding Method, said line segment coding method is a line segment restore method to store every bit of data in an object. Once the segmented images are detected in line 1, it can be regarded as the first line of the first object denoted as 1-1. Then, two lines are detected in the second line. Since the first line is under 1-1 that is denoted as 1-2, the second line is a new object denoted as 2-1. Accordingly, there is only 1 line under object 1 and object 2 in the forth line. Therefore, the image originally regarded as two objects actually is an object, which is denoted as 1-4. After all the images are scanned, then the merge procedure is performed. Wherein, the information of every object includes;
  
  square area, circumference, object characteristic, segmented image size, width and the total number of the object. Step 5;
  
  when the active images and activity reactive area are segmented, every object characteristic value is calculated. Seven unchanged matrixes are used to represent the object characteristics. The solution is described as follows;
  
  The (k+1) matrix definition of a binary image b(m, n) is $\begin{matrix} M_{k, l} = \sum_{m = 0}^{M - 1} \sum_{n = 0}^{N - 1} m^{k} n^{l} b (m, n) & (18) \end{matrix}$ Wherein, the center matrix is defined as;
  
  $\begin{matrix} μ_{k, l} = \sum_{m = 0}^{M - 1} \sum_{n = 0}^{N - 1} {(m - \overline{x})}^{k} {(n - \overline{y})}^{l} b (m, n) & (19) \end{matrix}$ Wherein, $\overline{x} = \frac{M_{1, 0}}{M_{0, 0}}, \overline{y} = \frac{M_{0, 1}}{M_{0, 0}}$ represents the mass center of the object respectively.Then the normalized center matrix of the formula (19) is defined as follows;
  
  $\begin{matrix} η_{k, l} = \frac{μ_{k, l}}{{(\sqrt{μ_{0, 0}})}^{k + l + 2}} & (20) \end{matrix}$ The seven unchanged matrixes can be obtained by the normalized second and third order matrix;
  
  $φ_{1} = η_{2, 0} + η_{0, 2}$ $φ_{2} = {(η_{2, 0} - η_{0, 2})}^{2} + 4 η_{1, 1}^{2}$ $φ_{3} = {(η_{3, 0} - 3 η_{1, 2})}^{2} + {(3 η_{2, 1} - η_{0, 3})}^{2}$ $φ_{4} = {(η_{3, 0} + η_{1, 2})}^{2} + {(η_{2, 1} + η_{0, 3})}^{2}$ $\begin{matrix} φ_{5} = (η_{3, 0} - 3 η_{1, 2}) (η_{3, 0} + η_{1, 2}) [{(η_{3, 0} + η_{1, 2})}^{2} - 3 {(η_{2, 1} + η_{0, 3})}^{2}] + \\ (3 η_{2, 1} - η_{0, 3}) (η_{2, 1} + η_{0, 3}) [3 {(η_{3, 0} + η_{1, 2})}^{2} - {(η_{2, 1} + η_{0, 3})}^{2}] \end{matrix}$ $\begin{matrix} φ_{6} = (η_{2, 0} - η_{0, 2}) [{(η_{3, 0} + η_{1, 2})}^{2} - {(η_{2, 1} + η_{0, 3})}^{2}] + \\ 4 η_{1, 1} (η_{3, 0} + η_{1, 2}) (η_{2, 1} + η_{0, 3}) \end{matrix}$ $\begin{matrix} φ_{7} = (3 η_{2, 1} - η_{0, 3}) (η_{3, 0} + η_{1, 2}) [{(η_{3, 0} + η_{1, 2})}^{2} - 3 {(η_{2, 1} + η_{0, 3})}^{2}] + \\ (3 η_{1, 2} - η_{0, 3}) (η_{2, 1} + η_{0, 3}) [3 {(η_{3, 0} + η_{1, 2})}^{2} - {(η_{2, 1} + μ_{0, 3})}^{2}] \end{matrix}$ Step 6;
  
  In the realistic pattern recognition process, the pattern of each category has different characteristic vectors within a range, while the falling point within the range cannot be predicted precisely even the range is known. Such kind of random problem can be described using the probability concept. Here, the Bayesian classifier of Gaussian pattern category is adopted to recognize patterns to be identified in real time, which can be described as;
  
  $\begin{matrix} D_{j} (x) = - \frac{1}{2} \ln \langle C_{j} \rangle - \frac{1}{2} [{(x - m_{j})}^{T} C_{j}^{- 1} (x - m_{j})], j = 1, 2 Λ M - & (21) \end{matrix}$ Wherein, D_jis the j^thpattern decision function;
  
  x=[φ
  
  ₁Aφ
  
  ₇] is the j^theigenvector;
  
  m_jand C_jis the j^thaverage eigenvector and covariance matrix. When D is the maximum, it is classified as the j^thpattern. After the pattern recognition is completed, the position of the reactive area is decided. The recognition process can be summarized as;
  
  (1). Practice the pattern template in advance, calculate each category φ
  
  ₁Aφ
  
  ₇, and calculate m_jand C_jof each category, then the decision rules of each categorizer are completed.(2). Segment the images acquired by video camera 12 into several sub images through step 4, and then calculate each D_j(x) of sub images.(3) Compare the size of D_j(x), identify the maximum, and set the pattern as the k^thcategory.After the recognition, the activity reactive area can be located precisely. Step 7;
  
  Check if the activity reactive area is touched by foreign objects and perform the corresponding actions. Step 8;
  
  Repeat all the steps above;
- View Dependent Claims (3)
- - 3. This is the same above-mentioned claim 2 of the said interactive real-time image recognition software method, wherein said step 6:
    - if there are several reactive areas in the images, there are several sub reference images. The passive recognition step 1 through 8 is utilized to determine whether the foreign object touches the sub reference images.

Specification

Resources

Litigation Campaign Assessment

Current Assignee
Chao-Wang Hsiung, Chih Hung Chuang, Hsien-Wen Chang
Original Assignee
Chao-Wang Hsiung, Chih Hung Chuang, Hsien-Wen Chang
Inventors
Chang, Hsien-Wen, Hsiung, Chao-Wang, Chuang, Chih Hung

Application Number

US11/455,187
Publication Number

US 20070292033A1
Time in Patent Office

Days
Field of Search
US Class Current

382/218
CPC Class Codes

G06V 10/751 Comparing pixel values or l...

Passive and interactive real-time image recognition software method

First Claim

0 Assignments

0 Petitions

Accused Products

Abstract

Citations

3 Claims

Specification

Solutions

Use Cases

Quick Links

Passive and interactive real-time image recognition software method

First Claim

0 Assignments

Subscription Required

Subscription Required

0 Petitions

Subscription Required

Accused Products

Subscription Required

Abstract

Citations

3 Claims

Specification

Subscription Required

Solutions

Use Cases

Quick Links