METHOD AND APPARATUS FOR COMMUNICATING AND RECOVERING MOTION INFORMATION

US 20140307798A1
Filed: 09/10/2012
Published: 10/16/2014
Est. Priority Date: 09/09/2011
Status: Abandoned Application

Abstract

This invention describes a method for communicating crude motion information using tracking metadata and recovering more accurate motion information from the received tracking metadata and partial video frame data; in particular, we use metadata to convey crude boundaries of objects in the scene and signal motion information for these objects. The proposed method leaves the task of identifying the exact boundaries of an object to the decoder/client. The proposed method is particularly appealing when metadata itself carries semantics that the client is interested in, such as tracking information in surveillance applications, because, in this case, metadata does not constitute an overhead. The proposed method involves motion descriptions that can be used to predict the appearance of an object in any one frame from its appearance in any other frame that contains the object. That is, the motion information itself allows locations within an object to be invertibly mapped to locations within the same object in any other relevant frame. This is a departure from conventional motion coding schemes, which tightly couple motion information to the prediction strategy. This property makes the proposed method particularly suitable for applications which require flexible access to the content.
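
The invertible mapping described in the abstract can be illustrated with a minimal sketch. The abstract does not fix a particular motion model, so the 2-D affine model below, the class name `AffineMotion`, and its field names are assumptions chosen only for illustration. The point is that a single motion description can map locations from frame A to frame B and, through its inverse, from B back to A.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AffineMotion:
    """Hypothetical affine model: (x, y) -> (a*x + b*y + tx, c*x + d*y + ty)."""
    a: float
    b: float
    c: float
    d: float
    tx: float
    ty: float

    def apply(self, x, y):
        # Map a location in the source frame to the target frame.
        return (self.a * x + self.b * y + self.tx,
                self.c * x + self.d * y + self.ty)

    def inverse(self):
        # Invert the 2x2 linear part, then the translation.
        det = self.a * self.d - self.b * self.c
        if det == 0:
            raise ValueError("motion model is not invertible")
        ia, ib = self.d / det, -self.b / det
        ic, id_ = -self.c / det, self.a / det
        return AffineMotion(ia, ib, ic, id_,
                            -(ia * self.tx + ib * self.ty),
                            -(ic * self.tx + id_ * self.ty))


# Usage: map a point forward, then recover it with the inverse model.
forward = AffineMotion(1.0, 0.5, 0.0, 2.0, 3.0, -1.0)
moved = forward.apply(4.0, 2.0)
recovered = forward.inverse().apply(*moved)
```

Because the same description serves both directions, the choice of which frame predicts which is left to the decoder rather than being fixed by the motion data.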

55 Claims

1. A method for recovering motion information within multiple frame media content, using video frame sample data for some frames, together with motion data that carries some information about the motion between frames, comprising the following steps:
- (a) selection of two or more reference frames to be used in predicting a further frame;
  
  (b) using the motion data to identify at least two spatial domains within a first reference frame, to group each of these with a corresponding spatial domain in each other reference frame, and to determine a parametric representation of motion between corresponding domains in different reference frames;
  
  (c) using the motion representations, domain correspondences and reference video frame sample values to determine validity information for each overlapping domain within the first reference frame;
  
  (d) using the parametric motion representations, domain correspondences and reference video frame sample values to determine validity information for each overlapping domain within each other reference frame;
  
  (e) using the parametric motion representations, validity information and reference frame sample values to form a prediction of said further frame.
- View Dependent Claims (2, 3, 4, 8, 18, 19, 22, 47)
  - 2. The method of claim 1, comprising a further step of iteratively applying steps (a) to (e) to the prediction of additional frames.
  - 3. The method of claim 1, wherein the motion data is tracking metadata, wherein the spatial domains are tracked regions of interest and where domain correspondences are obtained from the tracking semantics.
  - 4. The method of claim 3, where the parametric motion models are obtained from the geometry of the tracked regions of interest.
  - 8. The method of claim 1, wherein successive iterations of the steps predict frames in a hierarchical fashion, such that each frame is predicted using two reference frames and the frames that are predicted in one level of the hierarchy may be selected as reference frames when predicting other frames in the next level of the hierarchy.
  - 18. The method of claim 1 wherein the validity information within some reference frames is obtained by motion compensating the validity information found for a corresponding domain within another reference frame.
  - 19. The method of claim 1, wherein the predicted frame is obtained by motion compensating each reference frame, within each of the overlapping spatial domains, using the associated parametric motion models and forming a weighted combination of the resulting motion compensated sample values, where the weights are based on the validity information for each domain.
  - 22. The method of claim 1, wherein the video frame sample data is available in the form of images compressed using the JPEG2000 image compression standard, and a portion of the compressed representation and the auxiliary metadata is communicated to a client using the JPIP standard, that portion being used by the method of claim 5 to reconstruct content that has not been communicated.
  - 47. A non-transient computer readable medium, comprising instructions for controlling a computer to implement a method in accordance with claim 1.
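
The weighted combination recited in claim 19 can be sketched as follows. This is a minimal illustration, not the claimed implementation: it assumes the reference frames have already been motion compensated, that validity is given as a per-sample weight map of the same shape, and that a plain average is used where no reference is considered valid. The function name `blend_references`, the `eps` threshold, and the fallback rule are all hypothetical.

```python
def blend_references(compensated, validity, eps=1e-9):
    """Blend motion-compensated frames using per-sample validity weights.

    compensated: list of frames, each a 2-D list of sample values
                 (assumed to be motion compensated already).
    validity:    list of weight maps with the same shape as the frames.
    """
    rows, cols = len(compensated[0]), len(compensated[0][0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            samples = [frame[r][c] for frame in compensated]
            weights = [vmap[r][c] for vmap in validity]
            total = sum(weights)
            if total > eps:
                # Validity-weighted average of the reference samples.
                out[r][c] = sum(w * s for w, s in zip(weights, samples)) / total
            else:
                # Assumed fallback: no reference is valid, so average them.
                out[r][c] = sum(samples) / len(samples)
    return out


# Usage: the second sample is valid only in reference B.
ref_a, ref_b = [[10.0, 20.0]], [[30.0, 40.0]]
valid_a, valid_b = [[1.0, 0.0]], [[1.0, 1.0]]
predicted = blend_references([ref_a, ref_b], [valid_a, valid_b])
```

Where both references are valid the result is their mean; where only one is valid, its sample is used unchanged.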

5-7. (canceled)

9-17. (canceled)

20-21. (canceled)

23. A multiresolution method for comparing two images over a spatial domain of interest to determine a set of likelihood ratios for each resolution level, in which each location within the resolution level has its own likelihood ratio that expresses the probability that the spatial features of the two images are matched at said location, divided by the probability that the spatial features of the two images are not matched at said location, comprising the steps of:
- (a) decomposing each image into a multi-resolution hierarchy;
  
  (b) determining a first set of likelihood ratios for each resolution level based on spatial neighbourhoods of the associated locations in each of the two images within said resolution level;
  
  (c) determining a second set of likelihood ratios for each resolution level by combining the first set of likelihood ratios with the final set of likelihood ratios determined at a lower resolution level, except at the lowest resolution level, where the first and second sets of likelihood ratios are the same.
- View Dependent Claims (24, 29, 39, 40, 41, 43, 54)
  - 24. A method in accordance with claim 23, comprising the further step of:
    - determining a final set of likelihood ratios for each location in each resolution level by applying an edge refinement process to the second set of likelihood ratios.
  - 29. The method of claim 23, wherein determination of the first set of likelihood ratios for each resolution level involves the steps of:
    - (a) determining structural measure values for each image at each location in said resolution level;
      
      (b) determining similarity feature values at each location in said resolution level;
      
      (c) forming a structural similarity likelihood ratio at each location in said resolution level based on the two structural measure values at that location, one from each image;
      
      (d) forming a conditional similarity likelihood ratio at each location in said resolution level from the similarity feature values at that location, conditioned upon the structural measure values at that location;
      
      (e) forming the first set of likelihood ratios for the resolution level by multiplying the structural similarity likelihood ratios by the conditional similarity likelihood ratios at each location.
  - 39. The method of claim 23, wherein the probability ratios are found using lookup tables that are populated through an off line modeling procedure.
  - 40. The method of claim 24, wherein the likelihood ratios are expressed in a logarithmic domain and the first, second and final sets of likelihood ratios for each resolution level correspond to first, second and final sets of log-likelihood ratios.
  - 41. The method of claim 24, wherein the second set of log-likelihood ratios at a given resolution level are obtained by interpolating and multiplying the final set of log-likelihood ratios from a lower resolution level by a set of scaling factors and adding these scaled and interpolated log-likelihood values to those from the first set of log-likelihood ratios at the given resolution level.
  - 43. The method of claim 23, wherein the likelihoods are derived by applying a transducer function to the likelihood ratios found by applying the method to the first reference frame and a motion compensated version of the second reference frame.
  - 54. A non-transient computer readable medium, comprising instructions for controlling a computer to implement a method in accordance with claim 23.
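
The coarse-to-fine combination in claims 23(c), 29(e) and 41 can be sketched in the log domain, where multiplying likelihood ratios becomes adding log-likelihood ratios. The sketch below makes several simplifying assumptions that are not in the claims: signals are 1-D, each level has twice the resolution of the one below it, interpolation is nearest-neighbour, a single scaling factor of 0.5 is used, and the edge refinement of claim 24 is omitted so the final set equals the second set. All function names are hypothetical.

```python
def first_llr_from_parts(structural_llr, conditional_llr):
    # Claim 29(e) multiplies the two ratios; in the log domain this is a sum.
    return [s + c for s, c in zip(structural_llr, conditional_llr)]


def upsample2(values, length):
    # Nearest-neighbour interpolation by a factor of two (an assumed interpolator).
    return [values[min(i // 2, len(values) - 1)] for i in range(length)]


def combine_levels(first_llr, scale=0.5):
    """Combine per-level log-likelihood ratios from coarse to fine.

    first_llr: list of per-level lists, ordered lowest resolution first.
    Returns the final set for every level (edge refinement omitted).
    """
    final = []
    for level, first in enumerate(first_llr):
        if level == 0:
            # Lowest resolution: the first and second sets are the same.
            second = list(first)
        else:
            coarse = upsample2(final[level - 1], len(first))
            second = [f + scale * c for f, c in zip(first, coarse)]
        final.append(second)
    return final


# Usage: two levels, lowest resolution first.
levels = combine_levels([[2.0, -4.0], [1.0, 1.0, 0.0, -1.0]])
```

A positive value at a coarse location raises the match evidence for the finer locations beneath it, and a negative value lowers it.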

25-28. (canceled)

30-38. (canceled)

42. (canceled)

44. An apparatus for recovering motion information within multiple frame media content, using video frame sample data for some frames, together with motion data that carries some information about the motion between frames, the apparatus comprising a processing apparatus arranged to implement the following steps:
- (a) selection of two or more reference frames to be used in predicting a further frame;
  
  (b) using the motion data to identify at least two spatial domains within a first reference frame, to group each of these with a corresponding spatial domain in each other reference frame, and to determine a parametric representation of motion between corresponding domains in different reference frames;
  
  (c) using the motion representations, domain correspondences and reference video frame sample values to determine validity information for each overlapping domain within the first reference frame;
  
  (d) using the parametric motion representations, domain correspondences and reference video frame sample values to determine validity information for each overlapping domain within each other reference frame;
  
  (e) using the parametric motion representations, validity information and reference frame sample values to form a prediction of said further frame.
- View Dependent Claims (45)
  - 45. An apparatus in accordance with claim 44, the apparatus comprising a video signal decoder.

46. (canceled)

48-49. (canceled)

50. An apparatus for comparing two images over a spatial domain of interest to determine a set of likelihood ratios for each resolution level, in which each location within the resolution level has its own likelihood ratio that expresses the probability that the spatial features of the two images are matched at said location, divided by the probability that the spatial features of the two images are not matched at said location, the apparatus comprising a processor configured to implement the steps of:
- (a) decomposing each image into a multi-resolution hierarchy;
  
  (b) determining a first set of likelihood ratios for each resolution level based on spatial neighbourhoods of the associated locations in each of the two images within said resolution level;
  
  (c) determining a second set of likelihood ratios for each resolution level by combining the first set of likelihood ratios with the final set of likelihood ratios determined at a lower resolution level, except at the lowest resolution level, where the first and second sets of likelihood ratios are the same.
- View Dependent Claims (51, 52)
  - 51. An apparatus in accordance with claim 50, the processor being configured to implement the further step of:
    - determining a final set of likelihood ratios for each location in each resolution level by applying an edge refinement process to the second set of likelihood ratios.
  - 52. An apparatus in accordance with claim 50 or claim 51, comprising a video signal decoder.

53. (canceled)

55-67. (canceled)

Current Assignee: Kakadu R & D Pty Limited
Original Assignee: NewSouth Innovations Pty Limited (University Of New South Wales)
Inventors: Taubman, David Scott; Naman, Aous Thabit
Application Number: US14/343,473
Publication Number: US 20140307798A1
US Class Current: 375/240.16
CPC Class Codes:
H04N 19/44   Decoders specially adapted ...
H04N 19/46   Embedding additional inform...
H04N 19/53   Multi-resolution motion est...
H04N 19/543   using regions
H04N 19/61   in combination with predict...
H04N 19/63   using sub-band based transf...
