Discriminative motion modeling for human motion tracking

US 7,728,839 B2
Filed: 10/26/2006
Issued: 06/01/2010
Est. Priority Date: 10/28/2005
Status: Expired due to Fees

Abstract

A system and method recognizes and tracks human motion from different motion classes. In a learning stage, a discriminative model is learned to project motion data from a high dimensional space to a low dimensional space while enforcing discriminance between motions of different motion classes in the low dimensional space. Additionally, low dimensional data may be clustered into motion segments and motion dynamics learned for each motion segment. In a tracking stage, a representation of human motion is received comprising at least one class of motion. The tracker recognizes and tracks the motion based on the learned discriminative model and the learned dynamics.
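The clustering of low dimensional data into motion segments can be illustrated with a minimal sketch. It assumes a plain k-means (the algorithm named in dependent claims 7 and 20) run over state vectors with the time stamp appended as an extra coordinate, so that temporally close states tend to share a cluster. The function names, the `time_weight` parameter, the evenly spaced initialization, and the toy data are all hypothetical and are not taken from the patent.

```python
import math

def kmeans(points, k, iters=20):
    """Plain k-means on tuples; initial centers are evenly spaced samples (an assumption)."""
    step = max(1, len(points) // k)
    centers = [points[i * step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its closest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels

def segment_motion(low_dim_states, time_stamps, k, time_weight=1.0):
    # Append the weighted time stamp to each low-dimensional state so that
    # temporally contiguous states are pulled into the same cluster.
    augmented = [tuple(s) + (time_weight * t,) for s, t in zip(low_dim_states, time_stamps)]
    return kmeans(augmented, k)

# Toy data: a 1-D trajectory that dwells near 0.0, then near 5.0.
states = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (5.2,)]
stamps = [0, 1, 2, 3, 4, 5]
labels = segment_motion(states, stamps, k=2)
```

Raising `time_weight` makes the segments follow the time axis more strictly; lowering it lets pose similarity dominate. Which balance the patent intends is not stated in this record.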

39 Citations

25 Claims

1. A method for recognizing and tracking human motion comprising steps of:
- receiving, by an input device, a plurality of learned motion segments representing different learned motions within a motion class, wherein each learned motion segment comprises a plurality of state vectors and each state vector comprises a time stamp, and wherein one of the learned motion segments comprises temporally contiguous state vectors clustered together in a low-dimensional space based on the time stamps;
  
  receiving, by the input device, a representation of human motion having at least one motion from the motion class, the at least one motion comprising a sequence of pose states represented in a high dimensional space;
  
  processing the received representation according to computer-executable instructions stored in a memory that cause a processor to execute steps of:
  
  projecting the sequences of pose states from the high dimensional space to the low dimensional space according to a discriminative model that when applied to the sequence of pose states increases the inter-class separability between pose states of different motion classes and decreases the intra-class separability between pose states of a same motion-class;
  
  determining an integer P nearest neighbors of a first projected pose state in the low dimensional space, the P nearest neighbors from P different learned motion segments;
  
  determining P pose predictions for the P different learned motion segments; and
  
  determining the pose prediction that best matches a current frame of the representation of human motion; and
  
  storing the determined pose prediction to a memory.
- - 2. The method of claim 1 wherein determining the pose prediction that best matches a current frame of the representation of human motion comprises steps of:
    - reconstructing at least one pose prediction in the high dimensional space based on the discriminative model; and
      
      determining an optimal matching of the at least one pose prediction in the high dimensional space to a current frame of the representation of human motion.
  - 3. The method of claim 2 wherein determining an optimal matching comprises steps of:
    - representing the current frame by a human body model comprising coordinates of joints and body parts having shapes associated with limbs, torso and head;
      
      representing each pose prediction in the high dimensional space by the human body model; and
      
      selecting the pose prediction that optimally matches to the current frame based on the human body model.
  - 4. The method of claim 3 wherein the human body model comprises body part descriptors including one or more of a color histogram, a gradient orientation histogram, and a color distance histogram.
  - 5. The method of claim 1 wherein determining the P pose predictions comprises steps of:
    - determining a motion type of each of the nearest neighbors; and
      
      applying a dynamic model to each of the nearest neighbors based on the motion type, the dynamic model learned in a learning stage.
  - 6. The method of claim 1 wherein the discriminative model is received from a learning stage and wherein the learning stage is prior to said receiving steps, the learning stage comprising steps of:
    - receiving motion capture data from a motion capture source, the motion capture data comprising a first motion from a first motion class and a second motion from a second motion class that is different from the first motion class;
      
      processing the motion capture data to extract a first sequence of pose states representing the first motion and a second sequence of pose states representing the second motion;
      
      learning the discriminative model configured to project the first and second sequence of pose states to a low dimensional space and enforce discriminance between the first and second motion classes in the low dimensional space;
      
      applying a clustering algorithm to cluster the temporally contiguous state vectors into the learned motion segments in the low dimensional space; and
      
      learning a dynamic model for each motion segment to generate motion predictions in the low dimensional space.
  - 7. The method of claim 6 wherein the clustering algorithm includes a k-means clustering algorithm.
  - 8. The method of claim 6 wherein learning the discriminative model includes applying a Local Discriminant Embedding (LDE) model.
  - 9. The method of claim 6 wherein learning a discriminative model comprises steps of:
    - computing an intra-class variety representing the sum of the distances between data points that are in the same motion class;
      
      computing the inter-class separability representing the sum of the distances between data points that are in different motion classes;
      
      obtaining a projection matrix configured to reduce the intra-class variety and increase the inter-class separability; and
      
      projecting the motion capture data from the high dimensional space to the low dimensional space based on the projection matrix.
  - 10. The method of claim 1 wherein the pose state comprises a vector of skeleton joint coordinates.
  - 11. The method of claim 1 wherein the at least one motion is tracked without background subtraction.
  - 12. The method of claim 1 wherein tracking the at least one motion comprises tracking the at least one motion in three dimensions.
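The tracking steps recited in claim 1 (find P nearest neighbors from P different learned motion segments, form one pose prediction per segment, keep the prediction that best matches the current frame) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: `track_step`, `predict`, and `match_score` are hypothetical names, the dynamics and the matching score are supplied by the caller, and matching is done directly in the low dimensional space rather than after the high dimensional reconstruction described in claim 2.

```python
import math

def track_step(projected_pose, segments, predict, match_score, p):
    """One tracking step in the low-dimensional space.

    segments: dict mapping segment id -> list of low-dimensional state vectors
    predict: callable(segment_id, neighbor) -> predicted pose (hypothetical dynamics)
    match_score: callable(prediction) -> score against the current frame, higher is better
    """
    # Nearest state vector within each learned segment.
    nearest = {
        seg_id: min(states, key=lambda s: math.dist(s, projected_pose))
        for seg_id, states in segments.items()
    }
    # Keep the P segments whose nearest state is closest, so the P
    # neighbors come from P different segments.
    chosen = sorted(nearest, key=lambda sid: math.dist(nearest[sid], projected_pose))[:p]
    # One pose prediction per chosen segment.
    predictions = [predict(sid, nearest[sid]) for sid in chosen]
    # Select the prediction that best matches the current frame.
    return max(predictions, key=match_score)

# Toy example with three segments in a 2-D space.
segments = {"a": [(0, 0), (1, 0)], "b": [(0, 2), (1, 2)], "c": [(9, 9)]}
observed = (2.0, 0.1)  # stand-in for evidence from the current frame

best = track_step(
    projected_pose=(0.9, 0.4),
    segments=segments,
    predict=lambda sid, s: (s[0] + 1, s[1]),          # assumed unit step along x
    match_score=lambda pred: -math.dist(pred, observed),
    p=2,
)
```

Taking one neighbor per segment keeps the P hypotheses diverse: each comes from a different learned motion, so a wrong guess about the current motion can be corrected at the matching step.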

13. A system for recognizing and tracking human motion comprising:
- an input device for receiving a representation of human motion having at least one motion from a motion class, the at least one motion comprising a sequence of pose states represented in a high dimensional space, and for receiving a plurality of learned motion segments representing different learned motions within the motion class, wherein each learned motion segment comprises a plurality of state vectors and each state vector comprises a time stamp, and wherein one of the learned motion segments comprises temporally contiguous state vectors clustered together in a low-dimensional space based on the time stamps;
  
  a processor adapted to project the sequences of pose states from the high dimensional space to the low dimensional space according to a discriminative model that when applied to the sequence of pose states, increases the inter-class separability between pose states of different motion classes and decreases the intra-class separability between pose states of a same motion class, determining an integer P nearest neighbors of a first projected pose state in the low dimensional space, the P nearest neighbors from P different learned motion segments, determining P pose predictions for the P different learned motion segments, and determining the pose prediction that best matches a current frame of the representation of human motion; and
  
  a memory adapted to store the determined pose state.

14. A computer program product, comprising a computer readable medium storing computer executable code for recognizing and tracking human motion, the computer executable code when executed causing a processor to perform steps of:
- receiving a plurality of learned motion segments representing different learned motions within a motion class, wherein each learned motion segment comprises a plurality of state vectors and each state vector comprises a time stamp, and wherein one of the learned motion segments comprises temporally contiguous state vectors clustered together in a low-dimensional space based on the time stamps;
  
  receiving a representation of human motion having at least one motion from the motion class, the at least one motion comprising a sequence of pose states represented in a high dimensional space;
  
  projecting the sequences of pose states from the high dimensional space to the low dimensional space according to a discriminative model that when applied to the sequence of pose states increases the inter-class separability between pose states of different motion classes and decreases the intra-class separability between pose states of a same motion-class;
  
  determining an integer P nearest neighbors of a first projected pose state in the low dimensional space, the P nearest neighbors from P different learned motion segments;
  
  determining P pose predictions for the P different learned motion segments; and
  
  determining the pose prediction that best matches a current frame of the representation of human motion; and
  
  storing the determined pose prediction to a memory.
- - 15. The computer program product of claim 14 wherein determining the pose prediction that best matches a current frame of the representation of human motion comprises steps of:
    - reconstructing at least one pose prediction in the high dimensional space based on the discriminative model; and
      
      determining an optimal matching of the at least one pose prediction in the high dimensional space to a current frame of the representation of human motion.
  - 16. The computer program product of claim 15 wherein determining an optimal matching comprises steps of:
    - representing the current frame by a human body model comprising coordinates of joints and body parts having shapes associated with limbs, torso and head;
      
      representing each pose prediction in the high dimensional space by the human body model; and
      
      selecting the pose prediction that optimally matches to the current frame based on the human body model.
  - 17. The computer program product of claim 16 wherein the human body model comprises body part descriptors including one or more of a color histogram, a gradient orientation histogram, and a color distance histogram.
  - 18. The computer program product of claim 14, wherein determining the P pose predictions comprises steps of:
    - determining a motion type of each of the nearest neighbors; and
      
      applying a dynamic model to each of the nearest neighbors based on the motion type, the dynamic model learned in a learning stage.
  - 19. The computer program product of claim 14 wherein the discriminative model is received from a learning stage and wherein the learning stage is prior to said receiving steps, the learning stage comprising steps of:
    - receiving motion capture data from a motion capture source, the motion capture data comprising a first motion from a first motion class and a second motion from a second motion class that is different from the first motion class;
      
      processing the motion capture data to extract a first sequence of pose states representing the first motion and a second sequence of pose states representing the second motion;
      
      learning the discriminative model configured to project the first and second sequence of pose states to a low dimensional space and enforce discriminance between the first and second motion classes in the low dimensional space;
      
      applying a clustering algorithm to cluster the temporally contiguous state vectors into the learned motion segments in the low dimensional space; and
      
      learning a dynamic model for each motion segment to generate motion predictions in the low dimensional space.
  - 20. The computer program product of claim 19 wherein the clustering algorithm includes a k-means clustering algorithm.
  - 21. The computer program product of claim 19 wherein learning the discriminative model includes applying a Local Discriminant Embedding (LDE) model.
  - 22. The computer program product of claim 19 wherein learning a discriminative model comprises steps of:
    - computing an intra-class variety representing the sum of the distances between data points that are in the same motion class;
      
      computing the inter-class separability representing the sum of the distances between data points that are in different motion classes;
      
      obtaining a projection matrix configured to reduce the intra-class variety and increase the inter-class separability; and
      
      projecting the motion capture data from the high dimensional space to the low dimensional space based on the projection matrix.
  - 23. The computer program product of claim 14 wherein the pose state comprises a vector of skeleton joint coordinates.
  - 24. The computer program product of claim 14 wherein the at least one motion is tracked without background subtraction.
  - 25. The computer program product of claim 14 wherein tracking the at least one motion comprises tracking the at least one motion in three dimensions.
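The two quantities recited in claims 9 and 22, intra-class variety and inter-class separability, can be computed directly as sums of pairwise distances. The sketch below does that and then picks a projection by brute-force search over candidate directions, projecting 2-D points to 1-D. The search is a stand-in chosen for brevity: it is not Local Discriminant Embedding and not the patent's way of obtaining the projection matrix. All names and the toy data are hypothetical.

```python
import math

def class_sums(points, labels):
    """Return (intra, inter): summed pairwise distances within and across classes."""
    intra = inter = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            if labels[i] == labels[j]:
                intra += d
            else:
                inter += d
    return intra, inter

def project(points, angle):
    # Project 2-D points onto the unit direction at `angle` (a 1x2 projection matrix).
    c, s = math.cos(angle), math.sin(angle)
    return [(x * c + y * s,) for x, y in points]

def best_direction(points, labels, candidates=180):
    # Brute-force search: keep the direction with the highest inter/intra ratio.
    def ratio(angle):
        intra, inter = class_sums(project(points, angle), labels)
        return inter / (intra + 1e-12)
    return max((math.pi * k / candidates for k in range(candidates)), key=ratio)

# Toy data: each class is spread along x, and the classes are separated along y.
points = [(0.0, 0.0), (4.0, 0.2), (0.0, 1.0), (4.0, 1.2)]
labels = [0, 0, 1, 1]

angle = best_direction(points, labels)
intra, inter = class_sums(project(points, angle), labels)
```

In this toy case the selected direction is close to the y axis, which collapses the within-class spread while keeping the two classes apart. A practical method would solve for the projection in closed form rather than scanning angles.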

Current Assignee: Honda Motor Co., Ltd. (Honda Motor Company)
Original Assignee: Honda Motor Co., Ltd. (Honda Motor Company)
Inventors: Fan, Zhimin; Yang, Ming-Hsuan
Primary Examiner(s): Pappas, Peter
Application Number: US11/553,374
Publication Number: US 20070103471A1
Time in Patent Office: 1,314 Days
Field of Search: None
US Class Current: 345/474
CPC Class Codes:
A61B 5/1038   Measuring plantar pressure ...
G06T 7/246   using feature-based methods...
G06V 40/103   Static body considered as a...
