PATTERN CHANGE DISCOVERY BETWEEN HIGH DIMENSIONAL DATA SETS

US 20140122039A1
Filed: 10/23/2013
Published: 05/01/2014
Est. Priority Date: 10/25/2012
Status: Active Grant

Abstract

The general problem of pattern change discovery between high-dimensional data sets is addressed. The notion of the principal angles between subspaces is introduced to measure the subspace difference between two high-dimensional data sets. Current methods either focus mainly on magnitude change detection in low-dimensional data sets or operate under supervised frameworks. Principal angles have the property of isolating subspace change from magnitude change. To address the challenge of directly computing the principal angles, matrix factorization is used as a statistical framework, and the principle of dominant subspace mapping is developed to transfer principal-angle-based detection to a matrix factorization problem. Matrix factorization can be naturally embedded into the likelihood ratio test based on the linear models. The method may be unsupervised and addresses the statistical significance of the pattern changes between high-dimensional data sets.
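
The property that principal angles separate subspace change from magnitude change can be illustrated with a minimal sketch. It is not the patented procedure: it assumes the dominant subspace is taken as the top-k left singular vectors of each data matrix and computes the angles directly with an SVD. The function names `dominant_subspace` and `principal_angles`, the dimensions, and the synthetic data are all hypothetical.

```python
import numpy as np


def dominant_subspace(X, k):
    """Orthonormal basis (d x k) of the top-k left singular subspace of X (d x n)."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :k]


def principal_angles(X1, X2, k):
    """Principal angles in radians, ascending, between rank-k dominant subspaces."""
    U1 = dominant_subspace(X1, k)
    U2 = dominant_subspace(X2, k)
    # The singular values of U1^T U2 are the cosines of the principal angles.
    cosines = np.linalg.svd(U1.T @ U2, compute_uv=False)
    return np.arccos(np.clip(cosines, -1.0, 1.0))


rng = np.random.default_rng(0)
d, n, k = 50, 200, 3  # assumed sizes, chosen only for illustration

# Two data sets drawn from the same subspace, the second scaled by 10.
basis_a, _ = np.linalg.qr(rng.standard_normal((d, k)))
X1 = basis_a @ rng.standard_normal((k, n))
X2 = 10.0 * basis_a @ rng.standard_normal((k, n))

# A third data set drawn from a different random subspace.
basis_b, _ = np.linalg.qr(rng.standard_normal((d, k)))
X3 = basis_b @ rng.standard_normal((k, n))

angles_same = principal_angles(X1, X2, k)  # magnitude change only
angles_diff = principal_angles(X1, X3, k)  # subspace change
print("same subspace, 10x magnitude:", angles_same)
print("different subspace:", angles_diff)
```

In this sketch the tenfold magnitude change leaves the angles at essentially zero, while the change of subspace produces large angles.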

20 Claims

1. A method for pattern change discovery between high-dimensional data sets, comprising:
- determining a linear model of a dominant subspace for each pair of high dimensional data sets using at least one automated processor, and using matrix factorization to produce a set of principal angles representing differences between the linear models;
  
  defining a set of basis vectors under a null hypothesis of no statistically significant pattern change and under an alternative hypothesis of a statistically significant pattern change;
  
  performing a statistical test on the basis vectors with respect to the null hypothesis and the alternate hypothesis to automatically determine whether a statistically significant difference is present; and
  
  producing an output selectively dependent on whether the statistically significant difference is present.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10):
  - 2. The method according to claim 1, wherein the statistical test comprises a likelihood ratio test.
  - 3. The method according to claim 1, wherein said automatically determining is unsupervised.
  - 4. The method according to claim 1, further comprising determining a statistical significance of a change of pattern represented in the linear models.
  - 5. The method according to claim 1, wherein the high-dimensional data sets comprise semantic data.
  - 6. The method according to claim 1, wherein the high-dimensional data sets comprise video data.
  - 7. The method according to claim 6, wherein the high-dimensional data sets comprise video representing a common location acquired at different times.
  - 8. The method according to claim 1, wherein the high-dimensional data sets comprise multimedia data.
  - 9. The method according to claim 1, wherein the statistical test employs a likelihood ratio statistic given by
  - 10. The method according to claim 9, wherein the maximum likelihood estimates subject to the null hypothesis constraint H₀: P′ᵀP = 0

11. A system for determining pattern change discovery between high-dimensional data sets, comprising:
- an input configured to receive a pair of high dimensional data sets;
  
  at least one automated processor, configured to:
  
  determine a linear model of a dominant subspace for each pair of high dimensional data sets;
  
  factor at least one matrix to produce a set of principal angles representing differences between the linear models;
  
  define a set of basis vectors under a null hypothesis of no statistically significant pattern change and under an alternative hypothesis of a statistically significant pattern change; and
  
  perform a statistical test on the basis vectors with respect to the null hypothesis and the alternate hypothesis to determine whether a statistically significant difference is present; and
  
  an output configured to communicate data selectively dependent on whether the statistically significant difference is present.
- Dependent claims (12, 13, 14, 15, 16, 17, 18, 19):
  - 12. The system according to claim 11, wherein the statistical test comprises a likelihood ratio test.
  - 13. The system according to claim 11, wherein said automatically determining is unsupervised.
  - 14. The system according to claim 11, wherein the at least one processor is further configured to determine a statistical significance of pattern changes between the dominant subspaces.
  - 15. The system according to claim 11, wherein the high-dimensional data sets comprise semantic data.
  - 16. The system according to claim 11, wherein the high-dimensional data sets comprise video data.
  - 17. The system according to claim 11, wherein the high-dimensional data sets comprise video representing a common location acquired at different times.
  - 18. The system according to claim 11, wherein the high-dimensional data sets comprise multimedia data.
  - 19. The system according to claim 11, wherein the statistical test employs a likelihood ratio statistic given by

20. A nontransitory computer readable medium, comprising instructions for controlling a programmable processor to perform a method comprising:
- determining a linear model of a dominant subspace for each pair of high dimensional data sets using at least one automated processor, and using matrix factorization to produce a set of principal angles representing differences between the linear models;
  
  defining a set of basis vectors under a null hypothesis of no statistically significant pattern change and under an alternative hypothesis of a statistically significant pattern change;
  
  performing a statistical test on the basis vectors with respect to the null hypothesis and the alternate hypothesis to automatically determine whether a statistically significant difference is present; and
  
  producing an output selectively dependent on whether the statistically significant difference is present.
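
The hypothesis-testing step recited in the claims can be sketched in generic form. The likelihood ratio statistic of claims 9 and 19 is not reproduced in this record, so the example below substitutes a textbook Gaussian likelihood ratio statistic and should not be read as the claimed formula. It assumes independent Gaussian noise of common unknown variance, a shared rank-k basis under the null hypothesis, and a separate rank-k basis per data set under the alternative. All function names, sizes, and data are hypothetical, and choosing a significance threshold is left out.

```python
import numpy as np


def top_k_basis(X, k):
    """Top-k left singular vectors of X (d x n)."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :k]


def residual_ss(X, U):
    """Squared Frobenius norm of X after removing its projection onto span(U)."""
    return float(np.linalg.norm(X - U @ (U.T @ X)) ** 2)


def lr_statistic(X1, X2, k):
    """-2 log likelihood ratio: shared basis (H0) vs. separate bases (H1)."""
    # H0: a single basis fitted to both data sets stacked column-wise.
    U0 = top_k_basis(np.hstack([X1, X2]), k)
    rss0 = residual_ss(X1, U0) + residual_ss(X2, U0)
    # H1: each data set is fitted with its own basis.
    rss1 = residual_ss(X1, top_k_basis(X1, k)) + residual_ss(X2, top_k_basis(X2, k))
    n_obs = X1.size + X2.size  # number of scalar observations
    return n_obs * np.log(rss0 / rss1)


rng = np.random.default_rng(1)
d, n, k = 30, 100, 2  # assumed sizes, chosen only for illustration


def sample(basis):
    """Noisy data whose signal lies in span(basis)."""
    signal = basis @ rng.standard_normal((k, n))
    return signal + 0.1 * rng.standard_normal((d, n))


A, _ = np.linalg.qr(rng.standard_normal((d, k)))
B, _ = np.linalg.qr(rng.standard_normal((d, k)))

stat_same = lr_statistic(sample(A), sample(A), k)  # no pattern change
stat_diff = lr_statistic(sample(A), sample(B), k)  # pattern change
print("no change:", stat_same, "change:", stat_diff)
```

Because the separate-basis fit can never be worse than the shared-basis fit, the statistic is non-negative, and it grows when the two data sets occupy different subspaces. Turning it into a decision would require a reference distribution or a resampling scheme, which this sketch does not attempt.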

Current Assignee: The Research Foundation for The State University of New York (State University of New York)
Original Assignee: The Research Foundation for The State University of New York (State University of New York)
Inventors: Zhang, Zhongfei Mark; Xu, Yi

Granted Patent: US 10,860,683 B2
US Class Current: 703/2
CPC Class Codes:
- G06F 17/18 for evaluating statistical ...
- G06F 18/213 Feature extraction, e.g. by...
