Relevance vector machine

US 6,633,857 B1
Filed: 09/04/1999
Issued: 10/14/2003
Est. Priority Date: 09/04/1999
Status: Expired due to Term

Abstract

A relevance vector machine (RVM) for data modeling is disclosed. The RVM is a probabilistic basis model. Sparsity is achieved through a Bayesian treatment, where a prior is introduced over the weights governed by a set of hyperparameters. As compared to a Support Vector Machine (SVM), the non-zero weights in the RVM represent more prototypical examples of classes, which are termed relevance vectors. The trained RVM utilizes many fewer basis functions than the corresponding SVM, and typically achieves superior test performance. No additional validation of parameters (such as C) is necessary to specify the model, except those associated with the basis.
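The sparsity mechanism described in the abstract can be illustrated with a minimal regression sketch. It is not the patent's implementation: it assumes the re-estimation formulas commonly given in the published sparse Bayesian learning literature, a Gaussian kernel basis, a synthetic noisy sinc data set, and a pruning threshold `alpha_cap`. The function names and all numeric settings are hypothetical choices made for illustration.

```python
import numpy as np

def gaussian_design_matrix(x, centers, width):
    """One Gaussian kernel basis function per center (an assumed basis choice)."""
    d2 = (x[:, None] - centers[None, :]) ** 2
    return np.exp(-d2 / width ** 2)

def fit_rvm_regression(Phi, t, n_iter=300, alpha_cap=1e6):
    """Sparse Bayesian regression sketch: one hyperparameter alpha_i per weight.

    Prior: w_i ~ N(0, 1/alpha_i). Weights whose alpha_i exceeds alpha_cap are
    treated as switched off and their basis functions are pruned.
    """
    n, m = Phi.shape
    alpha = np.ones(m)            # prior precisions (the hyperparameters)
    beta = 1.0 / np.var(t)        # noise precision, rough initial guess
    active = np.arange(m)         # indices of basis functions still in the model
    for _ in range(n_iter):
        P = Phi[:, active]
        # Gaussian posterior over the active weights for the current alpha, beta
        Sigma = np.linalg.inv(beta * P.T @ P + np.diag(alpha[active]))
        mu = beta * Sigma @ P.T @ t
        # gamma_i in [0, 1]: how strongly the data determine weight i
        gamma = np.clip(1.0 - alpha[active] * np.diag(Sigma), 1e-12, 1.0)
        alpha[active] = gamma / np.maximum(mu ** 2, 1e-300)
        resid = t - P @ mu
        beta = max(n - gamma.sum(), 1e-6) / max(resid @ resid, 1e-12)
        keep = alpha[active] < alpha_cap
        if keep.any():            # never prune the whole model
            active = active[keep]
    # Recompute the posterior for the final set of relevance vectors
    P = Phi[:, active]
    Sigma = np.linalg.inv(beta * P.T @ P + np.diag(alpha[active]))
    mu = beta * Sigma @ P.T @ t
    return active, mu, Sigma, beta

# Synthetic example: noisy sinc data, one kernel centered on each input
rng = np.random.default_rng(0)
x = np.linspace(-10.0, 10.0, 60)
t = np.sinc(x / np.pi) + 0.1 * rng.standard_normal(x.size)
Phi = gaussian_design_matrix(x, x, width=3.0)

active, mu, Sigma, beta = fit_rvm_regression(Phi, t)
rmse = float(np.sqrt(np.mean((Phi[:, active] @ mu - t) ** 2)))
print(f"{active.size} of {Phi.shape[1]} basis functions kept, training RMSE {rmse:.3f}")
```

The inputs `x[active]` that survive pruning play the role of the relevance vectors. Only the kernel width is set by hand here, which matches the abstract's remark that the parameters associated with the basis are the ones left to specify.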

23 Claims

1. A computer-implemented method operable on a data set to be modeled, comprising:
- determining a prior distribution for the data set for modeling thereof, the prior distribution having a plurality of weights and a corresponding plurality of hyperparameters;
  
  determining a relevance vector learning machine to obtain a posterior distribution; and
  
  outputting the posterior distribution for the data set.
2. The method of claim 1, wherein determining a prior distribution comprises:
3. The method of claim 2, wherein optimizing a marginal likelihood comprises utilizing an EM approach.
4. The method of claim 2, wherein optimizing a marginal likelihood comprises utilizing a direct differentiation approach.
5. The method of claim 1, wherein determining a prior distribution comprises:
- for each hyperparameter, determining most probable weights;
  
  determining a Hessian at the most probable weights; and
  
  repeating until a predetermined convergence criteria has been satisfied.
6. The method of claim 1, wherein determining a prior distribution for the data set for modeling thereof comprises determining the prior distribution for the data set for linear modeling thereof.
7. The method of claim 1, wherein determining a prior distribution for the data set for modeling thereof comprises determining the prior distribution for the data set for classification modeling thereof.
8. The method of claim 1, wherein the prior distribution comprises a Gaussian prior distribution.
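The loop recited in claim 5 (most probable weights, Hessian at those weights, repeat until convergence), combined with the classification case of claim 7, can be sketched as a Laplace approximation. This is an assumed reading rather than the patent's own procedure: the logistic likelihood, the damped Newton search, the `gamma / w**2` hyperparameter update, the tolerance, and the synthetic two-class data are all illustrative choices, and every function name is hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def log_posterior(Phi, t, alpha, w):
    """Bernoulli log-likelihood plus Gaussian log-prior (up to a constant)."""
    z = Phi @ w
    return t @ z - np.sum(np.logaddexp(0.0, z)) - 0.5 * alpha @ (w * w)

def most_probable_weights(Phi, t, alpha, w, n_newton=50):
    """Damped Newton search for the posterior mode with alpha held fixed.

    Returns the mode and H, the negative Hessian of the log posterior there.
    """
    for _ in range(n_newton):
        y = sigmoid(Phi @ w)
        grad = Phi.T @ (t - y) - alpha * w
        H = (Phi.T * (y * (1.0 - y))) @ Phi + np.diag(alpha)
        step = np.linalg.solve(H, grad)
        lam, f0 = 1.0, log_posterior(Phi, t, alpha, w)
        while lam > 1e-4 and log_posterior(Phi, t, alpha, w + lam * step) < f0:
            lam *= 0.5            # halve the step if the posterior got worse
        w = w + lam * step
        if np.max(np.abs(lam * step)) < 1e-8:
            break
    y = sigmoid(Phi @ w)
    H = (Phi.T * (y * (1.0 - y))) @ Phi + np.diag(alpha)
    return w, H

def fit_rvm_classifier(Phi, t, n_outer=100, tol=1e-3, alpha_cap=1e6):
    m = Phi.shape[1]
    alpha, w, active = np.ones(m), np.zeros(m), np.arange(m)
    for _ in range(n_outer):
        # Step 1: most probable weights for the current hyperparameters
        w[active], H = most_probable_weights(Phi[:, active], t, alpha[active], w[active])
        # Step 2: Hessian at the mode gives a Gaussian approximation
        Sigma = np.linalg.inv(H)
        gamma = np.clip(1.0 - alpha[active] * np.diag(Sigma), 1e-12, 1.0)
        new_alpha = gamma / np.maximum(w[active] ** 2, 1e-300)
        change = np.max(np.abs(np.log(new_alpha) - np.log(alpha[active])))
        alpha[active] = new_alpha
        keep = new_alpha < alpha_cap
        if keep.any():
            active = active[keep]
        # Step 3: repeat until the hyperparameters stop changing
        if change < tol:
            break
    w_mp, H = most_probable_weights(Phi[:, active], t, alpha[active], w[active])
    return active, w_mp, np.linalg.inv(H)

# Synthetic two-class problem in one dimension with overlapping classes
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-1.5, 1.0, 30), rng.normal(1.5, 1.0, 30)])
t = np.concatenate([np.zeros(30), np.ones(30)])
K = np.exp(-((x[:, None] - x[None, :]) ** 2))      # Gaussian kernel, width 1
Phi = np.hstack([np.ones((x.size, 1)), K])          # bias column plus kernels

active, w_mp, Sigma = fit_rvm_classifier(Phi, t)
accuracy = float(np.mean((sigmoid(Phi[:, active] @ w_mp) > 0.5) == (t > 0.5)))
print(f"{active.size} basis functions kept, training accuracy {accuracy:.2f}")
```

Unlike the regression case, the posterior over the weights is not Gaussian here, which is why the Hessian at the mode is needed to approximate it before each hyperparameter update.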

9. A computer-implemented method comprising:
- determining a relevance vector learning machine to obtain a posterior distribution of a data set to be modeled; and
  
  outputting at least the posterior distribution for the data set.
10. The method of claim 9, wherein determining a relevance vector learning machine comprises determining a plurality of weights for the posterior distribution based on a plurality of basis functions by utilizing a corresponding plurality of hyperparameters.
11. The method of claim 10, wherein the plurality of basis functions comprises a plurality of kernel functions.
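To make the role of kernel basis functions in claims 10 and 11 concrete, the sketch below evaluates a model of the form bias plus a weighted sum of kernels centered on a few retained inputs. The Gaussian kernel, the width, and the specific weights and relevance vectors are hypothetical values chosen for illustration, not values from the patent.

```python
import numpy as np

def gaussian_kernel(a, b, width):
    """K(a, b) = exp(-(a - b)^2 / width^2) for every pair of 1-D inputs."""
    return np.exp(-((a[:, None] - b[None, :]) ** 2) / width ** 2)

def predict(x_new, relevance_vectors, weights, bias, width):
    """y(x) = bias + sum_i weights[i] * K(x, relevance_vectors[i])."""
    return bias + gaussian_kernel(x_new, relevance_vectors, width) @ weights

# Hypothetical trained model with two relevance vectors
relevance_vectors = np.array([0.0, 2.0])
weights = np.array([1.0, -0.5])
y = predict(np.array([0.0, 2.0]), relevance_vectors, weights, bias=0.25, width=1.0)
print(y)
```

Prediction cost scales with the number of retained basis functions, so a model with few non-zero weights needs only a few kernel evaluations per new input.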

12. A computer-implemented method comprising:
- selecting a model for a data set to be modeled;
  
  determining a prior distribution for the model having a plurality of weights by utilizing a corresponding plurality of hyperparameters to simplify the model;
  
  determining a relevance vector learning machine to obtain a posterior distribution; and
  
  outputting at least the posterior distribution for the model.

13. A machine-readable medium having instructions stored thereon for execution by a processor to perform a method comprising:
- inputting a data set to be modeled;
  
  determining a prior distribution for the data set for modeling thereof, the prior distribution having a plurality of weights and a corresponding plurality of hyperparameters, comprising, determining a marginal likelihood for the plurality of hyperparameters;
  
  iteratively re-estimating the plurality of hyperparameters to optimize the plurality of hyperparameters;
  
  determining a relevance vector learning machine to obtain a posterior distribution; and
  
  outputting the posterior distribution for the data set.
14. The medium of claim 13, wherein optimizing a marginal likelihood comprises utilizing an EM approach.
15. The medium of claim 13, wherein optimizing a marginal likelihood comprises utilizing a direct differentiation approach.
16. The medium of claim 13, wherein the prior distribution comprises a normal distribution.
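Claims 14 and 15 name two ways of optimizing the marginal likelihood without giving formulas. For the regression case, the re-estimation rules usually stated in the published sparse Bayesian learning literature are reproduced below as a worked illustration. The notation (design matrix $\boldsymbol{\Phi}$, targets $\mathbf{t}$, noise variance $\sigma^2$, posterior mean $\boldsymbol{\mu}$ and covariance $\boldsymbol{\Sigma}$) is assumed here, and mapping these rules onto the claimed "EM approach" and "direct differentiation approach" is an interpretation, not a quotation from the specification.

```latex
\begin{align*}
p(\mathbf{t}\mid\boldsymbol{\alpha},\sigma^2)
  &= \mathcal{N}\!\left(\mathbf{t}\,\middle|\,\mathbf{0},\;
     \sigma^2\mathbf{I} + \boldsymbol{\Phi}\mathbf{A}^{-1}\boldsymbol{\Phi}^{\top}\right),
  \qquad \mathbf{A} = \operatorname{diag}(\alpha_1,\dots,\alpha_M) \\
\boldsymbol{\Sigma}
  &= \left(\sigma^{-2}\boldsymbol{\Phi}^{\top}\boldsymbol{\Phi} + \mathbf{A}\right)^{-1},
  \qquad \boldsymbol{\mu} = \sigma^{-2}\,\boldsymbol{\Sigma}\,\boldsymbol{\Phi}^{\top}\mathbf{t} \\
\text{EM:}\qquad
\alpha_i^{\text{new}}
  &= \frac{1}{\Sigma_{ii} + \mu_i^{2}} \\
\text{Direct differentiation:}\qquad
\alpha_i^{\text{new}}
  &= \frac{\gamma_i}{\mu_i^{2}},
  \qquad \gamma_i = 1 - \alpha_i \Sigma_{ii} \\
\text{Shared fixed point:}\qquad
\alpha_i = \frac{1 - \alpha_i \Sigma_{ii}}{\mu_i^{2}}
  &\;\Longleftrightarrow\;
  \alpha_i\left(\Sigma_{ii} + \mu_i^{2}\right) = 1
  \;\Longleftrightarrow\;
  \alpha_i = \frac{1}{\Sigma_{ii} + \mu_i^{2}}
\end{align*}
```

Both rules therefore stop at the same hyperparameter values; they differ only in the path taken, with the direct form generally reported to converge in fewer iterations.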

17. A machine-readable medium having instructions stored thereon for execution by a processor to perform a method operable on a data set to be modeled comprising:
- determining a prior distribution for the data set for modeling thereof, the prior distribution having a plurality of weights and a corresponding plurality of hyperparameters, comprising;
  
  for each hyperparameter, determining most probable weights;
  
  determining a Hessian at the most probable weights;
  
  repeating until a predetermined convergence criteria has been satisfied; and
  
  outputting a posterior distribution for the data set.
18. The medium of claim 17, wherein the prior distribution comprises a normal distribution.

19. A machine-readable medium having instructions stored thereon for execution by a process to perform a method operable on a data set to be modeled comprising:
- selecting a model for the data set;
  
  determining a prior distribution for the model having a plurality of weights by utilizing a corresponding plurality of hyperparameters to simplify the model;
  
  determining a relevance vector learning machine to obtain a posterior distribution; and
  
  outputting at least the posterior distribution for the model.
20. The medium of claim 19, wherein the model comprises one of: a linear model, and a classification model.

21. A machine-readable medium having instructions stored thereon for execution by a process to perform a method comprising:
- determining a relevance vector learning machine having a plurality of basis functions to obtain a posterior distribution for a data set to be modeled; and
  
  outputting at least the posterior distribution for the data set.
22. The medium of claim 21, wherein determining a relevance vector learning machine comprises determining a plurality of weights for the posterior distribution based on a plurality of basis functions by utilizing a corresponding plurality of hyperparameters.
23. The medium of claim 21, wherein the model comprises one of: a linear model, and a classification model.

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Tipping, Michael
Primary Examiner(s): Patel, Ramesh
Assistant Examiner(s): Holmes, Michael B.
Application Number: US09/391,093
Time in Patent Office: 1,501 Days
Field of Search: 706/15-44
US Class Current: 706/16
CPC Class Codes: G06N 3/047 Probabilistic or stochastic...
