Adaptation of exponential models

US 7,860,314 B2
Filed: 10/29/2004
Issued: 12/28/2010
Est. Priority Date: 07/21/2004
Status: Expired due to Fees

Abstract

A method and apparatus are provided for adapting an exponential probability model. In a first stage, a general-purpose background model is built by determining a set of model parameters for the probability model based on a set of background data. The background model parameters are then used to define a prior model for the parameters of an adapted probability model that is more specific to an adaptation data set of interest. The adaptation data set is generally much smaller than the background data set. A second set of model parameters is then determined for the adapted probability model based on the set of adaptation data and the prior model.
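The two-stage scheme in the abstract can be written as a penalized likelihood. The block below is a sketch only, assuming a Gaussian prior with a diagonal covariance (claims 2 and 7 name a Gaussian prior as one option). The symbols are chosen here for illustration and do not appear in this text: \lambda_i is the weight of feature f_i, \tilde{p}(x, y) is the relative frequency of context x with capitalization tag y in the adaptation data, \lambda_i^{0} is the prior mean, and \sigma_i^{2} is the prior variance.

```latex
\begin{aligned}
p_\Lambda(y \mid x) &= \frac{\exp\big(\sum_{i} \lambda_i f_i(x, y)\big)}
                            {\sum_{y'} \exp\big(\sum_{i} \lambda_i f_i(x, y')\big)} \\[4pt]
L(\Lambda) &= \sum_{x, y} \tilde{p}(x, y)\, \log p_\Lambda(y \mid x)
              \;-\; \sum_{i} \frac{\big(\lambda_i - \lambda_i^{0}\big)^{2}}{2\sigma_i^{2}}
\end{aligned}
```

Under this reading, \lambda_i^{0} equals the background weight for a feature that the background model already has and zero for a feature seen only in the adaptation data, so a small variance keeps the adapted weights close to the background model and a large variance lets the adaptation data dominate.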

10 Claims

1. A non-transitory computer storage medium having computer-executable instructions that when executed by a processor cause the processor to perform steps comprising:
- for each of a first set of feature threshold counts, performing steps comprising:
  
  selecting a set of features from background data, where each feature in the set appears in the background data more than a number of times represented by the feature threshold count;
  
  for each of a first set of variances of a prior model, performing steps comprising:
  
  training a set of weights comprising a separate weight for each feature in the selected set of features from the background data such that the set of weights maximizes the likelihood of the set of background data using update equations for the weights that are based on an exponential probability model and relative frequencies in the background data of co-occurrences of contexts and capitalization tags, wherein each trained set of weights and respective selected set of features from the background data represent a separate model;
  
  applying each separate model to a set of background development data and selecting the model with the best accuracy as an initial model having an initial set of weights and an initial set of features from the background data;
  
  for each of a second set of feature threshold counts, performing steps comprising:
  
  selecting a set of features from adaptation data, where each feature in the set of features appears in the adaptation data more than a number of times represented by the feature threshold count from the second set of feature threshold counts, wherein the adaptation data is smaller than the background data;
  
  for each of a second set of variances of the prior model, performing steps comprising:
  
  the processor determining an adapted set of weights comprising a separate weight for each feature in a union of the selected set of features from the adaptation data and the initial set of features from the background data such that the set of weights maximizes the likelihood of a set of adaptation data, and such that a weight for a feature that is present in the initial set of features from the background data but that is not present in the selected set of features from the adaptation data is updated when determining an adapted set of weights, wherein the likelihood of the set of adaptation data is based on:
  
  a second exponential probability model;
  
  a prior model for the set of weights that comprises means with values equal to the initial set of weights for features that are present in the initial set of features from the background data and means with values equal to zero for features that are not present in the initial set of features from the background data but that are present in the set of features from the adaptation data; and
  
  relative frequencies in the adaptation data of co-occurrences of contexts and capitalization tags;
  
  selecting a set of adapted weights as a final adapted model by determining which set of adapted weights provides the highest likelihood for a set of adaptation development data.
- Dependent claims (2, 3, 4, 5):
  - 2. The computer storage medium of claim 1 wherein the prior model comprises a Gaussian model.
  - 3. The computer storage medium of claim 1 wherein the prior model comprises an exponential model.
  - 4. The computer storage medium of claim 1 wherein the exponential probability model uses a weighted sum of the selected set of features from the background data, wherein each feature in the selected set of features from the background data comprises an indicator function based on a respective capitalization tag and respective context information.
  - 5. The computer storage medium of claim 4 wherein each feature in the union of the selected set of features from the adaptation data and the initial set of features from the background data comprises an indicator function based on a respective capitalization tag and respective context information.
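
The nested loops in claim 1 (one pass per feature threshold count, one pass per prior variance, then a pick of the best model on development data) can be sketched as a small grid search. This is an illustrative sketch, not the patented implementation: `select_features`, `grid_search`, `train`, and `score` are hypothetical names, each (context, tag) pair is treated as one indicator feature, and the caller supplies the training and scoring routines.

```python
from collections import Counter
from itertools import product


def select_features(events, threshold):
    """Keep features that occur strictly more than `threshold` times.

    `events` is an iterable of (context, tag) pairs; treating each pair as
    one indicator feature is an assumption made for this sketch.
    """
    counts = Counter(events)
    return {feature for feature, n in counts.items() if n > threshold}


def grid_search(events, thresholds, variances, train, score):
    """Train one model per (threshold, variance) pair and keep the best.

    `train(features, variance)` returns a weight dict and
    `score(features, weights)` returns a number to maximize, for example
    accuracy or likelihood on development data. Both are hypothetical
    callables supplied by the caller.
    """
    best = None
    for threshold, variance in product(thresholds, variances):
        features = select_features(events, threshold)
        weights = train(features, variance)
        value = score(features, weights)
        if best is None or value > best["score"]:
            best = {
                "score": value,
                "threshold": threshold,
                "variance": variance,
                "features": features,
                "weights": weights,
            }
    return best
```

The same loop shape would serve both stages of the claim: the first stage scores by accuracy on background development data, and the second scores by likelihood on adaptation development data, so only the `train` and `score` callables would differ.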

6. A method comprising:
- a processor selecting a set of features from a set of background data by selecting features that occur in the set of background data more than a number of times represented by a threshold count for the background data;
  
  the processor determining an initial set of weights that maximize the likelihood of a set of background data, wherein the likelihood is based on an exponential probability model and wherein there is a separate initial weight for each feature in the selected set of features from the background data;
  
  the processor selecting a set of features from a set of adaptation data by selecting features that occur in the set of adaptation data more than a number of times represented by a threshold count for the adaptation data;
  
  the processor determining an adapted set of weights that maximize the likelihood of a set of adaptation data, wherein the set of adaptation data is smaller than the set of background data and wherein the likelihood is based on a second exponential probability model and a prior model of a distribution of weights comprising a separate mean for each feature in the union of the set of features from the set of background data and the set of features from the set of adaptation data, wherein each mean for a feature in the set of features from the background data has a value equal to the value of the initial weight for that feature and wherein each mean for a feature that is not in the set of features from the background data but is in the set of features from the adaptation data has a value equal to zero.
- Dependent claims (7, 8, 9, 10):
  - 7. The method of claim 6 wherein the prior model comprises a Gaussian model.
  - 8. The method of claim 6 wherein the prior model comprises an exponential model.
  - 9. The method of claim 6 wherein the exponential probability model uses a weighted sum of the set of features from the set of background data, wherein each feature in the set of features from the set of background data comprises an indicator function based on a respective capitalization tag and respective context information.
  - 10. The method of claim 9 wherein each feature in the union of the set of features from the set of background data and the set of features from the set of adaptation data comprises an indicator function based on a respective capitalization tag and respective context information.
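
The prior described in claim 6 assigns one mean per feature in the union of the two feature sets: the background weight where one exists, and zero otherwise. A minimal sketch of that bookkeeping follows, assuming the Gaussian prior of claim 7 with a single shared variance. The function names and the dict-based weight representation are hypothetical and chosen only for illustration.

```python
def prior_means(initial_weights, adaptation_features):
    """Build prior means over the union of background and adaptation features.

    `initial_weights` maps each background feature to its trained weight.
    Features that appear only in `adaptation_features` get a mean of zero.
    """
    union = set(initial_weights) | set(adaptation_features)
    return {feature: initial_weights.get(feature, 0.0) for feature in union}


def gaussian_log_prior(weights, means, variance):
    """Log of a Gaussian prior, up to an additive constant.

    Assumes one shared variance for every feature; a per-feature variance
    would work the same way with a dict lookup.
    """
    squared = sum((weights[f] - means[f]) ** 2 for f in means)
    return -squared / (2.0 * variance)


# Example: feature "a" is background-only, "c" is adaptation-only.
means = prior_means({"a": 1.5, "b": -0.5}, {"b", "c"})
```

Because background-only features such as "a" stay in the union, their weights remain free parameters during adaptation; the prior simply pulls them toward their background values.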

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Acero, Alejandro; Chelba, Ciprian I.
Primary Examiner(s): Ahmed; Samir A
Assistant Examiner(s): Lee; John W
Application Number: US10/977,871
Publication Number: US 20060018541A1
Time in Patent Office: 2,251 Days
Field of Search: 382/228, 382/181, 382/155-161, 382/190, 382/201-203, 382/220, 704/256
US Class Current: 382/181
CPC Class Codes:
G06F 18/295 Markov models or related mo...
G06F 40/232 Orthographic correction, e....
