METHOD AND APPARATUS TO MODEL THE VARIABLES OF A DATA SET

US 20020062296A1
Filed: 01/14/1999
Published: 05/23/2002
Est. Priority Date: 03/13/1998
Status: Active Grant

Abstract

The present invention relates to modeling the variables of a data set by means of a probabilistic network including data nodes and causal links. The term ‘probabilistic networks’ includes Bayesian networks, belief networks, causal networks and knowledge maps. The variables of an input data set are registered and a population of genomes is generated, each of which individually models the input data set. Each genome has a chromosome to represent the data nodes in a probabilistic network and a chromosome to represent the causal links between the data nodes. A crossover operation is performed between the chromosome data of parent genomes in the population to generate offspring genomes. The offspring genomes are then added to the genome population. A scoring operation is performed on genomes in the population to derive scores representing the correspondence between the genomes and the input data. Genomes are selected from the population according to their scores, and the crossover, scoring, addition and selecting operations are repeated for a plurality of generations of the genomes. Finally, the genome with the best score is selected from the last generation. A mutation operation may be performed on the genomes. The mutation may consist of the addition or deletion of a data node and the addition or deletion of a causal link.
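The generational cycle described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the function name `evolve`, the truncation-style selection, the convention that a higher score means a better fit, and the toy bit-string genome used in the usage lines are all hypothetical choices that the abstract does not specify.

```python
import random
from typing import Callable, List, TypeVar

G = TypeVar("G")  # genome type, left abstract in this sketch


def evolve(
    population: List[G],
    crossover: Callable[[G, G, random.Random], G],
    score: Callable[[G], float],
    generations: int,
    rng: random.Random,
) -> G:
    """Repeat crossover, addition, scoring and selection; return the best genome."""
    size = len(population)
    for _ in range(generations):
        # Crossover: each offspring comes from two distinct, randomly chosen parents.
        offspring = []
        for _ in range(size):
            a, b = rng.sample(population, 2)
            offspring.append(crossover(a, b, rng))
        # Addition: offspring join the existing population.
        population = population + offspring
        # Scoring: rank genomes, higher score first (assumed convention).
        ranked = sorted(population, key=score, reverse=True)
        # Selecting: keep the best `size` genomes (truncation selection, an assumption).
        population = ranked[:size]
    # Output model: the best-scoring genome of the last generation.
    return max(population, key=score)


# Toy usage: bit-string genomes scored by their number of ones (a stand-in only).
def toy_crossover(a, b, rng):
    cut = rng.randrange(1, len(a))  # single-point crossover
    return a[:cut] + b[cut:]


rng = random.Random(0)
start = [tuple(rng.randint(0, 1) for _ in range(8)) for _ in range(10)]
best = evolve(start, toy_crossover, score=sum, generations=20, rng=rng)
```

Because the parents stay in the pool when the offspring are added, the best score in this sketch never decreases from one generation to the next. A real scoring operation would instead measure how well the network encoded by a genome corresponds to the input data set.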

22 Claims

1. A method of modeling the variables in an input data set by means of a probabilistic network including data nodes and causal links, the method comprising the steps of:
- registering the input data set;
- generating a population of genomes each individually modeling the input data set by means of chromosome data to represent the data nodes in a probabilistic network and the causal links between the data nodes;
- performing a crossover operation between the chromosome data of parent genomes in the population to generate offspring genomes;
- performing an addition operation to add the offspring genomes to the population;
- performing a scoring operation on genomes in the population to derive scores representing the correspondence between the genomes and the input data set;
- performing a selecting operation to select genomes from the population according to the scores;
- repeating the crossover, scoring, addition and selecting operations for a plurality of generations of the genomes; and
- selecting, as an output model, a genome from the last generation.

2. A method according to claim 1, further comprising the step of randomly mutating the genomes.

3. A method according to claim 2, wherein the step of randomly mutating the genomes comprises adding a data node.

4. A method according to claim 2, wherein the step of randomly mutating the genomes comprises deleting a data node.

5. A method according to claim 2, wherein the step of randomly mutating the genomes comprises adding a causal link.

6. A method according to claim 2, wherein the step of randomly mutating the genomes comprises deleting a causal link.

7. A method according to claim 1, wherein the chromosome data of each genome includes a node chromosome in the form of a linear array of node data and a causal link chromosome in the form of a matrix of causal links.

8. A method according to claim 7, wherein the crossover operation provides the offspring genomes with independently inherited node and causal link chromosomes.

9. A method according to claim 1, further comprising the step of culling genomes which fail to meet predetermined structural constraints.

10. A method according to claim 9, wherein the predetermined structural constraints comprise the exclusion of preselected causal links.

11. A method according to claim 9, wherein the predetermined structural constraints comprise the exclusion of genomes that do not represent directed acyclic graphs.

12. Apparatus for modeling the variables in an input data set by means of a probabilistic network including data nodes and causal links, the apparatus comprising:
- data register means for registering the input data set;
- generating means for generating a population of genomes each individually modeling the input data set by means of chromosome data to represent the data nodes in a probabilistic network and the causal links between the data nodes;
- crossover means for performing a crossover operation between the chromosome data of parent genomes in the population to generate offspring genomes;
- adding means for performing an addition operation to add the offspring genomes to the population;
- scoring means for performing a scoring operation on genomes in the population to derive scores representing the correspondence between the genomes and the input data set;
- selecting means for performing a selecting operation to select genomes from the population according to the scores;
- control means for controlling the crossover, scoring, addition and selecting means to repeat their operations for a plurality of generations of the genomes; and
- output means for selecting, as an output model, a genome from the last generation.

13. Apparatus according to claim 12, further comprising mutating means for randomly mutating the genomes.

14. Apparatus according to claim 13, wherein the mutating means is adapted for performing random mutation which comprises adding a data node.

15. Apparatus according to claim 13, wherein the mutating means is adapted for performing random mutation which comprises deleting a data node.

16. Apparatus according to claim 13, wherein the mutating means is adapted for performing random mutation which comprises adding a causal link.

17. Apparatus according to claim 13, wherein the mutating means is adapted for performing random mutation which comprises deleting a causal link.

18. Apparatus according to claim 12, wherein the generating means is adapted for generating genomes each of which includes a node chromosome in the form of a linear array of node data and a causal link chromosome in the form of a matrix of causal links.

19. Apparatus according to claim 18, wherein the crossover means is adapted for providing the offspring genomes with independently inherited node and causal link chromosomes.

20. Apparatus according to claim 12, further comprising culling means for culling genomes which fail to meet predetermined structural constraints.

21. Apparatus according to claim 20, wherein the predetermined structural constraints comprise the exclusion of preselected causal links.

22. Apparatus according to claim 20, wherein the predetermined structural constraints comprise the exclusion of genomes that do not represent directed acyclic graphs.
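As an illustration of the genome layout and structural constraints recited in the claims, the following minimal sketch pairs a linear node array with a link matrix, inherits the two chromosomes independently during crossover, mutates by adding or deleting one causal link, and culls genomes that are not directed acyclic graphs. Everything here is an assumption made for the example: the `Genome` class, the boolean encoding (a node entry marks whether a variable is present, a matrix entry marks a link from row to column), and the use of Kahn's algorithm for the acyclicity test are not taken from the patent text.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Genome:
    nodes: List[bool]        # node chromosome: nodes[i] is True if variable i is present
    links: List[List[bool]]  # link chromosome: links[i][j] is True for a link i -> j


def crossover(a: Genome, b: Genome, rng: random.Random) -> Genome:
    """Inherit the node and link chromosomes independently from either parent."""
    node_parent = rng.choice((a, b))
    link_parent = rng.choice((a, b))
    return Genome(list(node_parent.nodes), [row[:] for row in link_parent.links])


def mutate_link(g: Genome, rng: random.Random) -> None:
    """Flip one off-diagonal matrix entry, which adds or deletes a causal link."""
    i, j = rng.sample(range(len(g.nodes)), 2)  # two distinct indices, so i != j
    g.links[i][j] = not g.links[i][j]


def is_dag(g: Genome) -> bool:
    """Return True if the links among present nodes form no directed cycle."""
    active = [i for i, present in enumerate(g.nodes) if present]
    indegree = {j: sum(g.links[i][j] for i in active) for j in active}
    ready = [j for j in active if indegree[j] == 0]
    visited = 0
    while ready:  # Kahn's algorithm: repeatedly remove nodes with no incoming links
        i = ready.pop()
        visited += 1
        for j in active:
            if g.links[i][j]:
                indegree[j] -= 1
                if indegree[j] == 0:
                    ready.append(j)
    return visited == len(active)


def cull(population: List[Genome]) -> List[Genome]:
    """Drop genomes that violate the acyclicity constraint."""
    return [g for g in population if is_dag(g)]


# Usage: a three-node chain survives culling, a three-node cycle does not.
F, T = False, True
chain = Genome([T, T, T], [[F, T, F], [F, F, T], [F, F, F]])
cycle = Genome([T, T, T], [[F, T, F], [F, F, T], [T, F, F]])
survivors = cull([chain, cycle])
```

An exclusion list of preselected causal links could be enforced in the same `cull` step by also rejecting any genome whose matrix sets a forbidden entry.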

Current Assignee: NCR Corporation
Original Assignee: NCR Corporation
Inventors: NAKISA, RAMIN C.
Granted Patent: US 6,480,832 B2
US Class Current: 706/15
CPC Class Codes: G06Q 30/02 Marketing; Price estimation...
