Kernels and methods for selecting kernels for use in learning machines
First Claim
1. A method for analyzing a dataset in data space to extract knowledge by identifying patterns in the dataset for classification of data within the dataset, wherein the dataset has an actual structure, the method comprising:
- inputting structured training data into a memory of a computer comprising a processor having software stored therein for executing a kernel machine;
- defining a kernel for execution by the kernel machine by:
- representing the training dataset in the memory as a collection of components and an actual location of each component within the training data structure, wherein the actual location is specified as indices within an index set, each of the indices corresponding to a point of reference to the actual location;
- applying a vicinity function to the collection of components to define subsets of components centered at different actual locations within the training dataset structure; and
- measuring a similarity of the subsets of components at the different actual locations to define a locational kernel corresponding to each actual location;
- combining the locational kernels for all of the different actual locations by performing an operation to produce a kernel on a set of structures;
- applying the kernel on a set of structures to the dataset in the memory to identify patterns within the dataset; and
- storing data associated with the patterns within the dataset on a media.
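The steps recited in the claim can be illustrated with a minimal sketch. It assumes, purely for illustration, that each structure is a sequence (a string) indexed by integer position, that components are compared by exact match, that the vicinity function is a Gaussian window around the center index, and that the combining operation is a sum over all pairs of locations. The names `component_kernel`, `vicinity`, `locational_kernel`, and `structure_kernel` are hypothetical and do not come from the patent text.

```python
import math


def component_kernel(a, b):
    """Similarity of two components; exact match is an illustrative assumption."""
    return 1.0 if a == b else 0.0


def vicinity(offset, width=1.0):
    """Weight of a component `offset` positions away from the center index.

    A Gaussian window is assumed here; other vicinity functions are possible.
    """
    return math.exp(-(offset ** 2) / (2.0 * width ** 2))


def locational_kernel(s, i, t, j, radius=2, width=1.0):
    """Compare the neighborhood of index i in s with the neighborhood of index j in t."""
    total = 0.0
    for d in range(-radius, radius + 1):
        # Skip offsets that fall outside either structure.
        if 0 <= i + d < len(s) and 0 <= j + d < len(t):
            total += vicinity(d, width) * component_kernel(s[i + d], t[j + d])
    return total


def structure_kernel(s, t, radius=2, width=1.0):
    """Combine the locational kernels over all pairs of locations by summation."""
    return sum(
        locational_kernel(s, i, t, j, radius, width)
        for i in range(len(s))
        for j in range(len(t))
    )
```

Under these assumptions, `structure_kernel("gattaca", "attic")` returns a single similarity score for the pair of sequences, which could then be supplied to a kernel machine such as a support vector machine as an entry of its kernel matrix.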
Abstract
Learning machines, such as support vector machines, are used to analyze datasets to recognize patterns within the dataset using kernels that are selected according to the nature of the data to be analyzed. Where the dataset possesses structural characteristics, locational kernels can be utilized to provide measures of similarity among data points within the dataset. The locational kernels are then combined to generate a decision function, or kernel, that can be used to analyze the dataset. Where invariance transformations or noise are present, tangent vectors are defined to identify relationships between the invariance or noise and the data points. A covariance matrix is formed using the tangent vectors and is then used in generating the kernel for recognizing patterns in the dataset.
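The tangent-vector idea in the abstract can be sketched as follows. The abstract does not give formulas, so everything specific here is an assumption: tangent vectors are approximated by finite differences of a user-supplied transformation, the covariance matrix is the average outer product of those tangents, and the kernel is a linear kernel of the form x^T ((1 - gamma) I + gamma C)^{-1} y, which is one regularized form used in the invariant-SVM literature. All function names and the `gamma` parameter are hypothetical.

```python
def tangent_vectors(points, transform, eps=1e-3):
    """Finite-difference tangent of `transform` at each point: (T(x, eps) - x) / eps."""
    return [
        [(tx - xi) / eps for tx, xi in zip(transform(x, eps), x)]
        for x in points
    ]


def tangent_covariance(tangents):
    """Average outer product of the tangent vectors (a d-by-d matrix)."""
    n, d = len(tangents), len(tangents[0])
    return [
        [sum(t[a] * t[b] for t in tangents) / n for b in range(d)]
        for a in range(d)
    ]


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][k] * z[k] for k in range(r + 1, n))) / M[r][r]
    return z


def invariant_linear_kernel(x, y, C, gamma=0.5):
    """Assumed form: x^T ((1 - gamma) I + gamma C)^{-1} y, with 0 <= gamma < 1."""
    d = len(x)
    M = [
        [gamma * C[a][b] + ((1.0 - gamma) if a == b else 0.0) for b in range(d)]
        for a in range(d)
    ]
    z = solve(M, y)
    return sum(xi * zi for xi, zi in zip(x, z))
```

As a usage example, with a transformation that shifts points along the first coordinate, `lambda x, eps: [x[0] + eps, x[1]]`, the tangent covariance is concentrated on that coordinate, so the resulting kernel weights differences along the shift direction less heavily than differences along the other coordinate.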
33 Citations
23 Claims
1. A method for analyzing a dataset in data space to extract knowledge by identifying patterns in the dataset for classification of data within the dataset, wherein the dataset has an actual structure, the method comprising:
- inputting structured training data into a memory of a computer comprising a processor having software stored therein for executing a kernel machine;
- defining a kernel for execution by the kernel machine by:
- representing the training dataset in the memory as a collection of components and an actual location of each component within the training data structure, wherein the actual location is specified as indices within an index set, each of the indices corresponding to a point of reference to the actual location;
- applying a vicinity function to the collection of components to define subsets of components centered at different actual locations within the training dataset structure; and
- measuring a similarity of the subsets of components at the different actual locations to define a locational kernel corresponding to each actual location;
- combining the locational kernels for all of the different actual locations by performing an operation to produce a kernel on a set of structures;
- applying the kernel on a set of structures to the dataset in the memory to identify patterns within the dataset; and
- storing data associated with the patterns within the dataset on a media.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A method for extracting knowledge from a dataset in data space, wherein the dataset has an actual structure, the method comprising:
- inputting structured training data into a memory of a computer comprising a processor having software stored therein for executing a kernel machine;
- representing the training data in the memory as a collection of components and a structural location of each component within the training data structure, wherein the structural location is specified as indices within an index set, wherein each of the indices corresponds to a point of reference to the structural location;
- applying a vicinity function to the collection of components to define subsets of components centered at different structural locations within the training data structure;
- measuring a similarity of the subsets of components at the different structural locations to define a locational kernel corresponding to each structural location;
- combining the locational kernels for all of the different structural locations by performing an operation to produce a kernel on a set of structures;
- applying the kernel on a set of structures in the memory to the dataset to identify patterns within the dataset; and
- transferring data associated with the patterns to a storage device.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
21. A method for analyzing a pair of structured objects to identify patterns within the pair of structured objects for classification of the structured objects, wherein each structured object comprises a dataset, the method comprising:
- inputting a training dataset into a memory of a computer having a processor for executing a kernel machine, wherein the training dataset comprises structured objects;
- representing each structured object in the training dataset in the memory as a collection of component objects and a structural location of a set of indices within the structured object, each of the indices corresponding to a point of reference to the structural location;
- applying a vicinity function to the collection of component objects to define subsets of component objects centered at different indices within each structured object in the training dataset; and
- measuring a similarity between the subsets of component objects of one structured object of the training dataset and the subsets of component objects of another structured object of the training dataset to define a locational kernel corresponding to each of the different indices;
- combining the locational kernels for all of the different indices by performing an operation to produce a kernel on a set of structured objects;
- applying the kernel on a set of structures in the memory to the pair of structured objects to identify patterns within the pair of structured objects; and
- displaying data associated with the patterns on a display device.
Dependent claims: 22, 23
Specification