KERNELS AND METHODS FOR SELECTING KERNELS FOR USE IN LEARNING MACHINES

US 20080301070A1
Filed: 10/30/2007
Published: 12/04/2008
Est. Priority Date: 05/07/2001
Status: Active Grant

Abstract

Learning machines, such as support vector machines, are used to analyze datasets to recognize patterns within the dataset using kernels that are selected according to the nature of the data to be analyzed. Where the dataset possesses structural characteristics, locational kernels can be utilized to provide measures of similarity among data points within the dataset. The locational kernels are then combined to generate a decision function, or kernel, that can be used to analyze the dataset. Where an invariance transformation or noise is present, tangent vectors are defined to identify relationships between the invariance or noise and the data points. A covariance matrix is formed using the tangent vectors, then used in generation of the kernel.
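
The tangent-vector step described in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical invariance transformation (rotation of 2-D points about the origin), estimates each tangent vector by a finite difference, and averages the outer products of the tangent vectors to form the matrix. The names `rotate`, `tangent_vector` and `tangent_covariance`, the step size `eps`, and the use of an uncentered average are all illustrative assumptions. The abstract does not say how the matrix enters the kernel, so that step is left out.

```python
import math

def rotate(point, angle):
    """Hypothetical invariance transformation: rotate a 2-D point about the origin."""
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y)

def tangent_vector(point, transform, eps=1e-4):
    """Finite-difference estimate of d/dt transform(point, t) at t = 0."""
    moved = transform(point, eps)
    return tuple((m - p) / eps for m, p in zip(moved, point))

def tangent_covariance(points, transform, eps=1e-4):
    """Average outer product of the tangent vectors (uncentered; an assumption)."""
    dim = len(points[0])
    cov = [[0.0] * dim for _ in range(dim)]
    for p in points:
        t = tangent_vector(p, transform, eps)
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += t[i] * t[j] / len(points)
    return cov

# Two toy data points; under rotation their exact tangents are (0, 1) and (-2, 0).
points = [(1.0, 0.0), (0.0, 2.0)]
C = tangent_covariance(points, rotate)
```

For these two points the matrix is close to diagonal, with entries near 2 and 0.5, because the two tangent directions are orthogonal.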

29 Claims

1-20. (canceled)

21. A computer-implemented method for analyzing data comprising a text document to identify patterns in words or characters within the document, the method comprising:
- inputting the data into a computing environment comprising one or more pre-processing program modules and one or more support vector machine modules stored on a drive or a system memory of a computer or computer network;
  
  dividing the data into a training dataset and a test dataset;
  
  defining a kernel for structured data for execution by the one or more support vector machine modules by representing the training dataset as a collection of sequences of words or characters and an index set within the document structure, wherein the indices within the index set correspond to locations of words or characters within the document;
  
  applying a vicinity function to the collection of sequences of words or characters to define a plurality of sequences of words or characters centered at different words or characters;
  
  measuring similarity of pairs of sequences of words or characters centered at the different indices to define a locational kernel having a value corresponding to each of the different pairs of sequences of words or characters;
  
  creating additional locational kernels by performing an operation selected from addition, scalar multiplication, multiplication, pointwise limits, transformation and convolution on the locational kernels;
  
  combining the locational kernels and the additional locational kernels for the different sequences of words or characters by performing an operation to produce a kernel on a set of sequences of words or characters;
  
  testing the kernel on the test data set having a known set of sequences of words or characters to determine whether an optimal solution has been achieved;
  
  if the optimal solution has been achieved, applying the kernel on a set of sequences of words or characters to a document having an unknown structure to identify patterns within, and thereby extract knowledge from, the document; and
  
  generating a display of the identified patterns of words or characters within the document having an unknown structure.
  - 22. The method of claim 21, wherein the step of combining the locational kernels comprises performing an operation selected from the group consisting of summing over the indices and fixing the indices on a given locational kernel.
  - 23. The method of claim 21, wherein the locational kernels comprise Gaussian radial basis function kernels having fixed indices and the kernel on a set of sequences of words comprises a product over all index positions of the locational kernels.
  - 24. The method of claim 21, wherein the kernel on a set of sequences of words or characters further incorporates transformation invariance and the method is generated by estimating at least one tangent vector for associating each data point with a local invariance and selecting the function so that its optimization incorporates a covariance matrix of a plurality of the tangent vectors.
  - 25. The method of claim 21, wherein the dataset has the characteristic of noise and the function is generated by estimating at least one noise vector associated with each data point and selecting the function so that its optimization incorporates a covariance matrix of a plurality of noise vectors.
  - 26. The method of claim 21, wherein the words or characters can appear in any order within the sequence and a penalty is applied according to a distance of each word or character from the index.
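
The kernel-construction steps of claim 21, combined by summing over the indices as in claim 22, can be sketched as follows. This is an illustrative reading under stated assumptions, not the patented implementation: the vicinity function is taken to be a fixed-radius window, the similarity measure is a count of positions where two windows hold the same token, and the names `vicinity`, `locational_kernel`, `sequence_kernel` and the parameter `radius` are hypothetical.

```python
def vicinity(seq, i, radius):
    """Hypothetical vicinity function: tokens in a window centered at index i.

    Positions that fall outside the sequence are padded with None.
    """
    return [seq[j] if 0 <= j < len(seq) else None
            for j in range(i - radius, i + radius + 1)]

def locational_kernel(s, i, t, j, radius=1):
    """Similarity of the window centered at (s, i) and the window centered at (t, j).

    Counts the window positions where both sequences hold the same token.
    """
    ws, wt = vicinity(s, i, radius), vicinity(t, j, radius)
    return sum(1 for a, b in zip(ws, wt) if a is not None and a == b)

def sequence_kernel(s, t, radius=1):
    """Combine locational kernels by summing over all pairs of indices."""
    return sum(locational_kernel(s, i, t, j, radius)
               for i in range(len(s))
               for j in range(len(t)))

s = "the cat sat".split()
t = "a cat sat".split()
k_center = locational_kernel(s, 1, t, 1)  # windows centered on "cat" in both
k_st = sequence_kernel(s, t)
k_ss = sequence_kernel(s, s)
```

Fixing the indices, the other combining operation named in claim 22, would correspond to calling `locational_kernel` at one chosen pair `(i, j)` instead of summing.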

27. A computer-implemented method for analyzing data comprising a text document to identify patterns in words or characters within the document, the method comprising:
- inputting the data into a memory and a computer processor programmed for executing one or more support vector machines;
  
  dividing the data into a training dataset and a test dataset;
  
  defining a kernel for structured data for execution by the one or more support vector machines by representing the training dataset as a collection of word or character strings and an index set within the document structure, wherein the indices within the index set correspond to locations of word or character strings within the document;
  
  applying a vicinity function to the collection of word or character strings to define a plurality of word or character strings centered at different words or characters;
  
  measuring similarity of pairs of word or character strings centered at the different indices to define a locational kernel having a value corresponding to each of the different pairs of word or character strings;
  
  creating additional locational kernels by performing an operation selected from addition, scalar multiplication, multiplication, pointwise limits, transformation and convolution on the locational kernels;
  
  combining the locational kernels and the additional locational kernels for the different word or character strings by performing an operation to produce a kernel on a set of word or character strings;
  
  testing the kernel on the test data set having a known set of word or character strings to determine whether an optimal solution has been achieved;
  
  if the optimal solution has been achieved, applying the kernel on a set of word or character strings to a document having an unknown structure to identify patterns within, and thereby extract knowledge from, the document; and
  
  generating a display of the identified patterns of words or characters within the document having an unknown structure.
  - 28. The method of claim 27, wherein the step of combining the locational kernels comprises performing an operation selected from the group consisting of summing over the indices and fixing the indices on a given locational kernel.
  - 29. The method of claim 27, wherein the words or characters can appear in any order within the sequence and a penalty is applied according to a distance of each word or character from the index.
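
The order-free variant in claim 29, where tokens may appear in any order and are penalized by their distance from the index, can be sketched as a weighted bag of tokens. The geometric penalty `decay ** distance`, the default values of `radius` and `decay`, and the function names are assumptions made for illustration. The claim itself only states that a penalty depends on distance.

```python
from collections import defaultdict

def weighted_bag(seq, i, radius, decay):
    """Map each token within `radius` of index i to a weight decay ** distance.

    Repeated tokens accumulate their weights; token order is discarded.
    """
    bag = defaultdict(float)
    for j in range(max(0, i - radius), min(len(seq), i + radius + 1)):
        bag[seq[j]] += decay ** abs(j - i)
    return bag

def order_free_locational_kernel(s, i, t, j, radius=2, decay=0.5):
    """Dot product of the two distance-weighted bags centered at (s, i) and (t, j)."""
    bs = weighted_bag(s, i, radius, decay)
    bt = weighted_bag(t, j, radius, decay)
    return sum(w * bt[tok] for tok, w in bs.items() if tok in bt)

s = ["new", "york", "city"]
t = ["city", "new", "york"]
k_cross = order_free_locational_kernel(s, 1, t, 1)
k_self = order_free_locational_kernel(s, 1, s, 1)
```

Here the two sequences contain the same tokens in different orders, so the cross value stays positive, but it is lower than the self-similarity because the shared tokens sit at different distances from the two centers.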

Granted Patent: US 7,788,193 B2
US Class Current: 706/12
CPC Class Codes

G06F 18/2113   by ranking or filtering the...

G06F 18/21355   nonlinear criteria, e.g. em...

G06F 18/22   Matching criteria, e.g. pro...

G06F 18/2411   based on the proximity to a...

G06N 20/00   Machine learning

G06N 20/10   using kernel methods, e.g. ...

G06Q 20/042   characterized in that the p...

G06V 10/761   Proximity, similarity or di...

G06V 10/764   using classification, e.g. ...

G06V 10/771   Feature selection, e.g. sel...

G06V 10/7715   Feature extraction, e.g. by...

G16B 40/00   ICT specially adapted for b...

G16B 40/20   Supervised data analysis
