Neural network system with N-gram term weighting method for molecular sequence classification and motif identification

  • US 5,845,049 A
  • Filed: 03/27/1996
  • Issued: 12/01/1998
  • Est. Priority Date: 03/27/1996
  • Status: Expired due to Fees
First Claim

1. A method for training a neural network to predict membership in a family of linear sequences comprising:

  • providing a training set of full length member sequences, a training set of full length non-member sequences and a training set of family motif sequences;

    deriving term weights for n-gram terms by dividing the number of occurrences of each n-gram term in the motif set by the number of occurrences in the full length member set;

    deriving a set of global vectors for the full length member set and a set of global vectors for the full length non-member set using an n-gram method;

    deriving a set of motif vectors for the full length member set and a set of motif vectors for the full length non-member set by multiplying each term in the global vector set by its term weighting factor;

    providing a neural network with multiple output units to represent one family;

    using the global vector of the member sequence set to train the positive full length output unit;

    using the motif vector of the member set to train the positive motif output unit;

    using the global vector of the non-member set to train the full length negative output unit; and

    using the motif vector of the non-member set to train the negative motif output unit of said neural network.
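
The vector derivation steps of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names, the use of raw overlapping n-gram counts as the "global vector", the zero weight for terms absent from the motif set, and the one-hot targets over four output units are all assumptions made for illustration; the claim itself fixes only the weight ratio (motif-set occurrences divided by full length member-set occurrences), the term-by-term multiplication, and which vector trains which output unit.

```python
from collections import Counter

# Hypothetical names for the four output units of one family.
OUTPUT_UNITS = ("pos_full", "pos_motif", "neg_full", "neg_motif")

def ngram_counts(seq, n):
    """Count overlapping n-grams in one sequence."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))

def total_counts(seqs, n):
    """Sum n-gram counts over a whole set of sequences."""
    total = Counter()
    for s in seqs:
        total.update(ngram_counts(s, n))
    return total

def term_weights(motif_seqs, member_seqs, n):
    """Weight = occurrences in motif set / occurrences in member set."""
    motif = total_counts(motif_seqs, n)
    member = total_counts(member_seqs, n)
    # Only terms seen in the member set get a weight (avoids division by zero).
    return {t: motif[t] / c for t, c in member.items()}

def global_vector(seq, terms, n):
    """Assumed form of a global vector: raw n-gram counts in a fixed term order."""
    counts = ngram_counts(seq, n)
    return [counts[t] for t in terms]

def motif_vector(gvec, terms, weights):
    """Multiply each global-vector term by its term weighting factor."""
    return [g * weights.get(t, 0.0) for g, t in zip(gvec, terms)]

def one_hot(unit):
    """Assumed target encoding: 1.0 for the trained unit, 0.0 elsewhere."""
    return [1.0 if u == unit else 0.0 for u in OUTPUT_UNITS]

def training_examples(member_seqs, non_member_seqs, terms, weights, n):
    """Pair each vector with the output unit the claim says it trains."""
    examples = []
    for seqs, full_unit, motif_unit in (
        (member_seqs, "pos_full", "pos_motif"),
        (non_member_seqs, "neg_full", "neg_motif"),
    ):
        for s in seqs:
            g = global_vector(s, terms, n)
            examples.append((g, one_hot(full_unit)))
            examples.append((motif_vector(g, terms, weights), one_hot(motif_unit)))
    return examples

# Toy data, invented for this example only.
weights = term_weights(["CDE"], ["ACDEF", "CDEA"], 2)
terms = sorted(weights)
examples = training_examples(["ACDEF"], ["GGGG"], terms, weights, 2)
```

With this toy data the bigrams "CD" and "DE" occur once in the motif set and twice in the member set, so each receives a weight of 0.5, while the remaining member-set bigrams receive 0. The resulting (input, target) pairs could then be fed to any feed-forward network trainer; the choice of network and training procedure is left open here.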
