System, method and apparatus for prediction using minimal affix patterns
Abstract
One embodiment generally pertains to a method of prediction. The method includes generating a set of affixes from a selected input sequence and comparing the set of affixes with a predictive set of affixes. The method also includes selecting an affix from the predictive set of affixes. The invention accepts various input data sets, can reproduce the original data set exactly, and keeps the predictive set of affixes to a minimal size.
36 Claims
1. A method of prediction, the method comprising acts, performed via at least one processor, of:
determining from a selected input sequence a set of potential affixes, wherein the set of potential affixes comprises one or more potential affixes each being contained within the selected input sequence;
generating a predicted set of affixes by processing a master data set, wherein the processing of the master data set comprises removing entries in the master data set based on an excluded data set;
comparing the set of potential affixes with the predicted set of affixes comprising a set of predicted affixes;
determining that a group of one or more potential affixes from the set of potential affixes is in the predicted set of affixes; and
selecting a matching affix from the group, wherein the matching affix is the potential affix within the group that has the greatest number of characters.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
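The acts recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: it assumes that an affix is a suffix of the input sequence, that the master and excluded data sets are plain collections of strings, and that "removing entries based on an excluded data set" means a simple set difference. The function names are hypothetical.

```python
def potential_affixes(sequence):
    """Return every suffix of the sequence (assumption: affix = suffix)."""
    return {sequence[i:] for i in range(len(sequence))}


def predicted_affixes(master, excluded):
    """Drop master entries that appear in the excluded set (assumed set difference)."""
    return set(master) - set(excluded)


def longest_matching_affix(sequence, master, excluded):
    """Pick the longest potential affix that is also a predicted affix."""
    candidates = potential_affixes(sequence)
    predicted = predicted_affixes(master, excluded)
    group = candidates & predicted  # potential affixes found in the predicted set
    if not group:
        return None
    # Suffixes of one sequence all have distinct lengths, so the maximum is unique.
    return max(group, key=len)
```

With a master set of {"ing", "ng", "ed", "king"}, excluding "king" changes the match for "walking" from "king" to "ing", because the longest remaining candidate wins.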
13. A method for generating a data set, the method comprising acts, performed via at least one processor, of:
receiving a corpus comprising a plurality of sequences;
generating a set of triplets based on the corpus, each triplet having an affix, an associated pattern, and a frequency of occurrence for an affix-pattern combination, wherein the affix and associated pattern in the triplet determine the affix-pattern combination of the triplet, and wherein the frequency of occurrence for an affix-pattern combination of each triplet is accumulated while processing each of the plurality of sequences of the corpus; and
selecting a subset of triplets as the data set, wherein a selection criteria is based on the length of each affix and the frequency of occurrence of each affix-pattern combination.
Dependent claims: 14, 15, 16, 17, 18, 19, 20, 21
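The triplet generation and selection of claim 13 can be sketched as follows. Several details are assumptions made only for illustration: the corpus is given as (sequence, pattern) pairs, affixes are suffixes, and the selection criteria are a maximum affix length and a minimum frequency. The claim only says that selection depends on affix length and frequency, so the thresholds and function names here are hypothetical.

```python
from collections import Counter


def build_triplets(corpus):
    """Accumulate (affix, pattern, frequency) triplets over the whole corpus."""
    counts = Counter()
    for sequence, pattern in corpus:
        for i in range(len(sequence)):
            # Each suffix paired with the sequence's pattern is one combination.
            counts[(sequence[i:], pattern)] += 1
    return [(affix, pattern, freq) for (affix, pattern), freq in counts.items()]


def select_triplets(triplets, max_affix_len=3, min_freq=2):
    """Keep triplets with a short enough affix and a high enough frequency.

    Both thresholds are illustrative assumptions, not values from the patent.
    """
    return [
        (affix, pattern, freq)
        for affix, pattern, freq in triplets
        if len(affix) <= max_affix_len and freq >= min_freq
    ]
```

For a toy corpus of ("walking", "VERB"), ("talking", "VERB"), and ("king", "NOUN"), the default thresholds keep only the VERB triplets for "ing", "ng", and "g", each with a frequency of 2.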
22. A system for predicting a pattern associated with an input sequence, said system comprising:
an affix generation module for:
receiving a corpus comprising a plurality of sequences;
generating an affix prediction data set, the affix prediction data set comprising a set of triplets based on the corpus, each triplet having an affix, an associated pattern, and a frequency of occurrence for an affix-pattern combination, wherein the affix and associated pattern determine the affix-pattern combination, and wherein the frequency of occurrence for each affix-pattern combination is accumulated while processing each of the plurality of sequences of the corpus; and
an affix prediction module for:
determining from the input sequence a set of affixes, wherein the set of affixes comprises one or more affixes each contained within the input sequence; and
predicting a pattern by comparing the set of affixes with entries in the affix prediction data set;
determining that a group of one or more affixes from the set of affixes is in the prediction data set;
selecting a matching affix from the group of one or more affixes, wherein the matching affix is the potential affix within the group that has the greatest number of characters; and
selecting a pattern associated with the matching affix as the predicted pattern.
Dependent claims: 23, 24, 25, 26, 27, 28, 29
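The two modules of claim 22 can be combined in one small sketch. As before, this is an illustration under stated assumptions rather than the claimed system: the corpus is a list of (sequence, pattern) pairs, affixes are suffixes, and when an affix occurs with several patterns the table keeps the most frequent one (the first one seen wins a tie). All names are hypothetical.

```python
from collections import Counter


def build_prediction_table(corpus):
    """Affix generation: map each affix to its most frequent (pattern, frequency)."""
    counts = Counter()
    for sequence, pattern in corpus:
        for i in range(len(sequence)):
            counts[(sequence[i:], pattern)] += 1
    table = {}
    for (affix, pattern), freq in counts.items():
        best = table.get(affix)
        # Assumption: keep the pattern with the highest accumulated frequency.
        if best is None or freq > best[1]:
            table[affix] = (pattern, freq)
    return table


def predict_pattern(sequence, table):
    """Affix prediction: return the pattern of the longest affix found in the table."""
    for i in range(len(sequence)):  # i = 0 is the longest suffix
        affix = sequence[i:]
        if affix in table:
            return table[affix][0]
    return None
```

Scanning suffixes from longest to shortest gives the same result as collecting every matching affix and taking the one with the greatest number of characters, while stopping at the first hit.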
30. An apparatus for generating a data set, the apparatus comprising:
at least one processor programmed to:
receive a corpus, the corpus comprising a plurality of sequences;
generate a set of triplets based on the corpus, each triplet having an affix, an associated pattern, and a frequency of occurrence for an affix-pattern combination, wherein the affix and associated pattern determine the affix-pattern combination and wherein the frequency of occurrence for an affix-pattern combination of each triplet is accumulated while processing each of the plurality of sequences of the corpus; and
select a subset of triplets as the data set using a selection criteria based on the length of each affix in the set of triplets and the frequency of occurrence of each affix-pattern combination in the set of triplets.
Dependent claims: 31, 32, 33, 34, 35, 36
Specification