Machine learning and validation of account names, addresses, and/or identifiers
First Claim
1. A computer-implemented method for determining if an account identifier is computer-generated, comprising:
receiving the account identifier;
dividing the account identifier into a plurality of fragments;
calculating a first feature of the plurality of fragments of the account identifier by calculating fragment frequencies of each of the plurality of fragments relative to a plurality of fragments associated with a plurality of account identifiers;
calculating a second feature of the account identifier by classifying alphanumeric character types of the account identifier;
calculating a third feature of the plurality of fragments of the account identifier by hashing the fragment frequencies to a commonness of each of the plurality of fragments of the account identifier; and
generating a plurality of data pairs, each data pair comprising a fragment of the plurality of fragments and an associated feature value, the associated feature value comprising the first feature, the second feature, or the third feature;
providing each of the plurality of data pairs to a probabilistic classifier model to determine if the account identifier is computer-generated.
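The feature-calculation steps of the claim can be illustrated with a minimal Python sketch. This is not the patented implementation. The function names, the stripping of everything after "@", the choice of overlapping three-character fragments, the L/D/S character-type encoding, and the log-scale buckets (one possible reading of "hashing the fragment frequencies to a commonness") are all assumptions made for illustration.

```python
import math
from collections import Counter

def fragments(identifier: str, n: int = 3) -> list[str]:
    """Divide an identifier into overlapping character n-grams.
    Assumption: only the part before '@' is fragmented."""
    local = identifier.split("@", 1)[0].lower()
    if len(local) <= n:
        return [local]
    return [local[i:i + n] for i in range(len(local) - n + 1)]

def fragment_frequencies(corpus: list[str], n: int = 3) -> Counter:
    """First feature: how often each fragment occurs across a corpus of identifiers."""
    counts = Counter()
    for ident in corpus:
        counts.update(fragments(ident, n))
    return counts

def char_type_pattern(identifier: str) -> str:
    """Second feature: classify characters as L (letter), D (digit), S (other),
    collapsing runs of the same type (hypothetical encoding)."""
    local = identifier.split("@", 1)[0]
    pattern = []
    for ch in local:
        kind = "L" if ch.isalpha() else "D" if ch.isdigit() else "S"
        if not pattern or pattern[-1] != kind:
            pattern.append(kind)
    return "".join(pattern)

def commonness_bucket(count: int, num_buckets: int = 4) -> int:
    """Third feature: map a raw fragment count to a coarse commonness bucket.
    Bucket 0 means unseen; higher buckets grow with log10 of the count."""
    if count <= 0:
        return 0
    return min(num_buckets - 1, 1 + int(math.log10(count)))

def data_pairs(identifier: str, counts: Counter, n: int = 3) -> list[tuple]:
    """Pair every fragment with each of the three feature values."""
    total = sum(counts.values()) or 1
    pattern = char_type_pattern(identifier)
    pairs = []
    for frag in fragments(identifier, n):
        pairs.append((frag, counts[frag] / total))                # relative frequency
        pairs.append((frag, pattern))                             # character-type pattern
        pairs.append((frag, commonness_bucket(counts[frag])))     # commonness bucket
    return pairs

# Usage with a made-up corpus (the addresses are illustrative, not real data).
corpus = ["john.smith@example.com", "johnny@example.com", "xq7zk2vp@example.com"]
counts = fragment_frequencies(corpus)
for pair in data_pairs("john@example.com", counts):
    print(pair)
```

The resulting (fragment, feature value) pairs are the kind of input the claim describes passing to a probabilistic classifier model; the classifier itself is outside this sketch.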
5 Assignments
0 Petitions
Abstract
Systems and methods are disclosed for determining if an account identifier is computer-generated. One method includes receiving the account identifier, dividing the account identifier into a plurality of fragments, and determining one or more features of at least one of the fragments. The method further includes determining the commonness of at least one of the fragments, and determining if the account identifier is computer-generated based on the features of at least one of the fragments, and the commonness of at least one of the fragments.
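The final decision step, determining whether an identifier is computer-generated from fragment evidence, can be sketched with a small naive Bayes model. The source only says "probabilistic classifier model" and does not name a specific one, so naive Bayes is a swapped-in assumption here, as are the class name, the three-character fragments, the Laplace smoothing, and the toy training data.

```python
import math
from collections import Counter

def ngrams(identifier: str, n: int = 3) -> list[str]:
    """Overlapping character n-grams; short strings yield a single fragment."""
    s = identifier.lower()
    return [s[i:i + n] for i in range(max(1, len(s) - n + 1))]

class FragmentNaiveBayes:
    """Hypothetical multinomial naive Bayes over identifier fragments."""

    def __init__(self, n: int = 3, alpha: float = 1.0):
        self.n = n
        self.alpha = alpha  # Laplace smoothing constant
        self.counts = {True: Counter(), False: Counter()}
        self.docs = {True: 0, False: 0}
        self.vocab = set()

    def fit(self, identifiers, labels):
        """labels[i] is True when identifiers[i] is computer-generated."""
        for ident, is_generated in zip(identifiers, labels):
            self.counts[is_generated].update(ngrams(ident, self.n))
            self.docs[is_generated] += 1
        self.vocab = set(self.counts[True]) | set(self.counts[False])
        return self

    def _log_score(self, frags, label: bool) -> float:
        total = sum(self.counts[label].values())
        # One extra vocabulary slot is reserved for unseen fragments.
        denom = total + self.alpha * (len(self.vocab) + 1)
        prior = (self.docs[label] + self.alpha) / (sum(self.docs.values()) + 2 * self.alpha)
        score = math.log(prior)
        for frag in frags:
            score += math.log((self.counts[label][frag] + self.alpha) / denom)
        return score

    def prob_generated(self, identifier: str) -> float:
        """Posterior probability that the identifier is computer-generated."""
        frags = ngrams(identifier, self.n)
        log_gen = self._log_score(frags, True)
        log_hum = self._log_score(frags, False)
        top = max(log_gen, log_hum)  # subtract the max for numerical stability
        e_gen, e_hum = math.exp(log_gen - top), math.exp(log_hum - top)
        return e_gen / (e_gen + e_hum)

# Toy training data, invented for illustration only.
human = ["johnsmith", "maryjones", "johnjones", "marysmith"]
generated = ["xq7zk2vp", "zk9qx3vw", "q7xk2zpv", "vw3zq9xk"]
model = FragmentNaiveBayes().fit(human + generated, [False] * 4 + [True] * 4)
print(model.prob_generated("johnmary"))
print(model.prob_generated("xq7zk2"))
```

In this sketch the fragments themselves are the only evidence. A fuller version would also feed in the character-type and commonness features mentioned in the abstract.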
44 Citations
17 Claims
8. A system for determining if an account identifier is computer-generated, the system including:
a data storage device storing instructions determining if an account identifier is computer-generated; and
a processor configured to execute the instructions to perform a method including:
receiving the account identifier;
dividing the account identifier into a plurality of fragments;
calculating a first feature of the plurality of fragments of the account identifier by calculating fragment frequencies of each of the plurality of fragments relative to a plurality of fragments associated with a plurality of account identifiers;
calculating a second feature of the account identifier by classifying alphanumeric character types of the account identifier;
calculating a third feature of the plurality of fragments of the account identifier by hashing the fragment frequencies to a commonness of each of the plurality of fragments of the account identifier; and
generating a plurality of data pairs, each data pair comprising a fragment of the plurality of fragments and an associated feature value, the associated feature value comprising the first feature, the second feature, or the third feature;
providing each of the plurality of data pairs to a probabilistic classifier model to determine if the account identifier is computer-generated.
14. A non-transitory computer-readable medium storing instructions that, when executed by a processor, cause the processor to perform a method for determining whether an account identifier is computer-generated, the method including:
receiving the account identifier;
dividing the account identifier into a plurality of fragments;
calculating a first feature of the plurality of fragments of the account identifier by calculating fragment frequencies of each of the plurality of fragments relative to a plurality of fragments associated with a plurality of account identifiers;
calculating a second feature of the account identifier by classifying alphanumeric character types of the account identifier;
calculating a third feature of the plurality of fragments of the account identifier by hashing the fragment frequencies to a commonness of each of the plurality of fragments of the account identifier; and
generating a plurality of data pairs, each data pair comprising a fragment of the plurality of fragments and an associated feature value, the associated feature value comprising the first feature, the second feature, or the third feature;
providing each of the plurality of data pairs to a probabilistic classifier model to determine if the account identifier is computer-generated.
Specification