
Extracting product purchase information from electronic messages

  • US 9,875,486 B2
  • Filed: 10/21/2014
  • Issued: 01/23/2018
  • Est. Priority Date: 10/21/2014
  • Status: Active Grant
First Claim

1. A method, comprising, by computer apparatus:

  • grouping electronic messages into respective clusters based on similarities between the electronic messages;

    for each of one or more of the clusters, extracting a respective grammar defining a respective arrangement of structural elements of the electronic messages in the cluster, wherein the extracting comprises, for each of the one or more clusters,

      in non-transitory computer-readable memory, building a respective generalized suffix tree representation from a concatenation of tokenized contents of electronic messages in the cluster into a single respective string, wherein the generalized suffix tree representation maintains the order of suffixes from the single respective string in a hierarchical tree structure of nodes that are linearly interconnected from root to leaf node and, for each suffix, identifies the electronic messages in which the suffix appears and a respective count of times the suffix appears in each electronic message in the cluster, and

      traversing the respective generalized suffix tree, wherein the traversing comprises ascertaining the respective arrangement of structural elements for the respective grammar based on appearance frequencies of substrings in the electronic messages in the cluster;

    based on training data comprising training data fields, building one or more classifiers that classify field tokens extracted from selected electronic messages with respective product purchase relevant labels based on correspondences between tokens extracted from the selected electronic messages and the structural elements of the grammars respectively matched to the selected electronic messages; and

    in non-transitory computer-readable memory, storing the grammars and the one or more classifiers in one or more data structures associated with a parser executable by a processor to parse product purchase information from electronic messages.
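
The grammar-extraction step of the claim can be illustrated with a minimal sketch. It is not the patented implementation: it assumes whitespace tokenization, inserts each message's suffixes separately instead of concatenating the cluster into one string, and uses an uncompressed suffix trie (quadratic in message length) in place of a true generalized suffix tree. All names (`SuffixTrieNode`, `build_generalized_suffix_trie`, `shared_substrings`) are hypothetical. Each node records, per message, how many times the token substring ending at that node appears, and the traversal keeps the substrings that appear in every message of the cluster as candidate structural elements.

```python
from collections import defaultdict


def tokenize(message):
    # Assumption: plain whitespace tokenization; the claim does not fix a tokenizer.
    return message.split()


class SuffixTrieNode:
    def __init__(self):
        self.children = {}              # token -> SuffixTrieNode
        self.counts = defaultdict(int)  # message id -> occurrences of this substring


def build_generalized_suffix_trie(messages):
    """Insert every token suffix of every message into one shared trie."""
    root = SuffixTrieNode()
    for msg_id, message in enumerate(messages):
        tokens = tokenize(message)
        for start in range(len(tokens)):
            node = root
            for token in tokens[start:]:
                node = node.children.setdefault(token, SuffixTrieNode())
                # One increment per occurrence of tokens[start:current+1].
                node.counts[msg_id] += 1
    return root


def shared_substrings(root, n_messages):
    """Traverse the trie and collect token substrings present in all messages."""
    results = []

    def walk(node, path):
        for token, child in node.children.items():
            # A substring missing from some message cannot have an
            # extension present in all messages, so prune here.
            if len(child.counts) == n_messages:
                new_path = path + [token]
                results.append(tuple(new_path))
                walk(child, new_path)

    walk(root, [])
    return results


# Hypothetical cluster of similar order-confirmation messages.
messages = [
    "Order 123 shipped to Alice",
    "Order 456 shipped to Bob",
    "Order 789 shipped to Carol",
]
root = build_generalized_suffix_trie(messages)
shared = shared_substrings(root, len(messages))
longest = max(shared, key=len)  # ("shipped", "to")
```

In this toy cluster the tokens that vary between messages (order numbers, recipient names) drop out, while the fixed template fragments survive. Those varying positions are the kind of field tokens the claim's classifiers would then label.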
