Extracting product purchase information from electronic messages
First Claim
1. A computer-implemented method, comprising:
for each purchase-related electronic message in a group of purchase-related electronic messages selected from a collection of electronic messages transmitted between network nodes and stored in a first networked non-transitory computer-readable memory, segmenting, by a processor, contents of the electronic message into tokens;
matching the electronic message to one of multiple clusters of purchase-related electronic messages, wherein each cluster is associated with a respective grammar that recursively defines a respective allowable arrangement of tokens corresponding to structural elements of the electronic messages in the matched cluster;
parsing, by a processor, the tokens segmented from the electronic message in accordance with the grammar associated with the cluster matched to the electronic message, wherein the parsing comprises identifying the tokens segmented from the electronic message that correspond to respective structural elements defined in the grammar and extracting unidentified tokens segmented from the electronic message as field tokens;
determining classification features from the tokens corresponding to structural elements of the electronic messages in the matched cluster;
classifying, by at least one machine learning classifier, the extracted field tokens with respective product purchase relevant labels based on the determined classification features;
storing associations between the product purchase relevant labels and the respective extracted field tokens as aggregated data in a second networked non-transitory computer-readable memory; and
transmitting data for displaying a view based on the aggregated data on a client network node.
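As an illustration only, the first two steps of the claim (segmenting a message into tokens and matching it to a cluster) might be sketched as follows. The cluster names, their structural vocabularies, the regular expression, and the Jaccard-overlap matching rule are all assumptions made for this sketch; the claim does not specify how tokens are formed or how the match is computed.

```python
import re

def segment(text):
    """Split message text into word, number, and punctuation tokens."""
    return re.findall(r"[A-Za-z]+|\d+(?:\.\d+)?|[^\sA-Za-z\d]", text)

# Hypothetical clusters: each maps to the structural tokens its messages share.
CLUSTERS = {
    "merchant_a_order": {"Order", "#", "Item", ":", "Total", "$"},
    "merchant_b_receipt": {"Receipt", "No", ".", "Amount", "USD"},
}

def match_cluster(tokens):
    """Return the cluster whose structural vocabulary best overlaps the message."""
    toks = set(tokens)

    def jaccard(vocab):
        # Overlap measure chosen for this sketch, not taken from the patent.
        return len(toks & vocab) / len(toks | vocab)

    return max(CLUSTERS, key=lambda name: jaccard(CLUSTERS[name]))

message = "Order # 1234 Item: Blue Widget Total: $ 19.99"
tokens = segment(message)
cluster = match_cluster(tokens)
```

Here the message shares six structural tokens with the first hypothetical cluster and none with the second, so it is matched to the first. The tokens that no cluster vocabulary names ("1234", "Blue", "Widget", "19.99") are the candidates for field tokens in the later parsing step.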
Abstract
Improved systems and methods for extracting product purchase information from electronic messages transmitted between physical network nodes to convey product purchase information to designated recipients. These examples provide a product purchase information extraction service that is able to extract product purchase information from electronic messages with high precision across a wide variety of electronic message formats and thereby solve the practical problems that have arisen as a result of the proliferation of different electronic message formats used by individual merchants and across different merchants and different languages. In this regard, these examples are able to automatically learn the structures and semantics of different message formats, which accelerates the ability to support new message sources, new markets, and different languages.
22 Claims
1. A computer-implemented method, comprising:
for each purchase-related electronic message in a group of purchase-related electronic messages selected from a collection of electronic messages transmitted between network nodes and stored in a first networked non-transitory computer-readable memory, segmenting, by a processor, contents of the electronic message into tokens;
matching the electronic message to one of multiple clusters of purchase-related electronic messages, wherein each cluster is associated with a respective grammar that recursively defines a respective allowable arrangement of tokens corresponding to structural elements of the electronic messages in the matched cluster;
parsing, by a processor, the tokens segmented from the electronic message in accordance with the grammar associated with the cluster matched to the electronic message, wherein the parsing comprises identifying the tokens segmented from the electronic message that correspond to respective structural elements defined in the grammar and extracting unidentified tokens segmented from the electronic message as field tokens;
determining classification features from the tokens corresponding to structural elements of the electronic messages in the matched cluster;
classifying, by at least one machine learning classifier, the extracted field tokens with respective product purchase relevant labels based on the determined classification features;
storing associations between the product purchase relevant labels and the respective extracted field tokens as aggregated data in a second networked non-transitory computer-readable memory; and
transmitting data for displaying a view based on the aggregated data on a client network node.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19
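The parsing step of claim 1 can be illustrated with a small recursive-descent sketch. The grammar below is hypothetical: its rule names, its literal structural tokens ("Order", "Item", "Price", "Total"), and the "FIELD" slot convention are assumptions made for the example and are not taken from the patent. The ITEMS rule refers to itself, which is one simple way a grammar can recursively define an allowable arrangement of tokens.

```python
# Hypothetical grammar for one cluster. Keys are rule names, "FIELD" is a slot
# for tokens the grammar does not name, and every other symbol is a literal
# structural token. ITEMS is recursive, so any number of items is allowed.
GRAMMAR = {
    "MESSAGE": [["Order", "FIELD", "ITEMS", "Total", "FIELD"]],
    "ITEMS": [["ITEM", "ITEMS"], ["ITEM"]],
    "ITEM": [["Item", "FIELD", "Price", "FIELD"]],
}
LITERALS = {
    sym
    for alternatives in GRAMMAR.values()
    for alt in alternatives
    for sym in alt
    if sym not in GRAMMAR and sym != "FIELD"
}

def parse(symbol, tokens, pos):
    """Match `symbol` at tokens[pos:]; return (new_pos, field_tokens) or None."""
    if symbol == "FIELD":
        # Unidentified tokens up to the next structural token form one field.
        end = pos
        while end < len(tokens) and tokens[end] not in LITERALS:
            end += 1
        return (end, [tokens[pos:end]]) if end > pos else None
    if symbol not in GRAMMAR:
        # Literal structural token: must match exactly.
        matched = pos < len(tokens) and tokens[pos] == symbol
        return (pos + 1, []) if matched else None
    for alternative in GRAMMAR[symbol]:
        cur, fields = pos, []
        for part in alternative:
            result = parse(part, tokens, cur)
            if result is None:
                break  # this alternative fails, try the next one
            cur, found = result
            fields += found
        else:
            return cur, fields
    return None

tokens = "Order 1234 Item Blue Widget Price 19.99 Item Red Gadget Price 5.00 Total 24.99".split()
end, field_tokens = parse("MESSAGE", tokens, 0)
```

In this sketch the structural tokens are consumed by the grammar and everything left over is returned as field tokens, grouped by the slot they filled. A field value that happens to contain a structural word would break this simple version; a production parser would need a more careful tokenization.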
20. An apparatus, comprising non-transitory memory storing processor-readable instructions, and one or more processors coupled to the memory, configured to execute the instructions, and based at least in part on the execution of the instructions configured to perform operations comprising:
identifying purchase-related electronic messages in a collection of electronic messages transmitted between network nodes and stored in a first networked non-transitory computer-readable memory;
for each of a plurality of the identified purchase-related electronic messages, segmenting, by a processor, contents of the identified electronic message into tokens;
matching the identified electronic message to one of multiple clusters of purchase-related electronic messages, wherein each cluster is associated with a respective grammar that recursively defines a respective allowable arrangement of tokens corresponding to structural elements of the electronic messages in the cluster matched to the identified electronic message;
parsing, by a processor, the tokens segmented from the identified electronic message in accordance with the grammar associated with the cluster matched to the identified electronic message, wherein the parsing comprises identifying the tokens segmented from the identified electronic message that correspond to respective structural elements defined in the grammar and extracting unidentified tokens segmented from the identified electronic message as field tokens;
determining classification features from the tokens corresponding to structural elements of the electronic messages in the matched cluster;
classifying, by at least one machine learning classifier, the extracted field tokens with respective product purchase relevant labels based on the determined classification features; and
storing associations between the product purchase relevant labels and the respective extracted field tokens as aggregated data in a second networked non-transitory computer-readable memory; and
in response to a request from a client network node, transmitting data for displaying a view based on the aggregated data on the client network node.
21. At least one non-transitory computer-readable medium having processor-readable program code embodied therein, the processor-readable program code adapted to be executed by a processor to implement a method comprising:
for each purchase-related electronic message in a group of purchase-related electronic messages selected from a collection of electronic messages transmitted between network nodes and stored in a first networked non-transitory computer-readable memory, segmenting, by a processor, contents of the electronic message into tokens;
matching the electronic message to one of multiple clusters of purchase-related electronic messages, wherein each cluster is associated with a respective grammar that recursively defines a respective allowable arrangement of tokens corresponding to structural elements of the electronic messages in the matched cluster;
parsing, by a processor, the tokens segmented from the electronic message in accordance with the grammar associated with the cluster matched to the electronic message, wherein the parsing comprises identifying the tokens segmented from the electronic message that correspond to respective structural elements defined in the grammar and extracting unidentified tokens segmented from the electronic message as field tokens;
determining classification features from the tokens corresponding to structural elements of the electronic messages in the matched cluster;
classifying, by at least one machine learning classifier, the extracted field tokens with respective product purchase relevant labels based on the determined classification features;
storing associations between the product purchase relevant labels and the respective extracted field tokens as aggregated data in a second networked non-transitory computer-readable memory; and
transmitting data for displaying a view based on the aggregated data on a client network node.
22. An apparatus comprising at least one non-transitory memory storing processor-readable instructions, and at least one processor coupled to the memory, configured to execute the instructions, and based at least in part on the execution of the instructions configured to implement:
a product purchase information token parser configured to segment contents of a selected electronic message into tokens, match the selected electronic message to one of multiple clusters of purchase-related electronic messages transmitted between physical network nodes to convey product purchase information to designated recipients, wherein each cluster is associated with a respective grammar that recursively defines a respective allowable arrangement of tokens corresponding to structural elements of the electronic messages in the matched cluster, and parse the tokens segmented from the selected electronic message in accordance with the grammar associated with the cluster matched to the selected electronic message, wherein the parsing comprises identifying the tokens segmented from the selected electronic message that correspond to respective structural elements defined in the grammar and extracting unidentified tokens segmented from the selected electronic message as field tokens;
a product purchase information token classifier configured to determine classification features of the selected electronic message from the tokens corresponding to structural elements of the electronic messages in the matched cluster, classify the extracted field tokens with respective product purchase relevant labels based on the determined classification features; and
non-transitory computer-readable memory storing associations between the product purchase relevant labels and the respective extracted field tokens in one or more data structures permitting computer-based generation of actionable purchase history information.
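The classifier and storage elements of claim 22 could look roughly like the following sketch. Everything concrete in it is an assumption: the feature scheme (the structural tokens immediately before and after a field), the label names, the tiny training set, and the use of a hand-written naive Bayes model as a stand-in for "at least one machine learning classifier". An in-memory dictionary stands in for the memory that stores the label-to-field associations.

```python
import math
from collections import Counter, defaultdict

def features(prev_structural, next_structural):
    """Classification features derived only from neighboring structural tokens."""
    return (f"prev={prev_structural}", f"next={next_structural}")

class NaiveBayes:
    """Tiny categorical naive Bayes with add-one smoothing (illustrative stand-in)."""

    def __init__(self):
        self.label_counts = Counter()
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, examples):
        for feats, label in examples:
            self.label_counts[label] += 1
            for f in feats:
                self.feature_counts[label][f] += 1
                self.vocab.add(f)
        return self

    def predict(self, feats):
        total = sum(self.label_counts.values())

        def score(label):
            n = sum(self.feature_counts[label].values())
            s = math.log(self.label_counts[label] / total)
            for f in feats:
                count = self.feature_counts[label][f]
                s += math.log((count + 1) / (n + len(self.vocab)))
            return s

        return max(self.label_counts, key=score)

# Hypothetical labeled examples: (features, product purchase relevant label).
TRAINING = [
    (features("Order", "Item"), "order_number"),
    (features("Order", "Item"), "order_number"),
    (features("Item", "Price"), "product_name"),
    (features("Item", "Price"), "product_name"),
    (features("Price", "Item"), "price"),
    (features("Price", "Total"), "price"),
    (features("Total", "END"), "order_total"),
    (features("Total", "END"), "order_total"),
]
model = NaiveBayes().fit(TRAINING)

# Field tokens as a parser might emit them, with their structural neighbors.
parsed = [
    ("Order", ["1234"], "Item"),
    ("Item", ["Blue", "Widget"], "Price"),
    ("Price", ["19.99"], "Total"),
    ("Total", ["24.99"], "END"),
]

# Store associations between labels and extracted field tokens.
store = defaultdict(list)
for prev_tok, field, next_tok in parsed:
    label = model.predict(features(prev_tok, next_tok))
    store[label].append(" ".join(field))
```

Because the features come from the structural tokens rather than from the field values themselves, the same model can label a field it has never seen (a new product name, a new price) as long as the surrounding template is familiar. The resulting label-to-value mapping is the kind of data structure from which a purchase history view could be generated.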
Specification