Obtaining data from electronic documents

  • US 9,348,811 B2
  • Filed: 04/20/2012
  • Issued: 05/24/2016
  • Est. Priority Date: 04/20/2012
  • Status: Active Grant
First Claim

1. A method performed with a computing system for obtaining information from a set of related electronic documents, the method comprising:

  • accessing the set of related electronic documents that are each hosted on one or more respective web servers that are accessible through a network, the accessing including retrieving data associated with the set of related electronic documents through the network;

    analyzing markup language of an electronic document of the set of related electronic documents to identify markup language tags of the electronic document;

    analyzing, using a page recognition module, the markup language tags to identify the electronic document as a product page, the page recognition model generated based on a first machine learning algorithm, and the product page comprising a plurality of terms;

    filtering the plurality of terms into a first set of terms and a second set of terms, the first set of terms and the second set of terms including different terms of the plurality of terms, each term in the first set of terms identified as potentially being associated with a product name, and each term in the second set of terms identified as not being associated with a product name;

    for each term of the first set of terms, identifying a noun phrase that includes the term and determining one or more features of each of the noun phrase and the term;

    for each feature of the one or more features:

    determining, for each term of the first set of terms, a first feature value of the noun phrase and a second feature value of the term, and

    determining, for each term of the first set of terms, an overall feature value for the term based on the first feature value and the second feature value;

    identifying each term in the first set of terms as being associated with a product name or not being associated with a product name with a name recognition model, the name recognition model generated based on the overall feature value for each feature of the term; and

    providing for display on a graphical user interface, one or more of the first set of terms that are identified as being associated with a product name.
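
The non-learned steps of the claim (collecting markup tags, filtering terms, and combining a noun-phrase feature value with a term feature value) can be illustrated with a minimal Python sketch. It is not the patented implementation. The names TagCollector, NON_NAME_TERMS, filter_terms, capitalized_fraction, and overall_feature_value are hypothetical, the stop list and the capitalization feature are invented for illustration, and the plain mean used to combine the two values is an assumption, since the claim only says the overall value is "based on" both. The page recognition and name recognition models, and noun-phrase detection itself, are left out; the noun phrase is passed in by hand.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collects opening-tag names and whitespace-separated text terms."""

    def __init__(self):
        super().__init__()
        self.tags = []   # markup language tags, in document order
        self.terms = []  # plurality of terms found in the page text

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        self.terms.extend(data.split())


# Hypothetical stop list: terms assumed never to belong to a product name.
NON_NAME_TERMS = {"buy", "now", "price", "shipping", "free", "the", "a", "for"}


def filter_terms(terms):
    """Split terms into (first set: candidates, second set: rejected)."""
    first, second = [], []
    for term in terms:
        (second if term.lower() in NON_NAME_TERMS else first).append(term)
    return first, second


def capitalized_fraction(text):
    """Illustrative feature: fraction of words starting with an uppercase letter."""
    words = text.split()
    return sum(w[0].isupper() for w in words) / len(words) if words else 0.0


def overall_feature_value(term, noun_phrase, feature):
    """Combine the two feature values; the plain mean is an assumption."""
    first_value = feature(noun_phrase)  # first feature value (noun phrase)
    second_value = feature(term)        # second feature value (term)
    return (first_value + second_value) / 2


if __name__ == "__main__":
    collector = TagCollector()
    collector.feed("<h1>Acme Rocket Skates</h1><p>Buy now</p>")
    candidates, rejected = filter_terms(collector.terms)
    print(collector.tags, candidates, rejected)
    for term in candidates:
        # The noun phrase is supplied by hand; a real system would detect it.
        print(term, overall_feature_value(term, "Acme Rocket Skates",
                                          capitalized_fraction))
```

In a fuller pipeline, the collected tags would be the input to a page recognition model and the per-feature overall values would be the input to a name recognition model. Both are trained components that this sketch does not attempt to reproduce.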
