Canonicalized online document sitelink generation

  • US 10,776,435 B2
  • Filed: 04/19/2017
  • Issued: 09/15/2020
  • Est. Priority Date: 01/31/2013
  • Status: Active Grant
First Claim

1. A system for canonicalized online document sitelink generation, comprising:

  • a data processing system comprising at least one processor and memory to:

    receive digital information generated by an audio codec that converts spoken information from a user to the digital information;

    identify, based on the digital information, a content item associated with a first uniform resource locator (URL) including a campaign parameter;

    generate a canonicalized content item URL comprising a canonical form by removing the campaign parameter from the first URL;

    generate a content item URL group with the canonicalized content item URL;

    receive a sitelink associated with a second URL indexed in a database, the second URL including a URL parameter;

    crawl the second URL with the URL parameter to identify a landing page;

    crawl the second URL without the URL parameter to identify the same landing page;

    generate, responsive to crawl of the second URL with and without the URL parameter and identification of the same landing page, a canonicalized sitelink URL for the second URL by removal of the URL parameter, wherein the canonicalized content item URL is in the canonical form configured to reduce repeated calculations as compared to the first URL not in the canonical form;

    match the canonicalized sitelink URL with the content item of the content item URL group based on an indication of similarity between text of the content item and text of the canonicalized sitelink URL;

    determine, based on a filter configured to eliminate excluded content items based on a geographic policy, that the content item is compatible with the canonicalized sitelink URL; and

    select, in response to receipt of the digital information generated by the audio codec that converts the spoken information from the user to the digital information, the content item matched with the sitelink associated with the canonicalized sitelink URL based on the filter.
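
The URL-handling steps of the claim (removing a campaign parameter, grouping content items under the canonical URL, and dropping a sitelink URL parameter only when the landing page is the same with and without it) can be illustrated with a minimal Python sketch. This is an illustration under stated assumptions, not the patented implementation: the names `CAMPAIGN_PARAMS`, `strip_params`, `group_content_items`, and `canonicalize_sitelink` are hypothetical, the list of campaign parameter names is assumed, and `fetch` stands in for whatever crawler returns a comparable representation of a landing page.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: campaign parameters are recognized by name.
# The claim does not specify which names are used.
CAMPAIGN_PARAMS = {"utm_source", "utm_medium", "utm_campaign"}


def strip_params(url, names):
    """Return `url` with the query parameters listed in `names` removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in names
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def group_content_items(items):
    """Group (item_id, url) pairs under their canonicalized URL."""
    groups = defaultdict(list)
    for item_id, url in items:
        groups[strip_params(url, CAMPAIGN_PARAMS)].append(item_id)
    return dict(groups)


def canonicalize_sitelink(url, param, fetch):
    """Drop `param` only if fetching with and without it gives the same page.

    `fetch` is a hypothetical callable mapping a URL to a landing-page
    representation that can be compared with ==.
    """
    stripped = strip_params(url, {param})
    if fetch(url) == fetch(stripped):
        return stripped
    return url  # The parameter changes the page, so keep it.
```

As a usage example, two content items whose URLs differ only in `utm_campaign` fall into one group, so work keyed on the URL is done once per group rather than once per variant. A sitelink parameter that changes the landing page is left in place, since removing it would point the sitelink at a different page.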

  • 2 Assignments