Scalable, high performance and highly available distributed storage system for Internet content

  • US 7,624,169 B2
  • Filed: 11/10/2006
  • Issued: 11/24/2009
  • Est. Priority Date: 04/02/2001
  • Status: Active Grant
First Claim

1. A method of content storage on behalf of participating content providers, comprising:

  • maintaining a set of storage sites managed by a content delivery network (CDN) service provider distinct from the participating content providers, wherein the storage sites of the set of storage sites are located in different Internet-accessible locations and operate asynchronously and autonomously from one another, wherein each of the given storage sites comprises a network filesystem, a set of one or more client servers, a set of one or more file servers that export the network filesystem to the set of client servers, a content upload process operative on at least one of the client servers, a content replication process operative on at least one of the client servers, and a content download process operative on at least one of the client servers, the service provider also operating a traffic manager that estimates relative connectivity to the storage sites from a set of proxy points, wherein each proxy point is determined by directing a trace route from given storage sites and determining a given point in the Internet where the trace routes intersect, and wherein the relative connectivity is determined by probing each of the proxy points;

    for each piece of content identified for upload by a participating content provider, selecting a given upload storage site from the set of storage sites by having the traffic manager resolve a first DNS query without reference to an identifier for the piece of content;

    at the given upload storage site identified by resolving the first DNS query, using a content upload process to receive an upload of the piece of content identified by the participating content provider, and using a replication process to store the piece of content received by the content upload process and to initiate replication of the piece of content to at least one other storage site in the set of storage sites;

    for each piece of content identified for download by a given server entity, the given server entity being a CDN edge server, selecting a given download storage site from the set of storage sites by having the traffic manager resolve a second DNS query without reference to an identifier for the piece of content, wherein the second DNS query differs from the first DNS query;

    at the given download storage site identified by resolving the second DNS query, using a download process to attempt to download the piece of content to the given server entity;

    for a given piece of content identified for download by a given server entity that is not then available for download from the given download storage site selected from the set of storage sites, redirecting a request for the given piece of content to one or more of the storage sites, wherein during the redirecting step a redirecting threshold is maintained such that, when the redirecting threshold is met, the given piece of content is then dynamically received from a participating content provider origin server;

    maintaining an activity log at each storage site; and

    at a second storage site, replaying an activity log recorded at a first storage site to synchronize content across the first and second storage sites.
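
The last three steps of the claim (redirecting on a miss, falling back to the origin server once a redirect threshold is met, and replaying one site's activity log at another) can be illustrated with a minimal Python sketch. This is not the patented implementation: `StorageSite`, `replay_from`, `fetch`, the in-memory `files` dictionary, the `(op, path, data)` log format, and the default threshold of 2 are all hypothetical names and values chosen for illustration, and the sketch assumes the threshold counts redirects between storage sites.

```python
from dataclasses import dataclass, field


@dataclass
class StorageSite:
    """Hypothetical stand-in for one storage site (not from the patent)."""
    name: str
    files: dict = field(default_factory=dict)    # path -> content bytes
    log: list = field(default_factory=list)      # ordered (op, path, data) entries
    applied: dict = field(default_factory=dict)  # peer name -> count of entries replayed

    def upload(self, path, data):
        """Store content locally and record the action in the activity log."""
        self.files[path] = data
        self.log.append(("put", path, data))

    def delete(self, path):
        """Remove content locally and record the action in the activity log."""
        self.files.pop(path, None)
        self.log.append(("delete", path, None))

    def replay_from(self, peer):
        """Apply the peer's log entries that this site has not yet replayed."""
        start = self.applied.get(peer.name, 0)
        for op, path, data in peer.log[start:]:
            if op == "put":
                self.files[path] = data
            else:
                self.files.pop(path, None)
        # Replayed entries are not re-logged here, so two sites replaying
        # each other's logs do not echo the same action back and forth.
        self.applied[peer.name] = len(peer.log)


def fetch(path, sites, origin, redirect_threshold=2):
    """Try the selected site first, redirect to the next site on a miss,
    and fall back to the origin once the redirect threshold is met.

    Assumption: `sites` is already ordered by the traffic manager's choice,
    and `origin` is a plain dict standing in for the provider's origin server.
    """
    redirects = 0
    for site in sites:
        if path in site.files:
            return site.files[path], site.name
        if redirects >= redirect_threshold:
            break
        redirects += 1
    return origin[path], "origin"
```

In this sketch, content uploaded to one site is served through a redirect until the second site replays the first site's log, after which the second site serves it directly. Keeping a per-peer replay offset is one simple way to make replay incremental; the claim itself does not specify how replay progress is tracked.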
