Reduced bandwidth data uploading in data systems
First Claim
1. A device, comprising:
at least one processor; and
a memory comprising program instructions, wherein the program instructions are executable by the at least one processor to implement a data store gateway at a given client network that is distinct from a plurality of other client networks of clients of a remote network-based virtualized data store service that provides remote storage services on a provider network for the clients, wherein the provider network is distinct from the given client network, wherein the given client network connects one or more devices to the data store gateway, wherein the data store gateway is configured to provide a storage gateway between the given client network and the remote network-based virtualized data store service, wherein the program instructions are further executable to cause the data store gateway to:
receive a plurality of data units from the one or more devices connected to the data store gateway via the given client network of a given client of the remote network-based virtualized data store service;
generate fingerprints for the received plurality of data units, wherein each fingerprint uniquely identifies a respective data unit of the received plurality of data units;
send the fingerprints to the remote network-based virtualized data store service, wherein the remote network-based virtualized data store service maintains a data store of the plurality of data units;
receive, from the remote network-based virtualized data store service, an indication of one or more of the plurality of data units that are to be stored to the data store of the remote network-based virtualized data store service; and
send, to the remote network-based virtualized data store service, the one or more of the plurality of data units for storage by the remote network-based virtualized data store service to the data store.
Abstract
Methods and apparatus for uploading data from a sender to a receiver. A data deduplication technique is described that may reduce the bandwidth used in uploading data from the sender to the receiver. In the technique, the receiver, rather than the sender, maintains a fingerprint dictionary for previously uploaded data. When a sender has additional data to be uploaded, the sender extracts fingerprints for units of the data and sends the fingerprints to the receiver. The receiver checks its fingerprint dictionary to determine which data units need to be uploaded and notifies the sender of the identified units. The sender then sends only the identified units of data to the receiver. The technique may, for example, be applied in virtualized data store systems to reduce bandwidth usage in uploading data.
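The exchange described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `fingerprint`, `Receiver`, `units_needed`, `store_units`, and `upload` are hypothetical, SHA-256 is an assumed choice of fingerprint function, and a single in-memory dictionary stands in for both the receiver's fingerprint dictionary and its data store.

```python
import hashlib


def fingerprint(unit: bytes) -> str:
    """Fingerprint a data unit. SHA-256 is an assumption, not from the source."""
    return hashlib.sha256(unit).hexdigest()


class Receiver:
    """Receiver side: owns the fingerprint dictionary (hypothetical class)."""

    def __init__(self):
        # Maps fingerprint -> stored data unit. One dict stands in for both
        # the fingerprint dictionary and the data store in this sketch.
        self.store = {}

    def units_needed(self, fingerprints):
        """Return the fingerprints whose data units are not yet stored."""
        needed = []
        for fp in fingerprints:
            if fp not in self.store and fp not in needed:
                needed.append(fp)
        return needed

    def store_units(self, units):
        """Store uploaded units and record their fingerprints."""
        for unit in units:
            self.store[fingerprint(unit)] = unit


def upload(units, receiver):
    """Sender side: send fingerprints first, then only the requested units.

    Returns the number of data-unit bytes actually sent.
    """
    by_fp = {fingerprint(u): u for u in units}
    needed = receiver.units_needed(list(by_fp))  # fingerprints go to receiver
    to_send = [by_fp[fp] for fp in needed]       # receiver's indication
    receiver.store_units(to_send)                # upload only those units
    return sum(len(u) for u in to_send)


receiver = Receiver()
first = upload([b"alpha", b"beta"], receiver)   # nothing stored yet
second = upload([b"beta", b"gamma"], receiver)  # b"beta" is already stored
print(first, second)  # 9 5
```

In the second call only `b"gamma"` crosses the channel, because the receiver already holds a unit with the fingerprint of `b"beta"`. The sender keeps no dictionary of its own, which matches the abstract's point that the receiver, rather than the sender, tracks previously uploaded data.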
21 Citations
20 Claims
1. A device, comprising:
at least one processor; and
a memory comprising program instructions, wherein the program instructions are executable by the at least one processor to implement a data store gateway at a given client network that is distinct from a plurality of other client networks of clients of a remote network-based virtualized data store service that provides remote storage services on a provider network for the clients, wherein the provider network is distinct from the given client network, wherein the given client network connects one or more devices to the data store gateway, wherein the data store gateway is configured to provide a storage gateway between the given client network and the remote network-based virtualized data store service, wherein the program instructions are further executable to cause the data store gateway to:
receive a plurality of data units from the one or more devices connected to the data store gateway via the given client network of a given client of the remote network-based virtualized data store service;
generate fingerprints for the received plurality of data units, wherein each fingerprint uniquely identifies a respective data unit of the received plurality of data units;
send the fingerprints to the remote network-based virtualized data store service, wherein the remote network-based virtualized data store service maintains a data store of the plurality of data units;
receive, from the remote network-based virtualized data store service, an indication of one or more of the plurality of data units that are to be stored to the data store of the remote network-based virtualized data store service; and
send, to the remote network-based virtualized data store service, the one or more of the plurality of data units for storage by the remote network-based virtualized data store service to the data store.
Dependent claims: 2, 3, 4, 5, 6
7. A method, comprising:
receiving, at a data store gateway, a plurality of data units from one or more devices connected to the data store gateway via a client network that is distinct from a plurality of other client networks of clients of a remote network-based virtualized data store service that provides remote storage services on a provider network for the clients, wherein the provider network is distinct from the client network;
generating, at the data store gateway, fingerprints for the plurality of data units, wherein each fingerprint uniquely identifies a respective data unit in the received plurality of data units;
sending, from the data store gateway, the fingerprints to the remote network-based virtualized data store service via a communications channel;
receiving, at the data store gateway and from the remote network-based virtualized data store service via the communications channel, an indication of one or more of the plurality of data units that are to be uploaded to the remote network-based virtualized data store service via the communications channel; and
sending, from the data store gateway via the communications channel to the remote network-based virtualized data store service, the indicated one or more of the plurality of data units.
Dependent claims: 8, 9, 10, 11, 12
13. A non-transitory computer-accessible storage medium storing program instructions which are computer-executable to implement:
receiving, at a data store gateway, a plurality of data units from one or more devices connected to the data store gateway via a client network that is distinct from a plurality of other client networks of clients of a remote network-based virtualized data store service that provides remote storage services on a provider network for the clients, wherein the provider network is distinct from the client network;
generating, at the data store gateway, fingerprints for the received plurality of data units, wherein each fingerprint uniquely identifies a respective data unit in the received plurality of data units;
sending, from the data store gateway, the fingerprints to the remote network-based virtualized data store service via a communications channel;
receiving, at the data store gateway and from the remote network-based virtualized data store service via the communications channel, an indication of one or more of the plurality of data units that are to be uploaded to the remote network-based virtualized data store service via the communications channel; and
sending, from the data store gateway via the communications channel to the remote network-based virtualized data store service, the indicated one or more of the plurality of data units.
Dependent claims: 14, 15, 16, 17, 18, 19, 20
Specification