Partial file restore in a data storage system
First Claim
1. A method of storing files in secondary storage in a data storage system, comprising:
using one or more computing devices comprising computer hardware:
initiating copying of a plurality of files from a primary storage subsystem to a secondary storage subsystem, wherein data stored on the secondary storage subsystem is stored in one or more chunks, and each chunk is a logical data unit for storing the data in the secondary storage subsystem in one or more secondary storage devices residing in the secondary storage subsystem;
copying a first portion of a first file of the plurality of files from the primary storage subsystem to a buffer for writing to the secondary storage subsystem;
creating a first entry in an index for a first chunk of the one or more chunks, the index stored in association with the first chunk in the secondary storage subsystem, the first entry corresponding to the first portion of the first file and comprising:
a first application offset determined by a software application that accessed the first file and that corresponds to the first portion of the first file, wherein the first application offset designates a starting position within the first file of the first portion of the first file to be restored from a secondary copy of the first file in the first chunk stored in the secondary storage subsystem; and
a first secondary storage offset indicating a location of the first portion of the first file within the secondary copy of the first file in the first chunk in the secondary storage subsystem;
copying a second portion of the first file from the primary storage subsystem to the buffer for writing to the secondary storage subsystem;
creating a second entry in the index for the first chunk, the second entry corresponding to the second portion of the first file and comprising:
a second application offset determined by the software application that accessed the first file and that corresponds to the second portion of the first file, wherein the second application offset designates a starting position within the first file of the second portion of the first file to be restored from the secondary copy of the first file in the first chunk stored in the secondary storage subsystem; and
a second secondary storage offset indicating a location of the second portion of the first file within the secondary copy of the first file in the first chunk in the secondary storage subsystem;
writing the first portion from the buffer to the location indicated by the first secondary storage offset and writing the first entry to the first chunk in response to the first portion being written from the buffer; and
writing the second portion from the buffer to the location indicated by the second secondary storage offset and writing the second entry to the first chunk in response to the second portion being written from the buffer,
wherein creation of the secondary copy involves a series of transactions in which data is written to the buffer and then written from the buffer to the secondary storage subsystem, and wherein an amount of data written to the buffer in each transaction is not predetermined.
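The write path recited in this claim can be illustrated with a minimal sketch. It is not the patented implementation: the names `IndexEntry`, `Chunk`, and `back_up_file` are hypothetical, the chunk is modeled as an in-memory byte array, and the application offsets are simply supplied by the caller in place of the software application that accessed the file.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class IndexEntry:
    app_offset: int        # starting position of the portion within the original file
    secondary_offset: int  # location of the portion within the chunk


@dataclass
class Chunk:
    # Hypothetical model: chunk payload and its index are kept together.
    data: bytearray = field(default_factory=bytearray)
    index: List[IndexEntry] = field(default_factory=list)


def back_up_file(portions: List[Tuple[int, bytes]], chunk: Chunk) -> None:
    """Copy (application offset, payload) portions of one file into a chunk.

    Each loop iteration stands for one transaction. Payloads may have any
    length, so the amount of data per transaction is not predetermined.
    """
    for app_offset, payload in portions:
        buffer = bytes(payload)                           # portion copied to the buffer
        entry = IndexEntry(app_offset, len(chunk.data))   # entry created before the write
        chunk.data.extend(buffer)                         # portion lands at entry.secondary_offset
        chunk.index.append(entry)                         # entry recorded once the portion is written


# Usage: a chunk that already holds a 4-byte header (an assumption made so the
# two kinds of offset differ), followed by two portions of unequal size.
chunk = Chunk(data=bytearray(b"HDR0"))
back_up_file([(0, b"abc"), (3, b"defgh")], chunk)
```

After the call, the index holds one entry per portion, and each entry pairs a position in the original file with a position in the chunk.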
Abstract
The data storage system according to certain aspects can implement partial file restore, in which only a portion of the secondary copy of a file is restored. The portion may be designated by one or more application offsets for the file. The system may provide an in-chunk index that maps application offsets to secondary copy offsets. Chunks are logical data units in which secondary copies are stored, and the in-chunk index for a chunk may be stored in secondary storage with the chunk. Because the mapping information may not be provided at a fixed interval, the system can search through the application offsets in the in-chunk index to locate the secondary copy offset corresponding to the application offset(s) that designate the portion. In this manner, the system may use the in-chunk index to restore the designated portion of the secondary copy quickly and efficiently.
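The lookup described in the abstract can be sketched as a search over irregularly spaced index entries. The sketch below is an illustration under stated assumptions, not the system's actual method: the index is modeled as a list of (application offset, secondary copy offset) pairs sorted by application offset, the function names `locate` and `secondary_position` are hypothetical, and data within a portion is assumed to be stored byte for byte (no compression or encryption), so a position inside a portion can be found by adding the distance from the portion's start.

```python
from bisect import bisect_right
from typing import List, Tuple

Entry = Tuple[int, int]  # (application offset, secondary copy offset)


def locate(index: List[Entry], app_offset: int) -> Entry:
    """Return the entry with the greatest application offset <= app_offset.

    Entries are not at a fixed interval, so the position cannot be computed
    by division; a binary search over the application offsets is used instead.
    """
    keys = [a for a, _ in index]
    i = bisect_right(keys, app_offset) - 1
    if i < 0:
        raise ValueError("application offset precedes the first index entry")
    return index[i]


def secondary_position(index: List[Entry], app_offset: int) -> int:
    """Map an application offset to a position in the secondary copy.

    Assumes byte-for-byte storage within each portion (an assumption of this
    sketch, not something the abstract states).
    """
    entry_app, entry_secondary = locate(index, app_offset)
    return entry_secondary + (app_offset - entry_app)


# Usage: four entries at irregular intervals; the secondary offsets are shifted
# by a hypothetical 16-byte header.
index = [(0, 16), (1000, 1016), (1500, 1516), (4200, 4216)]
start = secondary_position(index, 1700)  # read would begin at this chunk position
```

A restore of the portion starting at application offset 1700 would then seek to `start` in the chunk instead of reading the secondary copy from its beginning.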
122 Citations
18 Claims
1. A method of storing files in secondary storage in a data storage system, comprising:
using one or more computing devices comprising computer hardware:
initiating copying of a plurality of files from a primary storage subsystem to a secondary storage subsystem, wherein data stored on the secondary storage subsystem is stored in one or more chunks, and each chunk is a logical data unit for storing the data in the secondary storage subsystem in one or more secondary storage devices residing in the secondary storage subsystem;
copying a first portion of a first file of the plurality of files from the primary storage subsystem to a buffer for writing to the secondary storage subsystem;
creating a first entry in an index for a first chunk of the one or more chunks, the index stored in association with the first chunk in the secondary storage subsystem, the first entry corresponding to the first portion of the first file and comprising:
a first application offset determined by a software application that accessed the first file and that corresponds to the first portion of the first file, wherein the first application offset designates a starting position within the first file of the first portion of the first file to be restored from a secondary copy of the first file in the first chunk stored in the secondary storage subsystem; and
a first secondary storage offset indicating a location of the first portion of the first file within the secondary copy of the first file in the first chunk in the secondary storage subsystem;
copying a second portion of the first file from the primary storage subsystem to the buffer for writing to the secondary storage subsystem;
creating a second entry in the index for the first chunk, the second entry corresponding to the second portion of the first file and comprising:
a second application offset determined by the software application that accessed the first file and that corresponds to the second portion of the first file, wherein the second application offset designates a starting position within the first file of the second portion of the first file to be restored from the secondary copy of the first file in the first chunk stored in the secondary storage subsystem; and
a second secondary storage offset indicating a location of the second portion of the first file within the secondary copy of the first file in the first chunk in the secondary storage subsystem;
writing the first portion from the buffer to the location indicated by the first secondary storage offset and writing the first entry to the first chunk in response to the first portion being written from the buffer; and
writing the second portion from the buffer to the location indicated by the second secondary storage offset and writing the second entry to the first chunk in response to the second portion being written from the buffer,
wherein creation of the secondary copy involves a series of transactions in which data is written to the buffer and then written from the buffer to the secondary storage subsystem, and wherein an amount of data written to the buffer in each transaction is not predetermined.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A data storage system for storing files in secondary storage, comprising:
a storage manager executing on computer hardware and configured to:
initiate copying of a plurality of files from primary storage subsystem to a secondary storage subsystem, wherein data stored on the secondary storage subsystem is stored in one or more chunks, and each chunk is a logical data unit for storing the data in the secondary storage subsystem in one or more secondary storage devices residing in the secondary storage subsystem; and
one or more computing devices comprising computer hardware and configured to:
copy a first portion of the first file of the plurality of files from the primary storage subsystem to a buffer for writing to the secondary storage subsystem;
create a first entry in an index for a first chunk of the one or more chunks, the index stored in association with the first chunk in the secondary storage subsystem, the first entry corresponding to the first portion of the first file and comprising:
a first application offset determined by a software application that accessed the first file and that corresponds to the first portion of the first file, wherein the first application offset designates a starting position within the first file of the first portion of the first file to be restored from a secondary copy of the first file in the first chunk stored in the secondary storage subsystem; and
a first secondary storage offset indicating a location of the first portion of the first file within the secondary copy of the first file in the first chunk in the secondary storage subsystem;
copy a second portion of the first file from the primary storage subsystem to the buffer for writing to the secondary storage subsystem;
create a second entry in the index for the first chunk, the second entry corresponding to the second portion of the first file and comprising:
a second application offset determined by the software application that accessed the first file and that corresponds to the second portion of the first file, wherein the second application offset designates a starting position within the first file of the second portion of the first file to be restored from the secondary copy of the first file in the first chunk stored in the secondary storage subsystem; and
a second secondary storage offset indicating a location of the second portion within the secondary copy of the first file in the first chunk in the secondary storage subsystem;
write the first portion from the buffer to the location indicated by the first secondary storage offset and write the first entry to the first chunk in response to the first portion being written from the buffer; and
write the second portion from the buffer to the location indicated by the second secondary storage offset and write the second entry to the first chunk in response to the second portion being written from the buffer,
wherein creation of the secondary copy involves a series of transactions in which data is written to the buffer and then written from the buffer to the secondary storage subsystem, and wherein an amount of data written to the buffer in each transaction is not predetermined.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
Specification