Intelligent caching

US 9,507,718 B2
Filed: 04/16/2013
Issued: 11/29/2016
Est. Priority Date: 04/16/2013
Status: Active Grant

Abstract

Disclosed are methods, systems, paradigms and structures for managing cache memory in computer systems. Certain caching techniques anticipate queries and cache the data that may be required by the anticipated queries. The queries are predicted based on previously executed queries. The features of the previously executed queries are extracted and correlated to identify a usage pattern of the features. The prediction model predicts queries based on the identified usage pattern of the features. The disclosed method includes purging data from the cache based on predefined eviction policies that are influenced by the predicted queries. The disclosed method supports caching time series data. The disclosed system includes a storage unit that stores previously executed queries and features of the queries.
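
The extraction and correlation steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the query schema (`table`, `fields`, `date_range`), the function names, and the sample history are all hypothetical, and the correlation is reduced to a simple frequency count. The window length is used as one possible reading of a "derived value": the literal dates differ from query to query, while the derived window length repeats.

```python
from collections import Counter
from datetime import date


def extract_features(query: dict) -> set:
    """Return (feature name, value) pairs for one query (hypothetical schema)."""
    start, end = query["date_range"]
    features = {
        ("table", query["table"]),
        # Derived value: the window length, not the literal start/end dates.
        ("window_days", (end - start).days),
    }
    features.update(("field", name) for name in query["fields"])
    return features


def usage_pattern(queries: list) -> dict:
    """Map each feature to the fraction of queries in which it appears."""
    counts = Counter()
    for query in queries:
        counts.update(extract_features(query))
    return {feature: n / len(queries) for feature, n in counts.items()}


# Made-up history: two 7-day queries on different dates, one 1-day query.
history = [
    {"table": "page_views", "fields": ["country"],
     "date_range": (date(2013, 4, 1), date(2013, 4, 8))},
    {"table": "page_views", "fields": ["country"],
     "date_range": (date(2013, 4, 2), date(2013, 4, 9))},
    {"table": "clicks", "fields": ["ad_id"],
     "date_range": (date(2013, 4, 8), date(2013, 4, 9))},
]
pattern = usage_pattern(history)
```

In this made-up history the two `page_views` queries share no literal date, yet both contribute to the same `("window_days", 7)` feature, which is what lets a frequency count pick up the pattern.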

19 Claims

1. A method comprising:
- generating, at a computer system and in response to receiving a specified query from a client, a first read request for obtaining a result of the specified query from a storage system;
  
  extracting, from a plurality of candidate queries and at the computer system, a plurality of features of the candidate queries, wherein the features are characteristics of a query;
  
  correlating the features of each of the candidate queries to identify a usage pattern of the features, the correlating including:
  
  identifying a specified feature of the features based on a derived value of the specified feature, the derived value being derived from an actual value of the specified feature, the actual value and the specified feature specified in one or more of the candidate queries;
  
  predicting, based on the usage pattern of the features, a set of queries to be received at the computer system in the future;
  
  executing a query of the set of queries to obtain data corresponding to the query from the storage system, the data including time series data, wherein the data is stored at a first granularity level in the storage system, wherein executing the query includes:
  
  generating a second read request to obtain the data corresponding to the query,
  
  combining the first read request and the second read request to generate a combined read request, and
  
  executing the combined read request at the storage system to obtain the result of the specified query and the data corresponding to the query;
  
  determining, based on the predicting, a second granularity level at which the data is to be cached, the second granularity level being different from the first granularity level;
  
  processing the data from the first granularity level to the second granularity level to generate processed data; and
  
  updating a cache of the computer system with the processed data, the updating to be performed before any of the set of queries is received at the computer system.
  - 2. The method of claim 1 further comprising:
    - retrieving, in response to a specific query received at the computer system from a client, specific data from the cache; and
      
      sending the specific data to the client.
  - 3. The method of claim 1, wherein the features of the candidate queries include at least one of (a) a select clause with data fields indicating data to be retrieved, (b) a table name, (c) a filter value in a query, (d) a window-size of data requested, or (e) a date-range of the data requested.
  - 4. The method of claim 1, wherein the candidate queries include previously executed queries.
  - 5. The method of claim 1, wherein correlating the features includes:
    - identifying features based on a frequency of appearance of the features in the previously received queries.
  - 6. The method of claim 1, wherein predicting the future queries includes:
    - determining a probability of appearance of one or more correlated features in the future queries, and
    - determining the one or more correlated features whose probability exceeds a predefined threshold, as predicted features appearing in the future queries.
  - 7. The method of claim 6, wherein updating the cache with the data responsive to the future queries includes:
    - obtaining, from a storage unit, new data corresponding to the predicted features, and
    - writing the new data into the cache.
  - 8. The method of claim 7, wherein obtaining the new data corresponding to the predicted features includes:
    - combining a first read request with a second read request to generate a combined request, the first read request for obtaining data corresponding to a current query, the second read request for obtaining the new data corresponding to the predicted features, and
    - obtaining, from the storage unit, the data and the new data in response to the combined request.
  - 9. The method of claim 7, wherein updating the cache includes:
    - determining whether the cache has sufficient space to store the new data, and
    - responsive to a determination that the cache does not have sufficient space to store the new data, purging a portion of existing data from the cache based on a predefined data eviction policy.
  - 10. The method of claim 9, wherein the predefined data eviction policy includes purging the portion of existing data based on at least one of (a) a weighted least recently used basis, (b) a determination of whether the portion of existing data in the cache will be requested by future queries, or (c) access pattern of the portion of existing data over a predefined duration.
  - 11. The method of claim 7, wherein writing the new data into the cache includes processing the new data obtained from the storage unit to a form required by the future queries.
  - 12. The method of claim 11, wherein processing the new data to a form required by the future queries includes aggregating the new data obtained from the storage unit.
  - 13. The method of claim 1, wherein extracting the features from candidate queries includes:
    - storing previously executed queries in a storage unit of the computer system based on a pre-defined query selection policy; and
      
      storing the features of each of the previously executed queries.
  - 14. The method of claim 13, wherein the pre-defined query storing policy includes storing at least one of (a) queries received over a pre-defined duration, (b) queries received by a particular application, (c) queries received for a particular application, or (d) a particular type of queries received.
  - 15. The method of claim 1, wherein the candidate queries include at least one of (a) a first type of queries that retrieve cacheable data, or (b) a second type of queries that retrieve uncacheable data.
  - 16. The method of claim 15, wherein correlating the features of the candidate queries includes correlating the features of candidate queries of both the first type and the second type.
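
Three of the dependent claims describe mechanisms that are easier to follow in code: thresholding feature probabilities (claim 6), aggregating new data before it is cached (claim 12), and evicting with regard to predicted queries (claims 9 and 10). The sketch below is illustrative only and assumes several things the claims do not specify: every name is hypothetical, aggregation is a plain sum over fixed-width time buckets, and eviction is ordinary least-recently-written order that skips entries flagged as predicted. It does not attempt the weighted variant listed as option (a) of claim 10.

```python
from collections import OrderedDict, defaultdict


def predict_features(pattern: dict, threshold: float) -> set:
    """Keep the features whose estimated probability exceeds the threshold."""
    return {feature for feature, p in pattern.items() if p > threshold}


def aggregate(points: list, bucket_seconds: int) -> list:
    """Sum (timestamp, value) points into coarser fixed-width time buckets."""
    buckets = defaultdict(float)
    for timestamp, value in points:
        buckets[timestamp - timestamp % bucket_seconds] += value
    return sorted(buckets.items())


class PredictiveCache:
    """Bounded cache that prefers to evict entries no predicted query needs."""

    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.entries = OrderedDict()  # oldest entry first
        self.predicted = set()        # keys expected by predicted queries

    def put(self, key, data, predicted: bool = False) -> None:
        if key in self.entries:
            del self.entries[key]     # re-insert below as the newest entry
        while len(self.entries) >= self.capacity:
            self._evict()
        self.entries[key] = data
        if predicted:
            self.predicted.add(key)

    def _evict(self) -> None:
        # Oldest entry that no predicted query is expected to request;
        # fall back to the oldest entry overall if every entry is predicted.
        for key in self.entries:
            if key not in self.predicted:
                victim = key
                break
        else:
            victim = next(iter(self.entries))
        del self.entries[victim]
        self.predicted.discard(victim)


# Per-second points stored at fine granularity, cached per minute.
per_minute = aggregate([(0, 1.0), (30, 2.0), (60, 4.0)], bucket_seconds=60)
```

A caller would typically run `predict_features` on the usage pattern, fetch and `aggregate` the matching data, then `put` it with `predicted=True` so that ordinary traffic does not push it out before the anticipated query arrives.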

17. A method comprising:
- receiving a query at a computer system from a client;
  
  generating, at the computer system, a first read request for obtaining data required to respond to the query;
  
  determining whether data required to respond to the query is in a cache of the computer system;
  
  responsive to a determination that the data is in the cache, retrieving, in response to the query, the data from the cache, the cache containing (a) the data required to respond to the query and (b) new data required to respond to future queries, wherein the future queries are predicted based on a correlation of features of (i) the query and (ii) previously executed queries, wherein the correlation of features is performed by:
  
  identifying a specified feature of the features based on a derived value of the specified feature, the derived value being derived from an actual value of the specified feature, the actual value and the specified feature specified in the query or the previously executed queries, wherein the data and the new data include time series data; and
  
  responsive to a determination that the data is not in the cache,
  
  generating a second read request to obtain the new data,
  
  combining the first read request and the second read request to generate a combined read request, and
  
  executing the combined read request at a storage unit to obtain the data and the new data, wherein the new data is stored at a first granularity level in the storage unit,
  
  determining based on the future queries, a second granularity level at which the new data is to be cached, the second granularity level being different from the first granularity level,
  
  processing the new data from the first granularity level to the second granularity level, and
  
  updating the cache with (1) the data and (2) the new data, the cache updated with the new data before any of the set of queries is received at the computer system; and
  
  sending, in response to the query, the data to the client.
  - 18. The method of claim 17, wherein the new data is processed to the second granularity level before being stored in the cache.
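
The hit and miss paths of claims 17 and 18 can be sketched as a single request handler. This is a simplified illustration under stated assumptions, not the claimed method itself: the cache is a plain dictionary, `CountingStorage`, `handle_query`, `predict_next` and `coarsen` are hypothetical names, and the "combined read request" is modelled as one call that fetches several keys at once.

```python
class CountingStorage:
    """Stand-in for the storage unit; counts the read requests it serves."""

    def __init__(self, rows: dict):
        self.rows = rows
        self.reads = 0

    def read(self, keys: list) -> dict:
        self.reads += 1
        return {key: self.rows[key] for key in keys}


def handle_query(key, cache: dict, storage, predict_next, coarsen):
    """Serve one query, prefetching data for predicted queries on a miss."""
    if key in cache:
        return cache[key]                      # cache hit: no storage access
    # Cache miss: add keys the predictor expects to be requested later.
    extra = [k for k in predict_next(key) if k != key and k not in cache]
    fetched = storage.read([key] + extra)      # one combined read request
    cache[key] = fetched[key]
    for k in extra:
        # New data is processed to the cached granularity before storing.
        cache[k] = coarsen(fetched[k])
    return cache[key]


# Hypothetical data: a query for Monday is usually followed by one for Tuesday.
storage = CountingStorage({"mon": [1, 2], "tue": [3, 4]})
cache = {}
follow_ups = lambda key: ["tue"] if key == "mon" else []

first = handle_query("mon", cache, storage, follow_ups, coarsen=sum)
second = handle_query("tue", cache, storage, follow_ups, coarsen=sum)
```

Here the second query is answered from the cache, so the storage unit sees one read request for two queries, which is the saving the combined request is meant to provide.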

19. An apparatus comprising:
- a computer system having a processor that processes a specified query received from a client, by obtaining data from a storage unit or a cache, the process further configured to generate a first read request for obtaining a result of the specified query from a storage system;
  
  a feature extraction module working in cooperation with the processor to extract, from previously received queries, features of the queries, wherein the features are characteristics of a query;
  
  a feature correlation module to correlate the features of the queries to identify a usage pattern of the features, wherein the feature correlation module is configured to correlate by:
  
  identifying a specified feature of the features based on a derived value of the specified feature, the derived value being derived from an actual value of the specified feature, the actual value and the specified feature specified in one or more of the queries;
  
  a query prediction module to:
  
  predict, based on the usage pattern of the features, a set of queries to be received at the computer system in the future, and
  
  execute a query of the set of queries to obtain data corresponding to the query from the storage system, the data including time series data, wherein the data is stored at a first granularity level in the storage system, wherein executing the query includes:
  
  generating a second read request to obtain the data corresponding to the query,
  
  combining the first read request and the second read request to generate a combined read request, and
  
  executing the combined read request at the storage system to obtain the result of the specified query and the data corresponding to the query, wherein the query determination module is further configured to determine based on the prediction, a second granularity level at which the data is to be cached, the second granularity level being different from the first granularity level; and
  
  a cache updating module to:
  
  process the data from the first granularity level to the second granularity level to generate processed data, and
  
  update the cache with the processed data required to serve the set of queries, wherein the cache module is configured to update the cache before any of the set of queries is received at the computer system.

Current Assignee: Meta Platforms, Inc. (f/k/a Facebook, Inc.)
Original Assignee: Meta Platforms, Inc. (f/k/a Facebook, Inc.)
Inventors: Rash, Samuel; Williamson, Timothy
Primary Examiner(s): Rigol, Yaima
Assistant Examiner(s): Waddy, Jr., Edward

Application Number: US13/864,016
Publication Number: US 20140310470A1
Time in Patent Office: 1,323 Days
Field of Search: 711/126, 709/203, 707/E17.014, 707/740, 707/772, 707/999.003, 707/999.004, 1/1
US Class Current: 1/1

CPC Class Codes
G06F 12/0862: with prefetch
G06F 12/12: Replacement control
G06F 12/123: with age lists, e.g. queue,...
G06F 2212/1016: Performance improvement
G06F 2212/163: Server or database system
G06F 2212/6024: History based prefetching
