Resource allocation in distributed processing systems

US 10,560,397 B2
Filed: 11/28/2017
Issued: 02/11/2020
Est. Priority Date: 09/29/2014
Status: Active Grant

Abstract

A distributed processing system is disclosed herein. The distributed processing system includes a server, a database server, and an application server that are interconnected via a network, and connected via the network to a plurality of independent processing units. The independent processing units can include an analysis engine that is machine-learning-capable, and thus uniquely completes its processing tasks. The server can provide one or several pieces of data to one or several of the independent processing units, can receive analysis results from these one or several independent processing units, and can update the result based on a value characterizing the machine learning of the independent processing unit.

20 Claims

1. A distributed processing system configured to improve processing speeds, the system comprising:
- a source device configured to provide groups of data, wherein each of the groups of data is associated with one or several user authors, wherein the groups of data together comprise a processing task;
  
  a plurality of independent processing units configured to receive a portion of the processing task, wherein the portion of the processing task comprises one or several of the groups of data, and wherein the independent processing units are configured to characterize one or several aspects of the one or several of the groups of data; and
  
  a server communicatively connected to the source device and the plurality of independent processing units via a network, wherein the server is configured to:
  
  receive the processing task;
  
  identify a plurality of features in some of the groups of data;
  
  generate a preliminary subset from the groups of data, by selecting an attribute identified in at least one of the plurality of features;
  
  calculate a subset measure for the preliminary subset, wherein the subset measure indicates a degree to which the subset is representative of the processing task;
  
  optimize the subset measure by replacing some of the groups of data of the preliminary subset with at least one replacement group of data that increases the subset measure by increasing the efficiency of use of the preliminary subset using at least one contribution factor measuring a contribution of a piece of data within the at least one replacement group; and
  
  provide a final subset, including the at least one replacement group of data that replaces the preliminary subset with the replacement group, thereby increasing the subset measure, to the plurality of independent processing units.
- - 2. The distributed processing system of claim 1, wherein the server is further configured to receive a characterization of at least one group of data within the final subset.
  - 3. The distributed processing system of claim 2, wherein generating the preliminary subset comprises determining a desired size of the preliminary subset.
  - 4. The distributed processing system of claim 3, wherein the server is configured to generate at least one selection attribute for at least some of the groups of data in the preliminary subset.
  - 5. The distributed processing system of claim 4, wherein the server is further configured to generate an attribute vector for at least some of the groups of data of the preliminary subset.
  - 6. The distributed processing system of claim 5, wherein the attribute vector is generated from values indicative of an identification of one or several of the at least one selection attribute in the at least some of the groups of data of the preliminary subset.
  - 7. The distributed processing system of claim 6, wherein the attribute vector comprises a multi-dimensional vector, and wherein the dimensions of the attribute vector correspond with the at least one selection attribute such that each dimension of the attribute vector is associated with a unique one of the at least one selection attribute.
  - 8. The distributed processing system of claim 7, wherein optimizing the subset measure comprises calculating the at least one contribution factor for each of the at least one group of data included in the preliminary subset, wherein each of the at least one contribution factor identifies an effect of the associated group of data on the subset measure; and calculating the at least one contribution factor for some of the groups of data not included in the preliminary subset.
  - 9. The distributed processing system of claim 8, wherein optimizing the subset measure comprises:
    - identifying the group of data in the preliminary subset having a first contribution factor indicating the smallest positive effect on the subset measure; and
      
      identifying the group of data outside of the preliminary subset having a second contribution factor indicating the largest positive effect on the subset measure.
  - 10. The distributed processing system of claim 9, wherein optimizing the subset measure comprises:
    - comparing the first contribution factor and the second contribution factor; and
      
      replacing the group of data in the preliminary subset having the first contribution factor indicating the smallest positive effect on the subset measure with the replacement group of data outside of the preliminary subset having the second contribution factor indicating the largest positive effect on the subset measure when the second contribution factor indicates a greater positive effect than the first contribution factor.
  - 11. The distributed processing system of claim 10, wherein optimizing the subset measure comprises identifying the preliminary subset as optimized when the second contribution factor indicates a lesser positive effect than the first contribution factor.
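
Claims 5 through 7 describe an attribute vector with one dimension per selection attribute. A minimal sketch of one way such a vector could be built follows. The function name `attribute_vector`, the example attribute names, and the use of binary 0/1 values are assumptions made for illustration; the claims only require values "indicative of an identification" of each attribute.

```python
from typing import Iterable, List, Sequence


def attribute_vector(group_attributes: Iterable[str],
                     selection_attributes: Sequence[str]) -> List[int]:
    """Build one vector per group of data (hypothetical helper).

    Each dimension maps to exactly one selection attribute, in the order
    given. A value of 1 means the attribute was identified in the group,
    0 means it was not (binary encoding is an assumption of this sketch).
    """
    if len(set(selection_attributes)) != len(selection_attributes):
        # Claim 7 ties each dimension to a unique selection attribute.
        raise ValueError("selection attributes must be unique")
    found = set(group_attributes)
    return [1 if attr in found else 0 for attr in selection_attributes]


# Illustrative attribute names, not taken from the patent.
selection = ["grammar", "spelling", "style"]
vector = attribute_vector({"spelling", "style"}, selection)
```

Fixing the dimension order up front means vectors from different groups of data can be compared position by position.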

12. A method for distributed processing, the method comprising:
- receiving at a server a processing task, wherein the processing task comprises a plurality of groups of data;
  
  identifying with the server a plurality of features in some of the groups of data;
  
  generating a preliminary subset from the groups of data, by selecting an attribute identified in at least one of the plurality of features;
  
  calculating a subset measure for the preliminary subset, wherein the subset measure indicates a degree to which the subset is representative of the processing task;
  
  optimizing the subset measure by replacing some of the groups of data of the subset with at least one replacement group of data that increases the subset measure by increasing the efficiency of use of the preliminary subset using at least one contribution factor measuring a contribution of a piece of data within the at least one replacement group; and
  
  providing a final subset, including the at least one replacement group of data that replaces the preliminary subset with the replacement group, thereby increasing the subset measure, to a plurality of independent processing units.
- - 13. The method of claim 12, the method further comprising receiving a characterization of at least one group of data within the final subset.
  - 14. The method of claim 13, wherein generating the preliminary subset comprises determining a desired size of the preliminary subset.
  - 15. The method of claim 14, the method further comprising:
    - generating at least one selection attribute for at least some of the groups of data in the preliminary subset; and
      
      generating an attribute vector for at least some of the groups of data of the preliminary subset.
  - 16. The method of claim 15, wherein the attribute vector is generated from values indicative of an identification of one or several of the at least one selection attribute in the at least some of the groups of data of the preliminary subset.
  - 17. The method of claim 16, wherein the attribute vector comprises a multi-dimensional vector, and wherein the dimensions of the attribute vector correspond with the at least one selection attribute such that each dimension of the attribute vector is associated with a unique one of the at least one selection attribute.
  - 18. The method of claim 17, wherein optimizing the subset measure comprises calculating the at least one contribution factor for each of the at least one group of data included in the preliminary subset, wherein each of the at least one contribution factor identifies an effect of the associated group of data on the subset measure; and calculating the at least one contribution factor for some of the groups of data not included in the preliminary subset.
  - 19. The method of claim 18, wherein optimizing the subset measure comprises:
    - identifying the group of data in the preliminary subset having a first contribution factor indicating the smallest positive effect on the subset measure; and
      
      identifying the group of data outside of the preliminary subset having a second contribution factor indicating the largest positive effect on the subset measure.
  - 20. The method of claim 19, wherein optimizing the subset measure comprises:
    - comparing the first contribution factor and the second contribution factor;
      
      replacing the group of data in the subset having the first contribution factor indicating the smallest positive effect on the preliminary subset measure with the replacement group of data outside of the preliminary subset having the second contribution factor indicating the largest positive effect on the subset measure when the second contribution factor indicates a greater positive effect than the first contribution factor; and
      
      identifying the preliminary subset as optimized when the second contribution factor indicates a lesser positive effect than the first contribution factor.
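
The swap procedure recited in claims 18 through 20 (and mirrored in claims 8 through 11) can be sketched as a simple loop. Everything concrete here is an assumption for illustration: the claims do not define the subset measure or the contribution factor, so this sketch uses attribute coverage as a hypothetical measure and marginal change in that measure as a hypothetical contribution factor. The names `subset_measure`, `contribution`, `optimize_subset`, and the `max_rounds` guard are likewise invented, and ties are treated as "optimized", which the claims leave open.

```python
from typing import Dict, FrozenSet, Set

Groups = Dict[str, FrozenSet[str]]  # group id -> attributes identified in it


def subset_measure(subset: Set[str], groups: Groups) -> float:
    """Hypothetical measure: share of the task's attributes the subset covers."""
    all_attrs = set().union(*groups.values())
    covered = set().union(*(groups[g] for g in subset))
    return len(covered) / len(all_attrs) if all_attrs else 0.0


def contribution(group: str, subset: Set[str], groups: Groups) -> float:
    """Hypothetical contribution factor: effect of one group on the measure."""
    if group in subset:  # loss if this group were removed
        return subset_measure(subset, groups) - subset_measure(subset - {group}, groups)
    # gain if this outside group were added
    return subset_measure(subset | {group}, groups) - subset_measure(subset, groups)


def optimize_subset(preliminary: Set[str], groups: Groups,
                    max_rounds: int = 100) -> Set[str]:
    subset = set(preliminary)
    for _ in range(max_rounds):  # guard against cycling (an assumption)
        outside = set(groups) - subset
        if not subset or not outside:
            break
        # Sorting makes tie-breaking deterministic.
        weakest = min(sorted(subset), key=lambda g: contribution(g, subset, groups))
        strongest = max(sorted(outside), key=lambda g: contribution(g, subset, groups))
        first = contribution(weakest, subset, groups)
        second = contribution(strongest, subset, groups)
        if second <= first:
            break  # identified as optimized
        subset = (subset - {weakest}) | {strongest}  # swap one group
    return subset


# Illustrative data, not taken from the patent.
groups: Groups = {
    "g1": frozenset({"grammar"}),
    "g2": frozenset({"grammar"}),
    "g3": frozenset({"spelling", "style"}),
    "g4": frozenset({"style"}),
}
final = optimize_subset({"g1", "g2"}, groups)
```

With this sample data the preliminary subset covers one of three attributes. A single swap replaces a redundant group with `g3`, after which no outside group contributes more than the weakest inside group and the loop stops.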

Current Assignee: Pearson Education Incorporated (Pearson plc)
Original Assignee: Pearson Education Incorporated (Pearson plc)
Inventors: Dronen, Nicholas A.; Foltz, Peter W.; Garner, Holly; Loring, Miles T.; Kapoor, Vishal
Primary Examiner(s): Lai, Michael C
Application Number: US15/824,960
Publication Number: US 20180081718A1
Time in Patent Office: 805 Days
Field of Search: 709224, 709226
CPC Class Codes:
G06F 9/4881   Scheduling strategies for d...
G06F 9/5072   Grid computing
H04L 47/783   Distributed allocation of r...
