Parallel processing of data sets

US 8,868,470 B2
Filed: 11/09/2010
Issued: 10/21/2014
Est. Priority Date: 11/09/2010
Status: Active Grant

Abstract

Systems, methods, and devices are described for implementing learning algorithms on data sets. A data set may be partitioned into a plurality of data partitions that may be distributed to two or more processors, such as a graphics processing unit. The data partitions may be processed in parallel by each of the processors to determine local counts associated with the data partitions. The local counts may then be aggregated to form a global count that reflects the local counts for the data set. The partitioning may be performed by a data partition algorithm and the processing and the aggregating may be performed by a parallel collapsed Gibbs sampling (CGS) algorithm and/or a parallel collapsed variational Bayesian (CVB) algorithm. In addition, the CGS and/or the CVB algorithms may be associated with the data partition algorithm and may be parallelized to train a latent Dirichlet allocation model.
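
The local-count and global-count flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `local_counts` and `global_count` and the toy data are hypothetical, each document is assumed to be a list of (word, topic) tokens, and worker threads merely stand in for the processors.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def local_counts(partition):
    """Count (topic, word) pairs in one data partition."""
    counts = Counter()
    for document in partition:
        for word, topic in document:
            counts[(topic, word)] += 1
    return counts


def global_count(partitions):
    """Process the partitions concurrently, then aggregate the local counts."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        per_partition = list(pool.map(local_counts, partitions))
    total = Counter()
    for counts in per_partition:
        total.update(counts)  # Counter.update adds counts together
    return per_partition, total


# Hypothetical toy data: two partitions, each a list of documents.
partitions = [
    [[("gpu", 0), ("kernel", 0)], [("topic", 1)]],
    [[("gpu", 0), ("topic", 1), ("topic", 1)]],
]
per_partition, total = global_count(partitions)
```

Each worker touches only its own partition, so the local counts need no locking; the only shared step is the final aggregation.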

20 Claims

1. A method comprising:
- partitioning a data set into a plurality of data partitions, the partitioning including removing dependencies in the data set that require some of the data partitions to be processed sequentially rather than in parallel;
  
  distributing the plurality of data partitions to a plurality of processors, each of the plurality of data partitions being assigned to a single one of the plurality of processors;
  
  processing, by the plurality of processors, each of the plurality of data partitions in parallel; and
  
  synchronizing the plurality of processors to obtain a global record corresponding to the processed data partitions.
- Dependent claims (2, 3, 4, 5, 6):
  - 2. The method as recited in claim 1, further comprising generating, by the plurality of processors, one or more local counts corresponding to each of the data partitions.
  - 3. The method as recited in claim 1, wherein the processing the data partitions in parallel is implemented by one or more algorithms.
  - 4. The method as recited in claim 3, wherein the plurality of data partitions are utilized to train the one or more algorithms.
  - 5. The method as recited in claim 1, wherein:
    - the partitioning is performed by a partition algorithm;
      
      the processing is performed at least in part by one or more graphics processing unit (GPU); and
      
      the distributing is performed by a partition algorithm that balances workloads across the plurality of processors.
  - 6. The method as recited in claim 1, further comprising:
    - storing each of the processed data partitions in association with a respective one of the plurality of processors; and
      
      storing, by a computing device, a single copy of the global record that is shared between each of the plurality of processors.
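
Claim 5 refers to a partition algorithm that balances workloads across the processors but, as listed here, does not say how. One possible approach, shown purely as an assumption, is a greedy heuristic that hands the largest remaining document to the least-loaded processor. The function name `balance_partitions` and the token counts are hypothetical.

```python
import heapq


def balance_partitions(doc_lengths, num_processors):
    """Greedy balancing: largest document first, to the least-loaded processor."""
    # Heap entries are (current load, processor index).
    heap = [(0, p) for p in range(num_processors)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_processors)]
    order = sorted(range(len(doc_lengths)),
                   key=lambda d: doc_lengths[d], reverse=True)
    for doc in order:
        load, proc = heapq.heappop(heap)
        assignment[proc].append(doc)  # each document goes to exactly one processor
        heapq.heappush(heap, (load + doc_lengths[doc], proc))
    return assignment


doc_lengths = [90, 10, 40, 60, 30, 70]  # hypothetical token counts per document
assignment = balance_partitions(doc_lengths, 3)
loads = [sum(doc_lengths[d] for d in docs) for docs in assignment]
```

With these toy lengths every processor ends up with a load of 100 tokens. The greedy heuristic is not optimal in general, but it is cheap and usually keeps loads close.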

7. A method comprising:
- distributing subsets of a plurality of documents of a data set across a plurality of processors, the plurality of documents being partitioned into the subsets to remove dependencies between the plurality of documents, the dependencies between the plurality of documents causing the plurality of documents to be processed sequentially rather than in parallel;
  
  processing, by a particular one of the plurality of processors, a particular subset of the plurality of documents in parallel with the plurality of processors to identify local counts associated with the subset of documents; and
  
  aggregating the local counts from each of the processors to generate a global count that is representative of the data set.
- Dependent claims (8, 9, 10, 11, 12, 13, 14, 15):
  - 8. The method as recited in claim 7, wherein the operations further comprise:
    - processing the subset of documents to identify a topic-word assignment or a topic count associated with the subset of documents; and
      
      determining the topic-word assignment or the topic count using a parallel collapsed Gibbs sampling algorithm.
  - 9. The method as recited in claim 8, wherein the topic-word assignment corresponds to words associated with topics associated with the subset of documents and the topic count corresponds to a number of different topics associated with each document.
  - 10. The method as recited in claim 8, wherein the operations further comprise:
    - dividing words included in the subset of documents into one or more subsets; and
      
      storing, on memory associated with a respective processor, a local copy of each corresponding topic-word assignment or topic count.
  - 11. The method as recited in claim 7, wherein the operations further comprise identifying the local counts and the global count utilizing a collapsed Gibbs sampling algorithm executed on one or more graphics processing unit.
  - 12. The method as recited in claim 7, further comprising partitioning the data set into the plurality of documents in order to remove the dependencies between the plurality of documents.
  - 13. The method as recited in claim 7, wherein the global count represents a total number of topic-word assignments or topic counts associated with the plurality of documents distributed to the plurality of processors.
  - 14. The method as recited in claim 7, wherein each processor is limited to processing a subset of documents distributed to that processor.
  - 15. The method as recited in claim 7, wherein the global count is determined after calculation of the local counts.
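
Claims 7 and 10 describe splitting both the documents and the words into subsets so that processors can work without depending on one another. The claims as listed do not spell out a schedule, so the following is only a sketch under an assumption: documents and words are each cut into P blocks, and in epoch `e` processor `p` handles document block `p` and word block `(p + e) % P`. Under that assumption no two processors touch the same document block or word block in the same epoch. All names are hypothetical.

```python
def diagonal_schedule(num_processors):
    """For each epoch, list the (doc_block, word_block) pair of every processor."""
    P = num_processors
    return [[(p, (p + epoch) % P) for p in range(P)] for epoch in range(P)]


def is_conflict_free(epoch_blocks):
    """True if no two processors share a document block or a word block."""
    doc_blocks = [d for d, _ in epoch_blocks]
    word_blocks = [w for _, w in epoch_blocks]
    return (len(set(doc_blocks)) == len(doc_blocks)
            and len(set(word_blocks)) == len(word_blocks))


schedule = diagonal_schedule(3)
for epoch, blocks in enumerate(schedule):
    print(epoch, blocks, is_conflict_free(blocks))
```

After P epochs every (document block, word block) pair has been visited exactly once, and the processors only need to synchronize between epochs.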

16. A system comprising:
- a plurality of processors; and
  
  memory to store computer-executable instructions that, when executed by one of the plurality of processors, perform operations comprising:
  
  distributing, across the plurality of processors, subsets of documents partitioned from a plurality of documents included in a data set;
  
  determining, by each processor and in parallel with the plurality of processors, an expected local count corresponding to topics or words expected to be identified in the documents distributed to each processor; and
  
  synchronizing, based at least in part on the expected local counts, the plurality of processors to determine variational parameters that represent a distribution of the topics or the words expected to be identified in the plurality of documents.
- Dependent claims (17, 18, 19, 20):
  - 17. The system as recited in claim 16, wherein the distributing, the determining, and the synchronizing are performed by a collapsed variational Bayesian algorithm.
  - 18. The system as recited in claim 16, wherein:
    - at least one of the plurality of processors is a graphics processing unit (GPU); and
      
      the collapsed variational Bayesian algorithm is executed by the GPU.
  - 19. The system as recited in claim 18, wherein the collapsed variational Bayesian algorithm causes a single copy of the variational parameters to be stored in the memory and shared by each of the plurality of processors.
  - 20. The system as recited in claim 16, wherein the expected local count determined by each processor is stored locally on memory associated with a respective processor.
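
The expected local counts of claim 16 differ from hard counts in that each token contributes a probability per topic rather than a single assignment. A minimal sketch, assuming each token carries a hypothetical per-topic probability vector `gamma`, sums those probabilities per processor and then merges them in a synchronization step. It omits the actual collapsed variational Bayesian update and is not the patented implementation.

```python
def expected_local_counts(tokens, num_topics):
    """Sum per-token topic probabilities into expected (topic, word) counts."""
    counts = {}
    for word, gamma in tokens:
        for k in range(num_topics):
            counts[(k, word)] = counts.get((k, word), 0.0) + gamma[k]
    return counts


def synchronize(local_counts_list):
    """Merge the expected local counts of all processors into one shared table."""
    merged = {}
    for counts in local_counts_list:
        for key, value in counts.items():
            merged[key] = merged.get(key, 0.0) + value
    return merged


# Hypothetical tokens per processor: (word, gamma), gamma sums to 1 over 2 topics.
proc0 = [("gpu", [0.75, 0.25]), ("topic", [0.5, 0.5])]
proc1 = [("gpu", [0.25, 0.75])]
local_tables = [expected_local_counts(tokens, 2) for tokens in (proc0, proc1)]
merged = synchronize(local_tables)
```

Because every `gamma` sums to 1, the merged expected counts sum to the total number of tokens, which gives a quick sanity check on the synchronization step.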

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Xu, Ning-Yi; Hsu, Feng-Hsiung; Yan, Feng
Primary Examiner(s): Gaffin, Jeffrey A
Assistant Examiner(s): Bharadwaj, Kalpana
Application Number: US12/942,736
Publication Number: US 20120117008A1
Time in Patent Office: 1,442 Days
Field of Search: 706/12
US Class Current: 706/12
CPC Class Codes: G06F 9/5061 Partitioning or combining o...
