Method and apparatus for parallel profile matching in a large scale webcasting system
First Claim
1. A method for parallel matching a user profile with desired data, the method comprising:
- partitioning a profile database into sub-partitions having data subsets, the subsets comprising predicates, the predicates used to assert selected properties to information items;
mapping each sub-partition onto one or more processors yielding greatest processing efficiency;
communicating an information item to each processor; and
matching the information item with a corresponding predicate.
1 Assignment
0 Petitions
Accused Products
Abstract
A method and apparatus for efficiently matching a large collection of user profiles against a large volume of data in a webcasting system. The invention generally includes in one embodiment four steps to parallelize the profiles. First, an initial profile set is partitioned into several subsets also referred to as sub-partitions using various heuristic methods. Second, each sub-partition is mapped onto one or more independent processing units. Each processing unit is not required to have equal processing performance. However, for best performance results, subset data should be mapped in one embodiment where the subset with a highest cost is mapped to a fastest processor, and the next highest cost subset mapped to the next fastest processor. Where appropriate, the invention evaluates the relative subset processing speed of each processor and adjusts future subset mapping based upon these evaluations. For each information item I that needs to be matched with a profile predicate, a third and a fourth step are executed. The third step broadcasts I to all processing units, and a fourth step performs a sequential profile match on I.
78 Citations
32 Claims
-
1. A method for parallel matching a user profile with desired data, the method comprising:
-
partitioning a profile database into sub-partitions having data subsets, the subsets comprising predicates, the predicates used to assert selected properties to information items;
mapping each sub-partition onto one or more processors yielding greatest processing efficiency;
communicating an information item to each processor; and
matching the information item with a corresponding predicate. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
if the sub-partition'"'"'s data subsets have a predetermined overlap, then using a means for greedy mapping partitioning the profile database; and
if the data subsets have unacceptable overlap, using another partitioning means for partitioning the profile database.
-
-
7. The method recited in claim 6, using another partitioning means comprising using a means for β
- -Mapping partitioning the profile database.
-
8. The method recited in claim 6, using another partitioning means comprising using a means for clustering partitioning the database.
-
9. The method recited in claim 6, using another partitioning means comprising using a means for incremental clustering partitioning of the database.
-
10. The method recited in claim 6, further comprising:
-
building a profile index, wherein the profile index includes a collection of user profiles including predicates; and
identifying predicates shared by user profiles.
-
-
11. The method recited in claim 10, wherein evaluating the predicates further comprises:
-
dynamically monitoring the evaluation of each predicate;
assigning an evaluation cost to each predicate based upon the monitoring, wherein the cost of a cheap predicate requires less evaluation time and the cost of an expensive predicate requires more evaluation time; and
adjusting the mapping of the predicates to a processor based upon dynamically determined processing cost.
-
-
12. The method recited in claim 11, wherein evaluating a predicate against a data item by:
-
accessing an index for the data items, the data items having contents and the index comprising a list of the contents; and
evaluating a predicate against the list of the contents.
-
-
13. A signal-bearing medium tangibly embodying a program of machine-readable instructions executable by a digital processing apparatus to perform a method for matching a user profile with desired data, said method comprising:
-
partitioning a profile database into sub-partitions having data subsets, the subsets comprising predicates, the predicates used to assert selected properties to information items;
mapping each sub-partition onto one or more processors yielding greatest processing efficiency;
communicating an information item to each processor; and
matching the information item with a corresponding predicate. - View Dependent Claims (14, 15, 16, 17, 18, 19, 20, 21)
if the sub-partitions data subsets have acceptable overlap, then using a greedy clustering means for partitioning the profile database; and
if the data subsets have unacceptable overlap, using another partitioning means for partitioning the profile database.
-
-
19. The signal-bearing medium recited in claim 18, the method further comprising:
-
building a profile index, wherein the profile index includes a collection of user profiles including predicates; and
identifying predicates shared by user profiles.
-
-
20. The signal-bearing medium recited in claim 19, the method further comprising:
-
dynamically monitoring the evaluation of each predicate;
assigning an evaluation cost to each predicate based upon the monitoring, wherein the cost of a cheap predicate requires less evaluation time and the cost of an expensive predicate requires more evaluation time; and
adjusting the mapping of the predicates to a processor based upon dynamically determined processing cost.
-
-
21. The signal-bearing medium recited in claim 20, the method further comprising:
-
accessing an index for the data items, the data items having contents and the index comprising a list of the contents; and
evaluating a predicate against the list of the contents.
-
-
22. An apparatus to match a user profile with desired data, the apparatus comprising:
-
a webcasting system, the system including;
a profile handler;
a profile matcher;
a profile database;
an data item fetcher;
a processor, wherein the processor is capable of executing instructions to;
partitioning a profile database into sub-partitions having data subsets, the subsets comprising predicates, the predicates used to assert selected properties to information items;
mapping each sub-partition onto one or more processors yielding greatest processing efficiency;
communicating an information item to each processor; and
matching the information item with a corresponding predicate. - View Dependent Claims (23, 24, 25, 26, 27, 28, 29, 30, 31)
if the sub-partitions data subsets have acceptable overlap, then using a greedy clustering means for partitioning the profile database; and
if the data subsets have unacceptable overlap, using another partitioning means for partitioning the profile database.
-
-
28. The apparatus recited in claim 27, wherein the processor is further capable of executing instructions for:
if the sub-partitions data subsets have acceptable overlap, then using a greedy clustering means for partitioning the profile database, and if the data subsets have unacceptable overlap, using another partitioning means for partitioning the profile database.
-
29. The apparatus recited in claim 28, wherein the processor is further capable of executing instructions for:
-
building a profile index, wherein the profile index includes a collection of user profiles including predicates; and
identifying predicates shared by user profiles.
-
-
30. The apparatus recited in claim 29, wherein the processor is further capable of executing instructions for:
-
dynamically monitoring the evaluation of each predicate;
assigning an evaluation cost to each predicate based upon the monitoring, wherein the cost of a cheap predicate requires less evaluation time and the cost of an expensive predicate requires more evaluation time; and
adjusting the mapping of the predicates to a processor based upon dynamically determined processing cost.
-
-
31. The apparatus recited in claim 30, wherein the processor is further capable of executing instructions for:
-
accessing an index for the data items, the data items having contents and the index comprising a list of the contents; and
evaluating a predicate against the list of the contents.
-
-
32. A method for matching a user profile with selected data, the method comprising:
-
partitioning a profile database into sub-partitions having data subsets, the subsets comprising predicates, the predicates used to assert selected properties to information items;
mapping each sub-partition onto one or more processors yielding greatest processing efficiency;
communicating an information item to each processor; and
matching the information item with a corresponding predicate;
wherein the profile database is partitioned based upon a cost of each sub-partition, the cost related to system overhead required to process the sub-partition;
wherein a sub-partition is mapped to a one processor; and
wherein a processor to which a sub-partition is mapped is upon a sub-partition'"'"'s cost.
-
Specification