System and method for managing files in a distributed system using prioritization
2 Assignments
0 Petitions
Abstract
In a network-connected distributed system including nodes through which digital data flow, one or more of the nodes adapted to process the digital data, a method for efficiently managing the transmission of units of digital data from node to node includes the steps of receiving, at one of the one or more nodes, units of digital data first transmitted by an originating node; queuing, for processing at other nodes, one or more units of the digital data; prioritizing the queued units of digital data for transmission to a next node based on prioritizing information; and updating the prioritizing information according to results of processing performed in and received from the one of the one or more nodes and/or other nodes in the system.
33 Claims
-
1. In a network-connected distributed system comprising a plurality of nodes through which digital data flow, one or more of the nodes adapted to process the digital data, a method for efficiently managing the transmission of units of digital data from node to node, the method comprising the steps of:
-
receiving, at one of the one or more nodes, one or more units of digital data first transmitted by an originating node;
queuing, for processing at other nodes, one or more units of the digital data;
prioritizing the queued units of digital data for transmission to a next node based on prioritizing information; and
updating the prioritizing information according to results of processing performed in and received from at least one of the one of the one or more nodes and other nodes in the system, where the units of digital data comprise queries or files, and wherein the prioritizing step comprises the steps of:
classifying the queued queries or files into categories;
clustering the files, in each of the categories, into similarity clusters;
choosing, for each similarity cluster, one or more representatives; and
determining an order of processing for the one or more representatives.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
determining a file to be probably malicious;
identifying one or more other queued files as being in the same cluster as the file; and
adjusting the prioritizing information in response to the identifying step.
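As a rough illustration of the three steps above, the following Python sketch adjusts the priority of queued files that share a cluster with a file judged probably malicious. The names `QueuedFile` and `boost_cluster_priority`, the numeric `priority` field, and the fixed `boost` amount are all hypothetical. The claim text does not say how, or in which direction, the prioritizing information is adjusted, so raising the priority is an assumption made only for this example.

```python
from dataclasses import dataclass


@dataclass
class QueuedFile:
    name: str
    cluster_id: int
    priority: float = 0.0  # assumption: higher means transmitted sooner


def boost_cluster_priority(queue, malicious, boost=10.0):
    """Raise the priority of queued files in the same cluster as `malicious`.

    Returns the list of files that were adjusted. The boost amount is an
    illustrative assumption, not something the claim specifies.
    """
    adjusted = []
    for f in queue:
        # Identify other queued files in the same similarity cluster.
        if f is not malicious and f.cluster_id == malicious.cluster_id:
            f.priority += boost
            adjusted.append(f)
    return adjusted


queue = [
    QueuedFile("a.doc", cluster_id=1),
    QueuedFile("b.doc", cluster_id=1),
    QueuedFile("c.exe", cluster_id=2),
]
# Suppose a.doc has been determined to be probably malicious.
adjusted = boost_cluster_priority(queue, malicious=queue[0])
```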
-
-
11. The method of claim 1 wherein the choosing step comprises the step of selecting N smallest files in each cluster as the one or more representatives, where N is an integer such as one.
-
12. The method of claim 1 wherein the determining step includes the step of ranking the one or more representatives so that representatives from clusters which contain more queued samples are to be transmitted prior to representatives from clusters which contain fewer queued samples.
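Claims 11 and 12 together describe a simple selection-and-ranking rule, which might be sketched in Python as follows. The tuple layout `(name, size_bytes, cluster_id)` and the function names are hypothetical, and breaking ties by cluster identifier is an assumption; the claims only require that the N smallest files represent each cluster and that representatives of larger clusters go first.

```python
from collections import defaultdict


def choose_representatives(files, n=1):
    """Pick the n smallest files in each cluster.

    files: iterable of (name, size_bytes, cluster_id) tuples.
    Returns {cluster_id: [names of the n smallest files]}.
    """
    clusters = defaultdict(list)
    for name, size, cluster_id in files:
        clusters[cluster_id].append((size, name))
    return {cid: [name for _, name in sorted(members)[:n]]
            for cid, members in clusters.items()}


def order_representatives(files, n=1):
    """Rank representatives so that larger clusters are transmitted first."""
    counts = defaultdict(int)
    for _, _, cluster_id in files:
        counts[cluster_id] += 1
    reps = choose_representatives(files, n)
    ordered = []
    # Descending queued-sample count; ties broken by cluster id (assumption).
    for cid in sorted(reps, key=lambda c: (-counts[c], c)):
        ordered.extend(reps[cid])
    return ordered


files = [
    ("x1", 500, "A"), ("x2", 120, "A"), ("x3", 900, "A"),
    ("y1", 300, "B"),
    ("z1", 50, "C"), ("z2", 40, "C"),
]
order = order_representatives(files)  # ["x2", "z2", "y1"]
```

With N = 1, cluster A (three queued samples) contributes its smallest file first, followed by cluster C (two samples) and then cluster B (one sample).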
-
13. The method of claim 1 wherein the updating step comprises the step of updating the prioritizing information according to results of automatic processing.
-
14. The method of claim 1 wherein the distributed system comprises a computer protection system and the units of digital data comprise samples of undesirable textual messages.
-
15. In a network-connected distributed system comprising a plurality of nodes through which digital data flow, one or more of the nodes adapted to process the digital data, a method for efficiently managing the transmission of units of digital data from node to node, the method comprising the steps of:
-
receiving, at one of the one or more nodes, one or more units of digital data first transmitted by an originating node;
queuing, for processing at other nodes, one or more units of the digital data;
prioritizing the queued units of digital data for transmission to a next node based on prioritizing information; and
updating the prioritizing information according to results of processing performed in and received from at least one of the one of the one or more nodes and other nodes in the system, where the units of digital data comprise queries or files, and wherein the units of digital data comprise queries including a database version of the originating node and a request for an updated version, if available, and wherein the updating step comprises the step of updating the originating prioritizing information of the originating node and/or other nodes of the system that are likely to have older versions.
Dependent claims: 16
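The version-query exchange recited in claim 15 might look roughly like the Python sketch below. Everything here is an assumption made for illustration: the `VersionQuery` type, integer version numbers, and the rule that a node's priority equals the number of versions it is behind. The claim says only that the query carries the originating node's database version and that prioritizing information is updated for nodes likely to have older versions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VersionQuery:
    origin: str      # identifier of the originating node (hypothetical)
    db_version: int  # database version the originating node currently holds


def answer_query(query: VersionQuery, latest_version: int) -> Optional[int]:
    """Return the newer version number if one is available, else None."""
    return latest_version if latest_version > query.db_version else None


def update_priorities(known_versions: dict, latest_version: int) -> dict:
    """Give nodes that are further behind a higher priority.

    known_versions maps node id -> last version that node reported.
    The 'versions behind' scoring rule is an illustrative assumption.
    """
    return {node: max(0, latest_version - version)
            for node, version in known_versions.items()}


known_versions = {"node-a": 3, "node-b": 5, "node-c": 4}
priorities = update_priorities(known_versions, latest_version=5)
reply = answer_query(VersionQuery("node-a", 3), latest_version=5)
```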
-
-
17. In a network-connected distributed system comprising a plurality of nodes through which digital data flow, one or more of the nodes adapted to process the digital data, a method for efficiently managing the transmission of units of digital data from node to node, the method comprising the steps of:
-
receiving, at one of the one or more nodes, units of digital data first transmitted by an originating node;
filtering out sufficiently processed units of the digital data based on filtering information;
transmitting, to at least one of the originating node and other nodes, filtered results relating to the sufficiently processed units;
queuing, for processing at other nodes, unfiltered units of the digital data which are not filtered out;
prioritizing the unfiltered units of digital data for transmission to a next node based on prioritizing information; and
updating the filtering information and the prioritizing information according to results of automatic processing performed in and received from at least one of the one of the one or more nodes and other nodes in the system;
wherein the updating step comprises the step of re-executing at least one of the filtering step and the prioritizing step to apply the updated filtering and prioritizing information to the queued units of the digital data.
Dependent claims: 18, 19, 20, 21, 22, 23
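The filter, queue, prioritize, and re-execute cycle of claim 17 can be sketched in Python under some simplifying assumptions: the filtering information is modeled as a dictionary of already-known results, the prioritizing information as a dictionary of numeric scores, and units as plain strings. None of these representations or function names come from the patent.

```python
def filter_units(units, known_results):
    """Split units into (filtered_results, unfiltered_units).

    A unit counts as 'sufficiently processed' here if a result for it is
    already known; that criterion is an assumption for this sketch.
    """
    results, pending = {}, []
    for unit in units:
        if unit in known_results:
            results[unit] = known_results[unit]
        else:
            pending.append(unit)
    return results, pending


def prioritize(pending, scores):
    """Order pending units by descending score (unknown units score 0)."""
    return sorted(pending, key=lambda unit: -scores.get(unit, 0))


def apply_update(queue, known_results, scores, new_results, new_scores):
    """Merge new information, then re-filter and re-prioritize the queue."""
    known_results.update(new_results)
    scores.update(new_scores)
    results, pending = filter_units(queue, known_results)
    return results, prioritize(pending, scores)


known_results = {}
scores = {"s1": 1, "s2": 5, "s3": 3}
_, queue = filter_units(["s1", "s2", "s3"], known_results)
queue = prioritize(queue, scores)  # ["s2", "s3", "s1"]

# Results arrive from another node: s2 is resolved and s3 becomes urgent.
results, queue = apply_update(queue, known_results, scores,
                              new_results={"s2": "clean"},
                              new_scores={"s3": 9})
```

After the update, `s2` leaves the queue with its result ready to transmit back, and the remaining units are reordered under the new scores.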
-
-
24. A system for efficiently managing the transmission of units of digital data from node to node in a network-connected distributed system comprising a plurality of nodes through which digital data flow, one or more of the nodes adapted to process the digital data, the system comprising:
-
means for receiving, at one of the one or more nodes, one or more units of digital data first transmitted by an originating node;
means for queuing, for processing at other nodes, one or more units of the digital data;
means for prioritizing the queued units of digital data for transmission to a next node based on prioritizing information; and
means for updating the prioritizing information according to results of processing performed in and received from at least one of the one of the one or more nodes and other nodes in the system, where the units of digital data comprise queries or files; and
wherein the prioritizing means comprise:
means for classifying the queued queries or files into categories;
means for clustering the files, in each of the categories, into similarity clusters;
means for choosing, for each similarity cluster, one or more representatives; and
means for determining an order of processing for the one or more representatives.
Dependent claims: 25, 26, 27, 28, 29, 30
means for determining a file to be probably malicious;
means for identifying one or more other queued files as being in the same cluster as the file; and
means for adjusting the prioritizing information in response to the identifying means.
-
-
29. The system of claim 24 wherein the choosing means comprise means for selecting N smallest files in each cluster as the one or more representatives, where N is an integer such as one.
-
30. The system of claim 24 wherein the determining means include means for ranking the one or more representatives so that representatives from clusters which contain more queued samples are to be transmitted prior to representatives from clusters which contain fewer queued samples.
-
31. A system for efficiently managing the transmission of units of digital data from node to node in a network-connected distributed system comprising a plurality of nodes through which digital data flow, one or more of the nodes adapted to process the digital data, the system comprising:
-
means for receiving, at one of the one or more nodes, one or more units of digital data first transmitted by an originating node;
means for queuing, for processing at other nodes, one or more units of the digital data;
means for prioritizing the queued units of digital data for transmission to a next node based on prioritizing information; and
means for updating the prioritizing information according to results of processing performed in and received from at least one of the one of the one or more nodes and other nodes in the system, where the units of digital data comprise queries or files; and
wherein the units of digital data comprise queries including a database version of the originating node and a request for an updated version, if available, and wherein the updating means comprise means for updating the originating prioritizing information of at least one of the originating node and other nodes of the system that are likely to have older versions.
Dependent claims: 32
-
-
33. A system for efficiently managing the transmission of units of digital data from node to node in a network-connected distributed system comprising a plurality of nodes through which digital data flow, one or more of the nodes adapted to process the digital data, the system comprising:
-
means for receiving, at one of the one or more nodes, units of digital data first transmitted by an originating node;
means for filtering out sufficiently processed units of the digital data based on filtering information;
means for transmitting, to at least one of the originating node and other nodes, filtered results relating to the sufficiently processed units;
means for queuing, for processing at other nodes, unfiltered units of the digital data which are not filtered out;
means for prioritizing the unfiltered units of digital data for transmission to a next node based on prioritizing information; and
means for updating the filtering information and the prioritizing information according to results of automatic processing performed in and received from at least one of the one of the one or more nodes and other nodes in the system;
wherein the updating means comprise at least one of means for re-filtering and means for re-prioritizing to apply the updated filtering and prioritizing information to the queued units of the digital data.
-
Specification