Speculative Stream Scanning
First Claim
1. A method for partitioning a data stream into tokens, the method comprising steps of:
receiving the data stream;
setting a partition scanner to a beginning point in the data stream;
identifying likely token boundaries in the data stream using the partition scanner;
partitioning the data stream according to the likely token boundaries as determined by the partition scanner, wherein each partition of the partitioned data stream bounded by the likely token boundaries comprises a chunk; and
passing the chunk to a next available token scanner, one chunk per token scanner, for identifying at least one actual token within each chunk.
Abstract
A system and method for partitioning a data stream into tokens includes steps or acts of: receiving the data stream; setting a partition scanner to a beginning point in the data stream; identifying likely token boundaries in the data stream using the partition scanner; partitioning the data stream according to the likely token boundaries as determined by the partition scanner, wherein each partition of the partitioned data stream bounded by the likely token boundaries comprises a chunk; and passing the chunk to a next available token scanner, one chunk per token scanner, for identifying at least one actual token within each chunk.
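The steps in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes newlines are the "likely token boundaries", uses a thread pool to stand in for the "next available token scanner", and the names `partition_scanner`, `token_scanner`, `scan`, and the `TOKEN_RE` grammar are all hypothetical.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Assumed toy grammar: integers, identifiers, and single punctuation characters.
TOKEN_RE = re.compile(r"\d+|[A-Za-z_]\w*|[^\s\w]")


def partition_scanner(stream: str, delimiter: str = "\n") -> list[str]:
    """Cut the stream into chunks at likely token boundaries.

    Assumption: a boundary is likely just after each delimiter.
    """
    if not delimiter:
        raise ValueError("delimiter must be non-empty")
    chunks = []
    start = 0  # the beginning point in the data stream
    while start < len(stream):
        cut = stream.find(delimiter, start)
        end = len(stream) if cut == -1 else cut + len(delimiter)
        chunks.append(stream[start:end])
        start = end
    return chunks


def token_scanner(chunk: str) -> list[str]:
    """Identify the actual tokens inside one chunk."""
    return TOKEN_RE.findall(chunk)


def scan(stream: str, workers: int = 4) -> list[str]:
    """Partition the stream, then hand each chunk to an available worker."""
    chunks = partition_scanner(stream)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves chunk order, so tokens come out in stream order.
        per_chunk = list(pool.map(token_scanner, chunks))
    return [tok for toks in per_chunk for tok in toks]
```

For example, `scan("x = 1\ny = 22\n")` splits the input into two chunks and tokenizes them independently. In this toy grammar a newline is always a true boundary, so no speculation can fail; a grammar with multi-line tokens would need a verification step.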
17 Claims
1. A method for partitioning a data stream into tokens, the method comprising steps of:
receiving the data stream;
setting a partition scanner to a beginning point in the data stream;
identifying likely token boundaries in the data stream using the partition scanner;
partitioning the data stream according to the likely token boundaries as determined by the partition scanner, wherein each partition of the partitioned data stream bounded by the likely token boundaries comprises a chunk; and
passing the chunk to a next available token scanner, one chunk per token scanner, for identifying at least one actual token within each chunk.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A data scanner system comprising:
an input/output interface that receives input comprising a data stream and generates output comprising tokens;
a partition scanner that receives as input the data stream;
identifies likely token boundaries in the data stream; and
partitions the input data stream into a plurality of chunks according to the likely token boundaries; and
a plurality of token scanners for concurrently processing the plurality of chunks to determine whether the likely token boundaries are non-speculative token boundaries and to provide a stream of tokens representing non-speculative token boundaries.
Dependent claims: 13, 14, 15
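The claim above does not say how a token scanner decides whether a likely boundary is non-speculative. One possible approach is sketched below under loud assumptions: a toy grammar of whitespace-separated words and double-quoted strings that may span lines, chunks cut at newlines, and a boundary treated as confirmed only when the preceding text ends outside a string. The names `scan_chunk` and `scan_stream` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_chunk(chunk: str) -> tuple[list[str], bool]:
    """Tokenize one chunk, assuming it starts at a real token boundary.

    Returns (tokens, clean_end). clean_end is False when the chunk ends
    inside an unterminated double-quoted string, which means the likely
    boundary after this chunk was not an actual token boundary.
    """
    tokens, i, n = [], 0, len(chunk)
    while i < n:
        ch = chunk[i]
        if ch.isspace():
            i += 1
        elif ch == '"':
            close = chunk.find('"', i + 1)
            if close == -1:
                return tokens, False  # string runs past the chunk end
            tokens.append(chunk[i:close + 1])
            i = close + 1
        else:
            j = i
            while j < n and not chunk[j].isspace() and chunk[j] != '"':
                j += 1
            tokens.append(chunk[i:j])
            i = j
    return tokens, True


def scan_stream(chunks: list[str], workers: int = 4) -> list[str]:
    """Scan all chunks concurrently, then confirm or reject each boundary."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        speculative = list(pool.map(scan_chunk, chunks))

    out, pending = [], ""
    for chunk, (tokens, clean_end) in zip(chunks, speculative):
        if pending:
            # The previous boundary was rejected: merge and rescan.
            pending += chunk
            tokens, clean_end = scan_chunk(pending)
        if clean_end:
            out.extend(tokens)  # boundary confirmed, emit the tokens
            pending = ""
        elif not pending:
            pending = chunk  # boundary rejected, hold the text back
    if pending:
        raise ValueError("unterminated string at end of stream")
    return out
```

With the chunks `['a "b\n', 'c" d\n', 'e\n']`, the first boundary falls inside a quoted string, so the first two chunks are merged and rescanned, while the third chunk's concurrently computed result is used as is. The merge step here runs sequentially for simplicity; a fuller design could resubmit merged chunks to the pool.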
16. A computer program product embodied on a computer readable storage medium and comprising code that, when executed, causes a computer to perform the following:
receive a data stream;
set a partition scanner to a beginning point in the data stream;
identify likely token boundaries in the data stream using the partition scanner;
partition the data stream according to the likely token boundaries as determined by the partition scanner, wherein each partition of the partitioned data stream bounded by the likely token boundaries comprises a chunk; and
pass the chunk to a next available token scanner, one chunk per token scanner, to identify at least one actual token within each chunk.
Dependent claims: 17
Specification