Scalable storage and processing of hierarchical documents
First Claim
1. A computer-readable medium having computer-executable instructions for processing a data stream embodying a hierarchically structured document by performing acts comprising:
partitioning said data stream into fixed length segments utilizing said hierarchical structure to determine a length of each fixed length segment;
processing said fixed length segments in a pipeline fashion;
creating a new read cache comprising existing unresolved queries and receiving a source data stream, wherein a read cache is utilized to read a fixed length segment of passing stream data;
creating an empty write cache, wherein a write cache is utilized to provide fixed length segments of stream data;
associating said empty write cache with an existing instance of a re-writer, wherein a re-writer is utilized to provide a copy of a fixed length segment of stream data; and
providing a copy of said source data stream as an output data stream from said empty write cache.
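The acts of this claim can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the class names `ReadCache`, `WriteCache` and `ReWriter`, the function `copy_stream`, and the representation of queries as strings are all assumptions made for the example. The claim says the hierarchical structure determines the segment length but does not say how, so the sketch simply takes `segment_length` as a parameter.

```python
import io
from dataclasses import dataclass, field
from typing import BinaryIO, Iterator, List


@dataclass
class ReadCache:
    """Hypothetical read cache: carries unresolved queries and reads
    fixed length segments from the passing source stream."""
    source: BinaryIO
    segment_length: int  # assumed to be derived from the document hierarchy
    unresolved_queries: List[str] = field(default_factory=list)

    def segments(self) -> Iterator[bytes]:
        while True:
            segment = self.source.read(self.segment_length)
            if not segment:  # end of stream
                return
            yield segment  # only the final segment may be shorter


@dataclass
class WriteCache:
    """Hypothetical write cache: an empty one means nothing is rewritten."""
    pending_writes: List[bytes] = field(default_factory=list)


class ReWriter:
    """Hypothetical re-writer: provides a copy of each segment."""

    def __init__(self) -> None:
        self.write_cache = WriteCache()

    def attach(self, write_cache: WriteCache) -> None:
        self.write_cache = write_cache

    def rewrite(self, segments: Iterator[bytes]) -> Iterator[bytes]:
        for segment in segments:
            # The attached write cache is empty here, so each segment
            # is passed through as a plain copy.
            yield bytes(segment)


def copy_stream(source: BinaryIO, segment_length: int,
                queries: List[str], rewriter: ReWriter) -> Iterator[bytes]:
    # New read cache holding the existing unresolved queries.
    read_cache = ReadCache(source, segment_length, list(queries))
    # Empty write cache associated with an existing re-writer instance.
    rewriter.attach(WriteCache())
    # The output stream is a segment-by-segment copy of the source stream.
    return rewriter.rewrite(read_cache.segments())
```

Because `copy_stream` returns a generator, no segment is read from the source until the caller iterates over the output.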
Abstract
Large messages in the form of hierarchically structured documents are processed in a streaming fashion using the ultimate consumer read requests as the driving force for the processing. The messages are partitioned into fixed length segments. The segments are processed in pipeline fashion. This processing chain includes simulating random access of hierarchical documents using stream transformations, mapping streams to a transport's native capabilities, composing streams into chains and using pipeline processing on the chains, staging fragments into a database and routing messages when complete messages have been formed, and providing tools to allow the end user to inspect partial messages.
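One way to picture processing that is driven by the consumer's read requests is a chain of lazy generators, where each read by the final consumer pulls exactly one segment through every stage. The sketch below is only an illustration under that assumption; the names `partition`, `traced` and `compose`, and the stage labels used with them, are hypothetical and do not come from the patent.

```python
import io
from typing import BinaryIO, Callable, Iterator, List

# A stage transforms one stream of segments into another.
Stage = Callable[[Iterator[bytes]], Iterator[bytes]]


def partition(source: BinaryIO, segment_length: int) -> Iterator[bytes]:
    """Split a byte stream into fixed length segments, read on demand."""
    while True:
        segment = source.read(segment_length)
        if not segment:  # end of stream
            return
        yield segment


def traced(name: str, log: List[str]) -> Stage:
    """Build a pass-through stage that records when each segment reaches it."""
    def stage(segments: Iterator[bytes]) -> Iterator[bytes]:
        for index, segment in enumerate(segments):
            log.append(f"{name}:{index}")
            yield segment
    return stage


def compose(segments: Iterator[bytes], stages: List[Stage]) -> Iterator[bytes]:
    """Chain stages so the last one pulls from the one before it."""
    for stage in stages:
        segments = stage(segments)
    return segments
```

With `log = []` and `chain = compose(partition(io.BytesIO(b"abcdefghij"), 4), [traced("decode", log), traced("route", log)])`, nothing runs until the consumer reads. A single `next(chain)` returns `b"abcd"` and leaves `log` as `["decode:0", "route:0"]`, so only the first segment has passed through the chain.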
40 Citations
4 Claims
1. A computer-readable medium having computer-executable instructions for processing a data stream embodying a hierarchically structured document by performing acts comprising:
partitioning said data stream into fixed length segments utilizing said hierarchical structure to determine a length of each fixed length segment;
processing said fixed length segments in a pipeline fashion;
creating a new read cache comprising existing unresolved queries and receiving a source data stream, wherein a read cache is utilized to read a fixed length segment of passing stream data;
creating an empty write cache, wherein a write cache is utilized to provide fixed length segments of stream data;
associating said empty write cache with an existing instance of a re-writer, wherein a re-writer is utilized to provide a copy of a fixed length segment of stream data; and
providing a copy of said source data stream as an output data stream from said empty write cache.
Dependent claims: 2, 3
4. A computer-readable medium having computer-executable instructions for processing a data stream embodying a hierarchically structured document by performing acts comprising:
partitioning said data stream into fixed length segments utilizing said hierarchical structure to determine a length of each fixed length segment;
obtaining a fixed length segment of source stream data from an associated read cache;
consuming said fixed length segment of source stream data;
resolving pending queries on said source stream data;
creating a mutator stream of data from said source stream of data, wherein mutations in said mutator stream are in accordance with write requests cached in an associated write cache;
consuming an output stream of data from said mutator stream of data; and
resolving queries on said output stream;
discarding resolved queries; and
embedding values associated with said resolved queries in said output stream.
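The mutator stream and query resolution in claim 4 can be sketched as below. This is one possible reading, not the patented implementation, and it is heavily simplified: each segment is modeled as a list of hypothetical `(path, value)` records rather than a fixed length run of bytes, the write cache is a plain dictionary of path to new value, and "embedding" resolved values is modeled by yielding them alongside the segment in which they were found. The names `mutate` and `resolve` are assumptions made for the example.

```python
from typing import Dict, Iterable, Iterator, List, Set, Tuple

Record = Tuple[str, str]   # (hierarchical path, value); hypothetical segment item
Segment = List[Record]


def mutate(segments: Iterable[Segment],
           write_cache: Dict[str, str]) -> Iterator[Segment]:
    """Mutator stream: apply cached write requests one segment at a time."""
    for segment in segments:
        yield [(path, write_cache.get(path, value)) for path, value in segment]


def resolve(segments: Iterable[Segment],
            pending: Set[str]) -> Iterator[Tuple[Segment, Dict[str, str]]]:
    """Resolve pending queries against a stream.

    Each answered query is discarded from `pending`, and its value is
    yielded together with the segment that answered it.
    """
    for segment in segments:
        resolved: Dict[str, str] = {}
        for path, value in segment:
            if path in pending:
                resolved[path] = value
                pending.discard(path)  # a resolved query is no longer pending
        yield segment, resolved
```

The same `resolve` helper can be applied to the source stream before mutation and to the output of `mutate` afterwards, matching the two resolution steps in the claim. Queries that are never answered remain in `pending` when the stream ends.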
Specification