PROCESSING STRUCTURED DATA
Abstract
The present invention provides a fast and efficient way of processing structured data by utilizing an intermediate file to store the structural information. The structured data may be processed into a Binary mask Format (BMF) file which may serve as a starting point for post-processing. A tree structure built on top of the BMF file may be constructed very quickly, and also takes up less space than a DOM tree. Additionally, BMF records may reside entirely in the memory and contain structural information, allowing SAX-like sequential data access.
21 Citations
40 Claims
1-20. (canceled)
21. A method for efficiently processing a structured data file, the method comprising:
receiving the structured data file;
creating an intermediate file, wherein the intermediate file is a binary file having a plurality of cells organized into groupings, wherein each of the groupings of cells constitutes a record;
parsing the structured data file by;
    creating a first record in an intermediate file for an element in the structured data file, wherein the first record further contains one or more descriptors including an offset value identifying a location, within the structured data file, of the element;
    creating a second record in the intermediate file for an attribute name in the structured data file, wherein the second record further contains one or more descriptors including an offset value identifying a location, within the structured data file, of the attribute name; and
    creating a third record in the intermediate file for an attribute value in the structured data file, wherein the third record further contains one or more descriptors including an offset value identifying a location, within the structured data file, of the attribute value; and
transmitting the intermediate file and the structured data file to a component so that the component accesses data from the structured data file using both the intermediate file and the structured data file together.
Dependent claims: 22, 23, 24, 25, 26, 27
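The record-building steps recited in claim 21 can be illustrated with a minimal Python sketch. It is not the patented BMF format: the 9-byte record layout (`RECORD`), the type codes, the regular expressions, and the function names `build_records` and `read_records` are all assumptions made for illustration, and the toy tokenizer handles only start tags with double-quoted attributes. What it does show is the core idea that each record stores an offset into the original document rather than a copy of the text, so a consumer needs both the records and the document to recover the data.

```python
import re
import struct

# Hypothetical record layout (an assumption, not the patent's BMF layout):
# 1-byte record type, 4-byte offset into the document, 4-byte length.
RECORD = struct.Struct("<BII")
ELEMENT, ATTR_NAME, ATTR_VALUE = 1, 2, 3

# Toy tokenizer: start tags whose attributes use double quotes only.
TAG = re.compile(rb'<([A-Za-z_][\w.-]*)((?:\s+[\w.-]+\s*=\s*"[^"]*")*)\s*/?>')
ATTR = re.compile(rb'([\w.-]+)\s*=\s*"([^"]*)"')


def build_records(doc: bytes) -> bytes:
    """Return a binary blob holding one record per element name,
    attribute name, and attribute value found in doc."""
    out = bytearray()
    for tag in TAG.finditer(doc):
        # First record type: the element, located by its offset in doc.
        out += RECORD.pack(ELEMENT, tag.start(1), len(tag.group(1)))
        # Scan only the attribute region of this tag; offsets stay absolute.
        for attr in ATTR.finditer(doc, tag.start(2), tag.end(2)):
            out += RECORD.pack(ATTR_NAME, attr.start(1), len(attr.group(1)))
            out += RECORD.pack(ATTR_VALUE, attr.start(2), len(attr.group(2)))
    return bytes(out)


def read_records(doc: bytes, records: bytes):
    """Walk the records in order and slice each token out of doc,
    using the records and the document together."""
    for kind, offset, length in RECORD.iter_unpack(records):
        yield kind, doc[offset:offset + length]


if __name__ == "__main__":
    document = b'<order id="42"><item sku="A1" qty="2"/></order>'
    blob = build_records(document)
    for kind, token in read_records(document, blob):
        print(kind, token.decode())
```

Because the records hold only offsets and lengths, the blob stays small and can be scanned sequentially, while the token text is read from the original document on demand.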
28. An apparatus for efficiently processing a structured data file, the structured data file including a starting tag, an attribute name, and content, the apparatus comprising:
a peripheral component interface (PCI) interface;
a direct memory access (DMA) engine coupled to the PCI interface;
a text processor coupled to the PCI interface, the text processor configured to;
    receive the structured data file;
    create an intermediate file, wherein the intermediate file is a binary file having a plurality of cells organized into groupings, wherein each of the groupings of cells constitutes a record;
    parse the structured data file by;
        creating a first record in an intermediate file for an element in the structured data file, wherein the first record further contains one or more descriptors including an offset value identifying a location, within the structured data file, of the element;
        creating a second record in the intermediate file for an attribute name in the structured data file, wherein the second record further contains one or more descriptors including an offset value identifying a location, within the structured data file, of the attribute name; and
        creating a third record in the intermediate file for an attribute value in the structured data file, wherein the third record further contains one or more descriptors including an offset value identifying a location, within the structured data file, of the attribute value; and
    transmit the intermediate file and the structured data file to a component so that the component accesses data from the structured data file using both the intermediate file and the structured data file together;
configuration memory coupled to the text processor and to the PCI interface;
a memory controller coupled to the PCI interface;
BMF memory coupled to the DMA engine, the memory controller, and the text processor;
a document buffer coupled to the DMA engine, the memory controller, and the text processor; and
a string cache coupled to the DMA engine, the memory controller, and the text processor.
Dependent claims: 29, 30, 31, 32, 33, 34, 35, 36, 37
38. An apparatus for efficiently processing a structured data file, the apparatus comprising:
means for receiving the structured data file;
means for creating an intermediate file, wherein the intermediate file is a binary file having a plurality of cells organized into groupings, wherein each of the groupings of cells constitutes a record;
means for parsing the structured data file by;
    creating a first record in an intermediate file for an element in the structured data file, wherein the first record further contains one or more descriptors including an offset value identifying a location, within the structured data file, of the element;
    creating a second record in the intermediate file for an attribute name in the structured data file, wherein the second record further contains one or more descriptors including an offset value identifying a location, within the structured data file, of the attribute name; and
    creating a third record in the intermediate file for an attribute value in the structured data file, wherein the third record further contains one or more descriptors including an offset value identifying a location, within the structured data file, of the attribute value; and
means for transmitting the intermediate file and the structured data file to a component so that the component accesses data from the structured data file using both the intermediate file and the structured data file together.
Dependent claims: 39, 40
Specification