Efficient Data Structures for Parsing and Analyzing a Document
Abstract
Some embodiments provide a method that parses an unstructured document that includes a number of primitive elements. The method stores the primitive elements in a random order in a first storage. The method stores references to the primitive elements in a second storage in an order based on locations of the primitive elements in the unstructured document. The method receives instructions to perform a document reconstruction operation. The method performs the received instructions without storing any new references to the primitive elements.
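The two-storage arrangement described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the `Glyph` type, its coordinates, and the `read_text` helper are hypothetical names, and list indices stand in for the "references" to primitive elements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Glyph:
    """Hypothetical primitive element: one character and its page position."""
    char: str
    x: float
    y: float


# First storage: primitives in whatever order the parser happened to emit them.
pool = [
    Glyph("l", 20.0, 0.0),
    Glyph("H", 0.0, 0.0),
    Glyph("o", 40.0, 0.0),
    Glyph("e", 10.0, 0.0),
    Glyph("l", 30.0, 0.0),
]

# Second storage: references (here, indices into the pool) ordered by the
# location of each primitive in the document (top to bottom, left to right).
order = sorted(range(len(pool)), key=lambda i: (pool[i].y, pool[i].x))


def read_text(pool, order):
    """Walk the existing ordered references; no new references are stored."""
    return "".join(pool[i].char for i in order)


text = read_text(pool, order)
print(order, text)
```

The pool is never reordered or copied; only the small array of indices carries the document order, which is what lets later operations work from the references alone.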
102 Citations
28 Claims
1. A method comprising:
defining a plurality of different processes for analyzing and manipulating a document comprising a plurality of primitive elements; and
defining a storage for data associated with the primitive elements, wherein at least some of the data is stored in a separate memory space from the processes and is shared by at least two different processes, wherein the processes access the data by use of references to the data, wherein the data is not replicated by the processes.
Dependent claims: 2-20
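As a rough illustration of the sharing described in claim 1, the sketch below keeps element data in one store and lets two analysis routines reach it only through references. It assumes that the claimed "processes" can be modeled as ordinary functions in one program; `ElementStore`, `find_gaps`, and `collect_text` are hypothetical names, and integer handles stand in for the references.

```python
class ElementStore:
    """Hypothetical shared storage: callers hold integer references, not copies."""

    def __init__(self):
        self._items = []

    def add(self, item) -> int:
        """Store an item once and return a reference (its index)."""
        self._items.append(item)
        return len(self._items) - 1

    def get(self, ref: int):
        """Resolve a reference to the single stored item."""
        return self._items[ref]


store = ElementStore()
refs = [store.add({"char": c, "x": x}) for c, x in [("a", 0), ("b", 12), ("c", 50)]]


def find_gaps(store, refs, threshold=20):
    """Analysis 1: references to elements that follow a wide horizontal gap."""
    return [
        b
        for a, b in zip(refs, refs[1:])
        if store.get(b)["x"] - store.get(a)["x"] > threshold
    ]


def collect_text(store, refs):
    """Analysis 2: reads the same stored elements through the same references."""
    return "".join(store.get(r)["char"] for r in refs)


gaps = find_gaps(store, refs)
text = collect_text(store, refs)
print(gaps, text)
```

Both routines return or consume references only, so the element data exists in exactly one place regardless of how many analyses run over it.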
21. A computer readable medium storing a computer program for execution by at least one processor, the computer program comprising sets of instructions for:
parsing a document comprising a plurality of primitive elements;
storing the primitive elements in a random order in a first storage;
storing references to the primitive elements in a second storage in an order based on locations of the primitive elements in the document;
receiving instructions to perform a document reconstruction operation; and
performing the received instructions without storing any new references to the primitive elements.
Dependent claims: 22, 23
24. A method comprising:
defining a first module for (i) parsing a document comprising a plurality of primitive elements and (ii) storing the primitive elements in a random order in a first storage;
defining a second module for (i) allocating memory in a second storage for storing references to the randomly-ordered primitive elements and (ii) storing the references in a particular order in the allocated memory;
defining a third module for storing a data structure that references a portion of the ordered references, the data structure comprising only a reference to a first one of the ordered references and a count value; and
defining a fourth module for (i) receiving instructions to perform document reconstruction operations and (ii) identifying which of the first, second, and third modules is required to perform the document reconstruction operations while minimizing memory and computation usage.
Dependent claims: 25-28
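The compact structure recited for the third module in claim 24, holding only a reference to a first ordered reference plus a count, might look like the following sketch. The names `RefRange`, `resolve`, and `merge` are hypothetical, the string "elements" are placeholder data, and positions in a Python list stand in for references into the ordered array.

```python
from typing import NamedTuple


class RefRange(NamedTuple):
    """Hypothetical range over the ordered references: a start and a count only."""
    first: int  # position of the first reference in the ordered array
    count: int  # number of consecutive ordered references covered


pool = ["world", "Hello", "again", "there"]  # first storage, arbitrary order
ordered_refs = [1, 3, 0, 2]                  # second storage, document order


def resolve(r, ordered_refs, pool):
    """Follow the range through the ordered references to the elements."""
    return [pool[i] for i in ordered_refs[r.first:r.first + r.count]]


def merge(a, b):
    """Join two adjacent ranges; only the start and count change."""
    if a.first + a.count != b.first:
        raise ValueError("ranges are not adjacent")
    return RefRange(a.first, a.count + b.count)


line1 = RefRange(first=0, count=2)
line2 = RefRange(first=2, count=2)
paragraph = merge(line1, line2)
print(resolve(line1, ordered_refs, pool), resolve(paragraph, ordered_refs, pool))
```

Because a group such as a line or paragraph is described by two integers, regrouping elements during reconstruction touches neither the pool nor the ordered reference array.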
Specification