Creation of semantic objects for providing logical structure to markup language representations of documents
Abstract
Semantic objects are created that provide a structure for markup language representations of documents. The semantic objects include text runs that are produced from the markup language representation and placed into semantic blocks, which group text runs according to how text is logically structured in the document being represented. The text runs of each semantic block are ordered to correspond to the logical order of the document being represented. The semantic blocks corresponding to each page of the document are likewise ordered to correspond to that logical order. The ordered semantic blocks, including the ordered text runs, are saved as a semantic object, which can then be utilized to make use of the logical structure of the document being represented by the markup language.
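The pipeline summarized in the abstract (extract text runs, group them into semantic blocks, order the runs within each block, then order the blocks on the page) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the `<page>`/`<run>` markup with `x`, `y`, and `text` attributes, the names `TextRun`, `SemanticBlock`, `scan_text_runs`, `group_into_blocks`, and `build_object_model`, and the grouping heuristic (same left edge within `max_indent`, vertical gap within `max_gap`). It also treats each run as one line of text.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class TextRun:
    x: float   # left edge of the run on the page (hypothetical attribute)
    y: float   # vertical position, growing downward (hypothetical attribute)
    text: str


@dataclass
class SemanticBlock:
    runs: list = field(default_factory=list)


def scan_text_runs(markup: str) -> list:
    """Scan the (hypothetical) markup and build one TextRun per <run> element."""
    root = ET.fromstring(markup)
    return [
        TextRun(float(e.get("x")), float(e.get("y")), e.get("text"))
        for e in root.iter("run")
    ]


def group_into_blocks(runs, max_gap=20.0, max_indent=10.0):
    """Assumed heuristic: a run joins a block when it shares the block's
    left edge and sits closely below the block's last run."""
    blocks = []
    for run in sorted(runs, key=lambda r: (r.y, r.x)):
        for block in blocks:
            first, last = block.runs[0], block.runs[-1]
            same_column = abs(run.x - first.x) <= max_indent
            close_below = 0 <= run.y - last.y <= max_gap
            if same_column and close_below:
                block.runs.append(run)
                break
        else:
            # No existing block accepted the run, so it starts a new one.
            blocks.append(SemanticBlock([run]))
    return blocks


def build_object_model(markup: str) -> list:
    """Return the page as ordered blocks, each a list of ordered run texts."""
    blocks = group_into_blocks(scan_text_runs(markup))
    for block in blocks:
        block.runs.sort(key=lambda r: (r.y, r.x))          # order within a block
    blocks.sort(key=lambda b: (b.runs[0].y, b.runs[0].x))  # order on the page
    return [[r.text for r in b.runs] for b in blocks]


# Runs appear in arbitrary order in the markup, as they might after rendering.
SAMPLE = """<page>
  <run x="300" y="115" text="right 2"/>
  <run x="50" y="100" text="left 1"/>
  <run x="50" y="40" text="Title"/>
  <run x="300" y="100" text="right 1"/>
  <run x="50" y="115" text="left 2"/>
</page>"""

model = build_object_model(SAMPLE)
print(model)
```

For the sample page, the sketch yields a title block followed by the left and right columns, each with its lines in reading order. A real implementation would need far richer cues (fonts, spacing, tables, right-to-left text) than the two thresholds assumed here.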
20 Claims
1. A computer-implemented method of constructing an object model reflective of structure of a document, comprising:
scanning a markup language representation of the document to construct a plurality of text runs;
determining which text runs correspond to a same semantic block to thereby create at least one semantic block container;
determining the order of the text lines within each of the at least one semantic blocks;
determining the order of the at least one semantic blocks on a current page; and
saving the semantic blocks including the order of the text lines and the order of the semantic blocks on the page as the object model.
12. A computer readable medium having instructions encoded thereon for performing acts comprising:
obtaining text runs from a markup language representation of a document;
constructing semantic block containers by determining which text runs correspond to the same semantic blocks and placing those text runs corresponding to the same semantic block into the same semantic block container;
ordering the text runs within each semantic block to match a logical order present in the document; and
ordering the semantic blocks within a page to match a logical order present in the document.
15. A computer system, comprising:
storage containing instructions for generating a semantic object and containing a markup language representation of a document;
a processor that implements the instructions to generate the semantic object, wherein implementing the instructions comprises accessing the markup language representation to produce text runs, placing the text runs in corresponding semantic blocks in accordance with a logical structure of the document, ordering the text runs within the semantic blocks in accordance with the logical structure of the document, ordering the semantic blocks to form one or more pages in accordance with the logical structure of the document, and storing the ordered semantic blocks containing the ordered text runs as the semantic object.
Specification