Conversion of data representing a document to other formats for manipulation and display
Abstract
A computer implemented method of converting a document in an input format to a document in a different output format is disclosed. The method generally comprises locating data in the input document, grouping data into one or more intermediate format blocks in an intermediate format document, and converting the intermediate format document to the output format document using the intermediate format blocks. Each intermediate format block may be a paragraph, a line, a word, a table, or an image. The input document may be received over a network and the output document is sent over the network. A linked table of contents and/or an index may be generated. A computer executable program may be generated and inserted into the output document for selecting one output format for display. The output document may be displayed by locating sub-page breaks in the document, subdividing the document into sub-pages using the sub-page breaks, locating blocks within each sub-page, and sequentially displaying all or a portion of each block of the sub-pages within display parameters of a display configuration. Tables may be divided to be displayed in more than one display page. The converter may be incorporated in a computer program product for maintaining a repository of input documents in one or more storage formats.
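The grouping step summarized above (locating words, joining words into lines, joining lines into paragraphs) can be illustrated with a minimal sketch. This is not the patented implementation: the `Word` type, the position-based heuristics, and the `y_tolerance` and `max_line_gap` thresholds are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    text: str
    x: float  # left edge of the word (hypothetical page units)
    y: float  # baseline position, increasing down the page


def join_words_into_lines(words: List[Word], y_tolerance: float = 2.0) -> List[List[Word]]:
    """Group words whose baseline is within y_tolerance of the line's first word."""
    lines: List[List[Word]] = []
    for word in sorted(words, key=lambda w: (w.y, w.x)):
        if lines and abs(lines[-1][0].y - word.y) <= y_tolerance:
            lines[-1].append(word)
        else:
            lines.append([word])
    for line in lines:
        line.sort(key=lambda w: w.x)  # restore left-to-right reading order
    return lines


def join_lines_into_paragraphs(lines: List[List[Word]], max_line_gap: float = 15.0) -> List[str]:
    """Merge consecutive lines whose vertical spacing is at most max_line_gap."""
    paragraphs: List[List[str]] = []
    previous_y = None
    for line in lines:
        text = " ".join(w.text for w in line)
        y = line[0].y
        if previous_y is not None and y - previous_y <= max_line_gap:
            paragraphs[-1].append(text)
        else:
            paragraphs.append([text])  # a large gap starts a new paragraph block
        previous_y = y
    return [" ".join(p) for p in paragraphs]
```

Each returned paragraph string stands in for one intermediate format block; a fuller sketch would also carry coordinates and font information on each block.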
26 Claims
-
1. A computer implemented method of converting a first document in a first document file format to a second document in a second document file format different from the first document file format, comprising:
-
locating first document file format data in the first document;
grouping said first document file format data into at least one intermediate document file format block in an intermediate document file format document, including locating words in the first document, joining words into lines, and joining lines into paragraphs, each paragraph being one of said intermediate format blocks;
locating tables, each table being one of said intermediate format blocks; and
converting said intermediate document file format document to the second document in the second document file format using said intermediate document file format blocks. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15)
locating tags in the first document; and
utilizing the tags in locating words, joining words into lines, joining lines into paragraphs, and locating tables.
-
-
3. The computer implemented method of claim 1, wherein each intermediate format block is selected from the group consisting of a word, a line, a paragraph, a table, and an image.
-
4. The computer implemented method of claim 1, wherein each of the first format and second format is selected from the group consisting of portable document format (PDF), rich text format (RTF), hypertext markup language (HTML), extensible markup language (XML), cascading style sheets (CSS), Netscape Layers, linked and separate pages, Tag Image File Format (TIFF), graphics interchange format (GIF), bit map (BMP), Joint Photographic Experts Group (JPEG), MICROSOFT WORD™, WORD PERFECT™, AUTOCAD™, and POWER POINT™.
-
5. The computer implemented method of claim 1, wherein the second format is selected from hypertext markup language (HTML) and rich text format (RTF), comprising:
-
determining coordinates of each intermediate format block;
generating a second format block for each intermediate format block;
generating a second format style sheet for each intermediate format block, coordinates of each second format style sheet match coordinates of corresponding intermediate format block;
mapping an intermediate format block font to second format font to fit second format block into second format style sheet; and
placing each second format block into corresponding second format style sheet.
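As a rough illustration of the steps of claim 5 for an HTML target, the sketch below emits one absolutely positioned element per intermediate format block. The `Block` type, the `FONT_MAP` table, and the use of an inline `style` attribute as the per-block style sheet are assumptions for illustration; the claim does not prescribe them.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Block:
    text: str
    x: float
    y: float
    width: float
    height: float
    font: str


# Hypothetical mapping from source fonts to fonts available in the output format.
FONT_MAP = {
    "Times-Roman": "Times New Roman, serif",
    "Helvetica": "Arial, sans-serif",
}


def block_to_html(block: Block, block_id: int) -> str:
    """Build an HTML element whose style places it at the block's coordinates."""
    font = FONT_MAP.get(block.font, "serif")  # fall back to a generic family
    style = (
        f"position:absolute; left:{block.x:g}px; top:{block.y:g}px; "
        f"width:{block.width:g}px; height:{block.height:g}px; font-family:{font};"
    )
    return f'<div id="b{block_id}" style="{style}">{escape(block.text)}</div>'
```

Calling `block_to_html(Block("A < B", 72, 100, 300, 14, "Helvetica"), 1)` yields a `div` positioned at left 72px, top 100px, with the text escaped for HTML.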
-
-
6. The computer implemented method of claim 1, wherein the second format is an image bitmap format, comprising:
-
generating bitmap of the intermediate format document using intermediate format blocks; and
placing the bitmap into second image document.
-
-
7. The computer implemented method of claim 1, wherein the first document is received over a network and the second document is sent over the network.
-
8. The computer implemented method of claim 7, wherein the network is selected from the group consisting of Internet and an intranet.
-
9. The computer implemented method of claim 8, wherein the receiving and the sending is via electronic mail.
-
10. The computer implemented method of claim 7, further comprising:
locating headings of the first document;
generating a table of contents page containing the headings in the second format, each table of contents heading containing a link to the heading contained in the document; and
placing the table of contents page into the second document.
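The linked table of contents of claim 10 can be sketched as follows for an HTML output. The function name `build_toc`, the `h1`, `h2`, ... anchor naming, and the choice of `<h2>` elements are hypothetical; heading detection itself is assumed to have already happened.

```python
from html import escape
from typing import List, Tuple


def build_toc(headings: List[str]) -> Tuple[str, List[str]]:
    """Return (toc_html, anchored_headings) for a list of heading texts."""
    toc_items = []
    anchored = []
    for i, heading in enumerate(headings, start=1):
        anchor = f"h{i}"  # hypothetical anchor naming scheme
        # Each table of contents entry links to the heading inside the document.
        toc_items.append(f'<li><a href="#{anchor}">{escape(heading)}</a></li>')
        anchored.append(f'<h2 id="{anchor}">{escape(heading)}</h2>')
    toc_html = "<ul>\n" + "\n".join(toc_items) + "\n</ul>"
    return toc_html, anchored
```

The returned `toc_html` would be placed on the table of contents page, and each anchored heading would replace the plain heading in the converted document.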
-
-
11. The computer implemented method of claim 7, wherein said converting said intermediate format document to the second format document is selected from the group consisting of:
-
converting to the second format document in one second format;
converting to the second format document in multiple second formats; and
converting to the multiple second format documents, each in a different second format.
-
-
12. The computer implemented method of claim 11, further comprising:
-
generating a computer executable program for selecting one second format to be displayed; and
inserting the computer executable program into the second document.
-
-
13. The computer implemented method of claim 12, wherein the computer executable program is written in a programming language selected from the group consisting of JAVA, Common Gateway Interface (CGI), Visual Basic, Practical extraction and reporting language (Perl), C, and C++.
-
15. The computer implemented method of claim 7, wherein said generating the table of coordinates comprises:
-
determining gaps extending across the intermediate format document;
creating a macro table having cells corresponding to portions of the intermediate format document outside of said gaps; and
recursively dividing each cell of the macro table by determining gaps extending across the cell until each cell cannot be further divided.
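The recursive division described in claim 15 resembles what the layout-analysis literature calls a recursive XY-cut. The sketch below is one possible reading, not the patented method: blocks are modeled as axis-aligned boxes, a "gap" is an empty band that crosses the whole cell, and the names `Box`, `split_by_gaps`, and `divide` are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def split_by_gaps(boxes: List[Box], axis: int) -> List[List[Box]]:
    """Split boxes into groups separated by empty bands along axis (0 = x, 1 = y)."""
    ordered = sorted(boxes, key=lambda b: b[axis])
    groups = [[ordered[0]]]
    reach = ordered[0][axis + 2]  # far edge covered by the current group
    for box in ordered[1:]:
        if box[axis] > reach:  # nothing overlaps this band: a gap crosses the cell
            groups.append([box])
        else:
            groups[-1].append(box)
        reach = max(reach, box[axis + 2])
    return groups


def divide(boxes: List[Box]) -> list:
    """Recursively divide a cell until no gap splits it; returns nested lists."""
    if not boxes:
        return []
    for axis in (1, 0):  # try horizontal gaps (rows) first, then vertical gaps
        groups = split_by_gaps(boxes, axis)
        if len(groups) > 1:
            return [divide(group) for group in groups]
    return boxes  # leaf cell: cannot be divided further
```

In the nested result, each inner list of boxes is a cell that could not be divided further; the nesting mirrors a macro table whose cells contain sub-tables.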
-
-
14. A computer implemented method of converting a first document in a first format to a second document in a different, second format in hypertext markup language (HTML), comprising:
-
locating data in the first document;
grouping data into at least one intermediate format block in an intermediate format document;
converting said intermediate format document to the second HTML document using said intermediate format blocks;
generating a table of coordinates wherein at least a subset of said coordinates correspond to a coordinate of each intermediate format block; and
placing each intermediate format block on the corresponding coordinate in the table of coordinates.
-
-
16. A computer program product for converting a document in a first document file format to a document in a second document file format different from the first document file format, comprising:
-
computer code that locates first document file format data in the first document;
computer code that groups said first document file format data into at least one intermediate document file format block in an intermediate document file format document, said computer code that groups includes computer code that locates words in the first document, joins words into lines, and joins lines into paragraphs, each paragraph being one of said intermediate format blocks;
computer code that locates tables, each table being one of said intermediate format blocks;
computer code that converts said intermediate document file format document to the second document in the second document file format using said intermediate document file format blocks; and
a computer readable medium that stores the computer codes. - View Dependent Claims (17)
-
-
18. A computer implemented method for displaying a document, comprising:
-
receiving a document for display;
automatically locating sub-page breaks in the received document;
subdividing the received document into sub-pages using the sub-page breaks;
locating blocks within each sub-page; and
sequentially displaying at least a portion of each block of the sub-pages within display parameters of a display configuration, including determining if each block can be displayed within display parameters of the display configuration and dividing a block not within display parameters into portions to be within the display parameters of the display configuration, said dividing a block including:
determining if the block is a table;
if the block is not a table, sequentially displaying each element of the block until all elements of the block are displayed;
if the block is a table:
determining the headings of the table and subset of non-heading columns of the table displayable within the display parameters;
displaying the subset of non-heading columns of all rows of the table; and
continue determining a next subset of non-heading columns of the table displayable within the display parameters and displaying those columns of all rows of the table until all rows and all columns of the table have been displayed. - View Dependent Claims (19, 20)
locating headings of the document;
generating a table of contents page containing the headings, each table of contents heading containing a link to the heading contained in the document; and
placing the table of contents page into the second document.
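The table-splitting steps of claim 18 can be pictured with the sketch below, which partitions a wide table into column subsets that each fit the display. It assumes the headings are the leading columns and are repeated on every display page, and that widths are known integers; `split_table_columns` and its parameters are hypothetical names.

```python
from typing import List


def split_table_columns(col_widths: List[int], heading_cols: int, max_width: int) -> List[List[int]]:
    """Return column-index groups; each group begins with the heading columns."""
    heading = list(range(heading_cols))
    heading_width = sum(col_widths[:heading_cols])
    pages: List[List[int]] = []
    current: List[int] = []
    used = heading_width
    for col in range(heading_cols, len(col_widths)):
        width = col_widths[col]
        # Start a new display page when the next column would not fit.
        if current and used + width > max_width:
            pages.append(heading + current)
            current, used = [], heading_width
        current.append(col)
        used += width
    if current:
        pages.append(heading + current)
    return pages
```

For column widths `[10, 20, 20, 20, 20]` with one heading column and a display width of 55, the sketch produces two pages: columns 0, 1, 2 and then columns 0, 3, 4. All rows of the table would be shown for each group in turn.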
-
-
21. A computer program product for maintaining a repository of first documents in at least one storage document file format, comprising:
-
computer code that receives at least one first document, said at least one first document being in at least one first document file format;
computer code that converts the first documents in the at least one first document file format to storage documents in the at least one storage document file format, said storage document file format containing storage format blocks, said computer code that converts includes computer code that locates words in the first documents, joins words into lines, joins lines into paragraphs, each paragraph being one of said storage format blocks, and locates tables, each table being one of said intermediate format blocks; and
a computer readable medium that stores the computer codes. - View Dependent Claims (22, 23, 24, 25, 26)
computer code that locates keywords in the first documents; and
computer code that generates an index document of the located keywords, the index document containing the keywords, each keyword containing at least one link to the keyword contained in at least one first document.
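A keyword index of the kind described above might be assembled as in the following sketch. The whole-word, case-insensitive match and the `name#keyword` link form are assumptions made for illustration, as is the function name `build_index`.

```python
import re
from collections import defaultdict
from typing import Dict, List


def build_index(documents: Dict[str, str], keywords: List[str]) -> Dict[str, List[str]]:
    """Map each keyword to links into the documents that contain it."""
    index = defaultdict(list)
    for name, text in documents.items():
        for keyword in keywords:
            # Whole-word, case-insensitive match (an assumed matching rule).
            if re.search(rf"\b{re.escape(keyword)}\b", text, flags=re.IGNORECASE):
                index[keyword].append(f"{name}#{keyword.lower()}")  # hypothetical link form
    return dict(index)
```

The resulting mapping could then be rendered as an index document in which each keyword entry carries its list of links.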
-
-
25. The computer program product of claim 21, further comprising:
-
computer code that generates a computer executable program for selecting one second format; and
computer code that inserts the computer executable program into the second document.
-
-
26. The computer program product of claim 21, further comprising:
-
computer code that locates headings of the first documents;
computer code that generates a table of contents page for each first document, the table of contents page containing the headings, each table of contents heading containing a link to the heading contained in the first document; and
computer code that places the table of contents page into the second document.
-
Specification