Content conversion method and apparatus
Abstract
A method and apparatus for automatically translating documents from one language into another. The method includes comparing sample first and second documents that represent the same or similar ideas and are written in a first and a second language respectively, and creating a database that associates words in the first language with the words in the second language that translate them. The method can also include translating a document from a first language into a second language. This includes parsing the document in the first language into words or segments, finding words or segments in the second language that correspond to selected words or segments in the first language, and finding words or segments in the second language that correspond to combinations of words or segments in the first language.
20 Claims
1. A method for converting content comprising the steps of:
receiving content expressed in a first state;
parsing said content expressed in a first state into at least a first segment and a second segment, said first segment having a first portion, said second segment having a second portion, said first portion and said second portion having overlapping portions of said content;
accessing a third segment of said content expressed in a second state, said third segment corresponding to one of said first and second segments;
accessing a fourth segment of said content expressed in the second state, said fourth segment corresponding to the other one of said first and second segments and having an overlapping portion with said third segment;
determining said content expressed in the second state based on combining said third and fourth segments; and
providing said content expressed in said second state.
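The combining step of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the lookup table `SEGMENT_TABLE`, the helper names, and the `split_at`/`overlap` parameters are all hypothetical, and the "states" are assumed to be English and Spanish word sequences.

```python
# Hypothetical lookup table: first-state segment -> second-state segment.
SEGMENT_TABLE = {
    "I want to buy": "quiero comprar",
    "to buy a car": "comprar un coche",
}


def merge_on_overlap(left, right):
    """Join two segments where the end of `left` repeats the start of `right`."""
    a, b = left.split(), right.split()
    # Try the longest possible overlap first, then shorter ones.
    for size in range(min(len(a), len(b)), 0, -1):
        if a[-size:] == b[:size]:
            return " ".join(a + b[size:])
    return None  # no shared words, so the segments cannot be combined


def convert(content, split_at, overlap, table=SEGMENT_TABLE):
    """Parse content into two overlapping segments, look each up, and merge."""
    words = content.split()
    first = " ".join(words[:split_at + overlap])  # first segment
    second = " ".join(words[split_at:])           # second segment, shares `overlap` words
    third, fourth = table.get(first), table.get(second)
    if third is None or fourth is None:
        return None  # one of the segments is not in the table
    return merge_on_overlap(third, fourth)


print(convert("I want to buy a car", split_at=2, overlap=2))
```

Here the first-state segments share "to buy" and the second-state segments share "comprar", so the merged result contains the shared word only once.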
2. A method for creating a content conversion database comprising the steps of:
providing a pair of documents representing the same idea in two different states; and
using said pair of documents to create a database of segment associations between the two different states by parsing segments of a first state and comparing said parsed first state segments to parsed segments of a second state, and by associating an occurrence frequency between parsed segments of the first state and parsed segments of the second state.
(Dependent claims: 3, 4, 5)
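One way to picture the frequency association of claim 2 is a co-occurrence count. The sketch below is an illustration only and makes assumptions the claim does not state: the two documents are aligned line by line, segments are word n-grams of up to `max_len` words, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def ngrams(words, max_len):
    """All contiguous word sequences of length 1..max_len, as strings."""
    return [
        " ".join(words[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(words) - n + 1)
    ]


def build_association_db(first_doc, second_doc, max_len=2):
    """Count how often each first-state segment co-occurs with each second-state segment.

    Both documents are lists of lines; line i of one is assumed to express
    the same idea as line i of the other.
    """
    db = defaultdict(Counter)
    for src_line, tgt_line in zip(first_doc, second_doc):
        src_segments = set(ngrams(src_line.split(), max_len))
        tgt_segments = set(ngrams(tgt_line.split(), max_len))
        for segment in src_segments:
            db[segment].update(tgt_segments)
    return db


db = build_association_db(
    ["blue car", "blue house"],
    ["coche azul", "casa azul"],
)
print(db["blue"].most_common(1))
```

Segments that recur together across lines accumulate higher counts, which is the frequency signal the claim associates with each pairing.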
6. A method of creating a database comprising the steps of:
providing one or more pairs of documents representing the same idea in two or more states;
selecting at least a first and a second occurrence of a chosen segment in the first state, the chosen segment having a plurality of occurrences in the documents in the first state;
selecting at least a first range and a second range in the second state documents, wherein the first and second ranges broadly correspond to the first and second selected segment occurrences in the first state;
comparing segments in the first range and the second range and locating segments common to both ranges;
storing located common segments in said database; and
associating in said database located common segments with the chosen segment, ranked by frequency of occurrence.
(Dependent claims: 7, 8, 9)
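The range comparison in claim 6 can be sketched as follows. This is a rough illustration under stated assumptions, not the claimed method itself: a "range" is taken to be a window of `radius` words around the proportionally corresponding position in the second-state text, segments are n-grams of up to `max_len` words, and every function name is hypothetical.

```python
from collections import Counter


def find_occurrences(words, segment):
    """Start indices at which the word list `segment` occurs in `words`."""
    n = len(segment)
    return [i for i in range(len(words) - n + 1) if words[i:i + n] == segment]


def range_for(position, src_len, tgt_words, radius):
    """Window of second-state words around the proportionally matching position."""
    centre = round(position * len(tgt_words) / src_len)
    return tgt_words[max(0, centre - radius):centre + radius + 1]


def segments_in(words, max_len):
    """Set of all n-grams (1..max_len words) in a window."""
    return {
        " ".join(words[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(words) - n + 1)
    }


def common_segments(src_text, tgt_text, chosen, radius=1, max_len=2):
    """Second-state segments shared by at least two ranges, most frequent first."""
    src, tgt = src_text.split(), tgt_text.split()
    counts = Counter()
    for position in find_occurrences(src, chosen.split()):
        counts.update(segments_in(range_for(position, len(src), tgt, radius), max_len))
    return [(seg, n) for seg, n in counts.most_common() if n >= 2]


print(common_segments(
    "the blue car and the blue house",
    "el coche azul y la casa azul",
    "blue",
))
```

The returned list is what would be stored against the chosen segment; a wider `radius` tolerates looser alignment at the cost of more spurious common segments.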
10. A method for translating idea content from a first state to a second state comprising the steps of utilizing a database of segment associations between content in said first state and said second state to convert the content of the document in a first state into the document of a second state, wherein said conversion includes examining segments of content in said first state and segments of content in said second state, and removing similar segments from said examined first state content and said examined second state content, and associating the content of said first state content with said second state content after removal of similar segments.
11. A method of converting a document, the method comprising the steps of:
providing content comprising data segments in a first state associated with data segments in a second state;
selecting the largest delimited portion of the document to be translated that begins with the first segment of the document and exists in a database;
retrieving from the database a segment in the second state associated with the located first segment in the first state;
selecting at least a second delimited portion in the first state that has one or more overlapping segments with the previous delimited segment in the first state;
retrieving from the database a segment in the second state associated with the located second segment in the first state;
returning, if the two data segments in the first state have overlapping content, a single data segment in the first state;
returning, if the two data segments in the second state have overlapping content, a single data segment in the second state; and
associating said single data segment in said first state with said single data segment in said second state, thereby returning a conversion of said single data segment from said first state to said second state.
(Dependent claims: 12, 13, 14, 15, 16)
17. A method of converting a document, the method comprising the steps of:
(a) providing content comprising data segments in a first state associated with data segments in a second state;
(b) selecting the largest delimited segment of the document to be translated that begins with the first word of the document and exists in a database;
(c) retrieving from the database a data segment in the second language associated with the located data segment in the first language;
(d) selecting at least a second delimited segment in the first language that exists in the database and has one or a plurality of overlapping words with the previous delimited segment in the first language;
(e) retrieving from the database a data segment in the second language associated with the located data segment in the first language; and
(f) combining the two segments in the second language to form a translation if the two data segments have an overlapping word or plurality of words, and repeating steps (e) and (f) if the two data segments do not have an overlapping word or plurality of words until a data segment is located with an overlapping word or plurality of words.
(Dependent claims: 18)
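Steps (b) through (f) of claim 17 describe a greedy, overlap-checked lookup loop, which might be sketched as below. The database is assumed to be a plain dictionary from first-language segments to second-language segments, the helper names are hypothetical, and the order in which alternative segments are tried is a simplification chosen for the example.

```python
def merge_on_overlap(a, b):
    """Merge word lists when the end of `a` repeats the start of `b`."""
    for size in range(min(len(a), len(b)), 0, -1):
        if a[-size:] == b[:size]:
            return a + b[size:]
    return None


def longest_known_prefix(words, db):
    """End index of the longest segment starting at word 0 that is in `db`."""
    for end in range(len(words), 0, -1):
        if " ".join(words[:end]) in db:
            return end
    return None


def translate(text, db):
    words = text.split()
    end = longest_known_prefix(words, db)           # step (b)
    if end is None:
        return None
    output = db[" ".join(words[:end])].split()      # step (c)
    start = 0
    while end < len(words):
        extended = None
        # Step (d): the next segment must begin inside the previous one
        # (so the sources overlap) and reach past its end.
        for next_start in range(start + 1, end):
            for next_end in range(len(words), end, -1):
                candidate = " ".join(words[next_start:next_end])
                if candidate not in db:
                    continue
                # Steps (e) and (f): retrieve, then merge only on target overlap;
                # otherwise keep trying other candidate segments.
                merged = merge_on_overlap(output, db[candidate].split())
                if merged is not None:
                    extended = (merged, next_start, next_end)
                    break
            if extended is not None:
                break
        if extended is None:
            return None  # no overlapping continuation found
        output, start, end = extended
    return " ".join(output)


db = {
    "I want": "quiero",
    "I want to buy": "quiero comprar",
    "to buy a car": "comprar un coche",
}
print(translate("I want to buy a car", db))
```

Requiring an overlap on both the source and target sides is what lets adjacent segments be stitched together without duplicating or dropping words at the seam.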
19. A computer system for converting content, comprising:
a computing device that receives content expressed in a first state and parses said content into at least a first segment and a second segment, said first segment having a first portion, said second segment having a second portion, said first portion and said second portion having overlapping portions of said content;
wherein said computing device accesses third and fourth segments of said content that are each expressed in a second state, said third segment corresponding to one of said first and second segments, said fourth segment corresponding to another one of said first and second segments and having an overlapping portion with said third segment; and
wherein said computing device determines said content expressed in the second state based on said third and fourth segments having an overlapping portion and provides said content in the second state.
(Dependent claims: 20)
Specification