Automation tool for web site content language translation
First Claim
1. A method implemented on a computer having at least one processor, storage, and a communication platform for managing language translation, comprising the steps of:
crawling web pages of an origin web site to retrieve content in a first language by following publicly accessible links to additional pages;
identifying at least one translatable component from the content in the first language that is not yet translated into a second language;
automatically scheduling for human translation of the content in the first language that has the at least one translatable component that is not yet translated by storing a universal resource locator (URL) address of the content in a translation list, the scheduling being performed based on a priority associated with the corresponding web page, wherein the priority corresponds to a frequency of accessing the web page;
retrieving content in the first language from the origin web site referenced by the stored URL address via a public network path to obtain newly retrieved content;
extracting one or more translatable components that are not yet translated into the second language from the newly retrieved content in the first language;
receiving human translation in the second language for at least some of the extracted one or more translatable components; and
storing into a database the human translation of the at least some of the one or more extracted translatable components as translated components.
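The scheduling step of the claim can be illustrated with a minimal sketch. It assumes the translation list is an in-memory priority queue keyed on a per-page access count. The names `TranslationList`, `schedule`, `next_url`, and `untranslated`, and the sample URLs and counts, are hypothetical and do not come from the patent; the claim only requires that URLs be stored in a translation list with a priority that corresponds to how often the page is accessed.

```python
import heapq
from typing import Dict, List, Optional, Set, Tuple


class TranslationList:
    """Hypothetical translation list: URLs awaiting human translation,
    ordered so the most frequently accessed page comes out first."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, str]] = []
        self._queued: Set[str] = set()

    def schedule(self, url: str, access_count: int) -> None:
        """Store a URL once, with priority taken from its access frequency."""
        if url in self._queued:
            return
        # heapq is a min-heap, so the count is negated to pop the largest first.
        heapq.heappush(self._heap, (-access_count, url))
        self._queued.add(url)

    def next_url(self) -> Optional[str]:
        """Return the highest-priority URL, or None when the list is empty."""
        if not self._heap:
            return None
        _, url = heapq.heappop(self._heap)
        self._queued.discard(url)
        return url


def untranslated(components: List[str], translated: Set[str]) -> List[str]:
    """Keep only the components that have no stored translation yet."""
    return [c for c in components if c not in translated]


# Assumed crawl result: URL -> (translatable components, access count).
crawled: Dict[str, Tuple[List[str], int]] = {
    "https://example.com/": (["Welcome", "Contact us"], 120),
    "https://example.com/about": (["About us"], 15),
    "https://example.com/faq": (["Welcome"], 300),
}
already_translated = {"Welcome"}

translation_list = TranslationList()
for page_url, (page_components, hits) in crawled.items():
    # Only pages with at least one untranslated component are scheduled.
    if untranslated(page_components, already_translated):
        translation_list.schedule(page_url, hits)

order = []
while (next_page := translation_list.next_url()) is not None:
    order.append(next_page)
```

In this sketch the FAQ page is never scheduled, even though it is the most visited, because its only component already has a translation. The remaining two pages come out in descending order of access count.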
Abstract
A system, method and computer readable medium for providing translated web content is disclosed. The method on an information processing system includes retrieving a first content in a first language and parsing the first content into a plurality of translatable components. The method further includes generating a unique identifier for each of the plurality of translatable components of the first content and queuing the plurality of translatable components and corresponding unique identifiers for translation into a second language. The method further includes, for each of the plurality of translatable components, storing a translated component and an associated unique identifier corresponding to the translatable component, thereby storing a plurality of translated components and corresponding unique identifiers.
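The parse, identify, queue, and store flow in the abstract can be sketched as follows. The abstract does not say how the unique identifier is generated, so this example assumes a SHA-256 hash of the component text. The helper names `parse_components`, `component_id`, and `queue_for_translation`, and the dictionary used as the translation store, are likewise illustrative assumptions rather than anything disclosed in the patent.

```python
import hashlib
from html.parser import HTMLParser
from typing import Dict, List, Tuple


class _TextCollector(HTMLParser):
    """Collects non-empty text nodes as translatable components.
    Simplification: script and style contents are not filtered out."""

    def __init__(self) -> None:
        super().__init__()
        self.components: List[str] = []

    def handle_data(self, data: str) -> None:
        text = data.strip()
        if text:
            self.components.append(text)


def parse_components(html: str) -> List[str]:
    """Parse first-language content into translatable components."""
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    return collector.components


def component_id(text: str) -> str:
    """Assumed identifier scheme: hex SHA-256 digest of the component text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def queue_for_translation(html: str) -> List[Tuple[str, str]]:
    """Pair each component with its identifier, ready to hand to a translator."""
    return [(component_id(text), text) for text in parse_components(html)]


# Queue the components of one page, then store a translation under its identifier.
queue = queue_for_translation("<h1>Hello</h1><p>Contact us</p>")
translation_store: Dict[str, str] = {}
hello_id, hello_text = queue[0]
translation_store[hello_id] = "Bonjour"
```

Keying the store on a content-derived identifier means the same component appearing on several pages maps to a single stored translation; a different identifier scheme would change that behavior.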
8 Claims
1. A method implemented on a computer having at least one processor, storage, and a communication platform for managing language translation, comprising the steps of:
crawling web pages of an origin web site to retrieve content in a first language by following publicly accessible links to additional pages;
identifying at least one translatable component from the content in the first language that is not yet translated into a second language;
automatically scheduling for human translation of the content in the first language that has the at least one translatable component that is not yet translated by storing a universal resource locator (URL) address of the content in a translation list, the scheduling being performed based on a priority associated with the corresponding web page, wherein the priority corresponds to a frequency of accessing the web page;
retrieving content in the first language from the origin web site referenced by the stored URL address via a public network path to obtain newly retrieved content;
extracting one or more translatable components that are not yet translated into the second language from the newly retrieved content in the first language;
receiving human translation in the second language for at least some of the extracted one or more translatable components; and
storing into a database the human translation of the at least some of the one or more extracted translatable components as translated components.
Dependent claims: 2, 3, 4
5. A non-transitory computer-readable medium having information recorded thereon for managing language translation, wherein the information, when read by a computer, causes the computer to perform the following:
crawling web pages of an origin web site to retrieve content in a first language by following publicly accessible links to additional pages;
identifying at least one translatable component from the content in the first language that is not yet translated into a second language;
automatically scheduling for human translation of the content in the first language that has the at least one translatable component that is not yet translated by storing a universal resource locator (URL) address of the content in a translation list, the scheduling being performed based on a priority associated with the corresponding web page, wherein the priority corresponds to a frequency of accessing the web page;
retrieving content in the first language from the origin web site referenced by the stored URL address via a public network path to obtain newly retrieved content;
extracting one or more translatable components that are not yet translated into the second language from the newly retrieved content in the first language;
receiving human translation in the second language for at least some of the extracted one or more translatable components; and
storing into a database the human translation of the at least some of the one or more extracted translatable components as translated components.
Dependent claims: 6, 7, 8
Specification