Data representation schema translation through shared examples
Abstract
A method for translating data from one representation, or schema, to another. Example data encoded in both schemas is used to generate a translator, which is then used to translate data automatically from one schema to the other. The translator is computed by finding corresponding paths for matched data elements. When new data is presented in one schema, the translator gives the translated path for each data element in that data, and the translated data is constructed from these translated paths. Possible applications in the Internet domain include, but are not limited to, EDI, search engines, content ingestion, content customization, data delivery, and data retrieval. Specific examples are shown for generating a translator and translating data between various schemas, including HTML, XML, and extensions thereto such as SpeechML.
153 Citations
47 Claims
-
1. A method for data representation schema translation, said method comprising the steps of:
identifying one or more shared examples encoded in two data representation schemas; and
automatically generating a translator based on said shared examples, wherein said translator is adapted for translating data between said two data representation schemas.
-
2. The method of claim 1, further comprising the steps of:
parsing data in said one or more shared examples into trees, each tree representing one of said data representation schemas;
generating a path table for said each tree, in response to said parsing step; and
said step of automatically generating the translator comprising the step of generating a translation table from path tables generated for said each tree.
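The parsing, path-table, and translation-table steps recited above can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `path_table` and `translation_table`, the slash-separated path notation, and the two book records are assumptions made for illustration. The sketch pairs paths by looking for the same value in both encodings, which only works if every part of the shared example carries a distinct value.

```python
import xml.etree.ElementTree as ET


def path_table(xml_text):
    """Map each value in the document to the root-to-node path that holds it."""
    table = {}

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        text = (node.text or "").strip()
        if text:
            table[text] = path
        # Attributes are recorded as a pseudo-step "@name" (an assumed notation).
        for name, value in node.attrib.items():
            table[value] = path + "/@" + name
        for child in node:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return table


def translation_table(source_xml, target_xml):
    """Pair source and target paths that carry the same value in the shared example."""
    src = path_table(source_xml)
    dst = path_table(target_xml)
    return {src[value]: dst[value] for value in src if value in dst}


# One shared example, encoded in two hypothetical schemas.
SOURCE_EXAMPLE = "<book><title>Dune</title><author>Herbert</author></book>"
TARGET_EXAMPLE = "<item><by>Herbert</by><name>Dune</name></item>"

table = translation_table(SOURCE_EXAMPLE, TARGET_EXAMPLE)
```

Here `table` maps `/book/title` to `/item/name` and `/book/author` to `/item/by`, even though the two encodings list the fields in a different order.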
-
3. The method of claim 1, where said data representation schema is a hierarchical data representation language selected from the group consisting of:
XML;
HTML; and
SGML.
-
4. The method of claim 3, wherein said shared examples have different values for each attribute and element.
-
5. The method of claim 2, wherein said shared examples are sufficient to cover all paths that may be encountered in any data that needs to be translated between the two schemas.
-
6. The method of claim 2, wherein said step of automatically generating the translator further comprises the step of storing a path in a specific traversal in each of said data representation schemas for each data item in each shared example.
-
7. The method of claim 2, further comprising the steps of:
receiving a data item in a first data representation schema;
identifying the path associated with the data item;
identifying an equivalent path in a second data representation, based on the translation table; and
translating the data item to the second data representation, based on the equivalent path.
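The receive, look up, and translate steps above can be sketched as follows. This is an illustrative sketch only: `TRANSLATION_TABLE` is a hand-written, hypothetical table of the kind that would be derived from shared examples, and the helper names `leaf_paths` and `translate` are not taken from the patent. The sketch also assumes that every target path starts at the same root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical translation table (source path -> equivalent target path).
TRANSLATION_TABLE = {
    "/book/title": "/item/name",
    "/book/author": "/item/by",
}


def leaf_paths(xml_text):
    """Yield (path, text) for every element that carries text."""
    def walk(node, prefix):
        path = prefix + "/" + node.tag
        text = (node.text or "").strip()
        if text:
            yield path, text
        for child in node:
            yield from walk(child, path)

    yield from walk(ET.fromstring(xml_text), "")


def translate(xml_text, table):
    """Rebuild the data in the target schema from the translated paths."""
    root = None
    for src_path, text in leaf_paths(xml_text):
        dst_path = table.get(src_path)
        if dst_path is None:
            continue  # path was not covered by the shared examples
        tags = dst_path.strip("/").split("/")
        if root is None:
            root = ET.Element(tags[0])  # assumes a single shared target root
        node = root
        for tag in tags[1:]:
            child = node.find(tag)
            if child is None:
                child = ET.SubElement(node, tag)
            node = child
        node.text = text
    return "" if root is None else ET.tostring(root, encoding="unicode")


result = translate(
    "<book><title>Emma</title><author>Austen</author></book>",
    TRANSLATION_TABLE,
)
```

New data that was never part of the shared examples is translated purely by path, so `result` is the string `<item><name>Emma</name><by>Austen</by></item>`.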
-
8. The method of claim 7, further comprising the step of matching paths if parts of the path are matched.
-
9. The method of claim 7, further comprising the step of identifying paths by a list of nodes in the path from the root of the data item.
-
10. The method of claim 6, further comprising the step of storing additional path information including one or more of:
a node order among its siblings; and
a node order in siblings of the same type.
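The two kinds of sibling order named above might be recorded as in the following sketch. The row layout (path, order among all siblings, order among same-type siblings, text) and the function name `ordered_paths` are assumptions chosen for illustration, and positions are counted from zero.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def ordered_paths(xml_text):
    """Return one (path, sibling order, same-type sibling order, text) row per element."""
    rows = []

    def walk(node, prefix, position, same_type_position):
        path = prefix + "/" + node.tag
        rows.append((path, position, same_type_position, (node.text or "").strip()))
        seen = Counter()  # how many siblings of each tag have been visited so far
        for index, child in enumerate(node):
            walk(child, path, index, seen[child.tag])
            seen[child.tag] += 1

    walk(ET.fromstring(xml_text), "", 0, 0)
    return rows


rows = ordered_paths("<tr><td>a</td><th>b</th><td>c</td></tr>")
```

The second `td` element is the third child overall but only the second `td`, so its row is `("/tr/td", 2, 1, "c")`. Keeping both numbers lets otherwise identical paths, such as repeated table cells, be told apart.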
-
11. The method of claim 1, wherein said shared examples have a different value for each part.
-
12. The method of claim 1, wherein said data representation schemas are machine readable.
-
13. The method of claim 1, further comprising the step of generating a translator table for translating between said data representation schemas by using common elements of said shared examples.
-
14. The method of claim 1, further comprising the step of copying parts of the data representation in one of said data representation schemas in the translation.
-
15. The method of claim 1, wherein said data representation schema is in the form of a graph, further comprising the step of converting the graph to an equivalent tree.
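The claim does not say how the graph is converted. One possible approach, shown here only as a sketch, is to unfold the graph from a chosen root, copying any node that is reachable along more than one route and cutting edges that lead back to an ancestor. The adjacency-list input format and the function name `graph_to_tree` are assumptions.

```python
def graph_to_tree(graph, root, _ancestors=()):
    """Unfold a directed graph into a nested (label, children) tree starting at root."""
    if root in _ancestors:
        return (root, [])  # cut the back edge so that cycles terminate
    trail = _ancestors + (root,)
    children = [graph_to_tree(graph, child, trail) for child in graph.get(root, [])]
    return (root, children)


# Hypothetical schema graph in which "address" is shared by two parents.
GRAPH = {
    "order": ["buyer", "seller"],
    "buyer": ["address"],
    "seller": ["address"],
    "address": [],
}

tree = graph_to_tree(GRAPH, "order")
```

In the resulting tree the shared `address` node appears once under `buyer` and once under `seller`, so each copy has its own unique path from the root.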
-
16. A program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine to perform method steps for data representation schema translation, said method steps comprising:
identifying one or more shared examples encoded in two data representation schemas; and
automatically generating a translator based on said shared examples, wherein said translator is adapted for translating data between said two data representation schemas.
-
17. The program storage device of claim 16, further comprising the steps of:
parsing data in said one or more shared examples into trees, each tree representing one of said data representation schemas;
generating a path table for said each tree, in response to said parsing step; and
said step of automatically generating the translator comprising the step of generating a translation table from path tables generated for said each tree.
-
18. The program storage device of claim 16, where said data representation schema is a hierarchical data representation language selected from the group consisting of:
XML;
HTML; and
SGML.
-
19. The program storage device of claim 18, wherein said shared examples have different values for each attribute and element.
-
20. The program storage device of claim 17, wherein said shared examples are sufficient to cover all paths that may be encountered in any data that needs to be translated between the two schemas.
-
21. The program storage device of claim 17, wherein said step of automatically generating the translator further comprises the step of storing a path in a specific traversal in each of said data representation schemas for each data item in each shared example.
-
22. The program storage device of claim 21, further comprising the steps of:
receiving a data item in a first data representation schema;
identifying the path associated with the data item;
identifying an equivalent path in a second data representation, based on the translation table; and
translating the data item to the second data representation, based on the equivalent path.
-
23. The program storage device of claim 22, further comprising the step of matching paths if parts of the path are matched.
-
24. The program storage device of claim 22, further comprising the step of identifying paths by a list of nodes in the path from the root of the data item.
-
25. The program storage device of claim 21, further comprising the step of storing additional path information including one or more of:
a node order among its siblings; and
a node order in siblings of the same type.
-
26. The program storage device of claim 16, wherein said shared examples have a different value for each part.
-
27. The program storage device of claim 16, wherein said data representation schemas are machine readable.
-
28. The program storage device of claim 16, further comprising the step of generating a translator table for translating between said data representation schemas by using common elements of said shared examples.
-
29. The program storage device of claim 16, further comprising the step of copying parts of the data representation in one of said data representation schemas in the translation.
-
30. The program storage device of claim 16, wherein said data representation schema is in the form of a graph, further comprising the step of converting the graph to an equivalent tree.
-
31. A computer program product comprising:
a computer usable medium having computer readable program code means embodied therein for causing a data representation schema translation, the computer readable program code means in said computer program product comprising:
computer readable program code means for causing a computer to effect, identifying one or more shared examples encoded in two data representation schemas; and
computer readable program code means for causing a computer to effect, automatically generating a translator based on said shared examples, wherein said translator is adapted for translating data between said two data representation schemas.
-
32. The computer program product of claim 31, further comprising:
computer readable program code means for causing a computer to effect, parsing data in said one or more shared examples into trees, each tree representing one of said data representation schemas;
computer readable program code means for causing a computer to effect, generating a path table for said each tree, in response to said parsing step; and
said computer readable program code means for causing a computer to effect, automatically generating the translator comprising computer readable program code means for causing a computer to effect, generating a translation table from path tables generated for said each tree.
-
33. A method for data representation schema translation, said method comprising the steps of:
identifying data encoded in a first data representation schema;
converting said data encoded in said first data representation schema to an encoding in a second data representation schema; and
automatically generating a translator based on common data encodings, in response to said identifying and converting, wherein said translator is adapted for translating data between said first data representation schema and said second data representation schema.
-
34. A program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine to perform method steps for data representation schema translation, said method steps comprising:
identifying data encoded in a first data representation schema;
converting said data encoded in said first data representation schema to an encoding in a second data representation schema; and
automatically generating a translator based on common data encodings, in response to said identifying and converting, wherein said translator is adapted for translating data between said first data representation schema and said second data representation schema.
-
35. A computer program product comprising:
a computer usable medium having computer readable program code means embodied therein for causing a data representation schema translation, the computer readable program code means in said computer program product comprising:
computer readable program code means for causing a computer to effect, identifying data encoded in a first data representation schema;
computer readable program code means for causing a computer to effect, converting said data encoded in said first data representation schema to an encoding in a second data representation schema; and
computer readable program code means for causing a computer to effect, automatically generating a translator based on common data encodings, in response to said identifying and converting, wherein said translator is adapted for translating data between said first data representation schema and said second data representation schema.
-
36. A method for translating data from a Web page represented in one data representation schema to another data representation schema, said method comprising the steps of:
identifying one or more shared examples of the Web page encoded in both data representation schemas; and
automatically generating a translator based on said shared examples, wherein said translator is adapted for translating data from the Web page between said data representation schemas.
-
37. The method of claim 36, further comprising the steps of:
receiving another Web page including different data; and
translating the different data in the Web page represented in said one data representation schema to said another data representation schema.
-
38. The method of claim 36 wherein the Web page in said one data representation schema includes data in an HTML schema and said another data representation schema includes data in an XML schema.
-
39. The method of claim 38 wherein said XML schema includes data in a SpeechML schema.
-
40. The method of claim 39, further comprising the steps of:
receiving another Web page in said XML schema including different data; and
translating the different data in the Web page represented in said XML schema to said SpeechML schema.
-
41. The method of claim 39, for use with a voice browser in one or more of an automobile and over a telephone.
-
42. A program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine to perform method steps for data representation schema translation, said method steps comprising:
identifying one or more shared examples of the Web page encoded in both data representation schemas; and
automatically generating a translator based on said shared examples, wherein said translator is adapted for translating data from the Web page between said data representation schemas.
-
43. The program storage device of claim 42, further comprising the steps of:
receiving another Web page including different data; and
translating the different data in the Web page represented in said one data representation schema to said another data representation schema.
-
44. The program storage device of claim 42 wherein the Web page in said one data representation schema includes data in an HTML schema and said another data representation schema includes data in an XML schema.
-
45. The program storage device of claim 44 wherein said XML schema includes data in a SpeechML schema.
-
46. The program storage device of claim 45, further comprising the steps of:
receiving another Web page in said XML schema including different data; and
translating the different data in the Web page represented in said XML schema to said SpeechML schema.
-
47. The program storage device of claim 45, for use with a voice browser in one or more of an automobile and over a telephone.
Specification