Method and system for inferring a schema from a hierarchical data structure for use in a spreadsheet
Abstract
A method and system are provided for inferring a schema from an electronic document containing hierarchical data for use in a spreadsheet application program. The electronic document containing the hierarchical data is received into an application program. The application program may be a spreadsheet application program. The format of the hierarchical data structure may be XML. The hierarchical data includes a set of nodes making up the structure of the hierarchical data. The nodes may be XML elements and attributes. The hierarchical data is then parsed to discover one of the nodes in the hierarchical data. Once the node has been discovered, content associated with the discovered node is saved to a memory location in the computer system. The content may include data associated with the discovered node and the type of data associated with the node. The hierarchical data is then parsed again to discover subsequent nodes until the content for all of the nodes has been saved to the memory location. A schema generator then generates schema elements for each discovered node, using complex rules based on the particular qualities of that node, until a schema has been generated for the hierarchical data.
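The discovery and saving steps described in the abstract can be sketched roughly as follows. This is a minimal illustration in Python, not the patented implementation: the function names `guess_type` and `discover_nodes`, the path-keyed dictionary, and the type-guessing heuristic are all assumptions introduced here for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def guess_type(text):
    """Guess a simple XSD type name from node text (hypothetical heuristic)."""
    value = (text or "").strip()
    if not value:
        return None
    try:
        int(value)
        return "xsd:integer"
    except ValueError:
        pass
    try:
        float(value)
        return "xsd:decimal"
    except ValueError:
        return "xsd:string"


def discover_nodes(xml_text):
    """Walk the tree and save content once for each distinct node path."""
    root = ET.fromstring(xml_text)
    saved = {}  # path -> {"type", "repeating", "attributes"} (assumed layout)

    def visit(elem, path):
        # Content is stored the first time a path is seen; later
        # occurrences only add attribute names and fill a missing type.
        info = saved.setdefault(
            path, {"type": None, "repeating": False, "attributes": set()}
        )
        info["attributes"].update(elem.attrib)
        if info["type"] is None:
            info["type"] = guess_type(elem.text)
        counts = Counter(child.tag for child in elem)
        for child in elem:
            child_path = path + "/" + child.tag
            visit(child, child_path)
            if counts[child.tag] > 1:
                # Same tag appears more than once under one parent.
                saved[child_path]["repeating"] = True

    visit(root, "/" + root.tag)
    return saved
```

Calling `discover_nodes` on a small document with two `order` elements under one `orders` root would mark the `/orders/order` path as repeating while storing its attribute names and guessed type only once.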
19 Claims
1. A method for inferring a schema from hierarchical data within an electronic document, comprising:
- in an application program, receiving the electronic document containing the hierarchical eXtensible Markup Language (XML) data and determining from the electronic document that the hierarchical data within the electronic document does not reference a schema, wherein the hierarchical data includes one or more nodes;
(a) discovering one of the nodes in the hierarchical data;
(b) saving content associated with the discovered node when the node is not determined to be repeating;
(c) repeating tasks (a)-(b) until the content for each discovered node has been saved; and
generating the schema based on the content saved for each discovered node;
wherein generating the schema comprises writing a schema definition for each node that includes specifying a maximum occurrence of repeating nodes that are out of sequence as unbounded by checking for a repeating indicator and a sequence indicator, wherein the schema is an XML schema that provides the XML data a set of grammatical and data type rules that govern types and a structure of data that may be included in the electronic document.
Dependent claims: 2, 3, 4, 5, 6, 7
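One plausible reading of the repeating and sequence indicators in claim 1 can be sketched as below. The helper names `child_indicators` and `write_children` are hypothetical, and the choice to emit an unbounded `xsd:choice` group when a repeating node is out of sequence is an assumption made for this illustration; the claim itself only requires that the maximum occurrence of such nodes be specified as unbounded. The output is a schema fragment, not a complete schema.

```python
def child_indicators(child_tags):
    """Return {tag: (repeating, in_sequence)} for child tags in document order."""
    indicators = {}
    previous = None
    for tag in child_tags:
        if tag not in indicators:
            indicators[tag] = [False, True]
        else:
            indicators[tag][0] = True        # repeating indicator
            if previous != tag:
                indicators[tag][1] = False   # reappears after a different tag
        previous = tag
    return {tag: tuple(flags) for tag, flags in indicators.items()}


def write_children(child_tags):
    """Write a schema fragment for one parent's children (illustrative only)."""
    indicators = child_indicators(child_tags)
    out_of_sequence = any(rep and not seq for rep, seq in indicators.values())
    # Assumption: interleaved repeats are modeled with an unbounded choice group.
    group = "xsd:choice" if out_of_sequence else "xsd:sequence"
    group_attrs = ' maxOccurs="unbounded"' if out_of_sequence else ""
    lines = [f"<{group}{group_attrs}>"]
    for tag, (repeating, _in_sequence) in indicators.items():
        max_occurs = ' maxOccurs="unbounded"' if repeating else ""
        lines.append(f'  <xsd:element name="{tag}"{max_occurs}/>')
    lines.append(f"</{group}>")
    return "\n".join(lines)
```

Under these assumptions, children in the order `a, a, b` yield an ordinary `xsd:sequence` with `a` marked unbounded, while `a, b, a` yields an unbounded `xsd:choice` because `a` repeats out of sequence.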
8. A computer system for inferring a schema from an electronic document containing hierarchical data comprising:
- an application program that determines that the electronic document containing hierarchical data does not reference a schema;
- a memory for storing the hierarchical data, wherein the hierarchical data includes one or more nodes;
- a parser for receiving the hierarchical data from the memory and parsing the hierarchical data in the electronic document to discover the one or more nodes;
- a logical memory module for:
(a) receiving one of the discovered nodes from the parser;
(b) determining content associated with the discovered node; and
(c) saving the content associated with the discovered node in the memory;
(d) repeating tasks (a)-(c) until the content for each discovered node has been saved in the memory; and
- a schema generator module for generating the schema based on the content saved for each discovered node that includes specifying repeating nodes that are out of sequence as unbounded by checking at least a repeating indicator, wherein the schema provides the electronic document a set of grammatical and data type rules that govern types and a structure of data that may be included in the electronic document.
Dependent claims: 9, 10, 11, 12, 13
14. A tangible computer-readable storage medium having computer-executable instructions, which when executed on a computer, perform a method for inferring a schema from hierarchical data comprising:
- in an application program, receiving the electronic document containing the hierarchical data and determining that the hierarchical data within the electronic document does not reference a schema, wherein the hierarchical data includes one or more nodes;
(a) discovering one of the nodes in the hierarchical data;
(b) saving content associated with the discovered nodes;
(c) repeating tasks (a)-(b) until the content for each discovered node has been saved; and
generating the schema based on the content saved for each discovered node that includes specifying repeating nodes that are out of sequence as unbounded by checking at least a repeating indicator, wherein the schema provides the electronic document a set of grammatical and data type rules that govern types and a structure of data that may be included in the electronic document.
Dependent claims: 15, 16, 17, 18, 19
Specification