Method and apparatus for normalizing and converting structured content
First Claim
1. A method for use in converting content of electronic data from a source form to a target form, said electronic data having a machine format and an informational content independent of said machine format and any machine instructions, said method comprising the steps of:
- defining, by utilizing a computer, a transformation matrix for use by a machine tool involving;
providing a set of source content elements reflecting a source environment, wherein said source content elements include human readable informational content entered by one or more human users in a form free from compliance with any device format, said source content elements reflecting inconsistencies of linguistics, at least including different terms to identify same subject matter, and syntax, at least including different ordering of terms;
providing a set of normalized content elements that are amenable to transformation to the target form;
establishing a normalization structure for normalizing said set of source content elements to said set of normalized content elements with respect to linguistics and syntax, wherein the source content elements correspond to a single normalized content element, and wherein said normalization structure is based on a knowledge base developed from information about said set of source content elements, the establishing the normalization structure comprises utilizing grammar rules to identify one or more attributes of at least a subset of said set of source content elements and utilizing linguistics rules to identify attributes or attribute values of said source content elements that are expressed in a plurality of forms;
defining a set of rules for converting said normalized content elements to target content elements;
receiving an item of electronic data having a machine format and information content including at least one source content element and extracting said information content using said machine format;
using a first operating of said machine tool to apply said transformation matrix by;
identifying a first source content element under consideration;
applying said normalization structure to said first source content element to identify a first normalized content element; and
using said set of rules with respect to said first normalized content element to convert said first source content element to said target form, assisting in applying said normalization structure by associating contextual information with said normalized content elements, wherein said associating contextual information comprises providing tags for schematizing source information; and
providing, by using second operating of said machine tool, an output including said first source content element converted to said target form.
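The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the synonym table, attribute vocabularies, tag names, and target format below are all hypothetical stand-ins for the "linguistics rules", "grammar rules", "tags", and "set of rules" named in the claim. It shows only how two source descriptions that differ in terminology and term order might collapse to one normalized element before conversion.

```python
import re

# Hypothetical linguistics rules: variant terms mapped to one canonical term.
SYNONYMS = {"cer": "ceramic", "ounce": "oz", "ounces": "oz", "mug": "cup"}

# Hypothetical grammar vocabularies used to recognize attributes.
NOUNS = {"cup"}
MATERIALS = {"ceramic", "steel"}
UNITS = {"oz"}


def tokenize(text):
    """Lowercase, drop commas, split '8oz' into '8 oz', strip stray periods."""
    text = text.lower().replace(",", " ")
    text = re.sub(r"(\d)([a-z])", r"\1 \2", text)
    return [t.strip(".") for t in text.split() if t.strip(".")]


def normalize(text):
    """Map a free-form source element to a tagged, order-independent record."""
    tokens = [SYNONYMS.get(t, t) for t in tokenize(text)]
    record = {"noun": None, "material": None, "size": None}  # tags (assumed names)
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        # Grammar rule (assumed): a number followed by a unit is a size attribute.
        if tok.isdigit() and i + 1 < len(tokens) and tokens[i + 1] in UNITS:
            record["size"] = f"{tok} {tokens[i + 1]}"
            i += 2
            continue
        if tok in MATERIALS:
            record["material"] = tok
        elif tok in NOUNS:
            record["noun"] = tok
        i += 1
    return record


def to_target(record):
    """Conversion rule (assumed): emit the tags in a fixed target layout."""
    return "|".join(f"{tag}={record[tag]}" for tag in ("noun", "material", "size"))


a = normalize("8 oz. cer. cup")
b = normalize("Mug, ceramic, 8oz")
print(a == b)        # both inputs reach the same normalized element
print(to_target(a))
```

Because the record is keyed by tag rather than by position, the ordering of terms in the source no longer matters once normalization has run; only the conversion rule decides the layout of the target form.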
Abstract
A method and apparatus are disclosed for transforming information from one semantic environment to another. In one implementation, a SOLx system (1700) includes a Normalization/Translation (NorTran) Workbench (1702) and a SOLx server (1708). The NorTran Workbench (1702) is used to develop a knowledge base based on information from a source system (1712), to normalize legacy content (1710) according to various rules, and to develop a database (1706) of translated content. During run time, the SOLx server (1708) receives transmissions from the source system (1712), normalizes the transmitted content, accesses the database (1706) of translated content and otherwise translates the normalized content, and reconstructs the transmission to provide substantially real-time transformation of electronic messages.
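The run-time flow described in the abstract (receive, normalize, consult the database of translated content, otherwise translate, reconstruct) can be sketched roughly as follows. Everything here is an assumption made for illustration: the message layout, the placeholder normalizer, the contents of the translated-content table, and the word-by-word fallback are hypothetical and are not taken from the SOLx system.

```python
# Hypothetical database of previously translated normalized content.
TRANSLATED_DB = {"steel cup": "taza de acero"}

# Hypothetical fallback rules used when no stored translation exists.
WORD_RULES = {"glass": "vidrio", "cup": "taza"}


def normalize(text):
    """Placeholder normalizer: lowercase, drop commas, collapse whitespace."""
    return " ".join(text.lower().replace(",", " ").split())


def fallback_translate(normalized):
    """Naive word-by-word translation; unknown words pass through unchanged."""
    return " ".join(WORD_RULES.get(word, word) for word in normalized.split())


def transform(message):
    """Normalize the body, prefer a stored translation, then rebuild the message."""
    normalized = normalize(message["body"])
    translated = TRANSLATED_DB.get(normalized)
    if translated is None:
        translated = fallback_translate(normalized)
    # Reconstruct the transmission: every other field is carried over untouched.
    return {**message, "body": translated}


print(transform({"id": 7, "body": "Steel  CUP"}))
print(transform({"id": 8, "body": "glass cup"}))
```

The lookup-first design means content already translated during development is reused verbatim at run time, and only unseen content falls through to rule-based translation, which in this toy version does not reorder words.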
16 Claims
10. An apparatus for use in converting content of electronic data from a source form to a target form, said electronic data having a machine format and an informational content independent of said machine format and any machine instructions, said apparatus comprising:
a storage medium for establishing and storing a normalization structure for normalizing source content elements with respect to linguistics and syntax to standardized content elements amenable to a transformation to the target form, and for storing a second conversion structure for converting said standardized content elements to target content elements of said target form, wherein the source content elements correspond to a single normalized content element, wherein said source content elements include human informational content entered by one or more human users in a form free from compliance with any device format, said source content elements reflecting inconsistencies of linguistics, at least including different terms to identify same subject matter, and syntax, at least including different ordering of terms, and wherein said normalization structure is based on a knowledge base developed from information related to said set of source content elements, the establishing the normalization structure comprises utilizing grammar rules to identify one or more attributes of at least a subset of said set of source content elements and utilizing linguistics rules to identify attributes or attribute values of said source content elements that are expressed in a plurality of forms;
an input structure for receiving input content having the machine format and information content including a first source content element and for extracting said information content using said machine format;
a processor for accessing said normalization structure and said second conversion structure from said storage medium and using said normalization structure and said second conversion structure to convert said first source content element to at least one target content element;
a first operating of said machine tool to apply a transformation matrix by;
applying said normalization structure to said first source content element to identify a first normalized content element; and
using said set of rules with respect to said first normalized content element to convert said first source content element to said target form; and
the processor further for assisting in the applying said normalization structure by associating contextual information with said normalized content elements, wherein said associating contextual information comprises providing tags for schematizing said source information;
an output structure for providing an output including said at least one target content element.
Specification