Method and apparatus for information factoring
Abstract
A method and apparatus for information factoring have been disclosed by representing a source information asset as a point in a metric space and rendering said source information asset in a second form.
44 Citations
42 Claims
1. A method comprising:
representing a source information asset as a point in a metric space; and
rendering said source information asset in a second form.
Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 23, 24)

11. A method for information factoring comprising:
receiving one or more information assets;
representing said one or more information assets in a tree topology;
extracting from said one or more information assets a list of one or more different parameters;
calculating probabilities associated with each said one or more different parameters for each said one or more information assets;
(a) calculating for selected nodes in said tree topology a first metric and a second metric;
(b) combining said first metric and said second metric to derive a third metric;
repeating (a) and (b) thus generating a plurality of said third metrics; and
determining a specific optimum cut-point based upon said plurality of third metrics.
Dependent Claims (12, 13, 14, 15, 16, 17, 18, 19, 20, 21)
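
The extraction and probability steps of claim 11 can be illustrated with a minimal sketch. It assumes, purely for illustration, that the information assets are small XML documents, that the "parameters" are tag names, and that a probability is a tag's relative frequency within one asset. The names `tag_probabilities`, `assets`, `parameters`, and `table` are hypothetical and do not come from the claims.

```python
from collections import Counter
import xml.etree.ElementTree as ET

def tag_probabilities(xml_text):
    """Return {tag: relative frequency} for one tree-shaped asset."""
    root = ET.fromstring(xml_text)
    counts = Counter(node.tag for node in root.iter())
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()}

# Two hypothetical assets, each parseable into a tree.
assets = [
    "<html><body><p>a</p><p>b</p></body></html>",
    "<html><body><div><p>c</p></div></body></html>",
]

# Assumption: the "list of different parameters" is the set of tag names.
parameters = sorted({tag for a in assets for tag in tag_probabilities(a)})

# One row per asset, one column per parameter; absent tags get 0.0.
table = [[tag_probabilities(a).get(tag, 0.0) for tag in parameters]
         for a in assets]

print(parameters)
for row in table:
    print(row)
```

Other choices of parameter (attribute names, text tokens) or of probability estimate would fit the claim language equally well; relative tag frequency is used here only because it is the simplest to show.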

22. A method for information factoring comprising:
(a) receiving N information assets capable of being represented in a tree topology;
(b) extracting from said N information assets a list of one or more different parameters;
(c) traversing over each said N information assets and calculating probabilities associated with each said one or more different parameters for each said N information assets;
(d) calculating for each node in said tree topology a first metric based upon nodes above a cut-point and a second metric based upon nodes below said cut-point;
(e) combining for each node in said tree topology said first metric and said second metric to derive a third metric for said cut-point in said tree topology;
(f) moving said cut-point between every possible set of nodes in said tree topology by traversing said tree topology from root downward and repeating (d) and (e) thus generating a plurality of said third metrics;
(g) determining a specific optimum cut-point by calculating which of the plurality of cut-points has a highest rate of change in said plurality of third metrics.
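
Steps (d) through (g) can be sketched as follows, under several simplifying assumptions that are not taken from the claim: each node carries a precomputed numeric weight, candidate cut-points are restricted to whole depth levels rather than every possible set of nodes, the first and second metrics are the weight sums above and below the cut, the third metric is their difference, and the rate of change at a cut is the absolute change in the third metric relative to the previous cut. The names `level_sums` and `best_cut` are hypothetical.

```python
def level_sums(tree, weights, root):
    """Sum node weights per depth, walking from the root downward."""
    sums, frontier = [], [root]
    while frontier:
        sums.append(sum(weights[n] for n in frontier))
        frontier = [c for n in frontier for c in tree.get(n, [])]
    return sums

def best_cut(sums):
    """Return (cut_depth, third_metrics).

    A cut at depth d places levels 0..d-1 above it and the rest below.
    Needs at least three levels so that two cuts can be compared.
    """
    total = sum(sums)
    thirds = []
    for d in range(1, len(sums)):
        above = sum(sums[:d])          # first metric (assumed: weight above)
        below = total - above          # second metric (assumed: weight below)
        thirds.append(above - below)   # third metric (assumed: difference)
    # Rate of change between consecutive cuts, credited to the later cut.
    changes = [abs(thirds[i] - thirds[i - 1]) for i in range(1, len(thirds))]
    j = max(range(len(changes)), key=changes.__getitem__)
    return j + 2, thirds               # changes[j] belongs to depth j + 2

# Hypothetical tree (parent -> children) and per-node weights.
tree = {"root": ["a", "b"], "a": ["c", "d"], "c": ["e"]}
weights = {"root": 1, "a": 1, "b": 1, "c": 4, "d": 4, "e": 1}

sums = level_sums(tree, weights, "root")
cut_depth, thirds = best_cut(sums)
print(sums, thirds, cut_depth)  # [1, 2, 8, 1] [-10, -6, 10] 3
```

In this toy tree the third metric jumps most sharply when the heavy nodes at depth 2 move from below the cut to above it, so the sketch reports a cut at depth 3.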

25. A method comprising:
representing one or more XML web assets as one or more points in a metric space;
rendering one or more XML data elements in said metric space;
determining statistical properties of said metric space;
computing one or more distance metrics in terms of said statistical properties of said metric space; and
determining an optimum of said one or more computed distance metrics.
Dependent Claims (26, 27, 28)
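
One possible reading of the last three steps of claim 25 is sketched below. It assumes that each asset has already been reduced to a numeric feature vector, that the "statistical properties" are per-dimension standard deviations, that the distance metric is a Euclidean distance scaled by those deviations, and that the "optimum" is the minimum pairwise distance. None of these choices is stated in the claim, and the vectors in `points` are invented for illustration.

```python
import math
from statistics import pstdev

# Hypothetical feature vectors, one point per XML web asset.
points = {"a": [1.0, 10.0], "b": [2.0, 30.0], "c": [1.5, 12.0]}

# Statistical property of the space: population std. dev. per dimension.
# A zero deviation is replaced by 1.0 to avoid dividing by zero.
columns = list(zip(*points.values()))
scale = [pstdev(col) or 1.0 for col in columns]

def distance(p, q):
    """Euclidean distance with each dimension divided by its std. dev."""
    return math.sqrt(sum(((x - y) / s) ** 2 for x, y, s in zip(p, q, scale)))

# All unordered pairs, then the pair with the smallest distance.
pairs = {(i, j): distance(points[i], points[j])
         for i in points for j in points if i < j}
closest = min(pairs, key=pairs.get)
print(closest)  # ('a', 'c')
```

Scaling by the standard deviation keeps a dimension with large raw values (the second coordinate here) from dominating the distance.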

29. An apparatus comprising:
means for representing a source information asset as a point in a metric space; and
means for rendering said source information asset in a second form.
Dependent Claims (30)

31. A method for factoring information, the method comprising:
receiving information to be factored;
converting said information into one or more directed acyclic graphs (DAGs);
extracting for each node in said one or more DAGs an effectiveness and information content metric; and
choosing one or more factor points based upon said effectiveness and information content metrics.
Dependent Claims (32, 33)

34. A method of information factoring comprising:
(a) receiving pages (yi), said pages having tags (tj) within each page;
(b) traversing all said pages yi, and compiling details on all possible tags (tij);
(c) traversing over said all possible tags tij for each page yi and obtaining probabilities of the tag values;
(d) computing for each node in a tree for page yi, the residual as if that node and nodes above were to be considered part of a template, and everything below that node would be part of an extraction, and would not contribute to a residual.
(e) taking the potential residuals computed in (d) over all the pages, and computing the residual associated with a node and everything below it;
(f) determining a best cut point by looking at the rate of change of the total residual below each candidate cut point considering the total sum of residuals for all nodes that have the cut point as a direct or indirect parent node and picking an optimal cut point as a point (or points) that define a “knee” in the residual curve, plotted as a function of sorted potential cut points.
Dependent Claims (35, 36, 37)
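
Step (f) leaves open how the "knee" is located. A minimal sketch of one common heuristic follows: sort the candidate cut points by total residual, then pick the point farthest from the straight line joining the first and last points of the sorted curve. This heuristic is a stand-in chosen for illustration, not the method defined by the claim, and the candidate names and residual values in `residuals` are invented.

```python
def knee_index(values):
    """Index of the point farthest from the chord joining the endpoints."""
    n = len(values)
    x0, y0, x1, y1 = 0, values[0], n - 1, values[-1]

    def gap(i):
        # Perpendicular distance to the chord, up to a constant factor
        # (the chord length), which does not affect the argmax.
        return abs((y1 - y0) * i - (x1 - x0) * values[i] + x1 * y0 - y1 * x0)

    return max(range(n), key=gap)

# Hypothetical total residual below each candidate cut point.
residuals = {
    "div.content": 9.0, "ul.items": 4.0, "li": 2.0,
    "span": 1.5, "a": 1.2, "b": 1.0,
}

# Residual curve plotted over cut points sorted by decreasing residual.
ranked = sorted(residuals.items(), key=lambda kv: kv[1], reverse=True)
curve = [r for _, r in ranked]
knee = ranked[knee_index(curve)][0]
print(knee)  # li
```

A largest-second-difference rule or a fitted two-segment line would be alternative ways to find the same bend; they can disagree on noisy curves.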

38. A method comprising:
computing a digest of node names and content text for one or more XML expressions;
tallying said XML expressions according to a cut-point algorithm;
separating each XML expression into a content part and a template part using said cut-point algorithm;
gathering for each said one or more XML expressions said content parts directly or indirectly derived from each said one or more XML expressions;
identifying all distinct digests into one or more same-digest sets;
for each same-digest set;
identifying all XML template parts associated with said same-digest set; and
summing residuals for said same-digest set content parts;
sorting said same-digest sets based on said same-digest sets' residual; and
selecting N top said sorted same-digest sets.
Dependent Claims (39, 40, 41, 42)
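
The digest, grouping, summing, sorting, and top-N steps of claim 38 can be sketched as below. The sketch assumes that the content parts and their residuals have already been produced by some cut-point step, uses SHA-1 as an arbitrary digest function, and omits the association of template parts with each set. The data in `content_parts` and the names `digest`, `totals`, and `top` are hypothetical.

```python
import hashlib
import xml.etree.ElementTree as ET
from collections import defaultdict

def digest(xml_text):
    """SHA-1 over the node names and stripped text of one XML expression."""
    h = hashlib.sha1()
    for node in ET.fromstring(xml_text).iter():
        # A NUL separator keeps ("ab", "c") distinct from ("a", "bc").
        h.update(node.tag.encode() + b"\0")
        h.update((node.text or "").strip().encode() + b"\0")
    return h.hexdigest()

# (content part, residual) pairs; assumed output of an earlier step.
content_parts = [
    ("<li><a>Home</a></li>", 2.0),
    ("<li><a>Home</a></li>", 2.5),
    ("<p>Unique text</p>", 4.0),
    ("<li> <a>Home</a> </li>", 1.0),  # differs only in whitespace
]

# Group into same-digest sets and sum the residuals of each set.
totals = defaultdict(float)
for xml_text, residual in content_parts:
    totals[digest(xml_text)] += residual

# Sort the sets by summed residual and keep the top N.
N = 1
top = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:N]
print(top[0][1])  # 5.5
```

Because the digest covers only tag names and stripped text, expressions that differ in whitespace or attributes fall into the same set; whether that is desirable depends on how the claim's "node names and content text" is interpreted.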
Specification