Defining the semantics of data through observation
First Claim
1. A method executed in a computer system for estimating a probability associated with a parent variable that an event corresponding to the parent variable will occur, the method comprises:
- retrieving data as data strings from a data source;
- producing a dataset from the retrieved data strings;
- building a statistical model of parent-child relationships from data strings in the dataset by:
  - determining incidence values for the data strings in the dataset; and
  - concatenating the incidence values with the data strings to provide child variables;
- analyzing the child variables and the parent variable to produce statistical relationships between the child variables and the parent variable;
- determining probability values for the event based on the determined parent-child relationships; and
- building an ontological representation of the data based on subsequent conditional probability values.
Abstract
Techniques for estimating the structure and meaning of data using probability are described. The techniques include retrieving data as data strings from a data source, producing a dataset from the retrieved data strings, and building a statistical model of parent-child relationships from data strings in the dataset. Building the statistical model includes determining incidence values for the data strings in the dataset and concatenating the incidence values with the data strings to provide child variables. The techniques also include analyzing the child variables and a parent variable to produce statistical relationships between the child variables and the parent variable, determining probability values based on the determined parent-child relationships, and building an ontological representation of the data based on subsequent conditional probability values.
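The model-building steps summarized above can be illustrated with a minimal sketch. It is not the patented implementation. It assumes that an incidence value is the number of times a token occurs within one record, that a child variable is the token joined to that count with a colon, and that the parent variable is a boolean flag per record. The names `child_variables`, `conditional_probabilities`, `records`, and `parent_flags` are hypothetical, and the ontological representation step is not covered.

```python
from collections import Counter, defaultdict


def child_variables(record):
    """Concatenate each token with its incidence count in the record.

    Assumption: incidence = occurrences of the token within this record.
    """
    counts = Counter(record.split())
    return {f"{token}:{n}" for token, n in counts.items()}


def conditional_probabilities(records, parent_flags):
    """Estimate P(parent event | child variable) by relative frequency."""
    seen = defaultdict(int)         # records containing the child variable
    with_parent = defaultdict(int)  # of those, records where the event occurred
    for record, flag in zip(records, parent_flags):
        for child in child_variables(record):
            seen[child] += 1
            if flag:
                with_parent[child] += 1
    return {child: with_parent[child] / seen[child] for child in seen}


# Hypothetical data: three records and whether the parent event occurred.
records = ["late payment late fee", "payment received", "late payment"]
parent_flags = [True, False, True]
probs = conditional_probabilities(records, parent_flags)
```

In this toy dataset, `probs["late:2"]` is 1.0 because the only record containing "late" twice is flagged, while `probs["payment:1"]` is 2/3 because two of the three records containing "payment" once are flagged.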
8 Citations
24 Claims
1. A method executed in a computer system for estimating a probability associated with a parent variable that an event corresponding to the parent variable will occur, the method comprises:
- retrieving data as data strings from a data source;
- producing a dataset from the retrieved data strings;
- building a statistical model of parent-child relationships from data strings in the dataset by:
  - determining incidence values for the data strings in the dataset; and
  - concatenating the incidence values with the data strings to provide child variables;
- analyzing the child variables and the parent variable to produce statistical relationships between the child variables and the parent variable;
- determining probability values for the event based on the determined parent-child relationships; and
- building an ontological representation of the data based on subsequent conditional probability values.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
15. A computer program product residing on a computer readable medium for estimating a probability associated with a parent variable that an event corresponding to the parent variable will occur, the computer program product comprising instructions for causing a computer to:
- retrieve data as data strings from a data source;
- produce a dataset from the retrieved data strings;
- build a statistical model of parent-child relationships from data strings in the dataset by:
  - determine incidence values for the data strings in the dataset; and
  - concatenate the incidence values with the data strings to provide child variables;
- analyze the child variables and the parent variable to produce statistical relationships between the child variables and the parent variable;
- determine probability values for the event based on the determined parent-child relationships; and
- build an ontological representation of the data based on subsequent conditional probability values.
Dependent claims: 16, 17, 18, 19, 20, 21, 22
23. An apparatus comprising:
- a processor; and
- a computer readable medium storing a computer program product for estimating a probability associated with a parent variable that an event corresponding to the parent variable will occur, the computer program product comprising instructions for causing the processor to:
  - retrieve data as data strings from a data source;
  - produce a dataset from the retrieved data strings;
  - build a statistical model of parent-child relationships from data strings in the dataset by:
    - determine incidence values for the data strings in the dataset; and
    - concatenate the incidence values with the data strings to provide child variables;
  - analyze the child variables and the parent variable to produce statistical relationships between the child variables and the parent variable;
  - determine probability values for the event based on the determined parent-child relationships; and
  - build an ontological representation of the data based on subsequent conditional probability values.
Dependent claims: 24
Specification