Visually Interactive Identification of a Cohort of Data Objects Similar to a Query Based on Domain Knowledge
First Claim
1. A system comprising:
a data processor to:
access a plurality of data objects, each data object comprising a plurality of numerical components, wherein each component represents a data feature of a plurality of data features, and
identify, for each data feature, a feature distribution of the numerical components;
a selector to select a sub-plurality of data features of a query object, wherein a given data feature is selected if the component representing the given data feature is a peak for the feature distribution of the given data feature;
an evaluator to determine, for the query object and a data object, a similarity measure based on the sub-plurality of the data features, the similarity measure indicative of data features common to the query object and the data object; and
an interaction processor to:
provide, via an interactive graphical user interface, an interactive visual representation of a distance histogram representing the feature distributions,
iteratively process, based on the interactive distance histogram, selection of a sub-plurality of the data features, the selection based on domain knowledge, and
identify, based on the similarity measures, a cohort of data objects similar to the query object.
Abstract
Visually interactive identification of a cohort of similar data objects is disclosed. One example is a system including a data processor to access a plurality of data objects, each data object comprising a plurality of numerical components, where each component represents a data feature of a plurality of data features, and to identify, for each data feature, a feature distribution of the numerical components. A selector selects a sub-plurality of the data features of a query object, where a given data feature is selected if the component representing the given data feature is a peak for the feature distribution. An evaluator determines a similarity measure based on the sub-plurality of the data features. An interaction processor iteratively processes selection of a sub-plurality of the data features based on domain knowledge, and identifies, based on the similarity measures, a cohort of data objects similar to the query object.
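The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The source does not define "peak" or the similarity formula, so the sketch assumes that a component is a peak when it lies at or above a chosen quantile of its feature's values, and that similarity is the number of selected features on which the data object also peaks. All function names, the `quantile` parameter, and the `min_similarity` threshold are hypothetical.

```python
from typing import List, Sequence


def feature_distributions(objects: Sequence[Sequence[float]]) -> List[List[float]]:
    """Return one sorted list of component values per data feature."""
    n_features = len(objects[0])
    return [sorted(obj[j] for obj in objects) for j in range(n_features)]


def is_peak(value: float, sorted_values: Sequence[float], quantile: float) -> bool:
    """Assumed peak test: value is at or above the given quantile of its feature."""
    idx = min(int(quantile * len(sorted_values)), len(sorted_values) - 1)
    return value >= sorted_values[idx]


def select_features(query: Sequence[float],
                    distributions: Sequence[Sequence[float]],
                    quantile: float = 0.75) -> List[int]:
    """Select the features on which the query object peaks."""
    return [j for j, v in enumerate(query) if is_peak(v, distributions[j], quantile)]


def similarity(obj: Sequence[float],
               selected: Sequence[int],
               distributions: Sequence[Sequence[float]],
               quantile: float = 0.75) -> int:
    """Assumed similarity: count of selected features on which obj also peaks."""
    return sum(1 for j in selected if is_peak(obj[j], distributions[j], quantile))


def cohort(objects: Sequence[Sequence[float]],
           selected: Sequence[int],
           distributions: Sequence[Sequence[float]],
           min_similarity: int,
           quantile: float = 0.75) -> List[int]:
    """Indices of data objects whose similarity reaches the threshold."""
    return [i for i, obj in enumerate(objects)
            if similarity(obj, selected, distributions, quantile) >= min_similarity]
```

In an interactive setting, the list returned by `select_features` would only be a starting point: a domain expert could add or drop features before `cohort` is re-run.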
15 Claims
1. A system comprising:
a data processor to:
access a plurality of data objects, each data object comprising a plurality of numerical components, wherein each component represents a data feature of a plurality of data features, and
identify, for each data feature, a feature distribution of the numerical components;
a selector to select a sub-plurality of data features of a query object, wherein a given data feature is selected if the component representing the given data feature is a peak for the feature distribution of the given data feature;
an evaluator to determine, for the query object and a data object, a similarity measure based on the sub-plurality of the data features, the similarity measure indicative of data features common to the query object and the data object; and
an interaction processor to:
provide, via an interactive graphical user interface, an interactive visual representation of a distance histogram representing the feature distributions,
iteratively process, based on the interactive distance histogram, selection of a sub-plurality of the data features, the selection based on domain knowledge, and
identify, based on the similarity measures, a cohort of data objects similar to the query object.
Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14)
13. A method to determine a cohort of objects similar to a query object, the method comprising:
accessing, from a database, a plurality of data objects, each data object comprising a plurality of numerical components, wherein each component represents a data feature of a plurality of data features;
selecting a sub-plurality of data features of a query object, wherein a given data feature is selected if the component representing the given data feature is a peak for the feature distribution of the given data feature;
determining, for the query object and a data object of the plurality of data objects, a similarity measure based on the sub-plurality of the data features, the similarity measure indicative of data features common to the query object and the data object;
providing, via an interactive graphical user interface, an interactive visual representation of a distance histogram representing the feature distributions;
iteratively processing selection of the sub-plurality of the data features, the iterative selection based on at least one of adding a first data feature to the selected sub-plurality of data features and deleting a second data feature from the selected sub-plurality of data features.
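The iterative step of claim 13, in which one feature is added to and/or another deleted from the selected sub-plurality, might look like the following sketch. It assumes that peak membership has already been computed as boolean flags per object and per feature, and that similarity counts the selected features on which both the query and the data object peak. The names `update_selection`, `shared_peaks`, `rank_cohort`, and the `min_shared` threshold are hypothetical and do not come from the patent.

```python
from typing import Iterable, List, Optional, Sequence


def update_selection(selected: Iterable[int],
                     add: Optional[int] = None,
                     delete: Optional[int] = None) -> List[int]:
    """Return a new selection after one interaction step (add and/or delete)."""
    result = set(selected)
    if add is not None:
        result.add(add)
    if delete is not None:
        result.discard(delete)  # deleting an unselected feature is a no-op
    return sorted(result)


def shared_peaks(query_peaks: Sequence[bool],
                 object_peaks: Sequence[bool],
                 selected: Iterable[int]) -> int:
    """Assumed similarity: selected features on which both objects peak."""
    return sum(1 for j in selected if query_peaks[j] and object_peaks[j])


def rank_cohort(query_peaks: Sequence[bool],
                all_peaks: Sequence[Sequence[bool]],
                selected: Sequence[int],
                min_shared: int) -> List[int]:
    """Indices of data objects sharing at least min_shared selected peaks."""
    return [i for i, peaks in enumerate(all_peaks)
            if shared_peaks(query_peaks, peaks, selected) >= min_shared]
```

Each call to `update_selection` returns a fresh list rather than mutating the previous one, so earlier selections remain available if the user wants to step back through the iteration.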
15. A non-transitory computer readable medium comprising executable instructions to:
access, via a processor, a plurality of data objects, each data object comprising a plurality of numerical components, wherein each component represents a data feature of a plurality of data features;
identify, for each data feature, a feature distribution of the components associated with the data feature;
select a sub-plurality of data features of a query object, wherein a given data feature is selected if the component representing the given data feature is a peak for the feature distribution of the given data feature;
determine, for the query object and a data object of the plurality of data objects, a similarity measure based on the sub-plurality of the data features, the similarity measure indicative of data features common to the query object and the data object;
provide, to a computing device, an interactive visual representation of a distance histogram representing the feature distributions;
iteratively process, based on the interactive distance histogram, selection of a sub-plurality of the data features, the selection based on domain knowledge; and
identify, based on the similarity measures, a cohort of data objects similar to the query object.
Specification