System, method and computer program for multi-dimensional temporal and relative data mining framework, analysis and sub-grouping

US 9,898,513 B2
Filed: 12/12/2012
Issued: 02/20/2018
Est. Priority Date: 12/12/2011
Status: Active Grant

Abstract

The present invention relates to a system, method and computer program product that provides a multi-dimensional data mining environment and that is operable to apply a series of temporal and relative rules (i.e., STDMⁿ₀), and is further operable in at least one of the following ways: to incorporate a framework to support temporal abstractions and relative alignments of data (i.e., STDMⁿ₀); and to derive characteristics within the data (STDMⁿ₀). The present invention may incorporate data from multiple sources, and potentially multiple centers. The analysis and alignment of the data may involve both temporal dimensions and other dimensions (or relative aspects) of the data. The present invention may further be a data mining environment that is flexible enough to permit relatively open-ended queries, thereby enabling, for example, the detection of trends, including trends with new dimensions, or trends based on relatively small data sets.

19 Claims

1. A computer implemented data mining method for controlling mining of data streams in a distributed computing environment configured to provide a distribution layer operable to maintain consistencies across multiple distributed computing systems when performing distributed data processing and analysis, wherein different attributes are associated with each of a plurality of data streams, the computer implemented data mining method comprising:
- (a) using a central distribution computer system component to store and maintain consistency of a data mining framework configured to support data mining across the multiple distributed computing systems, the data mining framework including at least:
  
  (i) a series of temporal rules deployable to a subset of the multiple distributed computing systems that are targets for a query; and
  
  (ii) relative rules adapted for relatively aligning time series multi-dimensional data based on at least one time point of interest, the central distribution computer system being configured for determining a subset of particular temporal rules that are applicable to the time series multi-dimensional data associated with a particular site, based on the different attributes associated with the data streams;
  
  (b) distributing, from the central distribution computer system to the multiple distributed computing systems, the series of temporal rules and the relative rules to be applied by each distributed computing system of the multiple distributed computing systems to pre-process the time series multi-dimensional data and to generate new temporally abstracted and relatively aligned time series data representing trends and patterns that include one or more indications of a potential future clinical event;
  
  (c) collecting and cleaning, at the multiple distributed computing systems, the time series multi-dimensional data, the time series multi-dimensional data being obtained through one or more corresponding data streams of the plurality of data streams;
  
  (d) temporally abstracting, at the multiple distributed computing systems, the collected and cleaned time series multi-dimensional data by accessing and applying the applicable temporal rules so as to generate temporally abstracted time series multi-dimensional data categorized both on similarity and frequency, and relatively aligning the temporally abstracted time series multi-dimensional data based on at least one time point of interest by accessing and applying the applicable relative rules; and
  
  (e) collecting the temporally abstracted and relatively aligned time series multi-dimensional data from the multiple distributed computing systems to provide multi-dimensional, temporal, multi-site time series data for use in data mining operations.
- - 2. The method of claim 1, comprising managing the distribution and application of the temporal rules and the relative rules across the multiple distributed computing systems to support the data mining operations across the multiple distributed computing systems in real time or near real time.
  - 3. The method of claim 1, wherein the different attributes may include one or more of:
    - (a) data structure, (b) data collection frequency, or (c) the type of device collecting the data, including at least one of: manufacturer/model, approach of device to data correction, or mechanism used for identifying artefacts in signals.
  - 4. The method of claim 3, comprising distributing applicable temporal rules and applicable relative rules based on the attributes associated with the relevant data streams.
  - 5. The method of claim 4, wherein each data stream of the plurality of data streams relates to a corresponding human subject, and wherein the central distribution computer system is configured to (a) initiate creation of simple abstractions for each human subject, storage of the simple abstractions locally at each site, and tagging of the corresponding data streams using site identification data, and (b) initiate creation of complex abstractions using applicable temporal rules and tag the complex abstractions with tagging information defined by the central distribution computer system to enable access for multi-site data mining operations initiated by the central distribution computer system.
  - 6. The method of claim 1, wherein the time series multi-dimensional data is associated with two or more distributed computing systems, and optionally is generated by two or more types of devices, and further is associated with two or more research studies.
  - 7. The method of claim 5, comprising generation of patient monitoring data in real time or near real time for use in connection with one or more patient care systems or patient monitoring systems.
  - 8. The method of claim 5, comprising dynamically defining groups or sub-groups of human subjects, or characteristics associated with such groups or sub-groups, and enabling data mining operations in real time or near real time based on such groups or sub-groups.
  - 9. The method of claim 1, comprising the use of the results of the data mining operations to perform multi-site research data operations across each of the distributed computing systems.
  - 10. The method of claim 1, wherein the time series multi-dimensional data includes physiological data collected by medical devices, wherein at least one of the data structure and frequency of the time series multi-dimensional data collected by the medical devices varies.
  - 11. The method of claim 2, comprising storing the temporal rules and the relative rules in a data store that includes a hierarchy based on simple rules to complex rules.
  - 12. The method of claim 1, wherein at least one data mining operation is based on null hypothesis testing.
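
Step (d) of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the threshold rule, the `Interval` type, and the function names `abstract_below_threshold` and `align_relative` are hypothetical, and the sketch does not attempt the categorization "on similarity and frequency" that the claim recites. It only shows one plausible reading of a temporal rule (raw samples collapsed into labelled intervals) followed by a relative rule (interval times re-expressed as offsets from a time point of interest).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, measured value)


@dataclass
class Interval:
    label: str
    start: float
    end: float


def abstract_below_threshold(samples: List[Sample], threshold: float,
                             label: str) -> List[Interval]:
    """Hypothetical temporal rule: merge runs of consecutive samples whose
    value is below `threshold` into labelled intervals."""
    intervals: List[Interval] = []
    current: Optional[Interval] = None
    for t, value in samples:
        if value < threshold:
            if current is None:
                current = Interval(label, t, t)  # a new run starts here
            else:
                current.end = t                  # extend the open run
        elif current is not None:
            intervals.append(current)            # the run ended
            current = None
    if current is not None:
        intervals.append(current)                # close a run at end of data
    return intervals


def align_relative(intervals: List[Interval],
                   time_of_interest: float) -> List[Interval]:
    """Hypothetical relative rule: shift interval times so that zero is the
    time point of interest (negative values lie before it)."""
    return [Interval(i.label, i.start - time_of_interest,
                     i.end - time_of_interest) for i in intervals]


# Illustrative data only: one heart-rate stream sampled once per second.
heart_rate = [(0, 140.0), (1, 95.0), (2, 90.0), (3, 150.0), (4, 98.0)]
episodes = abstract_below_threshold(heart_rate, threshold=100.0, label="HR_LOW")
aligned = align_relative(episodes, time_of_interest=10.0)
```

After alignment, episodes from different subjects or sites share a common clock relative to the event of interest, which is what allows them to be pooled in step (e).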

13. A data mining computer system for mining data from multiple distributed computing systems, wherein different attributes may be associated with data streams, the system controlling mining of the data streams in a distributed computing environment configured to provide a distribution layer operable to maintain consistencies across the multiple distributed computing systems when performing distributed data processing and analysis, the system comprising:
- (a) a central distribution computer system component configured to store and maintain a data mining framework configured to support data mining across the multiple distributed computing systems, the data mining framework including at least:
  
  (i) a series of temporal rules deployable to a subset of the multiple distributed computing systems that are targets for a query; and
  
  (ii) relative rules adapted for relatively aligning time series multi-dimensional data based on at least one time point of interest, the central distribution computer system being configured for determining a subset of particular temporal rules that are applicable to data associated with a particular site based on the different attributes associated with the data streams;
  
  the central distribution computer system component being configured to distribute to the multiple distributed computing systems the data mining framework, including at least the series of temporal rules and the relative rules to be applied by each distributed computing system of the multiple distributed computing systems to pre-process the time series multi-dimensional data and to generate new temporally abstracted and relatively aligned time series data representing trends and patterns that include one or more indications of a potential future clinical event;
  
  (b) one or more devices associated with two or more of the multiple distributed computing systems, the devices collecting data in a plurality of data streams at the multiple distributed computing systems; and
  
  (c) at least one local computer at each distributed computing system connected to the central distribution computer system;
  
  wherein:
  
  the central distribution computer system is configured to manage the temporal abstraction and relative alignment of the data streams so as to support data mining operations for multi-dimensional data across the multiple sites by:
  
  accessing, from the at least one local computer, information regarding the different attributes for the data streams;
  
  providing, to the at least one local computer, the applicable temporal rules and applicable relative rules, thereby enabling temporal abstraction of the time series multi-dimensional data to generate temporally abstracted time series multi-dimensional data, and relative alignment of the temporally abstracted time series multi-dimensional data based on at least one time point of interest in a way that addresses the different attributes; and
  
  collecting the temporally abstracted and relatively aligned time series multi-dimensional data from the multiple sites by communicating with the at least one local computer and initiating the retrieval and transfer of the temporally abstracted and relatively aligned data based on a data mining request.
- - 14. The data mining computer system of claim 13, wherein the data mining computer system is configured to manage distribution and application of the temporal rules and the relative rules across the multiple sites to support data mining operations across the multiple sites in real time or near real time.
  - 15. The data mining computer system of claim 13, wherein the different attributes may include one or more of:
    - (a) data structure, (b) data collection frequency, or (c) the type of device collecting the data (including manufacturer/model, approach of device to data correction or mechanism for identifying artefacts in signals).
  - 16. The data mining computer system of claim 15, wherein the central distribution computer system is configured to distribute applicable temporal rules and applicable relative rules based on the attributes associated with the relevant data streams.
  - 17. The data mining computer system of claim 16, wherein each data stream relates to a human subject, and wherein the central distribution computer system is configured to (a) initiate creation of simple abstractions for each human subject, and storage of the simple abstractions locally at each site, and tagging of the data streams using site identification data, and (b) initiate creation of complex abstractions using the applicable temporal rules and tagging of the complex abstractions with tagging information defined by the central distribution computer system so as to enable access for multi-site data mining operations initiated by the central distribution computer system.
  - 18. The data mining computer system of claim 15, wherein the central computer system is configured to generate patient monitoring data in real time or near real time for use in connection with one or more patient care systems or patient monitoring systems.
  - 19. The data mining computer system of claim 18, wherein each data stream is associated with a particular human subject, and the computer system is configured to dynamically define groups or sub-groups of human subjects, or characteristics associated with such groups or sub-groups, and thereby permit data mining operations in real time or near real time based on such groups or sub-groups.
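
The attribute-based rule selection recited in claims 13, 15 and 16 can be sketched as follows. This is an illustration under stated assumptions, not the system described in the specification: `StreamAttributes`, `TemporalRule`, `CentralRuleStore`, the rule names, and the device model strings are all hypothetical. The sketch assumes each rule declares a minimum sampling frequency and an optional set of device models, and that a rule is sent to a site when at least one of that site's data streams satisfies it.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, List


@dataclass(frozen=True)
class StreamAttributes:
    """Attributes a local computer reports for one data stream (hypothetical)."""
    data_structure: str
    frequency_hz: float
    device_model: str


@dataclass(frozen=True)
class TemporalRule:
    """A rule plus the stream attributes it requires (hypothetical)."""
    name: str
    min_frequency_hz: float = 0.0
    device_models: FrozenSet[str] = frozenset()  # empty set means any device

    def applies_to(self, attrs: StreamAttributes) -> bool:
        if attrs.frequency_hz < self.min_frequency_hz:
            return False
        return not self.device_models or attrs.device_model in self.device_models


class CentralRuleStore:
    """Stands in for the central distribution component: it holds every rule
    and decides which subset each site should receive."""

    def __init__(self, rules: Iterable[TemporalRule]) -> None:
        self._rules = list(rules)

    def rules_for_site(self, streams: List[StreamAttributes]) -> List[str]:
        # A rule is applicable if any stream at the site satisfies it.
        return [r.name for r in self._rules
                if any(r.applies_to(s) for s in streams)]


store = CentralRuleStore([
    TemporalRule("hr_low_episode"),
    TemporalRule("hrv_beat_to_beat", min_frequency_hz=250.0),
    TemporalRule("spo2_artefact_filter", device_models=frozenset({"model-x"})),
])

# Illustrative sites only: one high-frequency waveform site, one numeric site.
sites: Dict[str, List[StreamAttributes]] = {
    "site_a": [StreamAttributes("waveform", 500.0, "model-x")],
    "site_b": [StreamAttributes("numeric", 1.0, "model-y")],
}
plan = {site: store.rules_for_site(streams) for site, streams in sites.items()}
```

Keeping the selection logic in one central store is one way to keep rule application consistent across sites, since no site chooses rules for itself.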

Current Assignee: University Of Ontario Institute Of Technology
Original Assignee: University Of Ontario Institute Of Technology
Inventors: McGregor, Carolyn Patricia; Smith, Kathleen Patricia; Dhanoa, Agam
Primary Examiner(s): Gofman, Alex
Assistant Examiner(s): Mian, Umar
Application Number: US14/363,385
Publication Number: US 20140358926A1
Time in Patent Office: 1,896 Days
Field of Search: None

CPC Class Codes
G06F 16/24568   Data stream processing; Con...
G06F 16/2465   Query processing support fo...
G06F 16/2477   Temporal data queries
G06F 2216/03   Data mining
G16H 10/60   for patient-specific data, ...
G16H 40/67   for remote operation
G16H 50/20   for computer-aided diagnosi...
G16H 50/70   for mining of medical data,...
