Incrementally retrieving data for objects to provide a desired level of detail

US 10,223,401 B2
Filed: 03/28/2014
Issued: 03/05/2019
Est. Priority Date: 08/15/2013
Status: Active Grant



Abstract

A computer-implemented method is provided for retrieving data and metadata into an indexed repository according to data harvesting criteria. The method comprises providing a user interface that allows a user to define data harvesting criteria for adjusting a subset of data and metadata on the indexed repository. Responsive to the user defining the data harvesting criteria through the user interface, the subset of data and metadata on the indexed repository is adjusted according to those criteria.
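
As a rough illustration of the idea in the abstract, the following minimal sketch shows an index whose entries grow as the user supplies broader harvesting criteria. Everything here is an assumption made for illustration: the `HarvestingCriteria` class, the `harvest` and `adjust_index` functions, and the dictionary-based index are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class HarvestingCriteria:
    """Hypothetical criteria: which metadata fields to harvest, and whether to add full text."""
    metadata_fields: tuple = ("name", "size")
    full_text: bool = False


def harvest(source_object: dict, criteria: HarvestingCriteria) -> dict:
    """Return only the parts of a source object that the criteria ask for."""
    entry = {k: source_object[k] for k in criteria.metadata_fields if k in source_object}
    if criteria.full_text and "text" in source_object:
        entry["text"] = source_object["text"]
    return entry


def adjust_index(index: dict, source: dict, criteria: HarvestingCriteria) -> dict:
    """Merge harvested fields into the index; source objects are only read, never modified."""
    for object_id, obj in source.items():
        index.setdefault(object_id, {}).update(harvest(obj, criteria))
    return index


# A single object on an assumed data server.
source = {"doc1": {"name": "report.txt", "size": 120, "owner": "alice", "text": "quarterly results"}}
index = {}

# First pass: a limited amount of information.
adjust_index(index, source, HarvestingCriteria())
shallow = dict(index["doc1"])

# Second pass: broader criteria add detail to the same index entry.
adjust_index(index, source, HarvestingCriteria(metadata_fields=("name", "size", "owner"), full_text=True))
deep = index["doc1"]
```

The second call does not rebuild the entry; it only adds the fields that the broader criteria request, which is one plausible reading of "adjusting a subset of data and metadata".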


9 Claims

1. A system for retrieving data and metadata comprising:
- a memory;
- a computing device comprising:
  - a user interface that:
    - allows a user to select an information set that includes a limited amount of information from a repository index that includes information about an indexed repository including one or more data server systems, wherein the information set includes references to member objects in the repository index;
    - allows the user to select an action to apply to the information set wherein the selected action requires extra information for the information set beyond the information within the repository index;
    - allows the user to define one or more data harvesting criteria for a data expansion operation to be performed for the information set on the indexed repository wherein the data expansion operation provides extra information for performance of the selected action; and
  - a processor that:
    - determines data server systems of the indexed repository to participate in the data expansion operation;
    - causes each participating data server system to execute the data expansion operation against data and metadata on that particular data server system according to the data harvesting criteria while allowing the data and metadata to be left intact on that particular data server system;
    - utilizes natural language processing and artificial intelligence to classify the data and metadata on each particular data server system and to refine the data harvesting criteria;
    - converts a characterization included in the data harvesting criteria to adaptors with filters to retrieve the data and metadata on each particular data server system according to a user request;
    - transforms the repository index for the indexed repository to include additional information from the data expansion operation, wherein the transforming adds the additional information to the information set to incrementally update the limited amount of information included in the information set until a desired level of detail is attained, and wherein the transforming ensures that subsequently generated information sets created based on the data and metadata subject to the data expansion operation include the additional information;
    - the additional information comprises a plurality of subsets, and wherein transforming the repository index comprises:
      - training a classifier to learn one or more classifications based on machine learning techniques;
      - classifying each of the subsets into one or more learned classifications during the data expansion operation via the trained classifier;
      - determining inclusion of each subset in the desired level of detail according to the classification of the subset by the trained classifier; and
    - applies the selected action to the information set based on the additional information from the data expansion operation.

2. The system of claim 1, wherein the data harvesting criteria supports a specification selected from a group consisting of per object, per information set, metadata, container, full text, member sets, caching, attribute, classification, files, email, and servers.

3. The system of claim 1, wherein the processor:
- manages subsets of the data and metadata on each particular data server system when subsets are formed based on new data harvesting criteria.

4. The system of claim 1, wherein the data harvesting criteria is applied to filters to access the data and metadata on each particular data server system.

5. The system of claim 1, wherein each of the indexed data server systems has its own access interface.
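
The processor steps recited in claims 1, 4 and 5 (choosing participating servers, converting criteria into adaptors with filters, and adding the retrieved detail to the repository index) can be sketched roughly as follows. This is a toy model under stated assumptions, not the patented implementation: `ServerAdaptor`, `criteria_to_filters`, `run_expansion`, the criteria keys `fields`, `extension` and `min_size`, and the in-memory "servers" are all hypothetical names invented for the example.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Record = Dict[str, object]
Filter = Callable[[Record], bool]


def criteria_to_filters(criteria: dict) -> List[Filter]:
    """Convert an assumed criteria dictionary into predicate filters."""
    filters: List[Filter] = []
    if "extension" in criteria:
        ext = criteria["extension"]
        filters.append(lambda rec, ext=ext: str(rec.get("name", "")).endswith(ext))
    if "min_size" in criteria:
        min_size = criteria["min_size"]
        filters.append(lambda rec, m=min_size: rec.get("size", 0) >= m)
    return filters


class ServerAdaptor:
    """Hypothetical adaptor standing in for one data server's own access interface."""

    def __init__(self, name: str, records: List[Record], filters: List[Filter]):
        self.name = name
        self._records = records
        self._filters = filters

    def expand(self, wanted_fields: Iterable[str]) -> Iterable[Tuple[str, Record]]:
        """Yield (id, extra fields) for records passing every filter; records are only read."""
        wanted = list(wanted_fields)
        for rec in self._records:
            if all(f(rec) for f in self._filters):
                yield str(rec["id"]), {k: rec[k] for k in wanted if k in rec}


def run_expansion(index: dict, info_set: List[str], servers: Dict[str, List[Record]], criteria: dict) -> dict:
    """Add extra fields to index entries for members of the information set."""
    filters = criteria_to_filters(criteria)
    member_ids = set(info_set)
    for server_name, records in servers.items():
        # Assumed participation rule: a server takes part only if it holds a member object.
        if not any(r["id"] in member_ids for r in records):
            continue
        adaptor = ServerAdaptor(server_name, records, filters)
        for object_id, extra in adaptor.expand(criteria["fields"]):
            if object_id in member_ids:
                index[object_id].update(extra)
    return index


servers = {
    "mail": [{"id": "m1", "name": "a.eml", "size": 10, "owner": "bob"}],
    "files": [
        {"id": "f1", "name": "b.txt", "size": 500, "owner": "eve"},
        {"id": "f2", "name": "c.txt", "size": 5, "owner": "dan"},
    ],
}
index = {"f1": {"name": "b.txt"}, "f2": {"name": "c.txt"}, "m1": {"name": "a.eml"}}
run_expansion(index, ["f1", "f2"], servers, {"fields": ["owner", "size"], "min_size": 100})
```

In this run the "mail" server is skipped because it holds no member of the information set, and only `f1` passes the size filter, so only its index entry gains the `owner` and `size` fields. The server-side records are left untouched.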

6. A computer program product for providing information to a user, comprising a non-transitory computer readable storage medium having computer readable program code embodied therewith, the computer readable program code, when executed by a processor, causes the processor to:
- generate an information set that includes a limited amount of information based on a repository index, wherein the repository index includes information about an indexed repository that includes one or more data server systems, and wherein the information set includes references to member objects in the repository index;
- receive user input via a user interface, the user interface allowing a user to:
  - select an action to apply to the information set, wherein the selected action requires extra information for the information set beyond the information within the repository index; and
  - define one or more data harvesting criteria for a data expansion operation to be performed for the information set on the indexed repository, wherein the data expansion provides the extra information for performance of the selected action;
- determine data server systems of the indexed repository to participate in the data expansion operation;
- cause each of the participating data server systems to execute the data expansion operation against data and metadata on that particular data server system according to the data harvesting criteria, while allowing the data and metadata to be left intact on that particular data server system;
- utilize natural language processing and artificial intelligence to classify the data and metadata on each particular data server system and to refine the data harvesting criteria;
- convert a characterization included in the data harvesting criteria to adaptors with filters to retrieve the data and metadata on each particular data server system according to a user request;
- transform the repository index for the indexed repository to include additional information from the data expansion operation, wherein the transforming adds the additional information to the information set to incrementally update the limited amount of information included in the information set until a desired level of detail is attained, and wherein the transforming ensures that subsequently generated information sets created based on the data and metadata subject to the data expansion operation include the additional information;
- the additional information comprises a plurality of subsets, and wherein transforming the repository index comprises:
  - training a classifier to learn one or more classifications based on machine learning techniques;
  - classifying each of the subsets into one or more learned classifications during the data expansion operation via the trained classifier;
  - determining inclusion of each subset in the desired level of detail according to the classification of the subset by the trained classifier; and
- apply the selected action to the information set based on the additional information from the data expansion operation.

7. The computer program product of claim 6, wherein the data harvesting criteria supports a specification selected from a group consisting of per object, per information set, metadata, container, full text, member sets, caching, attribute, classification, files, email, and servers.

8. The computer program product of claim 6, the computer readable program code further configured to cause the processor to:
- manage subsets of the data and metadata on each particular data server system when subsets are formed based on new data harvesting criteria.

9. The computer program product of claim 6, the computer readable program code further configured to cause the processor to:
- utilize natural language processing and artificial intelligence to classify the data and metadata on each particular data server system and to refine the data harvesting criteria.
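
The classifier steps in claims 1 and 6 (train a classifier, classify each subset of the additional information, then decide inclusion by classification) might look roughly like the sketch below. The claims do not name a particular learning technique, so a deliberately simple token-count classifier is swapped in here as a stand-in; `train`, `classify`, `select_subsets`, the labels, and the sample texts are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train(examples):
    """Learn per-label token counts from (text, label) pairs."""
    model = defaultdict(Counter)
    for text, label in examples:
        model[label].update(text.lower().split())
    return model


def classify(model, text):
    """Pick the label whose training tokens overlap most with the text."""
    tokens = text.lower().split()
    scores = {label: sum(counts[t] for t in tokens) for label, counts in model.items()}
    return max(scores, key=scores.get)


def select_subsets(model, subsets, wanted_labels):
    """Keep only the subsets whose learned classification is in the wanted set."""
    return {name: text for name, text in subsets.items() if classify(model, text) in wanted_labels}


training = [
    ("invoice payment amount due", "financial"),
    ("payment received invoice total", "financial"),
    ("lunch menu friday pizza", "casual"),
    ("team lunch party friday", "casual"),
]
model = train(training)

# Two assumed subsets of additional information harvested for one object.
subsets = {"attachment": "invoice total amount", "body": "see you at lunch friday"}
included = select_subsets(model, subsets, {"financial"})
```

Here only the `attachment` subset is classified as `financial`, so it alone would be added to the information set at the desired level of detail. A production system would presumably use a proper machine learning library and a far larger training set.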


Current Assignee: Breakwater Solutions LLC
Original Assignee: International Business Machines Corporation
Inventors: Bishop, Thomas P.; Chee, Kevin; McCoy, Jordan R.; Szalay, Jozsef; Tran, Michael T.
Primary Examiner(s): Bromell, Alexandria Y
Application Number: US14/228,925
Publication Number: US 20150052158A1
Time in Patent Office: 1,803 Days
Field of Search: 707/741, 707/737, 707/754, 707/756, 707/763, 707/709, 707/711, 707/765, 707/776, 706/25

CPC Class Codes

G06F 16/1734   Details of monitoring file ...

G06F 16/21   Design, administration or m...

G06F 16/2228   Indexing structures

G06F 16/2365   Ensuring data consistency a...

G06F 16/24573   using data annotations, e.g...

G06F 16/285   Clustering or classification

G06F 16/335   Filtering based on addition...

G06F 16/90332   Natural language query form...

G06F 16/9535   Search customisation based ...
