Distributed catalog service for data processing platform

  • US 10,425,350 B1
  • Filed: 11/30/2017
  • Issued: 09/24/2019
  • Est. Priority Date: 04/06/2015
  • Status: Active Grant
First Claim

1. A method comprising:

  • configuring a plurality of distributed processing nodes, each comprising a processor coupled to a memory, to communicate over a network;

    abstracting content locally accessible in respective data zones of respective ones of the distributed processing nodes into respective catalogs of a distributed catalog service in accordance with a layered extensible data model;

    providing in the distributed processing nodes a plurality of microservices for performing processing operations on at least one of the layered extensible data model and the catalogs of the distributed catalog service; and

    executing an application distributed across at least two of the plurality of distributed processing nodes utilizing the catalogs of the distributed catalog service to determine, for each of the at least two distributed processing nodes, a subset of a plurality of data resources utilized by the application that are located within its corresponding one of the data zones;

    wherein each of the catalogs of the distributed catalog service is configured to track data resources within its corresponding one of the data zones through addressing the data resources based on semantic content of the data resources expressed through metadata;

    wherein the layered extensible model comprises:

    a data layer configured to persist the catalogs of the distributed catalog service;

    a core data model layer configured to provide a set of core classes for classifying the data resources in the respective data zones; and

    at least one extensions layer configured to extend respective ones of the core classes to at least one of:

    one or more additional classes; and

    instances of one or more of the core classes and the additional classes;

    wherein the microservices comprise at least one microservice configured to establish relationships between data resources and metadata using one or more of the core classes, the additional classes, and the instances of the core classes and additional classes; and

    wherein the microservices further comprise at least one microservice configured to automate a process of metadata collection and ingestion for one or more discovered data hubs and data sources to populate the catalogs of the distributed catalog service.
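
As a purely illustrative reading of the claim, the following minimal Python sketch shows one catalog per data zone tracking local resources by metadata tags, and an application asking each catalog which of its resources are local. All names here (Resource, Catalog, register, lookup, local_subsets, the zone names and the tags) are hypothetical and do not come from the patent; the sketch omits the data layer, the extensions layer, and the microservices.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Resource:
    """Hypothetical core class: a resource addressed by metadata, not location."""
    name: str
    metadata: frozenset  # semantic tags, e.g. {"genomics", "raw"}


@dataclass
class Catalog:
    """Hypothetical catalog for one data zone; it tracks only local resources."""
    zone: str
    resources: list = field(default_factory=list)

    def register(self, name, *tags):
        # Record a resource together with the metadata that describes it.
        self.resources.append(Resource(name, frozenset(tags)))

    def lookup(self, *tags):
        # Return names of local resources whose metadata includes every tag.
        wanted = set(tags)
        return [r.name for r in self.resources if wanted <= r.metadata]


def local_subsets(catalogs, *tags):
    """For each zone, find the subset of matching resources located in it."""
    return {c.zone: c.lookup(*tags) for c in catalogs}


zone_a = Catalog("zone-a")
zone_a.register("reads_a.parquet", "genomics", "raw")
zone_a.register("billing_a.csv", "finance")

zone_b = Catalog("zone-b")
zone_b.register("reads_b.parquet", "genomics", "raw")

subsets = local_subsets([zone_a, zone_b], "genomics")
print(subsets)
```

In this sketch each node would process only the names returned for its own zone, so the data itself never has to leave the zone in which it resides.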
