Method and system for high performance integration, processing and searching of structured and unstructured data using coprocessors
Abstract
Disclosed herein is a method and system for integrating an enterprise's structured and unstructured data to provide users and enterprise applications with efficient and intelligent access to that data. In accordance with exemplary embodiments, the generation of metadata indexes about unstructured data can be hardware-accelerated by processing streaming unstructured data through a reconfigurable logic device to generate the metadata about the unstructured data for the index.
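The flow described in the abstract (stream items in, generate metadata, build an index from that metadata) can be illustrated with a minimal software sketch. This is only an analogy under stated assumptions: the patent describes firmware modules on a reconfigurable logic device, whereas the sketch below chains ordinary Python generators. The names `NAME_TERMS`, `tokenize`, `detect_terms`, and `build_index`, along with the sample items, are hypothetical and do not come from the patent.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Set, Tuple

# Hypothetical dictionary of name-related terms (an assumption for illustration).
NAME_TERMS = {"alice", "bob"}

Item = Tuple[str, str, str]  # (source, item_id, text)
Hit = Tuple[str, str, str]   # (term, source, item_id)


def tokenize(items: Iterable[Item]) -> Iterator[Tuple[str, str, List[str]]]:
    """First stage: split each streamed item into lowercase words."""
    for source, item_id, text in items:
        words = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
        yield source, item_id, [w for w in words if w]


def detect_terms(tokens, terms: Set[str] = NAME_TERMS) -> Iterator[Hit]:
    """Second stage: emit metadata saying which item contains which term."""
    for source, item_id, words in tokens:
        for term in sorted(terms.intersection(words)):
            yield term, source, item_id


def build_index(hits: Iterable[Hit]) -> Dict[str, Set[Tuple[str, str]]]:
    """Final stage: map each detected term to the locations of matching items."""
    index = defaultdict(set)
    for term, source, item_id in hits:
        index[term].add((source, item_id))
    return dict(index)


# Sample items from two different sources (made up for this sketch).
stream = [
    ("email", "msg-1", "Alice sent the report to Bob."),
    ("news", "story-7", "Markets were quiet on Tuesday."),
    ("email", "msg-2", "Bob replied the next day."),
]

# Stages are chained so each item flows through all of them in turn.
index = build_index(detect_terms(tokenize(stream)))
```

Looking up `index["bob"]` returns the source and identifier of every item in which that term was detected, which is the kind of location metadata the index is meant to support at query time.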
39 Claims
1. A method for building a metadata index for unstructured data for a plurality of different data sources, the method comprising:
receiving streaming unstructured data into a reconfigurable logic device, the streaming unstructured data comprising a plurality of data items for a plurality of different sources, wherein the reconfigurable logic device has a plurality of pipelined firmware application modules deployed thereon;
the pipelined firmware application modules analyzing the streaming unstructured data to generate metadata about the streaming unstructured data at hardware processing speeds, the analyzing including detecting whether a term relating to a name is found in any of the data items, the generated metadata comprising data associated with the data item that is indicative of where a data item having the detected term can be located; and
generating an index about the streaming unstructured data from the generated metadata, the index for subsequent querying to locate data items of interest based on associations between the metadata and the data items.
Dependent claims: 2 to 16.
17. An apparatus for building a metadata index for unstructured data for a plurality of different data sources, the apparatus comprising:
a reconfigurable logic device; and
a memory;
wherein the reconfigurable logic device is configured to receive streaming unstructured data, the streaming unstructured data comprising a plurality of data items for a plurality of different sources, wherein the reconfigurable logic device has a plurality of pipelined firmware application modules deployed thereon;
the pipelined firmware application modules configured to perform analysis of the streaming unstructured data to generate metadata about the streaming unstructured data at hardware processing speeds, the analysis including a detection by the pipelined firmware application modules whether a term relating to a name is found in any of the data items, the generated metadata comprising data associated with the data item that is indicative of where a data item having the detected term can be located; and
the memory configured to store an index about the streaming unstructured data from the generated metadata, the index for querying to locate data items of interest based on associations between the metadata and the data items.
18. A method for integrating unstructured data for a plurality of different data sources, the method comprising:
streaming unstructured data through a field programmable gate array (FPGA), the unstructured data comprising at least two members of the group consisting of (1) a plurality of emails, (2) a plurality of social network communications, (3) a plurality of corporate documents, and (4) a plurality of news reports;
the FPGA performing a metadata generation operation on the unstructured data streamed therethrough to thereby generate metadata about the unstructured data;
storing the unstructured data in a data store of unstructured data;
storing the metadata about the unstructured data in a database of structured data; and
determining a connectedness of a plurality of subjects based on an analysis of the stored unstructured data and the stored metadata.
Dependent claims: 19 to 22.
23. A method for building a metadata index for unstructured data for a plurality of different data sources, the method comprising:
receiving streaming unstructured data into a reconfigurable logic device, the streaming unstructured data comprising a plurality of data items for a plurality of different sources, wherein the reconfigurable logic device has a plurality of pipelined firmware application modules deployed thereon;
the pipelined firmware application modules analyzing the streaming unstructured data to generate metadata about the streaming unstructured data at hardware processing speeds, the analyzing including detecting whether a term relating to a security instrument is found in any of the data items, the generated metadata comprising data associated with the data item that is indicative of where a data item having the detected term can be located; and
generating an index about the streaming unstructured data from the generated metadata, the index for subsequent querying to locate data items of interest based on associations between the metadata and the data items.
Dependent claims: 24 to 38.
39. An apparatus for building a metadata index for unstructured data for a plurality of different data sources, the apparatus comprising:
a reconfigurable logic device; and
a memory;
wherein the reconfigurable logic device is configured to receive streaming unstructured data, the streaming unstructured data comprising a plurality of data items for a plurality of different sources, wherein the reconfigurable logic device has a plurality of pipelined firmware application modules deployed thereon;
the pipelined firmware application modules configured to perform analysis of the streaming unstructured data to generate metadata about the streaming unstructured data at hardware processing speeds, the analysis including a detection by the pipelined firmware application modules whether a term relating to a security instrument is found in any of the data items, the generated metadata comprising data associated with the data item that is indicative of where a data item having the detected term can be located; and
the memory configured to store an index about the streaming unstructured data from the generated metadata, the index for querying to locate data items of interest based on associations between the metadata and the data items.
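Claim 18 recites determining a connectedness of a plurality of subjects from the stored data and metadata, but the claim text itself does not say how connectedness is measured. As one possible reading, the sketch below scores each pair of subjects by the number of items in which both were detected. The co-occurrence measure, the `metadata` table, and the `connectedness` function are all assumptions made for illustration and are not taken from the patent.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Set, Tuple

# Hypothetical metadata rows: item identifier -> subjects detected in that item.
metadata: Dict[str, Set[str]] = {
    "msg-1": {"alice", "bob"},
    "msg-2": {"bob", "carol"},
    "doc-9": {"alice", "bob", "carol"},
}


def connectedness(item_subjects: Dict[str, Set[str]]) -> Counter:
    """Count, for each pair of subjects, the items in which both appear.

    Co-occurrence counting is an assumed measure chosen for this sketch.
    """
    counts: Counter = Counter()
    for subjects in item_subjects.values():
        # Sorting gives each pair a single canonical key, e.g. ("alice", "bob").
        for pair in combinations(sorted(subjects), 2):
            counts[pair] += 1
    return counts


scores = connectedness(metadata)
```

With this sample data, `scores[("alice", "bob")]` is 2 because the two subjects share `msg-1` and `doc-9`, while `scores[("alice", "carol")]` is 1.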
Specification