Performance and scalability in an intelligent data operating layer system
Abstract
Systems and methods provide an intelligence platform for distributed processing of big data sets, including both structured and unstructured data types, across two or more intelligent data operation engine servers. Each intelligent data operation engine server can form a conceptual understanding of the content in each electronic file and then cooperate with a distributed index handler to index that conceptual understanding. A query pipeline and the distributed index handler in the intelligence platform cooperate with the two or more intelligent data operation engine servers to improve scalability and performance on big data sets containing both structured and unstructured electronic files represented in a common index.
20 Claims
1. A system comprising:
- an index handler;
- data servers configured for distributed processing of structured and unstructured data, where a first of the data servers is configured to:
  - use a statistical algorithm to assign weights to terms found in content of electronic files,
  - use idea distancing between similar concepts to find different words in the electronic files that describe a same idea,
  - form a conceptual understanding of content in each of the electronic files using the statistical algorithm and the idea distancing, and
  - cooperate with the index handler to form a common index of the conceptual understanding of each of the electronic files, where the common index includes at least a conceptual understanding of a first of the electronic files that contains structured data and a conceptual understanding of a second of the electronic files that contains unstructured data; and
- a query pipeline that includes an action handler that is configured to support a distribution of command actions in a common protocol to the data servers,
- wherein the data servers are configurable to selectively run in one of a mirror mode and a non-mirror mode, wherein the data servers in the mirror mode have a same configuration and contain same data, and wherein the data servers in the non-mirror mode are configured differently and contain different data, and
- wherein the index handler is configurable to selectively run in one of the mirror mode and the non-mirror mode, and wherein the index handler in the mirror mode is configured to distribute same index data of the common index to the data servers, and the index handler in the non-mirror mode is configured to distribute different portions of the common index to the corresponding data servers.

Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 17
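The difference between the two index handler modes recited in claim 1 can be illustrated with a minimal sketch. It is not taken from the patent: the names `Mode` and `distribute_index`, the server labels, and the round-robin rule used to split the index in non-mirror mode are all assumptions made for illustration. The claim only requires that mirror mode sends the same index data to every data server and that non-mirror mode sends different portions.

```python
from enum import Enum


class Mode(Enum):
    """Hypothetical labels for the two modes named in the claim."""
    MIRROR = "mirror"
    NON_MIRROR = "non-mirror"


def distribute_index(entries, servers, mode):
    """Return a mapping of server name -> list of index entries it receives."""
    if not servers:
        raise ValueError("at least one data server is required")
    if mode is Mode.MIRROR:
        # Mirror mode: every data server receives the same index data.
        return {server: list(entries) for server in servers}
    # Non-mirror mode: each data server receives a different portion.
    # Round-robin partitioning is an assumption of this sketch.
    assignment = {server: [] for server in servers}
    for position, entry in enumerate(entries):
        assignment[servers[position % len(servers)]].append(entry)
    return assignment


# Hypothetical index entries and server names.
entries = ["doc-a", "doc-b", "doc-c", "doc-d", "doc-e"]
servers = ["server-1", "server-2"]
mirrored = distribute_index(entries, servers, Mode.MIRROR)
partitioned = distribute_index(entries, servers, Mode.NON_MIRROR)
```

In this sketch, mirror mode trades storage for redundancy and read throughput, while non-mirror mode spreads the index so that each server holds only part of it.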
16. A method comprising:
- assigning weights to terms found in content of electronic files;
- using idea distancing between similar concepts to find different words in the electronic files that describe a same idea;
- using both the assigned weights and the idea distancing to form a conceptual understanding of the content in each of the electronic files;
- cooperating with an index handler to form a common index of the conceptual understanding of each of the electronic files, where the common index includes at least a conceptual understanding of a first of the electronic files that contains structured data and a conceptual understanding of a second of the electronic files that contains unstructured data;
- distributing, by a query pipeline, command actions in a common protocol to data servers;
- selectively configuring the data servers to selectively run in one of a mirror mode and a non-mirror mode, wherein the data servers in the mirror mode have a same configuration and contain same data, and wherein the data servers in the non-mirror mode are configured differently and contain different data; and
- selectively configuring the index handler to selectively run in one of the mirror mode and the non-mirror mode, wherein the index handler in the mirror mode is configured to distribute same index data of the common index to the data servers, and the index handler in the non-mirror mode is configured to distribute different portions of the common index to corresponding data servers.

Dependent claims: 18
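The first step of the method, assigning weights to terms, can be illustrated with a small sketch. The claim does not say which statistical algorithm is used, so TF-IDF is substituted here purely as a familiar stand-in. The function name `term_weights`, the whitespace tokenizer, and the sample documents are likewise assumptions, and idea distancing is not modeled at all.

```python
import math
from collections import Counter


def term_weights(documents):
    """Return one {term: weight} dict per document, using TF-IDF as a stand-in."""
    # Naive lowercase whitespace tokenization (an assumption of this sketch).
    tokenized = [doc.lower().split() for doc in documents]

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    total = len(tokenized)
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Term frequency times log inverse document frequency.
        weights.append({
            term: (count / len(tokens)) * math.log(total / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical sample content.
docs = ["index handler index", "query handler"]
weights = term_weights(docs)
```

With this weighting, a term that appears in every document (here `handler`) receives a weight of zero, while terms concentrated in one document receive positive weights.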
19. A non-transitory computer-readable storage medium storing instructions that upon execution cause a system to:
- assign weights to terms found in content of electronic files;
- use idea distancing between similar concepts to find different words in the electronic files that describe a same idea;
- form a conceptual understanding of the content in each of the electronic files using both the assigned weights and the idea distancing;
- cooperate with an index handler to form a common index of the conceptual understanding of each of the electronic files, where the common index includes at least a conceptual understanding of a first of the electronic files that contains structured data and a conceptual understanding of a second of the electronic files that contains unstructured data;
- distribute, by a query pipeline, command actions in a common protocol to data servers;
- selectively configure the data servers to selectively run in one of a mirror mode and a non-mirror mode, wherein the data servers in the mirror mode have a same configuration and contain same data, and wherein the data servers in the non-mirror mode are configured differently and contain different data; and
- selectively configure the index handler to selectively run in one of the mirror mode and the non-mirror mode, wherein the index handler in the mirror mode is configured to distribute same index data of the common index to the data servers, and the index handler in the non-mirror mode is configured to distribute different portions of the common index to corresponding data servers.

Dependent claims: 20
Specification