Method and system for managing servers in a server cluster
First Claim
1. A method of passively monitoring servers in a server cluster comprising machine-implemented steps of:
receiving request traffic that is sent from clients to the server cluster;
routing the request traffic to a server in the server cluster;
receiving response traffic from the server in the server cluster;
wherein the response traffic is returned from the server to the clients, the response traffic corresponding to the request traffic;
detecting, within a configured retry time period, whether a number of abnormal end sessions in the response traffic exceeds a first configured failure threshold;
wherein the response traffic includes packets from all of a plurality of connections between the clients and the server;
wherein the number of abnormal end sessions in the response traffic is determined across all of the plurality of connections between the clients and the server;
in response to detecting, within the configured retry time period, that the number of abnormal end sessions in the response traffic exceeds the first configured failure threshold, performing the steps of:
changing a state of the server to a first state that indicates that the server is at least temporarily removed from the server cluster, and starting a first state time clock;
sending the response traffic to the clients;
when the first state time clock expires, changing the state of the server to a second state that indicates that the server is included in the server cluster;
receiving further response traffic from the server in the server cluster;
wherein the further response traffic corresponds to further request traffic that was sent from the clients to the server cluster;
detecting, within the configured retry time period, whether a number of abnormal end sessions in the further response traffic exceeds a second configured failure threshold;
wherein the further response traffic includes packets from all of the plurality of connections between the clients and the server;
wherein the number of abnormal end sessions in the further response traffic is determined across all of the plurality of connections between the clients and the server;
in response to detecting, within the configured retry time period, that the number of abnormal end sessions in the further response traffic exceeds the second configured failure threshold, changing the state of the server to a third state that indicates that the server is removed from the server cluster;
wherein said second configured failure threshold is less than said first configured failure threshold;
sending the further response traffic to the clients;
wherein the method is performed by one or more network devices.
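The state transitions recited in claim 1 can be sketched as a small state machine. This is only an illustrative reading of the claim language, not the patented implementation: the state names (ACTIVE, SUSPENDED, REINSTATED, REMOVED), the class and parameter names, and the use of a sliding time window to model the "configured retry time period" are all assumptions. Forwarding of response traffic to the clients is not modeled.

```python
from collections import deque
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    ACTIVE = "active"          # initial state (implied, not named in the claim)
    SUSPENDED = "suspended"    # "first state": temporarily out of the cluster
    REINSTATED = "reinstated"  # "second state": included in the cluster again
    REMOVED = "removed"        # "third state": removed from the cluster


@dataclass
class PassiveMonitor:
    first_threshold: int   # first configured failure threshold
    second_threshold: int  # second threshold, required to be lower
    retry_period: float    # configured retry time period (seconds, assumed)
    suspend_time: float    # length of the first state time clock (assumed)
    state: State = State.ACTIVE
    _failures: deque = field(default_factory=deque)
    _suspended_until: float = 0.0

    def __post_init__(self):
        if self.second_threshold >= self.first_threshold:
            raise ValueError("second threshold must be less than the first")

    def tick(self, now: float) -> None:
        """Move to the second state once the first state time clock expires."""
        if self.state is State.SUSPENDED and now >= self._suspended_until:
            self.state = State.REINSTATED
            self._failures.clear()

    def observe(self, now: float, abnormal_end: bool) -> State:
        """Record one ended session seen in the server's response traffic.

        Sessions from all client connections feed the same counter.
        """
        self.tick(now)
        if self.state in (State.SUSPENDED, State.REMOVED) or not abnormal_end:
            return self.state
        self._failures.append(now)
        # Assumption: only failures inside the retry period are counted.
        while self._failures and now - self._failures[0] > self.retry_period:
            self._failures.popleft()
        count = len(self._failures)
        if self.state is State.ACTIVE and count > self.first_threshold:
            self.state = State.SUSPENDED
            self._suspended_until = now + self.suspend_time
            self._failures.clear()
        elif self.state is State.REINSTATED and count > self.second_threshold:
            self.state = State.REMOVED
        return self.state
```

With a first threshold of 3 and a second threshold of 1, four abnormal ends inside the retry period suspend the server. After the clock expires the server is reinstated, and two further abnormal ends inside the retry period are then enough to remove it, which reflects the requirement that the second threshold be lower than the first.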
Abstract
A method of managing servers in a server cluster is disclosed. The health of servers is detected through passive return traffic monitoring. Server failure can be detected through TCP information or HTTP return codes. Various settings, including the failure thresholds and the time period within which failures are detected, can be configured. Servers can be mapped to URLs such that passive health monitoring can be performed for URLs instead of server clusters.
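As a rough illustration of the detection signals named in the abstract, the sketch below classifies an ended session as abnormal from either TCP information or an HTTP return code, and tallies failures per (server, URL) pair. The specific signals are assumptions chosen for the example: the abstract does not say which TCP information or which return codes are used, so a server-sent RST and the 5xx range stand in here, and all function and variable names are hypothetical.

```python
from collections import Counter
from typing import Optional

# Assumption: the set of failure codes is configurable; 5xx is only an
# illustrative default and is not taken from the source.
DEFAULT_FAILURE_CODES = frozenset(range(500, 600))


def is_abnormal_end(tcp_reset: bool,
                    http_status: Optional[int] = None,
                    failure_codes=DEFAULT_FAILURE_CODES) -> bool:
    """Return True if an ended session looks like a server failure."""
    if tcp_reset:  # assumed TCP signal: the server reset the connection
        return True
    return http_status is not None and http_status in failure_codes


def count_failures_by_url(sessions):
    """Tally abnormal ends per (server, URL) rather than per server.

    Each session is a tuple: (server, url, tcp_reset, http_status).
    """
    failures = Counter()
    for server, url, tcp_reset, http_status in sessions:
        if is_abnormal_end(tcp_reset, http_status):
            failures[(server, url)] += 1
    return failures
```

Keying the counter on the URL as well as the server lets one failing URL be taken out of rotation while the same server keeps serving other URLs, which is one plausible reading of the URL mapping the abstract describes.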
20 Claims
1. A method of passively monitoring servers in a server cluster, recited in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7.
8. An apparatus operable to passively monitor servers in a server cluster, the apparatus comprising:
one or more processors;
a network interface communicatively coupled to the one or more processors and configured to communicate one or more packet flows among the one or more processors in a network; and
a computer readable medium comprising one or more sequences of instructions which, when executed by the one or more processors, cause the one or more processors to perform the steps of:
receiving request traffic that is sent from clients to the server cluster;
routing the request traffic to a server in the server cluster;
receiving response traffic from the server in the server cluster;
wherein the response traffic is returned from the server to the clients, the response traffic corresponding to the request traffic;
detecting, within a configured retry time period, whether a number of abnormal end sessions in the response traffic exceeds a first configured failure threshold;
wherein the response traffic includes packets from all of a plurality of connections between the clients and the server;
wherein the number of abnormal end sessions in the response traffic is determined across all of the plurality of connections between the clients and the server;
in response to detecting, within the configured retry time period, that the number of abnormal end sessions in the response traffic exceeds the first configured failure threshold, performing the steps of:
changing a state of the server to a first state that indicates that the server is at least temporarily removed from the server cluster, and starting a first state time clock;
sending the response traffic to the clients;
when the first state time clock expires, changing the state of the server to a second state that indicates that the server is included in the server cluster;
receiving further response traffic from the server in the server cluster;
wherein the further response traffic corresponds to further request traffic that was sent from the clients to the server cluster;
detecting, within the configured retry time period, whether a number of abnormal end sessions in the further response traffic exceeds a second configured failure threshold;
wherein the further response traffic includes packets from all of the plurality of connections between the clients and the server;
wherein the number of abnormal end sessions in the further response traffic is determined across all of the plurality of connections between the clients and the server;
in response to detecting, within the configured retry time period, that the number of abnormal end sessions in the further response traffic exceeds the second configured failure threshold, changing the state of the server to a third state that indicates that the server is removed from the server cluster;
wherein said second configured failure threshold is less than said first configured failure threshold;
sending the further response traffic to the clients.
Dependent claims: 9, 10, 11, 12, 13.
14. A non-transitory computer-readable medium storing one or more sequences of instructions for passively monitoring servers in a server cluster, which instructions, when executed by one or more processors, cause the one or more processors to perform the steps of:
receiving request traffic that is sent from clients to the server cluster;
routing the request traffic to a server in the server cluster;
receiving response traffic from the server in the server cluster;
wherein the response traffic is returned from the server to the clients, the response traffic corresponding to the request traffic;
detecting, within a configured retry time period, whether a number of abnormal end sessions in the response traffic exceeds a first configured failure threshold;
wherein the response traffic includes packets from all of a plurality of connections between the clients and the server;
wherein the number of abnormal end sessions in the response traffic is determined across all of the plurality of connections between the clients and the server;
in response to detecting, within the configured retry time period, that the number of abnormal end sessions in the response traffic exceeds the first configured failure threshold, performing the steps of:
changing a state of the server to a first state that indicates that the server is at least temporarily removed from the server cluster, and starting a first state time clock;
sending the response traffic to the clients;
when the first state time clock expires, changing the state of the server to a second state that indicates that the server is included in the server cluster;
receiving further response traffic from a server in the server cluster;
wherein the further response traffic corresponds to further request traffic that was sent from the clients to the server cluster;
detecting, within the configured retry time period, whether a number of abnormal end sessions in the further response traffic exceeds a second configured failure threshold;
wherein the further response traffic includes packets from all of the plurality of connections between the clients and the server;
wherein the number of abnormal end sessions in the further response traffic is determined across all of the plurality of connections between the clients and the server;
in response to detecting, within the configured retry time period, that the number of abnormal end sessions in the further response traffic exceeds the second configured failure threshold, changing the state of the server to a third state that indicates that the server is removed from the server cluster;
wherein said second configured failure threshold is less than said first configured failure threshold;
sending the further response traffic to the clients.
Dependent claims: 15, 16, 17, 18, 19, 20.
Specification