On the open-source ZooKeeper client framework Curator
Curator is an open-source ZooKeeper client framework from Netflix. While using ZooKeeper, Netflix found that its built-in client is too low-level and leaves the application to handle many things itself, so they wrapped it and provided a better client framework. We also ran into problems when using ZooKeeper, so we began studying Curator, starting with its source code on GitHub, its wiki documentation, and Netflix's technical blog.
After reading the official documentation, we found that Curator mainly solves three kinds of problems:
1. It encapsulates the connection handling between the ZooKeeper client and the ZooKeeper server.
2. It provides a set of fluent-style operation APIs.
3. It provides abstractions for common ZooKeeper usage scenarios (recipes, such as shared lock services and cluster leader election).
Problems in using ZooKeeper, as listed by Curator
1. Initial connection: while the client and server are handshaking to establish a connection, calling any synchronous method (such as create or getData) throws an exception if the handshake fails.
2. Automatic failover: when a client loses its connection to one server and tries to connect to another, the client falls back to the initial connection mode.
3. Session expiration: in extreme cases the ZooKeeper session expires, and the client must watch for that state and re-create the ZooKeeper instance.
4. Recoverable exceptions: when the server creates a sequential znode successfully but crashes before returning the node name to the client, the client throws a recoverable exception; the user must catch such exceptions and retry.
5. Usage scenarios: ZooKeeper describes some standard usage scenarios, but documents them only sparsely, so they are easy to get wrong, and ZK provides no detailed documentation for certain extreme cases. For example, in a shared lock service, if the server creates an ephemeral sequential node successfully but dies before the client receives the node name, mishandling this situation leads to deadlock.
Curator reduces the complexity of using ZK mainly in the following ways:
1. Retry mechanism: a pluggable retry mechanism is provided; the configured retry policy covers all recoverable exceptions, and several standard retry policies (such as exponential backoff) are built in.
2. Connection state monitoring: after initialization, Curator continuously monitors the ZK connection and reacts whenever the connection state changes.
3. ZK client instance management: Curator manages the connections from the ZK client to the server cluster, rebuilding ZK instances when necessary to keep the connection to the cluster reliable.
4. Support for common scenarios: Curator implements most of the usage scenarios ZK supports (even some that ZK itself does not), following ZK best practices and accounting for various extreme cases.
With all of this handled, Curator lets users focus on their own business logic instead of spending energy on ZK itself.
Some highlights Curator claims:
Log tool
SLF4J is used internally for log output, and a driver mechanism allows logging and tracing to be extended and customized. A TracerDriver interface is provided; by implementing its addTrace() and addCount() methods you can integrate your own tracing framework.
zkclient, another ZooKeeper client, compared with Curator
It has almost no documentation; weak exception handling (it simply throws RuntimeException); retry handling that is too awkward to use; and no implementations of common usage scenarios. There is also a "complaint" about ZooKeeper's built-in client (the ZooKeeper class): it is only a low-level implementation that requires you to write a lot of code yourself, is easy to misuse, and forces you to handle connection loss, retries, and so on by hand.
Several components of cursor
1. Client: a replacement for the ZooKeeper class that provides some low-level handling and related utilities.
2. Framework: simplifies the use of ZooKeeper's advanced features and adds new ones, such as managing connections to the ZooKeeper cluster and retry handling.
3. Recipes: implementations of ZooKeeper's common recipes, built on top of Framework.
4. Utilities: assorted ZooKeeper utility classes.
5. Errors: exception handling, connection problems, recovery, and so on.
6. Extensions: recipe extensions.
Client
This is a low-level API that most applications can ignore; it is better to start directly with the Curator Framework. It consists mainly of three parts:
uninterruptible connection management, connection retry, and the retry loop.
A typical usage:
If an operation fails, the failure is retryable, and the retry limit has not been exceeded, Curator ensures the operation eventually completes. Another retry form uses the Callable interface:
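Both forms can be sketched as below, using the Apache Curator API (`org.apache.curator`), the successor of Netflix's `com.netflix.curator`; the class names match, only the package prefix differs. The connect string and the `/example` path are assumptions for illustration:

```java
import org.apache.curator.CuratorZookeeperClient;
import org.apache.curator.RetryLoop;
import org.apache.curator.retry.ExponentialBackoffRetry;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;

public class RetryLoopExample {
    public static void main(String[] args) throws Exception {
        // Assumed connect string; point this at a real ensemble.
        CuratorZookeeperClient client = new CuratorZookeeperClient(
                "localhost:2181", 15000, 5000, null,
                new ExponentialBackoffRetry(1000, 3));
        client.start();
        try {
            // Manual form: shouldContinue()/takeException() drive the policy.
            RetryLoop retryLoop = client.newRetryLoop();
            while (retryLoop.shouldContinue()) {
                try {
                    client.getZooKeeper().create("/example", new byte[0],
                            ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
                    retryLoop.markComplete();
                } catch (Exception e) {
                    retryLoop.takeException(e); // rethrows when not retryable or exhausted
                }
            }
            // Callable form: Curator runs the body under the retry policy.
            byte[] data = RetryLoop.callWithRetry(client,
                    () -> client.getZooKeeper().getData("/example", false, null));
            System.out.println("read " + data.length + " bytes");
        } finally {
            client.close();
        }
    }
}
```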
Retry policy
The RetryPolicy interface has only one method (earlier versions had two):
Before each retry, allowRetry is called; its parameters give the current retry count and the elapsed time of the operation. If retrying is allowed, the operation is retried; otherwise, an exception is thrown.
Retry policies implemented inside Curator:
1. ExponentialBackoffRetry: retries a given number of times, with the pause between retries growing progressively longer.
2. RetryNTimes: retries a specified number of times.
3. RetryOneTime: retries only once.
4. RetryUntilElapsed: retries until a specified amount of time has elapsed.
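A sketch of constructing these policies with the Apache Curator API (`org.apache.curator`); the timing values are arbitrary examples:

```java
import org.apache.curator.RetryPolicy;
import org.apache.curator.RetrySleeper;
import org.apache.curator.retry.ExponentialBackoffRetry;
import org.apache.curator.retry.RetryNTimes;
import org.apache.curator.retry.RetryOneTime;
import org.apache.curator.retry.RetryUntilElapsed;

public class RetryPolicyExample {
    public static void main(String[] args) throws Exception {
        // baseSleepTimeMs = 1000, maxRetries = 3: pauses grow roughly exponentially
        RetryPolicy exponential = new ExponentialBackoffRetry(1000, 3);
        // up to 5 retries, 500 ms between attempts
        RetryPolicy nTimes = new RetryNTimes(5, 500);
        // exactly one retry, after 1000 ms
        RetryPolicy oneTime = new RetryOneTime(1000);
        // keep retrying (300 ms pauses) until 10 s have elapsed
        RetryPolicy untilElapsed = new RetryUntilElapsed(10000, 300);

        // The single RetryPolicy method: allowRetry(retryCount, elapsedTimeMs, sleeper)
        RetrySleeper noSleep = (time, unit) -> { };
        System.out.println(nTimes.allowRetry(0, 0, noSleep)); // true: first retry allowed
        System.out.println(nTimes.allowRetry(5, 0, noSleep)); // false: retries exhausted
    }
}
```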
Framework
A higher-level API that abstracts over the ZooKeeper client.
Automatic connection management: when an exception occurs inside the ZooKeeper client, Curator automatically reconnects or retries; the process is almost completely transparent to the caller.
A cleaner API: it simplifies ZooKeeper's native methods and events, and provides a fluent interface.
The CuratorFrameworkFactory class provides two entry points: a factory method, newClient, and a builder. The factory method newClient creates an instance with default settings, while the builder lets you customize the instance. Once the CuratorFramework instance is built, call its start() method right away; when the application ends, call close(). CuratorFramework is thread-safe, so within one application you can share a single CuratorFramework per ZK cluster.
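Both construction styles can be sketched as follows with the Apache Curator API; the connect string and timeout values are assumptions:

```java
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class FrameworkSetupExample {
    public static void main(String[] args) throws Exception {
        // Factory method: an instance with default settings
        CuratorFramework simple = CuratorFrameworkFactory.newClient(
                "localhost:2181", new ExponentialBackoffRetry(1000, 3));

        // Builder: customize timeouts and other settings
        CuratorFramework custom = CuratorFrameworkFactory.builder()
                .connectString("localhost:2181")
                .sessionTimeoutMs(15000)
                .connectionTimeoutMs(5000)
                .retryPolicy(new ExponentialBackoffRetry(1000, 3))
                .build();

        simple.start(); // call start() right after building
        custom.start();
        try {
            // CuratorFramework is thread-safe: share one instance
            // per ZK cluster across the whole application
        } finally {
            custom.close(); // call close() when the application ends
            simple.close();
        }
    }
}
```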
The CuratorFramework API adopts a fluent interface: every operation returns a builder, and when all the pieces are chained together, the whole call reads like a complete sentence.
Method descriptions:
1. create(): begins a create operation. Combine it with other methods (such as withMode() or inBackground()) and end the chain with forPath().
2. delete(): begins a delete operation. Combine it with other methods (withVersion() or inBackground()) and end the chain with forPath().
3. checkExists(): begins an operation that checks whether a znode exists. Combine it with other methods (watched() or inBackground()) and end the chain with forPath().
4. getData(): begins an operation that fetches a znode's data. Combine it with other methods (watched(), inBackground(), or storingStatIn()) and end the chain with forPath().
5. setData(): begins an operation that sets a znode's data. Combine it with other methods (withVersion() or inBackground()) and end the chain with forPath().
6. getChildren(): begins an operation that fetches a znode's children. Combine it with other methods (watched(), inBackground(), or storingStatIn()) and end the chain with forPath().
7. inTransaction(): begins a ZooKeeper transaction. Combine create, setData, check, and/or delete operations into one unit, then commit().
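These fluent chains can be sketched as follows with the Apache Curator API; the connect string and the `/demo` path are assumptions:

```java
import java.util.List;
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.RetryOneTime;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.data.Stat;

public class FluentOpsExample {
    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "localhost:2181", new RetryOneTime(1000));
        client.start();
        try {
            // create(): combine withMode(), end with forPath()
            client.create().withMode(CreateMode.PERSISTENT)
                    .forPath("/demo", "hello".getBytes());

            // checkExists(): returns a Stat, or null when the znode is absent
            Stat stat = client.checkExists().forPath("/demo");

            // getData() / setData()
            byte[] data = client.getData().forPath("/demo");
            client.setData().withVersion(stat.getVersion())
                    .forPath("/demo", "world".getBytes());

            // getChildren()
            List<String> children = client.getChildren().forPath("/");

            // inTransaction(): check + setData committed as one unit
            client.inTransaction()
                    .check().forPath("/demo")
                .and().setData().forPath("/demo", "tx".getBytes())
                .and().commit();

            // delete()
            client.delete().forPath("/demo");
        } finally {
            client.close();
        }
    }
}
```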
Notification
Curator's listener code has been updated: the interface was renamed from ClientListener to CuratorListener, and the clientClosedDueToError method was removed. It now has a single method, eventReceived(), which is called when a background operation completes or a registered watch fires.
The UnhandledErrorListener interface is used to handle exceptions.
CuratorEvent (ClientEvent in earlier versions) is a POJO that wraps the event objects produced by the various operations; its contents depend on the event type.
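Registering the two listeners can be sketched as below, using the Apache Curator API; the connect string is an assumption, and the background read of `/` is only an illustration:

```java
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.RetryOneTime;

public class ListenerExample {
    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "localhost:2181", new RetryOneTime(1000));
        client.start();
        try {
            // eventReceived() fires when a background operation completes
            // or a registered watch is triggered
            client.getCuratorListenable().addListener((c, event) ->
                    System.out.println("type=" + event.getType()
                            + " path=" + event.getPath()));

            // UnhandledErrorListener receives otherwise-unhandled exceptions
            client.getUnhandledErrorListenable().addListener((message, e) ->
                    System.err.println("unhandled: " + message));

            // a background read: its result is delivered to the listener above
            client.getData().inBackground().forPath("/");
            Thread.sleep(1000); // crude wait for the async callback
        } finally {
            client.close();
        }
    }
}
```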
Namespace
Because a ZK cluster is shared by multiple applications, each Curator Framework instance can be assigned an (optional) namespace to avoid path conflicts between applications. When you create a znode, the namespace is automatically prepended as the root of the node's path.
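A minimal sketch with the Apache Curator API; the `myapp` namespace, `/config` path, and connect string are assumptions:

```java
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.RetryOneTime;

public class NamespaceExample {
    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.builder()
                .connectString("localhost:2181")
                .retryPolicy(new RetryOneTime(1000))
                .namespace("myapp") // every path is silently prefixed with /myapp
                .build();
        client.start();
        try {
            // Creates /myapp/config on the server,
            // but this client only ever sees "/config"
            client.create().forPath("/config", new byte[0]);
        } finally {
            client.close();
        }
    }
}
```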
Recipe
Curator implements all of ZooKeeper's recipes (except two-phase commit).
election
Cluster leader election
Lock service
Shared lock: a globally synchronized distributed lock; at any moment, no two machines hold the same lock.
Shared read/write lock: for distributed read/write mutual exclusion. Two locks are produced at once: a read lock and a write lock. The read lock can be held by multiple applications, while the write lock is exclusive; while no one holds the write lock, multiple read-lock holders can read concurrently.
Shared semaphore: every JVM in the distributed system uses the same ZK lock path, the path is associated with a given number of leases, and each application then acquires leases in request order. Comparatively, this is the fairest way to use a lock service.
Multi-shared lock: the component holds multiple shared locks (each associated with a znode path). During acquire(), the acquire() of every inner lock is executed; if any fails midway, all locks acquired so far are released. When release() is executed, release is called on every inner lock (failures are ignored).
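The shared lock can be sketched with Apache Curator's InterProcessMutex recipe; the lock path, timeout, and connect string are assumptions:

```java
import java.util.concurrent.TimeUnit;
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.locks.InterProcessMutex;
import org.apache.curator.retry.RetryOneTime;

public class SharedLockExample {
    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "localhost:2181", new RetryOneTime(1000));
        client.start();
        try {
            InterProcessMutex lock = new InterProcessMutex(client, "/locks/resource");
            if (lock.acquire(10, TimeUnit.SECONDS)) { // wait up to 10 s for the lock
                try {
                    // critical section: only one holder cluster-wide gets here
                } finally {
                    lock.release();
                }
            }
        } finally {
            client.close();
        }
    }
}
```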
Queue
Distributed queue: a FIFO queue built on persistent sequential ZK nodes. With multiple consumers, a LeaderSelector can be used to keep the consumers ordered.
Distributed priority queue: a distributed version of a priority queue.
BlockingQueueConsumer: a distributed version of the JDK's blocking queue.
Barrier
Distributed barrier: a group of clients work on a set of tasks; only after every client has finished may any of them continue.
Distributed double barrier: the clients start and finish together.
Counter
Shared counter: all clients watch the same znode path and share an integer count.
DistributedAtomicLong (and DistributedAtomicInteger): distributed versions of AtomicXxx. They first try an optimistic-lock update and fall back to a mutex update if that fails; a retry policy can be configured to control the retries.
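The atomic counter can be sketched with Apache Curator's DistributedAtomicLong recipe; the counter path and connect string are assumptions:

```java
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.atomic.AtomicValue;
import org.apache.curator.framework.recipes.atomic.DistributedAtomicLong;
import org.apache.curator.retry.RetryNTimes;
import org.apache.curator.retry.RetryOneTime;

public class CounterExample {
    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "localhost:2181", new RetryOneTime(1000));
        client.start();
        try {
            // optimistic update first, mutex as fallback; the RetryNTimes
            // policy governs the optimistic attempts
            DistributedAtomicLong counter = new DistributedAtomicLong(
                    client, "/counters/hits", new RetryNTimes(3, 100));
            AtomicValue<Long> result = counter.increment();
            if (result.succeeded()) {
                System.out.println("old=" + result.preValue()
                        + " new=" + result.postValue());
            }
        } finally {
            client.close();
        }
    }
}
```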
Utility classes
Path Cache
A path cache watches for changes to a znode's children. When a child is added, updated, or removed, the path cache updates its state and exposes the data and state of all children.
In Curator, the PathChildrenCache class implements the path cache, and PathChildrenCacheListener is used to listen for state changes.
See the TestPathChildrenCache test class for usage.
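A minimal sketch with the Apache Curator API; the `/watched` path and connect string are assumptions:

```java
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.cache.PathChildrenCache;
import org.apache.curator.retry.RetryOneTime;

public class PathCacheExample {
    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "localhost:2181", new RetryOneTime(1000));
        client.start();
        // third argument: true = also cache each child's data
        PathChildrenCache cache = new PathChildrenCache(client, "/watched", true);
        try {
            cache.getListenable().addListener((c, event) -> {
                switch (event.getType()) {
                    case CHILD_ADDED:
                        System.out.println("added: " + event.getData().getPath());
                        break;
                    case CHILD_UPDATED:
                        System.out.println("updated: " + event.getData().getPath());
                        break;
                    case CHILD_REMOVED:
                        System.out.println("removed: " + event.getData().getPath());
                        break;
                    default:
                        break;
                }
            });
            cache.start(); // begins watching /watched's children
            Thread.sleep(5000); // leave it running briefly for demonstration
        } finally {
            cache.close();
            client.close();
        }
    }
}
```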
Note: when data changes on the ZK server, the ZK client may temporarily be inconsistent with it; use version numbers to detect this.
Test Server
Simulates a local, in-process ZooKeeper server in tests.
Test Cluster
Simulates a ZooKeeper server cluster in tests.
The ZKPaths utility class
Provides utility methods for working with znode paths:
1. getNodeFromPath: gets the node name from a given path, e.g. "/one/two/three" -> "three".
2. mkdirs: recursively creates all nodes along the given path.
3. getSortedChildren: returns the children of the given path, sorted by sequence number.
4. makePath: builds a full path from a given parent path and child node name.
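The pure path helpers can be sketched as follows with the Apache Curator API (the example paths are arbitrary):

```java
import org.apache.curator.utils.ZKPaths;

public class ZKPathsExample {
    public static void main(String[] args) {
        // pure path manipulation: no ZooKeeper connection needed
        System.out.println(ZKPaths.getNodeFromPath("/one/two/three")); // three
        System.out.println(ZKPaths.makePath("/one/two", "three"));     // /one/two/three

        // mkdirs and getSortedChildren take a live org.apache.zookeeper.ZooKeeper
        // handle, e.g.: ZKPaths.mkdirs(zk, "/one/two/three");
    }
}
```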
The EnsurePath utility class
EnsurePath guarantees that a path exists: however many times it is called, the node creation operation is performed only once.
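A minimal sketch with the Apache Curator API; the `/app/base/path` and connect string are assumptions:

```java
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.RetryOneTime;
import org.apache.curator.utils.EnsurePath;

public class EnsurePathExample {
    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "localhost:2181", new RetryOneTime(1000));
        client.start();
        try {
            EnsurePath ensure = new EnsurePath("/app/base/path");
            // First call creates any missing nodes along the path...
            ensure.ensure(client.getZookeeperClient());
            // ...later calls return immediately without touching the server.
            ensure.ensure(client.getZookeeperClient());
        } finally {
            client.close();
        }
    }
}
```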
Notification event handling
Curator wraps ZooKeeper's Watcher events and implements its own notification mechanism on top, providing several listener interfaces for handling changes in the ZooKeeper connection state.
Connection problems are observed through the ConnectionStateListener interface and handled accordingly. The state changes are:
1. SUSPENDED: the connection has been lost; all operations are suspended until it is re-established. If the connection cannot be re-established within the configured time, a LOST notification fires.
2. RECONNECTED: fired when a lost connection is re-established.
3. LOST: fired when the connection has timed out (the session has expired).
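Registering such a listener can be sketched as below with the Apache Curator API; the connect string and the per-state reactions are assumptions:

```java
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.RetryOneTime;

public class ConnectionStateExample {
    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "localhost:2181", new RetryOneTime(1000));
        client.getConnectionStateListenable().addListener((c, newState) -> {
            switch (newState) {
                case SUSPENDED:
                    // connection lost: pause operations until it comes back
                    break;
                case RECONNECTED:
                    // back again: re-create ephemeral nodes, resume work
                    break;
                case LOST:
                    // session expired: rebuild any session-scoped state
                    break;
                default: // other states (e.g. the initial CONNECTED) arrive here too
                    break;
            }
        });
        client.start();
        Thread.sleep(3000); // let the initial state events arrive
        client.close();
    }
}
```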
From the validateConnection(CuratorEvent) method in com.netflix.curator.framework.imps.CuratorFrameworkImpl, we can see that Curator maps ZooKeeper's Disconnected, Expired, and SyncConnected states to SUSPENDED, LOST, and RECONNECTED, respectively.
Summary
That is everything this article has to say about the open-source ZooKeeper client framework Curator. Interested readers can also refer to: understanding ZooKeeper's watch mechanism, configuring ACL permissions for ZooKeeper, detailed examples of Apache ZooKeeper usage, and so on. I hope it helps; if anything is lacking, please leave a comment. Thank you for your support!