ZooKeeper is a distributed coordination service for distributed applications. It allows distributed processes to coordinate with each other through a shared hierarchical namespace which is organized similarly to a standard file system. The namespace consists of data registers – called znodes, in ZooKeeper parlance – and these are similar to files and directories. Unlike a typical file system, which is designed for storage, ZooKeeper data is kept in memory. ZooKeeper runs in Java and has bindings for both Java and C.
- ZooKeeper Cluster – Terminology
- Designing ZooKeeper Deployment
- Deploying ZooKeeper Cluster (Multi-Server) Setup
- Integrating ZooKeeper Cluster
- Maintenance – Basic Elements
- Performance & Availability Considerations
ZooKeeper Cluster – Terminology
The ZooKeeper service is replicated over a set of hosts called an ensemble. A replicated group of servers in the same application is called a quorum. All servers in the quorum have copies of the same configuration file, and the QuorumPeers together form a ZooKeeper ensemble. Because ZooKeeper requires a majority of servers to be up, it's recommended to use an odd number of machines/servers. For example, with five machines ZooKeeper can handle the failure of two machines.
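The majority rule is easy to check with a little arithmetic. Here is a minimal sketch; the function names quorum_size and tolerated_failures are made up for this illustration and are not part of ZooKeeper.

```shell
# Hypothetical helpers (not part of ZooKeeper): majority arithmetic
# for an ensemble of a given size.

# Smallest number of servers that forms a majority of $1 servers.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Number of servers that can fail while a majority is still up.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}
```

Calling tolerated_failures with 5 and then with 6 gives the same answer, which is why an extra even-numbered server buys no additional fault tolerance.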
Designing ZooKeeper Deployment
Yes, designing. A few things need to be clear before we begin deploying ZooKeeper, so let's answer the following questions:
- How many ZooKeeper servers do you plan to deploy? (Odd numbers are best.)
- How many physical machines will participate in the deployment?
- Which ports will ZooKeeper use in the deployment?
  - Client port #
  - Quorum port #
  - Leader election port #
- Where to put the data directory (dataDir) for each ZooKeeper server?
- Where to put the data log directory (dataLogDir) for each ZooKeeper server?
- How much memory to allocate to ZooKeeper?
Deployment Diagram of this Article:
Here are the answers for this article:
Answer (servers, machines and ports): For this article I'm deploying 5 ZooKeeper servers on one box, so I have to choose unique port numbers for each server. Ensure the selected ports are open on your box; ZooKeeper servers use TCP to communicate with each other.
Note: Plan the port numbers carefully; otherwise the ZooKeeper servers may not be able to communicate with each other, which can delay a successful deployment!
    ----------------------------------------------------------------
    | Server ID | Client Port | Quorum Port | Leader Election Port |
    ----------------------------------------------------------------
    | 1         | 2181        | 2888        | 3888                 |
    | 2         | 2182        | 2889        | 3889                 |
    | 3         | 2183        | 2890        | 3890                 |
    | 4         | 2184        | 2891        | 3891                 |
    | 5         | 2185        | 2892        | 3892                 |
    ----------------------------------------------------------------
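The port numbers in the table above follow a simple pattern: each port is a fixed base plus the server ID. A small sketch of that pattern is below. The helper name zk_ports is hypothetical, and the numbering is only this article's single-box convention, not something ZooKeeper requires.

```shell
# Hypothetical helper: print "client quorum election" ports for a server ID,
# using this article's numbering scheme (2180+id, 2887+id, 3887+id).
zk_ports() {
    id=$1
    echo "$(( 2180 + id )) $(( 2887 + id )) $(( 3887 + id ))"
}
```

For example, zk_ports 3 prints the three ports listed in the row for server 3.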
Answer (dataDir): I'm placing it in /Users/jeeva/zookeeper/data and using this directory as the data home.
Note: The data directory is one of the performance factors; see point #3 in Performance & Availability Considerations.
Answer (dataLogDir): I'm placing it in /Users/jeeva/zookeeper/log and using this directory as the log home.
Note: The log directory is one of the performance factors; see point #4 in Performance & Availability Considerations.
Answer (memory): For this demo and article I'm keeping the heap size at its default. To customize it, create a file called java.env in {ZooKeeperHome}/conf/.
Note: JVM heap size contributes significantly to performance; see point #5 in Performance & Availability Considerations.
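If you do want to customize the heap, here is a sketch of how java.env could be written. It assumes the stock start scripts, which source conf/java.env when it exists and pass the JVMFLAGS variable to the JVM; the helper name write_java_env and the heap value you pass in are placeholders of mine, so size the heap for your own machine.

```shell
# Hypothetical helper: write a java.env into a server's conf directory.
# $1 = conf directory
# $2 = maximum heap, e.g. 1g (a placeholder; keep it below the physical
#      memory actually available to the server)
write_java_env() {
    conf_dir=$1
    max_heap=$2
    # Single quotes keep $JVMFLAGS literal so it is expanded only when the
    # ZooKeeper scripts source the file.
    printf 'export JVMFLAGS="-Xmx%s $JVMFLAGS"\n' "$max_heap" > "$conf_dir/java.env"
}
```

For example: write_java_env /Users/jeeva/zookeeper/zk-server-1/conf 1g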
Okay, now that we have the answers, let's move on.
Deploying ZooKeeper Cluster (Multi-Server) Setup
Let’s begin installation and configuration of ZooKeeper.
Step 1: Create the directory structure decided in the design section
    mac-book-pro:demo jeeva$ mkdir -p /Users/jeeva/zookeeper/zk-server-1 /Users/jeeva/zookeeper/zk-server-2 /Users/jeeva/zookeeper/zk-server-3 /Users/jeeva/zookeeper/zk-server-4 /Users/jeeva/zookeeper/zk-server-5

    mac-book-pro:demo jeeva$ mkdir -p /Users/jeeva/zookeeper/data/zk1 /Users/jeeva/zookeeper/data/zk2 /Users/jeeva/zookeeper/data/zk3 /Users/jeeva/zookeeper/data/zk4 /Users/jeeva/zookeeper/data/zk5

    mac-book-pro:demo jeeva$ mkdir -p /Users/jeeva/zookeeper/log/zk1 /Users/jeeva/zookeeper/log/zk2 /Users/jeeva/zookeeper/log/zk3 /Users/jeeva/zookeeper/log/zk4 /Users/jeeva/zookeeper/log/zk5
Let's take a look at the directory structure we just created:
    mac-book-pro:demo jeeva$ tree /Users/jeeva/zookeeper
    /Users/jeeva/zookeeper
    |-data
    |---zk1
    |---zk2
    |---zk3
    |---zk4
    |---zk5
    |-log
    |---zk1
    |---zk2
    |---zk3
    |---zk4
    |---zk5
    |-zk-server-1
    |-zk-server-2
    |-zk-server-3
    |-zk-server-4
    |-zk-server-5

    mac-book-pro:demo jeeva$
Okay, looks good!
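If you would rather not type every path by hand, the same layout can be created with a loop. This is just a sketch; make_zk_dirs is a name I made up, and it takes the base directory and the server count as arguments.

```shell
# Hypothetical helper: create the per-server install, data and log
# directories. $1 = base directory, $2 = number of servers.
make_zk_dirs() {
    base=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        mkdir -p "$base/zk-server-$i" "$base/data/zk$i" "$base/log/zk$i"
        i=$(( i + 1 ))
    done
}
```

For this article's layout that would be: make_zk_dirs /Users/jeeva/zookeeper 5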
Step 2: Create the ZooKeeper server ID. The ID lives in a file named myid in each server's data directory. Go ahead and use your favorite text editor.
    # enter the value '1' in the file and save it
    mac-book-pro:demo jeeva$ vi /Users/jeeva/zookeeper/data/zk1/myid

    # do the same for the remaining servers, entering 2, 3, 4 and 5 respectively
    vi /Users/jeeva/zookeeper/data/zk2/myid
    vi /Users/jeeva/zookeeper/data/zk3/myid
    vi /Users/jeeva/zookeeper/data/zk4/myid
    vi /Users/jeeva/zookeeper/data/zk5/myid
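The myid files can also be written without an editor. A minimal sketch follows, assuming the data directories use the zk1..zkN naming from Step 1; write_myids is a hypothetical helper name.

```shell
# Hypothetical helper: write each server's numeric ID into its myid file.
# $1 = base directory, $2 = number of servers.
write_myids() {
    base=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        mkdir -p "$base/data/zk$i"
        # The file holds nothing but the server ID.
        echo "$i" > "$base/data/zk$i/myid"
        i=$(( i + 1 ))
    done
}
```

For this article: write_myids /Users/jeeva/zookeeper 5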
Step 3: Downloading ZooKeeper Release
Download ZooKeeper from http://hadoop.apache.org/zookeeper/releases.html. This article uses version 3.4.4 of ZooKeeper; however, the same principles apply to other versions too.
Step 4: Extract & prepare ZooKeeper for deployment
    mac-book-pro:demo jeeva$ gzip -dc ~/Downloads/soft/zookeeper-3.4.4.tar.gz | tar -xf - -C /tmp
    mac-book-pro:demo jeeva$ cp -r /tmp/zookeeper-3.4.4/* /Users/jeeva/zookeeper/zk-server-1/
    mac-book-pro:demo jeeva$ cp -r /tmp/zookeeper-3.4.4/* /Users/jeeva/zookeeper/zk-server-2/
    mac-book-pro:demo jeeva$ cp -r /tmp/zookeeper-3.4.4/* /Users/jeeva/zookeeper/zk-server-3/
    mac-book-pro:demo jeeva$ cp -r /tmp/zookeeper-3.4.4/* /Users/jeeva/zookeeper/zk-server-4/
    mac-book-pro:demo jeeva$ cp -r /tmp/zookeeper-3.4.4/* /Users/jeeva/zookeeper/zk-server-5/
Once done, don't forget to clean up /tmp/zookeeper-3.4.4.
Step 5: Prepare the ZooKeeper configuration file, zoo.cfg, at {zk-server-1}/conf/zoo.cfg. I will show the configuration for server 1; repeat the same steps with the appropriate values (clientPort, dataDir, dataLogDir) for each of the other ZooKeeper servers.
    mac-book-pro:demo jeeva$ vi /Users/jeeva/zookeeper/zk-server-1/conf/zoo.cfg
Place the configuration below into it.
    # The number of milliseconds of each tick
    tickTime=2000

    # The number of ticks that the initial synchronization phase can take
    initLimit=10

    # The number of ticks that can pass between
    # sending a request and getting an acknowledgement
    syncLimit=5

    # the directory where the snapshot is stored.
    # Choose appropriately for your environment
    dataDir=/Users/jeeva/zookeeper/data/zk1

    # the port at which the clients will connect
    clientPort=2181

    # the directory where transaction log is stored.
    # this parameter provides dedicated log device for ZooKeeper
    dataLogDir=/Users/jeeva/zookeeper/log/zk1

    # ZooKeeper server and its port no.
    # ZooKeeper ensemble should know about every other machine in the ensemble
    # specify server id by creating 'myid' file in the dataDir
    # use hostname instead of IP address for convenient maintenance
    server.1=localhost:2888:3888
    server.2=localhost:2889:3889
    server.3=localhost:2890:3890
    server.4=localhost:2891:3891
    server.5=localhost:2892:3892
Screenshot: zk-server-1 directory structure along with conf/zoo.cfg
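Since only clientPort, dataDir and dataLogDir change from server to server, the five files can be generated rather than edited by hand. The sketch below prints a zoo.cfg for one server. render_zoo_cfg is a hypothetical helper, and it assumes this article's single-box layout and port numbering (2180+id, 2887+id, 3887+id) with every server on localhost.

```shell
# Hypothetical helper: print a zoo.cfg for one server of a single-box ensemble.
# $1 = server ID, $2 = base directory, $3 = ensemble size.
render_zoo_cfg() {
    id=$1
    base=$2
    count=$3
    echo "tickTime=2000"
    echo "initLimit=10"
    echo "syncLimit=5"
    echo "dataDir=$base/data/zk$id"
    echo "clientPort=$(( 2180 + id ))"
    echo "dataLogDir=$base/log/zk$id"
    # Every server lists the whole ensemble, itself included.
    n=1
    while [ "$n" -le "$count" ]; do
        echo "server.$n=localhost:$(( 2887 + n )):$(( 3887 + n ))"
        n=$(( n + 1 ))
    done
}
```

For server 2 that would be: render_zoo_cfg 2 /Users/jeeva/zookeeper 5 > /Users/jeeva/zookeeper/zk-server-2/conf/zoo.cfg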
Step 6: Configure the ZooKeeper logger for deployment. The following are the default values in log4j.properties; they are development-oriented, so update them to suit your environment and needs:
    zookeeper.root.logger=INFO, CONSOLE
    zookeeper.console.threshold=INFO
    zookeeper.log.dir=.
    zookeeper.log.file=zookeeper.log
    zookeeper.log.threshold=DEBUG
    zookeeper.tracelog.dir=.
    zookeeper.tracelog.file=zookeeper_trace.log
Step 7: Once zoo.cfg has been created for all the servers, we can start the ZooKeeper servers. Let's start zk-server-1:
    mac-book-pro:demo jeeva$ cd /Users/jeeva/zookeeper/zk-server-1/bin/
    mac-book-pro:bin jeeva$ ./zkServer.sh start
    JMX enabled by default
    Using config: /Users/jeeva/zookeeper/zk-server-1/bin/../conf/zoo.cfg
    Starting zookeeper ... STARTED
    mac-book-pro:bin jeeva$
Now, go ahead and start the remaining 4 ZooKeeper server(s). Tail the zookeeper.out file in the bin directory to see more information.
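Starting the servers one by one gets tedious, so here is a sketch of a loop that sends the same zkServer.sh command to every server. zk_all is a made-up helper name, and it assumes the zk-server-N directory naming used in this article.

```shell
# Hypothetical helper: run one zkServer.sh command against every server.
# $1 = base directory, $2 = number of servers,
# $3 = zkServer.sh command (start, stop, status, ...).
zk_all() {
    base=$1
    count=$2
    action=$3
    i=1
    while [ "$i" -le "$count" ]; do
        "$base/zk-server-$i/bin/zkServer.sh" "$action"
        i=$(( i + 1 ))
    done
}
```

For example: zk_all /Users/jeeva/zookeeper 5 start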
zkServer.sh supports the following commands:
    start
    start-foreground
    stop
    restart
    status
    upgrade
    print-cmd
We will use the 'status' command to see the ZooKeeper server status:
    mac-book-pro:demo jeeva$ /Users/jeeva/zookeeper/zk-server-3/bin/zkServer.sh status
    JMX enabled by default
    Using config: /Users/jeeva/zookeeper/zk-server-3/bin/../conf/zoo.cfg
    Mode: leader
    mac-book-pro:demo jeeva$

    mac-book-pro:demo jeeva$ /Users/jeeva/zookeeper/zk-server-5/bin/zkServer.sh status
    JMX enabled by default
    Using config: /Users/jeeva/zookeeper/zk-server-5/bin/../conf/zoo.cfg
    Mode: follower
    mac-book-pro:demo jeeva$
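When checking several servers, only the "Mode:" line is usually of interest. A small sketch of a filter that pulls it out of the status output is below. zk_mode is a hypothetical helper, and it assumes the output format shown above.

```shell
# Hypothetical helper: read `zkServer.sh status` output on stdin and print
# only the value of the "Mode:" line (e.g. leader or follower).
zk_mode() {
    sed -n 's/^Mode: //p'
}
```

For example: /Users/jeeva/zookeeper/zk-server-3/bin/zkServer.sh status 2>&1 | zk_mode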
ZooKeeper CLI Client
The ZooKeeper command line interface is handy for administration. See How to Connect ZooKeeper through CLI? and the well-known four-letter commands.
Integrating ZooKeeper Cluster
Typically, a ZooKeeper-enabled application can connect right away. To integrate ZooKeeper with an application, all you need to know for every ZooKeeper server is:
- host-address/host-ip
- port-no
For example: SolrCloud, Elasticsearch, etc.
- Integrating ZooKeeper ensemble with SolrCloud Cluster
- Integrating ZooKeeper ensemble with Elasticsearch Cluster – (upcoming article)
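Clients usually take the host and port pairs as one comma-separated connection string. Here is a sketch that builds that string for this article's ensemble. zk_connect_string is a made-up helper, and it assumes all servers share one host and use the 2180+id client ports chosen earlier.

```shell
# Hypothetical helper: build a comma-separated host:port list for clients.
# $1 = host, $2 = number of servers (client ports are 2180+id, as in this article).
zk_connect_string() {
    host=$1
    count=$2
    result=""
    i=1
    while [ "$i" -le "$count" ]; do
        # Prefix a comma only when result already holds an entry.
        result="${result:+$result,}$host:$(( 2180 + i ))"
        i=$(( i + 1 ))
    done
    echo "$result"
}
```

The result can be handed to the CLI client, for example: zkCli.sh -server "$(zk_connect_string localhost 5)"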
Maintenance – Basic Elements
Typically there are two things to take care of on a ZooKeeper server, and this is the tricky part:
- Data directory Cleanup
- Debug Log Cleanup (log4j)
Why do I call maintenance the tricky part? ZooKeeper provides autopurge settings such as autopurge.snapRetainCount and autopurge.purgeInterval. However, in real-world scenarios maintenance activities differ from organization to organization; they depend on the organization's IT policy and business requirements.
Please have a look at ZooKeeper Maintenance and plan yours!
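If the built-in autopurge fits your policy, the two settings go into zoo.cfg. A sketch that appends them is below. enable_autopurge is a hypothetical helper, and the values you pass are your own choice: the retain count is the number of snapshots to keep, and the interval is in hours.

```shell
# Hypothetical helper: append autopurge settings to a zoo.cfg.
# $1 = path to zoo.cfg
# $2 = number of snapshots (and matching transaction logs) to retain
# $3 = purge interval in hours
enable_autopurge() {
    cfg=$1
    {
        echo "autopurge.snapRetainCount=$2"
        echo "autopurge.purgeInterval=$3"
    } >> "$cfg"
}
```

For example: enable_autopurge /Users/jeeva/zookeeper/zk-server-1/conf/zoo.cfg 3 1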
Performance & Availability Considerations
- As long as a majority of the ensemble is up, the ZooKeeper service will be available. Because ZooKeeper requires a majority, it is best to use an odd number of machines. For example, with four machines ZooKeeper can only handle the failure of a single machine; if two machines fail, the remaining two machines do not constitute a majority. However, with five machines ZooKeeper can handle the failure of two machines.
- It's critical that you run ZooKeeper under supervision, since ZooKeeper is fail-fast and will exit the process if it encounters any error case. See the Supervision section of the ZooKeeper Administrator's Guide for more details.
- The ZooKeeper data directory contains files which are a persistent copy of the znodes stored by a particular serving ensemble. These are the snapshot files. As changes are made to the znodes, these changes are appended to a transaction log. Occasionally, when a log grows large, a snapshot of the current state of all znodes will be written to the filesystem. This snapshot supersedes all previous logs.
- ZooKeeper's transaction log must be on a dedicated device. (A dedicated partition is not enough.) ZooKeeper writes the log sequentially, without seeking. Sharing your log device with other processes can cause seeks and contention, which in turn can cause multi-second delays.
- Do not put ZooKeeper in a situation that can cause a swap. In order for ZooKeeper to function with any sort of timeliness, it simply cannot be allowed to swap. Therefore, make certain that the maximum heap size given to ZooKeeper is not bigger than the amount of real memory available to ZooKeeper. For more on this, see Things to Avoid.
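To illustrate the supervision point in the list above, here is a deliberately tiny restart loop. It is only a sketch of the idea and a stand-in for a real supervisor such as daemontools; the supervise function and its arguments are my own invention.

```shell
# Minimal supervision sketch (not a replacement for a real supervisor):
# rerun a foreground command each time it exits.
# $1 = maximum number of runs, $2 = seconds to wait between runs,
# remaining arguments = the command to run.
supervise() {
    max_runs=$1
    delay=$2
    shift 2
    runs=0
    while [ "$runs" -lt "$max_runs" ]; do
        status=0
        "$@" || status=$?
        runs=$(( runs + 1 ))
        echo "run $runs exited with status $status" >&2
        sleep "$delay"
    done
}
```

With ZooKeeper the command has to stay in the foreground, for example: supervise 1000 5 /Users/jeeva/zookeeper/zk-server-1/bin/zkServer.sh start-foreground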
References
http://zookeeper.apache.org/doc/r3.3.4/zookeeperOver.html
http://zookeeper.apache.org/doc/r3.3.4/zookeeperStarted.html
http://zookeeper.apache.org/doc/r3.3.4/zookeeperAdmin.html