ZooKeeper Cluster (Multi-Server) Setup

ZooKeeper is a distributed coordination service for distributed applications.  It allows distributed processes to coordinate with each other through a shared hierarchical namespace that is organized much like a standard file system.  The namespace consists of data registers, called znodes in ZooKeeper parlance, which are similar to files and directories.  Unlike a typical file system, which is designed for storage, ZooKeeper keeps its data in memory.  ZooKeeper runs in Java and has bindings for both Java and C.


ZooKeeper Cluster – Terminology

The ZooKeeper service is replicated over a set of hosts called an ensemble.  A replicated group of servers in the same application is called a quorum.  All servers in the quorum have copies of the same configuration file, and the QuorumPeers together form a ZooKeeper ensemble.  Because ZooKeeper requires a majority of its servers to be up, it’s recommended to use an odd number of machines.  For example, with five machines ZooKeeper can handle the failure of two machines.
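The majority rule above is plain arithmetic, and a small sketch makes the odd-number advice concrete.  The function names here are my own, purely for illustration; they are not part of ZooKeeper.

```shell
# Smallest number of servers that forms a majority of an N-server ensemble.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# How many servers may fail while a majority is still up.
tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 3 4 5 6; do
  echo "servers=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

Going from 3 to 4 servers (or from 5 to 6) raises the quorum size without raising the number of failures that can be tolerated, which is why even-sized ensembles buy nothing extra.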


Designing ZooKeeper Deployment

Yes, designing.  A few things need to be clear before we begin deploying ZooKeeper, so let’s answer the following questions:

  • How many ZooKeeper servers do we plan to deploy (odd numbers are best)?
  • How many physical machines (boxes) will participate in the deployment?
  • Prepare ZooKeeper port #’s for deployment
    • Client port #
    • Quorum port #
    • Leader election port #
  • Where to put the data directory (dataDir) for each ZooKeeper server?
  • Where to put the data log directory (dataLogDir) for each ZooKeeper server?
  • How much memory to allocate to ZooKeeper?

Deployment Diagram of this Article:

Zookeeper Cluster (Multi-Server) - Deployment Diagram

Here are the answers used in this article:

Q1: How many ZooKeeper servers do we plan to deploy (odd numbers are best)?
Answer: Planning to deploy 5 ZooKeeper servers
Q2: How many physical machines (boxes) will participate in the deployment?
Answer: This article uses one physical box for the ZooKeeper deployment
Q3: Prepare ZooKeeper port #'s for deployment?

Answer: As mentioned above, this article runs 5 ZooKeeper servers on one box, so I have to choose unique port #’s for each server.  Ensure the selected ports are open on your box; ZooKeeper servers use TCP to communicate with each other.

Note: Plan the port #’s carefully, otherwise the ZooKeeper servers may not be able to communicate with each other, which will delay a successful deployment.

Q4: Where to put Data directory (dataDir) for ZooKeeper Server?

Answer: I’m planning to place it in /Users/jeeva/zookeeper/data and use this directory as the data home

Note: The data directory is one of the performance factors, see Point #3 at Performance & Availability Considerations

Q5: Where to put Data Log directory (dataLogDir) for ZooKeeper Server?

Answer: I’m planning to place it in /Users/jeeva/zookeeper/log and use this directory as the log home

Note: The log directory is one of the performance factors, see Point #4 at Performance & Availability Considerations

Q6: Memory allocation for ZooKeeper?

Answer: For this demo article, I’m keeping the heap size at its default.  To customize it, create a file called java.env in {ZooKeeperHome}/conf/

Note: JVM heap size contributes significantly to performance, see Point #5 at Performance & Availability Considerations
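For readers who do want to customize the heap, here is a minimal sketch of writing java.env.  It relies on the stock scripts sourcing conf/java.env and passing JVMFLAGS to the JVM; the function name, the example path and the 512m value are my own illustrative assumptions, not recommendations.

```shell
# Write conf/java.env for one server.
#   $1 = the server's conf directory, i.e. {ZooKeeperHome}/conf
#   $2 = maximum heap, e.g. 512m (illustrative; size it for your box)
write_java_env() (
  conf_dir=$1
  max_heap=$2
  printf 'export JVMFLAGS="-Xmx%s"\n' "$max_heap" > "$conf_dir/java.env"
)

# Example (hypothetical path and heap size):
# write_java_env /Users/jeeva/zookeeper/zk-server-1/conf 512m
```

Whatever value you pick, keep it below the real memory available to ZooKeeper so that the server never swaps.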

Okay, now that we have the answers, let’s move on.


Deploying ZooKeeper Cluster (Multi-Server) Setup

Let’s begin installation and configuration of ZooKeeper.

Step 1: Create the directory structure, as decided in the design section
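A minimal sketch of this step, assuming one sub-directory per server under the data home and the log home.  The sub-directory names zk1 to zk5 and the function name are my own assumptions for illustration.

```shell
# Create per-server data and log directories under a base directory.
#   $1 = base directory (this article uses /Users/jeeva/zookeeper)
#   $2 = number of ZooKeeper servers
create_zk_dirs() (
  base=$1
  count=$2
  i=1
  while [ "$i" -le "$count" ]; do
    # zkN is an assumed naming scheme, one directory per server
    mkdir -p "$base/data/zk$i" "$base/log/zk$i"
    i=$((i + 1))
  done
)

# Example for the 5 servers of this article:
# create_zk_dirs /Users/jeeva/zookeeper 5
```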

Let’s take a look at the directory structure created above:

Okay, looks good!

Step 2: Create the ZooKeeper server ID file (myid).  This file resides in the ZooKeeper data directory of each server and holds that server’s ID.  Go ahead and use your favorite text editor
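If you would rather script it than use an editor, the sketch below writes one myid file per server, each containing only that server’s number.  The data/zkN layout and the function name are assumptions of mine for illustration.

```shell
# Write the myid file for each server into its data directory.
#   $1 = base directory, $2 = number of servers
write_myids() (
  base=$1
  count=$2
  i=1
  while [ "$i" -le "$count" ]; do
    mkdir -p "$base/data/zk$i"
    # myid holds a single line with the server's ID and nothing else
    echo "$i" > "$base/data/zk$i/myid"
    i=$((i + 1))
  done
)

# Example:
# write_myids /Users/jeeva/zookeeper 5
```

The number in myid has to match the N of the corresponding server.N entry in zoo.cfg.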

Step 3: Downloading ZooKeeper Release

Download a ZooKeeper release from http://hadoop.apache.org/zookeeper/releases.html; this article uses version 3.4.4 of ZooKeeper.  However, the same principles apply to other versions too.

Step 4: Extract & prepare ZooKeeper for deployment
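One way to script this step is sketched below: unpack the release tarball once into a scratch directory, copy it into one home directory per server, then remove the scratch copy.  The zk-server-N homes under the base directory and the function name are my assumptions, and the sketch uses a mktemp scratch directory rather than a fixed path under /tmp.

```shell
# Unpack a ZooKeeper release and copy it once per server.
#   $1 = path to the release tarball (e.g. zookeeper-3.4.4.tar.gz)
#   $2 = base directory, $3 = number of servers
install_zk_copies() (
  tarball=$1
  base=$2
  count=$3
  work=$(mktemp -d) || exit 1
  tar -xzf "$tarball" -C "$work" || exit 1
  # The tarball unpacks into a single top-level directory; pick it up.
  src=""
  for d in "$work"/*/; do
    src=${d%/}
    break
  done
  [ -d "$src" ] || exit 1
  mkdir -p "$base"
  i=1
  while [ "$i" -le "$count" ]; do
    cp -R "$src" "$base/zk-server-$i"
    i=$((i + 1))
  done
  rm -rf "$work"   # clean up the scratch copy
)

# Example:
# install_zk_copies ~/Downloads/zookeeper-3.4.4.tar.gz /Users/jeeva/zookeeper 5
```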

Once done, don’t forget to clean up /tmp/zookeeper-3.4.4

Step 5: Prepare the ZooKeeper configuration file, zoo.cfg, at {zk-server-1}/conf/zoo.cfg.  Here I show the steps for Server 1; repeat them with the appropriate values (clientPort, dataDir, dataLogDir) for each of the other ZooKeeper servers.

Place the configuration below into it.
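As a sketch of what such a file can look like, the function below generates zoo.cfg for one server of a single-box ensemble.  The port scheme (client ports from 2181, quorum ports from 2888, leader election ports from 3888, each incremented per server), the tickTime/initLimit/syncLimit values, the data/zkN and log/zkN paths and the function name are all assumptions for illustration; substitute the values from your own design answers.

```shell
# Generate conf/zoo.cfg for server $2 of a $3-server ensemble on one box.
#   $1 = base directory, $2 = this server's ID, $3 = number of servers
write_zoo_cfg() (
  base=$1
  id=$2
  count=$3
  mkdir -p "$base/zk-server-$id/conf"
  {
    echo "tickTime=2000"
    echo "initLimit=10"
    echo "syncLimit=5"
    echo "dataDir=$base/data/zk$id"
    echo "dataLogDir=$base/log/zk$id"
    echo "clientPort=$((2180 + id))"          # assumed: 2181, 2182, ...
    i=1
    while [ "$i" -le "$count" ]; do
      # server.N=host:quorumPort:leaderElectionPort (assumed port scheme)
      echo "server.$i=localhost:$((2887 + i)):$((3887 + i))"
      i=$((i + 1))
    done
  } > "$base/zk-server-$id/conf/zoo.cfg"
)

# Example: one config per server
# for id in 1 2 3 4 5; do write_zoo_cfg /Users/jeeva/zookeeper "$id" 5; done
```

The server.N list is identical in every file; only clientPort, dataDir and dataLogDir change from server to server.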

Screenshot: zk-server-1 directory structure along with conf/zoo.cfg

Step 6: Configure the ZooKeeper logger for deployment.  The following are the default values of log4j.properties, which suit a development setup; update them to match your environment and needs:

Step 7: Once zoo.cfg has been created for all the servers, we can start the ZooKeeper servers.  Let’s start zk-server-1

Now go ahead and start the remaining 4 ZooKeeper servers.  Tail the zookeeper.out file in the bin directory to see more information.

zkServer.sh supports the following commands:

We will use the ‘status’ command to see the ZooKeeper server status:
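Starting five servers and checking them one by one gets repetitive, so here is a small sketch that runs the same zkServer.sh sub-command (for example start, stop, restart or status) against every server.  It assumes the zk-server-N homes under one base directory; that layout and the function name are mine.

```shell
# Run one zkServer.sh sub-command against every server on this box.
#   $1 = base directory, $2 = number of servers, $3 = sub-command
zk_all() (
  base=$1
  count=$2
  cmd=$3
  i=1
  while [ "$i" -le "$count" ]; do
    echo "== zk-server-$i: $cmd"
    # Run from the server's own bin directory so zookeeper.out lands there.
    (cd "$base/zk-server-$i/bin" && ./zkServer.sh "$cmd")
    i=$((i + 1))
  done
)

# Examples:
# zk_all /Users/jeeva/zookeeper 5 start
# zk_all /Users/jeeva/zookeeper 5 status
```

Once a majority of the servers is running, status should report one server as the leader and the others as followers.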

ZooKeeper CLI Client

The ZooKeeper command line interface is handy for administration.  See How to Connect ZooKeeper through CLI? and the famous Four letter commands.
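As a quick illustration, the four letter commands are sent as plain text to a server’s client port.  The sketch below assumes nc (netcat) is installed and that a server listens on localhost:2181; the port and the function name are assumptions for this example.

```shell
# Send a four letter command (ruok, stat, conf, ...) to a ZooKeeper server.
#   $1 = host, $2 = client port, $3 = command
zk_4lw() {
  printf '%s' "$3" | nc "$1" "$2"
}

# Examples (need a running server; port 2181 is assumed):
# zk_4lw localhost 2181 ruok    # a healthy server answers "imok"
# zk_4lw localhost 2181 stat    # mode (leader/follower) and client details
#
# Interactive CLI shipped with ZooKeeper:
# bin/zkCli.sh -server localhost:2181
```

Some nc variants need an extra timeout option before they return, so treat the exact invocation as something to adapt to your system.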


Integrating ZooKeeper Cluster

Typically, a ZooKeeper-enabled application will be able to connect right away.  To integrate ZooKeeper with an application, all you need to know for every ZooKeeper server is:

  • host-address/host-ip
  • port-no

For example: SolrCloud, Elasticsearch, etc.
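Client applications usually take these host and port pairs as one comma-separated connection string.  The sketch below builds such a string for a single-box ensemble whose client ports are consecutive; the starting port 2181 and the function name are assumptions for illustration.

```shell
# Build a ZooKeeper connection string "host:port,host:port,...".
#   $1 = host, $2 = first client port, $3 = number of servers
zk_connect_string() (
  host=$1
  first_port=$2
  count=$3
  out=""
  i=0
  while [ "$i" -lt "$count" ]; do
    # append with a comma separator except before the first entry
    out="${out:+$out,}${host}:$((first_port + i))"
    i=$((i + 1))
  done
  echo "$out"
)

zk_connect_string localhost 2181 5
# prints localhost:2181,localhost:2182,localhost:2183,localhost:2184,localhost:2185
```

Listing every server, rather than just one, lets the client fail over to another server when the one it is connected to goes down.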


Maintenance – Basic Elements

Typically there are two things to take care of on a ZooKeeper server, and this is the tricky part.

  • Data directory Cleanup
  • Debug Log Cleanup (log4j)

Why do I call maintenance the tricky part?  Let me explain: ZooKeeper provides autopurge settings such as autopurge.snapRetainCount and autopurge.purgeInterval.  However, in real-world scenarios maintenance activities differ from organization to organization; they depend on the organization’s IT policy and business requirements.

Please have a look at ZooKeeper Maintenance and plan yours!
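If autopurge does fit your policy, the sketch below sets the two properties in an existing zoo.cfg, replacing any earlier autopurge lines.  The values in the example (keep 3 snapshots, purge every 24 hours) and the function name are illustrative assumptions, not recommendations.

```shell
# Set the autopurge properties in a zoo.cfg file.
#   $1 = path to zoo.cfg
#   $2 = autopurge.snapRetainCount (snapshots to keep)
#   $3 = autopurge.purgeInterval in hours (0 disables autopurge)
enable_autopurge() (
  cfg=$1
  retain=$2
  interval_hours=$3
  {
    # keep everything except earlier autopurge lines
    grep -v '^autopurge\.' "$cfg" || true
    echo "autopurge.snapRetainCount=$retain"
    echo "autopurge.purgeInterval=$interval_hours"
  } > "$cfg.new" && mv "$cfg.new" "$cfg"
)

# Example (hypothetical path and values):
# enable_autopurge /Users/jeeva/zookeeper/zk-server-1/conf/zoo.cfg 3 24
```

Autopurge only covers the data directory; the log4j debug logs still need their own rotation or cleanup.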


Performance & Availability Considerations

  • As long as a majority of the ensemble is up, the ZooKeeper service will be available.  Because it requires a majority, it is best to use an odd number of machines.  For example, with four machines ZooKeeper can only handle the failure of a single machine; if two machines fail, the remaining two machines do not constitute a majority.  However, with five machines ZooKeeper can handle the failure of two machines.
  • It’s critical that you run ZooKeeper under supervision, since ZooKeeper is fail-fast and will exit the process if it encounters any error case.  See the Supervision section of the ZooKeeper Administrator’s Guide for more details.
  • The ZooKeeper data directory contains files which are a persistent copy of the znodes stored by a particular serving ensemble.  These are the snapshot and transaction log files.  As changes are made to the znodes, they are appended to a transaction log.  Occasionally, when a log grows large, a snapshot of the current state of all znodes is written to the filesystem.  This snapshot supersedes all previous logs.
  • ZooKeeper’s transaction log must be on a dedicated device.  (A dedicated partition is not enough.)  ZooKeeper writes the log sequentially, without seeking.  Sharing your log device with other processes can cause seeks and contention, which in turn can cause multi-second delays.
  • Do not put ZooKeeper in a situation that can cause a swap.  In order for ZooKeeper to function with any sort of timeliness, it simply cannot be allowed to swap.  Therefore, make certain that the maximum heap size given to ZooKeeper is not bigger than the amount of real memory available to ZooKeeper.  For more on this, see Things to Avoid.

References

http://zookeeper.apache.org/doc/r3.3.4/zookeeperOver.html
http://zookeeper.apache.org/doc/r3.3.4/zookeeperStarted.html
http://zookeeper.apache.org/doc/r3.3.4/zookeeperAdmin.html