Sunday, July 15, 2012

ZooKeeper: Leader Election and Group Membership

It has been less than a week since I started playing with ZooKeeper to understand how it can be used for our existing platform, and in this post I would like to document its Leader Election feature.

I used the algorithm described on the ZooKeeper recipes page. A few key things to note:
  • ZooKeeper is analogous to a file system, where each node can be thought of as either a DIRECTORY (an intermediate node that contains files or subdirectories) or a FILE (a leaf node), with the root node denoted Unix-style as "/".
  • Make the nodes EPHEMERAL and SEQUENTIAL. This is easily achieved by passing CreateMode.EPHEMERAL_SEQUENTIAL when creating a node with the create() method.
  • Implement the Watcher interface. Think of Watcher objects as listeners that register with ZooKeeper to be notified of an event.
  • The following diagram should help in understanding the concept visually.

The algorithm is self-explanatory when read alongside the commented code.
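As a minimal sketch of the selection step in the recipe (not the code from this post), the class below takes a list of child names and decides whether a given node is the leader or which node it should watch. The class name ElectionSketch and its methods are hypothetical, and the child list is hard-coded; in a real client the list would come from ZooKeeper's getChildren() and the returned name would be passed to exists() with a Watcher. Neither call is made here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical helper: pure logic only, no ZooKeeper connection.
class ElectionSketch {

    /** Numeric sequence suffix of a child name such as "NODE-0000000061". */
    static int sequenceOf(String childName) {
        return Integer.parseInt(childName.substring(childName.lastIndexOf('-') + 1));
    }

    /**
     * Returns the child that `self` should watch (the one with the next-lower
     * sequence number), or empty if `self` has the lowest sequence number and
     * is therefore the leader.
     */
    static Optional<String> nodeToWatch(List<String> children, String self) {
        List<String> sorted = new ArrayList<>(children);
        sorted.sort(Comparator.comparingInt(ElectionSketch::sequenceOf));
        int index = sorted.indexOf(self);
        if (index < 0) {
            throw new IllegalArgumentException(self + " is not among the children");
        }
        return index == 0 ? Optional.empty() : Optional.of(sorted.get(index - 1));
    }

    public static void main(String[] args) {
        // Children are not guaranteed to arrive in order, so the list is sorted first.
        List<String> children =
                Arrays.asList("NODE-0000000062", "NODE-0000000060", "NODE-0000000061");
        for (String node : children) {
            Optional<String> watch = nodeToWatch(children, node);
            System.out.println(node
                    + (watch.isPresent() ? " follows " + watch.get() : " is the leader"));
        }
    }
}
```

Watching only the immediate predecessor, rather than the leader, means a leader crash wakes up a single follower instead of every process in the group.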

Let's move on to a working example. I compiled the code into a jar and ran it as several independent processes. The bottom-right window in the screenshot is the ZooKeeper server's view, connected through its command-line interface (zkCli.sh).

1) Initial state:

2) The first node, NODE-0000000060, connects to the ZooKeeper server.

3) Similarly, the second and third instances connect to the ZooKeeper server. (Notice that the list commands issued in the bottom-right window confirm that they have registered; NODE-0000000061 is following NODE-0000000060, and NODE-0000000062 is following NODE-0000000061.)


4) The leader, NODE-0000000060, is killed. Notice how the follower NODE-0000000061 takes over the leader's responsibility.

5) The process in the top-left window is resurrected. Notice that it now becomes a follower, NODE-0000000064, and starts following NODE-0000000062.

6) Kill an intermediate follower (a process that is neither the leader nor the last follower, but an in-between node that forms a link in the logical Watcher chain). Note that re-election must be initiated to repair the broken link, so that leader responsibilities can still pass to the last remaining node if all previous nodes happen to crash. In the following screenshot, the intermediate node, NODE-0000000062, sits between the leader, NODE-0000000061, and NODE-0000000064. When the intermediate node crashes, NODE-0000000064 starts following the leader, NODE-0000000061. NODE-0000000064 in turn becomes the leader when NODE-0000000061 crashes.
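The walkthrough above can be replayed without a server. The following is a small simulation under stated assumptions: the class WatchChainDemo and its methods are hypothetical, a sorted in-memory set stands in for the ephemeral children, and removing a name from the set models ZooKeeper deleting the node when its process dies. It relies on the node names being zero-padded to the same width, so plain string order matches sequence order.

```java
import java.util.TreeSet;

// Hypothetical in-memory model of the watch chain; no ZooKeeper calls are made.
class WatchChainDemo {
    // Live ephemeral nodes, kept sorted by name.
    private final TreeSet<String> live = new TreeSet<>();

    void join(String node) { live.add(node); }

    // Models the ephemeral node disappearing when its owner crashes.
    void crash(String node) { live.remove(node); }

    // The lowest live node is the leader.
    String leader() { return live.first(); }

    // The next-lower live node, or null when `node` is itself the leader.
    String watched(String node) { return live.lower(node); }

    public static void main(String[] args) {
        WatchChainDemo group = new WatchChainDemo();
        group.join("NODE-0000000060");
        group.join("NODE-0000000061");
        group.join("NODE-0000000062");
        System.out.println("leader: " + group.leader());                    // NODE-0000000060

        group.crash("NODE-0000000060");                                     // step 4
        System.out.println("leader: " + group.leader());                    // NODE-0000000061

        group.join("NODE-0000000064");                                      // step 5
        System.out.println("64 watches: " + group.watched("NODE-0000000064")); // NODE-0000000062

        group.crash("NODE-0000000062");                                     // step 6
        System.out.println("64 watches: " + group.watched("NODE-0000000064")); // NODE-0000000061

        group.crash("NODE-0000000061");
        System.out.println("leader: " + group.leader());                    // NODE-0000000064
    }
}
```

In a real client the re-linking in step 6 does not happen automatically: the process that receives the deletion event has to list the children again and set a new watch on whichever node now precedes it.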

Build a jar (leader-election.jar) using the source code.

Thus, leader election and group management can be implemented easily using ZooKeeper. Next, I am moving on to distributed locking using ZooKeeper.


2 comments:

  1. Hi, I came across the problem and found your blog. Nice post!
    Could you elaborate a bit on how you leverage Group Membership in Leader Election? I only see node creation and watching rather than membership management.
    Thanks!

    1. Hello Jay, membership management is the client's responsibility, using ZooKeeper's node watch feature. ZooKeeper notifies the watching client in the form of an event, and it is up to the client to act on that event.

      One way a client can apply membership is on a first-come, first-served basis.
