Distributed locks are used to synchronize accesses shared resources. Most applications today use ZooKeeper to model distributed locks.

The simplest way to model a lock using ZooKeeper is (See ZooKeeper leader recipe for an exact and more advanced solution)

  • Each process tries to create an emphemeral node
  • If the node is successfully created, the process acquires the lock
  • Otherwise, it will watch the ZNode and try to acquire the lock again if the current lock holder disappears

This is good enough if there is only one lock. But in practice, an application will need many such locks. Distributing and managing the locks among difference process becomes challenging. Extending such a solution to many locks will result in:

  • Uneven distribution of locks among nodes; the node that starts first will acquire all the locks. Nodes that start later will be idle.
  • When a node fails, how the locks will be distributed among remaining nodes is not predicable.
  • When new nodes are added the current nodes don't relinquish the locks so that new nodes can acquire some locks

In other words we want a system to satisfy the following requirements.

  • Distribute locks evenly among all nodes to get better hardware utilization
  • If a node fails, the locks that were acquired by that node should be evenly distributed among other nodes
  • If nodes are added, locks must be evenly re-distributed among nodes.

Helix provides a simple and elegant solution to this problem. Simply specify the number of locks and Helix will ensure that above constraints are satisfied.

To quickly see this working run the lock-manager-demo script where 12 locks are evenly distributed among three nodes, and when a node fails, the locks get re-distributed among remaining two nodes. Note that Helix does not re-shuffle the locks completely, instead it simply distributes the locks relinquished by dead node among 2 remaining nodes evenly.


Short Version

This version starts multiple threads within the same process to simulate a multi node deployment. Try the long version to get a better idea of how it works.

git clone https://git-wip-us.apache.org/repos/asf/helix.git
cd helix
git checkout tags/helix-1.4.0
mvn clean install package -DskipTests
cd recipes/distributed-lock-manager/target/distributed-lock-manager-pkg/bin
chmod +x *
./lock-manager-demo

Output

./lock-manager-demo
STARTING localhost_12000
STARTING localhost_12002
STARTING localhost_12001
STARTED localhost_12000
STARTED localhost_12002
STARTED localhost_12001
localhost_12001 acquired lock:lock-group_3
localhost_12000 acquired lock:lock-group_8
localhost_12001 acquired lock:lock-group_2
localhost_12001 acquired lock:lock-group_4
localhost_12002 acquired lock:lock-group_1
localhost_12002 acquired lock:lock-group_10
localhost_12000 acquired lock:lock-group_7
localhost_12001 acquired lock:lock-group_5
localhost_12002 acquired lock:lock-group_11
localhost_12000 acquired lock:lock-group_6
localhost_12002 acquired lock:lock-group_0
localhost_12000 acquired lock:lock-group_9
lockName    acquired By
======================================
lock-group_0    localhost_12002
lock-group_1    localhost_12002
lock-group_10    localhost_12002
lock-group_11    localhost_12002
lock-group_2    localhost_12001
lock-group_3    localhost_12001
lock-group_4    localhost_12001
lock-group_5    localhost_12001
lock-group_6    localhost_12000
lock-group_7    localhost_12000
lock-group_8    localhost_12000
lock-group_9    localhost_12000
Stopping localhost_12000
localhost_12000 Interrupted
localhost_12001 acquired lock:lock-group_9
localhost_12001 acquired lock:lock-group_8
localhost_12002 acquired lock:lock-group_6
localhost_12002 acquired lock:lock-group_7
lockName    acquired By
======================================
lock-group_0    localhost_12002
lock-group_1    localhost_12002
lock-group_10    localhost_12002
lock-group_11    localhost_12002
lock-group_2    localhost_12001
lock-group_3    localhost_12001
lock-group_4    localhost_12001
lock-group_5    localhost_12001
lock-group_6    localhost_12002
lock-group_7    localhost_12002
lock-group_8    localhost_12001
lock-group_9    localhost_12001


Long version

This provides more details on how to setup the cluster and where to plugin application code.

Start ZooKeeper

./start-standalone-zookeeper 2199

Create a Cluster

./helix-admin --zkSvr localhost:2199 --addCluster lock-manager-demo

Create a Lock Group

Create a lock group and specify the number of locks in the lock group.

./helix-admin --zkSvr localhost:2199  --addResource lock-manager-demo lock-group 6 OnlineOffline --mode AUTO_REBALANCE

Start the Nodes

Create a Lock class that handles the callbacks.

public class Lock extends StateModel {
  private String lockName;

  public Lock(String lockName) {
    this.lockName = lockName;
  }

  public void lock(Message m, NotificationContext context) {
    System.out.println(" acquired lock:"+ lockName );
  }

  public void release(Message m, NotificationContext context) {
    System.out.println(" releasing lock:"+ lockName );
  }

}

and a LockFactory that creates Locks

public class LockFactory extends StateModelFactory<Lock> {
    /* Instantiates the lock handler, one per lockName */
    public Lock create(String lockName) {
        return new Lock(lockName);
    }
}

At node start up, simply join the cluster and Helix will invoke the appropriate callbacks on the appropriate Lock instance. One can start any number of nodes and Helix detects that a new node has joined the cluster and re-distributes the locks automatically.

public class LockProcess {
  public static void main(String args) {
    String zkAddress= "localhost:2199";
    String clusterName = "lock-manager-demo";
    //Give a unique id to each process, most commonly used format hostname_port
    String instanceName ="localhost_12000";
    ZKHelixAdmin helixAdmin = new ZKHelixAdmin(zkAddress);
    //configure the instance and provide some metadata
    InstanceConfig config = new InstanceConfig(instanceName);
    config.setHostName("localhost");
    config.setPort("12000");
    admin.addInstance(clusterName, config);
    //join the cluster
    HelixManager manager;
    manager = HelixManagerFactory.getHelixManager(clusterName,
                                                  instanceName,
                                                  InstanceType.PARTICIPANT,
                                                  zkAddress);
    manager.getStateMachineEngine().registerStateModelFactory("OnlineOffline", modelFactory);
    manager.connect();
    Thread.currentThread.join();
  }
}

Start the Controller

The controller can be started either as a separate process or can be embedded within each node process

Separate Process

This is recommended when number of nodes in the cluster > 100. For fault tolerance, you can run multiple controllers on different boxes.

./run-helix-controller --zkSvr localhost:2199 --cluster lock-manager-demo 2>&1 > /tmp/controller.log &
Embedded Within the Node Process

This is recommended when the number of nodes in the cluster is less than 100. To start a controller from each process, simply add the following lines to MyClass

public class LockProcess {
  public static void main(String args) {
    String zkAddress= "localhost:2199";
    String clusterName = "lock-manager-demo";
    // .
    // .
    manager.connect();
    HelixManager controller;
    controller = HelixControllerMain.startHelixController(zkAddress,
                                                          clusterName,
                                                          "controller",
                                                          HelixControllerMain.STANDALONE);
    Thread.currentThread.join();
  }
}

Back to top

Reflow Maven skin maintained by Olivier Lamy.

Apache Helix, Apache, the Apache feather logo, and the Apache Helix project logos are trademarks of The Apache Software Foundation. All other marks mentioned may be trademarks or registered trademarks of their respective owners.
Privacy Policy