HA/failover and cluster

Apache Karaf natively provides a failover mechanism. It uses a kind of master/slave topology where one instance is active and the others are in standby.

If you are looking for cluster of Apache Karaf instances (active/active), Apache Karaf Cellar is a solution.

Karaf provides failover capability using either a simple lock file or a JDBC locking mechanism. In both cases, a container-level lock system allows bundles to be preloaded into the slave Karaf instance in order to provide faster failover performance.

HA/failover (active/passive)

The Apache Karaf failover capability uses a lock system.

This container-level lock system allows bundles installed on the master to be preloaded on the slave, in order to provide faster failover performance.

Two types of lock are supported:

  • filesystem lock

  • database lock

When a first instance starts, if the lock is available, it takes the lock and become the master. If a second instance starts, it tries to acquire the lock. As the lock is already hold by the master, the instance becomes a slave, in standby mode (not active). A slave periodically check if the lock has been released or not.

Filesystem lock

The Apache Karaf instances share a lock on the filesystem. It means that the filesystem storing the lock has to be accessible to the different instances (using SAN, NFS, …​).

The configuration of the lock system has to be defined in the etc/system.properties file, on each instance (master/slave):

karaf.lock=true
karaf.lock.class=org.apache.karaf.main.lock.SimpleFileLock
karaf.lock.dir=<PathToLockFileDirectory>
karaf.lock.delay=10000
  • karaf.lock property enables the the HA/failover mechanism

  • karaf.lock.class property contains the class name providing the lock implementation. Here, we use the filesystem lock.

  • karaf.lock.dir property contains the location where the lock will be written. All instances have to share the same lock.

  • karaf.lock.delay property is the interval period (in milliseconds) to check if the lock has been released or not.

Database lock

It’s not always possible and easy to have a shared filesystem between multiple Apache Karaf instances.

Instead of sharing a filesystem, Apache Karaf supports sharing a database.

The master instance holds the lock by locking a table in the database. If the master loses the lock, a waiting slave gains access to the locking table, acquire the lock on the table and starts.

The database lock uses JDBC (Java DataBase Connectivity). To use database locking, you have to:

  • copy the JDBC driver in the lib/ext folder on each instance. The JDBC driver to use is the one corresponding to the database used for the locking system.

  • update etc/system.properties file on each instance:

karaf.lock=true
karaf.lock.class=org.apache.karaf.main.lock.DefaultJDBCLock
karaf.lock.level=50
karaf.lock.delay=10000
karaf.lock.jdbc.url=jdbc:derby://dbserver:1527/sample
karaf.lock.jdbc.driver=org.apache.derby.jdbc.ClientDriver
karaf.lock.jdbc.user=user
karaf.lock.jdbc.password=password
karaf.lock.jdbc.table=KARAF_LOCK
karaf.lock.jdbc.clustername=karaf
karaf.lock.jdbc.timeout=30
karaf.lock.lostThreshold=0
  • karaf.lock property enabled the HA/failover mechanism

  • karaf.lock.class property contains the class name providing the lock implementation. The org.apache.karaf.main.lock.DefaultJDBCLock is the most generic database lock system implementation. Apache Karaf supports lock systems for specific databases (see later for details).

  • karaf.lock.level property is the container-level locking (see later for details).

  • karaf.lock.delay property is the interval period (in milliseconds) to check if the lock has been released or not.

  • karaf.lock.lostThreshold property is the count of attempts to re-acquire the lock before shutting down.

  • karaf.lock.jdbc.url property contains the JDBC URL to the database (derby in this example).

  • karaf.lock.jdbc.driver property contains the class name of the JDBC driver to use (derby in this example).

  • karaf.lock.jdbc.user property contains the username to use to connect to the database.

  • karaf.lock.jdbc.password property contains the password to use to connet to the database.

  • karaf.lock.jdbc.table property contains the database table to use for the lock. Karaf will first try to find the table as specified in this property, and if not found, it will try the table name in lower and upper case.

Note

Apache Karaf won’t start if the JDBC driver is not present in the lib/ext folder.

Note

The sample database will be created automatically if it does not exist.

Note

If the connection to the database is lost, the master instance tries to gracefully shutdown, allowing a slave instance to become the master when the database is back. The former master instance will require a manual restart.

Lock on Oracle

Apache Karaf supports Oracle database for locking. The lock implementation class name to use is org.apache.karaf.main.lock.OracleJDBCLock:

karaf.lock=true
karaf.lock.class=org.apache.karaf.main.lock.OracleJDBCLock
karaf.lock.jdbc.url=jdbc:oracle:thin:@hostname:1521:XE
karaf.lock.jdbc.driver=oracle.jdbc.OracleDriver
karaf.lock.jdbc.user=user
karaf.lock.jdbc.password=password
karaf.lock.jdbc.table=KARAF_LOCK
karaf.lock.jdbc.clustername=karaf
karaf.lock.jdbc.timeout=30

The Oracle JDBC driver file (ojdbc*.jar) has to be copied in the lib/ext folder.

Note

The karaf.lock.jdbc.url property contains a JDBC URL which requires an active SID. It means that you must manually create the Oracle database instance first before using the lock mechanism.

Lock on Derby

Apache Karaf supports Apache Derby database for locking. The lock implementation class name to use is org.apache.karaf.main.lock.DerbyJDBCLock:

karaf.lock=true
karaf.lock.class=org.apache.karaf.main.lock.DerbyJDBCLock
karaf.lock.jdbc.url=jdbc:derby://127.0.0.1:1527/dbname
karaf.lock.jdbc.driver=org.apache.derby.jdbc.ClientDriver
karaf.lock.jdbc.user=user
karaf.lock.jdbc.password=password
karaf.lock.jdbc.table=KARAF_LOCK
karaf.lock.jdbc.clustername=karaf
karaf.lock.jdbc.timeout=30

The Derby JDBC driver file name has to be copied in the lib/ext folder.

Lock on MySQL

Apache Karaf supports MySQL database for locking. The lock implementation class name to use is org.apache.karaf.main.lock.MySQLJDBCLock:

karaf.lock=true
karaf.lock.class=org.apache.karaf.main.lock.MySQLJDBCLock
karaf.lock.jdbc.url=jdbc:mysql://127.0.0.1:3306/dbname
karaf.lock.jdbc.driver=com.mysql.jdbc.Driver
karaf.lock.jdbc.user=user
karaf.lock.jdbc.password=password
karaf.lock.jdbc.table=KARAF_LOCK
karaf.lock.jdbc.clustername=karaf
karaf.lock.jdbc.timeout=30

The MySQL JDBC driver file name has to be copied in lib/ext folder.

Lock on PostgreSQL

Apache Karaf supports PostgreSQL database for locking. The lock implementation class name to use is org.apache.karaf.main.lock.PostgreSQLJDBCLock:

karaf.lock=true
karaf.lock.class=org.apache.karaf.main.lock.PostgreSQLJDBCLock
karaf.lock.jdbc.url=jdbc:postgresql://127.0.0.1:1527/dbname
karaf.lock.jdbc.driver=org.postgresql.Driver
karaf.lock.jdbc.user=user
karaf.lock.jdbc.password=password
karaf.lock.jdbc.table=KARAF_LOCK
karaf.lock.jdbc.clustername=karaf
karaf.lock.jdbc.timeout=0

The PostgreSQL JDBC driver file has to be copied in the lib/ext folder.

Lock on Microsoft SQLServer

Apache Karaf supports Microsoft SQLServer database for locking. The lock implementation class name to use is org.apache.karaf.main.lock.SQLServerJDBCLock:

karaf.lock=true
karaf.lock.class=org.apache.karaf.main.lock.SQLServerJDBCLock
karaf.lock.level=50
karaf.lock.delay=10000
karaf.lock.jdbc.url=jdbc:jtds:sqlserver://127.0.0.1;databaseName=sample
karaf.lock.jdbc.driver=net.sourceforge.jtds.jdbc.Driver
karaf.lock.jdbc.user=user
karaf.lock.jdbc.password=password
karaf.lock.jdbc.table=KARAF_LOCK
karaf.lock.jdbc.clustername=karaf
karaf.lock.jdbc.timeout=30

The JTDS JDBC driver file has to be copied in the lib/ext folder.

Container-level locking

Apache Karaf supports container-level locking. It allows bundles to be preloaded into the slave instance. Thanks to that, switching to a slave instance is very fast as the slave instance already contains all required bundles.

The container-level locking is supported in both filesystem and database lock mechanisms.

The container-level locking uses the karaf.lock.level property:

karaf.lock.level=50

The karaf.lock.level property tells the Karaf instance how far up the boot process to bring the OSGi container. All bundles with an ID equals or lower to this start level will be started in that Karaf instance.

As reminder, the bundles start levels are specified in etc/startup.properties, in the url=level format.

Level Behavior

1

A 'cold' standby instance. Core bundles are not loaded into container. Slaves will wait until lock acquired to start server.

<50

A 'hot' standby instance. Core bundles are loaded into the container. Slaves will wait until lock acquired to start user level bundles. The console will be accessible for each slave instance at this level.

>50

This setting is not recommended as user bundles will end up being started.

Note

Using 'hot' standby means that the slave instances are running and bound to some ports. So, if you use master and slave instances on the same machine, you have to update the slave configuration to bind the services (ssh, JMX, etc) on different port numbers.

Cluster (active/active)

Apache Karaf doesn’t natively support clustering. By cluster, we mean several active instances, synchronized with each other.

However, Apache Karaf Cellar can be installed to provide cluster support.