| Nadya Morozova, Andrey Chernyshev: document created. | June 5, 2006 |
| Nadya Morozova, Pavel Rebriy: update of graphics and text; Java* layer is now part of VM core, Porting component restructured, all scenarios rewritten. | March 20, 2008 |
This document introduces the thread manager component delivered as part of the DRL (Dynamic Runtime Layer) initiative. It focuses on the specifics of the current implementation, describing the role of the thread manager inside the DRL virtual machine and the internal organization of the thread management subsystem.
The target audience for the document includes a wide community of engineers interested in further work with threading technologies to contribute to their development. The document assumes that readers are familiar with DRLVM architecture basics, threading methodologies and structures.
This document uses the unified conventions for the DRL documentation kit.
Use this document to learn all about implementation specifics of the current version. It describes the thread manager functionality in a variety of aspects, including internal data structures, architecture specifics, and the key usage scenarios involving the thread manager. The document has the following major parts:
The thread manager (TM) is a library that provides threading capabilities for Java* virtual machines. Its main purpose is to bridge the POSIX-like threading model provided by the operating system and the Java*-like threading model implied by the J2SE specification.
In the current implementation, the JVM threading subsystem consists of three layers: the porting layer, the native layer, and the Java* layer.
Note that the thread manager consists of the native and Java* layers of the subsystem, whereas the porting layer is external.
Each layer adds certain functionality to the threading provided by the underlying layer. That is, the porting layer adds portability to the threading provided by the OS, the native layer adds Java*-specific enhancements to the porting layer, and the Java* layer adds a connection to Java* threads and objects to the native layer, as shown in Figure 1 below. The interfaces of these layers are grouped in a set of headers described in the Exported Interfaces section below.
Figure 1: Threading Subsystem
The supplied thread manager has the following characteristics:
hythread interface 
Figure 2 below demonstrates the interaction of the thread manager with the following components of the virtual machine:
java.lang.Thread objects and
appropriate native threads.
thread_helpers.h interface of the Java*
Figure 2: Thread Manager in VM Architecture
The thread manager code is mostly platform-independent and relies on the underlying porting layer to adjust to platform specifics. The current TM implementation is written on top of the DRLVM porting layer and the Apache Portable Runtime (APR). The platform-dependent TM parts are the VM helpers package, which is tied to a specific architecture, and the porting layer extensions package, which is partially tied to the OS API.
DRLVM-based and APR-based porting layers enable compilation of the thread manager code on every platform where porting is available. The current version of the thread manager supports the Linux* and Windows* OSes on x86 and x86_64 platforms, and Linux* OS on IA-64 platforms.
Subsequent sections describe the functional interfaces that the thread manager exports to interact with other VM components and its internal data structures.
As indicated in the overview, the thread manager exports the native and the Java* interfaces. These interfaces are represented as groups of functions providing specific functionality upon external requests, as described in the subsequent sections.
The native interface is inspired by the Harmony
module. This is a low-level layer that provides Java*-like native threading functionality, such as interruption
support for waiting operations (for example,
sleep) and helps
establish correct interaction of threads with the garbage collector.
This layer does not deal with Java* objects.
The native interface consists of the following function sets:
Consists of functions of the
hythread set  responsible for the following:
Includes the description of native thread structures and the set of functions
responsible for the following:
The functions of the Java* interface take Java* objects as parameters and can be easily used to implement kernel classes, JNI or JVMTI function sets. The Java* interface consists of the following parts:
java.lang.Thread API responsible for:
Functions supporting various JVMTI functions and the
java.lang.management API responsible for:
Includes the description of thread structures and functions providing accessors to them:
Consists of functions providing the assembly code stubs that help to optimize the performance due to tighter TM and JIT integration:
The thread manager data structures are typically not exposed: external
VM components access these structures via opaque pointers instead. The
pointers are defined in the public header files
Structures themselves are described in the
The thread manager requires each thread to be registered before threading functions can be called. Thread registration is called attaching a thread and can be done by using one of the following:
hythread_attach() registers the current
native thread in the thread manager, so that threading operations can
be performed over this thread via the native layer. Calling
hythread_attach() involves a call to
port_thread_attach() to register the current thread in
the porting layer.
hythread_detach() unregisters the current native
thread. This involves a call to
port_thread_detach() to unregister the current thread from the porting layer.
jthread_attach() associates the current Java* thread
with the appropriate
java.lang.Thread object, so that threading
operations can be performed over this thread via the Java* layer.
jthread_detach() disjoins the current Java* thread
from the thread manager.
Depending on the attaching function, the thread manager operates with two types of threads:
Each thread type has a structure assigned to it that holds thread-specific data, as described below.
Other VM components work with opaque handles to those structures and have no information about their contents. To work with a thread, a component calls one of the attaching functions, receives an opaque handle to the control structure of that thread, and performs all subsequent operations on the thread through this handle.
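The opaque-handle pattern described above can be sketched in C as follows. This is an illustration only: the names demo_thread_t, demo_attach(), demo_get_state() and demo_detach() are hypothetical and are not part of the hythread or jthread interfaces, and the two fields of the control structure are assumptions chosen for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* --- public header part: callers see only an opaque pointer type --- */
typedef struct demo_thread *demo_thread_t;   /* hypothetical handle type */

demo_thread_t demo_attach(void);
int           demo_get_state(demo_thread_t t);
void          demo_detach(demo_thread_t t);

/* --- private implementation part: the layout stays hidden --- */
enum { DEMO_STATE_ATTACHED = 1 };

struct demo_thread {
    int state;          /* thread state flags (assumed field) */
    int suspend_count;  /* pending suspension requests (assumed field) */
};

/* Allocates a control structure and returns it as an opaque handle. */
demo_thread_t demo_attach(void)
{
    demo_thread_t t = calloc(1, sizeof *t);
    if (t != NULL)
        t->state = DEMO_STATE_ATTACHED;
    return t;
}

/* Accessor: the only way for callers to read the state. */
int demo_get_state(demo_thread_t t)
{
    return t->state;
}

/* Releases the control structure behind the handle. */
void demo_detach(demo_thread_t t)
{
    free(t);
}
```

Because callers never see the structure layout, the fields can change between versions without recompiling the components that hold handles.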
When registered with the thread manager’s native layer, each thread obtains a control structure with all thread-specific data required for operations with the thread, such as state, attributes, references to OS-specific thread structures, and synchronization aids. The control structure is subsequently used for miscellaneous threading operations.
The actual content of a thread control structure is implementation-specific and is not exposed to other components.
For details on thread control structures, see Doxygen documentation hosted on the website.
A thread control structure of a Java* thread is
defined by the
JVMTIThread type and holds mostly JVMTI
information specific to that thread, as shown in Figure 3 below.
Figure 3: Java* Attached Thread
The thread manager enables co-existence of multiple groups of threads, for example, groups of Java* threads and GC threads that are not visible to Java* applications. Each thread maintained by the thread manager belongs to a specific thread group, as shown in Figure 4.
Figure 4: Thread Groups
The thread manager provides a set of functions for iterating over the list of threads within a specific group. All thread groups are kept in a group array, and a system-wide lock prevents concurrent modification of the group array and of the thread list inside each group. The thread manager acquires this lock internally during thread creation, deletion, and iteration over the thread list.
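The locking discipline for thread groups can be sketched as follows. This is a minimal model built on a plain POSIX mutex, not the thread manager code: the demo_group type, the demo_* functions and the fixed capacity are all hypothetical, and threads are represented by integer IDs.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define DEMO_MAX_THREADS 8

/* Hypothetical thread group: a fixed-size list of thread IDs. */
typedef struct {
    int    ids[DEMO_MAX_THREADS];
    size_t count;
} demo_group;

/* One system-wide lock guards every group and every thread list. */
static pthread_mutex_t demo_global_lock = PTHREAD_MUTEX_INITIALIZER;

/* Adds a thread to the group; returns 1 on success, 0 if the group is full. */
int demo_group_add(demo_group *g, int id)
{
    int ok = 0;
    pthread_mutex_lock(&demo_global_lock);
    if (g->count < DEMO_MAX_THREADS) {
        g->ids[g->count++] = id;
        ok = 1;
    }
    pthread_mutex_unlock(&demo_global_lock);
    return ok;
}

/* Removes a thread from the group; returns 1 if it was found. */
int demo_group_remove(demo_group *g, int id)
{
    int found = 0;
    pthread_mutex_lock(&demo_global_lock);
    for (size_t i = 0; i < g->count; i++) {
        if (g->ids[i] == id) {
            g->ids[i] = g->ids[--g->count];  /* fill the gap with the last entry */
            found = 1;
            break;
        }
    }
    pthread_mutex_unlock(&demo_global_lock);
    return found;
}

/* Visits every thread in the group while holding the lock;
 * returns the number of threads visited. */
size_t demo_group_for_each(demo_group *g,
                           void (*visit)(int id, void *arg), void *arg)
{
    pthread_mutex_lock(&demo_global_lock);
    size_t n = g->count;
    for (size_t i = 0; i < n; i++)
        visit(g->ids[i], arg);
    pthread_mutex_unlock(&demo_global_lock);
    return n;
}

/* Example visitor: accumulates thread IDs into an int. */
void demo_sum_ids(int id, void *arg)
{
    *(int *)arg += id;
}
```

In this sketch the visitor runs with the lock held, so it must not call demo_group_add() or demo_group_remove() itself; doing so would deadlock on the non-recursive mutex.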
The thread manager synchronizers are functional modules used for thread synchronization. Certain synchronizers have internal data structures associated with them, others can only delegate function calls to the appropriate synchronizers provided by the porting layer. The current implementation of synchronizers within the thread manager is based on two fundamental primitives: the conditional variable and the lock, as shown in Figure 5.
Figure 5: Components of the TM Synchronizer
The elements in the figure have the following meaning:
wait interruption support.
These synchronizers also ensure that a thread enters the safe
suspension mode when it is put into a
wait state using
the conditional variable or when the thread is blocked while
acquiring a lock.
unpark lock support
primitives are used in the
The above hierarchy is optimized for porting code re-use. Other implementations of the Thread Manager component are also possible and can utilize a different set of porting synchronizers.
The thread manager does not expose the internal structures of synchronizers to the external components. All synchronizers are referenced by means of opaque handles similarly to thread control structures.
The current version of the thread manager implements Java* monitors in a specific way to address the common problem of space consumption. The DRL thread manager has a special type of monitor, the thin monitor, which holds a lock optimized for space consumption and single-threaded usage.
Monitor inflation is implemented using a thin-fat lock technique, which works as follows:
Different implementations of thin monitors are free to choose any space compaction or other optimization techniques (or none at all). However, the general recommendation is to use thin monitors when memory needs to be saved and thread contention is not expected to be high. Under high contention, the conventional mutex and conditional variables are recommended because they scale better. Java* monitors in the thread manager are built on top of thin monitors. This enables the thread manager to allocate the lock structure for thin monitors directly in the Java* objects and thus makes the space usage of Java* monitors more efficient.
The thin monitor is a synchronizer primitive that implements the lock compression technique and serves as a base for building Java* monitors. In other words, the thin monitor resides in the native layer of the TM subsystem and has no data on Java* objects. Java* monitors are tightly coupled with Java* objects and reside on the higher Java* level of the TM subsystem.
The central point of the synchronizer is the lock word, which holds the thin lock value or a reference to the fat lock depending on the contention.
In the absence of contention, the contention bit is zero, and the lock word has the following structure:
Figure 6: Lock Word Structure: Contention Bit is 0
In the presence of contention, the contention bit is set to 1, and a thin compressed lock becomes a fat inflated lock with the following layout:
Figure 7: Lock Word Structure: Contention Bit is 1
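To make the two lock word forms more concrete, the sketch below encodes and decodes a 32-bit lock word. The exact field positions and widths are shown in the figures and are not reproduced here; the bit assignments used in this sketch are assumptions chosen for illustration, and all lw_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout of a 32-bit lock word (widths are illustrative only):
 *   bit 31       contention bit: 0 = thin lock, 1 = fat lock
 *   thin form:   bits 16..30 owner thread ID, bits 11..15 recursion count
 *   fat form:    bits 11..30 fat lock ID (index into the lock table)
 *   bit 10       reservation bit
 *   bits 0..9    left for other per-object data
 */
#define LW_CONTENTION_BIT   (UINT32_C(1) << 31)
#define LW_OWNER_SHIFT      16
#define LW_OWNER_MASK       (UINT32_C(0x7FFF) << LW_OWNER_SHIFT)
#define LW_RECURSION_SHIFT  11
#define LW_RECURSION_MASK   (UINT32_C(0x1F) << LW_RECURSION_SHIFT)
#define LW_FAT_ID_SHIFT     11
#define LW_FAT_ID_MASK      (UINT32_C(0xFFFFF) << LW_FAT_ID_SHIFT)
#define LW_LOW_BITS_MASK    UINT32_C(0x7FF)   /* bits 0..10 survive inflation */

static inline int lw_is_fat(uint32_t w)
{
    return (w & LW_CONTENTION_BIT) != 0;
}

static inline uint32_t lw_owner(uint32_t w)
{
    return (w & LW_OWNER_MASK) >> LW_OWNER_SHIFT;
}

static inline uint32_t lw_recursion(uint32_t w)
{
    return (w & LW_RECURSION_MASK) >> LW_RECURSION_SHIFT;
}

static inline uint32_t lw_fat_id(uint32_t w)
{
    return (w & LW_FAT_ID_MASK) >> LW_FAT_ID_SHIFT;
}

/* Builds a thin lock word for the given owner and recursion count. */
static inline uint32_t lw_make_thin(uint32_t owner, uint32_t recursion)
{
    return ((owner << LW_OWNER_SHIFT) & LW_OWNER_MASK)
         | ((recursion << LW_RECURSION_SHIFT) & LW_RECURSION_MASK);
}

/* Inflates a word: sets the contention bit, stores the fat lock ID,
 * and keeps the low bits that do not belong to the lock itself. */
static inline uint32_t lw_inflate(uint32_t w, uint32_t fat_id)
{
    return (w & LW_LOW_BITS_MASK)
         | LW_CONTENTION_BIT
         | ((fat_id << LW_FAT_ID_SHIFT) & LW_FAT_ID_MASK);
}
```

Whatever the real widths are, the point of the design is that a single word is enough for an uncontended lock, while a contended lock only needs room for an index into a table of full-size monitors.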
The thread manager has a global lock table to map between the lock ID and the appropriate fat monitor, as follows:
Figure 8: Fat Monitor
The process of acquiring a monitor with the help of the
hythread_thin_monitor_try_enter() function is shown in
the following diagram:
Figure 9: Acquiring a Thin Lock
First, the thread uses the reservation bit to check whether the required lock is owned by this thread. If yes, the thread increases the recursion count by 1 and the function exits. This makes the fast path of the monitor enter operation for a single-threaded application. The fast path involves only a few assembly instructions and does no expensive atomic compare-and-swap (CAS) operations.
If the lock is not yet reserved, the thread checks whether it is occupied. A free lock is reserved and acquired simultaneously with a single CAS operation. If the lock is busy, the system then checks whether the lock is fat.
The lock table holds a mapping between the fat lock ID and the actual monitor. Fat monitors are extracted from the lock table and acquired. If the lock is not fat and reserved by another thread, then this thread suspends the execution of the lock owner thread, removes the reservation, and resumes the owner thread. After that, the lock acquisition is tried again.
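The decision sequence described above can be sketched with C11 atomics. This is a simplified model, not the code of hythread_thin_monitor_try_enter(): the lock word layout, the tl_* names and the result codes are hypothetical, and the two slow paths (acquiring a fat monitor, and suspending the owner to remove a reservation) are only reported to the caller rather than implemented.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumed simplified lock word (NOT the real DRLVM layout):
 *   bit 31       fat bit: the word refers to a fat monitor
 *   bits 16..30  ID of the thread the lock is reserved for (0 = none)
 *   bits 0..15   recursion count (0 = reserved but not held)
 * Thread IDs must be in the range 1..0x7FFF.
 */
#define TL_FAT_BIT     (UINT32_C(1) << 31)
#define TL_OWNER_SHIFT 16
#define TL_COUNT_MASK  UINT32_C(0xFFFF)

typedef enum {
    TL_ACQUIRED,         /* lock taken on the fast path */
    TL_NEEDS_FAT_PATH,   /* caller must use the fat monitor */
    TL_NEEDS_UNRESERVE   /* reserved by another thread: suspend the owner,
                            remove the reservation, resume it, retry */
} tl_result;

tl_result tl_try_enter(_Atomic uint32_t *word, uint32_t self_id)
{
    uint32_t w = atomic_load_explicit(word, memory_order_relaxed);

    /* 1. Reserved by this thread: bump the recursion count with a plain
     *    store. No CAS is needed because only the reserving thread may
     *    modify a reserved word. */
    if ((w >> TL_OWNER_SHIFT) == self_id) {
        if ((w & TL_COUNT_MASK) == TL_COUNT_MASK)
            return TL_NEEDS_FAT_PATH;   /* assumption: overflow inflates */
        atomic_store_explicit(word, w + 1, memory_order_relaxed);
        return TL_ACQUIRED;
    }

    /* 2. Free and unreserved: reserve and acquire with a single CAS. */
    if (w == 0) {
        uint32_t desired = (self_id << TL_OWNER_SHIFT) | 1;
        if (atomic_compare_exchange_strong(word, &w, desired))
            return TL_ACQUIRED;
        /* CAS failed: w now holds the value another thread installed. */
    }

    /* 3. Busy: either a fat lock or a reservation held by another thread. */
    if (w & TL_FAT_BIT)
        return TL_NEEDS_FAT_PATH;
    return TL_NEEDS_UNRESERVE;
}

/* Releases one level of recursion; the reservation itself is kept. */
void tl_exit(_Atomic uint32_t *word)
{
    uint32_t w = atomic_load_explicit(word, memory_order_relaxed);
    assert((w & TL_COUNT_MASK) != 0);
    atomic_store_explicit(word, w - 1, memory_order_relaxed);
}
```

Note that after the first acquisition the lock stays reserved for its owner even when the recursion count drops to zero, which is what keeps repeated enter and exit operations by a single thread free of atomic instructions.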
This section contains various scenarios of thread manipulation.
The Java* thread creation procedure consists of the following key stages:
creates a new native thread and initializes thread control structures
HyThread as part of
java.lang.Thread.start() method executes the
of the Java* layer in the thread manager via the
jthread_create_with_function() function calls
as a new thread body procedure.
hythread_create_ex() function executes
port_thread_create() function of the porting
layer, which does the actual fork and creates a new thread. The newly created
thread begins to execute
the required registration of the new thread in the thread group, allocates
VM_thread pool, creates
the top-level M2N frame and local handles, initiates thread stack info
and JNI thread environment. If needed, the function performs JVMTI callback
by sending the
java.lang.Thread.run() method, which makes the
user-defined body of the new Java* thread.
java.lang.Thread.run() has finished, the function
the thread in the thread group and de-allocates
If needed, the function performs JVMTI callback by sending the
The following figures illustrate the detailed sequence of thread creation and completion:
Figure 10: Java* Thread Life Cycle
Figure 11: New Java* Thread Life Cycle
The native thread control structures, such as
not de-allocated once the new thread body is finished. The thread manager
creates a weak reference for each
java.lang.Thread object supplying its internal reference
queue. The garbage collector places a reference into that queue when a
java.lang.Thread object is garbage-collected.
Before allocating native resources for new threads, the thread manager
looks for weak references in the queue. If the weak
reference queue is not empty, the thread manager extracts the first
available reference and re-uses its native resources for the newly
created thread.
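The reuse policy can be modeled with a simple FIFO queue of released control blocks. The sketch below is only an analogy: the demo_native type and the demo_* functions are hypothetical, the garbage collector is represented by an explicit call to demo_queue_put(), and no synchronization is shown.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical native control block; real structures hold far more state. */
typedef struct demo_native {
    struct demo_native *next;   /* link inside the reuse queue */
    int reuse_count;            /* how many times this block was recycled */
} demo_native;

/* FIFO queue of blocks whose java.lang.Thread object was collected. */
typedef struct {
    demo_native *head;
    demo_native *tail;
} demo_reuse_queue;

/* Models the garbage collector placing a reference into the queue. */
void demo_queue_put(demo_reuse_queue *q, demo_native *n)
{
    n->next = NULL;
    if (q->tail != NULL)
        q->tail->next = n;
    else
        q->head = n;
    q->tail = n;
}

/* Returns a recycled block if one is queued, otherwise allocates a new one. */
demo_native *demo_native_acquire(demo_reuse_queue *q)
{
    demo_native *n = q->head;
    if (n == NULL)
        return calloc(1, sizeof *n);    /* queue empty: fresh allocation */
    q->head = n->next;                  /* take the first available block */
    if (q->head == NULL)
        q->tail = NULL;
    n->next = NULL;
    n->reuse_count++;
    return n;
}
```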
One of the important features that the native layer adds to the threading of the porting layer is safe suspension. This mechanism ensures that the garbage collector can safely explore a suspended thread during the enumeration of live references. If a thread holds some system-critical locks, such as the locks associated with the native heap memory, safe suspension can keep it running even during the enumeration. By contrast, a system-level or “hard” call to suspend the thread may result in deadlocks if other parts of the VM request the system locks held by the suspended thread.
The safe suspension algorithm defines the protocol of
communication between two threads, for example, thread T1 and thread
T2, where T1 safely suspends thread T2. The T1 thread calls the
hythread_suspend(T2) function to suspend thread T2. The
procedure consists of the following stages:
hythread_suspend(T2) function increments the flag
for the T2 thread indicating a request for suspension. Depending on
the current state of thread T2, the
hythread_suspend(T2) function activates one of the
hythread_suspend(T2) call immediately returns,
see Figure 12.
hythread_suspend() function gets blocked until
thread T2 reaches the beginning of a safe region or a safe
The T2 thread undergoes the following:
in order to denote a safe region of code. T2 thread may also call
method to denote a selected point where safe suspension is possible.
until T1 resumes it by calling
A typical example of the safe suspension scenario takes place when the garbage collector suspends a Java* thread to enumerate live references. Figure 12 illustrates the case when the GC uses the thread manager to suspend the Java* thread while it is running in the safe code region.
Figure 12: Suspension: Safe Region
To understand the safe thread suspension algorithm better, think of each thread as having a lock associated with it. Thread T2 releases the lock when it enters a safe region and acquires the lock when it leaves the safe region. To suspend thread T2, acquire the lock associated with it. Resuming thread T2 is equivalent to releasing the lock associated with it. A straightforward implementation of the safe suspension algorithm reserves a single-thread optimized lock (that is, the thin monitor) for each thread and uses it for suspending and resuming that thread.
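The lock analogy can be written down directly. The sketch below uses a POSIX mutex in place of the thin monitor and is not the hythread implementation: the demo_* names are hypothetical, and demo_try_suspend() is a non-blocking variant added here only so that the behavior can be observed without a second thread.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical per-thread record holding the lock from the analogy. */
typedef struct {
    pthread_mutex_t run_lock;  /* held by the thread while in unsafe code */
} demo_thread;

/* Called by the thread when it starts: it begins in unsafe code. */
void demo_thread_start(demo_thread *t)
{
    pthread_mutex_init(&t->run_lock, NULL);
    pthread_mutex_lock(&t->run_lock);
}

/* Called by the thread itself: entering a safe region releases the lock. */
void demo_safe_region_enter(demo_thread *t)
{
    pthread_mutex_unlock(&t->run_lock);
}

/* Called by the thread itself: leaving a safe region re-acquires the
 * lock, so the call blocks for as long as the thread is suspended. */
void demo_safe_region_leave(demo_thread *t)
{
    pthread_mutex_lock(&t->run_lock);
}

/* Called by another thread: blocks until the target is in a safe region. */
void demo_suspend(demo_thread *t)
{
    pthread_mutex_lock(&t->run_lock);
}

/* Non-blocking variant: returns 1 if the target is now suspended,
 * 0 if it is currently running unsafe code. */
int demo_try_suspend(demo_thread *t)
{
    return pthread_mutex_trylock(&t->run_lock) == 0;
}

/* Called by the suspender: lets the target leave its safe region. */
void demo_resume(demo_thread *t)
{
    pthread_mutex_unlock(&t->run_lock);
}
```

In this model a suspended thread keeps running inside its safe region and stops only when it tries to leave it, which is why a thread is never frozen while it holds locks that the suspender might need.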
Another safe suspension case is when a GC thread hits a Java* thread while it is in an unsafe region of code, as shown in Figure 13.
Figure 13: Safe Point
hythread_safe_point() operation as a wait
operation performed over the monitor associated with the thread. In
this case, the
hythread_resume() operation is equivalent
to notifying that monitor.
The stop-the-world thread suspension happens when the garbage collector needs to enumerate the live object references for all threads of a given thread group. Figure 14 illustrates the case when only a GC thread and an indefinite number of Java* threads are running, so that the GC needs to suspend all Java* threads.
Figure 14: Suspending a Group of Threads
First, the garbage collector calls the thread manager interface
hythread_suspend_all() to suspend every thread
running within the given group (in this scenario, all Java* threads). The thread manager then returns the iterator
for traversing the list of suspended threads. GC uses this iterator to
analyze each Java* thread with respect to live
references and then does a garbage collection. After it is complete,
GC instructs the thread manager to resume all suspended threads.
Locking with the thread manager can be done by means of a mutex or a thin monitor. The mutex is preferable in case of high contention, while thin monitors are better optimized for space. This section describes a scenario when the VM core attempts to lock a resource from multiple threads, T1 and T2. The major stages of the process of locking and unlocking are shown in Figure 15.
Figure 15: Locking with a Mutex
Initially, the mutex is not occupied, that is, the label lock is set
to zero. Thread T1 calls the
which instructs the thread manager to mark the mutex as locked by T1.
T2 can also call the
hymutex_lock() function later, and
if it happens to call on a lock already occupied, then T2 is placed
into the internal waiting queue associated with the mutex and gets
blocked until T1 unlocks the mutex. The T1 thread calls
hymutex_unlock() to release the mutex, which enables the
mutex to extract T2 from the queue, transfer the lock ownership to
this thread, and to notify T2 that it can wake up.
Locking Java* monitors implies interaction between the thread manager and the VM core since the thread manager requires the memory address within the Java* object where it keeps the lock data. The process of locking Java* monitors is shown in Figure 16 below.
The code generated by the JIT compiler calls the
helper function. The helper function provides a chunk of code (stub) that
can be inlined by the JIT compiler directly into the generated assembly
function, which works with the Java* object
as with JNI code.
Figure 16: Locking Java* Monitors
This section lists the resources used in this document and other related documents.
J2SE 1.5.0 specification, http://java.sun.com/j2se/1.5.0/docs/api/
JVM Tool Interface Specification, http://java.sun.com/j2se/1.5.0/docs/guide/jvmti/jvmti.html
Java* Native Interface Specification, http://java.sun.com/j2se/1.5.0/docs/guide/jni/spec/jniTOC.html
Apache Portable Runtime project, http://apr.apache.org/
POSIX standard in threading, http://www.opengroup.org/onlinepubs/009695399/idx/threads.html
David F. Bacon, Ravi Konuru, Chet Murthy, Mauricio Serrano, Thin Locks: Featherweight Synchronization for Java, http://portal.acm.org/citation.cfm?id=277734
Kiyokuni Kawachiya, Akira Koseki, Tamiya Onodera, Lock Reservation: Java Locks Can Mostly Do Without Atomic Operations, http://portal.acm.org/citation.cfm?id=582433
HyThread documentation, http://harmony.apache.org/externals/vm_doc/html/group__Thread.html