Garbage collection with JNI

This article describes a minimal system for making native resources garbage collectible from Java. It is based on the implementation in zcl, but uses a more indirect mechanism for binding release functions, which requires no special-case code and allows for easy reuse.

It provides standard Java memory semantics for natively created objects, while also allowing explicit, fine-grained release for the common case where the JVM does not keep up with native resource allocation.

Garbage collection

TODO.

Why finalize() is both dangerous and unnecessary. Something about reference queues and weak references. Maybe link the article I started with, if I can find it again.

CObject

All native wrapping objects must inherit from CObject. It has a single final field, which stores a handle object. Because the native pointer stays valid for as long as the object is alive, the pointer is used as the hash key.

public abstract class CObject {

    final CObjectHandle h;

    protected CObject(long p) {
        this.h = new CObjectHandle(this, getClass(), p);
    }

    @Override
    public boolean equals(Object obj) {
        return (obj instanceof CObject) && h.equals(((CObject)obj).h);
    }

    @Override
    public int hashCode() {
        return h.hashCode();
    }

    [...]
}

There are only two interesting instance methods: getP() retrieves the pointer for use by the JNI code (which allows native methods to take objects directly), and release() is a direct release mechanism.

public abstract class CObject {

    public long getP() {
        long x = h.p;
        if (x == 0) {
            throw new NullPointerException();
        }
        return x;
    }

    public void release() {
        h.release();
    }

    @Override
    public String toString() {
        return String.format("[%s: 0x%x]", h.jtype.getSimpleName(), h.p);
    }
}
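
As a rough illustration of the contract described above, the following self-contained sketch uses a stand-in class with no JNI behind it. FakeNative and its freed counter are hypothetical names used only for this example; the "pointer" is just a number. It shows the pointer being zeroed on release, a second release being harmless, and later access failing fast.

```java
// Stand-in for a CObject subclass; no JNI involved, the "pointer" is just a number.
class FakeNative {
    private long p;            // 0 means "already released"
    static int freed = 0;      // counts simulated native frees

    FakeNative(long p) {
        this.p = p;
    }

    // Same contract as getP() in the article: fail fast once released.
    synchronized long getP() {
        long x = p;
        if (x == 0) {
            throw new NullPointerException();
        }
        return x;
    }

    // Idempotent: only the first call performs the (simulated) native free.
    synchronized void release() {
        if (p != 0) {
            freed++;           // a real class would call its native release(long) here
            p = 0;
        }
    }
}
```

In application code the explicit path would typically sit in a try/finally block, with the garbage collector acting only as the fallback for objects that were never released by hand.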

The interesting part of the garbage collection mechanism is the weak reference, in this case the CObjectHandle. It must not hold any hard reference to its owner object; it refers to the owner only through the weak reference mechanism. What it does keep is the underlying native pointer, but as this is just a number it doesn't prevent garbage collection. It also holds a hard reference to the object's class so that it can resolve the (static) release method. This loose binding allows the objects to be reclaimed safely without requiring any reference to the underlying object.

Also note that the reference objects themselves must be kept strongly reachable in some structure, otherwise they can be collected before they are ever enqueued. Using a Map keyed by the pointer also guarantees that there is one handle per native pointer across the application, which allows, for example, wrapping native query functions that return existing objects.

In the original zcl design the release function is implemented by a switch statement on an object type. In this design it simply calls a declared static method on the object class. This allows CObject to be reused unchanged, at the cost of having to write a separate (JNI) release function for each object type.

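import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.lang.reflect.InvocationTargetException;
import java.util.HashMap;
import java.util.Map;
import java.util.logging.Level;
import java.util.logging.Logger;

public abstract class CObject {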

    [...]

    static private final ReferenceQueue<CObject> references = new ReferenceQueue<>();

    public static class CObjectHandle extends WeakReference<CObject> {

        final Class jtype;
        long p;
        static private final Map<Long, CObjectHandle> map = new HashMap<>();

        public CObjectHandle(CObject referent, Class jtype, long p) {
            super(referent, references);

            this.jtype = jtype;
            this.p = p;
            synchronized (map) {
                map.put(p, this);
            }
        }

        static CObjectHandle get(long p) {
            synchronized (map) {
                return map.get(p);
            }
        }

        void release() {
            synchronized (map) {
                map.remove(p);
            }
            synchronized (this) {
                try {
                    if (p != 0) {
                        jtype.getDeclaredMethod("release", Long.TYPE).invoke(null, p);
                    }
                } catch (NoSuchMethodException | SecurityException | IllegalAccessException | IllegalArgumentException | InvocationTargetException ex) {
                    Logger.getLogger(CObject.class.getName()).log(Level.SEVERE, null, ex);
                } finally {
                    p = 0;
                }
            }
        }

        @Override
        public boolean equals(Object obj) {
            return (obj instanceof CObjectHandle) && ((CObjectHandle) obj).p == p;
        }

        @Override
        public int hashCode() {
            int hash = 7;
            hash = 11 * hash + (int) (this.p ^ (this.p >>> 32));

            return hash;
        }

    }

    [...]
}
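
The reflective lookup used by the handle's release() can be tried in isolation. The following is a minimal sketch, not the article's code: Texture and ReleaseDispatch are hypothetical names, and the static release(long) is plain Java here where the real design would declare it native. The setAccessible() call is an addition that the original does not make; it is only needed if the release method is not accessible to the caller.

```java
import java.lang.reflect.Method;

// Hypothetical wrapper type: only the static release(long) matters here.
class Texture {
    static long lastReleased = 0;

    // In the real design this would be a native method implemented in C.
    static void release(long p) {
        lastReleased = p;
    }
}

class ReleaseDispatch {
    // Look up and invoke the static release(long) declared on the given class.
    static void releaseVia(Class<?> jtype, long p) {
        try {
            Method m = jtype.getDeclaredMethod("release", long.class);
            m.setAccessible(true);  // assumption: permits a non-public release method
            m.invoke(null, p);      // null receiver because the method is static
        } catch (ReflectiveOperationException ex) {
            throw new IllegalStateException(ex);
        }
    }
}
```

Because the method is found by name and signature on the stored class, the dispatching code needs no knowledge of the concrete types, which is what removes the switch statement.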

The ReferenceQueue is managed by the JVM and automatically populated as objects become unreachable. A daemon thread blocks on the queue and releases any handles that appear.

public abstract class CObject {

    [...]

    static private final ReferenceQueue<CObject> references = new ReferenceQueue<>();

    static {
        Thread cleanup = new Thread(CObject::cleaner, "CObject cleaner");
        cleanup.setDaemon(true);
        cleanup.start();
    }

    static private void cleaner() {
        System.err.println("C cleaner started");
        try {
            while (true) {
                CObjectHandle h = (CObjectHandle) references.remove();
                try {
                    h.release();
                } catch (Throwable ex) {
                    // Swallowed so that one failing release cannot terminate the cleaner thread.
                }
            }
        } catch (InterruptedException ex) {
        }
    }

    [...]
}
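
The queue mechanics can be exercised without any native code. The sketch below is an illustration under stated assumptions: QueueDemo and Handle are hypothetical names, and the handle is enqueued by hand with enqueue() so that the result does not depend on when the collector runs. In the real system the JVM enqueues the handle once its owner becomes unreachable.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class QueueDemo {
    // Weak reference that remembers a (fake) native pointer for later cleanup.
    static class Handle extends WeakReference<Object> {
        final long p;

        Handle(Object referent, ReferenceQueue<Object> q, long p) {
            super(referent, q);
            this.p = p;
        }
    }

    // Returns the pointer value the cleaner thread saw, or -1 on timeout.
    static long run() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        CountDownLatch done = new CountDownLatch(1);
        long[] seen = { -1 };

        Thread cleaner = new Thread(() -> {
            try {
                Handle h = (Handle) queue.remove(); // blocks until a reference is enqueued
                seen[0] = h.p;                      // a real cleaner would free the resource here
                done.countDown();
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        }, "demo cleaner");
        cleaner.setDaemon(true);
        cleaner.start();

        Object owner = new Object();
        Handle h = new Handle(owner, queue, 0x1234L);
        h.enqueue(); // stands in for the collector noticing that 'owner' is unreachable

        try {
            return done.await(5, TimeUnit.SECONDS) ? seen[0] : -1;
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            return -1;
        }
    }
}
```

Note that the cleaner only ever touches the handle and its stored number, never the owner object, which by that point is no longer available.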

And finally there are a couple of static methods on CObject.

The toObject() method is called from the JNI code to instantiate a wrapped native object. If the object already exists it is simply returned, otherwise a new object is created. The CObject constructor automatically adds the object to the hash table, which allows individual classes to provide their own constructors (e.g. with extra arguments) and still link correctly into the GC system.

release() is just a convenience method for releasing a bunch of objects at once. It ignores null objects.

    static CObject toObject(Class jtype, long p) {
        CObjectHandle h = CObjectHandle.get(p);
        if (h != null) {
            return h.get();
        } else {
            try {
                Class[] params = {Long.TYPE};

                return (CObject) jtype.getConstructor(params).newInstance(p);
            } catch (NoSuchMethodException | SecurityException | InstantiationException | IllegalAccessException | IllegalArgumentException | InvocationTargetException ex) {
                Logger.getLogger(CObject.class.getName()).log(Level.SEVERE, null, ex);
                throw new RuntimeException(ex);
            }
        }
    }

    public static void release(CObject... list) {
        for (CObject o: list) {
            if (o != null) {
                o.release();
            }
        }
    }
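
The lookup-or-create behaviour of toObject() can be sketched on its own. This is a simplified stand-in, not the article's implementation: WrapperCache and Thing are hypothetical names, the map entry is added in the lookup method where the article adds it in the CObject constructor, and a cleared reference simply leads to a new wrapper, which is a simplification that ignores the pending release of the old handle. It also shows that getConstructor() needs a public constructor taking a long.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

class WrapperCache {
    // Hypothetical wrapper with the public (long) constructor the lookup expects.
    public static class Thing {
        final long p;

        public Thing(long p) {
            this.p = p;
        }
    }

    // One weakly held wrapper per pointer value.
    private static final Map<Long, WeakReference<Object>> map = new HashMap<>();

    static synchronized Object toObject(Class<?> jtype, long p) {
        WeakReference<Object> ref = map.get(p);
        Object o = (ref != null) ? ref.get() : null;

        if (o == null) {
            try {
                // getConstructor() only finds public constructors.
                o = jtype.getConstructor(long.class).newInstance(p);
            } catch (ReflectiveOperationException ex) {
                throw new IllegalStateException(ex);
            }
            map.put(p, new WeakReference<>(o));
        }
        return o;
    }
}
```

Asking twice for the same pointer yields the same Java object, so identity comparisons on wrappers behave as they would on the native pointers.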

JNI code

The JNI code required beyond the application code itself is trivial. Only two functions are required: toCObject() and fromCObject().

The former is for wrapping a new object or an existing object. It is also possible to call custom constructors directly when creating new objects.

fromCObject() simply calls the getP() method to get the current pointer value. If the object has been released explicitly then this throws a NullPointerException.


...

static jclass CObject_classid;
static jmethodID CObject_toObject_lj;
static jmethodID CObject_getP;

...

jint JNI_OnLoad(JavaVM *vmi, void *reserved) {
	jclass jc;
	JNIEnv *env;

	if ((*vmi)->GetEnv(vmi, (void **)&env, JNI_VERSION_1_4) < 0)
		return 0;

	jc = (*env)->FindClass(env, "au/notzed/c/CObject");
	if (jc == NULL)
		return 0;
	CObject_classid = jc = (*env)->NewGlobalRef(env, jc);
	CObject_toObject_lj = (*env)->GetStaticMethodID(env, jc, "toObject", "(Ljava/lang/Class;J)Lau/notzed/c/CObject;");
	CObject_getP = (*env)->GetMethodID(env, jc, "getP", "()J");

	...

	return JNI_VERSION_1_4;
}

/* ********************************************************************** */

static void throwException(JNIEnv *env, const char *type, const char *msg) {
	jclass jc = (*env)->FindClass(env, type);

	if (jc)
		(*env)->ThrowNew(env, jc, msg);
}

static jobject toCObject(JNIEnv *env, jclass jc, void *p) {
	jvalue jargs[] = {
		{ .l = jc },
		{ .j = (uintptr_t)p }
	};
	
	return (*env)->CallStaticObjectMethodA(env, CObject_classid, CObject_toObject_lj, jargs);
}

static void *fromCObject(JNIEnv *env, jobject jo) {
	return (void *)(uintptr_t)(*env)->CallLongMethodA(env, jo, CObject_getP, NULL);
}

The following is an example of how the application methods are implemented. It also shows how each object class must implement a static release(long) method.

The release function doesn't need to, and must not, call fromCObject(): by this point the Java object has been reclaimed and the function is simply cleaning up the native resources. It just has to free them as it would for any plain C pointer.


#include <stdint.h>
#include <stdlib.h>

#include "app_A.h"

...

static jclass APPA_classid;
...

jint JNI_OnLoad(JavaVM *vmi, void *reserved) {
	...

	jc = (*env)->FindClass(env, "app/A");
	if (jc == NULL)
		return 0;
	APPA_classid = jc = (*env)->NewGlobalRef(env, jc);

	...
}

/* ********************************************************************** */

struct type_a {
	// whatever
};

JNIEXPORT jobject JNICALL
Java_app_A_create(JNIEnv *env, jclass jc, jint param) {
	struct type_a *a = malloc(sizeof(*a));

	if (!a) {
		throwException(env, "java/lang/OutOfMemoryError", "Unable to allocate A");
		return NULL;
	}

	// init a
	
	// wrap the new native pointer in its Java object
	return toCObject(env, APPA_classid, a);
}

JNIEXPORT void JNICALL
Java_app_A_release(JNIEnv *env, jclass jc, jlong p) {
	struct type_a *a = (void *)(uintptr_t)p;

	// free a.*
	
	free(a);
}

JNIEXPORT jobject JNICALL
Java_app_A_methodA(JNIEnv *env, jobject jo, jint param0) {
	struct type_a *a = fromCObject(env, jo);

	if (!a)
		return NULL;

	...
		
	return result;	
}

References

TODO.

Contact

notzed on various mail servers, primarily gmail.com.

All source code is covered by the GNU General Public License Version 3 (or later).