TechAscent - Generalized Java Resource Management

2018-12-31

Generalized Java Resource Management

The history of computer science has been marked by several great (and usually unwinnable) dualistic battles. Consider garbage collection versus pooling or manual memory management. It is an obvious political third rail, and the best choice depends on the nature of the problem, the quality of the garbage collector, and the skill of the team implementing the software. Memory issues arise in essentially any nontrivial system, so most teams end up spending time on them in some form.

Flowing smoothly between garbage-collected code and native (non-garbage-collected) code is an ideal that the industry still has not realized. For Clojure and Java, we have taken a few steps towards making integration of native, or 'off-heap', resources smoother and more care-free. We bundle this work into a very small library, tech.resource.

One caveat before we start: 'off-heap' is a bad moniker for these types of resources. Call them native, call them non-gc, call them 'resources outside of the purview of the garbage collector', but 'off-heap' might be confusing to an engineer coming from C++ or Rust. These languages have their own heaps, and C++ heap items are often exactly what you are tracking with an 'off-heap' system. Since we are talking about building better interfaces between the JVM and other languages, starting with terms that are obtuse and inaccurate from the perspective of those other languages seems to us to add unnecessary confusion to the topic.

Non-GC Resources for The Brave and True

We start with the humble goal of supporting at least two resource management paradigms, and aim to allow a limited form of intermixing. In our native-pointers post we talk about one form of resource management: stack-based deallocation of resources. In this form, you declare a scope, and resources allocated within that scope are released when it exits. Alternatively, one can harness the JVM's GC to help with the task of clearing resources. Trade-offs exist.

The most minimal definition of a resource is a little counterintuitive. We define it as a function that takes no arguments, presumably has side effects, and that we expect to be called automatically. An important point of flexibility of the resource system is that one can use these functions to do anything: returning memory to the language runtime, clearing database connections, etc.

Stack Based Resource Management

The library promises that no matter what happens, the function declared as the resource will get called. This can close files, database connections, release native pointers, etc. This enables predictability, which is valuable. This simplicity makes the library approachable, but limits its reach.

Notably, you may find this type of resource management referred to as Resource Acquisition Is Initialization, or RAII. Strictly speaking, RAII is a broader term that encompasses stack-based resource management.

Here is a simple example of using the library:

user> (require '[tech.resource :as resource])
nil
user> (resource/stack-resource-context
       (resource/track #(println "released")))
released
#<Fn@213f191f user/eval8596[fn/fn]>
user> (resource/stack-resource-context
       (resource/track #(println "released"))
       (throw (ex-info "Bad things happen" {})))
released
ExceptionInfo Bad things happen  clojure.core/ex-info (core.clj:4739)

In general, think: "I want this piece of code to run when this stack frame unwinds regardless of the reason why it unwinds." In this sense, the resource system above is a generalization of concepts like with-open. The stack-based system makes no claims about the reachability of any objects with which the code interfaces.
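To make the "run this when the stack frame unwinds" idea concrete, here is a minimal sketch of how such a context could be built from a dynamic var and try/finally. It is not tech.resource's implementation; the names `*release-fns*`, `track-fn!`, and `with-release-context` are hypothetical and exist only for this illustration.

```clojure
;; Hypothetical names throughout; a sketch of the idea, not tech.resource's source.

;; Holds an atom containing the release functions of the innermost context,
;; or nil when no context is active.
(def ^:dynamic *release-fns* nil)

(defn track-fn!
  "Registers a zero-argument release function with the innermost context."
  [release-fn]
  (when (nil? *release-fns*)
    (throw (ex-info "track-fn! called outside of with-release-context" {})))
  ;; conj on a list prepends, so the most recently tracked function runs first.
  (swap! *release-fns* conj release-fn)
  release-fn)

(defmacro with-release-context
  "Runs body, then calls every tracked function in reverse order of
  registration, whether body returned normally or threw."
  [& body]
  `(binding [*release-fns* (atom ())]
     (try
       ~@body
       (finally
         ;; Simplification: if one release function throws, the remaining
         ;; ones are skipped. A real library would likely guard each call.
         (doseq [f# @*release-fns*]
           (f#))))))
```

Releasing in reverse order of registration mirrors how nested with-open bindings close, so a resource that depends on an earlier one is released first.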

GC Based Resource Management

The JVM has a great set of garbage collectors, and has had amazing garbage collectors since about version 5.0 (when you stopped having to use object pools). They are fast and especially good at collecting objects which are short-lived, allowing you to do functional programming quickly and efficiently. We have a lot of respect for these pieces of technology and enjoy using them.

It is possible in the JVM to bind a function to a JVM object such that when the gc removes the object from the live set you get a chance to execute the function. This is similar to the dispose concept of the .NET family of languages, but on the JVM it requires some scaffolding to use.

user> (let [my-item (-> (Object.)
                        (resource/track
                         #(println "gc object released")
                         :gc))
            _ (println my-item)]
        nil)
#object[java.lang.Object 0x42da1624 java.lang.Object@42da1624]
nil
user> (System/gc)
gc object released
nil

In the library's implementation, when the garbage collector determines that the object is no longer reachable by the program, it places the weak reference we allocated into a queue. A thread polls this queue (with a timeout, so it isn't spinning) and calls the dispose function for each reference it finds there.
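The queue-and-thread mechanism can be sketched with the JDK's own `java.lang.ref.WeakReference` and `java.lang.ref.ReferenceQueue`. This is an assumption-laden illustration of the general JVM technique, not the library's code; `ref-queue`, `pending`, `track-gc!`, `drain-once!`, and `start-cleaner!` are hypothetical names.

```clojure
(import '[java.lang.ref ReferenceQueue WeakReference])

;; Hypothetical names; a sketch of the general technique only.

;; The GC places cleared references on this queue.
(def ^ReferenceQueue ref-queue (ReferenceQueue.))

;; Maps each WeakReference to its dispose function. This keeps the reference
;; object itself alive without keeping the tracked item alive.
(def pending (atom {}))

(defn track-gc!
  "Arranges for dispose-fn to run after item becomes unreachable.
  dispose-fn must not close over item, or item can never be collected."
  [item dispose-fn]
  (let [weak-ref (WeakReference. item ref-queue)]
    (swap! pending assoc weak-ref dispose-fn)
    item))

(defn drain-once!
  "Waits up to timeout-ms (which must be positive; zero would block
  indefinitely) for one enqueued reference and runs its dispose function.
  Returns true if a reference was handled."
  [timeout-ms]
  (if-let [weak-ref (.remove ref-queue (long timeout-ms))]
    (let [dispose-fn (get @pending weak-ref)]
      (swap! pending dissoc weak-ref)
      (when dispose-fn (dispose-fn))
      true)
    false))

(defn start-cleaner!
  "Starts a daemon thread that polls the queue with a timeout."
  []
  (doto (Thread. ^Runnable
                 (fn []
                   (while true
                     (try
                       (drain-once! 100)
                       (catch Exception e
                         ;; Keep the thread alive if a dispose function throws.
                         (println "dispose failed:" (.getMessage e)))))))
    (.setDaemon true)
    (.start)))
```

Note that the dispose function receives nothing: by the time it runs, the tracked object is gone, so anything it needs (a native address, a file handle) has to be captured separately when the object is tracked.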

Stack & GC

You can use these systems together to guarantee that some resources are released no later than the end of a stack frame, while allowing them to be released sooner. Calling resource/track with a final argument that is a vector or set containing both :stack and :gc creates both a gc hook and a stack-based hook. Either hook may fire, but the dispose function will be called only once:

user> (resource/stack-resource-context
       (resource/track (Object.) #(println "gc and stack obj released") [:gc :stack])
       (throw (ex-info "Bad things happen" {})))
gc and stack obj released
ExceptionInfo Bad things happen  clojure.core/ex-info (core.clj:4739)
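How the library enforces "only once" is not shown in this post. One common way to get that guarantee, sketched here under that caveat, is to wrap the dispose function behind a compare-and-set flag and hand the same wrapper to both hooks. The name `once` is hypothetical.

```clojure
;; Hypothetical helper; not taken from tech.resource.
(defn once
  "Wraps dispose-fn so that it runs at most once, no matter how many
  callers invoke the wrapper or from which threads."
  [dispose-fn]
  (let [called? (atom false)]
    (fn []
      ;; Only the caller that flips the flag from false to true runs
      ;; dispose-fn; every other call is a no-op.
      (when (compare-and-set! called? false true)
        (dispose-fn)))))
```

With this shape, it does not matter whether the stack hook or the gc hook fires first, or whether they race on different threads.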

Uses In The Wild

tvm-clj uses both forms of resource management throughout. This means that resources (AST nodes, tensors, etc.) can be cleaned up by the GC earlier than the resource context would have otherwise, which amortizes your deallocation cost for some of these objects. You still get the benefit of knowing that a given tensor will definitely be deallocated by a certain time, which helps when dealing with potentially many gigabytes of data for a particular dataset.

When doing the integration, we found a solid gotcha that we want to share because we think it is general to the area. The drawback to any gc-based system is that when it fails to work as expected it can be difficult to ascertain the exact reason why, as garbage collector internals are generally opaque. Consequently, when you encounter a situation where you are running out of resources it can be very hard to figure out exactly why. Or, conversely, if you get into a situation where your GC resources are getting cleaned up too soon you encounter errors which are non-deterministic.

What follows is the case we encountered with tensors and matrices in TVM.

TVM is part of the DMLC ecosystem, and that entire ecosystem has its own tensor datatype, called the DLTensor. We use JNA to wrap this data structure.

TVM provides an allocation function to create them, TVMArrayAlloc. However, it does not provide a way to create a dl-tensor from an existing pointer (such as one coming from OpenCV). In this case, we want a function that takes a pointer (an address, basically) along with a description, and returns a dl-tensor.

If you create a dl-tensor and then immediately create a sub-matrix from it, for example, you need to use a function like this. In some code, this may mean you no longer hold a direct reference to the root dl-tensor, and the actual relationship between the two objects is hidden from the garbage collector because pointers are fundamentally mere addresses. So unless the derived tensor somehow references the root tensor, the GC may very well clean up the root tensor while the derived tensor is still in use.

So, we added an optional gc-root parameter to the pointer->tvm-ary function. This root is referenced in the dispose function for the derived tensor, and thus the gc sees the correct relationship between the two entities. Things like this bring us to our final points.
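The trick works because a Clojure closure holds a strong reference to every local it mentions. A minimal sketch of the pattern follows; `dispose-fn-with-root` and `release-derived!` are hypothetical names for illustration, and this is not the actual pointer->tvm-ary code.

```clojure
;; Hypothetical helper; not part of tvm-clj or tech.resource.
(defn dispose-fn-with-root
  "Returns a dispose function for a derived object. The returned function
  closes over gc-root, so for as long as the resource system holds the
  dispose function, gc-root stays reachable and cannot be collected."
  [release-derived! gc-root]
  (fn []
    (release-derived!)
    ;; Mentioning gc-root here is the whole point: the compiler captures it
    ;; as a field of this closure, which gives the GC a visible edge from
    ;; the derived object's bookkeeping to the root.
    gc-root
    nil))
```

The dependency between the two tensors, which the GC cannot infer from raw addresses, is thereby restated as an ordinary object reference that it can follow.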

In Closing

Working with native resources means having several methods to clean them up. We show two of these methods that bridge the gap between deterministic and predictable (stack based) and non-deterministic or automatic (gc-based). We recommend starting with the first method as the range of potential issues is smaller but there are problems that cannot be well solved by purely deterministic resource management. We further provide ways to use the two systems together in the hope that you are enabled to use whichever system suits your needs at the time.

The stack and gc systems are not specific to native resources. Really, they come down to functions called on specific triggers. You could use these types of systems for any side effect you wanted: closing database connections, cache management, summary generation, etc. This post framed them in the context of native resources, but they have many potential applications.

The JVM is a great foundation to build systems on. We take great pleasure in building simple ways to allow it to work well with the larger ecosystem of non-JVM languages and libraries.

Addendum

One language that doesn't get the credit it deserves is C++/CLI by Microsoft. This was a redesign of Managed C++, a language that neither gets credit nor deserves it. Herb Sutter took care in the design of C++/CLI to make it work well with existing C++, designing the extensions required to work with a garbage-collected runtime from a deep knowledge of the language design of C++ itself.

C++/CLI still stands as arguably the most powerful language in the world as it has all the power of native C++ and all the power of any of the garbage collected languages. Ideally, one could intermix garbage collected and native types freely, as necessary, by writing either native or gc-based code and have the language take care of bridging the gaps. We admire and respect Herb, and his influence on the relatively recent C++ renaissance.

Building bridges with TechAscent.