TechAscent - Functions Across Languages

2019-07-28

Functions Across Languages

Functions are one of the basic units of abstraction in computer science. They are a mechanism for packaging up an algorithm into a reusable entity. The word function in this sense comes partly from the mathematical definition:

In mathematics, a function is a relation between sets that associates to every element of a first set exactly one element of the second set.

This definition is far more strict than what we consider to be a function in computer science. In programming, a function can be almost anything.

All of the partners at TechAscent have experience with many different languages, and we enjoy working across several of them, although our preferred day-to-day language remains Clojure. Recently a partner in TechAscent, Chris Nuernberger, released a library that allows Clojure and Python to intermix. Both languages are very powerful in their respective domains, and bridging the two communities brings distinctly different yet complementary skill sets to bear on problems.

Here we present a fundamental technique used to bridge functions across languages; that is, calling a function defined in Clojure from Python. This is a very technical post, meant to help readers understand how inter-language bindings are implemented at a base level.

Towards The Problem

Today we want to get into the details of what actually happens at the machine level when you define a function in Clojure and call it from Python. Here is an example where we define a simple Clojure summation function and pass it to numpy's apply_along_axis function:

user> (require '[libpython-clj.python :as py])
nil
user> (py/initialize!)
Jul 21, 2019 10:29:41 AM clojure.tools.logging$eval7281$fn__7284 invoke
INFO: executing python initialize!
Jul 21, 2019 10:29:41 AM clojure.tools.logging$eval7281$fn__7284 invoke
INFO: Library python3.6m found at [:system "python3.6m"]
Jul 21, 2019 10:29:41 AM clojure.tools.logging$eval7281$fn__7284 invoke
INFO: Reference thread starting
:ok
user> (def clj-fn #(apply + %))
#'user/clj-fn
user> (clj-fn [1 2 3 4])
10
user> (def np (py/import-module "numpy"))
#'user/np
user> (py/call-attr np "array" [[1 2 3][4 5 6][7 8 9]])
[[1 2 3]
 [4 5 6]
 [7 8 9]]
user> (def test-ary *1)
#'user/test-ary
user> (py/get-attr test-ary "shape")
(3, 3)
user> (py/get-attr test-ary "dtype")
int64
user> (py/call-attr np "apply_along_axis" clj-fn 0 test-ary)
[12 15 18]
user> (py/call-attr np "apply_along_axis" clj-fn 1 test-ary)
[ 6 15 24]
user> (type *1)
:pyobject
user> (py/python-type *2)
:ndarray
user> (py/call-attr np "apply_along_axis" py-fn 0 test-ary)
[12 15 18]

Let's walk through this demo and put together, piece by piece, what actually happens under the covers.
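It may help to pin down what the two apply_along_axis calls in the demo compute. The following is a minimal pure-Python sketch, not numpy's implementation; apply_along_axis_2d is a hypothetical helper that only handles rectangular lists of lists. Axis 0 hands each column to the function, and axis 1 hands it each row:

```python
def apply_along_axis_2d(fn, axis, rows):
    """Apply fn to every 1-D slice of a rectangular list of lists.

    Hypothetical helper for illustration only; numpy's real
    apply_along_axis works on arrays of any dimension.
    """
    if axis == 0:
        # Axis 0: each slice runs down a column.
        return [fn(list(column)) for column in zip(*rows)]
    if axis == 1:
        # Axis 1: each slice runs along a row.
        return [fn(list(row)) for row in rows]
    raise ValueError("axis must be 0 or 1")


test_ary = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
column_sums = apply_along_axis_2d(sum, 0, test_ary)  # [12, 15, 18]
row_sums = apply_along_axis_2d(sum, 1, test_ary)     # [6, 15, 24]
```

In the real demo the function handed to numpy is the Clojure clj-fn, which is what makes the rest of this post necessary.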

Defining Functions On The JVM

We start with the function definition:

user> (def clj-fn #(apply + %))
#'user/clj-fn
user> (clj-fn [1 2 3])
6

What exactly is clj-fn? Well, we are in Java so you can bet it is an object. Let's inspect it for just a second:

user> (require '[clojure.reflect :as r])
nil
user> (r/reflect clj-fn)
{:bases #{clojure.lang.RestFn},
 :flags #{:public :final},
 :members
 #{{:name getRequiredArity,
    :return-type int,
    :declaring-class user$clj_fn,
    :parameter-types [],
    :exception-types [],
    :flags #{:public}}
   {:name doInvoke,
    :return-type java.lang.Object,
    :declaring-class user$clj_fn,
    :parameter-types [java.lang.Object],
    :exception-types [],
    :flags #{:public}}
   {:name invokeStatic,
    :return-type java.lang.Object,
    :declaring-class user$clj_fn,
    :parameter-types [clojure.lang.ISeq],
    :exception-types [],
    :flags #{:public :static}}
   {:name const__0,
    :type clojure.lang.Var,
    :declaring-class user$clj_fn,
    :flags #{:public :static :final}}
   {:name user$clj_fn,
    :declaring-class user$clj_fn,
    :parameter-types [],
    :exception-types [],
    :flags #{:public}}}}

clj-fn is an instance of a class that derives from clojure.lang.RestFn. The important interface that the base class implements is clojure.lang.IFn. When you call clj-fn from the REPL, the system calls the overload of the invoke method with the appropriate arity (number of arguments):

user> (.invoke clj-fn [1 2 3])
6

So, our one-liner above created a class definition and an instance of that class whose invoke method calls the code we care about. It also created an invokeStatic method, a fast path the compiler can use when it can see the actual class definition. Anything implementing clojure.lang.IFn is callable from Clojure with no extra sugar. The datatype library takes full advantage of this fact to offer very good REPL integration, as does libpython-clj.
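Python has a loosely comparable idea: any object whose class defines __call__ can be called like a function. The sketch below is only an analogy to IFn, and SumAll is a made-up name:

```python
class SumAll:
    """A callable object, roughly analogous to a class implementing IFn."""

    def __call__(self, values):
        return sum(values)


sum_all = SumAll()
direct = sum_all([1, 2, 3])             # call syntax, like (clj-fn [1 2 3])
explicit = sum_all.__call__([1, 2, 3])  # explicit method, like (.invoke clj-fn [1 2 3])
```

Both calls return 6; the call syntax is just sugar over the method, much as the Clojure call form is sugar over invoke.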

Into The Void

Now we create a python function from the Clojure function by invoking the ->python operator from libpython-clj. The return value is a JNA pointer which is a very light wrapper around a long integer:

user> (def py-fn (py/->python clj-fn))
#'user/py-fn
user> py-fn
#object[com.sun.jna.Pointer 0x62b602ec "native@0x7fab691e8318"]
user> (com.sun.jna.Pointer/nativeValue py-fn)
140569833018064
user> (py/python-type py-fn)
:builtin-function-or-method
user> (py/att-type-map py-fn)
{"__call__" :method-wrapper,
 "__class__" :type,
 "__delattr__" :method-wrapper,
 "__dir__" :builtin-function-or-method,
 "__doc__" :str,
 "__eq__" :method-wrapper,
 "__format__" :builtin-function-or-method,
 "__ge__" :method-wrapper,
 "__getattribute__" :method-wrapper,
 "__gt__" :method-wrapper,
 "__hash__" :method-wrapper,
 "__init__" :method-wrapper,
 "__init_subclass__" :builtin-function-or-method,
 "__le__" :method-wrapper,
 "__lt__" :method-wrapper,
 "__module__" :none-type,
 "__name__" :str,
 "__ne__" :method-wrapper,
 "__new__" :builtin-function-or-method,
 "__qualname__" :str,
 "__reduce__" :builtin-function-or-method,
 "__reduce_ex__" :builtin-function-or-method,
 "__repr__" :method-wrapper,
 "__self__" :none-type,
 "__setattr__" :method-wrapper,
 "__sizeof__" :builtin-function-or-method,
 "__str__" :method-wrapper,
 "__subclasshook__" :builtin-function-or-method,
 "__text_signature__" :none-type}

These are the steps that just occurred:

Given a Clojure function, create a C function with a particular signature and return its address. This is the most important step, as it requires dynamic code generation at the machine level to produce a C function that takes the arguments from C, converts them to a form Java can use, calls the Java function, and then marshals the result back into C and returns the value on the C stack.
Given a C function pointer, register a new function in the Python interpreter and return a Python object.
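The first step has a close analogue in Python's own ctypes module, which, like JNA, is built on libffi. The sketch below is not libpython-clj code; IntBinaryOp and add are names made up for illustration. It wraps a Python callable in a C function pointer, reads the raw address as an integer, and then calls back through that address:

```python
import ctypes

# C signature: int fn(int, int)
IntBinaryOp = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int, ctypes.c_int)


def add(a, b):
    return a + b


# Wrapping the Python callable makes ctypes generate a native entry point.
c_add = IntBinaryOp(add)

# The entry point is an ordinary address, much like the JNA Pointer value.
address = ctypes.cast(c_add, ctypes.c_void_p).value

# Anything holding that address can call it as a plain C function.
# c_add must stay referenced for as long as the address is in use.
from_address = IntBinaryOp(address)
result = from_address(2, 3)
```

The long integer is all the other side ever sees; the object that owns the generated code has to be kept alive by whoever created it.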

Let's walk through, top to bottom, what just happened. The return value is a JNA pointer to a python object, but what is the entire path?

->python

->python is a protocol function. Protocols are an extremely important Clojure language feature that lets you bolt interfaces onto objects after the fact, with performance comparable to JVM interfaces. These are not things that exist at the Java or Scala level; the closest comparison is being able to state, after a class (or interface) has already been defined, that it extends another interface, and to supply a concrete implementation of that extension at the same time.
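Python has no direct equivalent of protocols, but functools.singledispatch gives a rough feel for the idea: implementations are registered per type, after the fact, for classes we did not write. This is only a loose analogy, and to_wire and its return values are invented for the example:

```python
from fractions import Fraction
from functools import singledispatch


@singledispatch
def to_wire(value):
    """Fallback when no implementation has been registered for the type."""
    raise TypeError(f"no to_wire implementation for {type(value).__name__}")


# Bolt implementations onto existing types without modifying them.
@to_wire.register(int)
def _int_to_wire(value):
    return ("int", value)


@to_wire.register(Fraction)
def _fraction_to_wire(value):
    return ("fraction", value.numerator, value.denominator)


int_wire = to_wire(7)
fraction_wire = to_wire(Fraction(3, 4))
```

As with a protocol function, the implementation that runs is chosen by the type of the first argument.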

In our case, we have to find the actual protocol implementation that was called. This can be tough at times but luckily for us we know exactly which one was called because we wrote that derivation. The implementation of the ->python call for generic objects is defined in libpython-clj's python/object.clj. This calls the ->py-fn method which takes an object and attempts to create a python function.

->py-fn

Here we get our first look at actual Python functionality. We are calling libpython's PyCFunction_New function. This function is only partially documented; we found it by analyzing the exported symbols of libpython3.7m.so with the nm command:

chrisn@chrisn-lt-2:~/dev/techascent.com$ nm -D /usr/lib/x86_64-linux-gnu/libpython3.7m.so | grep -i function
0000000000236380 T _Py_AsyncFunctionDef
0000000000245710 T PyCFunction_Call
000000000015e920 T PyCFunction_ClearFreeList
00000000001606f0 T _PyCFunction_DebugMallocStats
00000000002456d0 T _PyCFunction_FastCallDict
0000000000245690 T _PyCFunction_FastCallKeywords
000000000015e970 T PyCFunction_Fini
000000000015f070 T PyCFunction_GetFlags
000000000015f100 T PyCFunction_GetFunction
000000000015f0b0 T PyCFunction_GetSelf
00000000001a2ad0 T PyCFunction_New
00000000001a29d0 T PyCFunction_NewEx
00000000006d4aa0 D PyCFunction_Type
00000000002451f0 T _Py_CheckFunctionResult
00000000002474e0 T PyEval_CallFunction
0000000000236450 T _Py_FunctionDef
0000000000244f00 T _PyFunction_FastCallDict
0000000000244d30 T _PyFunction_FastCallKeywords
00000000001c53b0 T PyFunction_GetAnnotations
00000000001c53f0 T PyFunction_GetClosure
00000000001c5690 T PyFunction_GetCode
00000000001c55d0 T PyFunction_GetDefaults
00000000001c5650 T PyFunction_GetGlobals
00000000001c54e0 T PyFunction_GetKwDefaults
00000000001c5610 T PyFunction_GetModule
00000000001db560 T PyFunction_New
00000000001db3b0 T PyFunction_NewWithQualName
00000000001c5300 T PyFunction_SetAnnotations
00000000001c5a10 T PyFunction_SetClosure
00000000001c5520 T PyFunction_SetDefaults
00000000001c5430 T PyFunction_SetKwDefaults
00000000006dc740 D PyFunction_Type
0000000000242160 T PyInstanceMethod_Function
0000000000242340 T PyMethod_Function
000000000016daa0 T PyModule_AddFunctions
00000000002475e0 T PyObject_CallFunction
0000000000246750 T PyObject_CallFunctionObjArgs
00000000002473e0 T _PyObject_CallFunction_SizeT
00000000006ffa40 B PyOS_ReadlineFunctionPointer
chrisn@chrisn-lt-2:~/dev/techascent.com$

Here is its interface definition in C:

PyObject *
PyCFunction_New(PyMethodDef *ml, PyObject *self);

It requires a python method definition! Here JNA really can help us out; we define a Java class that contains the appropriate member variables and derives from the appropriate JNA struct and JNA takes care of mapping that struct class to/from a raw block of memory.

So we make JNA bindings to the C PyMethodDef struct.

And of course the PyMethodDef itself requires a PyCFunction, which is effectively a void* whose real type is determined by the ml_flags member. So in essence the method definition requires a pointer to a function, stored in its ml_meth member. The actual function signature can be one of several, and which one it is is declared by the ml_flags member of the method definition.

In object.clj, the apply-method-def-data! function converts something defined in Java into the long integer (void*) that the system actually knows about. In our case, because we passed in a generic Clojure function, we call the wrap-clojure-fn function, which creates an instance of the JNA interface CFunction$TupleFunction.

Given that we know which interface we will be using, we can set the ml_flags member to the appropriate constant, but we still need to take an implementation of the CFunction$TupleFunction interface and get back an actual C pointer to store in ml_meth.
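The same registration can be sketched from inside Python with ctypes, which may make the shape of PyMethodDef easier to see. This is an illustration under assumptions, not libpython-clj's code: it requires CPython, the struct layout and the METH_VARARGS value are transcribed from CPython's headers, and _sum_args and sum_args are hypothetical names. It uses PyCFunction_NewEx, the three-argument sibling of PyCFunction_New that appears in the nm listing:

```python
import ctypes

METH_VARARGS = 0x0001  # flag value from CPython's methodobject.h

# METH_VARARGS signature: PyObject *fn(PyObject *self, PyObject *args).
# self is declared as a raw pointer because it is NULL in this sketch.
PyCFunction = ctypes.PYFUNCTYPE(ctypes.py_object, ctypes.c_void_p, ctypes.py_object)


class PyMethodDef(ctypes.Structure):
    _fields_ = [
        ("ml_name", ctypes.c_char_p),
        ("ml_meth", PyCFunction),    # pointer to the C implementation
        ("ml_flags", ctypes.c_int),  # declares which signature ml_meth has
        ("ml_doc", ctypes.c_char_p),
    ]


def _sum_args(_self, args):
    # args arrives as a tuple of the positional arguments.
    return sum(args)


# Both objects must outlive the function object created from them.
_callback = PyCFunction(_sum_args)
_method_def = PyMethodDef(b"sum_args", _callback, METH_VARARGS, b"Sum the arguments.")

new_ex = ctypes.pythonapi.PyCFunction_NewEx
new_ex.restype = ctypes.py_object
new_ex.argtypes = [ctypes.POINTER(PyMethodDef), ctypes.c_void_p, ctypes.c_void_p]

# self and module are passed as NULL.
sum_args = new_ex(ctypes.byref(_method_def), None, None)
total = sum_args(1, 2, 3)
type_name = type(sum_args).__name__
```

On CPython the resulting object should report the same builtin_function_or_method type that py/python-type showed for py-fn earlier.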

JNA - The Magic

The problem is that we have an implementation of an interface that takes some arguments. We want to map this interface to a C function that takes those same arguments, but one argument, the 'this' reference, must be encoded in the function somehow.

This problem, which is specifically:

How can I take a member function and create a stand-alone function from it?

is something Rich Hickey was concerned about in 1994. The basic idea is to dynamically create a new function that hard-codes the 'this' object, so that 'this' can be elided while the arguments are transformed from C and the return value is transformed back. Some refer to this new object as a trampoline, where you are in essence closing over some of the function arguments and converting between systems. The interesting thing is that it creates a new C function dynamically, upon request.

In our code, we call CallbackReference/getFunctionPointer. The important part of this is that JNA then creates a CallbackReference object with our interface, and the constructor of CallbackReference creates a trampoline.

In particular we need a trampoline that:

takes the arguments from C
ensures the jvm is attached to this thread
produces an array of JVM objects
finds the 'this' object and calls our interface implementation
marshals the result back to C
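The list above can be mimicked in a few lines of Python with ctypes. This is a sketch of the idea rather than of JNA's implementation: Summer, make_trampoline, and SumCallback are hypothetical names, and the thread-attachment step has no visible counterpart because ctypes acquires the interpreter lock for us.

```python
import ctypes

# C signature: long fn(const long *values, size_t count)
SumCallback = ctypes.CFUNCTYPE(
    ctypes.c_long, ctypes.POINTER(ctypes.c_long), ctypes.c_size_t
)


class Summer:
    def __init__(self, offset):
        self.offset = offset

    def total(self, values):
        return self.offset + sum(values)


def make_trampoline(target):
    """Build a C-callable function that closes over `target`."""

    def trampoline(values_ptr, count):
        # Take the raw C arguments and produce host-language objects.
        values = [values_ptr[i] for i in range(count)]
        # The closed-over target carries its own 'this'; C never sees it.
        result = target(values)
        # Marshal the result back to a C long.
        return int(result)

    return SumCallback(trampoline)


# The bound method hides the Summer instance from the C signature.
c_fn = make_trampoline(Summer(100).total)
data = (ctypes.c_long * 3)(1, 2, 3)
result = c_fn(data, 3)
```

The C-side signature has no slot for the Summer instance, yet the call reaches it and returns 106, which is the point of the trampoline.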

Let's see how JNA accomplishes this task:

peer = Native.createNativeCallback(proxy, PROXY_CALLBACK_METHOD,
                                          nativeParamTypes, returnType,
                                          callingConvention, flags,
                                          encoding);

The proxy in this case is the object that can take an array of arguments and call the interface. So the unknown in this case is the createNativeCallback method.

static synchronized native long createNativeCallback(Callback callback,
                                                     Method method,
                                                     Class<?>[] parameterTypes,
                                                     Class<?> returnType,
                                                     int callingConvention,
                                                     int flags,
                                                     String encoding);

OK, this is an actual JNI function that is defined in C so let's check it out:

JNIEXPORT jlong JNICALL
Java_com_sun_jna_Native_createNativeCallback(JNIEnv *env,
                                             jclass UNUSED(cls),
                                             jobject obj,
                                             jobject method,
                                             jobjectArray arg_types,
                                             jclass return_type,
                                             jint call_conv,
                                             jint options,
                                             jstring encoding) {
  callback* cb =
    create_callback(env, obj, method, arg_types, return_type,
                    call_conv, options, encoding);

  return A2L(cb);
}

createNativeCallback calls create_callback in callback.c. This is the magic: this is where JNA uses the libffi closure API to create a new function that calls back into JNA's dispatch_callback function with the C arguments and the closed-over environment.

Closing Out

Let's look at a condensed version of the demo one more time:

user> (py/call-attr np "apply_along_axis" #(apply + %) 0 test-ary)
[12 15 18]

Now you know that that small bit of code is actually doing quite a lot of heavy lifting! The crux of the matter is creating a C function pointer from a method on a Java interface; JNA accomplishes this via libffi. One name for this mechanism is a trampoline, although we could also call it a closure. The significant difference between the two is that a trampoline is often expected to perform a simple set of transformations before calling the wrapped function and again upon its return. The technique is not uncommon in operating system code, or in any code that interoperates between object systems and C-based functions, and thus it is important to be aware of.

We enjoy this type of research and feel it is important that these techniques are known to wider audiences as they are some of the more obscure but necessary components to building a great bridge between two languages or between a language and some hardware.

A simple C interface is sometimes the best possible interface between two different systems. Building these interfaces and integrating with them smoothly and dynamically enables interaction with other complex systems and can cut the dependencies between them. Trampolining is an integral part of doing this dynamically, so the technique presented here can be used to bind the JVM into several types of environments.

In this way you can gain the benefits from binding the JVM's extremely strong garbage collection system, libraries, and runtime optimizations to systems that have made very different tradeoffs with potentially very complementary characteristics.

TechAscent: Bolting the future onto the now