The TechAscent Blog

Putting Some Pieces Together: OpenCV

OpenCV is the foremost image processing library on the planet.


Most, if not all, machine learning systems that deal with images use it. Its bevy of interesting algorithms goes far beyond raw image manipulation, and it is really good at raw image manipulation. So, regardless of whether the unknowable claim of "foremost" above could be true, it is our considered position that OpenCV makes all of the legacy Java image processing fairly pointless, especially if what you are doing is actually processing images (as opposed to setting icons on AWT widgets).

The problem with OpenCV, from a Java perspective, really lies in needing to learn a set of custom interfaces to do non-custom things.

Like juggling arrays of JVM primitives...

Oh, and dealing with native libraries...

Now we can see an interesting application of those two: tech.opencv.

Image Joy!

user> (require '[tech.opencv :as opencv])
WARNING: cast already refers to: #'clojure.core/cast in namespace: tech.datatype.base, being replaced by: #'tech.datatype.base/cast
WARNING: cast already refers to: #'clojure.core/cast in namespace: tech.datatype, being replaced by: #'tech.datatype/cast
WARNING: load already refers to: #'clojure.core/load in namespace: tech.opencv, being replaced by: #'tech.opencv/load
user> (opencv/load "test/data/test.jpg")
#<org.bytedeco.javacpp.opencv_core$Mat@28a27eb4 org.bytedeco.javacpp.opencv_core$Mat[width=512,height=288,depth=8,channels=3]>
user> (def test-img *1)

test image

We got back a cv::Mat in the form of a JavaCPP pointer. Now what can we do with it?

user> (require '[clojure.core.matrix :as m])
user> (m/ecount test-img)
user> (require '[tech.datatype :as dtype])
user> (dtype/get-datatype test-img)
user> (m/shape test-img)
[288 512 3]
user> (def short-data (short-array (m/ecount test-img)))
user> (dtype/copy! test-img short-data)
user> (take 20 short-data)
(172 170 170 172 170 170 171 169 169 171 169 169 173 171 171 174 172 172 174 172)
user> ;; Note the correct representation of the data: 8-bit values above 127 survive the copy into shorts.
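
Why a short array rather than a byte array? JVM bytes are signed, so an 8-bit pixel value such as 172 wraps negative when read back as a byte. A minimal sketch of the widening in plain Clojure follows; `ubyte->short` is a hypothetical helper for illustration, not part of tech.datatype, whose internals are not shown in this post.

```clojure
;; JVM bytes are signed: the 8-bit pattern for 172 reads back as -84.
(def raw-pixel (unchecked-byte 172))

(defn ubyte->short
  "Widen a signed JVM byte that holds unsigned 8-bit data to a short.
  Hypothetical helper, for illustration only."
  [b]
  ;; Masking with 0xFF recovers the unsigned value, which fits in a short.
  (short (bit-and (long b) 0xFF)))

(ubyte->short raw-pixel)
;; => 172
```

Any copy that moves 8-bit image data into a wider JVM type has to do some equivalent of this masking to keep bright pixels from turning negative.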

Note that shape and ecount are really the only core.matrix functions supported on these matrices at this point. Over the next few posts, we will implement a bit more. That being said, you still have access to all of OpenCV.

m/shape gets you further than you might think:

user> (m/shape test-img)
[288 512 3]
user> (m/shape [test-img test-img test-img])
[3 288 512 3]
user> (m/shape (repeat 10 [test-img test-img test-img]))
[10 3 288 512 3]
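
One way to picture how nested sequences pick up extra leading dimensions: count each sequence level, then ask the leaf for its own shape. The sketch below uses a hypothetical `nested-shape` helper (this is not core.matrix's implementation), with plain maps carrying a `:shape` key standing in for images.

```clojure
(defn nested-shape
  "Shape of arbitrarily nested sequences whose leaves report their own
  shape through leaf-shape-fn. Assumes every level is non-empty and
  rectangular. Hypothetical helper, for illustration only."
  [leaf-shape-fn item]
  (if (sequential? item)
    ;; One leading dimension per sequence level, then recurse into the first child.
    (into [(count item)] (nested-shape leaf-shape-fn (first item)))
    (vec (leaf-shape-fn item))))

;; A stand-in for a 288x512 three-channel image.
(def fake-img {:shape [288 512 3]})

(nested-shape :shape (repeat 10 [fake-img fake-img fake-img]))
;; => [10 3 288 512 3]
```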

Copy of course works both ways:

user> (require '[clojure.core.matrix.macros :refer [c-for]])
user> (c-for [idx (int 0) (< idx (m/ecount test-img)) (inc idx)]
             (aset ^shorts short-data idx (short (quot (aget ^shorts short-data idx) 2))))
user> (def dest-img (opencv/new-mat 288 512 3))
user> (dtype/copy! short-data dest-img)
user> (opencv/save dest-img "darken.png")
#<org.bytedeco.javacpp.opencv_core$Mat@4674cb52 org.bytedeco.javacpp.opencv_core$Mat[width=512,height=288,depth=8,channels=3]>

Here, quot by 2 halves every value, bringing the data closer to zero and darkening the image.


The resource system is also in play, so matrices allocated within a resource context will be released when the resource context unwinds:

user> (require '[tech.resource :as resource])
user> (resource/with-resource-context
        (let [test-img (opencv/load "test/data/test.jpg")]
          test-img))
#<org.bytedeco.javacpp.opencv_core$Mat@37a79311 org.bytedeco.javacpp.opencv_core$Mat[address=0x0,position=0,limit=1,capacity=1,deallocator=org.bytedeco.javacpp.Pointer$NativeDeallocator[ownerAddress=0x0,deallocatorAddress=0x0]]>

Note that the pointer address is zero: the matrix escaped the context, but its native memory had already been released.
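
The general idea of a resource context can be sketched in a few lines of plain Clojure: keep a stack of release functions in a dynamic binding, and run them when the scope exits. Every name below (`*release-fns*`, `track!`, `with-context`) is hypothetical; this is an illustration of the pattern under those assumptions, not tech.resource's implementation.

```clojure
;; Hypothetical names throughout; not tech.resource's implementation.
(def ^:dynamic *release-fns* nil)

(defn track!
  "Register release-fn to run when the enclosing context unwinds.
  Returns item unchanged."
  [item release-fn]
  (when *release-fns*
    (swap! *release-fns* conj release-fn))
  item)

(defmacro with-context
  "Run body, then call every registered release fn, most recent first."
  [& body]
  `(binding [*release-fns* (atom ())]
     (try
       ~@body
       (finally
         ;; conj on a list prepends, so iteration order is last-in, first-out.
         (doseq [f# @*release-fns*] (f#))))))

;; The returned value outlives the context, but it has already been released.
(def leaked
  (with-context
    (let [buf (atom {:address 1234})]
      (track! buf #(swap! buf assoc :address 0)))))

(:address @leaked)
;; => 0
```

As with the zeroed pointer above, anything that needs to survive the context has to be copied out or created outside of it.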

You also have access to OpenCV's methods, but so far they are not wrapped nicely.

user> (import '[org.bytedeco.javacpp opencv_imgproc opencv_core opencv_core$Mat])
#<Class@16b65cc1 org.bytedeco.javacpp.opencv_core$Mat>
user> (def test-img (opencv/load "test/data/test.jpg"))
user> (def result-img (opencv/clone test-img))
user> (opencv_imgproc/blur test-img result-img (opencv/size 3 3))
user> (opencv/save result-img "blurry.png")
#<org.bytedeco.javacpp.opencv_core$Mat@6330f6da org.bytedeco.javacpp.opencv_core$Mat[width=512,height=288,depth=8,channels=3]>


They are, however, quite fast:

user> (time (dotimes [iter 100]
              (opencv_imgproc/blur test-img result-img (opencv/size 3 3))))
"Elapsed time: 73.80491 msecs"

Some Implementation Details

A simple conversion between an OpenCV matrix and a typed pointer (of the type of data in the matrix) drastically simplifies this kind of interop. With our system, all the intense code required to correctly marshal information into and out of OpenCV can be activated by implementing three protocol methods.


See also this protocol:


The OpenCV matrix class partially satisfies these with its Pointer class, and it has a .ptr member variable that points to the actual image data. Finishing the implementation on top of OpenCV's additional datatypes requires a few more functions:

(extend-type opencv_core$Mat
  (release-resource [item] (.release item) (.deallocate item))
  (dimensionality [m] (count (mp/get-shape m)))
  (get-shape [m] [(.rows m) (.cols m) (.channels m)])
  (is-scalar? [m] false)
  (is-vector? [m] true)
  (dimension-count [m dimension-number]
    (let [shape (mp/get-shape m)]
      (if (> (count shape) (long dimension-number))
        (get shape dimension-number)
        (throw (ex-info "Array does not have specific dimension"
                        {:dimension-number dimension-number
                         :shape shape})))))
  (element-count [m] (apply * (mp/get-shape m)))

  (get-datatype [m] (-> (.type m)

  (->ptr-backing-store [item] (jcpp-dtype/set-pointer-limit-and-capacity
                               (.ptr item)
                               (mp/element-count item)))
  ;; We gain a lot from inheriting from the base Pointer type (opencv_core$Mat
  ;; inherits from javacpp/Pointer). This allows all the machinery required to
  ;; interact with the datatype copy system to be implemented once in Pointer.

The last bit, copying raw data, is there to allow matrices to be used in sequences. For instance, converting a batch of images (three here) into a single floating-point array to be used in a neural network inference or training run:

user> (def test-ary (float-array (* 3 (m/ecount test-img))))
user> (dtype/copy-raw->item! (repeat 3 test-img) test-ary 0)
[#<[F@5ef59d3c> 1327104]
user> (take 10 test-ary)
(172.0 170.0 170.0 172.0 170.0 170.0 171.0 169.0 169.0 171.0)
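
The return value above pairs the destination with an offset (1327104 is three times the 442368 elements in one image), which suggests a contract of "copy everything back to back and report where you stopped". A rough sketch of that contract in plain Clojure, using a hypothetical `copy-seq->floats!` over primitive arrays rather than the real `dtype/copy-raw->item!`:

```clojure
(defn copy-seq->floats!
  "Copy each short array in srcs into dest back to back, starting at offset.
  Returns [dest next-offset] so that calls can be chained.
  Hypothetical helper, for illustration only."
  [srcs ^floats dest offset]
  (let [end (reduce (fn [off src]
                      (let [n (alength ^shorts src)]
                        (dotimes [i n]
                          ;; Widen each short to a float at its slot in dest.
                          (aset dest (+ (long off) i) (float (aget ^shorts src i))))
                        (+ (long off) n)))
                    (long offset)
                    srcs)]
    [dest end]))

(def pixels (short-array [172 170 170]))
(def dest (float-array 9))

(second (copy-seq->floats! (repeat 3 pixels) dest 0))
;; => 9
```

Returning the next offset is what lets a batch be assembled incrementally without the caller tracking element counts.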

The full code is around 250 lines.

We are leveraging our architecture that provides far easier integration of bulk primitive datatypes with native applications. So everything in this post can be short, simple, and exhibit predictable behavior.

Computing With Images

We have another library, tech.compute. This library provides substantial infrastructure for doing the types of math computations that come up all the time in our work. There is a lot to it, but suffice it to say that it was carefully engineered to work on native datatypes as well as JVM datatypes to make interoperability transparent. With these tools we can do (among other things) linear algebra operations directly on OpenCV images without copying the underlying buffer data. (!)

Here is the above example (darken by dividing by 2) written with the compute library:

;; Load CPU layer of compute library.
user> (require '[tech.compute.cpu.tensor-math :as cpu-tm])
user> (require '[tech.compute.tensor.operations :as op])
user> (-> (opencv/load "test/data/test.jpg")
          (op// 2)
          (opencv/save "tensor_darken.jpg"))
#<org.bytedeco.javacpp.opencv_core$Mat@1083378d org.bytedeco.javacpp.opencv_core$Mat[width=512,height=288,depth=8,channels=3]>



RGB to BGR conversion serves as a slightly more extensive example. Doing rgb->bgr translations or normalizing image data into a different range are just some of the extremely common things we ran into when working with images. We wish we had had this library from the beginning!
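
For a sense of what the conversion itself involves, here is a sketch of an RGB to BGR swap over a flat, interleaved pixel buffer in plain Clojure. It illustrates the operation only and is not the tech.compute version; `rgb->bgr!` is a hypothetical helper, and the interleaved r g b layout is an assumption.

```clojure
(defn rgb->bgr!
  "Swap the first and third channel of every pixel in place.
  Assumes pixels holds interleaved 3-channel data: r g b r g b ...
  Hypothetical helper, for illustration only."
  [^shorts pixels]
  (let [n-pixels (quot (alength pixels) 3)]
    (dotimes [p n-pixels]
      (let [r-idx (* 3 p)
            b-idx (+ r-idx 2)
            r     (aget pixels r-idx)]
        ;; The middle (green) channel is left untouched.
        (aset pixels r-idx (aget pixels b-idx))
        (aset pixels b-idx r)))
    pixels))

;; Two pixels: (10 20 30) and (40 50 60).
(vec (rgb->bgr! (short-array [10 20 30 40 50 60])))
;; => [30 20 10 60 50 40]
```

Applying the swap twice restores the original buffer, which makes for an easy sanity check.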

At TechAscent, we love this shit!
