Luminoth's Web API: Hackathon-Speed Integration of Neural Networks
One of the more challenging aspects of developing SDKs with machine learning models is deployment and productionization. TensorFlow in particular can be difficult to set up and requires GPUs to evaluate large models. This post shares my experience skirting that process entirely to quickly evaluate a FasterRCNN-based model during a hackathon last year, in a way usable from any office or CI machine.
During this hackathon, I implemented and trained a model from an ICDAR 2017 paper on one of our physical machine-learning-equipped machines. To achieve quick deliverables, rather than try to get the trained model and data off the machine, I simply used a tool called Luminoth running on the machine to expose the model's prediction functionality. This also allowed anybody on my team to continue developing the model afterward with minimal friction, and required only a small networking shim in our codebase.
Luminoth is a Python-based tool that I like to refer to as "a command-line wrapper around TensorFlow." Its main use is quickly setting up and training some popular networks such as FasterRCNN from a YAML file, but it also exposes a Flask-based server which allows prediction queries via a web page. As it turns out, it also exposes an (undocumented) API which is usable programmatically.
My codebase is in C++ with a C# assembly wrapping it. That being the case, I had to get my model's predictions (a number of bounding boxes) into C++ code, and fast. Figuring out TensorFlow's shaky C++ API (or even using Python-based TensorFlow) wasn't an option: the model was already trained on our machine-learning computer, and evaluating it anywhere else would have required a large setup cost and data duplication. I also had my eye on a particular C++ networking library, CPR, that I had been meaning to use; so I thought, why not tackle all of these problems at once?
Let's start by figuring out Luminoth's API from the source and the web page itself.
First, using Luminoth's server as per the documentation shows requests being made to an endpoint named `api/fasterrcnn/predict`. We can see it's returning some JSON. Great: we now know it's probably possible to invoke programmatically!
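For orientation, the prediction reply is JSON shaped roughly like this (the field names and values here are illustrative, based on poking at the web page; treat the exact schema as an assumption and check what your server actually returns):

```
{
  "objects": [
    {"bbox": [132, 40, 305, 216], "label": "0", "prob": 0.9271},
    {"bbox": [10, 12, 102, 88], "label": "0", "prob": 0.8954}
  ]
}
```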
Digging into Luminoth's `web.py` (around line 31 at the time of writing), the handler for the `/api/<model_name>/predict/` endpoint is our ticket.
The first thing we see is a call to `get_image()`, which attempts to retrieve the image data to predict on from the request. What is `get_image()`? It shows an expectation of a file POSTed under the name `image`.
This is a Flask web server. The Flask documentation for the `files` property of the `Request` object shows that, for our purposes, it is only populated on a POST request from a `<form>` with an encoding of `enctype="multipart/form-data"`. Right, sounds like we now know how to use the endpoint programmatically. Now, how can we call this from C++ using CPR?
Let's start with the POST request. Using CPR, this is very straightforward. The required multipart/form-data encoding is handled by the `cpr::Multipart` object. At the time of writing, there is a bug combining it with in-memory data buffers; so to keep the hackathon moving, the image was first written to a file and then sent from disk. Avoid that workaround if you can.
In the request, `url` is the URL of the Luminoth endpoint we found, and `data` and `data_size` hold the image we are trying to use FasterRCNN to predict on.
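A minimal sketch of that call with CPR might look like the following (the `image` field name follows the Flask analysis above; the function name and temporary file path are my own, and the file round-trip is only the workaround for the buffer bug):

```cpp
#include <cpr/cpr.h>
#include <cstddef>
#include <fstream>
#include <string>

// Send an image to Luminoth's prediction endpoint and return the raw JSON reply.
std::string PredictImage(const std::string& url, const void* data, size_t data_size) {
    // Workaround for the cpr::Multipart data-buffer bug: write the image to a
    // temporary file so CPR can read it back from disk. Path is illustrative.
    const std::string tmp_path = "predict_tmp.jpg";
    {
        std::ofstream out(tmp_path, std::ios::binary);
        out.write(static_cast<const char*>(data), static_cast<std::streamsize>(data_size));
    }

    // POST as multipart/form-data, with the file under the field name "image",
    // matching what Luminoth's get_image() expects.
    cpr::Response response = cpr::Post(
        cpr::Url{url},
        cpr::Multipart{{"image", cpr::File{tmp_path}}});

    return response.text;  // the JSON string containing the predicted boxes
}
```

Calling it is then one line, e.g. `std::string reply = PredictImage("http://ml-box:5000/api/fasterrcnn/predict", data, data_size);` (hostname and port are illustrative).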
The POST request returns a JSON string, which we need to decode. Luckily, there is a superb header-only JSON library, nlohmann/json (which I think has the potential to become part of the C++ standard library; by all means use it), that we can drop right in to get back a vector of RECTs and their confidences.
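A sketch of that decoding step with nlohmann/json; the `objects`/`bbox`/`prob` field names are my reading of Luminoth's response, so adjust them to whatever your server actually returns:

```cpp
#include <nlohmann/json.hpp>
#include <string>
#include <utility>
#include <vector>

// Same layout as the Win32 RECT; declared here so the sketch is self-contained
// (omit this and include <windows.h> in the real codebase).
struct RECT { long left, top, right, bottom; };

// Decode Luminoth's JSON reply into bounding boxes plus confidences.
std::vector<std::pair<RECT, double>> ParsePredictions(const std::string& json_text) {
    std::vector<std::pair<RECT, double>> boxes;
    const nlohmann::json reply = nlohmann::json::parse(json_text);
    for (const auto& object : reply["objects"]) {
        const auto& bbox = object["bbox"];  // [x, y, right, bottom]
        RECT box{bbox[0].get<long>(), bbox[1].get<long>(),
                 bbox[2].get<long>(), bbox[3].get<long>()};
        boxes.emplace_back(box, object["prob"].get<double>());
    }
    return boxes;
}
```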
Note that the boxes are returned in an X/Y/Right/Bottom format. If you need an X/Y/Width/Height format, it's easily convertible. From there, the bounding boxes can be passed on throughout the codebase, and the method's improvements over current methods can be measured.
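The conversion is just a subtraction per axis; a minimal helper, with struct names of my own choosing:

```cpp
// X/Y/Right/Bottom box, as returned by Luminoth.
struct BoxLTRB { long left, top, right, bottom; };

// X/Y/Width/Height box.
struct BoxXYWH { long x, y, width, height; };

// Convert an X/Y/Right/Bottom box to X/Y/Width/Height.
BoxXYWH ToXYWH(const BoxLTRB& b) {
    return BoxXYWH{b.left, b.top, b.right - b.left, b.bottom - b.top};
}
```

For example, a box of `{10, 20, 110, 70}` becomes `{10, 20, 100, 50}`.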
You'll have to excuse the use of void pointers, pointers to vector, `new`, and other frowned-upon items. The use of CPR also introduced an additional problem: the C++ codebase is built with MSVC 11.0, and CPR requires MSVC 14.0 or later. To integrate it, the CPR code was compiled into a separate DLL exposing a C API, loaded dynamically via LoadLibrary from the main source. But these are implementation details; again, it was simply the quickest way to get results.
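That shim boils down to one exported C function and a runtime symbol lookup; a sketch, with the DLL name and entry point of my own naming:

```cpp
#include <windows.h>

// In the MSVC 14.0 DLL, the CPR-based call is exported with C linkage:
//   extern "C" __declspec(dllexport)
//   int PredictBoxes(const char* url, const void* data, size_t size);

// In the MSVC 11.0 codebase, the DLL is loaded and the symbol resolved at runtime.
typedef int (*PredictBoxesFn)(const char* url, const void* data, size_t size);

PredictBoxesFn LoadPredictBoxes() {
    HMODULE dll = LoadLibraryA("LuminothShim.dll");  // hypothetical DLL name
    if (!dll) {
        return nullptr;  // DLL not found next to the executable
    }
    return reinterpret_cast<PredictBoxesFn>(
        GetProcAddress(dll, "PredictBoxes"));  // hypothetical export name
}
```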
That's about it for this post. All in all, I believe Luminoth is an underrated, but also unfinished, machine learning tool. It's a good choice for a quick way to train, save state, and evaluate neural networks. The API allows high-speed integration of a model into existing code in any language, after which a results analysis can determine whether to productionize the model further.
James Waugh, Software Engineer III
James Waugh began his career at Accusoft in 2016, where he started in technical support. Since then, he obtained a Software Engineer III role on Accusoft's SDK products, where he focuses on machine learning, PDF, and image processing technologies. A graduate of the University of South Florida, James holds a Computer Engineering degree. He also attends local technology meetups such as Barcamp Tampa regularly. Outside the office, James enjoys taking his classic Chevrolet Bel Air to car shows and practicing Japanese.