I recommend setting up MaaXBoard headlessly and installing full Tensorflow first.
INTRO
What is Tensorflow Lite?
Tensorflow Lite is a leaner, more efficient version of Tensorflow. Running Tensorflow Lite requires two parts: the Tensorflow Lite interpreter and a model that has been converted to Tensorflow Lite format. Models are converted by pruning nodes that aren't needed for inference, quantizing nodes and weights, and performing other optimizations like compressing weights and folding constants.
Quantization is an important tool when adapting a machine learning model to run on edge devices. Quantization converts the large floating point numbers in a model to integers or smaller floats to make the model smaller and faster, at the price of some accuracy. An eight-bit integer multiply can use roughly 6X less energy and 6X less area than an IEEE 754 16-bit floating-point multiply.
Full integer quantization converts all floating point operations to integer operations, so a fully quantized Tensorflow Lite model can even run on hardware that doesn't have a floating point unit. This makes it ideal for edge/IoT devices like the Raspberry Pi and MaaXBoard, and even for microcontrollers.
Models don't have to be converted to Tensorflow Lite to be quantized; many non-converted models already have some degree of quantization. Conversely, just because a model has been converted to Tensorflow Lite doesn't mean it is free of floating point operations.
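To make the idea concrete, here is a minimal sketch (plain NumPy, not the actual TFLite kernels) of the affine mapping that 8-bit quantization uses: a float value is mapped to a uint8 with a scale and zero point, and mapped approximately back at inference time. The scale and zero point here are made-up example values.
import numpy as np

def quantize(x, scale, zero_point):
    # Affine quantization: q = round(x / scale) + zero_point, clamped to the uint8 range
    q = np.round(x / scale) + zero_point
    return np.clip(q, 0, 255).astype(np.uint8)

def dequantize(q, scale, zero_point):
    # Approximate recovery of the float values: x ~ (q - zero_point) * scale
    return (q.astype(np.float32) - zero_point) * scale

weights = np.array([-1.0, -0.1, 0.0, 0.5, 1.0], dtype=np.float32)
scale, zero_point = 2.0 / 255.0, 128     # example range covering roughly [-1.0, 1.0]
q = quantize(weights, scale, zero_point)
print(q)                                 # uint8 values
print(dequantize(q, scale, zero_point))  # close to, but not exactly, the originals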
How much less accurate is Tensorflow Lite than full Tensorflow?
There is no simple number for this. The difference in accuracy between TFLite and Tensorflow varies greatly depending on the model, the way it is converted, and the hardware that it is run on.
The only across-the-board benefits of using Tensorflow Lite are:
1.) The small size of the interpreter, and
2.) The ease with which it lets you optimize models for inference
In addition to Tensorflow Lite, Tensorflow also provides a Model Optimization Toolkit to reduce your model's size and increase its efficiency with minimal impact on accuracy.
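For example, post-training quantization from the Model Optimization Toolkit is just a one-line setting on the converter. This is a sketch that assumes a TF 2.x install and a trained Keras model saved at a hypothetical path:
import tensorflow as tf

# Load a trained Keras model (hypothetical file name) and convert it to TFLite
# with default post-training optimizations, which quantize the weights to 8 bits.
model = tf.keras.models.load_model("my_model.h5")
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("my_model_quantized.tflite", "wb") as f:
    f.write(tflite_model)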
Ok, enough blabbering. Let's get started with the install.
INSTALL THE TENSORFLOW LITE INTERPRETER
The Tensorflow Lite interpreter is a lightweight version of Tensorflow designed to avoid wasting space when all you want to do is run Tensorflow Lite models.
I'll be doing the install using remote desktop, which I show how to set up here. Log in to your MaaXBoard via remote desktop.
It's best practice to create a new virtual environment for tflite on your MaaXBoard, so we'll start by doing that:
mkvirtualenv tflite -p python3
workon tflite
The MaaXBoard is aarch64. Tensorflow provides instructions for building it from source for Arm 64 yourself. Thankfully, they also provide a prebuilt Python wheel (.whl), which can easily be installed with pip:
pip3 install https://dl.google.com/coral/python/tflite_runtime-2.1.0.post1-cp37-cp37m-linux_aarch64.whl
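For a quick sanity check that the runtime works, and to see the interpreter API that the label_image.py script below wraps, here's a minimal sketch. It assumes you already have a .tflite model on the board (we download one in the next section):
import numpy as np
from tflite_runtime.interpreter import Interpreter

# Load the model and allocate its tensors
interpreter = Interpreter(model_path="mobilenet_v1_1.0_224.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
print("input:", input_details[0]["shape"], input_details[0]["dtype"])

# Feed a dummy input of the right shape and type, then run inference
dummy = np.zeros(input_details[0]["shape"], dtype=input_details[0]["dtype"])
interpreter.set_tensor(input_details[0]["index"], dummy)
interpreter.invoke()
print("output shape:", interpreter.get_tensor(output_details[0]["index"]).shape)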
TEST A TFLITE MODEL
Models are trained with floating point numbers to get the best possible accuracy (although there is "quantization aware training"). Many available models don't yet have a Tensorflow Lite version, but the good news is that there are still dozens of pre-converted models to choose from on TensorFlow Hub.
We'll use the MobileNet v1 model provided in the Tensorflow repository examples, and we can test it out with the label_image.py example script from the same repository.
Create a new directory for the files on your MaaXBoard:
mkdir imgtest
cd imgtest
Download the python script, the photo, the model and the labels:
# get file
curl -LO https://raw.githubusercontent.com/tensorflow/tensorflow/master/tensorflow/lite/examples/python/label_image.py
# get photo
curl https://raw.githubusercontent.com/tensorflow/tensorflow/master/tensorflow/lite/examples/label_image/testdata/grace_hopper.bmp > grace_hopper.bmp
# get model
curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_2018_02_22/mobilenet_v1_1.0_224.tgz | tar xzv
# get labels
curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_1.0_224_frozen.tgz | tar xzv mobilenet_v1_1.0_224/labels.txt
Now run your model on the image.
python ./label_image.py --model_file mobilenet_v1_1.0_224.tflite --label_file mobilenet_v1_1.0_224/labels.txt --image grace_hopper.bmp
The script prints the top labels MobileNet finds in the image; for the Grace Hopper photo, the top result should be "military uniform".
Great job! You just ran Tensorflow Lite on the MaaXBoard!
CONVERT YOUR OWN MODEL TO TFLITE
If you want to use a model that hasn't been converted yet, it's not too difficult to get started converting models to Tensorflow Lite yourself (although optimizing your model is both an art and a science).
You could even do the conversion directly on the MaaXBoard, but since it requires downloading a couple of tools, I'll be doing it on my MacBook (running Catalina), where space isn't at a premium.
One of the key ways that Tensorflow Lite shrinks a model is by removing all of the nodes that aren't called during inference. To do that, you’ll need to inspect the model's graph to find the input and output tensors, so you can feed those into the conversion tool. There are several tools to graph the nodes of a model:
- Netron - an online tool that lets you visualize models in a lot of different formats; it's also available as a downloadable desktop app.
- summarize_graph - a Tensorflow graph_transforms tool (built and run with Bazel)
- TensorBoard - Tensorflow's visualization toolkit
- Colab (here's a Jupyter Notebook that uses Colab to find nodes).
I'll be using summarize_graph here since it's the most documented approach, although Netron is even more intuitive to use. Skip to the end if you want to see the graphs created in Netron.
DOWNLOAD A NON-CONVERTED MODEL
On your build machine, download the same MobileNet package we grabbed earlier, using curl:
curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_2018_02_22/mobilenet_v1_1.0_224.tgz | tar xzv
It should include both .tflite and .pb files. We'll be converting the .pb (protobuf) file.
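Once Tensorflow is installed (next step), you can also take a quick peek at the frozen graph's node names with a few lines of Python, without any of the heavier tools. This is just a sketch for orientation; summarize_graph and Netron give much more detail:
import tensorflow as tf

# Parse the frozen GraphDef and list a few of its nodes
graph_def = tf.compat.v1.GraphDef()
with open("mobilenet_v1_1.0_224_frozen.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

print("total nodes:", len(graph_def.node))
print("first node:", graph_def.node[0].name, graph_def.node[0].op)    # usually the input placeholder
print("last node: ", graph_def.node[-1].name, graph_def.node[-1].op)  # usually the output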
INSTALL TENSORFLOW ON YOUR BUILD MACHINE
It's easy to install Tensorflow with pip:
pip install tensorflow
More detailed instructions for installing on Mac are here. Unfortunately, the pip install doesn't include source-only tools like summarize_graph, which we'll use to get the input and output nodes. To use those, we'll have to build Tensorflow from source with Bazel. Instructions to build from source are below.
Install Bazel
Tensorflow 2.2 requires Bazel version 3.1.0.
Since this is an older version, it's not possible to use Homebrew to install it, so you'll have to download and install it manually:
curl -LO https://github.com/bazelbuild/bazel/releases/download/3.1.0/bazel-3.1.0-installer-darwin-x86_64.sh
chmod +x bazel-3.1.0-installer-darwin-x86_64.sh
./bazel-3.1.0-installer-darwin-x86_64.sh --user
Note: once installed, if you ever want to upgrade Bazel to the latest version, you can do so with Homebrew: brew upgrade bazel
You'll probably have to add Bazel to your path like I did. Bazel is installed in a directory called "bin" under your home directory. Add it to your path (in this case for zsh):
echo 'export PATH="$PATH:$HOME/bin"' >> ~/.zshrc
Restart your shell and check to make sure it's installed:
bazel --version
bazel 3.1.0
Download the Tensorflow build files and run "configure."
git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow
./configure
During the configuration, when prompted you can answer 'N' for everything that requires a yes or no, and accept the default (press enter/return) for everything else.
Build Tensorflow. This will take a long time - it took about 3 hours for me - so you might want to go for a hike or something while it's running:
bazel build //tensorflow/tools/pip_package:build_pip_package
INSPECT THE GRAPH TO GET INPUTS AND OUTPUTS
What does it mean to inspect the graph? The graph of a machine learning model shows all of the nodes and layers along with their inputs and outputs.
Create a Bazel Workspace and build Summarize_Graph:
touch WORKSPACE
bazel build tensorflow/tools/graph_transforms:summarize_graph
This will take a long time.
Run Bazel to get input and output nodes
Once summarize_graph is built, run it on the frozen MobileNet v1 graph to get the input and output nodes:
bazel-bin/tensorflow/tools/graph_transforms/summarize_graph --in_graph=/Users/monica/Downloads/mobilenet_v1_1.0_224_frozen.pb
As you can see, it found one possible input, and one possible output:
- Input: name=input, type=float32, shape=[1,224,224,3]
- Output: name=MobilenetV1/Predictions/Reshape_1, op=Reshape
Now that we have the input and output nodes, we have everything we need to convert the model. Run the toco converter through Bazel:
bazel run tensorflow/lite/toco:toco -- \
--input_file=/Users/monica/Downloads/mobilenet_v1_1.0_224_frozen.pb \
--output_file=/Users/monica/Downloads/mobilenet_v1_1.0_224_frozen-CONVERTED.tflite \
--input_arrays=input \
--output_arrays=MobilenetV1/Predictions/Reshape_1 \
--input_shapes=1,224,224,3 \
--inference_type=QUANTIZED_UINT8 \
--mean_values=128 \
--std_values=128 \
--default_ranges_min=0.0 \
--default_ranges_max=255.0 \
--change_concat_input_ranges=false \
--allow_custom_ops
Voila! Your Tensorflow Lite model should now be in the same directory as the original model.
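Before copying it to the board, you can sanity-check the converted model on your build machine with tf.lite.Interpreter (the same API as the tflite_runtime interpreter we installed earlier). A quick sketch:
import tensorflow as tf

# Load the converted model and inspect its tensors to confirm the conversion worked
interpreter = tf.lite.Interpreter(
    model_path="/Users/monica/Downloads/mobilenet_v1_1.0_224_frozen-CONVERTED.tflite")
interpreter.allocate_tensors()

inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]
print("input :", inp["name"], inp["shape"], inp["dtype"])   # expect uint8 after QUANTIZED_UINT8
print("output:", out["name"], out["shape"], out["dtype"])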
We could have gotten all of that information without building Tensorflow from source and installing Bazel, just by opening the model and looking at it in Netron, as the images above show. But hey - at least now you're familiar with Bazel!
In case you want to do it differently next time, you can write a script using Tensorflow's TFLiteConverter API, or you can use the command line tool they provide. Here's how to do both:
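For the Python API route, a conversion script for the same frozen graph might look like this. This is a sketch using the TF1-compatible converter (since our model is a frozen GraphDef); the output file name is just an example, and the optimizations line is optional post-training quantization rather than the full uint8 quantization the toco command above performs:
import tensorflow as tf

# Convert a frozen GraphDef to TFLite using the input/output names found earlier
converter = tf.compat.v1.lite.TFLiteConverter.from_frozen_graph(
    graph_def_file="mobilenet_v1_1.0_224_frozen.pb",
    input_arrays=["input"],
    output_arrays=["MobilenetV1/Predictions/Reshape_1"],
    input_shapes={"input": [1, 224, 224, 3]},
)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # optional post-training quantization
tflite_model = converter.convert()

with open("mobilenet_v1_1.0_224-api-converted.tflite", "wb") as f:
    f.write(tflite_model)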
BONUS: Install the TFLite command line converter
If you want to do the conversion without Bazel next time, the reference and examples for the command line tool are here. Here's a sample conversion command (this one is for a quantized SSD MobileNet model, so the paths and node names differ from ours):
tflite_convert \
--output_file=/ssd_mobilenet_v1_quantized_300x300_coco14_sync_2018_07_18/ssd_mobilenet_v1_quantized_300x300_coco14.tflite \
--graph_def_file=/ssd_mobilenet_v1_quantized_300x300_coco14_sync_2018_07_18/ssd_mobilenet_v1_quantized_300x300_coco14.pb \
--output_format=TFLITE \
--input_shapes=1,300,300,3 \
--input_arrays=normalized_input_image_tensor \
--output_arrays='TFLite_Detection_PostProcess','TFLite_Detection_PostProcess:1','TFLite_Detection_PostProcess:2','TFLite_Detection_PostProcess:3' \
--inference_type=QUANTIZED_UINT8 \
--mean_values=128 \
--std_dev_values=127 \
--change_concat_input_ranges=false \
--allow_custom_ops
RUN YOUR QUANTIZED MODEL ON YOUR MAAXBOARD
Finally, it's time to run your quantized model on your MaaXBoard.
Even Tensorflow Lite models can be quite large. Luckily, my TFLite model ended up being a mere 6.9MB (as opposed to 27.7MB for the full model). Still, it's good to make sure you have enough space left on your MaaXBoard. Log in to your MaaXBoard and type:
df
From your host computer, copy over your model to the MaaXBoard:
scp mobilenet_v1_1.0_224_frozen-CONVERTED.tflite ebv@[IP ADDRESS]:
Back on the MaaXBoard, move the model into the imgtest directory we used for the previous model:
mv mobilenet_v1_1.0_224_frozen-CONVERTED.tflite ~/imgtest/mobilenet_v1_1.0_224_frozen-CONVERTED.tflite
cd imgtest
Finally, run the model on an image:
python ./label_image.py --model_file mobilenet_v1_1.0_224_frozen-CONVERTED.tflite --label_file mobilenet_v1_1.0_224/labels.txt --image grace_hopper.bmp