I recommend setting up MaaXBoard headlessly and installing full Tensorflow first.
INTRO
What is Tensorflow Lite?
Tensorflow Lite is a leaner, more efficient version of Tensorflow. Running Tensorflow Lite requires two parts: the Tensorflow Lite interpreter and a model that has been converted to Tensorflow Lite format. Models are converted by pruning nodes that aren't needed for inference, quantizing nodes and weights, and performing other optimizations like compressing weights and folding constants.
Quantization is an important tool when adapting a machine learning model to run on edge devices. Quantization converts the large floating point numbers in a model to integers or smaller floats to make the model smaller and faster, at the price of some accuracy. An eight-bit integer multiply can use roughly 6X less energy and 6X less area than an IEEE 754 16-bit floating-point multiply.
Full integer quantization converts all floating point operations to integer operations, so a fully quantized Tensorflow Lite model can even run on hardware that doesn't have a floating point unit. This makes it ideal for edge/IoT devices like the Raspberry Pi and MaaXBoard, and even for microcontrollers.
Models don't have to be converted to Tensorflow Lite to be quantized; many non-converted models already have some degree of quantization. Conversely, just because a model has been converted to Tensorflow Lite doesn't mean it is free of floating point operations.
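To make the idea concrete, here is a minimal sketch (plain NumPy, not the actual TFLite kernels) of the affine mapping that 8-bit quantization uses: a float value is mapped to a uint8 with a scale and zero point, and mapped approximately back at inference time. The scale and zero point here are made-up example values.
import numpy as np

def quantize(x, scale, zero_point):
    # Affine quantization: q = round(x / scale) + zero_point, clamped to the uint8 range
    q = np.round(x / scale) + zero_point
    return np.clip(q, 0, 255).astype(np.uint8)

def dequantize(q, scale, zero_point):
    # Approximate recovery of the float values: x ~ (q - zero_point) * scale
    return (q.astype(np.float32) - zero_point) * scale

weights = np.array([-1.0, -0.1, 0.0, 0.5, 1.0], dtype=np.float32)
scale, zero_point = 2.0 / 255.0, 128     # example range covering roughly [-1.0, 1.0]
q = quantize(weights, scale, zero_point)
print(q)                                 # uint8 values
print(dequantize(q, scale, zero_point))  # close to, but not exactly, the originals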
How much less accurate is Tensorflow Lite than full Tensorflow?
There is no simple number for this. The difference in accuracy between TFLite and Tensorflow varies greatly depending on the model, the way it is converted, and the hardware that it is run on.
The only across-the-board benefits of using Tensorflow Lite are:
1.) The small size of the interpreter, and
2.) The ease with which it lets you optimize models for inference
In addition to Tensorflow Lite, Tensorflow also provides a Model Optimization Toolkit to reduce your model's size and increase its efficiency with minimal impact on accuracy.
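For example, post-training quantization from the Model Optimization Toolkit is just a one-line setting on the converter. This is a sketch that assumes a TF 2.x install and a trained Keras model saved at a hypothetical path:
import tensorflow as tf

# Load a trained Keras model (hypothetical file name) and convert it to TFLite
# with default post-training optimizations, which quantize the weights to 8 bits.
model = tf.keras.models.load_model("my_model.h5")
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("my_model_quantized.tflite", "wb") as f:
    f.write(tflite_model)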
Ok, enough blabbering. Let's get started with the install.
INSTALL THE TENSORFLOW LITE INTERPRETER
The Tensorflow Lite interpreter is a lightweight version of Tensorflow designed to avoid wasting space when all you want to do is run Tensorflow Lite models.
I'll be doing the install using remote desktop, which I show how to set up here. Log in to your MaaXBoard via remote desktop.
It's best practice to create a new virtual environment for tflite on your MaaXBoard, so we'll start by doing that:
mkvirtualenv tflite -p python3
workon tflite
The MaaXBoard is aarch64. Tensorflow provides instructions for building it from source for Arm 64 yourself. Thankfully, they also provide a prebuilt Python wheel (.whl), which can easily be installed with pip:
pip3 install https://dl.google.com/coral/python/tflite_runtime-2.1.0.post1-cp37-cp37m-linux_aarch64.whl
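For a quick sanity check that the runtime works, and to see the interpreter API that the label_image.py script below wraps, here's a minimal sketch. It assumes you already have a .tflite model on the board (we download one in the next section):
import numpy as np
from tflite_runtime.interpreter import Interpreter

# Load the model and allocate its tensors
interpreter = Interpreter(model_path="mobilenet_v1_1.0_224.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
print("input:", input_details[0]["shape"], input_details[0]["dtype"])

# Feed a dummy input of the right shape and type, then run inference
dummy = np.zeros(input_details[0]["shape"], dtype=input_details[0]["dtype"])
interpreter.set_tensor(input_details[0]["index"], dummy)
interpreter.invoke()
print("output shape:", interpreter.get_tensor(output_details[0]["index"]).shape)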
TEST A TFLITE MODEL
Models are trained with floating point numbers to get the best possible accuracy (although there is "quantization aware training"). Many available models don't yet have a Tensorflow Lite version, but the good news is that there are still dozens of pre-converted models to choose from on TensorFlow Hub.
We'll use the MobileNet v1 model provided in the Tensorflow repository examples, and we can test it out with the label_image.py example script from the same repository.
Create a new directory for the files on your MaaXBoard:
mkdir imgtest
cd imgtest
Download the python script, the photo, the model and the labels:
# get file
curl -LO https://raw.githubusercontent.com/tensorflow/tensorflow/master/tensorflow/lite/examples/python/label_image.py
# get photo
curl https://raw.githubusercontent.com/tensorflow/tensorflow/master/tensorflow/lite/examples/label_image/testdata/grace_hopper.bmp > grace_hopper.bmp
# get model
curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_2018_02_22/mobilenet_v1_1.0_224.tgz | tar xzv
# get labels
curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_1.0_224_frozen.tgz | tar xzv mobilenet_v1_1.0_224/labels.txt
Now run your model on the image.
python ./label_image.py --model_file mobilenet_v1_1.0_224.tflite --label_file mobilenet_v1_1.0_224/labels.txt --image grace_hopper.bmp
The script prints the top labels MobileNet finds in the image; for the Grace Hopper photo, the top result should be "military uniform".
Great job! You just ran Tensorflow Lite on the MaaXBoard!
CONVERT YOUR OWN MODEL TO TFLITE
If you want to use a model that hasn't been converted yet, it's not too difficult to get started converting models to Tensorflow Lite yourself (although optimizing your model is both an art and a science).
You could even do the conversion directly on the MaaXBoard, but since it requires downloading a couple of tools, I'll be doing it on my MacBook (running Catalina), where space isn't at a premium.
One of the key ways that Tensorflow Lite shrinks a model is by removing all of the nodes that aren't called during inference. To do that, you’ll need to inspect the model's graph to find the input and output tensors, so you can feed those into the conversion tool. There are several tools to graph the nodes of a model:
- Netron - an online tool that lets you visualize models in a lot of different formats; it's also available as a downloadable desktop app.
- summarize_graph - a Tensorflow graph_transforms tool (built and run with Bazel)
- TensorBoard - Tensorflow's visualization toolkit
- Colab (here's a Jupyter Notebook that uses Colab to find nodes).
I'll be using summarize_graph here since it's the most documented approach, although Netron is even more intuitive to use. Skip to the end if you want to see the graphs created in Netron.
DOWNLOAD A NON-CONVERTED MODEL
On your build machine, download the same MobileNet package we grabbed earlier, using curl:
curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_2018_02_22/mobilenet_v1_1.0_224.tgz | tar xzv
It should include both .tflite and .pb files. We'll be converting the .pb (protobuf) file.
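Once Tensorflow is installed (next step), you can also take a quick peek at the frozen graph's node names with a few lines of Python, without any of the heavier tools. This is just a sketch for orientation; summarize_graph and Netron give much more detail:
import tensorflow as tf

# Parse the frozen GraphDef and list a few of its nodes
graph_def = tf.compat.v1.GraphDef()
with open("mobilenet_v1_1.0_224_frozen.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

print("total nodes:", len(graph_def.node))
print("first node:", graph_def.node[0].name, graph_def.node[0].op)    # usually the input placeholder
print("last node: ", graph_def.node[-1].name, graph_def.node[-1].op)  # usually the output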
INSTALL TENSORFLOW ON YOUR BUILD MACHINE
It's easy to install Tensorflow with pip:
pip install tensorflow
More detailed instructions for installing on Mac are here. Unfortunately, the pip install doesn't include source-only tools like summarize_graph, which we'll use to get the input and output nodes. To use those, we'll have to build Tensorflow from source with Bazel. Instructions to build from source are below.
Install Bazel
Tensorflow 2.2 requires Bazel version 3.1.0.
Since this is an older version, it's not possible to use Homebrew to install it, so you'll have to download and install it manually:
curl -LO https://github.com/bazelbuild/bazel/releases/download/3.1.0/bazel-3.1.0-installer-darwin-x86_64.sh
chmod +x bazel-3.1.0-installer-darwin-x86_64.sh
./bazel-3.1.0-installer-darwin-x86_64.sh --user
Note: once installed, if you ever want to upgrade Bazel to the latest version, you can do so with Homebrew: brew upgrade bazel
You'll probably have to add Bazel to your path like I did. Bazel is installed in a directory called "bin" under your home directory. Add it to your path (in this case for zsh):
echo 'export PATH="$PATH:$HOME/bin"' >> ~/.zshrc
Restart your shell and check to make sure it's installed:
bazel --version
bazel 3.1.0
Download the Tensorflow build files and run "configure."
git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow
./configure
During the configuration, when prompted you can answer 'N' for everything that requires a yes or no, and accept the default (press enter/return) for everything else.
Build Tensorflow. This will take a long time - it took about 3 hours for me - so you might want to go for a hike or something while it's running:
bazel build //tensorflow/tools/pip_package:build_pip_package
INSPECT THE GRAPH TO GET INPUTS AND OUTPUTS
What does it mean to inspect the graph? The graph of a machine learning model shows all of the nodes and layers along with their inputs and outputs.
Create a Bazel Workspace and build Summarize_Graph:
touch WORKSPACE
bazel build tensorflow/tools/graph_transforms:summarize_graph
This will take a long time.
Run Bazel to get input and output nodes
Once summarize_graph is built, run it on the frozen MobileNet v1 graph to get the input and output nodes:
bazel-bin/tensorflow/tools/graph_transforms/summarize_graph --in_graph=/Users/monica/Downloads/mobilenet_v1_1.0_224_frozen.pb
As you can see, it found one possible input, and one possible output:
- Input: name=input, type=float32, shape=[1,224,224,3]
- Output: name=MobilenetV1/Predictions/Reshape_1, op=Reshape
Now that we have the input and output nodes, we have everything we need to convert the model. Run the toco converter through Bazel:
bazel run tensorflow/lite/toco:toco -- \
--input_file=/Users/monica/Downloads/mobilenet_v1_1.0_224_frozen.pb \
--output_file=/Users/monica/Downloads/mobilenet_v1_1.0_224_frozen-CONVERTED.tflite \
--input_arrays=input \
--output_arrays=MobilenetV1/Predictions/Reshape_1 \
--input_shapes=1,224,224,3 \
--inference_type=QUANTIZED_UINT8 \
--mean_values=128 \
--std_values=128 \
--default_ranges_min=0.0 \
--default_ranges_max=255.0 \
--change_concat_input_ranges=false \
--allow_custom_ops
Voila! Your Tensorflow Lite model should now be in the same directory as the original model.
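Before copying it to the board, you can sanity-check the converted model on your build machine with tf.lite.Interpreter (the same API as the tflite_runtime interpreter we installed earlier). A quick sketch:
import tensorflow as tf

# Load the converted model and inspect its tensors to confirm the conversion worked
interpreter = tf.lite.Interpreter(
    model_path="/Users/monica/Downloads/mobilenet_v1_1.0_224_frozen-CONVERTED.tflite")
interpreter.allocate_tensors()

inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]
print("input :", inp["name"], inp["shape"], inp["dtype"])   # expect uint8 after QUANTIZED_UINT8
print("output:", out["name"], out["shape"], out["dtype"])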
We could have gotten all of that information without building Tensorflow from source and installing Bazel, just by opening the model and looking at it in Netron, as the images above show. But hey - at least now you're familiar with Bazel!
In case you want to do it differently next time, you can write a script using Tensorflow's TFLiteConverter API, or you can use the command line tool they provide. Here's how to do both:
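For the Python API route, a conversion script for the same frozen graph might look like this. This is a sketch using the TF1-compatible converter (since our model is a frozen GraphDef); the output file name is just an example, and the optimizations line is optional post-training quantization rather than the full uint8 quantization the toco command above performs:
import tensorflow as tf

# Convert a frozen GraphDef to TFLite using the input/output names found earlier
converter = tf.compat.v1.lite.TFLiteConverter.from_frozen_graph(
    graph_def_file="mobilenet_v1_1.0_224_frozen.pb",
    input_arrays=["input"],
    output_arrays=["MobilenetV1/Predictions/Reshape_1"],
    input_shapes={"input": [1, 224, 224, 3]},
)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # optional post-training quantization
tflite_model = converter.convert()

with open("mobilenet_v1_1.0_224-api-converted.tflite", "wb") as f:
    f.write(tflite_model)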
BONUS: Install the TFLite command line converter
If you want to do the conversion without Bazel next time, the reference and examples for the command line tool are here. Here's a sample conversion command (this one is for a quantized SSD MobileNet model, so the paths and node names differ from ours):
tflite_convert \
--output_file=/ssd_mobilenet_v1_quantized_300x300_coco14_sync_2018_07_18/ssd_mobilenet_v1_quantized_300x300_coco14.tflite \
--graph_def_file=/ssd_mobilenet_v1_quantized_300x300_coco14_sync_2018_07_18/ssd_mobilenet_v1_quantized_300x300_coco14.pb \
--output_format=TFLITE \
--input_shapes=1,300,300,3 \
--input_arrays=normalized_input_image_tensor \
--output_arrays='TFLite_Detection_PostProcess','TFLite_Detection_PostProcess:1','TFLite_Detection_PostProcess:2','TFLite_Detection_PostProcess:3' \
--inference_type=QUANTIZED_UINT8 \
--mean_values=128 \
--std_dev_values=127 \
--change_concat_input_ranges=false \
--allow_custom_ops
RUN YOUR QUANTIZED MODEL ON YOUR MAAXBOARD
Finally, it's time to run your quantized model on your MaaXBoard.
Even Tensorflow Lite models can be quite large. Luckily, my TFLite model ended up being a mere 6.9MB (as opposed to 27.7MB for the full model). Still, it's good to make sure you have enough space left on your MaaXBoard. Log in to your MaaXBoard and type:
df
From your host computer, copy over your model to the MaaXBoard:
scp mobilenet_v1_1.0_224_frozen-CONVERTED.tflite ebv@[IP ADDRESS]:
Back on the MaaXBoard, move the model into the imgtest directory we used for the previous model:
mv mobilenet_v1_1.0_224_frozen-CONVERTED.tflite ~/imgtest/mobilenet_v1_1.0_224_frozen-CONVERTED.tflite
cd imgtest
Finally, run the model on an image:
python ./label_image.py --model_file mobilenet_v1_1.0_224_frozen-CONVERTED.tflite --label_file mobilenet_v1_1.0_224/labels.txt --image grace_hopper.bmp