Capacitive technology has become increasingly important in outdoor devices like self-service parcel lockers, self-service shops, or ticket machines. Before 2020 these were considered just a convenience, but after the outbreak they became essential for reducing direct human-to-human contact. Over the last two years capacitive technology has not only helped to slow down the spread of the virus, but has also shown people the convenience of such technology in ever more areas of life.
The most popular UI for the applications mentioned above is a simple touch screen with some sort of virtual keypad behind a thick piece of glass, which is there to increase durability against the harsh outdoor conditions as well as vandals.
While such a design may look good, it has some problems and inconveniences. These are my personal experiences from using the parcel lockers and train ticket machines common in my country.
Firstly, the distance between the capacitive glass touch panel and the screen itself can be more than a centimeter, which results in a massive parallax error. Every time I have to pick something up from a locker or buy a ticket, I either type at my normal pace and make at least one mistake literally every time, or I spend an uncomfortable amount of time realigning my view with the screen before every key touch.
The second problem is power draw. There are more than 20,000 self-service lockers in my country and there is a ticket machine at almost every train or tram station. It mildly infuriates me that these devices have always-on displays. Imagine the power wasted on keeping the screen powered and refreshing all night when there is nobody around.
This project aims to redefine the UI of some of these applications by removing the power-hungry LCD screen and the thick piece of glass, eliminating the parallax error at the same time. It also aims to redefine the whole UX by adding a new way of input. Instead of the simple "press the key on the screen" style, this project shows other ways to interface with capacitive panels, for example via symbol drawing or proximity sensing.
The hardware
The hardware in this project is super simple and consists of only two parts. The first part is the PSoC 4100S Pioneer Kit with all the included Arduino headers soldered on and the plastic backplate removed.
The front panel of the kit, with the flexible PCB still attached, will act as the input panel for the lock. To make the project feel more complete, I decided to design my own graphics for the touch panel and the accept / cancel buttons:
The touchpad's area shows all the symbols that can be drawn and classified by the device.
The second part is a custom-made PCB which holds four 12 V relays with additional driver logic to control them using the kit's GPIOs. Each relay has a corresponding orange LED that shows its status on the right side of the capacitive touch pad. Apart from the relays, there is also a single 74HC595 SIPO shift register that controls eight green LEDs. These LEDs visualize pattern progress by showing how many symbols (from 0 to 8) have been drawn, roughly as sketched below.
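As an illustration, a shift register like this can be driven with three GPIOs. The sketch below is only a rough idea; the pin helper names (SIPO_DATA_Write, SIPO_CLK_Write, SIPO_LATCH_Write) are placeholders, not the project's actual generated API:

// Hypothetical sketch: light one green LED per inserted symbol by
// bit-banging the 74HC595. The pin helpers below are placeholder names.
void showPatternProgress(uint8_t symbolsDrawn)
{
    uint8_t bar = (symbolsDrawn >= 8) ? 0xFF : ((1u << symbolsDrawn) - 1u);
    SIPO_LATCH_Write(0);
    for (int8_t bit = 7; bit >= 0; bit--)
    {
        SIPO_DATA_Write((bar >> bit) & 1u);   // shift MSB first into the register
        SIPO_CLK_Write(1);
        SIPO_CLK_Write(0);
    }
    SIPO_LATCH_Write(1);                      // latch the new state onto the LEDs
}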
Additionally, you can see that there is a DEBUG DISPLAY header for a small 1.5-inch OLED module. This module was supposed to show inserted shapes and compare them side by side with predefined ones, but later in the project I decided to perform this comparison on a PC over the serial connection, as it's a more flexible way to test the device.
Later in the project I also discovered the need to remove the 0 Ω resistor R133, which connected one of the GPIOs used by the relays to the built-in potentiometer. To make this GPIO available I also had to short R125 which, based on the provided schematics, connects the chip's GPIO to the GPIO header.
The assembled electronics were placed in a snap-on stand:
In this state the board was ready for development, functional testing, and collecting data for the pattern detector algorithm.
The code
The code description is divided into three parts. The first describes the general architecture of the code, how the main event loop works, and how its state machine helps to achieve lower power consumption without giving up utility or responsiveness. The second describes how the touchpad data is organized in code to achieve the lowest possible memory footprint, how I collected data samples from the touchpad, and how I assembled these samples into a diverse dataset. The third covers the implementation of the pattern detector network itself, starting from the TensorFlow model, moving on to how this model gets converted so it can fit on the device, and finally how it gets interpreted by my simple self-made inference engine.
Top design
The top design part of the PSoC Creator project contains all the things mentioned above: four relays with LEDs, the SIPO IC for the pattern LEDs, UART for communication, and the CapSense element with the touchpad, proximity sensing, and two buttons. There is also an EZI2C component for CapSense tuning.
The main loop
The main part of the program is written in an event-driven paradigm with two sources of events: hardware and time based. Hardware events mostly come from the CapSense readSensors function, which reads the desired sensors and invokes callback functions each time a sensor changes its state. To save power and clock cycles, in the STANDBY application state only the proximity sensor is polled; in the other states all sensors are polled. Time events come from the timing::doSleep() function and are based on timers that are started or stopped in the callbacks of hardware events.
while (true)
{
    if (appState == AppState::STANDBY)
    {
        uint8_t w = CapSense_PROXIMITY_WDGT_ID;
        readSensors(&w, 1);
    }
    else
    {
        readSensors(widgets, numWidgets);
    }

    uint32_t sleepTime = 0;
    switch(appState)
    {
        case AppState::STANDBY:
            sleepTime = SLEEP_STANDBY_MS;
            break;
        case AppState::NORMAL:
            sleepTime = SLEEP_NORMAL_MS;
            break;
        case AppState::PATTERN_WIP:
            sleepTime = SLEEP_PATTERN_WIP_MS;
            break;
    }
    timing::doSleep(sleepTime, false);
}
The proximity state change callback decides whether the application should leave the STANDBY state (when proximity is detected) or start a timer which, on timeout, restores the STANDBY state and clears all input that was processed in the meantime (the touchpad's state and all inserted symbols).
void handleProximityStateChange(capsense::ProximityState state)
{
    if(state == capsense::ProximityState::DETECTED)
    {
        if(!timing::isRunning(proximityTimer))
        {
            INFO("Presence detected near lock, speeding up app at least to NORMAL");
        }
        if (appState < AppState::NORMAL)
        {
            appState = AppState::NORMAL;
        }
        printAppState();
        timing::stop(proximityTimer);
        DBG("Proximity detected!");
    }
    else if(state == capsense::ProximityState::LOST)
    {
        timing::restart(proximityTimer);
        DBG("Proximity lost!");
    }
}
The touchpad state change callback decides whether the app should speed up to the PATTERN_WIP (work in progress) state or start a timer which, on timeout, clears the touchpad's state.
void handleTouchpadStateChange(capsense::TouchpadState state)
{
    if(state == capsense::TouchpadState::STARTED_TOUCH)
    {
        if(!timing::isRunning(touchpadTimer))
        {
            INFO("Touch detected on touchpad, speeding up app at least to PATTERN_WIP");
        }
        if (appState < AppState::PATTERN_WIP)
        {
            appState = AppState::PATTERN_WIP;
        }
        printAppState();
        timing::stop(touchpadTimer);
        DBG("Started touch!");
    }
    else if(state == capsense::TouchpadState::STOPPED_TOUCH)
    {
        timing::restart(touchpadTimer);
        DBG("Stopped touch!");
    }
}
Releasing the accept button has two behaviors. If the touchpad's buffer is not empty (something was drawn), then releasing the accept button runs the classifier and inserts the detected symbol into the pattern buffer. If it is empty, then the pattern buffer is checked against all relay buffers and the matching relays are opened.
void handleAcceptBtnStateChange(capsense::ButtonState state)
{
    if(state == capsense::ButtonState::PRESSED)
    {
        DBG("Accept pressed!");
    }
    else if(state == capsense::ButtonState::RELEASED)
    {
        if(capsense::isTouchpadClear())
        {
            if (!relays::matchPatternAndOpen(pattern::getPatternBuf()))
            {
                pattern::blinkErrSignalAndClearPattern();
                WARN("Invalid pattern inserted");
            }
            else
            {
                pattern::clear();
                INFO("Correct pattern inserted, at least one relay is active");
            }
        }
        else
        {
            pattern::PatternShape shape = pattern::classify(capsense::getTouchpadBuf());
            pattern::insertSymbol(shape);
            capsense::clearTouchpad();
            INFO("Shape inserted to buffer via accept button");
        }
        DBG("Accept released!");
    }
}
The cancel button release also works in two ways. If the touchpad's buffer is not empty, it clears it; otherwise it removes the last inserted symbol from the pattern buffer. Pressing the cancel button also starts an additional timer. If the button is not released for the timer's duration, the touchpad's buffer and the pattern buffer are cleared and all opened relays are forcefully closed.
void handleCancelBtnStatechange(capsense::ButtonState state)
{
    if(state == capsense::ButtonState::PRESSED)
    {
        timing::restart(cancelTimer);
        DBG("Cancel pressed!");
    }
    else if(state == capsense::ButtonState::RELEASED)
    {
        timing::stop(cancelTimer);
        if(capsense::isTouchpadClear())
        {
            pattern::removeLastSymbol();
            INFO("Removed symbol via cancel button");
        }
        else
        {
            capsense::clearTouchpad();
            INFO("Cleared touchpad via cancel button");
        }
        DBG("Cancel released!");
    }
}
Apart from the timers mentioned above, there are also four timers, one for each relay. After a pattern is accepted, each relay whose buffer matches the pattern buffer opens for 15 seconds, roughly as sketched below.
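The relays module itself is not listed here. A minimal sketch of how the match-and-open step could work; the names relayCodes, relayCodeLens, setRelay, relayTimers and pattern::getPatternLen are assumptions rather than the project's actual identifiers:

// Hypothetical sketch of relays::matchPatternAndOpen(); helper names are placeholders.
bool matchPatternAndOpen(const uint8_t *patternBuf)
{
    bool anyOpened = false;
    uint8_t insertedLen = pattern::getPatternLen();        // assumed helper
    for (uint8_t relay = 0; relay < 4; relay++)
    {
        bool match = (relayCodeLens[relay] == insertedLen);
        for (uint8_t i = 0; match && i < insertedLen; i++)
        {
            match = (relayCodes[relay][i] == patternBuf[i]);
        }
        if (match)
        {
            setRelay(relay, true);                         // energize the relay output
            timing::restart(relayTimers[relay]);           // auto-close again after 15 s
            anyOpened = true;
        }
    }
    return anyOpened;
}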
There is also a UART hardware event, which is rather simple. Every time the UART receives a command, it checks its parameters, length, and CRC. For now there is only one command implemented, which sets the code for a single relay. Its syntax is as follows:
- command byte (RELAY_SET_PATTERN_CMD = 1)
- relay-to-set byte (relay 0 to 3)
- code to set: 0 to 8 bytes (X = 0, TRIANGLE = 1, HEART = 2, CIRCLE = 3, SQUARE = 4)
- 2 bytes of CRC
If the transmission succeeds, the device returns the string "[COMM] ok"; otherwise it returns "[COMM] err". Responses are in text format because additional logging happens on the same connection apart from command transmission.
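The firmware-side handler is not reproduced in the article. Below is a rough sketch of how such a frame could be validated, assuming the same Fletcher-style checksum as the Python sender shown in the presentation section; rxBuf, rxLen and relays::setCode are placeholder names:

// Hypothetical command handler sketch; the checksum mirrors the Python sender's crc().
void handleCommand(const uint8_t *rxBuf, uint8_t rxLen)
{
    if (rxLen < 4 || rxBuf[0] != RELAY_SET_PATTERN_CMD)
    {
        UART_UartPutString("[COMM] err\r\n");
        return;
    }
    // Fletcher-style checksum over everything except the last two bytes
    uint8_t sum1 = 0, sum2 = 0;
    for (uint8_t i = 0; i < rxLen - 2; i++)
    {
        sum1 = (uint8_t)(sum1 + rxBuf[i]) % 255;
        sum2 = (uint8_t)(sum2 + sum1) % 255;
    }
    if (rxBuf[rxLen - 2] != sum2 || rxBuf[rxLen - 1] != sum1)
    {
        UART_UartPutString("[COMM] err\r\n");
        return;
    }
    uint8_t relay = rxBuf[1];
    relays::setCode(relay, &rxBuf[2], rxLen - 4);   // 0 to 8 symbol bytes
    UART_UartPutString("[COMM] ok\r\n");
}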
Below you can see how the loop's refresh rate behaves depending on hardware and time events:
CapSense touchpad and data collection
To save as much memory and inference time as possible, I decided to shrink the CapSense touchpad's original resolution of 100 by 100 down to a resolution that, in the smallest possible format, still allows all five symbols to be distinguished. Additionally, I wanted to store each "pixel" as just one bit. Considering that the lowest addressable unit is one byte (8 bits), to fully use the memory I also had to choose a resolution whose total pixel count (width * height) is divisible by 8. I finally decided on a resolution of 20x20, resulting in 400 pixels total. All 400 pixels are stored in just 50 bytes. To set a value in this array, a few simple bitwise operations are performed:
uint16_t bitToSet = y * CAPSENSE_NUM_COLS + x;
uint8_t arrIdx = bitToSet / 8;
uint8_t shift = bitToSet % 8;
touchpadPattern[arrIdx] |= (1 << shift);
To get a pixel's state, the inverse operation is performed:
uint16_t bitToGet = y * CAPSENSE_NUM_COLS + x;
uint8_t arrIdx = bitToGet / 8;
uint8_t shift = bitToGet % 8;
return touchpadPattern[arrIdx] & (1 << shift);
For debugging and data collection I wanted a more human-readable format, so I wrote a simple piece of code that transmits a matrix of "X" for set pixels and "O" for clear pixels. This code runs on accept button release, right where symbol classification would normally happen.
for(int x = 0; x < CAPSENSE_NUM_COLS; x++)
{
    for(int y = 0; y < CAPSENSE_NUM_ROWS; y++)
    {
        if(capsense::getPixelValue(x, y))
        {
            UART_UartPutString("X");
        }
        else
        {
            UART_UartPutString("O");
        }
    }
}
UART_UartPutString("\r\n");
The same code transmits pixel data to the data collector, a Python script which displays the received matrix as a pyplot heatmap, converts it back to a binary array, and saves it as a CSV sample ready to be used in training the pattern detector.
The whole data collector can be summed up by the function below:
def update(frame):
    update.ser.flush()
    reading = update.ser.readline()
    reading = reading.decode("utf-8").strip()
    reading = charactersToBinary(reading)  # from X/O matrix to 0's and 1's
    if len(reading) == img_height * img_width:
        parsedSample = toSample(reading)  # to 50 byte array
        print(parsedSample)
        saveSample(parsedSample)  # to CSV file
        update.image = np.reshape(reading, ((img_height, img_width)))
    elif len(reading) != 0:
        update.image = np.zeros((img_height, img_width))
    update.ax.clear()
    return update.ax.imshow(update.image)
The above function is passed to pyplot's FuncAnimation and updates the touchpad visualization in real time whenever a new sample arrives over the serial connection.
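The helpers charactersToBinary, toSample and saveSample are not shown in the article. A minimal sketch of what they could look like; the CSV layout (one labeled 50-byte sample per row) is an assumption:

import csv
import numpy as np

def charactersToBinary(reading):
    # "X" -> set pixel, "O" -> clear pixel
    return [1 if c == "X" else 0 for c in reading if c in "XO"]

def toSample(bits):
    # Pack 400 pixels into 50 bytes, LSB first, matching the firmware layout
    sample = np.zeros(len(bits) // 8, dtype=np.uint8)
    for i, bit in enumerate(bits):
        if bit:
            sample[i // 8] |= 1 << (i % 8)
    return sample.tolist()

def saveSample(sample, label=0, path="samples.csv"):
    # Append one labeled sample per CSV row (exact layout is an assumption)
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow([label] + sample)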
Below you can see a single sample visualized as a pyplot heatmap and as a byte array:
[0, 0, 0, 224, 1, 60, 97, 96, 12, 4, 130, 128, 32, 8, 16, 130, 0, 33, 0, 16, 2, 0, 97, 0, 16, 8, 0, 129, 0, 16, 16, 128, 0, 4, 6, 192, 16, 0, 144, 0, 0, 6, 0, 64, 0, 0, 12, 0, 0, 0]
Pattern Detector neural network
With the dataset collected, it's time to create the neural model for the pattern detector. In standard image classification, a mix of convolutional and dense layers is used as the basic network architecture. In this case there are two problems: firstly, the project uses a Cortex-M0, which is a really memory-constrained device, and secondly, the input to the network is a byte array instead of an image. Out of curiosity I tried to train a model consisting of only fully connected layers, and the model below achieved around 98% accuracy on the test dataset, which is an amazing result for me and for such a simple network.
model = keras.Sequential(
    [
        layers.Dense(10, activation="relu", name="layer1"),
        layers.Dense(20, activation="relu", name="layer2"),
        layers.Dense(NUM_CLASSES, activation="softmax", name="layer3"),
    ]
)
The model was trained on 1000 samples that were additionally augmented by adding or removing pixels in a way that still keeps the symbol recognizable, for example as sketched below.
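The augmentation code itself is not included in the write-up. A minimal sketch of the idea, randomly clearing a few stroke pixels and setting a few noise pixels per sample (the flip counts are assumptions):

import numpy as np

def augment(sample, max_flips=8, seed=None):
    # sample: flat binary array of 400 pixels (20x20)
    rng = np.random.default_rng(seed)
    augmented = np.array(sample, dtype=np.uint8).copy()
    set_idx = np.flatnonzero(augmented == 1)
    clear_idx = np.flatnonzero(augmented == 0)
    # Remove a few pixels from the stroke...
    for idx in rng.choice(set_idx, size=min(max_flips, len(set_idx)), replace=False):
        augmented[idx] = 0
    # ...and add a few noise pixels elsewhere
    for idx in rng.choice(clear_idx, size=max_flips, replace=False):
        augmented[idx] = 1
    return augmented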
Now it's time to get the trained model onto the device. While I wrote the whole project in C++ hoping to include the TensorFlow Lite runtime alongside my source code, it turned out that this is not an easy task. The TensorFlow Lite runtime relies on the STL and dynamic memory allocation. I even managed to compile the project with these included, but the resulting binary overflowed the flash memory by a lot and the linker said "no" to such an implementation.
To include my network in the project, I decided to write my own model exporter, intermediate network format, and interpreter. It may sound very serious, but thanks to my very simple neural architecture I only needed to implement matrix multiplication, matrix addition, and ReLU / softmax activations.
Let's start with the exporter. It's a simple piece of Python code that traverses the generated model, reads its operations, activations, weights, etc., and returns them as multiple lists that will be used by my intermediate format.
ssie_ops = [__INPUT_TYPE]
ssie_activations = []
ssie_num_neurons = []
ssie_byte_model = []
ssie_max_tmp_buf_size = max([num_inputs, num_classes])

for layer in model.layers:
    if type(layer) == layers.Dense:
        ssie_max_tmp_buf_size = max([ssie_max_tmp_buf_size, layer.units])
        op_type, activation, num_neurons, byte_model = __get_dense_chunk(
            layer, interpreter, quantize)
        ssie_ops.append(op_type)
        ssie_activations.append(activation)
        ssie_num_neurons.append(num_neurons)
        ssie_byte_model += byte_model
    else:
        raise TypeError("Not supported layer type %s" % type(layer))

ssie_ops.append(__OUTPUT_TYPE)
return ssie_ops, ssie_activations, ssie_num_neurons, ssie_byte_model, ssie_max_tmp_buf_size
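The __get_dense_chunk helper is not listed in the article. A rough sketch of its float (non-quantized) path, assuming Keras weight shapes and the per-neuron weights-then-bias layout described below; __DENSE_TYPE is a placeholder constant:

def __get_dense_chunk(layer, interpreter, quantize):
    # Sketch of the float path only; interpreter/quantize would be used
    # for the int8 variant mentioned at the end of the article.
    kernel, bias = layer.get_weights()       # kernel: [inputs, units], bias: [units]
    chunk = []
    for neuron in range(layer.units):
        chunk += kernel[:, neuron].tolist()  # all weights feeding this neuron
        chunk.append(float(bias[neuron]))    # followed by its bias
    return __DENSE_TYPE, layer.activation.__name__, layer.units, chunk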
The intermediate format is not complicated either. It's basically just a bunch of variables that define the neural model.
namespace ssie_model {
constexpr uint8_t intermediateBufferSize = 50;

constexpr uint8_t numInputs = 50;
constexpr uint8_t numClasses = 5;

constexpr uint8_t numOps = 5;
constexpr Operation ops[] = {
    Operation::INPUT,
    Operation::DENSE,
    Operation::DENSE,
    Operation::DENSE,
    Operation::OUTPUT,
};

constexpr Activation acts[] = {
    Activation::RELU,
    Activation::RELU,
    Activation::SOFTMAX,
};

constexpr uint8_t numNeurons[] = {
    10,
    20,
    5,
};

constexpr uint16_t length = 835;
constexpr float buffer[length] = {
    -0.364011, -0.424963, 0.342720, ...};
};
There are two variables that are not self-explanatory.
intermediateBufferSize is the size of a temporary buffer that holds results between operations.
The buffer array contains the weights and biases of each dense layer in the format [weights for the first neuron of the first layer, bias for the first neuron of the first layer, weights for the second neuron of the first layer, bias for the second neuron of the first layer, ..., second layer, ..., third layer, etc.].
Thanks to this arrangement of weights and biases, a full inference takes only one traversal through this buffer. There is no need to keep track of neurons and weights or move back and forth; each time an operation consumes values from this buffer, its tracking pointer is simply advanced until it reaches the end of the array.
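As a sanity check, this layout accounts for the whole buffer: the first layer needs (50 + 1) * 10 = 510 values, the second (10 + 1) * 20 = 220, and the third (20 + 1) * 5 = 105, which sums to exactly the 835 floats declared in length above.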
This piece of code was generated using a simple Python script which takes the values from the model exporter and places them in the correct fields.
print("#pragma once")
print()
print("namespace ssie_model {")
print("constexpr uint8_t intermediateBufferSize = %d;" % max_tmp_buf_size)
print()
print("constexpr uint8_t numInputs = %d;" % NUM_INPUTS)
print("constexpr uint8_t numClasses = %d;" % NUM_CLASSES)
print()
print("constexpr uint8_t numOps = %d;" % len(ops))
print("constexpr Operation ops[] = {")
for op in ops:
print(" " + toEnumOperation(op) + ", ")
print("};")
print()
print("constexpr Activation acts[] = {")
for act in activations:
print(" " + toEnumActivation(act) + ", ")
print("};")
print()
print("constexpr uint8_t numNeurons[] = {")
for neurons in num_neurons:
print(" " + str(neurons) + ", ")
print("};")
print()
print("constexpr uint16_t length = %d;" % num_bytes)
row_cntr = 0
if QUANTIZE:
print("constexpr uint8_t buffer[length] = {")
else:
print("constexpr float buffer[length] = {")
for byte in byte_model:
if QUANTIZE:
print("{0:#0{1}x}, ".format(np.uint8(byte),4), end="")
else:
print("%f, " % byte, end="")
row_cntr += 1
if row_cntr > 10:
row_cntr = 0
print()
print("};")
print("};")
When it comes to the inference engine, it has only one function visible to the outside:
uint8_t run(const uint8_t *inputBuf)
{
    actCursor = 0;
    neuronCursor = 0;
    modelCursor = 0;
    currInputBuffer = 0;
    currOutputBuffer = 1;

    for(uint8_t i = 0; i < ssie_model::numOps; i++)
    {
        Operation currOp = ssie_model::ops[i];
        switch (currOp)
        {
            case Operation::INPUT:
                inputOp(inputBuf);
                break;
            case Operation::DENSE:
                denseOp();
                break;
            case Operation::OUTPUT:
                outputOp();
                break;
        }
    }
    return result;
}
This function just iterates over all available operations and performs them in order. The input and output operations simply copy data to and from the temporary buffers used during inference (a sketch of both is shown after the dense operation). The dense operation can be seen below:
void denseOp()
{
    Activation act = ssie_model::acts[actCursor++];
    uint8_t numNeurons = ssie_model::numNeurons[neuronCursor++];

    for (uint8_t neuron = 0; neuron < numNeurons; neuron++)
    {
        // MatMul
        float neuronResult = 0;
        for (uint8_t activation = 0; activation < activationLength; activation++)
        {
            neuronResult += ssie_model::buffer[modelCursor++] * buffer[currInputBuffer][activation];
        }
        // BiasAdd
        float bias = ssie_model::buffer[modelCursor++];
        neuronResult += bias;
        // Activation
        if (act != Activation::SOFTMAX)
        {
            buffer[currOutputBuffer][neuron] = activation(act, neuronResult);
        }
        else
        {
            buffer[currOutputBuffer][neuron] = neuronResult;
        }
    }
    if (act == Activation::SOFTMAX)
    {
        softmaxOp();
    }
    currInputBuffer = !currInputBuffer;
    currOutputBuffer = !currOutputBuffer;
    activationLength = numNeurons;
}
As you can see, all buffer values are consumed one after another, without any jumps through the buffer, just a simple [modelCursor++] access.
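For completeness, inputOp and outputOp are not listed in the article. A minimal sketch of what they might look like, assuming the 50 packed touchpad bytes are fed to the network as-is and the result is the index of the highest scoring class; softmaxOp is omitted here:

// Hypothetical sketches; buffer, currInputBuffer, activationLength and result
// are the engine's state variables seen in run() and denseOp() above.
void inputOp(const uint8_t *inputBuf)
{
    // Copy the 50-byte packed touchpad buffer into the float working buffer
    for (uint8_t i = 0; i < ssie_model::numInputs; i++)
    {
        buffer[currInputBuffer][i] = static_cast<float>(inputBuf[i]);
    }
    activationLength = ssie_model::numInputs;
}

void outputOp()
{
    // After the last dense layer the scores sit in buffer[currInputBuffer];
    // pick the class with the highest score as the detected symbol
    result = 0;
    for (uint8_t i = 1; i < ssie_model::numClasses; i++)
    {
        if (buffer[currInputBuffer][i] > buffer[currInputBuffer][result])
        {
            result = i;
        }
    }
}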
If you read some of the functions in the source code carefully, you can see that I experimented with int8-quantized models, but due to the approaching contest submission deadline I wasn't able to debug them in time. You can still find some of the debug code in the model folder of the provided repository.
The engine's code along with the model, even in float32 format, easily fits into the kit's memory, doesn't require any dynamic memory allocation, and performs inference practically instantly, without any overhead.
Presentation
The video below shows the working project:
For the purpose of the presentation I prepared a simple script that sends a different code to each relay (0:15 timestamp):
import serial
import numpy as np
import time

ser = serial.Serial()
ser.port = "COM4"
ser.baudrate = 115200

RELAY_SET_PATTERN_CMD = 1

X = 0
TRIANGLE = 1
HEART = 2
CIRCLE = 3
SQUARE = 4

def crc(arr):
    sum1 = 0
    sum2 = 0
    for val in arr:
        sum1 = np.uint8(sum1 + val) % 255
        sum2 = np.uint8(sum2 + sum1) % 255
    return [sum2, sum1]

def send_code(serial, relay, code):
    buf = [RELAY_SET_PATTERN_CMD, relay] + code
    buf += crc(buf)
    buf = bytes(buf)
    ser.write(buf)
    res = ser.readline()
    print(res)

ser.open()
delay = 0.05

send_code(ser, 0, [X])
time.sleep(delay)
send_code(ser, 1, [X, HEART])
time.sleep(delay)
send_code(ser, 2, [HEART, TRIANGLE, SQUARE])
time.sleep(delay)
send_code(ser, 3, [CIRCLE, HEART])