Identifying the active water consumers by measuring the vibrations on the main water pipe. Combining this information with the flow rate, we can create a water profile for each appliance and get insights on average water consumption, running time and potential leakages.
It's a sad thing when you get your water bill at the end of the year and you notice you've used up enough water to fill an Olympic-sized swimming pool (which you don't have). This hurts in many ways: you have to pay extra for the water you unintentionally used, you feel bad for wasting the water and you feel frustrated because you only discover this now after weeks or even months...
Determined to never let this happen again, you want to start monitoring your water usage on a regular basis and keep an eye on which device uses how much water. Maybe it's time to replace that dishwasher with a newer model? But how do you do this with a basic water meter?
Noting the water count each week with "pen & paper" will not survive the month (let's be honest). Installing a new digital water meter would be ideal, but the costs are high and you would have to contact the water company (or I don't know who) to install it, which for me is too much hassle because I'm just renting the place. And how to detect the different water appliances?
So I had this idea: what if we could make an IoT edge device that analyses the vibrations of the water flowing through the main water pipe? With these vibrations we could identify the appliance that is using the water and estimate the water consumption. This solution would require no modification of the existing installation and keep the budget and hassle as low as possible.
1. Objectives & Constraints
First things first, clear objectives and some constraints.
For this prototype I want to detect the following things:
- Water being used by the faucet
- Water being used by the toilet (filling of the cistern)
- Water being used by the shower
- No water being used, idle
And generate reports:
- Water consumption per device with an overview per day and month
- On-duration per device with an overview per day and month
- Alert notifications on abnormal water usage
The estimation of the water consumption of each device will be done by multiplying the running time with a fixed measured flow rate (fill a bucket, see how long it takes to get a liter). Can't we detect this as well you ask? Hmm yes maybe, but let's first learn how to walk before we start to run.
Following that same mindset, I've given myself some slack by adding some constraints:
- Only one water consumer is active at the same time
- Central heating off: I've got an ancient boiler which generates a lot of vibrations and is right next to my central water line.
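The consumption estimate described above (running time multiplied by a fixed, bucket-measured flow rate) can be sketched roughly as follows. This is only a minimal illustration: the function names and the flow-rate numbers are placeholders I made up, not values measured in this project.

```python
# Hypothetical flow rates in liters per second, each obtained with the
# bucket-and-stopwatch method (placeholder numbers, not project measurements).
FLOW_RATE_L_PER_S = {
    "faucet": 0.10,
    "toilet": 0.12,
    "shower": 0.15,
    "idle": 0.0,
}


def flow_rate_from_bucket(liters: float, seconds: float) -> float:
    """Flow rate in L/s from a timed bucket fill."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return liters / seconds


def estimate_liters(device: str, running_seconds: float) -> float:
    """Estimated consumption = running time multiplied by the fixed flow rate."""
    return FLOW_RATE_L_PER_S[device] * running_seconds


if __name__ == "__main__":
    # Flow rate for a 1-liter bucket that fills in 8 seconds.
    print(flow_rate_from_bucket(1.0, 8.0))
    # Estimated liters for a 5-minute shower at the assumed shower flow rate.
    print(estimate_liters("shower", 300))
```

Keeping the flow rates in one table makes it easy to redo the bucket test later and update a single number per device.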
Just a couple of hardware components are used:
- QuickLogic Corp's QuickFeather: The brains and power behind the project, it contains the accelerometers and will run the machine learning model on board to do the predictions. Special thanks to QuickLogic Corp for supplying the hardware!
- Adafruit HUZZAH32 - ESP32 Feather Board for adding some connectivity over WiFi
- A 3D-printed case for mounting the boards in, files provided by adafruit right here!
Once assembled, attach it to the main water line inside your house/apartment/cabin/mansion/... You'll also need to supply power using the micro-USB on the feather, so make sure you have a power socket nearby.
I've started with some cable ties and electrical tape to attach it to the water pipe, but eventually I added a part which snaps on the water line and has velcro tape for an extra rigid connection (stl-file can be found in the attachments).
Tip: Before applying a bottle of Gorilla super glue for a permanent installation, skip ahead to the Training - Software section in this document and verify that everything works correctly before installing it on location.
Tip: Don't use Gorilla (or any other) super glue, I was joking.
3. Training - Software
For collecting data, labeling and training models, I've used the tools provided by SensiML which make the whole process easy and effortless, no need to be a machine learning expert to make this work. They have great documentation which you can find here & here & here & here.
Once the model was trained, I uploaded the binary file to the QuickFeather and uploaded a custom program to the ESP32 to post the classification output to an MQTT server. An MQTT client application in Python listens for these classifications and uploads them to an InfluxDB instance. InfluxDB 2.0 comes with a complete platform for data exploration and monitoring, well worth checking out: https://www.influxdata.com/
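To give an idea of what the ingestion step boils down to, here is a minimal sketch that turns one classification message into an InfluxDB line protocol string. It is not the code from the attachment: the JSON payload shape, the class-index mapping and the measurement name `water_classification` are all assumptions for illustration.

```python
import json

# Assumed mapping from the model's class index to a device label.
# The real indices depend on how the model was trained.
CLASS_LABELS = {1: "faucet", 2: "idle", 3: "shower", 4: "toilet"}


def payload_to_line(payload: bytes, timestamp_ns: int) -> str:
    """Convert one MQTT classification payload into InfluxDB line protocol.

    Assumes the payload is JSON such as {"ModelNumber": 0, "Classification": 3}.
    Unmapped class indices are stored with the label "unknown".
    """
    message = json.loads(payload)
    class_id = int(message["Classification"])
    label = CLASS_LABELS.get(class_id, "unknown")
    # Line protocol layout: measurement,tag_set field_set timestamp
    # The trailing "i" marks class_id as an integer field.
    return f"water_classification,device={label} class_id={class_id}i {timestamp_ns}"
```

In a real client this function would be called from the MQTT message callback, and the resulting string handed to whatever InfluxDB write mechanism you use; that wiring is left out here on purpose.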
Besides the data-ingestion application, I also added a data-calculator application which takes data from InfluxDB, does some simple math - such as calculating the water usage, total running time,... - and uploads the results to InfluxDB as new records. I'm pretty sure you could also do some of these tasks in InfluxDB itself with some Flux queries, but for me it was easier to write some code for it.
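The "simple math" for the running time could look something like the sketch below. It assumes the classifications have already been fetched as time-ordered `(timestamp_in_seconds, label)` pairs; the function name and the `max_gap_s` cap are my own assumptions, not taken from the attached scripts.

```python
from collections import defaultdict


def on_durations(records, max_gap_s=5.0):
    """Sum the running time per device from time-ordered (timestamp_s, label) pairs.

    Each record is assumed to hold until the next one arrives. Gaps longer than
    max_gap_s are capped so a sensor outage is not counted as usage.
    """
    totals = defaultdict(float)
    for (t0, label), (t1, _next_label) in zip(records, records[1:]):
        totals[label] += min(t1 - t0, max_gap_s)
    return dict(totals)


if __name__ == "__main__":
    sample = [(0.0, "idle"), (1.0, "faucet"), (2.0, "faucet"), (3.0, "idle"), (60.0, "idle")]
    print(on_durations(sample))
```

Multiplying each device's total by its measured flow rate then gives the estimated liters, which can be written back as a new record.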
I also added the possibility to do a soft reset of the QuickFeather by publishing a message to a reset MQTT topic to which the ESP32 is subscribed. This is needed because after a while - 15 minutes or so - the QuickFeather started to output gibberish/unknown classification outputs. When the ingestion script receives messages like that, it publishes a reset message so the QuickFeather resets and reinitializes the classification. This is more of a workaround than a real fix for the issue, but it works for me.
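A minimal sketch of that reset logic on the ingestion side is shown below. The topic name, the set of valid class indices and the "several bad messages in a row" threshold are assumptions for illustration; the actual script may simply reset on the first unknown output.

```python
KNOWN_CLASSES = {1, 2, 3, 4}   # assumed class indices the model can legitimately output
RESET_TOPIC = "wapr/reset"      # hypothetical topic name


class ResetWatchdog:
    """Publish a reset request after several consecutive unknown classifications."""

    def __init__(self, publish, threshold=3):
        self.publish = publish      # callable(topic, payload), e.g. an MQTT client's publish
        self.threshold = threshold
        self.bad_streak = 0

    def observe(self, class_id):
        """Feed one classification; return True if a reset was requested."""
        if class_id in KNOWN_CLASSES:
            self.bad_streak = 0
            return False
        self.bad_streak += 1
        if self.bad_streak >= self.threshold:
            self.publish(RESET_TOPIC, "reset")
            self.bad_streak = 0
            return True
        return False
```

Passing the publish function in as a plain callable keeps the logic testable without a broker: in production you would hand it the MQTT client's publish method, in a test just a function that appends to a list.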
If you're interested you can take a look at the code in the attachment, I've included the python scripts and the source file for the feather.
4. Deployment - Hardware Adjustments
For resetting the QuickFeather without resetting the ESP32, I had to disconnect the connection between the reset pins, which were connected through the header pins. I pushed the header pin of the QuickFeather up and connected it with an output of the ESP32.
In order to evaluate and use the data coming from the sensor, I created a couple of dashboards in influxdb:
- A model-dashboard containing the raw output of the model and some derived graphs for making it easier to understand
- An overview-dashboard containing the amount of water used and the on-duration per device; near real time, per day and per month.
This water usage is an estimate based on the on-time detected by the model and the flow rate which I determined with a bucket and a stopwatch. As a check, I noted the values of the water meter before and after I flushed the toilet, which confirmed my estimate.
This is what it all boils down to, does it even work? Here are my honest findings:
- Water being used by the toilet (filling of the cistern)
Works like a charm: no noticeable delay in the detection, no false positives or negatives. I've simulated a running toilet by displacing the sealing ring of the cistern and the model picked it up without a problem.
- Water being used by the faucet
Overall good results: False negatives - faucet is running but not detected - barely exist, and when they occur the model quickly recovers and detects the running faucet. False positives - faucet is not running but the classifier says otherwise - only happen when the shower is running. More about this in the shower objective evaluation.
- Water being used by the shower
Ah the problem child... When the shower is running the model gets confused and can't make up its mind whether it's the shower, idle or the faucet that is currently running. So in the plot we'll see the output vary a lot, no bueno.
A tiny silver lining, though, is that it's obvious from the graphs when the shower is being used. I played with the idea of using this data to train an additional ml-model which would - when given a window of predictions - output a classification of 'shower-on' or 'shower-off'. But I decided against it because it's not really a solution, it's more a huge dirty flimsy workaround. The only proper way to tackle this issue is more and better data and a lot more testing, which is somewhere on the to-do list.
- No water being used, idle
I want to say that this works perfectly, but I can't due to the issue we discussed with the shower. But besides that, pretty darn good!
7. Evaluation of the constraints
At the start we added some constraints to the project in order to make the problem a bit easier. Now what would happen if we forgot about these constraints and used the model as is? Is it still valid?
- Only one water consumer is active at the same time
I would say that most of the time the model would still be correct in my case. I live alone, so chances are small that I'll be flushing the toilet and be in the shower at the same time.
- Central heating off: I've got an ancient boiler which generates a lot of vibrations and is right next to my central water line.
This would mess with the model. Apparently the vibrations of the boiler heating the water look like a running faucet to the model. It is only normal that the model doesn't know what to do with this, because I didn't add training data for this scenario.
I did notice something interesting while running the test with the boiler on: apparently it keeps the internal hot water coil at a constant temperature (the comfort mode, according to the manual), for which it turns on every 10 minutes and heats the water.
This is totally unnecessary for me because I only use hot water twice a day (shower & dishes), so now I turn this bad boy on manually when I need it and save on my gas bill in the process.
8. Conclusion
The main objective was to determine the active water users solely by using the vibrations of the main waterline, and I feel I can say this is accomplished if you give me some slack concerning the problems with the shower.
This project surely proves that it is feasible to create such an edge device, but this will not be a plug-and-play installation because each water installation is different and thus the ml-model must be retrained for each case. Maybe someday wapr will exist with a fancy app to retrain it in a user-friendly way, but until then you'll have to go through this blog!
Peace!
9. Notes: Improvements for wapr2.0
Some quick notes for future me, suggestions are welcome:
- Experiment with a higher sampling frequency (currently 100 Hz, which seems a bit on the low side)
- Use generated ml-model output to automatically label new data (semi-supervised learning?)
- Upgrade hardware connection to the water line, experiment with different attachments