Three years ago, I started installing solar panels in my backyard.
Soon enough, I switched from an ICE car to an Electric Vehicle (EV), and purchased a solar-enabled EV charger (i.e. a charger able to charge solely from renewable energy).
On a perfect day, my 5kW solar array can deliver up to 30kWh, enough to add up to 150km (~100 miles) of range to my EV, just from the sun.
Being a solar enthusiast/energy-conscious person, I quickly ended up asking myself:
Assuming I need to charge my EV by tomorrow evening, should I rather:
- charge overnight from the grid to benefit from the off-peak rate?
- or wait and hope to charge “for free” tomorrow with the sun?
What if I could predict tomorrow’s solar production? Surely, this would help me make a better decision, and even automate the charging process.
Here I am talking about predicting my own energy production, taking into account the panels’ orientation, shading, etc. This is not the same thing as predicting tomorrow’s weather.
The higher the expected next day’s solar production, the less grid energy I will use to charge the car overnight.
Here comes Solar2ev, an AI application (based on a deep neural network) that predicts the next day’s solar production for a given solar array, trained on historical meteorological data.
Real world. End to end. Fully automated.
Solar2ev gets its data from the real world (from a nearby weather station), acts on the real world (the EV charger), and covers the entire lifecycle of a deep learning project:
- Development: Data acquisition, cleaning, search for the best AI model, training, accuracy evaluation.
- Deployment: Runs unattended on premises, on a low-cost Linux single-board computer (Nvidia Jetson Nano).
- Management: With a mobile app based on the Blynk platform.
- Evolution once deployed: The AI model is monitored via regular accuracy assessment and continuously improved by retraining at the edge.
Solar2ev is written in Python and is available on GitHub.
It uses Google’s TensorFlow, and Blynk for mobile application development.
Every day at sunset, I generate a prediction for the next day’s solar, based on meteorological and solar data from current and previous days.
The prediction is mapped to an amount of grid energy, and the EV charger is then programmed to charge the car overnight with that energy.
The higher the prediction, the less energy will be drawn from the grid at night.
For instance, if I am fully confident that tomorrow’s production will be substantial, I can skip overnight charging and charge tomorrow from the sun.
On the other hand, if I think production will be low, I’d better charge overnight.
Of course, if my partner tells me to stop messing around with AI, I also have to charge overnight... but that’s another story.
The input to the deep learning model is a time-ordered sequence (of meteorological/solar data). I chose one hour as the granularity; there is no need to go finer-grained since the weather does not change that fast.
For example: running the prediction on Wednesday at 7pm, and using a span of 3 days (Monday to Wednesday), means the sequence has 24 + 24 + 18 = 66 elements. Each element contains hourly meteorological data (temperature, pressure, etc.) and solar production.
Real world data is used:
- To train the AI model. In that case, I need as much data as possible. However, this is limited by the age of my solar installation (installed 3 years ago).
- To perform the prediction (once per day, at sunset).
My solar installation is based on Enphase micro-inverters; Enphase offers quite a comprehensive set of reporting capabilities (an API to get production day by day, a file download for batch historical data).
Example of the file downloaded from Enphase (solar production per day):
Date/Time,Energy Produced (Wh)
10/15/2022,"21,245"
10/16/2022,"19,492"
10/17/2022,"17,887"
10/18/2022,"21,156"
Getting real world meteorological data.
This is a bit trickier. My requirements are:
- I need historical (multi years) data, not forecast.
- Data has to be as local as possible (not from hundreds of km/miles away).
I decided to “scrape” a website which provides very local weather measurements (the recording station is a few miles from where I live) and yields quite granular information.
Scraping works by parsing the jungle of HTML returned by the web server (meant to be rendered by a browser) and extracting the relevant data. It may not be the most elegant kind of coding, but it is surprisingly simple, albeit dependent on changes to the web page’s layout (which, for production websites, are quite infrequent).
Scraping turns this…
into that:
date,hour,temp,humid,direction,wind,pressure
2021-03-10 00:00:00,0,-3.7,88%,Nord-Nord-Est,3,1020.7
2021-03-10 00:00:00,1,-4.5,89%,Sud-Sud-Ouest,2,1020
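For illustration, here is what a minimal scraping sketch can look like, using requests and BeautifulSoup. The URL, CSS selectors and column layout below are placeholders, not the actual weather site or the repo’s code:

import csv
import requests
from bs4 import BeautifulSoup

# hypothetical URL and table layout; the real station page is different
url = "https://example-weather-site.org/station/history?date=2021-03-10"
html = requests.get(url, timeout=30).text
soup = BeautifulSoup(html, "html.parser")

rows = []
for tr in soup.select("table.hourly tr")[1:]:          # skip the header row
    cells = [td.get_text(strip=True) for td in tr.select("td")]
    if len(cells) >= 6:                                 # hour, temp, humid, direction, wind, pressure
        rows.append(cells[:6])

with open("weather_2021-03-10.csv", "w", newline="") as f:
    csv.writer(f).writerows(rows)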
Cleaning data.
Real world data is typically messy. It can be incomplete, ambiguous, inconsistent, and cannot be used without cleaning. For instance:
- From time to time, the weather website can miss some hourly data. In some instances an entire day may be missing. Those blanks need to be filled.
- I built my solar array incrementally (from an initial 1kW to 5kW today). I cannot just compare production from 2021 with production from 2024.
- Over the course of 3 years, the internet went down a couple of times, and solar data could not be reported. Again, blanks have to be filled (see the gap-filling sketch below).
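As an illustration of the gap-filling part, a minimal pandas sketch; the file name, column handling and fallback strategy are my assumptions, not the actual cleaning code:

import pandas as pd

# hypothetical raw hourly weather data with a datetime column
df = pd.read_csv("weather_raw.csv", parse_dates=["datetime"], index_col="datetime")

# reindex on a complete hourly range so missing hours (or whole days) show up as NaN
full_range = pd.date_range(df.index.min(), df.index.max(), freq="h")
df = df.reindex(full_range)

# short gaps: linear interpolation on numeric columns; longer gaps: copy the same hour of the previous day
num_cols = df.select_dtypes("number").columns
df[num_cols] = df[num_cols].interpolate(limit=3)
df = df.fillna(df.shift(24))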
Data such as month, hour, and wind direction are “cyclical”. For instance, January is semantically “close” to December, and 23h is “close” to 0h.
If we were representing month with plain numbers (such as January = 1 and December = 12), this closeness would be lost. However, this closeness is very valuable for the neural network’s learning process.
To alleviate this, cyclical data are represented with sine and cosine, which makes 23h numerically close to 0h.
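In code, this encoding is only a few lines. A tiny self-contained sketch (the formula matches the sin_month/cos_month columns shown in the csv excerpt further down):

import numpy as np
import pandas as pd

# tiny example: December/January and 23h/0h end up numerically close
df = pd.DataFrame({"month": [12, 1], "hour": [23, 0]})
df["sin_month"] = np.sin(2 * np.pi * df["month"] / 12)
df["cos_month"] = np.cos(2 * np.pi * df["month"] / 12)
df["sin_hour"] = np.sin(2 * np.pi * df["hour"] / 24)
df["cos_hour"] = np.cos(2 * np.pi * df["hour"] / 24)
print(df.round(2))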
Finally.
I store the cleaned data into a csv file which becomes the input to the training process.
~25,000 rows: one row per hour for the lifetime of my solar installation.
# feature_input.csv
date,month,hour,temp,humid,direction,wind,pressure,direction_sin,direction_cos,sin_month,cos_month,sin_hour,cos_hour,production
2021-03-19,3,21,-5.1,90.0,3,5,1020.8,0.87,0.5,1.0,0.0,-0.71,0.71,15.25
2021-03-19,3,22,-6.3,91.0,2,8,1021.3,0.64,0.77,1.0,0.0,-0.5,0.87,15.25
Before jumping into model training, it is always a good idea to have a look at the data at hand.
The chart above shows the solar production per day. As expected, it is higher in summer, but can be pretty good from April to October.
The chart above shows the distribution of solar data. 50% of the days are above the purple line (i.e. above 17kWh/day), and 50% are below. Likewise, 25% are below 8kWh, and 25% are above 25kWh. (Remember those values; we will use them later.)
This is the same distribution viewed by month. The dashed lines represent the 25/50/75% thresholds described earlier. The vertical “shapes” represent the production values for each month.
I was a bit surprised to see how fast the production ramps up from February to March, and reassured to see that the production is decent until October.
I guess the ramp is caused by the orientation and shading of my panels. Moreover, snow on the panels does not help.
The chart above shows how many days (of the past 3 years) had a given production. Quite a few days with very good production (the peak on the right means that when it is sunny, it is very sunny), quite a few days of lousy production (the peak on the left: when it is bad, it is bad). In between, we get all combinations of partly sunny days.
How does solar data relate to meteorological data?
Note that discovering the answer to this question is exactly the purpose of the neural network. Here we just glance at the data at hand, mostly to verify it makes sense.
The chart above shows the production (the colored dots) for a given barometric pressure (X axis) and a given temperature (Y axis). There is one dot per day, and the redder/larger the cloud of dots, the higher the production.
- Production is indeed quite related to temperature.
- Pressure also affects production. When the pressure is low, the production is low. The effect is however not as strong as it is for temperature (the production does not always increase with the pressure).
This makes sense: temperature is somewhat correlated to the number of sun hours per day (both are higher in summer). Likewise, low pressure means cloudy skies.
Another way to look at the relationship between data is to analyze their correlation. It could be positive correlation (both values vary in the same direction) or negative correlation (values vary in opposite direction).
The table above shows such correlations. Just look at the last row to check the relationship between solar production and meteorological data.
The production seems negatively correlated with humidity (when one increases, the other decreases). Wind speed seems to matter more than wind direction. The production is also strongly negatively correlated with the month (winter versus summer).
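A correlation table like this is essentially a one-liner with pandas. A minimal sketch, assuming the cleaned feature_input.csv described earlier:

import pandas as pd

df = pd.read_csv("feature_input.csv")
df["month"] = (df["month"] - 6).abs()       # "distance to June", so winter months get high values

cols = ["temp", "humid", "direction", "wind", "pressure", "month", "production"]
# the last row gives the correlation of production with every other column
print(df[cols].corr().round(2))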
Note: in the table above, month is defined as abs(month - 6). So a high value (12-6, 1-6) is “winter”, hence the low production. But we knew this already!
Enough charts, let’s train our model.
Sorry, not quite yet, I need to transform the raw data into input and output features:
- input features: what the model sees as input.
- output features: what it will spit out (the prediction).
The model will be trained on time-ordered sequences (aka time series).
There are many ways I can configure those sequences:
I can set how many days in the past I want to go, whether sequences contain all consecutive hours, or only one hour every two (or 3 or 4.. ).
# 4 days, including today
days_in_seq = 4
# elements within a sequence are 1 hour apart
sampling = 1
# running prediction at ~7pm: the last 6 hours of today are not used
today_skipped_hours = 6
Shorter sequences mean “less to remember” for the neural network, but also “less to chew on”. Later, we will try different parameter values to see what works best.
I can select the actual meteorological data to include in the sequence:
('temp', 'production', 'sin_month', 'cos_month', 'pressure')
I can set the starting point for each sequence:
# sequences used for training start at those hours.
selection = [0, 1, 2, 3, 4, 5, 6, 7, 8]
- selection = [0]: Only sequences starting at 0h will be included. The advantage is that this is exactly the type of sequence the model will see once in operation, but the big drawback is that this only allows for ~1000 sequences (as we have ~1000 days of historical data). This is a very small (much too small) training set.
- selection = [0, 1, 2, ..., 22, 23]: Sequences starting at every hour will be included. We get a much larger training set (~24,000 sequences), even though it may be less representative of what will be seen in operation.
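To make this concrete, here is a minimal sketch of how such sequences could be cut out of the hourly dataframe. The function and its details are mine, not the actual code in the repo (and building the matching next-day targets is omitted):

import numpy as np
import pandas as pd

def build_sequences(df, features, days_in_seq=4, sampling=1,
                    today_skipped_hours=6, selection=(0, 1, 2, 3, 4, 5, 6, 7, 8)):
    """df: the cleaned hourly data (one row per hour), in chronological order."""
    seq_hours = days_in_seq * 24 - today_skipped_hours      # e.g. 4*24 - 6 = 90 hours
    values = df[list(features)].to_numpy()
    hours = df["hour"].to_numpy()
    sequences = []
    for start in range(len(values) - seq_hours):
        if hours[start] in selection:                        # keep only allowed starting hours
            sequences.append(values[start:start + seq_hours:sampling])
    return np.stack(sequences)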
All those configuration options allow me to experiment and look at what works best.
Output features.
I actually build 2 models:
- One predicting a production value: The model’s output is a number (e.g. 13.6 kWh).
- One predicting a production class: The model’s output is an interval (e.g. “between 7.8kWh and 17.3kWh”).
Pedantically speaking, the former is called regression, the latter classification.
Why have 2 models? Because I can check later that they “agree”.
The number of intervals (aka classes) and their boundaries are configurable:
# categorical = True means classification
# else regression
categorical = True
# classes boundaries
# 4 classes
"prod_bins":[0, 7.83, 17.37, 25.14, 1000000]
4 classes looks good to me; the first class corresponds to a production below 8kWh, and the last one to a production above 25kWh.
Hmm... we already saw those numbers somewhere. Yes, I remember: in the solar production distribution.
In the data gathered from the real world, 25% of the solar production values are below 7.83kWh whereas 25% are above 25.14kWh.
Using those intervals generates a balanced training set, meaning that no class is more represented than the others.
So if I pick a class at random, there is a 25% chance I will pick the right one.
Any model that is correct more than 25% of the time is better than a random pick.
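Deriving those boundaries from the data itself is straightforward with pandas. A minimal sketch, assuming (as in the csv excerpt above) that the production column repeats the daily total on every hourly row:

import pandas as pd

df = pd.read_csv("feature_input.csv", parse_dates=["date"])
daily = df.groupby("date")["production"].first()     # one production value per day

# class boundaries at the 25/50/75% quantiles -> 4 balanced classes
prod_bins = [0] + daily.quantile([0.25, 0.5, 0.75]).round(2).tolist() + [1_000_000]
classes = pd.cut(daily, bins=prod_bins, labels=False)
print(prod_bins)                 # e.g. [0, 7.83, 17.37, 25.14, 1000000]
print(classes.value_counts())    # roughly 25% of the days in each class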
Training.
The model is based on an LSTM (Long Short Term Memory) network with an attention layer. This is a mature neural network architecture well suited to learning from time series.
On a modern laptop (Nvidia 4070 GPU), training with 24,000 sequences takes anywhere between 5 and 10 minutes.
Thanks to the Nvidia Jetson Nano's GPU, the model is also regularly retrained on the edge (see later).
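For illustration, a minimal Keras sketch of this kind of architecture. The layer sizes, the pooling step and the 4-class softmax head are my simplifying assumptions, not a copy of the model in the repo:

import tensorflow as tf
from tensorflow.keras import layers

def build_classifier(seq_len=66, n_features=5, n_classes=4):
    inputs = layers.Input(shape=(seq_len, n_features))
    x = layers.LSTM(128, return_sequences=True)(inputs)     # keep per-timestep outputs
    x = layers.Attention()([x, x])                          # self-attention over the sequence
    x = layers.GlobalAveragePooling1D()(x)
    x = layers.Dropout(0.3)(x)
    outputs = layers.Dense(n_classes, activation="softmax")(x)
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model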
See also this notebook to train on Google’s Colab.
The big question: does this work?
How is my model doing? How correct are its predictions?
To answer those questions, I shall test the model using a test set: a small percentage of the data (10%) which was NOT used during the training process, and therefore never seen by the model.
A dedicated test set is the only way to get a sense of what will happen once the model is deployed in the wild.
So, what are the results?
When we predict classes, we typically look at accuracy, i.e. the % of correct predictions.
.predict(): 84% accuracy
The model makes the right classification 84% of the time.
When we predict a number, we rather look at errors, i.e. the difference between the prediction and the ground truth.
- mae (mean absolute error): the average of the absolute errors (absolute so that positive errors do not cancel out negative ones).
- rmse (root mean square error): we square all the errors, compute their average, and take the square root of the result. The “rmse” penalizes large errors more than the “mae” does.
mae : 1.482 kWh
rmse: 3.547 kWh
The prediction is on average “1.5 kWh away” from the truth (mae). With a production range of 0 to 30kWh, that’s ~5%.
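For reference, both metrics boil down to a couple of NumPy lines (the arrays below are hypothetical, not my actual test set):

import numpy as np

y_true = np.array([12.0, 25.5, 3.2, 18.0])    # ground truth, kWh
y_pred = np.array([13.1, 24.0, 6.0, 17.5])    # model output, kWh

mae = np.mean(np.abs(y_pred - y_true))
rmse = np.sqrt(np.mean((y_pred - y_true) ** 2))
print(f"mae={mae:.3f} kWh, rmse={rmse:.3f} kWh")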
Is this good? Is this bad? I’ll let you judge; I have summarized my thoughts in the next paragraph.
At this point, I can say that classification’s accuracy (84%) is definitely in a different league than picking a class at random (25%). The model has learned something.
We will see later whether we can improve the predictive capability, and how.
Looking closer at the classification results.
A typical way to drill down on classification results is to look at the confusion matrix, which provides a per-class view of accuracy.
Columns represent the predictions and rows the truth. The diagonal (top-left to bottom-right) corresponds to all instances of correct predictions (i.e. predicted = truth). All cells outside the diagonal are instances of incorrect predictions. The number in each cell is the number of predictions, computed on the test set.
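If you want to build this kind of table yourself, scikit-learn does the heavy lifting. A minimal sketch with hypothetical labels:

import numpy as np
from sklearn.metrics import confusion_matrix

# hypothetical class labels on a test set (0 = lowest production class, 3 = highest)
y_true = np.array([0, 3, 2, 1, 3, 0, 2, 2, 1, 3])
y_pred = np.array([0, 3, 1, 1, 3, 0, 2, 3, 1, 3])

cm = confusion_matrix(y_true, y_pred)   # rows = truth, columns = predictions
print(cm)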
A few thoughts:
The model works very well for both very high production (>25kWh) and very low production (<8kWh):
- It never makes stupid mistakes: it never ever predicts <8 kWh when the reality is > 25kWh (or vice versa)
- It can be trusted: it is correct 88% of the time when predicting >25kWh and 87% of the time when predicting <8kWh.
It does not perform as well for the “mid” classes, but remains accurate more than 75% of the time in those classes.
Let’s remember our use case: I want to use prediction to decide how much grid energy to put overnight in my car.
The worst case scenario for me is an optimistic model, where the actual production is often lower than the prediction. In such cases the overnight charge is underestimated.
On the other hand, a conservative model, where the reality is higher than the prediction, is OK. In such cases, I have just underestimated the opportunity to charge for free the next day.
Let’s look at the cases where the model is correct or conservative:
correct or higher 91.179%
This means that more than 90% of the time, I can rely on the model to optimize my use of grid energy.
That works for me.
Looking closer at the regression results.
We can look at how the errors are distributed.
50% of absolute errors < 0.20 kWh
81.9% of absolute errors < 2.00 kWh
In summary, with regression, I am on average 1.5 kWh away from the ground truth (mae), and much closer to it (0.2kWh) 50% of the time.
That works for me as well.
Let’s use the application.
There is almost nothing for the user to do.
Every day, at sunset, the application (running on the Nvidia Jetson Nano) generates a prediction for the next day.
The user is informed of the result via push notification, and the mobile app’s prediction tab is updated.
The EV will automatically be charged overnight with the amount of energy corresponding to the prediction (assuming the prediction‘s confidence is above a given threshold).
The relation "prediction => energy" is configured in the mobile app’s charger tab.
The prediction’s confidence is a value between 0 and 1, and is returned by the deep learning model. 1 means the neural network is “fully confident” about its prediction.
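Concretely, the confidence is simply the largest probability returned by the classifier’s softmax output. A minimal sketch (the threshold value is an arbitrary example, not the one used by the application):

import numpy as np

# hypothetical softmax output for one prediction (4 production classes)
probs = np.array([0.05, 0.10, 0.15, 0.70])

predicted_class = int(np.argmax(probs))
confidence = float(probs[predicted_class])        # 0.70 here

CONFIDENCE_THRESHOLD = 0.6                        # hypothetical value
if confidence >= CONFIDENCE_THRESHOLD:
    print(f"program the charger for class {predicted_class}")
else:
    print("confidence too low: fall back to the default overnight charge")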
Using the mobile app, the user can override this automatic behavior to force overnight charging on or off.
The manufacturer of my EV charger offers “some” level of programmatic access. I shall not elaborate on this aspect, as it is not the purpose of this article. The EV charging Python module can be swapped out to adapt to another brand.
How do I know it is working?
Even if I tested the model’s accuracy after training, how do I know it is not going crazy once deployed in the wild and exposed every day to data it has never seen? Actually, data nobody has ever seen, like... the future.
In other words, how do I make sure that the model’s predictive capability does not degrade over time?
For that, I need to perform an ongoing evaluation of the model.
Systematic postmortem for recent predictions.
Every day, the system automatically performs a postmortem by comparing what it has recently predicted to what actually happened.
This postmortem result is provided to the user as push notification, and is available in the mobile app’s prediction tab.
Postmortems are recorded as a rolling list covering the last 6 days. A simple color code allows the user to check at a glance what happened.
Average accuracy since the application was installed is also provided.
Regular testing on all new data.
Every day, I get... a new daily set of meteorological and solar production data, which the model has never seen.
Every so often (every week or every month), I run this batch of “unseen” data through the model.
The model’s average accuracy on this “unseen” data is presented in the mobile app’s model tab.
This complements the 6-day postmortem view with an average view since the last training.
If the predictive performance were to degrade, it would be time to go back to the drawing board.
In addition, this tab shows some key characteristics of the model currently being used (size, metrics)
This “unseen data” is also an opportunity to retrain the model on a larger data set.
Every so often (every month, every quarter), the model is retrained. The expectation is that it will improve over time since it has access to more training data.
This continuous retraining will run “forever”, without any user intervention.
Thanks to the Nvidia Jetson Nano’s GPU, retraining is possible at the edge. It runs for ~1 hour.
That’s it.
As I did for another project (solar2heater, see below), I plan to write an update after one year of operation, to see how the model improved, and whether it saved me some money!
Want more details? The rest of this article describes the journey I followed to get to the best possible model.
Want more solar stuff? Please check Solar2heater and a quantified benefit analysis after running it for one year.
When designing a deep neural network, one has to decide how deep the network is (i.e. how many layers), how many neurons to use per layer, etc.
It is a balancing act. If the network is too small it will not learn. If it is too large, it will memorize rather than learn to generalize.
Those numbers (hyperparameters) can come from experience, intuition, black magic...
I started with the above, but was able to get a slightly better, more accurate model using KerasTuner.
KerasTuner automates the search for the best hyperparameters. Just define the search space, launch KerasTuner, and it will come back later with the hyperparameters combination leading to the best results.
units = hp.Int("nb units", min_value=128, max_value=384, step=128)
num_layers = hp.Int("nb layers", min_value=1, max_value=2, step=1)
nb_dense = hp.Int("nb dense", min_value=0, max_value=128, step=128)
dropout_value = hp.Choice("dropout_value", values = [0.0, 0.3])
In the example above we have 3 possibilities for the number of units per LSTM layer, 2 for the number of LSTM layers, 2 for the number of neurons in the final classifier, and 2 for the use of dropout. So a total of 3x2x2x2 = 24 combinations which KerasTuner will analyze.
There are various options for search: random search, tournament style, Bayesian optimization. KerasTuner does not just brutally try all combinations.
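To show how the fragments above fit together, here is a minimal KerasTuner sketch. The hypermodel below is a simplified stand-in (a plain LSTM regressor with an assumed input shape), not the actual Solar2ev model:

import keras_tuner as kt
import tensorflow as tf
from tensorflow.keras import layers

def build_model(hp):
    units = hp.Int("nb units", min_value=128, max_value=384, step=128)
    num_layers = hp.Int("nb layers", min_value=1, max_value=2, step=1)
    nb_dense = hp.Int("nb dense", min_value=0, max_value=128, step=128)
    dropout_value = hp.Choice("dropout_value", values=[0.0, 0.3])

    inputs = layers.Input(shape=(66, 5))                  # sequence length / feature count assumed
    x = inputs
    for i in range(num_layers):
        x = layers.LSTM(units, return_sequences=(i < num_layers - 1))(x)
    if nb_dense > 0:
        x = layers.Dense(nb_dense, activation="relu")(x)
    x = layers.Dropout(dropout_value)(x)
    outputs = layers.Dense(1)(x)                          # regression head (kWh)
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mae", metrics=["mae"])
    return model

tuner = kt.RandomSearch(
    build_model,
    objective=kt.Objective("val_mae", direction="min"),
    max_trials=24,
)
# tuner.search(x_train, y_train, validation_data=(x_val, y_val), epochs=30)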
After a while, KerasTuner comes back:
Showing 3 best trials
Objective(name="val_mae", direction="min")
Trial 16 summary
Hyperparameters:
nb units: 384
nb layers: 1
nb dense: 0
dropout_value: 0.3
Score: 1.6457581520080566
....
Search for best inputs.
KerasTuner is used for hyperparameters, but what about the network’s inputs, such as: how many days in a sequence? Should we use wind direction as one of the inputs? Should we use 4 classes or 5? etc.
Solar2ev allows me to explore those questions.
A simple configuration file defines the individual parameters to explore:
#
days_in_seq_h = [3,4,5]
sampling_h = [1,2]
# exploring the 2 variables above means 3x2 = 6 trainings
Solar2ev will fully train all combinations, and report the results in a .xls file.
I typically sort the xls by column to look at a particular metric:
ALL combinations of the search space are fully trained (it is more brutal than clever). So depending on the depth of the search space, expect a few hours of execution.
In particular, I can use this brutal search to explore different types of network inputs, including multi-heads:
# one head, 5 values
('temp', 'pressure', 'production', 'sin_month', 'cos_month')
# two heads, 1st one with 5 values, 2nd one with 4 values
[
('temp', 'production', 'sin_month', 'cos_month', 'pressure'),
('wind', 'direction', 'sin_month', 'cos_month'),
]
With the 2 heads configured above, the network will learn separately from the sequence of (temp, production, pressure, month) and the sequence of (wind, direction, month).
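A two-head network is easy to express with the Keras functional API. A minimal sketch (the layer sizes are arbitrary, not the tuned values):

import tensorflow as tf
from tensorflow.keras import layers

seq_len = 66
head1 = layers.Input(shape=(seq_len, 5))    # temp, production, sin_month, cos_month, pressure
head2 = layers.Input(shape=(seq_len, 4))    # wind, direction, sin_month, cos_month

x1 = layers.LSTM(128)(head1)                # each head learns from its own sequence
x2 = layers.LSTM(64)(head2)
merged = layers.concatenate([x1, x2])       # the two representations are then merged
outputs = layers.Dense(4, activation="softmax")(merged)

model = tf.keras.Model(inputs=[head1, head2], outputs=outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])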
In my case, using multiple heads did not improve much on a network already fully optimized with KerasTuner on a single head.
My solar installation is 3 years old, so I only have access to 3 years of historical data.
There is nothing I can do about that… or is there?
Solar2ev can use the Synthetic Data Vault (SDV) to generate synthetic data, which mimics real data and can be used to extend the training set.
Using synthetic data is an option when training data is scarce, difficult or expensive to get or cannot be disclosed for confidentiality reasons.
Generating synthetic data is a two-step process:
- first train SDV on real data. SDV will learn about the characteristics of the real data.
- then ask SDV to create new synthetic data, as much as I want!
For instance:
- I grab two years of real data (2x365x24 consecutive hours).
- I tell SDV that the first 365 days are “a typical year”, and the next 365 days are “another typical year”.
- I train SDV on this real data.
- Then I ask SDV to give me synthetic “years”, i.e. something that looks like a typical year (I can ask for as many as I want).
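For illustration, a minimal sketch of that two-step process using SDV’s sequential PAR synthesizer. The column names, the 'year_id' grouping column and the choice of synthesizer are my assumptions, not necessarily what the repo does:

import pandas as pd
from sdv.metadata import SingleTableMetadata
from sdv.sequential import PARSynthesizer

# hypothetical hourly data covering two "typical years"
real = pd.read_csv("feature_input.csv")
real = real.head(2 * 365 * 24).copy()
real["year_id"] = ["year_1"] * (365 * 24) + ["year_2"] * (365 * 24)

metadata = SingleTableMetadata()
metadata.detect_from_dataframe(real)
metadata.update_column("year_id", sdtype="id")
metadata.set_sequence_key("year_id")               # rows with the same id form one sequence

synthesizer = PARSynthesizer(metadata)
synthesizer.fit(real)                              # step 1: learn from the real years
synthetic = synthesizer.sample(num_sequences=2)    # step 2: generate two synthetic "years"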
If I ask for 2 years of synthetic data, I can rerun my training process as if I had 5 years of history.
I explored various options to use those 2 years of synthetic data:
- Retrain from scratch using 5 years of data, real data first, or synthetic data first.
- Train the model using 3 years of real data, and then continue training using the 2 years of synthetic data (or vice versa). Optionally, freeze part of the model when continuing training, to only retrain the classifier.
# run with -h to get all options
.\synthetic_train.py -h
What is the result?
I could get a few percent improvement on mae compared to a model already tuned by KerasTuner. Not a lot, but I’ll take it!
Was all this SDV work worth a few percent improvement ?
Of course it was: I learned about SDV, it is a cool trick to know.