A high-resolution network of weather stations is critical for producing high-fidelity meteorological models, which interpolate simple time-series sensor readings (temperature, air pressure, humidity, etc.) between dispersed nodes. This kind of "hyper-local" monitoring would be extremely valuable for groups like city planners, climate modelers, and agricultural consortiums, but it is difficult and expensive to implement. In practice, only large entities like federal governments and corporations have the resources to build an adequate data-collection infrastructure. Unfortunately, developing countries do not always have this infrastructure in place, leaving predominantly agricultural economies without access to reliable weather forecasts. But advancements in long-range networking, inexpensive hardware, and distributed storage protocols (like IPFS) make decentralized weather reporting a much more feasible option, in both advanced and developing countries. Community-based weather reporting is not a new concept; companies like BloomSky sell weather station modules that can be set up outside your home to contribute to an open-source network. However, existing platforms have a few inherent scalability issues:
1) Cost and Range: BloomSky has a beautifully designed product with a full suite of climate sensors and even a wide-angle camera. However, these features come with a sticker price of about $200 per unit. With a modular, open-source design, the weather station could instead be configured to collect the data most relevant to its region, enabling much more widespread adoption. And rather than relying on in-home WiFi, switching to a LoRaWAN-based network like Helium would allow stations to connect from anywhere in their city.
2) Incentive Structure: As far as I can tell, the overwhelming majority of network participants are weather enthusiasts who contribute because they are interested in the technology and the community. My platform will have a mechanism in place to actually reward people for supporting and improving the network. Just as Helium is already expanding far more rapidly than The Things Network ever did, this incentive structure is critical to the growth of the platform.
3) Data Ownership: Rather than sending data off to a centralized entity, each account holder stores their own data locally (or at least has the option to, as a default). To allow vast amounts of time-series information to be transferred rapidly around the world (e.g. for frontend applications or independent data mining), IPFS is used as the enabling protocol. By inserting only the IPFS address (and not the data itself) into each transaction, the custom blockchain (running on a novel consensus known as "Proof of Forecast") stays lightweight while still representing the critical information.
The overall structure of the platform is detailed in the following flowchart, but in this report, we will take a closer look at each component. LoRaWAN devices like the one in the Helium Developer Kit provide the actual weather data, while the processing and mining daemons are designed to run on dedicated hardware like a Raspberry Pi.
The setup of the RWS itself is the least novel aspect of the entire platform. But I think that's the beauty of the Helium Network: the part of my application that took the least amount of time was setting up the hardware and transmitting data to the console. I'm simply using the B-L072Z-LRWAN1 board and X-NUCLEO-IKS01A3 MEMS shield to collect pressure, temperature, and humidity data and transmit it to the Helium Console via CayenneLPP-encoded packets. For completeness, my code is included in this GitHub repo, but I followed the steps almost exclusively from the LongFi Arduino Quickstart Guide.
The atomic unit of the network infrastructure is the WeatherPod (guess where I got my inspiration for the design…). If you have access to a 3D printer, you can print a housing to hold the controller, shield, and a USB battery brick, but it will work just fine in a weather-proof box. Upgraded versions could include an anemometer or rainfall sensors, but the key takeaway is that the RWS is an inexpensive, modular, low-power piece of hardware. Hyper-local weather monitoring is not feasible if a couple of broken nodes ruin your investment, or if you have to constantly change batteries.
I set up a Decoder Function in Helium Console to unpack the payloads and then an HTTP Integration to route the payload data as POST requests to a Pipedream Endpoint/"Event Source". I actually dove into this in a little more detail in a previous Hackster project. The idea is that each account holder could have multiple devices, but all of their data is sent to one place. More on this later.
Pipedream has a REST API that allows you to pull events from the endpoint and use them in your own applications; this is what the "Transaction Processing" step relies on.
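To give a sense of what that pull looks like, here is a minimal sketch. The endpoint path, query parameters, and response shape are my assumptions based on Pipedream's REST API documentation (verify against the current docs), and the source ID and API key are placeholders:

import requests

PIPEDREAM_API_KEY = "YOUR_API_KEY"   # assumption: personal API key from the Pipedream dashboard
SOURCE_ID = "dc_abc123"              # assumption: ID of the event source created above

# pull the most recent events from the event source
url = f"https://api.pipedream.com/v1/sources/{SOURCE_ID}/event_summaries"
resp = requests.get(
    url,
    headers={"Authorization": f"Bearer {PIPEDREAM_API_KEY}"},
    params={"expand": "event", "limit": 100},
)
resp.raise_for_status()
events = resp.json()["data"]
print(f"Retrieved {len(events)} events")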
Before going any further, I think it would be useful to take a closer look at the database, which is the nuts and bolts of any blockchain. I’m hosting a PostgreSQL database on a Raspberry Pi 4 on my home network, so think of this as a local (i.e. “Testnet”) implementation of what a commercial blockchain development platform would do, where the database would be replicated and synchronized among all the nodes on the network. There are four tables:
Transaction Data
- Every time a change is proposed to the blockchain/world state (i.e. new meteorological readings are submitted by an account), it is recorded as a transaction. Transactions are either unconfirmed (0), valid (1), or invalid (-1), depending on their mining status. They are also given a unique identifier, or transaction hash (I'm using SHA-256).
- Columns: account_id, timestamp, ipfs_addr, confirmed, txn_hash
Mining Queue
- Rather than a competitive consensus like Proof of Work, the lean Proof of Forecast algorithm proceeds in round-robin fashion, so each miner needs to know when it is their turn. We also need a place to keep track of each miner’s performance, in order to compensate them later on. Mining will be described in a future section, but an account owner gets one spot in the queue per RWS. In this way, the account owners who are contributing the most (good & distributed) data to the network are rewarded more.
- Columns: dev_eui, account_id, blocks_mined, valid_txns, invalid_txns
Chain
- This is the immutable ledger of the blockchain. It stores important metadata about each block, including information about the round-robin Proof-of-Forecast consensus protocol I will discuss in more detail later. It also contains a list of approved transactions. Each block is connected to the previous block through the SHA-256-generated block hash. In this way, any changes to the ledger would invalidate the chain from that block onward, as identified by independent auditors.
- Columns: block_height, block_hash, timestamp, txn_list, miner, next_miner
World State
- A feature that I think makes sense for a blockchain used to monitor weather at spatially-distributed points is a "world state", or a current snapshot of the ledger contents. In this case, each RWS is identified by its Dev EUI and associated with its location and the IPFS hash of its most recent (confirmed) data file (this will be described in more detail in the Transaction Processing section). For a front-end application, the client will only need to look in this one place to get the latest useful weather data for the entire network.
- Columns: dev_eui, account_id, latitude, longitude, ipfs_addr, city
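For reference, a schema along these lines will set up the four tables. This is a sketch: the column types (plain TEXT for hashes, timestamps, and coordinates, matching how the daemons insert them as strings) and the connection details are my assumptions, and the repo's setup script is the source of truth:

import psycopg2

# Assumed column types - check the repo for the authoritative schema
SCHEMA = """
CREATE TABLE IF NOT EXISTS transaction_data (
    account_id TEXT,
    timestamp  TEXT,
    ipfs_addr  TEXT,
    confirmed  TEXT,          -- '0' unconfirmed, '1' valid, '-1' invalid
    txn_hash   TEXT PRIMARY KEY
);
CREATE TABLE IF NOT EXISTS mining_queue (
    dev_eui      TEXT PRIMARY KEY,
    account_id   TEXT,
    blocks_mined INTEGER DEFAULT 0,
    valid_txns   INTEGER DEFAULT 0,
    invalid_txns INTEGER DEFAULT 0
);
CREATE TABLE IF NOT EXISTS chain (
    block_height SERIAL PRIMARY KEY,
    block_hash   TEXT,
    timestamp    TEXT,
    txn_list     TEXT,
    miner        TEXT,
    next_miner   TEXT
);
CREATE TABLE IF NOT EXISTS world_state (
    dev_eui    TEXT PRIMARY KEY,
    account_id TEXT,
    latitude   TEXT,
    longitude  TEXT,
    ipfs_addr  TEXT,
    city       TEXT
);
"""

# assumed connection details for the Raspberry Pi "Testnet" database
db_conn = psycopg2.connect(host="raspberrypi.local", dbname="weather_chain",
                           user="pi", password="***")
with db_conn.cursor() as cur:
    cur.execute(SCHEMA)
db_conn.commit()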
The component I call “Transaction Processing” describes the procedure through which weather data is retrieved, parsed, repackaged, and stored in a more useful way. Every mining epoch, we make a GET request through the Pipedream API to pull the last 100 events (payloads) sent from a given account owner’s devices. This JSON file is parsed to return the relevant values we want for each device (in this case, the timestamp, temperature, pressure, and humidity), and if new payloads have been received since the last epoch, they are appended to a locally-stored numpy array (I know there are probably more efficient and compressible formats, but Python pickle binaries are just so easy to work with). If the file contents have changed compared to what is currently in the world state for this device, we submit a transaction to propose that update.
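A rough sketch of that retrieval-and-append step is below. The event field names and the parse_event helper are hypothetical stand-ins (the actual keys depend on how the Decoder Function structures the payload), the DATA_FILE name is an assumption, and events is the list pulled from the Pipedream API as shown earlier:

import os
import pickle
import numpy as np

DATA_FILE = 'weather_data.data'   # hypothetical local archive, one per account

def parse_event(event):
    """Pull the fields we care about out of one decoded Pipedream event.
    The key names are assumptions - adjust to match your Decoder Function output."""
    body = event['event']['body']
    return [body['dev_eui'], body['reported_at'],
            body['decoded']['temperature'], body['decoded']['pressure'],
            body['decoded']['humidity']]

# one row per payload: dev_eui, timestamp, temperature, pressure, humidity
rows = np.array([parse_event(e) for e in events]).reshape(-1, 5)

# append any rows we have not seen before to the locally-stored archive
if os.path.exists(DATA_FILE):
    with open(DATA_FILE, 'rb') as fh:
        archive = pickle.load(fh).reshape(-1, 5)
    new_rows = rows[~np.isin(rows[:, 1], archive[:, 1])]   # skip timestamps already stored
    archive = np.vstack([new_rows, archive]) if len(new_rows) else archive
else:
    archive = rows

with open(DATA_FILE, 'wb') as fh:
    pickle.dump(archive, fh)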
We could use something like Google Drive to store the files, but in the spirit of decentralization, why not take advantage of a technology like the InterPlanetary File System (IPFS)? IPFS is a P2P network that hopes to one day replace the aging HTTP as the Internet's next file-transfer protocol. Until we get there, it's a great way to share large files in a distributed way. I'd encourage you to check out the full documentation, but for our purposes, the important thing to know is that when you upload a file, its contents are broken into "blocks" and shared amongst a swarm of peers. Content is located through a clever addressing scheme called a Distributed Hash Table (DHT), which allows you to retrieve any shared file using its unique hash identifier.
A full step-by-step guide for setting up the code on the Raspberry Pi is in the README file in the repo, but an important note is that because I am using a Python library called IPFS HTTP Client (which communicates with the IPFS HTTP API and hasn't kept pace with the latest releases), I had to downgrade to go-ipfs v0.4.23. Since I'm running a Raspberry Pi 4 with a 64-bit version of Raspbian, I went with this distribution.
When the updated file is added to IPFS, it returns the hash address (you can identify IPFS hashes by their Qm... prefix). When I insert a new, unconfirmed transaction into the database, it contains just the IPFS hash instead of the data itself, which would be extremely inefficient to replicate over a blockchain.
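From Python, adding the updated file and grabbing its hash looks roughly like this (a sketch assuming a local go-ipfs daemon listening on the default API port, and reusing the hypothetical weather_data.data file from above):

import ipfshttpclient

# connect to the local go-ipfs daemon (default API address)
client = ipfshttpclient.connect('/ip4/127.0.0.1/tcp/5001/http')

# add the updated data file; the response includes the content hash (Qm...)
res = client.add('weather_data.data')
print('IPFS address:', res['Hash'])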
The core logic of the transaction processing (from webhook_retrieval.py):
# check if data changed
try:
    cur = db_conn.cursor()
    sql = """SELECT ipfs_addr FROM world_state WHERE dev_eui = %s;"""
    cur.execute(sql, (dev_eui_list[id],))
    old_addr = cur.fetchall()[0][0]
except IndexError:
    old_addr = []
if len(old_addr) == 0:
    # device is new - add to mining pool
    sql = """INSERT INTO mining_queue(dev_eui, account_id, blocks_mined, valid_txns, invalid_txns) VALUES(%s, %s, %s, %s, %s);"""
    cur.execute(sql, (dev_eui_list[id], ACCOUNT_ID, 0, 0, 0))
    db_conn.commit()
    # update world state
    sql = """INSERT INTO world_state(dev_eui, latitude, longitude, ipfs_addr, account_id, city) VALUES(%s, %s, %s, %s, %s, %s);"""
    cur.execute(sql, (dev_eui_list[id], 'lat', 'lon', res['Hash'], ACCOUNT_ID, CITY))
    db_conn.commit()
    # insert as transaction
    sql = """INSERT INTO transaction_data(account_id, timestamp, ipfs_addr, confirmed, txn_hash) VALUES(%s, %s, %s, %s, %s);"""
    ts = str(np.round(time.time()))
    txn = ACCOUNT_ID + ts + res['Hash'] + '0'
    cur.execute(sql, (ACCOUNT_ID, ts, res['Hash'], '0', hashlib.sha256(txn.encode()).hexdigest()))
    db_conn.commit()
    print('Inserted data for new device and added to mining pool.')
elif old_addr != res['Hash']:
    # data has changed - insert update as transaction
    sql = """INSERT INTO transaction_data(account_id, timestamp, ipfs_addr, confirmed, txn_hash) VALUES(%s, %s, %s, %s, %s);"""
    ts = str(np.round(time.time()))
    txn = ACCOUNT_ID + ts + res['Hash'] + '0'
    cur.execute(sql, (ACCOUNT_ID, ts, res['Hash'], '0', hashlib.sha256(txn.encode()).hexdigest()))
    db_conn.commit()
    print('Updated ipfs_address.')
else:
    print('Nothing to change.')
Mining
A key component of any blockchain is its consensus algorithm: the procedure by which distributed nodes agree on the current state of the chain. You are probably familiar with the mining procedure of Bitcoin and Ethereum, known as Proof of Work (PoW). PoW is notoriously inefficient and costly (in fact, Ethereum is planning to switch to Proof of Stake soon). There are dozens of other consensus protocols (Helium's Proof of Coverage algorithm is particularly clever) that are much more energy-efficient and do not require specialized hardware. I'm using something I call "Proof of Forecast" (PoF), which consists of two distinct steps intended to address the specific job of keeping a weather network up and running. The two steps are:
1. Proof of Retrievability: A distributed weather network may have problems with data storage, so we want to make sure that when a node submits a transaction, the contents can be accessed by others. This also prevents (in part) exploitative actors from just spamming the transaction list with empty or corrupted weather data.
2. Proof of Integrity: Nodes also need to keep each other accountable for submitting weather data that is reasonable. While you could make this phase much more sophisticated, one simple check is to make sure that a node’s hyper-local readings agree with the current macro-level (i.e. city-wide) data in their area. We wouldn’t expect these two to be exactly the same, so any differences below a hyper-parameter-defined threshold are considered acceptable. In practice, I use OpenWeatherMap's free API to check the temperature in the node’s city and compare it to the device’s latest data point.
Proof of Forecast in action:
if next_miner in my_miner_list:
    print('My turn to mine!')
    # PROOF OF FORECAST
    # 1) Proof of Retrievability: make sure that IPFS hash is available and not empty
    cur = db_conn.cursor()
    sql = """SELECT ipfs_addr, txn_hash FROM transaction_data WHERE confirmed = '0';"""
    cur.execute(sql)
    txn_list = cur.fetchall()
    if len(txn_list) == 0:
        print('No transactions to confirm at this time.')
    else:
        approved_txns = []
        invalid_txns = []
        for i in range(len(txn_list)):
            try:
                # retrieve from IPFS
                client.get(txn_list[i][0])
                os.rename(txn_list[i][0], 'file_to_mine.data')
                # open pickle file
                with open('file_to_mine.data', 'rb') as filehandle:
                    txn_data = pickle.load(filehandle)
                txn_data = txn_data.reshape(-1, 5)
                latest_temp = txn_data[0, 2]
                dev_eui = txn_data[0][0]
                cur = db_conn.cursor()
                sql = """SELECT city FROM world_state WHERE dev_eui = %s;"""
                cur.execute(sql, (dev_eui,))
                res = cur.fetchall()
                city = res[0][0]
                url = "https://api.openweathermap.org/data/2.5/weather?q=" + city + "&APPID={insert_your_key_here}&units=metric"
                payload = {}
                headers = {}
                response = requests.request("GET", url, headers=headers, data=payload)
                r = response.json()
                city_temp = r['main']['temp']
                # 2) Proof of Integrity: is the hyper-local temperature "close enough" to the city-wide temperature?
                print('City-wide temperature (C): ', city_temp)
                print('Hyper-local temperature (C): ', latest_temp)
                if np.abs(city_temp - float(latest_temp)) < DIFFERENCE_THRESHOLD:
                    # valid transaction - mark as confirmed and include in the next block
                    approved_txns.append(txn_list[i][1])
                    # update transaction_data
                    cur = db_conn.cursor()
                    sql = """UPDATE transaction_data SET confirmed = '1' WHERE txn_hash = %s;"""
                    cur.execute(sql, (txn_list[i][1],))
                    db_conn.commit()
                    # update world state
                    sql = """UPDATE world_state SET ipfs_addr = %s WHERE dev_eui = %s;"""
                    cur.execute(sql, (txn_list[i][0], dev_eui,))
                    db_conn.commit()
                else:
                    # invalid transaction
                    invalid_txns.append(txn_list[i][0])
                    cur = db_conn.cursor()
                    sql = """UPDATE transaction_data SET confirmed = '-1' WHERE txn_hash = %s;"""
                    cur.execute(sql, (txn_list[i][1],))
                    db_conn.commit()
            except EOFError:
                # empty or corrupted data file - fails Proof of Retrievability
                invalid_txns.append(txn_list[i][0])
                cur = db_conn.cursor()
                sql = """UPDATE transaction_data SET confirmed = '-1' WHERE txn_hash = %s;"""
                cur.execute(sql, (txn_list[i][1],))
                db_conn.commit()
        # update rewards
        cur = db_conn.cursor()
        sql = """UPDATE mining_queue SET valid_txns = valid_txns + %s, invalid_txns = invalid_txns + %s WHERE dev_eui = %s;"""
        cur.execute(sql, (len(approved_txns), len(invalid_txns), dev_eui,))
        db_conn.commit()
        os.remove('file_to_mine.data')
While there are still ways to exploit the system as-is, it represents a reasonably fault-tolerant way to gather consensus. Future iterations could use much more sophisticated PoF protocols, such as checking to make sure that readings agree with other nearby network nodes. Critically, the low computational requirements allow it to be run at little expense on a small computer like a Raspberry Pi.
Nodes take turns mining blocks in round-robin fashion. For each device an account owner has, they get one spot in the queue. Each epoch, the mining daemon checks whether its account is next in line; if it is, it runs the PoF algorithm on each unconfirmed transaction. The transactions deemed valid are bundled into a block and added to the chain. Participants earn rewards for mining blocks and supplying valid transactions, but they can also be penalized for submitting invalid ones. In practice, at the end of each rewards period (e.g. monthly), the customer(s) who sponsor or subscribe to this hyper-local weather data (e.g. a municipal government or a team of climate scientists) would distribute payment according to how each account contributed to the network, either by supplying data itself (submitting valid transactions) or by maintaining network integrity (mining blocks). (Alternatively, the weather network could use its own cryptocurrency to buy and sell data.)
if len(approved_txns) > 0:
    # assemble and mine next block
    new_block_hash = hash_block(last_block)
    cur = db_conn.cursor()
    sql = """INSERT INTO chain(txn_list, block_hash, miner, timestamp, next_miner) VALUES(%s, %s, %s, %s, %s);"""
    cur.execute(sql, (str(approved_txns), new_block_hash, next_miner, str(np.round(time.time(), 0)), next_next_miner,))
    db_conn.commit()
    block = {
        'block_height': last_block_height + 1,
        'txn_list': str(approved_txns),
        'block_hash': new_block_hash,
        'miner': next_miner,
        'next_miner': next_next_miner
    }
    print('Block mined successfully!\n', block)
    # update rewards stats
    cur = db_conn.cursor()
    sql = """UPDATE mining_queue SET blocks_mined = blocks_mined + 1 WHERE dev_eui = %s;"""
    cur.execute(sql, (next_miner,))
    db_conn.commit()
    print('Rewards Updated.')
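The snippet above assumes that next_miner, next_next_miner, last_block, last_block_height, and my_miner_list were already looked up earlier in the mining daemon (along with db_conn and ACCOUNT_ID from its configuration). Here is a minimal sketch of how that round-robin bookkeeping could work; the hash_block helper and the assumed column ordering of the chain table are illustrative, not the exact code from the repo:

import hashlib

def hash_block(block_row):
    """Hash the previous block's row so the new block is chained to it."""
    return hashlib.sha256(str(block_row).encode()).hexdigest()

cur = db_conn.cursor()

# the tip of the chain tells us whose turn it is
# (assumed column order: block_height, block_hash, timestamp, txn_list, miner, next_miner)
cur.execute("""SELECT * FROM chain ORDER BY block_height DESC LIMIT 1;""")
last_block = cur.fetchone()
last_block_height, next_miner = last_block[0], last_block[5]

# round-robin: whoever follows the current miner in the queue mines the block after this one
cur.execute("""SELECT dev_eui FROM mining_queue ORDER BY dev_eui;""")
queue = [row[0] for row in cur.fetchall()]
next_next_miner = queue[(queue.index(next_miner) + 1) % len(queue)] if next_miner in queue else queue[0]

# the devices this daemon is allowed to mine for
cur.execute("""SELECT dev_eui FROM mining_queue WHERE account_id = %s;""", (ACCOUNT_ID,))
my_miner_list = [row[0] for row in cur.fetchall()]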
Wrapping Up
This project is probably different from other submissions in that the focus is on the infrastructure surrounding the hardware, rather than the device itself. The Helium Network enables us to collect IoT data in a way that would have been prohibitively expensive under existing protocols. However, we need smart ways to aggregate and validate that data if we are to make use of it. In this project, we present a working implementation of a blockchain-based mechanism, built on a lightweight Proof of Forecast consensus algorithm (with a Python daemon running on a Raspberry Pi) and IPFS. This ensures that high-quality weather monitoring data can be quickly accessed on demand. The network could be particularly useful in developing countries, where this information is in short supply and existing solutions are expensive and highly centralized. Whereas Helium is "the People's Network", this could be "the People's Forecast" (eh, I'm not much of a marketing expert). The GitHub repo also includes a quick example script showing how you can retrieve weather data files from IPFS and use them to make an interactive map with the folium library in Python.
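Something along these lines would do it. This is a sketch that reuses the assumed file layout and connection details from earlier; the real example script lives in the repo:

import folium
import ipfshttpclient
import pickle
import psycopg2

# assumed connection details for the Testnet database and local IPFS daemon
db_conn = psycopg2.connect(host="raspberrypi.local", dbname="weather_chain",
                           user="pi", password="***")
client = ipfshttpclient.connect('/ip4/127.0.0.1/tcp/5001/http')

# pull the latest confirmed IPFS address and location for every device
cur = db_conn.cursor()
cur.execute("""SELECT dev_eui, latitude, longitude, ipfs_addr FROM world_state;""")
stations = cur.fetchall()

m = folium.Map(location=[40.44, -79.99], zoom_start=12)   # centered on Pittsburgh
for dev_eui, lat, lon, ipfs_addr in stations:
    # fetch the device's data file from IPFS and read its most recent temperature
    client.get(ipfs_addr)
    with open(ipfs_addr, 'rb') as fh:
        data = pickle.load(fh).reshape(-1, 5)
    folium.Marker([float(lat), float(lon)],
                  popup=f"{dev_eui}: {data[0, 2]} C").add_to(m)
m.save('hyperlocal_map.html')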
To see an interactive version of this simple example plot of hyper-local temperature data in Pittsburgh, click here. It is a crude example, but think of what this little dApp represents: distributed, immutable, and independently-verified weather data!