Last night I attended my first Bristol and Bath Internet of Things Meetup. There were two main talks (The Challenge of Legacy Assets and Adventures in Remotely Updating Linux Devices in a Production Environment), a few one-slide lightning introductions, refreshments (including some very nice beer from Great Western Brewery) and some very enjoyable conversations with interesting people.
Our thanks go to the presenters, organisers and sponsors through whose hard work and generosity such things are possible.
DISCLAIMER: I've tried to make my notes as accurate as I can, but they are based on my own note-taking and my own interests. I'm happy to correct any mistakes that I have made.
"The Challenge of Legacy Assets"
The first talk concerned a project by Ash Wireless, part of the Celsius initiative by Ofgem and Electricity North West, to monitor important temperatures at electricity substations and how these are affected by loading and environmental factors. The aim is to increase efficiency by neither under- nor over-utilising transformers, the maximum loading for which is affected by temperature.
Many electricity substations are quite old, and were certainly built without thought of remote monitoring. And yet it is obvious how much these systems could benefit from telemetry. The project being talked about was mainly about efficiency and optimum loading, but one could easily imagine systems such as pre-emptive maintenance or remote diagnostics being very good use-cases.
The environment at such sites makes retrofitting sensors challenging:
- Access to the sites is heavily restricted due to safety concerns;
- They are not the sort of places you want to go around drilling holes in, again due to safety;
- Or running cables;
- Many are quite old;
- Ironically, there is often no power available for running devices at the substation;
- There is no Internet connection; and
- With large amounts of metal around, the RF environment is very problematic.
Despite the problems due to the RF environment, the other issues mandated the use of battery-powered, wireless equipment which could be magnetically mounted or cable-tied in place. The system also needed to provide its own backhaul communications via the use of a hub (field gateway) with a mobile data modem in it. The issues around the RF environment were circumvented by:
- having a commissioning mode with LEDs on each device so that the installation engineer could check the communications between hub and sensor device immediately upon installation, and
- allowing the use of multiple hubs where necessary, so that each sensor device had a good signal path to at least one hub. The wireless protocol was such that the devices were not paired to any particular hub.
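The "good signal path to at least one hub" requirement can be sketched as a simple coverage check over site survey readings. This is only an illustration, not Ash Wireless's tooling: the sensor and hub names, the use of RSSI in dBm as the link metric and the -90 dBm threshold are all assumptions of mine.

```python
from typing import Dict, List

# Hypothetical threshold: the weakest signal (in dBm) accepted as a "good" link.
MIN_RSSI_DBM = -90


def uncovered_sensors(readings: Dict[str, Dict[str, int]],
                      min_rssi_dbm: int = MIN_RSSI_DBM) -> List[str]:
    """Return the sensors that have no hub with an acceptable signal.

    readings maps sensor id -> {hub id: RSSI in dBm} as measured on site.
    Sensors are not paired to a hub, so any one good link is enough.
    """
    missing = []
    for sensor, per_hub in sorted(readings.items()):
        if not any(rssi >= min_rssi_dbm for rssi in per_hub.values()):
            missing.append(sensor)
    return missing


# Made-up survey data for a site with two hubs.
survey = {
    "transformer-top": {"hub-a": -72, "hub-b": -97},
    "transformer-oil": {"hub-a": -95, "hub-b": -81},
    "ambient": {"hub-a": -99, "hub-b": -101},
}
print(uncovered_sensors(survey))
```

Here only the "ambient" sensor is reported, which would suggest moving it or adding another hub.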
For me, the biggest lesson was the point that the biggest driver for the design of a system is not the sensing technology used (however cool and modern that might be), the data transmission method or even the backend data analysis, but how the field devices are actually going to be installed.
"Adventures in Remotely Updating Linux Devices in a Production Environment"
As IoT devices proliferate, more people are having to deal with updating the software on more remote machines than ever before, and yet the cost of not doing so can be massive - think of the problems caused by cameras, DVRs or routers.
Most Linux systems are updated by the use of a package manager such as yum (although there are exceptions, for example Ubuntu Core and CoreOS, which use systems close to those described in the talk). Although this has generally worked well, there are issues when needing to update a fleet of instances:
- only part of the system is upgraded at a time (hard to keep track of which device has which bits of software on, systems may end up diverging);
- conflicts - third-party updates can have unexpected consequences;
- hard to use with read-only filesystems;
- backend infrastructure required for the package indexing etc.;
- to get the full benefit of packaging, you ideally need to package your own application software too;
- updates are not transactional;
- upgrades can render a device unresponsive for a period of time; and
- interruptions mid-upgrade can result in a "bricked" device.
For IoT devices all this applies, but with the following additional constraints:
- typically no physical access to a device;
- devices typically have a long field lifespan;
- uptime is critically important;
- often there is no package manager on the device, which would mandate a full image upgrade;
- the devices are resource-constrained, and
- have limited Internet connectivity.
1. Single Copy Update
In this scheme, there are (at least) three partitions - it would be normal to have data on a fourth, for example.
+------------+----------+-------------+
| BOOTLOADER | SWUPDATE | APPLICATION |
+------------+----------+-------------+
When an update is required the device reboots into the software update partition, which is a minimal OS. It then downloads the updated application partition and writes over the top of the existing version. It can then reboot the device into the new version.
This strategy can be implemented by using the SWUpdate agent.
The big advantage of this approach is that not much disk space is required. There are a couple of big downsides however: the time taken to reboot, download and then reboot again might be considerable; and as the update is not transactional, there is nothing to roll back to, increasing the chances of rendering the device permanently inoperable.
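To make the lack of a rollback concrete, here is a toy simulation of the single copy flow. It is a sketch under my own assumptions and has nothing to do with how SWUpdate is implemented: the class, its fields and the `interrupted_at` parameter are hypothetical, and a SHA-256 check stands in for whatever verification a real agent performs.

```python
import hashlib
from typing import Optional


class SingleCopyDevice:
    """Toy model of BOOTLOADER | SWUPDATE | APPLICATION (all names hypothetical)."""

    def __init__(self, application: bytes) -> None:
        self.application = application  # contents of the APPLICATION partition
        self.application_valid = True

    def update(self, image: bytes, expected_sha256: str,
               interrupted_at: Optional[int] = None) -> bool:
        # The device is running from the SWUPDATE partition at this point.
        # The old application is overwritten in place, so it stops being
        # usable as soon as writing starts.
        self.application_valid = False
        if interrupted_at is not None:
            # Power loss part-way through: a mix of new and old bytes remains
            # and there is no second copy to fall back to.
            self.application = image[:interrupted_at] + self.application[interrupted_at:]
            return False
        self.application = image
        self.application_valid = (
            hashlib.sha256(image).hexdigest() == expected_sha256
        )
        return self.application_valid


new_image = b"APP-V2"
digest = hashlib.sha256(new_image).hexdigest()

ok = SingleCopyDevice(b"app-v1")
print(ok.update(new_image, digest), ok.application_valid)

bricked = SingleCopyDevice(b"app-v1")
print(bricked.update(new_image, digest, interrupted_at=3), bricked.application_valid)
```

The second device ends up with a half-written application partition; in this scheme its only way forward is for the software update partition to retry the download.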
2. Double Copy Update
This system requires more disc space than the single copy update, as it requires two copies of the application partition. The partition scheme looks like this:
+------------+--------------------+--------------------+
| BOOTLOADER | APPLICATION COPY 1 | APPLICATION COPY 2 |
+------------+--------------------+--------------------+
Initially, the device is booted into the first copy of the application. When an update is available, it is downloaded and written over the second copy. The device is then rebooted, but this time boots into the second application partition, which contains the newly downloaded version. If something goes wrong, it can fall back to using the first copy of the application. The next time an update is required, it is downloaded into the first application partition, and the active version is swapped again.
As per the single copy update, this scheme can be implemented by using the SWUpdate agent.
This improves on the previous approach by reducing downtime (the update can be downloaded in the background), separating the download of the update from when it is applied (so that the reboot can be scheduled when it is convenient), and allowing a rollback to the previous version in the case either of a corrupted update, or if problems are found with the new version.
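The slot-swapping and fallback logic can be sketched in a few lines. Again this is a toy model under my own assumptions rather than SWUpdate's or any bootloader's actual mechanism: the class, the method names and the `healthy` check are hypothetical.

```python
from typing import Callable


class DoubleCopyDevice:
    """Toy model of BOOTLOADER | APPLICATION COPY 1 | APPLICATION COPY 2."""

    def __init__(self, image: bytes) -> None:
        self.slots = [image, b""]  # the two application partitions
        self.active = 0            # index of the slot the bootloader boots

    def install(self, image: bytes) -> int:
        """Write the update to the inactive slot; the active one keeps running."""
        inactive = 1 - self.active
        self.slots[inactive] = image
        return inactive

    def reboot_into(self, slot: int, healthy: Callable[[bytes], bool]) -> int:
        """Boot the given slot, falling back to the previous one if it is bad."""
        previous = self.active
        self.active = slot
        if not healthy(self.slots[slot]):
            self.active = previous  # roll back to the copy that last worked
        return self.active


def healthy(image: bytes) -> bool:
    # Hypothetical health check: a real device might verify a signature
    # and then wait for the application to report that it started.
    return image.startswith(b"v")


device = DoubleCopyDevice(b"v1")

slot = device.install(b"v2")            # downloaded in the background
device.reboot_into(slot, healthy)       # scheduled when convenient
print(device.active, device.slots[device.active])

slot = device.install(b"corrupt")       # next update goes to the other slot
device.reboot_into(slot, healthy)       # health check fails, so we fall back
print(device.active, device.slots[device.active])
```

After the failed second update the device is still running `v2` from the second slot, which is the rollback behaviour described above.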
3. Golden Partition
In this case an additional partition holds a copy of the software which is never updated. This means that there is always a bootable version available, but it takes up disc space which is not often used, and it doesn't answer the question of how to update the main copy. It is most useful for devices where the end-user may need factory-reset functionality.
libOSTree takes a very different approach to updates. Rather than writing a filesystem image to a partition, it performs atomic upgrades of complete filesystem trees. It describes itself as "git for operating system binaries": a content-addressed store together with tools for bootloader configuration, configuration management and other actions required as part of updates.
It has some distinct advantages over the other approaches:
- update size is minimised;
- easy for settings to be persistent, as the entire filesystem is not wiped;
- you can have multiple parallel installations.
The downside is that it is generally considered to be much more complicated to set up and use.
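The reason a content-addressed store keeps updates small can be shown with a toy example. This is not libOSTree's on-disk format or API, just a minimal sketch of the general idea: files are stored under the hash of their content, so a device only needs to fetch objects it does not already hold. The class and the file paths are made up for illustration.

```python
import hashlib
from typing import Dict, Set


class ObjectStore:
    """Toy content-addressed store: objects are keyed by their SHA-256."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}

    def commit(self, tree: Dict[str, bytes]) -> Dict[str, str]:
        """Store every file in the tree and return a path -> digest manifest."""
        manifest = {}
        for path, content in tree.items():
            digest = hashlib.sha256(content).hexdigest()
            self.objects[digest] = content  # identical content is stored once
            manifest[path] = digest
        return manifest

    def missing(self, manifest: Dict[str, str]) -> Set[str]:
        """Digests that would have to be downloaded to check out the manifest."""
        return {d for d in manifest.values() if d not in self.objects}


v1 = {
    "/usr/bin/app": b"app 1.0",
    "/usr/lib/libfoo.so": b"libfoo",
    "/etc/motd": b"hello",
}
v2 = {**v1, "/usr/bin/app": b"app 1.1"}  # only one file changes

server = ObjectStore()
device = ObjectStore()
device.commit(v1)                 # the device already runs v1
manifest_v2 = server.commit(v2)   # the server publishes v2

to_fetch = device.missing(manifest_v2)
print(len(to_fetch), "of", len(manifest_v2), "objects need downloading")
```

Only the changed binary has to cross the network; the unchanged files are already present on the device under the same digests.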
Currently Zoetrope are using the Double Copy Update method effectively in production, but are considering moving to libOSTree for a reduction in the size of updates and the additional security options it provides.
Some Additional Notes
- SWUpdate doesn't (currently) support updates that are both encrypted and compressed - you need to choose one or the other.
- Try and keep private information out of updates - it is a weak link in the security chain unless the update is encrypted and signed
- Try and keep secret information off the end device - it can be compromised. If your device has a Trusted Platform Module, use it.
- Make it so that rogue/compromised devices can be blacklisted server-side.
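As a rough illustration of the last point, the server can consult a blocklist before offering an update to a device. Everything here is hypothetical (the class, the device ids and the version strings), and a real deployment would also need to authenticate the device id rather than trust it.

```python
from typing import Optional, Set


class UpdateServer:
    """Toy update server that refuses to serve blocked devices."""

    def __init__(self, latest_version: str) -> None:
        self.latest_version = latest_version
        self.blocked: Set[str] = set()

    def block(self, device_id: str) -> None:
        """Mark a rogue or compromised device so it gets no further updates."""
        self.blocked.add(device_id)

    def offer_update(self, device_id: str, current_version: str) -> Optional[str]:
        """Return the version to install, or None if nothing should be sent."""
        if device_id in self.blocked:
            return None  # blocked devices are never offered an update
        if current_version == self.latest_version:
            return None  # already up to date
        return self.latest_version


server = UpdateServer(latest_version="2.0")
server.block("device-0042")

print(server.offer_update("device-0001", "1.9"))
print(server.offer_update("device-0042", "1.9"))
```

The first call returns the new version and the second returns `None`, because the decision is made on the server and does not rely on the compromised device cooperating.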
U-Boot is the bootloader commonly used with Yocto.
- U-Boot reads a lot from environment files on the device. Be careful about any security issues that this may introduce.
- Flattened Image Trees are supported. These can be signed.
- Boot commands can be hardcoded in U-Boot. This is not well documented.
Thanks to Michael from Zoetrope for sharing this information. I've a potential project coming up which would involve upgrading over very constrained links, so I'm grateful to him for pointing out libOSTree to me.
There were two lightning introductions that I'd like to call out. The IoT Security Foundation Working Group 3 is producing some guidelines on patching constrained devices. They should soon be available from their publications page.