Design assuming your security controls will fail
10 Dec 2016

When you produce your threat model, you estimate the risk of each threat by multiplying its probability by its impact.
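As a toy illustration of that arithmetic (the threats and the 1–5 scores below are invented, not from any real model):

```python
# Toy risk scoring: risk = probability x impact.
# The threats and their 1-5 scores are invented for illustration.
threats = {
    "key extracted from a fielded device": {"probability": 4, "impact": 3},
    "remote code execution over the network": {"probability": 2, "impact": 5},
}

# Rank threats by risk score, highest first.
for name, t in sorted(threats.items(),
                      key=lambda kv: kv[1]["probability"] * kv[1]["impact"],
                      reverse=True):
    print(f"{name}: risk = {t['probability'] * t['impact']}")
```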

IoT devices have three characteristics that work against you as a device designer:

As a general rule, attacks only get better with time. You’re not just defending against the attacks that exist today – you also need to defend against future attacks that haven’t been invented yet!

For these reasons, I believe that you should design your device assuming that it will be breached. Once you adopt the mindset that some or all of your security controls will fail, you’re in a better position to design them to minimise the impact of a breach.

Key revocation

Key revocation isn’t a great strategy given the constraints of IoT devices, but there’s plenty of prior work that you can draw from.

For example, the designers of the DVD and Blu-ray content-protection systems assumed that some of the encryption keys would be leaked. The keys are distributed in every single device (or software instance), they’re difficult to protect (DVD/Blu-ray players are cost sensitive), and attackers have a strong incentive to extract them (high-quality duplication and distribution of content). Knowing this, they designed the system so that leaked keys could be revoked and new content could not be played on compromised hardware. This also gives the hardware designers a mild incentive to protect their keys – if one of their players leaked a key, they would be ‘punished’ by having their players unable to play new content. (One might debate whether this is a punishment or a blessing; they would have to explain to customers why their player won’t play new content, but many customers would just buy a new player.)
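A minimal sketch of the revocation idea (the key IDs and list format here are invented; real systems such as AACS use broadcast-encryption key trees, which are far more involved):

```python
# Sketch: new content ships with a list of revoked (leaked) device keys.
# A player whose key is on the list cannot play the new content.
# Key IDs and the list format are invented for illustration.
def can_play(revocation_list: set, my_key_id: str) -> bool:
    # Old content (an empty list) still plays everywhere; new content
    # effectively punishes whoever leaked their key.
    return my_key_id not in revocation_list

shipped_with_new_disc = {"device-key-0017", "device-key-0452"}
print(can_play(shipped_with_new_disc, "device-key-0099"))  # True - not revoked
print(can_play(shipped_with_new_disc, "device-key-0017"))  # False - key leaked
```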

Defence in depth

If you use the model that individual controls will fail, the obvious solution is to have multiple controls for a particular risk.

The controls need to be as independent as possible. Having two separate software checks for unauthorised access doesn’t help you if the attacker uses a debugger to bypass them both. Having a software check and an external hardware check would help.
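As a sketch of that principle (the firmware image, hash, and tamper-pin wiring are all invented stand-ins):

```python
# Sketch: two independent controls for the same risk. Patching the
# software check with a debugger still leaves the hardware check standing.
import hashlib
import hmac

GOOD_FIRMWARE = b"firmware v1.2 image bytes"              # illustrative image
EXPECTED_HASH = hashlib.sha256(GOOD_FIRMWARE).hexdigest()

def software_check(firmware: bytes) -> bool:
    digest = hashlib.sha256(firmware).hexdigest()
    return hmac.compare_digest(digest, EXPECTED_HASH)     # constant-time compare

def hardware_check(tamper_pin: int) -> bool:
    # Stand-in for an independent control: e.g. a tamper-detect line read
    # over GPIO, or a secure element attesting the boot measurement.
    return tamper_pin == 0  # 0 = enclosure intact (assumed wiring)

def access_allowed(firmware: bytes, tamper_pin: int) -> bool:
    # Both controls must pass; bypassing one alone achieves nothing.
    return software_check(firmware) and hardware_check(tamper_pin)

print(access_allowed(GOOD_FIRMWARE, 0))     # True - both controls pass
print(access_allowed(b"patched image", 0))  # False - software check fails
print(access_allowed(GOOD_FIRMWARE, 1))     # False - tamper detected
```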

Partitioning

Keep high-risk areas of the system separate from low-risk areas. Where possible, separate independent risky systems. You don’t want a breach in one subsystem to impact another.

OpenSSH does this through “privilege separation”. Parts of it need to run as the root (administrative) user, but most of it can operate as a less-privileged user. To reduce the attack surface, OpenSSH is split into separate processes with different privilege requirements: a small privileged monitor, and an unprivileged child that handles untrusted network input.
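A stripped-down sketch of the same pattern (not OpenSSH’s actual code – just the classic POSIX privilege-drop sequence; the port number and target user are placeholders):

```python
# Sketch: do the privileged work first, then drop privileges before
# touching untrusted input. POSIX-only, and must be started as root.
import os
import pwd
import socket

def drop_privileges(username: str = "nobody") -> None:
    user = pwd.getpwnam(username)
    os.setgroups([])        # drop supplementary groups
    os.setgid(user.pw_gid)  # group first, while we still have the right to
    os.setuid(user.pw_uid)  # point of no return

# Privileged phase: binding a port below 1024 requires root.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("0.0.0.0", 222))   # placeholder port
listener.listen(5)

# Unprivileged phase: the code that parses attacker-controlled data
# now runs with minimal privileges, containing any bug in it.
drop_privileges()
conn, addr = listener.accept()
data = conn.recv(4096)
```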

Cars are another great example. Modern cars want integration between all of the car’s systems – the user wants to be able to control everything from one interface. On one hand, you could reduce manufacturing costs by sharing CPUs and networks between the infotainment and safety-critical systems. On the other hand, you don’t want a vulnerability in your media system to affect safety-critical systems like braking and steering. Best to keep them separate and control the interfaces between them carefully.
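A toy sketch of such a controlled interface (the CAN message IDs and whitelist are invented; a real gateway would also rate-limit and validate payloads):

```python
# Sketch: a gateway between the safety-critical bus and the infotainment
# bus forwards only a whitelist of message IDs, in one direction only.
# The CAN IDs below are invented for illustration.
from typing import Optional

ALLOWED_TO_INFOTAINMENT = {0x1A0, 0x1A1}  # e.g. speed and fuel level, for display

def forward(message_id: int, payload: bytes) -> Optional[bytes]:
    """Forward safety-bus telemetry to the infotainment bus; never the reverse."""
    if message_id in ALLOWED_TO_INFOTAINMENT and len(payload) <= 8:
        return payload   # would be re-transmitted on the infotainment bus
    return None          # everything else is dropped at the boundary

print(forward(0x1A0, b"\x00\x42"))      # forwarded
print(forward(0x7DF, b"\x02\x10\x01"))  # dropped - not on the whitelist
```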

Canaries and tamper detection

If you’re storing sensitive data (e.g. content decryption keys) you might be better off destroying the keys and/or device if an intrusion is detected.

This can be done in software. What an intrusion looks like varies tremendously between applications, but you might look for changes in memory that should never change, unexpected commands on debug ports, or trap commands that follow the same pattern as regular commands but are never sent by legitimate clients – designed to catch fuzzing.
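An illustrative sketch of both tripwires (the command names, canary placement, and self-destruct action are all made up):

```python
# Sketch: a memory canary that legitimate code never touches, plus a trap
# command that looks real enough for a fuzzer to try. All names invented.
import os

CANARY = bytearray(os.urandom(16))   # written once at boot
CANARY_SNAPSHOT = bytes(CANARY)      # what it should still contain

REAL_COMMANDS = {"READ_TEMP", "SET_INTERVAL"}
TRAP_COMMANDS = {"READ_KEYS"}        # plausible-looking, never used legitimately

def self_destruct() -> None:
    # In a real device: erase key storage, then halt or brick the unit.
    raise SystemExit("intrusion detected - erasing secrets")

def handle_command(cmd: str) -> str:
    if bytes(CANARY) != CANARY_SNAPSHOT:  # memory that must not change, changed
        self_destruct()
    if cmd in TRAP_COMMANDS:   # a fuzzer walking the command space trips this
        self_destruct()
    return "ok" if cmd in REAL_COMMANDS else "unknown command"

print(handle_command("READ_TEMP"))  # ok
```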

If you’ve got a Linux system, a great strategy would be to build AppArmor profiles for your application. Any AppArmor violations could trip a self-destruct. (Just limiting your application is a great start, even if you do nothing about violations!)
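A rough sketch of the ‘violations trip a self-destruct’ idea (the log path, profile name, and wipe action are assumptions; on a systemd machine you’d read the audit log or journal instead):

```python
# Sketch: watch the kernel log for AppArmor denials involving our profile
# and treat any denial as a tamper signal. Paths and names are assumed.
import time

LOG_PATH = "/var/log/kern.log"   # assumption: denials are logged here
PROFILE_NAME = "myapp"           # hypothetical AppArmor profile name

def on_violation(line: str) -> None:
    # In a real device: wipe secrets, raise a telemetry alert, maybe halt.
    print("AppArmor denial - treating as intrusion:", line.strip())

with open(LOG_PATH) as log:
    log.seek(0, 2)               # start at the end of the file, like `tail -f`
    while True:
        line = log.readline()
        if not line:
            time.sleep(0.5)
            continue
        if 'apparmor="DENIED"' in line and PROFILE_NAME in line:
            on_violation(line)
```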

If you (the vendor) have remote telemetry coming from devices in the field, it might alert you to attacks in progress.

You can also use hardware modules (e.g. secure elements) that store secrets in a tamper-resistant manner. They cost money, of course.

Remote killswitch

The device can be remotely deactivated – perhaps by network control, perhaps by a physical button or radio signal.

This is useful for devices which can cause personal injury if something goes wrong – autonomous aircraft, industrial equipment, surgery robots, perhaps even cars.

Ideally, you want the ‘killswitch computer’ to be separate from the ‘application computer’ – a compromised or damaged application computer might not execute the kill command correctly.
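A minimal sketch of such a standalone killswitch listener (the port, the provisioning of the shared key, and the power-cut action are all assumptions; a real design would add a nonce or counter against replay):

```python
# Sketch: a tiny, separate killswitch process that shares nothing with the
# application computer except the power relay it controls. Names assumed.
import hashlib
import hmac
import socket

SHARED_KEY = b"provisioned-at-manufacture"   # hypothetical per-device secret
KILL_MESSAGE = b"KILL"

def cut_power() -> None:
    # In a real device: drive a GPIO that opens the application
    # computer's power relay.
    print("kill command authenticated - cutting power")

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", 9999))             # placeholder port
while True:
    packet, addr = sock.recvfrom(1024)   # message || 32-byte HMAC-SHA256 tag
    msg, tag = packet[:-32], packet[-32:]
    expected = hmac.new(SHARED_KEY, msg, hashlib.sha256).digest()
    if msg == KILL_MESSAGE and hmac.compare_digest(tag, expected):
        cut_power()
        break
```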

A recent example of this is the Galaxy Note 7 recall, where an OTA update disables charging to reduce the risk of fires.

