Securing embedded devices for both IoT and non-IoT applications is an increasingly important concern. In response, device manufacturers are building embedded processors and microcontrollers with a wealth of integrated security features.
Unfortunately, security comes at a cost both in the price of these new devices and in the effort required to redesign and requalify hardware to make use of new ICs. The good news is that much can be done on low-cost or older hardware to enhance system security. Let’s explore the techniques we can use to improve security on these systems.
General Steps to Improving Security
Many simple techniques are available to greatly improve the security of low-cost systems. The first and easiest is to ensure that hardware and firmware details aren’t readily available. It’s common to see exploits presented in academic papers that begin with finding the firmware or schematics for a product on a website or FTP server. Schematics, source code, and binaries should all be access-controlled. In addition, firmware updates should be encrypted to prevent code from being easily extracted from an update. It’s also possible to take more extreme measures such as requesting custom-marked ICs to obfuscate what hardware is being used.
Limiting information in this way is simply obfuscation, not true security, although it does increase the difficulty of an attack and so improves the system’s security. The drawback is that limiting access to key information makes it less likely for researchers and other “white-hat” actors to analyze your product and alert you to security vulnerabilities.
One way to mitigate this is to engage with third parties who conduct code reviews and penetration testing, and provide this information under NDA. In addition, consider making information available under NDA for researchers or academic groups who wish to analyze and attack your product.
Another common technique is to secure the debug interface to the product. While low-cost ICs may not have hardware features such as Secure Debug Unlock (Fig. 1), they often enable the CPU to lock and unlock debug access. This capability can be used to implement a reasonable debug unlock feature.
In the simplest form of Secure Debug Unlock, each device is programmed with the same public key (unlock_keypub) and a unique ID. To unlock a device, an unlock token is generated by signing the device’s unique ID with the private unlock key (unlock_keypriv), and that token is sent to the device. The device firmware checks the signature to ensure that the token was produced by the private key holder, and then checks that the signed ID matches the unique ID of that part. If both conditions are met, the debug interface can be unlocked.
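The two checks described above can be sketched in a few lines. This is a minimal illustration, not a definitive implementation: the function names are hypothetical, and the signature scheme is a deliberately toy, unpadded RSA built from two well-known Mersenne primes so that the example runs with only the standard library. A real product would use a vetted cryptographic library and a standard scheme such as ECDSA or RSA-PSS.

```python
import hashlib

# TOY RSA parameters, for illustration only (an assumption, not from the
# article): two well-known Mersenne primes and no padding. Never ship this.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                           # modulus, part of unlock_keypub
E = 65537                           # public exponent, part of unlock_keypub
D = pow(E, -1, (P - 1) * (Q - 1))   # unlock_keypriv, never stored on the device


def _digest(unique_id: bytes) -> int:
    """Hash the unique ID and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(unique_id).digest(), "big") % N


def make_unlock_token(unique_id: bytes) -> int:
    """Key-holder side: sign one device's unique ID with the private key."""
    return pow(_digest(unique_id), D, N)


def debug_unlock_allowed(device_id: bytes, claimed_id: bytes, token: int) -> bool:
    """Device side: the signature must verify AND the ID must be this device's."""
    signature_ok = pow(token, E, N) == _digest(claimed_id)
    id_matches = claimed_id == device_id
    return signature_ok and id_matches
```

Because the token is bound to one unique ID, a token leaked for one unit does not unlock any other unit, which is the main advantage over a shared password.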
While there are more advanced implementations, this simple example represents a significant increase in security over not locking the interface, using a simple password-based unlock, or using a device that allows anyone to unlock the device after erasing flash.
In addition, some devices offer the ability to permanently lock the debug interface. In systems where debug access isn’t needed for service or failure analysis, this is often a very secure way of disabling the debug port.
In certain systems, debug terminals are left running. Like debug ports, these should be disabled to prevent attackers from using them to compromise the system. The same techniques used in locking debug access can be applied for this purpose.
Protecting Symmetric Keys from Side-Channel Attacks
The next consideration is how to better protect symmetric keys from extraction via side-channel attacks. Although these attacks take many forms, they all rely on observing some aspect of the device (power consumption, EMI, timing) while decryption is being performed. Those observations are then used to extract the key via statistical analysis.
Newer embedded devices can contain hardware countermeasures to prevent a usable signal from being observed. While it’s tempting to try to reproduce the effect of these countermeasures in software, the reality is that the statistical analysis performed by attackers is often too effective for software to be able to sufficiently obfuscate the signal.
In the absence of dedicated hardware countermeasures, the goal is to thwart side-channel attacks by preventing use of the key. If the key isn’t used, the attacker can’t capture the number of observations (traces) required to perform their analysis. In some cases, this isn’t possible, and we must live with the fact that the key is vulnerable to this style of attack. In other cases, we can effectively limit the use of the key and prevent the attacker from collecting the needed data.
A prime example of this is a secure bootloader, which will decrypt an image as part of the installation process. The key should only be accessed when installing a valid update. In most systems, updates are performed a few times a year. This means we can limit key use to a few times a year, which is far less than an attacker requires to mount a successful attack. First, the bootloader must prevent the loading of the same image over and over. This can be easily accomplished by including a version number in the image and only decrypting it if the version is newer than the currently installed image. This prevents attackers from provoking use of the key by repeatedly loading the same image or by flipping between versions (e.g., N, N-1, N, N-1, ...).
Next, we need to prevent the bootloader from loading a corrupt or altered image. This is easily accomplished by signing the image and any metadata such as the version. It prevents the attacker from simply editing the version of the image and fooling the bootloader into thinking it’s getting hundreds of valid updates.
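The version rule and the signature rule combine into a single gate that runs before the decryption key is touched. The sketch below is one possible arrangement under stated assumptions: the `UpdateImage` layout, the 4-byte version encoding, and the `verify` callback (standing in for whatever signature check the platform provides) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class UpdateImage:
    version: int        # signed metadata
    ciphertext: bytes   # encrypted firmware
    signature: bytes    # must cover both the version and the ciphertext


def signed_bytes(image: UpdateImage) -> bytes:
    """Bytes the signature must cover: the version field plus the payload."""
    return image.version.to_bytes(4, "big") + image.ciphertext


def may_decrypt(image: UpdateImage, installed_version: int,
                verify: Callable[[bytes, bytes], bool]) -> bool:
    """Return True only if the decryption key should be used at all."""
    if not verify(signed_bytes(image), image.signature):
        return False                           # corrupt, altered, or unsigned
    return image.version > installed_version   # rejects replays and rollbacks
```

Checking the signature first means an attacker cannot simply edit the version field, because the edited field no longer matches what was signed.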
The final attack vector is a bit more difficult to handle. With the mechanisms above, the bootloader will only decrypt the image if it’s newer than the current version and correctly signed. However, attackers can still load the image many times if they abort the installation process by either removing power or pulling the reset pin.
When a reset occurs during image installation, most bootloaders will identify that a bootload was in progress and automatically retry the installation. This ensures that a failed installation doesn’t “brick” the part. It also causes decryption of the image to occur with every retry.
By continually resetting the part midway in the installation, an attacker can cause the bootloader to repeatedly decrypt the image. To avoid this scenario, the bootloader may include a “failed update” counter that’s incremented every time the installation begins. This counter will increase with each retry until it reaches a value (such as 20 retries) indicating that an attack is probable, and then it will brick the device (erase the decryption key and erase the application) to prevent any further attacks on the key. In most systems, this set of bootloader rules will greatly increase the difficulty of executing a side-channel attack.
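The “failed update” counter can be sketched as follows. This is an illustrative model only: the class and method names are hypothetical, the fields stand in for values a real bootloader would hold in nonvolatile memory, and the exact point at which the limit trips (here, the attempt after the 20th unfinished one) is an assumption.

```python
from typing import Optional

MAX_FAILED_UPDATES = 20  # example threshold taken from the text


class BootloaderState:
    """Stand-in for values a real bootloader would keep in nonvolatile memory."""

    def __init__(self, decryption_key: bytes):
        self.decryption_key: Optional[bytes] = decryption_key
        self.failed_updates = 0
        self.application_present = True

    def begin_install(self) -> bool:
        """Call before any decryption; False means the key must not be used."""
        if self.decryption_key is None:
            return False                      # already bricked
        if self.failed_updates >= MAX_FAILED_UPDATES:
            self._brick()
            return False
        self.failed_updates += 1              # must be persisted BEFORE decrypting
        return True

    def finish_install(self) -> None:
        """Call only after the new image is fully installed and verified."""
        self.failed_updates = 0

    def _brick(self) -> None:
        self.decryption_key = None            # erase the decryption key
        self.application_present = False      # erase the application
```

The important design point is ordering: the incremented counter has to reach nonvolatile storage before decryption starts, otherwise a well-timed reset skips the increment.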
Another example is a symmetric key used to protect communication with an external entity. Normally, a public key exchange is used to exchange symmetric keys that encrypt the actual communication between the two parties. The symmetric keys are often utilized for an extended period, and their use can be observed long enough to allow side-channel extraction. Rotation of the symmetric key at regular intervals reduces the number of times it’s used and the probability that it can be extracted through a side-channel attack.
Since generating and distributing a new key consumes power and time, there’s a cost to this method. However, when the traffic being protected is sensitive, the increase in security is generally worth the performance cost.
Figure 2 shows a communications protocol that frequently rotates the symmetric key (perhaps every 1 ms) by having the transmitter generate a new random symmetric key at fixed intervals and send it as part of the encrypted data. Once sent, the link switches to the new key. If attackers gain access to any symmetric key, they will be able to read all future messages since they can decode the next key with the broken one. To mitigate this vulnerability, an asymmetric key exchange can be performed at longer intervals (for example, 1 second) to ensure that if a key is broken, the attacker can decode only a second of data.
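A small model makes the key chaining, and its weakness, concrete. The sketch below makes several assumptions that are not in the article: the key rotates on every frame rather than on a timer, the class names are hypothetical, and the cipher is a toy keystream built from SHA-256 purely so the example runs with the standard library. It is not a substitute for a real authenticated cipher.

```python
import hashlib
import secrets

KEY_LEN = 16


def _keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher (SHA-256 in counter mode); a stand-in for a real cipher."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(4, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


class Sender:
    def __init__(self, key: bytes):
        self.key = key

    def send(self, payload: bytes) -> bytes:
        next_key = secrets.token_bytes(KEY_LEN)        # fresh random key
        frame = _keystream_xor(self.key, next_key + payload)
        self.key = next_key                            # link switches to new key
        return frame


class Receiver:
    def __init__(self, key: bytes):
        self.key = key

    def receive(self, frame: bytes) -> bytes:
        plain = _keystream_xor(self.key, frame)
        self.key, payload = plain[:KEY_LEN], plain[KEY_LEN:]
        return payload
```

An eavesdropper modeled as a second `Receiver` that starts with one broken key stays in step with the link indefinitely, which is why the periodic asymmetric exchange is needed to re-seed both ends with a key the eavesdropper cannot derive.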
Increasing the Difficulty of Exploiting the Application
Exploits of the application code are a concern for low-cost devices that don’t have hardware support for isolating software to contain exploits. Code exploits come in two varieties. The first style of attack attempts to manipulate memory by using an existing interface in an unexpected way, typically one whose inputs aren’t correctly validated.
The most well-known attack of this type is a buffer overflow (Fig. 3), where an attacker sends more data than expected to an interface, which results in overwriting application code or data with their own values. This attack can be used to activate application code in an unintended way, such as getting a command intended to retrieve a variable to instead retrieve configuration information. In the worst case, code can be injected into memory, causing the CPU to jump to that code and allowing the attacker to execute arbitrary code and take control of the hardware.
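The defense against this class of attack is to validate every length before data is copied or acted on. The sketch below shows the checks for a hypothetical two-byte-header packet format (command ID, then declared payload length); the format, names, and buffer size are assumptions. Python itself is memory-safe, so this only illustrates the logic that C firmware would need in front of a copy into a fixed-size buffer.

```python
from typing import Tuple

MAX_PAYLOAD = 32   # size of the (hypothetical) receive buffer
HEADER_LEN = 2     # 1-byte command ID + 1-byte declared payload length


def parse_packet(packet: bytes) -> Tuple[int, bytes]:
    """Return (command, payload) or raise ValueError; never trust the length field."""
    if len(packet) < HEADER_LEN:
        raise ValueError("truncated header")
    command, declared_len = packet[0], packet[1]
    if declared_len > MAX_PAYLOAD:
        raise ValueError("payload larger than receive buffer")
    if len(packet) - HEADER_LEN != declared_len:
        raise ValueError("length field does not match data received")
    return command, packet[HEADER_LEN:]
```

Both the declared length and the actual amount of data received are checked, since an attacker controls each independently.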
The second style of attack attempts to cause code to malfunction by indirectly injecting a fault such as a power supply or clock glitch. The goal of this attack is to modify program flow. This is typically accomplished by injecting a fault or disturbance at the precise moment in time that the CPU is making a security decision (or responding to one) by corrupting a branch instruction or the instructions leading up to a branch instruction. For example, there may be test code that sends secret information from a UART for test purposes that’s not supposed to run in normal operation. However, an attacker may be able to get that code to run by injecting a fault that causes an if-statement to make the wrong decision.
The best way to mitigate these attacks is to reduce the attack surface, i.e., pare down the number of interfaces that can be attacked and reduce the complexity of the overall application. For example, it’s tempting to create a single image containing the application and test code needed for board test. This image can be programmed, a board test can be run, and the product shipped. However, it results in the presence of test code and a board test interface on devices in the field.
Instead, we can create separate test and application images. Program the board test image, run the tests, and apply the application image. This will simplify the end application and eliminate a possible point of attack. Similarly, production images should not contain unneeded code. For example, if a feature is partially developed and then eliminated for cost or time-to-market reasons, that code should be removed from the production image rather than left in place. Similarly, rather than include optional features that can be enabled in the field, create two separate images and enable the optional features via a firmware update.
Firmware updates are also a good way to mitigate code exploits. Regular firmware updates have two beneficial effects. First, if an attacker develops a permanent exploit that installs itself into flash, pushing an update will either overwrite that exploit with a new image or force the attacker to not install the new image, which will likely be detected when the expected changes in behavior aren’t observed.
Firmware updates can also include intentional (or unintentional) changes in code, which disrupt the function of exploits of the previous version. For example, if the structure of data in RAM or code in flash is changed, then an exploit relying on overwriting a specific address will not function since that address no longer contains the variable that the exploit was trying to overwrite. Similarly, if a communication protocol is changed on an update and a compromised device chooses to suppress (not install) the update, it will be immediately obvious that the device did not update properly.
Hardware functionality can also be reduced to limit the tools available to an attacker. For example, many ICs allow RAM to be disabled to save power. It’s possible for an application running on a device with 4K of RAM but using only 2K to disable the other 2K and prevent an attacker from using it. Similarly, unused pages of flash can be locked to prevent attackers from installing an exploit into them.
Other easy options available on some hardware include implementing stack limit/overflow detection (or stack canary), setting the “no execute” bit on data memory (if available) and address space layout randomization (ASLR). ASLR is an application-processor class feature, but the stack limit and “no execute” capabilities are found on microcontrollers as well. A variety of coding techniques can make code more robust to glitching attacks (Fig. 4) and side-channel observations.
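Two commonly used glitch-hardening patterns are sketched below: representing a security decision with complementary multi-bit constants instead of 0 and 1, and repeating the decision so that a single corrupted branch is not enough. The constants and function names are assumptions chosen for illustration. The logic is shown in Python for readability only; in firmware it would be written in C, with care taken that the compiler does not merge the duplicated check.

```python
# Complementary bit patterns (assumed values): a single flipped bit cannot
# turn DENIED into GRANTED, unlike a 0/1 flag.
GRANTED = 0x3CA5965A
DENIED = 0xC35A69A5


def check_code(entered: bytes, stored: bytes) -> int:
    """Compare without early exit; the result defaults to DENIED."""
    result = DENIED
    diff = len(entered) ^ len(stored)
    for a, b in zip(entered, stored):
        diff |= a ^ b              # accumulate differences, no early return
    if diff == 0:
        result = GRANTED
    return result


def unlock_test_mode(entered: bytes, stored: bytes) -> bool:
    """Grant access only if two independent evaluations both say GRANTED."""
    first = check_code(entered, stored)
    if first != GRANTED:
        return False
    # Repeat the decision: one skipped or corrupted branch is not sufficient.
    second = check_code(entered, stored)
    if second != GRANTED or (first ^ second) != 0:
        return False
    return True
```

Avoiding the early exit in the comparison also removes a timing difference that a side-channel observer could otherwise use.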
In addition to these techniques, it’s recommended to perform penetration testing on firmware images to identify and fix vulnerabilities. This can be carried out through third parties or internally using open-source tooling. A combination of internal and third-party testing is advisable. The most common form of analysis for interface exploits is fuzzing; numerous online resources explain this technique.
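At its simplest, fuzzing means feeding an interface large numbers of unexpected inputs and recording any that cause a failure. The sketch below is a bare-bones random fuzzer with hypothetical names; it assumes the handler under test signals a clean rejection with `ValueError`, so any other exception counts as a finding. Production fuzzers are typically coverage-guided and far more effective than pure random input.

```python
import random
from typing import Callable, List, Tuple


def fuzz(target: Callable[[bytes], object], iterations: int = 1000,
         max_len: int = 64, seed: int = 0) -> List[Tuple[bytes, Exception]]:
    """Feed random byte strings to target and collect inputs that crash it.

    ValueError is treated as a clean rejection; anything else is a finding.
    """
    rng = random.Random(seed)          # seeded so findings are reproducible
    findings = []
    for _ in range(iterations):
        length = rng.randrange(max_len + 1)
        data = bytes(rng.randrange(256) for _ in range(length))
        try:
            target(data)
        except ValueError:
            pass                       # input was rejected as intended
        except Exception as exc:       # unexpected failure: record it
            findings.append((data, exc))
    return findings


def fragile_handler(packet: bytes) -> int:
    """Deliberately buggy example target: reads packet[1] with no length check."""
    return packet[0] + packet[1]
```

Running `fuzz(fragile_handler, max_len=8)` returns the short inputs that trigger the missing length check, each paired with the exception it raised.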
Securing Flash Against Tampering
While a true secure boot requires an immutable root of trust (typically in ROM), many low-cost MCUs can make a reasonably secure root of trust in flash. This relies on the ability of hardware to lock pages of flash to prevent their erasure or programming. If this feature is available, a non-updatable bootloader can be placed into a flash page and that page locked.
In addition to providing bootload services, the immutable bootloader can also provide secure-boot services to check the signature of the application before allowing it to execute. In this case, the bootloader contains a public key used to validate images to be installed and to validate the flash contents on boot. If an attacker attempts to edit the contents of the application, the bootloader will detect that the signature no longer matches and prevent the execution of the tampered code.
Protecting Confidential Data
Protecting confidential data on a low-cost IC is difficult. While some techniques make it more difficult for attackers to extract confidential data, ultimately hackers will find a way to extract the information if it’s valuable enough.
The primary defense of a low-cost system is to simply not have valuable data present in the first place. Persistent (flash) and transient (RAM) data should be evaluated to see if it’s required. If not, the data should be removed from the system. If required, it should be stored for as short a time as possible. If the system has other components that can store sensitive data, such as a smartphone or cloud application, move the data to those platforms because they typically have better capabilities to physically protect stored data.
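Keeping transient data for as short a time as possible usually means wiping the buffer as soon as the operation that needs it finishes. The sketch below shows the pattern with a hypothetical helper; it only wipes its own working copy, and the caller’s original value is outside its control. In C firmware the equivalent wipe must be written so that the compiler cannot optimize it away.

```python
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def short_lived_secret(value: bytes) -> Iterator[bytearray]:
    """Hold a secret in a mutable buffer and zero it when the block exits."""
    buf = bytearray(value)
    try:
        yield buf
    finally:
        for i in range(len(buf)):
            buf[i] = 0          # overwrite in place, even if an error occurred
```

Used as `with short_lived_secret(key) as k: ...`, the working copy is zeroed on exit whether the block completes normally or raises.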
In addition, implementation of features that require storage of confidential data should be carefully evaluated. If a desired feature can’t be adequately secured on a low-cost device, it should not be implemented, or the system should be updated to a more secure IC.
External Secure Element ICs do a good job of storing confidential data, but if the application is compromised, the attacker can do everything the application can do. For example, while storing a key in a Secure Element will prevent a compromised application from getting the key value, it will not prevent the application from using that key to sign or decrypt messages. That ability can be just as damaging as gaining access to the key value itself. Anything the application can do using an external Secure Element can also be performed by a compromised version of that application.
Remember that confidential data is accessible beyond the device. Protecting a symmetric key on a device is ineffective if the key is being emailed around the engineering office or stored in an unsecured source control repository.
Cryptography users should consider how confidential data (e.g., cryptographic keys, firmware images, or source code) is stored and accessed in their development office. This is critical to ensuring that confidential information is well protected. Developers should use hardened storage mechanisms such as a hardware security module (HSM) or trusted platform module (TPM), institute policies and procedures for accessing secure information, implement access controls, and implement correct logging and auditing procedures.
Fundamental Limitations of Low-Cost Systems
It’s important to understand the reasons why IC vendors include hardware-security mechanisms in their chips and why many OEMs pay for them. Low-cost systems have fundamental limitations and will never be as secure as a system with the proper dedicated hardware:
- Systems without hardware side-channel attack countermeasures will leak key information.
- Systems without process isolation hardware will be vulnerable to application code exploits.
- Systems without an on-board secure boot will be vulnerable to firmware tampering.
- Systems without a Secure Element (secure key storage) will be vulnerable to key extraction.
Since these vulnerabilities can’t be eliminated in low-cost systems, focus on manipulating the attacker’s cost/benefit calculation (Fig. 5). Attackers will develop an exploit if the benefit they gain (money, enjoyment, notoriety, geo-political aims) exceeds the cost (money, effort) required to generate it. Instead of making key extraction impossible, reduce the value of the key and increase the cost of extraction to the point where that cost is higher than the value of the key.
If the cost of the attack can’t be raised above the value of that exploit, the designer must consider moving to a more expensive security-enabled device.
Conclusions
While low-cost or legacy devices have limitations, you can take many steps to make end products and systems more secure. Designers should carefully consider the security requirements of both new systems and upgrades to existing systems. In some cases, these requirements will call for advanced hardware features. In others, applying simple, common-sense techniques and best practices can enable legacy or low-cost devices to thrive and succeed in today’s IoT market.
Josh Norem is Staff Systems Engineer, IoT Products, at Silicon Labs.