Embedded Systems Architecture - Daniele Lacamera - E-Book

Description

Embedded Systems Architecture begins with a bird’s-eye view of embedded development and how it differs from the other systems that you may be familiar with. This book will help you get the hang of the internal working of various components in real-world systems.
You’ll start by setting up a development environment and then move on to the core system architectural concepts, exploring system designs, boot-up mechanisms, and memory management. As you progress through the topics, you’ll explore the programming interface and device drivers to establish communication via TCP/IP and take measures to increase the security of IoT solutions. Finally, you’ll be introduced to multithreaded operating systems through the development of a scheduler and the use of hardware-assisted trusted execution mechanisms.
With the help of this book, you will gain the confidence to work with embedded systems at an architectural level and become familiar with various aspects of embedded software development on microcontrollers, such as memory management, multithreading, and RTOS, with an approach oriented to memory isolation.

You can read this e-book in Legimi apps or in any app that supports the following formats:

EPUB
MOBI

Number of pages: 512

Year of publication: 2023




Embedded Systems Architecture

Design and write software for embedded devices to build safe and connected systems

Daniele Lacamera

BIRMINGHAM—MUMBAI

Embedded Systems Architecture

Copyright © 2023 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Group Product Manager: Gebin George

Publishing Product Manager: Kunal Sawant

Senior Editor: Rohit Singh

Technical Editor: Pradeep Sahu

Copy Editor: Safis Editing

Project Coordinator: Deeksha Thakkar

Proofreader: Safis Editing

Indexer: Sejal Dsilva

Production Designer: Joshua Misquitta

Developer Relations Marketing Executive: Sonakshi Bubbar

First published: May 2018

Second edition: January 2023

Production reference: 1281222

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham

B3 2PB, UK.

ISBN 978-1-80323-954-5

www.packt.com

Contributors

About the author

Daniele Lacamera is a software technologist and researcher. He is an expert in operating systems and TCP/IP, with more than 20 academic publications on transport protocol optimization. He began his career as a Linux kernel developer and had his first contribution accepted in Linux 2.6.

Since 2012, he has been working on microcontroller-based architectures, focusing on the design, development, and integration of software for embedded systems. He is an active contributor to many free software projects, and co-author of a TCP/IP stack and a POSIX operating system (OS) for Cortex-M devices, both distributed under the GPL. Nowadays, his activities are focused on IoT security, cryptography, secure boot, and custom transport protocols.

About the reviewer

Marco Oliverio obtained his Ph.D. degree at the University of Calabria, with a dissertation on OS defenses against side-channel and hardware attacks. Some of his academic publications have been presented at vitally important conferences regarding OSes. After finishing his Ph.D., he started working as a firmware and embedded developer, contributing to several open source projects.

Table of Contents

Preface

Part 1 – Introduction to Embedded Systems Development

1

Embedded Systems – A Pragmatic Approach

Domain definition

Embedded Linux systems

Low-end 8-bit microcontrollers

Hardware architecture

Understanding the challenges

Multithreading

RAM

Flash memory

General-purpose input/output (GPIO)

ADC and DAC

Timers and PWM

Interfaces and peripherals

Asynchronous UART-based serial communication

SPI

I2C

USB

Connected systems

Challenges of distributed systems

Introduction to isolation mechanisms

The reference platform

ARM reference design

The Cortex-M microprocessor

Summary

2

Work Environment and Workflow Optimization

Workflow overview

The C compiler

Linker

Make: a build automation tool

Debugger

Embedded workflow

Text editors versus integrated environments

The GCC toolchain

The cross compiler

Compiling the compiler

Linking the executable

Binary format conversion

Interacting with the target

The GDB session

Validation

Functional tests

Hardware tools

Testing off-target

Emulators

Summary

Part 2 – Core System Architecture

3

Architectural Patterns

Configuration management

Revision control

Tracking activities

Code reviews

Continuous integration

Source code organization

Hardware abstraction

Middleware

Application code

Security considerations

Vulnerability management

Software cryptography

Hardware cryptography

Running untrusted code

The life cycle of an embedded project

Defining project steps

Prototyping

Refactoring

API and documentation

Summary

4

The Boot-Up Procedure

Technical requirements

The interrupt vector table

Startup code

Reset handler

Allocating the stack

Fault handlers

Memory layout

Building and running the boot code

The makefile

Running the application

Multiple boot stages

Bootloader

Building the image

Debugging a multi-stage system

Shared libraries

Remote firmware updates

Secure boot

Summary

5

Memory Management

Technical requirements

Memory mapping

Memory model and address space

The code region

The RAM regions

Peripheral-access regions

The system region

Order of memory transactions

The execution stack

Stack placement

Stack overflows

Stack painting

Heap management

Custom implementation

Using newlib

Limiting the heap

Multiple memory pools

Common heap usage errors

The memory protection unit

MPU configuration registers

Programming the MPU

Summary

Part 3 – Device Drivers and Communication Interfaces

6

General-Purpose Peripherals

Technical requirements

Bitwise operations

The interrupt controller

Peripherals’ interrupt configuration

System time

Adjusting the flash wait states

Clock configuration

Clock distribution

Enabling the SysTick

Generic timers

GPIO

Pin configuration

Digital output

PWM

Digital input

Interrupt-based input

Analog input

The watchdog

Summary

7

Local Bus Interfaces

Technical requirements

Introducing serial communication

Clock and symbol synchronization

Bus wiring

Programming the peripherals

UART-based asynchronous serial bus

Protocol description

Programming the controller

Hello world!

newlib printf

Receiving data

Interrupt-based input/output

SPI bus

Protocol description

Programming the transceiver

SPI transactions

Interrupt-based SPI transfers

I2C bus

Protocol description

Clock stretching

Multiple masters

Programming the controller

Interrupt handling

Summary

8

Power Management and Energy Saving

Technical requirements

System configuration

Hardware design

Clock management

Voltage control

Low-power operating modes

Deep-sleep configuration

Stop mode

Standby mode

Wake-up intervals

Measuring power

Development boards

Designing low-power embedded applications

Replacing busy loops with sleep mode

Deep sleep during longer inactivity periods

Choosing the clock speed

Power state transitions

Summary

9

Distributed Systems and IoT Architecture

Technical requirements

Network interfaces

MAC

Selecting the appropriate network interfaces

The Internet protocols

Standard protocols, custom implementations

The TCP/IP stack

Network device drivers

Running the TCP/IP stack

Socket communication

Connectionless protocols

Mesh networks and dynamic routing

TLS

Securing socket communication

Application protocols

Message protocols

The REST architectural pattern

Distributed systems – single points of failure

Summary

Part 4 – Multithreading

10

Parallel Tasks and Scheduling

Technical requirements

Task management

The task block

Context switch

Creating tasks

Scheduler implementation

Supervisor calls

Cooperative scheduler

Concurrency and timeslices

Blocking tasks

Waiting for resources

Real-time scheduling

Synchronization

Semaphores

Mutexes

Priority inversion

System resource separation

Privilege levels

Memory segmentation

System calls

Embedded operating systems

OS selection

FreeRTOS

RIOT OS

Summary

11

Trusted Execution Environment

Technical requirements

Sandboxing

TrustZone-M

Reference platform

Secure and non-secure execution domains

System resources separation

Security attributes and memory regions

Flash memory and secure watermarks

GTZC configuration and block-based SRAM protection

Configuring secure access to peripherals

Building and running the example

Enabling TrustZone-M

Secure application entry point

Compiling and linking secure-world applications

Compiling and linking non-secure applications

Inter-domain transitions

Summary

Index

Other Books You May Enjoy

Preface

Embedded systems have become increasingly popular in the last two decades thanks to the technological progress made by microelectronics manufacturers and designers, which has aimed to increase computing power and decrease the size of the logic of microprocessors and peripherals.

Designing, implementing, and integrating the software components for these systems requires a direct approach to the hardware functionalities in most cases, where tasks are implemented in a single thread and there is no operating system to provide abstractions to access CPU features and external peripherals. For this reason, embedded development is considered a domain on its own in the universe of software development, in which the developer’s approach and workflow need to be adapted accordingly.

This book briefly explains the hardware architecture of a typical embedded system, introduces the tools and methodologies needed to get started with the development of a target architecture, and then guides readers through the interaction with system features and peripherals. Some areas, such as energy efficiency and connectivity, are addressed in more detail to give a closer view of the techniques used to design low-power and connected systems. Later in the book, a more complex design, incorporating a (simplified) real-time operating system, is built from the bottom up, starting from the implementation of single system components. Finally, in this second edition, we have added a detailed analysis of the implementation of TrustZone-M, the TEE technology introduced by ARM as part of its latest family of embedded microcontrollers.

The discussion often focuses on specific security and safety mechanisms by suggesting specific technologies aimed at improving the robustness of the system against programming errors in the application code, or even malicious attempts to compromise its integrity.

Who this book is for

If you’re a software developer or designer who wants to learn about embedded programming, this is the book for you. You’ll also find this book useful if you’re a beginner or less experienced embedded programmer willing to expand your knowledge of embedded systems. More experienced embedded software engineers may find this book useful for refreshing their knowledge of the internals of device drivers, memory safety, secure data transfers, privilege separation, and secure execution domains.

What this book covers

Chapter 1, Embedded Systems – A Pragmatic Approach, is an introduction to microcontroller-based embedded systems. The scope of the book is identified, from a broader definition of “embedded systems” to the actual domain that will be analyzed – 32-bit microcontrollers with physical memory mapping.

Chapter 2, Work Environment and Workflow Optimization, outlines the tools used and the development workflow. This is an introduction to the toolchain, debuggers, and emulators that can be used to produce code in a binary format that can be uploaded and run on the target platform.

Chapter 3, Architectural Patterns, is all about the strategies and development methodologies for collaborative development and testing. This chapter proposes a description of the processes that are typically used while developing and testing software for embedded systems.

Chapter 4, The Boot-Up Procedure, analyzes the boot phase of an embedded system, boot stages, and bootloaders. It contains a detailed description of the bring-up code and the mechanisms used to separate the software into several boot stages.

Chapter 5, Memory Management, suggests some optimal strategies for memory management by pointing out common pitfalls and explaining how to avoid memory errors that can result in unpredictable or bad behavior in the application code.

Chapter 6, General-Purpose Peripherals, walks through accessing GPIO pins and other generic integrated peripherals. This is the first interaction of the target platform with the outside world, using electric signals to perform simple input/output operations.

Chapter 7, Local Bus Interfaces, guides you through the integration of serial bus controllers (UART, SPI, and I2C). A code-oriented, detailed analysis of the most common bus communication protocols is introduced by explaining the code required to interact with the transceivers commonly available in embedded systems.

Chapter 8, Power Management and Energy Saving, explores the techniques available to reduce power consumption in energy-efficient systems. Designing low-power and ultra-low-power embedded systems requires specific steps to be performed for reducing energy consumption while running the required tasks.

Chapter 9, Distributed Systems and IoT Architecture, introduces the available protocols and interfaces required to build distributed and connected systems. IoT systems need to communicate with remote endpoints using standard network protocols that are implemented using third-party libraries. Particular attention is dedicated to securing communication between endpoints using secure sockets.

Chapter 10, Parallel Tasks and Scheduling, explains the infrastructure of a multitasking operating system through the implementation of a real-time task scheduler. This chapter proposes three approaches for implementing operating systems for microcontrollers from scratch, using different schedulers (cooperative, pre-emptive, and safe).

Chapter 11, Trusted Execution Environment, describes the TEE mechanisms typically available on embedded systems and provides an example of running secure and non-secure domains using ARM TrustZone-M. On modern microcontrollers, TEE provides the opportunity to secure specific areas of memory or peripherals by limiting their access from the non-secure execution domain.

To get the most out of this book

It is expected that you are proficient in the C language and understand how computer systems work. A GNU/Linux development machine is required to apply the concepts explained. Going through the example code provided is sometimes necessary to fully understand the mechanisms implemented. You are encouraged to modify, improve, and reuse the examples provided, applying the suggested methodologies.

Additional usage instructions for the required tools are available in Chapter 2, Work Environment and Workflow Optimization.

If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book’s GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.

Download the example code files

You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Embedded-Systems-Architecture-Second-Edition. If there’s an update to the code, it will be updated in the GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Download the color images

We also provide a PDF file that has color images of the screenshots and diagrams used in this book. You can download it here: https://packt.link/kVMr1.

Conventions used

There are a number of text conventions used throughout this book.

Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: “A single configuration file must be provided from the command-line invocation, with several platforms and development board configurations provided under the /scripts directory.”

A block of code is set as follows:

  /* Jump to non secure app_entry */
  asm volatile("mov r12, %0" ::"r"
     ((uint32_t)app_entry - 1));
  asm volatile("blxns   r12" );

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

   Secure Area 1:
     SECWM1_PSTRT : 0x0  (0x8000000)
     SECWM1_PEND  : 0x39  (0x8039000)

Any command-line input or output is written as follows:

$ renode /opt/renode/scripts/single-node/stm32f4_discovery.resc

Commands for the debugger console are written as follows:

    > add-symbol-file app.elf 0x1000
    > bt full

Tips or important notes

Appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, email us at [email protected] and mention the book title in the subject of your message.

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.

Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Share your thoughts

Once you’ve read Embedded Systems Architecture, Second Edition, we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.

Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.

Download a free PDF copy of this book

Thanks for purchasing this book!

Do you like to read on the go but are unable to carry your print books everywhere?

Is your eBook purchase not compatible with the device of your choice?

Don’t worry, now with every Packt book you get a DRM-free PDF version of that book at no cost.

Read anywhere, any place, on any device. Search, copy, and paste code from your favorite technical books directly into your application.

The perks don’t stop there, you can get exclusive access to discounts, newsletters, and great free content in your inbox daily!

Follow these simple steps to get the benefits:

Scan the QR code or visit the link below:

https://packt.link/free-ebook/9781803239545

Submit your proof of purchase
That’s it! We’ll send your free PDF and other benefits to your email directly

Part 1 – Introduction to Embedded Systems Development

This part gives a bird’s eye view of embedded development, explaining how it differs from other technical fields that developers may be familiar with. The second chapter helps transform a developer’s workstation into an actual hardware/software development lab and optimizes the steps needed to develop, test, debug, and deploy embedded software.

This part has the following chapters:

Chapter 1, Embedded Systems – A Pragmatic Approach
Chapter 2, Work Environment and Workflow Optimization

1

Embedded Systems – A Pragmatic Approach

Designing and writing software for embedded systems poses a different set of challenges than traditional high-level software development.

This chapter provides an overview of these challenges and introduces the basic components and the platform that will be used as a reference in this book.

In this chapter, we will discuss the following topics:

Domain definition
General-purpose input/output (GPIO)
Interfaces and peripherals
Connected systems
Introduction to isolation mechanisms
The reference platform

Domain definition

Embedded systems are computing devices that perform specific, dedicated tasks with no direct or continued user interaction. Due to the variety of markets and technologies, these devices come in different shapes and sizes, but most of them are small and have a limited amount of resources.

In this book, the concepts and the building blocks of embedded systems will be analyzed through the development of the software components that interact with their resources and peripherals. The first step is to define the scope for the validity of the techniques and the architectural patterns explained in this book, within the broader definition of embedded systems.

Embedded Linux systems

One part of the embedded market relies on devices with enough power and resources to run a variant of the GNU/Linux OS. These systems, often referred to as embedded Linux, are outside the scope of this book, as their development includes different strategies of design and integration of the components. A typical hardware platform that is capable of running a system based on the Linux kernel is equipped with a reasonably large amount of RAM, up to a few gigabytes, and sufficient storage space on board to store all the software components provided in the GNU/Linux distribution.

Additionally, for the Linux memory management to provide separate virtual address spaces to each process on the system, the hardware must be equipped with a memory management unit (MMU), a hardware component that assists the OS in translating virtual addresses into physical addresses at runtime.
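To make the role of the MMU more concrete, the following is a minimal sketch of address translation through a single-level page table. It is a toy model rather than a description of any real MMU: the 4 KiB page size, the 16-page address space, and the names page_entry and translate are assumptions chosen for illustration.

```c
#include <stdint.h>
#include <stdbool.h>

#define PAGE_SHIFT 12u                 /* 4 KiB pages (assumed) */
#define PAGE_SIZE  (1u << PAGE_SHIFT)
#define NUM_PAGES  16u                 /* toy virtual address space: 64 KiB */

/* One entry per virtual page: the physical frame it maps to, if any. */
struct page_entry {
    uint32_t frame;   /* physical frame number */
    bool     valid;   /* false: an access would raise a fault */
};

/* Translate a virtual address; returns false if the page is unmapped. */
bool translate(const struct page_entry table[NUM_PAGES],
               uint32_t vaddr, uint32_t *paddr)
{
    uint32_t page   = vaddr >> PAGE_SHIFT;
    uint32_t offset = vaddr & (PAGE_SIZE - 1u);

    if (page >= NUM_PAGES || !table[page].valid)
        return false;
    /* The offset inside the page is preserved; only the page number changes. */
    *paddr = (table[page].frame << PAGE_SHIFT) | offset;
    return true;
}
```

Giving each process its own table is what lets two processes use the same virtual address without touching the same physical memory.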

This class of devices presents different characteristics that are often overkill for building tailored solutions, which can use a much simpler design and reduce the production costs of single units.

Hardware manufacturers and chip designers have researched new techniques to improve the performance of microcontroller-based systems. In the past few decades, they have introduced new generations of platforms that would cut hardware costs, firmware complexity, size, and power consumption to provide a set of features that are most interesting for the embedded market.

Due to their specifications, in some real-life scenarios, embedded systems must be able to execute a series of tasks within a short, measurable, and predictable amount of time. These kinds of systems are called real-time systems and differ from the approach of multi-task computing, which is used in desktops, servers, and mobile phones.
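As an illustration of what "measurable and predictable" can mean in code, here is a small sketch that checks whether a task completed within its time budget, using two readings of a free-running 32-bit tick counter. The counter and the function names are hypothetical; on a real target the readings would come from a hardware timer.

```c
#include <stdint.h>
#include <stdbool.h>

/* Ticks elapsed between two readings of a free-running 32-bit counter.
 * Unsigned subtraction stays correct across a single counter wraparound. */
uint32_t ticks_elapsed(uint32_t start, uint32_t end)
{
    return end - start;
}

/* True if the work finished within its time budget, expressed in ticks. */
bool deadline_met(uint32_t start, uint32_t end, uint32_t budget)
{
    return ticks_elapsed(start, end) <= budget;
}
```

A real-time design would typically verify such budgets against the worst-case execution path, not just a typical run.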

Real-time processing is a goal that is extremely hard, if not impossible, to reach on embedded Linux platforms. The Linux kernel is not designed for hard real-time processing, and even if patches are available to modify the kernel scheduler to help meet these requirements, the results are not comparable to bare-metal, constrained systems that are designed with this purpose in mind.

Some other application domains, such as battery-powered and energy-harvesting devices, can benefit from the low power consumption capabilities of smaller embedded devices and the energy efficiency of the wireless communication technologies often integrated into embedded connected devices. The larger amount of resources and the increased hardware complexity of Linux-based systems often prevent them from scaling down enough in energy consumption, or require significant effort to reach similar power figures.

The microcontroller-based systems analyzed in this book are 32-bit systems, capable of running a single-threaded, bare-metal application as well as integrating minimalist real-time OSs. Such systems are very popular in the industrial manufacturing of the embedded devices that we use daily to accomplish specific tasks, and they are increasingly adopted as more generic, multi-purpose development platforms.

Low-end 8-bit microcontrollers

In the past, 8-bit microcontrollers dominated the embedded market. The simplicity of their design allows us to write small applications that accomplish a set of predefined tasks, but they are too simple, and usually equipped with too few resources, to implement a full embedded system. This is especially true now that 32-bit microcontrollers have evolved to cover all the use cases for these devices within the same range of price, size, and power consumption.

Nowadays, 8-bit microcontrollers are mostly relegated to the market of educational platform kits, aimed at introducing hobbyists and newcomers to the basics of software development on electronic devices. 8-bit platforms are not covered in this book because they lack the characteristics that allow advanced system programming, multithreading, and advanced features to be developed to build professional embedded systems.

In the context of this book, the term embedded systems is used to indicate a class of systems running on microcontroller-based hardware architecture, offering constrained resources but allowing real-time systems to be built through features provided by the hardware architecture to implement system programming.

Hardware architecture

The architecture of an embedded system is centered around its microcontroller, also sometimes referred to as the microcontroller unit (MCU). This is typically a single integrated circuit containing the processor, RAM, flash memory, serial receivers and transmitters, and other core components. The market offers many different choices among architectures, vendors, price ranges, features, and integrated resources. These are typically designed to be inexpensive, low-resource, low-energy consuming, self-contained systems on a single integrated circuit, which is the reason why they are often referred to as System-on-Chip (SoC).

Due to the variety of processors, memories, and interfaces that can be integrated, there is no actual reference architecture for microcontrollers. Nevertheless, some architectural elements are common across a wide range of models and brands, and even across different processor architectures.

Some microcontrollers are dedicated to specific applications and expose a particular set of interfaces to communicate to peripherals and the outside world. Others are focused on providing solutions with reduced hardware costs, or with very limited energy consumption.

Nevertheless, the following set of components is built into almost every microcontroller:

Microprocessor
RAM
Flash memory
Serial transceivers

Additionally, more and more devices are capable of accessing a network to communicate with other devices and gateways. Some microcontrollers provide well-established standards, such as Ethernet or Wi-Fi interfaces, or protocols specifically designed to meet the constraints of embedded systems, such as sub-GHz radio interfaces or a Controller Area Network (CAN) bus, partially or fully implemented within the IC.

All the components must share a bus line with the processor, which is responsible for coordinating the logic. The RAM, flash memory, and control registers of the transceivers are all mapped in the same physical address space:

Figure 1.1 – A simplified block diagram of the components inside a generic microcontroller

The addresses where RAM and Flash Memory are mapped depend on the specific model and are usually provided in the datasheet. A microcontroller can run code in its native machine language; that is, a sequence of instructions conveyed into a binary file that is specific to the architecture it is running on. By default, compilers provide a generic executable file as the output of the compilation and assembly operations, which needs to be converted into a format that can be executed by the target.
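The idea of a single physical address space can be sketched with a helper that classifies an address by the region it falls into. The sketch assumes a layout made of 512 MB regions, as commonly found on Cortex-M cores; the enum and function names are illustrative, and the exact addresses used by a given part must be taken from its datasheet.

```c
#include <stdint.h>

enum mem_region {
    REGION_CODE,        /* flash and other executable memory */
    REGION_SRAM,        /* on-chip RAM */
    REGION_PERIPHERAL,  /* control registers of on-chip peripherals */
    REGION_EXTERNAL,    /* external RAM and devices */
    REGION_SYSTEM       /* core system registers */
};

/* Classify an address by its 512 MB region (top three address bits).
 * Assumed layout: code at 0x00000000, SRAM at 0x20000000,
 * peripherals at 0x40000000, system region at 0xE0000000. */
enum mem_region region_of(uint32_t addr)
{
    switch (addr >> 29) {
    case 0:  return REGION_CODE;
    case 1:  return REGION_SRAM;
    case 2:  return REGION_PERIPHERAL;
    case 7:  return REGION_SYSTEM;
    default: return REGION_EXTERNAL;
    }
}
```

Under these assumptions, an address such as 0x08000000 falls in the code region, while 0x40020000 refers to a peripheral register rather than to memory.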

The processor is designed to execute instructions stored in its own binary format directly from RAM as well as from its internal flash memory. The flash is usually mapped starting from address zero or from another well-known address specified in the microcontroller manual. The CPU can fetch and execute code from RAM faster, but the final firmware is stored in flash memory, which is larger than the RAM on almost all microcontrollers and retains its contents across power cycles and reboots.

Compiling a software operating environment for an embedded microcontroller and loading it onto the flash memory requires a host machine, which is a specific set of hardware and software tools. Some knowledge about the target device’s characteristics is also needed to instruct the compiler to organize the symbols within the executable image. For many valid reasons, C is the most popular language in embedded software, although not the only available option. Higher-level languages, such as Rust and C++, can produce embedded code when combined with a specific embedded runtime, or even in some cases by entirely removing the runtime support from the language.

Note

This book will focus entirely on C code because it abstracts less than any other high-level language, thus making it easier to describe the behavior of the underlying hardware while looking at the code.

All modern embedded systems platforms also have at least one mechanism (such as JTAG) for debugging purposes and uploading the software to the flash. When the debugging interface is accessed from the host machine, a debugger can interact with the breakpoint unit in the processor, interrupting and resuming the execution, and can also read and write from any address in memory.

A significant part of embedded programming is communicating with the peripherals through the interfaces that the MCU exposes. Embedded software development requires basic knowledge of electronics, the ability to understand schematics and datasheets, and confidence with measurement tools, such as logic analyzers or oscilloscopes.
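In practice, communicating with a peripheral usually comes down to reading and writing its control registers. The sketch below shows the general pattern for a serial transmitter. The register layout, the bit position, and the names uart_regs and uart_putc are hypothetical and do not correspond to any specific MCU; the real definitions come from the reference manual of the target.

```c
#include <stdint.h>

/* Hypothetical register layout of a serial transmitter. */
struct uart_regs {
    volatile uint32_t status;  /* bit 0 set: transmitter ready */
    volatile uint32_t data;    /* writing a byte here starts transmission */
};

#define UART_STATUS_TX_READY (1u << 0)

/* Send one byte. Returns 0 on success, or -1 if the transmitter
 * did not become ready within max_polls polling attempts. */
int uart_putc(struct uart_regs *uart, uint8_t c, uint32_t max_polls)
{
    while ((uart->status & UART_STATUS_TX_READY) == 0u) {
        if (max_polls-- == 0u)
            return -1;
    }
    uart->data = c;
    return 0;
}
```

On a target, the pointer would be obtained by casting the peripheral's base address from the datasheet to struct uart_regs *. Passing the pointer as a parameter also allows the same function to be exercised off-target against an ordinary structure in RAM.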

Understanding the challenges

Approaching embedded development means keeping the focus on the specifications as well as the hardware restrictions at all times. Embedded software development is a constant challenge that requires focusing on the most efficient way to perform a set of specific tasks while keeping the limited resources available firmly in mind. There are several compromises to deal with, which are uncommon in other environments. Here are some examples:

There might not be enough space in the flash to implement a new feature
There might not be enough RAM to store complex structures or make copies of large data buffers
The processor might not be fast enough to accomplish all the required calculations and data processing in due time
Battery-powered and resource-harvesting devices might require lower energy consumption to meet lifetime expectations
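As a small illustration of the RAM constraint in the list above, a transformation can often be applied to a buffer in place instead of building a modified copy. The following sketch reverses a buffer using a single byte of scratch space; the function name is illustrative, and the technique is a generic one rather than something prescribed by the text.

```c
#include <stddef.h>
#include <stdint.h>

/* Reverse a buffer in place: no second buffer is needed,
 * only one byte of temporary storage. */
void reverse_in_place(uint8_t *buf, size_t len)
{
    if (len < 2u)
        return;
    for (size_t i = 0, j = len - 1u; i < j; i++, j--) {
        uint8_t tmp = buf[i];
        buf[i] = buf[j];
        buf[j] = tmp;
    }
}
```

The trade-off is that the original contents are lost, so this approach only fits when the caller no longer needs them.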

Moreover, PC and mobile OSs make large use of the MMU, a component of the processor that allows runtime translations between physical and virtual addresses.

The MMU is a necessary abstraction to implement address space separation among the tasks, as well as between the tasks and the kernel itself. Embedded microcontrollers do not have an MMU, and usually lack the amount of non-volatile memory required to store kernels, applications, and libraries. For this reason, embedded systems are often running in a single task, with the main loop performing all the data processing and communication in a specific order. Some devices can run embedded OSs, which are far less complex than their PC counterparts.

Application developers often see the underlying system as a commodity, while embedded development often means that the entire system has to be implemented from scratch, from the boot procedure up to the application logic. In an embedded environment, the various software components are more closely related to each other because of the lack of more complex abstractions, such as memory separations between the processes and the OS kernel.

A developer approaching embedded systems for the first time might find testing and debugging on some of the systems a bit more intricate than just running the software and reading out the results. This becomes especially true in those systems that have been designed with little or no human interaction interfaces.

A successful approach requires a healthy workflow, which includes well-defined test cases, a list of key performance indicators coming from the analysis of the specifications to identify possibilities of trade-offs, several tools and procedures at hand to perform all the needed measurements, and a well-established and efficient prototyping phase.

In this context, security deserves some special consideration. As usual, when writing code at the system level, it is wise to keep in mind the system-wide consequences of possible faults. Most embedded application code runs with extended privileges on the hardware, and a single task misbehaving can affect the stability and integrity of the entire firmware. As we will see, some platforms offer specific memory protection mechanisms and built-in privilege separation, which are useful for building fail-safe systems, even in the absence of a full OS based on separating process address spaces.

Multithreading

One of the advantages of using microcontrollers designed to build embedded systems is the possibility to run logically separated tasks within separate execution units by time-sharing the resources.

The most popular type of design for embedded software is based on a single loop-based sequential execution model, where modules and components are connected to expose callback interfaces. However, modern microcontrollers offer features and core logic characteristics that can be used by system developers to build a multitasking environment to run logically separated applications.
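The loop-based model can be sketched in a few lines of C. This is only an illustration under stated assumptions: the names module_register, main_loop_once, and counter_poll, as well as the MAX_MODULES limit, are hypothetical and do not come from any specific SDK.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MODULES 4u   /* hypothetical limit for this sketch */

typedef void (*poll_cb)(void *ctx);

struct module {
    poll_cb poll;   /* callback invoked once per loop iteration */
    void *ctx;      /* module-private state */
};

static struct module modules[MAX_MODULES];
static size_t module_count;

/* Register a module callback; returns 0 on success, -1 on error. */
int module_register(poll_cb poll, void *ctx)
{
    if (poll == NULL || module_count >= MAX_MODULES)
        return -1;
    modules[module_count].poll = poll;
    modules[module_count].ctx = ctx;
    module_count++;
    return 0;
}

/* One iteration of the main loop: every module runs in registration order. */
void main_loop_once(void)
{
    for (size_t i = 0; i < module_count; i++) {
        assert(modules[i].poll != NULL);
        modules[i].poll(modules[i].ctx);
    }
}

/* Example module: counts how many times it has been polled. */
void counter_poll(void *ctx)
{
    int *count = ctx;
    (*count)++;
}
```

On a real target, main() would typically initialize the hardware, register the modules, and then call main_loop_once() forever inside a while (1) loop.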

These features are particularly handy in the approach to more complex real-time systems, and they help us understand the possibilities of the implementation of safety models based on process isolation and memory segmentation.

RAM

“640 KB of memory ought to be enough for everyone”

– Bill Gates (founder and former director of Microsoft)

This famous statement has been cited many times in the past three decades to underline the progress in technology and the outstanding achievements of the PC industry. While it may sound like a joke for many software engineers out there, it is still in these figures that embedded programming has to be thought about, more than 30 years after MS-DOS was initially released.

Although most embedded systems are capable of breaking that limit today, especially due to the availability of external DRAM interfaces, the simplest devices that can be programmed in C may have as little as 4 KB of RAM available to implement the entire system logic. This has to be taken into account when designing an embedded system, by estimating the amount of memory potentially needed for all the operations that the system has to perform, and the buffers that may be used at any time to communicate with peripherals and nearby devices.
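One way to make this estimate explicit is to declare the long-lived buffers statically and let the compiler check the total against a budget. The following is a minimal sketch: the 4 KB RAM size, the 1 KB stack reservation, and the buffer names are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RAM_TOTAL_BYTES 4096u   /* assumed device RAM */
#define STACK_RESERVED  1024u   /* assumed worst-case stack depth */

/* Long-lived buffers are static, so their size is known at build time. */
struct system_buffers {
    uint8_t  uart_rx[256];
    uint8_t  uart_tx[256];
    uint8_t  sensor_frame[64];
    uint16_t adc_samples[128];
};

/* Fail the build, not the device, if the static footprint exceeds the budget. */
static_assert(sizeof(struct system_buffers) + STACK_RESERVED <= RAM_TOTAL_BYTES,
              "static buffers and stack do not fit in RAM");

static struct system_buffers buffers;

/* Bytes left for everything else (other globals, a heap if any). */
size_t ram_headroom(void)
{
    return RAM_TOTAL_BYTES - STACK_RESERVED - sizeof(buffers);
}
```

If a new feature adds a buffer that does not fit, the build stops with the message in the static assertion instead of producing a firmware image that fails at runtime.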

The memory model at the system level is simpler than that of PCs and mobile devices. Memory access is typically done at the physical level, so all the pointers in your code hold the physical location of the data they are pointing to. In modern computers, the OS, with the help of the MMU, is responsible for translating the virtual addresses seen by the running tasks into physical addresses.

The advantage of the physical-only memory access on those systems that do not have an MMU is the reduced complexity of having to deal with address translations while coding and debugging. On the other hand, some of the features that are implemented by any modern OS, such as process swapping and dynamically resizing address spaces through memory relocation, become cumbersome and sometimes impossible.

Handling memory is particularly important in embedded systems. Programmers who are used to writing application code expect a certain level of protection to be provided by the underlying OS. A virtual address space does not allow memory areas to overlap, and the OS can easily detect unauthorized memory accesses and segmentation violations, so it promptly terminates the process and avoids having the whole system compromised.

On embedded systems, especially when writing bare-metal code, the boundaries of each address pool must be checked manually. Accidentally modifying a few bits in the wrong memory, or even accessing a different area of memory, may result in a fatal, irrevocable error. The entire system may hang, or, in the worst case, become unpredictable. A safe approach is required when handling memory in embedded systems, in particular when dealing with life-critical devices. Identifying memory errors too late in the development process is complex and often requires more resources than forcing yourself to write safe code and protecting the system from a programmer’s mistakes.
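As an illustration of a manual boundary check, the following sketch wraps every write to a memory area in a function that refuses out-of-range accesses. The struct mem_pool type and the pool_write function are hypothetical names introduced for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct mem_pool {
    uint8_t *base;   /* start of the region owned by this pool */
    size_t size;     /* region length in bytes */
};

/* Copy len bytes into the pool at offset, refusing anything that would
 * cross the pool boundary. Returns 0 on success, -1 on a rejected access. */
int pool_write(struct mem_pool *pool, size_t offset, const void *src, size_t len)
{
    if (pool == NULL || src == NULL)
        return -1;
    assert(pool->base != NULL);
    /* Written this way so that offset + len can never overflow. */
    if (offset > pool->size || len > pool->size - offset)
        return -1;
    memcpy(pool->base + offset, src, len);
    return 0;
}
```

The caller has to handle the -1 case explicitly, which is the point: without an OS to terminate the offending task, a rejected write is far cheaper than a silent corruption of a neighboring area.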

Proper memory-handling techniques will be explained in Chapter 5, Memory Management.

Flash memory

In a server or a personal computer, the executable applications and libraries reside in storage devices. At the beginning of the execution, they are accessed, transformed, possibly uncompressed, and stored in RAM before the execution starts.

The firmware of an embedded device is, in general, one single binary file containing all the software components, which can be transferred to the internal flash memory of the MCU. Since the flash memory is directly mapped to a fixed address in the memory space, the processor is capable of sequentially fetching and executing single instructions from it with no intermediate steps. This mechanism is called execute in place (XIP).

All non-modifiable sections on the firmware do not need to be loaded in memory and are accessible through direct addressing in the memory space. This includes not only the executable instructions but also all the variables that are marked as constant by the compiler. On the other hand, supporting XIP requires a few extra steps when preparing the firmware image to be stored in flash, and the linker needs to be instructed about the different memory-mapped areas on the target.
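In C, this usually comes down to the const qualifier. In the sketch below, the lookup table is declared const, so a toolchain such as GCC can place it in a read-only section, which on an XIP target typically stays in flash and consumes no RAM. The table and the popcount8 function are illustrative names, and the actual placement always depends on the linker configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits set in each 4-bit value. Being const, the table can be
 * placed in a read-only section and read directly from flash. */
static const uint8_t bit_count_table[16] = {
    0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4
};

static_assert(sizeof(bit_count_table) == 16, "one entry per 4-bit value");

/* Count the bits set in a byte using two lookups in the table. */
uint8_t popcount8(uint8_t value)
{
    return (uint8_t)(bit_count_table[value & 0x0Fu] + bit_count_table[value >> 4]);
}
```

Dropping the const qualifier would turn the table into initialized data, which needs both its initial values in flash and a working copy in RAM.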

The internal flash memory that is mapped in the address space of the microcontroller is not accessible for writing. Altering the content of the internal flash can be done only by using block-based access, due to the hardware characteristics of flash memory devices. Before changing the value of a single byte in flash memory, the whole block containing it must be erased and rewritten. The mechanism offered by most manufacturers to access block-based flash memory for writing is known as In-Application Programming (IAP). Some filesystem implementations take care of abstracting write operations on a block-based flash device, by creating a temporary copy of the block where the write operation is performed.
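The erase-before-write constraint can be shown with a small host-side simulation. This is not a real IAP driver: a RAM array stands in for one flash sector, the flash_sim_* names are hypothetical, and the model assumes the common NOR flash behavior where erasing sets all bits to 1 and programming can only clear them.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 16u   /* tiny on purpose; real sectors are much larger */

static uint8_t flash_sector[SECTOR_SIZE];   /* stands in for one flash sector */

/* Erasing sets every bit of the sector to 1. */
void flash_sim_erase(void)
{
    memset(flash_sector, 0xFF, SECTOR_SIZE);
}

/* Programming can only clear bits (1 to 0), modelled with a bitwise AND. */
void flash_sim_program(uint32_t offset, uint8_t value)
{
    assert(offset < SECTOR_SIZE);
    flash_sector[offset] &= value;
}

uint8_t flash_sim_read(uint32_t offset)
{
    assert(offset < SECTOR_SIZE);
    return flash_sector[offset];
}

/* Change one byte: copy the sector to RAM, patch the copy, erase the
 * sector, then program every byte back. */
void flash_sim_update_byte(uint32_t offset, uint8_t value)
{
    uint8_t copy[SECTOR_SIZE];

    assert(offset < SECTOR_SIZE);
    memcpy(copy, flash_sector, SECTOR_SIZE);
    copy[offset] = value;
    flash_sim_erase();
    for (uint32_t i = 0; i < SECTOR_SIZE; i++)
        flash_sim_program(i, copy[i]);
}
```

Programming 0xF0 over a byte that already holds 0x0F yields 0x00 in this model, because no bit can go back to 1 without an erase. The temporary copy in flash_sim_update_byte is also why the sector size matters for RAM usage.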

While selecting the components for a microcontroller-based solution, it is vital to match the size of the flash memory to the space required by the firmware. The flash is often one of the most expensive components in the MCU, so for deployment on a large scale, choosing an MCU with a smaller flash could be more cost-effective. Developing software with code size in mind is not very usual nowadays within other domains, but it may be required when trying to fit multiple features in such little storage. Finally, compiler optimizations may exist on specific architectures to reduce code size when building the firmware and linking its components.

Additional non-volatile memories that reside outside of the MCU silicon can typically be accessed using specific interfaces, such as the Serial Peripheral Interface. External flash memories use different technologies than internal flash, which is designed to be fast and execute code in place. While being generally more dense and less expensive, external flash memories do not allow direct memory mapping in the physical address space, which makes them unsuitable for storing firmware images. This is because it would be impossible to execute the code fetching the instructions sequentially unless a mechanism is used to load the executable symbols in RAM – read access on these kinds of devices is performed one block at a time. On the other hand, write access may be faster compared to IAP, making these kinds of non-volatile memory devices ideal for storing data that is retrieved at runtime in some designs.

General-purpose input/output (GPIO)

The most basic functionality that can be achieved with any microcontroller is the possibility to control signals on specific pins of the integrated circuit. The microcontroller can turn a digital output on or off, which corresponds to a reference voltage to be applied to the pin when the value assigned to it is 1, and zero volts when the value is 0. In the same way, a pin can be used to detect a 1 or a 0 when the pin is configured as input. The software will read the digital value “1” when the voltage applied to it is higher than a certain threshold.
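At the software level, driving or reading a pin usually means setting, clearing, or testing one bit in a memory-mapped register. The sketch below shows the bit manipulation only. The variable fake_gpio_odr is a stand-in so that the code can run on a host machine; on a real MCU the register would live at a fixed address documented by the vendor, and the function names are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* A plain variable standing in for a GPIO output data register. */
static volatile uint32_t fake_gpio_odr;

/* Drive one pin of the port high (level != 0) or low (level == 0). */
void gpio_write(volatile uint32_t *reg, unsigned pin, int level)
{
    assert(pin < 32u);
    if (level)
        *reg |= (UINT32_C(1) << pin);
    else
        *reg &= ~(UINT32_C(1) << pin);
}

/* Return the current value (0 or 1) of one pin. */
int gpio_read(const volatile uint32_t *reg, unsigned pin)
{
    assert(pin < 32u);
    return (int)((*reg >> pin) & 1u);
}
```

The volatile qualifier tells the compiler that every access must actually reach the register, which matters as soon as the pointer refers to real hardware.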

ADC and DAC

Some chips have onboard ADC controllers, which are capable of sensing the voltage that is applied to the pin and sampling it. This is often used to acquire measurements from input peripherals providing a variable voltage as output. The embedded software will be able to read the voltage, with an accuracy that depends on the predefined range.
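The relationship between the raw sample, the resolution, and the reference voltage can be written as a short conversion function. This is a sketch assuming a linear converter whose full-scale value corresponds to the reference voltage; adc_raw_to_mv and its parameters are illustrative names.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a raw ADC sample to millivolts.
 * raw:     value read from the converter
 * vref_mv: reference voltage in millivolts
 * bits:    converter resolution (for example, 12) */
uint32_t adc_raw_to_mv(uint32_t raw, uint32_t vref_mv, unsigned bits)
{
    assert(bits >= 1u && bits <= 16u);
    uint32_t full_scale = (UINT32_C(1) << bits) - 1u;
    assert(raw <= full_scale);
    /* 64-bit intermediate so that raw * vref_mv cannot overflow. */
    return (uint32_t)(((uint64_t)raw * vref_mv) / full_scale);
}
```

With a 12-bit converter and a 3.3 V reference, one step corresponds to roughly 0.8 mV, which gives a feel for the accuracy that can be expected from the readings.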

A DAC controller is the inverse of an ADC controller, transforming a value on a microcontroller register into the corresponding voltage.

Timers and PWM

Microcontrollers may offer diverse ways to measure time. Often, there is at least one interface based on a countdown timer that can trigger an interrupt and automatically reset upon expiry.

GPIO pins configured as output can be programmed to output a square wave with a preconfigured frequency and duty cycle. This is called pulse-width modulation (PWM) and has several uses, from controlling output peripherals to dimming an LED or even playing an audible sound through a speaker.
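Frequency and duty cycle are usually expressed as two timer values: the number of timer counts in one period, and the number of counts during which the output stays high. The following sketch computes both. The struct pwm_config type and the pwm_compute function are hypothetical, and real timers add details such as prescalers that are left out here.

```c
#include <assert.h>
#include <stdint.h>

struct pwm_config {
    uint32_t period_ticks;    /* timer counts per PWM period */
    uint32_t compare_ticks;   /* counts during which the output is high */
};

/* timer_hz:     counting frequency of the timer
 * pwm_hz:       desired frequency of the square wave
 * duty_percent: portion of the period spent high, 0 to 100 */
struct pwm_config pwm_compute(uint32_t timer_hz, uint32_t pwm_hz, uint32_t duty_percent)
{
    struct pwm_config cfg;

    assert(pwm_hz > 0u && pwm_hz <= timer_hz);
    assert(duty_percent <= 100u);
    cfg.period_ticks = timer_hz / pwm_hz;
    cfg.compare_ticks = (uint32_t)(((uint64_t)cfg.period_ticks * duty_percent) / 100u);
    return cfg;
}
```

For example, a timer counting at 1 MHz produces a 1 kHz wave with a period of 1,000 counts, and a 25% duty cycle keeps the output high for 250 of them.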

More details about GPIO, interrupt timers, and watchdogs will be explored in Chapter 6, General-Purpose Peripherals.

Interfaces and peripherals

To communicate with peripherals and other microcontrollers, several de facto standards are well established in the embedded world. Some of the external pins of the microcontroller can be programmed to carry out communication with external peripherals using specific protocols. A few of the common interfaces available on most architectures are as follows:

Asynchronous UART-based serial communication
Serial Peripheral Interface (SPI) bus
Inter-integrated circuit (I2C) bus
Universal Serial Bus (USB)

Let’s review each in detail.

Asynchronous UART-based serial communication

Asynchronous communication is provided by the Universal Asynchronous Receiver-Transmitter (UART). These kinds of interfaces, commonly known as serial ports, are called asynchronous because they do not need to share a clock signal to synchronize the sender and the receiver, but rather work on predefined clock rates that can be aligned while the communication is ongoing. Microcontrollers may contain multiple UARTs that can be attached to a specific set of pins upon request. Asynchronous communication is provided by UART as a full-duplex channel, through two independent wires, connecting the RX pin of each endpoint to the TX pin on the opposite side.

To understand each other, the systems at the two endpoints must set up the UART using the same parameters. This includes the framing of the bytes on the wire and the bit rate. All of these parameters have to be known in advance by both endpoints to correctly establish a communication channel. Despite being simpler than the other types of serial communication, UART-based serial communication is still widely used in electronic devices, particularly as an interface for modems and GPS receivers. Furthermore, using TTL-to-USB serial converters, it is easy to connect a UART to a console on the host machine, which is often handy for providing log messages.
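To make the idea of framing concrete, the sketch below builds the bits sent for one byte in the widespread 8N1 configuration (8 data bits, no parity, 1 stop bit) and computes how long the frame occupies the wire at a given baud rate. The choice of 8N1 and the function names are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Build the 10 bits sent for one byte with 8N1 framing. Bit 0 of the result
 * is transmitted first: a start bit (0), the 8 data bits starting from the
 * least significant one, and a stop bit (1). */
uint16_t uart_frame_8n1(uint8_t data)
{
    return (uint16_t)(((uint16_t)data << 1) | (1u << 9));
}

/* Time on the wire for one 10-bit frame, in microseconds (rounded down). */
uint32_t uart_frame_time_us(uint32_t baud)
{
    assert(baud > 0u);
    return (10u * 1000000u) / baud;
}
```

If the receiver is configured with a different number of data bits, parity, or baud rate, it samples the same line at the wrong moments and decodes garbage, which is why both endpoints must agree on these parameters in advance.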

SPI

A different approach to classic UART-based communication is SPI. Introduced in the late 1980s, this technology aimed to replace asynchronous serial communication toward peripherals by introducing several improvements:

Serial clock line to synchronize the endpoints
Master-slave protocol
One-to-many communication over the same three-wire bus

The master device, usually the microcontroller, shares the bus with one or more slaves. To trigger the communication, a separate slave select (SS) signal is used to address each slave connected to the bus. The bus uses two independent signals for data transfer, one per direction, and a shared clock line that synchronizes the two ends of the communication. Due to the clock line being generated by the master, the data transfer is more reliable, making it possible to achieve higher bitrates than ordinary UART. One of the keys to the continued success of SPI over multiple generations of microcontrollers is the low complexity required for the design of slaves, which can be as simple as a single shift register. SPI is commonly used in sensor devices, LCDs, flash memory controllers, and network interfaces.
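The remark about shift registers can be illustrated with a host-side simulation of one byte transfer. Two 8-bit variables stand in for the shift registers of the master and the slave, and each loop iteration corresponds to one clock pulse. The function name and the most-significant-bit-first ordering are assumptions, as real devices can be configured in different ways.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simulate 8 clock pulses on an SPI bus. On each pulse, the master shifts
 * out its top bit on one data line while the slave shifts out its top bit
 * on the other, and each side shifts in the bit sent by its peer. */
void spi_exchange_byte(uint8_t *master, uint8_t *slave)
{
    assert(master != NULL && slave != NULL);
    for (int clk = 0; clk < 8; clk++) {
        uint8_t master_out = (uint8_t)((*master >> 7) & 1u);
        uint8_t slave_out  = (uint8_t)((*slave >> 7) & 1u);
        *master = (uint8_t)((*master << 1) | slave_out);
        *slave  = (uint8_t)((*slave << 1) | master_out);
    }
}
```

After eight pulses, the two registers have swapped their contents. This is why an SPI transfer always sends and receives at the same time, even when one of the two directions carries no useful data.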

I2C

I2C is slightly more complex, and that is because it is designed with a different purpose in mind: interconnecting multiple microcontrollers, as well as multiple slave devices, on the same two-wire bus. The two signals are serial clock (SCL) and serial data (SDA). Unlike SPI or UART, the bus is half-duplex, as the two directions of the flow share the same signal. Thanks to a 7-bit slave-addressing mechanism incorporated in the protocol, it does not require additional signals dedicated to selecting the slaves. Multiple masters are allowed on the same line, given that all the masters in the system follow the arbitration logic in the case of bus contention.
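With 7-bit addressing, the first byte of a transfer carries the slave address in its upper seven bits and the direction of the transfer in the lowest bit. The helper below builds that byte; the function and macro names are chosen for this example.

```c
#include <assert.h>
#include <stdint.h>

#define I2C_WRITE 0u
#define I2C_READ  1u

/* First byte sent after the START condition: the 7-bit address in
 * bits 7..1 and the read/write flag in bit 0. */
uint8_t i2c_address_byte(uint8_t addr7, unsigned rw)
{
    assert(addr7 <= 0x7Fu);
    assert(rw == I2C_WRITE || rw == I2C_READ);
    return (uint8_t)((addr7 << 1) | rw);
}
```

This explains why the same device is sometimes documented with two different addresses: a slave at 0x50 appears on the wire as 0xA0 for writes and 0xA1 for reads.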

USB

The USB protocol, originally designed to replace UART and include many protocols in the same hardware connector, is very popular in personal computers, portable devices, and a huge number of peripherals.

This protocol works in host-device mode, with one side of communication, the device, exposing services that can be used by the controller, on the host side. USB transceivers present in many microcontrollers can work in both modes. By implementing the upper layer of the USB standards, different types of devices can be emulated by the microcontroller, such as serial ports, storage devices, and point-to-point Ethernet interfaces, creating microcontroller-based USB devices that can be connected to a host system.

If the transceiver supports host mode, the embedded system can act as a USB host, and devices can be connected to it. In this case, the system should implement device drivers and applications to access the functionality provided by the device.

When both modes are implemented on the same USB controller, the transceiver works in on-the-go (OTG) mode, and selecting and configuring the desired mode can be done at runtime.

A more extended introduction to some of the most common protocols used for communicating with peripherals and neighboring systems will be provided in Chapter 7, Local Bus Interfaces.

Connected systems

An increasing number of embedded devices designed for different markets are now capable of network communication with their peers in the surrounding area or with gateways routing their traffic to a broader network or the internet. The term Internet of Things (IoT) has been used to describe the networks where those embedded devices can communicate using internet protocols.

This means that IoT devices can be addressed within the network in the same way as more complex systems, such as PCs or mobile devices, and most importantly, they use the transport layer protocols typical of internet communications to exchange data. TCP/IP is a suite of protocols standardized by the IETF, and it is the fabric of the infrastructure for the internet and other self-contained, local area networks.

The Internet Protocol (IP) provides network connectivity, but on the condition that the underlying link provides packet-based communication and mechanisms to control and regulate access to the physical media. Fortunately, many network interfaces meet these requirements. Alternative protocol families, which are not compatible with TCP/IP, are still in use in several distributed embedded systems, but a clear advantage of using the TCP/IP standard on the target is that, in the case of communication with non-embedded systems, there is no need for a translation mechanism to route the frames outside the scope of the LAN.

Besides the types of links that are widely used in non-embedded systems, such as Ethernet or wireless LAN, embedded systems can benefit from a wide choice of technologies that are specifically designed for the requirements introduced by IoT. New standards have been researched and put into effect to provide efficient communication for constrained devices, defining communication models to cope with specific resource usage limits and energy efficiency requirements.

Recently, new link technologies have been developed in the direction of lower bitrates and power consumption for wide-area network communication. These protocols are designed to provide narrow-band, long-range communication. The frame is too small to fit IP packets, so these technologies are mostly employed to transmit small payloads, such as periodic sensor data, or device configuration parameters if a bidirectional channel is available, and they require some form of gateway to translate the communication so that it can travel across the internet.
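When every byte counts, sensor data is typically packed into a compact binary layout rather than sent as text. The sketch below packs three values into 5 bytes. The layout (node ID, temperature in hundredths of a degree, battery voltage in millivolts, big-endian byte order) is entirely hypothetical and only meant to show the technique.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAYLOAD_LEN 5u   /* hypothetical layout: 1 + 2 + 2 bytes */

/* Pack a node ID, a signed temperature, and a battery voltage into a
 * fixed 5-byte payload. Multi-byte fields are stored big-endian. */
size_t sensor_pack(uint8_t *out, uint8_t node_id, int16_t temp_centi, uint16_t batt_mv)
{
    uint16_t t = (uint16_t)temp_centi;   /* two's complement bit pattern */

    assert(out != NULL);
    out[0] = node_id;
    out[1] = (uint8_t)(t >> 8);
    out[2] = (uint8_t)(t & 0xFFu);
    out[3] = (uint8_t)(batt_mv >> 8);
    out[4] = (uint8_t)(batt_mv & 0xFFu);
    return PAYLOAD_LEN;
}

/* Recover the temperature from a packed payload. */
int16_t sensor_unpack_temp(const uint8_t *in)
{
    assert(in != NULL);
    uint16_t t = (uint16_t)(((uint16_t)in[1] << 8) | in[2]);
    return (int16_t)t;
}
```

Writing the bytes one by one, instead of copying a C structure, keeps the layout independent of the endianness and the padding rules of the compiler on either side of the link.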

The interaction with the cloud services, however, requires, in most cases, connecting all the nodes in the network, and implementing the same technologies used by the servers and the IT infrastructure directly in the host. Enabling TCP/IP communication on an embedded device is not always straightforward. Even though there are several open source implementations available, system TCP/IP code is complex, big in size, and often has memory requirements that may be difficult to meet.

The same observation applies to the Secure Socket Layer (SSL)/Transport Layer Security (TLS) library, which adds confidentiality and authentication between the two communication endpoints. Choosing the right microcontroller for the task is, again, crucial, and if the system has to be connected to the internet and support secure socket communication, then the flash and RAM requirements have to be updated in the design phase to ensure integration with third-party libraries.

Challenges of distributed systems

Designing distributed embedded systems, especially those that are based on wireless link technologies, adds a set of interesting challenges.

Some of these challenges are related to the following aspects:

Selecting the correct technologies and protocols
Limitations on bitrate, packet size, and media access
Availability of the nodes
Single points of failure in the topology
Configuring the routes
Authenticating the hosts involved
The confidentiality of the communication over the media
The impact of buffering on network speed, latency, and RAM usage
The complexity of implementing the protocol stacks

Chapter 9, Distributed Systems and IoT Architecture, analyzes some of the link-layer technologies implemented in embedded systems to provide remote communication, and shows how TCP/IP communication fits into the design of distributed systems that are integrated with IoT services.

Introduction to isolation mechanisms

Some newer microcontrollers include support for isolation between trusted and non-trusted software running onboard. This mechanism is based on a CPU extension, available only on some specific architectures, which usually relies on a sort of physical separation inside the CPU itself between the two modes of execution. All the code running from a non-trusted zone in the system will have a restricted view of the RAM, devices, and peripherals, which must be dynamically configured by the trusted counterpart in advance.

Software running from the trusted area can also provide features that are not directly accessible from the non-trusted world, through special function calls that cross the secure/non-secure boundary.

Chapter 11, Trusted Execution Environment, explores the technology behind Trusted Execution Environments (TEEs), as well as the software components involved in real embedded systems to provide a safe environment to run non-trusted modules and components.

The reference platform

The preferred design strategy for embedded CPU cores is reduced instruction set computer (RISC). Among all the RISC CPU architectures, several reference designs are used as guidelines by silicon manufacturers to produce the core logic to integrate into the microcontroller. Each reference design differs from the others in several characteristics of the CPU implementation. Each reference design includes one or more families of microprocessors integrated into embedded systems, which share the following characteristics:

Word size used for registers and addresses (8-bit, 16-bit, 32-bit, or 64-bit)
Instruction set
Register configurations
Endianness
Extended CPU features (interrupt controller, FPU, MMU)
Caching strategies
Pipeline design

Choosing a reference platform for your embedded system depends on your project needs. Smaller, less feature-rich processors are generally more suited to low energy consumption, have a smaller MCU packaging, and are less expensive. Higher-end systems, on the other hand, come with a bigger set of resources and some of them have dedicated hardware to cope with challenging calculations (such as a floating-point unit, or an Advanced Encryption Standard (AES) hardware module to offload symmetric encryption operations). 8-bit and 16-bit core designs are slowly giving way to 32-bit architectures, but some successful designs remain relatively popular in some niche markets and among hobbyists.

ARM reference design

ARM is the most ubiquitous reference design supplier in the embedded market, with more than 10 billion ARM-based microcontrollers produced for embedded applications. One of the most interesting core designs in the embedded industry is the ARM Cortex-M family, which includes a range of models scaling from cost-effective and energy-efficient, to high-performance cores specifically designed for multimedia microcontrollers. Despite ranging among three different instruction sets (ARMv6, ARMv7, and ARMv8), all Cortex-M CPUs share the same programming interface, which improves portability across microcontrollers in the same families.

Most of the examples in this book will be based on this family of CPUs. Though most of the concepts expressed will apply to other core designs as well, picking a reference platform now opens the door to a more complete analysis of the interactions with the underlying hardware. In particular, some of the examples in this book use specific assembly instructions from the ARMv7 instruction set, which is implemented in some Cortex-M CPU cores.

The Cortex-M microprocessor

The main characteristics of the 32-bit cores in the Cortex-M family are as follows:

16 general-purpose CPU registers
Thumb 16-bit only instructions for code density optimizations
A built-in Nested Vectored Interrupt Controller (NVIC) with 8 to 16 priority levels
ARMv6-M (M0, M0+), ARMv7-M (M3, M4, M7), or ARMv8-M (M23, M33) architecture
Optional 8-region memory protection unit (MPU)
Optional TEE isolation mechanism (ARM TrustZone-M)

The total memory address space is 4 GB. The beginning of the internal RAM is typically mapped at the fixed address of 0x20000000. The mapping of the internal flash, as well as the other peripherals, depends on the silicon manufacturer. However, the highest 512 MB of addresses (0xE0000000 to 0xFFFFFFFF) are reserved for the system region. This region contains the System Control Block (SCB), which groups together several configuration parameters and diagnostics that can be accessed by the software at any time to directly interact with the core.
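The layout of the address space can be summarized in a small classification function. Only the RAM base address and the top 512 MB are stated in the text above; the remaining boundaries follow the default ARMv7-M address map as recalled here and should be checked against the architecture manual and the datasheet of the specific part. The enum and function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

enum mem_region {
    REGION_CODE,              /* 0x00000000 to 0x1FFFFFFF, typically flash */
    REGION_SRAM,              /* 0x20000000 to 0x3FFFFFFF */
    REGION_PERIPHERAL,        /* 0x40000000 to 0x5FFFFFFF */
    REGION_EXTERNAL_RAM,      /* 0x60000000 to 0x9FFFFFFF */
    REGION_EXTERNAL_DEVICE,   /* 0xA0000000 to 0xDFFFFFFF */
    REGION_SYSTEM             /* 0xE0000000 to 0xFFFFFFFF */
};

/* Return the region of the 4 GB address space that contains addr. */
enum mem_region cortexm_region(uint32_t addr)
{
    if (addr < 0x20000000u) return REGION_CODE;
    if (addr < 0x40000000u) return REGION_SRAM;
    if (addr < 0x60000000u) return REGION_PERIPHERAL;
    if (addr < 0xA0000000u) return REGION_EXTERNAL_RAM;
    if (addr < 0xE0000000u) return REGION_EXTERNAL_DEVICE;
    return REGION_SYSTEM;
}
```

A function like this has little use in production firmware, but it is a handy mental model while debugging: the upper digits of a pointer value are often enough to tell whether it refers to code, data, a peripheral register, or a core register.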

Synchronous communication with peripherals and other hardware components can be triggered through interrupt lines. The processor can receive and recognize several different digital input signals and react to them promptly, interrupting the execution of the software and temporarily jumping to a specific location in the memory. Cortex-M supports up to 240 interrupt lines on the high-end cores of the family.

The interrupt vector, located at the beginning of the software image in flash, contains the addresses of the interrupt routines that will automatically execute on specific events. Thanks to the NVIC, interrupt lines can be assigned priorities so that when a higher-priority interrupt occurs while the routine for a lower-priority interrupt is executing, the current interrupt routine is temporarily suspended to allow the higher-priority interrupt line to be serviced. This keeps interrupt latency to a minimum for the signal lines that the system must service as quickly as possible.
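The two ideas in this paragraph, a table of handler addresses and a priority comparison, can be modelled in plain C. This is a host-side sketch, not a real startup file: on the target, the hardware performs the lookup and the preemption itself. The table size, the handlers, and the priority values are assumptions. The one architectural fact relied upon is that, on Cortex-M, a lower priority number means a more urgent interrupt.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_IRQS 4u   /* hypothetical; real parts have many more lines */

typedef void (*isr_t)(void);

static unsigned timer_ticks;
static unsigned uart_bytes;

static void timer_isr(void)   { timer_ticks++; }
static void uart_isr(void)    { uart_bytes++; }
static void default_isr(void) { /* unexpected interrupt: ignored here */ }

/* One handler address per interrupt line, indexed by IRQ number. */
static const isr_t vector_table[NUM_IRQS] = {
    timer_isr, uart_isr, default_isr, default_isr
};

/* Priority assigned to each line; a lower number is more urgent. */
static const uint8_t irq_priority[NUM_IRQS] = { 1, 3, 15, 15 };

/* Software stand-in for the hardware lookup: find the handler and call it. */
void irq_dispatch(unsigned irq)
{
    assert(irq < NUM_IRQS);
    vector_table[irq]();
}

/* Nonzero if the pending line may interrupt the handler of the active one. */
int irq_preempts(unsigned pending, unsigned active)
{
    assert(pending < NUM_IRQS && active < NUM_IRQS);
    return irq_priority[pending] < irq_priority[active];
}
```

In this model, the timer line (priority 1) can interrupt the UART handler (priority 3), but not the other way around.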

At any time, the software on the target can run in two privilege modes: unprivileged or privileged. The CPU has built-in support for privilege separation between system and application software, even providing two different registers for the two separate stack pointers. In Chapter 10, Parallel Tasks and Scheduling, we will examine how to properly implement privilege separation, as well as how to enforce memory separation when running untrusted code on the target, in more detail. This is, for example, used to hide secrets such as private keys from direct access from the non-secure world. In Chapter 11, Trusted Execution Environment, we will learn how to properly implement privilege separation, as well as how to enforce memory separation within an OS when running application code on the target with a different level of trust.

A Cortex-M core is present in many microcontrollers, from different silicon vendors. Software tools are similar for all the platforms, but each MCU has a different configuration to take into account. Convergence libraries are available to hide manufacturer-specific details and improve portability across different models and brands. Manufacturers provide reference kits and all the documentation required to get started, which are intended to be used for evaluation during the design phase, and may also be useful for developing prototypes at a later stage. Some of these evaluation boards are equipped with sensors, multimedia electronics, or other peripherals that extend the functionality of the microcontroller. Some even include preconfigured, third-party “middleware” libraries such as TCP/IP communication stacks, TLS and cryptography libraries, simple filesystems and other accessory components, and modules that can be quickly and easily added to a software project.

Summary

When approaching embedded software requirements, before anything else, you must have a good understanding of the hardware platform and its components. By describing the architecture of modern microcontrollers, this chapter pointed out some of the peculiarities of embedded devices and how developers should efficiently rethink their approach to meeting requirements and solving problems, while at the same time taking into account the features and the limits of the target platform.

In the next chapter, we will analyze the tools and procedures typically used in embedded development, including command-line toolchains and integrated development environments (IDEs). We will understand how to organize the workflow and how to effectively prevent, locate, and fix bugs.

2

Work Environment and Workflow Optimization

The first step toward a successful software project is choosing the right tools. Embedded development requires a set of hardware and software instruments that make the developer’s life easier and may significantly improve productivity and cut down the total development time. This chapter provides a description of these tools and gives advice on how to use them to improve the workflow.

The first section gives us an overview of the workflow in native C programming, and gradually reveals the changes necessary to translate the model to an embedded development environment. The GCC toolchain, a set of development tools to build the embedded application, is introduced through the analysis of its components.

Finally, in the last two sections, strategies of interaction with the target are proposed, to provide mechanisms for the debugging and validation of the embedded software running on the platform.

The topics covered in this chapter are the following:

Workflow overview
Text editors versus integrated environments
The GCC toolchain
Interaction with the target
Validation

By the end of this chapter, you will have learned how to create an optimized workflow by following a few basic rules, keeping the focus on test preparation, and a smart approach to debugging.

Workflow overview

Writing software in C, as well as in every other compiled language, requires the code to be transformed into an executable format for a specific target to run it. C is portable across different architectures and execution environments. Programmers rely on a set of tools to compile, link, execute, and debug software for a specific target.

Building the firmware image of an embedded system relies on a similar set of tools, called a toolchain, which can produce firmware images for specific targets. This section gives an overview of the common sets of tools required to write software in C and produce programs that are directly executable on the machine that compiled them. The workflow must then be extended and adapted to integrate the toolchain components and produce executable code for the target platform.

The C compiler

The C compiler is a tool responsible for translating source code into machine code, which can be interpreted by a specific CPU. Each compiler can produce machine code for one environment only, as it translates the functions into machine-specific instructions, and it is configured to use the address model and the register layout of one specific architecture. The native compiler included in most GNU/Linux distributions is the GNU Compiler Collection, commonly known as GCC. GCC is a free software compiler system distributed under the GNU General Public License since 1987, and since then, it has been successfully used to build UNIX-like systems. The GCC included in the system can compile C code into applications and libraries capable of running on the same architecture as that of the machine running the compiler.

The GCC compiler takes source code files as input, with the .c extension, and produces object files, with .o extensions, containing the functions and the initial values of the variables, translated from the input source code into machine instructions. The compiler can be configured to perform additional optimization steps at the end of the compilation that are specific to the target platform and insert debug data to facilitate debugging at a later stage.

A minimalist command line used to compile a source file into an object using the host compiler only requires the -c option, instructing the GCC program to compile the sources into an object of the same name:

$ gcc -c hello.c

This command compiles the C source contained in the hello.c file and transforms it into machine-specific code, which is stored in the newly created hello.o file.
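
The content of hello.c is not shown here, so the following is only a minimal sketch of what such a file could contain. The function names hello_message and hello_print are hypothetical and chosen for illustration. The sketch deliberately has no main() function: the -c option stops before the linking step, so an object file can be produced from a module that has no entry point of its own.

```c
/* hello.c: hypothetical content for the file compiled with `gcc -c hello.c` */
#include <stdio.h>

/* Constant data: the bytes of this string are stored in the object file */
static const char greeting[] = "Hello, world!";

/* Returns a pointer to the greeting string */
const char *hello_message(void)
{
    return greeting;
}

/* Prints the greeting followed by a newline.
 * Returns the value reported by printf, that is, the number of
 * characters written on success. */
int hello_print(void)
{
    return printf("%s\n", hello_message());
}
```

After compilation, hello.o holds the machine code of both functions and the bytes of the string. A symbol listing tool such as nm, part of GNU binutils, can be used to inspect which symbols the object defines.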

Compiling code for a specific target platform requires a set of tools designed for that purpose. Architecture-specific compilers exist that generate machine instructions for a target different from the machine performing the build. The process of generating code for a different target is called cross compilation. The cross compiler runs on a development machine, the host, to produce machine-specific code that can execute on the target.
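
One way to observe which target a given compiler generates code for is to check the architecture macros that it predefines. The following sketch assumes GCC or a compatible compiler, which predefine macros such as __x86_64__, __arm__, and __aarch64__ according to the configured target. The function name target_arch_name is hypothetical, and the list of architectures is far from complete.

```c
/* Reports the architecture that the compiler was configured to target.
 * The macros tested here are predefined by GCC-compatible compilers;
 * other compilers may use different names. */
const char *target_arch_name(void)
{
#if defined(__x86_64__)
    return "x86-64";
#elif defined(__i386__)
    return "x86 (32-bit)";
#elif defined(__aarch64__)
    return "AArch64";
#elif defined(__arm__)
    return "ARM (32-bit)";
#elif defined(__riscv)
    return "RISC-V";
#else
    return "unknown";
#endif
}
```

The selection happens at compile time, so the same source yields a different result depending on the compiler used: a native compiler on an x86-64 host selects the first branch, whereas a cross compiler targeting 32-bit ARM would be expected to select the ARM branch even though it runs on the same host.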

In the next section, a GCC-based toolchain is introduced as the tool to create the firmware for an embedded target. The syntax and the characteristics of the GCC compiler are described there.

The first step in building a program made of separate modules is to compile all the sources into object files, so that the components needed by the system are grouped and organized together. The final step consists of linking together all the required symbols and arranging the memory areas to prepare the final executable. This is done by another dedicated component in the toolchain.

Linker

The linker is the tool that composes executable programs and resolves the dependencies among object files provided as input.
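
The kind of dependency that the linker resolves can be sketched with two modules, one that only declares a variable and a function, and one that defines them. The names boot_count, increment_boot_count, and record_boot are hypothetical. In a real project the two parts would live in separate source files, each compiled into its own object file; they are shown in a single block here only so that the sketch compiles on its own.

```c
/* --- Part 1: what a first module would contain ------------------- */

/* Declarations only: these symbols are defined in another module.
 * Compiled separately, this module's object file would record them
 * as undefined references for the linker to resolve. */
extern int boot_count;
int increment_boot_count(void);

/* Uses the external function without knowing where it is defined */
int record_boot(void)
{
    return increment_boot_count();
}

/* --- Part 2: what a second module would contain ------------------ */

/* Definition of the variable declared above */
int boot_count = 0;

/* Definition of the function declared above:
 * increments the counter and returns its new value */
int increment_boot_count(void)
{
    boot_count += 1;
    return boot_count;
}
```

When the two parts are built as separate objects, neither the address of boot_count nor that of increment_boot_count is known while the first module is compiled. The linker fills in those addresses when it combines the objects into one executable.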

The default executable format produced by the linker is the Executable and Linkable Format (ELF). ELF is the standard format for programs, objects, shared libraries, and even core dumps on many Unix and Unix-like systems. The format has been designed to store programs on disks and other storage media, so that the host operating system can execute them by loading the instructions in RAM and allocating the space for the program data.

Executable files are divided into sections, which can be mapped to specific areas in memory needed by the program to execute. The ELF file starts with a header containing pointers to the various sections within the file itself, which contain the program’s code and data.
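
As an illustration of how the beginning of the header can be recognized, the sketch below inspects the identification bytes found at the start of an ELF file: a four-byte signature (0x7f followed by the letters E, L, and F) and a class byte that distinguishes 32-bit from 64-bit objects. The function and macro names are hypothetical; on Linux hosts the system header elf.h provides official definitions. The sketch does not parse the rest of the header, which is where the offsets of the section tables are stored.

```c
#include <stddef.h>
#include <string.h>

/* Layout of the identification bytes at the start of an ELF file.
 * Illustrative names, not the ones used by the system's elf.h. */
#define ELF_IDENT_SIZE 16 /* number of identification bytes */
#define ELF_IDX_CLASS  4  /* 1 = 32-bit objects, 2 = 64-bit objects */

/* Returns 1 if the buffer is long enough to hold the identification
 * bytes and starts with the ELF signature, 0 otherwise. */
int elf_has_magic(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    return len >= ELF_IDENT_SIZE && memcmp(buf, magic, sizeof(magic)) == 0;
}

/* Returns 32 or 64 according to the class byte,
 * or 0 if the buffer is not a recognizable ELF image. */
int elf_word_size(const unsigned char *buf, size_t len)
{
    if (!elf_has_magic(buf, len))
        return 0;

    switch (buf[ELF_IDX_CLASS]) {
    case 1:
        return 32;
    case 2:
        return 64;
    default:
        return 0;
    }
}
```

A caller would read at least the first 16 bytes of a file into a buffer and pass it to elf_word_size to decide how the remainder of the header should be interpreted.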

The linker maps the content of the areas describing an executable program into sections conventionally starting with a . (dot). The minimum set of sections required to run the executable consists of the following:

.text