Learn and implement the latest Arm Cortex-M microcontroller development concepts such as performance optimization, security, software reuse, machine learning, continuous integration, and cloud-based development from industry experts
This book is for practicing engineers and students working with embedded and IoT systems who want to quickly learn how to develop quality software for Arm Cortex-M processors without reading long technical manuals. If you’re looking for a book that explains C or assembly language programming for the purpose of creating a single application or mastering a type of programming such as digital signal processing algorithms, then this book is NOT for you. A basic understanding of embedded hardware and software, along with general C programming skills will assist with understanding the concepts covered in this book.
Page count: 395
Year of publication: 2022
Leverage embedded software development tools and examples to become an efficient Cortex-M developer
Zachary Lasiuk
Pareena Verma
Jason Andrews
BIRMINGHAM—MUMBAI
Copyright © 2022 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author(s), nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Group Product Manager: Rahul Nair
Publishing Product Manager: Surbhi Suman
Senior Editor: Tanya D’cruz
Technical Editor: Arjun Varma
Copy Editor: Safis Editing
Project Coordinator: Shagun Saini and Deeksha Thakkar
Proofreader: Safis Editing
Indexer: Manju Arasan
Production Designer: Alishon Mendonca
Marketing Coordinator: Nimisha Dua
First published: October 2022
Production reference: 1211022
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-80323-111-2
www.packt.com
A first thank you to my parents: John for intellectually challenging me and Rose for always believing in me (and improving my spelling). To Sam Bansil, as her passion for writing rubbed off on me. And of course, thank you to my beautiful fiancée, Isabella, for her constant stream of love, support, and fresh coffee.
– Zachary Lasiuk
Dedicated to my loving husband, Birju, and my incredible children, Leah and Liam, who would prefer a book about penguins and dragons. Thank you to my parents, Saroj and Praveen, for their endless love and support.
– Pareena Verma
Thank you to my wife, Deborah, my six children (Hannah, Caroline, Philip, Charlotte, Peter, and Maria), their spouses (Stephen, Fernando, and Averi), and my three grandchildren (Max, Thiago, and Agnes) for the encouragement and support.
– Jason Andrews
Zachary Lasiuk is a solutions designer at Arm. He specializes in being a systems-oriented thinker, with a broad understanding of and experience with the full IoT product life cycle in terms of both hardware and software. He is a designer at heart, crafting products that are easy and enjoyable to use. Graduating summa cum laude from Boston University with a degree in electrical engineering, Zach holds several certifications in fields ranging from UX Design to Design Thinking to Humane Technology. He graduated from the UN Young SDG Innovators Programme and has been an XTC judge for AI Ethics. He enjoys playing jazz saxophone and piano in his spare time and has toured with bands and DJ groups across the world. He lives in Austin, Texas with his fiancée, Isabella, who he loves very much.
Pareena Verma is a principal solutions architect at Arm. She works with Arm partners around the world to design system-level virtual prototyping solutions for early IP evaluation, performance analysis, and software bring-up. She has helped software developers and SoC architects on numerous Arm-based projects involving the usage of modeling, compilers, debuggers, and simulation tools. Prior to working at Arm, Pareena worked at a couple of other Electronic Design Automation start-ups, primarily focused on embedded software development and FPGA design. Pareena holds a Master of Science degree in Electrical and Computer Engineering from the University of Florida. She lives in the Greater Boston area with her husband, Birju, and their two amazing children, Leah and Liam.
Jason Andrews is a solutions director and distinguished engineer at Arm. He helps Arm partners in the areas of IP selection, system architecture, software development, and performance analysis. Jason has written hundreds of articles about Arm technology. As a member of the AWS Community Builders program, he promotes the Arm architecture in cloud and IoT applications. Prior to working at Arm, Jason worked in various Electronic Design Automation companies, including Cadence Design Systems. He lives in the Minneapolis area with his wife, Deborah, where they spend time with their six children and three grandchildren.
Ronan Synnott is a solutions architect within the Development Solutions group at Arm. During a career spanning more than twenty years, Ronan has held several customer-focused positions around the globe, working side by side with developers from various sectors to ensure they can be at their most effective. He has experienced firsthand how embedded software development has evolved. He is a frequent contributor to Arm’s community forums and blogs and at other public events.
Arm Cortex-M processors are ideal for a wide variety of applications. They are highly visible in microcontrollers and silently work in every other area of electronic design, from small sensors to large servers. In the fourth quarter of 2020, Arm reported a record 4.4 billion chips shipped with Cortex-M processors.
Consequently, the world of software development for embedded and IoT devices is broad. There are hundreds of companies creating thousands of Cortex-M chips, development boards, software libraries, and development tools. While all these components are intended to make your job developing software for Cortex-M devices easier, it is a challenge to understand which components to use on a specific project.
Our goal is to alleviate these challenges and enable you to focus on building better Cortex-M software. We hope our knowledge and experience will help you avoid frustration and spend more time doing what you enjoy.
This book is split into two parts. Part 1, Get Set Up, focuses on how to select the right components to make a Cortex-M based project successful. We cover which Cortex-M processor makes sense for your application—and hardware options to simplify development. Next is an overview of the large variety of software components available in the Cortex-M ecosystem, with context on when to use them. This part ends with a discussion on embedded software tool selection. After reading Part 1, you should be familiar with what exists in the broad Cortex-M ecosystem and be able to translate your project requirements into the right hardware, software, and tools to be successful.
Part 2, Sharpen Your Skills, dives into specific topics of Cortex-M software development. We cover both software topics (including system startup, optimization, machine learning, and security) and software development topics (including cloud services and continuous integration testing). Each topic will be explained in theory and in practice, with code examples for you to get experience with along the way. If you are interested in a specific topic, feel free to investigate that chapter sooner; just note later chapters may refer to techniques described earlier in the book.
This book is intended to teach practicing engineers and students about topics key to being a rounded Cortex-M software developer. It contains some theory and a lot of hands-on examples, as we believe the best way to learn something is to try it yourself. After reading this book, you will have a solid understanding of when and how to use topics such as embedded security and machine learning instead of just knowing the buzzwords.
If you are looking for a deep dive into Cortex-M hardware capabilities or how to optimize Arm assembly instructions for the Cortex-M Instruction Set, technical reference manuals will be best. Similarly, this book is not a programming book that explains C or assembly language programming for the purpose of creating a single application or mastering a type of programming such as digital signal processing.
To get the most out of this book, you should have a basic understanding of embedded system hardware and software and general C programming skills.
Chapter 1, Selecting the Right Hardware, covers differences between all Cortex-M CPUs, and by extension the SoCs they are packaged in, based on the use cases and performance requirements of a project. Each use case discussion will center around the practical usage of Cortex-M hardware and software support. Hardware features such as FPUs, TCMs, MVE, and TrustZone will be discussed. Software such as ML frameworks, RTOS, and more will be discussed.
Chapter 2, Selecting the Right Software, introduces the different software frameworks, ranging from bare-metal boot code to real-time operating systems that enable efficient development on Cortex-M embedded devices. We will map popular use cases to software frameworks available today to get them up and running quickly on your Cortex-M devices.
Chapter 3, Selecting the Right Tools, lists the platforms useful for developing products based on Cortex-M hardware. We will list the tools required to work with Cortex-M hardware, with some general commentary to compare them. We will also list the possible development environments people can use, along with the costs and benefits of each.
Chapter 4, Booting to Main, covers the basics of booting a bare-metal program on a Cortex-M device. We will walk you through assembly startup code examples and highlight the key bits that need to be programmed to successfully boot the systems. We will introduce scatter and linker files – what they are, how they’re structured based on memory layout, and how they’re used to link your program to create the final executable. We will also walk you through the different hardware mechanisms that can be leveraged for application input/output, accompanied by relevant examples.
Chapter 5, Optimizing Performance, covers different software optimization techniques that can be used to run your code faster and more efficiently on Cortex-M devices. We will discuss the tools and techniques you can leverage to measure the performance gained using these optimization tips.
Chapter 6, Leveraging Machine Learning, clarifies how Cortex-M devices process machine learning software. The focus will be on the Cortex-M55, the first Cortex-M designed with ML software in mind. We will introduce the difference between how the Cortex-M55 processes ML instructions as opposed to the other Cortex-M devices through vector processing. Then we will dive into the most popular low-power edge ML applications, speech recognition and image classification, and how to leverage software frameworks to get started quickly.
Chapter 7, Enforcing Security, provides an overview of Arm’s TrustZone technology, which provides robust protection and security for IoT devices. We will introduce Trusted Firmware-M and how it can be included and used to build secure firmware for Cortex-M devices. We will also provide software guidelines on topics such as fault detection and exception handling in the context of building secure systems.
Chapter 8, Streamlining with the Cloud, explains the migration of embedded software development to the cloud. We will introduce cloud concepts including source code management, remote code editing, and containers. Examples of various cloud services as well as build-your-own cloud solutions will be provided with software examples to try. Learning cloud development concepts will help you stay up to date with the latest developer trends.
Chapter 9, Implementing Continuous Integration, introduces automated testing, explains why continuous integration is needed, and reviews the challenges it presents. We will explain how to work with various testing frameworks and cloud services and apply them to both physical boards and virtual models. Learning about possible solutions will enable the right level of automation and tools to improve software quality.
Chapter 10, Looking Ahead, provides general tips for how you can go from a good programmer to a great one. Topics for further investigation are presented for you to continue learning. Additional examples of current industry needs are covered, as well as a look forward toward emerging trends and required skills for the future.
We find the best way to learn a topic is to practice it. In the spirit of “learning by doing,” there are multiple examples in each chapter in Part 2. To make it as easy as possible for you to code along with us, we tried to select freely available software and tools where possible. The examples are spread across Linux and Windows environments, but in many cases, you will be able to use your OS of choice if the tools are supported on it.
Hardware boards are not free, but the same three platforms are used throughout the book to minimize the amount of hardware you need:
Raspberry Pi Pico
NXP LPC55S69-EVK
Arm Virtual Hardware
The first two boards are self-explanatory. The third option, Arm Virtual Hardware, is not a physical board. It is easily available online and is discussed and used in context at the start of Chapter 4.
If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book’s GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.
You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/The-Insiders-Guide-to-Arm-Cortex-M-Development. If there’s an update to the code, it will be updated in the GitHub repository.
We also provide a PDF file that has color images of the screenshots and diagrams used in this book. You can download it here: https://packt.link/vjeD9.
There are a number of text conventions used throughout this book.
Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: “It’s clear that the printf function is defined in a C header file, stdio.h.”
A block of code is set as follows:
#include <stdio.h>
int main()
{
    printf("Hello Cortex-M world!\n");
}
When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:
#include <stdio.h>
int main()
{
    printf("Hello Cortex-M world!\n");
}
Any command-line input or output is written as follows:
./run.sh -a hello.axf
Bold: Indicates a new term, an important word, or words that you see onscreen. For instance, words in menus or dialog boxes appear in bold. Here is an example: “Select System info from the Administration panel.”
Tips or important notes
Appear like this.
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, email us at [email protected] and mention the book title in the subject of your message.
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.
Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Once you’ve read The Insider’s Guide to Arm Cortex-M Development, we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.
Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.
Thanks for purchasing this book!
Do you like to read on the go but are unable to carry your print books everywhere? Is your eBook purchase not compatible with the device of your choice?
Don’t worry, now with every Packt book you get a DRM-free PDF version of that book at no cost.
Read anywhere, any place, on any device. Search, copy, and paste code from your favorite technical books directly into your application.
The perks don’t stop there. You can get exclusive access to discounts, newsletters, and great free content in your inbox daily.
Follow these simple steps to get the benefits:
Scan the QR code or visit the link below:
https://packt.link/free-ebook/9781803231112
Submit your proof of purchase.
That’s it! We’ll send your free PDF and other benefits to your email directly.
These first three chapters provide a broad overview of the options available to build a Cortex-M device, alongside helpful heuristics to narrow down the best options for your project. Confused by the differences in Cortex-M CPUs? Feeling overwhelmed by all the possible software stacks to choose from? Not sure what tools will help you develop software efficiently? We will answer these (and more) questions by reviewing the range of available hardware, software, and tools in the Cortex-M ecosystem.
This part of the book comprises the following chapters:
Chapter 1, Selecting the Right Hardware
Chapter 2, Selecting the Right Software
Chapter 3, Selecting the Right Tools
It may be surprising that the first chapter of this book, which is written for Cortex-M software developers, is all about hardware. This is because software, in all its forms, is ultimately run on hardware. It is critical to understand which hardware capabilities exist to properly leverage them in software.
Additionally, you will likely need a development board for debugging your code during development. Some of you reading may even have a level of influence over which hardware is ultimately selected for your device. All in all, no matter what specific situation you are in, understanding what Cortex-M hardware is out there—and what it can do—will help you develop quality software for your current and future projects.
So, in this opening chapter, we will explain how to select Cortex-M hardware and provide an overview of where to find development boards. Note that we will be discussing both individual Cortex-M processors and Cortex-M development boards.
There are different ways to frame which Cortex-M hardware is best suited for your specific project. Examples can be helpful; the first section of this chapter lists common embedded/IoT use cases and presents Cortex-M processors that fit each situation. A side-by-side comparison is also helpful; the second section ranks processors by performance, power, and area metrics. The third section then focuses on development boards and the trade-offs between them.
The chapter ends by selecting two boards that will be used for hands-on examples in future chapters. In a nutshell, the topics we’ll discuss in this chapter are presented here:
Processor selection through use cases
Processor selection based on performance and power
Microcontroller development boards
IoT and machine learning (ML) applications are not only rapidly evolving and changing the way modern businesses operate but also transforming our everyday experiences. As these applications evolve and become more complex, it is essential to make the right hardware choices that meet application requirements. Ultimately, the processor choice comes down to the right balance of functionality, cost, power, and performance. Defining your use case and workload requirements makes determining this balance a lot simpler.
In this section, we will walk through the requirements of some common consumer embedded use cases and determine the Arm Cortex-M processor choices that are ideally suited. The list of use cases and the resulting processor selections are not exhaustive, mainly highlighting that if workload requirements are well understood, the processor decision-making process becomes much easier.
Let’s start with a smart medical wearable use case. The wearable is a wrist-worn device with long battery life and special sensors that continuously monitor heart activity. Security is a vital requirement as the wearable stores private medical data. Processing power is equally important, but it must be delivered within the size and power constraints of a battery-operated wearable.
For this case, the Arm Cortex-M33 processor provides an excellent combination of security, processing power, and power consumption. Cortex-M33 includes security features for hardware-enforced isolation, known as TrustZone for Cortex-M. It reduces the potential for attacks by creating isolation between the critical firmware and the rest of the application. The Cortex-M33 has many optional hardware features including a digital signal processing (DSP) extension, memory protection unit (MPU), and a floating-point unit (FPU) for handling compute-intensive operations. The Arm custom instruction and coprocessor interface in the Cortex-M33 provide the customization and extensibility to address processing power demands while still decreasing power consumption.
Note that these hardware features are optional; once manufactured and sold, these features are either present or not. Make sure to check whether the microcontroller or development board you are buying has these Arm Cortex-M processor features enabled if desired.
Let’s take another use case as an example. Say you’re designing an industrial flow sensor that will be used to measure liquids and gases with great accuracy. It needs to be extremely reliable and have a small form factor. The primary requirement is that it will be low-power and work with this accuracy standalone for very long periods of time. A great central processing unit (CPU) choice for designing such an industrial sensor is the Arm Cortex-M0+, which combines low power consumption and processing-power capabilities. It is the most energy-efficient Cortex-M processor with an extremely small silicon area, making it the perfect fit for such constrained embedded applications.
There are several use cases in the embedded market that require demanding DSP workloads to be executed with maximized efficiency. With the advancements in IoT, there has been an explosion in the number of connected smart devices. There are so many different sensors connected within these devices to collect data for measuring temperature, motion, health, and vision. The sensor data collected is often noisy and requires DSP computation tasks—for example, applying filters to extract the clean data. The Cortex-M4, Cortex-M7, Cortex-M33, and Cortex-M55 processors come with a DSP hardware extension addressing various performance requirements for different signal-processing workloads. They also have an optional FPU that provides high-performance generic code processing in addition to the DSP capabilities. If your workload requires the highest DSP performance, the Cortex-M7 is a great choice. The Cortex-M7 is widely available in microcontrollers and offers high performance due to its six-stage dual-issue instruction pipeline. It is also highly flexible with optional instruction and data caches and tightly coupled memory (TCM), making it easy to select a processor that has been manufactured to meet your specific application needs. Security has become a common requirement for sensors on connected devices to provide protection from physical attacks. If security is an essential requirement for your sensor application in addition to DSP performance, then Cortex-M33 could be a great fit with its TrustZone hardware security features.
With some of the newer sensing and control use cases, we see a common need for not only signal processing but also ML inference on endpoint devices. ML workloads are typically very demanding in terms of computation and memory bandwidth requirements. The significant advancements made in ML via optimization techniques have now made it possible for ML solutions to be deployed on edge devices.
The primary use cases for ML on edge devices today are keyword spotting, speech recognition, object detection, and object recognition. The list of ML use cases is rapidly evolving even as we write this book, with autonomous driving, language translation, and industrial anomaly detection joining it. These use cases can be broadly classified into three categories, as outlined here:
Vibration and motion: Vibration and motion are used to analyze signals, monitor health, and assist with several industrial applications such as predictive maintenance and anomaly detection. For these applications, the installed sensors (generally accelerometers) are used to gather large amounts of data at various vibration levels. Signal processing is used to preprocess the signal data before any decision-making can be done using ML techniques.
Voice and sound: Voice applications are in several markets, and we’ve become quite familiar with voice assistants through the deployment of smart speakers. Many other voice-enabled solutions are coming to the mass market. The voice-capture process consists of one or several microphones used for voice keyword detection. Keyword spotting and automatic speech recognition are the primary demanding computing operations of these voice-enabled devices. These tasks require significant DSP and ML computation.
Vision: Vision applications are used in several areas for recognizing objects, being able to both sort and spot defects, and detecting movement. There is an increasing number of vision-based ML applications ranging from video doorbells and biometrics for face-unlocking to industrial anomaly detection.
Cortex-M processors ranging from the Cortex-M0 to the latest Cortex-M85 can run a broad range of these ML use cases at different performance points. Mapping the different workload performance needs and latency requirements of these use cases to the CPU’s feature capabilities greatly simplifies the process of hardware selection. The following diagram illustrates the range of ML use cases run on the Cortex-M family of processors today:
Figure 1.1 – ML on Cortex-M processors
For example, say you’re designing a smart speaker that is an always-on voice activation device—the Cortex-M55 is a great choice. The Cortex-M55 is a highly capable artificial intelligence (AI) processor in the Cortex-M series of processors. It’s the first in the Arm Cortex-M family of processors to feature the Helium technology, which provides a significant performance uplift for DSP and ML applications on small, embedded devices. Arm Helium technology is also known as the M-Profile Vector Extension (MVE), which adds over 150 new scalar and vector instructions and enables efficient computation of 8-bit, 16-bit, and 32-bit fixed-point data types. Signal processing-intensive applications such as audio processing widely use the 16-bit and 32-bit fixed-point formats. ML algorithms widely use 8-bit fixed-point data types for neural network (NN) computations. The Helium technology makes running ML workloads much faster and more energy-efficient in endpoint devices.
In Figure 1.1, there is also mention of Ethos-U55. This is not a CPU like the other Cortex-M processors but is instead a micro neural processing unit (NPU). It was developed to add significant acceleration to ML workloads while being small enough to be implemented in constrained embedded/IoT environments. When combined with the Cortex-M55, the Ethos-U55 provides a 480x uplift in ML performance compared to Cortex-M-based systems without the Ethos-U55! Keep an eye out for microcontrollers and boards that utilize the Ethos-U55, and learn more about it from a high level here: https://www.arm.com/products/silicon-ip-cpu/ethos/ethos-u55.
To summarize this section, one way to select processors is by understanding the use case, clearly defining requirements, ranking them, and identifying project constraints. This is a great place to start the processor selection process.
Next, we will look at using performance and power as metrics to analyze processor selection choices.
Another way to choose and understand Cortex-M processors is by ranking, based on how well they match performance and power requirements. Without structure, this can be a daunting task, with a wide range of possibilities (from the number of interrupts to the overall price, and everything in between). In this section, we will define six categories to evaluate and go over a few examples of how to use this in practice to select the right processor for your project. Again, this is also a helpful framework for understanding what Cortex-M processors’ capabilities are.
We will select the right processor using an approach we will call requirement heuristics. This means translating your key project requirements into predefined areas and following simple steps to get to the right Cortex-M processor. The six areas are listed here:
Power
DSP performance
ML performance
Security
Safety
Cost
In each area, we rank the processors that best meet the project requirement. You can then select the areas that matter most to your project and find the processor that meets these needs. Let’s discuss each area before showing some examples.
Minimizing power consumption is crucial in highly constrained power environments. A common use case is in distributed sensors that require long periods of continuous operation without being serviced.
When looking at power metrics, there are a few things that will help in understanding technical jargon. There are often two types of power measurements: static (or leakage) power consumption and dynamic power consumption. Static power consumption measures the amount of power used by the processor when not actively processing anything, such as being in a “sleep” mode but with the power still on. Dynamic power consumption measures the power consumed when the processor is actively working on a task. Often, dynamic power is measured using an industry-standard software workload called Dhrystone, to enable consistent comparisons. Power is measured in microwatts (uW)/megahertz (MHz). It is defined as power per MHz to enable consistent comparisons between processors running at different frequencies.
The following table shows the dynamic power of the different Cortex-M processors on the same node size:
Table 1.1 – Dynamic power across different Cortex-M processors on 40 LP node size
Another factor that affects the power consumption of the core is the technology node size used to manufacture the silicon. This can be referred to as “technology," "process node," or just "node," depending on the context. The node size refers generally to the physical size of the transistors; as the node size decreases, more transistors can be packed onto the same-size silicon wafer. Smaller node sizes also generally lead to reduced power consumption, both static and dynamic. Understanding the node size is helpful for accurately comparing different chips or boards.
Using a consistent node size of 40 nm and publicly available benchmarking data, we can rank the Cortex-M processors on the low-power axis, like so:
Figure 1.2 – Ranking power consumption for Cortex-M
Note that the processors in the preceding figure are ordered from best, starting from the top. In this case, the processors requiring lower power for operation are ranked visually higher. We are also not displaying all the Cortex-M processors in this (and subsequent) figures; we’re only displaying processors that perform well in the category. The spacing between processors is not intended to communicate precise quantitative differences, only a general ranking based on power consumption. The dotted line is also intended as a qualitative distinction, indicating a notable separation between the processor capabilities on this axis.
The Cortex-M0+ comes in first, as one of the lowest-power 32-bit processors on the market. It can get as low as 4 uW/MHz in dynamic power consumption when manufactured at 40 nm. This processor is being used at the forefront of low-power technologies. It can even be used in applications without a battery, relying on energy harvesting from the environment to power the device. Now that is low power! We talk about energy harvesting and ultra-low-power applications in Chapter 10, Looking Ahead.
The Cortex-M23 is essentially tied with the Cortex-M0+ in terms of minimizing power. It can achieve power figures similar to those of the Cortex-M0+ when configured minimally. The security features increase power consumption when included. Overall, given how new the Cortex-M23 is and that it is being used more often at smaller node sizes such as 28 nm and below, the Cortex-M23 is equally viable for minimizing power consumption.
The Cortex-M0 also minimizes power draw and is only slightly behind the Cortex-M0+ and Cortex-M23. Because the two are so closely related, the Cortex-M0+ is typically a better option than the Cortex-M0.
The Cortex-M33, Cortex-M3, and Cortex-M4 all have about triple the power draw of the Cortex-M0+. If the lower-power-consumption processors do not have enough processing power or features for your use case, these processors are likely a good fit.
DSP is needed when taking real-world signals and digitizing them to perform some computations. This is exceedingly common in movement, image, or audio processing applications when data is coming in real time. Devices with sensors and motors to detect and act on real-time data rely heavily on DSP.
The computational nature of these types of DSP applications is really centered on what we call scalar processing. You may be familiar with the word scalar from math or physics classes. A scalar is a quantity that has only one characteristic. For example, measuring gas pressure 10 times a second for 1 second will produce 10 data points. Each point has one characteristic: the magnitude of the gas pressure at that instant. These types of DSP applications, which include audio processing as well, lend themselves well to scalar processing.
To measure how good Cortex-M processors are at scalar processing, there are two common benchmarks: CoreMark and Dhrystone. Using these imperfect but generally helpful benchmarks, you can compare how well different processors run scalar workloads such as the DSP use cases discussed previously. You can download and view the Dhrystone (Dhrystone Million Instructions per Second (DMIPS)/MHz) and CoreMark scores for all Cortex-M series processors here: https://developer.arm.com/documentation/102787/.
Using these publicly available CoreMark benchmark scores compiled with the Arm Compiler for Embedded, we can rank the Cortex-M processors in terms of DSP performance, as follows:
Figure 1.3 – Ranking DSP performance for Cortex-M
The benchmark numbers quoted next are valid at the time of this book’s publication. Due to subtle changes in firmware, benchmarks, and compilers, the numbers may change slightly over time. These changes will be small, and the rankings listed are still directionally accurate.
As the newest Arm Cortex-M processor, the Cortex-M85 provides the highest scalar and signal-processing performance to date in the Cortex-M family. It boasts a CoreMark score of 6.28 CoreMark/MHz, is suitable for the most demanding DSP applications, and also includes TrustZone security features.
The Cortex-M7, while being superseded by Cortex-M85, is still a good choice for less demanding DSP applications or where functional safety is critical. The Cortex-M7 has a CoreMark score of 5.29.
The Cortex-M55 and Cortex-M33 are similar in scalar performance, with CoreMark scores of 4.4 and 4.1, respectively.
The Cortex-M4 and Cortex-M3 are the next steps down in performance, with CoreMark scores of 3.54 and 3.44 respectively. The Cortex-M4 is better with DSP use cases due to its optional FPU (which the Cortex-M3 does not have). The Cortex-M4 is commonly used in sensor fusion, motor control, and wearables. The Cortex-M3 is used for more balanced applications with lower area and power requirements.
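To make the comparison concrete, here is a small Python sketch that stores the scores quoted above, ranks the processors, and scales a score to a given clock. It assumes that every quoted score is a CoreMark/MHz value (the text states the unit explicitly only for the Cortex-M85); the table and function names are illustrative.

```python
# Scores as quoted in the text above, assumed to all be CoreMark/MHz.
COREMARK_PER_MHZ = {
    "Cortex-M85": 6.28,
    "Cortex-M7": 5.29,
    "Cortex-M55": 4.4,
    "Cortex-M33": 4.1,
    "Cortex-M4": 3.54,
    "Cortex-M3": 3.44,
}


def rank_by_coremark():
    """Return processor names ordered from highest to lowest CoreMark/MHz."""
    return sorted(COREMARK_PER_MHZ, key=COREMARK_PER_MHZ.get, reverse=True)


def coremark_at(processor, clock_mhz):
    """Scale a per-MHz score to an absolute CoreMark score at clock_mhz."""
    return COREMARK_PER_MHZ[processor] * clock_mhz


print(rank_by_coremark())
print(coremark_at("Cortex-M4", 100))  # hypothetical 100 MHz clock
```

Because the scores are normalized per MHz, a lower-ranked core running at a higher clock can still deliver a higher absolute score, so the clock frequency of the actual microcontroller matters as much as the ranking.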
Applications involving video processing are more demanding than traditional DSP software and benefit from simultaneous processing, called vector processing. Vector processing also accelerates the most popular workload today: ML.
Because of its increased popularity and potential in edge devices, we will devote an entire chapter to ML in Chapter 6, Leveraging Machine Learning. In this section, we will give an overview of how ML workloads are executed in hardware to identify the right Cortex-M processor for the job.
ML, at a computational level, is matrix math. NNs are represented by layers of neurons, with each neuron in one layer being connected to each neuron in the next layer. When an input is given (such as a picture of a cat to an image recognition network), it gets separated into distinct features and sent through each layer, one at a time. In practice, this means that at each layer there are x inputs going into n nodes, which leads to x*n computations for that layer. A network could have dozens of layers, with potentially hundreds of nodes in each.
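The x*n count can be sketched in a few lines of Python. This is a rough illustration that assumes fully connected layers and counts one multiply-accumulate per connection; the layer sizes are hypothetical and the function names are not from any library.

```python
def layer_computations(num_inputs, num_nodes):
    """One multiply-accumulate per input-to-node connection: x * n."""
    return num_inputs * num_nodes


def network_computations(layer_sizes):
    """Total computations for a fully connected network.

    layer_sizes lists the width of each layer in order, for example
    [inputs, hidden_1, hidden_2, outputs].
    """
    pairs = zip(layer_sizes, layer_sizes[1:])
    return sum(layer_computations(x, n) for x, n in pairs)


# Hypothetical network: 64 inputs, two hidden layers of 100 nodes, 10 outputs.
sizes = [64, 100, 100, 10]
print(network_computations(sizes))  # 64*100 + 100*100 + 100*10
```

Even this small example needs 17,400 multiply-accumulates for a single inference, which is why the per-layer cost grows quickly as networks get wider and deeper.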
In scalar computing, this could result in tens of thousands of calculations performed one after the other. In vector computing, you instead store each node’s value in a row (or lane) and make x*n calculations all at once. This is the benefit of vector processing, which has existed in larger Arm cores for years via NEON technology. The Helium extension brings this technology to Cortex-M processors without significantly increasing area and power.
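The difference between the two approaches can be modeled in Python as a step count. This is a simplified sketch, not a description of any specific hardware: it assumes a vector unit that processes a fixed number of lanes per step (16 here, as would be the case for 8-bit values in a 128-bit vector), so the work is done in chunks rather than literally all at once.

```python
import math


def scalar_steps(num_inputs, num_nodes):
    """Scalar processing: one multiply-accumulate per step."""
    return num_inputs * num_nodes


def vector_steps(num_inputs, num_nodes, lanes):
    """Vector processing: up to `lanes` multiply-accumulates per step."""
    return math.ceil(num_inputs / lanes) * num_nodes


def dot_in_lanes(weights, inputs, lanes):
    """Compute one node's weighted sum, consuming `lanes` elements per step.

    Returns (result, steps) so the saving over scalar code is visible.
    """
    total = 0
    steps = 0
    for start in range(0, len(inputs), lanes):
        chunk = zip(weights[start:start + lanes], inputs[start:start + lanes])
        total += sum(w * x for w, x in chunk)
        steps += 1
    return total, steps


# Hypothetical layer: 64 inputs feeding 100 nodes, 16 lanes (assumed).
print(scalar_steps(64, 100), vector_steps(64, 100, lanes=16))
print(dot_in_lanes([1, 2, 3, 4, 5], [1, 1, 1, 1, 1], lanes=4))
```

In this model the 64-input, 100-node layer drops from 6,400 steps to 400, and the saving scales with the lane count until memory bandwidth or other limits dominate.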
Using matrix multiplication performance as a benchmark, we can rank the Cortex-M processors in terms of ML performance, like so:
Figure 1.4 – Ranking ML performance for Cortex-M
The Cortex-M85 processor boasts the most recent implementation of the Helium vector processing technology. It brings more ML functionality to edge devices and enhances applications such as robotics, drones, and smart home control.
The Cortex-M55 processor was the first Cortex-M processor with Helium vector processing technology. It brings anomaly and object detection use cases to the edge when implemented standalone. When paired with an NPU (such as the Ethos-U55 discussed earlier in this chapter), gesture detection and speech recognition use cases can be unlocked while still controlling power consumption and cost. Even by itself, the Cortex-M55 has about an order of magnitude (OOM) better ML performance than the next closest, the Cortex-M7.
The Cortex-M7 processor is a superscalar processor, meaning it enables the parallelization of scalar workloads. This effectively allows it to run DSP applications faster, but the more computationally intensive ML use cases are more of a challenge. This processor is suitable for basic ML use cases such as vibration and keyword detection.
The Cortex-M4 processor is often stretched to its computational limits when applied to ML use cases. In most cases, it should only be considered if the ML use case is around vibration/keyword detection or sensor fusion, and there is a strict power or cost constraint.
As the importance and ubiquity of IoT devices have increased in people’s lives, security has become a strong requirement. Security basics such as cryptographic password storage are no longer sufficient on their own. As the value and volume of what is stored on edge devices increase, malicious actors have a proportionally greater incentive to attack them.
We will also devote Chapter 7, Enforcing Security, to the topic of security and dive into more specifics there. This section will give you an overview of the key considerations to remember when selecting a processor with security in mind. To successfully secure your software and project, the underlying hardware needs to enable some essential features such as software isolation, memory protection, secure boot, and more. For the newer Cortex-M processors, Arm has implemented a security extension called TrustZone that enhances these security basics, adds more functionality in hardware, and makes security implementations easier. TrustZone enables you to physically isolate sections of memory or peripherals at the hardware level, making hacks more difficult and more contained if they do occur. The full details, benefits, and a quick-start guide for this extension will be provided in a later chapter.
Note that this is an optional extension, so make sure to verify it is enabled for the processor in any development board you are considering.
Using TrustZone and additional security features as a guide, we can rank the Cortex-M processors in terms of security features, as follows:
Figure 1.5 – Ranking security for Cortex-M processors
In practice, these processors all contain the TrustZone security extension and are all excellent options for developing a secure project. They are all based on the Armv8-M architecture, with the other Cortex-M processors being based on Armv7-M or Armv6-M. They are ordered by how recently they were released, but other requirements such as low power, ML, or DSP performance should decide which of these processors to select. Note that Arm also has a Platform Security Architecture (PSA) certification that validates the security implementations at the development-board level.
The PSA and TrustZone implementations in software are all discussed more in Chapter 7, Enforcing Security. Resources to learn more about the different Arm instruction sets (outside the scope of this book) are listed under the Further reading section at the end of this chapter.
Important note
The Cortex-M35P processor is a specialized processor that is intended for the highest level of security. It features built-in tamper resistance and physical protection from invasive and non-invasive attack vectors. Basically, it is ideal for devices that protect valuable resources but are accessible by the public, such as a smart door lock. The core is similar to the Cortex-M33, adding that physical layer of protection. If your product needs physical, tamper-proof security as the primary requirement, this is definitely a Cortex-M processor to consider.
In the embedded space, safety requirements typically arise in regulated, safety-critical environments. These safety requirements can be categorized as either diagnostic requirements or systematic requirements. Diagnostic requirements relate to the management of random faults on the device and are addressed by the addition of hardware features for fault detection (FD) and control. Systematic requirements relate to demonstrating the avoidance of systematic failures and are addressed typically through the design process and verification.
Products sold to high-safety environments must prove a level of risk reduction as defined by international standards. The International Electrotechnical Commission (IEC) 61508 standard defines general Safety Integrity Levels (SILs), with SIL 4 being the most strict and SIL 1 being the least strict. The automotive industry has a dedicated level system, the Automotive Safety Integrity Level (ASIL), with ASIL D being the strictest and ASIL A being the least strict.
Some common safety features include the following:
Exception handling, which prevents software crashes in the case of system faults.
MPUs that ensure data integrity from invalid behavior.
Software test libraries (STLs) test for faults at startup and runtime. Note that this is not a feature of the processor, but instead, a suite of software tests provided to run on a specific processor.
Dual-Redundant Core Lockstep (DCLS), where two processors redundantly run the same code to uncover and correct system errors.
Error Correction Code (ECC), which automatically detects and corrects memory errors.
Memory Built-In Self-Test (MBIST) interfaces that enable memory integrity validation while the processor is running.

We can rank the Cortex-M processors in terms of safety features, showing the cutoff for which processors are capable of reaching certain safety levels, as follows:
Figure 1.6 – Ranking safety features of Cortex-M processors
The Cortex-M7 is alone in the Cortex-M family in offering ECC, MBIST, and DCLS features alongside the more common MPU and exception handling. The Cortex-M55, Cortex-M33, and Cortex-M23 contain almost all of those features, but are still capable of meeting the strict SIL 3 and ASIL D safety levels.
The Cortex-M4, Cortex-M3, and Cortex-M0+ all offer enough safety features to achieve the less strict SIL 2 and ASIL B safety levels with STLs, MPUs, and exception handling.
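One way to use these cutoffs during selection is a simple lookup, sketched below in Python. The table reflects one reading of the text above: the Cortex-M7 entry is inferred from its position at the top of the ranking rather than stated outright, and the names are illustrative. Treat it as a starting filter and confirm the achievable level against the vendor’s safety documentation.

```python
# Highest safety levels per processor, as read from the text above.
# The Cortex-M7 entry is an inference, not an explicit statement.
SAFETY_CAPABILITY = {
    "Cortex-M7": {"sil": 3, "asil": "D"},
    "Cortex-M55": {"sil": 3, "asil": "D"},
    "Cortex-M33": {"sil": 3, "asil": "D"},
    "Cortex-M23": {"sil": 3, "asil": "D"},
    "Cortex-M4": {"sil": 2, "asil": "B"},
    "Cortex-M3": {"sil": 2, "asil": "B"},
    "Cortex-M0+": {"sil": 2, "asil": "B"},
}

ASIL_ORDER = "ABCD"  # A is the least strict, D the strictest


def meets(processor, required_sil, required_asil):
    """True if the processor can reach both required levels."""
    cap = SAFETY_CAPABILITY[processor]
    asil_ok = ASIL_ORDER.index(cap["asil"]) >= ASIL_ORDER.index(required_asil)
    return cap["sil"] >= required_sil and asil_ok


def candidates(required_sil, required_asil):
    """List processors that satisfy the requirement, in table order."""
    return [p for p in SAFETY_CAPABILITY if meets(p, required_sil, required_asil)]


print(candidates(3, "D"))
print(candidates(2, "B"))
```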
The Cortex-M35P processor is highly effective for safety applications as well as security applications. It contains most of the already listed safety features, adding in heightened observability to ensure expected behavior and more.
Now that we have looked at some key features that can drive your processor selection, let us look at how cost can impact this decision-making process.
Minimizing cost is a common requirement in deeply embedded and IoT spaces. When looking at microcontrollers or boards to purchase, the cost should be obvious and does not require much explanation.
We can, however, provide some context for what contributes to the cost of a microcontroller, with the largest factor here being the silicon area. As the area of a microcontroller increases, it requires more materials to make and thus intuitively raises costs. Production volume will also impact the cost. The higher the production volume, the lower the cost will be. Typically, the smaller the Cortex-M processor is, the less expensive it will be to manufacture, and thus the lower the price to purchase. We will review the Cortex-M processors with the lowest area so that you have a starting point to look for boards with these processors to have the best chance of minimizing your overall cost.
Important note