ARM designs the cores of microcontrollers which equip most "embedded systems" based on 32-bit processors. Cortex M3 is one of these designs, recently developed by ARM with microcontroller applications in mind. To conceive a particularly optimized piece of software (as is often the case in the world of embedded systems) it is often necessary to know how to program in an assembly language. This book explains the basics of programming in an assembly language, while being based on the architecture of Cortex M3 in detail and developing many examples. It is written for people who have never programmed in an assembly language and is thus didactic and progresses step by step by defining the concepts necessary to acquiring a good understanding of these techniques.
Number of pages: 264
Year of publication: 2013
Table of Contents
Preface
Chapter 1: Overview of Cortex-M3 Architecture
1.1. Assembly language versus the assembler
1.2. The world of ARM
Chapter 2: The Core of Cortex-M3
2.1. Modes, privileges and states
2.2. Registers
Chapter 3: The Proper Use of Assembly Directives
3.1. The concept of the directive
3.2. Structure of a program
3.3. A section of code
3.4. The data section
3.5. Is that all?
Chapter 4: Operands of Instructions
4.1. The constant and renaming
4.2. Operands for common instructions
4.3. Memory access operands: addressing modes
Chapter 5: Instruction Set
5.1. Reading guide
5.2. Arithmetic instructions
5.3. Logical and bit manipulation instructions
5.4. Internal transfer instructions
5.5. Test instructions
5.6. Branch instructions
5.7. Load/store instructions
5.8. “System” instructions and others
Chapter 6: Algorithmic and Data Structures
6.1. Flowchart versus algorithm
6.2. Alternative structures
6.3. Iterative structures
6.4. Compound conditions
6.5. Data structure
Chapter 7: Internal Modularity
7.1. Detailing the concept of procedure
7.2. Procedure arguments
7.3. Local data
Chapter 8: Managing Exceptions
8.1. What happens during Reset?
8.2. Possible exceptions
8.3. Priority management
8.4. Entry and return in exception processing
Chapter 9: From Listing to Executable: External Modularity
9.1. External modularity
9.2. The role of the assembler
9.3. The role of the linker
9.4. The loader and the debugging unit
Appendices
Appendix A. Instruction Set – Alphabetical List
Appendix B. The SysTick Timer
B.1. Counters and timers in general
B.2. SysTick
B.3. The SysTick registers
B.4. Example of SysTick programming
Appendix C. Example of a “Bootstrap” File
C.1. The listing
C.2. Important points
Appendix D. The GNU Assembler
D.1. GNU directive
D.2. Bootstrap program
Bibliography
Index
First published 2012 in Great Britain and the United States by ISTE Ltd and John Wiley & Sons, Inc.
Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address:
ISTE Ltd, 27-37 St George’s Road, London SW19 4EU, UK (www.iste.co.uk)
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA (www.wiley.com)
© ISTE Ltd 2012
The rights of Vincent Mahout to be identified as the author of this work have been asserted by him in accordance with the Copyright, Designs and Patents Act 1988.
Library of Congress Cataloging-in-Publication Data
Mahout, Vincent.
Assembly language programming : ARM Cortex-M3 / Vincent Mahout.
p. cm.
Includes bibliographical references and index.
ISBN 978-1-84821-329-6
1. Embedded computer systems. 2. Microprocessors. 3. Assembler language (Computer program language) I. Title.
TK7895.E42M34 2012
005.2--dc23
2011049418
British Library Cataloguing-in-Publication Data
A CIP record for this book is available from the British Library
ISBN: 978-1-84821-329-6
To be able to plan and write this type of book, you need a good work environment. In my case, I was able to benefit from the best working conditions for this enterprise. In terms of infrastructure and material, the Institut National de Sciences Appliquées de Toulouse, France (Toulouse National Institute of Applied Sciences), and in particular their Electrical and Computer Engineering Department, has never hesitated to invest in computer systems engineering, so that the training of our future engineers will always be able to keep up with rapid technological change. I express my profound gratitude to this institution. These systems would not have amounted to much unless, over the years, there was an educational and technical team bringing their enthusiasm and dynamism to implement them. The following pages also contain the hard work of Pascal Acco, Guillaume Auriol, Pierre-Emmanuel Hladik, Didier Le Botlan, José Martin, Sébastien Di Mercurio and Thierry Rocacher. I thank them sincerely. Two final respectful and friendly nods go to François Pompignac and Bernard Fauré who, before retirement, did much work to fertilize this now thriving land.
When writing a book on the assembly language of a µprocessor, we know in advance that it will not go down in posterity. By its very nature, an assembly language has the same life expectancy as the processor it supports, perhaps 20 years at best. What is more, this type of programming is obviously not used to develop entire software projects, so its reach is limited.
Assembly language programming is, however, an indispensable step in understanding the internal functioning of a µprocessor. This is why it is still widely taught in industrial computer training, and particularly in training engineers. It is clear that a good theoretical knowledge of a particular assembly language, combined with a practical training phase, allows for easier learning of other programming languages, whether they are the assembly languages of other processors or high-level languages.
Thus, this book intends to dissect programming in the assembly language of a μcontroller constructed around an ARM Cortex-M3 core. The choice of this µcontroller rests on the desire to explain:
This book has been written to be as generic as possible. It is certainly based on the architecture and instruction set of Cortex-M3, but with the intention of explaining the basic mechanisms of assembly language programming. It systematically uses modular programming to show how basic algorithmic structures can be programmed in assembly language. It also presents many illustrative examples, which makes it practical as well.
A computer program is usually defined as a sequence of instructions that act on data and return an expected result. In a high-level language, the sequence and data are described in a symbolic, abstract form. It is necessary to use a compiler to translate them into machine language instructions, which are only understood by the processor. Assembly language is directly derived from machine language, so when programming in assembly language the programmer is forced to see things from the point of view of the processor.
When executing a program, a computer processor obeys a series of numerical orders – instructions – that are read from memory: these instructions are encoded in binary form. The collection of instructions in memory makes up the code of the program being executed. Other areas of memory are also used by the processor during the execution of code: an area containing the data (variables, constants) and an area containing the system stack, which is used by the processor to store, for example, local data when calling subprograms. Code, data and the system stack are the three fundamental elements of all programs during their execution.
It is possible to program directly in machine language, that is, to write the instruction bit sequences themselves. In practice, however, this is not realistic, even with the more compact notation offered by hexadecimal (base 16) for the instructions. It is therefore preferable to use an assembly language. This allows code to be represented by symbolic names, adapted to human understanding, which correspond to instructions in machine language. Assembly language also allows the programmer to reserve the space needed for the system stack and data areas, giving them an initial value if necessary. Take this example of an instruction that copies the value 170 (AA in hexadecimal) into general register no. 1 of a processor. Here it is, written using the syntax of the assembly language studied here:
EXAMPLE 1.1. – A single line of code
MOV R1, #0xAA ; copy the value 170 (0xAA) into register R1
The same instruction, represented in machine language (hexadecimal base), is written: E3A010AA. The symbolic name MOV is called the mnemonic. R1 and #0xAA are the arguments of the instruction. The semicolon indicates the start of a comment that runs to the end of the current line.
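To make the link between the hexadecimal word and the assembly line more concrete, the word quoted above can be split into bit fields. The sketch below assumes the classic 32-bit ARM "data-processing with immediate operand" layout, which is the layout that matches the value E3A010AA; the Thumb-2 encodings actually executed by Cortex-M3 are laid out differently. The function name `decode_dp_immediate` and the dictionary keys are illustrative choices, not names taken from the book.

```python
def decode_dp_immediate(word):
    """Split a classic ARM data-processing (immediate) instruction into fields."""
    return {
        "cond":   (word >> 28) & 0xF,  # condition code (0xE means "always")
        "imm_op": (word >> 25) & 0x1,  # 1 when the second operand is an immediate
        "opcode": (word >> 21) & 0xF,  # 0b1101 is the MOV operation
        "s":      (word >> 20) & 0x1,  # 1 when the flags must be updated
        "rn":     (word >> 16) & 0xF,  # first operand register (not used by MOV)
        "rd":     (word >> 12) & 0xF,  # destination register number
        "rotate": (word >> 8) & 0xF,   # rotation applied to the immediate
        "imm8":   word & 0xFF,         # 8-bit immediate value
    }


fields = decode_dp_immediate(0xE3A010AA)
print("destination register: R%d" % fields["rd"])
print("immediate: %d (0x%X)" % (fields["imm8"], fields["imm8"]))
```

The destination field holds 1 (register R1) and the immediate field holds 0xAA, that is 170, which are exactly the two arguments written after the mnemonic.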
The assembler is a program responsible for translating a program written in assembly language into machine language. As input, it receives a source file written in assembly language, and it creates two files: the object file containing machine language (and the information needed to build an executable program), and the assembly listing file containing a report that details the work carried out by the assembler.
This book deals with assembly language in general, but focuses on processors based on Cortex-M3, as designed by Advanced RISC Machines (abbreviated to ARM). Different manufacturers (Freescale, STMicroelectronics, NXP, etc.) then integrate this design into µcontrollers containing memory and multiple peripherals as well as this processor core. Part of the documentation regarding this processor core is available in PDF format at www.arm.com.
ARM does not directly produce semiconductors, but rather provides licenses for microprocessor cores with 32-bit RISC architecture.
This Cambridge-based company essentially aims to provide processor designs for the embedded systems market. To give an idea of the position of this designer on this market, 95% of mobile telephones in 2008 were made with ARM-based processors. It should also be noted that the A4 and A5 processors, produced by Apple and used in their iPad tablets, are based on ARM Cortex-A type processors.
Since 1985 and its first architecture (named ARM1), ARM architectures have changed considerably. The architecture upon which Cortex-M3 is based is called ARMv7-M.
ARM’s collection is structured around four main families of products, for which many licenses have been filed¹:
Cortex-M3 targets, in particular, embedded systems requiring significant resources (32-bit), but for these the costs (production, development and consumption) must be reduced. The first overall illustration (see Figure 1.1) of Cortex-M3, as found in the technical documentation for this product, is a functional diagram. Although simple in its representation, every block could perplex a novice. Without knowing all of the details and all of the subtleties, it is useful to have an idea of the main functions performed by different blocks of the architecture.
These units make up the main part of the processor, the part that is ultimately needed to run applications and carry out their software functions:
Figure 1.1. Cortex-M3 functional diagram
The development of programs is an important and particularly time-consuming step in the development cycle of an embedded application. What is more, if the project is subject to certification requirements, tools (software and/or hardware) that allow the events occurring in each clock cycle to be monitored as closely as possible must be available. In Cortex-M3, the units briefly introduced below provide these monitoring functions. They are implemented directly in the silicon of the circuit, which lets development tools work at the hardware level. An external software layer is necessary, however, to recover and process the information issued by these units. The general idea behind these hardware solutions is to let the programmer test and improve the reliability of (or certify) his or her code without modifying it. It is convenient (and usual) to insert a print statement (“Hello I was here”) into a software structure to check that execution passes through it. Doing so, however, modifies the code, which can change the overall behavior of the program. This is particularly true when timing is critical for the system, which, for embedded systems controlled by a µcontroller, is almost always the case. The units relating to monitoring functions in Cortex-M3 include:
As already stated, ARM does not directly make semiconductors. The µcontroller core designs are sold under license to manufacturers, who add all of the peripheral units that make up the “interface with the exterior”. For example, the STM32 family of µcontrollers, made by STMicroelectronics, contains the best-selling µcontrollers using Cortex-M3. Like any good family of µcontrollers, the STM32 family is available in many versions. In early 2010, the STMicroelectronics catalog offered the products shown in Figure 1.2.
Figure 1.2. STM32 family products
Choosing the right version of the µcontroller can be a significant step in the design phase of a project: depending on the needs (in terms of functions, number of inputs/outputs, etc.) and on the project’s own constraints (cost, consumption, size, etc.), each version will be more or less well suited to the project. Again, the purpose of this book is not to go into detail on the functioning of the peripherals of the µcontroller. It is, however, useful to have an overall view of the circuit that will eventually be programmed, so that you are not surprised by such basic things as, for example, memory addressing. From a functional point of view, Figure 1.3 shows how STMicroelectronics has “dressed” Cortex-M3 to make a µcontroller (the STM32F103RB version is shown here). The first important remark is that, inevitably, not all functions offered by this µcontroller are available simultaneously. Usually several units share the same output pins. So, depending on the configuration that the programmer imposes, certain functions will de facto be unavailable to the application. Only a thorough knowledge of a given µcontroller will allow the programmer to know, a priori, whether the chosen version will be sufficient for the needs of the application being developed. The issue of choosing the best version is therefore far from trivial.
Figure 1.3. Functional description of the STM32F103RB
In looking at the outline of the STM32F103RB processor, we can see that it has ARM design elements in its processor core, namely:
STMicroelectronics then added the following functions:
Memory space management is certainly the most complicated aspect to handle when developing a program in assembly language. Fortunately, a number of assembly directives associated with a powerful linker make this management relatively simple. However, it is helpful to have a clear idea of the memory mapping in order to work with full knowledge of the facts while developing (and debugging) the program. In fact, processor registers regularly contain quantities that correspond to addresses. When the program does not behave as desired (a nasty bug, clearly!), it is helpful to know whether the quantities produced are ones that could be expected. Cortex-M3 has a 4 GB contiguous address space (32-bit bus). A memory address corresponds to one byte. It follows that a half-word (16 bits) occupies two addresses and a word (32 bits) occupies four memory addresses.
By convention, data storage follows the little-endian standard: the least significant byte of a word or half-word is stored at the lowest address, and the remaining bytes follow at increasing addresses in order of increasing significance. Figure 1.4 shows how words and half-words are placed in memory under the little-endian standard.
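The little-endian placement can be checked on a small simulated memory. The sketch below is only a model: the `memory` array, the chosen addresses and the stored values 0x12345678 and 0xABCD are arbitrary examples, and Python's standard `struct` module stands in for the processor's store instructions.

```python
import struct

memory = bytearray(8)                          # eight byte-addressable cells
struct.pack_into("<I", memory, 0, 0x12345678)  # 32-bit word at addresses 0 to 3
struct.pack_into("<H", memory, 4, 0xABCD)      # 16-bit half-word at addresses 4 and 5

# Print the content of each address, lowest address first.
for address, value in enumerate(memory):
    print("address %d: 0x%02X" % (address, value))
```

The least significant byte of the word (0x78) lands at address 0 and the most significant byte (0x12) at address 3; the half-word occupies two addresses in the same low-byte-first order.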
The architecture is of the Harvard-type, which results in a division separating code access from data access. Figure 1.5 shows how this division is planned out. The other zones (Peripheral, External, etc.) impose the placement on the addressing space of different units, as presented in Figure 1.3.
One feature of memory access that should be noted is bit banding. This technique is found in both the Static Random-Access Memory (SRAM) zone (between addresses 0x20000000 and 0x200FFFFF for the bit-band region and addresses 0x22000000 and 0x23FFFFFF for the alias) and the peripheral zone (between addresses 0x40000000 and 0x400FFFFF for the bit-band region and addresses 0x42000000 and 0x43FFFFFF for the alias). These four zones are shown by the hatching in the memory map in Figure 1.5. This technique allows the programmer to directly modify (set or reset) individual bits located at addresses within the bit-band region.
Figure 1.4. Little-endian convention
Figure 1.5. Cortex-M3 memory mapping
Figure 1.6. Bit-banding principle for the first SRAM address
What is the problem? Insofar as the architecture cannot directly act upon bits in memory, if we wish to modify a bit in a memory zone without this feature, it is necessary to:
The ARM design maps each word address in the alias zone to one bit in the bit-banding zone. So when the programmer writes a value to a word in the alias zone, this amounts to setting the corresponding bit in the bit-banding zone to the value of the least significant bit (bit 0) of the word written. Conversely, reading the least significant bit of a word in the alias zone gives the logic state of the corresponding bit in the bit-banding zone (see Figure 1.6). It should be noted that this technique does not consume any RAM, insofar as the alias zones are virtual: they do not physically correspond to memory locations. They only use up memory addresses, but with 4 GB of possible addresses this loss is of little consequence.
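The correspondence between a bit and its alias word can be written as a short address calculation. The sketch below assumes the usual Cortex-M3 layout of 1 MB bit-band regions, each mirrored by a 32 MB alias zone in which every bit gets its own 4-byte word; the function names `bitband_alias` and `write_alias`, as well as the 4-byte simulated SRAM, are hypothetical and serve only as an illustration.

```python
SRAM_BITBAND_BASE = 0x20000000
SRAM_ALIAS_BASE = 0x22000000
PERIPH_BITBAND_BASE = 0x40000000
PERIPH_ALIAS_BASE = 0x42000000
BITBAND_SIZE = 0x00100000  # assumed size of each bit-band region: 1 MB


def bitband_alias(byte_address, bit):
    """Return the alias word address matching one bit of a bit-band byte."""
    if not 0 <= bit <= 7:
        raise ValueError("bit must be between 0 and 7")
    for region_base, alias_base in (
        (SRAM_BITBAND_BASE, SRAM_ALIAS_BASE),
        (PERIPH_BITBAND_BASE, PERIPH_ALIAS_BASE),
    ):
        offset = byte_address - region_base
        if 0 <= offset < BITBAND_SIZE:
            # Each byte spans 8 alias words (32 addresses), each bit one word.
            return alias_base + offset * 32 + bit * 4
    raise ValueError("address is outside the bit-band regions")


def write_alias(sram, alias_address, value):
    """Model a word write to the SRAM alias zone: only bit 0 of value counts."""
    offset, bit = divmod((alias_address - SRAM_ALIAS_BASE) // 4, 8)
    if value & 1:
        sram[offset] |= 1 << bit
    else:
        sram[offset] &= ~(1 << bit) & 0xFF


sram = bytearray(4)  # simulated first four bytes of SRAM
write_alias(sram, bitband_alias(0x20000001, 3), 1)
print(hex(bitband_alias(0x20000000, 0)), hex(sram[1]))
```

A single write to the alias word replaces the read, modify and write-back sequence that would otherwise be needed to change one bit without disturbing its neighbors.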
1 Numbers from the third quarter of 2010.
2 Advanced High-performance Bus (AHB) is a microcontroller bus protocol brought in by ARM.
The previous chapter showed how the programmer could break down the different units present in a µcontroller like STM32 from a functional point of view. It is now time to delve a little deeper into the Cortex-M3 core, and to explain in detail the contents of the CM3CORE box, as shown in Figure 2.1.
It would be pointless to create a replica (which would be incomplete due to the simplification necessary) of the contents of the various reference manuals [ARM 06a, ARM 06b, ARM 06c] that give detailed explanations of Cortex functions. It would also be pointless, however, to claim to program in assembly language without having a reasonably precise idea of its structure. This chapter therefore attempts to present the aspects of the architecture necessary for reasoned programming of a µcontroller.
Cortex-M3 can be put into two different operating modes: thread mode and Handler mode. These modes combine with the privilege levels that can be granted to the execution of a program regarding access to certain registers:
Passage between these three types of functioning can be described by a state machine (see Figure 2.1) [YIU 07]. After being reset, the processor is in thread mode with privilege. By setting the least significant bit (LSB) of the CONTROL register, it is possible to switch into unprivileged mode (also called user mode in the ARM documentation) using software. In unprivileged mode, the user cannot access the CONTROL register, and so it is impossible to return directly to the privileged mode. Just after an exception is triggered (see Chapter 8) the processor switches to Handler mode, which necessarily has privilege. Once the exception has been processed, the processor returns to its previous mode. If, during processing of the exception, the LSB of the CONTROL register is modified, then the processor can return to the opposite privilege level to that which was in effect before the exception. The only way to switch from unprivileged thread mode back to privileged thread mode is therefore by going through an exception, which involves passing into Handler mode.
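These transitions can be summarized by a small software model. The sketch below is a deliberate simplification: it ignores nested exceptions and stack selection, it assumes that a write to CONTROL attempted without privilege is simply ignored, and the class `CoreModel` with its method names is hypothetical.

```python
class CoreModel:
    """Toy model of the Cortex-M3 mode and privilege state machine."""

    def __init__(self):
        # After reset: thread mode with privilege (LSB of CONTROL is 0).
        self.mode = "thread"
        self.control_lsb = 0

    @property
    def privileged(self):
        # Handler mode always has privilege; thread mode depends on CONTROL.
        return self.mode == "handler" or self.control_lsb == 0

    def write_control_lsb(self, value):
        """Try to write the LSB of CONTROL; return True if the write took effect."""
        if not self.privileged:
            return False  # assumption: the unprivileged write has no effect
        self.control_lsb = value & 1
        return True

    def exception_entry(self):
        self.mode = "handler"

    def exception_return(self):
        self.mode = "thread"


core = CoreModel()
core.write_control_lsb(1)            # privileged thread -> unprivileged thread
stuck = not core.write_control_lsb(0)  # cannot come back by software alone
core.exception_entry()               # Handler mode, privileged
core.write_control_lsb(0)            # the handler restores privilege
core.exception_return()
print(stuck, core.mode, core.privileged)
```

The sequence at the end follows the path described in the text: privilege is dropped by software in thread mode, the direct attempt to regain it fails, and only a handler can clear the bit again.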
This level of protection can appear somewhat minimalist. It is a little like the locking/unlocking of your mobile phone: it takes a combination of keys (known to all) to achieve. It is obviously no use in preventing theft, but it is still useful when the phone is in the bottom of a pocket.
This type of security can only be developed within a global software architecture including an operating system. In a rather simplistic but ultimately quite realistic manner, it is possible to imagine that the operating system (OS) has full access privileges. It can therefore launch tasks in unprivileged thread mode, which could guarantee an initial level of security. A second privilege level concerns the functions of the memory protection unit block mentioned previously. Each task can only access the memory regions assigned to it by the OS.
A supplementary element should be taken into account in order to understand the functioning of Cortex-M3. It concerns the internal state of the processor, which can be in either Thumb or debug state.
The term Thumb refers to the processor’s instruction set (see Chapter 5), and the associated state (Thumb state) corresponds to normal running. The debug state results from a switch to development mode. In this mode, program execution does not follow the same rules (breakpoints, watchpoints, etc.), so it is understandable that it results in a particular state. As with any event in the internal mechanisms of a processor, the switch from one state to another is reflected (and/or caused) by a change in the value of one or more bits. In this case, it involves two bits (C_DEBUGEN and C_HALT) located in the Debug Halting Control and Status Register (DHCSR).
REMARK 2.1.– Mastery of the different functioning options is not a prerequisite for writing your first programs in assembly language. The preceding brief explanations are only there to help you realize the capacities of the processor. They are also to help you to understand that the observation of the “step-by-step” execution of your program comes from the exploitation of specific processor resources by the development software.
Figure 2.1. Modes and privileges
The first thing that should be noticed about a processor is that it is made up of various registers. This is without doubt the most pragmatic approach, since modern architectures such as that of Cortex-M3 are load-store architectures. This means that programs first transfer data from memory to the registers and then perform operations on these registers. Finally, when necessary, the result is transferred back to memory.
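The three steps of a load-store architecture can be mimicked with a few lines of simulation. In the sketch below, the helper functions `ldr`, `add` and `str_` are hypothetical models named after the corresponding kinds of instruction; the register count, the addresses and the values are arbitrary, and words are stored little-endian in a small simulated memory.

```python
import struct

memory = bytearray(12)  # room for three 32-bit words: a, b and their sum
registers = [0] * 4     # a few general-purpose registers, R0 to R3


def ldr(rd, address):
    """Load: copy a 32-bit word from memory into register rd."""
    registers[rd] = struct.unpack_from("<I", memory, address)[0]


def add(rd, rn, rm):
    """Operate: add two registers, keeping the result on 32 bits."""
    registers[rd] = (registers[rn] + registers[rm]) & 0xFFFFFFFF


def str_(rt, address):
    """Store: copy register rt back into memory as a 32-bit word."""
    struct.pack_into("<I", memory, address, registers[rt])


struct.pack_into("<II", memory, 0, 1000, 234)  # a at address 0, b at address 4

ldr(0, 0)      # step 1: memory -> registers
ldr(1, 4)
add(2, 0, 1)   # step 2: operation between registers only
str_(2, 8)     # step 3: register -> memory

result = struct.unpack_from("<I", memory, 8)[0]
print(result)
```

The addition never touches memory directly: both operands are first brought into registers, and the sum only reaches memory through the final store.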
First we must define what a register is. A register, in the primary sense, corresponds to the location in internal memory (in the sense of a series of accessible bits in parallel) of a processor. We should, however, adjust this definition for the simple reason that, in the case of a µcontroller, although there is internal memory (20 KB of static RAM in the case of a standard STM32, and that is without taking into account Flash
