Embedded Cryptography provides a comprehensive exploration of cryptographic techniques tailored for embedded systems, addressing the growing importance of security in devices such as mobile systems and IoT. The book explores the evolution of embedded cryptography since its inception in the mid-1990s and covers both theoretical and practical aspects, including the implementation of cryptographic algorithms such as AES, RSA, ECC and post-quantum algorithms.
The work is structured into three volumes, spanning forty chapters and nine parts, and is enriched with pedagogical materials and real-world case studies, designed for researchers, professionals, and students alike, offering insights into both foundational and advanced topics in the field.
Embedded Cryptography 2 is dedicated to masking and cryptographic implementations, as well as hardware security.
Page count: 721
Publication year: 2025
Cover
Table of Contents
Title Page
Copyright Page
Preface
Part 1. Masking
Chapter 1. Introduction to Masking
1.1. An overview of masking
1.2. The effect of masking on side-channel leakage
1.3. Different types of masking
1.4. Code-based masking: toward a generic framework
1.5. Hybrid masking
1.6. Examples of specific maskings
1.7. Outline of the part
1.8. Notes and further references
1.9. References
Chapter 2. Masking Schemes
2.1. Introduction to masking operations
2.2. Classical linear operations
2.3. Classical nonlinear operations
2.4. Mask refreshing
2.5. Masking S-boxes
2.6. Masks conversions
2.7. Notes and further references
2.8. References
Chapter 3. Hardware Masking
3.1. Introduction
3.2. Category I: td + 1 masking
3.3. Category II: d + 1 masking
3.4. Trade-offs
3.5. Notes and further references
3.6. References
Chapter 4. Masking Security Proofs
4.1. Introduction
4.2. Preliminaries
4.3. Probing model
4.4. Robust probing model
4.5. Random probing model and noisy leakage model
4.6. Composition
4.7. Conclusion
4.8. Notes and further references
4.9. References
Chapter 5. Masking Verification
5.1. Introduction
5.2. General procedure
5.3. Verify: verification mechanisms for a set of variables
5.4. Explore: exploration mechanisms for all sets of variables
5.5. Conclusion
5.6. Notes and further references
5.7. Solution to Exercise 5.1
5.8. References
Part 2. Cryptographic Implementations
Chapter 6. Hardware Acceleration of Cryptographic Algorithms
6.1. Introduction
6.2. Hardware optimization of symmetric-key cryptography
6.3. Modular arithmetic for hardware implementations
6.4. RSA implementations
6.5. Post-quantum cryptography
6.6. Conclusion
6.7. Notes and further references
6.8. References
Chapter 7. Constant-Time Implementations
7.1. What does constant-time mean?
7.2. Low-level issues
7.3. Primitive implementation techniques
7.4. Constant-time algorithms
7.5. References
Chapter 8. Protected AES Implementations
8.1. Generic countermeasures
8.2. Secure evaluation of the SubByte function
8.3. Other functions of AES
8.4. Notes and further references
8.5. References
Chapter 9. Protected RSA Implementations
9.1. Introduction
9.2. Building a protected RSA implementation step by step
9.3. Remarks and open discussion
9.4. Notes and further references
9.5. References
Chapter 10. Protected ECC Implementations
10.1. Introduction
10.2. Protecting ECC implementations and countermeasures
10.3. Conclusion
10.4. Notes and further references
10.5. References
Chapter 11. Post-Quantum Implementations
11.1. Introduction
11.2. Post-quantum encryption and key encapsulation
11.3. Post-quantum signatures
11.4. Notes and further references
11.5. References
Part 3. Hardware Security
Chapter 12. Hardware Reverse Engineering and Invasive Attacks
12.1. Introduction
12.2. Preparation for hardware attacks
12.3. Probing attacks
12.4. Delayering and reverse engineering
12.5. Memory dump and hardware cloning
12.6. Conclusion
12.7. Notes and further references
12.8. References
Chapter 13. Gate-Level Protection
13.1. Introduction
13.2. DPL principle, built-in DFA resistance, and latent side-channel vulnerabilities
13.3. DPL families based on standard cells
13.4. Technological specific DPL styles
13.5. DPL styles comparison
13.6. Conclusion
13.7. Notes and further references
13.8. References
Chapter 14. Physically Unclonable Functions
14.1. Introduction
14.2. PUF architectures
14.3. Reliability enhancement
14.4. Entropy assessment
14.5. Resistance to attacks
14.6. Characterizations
14.7. Standardization
14.8. Notes and further references
14.9. References
List of Authors
Index
Summary of Volume 1
Summary of Volume 3
End User License Agreement
Chapter 1
Table 1.1. Instantiation of masking schemes through gen...
Chapter 3
Table 3.1. Distribution of output sharing for Example 3.1
Chapter 12
Table 12.1. Sample preparation methods
Chapter 13
Table 13.1. Exhaustive simulation of all the return to...
Table 13.2. DPL performance and security features overview
Chapter 14
Table 14.1. List of the main PUF types in digital circuits
Chapter 1
Figure 1.1. Generalization of masking schemes under code-based masking.
Chapter 2
Figure 2.1. The LinearRefreshMasks algorithm, with the randoms r...
Figure 2.2. Recursive definition of ...
Figure 2.3. Sequential computation of x^254
Figure 2.4. The sequence of gadgets of high order Boolean to...
Chapter 3
Figure 3.1. Second output share of ISW with n = 2
Figure 3.2. Example of a glitch in the signal xy.
Figure 3.3. Adversarial model with glitch-extended probes
Chapter 4
Figure 4.1. Illustration of (de)composition
Figure 4.2. Composition of two t-NI gadgets.
Figure 4.3. Composition of a t-SNI gadget and a t-NI gadget.
Figure 4.4. xor gate with a single share for each input
Figure 4.5. Parallel (left) and sequential (right) composition of two gadgets
Chapter 6
Figure 6.1. Comparison of the implementation properties of ASICs...
Chapter 8
Figure 8.1. Bit-sliced AES S-box where (B7, B6, ...
Figure 8.2. Bit-sliced AES inv S-box where (B7, B6, ...
Chapter 9
Figure 9.1. RSA-CRT pseudo-code
Figure 9.2. Protected RSA-CRT pseudo-code
Chapter 10
Listing 10.1. fe25519_cswap function
Chapter 12
Figure 12.1. IronKey device before and after opening its metal case
Figure 12.2. IronKey device before and after epoxy removal.
Figure 12.3. Omnipod PCB before and after components removal.
Figure 12.4. eMMC package carrier at different depth of polishing.
Figure 12.5. Four PCB layers extracted with X-ray CT of Infineon Trust B...
Figure 12.6. Device prepared for non-invasive attack.
Figure 12.7. Decapsulated dies with Aluminum and gold wires.
Figure 12.8. Decapsulated die with copper wires.
Figure 12.9. Decapsulated dies bonded after invasive preparation.
Figure 12.10. Device prepared for semi-invasive attack.
Figure 12.11. Thinning substrate on lapping machine.
Figure 12.12. Debug interface wired to the microcontroller.
Figure 12.13. Probing station (a) and probing needles landed on the chip surface...
Figure 12.14. Probing tips landed on the data bus of the chip.
Figure 12.15. Optical and SEM images of test points created under FIB.
Figure 12.16. 1 μm (a) and 0.5 μm (b) chip after its top layer was...
Figure 12.17. Surface of the chip after lapping: (a) corner; (b) logic area.
Figure 12.18. Surface of the chip after CMP: (a) corner; (b) logic area.
Figure 12.19. Surface of the chip after RIE: (a) corner; (b) logic area.
Figure 12.20. Optical and SEM images of the logic area after selective...
Figure 12.21. Layers in 0.35 μm chip: (a) metal 3; (b) metal 2;...
Figure 12.22. Reverse engineering phases: (a) SEM overlay image;...
Figure 12.23. Examples of deprocessed mask ROMs: (a) wet etching to metal 1;...
Figure 12.24. Examples of stained mask ROMs: (a) deprocessed NOR ROM;...
Figure 12.25. Flash area SEM imaging in 90-nm chip: (a) conventional image;...
Chapter 13
Figure 13.1. WDDL AND gate with the early evaluation flaw
Figure 13.2. (a) Genuine DRSL AND gate and (b) a glitch-free variant
Figure 13.3. Signal levels in a DRSL AND gate while it returns to precharge in a...
Figure 13.4. Timing optimization in DPL protocol when the precharge is...
Figure 13.5. Combinatorial function in (a) unprotected...
Figure 13.6. Chronogram of execution of the netlist depicted in Figure ...
Figure 13.7. Pairwise balance of dual-rail pairs in a DPL netlist
Figure 13.8. Example of netlist (e.g. one half of a WDDL netlist)....
Figure 13.9. Schematic of the QDI secured (aka SecLib) AND gate (left)...
Figure 13.10. True half of the AND gate in LBDL
Chapter 14
Figure 14.1. PUF concept, from design to fabrication.
Figure 14.2. Process variation in MOSFET transistor.
Figure 14.3. Block diagram of a PUF circuit
Figure 14.4. Generic architecture of PUF.
Figure 14.5. Architecture of a PUF circuit in its real environment.
Figure 14.6. Strong versus weak PUF.
Figure 14.7. The two phases of the weak PUF lifecycle.
Figure 14.8. Contact PUF.
Figure 14.9. SRAM cell with cross-coupled inverters and device fingerprint...
Figure 14.10. Cross-couple latch: A basic building block for memory-based PUFs
Figure 14.11. Butterfly PUF
Figure 14.12. Arbiter PUF
Figure 14.13. Arbiter PUF with identical delay chain
Figure 14.14. RO-PUF
Figure 14.15. Loop-PUF
Figure 14.16. PUF fuzzy extraction of a key by means of ECC and secure sketch
Figure 14.17. The fuzzy extractor for key generation
Figure 14.18. BER with and without challenges filtering.
Figure 14.19. Different entropy types for a delay PUF with n ≤ 10 bits.
Figure 14.20. Distribution of 2^n PUF responses into the set...
Figure 14.21. The four dimensions involved in the PUFs metrics.
SCIENCES
Computer Science, Field Director – Jean-Charles Pomerol
Cryptography, Data Security, Subject Head – Damien Vergnaud
Coordinated by
Emmanuel Prouff
Guénaël Renault
Matthieu Rivain
Colin O’Flynn
First published 2025 in Great Britain and the United States by ISTE Ltd and John Wiley & Sons, Inc.
Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the under mentioned address:
ISTE Ltd
27-37 St George's Road
London SW19 4EU
UK
www.iste.co.uk
John Wiley & Sons, Inc.
111 River Street
Hoboken, NJ 07030
USA
www.wiley.com
© ISTE Ltd 2025
The rights of Emmanuel Prouff, Guénaël Renault, Matthieu Rivain and Colin O'Flynn to be identified as the authors of this work have been asserted by them in accordance with the Copyright, Designs and Patents Act 1988.
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s), contributor(s) or editor(s) and do not necessarily reflect the views of ISTE Group.
Library of Congress Control Number: 2024945128
British Library Cataloguing-in-Publication Data
A CIP record for this book is available from the British Library
ISBN 978-1-78945-214-3
ERC code: PE6 Computer Science and Informatics; PE6_5 Security, privacy, cryptology, quantum cryptography
Emmanuel PROUFF1, Guénaël RENAULT2, Matthieu RIVAIN3 and Colin O’FLYNN4
1LIP6, Sorbonne Université, Paris, France
2Agence nationale de la sécurité des systèmes d’information, Paris, France
3CryptoExperts, Paris, France
4Dalhousie University and NewAE Technology Inc, Halifax, Canada
The idea for this project was born during a discussion with Damien Vergnaud. Damien had been asked to propose a series of volumes covering the different domains of modern cryptography for the SCIENCES series. He offered us the opportunity to take charge of the Embedded Cryptography books, which sounded like a great challenge to take on. In particular, we thought it was perfectly timely as the field was gaining increasing importance with the growing development of complex mobile systems and the Internet of Things.
The field of embedded cryptography, as a research domain, was born in the mid-1990s. Until that time, the evaluation of a cryptosystem and the underlying attacker model were usually agnostic of implementation aspects whether the cryptosystem was deployed on a computer or on some embedded hardware like a smart card. Indeed, the attacker was assumed to have no other information than the final results of a computation and, possibly, the corresponding inputs. In this black box context, defining a cryptanalytic attack and evaluating resistance to it essentially consisted of finding flaws in the abstract definition of the cryptosystem.
In the 1990s, teams of researchers published the first academic results, highlighting very effective means of attack against embedded systems. These attacks were based on the observation that a system’s behavior during a computation strongly depends on the values of the data manipulated (which was previously known and exploited by intelligence services). Consequently, a device performing cryptographic computation does not behave like a black box whose inputs and outputs are the only known factors. The power consumption of the device, its electromagnetic radiation or its running time are indeed other sources that provide the observer with information on the intermediate results of the computation. Teams of researchers have also shown that it was possible to disrupt a computation using external energy sources such as lasers or electromagnetic pulses.
Among these so-called physical attacks, two main families emerge. The first gathers the (passive) side-channel attacks, including timing attacks proposed by Kocher in 1996 and power analysis attacks proposed by Kocher et al. in 1999, as well as the microarchitectural attacks which have considerably developed after the publication of the Spectre and Meltdown attacks in 2018. This first family of attacks focuses on the impact that the data manipulated by the system have on measurable physical quantities such as time, current consumption, or energy dissipation related to state changes in memories. The second family gathers the (active) fault injection attacks, whose first principles were introduced by Boneh et al. in 1997. These attacks aim to put the targeted system into an abnormal state of functioning. They consist, for example, of ensuring that certain parts of a code are not executed or that operations are replaced by others. Using attacks from either of these families, an adversary might learn sensitive information by exploiting the physical leakage or the faulted output of the system.
Since their inception, side-channel attacks and fault injection attacks, along with their countermeasures, have significantly evolved. Initially, the embedded systems industry and a limited number of academic labs responded with ad hoc countermeasures. Given the urgency of responding to the newly published attacks, these countermeasures were reasonably adequate at the time. Subsequently, the invalidation of many of these countermeasures and the increasing sophistication of attack techniques highlighted the need for a more formalized approach to security in embedded cryptography. A community was born from this observation in the late 1990s and gathered around a dedicated conference known as Cryptographic Hardware and Embedded Systems (CHES). Since then, the growth of this research domain has been very significant, driven by the strong stakes for industrial players and the scientific interest of the open security issues. Nowadays, physical attacks involve state-of-the-art equipment capable of targeting the nanoscale technologies used in the semiconductor industry. Attackers routinely use advanced statistical analyses and signal processing, while the defenders designing countermeasures call on concepts from algebra, probability theory and formal methods. More recently, and notably with the publication of the Spectre and Meltdown attacks, side-channel attacks have extended to so-called microarchitectural attacks, exploiting very common optimization techniques in modern CPUs such as out-of-order execution or speculative execution. Twenty-five years after the foundational work, there is now a large community of academic and industrial scientists dedicated to these problems. Embedded cryptography has gradually become a classic topic in cryptography and computer security, as illustrated by the increasing importance of this field in major cryptography and security conferences besides CHES, such as CRYPTO, Eurocrypt, Asiacrypt, Usenix Security, IEEE S&P or ACM CCS.
For this work, it seemed important to us to have both scientifically ambitious and pedagogical content. We indeed wanted this book to appeal not only to researchers in embedded cryptography but also to Master’s students interested in the subject and curious to take their first steps. It was also important to us that the concepts and notions developed in the book be as illustrated as possible and therefore accompanied by a pedagogical base. In addition to the numerous illustrations proposed in the chapters, we have made pedagogical material available (attack scripts, implementation examples, etc.) to test and deepen the various concepts. These can be found on the following GitHub organization: https://github.com/embeddedcryptobook.
This book provides a comprehensive exploration of embedded cryptography. It comprises 40 chapters grouped into nine main parts and spanning three volumes. The book primarily addresses side-channel and fault injection attacks as well as their countermeasures. Part 1 of Volume 1 is dedicated to Software Side-Channel Attacks, namely, timing attacks and microarchitectural attacks, primarily affecting software; whereas Part 2 is dedicated to Hardware Side-Channel Attacks, which exploit hardware physical leakages, like power consumption and electromagnetic emanations. Part 3 focuses on the second crucial family of physical attacks against embedded systems, namely, Fault Injection Attacks.
Part 1 of this volume is dedicated to Masking, a widely used countermeasure against side-channel attacks that has become an important research topic since its introduction in 1999. This part covers a variety of masking techniques, their security proofs and their formal verification. Besides general masking techniques, efficient and secure embedded cryptographic implementations depend heavily on the underlying algorithm. Consequently, Part 2, Cryptographic Implementations, is dedicated to the implementation of specific cryptographic algorithm families, namely, AES, RSA, ECC, and post-quantum cryptography. This part also covers hardware acceleration and constant-time implementations. Secure embedded cryptography needs to rely on secure hardware and secure randomness generation. In cases where hardware alone is insufficient for security, we must rely on additional software techniques to protect cryptographic keys; the latter approach is known as white-box cryptography. The next three parts of the book address those aspects. Part 3 of the current volume, Hardware Security, covers invasive attacks, hardware countermeasures and physically unclonable functions (PUFs).
Part 1 of Volume 3 is dedicated to White-Box Cryptography: it covers general concepts, practical attack tools, automatic (gray-box) attacks and countermeasures as well as code obfuscation, which is often considered as a complementary measure to white-box cryptography. Part 2 of Volume 3 is dedicated to Randomness and Key Generation in embedded cryptography. It covers both true and pseudo randomness generation as well as randomness generation for specific cryptographic algorithms (prime numbers for RSA, random nonces for ECC signatures, random errors for post-quantum schemes).
Finally, we wanted to include concrete examples of real-world attacks against embedded cryptosystems. The final part of this series of books contains those examples of Real-World Applications and Attacks in the Wild. While not exhaustive, we selected representative examples illustrating the practical exploitation of the attacks presented in this book, hence demonstrating the necessity of the science of embedded cryptography.
This series of books results from a collaborative work and many persons from the embedded cryptography community have contributed to its development. We have tried to cover (as broadly as possible) the field of embedded cryptography and the many research directions related to this field. This has not been an easy task, given the dynamism and growth of the field over the past 25 years. Some experts from the community kindly shared their insights on the preliminary plan of the book, namely, Sonia Belaïd, Pierre-Alain Fouque, Marc Joye, Victor Lomné and Yannick Sierra. We would like to thank them for their insightful comments.
For each of the identified topics we wanted the book to cover, we have called upon expert researchers from the community, who have honored us by joining this project. The essence of this work is theirs. Dear Guillaume Barbu, Lejla Batina, Sonia Belaïd, Davide Bellizia, Florent Bernard, Begül Bilgin, Eleonora Cagli, Łukasz Chmielewski, Jessy Clédière, Brice Colombier, Jean-Sébastien Coron, Jean-Luc Danger, Lauren De Meyer, Cécile Dumas, Jean-Max Dutertre, Viktor Fischer, Pierre Galissant, Benedikt Gierlichs, Louis Goubin, Vincent Grosso, Sylvain Guilley, Patrick Haddad, Laurent Imbert, Ján Jančár, Marc Joye, Matthias Kannwischer, Stefan Katzenbeisser, Victor Lomné, José Lopes-Esteves, Ange Martinelli, Pedro Massolino, Loïc Masure, Nele Mentens, Debdeep Mukhopadhyay, Camille Mutschler, Ruben Niederhagen, Colin O’Flynn, Elisabeth Oswald, Dan Page, Pascal Paillier, Louisa Papachristodoulou, Thomas Peeters, Olivier Peirera, Thomas Pornin, Bart Preneel, Thomas Prest, Jean-René Reinhard, Thomas Roche, Francisco Rodríguez-Henríquez, Franck Rondepierre, Eyal Ronen, Melissa Rossi, Mylene Rousselet, Sylvain Ruhault, Ulrich Ruhrmair, Sayandeep Saha, Patrick Schaumont, Sebastian Schrittwieser, Peter Schwabe, Richa Singh, Sergei Skorobogatov, François-Xavier Standaert, Petr Svenda, Marek Sys, Akira Takahashi, Abdul Rahman Taleb, Yannick Teglia, Philippe Teuwen, Adrian Thillard, Medhi Tibouchi, Mike Tunstall, Aleksei Udovenko, David Vigilant, Lennert Wouters, Yuval Yarom and Rina Zeitoun: thank you so much for the hard work!
To accompany these authors, we also relied on numerous reviewers who kindly shared their remarks on the preliminary versions of the chapters. Their behind-the-scenes work allowed us to greatly improve the technical and editorial quality of the books. We express our gratitude to them, namely, Davide Alessio, Sébastien Bardin, Sonia Belaïd, Eloi Benoist-Vanderbeken, Gaëtan Cassiers, Jean-Sebastien Coron, Debayan Das, Cécile Dumas, Julien Eynard, Wieland Fischer, Thomas Fuhr, Daniel Genkin, Dahmun Goudarzi, Eliane Jaulmes, Victor Lomné, Loïc Masure, Bart Mennink, Stjepan Picek, Thomas Pornin, Thomas Prest, Jurgen Pulkus, Michaël Quisquater, Thomas Roche, Franck Rondepierre, Franck Salvador, Tobias Schneider, Okan Seker, Pierre-Yves Strub, Akira Takahashi, Abdul Rahman Taleb, Mehdi Tibouchi, Aleksei Udovenko, Gilles Van Assche, Damien Vergnaud, Vincent Verneuil and Gabriel Zaid.
October 2024
Ange MARTINELLI and Mélissa ROSSI
Agence nationale de la sécurité des systèmes d’information, Paris, France
In order to thwart the menace of passive side-channel attacks such as Differential Power Analysis (DPA) or template attacks, the most widely deployed software countermeasure has been masking since its introduction in 1999. The main observation is that side-channel attacks exploit the relation between intermediate values and leakage traces. Thus, if the intermediate values are independent of any secret, no information can be retrieved from the leakage. In a nutshell, masking consists of randomizing the intermediate data, which depend on both the secret values and known variables.
Each secret input x is basically split into d + 1 variables (xi)0≤i≤d referred to as shares: d of them are generated uniformly at random, whereas the last one is computed such that their combination through some defined operation reveals the secret value x. The integer d is called the masking order.
Sharing: let 𝒳 be the finite support of the possible values for a sensitive variable x. Let F: 𝒳^(d+1) → 𝒳 be a function. A (d + 1)-tuple (x0, …, xd) is a sharing of x with respect to the recombination function F iff:
– the distribution of every subset of (x0, …, xd) of size at most d is statistically independent from x;
– x = F(x0, …, xd).
The following example gives a method to design a three-sharing of an 8-bit value with respect to the XOR operation ⊕.
To mask an 8-bit value x at order 2 with respect to the recombination function F(x0, x1, x2) := x0 ⊕ x1 ⊕ x2, we can proceed by generating two values x0 and x1 uniformly at random in {0, 1}^8, and next defining x2 := x ⊕ x0 ⊕ x1. That way, the second condition is automatically satisfied. For the first condition, any set of two values in (x0, x1, x2) is indistinguishable from the uniform distribution over ({0, 1}^8)^2, and the same holds for singletons. Thus, (x0, x1, x2) is a sharing of x with respect to the operation ⊕.
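As a quick illustration, here is a minimal Python sketch of this order-2 Boolean sharing and its recombination; the function names and the use of the secrets module are ours, not taken from the book's pedagogical material.

```python
import secrets

def boolean_share(x: int, d: int = 2, bits: int = 8):
    """Split x into d+1 shares whose XOR equals x (Boolean masking)."""
    shares = [secrets.randbits(bits) for _ in range(d)]  # x0, ..., x_{d-1} uniform
    last = x
    for s in shares:
        last ^= s                                        # x_d := x ^ x0 ^ ... ^ x_{d-1}
    shares.append(last)
    return shares

def boolean_recombine(shares):
    """Recombination function F(x0, ..., xd) = x0 ^ ... ^ xd."""
    out = 0
    for s in shares:
        out ^= s
    return out

x = 0xA7
sh = boolean_share(x)
assert boolean_recombine(sh) == x
```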
Masked scheme: given a circuit implementing a cryptographic scheme, we call a masked scheme a scheme that is functionally equivalent and in which all the sensitive values are manipulated in masked form.
Note that a masked scheme is not secure by construction, and designing one can be an error-prone task. Each masked design should be proven in a security model, as will be presented later in Chapter 4 of this volume. For this purpose, we also define the notion of gadget, which corresponds to sub-algorithms.
Gadget: a gadget is a probabilistic algorithm that takes shared and unshared input values and returns shared and unshared values.
In practice, while security increases with the masking order, the cost of masking can be quite heavy, both in terms of circuit size and performance, and it increases with the number of shares; thus, most masked scheme implementations are at first order (i.e. with two shares) or second order (i.e. with three shares).
In general, if the device leaks information on one manipulated variable at a time, a d-order masking resists stronger attacks that can combine information from at most d measurement points. In particular, Chari et al. (1999) have shown that the number of measurements required to mount a successful power analysis attack increases exponentially with the number of shares (see higher-order attacks in Chapter 7 of Volume 1). Their model considered the very classical case where each manipulated bit leaks its noisy value, that is, the sum of its value and a Gaussian noise. Therefore, the masking order represents a trade-off between security and efficiency.
However, the reader may wonder about the precise theoretical link between masking and the potential physical leakages. Indeed, the masking countermeasure protects against an attacker able to obtain at most d intermediate values with exact precision. This attacker model is called the d-probing model. Such a model could be far from reality, where the attacker actually obtains many noisy intermediate values through measured leakage. We can thus define various leakage models that are either more realistic, such as the noisy leakage model, or in which the security of masking is easier to assess, such as the probing model.
Let us first mention the three prominent leakage models. More details with formal definitions will be provided later in this chapter.
Probing model: in this model, besides the black-box information of the cryptosystem, the attacker is able to obtain the actual values (called "probes") of at most d intermediate values of its choice, where d is a parameter. Masking allows us to reach provable security in this model.
Random probing model: in this model, besides the black-box information of the cryptosystem, the attacker has access to each intermediate value with a certain probability p, where p is a parameter.
Noisy leakage model: in this model, besides the black-box information of the cryptosystem, the attacker has access to all intermediate values but they are all noisy. The standard deviation of the noise level is a parameter of this model.
The precise description of, and reductions between, these models will be detailed and proved in Chapter 4 of this volume. A high-level idea that outlines the power of provable masking security is as follows: it is possible to derive bounds on the parameters to move from one model to another. For example, if a scheme is secure in the probing model with a high d, it will be secure in the noisy leakage model with a low level of noise.
As seen in Definition 1.1, masking is a very versatile notion that can be implemented with any recombination function, purposely denoted by an arbitrary F. The choice of this function has strong repercussions on the efficiency and the leakage of the overall implementation. For efficiency, an operator that behaves efficiently, typically linearly, when composed with operations on shares is desirable. However, linear operations allow for easy recovery with side-channel information. In particular, the higher the algebraic complexity of the operation, the higher the impact of the noise on the recombining function, and thus the higher the concrete security against side-channel attacks. In practice, masking is a costly countermeasure and efficiency matters are prominent. Thus, the vast majority of masked schemes involve recombination functions that are efficient when composed with the most frequent operations of the cryptographic algorithm, often linear operations such as XOR, modular additions or multiplications by scalars. In practice, the two foremost maskings are Boolean masking and arithmetic masking.
Boolean and arithmetic masking: these families of masking provide the best trade-off between efficiency and leakage. The main cryptographic operations are linear with respect to either Boolean or arithmetic masking. Thus, even if the leakage could be strong in the case of low noise, it is sometimes less expensive to increase the masking order than to consider another type of masking with the same order.
Let 𝒳 be the support of the possible values for a sensitive variable x. Let F: 𝒳^(d+1) → 𝒳 be a function and q a parameter. A Boolean, respectively arithmetic, masked value is a sharing as defined in Definition 1.1 with F(x0, …, xd) := x0 ⊕ … ⊕ xd, respectively F(x0, …, xd) := x0 + ⋯ + xd mod q.
Let us remark that the space 𝒳 is not necessarily a set of integers; it can be any set compatible with the recombination function. Commonly, for example, a sharing can be defined over a polynomial ring such as ℤq[X]/(X^N + 1), as used in lattice-based schemes. In that case, an arithmetic masking modulo q is a natural masking solution.
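For comparison, here is a minimal sketch of arithmetic masking modulo q; the modulus q = 3329 is only an illustrative choice, not one prescribed by the text.

```python
import secrets

Q = 3329  # illustrative modulus

def arith_share(x: int, d: int, q: int = Q):
    """Split x into d+1 shares whose sum modulo q equals x."""
    shares = [secrets.randbelow(q) for _ in range(d)]
    shares.append((x - sum(shares)) % q)  # the last share fixes the sum
    return shares

def arith_recombine(shares, q: int = Q):
    """Recombination function F(x0, ..., xd) = x0 + ... + xd mod q."""
    return sum(shares) % q

assert arith_recombine(arith_share(1234, d=3)) == 1234
```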
Polynomial masking: this masking leverages the idea behind Lagrange polynomials: let P be a polynomial of degree d. The knowledge of d + 1 couples (yi, P(yi)) for distinct y0, y1, …, yd allows us to completely reconstruct the polynomial P. Conversely, the knowledge of strictly fewer than d + 1 couples (yi, P(yi)) gives no information on the actual polynomial. The principle for masking a sensitive value is simple: x is first embedded in a degree-d polynomial P, typically such that P(0) = x. The shares are therefore defined as d + 1 couples xi = (yi, P(yi)), and we choose here (1, P(1)), (2, P(2)), …, (d + 1, P(d + 1)). Given all the P(i), the polynomial can be recovered through Lagrange's interpolation formula:
LP(1),…,P(d+1)(X) = Σ1≤i≤d+1 P(i) · Π1≤j≤d+1, j≠i (X − j)/(i − j).
Next, we can recombine the secret with F(x0, …, xd) = LP(1),…,P(d+1)(0).
This type of masking has a quite complex recombination function; thus, it intuitively leads to a very low practical leakage while being somewhat more expensive in terms of performance. Indeed, few common cryptographic operations are linear with respect to a decomposition into polynomial evaluations.
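A minimal sketch of polynomial (Shamir-style) masking follows; for readability it works over a small prime field rather than the binary fields typically used in practice, and all names are illustrative.

```python
import secrets

P = 257  # small prime field, for illustration only

def poly_share(x: int, d: int):
    """Embed x as Q(0) in a random degree-d polynomial; output the couples (i, Q(i)) for i = 1..d+1."""
    coeffs = [x] + [secrets.randbelow(P) for _ in range(d)]
    def Q(y):
        acc = 0
        for c in reversed(coeffs):  # Horner evaluation
            acc = (acc * y + c) % P
        return acc
    return [(i, Q(i)) for i in range(1, d + 2)]

def poly_recombine(shares):
    """Lagrange interpolation at 0 recovers x = Q(0)."""
    x = 0
    for i, (yi, vi) in enumerate(shares):
        num, den = 1, 1
        for j, (yj, _) in enumerate(shares):
            if j != i:
                num = (num * (-yj)) % P
                den = (den * (yi - yj)) % P
        x = (x + vi * num * pow(den, -1, P)) % P
    return x

sh = poly_share(42, d=3)
assert poly_recombine(sh) == 42
```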
Multiplicative masking: multiplicative masking consists of applying F(x0, …, xd) := x0 × ⋯ × xd in Definition 1.1, where the multiplication is performed in the corresponding finite field. This type of masking can be very well suited for the SubBytes step of the AES. Indeed, x ↦ x⁻¹ is an endomorphism of GF(256)*.
Practical activity: Algorithm 1.1 is applying this technique for a field inversion after an addition of the key to the plaintext. The inversion is performed in GF(256)* and it is sequentially applied to every byte of the 128-bit state, denoted as S. For the sake of simplicity and because the XOR operation is outside of the scope of this activity, the masked state S = (S0, …, Sd) given as input of the algorithm is the masked XOR of the key k and the plaintext P. The shares S1, … Sd are generated uniformly at random in (GF(256)*)16 and S0 = x × S1 × ⋯ × Sd. In the additional material, some traces of this algorithm are provided with the corresponding plaintexts. They were derived on a ChipWhisperer-Lite Xmega. A high-order side-channel attack seems tedious; you could exploit a property of multiplicative masking to recover the whole secret k with a simple first-order DPA (see Chapter 4 of Volume 1).
Inner-product masking and affine masking: as outlined above, multiplicative masking has some vulnerabilities if not used properly, but we can combine it with a Boolean one as follows.
Let x ∈ 𝒳 be a sensitive value, (v1, …, vd) be a fixed vector and 〈·,·〉 be a ⊕-linear inner product defined on 𝒳. An inner-product masked value is a masked value as defined in Definition 1.1, where F recombines the shares through this inner product with the fixed vector.
The particular case of d = 1 and v1 = 1 is called an affine masking.
This masking is slightly less efficient than Boolean masking because of the extra multiplication operations, but it may offer stronger security as the leakage can be better de-correlated from the secret. This kind of masking and the following schemes aim at lowering the amount of information that can be recovered from the leakage. They can all offer a better security/complexity trade-off in a specific setup.
Leakage squeezing: this consists of applying a different bijection to each share and then proceeding with standard masking, mostly Boolean. The squeezing functions need to be bijective so that the plain shares can be recovered during computation and the output can be unmasked. In the end, first-order Boolean masking combined with leakage squeezing gives security similar to masking with two to three more shares, at a different complexity that is better suited to hardware.
Let (x0, …, xd) be a masked sensitive value. Let (B0, …, Bd) be a set of bijections over 𝒳. We call leakage squeezing the application of the Bi to the shares, such that the operated circuit works on (B0(x0), …, Bd(xd)).
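A toy sketch of leakage squeezing on top of first-order Boolean masking is given below; the Bi are taken here as randomly drawn byte permutations, a simplification of the structured bijections used in real designs, and all names are ours.

```python
import secrets
import random

def random_bijection(seed: int):
    """A random permutation of the byte space and its inverse, used as one squeezing function Bi."""
    rng = random.Random(seed)
    perm = list(range(256))
    rng.shuffle(perm)
    inv = [0] * 256
    for a, b in enumerate(perm):
        inv[b] = a
    return perm, inv

B0, B0_inv = random_bijection(1)
B1, B1_inv = random_bijection(2)

def squeeze(x: int):
    """First-order Boolean sharing; each share is then stored through its own bijection."""
    x0 = secrets.randbits(8)
    x1 = x ^ x0
    return B0[x0], B1[x1]           # the circuit manipulates B0(x0), B1(x1)

def unsqueeze(z0: int, z1: int):
    return B0_inv[z0] ^ B1_inv[z1]  # invert the bijections, then unmask

assert unsqueeze(*squeeze(0x3C)) == 0x3C
```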
Orthogonal direct sum masking: in order to ensure dual security against both passive and active attacks, masking using error-correcting codes has been proposed as follows.
Let x be a sensitive value and y a vector of masks. Let G and H be generator matrices of two linear codes 𝒞 and 𝒟 that are supplementary (they intersect only in 0 and together span the whole space) and mutually orthogonal. We can define the orthogonal direct sum masking of x as z = xG ⊕ yH, where xG and yH denote vector–matrix products. We can then recover x as x = zGᵀ(GGᵀ)⁻¹.
While Boolean and arithmetic masking are the most common choices, a lot of other operations are possible, as outlined in section 1.4. Many more masking recombination functions F have been proposed, either to improve the security in some settings or to provide a more global theoretical background. In this section, we outline a generalization of the Boolean masking family. Indeed, Boolean masking can be seen as a sub-case of inner-product masking. This can go further: inner-product masking is a sub-case of leakage squeezing (LS), itself generalized as orthogonal direct sum masking. As a side methodology with some intersections, there stands polynomial masking.
As an attempt at uniformization, generalized code-based masking has been introduced to give a common framework to all of these schemes, as shown in Figure 1.1. It has been shown that any other masking scheme outlined in section 1.4 can be instantiated using generalized code-based masking.
Generalized code-based masking is similar to orthogonal direct sum masking, where G and H can be any generator matrices such that the corresponding codes intersect only in 0. Let us define the matrices G and H as generator matrices of the linear codes 𝒞 and 𝒟, respectively. Let k be the dimension of 𝒞 and t the dimension of 𝒟. Table 1.1 gives an overview of how to parameterize code-based masking to instantiate specific maskings.
Table 1.1. Instantiation of masking schemes through generalized code-based masking

Scheme      | Conditions       | Parameters
Boolean     | (1 0 0 ⋯ 0)      | k = 1, d = t + 1
IP          | (1 0 0 ⋯ 0)      | k = 1, d = t + 1
LS          |                  | d = k + t
ODS         |                  | d = k + t
Polynomial  | (1 1 ⋯ 1)        | d ≥ k + t
GCB         |                  | d ≥ k + t

Figure 1.1. Generalization of masking schemes under code-based masking.
One type of masking is rarely suited to a whole cryptographic algorithm. For example, whereas a multiplicative masking can be well suited for a SubBytes application corresponding to x ↦ x^254 in GF(256), it is not suited for the addition of the sub-keys to the value (an operation that is linear for ⊕). Furthermore, an arithmetic masking could be perfectly suited for the addition of polynomials in a ring such as ℤq[X]/(X^N + 1), but it is not well suited for deriving a masked outcome of a comparison |p| ≥ K, for p a polynomial coefficient and K a constant. These cases are very frequent in cryptographic algorithms. Hence, several types of masking can be necessary for an efficient masked algorithm. In consequence, conversion algorithms are necessary. The particularity of these algorithms is the fact that they are functionally equivalent to the identity function; however, they take a masked variable as input and return this same variable but masked with a different type of masking.
There is no generic technique for deriving such algorithms independently of the types of masking. Efficient conversions are sometimes challenging to design in a secure way. The case of Boolean to arithmetic conversions will be depicted later in Chapter 2 of this volume. There also exist less widespread conversions like multiplicative to Boolean. Some additional information on these conversions will be pointed out later in the references.
Masking is a very powerful countermeasure and is usually implemented either with table-based methods or by directly implementing masked operations. Nevertheless, some algorithms can be efficiently masked with specific methods.
As will be detailed in Chapter 9 of this volume, the RSA algorithm has simple and efficient ways to perform exponent blinding to protect the exponentiation modulo the product of two primes. This kind of masking uses a random multiple of the order of the multiplicative group to protect the exponentiation. The eventual output stays unchanged but the actual exponent used in the computation is randomized.
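A minimal sketch of exponent blinding on a toy RSA instance is given below; the parameters are deliberately tiny and the names illustrative, and a real implementation would use full-size keys and a constant-time exponentiation.

```python
import secrets

# Toy RSA parameters (far too small for real use).
p, q = 1009, 1013
N = p * q
phi = (p - 1) * (q - 1)
e = 17
d = pow(e, -1, phi)

def blinded_decrypt(c: int) -> int:
    """Randomize the exponent: d' = d + r*phi gives the same result,
    since m^phi = 1 mod N whenever gcd(m, N) = 1."""
    r = secrets.randbelow(2**32)
    d_blinded = d + r * phi
    return pow(c, d_blinded, N)

m = 123456
c = pow(m, e, N)
assert blinded_decrypt(c) == m
```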
Mini-game: let us step into the adversary's shoes. Here are several masked algorithms, and we assume that they are functionally correct, that is, the outputs are correctly computed as functions of the inputs. However, for each algorithm, there is a masking flaw in the probing model (informally defined earlier in this chapter and formally defined later in Chapter 4 of this volume). In a masking context, the goal of an adversary against the masking is to break Definition 1.1. Thus, the goal is to find a subset of fewer than d variables that is correlated to the secret.
– Here, d = 2. This algorithm is simply computing an XOR of its inputs. The secret here is a.
– Here, d = 3. This algorithm is computing the multiplication of two inputs masked in Boolean form. Your goal is to recover a with at most two intermediate values.
– In this algorithm, d = 4. This algorithm generates an arithmetic masking of a sample whose distribution is centered and has a small standard deviation. Let us define D as a centered discrete Gaussian distribution with a small standard deviation of 0.5. Note that this distribution does not correspond to the final one, and the precise description of the final distribution is not necessary for finding flaws in this algorithm. Your goal is to find three intermediate variables that provide some information about the secret, the output c.
Masking is a very powerful tool to generically protect cryptographic algorithms against side-channel attacks. However, there is an important trade-off between efficiency and security, with many possible levels, and it will be the main subject of this part.
In Chapter 2, several common cryptographic algorithms and functions will be masked with emphasis on elementary operations.
Chapter 3 of this volume outlines the impact of hardware on masking and the necessity of taking into account physical phenomena like glitches in the design of masked implementations. Chapter 4 focuses on the various security models available to prove masking implementations. Finally, Chapter 5 gives tools to implement masking schemes on specific algorithms and to assess their security.
Section 1.1. The theory of masking was seminally introduced in Goubin and Patarin (1999); Ishai et al. (2003); Goudarzi and Rivain (2016).
Section 1.2. For more information about the effect of masking on side-channel leakage, we refer to Chari et al. (1999) and Barthe et al. (2017); more information on the different models will be presented later in the chapter.
Section 1.3. For concrete examples of Boolean, arithmetic and multiplicative masking, we refer to Rivain and Prouff (2010) and Genelle et al. (2011).
Section 1.4. For more information and recent works in the domain of code-based masking, we recommend Bringer et al. (2014); Wang et al. (2020) and Cheng et al. (2021).
Barthe, G., Dupressoir, F., Faust, S., Grégoire, B., Standaert, F.-X., Strub, P.-Y. (2017). Parallel implementations of masking schemes and the bounded moment leakage model. Report 2016/912, Cryptology ePrint Archive [Online]. Available at: https://eprint.iacr.org/2016/912.
Bringer, J., Carlet, C., Chabanne, H., Guilley, S., Maghrebi, H. (2014). Orthogonal direct sum masking: A smartcard friendly computation paradigm in a code, with builtin protection against side-channel and fault attacks. In WISTP. Springer, Berlin, Heidelberg.
Chari, S., Jutla, C.S., Rao, J.R., Rohatgi, P. (1999). Towards sound approaches to counteract power-analysis attacks. In Advances in Cryptology – CRYPTO '99, Wiener, M. (ed.). Springer, Berlin, Heidelberg.
Cheng, W., Guilley, S., Carlet, C., Danger, J.-L., Mesnager, S. (2021). Information leakages in code-based masking: A unified quantification approach. IACR Transactions on Cryptographic Hardware and Embedded Systems, 3, 465–495.
Genelle, L., Prouff, E., Quisquater, M. (2011). Thwarting higher-order side channel analysis with additive and multiplicative maskings. In Cryptographic Hardware and Embedded Systems – CHES 2011, Preneel, B. and Takagi, T. (eds). Springer, Berlin, Heidelberg.
Goubin, L. and Patarin, J. (1999). DES and differential power analysis (the "duplication" method). In Cryptographic Hardware and Embedded Systems – CHES 1999, Koç, Ç.K. and Paar, C. (eds). Springer, Berlin, Heidelberg.
Goudarzi, D. and Rivain, M. (2016). On the multiplicative complexity of Boolean functions and bitsliced higher-order masking. In Cryptographic Hardware and Embedded Systems – CHES 2016. Springer, Berlin, Heidelberg.
Ishai, Y., Sahai, A., Wagner, D. (2003). Private circuits: Securing hardware against probing attacks. In Advances in Cryptology – CRYPTO 2003, Boneh, D. (ed.). Springer, Berlin, Heidelberg.
Rivain, M. and Prouff, E. (2010). Provably secure higher-order masking of AES. In Cryptographic Hardware and Embedded Systems – CHES 2010, Mangard, S. and Standaert, F.-X. (eds). Springer, Berlin, Heidelberg.
Wang, W., Méaux, P., Cassiers, G., Standaert, F.-X. (2020). Efficient and private computations with code-based masking. IACR Transactions on Cryptographic Hardware and Embedded Systems, 2, 128–171.
Jean-Sébastien CORON1 and Rina ZEITOUN2
1University of Luxembourg, Luxembourg
2Cryptography & Security Labs, IDEMIA, Courbevoie, France
As we have seen in the previous chapter, the main countermeasure against side-channel attacks is masking. For Boolean masking, it consists of splitting each variable x into n Boolean shares x = x1 ⊕ x2 ⊕ ⋯ ⊕ xn, with n > t for security against t probes. Initially, the shares are generated uniformly at random under this condition. For example, we can generate x1, …, xn−1 randomly and let xn = x ⊕ x1 ⊕ ⋯ ⊕ xn−1. The shares are then processed separately in masked operations (also called gadgets) that enable us to compute the underlying secret variables in a secure way.
A block-cipher such as AES alternates several rounds, each containing some linear transformations and some nonlinear transformations. A linear function y = f(x) is easy to mask when x is shared as x = x1 ⊕ ⋯ ⊕ xn, as we have:
y = f(x) = f(x1) ⊕ f(x2) ⊕ ⋯ ⊕ f(xn),
so it suffices to compute yi = f(xi) separately for every i, so that we eventually get y = y1 ⊕ ⋯ ⊕ yn as required. Consider, for example, the squaring operation. Since squaring is linear in GF(2^k), given as input the shares xi of x = x1 ⊕ ⋯ ⊕ xn, we can compute the shares yi of y = x² simply by letting yi = xi² for all 1 ≤ i ≤ n.
Similarly, processing the xor operation is straightforward. Given as input the shares xi and yi of x = x1 ⊕ ⋯ ⊕ xn and y = y1 ⊕ ⋯ ⊕ yn, we can obtain the shares zi of z = x ⊕ y simply by letting zi = xi ⊕ yi for all 1 ≤ i ≤ n.
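The following small Python sketch illustrates this share-wise processing, for the XOR of two masked values and for squaring in GF(2^8); the helper gf_mul, the field choice and the names are ours, for illustration only.

```python
import secrets

def gf_mul(a: int, b: int) -> int:
    """Multiplication in GF(2^8) with the AES reduction polynomial x^8 + x^4 + x^3 + x + 1."""
    r = 0
    for _ in range(8):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
    return r

def masked_xor(xs, ys):
    """z = x XOR y computed share-wise: zi = xi XOR yi."""
    return [xi ^ yi for xi, yi in zip(xs, ys)]

def masked_square(xs):
    """y = x^2 computed share-wise, since squaring is F2-linear in GF(2^8)."""
    return [gf_mul(xi, xi) for xi in xs]

# Sanity check on 3-share Boolean maskings of x = 0x57 and y = 0x83.
x, y = 0x57, 0x83
x1, x2 = secrets.randbits(8), secrets.randbits(8)
y1, y2 = secrets.randbits(8), secrets.randbits(8)
xs = [x1, x2, x ^ x1 ^ x2]
ys = [y1, y2, y ^ y1 ^ y2]
z = masked_xor(xs, ys)
sq = masked_square(xs)
assert z[0] ^ z[1] ^ z[2] == x ^ y
assert sq[0] ^ sq[1] ^ sq[2] == gf_mul(x, x)
```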
As explained above, linear operations are straightforward to mask by performing computations on each share independently. This is however not the case for nonlinear operations, which generally require the processing of all shares together to compute the correct result. To preserve security against t-th order attack in this context, a dedicated method becomes necessary, usually involving additional randomness.
In the following, the crucial problem of masking the AND operation is addressed, as any circuit (at the hardware level) can be implemented using only the linear XOR gate and the nonlinear AND gate. As a second step, we describe mask refreshing techniques, which are frequently required to ensure the security of a full construction when composing many gadgets.
The so-called Ishai, Sahai and Wagner (ISW) scheme securely computes a sharing of c = a ∧ b from two input sharings (a1, a2, …, an) and (b1, b2, …, bn) of a and b, respectively, by computing all of the cross-products ai ∧ bj for 1 ≤ i, j ≤ n, based on the following equation:
c = a ∧ b = (a1 ⊕ ⋯ ⊕ an) ∧ (b1 ⊕ ⋯ ⊕ bn) = ⊕1≤i,j≤n (ai ∧ bj).
Those n² products are then recombined into n new shares (c1, c2, …, cn), including fresh random elements ri,j to ensure security; namely, a straightforward recombination of the cross-products would not achieve security against t probes for any t < n.
The ISW scheme is pictured in Algorithm 2.1. It was originally described over 𝔽2, that is, for 1-bit values. To process an AND gate over {0, 1}^k, the algorithm is adapted as follows: (i) all 1-bit variables defined over 𝔽2 are replaced by k-bit variables defined over {0, 1}^k; (ii) the 1-bit XOR operations are replaced by k-bit XOR operations; and (iii) the 1-bit AND operations are replaced by k-bit AND operations. This extension still preserves the security of the original scheme. In line 8, we use brackets to indicate the order of the operations, as this is mandatory for the security of the scheme. Algorithm 2.1 enables us to securely evaluate the AND operation at the cost of n² multiplications (which correspond to the n² cross-products), n(n − 1)/2 random values (which are necessary to perform their secure recombination) and 2n(n − 1) additions (which allow us to achieve the aforementioned recombination). Its complexity is therefore O(n²), with a number of operations TISW(n) = n(7n − 5)/2.
We illustrate the ISW algorithm for n = 2 and n = 3.
For n = 2 with input shares (a1, a2) and (b1, b2), the output shares (c1, c2) are given as:
c1 = (a1 ∧ b1) ⊕ r1,2
c2 = (a2 ∧ b2) ⊕ ((r1,2 ⊕ (a1 ∧ b2)) ⊕ (a2 ∧ b1))
where r1,2 is a fresh random value.
For n = 3 with input shares (a1, a2, a3) and (b1, b2, b3), the output shares (c1, c2, c3) are given as:
c1 = (a1 ∧ b1) ⊕ r1,2 ⊕ r1,3
c2 = (a2 ∧ b2) ⊕ ((r1,2 ⊕ (a1 ∧ b2)) ⊕ (a2 ∧ b1)) ⊕ r2,3
c3 = (a3 ∧ b3) ⊕ ((r1,3 ⊕ (a1 ∧ b3)) ⊕ (a3 ∧ b1)) ⊕ ((r2,3 ⊕ (a2 ∧ b3)) ⊕ (a3 ∧ b2))
where r1,2, r1,3 and r2,3 are fresh random values.
A mask refreshing algorithm consists of updating a given sharing while encoding the same value. Refreshing algorithms are an essential ingredient for secure masking, as they enable us to securely compose gadgets within a larger circuit. This enables the partitioning of a masked implementation into small components that are easier to analyze in terms of security. A drawback of refreshing algorithms is their randomness cost, which can account for a significant part of the global performance overhead. Three classical refresh masks algorithms are depicted hereafter according to their security level and efficiency.
The mask refreshing scheme depicted in Algorithm 2.2 has complexity O(n), which is quite efficient. However, it is only useful in specific constructions such as the table recomputation countermeasure (see section 2.5). As depicted in Figure 2.1, it simply consists of updating all shares by xoring each of them with a fresh random value, and accumulating all of those fresh random values into a dedicated share (here the last one, cn) in order to encode the same value. This gives a total number of operations TLinearRefreshMasks(n) = 3(n − 1).
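A minimal sketch of this linear refresh follows (our own Python rendering of the idea, not the book's Algorithm 2.2):

```python
import secrets

def linear_refresh(shares, bits: int = 8):
    """Re-randomize all shares but the last; accumulate the fresh randomness on the last share."""
    out = list(shares)
    for i in range(len(out) - 1):
        r = secrets.randbits(bits)
        out[i] ^= r     # refresh share i
        out[-1] ^= r    # keep the XOR of all shares unchanged
    return out

sh = [0x12, 0x34, 0x56]
ref = linear_refresh(sh)
assert sh[0] ^ sh[1] ^ sh[2] == ref[0] ^ ref[1] ^ ref[2]
```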
The refresh masks operation depicted in Algorithm 2.3 has complexity O(n²), but it enables the secure composability of gadgets in a full construction. The algorithm is a straightforward application of the ISW multiplication scheme, performing a secure multiplication by 1, that is, using (1, 0, …, 0) as second input of the ISW algorithm. The security and efficiency properties therefore directly follow from ISW. With (1, 0, …, 0) as a multiplicand, the n² cross-products do not need to be performed; however, we must still generate n(n − 1)/2 random values and perform n(n − 1) additions, which gives a total number of operations TQuadraticRefreshMasks(n) = 3n(n − 1)/2.
Figure 2.1. The LinearRefreshMasks algorithm, with the randoms ri accumulated on the last column
The QuasiLinearRefreshMasks algorithm depicted in Algorithm 2.4 has complexity O(n log n), instead of O(n²) for the previous algorithm, but with the same level of security. That is, it can still ensure secure composition in a full construction. As illustrated in Figure 2.2, the scheme is defined recursively and, for simplicity, it is assumed here that n is a power of 2. First, a preprocessing layer LI is applied to the n input shares ai, corresponding to lines 5–9 of Algorithm 2.4. Then the refresh masks algorithm is applied recursively on the two halves of the shares. Finally, a post-processing layer LO is applied (same as LI), before outputting the output shares di (this corresponds to lines 12–16 in Algorithm 2.4). The complexity comes from the fact that T(n) ≤ 2 · T(n/2) + c · n for some constant c, which recursively gives T(n) ≤ 2^i · T(n/2^i) + i · c · n, implying T(n) = O(n log n).
Figure 2.2. Recursive definition of QuasiLinearRefreshMasks, with the preprocessing layer LI, the two recursive applications of QuasiLinearRefreshMasks on the two halves of the shares, and the post-processing layer LO
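A sketch of this recursive structure follows, assuming for illustration that the input and output layers simply pair share i with share i + n/2; the exact layers of Algorithm 2.4 may differ.

```python
import secrets
from functools import reduce

def quasi_linear_refresh(shares, bits: int = 8):
    """Recursive refresh with O(n log n) randomness; n is assumed to be a power of 2."""
    n = len(shares)
    if n == 1:
        return list(shares)
    half = n // 2
    a = list(shares)
    for i in range(half):                    # input layer: pair share i with share i + n/2
        r = secrets.randbits(bits)
        a[i] ^= r
        a[i + half] ^= r
    a[:half] = quasi_linear_refresh(a[:half], bits)   # recurse on each half
    a[half:] = quasi_linear_refresh(a[half:], bits)
    for i in range(half):                    # output layer, same pairing
        r = secrets.randbits(bits)
        a[i] ^= r
        a[i + half] ^= r
    return a

sh = [secrets.randbits(8) for _ in range(8)]
ref = quasi_linear_refresh(sh)
xor_all = lambda v: reduce(lambda x, y: x ^ y, v)
assert xor_all(sh) == xor_all(ref)  # the encoded value is unchanged
```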
A block-cipher such as AES alternates several rounds, each containing some linear transformations, and some nonlinear transformations, usually based on S-boxes. In the following, we show how to securely mask such S-boxes at any order.
In order to mask the AES block-cipher, the previous ISW multiplication gadget can be adapted to the AES finite field GF(2^8) instead of 𝔽2, by replacing the AND operation by a finite field multiplication in GF(2^8). We describe the corresponding SecMult algorithm in Algorithm 2.5.
Moreover, the nonlinear part of the AES S-box can be written as S(x) = x^254 over GF(2^8), and can be evaluated as a sequence of nonlinear multiplications and linear squarings, with only four nonlinear multiplications. More precisely, to compute y = x^254 over GF(2^8) with four multiplications, we use the sequence of multiplications described in Figure 2.3.
Figure 2.3. Sequential computation of x^254
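Figure 2.3 itself is not reproduced here, but one standard 4-multiplication chain for x^254 in GF(2^8) can be checked on plain, unmasked values as follows; the chain shown is a common one and may differ in ordering from the book's figure.

```python
def gf_mul(a: int, b: int) -> int:
    """GF(2^8) multiplication, AES polynomial 0x11B."""
    r = 0
    for _ in range(8):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
    return r

def gf_sq(a):
    return gf_mul(a, a)  # squaring (F2-linear, hence cheap to mask share-wise)

def gf_pow254(x: int) -> int:
    """x^254 with 4 multiplications and 7 squarings."""
    x2   = gf_sq(x)                                  # x^2
    x3   = gf_mul(x2, x)                             # x^3    (multiplication 1)
    x12  = gf_sq(gf_sq(x3))                          # x^12
    x15  = gf_mul(x12, x3)                           # x^15   (multiplication 2)
    x240 = gf_sq(gf_sq(gf_sq(gf_sq(x15))))           # x^240
    x252 = gf_mul(x240, x12)                         # x^252  (multiplication 3)
    return gf_mul(x252, x2)                          # x^254  (multiplication 4)

def gf_pow(x, e):
    """Naive exponentiation used only as a reference for the sanity check."""
    r = 1
    for _ in range(e):
        r = gf_mul(r, x)
    return r

assert all(gf_pow254(x) == gf_pow(x, 254) for x in range(1, 256))
```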
Finally, we obtain the SecExp254 algorithm described in Algorithm 2.6. Thanks to the additional mask refreshing, resistance against an attack of order t is achieved with n ≥ t + 1 shares. The RefreshMasks scheme used in Algorithm 2.6 corresponds to the QuadraticRefreshMasks or QuasiLinearRefreshMasks algorithms described in the previous section.
The Rivain–Prouff countermeasure can be extended to the high-order computation of any S-box. Namely, any given k-bit S-box can be represented by a polynomial over GF(2^k) using Lagrange's interpolation theorem, and therefore, it can be securely evaluated with a sequence of n-shared additions, squarings and multiplications. Asymptotically, the running time of the countermeasure is dominated by the number of nonlinear multiplications, where each nonlinear multiplication has complexity O(n²) for n shares. To minimize the number of such nonlinear multiplications, the authors described a technique called Parity-Split, with a proven complexity of O(2^(k/2)) nonlinear multiplications for evaluating any k-bit S-box.
More precisely, any S-box can be written using Lagrange's interpolation theorem as:
S(x) = a0 ⊕ a1·x ⊕ a2·x² ⊕ ⋯ ⊕ a(2^k−1)·x^(2^k−1)
for coefficients ai ∈ GF(2^k). The naive method to high-order compute S(x) would require 2^k − 2 multiplications in GF(2^k). Since squarings are essentially free in GF(2^k), we can do better using the following simple observation: any polynomial Q(x) of degree t can be written as:
Q(x) = Q1(x²) ⊕ x · Q2(x²),
where both Q1(x) and Q2(x) have degree at most ⌊t/2⌋. Using this relation, we can precompute the set of powers x^(2i) for 1 ≤ i ≤ 2^(k−1) − 1, which requires 2^(k−1) − 2 multiplications. This set of powers is used to evaluate Q1(x²) and Q2(x²), and the product Q2(x²) · x requires one additional multiplication, hence a total of 2^(k−1) − 1 multiplications (instead of 2^k − 2). By applying the above relation recursively, we can evaluate S(x) with O(2^(k/2)) multiplications only.
An alternative countermeasure for S-box computation is based on randomized tables. The simplest case is first-order computation: the countermeasure consists of recomputing in RAM the original S-box S with inputs shifted by some random r and with masked outputs. More precisely, we compute the randomized table:
T(u) = S(u ⊕ r) ⊕ s
for all u ∈ {0, 1}^k, where r ∈ {0, 1}^k is the input mask and s ∈ {0, 1}^k is the output mask. To evaluate S(x) from the masked value x′ = x ⊕ r, it suffices to compute y′ = T(x′), which gives y′ = T(x′) = S(x′ ⊕ r) ⊕ s = S(x) ⊕ s, and therefore y′ is a masked value for S(x).
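A small sketch of this first-order table recomputation is given below, with a placeholder S-box and illustrative names.

```python
import secrets

K = 8  # k-bit S-box
SBOX = [(x * 7 + 3) % 256 for x in range(256)]  # placeholder S-box, not the AES one

def masked_sbox_lookup(x_masked: int, r: int, s: int):
    """Recompute T(u) = S(u ^ r) ^ s in RAM, then read it at the masked input."""
    T = [SBOX[u ^ r] ^ s for u in range(1 << K)]
    return T[x_masked]  # equals S(x) ^ s when x_masked = x ^ r

x = 0xAB
r, s = secrets.randbits(K), secrets.randbits(K)
y_masked = masked_sbox_lookup(x ^ r, r, s)
assert y_masked ^ s == SBOX[x]
```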
The first-order countermeasure can be generalized as follows. Every row of the randomized table T now consists of n shares. Given the n input shares xi such that x = x1 ⊕ ⋯ ⊕ xn, we start with an n-encoding of each row of the original S-box S, with:
T(u) = (T1(u), …, Tn(u)) such that T1(u) ⊕ ⋯ ⊕ Tn(u) = S(u)
for all rows u ∈ {0, 1}^k, and we progressively shift the table T by the successive input shares x1, …, xn−1. Between every shift, the n-encodings on each row of the table are refreshed. After the last shift by xn−1, the rows of the table have been shifted by x1 ⊕ ⋯ ⊕ xn−1 and therefore the table T satisfies for all u ∈ {0, 1}^k:
T1(u) ⊕ ⋯ ⊕ Tn(u) = S(u ⊕ x1 ⊕ ⋯ ⊕ xn−1).     [2.3]
Then it suffices to read the table T at the row u = xn to get the n output shares yi corresponding to y = S(x). More precisely, we let (y1, …, yn) ← T(xn), which from [2.3] gives as required:
y1 ⊕ ⋯ ⊕ yn = S(xn ⊕ x1 ⊕ ⋯ ⊕ xn−1) = S(x).
The high-order randomized table countermeasure is described in Algorithm 2.7. The asymptotic complexity of this countermeasure for k-bit S-boxes is O(2^k · n²).