



RUHR-UNIVERSITÄT BOCHUM

# **Automated Generation of Masked Hardware (AGEMA)**

David Knichel, Amir Moradi, Nicolai Müller and Pascal Sasdrich



# **Crypto Device**





## **Passive Physical Attacks**









Physical characteristics can be exploited to extract secret information:

- Timing
- Power consumption
- Electromagnetic radiation
- ...

## **Masking**



- Masking randomizes the intermediate values of a cryptographic computation to avoid dependencies between these values and the power consumption
- It is usually applied on an algorithmic level
- It does not rely on the power consumption characteristics of the device
- Each intermediate value is concealed by a random mask that is different for every execution
- In essence, it corresponds to a secret sharing scheme, here Boolean secret sharing

## **Boolean Secret Sharing**



### First order Boolean secret sharing (two shares):

Secret: $x$

Shares: $(x_1, x_2)$ with $x_1 = x \oplus m$ and $x_2 = m$, where $m$ is a fresh random mask

$$x_1 \oplus x_2 = x$$

- One needs to know both shares $x_1$ and $x_2$ to compute the secret $x$
  - Neither of them alone provides enough information
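The two-share scheme above can be sketched in a few lines of Python. This is only an illustration: the helper names `share`, `unshare` and `x1_values` are made up here, and the bit widths are arbitrary choices.

```python
import secrets

def share(x: int, bits: int = 8) -> tuple[int, int]:
    """Split x into Boolean shares (x ^ m, m) using a fresh random mask m."""
    m = secrets.randbits(bits)
    return x ^ m, m

def unshare(x1: int, x2: int) -> int:
    """Recombine the two shares: x1 ^ x2 = x."""
    return x1 ^ x2

def x1_values(x: int, bits: int = 4) -> list[int]:
    """All values the first share can take when m runs over every mask."""
    return sorted(x ^ m for m in range(1 << bits))

secret = 0xA7
x1, x2 = share(secret)
assert unshare(x1, x2) == secret

# For a uniform mask, x1 covers every value exactly once, whatever the secret is,
# so observing x1 alone says nothing about x.
assert x1_values(0x3) == x1_values(0xC) == list(range(16))
```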

#### **Linear Function** *F*

- Definition: $F(x \oplus z) = F(x) \oplus F(z)$
- Boolean sharing before $F$: $(x_1, x_2)$ with $x_1 \oplus x_2 = x$
- Boolean sharing after $F$: $(F(x_1), F(x_2))$ with $F(x_1) \oplus F(x_2) = F(x_1 \oplus x_2) = F(x)$
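A small exhaustive check makes the share-wise property concrete. The two functions below are illustrative choices, not taken from the slides: `rotl` (a bit rotation, which is linear over GF(2)) and `nonlinear` (an AND of a value with its own rotation).

```python
BITS = 8
MASK = (1 << BITS) - 1

def rotl(v: int, r: int = 3) -> int:
    """Rotate an 8-bit value left by r bits (linear over GF(2))."""
    return ((v << r) | (v >> (BITS - r))) & MASK

def nonlinear(v: int) -> int:
    """A simple non-linear map: AND of v with its rotation by one bit."""
    return v & rotl(v, 1)

def sharewise_ok(func) -> bool:
    """True if func(x1) ^ func(x2) == func(x) for every secret x and mask m."""
    return all(
        func(x ^ m) ^ func(m) == func(x)
        for x in range(1 << BITS)
        for m in range(1 << BITS)
    )

assert sharewise_ok(rotl)           # a linear F can be applied to each share
assert not sharewise_ok(nonlinear)  # a non-linear F cannot
```

The failing second check is exactly why non-linear functions need a dedicated masked construction.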

#### Non-linear Function?

# **Masking in Hardware**



#### Pre-computing the masked tables in software

- Sequential operations, time consuming, low efficiency
- High efficiency is desired in hardware

#### Ad-hoc/heuristic schemes



### Processing the mask ($m$) and masked data ($x \oplus m$) simultaneously

- The joint distribution of the leakages depends on the secret
  - This is commonly attributed to glitches (although glitches are not always the cause)
  - Attacks remain possible
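The effect of a joint leakage can be illustrated with a toy model. The sketch below assumes a Hamming-weight leakage in which both shares contribute to the same sample; this model and the function names are assumptions made for illustration, and the code does not simulate glitches themselves.

```python
from statistics import mean, pvariance

def hw(v: int) -> int:
    """Hamming weight of v."""
    return bin(v).count("1")

def leakages(x: int, bits: int = 4) -> list[int]:
    """Modeled leakage over all masks when x ^ m and m are processed together."""
    return [hw(x ^ m) + hw(m) for m in range(1 << bits)]

low, high = leakages(0x0), leakages(0xF)

# The average leakage is the same for both secrets ...
assert mean(low) == mean(high)
# ... but the spread is not, so the joint distribution still depends on x.
assert pvariance(low) != pvariance(high)
```

Under this model a first-order (mean-based) analysis sees nothing, while an analysis of the variance distinguishes the two secrets.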

#### **Systematic schemes**

Threshold Implementation, provable security



#### Let's consider an Sbox:

$$x_1 \oplus x_2 \oplus x_3 = x$$

$$y_1 \oplus y_2 \oplus y_3 = y$$

Each component function $f_i$ (producing $y_i$) should be independent of one share

## Example:

$$x = (a, b, c, d), \qquad y = (e, f, g, h)$$

$$S_1(a, b, c, d) = e$$

$$e = a \oplus bc \oplus d$$

$$e = a_1 \oplus a_2 \oplus a_3 \oplus b_1c_1 \oplus b_1c_2 \oplus b_1c_3 \oplus b_2c_1 \oplus b_2c_2 \oplus b_2c_3 \oplus b_3c_1 \oplus b_3c_2 \oplus b_3c_3 \oplus d_1 \oplus d_2 \oplus d_3$$

- Terms that mix two shares (e.g. $b_2c_3$) are clear where to go (to which $f$): only one component function is independent of the remaining share.
- Terms of a single share (e.g. $a_2 \oplus d_2 \oplus b_2c_2$ or $a_3 \oplus d_3 \oplus b_3c_3$) can be arbitrarily distributed among two component functions.

$$f_1 = b_2c_3 \oplus b_3c_2 \oplus a_2 \oplus d_2 \oplus b_2c_2$$

$$f_2 = b_3c_1 \oplus b_1c_3 \oplus a_3 \oplus d_3 \oplus b_3c_3$$

$$f_3 = b_1c_2 \oplus b_2c_1 \oplus a_1 \oplus d_1 \oplus b_1c_1$$
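The three component functions can be checked exhaustively. In the sketch below, the function signatures list only the share bits each $f_i$ uses, so the missing share index is visible at a glance; `check_correctness` is a helper name introduced here for illustration.

```python
from itertools import product

def f1(a2, b2, c2, d2, b3, c3):
    """Uses shares 2 and 3 only (independent of share 1)."""
    return (b2 & c3) ^ (b3 & c2) ^ a2 ^ d2 ^ (b2 & c2)

def f2(a3, b3, c3, d3, b1, c1):
    """Uses shares 3 and 1 only (independent of share 2)."""
    return (b3 & c1) ^ (b1 & c3) ^ a3 ^ d3 ^ (b3 & c3)

def f3(a1, b1, c1, d1, b2, c2):
    """Uses shares 1 and 2 only (independent of share 3)."""
    return (b1 & c2) ^ (b2 & c1) ^ a1 ^ d1 ^ (b1 & c1)

def check_correctness() -> bool:
    """Verify f1 ^ f2 ^ f3 == a ^ bc ^ d for all 2**12 share assignments."""
    for a1, a2, a3, b1, b2, b3, c1, c2, c3, d1, d2, d3 in product((0, 1), repeat=12):
        a, b, c, d = a1 ^ a2 ^ a3, b1 ^ b2 ^ b3, c1 ^ c2 ^ c3, d1 ^ d2 ^ d3
        e = a ^ (b & c) ^ d
        shared = (
            f1(a2, b2, c2, d2, b3, c3)
            ^ f2(a3, b3, c3, d3, b1, c1)
            ^ f3(a1, b1, c1, d1, b2, c2)
        )
        if shared != e:
            return False
    return True

assert check_correctness()
```

The check covers correctness only; the independence of each function from one share follows from its argument list.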

# How to Make a Masked Design?



Unprotected Implementation (behavioral Verilog)

## **Manual Design**

- Not straightforward
- Based on experience
- Algorithmic level
- Prone to errors/defeats

Side-Channel Protected Implementation (behavioral Verilog)

# Non-Composability in the TI context





## Composability



Masking large and complex circuits is a hard task, especially for high security orders.

Composable hardware gadgets offer a systematic way to generate provably secure designs:

- Arbitrary security orders are possible
- Based on formal security notions
- Follows a divide-and-conquer approach based on fundamental building blocks
- Simply replace unprotected gates (or larger modules) with their masked and composable counterparts
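To give a feel for gate-level gadgets at arbitrary order, here is a functional sketch using the classic ISW-style multiplication as a stand-in for a masked AND. It is not claimed to be one of the gadgets used in this work, it ignores the registers a hardware gadget needs, and it makes no statement about composability; it only shows that the shared outputs recombine to the right value for $d + 1$ shares.

```python
import secrets
from functools import reduce
from itertools import product
from operator import xor

def share(bit: int, n: int) -> list[int]:
    """Split one bit into n Boolean shares."""
    shares = [secrets.randbits(1) for _ in range(n - 1)]
    shares.append(reduce(xor, shares, bit))
    return shares

def unshare(shares: list[int]) -> int:
    return reduce(xor, shares, 0)

def masked_xor(x: list[int], y: list[int]) -> list[int]:
    """XOR is linear, so it is applied share by share."""
    return [xi ^ yi for xi, yi in zip(x, y)]

def masked_not(x: list[int]) -> list[int]:
    """Inverting a single share inverts the shared value."""
    return [x[0] ^ 1] + x[1:]

def masked_and(x: list[int], y: list[int]) -> list[int]:
    """ISW-style AND: one fresh random bit per pair of shares."""
    n = len(x)
    z = [x[i] & y[i] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            r = secrets.randbits(1)
            z[i] ^= r
            z[j] ^= (r ^ (x[i] & y[j])) ^ (x[j] & y[i])
    return z

def gadgets_correct(order: int, trials: int = 20) -> bool:
    """Check AND, XOR and NOT on order + 1 shares against the plain gates."""
    n = order + 1
    for a, b in product((0, 1), repeat=2):
        for _ in range(trials):
            sa, sb = share(a, n), share(b, n)
            if unshare(masked_and(sa, sb)) != (a & b):
                return False
            if unshare(masked_xor(sa, sb)) != (a ^ b):
                return False
            if unshare(masked_not(sa)) != (a ^ 1):
                return False
    return True

assert all(gadgets_correct(d) for d in (1, 2, 3))
```

The number of fresh random bits grows with the number of share pairs, which hints at why higher orders become costly.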












### **Settings**

- Level of protection (order)?
- Optimize for area or speed?










#### **Manual Design**

- Not straightforward
- Based on experience
- Algorithmic level
- Prone to errors/defeats

Result: Side-Channel Protected Implementation (behavioral Verilog)

#### **AGEMA**

- Identifies gates/modules to be secured
- Replaces them with equivalent variants
- Adjusts the control logic

- Free of heuristics
- Based on proofs
- Free of engineering failures
- Open-source (GitHub)

Result: Side-Channel Protected Implementation (netlist)
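The "identify and replace" idea can be sketched as a toy netlist-to-netlist pass. Everything below is hypothetical and heavily simplified: the tuple-based netlist format, the wire naming, and the two-share DOM-style AND are assumptions for illustration, not the tool's actual data structures or gadgets. Registers and control-logic adjustment are left out, so the sketch only checks functional equivalence.

```python
import secrets
from itertools import product

# Toy netlist for e = a ^ (b & c) ^ d: (output_wire, op, input_wires)
NETLIST = [
    ("t", "AND", ("b", "c")),
    ("u", "XOR", ("a", "t")),
    ("e", "XOR", ("u", "d")),
]

OPS = {
    "AND": lambda a, b: a & b,
    "XOR": lambda a, b: a ^ b,
    "NOT": lambda a: a ^ 1,
    "BUF": lambda a: a,
}

def mask_netlist(netlist):
    """Rewrite each gate into a 2-share variant; returns (gates, random_wires)."""
    gates, rnd = [], []
    for out, op, ins in netlist:
        if op == "XOR":  # linear: one XOR per share
            x, y = ins
            for s in (0, 1):
                gates.append((f"{out}_{s}", "XOR", (f"{x}_{s}", f"{y}_{s}")))
        elif op == "NOT":  # invert share 0 only
            (x,) = ins
            gates.append((f"{out}_0", "NOT", (f"{x}_0",)))
            gates.append((f"{out}_1", "BUF", (f"{x}_1",)))
        elif op == "AND":  # non-linear: needs one fresh random wire
            x, y = ins
            r = f"r_{out}"
            rnd.append(r)
            for s in (0, 1):
                o = 1 - s
                gates += [
                    (f"{out}_in{s}", "AND", (f"{x}_{s}", f"{y}_{s}")),
                    (f"{out}_cr{s}", "AND", (f"{x}_{s}", f"{y}_{o}")),
                    (f"{out}_rs{s}", "XOR", (f"{out}_cr{s}", r)),
                    (f"{out}_{s}", "XOR", (f"{out}_in{s}", f"{out}_rs{s}")),
                ]
        else:
            raise ValueError(f"unsupported gate type: {op}")
    return gates, rnd

def evaluate(gates, values):
    """Evaluate gates in the given (topological) order."""
    wires = dict(values)
    for out, op, ins in gates:
        wires[out] = OPS[op](*(wires[w] for w in ins))
    return wires

def masked_matches_plain(netlist, inputs, output) -> bool:
    """Compare the recombined masked output with the plain output."""
    gates, rnd = mask_netlist(netlist)
    for bits in product((0, 1), repeat=len(inputs)):
        plain = evaluate(netlist, dict(zip(inputs, bits)))[output]
        vals = {r: secrets.randbits(1) for r in rnd}
        for name, bit in zip(inputs, bits):
            m = secrets.randbits(1)
            vals[f"{name}_0"], vals[f"{name}_1"] = bit ^ m, m
        w = evaluate(gates, vals)
        if w[f"{output}_0"] ^ w[f"{output}_1"] != plain:
            return False
    return True

assert masked_matches_plain(NETLIST, ("a", "b", "c", "d"), "e")
```

The pass grows the single AND into four AND/XOR pairs plus a random input, which matches the intuition that automated masking costs area and fresh randomness.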





































# **Example**





### **General Procedure**





#### **AGEMA**



#### Requirements

- Composable security
  - A secure circuit is not necessarily secure when composed
  - PINI (Probe-Isolating Non-Interference)
- PINI gadgets of essential gates
  - AND/NAND/OR/NOR/...

## Efficiency

- Provable security
  - As long as the gadgets are PINI
- Extendable to arbitrary order
- Not as efficient as manually crafted designs
  - Larger area, higher latency, higher demand for fresh masks
- Any engineer can make secure designs
  - https://github.com/Chair-for-Security-Engineering/AGEMA





# Thanks! Any Questions?

amir.moradi@rub.de

#### **Standardization Process**



https://csrc.nist.gov/Projects/masked-circuits

