1 Introduction
Hash functions are one of the most fundamental building blocks in protocol design. For this reason, both the cryptanalysis and the provable security of hash functions have been active areas of research in recent years. The first known instances of collisions and chosen-prefix collisions in SHA-1 were recently demonstrated by Stevens et al. [26] and Leurent and Peyrin [20], respectively. Furthermore, the feasibility of built-in adversarial weaknesses (aka. backdoors) in efficient hash functions has been demonstrated by Fischlin, Janson, and Mazaheri [13]. A practical way to provide safeguards against similar failures of hash functions is to combine a number of independent hash functions so that the resulting function is at least as secure as the strongest among them. Most works in this area have focused on a setting where at least one of the hash functions is secure, although positive results when all underlying hash functions have weaknesses have also been demonstrated [15, 22].
In this work we are interested in protecting hash functions against a variety of attacks that may arise due to backdoors, cryptanalytic advances, or preprocessing attacks. We carry out our study in the recent backdoored random-oracle (BRO) model, which uniformly treats these settings and also permits strong adversarial settings where all hash functions may be weak.
1.1 The BRO Model
Bauer, Farshim, and Mazaheri (BFM) [3] at Crypto 2018 formulated a new model for the analysis of hash functions that substantially weakens the traditional random-oracle (RO) model. Here an adversary, on top of direct access to the random oracle, is able to obtain arbitrary functions of the function table of the random oracle.1 The implications of this weakening are manifold. To start with, positive results in this model imply positive results in the traditional setting where all but one of the hash functions are weak. Second, this model captures arbitrary preprocessing attacks on hash functions, another highly active area of research [6, 7, 10, 27]. Finally, it allows one to model unrestricted adversarial capabilities, which can adaptively depend on input instances, and thus captures built-in as well as inadvertent weaknesses that may or may not be discovered in the course of time.


The reductions to communication complexity problems are at times tedious and very specific to the combiner at hand. Moreover, the hardness of the communication complexity problem underlying collision resistance is conjectural and remains to be proven. Furthermore, a number of deployed protocols have only been shown to be secure in the random-oracle model, and thus may rely on properties beyond one-wayness, pseudorandomness, or collision resistance.
This raises the question whether or not other cryptographic properties expected from a good hash function are also met by these combiners. In other words: Can combining two or more backdoored random oracles render access to independent but adaptive auxiliary information useless? We formalize and study this question in the indifferentiability framework, which has been immensely successful in justifying the soundness of hash-function designs.
1.2 Indifferentiability
A common paradigm in the design of hash functions is to start with some underlying primitive, and through some construction build a more complex one. The provable security of such constructions has been analyzed through two main approaches. One formulates specific goals (such as collision resistance) and goes on to show that the construction satisfies them if its underlying primitives satisfy their own specific security properties. The other is a general approach, whose goal is to show that a (wide) class of security goals is simultaneously met.
The latter has been formalized in a number of frameworks, notably the UC framework of Canetti [5], the reactive systems framework of Pfitzmann and Waidner [24], and the indifferentiability framework of Maurer, Renner, and Holenstein (MRH) [23]. Indifferentiability is by now a standard methodology for studying the soundness of cryptographic constructions, particularly symmetric ones such as hash functions [4, 8] and block ciphers [1, 9, 12, 16], in idealized models of computation.
In the MRH framework, a public primitive $\mathsf{H}$ is available and the goal is to build another primitive, say a random oracle $\mathsf{RO}$, from $\mathsf{H}$ through a construction $\mathsf{C}^{\mathsf{H}}$. Indifferentiability formalizes a set of necessary and sufficient conditions for the construction $\mathsf{C}^{\mathsf{H}}$ to securely replace its ideal counterpart $\mathsf{RO}$ in a wide range of environments: for a simulator $\mathsf{Sim}$, the systems $(\mathsf{C}^{\mathsf{H}},\mathsf{H})$ and $(\mathsf{RO},\mathsf{Sim}^{\mathsf{RO}})$ should be indistinguishable. The composition theorem proved by MRH states that, if $\mathsf{C}^{\mathsf{H}}$ is indifferentiable from $\mathsf{RO}$, then $\mathsf{C}^{\mathsf{H}}$ can securely replace $\mathsf{RO}$ in arbitrary single-stage contexts. A central corollary of this composition theorem is that indifferentiability implies any single-stage security goal, which includes, among others, one-wayness, collision resistance, PRG/PRF security, and more.
1.3 Contributions
With the above terminology in hand, the central question tackled in this work is whether or not combiners that are indifferentiable from a conventional (backdoor-free) random oracle exist, when the underlying primitives are two (or more) backdoored random oracles.
Let us consider the concatenation combiner $\mathsf{C}^{\mathsf{H}_1,\mathsf{H}_2}_{\Vert}(x) := \mathsf{H}_1(x) \Vert \mathsf{H}_2(x)$, where $\mathsf{H}_1$ and $\mathsf{H}_2$ are both backdoored. This construction was shown to be one-way, collision resistant, and PRG secure if both underlying functions are highly compressing. Despite this, the concatenation combiner cannot be indifferentiable from a random oracle: using the backdoor oracle for $\mathsf{H}_1$ an attacker can compute two inputs x and $x'$ such that $\mathsf{H}_1(x)=\mathsf{H}_1(x')$, query them to the construction, and return 1 iff the left halves of the outputs match. However, any simulator attempting to find such a pair with respect to a backdoor-free random oracle must place an exponentially large number of queries. Attacks on the cascade combiner $\mathsf{H}_2(\mathsf{H}_1(x))$ were also given in [3, Section D.2] for a wider range of parameter regimes, leaving only the expand-then-compress case as potentially indifferentiable. Finally, the xor combiner $\mathsf{C}^{\mathsf{H}_1,\mathsf{H}_2}_{\oplus}(x) := \mathsf{H}_1(x)\oplus\mathsf{H}_2(x)$, which is simpler, more efficient, and one of the most common ways to combine hash functions, resists these attacks.2
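To see the attack concretely, the following toy sketch (our own illustration with made-up parameters, not code from the paper) stores the two hash tables explicitly and implements the backdoor as an arbitrary function of the full table of $\mathsf{H}_1$:

```python
import random

random.seed(1)
N_BITS, M_BITS = 8, 4          # toy sizes: 8-bit inputs, 4-bit outputs

# two independent "backdoored" random oracles, stored as full tables
H1 = {x: random.getrandbits(M_BITS) for x in range(2 ** N_BITS)}
H2 = {x: random.getrandbits(M_BITS) for x in range(2 ** N_BITS)}

def concat(x):
    """Concatenation combiner C(x) = H1(x) || H2(x)."""
    return (H1[x] << M_BITS) | H2[x]

def backdoor_collision(table):
    """Backdoor query: an arbitrary function of H1's full table --
    here one that returns a collision for H1 (it exists by pigeonhole,
    since 2^8 inputs map to 2^4 outputs)."""
    seen = {}
    for x, y in table.items():
        if y in seen:
            return seen[y], x
        seen[y] = x

# The differentiator: obtain x != x' with H1(x) = H1(x') via the backdoor,
# then check whether the left halves of the combiner outputs match.  In the
# real world this always succeeds, while a simulator talking to an ideal RO
# would need exponentially many queries to find such a pair.
x, x_prime = backdoor_collision(H1)
assert x != x_prime
assert concat(x) >> M_BITS == concat(x_prime) >> M_BITS   # left halves equal
```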
Decomposition of Distributions. When proving results in the presence of auxiliary input, Unruh [27] observed that pre-computation (or leakage) on a random oracle can reveal a significant amount of information only on restricted parts of its support. The problem of dealing with auxiliary input was later revisited in a number of works [6, 7, 10]. In particular Coretti et al. [7], building on work in communication complexity, employed a pre-sampling technique to prove a number of positive results in the RO model with auxiliary input with tighter bounds. At a high level, this method permits writing a high min-entropy distribution (here, over a set of functions) as the convex combination of a (large) number of distributions which are fixed on a certain number (p) of points and highly unpredictable on the rest, the so-called $(p,\delta)$-dense distributions. This technique was originally introduced in the work of Göös et al. [14].
The Simulator. Our simulator for the xor combiner builds on this technique to decompose distributions into a convex combination of $(p,\delta)$-dense distributions. Simulation of backdoor oracles is arguably quite natural and proceeds as follows. Starting with uniform random oracles $\mathsf{H}_1$ and $\mathsf{H}_2$, on each backdoor query f for $\mathsf{H}_1$ the simulator computes $z := f(\mathsf{H}_1)$ and updates the distribution of the random oracle $\mathsf{H}_1$ to be uniform conditioned on the output of f being z. This distribution is then decomposed into a convex combination of $(p,\delta)$-dense distributions, from which one function is sampled. For all of the p fixed points, the simulator sets the value of $\mathsf{H}_2$ consistently with the random oracle, and the distribution of $\mathsf{H}_2$ is updated accordingly. An analogous procedure is implemented as the simulator for the second backdoored random oracle.
Technical Analysis. The first technical contribution of our work is a refinement of the decomposition technique which can be used to adaptively decompose distributions after backdoor queries. We show that this refinement is sufficiently powerful to allow proving indifferentiability up to a logarithmic (in the input size of the BROs) number of switches between the backdoor queries. We prove this via a sequence of games which are carefully designed so as to be compatible with the decomposition technique. A key observation is that in contrast to previous works in the AI-RO model, we do not replace the dense (intuitively, unpredictable) part of the distribution of random oracles with uniform: backdoor functions “see” the entire table of the random oracle and this replacement would result in a noticeable change. Second, we modify the number of fixed points in the (partially) dense distributions so that progressively smaller sets of points are fixed. Even though each leakage corresponds to fixing a large number of points, it is proportionally smaller than the previous number of fixed points. Thus the overall bound remains small.
Simulator Efficiency. Our simulator runs in doubly exponential time in the bit-length of the random oracle and thus is of use in information-theoretic settings. These include the vast majority of symmetric constructions. Protocols based on computational assumptions (such as public-key encryption) escape this treatment: the overall adversary obtained via the composition would run the decomposition algorithm and hence will not be poly-time. This observation, however, also applies to the BRO model as the backdoor oracles also allow for non-polynomial time computation, trivially breaking any computational assumption if unrestricted. Despite this, in a setting where the computational assumption holds relative to the backdoor oracles, positive results may hold. We can for example restrict the backdoor capability to achieve this. Another promising avenue is to rely on an independent idealized model such as the generic-group model (GGM) and for instance, prove IND-CCA security of Hashed ElGamal in the BRO and (backdoor-free) GGM models. We leave exploring these solutions to future work.




Composition. Let c denote the number of times the adversary switches between one backdoor oracle and the other. Regarding the query complexities of our simulators, each query to the backdoor oracle translates to roughly
queries to the random oracle for the xor combiner and roughly
queries to the random oracle for the extractor combiner. This in particular means that, for a wide range of parameters, composition is only meaningful with respect to security notions whereby the random oracle can tolerate a large number of queries. This, for example, would be the case for one-way, PRG, and PRF security notions where the security bounds are of the form
. However, with respect to a smaller number of switches (as well as in the auxiliary-input setting with no adaptivity), collision resistance can still be achieved.
Indifferentiability with Auxiliary Input. When our definition of indifferentiability is restricted so that only a single backdoor query to each hash function at the onset is allowed, we obtain a notion that formalizes indifferentiability with auxiliary input. This definition is interesting as it is sufficiently strong to allow for the generic replacement of random oracles with iterative constructions even in the presence of preprocessing attacks. Accordingly, our positive results in the BRO model when considered with no adaptivity translate to indifferentiability with independent preprocessing attacks. To complement this picture, we also discuss the case of auxiliary-input indifferentiability with a single BRO and show, as expected, that a salted indifferentiable construction is also indifferentiable with auxiliary input.
Open Problems. In order to overcome the bounded-adaptivity restriction and prove full indifferentiability, one would require an improved decomposition technique which fixes considerably fewer points after each leakage. This, at the moment, seems (very) challenging and is left as an open question. In particular, such a result would simultaneously give new proofs of known communication complexity lower bounds for a host of problems, such as set-disjointness and intersection, potentially a proof of the conjecturally hard problem stated in [3], and many others. (We note that improved decomposition techniques can potentially also translate to improved bounds.) Indeed the xor combiner may achieve security well beyond what we establish here (and indeed the original work of BFM does so for specific games). Finally, as the extractor combiner suggests, the form of the combiner and the number of underlying BROs can also affect the overall bounds.
2 Preliminaries
Throughout the paper, when we write [N] for an uppercase letter N, we use the convention that N is an integer power of two, i.e., $N = 2^n$ for some $n\in\mathbb{N}$. Let $\{0,1\}^n$ denote the set of all n-bit strings. We use $[M]^N$ to denote the set of all bit-strings of length $N\cdot\log M$, which corresponds to the set of all functions $\mathsf{H}:[N]\rightarrow[M]$. We denote the uniform distribution over an arbitrary finite set S by $\mathcal{U}_S$.
For $\mathsf{H}\in[M]^N$ and $I\subseteq[N]$ we denote by $\mathsf{H}_I$ the projection of $\mathsf{H}$ onto the points in I. Let $\mathcal{H}$ be a probability density function over $[M]^N$. We define $\mathcal{H}(D)$ as the probability that a sample randomly drawn from $\mathcal{H}$ falls into the domain $D\subseteq[M]^N$. By $\mathcal{H}|_D$ we denote the density $\mathcal{H}$ conditioned on the domain D. For a function $\mathsf{H}'\in[M]^N$ and $I\subseteq[N]$, by $\mathcal{H}|_{\mathsf{H}'_I}$ we denote $\mathcal{H}$ conditioned on $\mathsf{H}_x=\mathsf{H}'_x$ for all $x\in I$.
For a set of assignments $A\subseteq[N]\times[M]$, by $\mathcal{H}|_A$ we denote $\mathcal{H}$ conditioned on $\mathsf{H}_x=y$ for all $x$ and all $y$ with $(x,y)\in A$. We further let $A_1$ (resp. $A_2$) denote the set containing the first (resp. second) coordinates of all elements in A.
For an algorithm $\mathsf{Alg}$ we denote by $\mathsf{Alg}[\mathrm{par}](x)$ a call of the algorithm with (constant) parameters $\mathrm{par}$ and variable input x. This is to clarify, across multiple calls to the algorithm, which arguments are the main inputs, while the parameters remain unchanged.
2.1 Backdoored Random Oracles
We recall the definition of the backdoored random-oracle model from [3]. The model (for some $k\ge 1$) defines a setting where all parties have access to k functions $\mathsf{H}_1,\ldots,\mathsf{H}_k$, where the $\mathsf{H}_i$'s are chosen uniformly and independently at random from $[M_i]^{N_i}$, while the adversarial parties also have access to the corresponding backdoor oracles $\textsc{Bd}_i$'s. A backdoor oracle $\textsc{Bd}_i$ can be queried on functions f and returns $f(\mathsf{H}_i)$. If for all $i$ we have $N_i=N$ and $M_i=M$, we simply refer to this model as $k\text{-BRO}_{N,M}$, and when N and M are clear from the context, we simply use $k\text{-BRO}$.
These models may be weakened by restricting the adversary to query only functions f in some capability class. However, our results, as well as those in [3], hold for arbitrary backdoor capabilities. In other words, an adversary can (adaptively) query arbitrary functions f to any of the backdoor oracles.
2.2 Indifferentiability in the BRO Model
For a construction $\mathsf{C}^{\mathsf{H}_i}$ in the k-BRO model, a simulator $\mathsf{Sim}=(\mathsf{SimH}_i,\mathsf{SimBD}_i)$, and a differentiator $\mathcal{D}$, the indifferentiability advantage of $\mathcal{D}$ is defined as
$$ \mathsf{Adv}^{\mathrm{indiff}}_{\mathsf{C}^{\mathsf{H}_i},\mathsf{Sim}}(\mathcal{D}) := \Big| \Pr\left[\mathcal{D}^{\mathsf{C}^{\mathsf{H}_i},\mathsf{H}_i,\textsc{Bd}_i}\right] - \Pr\left[\mathcal{D}^{\mathsf{RO},\mathsf{SimH}_i^{\mathsf{RO}},\mathsf{SimBD}_i^{\mathsf{RO}}}\right] \Big|~, $$
where $\Pr[\mathcal{D}^{\cdots}]$ denotes the probability that $\mathcal{D}$ outputs 1, taken over the random choices of $\mathsf{H}_i$, $\mathsf{RO}$, and the coins of $\mathcal{D}$ and the simulator.
We emphasize that the simulators do not get access to any backdoor oracles. This ensures that any attack against a construction with backdoors translates to one against the underlying random oracles without any backdoors.
2.3 Randomness Extractors
Let X be a random variable. The min-entropy of X is defined as . The random variable X is called a (weak) k-source if
, i.e.,
. The min-entropy of a distribution typically determines how many bits can be extracted from it which are close to uniform. The notion of closeness is formalized by the statistical distance. For two random variables X and Y over a common support D, their statistical distance is defined as
.
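These two definitions translate directly into code; the following minimal sketch (our own illustration, not tied to the paper) computes both quantities for explicit probability vectors:

```python
import math

def min_entropy(probs):
    """H_inf(X) = -log2 max_x Pr[X = x]."""
    return -math.log2(max(probs))

def statistical_distance(p, q):
    """SD(X, Y) = 1/2 * sum_z |Pr[X = z] - Pr[Y = z]| over a common support."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

# a distribution over 4 values in which every probability is at most 1/2,
# hence a (weak) 1-source
p = [0.5, 0.25, 0.125, 0.125]
u = [0.25] * 4                       # uniform over the same support
assert min_entropy(p) >= 1
assert statistical_distance(p, u) == 0.25
```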
In this paper we are interested in extractors that do not require seeds but rather rely on multiple weak sources.
A function $\mathsf{Ext}:[N_1]\times\cdots\times[N_t]\rightarrow[M]$ is a t-source $(k_1,\ldots,k_t,\varepsilon)$-extractor if for all independent random variables $X_1,\ldots,X_t$, where each $X_i$ is a $k_i$-source over $[N_i]$, it holds that
$$ \mathrm{SD}\big( \mathsf{Ext}(X_1,\ldots,X_t),\, \mathcal{U}_{[M]} \big) \le \varepsilon~. $$
Below we define useful classes of distributions, the so-called (partially) dense distributions, resp. dense probability density functions. Intuitively, bit strings from a dense distribution are unpredictable not only as a whole but also in any of their substrings and any combination of those substrings.
A density function $\mathcal{H}$ over $[M]^N$ is called $\delta$-dense if for $\mathsf{H}$ sampled from $\mathcal{H}$, it holds that for every subset $I\subseteq[N]$ we have $H_\infty(\mathsf{H}_I)\ge\delta\cdot|I|\cdot\log M$. $\mathcal{H}$ is called $(p,\delta)$-dense if there exists a set $I_0\subseteq[N]$ of size at most p such that $\mathsf{H}_{I_0}$ is constant, while for every subset $I\subseteq[N]\setminus I_0$ we have $H_\infty(\mathsf{H}_I)\ge\delta\cdot|I|\cdot\log M$. That is, $\mathsf{H}$ is fixed on at most p coordinates and $\delta$-dense on the rest. We call a distribution dense, if the corresponding density function is dense.
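As a sanity check of the definition, the following brute-force sketch (our own, for tiny N and M = 2, so that log M = 1) computes the best density rate of a given distribution by enumerating all projections:

```python
import itertools
import math
from collections import defaultdict

LOG_M = 1          # log2(M) with M = 2 in this toy

def projection_max_prob(dist, I):
    """Maximum probability of H_I under dist (a dict: tuple -> prob)."""
    marg = defaultdict(float)
    for h, p in dist.items():
        marg[tuple(h[i] for i in I)] += p
    return max(marg.values())

def density(dist, n, fixed=()):
    """Largest delta such that dist is delta-dense outside `fixed`,
    i.e. H_inf(H_I) >= delta * |I| * log M for every subset I."""
    free = [i for i in range(n) if i not in fixed]
    return min(-math.log2(projection_max_prob(dist, I)) / (len(I) * LOG_M)
               for r in range(1, len(free) + 1)
               for I in itertools.combinations(free, r))

n = 3
# the uniform distribution over {0,1}^3 is (0, 1)-dense
uniform = {h: 1 / 8 for h in itertools.product((0, 1), repeat=n)}
assert density(uniform, n) == 1.0

# fixing coordinate 0 to the value 1 gives a (1, 1)-dense distribution
cond = {h: 1 / 4 for h in uniform if h[0] == 1}
assert density(cond, n, fixed=(0,)) == 1.0
```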
3 Decomposition of High Min-Entropy Distributions
Any high min-entropy distribution can be written as a convex combination of distributions that are fixed on a number of points and dense on the rest (i.e., $(p,\delta)$-dense distributions for some p and $\delta$).3 The decomposition technique introduced by Göös et al. [14] has its origins in communication complexity theory. We generalize this technique, with a terminology closer to that of Kothari et al. [18], in order to allow for adaptive leakage. The original lemma, also used by Coretti et al. [7], can be easily derived as a special case of our lemma. For this, one assumes that the starting distribution before the leakage was uniform, in other words (0, 1)-dense.
When proving results in the auxiliary-input random-oracle (AI-RO) model, Unruh [27] observed that pre-computation (or leakage) on a random oracle can cause a significant decrease of its min-entropy only on restricted parts of its support (i.e., on p points), causing that part to become practically fixed, while the rest remains indistinguishable from random to a bounded-query distinguisher. This means that after fixing p coordinates of the random oracle, the rest can be lazily sampled from a uniform distribution. Coretti et al. [7] recently gave a different and tighter proof consisting of two main steps. First, the decomposition technique is used to show that the distribution of a random oracle given some leakage is statistically close to a convex combination of $(p,\delta)$-dense distributions. Second, they prove that no bounded-query algorithm can distinguish a $(p,\delta)$-dense distribution from one that is fixed on the same p points and is otherwise uniform (a so-called p-bit-fixing distribution), as suggested by Unruh [27].
Since in the BRO model adaptive queries are allowed, a function queried to the backdoor oracle is able to “see” the entire random oracle, rather than a restricted part of it. Hence, when analyzing the distribution of a random oracle after adaptive leakage, it is crucial that we keep the distributions statistically close. In other words we use $(p,\delta)$-dense distributions instead of p-bit-fixing ones.
In the k-BRO model, we are concerned with multiple queries to the backdoor oracles, i.e., continuous and adaptive leakage that can depend on previously leaked information about both hash functions. Intuitively, since the leakage function can be arbitrary, it can in particular depend on the previously leaked values. We still need to argue that the distribution obtained after leakage about a distribution, which is not necessarily uniform, is also close to a convex combination of $(p',\delta')$-dense distributions. Naturally, we have $\delta'\le\delta$, since min-entropy decreases after new leakage, and $p'\ge p$, since additional points are fixed. Looking ahead, in the indifferentiability proofs, this refined decomposition lemma allows us to simply fix a new portion of the simulated hash function after each leakage (i.e., backdoor query) and not to worry about the rest, which still has high entropy and can be lazily sampled (from a dense distribution) upon receiving the next query.
Lemma 1 (Refined decomposition). Let $\gamma>0$ and let $\mathcal{H}$ be a $(p,\delta)$-dense density function over $[M]^N$. Let $f:[M]^N\rightarrow\{0,1\}^\ell$ be an arbitrary leakage function and let z be a possible output of f. Then the conditional density $\mathcal{H}|_{f(\mathsf{H})=z}$ is $\gamma$-close to a convex combination of $(p',\delta')$-dense density functions, where $p'\ge p$ and $\delta'\le\delta$ are as quantified below.
This refined decomposition lemma differs from the original lemma in that the starting density function is $(p,\delta)$-dense. As a first step, we modify the original decomposition algorithm from [14, 18] so that it additionally gets the set of indices $I_0$, of size at most p, that are already fixed in $\mathcal{H}$ from the start.
Our refined decomposition algorithm, given below, recursively decomposes the domain $[M]^N$, according to the density function after leakage $\mathcal{H}'$, into partitions $D_1,\ldots,D_r,D_\bot$ such that $[M]^N = D_1\cup\cdots\cup D_r\cup D_\bot$, where $\bot$ stands for erroneous. For all i with $1\le i\le r$ the partition $D_i$ defines a $(p',\delta')$-dense density function $\mathcal{H}'|_{D_i}$.
Each recursive call of the decomposition algorithm on a domain D (other than the call leading to $D_\bot$, which we will discuss shortly) returns a pair $(I,z)$, where $I\subseteq[N]$ represents a subset of coordinates such that the images of all points in the set I are fixed to the same values under all functions in the returned partition. In other words, we have $\mathsf{H}_I=z$ for some $z\in[M]^{|I|}$. The algorithm finds such a pair $(I,z)$ by considering the biggest set I (excluding those points fixed from the start, i.e., $I_0$) such that the min-entropy of $\mathsf{H}_I$ (for $\mathsf{H}$ sampled from $\mathcal{H}'|_D$) is too small (as determined by the rate $\delta'$) and then finding some z which is a very likely value of $\mathsf{H}_I$. Then $(I,z)$ is returned with some $D_i$ as the partition that contains all $\mathsf{H}\in D$ with $\mathsf{H}_I=z$. The next recursive call will exclude $D_i$ from the considered domain.
Decomposition halts either if the probability of a sample falling into the current domain is smaller than $\gamma$ (i.e., $\mathcal{H}'(D)<\gamma$) or the current distribution is already $\delta'$-dense outside $I_0$. In both cases the algorithm returns the current domain D together with an empty set. In the former case the returned domain is marked as an erroneous domain $D_\bot$, since it may not define a $(p',\delta')$-dense distribution. Let us without loss of generality assume that $\mathcal{H}'$ is not $\delta'$-dense, as otherwise the claim holds trivially.




Now we turn our attention to proving that every partition (other than $D_\bot$) returned by the above decomposition algorithm defines a density function $\mathcal{H}'|_{D_i}$ which is $(p',\delta')$-dense.
For all values of i with $1\le i\le r$ it holds that the density function $\mathcal{H}'|_{D_i}$ is $(p_i,\delta')$-dense, where $p_i$ is the total number of fixed coordinates in $D_i$, bounded in the proof below.
Fix some i with $1\le i\le r$. Let I be the set of freshly fixed points in $D_i$ and let $\bar{I}:=[N]\setminus(I_0\cup I)$. Let z be such that $\mathsf{H}_I=z$ for all $\mathsf{H}\in D_i$. We first argue for the $\delta'$-density of $\mathcal{H}'|_{D_i}$ on values projected to $\bar{I}$ and afterwards bound the size of I.
- 1. Suppose $\mathcal{H}'|_{D_i}$ is not $\delta'$-dense on $\bar{I}$. Then there exists a non-empty set which violates the density property. That is, there exists a non-empty set $J\subseteq\bar{I}$ and some $z_J$ such that, with the probability taken over $\mathsf{H}$ sampled from $\mathcal{H}'|_{D_i}$, we have:
$$ \Pr\left[\mathsf{H}_J=z_J\right] > M^{-\delta'\cdot|J|}~. $$
Now the union $I\cup J$ forms a new set such that for some $z'$ we have
$$ \Pr\left[\mathsf{H}_{I\cup J}=z'\right] = \Pr\left[\mathsf{H}_I=z\right]\cdot\Pr\left[\mathsf{H}_J=z_J\mid \mathsf{H}_I=z\right] > M^{-\delta'\cdot|I\cup J|}~. $$
Since J was assumed to be non-empty and disjoint from $I_0\cup I$ (and in particular from I), its existence violates the maximality of I. Therefore, $\mathcal{H}'|_{D_i}$ is $\delta'$-dense on $\bar{I}$.
- 2. We now bound the size of I, given that $\mathcal{H}'(D)\ge\gamma$ for the domain D from which $D_i$ was split off. On the one hand, since I was chosen as a violating set, there exists a z such that $\Pr[\mathsf{H}_I=z] > M^{-\delta'\cdot|I|}$, the probability taken over $\mathsf{H}$ sampled from $\mathcal{H}'|_D$. On the other hand, $\mathcal{H}$ is $\delta$-dense outside $I_0$, and passing from $\mathcal{H}$ to $\mathcal{H}'|_D$ increases probabilities by a factor of at most $2^{\ell}/\gamma$ (accounting for the $\ell$ bits of leakage and for conditioning on a domain of mass at least $\gamma$; the remaining leakage values are absorbed by the erroneous domain). Hence $\Pr[\mathsf{H}_I=z] \le (2^{\ell}/\gamma)\cdot M^{-\delta\cdot|I|}$. Combining the two bounds, we obtain
$$ |I| \le \frac{\ell+\log(1/\gamma)}{(\delta-\delta')\cdot\log M}~, $$
and therefore, for the total number of fixed points $p_i\le p+|I|$, we get the bound stated in the claim.

Therefore, can be written as a convex combination of
and
, i.e.,
. Since
when the algorithm
terminates, the distribution
is
-close to a convex combination of
distributions.
A special case of the above lemma for a uniform (i.e., (0, 1)-dense) starting distribution, where $p=0$ and $\delta=1$, implies the bound used by Coretti et al. [7].
Remark. The coefficient of $\ell$ on the right-hand side of the inequality established in the lemma directly controls the number of points that the simulator needs to set (see the discussion on parameter estimation below). Thus any improvement in the bound established in this lemma would translate to tolerating a higher level of adaptivity and/or obtaining an improved bound.
Below we show that the expected min-entropy deficiency after leaking $\ell$ bits of information can be upper-bounded by $\ell$ bits.

Let X be a random variable over $[M]^N$ and let $f:[M]^N\rightarrow\{0,1\}^\ell$ be an arbitrary function. For $z\in\{0,1\}^\ell$ let $\Delta_z := H_\infty(X)-H_\infty(X\mid f(X)=z)$ be the min-entropy deficiency of X conditioned on $f(X)=z$. Then, we have $\mathbb{E}_{z\leftarrow f(X)}\left[\Delta_z\right]\le\ell$.
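For the special case of a uniform starting variable this can be verified exhaustively on toy sizes; the script below (our own illustration) computes the expected deficiency of a uniform 4-bit X under a leakage function f:

```python
import math
from collections import defaultdict

def expected_deficiency(n_bits, f):
    """Expected min-entropy deficiency of a uniform X over {0,1}^n_bits
    after leaking f(X): E_z[ H_inf(X) - H_inf(X | f(X) = z) ]."""
    pre = defaultdict(int)
    for x in range(2 ** n_bits):
        pre[f(x)] += 1
    total = 2 ** n_bits
    # X conditioned on f(X) = z is uniform over the preimage of z, so its
    # min-entropy is log2 of the preimage size
    return sum((size / total) * (n_bits - math.log2(size))
               for size in pre.values())

# leaking the two top bits (l = 2) costs exactly 2 bits on average
assert abs(expected_deficiency(4, lambda x: x >> 2) - 2.0) < 1e-9
# a highly unbalanced 1-bit leak costs much less than 1 bit on average
assert expected_deficiency(4, lambda x: int(x == 0)) <= 1.0
```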
4 The Xor Combiner
In this section, we study the indifferentiability of the xor combiner $\mathsf{C}^{\mathsf{H}_1,\mathsf{H}_2}_{\oplus}(x) := \mathsf{H}_1(x)\oplus\mathsf{H}_2(x)$ in the 2-BRO model from a random oracle $\mathsf{RO}$. We show indifferentiability against adversaries that switch between the two backdoor oracles $\textsc{Bd}_1$ and $\textsc{Bd}_2$ only a logarithmic number of times, while arbitrarily interleaving queries to the underlying BROs $\mathsf{H}_1$ and $\mathsf{H}_2$, as well as to the random oracle $\mathsf{RO}$.
Fig. 1. Indifferentiability simulator for the xor combiner. We assume that initially the sets of fixed coordinates are empty, the density functions of the two simulated hash functions are uniform, and the counters s and q are set to zero.
Finding a distribution which is partly fixed and partly dense is performed by a dedicated sub-algorithm in Fig. 1. On input a distribution, a bound on the number of points to fix, and a set of already-fixed points, it returns a new distribution which is fixed on the points in a set I of bounded size and, for some $\delta$, $\delta$-dense on the rest, together with a set of assignments A for the elements of I sampled according to the output distribution. Internally it calls the refined decomposition algorithm, whose existence is guaranteed by Lemma 1, and its output distribution is one of the distributions in the convex combination that the decomposition returns.
Upon fixing rows of one simulated BRO, the same rows in the other simulated BRO have to be fixed in a way that consistency with $\mathsf{RO}$ is assured. More precisely, for any x, if the value of the first simulated hash function at x is fixed, the simulator will immediately set the second one to $\mathsf{RO}(x)$ xored with that value (and, analogously, in the other direction). The simulator specifies the number of points that it can afford to fix (since every such query requires a call to $\mathsf{RO}$) and the statistical distance that it wants. Such a strategy to assure consistency with $\mathsf{RO}$ is also followed by the evaluation simulators $\mathsf{SimH}_1$ and $\mathsf{SimH}_2$, where only one coordinate of each BRO is fixed.
Note that the simulator programs the values at freshly fixed points, which were supposed to be dense (after a first backdoor query), to values that are uniform instead. Hence, we need to argue later that the statistical distance between a uniform and a dense distribution is small for the number of points that are being treated this way. This is formalized in Claim 2, below. Looking ahead, the need to keep the advantage of the differentiator small is the reason why the simulator adapts the number of fixed points with a differentiator's switch to the other backdoor oracle. Finally, via a hybrid argument we can upper bound the total number of random oracle queries by the simulator and the advantage of the differentiator.
Let $\mathcal{U}$ be the uniform distribution and $\mathcal{V}$ be a $(1-\delta)$-dense distribution, both over the domain $[M]^t$. Then we have $\mathrm{SD}(\mathcal{U},\mathcal{V}) \le \delta\cdot t\cdot\log M$.
Let $V_+$ be the set of $z\in[M]^t$ with $\Pr[\mathcal{V}=z] > 0$. Then we have
$$\begin{aligned} \mathrm{SD}(\mathcal{U},\mathcal{V}) &= \sum_{z\in[M]^t} \max\big\{0,\Pr[\mathcal{V}=z]-\Pr[\mathcal{U}=z]\big\} \\ &= \sum_{z\in V_+} \max\big\{0,\Pr[\mathcal{V}=z]-\Pr[\mathcal{U}=z]\big\} \\ &= \sum_{z\in V_+} \Pr[\mathcal{V}=z] \cdot \max\Big\{0,1-\frac{\Pr[\mathcal{U}=z]}{\Pr[\mathcal{V}=z]}\Big\}~. \end{aligned}$$
Since $\mathcal{V}$ is $(1-\delta)$-dense, for every $z\in[M]^t$ we have $\Pr[\mathcal{V}=z] \le M^{-(1-\delta)\cdot t}$, while $\Pr[\mathcal{U}=z]= M^{-t}$. Hence each term of the sum is at most $\Pr[\mathcal{V}=z]\cdot(1-M^{-\delta\cdot t})$, and therefore
$$ \mathrm{SD}(\mathcal{U},\mathcal{V}) \le 1-M^{-\delta\cdot t} = 1-2^{-\delta\cdot t\cdot\log M} \le \delta\cdot t\cdot\log M~. $$
The following theorem states our indifferentiability result for xor.
Let $\mathsf{H}_1,\mathsf{H}_2\in[M]^N$ be two backdoored random oracles and consider the xor combiner $\mathsf{C}^{\mathsf{H}_1,\mathsf{H}_2}_{\oplus}$. Let $\mathcal{D}$ be a differentiator that places at most $q_\mathsf{C}$ construction queries, $q_\mathsf{H}$ evaluation queries, and backdoor queries with at most c switches between the two backdoor oracles. Then there is a simulator $\mathsf{Sim}[\bar{p},\gamma]:=(\mathsf{SimH}^{\mathsf{RO}}_1,\mathsf{SimH}^{\mathsf{RO}}_2,\mathsf{SimBD}^{\mathsf{RO}}_1[\bar{p},\gamma],\mathsf{SimBD}^{\mathsf{RO}}_2[\bar{p},\gamma])$, parameterized by a list $\bar{p}=(p_1,\ldots,p_{c+1})$ of numbers of points to fix and a closeness parameter $\gamma$, such that
$$\begin{aligned} \mathsf{Adv}^{\mathrm{indiff}}_{\mathsf{C}^{\mathsf{H}_1,\mathsf{H}_2}_\oplus,\mathsf{Sim}[\bar{p},\gamma]}(\mathcal{D}) \le \;&(c+1) \cdot \gamma \\&+ \log M \cdot \big( \sum^{c}_{i=1} p_i \cdot \delta_{i-1} + q_\mathsf{H}\cdot \delta_{c+1} +\, q_\mathsf{C}\cdot (\delta_{c}+\delta_{c+1})\big) ~, \end{aligned}$$
where $\delta_i$ denotes the density deficiency of the simulated hash functions after the i-th switch.
We prove indifferentiability by (1) defining a simulator, (2) upper bounding the advantage of any differentiator in distinguishing the real and the simulated worlds, and (3) upper bounding the number of queries that the simulator makes to the random oracle.
Simulator. All four sub-algorithms of the simulator are described in Fig. 1. They share state, in particular, variables to keep track of the fixed history and the current distribution of the hash functions. Two sets are used to keep track of the fixed coordinates of the simulated hash functions $\mathsf{SimH}_1$ and $\mathsf{SimH}_2$, respectively. The density functions, from which the simulated backdoored hash functions will be sampled, are denoted by $\mathcal{H}_1$ and $\mathcal{H}_2$. Furthermore, the simulator uses a counter s to recognize switches from one backdoor oracle to the other in order to use the appropriate number of points to fix from the list $\bar{p}$. It also maintains a counter q for counting the number of consecutive queries to a backdoor oracle in order to decompose, i.e., substitute the current distribution with a partially fixed and partially dense distribution, only when necessary, which is the case after each set of Q backdoor queries. We assume that initially the sets of fixed coordinates are empty, the density functions $\mathcal{H}_1$ and $\mathcal{H}_2$ are uniform, and the counters s and q are set to zero.
We bound the differentiator's advantage via a sequence of games that gradually moves from the real world to the simulated world, writing $\Pr[\mathcal{D}^{\mathsf{Game}_i}]:=\Pr[\mathcal{D}^{\mathsf{Game}_i}=1]$ for the probability that $\mathcal{D}$ outputs 1 in $\mathsf{Game}_i$.

In the first game hops, after each of the at most $c+1$ blocks of consecutive backdoor queries, the conditional distribution of the corresponding hash function is replaced by one of the $(p_i,1-\delta_i)$-dense distributions in its decomposition. By Lemma 1, each replacement incurs a statistical distance of at most $\gamma$, and hence
$$ \big|\Pr[\mathcal{D}^{\mathsf{Game}_3}]-\Pr[\mathcal{D}^{\mathsf{Game}_4}]\big| \le (c+1) \cdot \gamma~. $$

In the next hops, the values programmed on the freshly fixed points, as well as the responses to direct evaluation queries, are switched from dense to uniform (consistent with $\mathsf{RO}$). By Claim 2, fixing $p_i$ points of a distribution with deficiency $\delta_{i-1}$, and answering the $q_\mathsf{H}$ evaluation queries from a distribution with deficiency $\delta_{c+1}$, we obtain
$$ \big|\Pr[\mathcal{D}^{\mathsf{Game}_5}]-\Pr[\mathcal{D}^{\mathsf{Game}_6}]\big| \le \log M \cdot \big( \sum^{c}_{i=1} p_i \cdot \delta_{i-1} + q_\mathsf{H}\cdot \delta_{c+1} \big)~. $$
$\mathsf{Game}_7$. The construction oracle in this game differs from the previous one in that it never evaluates the individual hash functions anymore. Here, we can safely remove the second case distinction, where x is fixed for both hash functions, since this case is covered by the first case, where x has been queried to the construction itself. It remains to bound the distinguisher's advantage in distinguishing the two games while making queries x to the construction that are, prior to the query, fixed for neither hash function.
Let X and Y be two independent $(1-\delta)$-dense and $(1-\delta')$-dense distributions over a domain $[M]^N$. Then the xor distribution $X\oplus Y$ is $(1-(\delta+\delta'))$-dense over the same domain.

For any $I\subseteq[N]$ and any $z\in[M]^{|I|}$ we have
$$\begin{aligned} \Pr[X_I\oplus Y_I = z]&= \sum_x\Pr[X_I\!=\!x \wedge Y_I\!=\!x\oplus z] = \sum_x \Pr[X_I\!=\!x] \cdot \Pr[Y_I\!=\!x\oplus z] \\&\le 2^{|I|\cdot \log M} \cdot 2^{-(1-\delta)\cdot |I|\cdot \log M} \cdot 2^{-(1-\delta ')\cdot |I|\cdot \log M} \\&= 2^{-(1-(\delta +\delta '))\cdot |I| \cdot \log M}~, \end{aligned}$$
where the inequality bounds each of the $2^{|I|\cdot\log M}$ terms of the sum using the density of X and Y.
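The claim can be checked numerically on small examples; in the following sketch (our own, with M = 2 and biased product distributions) the density deficiencies are computed by brute force:

```python
import itertools
import math
from collections import defaultdict

def rate(dist, n):
    """Best density rate: min over subsets I of H_inf(X_I)/|I| (log2 M = 1)."""
    def max_prob(I):
        m = defaultdict(float)
        for h, p in dist.items():
            m[tuple(h[i] for i in I)] += p
        return max(m.values())
    return min(-math.log2(max_prob(I)) / r
               for r in range(1, n + 1)
               for I in itertools.combinations(range(n), r))

def xor_dist(dx, dy, n):
    """Distribution of the coordinate-wise xor of independent samples."""
    out = defaultdict(float)
    for hx, px in dx.items():
        for hy, py in dy.items():
            out[tuple(a ^ b for a, b in zip(hx, hy))] += px * py
    return out

n = 3
def product_dist(p1):
    """Product of n independent bits, each 1 with probability p1."""
    return {h: math.prod(p1 if b else 1 - p1 for b in h)
            for h in itertools.product((0, 1), repeat=n)}

X, Y = product_dist(0.6), product_dist(0.7)
dx, dy = 1 - rate(X, n), 1 - rate(Y, n)       # deficiencies delta, delta'
# the xor distribution has deficiency at most delta + delta'
assert 1 - rate(xor_dist(X, Y, n), n) <= dx + dy + 1e-9
```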
By the above claim, at such a point x the value $\mathsf{H}_1(x)\oplus\mathsf{H}_2(x)$ returned by the construction is distributed according to a $(1-(\delta_i+\delta_{i+1}))$-dense distribution, while in $\mathsf{Game}_7$ it is replaced by the uniform value $\mathsf{RO}(x)$. Applying Claim 2 to each of the at most $q_\mathsf{C}$ such queries, and using that the deficiencies $\delta_i$ are increasing, we obtain
$$ \big|\Pr[\mathcal{D}^{\mathsf{Game}_6}]-\Pr[\mathcal{D}^{\mathsf{Game}_7}]\big| \le q_\mathsf{C}\cdot \log M \cdot \max_{0\le i \le c}\{\delta_i+\delta_{i+1}\} = q_\mathsf{C}\cdot \log M \cdot (\delta_{c}+\delta_{c+1})~. $$

The last game is identical to the simulated world. Therefore, the overall advantage of $\mathcal{D}$ is as stated in the theorem.
Query Complexity. The queries made by the simulator to $\mathsf{RO}$ consist of those made when simulating evaluation queries and those made when simulating backdoor queries. Responding to each evaluation query requires exactly one query to $\mathsf{RO}$, which makes a total of $q_\mathsf{H}$ queries. Right after the Q-th consecutive backdoor query (i.e., right before a switch), the simulator fixes some rows of the other BRO, where for each fixed row one query to the random oracle $\mathsf{RO}$ is made. The maximum number of rows that should be fixed after each sequence of Q queries to $\textsc{Bd}_1$ (resp. $\textsc{Bd}_2$) is predetermined by the simulator's parameter $\bar{p}$. Hence we obtain a query complexity of at most $q_\mathsf{H}+\sum_{i=1}^{c+1}p_i$.
We now provide estimates for the involved parameters.









































We note that in the special case where , we must have that
. In particular we can set
to obtain a simulator that places
queries. Thus in this case we obtain collision resistance. Note, however, that as soon as
we would need to have that
, which means the simulator places at least
queries, and we do not get collision resistance.
The above corollary shows that the xor combiner can only tolerate a logarithmic number of switches in $\log N$, which we think of as the security parameter. This is due to the fact that the simulator complexity needs to be less than N/2 for it to be non-trivial. Although our bounds are arguably weak, they are still meaningful, and we conjecture that much better bounds hold in reality.
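For reference, the xor combiner under discussion can be pictured with two concrete hash functions standing in for the two BROs. This is a toy illustration only, not the paper's formal model:

```python
import hashlib

# Toy sketch of the xor combiner C^{H1,H2}(x) = H1(x) xor H2(x).
# SHA-256 and BLAKE2s stand in for the two backdoored random oracles.
def xor_combiner(x: bytes) -> bytes:
    h1 = hashlib.sha256(x).digest()
    h2 = hashlib.blake2s(x).digest()  # both digests are 32 bytes
    return bytes(a ^ b for a, b in zip(h1, h2))
```

Intuitively, if either of the two functions behaves like a fresh random function on x, the xor is uniform; the indifferentiability analysis above makes this precise even when both functions leak information through backdoors.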
5 An Extractor-Based Combiner
In this section we study the indifferentiability of extractor-based combiners and show that they can give better security parameters compared to the xor combiner of Sect. 4. Recall that in the k-BRO model one considers adversaries that have access to all k backdoor oracles. A query to the backdoor oracle reveals some information about the underlying BRO $\mathsf{H}_i$. Using the decomposition technique, the resulting distribution conditioned on the leakage can be translated into a distribution with a number of fixed coordinates, while the distribution of the rest remains dense. An indifferentiability simulator then fixes the same rows of the other BRO(s) in a way that ensures consistency with the random oracle (which is to be indistinguishable from the construction).
We demonstrated this idea for the xor combiner, where, before a switch to the other backdoor oracle, the simulator substituted p images of that BRO by uniformly random values, i.e., the result of the random-oracle values xored with the ones just fixed. This causes a security loss per switch that corresponds to the advantage of an adversary in distinguishing p uniform values from $\delta$-dense ones. Now consider a multi-source $\varepsilon$-extractor as the combiner in t-BRO. The hope would be that as long as the images of the BROs have high min-entropy, the output of the extractor is $\varepsilon$-close to uniform. This makes it possible for us to express the loss described above in terms of a negligible $\varepsilon$ and forgo the requirement on $\delta$ to be negligible.
In this section we focus on 2-out-of-3-source extractors as combiners, i.e., extractors that only require a minimal amount of min-entropy from two of the sources. More formally, let $\mathsf{Ext}:[M]^3\rightarrow [2]$ be a 2-out-of-3-source $\varepsilon$-extractor. For three functions $\mathsf{H}_1,\mathsf{H}_2,\mathsf{H}_3\in [M]^N$, the combiner $\mathsf{C}^{\mathsf{H}_1,\mathsf{H}_2,\mathsf{H}_3}_{\mathrm{3ext}}$ is defined as $\mathsf{C}^{\mathsf{H}_1,\mathsf{H}_2,\mathsf{H}_3}_{\mathrm{3ext}}(x) := \mathsf{Ext}(\mathsf{H}_1(x),\mathsf{H}_2(x),\mathsf{H}_3(x))$. Here we show that in the 3-BRO model the construction $\mathsf{C}^{\mathsf{H}_1,\mathsf{H}_2,\mathsf{H}_3}_{\mathrm{3ext}}$ is indifferentiable from a random oracle.
Why not a Two-Source Extractor? Note that we cannot guarantee that images which are being fixed by the simulator in some BRO as a result of a backdoor query have any min-entropy whatsoever. To understand why, simply consider an adversary that makes a backdoor query to $\mathsf{BD}_1$ requesting a preimage of the zero string under $\mathsf{H}_1$. Suppose $\mathsf{BD}_1$ responds to this query with some x. In this case $\mathsf{H}_1(x)$ has no min-entropy, since the image was chosen by the adversary and is, therefore, completely predictable. Hence, $\mathsf{H}_1(x)$ cannot be used as a source in a two-source extractor, which relies on min-entropy from both sources for its output to be $\varepsilon$-close to uniform. Overall, using a two-source extractor does not seem to have any advantage over the xor combiner in the 2-BRO model. On the contrary, when using a 2-out-of-3-source extractor, assuming that the rows under consideration are not already fixed in the function tables of all three BROs due to some previous query, there will be two images with high min-entropy, from which we can extract a value $\varepsilon$-close to uniform. We give a proof of the following theorem, which is relatively similar to the one for xor, in the full version of the paper.
$\mathsf{Ext}:[M]^3\rightarrow [2]$




$\mathsf{H}_1,\mathsf{H}_2,\mathsf{H}_3\in [M]^N$



$\mathsf{Sim}[\bar{p},\gamma]:=(\mathsf{SimH}^\mathsf{RO}_1,\mathsf{SimH}^\mathsf{RO}_2,\mathsf{SimH}^\mathsf{RO}_3,\mathsf{SimBD}^\mathsf{RO}_1[\bar{p},\gamma],\mathsf{SimBD}^\mathsf{RO}_2[\bar{p},\gamma],\mathsf{SimBD}^\mathsf{RO}_3[\bar{p},\gamma])$





$$\begin{aligned} \mathsf{Adv}^{\mathrm{indiff}}_{\mathsf{C}^{\mathsf{H}_1,\mathsf{H}_2,\mathsf{H}_3}_{\mathrm{3ext}},\mathsf{Sim}[\bar{p},\gamma]}(\mathcal{D}) \le \;&(c+1) \cdot \gamma \\ +&\sum^c_{i=1} \mathrm{SD}\big( E_1|\cdots|E_{p_i},\mathcal{U}_{[2]^{p_i}} \big) + q_\mathsf{H}\cdot \mathrm{SD}\big( E_1, \mathcal{U}_{[2]}\big) \\ +&\, q_\mathsf{C}\cdot\varepsilon\big((1-\delta_{c-1})\cdot \log M, (1-\delta_{c})\cdot \log M, (1-\delta_{c+1})\cdot \log M \big), \end{aligned}$$








We include the proof of the above theorem in the full version of the paper.
5.1 Instantiation with the Pairwise Inner-Product Extractor
$\mathsf{Ext}_\mathrm{pip}:[M]^t \rightarrow [2]$






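Assuming the standard pairwise inner-product construction — the GF(2) sum, over all pairs of sources, of the inner products of their bit representations — $\mathsf{Ext}_\mathrm{pip}$ can be sketched as:

```python
# Sketch of Ext_pip : [M]^t -> [2], assuming the standard construction:
# the GF(2) sum of <x_i, x_j> over all pairs i < j, where each source
# x_i in [M] is viewed as a bit vector of length log M.
def ext_pip(xs):
    def ip2(a: int, b: int) -> int:
        # inner product of the bit representations over GF(2)
        return bin(a & b).count("1") % 2
    out = 0
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            out ^= ip2(xs[i], xs[j])
    return out
```

For instance, with t = 3 sources, `ext_pip([h1, h2, h3])` yields the single output bit fed to the distinguisher.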
We obtain the following corollary for the indifferentiability of the pairwise-inner-product instantiation, which we prove in the full version of the paper.
$\mathsf{Ext}_\mathrm{pip}:[M]^t \rightarrow [2]$

$$\begin{aligned} \mathsf{Adv}^{\mathrm{indiff}}_{\mathsf{C}^{\mathsf{H}_1,\mathsf{H}_2,\mathsf{H}_3}_{\mathrm{3ext}},\mathsf{Sim}[p,\gamma]}(\mathcal{D}) \le \;& (c+1) \cdot \gamma \\ &+ c\cdot \sqrt{(e^{p\cdot M^{-(1-2\delta_{c})}} - 1)/2} \\ &+ \big( q_\mathsf{H}+ q_\mathsf{C}\big) \cdot 2^{-((1-2\delta_{c+1})\cdot \log M +1)/2}, \end{aligned}$$


We now provide estimates for the involved parameters.
























Note that for the query complexity of the simulator does not involve the
factor, and hence we obtain collision resistance. For
, however, there is a factor of at least
.
The above corollary shows that the extractor combiner can tolerate a linear number of switches in $\log N$ (which can be thought of as the security parameter) while keeping the simulator query complexity below N/2. As with the xor combiner, we conjecture that (much) better bounds for the extractor combiner are possible.
6 Indifferentiability with Auxiliary Input
In this section we discuss indifferentiability in a setting where there is no adaptivity and the backdoor oracles are called only once at the onset. Although this may seem overly restrictive, the resulting definition is sufficiently strong to model indifferentiability in the presence of auxiliary input, whereby we would like to securely replace random oracles in generic applications even when preprocessed auxiliary information is available.






with indistinguishable outputs b. The on-line simulators can also share state.










are indistinguishable. Once again the query complexity of the simulator should be small (or even zero) to obtain a definition which meaningfully formalizes indifferentiability from random oracles without auxiliary input. This definition, however, turns out to be unachievable: the preprocessing can simply encode a pair of collisions for the construction, which the simulator will not be able to match (with respect to the random oracle) without an exponentially large number of queries to it.5
There are two natural ways to overcome this: (1) restrict the interface of the construction; or (2) restrict the form of preprocessing. The former is motivated by the use of salting as a means to defeat preprocessing, and the latter by the independence of preprocessing for BROs.
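Option (1) can be pictured as follows: a public salt is sampled only after any preprocessing has taken place, so auxiliary input prepared about the unsalted function is of little use against the salted instance. The sketch below is a toy illustration with hypothetical names:

```python
import hashlib
import os

# Toy sketch of salting as a defense against preprocessing: the salt is
# drawn fresh (after preprocessing) and prepended to every input.
def sample_salt(salt_len: int = 16) -> bytes:
    return os.urandom(salt_len)  # public, but unknown at preprocessing time

def salted_hash(salt: bytes, x: bytes) -> bytes:
    return hashlib.sha256(salt + x).digest()
```

A precomputed pair of colliding inputs for the unsalted function will, with overwhelming probability, no longer collide once a fresh salt is prepended.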
A final question arises here: is it possible to simplify this definition further by removing the quantification over (as done for standard indifferentiability)? This could be done in the standard way by absorbing
into
to form a differentiator
. However, this means that
must receive the auxiliary information z. The resulting notion is stronger and models composition with respect to games that also depend on preprocessing. Thus, due to its simplicity, strength, and the fact that we can establish positive results for it, we focus on this definitional approach. We now make the two definitions arising from (1) and (2) explicit.








resulting in indistinguishable outputs b. We denote the advantage of $\mathcal{D}$ in the salted AI-indifferentiability game with simulator $\mathsf{Sim}$ for a construction $\mathsf{C}$ accordingly. Notice that in the above definition, the distinguisher gets access to a salted random oracle. A different definition arises when the distinguisher gets access to an unsalted random oracle instead. However, since the simulated auxiliary information is computed given access to an unsalted random oracle (which can be interpreted as having implicit access to the salt), such a definition calls for the existence of a more powerful simulator. In particular, such simulators can easily call the random oracle on common points. The practical implications of such a definition are unclear to us, and moreover, it is strictly weaker than our definition.




resulting in indistinguishable outputs b. Note that this is slightly weaker than the definition of indifferentiability in 2-BRO, since each piece of auxiliary input is fully independent of the other BRO, whereas BRO indifferentiability allows for a limited amount of dependence. We denote the advantage of $\mathcal{D}$ in the AI-indifferentiability game with independent preprocessing, with respect to a simulator $\mathsf{Sim}$ and a construction $\mathsf{C}$ in the 2-BRO model, accordingly.
We are now ready to prove our feasibility results for AI-indifferentiability.















The first part of the theorem follows directly from the discussion above: indifferentiability with backdoors and no adaptivity is stronger than indifferentiability with auxiliary input under independent preprocessing.
We now prove the second part of the theorem.










$$\Pr[\mathsf{Game}_1] - \Pr[\mathsf{Game}_0] \le \frac{\ell + \log \gamma^{-1}}{p} + \gamma\,,$$




$\mathsf{H}[A]$

This modification is justified by the fact that the probability that a uniform is (the prefix of the first component of some point) in
is at most
. We have that
.







Here runs
relaying its queries to the first oracle to its own first oracle and the second oracle queries to its own second oracle except when a queried point appears as a prefix of the first component of an entry in
in which case
uses
to answer the query. We have that
.





Here simply runs
, followed by picking
, and then running
. We have that
.


where is an indifferentiability simulator. We have that
.




We have that .



![$$\mathsf {Sim}_1[A]$$](../images/508076_1_En_9_Chapter/508076_1_En_9_Chapter_TeX_IEq694.png)




We have that .



We have that .




We may instantiate the first part of the theorem with the xor combiner and an indifferentiability simulator for it given in Sect. 4. In this case the off-line phase of the simulator makes no queries to the random oracle (and outputs simulated auxiliary inputs by picking hash functions for the queried backdoor functions to $\mathsf{H}_1$ and $\mathsf{H}_2$). This off-line phase also outputs two sets of preset points as its state, which will be shared with the on-line phase of the simulation. The second phase of the simulator is a simple xor indifferentiability simulator which ensures consistency with the preset points. Here our simulator fixes $p$ points for $\mathsf{H}_1$ and $p$ points for $\mathsf{H}_2$. This results in a simulator query complexity of
. The corresponding advantage bound is at most
which is of order
. Setting
we get a simulator with
queries for an advantage
. For
we get a bound that is meaningful for collision resistance.
As a result, we get that the xor combiner is collision resistant in the presence of independent auxiliary input (with no salting). We note that the xor construction comes with the added advantage that its security goes beyond AI-indifferentiability; it is also more domain efficient. Strictly speaking, however, the two settings are incomparable, as the form of auxiliary information changes.
Acknowledgments. Dodis was partially supported by gifts from VMware Labs, Facebook and Google, and NSF grants 1314568, 1619158, 1815546. Mazaheri was supported by the German Federal Ministry of Education and Research (BMBF) and by the Hessian State Ministry for Higher Education, Research and the Arts, within ATHENE. Tessaro was partially supported by NSF grants CNS-1930117 (CAREER), CNS-1926324, CNS-2026774, a Sloan Research Fellowship, and a JP Morgan Faculty Award.