1 Introduction
In the function-inversion problem, an algorithm, the inverter, attempts to find a preimage of a randomly chosen output $y$ of a random function $f:[n] \rightarrow [n]$. The inverter is equipped with an $s$-bit advice string on $f$, and may make $q$ oracle queries to $f$. Since $s$ lower-bounds the inverter's space complexity and $q$ lower-bounds its time complexity, it is common to refer to the relation between $s$ and $q$ as the inverter's time/memory tradeoff. The function-inversion problem is central to both theoretical and practical cryptography. On the theoretical end, the security of many systems relies on the existence of one-way functions. While the task of inverting one-way functions is very different from that of inverting random functions, understanding the latter task is critical for developing lower bounds on the possible (black-box) implications of one-way functions, e.g., Impagliazzo and Rudich [18], Gennaro et al. [14]. On the practical end, advances on this problem (at least on the positive side, i.e., inverters) are likely to find applications, as algorithms for function inversion are used to expose weaknesses in existing cryptosystems.
Much progress has been made on adaptive function inversion—the inverter may choose its queries adaptively (i.e., based on the answers to previous queries). Hellman [17] presented an adaptive inverter that inverts a random $f$ with high probability. Fiat and Naor [12] proved that for any $s, q$ with $s^3 q \ge n^3$ (ignoring low-order terms), an $s$-advice, $q$-query variant of Hellman's algorithm inverts a constant fraction of the image points of any function. Yao [27] proved a lower bound of $s \cdot q \in \widetilde{\Omega}(n)$ for this problem. Closing the gap between the above lower and upper bounds is a long-standing open question. In contrast, very little is known about the non-adaptive variant of this problem—the inverter performs all of its queries at once. This variant is interesting since such an inverter is likely to be highly parallelizable, making it significantly more tractable for real-world applications. The only known upper bounds for this variant, i.e., for non-adaptive inverters, are the trivial ones (i.e., $s \approx n\log n$ with no queries, or $q = n$ with no advice), and the only known lower bound is the above bound of Yao [27]. In a recent work, Corrigan-Gibbs and Kogan [9] partially justified the difficulty of finding lower bounds for this seemingly easier-to-tackle problem, showing that lower bounds on non-adaptive inversion imply circuit lower bounds that, for strong enough parameters, are notoriously hard to prove (see further details in Sect. 1.1).
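To give a feel for how such tradeoffs arise, the following is a minimal sketch of a Hellman-style chain construction (our toy illustration of the idea behind [17], not the variant analyzed in this paper); the parameters `num_chains` and `chain_len` are illustrative knobs for the $s$/$q$ tradeoff, and single-table coverage of the image is only partial.

```python
import random

def build_hellman_table(f, n, num_chains, chain_len):
    """Preprocessing: store only (endpoint -> startpoint) of each chain
    of f-iterates. The advice size grows with num_chains; inversion
    below uses about chain_len queries, illustrating the tradeoff."""
    table = {}
    for _ in range(num_chains):
        start = random.randrange(n)
        x = start
        for _ in range(chain_len):
            x = f(x)
        table[x] = start
    return table

def invert(f, y, table, chain_len):
    """Decoding: walk y forward; on hitting a stored endpoint, replay
    the chain from its start, looking for a preimage of y."""
    x = y
    for _ in range(chain_len):
        if x in table:
            z = table[x]
            for _ in range(chain_len):
                if f(z) == y:
                    return z
                z = f(z)
        x = f(x)
    return None

n = 1 << 12
f = lambda x: (x * x + 1) % n   # a stand-in function, not random
table = build_hellman_table(f, n, num_chains=64, chain_len=64)
y = f(1234)
print(invert(f, y, table, chain_len=64))  # a preimage of y, or None
```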
1.1 Our Results
We make progress on this intriguing question, proving lower bounds on restricted families of inverters. To state our results, we use the following formalization to capture inverters with a preprocessing phase: such inverters have two parts, the preprocessing algorithm that gets as input the function to invert $f$ and outputs an advice string $a$, and the decoding algorithm that takes as input the element to invert $y$ and the advice string $a$, and using restricted query access to $f$ tries to find a preimage of $y$. We start by describing our bound for the time/memory tradeoff of linear-advice (adaptive) inverters, and then present our lower bounds for non-adaptive inverters. In the following, fix $n \in \mathbb{N}$ and let $\mathcal{F}$ be the set of all functions from $[n]$ to $[n]$.
Linear-Advice Inverters. We start with a more formal description of adaptive function inverters. An $s$-advice, $q$-query adaptive inverter is a pair of algorithms $(\mathsf{C}_{\mathsf{pre}}, \mathsf{C}_{\mathsf{dec}})$: the preprocessing algorithm $\mathsf{C}_{\mathsf{pre}}$ maps a function $f \in \mathcal{F}$ into an $s$-bit advice string, and the oracle-aided decoding algorithm

$$\mathsf{C}_{\mathsf{dec}}^{(\cdot)}:[n]\times \{0,1\}^{s} \rightarrow [n]$$

makes at most $q$ oracle queries. Such an inverter inverts $\mathcal{F}$ (with high probability) if

$$\Pr_{\substack{f\leftarrow \mathcal{F}\\ a := \mathsf{C}_{\mathsf{pre}}(f)}}\left[ \Pr_{\substack{x\leftarrow [n]\\ y := f(x)}}\left[ \mathsf{C}_{\mathsf{dec}}^f(y,a) \in f^{-1}(y)\right] \ge 1/2\right] \ge 1/2.$$
It is common to refer to $a$ (the output of $\mathsf{C}_{\mathsf{pre}}$) as the advice string. In linear-advice inverters, the preprocessing algorithm $\mathsf{C}_{\mathsf{pre}}$ is restricted to output a linear function of $f$. That is, $\mathsf{C}_{\mathsf{pre}}(f) + \mathsf{C}_{\mathsf{pre}}(f') = \mathsf{C}_{\mathsf{pre}}(f + f')$, where the addition $f + f'$ is coordinate-wise with respect to an arbitrary group over $[n]$, and the addition of advice strings is over an arbitrary group that contains the image of $\mathsf{C}_{\mathsf{pre}}$. An example of such a preprocessing algorithm is one that outputs a linear combination of the entries of $f$, viewing $f$ as a vector in $[n]^n$. For such inverters, we present the following bound.

Theorem 1.2 (Bound on linear-advice inverters). Assume there exists an $s$-advice, $q$-query inverter with linear preprocessing that inverts $\mathcal{F}$ with high probability. Then $s + q\log n \in \Omega(n)$.
We prove Theorem 1.2 via a reduction from set disjointness, a classical problem in the study of two-party communication complexity. The above result generalizes to the following bound, which replaces the structural restriction on the preprocessing (i.e., linearity) with the ability to compute the advice string of $f + f'$ by a low-communication two-party protocol over the inputs $f$ and $f'$.
Theorem 1.3 (Bound on additive-advice inverters, informal). Assume there exists a $q$-query inverter $\mathsf{C} = (\mathsf{C}_{\mathsf{pre}}, \mathsf{C}_{\mathsf{dec}})$ and an $s$-bit communication two-party protocol $\Sigma$ such that for every $f, f' \in \mathcal{F}$, the output of the parties in $\Sigma(f, f')$ equals, with constant probability, $\mathsf{C}_{\mathsf{pre}}(f + f')$. Then $s + q\log n \in \Omega(n)$.
The above bound indeed generalizes Theorem 1.2: a preprocessing algorithm of the type required by Theorem 1.2 immediately yields a two-party protocol of the type required by Theorem 1.3 (each party locally computes the advice of its own input, and one party sends its advice to the other, who adds the two strings).
Non-adaptive Inverters. In the non-adaptive setting, the decoding algorithm has two phases: the query-selection algorithm that chooses the queries as a function of the advice and the element to invert $y$, and the actual decoder that receives the answers to these queries along with the advice string and $y$. Formally, an $s$-advice, $q$-query non-adaptive inverter is a triple of algorithms $(\mathsf{C}_{\mathsf{pre}}, \mathsf{C}_\mathsf{qry}, \mathsf{C}_{\mathsf{dec}})$, where $\mathsf{C}_{\mathsf{pre}}$ maps $f \in \mathcal{F}$ into an $s$-bit advice string, and

$$\mathsf{C}_\mathsf{qry}:[n] \times \{0,1\}^{s} \rightarrow [n]^q \qquad\text{and}\qquad \mathsf{C}_{\mathsf{dec}}:[n] \times \{0,1\}^{s} \times [n]^q\rightarrow [n].$$

Such an inverter inverts $\mathcal{F}$ if

$$\Pr_{\substack{f\leftarrow \mathcal{F}\\ a = \mathsf{C}_{\mathsf{pre}}(f)}}\left[ \Pr_{\substack{x\leftarrow [n]\\ y=f(x)\\ v = \mathsf{C}_\mathsf{qry}(y,a)}}\left[ \mathsf{C}_{\mathsf{dec}}(y,a,f(v)) \in f^{-1}(y)\right] \ge 1/2\right] \ge 1/2.$$
Note that the query vector v is of length q, so the answer vector f(v) contains q answers. Assuming there exists a field of size n (see Remark 1.7), we provide two lower bounds for such inverters.
Affine Decoders. The first bound concerns inverters with affine decoders. A decoder algorithm is affine if it computes an affine function of $f$'s answers. That is, for every image $y \in [n]$ and advice $a \in \{0,1\}^s$, there exist a $q$-sparse vector $c \in \mathbb{F}^n$ and a field element $b \in \mathbb{F}$ such that $\mathsf{C}_{\mathsf{dec}}(y,a,f(\mathsf{C}_\mathsf{qry}(y,a))) = \langle c, f\rangle + b$ for every $f \in \mathcal{F}$ (viewing $f$ as a vector in $\mathbb{F}^n$). For this type of inverters, we present the following lower bound.

Theorem 1.5 (Bound on non-adaptive inverters with affine decoders, informal). Assume there exists an $s$-advice non-adaptive function inverter with an affine decoder that inverts $\mathcal{F}$ with high probability. Then $s \in \Omega(n)$.
Note that the above bound on $s$ holds even if the inverter queries $f$ on all inputs. While Theorem 1.5 is not very insightful in its own right, as we cannot expect a non-adaptive inverter to have such a limiting structure, it is important since it can be generalized to affine decision trees, a much richer family of non-adaptive inverters defined below. In addition, the result should be contrasted with the question of black-box function computation, see Sect. 1.2, for which linear algorithms are optimal. Thus, Theorem 1.5 highlights the differences between these two related problems.
Affine Decision Trees. The second bound concerns inverters whose decoders are affine decision trees. An affine decision tree is a decision tree whose nodes compute affine functions, over $\mathbb{F}$, of the input vector. A decoder algorithm $\mathsf{C}_{\mathsf{dec}}$ is an affine decision tree, if for every image $y \in [n]$, advice $a \in \{0,1\}^s$ and queries $v \in [n]^q$, there exists an affine decision tree $T$ such that $\mathsf{C}_{\mathsf{dec}}(y,a,f(v)) = T(f)$ (i.e., the output of $T$ on input $f$) for every $f \in \mathcal{F}$. For such inverters, we present the following bound.

Theorem 1.6 (Bound on non-adaptive inverters with affine decision-tree decoders, informal). Assume there exists an $s$-advice, $q$-query non-adaptive inverter with a $d$-depth affine decision-tree decoder that inverts $\mathcal{F}$ with high probability. Then, for some universal constant $c$, $s \in \widetilde{\Omega}(n/d)$, provided $q$ is not too large (see Sect. 5.2 for the precise statement).

That is, we lose a factor of $d$ compared to the affine-decoder bound, and the bound on $s$ only holds for not-too-large $q$. Affine decision trees are much stronger than affine decoders, since the affine operations they compute may depend adaptively on the results of previous affine operations. For example, a depth-$d$ affine decision tree can compute any function of $d$ linear combinations of the inputs. In particular, multiplication of two function values, which cannot be computed by an affine decoder, can be computed by a depth-two decision tree (see the sketch below). We note that an affine decision tree of depth $q$ can compute any function of its $q$ queries. Unfortunately, for $d$ approaching $q$, our bound only reproduces (up to log factors) the lower bound of Yao [27].
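To illustrate the last point, here is a toy sketch (our illustration, following the tree conventions formalized in Sect. 5.2, with the arbitrary field size $p = 5$ standing in for $n$): the first node branches on the value $v$ of the affine form $f(1)$, and the branch indexed by $v$ then outputs the affine form $v \cdot f(2)$, which is affine in $f$ since $v$ is a constant along that branch.

```python
p = 5  # toy field size; plays the role of the field of size n

def depth_two_product(f):
    # Depth 1: branch on the value v of the affine form <e_1, f> = f(1).
    v = f[1] % p
    # Depth 2: the branch indexed by v outputs the affine form v*f(2) + 0,
    # which is affine in f (v is a constant on this branch).
    return (v * f[2]) % p

# An affine decoder cannot compute f(1)*f(2), but this depth-2 tree does:
for f1 in range(p):
    for f2 in range(p):
        assert depth_two_product([0, f1, f2]) == (f1 * f2) % p
```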
Remark 1.7 (Field size). In Theorems 1.5 and 1.6, the field size is assumed to be exactly $n$ (the size of the domain of the function to invert). Decoders over fields smaller than $n$ are not particularly useful, since their output cannot cover all possible preimages of $f$. Our proof breaks down for fields of size larger than $n$, since we cannot use linear equations to represent the constraint that the decoder's output must be contained in the smaller set $[n]$.

Application to Circuits. Our bound on affine decision trees can be restated in the language of circuits: a random function $f:[n] \to [n]$ (viewed as a mapping of $n$ input elements to $n$ output elements) cannot be computed by an unbounded fan-in depth-two circuit with the following properties:

1. It has a bounded number of middle-layer gates (corresponding to the advice length $s$).
2. Each output gate is connected to a bounded number of input gates (corresponding to the number of queries $q$), and to an arbitrary number of middle-layer gates.
3. Each output gate computes a function which is computable by a $d$-depth linear decision tree in the inputs (and depends arbitrarily on the middle layer).

In fact, our bound yields that such circuits cannot even approximate $f$, in the sense of having every output gate output the right value with probability larger than $1/2$ over the inputs.
1.2 Additional Related Work
Adaptive Inverters
Upper Bounds. The main result in this setting is the $s$-advice, $q$-query inverter of Hellman [17] and Fiat and Naor [12], which inverts a constant fraction of the image of any function, for any $s, q$ such that $s^3 q \ge n^3$ (ignoring low-order terms). When used on random permutations, a variant of the same idea yields an optimal inverter with $s \cdot q = n$. The inverter of Hellman, Fiat and Naor has found application in practical cryptanalysis, e.g., Biryukov and Shamir [5], Biryukov et al. [6], Oechslin [20].

Lower Bounds. A long line of research (Gennaro et al. [14], Dodis et al. [11], Abusalah et al. [1], Unruh [22], Coretti et al. [8], De et al. [10]) provides lower bounds for various variations on the classical setting, such as randomized inversion algorithms that succeed on a sub-constant fraction of functions. None of these lower bounds, however, manages to improve on Yao's lower bound of $s \cdot q \in \widetilde{\Omega}(n)$, leaving a large gap between this lower bound and the inverter of Hellman, Fiat and Naor.
Non-adaptive Inverters
Upper Bounds. In contrast to the adaptive case, it is not clear how to exploit non-adaptive queries in a non-trivial way. Indeed, the only known inverters are the trivial ones (roughly, the advice is the function's description, or the inverter queries the function on all inputs).
Lower Bounds. Somewhat surprisingly, the only known lower bound for non-adaptive inverters is Yao's, mentioned above. This defies the basic intuition that, given the extreme limitations under which non-adaptive inverters operate, proving lower bounds for them should be easier than in the adaptive case. This difficulty was partially justified by the recent reduction of Corrigan-Gibbs and Kogan [9] (see Sect. 1.1), which implies that a strong enough lower bound on even strongly non-adaptive inverters would yield a lower bound on low-depth Boolean circuits that is notoriously hard to prove.
Relation to Data Structures. The problem of function inversion with advice may also be phrased as a problem in data structures, where the advice string serves as a succinct data structure for answering questions about $f$. In particular, it bears a strong similarity to the substring-search problem in the cell-probe model [25]: the task of ascertaining the existence of a certain element within a large, unsorted database, using as few queries to the database and as little preprocessing as possible. Upper and lower bounds easily carry over between the two problems, a connection made by Corrigan-Gibbs and Kogan [9], who used it to obtain previously unknown upper bounds for substring search.
Index Coding and Black-Box Function Computation. A syntactically related problem to function inversion is the so-called black-box function computation: an algorithm tries to compute $f(x)$, for a randomly chosen $x$, using an advice of length $s$ on $f$, and by querying $f$ on $q$ inputs other than $x$. Yao [26] proved a tight lower bound on the time/memory tradeoff of this problem, and presented a linear, non-adaptive algorithm that matches it.
A much-researched special case of this problem is known as the index coding problem [4], originally inspired by information distribution over networks. In this setting, a single party is in possession of a vector $f$, and broadcasts a short message $a$ such that $n$ different recipients may each recover a particular value of $f$, using the broadcast message and knowledge of certain other values of $f$, as determined by a knowledge graph. The analogy to non-adaptive black-box function computation is evident when viewing $a$ as the advice string, and the access to various values of $f$ as queries. While Yao's bound on the time/memory tradeoff also holds for the index coding problem, other lower bounds, some of which consider "linear" algorithms [3, 4, 15, 16, 19], do not seem to be relevant for the function-inversion problem.
Open Questions
The main challenge remains to gain a better understanding of the power of adaptive and non-adaptive function inverters. A more specific challenge is to generalize our bounds on affine decoders and affine decision trees to affine operations over arbitrary (large) fields.
Paper Organization
A rather detailed description of our proof technique is given in Sect. 2. Basic notations, definitions and facts are given in Sect. 3, where we also prove several basic claims regarding random function inversion. The bound on linear-advice inverters is given in Sect. 4, and the bounds on non-adaptive inverters are given in Sect. 5.
2 Our Technique
In this section we provide a rather elaborate description of our proof technique. We start with the bound on linear-advice inverters in Sect. 2.1, and then in Sect. 2.2 describe the bounds for non-adaptive inverters.
2.1 Linear-Advice Inverters
Our lower bound for inverters with linear advice (and its immediate generalization to additive-advice inverters) is proved via a reduction from set disjointness, a classical problem in the study of two-party communication complexity. In the set disjointness problem, two parties, Alice and Bob, receive two subsets $\mathcal{X}, \mathcal{Y} \subseteq [n]$, respectively, and by communicating with each other try to determine whether $\mathcal{X} \cap \mathcal{Y} = \emptyset$. The question is how many bits the parties have to exchange in order to output the right answer with high probability. Given an inverter with linear advice, we use it to construct a protocol that solves set disjointness, on all pairs of sets that intersect in at most one element, by exchanging $s + O(q\log n)$ bits. Razborov [21] proved that to answer correctly with constant success probability on all such input pairs, the parties have to exchange $\Omega(n)$ bits. Hence, the above reduction implies the desired lower bound on the time/memory tradeoff of such inverters.
Fix a $q$-query, $s$-advice inverter $\mathsf{C} = (\mathsf{C}_{\mathsf{pre}}, \mathsf{C}_{\mathsf{dec}})$ with linear advice, and assume for simplicity that $\mathsf{C}$'s success probability is one. The following observation immediately follows by definition: let $a = \mathsf{C}_{\mathsf{pre}}(f)$ and $a' = \mathsf{C}_{\mathsf{pre}}(f')$ be the advice strings for some functions $f$ and $f'$, respectively. The linearity of $\mathsf{C}_{\mathsf{pre}}$ yields that $a + a' = \mathsf{C}_{\mathsf{pre}}(f + f')$. That is, $a + a'$ is the advice for the function $f + f'$ (all additions are over the appropriate groups). Given this observation, we use $\mathsf{C}$ to solve set disjointness as follows: using shared randomness, the parties choose a uniform target $y \in [n]$ and uniform masks $z_1,\ldots,z_n \in [n]$. Alice and Bob then (locally) convert their input sets $\mathcal{X}$ and $\mathcal{Y}$ into functions $f_\mathcal{X}$ and $f_\mathcal{Y}$, respectively, such that, letting $f := f_\mathcal{X} + f_\mathcal{Y}$, it holds that $f(x) = y$ for any $x \in \mathcal{X} \cap \mathcal{Y}$, and $f(x)$ is uniform for $x \notin \mathcal{X} \cap \mathcal{Y}$. Alice then sends $\mathsf{C}_{\mathsf{pre}}(f_\mathcal{X})$ to Bob, who uses it to compute $a := \mathsf{C}_{\mathsf{pre}}(f_\mathcal{X}) + \mathsf{C}_{\mathsf{pre}}(f_\mathcal{Y}) = \mathsf{C}_{\mathsf{pre}}(f)$. Equipped with the advice $a$ and the help of Alice, Bob then emulates $\mathsf{C}_{\mathsf{dec}}^f(y,a)$ and finds $x \in f^{-1}(y)$, if such exists. Since $f$ is unlikely to map many elements outside of $\mathcal{X} \cap \mathcal{Y}$ to $y$, finding such an $x$ is highly correlated with $\mathcal{X} \cap \mathcal{Y} \neq \emptyset$. In more detail, the set disjointness protocol is defined as follows.

1. $\mathsf{A}$ samples $f_\mathcal{X}$ by letting $f_\mathcal{X}(i) := y - z_i$ for $i \in \mathcal{X}$, and sampling $f_\mathcal{X}(i)$ uniformly at random for $i \notin \mathcal{X}$.
2. $\mathsf{B}$ samples $f_\mathcal{Y}$ analogously, with respect to $\mathcal{Y}$ (letting $f_\mathcal{Y}(i) := z_i$ for $i \in \mathcal{Y}$).
   Let $f := f_\mathcal{X} + f_\mathcal{Y}$.
3. $\mathsf{A}$ sends $\tilde{a} := \mathsf{C}_{\mathsf{pre}}(f_\mathcal{X})$ to $\mathsf{B}$, and $\mathsf{B}$ sets $a := \tilde{a} + \mathsf{C}_{\mathsf{pre}}(f_\mathcal{Y})$.1
4. $\mathsf{B}$ emulates $\mathsf{C}_{\mathsf{dec}}^f(y,a)$ while answering each query $r$ that $\mathsf{C}_{\mathsf{dec}}$ makes to $f$ as follows:
   (a) $\mathsf{B}$ sends $r$ to $\mathsf{A}$.
   (b) $\mathsf{A}$ sends $f_\mathcal{X}(r)$ back to $\mathsf{B}$.
   (c) $\mathsf{B}$ replies $f_\mathcal{X}(r) + f_\mathcal{Y}(r)$ to $\mathsf{C}_{\mathsf{dec}}$ (as the value of $f(r)$).
   Let $x$ be $\mathsf{C}_{\mathsf{dec}}$'s answer at the end of the above emulation.
5. The parties reject if $x \in \mathcal{X} \cap \mathcal{Y}$ (using an additional $O(\log n)$ bits to find it out), and accept otherwise.
The communication complexity of the protocol is essentially $s + q \cdot O(\log n)$. It is also clear that the parties accept whenever $\mathcal{X} \cap \mathcal{Y} = \emptyset$. For the complementary case, by construction, the intersection point of $\mathcal{X}$ and $\mathcal{Y}$ is in $f^{-1}(y)$. Furthermore, since $f(i)$ is a random value for all $i \notin \mathcal{X} \cap \mathcal{Y}$, with constant probability only the intersection point is in $f^{-1}(y)$. Therefore, the protocol is likely to answer correctly also in the case that $\mathcal{X} \cap \mathcal{Y} \neq \emptyset$.
2.2 Non-adaptive Inverters

We prove our bounds for non-adaptive inverters by bounding the probability that such an inverter successfully inverts, with respect to the same random function, a sequence of independent uniform challenges. Consider first a zero-advice inverter $\mathsf{C} = (\mathsf{C}_\mathsf{qry}, \mathsf{C}_{\mathsf{dec}})$. Let $F$ be a uniform element of $\mathcal{F}$, viewed as a (random) vector in $[n]^n$, let $Y_1, Y_2, \ldots$ be independent uniform elements of $[n]$, and for $i\in [n]$ let $X_i := \mathsf{C}(Y_i; F)$ denote the inverter's answer on the $i$-th challenge, and let $Z_i$ denote the event that $F(X_j) = Y_j$ for all $j\in [i]$. The heart of the proof is showing that, for an appropriate $m \in \Omega(n)$,

$$\Pr\left[ Z_m\right] \in 2^{-\varOmega(m)}$$

which yields that a zero-advice inverter is extremely unlikely to invert well:

$$\Pr_{f\leftarrow \mathcal{F}}\left[ \Pr_{\substack{x\leftarrow [n]\\ y=f(x)\\ v = \mathsf{C}_\mathsf{qry}(y)}}\left[ \mathsf{C}_{\mathsf{dec}}(y,f(v)) \in f^{-1}(y)\right] \ge 1/2\right] < 2^{-\varOmega(m)}= 2^{-\varOmega(n)}.$$

A union bound over the $2^s$ possible advice strings then extends the bound to $s$-advice inverters, and is meaningful as long as $s \in o(m)$. The bound on $\Pr[Z_m]$ follows by showing that for every $i\in [m]$,

$$\Pr\left[ Z_i \mid Z_{i-1}\right] < 3/5.$$

That is, even conditioned on the (unlikely) event that the inverter successfully inverted the first $i-1$ challenges, the probability that it inverts the next challenge is bounded away from one.

To bound $\Pr\left[ Z_i \mid Z_{i-1}\right]$, we translate the conditioning on $Z_{i-1}$ into a system of affine equations over $F$: we define a (random) matrix $\mathbf{M}$ and a (random) vector $v$ such that:

1. $\mathbf{M}$ and $v$ are deterministic functions of $X_{<i}$ and $Y_{<i}$, letting $Y_{<i}$ stand for $(Y_1,\ldots,Y_{i-1})$ and likewise for $X_{<i}$.
2. The event $Z_{i-1}$ is the event $\mathbf{M}\times F = v$, for $Y_i$ being a uniform, and independent, element of $[n]$.
   (In particular, conditioned on $Z_{i-1}$, the function $F$ is uniform over the solution set of $\mathbf{M}\times F = v$.)
3. The event $Z_i$ is the event $Z_{i-1} \wedge F(X_i) = Y_i$.
   (In particular, $Z_i$ binds the value of $F(X_i)$ to $Y_i$.)
Affine Decoders. For an inverter with an affine decoder, such a system is obtained as follows. For every $y \in [n]$, let $\alpha^{y} \in \mathbb{F}^n$ be the ($q$-sparse) coefficient vector and $b^{y} \in \mathbb{F}$ the field element such that $\mathsf{C}(y;f) = \langle \alpha^{y}, f\rangle + b^{y}$ for every $f$. The event $Z_{i-1}$ is the conjunction of the following two conditions, over all $j \in [i-1]$:

1. $\langle \alpha^{Y_j}, F\rangle + b^{Y_j} = X_j$ (the decoder's computation on the $j$-th challenge).
2. $F(X_j) = Y_j$ (the answer $X_j$ is indeed a preimage of $Y_j$).

Let $\ell := 2(i-1)$, and let $\mathbf{M}$ be the (random) $\ell$-row matrix defined by $\mathbf{M}_{2j-1} := \alpha^{Y_j}$ and $\mathbf{M}_{2j} := e_{X_j}$, letting $e_k$ be the unit vector $(0^{k-1},1,0^{n-k})$. Let $v$ be the (random) vector defined by $v_{2j-1} := X_j - b^{Y_j}$ and $v_{2j} := Y_j$. By definition, the event $Z_{i-1}$ is equivalent to the event $\mathbf{M} \times F = v$. The computation $\mathsf{C}$ makes on input $Y_i$ can also be described by the linear equation $X_i = \langle \alpha^{Y_i}, F\rangle + b^{Y_i}$; let $\mathbf{M}^{Y_i}$ denote $\mathbf{M}$ with the row $\alpha^{Y_i}$ appended. We make use of the following claims (see proofs in Sect. 3.2).
Definition (Spanned unit vectors). For a matrix $\mathbf{M}$, let $\mathcal{E}(\mathbf{M}) := \left\{ i : e_i \in \mathrm{span}(\mathbf{M})\right\}$, for $\mathrm{span}(\mathbf{M})$ being the (linear) span of $\mathbf{M}$'s rows.

That is, $\mathcal{E}(\mathbf{M})$ is the set of indices of all unit vectors spanned by $\mathbf{M}$'s rows. It is clear that $|\mathcal{E}(\mathbf{M})|$ is at most the number of $\mathbf{M}$'s rows. The following claim states that for $i \notin \mathcal{E}(\mathbf{M})$, knowing the value of $\mathbf{M} \times F$ gives no information about $F(i)$.

Claim. Let $\mathbf{M}$ be a matrix over $\mathbb{F}$ and let $i \notin \mathcal{E}(\mathbf{M})$. Then for every $v \in \mathrm{Image}(\mathbf{M})$ and $y \in [n]$, it holds that $\Pr\left[ F(i) = y \mid \mathbf{M}\times F = v\right] = 1/n$.

The second claim roughly states that by concatenating a $c$-row matrix to a given matrix $\mathbf{M}$, one does not increase the spanned unit set of $\mathbf{M}$ by more than $c$ elements.

Claim. For every $\ell$-row matrix $\mathbf{M}$ there exists an $\ell$-size set $\mathcal{S}$ such that the following holds: for every $c$-row matrix $\mathbf{M}'$ there exists a $c$-size set $\mathcal{S}'$ such that $\mathcal{E}\begin{pmatrix}\mathbf{M}\\ \mathbf{M}'\end{pmatrix} \subseteq \mathcal{S}\cup \mathcal{S}'$.
Equipped with these claims, we bound $\Pr\left[ Z_i \mid Z_{i-1}\right]$ as follows. Compute

$$\Pr\left[ Z_i \mid Z_{i-1}\right] = \Pr\left[ Z_i \wedge X_i \in \mathcal{E}(\mathbf{M}) \mid Z_{i-1}\right] + \Pr\left[ Z_i \wedge X_i \notin \mathcal{E}(\mathbf{M}) \mid Z_{i-1}\right].$$

By the first claim, conditioned on $Z_{i-1}$ and on $X_i \notin \mathcal{E}(\mathbf{M})$, the value of $F(X_i)$ is uniform over $[n]$, and thus the second summand is at most $1/n$. For the first summand, note that $Z_i \wedge X_i \in \mathcal{E}(\mathbf{M})$ implies $Y_i = F(X_i) \in F(\mathcal{E}(\mathbf{M}))$. Let $\mathcal{S}$ be the $\ell$-size set guaranteed by the second claim for $\mathbf{M}$, and let $\{T\}$ be the one-element set it guarantees for the additional row $\alpha^{Y_i}$ (note that $T$ is a function of $Y_i$). Since $\mathcal{E}(\mathbf{M}) \subseteq \mathcal{E}(\mathbf{M}^{Y_i}) \subseteq \mathcal{S} \cup \{T\}$,

$$\Pr\left[ Z_i \wedge X_i \in \mathcal{E}(\mathbf{M}) \mid Z_{i-1}\right] \le \Pr\left[ Y_i \in F(\mathcal{E}(\mathbf{M})) \mid Z_{i-1}\right] \le \Pr\left[ Y_i \in F(\mathcal{S}\cup \left\{ T\right\}) \mid Z_{i-1}\right] \le \Pr\left[ Y_i \in F(\mathcal{S}) \mid Z_{i-1}\right] + \Pr\left[ Y_i = F(T) \mid Z_{i-1}\right].$$

Since $Y_i$ is uniform and independent of $F$, $\mathcal{S}$ and $Z_{i-1}$,

$$\Pr\left[ Y_i \in F(\mathcal{S}) \mid Z_{i-1}\right] \le |\mathcal{S}|/n = \ell/n.$$

Bounding $\Pr\left[ Y_i = F(T) \mid Z_{i-1}\right]$ is more delicate, since $T$ depends on $Y_i$. Writing $T = g(Y_i)$ for the induced function $g$, we show that a random function is very unlikely to satisfy $y = f(g(y))$ for many $y$'s: for every fixed $g:[n] \to [n]$,

$$\Pr_{f \leftarrow \mathcal{F}}\left[ \Pr_{y\leftarrow [n]}\left[ y = f(g(y))\right] \ge 1/2\right] \le n^{-n/3}.$$

Since every typical fixing $h$ of the conditioning (for $H$ denoting the tuple $(Y_{<i}, \mathbf{M}, v)$) satisfies $\Pr\left[ H=h,Z_{i-1}\mid Y_{< i } = y_{<i}\right] \ge n^{-n/4}$, the conditioning on $Z_{i-1}$ cannot amplify this tiny probability by much, and it follows that $\Pr\left[ Y_i = F(T) \mid Z_{i-1}\right] \le 1/2 + o(1)$. Putting it all together yields $\Pr\left[ Z_i \mid Z_{i-1}\right]< 1/n + \ell/n + 1/2 + o(1) < 3/5$ for $\ell \in o(n)$, as required.
Affine Decision Trees. Similarly to the affine decoder case, we prove the theorem by bounding $\Pr\left[ Z_i \mid Z_{i-1}\right]$ for all not-too-large $i$. Also in this case, we bound this probability by translating the conditioning on $Z_{i-1}$ into a system of affine equations. In particular, we would like to find proper definitions for the matrix $\mathbf{M}$ and the vector $v$, as functions of the challenges and answers, such that conditions 1–3 mentioned in the affine decoder case hold.
We achieve these conditions by adding, for each $j \in [i-1]$, an equation for each of the affine computations done in the decision tree that computes $X_j$ from $F$. The price is that rather than having $2(i-1)$ equations, we now have roughly $(d+1)(i-1)$, for $d$ being the depth of the decision tree. Moreover, in order to have $\mathbf{M}$ be a deterministic function of the challenges and answers alone, we cannot simply let $\mathbf{M}$ reflect the $d$ affine computations performed by the decoder, since each of these may depend on the results of previous computations, and thus depend on $F$. Rather, we have to add a row (i.e., an equation) for each of the $q$ queries the decoder might use (queries that span all possible computations), which by definition also introduces the dependency on $q$. Taking these additional rows into account yields the desired bound.
3 Preliminaries
3.1 Notation
All logarithms considered here are in base two. We use calligraphic letters to denote sets, uppercase for random variables and probabilistic events, lowercase for functions and fixed values, and bold uppercase for matrices. Let $[n] := \{1,\ldots,n\}$. Given a vector $v$, let $v_i$ denote its $i$-th entry, let $v_{<i} := (v_1,\ldots,v_{i-1})$ and let $v_{\le i} := (v_1,\ldots,v_i)$. Let $\binom{[n]}{k}$ denote the set of all subsets of $[n]$ of size $k$. The vector $v$ is $q$-sparse if it has no more than $q$ non-zero entries.
Functions. We naturally view functions from $[n]$ to $[m]$ as vectors in $[m]^n$, by letting $f_i := f(i)$. For a finite ordered set $\mathcal{S} = \{s_1,\ldots,s_k\}$, let $f(\mathcal{S}) := (f(s_1),\ldots,f(s_k))$. A function $g:\mathbb{F}^n \to \mathbb{F}$, for a field $\mathbb{F}$, is affine if there exist a vector $c \in \mathbb{F}^n$ and a constant $b \in \mathbb{F}$ such that $g(w) = \langle c, w\rangle + b$ for every $w \in \mathbb{F}^n$, letting $\langle c, w\rangle := \sum_{i=1}^{n} c_i \cdot w_i$ (all operations are over $\mathbb{F}$).
Distributions and Random Variables. The support of a distribution $P$ over a finite set $\mathcal{S}$ is defined by $\mathrm{Supp}(P) := \{x \in \mathcal{S} : P(x) > 0\}$. For a set $\mathcal{S}$, let $s \leftarrow \mathcal{S}$ denote that $s$ is uniformly drawn from $\mathcal{S}$. For $\mu \in [0,1]$, let $h(\mu) := \mu\log(1/\mu) + (1-\mu)\log(1/(1-\mu))$, i.e., the binary entropy function.
3.2 Matrices and Linear Spaces
We identify the elements of a finite field of size $n$ with the elements of the set $[n]$, using some arbitrary, fixed, mapping. Let $e_i$ denote the $i$-th unit vector $(0^{i-1},1,0^{n-i})$.
For a matrix $\mathbf{M} \in \mathbb{F}^{k\times n}$, let $\mathbf{M}_i$ denote the $i$-th row of $\mathbf{M}$. The span of $\mathbf{M}$'s rows is defined by $\mathrm{span}(\mathbf{M}) := \{\sum_{i\in[k]} a_i \cdot \mathbf{M}_i : a \in \mathbb{F}^k\}$. Let $\mathrm{Image}(\mathbf{M}) := \{\mathbf{M} \times w : w \in \mathbb{F}^n\}$, or equivalently, the image set of the function $w \mapsto \mathbf{M}\times w$. We use the following well-known fact:

Fact 3.1. Let $\mathbb{F}$ be a finite field of size $n$, let $\mathbf{M} \in \mathbb{F}^{k\times n}$, let $v \in \mathrm{Image}(\mathbf{M})$, and let $\mathcal{S}_v$ be the solution set of the system of equations $\mathbf{M} \times w = v$. Then $|\mathcal{S}_v| = n^{n - \mathrm{rank}(\mathbf{M})}$.
We also make use of the following less standard notion.
Definition 3.2 (Spanned unit vectors). For a matrix $\mathbf{M}$, let $\mathcal{E}(\mathbf{M}) := \{i \in [n] : e_i \in \mathrm{span}(\mathbf{M})\}$.

That is, $\mathcal{E}(\mathbf{M})$ is the set of indices of all unit vectors spanned by $\mathbf{M}$'s rows. It is clear that $|\mathcal{E}(\mathbf{M})| \le \mathrm{rank}(\mathbf{M})$. It is also easy to see that for any $v \in \mathrm{Image}(\mathbf{M})$, $\mathcal{E}(\mathbf{M})$ consists of those coordinates whose value is common to all solutions $w$ of the system $\mathbf{M} \times w = v$.4 The following claim states that for $i \notin \mathcal{E}(\mathbf{M})$, the number of solutions $w$ of the system $\mathbf{M} \times w = v$ with $w_i = y$ is the same for every $y$.

Claim 3.3. Let $\mathbb{F}$ be a finite field of size $n$, let $\mathbf{M} \in \mathbb{F}^{k\times n}$ and $i \notin \mathcal{E}(\mathbf{M})$. Then for every $v \in \mathrm{Image}(\mathbf{M})$ and $y \in \mathbb{F}$, it holds that $|\{w \in \mathcal{S}_v : w_i = y\}| = |\mathcal{S}_v|/n$, for $\mathcal{S}_v$ being the solution set of $\mathbf{M} \times w = v$.
Proof. Since, by assumption, $\mathbf{M} \times w = v$ has a solution, Fact 3.1 yields that $|\mathcal{S}_v| = n^{n-\mathrm{rank}(\mathbf{M})}$. Next, let $\mathbf{M}' := \begin{pmatrix}\mathbf{M}\\ e_i\end{pmatrix}$, and let $\mathcal{S}'_{v,y} := \{w : \mathbf{M}' \times w = (v,y)\}$ (i.e., $\mathcal{S}'_{v,y}$ is the set of solutions for $\mathbf{M}\times w = v$ with $w_i = y$). Since, by assumption, $e_i \notin \mathrm{span}(\mathbf{M})$, it holds that $\mathrm{rank}(\mathbf{M}') = \mathrm{rank}(\mathbf{M}) + 1$; hence $\mathbf{M}' \times w = (v,y)$ has a solution and $|\mathcal{S}'_{v,y}| = n^{n - \mathrm{rank}(\mathbf{M}) - 1}$. We conclude that $|\{w \in \mathcal{S}_v : w_i = y\}| = |\mathcal{S}'_{v,y}| = |\mathcal{S}_v|/n$.
The following claim states that adding a small number of rows to a given matrix does not increase the set $\mathcal{E}$ by much.

Claim 3.4. For every $\mathbf{M} \in \mathbb{F}^{a\times n}$ there exists an $a$-size set $\mathcal{S} \subseteq [n]$ such that the following holds: for any $\mathbf{M}' \in \mathbb{F}^{b\times n}$ there exists a $b$-size set $\mathcal{S}'$ for which $\mathcal{E}\begin{pmatrix}\mathbf{M}\\ \mathbf{M}'\end{pmatrix} \subseteq \mathcal{S}\cup \mathcal{S}'$.
Proof. Standard row operations performed on a matrix do not affect its row span, and thus do not affect $\mathcal{E}$. Therefore, we may assume that both $\mathbf{M}$ and $\begin{pmatrix}\mathbf{M}\\ \mathbf{M}'\end{pmatrix}$ are in row canonical form.5 For a matrix $\mathbf{A}$ in row canonical form, let $\mathcal{L}(\mathbf{A}) := \{i :$ the $i$-th column of $\mathbf{A}$ contains a leading 1$\}$. Let $\mathcal{S} := \mathcal{L}(\mathbf{M})$, and note that $|\mathcal{S}| \le a$. Perform Gaussian elimination on $\begin{pmatrix}\mathbf{M}\\ \mathbf{M}'\end{pmatrix}$ to yield a matrix $\mathbf{N}$ in row canonical form, and let $\mathcal{S}' := \mathcal{L}(\mathbf{N}) \setminus \mathcal{S}$. Note that $\mathcal{L}(\mathbf{M}) \subseteq \mathcal{L}(\mathbf{N})$, since adding rows to a matrix may only expand the set of leading ones. Furthermore, $|\mathcal{S}'| \le b$. Clearly, $\mathcal{E}(\mathbf{N}) \subseteq \mathcal{L}(\mathbf{N})$, and we can write $\mathcal{L}(\mathbf{N}) = \mathcal{S} \cup \mathcal{S}'$. Finally, $\mathcal{E}\begin{pmatrix}\mathbf{M}\\ \mathbf{M}'\end{pmatrix} = \mathcal{E}(\mathbf{N}) \subseteq \mathcal{L}(\mathbf{N}) = \mathcal{S} \cup \mathcal{S}'$, and the proof follows.
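The following sketch (our illustration; GF(5) and the sample matrix are arbitrary choices) makes the objects in the proof concrete: it row-reduces a matrix over a prime field and extracts $\mathcal{E}(\mathbf{M})$, using the observation that in row canonical form, $e_i$ is spanned exactly when some row equals $e_i$.

```python
p = 5  # a prime, so arithmetic mod p is a field

def rref(rows):
    """Row-reduce a matrix over GF(p); returns the reduced rows."""
    rows = [[x % p for x in r] for r in rows]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c]), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        inv = pow(rows[r][c], p - 2, p)          # Fermat inverse
        rows[r] = [(x * inv) % p for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c]:
                f = rows[i][c]
                rows[i] = [(a - f * b) % p for a, b in zip(rows[i], rows[r])]
        r += 1
    return rows

def spanned_units(rows):
    """E(M): indices i with e_i in M's row span. In row canonical form,
    e_i is spanned iff some row equals e_i exactly."""
    return {r.index(1) for r in rref(rows) if sum(x != 0 for x in r) == 1}

M = [[1, 0, 2, 0], [0, 0, 1, 0]]
print(sorted(spanned_units(M)))  # [0, 2]: e_1 and e_3 (0-indexed) spanned
```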
3.3 Random Functions
Let $\mathcal{F}_n$ be the set of all functions from $[n]$ to $[n]$. We make the following observations.

Claim 3.5. Let $\mathcal{S}_1,\ldots,\mathcal{S}_n \subseteq [n]$ be sets of size at most $c$ each, and for $f \in \mathcal{F}_n$ let $\mathcal{K}_f:=\left\{ y\in [n] : y \in f(\mathcal{S}_y)\right\}$. Then for every $\mu \in [0,\tfrac{1}{2}]$:

$$\Pr_{f\leftarrow \mathcal{F}_n}\left[ |\mathcal{K}_f| \ge \mu n\right] \le 2^{2\lceil \mu n\rceil \log (1/\mu) + \lceil \mu n\rceil \log (c/n)}.$$

Proof. For a uniformly chosen set $\mathcal{T} := \{t_1,\ldots,t_{\lceil \mu n\rceil}\} \leftarrow \binom{[n]}{\lceil \mu n\rceil}$ it holds that

$$\Pr_{f\leftarrow \mathcal{F}_n,\ \mathcal{T}\leftarrow \binom{[n]}{\lceil \mu n\rceil}}\left[ \mathcal{T}\subseteq \mathcal{K}_f \right] \le \left( \frac{c}{n}\right)^{\lceil \mu n\rceil},$$

roughly, since each element $t_i$ of a uniform $\mathcal{T}$ has probability at most $c/n$ of landing in the corresponding size-$\le c$ image set $f(\mathcal{S}_{t_i})$ (dependencies between these events are handled by revealing the values of $f$ gradually). Hence, a union bound yields

$$\Pr_{f\leftarrow \mathcal{F}_n}\left[ |\mathcal{K}_f| \ge \mu n \right] = \Pr\left[ \exists \mathcal{T}\subseteq \mathcal{K}_f : |\mathcal{T}|= \lceil \mu n\rceil \right] \le \sum_{\mathcal{T}\in \binom{[n]}{\lceil \mu n\rceil}} \Pr_{f\leftarrow \mathcal{F}_n}\left[ \mathcal{T}\subseteq \mathcal{K}_f\right] \le \binom{n}{\lceil \mu n\rceil}\cdot \left( \frac{c}{n} \right)^{\lceil \mu n\rceil}\le 2^{2\lceil \mu n\rceil \log (1/\mu) + \lceil \mu n\rceil \log (c/n)}.$$
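As a quick numerical illustration of the claim (our sketch; the random choice of the sets $\mathcal{S}_y$ is an assumption made here for simplicity, whereas the claim holds for arbitrary fixed sets):

```python
import random

n, c, trials = 200, 3, 200
sizes = []
for _ in range(trials):
    f = [random.randrange(n) for _ in range(n)]
    S = [random.sample(range(n), c) for _ in range(n)]  # size-c sets S_y
    K = [y for y in range(n) if y in (f[s] for s in S[y])]
    sizes.append(len(K))
# E[|K_f|] = n*(1 - (1 - 1/n)^c) ~ c, so the mean should be close to 3.
print(max(sizes), sum(sizes) / trials)
```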
Claim 3.6. Let $F \leftarrow \mathcal{F}_n$ and $Y \leftarrow [n]$, and let $W$ be an event, jointly distributed with $F$, with $\Pr[W] \ge p$, such that $Y$ is uniform and independent of $F$ conditioned on $W$. Then for all sets $\mathcal{S}_1,\ldots,\mathcal{S}_n \subseteq [n]$ of size at most $c$ each and every $\gamma \in [0,\tfrac{1}{2}]$:

$$\Pr\left[ Y \in F(\mathcal{S}_{Y}) \mid W \right] \le \gamma + 2^{2\lceil \gamma n\rceil \log (1/\gamma) + \lceil \gamma n\rceil \log (c/n) + \log (1/p)}.$$

Proof. Let $\mathcal{K}_f :=\left\{ y\in [n] : y \in f(\mathcal{S}_y)\right\}$. By Claim 3.5, for every $\gamma \in [0,\tfrac{1}{2}]$:

$$\Pr\left[ |\mathcal{K}_F| \ge \gamma n \mid W \right] \le \frac{\Pr\left[ |\mathcal{K}_F| \ge \gamma n \right]}{\Pr\left[ W\right]} \le 2^{2\lceil \gamma n\rceil \log (1/\gamma) + \lceil \gamma n\rceil \log (c/n) + \log (1/p)}.$$

Since $Y \in F(\mathcal{S}_Y)$ only if $Y \in \mathcal{K}_F$, and since $Y$ is uniform given $W$ and $F$, it holds that $\Pr[Y \in \mathcal{K}_F \mid W, |\mathcal{K}_F| < \gamma n] \le \gamma$. We conclude that

$$\Pr\left[ Y \in F(\mathcal{S}_{Y}) \mid W \right] \le \gamma + 2^{2\lceil \gamma n\rceil \log (1/\gamma) + \lceil \gamma n\rceil \log (c/n) + \log (1/p)}.$$
The next claim bounds the probability that a random function compresses an image set; a simple union bound over the choice of the $\tau n$-size set and its alleged image set yields the following:

Claim 3.7. For any $\tau, \delta \in [0,1]$, it holds that $\Pr_{f\leftarrow \mathcal{F}_n}\left[\exists\ \tau n\text{-size set } \mathcal{X}\subset [n] : |f(\mathcal{X})| \le \delta n\right] \le \binom{n}{\lceil \tau n\rceil}\cdot\binom{n}{\lceil \delta n\rceil}\cdot \delta^{\tau n}$.
The last claim states that an algorithm that inverts $f(x)$ with good probability is likely to return $x$ itself.

Claim 3.8. Let $\mathsf{C}$ be a function from $\mathcal{F}_n \times [n]$ to $[n]$ such that $\Pr_{f\leftarrow \mathcal{F}_n, x\leftarrow[n]}\left[ \mathsf{C}(f,f(x)) \in f^{-1}(f(x))\right] \ge \alpha$. Then $\Pr_{f\leftarrow \mathcal{F}_n, x\leftarrow[n]}\left[ \mathsf{C}(f,f(x)) = x\right] \ge \alpha^2/8$.

Proof. For $f \in \mathcal{F}_n$ and $x \in [n]$, let $\mathcal{P}_f(x) := f^{-1}(f(x)) \setminus \{x\}$. Compute

$$\begin{aligned} \Pr_{\substack{f \leftarrow \mathcal{F}_n \\ x \leftarrow [n]}}\left[ \mathsf{C}(f,f(x)) = x \right]&= \Pr\left[ \mathsf{C}(f,f(x)) = x \wedge \mathsf{C}(f,f(x)) \in f^{-1}(f(x)) \right] \\&\ge \Pr\left[ \mathsf{C}(f,f(x)) = x \mid \mathsf{C}(f,f(x)) \in f^{-1}(f(x)) \right] \cdot \alpha. \end{aligned}$$

For every $d > 0$:

$$\begin{aligned}&\Pr_{\substack{f \leftarrow \mathcal{F}_n \\ x \leftarrow [n]}}\left[ \mathsf{C}(f,f(x)) = x \mid \mathsf{C}(f,f(x)) \in f^{-1}(f(x)) \right] \\ \ge\ &\Pr\left[ \mathsf{C}(f,f(x)) = x \wedge |\mathcal{P}_f(x)| \le d \mid \mathsf{C}(f,f(x)) \in f^{-1}(f(x)) \right] \\ =\ &\Pr\left[ \mathsf{C}(f,f(x)) = x \mid |\mathcal{P}_f(x)| \le d,\ \mathsf{C}(f,f(x)) \in f^{-1}(f(x)) \right] \cdot \Pr\left[ |\mathcal{P}_f(x)| \le d \mid \mathsf{C}(f,f(x)) \in f^{-1}(f(x)) \right] \\ \ge\ &\frac{1}{d+1} \cdot \left( 1 - \Pr\left[ |\mathcal{P}_f(x)| > d \mid \mathsf{C}(f,f(x)) \in f^{-1}(f(x)) \right] \right), \end{aligned}$$

where the $\frac{1}{d+1}$ factor holds since, conditioned on $f$ and on the value $f(x)$, the element $x$ is uniform over the preimage set $f^{-1}(f(x))$, whose size is at most $d+1$. Since $\mathbb{E}_{f \leftarrow \mathcal{F}_n}\left[ |\mathcal{P}_f(x)| \right] = \frac{n-1}{n} < 1$, Markov's inequality yields that $\Pr\left[ |\mathcal{P}_f(x)| > d \mid \mathsf{C}(f,f(x)) \in f^{-1}(f(x)) \right] \le \frac{1}{d\cdot \alpha}$. Taking $d := 2/\alpha$ and combining the above, $\Pr\left[ \mathsf{C}(f,f(x)) = x \right] \ge \frac{\alpha}{2(2/\alpha + 1)} = \frac{\alpha^2}{2(2+\alpha)} \ge \alpha^2/8$.
3.4 Additional Inequalities
We use the following easily verifiable facts:

Fact 3.9. For $x \in [0,1]$: $1 - x \le e^{-x}$.

Fact 3.10. For $\mu \in (0,\tfrac{1}{2}]$: $h(\mu) \le 2\mu\log(1/\mu)$.

We also use the following bound:

Fact 3.11 ([13]). For every $k \in [n]$: $\binom{n}{k} \le 2^{n \cdot h(k/n)}$.
4 Linear-Advice Inverters
In this section we present lower bounds on the time/memory tradeoff of adaptive function inverters with linear advice. The extension to additive-advice inverters is given in Sect. 4.1.
To simplify notation, the following definitions and results are stated with respect to a fixed $n \in \mathbb{N}$. Let $\mathcal{F}_n$ be the set of all functions from $[n]$ to $[n]$. All asymptotic notations (e.g., $O(\cdot)$, $\Omega(\cdot)$) hide constant terms that are independent of $n$. We start by formally defining adaptive function inverters.
Definition 4.1 (Adaptive inverter). An $s$-advice, $q$-query adaptive function inverter is a pair of deterministic algorithms $\mathsf{C} = (\mathsf{C}_{\mathsf{pre}}, \mathsf{C}_{\mathsf{dec}})$, where $\mathsf{C}_{\mathsf{pre}}:\mathcal{F}_n \to \{0,1\}^s$ and the oracle-aided algorithm

$$\mathsf{C}_{\mathsf{dec}}^{(\cdot)}:[n] \times \{0,1\}^{s} \rightarrow [n]$$

makes at most $q$ oracle queries. For $y\in[n]$ and $f \in \mathcal{F}_n$, let $\mathsf{C}(y;f) := \mathsf{C}_{\mathsf{dec}}^f(y, \mathsf{C}_{\mathsf{pre}}(f))$.

That is, $\mathsf{C}_{\mathsf{pre}}$ is the preprocessing algorithm that takes as input the function description and outputs a string of length $s$ that we refer to as the advice string. The oracle-aided $\mathsf{C}_{\mathsf{dec}}$ is the decoder algorithm that performs the actual inversion. It receives the element to invert $y$ and the advice string, and using $q$ (possibly adaptive) queries to $f$, attempts to output a preimage of $y$. Finally, $\mathsf{C}(y;f)$ is the candidate preimage the algorithms of $\mathsf{C}$ produce for the element to invert $y$ given the (restricted) access to $f$. We define adaptive inverters with linear advice as follows, recalling that we may view $f \in \mathcal{F}_n$ as the vector $(f(1),\ldots,f(n)) \in [n]^n$.
Definition 4.2 (Linear preprocessing). A deterministic algorithm $\mathsf{C}_{\mathsf{pre}}$ is linear if there exist an additive group $(\mathcal{G}_1, +_{\mathcal{G}_1})$ that contains the image of $\mathsf{C}_{\mathsf{pre}}$, and an additive group $(\mathcal{G}_2, +_{\mathcal{G}_2})$ of size $n$ over $[n]$, such that for every $f, f' \in \mathcal{F}_n$ it holds that $\mathsf{C}_{\mathsf{pre}}(f) +_{\mathcal{G}_1} \mathsf{C}_{\mathsf{pre}}(f') = \mathsf{C}_{\mathsf{pre}}(f + f')$, letting $f + f' := (f(1) +_{\mathcal{G}_2} f'(1),\ldots,f(n) +_{\mathcal{G}_2} f'(n))$.

Below we omit the subscripts from $+_{\mathcal{G}_1}$ and $+_{\mathcal{G}_2}$ when clear from the context. A toy instantiation of a linear preprocessing algorithm is sketched below.
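For concreteness, here is a minimal example of a linear preprocessing function in the sense of Definition 4.2 (our illustration: both $\mathcal{G}_1$ and $\mathcal{G}_2$ are taken to be $\mathbb{Z}_n$, and the advice is the sum of $f$'s values):

```python
import random

n = 101

def c_pre(f):
    # f is a vector in [n]^n (0-indexed here); the advice lives in Z_n.
    return sum(f) % n

def add_functions(f, g):
    # Coordinate-wise addition over Z_n (the group G_2).
    return [(a + b) % n for a, b in zip(f, g)]

f = [random.randrange(n) for _ in range(n)]
g = [random.randrange(n) for _ in range(n)]
# Linearity: C_pre(f) + C_pre(g) = C_pre(f + g).
assert (c_pre(f) + c_pre(g)) % n == c_pre(add_functions(f, g))
```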
We prove the bound for inverters with linear preprocessing by presenting a reduction from the well-known set disjointness problem.
Definition 4.3 (Set disjointness). A two-party protocol $\Pi = (\mathsf{A},\mathsf{B})$ solves set disjointness over a set of inputs $\mathcal{D} \subseteq \left\{ (\mathcal{X},\mathcal{Y}) : \mathcal{X},\mathcal{Y}\subseteq \mathbb{N}\right\}$ with error $\varepsilon$, if for every $(\mathcal{X},\mathcal{Y}) \in \mathcal{D}$: when $\mathsf{A}$'s input is $\mathcal{X}$ and $\mathsf{B}$'s input is $\mathcal{Y}$, with probability at least $1-\varepsilon$ both parties output True if $\mathcal{X}\cap\mathcal{Y} = \emptyset$, and both output False otherwise.

Namely, except with probability $\varepsilon$ over their private and public randomness, the two parties find out whether their input sets intersect. Set disjointness is known to require large communication over the following set of inputs.
Definition 4.4 (Communication complexity). The communication complexity of a protocol $\Pi$, denoted $\mathrm{CC}(\Pi)$, is the maximal number of bits the parties exchange in an execution of $\Pi$ (over all possible inputs and randomness).
Theorem 4.5 (Hardness of set disjointness, Razborov [21]). There exists $c > 0$ such that for every protocol $\Pi$ that solves set disjointness with error $1/3$ over all inputs in $\mathcal{D}_n := \left\{(\mathcal{X},\mathcal{Y}) : \mathcal{X},\mathcal{Y}\in \binom{[n]}{n/4},\ |\mathcal{X}\cap\mathcal{Y}| \le 1\right\}$, it holds that $\mathrm{CC}(\Pi) \ge c \cdot n$.6
Our main result is the following reduction from set disjointness to function inversion.
Theorem 4.6 (From set disjointness to function inversion). Assume there exists an $s$-advice, $q$-query linear-advice inversion algorithm $\mathsf{C}$ with $\Pr_{f\leftarrow \mathcal{F}_n, x \leftarrow [n]}\left[\mathsf{C}(f(x);f) \in f^{-1}(f(x))\right] \ge \alpha$, and let $\beta := \alpha^2/8$. Then for every $t \in \mathbb{N}$ there exists a protocol that solves set disjointness with (one-sided) error $(1-\beta)^t$ and communication $t\cdot (s + O(q\log n))$, on all inputs in $\mathcal{D}_n$.
Combining Theorems 4.5 and 4.6 yields the following bound on linear-advice inverters.
Corollary 4.7 (Theorem 1.2, restated). Let $\mathsf{C}$ be an $s$-advice, $q$-query inversion algorithm with linear preprocessing that inverts $\mathcal{F}_n$ with constant probability. Then $s + q\log n \in \Omega(n)$.
Proof (of Corollary 4.7). By Theorem 4.6, the existence of an $s$-advice, $q$-query linear-advice inverter $\mathsf{C}$ with constant success probability $\alpha$ implies that, for every $t \in \mathbb{N}$, set disjointness can be solved over $\mathcal{D}_n$ with error $(1-\alpha^2/8)^t$ and communication complexity $t \cdot (s + O(q\log n))$. Since, by Fact 3.9, a large enough constant $t$ suffices to reduce the error below $1/3$, Theorem 4.5 yields that $t \cdot (s + O(q\log n)) \ge c\cdot n$. It follows that $s + q\log n \in \Omega(n)$.
The rest of this section is devoted to proving Theorem 4.6. Fix an $s$-advice, $q$-query inverter $\mathsf{C} = (\mathsf{C}_{\mathsf{pre}}, \mathsf{C}_{\mathsf{dec}})$ with linear preprocessing. We use $\mathsf{C}$ in Protocol 4.8 to solve set disjointness. In the protocol below, we identify a vector $w \in \{0,1\}^n$ with the set $\{i \in [n] : w_i = 1\}$.
Protocol 4.8 ($\Pi = (\mathsf{A},\mathsf{B})$).

$\mathsf{A}$'s input: $\mathcal{X} \subseteq [n]$.
$\mathsf{B}$'s input: $\mathcal{Y} \subseteq [n]$.
Public randomness: $z = (z_1,\ldots,z_n) \leftarrow [n]^n$.

Operation:
1. $\mathsf{A}$ chooses $y \leftarrow [n]$.
2. $\mathsf{A}$ constructs a function $f_\mathcal{X}$ as follows:
   - for $i$ such that $i \notin \mathcal{X}$, it samples $f_\mathcal{X}(i)$ uniformly at random.
   - for $i$ such that $i \in \mathcal{X}$, it sets $f_\mathcal{X}(i) := y - z_i$.
3. $\mathsf{B}$ constructs a function $f_\mathcal{Y}$ as follows:
   - for $i$ such that $i \notin \mathcal{Y}$, it samples $f_\mathcal{Y}(i)$ uniformly at random.
   - for $i$ such that $i \in \mathcal{Y}$, it sets $f_\mathcal{Y}(i) := z_i$.
   Let $f := f_\mathcal{X} + f_\mathcal{Y}$.
4. $\mathsf{A}$ sends $(y, \tilde{a} := \mathsf{C}_{\mathsf{pre}}(f_\mathcal{X}))$ to $\mathsf{B}$.
5. $\mathsf{B}$ sets $a := \tilde{a} + \mathsf{C}_{\mathsf{pre}}(f_\mathcal{Y})$.
6. $\mathsf{B}$ emulates $\mathsf{C}_{\mathsf{dec}}^f(y,a)$: whenever $\mathsf{C}_{\mathsf{dec}}$ sends a query $r$ to $f$, algorithm $\mathsf{B}$ forwards it to $\mathsf{A}$, who answers with $f_\mathcal{X}(r)$, and $\mathsf{B}$ feeds $f_\mathcal{X}(r) + f_\mathcal{Y}(r)$ back into $\mathsf{C}_{\mathsf{dec}}$.
   Let $x$ be $\mathsf{C}_{\mathsf{dec}}$'s output in the above emulation.
7. $\mathsf{B}$ sends $x$ to $\mathsf{A}$, along with an indication of whether $x \in \mathcal{Y}$. If $x \in \mathcal{X}$ and $x \in \mathcal{Y}$, algorithm $\mathsf{A}$ outputs False and informs $\mathsf{B}$.
8. Otherwise, both parties output True.

(All additions and subtractions are over the groups guaranteed by Definition 4.2.)
In the following we analyze the communication complexity and success probability of $\Pi$.

Claim 4.9 ($\Pi$'s communication complexity). It holds that $\mathrm{CC}(\Pi) \le s + O(q\log n)$.

Proof.
1. In Step 4, party $\mathsf{A}$ sends $s + \lceil\log n\rceil$ bits to $\mathsf{B}$.
2. In Step 6, the parties exchange at most $2\lceil\log n\rceil$ bits for every query $\mathsf{C}_{\mathsf{dec}}$ makes.
3. In Step 7, the parties exchange at most $O(\log n)$ bits.

Thus, the total communication is bounded by $s + O(q\log n)$.
Claim 4.10 ($\Pi$'s success probability). The following holds for every $(\mathcal{X},\mathcal{Y}) \in \mathcal{D}_n$:

1. $\Pi(\mathcal{X},\mathcal{Y})$ accepts with probability one if $\mathcal{X}\cap\mathcal{Y} = \emptyset$.
2. $\Pi(\mathcal{X},\mathcal{Y})$ rejects with probability at least $\alpha^2/8$ if $\mathcal{X}\cap\mathcal{Y} \neq \emptyset$.

Proof. By construction, it is clear that $\Pi$ always accepts (the parties output True) on inputs with $\mathcal{X}\cap\mathcal{Y} = \emptyset$. Fix inputs with $\mathcal{X}\cap\mathcal{Y} = \{w\}$, and let $F$ and $Y$ be the values of $f$ and $y$, respectively, in a random execution of $\Pi(\mathcal{X},\mathcal{Y})$. By construction, $F(j) = f_\mathcal{X}(j) + f_\mathcal{Y}(j)$ for all $j \in [n]$. For $j$ not in the intersection, either $f_\mathcal{X}(j)$ or $f_\mathcal{Y}(j)$ is chosen uniformly at random by one of the parties, and therefore $F(j)$ is uniformly distributed and independent of all other outputs. For the intersection element $w$, it holds that $F(w) = (Y - z_w) + z_w = Y$, which is uniform, and since there is exactly one intersection point, it is independent of all other outputs. Hence, $F$ is a uniform function, and $w$ is a uniformly planted preimage of the challenge $Y$ that $\mathsf{B}$ feeds to the emulated decoder. Since $\Pi$ rejects whenever the emulation's answer $x$ equals $w$, and since the inverter's guarantee yields $\Pr\left[ \mathsf{C}(Y;F) \in F^{-1}(Y)\right] \ge \alpha$, Claim 3.8 (applied to the planted point $W := w$) yields that

$$\Pr\left[ \mathsf{C}(Y;F) = W \right] \ge \alpha^2/8,$$

and the claim follows.
Proving Theorem 4.6. We now use Claims 4.9 and 4.10 to prove Theorem 4.6.

Proof (of Theorem 4.6). Fix $t \in \mathbb{N}$, and consider the protocol $\Pi^t$ in which, on input $(\mathcal{X},\mathcal{Y})$, the parties interact in protocol $\Pi$ for $t$ times, and accept only if they do so in all iterations. By Claims 4.9 and 4.10, the communication complexity and success probability of $\Pi^t$ in solving set disjointness over $\mathcal{D}_n$ match the theorem statement.
4.1 Additive-Advice Inverters
The following result generalizes Corollary 4.7 by replacing the structural restriction on the preprocessing (i.e., linearity) with the ability to compute the advice string of $f + f'$ by a low-communication protocol over the inputs $f$ and $f'$.

Theorem 4.11 (Bound on additive-advice inverters). Let $\mathsf{C} = (\mathsf{C}_{\mathsf{pre}}, \mathsf{C}_{\mathsf{dec}})$ be a $q$-query inversion algorithm that inverts $\mathcal{F}_n$ with constant probability. Assume there exists a two-party protocol $\Sigma$ with communication complexity $k$ such that for every $f, f' \in \mathcal{F}_n$, the output of the parties in $\Sigma(f, f')$ equals $\mathsf{C}_{\mathsf{pre}}(f + f')$ with probability at least $\eta$, for some constant $\eta > 0$, letting $f + f'$ be according to Definition 4.2. Then $k + q\log n \in \Omega(n)$.
The proof follows almost the exact same lines as that of Theorem 4.6, with the following changes: first, Steps 4 and 5 of Protocol 4.8 are replaced by the parties $\mathsf{A}$ and $\mathsf{B}$ interacting in $\Sigma(f_\mathcal{X}, f_\mathcal{Y})$, resulting in $\mathsf{B}$ outputting $\mathsf{C}_{\mathsf{pre}}(f)$ (thus, transmitting a total of $k$ bits over this part of the execution). Second, note that due to the constant failure probability of $\Sigma$ in computing $\mathsf{C}_{\mathsf{pre}}(f)$, the success probability of each execution of the protocol is lowered by the constant factor $\eta$. This means that the probability of rejecting when $\mathcal{X}\cap\mathcal{Y} \neq \emptyset$ is now bounded from below only by $\eta \cdot \alpha^2/8$ (rather than $\alpha^2/8$). The rest of the analysis is identical to that of Theorem 4.6.
5 Non-adaptive Inverters
In this section we present lower bounds on the time/memory tradeoff of non-adaptive function inverters. In Section 5.1, we present a bound for non-adaptive inverters with affine decoders, and in Section 5.2 we extend this bound to non-adaptive inverters with affine decision-tree decoders. To simplify notation, the following definitions and results are stated with respect to some fixed $n \in \mathbb{N}$ for which there exists a finite field of size $n$, which we denote by $\mathbb{F}$. Let $\mathcal{F}$ be the set of all functions from $[n]$ to $[n]$. All asymptotic notations (e.g., $O(\cdot)$, $\Omega(\cdot)$) hide constant terms that are independent of $n$. We start by formally defining non-adaptive function inverters.
Definition 5.1 (Non-adaptive inverter). An $s$-advice, $q$-query non-adaptive function inverter is a triple of deterministic algorithms $\mathsf{C} = (\mathsf{C}_{\mathsf{pre}}, \mathsf{C}_\mathsf{qry}, \mathsf{C}_{\mathsf{dec}})$, where $\mathsf{C}_{\mathsf{pre}}:\mathcal{F} \to \{0,1\}^s$,

$$\mathsf{C}_\mathsf{qry}:[n] \times \{0,1\}^{s} \rightarrow [n]^q \qquad\text{and}\qquad \mathsf{C}_{\mathsf{dec}}:[n] \times \{0,1\}^{s} \times [n]^q\rightarrow [n].$$

For $y\in[n]$ and $f \in \mathcal{F}$, let $\mathsf{C}(y;f) := \mathsf{C}_{\mathsf{dec}}(y, a, f(\mathsf{C}_\mathsf{qry}(y,a)))$, for $a := \mathsf{C}_{\mathsf{pre}}(f)$.

That is, $\mathsf{C}_{\mathsf{pre}}$ is the preprocessing algorithm. It takes the function description as input and outputs a string of length $s$, to which we refer as the advice string. In the case that $s = 0$, we say that $\mathsf{C}$ has zero advice, and omit the advice from the notation. Algorithm $\mathsf{C}_\mathsf{qry}$ is the query-selection algorithm. It chooses the queries according to the element to invert $y$ and the advice string, and outputs $q$ indices, to which we refer as the queries. Algorithm $\mathsf{C}_{\mathsf{dec}}$ is the decoder algorithm that performs the actual inversion. It receives the element $y$, the advice string and the function's answers to the (non-adaptive) queries selected by $\mathsf{C}_\mathsf{qry}$ (the query indices themselves may be deduced from $y$ and the advice), and attempts to output a preimage of $y$. Finally, $\mathsf{C}(y;f)$ is the candidate preimage of $y$ produced by the algorithms of $\mathsf{C}$ given the (restricted) access to $f$. A concrete instantiation of this interface is sketched below.
5.1 Affine Decoders
In this section we present our bound for non-adaptive affine decoders, defined as follows:
Definition 5.2 (Affine decoder). A non-adaptive inverter $\mathsf{C} = (\mathsf{C}_{\mathsf{pre}}, \mathsf{C}_\mathsf{qry}, \mathsf{C}_{\mathsf{dec}})$ has an affine decoder, if for every $y \in [n]$ and $a \in \{0,1\}^s$ there exist a $q$-sparse vector $\alpha^{y,a} \in \mathbb{F}^n$ and a field element $b^{y,a} \in \mathbb{F}$, such that for every $f \in \mathcal{F}$ with $\mathsf{C}_{\mathsf{pre}}(f) = a$: $\mathsf{C}(y;f) = \langle \alpha^{y,a}, f\rangle + b^{y,a}$.
The following theorem bounds the probability, over a random function $f$, that a non-adaptive inverter with an affine decoder inverts a random output of $f$ with probability $\tau$.

Theorem 5.3 (Bound on non-adaptive inverters with affine decoders). Let $\mathsf{C}$ be an $s$-advice non-adaptive inverter with an affine decoder. Then for every $\tau \in [0,1]$, $\delta \in [0,1]$ and $m \in [n]$:

$$\Pr_{f\leftarrow \mathcal{F}}\left[ \Pr_{\substack{x\leftarrow [n]\\ y=f(x)}}\left[ \mathsf{C}(y;f) \in f^{-1}(y)\right] \ge \tau \right] \le \alpha_{\tau,\delta} + 2^s\delta^{-m} \prod_{j=1}^{m} \left( \frac{2j}{n} + \max\left\{ \frac{1}{\sqrt[4]{n}}, \frac{4j}{n}\right\} \right)$$

for $\alpha_{\tau,\delta} := \Pr_{f \leftarrow \mathcal{F}}\left[ \exists\ \tau n\text{-size set } \mathcal{X}\subset [n] : |f(\mathcal{X})| \le \delta n \right]$.
While it is not easy to see what is the best choice, per $s$ and $q$, of the parameters $\tau$, $\delta$ and $m$ above, the following corollary (proven in the full version) exemplifies the usability of Theorem 5.3 by considering the consequences of such a choice.

Corollary 5.4 (Theorem 1.5, restated). Let $\mathsf{C}$ be as in Theorem 5.3, and assume it inverts $\mathcal{F}$ with high probability. Then $s \in \Omega(n)$.
Our key step towards proving Theorem 5.3 is showing that even when conditioned on the (unlikely) event that a zero-advice inverter successfully inverted $i-1$ random elements, the probability that the inverter successfully inverts the next element is still low. To formulate the above statement, we define the following jointly distributed random variables: let $F$ be uniformly distributed over $\mathcal{F}$ and let $Y = (Y_1,\ldots,Y_n)$ be a uniform vector over $[n]^n$. For a zero-advice inverter, we define the following random variables (jointly distributed with $F$ and $Y$).

Definition 5.5. For a zero-advice inverter $\mathsf{D}$, let $X_i^{\mathsf{D}} := \mathsf{D}(Y_i; F)$, let $Z_i^{\mathsf{D}}$ be the event $\left\{ F(X_j^{\mathsf{D}}) = Y_j\ \ \forall j \in [i]\right\}$, and let $Z_0^{\mathsf{D}}$ be the trivial (always occurring) event.

That is, $X_i^{\mathsf{D}}$ is $\mathsf{D}$'s answer to the challenge $Y_i$, and $Z_i^{\mathsf{D}}$ indicates whether $\mathsf{D}$ successfully answered each of the first $i$ challenges. Given the above notation, our main lemma is stated as follows:
successfully answered each of the first i challenges. Given the above notation, our main lemma is stated as follows:
Lemma 5.6. Let $\mathsf{D}$ be a zero-advice non-adaptive inverter with an affine decoder. Then for every $i\in [n]$ and every $\mu \in [0,\tfrac{1}{2}]$:

$$\Pr\left[ Z_{i}^{\mathsf{D}} \mid Z_{i-1}^{\mathsf{D}} \right] \le \frac{2i-1}{n} + \mu + 2^{2\lceil \mu n\rceil \log (1/\mu) - \lceil \mu n\rceil \log (n) + (2i-2)\log n}.$$
We prove Lemma 5.6 below, but first use it to prove Theorem 5.3.
Proving Theorem 5.3. Lemma 5.6 immediately yields a bound on the probability that $\mathsf{D}$, a zero-advice inverter, successfully inverts the first $i$ elements of $Y$. For proving Theorem 5.3, however, we need to bound the probability that $\mathsf{D}$, and later on an inverter with non-zero advice, finds a preimage of a random output of $f$. Yet, the conversion between these two measurements is rather straightforward. Hereafter we assume that the bound stated in Theorem 5.3 is smaller than one, as otherwise the theorem trivially holds.
Proof (of Theorem 5.3). Fix $\tau \in [0,1]$ and $\delta \in [0,1]$. We first fix an advice string $a \in \{0,1\}^s$ and analyze the zero-advice inverter $\mathsf{C}^a$ obtained by hard-coding $a$ into $\mathsf{C}$; we omit $a$ (and $\mathsf{C}^a$) from the notation of $X_j$ and $Z_j$. For $j\in [n]$, let $\mu_j := \max\left\{ \frac{1}{\sqrt[4]{n}}, \frac{4j}{n}\right\}$. We first argue that

$$\Pr\left[ Z_j \mid Z_{j-1} \right] \le \frac{2j}{n} +\mu_j.$$

Indeed, Lemma 5.6 yields that

$$\Pr\left[ Z_j \mid Z_{j-1}\right] \le \frac{2j-1}{n} + \mu_j + 2^{\overbrace{2\lceil \mu_j n\rceil \log (1/\mu_j) - \lceil \mu_j n\rceil \log n + (2j-2)\log n}^{\beta}},$$

and a direct calculation shows that for the above choice of $\mu_j$ it holds that $\beta \le -2\log n$. Thus $\Pr\left[ Z_j \mid Z_{j-1}\right] \le \frac{2j-1}{n} + \mu_j + 2^{-2\log n}\le \frac{2j}{n} + \mu_j$, and consequently

$$\Pr\left[ Z_{m}\right] = \prod_{j=1}^{m} \Pr\left[ Z_j \mid Z_{j-1} \right] \le \prod_{j=1}^{m} \left( \frac{2j}{n} +\mu_j \right).$$

Next, let $\mathcal{G}_\mathcal{Y}^a(f) := \left\{ y \in [n] : \mathsf{C}^a(y;f) \in f^{-1}(y)\right\}$ be the set of outputs that $\mathsf{C}^a$ inverts. Since the challenges $Y_1,\ldots,Y_m$ are uniform and independent of $F$,

$$\begin{aligned} \Pr\left[ Z_{m}\right]&= \Pr_{f \leftarrow \mathcal{F}}\left[ \forall j \in [m] : Y_j \in \mathcal{G}_\mathcal{Y}^a(f) \right] \ge \Pr_{f \leftarrow \mathcal{F}}\left[ \forall j \in [m] : Y_j \in \mathcal{G}_\mathcal{Y}^a(f) \bigwedge |\mathcal{G}_\mathcal{Y}^a(f)| \ge \delta n\right] \\&=\Pr_{f \leftarrow \mathcal{F}}\left[ \forall j \in [m] : Y_j \in \mathcal{G}_\mathcal{Y}^a(f) \mid |\mathcal{G}_\mathcal{Y}^a(f)| \ge \delta n\right] \cdot \Pr_{f \leftarrow \mathcal{F}}\left[ |\mathcal{G}_\mathcal{Y}^a(f)| \ge \delta n \right] \ge \delta^m\cdot \Pr_{f \leftarrow \mathcal{F}}\left[ |\mathcal{G}_\mathcal{Y}^a(f)| \ge \delta n \right]. \end{aligned}$$

It follows that

$$\Pr\left[ |\mathcal{G}_\mathcal{Y}^a(f)| \ge \delta n \right] \le \delta^{-m} \cdot \prod_{j=1}^{m} \left( \frac{2j}{n} + \mu_j \right).$$

Now let $\mathcal{G}_\mathcal{Y}(f) := \left\{ y \in [n] : \mathsf{C}(y;f) \in f^{-1}(y)\right\}$, defined with respect to the true advice $\mathsf{C}_{\mathsf{pre}}(f)$. Since $\mathcal{G}_\mathcal{Y}(f) = \mathcal{G}_\mathcal{Y}^{\mathsf{C}_{\mathsf{pre}}(f)}(f)$, a union bound over the $2^s$ possible advice strings yields

$$\Pr_{f \leftarrow \mathcal{F}}\left[ |\mathcal{G}_\mathcal{Y}(f)| \ge \delta n \right] \le 2^s \cdot \delta^{-m} \cdot \prod_{j=1}^{m} \left( \frac{2j}{n} + \mu_j \right).$$

Finally, for every $f$ such that $\Pr_{x\leftarrow[n]}\left[\mathsf{C}(f(x);f) \in f^{-1}(f(x))\right] \ge \tau$, the ($\ge \tau n$)-size set of such $x$'s is mapped by $f$ into $\mathcal{G}_\mathcal{Y}(f)$; hence, except with probability $\alpha_{\tau,\delta}$ over the choice of $f$, such $f$ satisfies $|\mathcal{G}_\mathcal{Y}(f)| \ge \delta n$. Combining this with the above bound completes the proof.
Proving Lemma 5.6. In the rest of this section we prove Lemma 5.6. Fix a zero-advice non-adaptive inverter $\mathsf{D}$ with an affine decoder, an index $i \in [n]$ and $\mu \in [0,\tfrac{1}{2}]$, and omit $\mathsf{D}$ from the notation. For $y \in [n]$, let $\alpha^{y} \in \mathbb{F}^n$ and $b^{y} \in \mathbb{F}$ be the ($q$-sparse) coefficient vector and field element such that $\mathsf{D}(y;f) = \langle \alpha^{y}, f\rangle + b^{y}$ for every $f \in \mathcal{F}$. We start by proving the following claim that bounds the probability in hand, assuming $X_i$, the inverter's answer, comes from a small linear space. (Recall, from Definition 3.2, that $\mathcal{E}(\mathbf{A}) = \{j \in [n] : e_j \in \mathrm{span}(\mathbf{A})\}$, where $e_j$ is the $j$-th unit vector in $\mathbb{F}^n$.)

Claim 5.7. Let $\mathbf{A}$ be an $\ell$-row matrix over $\mathbb{F}$, and for $y \in [n]$ let $\mathbf{A}^{y}$ be a matrix obtained from $\mathbf{A}$ by appending at most $t$ rows (determined by $y$). Then for every $v \in \mathrm{Image}(\mathbf{A})$ and every $\mu \in [0,\tfrac{1}{2}]$:

$$\Pr\left[ Y_i \in F(\mathcal{E}(\mathbf{A}^{Y_i})) \mid \mathbf{A}\times F= v\right] \le \left( \frac{\ell}{n} + \mu \right) + 2^{2\lceil \mu n\rceil \log (1/\mu) + \lceil \mu n\rceil \log (t/n) + \ell \log n}.$$
Given the above claim, we prove Lemma 5.6 as follows.

Proof (of Lemma 5.6). For every $y \in [n]$, the computation of the decoder on input $y$ is given by the affine function $f \mapsto \langle \alpha^{y}, f\rangle + b^{y}$. The event $Z_{i-1}$ is therefore the conjunction of the following conditions, over all $j\in [i-1]$:

1. $\langle \alpha^{Y_j}, F\rangle + b^{Y_j} = X_j$.
2. $F(X_j) = Y_j$.

Let $\mathbf{M}$ be the (random) $(2i-2)$-row matrix defined by $\mathbf{M}_{2j-1} := \alpha^{Y_j}$ and $\mathbf{M}_{2j} := e_{X_j}$, and let $v$ be the (random) vector defined by $v_{2j-1} := X_j - b^{Y_j}$ and $v_{2j} := Y_j$, so that $Z_{i-1}$ is equivalent to $\mathbf{M}\times F = v$. Compute

$$\Pr\left[ Z_i \mid Z_{i-1}\right] = \Pr\left[ Z_i \wedge X_i \in \mathcal{E}(\mathbf{M}) \mid Z_{i-1} \right] + \Pr\left[ Z_i \wedge X_i \notin \mathcal{E}(\mathbf{M}) \mid Z_{i-1}\right].$$

For the first summand, let $H := (Y_{<i}, \mathbf{M}, v)$, and apply Claim 5.7 with $\ell := 2i-2$ and $t := 1$ for each fixing of $H$:

$$\begin{aligned}&\Pr\left[ Z_i \wedge X_i \in \mathcal{E}(\mathbf{M}) \mid Z_{i-1} \right] \le \Pr\left[ Y_i \in F(\mathcal{E}(\mathbf{M}^{Y_i})) \mid Z_{i-1}\right] \\&= \mathbb{E}_{h=(y_{< i}, m^{i-1},v^{i-1}) \leftarrow H \mid Z_{i-1}}\left[ \Pr\left[ Y_i \in F\left( \mathcal{E}\begin{pmatrix} m^{i-1} \\ \alpha^{Y_i} \end{pmatrix} \right) \mid H = h,\ m^{i-1}\times F = v^{i-1}\right] \right] \\&= \mathbb{E}_{(y_{< i}, m^{i-1},v^{i-1}) \leftarrow H \mid Z_{i-1}}\left[ \Pr\left[ Y_i \in F\left( \mathcal{E}\begin{pmatrix} m^{i-1} \\ \alpha^{Y_i} \end{pmatrix} \right) \mid m^{i-1}\times F = v^{i-1}\right] \right] \\&\le \left( \frac{2i-2}{n} + \mu \right) + 2^{2\lceil \mu n\rceil \log (1/\mu) + \lceil \mu n\rceil \log (1/n) + (2i-2)\log n}, \end{aligned}$$

using, for the second equality, that conditioned on $H = h$ the events $Z_{i-1}$ and $m^{i-1}\times F = v^{i-1}$ coincide, and that $Y_{<i}$ carries no further information about $F$. For the second summand, Claim 3.3 yields that $F(X_i)$ is uniform given the conditioning:

$$\begin{aligned}&\Pr\left[ Z_i \wedge X_i \notin \mathcal{E}(\mathbf{M}) \mid Z_{i-1}\right] \le \Pr\left[ Z_i \mid X_i \notin \mathcal{E}(\mathbf{M}),\ Z_{i-1}\right] \\&= \mathbb{E}_{h= (x_i,y_{\le i}, m,v) \leftarrow H'\mid X_i \notin \mathcal{E}(\mathbf{M}),\ Z_{i-1}}\left[ \Pr\left[ F(x_i) =y_i \mid H'=h,\ m\times F = v\right] \right] = 1/n, \end{aligned}$$

for $H' := (X_i, Y_{\le i}, \mathbf{M}, v)$. Combining the two bounds yields

$$\Pr\left[ Z_i \mid Z_{i-1}\right] \le \left( \frac{2i-2}{n} + \mu \right) + 2^{2\lceil \mu n\rceil \log (1/\mu) + \lceil \mu n\rceil \log (1/n) + (2i-2)\log n} + 1/n = \frac{2i-1}{n} + \mu + 2^{2\lceil \mu n\rceil \log (1/\mu) - \lceil \mu n\rceil \log (n) + (2i-2)\log n}.$$
5.2 Affine Decision Trees
In this section we present lower bounds for non-adaptive affine decision trees. These are formally defined as follows:
Definition 5.8 (Affine decision tree). An $n$-input affine decision tree $T$ over $\mathbb{F}$ is a rooted tree in which every node $u$ is labeled by an affine function $g_u:\mathbb{F}^n \to \mathbb{F}$, and the outgoing edges of every internal node are labeled by the distinct elements of $\mathbb{F}$. The computation path of $T$ on input $w \in \mathbb{F}^n$ is the path $u_1, u_2, \ldots, u_k$ defined as follows:

- $u_1$ is the root of $T$.
- $u_{j+1}$ is the node reached from $u_j$ along the outgoing edge labeled by the value $g_{u_j}(w)$.

The edge path of $T$ on $w$ is defined by the sequence of values $(g_{u_1}(w),\ldots,g_{u_k}(w))$ computed along the computation path. Lastly, the output of $T$ on $w$, denoted $T(w)$, is the value of $g_{u_k}(w)$ for the last node $u_k$ on the computation path. The depth of $T$ is the maximal number of nodes on a computation path.
Note that the edge path determines the computation path and output. Given the above, affine decision tree decoders are defined as follows.

Definition 5.9 (Affine decision tree decoder). An inversion algorithm $\mathsf{C} = (\mathsf{C}_{\mathsf{pre}}, \mathsf{C}_\mathsf{qry}, \mathsf{C}_{\mathsf{dec}})$ has a $d$-depth affine decision tree decoder, if for every $y \in [n]$, $a \in \{0,1\}^s$ and $v \in [n]^q$, there exists an $n$-input, $d$-depth affine decision tree $T_{y,a,v}$ such that $\mathsf{C}_{\mathsf{dec}}(y,a,f(v)) = T_{y,a,v}(f)$ for every $f \in \mathcal{F}$.
Note that such a decision tree may be of size $n^d$. The following theorem bounds the probability, over a random function $f$, that a non-adaptive inverter with an affine decision tree decoder inverts a random output of $f$ with probability $\tau$.
Theorem 5.10 (Bound on non-adaptive inverters with affine decision-tree decoders). Let $\mathsf{C}$ be an $s$-advice, $q$-query non-adaptive inverter with a $d$-depth affine decision tree decoder. Then for every $\tau \in [0,1]$, $\delta \in [0,1]$ and every sufficiently small $m$:

$$\Pr_{f\leftarrow \mathcal{F}}\left[ \Pr_{\substack{x\leftarrow [n]\\ y=f(x)}}\left[ \mathsf{C}(y;f) \in f^{-1}(y)\right] \ge \tau \right] \le \alpha_{\tau,\delta} + 2^s\cdot \delta^{-m} \prod_{j=1}^{m} \left( \frac{(d+1)j}{n} + \max\left\{ \sqrt[4]{q/n}, \frac{2(d+1)j \log n}{n \log (n/q)}\right\} \right)$$

for $\alpha_{\tau,\delta} := \Pr_{f \leftarrow \mathcal{F}}\left[ \exists\ \tau n\text{-size set } \mathcal{X}\subset [n] : |f(\mathcal{X})| \le \delta n \right]$.
Compared to the bound we derive for affine decoders (Theorem 5.3), we pay above for the tree depth $d$, but also for the number of queries $q$. In particular, each term of the above product is essentially multiplied by the tree depth $d$, and the additive terms now grow with $q$. In addition, the theorem only holds for smaller values of $m$. The following corollary exemplifies the usability of Theorem 5.10 by considering the consequences of two choices of parameters.
Corollary 5.11 (Theorem 1.6, restated). Let $\mathsf{C}$ be as in Theorem 5.10 and assume

$$\Pr_{f\leftarrow \mathcal{F}}\left[ \Pr_{\substack{x\leftarrow [n]\\ y=f(x)}}\left[ \mathsf{C}(y;f) \in f^{-1}(y)\right] \ge \tau \right] \ge 1/2$$

for a constant $\tau > 0$. Then:

- If $q \le n^{1-\varepsilon}$ for a constant $\varepsilon > 0$, then $s \in \widetilde{\Omega}(n/d)$.
- For general $q < n$, it holds that $s \in \widetilde{\Omega}\left(\frac{n\log(n/q)}{d\log n}\right)$.
Proof. Omitted; the corollary follows from Theorem 5.10 along very similar lines to those used to derive Corollary 5.4 from Theorem 5.3.
The proof of Theorem 5.10 is omitted and can be found in the full version of this paper.
We are thankful to Dmitry Kogan, Uri Meir and Alex Samorodnitsky for very useful discussions. We also thank the anonymous reviewers for their comments.