1 Introduction
1.1 Non-malleable Codes
Non-malleable codes were introduced by Dziembowski, Pietrzak, and Wichs [36] as an elegant relaxation and generalization of standard error correcting codes, where the motivation is to handle much larger classes of tampering functions on the codeword. Traditionally, error correcting codes only provide meaningful guarantees (e.g., unique decoding or list-decoding) when part of the codeword is modified (i.e., the modified codeword is close in Hamming distance to an actual codeword), whereas in practice an adversary can possibly use much more complicated functions to modify the entire codeword. In the latter case, it is easy to see that error correction or even error detection becomes generally impossible, for example an adversary can simply change all codewords into a fixed string. On the other hand, non-malleable codes can still provide useful guarantees here, and thus partially bridge this gap. Informally, a non-malleable code guarantees that after tampering, the decoding either correctly gives the original message or gives a message that is completely unrelated and independent of the original message. This captures the notion of non-malleability: that an adversary cannot modify the codeword in a way such that the tampered codeword decodes back to a related but different message.
The original intended application of non-malleable codes is in tamper-resilient cryptography [36], where they can be used generally to prevent an adversary from learning secret information by observing the input/output behavior of modified ciphertexts. Subsequently, non-malleable codes have found applications in non-malleable commitments [40], non-malleable encryption [30], public-key encryptions [31], non-malleable secret-sharing schemes [38], and privacy amplification protocols [19]. Furthermore, interesting connections were found to non-malleable extractors [27], and very recently to spectral expanders [54]. Along the way, the constructions of non-malleable codes used various components and sophisticated ideas from additive combinatorics [5, 22] and randomness extraction [18], and some of these techniques have also found applications in constructing extractors for independent sources [46]. As such, non-malleable codes have become fundamental objects at the intersection of coding theory and cryptography. They are well deserved to be studied in more depth in their own right, as well as to find more connections to other well studied objects in theoretical computer science.
We first introduce some notation before formally defining non-malleable codes. For a function , we say
is a fixed point (of f) if
.
(Tampering functions). For any , let
denote the set of all functions
. Any subset of
is a family of tampering functions.
We use the statistical distance to measure the distance between distributions.
The statistical distance between two distributions and
over some universal set
is defined as
. We say
is
-close to
if
and denote it by
.






(Coding schemes). Let and
be functions such that
is a randomized function (i.e., it has access to private randomness) and
is a deterministic function. We say that
is a coding scheme with block length n and message length k if for all
,
, where the probability is taken over the randomness in
.
We can now define non-malleable codes.










Relevant Prior Work on Non-malleable Codes in the Information Theoretic Setting. There has been a lot of exciting research on non-malleable codes, and it is beyond the scope of this paper to provide a comprehensive survey of them. Instead we focus on relevant explicit (unconditional) constructions in the information theoretic setting, which is also the focus of this paper. One of the most studied classes of tampering functions is the so called split-state tampering, where the codeword is divided into (at least two) disjoint intervals and the adversary can tamper with each interval arbitrarily but independently. This model arises naturally in situations where the codeword may be stored in different parts of memory or different devices. Following a very successful line of work [1, 2, 4, 5, 7, 18, 22, 27, 34, 41, 43, 44, 46, 47], we now have explicit constructions of non-malleable codes in the 2-split state model with constant rate and negligible error.
The split state model is a “compartmentalized” model, where the codeword is partitioned a priori into disjoint intervals for tampering. Recently, there has been progress towards handling non-compartmentalized tampering functions. A work of Agrawal, Gupta, Maji, Pandey and Prabhakaran [8] gave explicit constructions of non-malleable codes with respect to tampering functions that permute or flip the bits of the codeword. Ball, Dachman-Soled, Kulkarni and Malkin [12] gave explicit constructions of non-malleable codes against t-local functions for . However in all these models, each bit of the tampering function only depends on part of the codeword. A recent work of Chattopadhyay and Li [21] gave the first explicit constructions of non-malleable codes where each bit of the tampering function may depend on all bits of the codeword. Specifically, they gave constructions for the classes of affine functions and small-depth (unbounded fain-in) circuits. The rate of the non-malleable code with respect to small-depth circuits was exponentially improved by a subsequent work of Ball, Dachman-Soled, Guo, Malkin, and Tan [11]. In a recent work, Ball, Guo and Wichs [13] constructed non-malleable codes with respect to bounded depth decision trees.
Given all these exciting results, a major goal of the research on non-malleable codes remains to give explicit constructions for broader classes of tampering functions, as one can use the probabilistic method to show the existence of non-malleable codes with rate close to for any class
of tampering functions with
[26].
Our Results. We continue the line of investigation on explicit constructions of non-malleable codes, and give explicit constructions for several new classes of non-compartmentalized tampering functions, where in some classes each bit of the tampering function can depend on all the bits of the codeword. In Sect. 1.2, we discuss motivations and applications of our new non-malleable codes in cryptography. The new classes strictly generalize several previous studied classes of tampering functions. In particular, we consider the following classes.
- 1.
Interleaved 2-split-state tampering, where the adversary can divide the codeword into two arbitrary disjoint intervals and tamper with each interval arbitrarily but independently. This model generalizes the split-state model and captures the situation where the codeword is partitioned into two blocks (not necessarily of the same length) in an unknown way by the adversary before applying a 2-split-state tampering function. Constructing non-malleable codes for this class of tampering functions was left as an open problem by Cheraghchi and Guruswami [27].
- 2.
Composition of tampering, where the adversary takes two tampering functions and composes them together to get a new tampering function. We note that function composition is a natural strategy for an adversary to achieve more powerful tampering, and it has been studied widely in other fields (e.g., computational complexity and communication complexity). We believe that studying non-malleable codes for the composition of different classes of tampering functions is also a natural and important direction.
We now formally define these classes and some related classes below. For notation, given any permutation and any string x of length n, we let
denote the length n string such that
.
The family of 2-split-state functions
: Any
comprises of two functions
and
, and for any
,
. This family of tampering functions has been extensively studied, with a long line of work achieving near optimal explicit constructions of non-malleable codes.
The family of affine functions
: Any
is an affine function from
to
(viewing
as
, i.e.,
, for some
matrix M on
and
.
The family of interleaved 2-split-state functions (2, t)-
: Any
-
comprises of two functions
,
such that
and
(i.e both partitions are of length at least t), and a permutation
. For any
, where
, let
. In this paper we require that
for some fixed constant
. Note this includes as a special case the situation where the two states have the same size, which we denote by
, and in particular
.
For any tampering function families
, define the family
to be the set of all functions of the form
, where
,
and
denotes function composition.
We now formally state our results. Our most general result is an explicit non-malleable code with respect to the tampering class of -
, i.e, an affine function composed with an interleaved 2-split-state tampering function. Specifically, we have the following theorem.
There exist constants such that for all integers
there exists an explicit non-malleable code with respect to
-
with rate
and error
.
We immediately have the following corollary, which records the classes of functions for which no explicit non-malleable codes were known (for any rate) prior to this work.




,
-
,
and
.
1.2 Motivations and Applications in Cryptography
Just as standard non-malleable codes for split-state tampering arise from natural cryptographic applications, our non-malleable codes for interleaved 2-split-state tampering and affine tampering composed with interleaved split-state tampering also have natural cryptographic motivations and applications.
It is known that any non-malleable code in the 2-split-state model gives a 2 out of 2 secret-sharing scheme, if one views the two split states as two shares [6]. We show that any non-malleable code in the interleaved 2-split state model gives a non-malleable secret-sharing scheme with binary shares. Secret-sharing schemes [14, 58] are fundamental objects in cryptography, and building blocks for many other more advanced applications such as secure multiparty computation. In short, a secret-sharing scheme shares a message secretly among n parties, such that any qualified subset can reconstruct the message, while any unqualified subset reveals nothing (or almost nothing) about the message. Equivalently, one can view this as saying that any leakage function which leaks the shares in an unqualified subset reveals nothing. In the standard threshold or t out of n secret-sharing, any subset of size at most t is an unqualified subset while any subset of size larger than t is a qualified subset. However, it is known that in such a scheme, the share size has to be at least as large as the message size. Thus, a natural and interesting question is whether the share size can be smaller under some relaxed notion of secret-sharing. This is indeed possible when one considers the notion of (r, t)-ramp secret-sharing, where . In this setting, any subset of size at most t reveals nothing about the message, while any subset of size at least r can reconstruct message. Thus t is called the privacy threshold and r is called the reconstruction threshold. Subsets of size between
and
may reveal some partial information about the message. Again, it is not hard to see that the share size in this case has to be at least as large as
, where m is the message length. Thus, if one allows a sufficiently large gap between r and t, then it is possible to achieve a secret-sharing scheme even with binary shares.
Secret-sharing schemes are also closely related to error correcting codes. For example, the celebrated Shamir’s scheme [58] is based on Reed-Solomon codes. Similarly, binary secret-sharing schemes are largely based on binary error correcting codes, and they are studied in a series of recent works [15, 16, 25, 48] in terms of the tradeoff between the message length, the privacy threshold t, the reconstruction threshold r, and the complexity of the sharing and reconstruction functions.
However, standard secret-sharing schemes only allow an adversary to passively observe some shares, thus one can ask the natural question of whether it is possible to protect against even active adversaries who can tamper with the shares. In this context, the notion of robust secret-sharing schemes (e.g., [17, 51]) allows qualified subsets to recover the message even if the adversary can modify part of the shares. More recently, by generalizing non-malleable codes, Goyal and Kumar [38] introduced non-malleable secret-sharing schemes, where the adversary can tamper with all shares in some restricted manner. Naturally, the guarantee is that if tampering happens, then the reconstructed message is either the original message or something completely unrelated. In particular, they constructed t out of n non-malleable secret-sharing schemes in the following two tampering models. In the independent tampering model, the adversary can tamper with each share independently. In the joint tampering model, the adversary can divide any subset of shares arbitrarily into two sets of different size, and tamper with the shares in each set jointly, but independently across the two sets. Note that the adversary in the second model is strictly stronger than the adversary in the first one, since for reconstruction one only considers subsets of size
. Several follow up works [3, 9, 39] studied different models such as non-malleable secret-sharing schemes for general access structures, and achieved improvements in various parameters.
However, in all known constructions of non-malleable secret-sharing schemes the share size is always larger than 1 bit. In other words, no known non-malleable secret-sharing scheme can achieve binary shares. This is an obstacle that results from the techniques in all known constructions. Indeed, even if one allows (r, t)-ramp non-malleable secret-sharing with an arbitrarily large gap between r and t, no known constructions can achieve binary shares, because they all need to put at least two shares of some standard secret-sharing schemes together to form a single share in the non-malleable scheme. Thus it is a natural question to see if one can construct non-malleable secret-sharing schemes with binary shares using different techniques.
Our non-malleable codes for interleaved 2-split-state tampering directly give non-malleable secret-sharing schemes with binary shares that protect against joint tampering. We have the following theorem.
There exist constants such that for all integers
there exists an explicit (r, t)-ramp non-malleable secret-sharing scheme with binary shares, where
,
and the message length is
. The scheme has statistical privacy with error
, and is resilient with error
to joint tampering where the adversary arbitrarily partitions the r shares into two blocks, each with at most t shares, and tampers with each block independently using an arbitrary function.
Intuitively, any n-bit non-malleable code for interleaved 2-split-state tampering gives a ramp non-malleable secret-sharing scheme with reconstruction threshold , as follows. If the code protects against an adversary who can partition the codeword into two disjoint sets and tamper with each set arbitrarily but independently, then each set must reveal (almost) nothing about the secret message. Otherwise, the adversary can simply look at one set and use the leaked information to modify the shares in this set, and make the reconstructed message become a different but related message. In particular, the same proof in [6] for the standard 2-split state model also works for the interleaved 2-split state model. Since our code works for interleaved 2-split-state tampering and the size of one set can be as large as
, this implies privacy threshold at least
, with the small error in privacy coming from the error of the non-malleable code. We refer the reader to the full version of our paper for more details.
It is an interesting open question to construct explicit non-malleable secret-sharing schemes with binary shares where the reconstruction threshold . We note that this question is closely related to constructing non-malleable codes for the tampering class
or
(i.e., reverse the order of composition). This is because to get such a scheme, one natural idea is to apply another secret-sharing scheme on top of our non-malleable code. If one uses a linear secret-sharing scheme as in many standard schemes, then the tampering function on the codeword becomes
or
.
We also note that in an (r, t)-ramp secret-sharing scheme with binary shares, unless the message has only one bit, we must have . Thus in the joint tampering model, instead of allowing the adversary to divide r shares arbitrarily into two sets, one must put an upper bound t on the size of each set as in our theorem. For example, one cannot allow an adversary to look at a set of shares with size
, because
and this set of shares may already leak some information about the secret message.
In both standard secret-sharing and non-malleable secret-sharing, in addition to looking at sets of shares, researchers have also studied other classes of leakage function or tampering function. For example, the work of Goyal et al. [37] studied secret-sharing schemes that are resilient to affine leakage functions on all shares, and used them to construct parity resilient circuits and bounded communication leakage resilient protocols. A recent work of Lin et al. [49] also studied non-malleable secret-sharing schemes where the adversary can tamper with all shares jointly using some restricted classes of functions. Specifically, [49] considered the model of “adaptive” affine tampering, where the adversary is allowed to first observe the shares in some unqualified subset, and then choose an affine function based on this to tamper with all shares. In this sense, our non-malleable codes for affine tampering composed with interleaved 2-split-state tampering also directly give non-malleable secret-sharing schemes with binary shares that protect against affine tampering composed with joint tampering, which is strictly stronger than both the joint tampering model and the affine tampering model (although our affine tampering is non-adaptive compared to [49]). Specifically, we have the following theorem (which strictly generalizes Theorem 6).
There exist constants such that for all integers
there exists an explicit (r, t)-ramp non-malleable secret-sharing scheme with binary shares, where
,
and the message length is
. The scheme has statistical privacy with error
, and is resilient with error
to an adversary that tampers in two stages: In the first stage, the adversary partitions the r shares arbitrarily into two blocks, each with at most t shares, and tampers with each block independently using an arbitrary function. In the second stage, the adversary applies an arbitrary affine tampering function jointly on all the already tampered (from the first stage) r shares.
We provide a formal proof of the above theorem in the full version of our paper.
Again, it is an interesting open question to construct explicit non-malleable secret-sharing schemes where the order of tampering is reversed.
1.3 Seedless Non-malleable Extractors
Our results on non-malleable codes are based on new constructions of seedless non-malleable extractors, which we believe are of independent interest. Before defining seedless non-malleable extractors formally, we first recall some basic notation from the area of randomness extraction.
Randomness extraction is motivated by the problem of purifying imperfect (or defective) sources of randomness. The concern stems from the fact that natural random sources often have poor quality, while most applications require high quality (e.g., uniform) random bits. We use the standard notion of min-entropy to measure the amount of randomness in a distribution.
The min-entropy of a probability distribution
on
is defined to be
. We say
is an
-source and the min-entropy rate is
.
It turns out that it is impossible to extract from a single general weak random source even for min-entropy . There are two possible ways to bypass this barrier. The first one is to relax the extractor to be a seeded extractor, which takes an additional independent short random seed to extract from a weak random source. The second one is to construct deterministic extractors for special classes of weak random sources.
Both kinds of extractors have been studied extensively. Recently, they have also been generalized to stronger notions where the inputs to the extractor can be tampered with by an adversary. Specifically, Dodis and Wichs [33] introduced the notion of seeded non-malleable extractor in the context of privacy amplification against an active adversary. Informally, such an extractor satisfies the stronger property that the output of the extractor is independent of the output of the extractor on a tampered seed. Similarly, and more relevant to this paper, a seedless variant of non-malleable extractors was introduced by Cheraghchi and Guruswami [27] as a way to construct non-malleable codes. Apart from their original applications, both kinds of non-malleable extractors are of independent interest. They are also related to each other and have applications in constructions of extractors for independent sources [46].
We now define seedless non-malleable extractors.


















In the above definition, when the class of sources is the distribution
, we simply say that
is a seedless
-non-malleable extractor with respect to
.
Relevant Prior Work on Seedless Non-malleable Extractors. The first construction of seedless non-malleable extractors was given by Chattopadhyay and Zuckerman [22] with respect to the class of 10-split-state tampering. Subsequently, a series of works starting with the work of Chattopadhyay, Goyal and Li [18] gave explicit seedless non-malleable extractors for 2-split-state tampering. The only known constructions with respect to a class of tampering functions different from split state tampering is from the work of Chattopadhyay and Li [21], which gave explicit seedless non-malleable extractors with respect to the tampering class and small depth circuits, and a subsequent follow-up work of Ball et al. [10] where they constructed non-malleable extractors against tampering functions that are low-degree polynomials over large fields. We note that constructing explicit seedless non-malleable extractors with respect to
was also posed as an open problem in [27].
Our Results. As our most general result, we give the first explicit constructions of seedless non-malleable extractors with respect to the tampering class -
.
There exists a constant such that for all
there exists an efficiently computable seedless
-non-malleable extractor with respect to
-
, that is
-invertible.
This immediately yields the first explicit non-malleable extractors against the following classes of tampering functions.


,
-
,
, and
.
We derive our results on non-malleable codes using the above explicit constructions of non-malleable extractors based on a beautiful connection discovered by Cheraghchi and Gurswami [27] (see Theorem 25 for more details).
1.4 Extractors for Interleaved Sources
Our techniques also yield improved explicit constructions of extractors for interleaved sources, which generalize extractors for independent sources in the following way: the inputs to the extractor are samples from a few independent sources mixed (interleaved) in an unknown (but fixed) way. Raz and Yehudayoff [57] showed that such extractors have applications in communication complexity and proving lower bounds for arithmetic circuits. In a subsequent work, Chattopadhyay and Zuckerman [24] showed that such extractors can also be used to construct extractors for certain samplable sources, extending a line of work initiated by Trevisan and Vadhan [60]. We now define interleaved sources formally.
(Interleaved Sources). Let be arbitrary independent sources on
and let
be any permutation. Then
is an r-interleaved source.
Relevant Prior Work on Interleaved Extractors. Raz and Yehudayoff [57] gave explicit extractors for 2-interleaved sources when both the sources have min-entropy at least for a tiny constant
. Their construction is based on techniques from additive combinatorics and can output
bits with exponentially small error. Subsequently, Chattopadhyay and Zuckerman [24] constructed extractors for 2-interleaved sources where one source has entropy
for a small constant
and the other source has entropy
. They achieve output length
bits with error
.
A much better result (in terms of the min-entropy) is known if the extractor has access to an interleaving of more sources. For a large enough constant C, Chattopadhyay and Li [20] gave an explicit extractor for C-interleaved sources where each source has entropy . They achieve output length
and error
.
Our Results. Our main result is an explicit extractor for 2-interleaved sources where each source has min-entropy at least 2n/3. The extractor outputs bits with error
.







![$$\pi :[2n] \rightarrow [2n]$$](../images/508076_1_En_21_Chapter/508076_1_En_21_Chapter_TeX_IEq202.png)

2 Overview of Constructions and Techniques
Our results on non-malleable codes are derived from explicit constructions of invertible seedless non-malleable extractors (see Theorem 25). In this section, we illustrate our main ideas used to give explicit constructions of seedless non-malleable extractors with respect to the relevant classes of tampering functions, and explicit extractors for interleaved sources.
We first focus on the main ideas involved in constructing non-malleable extractors against 2-split-state adversaries when the partition are of equal length (we denote this by ). This serves to illustrate the important ideas that go into all our explicit non-malleable extractor constructions. We refer the reader to the full version of our paper for complete details of our non-malleable extractor and code constructions.
2.1 Seedless Non-malleable Extractors with Respect to Interleaved 2-split-state Tampering












![$$\pi :[2n] \rightarrow [2n]$$](../images/508076_1_En_21_Chapter/508076_1_En_21_Chapter_TeX_IEq216.png)


Our construction is based on the framework of advice generators and correlation breakers set up in the work [18], and used in various follow-up works on non-malleable extractors and codes. Before explaining this framework, we introduce some notation for ease of presentation. Let . We use the notation that if
(for some function h), then
or
stands for the corresponding random variable
. Thus,
.
- 1.
Generating advice: the task here is to construct a function
,
, such that
with high probability.
- 2.Breaking correlation: here we construct an object that can be seen as a relaxation of a non-malleable extractor, in the sense that we supply the non-malleable extractor with a short advice string. This object is called an advice correlation breaker. We require that for all distinct strings
,






An Explicit Advice Generator. A natural first idea to construct an advice generator can be as follows: Take a slice (prefix) of , say
, and use this to sample some coordinates from an encoding (using a good error correcting code) of
. A similar high level strategy has for example been used in [18], and other follow-up works. The intuition behind such a strategy is that since we assume
, encoding it will ensure that they differ on a lot of coordinates. Thus, sampling a random set of coordinates will include one such coordinate with high probability. However, in the present case, it is not clear why this should work since it could be that
contains all bits from say
, and the set of coordinates where the encoding of
and
differ may be a function of
, which leads to unwanted correlations.
The next natural idea could be the following: First use the slice to sample a few coordinates from
. Let
indicate
projected onto the sampled coordinates. Now, it is not hard to prove that
contains roughly equal number of bits from both the sources
and
. A strategy could be to now use
to sample coordinates from an encoding of
. However, in this case, we run into similar problems as before: there may be unwanted correlations between the randomness used for sampling, and the random variable corresponding to the set of coordinates where the encoding of
and
differ.
It turns out that the following subtler construction works:
Let for some small constant
. We take two slices from
, say
and
of lengths
and
, for some constant
. Next, we use a good linear error correcting code (let the encoder of this code be E) to encode
and sample
coordinates (let
denote this set) from this encoding using
(the sampler is based on seeded extractors [61]). Let
. Next, using
, we sample a random set of indices
, and let
. We now use an extractor for interleaved sources, i.e., an extractor that takes as input an unknown interleaving of two independent sources and outputs uniform bits (see Sect. 1.4). Let
be this extractor (say from Theorem 12), and we apply it to
to get
. Finally, let
be the output of a linear seeded extractor1
on
with
as the seed. The output of the advice generator is
.
Notation: Define and
. Similarly, define
and
. Thus,
and
. Let
be the bits of
in
for
and
be the remaining bits of
. Similarly define
’s,
.

















Now the idea is the following: Either (i) we can fix and claim that
still has enough min-entropy, or (ii) we can claim that
has enough min-entropy conditioned on the fixing of
. Let us first discuss why this is enough. Suppose we are in the first case. Then, we can fix
and
and argue that
is a deterministic function of
and contains enough entropy. Note that
is now fixed, and in fact it is fixed to a non-zero string (using the assumption that at least one of f or g has no fixed points). Thus,
is a string with a constant fraction of the coordinates set to 1 (since E is an encoder of a linear error correcting code with constant relative distance), and it follows that with high probability
contains a non-zero entry (using the fact that
is sampled using
, which has enough entropy). This finishes the proof in this case since it implies
with high probability.
Now suppose we are in case (ii). We use the fact that contains entropy to conclude that the sampled bits
contain almost equal number of bits from
and
(with high probability over
). Now we can fix
without loosing too much entropy from
(by making the size of
to be significantly larger than
). Next, we observe that
is an interleaved source, and hence
is close to uniform. We now fix
, and argue that
continues to be uniform. This follows roughly from the fact that any extractor for an interleaving of 2-sources is strong. Thus,
now becomes a deterministic function of
while at the same time,
still has enough min-entropy. Hence,
is close to uniform even conditioned on
. We can now fix
and
without affecting the distribution
, since
is a deterministic function of
while
is a deterministic function of
conditioned on the previous fixing of
. It follows that after these fixings,
is close to a uniform string and hence
with probability
, which completes the proof.
The fact that it is enough to consider case (i) and case (ii) relies on a careful convex combination analysis based on the pre-image size of the function . In addition, for the above argument to work we need to carefully adjust the sizes of
,
and
. We skip the details here, and refer the interested reader to later parts of the paper for more details.



































This almost works, modulo a technical caveat–the somewhere random source constructed out of is a tall matrix, with more rows than columns, but the parameters of
require us to work with a fat matrix, with more columns that rows. This is roughly because, we want the uniform row to have more entropy than the total size of all tampered random variables. To fix this, we use another linear seeded extractor on the source
with each row
as the seed to obtain another somewhere random source of the right shape.
2.2 From Non-malleable Extractors to Non-malleable Codes
To obtain our non-malleable codes, the decoding function corresponds to computing the extractor, which is already efficient. On the other hand, the encoding function corresponds to sampling from the pre-image of any given output of the non-malleable extractor. Thus we need to find an efficient way to do this, which is quite non-trivial. We suitably modify our extractor to support efficient sampling. Here we briefly sketch some high level ideas involved and refer the reader to the full version of our paper for more details.
Recall . The first modification is that in all applications of seeded extractors in our construction, we specifically use linear seeded extractors. This allows us to argue that the pre-image we are trying to sample from is in fact a convex combination of distributions supported on subspaces. The next crucial observation is the fact that we can use smaller disjoint slices of
to carry out various steps outlined in the construction. This is to ensure that the dimensions of the subspaces that we need to sample from, do not depend on the values of the random variables that we fix. For the steps where we use the entire source
(in the construction of the advice correlation breaker), we replace
by a large enough slice of
. However this is problematic if we choose the slice deterministically, since in an arbitrary interleaving of two sources, a slice of length less than n might have bits only from one source. We get around this by pseudorandomly sampling enough coordinates from
(by first taking a small slice of
and using a sampler that works for weak sources [61]).
We now use an elegant trick introduced by Li [46] where the output of the non-malleable extractor described above (with the modifications that we have specified) is now used as a seed in a linear seeded extractor applied to an even larger pseudorandom slice of . The linear seeded extractor that we use has the property that for any fixing of the seed, the rank of the linear map corresponding to the extractor is the same, and furthermore one can efficiently sample from the pre-image of any output of the extractor. The final modification needed is a careful choice of the error correcting code used in the advice generator. For this we use a dual BCH code, which allows us to argue that we can discard some output bits of the advice generator without affecting its correctness (based on the dual distance of the code). This is crucial in order to argue that the rank of the linear restriction imposed on the free variables of
does not depend on the values of the bits fixed so far. We refer the reader to the full version of our paper where we provide more intuition and complete details of the modified non-malleable extractor and sampling procedure.
2.3 Extractors for Interleaved Sources
Here we give a sketch of our improved extractor for interleaved sources . We refer the reader to the full version of our paper for more details. We present our construction and also explain the proof along the way, as this gives more intuition to the different steps of the construction. The high level idea is the following: transform
into a matrix of random variables (called a somewhere random source) such that at least one of the random variables is uniform, and the matrix is of the right shape, i.e, a fat matrix with more columns than rows. Once we have such a matrix, the idea is to use the advice correlation breaker from [21] mentioned above to break the correlation among the rows of the matrix. The final output will just be a bit-wise XOR of the output of the advice correlation breaker on each row of the matrix. We now give some more details on how to make this approach work.
Let . We start by taking a large enough slice
from
(say, of length
). Let
have more bits in this slice than
. Let
be the bits of
in
and
be the remaining bits of
. Similarly define
and
. Notice that
has linear entropy and also that
has linear entropy conditioned on
. We fix
and use a condenser (from work of Raz [55]) to condense
into a matrix with a constant number of rows such that at least one row is close to a distribution with entropy rate at least 0.9. Notice that this matrix is a deterministic function of
. The next step is to use
and each row of the matrix as a seed to a linear seeded extractor to get longer rows. This requires some care for the choice of the linear seeded extractor since the seed has some deficiency in entropy. After this step, we use the advice correlation breaker from [21] on
and each row of the somewhere random source, with the row index as the advice (similar to what is done in the construction of non-malleable extractors sketched above). Finally we compute the bit-wise XOR of the different outputs that we obtain. Let
denote this random variable. To output
bits, we use a linear seeded extractor on
with
as the seed. The correctness of various steps in the proof exploits the fact that
can be written as the bit-wise sum of two independent sources, and the fact that we use linear seeded extractors.
2.4 Organization
We use Sect. 3 to introduce some background and notation. We present our seedless non-malleable extractors with respect to interleaved split-state tampering in Sect. 4. We conclude with some open problems in Sect. 5.
3 Background and Notation
We use to denote the uniform distribution on
.
For any integer , [t] denotes the set
.
For a string y of length n, and any subset , we use
to denote the projection of y to the coordinates indexed by S.
We use bold capital letters for random variables and samples as the corresponding small letter, e.g., is a random variable, with x being a sample of
.
For strings , we use
(or equivalently
) to denote the bit-wise xor of the two strings.
3.1 Probability Lemmas
The following result on min-entropy was proved by Maurer and Wolf [50].



![$$\begin{aligned} \mathbf {Pr}_{y \sim \mathbf {Y}}[ H_{\infty }(\mathbf {X}| \mathbf {Y}= y) \ge H_{\infty }(\mathbf {X}) - \log \ell -\log (1/\epsilon )] > 1-\epsilon . \end{aligned}$$](../images/508076_1_En_21_Chapter/508076_1_En_21_Chapter_TeX_Equ11.png)
The following lemma is useful in bounding statistical distance of distributions after conditionings.
Let and
be distributions on some universe
such that
. Let
be some event some that
. Then,
.
3.2 Conditional Min-Entropy


![$$\begin{aligned} \widetilde{H}_{\infty }(\mathbf {X}|\mathbf {W}) = -\log \left( \mathbf {E}_{w \sim W}\left[\max _{x} \Pr [\mathbf {X}=x | \mathbf {W}=w] \right] \right) = - \log \left(\mathbf {E}\left[ 2^{-H_{\infty }(\mathbf {X}|\mathbf {W}=w)} \right]\right). \end{aligned}$$](../images/508076_1_En_21_Chapter/508076_1_En_21_Chapter_TeX_Equ12.png)
We recall some results on conditional min-entropy from the work of Dodis et al. [32].
([32]). If a random variable has support of size
, then
.
3.3 Seeded Extractors
A function is a
-seeded extractor if for any source
of min-entropy k,
.
is called a strong seeded extractor if
, where
and
are independent.
Further, if for each ,
is a linear function, then
is called a linear seeded extractor.
We require extractors that can extract uniform bits when the source only has sufficient conditional min-entropy.







It was shown in [32] that any seeded extractor is also an average case extractor.
([32]). For any , if
is a
-seeded extractor, then it is also a
-seeded average case extractor.
We record a folklore lemma, and include a proof for completeness.












3.4 Samplers and Extractors
Zuckerman [61] showed that seeded extractors can be used as samplers given access to weak sources. This connection is best presented by a graph theoretic representation of seeded extractors. A seeded extractor can be viewed as an unbalanced bipartite graph
with
left vertices (each of degree
) and
right vertices. Let
denote the set of neighbors of x in
.







![$$\begin{aligned} \mathbf {Pr}_{\mathbf {x} \sim \mathbf {X}}[||\text {Samp}(\mathbf {x}) \cap R | - \mu _RD| > \epsilon D] < 2^{-k}, \end{aligned}$$](../images/508076_1_En_21_Chapter/508076_1_En_21_Chapter_TeX_Equ18.png)

3.5 Explicit Extractors from Prior Work
We recall an optimal construction of strong-seeded extractors.
([42]). For any constant , and all integers
there exists a polynomial time computable strong-seeded extractor
with
and
.
The following are explicit constructions of linear seeded extractors.
([56, 59]). For every and
, with
, there exists an explicit strong linear seeded extractor
for min-entropy k and error
, where
.
A drawback of the above construction is that the seeded length is for sub-linear min-entropy. A construction of Li [45] achieves
seed length for even polylogarithmic min-entropy.
([45]). There exists a constant such that for every
with
and any
, there exists a polynomial time computable linear seeded extractor
for min-entropy k and error
, where
and
.
A different construction achieves seed length for high entropy sources.
([18, 46]). For all there exist
such that for all integers
,
, there exists an efficiently computable linear strong seeded extractor
,
for min-entropy
. Further, for any
, the linear map
has rank
.
The above theorem is stated in [46] for , but it is straightforward to see that the proof extends for any constant
.
We use a property of linear seeded extractors proved by Rao [53].


![$$\begin{aligned} \Pr _{u \sim U_{d}}[|\text {Ext}(X,u) - U_{m}|>0] \le 2\epsilon . \end{aligned}$$](../images/508076_1_En_21_Chapter/508076_1_En_21_Chapter_TeX_Equ19.png)
We recall a two-source extractor construction for high entropy sources based on the inner product function.









Rao [52] (based on an argument by Boaz Barak) proved that every two-source extractor is strong. It is easy to observe that the proof generalizes to the case of interleaved two-source extractors. We record this below in a slightly more general setting of unequal length sources.





![$$\pi : [n_1 + n_2] \rightarrow [n_1 + n_2]$$](../images/508076_1_En_21_Chapter/508076_1_En_21_Chapter_TeX_IEq554.png)


- Let
be a
-source,
be an independent
-source, and
be an arbitrary permutation. Then,
- Let
be a
-source,
be an independent
-source, and
be an arbitrary permutation. Then,
3.6 Advice Correlation Breakers
We use a primitive called ‘correlation breaker’ in our construction. Consider a situation where we have arbitrarily correlated random variables , where each
is on
bits. Further suppose
is a ‘good’ random variable (typically, we assume
is uniform or has almost full min-entropy). A correlation breaker
is an explicit function that takes some additional resource
, where
is typically additional randomness (an (n, k)-source) that is independent of
. Thus using
, the task is to break the correlation between
and the random variables
, i.e.,
is independent of
. A weaker notion is that of an advice correlation breaker that takes in some advice for each of the
’s as an additional resource in breaking the correlations. This primitive was implicitly constructed in [18] and used in explicit constructions of non-malleable extractors, and has subsequently found many applications in explicit constructions of extractors for independent sources and non-malleable extractors.
We recall an explicit advice correlation breaker constructed in [20]. This correlation breaker works even with the weaker guarantee that the ‘helper source’ is now allowed to be correlated to the sources random variables
in a structured way. Concretely, we assume the source to be of the form
, where
is assumed to be an (n, k)-source that is uncorrelated with
. We now state the result more precisely.






be an
-source,
a r.v on n bits,
be an
-source,
are r.v’s on n bits, and
be r.v’s on
bits each, such that
is independent of
,
be bit-strings of length h such that for each
,
.

,
,

3.7 A Connection Between Non-malleable Codes and Extractors
The following theorem proved by Cheraghchi and Guruswami [27] that connects non-malleable extractors and codes.
([27]). Let be an efficient seedless
-non-malleable extractor with respect to a class of tampering functions
acting on
. Further suppose
is
-invertible. Then there exists an efficient construction of a non-malleable code with respect to the tampering family
with block length
, relative rate
and error
.
4 NM Extractors for Interleaved Split-State Adversaries
The main result of this section is an explicit non-malleable extractor for interleaved 2-split-state tampering families with equal length partitions, which we denote by .




![$$\pi :[2n] \rightarrow [2n]$$](../images/508076_1_En_21_Chapter/508076_1_En_21_Chapter_TeX_IEq624.png)





In such settings, it was shown in [27] that it is enough to construct non-malleable extractors assuming that at least one of f and g does not have any fixed points, assuming that the sources and
have entropy at least
. We thus prove the following theorem, from which Theorem 26 is direct.






![$$\pi :[2n] \rightarrow [2n]$$](../images/508076_1_En_21_Chapter/508076_1_En_21_Chapter_TeX_IEq638.png)



We will prove a slightly more general result which is a direct by-product of our proof technique for proving the above theorem, and lets us re-use this non-malleable extractor for the class of linear adversaries composed with split-state adversaries. We prove the following theorem.








![$$\pi :[2n] \rightarrow [2n]$$](../images/508076_1_En_21_Chapter/508076_1_En_21_Chapter_TeX_IEq649.png)

and
,
or
and
,
.

Clearly, Theorem 27 follows directly from the above theorem by setting for all y and
for all x. We use the rest of the section to prove Theorem 28.
Our high level ideas in constructing the non-malleable extractor is via the framework set up in [18] of using advice generators and correlation breakers. We give intuition behind our construction in Sect. 2. We use Sect. 4.1 to construct an advice generator and Sect. 4.2 to construct an advice correlation breaker. Finally, we present the non-malleable extractor construction in Sect. 4.3.
If
(for some function h), then we use
or
to denote the random variable
.
Define
,
,
,
,
and
.
Finally, define
and
.
4.1 An Advice Generator




,
and
are independent,
,
.
We present the construction of our advice generator and refer the reader to the full version of our paper for the proof. We claim that the function computed by Algorithm 1 satisfies the above lemma. We first set up some parameters and ingredients.
Let C be a large enough constant and
.
Let
, for some constant
that we set below.
Let
be the encoding function of a linear error correcting code
with constant rate
and constant distance
.
Let
be a
-seeded extractor instantiated using Theorem 18. Thus
, for some constant
. Let
.
Let
be the sampler obtained from Theorem 17 using
.
Let
be a
-seeded extractor instantiated using Theorem 18. Thus
, for some constant
. Let
. Thus
.
Let
be the sampler obtained from Theorem 17 using
.
Set
.
Let
be the extractor from Theorem 12.
Let
be a linear seeded extractor instantiated from Theorem 22 set to extract from min-entropy
and error
.

4.2 An Advice Correlation Breaker



![$$\pi :[2n] \rightarrow [2n]$$](../images/508076_1_En_21_Chapter/508076_1_En_21_Chapter_TeX_IEq711.png)

and
,
or
and
,
.
Further, we defined the following: ,
,
,
,
and
. It follows that
and
. Thus, for some functions
,
. Let
and
.
The following is the main result of this section. Assume that we have some random variables such that and
continue to be independent, and
.






We present the construction of our advice correlation breaker, and refer the reader to the full version of our paper for the proof. We prove that the function computed by Algorithm 2 satisfies the conclusion of Lemma 9.
We start by setting up some ingredients and parameters.
Let
be a small enough constant.
Let
, where
.
Let
,
, be a linear-seeded extractor instantiated from Theorem 19 set to extract from entropy
with error
. Thus
, for some constant
. Let
,
.
Set
.
Let
,
be a linear-seeded extractor instantiated from Theorem 19 set to extract from entropy
with error
, such that the seed length of the extractor
(by Theorem 19) is
.
Let
, be the advice correlation breaker from Theorem 24 set with the following parameters:
,
. It can be checked that by our choice of parameters, the conditions required for Theorem 24 indeed hold for
.

4.3 The Non-malleable Extractor
We are now ready to present the construction of that satisfies the requirements of Theorem 28.
5 Open Questions
Non-malleable Codes for Composition of Functions. Here we give efficient constructions of non-malleable codes for the tampering class . Many natural questions remain to be answered. For instance, one open problem is to efficiently construct non-malleable codes for the tampering class
or
, which as explained before is closely related to the question of constructing explicit (r, t)-ramp non-malleable secret-sharing schemes with binary shares, where
. It looks like one needs substantially new ideas to give such constructions. More generally, for what other interesting classes of functions
and
can we construct non-malleable codes for the composed class
? Is it possible to efficiently construct non-malleable codes for any tampering class
as long as we have efficient non-malleable codes for the classes
and
?
Other Applications of Seedless Non-malleable Extractors. The explicit seedless non-malleable extractors that we construct satisfy strong pseudorandom properties. A natural question is to find more applications of these non-malleable extractors in explicit constructions of other interesting objects.
Improved Seedless Extractors. We construct an extractor for 2-interleaved sources that works for min-entropy rate 2/3. It is easy to verify that there exists extractors for sources with min-entropy as low as , and a natural question here is to come up with such explicit constructions. Given the success in constructing 2-source extractors for low min-entropy [23, 47], we are optimistic that more progress can be made on this problem.
We are grateful for useful comments from anonymous referees.