1 Introduction
Error correcting codes (ECCs) are a tool for handling errors when transmitting messages over an unreliable communication channel. They work by first encoding the message with additional redundant information, which is then sent over the channel. This redundancy allows the recipient to recover the original message, even in the presence of a limited number of errors that might occur during transmission.
Since their introduction in the 1950s, error correcting codes [Ham50] have been a thriving research area due to their role both in practical applications and in theoretical computer science. One of the central open questions concerns the exact tradeoff between a code’s rate (message length divided by codeword length) and the code’s error tolerance (the number of errors that its decoding algorithm can tolerate). There are several known fundamental bounds (e.g. the Hamming, Singleton, and Plotkin bounds) on the maximum rate of a code in terms of its distance, and state of the art codes (especially over small alphabets) often only achieve significantly lower rates.
To achieve better rates, two major relaxations of error correction have been proposed. In the first, called list decoding [Eli57, Woz58], a decoding algorithm is no longer required to output the originally encoded message, but may instead output a short list of messages which is required to contain the original message. In this work, we will focus on standard (unique) decoding, but we will use list-decodable codes as a central building block.
In the second relaxation, the communication channel between the sender and receiver is assumed to be restricted in some way. In other words, the code is no longer required to handle fully worst-case errors. The most relevant model for us is the computationally bounded channel [Lip94], which loosely speaking, models codeword errors as generated by a polynomial-time process.
An additional drawback of the constructions of [Lip94] and [MPSW10] is that they require a stateful encoder, which may render them unsuitable for use in data storage or in applications requiring concurrent transmission of multiple messages. In [Lip94], it is essential for security that the encoder's state never repeats, and essential for correctness that the decoder's state is synchronized with the encoder's state. In [MPSW10], the decoder is stateless, but it is essential for security that errors are chosen in an online fashion. In other words, there are no guarantees if a codeword c is corrupted after the adversary sees a subsequently encoded codeword. This state of affairs motivates two questions:

Are there “good” uniquely decodable codes for the computationally bounded channel with transparent setup?

Are there “good” uniquely decodable codes for the computationally bounded channel with a stateless encoder?
1.1 Our Contributions
We answer both questions affirmatively, constructing a code for computationally bounded channels (with transparent setup and stateless encoding) that outperforms codes for worst-case errors. As a contribution that may be of independent interest, we also construct codes with high “pseudodistance”, i.e., codes for which it is hard to find two codewords that are close in Hamming distance.
Pseudounique Decoding. The main goal of an error correcting code C is to facilitate the recovery of a transmitted message m given a partially corrupted copy of C(m). To formalize this (in the information-theoretic setting), a polynomial-time algorithm D is said to be a unique decoding algorithm for C against a δ fraction of errors if for all messages m and all strings c that are δ-close in Hamming distance to C(m), we have D(c) = m.
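As a toy illustration of this definition (our own example, not from the paper): the n-fold repetition code has distance n, and majority-vote decoding uniquely recovers the message from any pattern of fewer than n/2 errors.

```python
# Toy illustration: the repetition code C(m) = (m, m, ..., m) has distance n,
# and majority-vote decoding uniquely decodes fewer than n/2 errors.
from collections import Counter

def encode(m, n):
    return [m] * n

def decode(word):
    # Majority vote: equivalently, the closest repetition codeword.
    return Counter(word).most_common(1)[0][0]

n = 9
c = encode(3, n)
c[0] = c[1] = c[2] = c[3] = 7   # introduce 4 < n/2 errors
assert decode(c) == 3
```

With 4 corrupted positions out of 9, the majority symbol is still the original message.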
In reality, messages and noise are created by nature, which can be conservatively modeled as a computationally bounded adversary. We thus relax the above for-all quantification and only require efficient decoding when both m and c are chosen by a computationally bounded process. Our codes will be described by a randomly generated seed that is used in the encoding and decoding procedures. In other words, we will work with a seeded family of codes {C_pp}, where pp is the seed, which we will also refer to as the public parameters for the code. In our constructions, the public parameters are merely unstructured uniformly random strings of a certain length.
More formally, we say that a polynomial-time algorithm D is a pseudounique decoding algorithm for {C_pp} against a δ fraction of errors if no polynomial-time adversary A can win the following game with noticeable probability. The public parameters pp are first sampled uniformly at random and given to A. The adversary then produces a message m and a string c, and is said to win if c is δ-close to C_pp(m) and D(pp, c) ≠ m.
Under cryptographic assumptions (or in the random oracle model), we construct codes with pseudounique decoding algorithms for a larger fraction of errors than is possible in the standard setting. Our main theorem requires a “good” cryptographic hash function (which is used as a black box), where we defer the formalization of the necessary security requirements to Sect. 3. For now, we simply mention that it is a multi-input generalization of correlation intractability, and in Sect. 3 we show that it can be instantiated by a (non-programmable) random oracle. The precise statement and details about the construction appear in Sect. 4.
Theorem 1 (informal). For any r ∈ (0, 1) and any ε > 0, there exist rate-r codes, over large (polynomial-sized) alphabets, that are efficiently pseudouniquely decodable against up to a min(1 − r, 1/2) − ε fraction of errors, assuming good hash functions exist (or in the random oracle model).
This should be contrasted with the Singleton bound, which rules out (standard) unique decoding for more than a (1 − r)/2 fraction of errors. Our positive result is a corollary of a more general connection to efficient list-decodability, which we prove in Sect. 4. This connection also implies results over binary alphabets, albeit with bounds that are harder to state (see Corollary 3) because known binary codes do not achieve list-decoding capacity and instead have messy rate vs. error correction tradeoffs.
Pseudodistance. Our second notion is an analogue of distance. Recall that a code C is said to have distance d if for all pairs of distinct messages m_0, m_1, their encodings C(m_0) and C(m_1) have Hamming distance at least d. We can similarly replace this for-all quantifier and only require C(m_0) and C(m_1) to be far for pairs m_0, m_1 that are computed from pp by a computationally bounded adversary.
We note that a code's pseudodistance may be arbitrarily high without implying anything about its decodability, even by an inefficient algorithm. It is instructive to imagine a rate-1 code whose encoding algorithm is given by a (sufficiently obfuscated) random permutation mapping {0,1}^n to {0,1}^n. The pseudodistance of this code will be roughly n/2, but it is information-theoretically impossible to decode in the presence of even a single error.
Still, pseudodistance is a useful intermediate notion for us in the construction of pseudouniquely decodable codes, and the notion may be of independent interest.
1.2 Main Definitions and Main Theorem Statement
The preceding discussion is formalized in the following definitions.


Definition (Seeded Codes). A seeded family of codes is a pair of algorithms (Setup, Enc), where:

- Setup is probabilistic, takes a domain length k (in unary), and outputs public parameters pp.
- Enc is deterministic, takes parameters pp and a message m ∈ {0,1}^k, and outputs a codeword Enc(pp, m) ∈ [q]^n, where n is called the length of the code.

When lim_{k→∞} k / (n log₂ q) is well-defined it is called the rate of (Setup, Enc). If Setup simply outputs a uniformly random binary string of some length that depends on k, then we say that (Setup, Enc) is public-coin.





Definition (Pseudodistance). A seeded family of codes (Setup, Enc) has (s, ε)-pseudodistance d if for every size-s(k) adversary A = {A_k},

$$\begin{aligned} \Pr _{\begin{array}{c} \mathsf {pp}\leftarrow \mathsf {Setup}(1^k) \\ (m_0, m_1) \leftarrow \mathcal {A}_k(\mathsf {pp}) \end{array}} \left[ \varDelta \big (\mathsf {Enc}(\mathsf {pp}, m_0), \mathsf {Enc}(\mathsf {pp}, m_1) \big ) < d \right] \le \epsilon (k). \end{aligned}$$

(Setup, Enc) is said simply to have pseudodistance d if for all polynomials s, there exists a negligible ε such that (Setup, Enc) has (s, ε)-pseudodistance d.






Definition (Pseudounique Decoding). An algorithm Dec is an (s, ε)-pseudounique decoding algorithm for a seeded family (Setup, Enc) against d(k) errors if for every size-s(k) adversary A = {A_k},

$$\begin{aligned} \Pr _{\begin{array}{c} \mathsf {pp}\leftarrow \mathsf {Setup}(1^k) \\ (m, c) \leftarrow \mathcal {A}_k(\mathsf {pp}) \end{array}} \left[ \varDelta \big (c, \mathsf {Enc}(\mathsf {pp}, m) \big ) \le d(k)\ \wedge \ \mathsf {Dec}(\mathsf {pp}, c) \ne m \right] \le \epsilon (k). \end{aligned}$$









Dec is said simply to be a pseudounique decoding algorithm against d(k) errors if for all polynomials s, there exists a negligible ε such that Dec is an (s, ε)-pseudounique decoding algorithm against d(k) errors; a seeded family of codes is efficiently pseudouniquely decodable against d(k) errors if it has a polynomial-time such algorithm. We sometimes say a “δ fraction of errors” to refer to some d(k) such that d(k) = δ · n(k), where n(k) is the length of Enc(pp, ·).
As in the previous theorem, we assume the existence of random-like hash functions to obtain our result. These hash functions can be instantiated in the random oracle model.
Theorem 2 (informal). If 𝒞 is a rate-r ensemble of codes that is efficiently list-decodable against a δ fraction of errors, and if good hash functions exist, then there exists a rate-r seeded code that is efficiently pseudouniquely decodable against a min(δ, δ₀/2) fraction of errors, where δ₀ denotes the relative pseudodistance achieved by our construction (which approaches 1 − 1/q when 𝒞 approaches list-decoding capacity over alphabet size q).
The above bound has a nice interpretation when C approaches capacity, i.e. when δ approaches 1 − r. Over large alphabets the bound then becomes min(1 − r, 1/2), and a 1/2 fraction of errors upper bounds the pseudounique decodability of any positive-rate code (implied by the proof of the Plotkin bound, and made explicit in [MPSW10]). So if C achieves capacity, Theorem 2 says that one can uniquely decode up to the (efficient) list-decoding radius of C, as long as that doesn't exceed 1/2.
1.3 Related Work
The notion of a computationally bounded channel was first studied by Lipton [Lip94], and has subsequently been studied in a variety of coding theory settings including local decodability, local correctability, and list decoding, with channels that are bounded either in time complexity or space complexity [DGL04, GS16, BGGZ19, MPSW10, SS16, HOSW11, HO08, OPS07]. We compare some of these works in Table 1. Focusing on unique decoding against polynomial-time computationally bounded errors, the work most relevant to us is [MPSW10], improving on [Lip94].
Lipton [Lip94] showed that, assuming one-way functions, any code that is (efficiently) uniquely decodable against random errors can be upgraded to a “secret-key, stateful code” that is (efficiently) uniquely decodable against the same number of errors, even when those errors are computed in polynomial time. Using known results on codes for random errors, this gives rate-r (large alphabet) codes that are uniquely decodable against a 1 − r − ε fraction of errors. However, these codes require the sender and receiver to share a secret key, and to be stateful (incrementing a counter for each message sent/received).
Micali et al. [MPSW10] improve on this result, obtaining a coding scheme where only the sender needs a secret key (the receiver only needs a corresponding public key), and only the sender needs to maintain a counter. They show that these limitations are inherent in the high-error regime; namely, it is impossible to uniquely decode beyond error rates 1/4 (in the binary case) and more generally (1 − 1/q)/2 over q-ary alphabets, even if the errors are computationally bounded. Compared to [Lip94], [MPSW10] starts with codes that are efficiently list decodable, rather than codes that are uniquely decodable against random errors. The crux of their technique is using cryptographic signatures to “sieve” out all but one of the candidate messages returned by a list-decoding algorithm. Our construction also uses list decodability in a similar way. The key difference is that we use a different sieving mechanism that is stateless and transparent (i.e., the only setup is a public uniformly random string), but is only applicable for error rates below (1 − 1/q)/2.
Our work improves over [MPSW10] in the amount of setup required for the code. In [MPSW10], the sender must initially create a secret key and share the corresponding public key with the receiver (and the adversarial channel is also allowed to depend on the public key). In contrast, our code allows anyone to send messages—no secret key is needed. This property may be useful in applications such as Wi-Fi and cellular networks, where many parties need to communicate.
Another important difference between [MPSW10] and our work is that in [MPSW10], the sender is stateful. That is, whenever the sender sends a message, he updates some internal state which affects the way the next message will be encoded. We do not make such an assumption. Note that in some situations, maintaining a state may not be possible. For example, if there are multiple senders (or a single sender who is operating several servers in different locations), it is unclear how to collectively maintain state. Whenever one of the senders sends a message, he must inform all the other senders so they can update their state accordingly, which may not be possible, or significantly slow down communication. Moreover, the guarantees of [MPSW10] only apply to adversaries that operate in a totally “online” fashion. The error tolerance guarantees break down if an adversary is able to corrupt a codeword after seeing a subsequently encoded message. In our construction, the sender and receiver are both stateless, so these issues do not arise.
Summary of related work. The column “noise” describes the computational power of the adversary adding noise. URS stands for uniform random string (shared publicly between the sender, receiver, and adversary), BSC for binary symmetric channel, and PRG for pseudorandom generator.

Work | Setup | Noise | Decoding | Rate | Assumptions
---|---|---|---|---|---
This paper | URS | poly-time | unique | arbitrarily close to capacity | two-input correlation intractability
[GS16] | URS | | list | arbitrarily close to capacity | no assumptions
[SS16] | none | | list | arbitrarily close to capacity | PRGs for small circuits
[SS20] | none | | unique | arbitrarily close to capacity | none
[MPSW10] | public key | poly-time | unique | matches list decoding radius | stateful sender and one-way functions
[OPS07] | private shared randomness | | local | | one-way functions
[HOSW11] | public key | | local | | public-key encryption
[BGGZ19] | URS | | local correction | | collision-resistant hash function
[Lip94] | private shared randomness | poly-time | unique | matches BSC channel | stateful sender and one-way functions
2 Preliminaries
2.1 Combinatorics
The falling factorial of n ∈ ℕ is (n)_ℓ = n · (n − 1) ⋯ (n − ℓ + 1).
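For concreteness, the falling factorial can be computed directly (an illustrative helper, not from the paper); note also the bound (n)_ℓ ≤ n^ℓ used later.

```python
def falling_factorial(n, l):
    """(n)_l = n * (n-1) * ... * (n-l+1): the number of l-tuples of distinct items from n."""
    result = 1
    for i in range(l):
        result *= n - i
    return result

assert falling_factorial(5, 2) == 20    # 5 * 4
assert falling_factorial(7, 3) == 210   # 7 * 6 * 5
assert falling_factorial(4, 0) == 1     # empty product
assert falling_factorial(10, 3) <= 10 ** 3   # (n)_l <= n^l
```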
The q-ary entropy function H_q : [0, 1] → [0, 1] is defined by

$$H_q(x) {\mathop {=}\limits ^{\mathsf {def}}} x \log _q (q-1) - x \log _q x - (1-x) \log _q (1-x).$$

For strings u, v ∈ [q]^n, their Hamming distance Δ(u, v) is the number of coordinates on which they differ:

$$\begin{aligned} \varDelta (u,v) {\mathop {=}\limits ^{\mathsf {def}}}\Big | \big \{ i \in [n] : u_i \ne v_i \big \} \Big |. \end{aligned}$$

We write u ≈_δ v to denote that Δ(u, v) ≤ δ · n, and say that u and v are δ-close.
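In code, the Hamming distance and the δ-closeness predicate look as follows (illustrative helpers, not from the paper):

```python
def hamming(u, v):
    """Number of positions where u and v differ."""
    assert len(u) == len(v)
    return sum(1 for a, b in zip(u, v) if a != b)

def close(u, v, delta):
    """u ~=_delta v : Hamming distance at most delta * n."""
    return hamming(u, v) <= delta * len(u)

assert hamming("abcde", "abxye") == 2
assert close("abcde", "abxye", 0.4)        # 2 <= 0.4 * 5
assert not close("abcde", "abxye", 0.2)    # 2 >  0.2 * 5
```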
2.2 Codes
A deterministic q-ary code is a function C : [K] → [q]^n, where n is called the block length of C, [K] is called the message space, and [q] is called the alphabet. The distance of C is the minimum Hamming distance between C(m) and C(m′) for distinct m, m′ ∈ [K]. A probabilistic q-ary code of block length n and message space [K] is a randomized function C : [K] → [q]^n.
An ensemble of codes is a sequence {C_i : [K_i] → [q_i]^{n_i}} such that:

- K_i, q_i, and n_i increase weakly monotonically with i and are computable from i in polynomial time (i.e. in time poly(log i)).
- The alphabet size q_i is at most poly(n_i).
- There is a polynomial-time algorithm E that given (i, x) for x ∈ [K_i] outputs C_i(x).
- The limit r = lim_{i→∞} log K_i / (n_i log q_i) exists with r > 0. We call r the rate of the ensemble.
- lim_{i→∞} n_{i+1} / n_i = 1. This is important so that the cost of padding (to encode arbitrary-length messages) is insignificant.
One implication of these restrictions is that without loss of generality we can assume that messages are binary strings (i.e. K_i = 2^{k_i}) and we can index our codes by k rather than by i.
We say that an ensemble of codes {C_k} is combinatorially (δ, L)-list decodable if for any string y ∈ [q]^n, there are at most L values of m for which C_k(m) ≈_δ y. If there is a polynomial-time algorithm that outputs all such m given y (and k), we say that {C_k} is efficiently (δ, L)-list decodable.
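A brute-force procedure matching the combinatorial definition can be sketched as follows (exponential-time and purely illustrative; `codebook` is a hypothetical toy code given as an explicit table, not from the paper):

```python
def list_decode(codebook, y, delta):
    """Return all messages m whose codeword is delta-close to y.
    codebook: dict mapping message -> codeword (sequences of equal length n)."""
    n = len(y)
    return sorted(m for m, c in codebook.items()
                  if sum(1 for a, b in zip(c, y) if a != b) <= delta * n)

# A toy 4-word code of block length 4 over the alphabet {0, 1, 2}.
codebook = {0: (0, 0, 0, 0), 1: (1, 1, 1, 1), 2: (2, 2, 2, 2), 3: (0, 1, 2, 0)}
# Radius delta = 0.5: all codewords within Hamming distance 2 of y.
assert list_decode(codebook, (0, 1, 0, 0), 0.5) == [0, 3]
```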
2.3 Pseudorandomness
Random variables X_1, …, X_n are said to be t-wise independent if for any set S ⊆ [n] with size |S| ≤ t, the random variables {X_i}_{i∈S} are mutually independent. More generally, X_1, …, X_n are t-wise β-dependent if for any S ⊆ [n] with |S| ≤ t and any values {x_i}_{i∈S},

$$\begin{aligned} \Pr \left[ \bigwedge _{i \in S} X_i = x_i \right] \le \beta \cdot \prod _{i \in S} \Pr [X_i = x_i]. \end{aligned}$$
Permutations. If X is a finite set, we write Sym(X) to denote the set of all permutations of X.

Definition 11. A family Π of permutations of X is said to be t-wise ε-dependent if for all distinct x_1, …, x_t ∈ X, the distribution of (π(x_1), …, π(x_t)) for uniformly random π ← Π is ε-close in statistical distance to uniform on the set of t-tuples of distinct elements of X.

To avoid pathological issues regarding the domains of permutation families (e.g. their sampleability, decidability, and compressability), we will restrict our attention to permutations on sets of the form {0,1}^ℓ for ℓ ∈ ℕ.


An ensemble of permutation families {Π_ℓ} is fully explicit if there are polynomial-time algorithms for:

- sampling a description of π ← Π_ℓ; and
- computing π(x) and π^{-1}(x) given x and a description of π.
Theorem 3 ([KNR09]). For any t = t(ℓ) and any ε = ε(ℓ) > 0, there is a fully explicit t-wise ε-dependent ensemble {Π_ℓ} of permutation families on {0,1}^ℓ.
The following non-standard variation on the notion of t-wise almost-independence will prove to be more convenient for us.
A probability distribution P is said to be β-close in Rényi ∞-divergence to a distribution Q if for all x, P(x) ≤ β · Q(x).

We say that a family Π is t-wise β-dependent in Rényi ∞-divergence if for all distinct x_1, …, x_t, the distribution of (π(x_1), …, π(x_t)) for uniformly random π ← Π is β-close in Rényi ∞-divergence to the uniform distribution on t-tuples of distinct elements.
It is easily verified that any family of permutations of X that is t-wise ε-dependent as in Definition 11 is also t-wise β-dependent in Rényi ∞-divergence with β ≤ 1 + ε · |X|^t (and in particular β = O(1) for suitably small ε). Thus Theorem 3 gives us the following.

Corollary 1. For any t, there is a fully explicit t-wise O(1)-dependent (in Rényi ∞-divergence) ensemble of permutation families.
3 Multi-input Correlation Intractability
Correlation intractability was introduced by Canetti, Goldreich, and Halevi [CGH04] as a way to model a large class of random-oracle-like security properties of hash functions. Roughly speaking, H is said to be correlation intractable if for any sparse relation R it is hard to find x such that (x, H(x)) ∈ R. In recent years, CI hash functions have been under the spotlight, with surprising results on instantiating CI hash families from concrete computational assumptions (e.g., [CCR16, KRR17, CCRR18, CCH+18, PS19]).
In this work, we need a stronger multi-input variant of correlation intractability. We formulate a notion of multi-input sparsity such that a hash function can plausibly be correlation intractable for all sparse multi-input relations. Indeed, we prove that a random oracle has this property.
Definition (Multi-input Relations). For sets 𝒳 and 𝒴, an ℓ-input relation on (𝒳, 𝒴) is a subset R ⊆ 𝒳^ℓ × 𝒴^ℓ.

Definition (Sparsity). An ℓ-input relation R on (𝒳, 𝒴) is p-sparse if for every i ∈ [ℓ], every x_1, …, x_ℓ ∈ 𝒳, and every y_1, …, y_{i−1}, y_{i+1}, …, y_ℓ ∈ 𝒴,

$$\begin{aligned} \Pr _{y_i \leftarrow \mathcal {Y}} \left[ (x_1, \ldots , x_{\ell }, y_1, \ldots , y_{\ell }) \in R \right] \le p. \end{aligned}$$
A natural but flawed generalization of single-input sparsity for an ℓ-input relation R might instead require that for all x_1, …, x_ℓ ∈ 𝒳, it holds with overwhelming probability over a uniform choice of y_1, …, y_ℓ ← 𝒴 that (x_1, …, x_ℓ, y_1, …, y_ℓ) ∉ R. Unfortunately this definition does not account for an adversary's ability to choose some x_i adaptively. Indeed, even a random oracle would not be 2-input correlation intractable under this definition for the relation R = {(x_1, x_2, y_1, y_2) : x_2 = y_1}, which does satisfy the aforementioned “sparsity” property: for fixed inputs, a uniformly random y_1 equals x_2 only with negligible probability, yet an adversary can query x_1, learn y_1, and then set x_2 = y_1.
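This adaptive attack is easy to demonstrate in code (a toy sketch of our own; SHA-256 stands in for the random oracle, and all names are illustrative):

```python
import hashlib

def H(x: bytes) -> bytes:
    """Stand-in for a random oracle (here SHA-256 with 256-bit outputs)."""
    return hashlib.sha256(x).digest()

def adversary():
    # Query x1 first, then adaptively set x2 to the oracle's answer.
    x1 = b"anything"
    x2 = H(x1)
    return x1, x2

x1, x2 = adversary()
# The tuple (x1, x2, H(x1), H(x2)) lies in R = {x2 = y1} with certainty,
# even though for fixed x2 a random y1 would hit it only with probability 2^-256.
assert x2 == H(x1)
```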







Definition 16 (Multi-input Correlation Intractability). A hash family ℋ = {H_k : 𝒳_λ → 𝒴_λ}_{k ∈ 𝒦_λ} is correlation intractable for an ℓ-input relation ensemble {R_λ} if for every polynomial-size adversary 𝒜 there is a negligible function ε such that

$$\begin{aligned} \Pr _{\begin{array}{c} k \leftarrow \mathcal {K}_\lambda \\ (x_1,\ldots ,x_{\ell }) \leftarrow \mathcal {A}(k) \end{array}}\Big [ \big (x_1,\ldots ,x_{\ell },H_k(x_1),\ldots ,H_k(x_{\ell }) \big ) \in R_\lambda \Big ] \le \epsilon (\lambda ). \end{aligned}$$
3.1 Multi-input Correlation Intractability of Random Oracles
We show that a random oracle is ℓ-input correlation intractable, as in Definition 16, for every sparse relation.





Theorem 4. Let R be a p-sparse ℓ-input relation on (𝒳, 𝒴), and let F : 𝒳 → 𝒴 be a uniformly random function. Then for every oracle algorithm 𝒜 that makes at most T queries to F,

$$\begin{aligned} \Pr&\left[ \mathcal {A}^F \text { outputs }(x_1, \ldots , x_\ell ) \in \mathcal {X}^\ell \text { s.t. }\big (x_1, \ldots , x_\ell , F(x_1), \ldots , F(x_\ell ) \big ) \in R \right] \\&\le p \cdot (T)_\ell \le p \cdot T^\ell . \end{aligned}$$
Proof Overview. We give an overview of the proof, which should give some intuition as to why we get the expression p · T^ℓ. Fix a tuple of ℓ elements; the probability, over the random oracle, that these elements are in the relation with respect to the random oracle is at most p, which follows from the definition of sparsity. However, for a longer list of elements, of length T, we need to take into account all the possible tuples of size ℓ in that list and apply a union bound. Since the number of queries is bounded by T, we get that the probability is at most p · T^ℓ.

The above argument works for a fixed list of elements, and gives intuition for the probability expression achieved in the theorem. However, an oracle algorithm is allowed to perform adaptive queries, where the next query might depend on the result of the random oracle on previous queries. This makes the proof more challenging and, in particular, much more technical.
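As a sanity check on this union-bound intuition (our own toy experiment, not from the paper), consider the 2-input relation R = {(x_1, x_2, y_1, y_2) : y_1 = y_2, x_1 ≠ x_2}, which is p-sparse with p = 1/|𝒴|. The theorem bounds the success probability of any T-query collision finder by p · (T)_2 ≤ p · T², and a Monte Carlo estimate of the natural birthday attack indeed stays below that bound:

```python
import random

def collision_attack_success_rate(T, M, trials=2000, seed=0):
    """Estimate the probability that T queries to a random function
    with range size M produce at least one output collision."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        answers = [rng.randrange(M) for _ in range(T)]  # fresh random oracle per trial
        if len(set(answers)) < T:
            hits += 1
    return hits / trials

T, M = 10, 1000
p = 1 / M
estimate = collision_attack_success_rate(T, M)
# Theorem bound for l = 2: success probability <= p * T * (T - 1) <= p * T**2.
assert estimate <= p * T ** 2
```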

We assume without loss of generality that 𝒜:

- is deterministic;
- never makes repeated queries to F, nor outputs non-distinct x_1, …, x_ℓ; and
- if at any point 𝒜 has made queries q_1, …, q_m and received answers a_1, …, a_m such that for some indices i_1, …, i_ℓ ∈ [m] the tuple (q_{i_1}, …, q_{i_ℓ}, a_{i_1}, …, a_{i_ℓ}) is in R, then 𝒜 immediately outputs one such tuple (without making any further queries).
We now define random variables representing the various queries of the algorithm 𝒜 and their responses from the oracle. Let M be a random variable denoting the number of queries made by 𝒜, let Q_1, …, Q_M denote the queries made by 𝒜 to F, let A_1, …, A_M denote the corresponding evaluations of F, let 𝒬 denote the set {Q_1, …, Q_M}, let (X_1, …, X_ℓ) denote the output of 𝒜, and let (Y_1, …, Y_ℓ) denote the corresponding evaluations of F.
We split our analysis into two cases: either (X_1, …, X_ℓ) ∈ 𝒬^ℓ, or not, meaning that the algorithm did not query all of the ℓ elements it outputs. We argue about each case separately and at the end combine the two into a single argument. We begin with the second case.

Lemma.

$$\begin{aligned} \Pr \left[ (X_1, \ldots , X_\ell ,Y_1,\ldots ,Y_\ell ) \in R \; \mid \; (X_1, \ldots , X_\ell ) \notin \mathcal {Q}^\ell \right] \le p \cdot \ell . \end{aligned}$$

Proof Sketch. There exists some component X_i of 𝒜's output whose image under F is independent of 𝒜's view, and thus is uniformly random. Since R is p-sparse, this ensures that (X_1, …, X_ℓ, Y_1, …, Y_ℓ) is in R with probability at most p.
More formally, for any i ∈ [ℓ], any fixed queries q_1, …, q_ℓ, and any fixed answers a_1, …, a_{i−1}, a_{i+1}, …, a_ℓ, sparsity gives

$$\begin{aligned} \Pr [ (q_1, \ldots ,q_{\ell },a_1,\ldots ,a_{i-1},Y_i,a_{i+1},\ldots ,a_\ell ) \in R ]\le p , \end{aligned}$$

and therefore

$$\begin{aligned} \Pr&\left[ (X_1, \ldots , X_\ell ,Y_1,\ldots ,Y_\ell ) \in R \; \mid \; (X_1, \ldots , X_\ell ) \notin \mathcal {Q}^\ell \right] \\&\le \sum _{i \in [\ell ]} \Pr \left[ (X_1, \ldots , X_\ell ,Y_1,\ldots ,Y_\ell ) \in R \; \mid \; X_i \notin \mathcal {Q}\right] \cdot \Pr [X_i \notin \mathcal {Q}]\\&\le \sum _{i \in [\ell ]} \sum _{\begin{array}{c} q_1,\ldots ,q_\ell \\ a_1,\ldots ,a_{i-1},a_{i+1},\ldots ,a_\ell \end{array}} \Pr \left[ (q_1, \ldots ,q_{\ell },a_1,\ldots ,a_{i-1},Y_i,a_{i+1},\ldots ,a_\ell ) \in R \right] \cdot \\&\; \; \; \; \; \Pr [\forall j \ne i : X_j=q_j,Y_j=a_j, X_i=q_i] \cdot \Pr [X_i \notin \mathcal {Q}] \\&\le p \cdot \sum _{i \in [\ell ]} \Pr [X_i \notin \mathcal {Q}] \cdot \sum _{\begin{array}{c} q_1,\ldots ,q_\ell \\ a_1,\ldots ,a_{i-1},a_{i+1},\ldots ,a_\ell \end{array}} \Pr [\forall j \ne i : X_j=q_j,Y_j=a_j, X_i=q_i] \\&\le p \cdot \sum _{i \in [\ell ]} \Pr [X_i \notin \mathcal {Q}] \le p \cdot \ell . \end{aligned}$$
We turn to prove the first case, where all the elements in the algorithm's output were queried. This case is where we pay the factor (T)_ℓ in the probability.

Lemma.

$$\begin{aligned} \Pr \left[ (X_1, \ldots , X_\ell , Y_1, \ldots , Y_\ell ) \in R \; \mid \; (X_1, \ldots , X_\ell ) \in \mathcal {Q}^\ell \right] \le p \cdot (T)_{\ell }. \end{aligned}$$
Proof. For m ∈ [T], let Z_m be the indicator of the event that, after 𝒜's m-th query, there exist indices i_1, …, i_ℓ ∈ [m] with m ∈ {i_1, …, i_ℓ} such that (Q_{i_1}, …, Q_{i_ℓ}, A_{i_1}, …, A_{i_ℓ}) ∈ R. Since A_m = F(Q_m) is uniformly random given 𝒜's view before its m-th answer, p-sparsity and a union bound over tuples give, for every m ∈ [T],

$$\begin{aligned} \Pr [Z_m = 1]&= \Pr [\exists i_1,\ldots ,i_{\ell } \in [m] \text { such that } (Q_{i_1},\ldots ,Q_{i_{\ell }},A_{i_1},\ldots ,A_{i_{\ell }}) \in R \text { and } m \in \{i_1,\ldots ,i_\ell \} ] \\&\le \sum _{i_1,\ldots ,i_{\ell } \in [m], \; m \in \{i_1,\ldots ,i_\ell \}} \Pr [(Q_{i_1},\ldots ,Q_{i_{\ell }},A_{i_1},\ldots ,A_{i_{\ell }}) \in R] \\&\le \sum _{i_1,\ldots ,i_{\ell } \in [m], \; m \in \{i_1,\ldots ,i_\ell \}} p \le p \cdot \ell \cdot (m-1)_{\ell -1} . \end{aligned}$$

By the third simplifying assumption on 𝒜, if (X_1, …, X_ℓ) ∈ 𝒬^ℓ and the output tuple is in R, then Z_m = 1 for some m ∈ [T]. Thus

$$\begin{aligned} \Pr&\left[ (X_1, \ldots , X_\ell , Y_1, \ldots , Y_\ell ) \in R \; \mid \; (X_1, \ldots , X_\ell ) \in \mathcal {Q}^\ell \right] \le \Pr [ \exists m \in [T] : Z_m=1] \\&\le \sum _{m=1}^{T}\Pr [Z_m=1] \le \sum _{m=1}^{T} p \cdot \ell \cdot (m-1)_{\ell -1} \le p \cdot (T)_{\ell } . \end{aligned}$$

Combining the two lemmas yields the theorem:

$$\begin{aligned} \Pr&\left[ (X_1, \ldots , X_\ell , Y_1, \ldots , Y_\ell ) \in R \right] \\&= \Pr \left[ (X_1, \ldots , X_\ell , Y_1, \ldots , Y_\ell ) \in R \; \mid \; (X_1, \ldots , X_\ell ) \in \mathcal {Q}^\ell \right] \cdot \Pr [(X_1, \ldots , X_\ell ) \in \mathcal {Q}^\ell ] \\&+ \Pr \left[ (X_1, \ldots , X_\ell , Y_1,\ldots , Y_\ell ) \in R \; \mid \; (X_1, \ldots , X_\ell ) \notin \mathcal {Q}^\ell \right] \cdot \Pr [(X_1, \ldots , X_\ell ) \notin \mathcal {Q}^\ell ] \\&\le p \cdot (T)_{\ell } \cdot \Pr [(X_1, \ldots , X_\ell ) \in \mathcal {Q}^\ell ] + p \cdot \ell \cdot \Pr [(X_1, \ldots , X_\ell ) \notin \mathcal {Q}^\ell ] \\&\le p \cdot (T)_{\ell } \cdot \Pr [(X_1, \ldots , X_\ell ) \in \mathcal {Q}^\ell ] + p \cdot (T)_{\ell } \cdot \Pr [(X_1, \ldots , X_\ell ) \notin \mathcal {Q}^\ell ] \\&= p \cdot (T)_\ell \le p \cdot T^{\ell } . \end{aligned}$$
4 Our Construction
We have defined the notion of a multi-input correlation intractable hash, and showed that it can be constructed in the random oracle model. We now construct a seeded family of codes that is pseudouniquely decodable against a large fraction of errors, using 2-input correlation intractable hash functions as a central tool (in a black-box way). Loosely speaking, our construction starts with any efficiently list-decodable code ensemble 𝒞, and modifies it in several steps.
- 1. We first apply a decodability- and rate-preserving seeded transformation to 𝒞 to obtain (a seeded family of) stochastic codes in which all pairs of messages are mapped to far-apart codewords with overwhelming probability. Specifically, the seed is (loosely speaking) a pseudorandom permutation π, and the stochastic code maps a message x to C(π(x ∥ y)) for uniformly random y ← {0,1}^ℓ, where ℓ satisfies ℓ = o(k).
- 2. We derandomize these codes by generating the randomness deterministically as a hash of the message, i.e. setting y = H(x).
More formally, we will consider the following parameterized construction of a seeded code family. Suppose 𝒞 = {C_k} is a fully explicit ensemble of codes, Π = {Π_k} is a fully explicit ensemble of permutation families on {0,1}^k, and ℋ = {ℋ_k} is a fully explicit ensemble of hash function families, where functions in ℋ_k map {0,1}^{k−ℓ} to {0,1}^ℓ for some ℓ = ℓ(k) satisfying ℓ(k) = o(k).
The seeded family SC[𝒞, Π, ℋ] = (Setup, Enc) is defined as follows:

- Setup takes 1^k as input, samples π ← Π_k and H ← ℋ_k, and outputs pp = (π, H).
- Enc takes pp = (π, H) as input, as well as a message m ∈ {0,1}^{k−ℓ}, and outputs C_k(π(m ∥ H(m))).
SC[𝒞, Π, ℋ] inherits several basic properties from 𝒞, including alphabet size and block length. We only consider hash family ensembles ℋ in which the output length ℓ(k) of functions in ℋ_k satisfies ℓ(k) = o(k). With such parameters, the resulting coding scheme SC[𝒞, Π, ℋ] has the same rate as 𝒞.
4.1 From 2-Input Correlation Intractability to Pseudodistance
In this section, we show that if 𝒞 is a sufficiently good ensemble of codes, ℋ is a two-input correlation intractable hash family with an appropriate output length, and Π is sufficiently pseudorandom, then SC[𝒞, Π, ℋ] has high pseudodistance.
Proposition 1. Suppose that:

- 𝒞 is a rate-r (combinatorially) (ρ, t)-list decodable ensemble of codes;
- Π is an ensemble of (t+1)-wise O(1)-dependent (in Rényi ∞-divergence) permutation families;
- ℋ is an ensemble of hash families with output length ℓ = ℓ(k) satisfying ℓ(k) = o(k) that is 2-input correlation intractable for all (t · 2^{−ℓ})-sparse relations;
- δ satisfies H_q(δ) < H_q(ρ) + r.

Then SC[𝒞, Π, ℋ] has relative pseudodistance δ.

Proof. Given pp = (π, H), an adversary that finds distinct messages m_0, m_1 whose encodings are δ-close in particular finds a tuple in the 2-input relation

R^close_{𝒞,π,δ,ℓ} = { (x_1, x_2, y_1, y_2) : x_1 ∥ y_1 ≠ x_2 ∥ y_2 and C(π(x_1 ∥ y_1)) ≈_δ C(π(x_2 ∥ y_2)) },

namely (m_0, m_1, H(m_0), H(m_1)). To finish the proof of Proposition 1, it suffices to show that this relation is sparse with high probability (over the choice of π), which is established by the following claim.
Claim. Suppose that:

- 𝒞 is a rate-r combinatorially (ρ, t)-list decodable ensemble of codes;
- ℓ = ℓ(k) satisfies ℓ(k) = o(k);
- δ satisfies H_q(δ) < H_q(ρ) + r.

Then with all but 2^{−Ω(k)} probability over π ← Π_k, the relation R^close_{𝒞,π,δ,ℓ} is (t · 2^{−ℓ})-sparse.
Proof (of the above claim). For simplicity of presentation, we omit the explicit dependencies of C_k, q_k, n_k, t_k, ℓ_k, and δ_k on k, simply writing C, q, n, t, ℓ, and δ respectively in statements that are to be interpreted as holding for all sufficiently large k.

Fix any x_1, x_2 ∈ {0,1}^{k−ℓ} and y_1, y_2 ∈ {0,1}^ℓ with x_1 ∥ y_1 ≠ x_2 ∥ y_2, and write C(x, y) as shorthand for C(π(x ∥ y)). Let B ⊆ [q]^n denote the set of strings that are δ-close to C(x_1, y_1). By (ρ, t)-list decodability, B can be covered by poly(n) · q^{n(H_q(δ) − H_q(ρ))} Hamming balls of radius ρ · n, each of which contains at most t ≤ poly(k) codewords, while a uniformly random domain element maps to any fixed codeword with probability at most q^{−nr(1 − o(1))}. Hence

$$\begin{aligned} \Pr _{c \leftarrow C} \left[ c \approx _\delta C(x_1, y_1) \right] \le \mathrm{poly}(k) \cdot \mathrm{poly}(n) \cdot q^{n \cdot \big (H_q(\delta ) - H_q(\rho ) \big )} \cdot q^{-nr} \le q^{-\varOmega (n)} \le 2^{-\varOmega (k)}. \end{aligned}$$

Since Π is O(1)-dependent in Rényi ∞-divergence, replacing π(x_2 ∥ y_2) by a fresh uniformly random domain element changes probabilities by at most a constant factor c = O(1), so

$$\begin{aligned} \Pr _{\pi }[(x_1, x_2, y_1, y_2) \in \mathcal {R}^{\mathsf {close}}_{\mathcal {C}, \pi , \delta , \ell }] \le c \cdot \Pr _{c \leftarrow C} \left[ c \approx _\delta C(x_1, y_1) \right] \le 2^{-\varOmega (k)} . \end{aligned}$$






Next, fix x_1, x_2, and y_1, and write μ = 2^{−Ω(k)} for the bound above. The events corresponding to distinct values of y_2 are (t+1)-wise O(1)-dependent, so the probability (over π) that more than t of the 2^ℓ values of y_2 land in the relation satisfies

$$\begin{aligned} \Pr _\pi \left[ \Pr _{y_2 \leftarrow \{0,1\}^\ell } \left[ (x_1, x_2, y_1, y_2) \in \mathcal {R}^{\mathsf {close}}_{\mathcal {C}, \pi , \delta , \ell } \right] \ge \frac{t+1}{2^\ell } \right] \le O \left( \frac{\mu ^t}{(t+1)!} \right) \le O\left( \mu ^t \right) . \end{aligned}$$

Taking a union bound over all choices of x_1, x_2, y_1 (for a suitable choice of t), with all but 2^{−Ω(k)} probability over π we have, for every x_1, x_2, y_1,

$$\begin{aligned} \Pr _{y_2 \leftarrow \{0,1\}^\ell } \left[ (x_1, x_2, y_1, y_2) \in \mathcal {R}^{\mathsf {close}}_{\mathcal {C}, \pi , \delta , \ell } \right] \le \frac{t}{2^\ell }. \end{aligned}$$

By symmetry, the analogous bound holds when y_1 is random and y_2 is fixed:

$$\begin{aligned} \Pr _{y_1 \leftarrow \{0,1\}^\ell } \left[ (x_1, x_2, y_1, y_2) \in \mathcal {R}^{\mathsf {close}}_{\mathcal {C}, \pi , \delta , \ell } \right] \le \frac{t}{2^\ell }. \end{aligned}$$

Together these show that R^close_{𝒞,π,δ,ℓ} is (t · 2^{−ℓ})-sparse with all but 2^{−Ω(k)} probability over π.

This concludes the proof of Proposition 1.
4.2 From Efficient List Decodability to Pseudounique Decodability
We next observe that if 𝒞 is efficiently list decodable, then so is SC[𝒞, Π, ℋ] (as long as Π and ℋ are fully explicit). We show that this, combined with the high pseudodistance that we have already established, implies that SC[𝒞, Π, ℋ] has a pseudounique decoding algorithm against a large fraction of errors.
We first define the straightforward adaptation of list decoding for seeded families of codes.
We say that 𝒟 is a (δ, L)-list decoding algorithm for a seeded family of codes (Setup, Enc) if for all pp in the support of Setup, all messages m, and all y with Δ(y, Enc(pp, m)) ≤ δ · n, the output 𝒟(pp, y) is an L(k)-sized set that contains m. We say that 𝒟 is simply a δ-list decoding algorithm if it is a (δ, L)-list decoding algorithm for some polynomially bounded L.
We say that (Setup, Enc) is efficiently (δ, L)-list decodable if there exists a polynomial-time (δ, L)-list decoding algorithm for (Setup, Enc).
Proposition. If 𝒞 is efficiently (δ, L)-list decodable, and Π and ℋ are fully explicit, then SC[𝒞, Π, ℋ] is efficiently (δ, L)-list decodable as well.


Proof Sketch. On input pp = (π, H) and a word y, the list decoding algorithm for SC[𝒞, Π, ℋ] works by:

- 1. Running the list-decoding algorithm for 𝒞 on y to obtain strings z_1, …, z_L,
- 2. Inverting each z_i under π to obtain pairs (m_i, y_i) with π(m_i ∥ y_i) = z_i,
- 3. Outputting the set {m_1, …, m_L}.

Theorem 5. Suppose that:

- SC[𝒞, Π, ℋ] is efficiently list-decodable against a δ fraction of errors; and
- SC[𝒞, Π, ℋ] has relative pseudodistance δ₀.

Then SC[𝒞, Π, ℋ] is efficiently pseudouniquely decodable against a min(δ, δ₀/2) fraction of errors.

Proof. The pseudounique decoding algorithm, on input pp and y ∈ [q]^n, works as follows:

- 1. Run the list-decoding algorithm for SC[𝒞, Π, ℋ] on (pp, y) to obtain a list of messages m_1, …, m_L (and corresponding codewords c_i = Enc(pp, m_i)).
- 2. Output the m_i for the i minimizing Δ(c_i, y).
This algorithm clearly runs in polynomial time, so it suffices to analyze correctness. Suppose we have (m, y) ← 𝒜(pp), where 𝒜 is a polynomial-size adversary and Δ(y, Enc(pp, m)) ≤ d for d ≤ min(δ, δ₀/2) · n. We first observe that m = m_i for some i, by the list-decodability of SC[𝒞, Π, ℋ]. No other m_j can also have Δ(c_j, y) ≤ d, because otherwise we would have Δ(c_i, c_j) ≤ 2d by the triangle inequality. This contradicts SC[𝒞, Π, ℋ]'s pseudodistance, since the above process for generating (m_i, m_j) is efficient.

In other words, Enc(pp, m) is the unique codeword within distance d of y, and the decoding algorithm outputs m as desired.
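The decoding procedure above can be sketched as follows (an illustrative toy of our own: `list_decode` and `encode` are hypothetical stand-ins, here instantiated with an exhaustive search over a three-word code):

```python
def hamming(u, v):
    return sum(1 for a, b in zip(u, v) if a != b)

def pseudounique_decode(pp, y, list_decode, encode, d):
    """Run the list decoder, then output the candidate whose codeword is closest to y.
    By the pseudodistance argument, at most one candidate can be within d of y."""
    candidates = list_decode(pp, y)
    within = [m for m in candidates if hamming(encode(pp, m), y) <= d]
    assert len(within) <= 1  # two such candidates would give 2d-close codewords
    return min(candidates, key=lambda m: hamming(encode(pp, m), y))

# Toy instantiation: identity "code" over 5-symbol strings, exhaustive list decoding.
msgs = [(0,) * 5, (1,) * 5, (2,) * 5]
enc = lambda pp, m: m
ld = lambda pp, y: [m for m in msgs if hamming(m, y) <= 2]
assert pseudounique_decode(None, (1, 1, 0, 1, 1), ld, enc, 1) == (1,) * 5
```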
4.3 Main Theorem
We are now ready to state our main theorem:
Theorem 6. Suppose that:

- 𝒞 is a rate-r, efficiently (ρ, t)-list decodable, fully explicit ensemble of codes;
- Π is a fully explicit ensemble of (t+1)-wise O(1)-dependent (in Rényi ∞-divergence) permutation families;
- ℋ is an ensemble of 2-input correlation intractable hash families, where functions in ℋ_k map {0,1}^{k−ℓ} to {0,1}^ℓ for ℓ = o(k);
- δ satisfies H_q(δ) < H_q(ρ) + r.

Then SC[𝒞, Π, ℋ] is a rate-r seeded family of codes that is efficiently pseudouniquely decodable against a min(ρ, δ/2) fraction of errors.
4.4 Instantiations with Known Codes
Finally, we apply Theorem 6 with some known codes, first recalling applicable results from coding theory. We focus on large alphabets (q = poly(n)) and binary alphabets (q = 2).

Theorem ([GR08]). For all r, ε satisfying 0 < r < r + ε < 1, there is a rate-r, efficiently (1 − r − ε, L)-list decodable, fully explicit ensemble of codes {C_k} with alphabet size q_k = poly(n_k) and list size L = poly(k).
Plugging these codes into Theorem 6, we get
Corollary 2. For all r, ε with 0 < r < 1 and ε > 0, there is a rate-r seeded family of codes (with alphabet size poly(n)) that is efficiently pseudouniquely decodable against a min(1 − r, 1/2) − ε fraction of errors.
This result should be contrasted with the Singleton bound, which states that if a rate-r code is uniquely decodable against a δ fraction of errors, then δ ≤ (1 − r)/2.
Corollary 3. For all r ∈ (0, 1) and all ε > 0, there is a rate-r seeded family of binary codes that is efficiently pseudouniquely decodable against a fraction of errors obtained by instantiating Theorem 6 with the best known efficiently list-decodable binary codes; since these codes do not achieve list-decoding capacity, the resulting bound is messier than in the large-alphabet case.
This work was done (in part) while the authors were visiting the Simons Institute for the Theory of Computing. Eylon Yogev is funded by the ISF grants 484/18, 1789/19, Len Blavatnik and the Blavatnik Foundation, and The Blavatnik Interdisciplinary Cyber Research Center at Tel Aviv University.