the PCP Theorem in Lean

3 Exponential-length PCP

Definition 3.0.1
#

A system of \(m\) quadratic equations in \(n\) variables over a field \(\mathbb {F}\) is a list of polynomials \(p_1, \ldots , p_m \in \mathbb {F}[x_1, \ldots , x_n]\) where each \(p_i\) has total degree at most \(2\).

\[ \mathrm{QESAT}(\mathbb {F},n) := \{ (p_1, \ldots , p_m) \mid \exists a_1, \ldots , a_n \in \mathbb {F}, \, \forall i \in [m], \, p_i(a_1, \ldots , a_n) = 0 \} . \]

For example, \((x + 1, xy + z) \in \mathrm{QESAT}(\mathbb {F}_2,3)\).

Definition 3.0.2
#

A PCP verifier is a probabilistic oracle Turing machine \(V\) with query access to a function \(\pi : [\ell (|x|)] \to \Sigma \).

Definition 3.0.3
#

A language \(L \subseteq \{ 0,1\} ^*\) is in \(\mathrm{PCP}[\varepsilon _c, \varepsilon _s, \Sigma , \ell , q, r]\) if there exists a polynomial-time PCP verifier \(V\) such that for every \(x \in \{ 0,1\} ^*\), \(V\) makes at most \(q(|x|)\) queries to the proof oracle, uses at most \(r(|x|)\) bits of randomness, and the following holds:

  • Completeness: If \(x \in L\), then \(\exists \pi \in \Sigma ^{\ell (|x|)}, \, \Pr \left[V^\pi (x)=1 \right] \geq 1 - \varepsilon _c\).

  • Soundness: If \(x \notin L\), then \(\forall \widetilde{\pi } \in \Sigma ^{\ell (|x|)},\, \Pr \left[V^{\widetilde{\pi }}(x)=1 \right] \leq \varepsilon _s\).

The following theorem is the main goal of this chapter.

Theorem 3.0.4
\[ \mathrm{QESAT}(\mathbb {F}_2,n) \in \mathrm{PCP}[ \varepsilon _c = 0, \varepsilon _s = 1/2, \Sigma = \mathbb {F}_2, \ell = \exp (|x|), q = O(1), r = |x|^{O(1)} ]. \]
Proof
Definition 3.0.5
#

A LPCP verifier is a probabilistic oracle Turing machine \(V\) with query access to a linear function \(\langle \pi ,\cdot \rangle : \Sigma ^{\ell (|x|)} \to \Sigma \).

Definition 3.0.6
#

A language \(L \subseteq \{ 0,1\} ^*\) is in \(\mathrm{LPCP}[\varepsilon _c, \varepsilon _s, \Sigma , \ell , q, r]\) if there exists a polynomial-time LPCP verifier \(V\) such that for every \(x \in \{ 0,1\} ^*\), \(V\) makes at most \(q(|x|)\) queries to the linear proof oracle, uses at most \(r(|x|)\) bits of randomness, and the following holds:

  • Completeness: If \(x \in L\), then \(\exists \pi \in \Sigma ^{\ell (|x|)}, \, \Pr \left[V^{\langle \pi , \cdot \rangle }(x)=1 \right] \geq 1 - \varepsilon _c\).

  • Soundness: If \(x \notin L\), then \(\forall \widetilde{\pi } \in \Sigma ^{\ell (|x|)},\, \Pr \left[V^{\langle \widetilde{\pi }, \cdot \rangle }(x)=1 \right] \leq \varepsilon _s\).

3.1 Linear PCP for linear equations

Definition 3.1.1
#

For a field \(\mathbb {F}\) and parameters \(m, n \in \mathbb {N}\), we define the language

\[ \mathrm{LINEQ}(\mathbb {F},m,n) := \{ (M, c) \in \mathbb {F}^{m\times n} \times \mathbb {F}^{m} \mid \exists b\in \mathbb {F}^n, \, Mb = c \} . \]
Definition 3.1.2
#

Let \(V_{\mathrm{LINEQ}(\mathbb {F},m,n)}^{\langle b, \cdot \rangle } (M,c)\) be the LPCP verifier defined as follows:

  1. Sample \(r \leftarrow \mathbb {F}^m\).

  2. Query \(b\in \mathbb {F}^n \) at \(u:= M^{\intercal }r \in \mathbb {F}^n\).

  3. Accept if and only if \(\langle b,u\rangle = \langle c, r\rangle \).

Theorem 3.1.3
\[ \mathrm{LINEQ}(\mathbb {F},m,n) \in \mathrm{LPCP}\left[\varepsilon _c=0, \varepsilon _s= 1 / |\mathbb {F}|, \Sigma = \mathbb {F}, \ell = |x|, q = 1, r = m \log |\mathbb {F}|\right]. \]
Proof

Consider the LPCP verifier \(V_{\mathrm{LINEQ}(\mathbb {F},m,n)}^{\langle b, \cdot \rangle } (M,c)\).
Completeness: If \(Mb =c\), then for every \(r \in \mathbb {F}^m\), we have

\[ \langle b, u \rangle = b^{\intercal } (M^{\intercal } r) = (Mb)^{\intercal } r = \langle c , r\rangle . \]

Soundness: If \(Mb \neq c\), then

\begin{gather*} \Pr _{r\leftarrow \mathbb {F}^m} \left[\langle b, u \rangle = \langle c , r\rangle \right] = \Pr _{r\leftarrow \mathbb {F}^m} \left[\langle Mb, r \rangle = \langle c , r\rangle \right] \\ = \Pr _{r\leftarrow \mathbb {F}^m} \left[ \sum _{i=1}^{m} (Mb - c)_{i} \cdot r_{i} = 0\right] \leq 1 / |\mathbb {F}|, \end{gather*}

where the last inequality follows from the Schwartz-Zippel lemma.

3.2 Linear PCP for tensor checks

Definition 3.2.1
#

For a field \(\mathbb {F}\), vectors \(a \in \mathbb {F}^n\) and \(b \in \mathbb {F}^m\), the tensor product \(a \otimes b \in \mathbb {F}^{n \times m}\) is the matrix defined by

\[ (a \otimes b)_{i,j} := a_i \cdot b_j \quad \forall i \in [n],\, j \in [m]. \]

We write \(\mathrm{flat}(a \otimes b) \in \mathbb {F}^{n \cdot m}\) for the row-concatenation of \(a \otimes b\).

For our purposes, we would love to test to check if \(\mathrm{flat}(a \otimes a) = b\)

Definition 3.2.2
#

For a field \(\mathbb {F}\) and \(n \in \mathbb {N}\), define the tensor test language

\[ \mathrm{TENSORQ}(\mathbb {F}, n) := \bigl\{ (a, b) \in \mathbb {F}^n \times \mathbb {F}^{n^2} \mid b = \mathrm{flat}(a \otimes a) \bigr\} . \]

For every finite field \(\mathbb {F}\) and \(n \in \mathbb {N}\),

\[ \mathrm{TENSORQ}(\mathbb {F}, n) \in \mathrm{LPCP}\! \left[ \varepsilon _c = 0,\; \varepsilon _s = \frac{2|\mathbb {F}|-1}{|\mathbb {F}|^2},\; \Sigma = \mathbb {F},\; \ell = n + n^2,\; q = 2,\; r = 2n \cdot \log (|\mathbb {F}|) \right]. \]
Proof

Consider the following LPCP verifier \(V^{\langle a,\cdot \rangle , \langle b,\cdot \rangle }\) proceeds as follows:

  1. Sample \(s, t \leftarrow \mathbb {F}^n\).

  2. Query \(a\) at \(s\) and at \(t\), obtaining \(\langle a, s \rangle \) and \(\langle a, t \rangle \).

  3. Query \(b\) at \(\mathrm{flat}(s \otimes t)\), obtaining \(\langle b, \mathrm{flat}(s \otimes t) \rangle \).

  4. Check if \(\langle b, \mathrm{flat}(s \otimes t) \rangle = \langle a, s \rangle \cdot \langle a, t \rangle \).

Completeness. Suppose \(b = \mathrm{flat}(a \otimes a)\). Then \(\forall s, t \in \mathbb {F}^n\),

\begin{align*} \langle b,\, \mathrm{flat}(s \otimes t) \rangle & = \langle \mathrm{flat}(a \otimes a),\, \mathrm{flat}(s \otimes t) \rangle = \sum _{i,j \in [n]} a_i a_j s_i t_j \\ & = \Bigl(\sum _{i \in [n]} a_i s_i\Bigr) \cdot \Bigl(\sum _{j \in [n]} a_j t_j\Bigr) = \langle a, s \rangle \cdot \langle a, t \rangle , \end{align*}

Soundness. Suppose \(b \neq \mathrm{flat}(a \otimes a)\), i.e., there exists \(i^*, j^* \in [n]\) such that \(b_{i^* j^*} \neq a_{i^*} \cdot a_{j^*}\). For each \(i \in [n]\) define the linear polynomial \(p_i(t) := \sum _{j \in [n]} (b_{ij} - a_i a_j)\, t_j\). The check can be rewritten as

\[ \langle b, \mathrm{flat}(s \otimes t) \rangle - \langle a,s\rangle \langle a,t\rangle = \sum _{i \in [n]} \sum _{j \in [n]} (b_{ij} - a_i a_j)\, s_i t_j = \sum _{i \in [n]} p_i(t)\, s_i. \]

Immediate application of Polynomial Identity Testing yields a lower bound of \(1 - \tfrac {2}{|\mathbb {F}|}\) but we can do better. Since \(b_{i^* j^*} \neq a_{i^*} a_{j^*}\), the polynomial \(p_{i^*}\) is nonzero, so by the Schwartz–Zippel lemma applied to \(t\),

\[ \Pr _{t}\! \bigl[p_{i^*}(t) \neq 0\bigr] \geq 1 - \tfrac {1}{|\mathbb {F}|}. \]

Conditioned on \(p_{i^*}(t) \neq 0\), the expression \(\sum _i p_i(t)\, s_i\) is a nonzero linear polynomial in \(s\), so by the Schwartz–Zippel lemma applied to \(s\),

\[ \Pr _{s \leftarrow \mathbb {F}^n}\! \Bigl[\sum _i p_i(t)\, s_i \neq 0 \; \Big|\; p_{i^*}(t) \neq 0\Bigr] \geq 1 - \tfrac {1}{|\mathbb {F}|}. \]

Hence

\[ \Pr _{s,t}\! \bigl[\text{verifier rejects}\bigr] \; \geq \; \Bigl(1 - \tfrac {1}{|\mathbb {F}|}\Bigr)^{\! 2}, \]

and therefore

\[ \Pr _{s,t}\! \bigl[\text{verifier accepts}\bigr] \; \leq \; 1 - \Bigl(1 - \tfrac {1}{|\mathbb {F}|}\Bigr)^{\! 2} = \frac{2|\mathbb {F}|-1}{|\mathbb {F}|^2}. \qedhere \]
Corollary 3.2.4
\[ \mathrm{TENSORQ}(\mathbb {F}_2, n) \in \mathrm{LPCP}\! \left[ \varepsilon _c = 0,\; \varepsilon _s = \frac{3}{4},\; \Sigma = \mathbb {F}_2,\; \ell = n + n^2,\; q = 2,\; r = 2 \log (2) \cdot n \right]. \]
Proof

Immediate.

3.3 Linear PCP to PCP