Chunking

Summary

This RFC proposes a modification to the Kimchi proof system and the pickles recursion layer to increase the circuit size limit by splitting the polynomials from a circuit into 'chunks' which are less than the hard limit of 2^16 that Mina / SnarkyJS supports.

Motivation

The motivation, assessment criteria, and exit criteria can be found in the PRD.

This proposal raises the maximum 'circuit size' limit for kimchi proofs from 2^16 to 2^32. Out-of-scope is further work which would remove this limit, since this limit will not currently be reachable without other improvements to memory usage and performance.

This proposal optimizes for compatibility with the kimchi and pickles versions proposed for deployment in the Berkeley hard-fork, as well as the proposed zkApps feature, at the expense of some potential loss of performance when the feature is in use. Timing measurements can be found in the PR description for the prototype implementation.

Detailed design

Theory overview

Polynomial commitments in kimchi work by representing a polynomial's coefficients 'in the exponent' of an elliptic curve point, using a list of generators. For example, we can represent the polynomial

f(x) = a_0 x^0 + a_1 x^1 + ... + a_n x^n

by the curve point

g_0^{a_0} + g_1^{a_1} + ... + g_n^{a_n}

for some shared public generators g_0, ..., g_n. These commitments are integral to the kimchi proof system, where we use the values of the polynomials to represent the list of values that appear in the proof. For example, the list

[b_0, b_1, ..., b_{n-1}]

is represented by the polynomial B over the domain generated by w as

B(w^0) = b_0
B(w^1) = b_1
...
B(w^{n-1}) = b_{n-1}

The polynomial B of degree n-1 is uniquely determined by these equations, and the domain generator satisfies w^n = 1, w^i != 1 for 0 < i < n.

In order to support circuits of a given length, we need to represent the values at each step (or 'gate') in the proof, and thus to represent them as a polynomial of the degree. However, kimchi+pickles as proposed for the Berkeley hard fork are limited to 2^16 generators for performance.

This RFC proposes splitting the polynomial into chunks when we build the commitment. For example, we could represent the polynomial

f(x) = a_0 x^0 + a_1 x^1 + ... + a_{4*n-1} x^{4*n-1}

by the commitments

C_0 = g_0^{a_0} + g_1^{a_1} + ... + g_n^{a_{n-1}}
C_1 = g_0^{a_n} + g_1^{a_{n+1}} + ... + g_n^{a_{2n-1}}
C_2 = g_0^{a_{2n}} + g_1^{a_{2n+1}} + ... + g_n^{a_{3n-1}}
C_3 = g_0^{a_{3n}} + g_1^{a_{3n+1}} + ... + g_n^{a_{4n-1}}

which is equivalent to committing to the 4 'chunked' polynomials

f_0(x) = a_0 x^0 + a_1 x^1 + ... + a_{n-1} x^{n-1}
f_1(x) = a_n x^0 + a_{n+1} x^1 + ... + a_{2n-1} x^{n-1}
f_2(x) = a_{2n} x^0 + a_{2n+1} x^1 + ... + a_{3n-1} x^{n-1}
f_3(x) = a_{3n} x^0 + a_{3n+1} x^1 + ... + a_{4n-1} x^{n-1}

that we can recompose to form f

f(x) = f_0(x) + x^n * f_1(x) + x^{2n} * f_2(x) + x^{3n} * f_3(x)

When we check the constraint polynomial at a 'random' point z later in the proof, we can build a commitment that matches that point by computing

C = C_0 + C_1^{z^n} + C_2^{z^{2n}} + C_3^{z^{3n}}

while also checking that the prover's claimed evaluations are consistent with the commitments:

[( f_0(z), C_0 ), ( f_1(z), C_1 ), ( f_2(z), C_2 ), ( f_3(z), C_3 )]

Finally, we can compute the value f(z) by proxy:

f(z) = f_0(z) + z^n f_1(z) + z^{2n} f_2(z) + z^{3n} f_3(z)

Thus, this chunking approach provides all of the features that we need to support the kimchi proof system over a larger domain size than the current Mina limit.

Retaining zero-knowledge

After we have c > 1 chunks, every column will come with 2 * c evaluations, 1 for each chunk at each of z and z * w. In order to retain zero-knowledge, we need to ensure that we randomize more columns than there are evaluations. Thus, we must randomize at least 2 * c + 1 values at the end of each column. The helpers for adding zero-knowledge to columns must be updated to take the number of zk rows that are required as an argument.

Lookup argument

The VanishesOnLast4Rows constructor in the expression framework should be replaced with a VaishesOnLastZkPlus1Rows, and the number zk rows should be made available from the expression framework's environment.

The logic for the lookup argument will need to be tweaked to use a zk rows parameter instead of the hard-coded 3. The constraints (e.g. lagrange selectors) should be updated to use an abstract zk_rows concept that can be passed through to OCaml via the auto-generated code.

Permutation argument

Recall that the permutation argument has 2 boundary constraints

L_0(x) * (aggregation(x) - 1)
L_{n-zk_rows}(x) * (aggregation(x) - 1)

and an aggregation constraint

zk_polynomial(x) *
( aggregation(x)
    * (w_0(x) + gamma + x * beta * shift_0)
    * (w_1(x) + gamma + x * beta * shift_1)
    * (w_2(x) + gamma + x * beta * shift_2)
    * (w_3(x) + gamma + x * beta * shift_3)
    * (w_4(x) + gamma + x * beta * shift_4)
    * (w_5(x) + gamma + x * beta * shift_5)
    * (w_6(x) + gamma + x * beta * shift_6)
  -
    * aggregation(x * omega)
    * (w_0(x) + gamma + sigma_0 * beta)
    * (w_1(x) + gamma + sigma_1 * beta)
    * (w_2(x) + gamma + sigma_2 * beta)
    * (w_3(x) + gamma + sigma_3 * beta)
    * (w_4(x) + gamma + sigma_4 * beta)
    * (w_5(x) + gamma + sigma_5 * beta)
    * (w_6(x) + gamma + sigma_6 * beta) )

Each of the polynomials w_i is defined by n evaluation points -- one for each element of the domain -- which gives us deg(w_i) = n-1, and similarly so for aggregation. Further, deg(zk_polynomial) has degree zk_rows. Therefore, the whole aggregation constraint has degree

8*(n-1) + zk_rows = 8*n + (zk_rows - 8)

We compute the constraint polynomials over the d8 domain -- that is, the domain with 8n points -- so the maximum polynomial that it can represent has degree 8n-1. Thus, in order for the constraint to be computable over the d8 domain, we require that

8*n + (zk_rows - 8) <= 8*n - 1
zk_rows <= 7

which was fine without chunking, since we had hard-coded zk_rows = 3. However, when we have c chunks for c >= 4, we require zk_rows >= 9, which blows through the d8 domain size limit and prevents us from forming a valid proof.

This prevents us from adding the zk_rows of randomness to the aggregation polynomial, since we cannot create a large enough zk_polynomial to mask out the relevant values.

However, we can choose the values at n-zk_rows+1, and n-zk_rows+2 at random. The verifier may still be able to guess the ratio aggregation(x * omega)/aggregation(x) at every non-randomised point in the domain, but they will have negligible (~1/2^254) probability of choosing the actual value of aggregation(x) for all values between n-zk_rows+1 and n-1.

Therefore, we will modify the permutation argument to use the constraints:

L_0(x) * (aggregation(x) - 1)
L_{n-zk_rows}(x) * (aggregation(x) - 1)

and

(x - omega^(n-zk_rows+1)) *
(x - omega^(n-zk_rows+2))
(x - omega^(n-1))
( aggregation(x)
    * (w_0(x) + gamma + x * beta * shift_0)
    * (w_1(x) + gamma + x * beta * shift_1)
    * (w_2(x) + gamma + x * beta * shift_2)
    * (w_3(x) + gamma + x * beta * shift_3)
    * (w_4(x) + gamma + x * beta * shift_4)
    * (w_5(x) + gamma + x * beta * shift_5)
    * (w_6(x) + gamma + x * beta * shift_6)
  -
    * aggregation(x * omega)
    * (w_0(x) + gamma + sigma_0 * beta)
    * (w_1(x) + gamma + sigma_1 * beta)
    * (w_2(x) + gamma + sigma_2 * beta)
    * (w_3(x) + gamma + sigma_3 * beta)
    * (w_4(x) + gamma + sigma_4 * beta)
    * (w_5(x) + gamma + sigma_5 * beta)
    * (w_6(x) + gamma + sigma_6 * beta) )

In addition, we should 'zero out' the sigma_i terms between n-zk_rows+1 and n-zk_rows+2, to ensure that the randomness in the zk rows for w_i contributes to the aggregation.

In total, this modified permutation argument achieves the same zero knowledge

Number of zk rows

The permutation argument interacts with the c chunks in parallel, so it is possible to cross-correlate between them to compromise zero knowledge. We know that there is some c >= 1 such that zk_rows = 2*c + k from the above. Thus, attempting to find the evaluation at a new point, we find that:

the evaluation of every witness column in the permutation contains k unknowns;
the evaluations of the permutation argument aggregation has k-1 unknowns;
the permutation argument applies on all but zk_rows - 3 rows;
and thus we form the equation zk_rows - 3 < 7 * k + (k - 1) to ensure that we can construct fewer equations than we have unknowns.

This simplifies to k > (2 * c - 2) / 7, giving zk_rows > (16 * c - 2) / 7. We can derive c from the max_poly_size supported by the URS, and thus we find zk_rows and domain_size satisfying the fixpoint

zk_rows = (16 * (domain_size / max_poly_size) + 5) / 7
domain_size = circuit_size + zk_rows

Technical approach in kimchi

Support for chunked commitments already exists in kimchi's supporting polynomial commitment library. This RFC proposes to extend that support through to the kimchi prover, where an attempt to use the feature currently results in an error. This RFC also proposes to add support for chunked commitments and evaluations to pickles, where there is currently no support.

Value columns in the Mina proof are represented in evaluation form, as a list of values. In order to support efficient evaluations over these values, the LagrangeBasisEvaluations structure will need to be generalized to support evaluations of the 'chunked lagrange basis'. The approach to achieve this is equivalent to the one used in the SRS's add_lagrange_basis function, but using the explicit powers of z (z^0, z^1, ..., z^n) in place of the SRS curve points. This uses an iFFT, so will result in a performance degradation from linear time (O(n)) to O(c n log(n)) when chunking is in use, where c is the number of chunks.

The commit_non_hiding function will need to be updated to emit commitments of a fixed size, where the current implementation will attempt to 'skip' zero segments. This ensures that the layout of proofs is homogeneous, so that they can be verified by the same pickles recursion circuit.

The verifier logic for the public input commitment will need to use the chunked lagrange basis, to be consistent with how we will generate them on the prover side.

To avoid the O(c n log(n)) cost of computing the chunked evaluation for the public input polynomial in the verifier, the prover should pass this evaluation as part of the proof. In order to avoid breaking compatibility with zkApps, this 'public input' field should be optional, as proofs submitted in zkApp commands will not have this field available. This value is already considered part of the transcript for the purposes of pickles, so there is no soundness risk of using a value provided by a malicious prover.

After all of the above changes are complete, the error checks in kimchi's prover and verifier can be removed, and chunking can be considered enabled in kimchi. This should be accompanied by a test that exercises this behavior.

Technical approach in OCaml-to-rust bindings

The polynomial commitments and evaluations contained in the proofs and the verifier keys will need to be represented as arrays instead of single values. Several of the OCaml type conversion helpers assume that these only ever contain 1 element. The autogenerated OCaml types should be used to validate this assumption, and to guide the necessary changes if useful.

The additional public input evaluation must not appear in the types already serialized as part of zkApps, including the list of evaluations, where they most naturally would fit. They should be passed in a separate structure, and should be optional as on the rust side.

Technical approach in pickles

The pickles code must be updated to reflect the changes to the bindings, which will propagate arrays into several types. The logic for handling these will then need to be updated to handle these cases, in particular the ft_comm calculation will need to be updated to match the kimchi logic, absorbing the evaluations in the step verifier and the wrap verifier, and the check_bulletproof logic for step and wrap.

In order to represent chunked proofs in the pickles circuits, it will also be necessary to update the Typ.ts for proofs to allow a flexible number of chunks for both step and wrap proofs. The Commitment_lengths and Evaluation_lengths constants can be modified to cause compiler warnings in most of the relevant locations. The representation of unchunked proofs must not change, including, but not limited to, reordering of any fields. The lengths passed to these Typ.ts must be inferrable from the 'tag' that pickles associates with each collection of inductive rules; as such, they should be stored in the Types_map system along with all similar metadata.

Due to the requirement for proof layout homogeneity, the Pickles.compile interface must take parameters for the step_num_chunks and wrap_num_chunks at the top level. These arguments may be optional labelled arguments, since they will not be regularly used in the protocol circuits. These arguments will have to be passed explicitly by the user / caller when needed, since it will be prohibitively expensive to try all of the various options programmatically on every compile.

The Pickles.compile function must be modified to check for the number of chunks in each inductive rule's previous rule tags. When these numbers are not consistent between inductive rules, the code must throw an error in order to achieve layout homogeneity.

Prototype PRs

Test Plan and Functional Requirements

Testing Goals and Objectives:

A SnarkyJS proof is able to achieve use more than the previous circuit size limit.
A SnarkyJS proof using more than the previous circuit size limit is able to be applied in a zkApp transaction by Berkeley.

Testing Approach:

Unit testing, at each of the kimchi, pickles, and SnarkyJS layers.

Testing Scope:

That proofs are produced, and that they verify, in kimchi, pickles, SnarkyJS, and Berkeley.

Testing Requirements:
Testing Resources:

Additional CI time on each PR to proof-systems, mina, and SnarkyJS.

Drawbacks

The existing recursion layer in pickles allows most zkApps that would hit the limit to be restructured in a way that doesn't. However, to do this requires significant extra expertise.

Rationale and alternatives

A previous alternative was to increasing the circuit size limit by generating a larger (2^17) URS. This was rejected after the slow-down for verifying kimchi proofs on Berkeley was deemed unacceptable.

Since this feature was already partly implemented, it is significantly lower-cost than other potential proof-system changes.

None of the other alternatives were able to increase the circuit size limit without either breaking compatibility with zkApps or dramatically increasing complexity and proving time. In particular, switching the polynomial commitments scheme, and using a different proof system were considered and rejected by the crypto team.

Failure to complete this feature will result in less usability for zkApps, which will have a negative effect on both the Berkeley hard fork and SnarkyJS.

Prior art

The existing broken chunking implementation.

Unresolved questions

What parts of the design do you expect to resolve through the RFC process before this gets merged?
- SnarkyJS design
What parts of the design do you expect to resolve through the implementation of this feature before merge?
- None
What related issues do you consider out of scope for this RFC that could be addressed in the future independently of the solution that comes out of this RFC?
- Performance and memory usage of the proof system while using chunking.

Chunking

Chunking

Summary

Motivation

Detailed design

Theory overview

Retaining zero-knowledge

Lookup argument

Permutation argument

Number of zk rows

Technical approach in kimchi

Technical approach in OCaml-to-rust bindings

Technical approach in pickles

Prototype PRs

Test Plan and Functional Requirements

Drawbacks

Rationale and alternatives

Prior art

Unresolved questions

Related Documents

基于命题分块以增强RAG

TileMap Chunk Manager

🤖 n8n AI Agent Mastery Course 2025

Document Chunking/Splitting in Langroid