ZIP: 231
Title: Memo Bundles
Owners: Jack Grigg <jack@electriccoin.co>
Kris Nuttycombe <kris@electriccoin.co>
Daira-Emma Hopwood <daira@electriccoin.co>
Arya Solhi <arya@zfnd.org>
Credits: Sean Bowe
Nate Wilcox
Status: Draft
Category: Consensus / Wallet
Created: 2024-04-26
License: MIT
Discussions-To: <https://github.com/zcash/zips/issues/627>
The key words “MUST”, “MUST NOT”, “SHOULD”, and “MAY” in this document are to be interpreted as described in BCP 14 1 when, and only when, they appear in all capitals.
The term “network upgrade” in this document is to be interpreted as described in ZIP 200. 2
The character § is used when referring to sections of the Zcash Protocol Specification. 3
The terms “Mainnet” and “Testnet” are to be interpreted as described in § 3.12 ‘Mainnet and Testnet’. 4
Currently, the memo sent in a shielded output is limited to at most 512 bytes. This ZIP proposes to allow larger memos, and to enable memo data to be shared between multiple recipients of a transaction.
In Zcash transaction versions v2 to v5 inclusive, each Sapling or Orchard shielded output contains a ciphertext comprised of a 52-byte note plaintext, and a corresponding 512-byte memo field. 5 Recipients can only decrypt the outputs sent to them, and thus can also only observe the memo fields included with the outputs they can decrypt.
The shielded transaction protocol hides the sender(s) (that is, the addresses corresponding to the keys used to spend the input notes) from all of the recipients. For certain kinds of transactions, it is desirable to make one or more sender addresses available to one or more recipients (for example, a reply address). In such circumstances it is important to authenticate the sender addresses, to give the recipient a guarantee that the address is controlled by a sender of the transaction; failure to authenticate this address can enable phishing attacks. These Authenticated Reply Addresses require zero-knowledge proofs, and for the Orchard protocol these proofs are too large to fit into a 512-byte memo field.
It is also desirable, for clients with more stringent bandwidth constraints, to be able to transmit encrypted notes to the client without including the encrypted memo data. In the current light client protocol 6, this is done by truncating the note ciphertext to just the part that encrypts the memo. However, that has the effect of truncating the authentication tag, and so the resulting decryption algorithm does not meet standard security notions for an authenticated encryption scheme. It is a goal of this proposal to rectify this, simplifying the security argument.
Instead of the memo data, this ZIP proposes that it is possible to indicate whether a memo is present for the recipient. When using the light client protocol, a recipient need not download full transaction information if this indication tells them that they have not received any memo in the transaction.
At present, it is not possible to transmit the same memo data to multiple transaction recipients without redundantly encoding that data, and sending memo data greater than 512 bytes requires sending multiple outputs; the problem is compounded when attempting to send more than 512 bytes to each recipient. By separating memo data from the decryption capability for those memos, it admits a greater variety of applications that utilize memo data, while decreasing the amount of data that needs to be stored on-chain overall.
Prior to the activation of this ZIP, every shielded output has an associated memo field. A chain observer can therefore infer a likely 1:1 correlation between transaction recipients and memo payloads. The maximum number of distinct memos is precisely known, and the bounds on possible memo sizes are very small (between 0 and 512 bytes). It is possible to send additional data to a single recipient by adding (potentially zero-valued) outputs; on-chain this is indistinguishable from sending more memos to more recipients or (in the case of Orchard) spending more input notes.
After the activation of this ZIP, recipient count and memo count are decoupled. It becomes possible to construct transactions for which there are many more recipients than distinct memos, or vice versa. This may result in more observably distinct patterns relating the number of recipients to the number of possible memos. One important example is that if a wallet includes an authenticated reply-to address, this may be distinguishable from other ordinary wallet behaviour.
On the flip side, a chain observer now only knows upper bounds on the amount of memo data being conveyed, and the number of possible distinct memos. They have no information about how memo data is split among the recipients. This can provide a privacy improvement in some situations. For example, it is not possible to distinguish between many recipients receiving many small memos, and the same set of recipients receiving one large shared memo.
In summary, when sending large amounts of memo data, the change introduced by this ZIP eliminates a potential distinguisher along one axis in exchange for a potential distinguisher along another.
Since this proposal is defined only for v6 and later transactions, it is not necessary to consider Sprout JoinSplit outputs. The following sections apply to both Sapling and Orchard outputs.
A memo bundle consists of a sequence of 256-byte memo chunks, each individually encrypted. These memo chunks represent zero or more encrypted memos.
Each transaction may contain a single memo bundle, and a memo bundle may contain at most \(\mathsf{memo\_chunk\_limit}\) = 64 memo chunks. This limits the total amount of memo data that can be conveyed within a single transaction to \(\mathsf{memo\_chunk\_limit}\) × 256 = 16384 bytes, or 16 KiB.
Memo bundles are encoded in transactions in a prunable manner: each memo chunk can be replaced by its representative digest.
During transaction construction, each output with memo data is assigned a 32-byte memo key \(\mathsf{K^{memo}}\). These keys SHOULD be generated randomly, and MUST NOT be used to encrypt more than one memo within a single transaction. If an output has no memo data, it is assigned the memo key consisting of 32 \(\mathtt{0xFF}\) bytes.
In note plaintexts of v6-onward transactions, the 512-byte memo field is replaced by \(\mathsf{K^{memo}}\).
The transaction builder generates a 32-byte salt value \(\mathsf{salt}\) from a CSPRNG. A new salt MUST be generated for each memo bundle.
The symmetric encryption key for a memo is derived from its \(\mathsf{K^{memo}}\) as follows:
\(\hspace{2em}\mathsf{encryptionKey} = \mathsf{PRF^{expand}_{K^{memo}}}([\mathtt{0xE0}] \,||\, \mathsf{salt})\)
The first byte \(\mathtt{0xE0}\) should be added to the documentation of inputs to \(\mathsf{PRF^{expand}}\) in § 4.1.2 ‘Pseudo Random Functions’ 7.
If the generated key is 32 \(\mathtt{0xFF}\) bytes, the transaction constructor MAY repeat this procedure with a different salt, in order to avoid the recipient misinterpreting the output as having no memo data. Since that has negligible probability, it alternatively MAY omit this check.
Each memo is padded to a multiple of 256 bytes with zeroes, and split into 256-byte chunks. Each memo chunk is encrypted with ChaCha20Poly1305 8 as follows:
\(\hspace{2em}\mathsf{IETF\_AEAD\_CHACHA20\_POLY1305}(\mathsf{encryptionKey}, \mathsf{nonce}, \mathsf{memo\_chunk})\)
where \(\mathsf{nonce} = \mathsf{I2BEOSP}_{88}(\mathsf{counter}) \,||\, [\mathsf{final\_chunk}]\).
This is a variant of the STREAM construction 9.
Finally, the encrypted memo chunks for all memos are combined into a single sequence using an order-preserving shuffle. Memo chunks from different memos MAY be interleaved in any order, but memo chunks from the same memo MUST have the same relative order. The following diagram shows an example shuffle of three memos:
[
(memo_a, 0),
(memo_b, 0),
(memo_a, 1),
(memo_c, 0),
(memo_c, 1),
(memo_a, 2),
]
When a recipient decrypts a shielded output, they obtain a memo key \(\mathsf{K^{memo}}\).
From this they derive encryption_key
as above, and then proceed as follows:
\(\hspace{1.5em}\) let mutable \(\mathsf{memo} \;{\small ⦂}\; \mathbb{N} \leftarrow 0 \\\) \(\hspace{1.5em}\) let mutable \(\mathsf{counter} \;{\small ⦂}\; \mathbb{N} \leftarrow 0 \\\) \(\hspace{1.5em}\) let mutable \(\mathsf{potential\_last\_chunk} \;{\small ⦂}\; \mathbb{N} \leftarrow 0 \\\) \(\hspace{1.5em}\) for \(i\) from \(0\) up to \(\mathsf{len}(\mathsf{memo\_chunks})-1\): \(\\\) \(\hspace{3.0em}\) if \(\mathsf{memo\_chunks}[i]\) is not pruned: \(\\\) \(\hspace{4.5em}\) let \(\mathsf{nonce} = \mathsf{I2BEOSP}_{88}(\mathsf{counter}) \,||\, [\mathtt{0x00}] \\\) \(\hspace{4.5em}\) let \(P = \mathsf{IETF\_AEAD\_CHACHA20\_POLY1305.Decrypt}(\mathsf{encryptionKey}, \mathsf{nonce}, \mathsf{memo\_chunks}[i]) \\\) \(\hspace{4.5em}\) if \(P \neq \bot\): \(\\\) \(\hspace{6.0em}\) set \(\mathsf{memo} \leftarrow \mathsf{memo} \,||\, P \\\) \(\hspace{6.0em}\) set \(\mathsf{counter} \leftarrow \mathsf{counter} + 1 \\\) \(\hspace{6.0em}\) set \(\mathsf{potential\_last\_chunk} \leftarrow i + 1 \\\) \(\,\\\) \(\hspace{1.5em}\) let \(\mathsf{nonce} = \mathsf{I2BEOSP}_{88}(\mathsf{counter}) \,||\, [\mathtt{0x01}] \\\) \(\hspace{1.5em}\) let mutable \(\mathsf{success} \leftarrow \kern0.05em\) false \(\\\) \(\hspace{1.5em}\) for \(i\) from \(0\) up to \(\mathsf{len}(\mathsf{memo\_chunks})-1\): \(\\\) \(\hspace{3.0em}\) if \(\mathsf{memo\_chunks}[i]\) is not pruned: \(\\\) \(\hspace{4.5em}\) let \(P = \mathsf{IETF\_AEAD\_CHACHA20\_POLY1305.Decrypt}(\mathsf{encryptionKey}, \mathsf{nonce}, \mathsf{memo\_chunks}[i]) \\\) \(\hspace{4.5em}\) if \(i \geq \mathsf{potential\_last\_chunk}\) and \(P \neq \bot\) and not \(\mathsf{success}\): \(\\\) \(\hspace{6.0em}\) set \(\mathsf{memo} \leftarrow \mathsf{memo} \,||\, P \\\) \(\hspace{6.0em}\) set \(\mathsf{success} \leftarrow \kern0.05em\) true \(\\\) \(\,\\\) \(\hspace{1.5em}\) if \(\mathsf{success}\) then return \(\mathsf{memo}\) else return \(\bot \\\)
If any chunk of the memo encrypted to \(\mathsf{encryptionKey}\) has been pruned, the decryption process above returns nothing (as \(\mathsf{final_chunk}\) will be set to \(\mathtt{0x01}\) with the wrong counter value), ensuring that a malformed memo is not returned.
Bytes | Name | Data Type | Description |
---|---|---|---|
1 | fAllPruned |
uint8 |
1 if all chunks have been pruned, otherwise 0. |
32 | nonceOrHash |
byte[32] |
The nonce for deriving encryption keys, or the overall hash. |
† varies | nMemoChunks |
compactSize |
The number of memo chunks. |
† varies | pruned |
byte[ \(\mathsf{ceiling}(\mathtt{nMemoChunks}/8)\)] |
Bitflags indicating the type of each entry in vMemoChunks . |
† varies | vMemoChunks |
MemoChunk[nMemoChunks] |
A sequence of encrypted memo chunks. |
† These fields are present if and only if fAllPruned == 0
.
If fAllPruned == 0
, then:
nonceOrHash
represents the nonce for deriving encryption keys.pruned
, in little-endian order, indicates the type of the
corresponding entry in vMemoChunks
. A bit value of 0 indicates that
the entry will be of type byte[272]
representing an encrypted memo
chunk. A bit value of 1 indicates the entry will be a byte[32]
and
contains the memo_chunk_digest
for a pruned chunk.If fAllPruned == 1
, then:
nonceOrHash
represents the overall hash for the memo bundle as defined in
Transaction sighash.nMemoChunks
, pruned
, and vMemoChunks
fields will be absent.memo_chunk_digest = H(AEAD(MemoChunk, memo_key))
memo_bundle_digest = H(concat(memo_chunk_digests))
The memo bundle digest structure is a performance optimization for the case where all memo chunks in a transaction have been pruned.
TODO: finish this to be a modification to the equivalent of ZIP 244 for transaction v6.
The conventional fee in ZEC is altered such that a memo bundle may contain two
free chunks if there are any shielded outputs in the transaction. Any memo chunk
beyond this requires marginal_fee
. See the Fee calculation section of ZIP 317
11 for details.
Nodes must reject GetData
responses having an fAllPruned
value that is nonzero,
or any byte of pruned
that is nonzero.
The following changes affecting the definitions of note plaintexts and note ciphertexts, and the algorithms for encryption and decryption.
In § 3.2.1 ‘Note Plaintexts and Memo Fields’:
Change
Each Sapling or Orchard note plaintext (denoted \(\mathbf{np}\)) consists of
\(\hspace{2em}(\mathsf{leadByte} \;{\small ⦂}\; \mathbb{B}^{{\kern-0.05em\tiny\mathbb{Y}}}, \mathsf{d} \;{\small ⦂}\; \mathbb{B}^{[\ell_{\mathsf{d}}]}, \mathsf{rseed} \;{\small ⦂}\; \mathbb{B}^{{\kern-0.05em\tiny\mathbb{Y}}[32]}, \mathsf{memo} \;{\small ⦂}\; \mathbb{B}^{{\kern-0.05em\tiny\mathbb{Y}}[512]})\)
to
The form of a Sapling or Orchard note plaintext depends on the version of the transaction in which it will be included; specifically whether that version is pre-v6, or v6-onward.
Each pre-v6 Sapling or Orchard note plaintext (denoted \(\mathbf{np}\)) consists of
\(\hspace{2em}(\mathsf{leadByte} \;{\small ⦂}\; \mathbb{B}^{{\kern-0.05em\tiny\mathbb{Y}}}, \mathsf{d} \;{\small ⦂}\; \mathbb{B}^{[\ell_{\mathsf{d}}]}, \mathsf{rseed} \;{\small ⦂}\; \mathbb{B}^{{\kern-0.05em\tiny\mathbb{Y}}[32]}, \mathsf{memo} \;{\small ⦂}\; \mathbb{B}^{{\kern-0.05em\tiny\mathbb{Y}}[512]})\)
Each v6-onward Sapling or Orchard note plaintext (denoted \(\mathbf{np}\)) consists of
\(\hspace{2em}(\mathsf{leadByte} \;{\small ⦂}\; \mathbb{B}^{{\kern-0.05em\tiny\mathbb{Y}}}, \mathsf{d} \;{\small ⦂}\; \mathbb{B}^{[\ell_{\mathsf{d}}]}, \mathsf{rseed} \;⦂\; \mathbb{B}^{{\kern-0.05em\tiny\mathbb{Y}}[32]}, \mathsf{K^{memo}} \;{\small ⦂}\; \mathbb{B}^{{\kern-0.05em\tiny\mathbb{Y}}[512]})\)
In § 5.5 ‘Encodings of Note Plaintexts and Memo Fields’ 12:
Change the paragraph that describes “The encoding of a Sapling or Orchard note plaintext” to refer to “The encoding of a pre-v6 Sapling or Orchard note plaintext”.
Add a new paragraph at the end of the section:
The encoding of a v6-onward Sapling or Orchard note plaintext consists of:
| 8-bit \(\mathsf{leadByte}\) | 88-bit \(\mathsf{d}\) | 64-bit \(\mathsf{v}\) | 256-bit \(\mathsf{rseed}\) | 32-byte \(\mathsf{K^{memo}}\) |
- A byte 0x03, indicating this version of the encoding of a v6-onward Sapling or Orchard note plaintext.
- 11 bytes specifying \(\mathsf{d}\).
- 8 bytes specifying \(\mathsf{v}\).
- 32 bytes specifying \(\mathsf{rseed}\).
- 32 bytes specifying \(\mathsf{K^{memo}}\).
A value consisting of 32 \(\mathtt{0xFF}\) bytes for \(\mathsf{K^{memo}}\) is used to indicate that there is no memo for this note plaintext.
In § 4.7.2 ‘Sending Notes (Sapling)’ 13 and § 4.7.3 ‘Sending Notes (Orchard)’ 14:
Add a reference to this ZIP specifying the construction of the memo bundle and derivation of \(\mathsf{K^{memo}}\) in the case of a v6-onward note plaintext.
Change
Let \(\mathbf{np} = (\mathsf{leadByte}, \mathsf{d}, \mathsf{v}, \mathsf{rseed}, \mathsf{memo})\).
to
Let \(\mathbf{np}\) be the encoding of a Sapling note plaintext using \(\mathsf{leadByte}\), \(\mathsf{d}\), \(\mathsf{v}\), \(\mathsf{rseed}\), and either \(\mathsf{memo}\) for a pre-v6 note plaintext or \(\mathsf{K^{memo}}\) for a v6-onward note plaintext.
replacing “Sapling” with Orchard in the case of § 4.7.3.
In § 4.20.1 ‘Encryption (Sapling and Orchard)’ 15:
Change
Let \(\mathbf{np} = (\mathsf{leadByte}, \mathsf{d}, \mathsf{v}, \mathsf{rseed}, \mathsf{memo})\) be the Sapling or Orchard note plaintext. \(\mathbf{np}\) is encoded as defined in § 5.5 ‘Encodings of Note Plaintexts and Memo Fields’.
to
Let \(\mathbf{np}\) be the encoding of the Sapling or Orchard note plaintext (which may be pre-v6 or v6-onward), as defined in § 5.5 ‘Encodings of Note Plaintexts and Memo Fields’.
Add another normative note to that section:
- \(\mathsf{C^{enc}}\) will be of length either 580 or 100 bytes, depending on whether \(\mathbf{np}\) is a pre-v6 or v6-onward note plaintext.
In § 4.20.2 ‘Decryption using an Incoming Viewing Key (Sapling and Orchard)’ 16 and § 4.20.3 ‘Decryption using a Full Viewing Key (Sapling and Orchard)’ 17:
All of these changes apply identically to Mainnet and Testnet.
TBD
Restricting the total amount of memo data in a bundle, for example to 16 KiB, limits the rate at which the chain size can grow cheaply (from a computational perspective; memo bundles are much easier to produce than proofs or signatures).
The current behaviour for previous transaction versions (no limit on the number of memos) is not altered by this ZIP, because memos in those transactions are tied to individual shielded outputs (incurring their computational cost), and are not natively aggregatable.
To understand the effect of memo chunk size, we construct a table showing the total amount of data stored on-chain when encoding 16 KiB of memo data to as many recipients as possible.
Each table entry has the format “\(N\) @ \(M\) (\(O\))” where \(N\) is the maximum number of distinct recipients you can have within the memo data limits, \(M\) is the cost in bytes of that memo data plus memo keys and authentication tags when using a 32-byte memo key, and \(O\) is the relative overhead compared to pre-ZIP-231 memos.
Let:
Then
Chunk size | Memo size ≤ 256 bytes | Memo size = 512 bytes |
---|---|---|
Pre-231 | 32 @ 16384 ( 0.00%) | 32 @ 16384 ( 0.00%) |
512 | 32 @ 17920 (+ 9.38%) | 32 @ 17920 (+ 9.38%) |
256 | 64 @ 19456 (+18.75%) | 32 @ 18432 (+12.50%) |
256 32-out | 32 @ 9728 (-40.63%) |
In the “256 32-out” case you have a distinguisher compared to old transactions, in that you can tell the transaction is sending at most 256 bytes per recipient rather than 512 if it is sending the maximum number of memos. But that’s inherently baked into the decision to use a smaller memo chunk size (and it is still possible for the chunks to all be a single memo sent to all outputs, or anything in between).
16-byte (128-bit) keys don’t meet Zcash’s target security level of 125 bits, as argued in 19.
However, for the sake of argument, if we used a 16-byte memo key instead of 32 bytes, the transaction size overhead would become:
Chunk size | Memo size ≤ 256 bytes | Memo size = 512 bytes |
---|---|---|
Pre-231 | 32 @ 16384 ( 0.00%) | 32 @ 16384 ( 0.00%) |
512 | 32 @ 17408 (+ 6.25%) | 32 @ 17408 (+ 6.25%) |
256 | 64 @ 18432 (+12.50%) | 32 @ 17920 (+ 9.38%) |
256 32-out | 32 @ 9216 (-43.75%) |
The decrease in overhead is relatively modest in most cases, but more noticeable for small memos with a 256-byte memo chunk.
The benefits of 256-bit keys are:
Including a per-transaction \(\mathsf{salt}\) in the derivation of \(\mathsf{encryption_key}\) gives protection against accidental (or intentional) reuse of \(\mathsf{K^{memo}}\) reuse across multiple transactions. We do not protect against \(\mathsf{K^{memo}}\) reuse within a transaction; it is up to the transaction builder to ensure that the same \(\mathsf{K^{memo}}\) is not used to encrypt two different memos (and if they did so, normal clients would either never observe the second memo, or would decrypt parts of each memo and get a nonsensical and potentially insecure “spliced” memo).
We do not include commitments to the shielded outputs in the derivation of \(\mathsf{encryptionKey}\) for two reasons:
The separation of memo data from note data, and the new ability to easily store variable-length memo data, opens up an attack vector against node operators for storing arbitrary data. The transaction digest commitments to the memo bundle are structured such that if a node operator is presented with a memo key (i.e. they are given the capability to decrypt a particular memo), they can identify and prune the corresponding memo chunks, while still enabling the transaction to be validated as part of its corresponding block and broadcast over the network.
The transaction encoding permits pruning at the individual chunk level in order to facilitate pruning an individual memo from a transaction without affecting the other memos. This enables node operators to be responsive to, for example, GDPR deletion requests.
Note that broadcasting a partially-pruned transaction means that the pruned chunks no longer contribute to the upper bound on memo data.
The prunable structure does not introduce a censorship axis; memo bundles do not reveal which memo chunks correspond to which memos, and therefore a network adversary cannot selectively censor individual memos. They can censor any/all chunks within specific transactions, however shielded transactions do not reveal their senders, recipients, or amounts, and thus also cannot be individually targeted for censorship.
This ZIP is proposed to activate with Network Upgrade 7. 21
TBD
Information on BCP 14 — “RFC 2119: Key words for use in RFCs to Indicate Requirement Levels” and “RFC 8174: Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words” ↩︎
Zcash Protocol Specification, Version 2024.5.1 [NU6] or later ↩︎
Zcash Protocol Specification, Version 2024.5.1 [NU6]. Section 3.12: Mainnet and Testnet ↩︎
Zcash Protocol Specification, Version 2024.5.1 [NU6]. Section 3.2.1: Note Plaintexts and Memo Fields ↩︎
Zcash Protocol Specification, Version 2024.5.1 [NU6]. Section 4.1.2: Pseudo Random Functions ↩︎
Online Authenticated-Encryption and its Nonce-Reuse Misuse-Resistance ↩︎
ZIP 317: Proportional Transfer Fee Mechanism - Fee calculation ↩︎
Zcash Protocol Specification, Version 2024.5.1 [NU6]. Section 5.5: Encodings of Note Plaintexts and Memo Fields ↩︎
Zcash Protocol Specification, Version 2024.5.1 [NU6]. Section 4.7.2: Sending Notes (Sapling) ↩︎
Zcash Protocol Specification, Version 2024.5.1 [NU6]. Section 4.7.3: Sending Notes (Orchard) ↩︎
Zcash Protocol Specification, Version 2024.5.1 [NU6]. Section 4.20.1: Encryption (Sapling and Orchard) ↩︎
Zcash Protocol Specification, Version 2024.5.1 [NU6]. Section 4.20.2: Decryption using an Incoming Viewing Key (Sapling and Orchard) ↩︎
Zcash Protocol Specification, Version 2024.5.1 [NU6]. Section 4.20.3: Decryption using a Full Viewing Key (Sapling and Orchard) ↩︎
Zcash Protocol Specification, Version 2024.5.1 [NU6]. Section 8.7: In-band secret distribution ↩︎
zcash/zips issue #693: Standardize a protocol for creating shielded transactions offline ↩︎