Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: move hash memoization out of signed #1485

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

prestwich
Copy link
Member

@prestwich prestwich commented Oct 15, 2024

This is a breaking change

Motivation

See discussion in #1460 and #1483

Memoizing the hash in Signed is inappropriate, as the transaction does not know its full 2718 serialization without being embedded in an envelope. This breaks the hash out into a Sealed type that is produced when the Envelope is constructed. The sealed type is transparent to the user

Solution

  • Introduce Enveloped type alias
  • ensure that From<> impls properly Seal
  • ensure that impl Encodable2718 for TxEnvelope produces the correct trie_hash by returning the memoized hash. This previously produced incorrect hashes for 4844 txns with sidecars

PR Checklist

  • Added Tests
  • Added Documentation
  • Breaking changes

Comment on lines +126 to +127
let mut buf = Vec::with_capacity(v.tx().encoded_len_with_signature(v.signature()));
v.tx().encode_with_signature_fields(v.signature(), &mut buf);
let h = keccak256(buf);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should all of those be Sealable impls?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, only the envelope knows the correct serialization for its components (tx types don't know their flag as it may be network dependent c.f. issue #1483)

@prestwich
Copy link
Member Author

this is ready for re-review

Comment on lines 104 to 121
pub enum TxEnvelope {
/// An untagged [`TxLegacy`].
Legacy(Signed<TxLegacy>),
Legacy(Enveloped<TxLegacy>),
/// A [`TxEip2930`] tagged with type 1.
Eip2930(Signed<TxEip2930>),
Eip2930(Enveloped<TxEip2930>),
/// A [`TxEip1559`] tagged with type 2.
Eip1559(Signed<TxEip1559>),
Eip1559(Enveloped<TxEip1559>),
/// A TxEip4844 tagged with type 3.
/// An EIP-4844 transaction has two network representations:
/// 1 - The transaction itself, which is a regular RLP-encoded transaction and used to retrieve
/// historical transactions..
///
/// 2 - The transaction with a sidecar, which is the form used to
/// send transactions to the network.
Eip4844(Signed<TxEip4844Variant>),
Eip4844(Enveloped<TxEip4844Variant>),
/// A [`TxEip7702`] tagged with type 4.
Eip7702(Signed<TxEip7702>),
Eip7702(Enveloped<TxEip7702>),
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

excellent, but, what's the plan for alloy_consensus::TypedTransaction now? it's essentially the same type

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TypedTransaction is the unsigned version. As such neither signature or hash is known, and it just includes a signable version of the inner transaction and is used in the NetworkWallet::<N>::sign_transaction_from API

actually now that you mention it, we need to apply #1489 to TypedTransaction as well

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prestwich prestwich mentioned this pull request Oct 16, 2024
3 tasks
Comment on lines +125 to +126
let mut buf = Vec::with_capacity(v.tx().encoded_len_with_signature(v.signature()));
v.tx().encode_with_signature_fields(v.signature(), &mut buf);
Copy link
Member

@mattsse mattsse Oct 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there no other way to get the tx hash?

if the only solution is to go through the txenvelope enum this will be very inconvenient

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

c.f. #1483. Technically it's legal for networks to make envelopes that do the following:

  • use the same inner type, but a different serialization (e.g. a TxLegacy serialized borsht instead of RLP)
  • use a different tx flag for the same inner type (e.g. a TxEnvelope containing TxEip1559 at 23 instead of 2
  • use the same inner type at multiple tx flags (e.g. a TxEnvelope containing TxEip1559 at both 2 and 23)

So there is no way to correctly get the hash of a transaction without wrapping it in the specific network's envelope first

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imo if a network does this then that's on them, because the type id is even part of the eip itself, if they'd choose 23 then this would no longer be 1559

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all of those usecases can be solved by introducing a separate inner transaction type (e.g. network can provide its own TxLegacy)

right now it makes total sense for separate transaction types to know their full encoding because we don't yet have a case of network rolling out encoding of transaction conflicting with existing tx types. also given that most of the networks target for compatibility with ethereum I'd consider such case very unlikely

Copy link
Member Author

@prestwich prestwich Oct 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also consider it unlikely. But it's still the only correct abstraction of an EIP-2718 envelope if we want our code to be reusable.

because the type id is even part of the eip itself

This is a fair reading of the spec. And @klkvr is right that people downstream can just duplicate 500 lines of code per tx type to change a single constant. However, it costs us very little to have the envelope drive serialization and hashing and prevents future headaches when someone inevitably does something silly here

For another interesting example, consider the extension sentinel value in eip-2718 0xff 🙃 This is also unlikely ofc, but tx-association falls down in extended envelopes

@klkvr
Copy link
Member

klkvr commented Oct 16, 2024

At current state of alloy transactions do know their encoding and their type byte. I've seen a lot of examples of people using encode_with_signature and I don't think there's a reason to break this API (which would be required to make tx type envelope-associated)

IMO this edge case (Ethereum transactions with different type byte) is very unlikely and already can be solved by introducing separate types

@prestwich
Copy link
Member Author

prestwich commented Oct 16, 2024

At current state of alloy transactions do know their encoding and their type byte.

yes, this is the logic bug in #1483. They do, but they should not, as the type byte is an envelope feature, not a transaction feature. See #1483 and #1485 (comment)

I do not know of any network that do this, but i do think that the cost of implementing it correctly now is small, while the cost of fixing it later is large

I've seen a lot of examples of people using encode_with_signature and I don't think there's a reason to break this API (which would be required to make tx type envelope-associated)

This API is broken right now, and their usage is almost certainly incorrect. It's why the encoding/decoding functions were private before #529, and why #529 hides them from users. We deliberately set out to prevent people from doing this and made it officially not part of the public API. as TxEnvelope is the only safe and correct API for transaction enc/decoding in Ethereum.

@klkvr
Copy link
Member

klkvr commented Oct 16, 2024

This API is broken right now, and their usage is almost certainly incorrect.

could you please elaborate a bit on that? is it broken for 4844 case?

@prestwich
Copy link
Member Author

prestwich commented Oct 16, 2024

could you please elaborate a bit on that? is it broken for 4844 case?

There's that, yes. 4844 transaction with sidecars produce incorrect trie_hash output right now. So that needs to be solved regardless.

  • The docs for encode_with_signature are wrong
  • encode_with_signature produces 2718 output when with_header is false, and RLP in network format when with_header is true. All other encode_ methods produce RLP output in non-network format
  • decode_with_signature is missing entirely
  • EncodableSignature seems to now be redundant as reth migrated to alloy-primitives::Signature
  • sometimes header presence is determined by method name encode_fields, while sometimes it's by arg with_header: bool (related to second bullet above)

I can continue, but basically this is a broken API that confuses multiple serialization schemes and is almost impossible to correctly use

The long term fix is to change the RLP derive to allow flattening and field-only serialization

@prestwich
Copy link
Member Author

actually this is a good time to just make a cleanup PR. I'll go do that

@prestwich
Copy link
Member Author

prestwich commented Oct 19, 2024

holding off on rebasing this as it's better with #1496

@prestwich
Copy link
Member Author

post-#1496 would y'all prefer a PR that does not break out memoization, and instead just fixes #1537 using the new tx_hash api?

@prestwich prestwich linked an issue Oct 20, 2024 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] trie_hash impl incorrect for tx envelope
4 participants