feat: Move block proving to the Store #1472
base: next
Conversation
/// [crypto::dsa::ecdsa_k256_keccak::Signature].
#[prost(bytes = "vec", tag = "5")]
pub signature: ::prost::alloc::vec::Vec<u8>,
}
Unsure that this is how we want to structure this (all opaque bytes).
For signatures and other crypto primitives, bytes are fine, and BlockHeader is just a bag of Words. ordered_batches and block_inputs seem like better candidates for a proper protobuf representation. I'm not sure it needs to be done as part of this PR, though.
}

- let block_data = block.to_bytes();
+ let block_data = body.to_bytes(); // TODO(currentpr): is this correct?
This is used for:
async fn get_block_by_number(
&self,
request: Request<proto::blockchain::BlockNumber>,
) -> Result<Response<proto::blockchain::MaybeBlock>, Status> {

I'm not sure what we want to do now that ProvenBlock has been replaced with body and header here. Maybe we change it to SignedBlock when that is added?
Do we want to continue to use this file-based store at all? Could we replace it altogether with data in the db?
If I understand correctly, the separation of raw block data from the structured DB data was originally done so that indexers/explorers could trivially sync this data without locking the database. The long-term idea was that this would live on a separate disk/instance and therefore not interfere at all with database IO.
I'm unsure whether this usage pattern still holds, as it makes it difficult to perform ACID-style store IO, since this data is external to sqlite. You can see this complication in the store's apply_block machinery.
I do think this was done a bit prematurely, since it's not really something we've used at all (this separation) and it's mostly just complicated our code, though it's not the end of the world.
There are many ways to approach this in general, and maybe we should revisit this part of the design. The nice thing about this separation is that it enables syncing "cold" historical data without the main database worrying about retaining it. My view is that the sqlite database stores the recent history (where recent means some N number of blocks for certain queries), and the file store retains all blocks ever (at least for now).
Now that we also need to retain recent unsigned blocks and then migrate them to signed, or potentially roll back invalid blocks, this algebra becomes a bit more complex.
If we were restarting from scratch, I would probably say put it all in the database, and then figure out what the "cold" storage should look like, since we now have unsigned and signed blocks.
As an example design, we could have a cold storage component and API that only stores fully finalized blocks that will never revert or roll back. This allows crawlers/indexers to use it without having to consider rollbacks of this data. The primary store pushes finalized blocks into cold storage and retains some minimal overlap of blocks with it as part of the database itself. What "finalized" means in our context is debatable: right now it would mean signed and validated, but it could also eventually mean finalized on agglayer/L1.
Something to consider is that, imo, horizontally scaling RPC will require horizontally scaling the database. Whether that means moving to postgres with replication or simply hosting a replica sqlite database with every RPC instance is an open question. Once this is the case, it may also make sense to simply put the "cold" data in there as well. But probably this makes no sense, since cold data should be relatively unused and could "scale" separately.
I would leave things as they are now, for now, until we settle the above design.
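To make the proposed split concrete, here is a minimal, self-contained sketch of the primary-store / cold-store relationship described above. All names here (ColdStore, PrimaryStore, FINALITY_DEPTH) are hypothetical illustrations, not types from the codebase, and the in-memory maps stand in for sqlite and the file store:

```rust
use std::collections::BTreeMap;

/// Hypothetical append-only cold store: only fully finalized blocks land
/// here, so crawlers/indexers never have to consider rollbacks of this data.
struct ColdStore {
    blocks: BTreeMap<u32, Vec<u8>>, // block_num -> serialized signed block
}

impl ColdStore {
    fn new() -> Self {
        Self { blocks: BTreeMap::new() }
    }

    /// Only the next sequential block may be pushed; finalized data never
    /// reverts, so the store is strictly append-only.
    fn push_finalized(&mut self, num: u32, data: Vec<u8>) {
        assert_eq!(num as usize, self.blocks.len(), "cold store is append-only");
        self.blocks.insert(num, data);
    }

    fn get(&self, num: u32) -> Option<&[u8]> {
        self.blocks.get(&num).map(Vec::as_slice)
    }
}

/// Hypothetical finality window: blocks deeper than this are assumed to
/// never revert and are migrated into cold storage.
const FINALITY_DEPTH: u32 = 2;

/// Hypothetical primary store: retains the recent, potentially revertible
/// window and pushes everything older into the cold store.
struct PrimaryStore {
    recent: BTreeMap<u32, Vec<u8>>,
    cold: ColdStore,
}

impl PrimaryStore {
    fn new() -> Self {
        Self { recent: BTreeMap::new(), cold: ColdStore::new() }
    }

    fn apply_block(&mut self, num: u32, data: Vec<u8>) {
        self.recent.insert(num, data);
        // Migrate blocks that have fallen out of the finality window.
        while let Some((&oldest, _)) = self.recent.iter().next() {
            if num >= FINALITY_DEPTH && oldest <= num - FINALITY_DEPTH {
                let data = self.recent.remove(&oldest).unwrap();
                self.cold.push_finalized(oldest, data);
            } else {
                break;
            }
        }
    }
}
```

The append-only invariant is what buys indexers the rollback-free guarantee; the "minimal overlap" mentioned above would correspond to keeping already-migrated blocks in the recent window a little longer than this sketch does.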
I agree that we should keep things roughly as they are now, and I do think that keeping the "cold data" separate could have advantages in the future (even if it was done a bit prematurely).
As to what this specific endpoint should return, I think it could still be a MaybeBlock message, but it could look like so:
message MaybeBlock {
optional Block block = 1;
}
message Block {
BlockHeader header = 1;
bytes content = 2;
BlockSignature signature = 3;
optional BlockProof proof = 4;
}

This implies that blocks could be serialized in two states:
- Signed but not yet proven.
- Signed and proven.
Since we first serialize the block without the proof, and then the proof gets added later, we have two options for serializing the proof:
- We could append proofs to existing block files.
- We could save the proofs in separate files (e.g., block134_proof.dat).
Not sure if there are significant pros/cons to one approach vs. the other, so I don't have a strong opinion on this.
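For the append option, one concrete shape it could take is a length-prefixed framing, so a reader can tell whether the proof section has been written yet. This is purely an illustrative sketch; the framing, names, and the in-memory buffer standing in for the block file are all assumptions, not the node's actual on-disk format:

```rust
/// Append a length-prefixed section (block content or proof) to a buffer
/// standing in for the on-disk block file.
fn append_section(buf: &mut Vec<u8>, section: &[u8]) {
    buf.extend_from_slice(&(section.len() as u32).to_le_bytes());
    buf.extend_from_slice(section);
}

/// Read the next length-prefixed section starting at `offset`, returning
/// the section bytes and the offset just past them. None means the file
/// ends here, i.e. the proof has not been appended yet.
fn read_section(buf: &[u8], offset: usize) -> Option<(&[u8], usize)> {
    let len_end = offset.checked_add(4)?;
    let len_bytes = buf.get(offset..len_end)?;
    let len = u32::from_le_bytes(len_bytes.try_into().ok()?) as usize;
    let end = len_end.checked_add(len)?;
    Some((buf.get(len_end..end)?, end))
}
```

With framing like this, adding the proof later is a single append to the existing file, while the separate-file option keeps block files immutable at the cost of one extra file per block.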
.inspect_err(|err| Span::current().set_error(err))
.or_else(|_err| self.rollback_block(mempool, block_num).never_error())
// All errors were handled and discarded above, so this is just type juggling
// to drop the result.
nit/out of scope: we might want to refactor this into a more linear flow with ?s and maybe a sub-function
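A hedged sketch of what that linearization might look like. The names here (commit_block, rollback_block, try_commit, CommitError) are simplified stand-ins, not the block producer's actual API; the point is only the shape of the control flow:

```rust
#[derive(Debug)]
struct CommitError;

/// Stand-in for the fallible commit step.
fn commit_block(should_fail: bool) -> Result<(), CommitError> {
    if should_fail { Err(CommitError) } else { Ok(()) }
}

/// Stand-in for the rollback step; infallible here for simplicity.
fn rollback_block(log: &mut Vec<&'static str>) {
    log.push("rolled back");
}

/// Sub-function holding the fallible steps, so each one is a plain
/// statement propagated with `?` rather than a combinator in a chain.
fn try_commit(should_fail: bool) -> Result<(), CommitError> {
    commit_block(should_fail)?;
    // ...further fallible steps would chain here as plain `?` statements...
    Ok(())
}

/// The caller then handles the single error case once, replacing the
/// inspect_err / or_else / never_error combinator juggling.
fn commit_or_rollback(should_fail: bool, log: &mut Vec<&'static str>) {
    match try_commit(should_fail) {
        Ok(()) => log.push("committed"),
        Err(_err) => {
            // the real code would record the error on the tracing span here
            rollback_block(log);
        }
    }
}
```

The type juggling disappears because the sub-function's Result is consumed exactly once, at a single match.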
diesel::joinable!(notes -> block_headers (committed_at));
diesel::joinable!(notes -> note_scripts (script_root));
diesel::joinable!(nullifiers -> block_headers (block_num));
diesel::joinable!(transactions -> block_headers (block_num));
Interesting that we don't need those anymore
Context
We have added the Validator component to perform block validation and signing. As such, blocks now have the following states (conceptually, but also expressed in the code to some extent):
We now need the block producer to "commit" blocks to the store after they are signed, and have the store handle the proving. This PR moves block proving from the block producer to the store. Subsequent PRs will update the store RPC and runtime so that block proving is asynchronous to block committal ("deferred" block proving, so to speak).
Relates to #1316.
Changes
- Replaced ProvenBlock usage with the constituent parts of the block required for proving.