-
Notifications
You must be signed in to change notification settings - Fork 87
feat: add pagination to GetNetworkAccountIds #1452
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: next
Are you sure you want to change the base?
feat: add pagination to GetNetworkAccountIds #1452
Conversation
This is for the |
|
I think we can technically rely on Its a cheap column to add so imo we can do it. |
crates/ntx-builder/src/store.rs
Outdated
| Ok(all_notes) | ||
| } | ||
|
|
||
| // TODO: add pagination. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed!
crates/ntx-builder/src/store.rs
Outdated
| let response = self.inner.clone().get_network_account_ids(()).await?.into_inner(); | ||
| pub async fn get_network_account_ids( | ||
| &self, | ||
| block_range: Option<RangeInclusive<BlockNumber>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we avoid the Option? Nobody should request this without anyways, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually think we need to retain the original behavior within this method. The ntx builder currently expects this to return all network account IDs.
To retain this behavior for now I would
- revert function signature change
- internally loop over and aggregate the returned pages
As currently used this will only return the first page in the main builder https://github.com/0xMiden/miden-node/pull/1452/changes#diff-630c8daca8771b806879dbee6a3059b92341ac7418c345180f537c911dd272a2R15
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I implemented this changes
| schema::accounts::account_id, | ||
| block_range: RangeInclusive<BlockNumber>, | ||
| ) -> Result<(Vec<AccountId>, BlockNumber), DatabaseError> { | ||
| const ROW_OVERHEAD_BYTES: usize = AccountId::SERIALIZED_SIZE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
future work/probably too much: we might want to use build.rs that generates this const value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should I create an issue for this one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like @Mirko-von-Leipzig 's input on it first :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an internal only endpoint, so I think we can get away with simplifying this as much as possible such that we don't need a derived MAX_ROWS at all, but can simply keep it at some arbitrary value.
With a small bit of refactoring in the ntx builder (briefly touched upon in #1478 (comment)), we can simplify this function signature to
pub(crate) fn select_all_network_account_ids(
conn: &mut SqliteConnection,
token: ContinuationToken,
) -> Result<(Vec<AccountId>, ContinuationToken), DatabaseError> This is because the ntx builder won't need to care about block-delimited pages, and instead only needs a mechanism to get all currently committed network account IDs.
So all we need is an iteration key that retains the order, which could be
ORDER BY rowid -- if we don't delete, or
ORDER BY created_at, account_id -- for an agnostic oneand then
struct ContinuationToken(rowid);
// or
struct ContinuationToken(BlockNumber, AccountId);If we do decide to go the stream route, then ContinuationToken would remain internal only, otherwise it forms part of the protobuf.
Mirko-von-Leipzig
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good, but definitely requires the fix on the ntx client side.
I would also be interested in an experiment where we change the proto to return a stream of AccountId instead. Internally, the store would manage the stream by paginating internally (which no longer needs to be block bound, but can use rowid or any other internally consistent row token). e.g. request a page from db, stream the page, request next page, stream it etc.
Client can recover from a broken stream by initializing stream with Option<AccountI>, the last account ID received. Though we could also just restart from scratch since its an internal only API.
crates/ntx-builder/src/store.rs
Outdated
| let response = self.inner.clone().get_network_account_ids(()).await?.into_inner(); | ||
| pub async fn get_network_account_ids( | ||
| &self, | ||
| block_range: Option<RangeInclusive<BlockNumber>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually think we need to retain the original behavior within this method. The ntx builder currently expects this to return all network account IDs.
To retain this behavior for now I would
- revert function signature change
- internally loop over and aggregate the returned pages
As currently used this will only return the first page in the main builder https://github.com/0xMiden/miden-node/pull/1452/changes#diff-630c8daca8771b806879dbee6a3059b92341ac7418c345180f537c911dd272a2R15
|
@Mirko-von-Leipzig I created #1478 to test the stream approach |
|
|
||
| const _: () = assert!( | ||
| MAX_ROWS > miden_objects::MAX_ACCOUNTS_PER_BLOCK, | ||
| "Block pagination limit must exceed maximum block capacity" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
| "Block pagination limit must exceed maximum block capacity" | |
| "Block pagination limit must exceed maximum block capacity to uphold assumed logic invariant" |
bobbinth
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Thank you! I left a few comments inline - mostly related to making sure the comments/docs are up to date.
| /// Returns all network account IDs. | ||
| /// | ||
| /// # Returns | ||
| /// | ||
| /// A vector with network account IDs, or an error. | ||
| pub(crate) fn select_all_network_account_ids( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think doc comments here are correct - i.e., this does not return all network account IDs. Let's update it to indicate that this tries to return all IDs within the specified block range (based on account creation block), but could return less if there are too many account IDs (we should specify what's the maximum number of account IDs that can be returned).
| if let Some(&(_, last_created_at_block)) = account_ids_raw.last() | ||
| && account_ids_raw.len() > MAX_ROWS | ||
| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: to improve readability, I think we should be able to write this as something like this:
if account_ids_raw.len() > MAX_ROWS {
let Some(&(_, last_created_at_block)) = account_ids_raw.last().expect("...");
...
}| pub(crate) fn upsert_accounts( | ||
| conn: &mut SqliteConnection, | ||
| accounts: &[BlockAccountUpdate], | ||
| block_num: BlockNumber, | ||
| ) -> Result<usize, DatabaseError> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not for this PR, but this method has gotten pretty complicated with multiple requests to the database being made. Let's create an issue to refactor it so that we issue only one read and one write request during this method (I think this should be enough).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created #1483
crates/store/src/db/mod.rs
Outdated
| /// Loads all network account IDs from the DB. | ||
| #[instrument(level = "debug", target = COMPONENT, skip_all, ret(level = "debug"), err)] | ||
| pub async fn select_all_network_account_ids(&self) -> Result<Vec<AccountId>> { | ||
| self.transact("Get all network account IDs", queries::select_all_network_account_ids) | ||
| .await | ||
| pub async fn select_all_network_account_ids( | ||
| &self, | ||
| block_range: RangeInclusive<BlockNumber>, | ||
| ) -> Result<(Vec<AccountId>, BlockNumber)> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doc comments here need to be updated.
| // TODO: add pagination. | ||
| #[instrument( | ||
| parent = None, | ||
| target = COMPONENT, | ||
| name = "store.ntx_builder_server.get_network_account_ids", | ||
| skip_all, | ||
| ret(level = "debug"), | ||
| err | ||
| )] | ||
| async fn get_network_account_ids( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doc comments here need to be updated.
crates/store/src/state.rs
Outdated
| /// Returns account IDs for all public (on-chain) network accounts. | ||
| pub async fn get_all_network_accounts(&self) -> Result<Vec<AccountId>, DatabaseError> { | ||
| self.db.select_all_network_account_ids().await | ||
| pub async fn get_all_network_accounts( | ||
| &self, | ||
| block_range: RangeInclusive<BlockNumber>, | ||
| ) -> Result<(Vec<AccountId>, BlockNumber), DatabaseError> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doc comments here need to be updated.
| .collect(); | ||
|
|
||
| Ok(accounts?) | ||
| const MAX_ITERATIONS: u32 = 1000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be great to add a comment here indicating why 1000 iterations is enough or at least state what kind of assumption we are making here about the maximum number of network accounts.
I'm working on #1299 .
I noticed that we do not store the block in which an account is created, so the typical pagination that we are using in the other endpoint does not work here. We would need some kind of cursor to keep track the position of the overall query for the pagination. The problem is that we cannot count on any of the fields in the accounts schema to be ordered. A solution could be to add a
created_atfield storing the block number in which the account was deployed.cc @Mirko-von-Leipzig @drahnr