Skip to content
This repository was archived by the owner on Jan 22, 2025. It is now read-only.

solana-program: VoteState::deserialize() #34829

Merged
merged 27 commits into from
Jan 25, 2024

Conversation

2501babe
Copy link
Contributor

Problem

presently, VoteState::deserialize() is not implemented for bpf compile targets because the bincode call blows the 200k compute limit. being able to deserialize this struct in a program context would likely be useful for many downstream projects. but in particular, we need this to port the stake program to bpf

Summary of Changes

implement a parser by hand. this implementation can deserialize a real devnet VoteState (empty prior_voters, several dozen votes and epoch_credits) in 7.4k CUs. it supports Current and V1_14_11 versions of the struct and returns a Current without a call to convert_to_current()

(this will remain a draft until there is sufficient test coverage, im planning on using proptest to compare it to bincode thoroughly)

Copy link

codecov bot commented Jan 18, 2024

Codecov Report

Attention: 28 lines in your changes are missing coverage. Please review.

Comparison is base (8ff511e) 81.6% compared to head (872d39d) 81.6%.
Report is 20 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff            @@
##           master   #34829     +/-   ##
=========================================
- Coverage    81.6%    81.6%   -0.1%     
=========================================
  Files         827      829      +2     
  Lines      223884   224092    +208     
=========================================
+ Hits       182841   183008    +167     
- Misses      41043    41084     +41     

@2501babe 2501babe marked this pull request as ready for review January 23, 2024 20:55
@2501babe 2501babe requested a review from joncinque January 23, 2024 20:55
@2501babe
Copy link
Contributor Author

@CriesofCarrots i put the more general deserialization helpers in a new submodule of solana_program::serialize_utils since you suggested it, note theyre not marked for bincode because the binary layout is the same as borsh. other than moving some helpers and the addition of tests, the code is the same as when you looked last

Copy link
Contributor

@joncinque joncinque left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly tiny nits, this looks great and easy to follow!


// variant
for _ in 0..1000 {
let size_hint = VoteStateVersions::size_hint(0).0 * 4;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I looked up what size_hint does in the arbitrary lib, but I still don't understand what the * 4 bit is for. Why is it needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

size_hint returns a tuple with a lower bound and possibly an upper bound (here None because the struct can be any size). i just gave arbitrary 4x the minimum size to work with so it typically touches all the fields

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...and it turns out this always returns (0, None) anyway so this is actually wrong

Comment on lines +393 to +396
pub fn deserialize_into(
input: &[u8],
vote_state: &mut VoteState,
) -> Result<(), InstructionError> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

micro-nit: Could you add a doc comment to this function explaining when to use it? Something like:

Deserializes the input buffer into the provided `VoteState`

This function exists to deserialize `VoteState` in a BPF context without going above
the stack frame limit, and must be kept up to date with `bincode::deserialize`.

Or feel free to update that as you wish!

let mut r = StdRng::seed_from_u64(seed);

for _ in 0..1000 {
let raw_data_length = r.gen_range(1..1024);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does this number come from?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its arbitrary, i could note this in a comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

going to use the serialized size of the default struct instead

r.fill(&mut raw_data[..]);

let mut test_vote_state = VoteState::default();
let e = VoteState::deserialize_into(&raw_data, &mut test_vote_state).unwrap_err();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the first 4 bytes give 1u32 or 2u32, is it possible for this to succeed and give a false negative? If so, maybe we shouldn't check that it's InvalidAccountData and just that it doesn't panic

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

previously commented this cannot false negative because the input is too short but it would be better to do this and make the input long to hit all possible cases. its probably impossible to actually false negative tho because the is_empty byte would have to be correct and all of the u32 vector lengths would need to be below a few dozen or so

Comment on lines 34 to 42
let is_empty_idx = position_after_prior_voters
.checked_sub(1)
.ok_or(InstructionError::InvalidAccountData)?;

match input[is_empty_idx as usize] {
0 => read_prior_voters_into(cursor, vote_state)?,
1 => cursor.set_position(position_after_prior_voters),
_ => return Err(InstructionError::InvalidAccountData),
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: It's a bit strange and potentially error-prone to take in the input bytes and also a cursor over the same input bytes.

Although it might make the code slightly uglier, I'd prefer to see it all operate on the cursor, so this would need to change to move the cursor to is_empty_idx, read the u8 there, and optionally move it back to the previous position if it's not empty. Does that sound reasonable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could do that. honestly we could also just remove this logic and have it step over the bytes normally, it turns out to not be that much of an optimization since the final version of read_prior_voters_into doesnt memcpy the null pubkeys

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

went ahead and did that

Copy link
Contributor

@CriesofCarrots CriesofCarrots left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beyond Jon comments (all of which I think you've already handled), just nits from me

use {super::*, itertools::Itertools, rand::Rng};
use {
super::*,
arbitrary::{Arbitrary, Unstructured},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you actually need to reimport these? I would think super::* would get them.

match variant {
// V0_23_5. not supported; these should not exist on mainnet
0 => Err(InstructionError::InvalidAccountData),
// V1_14_11. substantially different layout and data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Substantially different from what? I think you mean from V0_23_5, but it is not clear.

0 => Err(InstructionError::InvalidAccountData),
// V1_14_11. substantially different layout and data
1 => deserialize_vote_state_into(input, &mut cursor, vote_state, false),
// Current. the only difference is the addition of a slot-latency to each vote
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only difference from what?

Copy link
Contributor

@CriesofCarrots CriesofCarrots left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, but would be good to ensure Jon's satisfied :)

Copy link
Contributor

@joncinque joncinque left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants