
Check shared variable values to determine volatility in posterior predictive sampling #6147

Merged
merged 1 commit into from
Sep 30, 2022

Conversation

lucianopaz
Member

@lucianopaz lucianopaz commented Sep 26, 2022

What is this PR about?

Closes #6047.

This PR does two things:

  1. It names the SharedVariable that holds a mutable dimension's length after the corresponding dimension.
  2. It adds two arguments to compile_forward_sampling_function: constant_data and constant_coords.

These two arguments allow it to determine whether a SharedVariable has changed between a call to sample and a call to sample_posterior_predictive. Note that constant_data is only knowable if sample_posterior_predictive is called with an InferenceData object, and constant_coords is only knowable if it is called with an InferenceData or an xarray.Dataset object.

sample_posterior_predictive determines whether a data container changed as follows. It first checks the InferenceData.constant_data group to find the values of the data containers at inference time, and passes those into the constant_data argument of compile_forward_sampling_function. When a SharedVariable is found while walking the graph, the function looks it up in constant_data. If it finds an entry, it checks whether the values in the dictionary match the values of the SharedVariable at run time. If they match, the SharedVariable is deemed not volatile; if they don't match, it is considered volatile.
To check whether a dimension's coordinates changed, sample_posterior_predictive compares the model's coordinates to those found in the supplied trace (which must be an InferenceData or xarray.Dataset to carry this information). If the coordinates did not change, the dimension name is added to the constant_coords set. Then, when compile_forward_sampling_function finds the shared variable holding the dimension's length, it checks whether that variable's name is in constant_coords. If it is, the dimension is not deemed volatile; otherwise it is considered volatile.
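
The two checks described above can be sketched in plain Python. This is only an illustration of the decision rule, not the code in this PR: the helper names (data_is_volatile, find_constant_coords, dim_is_volatile) are hypothetical, values are modelled as plain lists rather than arrays, and treating a data container with no entry in constant_data as volatile is an assumption of the sketch.

```python
from typing import Dict, Optional, Sequence, Set


def data_is_volatile(
    name: str,
    current_value: Sequence,
    constant_data: Optional[Dict[str, Sequence]],
) -> bool:
    """Decide whether a data container changed since inference time."""
    if constant_data is None or name not in constant_data:
        # Assumption of this sketch: with no recorded value we cannot
        # prove the container is unchanged, so treat it as volatile.
        return True
    # Not volatile only if the recorded and current values match.
    return list(constant_data[name]) != list(current_value)


def find_constant_coords(
    model_coords: Dict[str, Optional[Sequence]],
    trace_coords: Dict[str, Sequence],
) -> Set[str]:
    """Collect the dimension names whose coordinates did not change."""
    return {
        dim
        for dim, values in model_coords.items()
        if values is not None
        and dim in trace_coords
        and list(values) == list(trace_coords[dim])
    }


def dim_is_volatile(dim_name: str, constant_coords: Set[str]) -> bool:
    """A dimension length is volatile unless its name is in constant_coords."""
    return dim_name not in constant_coords


# Hypothetical values recorded at inference time and seen at run time.
recorded = {"x": [1.0, 2.0, 3.0]}
unchanged = find_constant_coords(
    {"obs": [0, 1, 2], "group": ["a", "b"]},
    {"obs": [0, 1, 2], "group": ["a", "c"]},
)
print(data_is_volatile("x", [1.0, 2.0, 4.0], recorded))
print(dim_is_volatile("group", unchanged))
```

Building constant_coords as a set of names mirrors the description above: the length variable carries the dimension's name, so a membership test is enough at graph-walking time.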

Checklist

Major / Breaking Changes

  • ...

Bugfixes / New features

  • sample_posterior_predictive, when supplied with an InferenceData object, properly identifies whether a MutableData or mutable dimension has changed between a call to pymc.sample and a call to pymc.sample_posterior_predictive. If it has, the descendant random variables are resampled; if it has not, the descendant random variables are taken from InferenceData.posterior.
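
How a changed data container affects its descendants can be illustrated with a toy dependency graph. This is a simplified sketch, not the PyMC graph walker: the function find_resampled and the variable names are hypothetical, and the graph is a plain dictionary mapping each variable to its direct parents.

```python
from typing import Dict, List, Set


def find_resampled(
    parents: Dict[str, List[str]], volatile_roots: Set[str]
) -> Set[str]:
    """Return every variable that has a volatile root among its ancestors."""
    resampled: Set[str] = set()

    def depends_on_volatile(name: str) -> bool:
        if name in volatile_roots or name in resampled:
            return True
        if any(depends_on_volatile(p) for p in parents.get(name, [])):
            # A volatile ancestor makes this variable need resampling.
            resampled.add(name)
            return True
        return False

    for name in parents:
        depends_on_volatile(name)
    return resampled


# Hypothetical model: "x" is a data container, "mu" a free variable,
# "y" depends on both, and "z" depends only on "mu".
graph = {"x": [], "mu": [], "y": ["x", "mu"], "z": ["mu"]}

# "x" changed after sampling, so only its descendant "y" is resampled;
# "mu" and "z" would be taken from the posterior.
print(find_resampled(graph, {"x"}))
# Nothing changed, so nothing needs resampling.
print(find_resampled(graph, set()))
```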

Docs / Maintenance

  • ...

@codecov

codecov bot commented Sep 26, 2022

Codecov Report

Merging #6147 (b12c189) into main (eff1cf2) will increase coverage by 47.10%.
The diff coverage is n/a.

❗ Current head b12c189 differs from pull request most recent head a9f6ab5. Consider uploading reports for the commit a9f6ab5 to get more accurate results

Additional details and impacted files

@@             Coverage Diff             @@
##             main    #6147       +/-   ##
===========================================
+ Coverage   40.84%   87.94%   +47.10%     
===========================================
  Files          91       91               
  Lines       20713    20707        -6     
===========================================
+ Hits         8460    18211     +9751     
+ Misses      12253     2496     -9757     
Impacted Files                               Coverage Δ
pymc/tests/distributions/test_bound.py       0.00% <0.00%> (-100.00%) ⬇️
pymc/tests/distributions/test_censored.py    0.00% <0.00%> (-100.00%) ⬇️
pymc/tests/distributions/test_truncated.py   0.00% <0.00%> (-100.00%) ⬇️
pymc/tests/distributions/test_simulator.py   0.00% <0.00%> (-99.51%) ⬇️
pymc/sampling_jax.py                         0.00% <0.00%> (-97.19%) ⬇️
pymc/distributions/truncated.py              34.96% <0.00%> (-64.34%) ⬇️
pymc/distributions/bound.py                  45.63% <0.00%> (-54.37%) ⬇️
pymc/distributions/simulator.py              34.04% <0.00%> (-53.20%) ⬇️
pymc/distributions/censored.py               92.50% <0.00%> (ø)
pymc/parallel_sampling.py                    85.80% <0.00%> (+0.33%) ⬆️
... and 69 more

@lucianopaz lucianopaz marked this pull request as ready for review September 27, 2022 12:36
@lucianopaz
Member Author

I'm not sure why precommit just dies after checking for links.

@lucianopaz lucianopaz force-pushed the ppc_volite_coords branch 2 times, most recently from 5d9228f to db4cbe5 Compare September 28, 2022 12:28
@ricardoV94 ricardoV94 self-requested a review September 28, 2022 12:32
@ricardoV94 ricardoV94 changed the title Check shared variable values to determine volatility in ppc Check shared variable values to determine volatility in posterior predictive sampling Sep 30, 2022
@ricardoV94 ricardoV94 merged commit e419d53 into pymc-devs:main Sep 30, 2022
@ricardoV94
Member

Neat one @lucianopaz!

Development

Successfully merging this pull request may close these issues.

Variables with shared inputs are always resampled from the prior in sample_posterior_predictive
2 participants