-
-
Notifications
You must be signed in to change notification settings - Fork 269
update data_container example to pymc 5.6 #559
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
update data_container example to pymc 5.6 #559
Conversation
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
Just saw #527, I didn't even know this feature existed. I'll update this notebook to use it in the final example. |
Thanks! This is great, we get questions about this quite often |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also add yourself as author: https://www.pymc.io/projects/docs/en/latest/contributing/jupyter_style.html#jupyter-authors
I refactored the last baby weight example to:
I think (2) is important, since it's not really a documented feature anywhere yet. |
View / edit / reply to this conversation on ReviewNB twiecki commented on 2023-07-12T09:26:25Z Should we just rename this to Using Data containers? jessegrabowski commented on 2023-07-12T09:46:55Z I like that better, since users don't need to know what a "shared variable" is to use the data containers API |
View / edit / reply to this conversation on ReviewNB twiecki commented on 2023-07-12T09:26:26Z *allow you to
Maybe also mention why anyone would want to use ConstantData (performance).
Also, I think that pm.Data also allows specification of dims, no? The only reason we have Data, MutableData, and ConstantData is historical and I think we should say that best practice is to use MutableData and ConstantData and only mention Data as a side-note. jessegrabowski commented on 2023-07-12T09:44:57Z I was thinking it would be nice to change one of the examples to use ConstantData, so that there's one usage for both. Maybe the first example. Then we could change around some headings (to point out which container is being used in which section, and why)?
A short passage at the beginning could be added, like "In past versions of PyMC, data was added to a model using
twiecki commented on 2023-07-13T10:20:53Z Yes, I think that's perfect. jessegrabowski commented on 2023-07-13T10:59:44Z I tried adding some details about each of the containers and give an example for each, let me know if it falls short of the mark. |
View / edit / reply to this conversation on ReviewNB twiecki commented on 2023-07-12T09:26:27Z size is internal API, either use shape, or, better yet, also a dummy-dim. jessegrabowski commented on 2023-07-12T09:51:08Z What do you mean by a dummy dim? twiecki commented on 2023-07-13T10:20:20Z Oh, I see what you're doing here now. I think it should just be |
I was thinking it would be nice to change one of the examples to use ConstantData, so that there's one usage for both. Maybe the first example. Then we could change around some headings (to point out which container is being used in which section, and why)?
A short passage at the beginning could be added, like "In past versions of PyMC, data was added to a model using
View entire conversation on ReviewNB |
I like that better, since users don't need to know what a "shared variable" is to use the data containers API View entire conversation on ReviewNB |
What do you mean by a dummy dim? View entire conversation on ReviewNB |
Oh, I see what you're doing here now. I think it should just be View entire conversation on ReviewNB |
Yes, I think that's perfect. View entire conversation on ReviewNB |
I tried adding some details about each of the containers and give an example for each, let me know if it falls short of the mark. View entire conversation on ReviewNB |
@@ -112,67 +116,100 @@ idata.posterior.coords | |||
az.plot_trace(idata, var_names=["europe_mean_temp", "city_temperature"]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
az.plot_trace(idata, var_names=["europe_mean_temp", "city_temperature"]); | |
az.plot_trace(idata, var_names=["europe_mean_temp", "city_temperature"], legend=True); |
this should add a legend to both rows, first with chain linestyle mapping, then with city color mapping.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The auto-placement leaves a bit to be desired -- is there any way to pass legend_kwargs
to the underlying plot?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added one extra comment, also forgot to mention that I review on the myst file, but you should work on the ipynb and let pre-commit update the myst one
…mples, rename variables to be more meaningful
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, I left a couple of smaller suggestions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shaping up really nice, just some minor suggestions
…NDOM_SEED in all sampling functions, promote GLM example to heading 2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
noticed one typo (again, fix in ipynb, then pre-commit), but other than that I think it is good to merge
Update
how_to/data_container
to PyMC 5.6Related to #333
The data container notebook is out of date. It uses PyMC3, and uses several APIs that are no longer needed/suggested/supported. Most prominent is using
pm.Data
instead of specifyingpm.MutableData
, but other small updates include:return_inferencedata=True
frompm.sample
az.from_pymc
to add apredictions
group; users should instead use thepredictions=True
flag inpm.sample_posterior_predictive
idata
to include aposterior_predictive
group; users should passextend_idata=True
insteadI also added a small explanation about the difference between
pm.ConstantData
and how to set mutablecoords
usingmodel.add_coord
. There might be a better way to do that last one.📚 Documentation preview 📚: https://pymc-examples--559.org.readthedocs.build/en/559/