
Interpolated distribution in V4 #5959


Closed
symeneses opened this issue Jul 8, 2022 · 3 comments · Fixed by #5986

@symeneses
Contributor

Description of your problem

While working on the issue pymc-devs/pymc-examples#115, where I am updating the code from PyMC v3 to v4, I got this error:

File ~/Documents/tech_world/bayesian/pymc/pymc/model.py:1790, in Model.check_start_vals(self, start)
   1787 initial_eval = self.point_logps(point=elem)
   1789 if not all(np.isfinite(v) for v in initial_eval.values()):
-> 1790     raise SamplingError(
   1791         "Initial evaluation of model at starting point failed!\n"
   1792         f"Starting values:\n{elem}\n\n"
...
Starting values:
{'alpha_interval__': array(nan), 'beta0_interval__': array(nan), 'beta1_interval__': array(nan)}

Initial evaluation results:
{'alpha': nan, 'beta0': nan, 'beta1': nan, 'Y_obs': nan}

Trying to find the reason, I realized that the following basic model works in v3 but not in v4.

Please provide a minimal, self-contained, and reproducible example.

import numpy as np
import pymc as pm

from scipy import stats

model = pm.Model()

x_points = np.linspace(0, 10, 100)
pdf_points = stats.norm.pdf(x_points, loc=1, scale=1)
with model:    
    # alpha = from_posterior("alpha", trace["alpha"])
    alpha = pm.Interpolated("alpha", x_points, pdf_points)
    # Expected value of outcome
    pm.sample(1000)

Please provide the full traceback.

Complete error traceback
---------------------------------------------------------------------------
SamplingError                             Traceback (most recent call last)
/Users/symeneses/Documents/tech_world/bayesian/pymc-examples/examples/howto/updating_priors.ipynb Cell 16' in <cell line: 10>()
     [12](vscode-notebook-cell:/Users/symeneses/Documents/tech_world/bayesian/pymc-examples/examples/howto/updating_priors.ipynb#ch0000023?line=11) alpha = pm.Interpolated("alpha", x_points, pdf_points)
     [13](vscode-notebook-cell:/Users/symeneses/Documents/tech_world/bayesian/pymc-examples/examples/howto/updating_priors.ipynb#ch0000023?line=12) # Expected value of outcome
---> [14](vscode-notebook-cell:/Users/symeneses/Documents/tech_world/bayesian/pymc-examples/examples/howto/updating_priors.ipynb#ch0000023?line=13) pm.sample(1000)

File ~/Documents/tech_world/bayesian/pymc/pymc/sampling.py:558, in sample(draws, step, init, n_init, initvals, trace, chain_idx, chains, cores, tune, progressbar, model, random_seed, discard_tuned_samples, compute_convergence_checks, callback, jitter_max_retries, return_inferencedata, idata_kwargs, mp_ctx, **kwargs)
    556 # One final check that shapes and logps at the starting points are okay.
    557 for ip in initial_points:
--> 558     model.check_start_vals(ip)
    559     _check_start_shape(model, ip)
    561 sample_args = {
    562     "draws": draws,
    563     "step": step,
   (...)
    573     "discard_tuned_samples": discard_tuned_samples,
    574 }

File ~/Documents/tech_world/bayesian/pymc/pymc/model.py:1790, in Model.check_start_vals(self, start)
   1787 initial_eval = self.point_logps(point=elem)
   1789 if not all(np.isfinite(v) for v in initial_eval.values()):
-> 1790     raise SamplingError(
   1791         "Initial evaluation of model at starting point failed!\n"
   1792         f"Starting values:\n{elem}\n\n"

Please provide any additional information below.

Changing the number of samples in linspace from 100 to 10 makes it work in v4 as well.

import numpy as np
import pymc as pm

from scipy import stats

model = pm.Model()

x_points = np.linspace(0, 10, 10)
pdf_points = stats.norm.pdf(x_points, loc=1, scale=1)
with model:    
    # alpha = from_posterior("alpha", trace["alpha"])
    alpha = pm.Interpolated("alpha", x_points, pdf_points)
    # Expected value of outcome
    pm.sample(1000)

Versions and main components

  • PyMC/PyMC3 Version: v4.0.1 (It works in 3.11.4)
  • Aesara/Theano Version: 2.7.3
  • Python Version: 3.10
  • Operating system: macOS 12.4
  • How did you install PyMC/PyMC3: conda
@michaelosthege
Member

Your observation that changing from linspace(0, 10, 100) to linspace(0, 10, 10) makes it work sounds like a numerical issue.

Not sure where it comes from though. It definitely sounds like a serious regression.
@ricardoV94 any idea?

@larryshamalama
Member

I know where the problem is... the moment dispatch is incorrectly implemented. It should be an easy fix; more on this in a bit. Thanks for spotting this issue @symeneses!

@larryshamalama self-assigned this Jul 18, 2022
@larryshamalama
Member

larryshamalama commented Jul 18, 2022

To be a bit more descriptive: the moment method provides the value of the expectation of the random variable. Here, $\mathbb{E}[X] = \int_{a}^b xf(x)dx$ with $a = x_0 < \dots < x_N = b$. A good approximation method would be the trapezoidal rule:

$$\mathbb{E}[X] = \int_{a}^b xf(x)dx \approx \sum_{k=1}^N \frac{x_k f(x_{k}) + x_{k-1}f(x_{k-1})}{2} \Delta x_k$$

with $\Delta x_k = x_k - x_{k-1}$ being the spacing between values of x_points. The current approach is more naive and, most importantly, does not account for this spacing, which is why moment can propose unreasonable initial values and yield nans when x_points contains more points. Here, moment(alpha).eval() yields array(1.16859335) with 10 points but array(12.74793719) with 100, which, as @michaelosthege suggests, is presumably numerically unstable. In the example above, these two values should in principle be close to each other.
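
For illustration, here is a minimal NumPy sketch (not the actual PyMC implementation) of that trapezoidal approximation applied to the reproducer's grid; the estimate comes out around 1.3 for both grid sizes (the density is truncated to [0, 10] and renormalized), as expected:

import numpy as np
from scipy import stats

for n_points in (10, 100):
    x_points = np.linspace(0, 10, n_points)
    pdf_points = stats.norm.pdf(x_points, loc=1, scale=1)

    # Normalize the tabulated pdf so it integrates to 1 over the grid,
    # then approximate E[X] = ∫ x f(x) dx with the trapezoidal rule.
    density = pdf_points / np.trapz(pdf_points, x_points)
    mean = np.trapz(x_points * density, x_points)
    print(n_points, mean)  # similar values (~1.3) for 10 and 100 points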

In v3, I believe the initial points were randomly sampled from the prior, which would explain why this model works in v3.
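
For comparison, a hypothetical sketch of drawing directly from the Interpolated prior, which is roughly what picking starting points from the prior would look like (this assumes pm.draw is available in the installed v4 release); the forward draws stay near the bulk of the density rather than the moment-based value of ~12.7 reported above:

import numpy as np
import pymc as pm
from scipy import stats

x_points = np.linspace(0, 10, 100)
pdf_points = stats.norm.pdf(x_points, loc=1, scale=1)

with pm.Model():
    alpha = pm.Interpolated("alpha", x_points, pdf_points)

# Forward samples from the prior; their mean is roughly 1.3,
# nowhere near the problematic moment value of ~12.7.
draws = pm.draw(alpha, draws=1000)
print(draws.mean())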
