
use_observed_lib_size=False fails with NaN for some datasets #1903

Open · vitkl opened this issue Feb 11, 2023 · 6 comments

@vitkl (Contributor) commented Feb 11, 2023

Hi,

I see that use_observed_lib_size=False fails with NaN for some datasets. I would like to use use_observed_lib_size=False to estimate the technical normalisation effect per cell, which does not necessarily match the total count per cell, especially in more complex data from multiple tissues and cell types. For one dataset, restarting the notebook and rerunning all cells a few times helps. For the same dataset, reducing n_hidden from 1024 to 512 also helps. For a different dataset, even n_hidden=128 and n_latent=30 don't work. It is hard to provide a reproducible example because I observed this issue on unpublished snRNA-seq data.

My guess would be that use_observed_lib_size=False is not particularly numerically stable. I have observed similar issues with other models when priors are chosen in suboptimal ways. Which prior is used for this technical normalisation effect?

I generally use a batch-specific prior that regularises the model to keep the cell-specific normalisation y_c close to 1 (e.g. in the cell2location package):

y_c ~ Gamma(a, a / y_e)
y_e ~ Gamma(10, 10)

which regularises the batch-specific normalisation effect y_e to be close to 1, and regularises the cell-specific normalisation effect y_c to be close to the per-batch average y_e, with the strength of the pull controlled by the hyperparameter a.
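
For concreteness, a minimal Pyro sketch of this prior (the function name, the default value of a, and the batch_index tensor are illustrative, not the cell2location code):

```python
import torch
import pyro
import pyro.distributions as dist

def normalisation_prior(batch_index: torch.LongTensor, n_batches: int, a: float = 9.0):
    """Hypothetical sketch; batch_index maps each cell to its batch."""
    # Batch-specific effect y_e ~ Gamma(10, 10): prior mean 10 / 10 = 1,
    # so batches are regularised towards no normalisation effect.
    y_e = pyro.sample(
        "y_e",
        dist.Gamma(torch.full((n_batches,), 10.0),
                   torch.full((n_batches,), 10.0)).to_event(1),
    )
    # Cell-specific effect y_c ~ Gamma(a, a / y_e): prior mean
    # a / (a / y_e) = y_e, so cells are pulled towards their batch
    # average; larger a gives a stronger pull.
    a_vec = torch.full(batch_index.shape, a)
    y_c = pyro.sample(
        "y_c",
        dist.Gamma(a_vec, a_vec / y_e[batch_index]).to_event(1),
    )
    return y_c
```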

Please let me know what you think is going on with this use_observed_lib_size=False NaN issue, and what you think about using more regularised priors.

vitkl added the bug label on Feb 11, 2023
@canergen (Member)

Can you provide the minimum (and maximum) of your library size, and say whether this behaviour depends on it? I have observed this when not filtering for min_counts. I am not sure whether the problem is the range or the absolute value. This might not be what you are looking for, but it could help to figure out the problem.
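
For reference, a quick way to check this (assuming raw counts in adata.X of an AnnData object; not part of any scvi-tools API):

```python
import numpy as np

# Per-cell library size = total counts per cell; np.asarray(...).ravel()
# handles both dense and scipy-sparse adata.X.
lib_size = np.asarray(adata.X.sum(axis=1)).ravel()
print(f"min={lib_size.min():.0f}, max={lib_size.max():.0f}, "
      f"range ratio={lib_size.max() / lib_size.min():.1f}")
```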

@adamgayoso (Member)

This is why we switched the default. I imagine the problem exists because we use exp to transform the log library size, where we could instead consider using softplus.

The prior is described here. It is designed so that $\ell_n$ is on the same scale as the observed library size.
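
To illustrate the exp point (a toy example, unrelated to the actual encoder outputs):

```python
import torch
import torch.nn.functional as F

# exp of a large log-library-size estimate overflows float32 to inf,
# which then propagates NaNs through the loss:
log_l = torch.tensor([5.0, 30.0, 90.0])
print(torch.exp(log_l))    # tensor([1.4841e+02, 1.0686e+13,        inf])

# softplus(x) = log(1 + exp(x)) grows only linearly for large x,
# so it stays finite on the same inputs:
print(F.softplus(log_l))   # tensor([ 5.0067, 30.0000, 90.0000])
```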

@vitkl (Contributor, Author) commented Mar 13, 2023

After reviewing this more carefully, I think:

  1. The prior l_sigma is an overestimate of the total variance that can be attributed to the technical effect. A potential (and easy) fix would be to add a hyperparameter that allows the user to reduce the prior variance with a simple weight.

  2. Indeed, it is possible that softplus would make the computation more stable. Is it easy to change to softplus exclusively for size factors? Is this the operation here: https://github.com/scverse/scvi-tools/blob/library_stability/scvi/nn/_base_components.py#L290?

  3. I assume that the encoder network size n_hidden is the same for both z and l, which in my example is a pretty large number. This would mean the network is massively overparameterised. In my trials of amortizing inference for cell2location, I observed that such 1-d parameters need to be amortized with a much smaller network (n_hidden=10) to achieve numerical stability and avoid increases in the loss; see the sketch at the end of this comment. Is it possible to change n_hidden exclusively for this parameter? If yes, I would like to try this and report results.

I think a combination of 1 and 3 could solve this.

I assume that when the library size is learnable, the biological expression is transformed to a positive scale using softplus rather than softmax, correct?
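
Regarding point 3, a minimal sketch of what a separate, much smaller encoder for the 1-d library size could look like (illustrative names, not the scvi-tools Encoder class):

```python
import torch
from torch import nn

class SmallLibraryEncoder(nn.Module):
    """Hypothetical tiny encoder amortizing only the 1-d log library size."""

    def __init__(self, n_input: int, n_hidden: int = 10):
        super().__init__()
        self.hidden = nn.Sequential(nn.Linear(n_input, n_hidden), nn.ReLU())
        self.mean = nn.Linear(n_hidden, 1)      # posterior mean of log(l)
        self.log_var = nn.Linear(n_hidden, 1)   # posterior log-variance

    def forward(self, x: torch.Tensor):
        h = self.hidden(torch.log1p(x))         # log1p-stabilised counts
        return self.mean(h), self.log_var(h)
```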

@vitkl (Contributor, Author) commented Mar 14, 2023

This line https://github.com/scverse/scvi-tools/blob/library_stability/scvi/module/_vae.py#L218 means that softplus is used here https://github.com/scverse/scvi-tools/blob/library_stability/scvi/nn/_base_components.py#L406 only when use_size_factor_key == True. This doesn't make sense: it should be "softplus" if (use_size_factor_key or not use_observed_lib_size) else "softmax".

If you softmax-transform the gene expression prediction, it doesn't matter whether the library size is estimated or is the "observed total count": it has to match the same total count per cell either way.
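
In code, the proposed change would look something like this (a sketch against the snippet quoted above, not a tested patch):

```python
# Hypothetical patch around _vae.py#L218: pick the decoder's output
# activation so that a learnable library size also gets softplus.
scale_activation = (
    "softplus"
    if (use_size_factor_key or not use_observed_lib_size)
    else "softmax"
)
```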

@adamgayoso (Member)

> only when use_size_factor_key == True. This doesn't make sense: it should be "softplus" if (use_size_factor_key or not use_observed_lib_size) else "softmax".

We need to preserve backwards compatibility: scVI was originally described with a latent library size and a softmax transformation.

@vitkl (Contributor, Author) commented Mar 14, 2023

OK, I can define an option to use softplus but keep the rest the same.
