Support conda (python) package manager #2213

rarkins · 2018-07-04T04:53:05Z

meg-hegde · 2020-07-17T07:03:54Z

Hi, does Renovate now support conda?

meg-hegde · 2020-08-05T16:47:23Z

Hi, just wondering whether there are plans to add conda support soon? Alternatively, shall I try adding it using the instructions here: https://github.com/renovatebot/renovate/blob/master/docs/development/adding-a-package-manager.md?

rarkins · 2020-08-05T20:14:51Z

No plans, and a PR would be very welcome! I updated the doc just now to make sure it's current.

meg-hegde · 2020-08-06T08:45:20Z

Thank you for updating the docs - I'll give this a go when I have some time :)

gerbenoostra · 2020-08-13T11:08:20Z

#6969 duplicated this, thus closed it. The relevant comments from there:
What would you like Renovate to be able to do?

To also verify python package versions in conda environment files (environment.yml)

Did you already have any implementation ideas?
no

Are there any workarounds or alternative ideas you've tried to avoid needing this feature?

Conda environments can also include pip requirements; a workaround is to put those in a separate txt file and have Renovate check that file.
environment.yml would then be:

dependencies:
- python=3.7
- jupyter
- pip
- pip:
  - -rrequirements.txt

This workaround however does not work for the conda packages (like the python=3.7 here, and any conda packages installed, like jupyter in this case)

Is this a feature you'd be interested in implementing yourself?
maybe

**Related features**
Relates to #931, but that only implemented pip dependencies.

rarkins · 2020-08-13T11:24:02Z

It would be helpful if anyone can provide some public repo examples that can be tested against, as well as clarifications on file naming / file syntax. For example should we match against every environment.yml file in the repo?

padmick · 2020-09-07T10:21:44Z

Their docs (https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#creating-an-environment-from-an-environment-yml-file) reference environment.yml files. Internally we use different names for our different projects (i.e. we give the env file the same codename as the project, so titan.yml etc.), but if that is an issue we can look at changing them back to the conda default name.

davidspek · 2021-02-03T13:32:30Z

I'm also interested in this as the PyTorch images (and the jupyter-stack ones) use conda.

AndreaGiardini · 2021-06-05T14:16:59Z

We could give it a shot, starting from the datasource. I am not very familiar with typescript but I could give it a go.

The docs in the datasource require a function called `getReleases` with input:

`lookupName`: the package's full name including scope if present (e.g. @foo/bar)
`registryUrls`: an array of registry Urls to try

If I understood it correctly, for something like https://anaconda.org/conda-forge/proj/ we will have:

lookupName -> conda-forge/proj
registryUrl -> https://anaconda.org/

Am I missing something?
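
A minimal sketch of the mapping asked about above, assuming the channel and the package name are the first two path segments of an anaconda.org page URL. `splitAnacondaUrl` and `DatasourceInput` are hypothetical names used only for illustration; they are not part of Renovate:

```typescript
// Hypothetical helper, not a Renovate API: splits a package page URL such as
// https://anaconda.org/conda-forge/proj/ into the two datasource inputs.
interface DatasourceInput {
  lookupName: string;
  registryUrl: string;
}

function splitAnacondaUrl(pageUrl: string): DatasourceInput {
  const url = new URL(pageUrl);
  // Drop the empty segments produced by leading and trailing slashes.
  const segments = url.pathname.split('/').filter(Boolean);
  const channel = segments[0];
  const name = segments[1];
  if (!channel || !name) {
    throw new Error(`Expected <channel>/<package> in ${pageUrl}`);
  }
  return { lookupName: `${channel}/${name}`, registryUrl: `${url.origin}/` };
}

console.log(splitAnacondaUrl('https://anaconda.org/conda-forge/proj/'));
```

Under these assumptions the example URL yields `conda-forge/proj` as `lookupName` and `https://anaconda.org/` as the registry URL, which is the split proposed in the comment.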

davidspek · 2021-06-05T21:58:59Z

I would just like to make a note that the conda-forge channel would be important to have support for (also in terms of ToS of the regular conda channel).

rarkins · 2021-06-06T19:14:35Z

I agree that starting with the datasource first makes best sense.

Importantly, note that Renovate doesn't yet have the concept of "platform" for datasources but it looks like that might be necessary for conda packages?

mathbunnyru · 2021-06-20T19:04:33Z

Jupyter docker images would also greatly benefit from this feature.
jupyter/docker-stacks#1153

Right now, we're updating our dependencies manually, and it would be great to get rid of this maintenance burden.

morremeyer · 2022-02-16T15:01:52Z

Hey everyone, Anaconda engineer here. We've assembled a small group of engineers that is looking into adding functionality for conda over time.

Please note that none of us is working on this full time right now, but work will be done over time.

I've started implementing a datasource for conda in #14257, any help with my testing issue and feedback in general is very welcome!

mdehollander · 2022-02-25T16:02:59Z

> Hey everyone, Anaconda engineer here. We've assembled a small group of engineers that is looking into adding functionality for conda over time.
>
> Please note that none of us is working on this full time right now, but work will be done over time.
>
> I've started implementing a datasource for conda in #14257, any help with my testing issue and feedback in general is very welcome!

Thanks for starting the work on this! I am happy to start testing, but I am not sure how, since I am new to Renovate. I tried adding a renovate.json to my repo (https://github.com/mdehollander/orochi/blob/master/renovate.json). The conda environment files are in the folder src/envs/. But I did not manage to set it up correctly, since I get this error:

"validationMessage": "Invalid configuration option: conda, The following managers configured in enabledManagers are not supported: \"conda\"",

What is the best way to test and configure this? Or am I too early? 😺

morremeyer · 2022-02-25T17:38:00Z

@mdehollander You're not too early, but there's no manager implemented yet, just a datasource. I'll take a first shot at a manager next Friday.

You can check the datasource documentation at https://docs.renovatebot.com/modules/datasource/#conda-datasource.

If you want to use it right now, here's an example for how to do so. In your environment.yml, add a comment that annotates the line for renovate:

name: your-project
channels:
  - defaults
dependencies:
  - pytest
  - pytest-cov
  - coverage
  # renovate datasource=conda depName=main/yapf
  - yapf==0.31.0

This annotates the yapf package for the regexManager and will query the main channel for the versions. You can then use the regexManager with the following configuration:

{
  "reviewersFromCodeOwners": true,
  "regexManagers": [
    {
      "description": "Upgrade conda dependencies",
      "fileMatch": [
        "(^|/)environment\\.yml$"
      ],
      "matchStrings": [
        "# renovate datasource=conda\\sdepName=(?<depName>.*?)\\s+- [a-z0-9]+==\"?(?<currentValue>.*)\"?"
      ],
      "datasourceTemplate": "conda"
    }
  ]
}

The job of the package manager is to do the discovery/annotation that I show above automatically so that you don’t need any configuration in the default case.
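
To see what the `matchStrings` pattern above captures, here is a small sketch that runs it against an annotated environment.yml with a plain JavaScript `RegExp`. It only illustrates the two capture groups; Renovate's own regex handling may differ in engine and flags, and `extractAnnotated` and `AnnotatedDep` are hypothetical names:

```typescript
// Same pattern as in the config; only the JSON escaping of the quotes is dropped.
const matchString =
  '# renovate datasource=conda\\sdepName=(?<depName>.*?)\\s+- [a-z0-9]+=="?(?<currentValue>.*)"?';

const environmentYml = `name: your-project
channels:
  - defaults
dependencies:
  - pytest
  # renovate datasource=conda depName=main/yapf
  - yapf==0.31.0
`;

interface AnnotatedDep {
  depName: string;
  currentValue: string;
}

// Hypothetical helper: collects every annotated dependency found in the file.
function extractAnnotated(content: string): AnnotatedDep[] {
  const pattern = new RegExp(matchString, 'g');
  const deps: AnnotatedDep[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(content)) !== null) {
    const depName = match.groups?.['depName'];
    const currentValue = match.groups?.['currentValue'];
    if (depName && currentValue) {
      deps.push({ depName, currentValue });
    }
  }
  return deps;
}

console.log(extractAnnotated(environmentYml));
```

Only the annotated `yapf` line is picked up; `pytest` has no comment above it and is ignored. Note that `[a-z0-9]+` does not match package names containing hyphens, so a package such as `pytest-cov` would need a wider character class.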

mdehollander · 2022-03-04T19:39:30Z

Thanks for the extra information and the example config. With this I managed to get a PR triggered for an update of a conda environment.
From the logs:

DEBUG: Matched 33 file(s) for manager regex: src/envs/amos.yaml, src/envs/antismash.yaml, src/envs/bamm.yaml, src/envs/bbmap.yaml, src/envs/bedtools.yaml, src/envs/bigscape.yaml, src/envs/bigslice.yaml, src/envs/bwa.yaml, src/envs/cat.yaml, src/envs/checkm.yaml, src/envs/concoct.yaml, src/envs/coverm.yaml, src/envs/dastool.yaml, src/envs/fraggenescan.yaml, src/envs/groopm.yaml, src/envs/khmer.yaml, src/envs/kraken.yaml, src/envs/mash.yaml, src/envs/megahit.yaml, src/envs/metabat.yaml, src/envs/minimap2.yaml, src/envs/mmgenome.yaml, src/envs/mmgenome_prepare.yaml, src/envs/pigz.yaml, src/envs/prodigal.yaml, src/envs/quast.yaml, src/envs/report.yaml, src/envs/samtools.yaml, src/envs/seqtk.yaml, src/envs/spades.yaml, src/envs/tree.yaml, src/envs/vamb.yaml, src/envs/vsearch.yaml

DEBUG: 2 flattened updates found: bioconda/spades, bioconda/vsearch
DEBUG: Returning 2 branch(es)
DEBUG: Fetching changelog: https://github.com/ablab/spades (3.14.0 -> 3.15.4)

And 2 PRs for 2 packages that I enabled: https://github.com/mdehollander/orochi/pulls

To get the environment files in subfolders recognized I changed the regular expression in the config to:

      "fileMatch": [
        "^(?:src/envs/)?\\w+\\.yaml$"
      ],
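
As a quick check of what this pattern selects, the following sketch tests it against a few paths with a plain JavaScript `RegExp`. The path list is made up for illustration (only the `src/envs/` entries come from the log above), and Renovate's own matching may behave differently in details:

```typescript
// The adjusted fileMatch pattern from the config, as a JavaScript RegExp.
const fileMatch = new RegExp('^(?:src/envs/)?\\w+\\.yaml$');

// Hypothetical repository paths used to probe the pattern.
const candidates = [
  'src/envs/amos.yaml',
  'src/envs/mmgenome_prepare.yaml',
  'config.yaml',
  'environment.yml',
  'docs/envs/amos.yaml',
];

const matched = candidates.filter((path) => fileMatch.test(path));
console.log(matched);
```

Because the `src/envs/` prefix is optional, the pattern also matches any top-level `.yaml` file such as `config.yaml`. Dropping the `?` after the group would restrict it to the env folder.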

Looks very promising! Thanks! Looking forward to a manager :)

mdehollander · 2022-04-26T15:40:21Z

@morremeyer I am wondering whether you managed to work on automatic discovery of conda packages via the package manager. That would make it easier to use on existing installations, because you wouldn't need to annotate the env files :)

trim21 · 2025-02-27T09:08:23Z

> The files are often also served using zstd compression, which makes them much smaller (20 MB). There is also an accepted conda enhancement proposal called "sharded repodata" that splits the repodata into individual files per package, but it is at the moment only supported by channels on prefix.dev.
>
> rattler provides the so-called repodata gateway object, which hides all this complexity and allows one to simply query for the repodata of a specific package. The gateway then figures out what the most efficient way of fetching the data is, as well as caching all of it.
>
> I have been working on adding this to the javascript bindings of rattler with some good results but it's not done yet.
>
> For the time being, I recommend you use the zstd compressed files.

Regarding the rattler_repodata_gateway crate: I think Renovate, not the Rust code, should handle the HTTP requests, so that they can use Renovate's shared HTTP cache.

baszalmstra · 2025-02-27T09:25:26Z

I assume renovate uses the fetch api, which is what the rust code will also use.

trim21 · 2025-02-27T09:29:59Z

> I assume renovate uses the fetch api, which is what the rust code will also use.

It has a wrapped HTTP request client that is used by all datasources:

renovate/lib/modules/datasource/conda/index.ts

Line 55 in 2c6a500

response = await this.http.getJsonUnchecked(url);

baszalmstra · 2025-02-27T09:54:55Z

I see, yeah, that complicates things. We can probably make this work, but it will make things more complicated. From what I understand, the reqwest crate which is used by rattler calls back into javascript to call the fetch API. I assume we can also inject another method to do this manually and allow overriding the client?

But I believe the fetch API also does caching, so as long as renovate is not requesting the same URLs (which doesn't make a lot of sense when you use the gateway) I don't think it's that bad of a problem.

I'll not be implementing a custom fetch API in the initial version of the rattler gateway API. Would be happy to accept PRs though!

trim21 · 2025-02-27T09:57:44Z

> The files are often also served using zstd compression, which makes them much smaller (20 MB). There is also an accepted conda enhancement proposal called "sharded repodata" that splits the repodata into individual files per package, but it is at the moment only supported by channels on prefix.dev.
>
> rattler provides the so-called repodata gateway object, which hides all this complexity and allows one to simply query for the repodata of a specific package. The gateway then figures out what the most efficient way of fetching the data is, as well as caching all of it.
>
> I have been working on adding this to the javascript bindings of rattler with some good results but it's not done yet.
>
> For the time being, I recommend you use the zstd compressed files.

It's not only the file size but also the memory usage. For example, parsing conda-forge/linux-64/repodata.json takes up to 1 GB of memory:

import * as fs from 'node:fs';

const file = fs.readFileSync(`./conda-forge/linux-64/repodata.json`, 'utf8');

const obj = JSON.parse(file);

console.log(process.memoryUsage());

const _ = obj;

{
  rss: 1052028928,
  heapTotal: 1012838400,
  heapUsed: 983224232,
  external: 1691001,
  arrayBuffers: 10475
}

trim21 · 2025-02-27T10:04:37Z

Does Anaconda have sharded repodata now?

Looks like not yet: conda/conda-index#161

baszalmstra · 2025-02-27T10:06:10Z

The memory usage is significantly reduced by using the gateway. It does not parse the entire file as JSON but only cleverly parses the parts from the repodata that it actually needs. It does however need all the bytes in memory. With sharded repodata this problem is also mitigated.

> Does Anaconda have sharded repodata now?

Unfortunately not yet.

trim21 · 2025-02-27T10:08:24Z

> I see, yeah, that complicates things. We can probably make this work, but it will make things more complicated. From what I understand, the reqwest crate which is used by rattler calls back into javascript to call the fetch API. I assume we can also inject another method to do this manually and allow overriding the client?
>
> But I believe the fetch API also does caching, so as long as renovate is not requesting the same URLs (which doesn't make a lot of sense when you use the gateway) I don't think it's that bad of a problem.
>
> I'll not be implementing a custom fetch API in the initial version of the rattler gateway API. Would be happy to accept PRs though!

I think Renovate also supports auth config per HTTP host, and that is handled by this.http here.

trim21 · 2025-02-27T10:09:46Z

I only use anaconda so I also won't implement it. 😅

trim21 · 2025-02-27T10:24:02Z

I'd like to suggest we rename the current conda datasource to anaconda, since it only works with the Anaconda repo and not with generic conda repos.

rarkins · 2025-02-27T10:26:45Z

> I'd like to suggest we rename the current conda datasource to anaconda, since it only works with the Anaconda repo and not with generic conda repos.

It depends.

Assuming that non-anaconda registries will be supported in future, would they be best added to the existing datasource, or to a separate one?

If the anaconda API is close to identical to the non-anaconda Conda APIs, then it should be the same datasource (like we do with docker datasource).

trim21 · 2025-02-27T10:37:03Z

> Assuming that non-anaconda registries will be supported in future, would they be best added to the existing datasource, or to a separate one?

I think a separate one.

> If the anaconda API is close to identical to the non-anaconda Conda APIs, then it should be the same datasource (like we do with docker datasource).

It's not very close. A non-Anaconda conda repo doesn't even have an API for a single package. The Anaconda API is also not part of the spec.

trim21 · 2025-02-27T10:44:58Z

It's kind of like git tags / github tags.

rarkins · 2025-02-27T10:52:33Z

The next challenge is that although we could rename datasource/conda to datasource/anaconda, and add migration code so that any existing config for conda is now massaged to anaconda, this concept would then break if we added back a conda datasource. Then we wouldn't know if user config was referring to conda (new) or the older conda/anaconda.

trim21 · 2025-02-27T10:54:32Z

> The next challenge is that although we could rename datasource/conda to datasource/anaconda, and add migration code so that any existing config for conda is now massaged to anaconda, this concept would then break if we added back a conda datasource. Then we wouldn't know if user config was referring to conda (new) or the older conda/anaconda.

Oops, I forgot there is the regex manager, and someone is already using the datasource with it.

baszalmstra · 2025-02-27T10:55:03Z

We could call the new one simply conda-repodata or something along those lines? or conda-channel? Naming is hard..

pavelzw · 2025-02-27T10:56:09Z

Couldn't we call that one anaconda-api?

trim21 · 2025-02-27T10:57:34Z

conda-channel looks good and makes more sense.

rarkins · 2025-02-27T10:59:23Z

> It's kind of like git tags / github tags.

Funny that you mention that. I have plans for git-tags to contain the logic which identifies "oh this is a github repository - let's use the github-tags datasource instead". Otherwise you force each manager to have to implement that logic. Similar with conda - it could be a single datasource but dispatch logic separately. Docker isn't the only one like that - it's quite common for the default registry in ones like PyPI or Cargo to have specific implementations.

trim21 · 2025-02-27T11:14:27Z

I was actually just thinking the same thing.

For a package that can come from multiple channels, you can't use different datasources, so a single conda datasource would be better.

In this case, the manager that produces conda packages should prepare the full registry URLs (pointing to the conda repo): for conda-forge on Anaconda it should be https://conda.anaconda.org/conda-forge/, for conda-forge from prefix.dev it's https://prefix.dev/conda-forge/, and for a mirror it's https://mirrors.some.org/anaconda/conda-forge/. Always add a trailing slash.

Then we parse the registry URL in the conda datasource to decide how to get package versions: for example, use api.anaconda.org for https://conda.anaconda.org/, the GraphQL API of prefix.dev for https://prefix.dev/, or (if someone implements it) generic conda repo logic based on repodata.json for all unknown registries.
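
A rough sketch of that dispatch step, assuming the decision is made purely on the host name of the registry URL. The strategy names and `resolveLookup` are hypothetical and do not exist in Renovate; the actual fetching is left out:

```typescript
// Hypothetical strategy names; none of these identifiers exist in Renovate.
type LookupStrategy = 'anaconda-api' | 'prefix-graphql' | 'generic-repodata';

interface ChannelLookup {
  strategy: LookupStrategy;
  channel: string;
}

function resolveLookup(registryUrl: string): ChannelLookup {
  const url = new URL(registryUrl);
  // The channel is taken to be the last non-empty path segment.
  const segments = url.pathname.split('/').filter(Boolean);
  const channel = segments[segments.length - 1] ?? '';
  switch (url.hostname) {
    case 'conda.anaconda.org':
      return { strategy: 'anaconda-api', channel };
    case 'prefix.dev':
      return { strategy: 'prefix-graphql', channel };
    default:
      // Unknown hosts (mirrors, internal repos) fall back to repodata.json.
      return { strategy: 'generic-repodata', channel };
  }
}

for (const registryUrl of [
  'https://conda.anaconda.org/conda-forge/',
  'https://prefix.dev/conda-forge/',
  'https://mirrors.some.org/anaconda/conda-forge/',
]) {
  console.log(registryUrl, resolveLookup(registryUrl));
}
```

Keeping the choice inside the datasource means managers only have to emit channel URLs and never need to know which backend serves them.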

The manager should then output packages as in the following cases, all with conda versioning and the conda datasource:

// what the current conda datasource supports; goes to api.anaconda.org
{
  packageName: 'conda-forge/numpy',
}

// goes to https://prefix.dev/api/graphql.
{
  packageName: 'numpy',
  registryUrls: ["https://prefix.dev/conda-forge/"]
}

// for multiple channel support: goes to api.anaconda.org and falls back to
// https://prefix.dev/api/graphql
// if the package is missing from conda-forge on api.anaconda.org.
{
  packageName: 'numpy',
  registryUrls: [
    "https://conda.anaconda.org/conda-forge/", 
    "https://prefix.dev/conda-forge/",
  ]
}

// for multiple channel support, goes to api.anaconda.org first,
// then use generic conda logic to get versions from
// https://conda.repo.some.org/internal/
{
  packageName: 'package-not-exists-in-conda-forge',
  registryUrls: [
    "https://conda.anaconda.org/conda-forge/",
    "https://conda.repo.some.org/internal/",
  ]
}

And we keep the current default registry URL of the conda datasource (which is api.anaconda.org), so current regex manager users will also be happy.

For an environment.yml example:

name: example
channels:
    - https://conda.anaconda.org/menpo
    - conda-forge
dependencies:
    - python==3.5.2
    - conda-forge::numpy
    - pip:
        - tensorflow

I would expect it to produce this:

[
  {
    packageName: 'python',
    datasource: 'conda',
    versioning: 'conda',
    currentValue: '==3.5.2',
    registryUrls: [
      "https://conda.anaconda.org/menpo/",
      "https://conda.anaconda.org/conda-forge/",
    ]
  },
  {
    packageName: 'conda-forge/numpy',
    versioning: 'conda',
    datasource: 'conda',
  },
  {
    packageName: 'tensorflow',
    versioning: 'pep440',
    datasource: 'pypi',
  }
]
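
The mapping from that environment.yml to the expected output can be sketched as follows. This is only an illustration under stated assumptions: the file is already parsed into an object, bare channel names live on conda.anaconda.org, and every type and function name here (`Dep`, `Environment`, `channelToUrl`, `splitSpec`, `extractDeps`) is hypothetical rather than Renovate's manager API:

```typescript
// Hypothetical types and helper names; this is not Renovate's manager API.
interface Dep {
  packageName: string;
  datasource: 'conda' | 'pypi';
  versioning: 'conda' | 'pep440';
  currentValue?: string;
  registryUrls?: string[];
}

interface Environment {
  channels: string[];
  dependencies: Array<string | { pip: string[] }>;
}

// Assumption: bare channel names are hosted on conda.anaconda.org.
function channelToUrl(channel: string): string {
  const base = /^https?:\/\//.test(channel)
    ? channel
    : `https://conda.anaconda.org/${channel}`;
  return base.endsWith('/') ? base : `${base}/`;
}

// Splits "name==1.0" into the name and an optional version constraint.
function splitSpec(spec: string): { name: string; currentValue?: string } {
  const match = /^([A-Za-z0-9_.-]+)\s*(.*)$/.exec(spec.trim());
  const name = match?.[1] ?? spec;
  const constraint = match?.[2]?.trim();
  return constraint ? { name, currentValue: constraint } : { name };
}

function extractDeps(env: Environment): Dep[] {
  const registryUrls = env.channels.map(channelToUrl);
  const deps: Dep[] = [];
  for (const entry of env.dependencies) {
    if (typeof entry !== 'string') {
      // Nested "pip:" list: plain PyPI requirements.
      for (const spec of entry.pip) {
        const { name, currentValue } = splitSpec(spec);
        const dep: Dep = { packageName: name, datasource: 'pypi', versioning: 'pep440' };
        if (currentValue) dep.currentValue = currentValue;
        deps.push(dep);
      }
      continue;
    }
    // "channel::name" pins the package to one channel.
    const sep = entry.indexOf('::');
    const channel = sep === -1 ? undefined : entry.slice(0, sep);
    const { name, currentValue } = splitSpec(sep === -1 ? entry : entry.slice(sep + 2));
    const dep: Dep = {
      packageName: channel ? `${channel}/${name}` : name,
      datasource: 'conda',
      versioning: 'conda',
    };
    if (currentValue) dep.currentValue = currentValue;
    // Unpinned packages are looked up in every listed channel, in order.
    if (!channel) dep.registryUrls = registryUrls;
    deps.push(dep);
  }
  return deps;
}

const example: Environment = {
  channels: ['https://conda.anaconda.org/menpo', 'conda-forge'],
  dependencies: ['python==3.5.2', 'conda-forge::numpy', { pip: ['tensorflow'] }],
};
console.log(JSON.stringify(extractDeps(example), null, 2));
```

The sketch does not expand the `defaults` channel, and it uses a deliberately simple regular expression to separate the package name from its constraint.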

There is also a defaults channel, which means main + r + msys2 and should be handled by the conda manager.

Should we allow the conda manager and the pixi manager to produce packages like this? It currently works with our conda manager, but it is not very ideal:

{
  packageName: 'numpy',
  registryUrls: [
    "https://api.anaconda.org/package/cuda/", 
    "https://api.anaconda.org/package/conda-forge/", 
  ]
}

I think it would be best if managers never used an API URL as the registry URL in the future, but currently they should do this to support multiple channels from Anaconda, and ignore channels that are not from Anaconda (for now).

trim21 · 2025-03-05T18:10:04Z

Another problem: there is no way to know whether a version is yanked. So you will encounter this:

Renovate will try to update a package to a yanked version and you get a broken manifest file; the package manager then tells you it can't find an available distribution.

baszalmstra · 2025-03-05T20:32:57Z

If you use the prefix graphql API you can use the yankedReason to determine if a package is yanked.

If you use repodata.json, yanked entries should be under the removed key.

Does that help?

trim21 · 2025-03-05T21:40:09Z

#34646 should work for most pixi cases; now I just need to get it merged 😄

You should be able to get lock file maintenance for pixi once Renovate deploys 39.190.0 to production.

rarkins added type:feature Feature (new functionality) needs-requirements priority-4-low Low priority, unlikely to be done unless it becomes important to more people labels Jul 4, 2018

rarkins added ready and removed needs-requirements labels Oct 25, 2018

rarkins changed the title ~~Support conda (python)~~ Support conda (python) package manager Mar 8, 2019

rarkins added the new package manager New package manager support label Mar 8, 2019

rarkins removed ready labels Jun 18, 2020

rarkins mentioned this issue Aug 13, 2020

Conda environment support for python projects #6969

Closed

rarkins added the status:ready label Jan 12, 2021

morremeyer mentioned this issue Feb 16, 2022

feat(datasource/conda): add conda datasource #14257

Merged

6 tasks

This was referenced Feb 27, 2025

docs: update conda data source usage for multiple channels #34529

Closed

feat: new manager pixi #34400

Merged

This was referenced Mar 5, 2025

feat(manager/pixi): full support #34642

Closed

feat(manager/pixi): full support #34646

Open

This was referenced Mar 6, 2025

export MatchSpec in js-rattler conda/rattler#1142

Open

feat(datasource/conda): support calling prefix.dev #34681

Merged

trim21 mentioned this issue Mar 15, 2025

feat: add conda versioning #34351

Open

6 tasks
