24.2.2 has either been retagged or the git_archival.txt makes it non-reproducible #4110

Closed
dvzrv opened this issue Apr 14, 2024 · 5 comments

@dvzrv

dvzrv commented Apr 14, 2024

Summary

Hi! 👋

We are currently rebuilding all packages against Python 3.12 on Arch Linux.
On rebuild I noticed that the tag commit for 24.2.2 has changed.

We are locking the tag commit of the upstream repository using a checksum mechanism (see https://gitlab.archlinux.org/archlinux/packaging/packages/ansible-lint/-/blob/2ba0170c9cc645f939d1ba29f65d02d718cd132e/PKGBUILD)

This has changed since 2024-03-14, when we initially built the package:

git diff
diff --git i/PKGBUILD w/PKGBUILD
index 79770e2..f13a6d6 100644
--- i/PKGBUILD
+++ w/PKGBUILD
@@ -5,7 +5,7 @@

 pkgname=ansible-lint
 pkgver=24.2.2
-pkgrel=1
+pkgrel=2
 pkgdesc="Checks playbooks for practices and behaviour that could potentially be improved."
 arch=('any')
 url="https://github.com/ansible/ansible-lint"
@@ -17,7 +17,7 @@ checkdepends=(mypy python-jmespath python-pylint python-pytest python-pytest-moc
 optdepends=('ansible: check official ansible collections')
 source=(git+https://github.com/ansible/ansible-lint.git#tag=v$pkgver
         disable_version_check.patch)
-b2sums=('b2f6505626ae0c45d313062680e57bb8ae63874d95f052d2e97cb4ff388e1719a4c9bc9cb4319012e5d93957ba13a795150d7a1052eca2bbb898fb45c893e8f1'
+b2sums=('6a5cb672255b84269daf656a19e82c2a8de0d4340b59a4d67f908165f93c65c70183a0ab93cda1f6e9381e4866c61f05e8c6f8a922f27ae1532b787b633024e2'
         '98294f267ca693c0bc3921f8e076d674a219a891502cd31a0af789bc0b1447b53834b9c85853a134f6bc1ac384f31cb174cba2d55fbcc1636cae9bd3c0bd8f84')

 prepare() {

Has the 24.2.2 tag been redone after 2024-03-14?

Due to continuous issues with PyPI sdist tarballs we have put a distribution wide policy in place which encourages upstream provided, auto-generated source tarballs or VCS sources as package source: https://rfc.archlinux.page/0020-sources-for-python-packaging/

Due to reproducibility issues with the auto-generated source tarballs of projects using .git_archival.txt as suggested by setuptools-scm, we have changed many package scripts (such as that of ansible-lint) to use git sources instead. Unfortunately (unless 24.2.2 has been re-tagged), this is not reproducible either.

CC @jelly @Antiz96

Issue Type
  • Bug Report
OS / ENVIRONMENT

Arch Linux

  • ansible installation method: source (for building the OS package)
  • ansible-lint installation method: source (for building the OS package)
STEPS TO REPRODUCE
  • Lock the tag commit of this repository when the tag commit is the latest commit on the default branch.
  • Lock the tag commit of this repository when there is at least one additional commit on top of the latest tag.
Desired Behavior

The tag commit never changes due to arbitrary additional commits to the repository.
Additionally, the auto-generated tarball never changes due to arbitrary additional commits to the repository.

Actual Behavior

The use of setuptools-scm and the .git_archival.txt in particular alters auto-generated source tarballs (and seemingly also tags) after their initial creation.
This makes packaging efforts for this project non-reproducible and brittle and poses a problem for our supply chain security.
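A minimal sketch of the effect, assuming the .git_archival.txt carries a ref-names line that git fills in from the refs pointing at the commit when the archive is created (the behaviour discussed in pypa/setuptools-scm#806). The helper `archive_b2sum`, the all-zero commit hash and the two ref-names values are made up for illustration. The archive is built in memory with Python's tarfile module rather than by git, so the digests are not the ones from the PKGBUILD.

```python
import hashlib
import io
import tarfile


def archive_b2sum(ref_names: str) -> str:
    """Return the BLAKE2b-512 hex digest of a tiny in-memory tar archive.

    The archive holds a single .git_archival.txt whose ref-names line is
    filled in with the given value, standing in for git's substitution step.
    """
    content = (
        "node: 0000000000000000000000000000000000000000\n"  # placeholder hash
        f"ref-names: {ref_names}\n"
    ).encode()

    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        info = tarfile.TarInfo(name=".git_archival.txt")
        info.size = len(content)
        info.mtime = 0  # fixed timestamp so only the file content can differ
        tar.addfile(info, io.BytesIO(content))

    # hashlib.blake2b defaults to a 64-byte digest, the same size b2sum uses.
    return hashlib.blake2b(buffer.getvalue()).hexdigest()


# Hypothetical expansions: the tag commit is the branch tip, then it is not.
at_release = archive_b2sum("HEAD -> main, tag: v24.2.2")
after_new_commit = archive_b2sum("tag: v24.2.2")

print(at_release == after_new_commit)  # False
```

The commit and every tracked file are identical in both cases. Only the substituted line differs, and that is enough to change the checksum of the archive.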

The problem with setuptools-scm has been reported and discussed in their issue tracker: pypa/setuptools-scm#806

Alternative solutions exist:

  • not using setuptools-scm/.git_archival.txt in the existing setup (switching to other VCS capable PEP517 build backends that do not introduce a setup in which tags or auto-generated tarballs are altered after their creation)
  • not using .git_archival.txt at all, since building from an auto-generated source tarball that contains it breaks reproducibility (building from git sources works without it, and environment variables for version overrides exist for building from an auto-generated source tarball, so that is also possible without the file)
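
The version override mentioned in the second point could look roughly like the following sketch. `SETUPTOOLS_SCM_PRETEND_VERSION` is the override variable documented by setuptools-scm; the helper `build_environment` and the way the environment is handed to a build command are assumptions for illustration, not how any particular package script does it.

```python
import os
from typing import Dict, Optional


def build_environment(version: str, base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with the version override set.

    Hypothetical helper: `base` defaults to the current process environment.
    """
    env = dict(os.environ if base is None else base)
    # setuptools-scm uses this value instead of inspecting git metadata.
    env["SETUPTOOLS_SCM_PRETEND_VERSION"] = version
    return env


env = build_environment("24.2.2", base={})
print(env)  # {'SETUPTOOLS_SCM_PRETEND_VERSION': '24.2.2'}
```

A package script would pass such an environment to whatever command builds the wheel, for example via `subprocess.run(..., env=env)`, so the version no longer depends on the contents of .git_archival.txt.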
@audgirka
Contributor

@ssbarnea ^^

@ssbarnea ssbarnea removed the new Triage required label Apr 15, 2024
@ssbarnea
Member

I am almost sure we did not retag this, and I am afraid it is highly likely that we would not want to do anything about this, mainly because it would cause problems for others consuming the package.

The reality is that this issue is specific to archlinux, and one could argue that they need to deal with their own past decisions.

Still, I am quite curious where this kind of checksum originates from. I am aware that using checksums of source tarball downloads from github is not reliable and it is already documented by github that they do not guarantee that the checksums will not change. Still, if I understood correctly, that is not what you are using.

@webknjaz Do you have some insight on this? Where it comes from, what we can do to keep everyone happy?

As a side note, .git_archival.txt was needed so that ansible-lint, when installed as an action (shallow clone), knows its own version instead of reporting 0.0.0.
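
A rough sketch of that use case, assuming the file consists of simple `key: value` lines and that a describe-name entry carries the tag once git has filled it in. The function names and sample values are hypothetical, and this is not setuptools-scm's actual parsing code.

```python
from typing import Dict


def parse_git_archival(text: str) -> Dict[str, str]:
    """Parse `key: value` lines, skipping placeholders git never filled in."""
    data: Dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a key/value line
        value = value.strip()
        if value.startswith("$Format"):
            continue  # placeholder left as-is, so it carries no information
        data[key.strip()] = value
    return data


def version_from_archival(text: str) -> str:
    """Return the version from describe-name, or 0.0.0 if it is unknown."""
    return parse_git_archival(text).get("describe-name", "0.0.0").lstrip("v")


# Hypothetical file contents: filled in by git, and left untouched.
expanded = "node: abc123\ndescribe-name: v24.2.2\n"
unexpanded = "node: $Format:%H$\ndescribe-name: $Format:%(describe:tags=true)$\n"

print(version_from_archival(expanded))    # 24.2.2
print(version_from_archival(unexpanded))  # 0.0.0
```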

@ssbarnea ssbarnea closed this as not planned Apr 15, 2024
@dvzrv
Author

dvzrv commented Apr 15, 2024

The reality is that this issue is specific to archlinux, and one could argue that they need to deal with their own past decisions.

No, this breaks reproducibility for everyone relying on artifacts from this repository.
The checksum definition is based on the tarball at a specific commit: https://gitlab.archlinux.org/pacman/pacman/-/blob/4dc21b965b891042edc951d53f9ce93bf265cdfd/scripts/libmakepkg/source/git.sh.in#L153
This breakage is caused by the setup of .git_archival.txt in this repository and it is an acknowledged fact in the upstream ticket (pypa/setuptools-scm#806).

After looking at #2781 (comment) again, it seems to me as if you believe that an sdist tarball on PyPI is somehow "more secure". It is a (not well-defined) custom artifact, that is created from a source repository on some machine/ in CI. By definition it therefore cannot provide us with better guarantees for transparency or safety.

It is quite telling that your above statement tries to frame Arch Linux as having to think real hard about "their past decisions", when it is quite clear that a configuration file in this repository causes its auto-generated tarballs to become non-reproducible.

and it is already documented by github that they do not guarantee that the checksums will not change.

Sorry to point this out again, but that statement is FUD and github has revised it. You seemingly finding these issues funny (see emoji on #2781 (comment)) doesn't exactly make me feel like you are taking this problem seriously at all. Your reply in this ticket (again) comes across as rather condescending towards the free time people put in to improve the packaging of this project for a downstream distribution.

Why close this ticket a few hours after asking

what we can do to keep everyone happy?

Frankly I did not expect much from this ticket (given past interactions), but my conclusion is that I don't feel comfortable contributing to any of the projects you are involved with.
I am therefore orphaning all official packages on Arch Linux for the Ansible ecosystem that I still maintain (I am the second person in this team of volunteers to do so for the same reason).
Maybe someone else is interested in dealing with this in their free time. I am certainly no longer 👋

@ssbarnea
Member

@dvzrv Let's try to be pragmatic here: what can we do to help Arch with the packaging? Removing the git attributes will break versioning of the GitHub action, which relies on that specific feature.

I closed the ticket after consulting with two other folks and I will not lock it. We do still want to help, but we need to find a solution that does not break something else.

The problem is not unique to ansible-lint package, lots of other python package maintainers do this.

@dvzrv
Author

dvzrv commented Apr 16, 2024

The problem is not unique to ansible-lint package, lots of other python package maintainers do this.

That may be the case, but it doesn't mean that it is the right thing to do.
Many projects have been following the problematic upstream provided documentation by setuptools_scm, which has now (finally) been adjusted: pypa/setuptools-scm#1033

If you had looked at/ interacted with the relevant upstream ticket pypa/setuptools-scm#806 you would have seen that this has been reported against many projects already (I myself have reported this problem against e.g. molecule without fully understanding its origin and this was shot down as "your problem").

Feel free to look into the proposed upstream fix and excuse me if I don't spend more energy and time on this problem and do the work for you. I am luckily no longer responsible for any of these packages (and will mute this ticket now).
