Use protocols instead of importlib.abc.Loader/MetaPathFinder/PathEntryFinder #11890


Merged
merged 9 commits into from
May 12, 2024

Conversation

Contributor

@abravalheri abravalheri commented May 10, 2024

As mentioned in #11882, #11541, and #2468, most of the APIs related to importlib.abc.Loader/MetaPathFinder/PathEntryFinder were originally designed to handle what we nowadays call "protocols".

Checking for actual subclasses leads to inconsistencies and false positives/negatives; the issues mentioned above provide evidence of these problems.

This PR proposes using Protocols and introduces _typeshed.importlib to house them.

The rationale behind adding _typeshed.importlib is that these protocols are implicitly defined in the standard library and its documentation, and there are projects out there that would benefit from sharing them so that they can also be used when defining APIs1. I believe this is the same reasoning behind other members of _typeshed.
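For instance, a downstream API can accept any structurally conforming loader without requiring inheritance from importlib.abc.Loader. A minimal sketch (LoaderProtocol mirrors the protocol added in this PR; describe_loader and MyLoader are hypothetical names):

```python
from types import ModuleType
from typing import Protocol

class LoaderProtocol(Protocol):
    # Mirrors the protocol added in this PR; in stubs it would be
    # imported from _typeshed.importlib rather than redefined.
    def load_module(self, fullname: str, /) -> ModuleType: ...

def describe_loader(loader: LoaderProtocol) -> str:
    # Hypothetical downstream API: accepts anything with a compatible
    # load_module(), no importlib.abc.Loader subclassing required.
    return type(loader).__name__

class MyLoader:
    # Structurally conforms to LoaderProtocol without inheriting any ABC.
    def load_module(self, fullname: str, /) -> ModuleType:
        return ModuleType(fullname)

print(describe_loader(MyLoader()))  # prints "MyLoader"
```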

I have used the suffix Protocol just as a means of better distinguishing them from the regular ABCs. I usually don't like this (it is a variant of Hungarian notation...), but I think it is justifiable to avoid confusion.

Closes #11882
Closes #11541
Closes #2468, though it does not take Python 2 or deprecated methods into consideration.

Footnotes

  1. I can confirm that at least setuptools/pkg_resources would benefit a lot from sharing the protocols. See the discussion in https://github.com/pypa/setuptools/pull/4246/files#r1593000435.


```python
def load_module(self, fullname: str, /) -> ModuleType: ...

class MetaPathFinderProtocol(Protocol):
    def find_spec(self, fullname: str, path: Sequence[str] | None, target: ModuleType | None = ..., /) -> ModuleSpec | None: ...
```
Contributor Author

I am not sure about making the arguments positional-only... but I copied the definitions from stdlib/sys/__init__.pyi, stdlib/importlib/abc.pyi, and stdlib/types.pyi.

Maybe it is better just to remove the /?

Member

@JelleZijlstra JelleZijlstra May 10, 2024

Protocols should usually have positional-only parameters, because otherwise implementations need to use the exact same parameter names. Use positional-or-keyword parameters only if consumers of the protocol pass keyword arguments.
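To illustrate the point (the finder names below are hypothetical): with a positional-only parameter, an implementation is free to pick its own parameter name; with a positional-or-keyword parameter, a type checker flags a renamed parameter as an incompatibility, since callers could pass it by keyword.

```python
from typing import Optional, Protocol

class PosOnlyFinder(Protocol):
    # The "/" makes fullname positional-only: implementations may rename it.
    def find_spec(self, fullname: str, /) -> Optional[object]: ...

class NamedFinder(Protocol):
    # Positional-or-keyword: callers may write find_spec(fullname="x"),
    # so implementations must keep the exact name "fullname".
    def find_spec(self, fullname: str) -> Optional[object]: ...

class MyFinder:
    # Renamed parameter: compatible with PosOnlyFinder, but a type
    # checker would report it as incompatible with NamedFinder.
    def find_spec(self, name: str, /) -> Optional[object]:
        return None

finder: PosOnlyFinder = MyFinder()  # accepted by type checkers
```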

Contributor Author

@abravalheri abravalheri May 10, 2024

Oh, I see, that makes a lot of sense! Thank you very much for the clarification.

In accordance with that, I also introduced 6cddd52 to make the arguments in PathEntryFinderProtocol positional-only.


@abravalheri abravalheri marked this pull request as ready for review May 10, 2024 13:28

```python
__all__ = ["LoaderProtocol", "MetaPathFinderProtocol", "PathEntryFinderProtocol"]

class LoaderProtocol(Protocol):
```
Collaborator

You should be able to replace types._LoaderProtocol with this as well.

Contributor Author

@abravalheri abravalheri May 10, 2024

Replacing that in types.pyi would create an import loop between types.pyi and _typeshed/importlib.pyi. Is that OK to do for type stubs?

I can do the replacement in the pkg_resources stub without problems.

Collaborator

Loops are not a problem in stubs.

Contributor Author

Thank you for the clarification; I introduced this change in 7075e5e.


@layday
Contributor

layday commented May 11, 2024

If these ABCs do not need to be implemented at runtime, can't we just pretend that they are protocols in the typeshed, similar to how it's done for collections.abc?

@AlexWaygood
Member

> If these ABCs do not need to be implemented at runtime, can't we just pretend that they are protocols in the typeshed, similar to how it's done for collections.abc?

We do this in typeshed for multiple other simple "protocol-like" ABCs in the stdlib as well, such as os.PathLike and contextlib.AbstractContextManager

@abravalheri
Contributor Author

> If these ABCs do not need to be implemented at runtime, can't we just pretend that they are protocols in the typeshed, similar to how it's done for collections.abc?

I don't think I understood this question...

The reason why I added them in _typeshed is that I think they are needed by community packages to specify proper type signatures in various APIs. But I don't know if that was the question.

@layday
Contributor

layday commented May 11, 2024

It looks like these parallel protocols are slimmed-down versions of the ABCs. IIUC, this PR is meant to address two separate issues: (1) importlib ABCs don't need to be subclassed, and (2) some of their methods are optional and the ABCs do in fact implement these.

@abravalheri
Contributor Author

> > If these ABCs do not need to be implemented at runtime, can't we just pretend that they are protocols in the typeshed, similar to how it's done for collections.abc?
>
> I don't think I understood this question...
>
> The reason why I added them in _typeshed is that I think they are needed by community packages to specify proper type signatures in various APIs. But I don't know if that was the question.


But on the other hand, if the question means something like: "why not make importlib.abc.Loader inherit from Protocol in the typeshed", I guess that is probably because protocols don't have support for optional methods?
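That matches how the stdlib ABCs behave: the optional methods exist on the ABC as concrete (often no-op) defaults, while a protocol makes every declared member mandatory. A small runnable check (MetaPathFinderProtocol here is a trimmed, hypothetical stand-in):

```python
import importlib.abc
from typing import Protocol

# The ABC supplies optional methods as concrete defaults: a subclass of
# importlib.abc.MetaPathFinder may omit invalidate_caches() entirely.
assert hasattr(importlib.abc.MetaPathFinder, "invalidate_caches")

# A Protocol has no notion of "optional": every declared member is
# required of conforming implementations, which is why the protocols in
# this PR only list the methods that consumers actually rely on.
class MetaPathFinderProtocol(Protocol):
    def find_spec(self, fullname: str, path, target=None, /): ...
    # invalidate_caches() is deliberately omitted: declaring it here
    # would force every conforming finder to define it.
```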

@abravalheri
Contributor Author

> It looks like these parallel protocols are slimmed-down versions of the ABCs. IIUC, this PR is meant to address two separate issues: (1) importlib ABCs don't need to be subclassed, and (2) some of their methods are optional and the ABCs do in fact implement these.

I am not sure about this. I think that within the stdlib the optional methods are used. These ABCs are at the top of a hierarchy of many loaders/finders which use inheritance to share code.

@layday
Contributor

layday commented May 11, 2024

The optional methods are no-ops on the ABCs in the stdlib. Either way, I don't have anything more to suggest here, but I thought clarifying what the intention is with these changes might help.

@abravalheri
Contributor Author

abravalheri commented May 11, 2024

> clarifying what the intention is with these changes might help.

The original intention of writing this PR is to solve #11882, #11541, #2468.

These issues were caused by a likely misinterpretation of the loader/finder PEPs and the importlib documentation when the stubs were written in typeshed. The ABCs were never meant to be enforced, but rather to be informational.


Because the ABCs are at the core of some CPython inheritance chain, I think it is just easier to leave them alone, no?

Also, I think it is conceptually cleaner for third party packages to use _typeshed.importlib.*Protocol when they actually mean protocol.

@layday
Contributor

layday commented May 11, 2024

> These issues were caused by a likely misinterpretation of the loader/finder PEPs and the importlib documentation when the stubs were written in typeshed. The ABCs were never meant to be enforced, but rather to be informational.

I'm not sure I understand what not "meant to be enforced" means - are you saying they don't need to be subclassed? Then they can be safely converted to protocols in typeshed, and you'd have the option to type check that any random object satisfies the protocol in full. We should of course keep the slimmed-down protocols, since there's no way to mark a protocol member as optional. However, they shouldn't be synonymous with their parent ABCs, to avoid confusion.

> Because the ABCs are at the core of some CPython inheritance chain, I think it is just easier to leave them alone, no?

I don't think it matters to typeshed users if they are subclassed inside of the stdlib, but I might be misunderstanding.

@abravalheri
Contributor Author

abravalheri commented May 11, 2024

> I'm not sure I understand what not "meant to be enforced" means - are you saying they don't need to be subclassed?

No, that is not what I was saying.

APIs that rely on finder/loaders don't need the finders and loaders to be subclasses of the ABCs, they just need them to comply with the protocol.

However, it is a fact that the ABCs are subclassed somewhere (we can verify that empirically, so it is only logical that there is a need for subclassing). Several classes in importlib.machinery and importlib.util subclass them - that is the evidence for the "need".
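This can be checked empirically; the ABCs either are real bases of, or are registered as virtual bases for, the concrete stdlib finders and loaders:

```python
import importlib.abc
import importlib.machinery

# importlib/abc.py registers the concrete machinery classes against the
# ABCs (via ABCMeta.register), so these checks pass even where there is
# no direct inheritance:
assert issubclass(importlib.machinery.PathFinder, importlib.abc.MetaPathFinder)
assert issubclass(importlib.machinery.FileFinder, importlib.abc.PathEntryFinder)
assert issubclass(importlib.machinery.SourceFileLoader, importlib.abc.Loader)
```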


It is getting very hard for us to understand each other, I am afraid 😅.

If you would like to open an alternative PR with the ideas that you are exposing, maybe that would be a good approach forward?

I am very happy to back off from this PR if anyone has a better implementation.

Collaborator

@Avasam Avasam left a comment

I am personally perfectly happy with these changes.

On the question of making the original ABCs into protocols in typeshed, I'll leave that up to other maintainers, as it may affect more than I realize. As @AlexWaygood mentioned, there's precedent for doing so. But I'm not sure how the optional methods should be handled then, and I like the current separation between the original ABC and a type-only Protocol modeled after it.

@layday
Contributor

layday commented May 11, 2024

> APIs that rely on finder/loaders don't need the finders and loaders to be subclasses of the ABCs, they just need them to comply with the protocol.

We are in agreement.

> However, it is a fact that the ABCs are subclassed somewhere (we can verify that empirically, so it is only logical that there is a need for subclassing). Several classes in importlib.machinery and importlib.util subclass them - that is the evidence for the "need".

At the risk of sounding presumptuous, I'll just note that protocols are themselves subclassable and doing so is often beneficial because it allows type checkers to verify their implementation. At runtime, Protocols are generalised ABCs. Why would the importlib ABCs being subclassed in the stdlib matter at all?
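For example (LoaderProtocol here is a stand-in for the PR's protocol), explicitly inheriting from a protocol makes conformance checkable at the definition site rather than only at each use site:

```python
from types import ModuleType
from typing import Protocol

class LoaderProtocol(Protocol):
    # Stand-in for the protocol added by this PR.
    def load_module(self, fullname: str, /) -> ModuleType: ...

# Explicit inheritance: a type checker verifies at the class definition
# that load_module matches the protocol's signature (e.g. a wrong return
# type would be reported here, not at every use site).
class VerifiedLoader(LoaderProtocol):
    def load_module(self, fullname: str, /) -> ModuleType:
        return ModuleType(fullname)

mod = VerifiedLoader().load_module("demo")
print(mod.__name__)  # prints "demo"
```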

> If you would like to open an alternative PR with the ideas that you are exposing, maybe that would be a good approach forward?

What I'm proposing is fairly straightforward: convert these 3 ABCs to protocols. If we can't agree that's a good thing, there's hardly any point in opening a PR for it.

I'm not trying to "challenge" your PR - I was confused initially, thinking the protocols were identical to the ABCs. It wasn't clear from the description that that wasn't the case; I tried to help by pointing that out and restating the problem the PR is intending to solve. I've also made two suggestions, which I don't mind if you choose to ignore.

```diff
@@ -359,7 +360,7 @@ def evaluate_marker(text: str, extra: Incomplete | None = None) -> bool: ...
 class NullProvider:
     egg_name: str | None
     egg_info: str | None
-    loader: types._LoaderProtocol | None
+    loader: LoaderProtocol | None
```
Collaborator

@Avasam Avasam May 11, 2024

@abravalheri Sorry, I just noticed: you won't be able to use new typeshed symbols in 3rd-party stubs in the same PR until the next mypy & pyright releases, as they'll first need to include the new stdlib changes from typeshed.

Since this is the only 3rd party stub affected, you can simply duplicate the protocol here for now.

pyright is released quite often, so realistically this just means waiting for mypy for a follow-up PR. I can open a PR to update stubs/setuptools/pkg_resources/__init__.pyi as soon as this one is merged and keep it on hold until the next version of mypy. (Anyway, I'm trying to keep both the stubs and setuptools updated as I'm adding first-party annotations and fixing typeshed stubs.)

Contributor Author

@abravalheri abravalheri May 12, 2024

Wait, wasn't this part the suggestion in #11890 (comment)? (or at least that is how I interpreted the comment and then I went ahead to implement the change as a way of addressing it)

If types._LoaderProtocol has to be maintained so that 3rd-party stubs can use it, then it makes almost no difference to replace types._LoaderProtocol, because it is only used in 2 places: internally in types and in pkg_resources... We should probably just revert 7075e5e and 45b7a5c.

Contributor Author

> But you won't be able to use new typeshed symbols in 3rd-party stubs in the same PR

Is this documented in the CONTRIBUTING guide? Sorry I might have missed that part.

Collaborator

You can replace all usages of types._LoaderProtocol in the stdlib in this PR, just not in 3rd-party stubs. Otherwise, users of the most recent types-setuptools would have a reference in their stub to _typeshed.importlib.LoaderProtocol, which doesn't exist yet.

types._LoaderProtocol should still exist until mypy updates its vendored typeshed stdlib stubs (which is done every version).

When I made that comment, I didn't think about the usage in the setuptools stubs. Since it's only used there and in types, it might be cleaner to just revert, as you're suggesting, and to remove types._LoaderProtocol in the follow-up PR that will use _typeshed.importlib.LoaderProtocol in the setuptools stubs.

Sorry for the "flip-flopping" here ^^"

> Is this documented in the CONTRIBUTING guide? Sorry I might have missed that part.

I don't think so, but it probably should be, now that you mention it. It's rare, but we've been bitten by it twice (that I can remember) in the past.

Member

It doesn't look like this comment from Avasam was addressed before the PR was merged — I think our setuptools stubs are now importing a protocol from _typeshed which, from mypy's perspective, doesn't exist yet. I think we need to change this so that the LoaderProtocol definition is temporarily duplicated in our setuptools stubs, rather than being imported from _typeshed, until a version of mypy is released that includes this protocol in its vendored stdlib stubs


@AlexWaygood
Member

I didn't realise the ABCs had optional methods that were omitted from the protocols being added here -- in that case, this approach makes sense. My bad for not looking more closely; apologies!

@abravalheri
Contributor Author

abravalheri commented May 12, 2024

> I'm not trying to "challenge" your PR - I was confused initially, thinking the protocols were identical to the ABCs. It wasn't clear from the description that that wasn't the case; I tried to help by pointing that out and restating the problem the PR is intending to solve. I've also made two suggestions, which I don't mind if you choose to ignore.
>
> (in #11890 (comment))

That is OK, I never took it as a challenge. Thank you very much for taking the time to have a look at this, BTW. My comment was sincerely in the spirit that, if we are having trouble understanding each other, maybe having another PR showcasing exactly what you mean in code would be the best way to communicate (instead of discussing and asking back and forth). I just wanted to make clear that I was completely OK with that.


> I tried to help by pointing that out and restating the problem the PR is intending to solve.

I am sorry, when I read #11890 (comment), it actually confused me more 😅. I don't think the original intent of the PR matches the description in that comment. My problem is that using these ABCs in the type signatures of other functions doesn't seem like the right thing to do (the linked issues have evidence for that).

> What I'm proposing is fairly straightforward: convert these 3 ABCs to protocols. If we can't agree that's a good thing, there's hardly any point in opening a PR for it.

Now I might have started to understand with more clarity what you are saying...

Personally, I think that it is better to keep these things separated (but I am also fine to do otherwise if the maintainers request). This opinion is biased by the fact that in the past I have been bitten by little innocent/convenient "lies" in type stubs and now I think that they all come back to bite you.

What we know so far:

  1. There are a couple of implicit protocols in importlib that have not been formally/publicly expressed in a way that allows sound type hints given the way the Python type system works.
  2. importlib.abc.* has a bunch of classes that are a mix of informational/examples/docs and code sharing by inheritance.

We probably don't need to mix them up (well, mixing them up is what got us here in the first place...).


> I've also made two suggestions,

Could you please clarify what the suggestions are, other than converting the ABCs into protocols? I might have lost them in the middle of the discussion...

Contributor

According to mypy_primer, this change has no effect on the checked open source code. 🤖🎉

@srittau srittau merged commit b42e3b2 into python:main May 12, 2024
57 checks passed
@abravalheri abravalheri deleted the importlib-abc branch May 12, 2024 13:13
Adamantios added a commit to valory-xyz/open-aea that referenced this pull request Jan 7, 2025
Resolves a bug which appeared after updating the macOS version (7e3df5fcf3213bb41f0fef68c04a6ea5e4bbc594):
```
_ TestCheckPackagesCommand.test_check_public_id_failure_wrong_public_id[test_param2] _

query = ['gym']

    def search_packages_info(query: List[str]) -> Generator[_PackageInfo, None, None]:
        """
        Gather details from installed distributions. Print distribution name,
        version, location, and installed files. Installed files requires a
        pip generated 'installed-files.txt' in the distributions '.egg-info'
        directory.
        """
>       env = get_default_environment()

/Users/runner/work/open-aea/open-aea/.tox/py3.8/lib/python3.8/site-packages/pip/_internal/commands/show.py:80:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    def get_default_environment() -> BaseEnvironment:
        """Get the default representation for the current environment.

        This returns an Environment instance from the chosen backend. The default
        Environment instance should be built from ``sys.path`` and may use caching
        to share instance state accorss calls.
        """
>       return select_backend().Environment.default()

/Users/runner/work/open-aea/open-aea/.tox/py3.8/lib/python3.8/site-packages/pip/_internal/metadata/__init__.py:76:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    @functools.lru_cache(maxsize=None)
    def select_backend() -> Backend:
        if _should_use_importlib_metadata():
            from . import importlib

            return cast(Backend, importlib)
>       from . import pkg_resources

/Users/runner/work/open-aea/open-aea/.tox/py3.8/lib/python3.8/site-packages/pip/_internal/metadata/__init__.py:64:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    import email.message
    import email.parser
    import logging
    import os
    import zipfile
    from typing import (
        Collection,
        Iterable,
        Iterator,
        List,
        Mapping,
        NamedTuple,
        Optional,
    )

>   from pip._vendor import pkg_resources

/Users/runner/work/open-aea/open-aea/.tox/py3.8/lib/python3.8/site-packages/pip/_internal/metadata/pkg_resources.py:16:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    """
    Package resource API
    --------------------

    A resource is a logical file contained within a package, or a logical
    subdirectory thereof.  The package resource API expects resource names
    to have their path parts separated with ``/``, *not* whatever the local
    path separator is.  Do not use os.path operations to manipulate resource
    names being passed into the API.

    The package resource API is designed to work with normal filesystem packages,
    .egg files, and unpacked .egg files.  It can also work in a limited way with
    .zip files and with custom PEP 302 loaders that support the ``get_data()``
    method.

    This module is deprecated. Users are directed to :mod:`importlib.resources`,
    :mod:`importlib.metadata` and :pypi:`packaging` instead.
    """

    from __future__ import annotations

    import sys

    if sys.version_info < (3, 8):  # noqa: UP036 # Check for unsupported versions
        raise RuntimeError("Python 3.8 or later is required")

    import os
    import io
    import time
    import re
    import types
    from typing import (
        Any,
        Literal,
        Dict,
        Iterator,
        Mapping,
        MutableSequence,
        NamedTuple,
        NoReturn,
        Tuple,
        Union,
        TYPE_CHECKING,
        Protocol,
        Callable,
        Iterable,
        TypeVar,
        overload,
    )
    import zipfile
    import zipimport
    import warnings
    import stat
    import functools
    import pkgutil
    import operator
    import platform
    import collections
    import plistlib
    import email.parser
    import errno
    import tempfile
    import textwrap
    import inspect
    import ntpath
    import posixpath
    import importlib
    import importlib.abc
    import importlib.machinery
    from pkgutil import get_importer

    import _imp

    # capture these to bypass sandboxing
    from os import utime
    from os import open as os_open
    from os.path import isdir, split

    try:
        from os import mkdir, rename, unlink

        WRITE_SUPPORT = True
    except ImportError:
        # no write support, probably under GAE
        WRITE_SUPPORT = False

    from pip._internal.utils._jaraco_text import (
        yield_lines,
        drop_comment,
        join_continuation,
    )
    from pip._vendor.packaging import markers as _packaging_markers
    from pip._vendor.packaging import requirements as _packaging_requirements
    from pip._vendor.packaging import utils as _packaging_utils
    from pip._vendor.packaging import version as _packaging_version
    from pip._vendor.platformdirs import user_cache_dir as _user_cache_dir

    if TYPE_CHECKING:
        from _typeshed import BytesPath, StrPath, StrOrBytesPath
        from pip._vendor.typing_extensions import Self

    # Patch: Remove deprecation warning from vendored pkg_resources.
    # Setting PYTHONWARNINGS=error to verify builds produce no warnings
    # causes immediate exceptions.
    # See https://github.com/pypa/pip/issues/12243

    _T = TypeVar("_T")
    _DistributionT = TypeVar("_DistributionT", bound="Distribution")
    # Type aliases
    _NestedStr = Union[str, Iterable[Union[str, Iterable["_NestedStr"]]]]
    _InstallerTypeT = Callable[["Requirement"], "_DistributionT"]
    _InstallerType = Callable[["Requirement"], Union["Distribution", None]]
    _PkgReqType = Union[str, "Requirement"]
    _EPDistType = Union["Distribution", _PkgReqType]
    _MetadataType = Union["IResourceProvider", None]
    _ResolvedEntryPoint = Any  # Can be any attribute in the module
    _ResourceStream = Any  # TODO / Incomplete: A readable file-like object
    # Any object works, but let's indicate we expect something like a module (optionally has __loader__ or __file__)
    _ModuleLike = Union[object, types.ModuleType]
    # Any: Should be _ModuleLike but we end up with issues where _ModuleLike doesn't have _ZipLoaderModule's __loader__
    _ProviderFactoryType = Callable[[Any], "IResourceProvider"]
    _DistFinderType = Callable[[_T, str, bool], Iterable["Distribution"]]
    _NSHandlerType = Callable[[_T, str, str, types.ModuleType], Union[str, None]]
    _AdapterT = TypeVar(
        "_AdapterT", _DistFinderType[Any], _ProviderFactoryType, _NSHandlerType[Any]
    )

    # Use _typeshed.importlib.LoaderProtocol once available https://github.com/python/typeshed/pull/11890
    class _LoaderProtocol(Protocol):
        def load_module(self, fullname: str, /) -> types.ModuleType: ...

    class _ZipLoaderModule(Protocol):
        __loader__: zipimport.zipimporter

    _PEP440_FALLBACK = re.compile(r"^v?(?P<safe>(?:[0-9]+!)?[0-9]+(?:\.[0-9]+)*)", re.I)

    class PEP440Warning(RuntimeWarning):
        """
        Used when there is an issue with a version or specifier not complying with
        PEP 440.
        """

    parse_version = _packaging_version.Version

    _state_vars: dict[str, str] = {}

    def _declare_state(vartype: str, varname: str, initial_value: _T) -> _T:
        _state_vars[varname] = vartype
        return initial_value

    def __getstate__() -> dict[str, Any]:
        state = {}
        g = globals()
        for k, v in _state_vars.items():
            state[k] = g['_sget_' + v](g[k])
        return state

    def __setstate__(state: dict[str, Any]) -> dict[str, Any]:
        g = globals()
        for k, v in state.items():
            g['_sset_' + _state_vars[k]](k, g[k], v)
        return state

    def _sget_dict(val):
        return val.copy()

    def _sset_dict(key, ob, state):
        ob.clear()
        ob.update(state)

    def _sget_object(val):
        return val.__getstate__()

    def _sset_object(key, ob, state):
        ob.__setstate__(state)

    _sget_none = _sset_none = lambda *args: None

    def get_supported_platform():
        """Return this platform's maximum compatible version.

        distutils.util.get_platform() normally reports the minimum version
        of macOS that would be required to *use* extensions produced by
        distutils.  But what we want when checking compatibility is to know the
        version of macOS that we are *running*.  To allow usage of packages that
        explicitly require a newer version of macOS, we must also know the
        current version of the OS.

        If this condition occurs for any other platform with a version in its
        platform strings, this function should be extended accordingly.
        """
        plat = get_build_platform()
        m = macosVersionString.match(plat)
        if m is not None and sys.platform == "darwin":
            try:
                plat = 'macosx-%s-%s' % ('.'.join(_macos_vers()[:2]), m.group(3))
            except ValueError:
                # not macOS
                pass
        return plat

    __all__ = [
        # Basic resource access and distribution/entry point discovery
        'require',
        'run_script',
        'get_provider',
        'get_distribution',
        'load_entry_point',
        'get_entry_map',
        'get_entry_info',
        'iter_entry_points',
        'resource_string',
        'resource_stream',
        'resource_filename',
        'resource_listdir',
        'resource_exists',
        'resource_isdir',
        # Environmental control
        'declare_namespace',
        'working_set',
        'add_activation_listener',
        'find_distributions',
        'set_extraction_path',
        'cleanup_resources',
        'get_default_cache',
        # Primary implementation classes
        'Environment',
        'WorkingSet',
        'ResourceManager',
        'Distribution',
        'Requirement',
        'EntryPoint',
        # Exceptions
        'ResolutionError',
        'VersionConflict',
        'DistributionNotFound',
        'UnknownExtra',
        'ExtractionError',
        # Warnings
        'PEP440Warning',
        # Parsing functions and string utilities
        'parse_requirements',
        'parse_version',
        'safe_name',
        'safe_version',
        'get_platform',
        'compatible_platforms',
        'yield_lines',
        'split_sections',
        'safe_extra',
        'to_filename',
        'invalid_marker',
        'evaluate_marker',
        # filesystem utilities
        'ensure_directory',
        'normalize_path',
        # Distribution "precedence" constants
        'EGG_DIST',
        'BINARY_DIST',
        'SOURCE_DIST',
        'CHECKOUT_DIST',
        'DEVELOP_DIST',
        # "Provider" interfaces, implementations, and registration/lookup APIs
        'IMetadataProvider',
        'IResourceProvider',
        'FileMetadata',
        'PathMetadata',
        'EggMetadata',
        'EmptyProvider',
        'empty_provider',
        'NullProvider',
        'EggProvider',
        'DefaultProvider',
        'ZipProvider',
        'register_finder',
        'register_namespace_handler',
        'register_loader_type',
        'fixup_namespace_packages',
        'get_importer',
        # Warnings
        'PkgResourcesDeprecationWarning',
        # Deprecated/backward compatibility only
        'run_main',
        'AvailableDistributions',
    ]

    class ResolutionError(Exception):
        """Abstract base for dependency resolution errors"""

        def __repr__(self):
            return self.__class__.__name__ + repr(self.args)

    class VersionConflict(ResolutionError):
        """
        An already-installed version conflicts with the requested version.

        Should be initialized with the installed Distribution and the requested
        Requirement.
        """

        _template = "{self.dist} is installed but {self.req} is required"

        @property
        def dist(self) -> Distribution:
            return self.args[0]

        @property
        def req(self) -> Requirement:
            return self.args[1]

        def report(self):
            return self._template.format(**locals())

        def with_context(self, required_by: set[Distribution | str]):
            """
            If required_by is non-empty, return a version of self that is a
            ContextualVersionConflict.
            """
            if not required_by:
                return self
            args = self.args + (required_by,)
            return ContextualVersionConflict(*args)

    class ContextualVersionConflict(VersionConflict):
        """
        A VersionConflict that accepts a third parameter, the set of the
        requirements that required the installed Distribution.
        """

        _template = VersionConflict._template + ' by {self.required_by}'

        @property
        def required_by(self) -> set[str]:
            return self.args[2]

    class DistributionNotFound(ResolutionError):
        """A requested distribution was not found"""

        _template = (
            "The '{self.req}' distribution was not found "
            "and is required by {self.requirers_str}"
        )

        @property
        def req(self) -> Requirement:
            return self.args[0]

        @property
        def requirers(self) -> set[str] | None:
            return self.args[1]

        @property
        def requirers_str(self):
            if not self.requirers:
                return 'the application'
            return ', '.join(self.requirers)

        def report(self):
            return self._template.format(**locals())

        def __str__(self):
            return self.report()

    class UnknownExtra(ResolutionError):
        """Distribution doesn't have an "extra feature" of the given name"""

    _provider_factories: dict[type[_ModuleLike], _ProviderFactoryType] = {}

    PY_MAJOR = '{}.{}'.format(*sys.version_info)
    EGG_DIST = 3
    BINARY_DIST = 2
    SOURCE_DIST = 1
    CHECKOUT_DIST = 0
    DEVELOP_DIST = -1

    def register_loader_type(
        loader_type: type[_ModuleLike], provider_factory: _ProviderFactoryType
    ):
        """Register `provider_factory` to make providers for `loader_type`

        `loader_type` is the type or class of a PEP 302 ``module.__loader__``,
        and `provider_factory` is a function that, passed a *module* object,
        returns an ``IResourceProvider`` for that module.
        """
        _provider_factories[loader_type] = provider_factory

    @overload
    def get_provider(moduleOrReq: str) -> IResourceProvider: ...
    @overload
    def get_provider(moduleOrReq: Requirement) -> Distribution: ...
    def get_provider(moduleOrReq: str | Requirement) -> IResourceProvider | Distribution:
        """Return an IResourceProvider for the named module or requirement"""
        if isinstance(moduleOrReq, Requirement):
            return working_set.find(moduleOrReq) or require(str(moduleOrReq))[0]
        try:
            module = sys.modules[moduleOrReq]
        except KeyError:
            __import__(moduleOrReq)
            module = sys.modules[moduleOrReq]
        loader = getattr(module, '__loader__', None)
        return _find_adapter(_provider_factories, loader)(module)

    @functools.lru_cache(maxsize=None)
    def _macos_vers():
        version = platform.mac_ver()[0]
        # fallback for MacPorts
        if version == '':
            plist = '/System/Library/CoreServices/SystemVersion.plist'
            if os.path.exists(plist):
                with open(plist, 'rb') as fh:
                    plist_content = plistlib.load(fh)
                if 'ProductVersion' in plist_content:
                    version = plist_content['ProductVersion']
        return version.split('.')
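
The MacPorts fallback above reads the OS version out of `SystemVersion.plist` when `platform.mac_ver()` comes back empty. An illustrative version of that parsing, using an in-memory plist instead of the real file:

```python
import io
import plistlib

# Fake plist standing in for /System/Library/CoreServices/SystemVersion.plist.
fake_plist = plistlib.dumps({"ProductVersion": "10.15.7"})

def read_product_version(raw: bytes) -> list[str]:
    # Same shape as _macos_vers(): parse the plist, split the version string.
    content = plistlib.load(io.BytesIO(raw))
    return content.get("ProductVersion", "").split(".")

print(read_product_version(fake_plist))  # ['10', '15', '7']
```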

    def _macos_arch(machine):
        return {'PowerPC': 'ppc', 'Power_Macintosh': 'ppc'}.get(machine, machine)

    def get_build_platform():
        """Return this platform's string for platform-specific distributions

        XXX Currently this is the same as ``distutils.util.get_platform()``, but it
        needs some hacks for Linux and macOS.
        """
        from sysconfig import get_platform

        plat = get_platform()
        if sys.platform == "darwin" and not plat.startswith('macosx-'):
            try:
                version = _macos_vers()
                machine = os.uname()[4].replace(" ", "_")
                return "macosx-%d.%d-%s" % (
                    int(version[0]),
                    int(version[1]),
                    _macos_arch(machine),
                )
            except ValueError:
                # if someone is running a non-Mac darwin system, this will fall
                # through to the default implementation
                pass
        return plat

    macosVersionString = re.compile(r"macosx-(\d+)\.(\d+)-(.*)")
    darwinVersionString = re.compile(r"darwin-(\d+)\.(\d+)\.(\d+)-(.*)")
    # XXX backward compat
    get_platform = get_build_platform

    def compatible_platforms(provided: str | None, required: str | None):
        """Can code for the `provided` platform run on the `required` platform?

        Returns true if either platform is ``None``, or the platforms are equal.

        XXX Needs compatibility checks for Linux and other unixy OSes.
        """
        if provided is None or required is None or provided == required:
            # easy case
            return True

        # macOS special cases
        reqMac = macosVersionString.match(required)
        if reqMac:
            provMac = macosVersionString.match(provided)

            # is this a Mac package?
            if not provMac:
                # this is backwards compatibility for packages built before
                # setuptools 0.6. All packages built after this point will
                # use the new macOS designation.
                provDarwin = darwinVersionString.match(provided)
                if provDarwin:
                    dversion = int(provDarwin.group(1))
                    macosversion = "%s.%s" % (reqMac.group(1), reqMac.group(2))
                    if (
                        dversion == 7
                        and macosversion >= "10.3"
                        or dversion == 8
                        and macosversion >= "10.4"
                    ):
                        return True
                # egg isn't macOS or legacy darwin
                return False

            # are they the same major version and machine type?
            if provMac.group(1) != reqMac.group(1) or provMac.group(3) != reqMac.group(3):
                return False

            # is the required OS major update >= the provided one?
            if int(provMac.group(2)) > int(reqMac.group(2)):
                return False

            return True

        # XXX Linux and other platforms' special cases should go here
        return False
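
The macOS branch of `compatible_platforms` can be condensed into a small check: same major version, same machine type, and a required minor version at least as new as the provided one. A simplified, self-contained re-statement (it omits the legacy `darwin-*` compatibility path):

```python
import re

# Same pattern as the module-level macosVersionString above.
macosVersionString = re.compile(r"macosx-(\d+)\.(\d+)-(.*)")

def mac_compatible(provided: str, required: str) -> bool:
    prov = macosVersionString.match(provided)
    req = macosVersionString.match(required)
    if not (prov and req):
        return provided == required
    return (
        prov.group(1) == req.group(1)                 # same major version
        and prov.group(3) == req.group(3)             # same machine type
        and int(prov.group(2)) <= int(req.group(2))   # required minor >= provided
    )

print(mac_compatible("macosx-10.3-ppc", "macosx-10.4-ppc"))  # True
print(mac_compatible("macosx-10.4-ppc", "macosx-10.3-ppc"))  # False
```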

    @overload
    def get_distribution(dist: _DistributionT) -> _DistributionT: ...
    @overload
    def get_distribution(dist: _PkgReqType) -> Distribution: ...
    def get_distribution(dist: Distribution | _PkgReqType) -> Distribution:
        """Return a current distribution object for a Requirement or string"""
        if isinstance(dist, str):
            dist = Requirement.parse(dist)
        if isinstance(dist, Requirement):
            # Bad type narrowing, dist has to be a Requirement here, so get_provider has to return Distribution
            dist = get_provider(dist)  # type: ignore[assignment]
        if not isinstance(dist, Distribution):
            raise TypeError("Expected str, Requirement, or Distribution", dist)
        return dist

    def load_entry_point(dist: _EPDistType, group: str, name: str) -> _ResolvedEntryPoint:
        """Return `name` entry point of `group` for `dist` or raise ImportError"""
        return get_distribution(dist).load_entry_point(group, name)

    @overload
    def get_entry_map(
        dist: _EPDistType, group: None = None
    ) -> dict[str, dict[str, EntryPoint]]: ...
    @overload
    def get_entry_map(dist: _EPDistType, group: str) -> dict[str, EntryPoint]: ...
    def get_entry_map(dist: _EPDistType, group: str | None = None):
        """Return the entry point map for `group`, or the full entry map"""
        return get_distribution(dist).get_entry_map(group)

    def get_entry_info(dist: _EPDistType, group: str, name: str):
        """Return the EntryPoint object for `group`+`name`, or ``None``"""
        return get_distribution(dist).get_entry_info(group, name)

    class IMetadataProvider(Protocol):
        def has_metadata(self, name: str) -> bool:
            """Does the package's distribution contain the named metadata?"""

        def get_metadata(self, name: str) -> str:
            """The named metadata resource as a string"""

        def get_metadata_lines(self, name: str) -> Iterator[str]:
            """Yield named metadata resource as list of non-blank non-comment lines

            Leading and trailing whitespace is stripped from each line, and lines
            with ``#`` as the first non-blank character are omitted."""

        def metadata_isdir(self, name: str) -> bool:
            """Is the named metadata a directory?  (like ``os.path.isdir()``)"""

        def metadata_listdir(self, name: str) -> list[str]:
            """List of metadata names in the directory (like ``os.listdir()``)"""

        def run_script(self, script_name: str, namespace: dict[str, Any]) -> None:
            """Execute the named script in the supplied namespace dictionary"""

    class IResourceProvider(IMetadataProvider, Protocol):
        """An object that provides access to package resources"""

        def get_resource_filename(
            self, manager: ResourceManager, resource_name: str
        ) -> str:
            """Return a true filesystem path for `resource_name`

            `manager` must be a ``ResourceManager``"""

        def get_resource_stream(
            self, manager: ResourceManager, resource_name: str
        ) -> _ResourceStream:
            """Return a readable file-like object for `resource_name`

            `manager` must be a ``ResourceManager``"""

        def get_resource_string(
            self, manager: ResourceManager, resource_name: str
        ) -> bytes:
            """Return the contents of `resource_name` as :obj:`bytes`

            `manager` must be a ``ResourceManager``"""

        def has_resource(self, resource_name: str) -> bool:
            """Does the package contain the named resource?"""

        def resource_isdir(self, resource_name: str) -> bool:
            """Is the named resource a directory?  (like ``os.path.isdir()``)"""

        def resource_listdir(self, resource_name: str) -> list[str]:
            """List of resource names in the directory (like ``os.listdir()``)"""

    class WorkingSet:
        """A collection of active distributions on sys.path (or a similar list)"""

        def __init__(self, entries: Iterable[str] | None = None):
            """Create working set from list of path entries (default=sys.path)"""
            self.entries: list[str] = []
            self.entry_keys = {}
            self.by_key = {}
            self.normalized_to_canonical_keys = {}
            self.callbacks = []

            if entries is None:
                entries = sys.path

            for entry in entries:
                self.add_entry(entry)

        @classmethod
        def _build_master(cls):
            """
            Prepare the master working set.
            """
            ws = cls()
            try:
                from __main__ import __requires__
            except ImportError:
                # The main program does not list any requirements
                return ws

            # ensure the requirements are met
            try:
                ws.require(__requires__)
            except VersionConflict:
                return cls._build_from_requirements(__requires__)

            return ws

        @classmethod
        def _build_from_requirements(cls, req_spec):
            """
            Build a working set from a requirement spec. Rewrites sys.path.
            """
            # try it without defaults already on sys.path
            # by starting with an empty path
            ws = cls([])
            reqs = parse_requirements(req_spec)
            dists = ws.resolve(reqs, Environment())
            for dist in dists:
                ws.add(dist)

            # add any missing entries from sys.path
            for entry in sys.path:
                if entry not in ws.entries:
                    ws.add_entry(entry)

            # then copy back to sys.path
            sys.path[:] = ws.entries
            return ws

        def add_entry(self, entry: str):
            """Add a path item to ``.entries``, finding any distributions on it

            ``find_distributions(entry, True)`` is used to find distributions
            corresponding to the path entry, and they are added.  `entry` is
            always appended to ``.entries``, even if it is already present.
            (This is because ``sys.path`` can contain the same value more than
            once, and the ``.entries`` of the ``sys.path`` WorkingSet should always
            equal ``sys.path``.)
            """
            self.entry_keys.setdefault(entry, [])
            self.entries.append(entry)
            for dist in find_distributions(entry, True):
                self.add(dist, entry, False)

        def __contains__(self, dist: Distribution) -> bool:
            """True if `dist` is the active distribution for its project"""
            return self.by_key.get(dist.key) == dist

        def find(self, req: Requirement) -> Distribution | None:
            """Find a distribution matching requirement `req`

            If there is an active distribution for the requested project, this
            returns it as long as it meets the version requirement specified by
            `req`.  But, if there is an active distribution for the project and it
            does *not* meet the `req` requirement, ``VersionConflict`` is raised.
            If there is no active distribution for the requested project, ``None``
            is returned.
            """
            dist = self.by_key.get(req.key)

            if dist is None:
                canonical_key = self.normalized_to_canonical_keys.get(req.key)

                if canonical_key is not None:
                    req.key = canonical_key
                    dist = self.by_key.get(canonical_key)

            if dist is not None and dist not in req:
                # XXX add more info
                raise VersionConflict(dist, req)
            return dist
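
The normalized-key fallback in `find` maps a PEP 503-style canonical form back to the key actually stored in `by_key`. A simplified sketch of that two-step lookup (the real code uses `packaging.utils.canonicalize_name`; the regex below is an approximation):

```python
import re

def canonicalize(name: str) -> str:
    # Approximation of PEP 503 name normalization.
    return re.sub(r"[-_.]+", "-", name).lower()

by_key = {"Foo.Bar": "<Distribution Foo.Bar>"}
normalized_to_canonical_keys = {canonicalize(k): k for k in by_key}

def find(key: str):
    dist = by_key.get(key)
    if dist is None:
        # Exact key missed: try the canonical key for the normalized form.
        canonical = normalized_to_canonical_keys.get(canonicalize(key))
        if canonical is not None:
            dist = by_key.get(canonical)
    return dist

print(find("foo_bar"))  # resolves via the normalized key
```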

        def iter_entry_points(self, group: str, name: str | None = None):
            """Yield entry point objects from `group` matching `name`

            If `name` is None, yields all entry points in `group` from all
            distributions in the working set, otherwise only ones matching
            both `group` and `name` are yielded (in distribution order).
            """
            return (
                entry
                for dist in self
                for entry in dist.get_entry_map(group).values()
                if name is None or name == entry.name
            )

        def run_script(self, requires: str, script_name: str):
            """Locate distribution for `requires` and run `script_name` script"""
            ns = sys._getframe(1).f_globals
            name = ns['__name__']
            ns.clear()
            ns['__name__'] = name
            self.require(requires)[0].run_script(script_name, ns)

        def __iter__(self) -> Iterator[Distribution]:
            """Yield distributions for non-duplicate projects in the working set

            The yield order is the order in which the items' path entries were
            added to the working set.
            """
            seen = set()
            for item in self.entries:
                if item not in self.entry_keys:
                    # workaround a cache issue
                    continue

                for key in self.entry_keys[item]:
                    if key not in seen:
                        seen.add(key)
                        yield self.by_key[key]

        def add(
            self,
            dist: Distribution,
            entry: str | None = None,
            insert: bool = True,
            replace: bool = False,
        ):
            """Add `dist` to working set, associated with `entry`

            If `entry` is unspecified, it defaults to the ``.location`` of `dist`.
            On exit from this routine, `entry` is added to the end of the working
            set's ``.entries`` (if it wasn't already present).

            `dist` is only added to the working set if it's for a project that
            doesn't already have a distribution in the set, unless `replace=True`.
            If it's added, any callbacks registered with the ``subscribe()`` method
            will be called.
            """
            if insert:
                dist.insert_on(self.entries, entry, replace=replace)

            if entry is None:
                entry = dist.location
            keys = self.entry_keys.setdefault(entry, [])
            keys2 = self.entry_keys.setdefault(dist.location, [])
            if not replace and dist.key in self.by_key:
                # ignore hidden distros
                return

            self.by_key[dist.key] = dist
            normalized_name = _packaging_utils.canonicalize_name(dist.key)
            self.normalized_to_canonical_keys[normalized_name] = dist.key
            if dist.key not in keys:
                keys.append(dist.key)
            if dist.key not in keys2:
                keys2.append(dist.key)
            self._added_new(dist)

        @overload
        def resolve(
            self,
            requirements: Iterable[Requirement],
            env: Environment | None,
            installer: _InstallerTypeT[_DistributionT],
            replace_conflicting: bool = False,
            extras: tuple[str, ...] | None = None,
        ) -> list[_DistributionT]: ...
        @overload
        def resolve(
            self,
            requirements: Iterable[Requirement],
            env: Environment | None = None,
            *,
            installer: _InstallerTypeT[_DistributionT],
            replace_conflicting: bool = False,
            extras: tuple[str, ...] | None = None,
        ) -> list[_DistributionT]: ...
        @overload
        def resolve(
            self,
            requirements: Iterable[Requirement],
            env: Environment | None = None,
            installer: _InstallerType | None = None,
            replace_conflicting: bool = False,
            extras: tuple[str, ...] | None = None,
        ) -> list[Distribution]: ...
        def resolve(
            self,
            requirements: Iterable[Requirement],
            env: Environment | None = None,
            installer: _InstallerType | None | _InstallerTypeT[_DistributionT] = None,
            replace_conflicting: bool = False,
            extras: tuple[str, ...] | None = None,
        ) -> list[Distribution] | list[_DistributionT]:
            """List all distributions needed to (recursively) meet `requirements`

            `requirements` must be a sequence of ``Requirement`` objects.  `env`,
            if supplied, should be an ``Environment`` instance.  If
            not supplied, it defaults to all distributions available within any
            entry or distribution in the working set.  `installer`, if supplied,
            will be invoked with each requirement that cannot be met by an
            already-installed distribution; it should return a ``Distribution`` or
            ``None``.

            Unless `replace_conflicting=True`, raises a VersionConflict exception
            if any requirements are found on the path that have the correct name
            but the wrong version.  Otherwise, if an `installer` is supplied it
            will be invoked to obtain the correct version of the requirement and
            activate it.

            `extras` is a list of the extras to be used with these requirements.
            This is important because extra requirements may look like `my_req;
            extra = "my_extra"`, which would otherwise be interpreted as a purely
            optional requirement.  Instead, we want to be able to assert that these
            requirements are truly required.
            """

            # set up the stack
            requirements = list(requirements)[::-1]
            # set of processed requirements
            processed = set()
            # key -> dist
            best = {}
            to_activate = []

            req_extras = _ReqExtras()

            # Mapping of requirement to set of distributions that required it;
            # useful for reporting info about conflicts.
            required_by = collections.defaultdict(set)

            while requirements:
                # process dependencies breadth-first
                req = requirements.pop(0)
                if req in processed:
                    # Ignore cyclic or redundant dependencies
                    continue

                if not req_extras.markers_pass(req, extras):
                    continue

                dist = self._resolve_dist(
                    req, best, replace_conflicting, env, installer, required_by, to_activate
                )

                # push the new requirements onto the stack
                new_requirements = dist.requires(req.extras)[::-1]
                requirements.extend(new_requirements)

                # Register the new requirements needed by req
                for new_requirement in new_requirements:
                    required_by[new_requirement].add(req.project_name)
                    req_extras[new_requirement] = req.extras

                processed.add(req)

            # return list of distros to activate
            return to_activate
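
The main loop of `resolve` is a breadth-first walk over requirements: pop one, activate its distribution, queue that distribution's own requirements, and skip anything already processed. A toy version with version checks, extras, and conflict handling omitted:

```python
from collections import deque

# Hypothetical dependency graph: name -> names it requires.
deps = {"app": ["lib-a", "lib-b"], "lib-a": ["lib-c"], "lib-b": [], "lib-c": []}

def resolve_deps(roots):
    to_activate, processed = [], set()
    queue = deque(roots)
    while queue:
        req = queue.popleft()
        if req in processed:  # ignore cyclic or redundant dependencies
            continue
        processed.add(req)
        to_activate.append(req)
        queue.extend(deps[req])  # push the new requirements
    return to_activate

print(resolve_deps(["app"]))  # ['app', 'lib-a', 'lib-b', 'lib-c']
```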

        def _resolve_dist(
            self, req, best, replace_conflicting, env, installer, required_by, to_activate
        ) -> Distribution:
            dist = best.get(req.key)
            if dist is None:
                # Find the best distribution and add it to the map
                dist = self.by_key.get(req.key)
                if dist is None or (dist not in req and replace_conflicting):
                    ws = self
                    if env is None:
                        if dist is None:
                            env = Environment(self.entries)
                        else:
                            # Use an empty environment and workingset to avoid
                            # any further conflicts with the conflicting
                            # distribution
                            env = Environment([])
                            ws = WorkingSet([])
                    dist = best[req.key] = env.best_match(
                        req, ws, installer, replace_conflicting=replace_conflicting
                    )
                    if dist is None:
                        requirers = required_by.get(req, None)
                        raise DistributionNotFound(req, requirers)
                to_activate.append(dist)
            if dist not in req:
                # Oops, the "best" so far conflicts with a dependency
                dependent_req = required_by[req]
                raise VersionConflict(dist, req).with_context(dependent_req)
            return dist

        @overload
        def find_plugins(
            self,
            plugin_env: Environment,
            full_env: Environment | None,
            installer: _InstallerTypeT[_DistributionT],
            fallback: bool = True,
        ) -> tuple[list[_DistributionT], dict[Distribution, Exception]]: ...
        @overload
        def find_plugins(
            self,
            plugin_env: Environment,
            full_env: Environment | None = None,
            *,
            installer: _InstallerTypeT[_DistributionT],
            fallback: bool = True,
        ) -> tuple[list[_DistributionT], dict[Distribution, Exception]]: ...
        @overload
        def find_plugins(
            self,
            plugin_env: Environment,
            full_env: Environment | None = None,
            installer: _InstallerType | None = None,
            fallback: bool = True,
        ) -> tuple[list[Distribution], dict[Distribution, Exception]]: ...
        def find_plugins(
            self,
            plugin_env: Environment,
            full_env: Environment | None = None,
            installer: _InstallerType | None | _InstallerTypeT[_DistributionT] = None,
            fallback: bool = True,
        ) -> tuple[
            list[Distribution] | list[_DistributionT],
            dict[Distribution, Exception],
        ]:
            """Find all activatable distributions in `plugin_env`

            Example usage::

                distributions, errors = working_set.find_plugins(
                    Environment(plugin_dirlist)
                )
                # add plugins+libs to sys.path
                map(working_set.add, distributions)
                # display errors
                print('Could not load', errors)

            The `plugin_env` should be an ``Environment`` instance that contains
            only distributions that are in the project's "plugin directory" or
            directories. The `full_env`, if supplied, should be an ``Environment``
            that contains all currently-available distributions.  If `full_env` is not
            supplied, one is created automatically from the ``WorkingSet`` this
            method is called on, which will typically mean that every directory on
            ``sys.path`` will be scanned for distributions.

            `installer` is a standard installer callback as used by the
            ``resolve()`` method. The `fallback` flag indicates whether we should
            attempt to resolve older versions of a plugin if the newest version
            cannot be resolved.

            This method returns a 2-tuple: (`distributions`, `error_info`), where
            `distributions` is a list of the distributions found in `plugin_env`
            that were loadable, along with any other distributions that are needed
            to resolve their dependencies.  `error_info` is a dictionary mapping
            unloadable plugin distributions to an exception instance describing the
            error that occurred. Usually this will be a ``DistributionNotFound`` or
            ``VersionConflict`` instance.
            """

            plugin_projects = list(plugin_env)
            # scan project names in alphabetic order
            plugin_projects.sort()

            error_info: dict[Distribution, Exception] = {}
            distributions: dict[Distribution, Exception | None] = {}

            if full_env is None:
                env = Environment(self.entries)
                env += plugin_env
            else:
                env = full_env + plugin_env

            shadow_set = self.__class__([])
            # put all our entries in shadow_set
            list(map(shadow_set.add, self))

            for project_name in plugin_projects:
                for dist in plugin_env[project_name]:
                    req = [dist.as_requirement()]

                    try:
                        resolvees = shadow_set.resolve(req, env, installer)

                    except ResolutionError as v:
                        # save error info
                        error_info[dist] = v
                        if fallback:
                            # try the next older version of project
                            continue
                        else:
                            # give up on this project, keep going
                            break

                    else:
                        list(map(shadow_set.add, resolvees))
                        distributions.update(dict.fromkeys(resolvees))

                        # success, no need to try any more versions of this project
                        break

            sorted_distributions = list(distributions)
            sorted_distributions.sort()

            return sorted_distributions, error_info

        def require(self, *requirements: _NestedStr):
            """Ensure that distributions matching `requirements` are activated

            `requirements` must be a string or a (possibly-nested) sequence
            thereof, specifying the distributions and versions required.  The
            return value is a sequence of the distributions that needed to be
            activated to fulfill the requirements; all relevant distributions are
            included, even if they were already activated in this working set.
            """
            needed = self.resolve(parse_requirements(requirements))

            for dist in needed:
                self.add(dist)

            return needed

        def subscribe(
            self, callback: Callable[[Distribution], object], existing: bool = True
        ):
            """Invoke `callback` for all distributions

            If `existing=True` (default), the callback is also invoked for all
            distributions already in the working set.
            """
            if callback in self.callbacks:
                return
            self.callbacks.append(callback)
            if not existing:
                return
            for dist in self:
                callback(dist)

        def _added_new(self, dist):
            for callback in self.callbacks:
                callback(dist)

        def __getstate__(self):
            return (
                self.entries[:],
                self.entry_keys.copy(),
                self.by_key.copy(),
                self.normalized_to_canonical_keys.copy(),
                self.callbacks[:],
            )

        def __setstate__(self, e_k_b_n_c):
            entries, keys, by_key, normalized_to_canonical_keys, callbacks = e_k_b_n_c
            self.entries = entries[:]
            self.entry_keys = keys.copy()
            self.by_key = by_key.copy()
            self.normalized_to_canonical_keys = normalized_to_canonical_keys.copy()
            self.callbacks = callbacks[:]

    class _ReqExtras(Dict["Requirement", Tuple[str, ...]]):
        """
        Map each requirement to the extras that demanded it.
        """

        def markers_pass(self, req: Requirement, extras: tuple[str, ...] | None = None):
            """
            Evaluate markers for req against each extra that
            demanded it.

            Return False if the req has a marker and fails
            evaluation. Otherwise, return True.
            """
            extra_evals = (
                req.marker.evaluate({'extra': extra})
                for extra in self.get(req, ()) + (extras or (None,))
            )
            return not req.marker or any(extra_evals)
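
The logic of `markers_pass` can be restated compactly: a requirement guarded by a marker passes only if one of the extras that demanded it (or an explicitly supplied extra, or the bare `None` environment) satisfies the marker. A sketch where markers are modeled as plain callables instead of `packaging.markers` objects:

```python
# Hypothetical mapping: requirement name -> extras that demanded it.
req_extras = {"pytest": ("test",)}

def markers_pass(req_name, marker, extras=None):
    candidates = req_extras.get(req_name, ()) + (extras or (None,))
    return marker is None or any(marker({"extra": e}) for e in candidates)

marker = lambda env: env["extra"] == "test"
print(markers_pass("pytest", marker))    # True: demanded by the 'test' extra
print(markers_pass("coverage", marker))  # False: nothing supplies a matching extra
```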

>   class Environment:

/Users/runner/work/open-aea/open-aea/.tox/py3.8/lib/python3.8/site-packages/pip/_vendor/pkg_resources/__init__.py:1126:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    class Environment:
        """Searchable snapshot of distributions on a search path"""

        def __init__(
            self,
            search_path: Iterable[str] | None = None,
>           platform: str | None = get_supported_platform(),
            python: str | None = PY_MAJOR,
        ):

/Users/runner/work/open-aea/open-aea/.tox/py3.8/lib/python3.8/site-packages/pip/_vendor/pkg_resources/__init__.py:1132:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    def get_supported_platform():
        """Return this platform's maximum compatible version.

        distutils.util.get_platform() normally reports the minimum version
        of macOS that would be required to *use* extensions produced by
        distutils.  But what we want when checking compatibility is to know the
        version of macOS that we are *running*.  To allow usage of packages that
        explicitly require a newer version of macOS, we must also know the
        current version of the OS.

        If this condition occurs for any other platform with a version in its
        platform strings, this function should be extended accordingly.
        """
>       plat = get_build_platform()

/Users/runner/work/open-aea/open-aea/.tox/py3.8/lib/python3.8/site-packages/pip/_vendor/pkg_resources/__init__.py:212:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    def get_build_platform():
        """Return this platform's string for platform-specific distributions

        XXX Currently this is the same as ``distutils.util.get_platform()``, but it
        needs some hacks for Linux and macOS.
        """
        from sysconfig import get_platform

>       plat = get_platform()

/Users/runner/work/open-aea/open-aea/.tox/py3.8/lib/python3.8/site-packages/pip/_vendor/pkg_resources/__init__.py:459:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    def get_platform():
        """Return a string that identifies the current platform.

        This is used mainly to distinguish platform-specific build directories and
        platform-specific built distributions.  Typically includes the OS name and
        version and the architecture (as supplied by 'os.uname()'), although the
        exact information included depends on the OS; on Linux, the kernel version
        isn't particularly important.

        Examples of returned values:
           linux-i586
           linux-alpha (?)
           solaris-2.6-sun4u

        Windows will return one of:
           win-amd64 (64bit Windows on AMD64 (aka x86_64, Intel64, EM64T, etc)
           win32 (all others - specifically, sys.platform is returned)

        For other non-POSIX platforms, currently just returns 'sys.platform'.

        """
        if os.name == 'nt':
            if 'amd64' in sys.version.lower():
                return 'win-amd64'
            if '(arm)' in sys.version.lower():
                return 'win-arm32'
            if '(arm64)' in sys.version.lower():
                return 'win-arm64'
            return sys.platform

        if os.name != "posix" or not hasattr(os, 'uname'):
            # XXX what about the architecture? NT is Intel or Alpha
            return sys.platform

        # Set for cross builds explicitly
        if "_PYTHON_HOST_PLATFORM" in os.environ:
            return os.environ["_PYTHON_HOST_PLATFORM"]

        # Try to distinguish various flavours of Unix
        osname, host, release, version, machine = os.uname()

        # Convert the OS name to lowercase, remove '/' characters, and translate
        # spaces (for "Power Macintosh")
        osname = osname.lower().replace('/', '')
        machine = machine.replace(' ', '_')
        machine = machine.replace('/', '-')

        if osname[:5] == "linux":
            # At least on Linux/Intel, 'machine' is the processor --
            # i386, etc.
            # XXX what about Alpha, SPARC, etc?
            return  "%s-%s" % (osname, machine)
        elif osname[:5] == "sunos":
            if release[0] >= "5":           # SunOS 5 == Solaris 2
                osname = "solaris"
                release = "%d.%s" % (int(release[0]) - 3, release[2:])
                # We can't use "platform.architecture()[0]" because a
                # bootstrap problem. We use a dict to get an error
                # if some suspicious happens.
                bitness = {2147483647:"32bit", 9223372036854775807:"64bit"}
                machine += ".%s" % bitness[sys.maxsize]
            # fall through to standard osname-release-machine representation
        elif osname[:3] == "aix":
            return "%s-%s.%s" % (osname, version, release)
        elif osname[:6] == "cygwin":
            osname = "cygwin"
            import re
            rel_re = re.compile(r'[\d.]+')
            m = rel_re.match(release)
            if m:
                release = m.group()
        elif osname[:6] == "darwin":
            import _osx_support
>           osname, release, machine = _osx_support.get_platform_osx(
                                                get_config_vars(),
                                                osname, release, machine)

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/sysconfig.py:691:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

_config_vars = {'ABIFLAGS': '', 'AC_APPLE_UNIVERSAL_BUILD': 0, 'AIX_GENUINE_CPLUSPLUS': 0, 'ALT_SOABI': 0, ...}
osname = 'macosx', release = '11', machine = 'fat'

    def get_platform_osx(_config_vars, osname, release, machine):
        """Filter values for get_platform()"""
        # called from get_platform() in sysconfig and distutils.util
        #
        # For our purposes, we'll assume that the system version from
        # distutils' perspective is what MACOSX_DEPLOYMENT_TARGET is set
        # to. This makes the compatibility story a bit more sane because the
        # machine is going to compile and link as if it were
        # MACOSX_DEPLOYMENT_TARGET.

        macver = _config_vars.get('MACOSX_DEPLOYMENT_TARGET', '')
        macrelease = _get_system_version() or macver
        macver = macver or macrelease

        if macver:
            release = macver
            osname = "macosx"

            # Use the original CFLAGS value, if available, so that we
            # return the same machine type for the platform string.
            # Otherwise, distutils may consider this a cross-compiling
            # case and disallow installs.
            cflags = _config_vars.get(_INITPRE+'CFLAGS',
                                        _config_vars.get('CFLAGS', ''))
            if macrelease:
                try:
                    macrelease = tuple(int(i) for i in macrelease.split('.')[0:2])
                except ValueError:
                    macrelease = (10, 0)
            else:
                # assume no universal support
                macrelease = (10, 0)

            if (macrelease >= (10, 4)) and '-arch' in cflags.strip():
                # The universal build will build fat binaries, but not on
                # systems before 10.4

                machine = 'fat'

>               archs = re.findall(r'-arch\s+(\S+)', cflags)

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/_osx_support.py:539:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <MagicMock name='findall' id='4700814208'>
args = ('-arch\\s+(\\S+)', '-Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -arch arm64 -arch x86_64 -g')
kwargs = {}

    def __call__(self, /, *args, **kwargs):
        # can't use self in-case a function / method we are mocking uses self
        # in the signature
        self._mock_check_sig(*args, **kwargs)
        self._increment_mock_call(*args, **kwargs)
>       return self._mock_call(*args, **kwargs)

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/unittest/mock.py:1081:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <MagicMock name='findall' id='4700814208'>
args = ('-arch\\s+(\\S+)', '-Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -arch arm64 -arch x86_64 -g')
kwargs = {}

    def _mock_call(self, /, *args, **kwargs):
>       return self._execute_mock_call(*args, **kwargs)

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/unittest/mock.py:1085:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <MagicMock name='findall' id='4700814208'>
args = ('-arch\\s+(\\S+)', '-Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -arch arm64 -arch x86_64 -g')
kwargs = {}, effect = <list_iterator object at 0x11830c1f0>

    def _execute_mock_call(self, /, *args, **kwargs):
        # separate from _increment_mock_call so that awaited functions are
        # executed separately from their call, also AsyncMock overrides this method

        effect = self.side_effect
        if effect is not None:
            if _is_exception(effect):
                raise effect
            elif not _callable(effect):
>               result = next(effect)
E               StopIteration

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/unittest/mock.py:1142: StopIteration

The above exception was the direct cause of the following exception:

self = <tests.test_cli.test_check_packages.TestCheckPackagesCommand object at 0x11291da90>
test_param = _TestPublicIdParameters(side_effect=[[(None,)], [], [('fetchai', 'gym', '0.19.0')]], exit_code=0, message='OK!')

    @pytest.mark.parametrize(
        "test_param",
        [
            _TestPublicIdParameters(
                side_effect=[
                    [(None,)],
                    [(None, None, None)],
                ],
                exit_code=1,
                message="found 'None'",
            ),
            _TestPublicIdParameters(
                side_effect=[
                    [(None,)],
                    [],
                    [(None, None, None)],
                ],
                exit_code=1,
                message="found 'None/None:None'",
            ),
            _TestPublicIdParameters(
                side_effect=[
                    [(None,)],
                    [],
                    [("fetchai", "gym", "0.19.0")],
                ],
                exit_code=0,
                message="OK!",
            ),
            _TestPublicIdParameters(
                side_effect=[
                    [(None,)],
                    [],
                    ["", ()],
                ],
                exit_code=1,
                message="found ''",
            ),
        ],
    )
    def test_check_public_id_failure_wrong_public_id(
        self, test_param: _TestPublicIdParameters
    ) -> None:
        """Test `check_public_id` failure."""

        with mock.patch(
            "re.findall",
            side_effect=test_param.side_effect,
        ), _find_all_configuration_files_patch(
            [self.test_connection_config]
        ), check_dependencies_patch:
>           result = self.invoke(
                "--registry-path",
                str(self.packages_dir_path),
                "check-packages",
            )

/Users/runner/work/open-aea/open-aea/tests/test_cli/test_check_packages.py:237:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/Users/runner/work/open-aea/open-aea/aea/test_tools/test_cases.py:944: in invoke
    result = cls.runner.invoke(
/Users/runner/work/open-aea/open-aea/aea/test_tools/click_testing.py:103: in invoke
    cli.main(args=args or (), prog_name=prog_name, **extra)
/Users/runner/work/open-aea/open-aea/.tox/py3.8/lib/python3.8/site-packages/click/core.py:1078: in main
    rv = self.invoke(ctx)
/Users/runner/work/open-aea/open-aea/.tox/py3.8/lib/python3.8/site-packages/click/core.py:1688: in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
/Users/runner/work/open-aea/open-aea/.tox/py3.8/lib/python3.8/site-packages/click/core.py:1434: in invoke
    return ctx.invoke(self.callback, **ctx.params)
/Users/runner/work/open-aea/open-aea/.tox/py3.8/lib/python3.8/site-packages/click/core.py:783: in invoke
    return __callback(*args, **kwargs)
/Users/runner/work/open-aea/open-aea/.tox/py3.8/lib/python3.8/site-packages/click/decorators.py:92: in new_func
    return ctx.invoke(f, obj, *args, **kwargs)
/Users/runner/work/open-aea/open-aea/.tox/py3.8/lib/python3.8/site-packages/click/core.py:783: in invoke
    return __callback(*args, **kwargs)
/Users/runner/work/open-aea/open-aea/aea/cli/check_packages.py:621: in check_packages
    check_pypi_dependencies(file)
/Users/runner/work/open-aea/open-aea/aea/cli/check_packages.py:601: in check_pypi_dependencies
    PyPIDependenciesCheckTool(configuration_file).run()
/Users/runner/work/open-aea/open-aea/aea/cli/check_packages.py:549: in run
    package_dependencies = self.get_dependencies()
/Users/runner/work/open-aea/open-aea/aea/cli/check_packages.py:541: in get_dependencies
    result[dep] = DependenciesTool.get_package_files(dep)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

package_name = 'gym'

    @staticmethod
    def get_package_files(package_name: str) -> List[Path]:
        """Get package files list."""
>       packages_info = list(search_packages_info([package_name]))
E       RuntimeError: generator raised StopIteration
```

The issue occurs for the following reasons:

- https://github.com/python/cpython/blob/v3.13.1/Lib/_osx_support.py#L537-L543
- https://github.com/valory-xyz/open-aea/blob/v1.60.0/tests/test_cli/test_check_packages.py#L231-L233

In short, now that the macOS version was updated (7e3df5fcf3213bb41f0fef68c04a6ea5e4bbc594), the branch in the first link is taken. That branch calls `re.findall`, which the test mocks globally, as seen in the second link. The extra, unanticipated call exhausts the mock's `side_effect` list, so the mock raises `StopIteration`; since this happens inside the `search_packages_info` generator, PEP 479 converts it into `RuntimeError: generator raised StopIteration`.
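The failure mode can be reproduced in isolation. This is a minimal sketch (the `grep` generator below is illustrative, standing in for pip's `search_packages_info`): patching `re.findall` with a finite `side_effect` list means any unanticipated call hits an exhausted iterator, and the resulting `StopIteration` becomes a `RuntimeError` when it escapes a generator frame.

```python
import re
from unittest import mock

def grep(pattern, text):
    # A generator that calls re.findall internally, loosely modelling
    # pip's search_packages_info (the function the failing test reaches).
    for item in re.findall(pattern, text):
        yield item

with mock.patch("re.findall", side_effect=[["a"]]):
    first = list(grep("x", "y"))   # consumes the only prepared value
    try:
        list(grep("x", "y"))       # side_effect exhausted: mock raises StopIteration
        error = None
    except RuntimeError as exc:    # PEP 479 wraps it inside the generator frame
        error = str(exc)

print(first, error)  # ['a'] generator raised StopIteration
```

Outside a generator, the second call would surface as a bare `StopIteration` (as in the mock frames of the traceback above); it is the generator boundary that turns it into the `RuntimeError` reported by pytest.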

In general, mocking standard library functions globally is fragile: every caller of the patched function, including unrelated stdlib code paths, consumes the prepared `side_effect` values. This commit refactors the whole `check_public_id` function and the corresponding tests to simplify them, separate concerns, and fix the issue by removing the mock of the `re.findall` standard library function.
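One way to avoid the problem is to give the module under test its own seam around the regex and patch that seam instead of `re.findall`. The names below are purely illustrative (not the actual open-aea API), and `return_value` is used instead of a finite `side_effect` list so the stub can never be exhausted:

```python
import re
from unittest import mock

# Production side (inlined here for the sketch): wrap the regex call in a
# helper owned by the module under test.
def extract_public_ids(text):
    return re.findall(r"(\w+)/(\w+):([\w.]+)", text)

def check_public_id(text):
    return [f"{author}/{name}:{version}"
            for author, name, version in extract_public_ids(text)]

# Test side: patch the seam where it is looked up, not re.findall itself,
# so unrelated callers (e.g. _osx_support) keep the real function.
with mock.patch(f"{__name__}.extract_public_ids",
                return_value=[("fetchai", "gym", "0.19.0")]):
    assert check_public_id("ignored") == ["fetchai/gym:0.19.0"]

# Unpatched, the real regex still works:
assert check_public_id("fetchai/gym:0.19.0") == ["fetchai/gym:0.19.0"]
```

Patching at the point of use follows the "where to patch" guidance in the `unittest.mock` documentation and keeps the stub's blast radius limited to the code actually under test.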