Skip to content

Issue#1682/repository simulator fetch tracker #1704

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

kairoaraujo
Copy link
Contributor

This commit implements a feature in Repository Simulator to
track the fetch calls to the metadata and targets. This feature was
mentioned in PR #1666 that generated issue #1682.
This commit adds RepositorySimulator.fetch_tracker. It also changes
the tests/test_updater_consistent_snapshot.py to use the
fetch_tracker instead of using mock.

It implements a dataclass that stores the calls to fetch metadata
(_fetch_metadata) in fetch_tracker.metadata and targets
(_fetch_targets) in fetch_tracker.targets.

The fetch calls for metadata, and targets are stored as lists.

Signed-off-by: Kairo de Araujo kdearaujo@vmware.com

Please fill in the fields below to submit a pull request. The more information
that is provided, the better.

Fixes #1682

Description of the changes being introduced by the pull request:

Please verify and check that the pull request fulfills the following
requirements
:

  • The code follows the Code Style Guidelines
  • Tests have been added for the bug fix or new feature
  • Docs have been added for the bug fix or new feature

Kairo de Araujo added 2 commits December 2, 2021 10:57
This commit implements a feature in Repository Simulator to
track the fetch calls to the metadata and targets. This feature was
mentioned in PR theupdateframework#1666 that generated issue theupdateframework#1682.
This commit adds ``RepositorySimulator.fetch_tracker``. It also changes
the ``tests/test_updater_consistent_snapshot.py`` to use the
``fetch_tracker`` instead of using mock.

It implements a ``dataclass`` that stores the calls to fetch metadata
(``_fetch_metadata``) in ``fetch_tracker.metadata`` and targets
(``_fetch_targets``) in ``fetch_tracker.targets``.

The fetch calls for metadata, and targets are stored as lists.

Signed-off-by: Kairo de Araujo <kdearaujo@vmware.com>
Wrong type as str once it gets a list.

Signed-off-by: Kairo de Araujo <kdearaujo@vmware.com>
Copy link
Contributor Author

@kairoaraujo kairoaraujo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am storing it as a tuple('role name', 'version') for metadata and tuple('target path', 'target hash'). I am still in doubt about readability; maybe we should move it to a dictionary, then we don't need to rely on the code comments to understand it at the first code read.

@@ -229,6 +239,8 @@ def _fetch_target(

If hash is None, then consistent_snapshot is not used.
"""
self.fetch_tracker.targets.append((target_path, target_hash))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We append to the fetch_tracker.targets all fetch requests. Should we append-only successful ones?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want to track the all requests coming from the client and I think this is the correct place.

@@ -248,6 +260,8 @@ def _fetch_metadata(

If version is None, non-versioned metadata is being requested.
"""
self.fetch_tracker.metadata.append((role, version))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We append to the fetch_tracker.metadata all fetch requests. Should we append-only successful ones?

@coveralls
Copy link

coveralls commented Dec 2, 2021

Pull Request Test Coverage Report for Build 1530070605

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 47 of 47 (100.0%) changed or added relevant lines in 3 files are covered.
  • 6 unchanged lines in 1 file lost coverage.
  • Overall coverage increased (+1.0%) to 98.579%

Files with Coverage Reduction New Missed Lines %
tuf/ngclient/updater.py 6 95.3%
Totals Coverage Status
Change from base Build 1530059117: 1.0%
Covered Lines: 3869
Relevant Lines: 3895

💛 - Coveralls

Copy link
Contributor

@sechkova sechkova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, the trackers look good, I think this is what we needed.
My comments are minor, related to types but also I think one check is missing.

I was wondering if defining a dataclass (or two dataclasses for metadata and targets) instead of using tuples may simplify the expected calls lists. We can set default values and "typing" will be simpler but on the other hand instead of (root, 1) we'll have to write SomeCall(root, 1) ... so I am not sure about this. This is a way to say that I am looking for a second opinion, not requesting changes 😁 We can keep it like this for now, maybe when more use cases come this choice we'll be more obvious.

Comment on lines 88 to 89
metadata: list = field(default_factory=list)
targets: list = field(default_factory=list)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
metadata: list = field(default_factory=list)
targets: list = field(default_factory=list)
metadata: List[Tuple[str, Optional[int]]] = field(default_factory=list)
targets: List[Tuple[str, Optional[str]]] = field(default_factory=list)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know my suggestion is ugly but if we want to be strict about types ... I guess it's fine until it appears only once.

info.path, expected_prefix
)
# target files are always persisted without hash prefix
self._assert_targets_files_exist([info.path])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to keep this check?

@@ -229,6 +239,8 @@ def _fetch_target(

If hash is None, then consistent_snapshot is not used.
"""
self.fetch_tracker.targets.append((target_path, target_hash))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want to track the all requests coming from the client and I think this is the correct place.

@@ -116,6 +124,8 @@ def __init__(self) -> None:
self.dump_dir: Optional[str] = None
self.dump_version = 0

self.fetch_tracker: FetchCounter = FetchCounter()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self.fetch_tracker: FetchCounter = FetchCounter()
self.fetch_tracker = FetchCounter()

I think mypy and the code editors are smart enough to infer the type in this case.

@@ -197,36 +199,29 @@ def test_download_targets(self, test_case_data: Dict[str, Any]):
consistent_snapshot: bool = test_case_data["consistent_snapshot"]
prefix_targets_with_hash: bool = test_case_data["prefix_targets"]
hash_algo: Optional[str] = test_case_data["hash_algo"]
targetpaths = ["file", "file.txt", "..file.ext", "f.le"]
targetpaths: list = test_case_data["targetpaths"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
targetpaths: list = test_case_data["targetpaths"]
targetpaths: List[str] = test_case_data["targetpaths"]

Kairo de Araujo added 2 commits December 2, 2021 17:38
…etch-tracker"

This reverts commit d3ba179, reversing
changes made to 47a18dd.
- Rename FetchCounter to FetchTracker as we are not counting
- Fix typing for the FetchCounter metadata
- Simplify the self.fetch_tracker initialization
- Revert the removed _assert_target_files_exist

Signed-off-by: Kairo de Araujo <kdearaujo@vmware.com>
@kairoaraujo
Copy link
Contributor Author

I closed this PR to avoid too many unrelated commits to fix my mess in merging the develop on top of kairoaraujo:issue#1682/repository-simulator-fetch-tracker branch.
The above-linked PR will be taking over with all relevant information.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RepositorySimulator: add trackers/counters for downloaded files
3 participants