This repository was archived by the owner on Apr 17, 2023. It is now read-only.

Fix remaining issues on the synchronization of the registry #1599

Closed
mssola opened this issue Jan 18, 2018 · 9 comments
@mssola
Collaborator

mssola commented Jan 18, 2018

There are some pending issues from the synchronization of the registry. This issue tries to be an aggregate of old issues.

Crono deletes everything

Some issues pointed out that old crono might remove all the contents. See #663.

Crono is adding & removing images

See issues: #1184, #1261, #1160 and #1232.

@Vesli

Vesli commented Jan 22, 2018

Hey mssola, how is this issue going?

Here at my company I wanted to have a go with Portus and fortunately we are using it for our staging environment only at the moment.

I say fortunately because we are running it directly on an openSUSE machine and over the weekend Portus deleted all the images, from its DB and from the registry!

I saw that for some people it pushes them again after some time, but in our case it did nothing of the sort.

I'm on Portus 2.2 and registry 2.6.
Should I move to the Docker version instead or stop using Portus for now?

Thanks!

@mssola
Collaborator Author

mssola commented Jan 22, 2018

> I say fortunately because we are running it directly on an openSUSE machine and over the weekend Portus deleted all the images, from its DB and from the registry!

That is ... bad 😞 Could you collect the logs and give more detail on your deployment strategy? Even if it's a bit old, it might help us fix this situation.

> I'm on Portus 2.2 and registry 2.6. Should I move to the Docker version instead or stop using Portus for now?

Well, I'd advise moving to the Docker version. Bear in mind that we are in the process of releasing the 2.3 version, so you might want to take a look at it (it also has some deployment changes). Check our examples directory for some production-ready examples (just use opensuse/portus:2.3 instead of opensuse/portus:head, since there are no stability guarantees for head 😉).

@marc0s

marc0s commented Feb 1, 2018

I'm using portus:head and registry:2.6 and I'm hitting this issue. Any workaround in the meantime?
What I see in the logs is:

background_1  | Net::OpenTimeout: connection timed out.
background_1  | [catalog] Could not fetch manifest for 'redacted/image2' with tag 'latest': Net::OpenTimeout: connection timed out.
background_1  | Net::OpenTimeout: connection timed out.
background_1  | [catalog] Could not fetch manifest for 'redacted/image2' with tag 'latest': Net::OpenTimeout: connection timed out.
background_1  | [catalog] Removed the tag 'latest'.
background_1  | [catalog] Removed the image 'image2'.
background_1  | [catalog] Created the tag 'latest'.

Each of the images is around 650MB and there are just 5 of them (it is a recently created registry).

Thanks!

@mssola
Collaborator Author

mssola commented Feb 9, 2018

@marc0s see this comment. This might even fix this issue (we'd need to identify some corner cases when enabled, but it should fix the situation for most people).

mssola added a commit to mssola/Portus that referenced this issue Feb 9, 2018
We have historically struggled with the synchronization of the Registry.
This was developed at first because the following situation could
happen:

1. User pushes image:tag.
2. Portus authorizes the transaction.
3. Registry sends the event to Portus with all the data, but Portus is
   unavailable.
4. Portus is up again but it missed the registry event.

This is not the case anymore, because all the things that could make
Portus unavailable have been either fixed or moved into the background
process. Thus, this problem can be ignored in most cases. Moreover, this
feature is quite dangerous because a bug in this code can make Portus
wipe all the repositories, and historically "funny" issues have
happened.

This commit builds upon SUSE#1631 where background tasks can be disabled,
and it disables the `sync` task by default. Moreover, it also adds a
`sync-strategy` option which is available when users enable this
feature. This option has four possible values:

- `update-delete`: the same behavior we've had up until now.
- `update`: similar to `update-delete` but this one will not delete
  repositories which no longer exist on the registry. This is useful for
  users who don't trust the risky `update-delete` strategy.
- `on-start`: execute `update-delete` only once (on start).
- `initial`: execute `update-delete` only once on start and
  only if the registry is empty. This is the default strategy since it
  might be convenient for new users that already have a running
  registry.

This commit does not fix all the issues mentioned in SUSE#1599, but it
hopefully does the trick for most users.

Fixes SUSE#1650
Fixes SUSE#1664
See SUSE#1599
Depends on SUSE#1631

Signed-off-by: Miquel Sabaté Solà <msabate@suse.com>
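
To make the four `sync-strategy` values above a bit more concrete, here is a minimal Python sketch of how the decision could be expressed. It is not Portus code (Portus itself is written in Ruby): `plan_sync`, `SyncPlan`, `first_run` and `is_empty` are hypothetical names made up for this illustration, and how "empty" is actually determined is left open.

```python
from dataclasses import dataclass

STRATEGIES = ("update-delete", "update", "on-start", "initial")


@dataclass(frozen=True)
class SyncPlan:
    run: bool           # should the sync task run this time?
    allow_delete: bool  # may it delete repositories missing from the registry?


def plan_sync(strategy: str, first_run: bool, is_empty: bool) -> SyncPlan:
    """Decide what one sync pass may do (hypothetical helper, not a Portus API)."""
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown sync strategy: {strategy!r}")
    if strategy == "update-delete":
        # Historical behavior: always run, creations and deletions allowed.
        return SyncPlan(run=True, allow_delete=True)
    if strategy == "update":
        # Always run, but never delete anything.
        return SyncPlan(run=True, allow_delete=False)
    if strategy == "on-start":
        # Same as update-delete, but only on the first pass after start.
        return SyncPlan(run=first_run, allow_delete=first_run)
    # "initial": only on the first pass, and only when nothing is there yet.
    run = first_run and is_empty
    return SyncPlan(run=run, allow_delete=run)
```

Under these assumptions, only `update-delete` keeps the deleting code path active for the whole lifetime of the process, while `on-start` and `initial` reach it at most once.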
mssola added a commit to mssola/Portus that referenced this issue Feb 12, 2018
We have historically struggled with the synchronization of the Registry.
This was developed at first because the following situation could
happen:

1. User pushes image:tag.
2. Portus authorizes the transaction.
3. Registry sends the event to Portus with all the data, but Portus is
   unavailable.
4. Portus is up again but it missed the registry event.

This is not the case anymore, because all the things that could make
Portus unavailable have been either fixed or moved into the background
process. Thus, this problem can be ignored in most cases. Moreover, this
feature is quite dangerous because a bug in this code can make Portus
wipe all the repositories, and historically "funny" issues have
happened.

This commit builds upon SUSE#1631 where background tasks can be disabled.
Moreover, it also adds a `sync-strategy` option which is available when
users enable this feature. This option has four possible values:

- `update-delete`: the same behavior we've had up until now.
- `update`: similar to `update-delete` but this one will not delete
  repositories which no longer exist on the registry. This is useful for
  users who don't trust the risky `update-delete` strategy.
- `on-start`: execute `update-delete` only once (on start).
- `initial`: execute `update-delete` only once on start and
  only if the registry is empty. This is the default strategy since it
  might be convenient for new users that already have a running
  registry.

This commit does not fix all the issues mentioned in SUSE#1599, but it
hopefully does the trick for most users.

Fixes SUSE#1650
Fixes SUSE#1664
See SUSE#1599
Depends on SUSE#1631

Signed-off-by: Miquel Sabaté Solà <msabate@suse.com>
mssola added a commit to mssola/Portus that referenced this issue Feb 14, 2018
We have historically struggled with the synchronization of the Registry.
This was developed at first because the following situation could
happen:

1. User pushes image:tag.
2. Portus authorizes the transaction.
3. Registry sends the event to Portus with all the data, but Portus is
   unavailable.
4. Portus is up again but it missed the registry event.

This is not the case anymore, because all the things that could make
Portus unavailable have been either fixed or moved into the background
process. Thus, this problem can be ignored in most cases. Moreover, this
feature is quite dangerous because a bug in this code can make Portus
wipe all the repositories, and historically "funny" issues have
happened.

This commit builds upon SUSE#1631 where background tasks can be disabled.
Moreover, it also adds a `sync-strategy` option which is available when
users enable this feature. This option has four possible values:

- `update-delete`: the same behavior we've had up until now.
- `update`: similar to `update-delete` but this one will not delete
  repositories which no longer exist on the registry. This is useful for
  users who don't trust the risky `update-delete` strategy.
- `on-start`: execute `update-delete` only once (on start).
- `initial`: execute `update-delete` only once on start and
  only if the registry is empty. This is the default strategy since it
  might be convenient for new users that already have a running
  registry.

This commit does not fix all the issues mentioned in SUSE#1599, but it
hopefully does the trick for most users.

Fixes SUSE#1650
Fixes SUSE#1664
See SUSE#1599
Depends on SUSE#1631

Signed-off-by: Miquel Sabaté Solà <msabate@suse.com>
mssola added a commit that referenced this issue Feb 14, 2018
We have historically struggled with the synchronization of the Registry.
This was developed at first because the following situation could
happen:

1. User pushes image:tag.
2. Portus authorizes the transaction.
3. Registry sends the event to Portus with all the data, but Portus is
   unavailable.
4. Portus is up again but it missed the registry event.

This is not the case anymore, because all the things that could make
Portus unavailable have been either fixed or moved into the background
process. Thus, this problem can be ignored in most cases. Moreover, this
feature is quite dangerous because a bug in this code can make Portus
wipe all the repositories, and historically "funny" issues have
happened.

This commit builds upon #1631 where background tasks can be disabled.
Moreover, it also adds a `sync-strategy` option which is available when
users enable this feature. This option has four possible values:

- `update-delete`: the same behavior we've had up until now.
- `update`: similar to `update-delete` but this one will not delete
  repositories which no longer exist on the registry. This is useful for
  users who don't trust the risky `update-delete` strategy.
- `on-start`: execute `update-delete` only once (on start).
- `initial`: execute `update-delete` only once on start and
  only if the registry is empty. This is the default strategy since it
  might be convenient for new users that already have a running
  registry.

This commit does not fix all the issues mentioned in #1599, but it
hopefully does the trick for most users.

Fixes #1650
Fixes #1664
See #1599
Depends on #1631

Signed-off-by: Miquel Sabaté Solà <msabate@suse.com>
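
The difference between `update` and `update-delete`, and the reason the latter is called risky in the commit message, can be sketched as a plain set difference. This is only an illustration in Python, not how Portus implements it; `reconcile`, `known` and `catalog` are hypothetical names, with `known` standing for what Portus has recorded and `catalog` for what the registry reports.

```python
from typing import Set, Tuple


def reconcile(known: Set[str], catalog: Set[str],
              allow_delete: bool) -> Tuple[Set[str], Set[str]]:
    """Return (to_create, to_delete) for one sync pass (hypothetical helper)."""
    # Repositories the registry has but Portus missed, e.g. a lost push event.
    to_create = catalog - known
    # Repositories Portus knows about that the registry no longer reports.
    to_delete = (known - catalog) if allow_delete else set()
    return to_create, to_delete


# Missed push event: the registry has "team/api" but Portus never heard of it.
known = {"team/web"}
catalog = {"team/web", "team/api"}
print(reconcile(known, catalog, allow_delete=True))  # creates team/api only

# A catalog that wrongly comes back empty is where the strategies differ.
print(reconcile(known, set(), allow_delete=True))    # would delete team/web
print(reconcile(known, set(), allow_delete=False))   # deletes nothing
```

With deletion allowed, any bug or transient failure that makes the catalog look empty turns directly into a full wipe, which matches the reports collected in this issue.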
@mssola
Collaborator Author

mssola commented Feb 15, 2018

We merged a PR yesterday (#1675) which fixes most issues (and works around existing ones). The image has not been built yet on the Docker Hub (it has been in the "Queue" state for quite some time now ...), but once it is built you'll be able to pull it and have a better experience with this piece of Portus 😉

mssola added a commit to mssola/Portus that referenced this issue Apr 17, 2018
This commit is more explicit about what to do when a catalog error happens.
If a plain catalog error happens (fetching the list of
repositories and their manifests), then nothing will be done. If
fetching the catalog worked, and fetching the list of tags then suddenly
fails, before this commit that repository would have been nuked because it
would have been treated as a repository with no tags.

This commit skips that repository when considering which of the
repositories found to be dangling should be removed.

See SUSE#1293
See SUSE#1599

Signed-off-by: Miquel Sabaté Solà <msabate@suse.com>
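
The rule described in this commit message can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the actual Portus code: `CatalogError`, `find_dangling`, `list_repositories` and `list_tags` are hypothetical names, and the two callables stand in for whatever performs the registry requests.

```python
from typing import Callable, Iterable, List, Set


class CatalogError(Exception):
    """Raised by the hypothetical fetchers when a registry request fails."""


def find_dangling(known: Iterable[str],
                  list_repositories: Callable[[], List[str]],
                  list_tags: Callable[[str], List[str]]) -> Set[str]:
    """Return the known repositories that may be removed as dangling."""
    try:
        names = list_repositories()
    except CatalogError:
        # Plain catalog error: do nothing at all.
        return set()

    alive: Set[str] = set()
    skipped: Set[str] = set()
    for name in names:
        try:
            tags = list_tags(name)
        except CatalogError:
            # Tags could not be fetched: unknown state, so never remove it.
            skipped.add(name)
            continue
        if tags:
            alive.add(name)

    # Only repositories that were verifiably absent or tagless are dangling.
    return set(known) - alive - skipped
```

The point of the `skipped` set is that a failed tag request is treated as "unknown" rather than "no tags", so a timeout like the ones in the logs earlier in this thread no longer counts as evidence that a repository is gone.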
@mssola
Collaborator Author

mssola commented Apr 17, 2018

See #1787 for some fixes on the update-delete policy. After some further tests, we will add these fixes into a new patch-level release (2.3.3) so everyone can pull the fixes right away.

mssola added a commit that referenced this issue Apr 18, 2018
This commit is more explicit about what to do when a catalog error happens.
If a plain catalog error happens (fetching the list of
repositories and their manifests), then nothing will be done. If
fetching the catalog worked, and fetching the list of tags then suddenly
fails, before this commit that repository would have been nuked because it
would have been treated as a repository with no tags.

This commit skips that repository when considering which of the
repositories found to be dangling should be removed.

See #1293
See #1599

Signed-off-by: Miquel Sabaté Solà <msabate@suse.com>
@mssola
Collaborator Author

mssola commented Apr 18, 2018

> See #1787 for some fixes on the update-delete policy. After some further tests, we will add these fixes into a new patch-level release (2.3.3) so everyone can pull the fixes right away.

This has already been cherry picked into the v2.3 branch and now it's part of the 2.3 docker image.

@mssola
Collaborator Author

mssola commented Apr 19, 2018

I've deployed a Portus instance for a day, and after messing with it a couple of times, everything seems reliable. I'll do a couple more tests and take another look at the code, but if everything is fine I'll simply close this issue. Feedback is welcome 😄

@mssola
Collaborator Author

mssola commented Apr 25, 2018

I believe this can be closed. Let's create new issues for possible regressions or new bugs on this.

@mssola mssola closed this as completed Apr 25, 2018