Skip to content
This repository was archived by the owner on Apr 17, 2023. It is now read-only.

Background job is pushing and deleting images #1650

Closed
Vad1mo opened this issue Feb 4, 2018 · 5 comments
Closed

Background job is pushing and deleting images #1650

Vad1mo opened this issue Feb 4, 2018 · 5 comments
Assignees

Comments

@Vad1mo
Copy link
Contributor

Vad1mo commented Feb 4, 2018

I tried out the new background process, at the beginning everything looked fine, however the job removes and adds images when the application is restarted.

I have not more information, but this is the situation I observed. Will investigate it a bit deeper in the next days.

I have only two images in the registry. It was a fresh setup with fresh DB and registry.

Here are some stacktraces

| [Initialization] Running: 'Registry events', 'Security scanning', 'Registry synchronization'
| [Initialization] Running: 'Registry events', 'Security scanning', 'Registry synchronization'
| /app/lib/portus/registry_client.rb:137:in `get_page': Something went wrong while fetching the catalog Response: [301] -  (Portus::RegistryClient::RegistryError)
| /app/lib/portus/registry_client.rb:137:in `get_page': Something went wrong while fetching the catalog Response: [500] - 500 Internal Server Error (Portus::RegistryClient::RegistryError)
| If you are the administrator of this website, then please read this web application's log file and/or the web server's log file to find out what went wrong.
| 	from /app/lib/portus/registry_client.rb:118:in `paged_response'
| 	from /app/lib/portus/registry_client.rb:83:in `catalog'
| 	from /app/lib/portus/registry_client.rb:118:in `paged_response'
| 	from /app/lib/portus/registry_client.rb:83:in `catalog'
| 	from /app/lib/portus/background/sync.rb:25:in `block in execute!'
| 	from /app/lib/portus/background/sync.rb:25:in `block in execute!'
| 	from /usr/local/bundle/gems/activerecord-4.2.10/lib/active_record/relation/batches.rb:51:in `block (2 levels) in find_each'
| 	from /usr/local/bundle/gems/activerecord-4.2.10/lib/active_record/relation/batches.rb:51:in `each'
| 	from /usr/local/bundle/gems/activerecord-4.2.10/lib/active_record/relation/batches.rb:51:in `block (2 levels) in find_each'
| 	from /usr/local/bundle/gems/activerecord-4.2.10/lib/active_record/relation/batches.rb:51:in `block in find_each'
| 	from /usr/local/bundle/gems/activerecord-4.2.10/lib/active_record/relation/batches.rb:124:in `find_in_batches'
| 	from /usr/local/bundle/gems/activerecord-4.2.10/lib/active_record/relation/batches.rb:51:in `each'
| 	from /usr/local/bundle/gems/activerecord-4.2.10/lib/active_record/relation/batches.rb:51:in `block in find_each'
| 	from /usr/local/bundle/gems/activerecord-4.2.10/lib/active_record/relation/batches.rb:50:in `find_each'
| 	from /usr/local/bundle/gems/activerecord-4.2.10/lib/active_record/querying.rb:9:in `find_each'
| 	from /usr/local/bundle/gems/activerecord-4.2.10/lib/active_record/relation/batches.rb:124:in `find_in_batches'
| 	from /usr/local/bundle/gems/activerecord-4.2.10/lib/active_record/relation/batches.rb:50:in `find_each'
| 	from /app/lib/portus/background/sync.rb:24:in `execute!'
| 	from /app/bin/background.rb:55:in `block (2 levels) in <top (required)>'
| 	from /usr/local/bundle/gems/activerecord-4.2.10/lib/active_record/querying.rb:9:in `find_each'
| 	from /app/lib/portus/background/sync.rb:24:in `execute!'
| 	from /app/bin/background.rb:53:in `each'
| 	from /app/bin/background.rb:53:in `block in <top (required)>'
| 	from /app/bin/background.rb:55:in `block (2 levels) in <top (required)>'
| 	from /app/bin/background.rb:52:in `loop'
| 	from /app/bin/background.rb:53:in `each'
| 	from /app/bin/background.rb:52:in `<top (required)>'
| 	from /app/bin/background.rb:53:in `block in <top (required)>'
| 	from /usr/local/bundle/gems/railties-4.2.10/lib/rails/commands/runner.rb:60:in `load'
| 	from /usr/local/bundle/gems/railties-4.2.10/lib/rails/commands/runner.rb:60:in `<top (required)>'
| 	from /app/bin/background.rb:52:in `loop'
| 	from /app/bin/background.rb:52:in `<top (required)>'
| 	from /usr/local/bundle/gems/railties-4.2.10/lib/rails/commands/commands_tasks.rb:123:in `require'
| 	from /usr/local/bundle/gems/railties-4.2.10/lib/rails/commands/commands_tasks.rb:123:in `require_command!'
| 	from /usr/local/bundle/gems/railties-4.2.10/lib/rails/commands/runner.rb:60:in `load'
| 	from /usr/local/bundle/gems/railties-4.2.10/lib/rails/commands/commands_tasks.rb:90:in `runner'
| 	from /usr/local/bundle/gems/railties-4.2.10/lib/rails/commands/commands_tasks.rb:39:in `run_command!'
| 	from /usr/local/bundle/gems/railties-4.2.10/lib/rails/commands/runner.rb:60:in `<top (required)>'
| 	from /usr/local/bundle/gems/railties-4.2.10/lib/rails/commands.rb:17:in `<top (required)>'
| 	from bin/rails:12:in `require'
| 	from /usr/local/bundle/gems/railties-4.2.10/lib/rails/commands/commands_tasks.rb:123:in `require'
| 	from /usr/local/bundle/gems/railties-4.2.10/lib/rails/commands/commands_tasks.rb:123:in `require_command!'
| 	from bin/rails:12:in `<main>'
| 	from /usr/local/bundle/gems/railties-4.2.10/lib/rails/commands/commands_tasks.rb:90:in `runner'
| 	from /usr/local/bundle/gems/railties-4.2.10/lib/rails/commands/commands_tasks.rb:39:in `run_command!'
| 	from /usr/local/bundle/gems/railties-4.2.10/lib/rails/commands.rb:17:in `<top (required)>'
| 	from bin/rails:12:in `require'
| 	from bin/rails:12:in `<main>'

| [Initialization] Running: 'Registry events', 'Security scanning', 'Registry synchronization'
| /app/lib/portus/registry_client.rb:137:in `get_page': Something went wrong while fetching the catalog Response: [500] - 500 Internal Server Error (Portus::RegistryClient::RegistryError)
| If you are the administrator of this website, then please read this web application's log file and/or the web server's log file to find out what went wrong.
| [catalog] Could not fetch manifest for 'vadim/helpy' with tag 'latest': Something went wrong while fetching manifest for vadim/helpy:latest:[500] - 500 Internal Server Error
| If you are the administrator of this website, then please read this web application's log file and/or the web server's log file to find out what went wrong.
| 	from /app/lib/portus/registry_client.rb:118:in `paged_response'
| 	from /app/lib/portus/registry_client.rb:83:in `catalog'
| [catalog] Removed the tag 'latest'.
| [catalog] Removed the image 'alpine'.
| 	from /app/lib/portus/background/sync.rb:25:in `block in execute!'
| 	from /usr/local/bundle/gems/activerecord-4.2.10/lib/active_record/relation/batches.rb:51:in `block (2 levels) in find_each'
| 	from /usr/local/bundle/gems/activerecord-4.2.10/lib/active_record/relation/batches.rb:51:in `each'
| 	from /usr/local/bundle/gems/activerecord-4.2.10/lib/active_record/relation/batches.rb:51:in `block in find_each'
| 	from /usr/local/bundle/gems/activerecord-4.2.10/lib/active_record/relation/batches.rb:124:in `find_in_batches'
| 	from /usr/local/bundle/gems/activerecord-4.2.10/lib/active_record/relation/batches.rb:50:in `find_each'
| 	from /usr/local/bundle/gems/activerecord-4.2.10/lib/active_record/querying.rb:9:in `find_each'
| 	from /app/lib/portus/background/sync.rb:24:in `execute!'
| 	from /app/bin/background.rb:55:in `block (2 levels) in <top (required)>'
| 	from /app/bin/background.rb:53:in `each'
| 	from /app/bin/background.rb:53:in `block in <top (required)>'
| 	from /app/bin/background.rb:52:in `loop'
| 	from /app/bin/background.rb:52:in `<top (required)>'
| 	from /usr/local/bundle/gems/railties-4.2.10/lib/rails/commands/runner.rb:60:in `load'
| 	from /usr/local/bundle/gems/railties-4.2.10/lib/rails/commands/runner.rb:60:in `<top (required)>'
| 	from /usr/local/bundle/gems/railties-4.2.10/lib/rails/commands/commands_tasks.rb:123:in `require'
| 	from /usr/local/bundle/gems/railties-4.2.10/lib/rails/commands/commands_tasks.rb:123:in `require_command!'
| 	from /usr/local/bundle/gems/railties-4.2.10/lib/rails/commands/commands_tasks.rb:90:in `runner'
| 	from /usr/local/bundle/gems/railties-4.2.10/lib/rails/commands/commands_tasks.rb:39:in `run_command!'
| 	from /usr/local/bundle/gems/railties-4.2.10/lib/rails/commands.rb:17:in `<top (required)>'
| 	from bin/rails:12:in `require'
| 	from bin/rails:12:in `<main>'
@mssola
Copy link
Collaborator

mssola commented Feb 5, 2018

I think this is similar to some of the remaining issues from the old crono process. See issue #1599 ("Crono is adding & removing images" section).

@Vad1mo
Copy link
Contributor Author

Vad1mo commented Feb 7, 2018

I did some more testing and it seems like background is behaving very odd.
The daemon not only pushes/pulls images but also does it twice from time to time, so there is also a concurrency issue.

We need to make the background process more robust, so that on errors it won't remove the whole DB and one minute later add it all again.
If this happens, all the user info and timestamps and actually the audit trail ist lost.

Example:

vadim pushed vadim / postgres:10-alpine
less than a minute

portus pushed vadim / postgres:10-alpine
less than a minute
[Mailer config] Host:     test.portus.local
[Mailer config] Protocol: https://
[Mailer config] Protocol: https://
[Initialization] Running: 'Registry events', 'Security scanning', 'Registry synchronization'
Net::OpenTimeout: connection timed out.
[catalog] Could not fetch manifest for 'vadim/registry' with tag 'latest': Net::OpenTimeout: connection timed out.
Handling 'push' event:
{
  "id": "192e7dbb-8f46-422c-b792-12fcef036c52",
  "timestamp": "2018-02-07T11:59:35.197867853+01:00",
  "action": "push",
  "target": {
    "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
    "size": 1364,
    "digest": "sha256:feb40d14cd33e646b9985e2d6754ed66616fedb840226c4d917ef53d616dcd6c",
    "length": 1364,
    "repository": "vadim/registry",
    "url": "https://test.portus.local/v2/vadim/registry/manifests/sha256:feb40d14cd33e646b9985e2d6754ed66616fedb840226c4d917ef53d616dcd6c",
    "tag": "latest"
  },
  "request": {
    "id": "345f055c-47aa-464d-88f5-f5704bcb70bf",
    "addr": "10.255.0.5",
    "host": "test.portus.local",
    "method": "PUT",
    "useragent": "docker/17.12.0-ce go/go1.9.2 git-commit/c97c6d6 kernel/4.9.60-linuxkit-aufs os/linux arch/amd64 UpstreamClient(Docker-Client/17.12.0-ce \\(darwin\\))"
  },
�
": {
    "name": "vadim"
  },
  "source": {
    "addr": "209e48968d6e:5000",
    "instanceID": "6942a7df-1238-4673-9a0c-cbef2aa581a8"
  }
}
[catalog] Created the tag '10-alpine'.
Handling 'push' event:
{
  "id": "91258c1a-cd8d-4b34-91ef-a3c05af02b58",
  "timestamp": "2018-02-07T12:01:23.943794054+01:00",
  "action": "push",
  "target": {
    "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
    "size": 2191,
    "digest": "sha256:71a5abf3062737bca8251514c853469f4456ec1a142723972f50ba3220b1b0b0",
    "length": 2191,
    "repository": "vadim/postgres",
    "url": "https://test.portus.local/v2/vadim/postgres/manifests/sha256:71a5abf3062737bca8251514c853469f4456ec1a142723972f50ba3220b1b0b0",
    "tag": "10-alpine"
  },
  "request": {
    "id": "8d99836d-0284-4419-a1b2-c16cd50353d0",
    "addr": "10.255.0.5",
    "host": "test.portus.local",
    "method": "PUT",
    "useragent": "docker/17.12.0-ce go/go1.9.2 git-commit/c97c6d6 kernel/4.9.60-linuxkit-aufs os/linux arch/amd64 UpstreamClient(Docker-Client/17.12.0-ce \\(darwin\\))"
  },
�
": {
    "name": "vadim"
  },
  "source": {
    "addr": "209e48968d6e:5000",
    "instanceID": "6942a7df-1238-4673-9a0c-cbef2aa581a8"
  }
}

@Vad1mo
Copy link
Contributor Author

Vad1mo commented Feb 7, 2018

I think I found the problem regarding the concurrency. When the image is pushed and the event is sent it happens that in the loop the sync process runs before the event processing task. In such cases the image gets imported by portus before the event is processes.

A solution would be that the Sync task should only run if the event queue is empty. If not is should skip to next loop.

@mssola
Copy link
Collaborator

mssola commented Feb 9, 2018

A solution would be that the Sync task should only run if the event queue is empty. If not is should skip to next loop.

I just fixed this locally, but I'm also working on a fix for #1664. Then I'll submit a PR which should hopefully close all issues regarding synchronization...

@mssola mssola self-assigned this Feb 9, 2018
mssola added a commit to mssola/Portus that referenced this issue Feb 9, 2018
We have historically struggled with the synchronization of the Registry.
This was developed at first because the following situation could
happen:

1. User pushes image:tag.
2. Portus authorizes the transaction.
3. Registry sends the event to Portus with all the data, but Portus is
   unavailable.
4. Portus is up again but it missed the registry event.

This is not the case anymore, because all the things that could make
Portus unavailable have been either fixed or moved into the background
process. Thus, this problem can be ignored in most cases. Moreover, this
feature is quite dangerous because a bug on this code can make Portus
wipe all the repositories, and historically "funny" issues have
happened.

This commit builds upon SUSE#1631 where background tasks can be disabled,
and it disables the `sync` task by default. Moreover, it also adds a
`sync-strategy` option which is available when users enable this
feature. This option has four possible values:

- `update-delete`: the same behavior we've had up until now.
- `update`: similar to `update-delete` but this one will not delete
  repositories which no longer exist on the registry. This is useful for
  users which don't trust the risky `update-delete` strategy.
- `on-start`: execute `update-delete` only once (on start).
- `initial`: execute `update-delete` only once on start and
  only if the registry is empty. This is the default strategy since it
  might be convenient for new users that have already a running
  registry.

This commit does not fix all the issues mentioned in SUSE#1631, but it
hopefully does the trick for most users.

Fixes SUSE#1650
Fixes SUSE#1664
See SUSE#1599
Depends on SUSE#1631

Signed-off-by: Miquel Sabaté Solà <msabate@suse.com>
@mssola
Copy link
Collaborator

mssola commented Feb 9, 2018

See #1675.

mssola added a commit to mssola/Portus that referenced this issue Feb 12, 2018
We have historically struggled with the synchronization of the Registry.
This was developed at first because the following situation could
happen:

1. User pushes image:tag.
2. Portus authorizes the transaction.
3. Registry sends the event to Portus with all the data, but Portus is
   unavailable.
4. Portus is up again but it missed the registry event.

This is not the case anymore, because all the things that could make
Portus unavailable have been either fixed or moved into the background
process. Thus, this problem can be ignored in most cases. Moreover, this
feature is quite dangerous because a bug on this code can make Portus
wipe all the repositories, and historically "funny" issues have
happened.

This commit builds upon SUSE#1631 where background tasks can be disabled,
and it disables the `sync` task by default. Moreover, it also adds a
`sync-strategy` option which is available when users enable this
feature. This option has four possible values:

- `update-delete`: the same behavior we've had up until now.
- `update`: similar to `update-delete` but this one will not delete
  repositories which no longer exist on the registry. This is useful for
  users which don't trust the risky `update-delete` strategy.
- `on-start`: execute `update-delete` only once (on start).
- `initial`: execute `update-delete` only once on start and
  only if the registry is empty. This is the default strategy since it
  might be convenient for new users that have already a running
  registry.

This commit does not fix all the issues mentioned in SUSE#1631, but it
hopefully does the trick for most users.

Fixes SUSE#1650
Fixes SUSE#1664
See SUSE#1599
Depends on SUSE#1631

Signed-off-by: Miquel Sabaté Solà <msabate@suse.com>
mssola added a commit to mssola/Portus that referenced this issue Feb 12, 2018
We have historically struggled with the synchronization of the Registry.
This was developed at first because the following situation could
happen:

1. User pushes image:tag.
2. Portus authorizes the transaction.
3. Registry sends the event to Portus with all the data, but Portus is
   unavailable.
4. Portus is up again but it missed the registry event.

This is not the case anymore, because all the things that could make
Portus unavailable have been either fixed or moved into the background
process. Thus, this problem can be ignored in most cases. Moreover, this
feature is quite dangerous because a bug on this code can make Portus
wipe all the repositories, and historically "funny" issues have
happened.

This commit builds upon SUSE#1631 where background tasks can be disabled,
and it disables the `sync` task by default. Moreover, it also adds a
`sync-strategy` option which is available when users enable this
feature. This option has four possible values:

- `update-delete`: the same behavior we've had up until now.
- `update`: similar to `update-delete` but this one will not delete
  repositories which no longer exist on the registry. This is useful for
  users which don't trust the risky `update-delete` strategy.
- `on-start`: execute `update-delete` only once (on start).
- `initial`: execute `update-delete` only once on start and
  only if the registry is empty. This is the default strategy since it
  might be convenient for new users that have already a running
  registry.

This commit does not fix all the issues mentioned in SUSE#1631, but it
hopefully does the trick for most users.

Fixes SUSE#1650
Fixes SUSE#1664
See SUSE#1599
Depends on SUSE#1631

Signed-off-by: Miquel Sabaté Solà <msabate@suse.com>
mssola added a commit to mssola/Portus that referenced this issue Feb 12, 2018
We have historically struggled with the synchronization of the Registry.
This was developed at first because the following situation could
happen:

1. User pushes image:tag.
2. Portus authorizes the transaction.
3. Registry sends the event to Portus with all the data, but Portus is
   unavailable.
4. Portus is up again but it missed the registry event.

This is not the case anymore, because all the things that could make
Portus unavailable have been either fixed or moved into the background
process. Thus, this problem can be ignored in most cases. Moreover, this
feature is quite dangerous because a bug on this code can make Portus
wipe all the repositories, and historically "funny" issues have
happened.

This commit builds upon SUSE#1631 where background tasks can be disabled.
Moreover, it also adds a `sync-strategy` option which is available when
users enable this feature. This option has four possible values:

- `update-delete`: the same behavior we've had up until now.
- `update`: similar to `update-delete` but this one will not delete
  repositories which no longer exist on the registry. This is useful for
  users which don't trust the risky `update-delete` strategy.
- `on-start`: execute `update-delete` only once (on start).
- `initial`: execute `update-delete` only once on start and
  only if the registry is empty. This is the default strategy since it
  might be convenient for new users that have already a running
  registry.

This commit does not fix all the issues mentioned in SUSE#1631, but it
hopefully does the trick for most users.

Fixes SUSE#1650
Fixes SUSE#1664
See SUSE#1599
Depends on SUSE#1631

Signed-off-by: Miquel Sabaté Solà <msabate@suse.com>
mssola added a commit to mssola/Portus that referenced this issue Feb 12, 2018
We have historically struggled with the synchronization of the Registry.
This was developed at first because the following situation could
happen:

1. User pushes image:tag.
2. Portus authorizes the transaction.
3. Registry sends the event to Portus with all the data, but Portus is
   unavailable.
4. Portus is up again but it missed the registry event.

This is not the case anymore, because all the things that could make
Portus unavailable have been either fixed or moved into the background
process. Thus, this problem can be ignored in most cases. Moreover, this
feature is quite dangerous because a bug on this code can make Portus
wipe all the repositories, and historically "funny" issues have
happened.

This commit builds upon SUSE#1631 where background tasks can be disabled.
Moreover, it also adds a `sync-strategy` option which is available when
users enable this feature. This option has four possible values:

- `update-delete`: the same behavior we've had up until now.
- `update`: similar to `update-delete` but this one will not delete
  repositories which no longer exist on the registry. This is useful for
  users which don't trust the risky `update-delete` strategy.
- `on-start`: execute `update-delete` only once (on start).
- `initial`: execute `update-delete` only once on start and
  only if the registry is empty. This is the default strategy since it
  might be convenient for new users that have already a running
  registry.

This commit does not fix all the issues mentioned in SUSE#1631, but it
hopefully does the trick for most users.

Fixes SUSE#1650
Fixes SUSE#1664
See SUSE#1599
Depends on SUSE#1631

Signed-off-by: Miquel Sabaté Solà <msabate@suse.com>
mssola added a commit to mssola/Portus that referenced this issue Feb 12, 2018
We have historically struggled with the synchronization of the Registry.
This was developed at first because the following situation could
happen:

1. User pushes image:tag.
2. Portus authorizes the transaction.
3. Registry sends the event to Portus with all the data, but Portus is
   unavailable.
4. Portus is up again but it missed the registry event.

This is not the case anymore, because all the things that could make
Portus unavailable have been either fixed or moved into the background
process. Thus, this problem can be ignored in most cases. Moreover, this
feature is quite dangerous because a bug on this code can make Portus
wipe all the repositories, and historically "funny" issues have
happened.

This commit builds upon SUSE#1631 where background tasks can be disabled.
Moreover, it also adds a `sync-strategy` option which is available when
users enable this feature. This option has four possible values:

- `update-delete`: the same behavior we've had up until now.
- `update`: similar to `update-delete` but this one will not delete
  repositories which no longer exist on the registry. This is useful for
  users which don't trust the risky `update-delete` strategy.
- `on-start`: execute `update-delete` only once (on start).
- `initial`: execute `update-delete` only once on start and
  only if the registry is empty. This is the default strategy since it
  might be convenient for new users that have already a running
  registry.

This commit does not fix all the issues mentioned in SUSE#1631, but it
hopefully does the trick for most users.

Fixes SUSE#1650
Fixes SUSE#1664
See SUSE#1599
Depends on SUSE#1631

Signed-off-by: Miquel Sabaté Solà <msabate@suse.com>
@mssola mssola closed this as completed in 7e60b71 Feb 14, 2018
mssola added a commit that referenced this issue Feb 14, 2018
We have historically struggled with the synchronization of the Registry.
This was developed at first because the following situation could
happen:

1. User pushes image:tag.
2. Portus authorizes the transaction.
3. Registry sends the event to Portus with all the data, but Portus is
   unavailable.
4. Portus is up again but it missed the registry event.

This is not the case anymore, because all the things that could make
Portus unavailable have been either fixed or moved into the background
process. Thus, this problem can be ignored in most cases. Moreover, this
feature is quite dangerous because a bug on this code can make Portus
wipe all the repositories, and historically "funny" issues have
happened.

This commit builds upon #1631 where background tasks can be disabled.
Moreover, it also adds a `sync-strategy` option which is available when
users enable this feature. This option has four possible values:

- `update-delete`: the same behavior we've had up until now.
- `update`: similar to `update-delete` but this one will not delete
  repositories which no longer exist on the registry. This is useful for
  users which don't trust the risky `update-delete` strategy.
- `on-start`: execute `update-delete` only once (on start).
- `initial`: execute `update-delete` only once on start and
  only if the registry is empty. This is the default strategy since it
  might be convenient for new users that have already a running
  registry.

This commit does not fix all the issues mentioned in #1599, but it
hopefully does the trick for most users.

Fixes #1650
Fixes #1664
See #1599
Depends on #1631

Signed-off-by: Miquel Sabaté Solà <msabate@suse.com>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

2 participants