test: flaky box/role.test.lua test #211

Open
avtikhon opened this issue May 18, 2020 · 2 comments
@avtikhon
Contributor

Tarantool version:
Tarantool 2.5.0-32-g7f20272ea
Target: Darwin-x86_64-RelWithDebInfo
Build options: cmake . -DCMAKE_INSTALL_PREFIX=/usr/local -DENABLE_BACKTRACE=ON
Compiler: /Library/Developer/CommandLineTools/usr/bin/cc /Library/Developer/CommandLineTools/usr/bin/c++
C_FLAGS: -Wno-unknown-pragmas -fexceptions -funwind-tables -fno-omit-frame-pointer -fno-stack-protector -fno-common -msse2 -std=c11 -Wall -Wextra -Wno-strict-aliasing -Wno-char-subscripts -Werror
CXX_FLAGS: -Wno-unknown-pragmas -fexceptions -funwind-tables -fno-omit-frame-pointer -fno-stack-protector -fno-common -msse2 -std=c++11 -Wall -Wextra -Wno-strict-aliasing -Wno-char-subscripts -Werror

OS version:
OSX

Bug description:
https://gitlab.com/tarantool/tarantool/-/jobs/556433107

 [019] --- box/role.result	Mon Mar 23 18:53:42 2020
 [019] +++ box/role.reject	Mon May 18 07:44:12 2020
 [019] @@ -197,9 +197,11 @@
 [019]  -- check a grant received via a role
 [019]  box.schema.user.create('test')
 [019]  ---
 [019] +- error: User '32' already has role 'public'
 [019]  ...
 [019]  box.schema.user.create('grantee')
 [019]  ---
 [019] +- error: User '33' already has role 'public'
 [019]  ...
 [019]  box.schema.role.create('liaison')
 [019]  ---

Steps to reproduce:

Optional (but very desirable):

  • coredump
  • backtrace
  • netstat
@avtikhon avtikhon self-assigned this May 18, 2020
avtikhon referenced this issue in tarantool/tarantool May 18, 2020
Marked flaky tests as fragile, removing them from parallel runs,
to avoid flaky failures in regular testing:

  app/fiber.test.lua                        ; gh-4987
  app/fiber_channel.test.lua                ; gh-4961
  app/socket.test.lua                       ; gh-4978
  app-tap/popen.test.lua                    ; gh-4995
  box/alter_limits.test.lua                 ; gh-4926
  box/misc.test.lua                         ; gh-4982
  box/role.test.lua                         ; gh-4998
  box/rtree_rect.test.lua                   ; gh-4994
  box/sequence.test.lua                     ; gh-4996
  box/tuple.test.lua                        ; gh-4988
  engine/ddl.test.lua                       ; gh-4353
  replication/box_set_replication_stress    ; gh-4992
  replication/recover_missing_xlog.test.lua ; gh-4989
  replication/replica_rejoin.test.lua       ; gh-4985
  replication/wal_rw_stress.test.lua        ; gh-4977
  replication-py/conflict.test.py           ; gh-4980
  vinyl/errinj_ddl.test.lua                 ; gh-4993
  vinyl/misc.test.lua                       ; gh-4979
  vinyl/snapshot.test.lua                   ; gh-4984
  vinyl/write_iterator.test.lua             ; gh-4572
  xlog/panic_on_broken_lsn.test.lua         ; gh-4991

Part of #4953
avtikhon referenced this issue in tarantool/tarantool May 18, 2020
Added skip condition on OSX for test:
  replication/box_set_replication_stress.test.lua

Part of #4953

(cherry picked from commit 72a2bae)
avtikhon referenced this issue in tarantool/tarantool May 19, 2020
Added skip condition on OSX for test:
  replication/box_set_replication_stress.test.lua

Marked flaky tests as fragile, removing them from parallel runs,
to avoid flaky failures in regular testing:

  app/fiber.test.lua                        ; gh-4987
  app/fiber_channel.test.lua                ; gh-4961
  app/socket.test.lua                       ; gh-4978
  box/alter_limits.test.lua                 ; gh-4926
  box/misc.test.lua                         ; gh-4982
  box/role.test.lua                         ; gh-4998
  box/rtree_rect.test.lua                   ; gh-4994
  box/sequence.test.lua                     ; gh-4996
  box/tuple.test.lua                        ; gh-4988
  engine/ddl.test.lua                       ; gh-4353
  replication/box_set_replication_stress    ; gh-4992
  replication/recover_missing_xlog.test.lua ; gh-4989
  replication/replica_rejoin.test.lua       ; gh-4985
  replication/wal_rw_stress.test.lua        ; gh-4977
  replication-py/conflict.test.py           ; gh-4980
  vinyl/errinj_ddl.test.lua                 ; gh-4993
  vinyl/misc.test.lua                       ; gh-4979
  vinyl/snapshot.test.lua                   ; gh-4984
  vinyl/write_iterator.test.lua             ; gh-4572
  xlog/panic_on_broken_lsn.test.lua         ; gh-4991

Part of #4953

(cherry picked from commit 72a2bae)
@dmitryikh

Hi, I've just found a dependency among the flaky tests in the box/* suite.

All the flaky tests failed after box/net_msg_max.test.lua had run.

The logs show the following just before the failed tests:

2020-05-19 23:08:25.778 [64213] main/119/console/unix/: I> set 'net_msg_max' configuration option to 768
2020-05-19 23:08:25.846 [64213] main/119/console/unix/: I> set 'net_msg_max' configuration option to 1536
2020-05-19 23:08:25.886 [64213] main/119/console/unix/: I> set 'net_msg_max' configuration option to 10
2020-05-19 23:08:25.899 [64213] main/119/console/unix/: I> set 'net_msg_max' configuration option to 2
2020-05-19 23:08:25.927 [64213] main/119/console/unix/: I> set 'readahead' configuration option to 16320
2020-05-19 23:08:25.927 [64213] main/119/console/unix/: I> set 'net_msg_max' configuration option to 768
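
To check whether another failed run was also preceded by net_msg_max changes, the log can be scanned for these messages. This is only a rough sketch: the function name `option_changes` and the regular expression are illustrative and assume the log line format shown above.

```python
import re

# Matches the tail of lines such as:
#   ... I> set 'net_msg_max' configuration option to 768
OPTION_RE = re.compile(r"set '(?P<name>\w+)' configuration option to (?P<value>\d+)")


def option_changes(log_lines, option="net_msg_max"):
    """Return the values the given option was set to, in log order."""
    values = []
    for line in log_lines:
        match = OPTION_RE.search(line)
        if match and match.group("name") == option:
            values.append(int(match.group("value")))
    return values


# The six log lines quoted above.
sample = [
    "2020-05-19 23:08:25.778 [64213] main/119/console/unix/: I> set 'net_msg_max' configuration option to 768",
    "2020-05-19 23:08:25.846 [64213] main/119/console/unix/: I> set 'net_msg_max' configuration option to 1536",
    "2020-05-19 23:08:25.886 [64213] main/119/console/unix/: I> set 'net_msg_max' configuration option to 10",
    "2020-05-19 23:08:25.899 [64213] main/119/console/unix/: I> set 'net_msg_max' configuration option to 2",
    "2020-05-19 23:08:25.927 [64213] main/119/console/unix/: I> set 'readahead' configuration option to 16320",
    "2020-05-19 23:08:25.927 [64213] main/119/console/unix/: I> set 'net_msg_max' configuration option to 768",
]

print(option_changes(sample))  # [768, 1536, 10, 2, 768]
```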

I managed to reproduce these failures on every run:

tarantool/tarantool-qa#211
./test-run.py -j -1 box/net_msg_max.test.lua box/role.test.lua box/net_msg_max.test.lua box/role.test.lua box/net_msg_max.test.lua box/role.test.lua box/net_msg_max.test.lua box/role.test.lua box/net_msg_max.test.lua box/role.test.lua box/net_msg_max.test.lua box/role.test.lua box/net_msg_max.test.lua box/role.test.lua box/net_msg_max.test.lua box/role.test.lua

tarantool/tarantool#4997 
./test-run.py -j -1 box/net_msg_max.test.lua box/on_replace.test.lua box/net_msg_max.test.lua box/on_replace.test.lua box/net_msg_max.test.lua box/on_replace.test.lua box/net_msg_max.test.lua box/on_replace.test.lua box/net_msg_max.test.lua box/on_replace.test.lua box/net_msg_max.test.lua box/on_replace.test.lua box/net_msg_max.test.lua box/on_replace.test.lua box/net_msg_max.test.lua box/on_replace.test.lua

The same approach works for #213, #214, #217 and #219.
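
For trying the same approach on other tests, the alternating test list can be generated rather than typed by hand. A minimal sketch: the helper name `build_repro_command` is made up for illustration, and the default of 8 repetitions simply mirrors the commands above.

```python
import shlex


def build_repro_command(trigger, victim, repeats=8, runner="./test-run.py"):
    """Build a test-run command that alternates `trigger` and `victim`."""
    tests = [trigger, victim] * repeats
    # "-j -1" is copied from the commands above.
    return [runner, "-j", "-1", *tests]


cmd = build_repro_command("box/net_msg_max.test.lua", "box/role.test.lua")
# Print the command as a shell-quoted line that can be pasted into a terminal.
print(shlex.join(cmd))
```

Passing box/on_replace.test.lua as the second argument gives the second command above.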

Unfortunately, I don't understand the root cause of this behaviour, but I think there is a bug in toggling the net_msg_max option.

I hope this helps.

kyukhin referenced this issue in tarantool/tarantool May 20, 2020
Part of #4953

(cherry picked from commit 430c0e8)
kyukhin referenced this issue in tarantool/tarantool May 20, 2020
Added skip condition on OSX for test:
  replication/box_set_replication_stress.test.lua

Part of #4953

(cherry picked from commit 430c0e8)
@Totktonada
Member

NB: test-run has a --reproduce option, which allows running tests in a specified order. test-run writes reproduce files into test/var/reproduce.
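
A rough sketch of how the saved reproduce files could be turned into commands. It assumes that --reproduce takes the path to one reproduce file and that every file in the directory is such a file; the helper name `reproduce_commands` is hypothetical, so check `./test-run.py --help` before relying on it.

```python
from pathlib import Path


def reproduce_commands(reproduce_dir="test/var/reproduce", runner="./test-run.py"):
    """Return one `--reproduce` command per file found in reproduce_dir."""
    directory = Path(reproduce_dir)
    if not directory.is_dir():
        # Nothing has been written yet (or the path is wrong).
        return []
    return [
        # Assumption: --reproduce is followed by a single file path.
        [runner, "--reproduce", str(path)]
        for path in sorted(directory.iterdir())
        if path.is_file()
    ]


for cmd in reproduce_commands():
    print(" ".join(cmd))
```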

It looks related to tarantool/test-run#156. See also commit e33216af9889a2387f331fd30c157d60292bdc5b.

@ylobankov ylobankov transferred this issue from tarantool/tarantool Apr 15, 2022