test: flaky often replication-py/init_storage.test.py #228

avtikhon opened this issue Apr 29, 2020 · 2 comments
Labels: bug, flaky test, teamS (Scaling)

avtikhon commented Apr 29, 2020

Tarantool version:
All

OS version:
All

Bug description:
issue:

[003] replication-py/init_storage.test.py                             
[003] Worker "003_replication-py" received the following error; stopping...
[003] Traceback (most recent call last):
[003]   File "/Users/tntmac02.tarantool.i/tnt/test-run/lib/worker.py", line 305, in run_task
[003]     task, self.server, self.inspector)
[003]   File "/Users/tntmac02.tarantool.i/tnt/test-run/lib/test_suite.py", line 239, in run_test
[003]     short_status = test.run(server)
[003]   File "/Users/tntmac02.tarantool.i/tnt/test-run/lib/test.py", line 178, in run
[003]     self.execute(server)
[003]   File "/Users/tntmac02.tarantool.i/tnt/test-run/lib/tarantool_server.py", line 384, in execute
[003]     **server.__dict__))
[003]   File "replication-py/init_storage.test.py", line 67, in <module>
[003]     replica.wait_until_started()
[003]   File "/Users/tntmac02.tarantool.i/tnt/test-run/lib/tarantool_server.py", line 1023, in wait_until_started
[003]     self.logfile_pos.seek_wait(msg, p, self.name)
[003]   File "/Users/tntmac02.tarantool.i/tnt/test-run/lib/tarantool_server.py", line 439, in seek_wait
[003]     raise TarantoolStartError(name)
[003] TarantoolStartError
[003] 
[003] Exception: 
[003] 
[003] 
[003] [Instance "replica" returns with non-zero exit code: 1]
[003] 
[003] Last 15 lines of Tarantool Log file [Instance "replica"][/Users/tntmac02.tarantool.i/tnt/test/var/003_replication-py/replica.log]:
[003] 2020-04-29 08:14:54.021 [31661] main/103/replica C> Tarantool 2.5.0-2-g098324556
[003] 2020-04-29 08:14:54.022 [31661] main/103/replica C> log level 5
[003] 2020-04-29 08:14:54.023 [31661] main/103/replica I> mapping 117440512 bytes for memtx tuple arena...
[003] 2020-04-29 08:14:54.023 [31661] main/103/replica I> mapping 134217728 bytes for vinyl tuple arena...
[003] 2020-04-29 08:14:54.025 [31661] main/103/replica I> instance uuid 67dc35a8-a860-4f96-8043-212d23627e4b
[003] 2020-04-29 08:14:54.029 [31661] iproto/101/main I> binary: bound to [::1]:5304
[003] 2020-04-29 08:14:54.029 [31661] main/103/replica I> connecting to 1 replicas
[003] 2020-04-29 08:14:54.287 [31661] main/112/applier/localhost:5555 I> remote master d680f59e-22be-4c12-b9bb-4db6237d2c22 at [::1]:5555 running Tarantool 2.5.0
[003] 2020-04-29 08:14:54.302 [31661] main/103/replica I> connected to 1 replicas
[003] 2020-04-29 08:14:54.302 [31661] main/103/replica I> bootstrapping replica from d680f59e-22be-4c12-b9bb-4db6237d2c22 at [::1]:5555
[003] 2020-04-29 08:14:54.303 [31661] main/112/applier/localhost:5555 I> can't read row
[003] 2020-04-29 08:14:54.303 [31661] main/112/applier/localhost:5555 box.cc:147 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
[003] 2020-04-29 08:14:54.303 [31661] main/103/replica box.cc:147 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
[003] 2020-04-29 08:14:54.303 [31661] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.
[003] 2020-04-29 08:14:54.303 [31661] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.

Rare issue:

--- replication-py/init_storage.result  Sun Apr 26 20:45:32 2020
+++ var/007_replication-py/init_storage.result  Wed Apr 29 11:30:34 2020
@@ -131,4 +131,3 @@
 waiting reconnect on JOIN...
 ok
 waiting reconnect on SUBSCRIBE...
-ok

While reproducing, connection issues were found, such as:

Starting instance replica...
started
Run console at unix/:/home/vagrant/tnt/test/var/001_replication-py/replica.control
tcp_server: remove dead UNIX socket: /home/vagrant/tnt/test/var/001_replication-py/replica.control
started
2020-11-08 08:31:19.926 [30534] main/103/replica C> Tarantool 2.7.0-20-g99d6c8a40
2020-11-08 08:31:19.926 [30534] main/103/replica C> log level 5
2020-11-08 08:31:19.929 [30534] main/103/replica I> mapping 117440512 bytes for memtx tuple arena...
2020-11-08 08:31:19.930 [30534] main/103/replica I> mapping 134217728 bytes for vinyl tuple arena...
2020-11-08 08:31:19.978 [30534] main/103/replica I> instance uuid c6dade09-219c-11eb-ac14-080027727614
2020-11-08 08:31:19.978 [30534] iproto/101/main I> binary: bound to 127.0.0.1:46448
2020-11-08 08:31:19.978 [30534] main/103/replica I> connecting to 1 replicas
2020-11-08 08:31:19.980 [30534] main/112/applier/localhost:49168 I> can't connect to master
2020-11-08 08:31:19.980 [30534] main/112/applier/localhost:49168 sio.c:208 !> SystemError connect to 127.0.0.1:49168, called on fd 27, aka 127.0.0.1:47954: Connection refused
2020-11-08 08:31:19.980 [30534] main/112/applier/localhost:49168 I> will retry every 0.10 second
2020-11-08 08:31:20.093 [30534] main/112/applier/localhost:49168 I> remote master c5d480c3-219c-11eb-ac14-080027727614 at 127.0.0.1:49168 running Tarantool 2.7.0
2020-11-08 08:31:20.093 [30534] main/103/replica I> connected to 1 replicas
2020-11-08 08:31:20.093 [30534] main/103/replica I> bootstrapping replica from c5d480c3-219c-11eb-ac14-080027727614 at 127.0.0.1:49168
2020-11-08 08:31:20.093 [30534] main/112/applier/localhost:49168 I> can't read row
2020-11-08 08:31:20.093 [30534] main/112/applier/localhost:49168 box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
2020-11-08 08:31:20.093 [30534] main/103/replica box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
2020-11-08 08:31:20.093 [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.
2020-11-08 08:31:20.093 [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.
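
For triage, the warning, error and fatal lines can be pulled out of such a log automatically. The following is a minimal sketch, assuming the line layout seen in the excerpts above (timestamp, pid, fiber name, optional `file:line`, level marker, message). The regular expression and the `problem_lines` helper are illustrative names, not part of Tarantool or test-run.

```python
import re

# Line layout inferred from the log excerpts in this issue (an assumption,
# not taken from a Tarantool log format specification):
# "<date> <time> [pid] <fiber> [file:line] <LEVEL>> <message>"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\[(?P<pid>\d+)\] (?P<fiber>\S+)"
    r"(?: (?P<src>\S+:\d+))? (?P<level>[A-Z!])> (?P<msg>.*)$"
)


def problem_lines(log_text, levels=("!", "E", "F")):
    """Return (level, fiber, message) for every line at the given levels."""
    found = []
    for line in log_text.splitlines():
        m = LOG_LINE.match(line.strip())
        if m and m.group("level") in levels:
            found.append((m.group("level"), m.group("fiber"), m.group("msg")))
    return found


# Four lines copied from the replica log above.
SAMPLE = """\
2020-11-08 08:31:19.980 [30534] main/112/applier/localhost:49168 I> can't connect to master
2020-11-08 08:31:19.980 [30534] main/112/applier/localhost:49168 sio.c:208 !> SystemError connect to 127.0.0.1:49168, called on fd 27, aka 127.0.0.1:47954: Connection refused
2020-11-08 08:31:20.093 [30534] main/103/replica box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
2020-11-08 08:31:20.093 [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.
"""

if __name__ == "__main__":
    for level, fiber, msg in problem_lines(SAMPLE):
        print(level, fiber, msg)
```

On the sample, the helper keeps the `Connection refused` line, the `ER_READONLY` error and the fatal `can't initialize storage` line, and skips the informational one.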

Steps to reproduce:

l=0 ; while ./test-run.py -j50 `for r in {1..100} ; do echo replication-py/init_storage.test.py ; done 2>/dev/null` ; do l=$(($l+1)) ; echo ======== $l ============= ; done
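
The same loop can be sketched in Python for environments where the shell one-liner is awkward to use. This is only an approximation of the command above: `build_command`, `run_until_failure` and the `max_rounds` limit are hypothetical names added for illustration, and the script is assumed to be started from the directory that contains `test-run.py`.

```python
import subprocess

TEST = "replication-py/init_storage.test.py"


def build_command(copies=100, jobs=50):
    """Build the test-run.py invocation: -j<jobs> plus <copies> copies of the test."""
    return ["./test-run.py", "-j{}".format(jobs)] + [TEST] * copies


def run_until_failure(max_rounds=None):
    """Repeat the run until test-run.py exits non-zero.

    Returns the number of rounds that passed. max_rounds is an optional
    safety limit that the original one-liner does not have.
    """
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        if subprocess.call(build_command()) != 0:
            break
        rounds += 1
        print("======== {} =============".format(rounds))
    return rounds


if __name__ == "__main__":
    run_until_failure()
```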

Optional (but very desirable):

  • coredump
  • backtrace
  • netstat
avtikhon self-assigned this Apr 29, 2020
avtikhon referenced this issue in tarantool/tarantool Apr 30, 2020
Marked flaky tests as fragile to keep them out of parallel runs and
avoid flaky failures in regular testing:

  snapshot.test.py            ; gh-4514
  init_storage.test.py        ; gh-4949
  stat.test.lua               ; gh-4951
  xlog/checkpoint_daemon.test.lua ; gh-4952

Part of #4953
avtikhon referenced this issue in tarantool/tarantool Apr 30, 2020
Marked flaky tests as fragile to keep them out of parallel runs and
avoid flaky failures in regular testing:

  box-py/snapshot.test.py              ; gh-4514
  replication-py/init_storage.test.py  ; gh-4949
  vinyl/stat.test.lua                  ; gh-4951
  xlog/checkpoint_daemon.test.lua      ; gh-4952

Part of #4953
avtikhon referenced this issue in tarantool/tarantool Apr 30, 2020
Marked flaky tests as fragile to keep them out of parallel runs and
avoid flaky failures in regular testing:

  box-py/snapshot.test.py              ; gh-4514
  replication/misc.test.lua            ; gh-4940
  replication-py/init_storage.test.py  ; gh-4949
  vinyl/stat.test.lua                  ; gh-4951
  xlog/checkpoint_daemon.test.lua      ; gh-4952

Part of #4953
avtikhon referenced this issue in tarantool/tarantool May 6, 2020
Marked flaky tests as fragile to keep them out of parallel runs and
avoid flaky failures in regular testing:

  box-py/snapshot.test.py              ; gh-4514
  replication/misc.test.lua            ; gh-4940
  replication-py/init_storage.test.py  ; gh-4949
  vinyl/stat.test.lua                  ; gh-4951
  xlog/checkpoint_daemon.test.lua      ; gh-4952

Part of #4953
avtikhon referenced this issue in tarantool/tarantool May 6, 2020
Marked flaky tests as fragile to keep them out of parallel runs and
avoid flaky failures in regular testing:

  box-py/snapshot.test.py                ; gh-4514
  replication/misc.test.lua              ; gh-4940
  replication/skip_conflict_row.test.lua ; gh-4958
  replication-py/init_storage.test.py    ; gh-4949
  vinyl/stat.test.lua                    ; gh-4951
  xlog/checkpoint_daemon.test.lua        ; gh-4952

Part of #4953
kyukhin referenced this issue in tarantool/tarantool May 8, 2020
Marked flaky tests as fragile to keep them out of parallel runs and
avoid flaky failures in regular testing:

  box-py/snapshot.test.py                ; gh-4514
  replication/misc.test.lua              ; gh-4940
  replication/skip_conflict_row.test.lua ; gh-4958
  replication-py/init_storage.test.py    ; gh-4949
  vinyl/stat.test.lua                    ; gh-4951
  xlog/checkpoint_daemon.test.lua        ; gh-4952

Part of #4953
kyukhin referenced this issue in tarantool/tarantool May 8, 2020
Marked flaky tests as fragile to keep them out of parallel runs and
avoid flaky failures in regular testing:

  box-py/snapshot.test.py                ; gh-4514
  replication/misc.test.lua              ; gh-4940
  replication/skip_conflict_row.test.lua ; gh-4958
  replication-py/init_storage.test.py    ; gh-4949
  vinyl/stat.test.lua                    ; gh-4951
  xlog/checkpoint_daemon.test.lua        ; gh-4952

Part of #4953

(cherry picked from commit faf7e48)
avtikhon referenced this issue in tarantool/tarantool Nov 7, 2020
The test failed in two common places, when it tried to start the
replica and wait for it within the 'JOIN' or 'SUBSCRIBE' test
functions. To wait for the replica to start, it used the
'wait_until_started()' function of the 'TarantoolServer' class from
the test-run repository. But that function did not try to resolve
external issues on replica creation, such as "address already in
use"; see gh-4949. The test was therefore changed to catch the
'TarantoolStartError' exception from test-run. The test should also
be restartable, so the 'crash_expected' flag was disabled to let the
test fail with the exception.

Closes #4949
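
The pattern described in this commit message, catching the start error so the run ends as an ordinary failure that the runner can retry, might look roughly like the sketch below. It is not the actual patch: `TarantoolStartError` here is a local stand-in for the test-run exception of the same name, and `FakeReplica` and `start_or_report` are hypothetical names introduced so the example runs on its own.

```python
class TarantoolStartError(Exception):
    """Local stand-in for the test-run exception of the same name."""


class FakeReplica:
    """Hypothetical replica stub that fails to start a fixed number of times."""

    def __init__(self, failures):
        self.failures = failures
        self.stopped = False

    def wait_until_started(self):
        # Simulate a replica that dies during bootstrap.
        if self.failures > 0:
            self.failures -= 1
            raise TarantoolStartError("replica")

    def stop(self):
        self.stopped = True


def start_or_report(replica):
    """Return True if the replica started, False if the start failed.

    On failure the replica is stopped and the failure is printed, so the
    run ends as a plain test failure instead of an unhandled traceback.
    """
    try:
        replica.wait_until_started()
    except TarantoolStartError as exc:
        replica.stop()
        print("replica failed to start: {}".format(exc))
        return False
    return True


if __name__ == "__main__":
    print(start_or_report(FakeReplica(failures=0)))
    print(start_or_report(FakeReplica(failures=1)))
```

Leaving the retry to the runner rather than looping inside the test keeps the test body simple, which matches the message's note that the test should be restartable.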
avtikhon referenced this issue in tarantool/tarantool Nov 8, 2020
The test failed in two common places, when it tried to start the
replica and wait for it within the 'JOIN' or 'SUBSCRIBE' test parts.
To wait for the replica to start, it used the 'wait_until_started()'
function of the 'TarantoolServer' class from the test-run repository.
But that function did not try to resolve connection issues on replica
creation, like:

  [30534] main/103/replica I> connecting to 1 replicas
  [30534] main/112/applier/localhost:49168 I> can't connect to master
  [30534] main/112/applier/localhost:49168 sio.c:208 !> SystemError connect to 127.0.0.1:49168, called on fd 27, aka 127.0.0.1:47954: Connection refused
  [30534] main/112/applier/localhost:49168 I> will retry every 0.10 second
  [30534] main/112/applier/localhost:49168 I> remote master c5d480c3-219c-11eb-ac14-080027727614 at 127.0.0.1:49168 running Tarantool 2.7.0
  [30534] main/103/replica I> connected to 1 replicas
  [30534] main/103/replica I> bootstrapping replica from c5d480c3-219c-11eb-ac14-080027727614 at 127.0.0.1:49168
  [30534] main/112/applier/localhost:49168 I> can't read row
  [30534] main/112/applier/localhost:49168 box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.

To resolve this, the test was changed to catch the
'TarantoolStartError' exception from test-run. The test should also
be restartable by test-run through the fragile list, so the
'crash_expected' flag was disabled to let the test fail with the
exception.

Needed by #4949
avtikhon referenced this issue in tarantool/tarantool Nov 8, 2020
The test failed in two common places, when it tried to start the
replica and wait for it within the 'JOIN' or 'SUBSCRIBE' test parts.
To wait for the replica to start, it used the 'wait_until_started()'
function of the 'TarantoolServer' class from the test-run repository.
But that function did not try to resolve connection issues on replica
creation, like:

  [30534] main/103/replica I> connecting to 1 replicas
  [30534] main/112/applier/localhost:49168 I> can't connect to master
  [30534] main/112/applier/localhost:49168 sio.c:208 !> SystemError connect to 127.0.0.1:49168, called on fd 27, aka 127.0.0.1:47954: Connection refused
  [30534] main/112/applier/localhost:49168 I> will retry every 0.10 second
  [30534] main/112/applier/localhost:49168 I> remote master c5d480c3-219c-11eb-ac14-080027727614 at 127.0.0.1:49168 running Tarantool 2.7.0
  [30534] main/103/replica I> connected to 1 replicas
  [30534] main/103/replica I> bootstrapping replica from c5d480c3-219c-11eb-ac14-080027727614 at 127.0.0.1:49168
  [30534] main/112/applier/localhost:49168 I> can't read row
  [30534] main/112/applier/localhost:49168 box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.

To resolve this, the test was changed to catch the
'TarantoolStartError' exception from test-run. The test should also
be restartable by test-run through the fragile list, so the
'crash_expected' flag was disabled to let the test fail with the
exception.

Needed by #4949
avtikhon referenced this issue in tarantool/tarantool Nov 10, 2020
The test failed in two common places, when it tried to start the
replica and wait for it within the 'JOIN' or 'SUBSCRIBE' test parts.
To wait for the replica to start, it used the 'wait_until_started()'
function of the 'TarantoolServer' class from the test-run repository.
But that function did not try to resolve connection issues on replica
creation, like:

  [30534] main/103/replica I> connecting to 1 replicas
  [30534] main/112/applier/localhost:49168 I> can't connect to master
  [30534] main/112/applier/localhost:49168 sio.c:208 !> SystemError connect to 127.0.0.1:49168, called on fd 27, aka 127.0.0.1:47954: Connection refused
  [30534] main/112/applier/localhost:49168 I> will retry every 0.10 second
  [30534] main/112/applier/localhost:49168 I> remote master c5d480c3-219c-11eb-ac14-080027727614 at 127.0.0.1:49168 running Tarantool 2.7.0
  [30534] main/103/replica I> connected to 1 replicas
  [30534] main/103/replica I> bootstrapping replica from c5d480c3-219c-11eb-ac14-080027727614 at 127.0.0.1:49168
  [30534] main/112/applier/localhost:49168 I> can't read row
  [30534] main/112/applier/localhost:49168 box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.

To resolve this, the test was changed to catch the
'TarantoolStartError' exception from test-run. The test should also
be restartable by test-run through the fragile list, so the
'crash_expected' flag was disabled to let the test fail with the
exception.

Needed by #4949
avtikhon referenced this issue in tarantool/tarantool Nov 11, 2020
The test failed in two common places, when it tried to start the
replica and wait for it within the 'JOIN' or 'SUBSCRIBE' test parts.
To wait for the replica to start, it used the 'wait_until_started()'
function of the 'TarantoolServer' class from the test-run repository.
But that function did not try to resolve connection issues on replica
creation, like:

  [30534] main/103/replica I> connecting to 1 replicas
  [30534] main/112/applier/localhost:49168 I> can't connect to master
  [30534] main/112/applier/localhost:49168 sio.c:208 !> SystemError connect to 127.0.0.1:49168, called on fd 27, aka 127.0.0.1:47954: Connection refused
  [30534] main/112/applier/localhost:49168 I> will retry every 0.10 second
  [30534] main/112/applier/localhost:49168 I> remote master c5d480c3-219c-11eb-ac14-080027727614 at 127.0.0.1:49168 running Tarantool 2.7.0
  [30534] main/103/replica I> connected to 1 replicas
  [30534] main/103/replica I> bootstrapping replica from c5d480c3-219c-11eb-ac14-080027727614 at 127.0.0.1:49168
  [30534] main/112/applier/localhost:49168 I> can't read row
  [30534] main/112/applier/localhost:49168 box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.

To resolve this, the test was changed to catch the
'TarantoolStartError' exception from test-run. The test should also
be restartable by test-run through the fragile list, so the
'crash_expected' flag was disabled to let the test fail with the
exception.

Needed by #4949
avtikhon referenced this issue in tarantool/tarantool Nov 11, 2020
Found that test failed in 2 common places when it tried to start the
replica and wait it within 'JOIN' either 'SUBSCRIBE' test parts.
It used to wait for replica start check the 'wait_until_started()'
function 'TarantoolServer' class from test-run repository. But it
didn't try resolve connection issues on replica creation, like:

  [30534] main/103/replica I> connecting to 1 replicas
  [30534] main/112/applier/localhost:49168 I> can't connect to master
  [30534] main/112/applier/localhost:49168 sio.c:208 !> SystemError connect to 127.0.0.1:49168, called on fd 27, aka 127.0.0.1:47954: Connection refused
  [30534] main/112/applier/localhost:49168 I> will retry every 0.10 second
  [30534] main/112/applier/localhost:49168 I> remote master c5d480c3-219c-11eb-ac14-080027727614 at 127.0.0.1:49168 running Tarantool 2.7.0
  [30534] main/103/replica I> connected to 1 replicas
  [30534] main/103/replica I> bootstrapping replica from c5d480c3-219c-11eb-ac14-080027727614 at 127.0.0.1:49168
  [30534] main/112/applier/localhost:49168 I> can't read row
  [30534] main/112/applier/localhost:49168 box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.

To resolve it the test was changed to be able to catch exception
'TarantoolStartError' from test-run. Also the test should have the
ability to be restarted by test-run using fragile list and in this way
'crash_expected' flag was disabled to let the test fail with exception.

Needed by #4949
avtikhon referenced this issue in tarantool/tarantool Nov 11, 2020
Found that test failed in 2 common places when it tried to start the
replica and wait it within 'JOIN' either 'SUBSCRIBE' test parts.
It used to wait for replica start check the 'wait_until_started()'
function 'TarantoolServer' class from test-run repository. But it
didn't try resolve connection issues on replica creation, like:

  [30534] main/103/replica I> connecting to 1 replicas
  [30534] main/112/applier/localhost:49168 I> can't connect to master
  [30534] main/112/applier/localhost:49168 sio.c:208 !> SystemError connect to 127.0.0.1:49168, called on fd 27, aka 127.0.0.1:47954: Connection refused
  [30534] main/112/applier/localhost:49168 I> will retry every 0.10 second
  [30534] main/112/applier/localhost:49168 I> remote master c5d480c3-219c-11eb-ac14-080027727614 at 127.0.0.1:49168 running Tarantool 2.7.0
  [30534] main/103/replica I> connected to 1 replicas
  [30534] main/103/replica I> bootstrapping replica from c5d480c3-219c-11eb-ac14-080027727614 at 127.0.0.1:49168
  [30534] main/112/applier/localhost:49168 I> can't read row
  [30534] main/112/applier/localhost:49168 box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.

To resolve it the test was changed to be able to catch exception
'TarantoolStartError' from test-run. Also the test should have the
ability to be restarted by test-run using fragile list and in this way
'crash_expected' flag was disabled to let the test fail with exception.

Needed by #4949
avtikhon referenced this issue in tarantool/tarantool Nov 11, 2020
Found that test failed in 2 common places when it tried to start the
replica and wait it within 'JOIN' either 'SUBSCRIBE' test parts.
It used to wait for replica start check the 'wait_until_started()'
function 'TarantoolServer' class from test-run repository. But it
didn't try resolve connection issues on replica creation, like:

  [30534] main/103/replica I> connecting to 1 replicas
  [30534] main/112/applier/localhost:49168 I> can't connect to master
  [30534] main/112/applier/localhost:49168 sio.c:208 !> SystemError connect to 127.0.0.1:49168, called on fd 27, aka 127.0.0.1:47954: Connection refused
  [30534] main/112/applier/localhost:49168 I> will retry every 0.10 second
  [30534] main/112/applier/localhost:49168 I> remote master c5d480c3-219c-11eb-ac14-080027727614 at 127.0.0.1:49168 running Tarantool 2.7.0
  [30534] main/103/replica I> connected to 1 replicas
  [30534] main/103/replica I> bootstrapping replica from c5d480c3-219c-11eb-ac14-080027727614 at 127.0.0.1:49168
  [30534] main/112/applier/localhost:49168 I> can't read row
  [30534] main/112/applier/localhost:49168 box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.

To resolve it the test was changed to be able to catch exception
'TarantoolStartError' from test-run. Also the test should have the
ability to be restarted by test-run using fragile list and in this way
'crash_expected' flag was disabled to let the test fail with exception.

Needed by #4949
avtikhon referenced this issue in tarantool/tarantool Nov 11, 2020
Found that test failed in 2 common places when it tried to start the
replica and wait it within 'JOIN' either 'SUBSCRIBE' test parts.
It used to wait for replica start check the 'wait_until_started()'
function 'TarantoolServer' class from test-run repository. But it
didn't try resolve connection issues on replica creation, like:

  [30534] main/103/replica I> connecting to 1 replicas
  [30534] main/112/applier/localhost:49168 I> can't connect to master
  [30534] main/112/applier/localhost:49168 sio.c:208 !> SystemError connect to 127.0.0.1:49168, called on fd 27, aka 127.0.0.1:47954: Connection refused
  [30534] main/112/applier/localhost:49168 I> will retry every 0.10 second
  [30534] main/112/applier/localhost:49168 I> remote master c5d480c3-219c-11eb-ac14-080027727614 at 127.0.0.1:49168 running Tarantool 2.7.0
  [30534] main/103/replica I> connected to 1 replicas
  [30534] main/103/replica I> bootstrapping replica from c5d480c3-219c-11eb-ac14-080027727614 at 127.0.0.1:49168
  [30534] main/112/applier/localhost:49168 I> can't read row
  [30534] main/112/applier/localhost:49168 box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.

To resolve it the test was changed to be able to catch exception
'TarantoolStartError' from test-run. Also the test should have the
ability to be restarted by test-run using fragile list and in this way
'crash_expected' flag was disabled to let the test fail with exception.

Needed by #4949
avtikhon referenced this issue in tarantool/tarantool Nov 11, 2020
Found that test failed in 2 common places when it tried to start the
replica and wait it within 'JOIN' either 'SUBSCRIBE' test parts.
It used to wait for replica start check the 'wait_until_started()'
function 'TarantoolServer' class from test-run repository. But it
didn't try resolve connection issues on replica creation, like:

  [30534] main/103/replica I> connecting to 1 replicas
  [30534] main/112/applier/localhost:49168 I> can't connect to master
  [30534] main/112/applier/localhost:49168 sio.c:208 !> SystemError connect to 127.0.0.1:49168, called on fd 27, aka 127.0.0.1:47954: Connection refused
  [30534] main/112/applier/localhost:49168 I> will retry every 0.10 second
  [30534] main/112/applier/localhost:49168 I> remote master c5d480c3-219c-11eb-ac14-080027727614 at 127.0.0.1:49168 running Tarantool 2.7.0
  [30534] main/103/replica I> connected to 1 replicas
  [30534] main/103/replica I> bootstrapping replica from c5d480c3-219c-11eb-ac14-080027727614 at 127.0.0.1:49168
  [30534] main/112/applier/localhost:49168 I> can't read row
  [30534] main/112/applier/localhost:49168 box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.

To resolve it, the test was changed to catch the 'TarantoolStartError'
exception from test-run. The test should also be restartable by
test-run via the fragile list, so the 'crash_expected' flag was disabled
to let the test fail with an exception.

Needed by #4949
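
The commit message does not say what the test does after catching the exception. As a minimal sketch of one possible reading (a bounded retry), assuming a hypothetical `StartError` class standing in for test-run's 'TarantoolStartError' and a hypothetical `start_with_retry()` helper that is not part of test-run:

```python
class StartError(Exception):
    """Hypothetical stand-in for test-run's TarantoolStartError."""


def start_with_retry(start, attempts=3):
    """Call start() until it succeeds or the attempts run out.

    Returns the 1-based number of the attempt that succeeded and
    re-raises the last StartError if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        try:
            start()
            return attempt
        except StartError:
            if attempt == attempts:
                raise


# Simulated replica: fails to bootstrap twice, then starts.
_outcomes = iter([False, False, True])


def flaky_start():
    if not next(_outcomes):
        raise StartError("replica")


attempt = start_with_retry(flaky_start)
```

Re-raising on the last attempt keeps the failure visible, so a runner could still mark the test as failed and rerun it.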
avtikhon referenced this issue in tarantool/tarantool Nov 11, 2020
avtikhon referenced this issue in tarantool/tarantool Nov 11, 2020
avtikhon referenced this issue in tarantool/tarantool Nov 24, 2020
Found that the test failed in 2 common places, where it tried to start
the replica and wait for it in the 'JOIN' and 'SUBSCRIBE' parts of the
test. To wait for the replica to start, it used the
'wait_until_started()' function of the 'TarantoolServer' class from the
test-run repository. But it didn't try to resolve connection issues on
replica creation, like:

  [30534] main/103/replica I> connecting to 1 replicas
  [30534] main/112/applier/localhost:49168 I> can't connect to master
  [30534] main/112/applier/localhost:49168 sio.c:208 !> SystemError connect to 127.0.0.1:49168, called on fd 27, aka 127.0.0.1:47954: Connection refused
  [30534] main/112/applier/localhost:49168 I> will retry every 0.10 second
  [30534] main/112/applier/localhost:49168 I> remote master c5d480c3-219c-11eb-ac14-080027727614 at 127.0.0.1:49168 running Tarantool 2.7.0
  [30534] main/103/replica I> connected to 1 replicas
  [30534] main/103/replica I> bootstrapping replica from c5d480c3-219c-11eb-ac14-080027727614 at 127.0.0.1:49168
  [30534] main/112/applier/localhost:49168 I> can't read row
  [30534] main/112/applier/localhost:49168 box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.

To resolve it, the test was changed to catch the 'TarantoolStartError'
exception from test-run. The test should also be restartable by
test-run via the fragile list, so the 'crash_expected' flag was disabled
to let the test fail with an exception.

Also found new issues, like:

  Test hung! Result content mismatch:
  --- replication-py/init_storage.result  Wed Aug 26 06:06:15 2020
  +++ /tmp/tnt/101_replication-py/init_storage.result     Wed Nov 11 10:17:50 2020
  @@ -130,5 +130,3 @@
   -------------------------------------------------------------
   waiting reconnect on JOIN...
   ok
  -waiting reconnect on SUBSCRIBE...
  -ok

To fix it, it was decided to use the SIGKILL signal in the restart/stop
routines to stop the instances.

Needed by #4949
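
For illustration only, stopping a hung process with SIGKILL can be sketched as below. This is not the test-run restart/stop code: `kill_instance()` is a hypothetical helper, the child is a plain Python process standing in for an instance, and the sketch assumes a POSIX system, where `signal.SIGKILL` exists.

```python
import signal
import subprocess
import sys


def kill_instance(proc):
    """Stop a child process with SIGKILL and return its exit status."""
    proc.send_signal(signal.SIGKILL)
    return proc.wait()


# Stand-in for a hung instance: a child that would sleep for a minute.
child = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(60)"]
)
status = kill_instance(child)
```

SIGKILL cannot be caught or ignored, so the process gets no chance to shut down cleanly. That is the trade-off for never hanging in the stop routine.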
avtikhon referenced this issue in tarantool/tarantool Nov 24, 2020
avtikhon referenced this issue in tarantool/tarantool Nov 25, 2020
avtikhon referenced this issue in tarantool/tarantool Nov 25, 2020
Found that the test failed in 2 common places, where it tried to start
the replica and wait for it in the 'JOIN' and 'SUBSCRIBE' parts of the
test. To wait for the replica to start, it used the
'wait_until_started()' function of the 'TarantoolServer' class from the
test-run repository. But it didn't try to resolve connection issues on
replica creation, like:

  [30534] main/103/replica I> connecting to 1 replicas
  [30534] main/112/applier/localhost:49168 I> can't connect to master
  [30534] main/112/applier/localhost:49168 sio.c:208 !> SystemError connect to 127.0.0.1:49168, called on fd 27, aka 127.0.0.1:47954: Connection refused
  [30534] main/112/applier/localhost:49168 I> will retry every 0.10 second
  [30534] main/112/applier/localhost:49168 I> remote master c5d480c3-219c-11eb-ac14-080027727614 at 127.0.0.1:49168 running Tarantool 2.7.0
  [30534] main/103/replica I> connected to 1 replicas
  [30534] main/103/replica I> bootstrapping replica from c5d480c3-219c-11eb-ac14-080027727614 at 127.0.0.1:49168
  [30534] main/112/applier/localhost:49168 I> can't read row
  [30534] main/112/applier/localhost:49168 box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.

To resolve it, the test was changed to catch the 'TarantoolStartError'
exception from test-run. The test should also be restartable by
test-run via the fragile list, so the 'crash_expected' flag was disabled
to let the test fail with an exception.

Also found new issues, like:

  Test hung! Result content mismatch:
  --- replication-py/init_storage.result  Wed Aug 26 06:06:15 2020
  +++ /tmp/tnt/101_replication-py/init_storage.result     Wed Nov 11 10:17:50 2020
  @@ -130,5 +130,3 @@
   -------------------------------------------------------------
   waiting reconnect on JOIN...
   ok
  -waiting reconnect on SUBSCRIBE...
  -ok

To fix it, wait loops from the test-run repository were added.

Needed by #4949
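
The commit message does not show the wait loops themselves. A generic polling loop with a timeout might look roughly like this; `wait_until()` and `replica_started()` are hypothetical names for illustration, not the test-run helpers:

```python
import time


def wait_until(cond, timeout=5.0, delay=0.01):
    """Poll cond() until it returns a truthy value or timeout expires.

    Returns True if the condition was met, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if cond():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(delay)


# Simulated condition: becomes true on the third poll.
polls = []


def replica_started():
    polls.append(1)
    return len(polls) >= 3


started = wait_until(replica_started)
```

Returning False on timeout instead of blocking forever lets the caller report a failure rather than hang.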
avtikhon referenced this issue in tarantool/tarantool Nov 25, 2020
avtikhon referenced this issue in tarantool/tarantool Nov 27, 2020
avtikhon referenced this issue in tarantool/tarantool Dec 1, 2020
avtikhon referenced this issue in tarantool/tarantool Dec 1, 2020
avtikhon referenced this issue in tarantool/tarantool Dec 2, 2020
avtikhon referenced this issue in tarantool/tarantool Dec 2, 2020
Found that the test failed in 2 common places, where it tried to start
the replica and wait for it in the 'JOIN' and 'SUBSCRIBE' parts of the
test. To wait for the replica to start, it used the
'wait_until_started()' function of the 'TarantoolServer' class from the
test-run repository. But it didn't try to resolve connection issues on
replica creation, like:

  [30534] main/103/replica I> connecting to 1 replicas
  [30534] main/112/applier/localhost:49168 I> can't connect to master
  [30534] main/112/applier/localhost:49168 sio.c:208 !> SystemError connect to 127.0.0.1:49168, called on fd 27, aka 127.0.0.1:47954: Connection refused
  [30534] main/112/applier/localhost:49168 I> will retry every 0.10 second
  [30534] main/112/applier/localhost:49168 I> remote master c5d480c3-219c-11eb-ac14-080027727614 at 127.0.0.1:49168 running Tarantool 2.7.0
  [30534] main/103/replica I> connected to 1 replicas
  [30534] main/103/replica I> bootstrapping replica from c5d480c3-219c-11eb-ac14-080027727614 at 127.0.0.1:49168
  [30534] main/112/applier/localhost:49168 I> can't read row
  [30534] main/112/applier/localhost:49168 box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.

To resolve it, the test was changed to catch the 'TarantoolStartError'
exception from test-run. The test should also be restartable by
test-run via the fragile list, so the 'crash_expected' flag was enabled
to let the test fail with an exception.

Needed by #4949
avtikhon referenced this issue in tarantool/tarantool Dec 2, 2020
avtikhon referenced this issue in tarantool/tarantool Dec 3, 2020
Found that test failed in 2 common places when it tried to start the
replica and wait it within 'JOIN' either 'SUBSCRIBE' test parts.
It used to wait for replica start check the 'wait_until_started()'
function 'TarantoolServer' class from test-run repository. But it
didn't try resolve connection issues on replica creation, like:

  [30534] main/103/replica I> connecting to 1 replicas
  [30534] main/112/applier/localhost:49168 I> can't connect to master
  [30534] main/112/applier/localhost:49168 sio.c:208 !> SystemError connect to 127.0.0.1:49168, called on fd 27, aka 127.0.0.1:47954: Connection refused
  [30534] main/112/applier/localhost:49168 I> will retry every 0.10 second
  [30534] main/112/applier/localhost:49168 I> remote master c5d480c3-219c-11eb-ac14-080027727614 at 127.0.0.1:49168 running Tarantool 2.7.0
  [30534] main/103/replica I> connected to 1 replicas
  [30534] main/103/replica I> bootstrapping replica from c5d480c3-219c-11eb-ac14-080027727614 at 127.0.0.1:49168
  [30534] main/112/applier/localhost:49168 I> can't read row
  [30534] main/112/applier/localhost:49168 box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.
  [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.

To resolve it the test was changed to be able to catch exception
'TarantoolStartError' from test-run. Also the test should have the
ability to be restarted by test-run using fragile list and in this way
'crash_expected' flag was enabled to let the test fail with exception.

Needed by #4949
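
A minimal sketch of the pattern this commit message describes, not the actual change. `FakeReplica` and `wait_for_replica` are hypothetical names, and the `TarantoolStartError` class below is a local stand-in for the one in test-run, so the snippet runs without the harness. The real effect of `crash_expected` inside test-run is not modeled here.

```python
class TarantoolStartError(Exception):
    """Local stand-in for the exception of the same name in test-run."""


class FakeReplica:
    """Hypothetical stand-in for test-run's TarantoolServer."""

    def __init__(self, starts_ok):
        self.starts_ok = starts_ok
        self.crash_expected = False

    def wait_until_started(self):
        # Simulate the replica dying during bootstrap.
        if not self.starts_ok:
            raise TarantoolStartError("replica")


def wait_for_replica(replica):
    """Return True if the replica started, False if it failed to start."""
    # Flag named in the commit message; here it is only recorded.
    replica.crash_expected = True
    try:
        replica.wait_until_started()
    except TarantoolStartError:
        # The test can now report the failure instead of killing the worker.
        return False
    return True
```

With this shape a failed start becomes an ordinary test failure that the harness may retry, rather than an unhandled exception in the worker.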
avtikhon referenced this issue in tarantool/tarantool Dec 3, 2020
kyukhin referenced this issue in tarantool/tarantool Dec 4, 2020
(cherry picked from commit 362c195)
kyukhin referenced this issue in tarantool/tarantool Dec 4, 2020
(cherry picked from commit 362c195)
kyukhin referenced this issue in tarantool/tarantool Dec 4, 2020
(cherry picked from commit 362c195)
kyukhin referenced this issue in tarantool/tarantool Dec 4, 2020
@sergepetrenko sergepetrenko self-assigned this Dec 10, 2020
@kyukhin kyukhin added the teamS Scaling label Dec 16, 2020
@avtikhon avtikhon removed their assignment Dec 16, 2020
@kyukhin kyukhin removed the teamX label Dec 18, 2020
@sergos

sergos commented Feb 11, 2022

The problem has two facets. First, the test must not be written in such a way that the leader obtains its port only after the replica has started waiting for it. The simplest way to fix it is to

- server.stop()
+ os.kill(server.read_pidfile(), signal.SIGHUP)
  replica = TarantoolServer(server.ini)
  replica.script = "replication-py/replica.lua"
  replica.vardir = server.vardir
  replica.rpl_master = master
  replica.deploy(wait=False)

  print("waiting reconnect on JOIN...")
- server.start()
+ os.kill(server.read_pidfile(), signal.SIGCONT)

if you need to observe the replica's behavior while the leader is not available.
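
For illustration only, here is a self-contained sketch of freezing and thawing a process with signals, assuming a POSIX system. The diff above names SIGHUP and SIGCONT; this sketch assumes the intent is to suspend the leader and later let it continue, so it uses SIGSTOP, the standard POSIX counterpart of SIGCONT. `suspend`, `resume` and `demo` are hypothetical names, and a sleeping Python child stands in for the leader.

```python
import os
import signal
import subprocess
import sys


def suspend(pid):
    """Freeze the process; it keeps its open sockets while stopped."""
    os.kill(pid, signal.SIGSTOP)


def resume(pid):
    """Let a previously suspended process continue."""
    os.kill(pid, signal.SIGCONT)


def demo():
    """Suspend and resume a child process, then shut it down."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    try:
        suspend(child.pid)
        # A test would deploy the replica here, while the leader is frozen.
        resume(child.pid)
    finally:
        child.terminate()
        child.wait()
    return child.returncode
```

Because the suspended process never exits, it does not release its port, which is the point of replacing the stop/start pair.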

The second is far worse.
Inside tarantool_server.py, a port is assigned to the server by find_port() during the install() phase. Since we run a horde of workers, the port can be hijacked by another process during the activity that follows, before start() is called. Even if start() is called immediately, as in deploy(), there is still a window inside the Tarantool instance's execution before it tries to open the port.
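
The window described here is a generic check-then-use race, and it can be reproduced without test-run. The sketch below does not reproduce find_port(); `probe_free_port`, `bind_listener` and `demo_race` are hypothetical helpers that only model the sequence "pick a free port, release it, bind later".

```python
import socket


def probe_free_port():
    """Ask the OS for a free port, then release it (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.bind(("127.0.0.1", 0))
        return probe.getsockname()[1]


def bind_listener(port):
    """Bind and listen on the port; raise OSError if it is already taken."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind(("127.0.0.1", port))
        sock.listen(1)
    except OSError:
        sock.close()
        raise
    return sock


def demo_race():
    """Return True if the late bind lost the port to another listener."""
    port = probe_free_port()
    # Another worker grabs the port between the probe and the real bind.
    other_worker = bind_listener(port)
    try:
        try:
            late = bind_listener(port)
        except OSError:
            return True
        late.close()
        return False
    finally:
        other_worker.close()
```

The probe only proves the port was free at one moment. Nothing reserves it afterwards, so the longer the gap before the real bind, the more likely a parallel worker takes it.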

@Totktonada
Member

The last paragraph looks quite relevant to tarantool/test-run#141.

@kyukhin kyukhin added bug Something isn't working and removed qa labels Mar 25, 2022
@ylobankov ylobankov transferred this issue from tarantool/tarantool Apr 15, 2022
@TarantoolBot TarantoolBot removed the 2sp label Jun 14, 2023