Postpone fragile tests and run them one by one #187


Closed
Totktonada opened this issue Aug 26, 2019 · 0 comments · Fixed by #188
Labels
feature A new functionality

Comments

@Totktonada
Member

Just like we do for is_parallel = False test suites. Hopefully this will let us overcome flaky test failures without disabling test cases.

@Totktonada Totktonada added the feature A new functionality label Aug 26, 2019
Totktonada added a commit that referenced this issue Aug 26, 2019
Added "fragile" option to suite.ini file format. The option lists tests
that are not designed to be run in parallel with others: say, those that
rely on timings that are met only when enough system resources are
available.

The option is set in the same way as "disabled" and other test lists:

 | fragile = foo.test.lua ; gh-1234
 |           bar.test.lua ; gh-5678

Fixes #187.
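As a rough illustration of how such a multi-line list option can be read, here is a sketch using Python's standard configparser (the section name and the parsing approach are assumptions; test-run has its own suite.ini parser):

```python
import configparser

# A suite.ini fragment with the multi-line "fragile" option.
# The "; gh-NNNN" issue references are stripped as inline comments.
ini_text = """
[default]
fragile = foo.test.lua ; gh-1234
          bar.test.lua ; gh-5678
"""

cfg = configparser.ConfigParser(inline_comment_prefixes=(';',))
cfg.read_string(ini_text)

# The continuation lines join into one value; split it into test names.
tests = cfg.get('default', 'fragile').split()
print(tests)  # ['foo.test.lua', 'bar.test.lua']
```

Note that inline comments are stripped per line, so the issue references after each test name do not leak into the parsed list.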
Totktonada added a commit to tarantool/tarantool that referenced this issue Sep 24, 2020
Retry a failed test when it is marked as fragile (and several other
conditions are met, see below).

test-run already allows setting a list of fragile tests. They are run
one by one after all parallel ones in order to eliminate possible
resource starvation and to match the timings under which the tests pass.
See [1].

In practice this approach does not help much against our problem with
flaky tests. We decided to retry failed tests when they are known to be
fragile. See [2].

The core idea is to split responsibility: known flaky failures will not
distract a developer, but each fragile test will be marked explicitly,
tracked in the issue tracker, and analyzed by the quality assurance
team.

The default behaviour is not changed: each test from the fragile list
is run once after all parallel ones. But now it is possible to set the
number of retries.

Beware: the implementation does not allow simply setting a retry count;
it also requires providing an md5sum of the failed test output (the
so-called reject file). The idea is to ensure that we retry the test
only on a known failure, not on some other failure within the test.
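The retry-with-checksum rule can be sketched roughly as follows (all names here are hypothetical; the real logic lives in test-run's dispatcher):

```python
import hashlib

def run_with_retries(run_test, known_checksums, retries):
    """Retry a fragile test, but only while its failure is a known one.

    run_test() is assumed to return (passed, output_bytes);
    known_checksums holds md5 hex digests of previously observed
    reject-file contents.
    """
    for _ in range(1 + retries):
        passed, output = run_test()
        if passed:
            return True
        if hashlib.md5(output).hexdigest() not in known_checksums:
            # Unknown failure: report it instead of retrying.
            return False
    return False
```

A test that fails once with a known output and then passes would be reported as passing, while a test failing with an unrecognized output fails immediately without retries.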

This approach has a limitation: on failure, a test may output
information that varies from run to run or depends on the base
directory. We should always verify the output before putting its
checksum into the configuration file.
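Once the reject file has been verified by hand, its checksum for the configuration can be computed like this (the file path is a hypothetical example):

```python
import hashlib

def reject_checksum(path):
    """Compute the md5 hex digest of a test's reject file."""
    with open(path, 'rb') as f:
        return hashlib.md5(f.read()).hexdigest()

# Example (hypothetical path):
# print(reject_checksum('test/box/foo.reject'))
```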

Despite doubts regarding this approach, it is simple, and we decided to
try it and revisit it if the need arises.

See configuration example in [3].

[1]: tarantool/test-run#187
[2]: tarantool/test-run#189
[3]: tarantool/test-run#217

Part of #5050
Totktonada added a commit to tarantool/tarantool that referenced this issue Sep 24, 2020
(cherry picked from commit 43482ee)
Totktonada added a commit to tarantool/tarantool that referenced this issue Sep 24, 2020
(cherry picked from commit 43482ee)
Totktonada added a commit to tarantool/tarantool that referenced this issue Sep 24, 2020
(cherry picked from commit 43482ee)