Random tests fail and crash on Mac OS #232

Closed
Gerold103 opened this issue Aug 22, 2019 · 9 comments
Labels: bug, flaky test, luajit, OSX

Comments

@Gerold103

Tarantool version:
master
OS version:
Mac
Bug description:
Every time I run the full suite on the master branch, I get spontaneous crashes and hangs. Example of a crash:

[032] box/func_reload.test.lua                                        
[032] 
[032] [Instance "box" killed by signal: 6 (SIGABRT)]
[032] 
[032] Last 15 lines of Tarantool Log file [Instance "box"][/Users/gerold/Work/Repositories/tarantool/test/var/032_box/box.log]:
[032] 2019-08-22 22:14:42.026 [38110] snapshot/101/main I> done
[032] 2019-08-22 22:14:42.029 [38110] main I> removed /Users/gerold/Work/Repositories/tarantool/test/var/032_box/box/00000000000000000120.snap
[032] 2019-08-22 22:14:42.029 [38110] wal I> removed /Users/gerold/Work/Repositories/tarantool/test/var/032_box/box/00000000000000000120.xlog
[032] 2019-08-22 22:14:42.453 [38110] main/109/main I> joining replica eb8f7d4a-0e37-4a9f-940d-eb57756314c8 at fd 43, aka unix/:/Users/gerold/Work/Repositories/tarantool/test
[032] 2019-08-22 22:14:42.578 [38110] main/109/main I> initial data sent.
[032] 2019-08-22 22:14:42.578 [38110] main I> assigned id 2 to replica eb8f7d4a-0e37-4a9f-940d-eb57756314c8
[032] 2019-08-22 22:14:42.579 [38110] relay/unix/:(socket)/101/main I> recover from `/Users/gerold/Work/Repositories/tarantool/test/var/032_box/box/00000000000000002296.xlog'
[032] 2019-08-22 22:14:42.579 [38110] main/109/main I> final data sent.
[032] 2019-08-22 22:14:42.622 [38110] main/109/main I> subscribed replica eb8f7d4a-0e37-4a9f-940d-eb57756314c8 at fd 43, aka unix/:/Users/gerold/Work/Repositories/tarantool/test
[032] 2019-08-22 22:14:42.622 [38110] main/109/main I> remote vclock {1: 2298} local vclock {1: 2298}
[032] 2019-08-22 22:14:42.623 [38110] relay/unix/:(socket)/101/main I> recover from `/Users/gerold/Work/Repositories/tarantool/test/var/032_box/box/00000000000000002296.xlog'
[032] 2019-08-22 22:14:42.711 [38110] relay/unix/:(socket)/101/main coio.cc:379 !> SystemError unexpected EOF when reading from socket, called on fd 43, aka unix/:/Users/gerold/Work/Repositories/tarantool/test: Broken pipe
[032] 2019-08-22 22:14:42.712 [38110] relay/unix/:(socket)/101/main C> exiting the relay loop
[032] 2019-08-22 22:14:42.774 [38110] main I> removed replica eb8f7d4a-0e37-4a9f-940d-eb57756314c8
[032] Assertion failed: (region_used(region) >= used), function region_truncate, file /Users/gerold/Work/Repositories/tarantool/src/lib/small/small/region.c, line 75.
[032] [ fail ]

Failing tests: engine/ddl, box/func_reload, app/fiber_channel, and others.

@Totktonada
Member

I guess a build with -DLUAJIT_ENABLE_GC64=ON on Linux will show similar behaviour. See tarantool/tarantool#2643.

@Totktonada changed the title from "Random tests fail and crash" to "Random tests fail and crash on Mac OS" on Aug 23, 2019
@kyukhin added the bug label on Aug 23, 2019
@mraleph

mraleph commented Aug 26, 2019

I would guess this is the same issue as tarantool/tarantool#4427

@Totktonada
Member

@Gerold103 Is it still reproducible after the tarantool/tarantool#4427 fix?

@Gerold103
Author

I haven't seen this one for a long time. But sometimes I see mp_tuple_assert() failing. I haven't found time to file a proper issue for that yet.

@Totktonada
Member

mp_tuple_assert() looks similar to #234.

@igormunkin

@Totktonada, @Gerold103, we have recently applied the patch with the fix for #234 and #235. If you consider this issue related to either of them, could you please check whether it is also gone?

@Totktonada
Member

I performed a couple of experiments on Linux with GC64.

Experiment

Build:

$ cmake .                        \
    -DCMAKE_BUILD_TYPE=Debug     \
    -DENABLE_BACKTRACE=ON        \
    -DENABLE_DIST=ON             \
    -DENABLE_FEEDBACK_DAEMON=OFF \
    -DENABLE_BUNDLED_LIBCURL=OFF \
    -DLUAJIT_ENABLE_GC64=ON      \
    && make -j

Verify build:

# tarantool> require('ffi').abi('gc64')
---
- true
...

Run the test many times in parallel:

$ ./test/test-run.py $(yes box/func_reload.test.lua | head -n 1000)
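
For readers unfamiliar with the idiom: the command substitution above expands to the same test name repeated 1000 times, so test-run.py receives 1000 identical arguments. Below is a minimal sketch of that expansion with a smaller count. The helper name `repeat_arg` is hypothetical and is not part of test-run.py.

```shell
# Hypothetical helper (not part of test-run.py): print NAME on COUNT lines,
# the same way `yes NAME | head -n COUNT` does in the command above.
repeat_arg() {
    yes "$1" | head -n "$2"
}

# Unquoted command substitution is split on whitespace, so each output
# line becomes a separate positional argument.
set -- $(repeat_arg box/func_reload.test.lua 5)
echo "argument count: $#"
echo "first argument: $1"
```

This only works as intended because the test name contains no whitespace or glob characters; a name with spaces would be split into several arguments.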

Results

I checked current master (2.11.0-entrypoint-205-g1cd1a2df6) and the commit right before tarantool/tarantool#7328 (2.11.0-entrypoint-177-ga85629a69, the pull request lands as 2.11.0-entrypoint-178-g34330b159).

Got no failures.

Can't add more here.

@Gerold103
Author

I haven't seen any crashes for a long time.

@igormunkin

Nice. Closing then. Feel free to reopen if this starts bothering anybody again.


7 participants