This repository was archived by the owner on Jun 23, 2022. It is now read-only.

feat: remove shared log #1019

Merged
merged 7 commits into from
Jan 21, 2022

Contributor

@levy5307 levy5307 commented Jan 18, 2022

[replication]
+plog_force_flush = false

-log_private_batch_buffer_kb
-log_private_batch_buffer_count
-log_private_batch_buffer_flush_interval_ms

apache/incubator-pegasus#859
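
For anyone carrying an older config across this change, a quick audit may help. The sketch below is a hypothetical helper, not part of Pegasus or rDSN. It assumes the config file can be read by Python's `configparser` (real configs may use syntax it does not accept), and the sample fragment and its values are made up for illustration. It only reports which of the removed `log_private_batch_buffer_*` keys are still present and where `plog_force_flush` is set.

```python
import configparser

# Keys this PR removes, and the key it adds (names taken from the PR description).
REMOVED_KEYS = (
    "log_private_batch_buffer_kb",
    "log_private_batch_buffer_count",
    "log_private_batch_buffer_flush_interval_ms",
)
ADDED_KEY = "plog_force_flush"


def audit_config(text):
    """Return (stale, added): removed keys still present, and where the new key is set."""
    # interpolation=None keeps '%' in values literal; strict=False tolerates duplicates.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(text)
    stale = []
    added = {}
    for section in parser.sections():
        for key in parser[section]:
            if key in REMOVED_KEYS:
                stale.append((section, key))
            elif key == ADDED_KEY:
                added[section] = parser[section][key]
    return stale, added


# Hypothetical config fragment, for illustration only.
SAMPLE = """
[replication]
log_private_batch_buffer_kb = 512
plog_force_flush = false
"""

stale, added = audit_config(SAMPLE)
print("stale keys:", stale)
print("plog_force_flush:", added)
```

The helper scans every section rather than only `[replication]`, so it makes no assumption about where the removed keys were placed.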

Performance Test

master branch

+--------------------------+----------+------------+--------+------------+--------------+--------------------------------------------------+------------------------------------------------+
|      operation_case      | run_time | throughput | length | read_write | thread_count |       read(qps|ave|min|max|95|99|999|9999)       |     write(qps|ave|min|max|95|99|999|9999)      |
+--------------------------+----------+------------+--------+------------+--------------+--------------------------------------------------+------------------------------------------------+
| write=single,read=single | 12982    | 23108      | 1000   | 0 : 1      | 15           | {0 0 0 0 0 0 0 0}                                | {23108 1944 412 515241 5265 10553 24484 38153} |
| write=single,read=single | 3695     | 244278     | 1000   | 1 : 0      | 50           | {244316 612 119 353535 969 2084 20393 35273}     | {0 0 0 0 0 0 0 0}                              |
| write=single,read=single | 5627     | 42651      | 1000   | 1 : 1      | 30           | {21328 1424 129 1111380 5469 20425 95060 155604} | {21326 2784 418 550228 7383 12399 37279 68009} |
| write=single,read=single | 5013     | 29915      | 1000   | 1 : 3      | 15           | {7479 1084 128 1347924 3065 15969 84169 141567}  | {22439 1639 407 255615 3979 7665 21596 37012}  |
| write=single,read=single | 3488     | 25793      | 1000   | 1 : 30     | 15           | {830 987 153 811689 2795 12929 75935 143828}     | {24966 1765 412 271700 4196 8500 19433 28137}  |
| write=single,read=single | 4093     | 73317      | 1000   | 3 : 1      | 30           | {54992 845 118 457727 2798 9303 36057 56009}     | {18328 2360 415 456191 5900 10385 30020 50996} |
| write=single,read=single | 4560     | 197394     | 1000   | 30 : 1     | 50           | {191035 732 119 172628 1461 4051 17439 27193}    | {6369 1470 440 130943 2480 5995 19623 30948}   |
+--------------------------+----------+------------+--------+------------+--------------+--------------------------------------------------+------------------------------------------------+

remove slog

+--------------------------+----------+------------+--------+------------+--------------+-----------------------------------------------+------------------------------------------------+
|      operation_case      | run_time | throughput | length | read_write | thread_count |     read(qps|ave|min|max|95|99|999|9999)      |     write(qps|ave|min|max|95|99|999|9999)      |
+--------------------------+----------+------------+--------+------------+--------------+-----------------------------------------------+------------------------------------------------+
| write=single,read=single | 10538    | 28464      | 1000   | 0 : 1      | 15           | {0 0 0 0 0 0 0 0}                             | {28466 1577 339 447572 4808 9529 21057 31903}  |
| write=single,read=single | 3420     | 265243     | 1000   | 1 : 0      | 50           | {265272 567 118 166548 884 1799 16427 23087}  | {0 0 0 0 0 0 0 0}                              |
| write=single,read=single | 4398     | 54556      | 1000   | 1 : 1      | 30           | {27282 724 123 179284 2433 7693 23727 33321}  | {27280 2567 351 474623 7309 12553 34591 60809} |
| write=single,read=single | 4071     | 36835      | 1000   | 1 : 3      | 15           | {9209 617 131 253225 1626 6852 21545 30025}   | {27629 1418 344 564393 4054 7904 19201 30495}  |
| write=single,read=single | 2831     | 31775      | 1000   | 1 : 30     | 15           | {1025 702 155 248788 1721 8023 29839 43103}   | {30757 1436 341 308393 4029 8521 19097 27551}  |
| write=single,read=single | 3410     | 87997      | 1000   | 3 : 1      | 30           | {66002 680 120 168404 1912 6281 21188 32169}  | {21999 2038 344 432895 5556 10164 30020 54239} |
| write=single,read=single | 4143     | 217312     | 1000   | 30 : 1     | 50           | {210345 668 116 179284 1374 3876 16865 23241} | {7010 1233 376 133353 2054 5217 18625 24799}   |
+--------------------------+----------+------------+--------+------------+--------------+-----------------------------------------------+------------------------------------------------+

comparison

The master branch is the baseline; each figure below is the relative change of the remove-slog branch against it.

+--------------+--------------+------------+------------------------+---------------------------+
|  read_write  | thread_count | throughput |  read(qps|ave|99|999)  |   write(qps|ave|99|999)   |
+--------------+--------------+------------+------------------------+---------------------------+
|    0 : 1     |     15       |    +23%    |           --           |   +23%|-19%|-10%|-14%     |
|    1 : 0     |     50       |     +9%    |   +9%| -7%|-14%|-19%   |         --                |
|    1 : 1     |     30       |    +28%    |  +28%|-49%|-62%|-75%   |   +28%| -8%| +1%| -7%     |
|    1 : 3     |     15       |    +23%    |  +23%|-43%|-57%|-74%   |   +23%|-13%| +3%|-11%     |
|    1 : 30    |     15       |    +23%    |  +23%|-29%|-38%|-61%   |   +23%|-19%| +0%| -2%     |
|    3 : 1     |     30       |    +20%    |  +20%|-21%|-32%|-41%   |   +20%|-14%| -2%| +0%     |
|    30 : 1    |     50       |    +10%    |  +10%| -9%| -4%| -3%   |   +10%|-16%|-13%| -5%     |
+--------------+--------------+------------+------------------------+---------------------------+
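
Each cell in this table is the relative change of the remove-slog run against master. As a worked check, the sketch below recomputes the `1 : 1` row from the raw figures in the two result tables. The names `rel_change` and `row` are only illustrative.

```python
def rel_change(base, new):
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


# Field order in the raw tables: qps|ave|min|max|95|99|999|9999.
QPS, AVE, P99, P999 = 0, 1, 5, 6

# Raw figures for the 1 : 1 case, copied from the two result tables.
master = {
    "throughput": 42651,
    "read": [21328, 1424, 129, 1111380, 5469, 20425, 95060, 155604],
    "write": [21326, 2784, 418, 550228, 7383, 12399, 37279, 68009],
}
remove_slog = {
    "throughput": 54556,
    "read": [27282, 724, 123, 179284, 2433, 7693, 23727, 33321],
    "write": [27280, 2567, 351, 474623, 7309, 12553, 34591, 60809],
}


def row(kind):
    """Percent changes for qps, ave, P99 and P999 of one operation kind."""
    return [
        round(rel_change(master[kind][i], remove_slog[kind][i]))
        for i in (QPS, AVE, P99, P999)
    ]


throughput = round(rel_change(master["throughput"], remove_slog["throughput"]))
print(f"throughput {throughput:+d}%")
print("read ", "|".join(f"{v:+d}%" for v in row("read")))
print("write", "|".join(f"{v:+d}%" for v in row("write")))
```

The printed values match the `1 : 1` row: +28% throughput, +28%|-49%|-62%|-75% for reads and +28%|-8%|+1%|-7% for writes.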

We can conclude that:

  • Throughput increases by about 20%
  • Read latency decreases by about 36%
  • Write latency decreases by about 8%
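
The PR does not say how these summary figures were derived. One plausible reading, offered here only as an assumption, is a plain average over the comparison table: throughput across all seven cases, and latency across the ave, P99 and P999 columns of the six cases that include that operation. The sketch below does that arithmetic.

```python
from statistics import mean

# Percent changes copied from the comparison table (remove-slog vs. master).
throughput = [23, 9, 28, 23, 23, 20, 10]

# (ave, P99, P999) latency changes per case; cases without that operation are omitted.
read_latency = [
    (-7, -14, -19),   # 1 : 0
    (-49, -62, -75),  # 1 : 1
    (-43, -57, -74),  # 1 : 3
    (-29, -38, -61),  # 1 : 30
    (-21, -32, -41),  # 3 : 1
    (-9, -4, -3),     # 30 : 1
]
write_latency = [
    (-19, -10, -14),  # 0 : 1
    (-8, 1, -7),      # 1 : 1
    (-13, 3, -11),    # 1 : 3
    (-19, 0, -2),     # 1 : 30
    (-14, -2, 0),     # 3 : 1
    (-16, -13, -5),   # 30 : 1
]


def flat_mean(rows):
    """Average every cell of a list of tuples with equal weight."""
    return mean(v for r in rows for v in r)


print(f"throughput:    {mean(throughput):+.1f}%")
print(f"read latency:  {flat_mean(read_latency):+.1f}%")
print(f"write latency: {flat_mean(write_latency):+.1f}%")
```

Under this reading the averages come out at roughly +19%, -35% and -8%, close to the rounded figures in the list.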

Function Test

2pc

  1. start cluster with 4 replica servers
➜  pegasus git:(remove-slog) ✗ ./run.sh start_onebox -r 4 -m 3
  2. insert data into app temp
>>> set a b c
OK

app_id          : 2
partition_index : 4
decree          : 29
server          : x.x.x.x:x
  3. kill the replica server that serves the data inserted before
  4. get data successfully
>>> get a b 
"c"

app_id          : 2
partition_index : 4
server          : 

learn

  1. insert some data
[zhaoliwei@c3-hadoop-build03 pegasus]$ ./run.sh shell -n c4tst-performance1
INFO: parse meta_list from /home/zhaoliwei/deployment-config/xiaomi-config/conf/pegasus/pegasus-c4tst-performance1.cfg
W2021-12-14 16:32:44.955 (1639470764955404384 4986) : overwrite default thread pool for task RPC_CM_QUERY_PARTITION_CONFIG_BY_INDEX from THREAD_POOL_META_SERVER to THREAD_POOL_DEFAULT
W2021-12-14 16:32:44.955 (1639470764955471446 4986) : overwrite default thread pool for task RPC_CM_QUERY_PARTITION_CONFIG_BY_INDEX_ACK from THREAD_POOL_META_SERVER to THREAD_POOL_DEFAULT
Pegasus Shell 2.2.2-remove-slog-plog
Type "help" for more information.
Type "Ctrl-D" or "Ctrl-C" to exit the shell.

The config file is: /home/zhaoliwei/workspace/pegasus/config-shell.ini.4962
The cluster name is: c4tst-performance1
The cluster meta list is: 10.132.15.13:15601,10.132.16.54:15601
>>> ls
[general_info]
app_id  status  app_name  app_type  partition_count  replica_count  is_stateful  create_time  drop_time  drop_expire  envs_count  

[summary]
total_app_count  : 0

>>> create test
create app test succeed, waiting for app ready
test not ready yet, still waiting... (0/4)
test not ready yet, still waiting... (0/4)
test not ready yet, still waiting... (0/4)
test not ready yet, still waiting... (0/4)
test not ready yet, still waiting... (0/4)
test not ready yet, still waiting... (0/4)
test not ready yet, still waiting... (0/4)
test not ready yet, still waiting... (0/4)
test not ready yet, still waiting... (0/4)
test not ready yet, still waiting... (0/4)
test not ready yet, still waiting... (0/4)
test is ready now: (4/4)
test is ready now!
create app "test" succeed
>>> use test
OK
>>> ls
[general_info]
app_id  status     app_name  app_type  partition_count  replica_count  is_stateful  create_time          drop_time  drop_expire  envs_count  
590     AVAILABLE  test      pegasus   4                3              true         2021-12-14_16:32:47  -          -            0           

[summary]
total_app_count  : 1

>>> set a b c 
OK

app_id          : 590
partition_index : 0
decree          : 11
server          : 
>>> set a c b 
OK

app_id          : 590
partition_index : 0
decree          : 13
server          : 
>>> set c b a
OK

app_id          : 590
partition_index : 2
decree          : 13
server          : 
>>> set c a b 
OK

app_id          : 590
partition_index : 2
decree          : 14
server          : 
>>> set b a c 
OK

app_id          : 590
partition_index : 3
decree          : 14
server          : 
>>> set b c a
OK

app_id          : 590
partition_index : 3
decree          : 15
server          : 
>>> set aa bb cc
OK

app_id          : 590
partition_index : 1
decree          : 15
server          : 
>>> set 1 2 3
OK

app_id          : 590
partition_index : 2
decree          : 18
server          : 
>>> set 12 231
USAGE:         set                     <hash_key> <sort_key> <value> [ttl_in_seconds]
>>> set 1 3 2
OK

app_id          : 590
partition_index : 2
decree          : 21
server          : 
>>> set 2 1 3
OK

app_id          : 590
partition_index : 1
decree          : 20
server          : 
>>> set 2 3 1
OK

app_id          : 590
partition_index : 1
decree          : 22
server          : 
>>> set 3 1 2
OK

app_id          : 590
partition_index : 0
decree          : 23
server          :
>>> set 3 2 1
OK

app_id          : 590
partition_index : 0
decree          : 24
server          : 
>>> exit
dsn exit with code 0
  2. stop one replica server
[zhaoliwei@c3-hadoop-build03 pegasus]$ ./scripts/pegasus_offline_node.sh c4tst-performance1 
pegasus_offline_node_list.sh  pegasus_offline_node.sh       
[zhaoliwei@c3-hadoop-build03 pegasus]$ ./scripts/pegasus_offline_node.sh c4tst-performance1 
UID=10085
PID=17758
Start time: Tue Dec 14 16:34:49 CST 2021

Generating /tmp/10085.17758.pegasus.rolling_update.rs.list...
Generating /tmp/10085.17758.pegasus.offline_node.cluster_info...
Generating /tmp/10085.17758.pegasus.offline_node.nodes...
Set meta level to steady...
Set lb.assign_delay_ms to 10...

==================================================================
==================================================================
Offline replica server task 0 of ...

Getting serving replica count...
servicing_replica_count=3

Migrating primary replicas out of node...
Wait [10.132.15.13:15801] to migrate done...
Refer to /tmp/10085.17758.pegasus.offline_node.migrate_node for details
Still 1 primary replicas left on 
Still 1 primary replicas left on 
Still 1 primary replicas left on 
Still 1 primary replicas left on 
Still 1 primary replicas left on 
Still 1 primary replicas left on 
Still 1 primary replicas left on 
Still 1 primary replicas left on 
Still 1 primary replicas left on 
Still 1 primary replicas left on 
Migrate done.

Downgrading replicas on node...
Wait [] to downgrade done...
Refer to /tmp/10085.17758.pegasus.offline_node.downgrade_node for details
Still 3 replicas left on 
Still 3 replicas left on 
Still 3 replicas left on 
Still 3 replicas left on
Still 3 replicas left on 
Still 3 replicas left on 
Still 3 replicas left on 
Downgrade done.

Send kill_partition to node...
Sent kill_partition to 3 partitions

Stop node by minos...
./deploy stop pegasus c4tst-performance1 --job replica --task 0 --skip_confirm
2021-12-14 16:35:12 Stopping task 0 of replica on 
2021-12-14 16:35:12 Stop task 0 of replica on  success
Stop node by minos done.

Wait cluster to become healthy...
Cluster not healthy, unhealthy_partition_count = 3
Cluster not healthy, unhealthy_partition_count = 3
Cluster not healthy, unhealthy_partition_count = 3
Cluster not healthy, unhealthy_partition_count = 3
Cluster not healthy, unhealthy_partition_count = 3
Cluster not healthy, unhealthy_partition_count = 3
Cluster not healthy, unhealthy_partition_count = 3
Cluster not healthy, unhealthy_partition_count = 3
Cluster not healthy, unhealthy_partition_count = 3
Cluster not healthy, unhealthy_partition_count = 3
Cluster becomes healthy

Set lb.assign_delay_ms to DEFAULT...

Offline replica server task 0 done.
Elapsed time is 127 seconds.
  3. get data successfully
>>> get aa bb 
"cc"

app_id          : 590
partition_index : 1
server          : 
>>> get 2 1 
"3"

app_id          : 590
partition_index : 1
server          : 
>>> get 2 3 
"1"

app_id          : 590
partition_index : 1
server          : 
>>> 
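
In the `set` replies of the learn test, `decree` advances separately for each `partition_index`. The sketch below is a hypothetical transcript parser, not a Pegasus tool. It assumes the reply layout shown in this session, groups the decrees by partition, and checks that each sequence is strictly increasing.

```python
import re
from collections import defaultdict

# Matches a "partition_index : N" line directly followed by a "decree : M" line.
REPLY = re.compile(r"partition_index\s*:\s*(\d+)\s*\n\s*decree\s*:\s*(\d+)")


def decrees_by_partition(transcript):
    """Map partition_index -> list of decrees, in the order they appear."""
    result = defaultdict(list)
    for partition, decree in REPLY.findall(transcript):
        result[int(partition)].append(int(decree))
    return dict(result)


def strictly_increasing(values):
    """True when every value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


# Three replies excerpted from the learn-test session.
TRANSCRIPT = """
app_id          : 590
partition_index : 0
decree          : 11
server          :
app_id          : 590
partition_index : 0
decree          : 13
server          :
app_id          : 590
partition_index : 2
decree          : 13
server          :
"""

for partition, decrees in sorted(decrees_by_partition(TRANSCRIPT).items()):
    print(partition, decrees, strictly_increasing(decrees))
```

Replies to `get` carry no `decree` line, so the pattern skips them.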

@levy5307 levy5307 added the type/config-change label (PR that made modification on configs, which should be noted in release note.) Jan 18, 2022
@github-actions github-actions bot changed the title from "feat: remove shared log" to "#859: feat: remove shared log" Jan 18, 2022
@github-actions github-actions bot left a comment

Hey! I noticed that your PR contained a reference to the ticket URL in the body but not in the title. I went ahead and updated that for you. Hope you don't mind! ☺️

@github-actions github-actions bot changed the title from "#859: feat: remove shared log" to "#859: #859: feat: remove shared log" Jan 19, 2022
@levy5307 levy5307 changed the title from "#859: #859: feat: remove shared log" to "#859: feat: remove shared log" Jan 19, 2022
@github-actions github-actions bot changed the title from "#859: feat: remove shared log" to "#859: #859: feat: remove shared log" Jan 19, 2022
@foreverneverer
Contributor

https://github.com/neofinancial/ticket-check-action#inputs

To stop the action from changing the title automatically, you may need to add a title format line in https://github.com/XiaoMi/rdsn/blob/master/.github/workflows/issue_ref.yaml#L35 like this:

titleFormat: %title%

@levy5307
Contributor Author

> https://github.com/neofinancial/ticket-check-action#inputs
>
> to avoid auto-change title, you may need add title format line in https://github.com/XiaoMi/rdsn/blob/master/.github/workflows/issue_ref.yaml#L35 like this:
>
> titleFormat: %title%

I will raise a new pull request to change it : )

@levy5307 levy5307 changed the title from "#859: #859: feat: remove shared log" to "feat: remove shared log" Jan 19, 2022
@github-actions github-actions bot changed the title from "feat: remove shared log" to "#859: feat: remove shared log" Jan 19, 2022
@github-actions github-actions bot changed the title from "#859: feat: remove shared log" to "#859: #859: feat: remove shared log" Jan 19, 2022
@github-actions github-actions bot changed the title from "#859: #859: feat: remove shared log" to "#859: #859: #859: feat: remove shared log" Jan 19, 2022
@github-actions github-actions bot changed the title from "#859: #859: #859: feat: remove shared log" to "#859: #859: #859: #859: feat: remove shared log" Jan 19, 2022
@github-actions github-actions bot changed the title from "#859: #859: #859: #859: feat: remove shared log" to "#859: #859: #859: #859: #859: feat: remove shared log" Jan 19, 2022
@github-actions github-actions bot changed the title from "#859: #859: #859: #859: #859: feat: remove shared log" to "#859: #859: #859: #859: #859: #859: feat: remove shared log" Jan 19, 2022
@github-actions github-actions bot changed the title from "#859: #859: #859: #859: #859: #859: feat: remove shared log" to "#859: #859: #859: #859: #859: #859: #859: feat: remove shared log" Jan 19, 2022
@levy5307 levy5307 changed the title from "#859: #859: #859: #859: #859: #859: #859: feat: remove shared log" to "#859: feat: remove shared log" Jan 19, 2022
@github-actions github-actions bot changed the title from "#859: feat: remove shared log" to "#859: #859: feat: remove shared log" Jan 19, 2022
@levy5307 levy5307 changed the title from "#859: #859: feat: remove shared log" to "feat: remove shared log" Jan 19, 2022
@github-actions github-actions bot changed the title from "feat: remove shared log" to "#859: feat: remove shared log" Jan 19, 2022
@foreverneverer foreverneverer changed the title from "#859: feat: remove shared log" to "feat: remove shared log" Jan 20, 2022
Labels
type/config-change PR that made modification on configs, which should be noted in release note.
3 participants