-
Notifications
You must be signed in to change notification settings - Fork 310
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DAOS-16386 utils: Add DDB Feature and RM_POOL Support #15062
Conversation
Ticket title is 'Enhance DDB for CR purpose' |
Test stage NLT on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-15062/1/testReport/ |
e1b57a1
to
7011933
Compare
Test stage Functional Hardware Medium completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-15062/3/testReport/ |
This PR enhances the DDB functionality for CR purposes with the following updates: 1. Pool Behavior Control: Administrators can now control certain vos pool behaviors, such as skipping vos pool loading or setting a vos pool to immutable mode. 2. Manual Pool Shard Removal: A new command ddb rm_pool <vos_pool> has been introduced, allowing administrators to manually remove pool shards. 3. SPDK Environment Initialization Bug Fix: Fixed an issue where spdk_env_init() would fail during reinitialization. These updates aim to improve system flexibility and stability, providing administrators with more robust management capabilities. Required-githooks: true Signed-off-by: Wang Shilong <shilong.wang@intel.com> Required-githooks: true
cc4c66f
to
d845fda
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, is there a reason we couldn't use go-flags for the go cli front-end and instead had to pull in "grumble" package?
I tried to follow @ryon-jensen , maybe he could help answer this. |
…S-16386 Required-githooks: true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Go changes look reasonable to me, but I'm not sure what everything in the ddb C code is doing. I think we need to get someone more experienced with ddb and/or vos in general to review the other portion of the changes.
out: | ||
return rc; | ||
} | ||
|
||
int | ||
ddb_main(struct ddb_io_ft *io_ft, int argc, char *argv[]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A little confused because I can't figure out what ddb_main
is used for. It seems to think it is the main function of ddb, but the Go stuff is the entry point now. No references to ddb_main
outside of this directory. Should it (and the superfluous main
function in ddb_entry.c) be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, i am also a bit confused about not sure about this, i think maybe we could remove it, but i'd like to keep it is for now, and remove it in the separate discussion if you are fine with this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I won't block either way. It would be great to prune unused code whenever we can.
src/include/daos_srv/vos.h
Outdated
@@ -20,6 +20,56 @@ | |||
#include <daos_srv/dtx_srv.h> | |||
#include <daos_srv/vos_types.h> | |||
|
|||
#define VOS_POOL_COMPAT_FLAG_IMMUTABLE (1ULL << 0) | |||
#define VOS_POOL_COMPAT_FLAG_SKIP_LOAD (1ULL << 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name "skip_load" seems some confused, skip load for what? According to the patch, seems the pool will not be started if the flag is set. So maybe called "skip_start" or just "skip_pool" or some more clear name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean skip load, let's change it to start start.
#define VOS_POOL_COMPAT_FLAG_IMMUTABLE (1ULL << 0) | ||
#define VOS_POOL_COMPAT_FLAG_SKIP_LOAD (1ULL << 1) | ||
#define VOS_POOL_COMPAT_FLAG_SKIP_REBUILD (1ULL << 2) | ||
#define VOS_POOL_COMPAT_FLAG_SKIP_DTX_RESYNC (1ULL << 3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rebuild depends on DTX resync, if DTX resync is skipped, then rebuild also needs to be skipped.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we only need FLAG to DTX resync then, it will disable rebuild.
ddb_run_rm_pool(struct ddb_ctx *ctx, struct rm_pool_options *opt) | ||
{ | ||
if (ddb_pool_is_open(ctx)) { | ||
ddb_error(ctx, "Must close pool before can open another\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What means? the pool to be destroyed is opened or some other pool is opened?
@@ -48,8 +48,18 @@ struct ddb_array { | |||
}; | |||
|
|||
/* Open and close a pool for a ddb_ctx */ | |||
int dv_pool_open(char *path, daos_handle_t *poh); | |||
int | |||
dv_pool_open(const char *path, daos_handle_t *poh, uint32_t flags); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please adjust related format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a bit annoying, this is auto-format by clang format, let me fix this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
git hooks required this..i leave it for now.
@@ -50,6 +49,35 @@ dv_pool_open(char *path, daos_handle_t *poh) | |||
return rc; | |||
} | |||
|
|||
int | |||
dv_pool_destroy(const char *path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you please to add some test cases for the pool destroy via ddb? On the other hand, since the DDB tool can be used for recovery from disaster, if SMD information does not match pool case or say SMD is corrupted, can we still operate on the pool by force?
src/utils/ddb/ddb_commands.c
Outdated
opt->clear_incompat_flags) | ||
ctx->dc_write_mode = true; | ||
else | ||
ctx->dc_write_mode = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the feature
option be used together with open
(with -w
) option?
…S-16386 Required-githooks: true
Required-githooks: true Signed-off-by: Wang Shilong <shilong.wang@intel.com>
Test stage Build on Leap 15.5 with Intel-C and TARGET_PREFIX completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-15062/6/execution/node/360/log |
Test stage Build on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-15062/6/execution/node/387/log |
Test stage Build RPM on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-15062/6/execution/node/273/log |
Test stage Build RPM on EL 9 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-15062/6/execution/node/363/log |
Test stage Build RPM on Leap 15.5 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-15062/6/execution/node/326/log |
Test stage Build DEB on Ubuntu 20.04 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-15062/6/execution/node/323/log |
Required-githooks: true Signed-off-by: Wang Shilong <shilong.wang@intel.com>
if (ddb_pool_is_open(ctx)) { | ||
if (!ctx->dc_write_mode && !opt->show_features) | ||
return -DER_INVAL; | ||
goto skip; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What will happen if the pool is opened without -w
option, but some {set,clear}_{in}compat_flags
is valid? What is the expected behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it will return -DER_INVAL? maybe should return NOPERM.
src/utils/ddb/ddb_commands.c
Outdated
{ | ||
if (ddb_pool_is_open(ctx)) { | ||
ddb_error(ctx, "Must close pool before can open another\n"); | ||
return -DER_EXIST; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You want to express that "the pool cannot be destroyed because it is opened" ? if yes, then -DER_BUSY
may be better. Or I mis-understand anything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-DER_BUSY is better.
if (str_has_value(pa->pa_r_cmd_run)) { | ||
rc = ddb_parse_cmd_str(ctx, pa->pa_r_cmd_run, open); | ||
if (rc) | ||
return rc; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*open = true
for else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will be assigned inside ddb_parse_cmd_str().
Required-githooks: true Signed-off-by: Wang Shilong <shilong.wang@intel.com>
Test stage Unit Test on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-15062/8/testReport/ |
Test stage Unit Test with memcheck on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-15062/8/testReport/ |
|
||
func featureCompleter(prefix string, args []string) []string { | ||
suggestions := []string{"-h", "-e", "--enable", "-d", "--disable", "-s", "--show"} | ||
suggestions = append(suggestions, listDirVos(defMntPrefix)...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
L38-L53 is identical to L57-L72 and should be extracted as a helper function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
@@ -223,3 +223,36 @@ func ddbDtxActAbort(ctx *DdbContext, path string, dtx_id string) error { | |||
/* Run the c code command */ | |||
return daosError(C.ddb_run_dtx_act_abort(&ctx.ctx, &options)) | |||
} | |||
|
|||
func ddbFeature(ctx *DdbContext, path string, enable string, disable string, show bool) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: don't need to specify type for sequential parameters that share type, e.g. can do path, enable, disable string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
Required-githooks: true Signed-off-by: Wang Shilong <shilong.wang@intel.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Go changes LGTM.
This PR enhances the DDB functionality for CR purposes with the following updates: 1. Pool Behavior Control: Administrators can now control certain vos pool behaviors, such as skipping vos pool loading or setting a vos pool to immutable mode. 2. Manual Pool Shard Removal: A new command ddb rm_pool <vos_pool> has been introduced, allowing administrators to manually remove pool shards. 3. SPDK Environment Initialization Bug Fix: Fixed an issue where spdk_env_init() would fail during reinitialization. These updates aim to improve system flexibility and stability, providing administrators with more robust management capabilities. Required-githooks: true Signed-off-by: Wang Shilong <shilong.wang@intel.com>
This PR enhances the DDB functionality for CR purposes with the following updates: 1. Pool Behavior Control: Administrators can now control certain vos pool behaviors, such as skipping vos pool loading or setting a vos pool to immutable mode. 2. Manual Pool Shard Removal: A new command ddb rm_pool <vos_pool> has been introduced, allowing administrators to manually remove pool shards. 3. SPDK Environment Initialization Bug Fix: Fixed an issue where spdk_env_init() would fail during reinitialization. These updates aim to improve system flexibility and stability, providing administrators with more robust management capabilities. Required-githooks: true Signed-off-by: Wang Shilong <shilong.wang@intel.com>
This PR enhances the DDB functionality for CR purposes with the following updates: 1. Pool Behavior Control: Administrators can now control certain vos pool behaviors, such as skipping vos pool loading or setting a vos pool to immutable mode. 2. Manual Pool Shard Removal: A new command ddb rm_pool <vos_pool> has been introduced, allowing administrators to manually remove pool shards. 3. SPDK Environment Initialization Bug Fix: Fixed an issue where spdk_env_init() would fail during reinitialization. These updates aim to improve system flexibility and stability, providing administrators with more robust management capabilities. Signed-off-by: Wang Shilong <shilong.wang@intel.com>
This PR enhances the DDB functionality for CR purposes with the following updates:
Administrators can now control certain vos pool behaviors, such as skipping vos pool loading or setting a vos pool to immutable mode.
A new command ddb rm_pool <vos_pool> has been introduced, allowing administrators to manually remove pool shards.
Fixed an issue where spdk_env_init() would fail during reinitialization.
These updates aim to improve system flexibility and stability, providing administrators with more robust management capabilities.
Required-githooks: true
Signed-off-by: Wang Shilong shilong.wang@intel.com
Required-githooks: true
Before requesting gatekeeper:
Features:
(orTest-tag*
) commit pragma was used or there is a reason documented that there are no appropriate tags for this PR.Gatekeeper: