-
Notifications
You must be signed in to change notification settings - Fork 310
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DAOS-16300 cart: Parse domain properly for multi-interface case #14864
Conversation
- parse out domain similarly to interface for multi-interface case Required-githooks: true Signed-off-by: Alexander A Oganezov <alexander.a.oganezov@intel.com>
Ticket title is 'cart: multi-interface case does not parse out D_DOMAIN correctly' |
Test stage Build DEB on Ubuntu 20.04 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14864/1/execution/node/304/log |
For cxi we use domain names and not interface names. Create context on interface was not accounting for this, causing wrong handling. Required-githooks: true Signed-off-by: Alexander A Oganezov <alexander.a.oganezov@intel.com>
e65cecb
to
ab31b18
Compare
Test stage Build DEB on Ubuntu 20.04 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14864/2/execution/node/357/log |
Test stage Build RPM on Leap 15.5 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14864/2/execution/node/347/log |
Test stage Build RPM on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14864/2/execution/node/356/log |
Required-githooks: true Signed-off-by: Alexander A Oganezov <alexander.a.oganezov@intel.com>
for purposes of get_info string generation Required-githooks: true Signed-off-by: Alexander A Oganezov <alexander.a.oganezov@intel.com>
|
||
if (interface) { | ||
if (provider == CRT_PROV_OFI_CXI) { | ||
D_INFO("Interface '%s' ignored for CXI. Using domain '%s' instead\n", | ||
interface, domain); | ||
|
||
/* Note: crt_provider_iface_str_get() returns interface name */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously with CXI (as it uses domain names not interface names) we would overwrite interface with domain values, but this causes issues with multi-interface feature that uses interface names as identifiers.
Instead crt_get_info_string() now special case handles CXI provider and only uses domain names there for it.
Required-githooks: true Signed-off-by: Alexander A Oganezov <alexander.a.oganezov@intel.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ftest changes LGTM
I have tested those changes on aurora and with that PR, it works as expected. |
/* treat not set and set to empty as the same */ | ||
no_iface = (iface_str == NULL || *iface_str == '\0') ? true : false; | ||
no_domain = (domain_str == NULL || *domain_str == '\0') ? true : false; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we'll have to simplify that, I don't really like that we have all this if (no_iface) and if (no_domain) etc :) it's just hard to understand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure. some of a mess here is due to us having to form a string from 4 parts, 3 of which are optional. It would have been simpler if we could just fill in 4 fields (provider, domain, interface, port) in a structure as an alternate way of specifying info to mercury:) i feel most providers map well to this 4, and for outliers like 'sm' users can still use string version.
/* CXI provider uses domain names for info string */ | ||
if (provider == CRT_PROV_OFI_CXI) | ||
iface_str = NULL; | ||
else |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we should improve that by separating init formats, cxi has its own format (cxi:) whereas other providers have IP format (<hostname/IP>:).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we should improve that by separating init formats, cxi has its own format (cxi:) whereas other providers have IP format (<hostname/IP>:).
yeah there are few things we can improve here, lets discuss next cart wg meeting
@@ -34,10 +34,12 @@ struct crt_grp_gdata; | |||
struct crt_na_config { | |||
int32_t noc_port; | |||
int noc_iface_total; | |||
int noc_domain_total; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does not seem aligned
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does not seem aligned
yeah, i think whole struct is badly aligned and clangformat is now forcing the right formatting on 1 field only:) will reformat next patch
- parse out domain similarly to interface for multi-interface case, as a coma-separated list of domains. - Bugfix: new interface-specific context APIs segfaulted with CXI provider specified, due to interface being previously discarded (cxi uses domain names instead). test scenario added to no_pmix_multi_ctx test case as well. - treat not set and 'set to empty' as the same for purposes of info_string generation. - add sanity check to make sure domain count matches interface count Signed-off-by: Alexander A Oganezov <alexander.a.oganezov@intel.com>
…) (#14880) - parse out domain similarly to interface for multi-interface case, as a coma-separated list of domains. - Bugfix: new interface-specific context APIs segfaulted with CXI provider specified, due to interface being previously discarded (cxi uses domain names instead). test scenario added to no_pmix_multi_ctx test case as well. - treat not set and 'set to empty' as the same for purposes of info_string generation. - add sanity check to make sure domain count matches interface count Signed-off-by: Alexander A Oganezov <alexander.a.oganezov@intel.com>
Required-githooks: true
Before requesting gatekeeper:
Features:
(orTest-tag*
) commit pragma was used or there is a reason documented that there are no appropriate tags for this PR.Gatekeeper: