Skip to content

Add internal IPC-consumer-only mode to the devdax provider #831

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

ldorau
Copy link
Contributor

@ldorau ldorau commented Oct 23, 2024

Description

Add IPC-consumer-only mode to devdax provider.

Requires:

Checklist

  • Code compiles without errors locally
  • All tests pass locally
  • CI workflows execute properly

@ldorau ldorau force-pushed the Fix_devdax_open_ipc_handle_and_devdax_put_ipc_handle branch from 82b8689 to d95cc97 Compare October 23, 2024 12:18
@lplewa
Copy link
Contributor

lplewa commented Oct 23, 2024

How this is supposed to work? Both devdax are a mirror?

@ldorau ldorau force-pushed the Fix_devdax_open_ipc_handle_and_devdax_put_ipc_handle branch from d95cc97 to f4c54a1 Compare October 23, 2024 14:00
@ldorau ldorau marked this pull request as ready for review October 23, 2024 14:00
@ldorau ldorau requested a review from a team as a code owner October 23, 2024 14:00
@ldorau
Copy link
Contributor Author

ldorau commented Oct 23, 2024

How this is supposed to work? Both devdax are a mirror?

It just works - out-of-the-box ;-) - check the CI please :-)

@lplewa
Copy link
Contributor

lplewa commented Oct 23, 2024

How this is supposed to work? Both devdax are a mirror?

It just works - out-of-the-box ;-) - check the CI please :-)

I asked how it works, not if CI pass ;)

If we have two processes which want to share allocation on devdax, do client process need extra devdax only to get allocation done on first one? If yes it is broken.

As far i know, IPC works as a pair {device, handle}, so both processes get handle to device the same device on their side, and then one process do allocation, create IPC handle to it, and sends it to second process. Second process uses pair {device1, handle} to receive pointer to this allocation.

But now the client process uses pair {device2, handle} to get allocation from device1. This have no sense.

But maybe i do not see something: @vinser52 - you are an IPC expert so maybe you can help.

BTW the same issue is related to #830 which is unfortunately has the same issue.

@vinser52
Copy link
Contributor

How this is supposed to work? Both devdax are a mirror?

It just works - out-of-the-box ;-) - check the CI please :-)

I asked how it works, not if CI pass ;)

If we have two processes which want to share allocation on devdax, do client process need extra devdax only to get allocation done on first one? If yes it is broken.

As far i know, IPC works as a pair {device, handle}, so both processes get handle to device the same device on their side, and then one process do allocation, create IPC handle to it, and sends it to second process. Second process uses pair {device1, handle} to receive pointer to this allocation.

But now the client process uses pair {device2, handle} to get allocation from device1. This have no sense.

But maybe i do not see something: @vinser52 - you are an IPC expert so maybe you can help.

BTW the same issue is related to #830 which is unfortunately has the same issue.

Yeah, I think @lplewa raised a very good question.

As far i know, IPC works as a pair {device, handle}, so both processes get handle to device the same device on their side, and then one process do allocation, create IPC handle to it, and sends it to second process. Second process uses pair {device1, handle} to receive pointer to this allocation.

Correct.

But now the client process uses pair {device2, handle} to get allocation from device1. This have no sense.

I agree with that concern. The general idea of IPC is (exactly like @lplewa described) to allow remote process to access the memory allocated by another process.

Now regarding how it is implemented in UMF. The memory pool is our main abstraction. To open IPC handle the same memory provider type should be used, e.g. if the IPC handle was created by the L0 provider, then the L0 provider should be used in another process to open this IPC handle. And this abstraction works fine with L0/CUDA/OS providers. For example to open L0 IPC handle the L0 context and L0 device are needed, so the user creates a L0 pool in a consumer process and specifies context/device as init parameters for the memory provider and it does not matter if a user is going to only open IPC handles in this pool or also going to allocate memory from this pool in a consumer process. The same init params (device/context) for the provider are needed in both scenarios: when you want only to open IPC handle and when you also want to allocate memory from that pool. The same is true for CUDA and OS providers.

But with FSDAX and File providers I think we have a gap that @lplewa highlighted. We need to think how to address it. Otherwise, it is inconvenient for the user: like @lplewa said, if a user wants only to open IPC handles from the remote process why user should care about choosing the right DAX device or file name? Maybe we should allow the creation of the instance of DAX provider without specifying the dax device if the user wants only to open IPC handles?

It just works - out-of-the-box ;-) - check the CI please :-)

I asked how it works, not if CI pass ;)

I think it is a good general statement (related not only to this PR). We should be use-case-driven, not CI-driven :) Our goal not to make CI-green but address customer needs. Whenever we are designing a new feature or fixing a bug we should think how it impacts the user, but CI is just a tool to validate the solution.
So let's not use "CI-green" as an argument when we discuss any code changes.

@lplewa
Copy link
Contributor

lplewa commented Oct 23, 2024

Maybe we should allow the creation of the instance of DAX provider without specifying the dax device if the user wants only to open IPC handles?

In UMF "device" (from {device, handle} pair) is provider. But in fact we do not need provider to open IPChandle, we need provider_params, so we know where memory is(FD/filepath/devicehandle), and how to "mmap it" (read/write permission etc...)

The issue what we are trying to solve here is that devdax/file provider is not multi-process safe(this is my guess, maybe there is another issue). So there should not be two processes activily using the same resource to allocate memory. To do so we have to store provider metadata on the device. So user should not create provider on the same file from two preceses. But we still need metadata from provider with protection/visibility flags.

I think that all ways requiring creating "real" provider to receive memory from another process is bad.
Instead of the provider handle, user should provide pair {ops, params} or we have allow user to create virtual/quasi/hollow provider(which from implementation point of view - only validates {ops, params} and stores it in the single structure)

User should be able to use either real provider of this virtual one (if user uses multi-process safe provider and want's to allocate and share memory from both processes, we should allow it without extra complication)

Also: We have to document which provider is multiprocess safe and which is not. Maybe we need API for it too, not sure.
Also: In case of file/dax provider we should Flock file, so user cannot create provider from two processes (on Windows we should use their voodoo magic to lock file, instead flock)

@ldorau
Copy link
Contributor Author

ldorau commented Oct 23, 2024

But with FSDAX and File providers I think we have a gap that @lplewa highlighted. We need to think how to address it. Otherwise, it is inconvenient for the user: like @lplewa said, if a user wants only to open IPC handles from the remote process why user should care about choosing the right DAX device or file name? Maybe we should allow the creation of the instance of DAX provider without specifying the dax device if the user wants only to open IPC handles?

Yes, that's right. I was already discussing that with @bratpiorka a bit, but we did not get to a solution yet.
We had only the same one idea ("Maybe we should allow the creation of the instance of DAX provider without specifying the dax device if the user wants only to open IPC handles?").

It just works - out-of-the-box ;-) - check the CI please :-)

I asked how it works, not if CI pass ;)

I think it is a good general statement (related not only to this PR). We should be use-case-driven, not CI-driven :) Our goal not to make CI-green but address customer needs. Whenever we are designing a new feature or fixing a bug we should think how it impacts the user, but CI is just a tool to validate the solution. So let's not use "CI-green" as an argument when we discuss any code changes.

Sure, that's obvious. It was just a joke. I was in hurry - I had to go home ASAP and I had no time to write more.

@vinser52
Copy link
Contributor

In UMF "device" (from {device, handle} pair) is provider. But in fact we do not need provider to open IPChandle, we need provider_params, so we know where memory is(FD/filepath/devicehandle), and how to "mmap it" (read/write permission etc...)

Using your analogy, the "device" is a memory provider instance. So another analogy I like is from the C++ world: memory provider/memory pool are classes (they implement some functionality) and provider/pool instances are objects. To open IPC handle we need not only params but also the implementation (which low-level API should be used, in case of OS provider it is mmap, in case of L0 it is L0-specific API, etc.). So memory provider instance incapsulates both: parameters and implementation.

The issue what we are trying to solve here is that devdax/file provider is not multi-process safe(this is my guess, maybe there is another issue). So there should not be two processes activily using the same resource to allocate memory. To do so we have to store provider metadata on the device. So user should not create provider on the same file from two preceses. But we still need metadata from provider with protection/visibility flags.

I do not think it is not multi-process issue. Using the same file or the same DAX device from multiple provider instances even within the same process is not supported. The File memory provider itself is multi-process safe if different files are used. Furthermore, if the user just wants to use memory provider instance to open IPC handles and never allocate from it than any filename/DAXdevice can be used to initialize the provider instance - it will not be used anyway if there are no allocations - but I agree that it is error-prone and we need somehow to address it.

Instead of the provider handle, user should provide pair {ops, params} or we have allow user to create virtual/quasi/hollow provider(which from implementation point of view - only validates {ops, params} and stores it in the single structure).

But what you described is exactly a memory provider instance. Memory provider instance is a combination of implementation (ops) and parameters.

I think that all ways requiring creating "real" provider to receive memory from another process is bad.
Instead of the provider handle, user should provide pair {ops, params} or we have allow user to create virtual/quasi/hollow provider(which from implementation point of view - only validates {ops, params} and stores it in the single structure)

User should be able to use either real provider of this virtual one (if user uses multi-process safe provider and want's to allocate and share memory from both processes, we should allow it without extra complication)

I think I suggested the similar idea, but instead of calling it "real" and "virtual" provider I just suggested in my previous comment the following:

if a user wants only to open IPC handles from the remote process why user should care about
choosing the right DAX device or file name? Maybe we should allow the creation of
the instance of DAX provider without specifying the dax device if the user wants only to open IPC handles?

Today, inside initialize callback of provider implementation we consider all params as required to be able to execute malloc and open_ipc_handle functions. But maybe we can do it "lazily", e.g. if malloc is called but the file name is not specified then it returns some error.

Also: We have to document which provider is multiprocess safe and which is not. Maybe we need API for it too, not sure.

I do not think we should use term "multi-process". As I described above, it is not a question about multiprocessing. Even inside a single process you can create two File providers with the same file name as an input parameter.

Also: In case of file/dax provider we should Flock file, so user cannot create provider from two processes (on Windows we should use their voodoo magic to lock file, instead flock)

For me it is overkill, describing such things in documentation should be enough. We cannot protect from everything - user should understand that using the same file/shared memory/devdax from multiple processes or even from multiple providers within the same process is not supported.

@ldorau ldorau marked this pull request as draft October 24, 2024 07:36
@ldorau ldorau changed the title Fix devdax_open_ipc_handle() [WIP] Fix some use cases of the devdax provider Oct 24, 2024
@lplewa
Copy link
Contributor

lplewa commented Oct 24, 2024

In UMF "device" (from {device, handle} pair) is provider. But in fact we do not need provider to open IPChandle, we need provider_params, so we know where memory is(FD/filepath/devicehandle), and how to "mmap it" (read/write permission etc...)

Using your analogy, the "device" is a memory provider instance. So another analogy I like is from the C++ world: memory provider/memory pool are classes (they implement some functionality) and provider/pool instances are objects. To open IPC handle we need not only params but also the implementation (which low-level API should be used, in case of OS provider it is mmap, in case of L0 it is L0-specific API, etc.). So memory provider instance incapsulates both: parameters and implementation.
My problem is that provider is quite heavy structure. There is a lot of overhead during initialization of the memory provider. Alocations, hwloc (in os provider), creation of trees. In case of dax provider we even mmap entire device. While when you read code IPC code of those providers, you can notice that IPC code is kind of separate, from provider structure, and we need it only "protection and visibility flags for mmap"

The issue what we are trying to solve here is that devdax/file provider is not multi-process safe(this is my guess, maybe there is another issue). So there should not be two processes activily using the same resource to allocate memory. To do so we have to store provider metadata on the device. So user should not create provider on the same file from two preceses. But we still need metadata from provider with protection/visibility flags.

I do not think it is not multi-process issue. Using the same file or the same DAX device from multiple provider instances even within the same process is not supported. The File memory provider itself is multi-process safe if different files are used. Furthermore, if the user just wants to use memory provider instance to open IPC handles and never allocate from it than any filename/DAXdevice can be used to initialize the provider instance - it will not be used anyway if there are no allocations - but I agree that it is error-prone and we need somehow to address it.

You are right - it's not only multi-process issue but resource sharing between multiple providers issue.

Instead of the provider handle, user should provide pair {ops, params} or we have allow user to create virtual/quasi/hollow provider(which from implementation point of view - only validates {ops, params} and stores it in the single structure).

But what you described is exactly a memory provider instance. Memory provider instance is a combination of implementation (ops) and parameters.

Difference is in calling initialize function of provider. And allowing to do alloc from provider, while on this virtual one it is forbiden (as we are not calling init)

I think that all ways requiring creating "real" provider to receive memory from another process is bad.
Instead of the provider handle, user should provide pair {ops, params} or we have allow user to create virtual/quasi/hollow provider(which from implementation point of view - only validates {ops, params} and stores it in the single structure)

User should be able to use either real provider of this virtual one (if user uses multi-process safe provider and want's to allocate and share memory from both processes, we should allow it without extra complication)

I think I suggested the similar idea, but instead of calling it "real" and "virtual" provider I just suggested in my previous comment the following:

if a user wants only to open IPC handles from the remote process why user should care about
choosing the right DAX device or file name? Maybe we should allow the creation of
the instance of DAX provider without specifying the dax device if the user wants only to open IPC handles?

Today, inside initialize callback of provider implementation we consider all params as required to be able to execute malloc and open_ipc_handle functions. But maybe we can do it "lazily", e.g. if malloc is called but the file name is not specified then it returns some error.

This is just ugly solution. Some providers you have to create exactly in the same way on client side like it was created on producer side(os/Cuda/L0), but for other provider there is need to create a special one. This require good documentation, examples per provider, and as users are not good in reading those, it will still create confusion. An idea of UMF is "unify", memory allocation. Ofcourse providers differs, with configuration parameters, but all providers should have similar "vibe", so if user knows how to use one provider, it should be able to use another one. So if user need special provider for IPC open, it should require special one for each provider. But with you solution, if user has single function in their code base, to create OSprovider, shared between producer and consumer code, and want's to swich to devdax provider, they need not only change this function, but add extra logic to it do distinguish between producer and consumer provider, and adjust its creation params.

Also: In case of file/dax provider we should Flock file, so user cannot create provider from two processes (on Windows we should use their voodoo magic to lock file, instead flock)

For me it is overkill, describing such things in documentation should be enough. We cannot protect from everything - user should understand that using the same file/shared memory/devdax from multiple processes or even from multiple providers within the same process is not supported.

This is quite simple feature. Just calling one extra function on linux (and adding extra parameter to open at windows). We should try to protect user from simple mistakes if it is easy on our side.

I'm not super attached to this virtual provider for IPC(it has issue with IPC cache in the tracking provider), but i really dislike idea of NULL path as a workaround.

@ldorau ldorau force-pushed the Fix_devdax_open_ipc_handle_and_devdax_put_ipc_handle branch from f4c54a1 to 2d51761 Compare October 24, 2024 13:26
@ldorau ldorau changed the title [WIP] Fix some use cases of the devdax provider Add IPC-consumer-only mode to the devdax provider Oct 24, 2024
@ldorau ldorau force-pushed the Fix_devdax_open_ipc_handle_and_devdax_put_ipc_handle branch from 2d51761 to 1e838cd Compare October 24, 2024 13:30
@ldorau
Copy link
Contributor Author

ldorau commented Oct 24, 2024

@vinser52 @lplewa please review the latest draft.

@ldorau ldorau changed the title Add IPC-consumer-only mode to the devdax provider Add internal IPC-consumer-only mode to the devdax provider Oct 24, 2024
@ldorau ldorau force-pushed the Fix_devdax_open_ipc_handle_and_devdax_put_ipc_handle branch 3 times, most recently from 8662870 to 8251f01 Compare October 24, 2024 20:57
@bratpiorka
Copy link
Contributor

@vinser52 There is one thing I need to admit that I didn't understand from the beginning of the IPC implementation. In function umfOpenIPCHandle() we have to pass a pool as a parameter "Pool handle where to open the the IPC handle" - but it doesn't mean that this allocation would be added to the pool right? I mean I can't free this ptr and then reuse this memory in the pool?
Anyway I thought - what if we could allow to pass NULL here + provide more data in the handle (e.g. provider parameters) - then using data from the handle we would know how to open given ptr and user wouldn't have to create a pool/provider with specific parameters at all.

@ldorau ldorau force-pushed the Fix_devdax_open_ipc_handle_and_devdax_put_ipc_handle branch 2 times, most recently from 7a94dbb to 2fb942a Compare October 25, 2024 10:10
Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Add utils_align_ptr_down_size_up() to align
a pointer down and a size up (for mmap()/munmap()).

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
@ldorau ldorau force-pushed the Fix_devdax_open_ipc_handle_and_devdax_put_ipc_handle branch from 2fb942a to c9fd02c Compare October 25, 2024 13:43
@ldorau
Copy link
Contributor Author

ldorau commented Oct 25, 2024

Rebased

devdax_open_ipc_handle() has to use the path of the remote
/dev/dax got from the IPC handle, not the local one.

devdax_open_ipc_handle() has to use also the memory protection
got from the IPC handle, so let's add the memory protection
to the IPC handle.

devdax_open_ipc_handle() should mmap only the required range of memory,
not the whole /dev/dax device, so let's add the length of the allocation
to the IPC handle.

devdax_close_ipc_handle() should unmap only the required range of memory,
not the whole /dev/dax device.

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Change arguments of the ipc_devdax_prov test:
- the producer uses the devdax device (given by
  UMF_TESTS_DEVDAX_PATH and UMF_TESTS_DEVDAX_SIZE variables)
- the consumer is started in the IPC-consumer-only mode
  with no devdax device given (path==NULL && size==0).

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
@ldorau ldorau force-pushed the Fix_devdax_open_ipc_handle_and_devdax_put_ipc_handle branch from c9fd02c to 07fbc39 Compare October 25, 2024 15:09
@ldorau ldorau closed this Oct 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants