-
Notifications
You must be signed in to change notification settings - Fork 35
Add internal IPC-consumer-only mode to the devdax provider #831
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add internal IPC-consumer-only mode to the devdax provider #831
Conversation
82b8689
to
d95cc97
Compare
How this is supposed to work? Both devdax are a mirror? |
d95cc97
to
f4c54a1
Compare
It just works - out-of-the-box ;-) - check the CI please :-) |
I asked how it works, not if CI pass ;) If we have two processes which want to share allocation on devdax, do client process need extra devdax only to get allocation done on first one? If yes it is broken. As far i know, IPC works as a pair {device, handle}, so both processes get handle to device the same device on their side, and then one process do allocation, create IPC handle to it, and sends it to second process. Second process uses pair {device1, handle} to receive pointer to this allocation. But now the client process uses pair {device2, handle} to get allocation from device1. This have no sense. But maybe i do not see something: @vinser52 - you are an IPC expert so maybe you can help. BTW the same issue is related to #830 which is unfortunately has the same issue. |
Yeah, I think @lplewa raised a very good question.
Correct.
I agree with that concern. The general idea of IPC is (exactly like @lplewa described) to allow remote process to access the memory allocated by another process. Now regarding how it is implemented in UMF. The memory pool is our main abstraction. To open IPC handle the same memory provider type should be used, e.g. if the IPC handle was created by the L0 provider, then the L0 provider should be used in another process to open this IPC handle. And this abstraction works fine with L0/CUDA/OS providers. For example to open L0 IPC handle the L0 context and L0 device are needed, so the user creates a L0 pool in a consumer process and specifies context/device as init parameters for the memory provider and it does not matter if a user is going to only open IPC handles in this pool or also going to allocate memory from this pool in a consumer process. The same init params (device/context) for the provider are needed in both scenarios: when you want only to open IPC handle and when you also want to allocate memory from that pool. The same is true for CUDA and OS providers. But with FSDAX and File providers I think we have a gap that @lplewa highlighted. We need to think how to address it. Otherwise, it is inconvenient for the user: like @lplewa said, if a user wants only to open IPC handles from the remote process why user should care about choosing the right DAX device or file name? Maybe we should allow the creation of the instance of DAX provider without specifying the dax device if the user wants only to open IPC handles?
I think it is a good general statement (related not only to this PR). We should be use-case-driven, not CI-driven :) Our goal not to make CI-green but address customer needs. Whenever we are designing a new feature or fixing a bug we should think how it impacts the user, but CI is just a tool to validate the solution. |
In UMF "device" (from {device, handle} pair) is provider. But in fact we do not need provider to open IPChandle, we need provider_params, so we know where memory is(FD/filepath/devicehandle), and how to "mmap it" (read/write permission etc...) The issue what we are trying to solve here is that devdax/file provider is not multi-process safe(this is my guess, maybe there is another issue). So there should not be two processes activily using the same resource to allocate memory. To do so we have to store provider metadata on the device. So user should not create provider on the same file from two preceses. But we still need metadata from provider with protection/visibility flags. I think that all ways requiring creating "real" provider to receive memory from another process is bad. User should be able to use either real provider of this virtual one (if user uses multi-process safe provider and want's to allocate and share memory from both processes, we should allow it without extra complication) Also: We have to document which provider is multiprocess safe and which is not. Maybe we need API for it too, not sure. |
Yes, that's right. I was already discussing that with @bratpiorka a bit, but we did not get to a solution yet.
Sure, that's obvious. It was just a joke. I was in hurry - I had to go home ASAP and I had no time to write more. |
Using your analogy, the "device" is a memory provider instance. So another analogy I like is from the C++ world: memory provider/memory pool are classes (they implement some functionality) and provider/pool instances are objects. To open IPC handle we need not only params but also the implementation (which low-level API should be used, in case of OS provider it is mmap, in case of L0 it is L0-specific API, etc.). So memory provider instance incapsulates both: parameters and implementation.
I do not think it is not multi-process issue. Using the same file or the same DAX device from multiple provider instances even within the same process is not supported. The File memory provider itself is multi-process safe if different files are used. Furthermore, if the user just wants to use memory provider instance to open IPC handles and never allocate from it than any filename/DAXdevice can be used to initialize the provider instance - it will not be used anyway if there are no allocations - but I agree that it is error-prone and we need somehow to address it.
But what you described is exactly a memory provider instance. Memory provider instance is a combination of implementation (ops) and parameters.
I think I suggested the similar idea, but instead of calling it "real" and "virtual" provider I just suggested in my previous comment the following:
Today, inside initialize callback of provider implementation we consider all params as required to be able to execute
I do not think we should use term "multi-process". As I described above, it is not a question about multiprocessing. Even inside a single process you can create two File providers with the same file name as an input parameter.
For me it is overkill, describing such things in documentation should be enough. We cannot protect from everything - user should understand that using the same file/shared memory/devdax from multiple processes or even from multiple providers within the same process is not supported. |
You are right - it's not only multi-process issue but resource sharing between multiple providers issue.
Difference is in calling initialize function of provider. And allowing to do alloc from provider, while on this virtual one it is forbiden (as we are not calling init)
This is just ugly solution. Some providers you have to create exactly in the same way on client side like it was created on producer side(os/Cuda/L0), but for other provider there is need to create a special one. This require good documentation, examples per provider, and as users are not good in reading those, it will still create confusion. An idea of UMF is "unify", memory allocation. Ofcourse providers differs, with configuration parameters, but all providers should have similar "vibe", so if user knows how to use one provider, it should be able to use another one. So if user need special provider for IPC open, it should require special one for each provider. But with you solution, if user has single function in their code base, to create OSprovider, shared between producer and consumer code, and want's to swich to devdax provider, they need not only change this function, but add extra logic to it do distinguish between producer and consumer provider, and adjust its creation params.
This is quite simple feature. Just calling one extra function on linux (and adding extra parameter to open at windows). We should try to protect user from simple mistakes if it is easy on our side. I'm not super attached to this virtual provider for IPC(it has issue with IPC cache in the tracking provider), but i really dislike idea of NULL path as a workaround. |
f4c54a1
to
2d51761
Compare
2d51761
to
1e838cd
Compare
8662870
to
8251f01
Compare
@vinser52 There is one thing I need to admit that I didn't understand from the beginning of the IPC implementation. In function umfOpenIPCHandle() we have to pass a pool as a parameter "Pool handle where to open the the IPC handle" - but it doesn't mean that this allocation would be added to the pool right? I mean I can't free this ptr and then reuse this memory in the pool? |
7a94dbb
to
2fb942a
Compare
Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Add utils_align_ptr_down_size_up() to align a pointer down and a size up (for mmap()/munmap()). Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
2fb942a
to
c9fd02c
Compare
Rebased |
devdax_open_ipc_handle() has to use the path of the remote /dev/dax got from the IPC handle, not the local one. devdax_open_ipc_handle() has to use also the memory protection got from the IPC handle, so let's add the memory protection to the IPC handle. devdax_open_ipc_handle() should mmap only the required range of memory, not the whole /dev/dax device, so let's add the length of the allocation to the IPC handle. devdax_close_ipc_handle() should unmap only the required range of memory, not the whole /dev/dax device. Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Change arguments of the ipc_devdax_prov test: - the producer uses the devdax device (given by UMF_TESTS_DEVDAX_PATH and UMF_TESTS_DEVDAX_SIZE variables) - the consumer is started in the IPC-consumer-only mode with no devdax device given (path==NULL && size==0). Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
c9fd02c
to
07fbc39
Compare
Description
Add IPC-consumer-only mode to devdax provider.
Requires:
Checklist