-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix inode eviction and sync writeback deadlock on Linux #17159
base: master
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1149,10 +1149,21 @@ zfs_get_data(void *arg, uint64_t gen, lr_write_t *lr, char *buf, | |
ASSERT3P(lwb, !=, NULL); | ||
ASSERT3U(size, !=, 0); | ||
|
||
error = zfs_zget_impl(zfsvfs, object, &zp, B_TRUE); | ||
#if defined(__linux__) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I added ifdef because I'm not sure if it's possible to return EAGAIN on freebsd or not. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. perhaps it would be more clean to enhance |
||
/* | ||
* Under Linux, EAGAIN indicates the inode is being evicted and I_SYNC | ||
* is also set possibly blocking eviction, so we can't loop in | ||
* zfs_zget to avoid deadlock. Return EIO to force txg sync under such | ||
* scenario. | ||
*/ | ||
if (error == EAGAIN) | ||
return (SET_ERROR(EIO)); | ||
#endif | ||
/* | ||
* Nothing to do if the file has been removed | ||
*/ | ||
if (zfs_zget(zfsvfs, object, &zp) != 0) | ||
if (error) | ||
return (SET_ERROR(ENOENT)); | ||
if (zp->z_unlinked) { | ||
/* | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe you can move this before the
igrab
, take thei_lock
and check forI_FREEING
andI_SYNC
and if they are both set, bail out of the originalzfs_zget
without the need forzfs_zget_impl
- have you already considered it? am I overlooking something?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only place that needs bail out is if you are in writeback context, either you are the one setting I_SYNC or whoever set it is blocked by you.
If you do bailout for other caller then you will get errors randomly in all sorts of places for no good reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I_SYNC
is set in two kernel functions,writeback_sb_inodes
,writeback_single_inode
. So if you check forI_SYNC
andI_FREEING
and you usei_lock
correctly as you're supposed to, you get more clean and also reliable solution. How is what am saying wrong specifically and why? Which tons of errors?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@snajpa
I_FREEING and I_SYNC is set on the inode. Every one trying to do zfs_zget will see them. So if you don't have any restriction on caller then every one will bailout and got error as long as eviction and writeback is happening at the same time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well.. I tried playing around with it and it seems like it would be an interesting project to get to the bottom of maybe few other lurking bugs, forcing one to go around and look at how errors from
zfs_zget
are handled all over the codebase - I think there's a thing in direct IO I'll try to tackle if I find time; but that's not relevant to resolving the deadlock, so, yeah, no easy way other than the_impl
split :)