Hung pool after upgrade from 2.1.5 to 2.2.2 when performing a lot of writes #17091
Comments
Could https://www.reddit.com/r/zfs/comments/1i1g9bn/silent_data_loss_while_confirming_writes/ be relevant?
I don't think so since I'm not using ZFS encryption nor have any sort of special SLOG setup. But the
Just posted pretty much the same post to the ZFS mailing list in case it gets more visibility there.
It would be good to upgrade to 2.2.7, actually. I personally don't have much interest in reviewing again everything we fixed during that year of development that could possibly help here.
He's running Ubuntu 24.04; ZFS 2.2.2 is what comes with it. Last changed April 24 when they patched it to support kernel 6.8. Ubuntu doesn't believe in tracking ZFS development. The explanation is that not enough testing is done to meet their standards. They also prefer to take a few patches rather than update to the current version. As a result, their ZFS is always pretty questionable. To upgrade he'll have to build ZFS himself: https://openzfs.github.io/openzfs-docs/Developer%20Resources/Building%20ZFS.html What happened to the project to supply current packages for Ubuntu?
z_upgrade is for updating filesystem accounting metadata after receiving datastreams, to have the correct figures for quotas and the like... see #9410 (comment) Seems like you're running into the
he mentioned in #9410 (comment) Do you by chance know what new feature flags have been activated by your Does the system receive new datasets/incrementals on a regular basis? Nevertheless: @behlendorf z_upgrade should not deadlock with writing to filesystems, should it? |
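One way to answer the feature-flag question is to list the flags the pool currently reports as `active`. This is only a rough sketch: the helper name `active_features` and the `POOL` placeholder are made up for illustration, and the output shows which flags are in use now, not which ones the upgrade newly activated.

```shell
# Filter tab-separated "property<TAB>value" lines, as printed by
# `zpool get -H -o property,value`, down to feature flags whose value
# is "active" (in use on disk). `active_features` is a hypothetical name.
active_features() {
    awk -F'\t' '$1 ~ /^feature@/ && $2 == "active" { print $1 }'
}

# Usage (POOL is a placeholder for the real pool name):
#   zpool get -H -o property,value all "$POOL" | active_features
```

Flags reported as `enabled` are available but not yet in use, so they are filtered out here.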
I'm also having this issue on Ubuntu 24.04.2, kernel 6.8.0-55-generic on amd64. I tried ZFS versions 2.2.7 and 2.3.0, but the issue persisted. I cannot get more than about 4 days of uptime before needing to reboot, so the issue is quite reproducible. I saw someone mention SLOG, which I am using with LVM on SSD. I have removed the SLOG and am now seeing whether that helps.
System information
Describe the problem you're observing
I have a storage server with 9 vdevs, each with 10 drives in a `raidz2` configuration. 6 of the vdevs have 12 TB drives, 3 have 22 TB ones.
Last week I updated all packages on the Ubuntu 22.04 server, shut off `smbd`, upgraded from Ubuntu 22.04 -> Ubuntu 24.04, and rebooted.
Everything looked normal: the pool was healthy (per `zpool status -v`) and `sudo dmesg` was clean.
I upgraded the pool with `zpool upgrade`. Everything still looked good.
Upon enabling `smbd`, the pool deadlocked within a few minutes. Using `zpool iostat -v 1` I could see there was no I/O happening on any drive.
`ps aux`:
I figured something random was happening, so I rebooted the server, but the issue immediately came back. I also figured maybe `z_upgrade` needed to complete, but I have seen online that it is safe to let it run in the background.
My running theory is that this is caused by writes rather than large amounts of reads. I tried making the `smb` share with the most I/O `read-only` to reduce I/O, and upon starting `smbd` the server stayed live for almost an hour before a user tried to delete a large folder and it deadlocked again. Rebooting always fixes it.
Describe how to reproduce the problem
Upgrade server from Ubuntu 22.04 -> Ubuntu 24.04.
Upgrade Pool
Have 1+ Samba shares, possibly performing a large amount of writes
Include any warning/errors/backtraces from the system logs
`sudo dmesg` showing errors of the deadlock
`sudo systemctl status smbd` shows no errors on `smb`'s side.
What info can I provide to help debug this? I'm also curious if there's a way to get an ETA for `z_upgrade`. While I've read it's fine to let that continue in the background, maybe I need to wait for it to finish before any real writes/reads?
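On the question of what to provide: when a pool hangs like this, a list of the tasks stuck in uninterruptible sleep (state `D`) together with their kernel stacks is usually a useful starting point. The following is a minimal sketch rather than an official procedure. The helper name `blocked_tasks` is made up for illustration, and the commands in the usage comments assume a Linux system with sysrq enabled and OpenZFS loaded.

```shell
# Read `ps -eo pid=,stat=,comm=` style lines on stdin and print only the
# tasks whose state starts with "D" (uninterruptible sleep), with fields
# normalised to single spaces. `blocked_tasks` is a hypothetical name.
blocked_tasks() {
    awk '$2 ~ /^D/ { $1 = $1; print }'
}

# Usage while the pool is hung:
#   ps -eo pid=,stat=,comm= | blocked_tasks
#   sudo cat /proc/<PID>/stack               # kernel stack of one blocked task
#   echo w | sudo tee /proc/sysrq-trigger    # dump all blocked tasks to dmesg
#   sudo cat /proc/spl/kstat/zfs/dbgmsg      # OpenZFS internal debug log
```

Capturing this output during the hang, before rebooting, shows where the blocked threads are waiting.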