
Hung pool after upgrade from 2.1.5 to 2.2.2 when performing a lot of writes #17091

Open
phillip-stephens opened this issue Feb 25, 2025 · 7 comments
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)

Comments

@phillip-stephens

phillip-stephens commented Feb 25, 2025

System information

Type                  Version/Name
Distribution Name     Ubuntu Server
Distribution Version  24.04
Kernel Version        6.8.0-53-generic
Architecture          amd64
OpenZFS Version       zfs-2.2.2-0ubuntu9.1

Describe the problem you're observing

I have a storage server with 9 raidz2 vdevs of 10 drives each. 6 of the vdevs have 12 TB drives, 3 have 22 TB drives.
Last week I updated all packages on the Ubuntu 22.04 server, shut off smbd, upgraded from Ubuntu 22.04 to Ubuntu 24.04, and rebooted.
Everything looked normal: the pool was healthy (per zpool status -v) and sudo dmesg was clean.
I then upgraded the pool with zpool upgrade. Everything still looked good.
After enabling smbd, the system deadlocked within a few minutes. Using zpool iostat -v 1 I could see there was no I/O happening on any drive.
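
A stalled sync pipeline can also be confirmed from the per-pool txg kstat. A minimal sketch, assuming a placeholder pool name of tank and the standard SPL proc interface:

# The newest transaction groups stop advancing toward the synced state while the pool is wedged
cat /proc/spl/kstat/zfs/tank/txgs | tail -n 5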

ps aux:

pste@storage:~$ ps aux | grep "D"
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root        2901  4.7  0.0      0     0 ?        D    08:39  17:03 [z_upgrade]
root        4484  0.0  0.0      0     0 ?        D    08:39   0:00 [txg_quiesce]
root        6863  0.0  0.0  12196  8000 ?        Ss   08:48   0:00 sshd: /usr/sbin/sshd -D [listener] 0 of 10-100 startups
kr      7436  0.0  0.0 132332 23756 ?        D    08:56   0:00 smbd: client [10.216.3.128]
as        7464  0.0  0.0 131708 28032 ?        D    08:56   0:00 smbd: client [10.216.0.37]
zd        7529  0.0  0.0 123512 17152 ?        D    08:56   0:00 smbd: client [10.216.3.160]
qa    7541  0.0  0.0 120468 39088 ?        D    08:56   0:06 smbd: client [10.216.3.49]
qu      7549  3.1  0.0 155548 48848 ?        D    08:56  10:48 smbd: client [10.216.0.20]
kr     12565  0.0  0.0 119112 21272 ?        D    11:18   0:00 smbd: client [10.216.2.157]
pste+   17343  0.0  0.0   9144  1920 pts/0    S+   14:37   0:00 grep --color=auto D

I figured it was something transient, so I rebooted the server, but the issue immediately came back. I also wondered whether z_upgrade needed to complete first, but from what I've read online it is safe to let it run in the background.
My running theory is that this is caused by writes rather than by large amounts of reads. I made the smb share with the most I/O read-only to reduce write traffic, and after starting smbd the server stayed responsive for almost an hour, until a user tried to delete a large folder and it deadlocked again. Rebooting always fixes it.
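
Next time it hangs, the stuck kernel threads can be inspected directly. A sketch, using the PIDs from the ps output above (reading /proc/<pid>/stack needs root):

sudo cat /proc/2901/stack   # kernel stack of the z_upgrade thread
sudo cat /proc/4484/stack   # kernel stack of the txg_quiesce thread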

Describe how to reproduce the problem

Upgrade the server from Ubuntu 22.04 to Ubuntu 24.04.
Upgrade the pool (zpool upgrade).
Have one or more Samba shares performing a large amount of writes (a synthetic stand-in for that workload is sketched below).
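
A synthetic write-heavy workload that stands in for the Samba traffic, in case it helps reproduction; this is a sketch with made-up paths and sizes, not a confirmed reproducer:

# Create and then bulk-delete many small files on the pool
mkdir -p /tank/share/stress
for i in $(seq 1 10000); do
    dd if=/dev/urandom of=/tank/share/stress/f$i bs=1M count=4 status=none
done
rm -rf /tank/share/stress   # a bulk delete similar to the one that re-triggered the hang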

Include any warning/errors/backtraces from the system logs

Output of sudo dmesg showing the hung-task warnings from the deadlock:

Feb 24 08:40:00.406515 storage-01 kernel: RPC: Registered tcp-with-tls transport module.
Feb 24 08:40:00.406551 storage-01 kernel: RPC: Registered tcp NFSv4.1 backchannel transport module.
Feb 24 08:40:00.575078 storage-01 kernel: Process accounting resumed
Feb 24 08:40:00.928108 storage-01 systemd-journald[903]: /var/log/journal/4d9214db14634e1a802b475d9bfe0478/user-1038.journal: Journal file uses a different sequence number ID, rotating.
Feb 24 10:44:09.484041 storage-01 kernel: INFO: task txg_quiesce:4484 blocked for more than 122 seconds.
Feb 24 10:44:09.491210 storage-01 kernel:       Tainted: P           O       6.8.0-53-generic #55-Ubuntu
Feb 24 10:44:09.491334 storage-01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 24 10:44:09.491417 storage-01 kernel: task:txg_quiesce     state:D stack:0     pid:4484  tgid:4484  ppid:2      flags:0x00004000
Feb 24 10:44:09.491517 storage-01 kernel: Call Trace:
Feb 24 10:44:09.491592 storage-01 kernel:  <TASK>
Feb 24 10:44:09.491655 storage-01 kernel:  __schedule+0x27c/0x6b0
Feb 24 10:44:09.491784 storage-01 kernel:  schedule+0x33/0x110
Feb 24 10:44:09.491896 storage-01 kernel:  cv_wait_common+0x102/0x140 [spl]
Feb 24 10:44:09.491973 storage-01 kernel:  ? __pfx_autoremove_wake_function+0x10/0x10
Feb 24 10:44:09.492069 storage-01 kernel:  __cv_wait+0x15/0x30 [spl]
Feb 24 10:44:09.492130 storage-01 kernel:  txg_quiesce+0x181/0x1f0 [zfs]
Feb 24 10:44:09.492205 storage-01 kernel:  txg_quiesce_thread+0xd2/0x120 [zfs]
Feb 24 10:44:09.492281 storage-01 kernel:  ? __pfx_txg_quiesce_thread+0x10/0x10 [zfs]
Feb 24 10:44:09.492371 storage-01 kernel:  ? __pfx_thread_generic_wrapper+0x10/0x10 [spl]
Feb 24 10:44:09.492438 storage-01 kernel:  thread_generic_wrapper+0x5c/0x70 [spl]
Feb 24 10:44:09.492505 storage-01 kernel:  kthread+0xef/0x120
Feb 24 10:44:09.492591 storage-01 kernel:  ? __pfx_kthread+0x10/0x10
Feb 24 10:44:09.492677 storage-01 kernel:  ret_from_fork+0x44/0x70
Feb 24 10:44:09.492771 storage-01 kernel:  ? __pfx_kthread+0x10/0x10
Feb 24 10:44:09.492839 storage-01 kernel:  ret_from_fork_asm+0x1b/0x30
Feb 24 10:44:09.492900 storage-01 kernel:  </TASK>
Feb 24 10:46:12.364009 storage-01 kernel: INFO: task txg_quiesce:4484 blocked for more than 245 seconds.
Feb 24 10:46:12.364322 storage-01 kernel:       Tainted: P           O       6.8.0-53-generic #55-Ubuntu
Feb 24 10:46:12.364404 storage-01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 24 10:46:12.364475 storage-01 kernel: task:txg_quiesce     state:D stack:0     pid:4484  tgid:4484  ppid:2      flags:0x00004000
Feb 24 10:46:12.364539 storage-01 kernel: Call Trace:
Feb 24 10:46:12.364604 storage-01 kernel:  <TASK>
Feb 24 10:46:12.364705 storage-01 kernel:  __schedule+0x27c/0x6b0
Feb 24 10:46:12.364781 storage-01 kernel:  schedule+0x33/0x110
Feb 24 10:46:12.364834 storage-01 kernel:  cv_wait_common+0x102/0x140 [spl]
Feb 24 10:46:12.364901 storage-01 kernel:  ? __pfx_autoremove_wake_function+0x10/0x10
Feb 24 10:46:12.364953 storage-01 kernel:  __cv_wait+0x15/0x30 [spl]
Feb 24 10:46:12.365018 storage-01 kernel:  txg_quiesce+0x181/0x1f0 [zfs]
Feb 24 10:46:12.365085 storage-01 kernel:  txg_quiesce_thread+0xd2/0x120 [zfs]
Feb 24 10:46:12.365692 storage-01 kernel:  ? __pfx_txg_quiesce_thread+0x10/0x10 [zfs]
Feb 24 10:46:12.365756 storage-01 kernel:  ? __pfx_thread_generic_wrapper+0x10/0x10 [spl]
Feb 24 10:46:12.365824 storage-01 kernel:  thread_generic_wrapper+0x5c/0x70 [spl]
Feb 24 10:46:12.365890 storage-01 kernel:  kthread+0xef/0x120
Feb 24 10:46:12.365957 storage-01 kernel:  ? __pfx_kthread+0x10/0x10
Feb 24 10:46:12.366025 storage-01 kernel:  ret_from_fork+0x44/0x70
Feb 24 10:46:12.366112 storage-01 kernel:  ? __pfx_kthread+0x10/0x10
Feb 24 10:46:12.366163 storage-01 kernel:  ret_from_fork_asm+0x1b/0x30
Feb 24 10:46:12.366214 storage-01 kernel:  </TASK>
Feb 24 10:48:15.244023 storage-01 kernel: INFO: task txg_quiesce:4484 blocked for more than 368 seconds.
Feb 24 10:48:15.251117 storage-01 kernel:       Tainted: P           O       6.8.0-53-generic #55-Ubuntu
Feb 24 10:48:15.251202 storage-01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 24 10:48:15.251293 storage-01 kernel: task:txg_quiesce     state:D stack:0     pid:4484  tgid:4484  ppid:2      flags:0x00004000
Feb 24 10:48:15.251358 storage-01 kernel: Call Trace:
Feb 24 10:48:15.251409 storage-01 kernel:  <TASK>
Feb 24 10:48:15.251460 storage-01 kernel:  __schedule+0x27c/0x6b0
Feb 24 10:48:15.251527 storage-01 kernel:  schedule+0x33/0x110
Feb 24 10:48:15.251594 storage-01 kernel:  cv_wait_common+0x102/0x140 [spl]
Feb 24 10:48:15.251702 storage-01 kernel:  ? __pfx_autoremove_wake_function+0x10/0x10
Feb 24 10:48:15.251766 storage-01 kernel:  __cv_wait+0x15/0x30 [spl]
Feb 24 10:48:15.251833 storage-01 kernel:  txg_quiesce+0x181/0x1f0 [zfs]
Feb 24 10:48:15.251885 storage-01 kernel:  txg_quiesce_thread+0xd2/0x120 [zfs]
Feb 24 10:48:15.251951 storage-01 kernel:  ? __pfx_txg_quiesce_thread+0x10/0x10 [zfs]
Feb 24 10:48:15.252025 storage-01 kernel:  ? __pfx_thread_generic_wrapper+0x10/0x10 [spl]
Feb 24 10:48:15.252091 storage-01 kernel:  thread_generic_wrapper+0x5c/0x70 [spl]
Feb 24 10:48:15.252144 storage-01 kernel:  kthread+0xef/0x120
Feb 24 10:48:15.252194 storage-01 kernel:  ? __pfx_kthread+0x10/0x10
Feb 24 10:48:15.252260 storage-01 kernel:  ret_from_fork+0x44/0x70
Feb 24 10:48:15.252325 storage-01 kernel:  ? __pfx_kthread+0x10/0x10
Feb 24 10:48:15.252393 storage-01 kernel:  ret_from_fork_asm+0x1b/0x30
Feb 24 10:48:15.252445 storage-01 kernel:  </TASK>
Feb 24 10:48:15.252496 storage-01 kernel: INFO: task smbd[10.216.0.2:7549 blocked for more than 122 seconds.
Feb 24 10:48:15.252572 storage-01 kernel:       Tainted: P           O       6.8.0-53-generic #55-Ubuntu
Feb 24 10:48:15.252624 storage-01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

sudo systemctl status smbd shows no errors on smb's side.

What info can I provide to help debug this? I'm also curious if there's a way to get an ETA for z_upgrade. While I've read it's fine to let that continue in the background, maybe I need to wait for it to finish before any real writes/reads?
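
For hang reports like this, the following is the kind of information that tends to be useful; a sketch, with tank as a placeholder pool name and the sysrq dump assuming kernel.sysrq permits it:

zfs version                              # exact userland/module versions
zpool status -v tank
zpool get all tank | grep feature@       # which feature flags became active after the upgrade
sudo cat /proc/spl/kstat/zfs/dbgmsg      # ZFS internal debug log
echo w | sudo tee /proc/sysrq-trigger    # dump stacks of all blocked tasks into dmesg
sudo dmesg | tail -n 300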

@phillip-stephens phillip-stephens added the Type: Defect Incorrect behavior (e.g. crash, hang) label Feb 25, 2025
@clhedrick

@phillip-stephens
Author

I don't think so, since I'm not using ZFS encryption and don't have any sort of special SLOG setup. But the txg_quiesce message in sudo dmesg is similar.

@phillip-stephens
Author

Just posted pretty much the same report to the ZFS mailing list here, in case it gets more visibility there.

@amotin
Member

amotin commented Mar 6, 2025

It would be good to upgrade to 2.2.7, actually. I personally don't have much interest in going back through everything we fixed during that year of development that could possibly help here.

@clhedrick

He's running Ubuntu 24.04, and zfs 2.2.2 is what comes with it, last changed in April 2024 when they patched it to support kernel 6.8. Ubuntu doesn't believe in tracking ZFS development; their explanation is that not enough testing is done to meet their standards. They also prefer to backport a few patches rather than update to the current version. As a result, their ZFS is always pretty questionable.

To upgrade he'll have to build ZFS himself. https://openzfs.github.io/openzfs-docs/Developer%20Resources/Building%20ZFS.html
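
Roughly, the build-from-source path from that link looks like the sketch below; the checkout tag is an example, and the build dependencies (autoconf, automake, libtool, kernel headers, etc.) are listed in the linked docs:

git clone https://github.com/openzfs/zfs.git
cd zfs
git checkout zfs-2.2.7
sh autogen.sh
./configure
make -s -j$(nproc)
sudo make install
sudo ldconfig
sudo depmod
sudo modprobe zfs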

What happened to the project to supply current packages for Ubuntu?

@GregorKopka
Contributor

GregorKopka commented Mar 17, 2025

z_upgrade is for updating filesystem accounting metadata after receiving data streams, so that quotas and the like show correct figures; see #9410 (comment)

Seems like you're running into the situation he mentioned in #9410 (comment):

If you've just enabled the new quota feature flags, or received a zfs dataset, then that missing quota accounting is detected and reconciled as a background process. This will result in a burst of IO, how much depends on the number of files in the filesystem.

Do you by chance know which new feature flags were activated by your zpool upgrade?
One of the newly activated ones might be responsible for triggering a rescan to update accounting on all filesystems, which could take a while for your amount of data, especially since all the filesystem metadata has to be pulled in via demand reads from slow (IOPS-wise) drives. So regarding an ETA:
Your disks are slow, and given the layout the pool has the effective IOPS of somewhere between 9 drives (worst case) and 27 drives (absolute best case), since every vdev can theoretically read 3 single data-block stripes in parallel if the reads happen to be distributed without overlap. As a result I would expect the time needed for z_upgrade to be roughly comparable to the scanning phase of a scrub, since only the metadata has to be read; a back-of-the-envelope estimate is sketched below.
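
A back-of-the-envelope version of that bound, assuming roughly 200 random read IOPS per spinning drive (an assumed ballpark figure, not measured on this system):

echo $(( 9 * 200 ))    # worst case: each 10-wide raidz2 vdev behaves like one drive -> ~1800 IOPS
echo $(( 27 * 200 ))   # best case: up to 3 non-overlapping reads per vdev in parallel -> ~5400 IOPS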

Does the system receive new datasets/incrementals on a regular basis?
If so, you could maybe pause those until the storm from mounting has died down...

Nevertheless: @behlendorf z_upgrade should not deadlock with writing to filesystems, should it?

@ccoager

ccoager commented Mar 18, 2025

I'm also having this issue on Ubuntu 24.04.2, kernel 6.8.0-55-generic on amd64. I tried ZFS versions 2.2.7 and 2.3.0, but the issue persisted. I cannot get more than about 4 days of uptime before needing to reboot, so the issue is quite reproducible. I saw someone mention a SLOG, which I am using (LVM on SSD); I have removed the SLOG and am now seeing whether that helps.
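
For reference, a minimal sketch of removing a separate log device and confirming the change; tank and the device name are placeholders for this setup:

sudo zpool remove tank <log-device>   # sync writes fall back to the in-pool ZIL on the main vdevs
sudo zpool status tank                # the logs section should be gone once removal completes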
