
Commit 2def284

Dave Chinner authored and djwong committed Mar 27, 2020
xfs: don't allow log IO to be throttled
Running metadata intensive workloads, I've been seeing the AIL pushing getting stuck on pinned buffers and triggering log forces. The log force is taking a long time to run because the log IO is getting throttled by wbt_wait() - the block layer writeback throttle. It's being throttled because there is a huge amount of metadata writeback going on which is filling the request queue.

IOWs, we have a priority inversion problem here.

Mark the log IO bios with REQ_IDLE so they don't get throttled by the block layer writeback throttle. When we are forcing the CIL, we are likely to need to do tens of log IOs, and they are issued as fast as they can be built and IO completed. Hence REQ_IDLE is appropriate - it's an indication that more IO will follow shortly.

And because we also set REQ_SYNC, the writeback throttle will now treat log IO the same way it treats direct IO writes - it will not throttle them at all. Hence we solve the priority inversion problem caused by the writeback throttle being unable to distinguish between high priority log IO and background metadata writeback.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Allison Collins <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
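The rule the commit message relies on (a write carrying both REQ_SYNC and REQ_IDLE is treated like a direct IO write and skips the writeback throttle) can be modelled in userspace. This is a minimal sketch, not the kernel's code: the `WBT_SK_*` names and bit values and the `wbt_sk_write_is_throttled()` helper are hypothetical, the write operation is simplified to a flag bit, and only writes are modelled.

```c
#include <stdbool.h>

/*
 * Illustrative flag bits only (assumption): the real REQ_* definitions
 * live in the kernel headers and use different values and encodings.
 */
enum {
	WBT_SK_OP_WRITE = 1u << 0,
	WBT_SK_META     = 1u << 1,
	WBT_SK_SYNC     = 1u << 2,
	WBT_SK_IDLE     = 1u << 3,
	WBT_SK_FUA      = 1u << 4,
};

/*
 * Simplified model of the throttle decision described above: a write is
 * exempt only when it carries BOTH the sync and idle hints. Anything
 * that is not a write is outside this model and reported as unthrottled.
 */
static bool wbt_sk_write_is_throttled(unsigned int opf)
{
	const unsigned int exempt = WBT_SK_SYNC | WBT_SK_IDLE;

	if (!(opf & WBT_SK_OP_WRITE))
		return false;

	return (opf & exempt) != exempt;
}
```

Under this model, the old log write flags (write, meta, sync, FUA) are subject to throttling, while the new set with the idle hint added is not.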
1 parent 0e7ab7e commit 2def284

1 file changed: +9 -1 lines changed

‎fs/xfs/xfs_log.c

@@ -1687,7 +1687,15 @@ xlog_write_iclog(
 	iclog->ic_bio.bi_iter.bi_sector = log->l_logBBstart + bno;
 	iclog->ic_bio.bi_end_io = xlog_bio_end_io;
 	iclog->ic_bio.bi_private = iclog;
-	iclog->ic_bio.bi_opf = REQ_OP_WRITE | REQ_META | REQ_SYNC | REQ_FUA;
+
+	/*
+	 * We use REQ_SYNC | REQ_IDLE here to tell the block layer the are more
+	 * IOs coming immediately after this one. This prevents the block layer
+	 * writeback throttle from throttling log writes behind background
+	 * metadata writeback and causing priority inversions.
+	 */
+	iclog->ic_bio.bi_opf = REQ_OP_WRITE | REQ_META | REQ_SYNC |
+				REQ_IDLE | REQ_FUA;
 	if (need_flush)
 		iclog->ic_bio.bi_opf |= REQ_PREFLUSH;
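The flag composition in this hunk can be illustrated with a small userspace sketch. The `LOGIO_*` names and bit values and the `logio_write_opf()` helper are hypothetical stand-ins for the kernel's REQ_* flags, chosen only for illustration. It shows that the sync and idle hints are always set on a log write, while the preflush bit is added only when `need_flush` is true.

```c
#include <stdbool.h>

/* Illustrative bit values (assumption); not the kernel's REQ_* encoding. */
enum {
	LOGIO_OP_WRITE = 1u << 0,
	LOGIO_META     = 1u << 1,
	LOGIO_SYNC     = 1u << 2,
	LOGIO_IDLE     = 1u << 3,
	LOGIO_FUA      = 1u << 4,
	LOGIO_PREFLUSH = 1u << 5,
};

/*
 * Mirrors the shape of the patched assignment: a fixed base set of flags
 * for every log write, plus an optional cache preflush.
 */
static unsigned int logio_write_opf(bool need_flush)
{
	unsigned int opf = LOGIO_OP_WRITE | LOGIO_META | LOGIO_SYNC |
			   LOGIO_IDLE | LOGIO_FUA;

	if (need_flush)
		opf |= LOGIO_PREFLUSH;

	return opf;
}
```

Keeping the sync and idle hints in the base set, rather than setting them conditionally, means every log write gets the same treatment from the throttle whether or not a flush is needed.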
