block: fix handling of dead zone write plugs #830

Open
blktests-ci[bot] wants to merge 1 commit into linus-master_base from series/1094144=>linus-master

blktests-ci Bot commented May 13, 2026

Pull request for series with
subject: block: fix handling of dead zone write plugs
version: 1
url: https://patchwork.kernel.org/project/linux-block/list/?series=1094144

blktests-ci Bot commented May 13, 2026

Upstream branch: aa54b1d
series: https://patchwork.kernel.org/project/linux-block/list/?series=1094144
version: 1

Shin'ichiro reported hard-to-reproduce unaligned write errors with zoned
block devices. Under normal operating conditions (e.g. running XFS on an
SMR disk), these errors are nearly impossible to trigger. But using a
"slow" kernel with many debug options enabled and some specific use cases
(e.g. fio zbd test case 46), the errors can be reproduced fairly easily.

The unaligned write errors come from mishandling a valid reference
counting pattern of zone write plugs. This pattern triggers, for instance,
if a process A writes to a zone (not necessarily up to the full state),
another process B immediately resets the zone and, immediately after the
zone reset completes, starts issuing writes to the zone. With this
pattern, in some cases, the zone write plugs worker thread of the device
may still be holding a reference to the zone write plug of the zone, taken
when process A was writing to the zone. The zone reset from process B then
marks the zone as dead but does not remove the zone write plug from the
device hash table, as a reference to the plug still exists. Once process B
starts issuing new writes, the zone write plug is seen as dead and the
writes from process B are immediately failed, despite this write pattern
being perfectly legal.
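
The sequence described above can be sketched with a small user-space model.
This is only an illustration of the reference counting pattern, not the
kernel code: apart from BLK_ZONE_WPLUG_DEAD, every name here (toy_wplug,
toy_zone_reset(), toy_write_prefix(), ...) is hypothetical, the flag's bit
value is arbitrary, and a plain counter stands in for the kernel reference
count. It also assumes that the zone reset drops the initial hash table
reference when it marks the plug as dead.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: bit value chosen for illustration only. */
#define BLK_ZONE_WPLUG_DEAD (1U << 0)

struct toy_wplug {
	unsigned int flags;
	unsigned int ref;	/* plain counter standing in for the refcount */
	bool hashed;		/* still present in the device hash table? */
};

/* Plug creation: the hash table holds the initial reference. */
static void toy_wplug_init(struct toy_wplug *p)
{
	p->flags = 0;
	p->ref = 1;
	p->hashed = true;
}

/* Take an extra reference, e.g. from the zone write plugs worker. */
static void toy_wplug_get(struct toy_wplug *p)
{
	p->ref++;
}

/* Drop one reference; the plug leaves the hash table with the last one. */
static void toy_wplug_put(struct toy_wplug *p)
{
	assert(p->ref > 0);
	if (--p->ref == 0)
		p->hashed = false;
}

/* Zone reset (assumed behavior): mark dead, drop the initial reference. */
static void toy_zone_reset(struct toy_wplug *p)
{
	p->flags |= BLK_ZONE_WPLUG_DEAD;
	toy_wplug_put(p);
}

/* Behavior before the fix: a write that finds a hashed dead plug fails. */
static bool toy_write_prefix(const struct toy_wplug *p)
{
	return !(p->hashed && (p->flags & BLK_ZONE_WPLUG_DEAD));
}
```

In this model, calling toy_wplug_get() for the worker and then
toy_zone_reset() leaves the plug hashed and dead with one reference, so
toy_write_prefix() rejects the write from process B until the worker calls
toy_wplug_put().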

Fix this by allowing a dead zone write plug to be restored to a live state
if a write is issued to the zone when the zone is marked as dead, the zone
is empty, and the write sector corresponds to the first sector of the zone
(that is, the write is aligned to the zone write pointer). This is done
with the new helper function disk_check_zone_wplug_dead(), which restores
a dead zone write plug to a live state by clearing the BLK_ZONE_WPLUG_DEAD
flag and restoring the initial reference to the zone write plug taken when
the plug was added to the device hash table.
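
The revival conditions can be illustrated with a minimal user-space sketch.
It is not the patch's implementation of disk_check_zone_wplug_dead(): the
struct model_wplug, its fields, the function model_check_zone_wplug_dead()
and the bit value of BLK_ZONE_WPLUG_DEAD are all assumptions made for this
example, and locking is left out entirely.

```c
#include <stdbool.h>

/* Bit position is illustrative, not taken from the kernel. */
#define BLK_ZONE_WPLUG_DEAD (1U << 0)

/* Hypothetical stand-in for a zone write plug. */
struct model_wplug {
	unsigned int flags;
	unsigned int ref;		/* plain counter, not a kernel refcount */
	unsigned long long zone_start;	/* first sector of the zone */
	unsigned int wp_offset;		/* write pointer offset; 0 means empty */
};

/*
 * Return true if a write starting at @sector may proceed.
 * A dead plug is revived only when the zone is empty and the write
 * starts at the first sector of the zone; otherwise the write fails.
 */
static bool model_check_zone_wplug_dead(struct model_wplug *p,
					unsigned long long sector)
{
	if (!(p->flags & BLK_ZONE_WPLUG_DEAD))
		return true;	/* live plug: nothing to restore */

	if (p->wp_offset != 0 || sector != p->zone_start)
		return false;	/* plug stays dead: fail the write */

	/* Revive: clear the flag and restore the hash table reference. */
	p->flags &= ~BLK_ZONE_WPLUG_DEAD;
	p->ref++;
	return true;
}
```

With a dead, empty plug whose zone starts at sector 4096, a write to sector
4104 is rejected and leaves the plug dead, while a write to sector 4096
clears the flag and bumps the reference count by one.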

Reported-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Fixes: b7d4ffb ("block: fix zone write plug removal")
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
blktests-ci Bot force-pushed the series/1094144=>linus-master branch from 66bafd1 to 1bc11b0 May 13, 2026 11:58