blk: honor isolcpus configuration #824
blktests-ci[bot] wants to merge 8 commits into
Upstream branch: aa54b1d
The calculation of the upper limit for queues does not depend solely on the number of online CPUs; for example, the isolcpus kernel command-line option must also be considered. To account for this, the block layer provides a helper function to retrieve the maximum number of queues. Use it to set an appropriate upper queue number limit.

This patch brings aacraid in line with the API migration initiated for other SCSI drivers in commit 94970cf ("scsi: use block layer helpers to calculate num of queues").

Signed-off-by: Daniel Wagner <wagi@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
[atomlin: Drop "Fixes:" tag; indicate alignment with other SCSI drivers]
Signed-off-by: Aaron Tomlin <atomlin@atomlin.com>
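The bound the helper computes can be illustrated with a small user-space sketch (the function name and arguments here are hypothetical, not the kernel API): the queue limit is the number of CPUs that may actually service I/O, capped by the hardware's own maximum.

```python
def max_io_queues(possible_cpus, isolated_cpus, driver_limit):
    """Upper bound for a driver's queue count: only CPUs allowed to
    run I/O (housekeeping CPUs) count toward the limit.

    possible_cpus / isolated_cpus are sets of CPU ids; driver_limit
    is the hardware's own maximum queue count.
    """
    housekeeping = possible_cpus - isolated_cpus
    return min(driver_limit, len(housekeeping))

# 16 possible CPUs, 6 isolated via isolcpus, hardware supports 32
# queues: only 10 housekeeping CPUs remain, so 10 queues suffice.
print(max_io_queues(set(range(16)), {2, 3, 6, 7, 12, 13}, 32))  # -> 10
```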
The core scheduler recently transitioned to compiling SMP data structures unconditionally to reduce code complexity; see commit cac5cef ("sched/smp: Make SMP unconditional"). In alignment with this philosophy of reducing dual-path maintenance, this patch removes the #ifdef CONFIG_SMP guards and the dedicated !SMP fallback logic here.

While the !SMP path provided a slightly simpler execution flow for uniprocessor kernels (avoiding SMP-specific overhead), maintaining these separate code paths adds unnecessary complexity and testing burden. Removing these guards simplifies the codebase by standardizing entirely on the SMP logic, which safely resolves to single-CPU operations on UP configurations.

Signed-off-by: Daniel Wagner <wagi@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
[atomlin: Updated commit message to clarify !SMP removal context]
Signed-off-by: Aaron Tomlin <atomlin@atomlin.com>
This commit introduces group_mask_cpus_evenly(), which allows callers to
distribute a specific CPU mask evenly across groups. It serves as a bounded
version of group_cpus_evenly().
While group_cpus_evenly() operates on the global cpu_possible_mask,
group_mask_cpus_evenly() confines the distribution strictly within the
boundaries of the caller-provided mask. It preserves the kernel's native
two-stage spreading logic: first prioritising CPUs that are physically
present (cpu_present_mask) to prevent I/O starvation, and then distributing
any remaining vectors to non-present CPUs to maintain hotplug safety.
Signed-off-by: Daniel Wagner <wagi@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
[atomlin:
- Added check for numgrps == 0
- Updated commit message to resolve typo
- Removed unused <linux/sched/isolation.h>
- Fix TOCTOU race by caching the provided mask
- Implemented two-stage grouping logic to prioritise physically
present CPUs, mirroring group_cpus_evenly()]
Signed-off-by: Aaron Tomlin <atomlin@atomlin.com>
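The two-stage spreading described above can be sketched in user space. This is illustrative only: a simple round-robin stands in for the kernel's topology-aware spread, and all names are hypothetical.

```python
def group_mask_cpus_evenly_sketch(mask, numgrps, present):
    """Distribute the CPUs in `mask` across `numgrps` groups.

    Stage 1 spreads CPUs that are physically present; stage 2 appends
    the remaining possible-but-not-present CPUs so that CPUs
    hotplugged later still land in some group.
    """
    if numgrps == 0:          # mirrors the numgrps == 0 check above
        return []
    groups = [[] for _ in range(numgrps)]
    stage1 = sorted(c for c in mask if c in present)
    stage2 = sorted(c for c in mask if c not in present)
    for i, cpu in enumerate(stage1 + stage2):
        groups[i % numgrps].append(cpu)
    return groups

# CPUs 8-9 are possible but not present; present CPUs are placed first.
print(group_mask_cpus_evenly_sketch({0, 1, 2, 3, 8, 9}, 2, {0, 1, 2, 3}))
```

Unlike group_cpus_evenly(), the distribution here stays strictly within the caller-provided `mask`.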
Multiqueue drivers spread I/O queues across all CPUs for optimal performance. However, these drivers are not aware of CPU isolation requirements and will distribute queues without considering the isolcpus configuration.

Introduce a new isolcpus mask that allows users to define which CPUs should have I/O queues assigned. This is similar to managed_irq, but intended for drivers that do not use the managed IRQ infrastructure.

Signed-off-by: Daniel Wagner <wagi@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Aaron Tomlin <atomlin@atomlin.com>
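The effect of the new mask can be modelled with a short sketch (function names are hypothetical): parse an `isolcpus=io_queue,<cpulist>`-style CPU list and keep only the non-isolated CPUs eligible for queue assignment.

```python
def parse_cpulist(arg):
    """Parse a kernel-style CPU list such as "2-3,6-7,12-13" into a
    set of CPU ids (hypothetical helper, for illustration only)."""
    cpus = set()
    for part in arg.split(","):
        if "-" in part:
            lo, hi = map(int, part.split("-"))
            cpus.update(range(lo, hi + 1))
        else:
            cpus.add(int(part))
    return cpus

def io_queue_cpus(nr_cpus, isolated):
    """CPUs eligible for I/O queue assignment: everything not isolated."""
    return set(range(nr_cpus)) - isolated

isolated = parse_cpulist("2-3,6-7,12-13")
print(sorted(io_queue_cpus(16, isolated)))
```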
Extend the capabilities of the generic CPU to hardware queue (hctx)
mapping code, so it maps housekeeping CPUs and isolated CPUs to the
hardware queues evenly.
A hctx is only operational when at least one online housekeeping CPU
is assigned (aka active_hctx). Thus, verify in the final mapping that
no hctx is left with only offline housekeeping CPUs and online
isolated CPUs.
Example mapping result:
16 online CPUs
isolcpus=io_queue,2-3,6-7,12-13
Queue mapping:
hctx0: default 0 2
hctx1: default 1 3
hctx2: default 4 6
hctx3: default 5 7
hctx4: default 8 12
hctx5: default 9 13
hctx6: default 10
hctx7: default 11
hctx8: default 14
hctx9: default 15
IRQ mapping:
irq 42 affinity 0 effective 0 nvme0q0
irq 43 affinity 0 effective 0 nvme0q1
irq 44 affinity 1 effective 1 nvme0q2
irq 45 affinity 4 effective 4 nvme0q3
irq 46 affinity 5 effective 5 nvme0q4
irq 47 affinity 8 effective 8 nvme0q5
irq 48 affinity 9 effective 9 nvme0q6
irq 49 affinity 10 effective 10 nvme0q7
irq 50 affinity 11 effective 11 nvme0q8
irq 51 affinity 14 effective 14 nvme0q9
irq 52 affinity 15 effective 15 nvme0q10
A corner case is when the number of online CPUs and present CPUs
differ and the driver asks for fewer queues than online CPUs, e.g.
8 online CPUs, 16 possible CPUs
isolcpus=io_queue,2-3,6-7,12-13
virtio_blk.num_request_queues=2
Queue mapping:
hctx0: default 0 1 2 3 4 5 6 7 8 12 13
hctx1: default 9 10 11 14 15
IRQ mapping:
irq 27 affinity 0 effective 0 virtio0-config
irq 28 affinity 0-1,4-5,8 effective 5 virtio0-req.0
irq 29 affinity 9-11,14-15 effective 0 virtio0-req.1
Noteworthy is that for the normal/default configuration (!isolcpus) the
mapping will change for systems which have non-hyperthreading CPUs. The
main assignment loop relies entirely on group_mask_cpus_evenly() to
do the right thing. The old code would distribute the CPUs linearly over
the hardware contexts:
queue mapping for /dev/nvme0n1
hctx0: default 0 8
hctx1: default 1 9
hctx2: default 2 10
hctx3: default 3 11
hctx4: default 4 12
hctx5: default 5 13
hctx6: default 6 14
hctx7: default 7 15
The new code assigns each hardware context the map generated by
group_mask_cpus_evenly():
queue mapping for /dev/nvme0n1
hctx0: default 0 1
hctx1: default 2 3
hctx2: default 4 5
hctx3: default 6 7
hctx4: default 8 9
hctx5: default 10 11
hctx6: default 12 13
hctx7: default 14 15
In case of hyperthreading CPUs, the resulting map stays the same.
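The difference between the old linear distribution and the new grouped assignment can be reproduced with a short sketch. This is illustrative only; the kernel derives the groups via group_mask_cpus_evenly(), which is additionally topology-aware.

```python
def linear_map(cpus, nr_hctx):
    """Old behaviour: CPU c is assigned to hctx c % nr_hctx."""
    m = [[] for _ in range(nr_hctx)]
    for c in cpus:
        m[c % nr_hctx].append(c)
    return m

def grouped_map(cpus, nr_hctx):
    """New behaviour: contiguous, evenly sized CPU groups per hctx."""
    cpus = sorted(cpus)
    q, r = divmod(len(cpus), nr_hctx)
    m, i = [], 0
    for h in range(nr_hctx):
        n = q + (1 if h < r else 0)   # spread the remainder evenly
        m.append(cpus[i:i + n])
        i += n
    return m

# 16 CPUs, 8 hardware contexts, as in the example above:
print(linear_map(range(16), 8))   # hctx0: 0 8, hctx1: 1 9, ...
print(grouped_map(range(16), 8))  # hctx0: 0 1, hctx1: 2 3, ...
```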
Signed-off-by: Daniel Wagner <wagi@kernel.org>
[atomlin:
- Updated blk_mq_validate() to use test_bit() for the new bitmap
- Replaced __free cleanups with traditional goto unwinding to align
with subsystem styling
- Updated blk_mq_map_fallback() to use qmap->queue_offset ensuring
secondary maps do not incorrectly route to the primary default map
- Added a bitmap_empty() check to prevent out-of-bounds CPU routing
when all mapped CPUs are offline
- Migrated active_hctx to a dynamically sized bitmap to fix an
out-of-bounds write when hardware queues exceed the system CPU
count
- Fixed absolute vs. relative hardware queue index mix-up in
blk_mq_map_queues() and validation checks
- Fixed typographical errors
- Reduced stack frame size of blk_mq_num_queues()
- Resolved a TOCTOU race against CPU hotplug events by snapshotting
cpu_online_mask to ensure mapping and validation phases agree
- Corrected a loop overwrite bug in blk_mq_map_queues() by iterating
directly over masks to prevent orphaned queues from being activated
- Restored topology-aware multi-queue fallback in
blk_mq_map_hw_queues() for devices lacking IRQ affinity hints
- Hardened isolation logic in blk_mq_map_hw_queues() to require online
housekeeping CPUs before marking a hardware queue as active
- Optimised active queue evaluations by short-circuiting redundant
checks once a valid CPU is found]
Signed-off-by: Aaron Tomlin <atomlin@atomlin.com>
When isolcpus=io_queue is enabled and the last housekeeping CPU
for a given hctx goes offline, no CPU would be left to handle I/O.
To prevent I/O stalls, disallow offlining housekeeping CPUs that are
still serving isolated CPUs.
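The guard can be sketched as follows: a user-space model with hypothetical names, not the kernel hotplug callback. A housekeeping CPU may only go offline if no hctx would be left serving online isolated CPUs without any remaining online housekeeping CPU.

```python
def can_offline(cpu, hctx_cpus, housekeeping, online):
    """Return False if offlining `cpu` would leave some hctx with
    online isolated CPUs but no online housekeeping CPU to serve them.

    hctx_cpus is a list of per-hctx CPU sets; housekeeping and online
    are sets of CPU ids.
    """
    if cpu not in housekeeping:
        return True  # isolated CPUs may always go offline
    for cpus in hctx_cpus:
        if cpu not in cpus:
            continue
        remaining_hk = [c for c in cpus
                        if c != cpu and c in housekeeping and c in online]
        isolated = [c for c in cpus
                    if c not in housekeeping and c in online]
        if isolated and not remaining_hk:
            return False  # I/O from the isolated CPUs would stall
    return True

# hctx0 = {0, 2}: CPU 0 is the only housekeeping CPU serving
# isolated CPU 2, so CPU 0 must not go offline while CPU 2 is online.
print(can_offline(0, [{0, 2}, {1, 3}], {0, 1}, {0, 1, 2, 3}))  # -> False
```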
Signed-off-by: Daniel Wagner <wagi@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
[atomlin:
- Removed duplicate paragraph from commit message
- Allow offlining of non-housekeeping CPUs
- Fix logic flaw that prematurely rejected valid offline requests
- Iterated over cpu_online_mask and manually reverse-mapped CPUs to
correctly detect isolated CPUs, as blk_mq_map_swqueue()
intentionally prunes them from hctx->cpumask
- Prevented a TOCTOU NULL pointer dereference race against
concurrent device teardown by using READ_ONCE() to fetch the disk
pointer]
Signed-off-by: Aaron Tomlin <atomlin@atomlin.com>
At present, the managed interrupt spreading algorithm distributes vectors
across all available CPUs within a given node or system. On systems
employing CPU isolation (e.g., "isolcpus=io_queue"), this behaviour
defeats the primary purpose of isolation by routing hardware interrupts
(such as NVMe completion queues) directly to isolated cores.
Update irq_create_affinity_masks() to respect the housekeeping CPU mask.
By passing the HK_TYPE_IO_QUEUE mask directly to the topological
distribution function (group_mask_cpus_evenly()), we ensure that managed
interrupts are kept strictly off isolated CPUs.
This patch additionally addresses the architectural constraints of
restricted vector distribution:
1. Vector Limits: Updated irq_calc_affinity_vectors() to bound the
maximum number of allocated vectors to the weight of the
housekeeping mask. This prevents drivers from wasting memory on
dead hardware queues that cannot be routed to isolated CPUs.
2. Multi-set Alignment: When isolation constraints result in fewer
available masks than requested vectors for a given set, the
remaining vector slots are padded with irq_default_affinity. The
loop correctly advances by the requested vector count (this_vecs)
to prevent shifting and corrupting the 1:1 hardware queue mapping
for subsequent sets.
3. Zero Overhead: The housekeeping mask is conditionally assigned via
a direct pointer, completely avoiding temporary mask allocations
(e.g., alloc_cpumask_var) and bitwise operations when CPU
isolation is disabled. This guarantees zero performance or memory
overhead for standard configurations.
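Points 1 and 2 above can be modelled with a small sketch (function names hypothetical; sets of CPU ids stand in for cpumasks):

```python
def calc_affinity_vectors(requested, housekeeping):
    """Point 1: cap the vector count at the housekeeping-CPU weight so
    no vectors are allocated for queues that cannot be routed."""
    return min(requested, len(housekeeping))

def pad_set_masks(masks, this_vecs, default_affinity):
    """Point 2: when isolation yields fewer masks than the set's
    requested vector count (this_vecs), pad the remaining slots with
    the default affinity so subsequent sets keep their 1:1 hardware
    queue alignment."""
    return masks + [default_affinity] * (this_vecs - len(masks))

# 16 vectors requested, only 10 housekeeping CPUs -> 10 vectors.
print(calc_affinity_vectors(16, set(range(10))))
# A set asked for 4 vectors but isolation left only 2 masks.
print(pad_set_masks([{0}, {1}], 4, {0, 1}))
```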
Signed-off-by: Aaron Tomlin <atomlin@atomlin.com>
Upstream branch: 70eda68
The io_queue flag informs multiqueue device drivers where to place hardware queues. Document this new flag in the isolcpus command-line argument description.

Signed-off-by: Daniel Wagner <wagi@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
[atomlin: Refined io_queue kernel parameter documentation]
Signed-off-by: Aaron Tomlin <atomlin@atomlin.com>
Pull request for series with
subject: blk: honor isolcpus configuration
version: 13
url: https://patchwork.kernel.org/project/linux-block/list/?series=1093842