TEST: update readme#21
Closed
quic-viskuma wants to merge 3 commits into
Closed
Conversation
Pre merge workflow - sync, build & test Reusing - https://github.com/qualcomm-linux/kernel-config Signed-off-by: Vishal Kumar <viskuma@qti.qualcomm.com>
eb72779 to
c9e154a
Compare
89725f2 to
1bb673e
Compare
c9e154a to
65fd028
Compare
Don't merge Signed-off-by: Vishal Kumar <viskuma@qti.qualcomm.com>
0e4daa3 to
d1a1172
Compare
Signed-off-by: VISHAL KUMAR <viskuma@qti.qualcomm.com>
sgaud-quic
pushed a commit
to sgaud-quic/kernel-topics
that referenced
this pull request
May 30, 2025
[Why & How] Fix a false positive warning which occurs due to lack of correct checks when querying plane_id in DML21. This fixes the warning when performing a mode1 reset (cat /sys/kernel/debug/dri/1/amdgpu_gpu_recover): [ 35.751250] WARNING: CPU: 11 PID: 326 at /tmp/amd.PHpyAl7v/amd/amdgpu/../display/dc/dml2/dml2_dc_resource_mgmt.c:91 dml2_map_dc_pipes+0x243d/0x3f40 [amdgpu] [ 35.751434] Modules linked in: amdgpu(OE) amddrm_ttm_helper(OE) amdttm(OE) amddrm_buddy(OE) amdxcp(OE) amddrm_exec(OE) amd_sched(OE) amdkcl(OE) drm_suballoc_helper drm_ttm_helper ttm drm_display_helper cec rc_core i2c_algo_bit rfcomm qrtr cmac algif_hash algif_skcipher af_alg bnep amd_atl intel_rapl_msr intel_rapl_common snd_hda_codec_hdmi snd_hda_intel edac_mce_amd snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec kvm_amd snd_hda_core snd_hwdep snd_pcm kvm snd_seq_midi snd_seq_midi_event snd_rawmidi crct10dif_pclmul polyval_clmulni polyval_generic btusb ghash_clmulni_intel sha256_ssse3 btrtl sha1_ssse3 snd_seq btintel aesni_intel btbcm btmtk snd_seq_device crypto_simd sunrpc cryptd bluetooth snd_timer ccp binfmt_misc rapl snd i2c_piix4 wmi_bmof gigabyte_wmi k10temp i2c_smbus soundcore gpio_amdpt mac_hid sch_fq_codel msr parport_pc ppdev lp parport efi_pstore nfnetlink dmi_sysfs ip_tables x_tables autofs4 hid_generic usbhid hid crc32_pclmul igc ahci xhci_pci libahci xhci_pci_renesas video wmi [ 35.751501] CPU: 11 UID: 0 PID: 326 Comm: kworker/u64:9 Tainted: G OE 6.11.0-21-generic qualcomm-linux#21~24.04.1-Ubuntu [ 35.751504] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE [ 35.751505] Hardware name: Gigabyte Technology Co., Ltd. X670E AORUS PRO X/X670E AORUS PRO X, BIOS F30 05/22/2024 [ 35.751506] Workqueue: amdgpu-reset-dev amdgpu_debugfs_reset_work [amdgpu] [ 35.751638] RIP: 0010:dml2_map_dc_pipes+0x243d/0x3f40 [amdgpu] [ 35.751794] Code: 6d 0c 00 00 8b 84 24 88 00 00 00 41 3b 44 9c 20 0f 84 fc 07 00 00 48 83 c3 01 48 83 fb 06 75 b3 4c 8b 64 24 68 4c 8b 6c 24 40 <0f> 0b b8 06 00 00 00 49 8b 94 24 a0 49 00 00 89 c3 83 f8 07 0f 87 [ 35.751796] RSP: 0018:ffffbfa3805d7680 EFLAGS: 00010246 [ 35.751798] RAX: 0000000000010000 RBX: 0000000000000006 RCX: 0000000000000000 [ 35.751799] RDX: 0000000000000000 RSI: 0000000000000005 RDI: 0000000000000000 [ 35.751800] RBP: ffffbfa3805d78f0 R08: 0000000000000000 R09: 0000000000000000 [ 35.751801] R10: 0000000000000000 R11: 0000000000000000 R12: ffffbfa383249000 [ 35.751802] R13: ffffa0e68f280000 R14: ffffbfa383249658 R15: 0000000000000000 [ 35.751803] FS: 0000000000000000(0000) GS:ffffa0edbe580000(0000) knlGS:0000000000000000 [ 35.751804] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 35.751805] CR2: 00005d847ef96c58 CR3: 000000041de3e000 CR4: 0000000000f50ef0 [ 35.751806] PKRU: 55555554 [ 35.751807] Call Trace: [ 35.751810] <TASK> [ 35.751816] ? show_regs+0x6c/0x80 [ 35.751820] ? __warn+0x88/0x140 [ 35.751822] ? dml2_map_dc_pipes+0x243d/0x3f40 [amdgpu] [ 35.751964] ? report_bug+0x182/0x1b0 [ 35.751969] ? handle_bug+0x6e/0xb0 [ 35.751972] ? exc_invalid_op+0x18/0x80 [ 35.751974] ? asm_exc_invalid_op+0x1b/0x20 [ 35.751978] ? dml2_map_dc_pipes+0x243d/0x3f40 [amdgpu] [ 35.752117] ? math_pow+0x48/0xa0 [amdgpu] [ 35.752256] ? srso_alias_return_thunk+0x5/0xfbef5 [ 35.752260] ? math_pow+0x48/0xa0 [amdgpu] [ 35.752400] ? srso_alias_return_thunk+0x5/0xfbef5 [ 35.752403] ? math_pow+0x11/0xa0 [amdgpu] [ 35.752524] ? srso_alias_return_thunk+0x5/0xfbef5 [ 35.752526] ? core_dcn4_mode_programming+0xe4d/0x20d0 [amdgpu] [ 35.752663] ? srso_alias_return_thunk+0x5/0xfbef5 [ 35.752669] dml21_validate+0x3d4/0x980 [amdgpu] Reviewed-by: Austin Zheng <austin.zheng@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit f8ad62c)
sgaud-quic
pushed a commit
that referenced
this pull request
Feb 3, 2026
Fix a use-after-free which happens due to enslave failure after the new slave has been added to the array. Since the new slave can be used for Tx immediately, we can use it after it has been freed by the enslave error cleanup path which frees the allocated slave memory. Slave update array is supposed to be called last when further enslave failures are not expected. Move it after xdp setup to avoid any problems. It is very easy to reproduce the problem with a simple xdp_pass prog: ip l add bond1 type bond mode balance-xor ip l set bond1 up ip l set dev bond1 xdp object xdp_pass.o sec xdp_pass ip l add dumdum type dummy Then run in parallel: while :; do ip l set dumdum master bond1 1>/dev/null 2>&1; done; mausezahn bond1 -a own -b rand -A rand -B 1.1.1.1 -c 0 -t tcp "dp=1-1023, flags=syn" The crash happens almost immediately: [ 605.602850] Oops: general protection fault, probably for non-canonical address 0xe0e6fc2460000137: 0000 [#1] SMP KASAN NOPTI [ 605.602916] KASAN: maybe wild-memory-access in range [0x07380123000009b8-0x07380123000009bf] [ 605.602946] CPU: 0 UID: 0 PID: 2445 Comm: mausezahn Kdump: loaded Tainted: G B 6.19.0-rc6+ #21 PREEMPT(voluntary) [ 605.602979] Tainted: [B]=BAD_PAGE [ 605.602998] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 605.603032] RIP: 0010:netdev_core_pick_tx+0xcd/0x210 [ 605.603063] Code: 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 3e 01 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8b 6b 08 49 8d 7d 30 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 25 01 00 00 49 8b 45 30 4c 89 e2 48 89 ee 48 89 [ 605.603111] RSP: 0018:ffff88817b9af348 EFLAGS: 00010213 [ 605.603145] RAX: dffffc0000000000 RBX: ffff88817d28b420 RCX: 0000000000000000 [ 605.603172] RDX: 00e7002460000137 RSI: 0000000000000008 RDI: 07380123000009be [ 605.603199] RBP: ffff88817b541a00 R08: 0000000000000001 R09: fffffbfff3ed8c0c [ 605.603226] R10: ffffffff9f6c6067 R11: 0000000000000001 R12: 0000000000000000 [ 605.603253] R13: 073801230000098e R14: ffff88817d28b448 R15: ffff88817b541a84 [ 605.603286] FS: 00007f6570ef67c0(0000) GS:ffff888221dfa000(0000) knlGS:0000000000000000 [ 605.603319] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 605.603343] CR2: 00007f65712fae40 CR3: 000000011371b000 CR4: 0000000000350ef0 [ 605.603373] Call Trace: [ 605.603392] <TASK> [ 605.603410] __dev_queue_xmit+0x448/0x32a0 [ 605.603434] ? __pfx_vprintk_emit+0x10/0x10 [ 605.603461] ? __pfx_vprintk_emit+0x10/0x10 [ 605.603484] ? __pfx___dev_queue_xmit+0x10/0x10 [ 605.603507] ? bond_start_xmit+0xbfb/0xc20 [bonding] [ 605.603546] ? _printk+0xcb/0x100 [ 605.603566] ? __pfx__printk+0x10/0x10 [ 605.603589] ? bond_start_xmit+0xbfb/0xc20 [bonding] [ 605.603627] ? add_taint+0x5e/0x70 [ 605.603648] ? add_taint+0x2a/0x70 [ 605.603670] ? end_report.cold+0x51/0x75 [ 605.603693] ? bond_start_xmit+0xbfb/0xc20 [bonding] [ 605.603731] bond_start_xmit+0x623/0xc20 [bonding] Fixes: 9e2ee5c ("net, bonding: Add XDP support to the bonding driver") Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Reported-by: Chen Zhen <chenzhen126@huawei.com> Closes: https://lore.kernel.org/netdev/fae17c21-4940-5605-85b2-1d5e17342358@huawei.com/ CC: Jussi Maki <joamaki@gmail.com> CC: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://patch.msgid.link/20260123120659.571187-1-razor@blackwall.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
sgaud-quic
pushed a commit
that referenced
this pull request
Mar 10, 2026
When allocating a lot of buffers and putting the TTM under memory pressure, during swapout, it might crash the system with the stack trace below. It turns out that ttm_bo_swapout_cb might replace bo->resource when it moves it to system cached. When commit c06da4b ("drm/ttm: Tidy usage of local variables a little bit") used a local variable for bo->resource, it used the freed resource later in the function, leading to a UAF. Move back to using bo->resource in all cases in that function instead of a local variable. [ 604.814275] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 604.814284] #PF: supervisor read access in kernel mode [ 604.814288] #PF: error_code(0x0000) - not-present page [ 604.814291] PGD 0 P4D 0 [ 604.814296] Oops: Oops: 0000 [#1] SMP NOPTI [ 604.814303] CPU: 2 UID: 0 PID: 4408 Comm: vulkan Tainted: G W 7.0.0-rc2-00001-gc50a051e6aca #21 PREEMPT(full) aef6eb0c02036a7c8a5e62e0c84a30c2be90688d [ 604.814309] Tainted: [W]=WARN [ 604.814311] Hardware name: Valve Jupiter/Jupiter, BIOS F7A0133 08/05/2024 [ 604.814314] RIP: 0010:ttm_resource_move_to_lru_tail+0x100/0x160 [ttm] [ 604.814329] Code: 5b 5d e9 83 b4 1b cb 48 63 d2 48 c1 e0 04 48 8b 4e 40 48 8d 7e 40 48 8b ac d3 d8 00 00 00 48 89 c3 48 8d 54 05 68 48 8b 46 48 <48> 3b 38 0f 85 b3 3b 00 00 48 3b 79 08 0f 85 a9 3b 00 00 48 89 41 [ 604.814332] RSP: 0018:ffffcfe54e3d7578 EFLAGS: 00010256 [ 604.814336] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8cf09eced300 [ 604.814339] RDX: 0000000000000068 RSI: ffff8cf1d4c1fc00 RDI: ffff8cf1d4c1fc40 [ 604.814341] RBP: 0000000000000000 R08: ffff8cf09eced300 R09: 0000000000000000 [ 604.814344] R10: 0000000000000000 R11: 0000000000000016 R12: ffff8cf1d4c1fc00 [ 604.814346] R13: 0000000000000400 R14: ffff8cf096289c00 R15: ffff8cf084c8f688 [ 604.814349] FS: 00007f00531b7780(0000) GS:ffff8cf4217a0000(0000) knlGS:0000000000000000 [ 604.814352] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 604.814355] CR2: 0000000000000000 CR3: 000000018e3df000 CR4: 0000000000350ef0 [ 604.814358] Call Trace: [ 604.814362] <TASK> [ 604.814368] ttm_bo_swapout_cb+0x24c/0x280 [ttm a469cf7fcb6737fdcf3fb5cdbcc8b1ca41f3e302] [ 604.814380] ttm_lru_walk_for_evict+0xac/0x1d0 [ttm a469cf7fcb6737fdcf3fb5cdbcc8b1ca41f3e302] [ 604.814394] ttm_bo_swapout+0x5b/0x80 [ttm a469cf7fcb6737fdcf3fb5cdbcc8b1ca41f3e302] [ 604.814405] ttm_global_swapout+0x63/0x100 [ttm a469cf7fcb6737fdcf3fb5cdbcc8b1ca41f3e302] [ 604.814415] ttm_tt_populate+0x82/0x130 [ttm a469cf7fcb6737fdcf3fb5cdbcc8b1ca41f3e302] [ 604.814424] ttm_bo_populate+0x37/0xa0 [ttm a469cf7fcb6737fdcf3fb5cdbcc8b1ca41f3e302] [ 604.814433] ttm_bo_handle_move_mem+0x157/0x170 [ttm a469cf7fcb6737fdcf3fb5cdbcc8b1ca41f3e302] [ 604.814443] ttm_bo_validate+0xd9/0x180 [ttm a469cf7fcb6737fdcf3fb5cdbcc8b1ca41f3e302] [ 604.814453] ttm_bo_init_reserved+0xa0/0x1b0 [ttm a469cf7fcb6737fdcf3fb5cdbcc8b1ca41f3e302] [ 604.814461] ? srso_return_thunk+0x5/0x5f [ 604.814469] amdgpu_bo_create+0x1f5/0x500 [amdgpu 361516226706227f4403914dbfdd3f90996136ca] [ 604.814855] ? __pfx_amdgpu_bo_user_destroy+0x10/0x10 [amdgpu 361516226706227f4403914dbfdd3f90996136ca] [ 604.815182] amdgpu_bo_create_user+0x3d/0x70 [amdgpu 361516226706227f4403914dbfdd3f90996136ca] [ 604.815504] amdgpu_gem_create_ioctl+0x16c/0x3b0 [amdgpu 361516226706227f4403914dbfdd3f90996136ca] [ 604.815830] ? __pfx_amdgpu_bo_user_destroy+0x10/0x10 [amdgpu 361516226706227f4403914dbfdd3f90996136ca] [ 604.816155] ? __pfx_amdgpu_gem_create_ioctl+0x10/0x10 [amdgpu 361516226706227f4403914dbfdd3f90996136ca] [ 604.816478] drm_ioctl_kernel+0xae/0x100 [ 604.816486] drm_ioctl+0x283/0x510 [ 604.816491] ? __pfx_amdgpu_gem_create_ioctl+0x10/0x10 [amdgpu 361516226706227f4403914dbfdd3f90996136ca] [ 604.816819] amdgpu_drm_ioctl+0x4a/0x80 [amdgpu 361516226706227f4403914dbfdd3f90996136ca] [ 604.817135] __x64_sys_ioctl+0x96/0xe0 [ 604.817142] do_syscall_64+0x11b/0x7e0 [ 604.817148] ? srso_return_thunk+0x5/0x5f [ 604.817152] ? srso_return_thunk+0x5/0x5f [ 604.817156] ? walk_system_ram_range+0xb0/0x110 [ 604.817161] ? srso_return_thunk+0x5/0x5f [ 604.817165] ? __pte_offset_map+0x1b/0xb0 [ 604.817170] ? srso_return_thunk+0x5/0x5f [ 604.817174] ? pte_offset_map_lock+0x87/0xf0 [ 604.817179] ? srso_return_thunk+0x5/0x5f [ 604.817183] ? insert_pfn+0x9f/0x1f0 [ 604.817188] ? srso_return_thunk+0x5/0x5f [ 604.817192] ? vmf_insert_pfn_prot+0x97/0x190 [ 604.817197] ? srso_return_thunk+0x5/0x5f [ 604.817201] ? ttm_bo_vm_fault_reserved+0x1a6/0x3f0 [ttm a469cf7fcb6737fdcf3fb5cdbcc8b1ca41f3e302] [ 604.817213] ? srso_return_thunk+0x5/0x5f [ 604.817217] ? amdgpu_gem_fault+0xe2/0x100 [amdgpu 361516226706227f4403914dbfdd3f90996136ca] [ 604.817542] ? srso_return_thunk+0x5/0x5f [ 604.817546] ? __do_fault+0x33/0x180 [ 604.817550] ? srso_return_thunk+0x5/0x5f [ 604.817554] ? do_fault+0x178/0x610 [ 604.817559] ? srso_return_thunk+0x5/0x5f [ 604.817562] ? __handle_mm_fault+0x9be/0x1120 [ 604.817567] ? srso_return_thunk+0x5/0x5f [ 604.817574] ? srso_return_thunk+0x5/0x5f [ 604.817578] ? count_memcg_events+0xc4/0x160 [ 604.817583] ? srso_return_thunk+0x5/0x5f [ 604.817587] ? handle_mm_fault+0x1d7/0x2e0 [ 604.817593] ? srso_return_thunk+0x5/0x5f [ 604.817596] ? do_user_addr_fault+0x173/0x660 [ 604.817602] ? srso_return_thunk+0x5/0x5f [ 604.817607] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 604.817612] RIP: 0033:0x7f00532cef4d [ 604.817617] Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00 [ 604.817620] RSP: 002b:00007ffd69ab0650 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 604.817624] RAX: ffffffffffffffda RBX: 00007ffd69ab07d0 RCX: 00007f00532cef4d [ 604.817627] RDX: 00007ffd69ab0700 RSI: 00000000c0206440 RDI: 0000000000000005 [ 604.817629] RBP: 00007ffd69ab06a0 R08: 00007f00533a0ac0 R09: 0000000000000000 [ 604.817632] R10: 00007ffd69ab07c0 R11: 0000000000000246 R12: 00007ffd69ab0700 [ 604.817634] R13: 00000000c0206440 R14: 0000000000000005 R15: 0000000000000243 [ 604.817642] </TASK> Cc: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: Christian König <christian.koenig@amd.com> Fixes: c06da4b ("drm/ttm: Tidy usage of local variables a little bit") Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net> Link: https://lore.kernel.org/r/20260304-ttm_bo_res_uaf-v1-1-43f20125b67f@igalia.com
sgaud-quic
pushed a commit
that referenced
this pull request
Mar 10, 2026
Many ethernet drivers report xdp Rx queue frag size as being the same as DMA write size. However, the only user of this field, namely bpf_xdp_frags_increase_tail(), clearly expects a truesize. Such difference leads to unspecific memory corruption issues under certain circumstances, e.g. in ixgbevf maximum DMA write size is 3 KB, so when running xskxceiver's XDP_ADJUST_TAIL_GROW_MULTI_BUFF, 6K packet fully uses all DMA-writable space in 2 buffers. This would be fine, if only rxq->frag_size was properly set to 4K, but value of 3K results in a negative tailroom, because there is a non-zero page offset. We are supposed to return -EINVAL and be done with it in such case, but due to tailroom being stored as an unsigned int, it is reported to be somewhere near UINT_MAX, resulting in a tail being grown, even if the requested offset is too much (it is around 2K in the abovementioned test). This later leads to all kinds of unspecific calltraces. [ 7340.337579] xskxceiver[1440]: segfault at 1da718 ip 00007f4161aeac9d sp 00007f41615a6a00 error 6 [ 7340.338040] xskxceiver[1441]: segfault at 7f410000000b ip 00000000004042b5 sp 00007f415bffecf0 error 4 [ 7340.338179] in libc.so.6[61c9d,7f4161aaf000+160000] [ 7340.339230] in xskxceiver[42b5,400000+69000] [ 7340.340300] likely on CPU 6 (core 0, socket 6) [ 7340.340302] Code: ff ff 01 e9 f4 fe ff ff 0f 1f 44 00 00 4c 39 f0 74 73 31 c0 ba 01 00 00 00 f0 0f b1 17 0f 85 ba 00 00 00 49 8b 87 88 00 00 00 <4c> 89 70 08 eb cc 0f 1f 44 00 00 48 8d bd f0 fe ff ff 89 85 ec fe [ 7340.340888] likely on CPU 3 (core 0, socket 3) [ 7340.345088] Code: 00 00 00 ba 00 00 00 00 be 00 00 00 00 89 c7 e8 31 ca ff ff 89 45 ec 8b 45 ec 85 c0 78 07 b8 00 00 00 00 eb 46 e8 0b c8 ff ff <8b> 00 83 f8 69 74 24 e8 ff c7 ff ff 8b 00 83 f8 0b 74 18 e8 f3 c7 [ 7340.404334] Oops: general protection fault, probably for non-canonical address 0x6d255010bdffc: 0000 [#1] SMP NOPTI [ 7340.405972] CPU: 7 UID: 0 PID: 1439 Comm: xskxceiver Not tainted 6.19.0-rc1+ #21 PREEMPT(lazy) [ 7340.408006] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.17.0-5.fc42 04/01/2014 [ 7340.409716] RIP: 0010:lookup_swap_cgroup_id+0x44/0x80 [ 7340.410455] Code: 83 f8 1c 73 39 48 ba ff ff ff ff ff ff ff 03 48 8b 04 c5 20 55 fa bd 48 21 d1 48 89 ca 83 e1 01 48 d1 ea c1 e1 04 48 8d 04 90 <8b> 00 48 83 c4 10 d3 e8 c3 cc cc cc cc 31 c0 e9 98 b7 dd 00 48 89 [ 7340.412787] RSP: 0018:ffffcc5c04f7f6d0 EFLAGS: 00010202 [ 7340.413494] RAX: 0006d255010bdffc RBX: ffff891f477895a8 RCX: 0000000000000010 [ 7340.414431] RDX: 0001c17e3fffffff RSI: 00fa070000000000 RDI: 000382fc7fffffff [ 7340.415354] RBP: 00fa070000000000 R08: ffffcc5c04f7f8f8 R09: ffffcc5c04f7f7d0 [ 7340.416283] R10: ffff891f4c1a7000 R11: ffffcc5c04f7f9c8 R12: ffffcc5c04f7f7d0 [ 7340.417218] R13: 03ffffffffffffff R14: 00fa06fffffffe00 R15: ffff891f47789500 [ 7340.418229] FS: 0000000000000000(0000) GS:ffff891ffdfaa000(0000) knlGS:0000000000000000 [ 7340.419489] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 7340.420286] CR2: 00007f415bfffd58 CR3: 0000000103f03002 CR4: 0000000000772ef0 [ 7340.421237] PKRU: 55555554 [ 7340.421623] Call Trace: [ 7340.421987] <TASK> [ 7340.422309] ? softleaf_from_pte+0x77/0xa0 [ 7340.422855] swap_pte_batch+0xa7/0x290 [ 7340.423363] zap_nonpresent_ptes.constprop.0.isra.0+0xd1/0x270 [ 7340.424102] zap_pte_range+0x281/0x580 [ 7340.424607] zap_pmd_range.isra.0+0xc9/0x240 [ 7340.425177] unmap_page_range+0x24d/0x420 [ 7340.425714] unmap_vmas+0xa1/0x180 [ 7340.426185] exit_mmap+0xe1/0x3b0 [ 7340.426644] __mmput+0x41/0x150 [ 7340.427098] exit_mm+0xb1/0x110 [ 7340.427539] do_exit+0x1b2/0x460 [ 7340.427992] do_group_exit+0x2d/0xc0 [ 7340.428477] get_signal+0x79d/0x7e0 [ 7340.428957] arch_do_signal_or_restart+0x34/0x100 [ 7340.429571] exit_to_user_mode_loop+0x8e/0x4c0 [ 7340.430159] do_syscall_64+0x188/0x6b0 [ 7340.430672] ? __do_sys_clone3+0xd9/0x120 [ 7340.431212] ? switch_fpu_return+0x4e/0xd0 [ 7340.431761] ? arch_exit_to_user_mode_prepare.isra.0+0xa1/0xc0 [ 7340.432498] ? do_syscall_64+0xbb/0x6b0 [ 7340.433015] ? __handle_mm_fault+0x445/0x690 [ 7340.433582] ? count_memcg_events+0xd6/0x210 [ 7340.434151] ? handle_mm_fault+0x212/0x340 [ 7340.434697] ? do_user_addr_fault+0x2b4/0x7b0 [ 7340.435271] ? clear_bhb_loop+0x30/0x80 [ 7340.435788] ? clear_bhb_loop+0x30/0x80 [ 7340.436299] ? clear_bhb_loop+0x30/0x80 [ 7340.436812] ? clear_bhb_loop+0x30/0x80 [ 7340.437323] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 7340.437973] RIP: 0033:0x7f4161b14169 [ 7340.438468] Code: Unable to access opcode bytes at 0x7f4161b1413f. [ 7340.439242] RSP: 002b:00007ffc6ebfa770 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca [ 7340.440173] RAX: fffffffffffffe00 RBX: 00000000000005a1 RCX: 00007f4161b14169 [ 7340.441061] RDX: 00000000000005a1 RSI: 0000000000000109 RDI: 00007f415bfff990 [ 7340.441943] RBP: 00007ffc6ebfa7a0 R08: 0000000000000000 R09: 00000000ffffffff [ 7340.442824] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [ 7340.443707] R13: 0000000000000000 R14: 00007f415bfff990 R15: 00007f415bfff6c0 [ 7340.444586] </TASK> [ 7340.444922] Modules linked in: rfkill intel_rapl_msr intel_rapl_common intel_uncore_frequency_common skx_edac_common nfit libnvdimm kvm_intel vfat fat kvm snd_pcm irqbypass rapl iTCO_wdt snd_timer intel_pmc_bxt iTCO_vendor_support snd ixgbevf virtio_net soundcore i2c_i801 pcspkr libeth_xdp net_failover i2c_smbus lpc_ich failover libeth virtio_balloon joydev 9p fuse loop zram lz4hc_compress lz4_compress 9pnet_virtio 9pnet netfs ghash_clmulni_intel serio_raw qemu_fw_cfg [ 7340.449650] ---[ end trace 0000000000000000 ]--- The issue can be fixed in all in-tree drivers, but we cannot just trust OOT drivers to not do this. Therefore, make tailroom a signed int and produce a warning when it is negative to prevent such mistakes in the future. Fixes: bf25146 ("bpf: add frags support to the bpf_xdp_adjust_tail() API") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://patch.msgid.link/20260305111253.2317394-10-larysa.zaremba@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
sgaud-quic
pushed a commit
that referenced
this pull request
Mar 10, 2026
Larysa Zaremba says: ==================== Address XDP frags having negative tailroom Aside from the issue described below, tailroom calculation does not account for pages being split between frags, e.g. in i40e, enetc and AF_XDP ZC with smaller chunks. These series address the problem by calculating modulo (skb_frag_off() % rxq->frag_size) in order to get data offset within a smaller block of memory. Please note, xskxceiver tail grow test passes without modulo e.g. in xdpdrv mode on i40e, because there is not enough descriptors to get to flipped buffers. Many ethernet drivers report xdp Rx queue frag size as being the same as DMA write size. However, the only user of this field, namely bpf_xdp_frags_increase_tail(), clearly expects a truesize. Such difference leads to unspecific memory corruption issues under certain circumstances, e.g. in ixgbevf maximum DMA write size is 3 KB, so when running xskxceiver's XDP_ADJUST_TAIL_GROW_MULTI_BUFF, 6K packet fully uses all DMA-writable space in 2 buffers. This would be fine, if only rxq->frag_size was properly set to 4K, but value of 3K results in a negative tailroom, because there is a non-zero page offset. We are supposed to return -EINVAL and be done with it in such case, but due to tailroom being stored as an unsigned int, it is reported to be somewhere near UINT_MAX, resulting in a tail being grown, even if the requested offset is too much(it is around 2K in the abovementioned test). This later leads to all kinds of unspecific calltraces. [ 7340.337579] xskxceiver[1440]: segfault at 1da718 ip 00007f4161aeac9d sp 00007f41615a6a00 error 6 [ 7340.338040] xskxceiver[1441]: segfault at 7f410000000b ip 00000000004042b5 sp 00007f415bffecf0 error 4 [ 7340.338179] in libc.so.6[61c9d,7f4161aaf000+160000] [ 7340.339230] in xskxceiver[42b5,400000+69000] [ 7340.340300] likely on CPU 6 (core 0, socket 6) [ 7340.340302] Code: ff ff 01 e9 f4 fe ff ff 0f 1f 44 00 00 4c 39 f0 74 73 31 c0 ba 01 00 00 00 f0 0f b1 17 0f 85 ba 00 00 00 49 8b 87 88 00 00 00 <4c> 89 70 08 eb cc 0f 1f 44 00 00 48 8d bd f0 fe ff ff 89 85 ec fe [ 7340.340888] likely on CPU 3 (core 0, socket 3) [ 7340.345088] Code: 00 00 00 ba 00 00 00 00 be 00 00 00 00 89 c7 e8 31 ca ff ff 89 45 ec 8b 45 ec 85 c0 78 07 b8 00 00 00 00 eb 46 e8 0b c8 ff ff <8b> 00 83 f8 69 74 24 e8 ff c7 ff ff 8b 00 83 f8 0b 74 18 e8 f3 c7 [ 7340.404334] Oops: general protection fault, probably for non-canonical address 0x6d255010bdffc: 0000 [#1] SMP NOPTI [ 7340.405972] CPU: 7 UID: 0 PID: 1439 Comm: xskxceiver Not tainted 6.19.0-rc1+ #21 PREEMPT(lazy) [ 7340.408006] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.17.0-5.fc42 04/01/2014 [ 7340.409716] RIP: 0010:lookup_swap_cgroup_id+0x44/0x80 [ 7340.410455] Code: 83 f8 1c 73 39 48 ba ff ff ff ff ff ff ff 03 48 8b 04 c5 20 55 fa bd 48 21 d1 48 89 ca 83 e1 01 48 d1 ea c1 e1 04 48 8d 04 90 <8b> 00 48 83 c4 10 d3 e8 c3 cc cc cc cc 31 c0 e9 98 b7 dd 00 48 89 [ 7340.412787] RSP: 0018:ffffcc5c04f7f6d0 EFLAGS: 00010202 [ 7340.413494] RAX: 0006d255010bdffc RBX: ffff891f477895a8 RCX: 0000000000000010 [ 7340.414431] RDX: 0001c17e3fffffff RSI: 00fa070000000000 RDI: 000382fc7fffffff [ 7340.415354] RBP: 00fa070000000000 R08: ffffcc5c04f7f8f8 R09: ffffcc5c04f7f7d0 [ 7340.416283] R10: ffff891f4c1a7000 R11: ffffcc5c04f7f9c8 R12: ffffcc5c04f7f7d0 [ 7340.417218] R13: 03ffffffffffffff R14: 00fa06fffffffe00 R15: ffff891f47789500 [ 7340.418229] FS: 0000000000000000(0000) GS:ffff891ffdfaa000(0000) knlGS:0000000000000000 [ 7340.419489] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 7340.420286] CR2: 00007f415bfffd58 CR3: 0000000103f03002 CR4: 0000000000772ef0 [ 7340.421237] PKRU: 55555554 [ 7340.421623] Call Trace: [ 7340.421987] <TASK> [ 7340.422309] ? softleaf_from_pte+0x77/0xa0 [ 7340.422855] swap_pte_batch+0xa7/0x290 [ 7340.423363] zap_nonpresent_ptes.constprop.0.isra.0+0xd1/0x270 [ 7340.424102] zap_pte_range+0x281/0x580 [ 7340.424607] zap_pmd_range.isra.0+0xc9/0x240 [ 7340.425177] unmap_page_range+0x24d/0x420 [ 7340.425714] unmap_vmas+0xa1/0x180 [ 7340.426185] exit_mmap+0xe1/0x3b0 [ 7340.426644] __mmput+0x41/0x150 [ 7340.427098] exit_mm+0xb1/0x110 [ 7340.427539] do_exit+0x1b2/0x460 [ 7340.427992] do_group_exit+0x2d/0xc0 [ 7340.428477] get_signal+0x79d/0x7e0 [ 7340.428957] arch_do_signal_or_restart+0x34/0x100 [ 7340.429571] exit_to_user_mode_loop+0x8e/0x4c0 [ 7340.430159] do_syscall_64+0x188/0x6b0 [ 7340.430672] ? __do_sys_clone3+0xd9/0x120 [ 7340.431212] ? switch_fpu_return+0x4e/0xd0 [ 7340.431761] ? arch_exit_to_user_mode_prepare.isra.0+0xa1/0xc0 [ 7340.432498] ? do_syscall_64+0xbb/0x6b0 [ 7340.433015] ? __handle_mm_fault+0x445/0x690 [ 7340.433582] ? count_memcg_events+0xd6/0x210 [ 7340.434151] ? handle_mm_fault+0x212/0x340 [ 7340.434697] ? do_user_addr_fault+0x2b4/0x7b0 [ 7340.435271] ? clear_bhb_loop+0x30/0x80 [ 7340.435788] ? clear_bhb_loop+0x30/0x80 [ 7340.436299] ? clear_bhb_loop+0x30/0x80 [ 7340.436812] ? clear_bhb_loop+0x30/0x80 [ 7340.437323] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 7340.437973] RIP: 0033:0x7f4161b14169 [ 7340.438468] Code: Unable to access opcode bytes at 0x7f4161b1413f. [ 7340.439242] RSP: 002b:00007ffc6ebfa770 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca [ 7340.440173] RAX: fffffffffffffe00 RBX: 00000000000005a1 RCX: 00007f4161b14169 [ 7340.441061] RDX: 00000000000005a1 RSI: 0000000000000109 RDI: 00007f415bfff990 [ 7340.441943] RBP: 00007ffc6ebfa7a0 R08: 0000000000000000 R09: 00000000ffffffff [ 7340.442824] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [ 7340.443707] R13: 0000000000000000 R14: 00007f415bfff990 R15: 00007f415bfff6c0 [ 7340.444586] </TASK> [ 7340.444922] Modules linked in: rfkill intel_rapl_msr intel_rapl_common intel_uncore_frequency_common skx_edac_common nfit libnvdimm kvm_intel vfat fat kvm snd_pcm irqbypass rapl iTCO_wdt snd_timer intel_pmc_bxt iTCO_vendor_support snd ixgbevf virtio_net soundcore i2c_i801 pcspkr libeth_xdp net_failover i2c_smbus lpc_ich failover libeth virtio_balloon joydev 9p fuse loop zram lz4hc_compress lz4_compress 9pnet_virtio 9pnet netfs ghash_clmulni_intel serio_raw qemu_fw_cfg [ 7340.449650] ---[ end trace 0000000000000000 ]--- The issue can be fixed in all in-tree drivers, but we cannot just trust OOT drivers to not do this. Therefore, make tailroom a signed int and produce a warning when it is negative to prevent such mistakes in the future. The issue can also be easily reproduced with ice driver, by applying the following diff to xskxceiver and enjoying a kernel panic in xdpdrv mode: diff --git a/tools/testing/selftests/bpf/prog_tests/test_xsk.c b/tools/testing/selftests/bpf/prog_tests/test_xsk.c index 5af28f3..042d587fa7ef 100644 --- a/tools/testing/selftests/bpf/prog_tests/test_xsk.c +++ b/tools/testing/selftests/bpf/prog_tests/test_xsk.c @@ -2541,8 +2541,8 @@ int testapp_adjust_tail_grow_mb(struct test_spec *test) { test->mtu = MAX_ETH_JUMBO_SIZE; /* Grow by (frag_size - last_frag_Size) - 1 to stay inside the last fragment */ - return testapp_adjust_tail(test, (XSK_UMEM__MAX_FRAME_SIZE / 2) - 1, - XSK_UMEM__LARGE_FRAME_SIZE * 2); + return testapp_adjust_tail(test, XSK_UMEM__MAX_FRAME_SIZE * 100, + 6912); } int testapp_tx_queue_consumer(struct test_spec *test) If we print out the values involved in the tailroom calculation: tailroom = rxq->frag_size - skb_frag_size(frag) - skb_frag_off(frag); 4294967040 = 3456 - 3456 - 256 I personally reproduced and verified the issue in ice and i40e, aside from WiP ixgbevf implementation. ==================== Link: https://patch.msgid.link/20260305111253.2317394-1-larysa.zaremba@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
shuaz-shuai
pushed a commit
to shuaz-shuai/kernel-topics
that referenced
this pull request
May 12, 2026
The following race is possible with migration swap entries or
device-private THP entries. e.g. when move_pages is called on a PMD THP
page, then there maybe an intermediate state, where PMD entry acts as
a migration swap entry (pmd_present() is true). Then if an munmap
happens at the same time, then this VM_BUG_ON() can happen in
pmdp_huge_get_and_clear_full().
This patch fixes that.
Thread A: move_pages() syscall
add_folio_for_migration()
mmap_read_lock(mm)
folio_isolate_lru(folio)
mmap_read_unlock(mm)
do_move_pages_to_node()
migrate_pages()
try_to_migrate_one()
spin_lock(ptl)
set_pmd_migration_entry()
pmdp_invalidate() # PMD: _PAGE_INVALID | _PAGE_PTE | pfn
set_pmd_at() # PMD: migration swap entry (pmd_present=0)
spin_unlock(ptl)
[page copy phase] # <--- RACE WINDOW -->
Thread B: munmap()
mmap_write_downgrade(mm)
unmap_vmas() -> zap_pmd_range()
zap_huge_pmd()
__pmd_trans_huge_lock()
pmd_is_huge(): # !pmd_present && !pmd_none -> TRUE (swap entry)
pmd_lock() -> # spin_lock(ptl), waits for Thread A to release ptl
pmdp_huge_get_and_clear_full()
VM_BUG_ON(!pmd_present(*pmdp)) # HITS!
[ 287.738700][ T1867] ------------[ cut here ]------------
[ 287.743843][ T1867] kernel BUG at arch/powerpc/mm/book3s64/pgtable.c:187!
cpu 0x0: Vector: 700 (Program Check) at [c00000044037f4f0]
pc: c000000000094ca4: pmdp_huge_get_and_clear_full+0x6c/0x23c
lr: c000000000645dec: zap_huge_pmd+0xb0/0x868
sp: c00000044037f790
msr: 800000000282b033
current = 0xc0000004032c1a00
paca = 0xc000000004fe0000 irqmask: 0x03 irq_happened: 0x09
pid = 1867, comm = a.out
kernel BUG at :187!
Linux version 6.19.0-12136-g14360d4f917c-dirty (powerpc64le-linux-gnu-gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) qualcomm-linux#27 SMP PREEMPT Sun Feb 22 10:38:56 IST 2026
enter ? for help
[link register ] c000000000645dec zap_huge_pmd+0xb0/0x868
[c00000044037f790] c00000044037f7d0 (unreliable)
[c00000044037f7d0] c000000000645dcc zap_huge_pmd+0x90/0x868
[c00000044037f840] c0000000005724cc unmap_page_range+0x176c/0x1f40
[c00000044037fa00] c000000000572ea0 unmap_vmas+0xb0/0x1d8
[c00000044037fa90] c0000000005af254 unmap_region+0xb4/0x128
[c00000044037fb50] c0000000005af400 vms_complete_munmap_vmas+0x138/0x310
[c00000044037fbe0] c0000000005b0f1c do_vmi_align_munmap+0x1ec/0x238
[c00000044037fd30] c0000000005b3688 __vm_munmap+0x170/0x1f8
[c00000044037fdf0] c000000000587f74 sys_munmap+0x2c/0x40
[c00000044037fe10] c000000000032668 system_call_exception+0x128/0x350
[c00000044037fe50] c00000000000d05c system_call_vectored_common+0x15c/0x2ec
---- Exception: 3000 (System Call Vectored) at 0000000010064a2c
SP (7fff9b1ee9c0) is in userspace
0:mon> zh
commit a30b48b ("mm/migrate_device: implement THP migration of zone device pages"),
enabled migration for device-private PMD entries. Hence this is one
other path where this warning could get trigger from.
------------[ cut here ]------------
WARNING: arch/powerpc/mm/book3s64/hash_pgtable.c:199 at hash__pmd_hugepage_update+0x48/0x284, CPU#3: hmm-tests/1905
Modules linked in: test_hmm
CPU: 3 UID: 0 PID: 1905 Comm: hmm-tests Tainted: G B W L N 7.0.0-rc1-01438-g7e2f0ee7581c qualcomm-linux#21 PREEMPT
Tainted: [B]=BAD_PAGE, [W]=WARN, [L]=SOFTLOCKUP, [N]=TEST
Hardware name: IBM pSeries (emulated by qemu) POWER10 (architected) 0x801200 0xf000006 of:SLOF,git-ee03ae pSeries
NIP [c000000000096b70] hash__pmd_hugepage_update+0x48/0x284
LR [c000000000096e7c] hash__pmdp_huge_get_and_clear+0xd0/0xd4
Call Trace:
[c000000604707670] [c000000004e102b8] 0xc000000004e102b8 (unreliable)
[c000000604707700] [c00000000064ec3c] set_pmd_migration_entry+0x414/0x498
[c000000604707760] [c00000000063e5a4] migrate_vma_collect_pmd+0x12e8/0x16c4
[c000000604707890] [c00000000059282c] walk_pgd_range+0x7fc/0xd2c
[c000000604707990] [c000000000592e40] __walk_page_range+0xe4/0x2ac
[c000000604707a10] [c000000000593534] walk_page_range_mm_unsafe+0x204/0x2a4
[c000000604707ab0] [c00000000063af10] migrate_vma_setup+0x1dc/0x2e8
[c000000604707b10] [c008000006a21838] dmirror_migrate_to_system.constprop.0+0x210/0x4b0 [test_hmm]
[c000000604707c30] [c008000006a245b0] dmirror_fops_unlocked_ioctl+0x454/0xa5c [test_hmm]
[c000000604707d20] [c0000000006aab84] sys_ioctl+0x4ec/0x1178
[c000000604707e10] [c0000000000326a8] system_call_exception+0x128/0x350
[c000000604707e50] [c00000000000d05c] system_call_vectored_common+0x15c/0x2ec
---- interrupt: 3000 at 0x7fffbe44f50c
Fixes: 75358ea ("powerpc/mm/book3s64: Fix MADV_DONTNEED and parallel page fault race")
Fixes: a30b48b ("mm/migrate_device: implement THP migration of zone device pages")
Reported-by: Pavithra Prakash <pavrampu@linux.vnet.ibm.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/9437e5ef28d1e2f5cbdd7f8286350ce93c1d43c5.1773078178.git.ritesh.list@gmail.com
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Don't merge