syzbot has found a reproducer for the following issue on:
HEAD commit: 9bacdd8996c7 Merge tag 'for-6.7-rc1-tag' of git://git.kern.. git tree: upstream console+strace: https://syzkaller.appspot.com/x/log.txt?x=13e932ff680000 kernel config: https://syzkaller.appspot.com/x/.config?x=d05dd66e2eb2c872 dashboard link: https://syzkaller.appspot.com/bug?extid=225bfad78b079744fd5e compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40 syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1041f91f680000 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=10cc7b98e80000
Downloadable assets: disk image: https://storage.googleapis.com/syzbot-assets/8e9d5e2b6665/disk-9bacdd89.raw.... vmlinux: https://storage.googleapis.com/syzbot-assets/b8ee67db540d/vmlinux-9bacdd89.x... kernel image: https://storage.googleapis.com/syzbot-assets/3477230ef7a9/bzImage-9bacdd89.x...
The issue was bisected to:
commit c2368b19807affd7621f7c4638cd2e17fec13021 Author: Jiri Pirko jiri@nvidia.com Date: Fri Jul 29 07:10:35 2022 +0000
net: devlink: introduce "unregistering" mark and use it during devlinks iteration
bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=1758e1e3680000 final oops: https://syzkaller.appspot.com/x/report.txt?x=14d8e1e3680000 console output: https://syzkaller.appspot.com/x/log.txt?x=10d8e1e3680000
IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+225bfad78b079744fd5e@syzkaller.appspotmail.com Fixes: c2368b19807a ("net: devlink: introduce "unregistering" mark and use it during devlinks iteration")
rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: rcu: 0-...!: (1 ticks this GP) idle=3b94/1/0x4000000000000000 softirq=6057/6057 fqs=9 rcu: (detected by 1, t=10502 jiffies, g=6949, q=188 ncpus=2) Sending NMI from CPU 1 to CPUs 0: NMI backtrace for cpu 0 CPU: 0 PID: 8 Comm: kworker/0:0 Not tainted 6.7.0-rc1-syzkaller-00012-g9bacdd8996c7 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023 Workqueue: events_power_efficient gc_worker RIP: 0010:pv_queued_spin_unlock arch/x86/include/asm/paravirt.h:591 [inline] RIP: 0010:queued_spin_unlock arch/x86/include/asm/qspinlock.h:57 [inline] RIP: 0010:do_raw_spin_unlock+0x117/0x8b0 kernel/locking/spinlock_debug.c:141 Code: 49 c7 45 00 ff ff ff ff 0f b6 04 2b 84 c0 0f 85 c9 03 00 00 41 c7 06 ff ff ff ff 48 c7 c0 60 b8 79 8d 48 c1 e8 03 80 3c 28 00 <74> 0c 48 c7 c7 60 b8 79 8d e8 9b d3 7b 00 48 83 3d 73 30 0b 0c 00 RSP: 0018:ffffc90000007c20 EFLAGS: 00000046 RAX: 1ffffffff1af370c RBX: 1ffff110042eac5e RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff8880217562e8 RBP: dffffc0000000000 R08: ffff8880217562eb R09: 1ffff110042eac5d R10: dffffc0000000000 R11: ffffed10042eac5e R12: 1ffff110042eac5f R13: ffff8880217562f8 R14: ffff8880217562f0 R15: ffff8880217562e8 FS: 0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000600 CR3: 000000000d730000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <NMI> </NMI> <IRQ> __raw_spin_unlock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_unlock+0x1e/0x40 kernel/locking/spinlock.c:186 spin_unlock include/linux/spinlock.h:391 [inline] advance_sched+0x9bd/0xcb0 net/sched/sch_taprio.c:992 __run_hrtimer kernel/time/hrtimer.c:1688 [inline] __hrtimer_run_queues+0x59f/0xd20 kernel/time/hrtimer.c:1752 hrtimer_interrupt+0x396/0x980 kernel/time/hrtimer.c:1814 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1065 [inline] __sysvec_apic_timer_interrupt+0x104/0x3a0 arch/x86/kernel/apic/apic.c:1082 sysvec_apic_timer_interrupt+0x92/0xb0 arch/x86/kernel/apic/apic.c:1076 </IRQ> <TASK> asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645 RIP: 0010:lock_acquire+0x25a/0x530 kernel/locking/lockdep.c:5757 Code: 2b 00 74 08 4c 89 f7 e8 04 33 7d 00 f6 44 24 61 02 0f 85 8a 01 00 00 41 f7 c7 00 02 00 00 74 01 fb 48 c7 44 24 40 0e 36 e0 45 <4b> c7 44 25 00 00 00 00 00 43 c7 44 25 09 00 00 00 00 43 c7 44 25 RSP: 0018:ffffc900000d7940 EFLAGS: 00000206 RAX: 0000000000000001 RBX: 1ffff9200001af34 RCX: 0000000000000001 RDX: dffffc0000000000 RSI: ffffffff8b6ac0c0 RDI: ffffffff8bbdf300 RBP: ffffc900000d7a88 R08: ffffffff90dd4367 R09: 1ffffffff21ba86c R10: dffffc0000000000 R11: fffffbfff21ba86d R12: 1ffff9200001af30 R13: dffffc0000000000 R14: ffffc900000d79a0 R15: 0000000000000246 rcu_lock_acquire include/linux/rcupdate.h:301 [inline] rcu_read_lock include/linux/rcupdate.h:747 [inline] gc_worker+0x28c/0x15a0 net/netfilter/nf_conntrack_core.c:1488 process_one_work kernel/workqueue.c:2630 [inline] process_scheduled_works+0x90f/0x1420 kernel/workqueue.c:2703 worker_thread+0xa5f/0x1000 kernel/workqueue.c:2784 kthread+0x2d3/0x370 kernel/kthread.c:388 ret_from_fork+0x48/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242 </TASK> INFO: NMI handler (nmi_cpu_backtrace_handler) took too long to run: 1.422 msecs rcu: rcu_preempt kthread starved for 9734 jiffies! g6949 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1 rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior. rcu: RCU grace-period kthread stack dump: task:rcu_preempt state:R running task stack:26576 pid:17 tgid:17 ppid:2 flags:0x00004000 Call Trace: <TASK> context_switch kernel/sched/core.c:5376 [inline] __schedule+0x1961/0x4ab0 kernel/sched/core.c:6688 __schedule_loop kernel/sched/core.c:6763 [inline] schedule+0x149/0x260 kernel/sched/core.c:6778 schedule_timeout+0x1bd/0x300 kernel/time/timer.c:2167 rcu_gp_fqs_loop+0x30a/0x1500 kernel/rcu/tree.c:1631 rcu_gp_kthread+0xa7/0x3b0 kernel/rcu/tree.c:1830 kthread+0x2d3/0x370 kernel/kthread.c:388 ret_from_fork+0x48/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242 </TASK> rcu: Stack dump where RCU GP kthread last ran: CPU: 1 PID: 1272 Comm: kworker/u4:6 Not tainted 6.7.0-rc1-syzkaller-00012-g9bacdd8996c7 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023 Workqueue: events_unbound toggle_allocation_gate RIP: 0010:csd_lock_wait kernel/smp.c:311 [inline] RIP: 0010:smp_call_function_many_cond+0x1832/0x2940 kernel/smp.c:855 Code: 45 8b 65 00 44 89 e6 83 e6 01 31 ff e8 97 88 0b 00 41 83 e4 01 49 bc 00 00 00 00 00 fc ff df 75 07 e8 d2 84 0b 00 eb 38 f3 90 <42> 0f b6 04 23 84 c0 75 11 41 f7 45 00 01 00 00 00 74 1e e8 b6 84 RSP: 0018:ffffc9000562f720 EFLAGS: 00000293 RAX: ffffffff8182f9fa RBX: 1ffff110173087c5 RCX: ffff8880201a0000 RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000 RBP: ffffc9000562f920 R08: ffffffff8182f9c9 R09: 1ffffffff21ba86c R10: dffffc0000000000 R11: fffffbfff21ba86d R12: dffffc0000000000 R13: ffff8880b9843e28 R14: ffff8880b993d480 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffe63960000 CR3: 000000000d730000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> </IRQ> <TASK> on_each_cpu_cond_mask+0x3f/0x80 kernel/smp.c:1023 on_each_cpu include/linux/smp.h:71 [inline] text_poke_sync arch/x86/kernel/alternative.c:2006 [inline] text_poke_bp_batch+0x352/0xb30 arch/x86/kernel/alternative.c:2216 text_poke_flush arch/x86/kernel/alternative.c:2407 [inline] text_poke_finish+0x30/0x50 arch/x86/kernel/alternative.c:2414 arch_jump_label_transform_apply+0x1c/0x30 arch/x86/kernel/jump_label.c:146 static_key_enable_cpuslocked+0x132/0x260 kernel/jump_label.c:205 static_key_enable+0x1a/0x20 kernel/jump_label.c:218 toggle_allocation_gate+0xb5/0x250 mm/kfence/core.c:830 process_one_work kernel/workqueue.c:2630 [inline] process_scheduled_works+0x90f/0x1420 kernel/workqueue.c:2703 worker_thread+0xa5f/0x1000 kernel/workqueue.c:2784 kthread+0x2d3/0x370 kernel/kthread.c:388 ret_from_fork+0x48/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242 </TASK>
--- If you want syzbot to run the reproducer, reply with: #syz test: git://repo/address.git branch-or-commit-hash If you attach or paste a git patch, syzbot will apply it before testing.
b.a.t.m.a.n@lists.open-mesh.org