Hi,
We found a bug in Linux 6.10 using syzkaller. It is probably a null pointer dereference bug. In line 307 of net/batman-adv/bridge_loop_avoidance.c, when executing "hash = backbone_gw->bat_priv->bla.claim_hash;", it does not check if "backbone_gw->bat_priv==NULL".
The bug report and syzkaller reproducer are as follows:
bug report:
Oops: general protection fault, probably for non-canonical address 0xdffffc000000004a: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000250-0x0000000000000257]
CPU: 0 PID: 45 Comm: kworker/u4:3 Not tainted 6.10.0 #13
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
Workqueue: bat_events batadv_bla_periodic_work
RIP: 0010:batadv_bla_del_backbone_claims+0x4e/0x360 net/batman-adv/bridge_loop_avoidance.c:307
Code: 18 48 83 c3 18 48 89 d8 48 c1 e8 03 42 80 3c 20 00 74 08 48 89 df e8 01 72 33 f7 bd 50 02 00 00 48 03 2b 48 89 e8 48 c1 e8 03 <42> 80 3c 20 00 74 08 48 89 ef e8 e3 71 33 f7 48 8b 6d 00 48 85 ed
RSP: 0018:ffffc9000090f9b0 EFLAGS: 00010202
RAX: 000000000000004a RBX: ffff88802cd7c018 RCX: ffff888015370000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88802cd7c000
RBP: 0000000000000250 R08: ffffffff8ac0433d R09: 1ffff110059af805
R10: dffffc0000000000 R11: ffffed10059af806 R12: dffffc0000000000
R13: ffff88802cd7c008 R14: 00000000ffffcf80 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff888063a00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000556956047f2c CR3: 000000000d932000 CR4: 0000000000350ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 batadv_bla_purge_backbone_gw+0x285/0x4c0 net/batman-adv/bridge_loop_avoidance.c:1254
 batadv_bla_periodic_work+0xc3/0xa80 net/batman-adv/bridge_loop_avoidance.c:1445
 process_one_work kernel/workqueue.c:3248 [inline]
 process_scheduled_works+0x977/0x1410 kernel/workqueue.c:3329
 worker_thread+0xaa0/0x1020 kernel/workqueue.c:3409
 kthread+0x2eb/0x380 kernel/kthread.c:389
 ret_from_fork+0x49/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:244
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:batadv_bla_del_backbone_claims+0x4e/0x360 net/batman-adv/bridge_loop_avoidance.c:307
Code: 18 48 83 c3 18 48 89 d8 48 c1 e8 03 42 80 3c 20 00 74 08 48 89 df e8 01 72 33 f7 bd 50 02 00 00 48 03 2b 48 89 e8 48 c1 e8 03 <42> 80 3c 20 00 74 08 48 89 ef e8 e3 71 33 f7 48 8b 6d 00 48 85 ed
RSP: 0018:ffffc9000090f9b0 EFLAGS: 00010202
RAX: 000000000000004a RBX: ffff88802cd7c018 RCX: ffff888015370000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88802cd7c000
RBP: 0000000000000250 R08: ffffffff8ac0433d R09: 1ffff110059af805
R10: dffffc0000000000 R11: ffffed10059af806 R12: dffffc0000000000
R13: ffff88802cd7c008 R14: 00000000ffffcf80 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff888063a00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000556956047f2c CR3: 000000000d932000 CR4: 0000000000350ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
----------------
Code disassembly (best guess):
   0: 18 48 83          sbb    %cl,-0x7d(%rax)
   3: c3                ret
   4: 18 48 89          sbb    %cl,-0x77(%rax)
   7: d8 48 c1          fmuls  -0x3f(%rax)
   a: e8 03 42 80 3c    call   0x3c804212
   f: 20 00             and    %al,(%rax)
  11: 74 08             je     0x1b
  13: 48 89 df          mov    %rbx,%rdi
  16: e8 01 72 33 f7    call   0xf733721c
  1b: bd 50 02 00 00    mov    $0x250,%ebp
  20: 48 03 2b          add    (%rbx),%rbp
  23: 48 89 e8          mov    %rbp,%rax
  26: 48 c1 e8 03       shr    $0x3,%rax
* 2a: 42 80 3c 20 00    cmpb   $0x0,(%rax,%r12,1) <-- trapping instruction
  2f: 74 08             je     0x39
  31: 48 89 ef          mov    %rbp,%rdi
  34: e8 e3 71 33 f7    call   0xf733721c
  39: 48 8b 6d 00       mov    0x0(%rbp),%rbp
  3d: 48 85 ed          test   %rbp,%rbp
Syzkaller reproducer:
# {Threaded:false Repeat:true RepeatTimes:0 Procs:1 Slowdown:1 Sandbox:none SandboxArg:0 Leak:false NetInjection:false NetDevices:true NetReset:false Cgroups:false BinfmtMisc:true CloseFDs:true KCSAN:false DevlinkPCI:false NicVF:false USB:true VhciInjection:false Wifi:true IEEE802154:false Sysctl:false Swap:true UseTmpDir:true HandleSegv:true Trace:false LegacyOptions:{Collide:false Fault:false FaultCall:0 FaultNth:0}}
write$syz_spec_1342568572_346(0xffffffffffffffff, &(0x7f0000000080)={{0x0, 0x4, 0x6}, {0x5, 0x0, 0x111, 0xe, "c2beae5c4e"}}, 0x20)
write$syz_spec_18446744072532934322_80(0xffffffffffffffff, &(0x7f0000000000)="2b952480c7ca55097d1707935ba64b20f3026c03d658026b81bf264340512b3cb4e01afda2de754299ea7a113343ab7b9bda2fc0a2e2cdbfecbca0233a0772b12ebde5d98a1203cb871672dff7e4c86ec1dccef0a76312fbe8d45dc2bd0f8fc2ebeb2a6be6a300916c5281da2c1ef64d66267091b82429976c019da3645557ed1d439c5a637f6bf58c53bc414539dd87c69098d671402586b631f9ac5c2fe9cedc281a6f005b5c4d1dd5ed9be400", 0xb4)
r0 = syz_open_dev$sg(&(0x7f0000000180), 0x0, 0x109400)
ioctl$syz_spec_1724254976_2866(r0, 0x1, &(0x7f0000000080)={0x0, 0x2, [0x85, 0x8, 0x15, 0xd]})
ioctl$TIOCSTI(0xffffffffffffffff, 0x5412, 0x0)
openat$ttynull(0xffffffffffffff9c, &(0x7f00000000c0), 0x109841, 0x0)
r1 = openat$ttynull(0xffffffffffffff9c, 0x0, 0x109841, 0x0)
ioctl$TIOCSTI(r1, 0x5412, 0x0)
syz_open_dev$tty20(0xc, 0x4, 0x1)
write$syz_spec_1342568572_233(0xffffffffffffffff, 0x0, 0x0)
ioctl$syz_spec_1101043199_396(0xffffffffffffffff, 0x80104d12, 0x0)
ioctl$syz_spec_1342803520_149(0xffffffffffffffff, 0x5501, 0xf9d)
write$syz_spec_18446744073706268967_8(0xffffffffffffffff, &(0x7f00000002c0)=0xfd80, 0xfffffc34)
ioctl$syz_spec_18446744073707301390_3197(0xffffffffffffffff, 0xc0a85320, 0x0)
ioctl$syz_spec_18446744073707301390_3092(0xffffffffffffffff, 0x40a85321, 0x0)
openat$ppp(0xffffffffffffff9c, &(0x7f0000000100), 0x200, 0x0)
mmap$IORING_OFF_SQ_RING(&(0x7f00003ff000/0xc00000)=nil, 0xc00000, 0xe, 0x9a172, 0xffffffffffffffff, 0x0)
mmap$IORING_OFF_SQES(&(0x7f0000000000/0xc00000)=nil, 0xc00000, 0x1000019, 0x42832, 0xffffffffffffffff, 0x10000000)
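As a reading aid for the register dump in the report above, here is a minimal Python sketch of the arithmetic behind the faulting address. It uses only values copied from the dump plus the usual x86-64 KASAN shadow mapping (shadow = (addr >> 3) + 0xdffffc0000000000). Reading 0x18 and 0x250 as member offsets for this particular build is an assumption, and so is treating RDI as still holding the function's first argument. The sketch says nothing about how the pointer came to be zero.

```python
# Values copied from the register dump in the report.
KASAN_SHADOW_OFFSET = 0xdffffc0000000000  # also visible in R10 and R12
KASAN_SHADOW_SCALE_SHIFT = 3              # one shadow byte covers 8 bytes

FAULT_ADDR = 0xdffffc000000004a  # non-canonical address from the Oops line
RAX = 0x4a
RBX = 0xffff88802cd7c018
RDI = 0xffff88802cd7c000
RBP = 0x250


def shadow_address(addr: int) -> int:
    """Return the KASAN shadow byte address that covers addr."""
    return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET


# "mov $0x250,%ebp; add (%rbx),%rbp" computes RBP = 0x250 + *(RBX),
# so the pointer that was loaded from (RBX) is RBP - 0x250.
loaded_pointer = RBP - 0x250

print(hex(shadow_address(RBP)))  # 0xdffffc000000004a, the faulting address
print(hex(RBX - RDI))            # 0x18, offset of the pointer that was loaded
print(hex(loaded_pointer))       # 0x0
```

Read this way, the trapping cmpb is the KASAN check for a load at offset 0x250 from a pointer whose value was 0, which matches the "null-ptr-deref in range [0x250-0x257]" line.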
On Sunday, 25 August 2024 06:14:48 CEST Xingyu Li wrote:
In line 307 of net/batman-adv/bridge_loop_avoidance.c, when executing "hash = backbone_gw->bat_priv->bla.claim_hash;", it does not check if "backbone_gw->bat_priv==NULL".
Because it cannot be NULL unless something really, really, really bad happened. bat_priv will only be set when the gateway gets created using batadv_bla_get_backbone_gw(). It never gets unset during the lifetime of the backbone gateway.
Maybe Simon has more to say about that.
On Sunday, 25 August 2024 06:14:48 CEST Xingyu Li wrote:
RIP: 0010:batadv_bla_del_backbone_claims+0x4e/0x360
Which line would that be in your build?
On Sunday, 25 August 2024 06:14:48 CEST Xingyu Li wrote:
Syzkaller reproducer:
At the moment, I am unable to reproduce this crash with the provided reproducer.
Can you reproduce it with it? If you can, did you try to perform a bisect using the reproducer?
Kind regards, Sven
Which line would that be in your build?
Somehow, the bug report does not include the line number on my end.
At the moment, I am unable to reproduce this crash with the provided reproducer.
Can you reproduce it with it?
Sorry. The above syzkaller reproducer needs additional support to run. But here is a C reproducer: https://gist.github.com/freexxxyyy/0be5002c45d7f060cb599dd7595cab78
On Sun, Aug 25, 2024 at 9:24 AM Sven Eckelmann sven@narfation.org wrote:
[...]
On Thursday, 29 August 2024 06:30:23 CEST Xingyu Li wrote:
Which line would that be in your build?
Somehow, the bug report does not include the line number on my end.
You can try to use gdb or similar tools to figure out more about it [1]. Maybe even adjust your kernel build to create better debuggable crashes.
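A minimal sketch of that lookup, assuming a vmlinux built with debug info from the same tree and config as the crashing kernel: the hypothetical helper below (not part of any kernel tooling) parses the "RIP:" line of an oops and prints two commands that are commonly used to map symbol+offset to a source line, gdb's "list *ADDRESS" and the kernel tree's scripts/faddr2line. The "vmlinux" path is a placeholder.

```python
import re

# Matches e.g. "RIP: 0010:batadv_bla_del_backbone_claims+0x4e/0x360"
RIP_RE = re.compile(
    r"RIP:\s*[0-9a-f]{4}:(?P<sym>[A-Za-z_][\w.]*)"
    r"\+(?P<off>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
)


def lookup_commands(rip_line: str, vmlinux: str = "vmlinux") -> list[str]:
    """Build shell commands that resolve the RIP symbol+offset to a line.

    Hypothetical helper for illustration; it only formats strings and
    does not run anything.
    """
    m = RIP_RE.search(rip_line)
    if m is None:
        raise ValueError("no RIP symbol+offset found in: " + rip_line)
    sym, off, size = m.group("sym"), m.group("off"), m.group("size")
    return [
        f"gdb -batch -ex 'list *({sym}+{off})' {vmlinux}",
        f"scripts/faddr2line {vmlinux} {sym}+{off}/{size}",
    ]


if __name__ == "__main__":
    line = "RIP: 0010:batadv_bla_del_backbone_claims+0x4e/0x360"
    for cmd in lookup_commands(line):
        print(cmd)
```

Both commands only give meaningful results when the vmlinux matches the kernel that crashed, since the offset depends on the compiler and config.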
At the moment, I am unable to reproduce this crash with the provided reproducer.
Since I am missing information and you don't have a working reproducer - how should I then fix anything? Your comment from the first mail doesn't seem to apply and it is unclear how you came to that conclusion in the first place.
Can you reproduce it with it?
Sorry. The above syzkaller reproducer needs additional support to run. But here is a C reproducer: https://gist.github.com/freexxxyyy/0be5002c45d7f060cb599dd7595cab78
I've tried to run it with the normal syz-execprog - but you now seem to be saying that this reproducer does not work with the upstream one? In this case, please try to get it working with upstream. See also the mail from Kees Cook [2].
Kind regards, Sven
[1] https://www.open-mesh.org/projects/devtools/wiki/Crashlog_with_pstore#Decodi...
[2] https://lore.kernel.org/r/202408281812.3F765DF@keescook
b.a.t.m.a.n@lists.open-mesh.org