Hello,
syzbot found the following issue on:
HEAD commit: 2f111a6fd5b5 Merge tag 'ceph-for-5.15-rc7' of git://github.. git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=115750acb00000 kernel config: https://syzkaller.appspot.com/x/.config?x=d95853dad8472c91 dashboard link: https://syzkaller.appspot.com/bug?extid=28b0702ada0bf7381f58 compiler: Debian clang version 11.0.1-2, GNU ld (GNU Binutils for Debian) 2.35.2 syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1026ef2cb00000 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=15c9c162b00000
IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+28b0702ada0bf7381f58@syzkaller.appspotmail.com
RBP: 00007ffef262e230 R08: 0000000000000002 R09: 00007fddc8003531 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 ------------[ cut here ]------------ ODEBUG: assert_init not available (active state 0) object type: timer_list hint: 0x0 WARNING: CPU: 0 PID: 6517 at lib/debugobjects.c:508 debug_print_object lib/debugobjects.c:505 [inline] WARNING: CPU: 0 PID: 6517 at lib/debugobjects.c:508 debug_object_assert_init+0x1fa/0x250 lib/debugobjects.c:895 Modules linked in: CPU: 0 PID: 6517 Comm: syz-executor011 Not tainted 5.15.0-rc6-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:debug_print_object lib/debugobjects.c:505 [inline] RIP: 0010:debug_object_assert_init+0x1fa/0x250 lib/debugobjects.c:895 Code: e8 4b 15 b8 fd 4c 8b 45 00 48 c7 c7 a0 31 b4 8a 48 c7 c6 00 2e b4 8a 48 c7 c2 e0 33 b4 8a 31 c9 49 89 d9 31 c0 e8 b6 c6 36 fd <0f> 0b ff 05 3a 5c c5 09 48 83 c5 38 48 89 e8 48 c1 e8 03 42 80 3c RSP: 0018:ffffc90002c7e698 EFLAGS: 00010046 RAX: cffa606352c78700 RBX: 0000000000000000 RCX: ffff888076ce9c80 RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000 RBP: ffffffff8a512d00 R08: ffffffff81693402 R09: ffffed1017383f2c R10: ffffed1017383f2c R11: 0000000000000000 R12: dffffc0000000000 R13: ffff88801bcd1720 R14: 0000000000000002 R15: ffffffff90ba5a20 FS: 0000555557087300(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f5473f3c000 CR3: 0000000070ca6000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: debug_timer_assert_init kernel/time/timer.c:739 [inline] debug_assert_init kernel/time/timer.c:784 [inline] del_timer+0xa5/0x3d0 kernel/time/timer.c:1204 try_to_grab_pending+0x151/0xbb0 kernel/workqueue.c:1270 __cancel_work_timer+0x14c/0x710 kernel/workqueue.c:3129 batadv_nc_mesh_free+0x4a/0xf0 net/batman-adv/network-coding.c:1869 batadv_mesh_free+0x6f/0x140 net/batman-adv/main.c:245 batadv_mesh_init+0x4e5/0x550 net/batman-adv/main.c:226 batadv_softif_init_late+0x8fe/0xd70 net/batman-adv/soft-interface.c:804 register_netdevice+0x826/0x1c30 net/core/dev.c:10229 __rtnl_newlink net/core/rtnetlink.c:3458 [inline] rtnl_newlink+0x14b3/0x1d10 net/core/rtnetlink.c:3506 rtnetlink_rcv_msg+0x934/0xe60 net/core/rtnetlink.c:5572 netlink_rcv_skb+0x200/0x470 net/netlink/af_netlink.c:2510 netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline] netlink_unicast+0x814/0x9f0 net/netlink/af_netlink.c:1345 netlink_sendmsg+0xa29/0xe50 net/netlink/af_netlink.c:1935 sock_sendmsg_nosec net/socket.c:704 [inline] sock_sendmsg net/socket.c:724 [inline] ____sys_sendmsg+0x5b9/0x910 net/socket.c:2409 ___sys_sendmsg net/socket.c:2463 [inline] __sys_sendmsg+0x36f/0x450 net/socket.c:2492 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7fddc82bc7e9 Code: 28 c3 e8 2a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffef262e228 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fddc82bc7e9 RDX: 0000000000000000 RSI: 0000000020000140 RDI: 0000000000000003 RBP: 00007ffef262e230 R08: 0000000000000002 R09: 00007fddc8003531 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
--- This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this issue. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot. syzbot can test patches for this issue, for details see: https://goo.gl/tpsmEJ#testing-patches
On 10/22/21 02:19, syzbot wrote:
Hello,
syzbot found the following issue on:
HEAD commit: 2f111a6fd5b5 Merge tag 'ceph-for-5.15-rc7' of git://github.. git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=115750acb00000 kernel config: https://syzkaller.appspot.com/x/.config?x=d95853dad8472c91 dashboard link: https://syzkaller.appspot.com/bug?extid=28b0702ada0bf7381f58 compiler: Debian clang version 11.0.1-2, GNU ld (GNU Binutils for Debian) 2.35.2 syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1026ef2cb00000 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=15c9c162b00000
IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+28b0702ada0bf7381f58@syzkaller.appspotmail.com
RBP: 00007ffef262e230 R08: 0000000000000002 R09: 00007fddc8003531 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 ------------[ cut here ]------------ ODEBUG: assert_init not available (active state 0) object type: timer_list hint: 0x0 WARNING: CPU: 0 PID: 6517 at lib/debugobjects.c:508 debug_print_object lib/debugobjects.c:505 [inline] WARNING: CPU: 0 PID: 6517 at lib/debugobjects.c:508 debug_object_assert_init+0x1fa/0x250 lib/debugobjects.c:895 Modules linked in: CPU: 0 PID: 6517 Comm: syz-executor011 Not tainted 5.15.0-rc6-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:debug_print_object lib/debugobjects.c:505 [inline] RIP: 0010:debug_object_assert_init+0x1fa/0x250 lib/debugobjects.c:895 Code: e8 4b 15 b8 fd 4c 8b 45 00 48 c7 c7 a0 31 b4 8a 48 c7 c6 00 2e b4 8a 48 c7 c2 e0 33 b4 8a 31 c9 49 89 d9 31 c0 e8 b6 c6 36 fd <0f> 0b ff 05 3a 5c c5 09 48 83 c5 38 48 89 e8 48 c1 e8 03 42 80 3c RSP: 0018:ffffc90002c7e698 EFLAGS: 00010046 RAX: cffa606352c78700 RBX: 0000000000000000 RCX: ffff888076ce9c80 RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000 RBP: ffffffff8a512d00 R08: ffffffff81693402 R09: ffffed1017383f2c R10: ffffed1017383f2c R11: 0000000000000000 R12: dffffc0000000000 R13: ffff88801bcd1720 R14: 0000000000000002 R15: ffffffff90ba5a20 FS: 0000555557087300(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f5473f3c000 CR3: 0000000070ca6000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: debug_timer_assert_init kernel/time/timer.c:739 [inline] debug_assert_init kernel/time/timer.c:784 [inline] del_timer+0xa5/0x3d0 kernel/time/timer.c:1204 try_to_grab_pending+0x151/0xbb0 kernel/workqueue.c:1270 __cancel_work_timer+0x14c/0x710 kernel/workqueue.c:3129 batadv_nc_mesh_free+0x4a/0xf0 net/batman-adv/network-coding.c:1869 batadv_mesh_free+0x6f/0x140 net/batman-adv/main.c:245 batadv_mesh_init+0x4e5/0x550 net/batman-adv/main.c:226
Looks like cancel_delayed_work_sync() is called before INIT_DELAYED_WORK(), so calltrace looks like
batadv_mesh_init() batadv_originator_init() <- injected allocation failure batadv_mesh_free() batadv_nc_mesh_free() cancel_delayed_work_sync()
Quick fix can be moving INIT_DELAYED_WORK() from batadv_nc_init() to batadv_mesh_init(), since there is complex dependencies between each mech part, if I understood comments correctly
Just for thoughts and syzbot testing
#syz test git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
With regards, Pavel Skripkin
Hello,
syzbot has tested the proposed patch but the reproducer is still triggering an issue: general protection fault in batadv_nc_purge_paths
RBP: 00007fe7b40631d0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000002 R13: 00007ffe7ffd3def R14: 00007fe7b4063300 R15: 0000000000022000 general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017] CPU: 1 PID: 9061 Comm: syz-executor.0 Not tainted 5.15.0-rc6-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:batadv_nc_purge_paths+0x38/0x3f0 net/batman-adv/network-coding.c:437 Code: 48 89 d3 49 89 f6 48 89 7c 24 58 49 bd 00 00 00 00 00 fc ff df e8 38 48 ab f7 4d 8d 7e 10 4c 89 f8 48 c1 e8 03 48 89 44 24 48 <42> 8a 04 28 84 c0 0f 85 88 03 00 00 41 8b 2f 31 ff 89 ee e8 20 4c RSP: 0018:ffffc9000d04eac0 EFLAGS: 00010202 RAX: 0000000000000002 RBX: 0000000000000000 RCX: ffff888078270000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88807ec2cc80 RBP: 00000000fffffff4 R08: ffffffff8154e5b4 R09: ffffed100fd85adc R10: ffffed100fd85adc R11: 0000000000000000 R12: ffff88807ec2cc80 R13: dffffc0000000000 R14: 0000000000000000 R15: 0000000000000010 FS: 00007fe7b4063700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f359172e000 CR3: 000000005e749000 CR4: 00000000003506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: batadv_nc_mesh_free+0x7a/0xf0 net/batman-adv/network-coding.c:1869 batadv_mesh_free+0x6f/0x140 net/batman-adv/main.c:249 batadv_mesh_init+0x5b1/0x620 net/batman-adv/main.c:230 batadv_softif_init_late+0x8fe/0xd70 net/batman-adv/soft-interface.c:804 register_netdevice+0x826/0x1c30 net/core/dev.c:10229 __rtnl_newlink net/core/rtnetlink.c:3458 [inline] rtnl_newlink+0x14b3/0x1d10 net/core/rtnetlink.c:3506 rtnetlink_rcv_msg+0x934/0xe60 net/core/rtnetlink.c:5572 netlink_rcv_skb+0x200/0x470 net/netlink/af_netlink.c:2510 netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline] netlink_unicast+0x814/0x9f0 net/netlink/af_netlink.c:1345 netlink_sendmsg+0xa29/0xe50 net/netlink/af_netlink.c:1935 sock_sendmsg_nosec net/socket.c:704 [inline] sock_sendmsg net/socket.c:724 [inline] ____sys_sendmsg+0x5b9/0x910 net/socket.c:2409 ___sys_sendmsg net/socket.c:2463 [inline] __sys_sendmsg+0x36f/0x450 net/socket.c:2492 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7fe7b48eda39 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007fe7b4063188 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007fe7b49f0f60 RCX: 00007fe7b48eda39 RDX: 0000000000000000 RSI: 0000000020000140 RDI: 0000000000000003 RBP: 00007fe7b40631d0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000002 R13: 00007ffe7ffd3def R14: 00007fe7b4063300 R15: 0000000000022000 Modules linked in: ---[ end trace 67ff054734964acf ]--- RIP: 0010:batadv_nc_purge_paths+0x38/0x3f0 net/batman-adv/network-coding.c:437 Code: 48 89 d3 49 89 f6 48 89 7c 24 58 49 bd 00 00 00 00 00 fc ff df e8 38 48 ab f7 4d 8d 7e 10 4c 89 f8 48 c1 e8 03 48 89 44 24 48 <42> 8a 04 28 84 c0 0f 85 88 03 00 00 41 8b 2f 31 ff 89 ee e8 20 4c RSP: 0018:ffffc9000d04eac0 EFLAGS: 00010202 RAX: 0000000000000002 RBX: 0000000000000000 RCX: ffff888078270000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88807ec2cc80 RBP: 00000000fffffff4 R08: ffffffff8154e5b4 R09: ffffed100fd85adc R10: ffffed100fd85adc R11: 0000000000000000 R12: ffff88807ec2cc80 R13: dffffc0000000000 R14: 0000000000000000 R15: 0000000000000010 FS: 00007fe7b4063700(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fc230f87020 CR3: 000000005e749000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 ---------------- Code disassembly (best guess): 0: 48 89 d3 mov %rdx,%rbx 3: 49 89 f6 mov %rsi,%r14 6: 48 89 7c 24 58 mov %rdi,0x58(%rsp) b: 49 bd 00 00 00 00 00 movabs $0xdffffc0000000000,%r13 12: fc ff df 15: e8 38 48 ab f7 callq 0xf7ab4852 1a: 4d 8d 7e 10 lea 0x10(%r14),%r15 1e: 4c 89 f8 mov %r15,%rax 21: 48 c1 e8 03 shr $0x3,%rax 25: 48 89 44 24 48 mov %rax,0x48(%rsp) * 2a: 42 8a 04 28 mov (%rax,%r13,1),%al <-- trapping instruction 2e: 84 c0 test %al,%al 30: 0f 85 88 03 00 00 jne 0x3be 36: 41 8b 2f mov (%r15),%ebp 39: 31 ff xor %edi,%edi 3b: 89 ee mov %ebp,%esi 3d: e8 .byte 0xe8 3e: 20 .byte 0x20 3f: 4c rex.WR
Tested on:
commit: 1d4590f5 Merge tag 'acpi-5.15-rc7' of git://git.kernel.. git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=156e86c4b00000 kernel config: https://syzkaller.appspot.com/x/.config?x=d95853dad8472c91 dashboard link: https://syzkaller.appspot.com/bug?extid=28b0702ada0bf7381f58 compiler: Debian clang version 11.0.1-2, GNU ld (GNU Binutils for Debian) 2.35.2 patch: https://syzkaller.appspot.com/x/patch.diff?x=16cb3daf300000
On 10/22/21 23:20, syzbot wrote:
Hello,
syzbot has tested the proposed patch but the reproducer is still triggering an issue: general protection fault in batadv_nc_purge_paths
Oh, ok. Next clean up call in batadv_nc_mesh_free() caused GPF, since fields are not initialized. Let's try to clean up one by one and do not break dependencies.
Quite ugly one, but idea is correct, I guess
Also, make each *_init() call clean up all allocated stuff to not call corresponding *_free() on error handling path, since it introduces problems, as syzbot reported
With regards, Pavel Skripkin
On 10/22/21 23:57, Pavel Skripkin wrote:
On 10/22/21 23:20, syzbot wrote:
Hello,
syzbot has tested the proposed patch but the reproducer is still triggering an issue: general protection fault in batadv_nc_purge_paths
Oh, ok. Next clean up call in batadv_nc_mesh_free() caused GPF, since fields are not initialized. Let's try to clean up one by one and do not break dependencies.
Quite ugly one, but idea is correct, I guess
Also, make each *_init() call clean up all allocated stuff to not call corresponding *_free() on error handling path, since it introduces problems, as syzbot reported
Whooops.... Forgot to ask syzbot to test the patch
#syz test git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
With regards, Pavel Skripkin
Hello,
syzbot has tested the proposed patch and the reproducer did not trigger any issue:
Reported-and-tested-by: syzbot+28b0702ada0bf7381f58@syzkaller.appspotmail.com
Tested on:
commit: 9c0c4d24 Merge tag 'block-5.15-2021-10-22' of git://gi.. git tree: upstream kernel config: https://syzkaller.appspot.com/x/.config?x=d95853dad8472c91 dashboard link: https://syzkaller.appspot.com/bug?extid=28b0702ada0bf7381f58 compiler: Debian clang version 11.0.1-2, GNU ld (GNU Binutils for Debian) 2.35.2 patch: https://syzkaller.appspot.com/x/patch.diff?x=1553d4c4b00000
Note: testing is done by a robot and is best-effort only.
On Friday, 22 October 2021 22:58:15 CEST Pavel Skripkin wrote: [...]
Oh, ok. Next clean up call in batadv_nc_mesh_free() caused GPF, since fields are not initialized. Let's try to clean up one by one and do not break dependencies.
Quite ugly one, but idea is correct, I guess
Also, make each *_init() call clean up all allocated stuff to not call corresponding *_free() on error handling path, since it introduces problems, as syzbot reported
Thanks for the patch + syzbot interactions. I just wanted to implement a change - which would most likely have ended up the same way. Can you please send it to netdev and Cc b.a.t.m.a.n@lists.open-mesh.org? We don't have anything else to submit at the moment for netdev and this patch can be applied by netdev directly. I will add my Acked-by in this process.
Not sure about the Fixes. It is definitely wrong in the initial commit.... but it got only really problematic when other features got introduced. I would still say that the initial one should be mentioned.
Fixes: c6c8fea29769 ("net: Add batman-adv meshing protocol")
@Linus, @Marek, @Antonio: Please check whether it is ok to move the batadv_v_mesh_init after batadv_tt_init + batadv_originator_init. batadv_v_mesh_init is basically there to initialize:
* bat_priv->bat_v.ogm_buff(|_len|_mutex) * bat_priv->bat_v.ogm_seqno * bat_priv->bat_v.ogm_wq
batadv_originator_init is there to initialize the
* bat_priv->orig_hash * bat_priv->orig_work (batadv_purge_orig) + queue it up
batadv_tt_init is a lot more complex but should in theory not interact with ogm specific algo ops.
I wouldn't know why there could be a problem but I would leave it to the experts.
Kind regards, Sven
Syzbot reported ODEBUG warning in batadv_nc_mesh_free(). The problem was in wrong error handling in batadv_mesh_init().
Before this patch batadv_mesh_init() was calling batadv_mesh_free() in case of any batadv_*_init() calls failure. This approach may work well, when there is some kind of indicator, which can tell which parts of batadv are initialized; but there isn't any.
All written above lead to cleaning up uninitialized fields. Even if we hide ODEBUG warning by initializing bat_priv->nc.work, syzbot was able to hit GPF in batadv_nc_purge_paths(), because hash pointer in still NULL. [1]
To fix these bugs we can unwind batadv_*_init() calls one by one. It is good approach for 2 reasons: 1) It fixes bugs on error handling path 2) It improves the performance, since we won't call unneeded batadv_*_free() functions.
So, this patch makes all batadv_*_init() clean up all allocated memory before returning with an error to no call correspoing batadv_*_free() and open-codes batadv_mesh_free() with proper order to avoid touching uninitialized fields.
Link: https://lore.kernel.org/netdev/000000000000c87fbd05cef6bcb0@google.com/ [1] Reported-and-tested-by: syzbot+28b0702ada0bf7381f58@syzkaller.appspotmail.com Fixes: c6c8fea29769 ("net: Add batman-adv meshing protocol") Signed-off-by: Pavel Skripkin paskripkin@gmail.com --- net/batman-adv/bridge_loop_avoidance.c | 8 +++- net/batman-adv/main.c | 56 ++++++++++++++++++-------- net/batman-adv/network-coding.c | 4 +- net/batman-adv/translation-table.c | 4 +- 4 files changed, 52 insertions(+), 20 deletions(-)
diff --git a/net/batman-adv/bridge_loop_avoidance.c b/net/batman-adv/bridge_loop_avoidance.c index 1669744304c5..17687848daec 100644 --- a/net/batman-adv/bridge_loop_avoidance.c +++ b/net/batman-adv/bridge_loop_avoidance.c @@ -1560,10 +1560,14 @@ int batadv_bla_init(struct batadv_priv *bat_priv) return 0;
bat_priv->bla.claim_hash = batadv_hash_new(128); - bat_priv->bla.backbone_hash = batadv_hash_new(32); + if (!bat_priv->bla.claim_hash) + return -ENOMEM;
- if (!bat_priv->bla.claim_hash || !bat_priv->bla.backbone_hash) + bat_priv->bla.backbone_hash = batadv_hash_new(32); + if (!bat_priv->bla.backbone_hash) { + batadv_hash_destroy(bat_priv->bla.claim_hash); return -ENOMEM; + }
batadv_hash_set_lock_class(bat_priv->bla.claim_hash, &batadv_claim_hash_lock_class_key); diff --git a/net/batman-adv/main.c b/net/batman-adv/main.c index 3ddd66e4c29e..5207cd8d6ad8 100644 --- a/net/batman-adv/main.c +++ b/net/batman-adv/main.c @@ -190,29 +190,41 @@ int batadv_mesh_init(struct net_device *soft_iface)
bat_priv->gw.generation = 0;
- ret = batadv_v_mesh_init(bat_priv); - if (ret < 0) - goto err; - ret = batadv_originator_init(bat_priv); - if (ret < 0) - goto err; + if (ret < 0) { + atomic_set(&bat_priv->mesh_state, BATADV_MESH_DEACTIVATING); + goto err_orig; + }
ret = batadv_tt_init(bat_priv); - if (ret < 0) - goto err; + if (ret < 0) { + atomic_set(&bat_priv->mesh_state, BATADV_MESH_DEACTIVATING); + goto err_tt; + } + + ret = batadv_v_mesh_init(bat_priv); + if (ret < 0) { + atomic_set(&bat_priv->mesh_state, BATADV_MESH_DEACTIVATING); + goto err_v; + }
ret = batadv_bla_init(bat_priv); - if (ret < 0) - goto err; + if (ret < 0) { + atomic_set(&bat_priv->mesh_state, BATADV_MESH_DEACTIVATING); + goto err_bla; + }
ret = batadv_dat_init(bat_priv); - if (ret < 0) - goto err; + if (ret < 0) { + atomic_set(&bat_priv->mesh_state, BATADV_MESH_DEACTIVATING); + goto err_dat; + }
ret = batadv_nc_mesh_init(bat_priv); - if (ret < 0) - goto err; + if (ret < 0) { + atomic_set(&bat_priv->mesh_state, BATADV_MESH_DEACTIVATING); + goto err_nc; + }
batadv_gw_init(bat_priv); batadv_mcast_init(bat_priv); @@ -222,8 +234,20 @@ int batadv_mesh_init(struct net_device *soft_iface)
return 0;
-err: - batadv_mesh_free(soft_iface); +err_nc: + batadv_dat_free(bat_priv); +err_dat: + batadv_bla_free(bat_priv); +err_bla: + batadv_v_mesh_free(bat_priv); +err_v: + batadv_tt_free(bat_priv); +err_tt: + batadv_originator_free(bat_priv); +err_orig: + batadv_purge_outstanding_packets(bat_priv, NULL); + atomic_set(&bat_priv->mesh_state, BATADV_MESH_INACTIVE); + return ret; }
diff --git a/net/batman-adv/network-coding.c b/net/batman-adv/network-coding.c index 9f06132e007d..0a7f1d36a6a8 100644 --- a/net/batman-adv/network-coding.c +++ b/net/batman-adv/network-coding.c @@ -152,8 +152,10 @@ int batadv_nc_mesh_init(struct batadv_priv *bat_priv) &batadv_nc_coding_hash_lock_class_key);
bat_priv->nc.decoding_hash = batadv_hash_new(128); - if (!bat_priv->nc.decoding_hash) + if (!bat_priv->nc.decoding_hash) { + batadv_hash_destroy(bat_priv->nc.coding_hash); goto err; + }
batadv_hash_set_lock_class(bat_priv->nc.decoding_hash, &batadv_nc_decoding_hash_lock_class_key); diff --git a/net/batman-adv/translation-table.c b/net/batman-adv/translation-table.c index e0b3dace2020..4b7ad6684bc4 100644 --- a/net/batman-adv/translation-table.c +++ b/net/batman-adv/translation-table.c @@ -4162,8 +4162,10 @@ int batadv_tt_init(struct batadv_priv *bat_priv) return ret;
ret = batadv_tt_global_init(bat_priv); - if (ret < 0) + if (ret < 0) { + batadv_tt_local_table_free(bat_priv); return ret; + }
batadv_tvlv_handler_register(bat_priv, batadv_tt_tvlv_ogm_handler_v1, batadv_tt_tvlv_unicast_handler_v1,
On Sunday, 24 October 2021 15:13:56 CEST Pavel Skripkin wrote:
Syzbot reported ODEBUG warning in batadv_nc_mesh_free(). The problem was in wrong error handling in batadv_mesh_init().
Before this patch batadv_mesh_init() was calling batadv_mesh_free() in case of any batadv_*_init() calls failure. This approach may work well, when there is some kind of indicator, which can tell which parts of batadv are initialized; but there isn't any.
All written above lead to cleaning up uninitialized fields. Even if we hide ODEBUG warning by initializing bat_priv->nc.work, syzbot was able to hit GPF in batadv_nc_purge_paths(), because hash pointer in still NULL. [1]
To fix these bugs we can unwind batadv_*_init() calls one by one. It is good approach for 2 reasons: 1) It fixes bugs on error handling path 2) It improves the performance, since we won't call unneeded batadv_*_free() functions.
So, this patch makes all batadv_*_init() clean up all allocated memory before returning with an error to no call correspoing batadv_*_free() and open-codes batadv_mesh_free() with proper order to avoid touching uninitialized fields.
Link: https://lore.kernel.org/netdev/000000000000c87fbd05cef6bcb0@google.com/ [1] Reported-and-tested-by: syzbot+28b0702ada0bf7381f58@syzkaller.appspotmail.com Fixes: c6c8fea29769 ("net: Add batman-adv meshing protocol") Signed-off-by: Pavel Skripkin paskripkin@gmail.com
Acked-by: Sven Eckelmann sven@narfation.org
Kind regards, Sven
On Sun, 24 Oct 2021 16:58:30 +0200 Sven Eckelmann wrote:
On Sunday, 24 October 2021 15:13:56 CEST Pavel Skripkin wrote:
Syzbot reported ODEBUG warning in batadv_nc_mesh_free(). The problem was in wrong error handling in batadv_mesh_init().
Before this patch batadv_mesh_init() was calling batadv_mesh_free() in case of any batadv_*_init() calls failure. This approach may work well, when there is some kind of indicator, which can tell which parts of batadv are initialized; but there isn't any.
All written above lead to cleaning up uninitialized fields. Even if we hide ODEBUG warning by initializing bat_priv->nc.work, syzbot was able to hit GPF in batadv_nc_purge_paths(), because hash pointer in still NULL. [1]
To fix these bugs we can unwind batadv_*_init() calls one by one. It is good approach for 2 reasons: 1) It fixes bugs on error handling path 2) It improves the performance, since we won't call unneeded batadv_*_free() functions.
So, this patch makes all batadv_*_init() clean up all allocated memory before returning with an error to no call correspoing batadv_*_free() and open-codes batadv_mesh_free() with proper order to avoid touching uninitialized fields.
Link: https://lore.kernel.org/netdev/000000000000c87fbd05cef6bcb0@google.com/ [1] Reported-and-tested-by: syzbot+28b0702ada0bf7381f58@syzkaller.appspotmail.com Fixes: c6c8fea29769 ("net: Add batman-adv meshing protocol") Signed-off-by: Pavel Skripkin paskripkin@gmail.com
Acked-by: Sven Eckelmann sven@narfation.org
FWIW I'm marking this as "Awaiting upstream" in netdev patchwork, please LMK if you prefer for it to be applied directly.
On Tuesday, 26 October 2021 02:49:50 CEST Jakub Kicinski wrote: [...]
Acked-by: Sven Eckelmann sven@narfation.org
FWIW I'm marking this as "Awaiting upstream" in netdev patchwork, please LMK if you prefer for it to be applied directly.
Please apply this directly. Thanks
Kind regards, Sven
Hello:
This patch was applied to netdev/net.git (master) by David S. Miller davem@davemloft.net:
On Sun, 24 Oct 2021 16:13:56 +0300 you wrote:
Syzbot reported ODEBUG warning in batadv_nc_mesh_free(). The problem was in wrong error handling in batadv_mesh_init().
Before this patch batadv_mesh_init() was calling batadv_mesh_free() in case of any batadv_*_init() calls failure. This approach may work well, when there is some kind of indicator, which can tell which parts of batadv are initialized; but there isn't any.
[...]
Here is the summary with links: - net: batman-adv: fix error handling https://git.kernel.org/netdev/net/c/6f68cd634856
You are awesome, thank you!
b.a.t.m.a.n@lists.open-mesh.org