Hello,
syzbot found the following crash on:
HEAD commit:    7cc2a8ea Merge tag 'block-5.8-2020-07-01' of git://git.ker..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=130b828f100000
kernel config:  https://syzkaller.appspot.com/x/.config?x=7be693511b29b338
dashboard link: https://syzkaller.appspot.com/bug?extid=2eeeb5ad0766b57394d8
compiler:       gcc (GCC) 10.1.0-syz 20200507
Unfortunately, I don't have any reproducer for this crash yet.
IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+2eeeb5ad0766b57394d8@syzkaller.appspotmail.com
general protection fault, probably for non-canonical address 0xdffffc000000000e: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077]
CPU: 1 PID: 9126 Comm: kworker/u4:9 Not tainted 5.8.0-rc3-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: bat_events batadv_iv_send_outstanding_bat_ogm_packet
RIP: 0010:batadv_iv_ogm_schedule_buff+0xd1e/0x1410 net/batman-adv/bat_iv_ogm.c:843
Code: 80 3c 28 00 0f 85 ee 05 00 00 4d 8b 3f 49 81 ff e0 e9 4e 8d 0f 84 dd 02 00 00 e8 bd 80 ae f9 49 8d 7f 70 48 89 f8 48 c1 e8 03 <42> 80 3c 28 00 0f 85 af 06 00 00 48 8b 44 24 08 49 8b 6f 70 80 38
RSP: 0018:ffffc90004e97b98 EFLAGS: 00010202
RAX: 000000000000000e RBX: ffff8880a7471800 RCX: ffffffff87c5394d
RDX: ffff88804cf02380 RSI: ffffffff87c536a3 RDI: 0000000000000070
RBP: 0000000000077000 R08: 0000000000000001 R09: ffff8880a875a02b
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000007
R13: dffffc0000000000 R14: ffff888051ad4c40 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000400200 CR3: 0000000061cac000 CR4: 00000000001426e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 batadv_iv_ogm_schedule net/batman-adv/bat_iv_ogm.c:869 [inline]
 batadv_iv_ogm_schedule net/batman-adv/bat_iv_ogm.c:862 [inline]
 batadv_iv_send_outstanding_bat_ogm_packet+0x5c8/0x800 net/batman-adv/bat_iv_ogm.c:1722
 process_one_work+0x94c/0x1670 kernel/workqueue.c:2269
 worker_thread+0x64c/0x1120 kernel/workqueue.c:2415
 kthread+0x3b5/0x4a0 kernel/kthread.c:291
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293
Modules linked in:
---[ end trace f5c5eda032070cd1 ]---
RIP: 0010:batadv_iv_ogm_schedule_buff+0xd1e/0x1410 net/batman-adv/bat_iv_ogm.c:843
Code: 80 3c 28 00 0f 85 ee 05 00 00 4d 8b 3f 49 81 ff e0 e9 4e 8d 0f 84 dd 02 00 00 e8 bd 80 ae f9 49 8d 7f 70 48 89 f8 48 c1 e8 03 <42> 80 3c 28 00 0f 85 af 06 00 00 48 8b 44 24 08 49 8b 6f 70 80 38
RSP: 0018:ffffc90004e97b98 EFLAGS: 00010202
RAX: 000000000000000e RBX: ffff8880a7471800 RCX: ffffffff87c5394d
RDX: ffff88804cf02380 RSI: ffffffff87c536a3 RDI: 0000000000000070
RBP: 0000000000077000 R08: 0000000000000001 R09: ffff8880a875a02b
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000007
R13: dffffc0000000000 R14: ffff888051ad4c40 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000400200 CR3: 000000009480d000 CR4: 00000000001426e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
On Tuesday, 7 July 2020 17:30:14 CEST syzbot wrote:
general protection fault, probably for non-canonical address 0xdffffc000000000e: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077]
CPU: 1 PID: 9126 Comm: kworker/u4:9 Not tainted 5.8.0-rc3-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: bat_events batadv_iv_send_outstanding_bat_ogm_packet
RIP: 0010:batadv_iv_ogm_schedule_buff+0xd1e/0x1410 net/batman-adv/bat_iv_ogm.c:843
This seems to be the following lines:
838         /* OGMs from primary interfaces are scheduled on all
839          * interfaces.
840          */
841         rcu_read_lock();
842         list_for_each_entry_rcu(tmp_hard_iface, &batadv_hardif_list, list) {
843                 if (tmp_hard_iface->soft_iface != hard_iface->soft_iface)
844                         continue;
If I understand it correctly, tmp_hard_iface is NULL, and accessing its soft_iface member (offset 0x70 on amd64) then causes this fault. But neither batadv_hardif_list itself nor any entry inside the list should ever point to NULL.
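To make the link between the reported non-canonical address and the 0x70 offset explicit, here is a minimal userspace sketch of the generic KASAN shadow address calculation on x86_64. The shadow offset is taken from R13 in the register dump and the 8:1 scale from the `shr $3` in the code bytes; the function and macro names are only illustrative and are not kernel API.

```c
#include <assert.h>
#include <stdint.h>

/* Generic KASAN: one shadow byte describes 8 bytes of memory. */
#define DEMO_KASAN_SHADOW_SCALE_SHIFT 3
/* Shadow offset as seen in R13 of the register dump. */
#define DEMO_KASAN_SHADOW_OFFSET UINT64_C(0xdffffc0000000000)

/* Illustrative helper: map a memory address to its shadow byte address. */
uint64_t demo_mem_to_shadow(uint64_t addr)
{
	return (addr >> DEMO_KASAN_SHADOW_SCALE_SHIFT) + DEMO_KASAN_SHADOW_OFFSET;
}

/* Reproduce the values from the report under the NULL-pointer assumption. */
void demo_check_report_values(void)
{
	uint64_t tmp_hard_iface = 0;       /* assumed NULL, matches R15 */
	uint64_t soft_iface_offset = 0x70; /* offset named in the analysis */
	uint64_t accessed = tmp_hard_iface + soft_iface_offset; /* RDI */

	assert(accessed == UINT64_C(0x70));
	/* RAX holds the shifted address before the shadow offset is added. */
	assert((accessed >> DEMO_KASAN_SHADOW_SCALE_SHIFT) == UINT64_C(0xe));
	/* The faulting shadow address is the one printed in the first line. */
	assert(demo_mem_to_shadow(accessed) == UINT64_C(0xdffffc000000000e));
}
```

Under these assumptions the numbers line up with the report: the instrumented check faults on the shadow byte for address 0x70, which is why KASAN prints the range 0x70-0x77.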
I've just gone through all the code which accesses the list:
* bat_iv_ogm.c 839,7 @@ static void batadv_iv_ogm_schedule_buff
* bat_iv_ogm.c 1606,7 @@ static void batadv_iv_ogm_process
* bat_iv_ogm.c 1671,7 @@ static void batadv_iv_ogm_process
* bat_iv_ogm.c 2144,7 @@ static void batadv_iv_neigh_print
* bat_iv_ogm.c 2313,8 @@ batadv_iv_ogm_neigh_dump
* bat_v.c 188,7 @@ static void batadv_v_neigh_print
* bat_v.c 315,7 @@ batadv_v_neigh_dump
* bat_v_elp.c 425,7 @@ void batadv_v_elp_primary_iface_set
* bat_v_ogm.c 298,7 @@ static void batadv_v_ogm_send_softif
* bat_v_ogm.c 923,7 @@ static void batadv_v_ogm_process
* hard-interface.c 68,7 @@ batadv_hardif_get_by_netdev
* hard-interface.c 431,7 @@ batadv_hardif_get_active
* hard-interface.c 501,7 @@ static void batadv_check_known_mac_addr
* hard-interface.c 533,7 @@ static void batadv_hardif_recalc_extra_skbroom
* hard-interface.c 572,7 @@ int batadv_hardif_min_mtu
* hard-interface.c 829,7 @@ static size_t batadv_hardif_cnt
* main.c 290,7 @@ bool batadv_is_my_mac
* netlink.c 991,7 @@ batadv_netlink_dump_hardif
* originator.c 1301,7 @@ static bool batadv_purge_orig_node
* send.c 882,7 @@ static void batadv_send_outstanding_bcast_packet
* soft-interface.c 1141,7 @@ static void batadv_softif_destroy_netlink
and all the code which adds to the list or initializes parts of the list:
* hard-interface.c 927,7 @@ batadv_hardif_add_interface
- should be under rtnl_lock
* hard-interface.c 945,7 @@ batadv_hardif_add_interface
- should be under rtnl_lock
* hard-interface.c 99,7 @@ static int __init batadv_init
- this is done to initialize the list head before the rest of the code is initialized
and all the code which removes entries from the list:
* hard-interface.c 985,8 @@ void batadv_hardif_remove_interfaces
- is under rtnl_lock
- there should also be nothing in this list because unregister_netdevice_notifier will trigger a NETDEV_UNREGISTER of these devices
* hard-interface.c 1048,7 @@ static int batadv_hard_if_event
- should be under rtnl_lock
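As a reminder of why a NULL entry is unexpected here, the following is a simplified userspace sketch of a circular doubly linked list in the style of the kernel's list_head. All names are hypothetical, and it deliberately leaves out RCU and locking, so it only illustrates the invariant and not the actual batman-adv code paths.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct list_head. */
struct demo_list_head {
	struct demo_list_head *next, *prev;
};

/* An initialized, empty list head points to itself, never to NULL. */
void demo_init_list_head(struct demo_list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert entry before head, i.e. at the tail of the list. */
void demo_list_add_tail(struct demo_list_head *entry,
			struct demo_list_head *head)
{
	entry->next = head;
	entry->prev = head->prev;
	head->prev->next = entry;
	head->prev = entry;
}

/* Unlink entry; its neighbours are linked to each other. */
void demo_list_del(struct demo_list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

/* Walk until we are back at the head; no next pointer may be NULL. */
size_t demo_list_count(const struct demo_list_head *head)
{
	size_t count = 0;

	for (const struct demo_list_head *pos = head->next; pos != head;
	     pos = pos->next) {
		assert(pos != NULL);
		count++;
	}
	return count;
}
```

As long as every add and remove goes through helpers like these on an initialized head, a walk always terminates at the head and never sees a NULL next pointer. A NULL would therefore hint at an uninitialized or corrupted node rather than at a normal list state.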
The batadv_hard_iface is only kfree_rcu'ed by batadv_hardif_release when the reference counter is zero. The reference counter is increased in:
* bat_iv_ogm.c 843,20 @@ static void batadv_iv_ogm_schedule_buff
* bat_iv_ogm.c 1678,13 @@ static void batadv_iv_ogm_process
* bat_v_ogm.c 302,7 @@ static void batadv_v_ogm_send_softif
* bat_v_ogm.c 930,7 @@ static void batadv_v_ogm_process
* hard-interface.c 70,7 @@ batadv_hardif_get_by_netdev
* hard-interface.c 436,7 @@ batadv_hardif_get_active
* hard-interface.c 471,7 @@ static void batadv_primary_if_select
* hard-interface.c 720,7 @@ int batadv_hardif_enable_interface
* hard-interface.c 765,7 @@ int batadv_hardif_enable_interface
* hard-interface.c 932,7 @@ batadv_hardif_add_interface
* hard-interface.c 944,7 @@ batadv_hardif_add_interface
* hard-interface.h 133,7 @@ batadv_primary_if_get_selected
* main.c 460,7 @@ int batadv_batman_skb_recv
* originator.c 413,7 @@ batadv_orig_ifinfo_new
* originator.c 491,7 @@ batadv_neigh_ifinfo_new
* originator.c 570,7 @@ batadv_hardif_neigh_create
* originator.c 682,7 @@ batadv_neigh_node_create
* originator.c 1308,7 @@ static bool batadv_purge_orig_node
* send.c 527,10 @@ batadv_forw_packet_alloc
* send.c 932,7 @@ static void batadv_send_outstanding_bcast_packet
and decreased:
* bat_iv_ogm.c 518,7 @@ batadv_iv_ogm_can_aggregate
* bat_iv_ogm.c 843,20 @@ static void batadv_iv_ogm_schedule_buff
* bat_iv_ogm.c 1678,13 @@ static void batadv_iv_ogm_process
* bat_v.c 51,7 @@ static void batadv_v_iface_activate
* bat_v.c 108,7 @@ static void batadv_v_iface_update_mac
* bat_v_elp.c 540,7 @@ int batadv_v_elp_packet_recv
* bat_v_ogm.c 326,7 @@ static void batadv_v_ogm_send_softif
* bat_v_ogm.c 340,12 @@ static void batadv_v_ogm_send_softif
* bat_v_ogm.c 958,7 @@ static void batadv_v_ogm_process
* bat_v_ogm.c 966,7 @@ static void batadv_v_ogm_process
* bridge_loop_avoidance.c 440,7 @@ static void batadv_bla_send_claim
* bridge_loop_avoidance.c 1405,7 @@ void batadv_bla_status_update
* bridge_loop_avoidance.c 1499,7 @@ static void batadv_bla_periodic_work
* bridge_loop_avoidance.c 1538,7 @@ int batadv_bla_init
* bridge_loop_avoidance.c 1746,7 @@ void batadv_bla_free
* bridge_loop_avoidance.c 1910,7 @@ bool batadv_bla_rx
* bridge_loop_avoidance.c 2017,7 @@ bool batadv_bla_tx
* bridge_loop_avoidance.c 2081,7 @@ int batadv_bla_claim_table_seq_print_text
* bridge_loop_avoidance.c 2248,7 @@ int batadv_bla_claim_dump
* bridge_loop_avoidance.c 2317,7 @@ int batadv_bla_backbone_table_seq_print_text
* bridge_loop_avoidance.c 2486,7 @@ int batadv_bla_backbone_dump
* bridge_loop_avoidance.c 2538,7 @@ bool batadv_bla_check_claim
* distributed-arp-table.c 891,7 @@ int batadv_dat_cache_seq_print_text
* distributed-arp-table.c 1037,7 @@ int batadv_dat_cache_dump
* fragmentation.c 540,7 @@ int batadv_frag_send_packet
* gateway_client.c 535,7 @@ int batadv_gw_client_seq_print_text
* gateway_client.c 595,7 @@ int batadv_gw_dump
* hard-interface.c 239,7 @@ static struct net_device *batadv_get_real_netdevice
* hard-interface.c 460,7 @@ static void batadv_primary_if_update_addr
* hard-interface.c 484,7 @@ static void batadv_primary_if_select
* hard-interface.c 657,7 @@ batadv_hardif_activate_interface
* hard-interface.c 809,7 @@ int batadv_hardif_enable_interface
* hard-interface.c 860,7 @@ void batadv_hardif_disable_interface
* hard-interface.c 870,7 @@ void batadv_hardif_disable_interface
* hard-interface.c 893,11 @@ void batadv_hardif_disable_interface
* hard-interface.c 973,7 @@ static void batadv_hardif_remove_interface
* hard-interface.c 1086,10 @@ static int batadv_hard_if_event
* icmp_socket.c 278,7 @@ static ssize_t batadv_socket_write
* main.c 336,7 @@ batadv_seq_print_text_primary_if_get
* main.c 504,7 @@ int batadv_batman_skb_recv
* main.c 515,7 @@ int batadv_batman_skb_recv
* multicast.c 2152,7 @@ int batadv_mcast_flags_seq_print_text
* multicast.c 2361,7 @@ batadv_mcast_netlink_get_primary
* multicast.c 2389,7 @@ int batadv_mcast_flags_dump
* netlink.c 359,14 @@ static int batadv_netlink_mesh_fill
* netlink.c 1217,7 @@ batadv_get_hardif_from_info
* netlink.c 1336,7 @@ static void batadv_post_doit
* network-coding.c 1937,7 @@ int batadv_nc_nodes_seq_print_text
* originator.c 239,7 @@ static void batadv_neigh_ifinfo_release
* originator.c 270,7 @@ static void batadv_hardif_neigh_release
* originator.c 304,7 @@ static void batadv_neigh_node_release
* originator.c 756,7 @@ int batadv_hardif_neigh_seq_print_text
* originator.c 835,11 @@ int batadv_hardif_neigh_dump
* originator.c 859,7 @@ static void batadv_orig_ifinfo_release
* originator.c 1319,7 @@ static bool batadv_purge_orig_node
* originator.c 1406,7 @@ int batadv_orig_seq_print_text
* originator.c 1461,7 @@ int batadv_orig_hardif_seq_print_text
* originator.c 1532,11 @@ int batadv_orig_dump
* routing.c 280,7 @@ static int batadv_recv_my_icmp_packet
* routing.c 335,7 @@ static int batadv_recv_icmp_ttl_exceeded
* routing.c 796,7 @@ batadv_reroute_unicast_packet
* routing.c 907,7 @@ static bool batadv_check_unicast_ttvn
* send.c 310,7 @@ bool batadv_send_skb_prepare_unicast_4addr
* send.c 475,9 @@ void batadv_forw_packet_free
* send.c 767,14 @@ int batadv_add_bcast_packet_to_list
* send.c 932,7 @@ static void batadv_send_outstanding_bcast_packet
* soft-interface.c 395,7 @@ static netdev_tx_t batadv_interface_tx
* soft-interface.c 893,7 @@ static int batadv_softif_slave_add
* soft-interface.c 920,7 @@ static int batadv_softif_slave_del
* sysfs.c 282,7 @@ ssize_t batadv_store_##_name
* sysfs.c 301,7 @@ ssize_t batadv_show_##_name
* sysfs.c 959,7 @@ static ssize_t batadv_show_mesh_iface
* sysfs.c 1013,7 @@ static int batadv_store_mesh_iface_finish
* sysfs.c 1110,7 @@ static ssize_t batadv_show_iface_status
* sysfs.c 1170,7 @@ static ssize_t batadv_store_throughput_override
* sysfs.c 1190,7 @@ static ssize_t batadv_show_throughput_override
* tp_meter.c 748,7 @@ static void batadv_tp_recv_ack
* tp_meter.c 882,7 @@ static int batadv_tp_send
* tp_meter.c 1207,7 @@ static int batadv_tp_send_ack
* translation-table.c 820,7 @@ bool batadv_tt_local_add
* translation-table.c 1135,7 @@ int batadv_tt_local_seq_print_text
* translation-table.c 1293,7 @@ int batadv_tt_local_dump
* translation-table.c 2007,7 @@ int batadv_tt_global_seq_print_text
* translation-table.c 2214,7 @@ int batadv_tt_global_dump
* translation-table.c 3198,7 @@ static bool batadv_send_tt_request
* translation-table.c 3461,7 @@ static bool batadv_send_my_tt_response
* translation-table.c 3785,7 @@ static void batadv_send_roam_adv
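For readers less familiar with this kind of reference counting, the get/put pairing audited above can be sketched in userspace C11 as follows. This is only a model under the assumption that lookups take a reference in a "get unless zero" style (as kref_get_unless_zero() does) and that the final put triggers the release; the struct and function names are hypothetical and not the batman-adv ones.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted object such as a hard interface. */
struct demo_iface {
	atomic_uint refcount;
	bool released; /* set where the kernel would free the object */
};

/* Take a reference only if the object is still alive (count != 0). */
bool demo_iface_get_unless_zero(struct demo_iface *iface)
{
	unsigned int old = atomic_load(&iface->refcount);

	while (old != 0) {
		/* On failure, old is refreshed with the current counter value. */
		if (atomic_compare_exchange_weak(&iface->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; the last put releases the object. */
void demo_iface_put(struct demo_iface *iface)
{
	/* atomic_fetch_sub returns the value before the decrement. */
	if (atomic_fetch_sub(&iface->refcount, 1) == 1)
		iface->released = true;
}
```

In this model, an unbalanced put would release the object while other users still hold pointers to it, and a get on an already released object fails instead of resurrecting it. Whether an imbalance of this kind is actually behind the crash is not established by the report.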
Btw. we can most likely ignore everything related to bat_v* because it crashed in bat_iv. So if anybody else spots something which I've missed....
Kind regards,
	Sven