INFO: rcu detected stall in netlink_sendmsg (4)
by syzbot
Hello,
syzbot found the following crash on:
HEAD commit: ae661dec Merge branch 'ifla_xdp_expected_fd'
git tree: bpf-next
console output: https://syzkaller.appspot.com/x/log.txt?x=12245647e00000
kernel config: https://syzkaller.appspot.com/x/.config?x=b5acf5ac38a50651
dashboard link: https://syzkaller.appspot.com/bug?extid=0fb70e87d8e0ac278fe9
compiler: gcc (GCC) 9.0.0 20181231 (experimental)
Unfortunately, I don't have any reproducer for this crash yet.
IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+0fb70e87d8e0ac278fe9(a)syzkaller.appspotmail.com
rcu: INFO: rcu_preempt self-detected stall on CPU
rcu: 0-....: (1 GPs behind) idle=5c2/1/0x4000000000000002 softirq=376075/376076 fqs=5176
(t=10500 jiffies g=506061 q=176208)
NMI backtrace for cpu 0
CPU: 0 PID: 17281 Comm: syz-executor.5 Not tainted 5.6.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x188/0x20d lib/dump_stack.c:118
nmi_cpu_backtrace.cold+0x70/0xb1 lib/nmi_backtrace.c:101
nmi_trigger_cpumask_backtrace+0x231/0x27e lib/nmi_backtrace.c:62
trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline]
rcu_dump_cpu_stacks+0x169/0x1b3 kernel/rcu/tree_stall.h:254
print_cpu_stall kernel/rcu/tree_stall.h:475 [inline]
check_cpu_stall kernel/rcu/tree_stall.h:549 [inline]
rcu_pending kernel/rcu/tree.c:3030 [inline]
rcu_sched_clock_irq.cold+0x518/0xc55 kernel/rcu/tree.c:2276
update_process_times+0x25/0x60 kernel/time/timer.c:1726
tick_sched_handle+0x9b/0x180 kernel/time/tick-sched.c:171
tick_sched_timer+0x4e/0x140 kernel/time/tick-sched.c:1314
__run_hrtimer kernel/time/hrtimer.c:1517 [inline]
__hrtimer_run_queues+0x32c/0xdd0 kernel/time/hrtimer.c:1579
hrtimer_interrupt+0x312/0x770 kernel/time/hrtimer.c:1641
local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1119 [inline]
smp_apic_timer_interrupt+0x15b/0x600 arch/x86/kernel/apic/apic.c:1144
apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829
</IRQ>
RIP: 0010:arch_local_irq_restore arch/x86/include/asm/paravirt.h:759 [inline]
RIP: 0010:lock_release+0x45f/0x7c0 kernel/locking/lockdep.c:4505
Code: 94 08 00 00 00 00 00 00 48 c1 e8 03 80 3c 10 00 0f 85 d0 02 00 00 48 83 3d 6d 1d 1b 08 00 0f 84 71 01 00 00 48 8b 3c 24 57 9d <0f> 1f 44 00 00 48 b8 00 00 00 00 00 fc ff df 48 01 c3 48 c7 03 00
RSP: 0018:ffffc90003d9ec30 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff13
RAX: 1ffffffff12e7698 RBX: 1ffff920007b3d89 RCX: 1ffff110098769b9
RDX: dffffc0000000000 RSI: 1ffff110098769c5 RDI: 0000000000000282
RBP: ffff88804c3b4540 R08: 0000000000000004 R09: fffffbfff14cc269
R10: fffffbfff14cc268 R11: ffffffff8a661347 R12: bc95c6993a9665e0
R13: ffffffff87a36fb1 R14: ffff88804c3b4dd0 R15: 0000000000000003
__raw_spin_unlock_bh include/linux/spinlock_api_smp.h:174 [inline]
_raw_spin_unlock_bh+0x12/0x30 kernel/locking/spinlock.c:207
spin_unlock_bh include/linux/spinlock.h:383 [inline]
batadv_tt_local_purge_pending_clients+0x2a1/0x3b0 net/batman-adv/translation-table.c:3914
batadv_tt_local_resize_to_mtu+0x96/0x130 net/batman-adv/translation-table.c:4198
batadv_update_min_mtu net/batman-adv/hard-interface.c:626 [inline]
batadv_hardif_activate_interface.part.0.cold+0xc6/0x294 net/batman-adv/hard-interface.c:653
batadv_hardif_activate_interface net/batman-adv/hard-interface.c:800 [inline]
batadv_hardif_enable_interface+0x9f2/0xaa0 net/batman-adv/hard-interface.c:792
batadv_softif_slave_add+0x92/0x150 net/batman-adv/soft-interface.c:859
do_set_master net/core/rtnetlink.c:2470 [inline]
do_set_master+0x1d7/0x230 net/core/rtnetlink.c:2443
do_setlink+0xaa2/0x3680 net/core/rtnetlink.c:2605
__rtnl_newlink+0xad5/0x1590 net/core/rtnetlink.c:3266
rtnl_newlink+0x64/0xa0 net/core/rtnetlink.c:3391
rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5454
netlink_rcv_skb+0x15a/0x410 net/netlink/af_netlink.c:2478
netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
netlink_unicast+0x537/0x740 net/netlink/af_netlink.c:1329
netlink_sendmsg+0x882/0xe10 net/netlink/af_netlink.c:1918
sock_sendmsg_nosec net/socket.c:652 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:672
____sys_sendmsg+0x6b9/0x7d0 net/socket.c:2343
___sys_sendmsg+0x100/0x170 net/socket.c:2397
__sys_sendmsg+0xec/0x1b0 net/socket.c:2430
do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:294
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x45c849
Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f043b72fc78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f043b7306d4 RCX: 000000000045c849
RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000003
RBP: 000000000076bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 00000000000009f5 R14: 00000000004ccac9 R15: 000000000076bf0c
---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller(a)googlegroups.com.
syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
2 years, 6 months
[PATCH] batctl: fix endianness when reading radiotap header
by Marek Lindner
All radiotap header fields are specified in little endian byte-order.
Header length conversion is necessary on some platforms.
Signed-off-by: Marek Lindner <mareklindner(a)neomailbox.ch>
---
tcpdump.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tcpdump.c b/tcpdump.c
index 4b9e4f6..1beace1 100644
--- a/tcpdump.c
+++ b/tcpdump.c
@@ -1144,10 +1144,10 @@ static int monitor_header_length(unsigned char *packet_buff, ssize_t buff_len, i
 			return -1;
 
 		radiotap_hdr = (struct radiotap_header*)packet_buff;
-		if (buff_len <= radiotap_hdr->it_len)
+		if (buff_len <= le16toh(radiotap_hdr->it_len))
 			return -1;
 		else
-			return radiotap_hdr->it_len;
+			return le16toh(radiotap_hdr->it_len);
 	}
 
 	return -1;
--
2.25.1
2 years, 9 months
b.a.t.m.a.n. specification
by Fernando Gont
Hello, folks,
While looking at batman, I came across your IETF Internet-Draft
https://tools.ietf.org/id/draft-openmesh-b-a-t-m-a-n-00.txt.
Questions:
1) Is this the closest there is to a specification of batman?
2) Does it describe the current protocol, or have there been changes
since then that have not been reflected in the Internet-Draft?
3) Any reason why the draft wasn't published as an IETF RFC?
Thanks!
Cheers,
--
Fernando Gont
SI6 Networks
e-mail: fgont(a)si6networks.com
PGP Fingerprint: 6666 31C6 D484 63B2 8FB1 E3C4 AE25 0D55 1D4E 7492
2 years, 9 months
configuration issue
by Oguzhan Kayhan
Hello,
I am running batman-adv on OpenWrt 18.06.
The default batman-adv version on this firmware is 2017.4.
I have two systems.
One board has an 802.11n radio, the other an 802.11ac radio.
This is the only difference; the same compiled image runs on both devices.
On the 802.11n board, the config is as follows:
batman-adv:
config 'mesh' 'bat0'
    option 'aggregated_ogms'
    option 'ap_isolation'
    option 'bonding'
    option 'fragmentation'
    option 'gw_bandwidth'
    option 'gw_mode'
    option 'gw_sel_class'
    option 'log_level'
    option 'orig_interval'
    option 'vis_mode'
    option 'bridge_loop_avoidance'
    option 'distributed_arp_table'
    option 'multicast_mode'
    option 'network_coding'
    option 'hop_penalty'
    option 'isolation_mark'
wireless:
config wifi-iface
    option device 'radio0'
    option ifname 'mesh0'
    option network 'mesh'
    option mode 'adhoc'
    option ssid 'mesh'
    option bssid '02:CA:FE:CA:CA:40'
    option mcast_rate '18000'
network:
config interface 'lan'
    option type 'bridge'
    option ifname 'eth0 bat0'
    option proto 'static'
    ....
config interface 'mesh'
    option mtu '1532'
    option proto 'batadv'
    option mesh 'bat0'
config interface 'bat'
    option ifname 'bat0'
    option proto 'static'
    option mtu '1500'
    option ipaddr '172.0.0.10'
    option netmask '255.255.255.0'
This configuration works fine.
But if I run the same config on the second node with the 802.11ac radio, it fails.
So I dug around and changed the config as follows, and it started to work.
The batman-adv config is the same, no difference.
wireless:
config wifi-iface 'wmesh'
    option device 'radio0'
    option ifname 'adhoc0'
    option network 'bat0_hardif_wlan'
    option mode 'adhoc'
    option ssid 'mesh'
    option mcast_rate '18000'
    option bssid '02:CA:FE:CA:CA:40'
network:
config interface 'bat0_hardif_wlan'
    option mtu '1532'
    option proto 'batadv'
    option mesh 'bat0'
config interface 'bat0_hardif_eth0'
    option mtu '1532'
    option proto 'batadv'
    option mesh 'bat0'
    option ifname 'eth0'
I have two questions so far.
First: why does the first config not work on the second system? (Wi-Fi itself
works fine, but for the mesh I needed to change the config like this.)
Second: I can live with a different config, OK.
But I want to use eth0, eth2 and the mesh network as a bridge.
Whenever I add any of these interfaces to the bridge, batman-adv fails.
What am I missing?
2 years, 9 months
[PATCH 0/5] pull request for net-next: batman-adv 2020-04-27
by Simon Wunderlich
Hi David,
here is a small cleanup pull request of batman-adv to go into net-next.
Please pull or let me know of any problem!
Thank you,
Simon
The following changes since commit 8f3d9f354286745c751374f5f1fcafee6b3f3136:
Linux 5.7-rc1 (2020-04-12 12:35:55 -0700)
are available in the Git repository at:
git://git.open-mesh.org/linux-merge.git tags/batadv-next-for-davem-20200427
for you to fetch changes up to e73f94d1b6f05f6f22434c63de255a9dec6fd23d:
batman-adv: remove unused inline function batadv_arp_change_timeout (2020-04-24 15:22:41 +0200)
----------------------------------------------------------------
This cleanup patchset includes the following patches:
- bump version strings, by Simon Wunderlich
- fix spelling error, by Sven Eckelmann
- drop unneeded types.h include, by Sven Eckelmann
- change random number generation to prandom_u32_max(),
by Sven Eckelmann
- remove unused function batadv_arp_change_timeout(), by Yue Haibing
----------------------------------------------------------------
Simon Wunderlich (1):
batman-adv: Start new development cycle
Sven Eckelmann (3):
batman-adv: Fix spelling error in term buffer
batman-adv: trace: Drop unneeded types.h include
batman-adv: Utilize prandom_u32_max for random [0, max) values
YueHaibing (1):
batman-adv: remove unused inline function batadv_arp_change_timeout
net/batman-adv/bat_iv_ogm.c | 4 ++--
net/batman-adv/bat_v_elp.c | 2 +-
net/batman-adv/bat_v_ogm.c | 4 ++--
net/batman-adv/distributed-arp-table.h | 5 -----
net/batman-adv/main.h | 2 +-
net/batman-adv/trace.h | 1 -
net/batman-adv/types.h | 2 +-
7 files changed, 7 insertions(+), 13 deletions(-)
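For readers unfamiliar with the helper named in the prandom_u32_max() patch: as far as I know, it scales a 32-bit pseudo-random value into [0, max) with a multiply and a shift rather than a modulo. A userspace sketch of that scaling, with the hypothetical name scale_to_range() and the random value supplied by the caller (it is not the kernel code itself):

```c
#include <stdint.h>

/* Map a uniformly distributed 32-bit value r onto [0, max) using a
 * 64-bit multiply followed by a shift. The result is always < max,
 * and is 0 when max is 0. */
static uint32_t scale_to_range(uint32_t r, uint32_t max)
{
	return (uint32_t)(((uint64_t)r * max) >> 32);
}
```

The smallest input maps to 0 and the largest 32-bit input maps to max - 1, so the upper bound is exclusive.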
2 years, 9 months
[PATCH 0/4] pull request for net: batman-adv 2020-04-27
by Simon Wunderlich
Hi David,
here are some bugfixes which we would like to have integrated into net.
Please pull or let me know of any problem!
Thank you,
Simon
The following changes since commit 8f3d9f354286745c751374f5f1fcafee6b3f3136:
Linux 5.7-rc1 (2020-04-12 12:35:55 -0700)
are available in the Git repository at:
git://git.open-mesh.org/linux-merge.git tags/batadv-net-for-davem-20200427
for you to fetch changes up to 6f91a3f7af4186099dd10fa530dd7e0d9c29747d:
batman-adv: Fix refcnt leak in batadv_v_ogm_process (2020-04-21 10:08:05 +0200)
----------------------------------------------------------------
Here are some batman-adv bugfixes:
- fix random number generation in network coding, by George Spelvin
- fix reference counter leaks, by Xiyu Yang (3 patches)
----------------------------------------------------------------
George Spelvin (1):
batman-adv: fix batadv_nc_random_weight_tq
Xiyu Yang (3):
batman-adv: Fix refcnt leak in batadv_show_throughput_override
batman-adv: Fix refcnt leak in batadv_store_throughput_override
batman-adv: Fix refcnt leak in batadv_v_ogm_process
net/batman-adv/bat_v_ogm.c | 2 +-
net/batman-adv/network-coding.c | 9 +--------
net/batman-adv/sysfs.c | 3 ++-
3 files changed, 4 insertions(+), 10 deletions(-)
2 years, 9 months
[PATCH net-next] batman-adv: remove unused inline function batadv_arp_change_timeout
by YueHaibing
There are no callers in-tree.
Signed-off-by: YueHaibing <yuehaibing(a)huawei.com>
---
net/batman-adv/distributed-arp-table.h | 5 -----
1 file changed, 5 deletions(-)
diff --git a/net/batman-adv/distributed-arp-table.h b/net/batman-adv/distributed-arp-table.h
index 2bff2f4a325c..4e031661682a 100644
--- a/net/batman-adv/distributed-arp-table.h
+++ b/net/batman-adv/distributed-arp-table.h
@@ -163,11 +163,6 @@ static inline void batadv_dat_init_own_addr(struct batadv_priv *bat_priv,
 {
 }
 
-static inline void batadv_arp_change_timeout(struct net_device *soft_iface,
-					     const char *name)
-{
-}
-
 static inline int batadv_dat_init(struct batadv_priv *bat_priv)
 {
 	return 0;
--
2.17.1
2 years, 9 months
[PATCH] batman-adv: Fix refcnt leak in batadv_v_ogm_process
by Xiyu Yang
batadv_v_ogm_process() invokes batadv_hardif_neigh_get(), which returns
a reference of the neighbor object to "hardif_neigh" with increased
refcount.
When batadv_v_ogm_process() returns, "hardif_neigh" becomes invalid, so
the refcount should be decreased to keep refcount balanced.
The reference counting issue happens in one of the exception handling paths
of batadv_v_ogm_process(). When batadv_v_ogm_orig_get() fails to get the
orig node and returns NULL, the refcnt increased by
batadv_hardif_neigh_get() is not decreased, causing a refcnt leak.
Fix this issue by jumping to "out" label when batadv_v_ogm_orig_get()
fails to get the orig node.
Fixes: 9323158ef9f4 ("batman-adv: OGMv2 - implement originators logic")
Signed-off-by: Xiyu Yang <xiyuyang19(a)fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf(a)gmail.com>
---
net/batman-adv/bat_v_ogm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/batman-adv/bat_v_ogm.c b/net/batman-adv/bat_v_ogm.c
index 969466218999..80b87b1f4e3a 100644
--- a/net/batman-adv/bat_v_ogm.c
+++ b/net/batman-adv/bat_v_ogm.c
@@ -893,7 +893,7 @@ static void batadv_v_ogm_process(const struct sk_buff *skb, int ogm_offset,
 
 	orig_node = batadv_v_ogm_orig_get(bat_priv, ogm_packet->orig);
 	if (!orig_node)
-		return;
+		goto out;
 
 	neigh_node = batadv_neigh_node_get_or_create(orig_node, if_incoming,
 						     ethhdr->h_source);
--
2.7.4
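The pattern behind this fix can be sketched outside the kernel. This is a simplified illustration, not batman-adv code: struct obj, obj_get(), obj_put() and process() are hypothetical stand-ins, and a plain int replaces the kernel's reference counter.

```c
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted kernel object. */
struct obj {
	int refcount;
};

/* Take a reference; returns NULL if there is no object. */
static struct obj *obj_get(struct obj *o)
{
	if (o)
		o->refcount++;
	return o;
}

/* Drop a reference; NULL is accepted so cleanup code stays simple. */
static void obj_put(struct obj *o)
{
	if (o)
		o->refcount--;
}

/* Takes a reference on neigh_src, then tries to look up orig_src.
 * Every exit after the first successful obj_get() goes through the
 * "out" label, so the reference on neigh is always dropped. */
static int process(struct obj *neigh_src, struct obj *orig_src)
{
	struct obj *neigh, *orig = NULL;
	int ret = -1;

	neigh = obj_get(neigh_src);
	if (!neigh)
		return ret;	/* nothing held yet, a plain return is fine */

	orig = obj_get(orig_src);
	if (!orig)
		goto out;	/* a bare return here would leak neigh */

	ret = 0;
out:
	obj_put(orig);
	obj_put(neigh);
	return ret;
}
```

After a failed lookup the counter on the first object is back at its starting value, which is the balance the commit message asks for.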
2 years, 9 months
batman cpu usage of multiple instances
by Moritz Warning
Hi,
I have many batman-adv instances (~1000) running on one computer, each in its own Linux network namespace.
Top shows me that a single kernel worker is handling batman-adv:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
29251 root 20 0 0 0 0 R 99.3 0.0 18:03.27 kworker/u32:3+bat_events
Is there a way to let those multiple batman-adv instances make use of the other cores?
thanks,
Moritz
2 years, 9 months
maximum hop count
by Moritz Warning
Hi,
I am running a simulation of 50 batman-adv instances connected in a chain topology:
[node0] <-> [node1] <-> ... <-> [node49]
Despite there being no packet loss, the nodes at both ends (nodes 0 and 49) only see 32 other nodes.
The second outermost nodes see 33 other nodes, and so on, until the nodes that are at least 17 hops from both ends (nodes 17 through 32), which see all 49 other nodes.
The OGM TTL is set to 50 [1], but from this experiment, the TTL seems to be 32.
Can someone shed light on this observation?
The batman-adv version used is 2019.4.
Thanks,
Moritz
[0] https://github.com/mwarning/meshnet-lab
[1] https://git.open-mesh.org/batman-adv.git/blob/refs/heads/master:/net/batm...
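The reported counts match what a chain would produce if originator messages stopped propagating after 32 hops. A small sketch of that arithmetic, assuming a hard hop limit (an assumption used for illustration, not a confirmed cause); visible_nodes() and hop_limit are hypothetical names:

```c
/* Number of other nodes that node i can see in a chain of n nodes
 * (indices 0..n-1) when originator messages travel at most hop_limit
 * hops in either direction. */
static int visible_nodes(int n, int i, int hop_limit)
{
	int left = i;		/* nodes towards node 0 */
	int right = n - 1 - i;	/* nodes towards node n-1 */

	if (left > hop_limit)
		left = hop_limit;
	if (right > hop_limit)
		right = hop_limit;

	return left + right;
}
```

With n = 50 and hop_limit = 32 this gives 32 for node 0, 33 for node 1 and 49 for node 17, while hop_limit = 50 would give 49 for every node.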
2 years, 9 months