Hey guys,
I've been installing a 9 node setup here in our cellar. They are all running B.A.T.M.A.N. adv 0.2.1-beta r1545 (so the current batman-adv maintance version in OpenWRT).
The result over night: - 1x: bat_events: page allocation failure - 3x: WARNING: at net/sched/sch_generic.c:226 0x801c3554() (those 3 had TQ values of about 10-15 in batctl)
For the last error, I'm not sure if this is related to batman-adv or to the madwifi driver. I was using OpenWRT trunk 18405 on all routers.
I'm attaching some backtraces. Sorry again I can't get more out of them than that with ksymoops. The first two ones are the bat_events problem, the others the scheduling thingy.
Cheers, Linus
On Sat, Jan 23, 2010 at 06:46:16PM +0100, Linus L??ssing wrote:
Hey guys,
I've been installing a 9 node setup here in our cellar. They are all running B.A.T.M.A.N. adv 0.2.1-beta r1545 (so the current batman-adv maintance version in OpenWRT).
The result over night:
- 1x: bat_events: page allocation failure
This looks like a memory leak somewhere. It could be anywhere, batman or some other part of the kernel.
Can you build your kernel with the kernel memory leak detector enabled? It is under the Kernel Hacking options. You probably also need to build the kernel with symbols, which will help with ksymopps anyway. Documentation for kmemleak is in Documentation/kmemleak.txt. I've never used it myself, so i've no idea how good it actually is...
Interestingly, did you notice the warnings:
batman-adv:The newly added mac address (00:24:01:b7:6a:d2) already exists on: eth0.2 batman-adv:It is strongly recommended to keep mac addresses unique to avoid +problems!
I guess this is because you are using VLANs. Have you seen these problems without using VLANs?
Andrew
Ok, this is with symbols enabled and I changed the mac addresses, so they are defintely different now. Here is the new backtrace for the second issue. It occured after about 1-2 hours.
Cheers, Linus
On Sat, Jan 23, 2010 at 07:10:48PM +0100, Andrew Lunn wrote:
On Sat, Jan 23, 2010 at 06:46:16PM +0100, Linus L??ssing wrote:
Hey guys,
I've been installing a 9 node setup here in our cellar. They are all running B.A.T.M.A.N. adv 0.2.1-beta r1545 (so the current batman-adv maintance version in OpenWRT).
The result over night:
- 1x: bat_events: page allocation failure
This looks like a memory leak somewhere. It could be anywhere, batman or some other part of the kernel.
Can you build your kernel with the kernel memory leak detector enabled? It is under the Kernel Hacking options. You probably also need to build the kernel with symbols, which will help with ksymopps anyway. Documentation for kmemleak is in Documentation/kmemleak.txt. I've never used it myself, so i've no idea how good it actually is...
Interestingly, did you notice the warnings:
batman-adv:The newly added mac address (00:24:01:b7:6a:d2) already exists on: eth0.2 batman-adv:It is strongly recommended to keep mac addresses unique to avoid +problems!
I guess this is because you are using VLANs. Have you seen these problems without using VLANs?
Andrew
Hi Andrew,
Here is the output with the symbols inserted. I can reproduce this memory leak very reliable now, by just activating the vis server (didn't do that with the logs I've posted before) in a couple of minutes. free is showing a nice countdown then :).
For kmemleak... I'm using 2.6.30.10 here and I think Torvalds merged it into 2.6.33 a month ago. Anyone having a 2.6.33 kernel running at the moment, patches for kmemleak in 2.6.30 or another idea of how to debug this in 2.6.30?
"Debug slab memory allocations" (CONFIG_DEBUG_SLAB) and "Memory leak debugging" (DEBUG_SLAB_LEAK) are activated on the kernel I'm using here, but they are not of any use, are they?
Cheers, Linus
On Sat, Jan 23, 2010 at 07:10:48PM +0100, Andrew Lunn wrote:
On Sat, Jan 23, 2010 at 06:46:16PM +0100, Linus L??ssing wrote:
Hey guys,
I've been installing a 9 node setup here in our cellar. They are all running B.A.T.M.A.N. adv 0.2.1-beta r1545 (so the current batman-adv maintance version in OpenWRT).
The result over night:
- 1x: bat_events: page allocation failure
This looks like a memory leak somewhere. It could be anywhere, batman or some other part of the kernel.
Can you build your kernel with the kernel memory leak detector enabled? It is under the Kernel Hacking options. You probably also need to build the kernel with symbols, which will help with ksymopps anyway. Documentation for kmemleak is in Documentation/kmemleak.txt. I've never used it myself, so i've no idea how good it actually is...
Interestingly, did you notice the warnings:
batman-adv:The newly added mac address (00:24:01:b7:6a:d2) already exists on: eth0.2 batman-adv:It is strongly recommended to keep mac addresses unique to avoid +problems!
I guess this is because you are using VLANs. Have you seen these problems without using VLANs?
Andrew
On Sun, Jan 24, 2010 at 09:42:12PM +0100, Linus L??ssing wrote:
Hi Andrew,
Here is the output with the symbols inserted. I can reproduce this memory leak very reliable now, by just activating the vis server (didn't do that with the logs I've posted before) in a couple of minutes. free is showing a nice countdown then :).
So it is vis server functionality which is leaking? If vis server is not enabled, there is no leak?
If so, we don't need kmemcheck. I expect just looking at the vis code is probably enough to find the missing free.
Andrew
On Sun, Jan 24, 2010 at 10:00:10PM +0100, Andrew Lunn wrote:
On Sun, Jan 24, 2010 at 09:42:12PM +0100, Linus L??ssing wrote:
Hi Andrew,
Here is the output with the symbols inserted. I can reproduce this memory leak very reliable now, by just activating the vis server (didn't do that with the logs I've posted before) in a couple of minutes. free is showing a nice countdown then :).
So it is vis server functionality which is leaking? If vis server is not enabled, there is no leak?
If so, we don't need kmemcheck. I expect just looking at the vis code is probably enough to find the missing free.
It looks pretty obvious:
Working our way via the call stack:
In batman_skb_recv() we have:
case BAT_VIS: ret = recv_vis_packet(skb); break; default: ret = NET_RX_DROP; } if (ret == NET_RX_DROP) kfree_skb(skb);
i.e. the packet is only freed if recv_vis_packet() returns NET_RX_DROP
In recv_vis_packet() we have: switch (vis_packet->vis_type) { case VIS_TYPE_SERVER_SYNC: /* TODO: handle fragmented skbs properly */ receive_server_sync_packet(vis_packet, skb_headlen(skb)); ret = NET_RX_SUCCESS; break;
case VIS_TYPE_CLIENT_UPDATE: /* TODO: handle fragmented skbs properly */ receive_client_update_packet(vis_packet, skb_headlen(skb)); ret = NET_RX_SUCCESS; break;
default: /* ignore unknown packet */ ret = NET_RX_DROP; break; } return ret;
i.e. receive_client_update_packet() and receive_server_sync_packet() need to sometime result in the packet being freed. batman_skb_recv() will not free the packet.
void receive_server_sync_packet(struct vis_packet *vis_packet, int vis_info_len) { struct vis_info *info; int is_new; unsigned long flags; int vis_server = atomic_read(&vis_mode);
spin_lock_irqsave(&vis_hash_lock, flags); info = add_packet(vis_packet, vis_info_len, &is_new); if (info == NULL) goto end;
/* only if we are server ourselves and packet is newer than the one in * hash.*/ if (vis_server == VIS_TYPE_SERVER_SYNC && is_new) { memcpy(info->packet.target_orig, broadcastAddr, ETH_ALEN); if (list_empty(&info->send_list)) list_add_tail(&info->send_list, &send_list); } end: spin_unlock_irqrestore(&vis_hash_lock, flags); }
No free here. vis_packet is a pointer into the skbuf, but we never add vis_packet to any queue here, just info is queued.
add_packet():
info = kmalloc(sizeof(struct vis_info) + vis_info_len, GFP_ATOMIC); if (info == NULL) return NULL;
INIT_LIST_HEAD(&info->send_list); INIT_LIST_HEAD(&info->recv_list); info->first_seen = jiffies; memcpy(&info->packet, vis_packet, sizeof(struct vis_packet) + vis_info_len);
We allocate memory can copy the contents of the packet into the allocated memory. So again, the skbuf is not queued.
As far as i can see, the skbuf is never put on a queue. We just copy the needed data out of it. This means the skbuf does need freeing.
I had a very quick look at receive_client_update_packet(). It seems to have the same structure.
I will submit a fix for this soon. Is there any other similar code which might have the same bug?
Andrew
Staging: batman-adv: Fix skbuff leak in VIS code.
The vis code takes a copy of the data inside the skbuf if it is interesting for us, so we always need to release the skbuf.
Reported-by: Linus L�ssing linus.luessing@web.de Signed-off-by: Andrew Lunn andrew@lunn.ch
Index: routing.c =================================================================== --- routing.c (revision 1563) +++ routing.c (working copy) @@ -1135,21 +1135,22 @@ if (is_my_mac(vis_packet->sender_orig)) return NET_RX_DROP;
+ /* We take a copy of the data in the packet, so we should + always free the skbuf. */ + ret = NET_RX_DROP; + switch (vis_packet->vis_type) { case VIS_TYPE_SERVER_SYNC: /* TODO: handle fragmented skbs properly */ receive_server_sync_packet(vis_packet, skb_headlen(skb)); - ret = NET_RX_SUCCESS; break;
case VIS_TYPE_CLIENT_UPDATE: /* TODO: handle fragmented skbs properly */ receive_client_update_packet(vis_packet, skb_headlen(skb)); - ret = NET_RX_SUCCESS; break;
default: /* ignore unknown packet */ - ret = NET_RX_DROP; break; } return ret;
Just tried the patch on the routers and it works perfectly, no more memory leak caused by vis! Thanks for the quick fix.
Cheers, Linus
And another (different) one. Which occured right after flashing and booting causing a kernel panic. Sorry couldn't find a method for reproducing them yet.
Cheers, Linus
I'm now able to reproduce the second problem as well: Activating the vis-server on one node causes the vis-clients to throw this call trace with its slow path warning imediately. Nevertheless, vis output works anyway, clients are sending vis-packets and the server is drawing nice graphs. See the attachment for an example call trace.
Cheers, Linus
On Tuesday 26 January 2010 14:13:11 Linus Lüssing wrote:
I'm now able to reproduce the second problem as well: Activating the vis-server on one node causes the vis-clients to throw this call trace with its slow path warning imediately. Nevertheless, vis output works anyway, clients are sending vis-packets and the server is drawing nice graphs. See the attachment for an example call trace.
Please try the attached patch and see if it helps.
Regards, Marek
Hi Marek,
nope, does not seem to work, still the same issue I also tried it on my laptop here with the patch installed and it is a very similar call trace, although here it is send_vis_packets instead of vis_quit in the call trace.
Cheers, Linus
On Tue, Jan 26, 2010 at 03:16:38PM +0800, Marek Lindner wrote:
On Tuesday 26 January 2010 14:13:11 Linus Lüssing wrote:
I'm now able to reproduce the second problem as well: Activating the vis-server on one node causes the vis-clients to throw this call trace with its slow path warning imediately. Nevertheless, vis output works anyway, clients are sending vis-packets and the server is drawing nice graphs. See the attachment for an example call trace.
Please try the attached patch and see if it helps.
Regards, Marek
diff --git a/batman-adv-kernelland/vis.c b/batman-adv-kernelland/vis.c index b118d1e..a6c235f 100644 --- a/batman-adv-kernelland/vis.c +++ b/batman-adv-kernelland/vis.c @@ -405,6 +405,9 @@ static void purge_vis_packets(void) { HASHIT(hashit); struct vis_info *info;
unsigned long flags;
spin_lock_irqsave(&vis_hash_lock, flags);
while (hash_iterate(vis_hash, &hashit)) { info = hashit.bucket->data;
@@ -416,13 +419,17 @@ static void purge_vis_packets(void) free_info(info); } }
- spin_unlock_irqrestore(&vis_hash_lock, flags);
}
static void broadcast_vis_packet(struct vis_info *info, int packet_length) { HASHIT(hashit); struct orig_node *orig_node;
struct batman_if *batman_if; unsigned long flags;
uint8_t dstaddr[ETH_ALEN];
spin_lock_irqsave(&orig_hash_lock, flags);
@@ -431,45 +438,60 @@ static void broadcast_vis_packet(struct vis_info *info, int packet_length) orig_node = hashit.bucket->data;
/* if it's a vis server and reachable, send it. */
if (orig_node &&
(orig_node->flags & VIS_SERVER) &&
orig_node->batman_if &&
orig_node->router) {
if ((!orig_node) || (!orig_node->batman_if) ||
(!orig_node->router))
continue;
/* don't send it if we already received the packet from
* this node. */
if (recv_list_is_in(&info->recv_list, orig_node->orig))
continue;
if (!(orig_node->flags & VIS_SERVER))
continue;
memcpy(info->packet.target_orig,
orig_node->orig, ETH_ALEN);
/* don't send it if we already received the packet from
* this node. */
if (recv_list_is_in(&info->recv_list, orig_node->orig))
continue;
send_raw_packet((unsigned char *) &info->packet,
packet_length,
orig_node->batman_if,
orig_node->router->addr);
}
spin_unlock_irqrestore(&orig_hash_lock, flags);
memcpy(info->packet.target_orig, orig_node->orig, ETH_ALEN);
batman_if = orig_node->batman_if;
memcpy(dstaddr, orig_node->router->addr, ETH_ALEN);
send_raw_packet((unsigned char *)&info->packet,
packet_length, batman_if, dstaddr);
}spin_lock_irqsave(&orig_hash_lock, flags);
- memcpy(info->packet.target_orig, broadcastAddr, ETH_ALEN);
- spin_unlock_irqrestore(&orig_hash_lock, flags);
- memcpy(info->packet.target_orig, broadcastAddr, ETH_ALEN);
}
static void unicast_vis_packet(struct vis_info *info, int packet_length) { struct orig_node *orig_node;
struct batman_if *batman_if; unsigned long flags;
uint8_t dstaddr[ETH_ALEN];
spin_lock_irqsave(&orig_hash_lock, flags); orig_node = ((struct orig_node *) hash_find(orig_hash, info->packet.target_orig));
- if ((orig_node != NULL) &&
(orig_node->batman_if != NULL) &&
(orig_node->router != NULL)) {
send_raw_packet((unsigned char *) &info->packet, packet_length,
orig_node->batman_if,
orig_node->router->addr);
- }
- if ((!orig_node) || (!orig_node->batman_if) || (!orig_node->router))
goto out;
- /* don't lock while sending the packets ... we therefore
* copy the required data before sending */
- batman_if = orig_node->batman_if;
- memcpy(dstaddr, orig_node->router->addr, ETH_ALEN);
- spin_unlock_irqrestore(&orig_hash_lock, flags);
- send_raw_packet((unsigned char *)&info->packet,
packet_length, batman_if, dstaddr);
- return;
+out: spin_unlock_irqrestore(&orig_hash_lock, flags); }
@@ -502,17 +524,18 @@ static void send_vis_packets(struct work_struct *work) struct vis_info *info, *temp; unsigned long flags;
spin_lock_irqsave(&vis_hash_lock, flags); purge_vis_packets();
if (generate_vis_packet() == 0) /* schedule if generation was successful */ list_add_tail(&my_vis_info->send_list, &send_list);
- spin_lock_irqsave(&vis_hash_lock, flags); list_for_each_entry_safe(info, temp, &send_list, send_list) { list_del_init(&info->send_list); send_vis_packet(info); }
- spin_unlock_irqrestore(&vis_hash_lock, flags); start_vis_timer();
}
Hi,
nope, does not seem to work, still the same issue I also tried it on my laptop here with the patch installed and it is a very similar call trace, although here it is send_vis_packets instead of vis_quit in the call trace.
you should take the stack trace with a grain of salt. Stack tracing such an issue is a quite tricky thing, hence can not be fully trusted. Since the skb changes the code runs in interrupt context which probably introduced this bug. All previous warnings of this kind were related to holding a lock while sending packets. My patch unlocked the problematic orig_hash - maybe that was not enough ? By adding retrun statements at the beginning of the vis send function you might be able to get down to the problem.
Regards, Marek
On Thu, Jan 28, 2010 at 08:09:57AM +0800, Marek Lindner wrote:
Hi,
nope, does not seem to work, still the same issue I also tried it on my laptop here with the patch installed and it is a very similar call trace, although here it is send_vis_packets instead of vis_quit in the call trace.
you should take the stack trace with a grain of salt. Stack tracing such an issue is a quite tricky thing, hence can not be fully trusted. Since the skb changes the code runs in interrupt context which probably introduced this bug. All previous warnings of this kind were related to holding a lock while sending packets. My patch unlocked the problematic orig_hash - maybe that was not enough ? By adding retrun statements at the beginning of the vis send function you might be able to get down to the problem.
It might also be worth running lockdep on the code. Normally you don^t need the actually lockup, you just need to execute the code path that would lockup under whatever conditions are required for it to lockup.
Andrew
WARNING: at kernel/softirq.c:141 local_bh_enable+0x48/0xa0() Modules linked in: ath_ahb ath_hal(P) batman_adv ip6t_REJECT ip6t_LOG ip6t_rt ip6t_hbh ip6t_mh ip6t_ipv6header ip6t_frag ip6t_eui64 ip6t_ah ip6table_raw ip6_queue ip6table_mangle ip6table_filter ip6_tables ebt_redirect ebt_mark ebt_vlan ebt_stp ebt_pkttype ebt_mark_m ebt_limit ebt_among ebt_802_3 ebtable_nat ebtable_filter ebtable_broute ebtables xt_quota xt_pkttype xt_physdev ipt_REJECT xt_TCPMSS ipt_LOG xt_multiport xt_mac xt_limit iptable_mangle iptable_filter ip_tables xt_tcpudp x_tables tun ipv6 Call Trace: [<800493a4>] dump_stack+0x8/0x34 [<80060cd8>] warn_slowpath_common+0x70/0xb0 [<800674ac>] local_bh_enable+0x48/0xa0 [<801b3ef8>] dev_queue_xmit+0x388/0x3dc [<8062c454>] ieee80211_parent_queue_xmit+0x8c/0xb4 [ath_ahb] [<8062c778>] ieee80211_hardstart+0x2fc/0x34c [ath_ahb] [<801b3e5c>] dev_queue_xmit+0x2ec/0x3dc [<80bb68d8>] vis_quit+0x7c4/0x950 [batman_adv]
---[ end trace e96b16c908cf7c40 ]--- br-wan_vpn: received tcn bpdu on port 1(eth0.4) br-wan_vpn: topology change detected, propagating root@OpenWrt:~#
I dug a little deeper into what his means. It means we are in interrupt context and interrupts are disabled. This seems to be not allowed when calling local_bh_enable(), which probably means it is not allowed when calling dev_queue_xmit.
What i don't understand yet is the call path. The Call Trace above does not make much sense. vis_quit() does not send any packets.
Looking at Marek's patch it is going in the right direction. send_skb_packet() cannot be called with interrupts disabled. which means send_raw_packet() cannot be called with interrupts disabled. which means broadcast_vis_packet() needs to spin_unlock_irqrestore() before sending. which means unicast_vis_packet() needs to spin_unlock_irqrestore() before sending.
This is what Mareks patch does. However, this part of Marek's patch looks wrong:
struct vis_info *info, *temp; unsigned long flags;
- spin_lock_irqsave(&vis_hash_lock, flags); purge_vis_packets();
if (generate_vis_packet() == 0) /* schedule if generation was successful */ list_add_tail(&my_vis_info->send_list, &send_list);
+ spin_lock_irqsave(&vis_hash_lock, flags); list_for_each_entry_safe(info, temp, &send_list, send_list) { list_del_init(&info->send_list); send_vis_packet(info); } + spin_unlock_irqrestore(&vis_hash_lock, flags); start_vis_timer(); }
You are disable interrupts and then call send_vis_packet(). This will not work since all calls under send_vis_packet will have interrupts disabled.
Andrew
On Friday 29 January 2010 16:25:45 Andrew Lunn wrote:
This is what Mareks patch does. However, this part of Marek's patch looks wrong:
I'd like to point out that this lock wasn't introduced by my previously posted patch. I managed to reduce the effect of this lock but you can't remove it entirely without changing the whole sending mechanism.
You are disable interrupts and then call send_vis_packet(). This will not work since all calls under send_vis_packet will have interrupts disabled.
Yes, that is correct. The reason for this is simple: The interrupt flags are stored in send_vis_packets() and without passing them on we can't re-enable the interrupts. That did not appear to be a good solution. ;) As far as I understand the vis_lock's only purpose is to avoid that the "info" pointer which is passed to send_vis_packet() can't be deleted while we use it. That might be better solved using reference counting.
Cheers, Marek
Hi Linus
Please could you try this patch and see if it fixes the vis problem. It compiles cleanly with 2.6.32. It is checkpatch clean, it is sparse clean. However, since i've not actually tried running the code i would not be too surprised if it deadlocked, leaked memory, oopsed, ...
Andrew
Staging: batman-adv: Don't have interrupts disabled while sending.
send_vis_packets() would disable interrupts before calling dev_queue_xmit() which resulting in a backtrace in local_bh_enable(). Fix this by using kref on the vis_info object so that we can call send_vis_packets() without holding vis_hash_lock. vis_hash_lock also used to protect recv_list, so we now need a new lock to protect that instead of vis_hash_lock.
Also a few checkpatch cleanups.
Reported-by: Linus L�ssing linus.luessing@web.de Signed-off-by: Andrew Lunn andrew@lunn.ch
Index: vis.c =================================================================== --- vis.c (revision 1568) +++ vis.c (working copy) @@ -30,22 +30,26 @@
struct hashtable_t *vis_hash; DEFINE_SPINLOCK(vis_hash_lock); +static DEFINE_SPINLOCK(recv_list_lock); static struct vis_info *my_vis_info; static struct list_head send_list; /* always locked with vis_hash_lock */
static void start_vis_timer(void);
/* free the info */ -static void free_info(void *data) +static void free_info(struct kref *ref) { - struct vis_info *info = data; + struct vis_info *info = container_of(ref, struct vis_info, refcount); struct recvlist_node *entry, *tmp; + unsigned long flags;
list_del_init(&info->send_list); + spin_lock_irqsave(&recv_list_lock, flags); list_for_each_entry_safe(entry, tmp, &info->recv_list, list) { list_del(&entry->list); kfree(entry); } + spin_unlock_irqrestore(&recv_list_lock, flags); kfree(info); }
@@ -147,32 +151,41 @@ static void recv_list_add(struct list_head *recv_list, char *mac) { struct recvlist_node *entry; + unsigned long flags; + entry = kmalloc(sizeof(struct recvlist_node), GFP_ATOMIC); if (!entry) return;
memcpy(entry->mac, mac, ETH_ALEN); + spin_lock_irqsave(&recv_list_lock, flags); list_add_tail(&entry->list, recv_list); + spin_unlock_irqrestore(&recv_list_lock, flags); }
/* returns 1 if this mac is in the recv_list */ static int recv_list_is_in(struct list_head *recv_list, char *mac) { struct recvlist_node *entry; + unsigned long flags;
+ spin_lock_irqsave(&recv_list_lock, flags); list_for_each_entry(entry, recv_list, list) { - if (memcmp(entry->mac, mac, ETH_ALEN) == 0) + if (memcmp(entry->mac, mac, ETH_ALEN) == 0) { + spin_unlock_irqrestore(&recv_list_lock, flags); return 1; + } } - + spin_unlock_irqrestore(&recv_list_lock, flags); return 0; }
/* try to add the packet to the vis_hash. return NULL if invalid (e.g. too old, - * broken.. ). vis hash must be locked outside. is_new is set when the packet + * broken.. ). vis hash must be locked outside. is_new is set when the packet * is newer than old entries in the hash. */ static struct vis_info *add_packet(struct vis_packet *vis_packet, - int vis_info_len, int *is_new) + int vis_info_len, int *is_new, + int make_broadcast) { struct vis_info *info, *old_info; struct vis_info search_elem; @@ -199,7 +212,7 @@ } /* remove old entry */ hash_remove(vis_hash, old_info); - free_info(old_info); + kref_put(&old_info->refcount, free_info); }
info = kmalloc(sizeof(struct vis_info) + vis_info_len, GFP_ATOMIC); @@ -208,6 +221,7 @@
INIT_LIST_HEAD(&info->send_list); INIT_LIST_HEAD(&info->recv_list); + kref_init(&info->refcount); info->first_seen = jiffies; memcpy(&info->packet, vis_packet, sizeof(struct vis_packet) + vis_info_len); @@ -215,16 +229,21 @@ /* initialize and add new packet. */ *is_new = 1;
+ /* Make it a broadcast packet, if required */ + if (make_broadcast) + memcpy(info->packet.target_orig, broadcastAddr, ETH_ALEN); + /* repair if entries is longer than packet. */ if (info->packet.entries * sizeof(struct vis_info_entry) > vis_info_len) - info->packet.entries = vis_info_len / sizeof(struct vis_info_entry); + info->packet.entries = vis_info_len / + sizeof(struct vis_info_entry);
recv_list_add(&info->recv_list, info->packet.sender_orig);
/* try to add it */ if (hash_add(vis_hash, info) < 0) { /* did not work (for some reason) */ - free_info(info); + kref_put(&old_info->refcount, free_info); info = NULL; }
@@ -235,19 +254,20 @@ void receive_server_sync_packet(struct vis_packet *vis_packet, int vis_info_len) { struct vis_info *info; - int is_new; + int is_new, make_broadcast; unsigned long flags; int vis_server = atomic_read(&vis_mode);
+ make_broadcast = (vis_server == VIS_TYPE_SERVER_SYNC); + spin_lock_irqsave(&vis_hash_lock, flags); - info = add_packet(vis_packet, vis_info_len, &is_new); + info = add_packet(vis_packet, vis_info_len, &is_new, make_broadcast); if (info == NULL) goto end;
/* only if we are server ourselves and packet is newer than the one in * hash.*/ if (vis_server == VIS_TYPE_SERVER_SYNC && is_new) { - memcpy(info->packet.target_orig, broadcastAddr, ETH_ALEN); if (list_empty(&info->send_list)) list_add_tail(&info->send_list, &send_list); } @@ -263,24 +283,27 @@ int is_new; unsigned long flags; int vis_server = atomic_read(&vis_mode); + int are_target = 0;
/* clients shall not broadcast. */ if (is_bcast(vis_packet->target_orig)) return;
+ /* Are we the target for this VIS packet? */ + if (vis_server == VIS_TYPE_SERVER_SYNC && + is_my_mac(info->packet.target_orig)) + are_target = 1; + spin_lock_irqsave(&vis_hash_lock, flags); - info = add_packet(vis_packet, vis_info_len, &is_new); + info = add_packet(vis_packet, vis_info_len, &is_new, are_target); if (info == NULL) goto end; /* note that outdated packets will be dropped at this point. */
/* send only if we're the target server or ... */ - if (vis_server == VIS_TYPE_SERVER_SYNC && - is_my_mac(info->packet.target_orig) && - is_new) { + if (are_target && is_new) { info->packet.vis_type = VIS_TYPE_SERVER_SYNC; /* upgrade! */ - memcpy(info->packet.target_orig, broadcastAddr, ETH_ALEN); if (list_empty(&info->send_list)) list_add_tail(&info->send_list, &send_list);
@@ -362,14 +385,17 @@ while (hash_iterate(orig_hash, &hashit_global)) { orig_node = hashit_global.bucket->data; if (orig_node->router != NULL - && compare_orig(orig_node->router->addr, orig_node->orig) + && compare_orig(orig_node->router->addr, + orig_node->orig) && orig_node->batman_if && (orig_node->batman_if->if_active == IF_ACTIVE) && orig_node->router->tq_avg > 0) {
/* fill one entry into buffer. */ entry = &entry_array[info->packet.entries]; - memcpy(entry->src, orig_node->batman_if->net_dev->dev_addr, ETH_ALEN); + memcpy(entry->src, + orig_node->batman_if->net_dev->dev_addr, + ETH_ALEN); memcpy(entry->dest, orig_node->orig, ETH_ALEN); entry->quality = orig_node->router->tq_avg; info->packet.entries++; @@ -401,6 +427,8 @@ return 0; }
+/* free old vis packets. Must be called with this vis_hash_lock + * held */ static void purge_vis_packets(void) { HASHIT(hashit); @@ -413,7 +441,7 @@ if (time_after(jiffies, info->first_seen + (VIS_TIMEOUT*HZ)/1000)) { hash_remove_bucket(vis_hash, &hashit); - free_info(info); + kref_put(&info->refcount, free_info); } } } @@ -423,6 +451,8 @@ HASHIT(hashit); struct orig_node *orig_node; unsigned long flags; + struct batman_if *batman_if; + uint8_t dstaddr[ETH_ALEN];
spin_lock_irqsave(&orig_hash_lock, flags);
@@ -431,46 +461,57 @@ orig_node = hashit.bucket->data;
/* if it's a vis server and reachable, send it. */ - if (orig_node && - (orig_node->flags & VIS_SERVER) && - orig_node->batman_if && - orig_node->router) { + if ((!orig_node) || (!orig_node->batman_if) || + (!orig_node->router)) + continue; + if (!(orig_node->flags & VIS_SERVER)) + continue; + /* don't send it if we already received the packet from + * this node. */ + if (recv_list_is_in(&info->recv_list, orig_node->orig)) + continue;
- /* don't send it if we already received the packet from - * this node. */ - if (recv_list_is_in(&info->recv_list, orig_node->orig)) - continue; + memcpy(info->packet.target_orig, orig_node->orig, ETH_ALEN); + batman_if = orig_node->batman_if; + memcpy(dstaddr, orig_node->router->addr, ETH_ALEN); + spin_unlock_irqrestore(&orig_hash_lock, flags);
- memcpy(info->packet.target_orig, - orig_node->orig, ETH_ALEN); + send_raw_packet((unsigned char *)&info->packet, + packet_length, batman_if, dstaddr);
- send_raw_packet((unsigned char *) &info->packet, - packet_length, - orig_node->batman_if, - orig_node->router->addr); - } + spin_lock_irqsave(&orig_hash_lock, flags); + } + spin_unlock_irqrestore(&orig_hash_lock, flags); memcpy(info->packet.target_orig, broadcastAddr, ETH_ALEN); - spin_unlock_irqrestore(&orig_hash_lock, flags); }
static void unicast_vis_packet(struct vis_info *info, int packet_length) { struct orig_node *orig_node; unsigned long flags; + struct batman_if *batman_if; + uint8_t dstaddr[ETH_ALEN];
spin_lock_irqsave(&orig_hash_lock, flags); orig_node = ((struct orig_node *) hash_find(orig_hash, info->packet.target_orig));
- if ((orig_node != NULL) && - (orig_node->batman_if != NULL) && - (orig_node->router != NULL)) { - send_raw_packet((unsigned char *) &info->packet, packet_length, - orig_node->batman_if, - orig_node->router->addr); - } + if ((!orig_node) || (!orig_node->batman_if) || (!orig_node->router)) + goto out; + + /* don't lock while sending the packets ... we therefore + * copy the required data before sending */ + batman_if = orig_node->batman_if; + memcpy(dstaddr, orig_node->router->addr, ETH_ALEN); spin_unlock_irqrestore(&orig_hash_lock, flags); + + send_raw_packet((unsigned char *)&info->packet, + packet_length, batman_if, dstaddr); + return; + +out: + spin_unlock_irqrestore(&orig_hash_lock, flags); }
/* only send one vis packet. called from send_vis_packets() */ @@ -503,6 +544,7 @@ unsigned long flags;
spin_lock_irqsave(&vis_hash_lock, flags); + purge_vis_packets();
if (generate_vis_packet() == 0) @@ -511,7 +553,11 @@
list_for_each_entry_safe(info, temp, &send_list, send_list) { list_del_init(&info->send_list); + kref_get(&info->refcount); + spin_unlock_irqrestore(&vis_hash_lock, flags); send_vis_packet(info); + spin_lock_irqsave(&vis_hash_lock, flags); + kref_put(&info->refcount, free_info); } spin_unlock_irqrestore(&vis_hash_lock, flags); start_vis_timer(); @@ -544,6 +590,7 @@ my_vis_info->first_seen = jiffies - atomic_read(&vis_interval); INIT_LIST_HEAD(&my_vis_info->recv_list); INIT_LIST_HEAD(&my_vis_info->send_list); + kref_init(&my_vis_info->refcount); my_vis_info->packet.version = COMPAT_VERSION; my_vis_info->packet.packet_type = BAT_VIS; my_vis_info->packet.ttl = TTL; @@ -557,9 +604,9 @@
if (hash_add(vis_hash, my_vis_info) < 0) { printk(KERN_ERR - "batman-adv:Can't add own vis packet into hash\n"); - free_info(my_vis_info); /* not in hash, need to remove it - * manually. */ + "batman-adv:Can't add own vis packet into hash\n"); + /* not in hash, need to remove it manually. */ + kref_put(&my_vis_info->refcount, free_info); goto err; }
@@ -573,6 +620,13 @@ return 0; }
+/* Decrease the reference count on a hash item info */ +static void free_info_ref(void *data) +{ + struct vis_info *info = data; + kref_put(&info->refcount, free_info); +} + /* shutdown vis-server */ void vis_quit(void) { @@ -584,7 +638,7 @@
spin_lock_irqsave(&vis_hash_lock, flags); /* properly remove, kill timers ... */ - hash_delete(vis_hash, free_info); + hash_delete(vis_hash, free_info_ref); vis_hash = NULL; my_vis_info = NULL; spin_unlock_irqrestore(&vis_hash_lock, flags); @@ -594,5 +648,5 @@ static void start_vis_timer(void) { queue_delayed_work(bat_event_workqueue, &vis_timer_wq, - (atomic_read(&vis_interval) * HZ ) / 1000); + (atomic_read(&vis_interval) * HZ) / 1000); } Index: vis.h =================================================================== --- vis.h (revision 1568) +++ vis.h (working copy) @@ -29,6 +29,7 @@ /* list of server-neighbors we received a vis-packet * from. we should not reply to them. */ struct list_head send_list; + struct kref refcount; /* this packet might be part of the vis send queue. */ struct vis_packet packet; /* vis_info may follow here*/
Hi Andrew,
Yes, ehm it does indeed cause a kernel panic on the server side immediately :). See the attachment for the full call trace.
Cheers, Linus
On Sat, Jan 30, 2010 at 05:50:59PM +0100, Andrew Lunn wrote:
Hi Linus
Please could you try this patch and see if it fixes the vis problem. It compiles cleanly with 2.6.32. It is checkpatch clean, it is sparse clean. However, since i've not actually tried running the code i would not be too surprised if it deadlocked, leaked memory, oopsed, ...
Andrew
Staging: batman-adv: Don't have interrupts disabled while sending.
send_vis_packets() would disable interrupts before calling dev_queue_xmit() which resulting in a backtrace in local_bh_enable(). Fix this by using kref on the vis_info object so that we can call send_vis_packets() without holding vis_hash_lock. vis_hash_lock also used to protect recv_list, so we now need a new lock to protect that instead of vis_hash_lock.
Also a few checkpatch cleanups.
Reported-by: Linus L�ssing linus.luessing@web.de Signed-off-by: Andrew Lunn andrew@lunn.ch
Index: vis.c
--- vis.c (revision 1568) +++ vis.c (working copy) @@ -30,22 +30,26 @@
struct hashtable_t *vis_hash; DEFINE_SPINLOCK(vis_hash_lock); +static DEFINE_SPINLOCK(recv_list_lock); static struct vis_info *my_vis_info; static struct list_head send_list; /* always locked with vis_hash_lock */
static void start_vis_timer(void);
/* free the info */ -static void free_info(void *data) +static void free_info(struct kref *ref) {
- struct vis_info *info = data;
struct vis_info *info = container_of(ref, struct vis_info, refcount); struct recvlist_node *entry, *tmp;
unsigned long flags;
list_del_init(&info->send_list);
spin_lock_irqsave(&recv_list_lock, flags); list_for_each_entry_safe(entry, tmp, &info->recv_list, list) { list_del(&entry->list); kfree(entry); }
spin_unlock_irqrestore(&recv_list_lock, flags); kfree(info);
}
@@ -147,32 +151,41 @@ static void recv_list_add(struct list_head *recv_list, char *mac) { struct recvlist_node *entry;
unsigned long flags;
entry = kmalloc(sizeof(struct recvlist_node), GFP_ATOMIC); if (!entry) return;
memcpy(entry->mac, mac, ETH_ALEN);
spin_lock_irqsave(&recv_list_lock, flags); list_add_tail(&entry->list, recv_list);
spin_unlock_irqrestore(&recv_list_lock, flags);
}
/* returns 1 if this mac is in the recv_list */ static int recv_list_is_in(struct list_head *recv_list, char *mac) { struct recvlist_node *entry;
unsigned long flags;
spin_lock_irqsave(&recv_list_lock, flags); list_for_each_entry(entry, recv_list, list) {
if (memcmp(entry->mac, mac, ETH_ALEN) == 0)
if (memcmp(entry->mac, mac, ETH_ALEN) == 0) {
spin_unlock_irqrestore(&recv_list_lock, flags); return 1;
}}
- spin_unlock_irqrestore(&recv_list_lock, flags); return 0;
}
/* try to add the packet to the vis_hash. return NULL if invalid (e.g. too old,
- broken.. ). vis hash must be locked outside. is_new is set when the packet
- broken.. ). vis hash must be locked outside. is_new is set when the packet
- is newer than old entries in the hash. */
static struct vis_info *add_packet(struct vis_packet *vis_packet,
int vis_info_len, int *is_new)
int vis_info_len, int *is_new,
int make_broadcast)
{ struct vis_info *info, *old_info; struct vis_info search_elem; @@ -199,7 +212,7 @@ } /* remove old entry */ hash_remove(vis_hash, old_info);
free_info(old_info);
kref_put(&old_info->refcount, free_info);
}
info = kmalloc(sizeof(struct vis_info) + vis_info_len, GFP_ATOMIC);
@@ -208,6 +221,7 @@
INIT_LIST_HEAD(&info->send_list); INIT_LIST_HEAD(&info->recv_list);
- kref_init(&info->refcount); info->first_seen = jiffies; memcpy(&info->packet, vis_packet, sizeof(struct vis_packet) + vis_info_len);
@@ -215,16 +229,21 @@ /* initialize and add new packet. */ *is_new = 1;
- /* Make it a broadcast packet, if required */
- if (make_broadcast)
memcpy(info->packet.target_orig, broadcastAddr, ETH_ALEN);
- /* repair if entries is longer than packet. */ if (info->packet.entries * sizeof(struct vis_info_entry) > vis_info_len)
info->packet.entries = vis_info_len / sizeof(struct vis_info_entry);
info->packet.entries = vis_info_len /
sizeof(struct vis_info_entry);
recv_list_add(&info->recv_list, info->packet.sender_orig);
/* try to add it */ if (hash_add(vis_hash, info) < 0) { /* did not work (for some reason) */
free_info(info);
info = NULL; }kref_put(&old_info->refcount, free_info);
@@ -235,19 +254,20 @@ void receive_server_sync_packet(struct vis_packet *vis_packet, int vis_info_len) { struct vis_info *info;
- int is_new;
int is_new, make_broadcast; unsigned long flags; int vis_server = atomic_read(&vis_mode);
make_broadcast = (vis_server == VIS_TYPE_SERVER_SYNC);
spin_lock_irqsave(&vis_hash_lock, flags);
- info = add_packet(vis_packet, vis_info_len, &is_new);
info = add_packet(vis_packet, vis_info_len, &is_new, make_broadcast); if (info == NULL) goto end;
/* only if we are server ourselves and packet is newer than the one in
- hash.*/
if (vis_server == VIS_TYPE_SERVER_SYNC && is_new) {
if (list_empty(&info->send_list)) list_add_tail(&info->send_list, &send_list); }memcpy(info->packet.target_orig, broadcastAddr, ETH_ALEN);
@@ -263,24 +283,27 @@ int is_new; unsigned long flags; int vis_server = atomic_read(&vis_mode);
int are_target = 0;
/* clients shall not broadcast. */ if (is_bcast(vis_packet->target_orig)) return;
/* Are we the target for this VIS packet? */
if (vis_server == VIS_TYPE_SERVER_SYNC &&
is_my_mac(info->packet.target_orig))
are_target = 1;
spin_lock_irqsave(&vis_hash_lock, flags);
- info = add_packet(vis_packet, vis_info_len, &is_new);
info = add_packet(vis_packet, vis_info_len, &is_new, are_target); if (info == NULL) goto end; /* note that outdated packets will be dropped at this point. */
/* send only if we're the target server or ... */
- if (vis_server == VIS_TYPE_SERVER_SYNC &&
is_my_mac(info->packet.target_orig) &&
is_new) {
- if (are_target && is_new) { info->packet.vis_type = VIS_TYPE_SERVER_SYNC; /* upgrade! */
if (list_empty(&info->send_list)) list_add_tail(&info->send_list, &send_list);memcpy(info->packet.target_orig, broadcastAddr, ETH_ALEN);
@@ -362,14 +385,17 @@ while (hash_iterate(orig_hash, &hashit_global)) { orig_node = hashit_global.bucket->data; if (orig_node->router != NULL
&& compare_orig(orig_node->router->addr, orig_node->orig)
&& compare_orig(orig_node->router->addr,
orig_node->orig) && orig_node->batman_if && (orig_node->batman_if->if_active == IF_ACTIVE)
&& orig_node->router->tq_avg > 0) {
/* fill one entry into buffer. */ entry = &entry_array[info->packet.entries];
memcpy(entry->src, orig_node->batman_if->net_dev->dev_addr, ETH_ALEN);
memcpy(entry->src,
orig_node->batman_if->net_dev->dev_addr,
ETH_ALEN); memcpy(entry->dest, orig_node->orig, ETH_ALEN); entry->quality = orig_node->router->tq_avg; info->packet.entries++;
@@ -401,6 +427,8 @@ return 0; }
+/* free old vis packets. Must be called with this vis_hash_lock
- held */
static void purge_vis_packets(void) { HASHIT(hashit); @@ -413,7 +441,7 @@ if (time_after(jiffies, info->first_seen + (VIS_TIMEOUT*HZ)/1000)) { hash_remove_bucket(vis_hash, &hashit);
free_info(info);
} }kref_put(&info->refcount, free_info);
} @@ -423,6 +451,8 @@ HASHIT(hashit); struct orig_node *orig_node; unsigned long flags;
struct batman_if *batman_if;
uint8_t dstaddr[ETH_ALEN];
spin_lock_irqsave(&orig_hash_lock, flags);
@@ -431,46 +461,57 @@ orig_node = hashit.bucket->data;
/* if it's a vis server and reachable, send it. */
if (orig_node &&
(orig_node->flags & VIS_SERVER) &&
orig_node->batman_if &&
orig_node->router) {
if ((!orig_node) || (!orig_node->batman_if) ||
(!orig_node->router))
continue;
if (!(orig_node->flags & VIS_SERVER))
continue;
/* don't send it if we already received the packet from
* this node. */
if (recv_list_is_in(&info->recv_list, orig_node->orig))
continue;
/* don't send it if we already received the packet from
* this node. */
if (recv_list_is_in(&info->recv_list, orig_node->orig))
continue;
memcpy(info->packet.target_orig, orig_node->orig, ETH_ALEN);
batman_if = orig_node->batman_if;
memcpy(dstaddr, orig_node->router->addr, ETH_ALEN);
spin_unlock_irqrestore(&orig_hash_lock, flags);
memcpy(info->packet.target_orig,
orig_node->orig, ETH_ALEN);
send_raw_packet((unsigned char *)&info->packet,
packet_length, batman_if, dstaddr);
send_raw_packet((unsigned char *) &info->packet,
packet_length,
orig_node->batman_if,
orig_node->router->addr);
}
spin_lock_irqsave(&orig_hash_lock, flags);
- }
- spin_unlock_irqrestore(&orig_hash_lock, flags); memcpy(info->packet.target_orig, broadcastAddr, ETH_ALEN);
- spin_unlock_irqrestore(&orig_hash_lock, flags);
}
static void unicast_vis_packet(struct vis_info *info, int packet_length) { struct orig_node *orig_node; unsigned long flags;
struct batman_if *batman_if;
uint8_t dstaddr[ETH_ALEN];
spin_lock_irqsave(&orig_hash_lock, flags); orig_node = ((struct orig_node *) hash_find(orig_hash, info->packet.target_orig));
- if ((orig_node != NULL) &&
(orig_node->batman_if != NULL) &&
(orig_node->router != NULL)) {
send_raw_packet((unsigned char *) &info->packet, packet_length,
orig_node->batman_if,
orig_node->router->addr);
- }
- if ((!orig_node) || (!orig_node->batman_if) || (!orig_node->router))
goto out;
- /* don't lock while sending the packets ... we therefore
* copy the required data before sending */
- batman_if = orig_node->batman_if;
- memcpy(dstaddr, orig_node->router->addr, ETH_ALEN); spin_unlock_irqrestore(&orig_hash_lock, flags);
- send_raw_packet((unsigned char *)&info->packet,
packet_length, batman_if, dstaddr);
- return;
+out:
- spin_unlock_irqrestore(&orig_hash_lock, flags);
}
/* only send one vis packet. called from send_vis_packets() */ @@ -503,6 +544,7 @@ unsigned long flags;
spin_lock_irqsave(&vis_hash_lock, flags);
purge_vis_packets();
if (generate_vis_packet() == 0)
@@ -511,7 +553,11 @@
list_for_each_entry_safe(info, temp, &send_list, send_list) { list_del_init(&info->send_list);
kref_get(&info->refcount);
send_vis_packet(info);spin_unlock_irqrestore(&vis_hash_lock, flags);
spin_lock_irqsave(&vis_hash_lock, flags);
} spin_unlock_irqrestore(&vis_hash_lock, flags); start_vis_timer();kref_put(&info->refcount, free_info);
@@ -544,6 +590,7 @@ my_vis_info->first_seen = jiffies - atomic_read(&vis_interval); INIT_LIST_HEAD(&my_vis_info->recv_list); INIT_LIST_HEAD(&my_vis_info->send_list);
- kref_init(&my_vis_info->refcount); my_vis_info->packet.version = COMPAT_VERSION; my_vis_info->packet.packet_type = BAT_VIS; my_vis_info->packet.ttl = TTL;
@@ -557,9 +604,9 @@
if (hash_add(vis_hash, my_vis_info) < 0) { printk(KERN_ERR
"batman-adv:Can't add own vis packet into hash\n");
free_info(my_vis_info); /* not in hash, need to remove it
* manually. */
"batman-adv:Can't add own vis packet into hash\n");
/* not in hash, need to remove it manually. */
goto err; }kref_put(&my_vis_info->refcount, free_info);
@@ -573,6 +620,13 @@ return 0; }
+/* Decrease the reference count on a hash item info */ +static void free_info_ref(void *data) +{
- struct vis_info *info = data;
- kref_put(&info->refcount, free_info);
+}
/* shutdown vis-server */ void vis_quit(void) { @@ -584,7 +638,7 @@
spin_lock_irqsave(&vis_hash_lock, flags); /* properly remove, kill timers ... */
- hash_delete(vis_hash, free_info);
- hash_delete(vis_hash, free_info_ref); vis_hash = NULL; my_vis_info = NULL; spin_unlock_irqrestore(&vis_hash_lock, flags);
@@ -594,5 +648,5 @@ static void start_vis_timer(void) { queue_delayed_work(bat_event_workqueue, &vis_timer_wq,
(atomic_read(&vis_interval) * HZ ) / 1000);
(atomic_read(&vis_interval) * HZ) / 1000);
} Index: vis.h =================================================================== --- vis.h (revision 1568) +++ vis.h (working copy) @@ -29,6 +29,7 @@ /* list of server-neighbors we received a vis-packet * from. we should not reply to them. */ struct list_head send_list;
- struct kref refcount; /* this packet might be part of the vis send queue. */ struct vis_packet packet; /* vis_info may follow here*/
On Sun, Jan 31, 2010 at 08:37:29PM +0100, Linus L??ssing wrote:
Hi Andrew,
Yes, ehm it does indeed cause a kernel panic on the server side immediately :). See the attachment for the full call trace.
Oh yes, dumb error. I will give you a new patch tomorrow. I might even try it first this time!
Thanks for testing,
Andrew
Hi Linus
Here is a new version of the patch. I've tested it this time using five UML machines. It should not immediately opps now.
Please could you test it with your test setup.
Thanks Andrew
Staging: batman-adv: Don't have interrupts disabled while sending.
send_vis_packets() would disable interrupts before calling dev_queue_xmit() which resulting in a backtrace in local_bh_enable(). Fix this by using kref on the vis_info object so that we can call send_vis_packets() without holding vis_hash_lock. vis_hash_lock also used to protect recv_list, so we now need a new lock to protect that instead of vis_hash_lock.
Also a few checkpatch cleanups.
Reported-by: Linus L\374ssing linus.luessing@web.de Signed-off-by: Andrew Lunn andrew@lunn.ch
Index: vis.c
Index: vis.c =================================================================== --- vis.c (revision 1568) +++ vis.c (working copy) @@ -30,22 +30,26 @@
struct hashtable_t *vis_hash; DEFINE_SPINLOCK(vis_hash_lock); +static DEFINE_SPINLOCK(recv_list_lock); static struct vis_info *my_vis_info; static struct list_head send_list; /* always locked with vis_hash_lock */
static void start_vis_timer(void);
/* free the info */ -static void free_info(void *data) +static void free_info(struct kref *ref) { - struct vis_info *info = data; + struct vis_info *info = container_of(ref, struct vis_info, refcount); struct recvlist_node *entry, *tmp; + unsigned long flags;
list_del_init(&info->send_list); + spin_lock_irqsave(&recv_list_lock, flags); list_for_each_entry_safe(entry, tmp, &info->recv_list, list) { list_del(&entry->list); kfree(entry); } + spin_unlock_irqrestore(&recv_list_lock, flags); kfree(info); }
@@ -147,32 +151,41 @@ static void recv_list_add(struct list_head *recv_list, char *mac) { struct recvlist_node *entry; + unsigned long flags; + entry = kmalloc(sizeof(struct recvlist_node), GFP_ATOMIC); if (!entry) return;
memcpy(entry->mac, mac, ETH_ALEN); + spin_lock_irqsave(&recv_list_lock, flags); list_add_tail(&entry->list, recv_list); + spin_unlock_irqrestore(&recv_list_lock, flags); }
/* returns 1 if this mac is in the recv_list */ static int recv_list_is_in(struct list_head *recv_list, char *mac) { struct recvlist_node *entry; + unsigned long flags;
+ spin_lock_irqsave(&recv_list_lock, flags); list_for_each_entry(entry, recv_list, list) { - if (memcmp(entry->mac, mac, ETH_ALEN) == 0) + if (memcmp(entry->mac, mac, ETH_ALEN) == 0) { + spin_unlock_irqrestore(&recv_list_lock, flags); return 1; + } } - + spin_unlock_irqrestore(&recv_list_lock, flags); return 0; }
/* try to add the packet to the vis_hash. return NULL if invalid (e.g. too old, - * broken.. ). vis hash must be locked outside. is_new is set when the packet + * broken.. ). vis hash must be locked outside. is_new is set when the packet * is newer than old entries in the hash. */ static struct vis_info *add_packet(struct vis_packet *vis_packet, - int vis_info_len, int *is_new) + int vis_info_len, int *is_new, + int make_broadcast) { struct vis_info *info, *old_info; struct vis_info search_elem; @@ -199,7 +212,7 @@ } /* remove old entry */ hash_remove(vis_hash, old_info); - free_info(old_info); + kref_put(&old_info->refcount, free_info); }
info = kmalloc(sizeof(struct vis_info) + vis_info_len, GFP_ATOMIC); @@ -208,6 +221,7 @@
INIT_LIST_HEAD(&info->send_list); INIT_LIST_HEAD(&info->recv_list); + kref_init(&info->refcount); info->first_seen = jiffies; memcpy(&info->packet, vis_packet, sizeof(struct vis_packet) + vis_info_len); @@ -215,16 +229,21 @@ /* initialize and add new packet. */ *is_new = 1;
+ /* Make it a broadcast packet, if required */ + if (make_broadcast) + memcpy(info->packet.target_orig, broadcastAddr, ETH_ALEN); + /* repair if entries is longer than packet. */ if (info->packet.entries * sizeof(struct vis_info_entry) > vis_info_len) - info->packet.entries = vis_info_len / sizeof(struct vis_info_entry); + info->packet.entries = vis_info_len / + sizeof(struct vis_info_entry);
recv_list_add(&info->recv_list, info->packet.sender_orig);
/* try to add it */ if (hash_add(vis_hash, info) < 0) { /* did not work (for some reason) */ - free_info(info); + kref_put(&old_info->refcount, free_info); info = NULL; }
@@ -235,19 +254,20 @@ void receive_server_sync_packet(struct vis_packet *vis_packet, int vis_info_len) { struct vis_info *info; - int is_new; + int is_new, make_broadcast; unsigned long flags; int vis_server = atomic_read(&vis_mode);
+ make_broadcast = (vis_server == VIS_TYPE_SERVER_SYNC); + spin_lock_irqsave(&vis_hash_lock, flags); - info = add_packet(vis_packet, vis_info_len, &is_new); + info = add_packet(vis_packet, vis_info_len, &is_new, make_broadcast); if (info == NULL) goto end;
/* only if we are server ourselves and packet is newer than the one in * hash.*/ if (vis_server == VIS_TYPE_SERVER_SYNC && is_new) { - memcpy(info->packet.target_orig, broadcastAddr, ETH_ALEN); if (list_empty(&info->send_list)) list_add_tail(&info->send_list, &send_list); } @@ -263,24 +283,27 @@ int is_new; unsigned long flags; int vis_server = atomic_read(&vis_mode); + int are_target = 0;
/* clients shall not broadcast. */ if (is_bcast(vis_packet->target_orig)) return;
+ /* Are we the target for this VIS packet? */ + if (vis_server == VIS_TYPE_SERVER_SYNC && + is_my_mac(vis_packet->target_orig)) + are_target = 1; + spin_lock_irqsave(&vis_hash_lock, flags); - info = add_packet(vis_packet, vis_info_len, &is_new); + info = add_packet(vis_packet, vis_info_len, &is_new, are_target); if (info == NULL) goto end; /* note that outdated packets will be dropped at this point. */
/* send only if we're the target server or ... */ - if (vis_server == VIS_TYPE_SERVER_SYNC && - is_my_mac(info->packet.target_orig) && - is_new) { + if (are_target && is_new) { info->packet.vis_type = VIS_TYPE_SERVER_SYNC; /* upgrade! */ - memcpy(info->packet.target_orig, broadcastAddr, ETH_ALEN); if (list_empty(&info->send_list)) list_add_tail(&info->send_list, &send_list);
@@ -362,14 +385,17 @@ while (hash_iterate(orig_hash, &hashit_global)) { orig_node = hashit_global.bucket->data; if (orig_node->router != NULL - && compare_orig(orig_node->router->addr, orig_node->orig) + && compare_orig(orig_node->router->addr, + orig_node->orig) && orig_node->batman_if && (orig_node->batman_if->if_active == IF_ACTIVE) && orig_node->router->tq_avg > 0) {
/* fill one entry into buffer. */ entry = &entry_array[info->packet.entries]; - memcpy(entry->src, orig_node->batman_if->net_dev->dev_addr, ETH_ALEN); + memcpy(entry->src, + orig_node->batman_if->net_dev->dev_addr, + ETH_ALEN); memcpy(entry->dest, orig_node->orig, ETH_ALEN); entry->quality = orig_node->router->tq_avg; info->packet.entries++; @@ -401,6 +427,8 @@ return 0; }
+/* free old vis packets. Must be called with this vis_hash_lock + * held */ static void purge_vis_packets(void) { HASHIT(hashit); @@ -413,7 +441,7 @@ if (time_after(jiffies, info->first_seen + (VIS_TIMEOUT*HZ)/1000)) { hash_remove_bucket(vis_hash, &hashit); - free_info(info); + kref_put(&info->refcount, free_info); } } } @@ -423,6 +451,8 @@ HASHIT(hashit); struct orig_node *orig_node; unsigned long flags; + struct batman_if *batman_if; + uint8_t dstaddr[ETH_ALEN];
spin_lock_irqsave(&orig_hash_lock, flags);
@@ -431,46 +461,57 @@ orig_node = hashit.bucket->data;
/* if it's a vis server and reachable, send it. */ - if (orig_node && - (orig_node->flags & VIS_SERVER) && - orig_node->batman_if && - orig_node->router) { + if ((!orig_node) || (!orig_node->batman_if) || + (!orig_node->router)) + continue; + if (!(orig_node->flags & VIS_SERVER)) + continue; + /* don't send it if we already received the packet from + * this node. */ + if (recv_list_is_in(&info->recv_list, orig_node->orig)) + continue;
- /* don't send it if we already received the packet from - * this node. */ - if (recv_list_is_in(&info->recv_list, orig_node->orig)) - continue; + memcpy(info->packet.target_orig, orig_node->orig, ETH_ALEN); + batman_if = orig_node->batman_if; + memcpy(dstaddr, orig_node->router->addr, ETH_ALEN); + spin_unlock_irqrestore(&orig_hash_lock, flags);
- memcpy(info->packet.target_orig, - orig_node->orig, ETH_ALEN); + send_raw_packet((unsigned char *)&info->packet, + packet_length, batman_if, dstaddr);
- send_raw_packet((unsigned char *) &info->packet, - packet_length, - orig_node->batman_if, - orig_node->router->addr); - } + spin_lock_irqsave(&orig_hash_lock, flags); + } + spin_unlock_irqrestore(&orig_hash_lock, flags); memcpy(info->packet.target_orig, broadcastAddr, ETH_ALEN); - spin_unlock_irqrestore(&orig_hash_lock, flags); }
static void unicast_vis_packet(struct vis_info *info, int packet_length) { struct orig_node *orig_node; unsigned long flags; + struct batman_if *batman_if; + uint8_t dstaddr[ETH_ALEN];
spin_lock_irqsave(&orig_hash_lock, flags); orig_node = ((struct orig_node *) hash_find(orig_hash, info->packet.target_orig));
- if ((orig_node != NULL) && - (orig_node->batman_if != NULL) && - (orig_node->router != NULL)) { - send_raw_packet((unsigned char *) &info->packet, packet_length, - orig_node->batman_if, - orig_node->router->addr); - } + if ((!orig_node) || (!orig_node->batman_if) || (!orig_node->router)) + goto out; + + /* don't lock while sending the packets ... we therefore + * copy the required data before sending */ + batman_if = orig_node->batman_if; + memcpy(dstaddr, orig_node->router->addr, ETH_ALEN); spin_unlock_irqrestore(&orig_hash_lock, flags); + + send_raw_packet((unsigned char *)&info->packet, + packet_length, batman_if, dstaddr); + return; + +out: + spin_unlock_irqrestore(&orig_hash_lock, flags); }
/* only send one vis packet. called from send_vis_packets() */ @@ -503,6 +544,7 @@ unsigned long flags;
spin_lock_irqsave(&vis_hash_lock, flags); + purge_vis_packets();
if (generate_vis_packet() == 0) @@ -511,7 +553,11 @@
list_for_each_entry_safe(info, temp, &send_list, send_list) { list_del_init(&info->send_list); + kref_get(&info->refcount); + spin_unlock_irqrestore(&vis_hash_lock, flags); send_vis_packet(info); + spin_lock_irqsave(&vis_hash_lock, flags); + kref_put(&info->refcount, free_info); } spin_unlock_irqrestore(&vis_hash_lock, flags); start_vis_timer(); @@ -544,6 +590,7 @@ my_vis_info->first_seen = jiffies - atomic_read(&vis_interval); INIT_LIST_HEAD(&my_vis_info->recv_list); INIT_LIST_HEAD(&my_vis_info->send_list); + kref_init(&my_vis_info->refcount); my_vis_info->packet.version = COMPAT_VERSION; my_vis_info->packet.packet_type = BAT_VIS; my_vis_info->packet.ttl = TTL; @@ -557,9 +604,9 @@
if (hash_add(vis_hash, my_vis_info) < 0) { printk(KERN_ERR - "batman-adv:Can't add own vis packet into hash\n"); - free_info(my_vis_info); /* not in hash, need to remove it - * manually. */ + "batman-adv:Can't add own vis packet into hash\n"); + /* not in hash, need to remove it manually. */ + kref_put(&my_vis_info->refcount, free_info); goto err; }
@@ -573,6 +620,13 @@ return 0; }
+/* Decrease the reference count on a hash item info */ +static void free_info_ref(void *data) +{ + struct vis_info *info = data; + kref_put(&info->refcount, free_info); +} + /* shutdown vis-server */ void vis_quit(void) { @@ -584,7 +638,7 @@
spin_lock_irqsave(&vis_hash_lock, flags); /* properly remove, kill timers ... */ - hash_delete(vis_hash, free_info); + hash_delete(vis_hash, free_info_ref); vis_hash = NULL; my_vis_info = NULL; spin_unlock_irqrestore(&vis_hash_lock, flags); @@ -594,5 +648,5 @@ static void start_vis_timer(void) { queue_delayed_work(bat_event_workqueue, &vis_timer_wq, - (atomic_read(&vis_interval) * HZ ) / 1000); + (atomic_read(&vis_interval) * HZ) / 1000); } Index: vis.h =================================================================== --- vis.h (revision 1568) +++ vis.h (working copy) @@ -29,6 +29,7 @@ /* list of server-neighbors we received a vis-packet * from. we should not reply to them. */ struct list_head send_list; + struct kref refcount; /* this packet might be part of the vis send queue. */ struct vis_packet packet; /* vis_info may follow here*/
On Thu, Feb 11, 2010 at 10:46:59AM +0100, Andrew Lunn wrote:
Hi Linus
Here is a new version of the patch. I've tested it this time using five UML machines. It should not immediately opps now.
Instead is will leak memory and crash after a while...
I will try to find the memory leak.
Andrew
Hi Andrew,
Sorry, didn't have the time to try your patch any earlier, I'm right in the middle of my exams :). Your patch already looks quite good, I couldn't reproduce any memory leaks or crashes here (tried that with three routers and 1 or 2 vis servers activated, also activating/deactivating them a lot, no problems with that). And yes, the slow-path warning has gone with your patch. However, I'm having some weird output when connecting two routers over wifi _and_ over ethernet cable. The setup:
Before plugging in the cable: r1-ath1 <-- wifi --> r2-ath1 ------------ root@OpenWrt:~# batctl vd dot digraph { "r1-ath1" -> "r2-ath1" [label="1.32"] "r1-ath1" -> "r1-hna" [label="HNA"] "r1-ath1" -> "5a:2e:1e:1f:64:6b" [label="HNA"] subgraph "cluster_r1-ath1" { "r1-ath1" [peripheries=2] } "r2-ath1" -> "r1-ath1" [label="1.11"] "r2-ath1" -> "r2-hna" [label="HNA"] "r2-ath1" -> "82:31:95:f9:14:6f" [label="HNA"] subgraph "cluster_r2-ath1" { "r2-ath1" [peripheries=2] } } ------------ After plugging in the cable: r1-ath1 <-- wifi --> r2-ath1 + r1-eth0.3 <-- cable --> r2-eth0.3 ------------ root@OpenWrt:~# batctl vd dot digraph { "r1-ath1" -> "r2-ath1" [label="1.0"] "r1-ath1" -> "r2-eth0.3" [label="1.66"] "r1-ath1" -> "r1-hna" [label="HNA"] "r1-ath1" -> "5a:2e:1e:1f:64:6b" [label="HNA"] subgraph "cluster_r1-ath1" { "r1-ath1" [peripheries=2] "r1-eth0.3" } subgraph "cluster_r1-ath1" { "r1-ath1" [peripheries=2] } "r2-ath1" -> "r1-ath1" [label="1.0"] "r2-ath1" -> "r1-eth0.3" [label="1.15"] "r2-ath1" -> "r2-hna" [label="HNA"] "r2-ath1" -> "82:31:95:f9:14:6f" [label="HNA"] subgraph "cluster_r2-ath1" { "r2-ath1" [peripheries=2] "r2-eth0.3" } subgraph "cluster_r2-ath1" { "r2-ath1" [peripheries=2] } } root@OpenWrt:~# cat /proc/net/batman-adv/vis_data 06:22:b0:98:87:dd,TQ 04:22:b0:98:87:fa 251, HNA 00:22:b0:98:87:dd, HNA 5a:2e:1e:1f:64:6b, PRIMARY, SEC 04:22:b0:98:87:de, 06:22:b0:98:87:f9,TQ 06:22:b0:98:87:dd 255, TQ 04:22:b0:98:87:de 251, HNA 00:22:b0:98:87:f9, HNA 82:31:95:f9:14:6f, SEC 04:22:b0:98:87:fa, PRIMARY, ---------- So the second 'subgraph "cluster_r1-ath1"' is obviously unnecessary. Also "r1-ath1" -> "r2-eth0.3" looks wrong, should be "r1-eth0.3" -> "r2-eth0.3" instead (and the same with r2 a few lines later).
Cheers, Linus
On Thu, Feb 11, 2010 at 11:01:56AM +0100, Andrew Lunn wrote:
On Thu, Feb 11, 2010 at 10:46:59AM +0100, Andrew Lunn wrote:
Hi Linus
Here is a new version of the patch. I've tested it this time using five UML machines. It should not immediately opps now.
Instead is will leak memory and crash after a while...
I will try to find the memory leak.
Andrew
On Fri, Feb 19, 2010 at 06:19:05PM +0100, Linus L??ssing wrote:
Hi Andrew,
Sorry, didn't have the time to try your patch any earlier, I'm right in the middle of my exams :).
Hi Linus
Marek told me. No problems. I remember what its like studying for exams. However, it is nice to sometimes take a break and do something totally different.
Your patch already looks quite good, I couldn't reproduce any memory leaks or crashes here (tried that with three routers and 1 or 2 vis servers activated, also activating/deactivating them a lot, no problems with that). And yes, the slow-path warning has gone with your patch.
Great. So we are on the right tracks.
However, I'm having some weird output when connecting two routers over wifi _and_ over ethernet cable. The setup:
Before plugging in the cable: r1-ath1 <-- wifi --> r2-ath1
root@OpenWrt:~# batctl vd dot digraph { "r1-ath1" -> "r2-ath1" [label="1.32"] "r1-ath1" -> "r1-hna" [label="HNA"] "r1-ath1" -> "5a:2e:1e:1f:64:6b" [label="HNA"] subgraph "cluster_r1-ath1" { "r1-ath1" [peripheries=2] } "r2-ath1" -> "r1-ath1" [label="1.11"] "r2-ath1" -> "r2-hna" [label="HNA"] "r2-ath1" -> "82:31:95:f9:14:6f" [label="HNA"] subgraph "cluster_r2-ath1" { "r2-ath1" [peripheries=2] } }
After plugging in the cable: r1-ath1 <-- wifi --> r2-ath1 + r1-eth0.3 <-- cable --> r2-eth0.3
root@OpenWrt:~# batctl vd dot digraph { "r1-ath1" -> "r2-ath1" [label="1.0"] "r1-ath1" -> "r2-eth0.3" [label="1.66"] "r1-ath1" -> "r1-hna" [label="HNA"] "r1-ath1" -> "5a:2e:1e:1f:64:6b" [label="HNA"] subgraph "cluster_r1-ath1" { "r1-ath1" [peripheries=2] "r1-eth0.3" } subgraph "cluster_r1-ath1" { "r1-ath1" [peripheries=2] } "r2-ath1" -> "r1-ath1" [label="1.0"] "r2-ath1" -> "r1-eth0.3" [label="1.15"] "r2-ath1" -> "r2-hna" [label="HNA"] "r2-ath1" -> "82:31:95:f9:14:6f" [label="HNA"] subgraph "cluster_r2-ath1" { "r2-ath1" [peripheries=2] "r2-eth0.3" } subgraph "cluster_r2-ath1" { "r2-ath1" [peripheries=2] } } root@OpenWrt:~# cat /proc/net/batman-adv/vis_data 06:22:b0:98:87:dd,TQ 04:22:b0:98:87:fa 251, HNA 00:22:b0:98:87:dd, HNA 5a:2e:1e:1f:64:6b, PRIMARY, SEC 04:22:b0:98:87:de, 06:22:b0:98:87:f9,TQ 06:22:b0:98:87:dd 255, TQ 04:22:b0:98:87:de 251, HNA 00:22:b0:98:87:f9, HNA 82:31:95:f9:14:6f, SEC 04:22:b0:98:87:fa, PRIMARY,
Actually, this vis_data to does not map to the dot above! There are the wrong number of HNA, wrong order etc.
Here is what i think your bat-host file contains: 06:22:b0:98:87:dd r1-ath1 06:22:b0:98:87:f9 r2-ath1 00:22:b0:98:87:dd r1-hna 04:22:b0:98:87:de r1-eth0.3 00:22:b0:98:87:f9 r2-hna 04:22:b0:98:87:fa r2-eth0.3
and this is what i get, assuming i got the MAC->name mapping correct:
digraph { "r1-ath1" -> "r2-eth0.3" [label="1.15"] "r1-ath1" -> "r1-hna" [label="HNA"] "r1-ath1" -> "5a:2e:1e:1f:64:6b" [label="HNA"] subgraph "cluster_r1-ath1" { "r1-ath1" [peripheries=2] } subgraph "cluster_r1-ath1" { "r1-ath1" [peripheries=2] "r1-eth0.3" } "r2-ath1" -> "r1-ath1" [label="1.0"] "r2-ath1" -> "r1-eth0.3" [label="1.15"] "r2-ath1" -> "r2-hna" [label="HNA"] "r2-ath1" -> "82:31:95:f9:14:6f" [label="HNA"] subgraph "cluster_r2-ath1" { "r2-ath1" [peripheries=2] "r2-eth0.3" } subgraph "cluster_r2-ath1" { "r2-ath1" [peripheries=2] } }
batctl parses top-to-bottom, left-to-right. It does not consolidate the PRIMARY and the SECONDARY into one cluster. It leaves DOT to do that. Hence there are two cluster statements for each cluster actually drawn.
So the second 'subgraph "cluster_r1-ath1"' is obviously unnecessary.
Yes, unnecessary, but makes the batctl code easier.
Also "r1-ath1" -> "r2-eth0.3" looks wrong, should be
"r1-eth0.3" -> "r2-eth0.3" instead (and the same with r2 a few lines later).
These comments i agree with. A wireless and a wired device should not be neighbours. We don't have any records which originate from the secondary MAC address. That is guess is the major problem here.
So, did my/Mareks patch break it, or was it broken before?
First i suggest you go back to just before Simon's patch which introduced receiving using skbufs:
http://open-mesh.org/changeset/1517
That will tell us if we need to go back further, or our patch broke it.
If you need to go back further, i would suggest just before:
http://open-mesh.org/changeset/1510
However, if it is our patch then we can chop the patch into two:
Use Mareks patch:
https://lists.open-mesh.org/pipermail/b.a.t.m.a.n/2010-January/002261.html
and
Index: vis.c =================================================================== --- vis.c (revision 1575) +++ vis.c (working copy) @@ -444,10 +444,15 @@ memcpy(info->packet.target_orig, orig_node->orig, ETH_ALEN);
+spin_unlock_irqrestore(&orig_hash_lock, flags); + send_raw_packet((unsigned char *) &info->packet, packet_length, orig_node->batman_if, orig_node->router->addr); + +spin_lock_irqsave(&orig_hash_lock, flags); + } } memcpy(info->packet.target_orig, broadcastAddr, ETH_ALEN);
This adds a race condition, which i hope if O.K. for debugging purposes, but i hope allows the send to happen without the slowpath errors. If so, we can test Marek's part of the patch.
I'm on vacation for a week now. I will have Internet access some time, but not much.
Have fun debugging.
Andrew
On Sat, Feb 20, 2010 at 07:04:11PM +0100, Andrew Lunn wrote:
On Fri, Feb 19, 2010 at 06:19:05PM +0100, Linus L??ssing wrote:
Hi Andrew,
Sorry, didn't have the time to try your patch any earlier, I'm right in the middle of my exams :).
Hi Linus
Marek told me. No problems. I remember what its like studying for exams. However, it is nice to sometimes take a break and do something totally different.
Your patch already looks quite good, I couldn't reproduce any memory leaks or crashes here (tried that with three routers and 1 or 2 vis servers activated, also activating/deactivating them a lot, no problems with that). And yes, the slow-path warning has gone with your patch.
Great. So we are on the right tracks.
However, I'm having some weird output when connecting two routers over wifi _and_ over ethernet cable. The setup:
Before plugging in the cable: r1-ath1 <-- wifi --> r2-ath1
root@OpenWrt:~# batctl vd dot digraph { "r1-ath1" -> "r2-ath1" [label="1.32"] "r1-ath1" -> "r1-hna" [label="HNA"] "r1-ath1" -> "5a:2e:1e:1f:64:6b" [label="HNA"] subgraph "cluster_r1-ath1" { "r1-ath1" [peripheries=2] } "r2-ath1" -> "r1-ath1" [label="1.11"] "r2-ath1" -> "r2-hna" [label="HNA"] "r2-ath1" -> "82:31:95:f9:14:6f" [label="HNA"] subgraph "cluster_r2-ath1" { "r2-ath1" [peripheries=2] } }
After plugging in the cable: r1-ath1 <-- wifi --> r2-ath1 + r1-eth0.3 <-- cable --> r2-eth0.3
root@OpenWrt:~# batctl vd dot digraph { "r1-ath1" -> "r2-ath1" [label="1.0"] "r1-ath1" -> "r2-eth0.3" [label="1.66"] "r1-ath1" -> "r1-hna" [label="HNA"] "r1-ath1" -> "5a:2e:1e:1f:64:6b" [label="HNA"] subgraph "cluster_r1-ath1" { "r1-ath1" [peripheries=2] "r1-eth0.3" } subgraph "cluster_r1-ath1" { "r1-ath1" [peripheries=2] } "r2-ath1" -> "r1-ath1" [label="1.0"] "r2-ath1" -> "r1-eth0.3" [label="1.15"] "r2-ath1" -> "r2-hna" [label="HNA"] "r2-ath1" -> "82:31:95:f9:14:6f" [label="HNA"] subgraph "cluster_r2-ath1" { "r2-ath1" [peripheries=2] "r2-eth0.3" } subgraph "cluster_r2-ath1" { "r2-ath1" [peripheries=2] } } root@OpenWrt:~# cat /proc/net/batman-adv/vis_data 06:22:b0:98:87:dd,TQ 04:22:b0:98:87:fa 251, HNA 00:22:b0:98:87:dd, HNA 5a:2e:1e:1f:64:6b, PRIMARY, SEC 04:22:b0:98:87:de, 06:22:b0:98:87:f9,TQ 06:22:b0:98:87:dd 255, TQ 04:22:b0:98:87:de 251, HNA 00:22:b0:98:87:f9, HNA 82:31:95:f9:14:6f, SEC 04:22:b0:98:87:fa, PRIMARY,
Actually, this vis_data to does not map to the dot above! There are the wrong number of HNA, wrong order etc.
Hmm, just noticed, the output also seems to be flapping between those two from time to time: ------------------ root@OpenWrt:~# cat /proc/net/batman-adv/vis 06:22:b0:98:87:dd,TQ 04:22:b0:98:87:fa 251, HNA 00:22:b0:98:87:dd, HNA f6:ae:97:b3:9a:5c, PRIMARY, SEC 04:22:b0:98:87:de, 06:22:b0:98:87:f9,TQ 04:22:b0:98:87:de 251, HNA da:3e:79:2c:d3:3e, HNA 00:22:b0:98:87:f9, PRIMARY, SEC 04:22:b0:98:87:fa, root@OpenWrt:~# cat /proc/net/batman-adv/vis 06:22:b0:98:87:dd,TQ 04:22:b0:98:87:fa 251, HNA 00:22:b0:98:87:dd, HNA f6:ae:97:b3:9a:5c, PRIMARY, SEC 04:22:b0:98:87:de, 06:22:b0:98:87:f9,TQ 06:22:b0:98:87:dd 255, TQ 04:22:b0:98:87:de 251, HNA da:3e:79:2c:d3:3e, HNA 00:22:b0:98:87:f9, SEC 04:22:b0:98:87:fa, PRIMARY, ------------------
Here is what i think your bat-host file contains: 06:22:b0:98:87:dd r1-ath1 06:22:b0:98:87:f9 r2-ath1 00:22:b0:98:87:dd r1-hna 04:22:b0:98:87:de r1-eth0.3 00:22:b0:98:87:f9 r2-hna 04:22:b0:98:87:fa r2-eth0.3
and this is what i get, assuming i got the MAC->name mapping correct:
Yes, correct mapping :).
digraph { "r1-ath1" -> "r2-eth0.3" [label="1.15"] "r1-ath1" -> "r1-hna" [label="HNA"] "r1-ath1" -> "5a:2e:1e:1f:64:6b" [label="HNA"] subgraph "cluster_r1-ath1" { "r1-ath1" [peripheries=2] } subgraph "cluster_r1-ath1" { "r1-ath1" [peripheries=2] "r1-eth0.3" } "r2-ath1" -> "r1-ath1" [label="1.0"] "r2-ath1" -> "r1-eth0.3" [label="1.15"] "r2-ath1" -> "r2-hna" [label="HNA"] "r2-ath1" -> "82:31:95:f9:14:6f" [label="HNA"] subgraph "cluster_r2-ath1" { "r2-ath1" [peripheries=2] "r2-eth0.3" } subgraph "cluster_r2-ath1" { "r2-ath1" [peripheries=2] } }
batctl parses top-to-bottom, left-to-right. It does not consolidate the PRIMARY and the SECONDARY into one cluster. It leaves DOT to do that. Hence there are two cluster statements for each cluster actually drawn.
So the second 'subgraph "cluster_r1-ath1"' is obviously unnecessary.
Yes, unnecessary, but makes the batctl code easier.
Also "r1-ath1" -> "r2-eth0.3" looks wrong, should be
"r1-eth0.3" -> "r2-eth0.3" instead (and the same with r2 a few lines later).
These comments i agree with. A wireless and a wired device should not be neighbours. We don't have any records which originate from the secondary MAC address. That is guess is the major problem here.
So, did my/Mareks patch break it, or was it broken before?
First i suggest you go back to just before Simon's patch which introduced receiving using skbufs:
http://open-mesh.org/changeset/1517
That will tell us if we need to go back further, or our patch broke it.
If you need to go back further, i would suggest just before:
Okay, just checked, this got introduced with 1510 already, yes. I might have a closer look at this next week.
Cheers, Linus
Hello,
i can reproduce the slowpath warning in the trunk within my qemu setup. When i start a server, some of the clients show this slow path warning.
I've tried Andrews patch, and i can not see any memory leaks - it varies in a 100k window. However some of the servers crash after some time (30 minutes) without any special trigger i am aware of. Stack trace is this:
root@OpenWrt:/# ------------[ cut here ]------------ WARNING: at lib/kref.c:43 kref_get+0x18/0x30() Hardware name: Modules linked in: via_velocity via_rhine tg3 sis900 r8169 pcnet32 ne2k_pci 8390 e1000 e100 batman_adv 8139too 3c59x nf_nat_tftp nf_conntrack_tftp nf_nat_irc nf_conntrack_irc nf_nat_ftp nf_conntrack_ftp ipt_MASQUERADE iptable_nat nf_nat xt_NOTRACK iptable_raw xt_state nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack pppoe pppox libphy ipt_REJECT xt_TCPMSS ipt_LOG xt_multiport xt_mac xt_limit iptable_mangle iptable_filter ip_tables xt_tcpudp x_tables ppp_async ppp_generic slhc natsemi crc_ccitt ipv6 Pid: 570, comm: bat_events Not tainted 2.6.31.1 #32 Call Trace: [<c10dd608>] ? kref_get+0x18/0x30 [<c101f55f>] ? warn_slowpath_common+0x7f/0xb0 [<c10dd608>] ? kref_get+0x18/0x30 [<c101f5a3>] ? warn_slowpath_null+0x13/0x20 [<c10dd608>] ? kref_get+0x18/0x30 [<c2ac51a2>] ? proc_vis_read_prim_sec+0x352/0x5a0 [batman_adv] [<c2ac4ee0>] ? proc_vis_read_prim_sec+0x90/0x5a0 [batman_adv] [<c102d23a>] ? worker_thread+0xca/0x150 [<c102fd80>] ? autoremove_wake_function+0x0/0x50 [<c102d170>] ? worker_thread+0x0/0x150 [<c102fbc3>] ? kthread+0x73/0x90 [<c102fb50>] ? kthread+0x0/0x90 [<c1003813>] ? kernel_thread_helper+0x7/0x14 ---[ end trace 4f770856ce7e0712 ]--- ------------[ cut here ]------------ kernel BUG at mm/slub.c:2929! invalid opcode: 0000 [#1] last sysfs file: /sys/kernel/uevent_seqnum Modules linked in: via_velocity via_rhine tg3 sis900 r8169 pcnet32 ne2k_pci 8390 e1000 e100 batman_adv 8139too 3c59x nf_nat_tftp nf_conntrack_tftp nf_nat_irc nf_conntrack_irc nf_nat_ftp nf_conntrack_ftp ipt_MASQUERADE iptable_nat nf_nat xt_NOTRACK iptable_raw xt_state nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack pppoe pppox libphy ipt_REJECT xt_TCPMSS ipt_LOG xt_multiport xt_mac xt_limit iptable_mangle iptable_filter ip_tables xt_tcpudp x_tables ppp_async ppp_generic slhc natsemi crc_ccitt ipv6
Pid: 570, comm: bat_events Tainted: G W (2.6.31.1 #32) EIP: 0060:[<c1064669>] EFLAGS: 00010046 CPU: 0 EIP is at kfree+0x49/0xc0 EAX: 00000000 EBX: c1f5dd94 ECX: 40080000 EDX: c133cba0 ESI: c2ac5540 EDI: c1f5dd80 EBP: 00000216 ESP: c0d06eec DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 Process bat_events (pid: 570, ti=c0d06000 task=c18d0000 task.ti=c0d06000) Stack: 00000002 c1f5dd94 c1f5dd94 c2ac5540 c1f5dda4 c10dd5cf c1f5dda4 00000000 <0> c2ac538b c0d06f68 c0e0504a c1f5dd84 c1f5dd94 c1f5dd80 c1f5dd80 000022a0 <0> 00000046 c0e05024 c0e05030 00000286 00000086 c1f5dda4 0000003f 00000080 Call Trace: [<c2ac5540>] ? vis_init+0x150/0x1b0 [batman_adv] [<c10dd5cf>] ? kref_put+0x4f/0x70 [<c2ac538b>] ? proc_vis_read_prim_sec+0x53b/0x5a0 [batman_adv] [<c2ac4ee0>] ? proc_vis_read_prim_sec+0x90/0x5a0 [batman_adv] [<c102d23a>] ? worker_thread+0xca/0x150 [<c102fd80>] ? autoremove_wake_function+0x0/0x50 [<c102d170>] ? worker_thread+0x0/0x150 [<c102fbc3>] ? kthread+0x73/0x90 [<c102fb50>] ? kthread+0x0/0x90 [<c1003813>] ? kernel_thread_helper+0x7/0x14 Code: 00 40 a1 80 cb 2e c1 c1 ea 0c c1 e2 05 01 c2 8b 0a 89 c8 25 00 80 00 00 66 85 c0 74 05 8b 52 0c 8b 0a 84 c9 78 24 f6 c5 c0 75 09 <0f> 0b 90 8d 74 26 00 eb fe 8b 5c 24 08 89 d0 8b 74 24 0c 8b 7c EIP: [<c1064669>] kfree+0x49/0xc0 SS:ESP 0068:c0d06eec ---[ end trace 4f770856ce7e0713 ]---
Please note that in this moment, i have not queried the vis_data file, so the proc_vis_read_prim_sec call is probably not correctly reported. Furthermore there are some more stack traces in vfs/mount and then the boxes reboot - we probably write at some bad memory position.
I've changed Andrews patch a little bit, and could not crash my 9 kvm instances when running it some hours last night. I've added kref_get() calls when the info packets get referenced from the send_list, this reference is removed with kref_put() in the send_vis_packets() after sending the packet. The disadvantage of this solution is that in a race condition, we might (unnecessarily) send out an old packet. No memory leaks as far as i could see after running some hours.
Please give it a try (patch is attached to this e-mail)!
best regards, Simon
On Thu, Feb 11, 2010 at 11:01:56AM +0100, Andrew Lunn wrote:
On Thu, Feb 11, 2010 at 10:46:59AM +0100, Andrew Lunn wrote:
Hi Linus
Here is a new version of the patch. I've tested it this time using five UML machines. It should not immediately opps now.
Instead is will leak memory and crash after a while...
I will try to find the memory leak.
Andrew
Hi Simon
@@ -503,15 +550,23 @@ unsigned long flags;
spin_lock_irqsave(&vis_hash_lock, flags);
- purge_vis_packets();
- if (generate_vis_packet() == 0)
if (generate_vis_packet() == 0) { /* schedule if generation was successful */
kref_get(&my_vis_info->refcount);
list_add_tail(&my_vis_info->send_list, &send_list);
}
list_for_each_entry_safe(info, temp, &send_list, send_list) {
spin_unlock_irqrestore(&vis_hash_lock, flags);
send_vis_packet(info);
spin_lock_irqsave(&vis_hash_lock, flags);
list_del_init(&info->send_list);
send_vis_packet(info);
} spin_unlock_irqrestore(&vis_hash_lock, flags); start_vis_timer();kref_put(&info->refcount, free_info);
Although you say this works, i just looks wrong. It looks unbalanced. Where is the kref_put for my_vis_info->refcount? There is a kref_put inside what looks like a loop, but no kref_get inside the loop.
Can you explain this is more detail? Can we make it look correctly balanced?
Thanks Andrew
Hey Andrew,
all list-adds and list-removes do a kref_get or kref_put respectively, but that probably was not very clear. As soon as a refence is created (by linking into the hash or a list), we should also kref_get() it.
Please find attached a new version of this patch with separate send_list_add/remove functions which include the put/get functions. I've also added the kref_put/get call to the send_list loop again. It should look more balanced now. Tested in my 9 qemu setup again, no memory leaks or crashes but i only did a very quick test (15 minutes)
best regards, Simon
On Mon, Mar 01, 2010 at 06:59:55AM +0100, Andrew Lunn wrote:
Hi Simon
@@ -503,15 +550,23 @@ unsigned long flags;
spin_lock_irqsave(&vis_hash_lock, flags);
- purge_vis_packets();
- if (generate_vis_packet() == 0)
if (generate_vis_packet() == 0) { /* schedule if generation was successful */
kref_get(&my_vis_info->refcount);
list_add_tail(&my_vis_info->send_list, &send_list);
}
list_for_each_entry_safe(info, temp, &send_list, send_list) {
spin_unlock_irqrestore(&vis_hash_lock, flags);
send_vis_packet(info);
spin_lock_irqsave(&vis_hash_lock, flags);
list_del_init(&info->send_list);
send_vis_packet(info);
} spin_unlock_irqrestore(&vis_hash_lock, flags); start_vis_timer();kref_put(&info->refcount, free_info);
Although you say this works, i just looks wrong. It looks unbalanced. Where is the kref_put for my_vis_info->refcount? There is a kref_put inside what looks like a loop, but no kref_get inside the loop.
Can you explain this is more detail? Can we make it look correctly balanced?
Thanks Andrew
On Mon, Mar 01, 2010 at 05:57:06PM +0100, Simon Wunderlich wrote:
Hey Andrew,
all list-adds and list-removes do a kref_get or kref_put respectively, but that probably was not very clear. As soon as a refence is created (by linking into the hash or a list), we should also kref_get() it.
Please find attached a new version of this patch with separate send_list_add/remove functions which include the put/get functions. I've also added the kref_put/get call to the send_list loop again. It should look more balanced now. Tested in my 9 qemu setup again, no memory leaks or crashes but i only did a very quick test (15 minutes)
Hi Simon
This looks better. Did it survive longer testing?
Andrew
Hello Andrew,
no, sorry, and i won't be able to do it this week as i am at the cebit this whole week (if anyone else is there, please tell me. ;).
I would therefore like to ask you and Linus to test it as well, i will perform more tests on sunday.
Thanks, Simon
On Tue, Mar 02, 2010 at 07:43:08AM +0100, Andrew Lunn wrote:
On Mon, Mar 01, 2010 at 05:57:06PM +0100, Simon Wunderlich wrote:
Hey Andrew,
all list-adds and list-removes do a kref_get or kref_put respectively, but that probably was not very clear. As soon as a refence is created (by linking into the hash or a list), we should also kref_get() it.
Please find attached a new version of this patch with separate send_list_add/remove functions which include the put/get functions. I've also added the kref_put/get call to the send_list loop again. It should look more balanced now. Tested in my 9 qemu setup again, no memory leaks or crashes but i only did a very quick test (15 minutes)
Hi Simon
This looks better. Did it survive longer testing?
Andrew
Hello Simon -
I'll be in hall 13 at the booth of Atcom E42 to present the Mesh-Potato router. (With B.A.T.M.A.N. inside, of course) Will arrive tomorrow afternoon and be there until Saturday 18:00.
Cheers, Elektra
no, sorry, and i won't be able to do it this week as i am at the cebit this whole week (if anyone else is there, please tell me. ;).
Hi Simon and Andrew,
I already tried both Andrew's intial and Simon's latest patch yesterday and had both of them running for about 1h+, also with alternate vis-intervals (1s, 50ms, 30s) but so far I wasn't able to reproduce any crashes or leaks on two Debian stable VMs connected with each other. I'll run Andrew's patch version again this night for some more hours to make sure that I'm able to reproduce the crash at all. Any other hints for reproducing it more quickly welcome :).
Cheers, Linus
PS: With how many nodes and what kind of systems could you reproduce the crash? Were all those nodes connected directly with each other? How many vis-servers were running?
On Tue, Mar 02, 2010 at 10:13:24PM +0100, Simon Wunderlich wrote:
Hello Andrew,
no, sorry, and i won't be able to do it this week as i am at the cebit this whole week (if anyone else is there, please tell me. ;).
I would therefore like to ask you and Linus to test it as well, i will perform more tests on sunday.
Thanks, Simon
On Tue, Mar 02, 2010 at 07:43:08AM +0100, Andrew Lunn wrote:
On Mon, Mar 01, 2010 at 05:57:06PM +0100, Simon Wunderlich wrote:
Hey Andrew,
all list-adds and list-removes do a kref_get or kref_put respectively, but that probably was not very clear. As soon as a refence is created (by linking into the hash or a list), we should also kref_get() it.
Please find attached a new version of this patch with separate send_list_add/remove functions which include the put/get functions. I've also added the kref_put/get call to the send_list loop again. It should look more balanced now. Tested in my 9 qemu setup again, no memory leaks or crashes but i only did a very quick test (15 minutes)
Hi Simon
This looks better. Did it survive longer testing?
Andrew
I've been running 3 qemu x86 instances with Debian stable here (2.6.26 kernel) as well as 3 Atheros ap61 / mips based routers with OpenWRT (rev. 19468, 2.6.30.10 kernel) for 22 hours. None of them crashed with Andrew's patch... so, sorry, still can't reproduce the bug you're talking about.
Cheers, Linus
On Thu, Mar 04, 2010 at 01:26:20AM +0100, Linus L??ssing wrote:
I've been running 3 qemu x86 instances with Debian stable here (2.6.26 kernel) as well as 3 Atheros ap61 / mips based routers with OpenWRT (rev. 19468, 2.6.30.10 kernel) for 22 hours. None of them crashed with Andrew's patch... so, sorry, still can't reproduce the bug you're talking about.
If you cannot reproduce it, its probably fixed :-)
Marek, could you commit the patch.
Thanks Andrew
On Thursday 04 March 2010 16:57:24 Andrew Lunn wrote:
On Thu, Mar 04, 2010 at 01:26:20AM +0100, Linus L??ssing wrote:
I've been running 3 qemu x86 instances with Debian stable here (2.6.26 kernel) as well as 3 Atheros ap61 / mips based routers with OpenWRT (rev. 19468, 2.6.30.10 kernel) for 22 hours. None of them crashed with Andrew's patch... so, sorry, still can't reproduce the bug you're talking about.
If you cannot reproduce it, its probably fixed :-)
Marek, could you commit the patch.
Sure but which one ? I have to admit I lost the overview (there were too many patches flying around) .. :-)
Cheers, Marek
Sure but which one ? I have to admit I lost the overview (there were too many patches flying around) .. :-)
https://lists.open-mesh.org/pipermail/b.a.t.m.a.n/2010-March/002354.html
Andrew
On Thursday 04 March 2010 17:49:41 Andrew Lunn wrote:
Sure but which one ? I have to admit I lost the overview (there were too many patches flying around) .. :-)
https://lists.open-mesh.org/pipermail/b.a.t.m.a.n/2010-March/002354.html
Applied in rev 1579,
Regards, Marek
b.a.t.m.a.n@lists.open-mesh.org