On 12/11/13 18:54, Antonio Quartulli wrote:
Hi Nico,
I have no real clue, but is it possible that there is a loop somewhere? I imagine you have checked already, but I can't come up with anything more useful at the moment.
Ok, I've spent the afternoon turning off semi-inaccessible nodes one by one until I found the one causing the problem.
It's installed on a public lighting post, so it may take a while to take it down for inspection.
I don't know if you guys remember that I brought a crazy node (nicknamed Jocker) to the battlemesh, one that had started misbehaving after a lightning bolt hit nearby. The symptom was the same one I'm observing now: every node in the net would repeatedly log the message "received packet on bat0 with own address as source address".
I was in Europe when this second node started behaving like this, so I still don't know much about the moment it started.
Do you think this could be addressed at the batman level somehow? In a 50-node network this is already quite difficult to diagnose. I can't imagine how a bigger network, where no single person has remote access to every node, would coordinate to isolate the problematic router...
If you are interested in looking at this first hand we can try to set up an isolated test-bed with IPv6 connectivity for you to log in and play around.
Am I the only one who has bumped into this (twice)?
cheers.
On Tue, Nov 12, 2013 at 06:45:40PM -0300, Nicolás Echániz wrote:
Back in Quintana... this problem is still showing on every node. The network is unstable, so it's difficult to debug. If anyone has a clue as to where to look for the origin, I'll be glad to read your thoughts.
cheers, Nico
On 13/10/13 18:34, Nicolás Echániz wrote:
While I'm still in Europe I've observed that the network in Quintana has started performing very poorly today. It was working perfectly fine until yesterday.
The logs on every router have started showing entries like these:
Oct 13 18:09:43 frigorifico kern.warn kernel: [12018.150000] br-lan: received packet on bat0 with own address as source address
Oct 13 18:09:45 frigorifico kern.warn kernel: [12020.040000] br-lan: received packet on bat0 with own address as source address
Oct 13 18:09:45 frigorifico kern.warn kernel: [12020.040000] br-lan: received packet on bat0 with own address as source address
Oct 13 18:09:45 frigorifico kern.warn kernel: [12020.550000] br-lan: received packet on bat0 with own address as source address
Oct 13 18:09:45 frigorifico kern.warn kernel: [12020.550000] br-lan: received packet on bat0 with own address as source address
Oct 13 18:09:45 frigorifico kern.warn kernel: [12020.570000] br-lan: received packet on bat0 with own address as source address
Oct 13 18:09:45 frigorifico kern.warn kernel: [12020.580000] br-lan: received packet on bat0 with own address as source address
Oct 13 18:09:46 frigorifico kern.warn kernel: [12021.040000] br-lan: received packet on bat0 with own address as source address
As you can see there are many per second.
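To put a number on "many per second", something like the following rough Python sketch could be run over a saved copy of a router's log. It only assumes the syslog layout seen in the sample above; the function name and the regular expression are illustrative, not part of batman-adv or batctl, and the embedded sample is just a few of the lines quoted above.

```python
import re
from collections import Counter

# Assumed layout (taken from the sample above):
#   <Mon> <day> <HH:MM:SS> <host> ... received packet on <iface>
#   with own address as source address
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+.*"
    r"received packet on (?P<iface>\S+) with own address as source address"
)


def count_own_address_warnings(lines):
    """Count matching warnings per (host, one-second timestamp)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line.strip())
        if match:
            counts[(match.group("host"), match.group("stamp"))] += 1
    return counts


# A few of the lines quoted above, used only as sample input.
SAMPLE = """\
Oct 13 18:09:43 frigorifico kern.warn kernel: [12018.150000] br-lan: received packet on bat0 with own address as source address
Oct 13 18:09:45 frigorifico kern.warn kernel: [12020.040000] br-lan: received packet on bat0 with own address as source address
Oct 13 18:09:45 frigorifico kern.warn kernel: [12020.550000] br-lan: received packet on bat0 with own address as source address
Oct 13 18:09:46 frigorifico kern.warn kernel: [12021.040000] br-lan: received packet on bat0 with own address as source address
"""

if __name__ == "__main__":
    for (host, stamp), n in sorted(count_own_address_warnings(SAMPLE.splitlines()).items()):
        print(f"{host} {stamp}: {n} warning(s)")
```

Comparing these per-second counts between routers will not point at the faulty node by itself, since every node shows the message, but it gives a simple way to check whether the rate changes when a suspect node is switched off.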
I've pasted a bit of batctl ll batman; batctl log here:
...it's only showing the "originator packet from myself" lines and one line before. (the sample is less than 5 secs of logs)
Every node I checked is showing the same.
Last time this happened it was due to a router that had been affected by a nearby lightning bolt. The switch went crazy. It took a while to detect it, and the network was only 15 nodes big back then. Now it's 40 and we are quite far away :)
If anyone has an idea of how to better test where the problem originates, I'll be glad to hear it. Also, if any batman developer wishes to log in to the net to check first hand, just let me know.
Cheers! Nico
PS: batman version is 2012.4