Hi all,
I'm running batman-adv-kernelland r1102 on linux-2.6.21.5. I've found I can fairly reliably deadlock the kernel by disabling and re-enabling batman a few times, e.g.
$ echo ath0 > /proc/net/batman-adv/interfaces
wait a while...
$ echo > /proc/net/batman-adv/interfaces
wait a while and repeat.
It appears that this is deadlocking the kernel somehow, as syscalls never return, e.g. 'ps' prints the header and then freezes.
Our interface configuration system brings interfaces up and down on reconfigure, so we'd like the batman interface to be able to withstand this use case. I will try to start looking through the code, but I thought I'd throw the problem out there to see what others more experienced with it think.
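For anyone who wants to script the reproduction, here is a minimal sketch of the enable/disable cycle in Python. The function name, the cycle count and the delay are placeholders made up for illustration; only the proc path and the interface name come from the commands above.

```python
import time


def toggle_interface(proc_path, iface, cycles, delay_s):
    """Repeatedly enable and disable batman on one interface.

    Hypothetical helper: mirrors `echo IFACE > proc_path` followed by
    `echo > proc_path`, sleeping delay_s seconds after each write.
    """
    for _ in range(cycles):
        # Same bytes as `echo ath0 > ...`: the name plus a newline.
        with open(proc_path, "w") as f:
            f.write(iface + "\n")
        time.sleep(delay_s)
        # Same bytes as `echo > ...`: a lone newline, which disables batman.
        with open(proc_path, "w") as f:
            f.write("\n")
        time.sleep(delay_s)
```

On a test box this would be called as root with something like toggle_interface("/proc/net/batman-adv/interfaces", "ath0", cycles=5, delay_s=10). The values 5 and 10 are guesses; the report only says "a few times" and "wait a while".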
Thanks all,
--
Scott Raynel
WAND Network Research Group
Department of Computer Science
University of Waikato
New Zealand
Hello Scott,
Thanks for the report. I could reproduce the bug (I had tried something similar myself before, but without waiting between the commands). When I reproduce it, it is not the syscall but the module itself that locks up. :(
I'll look into this. Please keep me informed, and I'll do the same. :)
regards, Simon
On Fri, Jul 25, 2008 at 03:14:44PM +1200, Scott Raynel wrote:
B.A.T.M.A.N mailing list B.A.T.M.A.N@open-mesh.net https://list.open-mesh.net/mm/listinfo/b.a.t.m.a.n
Hi,
I'm studying how BATMAN Advanced (the kernel module) works. I'm interested in the behavior of BATMAN and the reaction time of the protocol when a node leaves the range of one node and enters the range of another. In a simulation with 3 hosts (one mobile node and 2 static nodes), I found that BATMAN takes about 8 seconds to find a new route when the mobile node passes from one static node to the other. But the OGM interval is 1 s and the sliding window is 64, so why does it not take 64 seconds (or 32) to find another route?
Is there an equation that defines the time lost while finding a new route?
I have an originator interval of 1 second (the default).
Thank you
Paolo
Hi Paolo,
BATMAN IV in batman-advanced uses a TQ_GLOBAL_WINDOW_SIZE of 10. When a packet is received, the TQ value transmitted in the packet is weighted and stored. For every neighbour that did not forward an OGM with this sequence number, the corresponding entry in its sliding window is set to zero. So the old neighbour's sliding window fills up with zeros, while the new neighbour's sliding window gains more and more (weighted) TQ values. Once the average of the new neighbour's sliding window is higher than that of the old neighbour's, the route will switch.
So it should change after a maximum of 10 seconds. The new neighbour will start off with bad TQ values: it has not answered many OGMs yet, so the asymmetric penalty reduces the stored TQ value a lot. This is probably the reason why it does not switch in 5 seconds, but takes a bit longer.
If you want to read the source code, look at isBidirectionalNeigh() and update_orig(). This is the theory; I can't guarantee there are no bugs. ;)
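The window behaviour described above can be played through with a small toy model in Python. This is only a sketch of the description in this mail, not the module's code: the function name, the maximum TQ of 255 and the ramp of TQ values are assumptions made for illustration.

```python
from collections import deque

TQ_GLOBAL_WINDOW_SIZE = 10  # window size quoted in this mail
TQ_MAX = 255                # assumed maximum TQ value for this sketch


def seconds_until_switch(new_tq_values, window_size=TQ_GLOBAL_WINDOW_SIZE):
    """Count OGM intervals until the new neighbour's window average
    exceeds the old neighbour's; return None if it never does."""
    # Old neighbour starts with a full window, new neighbour with an empty one.
    old = deque([TQ_MAX] * window_size, maxlen=window_size)
    new = deque([0] * window_size, maxlen=window_size)
    for tick, tq in enumerate(new_tq_values, start=1):
        old.append(0)    # old neighbour no longer forwards the OGM
        new.append(tq)   # new neighbour stores its (weighted) TQ value
        # Both windows have the same length, so comparing sums
        # is the same as comparing averages.
        if sum(new) > sum(old):
            return tick
    return None


# Ideal case: the new neighbour reports full TQ from the first OGM.
ideal = seconds_until_switch([TQ_MAX] * 20)

# Penalised case: a made-up ramp standing in for the asymmetric penalty.
ramp = [min(TQ_MAX, 50 * i) for i in range(1, 21)]
penalised = seconds_until_switch(ramp)
```

With a 1 second originator interval, each tick is one second. In this model the ideal case switches at tick 6 and the made-up ramp at tick 7, which matches the "a bit longer than 5 seconds" reasoning, but the real numbers depend on how the module weights TQ values.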
regards, Simon
On Wed, Jul 30, 2008 at 05:09:56PM +0200, cipollone wrote:
Hello Scott,
Please have a look at rev 1104; it should fix some synchronisation problems. At least the problem I could reproduce when unloading is gone now. Please tell me if this solves your problem, too. :)
regards, Simon