Re: [B.A.T.M.A.N.] Memory Leak-batmand-exp rev1003

18 Mar 2008


      Hi,
On Sunday 16 March 2008, Freifunk Dresden wrote:
...
Hi,
I have found that batmand-experimental (rev1003) still consumes more
and more memory if you specify the -r1 or -r2 option. I haven't
tried -r3.
If I turn off the gateway with option -r0, the memory consumption is constant.
The memory seems to increase in steps of 16 kbyte, but the time at which
this happens varies.
The syslog does not show any hint, apart from repeated attempts to name the
bat0 interface (about every two minutes):
Mar 16 14:50:49 (none) kern.err bmxd[2133]: Startup parameters:
/sbin/batmand -s 10.12.0.1 -a 10.12.10.16/28 -r 1 --t 63
--no-unreachable-rule --no-throw-rules --no-prio-rules
--one-way-tunnel 1 --two-way-tunnel 0 eth1 tbb /t 1 /i /A
Mar 16 14:52:27 (none) kern.err bmxd[2398]: Trying to name tunnel to bat0 ...
Mar 16 14:52:27 (none) kern.err bmxd[2398]: success!
Mar 16 14:54:04 (none) kern.err bmxd[2601]: Trying to name tunnel to bat0 ...
Mar 16 14:54:04 (none) kern.err bmxd[2601]: success!
Can you give more information about the scenario that might cause the problem
and send a debug-level 3 log of it? Maybe I will get an idea then.
...
Both interfaces, eth1 and tbb, are active and connected.
The WRT54 that has the option -r0 set has a constant memory consumption.
During this test I got a very strange message on the WRT that only has the
eth1 interface. There is no other router with the same IP address.
Mar 16 14:52:21 (none) kern.err bmxd[21119]: Drop packet: DAD alert!
OGM from 10.12.10.17 via NB 10.12.10.17 with out of range seqno! rcvd
sqno 28347, last valid seqno: 23290 at 6096067!               Maybe
two nodes are using this IP!? Waiting 0 more seconds before r
This DAD (duplicate address detection) message indicates a _potential_
duplicate address usage. It is triggered by receiving an invalid (or
out-of-range) sequence number from the same IP address. When a daemon starts,
it randomizes an initial sequence number and then increments it by one with
each newly emitted OGM (by default with an originator interval of one
second). A receiving node can therefore detect abnormal sequences. E.g. after
receiving sequence number n from node A, the next sequence number should be
around n+1. Ten seconds after receiving sequence number n+1, a sequence
number around n+11 would be expected.
This works even if some OGMs (and their sequence numbers) got lost. If a
sequence number of n+5000 is received instead of n+11, then something is
strange. Either two nodes are using the same IP and started with two
different initial sequence numbers, or only one node is using this IP
address but the daemon on that node has been restarted. In that case a new
initial sequence number is randomized and other nodes are temporarily confused.
The latter is probably the scenario indicated by your syslog.
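
To make the range check concrete, here is a minimal sketch in Python. It is not the daemon's code: the 16-bit wrap-around and the tolerance of 100 are assumptions chosen for illustration, and the function name is hypothetical. It only looks forward from the last valid sequence number, so a real implementation would probably also tolerate slightly older numbers.

```python
SEQNO_MODULO = 1 << 16   # assumption: 16-bit sequence counter that wraps around
MAX_FORWARD_JUMP = 100   # hypothetical tolerance, not the daemon's real threshold


def seqno_in_range(last_valid: int, received: int,
                   max_jump: int = MAX_FORWARD_JUMP) -> bool:
    """Return True if `received` is at most `max_jump` ahead of `last_valid`.

    The distance is computed modulo the counter size so that a counter
    wrapping from 65535 back to 0 still counts as a small forward step.
    """
    forward_distance = (received - last_valid) % SEQNO_MODULO
    return forward_distance <= max_jump


# Sequence numbers taken from the syslog line quoted above.
print(seqno_in_range(23290, 23301))   # a few OGMs lost: still plausible
print(seqno_in_range(23290, 28347))   # jump of 5057: triggers the alert
```

With these assumed values the first call reports an in-range number and the second an out-of-range one, which matches the n+11 versus n+5000 example.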
It logs:
 - that an out-of-range sequence number has been detected
 - that the related OGMs will be temporarily ignored
 - and for how long it will continue to ignore these strange sequence numbers.
If no more in-range sequence numbers arrive for a certain time the data set 
of node A is reinitialized and subsequent OGMs from node A will be accepted.
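
The drop-then-reinitialize behaviour described above could be sketched roughly as follows. This is an illustration only, not the daemon's implementation: the class and function names, the 30 second hold-off, the tolerance of 100 and the 16-bit counter are all assumptions.

```python
from dataclasses import dataclass

SEQNO_MODULO = 1 << 16      # assumption: 16-bit sequence counter
MAX_FORWARD_JUMP = 100      # hypothetical in-range tolerance
HOLD_OFF_SECONDS = 30.0     # hypothetical time to keep ignoring strange seqnos


@dataclass
class OriginatorState:
    """Per-originator data a receiving node keeps (hypothetical layout)."""
    last_valid_seqno: int
    last_valid_time: float


def handle_ogm(state: OriginatorState, seqno: int, now: float) -> str:
    """Return 'accept', 'drop' or 'reinit' for one received OGM."""
    distance = (seqno - state.last_valid_seqno) % SEQNO_MODULO
    if distance <= MAX_FORWARD_JUMP:
        # In-range sequence number: normal processing.
        state.last_valid_seqno = seqno
        state.last_valid_time = now
        return "accept"
    if now - state.last_valid_time >= HOLD_OFF_SECONDS:
        # No in-range seqno for the whole hold-off period:
        # reinitialize the data set and follow the new counter.
        state.last_valid_seqno = seqno
        state.last_valid_time = now
        return "reinit"
    # Out of range and still inside the hold-off period: ignore the OGM.
    return "drop"


node_a = OriginatorState(last_valid_seqno=23290, last_valid_time=0.0)
print(handle_ogm(node_a, 28347, now=1.0))    # restarted node, still ignored
print(handle_ogm(node_a, 28377, now=31.0))   # hold-off expired
print(handle_ogm(node_a, 28378, now=32.0))   # new counter is followed
```

Under these assumptions a restarted node is ignored only until the hold-off expires, after which its new counter is followed as usual.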
If you look at the debug-output of 10.12.10.17 it should indicate that this 
node has been restarted and thereby changed its sequence number 
counter from 23290 to 28347.
Hope the explanation helped a bit.
regards,
axel
...
Setup:
laptop-tbb-----------tbb[10.12.0.17]eth1--------eth1[10.12.10.1]
     +------------tbb[10.12.10.17]eth1---------+


/stephan

B.A.T.M.A.N mailing list
B.A.T.M.A.N@open-mesh.net
https://list.open-mesh.net/mm/listinfo/b.a.t.m.a.n
