Re: [B.A.T.M.A.N.] batman goes looping...

6 Aug 2009

      On Thursday 06 August 2009 16:44:28 Yang Su wrote:
...

You are right. Under certain circumstance, this approach may lead to less

optimal route selection. Just to understand batman algorithm better, I try
to make an example (please correct me if something's wrong here): There are
two direct neighbors via which a sender communicates with a receiver. Via
the good neighbor, their is a single good route with long delay. Between
the bad neighbor and the receiver there are many parallel bad routes. Each
bad route has high packet loss rate but very short delay. The receiver's
OGMs arrive at the sender first via the bad neighbor. Although there is
substantial OGM loss on each individual bad routes, by aggregation over
those parallel bad routes, the bad neighbor is still able to relay most of
sender's OGM to the receiver. As a result, if the sender already chooses
the bad neighbor as the next hop, it will stick to it for quite long time
and does not switch to the good neighbor.
It is possble to simplify your example more to clearly illustrate the problem. 
Scenario: Host A wants to send data to host B. Consider the beautiful graphic 
I attached. ;-)
The link between A and B is quite asymetric. No packet loss from B to A but 
heavy packet loss the other way round. Now, B periodically sends its OGMs and 
all OGMs will arrive at A directly and via the neighbor whereas the OGMs via 
the neighbor always be slower than the directly received ones. A should choose 
the longer path via the neighbor because it has no packet loss but the 
"fastest sequence number wins" algorithm won't allow that.
...

This approach achieves the similar goal as the seqno-based one: switch to

a neighbor only when this neighbor is better and has more up-to-date
information than the current best route. But this approach is less
constrained and seems not to suffer from the above problem for the
seqno-based approach. I have tested the patch on our test environment. It
works pretty well on the test cases.
Great! I comitted the patch immediately. Thanks again for your thorough 
analysis and ideas. :-)
...

The problem might happen when the neighbor thinks that I am his best

neighbor (his good number is actually from me). In this case, my estimation
of that neighbor's TQ is actually decided by my own TQ (my TQ minus one hop
penalty). When my TQ drops down, the correct behavior should be that my
estimation of that neighbor's TQ also drops down accordingly. However, when
echo cancellation steps in, my estimation of that neighbor's TQ will not be
updated and remains to be the stale value. This might cause problems (e.g.
looping) in corner cases even with the new patch.  Does this analysis make
sense or I miss some details of the echo cancellation?
Unless you found another bug the echo cancellation should not hinder the 
propagation of our own TQ value. Once we (uml2) received a packet we will 
rebroadcast it. Our neighbor (uml1) will receive it (unless the packet got 
lost) and update its routing tables accordingly. Now, he will rebroadcast the 
packet as well which will be killed by the echo cancellation on uml1 as this 
packet does not contain new information. 
But even if the TQ received via uml2 is worse uml1 might still have a good TQ 
from uml3 in its routing table which might get rebroadcasted.
Regards,
Marek

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

Re: [B.A.T.M.A.N.] batman goes looping...