Hi Marek
I finally had time to dig into our problems with loops in our chain.
Some background for the list. Yang and I have been using User Mode
Linux (UML) to build a test network for batman advanced. We connect a
number of uml machines together using a modified version of
uml_switch. The modifications allow us to change the packet drop
probability between any two nodes. We have been testing using simple
chains as shown in the attached gif. The black lines show the
currently used links. The red lines are other links which are
currently not used by batman. The black links have a packet drop
probability of 0% and the red links 20%.
Our test was to remove uml5 from the network and see how long
batman-adv took to re-route around it. We ping from uml4 to uml6 and
from uml1 to uml9.
We found that uml4->uml6 would recover in around 14 seconds. However
uml1->uml9 took much longer, 65 seconds.
Looking at the routing, we found it went into loops. When sending from
uml1 to uml9, uml1 routes to uml2, uml2 routes back to uml1.
Here are the logs from uml2. I've cut out most of the packets, just
showing OGMs from uml9. There is a simple relationship between the MAC
address and the uml number:
fe:fe:00:00:01:01 - uml1
fe:fe:00:00:02:01 - uml2
fe:fe:00:00:03:01 - uml3
etc...
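For anyone reading the logs below, a small helper along these lines can translate the MAC addresses back into uml names. It is only a sketch: the function name is made up, and it assumes the fifth octet is the uml number (parsed as hex, which makes no difference for uml1 to uml9).

```python
def uml_name(mac: str) -> str:
    """Map a test-network MAC such as fe:fe:00:00:03:01 to 'uml3'."""
    octets = mac.lower().split(":")
    # Only addresses following the fe:fe:00:00:NN:01 pattern are accepted.
    if len(octets) != 6 or octets[:4] != ["fe", "fe", "00", "00"]:
        raise ValueError(f"not a test-network MAC: {mac}")
    # Assumption: the fifth octet carries the uml number.
    return f"uml{int(octets[4], 16)}"


if __name__ == "__main__":
    for mac in ("fe:fe:00:00:01:01", "fe:fe:00:00:09:01"):
        print(mac, "->", uml_name(mac))
```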
[ 42949558] Received BATMAN packet via NB: fe:fe:00:00:03:01, IF: eth1 [fe:fe:00:00:02:01] (from OG: fe:fe:00:00:09:01, via old OG: fe:fe:00:00:04:01, seqno 146, tq 218, TTL 44, V 7, IDF 0)
[ 42949558] bidirectional: orig = fe:fe:00:00:09:01 neigh = fe:fe:00:00:03:01 => own_bcast = 64, real recv = 64, local tq: 255, asym_penalty: 255, total tq: 218
[ 42949558] update_originator(): Searching and updating originator entry of received packet
[ 42949558] Updating existing last-hop neighbour of originator
[ 42949558] Drop packet: duplicate packet received
This has been received from uml3, originally from uml4. The TQ is 218
to uml9 via uml3.
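As a cross-check on the "bidirectional" log line, here is a rough sketch of how the logged fields appear to combine into the total TQ. The formulas, the window size of 64 and the function names are assumptions on my part, chosen because they reproduce the logged numbers; I have not verified them against the batman-adv source.

```python
TQ_MAX = 255   # maximum TQ value seen in the logs
WINDOW = 64    # assumed size of the local sliding window


def local_tq(own_bcast: int, real_recv: int) -> int:
    """Share of our own OGMs echoed back, scaled to 0..TQ_MAX."""
    if real_recv == 0:
        return 0
    return TQ_MAX * min(own_bcast, real_recv) // real_recv


def asym_penalty(real_recv: int) -> int:
    """Penalty factor that shrinks as more of the neighbour's OGMs are missed."""
    missed = WINDOW - real_recv
    return TQ_MAX - (TQ_MAX * missed ** 3) // WINDOW ** 3


def total_tq(received_tq: int, own_bcast: int, real_recv: int) -> int:
    """Scale the TQ carried in the OGM by the local link estimates."""
    scaled = received_tq * local_tq(own_bcast, real_recv) * asym_penalty(real_recv)
    return scaled // (TQ_MAX * TQ_MAX)


if __name__ == "__main__":
    # Values from the log line above: own_bcast = 64, real recv = 64, tq 218
    print(local_tq(64, 64), asym_penalty(64), total_tq(218, 64, 64))
```

With a perfect local link (64 of 64 packets in both directions) both factors are 255, so the received TQ passes through unchanged, which matches "total tq: 218" in the log.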
[ 42949559] Received BATMAN packet via NB: fe:fe:00:00:01:01, IF: eth1 [fe:fe:00:00:02:01] (from OG: fe:fe:00:00:09:01, via old OG: fe:fe:00:00:03:01, seqno 146, tq 209, TTL 42, V 7, IDF 0)
[ 42949559] bidirectional: orig = fe:fe:00:00:09:01 neigh = fe:fe:00:00:01:01 => own_bcast = 64, real recv = 64, local tq: 255, asym_penalty: 255, total tq: 209
[ 42949559] update_originator(): Searching and updating originator entry of received packet
[ 42949559] Updating existing last-hop neighbour of originator
[ 42949559] Drop packet: duplicate packet received
This is where it starts to get interesting. This is from uml1,
originally from uml3. So it has jumped over uml2, using the 20% packet
drop link which exists between uml1 and uml3. Because this is not an
echo, uml2 processes it, and now knows that it can get to uml9 via
uml1 with a TQ of 209.
[ 42949559] Sending own packet (originator fe:fe:00:00:02:01, seqno 155, TQ 255, TTL 50, IDF off) on interface eth1 [fe:fe:00:00:02:01]
[ 42949559] Forwarding aggregated packet (originator fe:fe:00:00:06:01, seqno 152, TQ 232, TTL 46, IDF off) on interface eth1 [fe:fe:00:00:02:01]
[ 42949559] Forwarding aggregated packet (originator fe:fe:00:00:09:01, seqno 146, TQ 215, TTL 43, IDF off) on interface eth1 [fe:fe:00:00:02:01]
[ 42949559] Forwarding packet (originator fe:fe:00:00:01:01, seqno 156, TQ 250, TTL 49, IDF on) on interface eth1 [fe:fe:00:00:02:01]
[ 42949560] Received BATMAN packet via NB: fe:fe:00:00:03:01, IF: eth1 [fe:fe:00:00:02:01] (from OG: fe:fe:00:00:09:01, via old OG: fe:fe:00:00:04:01, seqno 148, tq 150, TTL 45, V 7, IDF 0)
[ 42949560] updating last_seqno: old 146, new 148
[ 42949560] bidirectional: orig = fe:fe:00:00:09:01 neigh = fe:fe:00:00:03:01 => own_bcast = 64, real recv = 64, local tq: 255, asym_penalty: 255, total tq: 150
[ 42949560] update_originator(): Searching and updating originator entry of received packet
[ 42949560] Updating existing last-hop neighbour of originator
[ 42949560] Changing route towards: fe:fe:00:00:09:01 (now via fe:fe:00:00:01:01 - was via fe:fe:00:00:03:01)
[ 42949560] Forwarding packet: rebroadcast originator packet
[ 42949560] Forwarding packet: tq_orig: 150, tq_avg: 209, tq_forw: 204, ttl_orig: 44, ttl_forw: 255
Now things go non-optimal :-(
This is from uml3, originally from uml4. The TQ value has dropped to
150. This will be when we have removed uml5, so the TQ naturally does
drop.
The TQ value via uml3 is now less than the TQ value via uml1, so uml2
changes its route to go via uml1.
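The route change and the forwarded TQ in the last log line can be reproduced with a few lines of arithmetic. This is a sketch only: the hop penalty of 5 and the scaling formula are assumptions, picked because they match the logged values (tq_avg 209 gives tq_forw 204, and 255 gives 250). The value 150 for uml3 is the last received TQ, used here as a stand-in for the averaged value, which the log does not show.

```python
TQ_MAX = 255
HOP_PENALTY = 5  # assumed value; it reproduces the logged tq_forw numbers


def forwarded_tq(tq_avg: int) -> int:
    """TQ written into a rebroadcast OGM after applying the hop penalty."""
    return tq_avg * (TQ_MAX - HOP_PENALTY) // TQ_MAX


def best_next_hop(tq_by_neigh: dict) -> str:
    """Pick the neighbour with the highest TQ towards the originator."""
    return max(tq_by_neigh, key=tq_by_neigh.get)


if __name__ == "__main__":
    # uml2's view of uml9 after uml5 has been removed
    towards_uml9 = {"uml3": 150, "uml1": 209}
    router = best_next_hop(towards_uml9)
    print("route via", router, "forwarding TQ", forwarded_tq(towards_uml9[router]))
```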
Looking at the logs of uml1, uml1 is always routing to uml9 via uml2.
The problem here, I think, is to do with the asymmetric link algorithm.
When sending out an OGM, the node uses the TQ of its best link to the
originator, not the link the OGM came in on. If the OGM from uml1,
originally from uml3, reported the TQ via that route, the TQ would
very likely be lower. uml2 would then not have chosen to swap to
uml1. However, uml1 reports its best route, which is via uml2. uml2
does not know this, decides to use uml1, and we have a loop.
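To make the loop concrete, here is a toy trace of the next-hop tables before and after uml2 switches. It is an illustration only: the tables are hand-written from the logs, the hops beyond uml3 are collapsed into a single entry, and the function name is made up.

```python
def trace(next_hop: dict, src: str, dst: str, max_hops: int = 16):
    """Follow next hops from src towards dst; return (path, loop_detected)."""
    path = [src]
    node = src
    while node != dst and len(path) <= max_hops:
        node = next_hop.get(node)
        if node is None:          # no route known at this node
            break
        if node in path:          # we have been here before: routing loop
            return path + [node], True
        path.append(node)
    return path, False


if __name__ == "__main__":
    # Before uml5 is removed (rest of the chain collapsed into uml3 -> uml9)
    before = {"uml1": "uml2", "uml2": "uml3", "uml3": "uml9"}
    # After uml2 switches to uml1, while uml1 still routes via uml2
    after = {"uml1": "uml2", "uml2": "uml1", "uml3": "uml9"}
    print(trace(before, "uml1", "uml9"))
    print(trace(after, "uml1", "uml9"))
```

In the second table a packet from uml1 bounces between uml1 and uml2 until one of them changes its route again.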
Does this all hang together correctly? Am I interpreting this all
right...
How would you suggest fixing this?
Thanks
Andrew
Hi
I've read the readme on how to set up batman advanced and also the openwrt
howto.
It looks great, and it looks like the way to go, as roaming seems to be
handled correctly.
I have some questions before I try to set it up.
Could it work on "standard" Broadcom hardware like the WRT54GL?
Do I need multi-SSID hardware? Only Atheros?
What about the ahdemo mode? Is it like adhoc?
If I want standard AP mode, must I create several SSIDs (1 ahdemo + 1 AP)
and then bridge bat0 to the AP SSID?
If I want to set up coova-chilli to drive the mesh (hotspot mesh), how could
I ssh to the repeaters? Is some sort of VLAN supported? There would be 2
separate networks in the mesh, one for gateway to repeater communication,
the other for the hotspot mesh.
Or would it be possible for the repeaters to use a specific DHCP server (not
chilli) to get their IP addresses?
It looks like we can get the MAC addresses of all repeaters. Can I use ARP
to get their IPs?
You would say "why ssh to the repeaters?", I don't know... Maybe for
configuration/ monitoring...
Thanks
Hi,
I have installed kamikaze, xwrt, webif, batman webif - all ok.
I'm having difficulty understanding how I 'link' batman to the wireless interface. I have removed all the VLAN stuff, as I think this is the first step.
If I ssh in and run ifconfig I can see eth0.0 (LAN ports), eth0.1 (seems to be the internet WAN port) and also wl0 with the config I set for it via the webif.
If I go to the webif/mesh tab, the only interfaces I can select are eth0.0 and eth0.1; there is no mention of the wireless interface.
Second point: the Mesh tab indicates that no batmand is running. Presumably it will auto start (from this page) once I have the interface set properly?
Sorry for the basic questions. I have read the basic howtos but am not really sure how the command line relates to the batman webif.
Thanks very much
nick
Hi
I would like to know where I can find information about the data that
the VIS server gives. More precisely, the label on each link and its
relation to the link quality that B.A.T.M.A.N. gives.
Also, what are the general criteria for deciding from which number in
the output a link is considered good, regular, bad or "impossible to
use"?
An example:
digraph topology
{
"5.174.37.225" -> "5.224.160.202"[label="2.13"]
"5.174.117.226" -> "5.174.37.225"[label="5.00"]
"5.174.117.226" -> "0.0.0.0/0.0.0.0"[label="HNA"]
"5.224.160.202" -> "5.174.37.225"[label="1.28"]
"5.224.160.202" -> "0.0.0.0/0.0.0.0"[label="HNA"]
}
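For anyone who wants to work with this output in a script, a minimal sketch for pulling the edges and labels out of the dot text could look like this. The function name and the regular expression are my own, and the code makes no assumption about what the numeric labels mean; it only separates them from the "HNA" entries.

```python
import re

# Matches lines such as: "a.b.c.d" -> "e.f.g.h"[label="2.13"]
EDGE = re.compile(r'"([^"]+)"\s*->\s*"([^"]+)"\s*\[label="([^"]+)"\]')


def parse_vis(dot_text: str):
    """Return (links, hna): links carry a numeric label, hna entries do not."""
    links, hna = [], []
    for src, dst, label in EDGE.findall(dot_text):
        if label == "HNA":
            hna.append((src, dst))
        else:
            links.append((src, dst, float(label)))
    return links, hna


if __name__ == "__main__":
    sample = '''digraph topology
{
"5.174.37.225" -> "5.224.160.202"[label="2.13"]
"5.174.117.226" -> "0.0.0.0/0.0.0.0"[label="HNA"]
}'''
    links, hna = parse_vis(sample)
    print(links)
    print(hna)
```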
Thanks in advance.