Hi,
My name is NicoEchániz, this is my first post to the list.
********** INTRO, please skip it if you find it long to read **********
I've been doing free-network stuff for a while (around 10 years). I started in Buenos Aires, Argentina, where I built one of the first nodes for BuenosAiresLibre.org
I was the coordinator of the first JRRL (Free Networks Regional Meeting), where Ramón Roca (from guifi.net) was our guest keynote and members from Latin American networks intentionally gathered for the first time.
I moved last year to a small town in another province, where I'm experimenting with a completely different network. BuenosAiresLibre (a.k.a. BAL) ran in infrastructure mode and used OLSR for routing.
In this town (Quintana), we are now building an ad-hoc mesh and as you must have guessed we're using batman-adv for routing.
Some friends from BAL came along and we worked for a couple of weeks on this experiment.
You may find information here (in Spanish, but Google Translate might help): http://wiki.arraigodigital.org.ar/RedLibre/QuintanaCamp/Documentaci%C3%B3n
And some interesting pictures here: http://www.lavecindaria.org.ar/category/quintanacamp/
QuintanaLibre.org is the first experimental implementation of a cheap ad-hoc mesh network design that we intend to re-use in hundreds of small towns around the country, in collaboration with the national Ministry of Education, through the Arraigo Digital plan. Some info on that can be found here: http://www.arraigodigital.org.ar
***********************************************************************
Well... after this not-so-brief introduction, here's what I'd like to ask about.
We have been experimenting with different setups for our nodes. Some of them are regular TP-Link MR-3220 (thanks Elektra for the recommendation) with a PoE modification; others have a USB Wi-Fi adapter added (TP-Link WN722N), and there's a third kind where we connected a Ubiquiti BulletM2 to an MR-3220.
This third kind of setup is the one that has been giving us trouble.
There are a total of 5 nodes running in the test network; the longest inter-node distance is 1.5 km.
In the example below, the MR is called "marisa-mr" and the BulletM2 is called "marisa-blt" (this is the only node with this setup so far). They are connected by both ethernet and ad-hoc wireless, and they share the same tower space.
The network seemed to work quite well, but from time to time, pings would sort of fall into a hole for a while... so we started looking at traceroutes and this is what we found.
The traceroute is run from a notebook (with batman active on eth0) connected to the node called "nogal", tracing the route to czuk_wlan1 (the other end of the net).
Running the command several times, from time to time we get this sort of output:
$ sudo batctl tr czuk_wlan1
traceroute to czuk_wlan1 (f8:d1:11:0b:76:4b), 50 hops max, ...
 1: nogal_wlan0 (00:15:6d:d6:24:7a) 0.395 ms 0.214 ms 0.416 ms
 2: marisa-mr_wlan0 (54:e6:fc:b9:cb:38) 1.702 ms 3.291 ms 1.694 ms
 3: cisterna_wlan0 (54:e6:fc:b9:be:e8) 25.282 ms * *
 4: marisa-mr_wlan0 (54:e6:fc:b9:cb:38) 2.694 ms 1.722 ms 5.462 ms
 5: marisa-blt_eth0 (00:15:6d:3f:2c:4f) 1.758 ms 1.584 ms 5.409 ms
 6: marisa-mr_wlan0 (54:e6:fc:b9:cb:38) 1.335 ms 4.584 ms 3.111 ms
...
47: marisa-blt_eth0 (00:15:6d:3f:2c:4f) 2.090 ms 2.180 ms 2.138 ms
48: marisa-mr_wlan0 (54:e6:fc:b9:cb:38) 4.738 ms 2.898 ms 2.735 ms
49: marisa-blt_eth0 (00:15:6d:3f:2c:4f) 2.325 ms 2.269 ms 2.249 ms

$ sudo batctl tr czuk_wlan1
traceroute to czuk_wlan1 (f8:d1:11:0b:76:4b), 50 hops max, ...
 1: nogal_wlan0 (00:15:6d:d6:24:7a) 0.541 ms 0.193 ms 0.188 ms
 2: marisa-mr_wlan0 (54:e6:fc:b9:cb:38) 1.347 ms 1.221 ms 1.215 ms
 3: marisa-blt_eth0 (00:15:6d:3f:2c:4f) 1.281 ms 5.103 ms 1.255 ms
 4: marisa-mr_wlan0 (54:e6:fc:b9:cb:38) 5.944 ms 1.705 ms 2.238 ms
...
33: marisa-blt_eth0 (00:15:6d:3f:2c:4f) 1.920 ms 1.890 ms 1.815 ms
34: marisa-mr_wlan0 (54:e6:fc:b9:cb:38) 2.859 ms 1.778 ms 4.499 ms
35: * * * *
36: czuk_wlan1 (f8:d1:11:0b:76:4b) 299.941 ms 37.911 ms *

and this is what a correct traceroute looks like:
$ sudo batctl tr czuk_wlan1
traceroute to czuk_wlan1 (f8:d1:11:0b:76:4b), 50 hops max, 20 byte packets
 1: nogal_wlan0 (00:15:6d:d6:24:7a) 0.292 ms 0.211 ms 0.209 ms
 2: marisa-mr_wlan0 (54:e6:fc:b9:cb:38) 1.541 ms 1.407 ms 2.508 ms
 3: marisa-blt_eth0 (00:15:6d:3f:2c:4f) 1.464 ms 1.593 ms 1.466 ms
 4: cisterna_wlan0 (54:e6:fc:b9:be:e8) 5.275 ms 18.669 ms 4.106 ms
 5: czuk_wlan1 (f8:d1:11:0b:76:4b) 3.505 ms 5.681 ms 5.198 ms

or this one, when the chosen route skips the cisterna node:

$ sudo batctl tr czuk_wlan1
traceroute to czuk_wlan1 (f8:d1:11:0b:76:4b), 50 hops max, ...
 1: nogal_wlan0 (00:15:6d:d6:24:7a) 0.293 ms 0.184 ms 1.990 ms
 2: marisa-mr_wlan0 (54:e6:fc:b9:cb:38) 1.638 ms 1.687 ms 1.442 ms
 3: marisa-blt_eth0 (00:15:6d:3f:2c:4f) 1.635 ms 1.334 ms 1.463 ms
 4: czuk_wlan1 (f8:d1:11:0b:76:4b) 8.783 ms 3.844 ms 3.704 ms

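Since the loops only show up from time to time, it may help to flag them automatically instead of reading each trace by eye. Below is a minimal sketch, assuming the hop lines keep the "N: name (mac) ..." format shown in these traces. The helper names parse_hops and find_loops are made up for illustration and are not part of batctl; the sample text is copied from the start of the first looping trace.

```python
import re
from collections import Counter

# Matches hop lines such as "2: marisa-mr_wlan0 (54:e6:fc:b9:cb:38) 1.702 ms ..."
# Timed-out hops ("35: * * * *") and the header line do not match and are skipped.
HOP_RE = re.compile(
    r"^\s*(\d+):\s+(\S+)\s+\(((?:[0-9a-f]{2}:){5}[0-9a-f]{2})\)", re.I
)


def parse_hops(output):
    """Return (hop_number, name, mac) for every hop that answered."""
    hops = []
    for line in output.splitlines():
        m = HOP_RE.match(line)
        if m:
            hops.append((int(m.group(1)), m.group(2), m.group(3).lower()))
    return hops


def find_loops(hops):
    """Return names of nodes whose MAC shows up at more than one hop."""
    counts = Counter(mac for _, _, mac in hops)
    repeated = []
    for _, name, mac in hops:
        if counts[mac] > 1 and name not in repeated:
            repeated.append(name)
    return repeated


SAMPLE = """\
traceroute to czuk_wlan1 (f8:d1:11:0b:76:4b), 50 hops max, ...
 1: nogal_wlan0 (00:15:6d:d6:24:7a) 0.395 ms 0.214 ms 0.416 ms
 2: marisa-mr_wlan0 (54:e6:fc:b9:cb:38) 1.702 ms 3.291 ms 1.694 ms
 3: cisterna_wlan0 (54:e6:fc:b9:be:e8) 25.282 ms * *
 4: marisa-mr_wlan0 (54:e6:fc:b9:cb:38) 2.694 ms 1.722 ms 5.462 ms
 5: marisa-blt_eth0 (00:15:6d:3f:2c:4f) 1.758 ms 1.584 ms 5.409 ms
 6: marisa-mr_wlan0 (54:e6:fc:b9:cb:38) 1.335 ms 4.584 ms 3.111 ms
"""

print(find_loops(parse_hops(SAMPLE)))  # ['marisa-mr_wlan0']
```

Saving the output of repeated runs to files and feeding each one through find_loops would give a rough count of how often a trace loops; an empty list means no node was visited twice.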
this happens with the nodes configured according to: http://www.open-mesh.org/wiki/batman-adv/Bridge-loop-avoidance
or so we understand!
Here are the relevant portions of the nodes' config. We have a central repo for configurations, hence the "strange" syntax.
MARISA-BLT
$ uci show wireless.@wifi-iface[0] -c marisa-blt/etc/config/
wireless.cfg033579=wifi-iface
wireless.cfg033579.device=radio0
wireless.cfg033579.encryption=none
wireless.cfg033579.mode=adhoc
wireless.cfg033579.ssid=mesh.quintanalibre.org.ar
wireless.cfg033579.bssid=02:12:34:56:78:9A

# wlan0 here is a master mode interface
$ uci show network.lan.ifname -c marisa-blt/etc/config/
network.lan.ifname=bat0 wlan0 eth0

# wlan0-1 here is the mesh interface
$ uci show batman-adv.bat0.interfaces -c marisa-blt/etc/config/
batman-adv.bat0.interfaces=wlan0-1 br-lan

MARISA-MR
$ uci show wireless.@wifi-iface[0] -c marisa-mr/etc/config/
wireless.cfg033579=wifi-iface
wireless.cfg033579.encryption=none
wireless.cfg033579.device=radio0
wireless.cfg033579.bssid=02:12:34:56:78:9A
wireless.cfg033579.mode=adhoc
wireless.cfg033579.ssid=mesh.quintanalibre.org.ar
(this is wlan0 in this router)

# eth1 is connected to a router inside the owner's house; no batman
$ uci show network.lan.ifname -c marisa-mr/etc/config/
network.lan.ifname=bat0 eth1 eth0

# wlan0 is the only wireless interface here
$ uci show batman-adv.bat0.interfaces -c marisa-mr/etc/config/
batman-adv.bat0.interfaces=wlan0 br-lan

We have also tried adding eth0 to bat0 and taking it out of br-lan, which also "works" but gives routing loops from time to time.
OpenWRT is trunk from a couple of weeks ago, where the batman-adv version was 2011.4
We hoped that adding a second router through ethernet would give the mesh an alternate path and more redundancy, but we get much lower throughput because of these loops.
We would very much appreciate any insight on this matter.
Hoping to read from you bat-friends soon :)
Cheers, NicoEchániz