Dear all, I'm an author of "Routing Protocols for Mesh Networks with Mobility Support" and "Assessing Mobility Support in Mesh Networks". I'm very happy to hear that someone is reading these papers, and I would be glad to discuss them with you. By the way, I'm also the implementer of the BATMAN module for the ns-2 simulator.
In the first paper, we compare BATMAN with the OLSR, GPSR and AODV routing protocols through ns-2 simulation. We identified a routing-loop problem that, in our opinion, strongly affects mobility scenarios, and we designed sw-BATMAN to limit it. In the second paper, we tried to apply our findings to a real implementation. To do so, we deployed a vehicular testbed and modified batman-adv according to what we had found in simulation. From this process, we obtained a routing protocol able to achieve good performance in our vehicular scenario (also thanks to the second interface, which enables seamless handover).
Concerning the ns-2 module, we implemented it using the description of the BATMAN protocol we found in the BATMAN-DRAFT (dated 7 Apr 2008). We would like to point out that we approached the BATMAN draft with our mobile scenario in mind from the very beginning. This could have biased our interpretation and subsequent choices. Indeed, as soon as we had implemented BATMAN in ns-2, we found that its response to "heavy mobility" was rather sluggish: we used a window of size 128 with a 1-second OGM period. As a consequence, when a node suddenly moves out of radio range of a neighbor that had filled up the window with OGMs, and into the radio range of a new neighbor, it takes roughly 65 seconds before the new route becomes preferable to the old one (i.e., once the window collecting OGMs from the old neighbor is half empty and the window collecting OGMs from the new neighbor is half full). For this reason, in our comparison with the "Draft" BATMAN, we explicitly introduced a timer that forces the old route to be dropped from the window. Please note that this timer is just twice the originator interval (i.e., 2 seconds using the value in the draft) and does not depend on the PURGE_TIMEOUT. We could call it NEIGHBOR_TIMEOUT.
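To make the 65-second figure concrete, here is a minimal sketch (not the ns-2 module code) that replays the scenario with two sliding windows. The function name switch_time and its parameters are hypothetical; the only values taken from the description above are the window size of 128, the 1-second OGM period, and a timeout of twice the OGM interval. It assumes no OGM is lost and that the route with the larger OGM count wins.

```python
from collections import deque

WINDOW_SIZE = 128    # sliding-window size used in the simulation
OGM_INTERVAL = 1.0   # seconds between two consecutive OGMs


def switch_time(window_size=WINDOW_SIZE, ogm_interval=OGM_INTERVAL,
                neighbor_timeout=None):
    """Seconds after the move until the new neighbor's window holds
    more OGMs than the old neighbor's window (hypothetical helper)."""
    # 1 = OGM received in that slot, 0 = OGM missed.
    old = deque([1] * window_size, maxlen=window_size)  # full window
    new = deque([0] * window_size, maxlen=window_size)  # empty window
    elapsed = 0.0
    while sum(new) <= sum(old):
        elapsed += ogm_interval
        old.append(0)  # old neighbor is out of range: OGM missed
        new.append(1)  # new neighbor is in range: OGM received
        # Assumed NEIGHBOR_TIMEOUT behaviour: once no OGM has arrived
        # for neighbor_timeout seconds, drop the old window entirely.
        if neighbor_timeout is not None and elapsed >= neighbor_timeout:
            old = deque([0] * window_size, maxlen=window_size)
    return elapsed


print(switch_time())                                     # plain sliding window
print(switch_time(neighbor_timeout=2 * OGM_INTERVAL))    # with the extra timer
```

Under these assumptions the plain window needs 65 OGM intervals to switch, while the timer variant switches as soon as the timeout expires.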
We acknowledge that the Draft makes no mention of such a timer; however, the performance was so poor (in this highly mobile scenario) that we, so to speak, found no better way to handle the situation. Additionally, following the experience with other routing protocols for ad hoc networks (e.g., OLSR), we introduced a MAC-layer timed feedback that triggers the deletion of a route if the MAC layer fails to deliver data to the next hop of that route (thus making BATMAN more responsive). Naturally, the MAC-layer feedback is triggered only by data packets that cannot be sent to the next hop.
We would also like to add that our smart-window modification addresses potential reactivity problems rather than connectivity loss. This is exemplified in the figure attached to this mail, where the following occurs:
- At first node 4 is within radio range of node 2 (and out of range of node 3); thus, node 3 receives OGMs from 4 through 2, and populates a sliding window with them (let's call this window 4-2-W) [notation: (destination_node) - (nexthop_node) - W]
- When node 4 moves out of node 2 range and within node 3 range, the MAC-layer timed feedback forces node 2 to remove node 4 as next-hop choice to node 4 itself (and similarly, node 4 removes node 2 as next hop). At this point, node 3 starts populating a new window with freshly-received, direct OGMs from node 4 (let's call this window 4-4-W), while at the same time the old "4-2-W" starts depleting.
- The purpose of the exponentially-weighted smart-window is to speed-up the identification of node 4 as next-hop for node 3 (instead of keeping the old stale route through node 2). Indeed, the sum of the (weighted) content of 4-4-W quickly overcomes the sum of the content of 4-2-W.
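The effect of the weighting can be illustrated with a small sketch comparing how quickly 4-4-W overtakes 4-2-W at node 3. This is only an illustration: the geometric weights (0.9 raised to the age of the slot) are an assumption chosen for the example and are not the weights used in sw-BATMAN, and the helper name crossover_steps is hypothetical. Timers and MAC-layer feedback are deliberately left out.

```python
def crossover_steps(weights):
    """Number of OGM intervals until the weighted sum of the new window
    (4-4-W) exceeds that of the old window (4-2-W).
    weights[0] applies to the newest slot (hypothetical helper)."""
    n = len(weights)
    old = [1] * n  # 4-2-W starts full
    new = [0] * n  # 4-4-W starts empty

    def score(window):
        return sum(w * bit for w, bit in zip(weights, window))

    steps = 0
    while score(new) <= score(old):
        old = [0] + old[:-1]  # OGM via node 2 missed, oldest slot falls out
        new = [1] + new[:-1]  # direct OGM from node 4 received
        steps += 1
    return steps


WINDOW_SIZE = 128
plain = [1.0] * WINDOW_SIZE                            # unweighted window
weighted = [0.9 ** age for age in range(WINDOW_SIZE)]  # assumed weights

print(crossover_steps(plain))     # unweighted sliding window
print(crossover_steps(weighted))  # exponentially weighted window
```

With these assumed weights the recent OGMs dominate the sum, so the direct route is preferred after a handful of intervals instead of after half the window has turned over.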
What do you think of our implementation now? Is it clearer? Please feel free to ask for more details: I will be glad to explain our reasons and to learn something useful from your experience.
Best regards, Massimo
2010/12/23 Marek Lindner lindner_marek@yahoo.de
Hi,
"Note that, within a timeout, set by default to twice the OGM interval, nodes purge a neighbor from which OGMs are no longer received"
In fact the PURGE_TIMEOUT is set to a value of 200s so the problem described in the paper, in my opinion, comes from a wrong implementation of B.A.T.M.A.N. in the ns2 simulator used in the paper.
you are absolutely right. The authors assume the batman protocol is based on timeouts which leads to wrong results. Anyone interested to understand the PURGE_TIMEOUT can consult our FAQ.
During the WBMv3 we studied a paper called "Routing protocols for mesh networks with mobility support" which seems to have a similar content / same authors (?) and contacted the authors to let them know about their wrong assumption. Not sure in which manner these papers are related ...
Furthermore, in my opinion, it is better that mobile nodes do not use B.A.T.M.A.N. but instead act as mesh-unaware nodes exploiting HNA. Mesh networks are not MANETs...
Agreed.
Cheers, Marek