Hi there,
I found an interesting paper today that covers mobility of some nodes in a batman-adv network:
Stefano Annese, Claudio Casetti, Carla F. Chiasserini, Paolo Cipollone, Andrea Ghittino, and Massimo Reineri. 2009. Assessing mobility support in mesh networks. In Proceedings of the 4th ACM international workshop on Experimental evaluation and characterization (WINTECH '09).
http://portal.acm.org/citation.cfm?id=1614297
Unfortunately, the paper is not freely available on the web. If you're blessed with access to the ACM digital library by your school or employer, you can download it free of charge; in other cases you'll have to buy it for USD 15.
In the following, I will summarize relevant parts:
I) Weighting the local packet count
The authors found that when each entry in the sliding window is weighed with the same weight as it is done in the current version of batman-adv, routing loops can occur when mobility is present. They greatly improve their results by weighting with the following function (i = 0..S-1 where S is the window size; 0 means freshest packet):
weight(i) := max(1, floor(i * S / e^i))
This function seems to be suboptimal as it weights entry 0 with 1 while the following few entries are boosted with a weight of up to 20 (see attachment). Adding 1 to i should deal with this flaw.
However, this weighting function lead to big improvements in the authors' simulations, so I think you are on the right track when thinking about introducing some kind of weighting or moving average to boost the weight of current packets.
Here's the Octave/Matlab code I used to create the plot (if you want to do plots for other weighting functions that are discussed, just replace the definition of w and don't forget to use .* and ./ instead of * and / to enforce elementwise operations):
x = 0:63 w = max(1, x.*64./exp(x)) stem(x,w) xlabel('age') ylabel('weight')
II) Optimizations for multi-interface nodes
The authors used nodes with 2 radios. They did roughly the following on the mobile nodes (not on the fixed ones):
1) In regular intervals, scan the forwarding tables for each interface to check if any neighbors are known. If an interface has no contact to any neighbor, go to 2)
2) Use the interface without known neighbors to scan all channels except the channel that the other interface is listening to. Don't send OGMs but simply listen for OGMs from other stations. If a channel is found that has a neighbor sending, stay on this channel and start to behave like a normal batman-adv node (send OGMs etc.).
I'm not sure if this will be benefitial in other scenarios than in the vehicular network scenario of the paper, but I wanted to share my findings with you.
- Daniel
Hi, very interesting document. However the problem of routing loops described in the paper is based on this erroneous information about B.A.T.M.A.N.:
"Note that, within a timeout, set by default to twice the OGM interval, nodes purge a neighbor from which OGMs are no longer received"
In fact the PURGE_TIMEOUT is set to a value of 200s so the problem described in the paper, in my opinion, comes from a wrong implementation of B.A.T.M.A.N. in the ns2 simulator used in the paper.
Anyhow the weighting function of the window is an interesting part but unfortunately the authors does not describe how this is the best function. Being a good weighting function in a specific network scenario does not mean being the best function at all! :)
Furthermore in my opinion, is better that mobile nodes does not use B.A.T.M.A.N. but should act as mesh unaware nodes exploiting HNA. Mesh network are not MANET...
Regards!
2010/12/22 Daniel Seither post@tiwoc.de:
Hi there,
I found an interesting paper today that covers mobility of some nodes in a batman-adv network:
Stefano Annese, Claudio Casetti, Carla F. Chiasserini, Paolo Cipollone, Andrea Ghittino, and Massimo Reineri. 2009. Assessing mobility support in mesh networks. In Proceedings of the 4th ACM international workshop on Experimental evaluation and characterization (WINTECH '09).
http://portal.acm.org/citation.cfm?id=1614297
Unfortunately, the paper is not freely available on the web. If you're blessed with access to the ACM digital library by your school or employer, you can download it free of charge; in other cases you'll have to buy it for USD 15.
In the following, I will summarize relevant parts:
I) Weighting the local packet count
The authors found that when each entry in the sliding window is weighed with the same weight as it is done in the current version of batman-adv, routing loops can occur when mobility is present. They greatly improve their results by weighting with the following function (i = 0..S-1 where S is the window size; 0 means freshest packet):
weight(i) := max(1, floor(i * S / e^i))
This function seems to be suboptimal as it weights entry 0 with 1 while the following few entries are boosted with a weight of up to 20 (see attachment). Adding 1 to i should deal with this flaw.
However, this weighting function lead to big improvements in the authors' simulations, so I think you are on the right track when thinking about introducing some kind of weighting or moving average to boost the weight of current packets.
Here's the Octave/Matlab code I used to create the plot (if you want to do plots for other weighting functions that are discussed, just replace the definition of w and don't forget to use .* and ./ instead of * and / to enforce elementwise operations):
x = 0:63 w = max(1, x.*64./exp(x)) stem(x,w) xlabel('age') ylabel('weight')
II) Optimizations for multi-interface nodes
The authors used nodes with 2 radios. They did roughly the following on the mobile nodes (not on the fixed ones):
- In regular intervals, scan the forwarding tables for each interface
to check if any neighbors are known. If an interface has no contact to any neighbor, go to 2)
- Use the interface without known neighbors to scan all channels except
the channel that the other interface is listening to. Don't send OGMs but simply listen for OGMs from other stations. If a channel is found that has a neighbor sending, stay on this channel and start to behave like a normal batman-adv node (send OGMs etc.).
I'm not sure if this will be benefitial in other scenarios than in the vehicular network scenario of the paper, but I wanted to share my findings with you.
- Daniel
Hi,
"Note that, within a timeout, set by default to twice the OGM interval, nodes purge a neighbor from which OGMs are no longer received"
In fact the PURGE_TIMEOUT is set to a value of 200s so the problem described in the paper, in my opinion, comes from a wrong implementation of B.A.T.M.A.N. in the ns2 simulator used in the paper.
you are absolutely right. The authors assume the batman protocol is based on timeouts which leads to wrong results. Anyone interested to understand the PURGE_TIMEOUT can consult our FAQ.
During the WBMv3 we studied a paper called "Routing protocols for mesh networks with mobility support" which seems to have a similar content / same authors (?) and contacted the authors to let them know about their wrong assumption. Not sure in which manner these papers are related ...
Furthermore in my opinion, is better that mobile nodes does not use B.A.T.M.A.N. but should act as mesh unaware nodes exploiting HNA. Mesh network are not MANET...
Agreed.
Cheers, Marek
Dear all, I'm an author of "Routing Protocols for Mesh Networks with Mobility Support" and "Assessing Mobility Support in Mesh Networks". I'm very happy to hear that someone is reading these papers and I'm glad to discuss with you about them. By the way, I'm also the implementer of BATMAN module for ns-2 simulator.
In the first paper, we compare BATMAN with OLSR, GPSR and AODV routing protocols through ns-2 simulation. We found the routing loop problem that, in our opinion, strongly affects mobility scenarios and we designed sw-BATMAN to limit this problem. In the second paper, we tried to apply our findings in a real implementation. Thus, we deployed a vehicular testbed and we modified the batman-adv according to what we found in simulation. From this process, we obtained a routing protocol able to achieve good performance in our vehicular scenario (also thanks to the second interface enabling seamless handover).
Concerning the ns-2 module, we implemented it using the description of BATMAN protocol we found in the BATMAN-DRAFT [date 7 Apr 2008]. Actually we would like to point out from the start that we approached the BATMAN draft from the beginning with our mobile scenario in mind. This could have biased our interpretation and subsequent choices. Indeed, we found out from the beginning, after implementing batman in ns-2, that its response to "heavy mobility" was a bit slacking: we used a window of size 128, with 1-second OGM period. As a consequence, when suddenly moving out of radio range of a neighbor that had filled up the window with OGMs, and into the radio range of a new neighbor, it would take roughly 65 seconds before the new route is preferable to the old one (i.e., after the window collecting OGMs from the old neighbor is half empty, and the window collecting OGMs from the new neighbor is half full). Indeed, in our comparison with the "Draft" BATMAN, we explicitly introduced a timer to force the old route to be dropped out of the window. Please note that this timer is just two times the originator time interval (i.e., 2 seconds using the value in the draft) and it does not depend on the PURGE_TIMEOUT. We could call it NEIGHBOR_TIMEOUT.
We acknowledge that the Draft makes no mention of such timer, however the performance was so poor (in this highly-mobile scenario), that we, so to speak, found no better way handle this situation. Additionally, following the experience with other routing protocols for ad hoc layers (i.e., OLSR) we have introduced a MAC-layer timed feedback that triggers the deletion of a route if the MAC layer fails in delivering data to the next-hop of such route (thus making BATMAN more responsive). Reasonably, the MAC-layer feedback is just triggered by data packets that cannot be sent to the next hop.
We would also like to add that our smart-window modification addresses potential reactivity problems rather than connectivity loss. This is exemplified in figure in attachment to this mail, where the following occurs:
- At first node 4 is within radio range of node 2 (and out of range of node 3); thus, node 3 receives OGMs from 4 through 2, and populates a sliding window with them (let's call this window 4-2-W) [notation: (destination_node) - (nexthop_node) - W] - When node 4 moves out of node 2 range and within node 3 range, the MAC-layer timed feedback forces node 2 to remove node 4 as next-hop choice to node 4 itself (and similarly, node 4 removes node 2 as next hop). At this point, node 3 starts populating a new window with freshly-received, direct OGMs from node 4 (let's call this window 4-4-W), while at the same time the old "4-2-W" starts depleting. - The purpose of the exponentially-weighted smart-window is to speed-up the identification of node 4 as next-hop for node 3 (instead of keeping the old stale route through node 2). Indeed, the sum of the (weighted) content of 4-4-W quickly overcomes the sum of the content of 4-2-W.
What do you think now about our implementation? Is it more clear? Please, feel free to ask for more details: I will be glad to explain our reasons and to learn something useful from your experiences.
Best regards, Massimo
2010/12/23 Marek Lindner lindner_marek@yahoo.de
Hi,
"Note that, within a timeout, set by default to twice the OGM interval, nodes purge a neighbor from which OGMs are no longer received"
In fact the PURGE_TIMEOUT is set to a value of 200s so the problem described in the paper, in my opinion, comes from a wrong implementation of B.A.T.M.A.N. in the ns2 simulator used in the paper.
you are absolutely right. The authors assume the batman protocol is based on timeouts which leads to wrong results. Anyone interested to understand the PURGE_TIMEOUT can consult our FAQ.
During the WBMv3 we studied a paper called "Routing protocols for mesh networks with mobility support" which seems to have a similar content / same authors (?) and contacted the authors to let them know about their wrong assumption. Not sure in which manner these papers are related ...
Furthermore in my opinion, is better that mobile nodes does not use B.A.T.M.A.N. but should act as mesh unaware nodes exploiting HNA. Mesh network are not MANET...
Agreed.
Cheers, Marek
Hi,
I'm an author of "Routing Protocols for Mesh Networks
with Mobility Support" and "Assessing Mobility Support in Mesh Networks". I'm very happy to hear that someone is reading these papers and I'm glad to discuss with you about them.
welcome on our list! :-)
Indeed, in our comparison with the "Draft" BATMAN, we explicitly introduced a timer to force the old route to be dropped out of the window. Please note that this timer is just two times the originator time interval (i.e., 2 seconds using the value in the draft) and it does not depend on the PURGE_TIMEOUT. We could call it NEIGHBOR_TIMEOUT.
Thanks for the clarifications. Please note that it does not matter how long a certain timeout is - every timeout which influences routing decisions is a potential risk for routing loops. We, the people behind batman, are very well aware that a timeout free routing algorithm does not converge as fast as one with timeouts but also is less susceptible to routing loops. Your tests / paper demonstrated this fact once more. Of course, we are interested in improving the algorithm as long as it does not happen on the expense of stability. We are going to study your paper at the coming WirelessBattleMesh in Barcelona to give you detailed feedback regarding your ideas.
Regards, Marek
b.a.t.m.a.n@lists.open-mesh.org