Signed-off-by: Marek Lindner lindner_marek@yahoo.de
---
 soft-interface.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/soft-interface.c b/soft-interface.c
index 2ffdc74..7548762 100644
--- a/soft-interface.c
+++ b/soft-interface.c
@@ -836,7 +836,7 @@ struct net_device *softif_create(const char *name)
 	atomic_set(&bat_priv->gw_sel_class, 20);
 	atomic_set(&bat_priv->gw_bandwidth, 41);
 	atomic_set(&bat_priv->orig_interval, 1000);
-	atomic_set(&bat_priv->hop_penalty, 10);
+	atomic_set(&bat_priv->hop_penalty, 30);
 	atomic_set(&bat_priv->log_level, 0);
 	atomic_set(&bat_priv->fragmentation, 1);
 	atomic_set(&bat_priv->bcast_queue_left, BCAST_QUEUE_LEN);
On Fri, Jan 27, 2012 at 11:11:55PM +0800, Marek Lindner wrote:
Signed-off-by: Marek Lindner lindner_marek@yahoo.de
 soft-interface.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/soft-interface.c b/soft-interface.c
index 2ffdc74..7548762 100644
--- a/soft-interface.c
+++ b/soft-interface.c
@@ -836,7 +836,7 @@ struct net_device *softif_create(const char *name)
 	atomic_set(&bat_priv->gw_sel_class, 20);
 	atomic_set(&bat_priv->gw_bandwidth, 41);
 	atomic_set(&bat_priv->orig_interval, 1000);
-	atomic_set(&bat_priv->hop_penalty, 10);
+	atomic_set(&bat_priv->hop_penalty, 30);
Hi Marek
Do you have any performance analysis to show this is really helpful and not harmful?
I've seen indoor results where I had to reduce the hop penalty; otherwise BATMAN was taking a short path which worked badly. By reducing the hop penalty, and so encouraging it to take more hops, I got usable routes.
I see the danger that this could break working networks, so maybe it needs justification?
Thanks Andrew
Hi Andrew,
Do you have any performance analysis to show this is really helpful and not harmful?
I've seen indoor results where I had to reduce the hop penalty; otherwise BATMAN was taking a short path which worked badly. By reducing the hop penalty, and so encouraging it to take more hops, I got usable routes.
I see the danger that this could break working networks, so maybe it needs justification?
as a matter of fact I do believe it is helpful. In various networks (more than a dozen) I have seen that batman would largely favor multi-hop routes, thus reducing the overall throughput. By setting it to a higher value I regained some of its performance. The networks are still up & running - I can show them to you if you are interested.
So, you had to reduce the default value of 10 to something even smaller? A hop penalty of 10 results in a penalty of 4% per hop, a rough equivalent of 2 lost packets (62/64). That does not sound like very much to me. Can you explain your test setup a little more?
Nevertheless, this patch was intended to get a discussion going. The main problem I have been seeing over the last few weeks is that OGM broadcasts have a hard time estimating the link quality / throughput on 11n devices. I'll also try to hack a proof of concept for an RSSI influence on the routing and see if that has a better effect.
Regards, Marek
Hi all,
2012/1/27 Marek Lindner lindner_marek@yahoo.de:
Hi Andrew,
Do you have any performance analysis to show this is really helpful and not harmful?
I've seen indoor results where I had to reduce the hop penalty; otherwise BATMAN was taking a short path which worked badly. By reducing the hop penalty, and so encouraging it to take more hops, I got usable routes.
I see the danger that this could break working networks, so maybe it needs justification?
I have experienced the same situation in some tests, and I agree with Andrew when he says that some form of justification is necessary.
as a matter of fact I do believe it is helpful. In various networks (more than a dozen) I have seen that batman would largely favor multi-hop routes, thus reducing the overall throughput. By setting it to a higher value I regained some of its performance. The networks are still up & running - I can show them to you if you are interested.
So, you had to reduce the default value of 10 to something even smaller? A hop penalty of 10 results in a penalty of 4% per hop, a rough equivalent of 2 lost packets (62/64). That does not sound like very much to me. Can you explain your test setup a little more?
Nevertheless, this patch was intended to get a discussion going. The main problem I have been seeing over the last few weeks is that OGM broadcasts have a hard time estimating the link quality / throughput on 11n devices. I'll also try to hack a proof of concept for an RSSI influence on the routing and see if that has a better effect.
The problems with TQ emerge as device rates increase, because especially in mixed b/g/n networks TQ does not distinguish between fast and slow links. We all know that broadcast losses say almost nothing about link speed or load.
The only way to improve the TQ metric is a cross-layer implementation, as I have already tried (considering only bandwidth) in my tests. Obviously this means breaking the "universal" compatibility with arbitrary network interfaces, but relying on mac80211 and cfg80211 can limit this problem, in my opinion.
Regards, Marek
Regards, Daniele
Hi,
I have experienced the same situation in some tests, and I agree with Andrew when he says that some form of justification is necessary.
You have also seen that a hop penalty of 10 is too high? Can you explain your setup a bit more?
The problems with TQ emerge as device rates increase, because especially in mixed b/g/n networks TQ does not distinguish between fast and slow links. We all know that broadcast losses say almost nothing about link speed or load.
The only way to improve the TQ metric is a cross-layer implementation, as I have already tried (considering only bandwidth) in my tests. Obviously this means breaking the "universal" compatibility with arbitrary network interfaces, but relying on mac80211 and cfg80211 can limit this problem, in my opinion.
I am certain that you have great ideas and that you spend a lot of time working on batman / meshing. However, it is somewhat difficult to review / discuss / adapt your work, since we have a hard time understanding your concepts without proper explanations / documentation. Would it be possible for you to talk/write a bit more about your stuff?
The WBMv5 is a good opportunity to chat because you get all of us in one place. ;-)
Cheers, Marek
So, you had to reduce the default value of 10 to something even smaller? A hop penalty of 10 results in a penalty of 4% per hop, a rough equivalent of 2 lost packets (62/64). That does not sound like very much to me. Can you explain your test setup a little more?
These observations come from a research project done together with Hochschule Luzern. There is some flyer-like documentation at:
www.hslu.ch/t-spawn-project-description_en.pdf
It is a deployable indoor network. The tests I made were with a mesh of 6 nodes, deployed in a chain. The deployment is intelligent, made independently of BATMAN. It uses packet probing at the lowest coding rate to ensure there is always a link to two nodes upstream in the chain. So you walk along with 5 nodes in your hand. When the algorithm determines the link upstream to two nodes has reached a threshold, it tells you to deploy the next mesh node. We kept doing this, along the corridor, down the steps, along another corridor, through a fire door, etc., until we were out of nodes.
Then iperf was used to measure the traffic from one end of the chain to the other. With the default hop penalty we got poor performance. With the traceroute facility of batctl, we could see it was route flipping between 3 hops and 4 hops. When it used 3 hops, the packet loss was too high and we got poor bandwidth. Then it went up to 4 hops, where the packet loss was lower, so we got more bandwidth.
This was repeatable with each deployment we made.
Then we tried with a lower hop penalty. I think it was 5, but I don't remember. BATMAN then used 5 hops and there was no route flipping. We also got the best iperf bandwidth from end to end of the chain.
The fact BATMAN was route flipping with a hop penalty of 10 suggests to me the links had similar TQ. So OGMs are getting through at the lowest coding rate. But data packets are having trouble, maybe because they are full MTU, or because the wifi driver is using the wrong coding rate.
I suspect the TQ measurements as determined by OGMs are more optimistic than what actual data packets experience. Linus played with different NDP packet sizes, and I think he ended up with big packets in order to give more realistic TQ measurements.
Unfortunately, this project is now finished. I do have access to the hardware, but no time allocated to play with it :-(
Nevertheless, this patch was intended to get a discussion going.
Well, I'm happy to take part in the discussion. I've no idea if our use case is typical or an edge case, so comments and results from other people's networks would be useful.
If this change is to help 11n, maybe some more intelligence would be better: ask the wireless stack whether the interface is abg or n, and from that determine what hop penalty should be used?
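Just to illustrate that idea (the integration point and helper name below are made up; only the cfg80211 structures are real), a rough sketch could look like this:

#include <net/cfg80211.h>

/* Illustrative sketch only: pick a larger default hop penalty when the
 * underlying wireless device advertises HT (802.11n) support. */
static int default_hop_penalty(struct net_device *net_dev)
{
	struct wireless_dev *wdev = net_dev->ieee80211_ptr;
	int i;

	if (!wdev)
		return 10; /* not a cfg80211 wireless device */

	for (i = 0; i < IEEE80211_NUM_BANDS; i++) {
		struct ieee80211_supported_band *band = wdev->wiphy->bands[i];

		if (band && band->ht_cap.ht_supported)
			return 30; /* 11n capable */
	}

	return 10; /* a/b/g only */
}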
Andrew
Hi all,
Very nice setup Andrew :)
On Fri, Jan 27, 2012 at 08:13:34 +0100, Andrew Lunn wrote:
So, you had to reduce the default value of 10 to something even smaller? A hop penalty of 10 results in a penalty of 4% per hop, a rough equivalent of 2 lost packets (62/64). That does not sound like very much to me. Can you explain your test setup a little more?
I suspect the TQ measurements as determined by OGMs are more optimistic than what actual data packets experience. Linus played with different NDP packet sizes, and I think he ended up with big packets in order to give more realistic TQ measurements.
Nevertheless, this patch was intended to get a discussion going.
Well, I'm happy to take part in the discussion. I've no idea if our use case is typical or an edge case, so comments and results from other people's networks would be useful.
If this change is to help 11n, maybe some more intelligence would be better: ask the wireless stack whether the interface is abg or n, and from that determine what hop penalty should be used?
In my honest opinion we are mixing two different issues:
1) current hop penalty value not really significant
2) OGM link quality measurements do not reflect the metric we'd like them to be
problem 2 is not going to be solved by hacking the hop penalty. It needs further investigation/research and NDP is probably a good starting point towards a possible solution (I think we all agree on this).
As far as I understand, the hop penalty is in charge of making batman prefer a shorter route in case of equal TQs over the traversed links. Instead of hacking the value... what about redesigning the way the hop penalty affects the TQ value of forwarded OGMs? Maybe using a different function (polynomial of degree > 1, or exponential) instead of a simple linear decrease? Might this help all the scenarios we mentioned?
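Just to make that idea concrete, here is a purely illustrative sketch (made-up name, not a proposed patch) of one possible polynomial variant, where the amount subtracted per hop grows with the square of the current TQ:

#define TQ_MAX_VALUE 255

/* Illustrative only: the subtracted penalty scales with tq^2, so strong
 * links pay proportionally more per hop than weak ones. */
static unsigned int hop_penalty_quadratic(unsigned int tq,
					  unsigned int hop_penalty)
{
	unsigned int sub = (hop_penalty * tq * tq) /
			   (TQ_MAX_VALUE * TQ_MAX_VALUE);

	return tq > sub ? tq - sub : 0;
}

With hop_penalty = 10 this would still subtract 10 at tq = 255, but only about 1 or 2 at tq = 100, compared to 4 with the current scheme.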
Cheers,
Hi,
In my honest opinion we are mixing two different issues:
- current hop penalty value not really significant
- OGM link quality measurements do not reflect the metric we'd like them to be
problem 2 is not going to be solved by hacking the hop penalty. It needs further investigation/research and NDP is probably a good starting point towards a possible solution (I think we all agree on this).
you are right - these are 2 different issues.
As far as I understand, the hop penalty is in charge of making batman prefer a shorter route in case of equal TQs over the traversed links. Instead of hacking the value... what about redesigning the way the hop penalty affects the TQ value of forwarded OGMs? Maybe using a different function (polynomial of degree > 1, or exponential) instead of a simple linear decrease? Might this help all the scenarios we mentioned?
The hop penalty is not as linear as you think. The formula is:

  (tq * (TQ_MAX_VALUE - hop_penalty)) / TQ_MAX_VALUE

With a hop penalty of 10 you get the following results:

  tq = 255, penalty = 10, resulting tq = 245
  tq = 200, penalty =  8, resulting tq = 192
  tq = 150, penalty =  6, resulting tq = 144
  tq = 100, penalty =  4, resulting tq = 96
  tq =  50, penalty =  2, resulting tq = 48
As you can see, the more the TQ goes down, the less influence the hop penalty has.
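For reference, a small userspace sketch (not batman-adv code itself; it only assumes TQ_MAX_VALUE = 255, as in the batman-adv sources) that reproduces the table above:

#include <stdio.h>

#define TQ_MAX_VALUE 255

/* Apply the per-hop penalty to a TQ value using the integer formula
 * quoted above. */
static unsigned int apply_hop_penalty(unsigned int tq, unsigned int hop_penalty)
{
	return (tq * (TQ_MAX_VALUE - hop_penalty)) / TQ_MAX_VALUE;
}

int main(void)
{
	unsigned int tqs[] = { 255, 200, 150, 100, 50 };
	unsigned int i;

	for (i = 0; i < sizeof(tqs) / sizeof(tqs[0]); i++)
		printf("tq = %3u, resulting tq = %3u\n",
		       tqs[i], apply_hop_penalty(tqs[i], 10));

	return 0;
}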
Regards, Marek
On Sat, Jan 28, 2012 at 03:12:57PM +0100, Antonio Quartulli wrote:
Hi all,
Very nice setup Andrew :)
I cannot take much credit for it. I helped write the project proposal, but then was not allowed to take part in the project because of other higher priority projects. The credit goes to Hochschule Luzern, Linus and others.
In my honest opinion we are mixing two different issues:
- current hop penalty value not really significant
- OGM link quality measurements do not reflect the metric we'd like them to be
Yes, I agree. However, in the scenarios we have seen in this project, they are related. When the OGM-based TQ gives us too optimistic values, a higher hop penalty makes this even worse.
However, comments so far suggest I'm in a corner case, and that for others a higher hop penalty does help. So for the moment, maybe increasing the hop penalty is the right thing to do, but remember that once we have a better TQ measurement, the hop penalty should be examined again.
Andrew
On Saturday, January 28, 2012 03:13:34 Andrew Lunn wrote:
Then iperf was used to measure the traffic from one end of the chain to the other. With the default hop penalty we got poor performance. With the traceroute facility of batctl, we could see it was route flipping between 3 hops and 4 hops. When it used 3 hops, the packet loss was too high and we got poor bandwidth. Then it went up to 4 hops, where the packet loss was lower, so we got more bandwidth.
This was repeatable with each deployment we made.
Then we tried with a lower hop penalty. I think it was 5, but I don't remember. BATMAN then used 5 hops and there was no route flipping. We also got the best iperf bandwidth from end to end of the chain.
I have a hard time understanding this because the hop penalty has less influence on bad links. As you can see in my previous mail, below a TQ of 100 the default penalty of 10 makes less than a 4 TQ point difference.
Did you try setting a higher multicast rate? This also tends to eliminate flaky direct connections.
The fact BATMAN was route flipping with a hop penalty of 10 suggests to me the links had similar TQ. So OGMs are getting through at the lowest coding rate. But data packets are having trouble, maybe because they are full MTU, or because the wifi driver is using the wrong coding rate.
Actually, in most of my setups the connection to all neighboring nodes is perfect. Maybe that is another corner case? :-)
I suspect the TQ measurements as determined by OGMs are more optimistic than what actual data packets experience. Linus played with different NDP packet sizes, and I think he ended up with big packets in order to give more realistic TQ measurements.
Unfortunately, this project is now finished. I do have access to the hardware, but no time allocated to play with it :-(
It was a good idea and is not forgotten. Hopefully I will have the code ready by the time of the WBMv5. Then we can play a bit with it.
Nevertheless, this patch was intended to get a discussion going.
Well, I'm happy to take part in the discussion. I've no idea if our use case is typical or an edge case, so comments and results from other people's networks would be useful.
If this change is to help 11n, maybe some more intelligence would be better: ask the wireless stack whether the interface is abg or n, and from that determine what hop penalty should be used?
It is not directly related to 11n. The pain level grows with 11n as the gap between packet loss and throughput grows. This setting is more intended for setups in which all nodes have rather good connections to all other nodes. Then the direct TQs and the "hop" TQs are too similar, and batman starts using multi-hop connections.
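To put rough numbers on that (the TQ values below are assumed for illustration, not measurements, and the asymmetric link penalty is ignored for simplicity):

#include <stdio.h>

#define TQ_MAX_VALUE 255

/* TQ of a route via a neighbour: the neighbour's advertised TQ, reduced
 * by the hop penalty, combined with the local link TQ (simplified). */
static unsigned int path_tq(unsigned int neigh_tq, unsigned int link_tq,
			    unsigned int hop_penalty)
{
	unsigned int forwarded = (neigh_tq * (TQ_MAX_VALUE - hop_penalty)) /
				 TQ_MAX_VALUE;

	return (forwarded * link_tq) / TQ_MAX_VALUE;
}

int main(void)
{
	unsigned int direct_tq = 240; /* direct link losing a few OGMs */

	printf("direct: %u\n", direct_tq);
	printf("2-hop, penalty 10: %u\n", path_tq(255, 250, 10)); /* 240 */
	printf("2-hop, penalty 30: %u\n", path_tq(255, 250, 30)); /* 220 */

	return 0;
}

With the old default both candidates end up at practically the same TQ, so small oscillations flip the route; with 30 the direct route keeps a clear margin.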
Regards, Marek
On Fri, Jan 27, 2012 at 11:54:25PM +0800, Marek Lindner wrote:
Hi Andrew,
Do you have any performance analysis to show this is really helpful and not harmful?
I've seen indoor results where I had to reduce the hop penalty; otherwise BATMAN was taking a short path which worked badly. By reducing the hop penalty, and so encouraging it to take more hops, I got usable routes.
I see the danger that this could break working networks, so maybe it needs justification?
as a matter of fact I do believe it is helpful. In various networks (more than a dozen) I have seen that batman would largely favor multi-hop routes, thus reducing the overall throughput. By setting it to a higher value I regained some of its performance. The networks are still up & running - I can show them to you if you are interested.
I have seen similar results in my test setups. One simple scenario where I have seen route flapping with hop penalty 10 in multiple setups is: if some nodes are at the same place (e.g. a few netbooks on the same table), they often don't use the direct route but change to a two-hop route to reach their destination - even if the direct link is nearly perfect. There doesn't even have to be payload traffic involved; the routes just flap because of the small TQ oscillations from a few lost packets.
In these tests, I also changed the hop penalty to 30 (or sometimes even 50) and these problems were gone.
The TQ metric has limited informative value in terms of available bandwidth/chosen rate. The default wifi broadcast/multicast rate of 1 Mbit/s may lead to preferring low-rate 1-hop links over high-rate 2-hop links. However, this can often be fixed by increasing the mcast rate (mac80211 or madwifi support this). We should consider including rate information in future metrics.
Anyway, for now and with our current TQ metric, I strongly agree with increasing the hop penalty too.
Cheers, Simon
So, you had to reduce the default value of 10 to something even smaller? A hop penalty of 10 results in a penalty of 4% per hop, a rough equivalent of 2 lost packets (62/64). That does not sound like very much to me. Can you explain your test setup a little more?
Nevertheless, this patch was intended to get a discussion going. The main problem I have been seeing over the last few weeks is that OGM broadcasts have a hard time estimating the link quality / throughput on 11n devices. I'll also try to hack a proof of concept for an RSSI influence on the routing and see if that has a better effect.
On Friday, January 27, 2012 23:11:55 Marek Lindner wrote:
Signed-off-by: Marek Lindner lindner_marek@yahoo.de
 soft-interface.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
Thanks for all the feedback and comments. I applied the patch in revision 6a12de1. Let's see how it goes.
Regards, Marek