Currently, the BATMAN_V path throughput computation takes the store & forward characteristics of WiFi into account. When an originator forwards an OGM on the same interface it was received on, the path throughput is divided by two.
Let's consider the topology below
+-------+         +-------+         +-------+
| Orig0 | <------ | Orig1 | <------ | Orig2 |
+-------+   T01   +-------+   T12   +-------+
Where Orig0's OGM is received on same WiFi (non full duplex) interface as the one used to forward it to Orig2. And where T01 is the estimated throughput for link between Orig0 and Orig1 and T12 is the one between Orig1 and Orig2. Let's note PT02 the B.A.T.M.A.N-Adv estimated throughput for the end-to-end path between Orig2 and Orig0.
In this case Orig0 broadcasts its own OGM initialized with BATADV_THROUGHPUT_MAX_VALUE. Orig1 receives it and compares it with the estimated link throughput T01. Thus Orig1 considers the path to reach Orig0 has an end-to-end throughput of T01, so far so good.
Then Orig1 first adapts the Orig0 OGM throughput to T01/2 then forwards it on same interface it received it. Orig2 receives it and first thing Orig2 does is checking if T12 is lower than the received OGM throughput (i.e. T01/2), and if that is the case T12 is considered to be the new end-to-end path throughput.
The first issue I see here is that Orig2 does not know that the path to reach Orig0 has to get the half duplex penalty because it is forwarded on the same WiFi interface on Orig1; only Orig1 knows that. Thus if T12 is lower than T01/2, T12 will be chosen as the Orig2 to Orig0 path throughput (i.e. PT02) and the half duplex penalty is lost.
The first patch of this series aims to fix that by adding a flag in OGM packets to inform Orig2 the path to reach Orig0 shares the same half duplex interface and that it has to apply the dividing by two penalty on its link throughput.
The other thing I think can be improved, is this dividing by 2 penalty. This penalty seems a bit off the expected estimation most of the time. The way I approach this half duplex penalty is by trying to compute the maximum number of bytes that can go from Orig0 to Orig2 passing through Orig1 in one second.
And because of half duplex characteristic of WiFi you can't transfer bytes from Orig0 to Orig1 and Orig1 to Orig2 simultaneously. So at the end it comes down to finding the maximum number of bytes (x) that can go from Orig0 to Orig1 and then from Orig1 to reach Orig2 within one second as below:
x / T01 + x / T12 = TotalTripTime
With x/T01 and x/T12 being the time x bytes take to go from Orig0 to Orig1 and from Orig1 to Orig2 respectively.
So by solving the above for x with TotalTripTime being 1 second: x = T01 * T12 / (T01 + T12)
Thus if T01 == T12, Orig1 takes the same time to receive bytes from Orig2 as to forward them to Orig0, so dividing by two makes sense.
But if, let's say, Orig1 forwards data to Orig0 twice as fast as it receives it from Orig2 (e.g. T12 = 3MB/s and T01 = 6MB/s), throughput can reach up to two thirds of T12 (e.g. Orig2 sends 2 MB to Orig1 taking 2/3 of a second, which is then forwarded to Orig0 taking the remaining 1/3 of a second, reaching an overall throughput of 2MB/s).
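The two cases above can be checked with a small userspace sketch. hd_two_hop() is a hypothetical helper written for illustration only; it is not part of batman-adv and it uses integer division, so results are rounded down.

```c
#include <stdint.h>

/*
 * Store & forward throughput over two links sharing one half duplex
 * interface: x / t01 + x / t12 = 1 second, so x = t01 * t12 / (t01 + t12).
 * Hypothetical helper, not a batman-adv function.
 */
uint32_t hd_two_hop(uint32_t t01, uint32_t t12)
{
	uint64_t sum = (uint64_t)t01 + t12;

	/* Both links report zero throughput: nothing can be forwarded. */
	if (sum == 0)
		return 0;

	/* 64-bit intermediate so the product cannot overflow. */
	return (uint32_t)(((uint64_t)t01 * t12) / sum);
}
```

With T01 = 6 and T12 = 3 this gives 2, and with two equal links it gives half of the link throughput, which matches the divide by two rule.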
Reasoning by recurrence the following formula can be applied to find estimated path throughput for any half duplex chain between OrigX to OrigY through OrigZ:
PTzx = PTyx * Tzy / (PTyx + Tzy)
Where PTzx and PTyx are estimated throughput for end-to-end path between OrigZ and OrigX, and OrigY and OrigX respectively. And where Tzy is the estimated throughput for link between OrigZ and OrigY.
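As a rough illustration of the recurrence, the sketch below folds it over a chain of link throughputs, assuming every hop in the chain shares the same half duplex interface. The function name and the array layout are assumptions made for this example only.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Apply PTzx = PTyx * Tzy / (PTyx + Tzy) hop by hop.
 * links[0] is the link next to the destination, links[n - 1] the link
 * next to the source. Hypothetical helper, not batman-adv code.
 */
uint32_t hd_chain_throughput(const uint32_t *links, size_t n)
{
	uint64_t pt;
	size_t i;

	if (n == 0)
		return 0;

	pt = links[0];
	for (i = 1; i < n; i++) {
		uint64_t sum = pt + links[i];

		if (sum == 0)
			return 0;

		/* pt never exceeds a u32 value, so the product fits in 64 bits. */
		pt = pt * links[i] / sum;
	}

	return (uint32_t)pt;
}
```

For a two link chain of 6 and 3 this returns 2, the same value as the single step formula.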
The second patch from this series moves from the divided by two forward penalty to the one above.
Remi Pommarel (2):
  batman-adv: Keep half duplex penalty on OGM receiving side also
  batman-adv: Better half duplex penalty estimation

 include/uapi/linux/batadv_packet.h |  8 ++++++
 net/batman-adv/bat_v_ogm.c         | 44 ++++++++++++++++++++++++++----
 net/batman-adv/types.h             |  3 ++
 3 files changed, 49 insertions(+), 6 deletions(-)
Considering the following topology:
+-------+         +-------+         +-------+
| Orig0 | <------ | Orig1 | <------ | Orig2 |
+-------+   T01   +-------+   T12   +-------+
Where T01 and T12 are throughput estimations for link between Orig0 and Orig1 and the one between Orig1 and Orig2 respectively. And where Orig1 is using the same WiFi interface to reach Orig0 and Orig2.
In this case Orig2 will receive an OGM for Orig0 from Orig1 with a throughput of T01/2 to take into account the store & forward characteristic of the WiFi interface. But if T12 is lower than T01/2 this penalty is dropped and the end-to-end throughput for the Orig2 to Orig0 path will be estimated to be T12.
This patch adds a flag in OGM packet that indicates the OGM needs half duplex penalty. Thus the node receiving it (i.e. Orig2 in the situation above) can correctly apply the penalty on its throughput.
In the case above Orig2 will receive from Orig1 an OGM for Orig0 with the BATADV_V_HALF_DUPLEX flag set, so it can use T12/2 as its end-to-end path throughput instead of T12.
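To make the difference concrete, here is a simplified userspace sketch of the receive side decision, before and with the flag. It ignores hop penalties and the EWMA link estimation, and the function names are made up for this illustration; the values T01 = 300 and T12 = 110 are only an example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Behaviour without the flag: smaller of link and OGM throughput. */
uint32_t rx_throughput_current(uint32_t ogm_tp, uint32_t link_tp)
{
	return link_tp < ogm_tp ? link_tp : ogm_tp;
}

/* Behaviour with the flag: halve the local link first when it is set. */
uint32_t rx_throughput_flagged(uint32_t ogm_tp, uint32_t link_tp,
			       bool half_duplex)
{
	if (half_duplex && link_tp > 10)
		link_tp /= 2;

	return link_tp < ogm_tp ? link_tp : ogm_tp;
}
```

With T01 = 300 the OGM reaches Orig2 carrying 150. Without the flag Orig2 keeps 110 for the path; with the flag set it keeps 55.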
Signed-off-by: Remi Pommarel <repk@triplefau.lt>
---
 include/uapi/linux/batadv_packet.h |  8 ++++++
 net/batman-adv/bat_v_ogm.c         | 42 +++++++++++++++++++++++++-----
 net/batman-adv/types.h             |  3 +++
 3 files changed, 47 insertions(+), 6 deletions(-)

diff --git a/include/uapi/linux/batadv_packet.h b/include/uapi/linux/batadv_packet.h
index ea4692c339ce..9c711d149a45 100644
--- a/include/uapi/linux/batadv_packet.h
+++ b/include/uapi/linux/batadv_packet.h
@@ -84,6 +84,14 @@ enum batadv_iv_flags {
 	BATADV_DIRECTLINK = 1UL << 2,
 };
 
+/**
+ * enum batadv_v_flags - flags used in B.A.T.M.A.N. V OGM2 packets
+ * @BATADV_V_HALF_DUPLEX: Half Duplex penalty should be applied to throughput
+ */
+enum batadv_v_flags {
+	BATADV_V_HALF_DUPLEX = 1UL << 0,
+};
+
 /**
  * enum batadv_icmp_packettype - ICMP message types
  * @BATADV_ECHO_REPLY: success reply to BATADV_ECHO_REQUEST
diff --git a/net/batman-adv/bat_v_ogm.c b/net/batman-adv/bat_v_ogm.c
index 1d750f3cb2e4..27597f4cdf3e 100644
--- a/net/batman-adv/bat_v_ogm.c
+++ b/net/batman-adv/bat_v_ogm.c
@@ -474,12 +474,14 @@ void batadv_v_ogm_primary_iface_set(struct batadv_hard_iface *primary_iface)
 static u32 batadv_v_forward_penalty(struct batadv_priv *bat_priv,
 				    struct batadv_hard_iface *if_incoming,
 				    struct batadv_hard_iface *if_outgoing,
-				    u32 throughput)
+				    u32 throughput, bool *half_duplex)
 {
 	int if_hop_penalty = atomic_read(&if_incoming->hop_penalty);
 	int hop_penalty = atomic_read(&bat_priv->hop_penalty);
 	int hop_penalty_max = BATADV_TQ_MAX_VALUE;
 
+	*half_duplex = false;
+
 	/* Apply per hardif hop penalty */
 	throughput = throughput * (hop_penalty_max - if_hop_penalty) /
 		     hop_penalty_max;
@@ -494,8 +496,10 @@ static u32 batadv_v_forward_penalty(struct batadv_priv *bat_priv,
 	 */
 	if (throughput > 10 &&
 	    if_incoming == if_outgoing &&
-	    !(if_incoming->bat_v.flags & BATADV_FULL_DUPLEX))
+	    !(if_incoming->bat_v.flags & BATADV_FULL_DUPLEX)) {
+		*half_duplex = true;
 		return throughput / 2;
+	}
 
 	/* hop penalty of 255 equals 100% */
 	return throughput * (hop_penalty_max - hop_penalty) / hop_penalty_max;
@@ -573,6 +577,9 @@ static void batadv_v_ogm_forward(struct batadv_priv *bat_priv,
 
 	/* apply forward penalty */
 	ogm_forward = (struct batadv_ogm2_packet *)skb_buff;
+	ogm_forward->flags &= ~BATADV_V_HALF_DUPLEX;
+	if (neigh_ifinfo->bat_v.half_duplex)
+		ogm_forward->flags |= BATADV_V_HALF_DUPLEX;
 	ogm_forward->throughput = htonl(neigh_ifinfo->bat_v.throughput);
 	ogm_forward->ttl--;
 
@@ -615,6 +622,7 @@ static int batadv_v_ogm_metric_update(struct batadv_priv *bat_priv,
 	bool protection_started = false;
 	int ret = -EINVAL;
 	u32 path_throughput;
+	bool half_duplex;
 	s32 seq_diff;
 
 	orig_ifinfo = batadv_orig_ifinfo_new(orig_node, if_outgoing);
@@ -656,10 +664,12 @@ static int batadv_v_ogm_metric_update(struct batadv_priv *bat_priv,
 
 	path_throughput = batadv_v_forward_penalty(bat_priv, if_incoming,
 						   if_outgoing,
-						   ntohl(ogm2->throughput));
+						   ntohl(ogm2->throughput),
+						   &half_duplex);
 	neigh_ifinfo->bat_v.throughput = path_throughput;
 	neigh_ifinfo->bat_v.last_seqno = ntohl(ogm2->seqno);
 	neigh_ifinfo->last_ttl = ogm2->ttl;
+	neigh_ifinfo->bat_v.half_duplex = half_duplex;
 
 	if (seq_diff > 0 || protection_started)
 		ret = 1;
@@ -842,6 +852,26 @@ batadv_v_ogm_aggr_packet(int buff_pos, int packet_len,
 	       (next_buff_pos <= BATADV_MAX_AGGREGATION_BYTES);
 }
 
+/**
+ * batadv_v_get_throughput() - Compute path throughput from received OGM
+ * @ogm: OGM2 packet received
+ * @neigh: Neighbour OGM packet has been received from
+ * @return: Estimated path throughput
+ */
+static u32 batadv_v_get_throughput(struct batadv_ogm2_packet *ogm,
+				   struct batadv_hardif_neigh_node *neigh)
+{
+	u32 oth, lth;
+
+	oth = ntohl(ogm->throughput);
+	lth = ewma_throughput_read(&neigh->bat_v.throughput);
+
+	if ((ogm->flags & BATADV_V_HALF_DUPLEX) && lth > 10)
+		lth /= 2;
+
+	return min_t(u32, lth, oth);
+}
+
 /**
  * batadv_v_ogm_process() - process an incoming batman v OGM
  * @skb: the skb containing the OGM
@@ -858,7 +888,7 @@ static void batadv_v_ogm_process(const struct sk_buff *skb, int ogm_offset,
 	struct batadv_neigh_node *neigh_node = NULL;
 	struct batadv_hard_iface *hard_iface;
 	struct batadv_ogm2_packet *ogm_packet;
-	u32 ogm_throughput, link_throughput, path_throughput;
+	u32 ogm_throughput, path_throughput;
 	int ret;
 
 	ethhdr = eth_hdr(skb);
@@ -911,9 +941,9 @@ static void batadv_v_ogm_process(const struct sk_buff *skb, int ogm_offset,
 	 *   neighbor) the path throughput metric equals the link throughput.
 	 * - For OGMs traversing more than hop the path throughput metric is
 	 *   the smaller of the path throughput and the link throughput.
+	 * - Also apply Half Duplex interfaces penalty
 	 */
-	link_throughput = ewma_throughput_read(&hardif_neigh->bat_v.throughput);
-	path_throughput = min_t(u32, link_throughput, ogm_throughput);
+	path_throughput = batadv_v_get_throughput(ogm_packet, hardif_neigh);
 	ogm_packet->throughput = htonl(path_throughput);
 
 	batadv_v_ogm_process_per_outif(bat_priv, ethhdr, ogm_packet, orig_node,
diff --git a/net/batman-adv/types.h b/net/batman-adv/types.h
index 2be5d4a712c5..147b1595d32a 100644
--- a/net/batman-adv/types.h
+++ b/net/batman-adv/types.h
@@ -708,6 +708,9 @@ struct batadv_neigh_ifinfo_bat_v {
 
 	/** @last_seqno: last sequence number known for this neighbor */
 	u32 last_seqno;
+
+	/** @half_duplex: throughput should suffer half duplex penalty */
+	bool half_duplex;
 };
 
 /**
Let's consider the below topology
+-------+               +-------+          +-------+
| OrigA | <--- ... ---- | OrigB | <------- | OrigC |
+-------+     PT_ab     +-------+  LT_bc   +-------+
Where OrigA's OGM is received on same WiFi (non full duplex) interface as the one used to forward it to OrigC. And where LT_bc is the estimated throughput for the direct link between OrigB and OrigC. And where PT_ab is the end-to-end B.A.T.M.A.N-Adv path throughput estimation of OrigB to reach OrigA.
Let's note PT_ac the B.A.T.M.A.N-Adv path throughput estimation of OrigC to reach OrigA in this topology.
PT_ac was estimated by dividing by two the minimal value between PT_ab and LT_bc because of store & forward characteristic of OrigB wifi interface.
However the following formula seems to be a more realistic approximation of PT_ac:
PT_ac = PT_ab * LT_bc / (PT_ab * LT_bc)
This patch changes the half duplex penalty to match the formula above.
NB: OrigB still sets the PT_ab/2 throughput in OrigA's OGM before forwarding it to OrigC for backward compatibility's sake; OrigC undoes this halving when it computes the new estimated end-to-end path throughput.
Signed-off-by: Remi Pommarel <repk@triplefau.lt>
---
 net/batman-adv/bat_v_ogm.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/net/batman-adv/bat_v_ogm.c b/net/batman-adv/bat_v_ogm.c
index 27597f4cdf3e..9b7d4de182d0 100644
--- a/net/batman-adv/bat_v_ogm.c
+++ b/net/batman-adv/bat_v_ogm.c
@@ -866,10 +866,12 @@ static u32 batadv_v_get_throughput(struct batadv_ogm2_packet *ogm,
 	oth = ntohl(ogm->throughput);
 	lth = ewma_throughput_read(&neigh->bat_v.throughput);
 
-	if ((ogm->flags & BATADV_V_HALF_DUPLEX) && lth > 10)
-		lth /= 2;
+	if (!(ogm->flags & BATADV_V_HALF_DUPLEX))
+		return min_t(u32, lth, oth);
 
-	return min_t(u32, lth, oth);
+	/* OGM throughput was divided by two for retrocompatibility sake */
+	oth *= 2;
+	return oth * lth / (oth + lth);
 }
 
 /**
Hi,
Thanks for taking your time to look into this and the detailed explanations!
Generally, the issues both patches try to address make sense to me.
On Thu, Sep 28, 2023 at 02:39:36PM +0200, Remi Pommarel wrote:
Let's consider the below topology
[...]
However the following formula seems to be a more realistic approximation of PT_ac:
PT_ac = PT_ab * LT_bc / (PT_ab * LT_bc)
Typo, I guess, as this would always be 1? What is actually implemented makes more sense to me.
[...]
- return min_t(u32, lth, oth);
- /* OGM throughput was divided by two for retrocompatibility sake */
- oth *= 2;
- return oth * lth / (oth + lth);
Could we end up here with a (forged?) OGM that has both the new half duplex flag set and a throughput value of 0? While also having an lth of 0, therefore dividing by zero here?
In the following scenario:
+-------+  ch.1  +-------+  ch.2   +-------+  ch.2   +-------+
| Orig0 | <----- | Orig1 | <------ | Orig2 | <------ | Orig3 |
+-------+  300   +-------+  30000  +-------+   110   +-------+
                     ^                                   |
                     |               ch.3                |
                     +-----------------------------------+
                                      100
Would the results on Orig2 to Orig1 be these? - via Orig2: 300*110 / (300+110) = 80.5 - via Orig1: 100 <- selected
While it should have been this? - via Orig2: 30000*110 / (30000+110) = 109.6 <- selected - via Orig1: 100
But we can't calculate the latter on Orig3, because we don't know the two hop neighbor link throughput? Or am I missing something?
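The two sets of numbers in this scenario can be reproduced with the small sketch below. hd_combine() is a hypothetical helper that only mirrors the proposed formula in floating point; it is not taken from the patch.

```c
/*
 * Proposed half duplex combination of an upstream throughput and the
 * local link throughput. Hypothetical helper for illustration.
 */
double hd_combine(double path_tp, double link_tp)
{
	double sum = path_tp + link_tp;

	return sum > 0.0 ? path_tp * link_tp / sum : 0.0;
}
```

hd_combine(300, 110) is about 80.5, which is what Orig3 can compute from the path throughput carried in the OGM. hd_combine(30000, 110) is about 109.6, but it needs the 30000 link value, which the OGM received by Orig3 does not carry.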
Also, this seems to assume that time slices are divided equally. That's probably only true for WiFi drivers that have airtime fairness changes integrated? So only recent versions of mt76, ath9k and ath10k? Has anyone verified that this works fine not only in AP but also in 11s mode?
And a third concern, but we'd probably have this issue with both our current approach and your suggestion: Would we be off again with 802.11be and its "Multi-Link Operation" in the future?
Regards, Linus
On Sat, Oct 14, 2023 at 07:10:28AM +0200, Linus Lüssing wrote:
In the following scenario:
+-------+  ch.1  +-------+  ch.2   +-------+  ch.2   +-------+
| Orig0 | <----- | Orig1 | <------ | Orig2 | <------ | Orig3 |
+-------+  300   +-------+  30000  +-------+   110   +-------+
                     ^                                   |
                     |               ch.3                |
                     +-----------------------------------+
                                      100
Would the results on Orig2 to Orig1 be these?
Sorry, I meant "on Orig3 to Orig0" here.
- via Orig2: 300*110 / (300+110) = 80.5
- via Orig1: 100 <- selected
While it should have been this?
- via Orig2: 30000*110 / (30000+110) = 109.6 <- selected
- via Orig1: 100
On Sat, Oct 14, 2023 at 07:10:28AM +0200, Linus Lüssing wrote:
Hi,
Thanks for taking your time to look into this and the detailed explanations!
Generally, the issues both patches try to address make sense to me.
On Thu, Sep 28, 2023 at 02:39:36PM +0200, Remi Pommarel wrote:
Let's consider the below topology
[...]
However the following formula seems to be a more realistic approximation of PT_ac:
PT_ac = PT_ab * LT_bc / (PT_ab * LT_bc)
Typo, I guess, as this would always be 1? What is actually implemented makes more sense to me.
Correct ought to be PT_ab * LT_bc / (PT_ab + LT_bc)
[...]
- return min_t(u32, lth, oth);
- /* OGM throughput was divided by two for retrocompatibility sake */
- oth *= 2;
- return oth * lth / (oth + lth);
Could we end up here with a (forged?) OGM that has both the new half duplex flag set and a throughput value of 0? While also having an lth of 0, therefore dividing by zero here?
Yes good point will add appropriate checks for that and the other possible integer overflow if this RFC goes further.
In the following scenario:
+-------+  ch.1  +-------+  ch.2   +-------+  ch.2   +-------+
| Orig0 | <----- | Orig1 | <------ | Orig2 | <------ | Orig3 |
+-------+  300   +-------+  30000  +-------+   110   +-------+
                     ^                                   |
                     |               ch.3                |
                     +-----------------------------------+
                                      100
Would the results on Orig3 to Orig0 be these?
- via Orig2: 300*110 / (300+110) = 80.5
- via Orig1: 100 <- selected
While it should have been this?
- via Orig2: 30000*110 / (30000+110) = 109.6 <- selected
- via Orig1: 100
But we can't calculate the latter on Orig3, because we don't know the two hop neighbor link throughput? Or am I missing something?
No, good catch, thanks. I can think of a way to fix that, but it would need additional info in the OGM to store the current half duplex link speed (maybe by adding a TVLV for that). So let's first see if the idea seems sound enough to go further.
On a side note, the current implementation also has its own flaws for this scenario. Let's say you consider Orig0 to Orig3 instead and packets will also go from Orig1 to Orig3 directly instead of bouncing on Orig2.
Also, this seems to assume that time slices are divided equally. That's probably only be true for WiFi drivers that have airtime fairness changes integrated? So only recent versions of mt76, ath9k and ath10k? Has anyone verified that this works fine not only in AP but also in 11s mode?
I don't know how that would behave on setup that does not have airtime fairness changes integrated, if you think the current dividing by two approach is better maybe this can be made a configurable option but that could be tricky ?
For 11s, I have also run tests using mesh points instead of AP/STA and I have measured similar results.
And a third concern, but we'd probably have this issue with both our current and your suggestion: Would we be off again 802.11be and its "Multi-Link Operation" in the future?
I have a hard time figuring out how MLO would play along with B.A.T.M.A.N-Adv integration. Unfortunately I have no way to experiment with that yet. IIUC the link would be a mix between half and full duplex, and this would probably complicate things a bit.
Thanks a lot for your review.
On Wed Oct 18, 2023 at 9:58 PM CEST, Remi Pommarel wrote:
[...]
Also, this seems to assume that time slices are divided equally. That's probably only be true for WiFi drivers that have airtime fairness changes integrated? So only recent versions of mt76, ath9k and ath10k? Has anyone verified that this works fine not only in AP but also in 11s mode?
I don't know how that would behave on setup that does not have airtime fairness changes integrated, if you think the current dividing by two approach is better maybe this can be made a configurable option but that could be tricky ?
It seems to me that airtime fairness is something that most current drivers aim at doing. Even the mac80211 scheduler is going this route with the itxq work. So I feel like we should assume that with time, most drivers will be. And devices that do not respect airtime fairness will probably not match the current TP/2 rule either.
[...]
And a third concern, but we'd probably have this issue with both our current and your suggestion: Would we be off again 802.11be and its "Multi-Link Operation" in the future?
This, I have hard time figuring out how MLO would play along with B.A.T.M.A.N-Adv integration. Unfortunately right now I have no way to experiment that yet. IIUC the link would be a mix between half and full duplex, and this would probably complicate things a bit.
Thanks a lot for your review.
For me MLO is hard to take into account. Depending on the drivers (and probably on the firmwares mostly) we do not know if it is/will be used as a real aggregation mechanism or as a way to have 'free' roaming between multiple bands.
Moreover, currently all the path throughput estimation is based on the expected throughput that the 80211 stack gives us for an individual sta. I believe that very few drivers actually provide a value for it.
So IMHO we should do our best to have a good path estimation based on the sta estimated throughput, and it should be the mac80211 drivers job to provide us with an accurate estimated throughput for each sta on a link. And yes in the MLO case it will be a hard job indeed...
On Thu, Sep 28, 2023 at 02:39:36PM +0200, Remi Pommarel wrote:
diff --git a/net/batman-adv/bat_v_ogm.c b/net/batman-adv/bat_v_ogm.c index 27597f4cdf3e..9b7d4de182d0 100644 --- a/net/batman-adv/bat_v_ogm.c +++ b/net/batman-adv/bat_v_ogm.c @@ -866,10 +866,12 @@ static u32 batadv_v_get_throughput(struct batadv_ogm2_packet *ogm,
[...]
- return min_t(u32, lth, oth);
- /* OGM throughput was divided by two for retrocompatibility sake */
- oth *= 2;
- return oth * lth / (oth + lth);
Also looks like we'd have potential integer overflow issues here as oth, lth and the return value are all u32.
In the worst case (oth + lth) could wrap around to 0 and we'd divide by zero?
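One possible way to guard against both the division by zero and the u32 wrap-around is sketched below in userspace C. This is only an assumption about how such checks could look, not the follow-up patch: the function name is made up, and clamping the doubled OGM throughput to UINT32_MAX is a choice made here so that the 64-bit product cannot overflow. Kernel code would also need a 64-bit division helper such as div64_u64() instead of a plain division.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical guarded variant of the proposed throughput computation. */
uint32_t hd_get_throughput_guarded(uint32_t ogm_tp, uint32_t link_tp,
				   bool half_duplex)
{
	uint64_t oth = ogm_tp;
	uint64_t lth = link_tp;

	if (!half_duplex)
		return (uint32_t)(lth < oth ? lth : oth);

	/* Undo the divide by two done by the forwarding node. */
	oth *= 2;
	if (oth > UINT32_MAX)
		oth = UINT32_MAX;

	/* A zero (possibly forged) OGM plus a zero link estimate. */
	if (oth + lth == 0)
		return 0;

	/* Both operands fit in 32 bits, so the product fits in 64 bits. */
	return (uint32_t)(oth * lth / (oth + lth));
}
```

The result is never larger than the smaller of the two inputs, so the final cast back to 32 bits does not truncate.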
On Thursday, 28 September 2023 14:39:34 CEST Remi Pommarel wrote:
Then Orig1 first adapts the Orig0 OGM throughput to T01/2 then forwards it on same interface it received it. Orig2 receives it and first thing Orig2 does is checking if T12 is lower than the received OGM throughput (i.e. T01/2), and if that is the case T12 is considered to be the new end-to-end path throughput.
The first issue I see here is that Orig2 does not know the path to reach Orig0 has to get half duplex penalty because it is forwarded on same WiFi interface on Orig1, only Orig1 knows that. Thus if T12 is lower that T01/2, T12 will be chosen as the Orig2 to Orig0 path throughput (i.e PT02) and the half duplex penalty is lost.
I am not quite following where you see the problem.
The half duplex / store & forward penalty is for situations in which batman- adv has to forward packets from an interface to another. In your scenario that only is Orig1.
Why should Orig2 need to care whether Orig1 does store & forward or not?
If the direct path from Orig0 to Orig2 is better than the path over Orig1 the metric should reflect that.
Maybe you can add throughput metric values to your example and then expand on what you find problematic?
Cheers, Marek
On Thu, Sep 28, 2023 at 05:33:46PM +0200, Marek Lindner wrote:
On Thursday, 28 September 2023 14:39:34 CEST Remi Pommarel wrote:
Then Orig1 first adapts the Orig0 OGM throughput to T01/2 then forwards it on same interface it received it. Orig2 receives it and first thing Orig2 does is checking if T12 is lower than the received OGM throughput (i.e. T01/2), and if that is the case T12 is considered to be the new end-to-end path throughput.
The first issue I see here is that Orig2 does not know the path to reach Orig0 has to get half duplex penalty because it is forwarded on same WiFi interface on Orig1, only Orig1 knows that. Thus if T12 is lower that T01/2, T12 will be chosen as the Orig2 to Orig0 path throughput (i.e PT02) and the half duplex penalty is lost.
I am not quite following where you see the problem.
The half duplex / store & forward penalty is for situations in which batman- adv has to forward packets from an interface to another. In your scenario that only is Orig1.
Why should Orig2 need to care whether Orig1 does store & forward or not?
Because if Orig2 wanted to reach Orig0 through Orig1 the overall throughput would be impacted but it is not if the expected throughput of its link to Orig1 is lower than the expected throughput of the received OGM.
If the direct path from Orig0 to Orig2 is better than the path over Orig1 the metric should reflect that.
In the example there is no direct path from Orig0 to Orig2, the only way for Orig2 to reach Orig0 is by going through Orig1.
Maybe you can add throughput metric values to your example and then expand on what you find problematic?
Ok here is an example:
+-------+         +-------+         +-------+
| Orig0 | <------ | Orig1 | <------ | Orig2 |
+-------+   300   +-------+   110   +-------+
    ^                                   |
    |                                   |
    +-----------------------------------+
                     100
Let's say that :
- Orig0 and Orig1 are connected via a 200Mbps WiFi mesh link (mesh0)
- Orig1 and Orig2 are connected via a 110Mbps WiFi mesh link (mesh0)
- Orig0 and Orig2 are connected via a 100Mbps WiFi mesh link (mesh0)
With the current implementation the originator table of Orig2 will show something like the following:
$ batctl o
   Originator      last-seen ( throughput)  Nexthop         [outgoingIF]
 * Orig0-Main-Mac     0.220s (        110)  Orig1-mesh0-Mac [  mesh0 ]
   Orig0-Main-Mac     0.220s (        100)  Orig1-mesh0-Mac [  mesh0 ]
So best path for Orig2 to Orig0 would go through Orig1 with an expected throughput of 110Mbps. But such a throughput cannot be reached because Orig1 has to forward packet from and to the same WiFi interface.
If the throughput between Orig1 and Orig2 were to be 160Mbps instead of previous 110Mbps then the originator table on Orig2 will look like that:
$ batctl o
   Originator      last-seen ( throughput)  Nexthop         [outgoingIF]
   Orig0-Main-Mac     0.220s (         80)  Orig1-mesh0-Mac [  mesh0 ]
 * Orig0-Main-Mac     0.220s (        100)  Orig1-mesh0-Mac [  mesh0 ]
Best path being the direct one as it should be.
Thanks
And of course I messed up both batctl o outputs.
On Thu, Sep 28, 2023 at 06:48:21PM +0200, Remi Pommarel wrote:
On Thu, Sep 28, 2023 at 05:33:46PM +0200, Marek Lindner wrote:
Maybe you can add throughput metric values to your example and then expand on what you find problematic?
[ ... ]
$ batctl o Originator last-seen ( throughput) Nexthop [outgoingIF]
- Orig0-Main-Mac 0.220s ( 110) Orig1-mesh0-Mac [ mesh0 ] Orig0-Main-Mac 0.220s ( 100) Orig1-mesh0-Mac [ mesh0 ]
Is in fact
$ batctl o
   Originator      last-seen ( throughput)  Nexthop         [outgoingIF]
 * Orig0-Main-Mac     0.220s (        110)  Orig1-mesh0-Mac [  mesh0 ]
   Orig0-Main-Mac     0.220s (        100)  Orig0-mesh0-Mac [  mesh0 ]
(The last line nexthop was wrong)
and
So best path for Orig2 to Orig0 would go through Orig1 with an expected throughput of 110Mbps. But such a throughput cannot be reached because Orig1 has to forward packet from and to the same WiFi interface.
If the throughput between Orig1 and Orig2 were to be 160Mbps instead of previous 110Mbps then the originator table on Orig2 will look like that:
$ batctl o Originator last-seen ( throughput) Nexthop [outgoingIF] Orig0-Main-Mac 0.220s ( 80) Orig1-mesh0-Mac [ mesh0 ]
- Orig0-Main-Mac 0.220s ( 100) Orig1-mesh0-Mac [ mesh0 ]
Is in fact
$ batctl o
   Originator      last-seen ( throughput)  Nexthop         [outgoingIF]
 * Orig0-Main-Mac     0.220s (         80)  Orig1-mesh0-Mac [  mesh0 ]
   Orig0-Main-Mac     0.220s (        100)  Orig0-mesh0-Mac [  mesh0 ]
(Same error here)
Sorry about that,
On Thursday, 28 September 2023 18:48:20 CEST Remi Pommarel wrote:
If the direct path from Orig0 to Orig2 is better than the path over Orig1 the metric should reflect that.
In the example there is no direct path from Orig0 to Orig2, the only way for Orig2 to reach Orig0 is by going through Orig1.
If there is only one path, the computed metric does not matter at all.
If there are alternative paths (as you saying below "Orig0 and Orig2 are connected via a 100Mbps"), batman-adv has to find the best of the existing paths.
Let's say that :
- Orig0 and Orig1 are connected via a 200Mbps WiFi mesh link (mesh0)
- Orig1 and Orig2 are connected via a 110Mbps WiFi mesh link (mesh0)
- Orig0 and Orig2 are connected via a 100Mbps WiFi mesh link (mesh0)
With the current implementation the originator table of Orig2 will show something like the following:
$ batctl o Originator last-seen ( throughput) Nexthop [outgoingIF]
- Orig0-Main-Mac 0.220s ( 110) Orig1-mesh0-Mac [ mesh0 ] Orig0-Main-Mac 0.220s ( 100) Orig1-mesh0-Mac [ mesh0 ]
So best path for Orig2 to Orig0 would go through Orig1 with an expected throughput of 110Mbps. But such a throughput cannot be reached because Orig1 has to forward packet from and to the same WiFi interface.
Correct. Looking at your example where is the problem with the store & forward penalty?
Or in other words: What scenario are your patches aiming to improve?
Cheers, Marek
On Thu, Sep 28, 2023 at 08:10:48PM +0200, Marek Lindner wrote:
On Thursday, 28 September 2023 18:48:20 CEST Remi Pommarel wrote:
If the direct path from Orig0 to Orig2 is better than the path over Orig1 the metric should reflect that.
In the example there is no direct path from Orig0 to Orig2, the only way for Orig2 to reach Orig0 is by going through Orig1.
If there is only one path, the computed metric does not matter at all.
If there are alternative paths (as you saying below "Orig0 and Orig2 are connected via a 100Mbps"), batman-adv has to find the best of the existing paths.
Yes and it currently fails to do that as explained below.
Let's say that :
- Orig0 and Orig1 are connected via a 200Mbps WiFi mesh link (mesh0)
- Orig1 and Orig2 are connected via a 110Mbps WiFi mesh link (mesh0)
- Orig0 and Orig2 are connected via a 100Mbps WiFi mesh link (mesh0)
With the current implementation the originator table of Orig2 will show something like the following:
$ batctl o Originator last-seen ( throughput) Nexthop [outgoingIF]
- Orig0-Main-Mac 0.220s ( 110) Orig1-mesh0-Mac [ mesh0 ] Orig0-Main-Mac 0.220s ( 100) Orig1-mesh0-Mac [ mesh0 ]
So best path for Orig2 to Orig0 would go through Orig1 with an expected throughput of 110Mbps. But such a throughput cannot be reached because Orig1 has to forward packet from and to the same WiFi interface.
Correct. Looking at your example where is the problem with the store & forward penalty?
The problem is that the wrong path is selected.
The best one should be the direct one, because when going through Orig1, 110Mbps would never be reached due to the store & forward penalty on Orig1, and the real throughput will be below that of the direct path (around 80Mbps).
Or in other words: What scenario are your patches aiming to improve?
With both patches this
 * Orig0-Main-Mac     0.220s (        110)  Orig1-mesh0-Mac [  mesh0 ]
   Orig0-Main-Mac     0.220s (        100)  Orig0-mesh0-Mac [  mesh0 ]
will instead be
   Orig0-Main-Mac     0.220s (         80)  Orig1-mesh0-Mac [  mesh0 ]
 * Orig0-Main-Mac     0.220s (        100)  Orig0-mesh0-Mac [  mesh0 ]
Fixing the best path selection.
Thanks
On Thursday, 28 September 2023 21:16:36 CEST Remi Pommarel wrote:
$ batctl o Originator last-seen ( throughput) Nexthop [outgoingIF]
- Orig0-Main-Mac 0.220s ( 110) Orig1-mesh0-Mac [ mesh0 ]
Orig0-Main-Mac 0.220s ( 100) Orig1-mesh0-Mac [ mesh0 ]
So best path for Orig2 to Orig0 would go through Orig1 with an expected throughput of 110Mbps. But such a throughput cannot be reached because Orig1 has to forward packet from and to the same WiFi interface.
Correct. Looking at your example where is the problem with the store & forward penalty?
The problem is that the wrong path is selected.
The best one should be the direct one. Because going through Orig1, 110Mbps would never be bereached due to the store & forward penalty on Orig1 and the real throughput will be below the direct path (around 80Mbps).
To summarize the problem you see: A path traversing a half duplex node might not be penalized enough when the weaker throughput link lies before a stronger throughput link because the half duplex penalty is not applied before the packet is forwarded.
The underlying assumption is that this indeed is an issue in terms of (measurable) throughput. Are there any numbers / papers / experiments you are basing this on? Is the store & forward throughput limit determined by the throughput of the weakest link?
Cheers, Marek
On Tue, Oct 03, 2023 at 11:06:45PM +0200, Marek Lindner wrote:
On Thursday, 28 September 2023 21:16:36 CEST Remi Pommarel wrote:
$ batctl o Originator last-seen ( throughput) Nexthop [outgoingIF]
- Orig0-Main-Mac 0.220s ( 110) Orig1-mesh0-Mac [ mesh0 ]
Orig0-Main-Mac 0.220s ( 100) Orig1-mesh0-Mac [ mesh0 ]
So best path for Orig2 to Orig0 would go through Orig1 with an expected throughput of 110Mbps. But such a throughput cannot be reached because Orig1 has to forward packet from and to the same WiFi interface.
Correct. Looking at your example where is the problem with the store & forward penalty?
The problem is that the wrong path is selected.
The best one should be the direct one. Because going through Orig1, 110Mbps would never be bereached due to the store & forward penalty on Orig1 and the real throughput will be below the direct path (around 80Mbps).
To summarize the problem you see: A path traversing a half duplex node might not be penalized enough when the weaker throughput link lies before a stronger throughput link because the half duplex penalty is not be applied before the packet is forwarded.
Yes, in fact currently it is even not penalized at all. This is what the first patch proposes to fix.
This issue could also be looked at from a different angle, which is maybe more convincing.
Let's say there is the following setup:
sta1 <-------> AP <---------> sta2
      275Mbps       720Mbps
Then the BATMAN_V current routing algorithm is going to compute the following:
- a 275Mbps path towards sta2 on sta1
- a 137.5Mbps path towards sta1 on sta2
IMO, there is no real reason to have such an asymmetry.
While the first patch fixes this asymmetry by estimating both paths to be 137.5Mbps, the second patch is a proposal for a better throughput estimation.
The underlying assumption is that this indeed is an issue in terms of (measurable) throughput. Are there any numbers / papers / experiments you are basing this on? Is the store & forward throughput limit determined by the throughput of the weakest link?
I haven't found any paper on that matter. If you have one that shows that dividing by two is a sound estimation, I would be genuinely interested though.
However to support the theory of the second patch I did run some iperf3 tests on the setup above.
Results from iperf3 measurements:
- sta1 --> AP : 275Mbps
- AP --> sta1 : 221Mbps
- AP --> sta2 : 720Mbps
- sta2 --> AP : 704Mbps
- sta1 --> sta2 : 193Mbps
- sta2 --> sta1 : 152Mbps
The sta* --> AP and AP --> sta* asymmetry comes from the different WiFi hardware characteristics (i.e. the AP WiFi card is better at TX than RX).
Now let's say that B.A.T.M.A.N-Adv has perfect throughput estimation for direct neighbour links (e.g. sta1 <--> AP and sta2 <--> AP).
Here are the path throughput estimations with different methods for sta1 <--> sta2.
Estimation from current B.A.T.M.A.N-adv BATMAN_V:
- sta1 --> sta2 : 137.5Mbps
- sta2 --> sta1 : 221Mbps
Estimation with Patch 1:
- sta1 --> sta2 : 137.5 Mbps
- sta2 --> sta1 : 110.5 Mbps
Estimation with both patches:
- sta1 --> sta2 : 199Mbps
- sta2 --> sta1 : 168Mbps
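For reference, the "Patch 1" and "both patches" figures can be recomputed from the measured link throughputs with the sketch below. It assumes perfect link estimation and no hop penalty, as stated above, and both helper names are invented for this example.

```c
/* Patch 1: the smaller of the two links, halved (hypothetical helper). */
double hd_patch1_estimate(double first_hop, double second_hop)
{
	double min = first_hop < second_hop ? first_hop : second_hop;

	return min / 2.0;
}

/* Both patches: first * second / (first + second) (hypothetical helper). */
double hd_both_estimate(double first_hop, double second_hop)
{
	double sum = first_hop + second_hop;

	return sum > 0.0 ? first_hop * second_hop / sum : 0.0;
}
```

For sta1 --> sta2 the hops are 275 and 720, giving 137.5 and about 199. For sta2 --> sta1 the hops are 704 and 221, giving 110.5 and about 168. The measured values were 193 and 152.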
I have created a NS3 simulation test [0] that also seems to show the proposed throughput estimation is a closer estimation most of the time.
Here is an example output of this simulation:
$ ns3-dev-wifi-duplex-penalty-default --pos=10
NS3 simulated throughput sta2 ---> AP: 156.321 Mbit/s
NS3 simulated throughput AP ---> sta1: 323.139 Mbit/s
NS3 simulated throughput sta2 --> sta1: 102.888 Mbit/s
Current BATMAN_V estimated throughput sta2 --> sta1: 156.321 Mbit/s
Patch 1 estimated throughput sta2 --> sta1: 78.1603 Mbit/s
Both patches estimated throughput sta2 --> sta1: 105.355 Mbit/s
[0]: http://ix.io/4IG4
Anyway thanks a lot for your time.
b.a.t.m.a.n@lists.open-mesh.org