On Thu, Dec 12, 2024 at 5:27 PM Paolo Abeni pabeni@redhat.com wrote:
On 12/9/24 15:01, Xiao Liang wrote:
There are 4 net namespaces involved when creating links:
- source netns - where the netlink socket resides,
- target netns - where to put the device being created,
- link netns - netns associated with the device (backend),
- peer netns - netns of peer device.
Currently, two nets are passed to newlink() callback - "src_net" parameter and "dev_net" (implicitly in net_device). They are set as follows, depending on netlink attributes.
+------------+-------------------+---------+---------+ | peer netns | IFLA_LINK_NETNSID | src_net | dev_net | +------------+-------------------+---------+---------+ | | absent | source | target | | absent +-------------------+---------+---------+ | | present | link | link | +------------+-------------------+---------+---------+ | | absent | peer | target | | present +-------------------+---------+---------+ | | present | peer | link | +------------+-------------------+---------+---------+
When IFLA_LINK_NETNSID is present, the device is created in link netns first. This has some side effects, including extra ifindex allocation, ifname validation and link notifications. There's also an extra step to move the device to target netns. These could be avoided if we create it in target netns at the beginning.
On the other hand, the meaning of src_net is ambiguous. It varies depending on how parameters are passed. It is the effective link or peer netns by design, but some drivers ignore it and use dev_net instead.
This patch refactors netns handling by packing newlink() parameters into a struct, and passing source, link and peer netns as is through this struct. Fallback logic is implemented in helper functions - rtnl_newlink_link_net() and rtnl_newlink_peer_net(). If is not set, peer netns falls back to link netns, and link netns falls back to source netns. rtnl_newlink_create() now creates devices in target netns directly, so dev_net is always target netns.
For drivers that use dev_net as fallback of link_netns, current behavior is kept for compatibility.
Signed-off-by: Xiao Liang shaw.leon@gmail.com
I must admit this patch is way too huge for me to allow any reasonable review except that this has the potential of breaking a lot of things.
I think you should be splitted to make it more palatable; i.e.
- a patch just add the params struct with no semantic changes.
- a patch making the dev_change_net_namespace() conditional on net !=
tge_net[1]
- many per-device patches creating directly the device in the target
namespace.
- a patch reverting [1]
Other may have different opinions, I'd love to hear them.
Thanks. I understand your concern. Since the device is created in common code, how about splitting the patch this way:
1) make the params struct contain both current src_net and other netns: struct rtnl_newlink_params { struct net *net; // renamed from current src_net struct net *src_net; // real src_net struct net *link_net; ... }; 2) convert each driver to use the accurate netns, 3) remove "net", which is not used now, from params struct, 4) change rtnl_newlink_create() to create device in target netns directly.
So 1) will be a big one but has no semantic changes. And I will send Patch 1 in this series to the net tree instead.
diff --git a/drivers/net/amt.c b/drivers/net/amt.c index 98c6205ed19f..2f7bf50e05d2 100644 --- a/drivers/net/amt.c +++ b/drivers/net/amt.c @@ -3161,14 +3161,17 @@ static int amt_validate(struct nlattr *tb[], struct nlattr *data[], return 0; }
-static int amt_newlink(struct net *net, struct net_device *dev,
struct nlattr *tb[], struct nlattr *data[],
struct netlink_ext_ack *extack)
+static int amt_newlink(struct rtnl_newlink_params *params) {
struct net_device *dev = params->dev;
struct nlattr **tb = params->tb;
struct nlattr **data = params->data;
struct netlink_ext_ack *extack = params->extack;
struct net *link_net = rtnl_newlink_link_net(params); struct amt_dev *amt = netdev_priv(dev); int err = -EINVAL;
Minor nit: here and and many other places, please respect the reverse xmas tree order.
Will fix this.