This patch series includes some netns-related improvements and fixes for rtnetlink, to make link creation more intuitive:

 1) Creating a link in another net namespace doesn't conflict with link names in the current one.
 2) Refactor rtnetlink link creation. Create the link in the target namespace directly.
So that
# ip link add netns ns1 link-netns ns2 tun0 type gre ...
will create tun0 in ns1, rather than creating it in ns2 and moving it to ns1. It also doesn't conflict with another interface named "tun0" in the current netns.

Patch 01 serves 1), avoiding link name conflicts between different netns.

To achieve 2), there are three main steps:

 - Patch 02 packs newlink() parameters into a struct, including the original "src_net" along with more netns context. No semantic changes are introduced.
 - Patch 03 ~ 07 convert device drivers to use the explicit netns extracted from params.
 - Patch 08 ~ 09 remove the old netns parameter, and convert rtnetlink to create the device in the target netns directly.

Patch 10 ~ 11 add some tests for link name and link netns.
BTW, please note that some issues were found in the current code:

- In amt_newlink() in drivers/net/amt.c:

    amt->net = net;
    ...
    amt->stream_dev = dev_get_by_index(net, ...
Uses net, but amt_lookup_upper_dev() only searches in dev_net. So the AMT device may not be properly deleted if it's in a different netns from lower dev.
- In gtp_newlink() in drivers/net/gtp.c:
    gtp->net = src_net;
    ...
    gn = net_generic(dev_net(dev), gtp_net_id);
    list_add_rcu(&gtp->list, &gn->gtp_dev_list);
Uses src_net, but priv is linked to list in dev_net. So it may not be properly deleted on removal of link netns.
- In pfcp_newlink() in drivers/net/pfcp.c:
    pfcp->net = net;
    ...
    pn = net_generic(dev_net(dev), pfcp_net_id);
    list_add_rcu(&pfcp->list, &pn->pfcp_dev_list);
Same as above.
- In lowpan_newlink() in net/ieee802154/6lowpan/core.c:
wdev = dev_get_by_index(dev_net(ldev), nla_get_u32(tb[IFLA_LINK]));
Looks for IFLA_LINK in dev_net, but in theory the ifindex is defined in link netns.
---
v7:
 - Add selftest kconfig.
 - Remove a duplicated test of ip6gre.

v6:
 link: https://lore.kernel.org/all/20241218130909.2173-1-shaw.leon@gmail.com/
 - Split prototype, driver and rtnetlink changes.
 - Add more tests for link netns.
 - Fix IPv6 tunnel net overwritten in ndo_init().
 - Reorder variable declarations.
 - Exclude an ip_tunnel-specific patch.

v5:
 link: https://lore.kernel.org/all/20241209140151.231257-1-shaw.leon@gmail.com/
 - Fix function doc in batman-adv.
 - Include peer_net in rtnl newlink parameters.

v4:
 link: https://lore.kernel.org/all/20241118143244.1773-1-shaw.leon@gmail.com/
 - Pack newlink() parameters to a single struct.
 - Use ynl async_msg_queue.empty() in selftest.

v3:
 link: https://lore.kernel.org/all/20241113125715.150201-1-shaw.leon@gmail.com/
 - Drop "netns_atomic" flag and module parameter. Add netns parameter to newlink() instead, and convert drivers accordingly.
 - Move python NetNSEnter helper to net selftest lib.

v2:
 link: https://lore.kernel.org/all/20241107133004.7469-1-shaw.leon@gmail.com/
 - Check NLM_F_EXCL to ensure only link creation is affected.
 - Add self tests for link name/ifindex conflict and notifications in different netns.
 - Changes in dummy driver and ynl in order to add the test case.

v1:
 link: https://lore.kernel.org/all/20241023023146.372653-1-shaw.leon@gmail.com/
Xiao Liang (11):
  rtnetlink: Lookup device in target netns when creating link
  rtnetlink: Pack newlink() params into struct
  net: Use link netns in newlink() of rtnl_link_ops
  ieee802154: 6lowpan: Use link netns in newlink() of rtnl_link_ops
  net: ip_tunnel: Use link netns in newlink() of rtnl_link_ops
  net: ipv6: Use link netns in newlink() of rtnl_link_ops
  net: xfrm: Use link netns in newlink() of rtnl_link_ops
  rtnetlink: Remove "net" from newlink params
  rtnetlink: Create link directly in target net namespace
  selftests: net: Add python context manager for netns entering
  selftests: net: Add test cases for link and peer netns
 drivers/infiniband/ulp/ipoib/ipoib_netlink.c | 11 +-
 drivers/net/amt.c | 16 +-
 drivers/net/bareudp.c | 11 +-
 drivers/net/bonding/bond_netlink.c | 8 +-
 drivers/net/can/dev/netlink.c | 4 +-
 drivers/net/can/vxcan.c | 9 +-
 .../ethernet/qualcomm/rmnet/rmnet_config.c | 11 +-
 drivers/net/geneve.c | 11 +-
 drivers/net/gtp.c | 9 +-
 drivers/net/ipvlan/ipvlan.h | 4 +-
 drivers/net/ipvlan/ipvlan_main.c | 15 +-
 drivers/net/ipvlan/ipvtap.c | 10 +-
 drivers/net/macsec.c | 15 +-
 drivers/net/macvlan.c | 8 +-
 drivers/net/macvtap.c | 11 +-
 drivers/net/netkit.c | 9 +-
 drivers/net/pfcp.c | 11 +-
 drivers/net/ppp/ppp_generic.c | 10 +-
 drivers/net/team/team_core.c | 7 +-
 drivers/net/veth.c | 9 +-
 drivers/net/vrf.c | 11 +-
 drivers/net/vxlan/vxlan_core.c | 11 +-
 drivers/net/wireguard/device.c | 11 +-
 drivers/net/wireless/virtual/virt_wifi.c | 14 +-
 drivers/net/wwan/wwan_core.c | 25 +++-
 include/net/ip_tunnels.h | 5 +-
 include/net/rtnetlink.h | 44 +++++-
 net/8021q/vlan_netlink.c | 15 +-
 net/batman-adv/soft-interface.c | 16 +-
 net/bridge/br_netlink.c | 12 +-
 net/caif/chnl_net.c | 6 +-
 net/core/rtnetlink.c | 35 +++--
 net/hsr/hsr_netlink.c | 14 +-
 net/ieee802154/6lowpan/core.c | 9 +-
 net/ipv4/ip_gre.c | 27 ++--
 net/ipv4/ip_tunnel.c | 10 +-
 net/ipv4/ip_vti.c | 10 +-
 net/ipv4/ipip.c | 14 +-
 net/ipv6/ip6_gre.c | 42 ++++--
 net/ipv6/ip6_tunnel.c | 20 ++-
 net/ipv6/ip6_vti.c | 16 +-
 net/ipv6/sit.c | 18 ++-
 net/xfrm/xfrm_interface_core.c | 15 +-
 tools/testing/selftests/net/Makefile | 1 +
 tools/testing/selftests/net/config | 5 +
 .../testing/selftests/net/lib/py/__init__.py | 2 +-
 tools/testing/selftests/net/lib/py/netns.py | 18 +++
 tools/testing/selftests/net/link_netns.py | 141 ++++++++++++++++++
 tools/testing/selftests/net/netns-name.sh | 10 ++
 49 files changed, 550 insertions(+), 226 deletions(-)
 create mode 100755 tools/testing/selftests/net/link_netns.py
When creating a link, look up the existing device in the target net namespace instead of the current one. For example, two links created by:

  # ip link add dummy1 type dummy
  # ip link add netns ns1 dummy1 type dummy

should not conflict, since they are in different namespaces.
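The selection rule can be sketched in userspace C as follows. This is only a model of the condition used in the patch, not kernel code: "struct net" here is a hypothetical stand-in, lookup_netns() is an illustrative name, and the NLM_F_* values are copied from include/uapi/linux/netlink.h.

```c
#include <assert.h>
#include <stddef.h>

/* Flag values as found in include/uapi/linux/netlink.h. */
#define NLM_F_EXCL	0x200
#define NLM_F_CREATE	0x400

/* Hypothetical stand-in for the kernel's struct net. */
struct net {
	const char *name;
};

/*
 * Pick the netns in which an existing device is looked up.
 * Only a pure creation request (NLM_F_CREATE and NLM_F_EXCL both set)
 * looks in the target netns; everything else keeps using the netns
 * of the netlink socket.
 */
static const struct net *lookup_netns(unsigned int nlmsg_flags,
				      const struct net *src_net,
				      const struct net *tgt_net)
{
	assert(src_net != NULL && tgt_net != NULL);

	if ((nlmsg_flags & NLM_F_CREATE) && (nlmsg_flags & NLM_F_EXCL))
		return tgt_net;
	return src_net;
}
```

With this rule, "ip link add netns ns1 dummy1 type dummy" (which sets both flags) checks for an existing "dummy1" in ns1 only, so a "dummy1" in the current netns is not considered a conflict.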
Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
---
 net/core/rtnetlink.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 6b745096809d..f65bd49da541 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -3852,20 +3852,26 @@ static int __rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
 {
 	struct nlattr ** const tb = tbs->tb;
 	struct net *net = sock_net(skb->sk);
+	struct net *device_net;
 	struct net_device *dev;
 	struct ifinfomsg *ifm;
 	bool link_specified;
 
+	/* When creating, lookup for existing device in target net namespace */
+	device_net = (nlh->nlmsg_flags & NLM_F_CREATE) &&
+		     (nlh->nlmsg_flags & NLM_F_EXCL) ?
+		     tgt_net : net;
+
 	ifm = nlmsg_data(nlh);
 	if (ifm->ifi_index > 0) {
 		link_specified = true;
-		dev = __dev_get_by_index(net, ifm->ifi_index);
+		dev = __dev_get_by_index(device_net, ifm->ifi_index);
 	} else if (ifm->ifi_index < 0) {
 		NL_SET_ERR_MSG(extack, "ifindex can't be negative");
 		return -EINVAL;
 	} else if (tb[IFLA_IFNAME] || tb[IFLA_ALT_IFNAME]) {
 		link_specified = true;
-		dev = rtnl_dev_get(net, tb);
+		dev = rtnl_dev_get(device_net, tb);
 	} else {
 		link_specified = false;
 		dev = NULL;
There are 4 net namespaces involved when creating links:
- source netns - where the netlink socket resides,
- target netns - where to put the device being created,
- link netns - netns associated with the device (backend),
- peer netns - netns of peer device.
Currently, two nets are passed to newlink() callback - "src_net" parameter and "dev_net" (implicitly in net_device). They are set as follows, depending on netlink attributes in the request.
+------------+-------------------+---------+---------+
| peer netns | IFLA_LINK_NETNSID | src_net | dev_net |
+------------+-------------------+---------+---------+
|            | absent            | source  | target  |
| absent     +-------------------+---------+---------+
|            | present           | link    | link    |
+------------+-------------------+---------+---------+
|            | absent            | peer    | target  |
| present    +-------------------+---------+---------+
|            | present           | peer    | link    |
+------------+-------------------+---------+---------+
When IFLA_LINK_NETNSID is present, the device is created in link netns first and then moved to target netns. This has some side effects, including extra ifindex allocation, ifname validation and link events. These could be avoided if we create it in target netns from the beginning.
On the other hand, the meaning of the src_net parameter is ambiguous. It varies depending on how parameters are passed. It is the effective link netns (or peer netns) by design, but some drivers ignore it and use dev_net instead.
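To make the table above easier to check, here is a small userspace C model of how the old "src_net" and "dev_net" end up being chosen. It only restates the table; the enum, struct and function names are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* The four netns roles involved in link creation. */
enum netns_role { NS_SOURCE, NS_TARGET, NS_LINK, NS_PEER };

/* The two nets seen by the old newlink() callback. */
struct old_nets {
	enum netns_role src_net;
	enum netns_role dev_net;
};

/* Human-readable name of a role, for printing. */
static const char *netns_role_name(enum netns_role role)
{
	static const char * const names[] = {
		"source", "target", "link", "peer",
	};

	assert((unsigned int)role <= NS_PEER);
	return names[role];
}

/* Restates the table: which netns each of the two nets refers to. */
static struct old_nets old_newlink_nets(bool has_peer_netns,
					bool has_link_netnsid)
{
	struct old_nets nets;

	/* A peer netns always wins for src_net. */
	if (has_peer_netns)
		nets.src_net = NS_PEER;
	else
		nets.src_net = has_link_netnsid ? NS_LINK : NS_SOURCE;

	/* With IFLA_LINK_NETNSID the device starts out in link netns. */
	nets.dev_net = has_link_netnsid ? NS_LINK : NS_TARGET;
	return nets;
}
```

The model makes the ambiguity visible: the same "src_net" argument can mean source, link or peer netns depending on which attributes the request carries.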
This patch packs existing newlink() parameters, along with the source netns, link netns and peer netns, into a struct. The old "src_net" is renamed to "net" to avoid confusion with the real source netns, and will be deprecated later. Uses of src_net are converted to params->net trivially.

To make the semantics clearer, two helper functions - rtnl_newlink_link_net() and rtnl_newlink_peer_net() - are provided for netns fallback logic. Peer netns falls back to link netns, and link netns falls back to source netns.
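The fallback chain can be tried out in userspace with the following sketch. It mirrors the two helpers on a cut-down params struct; "struct net", "struct newlink_params_model" and the model_*() names are hypothetical stand-ins, and portable "a ? a : b" is used in place of the GNU "a ? : b" shorthand.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct net. */
struct net {
	const char *name;
};

/* Cut-down model of the netns fields of the newlink params. */
struct newlink_params_model {
	struct net *src_net;	/* netns of the netlink socket, never NULL */
	struct net *link_net;	/* NULL if IFLA_LINK_NETNSID is absent */
	struct net *peer_net;	/* NULL if no peer netns was given */
};

/* Link netns falls back to source netns. */
static struct net *model_link_net(const struct newlink_params_model *p)
{
	assert(p != NULL && p->src_net != NULL);
	return p->link_net ? p->link_net : p->src_net;
}

/* Peer netns falls back to link netns, and then to source netns. */
static struct net *model_peer_net(const struct newlink_params_model *p)
{
	assert(p != NULL);
	return p->peer_net ? p->peer_net : model_link_net(p);
}
```

A request with neither attribute resolves both helpers to the source netns, while a request with only a link netns resolves both to that netns.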
In the following patches, to prepare for creating the link in the target netns directly:

- For device drivers that are aware of the old "src_net", its uses are replaced with one of the two helper functions.
- For those that take dev_net() as the link netns, we try params->link_net and then dev_net(), in order to maintain compatibility with the old behavior.
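The two conversion patterns differ only in what they fall back to when no link netns is given. A minimal userspace sketch, assuming a cut-down params struct (all names below are hypothetical and only illustrate the two bullets above):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct net. */
struct net {
	const char *name;
};

/* Cut-down model of the netns fields of the newlink params. */
struct newlink_params_model {
	struct net *src_net;	/* netns of the netlink socket */
	struct net *link_net;	/* NULL if IFLA_LINK_NETNSID is absent */
};

/* Drivers that honoured the old "src_net": link netns, else source netns. */
static struct net *
link_net_for_src_aware_driver(const struct newlink_params_model *p)
{
	assert(p != NULL && p->src_net != NULL);
	return p->link_net ? p->link_net : p->src_net;
}

/*
 * Drivers that used dev_net(dev) as link netns: link netns, else the
 * netns the device lives in (passed in here as dev_net).
 */
static struct net *
link_net_for_dev_net_driver(const struct newlink_params_model *p,
			    struct net *dev_net)
{
	assert(p != NULL && dev_net != NULL);
	return p->link_net ? p->link_net : dev_net;
}
```

When IFLA_LINK_NETNSID is present both patterns agree; without it, the first kind of driver keeps using the source netns and the second keeps using the device's own netns, as before.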
Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
---
 drivers/infiniband/ulp/ipoib/ipoib_netlink.c | 9 ++--
 drivers/net/amt.c | 12 +++--
 drivers/net/bareudp.c | 9 ++--
 drivers/net/bonding/bond_netlink.c | 8 ++--
 drivers/net/can/dev/netlink.c | 4 +-
 drivers/net/can/vxcan.c | 9 ++--
 .../ethernet/qualcomm/rmnet/rmnet_config.c | 9 ++--
 drivers/net/geneve.c | 9 ++--
 drivers/net/gtp.c | 7 +--
 drivers/net/ipvlan/ipvlan.h | 4 +-
 drivers/net/ipvlan/ipvlan_main.c | 13 ++++--
 drivers/net/ipvlan/ipvtap.c | 10 ++--
 drivers/net/macsec.c | 13 ++++--
 drivers/net/macvlan.c | 7 ++-
 drivers/net/macvtap.c | 11 +++--
 drivers/net/netkit.c | 9 ++--
 drivers/net/pfcp.c | 9 ++--
 drivers/net/ppp/ppp_generic.c | 8 ++--
 drivers/net/team/team_core.c | 7 +--
 drivers/net/veth.c | 9 ++--
 drivers/net/vrf.c | 11 +++--
 drivers/net/vxlan/vxlan_core.c | 9 ++--
 drivers/net/wireguard/device.c | 9 ++--
 drivers/net/wireless/virtual/virt_wifi.c | 12 +++--
 drivers/net/wwan/wwan_core.c | 25 +++++++---
 include/net/rtnetlink.h | 46 +++++++++++++++++--
 net/8021q/vlan_netlink.c | 13 ++++--
 net/batman-adv/soft-interface.c | 16 +++----
 net/bridge/br_netlink.c | 12 +++--
 net/caif/chnl_net.c | 6 +--
 net/core/rtnetlink.c | 16 +++++--
 net/hsr/hsr_netlink.c | 8 ++--
 net/ieee802154/6lowpan/core.c | 6 +--
 net/ipv4/ip_gre.c | 21 ++++++---
 net/ipv4/ip_vti.c | 7 +--
 net/ipv4/ipip.c | 11 +++--
 net/ipv6/ip6_gre.c | 24 ++++++----
 net/ipv6/ip6_tunnel.c | 7 +--
 net/ipv6/ip6_vti.c | 6 +--
 net/ipv6/sit.c | 7 +--
 net/xfrm/xfrm_interface_core.c | 7 +--
 41 files changed, 296 insertions(+), 159 deletions(-)
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_netlink.c b/drivers/infiniband/ulp/ipoib/ipoib_netlink.c index 9ad8d9856275..61f2457aab77 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_netlink.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_netlink.c @@ -97,10 +97,13 @@ static int ipoib_changelink(struct net_device *dev, struct nlattr *tb[], return ret; }
-static int ipoib_new_child_link(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int ipoib_new_child_link(struct rtnl_newlink_params *params) { + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct net *src_net = params->net; + struct nlattr **tb = params->tb; struct net_device *pdev; struct ipoib_dev_priv *ppriv; u16 child_pkey; diff --git a/drivers/net/amt.c b/drivers/net/amt.c index 98c6205ed19f..85878abb51d2 100644 --- a/drivers/net/amt.c +++ b/drivers/net/amt.c @@ -3161,13 +3161,17 @@ static int amt_validate(struct nlattr *tb[], struct nlattr *data[], return 0; }
-static int amt_newlink(struct net *net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int amt_newlink(struct rtnl_newlink_params *params) { - struct amt_dev *amt = netdev_priv(dev); + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct nlattr **tb = params->tb; + struct net *net = params->net; + struct amt_dev *amt; int err = -EINVAL;
+ amt = netdev_priv(dev); amt->net = net; amt->mode = nla_get_u32(data[IFLA_AMT_MODE]);
diff --git a/drivers/net/bareudp.c b/drivers/net/bareudp.c index 70814303aab8..4c2a50bbf7c0 100644 --- a/drivers/net/bareudp.c +++ b/drivers/net/bareudp.c @@ -698,10 +698,13 @@ static void bareudp_dellink(struct net_device *dev, struct list_head *head) unregister_netdevice_queue(dev, head); }
-static int bareudp_newlink(struct net *net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int bareudp_newlink(struct rtnl_newlink_params *params) { + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct nlattr **tb = params->tb; + struct net *net = params->net; struct bareudp_conf conf; int err;
diff --git a/drivers/net/bonding/bond_netlink.c b/drivers/net/bonding/bond_netlink.c index 2a6a424806aa..39708a778285 100644 --- a/drivers/net/bonding/bond_netlink.c +++ b/drivers/net/bonding/bond_netlink.c @@ -564,10 +564,12 @@ static int bond_changelink(struct net_device *bond_dev, struct nlattr *tb[], return 0; }
-static int bond_newlink(struct net *src_net, struct net_device *bond_dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int bond_newlink(struct rtnl_newlink_params *params) { + struct netlink_ext_ack *extack = params->extack; + struct net_device *bond_dev = params->dev; + struct nlattr **data = params->data; + struct nlattr **tb = params->tb; int err;
err = bond_changelink(bond_dev, tb, data, extack); diff --git a/drivers/net/can/dev/netlink.c b/drivers/net/can/dev/netlink.c index 01aacdcda260..52dae0e94858 100644 --- a/drivers/net/can/dev/netlink.c +++ b/drivers/net/can/dev/netlink.c @@ -624,9 +624,7 @@ static int can_fill_xstats(struct sk_buff *skb, const struct net_device *dev) return -EMSGSIZE; }
-static int can_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int can_newlink(struct rtnl_newlink_params *params) { return -EOPNOTSUPP; } diff --git a/drivers/net/can/vxcan.c b/drivers/net/can/vxcan.c index ca8811941085..5d7717c22fab 100644 --- a/drivers/net/can/vxcan.c +++ b/drivers/net/can/vxcan.c @@ -172,10 +172,13 @@ static void vxcan_setup(struct net_device *dev) /* forward declaration for rtnl_create_link() */ static struct rtnl_link_ops vxcan_link_ops;
-static int vxcan_newlink(struct net *peer_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int vxcan_newlink(struct rtnl_newlink_params *params) { + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct net *peer_net = params->net; + struct nlattr **tb = params->tb; struct vxcan_priv *priv; struct net_device *peer;
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c index f3bea196a8f9..b4834651c693 100644 --- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c +++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c @@ -117,11 +117,14 @@ static void rmnet_unregister_bridge(struct rmnet_port *port) rmnet_unregister_real_device(bridge_dev); }
-static int rmnet_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int rmnet_newlink(struct rtnl_newlink_params *params) { u32 data_format = RMNET_FLAGS_INGRESS_DEAGGREGATION; + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct net *src_net = params->net; + struct nlattr **tb = params->tb; struct net_device *real_dev; int mode = RMNET_EPMODE_VND; struct rmnet_endpoint *ep; diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c index 642155cb8315..ea0a98a513ed 100644 --- a/drivers/net/geneve.c +++ b/drivers/net/geneve.c @@ -1614,10 +1614,13 @@ static void geneve_link_config(struct net_device *dev, geneve_change_mtu(dev, ldev_mtu - info->options_len); }
-static int geneve_newlink(struct net *net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int geneve_newlink(struct rtnl_newlink_params *params) { + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct nlattr **tb = params->tb; + struct net *net = params->net; struct geneve_config cfg = { .df = GENEVE_DF_UNSET, .use_udp6_rx_checksums = false, diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c index 89a996ad8cd0..46d5734da7f3 100644 --- a/drivers/net/gtp.c +++ b/drivers/net/gtp.c @@ -1460,10 +1460,11 @@ static int gtp_create_sockets(struct gtp_dev *gtp, const struct nlattr *nla, #define GTP_TH_MAXLEN (sizeof(struct udphdr) + sizeof(struct gtp0_header)) #define GTP_IPV6_MAXLEN (sizeof(struct ipv6hdr) + GTP_TH_MAXLEN)
-static int gtp_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int gtp_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct net *src_net = params->net; unsigned int role = GTP_ROLE_GGSN; struct gtp_dev *gtp; struct gtp_net *gn; diff --git a/drivers/net/ipvlan/ipvlan.h b/drivers/net/ipvlan/ipvlan.h index 025e0c19ec25..beff25a1d6f0 100644 --- a/drivers/net/ipvlan/ipvlan.h +++ b/drivers/net/ipvlan/ipvlan.h @@ -166,9 +166,7 @@ struct ipvl_addr *ipvlan_addr_lookup(struct ipvl_port *port, void *lyr3h, void *ipvlan_get_L3_hdr(struct ipvl_port *port, struct sk_buff *skb, int *type); void ipvlan_count_rx(const struct ipvl_dev *ipvlan, unsigned int len, bool success, bool mcast); -int ipvlan_link_new(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack); +int ipvlan_link_new(struct rtnl_newlink_params *params); void ipvlan_link_delete(struct net_device *dev, struct list_head *head); void ipvlan_link_setup(struct net_device *dev); int ipvlan_link_register(struct rtnl_link_ops *ops); diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c index ee2c3cf4df36..a994fd54ada4 100644 --- a/drivers/net/ipvlan/ipvlan_main.c +++ b/drivers/net/ipvlan/ipvlan_main.c @@ -532,16 +532,21 @@ static int ipvlan_nl_fillinfo(struct sk_buff *skb, return ret; }
-int ipvlan_link_new(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +int ipvlan_link_new(struct rtnl_newlink_params *params) { - struct ipvl_dev *ipvlan = netdev_priv(dev); + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct net *src_net = params->net; + struct nlattr **tb = params->tb; + struct ipvl_dev *ipvlan; struct ipvl_port *port; struct net_device *phy_dev; int err; u16 mode = IPVLAN_MODE_L3;
+ ipvlan = netdev_priv(dev); + if (!tb[IFLA_LINK]) return -EINVAL;
diff --git a/drivers/net/ipvlan/ipvtap.c b/drivers/net/ipvlan/ipvtap.c index 1afc4c47be73..0b0c65390066 100644 --- a/drivers/net/ipvlan/ipvtap.c +++ b/drivers/net/ipvlan/ipvtap.c @@ -73,13 +73,13 @@ static void ipvtap_update_features(struct tap_dev *tap, netdev_update_features(vlan->dev); }
-static int ipvtap_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int ipvtap_newlink(struct rtnl_newlink_params *params) { - struct ipvtap_dev *vlantap = netdev_priv(dev); + struct net_device *dev = params->dev; + struct ipvtap_dev *vlantap; int err;
+ vlantap = netdev_priv(dev); INIT_LIST_HEAD(&vlantap->tap.queue_list);
/* Since macvlan supports all offloads by default, make @@ -97,7 +97,7 @@ static int ipvtap_newlink(struct net *src_net, struct net_device *dev, /* Don't put anything that may fail after macvlan_common_newlink * because we can't undo what it does. */ - err = ipvlan_link_new(src_net, dev, tb, data, extack); + err = ipvlan_link_new(params); if (err) { netdev_rx_handler_unregister(dev); return err; diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c index 1bc1e5993f56..9da111a6629c 100644 --- a/drivers/net/macsec.c +++ b/drivers/net/macsec.c @@ -4141,17 +4141,22 @@ static int macsec_add_dev(struct net_device *dev, sci_t sci, u8 icv_len)
static struct lock_class_key macsec_netdev_addr_lock_key;
-static int macsec_newlink(struct net *net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int macsec_newlink(struct rtnl_newlink_params *params) { - struct macsec_dev *macsec = macsec_priv(dev); + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct nlattr **tb = params->tb; + struct net *net = params->net; rx_handler_func_t *rx_handler; u8 icv_len = MACSEC_DEFAULT_ICV_LEN; struct net_device *real_dev; + struct macsec_dev *macsec; int err, mtu; sci_t sci;
+ macsec = macsec_priv(dev); + if (!tb[IFLA_LINK]) return -EINVAL; real_dev = __dev_get_by_index(net, nla_get_u32(tb[IFLA_LINK])); diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c index fed4fe2a4748..1915f54bd35a 100644 --- a/drivers/net/macvlan.c +++ b/drivers/net/macvlan.c @@ -1565,11 +1565,10 @@ int macvlan_common_newlink(struct net *src_net, struct net_device *dev, } EXPORT_SYMBOL_GPL(macvlan_common_newlink);
-static int macvlan_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int macvlan_newlink(struct rtnl_newlink_params *params) { - return macvlan_common_newlink(src_net, dev, tb, data, extack); + return macvlan_common_newlink(params->net, params->dev, params->tb, + params->data, params->extack); }
void macvlan_dellink(struct net_device *dev, struct list_head *head) diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c index 29a5929d48e5..e5fd8a147310 100644 --- a/drivers/net/macvtap.c +++ b/drivers/net/macvtap.c @@ -77,13 +77,13 @@ static void macvtap_update_features(struct tap_dev *tap, netdev_update_features(vlan->dev); }
-static int macvtap_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int macvtap_newlink(struct rtnl_newlink_params *params) { - struct macvtap_dev *vlantap = netdev_priv(dev); + struct net_device *dev = params->dev; + struct macvtap_dev *vlantap; int err;
+ vlantap = netdev_priv(dev); INIT_LIST_HEAD(&vlantap->tap.queue_list);
/* Since macvlan supports all offloads by default, make @@ -105,7 +105,8 @@ static int macvtap_newlink(struct net *src_net, struct net_device *dev, /* Don't put anything that may fail after macvlan_common_newlink * because we can't undo what it does. */ - err = macvlan_common_newlink(src_net, dev, tb, data, extack); + err = macvlan_common_newlink(params->net, dev, params->tb, params->data, + params->extack); if (err) { netdev_rx_handler_unregister(dev); return err; diff --git a/drivers/net/netkit.c b/drivers/net/netkit.c index c1d881dc6409..f5527bb533ab 100644 --- a/drivers/net/netkit.c +++ b/drivers/net/netkit.c @@ -327,10 +327,13 @@ static int netkit_validate(struct nlattr *tb[], struct nlattr *data[],
static struct rtnl_link_ops netkit_link_ops;
-static int netkit_new_link(struct net *peer_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int netkit_new_link(struct rtnl_newlink_params *params) { + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct net *peer_net = params->net; + struct nlattr **tb = params->tb; struct nlattr *peer_tb[IFLA_MAX + 1], **tbp = tb, *attr; enum netkit_action policy_prim = NETKIT_PASS; enum netkit_action policy_peer = NETKIT_PASS; diff --git a/drivers/net/pfcp.c b/drivers/net/pfcp.c index 69434fd13f96..cb936da99674 100644 --- a/drivers/net/pfcp.c +++ b/drivers/net/pfcp.c @@ -184,14 +184,15 @@ static int pfcp_add_sock(struct pfcp_dev *pfcp) return PTR_ERR_OR_ZERO(pfcp->sock); }
-static int pfcp_newlink(struct net *net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int pfcp_newlink(struct rtnl_newlink_params *params) { - struct pfcp_dev *pfcp = netdev_priv(dev); + struct net_device *dev = params->dev; + struct net *net = params->net; + struct pfcp_dev *pfcp; struct pfcp_net *pn; int err;
+ pfcp = netdev_priv(dev); pfcp->net = net;
err = pfcp_add_sock(pfcp); diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c index 4583e15ad03a..5b58e7bb4e7b 100644 --- a/drivers/net/ppp/ppp_generic.c +++ b/drivers/net/ppp/ppp_generic.c @@ -1303,10 +1303,12 @@ static int ppp_nl_validate(struct nlattr *tb[], struct nlattr *data[], return 0; }
-static int ppp_nl_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int ppp_nl_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct net *src_net = params->net; + struct nlattr **tb = params->tb; struct ppp_config conf = { .unit = -1, .ifname_is_set = true, diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c index c7690adec8db..1e2a40196377 100644 --- a/drivers/net/team/team_core.c +++ b/drivers/net/team/team_core.c @@ -2211,10 +2211,11 @@ static void team_setup(struct net_device *dev) dev->features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX; }
-static int team_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int team_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **tb = params->tb; + if (tb[IFLA_ADDRESS] == NULL) eth_hw_addr_random(dev);
diff --git a/drivers/net/veth.c b/drivers/net/veth.c index 01251868a9c2..04229c07023d 100644 --- a/drivers/net/veth.c +++ b/drivers/net/veth.c @@ -1765,10 +1765,13 @@ static int veth_init_queues(struct net_device *dev, struct nlattr *tb[]) return 0; }
-static int veth_newlink(struct net *peer_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int veth_newlink(struct rtnl_newlink_params *params) { + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct net *peer_net = params->net; + struct nlattr **tb = params->tb; int err; struct net_device *peer; struct veth_priv *priv; diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c index ca81b212a246..9a21bfc5bcc7 100644 --- a/drivers/net/vrf.c +++ b/drivers/net/vrf.c @@ -1677,16 +1677,19 @@ static void vrf_dellink(struct net_device *dev, struct list_head *head) unregister_netdevice_queue(dev, head); }
-static int vrf_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int vrf_newlink(struct rtnl_newlink_params *params) { - struct net_vrf *vrf = netdev_priv(dev); + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; struct netns_vrf *nn_vrf; + struct net_vrf *vrf; bool *add_fib_rules; struct net *net; int err;
+ vrf = netdev_priv(dev); + if (!data || !data[IFLA_VRF_TABLE]) { NL_SET_ERR_MSG(extack, "VRF table id is missing"); return -EINVAL; diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c index 05c10acb2a57..3d1088bf9acd 100644 --- a/drivers/net/vxlan/vxlan_core.c +++ b/drivers/net/vxlan/vxlan_core.c @@ -4393,10 +4393,13 @@ static int vxlan_nl2conf(struct nlattr *tb[], struct nlattr *data[], return 0; }
-static int vxlan_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int vxlan_newlink(struct rtnl_newlink_params *params) { + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct net *src_net = params->net; + struct nlattr **tb = params->tb; struct vxlan_config conf; int err;
diff --git a/drivers/net/wireguard/device.c b/drivers/net/wireguard/device.c index 6cf173a008e7..92aac080d2b5 100644 --- a/drivers/net/wireguard/device.c +++ b/drivers/net/wireguard/device.c @@ -307,13 +307,14 @@ static void wg_setup(struct net_device *dev) wg->dev = dev; }
-static int wg_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int wg_newlink(struct rtnl_newlink_params *params) { - struct wg_device *wg = netdev_priv(dev); + struct net_device *dev = params->dev; + struct net *src_net = params->net; + struct wg_device *wg; int ret = -ENOMEM;
+ wg = netdev_priv(dev); rcu_assign_pointer(wg->creating_net, src_net); init_rwsem(&wg->static_identity.lock); mutex_init(&wg->socket_update_lock); diff --git a/drivers/net/wireless/virtual/virt_wifi.c b/drivers/net/wireless/virtual/virt_wifi.c index 4ee374080466..d64eb03e0ac8 100644 --- a/drivers/net/wireless/virtual/virt_wifi.c +++ b/drivers/net/wireless/virtual/virt_wifi.c @@ -519,13 +519,17 @@ static rx_handler_result_t virt_wifi_rx_handler(struct sk_buff **pskb) }
/* Called with rtnl lock held. */ -static int virt_wifi_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int virt_wifi_newlink(struct rtnl_newlink_params *params) { - struct virt_wifi_netdev_priv *priv = netdev_priv(dev); + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct virt_wifi_netdev_priv *priv; + struct net *src_net = params->net; + struct nlattr **tb = params->tb; int err;
+ priv = netdev_priv(dev); + if (!tb[IFLA_LINK]) return -EINVAL;
diff --git a/drivers/net/wwan/wwan_core.c b/drivers/net/wwan/wwan_core.c index a51e2755991a..908a3db61477 100644 --- a/drivers/net/wwan/wwan_core.c +++ b/drivers/net/wwan/wwan_core.c @@ -967,15 +967,20 @@ static struct net_device *wwan_rtnl_alloc(struct nlattr *tb[], return dev; }
-static int wwan_rtnl_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int wwan_rtnl_newlink(struct rtnl_newlink_params *params) { - struct wwan_device *wwandev = wwan_dev_get_by_parent(dev->dev.parent); - u32 link_id = nla_get_u32(data[IFLA_WWAN_LINK_ID]); - struct wwan_netdev_priv *priv = netdev_priv(dev); + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct wwan_netdev_priv *priv; + struct wwan_device *wwandev; + u32 link_id; int ret;
+ wwandev = wwan_dev_get_by_parent(dev->dev.parent); + link_id = nla_get_u32(data[IFLA_WWAN_LINK_ID]); + priv = netdev_priv(dev); + if (IS_ERR(wwandev)) return PTR_ERR(wwandev);
@@ -1064,6 +1069,11 @@ static void wwan_create_default_link(struct wwan_device *wwandev, struct net_device *dev; struct nlmsghdr *nlh; struct sk_buff *msg; + struct rtnl_newlink_params params = { + .net = &init_net, + .tb = tb, + .data = data, + };
/* Forge attributes required to create a WWAN netdev. We first * build a netlink message and then parse it. This looks @@ -1105,7 +1115,8 @@ static void wwan_create_default_link(struct wwan_device *wwandev, if (WARN_ON(IS_ERR(dev))) goto unlock;
-	if (WARN_ON(wwan_rtnl_newlink(&init_net, dev, tb, data, NULL))) {
+	params.dev = dev;
+	if (WARN_ON(wwan_rtnl_newlink(&params))) {
 		free_netdev(dev);
 		goto unlock;
 	}
diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h
index bc0069a8b6ea..ed970b4568d1 100644
--- a/include/net/rtnetlink.h
+++ b/include/net/rtnetlink.h
@@ -69,6 +69,46 @@ static inline int rtnl_msg_family(const struct nlmsghdr *nlh)
 	return AF_UNSPEC;
 }
+/**
+ * struct rtnl_newlink_params - parameters of rtnl_link_ops::newlink()
+ *
+ * @net: Netns of interest
+ * @src_net: Source netns of rtnetlink socket
+ * @link_net: Link netns by IFLA_LINK_NETNSID, NULL if not specified
+ * @peer_net: Peer netns
+ * @dev: The net_device being created
+ * @tb: IFLA_* attributes
+ * @data: IFLA_INFO_DATA attributes
+ * @extack: Netlink extended ACK
+ */
+struct rtnl_newlink_params {
+	struct net *net;
+	struct net *src_net;
+	struct net *link_net;
+	struct net *peer_net;
+	struct net_device *dev;
+	struct nlattr **tb;
+	struct nlattr **data;
+	struct netlink_ext_ack *extack;
+};
+
+/* Get effective link netns from newlink params. Generally, this is link_net
+ * and falls back to src_net. But for compatibility, a driver may choose to
+ * use dev_net(dev) instead.
+ */
+static inline struct net *rtnl_newlink_link_net(struct rtnl_newlink_params *p)
+{
+	return p->link_net ? : p->src_net;
+}
+
+/* Get peer netns from newlink params. Fallback to link netns if peer netns is
+ * not specified explicitly.
+ */
+static inline struct net *rtnl_newlink_peer_net(struct rtnl_newlink_params *p)
+{
+	return p->peer_net ? : rtnl_newlink_link_net(p);
+}
+
 /**
  * struct rtnl_link_ops - rtnetlink link operations
  *
@@ -125,11 +165,7 @@ struct rtnl_link_ops {
 					    struct nlattr *data[],
 					    struct netlink_ext_ack *extack);
- int (*newlink)(struct net *src_net, - struct net_device *dev, - struct nlattr *tb[], - struct nlattr *data[], - struct netlink_ext_ack *extack); + int (*newlink)(struct rtnl_newlink_params *params); int (*changelink)(struct net_device *dev, struct nlattr *tb[], struct nlattr *data[], diff --git a/net/8021q/vlan_netlink.c b/net/8021q/vlan_netlink.c index 134419667d59..26a0f0a2ce27 100644 --- a/net/8021q/vlan_netlink.c +++ b/net/8021q/vlan_netlink.c @@ -135,16 +135,21 @@ static int vlan_changelink(struct net_device *dev, struct nlattr *tb[], return 0; }
-static int vlan_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int vlan_newlink(struct rtnl_newlink_params *params) { - struct vlan_dev_priv *vlan = vlan_dev_priv(dev); + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct net *src_net = params->net; + struct nlattr **tb = params->tb; struct net_device *real_dev; + struct vlan_dev_priv *vlan; unsigned int max_mtu; __be16 proto; int err;
+ vlan = vlan_dev_priv(dev); + if (!data[IFLA_VLAN_ID]) { NL_SET_ERR_MSG_MOD(extack, "VLAN id not specified"); return -EINVAL; diff --git a/net/batman-adv/soft-interface.c b/net/batman-adv/soft-interface.c index 2758aba47a2f..5f92a25d6b26 100644 --- a/net/batman-adv/soft-interface.c +++ b/net/batman-adv/soft-interface.c @@ -1063,22 +1063,20 @@ static int batadv_softif_validate(struct nlattr *tb[], struct nlattr *data[],
/** * batadv_softif_newlink() - pre-initialize and register new batadv link - * @src_net: the applicable net namespace - * @dev: network device to register - * @tb: IFLA_INFO_DATA netlink attributes - * @data: enum batadv_ifla_attrs attributes - * @extack: extended ACK report struct + * @params: rtnl newlink parameters * * Return: 0 if successful or error otherwise. */ -static int batadv_softif_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int batadv_softif_newlink(struct rtnl_newlink_params *params) { - struct batadv_priv *bat_priv = netdev_priv(dev); + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct batadv_priv *bat_priv; const char *algo_name; int err;
+ bat_priv = netdev_priv(dev); + if (data && data[IFLA_BATADV_ALGO_NAME]) { algo_name = nla_data(data[IFLA_BATADV_ALGO_NAME]); err = batadv_algo_select(bat_priv, algo_name); diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c index 3e0f47203f2a..362ca10607ba 100644 --- a/net/bridge/br_netlink.c +++ b/net/bridge/br_netlink.c @@ -1553,13 +1553,17 @@ static int br_changelink(struct net_device *brdev, struct nlattr *tb[], return 0; }
-static int br_dev_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int br_dev_newlink(struct rtnl_newlink_params *params) { - struct net_bridge *br = netdev_priv(dev); + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct nlattr **tb = params->tb; + struct net_bridge *br; int err;
+ br = netdev_priv(dev); + err = register_netdevice(dev); if (err) return err; diff --git a/net/caif/chnl_net.c b/net/caif/chnl_net.c index 94ad09e36df2..748e38908709 100644 --- a/net/caif/chnl_net.c +++ b/net/caif/chnl_net.c @@ -438,10 +438,10 @@ static void caif_netlink_parms(struct nlattr *data[], } }
-static int ipcaif_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int ipcaif_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **data = params->data; int ret; struct chnl_net *caifdev; ASSERT_RTNL(); diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index f65bd49da541..f902b8a5189f 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -3757,6 +3757,15 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm, struct net_device *dev; char ifname[IFNAMSIZ]; int err; + struct rtnl_newlink_params params = { + .net = net, + .src_net = net, + .link_net = link_net, + .peer_net = peer_net, + .tb = tb, + .data = data, + .extack = extack, + };
if (!ops->alloc && !ops->setup) return -EOPNOTSUPP; @@ -3776,14 +3785,15 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm, }
dev->ifindex = ifm->ifi_index; + params.dev = dev;
if (link_net) - net = link_net; + params.net = link_net; if (peer_net) - net = peer_net; + params.net = peer_net;
 	if (ops->newlink)
-		err = ops->newlink(net, dev, tb, data, extack);
+		err = ops->newlink(&params);
 	else
 		err = register_netdevice(dev);
 	if (err < 0) {
diff --git a/net/hsr/hsr_netlink.c b/net/hsr/hsr_netlink.c
index b68f2f71d0e1..08d38e2e2962 100644
--- a/net/hsr/hsr_netlink.c
+++ b/net/hsr/hsr_netlink.c
@@ -29,10 +29,12 @@ static const struct nla_policy hsr_policy[IFLA_HSR_MAX + 1] = {
 /* Here, it seems a netdevice has already been allocated for us, and the
  * hsr_dev_setup routine has been executed. Nice!
  */
-static int hsr_newlink(struct net *src_net, struct net_device *dev,
-		       struct nlattr *tb[], struct nlattr *data[],
-		       struct netlink_ext_ack *extack)
+static int hsr_newlink(struct rtnl_newlink_params *params)
 {
+	struct netlink_ext_ack *extack = params->extack;
+	struct net_device *dev = params->dev;
+	struct nlattr **data = params->data;
+	struct net *src_net = params->net;
 	enum hsr_version proto_version;
 	unsigned char multicast_spec;
 	u8 proto = HSR_PROTOCOL_HSR;
diff --git a/net/ieee802154/6lowpan/core.c b/net/ieee802154/6lowpan/core.c
index 175efd860f7b..c16c14807d87 100644
--- a/net/ieee802154/6lowpan/core.c
+++ b/net/ieee802154/6lowpan/core.c
@@ -129,10 +129,10 @@ static int lowpan_validate(struct nlattr *tb[], struct nlattr *data[],
 	return 0;
 }
-static int lowpan_newlink(struct net *src_net, struct net_device *ldev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int lowpan_newlink(struct rtnl_newlink_params *params) { + struct net_device *ldev = params->dev; + struct nlattr **tb = params->tb; struct net_device *wdev; int ret;
diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c index a020342f618d..71eb651e2b44 100644 --- a/net/ipv4/ip_gre.c +++ b/net/ipv4/ip_gre.c @@ -1392,10 +1392,11 @@ ipgre_newlink_encap_setup(struct net_device *dev, struct nlattr *data[]) return 0; }
-static int ipgre_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int ipgre_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct nlattr **tb = params->tb; struct ip_tunnel_parm_kern p; __u32 fwmark = 0; int err; @@ -1410,10 +1411,11 @@ static int ipgre_newlink(struct net *src_net, struct net_device *dev, return ip_tunnel_newlink(dev, tb, &p, fwmark); }
-static int erspan_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int erspan_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct nlattr **tb = params->tb; struct ip_tunnel_parm_kern p; __u32 fwmark = 0; int err; @@ -1698,6 +1700,10 @@ struct net_device *gretap_fb_dev_create(struct net *net, const char *name, LIST_HEAD(list_kill); struct ip_tunnel *t; int err; + struct rtnl_newlink_params params = { + .net = net, + .tb = tb, + };
memset(&tb, 0, sizeof(tb));
@@ -1710,7 +1716,8 @@ struct net_device *gretap_fb_dev_create(struct net *net, const char *name, t = netdev_priv(dev); t->collect_md = true;
-	err = ipgre_newlink(net, dev, tb, NULL, NULL);
+	params.dev = dev;
+	err = ipgre_newlink(&params);
 	if (err < 0) {
 		free_netdev(dev);
 		return ERR_PTR(err);
diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c
index f0b4419cef34..12ccbf34fb6c 100644
--- a/net/ipv4/ip_vti.c
+++ b/net/ipv4/ip_vti.c
@@ -575,11 +575,12 @@ static void vti_netlink_parms(struct nlattr *data[],
 		*fwmark = nla_get_u32(data[IFLA_VTI_FWMARK]);
 }
-static int vti_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int vti_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **data = params->data; struct ip_tunnel_parm_kern parms; + struct nlattr **tb = params->tb; __u32 fwmark = 0;
vti_netlink_parms(data, &parms, &fwmark); diff --git a/net/ipv4/ipip.c b/net/ipv4/ipip.c index dc0db5895e0e..3a737ea3c2e5 100644 --- a/net/ipv4/ipip.c +++ b/net/ipv4/ipip.c @@ -436,15 +436,18 @@ static void ipip_netlink_parms(struct nlattr *data[], *fwmark = nla_get_u32(data[IFLA_IPTUN_FWMARK]); }
-static int ipip_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int ipip_newlink(struct rtnl_newlink_params *params) { - struct ip_tunnel *t = netdev_priv(dev); + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct nlattr **tb = params->tb; struct ip_tunnel_encap ipencap; struct ip_tunnel_parm_kern p; + struct ip_tunnel *t; __u32 fwmark = 0;
+ t = netdev_priv(dev); + if (ip_tunnel_netlink_encap_parms(data, &ipencap)) { int err = ip_tunnel_encap_setup(t, &ipencap);
diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c index 235808cfec70..3efd51f0d7d2 100644 --- a/net/ipv6/ip6_gre.c +++ b/net/ipv6/ip6_gre.c @@ -2005,15 +2005,19 @@ static int ip6gre_newlink_common(struct net *src_net, struct net_device *dev, return err; }
-static int ip6gre_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int ip6gre_newlink(struct rtnl_newlink_params *params) { - struct ip6_tnl *nt = netdev_priv(dev); + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct net *src_net = params->net; + struct nlattr **tb = params->tb; struct net *net = dev_net(dev); struct ip6gre_net *ign; + struct ip6_tnl *nt; int err;
+ nt = netdev_priv(dev); ip6gre_netlink_parms(data, &nt->parms); ign = net_generic(net, ip6gre_net_id);
@@ -2241,15 +2245,19 @@ static void ip6erspan_tap_setup(struct net_device *dev) netif_keep_dst(dev); }
-static int ip6erspan_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int ip6erspan_newlink(struct rtnl_newlink_params *params) { - struct ip6_tnl *nt = netdev_priv(dev); + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct net *src_net = params->net; + struct nlattr **tb = params->tb; struct net *net = dev_net(dev); struct ip6gre_net *ign; + struct ip6_tnl *nt; int err;
+ nt = netdev_priv(dev); ip6gre_netlink_parms(data, &nt->parms); ip6erspan_set_version(data, &nt->parms); ign = net_generic(net, ip6gre_net_id); diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c index 48fd53b98972..f4bdbabc3246 100644 --- a/net/ipv6/ip6_tunnel.c +++ b/net/ipv6/ip6_tunnel.c @@ -2002,10 +2002,11 @@ static void ip6_tnl_netlink_parms(struct nlattr *data[], parms->fwmark = nla_get_u32(data[IFLA_IPTUN_FWMARK]); }
-static int ip6_tnl_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int ip6_tnl_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct nlattr **tb = params->tb; struct net *net = dev_net(dev); struct ip6_tnl_net *ip6n = net_generic(net, ip6_tnl_net_id); struct ip_tunnel_encap ipencap; diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c index 590737c27537..79e601e629d2 100644 --- a/net/ipv6/ip6_vti.c +++ b/net/ipv6/ip6_vti.c @@ -997,10 +997,10 @@ static void vti6_netlink_parms(struct nlattr *data[], parms->fwmark = nla_get_u32(data[IFLA_VTI_FWMARK]); }
-static int vti6_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int vti6_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **data = params->data; struct net *net = dev_net(dev); struct ip6_tnl *nt;
diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c index 39bd8951bfca..4dd1309d1eb3 100644 --- a/net/ipv6/sit.c +++ b/net/ipv6/sit.c @@ -1550,10 +1550,11 @@ static bool ipip6_netlink_6rd_parms(struct nlattr *data[], } #endif
-static int ipip6_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int ipip6_newlink(struct rtnl_newlink_params *params) { + struct net_device *dev = params->dev; + struct nlattr **data = params->data; + struct nlattr **tb = params->tb; struct net *net = dev_net(dev); struct ip_tunnel *nt; struct ip_tunnel_encap ipencap; diff --git a/net/xfrm/xfrm_interface_core.c b/net/xfrm/xfrm_interface_core.c index 98f1e2b67c76..77d50d4af4a1 100644 --- a/net/xfrm/xfrm_interface_core.c +++ b/net/xfrm/xfrm_interface_core.c @@ -814,10 +814,11 @@ static void xfrmi_netlink_parms(struct nlattr *data[], parms->collect_md = true; }
-static int xfrmi_newlink(struct net *src_net, struct net_device *dev, - struct nlattr *tb[], struct nlattr *data[], - struct netlink_ext_ack *extack) +static int xfrmi_newlink(struct rtnl_newlink_params *params) { + struct netlink_ext_ack *extack = params->extack; + struct net_device *dev = params->dev; + struct nlattr **data = params->data; struct net *net = dev_net(dev); struct xfrm_if_parms p = {}; struct xfrm_if *xi;
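For anyone following along, the fallback order of the two helpers added to include/net/rtnetlink.h can be checked outside the kernel. What follows is a minimal user-space sketch, not kernel code: struct net is reduced to a hypothetical stub, the params struct keeps only its three netns fields, check_netns_fallbacks() and the netns names are made up for the demo, and the portable `a ? a : b` form stands in for the GNU `a ? : b` shorthand used in the patch.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's struct net; only identity matters. */
struct net {
	const char *name;
};

/* Only the netns fields of struct rtnl_newlink_params, as added by the patch. */
struct rtnl_newlink_params {
	struct net *src_net;	/* netns of the rtnetlink socket */
	struct net *link_net;	/* from IFLA_LINK_NETNSID, NULL if absent */
	struct net *peer_net;	/* peer netns, NULL if absent */
};

/* Effective link netns: link_net if given, otherwise src_net. */
struct net *rtnl_newlink_link_net(struct rtnl_newlink_params *p)
{
	return p->link_net ? p->link_net : p->src_net;
}

/* Effective peer netns: peer_net if given, otherwise the link netns. */
struct net *rtnl_newlink_peer_net(struct rtnl_newlink_params *p)
{
	return p->peer_net ? p->peer_net : rtnl_newlink_link_net(p);
}

/* Walk through the three cases; the netns names are made up. */
void check_netns_fallbacks(void)
{
	struct net cur = { "cur" }, ns1 = { "ns1" }, ns2 = { "ns2" };
	struct rtnl_newlink_params p = { .src_net = &cur };

	/* Nothing specified: both helpers resolve to the socket's netns. */
	assert(rtnl_newlink_link_net(&p) == &cur);
	assert(rtnl_newlink_peer_net(&p) == &cur);

	/* Link netns given: the peer netns follows it. */
	p.link_net = &ns2;
	assert(rtnl_newlink_link_net(&p) == &ns2);
	assert(rtnl_newlink_peer_net(&p) == &ns2);

	/* Peer netns given explicitly: it wins for the peer only. */
	p.peer_net = &ns1;
	assert(rtnl_newlink_link_net(&p) == &ns2);
	assert(rtnl_newlink_peer_net(&p) == &ns1);
}
```

The sketch only models which netns each helper returns; it says nothing about which netns a given driver should prefer, and as the comment in the patch notes, a driver may still choose dev_net(dev) for compatibility.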
On Sat, 4 Jan 2025 20:57:23 +0800 Xiao Liang wrote:
> -static int amt_newlink(struct net *net, struct net_device *dev,
> -		       struct nlattr *tb[], struct nlattr *data[],
> -		       struct netlink_ext_ack *extack)
> +static int amt_newlink(struct rtnl_newlink_params *params)
>  {
> -	struct amt_dev *amt = netdev_priv(dev);
> +	struct netlink_ext_ack *extack = params->extack;
> +	struct net_device *dev = params->dev;
> +	struct nlattr **data = params->data;
> +	struct nlattr **tb = params->tb;
> +	struct net *net = params->net;
> +	struct amt_dev *amt;
IMHO you packed a little too much into the struct. Could you take the dev and the extack back out?
On Wed, Jan 8, 2025 at 4:38 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Sat, 4 Jan 2025 20:57:23 +0800 Xiao Liang wrote:
> > -static int amt_newlink(struct net *net, struct net_device *dev,
> > -		       struct nlattr *tb[], struct nlattr *data[],
> > -		       struct netlink_ext_ack *extack)
> > +static int amt_newlink(struct rtnl_newlink_params *params)
> >  {
> > -	struct amt_dev *amt = netdev_priv(dev);
> > +	struct netlink_ext_ack *extack = params->extack;
> > +	struct net_device *dev = params->dev;
> > +	struct nlattr **data = params->data;
> > +	struct nlattr **tb = params->tb;
> > +	struct net *net = params->net;
> > +	struct amt_dev *amt;
>
> IMHO you packed a little too much into the struct.
> Could you take the dev and the extack back out?
Sure. I thought you were suggesting packing them all in review of v3...
On Wed, 8 Jan 2025 16:36:26 +0800 Xiao Liang wrote:
> On Wed, Jan 8, 2025 at 4:38 AM Jakub Kicinski <kuba@kernel.org> wrote:
> > On Sat, 4 Jan 2025 20:57:23 +0800 Xiao Liang wrote:
> > > -static int amt_newlink(struct net *net, struct net_device *dev,
> > > -		       struct nlattr *tb[], struct nlattr *data[],
> > > -		       struct netlink_ext_ack *extack)
> > > +static int amt_newlink(struct rtnl_newlink_params *params)
> > >  {
> > > -	struct amt_dev *amt = netdev_priv(dev);
> > > +	struct netlink_ext_ack *extack = params->extack;
> > > +	struct net_device *dev = params->dev;
> > > +	struct nlattr **data = params->data;
> > > +	struct nlattr **tb = params->tb;
> > > +	struct net *net = params->net;
> > > +	struct amt_dev *amt;
> >
> > IMHO you packed a little too much into the struct.
> > Could you take the dev and the extack back out?
>
> Sure. I thought you were suggesting packing them all in review of v3...
Sorry about that, I wasn't very clear :(
What I had in mind was similar to how we define ethtool ops (especially the more recent ones, which take an extack), for example:
int (*set_mm)(struct net_device *dev, struct ethtool_mm_cfg *cfg, struct netlink_ext_ack *extack);
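For illustration only, this is how that shape might carry over to newlink(): dev and extack stay explicit arguments, and only the netns and attribute context remains in the struct. It is a user-space sketch with stub types, not a proposal for the final interface. example_newlink(), check_example_newlink() and the link_net field on the stub net_device are hypothetical names made up for this demo.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical user-space stubs for the kernel types involved. */
struct net { int id; };
struct nlattr { int type; };
struct netlink_ext_ack { const char *msg; };
struct net_device { struct net *link_net; /* made-up field for this demo */ };

/* With dev and extack taken back out, only netns and attributes remain. */
struct rtnl_newlink_params {
	struct net *net;
	struct net *src_net;
	struct net *link_net;
	struct net *peer_net;
	struct nlattr **tb;
	struct nlattr **data;
};

/* Callback shape modelled on set_mm(): dev first, extack last. */
struct rtnl_link_ops {
	int (*newlink)(struct net_device *dev,
		       struct rtnl_newlink_params *params,
		       struct netlink_ext_ack *extack);
};

/* Made-up driver callback: needs attributes, records the link netns. */
int example_newlink(struct net_device *dev,
		    struct rtnl_newlink_params *params,
		    struct netlink_ext_ack *extack)
{
	if (!params->tb) {
		if (extack)
			extack->msg = "attributes missing";
		return -EINVAL;
	}
	dev->link_net = params->link_net ? params->link_net : params->src_net;
	return 0;
}

/* Call the op the way the core might, once successfully and once not. */
void check_example_newlink(void)
{
	struct net cur = { 1 }, ns2 = { 2 };
	struct nlattr *tb[1] = { NULL };
	struct net_device dev = { NULL };
	struct netlink_ext_ack extack = { NULL };
	struct rtnl_newlink_params params = {
		.src_net = &cur, .link_net = &ns2, .tb = tb,
	};
	struct rtnl_link_ops ops = { .newlink = example_newlink };

	assert(ops.newlink(&dev, &params, &extack) == 0);
	assert(dev.link_net == &ns2);

	params.tb = NULL;
	assert(ops.newlink(&dev, &params, &extack) == -EINVAL);
	assert(extack.msg != NULL);
}
```

With this split, a driver callback reads dev and extack directly, as before the series, and reaches into params only for netns and attribute context.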
These netdevice drivers already use the netns parameter in their newlink() callbacks. Convert them to use rtnl_newlink_link_net() or rtnl_newlink_peer_net() for clarity, and deprecate params->net.
Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
---
 drivers/infiniband/ulp/ipoib/ipoib_netlink.c       | 4 ++--
 drivers/net/amt.c                                  | 6 +++---
 drivers/net/bareudp.c                              | 4 ++--
 drivers/net/can/vxcan.c                            | 2 +-
 drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c | 4 ++--
 drivers/net/geneve.c                               | 4 ++--
 drivers/net/gtp.c                                  | 4 ++--
 drivers/net/ipvlan/ipvlan_main.c                   | 4 ++--
 drivers/net/macsec.c                               | 4 ++--
 drivers/net/macvlan.c                              | 5 +++--
 drivers/net/macvtap.c                              | 4 ++--
 drivers/net/netkit.c                               | 2 +-
 drivers/net/pfcp.c                                 | 4 ++--
 drivers/net/ppp/ppp_generic.c                      | 4 ++--
 drivers/net/veth.c                                 | 2 +-
 drivers/net/vxlan/vxlan_core.c                     | 4 ++--
 drivers/net/wireguard/device.c                     | 4 ++--
 drivers/net/wireless/virtual/virt_wifi.c           | 4 ++--
 drivers/net/wwan/wwan_core.c                       | 2 +-
 net/8021q/vlan_netlink.c                           | 4 ++--
 net/hsr/hsr_netlink.c                              | 8 ++++----
 21 files changed, 42 insertions(+), 41 deletions(-)
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_netlink.c b/drivers/infiniband/ulp/ipoib/ipoib_netlink.c index 61f2457aab77..ac01650b0ac2 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_netlink.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_netlink.c @@ -99,10 +99,10 @@ static int ipoib_changelink(struct net_device *dev, struct nlattr *tb[],
static int ipoib_new_child_link(struct rtnl_newlink_params *params) { + struct net *link_net = rtnl_newlink_link_net(params); struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct nlattr **data = params->data; - struct net *src_net = params->net; struct nlattr **tb = params->tb; struct net_device *pdev; struct ipoib_dev_priv *ppriv; @@ -112,7 +112,7 @@ static int ipoib_new_child_link(struct rtnl_newlink_params *params) if (!tb[IFLA_LINK]) return -EINVAL;
- pdev = __dev_get_by_index(src_net, nla_get_u32(tb[IFLA_LINK])); + pdev = __dev_get_by_index(link_net, nla_get_u32(tb[IFLA_LINK])); if (!pdev || pdev->type != ARPHRD_INFINIBAND) return -ENODEV;
diff --git a/drivers/net/amt.c b/drivers/net/amt.c index 85878abb51d2..de4ea1a3f3d3 100644 --- a/drivers/net/amt.c +++ b/drivers/net/amt.c @@ -3163,16 +3163,16 @@ static int amt_validate(struct nlattr *tb[], struct nlattr *data[],
static int amt_newlink(struct rtnl_newlink_params *params) { + struct net *link_net = rtnl_newlink_link_net(params); struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct nlattr **data = params->data; struct nlattr **tb = params->tb; - struct net *net = params->net; struct amt_dev *amt; int err = -EINVAL;
amt = netdev_priv(dev); - amt->net = net; + amt->net = link_net; amt->mode = nla_get_u32(data[IFLA_AMT_MODE]);
if (data[IFLA_AMT_MAX_TUNNELS] && @@ -3187,7 +3187,7 @@ static int amt_newlink(struct rtnl_newlink_params *params) amt->hash_buckets = AMT_HSIZE; amt->nr_tunnels = 0; get_random_bytes(&amt->hash_seed, sizeof(amt->hash_seed)); - amt->stream_dev = dev_get_by_index(net, + amt->stream_dev = dev_get_by_index(link_net, nla_get_u32(data[IFLA_AMT_LINK])); if (!amt->stream_dev) { NL_SET_ERR_MSG_ATTR(extack, tb[IFLA_AMT_LINK], diff --git a/drivers/net/bareudp.c b/drivers/net/bareudp.c index 4c2a50bbf7c0..1fe5dcae38f5 100644 --- a/drivers/net/bareudp.c +++ b/drivers/net/bareudp.c @@ -700,11 +700,11 @@ static void bareudp_dellink(struct net_device *dev, struct list_head *head)
static int bareudp_newlink(struct rtnl_newlink_params *params) { + struct net *link_net = rtnl_newlink_link_net(params); struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct nlattr **data = params->data; struct nlattr **tb = params->tb; - struct net *net = params->net; struct bareudp_conf conf; int err;
@@ -712,7 +712,7 @@ static int bareudp_newlink(struct rtnl_newlink_params *params) if (err) return err;
- err = bareudp_configure(net, dev, &conf, extack); + err = bareudp_configure(link_net, dev, &conf, extack); if (err) return err;
diff --git a/drivers/net/can/vxcan.c b/drivers/net/can/vxcan.c index 5d7717c22fab..e3c52c191086 100644 --- a/drivers/net/can/vxcan.c +++ b/drivers/net/can/vxcan.c @@ -174,10 +174,10 @@ static struct rtnl_link_ops vxcan_link_ops;
static int vxcan_newlink(struct rtnl_newlink_params *params) { + struct net *peer_net = rtnl_newlink_peer_net(params); struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct nlattr **data = params->data; - struct net *peer_net = params->net; struct nlattr **tb = params->tb; struct vxcan_priv *priv; struct net_device *peer; diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c index b4834651c693..7a6b746a3b15 100644 --- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c +++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c @@ -120,10 +120,10 @@ static void rmnet_unregister_bridge(struct rmnet_port *port) static int rmnet_newlink(struct rtnl_newlink_params *params) { u32 data_format = RMNET_FLAGS_INGRESS_DEAGGREGATION; + struct net *link_net = rtnl_newlink_link_net(params); struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct nlattr **data = params->data; - struct net *src_net = params->net; struct nlattr **tb = params->tb; struct net_device *real_dev; int mode = RMNET_EPMODE_VND; @@ -137,7 +137,7 @@ static int rmnet_newlink(struct rtnl_newlink_params *params) return -EINVAL; }
- real_dev = __dev_get_by_index(src_net, nla_get_u32(tb[IFLA_LINK])); + real_dev = __dev_get_by_index(link_net, nla_get_u32(tb[IFLA_LINK])); if (!real_dev) { NL_SET_ERR_MSG_MOD(extack, "link does not exist"); return -ENODEV; diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c index ea0a98a513ed..3dec3e5aae79 100644 --- a/drivers/net/geneve.c +++ b/drivers/net/geneve.c @@ -1616,11 +1616,11 @@ static void geneve_link_config(struct net_device *dev,
static int geneve_newlink(struct rtnl_newlink_params *params) { + struct net *link_net = rtnl_newlink_link_net(params); struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct nlattr **data = params->data; struct nlattr **tb = params->tb; - struct net *net = params->net; struct geneve_config cfg = { .df = GENEVE_DF_UNSET, .use_udp6_rx_checksums = false, @@ -1634,7 +1634,7 @@ static int geneve_newlink(struct rtnl_newlink_params *params) if (err) return err;
- err = geneve_configure(net, dev, extack, &cfg); + err = geneve_configure(link_net, dev, extack, &cfg); if (err) return err;
diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c index 46d5734da7f3..50f8a0cd1d4b 100644 --- a/drivers/net/gtp.c +++ b/drivers/net/gtp.c @@ -1462,9 +1462,9 @@ static int gtp_create_sockets(struct gtp_dev *gtp, const struct nlattr *nla,
static int gtp_newlink(struct rtnl_newlink_params *params) { + struct net *link_net = rtnl_newlink_link_net(params); struct net_device *dev = params->dev; struct nlattr **data = params->data; - struct net *src_net = params->net; unsigned int role = GTP_ROLE_GGSN; struct gtp_dev *gtp; struct gtp_net *gn; @@ -1495,7 +1495,7 @@ static int gtp_newlink(struct rtnl_newlink_params *params) gtp->restart_count = nla_get_u8_default(data[IFLA_GTP_RESTART_COUNT], 0);
- gtp->net = src_net; + gtp->net = link_net;
err = gtp_hashtable_new(gtp, hashsize); if (err < 0) diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c index a994fd54ada4..7d19771383c7 100644 --- a/drivers/net/ipvlan/ipvlan_main.c +++ b/drivers/net/ipvlan/ipvlan_main.c @@ -534,10 +534,10 @@ static int ipvlan_nl_fillinfo(struct sk_buff *skb,
int ipvlan_link_new(struct rtnl_newlink_params *params) { + struct net *link_net = rtnl_newlink_link_net(params); struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct nlattr **data = params->data; - struct net *src_net = params->net; struct nlattr **tb = params->tb; struct ipvl_dev *ipvlan; struct ipvl_port *port; @@ -550,7 +550,7 @@ int ipvlan_link_new(struct rtnl_newlink_params *params) if (!tb[IFLA_LINK]) return -EINVAL;
- phy_dev = __dev_get_by_index(src_net, nla_get_u32(tb[IFLA_LINK])); + phy_dev = __dev_get_by_index(link_net, nla_get_u32(tb[IFLA_LINK])); if (!phy_dev) return -ENODEV;
diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c index 9da111a6629c..ad53a67410dc 100644 --- a/drivers/net/macsec.c +++ b/drivers/net/macsec.c @@ -4143,11 +4143,11 @@ static struct lock_class_key macsec_netdev_addr_lock_key;
static int macsec_newlink(struct rtnl_newlink_params *params) { + struct net *link_net = rtnl_newlink_link_net(params); struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct nlattr **data = params->data; struct nlattr **tb = params->tb; - struct net *net = params->net; rx_handler_func_t *rx_handler; u8 icv_len = MACSEC_DEFAULT_ICV_LEN; struct net_device *real_dev; @@ -4159,7 +4159,7 @@ static int macsec_newlink(struct rtnl_newlink_params *params)
if (!tb[IFLA_LINK]) return -EINVAL; - real_dev = __dev_get_by_index(net, nla_get_u32(tb[IFLA_LINK])); + real_dev = __dev_get_by_index(link_net, nla_get_u32(tb[IFLA_LINK])); if (!real_dev) return -ENODEV; if (real_dev->type != ARPHRD_ETHER) diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c index 1915f54bd35a..7050a061b2b9 100644 --- a/drivers/net/macvlan.c +++ b/drivers/net/macvlan.c @@ -1567,8 +1567,9 @@ EXPORT_SYMBOL_GPL(macvlan_common_newlink);
static int macvlan_newlink(struct rtnl_newlink_params *params) { - return macvlan_common_newlink(params->net, params->dev, params->tb, - params->data, params->extack); + return macvlan_common_newlink(rtnl_newlink_link_net(params), + params->dev, params->tb, params->data, + params->extack); }
void macvlan_dellink(struct net_device *dev, struct list_head *head) diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c index e5fd8a147310..01cf1efbe4c5 100644 --- a/drivers/net/macvtap.c +++ b/drivers/net/macvtap.c @@ -105,8 +105,8 @@ static int macvtap_newlink(struct rtnl_newlink_params *params) /* Don't put anything that may fail after macvlan_common_newlink * because we can't undo what it does. */ - err = macvlan_common_newlink(params->net, dev, params->tb, params->data, - params->extack); + err = macvlan_common_newlink(rtnl_newlink_link_net(params), dev, + params->tb, params->data, params->extack); if (err) { netdev_rx_handler_unregister(dev); return err; diff --git a/drivers/net/netkit.c b/drivers/net/netkit.c index f5527bb533ab..79a2c37990fd 100644 --- a/drivers/net/netkit.c +++ b/drivers/net/netkit.c @@ -329,10 +329,10 @@ static struct rtnl_link_ops netkit_link_ops;
static int netkit_new_link(struct rtnl_newlink_params *params) { + struct net *peer_net = rtnl_newlink_peer_net(params); struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct nlattr **data = params->data; - struct net *peer_net = params->net; struct nlattr **tb = params->tb; struct nlattr *peer_tb[IFLA_MAX + 1], **tbp = tb, *attr; enum netkit_action policy_prim = NETKIT_PASS; diff --git a/drivers/net/pfcp.c b/drivers/net/pfcp.c index cb936da99674..e98724a71c22 100644 --- a/drivers/net/pfcp.c +++ b/drivers/net/pfcp.c @@ -186,14 +186,14 @@ static int pfcp_add_sock(struct pfcp_dev *pfcp)
static int pfcp_newlink(struct rtnl_newlink_params *params) { + struct net *link_net = rtnl_newlink_link_net(params); struct net_device *dev = params->dev; - struct net *net = params->net; struct pfcp_dev *pfcp; struct pfcp_net *pn; int err;
pfcp = netdev_priv(dev); - pfcp->net = net; + pfcp->net = link_net;
err = pfcp_add_sock(pfcp); if (err) { diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c index 5b58e7bb4e7b..316b6d01436b 100644 --- a/drivers/net/ppp/ppp_generic.c +++ b/drivers/net/ppp/ppp_generic.c @@ -1305,9 +1305,9 @@ static int ppp_nl_validate(struct nlattr *tb[], struct nlattr *data[],
static int ppp_nl_newlink(struct rtnl_newlink_params *params) { + struct net *link_net = rtnl_newlink_link_net(params); struct net_device *dev = params->dev; struct nlattr **data = params->data; - struct net *src_net = params->net; struct nlattr **tb = params->tb; struct ppp_config conf = { .unit = -1, @@ -1345,7 +1345,7 @@ static int ppp_nl_newlink(struct rtnl_newlink_params *params) if (!tb[IFLA_IFNAME] || !nla_len(tb[IFLA_IFNAME]) || !*(char *)nla_data(tb[IFLA_IFNAME])) conf.ifname_is_set = false;
- err = ppp_dev_configure(src_net, dev, &conf); + err = ppp_dev_configure(link_net, dev, &conf);
out_unlock: mutex_unlock(&ppp_mutex); diff --git a/drivers/net/veth.c b/drivers/net/veth.c index 04229c07023d..11ee821edcd6 100644 --- a/drivers/net/veth.c +++ b/drivers/net/veth.c @@ -1767,10 +1767,10 @@ static int veth_init_queues(struct net_device *dev, struct nlattr *tb[])
static int veth_newlink(struct rtnl_newlink_params *params) { + struct net *peer_net = rtnl_newlink_peer_net(params); struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct nlattr **data = params->data; - struct net *peer_net = params->net; struct nlattr **tb = params->tb; int err; struct net_device *peer; diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c index 3d1088bf9acd..db173a1d948d 100644 --- a/drivers/net/vxlan/vxlan_core.c +++ b/drivers/net/vxlan/vxlan_core.c @@ -4395,10 +4395,10 @@ static int vxlan_nl2conf(struct nlattr *tb[], struct nlattr *data[],
static int vxlan_newlink(struct rtnl_newlink_params *params) { + struct net *link_net = rtnl_newlink_link_net(params); struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct nlattr **data = params->data; - struct net *src_net = params->net; struct nlattr **tb = params->tb; struct vxlan_config conf; int err; @@ -4407,7 +4407,7 @@ static int vxlan_newlink(struct rtnl_newlink_params *params) if (err) return err;
- return __vxlan_dev_create(src_net, dev, &conf, extack); + return __vxlan_dev_create(link_net, dev, &conf, extack); }
static int vxlan_changelink(struct net_device *dev, struct nlattr *tb[], diff --git a/drivers/net/wireguard/device.c b/drivers/net/wireguard/device.c index 92aac080d2b5..b2ba9d9c6ad3 100644 --- a/drivers/net/wireguard/device.c +++ b/drivers/net/wireguard/device.c @@ -309,13 +309,13 @@ static void wg_setup(struct net_device *dev)
static int wg_newlink(struct rtnl_newlink_params *params) { + struct net *link_net = rtnl_newlink_link_net(params); struct net_device *dev = params->dev; - struct net *src_net = params->net; struct wg_device *wg; int ret = -ENOMEM;
wg = netdev_priv(dev); - rcu_assign_pointer(wg->creating_net, src_net); + rcu_assign_pointer(wg->creating_net, link_net); init_rwsem(&wg->static_identity.lock); mutex_init(&wg->socket_update_lock); mutex_init(&wg->device_update_lock); diff --git a/drivers/net/wireless/virtual/virt_wifi.c b/drivers/net/wireless/virtual/virt_wifi.c index d64eb03e0ac8..5e7c7a1d7d5f 100644 --- a/drivers/net/wireless/virtual/virt_wifi.c +++ b/drivers/net/wireless/virtual/virt_wifi.c @@ -521,10 +521,10 @@ static rx_handler_result_t virt_wifi_rx_handler(struct sk_buff **pskb) /* Called with rtnl lock held. */ static int virt_wifi_newlink(struct rtnl_newlink_params *params) { + struct net *link_net = rtnl_newlink_link_net(params); struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct virt_wifi_netdev_priv *priv; - struct net *src_net = params->net; struct nlattr **tb = params->tb; int err;
@@ -536,7 +536,7 @@ static int virt_wifi_newlink(struct rtnl_newlink_params *params) netif_carrier_off(dev);
priv->upperdev = dev; - priv->lowerdev = __dev_get_by_index(src_net, + priv->lowerdev = __dev_get_by_index(link_net, nla_get_u32(tb[IFLA_LINK]));
if (!priv->lowerdev) diff --git a/drivers/net/wwan/wwan_core.c b/drivers/net/wwan/wwan_core.c index 908a3db61477..06a2172d1856 100644 --- a/drivers/net/wwan/wwan_core.c +++ b/drivers/net/wwan/wwan_core.c @@ -1070,7 +1070,7 @@ static void wwan_create_default_link(struct wwan_device *wwandev, struct nlmsghdr *nlh; struct sk_buff *msg; struct rtnl_newlink_params params = { - .net = &init_net, + .src_net = &init_net, .tb = tb, .data = data, }; diff --git a/net/8021q/vlan_netlink.c b/net/8021q/vlan_netlink.c index 26a0f0a2ce27..0a9930017bba 100644 --- a/net/8021q/vlan_netlink.c +++ b/net/8021q/vlan_netlink.c @@ -137,10 +137,10 @@ static int vlan_changelink(struct net_device *dev, struct nlattr *tb[],
static int vlan_newlink(struct rtnl_newlink_params *params) { + struct net *link_net = rtnl_newlink_link_net(params); struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct nlattr **data = params->data; - struct net *src_net = params->net; struct nlattr **tb = params->tb; struct net_device *real_dev; struct vlan_dev_priv *vlan; @@ -160,7 +160,7 @@ static int vlan_newlink(struct rtnl_newlink_params *params) return -EINVAL; }
- real_dev = __dev_get_by_index(src_net, nla_get_u32(tb[IFLA_LINK])); + real_dev = __dev_get_by_index(link_net, nla_get_u32(tb[IFLA_LINK])); if (!real_dev) { NL_SET_ERR_MSG_MOD(extack, "link does not exist"); return -ENODEV; diff --git a/net/hsr/hsr_netlink.c b/net/hsr/hsr_netlink.c index 08d38e2e2962..9bc564e81827 100644 --- a/net/hsr/hsr_netlink.c +++ b/net/hsr/hsr_netlink.c @@ -31,10 +31,10 @@ static const struct nla_policy hsr_policy[IFLA_HSR_MAX + 1] = { */ static int hsr_newlink(struct rtnl_newlink_params *params) { + struct net *link_net = rtnl_newlink_link_net(params); struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct nlattr **data = params->data; - struct net *src_net = params->net; enum hsr_version proto_version; unsigned char multicast_spec; u8 proto = HSR_PROTOCOL_HSR; @@ -48,7 +48,7 @@ static int hsr_newlink(struct rtnl_newlink_params *params) NL_SET_ERR_MSG_MOD(extack, "Slave1 device not specified"); return -EINVAL; } - link[0] = __dev_get_by_index(src_net, + link[0] = __dev_get_by_index(link_net, nla_get_u32(data[IFLA_HSR_SLAVE1])); if (!link[0]) { NL_SET_ERR_MSG_MOD(extack, "Slave1 does not exist"); @@ -58,7 +58,7 @@ static int hsr_newlink(struct rtnl_newlink_params *params) NL_SET_ERR_MSG_MOD(extack, "Slave2 device not specified"); return -EINVAL; } - link[1] = __dev_get_by_index(src_net, + link[1] = __dev_get_by_index(link_net, nla_get_u32(data[IFLA_HSR_SLAVE2])); if (!link[1]) { NL_SET_ERR_MSG_MOD(extack, "Slave2 does not exist"); @@ -71,7 +71,7 @@ static int hsr_newlink(struct rtnl_newlink_params *params) }
if (data[IFLA_HSR_INTERLINK]) - interlink = __dev_get_by_index(src_net, + interlink = __dev_get_by_index(link_net, nla_get_u32(data[IFLA_HSR_INTERLINK]));
if (interlink && interlink == link[0]) {
When link_net is set, use it as link netns instead of dev_net(). This prepares for rtnetlink core to create device in target netns directly, in which case the two namespaces may be different.
Signed-off-by: Xiao Liang shaw.leon@gmail.com --- net/ieee802154/6lowpan/core.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/ieee802154/6lowpan/core.c b/net/ieee802154/6lowpan/core.c index c16c14807d87..65a5c61cf38c 100644 --- a/net/ieee802154/6lowpan/core.c +++ b/net/ieee802154/6lowpan/core.c @@ -143,7 +143,8 @@ static int lowpan_newlink(struct rtnl_newlink_params *params) if (!tb[IFLA_LINK]) return -EINVAL; /* find and hold wpan device */ - wdev = dev_get_by_index(dev_net(ldev), nla_get_u32(tb[IFLA_LINK])); + wdev = dev_get_by_index(params->link_net ? : dev_net(ldev), + nla_get_u32(tb[IFLA_LINK])); if (!wdev) return -ENODEV; if (wdev->type != ARPHRD_IEEE802154) {
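The fallback used here and in the following patches, `params->link_net ? : dev_net(dev)`, can be modeled outside the kernel. A minimal Python sketch, assuming hypothetical stand-ins `Net`, `Dev`, `Params` and `effective_link_net` (none of these are kernel types or functions):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Net:
    """Hypothetical stand-in for struct net."""
    name: str


@dataclass
class Dev:
    """Hypothetical stand-in for struct net_device."""
    net: Net  # models dev_net(dev): the netns the device lives in


@dataclass
class Params:
    """Hypothetical stand-in for struct rtnl_newlink_params."""
    dev: Dev
    link_net: Optional[Net] = None  # None when no link netns was given


def effective_link_net(params: Params) -> Net:
    """Model of `params->link_net ? : dev_net(dev)`."""
    if params.link_net is not None:
        return params.link_net
    return params.dev.net


ns1, ns2 = Net("ns1"), Net("ns2")
dev = Dev(net=ns1)
print(effective_link_net(Params(dev)).name)
print(effective_link_net(Params(dev, link_net=ns2)).name)
```

With no link netns the lookup stays in the device's own netns, which matches the pre-patch behaviour; only an explicit link netns changes where IFLA_LINK is resolved.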
When link_net is set, use it as link netns instead of dev_net(). This prepares for rtnetlink core to create device in target netns directly, in which case the two namespaces may be different.
Convert common ip_tunnel_newlink() to accept an extra link netns argument. Don't overwrite ip_tunnel.net in ip_tunnel_init().
Signed-off-by: Xiao Liang shaw.leon@gmail.com --- include/net/ip_tunnels.h | 5 +++-- net/ipv4/ip_gre.c | 8 +++++--- net/ipv4/ip_tunnel.c | 10 ++++++---- net/ipv4/ip_vti.c | 3 ++- net/ipv4/ipip.c | 3 ++- 5 files changed, 18 insertions(+), 11 deletions(-)
diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h index 1aa31bdb2b31..ae1f2dda4533 100644 --- a/include/net/ip_tunnels.h +++ b/include/net/ip_tunnels.h @@ -406,8 +406,9 @@ int ip_tunnel_rcv(struct ip_tunnel *tunnel, struct sk_buff *skb, bool log_ecn_error); int ip_tunnel_changelink(struct net_device *dev, struct nlattr *tb[], struct ip_tunnel_parm_kern *p, __u32 fwmark); -int ip_tunnel_newlink(struct net_device *dev, struct nlattr *tb[], - struct ip_tunnel_parm_kern *p, __u32 fwmark); +int ip_tunnel_newlink(struct net *net, struct net_device *dev, + struct nlattr *tb[], struct ip_tunnel_parm_kern *p, + __u32 fwmark); void ip_tunnel_setup(struct net_device *dev, unsigned int net_id);
bool ip_tunnel_netlink_encap_parms(struct nlattr *data[], diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c index 71eb651e2b44..d1b712b775b6 100644 --- a/net/ipv4/ip_gre.c +++ b/net/ipv4/ip_gre.c @@ -1408,7 +1408,8 @@ static int ipgre_newlink(struct rtnl_newlink_params *params) err = ipgre_netlink_parms(dev, data, tb, &p, &fwmark); if (err < 0) return err; - return ip_tunnel_newlink(dev, tb, &p, fwmark); + return ip_tunnel_newlink(params->link_net ? : dev_net(dev), dev, tb, &p, + fwmark); }
static int erspan_newlink(struct rtnl_newlink_params *params) @@ -1427,7 +1428,8 @@ static int erspan_newlink(struct rtnl_newlink_params *params) err = erspan_netlink_parms(dev, data, tb, &p, &fwmark); if (err) return err; - return ip_tunnel_newlink(dev, tb, &p, fwmark); + return ip_tunnel_newlink(params->link_net ? : dev_net(dev), dev, tb, &p, + fwmark); }
static int ipgre_changelink(struct net_device *dev, struct nlattr *tb[], @@ -1701,7 +1703,7 @@ struct net_device *gretap_fb_dev_create(struct net *net, const char *name, struct ip_tunnel *t; int err; struct rtnl_newlink_params params = { - .net = net, + .src_net = net, .tb = tb, };
diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c index 25505f9b724c..952d2241c9b1 100644 --- a/net/ipv4/ip_tunnel.c +++ b/net/ipv4/ip_tunnel.c @@ -1213,11 +1213,11 @@ void ip_tunnel_delete_nets(struct list_head *net_list, unsigned int id, } EXPORT_SYMBOL_GPL(ip_tunnel_delete_nets);
-int ip_tunnel_newlink(struct net_device *dev, struct nlattr *tb[], - struct ip_tunnel_parm_kern *p, __u32 fwmark) +int ip_tunnel_newlink(struct net *net, struct net_device *dev, + struct nlattr *tb[], struct ip_tunnel_parm_kern *p, + __u32 fwmark) { struct ip_tunnel *nt; - struct net *net = dev_net(dev); struct ip_tunnel_net *itn; int mtu; int err; @@ -1326,7 +1326,9 @@ int ip_tunnel_init(struct net_device *dev) }
tunnel->dev = dev; - tunnel->net = dev_net(dev); + if (!tunnel->net) + tunnel->net = dev_net(dev); + strscpy(tunnel->parms.name, dev->name); iph->version = 4; iph->ihl = 5; diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c index 12ccbf34fb6c..98752b4d28ad 100644 --- a/net/ipv4/ip_vti.c +++ b/net/ipv4/ip_vti.c @@ -584,7 +584,8 @@ static int vti_newlink(struct rtnl_newlink_params *params) __u32 fwmark = 0;
vti_netlink_parms(data, &parms, &fwmark); - return ip_tunnel_newlink(dev, tb, &parms, fwmark); + return ip_tunnel_newlink(params->link_net ? : dev_net(dev), dev, tb, + &parms, fwmark); }
static int vti_changelink(struct net_device *dev, struct nlattr *tb[], diff --git a/net/ipv4/ipip.c b/net/ipv4/ipip.c index 3a737ea3c2e5..c65c8b0e838f 100644 --- a/net/ipv4/ipip.c +++ b/net/ipv4/ipip.c @@ -456,7 +456,8 @@ static int ipip_newlink(struct rtnl_newlink_params *params) }
ipip_netlink_parms(data, &p, &t->collect_md, &fwmark); - return ip_tunnel_newlink(dev, tb, &p, fwmark); + return ip_tunnel_newlink(params->link_net ? : dev_net(dev), dev, tb, &p, + fwmark); }
static int ipip_changelink(struct net_device *dev, struct nlattr *tb[],
When link_net is set, use it as link netns instead of dev_net(). This prepares for rtnetlink core to create device in target netns directly, in which case the two namespaces may be different.
Set correct netns in priv before registering device, and avoid overwriting it in ndo_init() path.
Signed-off-by: Xiao Liang shaw.leon@gmail.com --- net/ipv6/ip6_gre.c | 22 ++++++++++++---------- net/ipv6/ip6_tunnel.c | 13 ++++++++----- net/ipv6/ip6_vti.c | 10 ++++++---- net/ipv6/sit.c | 11 +++++++---- 4 files changed, 33 insertions(+), 23 deletions(-)
diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c index 3efd51f0d7d2..1d47c229068d 100644 --- a/net/ipv6/ip6_gre.c +++ b/net/ipv6/ip6_gre.c @@ -1498,7 +1498,8 @@ static int ip6gre_tunnel_init_common(struct net_device *dev) tunnel = netdev_priv(dev);
tunnel->dev = dev; - tunnel->net = dev_net(dev); + if (!tunnel->net) + tunnel->net = dev_net(dev); strcpy(tunnel->parms.name, dev->name);
ret = dst_cache_init(&tunnel->dst_cache, GFP_KERNEL); @@ -1882,7 +1883,8 @@ static int ip6erspan_tap_init(struct net_device *dev) tunnel = netdev_priv(dev);
tunnel->dev = dev; - tunnel->net = dev_net(dev); + if (!tunnel->net) + tunnel->net = dev_net(dev); strcpy(tunnel->parms.name, dev->name);
ret = dst_cache_init(&tunnel->dst_cache, GFP_KERNEL); @@ -1971,7 +1973,7 @@ static bool ip6gre_netlink_encap_parms(struct nlattr *data[], return ret; }
-static int ip6gre_newlink_common(struct net *src_net, struct net_device *dev, +static int ip6gre_newlink_common(struct net *link_net, struct net_device *dev, struct nlattr *tb[], struct nlattr *data[], struct netlink_ext_ack *extack) { @@ -1992,7 +1994,7 @@ static int ip6gre_newlink_common(struct net *src_net, struct net_device *dev, eth_hw_addr_random(dev);
nt->dev = dev; - nt->net = dev_net(dev); + nt->net = link_net;
err = register_netdevice(dev); if (err) @@ -2010,13 +2012,13 @@ static int ip6gre_newlink(struct rtnl_newlink_params *params) struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct nlattr **data = params->data; - struct net *src_net = params->net; struct nlattr **tb = params->tb; - struct net *net = dev_net(dev); struct ip6gre_net *ign; struct ip6_tnl *nt; + struct net *net; int err;
+ net = params->link_net ? : dev_net(dev); nt = netdev_priv(dev); ip6gre_netlink_parms(data, &nt->parms); ign = net_generic(net, ip6gre_net_id); @@ -2029,7 +2031,7 @@ static int ip6gre_newlink(struct rtnl_newlink_params *params) return -EEXIST; }
- err = ip6gre_newlink_common(src_net, dev, tb, data, extack); + err = ip6gre_newlink_common(net, dev, tb, data, extack); if (!err) { ip6gre_tnl_link_config(nt, !tb[IFLA_MTU]); ip6gre_tunnel_link_md(ign, nt); @@ -2250,13 +2252,13 @@ static int ip6erspan_newlink(struct rtnl_newlink_params *params) struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct nlattr **data = params->data; - struct net *src_net = params->net; struct nlattr **tb = params->tb; - struct net *net = dev_net(dev); struct ip6gre_net *ign; struct ip6_tnl *nt; + struct net *net; int err;
+ net = params->link_net ? : dev_net(dev); nt = netdev_priv(dev); ip6gre_netlink_parms(data, &nt->parms); ip6erspan_set_version(data, &nt->parms); @@ -2270,7 +2272,7 @@ static int ip6erspan_newlink(struct rtnl_newlink_params *params) return -EEXIST; }
- err = ip6gre_newlink_common(src_net, dev, tb, data, extack); + err = ip6gre_newlink_common(net, dev, tb, data, extack); if (!err) { ip6erspan_tnl_link_config(nt, !tb[IFLA_MTU]); ip6erspan_tunnel_link_md(ign, nt); diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c index f4bdbabc3246..cb09cc878dee 100644 --- a/net/ipv6/ip6_tunnel.c +++ b/net/ipv6/ip6_tunnel.c @@ -253,8 +253,7 @@ static void ip6_dev_free(struct net_device *dev) static int ip6_tnl_create2(struct net_device *dev) { struct ip6_tnl *t = netdev_priv(dev); - struct net *net = dev_net(dev); - struct ip6_tnl_net *ip6n = net_generic(net, ip6_tnl_net_id); + struct ip6_tnl_net *ip6n = net_generic(t->net, ip6_tnl_net_id); int err;
dev->rtnl_link_ops = &ip6_link_ops; @@ -1878,7 +1877,8 @@ ip6_tnl_dev_init_gen(struct net_device *dev) int t_hlen;
t->dev = dev; - t->net = dev_net(dev); + if (!t->net) + t->net = dev_net(dev);
ret = dst_cache_init(&t->dst_cache, GFP_KERNEL); if (ret) @@ -2007,13 +2007,16 @@ static int ip6_tnl_newlink(struct rtnl_newlink_params *params) struct net_device *dev = params->dev; struct nlattr **data = params->data; struct nlattr **tb = params->tb; - struct net *net = dev_net(dev); - struct ip6_tnl_net *ip6n = net_generic(net, ip6_tnl_net_id); struct ip_tunnel_encap ipencap; + struct ip6_tnl_net *ip6n; struct ip6_tnl *nt, *t; + struct net *net; int err;
+ net = params->link_net ? : dev_net(dev); + ip6n = net_generic(net, ip6_tnl_net_id); nt = netdev_priv(dev); + nt->net = net;
if (ip_tunnel_netlink_encap_parms(data, &ipencap)) { err = ip6_tnl_encap_setup(nt, &ipencap); diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c index 79e601e629d2..a3108a7464c7 100644 --- a/net/ipv6/ip6_vti.c +++ b/net/ipv6/ip6_vti.c @@ -177,8 +177,7 @@ vti6_tnl_unlink(struct vti6_net *ip6n, struct ip6_tnl *t) static int vti6_tnl_create2(struct net_device *dev) { struct ip6_tnl *t = netdev_priv(dev); - struct net *net = dev_net(dev); - struct vti6_net *ip6n = net_generic(net, vti6_net_id); + struct vti6_net *ip6n = net_generic(t->net, vti6_net_id); int err;
dev->rtnl_link_ops = &vti6_link_ops; @@ -925,7 +924,8 @@ static inline int vti6_dev_init_gen(struct net_device *dev) struct ip6_tnl *t = netdev_priv(dev);
t->dev = dev; - t->net = dev_net(dev); + if (!t->net) + t->net = dev_net(dev); netdev_hold(dev, &t->dev_tracker, GFP_KERNEL); netdev_lockdep_set_classes(dev); return 0; @@ -1001,13 +1001,15 @@ static int vti6_newlink(struct rtnl_newlink_params *params) { struct net_device *dev = params->dev; struct nlattr **data = params->data; - struct net *net = dev_net(dev); struct ip6_tnl *nt; + struct net *net;
+ net = params->link_net ? : dev_net(dev); nt = netdev_priv(dev); vti6_netlink_parms(data, &nt->parms);
nt->parms.proto = IPPROTO_IPV6; + nt->net = net;
if (vti6_locate(net, &nt->parms, 0)) return -EEXIST; diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c index 4dd1309d1eb3..8888fc51fa0b 100644 --- a/net/ipv6/sit.c +++ b/net/ipv6/sit.c @@ -201,8 +201,7 @@ static void ipip6_tunnel_clone_6rd(struct net_device *dev, struct sit_net *sitn) static int ipip6_tunnel_create(struct net_device *dev) { struct ip_tunnel *t = netdev_priv(dev); - struct net *net = dev_net(dev); - struct sit_net *sitn = net_generic(net, sit_net_id); + struct sit_net *sitn = net_generic(t->net, sit_net_id); int err;
__dev_addr_set(dev, &t->parms.iph.saddr, 4); @@ -270,6 +269,7 @@ static struct ip_tunnel *ipip6_tunnel_locate(struct net *net, nt = netdev_priv(dev);
nt->parms = *parms; + nt->net = net; if (ipip6_tunnel_create(dev) < 0) goto failed_free;
@@ -1449,7 +1449,8 @@ static int ipip6_tunnel_init(struct net_device *dev) int err;
tunnel->dev = dev; - tunnel->net = dev_net(dev); + if (!tunnel->net) + tunnel->net = dev_net(dev); strcpy(tunnel->parms.name, dev->name);
ipip6_tunnel_bind_dev(dev); @@ -1555,15 +1556,17 @@ static int ipip6_newlink(struct rtnl_newlink_params *params) struct net_device *dev = params->dev; struct nlattr **data = params->data; struct nlattr **tb = params->tb; - struct net *net = dev_net(dev); struct ip_tunnel *nt; struct ip_tunnel_encap ipencap; #ifdef CONFIG_IPV6_SIT_6RD struct ip_tunnel_6rd ip6rd; #endif + struct net *net; int err;
+ net = params->link_net ? : dev_net(dev); nt = netdev_priv(dev); + nt->net = net;
if (ip_tunnel_netlink_encap_parms(data, &ipencap)) { err = ip_tunnel_encap_setup(nt, &ipencap);
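The "set netns in priv before registering, don't overwrite it in ndo_init()" rule from this patch can be sketched as follows. This is only a model under stated assumptions: `Tunnel`, `tunnel_init`, `newlink` and `create_without_newlink` are hypothetical names, and the private area is assumed to start out with no netns set.

```python
class Tunnel:
    """Hypothetical model of the tunnel private data."""

    def __init__(self):
        self.net = None  # assumed unset until newlink or init fills it in


def tunnel_init(tunnel, dev_net):
    """Model of the ndo_init path after the patch: keep a netns set earlier."""
    if tunnel.net is None:
        tunnel.net = dev_net


def newlink(dev_net, link_net=None):
    """Model of a converted newlink(): set priv netns, then register."""
    tunnel = Tunnel()
    tunnel.net = link_net if link_net is not None else dev_net
    tunnel_init(tunnel, dev_net)  # stands in for init run during registration
    return tunnel


def create_without_newlink(dev_net):
    """Model of a creation path that never sets priv netns beforehand."""
    tunnel = Tunnel()
    tunnel_init(tunnel, dev_net)
    return tunnel


print(newlink("ns1", link_net="ns2").net)
print(newlink("ns1").net)
print(create_without_newlink("ns1").net)
```

An unconditional assignment in the init path would reset the first case to the device netns, which is what the `if (!tunnel->net)` guards in the diff avoid.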
When link_net is set, use it as link netns instead of dev_net(). This prepares for rtnetlink core to create device in target netns directly, in which case the two namespaces may be different.
Signed-off-by: Xiao Liang shaw.leon@gmail.com --- net/xfrm/xfrm_interface_core.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/xfrm/xfrm_interface_core.c b/net/xfrm/xfrm_interface_core.c index 77d50d4af4a1..d1198c63dd23 100644 --- a/net/xfrm/xfrm_interface_core.c +++ b/net/xfrm/xfrm_interface_core.c @@ -242,10 +242,9 @@ static void xfrmi_dev_free(struct net_device *dev) gro_cells_destroy(&xi->gro_cells); }
-static int xfrmi_create(struct net_device *dev) +static int xfrmi_create(struct net *net, struct net_device *dev) { struct xfrm_if *xi = netdev_priv(dev); - struct net *net = dev_net(dev); struct xfrmi_net *xfrmn = net_generic(net, xfrmi_net_id); int err;
@@ -819,11 +818,12 @@ static int xfrmi_newlink(struct rtnl_newlink_params *params) struct netlink_ext_ack *extack = params->extack; struct net_device *dev = params->dev; struct nlattr **data = params->data; - struct net *net = dev_net(dev); struct xfrm_if_parms p = {}; struct xfrm_if *xi; + struct net *net; int err;
+ net = params->link_net ? : dev_net(dev); xfrmi_netlink_parms(data, &p); if (p.collect_md) { struct xfrmi_net *xfrmn = net_generic(net, xfrmi_net_id); @@ -852,7 +852,7 @@ static int xfrmi_newlink(struct rtnl_newlink_params *params) xi->net = net; xi->dev = dev;
- err = xfrmi_create(dev); + err = xfrmi_create(net, dev); return err; }
Now that devices have been converted to use the specific netns instead of ambiguous "net", let's remove it from newlink parameters.
Signed-off-by: Xiao Liang shaw.leon@gmail.com --- include/net/rtnetlink.h | 2 -- net/core/rtnetlink.c | 6 ------ 2 files changed, 8 deletions(-)
diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h index ed970b4568d1..04fc0e91af42 100644 --- a/include/net/rtnetlink.h +++ b/include/net/rtnetlink.h @@ -72,7 +72,6 @@ static inline int rtnl_msg_family(const struct nlmsghdr *nlh) /** * struct rtnl_newlink_params - parameters of rtnl_link_ops::newlink() * - * @net: Netns of interest * @src_net: Source netns of rtnetlink socket * @link_net: Link netns by IFLA_LINK_NETNSID, NULL if not specified * @peer_net: Peer netns @@ -82,7 +81,6 @@ static inline int rtnl_msg_family(const struct nlmsghdr *nlh) * @extack: Netlink extended ACK */ struct rtnl_newlink_params { - struct net *net; struct net *src_net; struct net *link_net; struct net *peer_net; diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index f902b8a5189f..a2246bbaf2bc 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -3758,7 +3758,6 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm, char ifname[IFNAMSIZ]; int err; struct rtnl_newlink_params params = { - .net = net, .src_net = net, .link_net = link_net, .peer_net = peer_net, @@ -3787,11 +3786,6 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm, dev->ifindex = ifm->ifi_index; params.dev = dev;
- if (link_net) - params.net = link_net; - if (peer_net) - params.net = peer_net; - if (ops->newlink) err = ops->newlink(&params); else
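For reference, the selection logic removed above can be written out as a small model. This is a sketch of the deleted lines only; `legacy_params_net` is a hypothetical name, and namespaces are represented by plain strings.

```python
def legacy_params_net(src_net, link_net=None, peer_net=None):
    """Model of how the removed `params.net` field was filled in.

    It starts as the source netns and is overridden first by the link
    netns and then by the peer netns, so its meaning depends on which
    attributes the request carried.
    """
    net = src_net
    if link_net is not None:
        net = link_net
    if peer_net is not None:
        net = peer_net
    return net


for args in [("src",), ("src", "link"), ("src", None, "peer"),
             ("src", "link", "peer")]:
    print(args, "->", legacy_params_net(*args))
```

The same field could therefore mean source, link or peer netns depending on the request, which is the ambiguity the explicit `src_net`, `link_net` and `peer_net` fields remove.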
Make rtnl_newlink_create() create device in target namespace directly. Avoid extra netns change when link netns is provided.
Device drivers have been converted to be aware of link netns, that is, they no longer assume that device netns and link netns are the same when ops->newlink() is called.
Signed-off-by: Xiao Liang shaw.leon@gmail.com --- net/core/rtnetlink.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index a2246bbaf2bc..e8126007eb00 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -3776,8 +3776,8 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm, name_assign_type = NET_NAME_ENUM; }
- dev = rtnl_create_link(link_net ? : tgt_net, ifname, - name_assign_type, ops, tb, extack); + dev = rtnl_create_link(tgt_net, ifname, name_assign_type, ops, tb, + extack); if (IS_ERR(dev)) { err = PTR_ERR(dev); goto out; @@ -3798,11 +3798,6 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm, err = rtnl_configure_link(dev, ifm, portid, nlh); if (err < 0) goto out_unregister; - if (link_net) { - err = dev_change_net_namespace(dev, tgt_net, ifname); - if (err < 0) - goto out_unregister; - } if (tb[IFLA_MASTER]) { err = do_set_master(dev, nla_get_u32(tb[IFLA_MASTER]), extack); if (err)
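The difference between the old "create in link netns, then move" flow and the new direct creation can be illustrated with a toy model. This is a sketch under assumptions: namespaces are sets of interface names, `create_then_move` and `create_direct` are hypothetical names, and the event tuples are illustrative rather than actual netlink messages.

```python
def create_then_move(netns, tgt, link, name):
    """Model of the old flow: create in link netns (if given), then move."""
    events = []
    first = link if link is not None else tgt
    if name in netns[first]:
        raise ValueError(f"{name} already exists in {first}")
    netns[first].add(name)
    events.append((first, "NEWLINK"))
    if first != tgt:
        if name in netns[tgt]:
            netns[first].remove(name)  # models the unregister on failure
            raise ValueError(f"{name} already exists in {tgt}")
        netns[first].remove(name)
        netns[tgt].add(name)
        events.append((first, "DELLINK"))
        events.append((tgt, "NEWLINK"))
    return events


def create_direct(netns, tgt, name):
    """Model of the new flow: the device only ever exists in the target."""
    if name in netns[tgt]:
        raise ValueError(f"{name} already exists in {tgt}")
    netns[tgt].add(name)
    return [(tgt, "NEWLINK")]


# Old flow in empty namespaces: the link netns sees a create and a delete.
print(create_then_move({"ns1": set(), "ns2": set()}, "ns1", "ns2", "tun0"))

# A "tun0" already in the link netns blocks the old flow but not the new one.
busy = {"ns1": set(), "ns2": {"tun0"}}
try:
    create_then_move(busy, "ns1", "ns2", "tun0")
except ValueError as exc:
    print("old flow:", exc)
print("new flow:", create_direct(busy, "ns1", "tun0"))
```

In this model the new flow neither touches the link netns name space nor emits events there, which is the property the event selftest later in the series checks for.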
Change netns of current thread and switch back on context exit. For example:
with NetNSEnter("ns1"): ip("link add dummy0 type dummy")
The command will be executed in netns "ns1".
Signed-off-by: Xiao Liang shaw.leon@gmail.com --- tools/testing/selftests/net/lib/py/__init__.py | 2 +- tools/testing/selftests/net/lib/py/netns.py | 18 ++++++++++++++++++ 2 files changed, 19 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/lib/py/__init__.py b/tools/testing/selftests/net/lib/py/__init__.py index 54d8f5eba810..e2d6c7b63019 100644 --- a/tools/testing/selftests/net/lib/py/__init__.py +++ b/tools/testing/selftests/net/lib/py/__init__.py @@ -2,7 +2,7 @@
from .consts import KSRC from .ksft import * -from .netns import NetNS +from .netns import NetNS, NetNSEnter from .nsim import * from .utils import * from .ynl import NlError, YnlFamily, EthtoolFamily, NetdevFamily, RtnlFamily diff --git a/tools/testing/selftests/net/lib/py/netns.py b/tools/testing/selftests/net/lib/py/netns.py index ecff85f9074f..8e9317044eef 100644 --- a/tools/testing/selftests/net/lib/py/netns.py +++ b/tools/testing/selftests/net/lib/py/netns.py @@ -1,9 +1,12 @@ # SPDX-License-Identifier: GPL-2.0
from .utils import ip +import ctypes import random import string
+libc = ctypes.cdll.LoadLibrary('libc.so.6') +
class NetNS: def __init__(self, name=None): @@ -29,3 +32,18 @@ class NetNS:
def __repr__(self): return f"NetNS({self.name})" + + +class NetNSEnter: + def __init__(self, ns_name): + self.ns_path = f"/run/netns/{ns_name}" + + def __enter__(self): + self.saved = open("/proc/thread-self/ns/net") + with open(self.ns_path) as ns_file: + libc.setns(ns_file.fileno(), 0) + return self + + def __exit__(self, exc_type, exc_value, traceback): + libc.setns(self.saved.fileno(), 0) + self.saved.close()
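The helper above ignores the return value of setns(), so a failed switch (missing privileges, for example) would go unnoticed and the command would run in the wrong netns. A possible variant with error checking is sketched below; `checked_setns` and `CheckedNetNSEnter` are hypothetical names, not part of the patch, and the `libc` parameter exists only so the error path can be exercised without privileges.

```python
import ctypes
import os


def checked_setns(fd, nstype=0, libc=None):
    """Call setns(2) and raise OSError on failure instead of ignoring it."""
    if libc is None:
        # use_errno=True lets ctypes.get_errno() see the C errno value
        libc = ctypes.CDLL("libc.so.6", use_errno=True)
    if libc.setns(fd, nstype) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


class CheckedNetNSEnter:
    """Same idea as NetNSEnter, but failures to switch netns are reported."""

    def __init__(self, ns_name):
        self.ns_path = f"/run/netns/{ns_name}"

    def __enter__(self):
        # Keep a handle on the current netns so it can be restored on exit.
        self.saved = open("/proc/thread-self/ns/net")
        try:
            with open(self.ns_path) as ns_file:
                checked_setns(ns_file.fileno())
        except OSError:
            self.saved.close()
            raise
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        try:
            checked_setns(self.saved.fileno())
        finally:
            self.saved.close()
```

Usage would be the same as in the commit message, e.g. `with CheckedNetNSEnter("ns1"): ...`, which requires the named netns to exist and sufficient privileges.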
- Add test for creating link in another netns when a link of the same name and ifindex exists in current netns.
- Add test to verify that link is created in target netns directly - no link new/del events should be generated in link netns or current netns.
- Add test cases to verify that link-netns is set as expected for various drivers and combination of namespace-related parameters.
Signed-off-by: Xiao Liang shaw.leon@gmail.com --- tools/testing/selftests/net/Makefile | 1 + tools/testing/selftests/net/config | 5 + tools/testing/selftests/net/link_netns.py | 141 ++++++++++++++++++++++ tools/testing/selftests/net/netns-name.sh | 10 ++ 4 files changed, 157 insertions(+) create mode 100755 tools/testing/selftests/net/link_netns.py
diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile index 73ee88d6b043..df07a38f884f 100644 --- a/tools/testing/selftests/net/Makefile +++ b/tools/testing/selftests/net/Makefile @@ -35,6 +35,7 @@ TEST_PROGS += cmsg_so_mark.sh TEST_PROGS += cmsg_so_priority.sh TEST_PROGS += cmsg_time.sh cmsg_ipv6.sh TEST_PROGS += netns-name.sh +TEST_PROGS += link_netns.py TEST_PROGS += nl_netdev.py TEST_PROGS += srv6_end_dt46_l3vpn_test.sh TEST_PROGS += srv6_end_dt4_l3vpn_test.sh diff --git a/tools/testing/selftests/net/config b/tools/testing/selftests/net/config index 5b9baf708950..ab55270669ec 100644 --- a/tools/testing/selftests/net/config +++ b/tools/testing/selftests/net/config @@ -107,3 +107,8 @@ CONFIG_XFRM_INTERFACE=m CONFIG_XFRM_USER=m CONFIG_IP_NF_MATCH_RPFILTER=m CONFIG_IP6_NF_MATCH_RPFILTER=m +CONFIG_IPVLAN=m +CONFIG_CAN=m +CONFIG_CAN_DEV=m +CONFIG_CAN_VXCAN=m +CONFIG_NETKIT=y diff --git a/tools/testing/selftests/net/link_netns.py b/tools/testing/selftests/net/link_netns.py new file mode 100755 index 000000000000..aab043c59d69 --- /dev/null +++ b/tools/testing/selftests/net/link_netns.py @@ -0,0 +1,141 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0 + +import time + +from lib.py import ksft_run, ksft_exit, ksft_true +from lib.py import ip +from lib.py import NetNS, NetNSEnter +from lib.py import RtnlFamily + + +LINK_NETNSID = 100 + + +def test_event() -> None: + with NetNS() as ns1, NetNS() as ns2: + with NetNSEnter(str(ns2)): + rtnl = RtnlFamily() + + rtnl.ntf_subscribe("rtnlgrp-link") + + ip(f"netns set {ns2} {LINK_NETNSID}", ns=str(ns1)) + ip(f"link add netns {ns1} link-netnsid {LINK_NETNSID} dummy1 type dummy") + ip(f"link add netns {ns1} dummy2 type dummy", ns=str(ns2)) + + ip("link del dummy1", ns=str(ns1)) + ip("link del dummy2", ns=str(ns1)) + + time.sleep(1) + rtnl.check_ntf() + ksft_true(rtnl.async_msg_queue.empty(), + "Received unexpected link notification") + + +def validate_link_netns(netns, ifname, 
link_netnsid) -> bool: + link_info = ip(f"-d link show dev {ifname}", ns=netns, json=True) + if not link_info: + return False + return link_info[0].get("link_netnsid") == link_netnsid + + +def test_link_net() -> None: + configs = [ + # type, common args, type args, fallback to dev_net + ("ipvlan", "link dummy1", "", False), + ("macsec", "link dummy1", "", False), + ("macvlan", "link dummy1", "", False), + ("macvtap", "link dummy1", "", False), + ("vlan", "link dummy1", "id 100", False), + ("gre", "", "local 192.0.2.1", True), + ("vti", "", "local 192.0.2.1", True), + ("ipip", "", "local 192.0.2.1", True), + ("ip6gre", "", "local 2001:db8::1", True), + ("ip6tnl", "", "local 2001:db8::1", True), + ("vti6", "", "local 2001:db8::1", True), + ("sit", "", "local 192.0.2.1", True), + ("xfrm", "", "if_id 1", True), + ] + + with NetNS() as ns1, NetNS() as ns2, NetNS() as ns3: + net1, net2, net3 = str(ns1), str(ns2), str(ns3) + + # prepare link netnsid and a dummy link needed by certain drivers + ip(f"netns set {net3} {LINK_NETNSID}", ns=str(net2)) + ip("link add dummy1 type dummy", ns=net3) + + cases = [ + # source, "netns", "link-netns", expected link-netns + (net3, None, None, None, None), + (net3, net2, None, None, LINK_NETNSID), + (net2, None, net3, LINK_NETNSID, LINK_NETNSID), + (net1, net2, net3, LINK_NETNSID, LINK_NETNSID), + ] + + for src_net, netns, link_netns, exp1, exp2 in cases: + tgt_net = netns or src_net + for typ, cargs, targs, fb_dev_net in configs: + cmd = "link add" + if netns: + cmd += f" netns {netns}" + if link_netns: + cmd += f" link-netns {link_netns}" + cmd += f" {cargs} foo type {typ} {targs}" + ip(cmd, ns=src_net) + if fb_dev_net: + ksft_true(validate_link_netns(tgt_net, "foo", exp1), + f"{typ} link_netns validation failed") + else: + ksft_true(validate_link_netns(tgt_net, "foo", exp2), + f"{typ} link_netns validation failed") + ip(f"link del foo", ns=tgt_net) + + +def test_peer_net() -> None: + types = [ + "vxcan", + "netkit", + "veth", + ] + + 
with NetNS() as ns1, NetNS() as ns2, NetNS() as ns3, NetNS() as ns4: + net1, net2, net3, net4 = str(ns1), str(ns2), str(ns3), str(ns4) + + ip(f"netns set {net3} {LINK_NETNSID}", ns=str(net2)) + + cases = [ + # source, "netns", "link-netns", "peer netns", expected + (net1, None, None, None, None), + (net1, net2, None, None, None), + (net2, None, net3, None, LINK_NETNSID), + (net1, net2, net3, None, None), + (net2, None, None, net3, LINK_NETNSID), + (net1, net2, None, net3, LINK_NETNSID), + (net2, None, net2, net3, LINK_NETNSID), + (net1, net2, net4, net3, LINK_NETNSID), + ] + + for src_net, netns, link_netns, peer_netns, exp in cases: + tgt_net = netns or src_net + for typ in types: + cmd = "link add" + if netns: + cmd += f" netns {netns}" + if link_netns: + cmd += f" link-netns {link_netns}" + cmd += f" foo type {typ}" + if peer_netns: + cmd += f" peer netns {peer_netns}" + ip(cmd, ns=src_net) + ksft_true(validate_link_netns(tgt_net, "foo", exp), + f"{typ} peer_netns validation failed") + ip(f"link del foo", ns=tgt_net) + + +def main() -> None: + ksft_run([test_event, test_link_net, test_peer_net]) + ksft_exit() + + +if __name__ == "__main__": + main() diff --git a/tools/testing/selftests/net/netns-name.sh b/tools/testing/selftests/net/netns-name.sh index 6974474c26f3..0be1905d1f2f 100755 --- a/tools/testing/selftests/net/netns-name.sh +++ b/tools/testing/selftests/net/netns-name.sh @@ -78,6 +78,16 @@ ip -netns $NS link show dev $ALT_NAME 2> /dev/null && fail "Can still find alt-name after move" ip -netns $test_ns link del $DEV || fail
+# +# Test no conflict of the same name/ifindex in different netns +# +ip -netns $NS link add name $DEV index 100 type dummy || fail +ip -netns $NS link add netns $test_ns name $DEV index 100 type dummy || + fail "Can't create in netns without moving" +ip -netns $test_ns link show dev $DEV >> /dev/null || fail "Device not found" +ip -netns $NS link del $DEV || fail +ip -netns $test_ns link del $DEV || fail + echo -ne "$(basename $0) \t\t\t\t" if [ $RET_CODE -eq 0 ]; then echo "[ OK ]"
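The expected values in the test_link_net() table of link_netns.py can be derived from a small rule. The sketch below is an interpretation inferred from that table, not code from the series: `expected_link_netnsid` is a hypothetical helper, and the assumption is that drivers marked "fallback to dev_net" use the device netns when no link-netns is given, while the others use the source netns.

```python
LINK_NETNSID = 100


def expected_link_netnsid(src, netns, link_netns, fallback_to_dev_net, nsid_of):
    """Return the link_netnsid expected on the new device, or None.

    nsid_of maps (viewing netns, referenced netns) to the nsid configured
    with "ip netns set".
    """
    tgt = netns or src
    if link_netns is not None:
        link = link_netns
    elif fallback_to_dev_net:
        link = tgt
    else:
        link = src
    if link == tgt:
        return None  # link netns equals device netns: nothing is reported
    return nsid_of[(tgt, link)]


# As in the test: net3 is known as nsid 100 when seen from net2.
nsid_of = {("net2", "net3"): LINK_NETNSID}

cases = [
    # source, "netns", "link-netns"
    ("net3", None, None),
    ("net3", "net2", None),
    ("net2", None, "net3"),
    ("net1", "net2", "net3"),
]
for src, netns, link_netns in cases:
    exp1 = expected_link_netnsid(src, netns, link_netns, True, nsid_of)
    exp2 = expected_link_netnsid(src, netns, link_netns, False, nsid_of)
    print(src, netns, link_netns, exp1, exp2)
```

The two computed columns correspond to the `exp1` (fallback drivers) and `exp2` (other drivers) columns of the `cases` table in the test.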
From: Xiao Liang shaw.leon@gmail.com Date: Sat, 4 Jan 2025 20:57:21 +0800
This patch series includes some netns-related improvements and fixes for rtnetlink, to make link creation more intuitive:
1) Creating a link in another net namespace doesn't conflict with link names in the current one.
2) Refactor rtnetlink link creation. Create the link in the target namespace directly.
So that
# ip link add netns ns1 link-netns ns2 tun0 type gre ...
will create tun0 in ns1, rather than creating it in ns2 and moving it to ns1, and won't conflict with another interface named "tun0" in the current netns.
Patch 01 serves 1) by avoiding link name conflicts between different netns.
To achieve 2), there're mainly 3 steps:
- Patch 02 packs newlink() parameters into a struct, including the original "src_net" along with more netns context. No semantic changes are introduced.
- Patch 03 ~ 07 converts device drivers to use the explicit netns extracted from params.
- Patch 08 ~ 09 removes the old netns parameter, and converts rtnetlink to create device in target netns directly.
Patch 10 ~ 11 adds some tests for link name and link netns.
BTW please note there're some issues found in current code:
In amt_newlink() in drivers/net/amt.c:
amt->net = net; ... amt->stream_dev = dev_get_by_index(net, ...
Uses net, but amt_lookup_upper_dev() only searches in dev_net. So the AMT device may not be properly deleted if it's in a different netns from lower dev.
I think you are right, and the upper device will be leaked and UAF will happen.
amt must manage a list linked to a lower dev.
Given no one has reported the issue, another option would be to drop cross-netns support in the short term.
---8<---
diff --git a/drivers/net/amt.c b/drivers/net/amt.c
index 98c6205ed19f..d39a5fe17a6f 100644
--- a/drivers/net/amt.c
+++ b/drivers/net/amt.c
@@ -3168,6 +3168,12 @@ static int amt_newlink(struct net *net, struct net_device *dev,
 	struct amt_dev *amt = netdev_priv(dev);
 	int err = -EINVAL;

+	if (!net_eq(net, dev_net(dev))) {
+		NL_SET_ERR_MSG_ATTR(extack, tb[IFLA_TARGET_NETNSID],
+				    "Can't find stream device in a different netns");
+		return err;
+	}
+
 	amt->net = net;
 	amt->mode = nla_get_u32(data[IFLA_AMT_MODE]);

---8<---
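To make the AMT concern concrete, here is a toy model (plain Python, not kernel code) of a lookup that only searches the lower device's netns, as described for amt_lookup_upper_dev(). The Dev class, the device names and lookup_upper_dev() are all made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(eq=False)
class Dev:
    name: str
    netns: str
    stream_dev: Optional["Dev"] = None  # lower device, set on AMT-like uppers


def lookup_upper_dev(devices: List[Dev], lower: Dev) -> Optional[Dev]:
    """Find the upper device, searching only the lower device's netns."""
    for dev in devices:
        if dev.netns == lower.netns and dev.stream_dev is lower:
            return dev
    return None


eth0 = Dev("eth0", "ns1")

# Upper and lower in the same netns: the upper device is found.
amt_same = Dev("amt0", "ns1", stream_dev=eth0)
print(lookup_upper_dev([eth0, amt_same], eth0))

# Upper in another netns: the lookup misses it, so it would not be torn down.
amt_other = Dev("amt1", "ns2", stream_dev=eth0)
print(lookup_upper_dev([eth0, amt_other], eth0))
```

Rejecting creation when the two namespaces differ, as the diff above does, removes the second case entirely.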
In gtp_newlink() in drivers/net/gtp.c:
gtp->net = src_net; ... gn = net_generic(dev_net(dev), gtp_net_id); list_add_rcu(&gtp->list, &gn->gtp_dev_list);
Uses src_net, but priv is linked to list in dev_net. So it may not be properly deleted on removal of link netns.
The device is linked to a list in the same netns, so the device will not be leaked. See gtp_net_exit_batch_rtnl().
Rather, the problem is the udp tunnel socket netns could be freed earlier than the dev netns.
---8<---
# ip netns add test
# ip netns attach root 1
# ip -n test link add netns root name gtp0 type gtp role sgsn
# ip netns del test
[ 125.828205] ref_tracker: net notrefcnt@0000000061c9afc0 has 1/2 users at
[ 125.828205]      sk_alloc+0x7c8/0x8c0
[ 125.828205]      inet_create+0x284/0xd70
[ 125.828205]      __sock_create+0x23b/0x6a0
[ 125.828205]      udp_sock_create4+0x94/0x3f0
[ 125.828205]      gtp_create_sock+0x286/0x340
[ 125.828205]      gtp_create_sockets+0x43/0x110
[ 125.828205]      gtp_newlink+0x775/0x1070
[ 125.828205]      rtnl_newlink+0xa7f/0x19e0
[ 125.828205]      rtnetlink_rcv_msg+0x71b/0xc10
[ 125.828205]      netlink_rcv_skb+0x12b/0x360
[ 125.828205]      netlink_unicast+0x446/0x710
[ 125.828205]      netlink_sendmsg+0x73a/0xbf0
[ 125.828205]      ____sys_sendmsg+0x89d/0xb00
[ 125.828205]      ___sys_sendmsg+0xe9/0x170
[ 125.828205]      __sys_sendmsg+0x104/0x190
[ 125.828205]      do_syscall_64+0xc1/0x1d0
[ 125.828205]
[ 125.833135] ref_tracker: net notrefcnt@0000000061c9afc0 has 1/2 users at
[ 125.833135]      sk_alloc+0x7c8/0x8c0
[ 125.833135]      inet_create+0x284/0xd70
[ 125.833135]      __sock_create+0x23b/0x6a0
[ 125.833135]      udp_sock_create4+0x94/0x3f0
[ 125.833135]      gtp_create_sock+0x286/0x340
[ 125.833135]      gtp_create_sockets+0x21/0x110
[ 125.833135]      gtp_newlink+0x775/0x1070
[ 125.833135]      rtnl_newlink+0xa7f/0x19e0
[ 125.833135]      rtnetlink_rcv_msg+0x71b/0xc10
[ 125.833135]      netlink_rcv_skb+0x12b/0x360
[ 125.833135]      netlink_unicast+0x446/0x710
[ 125.833135]      netlink_sendmsg+0x73a/0xbf0
[ 125.833135]      ____sys_sendmsg+0x89d/0xb00
[ 125.833135]      ___sys_sendmsg+0xe9/0x170
[ 125.833135]      __sys_sendmsg+0x104/0x190
[ 125.833135]      do_syscall_64+0xc1/0x1d0
[ 125.833135]
[ 125.837998] ------------[ cut here ]------------
[ 125.838345] WARNING: CPU: 0 PID: 11 at lib/ref_tracker.c:179 ref_tracker_dir_exit+0x26c/0x520
[ 125.838906] Modules linked in:
[ 125.839130] CPU: 0 UID: 0 PID: 11 Comm: kworker/u16:0 Not tainted 6.13.0-rc5-00150-gc707e6e25dde #188
[ 125.839734] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[ 125.840497] Workqueue: netns cleanup_net
[ 125.840773] RIP: 0010:ref_tracker_dir_exit+0x26c/0x520
[ 125.841128] Code: 00 00 00 fc ff df 4d 8b 26 49 bd 00 01 00 00 00 00 ad de 4c 39 f5 0f 85 df 00 00 00 48 8b 74 24 08 48 89 df e8 a5 cc 12 02 90 <0f> 0b 90 48 8d 6b 44 be 04 00 00 00 48 89 ef e8 80 de 67 ff 48 89
[ 125.842364] RSP: 0018:ff11000007f3fb60 EFLAGS: 00010286
[ 125.842714] RAX: 0000000000004337 RBX: ff1100000d231aa0 RCX: 1ffffffff0e40d5c
[ 125.843195] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff8423ee3c
[ 125.843664] RBP: ff1100000d231af0 R08: 0000000000000001 R09: fffffbfff0e397ae
[ 125.844142] R10: 0000000000000001 R11: 0000000000036001 R12: ff1100000d231af0
[ 125.844606] R13: dead000000000100 R14: ff1100000d231af0 R15: dffffc0000000000
[ 125.845067] FS: 0000000000000000(0000) GS:ff1100006ce00000(0000) knlGS:0000000000000000
[ 125.845596] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 125.845984] CR2: 0000564cbf104000 CR3: 000000000ef44001 CR4: 0000000000771ef0
[ 125.846480] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 125.846958] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[ 125.847450] PKRU: 55555554
[ 125.847634] Call Trace:
[ 125.847800]  <TASK>
[ 125.847946]  ? __warn+0xcc/0x2d0
[ 125.848177]  ? ref_tracker_dir_exit+0x26c/0x520
[ 125.848485]  ? report_bug+0x28c/0x2d0
[ 125.848742]  ? handle_bug+0x54/0xa0
[ 125.848982]  ? exc_invalid_op+0x18/0x50
[ 125.849252]  ? asm_exc_invalid_op+0x1a/0x20
[ 125.849537]  ? _raw_spin_unlock_irqrestore+0x2c/0x50
[ 125.849865]  ? ref_tracker_dir_exit+0x26c/0x520
[ 125.850174]  ? __pfx_ref_tracker_dir_exit+0x10/0x10
[ 125.850510]  ? kfree+0x1cf/0x3e0
[ 125.850740]  net_free+0x5d/0x90
[ 125.850962]  cleanup_net+0x685/0x8e0
[ 125.851226]  ? __pfx_cleanup_net+0x10/0x10
[ 125.851514]  process_one_work+0x7d4/0x16f0
[ 125.851795]  ? __pfx_lock_acquire+0x10/0x10
[ 125.852072]  ? __pfx_process_one_work+0x10/0x10
[ 125.852396]  ? assign_work+0x167/0x240
[ 125.852653]  ? lock_is_held_type+0x9e/0x120
[ 125.852931]  worker_thread+0x54c/0xca0
[ 125.853193]  ? __pfx_worker_thread+0x10/0x10
[ 125.853485]  kthread+0x249/0x300
[ 125.853709]  ? __pfx_kthread+0x10/0x10
[ 125.853966]  ret_from_fork+0x2c/0x70
[ 125.854229]  ? __pfx_kthread+0x10/0x10
[ 125.854480]  ret_from_fork_asm+0x1a/0x30
[ 125.854746]  </TASK>
[ 125.854897] irq event stamp: 17849
[ 125.855138] hardirqs last enabled at (17883): [<ffffffff812dc6ad>] __up_console_sem+0x4d/0x60
[ 125.855714] hardirqs last disabled at (17892): [<ffffffff812dc692>] __up_console_sem+0x32/0x60
[ 125.856315] softirqs last enabled at (17878): [<ffffffff8117d603>] handle_softirqs+0x4f3/0x750
[ 125.856908] softirqs last disabled at (17857): [<ffffffff8117d9e4>] __irq_exit_rcu+0xc4/0x100
[ 125.857492] ---[ end trace 0000000000000000 ]---
---8<---
We can fix this by linking the dev to the socket's netns and cleaning them up in the __net_exit hook, as done in bareudp and geneve.
---8<---
diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c
index 89a996ad8cd0..77638a815873 100644
--- a/drivers/net/gtp.c
+++ b/drivers/net/gtp.c
@@ -70,6 +70,7 @@ struct pdp_ctx {
 /* One instance of the GTP device. */
 struct gtp_dev {
 	struct list_head list;
+	struct list_head sock_list;

 	struct sock *sk0;
 	struct sock *sk1u;
@@ -102,6 +103,7 @@ static unsigned int gtp_net_id __read_mostly;

 struct gtp_net {
 	struct list_head gtp_dev_list;
+	struct list_head gtp_sock_list;
 };

 static u32 gtp_h_initval;
@@ -1526,6 +1528,10 @@ static int gtp_newlink(struct net *src_net, struct net_device *dev,

 	gn = net_generic(dev_net(dev), gtp_net_id);
 	list_add_rcu(&gtp->list, &gn->gtp_dev_list);
+
+	gn = net_generic(src_net, gtp_net_id);
+	list_add(&gtp->sock_list, &gn->gtp_sock_list);
+
 	dev->priv_destructor = gtp_destructor;

 	netdev_dbg(dev, "registered new GTP interface\n");
@@ -1552,6 +1558,7 @@ static void gtp_dellink(struct net_device *dev, struct list_head *head)
 		pdp_context_delete(pctx);

 	list_del_rcu(&gtp->list);
+	list_del(&gtp->sock_list);
 	unregister_netdevice_queue(dev, head);
 }

@@ -2465,6 +2472,8 @@ static int __net_init gtp_net_init(struct net *net)
 	struct gtp_net *gn = net_generic(net, gtp_net_id);

 	INIT_LIST_HEAD(&gn->gtp_dev_list);
+	INIT_LIST_HEAD(&gn->gtp_sock_list);
+
 	return 0;
 }

@@ -2475,9 +2484,12 @@ static void __net_exit gtp_net_exit_batch_rtnl(struct list_head *net_list,

 	list_for_each_entry(net, net_list, exit_list) {
 		struct gtp_net *gn = net_generic(net, gtp_net_id);
-		struct gtp_dev *gtp;
+		struct gtp_dev *gtp, *next;
+
+		list_for_each_entry_safe(gtp, next, &gn->gtp_dev_list, list)
+			gtp_dellink(gtp->dev, dev_to_kill);

-		list_for_each_entry(gtp, &gn->gtp_dev_list, list)
+		list_for_each_entry_safe(gtp, next, &gn->gtp_sock_list, sock_list)
 			gtp_dellink(gtp->dev, dev_to_kill);
 	}
 }
---8<---
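The bookkeeping in the diff above can be pictured with a small user-space model. This is only a sketch under simplifying assumptions: Netns, GtpDev, dellink() and netns_exit() are made-up stand-ins for the per-netns lists, gtp_dellink() and the __net_exit hook, and no RCU or refcounting is modelled.

```python
class Netns:
    def __init__(self, name: str) -> None:
        self.name = name
        self.dev_list = []   # devices living in this netns (cf. gtp_dev_list)
        self.sock_list = []  # devices whose sockets live here (cf. gtp_sock_list)


class GtpDev:
    def __init__(self, name: str, dev_net: Netns, sock_net: Netns) -> None:
        self.name = name
        self.dev_net = dev_net
        self.sock_net = sock_net
        self.alive = True
        # Link the device into both namespaces, as the patched newlink does.
        dev_net.dev_list.append(self)
        sock_net.sock_list.append(self)


def dellink(dev: GtpDev) -> None:
    """Unlink from both lists, then mark the device as gone."""
    dev.dev_net.dev_list.remove(dev)
    dev.sock_net.sock_list.remove(dev)
    dev.alive = False


def netns_exit(ns: Netns) -> None:
    """Tear down every device tied to `ns` by residence or by socket."""
    # Iterate over copies because dellink() mutates the lists,
    # which is what the _safe list iterators are for in the diff.
    for dev in list(ns.dev_list):
        dellink(dev)
    for dev in list(ns.sock_list):
        dellink(dev)


# Same shape as the reproducer: device in "root", sockets created from "test".
root, test = Netns("root"), Netns("test")
gtp0 = GtpDev("gtp0", dev_net=root, sock_net=test)
netns_exit(test)
print(gtp0.alive, [d.name for d in root.dev_list])
```

With only dev_list, netns_exit(test) would find nothing to delete and gtp0 would keep sockets that belong to a dying netns, which is the situation the ref_tracker warning reports.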
In pfcp_newlink() in drivers/net/pfcp.c:
pfcp->net = net; ... pn = net_generic(dev_net(dev), pfcp_net_id); list_add_rcu(&pfcp->list, &pn->pfcp_dev_list);
Same as above.
I haven't tested pfcp but it seems to have the same problem.
I'll post patches for gtp and pfcp.
In lowpan_newlink() in net/ieee802154/6lowpan/core.c:
wdev = dev_get_by_index(dev_net(ldev), nla_get_u32(tb[IFLA_LINK]));
Looks for IFLA_LINK in dev_net, but in theory the ifindex is defined in link netns.
I guess you mean the ifindex is defined in src_net instead. Not sure if it's too late to change the behaviour.
On Tue, Jan 7, 2025 at 4:57 PM Kuniyuki Iwashima kuniyu@amazon.com wrote:
From: Xiao Liang shaw.leon@gmail.com
Date: Sat, 4 Jan 2025 20:57:21 +0800
[...]
In amt_newlink() in drivers/net/amt.c:
amt->net = net; ... amt->stream_dev = dev_get_by_index(net, ...
Uses net, but amt_lookup_upper_dev() only searches in dev_net. So the AMT device may not be properly deleted if it's in a different netns from lower dev.
I think you are right, and the upper device will be leaked and UAF will happen.
amt must manage a list linked to a lower dev.
Given no one has reported the issue, another option would be to drop cross-netns support in the short term.
Yes. I also noticed AMT sets dev->netns_local to prevent netns change. Probably it also assumes the same netns during creation.
[...]
In gtp_newlink() in drivers/net/gtp.c:
gtp->net = src_net; ... gn = net_generic(dev_net(dev), gtp_net_id); list_add_rcu(&gtp->list, &gn->gtp_dev_list);
Uses src_net, but priv is linked to list in dev_net. So it may not be properly deleted on removal of link netns.
The device is linked to a list in the same netns, so the device will not be leaked. See gtp_net_exit_batch_rtnl().
Rather, the problem is the udp tunnel socket netns could be freed earlier than the dev netns.
Yes, you're right. Actually, by "link netns" I meant the netns of the socket (there's some clarification about this in patch 02).
[...]
In pfcp_newlink() in drivers/net/pfcp.c:
pfcp->net = net; ... pn = net_generic(dev_net(dev), pfcp_net_id); list_add_rcu(&pfcp->list, &pn->pfcp_dev_list);
Same as above.
I haven't tested pfcp but it seems to have the same problem.
I'll post patches for gtp and pfcp.
It would be nice.
In lowpan_newlink() in net/ieee802154/6lowpan/core.c:
wdev = dev_get_by_index(dev_net(ldev), nla_get_u32(tb[IFLA_LINK]));
Looks for IFLA_LINK in dev_net, but in theory the ifindex is defined in link netns.
I guess you mean the ifindex is defined in src_net instead. Not sure if it's too late to change the behaviour.
Yes, it's the source net for lowpan. I think it depends on whether the interpretation of IFLA_LINK should be considered part of the API provided by the rtnetlink core, or something customizable by the driver. In the former case, this can be considered a bug.
Thanks.
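Why the choice of netns matters for IFLA_LINK can be shown with a tiny model: ifindex values are only meaningful within one netns, so the same number may name different devices. The table contents and lookup_ifindex() below are invented for illustration and do not describe any real system.

```python
from typing import Dict, Optional

# netns name -> {ifindex -> device name}; every entry here is made up.
IfTable = Dict[str, Dict[int, str]]


def lookup_ifindex(tables: IfTable, netns: str, ifindex: int) -> Optional[str]:
    """Resolve an ifindex inside one specific netns."""
    return tables.get(netns, {}).get(ifindex)


tables: IfTable = {
    "src_net": {5: "wpan0"},  # where the request was issued
    "dev_net": {5: "eth1"},   # where the new device will live
}

# The same IFLA_LINK value resolves differently depending on the netns used.
print(lookup_ifindex(tables, "src_net", 5))
print(lookup_ifindex(tables, "dev_net", 5))
```

Whichever interpretation is chosen, the sender and the driver have to agree on the namespace, otherwise the lookup can silently bind to an unrelated device.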
On Tue, Jan 7, 2025 at 4:57 PM Kuniyuki Iwashima kuniyu@amazon.com wrote: [...]
We can fix this by linking the dev to the socket's netns and cleaning them up in the __net_exit hook, as done in bareudp and geneve.
---8<---
diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c
index 89a996ad8cd0..77638a815873 100644
--- a/drivers/net/gtp.c
+++ b/drivers/net/gtp.c
@@ -70,6 +70,7 @@ struct pdp_ctx {
 /* One instance of the GTP device. */
 struct gtp_dev {
 	struct list_head list;
+	struct list_head sock_list;

 	struct sock *sk0;
 	struct sock *sk1u;
@@ -102,6 +103,7 @@ static unsigned int gtp_net_id __read_mostly;

 struct gtp_net {
 	struct list_head gtp_dev_list;
+	struct list_head gtp_sock_list;

After a closer look at the GTP driver, I'm confused about the gtp_dev_list here. GTP device is linked to this list at creation time, but netns can be changed afterwards. The list is used in gtp_net_exit_batch_rtnl(), but to my understanding net devices can already be deleted in default_device_exit_batch() by default. And I wonder if the use in gtp_genl_dump_pdp() can be replaced by something like for_each_netdev_rcu().

 };

 static u32 gtp_h_initval;
@@ -1526,6 +1528,10 @@ static int gtp_newlink(struct net *src_net, struct net_device *dev,

 	gn = net_generic(dev_net(dev), gtp_net_id);
 	list_add_rcu(&gtp->list, &gn->gtp_dev_list);
+
+	gn = net_generic(src_net, gtp_net_id);
+	list_add(&gtp->sock_list, &gn->gtp_sock_list);
+
 	dev->priv_destructor = gtp_destructor;

 	netdev_dbg(dev, "registered new GTP interface\n");
@@ -1552,6 +1558,7 @@ static void gtp_dellink(struct net_device *dev, struct list_head *head)
 		pdp_context_delete(pctx);

 	list_del_rcu(&gtp->list);
+	list_del(&gtp->sock_list);
 	unregister_netdevice_queue(dev, head);
 }

@@ -2465,6 +2472,8 @@ static int __net_init gtp_net_init(struct net *net)
 	struct gtp_net *gn = net_generic(net, gtp_net_id);

 	INIT_LIST_HEAD(&gn->gtp_dev_list);
+	INIT_LIST_HEAD(&gn->gtp_sock_list);
+
 	return 0;
 }

@@ -2475,9 +2484,12 @@ static void __net_exit gtp_net_exit_batch_rtnl(struct list_head *net_list,

 	list_for_each_entry(net, net_list, exit_list) {
 		struct gtp_net *gn = net_generic(net, gtp_net_id);
-		struct gtp_dev *gtp;
+		struct gtp_dev *gtp, *next;
+
+		list_for_each_entry_safe(gtp, next, &gn->gtp_dev_list, list)
+			gtp_dellink(gtp->dev, dev_to_kill);

-		list_for_each_entry(gtp, &gn->gtp_dev_list, list)
+		list_for_each_entry_safe(gtp, next, &gn->gtp_sock_list, sock_list)
 			gtp_dellink(gtp->dev, dev_to_kill);
 	}
 }
---8<---
From: Xiao Liang shaw.leon@gmail.com
Date: Tue, 7 Jan 2025 20:53:19 +0800
On Tue, Jan 7, 2025 at 4:57 PM Kuniyuki Iwashima kuniyu@amazon.com wrote: [...]
We can fix this by linking the dev to the socket's netns and cleaning them up in the __net_exit hook, as done in bareudp and geneve.
---8<---
diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c
index 89a996ad8cd0..77638a815873 100644
--- a/drivers/net/gtp.c
+++ b/drivers/net/gtp.c
@@ -70,6 +70,7 @@ struct pdp_ctx {
 /* One instance of the GTP device. */
 struct gtp_dev {
 	struct list_head list;
+	struct list_head sock_list;

 	struct sock *sk0;
 	struct sock *sk1u;
@@ -102,6 +103,7 @@ static unsigned int gtp_net_id __read_mostly;

 struct gtp_net {
 	struct list_head gtp_dev_list;
+	struct list_head gtp_sock_list;
After a closer look at the GTP driver, I'm confused about the gtp_dev_list here. GTP device is linked to this list at creation time, but netns can be changed afterwards. The list is used in gtp_net_exit_batch_rtnl(), but to my understanding net devices can already be deleted in default_device_exit_batch() by default. And I wonder if the use in gtp_genl_dump_pdp() can be replaced by something like for_each_netdev_rcu().
Right, it should be, or we need to set netns_local. Will include this diff in the fix series.
---8<---
diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c
index 2460a2c13c32..f9186eda36f0 100644
--- a/drivers/net/gtp.c
+++ b/drivers/net/gtp.c
@@ -2278,6 +2278,7 @@ static int gtp_genl_dump_pdp(struct sk_buff *skb,
 	struct gtp_dev *last_gtp = (struct gtp_dev *)cb->args[2], *gtp;
 	int i, j, bucket = cb->args[0], skip = cb->args[1];
 	struct net *net = sock_net(skb->sk);
+	struct net_device *dev;
 	struct pdp_ctx *pctx;
 	struct gtp_net *gn;

@@ -2287,7 +2288,10 @@ static int gtp_genl_dump_pdp(struct sk_buff *skb,
 		return 0;

 	rcu_read_lock();
-	list_for_each_entry_rcu(gtp, &gn->gtp_dev_list, list) {
+	for_each_netdev_rcu(net, dev) {
+		if (dev->rtnl_link_ops != &gtp_link_ops)
+			continue;
+
 		if (last_gtp && last_gtp != gtp)
 			continue;
 		else
---8<---
Otherwise, we need to move it manually like this, which is apparently overkill and unnecessary :p
---8<---
diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c
index 2460a2c13c32..90b410b73c89 100644
--- a/drivers/net/gtp.c
+++ b/drivers/net/gtp.c
@@ -2501,6 +2501,46 @@ static struct pernet_operations gtp_net_ops = {
 	.size = sizeof(struct gtp_net),
 };

+static int gtp_device_event(struct notifier_block *nb,
+			    unsigned long event, void *ptr)
+{
+	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
+	struct gtp_dev *gtp;
+	struct gtp_net *gn;
+
+	if (dev->rtnl_link_ops != &gtp_link_ops)
+		goto out;
+
+	gtp = netdev_priv(dev);
+
+	switch (event) {
+	case NETDEV_UNREGISTER:
+		if (dev->reg_state != NETREG_REGISTERED)
+			goto out;
+
+		/* dev_net(dev) is changed, see __dev_change_net_namespace().
+		 * rcu_barrier() after NETDEV_UNREGISTER guarantees that no
+		 * one traversing a list in the old netns jumps to another
+		 * list in the new netns.
+		 */
+		list_del_rcu(&gtp->list);
+		break;
+	case NETDEV_REGISTER:
+		if (gtp->list.prev != LIST_POISON2)
+			goto out;
+
+		/* complete netns change. */
+		gn = net_generic(dev_net(dev), gtp_net_id);
+		list_add_rcu(&gtp->list, &gn->gtp_dev_list);
+	}
+out:
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block gtp_notifier_block = {
+	.notifier_call = gtp_device_event,
+};
+
 static int __init gtp_init(void)
 {
 	int err;
@@ -2511,10 +2551,14 @@ static int __init gtp_init(void)
 	if (err < 0)
 		goto error_out;

-	err = rtnl_link_register(&gtp_link_ops);
+	err = register_netdevice_notifier(&gtp_notifier_block);
 	if (err < 0)
 		goto unreg_pernet_subsys;

+	err = rtnl_link_register(&gtp_link_ops);
+	if (err < 0)
+		goto unreg_netdev_notifier;
+
 	err = genl_register_family(&gtp_genl_family);
 	if (err < 0)
 		goto unreg_rtnl_link;
@@ -2525,6 +2569,8 @@ static int __init gtp_init(void)

 unreg_rtnl_link:
 	rtnl_link_unregister(&gtp_link_ops);
+unreg_netdev_notifier:
+	unregister_netdevice_notifier(&gtp_notifier_block);
 unreg_pernet_subsys:
 	unregister_pernet_subsys(&gtp_net_ops);
 error_out:
@@ -2537,6 +2583,7 @@ static void __exit gtp_fini(void)
 {
 	genl_unregister_family(&gtp_genl_family);
 	rtnl_link_unregister(&gtp_link_ops);
+	unregister_netdevice_notifier(&gtp_notifier_block);
 	unregister_pernet_subsys(&gtp_net_ops);

 	pr_info("GTP module unloaded\n");
---8<---
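The trade-off discussed here (keep a private per-netns list in sync, or just walk the netns's devices and filter by type) can be sketched in user space. This is a toy model with invented names (Netns, Dev, move_netns, dump_via_list, dump_via_netdevs); it does not reproduce kernel locking or RCU.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)
class Dev:
    name: str
    kind: str


@dataclass(eq=False)
class Netns:
    name: str
    devices: List[Dev] = field(default_factory=list)       # every netdev here
    gtp_dev_list: List[Dev] = field(default_factory=list)  # private GTP list


def create_gtp(ns: Netns, name: str) -> Dev:
    dev = Dev(name, "gtp")
    ns.devices.append(dev)
    ns.gtp_dev_list.append(dev)  # linked once, at creation time
    return dev


def move_netns(dev: Dev, old: Netns, new: Netns) -> None:
    """Move a device; the private list is deliberately left untouched."""
    old.devices.remove(dev)
    new.devices.append(dev)


def dump_via_list(ns: Netns) -> List[str]:
    return [d.name for d in ns.gtp_dev_list]


def dump_via_netdevs(ns: Netns) -> List[str]:
    # Walk all devices of the netns and keep only GTP ones.
    return [d.name for d in ns.devices if d.kind == "gtp"]


ns_a, ns_b = Netns("a"), Netns("b")
gtp0 = create_gtp(ns_a, "gtp0")
move_netns(gtp0, ns_a, ns_b)

print(dump_via_list(ns_a), dump_via_netdevs(ns_a), dump_via_netdevs(ns_b))
```

Filtering the live device list needs no extra state to keep in sync, which is why the notifier-based variant is described as overkill.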