The following commit has been merged in the master branch: commit fe994ed588393f97dd9c03a79dc0dfa1fc7cd58a Merge: 7070d180bf6b54b7c2fb79ebd15746004add0b34 92ec804f3dbf0d986f8e10850bfff14f316d7aaf Author: Stephen Rothwell sfr@canb.auug.org.au Date: Tue Sep 22 12:32:25 2020 +1000
Merge remote-tracking branch 'net-next/master' into master
# Conflicts: # drivers/net/dsa/microchip/ksz9477.c # net/ipv4/route.c
diff --combined Documentation/admin-guide/kernel-parameters.txt index 945dbbf0184d,8af893ef0d46..366e25f6c20f --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@@ -599,17 -599,6 +599,17 @@@ altogether. For more information, see include/linux/dma-contiguous.h
+ cma_pernuma=nn[MG] + [ARM64,KNL] + Sets the size of the kernel per-numa memory area for + contiguous memory allocations. A value of 0 disables + per-numa CMA altogether. If this option is not + specified, the default value is 0. + With per-numa CMA enabled, DMA users on node nid will + first try to allocate a buffer from the per-numa area + located in node nid; if the allocation fails, + they fall back to the global default memory area. + cmo_free_hint= [PPC] Format: { yes | no } Specify whether pages are marked as being inactive when they are freed. This is used in CMO environments @@@ -1349,6 -1338,11 +1349,11 @@@ Format: <interval>,<probability>,<space>,<times> See also Documentation/fault-injection/.
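To make the fallback order concrete, here is a conceptual C sketch of what the cma_pernuma text above describes; the declared helpers are hypothetical stand-ins, not the kernel's real CMA API, and the actual logic lives in the kernel's DMA/CMA code.

#include <linux/device.h>
#include <linux/numa.h>

/* Hypothetical helpers standing in for the kernel's internal CMA API */
struct cma;
struct page *cma_alloc_pages(struct cma *area, size_t count);
struct cma *pernuma_cma_area(int nid);
struct cma *global_cma_area(void);

struct page *alloc_contiguous_for_dev(struct device *dev, size_t count)
{
	int nid = dev_to_node(dev);
	struct page *page = NULL;

	/* First try the CMA area reserved on the device's own node... */
	if (nid != NUMA_NO_NODE)
		page = cma_alloc_pages(pernuma_cma_area(nid), count);

	/* ...and fall back to the global default area on failure. */
	if (!page)
		page = cma_alloc_pages(global_cma_area(), count);

	return page;
}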
+ fb_tunnels= [NET] + Format: { initns | none } + See Documentation/admin-guide/sysctl/net.rst for + fb_tunnels_only_for_init_ns + floppy= [HW] See Documentation/admin-guide/blockdev/floppy.rst.
@@@ -3196,7 -3190,7 +3201,7 @@@ register save and restore. The kernel will only save legacy floating-point registers on task switch.
- nohugeiomap [KNL,X86,PPC] Disable kernel huge I/O mappings. + nohugeiomap [KNL,X86,PPC,ARM64] Disable kernel huge I/O mappings.
nosmt [KNL,S390] Disable symmetric multithreading (SMT). Equivalent to smt=1. diff --combined Documentation/networking/ethtool-netlink.rst index b5a79881551f,2c8e0ddf548e..30b98245979f --- a/Documentation/networking/ethtool-netlink.rst +++ b/Documentation/networking/ethtool-netlink.rst @@@ -68,6 -68,7 +68,7 @@@ the flags may not apply to requests. Re ================================= =================================== ``ETHTOOL_FLAG_COMPACT_BITSETS`` use compact format bitsets in reply ``ETHTOOL_FLAG_OMIT_REPLY`` omit optional reply (_SET and _ACT) + ``ETHTOOL_FLAG_STATS`` include optional device statistics ================================= ===================================
New request flags should follow the general idea that if the flag is not set, @@@ -206,7 -207,6 +207,7 @@@ Userspace to kernel ``ETHTOOL_MSG_TSINFO_GET`` get timestamping info ``ETHTOOL_MSG_CABLE_TEST_ACT`` action start cable test ``ETHTOOL_MSG_CABLE_TEST_TDR_ACT`` action start raw TDR cable test + ``ETHTOOL_MSG_TUNNEL_INFO_GET`` get tunnel offload info ===================================== ================================
Kernel to userspace: @@@ -240,7 -240,6 +241,7 @@@ ``ETHTOOL_MSG_TSINFO_GET_REPLY`` timestamping info ``ETHTOOL_MSG_CABLE_TEST_NTF`` Cable test results ``ETHTOOL_MSG_CABLE_TEST_TDR_NTF`` Cable test TDR results + ``ETHTOOL_MSG_TUNNEL_INFO_GET_REPLY`` tunnel offload info ===================================== =================================
``GET`` requests are sent by userspace applications to retrieve device @@@ -991,8 -990,18 +992,18 @@@ Kernel response contents ``ETHTOOL_A_PAUSE_AUTONEG`` bool pause autonegotiation ``ETHTOOL_A_PAUSE_RX`` bool receive pause frames ``ETHTOOL_A_PAUSE_TX`` bool transmit pause frames + ``ETHTOOL_A_PAUSE_STATS`` nested pause statistics ===================================== ====== ==========================
+ ``ETHTOOL_A_PAUSE_STATS`` is reported if ``ETHTOOL_FLAG_STATS`` was set + in ``ETHTOOL_A_HEADER_FLAGS``. + It will be empty if the driver did not report any statistics. Drivers fill in + the statistics in the following structure: + + .. kernel-doc:: include/linux/ethtool.h + :identifiers: ethtool_pause_stats + + Each member has a corresponding attribute defined.
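For illustration, drivers feed these statistics to the ethtool core through the get_pause_stats operation. The sketch below is a minimal hypothetical driver fragment (example_priv and its counters are invented), not code from this diff:

#include <linux/ethtool.h>
#include <linux/netdevice.h>

struct example_priv {
	struct {
		u64 tx_pause;	/* counters mirrored from device registers */
		u64 rx_pause;
	} hw_stats;
};

static void example_get_pause_stats(struct net_device *dev,
				    struct ethtool_pause_stats *stats)
{
	struct example_priv *priv = netdev_priv(dev);

	/* Fields left untouched keep ETHTOOL_STAT_NOT_SET and are
	 * omitted from the netlink reply. */
	stats->tx_pause_frames = priv->hw_stats.tx_pause;
	stats->rx_pause_frames = priv->hw_stats.rx_pause;
}

static const struct ethtool_ops example_ethtool_ops = {
	.get_pause_stats	= example_get_pause_stats,
	/* ... remaining callbacks elided ... */
};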
PAUSE_SET ============ @@@ -1365,5 -1374,4 +1376,5 @@@ are netlink only ``ETHTOOL_SFECPARAM`` n/a n/a ``ETHTOOL_MSG_CABLE_TEST_ACT`` n/a ``ETHTOOL_MSG_CABLE_TEST_TDR_ACT`` + n/a ``ETHTOOL_MSG_TUNNEL_INFO_GET`` =================================== ===================================== diff --combined MAINTAINERS index 0702a6a3ee85,e3c1c70057e4..4b91cbcbb0b4 --- a/MAINTAINERS +++ b/MAINTAINERS @@@ -1286,7 -1286,7 +1286,7 @@@ S: Supporte F: Documentation/devicetree/bindings/net/apm-xgene-enet.txt F: Documentation/devicetree/bindings/net/apm-xgene-mdio.txt F: drivers/net/ethernet/apm/xgene/ - F: drivers/net/phy/mdio-xgene.c + F: drivers/net/mdio/mdio-xgene.c
APPLIED MICRO (APM) X-GENE SOC PMU M: Khuong Dinh khuong@os.amperecomputing.com @@@ -1694,6 -1694,7 +1694,6 @@@ F: arch/arm/mach-cns3xxx
ARM/CAVIUM THUNDER NETWORK DRIVER M: Sunil Goutham sgoutham@marvell.com -M: Robert Richter rrichter@marvell.com L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers) S: Supported F: drivers/net/ethernet/cavium/thunder/ @@@ -2219,8 -2220,8 +2219,8 @@@ ARM/OPENMOKO NEO FREERUNNER (GTA02) MAC L: openmoko-kernel@lists.openmoko.org (subscribers-only) S: Orphan W: http://wiki.openmoko.org/wiki/Neo_FreeRunner -F: arch/arm/mach-s3c24xx/gta02.h -F: arch/arm/mach-s3c24xx/mach-gta02.c +F: arch/arm/mach-s3c/gta02.h +F: arch/arm/mach-s3c/mach-gta02.c
ARM/Orion SoC/Technologic Systems TS-78xx platform support M: Alexander Clouter alex@digriz.org.uk @@@ -2399,7 -2400,7 +2399,7 @@@ ARM/SAMSUNG EXYNOS ARM ARCHITECTURE M: Kukjin Kim kgene@kernel.org M: Krzysztof Kozlowski krzk@kernel.org L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers) -L: linux-samsung-soc@vger.kernel.org (moderated for non-subscribers) +L: linux-samsung-soc@vger.kernel.org S: Maintained Q: https://patchwork.kernel.org/project/linux-samsung-soc/list/ F: Documentation/arm/samsung/ @@@ -2409,8 -2410,10 +2409,8 @@@ F: arch/arm/boot/dts/exynos F: arch/arm/boot/dts/s3c* F: arch/arm/boot/dts/s5p* F: arch/arm/mach-exynos*/ -F: arch/arm/mach-s3c24*/ -F: arch/arm/mach-s3c64xx/ +F: arch/arm/mach-s3c/ F: arch/arm/mach-s5p*/ -F: arch/arm/plat-samsung/ F: arch/arm64/boot/dts/exynos/ F: drivers/*/*/*s3c24* F: drivers/*/*s3c24* @@@ -2421,9 -2424,6 +2421,9 @@@ F: drivers/soc/samsung F: drivers/tty/serial/samsung* F: include/linux/soc/samsung/ N: exynos +N: s3c2410 +N: s3c64xx +N: s5pv210
ARM/SAMSUNG MOBILE MACHINE SUPPORT M: Kyungmin Park kyungmin.park@samsung.com @@@ -2442,7 -2442,7 +2442,7 @@@ F: drivers/media/platform/s5p-g2d
ARM/SAMSUNG S5P SERIES HDMI CEC SUBSYSTEM SUPPORT M: Marek Szyprowski m.szyprowski@samsung.com -L: linux-samsung-soc@vger.kernel.org (moderated for non-subscribers) +L: linux-samsung-soc@vger.kernel.org L: linux-media@vger.kernel.org S: Maintained F: Documentation/devicetree/bindings/media/s5p-cec.txt @@@ -3435,7 -3435,7 +3435,7 @@@ M: bcm-kernel-feedback-list@broadcom.co L: linux-arm-kernel@lists.infradead.org S: Maintained F: arch/arm/boot/dts/bcm470* -F: arch/arm/boot/dts/bcm5301x*.dtsi +F: arch/arm/boot/dts/bcm5301* F: arch/arm/boot/dts/bcm953012* F: arch/arm/mach-bcm/bcm_5301x.c
@@@ -3493,7 -3493,6 +3493,7 @@@ F: arch/mips/bmips/ F: arch/mips/boot/dts/brcm/bcm*.dts* F: arch/mips/include/asm/mach-bmips/* F: arch/mips/kernel/*bmips* +F: drivers/soc/bcm/bcm63xx F: drivers/irqchip/irq-bcm63* F: drivers/irqchip/irq-bcm7* F: drivers/irqchip/irq-brcmstb* @@@ -3949,8 -3948,8 +3949,8 @@@ W: https://wireless.wiki.kernel.org/en/ F: drivers/net/wireless/ath/carl9170/
CAVIUM I2C DRIVER -M: Robert Richter rrichter@marvell.com -S: Supported +M: Robert Richter rric@kernel.org +S: Odd Fixes W: http://www.marvell.com F: drivers/i2c/busses/i2c-octeon* F: drivers/i2c/busses/i2c-thunderx* @@@ -3965,8 -3964,8 +3965,8 @@@ W: http://www.marvell.co F: drivers/net/ethernet/cavium/liquidio/
CAVIUM MMC DRIVER -M: Robert Richter rrichter@marvell.com -S: Supported +M: Robert Richter rric@kernel.org +S: Odd Fixes W: http://www.marvell.com F: drivers/mmc/host/cavium*
@@@ -3978,9 -3977,9 +3978,9 @@@ W: http://www.marvell.co F: drivers/crypto/cavium/cpt/
CAVIUM THUNDERX2 ARM64 SOC -M: Robert Richter rrichter@marvell.com +M: Robert Richter rric@kernel.org L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers) -S: Maintained +S: Odd Fixes F: Documentation/devicetree/bindings/arm/cavium-thunder2.txt F: arch/arm64/boot/dts/cavium/thunder2-99xx*
@@@ -4259,15 -4258,12 +4259,15 @@@ S: Maintaine F: .clang-format
CLANG/LLVM BUILD SUPPORT +M: Nathan Chancellor natechancellor@gmail.com +M: Nick Desaulniers ndesaulniers@google.com L: clang-built-linux@googlegroups.com S: Supported W: https://clangbuiltlinux.github.io/ B: https://github.com/ClangBuiltLinux/linux/issues C: irc://chat.freenode.net/clangbuiltlinux F: Documentation/kbuild/llvm.rst +F: scripts/clang-tools/ K: \b(?i:clang|llvm)\b
CLEANCACHE API @@@ -4411,6 -4407,12 +4411,6 @@@ T: git git://git.infradead.org/users/hc F: fs/configfs/ F: include/linux/configfs.h
-CONNECTOR -M: Evgeniy Polyakov zbr@ioremap.net -L: netdev@vger.kernel.org -S: Maintained -F: drivers/connector/ - CONSOLE SUBSYSTEM M: Greg Kroah-Hartman gregkh@linuxfoundation.org S: Supported @@@ -4707,6 -4709,15 +4707,15 @@@ S: Supporte W: http://www.chelsio.com F: drivers/crypto/chelsio
+ CXGB4 INLINE CRYPTO DRIVER + M: Ayush Sawal ayush.sawal@chelsio.com + M: Vinay Kumar Yadav vinay.yadav@chelsio.com + M: Rohit Maheshwari rohitm@chelsio.com + L: netdev@vger.kernel.org + S: Supported + W: http://www.chelsio.com + F: drivers/net/ethernet/chelsio/inline_crypto/ + CXGB4 ETHERNET DRIVER (CXGB4) M: Vishal Kulkarni vishal@chelsio.com L: netdev@vger.kernel.org @@@ -6177,7 -6188,7 +6186,7 @@@ F: Documentation/devicetree/bindings/ed F: drivers/edac/aspeed_edac.c
EDAC-BLUEFIELD -M: Shravan Kumar Ramani sramani@nvidia.com +M: Shravan Kumar Ramani shravankr@nvidia.com S: Supported F: drivers/edac/bluefield_edac.c
@@@ -6189,15 -6200,16 +6198,15 @@@ F: drivers/edac/highbank
EDAC-CAVIUM OCTEON M: Ralf Baechle ralf@linux-mips.org -M: Robert Richter rrichter@marvell.com L: linux-edac@vger.kernel.org L: linux-mips@vger.kernel.org S: Supported F: drivers/edac/octeon_edac*
EDAC-CAVIUM THUNDERX -M: Robert Richter rrichter@marvell.com +M: Robert Richter rric@kernel.org L: linux-edac@vger.kernel.org -S: Supported +S: Odd Fixes F: drivers/edac/thunderx_edac*
EDAC-CORE @@@ -6205,7 -6217,7 +6214,7 @@@ M: Borislav Petkov <bp@alien8.de M: Mauro Carvalho Chehab mchehab@kernel.org M: Tony Luck tony.luck@intel.com R: James Morse james.morse@arm.com -R: Robert Richter rrichter@marvell.com +R: Robert Richter rric@kernel.org L: linux-edac@vger.kernel.org S: Supported T: git git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git edac-for-next @@@ -6518,11 -6530,14 +6527,14 @@@ F: Documentation/devicetree/bindings/ne F: Documentation/devicetree/bindings/net/mdio* F: Documentation/devicetree/bindings/net/qca,ar803x.yaml F: Documentation/networking/phy.rst + F: drivers/net/mdio/ + F: drivers/net/pcs/ F: drivers/net/phy/ F: drivers/of/of_mdio.c F: drivers/of/of_net.c F: include/dt-bindings/net/qca-ar803x.h F: include/linux/*mdio*.h + F: include/linux/mdio/*.h F: include/linux/of_net.h F: include/linux/phy.h F: include/linux/phy_fixed.h @@@ -6898,14 -6913,6 +6910,14 @@@ L: linuxppc-dev@lists.ozlabs.or S: Maintained F: drivers/dma/fsldma.*
+FREESCALE DSPI DRIVER +M: Vladimir Oltean olteanv@gmail.com +L: linux-spi@vger.kernel.org +S: Maintained +F: Documentation/devicetree/bindings/spi/spi-fsl-dspi.txt +F: drivers/spi/spi-fsl-dspi.c +F: include/linux/spi/spi-fsl-dspi.h + FREESCALE ENETC ETHERNET DRIVERS M: Claudiu Manoil claudiu.manoil@nxp.com L: netdev@vger.kernel.org @@@ -7194,7 -7201,7 +7206,7 @@@ FUSE: FILESYSTEM IN USERSPAC M: Miklos Szeredi miklos@szeredi.hu L: linux-fsdevel@vger.kernel.org S: Maintained -W: http://fuse.sourceforge.net/ +W: https://github.com/libfuse/ T: git git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse.git F: Documentation/filesystems/fuse.rst F: fs/fuse/ @@@ -8277,7 -8284,7 +8289,7 @@@ IA64 (Itanium) PLATFOR M: Tony Luck tony.luck@intel.com M: Fenghua Yu fenghua.yu@intel.com L: linux-ia64@vger.kernel.org -S: Maintained +S: Odd Fixes T: git git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git F: Documentation/ia64/ F: arch/ia64/ @@@ -8326,9 -8333,8 +8338,9 @@@ S: Supporte F: drivers/pci/hotplug/rpaphp*
IBM Power SRIOV Virtual NIC Device Driver -M: Thomas Falcon tlfalcon@linux.ibm.com -M: John Allen jallen@linux.ibm.com +M: Dany Madden drt@linux.ibm.com +M: Lijun Pan ljp@linux.ibm.com +M: Sukadev Bhattiprolu sukadev@linux.ibm.com L: netdev@vger.kernel.org S: Supported F: drivers/net/ethernet/ibm/ibmvnic.* @@@ -8342,7 -8348,7 +8354,7 @@@ F: arch/powerpc/platforms/powernv/copy- F: arch/powerpc/platforms/powernv/vas*
IBM Power Virtual Ethernet Device Driver -M: Thomas Falcon tlfalcon@linux.ibm.com +M: Cristobal Forno cforno12@linux.ibm.com L: netdev@vger.kernel.org S: Supported F: drivers/net/ethernet/ibm/ibmveth.* @@@ -8618,9 -8624,8 +8630,9 @@@ INGENIC JZ47xx SoC M: Paul Cercueil paul@crapouillou.net S: Maintained F: arch/mips/boot/dts/ingenic/ -F: arch/mips/include/asm/mach-jz4740/ -F: arch/mips/jz4740/ +F: arch/mips/generic/board-ingenic.c +F: arch/mips/include/asm/mach-ingenic/ +F: arch/mips/ingenic/Kconfig F: drivers/clk/ingenic/ F: drivers/dma/dma-jz4780.c F: drivers/gpu/drm/ingenic/ @@@ -8855,7 -8860,7 +8867,7 @@@ INTEL IPU3 CSI-2 CIO2 DRIVE M: Yong Zhi yong.zhi@intel.com M: Sakari Ailus sakari.ailus@linux.intel.com M: Bingbu Cao bingbu.cao@intel.com -R: Tian Shu Qiu tian.shu.qiu@intel.com +R: Tianshu Qiu tian.shu.qiu@intel.com L: linux-media@vger.kernel.org S: Maintained F: Documentation/userspace-api/media/v4l/pixfmt-srggb10-ipu3.rst @@@ -8864,7 -8869,7 +8876,7 @@@ F: drivers/media/pci/intel/ipu3 INTEL IPU3 CSI-2 IMGU DRIVER M: Sakari Ailus sakari.ailus@linux.intel.com R: Bingbu Cao bingbu.cao@intel.com -R: Tian Shu Qiu tian.shu.qiu@intel.com +R: Tianshu Qiu tian.shu.qiu@intel.com L: linux-media@vger.kernel.org S: Maintained F: Documentation/admin-guide/media/ipu3.rst @@@ -9250,7 -9255,7 +9262,7 @@@ F: drivers/firmware/iscsi_ibft
ISCSI EXTENSIONS FOR RDMA (ISER) INITIATOR M: Sagi Grimberg sagi@grimberg.me -M: Max Gurtovoy maxg@nvidia.com +M: Max Gurtovoy mgurtovoy@nvidia.com L: linux-rdma@vger.kernel.org S: Supported W: http://www.openfabrics.org @@@ -9799,7 -9804,7 +9811,7 @@@ F: drivers/scsi/53c700
LEAKING_ADDRESSES M: Tobin C. Harding me@tobin.cc -M: Tycho Andersen tycho@tycho.ws +M: Tycho Andersen tycho@tycho.pizza L: kernel-hardening@lists.openwall.com S: Maintained T: git git://git.kernel.org/pub/scm/linux/kernel/git/tobin/leaks.git @@@ -10305,6 -10310,13 +10317,13 @@@ S: Maintaine W: http://linux-test-project.github.io/ T: git git://github.com/linux-test-project/ltp.git
+ LYNX PCS MODULE + M: Ioana Ciornei ioana.ciornei@nxp.com + L: netdev@vger.kernel.org + S: Supported + F: drivers/net/pcs/pcs-lynx.c + F: include/linux/pcs-lynx.h + M68K ARCHITECTURE M: Geert Uytterhoeven geert@linux-m68k.org L: linux-m68k@lists.linux-m68k.org @@@ -10512,7 -10524,7 +10531,7 @@@ M: Tobias Waldekranz <tobias@waldekranz L: netdev@vger.kernel.org S: Maintained F: Documentation/devicetree/bindings/net/marvell,mvusb.yaml - F: drivers/net/phy/mdio-mvusb.c + F: drivers/net/mdio/mdio-mvusb.c
MARVELL XENON MMC/SD/SDIO HOST CONTROLLER DRIVER M: Hu Ziji huziji@marvell.com @@@ -10659,6 -10671,15 +10678,15 @@@ L: linux-input@vger.kernel.or S: Maintained F: drivers/hid/hid-mcp2221.c
+ MCP25XXFD SPI-CAN NETWORK DRIVER + M: Marc Kleine-Budde mkl@pengutronix.de + M: Manivannan Sadhasivam manivannan.sadhasivam@linaro.org + R: Thomas Kopp thomas.kopp@microchip.com + L: linux-can@vger.kernel.org + S: Maintained + F: Documentation/devicetree/bindings/net/can/microchip,mcp25xxfd.yaml + F: drivers/net/can/spi/mcp25xxfd/ + MCP4018 AND MCP4531 MICROCHIP DIGITAL POTENTIOMETER DRIVERS M: Peter Rosin peda@axentia.se L: linux-iio@vger.kernel.org @@@ -12054,7 -12075,6 +12082,7 @@@ Q: http://patchwork.ozlabs.org/project/ T: git git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git T: git git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git F: Documentation/devicetree/bindings/net/ +F: drivers/connector/ F: drivers/net/ F: include/linux/etherdevice.h F: include/linux/fcdevice.h @@@ -12766,7 -12786,7 +12794,7 @@@ T: git git://linuxtv.org/media_tree.gi F: drivers/media/i2c/ov2685.c
OMNIVISION OV2740 SENSOR DRIVER -M: Tianshu Qiu tian.shu.qiua@intel.com +M: Tianshu Qiu tian.shu.qiu@intel.com R: Shawn Tu shawnx.tu@intel.com R: Bingbu Cao bingbu.cao@intel.com L: linux-media@vger.kernel.org @@@ -12782,12 -12802,10 +12810,12 @@@ T: git git://linuxtv.org/media_tree.gi F: drivers/media/i2c/ov5640.c
OMNIVISION OV5647 SENSOR DRIVER -M: Luis Oliveira lolivei@synopsys.com +M: Dave Stevenson dave.stevenson@raspberrypi.com +M: Jacopo Mondi jacopo@jmondi.org L: linux-media@vger.kernel.org S: Maintained T: git git://linuxtv.org/media_tree.git +F: Documentation/devicetree/bindings/media/i2c/ov5647.yaml F: drivers/media/i2c/ov5647.c
OMNIVISION OV5670 SENSOR DRIVER @@@ -13323,7 -13341,7 +13351,7 @@@ PCI DRIVER FOR SAMSUNG EXYNO M: Jingoo Han jingoohan1@gmail.com L: linux-pci@vger.kernel.org L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers) -L: linux-samsung-soc@vger.kernel.org (moderated for non-subscribers) +L: linux-samsung-soc@vger.kernel.org S: Maintained F: drivers/pci/controller/dwc/pci-exynos.c
@@@ -13456,10 -13474,10 +13484,10 @@@ F: Documentation/devicetree/bindings/pc F: drivers/pci/controller/dwc/*artpec*
PCIE DRIVER FOR CAVIUM THUNDERX -M: Robert Richter rrichter@marvell.com +M: Robert Richter rric@kernel.org L: linux-pci@vger.kernel.org L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers) -S: Supported +S: Odd Fixes F: drivers/pci/controller/pci-thunder-*
PCIE DRIVER FOR HISILICON @@@ -13731,7 -13749,7 +13759,7 @@@ M: Tomasz Figa <tomasz.figa@gmail.com M: Krzysztof Kozlowski krzk@kernel.org M: Sylwester Nawrocki s.nawrocki@samsung.com L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers) -L: linux-samsung-soc@vger.kernel.org (moderated for non-subscribers) +L: linux-samsung-soc@vger.kernel.org S: Maintained Q: https://patchwork.kernel.org/project/linux-samsung-soc/list/ T: git git://git.kernel.org/pub/scm/linux/kernel/git/pinctrl/samsung.git @@@ -13957,7 -13975,6 +13985,7 @@@ PRINT M: Petr Mladek pmladek@suse.com M: Sergey Senozhatsky sergey.senozhatsky@gmail.com R: Steven Rostedt rostedt@goodmis.org +R: John Ogness john.ogness@linutronix.de S: Maintained F: include/linux/printk.h F: kernel/printk/ @@@ -14234,18 -14251,21 +14262,18 @@@ M: Nilesh Javali <njavali@marvell.com M: GR-QLogic-Storage-Upstream@marvell.com L: linux-scsi@vger.kernel.org S: Supported -F: Documentation/scsi/LICENSE.qla2xxx F: drivers/scsi/qla2xxx/
QLOGIC QLA3XXX NETWORK DRIVER M: GR-Linux-NIC-Dev@marvell.com L: netdev@vger.kernel.org S: Supported -F: Documentation/networking/device_drivers/ethernet/qlogic/LICENSE.qla3xxx F: drivers/net/ethernet/qlogic/qla3xxx.*
QLOGIC QLA4XXX iSCSI DRIVER M: QLogic-Storage-Upstream@qlogic.com L: linux-scsi@vger.kernel.org S: Supported -F: Documentation/scsi/LICENSE.qla4xxx F: drivers/scsi/qla4xxx/
QLOGIC QLCNIC (1/10)Gb ETHERNET DRIVER @@@ -14396,7 -14416,7 +14424,7 @@@ M: Rob Clark <robdclark@gmail.com L: iommu@lists.linux-foundation.org L: linux-arm-msm@vger.kernel.org S: Maintained -F: drivers/iommu/qcom_iommu.c +F: drivers/iommu/arm/arm-smmu/qcom_iommu.c
QUALCOMM IPCC MAILBOX DRIVER M: Manivannan Sadhasivam manivannan.sadhasivam@linaro.org @@@ -14593,9 -14613,9 +14621,9 @@@ M: Niklas Söderlund <niklas.soderlund+ L: linux-media@vger.kernel.org S: Maintained F: Documentation/devicetree/bindings/media/i2c/imi,rdacm2x-gmsl.yaml -F: drivers/media/i2c/rdacm20.c F: drivers/media/i2c/max9271.c F: drivers/media/i2c/max9271.h +F: drivers/media/i2c/rdacm20.c
RDC R-321X SoC M: Florian Fainelli florian@openwrt.org @@@ -14889,7 -14909,6 +14917,7 @@@ F: include/linux/hid-roccat
ROCKCHIP ISP V1 DRIVER M: Helen Koike helen.koike@collabora.com +M: Dafna Hirschfeld dafna.hirschfeld@collabora.com L: linux-media@vger.kernel.org S: Maintained F: drivers/staging/media/rkisp1/ @@@ -15276,16 -15295,17 +15304,17 @@@ F: include/linux/mfd/samsung SAMSUNG S3C24XX/S3C64XX SOC SERIES CAMIF DRIVER M: Sylwester Nawrocki sylvester.nawrocki@gmail.com L: linux-media@vger.kernel.org -L: linux-samsung-soc@vger.kernel.org (moderated for non-subscribers) +L: linux-samsung-soc@vger.kernel.org S: Maintained F: drivers/media/platform/s3c-camif/ F: include/media/drv-intf/s3c_camif.h
SAMSUNG S3FWRN5 NFC DRIVER - M: Robert Baldyga r.baldyga@samsung.com + M: Krzysztof Kozlowski krzk@kernel.org M: Krzysztof Opasiak k.opasiak@samsung.com L: linux-nfc@lists.01.org (moderated for non-subscribers) - S: Supported + S: Maintained + F: Documentation/devicetree/bindings/net/nfc/samsung,s3fwrn5.yaml F: drivers/nfc/s3fwrn5
SAMSUNG S5C73M3 CAMERA DRIVER @@@ -15325,7 -15345,7 +15354,7 @@@ SAMSUNG SOC CLOCK DRIVER M: Sylwester Nawrocki s.nawrocki@samsung.com M: Tomasz Figa tomasz.figa@gmail.com M: Chanwoo Choi cw00.choi@samsung.com -L: linux-samsung-soc@vger.kernel.org (moderated for non-subscribers) +L: linux-samsung-soc@vger.kernel.org S: Supported T: git git://git.kernel.org/pub/scm/linux/kernel/git/snawrocki/clk.git F: Documentation/devicetree/bindings/clock/exynos*.txt @@@ -15333,20 -15353,17 +15362,20 @@@ F: Documentation/devicetree/bindings/cl F: Documentation/devicetree/bindings/clock/samsung,s5p* F: drivers/clk/samsung/ F: include/dt-bindings/clock/exynos*.h +F: include/linux/clk/samsung.h +F: include/linux/platform_data/clk-s3c2410.h
SAMSUNG SPI DRIVERS M: Kukjin Kim kgene@kernel.org M: Krzysztof Kozlowski krzk@kernel.org M: Andi Shyti andi@etezian.org L: linux-spi@vger.kernel.org -L: linux-samsung-soc@vger.kernel.org (moderated for non-subscribers) +L: linux-samsung-soc@vger.kernel.org S: Maintained F: Documentation/devicetree/bindings/spi/spi-samsung.txt F: drivers/spi/spi-s3c* F: include/linux/platform_data/spi-s3c64xx.h +F: include/linux/spi/s3c24xx-fiq.h
SAMSUNG SXGBE DRIVERS M: Byungho An bh74.an@samsung.com @@@ -15581,7 -15598,6 +15610,7 @@@ F: include/uapi/linux/sed SECURITY CONTACT M: Security Officers security@kernel.org S: Supported +F: Documentation/admin-guide/security-bugs.rst
SECURITY SUBSYSTEM M: James Morris jmorris@namei.org @@@ -15673,6 -15689,7 +15702,7 @@@ L: netdev@vger.kernel.or S: Maintained F: drivers/net/phy/phylink.c F: drivers/net/phy/sfp* + F: include/linux/mdio/mdio-i2c.h F: include/linux/phylink.h F: include/linux/sfp.h K: phylink.h|struct\s+phylink|.phylink|>phylink_|phylink_(autoneg|clear|connect|create|destroy|disconnect|ethtool|helper|mac|mii|of|set|start|stop|test|validate) @@@ -15861,17 -15878,19 +15891,17 @@@ F: drivers/video/fbdev/simplefb. F: include/linux/platform_data/simplefb.h
SIMTEC EB110ATX (Chalice CATS) -M: Vincent Sanders vince@simtec.co.uk M: Simtec Linux Team linux@simtec.co.uk S: Supported W: http://www.simtec.co.uk/products/EB110ATX/
SIMTEC EB2410ITX (BAST) -M: Vincent Sanders vince@simtec.co.uk M: Simtec Linux Team linux@simtec.co.uk S: Supported W: http://www.simtec.co.uk/products/EB2410ITX/ -F: arch/arm/mach-s3c24xx/bast-ide.c -F: arch/arm/mach-s3c24xx/bast-irq.c -F: arch/arm/mach-s3c24xx/mach-bast.c +F: arch/arm/mach-s3c/bast-ide.c +F: arch/arm/mach-s3c/bast-irq.c +F: arch/arm/mach-s3c/mach-bast.c
SIOX M: Thorsten Scherer t.scherer@eckelmann.de @@@ -16077,6 -16096,7 +16107,6 @@@ F: include/uapi/rdma/rdma_user_rxe. SOFTLOGIC 6x10 MPEG CODEC M: Bluecherry Maintainers maintainers@bluecherrydvr.com M: Anton Sviridenko anton@corp.bluecherry.net -M: Andrey Utkin andrey.utkin@corp.bluecherry.net M: Andrey Utkin andrey_utkin@fastmail.com M: Ismael Luceno ismael@iodev.co.uk L: linux-media@vger.kernel.org @@@ -16754,8 -16774,8 +16784,8 @@@ SYNOPSYS DESIGNWARE ETHERNET XPCS DRIVE M: Jose Abreu Jose.Abreu@synopsys.com L: netdev@vger.kernel.org S: Supported - F: drivers/net/phy/mdio-xpcs.c - F: include/linux/mdio-xpcs.h + F: drivers/net/pcs/pcs-xpcs.c + F: include/linux/pcs/pcs-xpcs.h
SYNOPSYS DESIGNWARE I2C DRIVER M: Jarkko Nikula jarkko.nikula@linux.intel.com @@@ -17247,8 -17267,8 +17277,8 @@@ S: Maintaine F: drivers/net/thunderbolt.c
THUNDERX GPIO DRIVER -M: Robert Richter rrichter@marvell.com -S: Maintained +M: Robert Richter rric@kernel.org +S: Odd Fixes F: drivers/gpio/gpio-thunderx.c
TI AM437X VPFE DRIVER @@@ -17731,7 -17751,6 +17761,7 @@@ S: Supporte W: http://www.linux-mtd.infradead.org/doc/ubifs.html T: git git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs.git next T: git git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs.git fixes +F: Documentation/filesystems/ubifs-authentication.rst F: Documentation/filesystems/ubifs.rst F: fs/ubifs/
@@@ -18125,6 -18144,14 +18155,6 @@@ T: git git://linuxtv.org/media_tree.gi F: drivers/media/usb/uvc/ F: include/uapi/linux/uvcvideo.h
-USB VISION DRIVER -M: Hans Verkuil hverkuil@xs4all.nl -L: linux-media@vger.kernel.org -S: Odd Fixes -W: https://linuxtv.org -T: git git://linuxtv.org/media_tree.git -F: drivers/staging/media/usbvision/ - USB WEBCAM GADGET M: Laurent Pinchart laurent.pinchart@ideasonboard.com L: linux-usb@vger.kernel.org @@@ -18323,8 -18350,10 +18353,8 @@@ S: Maintaine F: drivers/media/platform/video-mux.c
VIDEOBUF2 FRAMEWORK -M: Pawel Osciak pawel@osciak.com +M: Tomasz Figa tfiga@chromium.org M: Marek Szyprowski m.szyprowski@samsung.com -M: Kyungmin Park kyungmin.park@samsung.com -R: Tomasz Figa tfiga@chromium.org L: linux-media@vger.kernel.org S: Maintained F: drivers/media/common/videobuf2/* @@@ -18514,14 -18543,6 +18544,14 @@@ W: https://linuxtv.or T: git git://linuxtv.org/media_tree.git F: drivers/media/test-drivers/vivid/*
+VIDTV VIRTUAL DIGITAL TV DRIVER +M: Daniel W. S. Almeida dwlsalmeida@gmail.com +L: linux-media@vger.kernel.org +S: Maintained +W: https://linuxtv.org +T: git git://linuxtv.org/media_tree.git +F: drivers/media/test-drivers/vidtv/* + VLYNQ BUS M: Florian Fainelli f.fainelli@gmail.com L: openwrt-devel@lists.openwrt.org (subscribers-only) @@@ -18788,7 -18809,7 +18818,7 @@@ F: Documentation/devicetree/bindings/mf F: Documentation/devicetree/bindings/regulator/wlf,arizona.yaml F: Documentation/devicetree/bindings/sound/wlf,arizona.yaml F: Documentation/hwmon/wm83??.rst -F: arch/arm/mach-s3c64xx/mach-crag6410* +F: arch/arm/mach-s3c/mach-crag6410* F: drivers/clk/clk-wm83*.c F: drivers/extcon/extcon-arizona.c F: drivers/gpio/gpio-*wm*.c diff --combined arch/arm64/boot/dts/exynos/exynos5433-tm2-common.dtsi index 6246cce2a15e,24aab3ea3f52..829fea23d4ab --- a/arch/arm64/boot/dts/exynos/exynos5433-tm2-common.dtsi +++ b/arch/arm64/boot/dts/exynos/exynos5433-tm2-common.dtsi @@@ -87,8 -87,8 +87,8 @@@
i2c_max98504: i2c-gpio-0 { compatible = "i2c-gpio"; - gpios = <&gpd0 1 GPIO_ACTIVE_HIGH /* SPK_AMP_SDA */ - &gpd0 0 GPIO_ACTIVE_HIGH /* SPK_AMP_SCL */ >; + sda-gpios = <&gpd0 1 GPIO_ACTIVE_HIGH>; + scl-gpios = <&gpd0 0 GPIO_ACTIVE_HIGH>; i2c-gpio,delay-us = <2>; #address-cells = <1>; #size-cells = <0>; @@@ -795,8 -795,8 +795,8 @@@ reg = <0x27>; interrupt-parent = <&gpa1>; interrupts = <3 IRQ_TYPE_LEVEL_HIGH>; - s3fwrn5,en-gpios = <&gpf1 4 GPIO_ACTIVE_HIGH>; - s3fwrn5,fw-gpios = <&gpj0 2 GPIO_ACTIVE_HIGH>; + en-gpios = <&gpf1 4 GPIO_ACTIVE_HIGH>; + wake-gpios = <&gpj0 2 GPIO_ACTIVE_HIGH>; }; };
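The exynos5433-tm2 hunk above moves the i2c-gpio node from the deprecated anonymous gpios list to dedicated sda-gpios/scl-gpios properties. As a rough consumer-side sketch (not the actual i2c-gpio driver code), such named properties are resolved by connection ID, so a driver can fetch each line explicitly:

#include <linux/gpio/consumer.h>
#include <linux/platform_device.h>

static int example_probe(struct platform_device *pdev)
{
	struct gpio_desc *sda, *scl;

	/* gpiod_get(dev, "sda", ...) resolves the "sda-gpios" property */
	sda = devm_gpiod_get(&pdev->dev, "sda", GPIOD_OUT_HIGH_OPEN_DRAIN);
	if (IS_ERR(sda))
		return PTR_ERR(sda);

	scl = devm_gpiod_get(&pdev->dev, "scl", GPIOD_OUT_HIGH_OPEN_DRAIN);
	if (IS_ERR(scl))
		return PTR_ERR(scl);

	return 0;
}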
diff --combined drivers/net/dsa/microchip/ksz9477.c index 2f5506ac7d19,b62dd64470a8..153664bf0e20 --- a/drivers/net/dsa/microchip/ksz9477.c +++ b/drivers/net/dsa/microchip/ksz9477.c @@@ -1208,7 -1208,7 +1208,7 @@@ static void ksz9477_port_setup(struct k
/* configure MAC to 1G & RGMII mode */ ksz_pread8(dev, port, REG_PORT_XMII_CTRL_1, &data8); - switch (dev->interface) { + switch (p->interface) { case PHY_INTERFACE_MODE_MII: ksz9477_set_xmii(dev, 0, &data8); ksz9477_set_gbit(dev, false, &data8); @@@ -1229,12 -1229,15 +1229,15 @@@ ksz9477_set_gbit(dev, true, &data8); data8 &= ~PORT_RGMII_ID_IG_ENABLE; data8 &= ~PORT_RGMII_ID_EG_ENABLE; - if (dev->interface == PHY_INTERFACE_MODE_RGMII_ID || - dev->interface == PHY_INTERFACE_MODE_RGMII_RXID) + if (p->interface == PHY_INTERFACE_MODE_RGMII_ID || + p->interface == PHY_INTERFACE_MODE_RGMII_RXID) data8 |= PORT_RGMII_ID_IG_ENABLE; - if (dev->interface == PHY_INTERFACE_MODE_RGMII_ID || - dev->interface == PHY_INTERFACE_MODE_RGMII_TXID) + if (p->interface == PHY_INTERFACE_MODE_RGMII_ID || + p->interface == PHY_INTERFACE_MODE_RGMII_TXID) data8 |= PORT_RGMII_ID_EG_ENABLE; + /* On KSZ9893, disable RGMII in-band status support */ + if (dev->features & IS_9893) + data8 &= ~PORT_MII_MAC_MODE; p->phydev.speed = SPEED_1000; break; } @@@ -1265,36 -1268,37 +1268,46 @@@ static void ksz9477_config_cpu_port(str for (i = 0; i < dev->port_cnt; i++) { if (dsa_is_cpu_port(ds, i) && (dev->cpu_ports & (1 << i))) { phy_interface_t interface; + const char *prev_msg; + const char *prev_mode;
dev->cpu_port = i; dev->host_mask = (1 << dev->cpu_port); dev->port_mask |= dev->host_mask; + p = &dev->ports[i];
/* Read from XMII register to determine host port * interface. If set specifically in device tree * note the difference to help debugging. */ interface = ksz9477_get_interface(dev, i); - if (!dev->interface) - dev->interface = interface; - if (interface && interface != dev->interface) { + if (!p->interface) { + if (dev->compat_interface) { + dev_warn(dev->dev, + "Using legacy switch \"phy-mode\" property, because it is missing on port %d node. " + "Please update your device tree.\n", + i); + p->interface = dev->compat_interface; + } else { + p->interface = interface; + } + } - if (interface && interface != p->interface) - dev_info(dev->dev, - "use %s instead of %s\n", - phy_modes(p->interface), - phy_modes(interface)); ++ if (interface && interface != p->interface) { + prev_msg = " instead of "; + prev_mode = phy_modes(interface); + } else { + prev_msg = ""; + prev_mode = ""; + } + dev_info(dev->dev, + "Port%d: using phy mode %s%s%s\n", + i, - phy_modes(dev->interface), ++ phy_modes(p->interface), + prev_msg, + prev_mode);
/* enable cpu port */ ksz9477_port_setup(dev, i, true); - p = &dev->ports[dev->cpu_port]; p->vid_member = dev->port_mask; p->on = 1; } @@@ -1435,10 -1439,12 +1448,12 @@@ static int ksz9477_switch_detect(struc /* Default capability is gigabit capable. */ dev->features = GBIT_SUPPORT;
+ dev_dbg(dev->dev, "Switch detect: ID=%08x%02x\n", id32, data8); id_hi = (u8)(id32 >> 16); id_lo = (u8)(id32 >> 8); if ((id_lo & 0xf) == 3) { /* Chip is from KSZ9893 design. */ + dev_info(dev->dev, "Found KSZ9893\n"); dev->features |= IS_9893;
/* Chip does not support gigabit. */ @@@ -1447,6 -1453,7 +1462,7 @@@ dev->mib_port_cnt = 3; dev->phy_port_cnt = 2; } else { + dev_info(dev->dev, "Found KSZ9477 or compatible\n"); /* Chip uses new XMII register definitions. */ dev->features |= NEW_XMII;
diff --combined drivers/net/dsa/microchip/ksz_common.c index 8e755b50c9c1,a31738662d95..cb534547c715 --- a/drivers/net/dsa/microchip/ksz_common.c +++ b/drivers/net/dsa/microchip/ksz_common.c @@@ -388,8 -388,6 +388,8 @@@ int ksz_switch_register(struct ksz_devi const struct ksz_dev_ops *ops) { phy_interface_t interface; + struct device_node *port; + unsigned int port_num; int ret;
if (dev->pdata) @@@ -402,8 -400,9 +402,9 @@@
if (dev->reset_gpio) { gpiod_set_value_cansleep(dev->reset_gpio, 1); - mdelay(10); + usleep_range(10000, 12000); gpiod_set_value_cansleep(dev->reset_gpio, 0); + usleep_range(100, 1000); }
mutex_init(&dev->dev_mutex); @@@ -423,19 -422,10 +424,19 @@@ /* Host port interface will be self detected, or specifically set in * device tree. */ + for (port_num = 0; port_num < dev->port_cnt; ++port_num) + dev->ports[port_num].interface = PHY_INTERFACE_MODE_NA; if (dev->dev->of_node) { ret = of_get_phy_mode(dev->dev->of_node, &interface); if (ret == 0) - dev->interface = interface; + dev->compat_interface = interface; + for_each_available_child_of_node(dev->dev->of_node, port) { + if (of_property_read_u32(port, "reg", &port_num)) + continue; + if (port_num >= dev->port_cnt) { + of_node_put(port); + return -EINVAL; + } + of_get_phy_mode(port, &dev->ports[port_num].interface); + } dev->synclko_125 = of_property_read_bool(dev->dev->of_node, "microchip,synclko-125"); } diff --combined drivers/net/dsa/ocelot/felix.c index 01427cd08448,f9a7034be0c7..5f395d4119ac --- a/drivers/net/dsa/ocelot/felix.c +++ b/drivers/net/dsa/ocelot/felix.c @@@ -19,6 -19,7 +19,7 @@@ #include <linux/of_net.h> #include <linux/pci.h> #include <linux/of.h> + #include <linux/pcs-lynx.h> #include <net/pkt_sched.h> #include <net/dsa.h> #include "felix.h" @@@ -196,27 -197,16 +197,16 @@@ static void felix_phylink_validate(stru felix->info->phylink_validate(ocelot, port, supported, state); }
- static int felix_phylink_mac_pcs_get_state(struct dsa_switch *ds, int port, - struct phylink_link_state *state) - { - struct ocelot *ocelot = ds->priv; - struct felix *felix = ocelot_to_felix(ocelot); - - if (felix->info->pcs_link_state) - felix->info->pcs_link_state(ocelot, port, state); - - return 0; - } - static void felix_phylink_mac_config(struct dsa_switch *ds, int port, unsigned int link_an_mode, const struct phylink_link_state *state) { struct ocelot *ocelot = ds->priv; struct felix *felix = ocelot_to_felix(ocelot); + struct dsa_port *dp = dsa_to_port(ds, port);
- if (felix->info->pcs_config) - felix->info->pcs_config(ocelot, port, link_an_mode, state); + if (felix->pcs[port]) + phylink_set_pcs(dp->pl, &felix->pcs[port]->pcs); }
static void felix_phylink_mac_link_down(struct dsa_switch *ds, int port, @@@ -306,10 -296,6 +296,6 @@@ static void felix_phylink_mac_link_up(s ocelot_fields_write(ocelot, port, QSYS_SWITCH_PORT_MODE_PORT_ENA, 1);
- if (felix->info->pcs_link_up) - felix->info->pcs_link_up(ocelot, port, link_an_mode, interface, - speed, duplex); - if (felix->info->port_sched_speed_set) felix->info->port_sched_speed_set(ocelot, port, speed); } @@@ -552,23 -538,6 +538,6 @@@ static int felix_init_structs(struct fe return 0; }
- static struct ptp_clock_info ocelot_ptp_clock_info = { - .owner = THIS_MODULE, - .name = "felix ptp", - .max_adj = 0x7fffffff, - .n_alarm = 0, - .n_ext_ts = 0, - .n_per_out = OCELOT_PTP_PINS_NUM, - .n_pins = OCELOT_PTP_PINS_NUM, - .pps = 0, - .gettime64 = ocelot_ptp_gettime64, - .settime64 = ocelot_ptp_settime64, - .adjtime = ocelot_ptp_adjtime, - .adjfine = ocelot_ptp_adjfine, - .verify = ocelot_ptp_verify, - .enable = ocelot_ptp_enable, - }; - /* Hardware initialization done here so that we can allocate structures with * devm without fear of dsa_register_switch returning -EPROBE_DEFER and causing * us to allocate structures twice (leak memory) and map PCI memory twice @@@ -585,12 -554,9 +554,12 @@@ static int felix_setup(struct dsa_switc if (err) return err;
- ocelot_init(ocelot); + err = ocelot_init(ocelot); + if (err) + return err; + if (ocelot->ptp) { - err = ocelot_init_timestamp(ocelot, &ocelot_ptp_clock_info); + err = ocelot_init_timestamp(ocelot, felix->info->ptp_caps); if (err) { dev_err(ocelot->dev, "Timestamp initialization failed\n"); @@@ -630,11 -596,6 +599,6 @@@
ds->mtu_enforcement_ingress = true; ds->configure_vlan_while_not_filtering = true; - /* It looks like the MAC/PCS interrupt register - PM0_IEVENT (0x8040) - * isn't instantiated for the Felix PF. - * In-band AN may take a few ms to complete, so we need to poll. - */ - ds->pcs_poll = true;
return 0; } @@@ -643,13 -604,10 +607,13 @@@ static void felix_teardown(struct dsa_s { struct ocelot *ocelot = ds->priv; struct felix *felix = ocelot_to_felix(ocelot); + int port;
if (felix->info->mdio_bus_free) felix->info->mdio_bus_free(ocelot);
+ for (port = 0; port < ocelot->num_phys_ports; port++) + ocelot_deinit_port(ocelot, port); ocelot_deinit_timestamp(ocelot); /* stop workqueue thread */ ocelot_deinit(ocelot); @@@ -793,7 -751,6 +757,6 @@@ const struct dsa_switch_ops felix_switc .get_sset_count = felix_get_sset_count, .get_ts_info = felix_get_ts_info, .phylink_validate = felix_phylink_validate, - .phylink_mac_link_state = felix_phylink_mac_pcs_get_state, .phylink_mac_config = felix_phylink_mac_config, .phylink_mac_link_down = felix_phylink_mac_link_down, .phylink_mac_link_up = felix_phylink_mac_link_up, @@@ -823,31 -780,5 +786,5 @@@ .cls_flower_add = felix_cls_flower_add, .cls_flower_del = felix_cls_flower_del, .cls_flower_stats = felix_cls_flower_stats, - .port_setup_tc = felix_port_setup_tc, + .port_setup_tc = felix_port_setup_tc, }; - - static int __init felix_init(void) - { - int err; - - err = pci_register_driver(&felix_vsc9959_pci_driver); - if (err) - return err; - - err = platform_driver_register(&seville_vsc9953_driver); - if (err) - return err; - - return 0; - } - module_init(felix_init); - - static void __exit felix_exit(void) - { - pci_unregister_driver(&felix_vsc9959_pci_driver); - platform_driver_unregister(&seville_vsc9953_driver); - } - module_exit(felix_exit); - - MODULE_DESCRIPTION("Felix Switch driver"); - MODULE_LICENSE("GPL v2"); diff --combined drivers/net/dsa/ocelot/seville_vsc9953.c index 0fdeff22a76c,12c7fb9c2c3f..650f7c0e6e6a --- a/drivers/net/dsa/ocelot/seville_vsc9953.c +++ b/drivers/net/dsa/ocelot/seville_vsc9953.c @@@ -7,6 -7,7 +7,7 @@@ #include <soc/mscc/ocelot_sys.h> #include <soc/mscc/ocelot.h> #include <linux/of_platform.h> + #include <linux/pcs-lynx.h> #include <linux/packing.h> #include <linux/iopoll.h> #include "felix.h" @@@ -15,23 -16,12 +16,12 @@@ #define VSC9953_VCAP_IS2_ENTRY_WIDTH 376 #define VSC9953_VCAP_PORT_CNT 10
- #define MSCC_MIIM_REG_STATUS 0x0 - #define MSCC_MIIM_STATUS_STAT_BUSY BIT(3) - #define MSCC_MIIM_REG_CMD 0x8 - #define MSCC_MIIM_CMD_OPR_WRITE BIT(1) - #define MSCC_MIIM_CMD_OPR_READ BIT(2) - #define MSCC_MIIM_CMD_WRDATA_SHIFT 4 - #define MSCC_MIIM_CMD_REGAD_SHIFT 20 - #define MSCC_MIIM_CMD_PHYAD_SHIFT 25 - #define MSCC_MIIM_CMD_VLD BIT(31) - #define MSCC_MIIM_REG_DATA 0xC - #define MSCC_MIIM_DATA_ERROR (BIT(16) | BIT(17)) - - #define MSCC_PHY_REG_PHY_CFG 0x0 - #define PHY_CFG_PHY_ENA (BIT(0) | BIT(1) | BIT(2) | BIT(3)) - #define PHY_CFG_PHY_COMMON_RESET BIT(4) - #define PHY_CFG_PHY_RESET (BIT(5) | BIT(6) | BIT(7) | BIT(8)) - #define MSCC_PHY_REG_PHY_STATUS 0x4 + #define MSCC_MIIM_CMD_OPR_WRITE BIT(1) + #define MSCC_MIIM_CMD_OPR_READ BIT(2) + #define MSCC_MIIM_CMD_WRDATA_SHIFT 4 + #define MSCC_MIIM_CMD_REGAD_SHIFT 20 + #define MSCC_MIIM_CMD_PHYAD_SHIFT 25 + #define MSCC_MIIM_CMD_VLD BIT(31)
static const u32 vsc9953_ana_regmap[] = { REG(ANA_ADVLEARN, 0x00b500), @@@ -819,6 -809,10 +809,10 @@@ out return err; }
+ /* CORE_ENA is in SYS:SYSTEM:RESET_CFG + * MEM_INIT is in SYS:SYSTEM:RESET_CFG + * MEM_ENA is in SYS:SYSTEM:RESET_CFG + */ static int vsc9953_reset(struct ocelot *ocelot) { int val, err; @@@ -834,8 -828,8 +828,8 @@@ }
/* initialize switch mem ~40us */ - ocelot_field_write(ocelot, SYS_RESET_CFG_MEM_INIT, 1); ocelot_field_write(ocelot, SYS_RESET_CFG_MEM_ENA, 1); + ocelot_field_write(ocelot, SYS_RESET_CFG_MEM_INIT, 1);
err = readx_poll_timeout(vsc9953_sys_ram_init_status, ocelot, val, !val, VSC9953_SYS_RAMINIT_SLEEP, @@@ -846,7 -840,6 +840,6 @@@ }
/* enable switch core */ - ocelot_field_write(ocelot, SYS_RESET_CFG_MEM_ENA, 1); ocelot_field_write(ocelot, SYS_RESET_CFG_CORE_ENA, 1);
return 0; @@@ -960,18 -953,27 +953,27 @@@ static int vsc9953_mdio_bus_alloc(struc
for (port = 0; port < felix->info->num_ports; port++) { struct ocelot_port *ocelot_port = ocelot->ports[port]; - struct phy_device *pcs; int addr = port + 4; + struct mdio_device *pcs; + struct lynx_pcs *lynx; + + if (dsa_is_unused_port(felix->ds, port)) + continue;
if (ocelot_port->phy_mode == PHY_INTERFACE_MODE_INTERNAL) continue;
- pcs = get_phy_device(felix->imdio, addr, false); + pcs = mdio_device_create(felix->imdio, addr); if (IS_ERR(pcs)) continue;
- pcs->interface = ocelot_port->phy_mode; - felix->pcs[port] = pcs; + lynx = lynx_pcs_create(pcs); + if (!lynx) { + mdio_device_free(pcs); + continue; + } + + felix->pcs[port] = lynx;
dev_info(dev, "Found PCS at internal MDIO address %d\n", addr); } @@@ -979,6 -981,23 +981,23 @@@ return 0; }
+ static void vsc9953_mdio_bus_free(struct ocelot *ocelot) + { + struct felix *felix = ocelot_to_felix(ocelot); + int port; + + for (port = 0; port < ocelot->num_phys_ports; port++) { + struct lynx_pcs *pcs = felix->pcs[port]; + + if (!pcs) + continue; + + mdio_device_free(pcs->mdio); + lynx_pcs_destroy(pcs); + } + mdiobus_unregister(felix->imdio); + } + static void vsc9953_xmit_template_populate(struct ocelot *ocelot, int port) { struct ocelot_port *ocelot_port = ocelot->ports[port]; @@@ -1008,14 -1027,11 +1027,11 @@@ static const struct felix_info seville_ .vcap_is2_keys = vsc9953_vcap_is2_keys, .vcap_is2_actions = vsc9953_vcap_is2_actions, .vcap = vsc9953_vcap_props, - .shared_queue_sz = 128 * 1024, + .shared_queue_sz = 2048 * 1024, .num_mact_rows = 2048, .num_ports = 10, .mdio_bus_alloc = vsc9953_mdio_bus_alloc, - .mdio_bus_free = vsc9959_mdio_bus_free, - .pcs_config = vsc9959_pcs_config, - .pcs_link_up = vsc9959_pcs_link_up, - .pcs_link_state = vsc9959_pcs_link_state, + .mdio_bus_free = vsc9953_mdio_bus_free, .phylink_validate = vsc9953_phylink_validate, .prevalidate_phy_mode = vsc9953_prevalidate_phy_mode, .xmit_template_populate = vsc9953_xmit_template_populate, @@@ -1094,7 -1110,7 +1110,7 @@@ static const struct of_device_id sevill }; MODULE_DEVICE_TABLE(of, seville_of_match);
- struct platform_driver seville_vsc9953_driver = { + static struct platform_driver seville_vsc9953_driver = { .probe = seville_probe, .remove = seville_remove, .driver = { @@@ -1102,3 -1118,7 +1118,7 @@@ .of_match_table = of_match_ptr(seville_of_match), }, }; + module_platform_driver(seville_vsc9953_driver); + + MODULE_DESCRIPTION("Seville Switch driver"); + MODULE_LICENSE("GPL v2"); diff --combined drivers/net/dsa/rtl8366.c index a8c5a934c3d3,7c09ed747bc0..c58ca324a4b2 --- a/drivers/net/dsa/rtl8366.c +++ b/drivers/net/dsa/rtl8366.c @@@ -36,12 -36,113 +36,113 @@@ int rtl8366_mc_is_used(struct realtek_s } EXPORT_SYMBOL_GPL(rtl8366_mc_is_used);
+ /** + * rtl8366_obtain_mc() - retrieve or allocate a VLAN member configuration + * @smi: the Realtek SMI device instance + * @vid: the VLAN ID to look up or allocate + * @vlanmc: the member config will be filled in with a valid entry + * if successful + * Return: index of the obtained member config or a negative error number + */ + static int rtl8366_obtain_mc(struct realtek_smi *smi, int vid, + struct rtl8366_vlan_mc *vlanmc) + { + struct rtl8366_vlan_4k vlan4k; + int ret; + int i; + + /* Try to find an existing member config entry for this VID */ + for (i = 0; i < smi->num_vlan_mc; i++) { + ret = smi->ops->get_vlan_mc(smi, i, vlanmc); + if (ret) { + dev_err(smi->dev, "error searching for VLAN MC %d for VID %d\n", + i, vid); + return ret; + } + + if (vid == vlanmc->vid) + return i; + } + + /* We have no MC entry for this VID, try to find an empty one */ + for (i = 0; i < smi->num_vlan_mc; i++) { + ret = smi->ops->get_vlan_mc(smi, i, vlanmc); + if (ret) { + dev_err(smi->dev, "error searching for VLAN MC %d for VID %d\n", + i, vid); + return ret; + } + + if (vlanmc->vid == 0 && vlanmc->member == 0) { + /* Update the entry from the 4K table */ + ret = smi->ops->get_vlan_4k(smi, vid, &vlan4k); + if (ret) { + dev_err(smi->dev, "error looking for 4K VLAN MC %d for VID %d\n", + i, vid); + return ret; + } + + vlanmc->vid = vid; + vlanmc->member = vlan4k.member; + vlanmc->untag = vlan4k.untag; + vlanmc->fid = vlan4k.fid; + ret = smi->ops->set_vlan_mc(smi, i, vlanmc); + if (ret) { + dev_err(smi->dev, "unable to set/update VLAN MC %d for VID %d\n", + i, vid); + return ret; + } + + dev_dbg(smi->dev, "created new MC at index %d for VID %d\n", + i, vid); + return i; + } + } + + /* MC table is full, try to find an unused entry and replace it */ + for (i = 0; i < smi->num_vlan_mc; i++) { + int used; + + ret = rtl8366_mc_is_used(smi, i, &used); + if (ret) + return ret; + + if (!used) { + /* Update the entry from the 4K table */ + ret = smi->ops->get_vlan_4k(smi, vid, &vlan4k); + if (ret) + return ret; + + vlanmc->vid = vid; + vlanmc->member = vlan4k.member; + vlanmc->untag = vlan4k.untag; + vlanmc->fid = vlan4k.fid; + ret = smi->ops->set_vlan_mc(smi, i, vlanmc); + if (ret) { + dev_err(smi->dev, "unable to set/update VLAN MC %d for VID %d\n", + i, vid); + return ret; + } + dev_dbg(smi->dev, "recycled MC at index %d for VID %d\n", + i, vid); + return i; + } + } + + dev_err(smi->dev, "all VLAN member configurations are in use\n"); + return -ENOSPC; + } + int rtl8366_set_vlan(struct realtek_smi *smi, int vid, u32 member, u32 untag, u32 fid) { + struct rtl8366_vlan_mc vlanmc; struct rtl8366_vlan_4k vlan4k; + int mc; int ret; - int i; + + if (!smi->ops->is_vlan_valid(smi, vid)) + return -EINVAL;
dev_dbg(smi->dev, "setting VLAN%d 4k members: 0x%02x, untagged: 0x%02x\n", @@@ -63,133 -164,58 +164,58 @@@ "resulting VLAN%d 4k members: 0x%02x, untagged: 0x%02x\n", vid, vlan4k.member, vlan4k.untag);
- /* Try to find an existing MC entry for this VID */ - for (i = 0; i < smi->num_vlan_mc; i++) { - struct rtl8366_vlan_mc vlanmc; - - ret = smi->ops->get_vlan_mc(smi, i, &vlanmc); - if (ret) - return ret; - - if (vid == vlanmc.vid) { - /* update the MC entry */ - vlanmc.member |= member; - vlanmc.untag |= untag; - vlanmc.fid = fid; - - ret = smi->ops->set_vlan_mc(smi, i, &vlanmc); + /* Find or allocate a member config for this VID */ + ret = rtl8366_obtain_mc(smi, vid, &vlanmc); + if (ret < 0) + return ret; + mc = ret;
- dev_dbg(smi->dev, - "resulting VLAN%d MC members: 0x%02x, untagged: 0x%02x\n", - vid, vlanmc.member, vlanmc.untag); + /* Update the MC entry */ + vlanmc.member |= member; + vlanmc.untag |= untag; + vlanmc.fid = fid;
- break; - } - } + /* Commit updates to the MC entry */ + ret = smi->ops->set_vlan_mc(smi, mc, &vlanmc); + if (ret) + dev_err(smi->dev, "failed to commit changes to VLAN MC index %d for VID %d\n", + mc, vid); + else + dev_dbg(smi->dev, + "resulting VLAN%d MC members: 0x%02x, untagged: 0x%02x\n", + vid, vlanmc.member, vlanmc.untag);
return ret; } EXPORT_SYMBOL_GPL(rtl8366_set_vlan);
- int rtl8366_get_pvid(struct realtek_smi *smi, int port, int *val) - { - struct rtl8366_vlan_mc vlanmc; - int ret; - int index; - - ret = smi->ops->get_mc_index(smi, port, &index); - if (ret) - return ret; - - ret = smi->ops->get_vlan_mc(smi, index, &vlanmc); - if (ret) - return ret; - - *val = vlanmc.vid; - return 0; - } - EXPORT_SYMBOL_GPL(rtl8366_get_pvid); - int rtl8366_set_pvid(struct realtek_smi *smi, unsigned int port, unsigned int vid) { struct rtl8366_vlan_mc vlanmc; - struct rtl8366_vlan_4k vlan4k; + int mc; int ret; - int i; - - /* Try to find an existing MC entry for this VID */ - for (i = 0; i < smi->num_vlan_mc; i++) { - ret = smi->ops->get_vlan_mc(smi, i, &vlanmc); - if (ret) - return ret; - - if (vid == vlanmc.vid) { - ret = smi->ops->set_vlan_mc(smi, i, &vlanmc); - if (ret) - return ret; - - ret = smi->ops->set_mc_index(smi, port, i); - return ret; - } - } - - /* We have no MC entry for this VID, try to find an empty one */ - for (i = 0; i < smi->num_vlan_mc; i++) { - ret = smi->ops->get_vlan_mc(smi, i, &vlanmc); - if (ret) - return ret; - - if (vlanmc.vid == 0 && vlanmc.member == 0) { - /* Update the entry from the 4K table */ - ret = smi->ops->get_vlan_4k(smi, vid, &vlan4k); - if (ret) - return ret; - - vlanmc.vid = vid; - vlanmc.member = vlan4k.member; - vlanmc.untag = vlan4k.untag; - vlanmc.fid = vlan4k.fid; - ret = smi->ops->set_vlan_mc(smi, i, &vlanmc); - if (ret) - return ret; - - ret = smi->ops->set_mc_index(smi, port, i); - return ret; - } - } - - /* MC table is full, try to find an unused entry and replace it */ - for (i = 0; i < smi->num_vlan_mc; i++) { - int used; - - ret = rtl8366_mc_is_used(smi, i, &used); - if (ret) - return ret;
- if (!used) { - /* Update the entry from the 4K table */ - ret = smi->ops->get_vlan_4k(smi, vid, &vlan4k); - if (ret) - return ret; + if (!smi->ops->is_vlan_valid(smi, vid)) + return -EINVAL;
- vlanmc.vid = vid; - vlanmc.member = vlan4k.member; - vlanmc.untag = vlan4k.untag; - vlanmc.fid = vlan4k.fid; - ret = smi->ops->set_vlan_mc(smi, i, &vlanmc); - if (ret) - return ret; + /* Find or allocate a member config for this VID */ + ret = rtl8366_obtain_mc(smi, vid, &vlanmc); + if (ret < 0) + return ret; + mc = ret;
- ret = smi->ops->set_mc_index(smi, port, i); - return ret; - } + ret = smi->ops->set_mc_index(smi, port, mc); + if (ret) { + dev_err(smi->dev, "set PVID: failed to set MC index %d for port %d\n", + mc, port); + return ret; }
- dev_err(smi->dev, - "all VLAN member configurations are in use\n"); + dev_dbg(smi->dev, "set PVID: the PVID for port %d set to %d using existing MC index %d\n", + port, vid, mc);
- return -ENOSPC; + return 0; } EXPORT_SYMBOL_GPL(rtl8366_set_pvid);
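As background for the refactor above: these switches keep a 4K-entry VLAN table indexed by VID plus a small table of member configs (MCs), and a port's PVID register stores an index into the MC table rather than a VID, which is why rtl8366_set_pvid() must first obtain an MC slot. A minimal standalone sketch of that indirection (table sizes are illustrative, and this is not driver code):

#include <stdio.h>

struct vlan_mc { int vid; unsigned int member, untag, fid; };

static struct vlan_mc mc_table[16];	/* the smi->num_vlan_mc slots */
static int port_pvid_index[6];		/* per-port "MC index" register */

static int port_pvid(int port)
{
	return mc_table[port_pvid_index[port]].vid;
}

int main(void)
{
	mc_table[3].vid = 100;		/* MC slot 3 describes VLAN 100 */
	port_pvid_index[2] = 3;		/* port 2's PVID points at slot 3 */
	printf("port 2 PVID = %d\n", port_pvid(2));
	return 0;
}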
@@@ -389,7 -415,8 +415,8 @@@ void rtl8366_vlan_add(struct dsa_switc if (!smi->ops->is_vlan_valid(smi, vid)) return;
- dev_info(smi->dev, "add VLAN on port %d, %s, %s\n", + dev_info(smi->dev, "add VLAN %d on port %d, %s, %s\n", + vlan->vid_begin, port, untagged ? "untagged" : "tagged", pvid ? " PVID" : "no PVID"); @@@ -398,34 -425,29 +425,29 @@@ dev_err(smi->dev, "port is DSA or CPU port\n");
for (vid = vlan->vid_begin; vid <= vlan->vid_end; vid++) { - int pvid_val = 0; - - dev_info(smi->dev, "add VLAN %04x\n", vid); member |= BIT(port);
if (untagged) untag |= BIT(port);
- /* To ensure that we have a valid MC entry for this VLAN, - * initialize the port VLAN ID here. - */ - ret = rtl8366_get_pvid(smi, port, &pvid_val); - if (ret < 0) { - dev_err(smi->dev, "could not lookup PVID for port %d\n", - port); - return; - } - if (pvid_val == 0) { - ret = rtl8366_set_pvid(smi, port, vid); - if (ret < 0) - return; - } - ret = rtl8366_set_vlan(smi, vid, member, untag, 0); if (ret) dev_err(smi->dev, "failed to set up VLAN %04x", vid); + + if (!pvid) + continue; + + ret = rtl8366_set_pvid(smi, port, vid); + if (ret) + dev_err(smi->dev, + "failed to set PVID on port %d to VLAN %04x", + port, vid); + + if (!ret) + dev_dbg(smi->dev, "VLAN add: added VLAN %d with PVID on port %d\n", + vid, port); } } EXPORT_SYMBOL_GPL(rtl8366_vlan_add); @@@ -452,19 -474,13 +474,19 @@@ int rtl8366_vlan_del(struct dsa_switch return ret;
if (vid == vlanmc.vid) { - /* clear VLAN member configurations */ - vlanmc.vid = 0; - vlanmc.priority = 0; - vlanmc.member = 0; - vlanmc.untag = 0; - vlanmc.fid = 0; - + /* Remove this port from the VLAN */ + vlanmc.member &= ~BIT(port); + vlanmc.untag &= ~BIT(port); + /* + * If no ports are members of this VLAN + * anymore then clear the whole member + * config so it can be reused. + */ + if (!vlanmc.member && !vlanmc.untag) { + vlanmc.vid = 0; + vlanmc.priority = 0; + vlanmc.fid = 0; + } ret = smi->ops->set_vlan_mc(smi, i, &vlanmc); if (ret) { dev_err(smi->dev, diff --combined drivers/net/ethernet/3com/typhoon.c index 049cc0158a64,f11474cac59f..05e15b6e5e2c --- a/drivers/net/ethernet/3com/typhoon.c +++ b/drivers/net/ethernet/3com/typhoon.c @@@ -789,8 -789,8 +789,8 @@@ typhoon_start_tx(struct sk_buff *skb, s * it with zeros to ETH_ZLEN for us. */ if (skb_shinfo(skb)->nr_frags == 0) { - skb_dma = pci_map_single(tp->tx_pdev, skb->data, skb->len, - PCI_DMA_TODEVICE); + skb_dma = dma_map_single(&tp->tx_pdev->dev, skb->data, + skb->len, DMA_TO_DEVICE); txd->flags = TYPHOON_FRAG_DESC | TYPHOON_DESC_VALID; txd->len = cpu_to_le16(skb->len); txd->frag.addr = cpu_to_le32(skb_dma); @@@ -800,8 -800,8 +800,8 @@@ int i, len;
len = skb_headlen(skb); - skb_dma = pci_map_single(tp->tx_pdev, skb->data, len, - PCI_DMA_TODEVICE); + skb_dma = dma_map_single(&tp->tx_pdev->dev, skb->data, len, + DMA_TO_DEVICE); txd->flags = TYPHOON_FRAG_DESC | TYPHOON_DESC_VALID; txd->len = cpu_to_le16(len); txd->frag.addr = cpu_to_le32(skb_dma); @@@ -818,8 -818,8 +818,8 @@@
len = skb_frag_size(frag); frag_addr = skb_frag_address(frag); - skb_dma = pci_map_single(tp->tx_pdev, frag_addr, len, - PCI_DMA_TODEVICE); + skb_dma = dma_map_single(&tp->tx_pdev->dev, frag_addr, + len, DMA_TO_DEVICE); txd->flags = TYPHOON_FRAG_DESC | TYPHOON_DESC_VALID; txd->len = cpu_to_le16(len); txd->frag.addr = cpu_to_le32(skb_dma); @@@ -1349,12 -1349,12 +1349,12 @@@ typhoon_download_firmware(struct typhoo image_data = typhoon_fw->data; fHdr = (struct typhoon_file_header *) image_data;
- /* Cannot just map the firmware image using pci_map_single() as + /* Cannot just map the firmware image using dma_map_single() as * the firmware is vmalloc()'d and may not be physically contiguous, - * so we allocate some consistent memory to copy the sections into. + * so we allocate some coherent memory to copy the sections into. */ err = -ENOMEM; - dpage = pci_alloc_consistent(pdev, PAGE_SIZE, &dpage_dma); + dpage = dma_alloc_coherent(&pdev->dev, PAGE_SIZE, &dpage_dma, GFP_ATOMIC); if (!dpage) { netdev_err(tp->dev, "no DMA mem for firmware\n"); goto err_out; @@@ -1419,7 -1419,8 +1419,7 @@@ * the checksum, we can do this once, at the end. */ csum = csum_fold(csum_partial_copy_nocheck(image_data, - dpage, len, - 0)); + dpage, len));
iowrite32(len, ioaddr + TYPHOON_REG_BOOT_LENGTH); iowrite32(le16_to_cpu((__force __le16)csum), @@@ -1459,7 -1460,7 +1459,7 @@@ err_out_irq iowrite32(irqMasked, ioaddr + TYPHOON_REG_INTR_MASK); iowrite32(irqEnabled, ioaddr + TYPHOON_REG_INTR_ENABLE);
- pci_free_consistent(pdev, PAGE_SIZE, dpage, dpage_dma); + dma_free_coherent(&pdev->dev, PAGE_SIZE, dpage, dpage_dma);
err_out: return err; @@@ -1526,8 -1527,8 +1526,8 @@@ typhoon_clean_tx(struct typhoon *tp, st */ skb_dma = (dma_addr_t) le32_to_cpu(tx->frag.addr); dma_len = le16_to_cpu(tx->len); - pci_unmap_single(tp->pdev, skb_dma, dma_len, - PCI_DMA_TODEVICE); + dma_unmap_single(&tp->pdev->dev, skb_dma, dma_len, + DMA_TO_DEVICE); }
tx->flags = 0; @@@ -1608,8 -1609,8 +1608,8 @@@ typhoon_alloc_rx_skb(struct typhoon *tp skb_reserve(skb, 2); #endif
- dma_addr = pci_map_single(tp->pdev, skb->data, - PKT_BUF_SZ, PCI_DMA_FROMDEVICE); + dma_addr = dma_map_single(&tp->pdev->dev, skb->data, PKT_BUF_SZ, + DMA_FROM_DEVICE);
/* Since no card does 64 bit DAC, the high bits will never * change from zero. @@@ -1664,20 -1665,19 +1664,19 @@@ typhoon_rx(struct typhoon *tp, struct b if (pkt_len < rx_copybreak && (new_skb = netdev_alloc_skb(tp->dev, pkt_len + 2)) != NULL) { skb_reserve(new_skb, 2); - pci_dma_sync_single_for_cpu(tp->pdev, dma_addr, - PKT_BUF_SZ, - PCI_DMA_FROMDEVICE); + dma_sync_single_for_cpu(&tp->pdev->dev, dma_addr, + PKT_BUF_SZ, DMA_FROM_DEVICE); skb_copy_to_linear_data(new_skb, skb->data, pkt_len); - pci_dma_sync_single_for_device(tp->pdev, dma_addr, - PKT_BUF_SZ, - PCI_DMA_FROMDEVICE); + dma_sync_single_for_device(&tp->pdev->dev, dma_addr, + PKT_BUF_SZ, + DMA_FROM_DEVICE); skb_put(new_skb, pkt_len); typhoon_recycle_rx_skb(tp, idx); } else { new_skb = skb; skb_put(new_skb, pkt_len); - pci_unmap_single(tp->pdev, dma_addr, PKT_BUF_SZ, - PCI_DMA_FROMDEVICE); + dma_unmap_single(&tp->pdev->dev, dma_addr, PKT_BUF_SZ, + DMA_FROM_DEVICE); typhoon_alloc_rx_skb(tp, idx); } new_skb->protocol = eth_type_trans(new_skb, tp->dev); @@@ -1791,8 -1791,8 +1790,8 @@@ typhoon_free_rx_rings(struct typhoon *t for (i = 0; i < RXENT_ENTRIES; i++) { struct rxbuff_ent *rxb = &tp->rxbuffers[i]; if (rxb->skb) { - pci_unmap_single(tp->pdev, rxb->dma_addr, PKT_BUF_SZ, - PCI_DMA_FROMDEVICE); + dma_unmap_single(&tp->pdev->dev, rxb->dma_addr, + PKT_BUF_SZ, DMA_FROM_DEVICE); dev_kfree_skb(rxb->skb); rxb->skb = NULL; } @@@ -2305,7 -2305,7 +2304,7 @@@ typhoon_init_one(struct pci_dev *pdev, goto error_out_disable; }
- err = pci_set_dma_mask(pdev, DMA_BIT_MASK(32)); + err = dma_set_mask(&pdev->dev, DMA_BIT_MASK(32)); if (err < 0) { err_msg = "No usable DMA configuration"; goto error_out_mwi; @@@ -2354,8 -2354,8 +2353,8 @@@
/* allocate pci dma space for rx and tx descriptor rings */ - shared = pci_alloc_consistent(pdev, sizeof(struct typhoon_shared), - &shared_dma); + shared = dma_alloc_coherent(&pdev->dev, sizeof(struct typhoon_shared), + &shared_dma, GFP_KERNEL); if (!shared) { err_msg = "could not allocate DMA memory"; err = -ENOMEM; @@@ -2508,8 -2508,8 +2507,8 @@@ error_out_reset typhoon_reset(ioaddr, NoWait);
error_out_dma: - pci_free_consistent(pdev, sizeof(struct typhoon_shared), - shared, shared_dma); + dma_free_coherent(&pdev->dev, sizeof(struct typhoon_shared), shared, + shared_dma); error_out_remap: pci_iounmap(pdev, ioaddr); error_out_regions: @@@ -2536,8 -2536,8 +2535,8 @@@ typhoon_remove_one(struct pci_dev *pdev pci_restore_state(pdev); typhoon_reset(tp->ioaddr, NoWait); pci_iounmap(pdev, tp->ioaddr); - pci_free_consistent(pdev, sizeof(struct typhoon_shared), - tp->shared, tp->shared_dma); + dma_free_coherent(&pdev->dev, sizeof(struct typhoon_shared), + tp->shared, tp->shared_dma); pci_release_regions(pdev); pci_clear_mwi(pdev); pci_disable_device(pdev); diff --combined drivers/net/ethernet/broadcom/bnxt/bnxt.c index 7b7e8b7883c8,53f64ca673c3..65c298f1f333 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@@ -3782,7 -3782,6 +3782,7 @@@ static int bnxt_hwrm_func_qstat_ext(str return -EOPNOTSUPP;
bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_FUNC_QSTATS_EXT, -1, -1); + req.fid = cpu_to_le16(0xffff); req.flags = FUNC_QSTATS_EXT_REQ_FLAGS_COUNTER_MASK; mutex_lock(&bp->hwrm_cmd_lock); rc = _hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT); @@@ -3853,7 -3852,7 +3853,7 @@@ static void bnxt_init_stats(struct bnx tx_masks = stats->hw_masks; tx_count = sizeof(struct tx_port_stats_ext) / 8;
- flags = FUNC_QSTATS_EXT_REQ_FLAGS_COUNTER_MASK; + flags = PORT_QSTATS_EXT_REQ_FLAGS_COUNTER_MASK; rc = bnxt_hwrm_port_qstats_ext(bp, flags); if (rc) { mask = (1ULL << 40) - 1; @@@ -4306,7 -4305,7 +4306,7 @@@ static int bnxt_hwrm_do_send_msg(struc u32 bar_offset = BNXT_GRCPF_REG_CHIMP_COMM; u16 dst = BNXT_HWRM_CHNL_CHIMP;
- if (test_bit(BNXT_STATE_FW_FATAL_COND, &bp->state)) + if (BNXT_NO_FW_ACCESS(bp)) return -EBUSY;
if (msg_len > BNXT_HWRM_MAX_REQ_LEN) { @@@ -5724,7 -5723,7 +5724,7 @@@ static int hwrm_ring_free_send_msg(stru struct hwrm_ring_free_output *resp = bp->hwrm_cmd_resp_addr; u16 error_code;
- if (test_bit(BNXT_STATE_FW_FATAL_COND, &bp->state)) + if (BNXT_NO_FW_ACCESS(bp)) return 0;
bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_RING_FREE, cmpl_ring_id, -1); @@@ -7818,7 -7817,7 +7818,7 @@@ static int bnxt_set_tpa(struct bnxt *bp
if (set_tpa) tpa_flags = bp->flags & BNXT_FLAG_TPA; - else if (test_bit(BNXT_STATE_FW_FATAL_COND, &bp->state)) + else if (BNXT_NO_FW_ACCESS(bp)) return 0; for (i = 0; i < bp->nr_vnics; i++) { rc = bnxt_hwrm_vnic_set_tpa(bp, i, tpa_flags); @@@ -8635,10 -8634,9 +8635,9 @@@ static void bnxt_del_napi(struct bnxt * for (i = 0; i < bp->cp_nr_rings; i++) { struct bnxt_napi *bnapi = bp->bnapi[i];
- napi_hash_del(&bnapi->napi); - netif_napi_del(&bnapi->napi); + __netif_napi_del(&bnapi->napi); } - /* We called napi_hash_del() before netif_napi_del(), we need + /* We called __netif_napi_del(), we need * to respect an RCU grace period before freeing napi structures. */ synchronize_net(); @@@ -9312,16 -9310,18 +9311,16 @@@ static ssize_t bnxt_show_temp(struct de struct hwrm_temp_monitor_query_output *resp; struct bnxt *bp = dev_get_drvdata(dev); u32 len = 0; + int rc;
resp = bp->hwrm_cmd_resp_addr; bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_TEMP_MONITOR_QUERY, -1, -1); mutex_lock(&bp->hwrm_cmd_lock); - if (!_hwrm_send_message_silent(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT)) + rc = _hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT); + if (!rc) len = sprintf(buf, "%u\n", resp->temp * 1000); /* display millidegree */ mutex_unlock(&bp->hwrm_cmd_lock); - - if (len) - return len; - - return sprintf(buf, "unknown\n"); + return rc ?: len; } static SENSOR_DEVICE_ATTR(temp1_input, 0444, bnxt_show_temp, NULL, 0);
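The bnxt_show_temp() rework above propagates the firmware error instead of printing "unknown"; the GCC ?: extension returns rc when it is nonzero and the formatted length otherwise. A minimal sketch of the same pattern in a hypothetical hwmon show callback (read_hw_temp() is illustrative, not a driver function):

	static ssize_t temp_show(struct device *dev,
				 struct device_attribute *attr, char *buf)
	{
		unsigned int temp;
		int rc, len = 0;

		rc = read_hw_temp(dev, &temp);	/* hypothetical helper */
		if (!rc)
			len = sprintf(buf, "%u\n", temp * 1000); /* millidegrees */
		return rc ?: len;	/* error code, else byte count */
	}

Returning the real error lets userspace distinguish an unreadable sensor from a zero reading.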
@@@ -9341,16 -9341,7 +9340,16 @@@ static void bnxt_hwmon_close(struct bnx
static void bnxt_hwmon_open(struct bnxt *bp) { + struct hwrm_temp_monitor_query_input req = {0}; struct pci_dev *pdev = bp->pdev; + int rc; + + bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_TEMP_MONITOR_QUERY, -1, -1); + rc = hwrm_send_message_silent(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT); + if (rc == -EACCES || rc == -EOPNOTSUPP) { + bnxt_hwmon_close(bp); + return; + }
if (bp->hwmon_dev) return; @@@ -11787,10 -11778,6 +11786,10 @@@ static void bnxt_remove_one(struct pci_ if (BNXT_PF(bp)) bnxt_sriov_disable(bp);
+ clear_bit(BNXT_STATE_IN_FW_RESET, &bp->state); + bnxt_cancel_sp_work(bp); + bp->sp_event = 0; + bnxt_dl_fw_reporters_destroy(bp, true); if (BNXT_PF(bp)) devlink_port_type_clear(&bp->dl_port); @@@ -11798,6 -11785,9 +11797,6 @@@ unregister_netdev(dev); bnxt_dl_unregister(bp); bnxt_shutdown_tc(bp); - clear_bit(BNXT_STATE_IN_FW_RESET, &bp->state); - bnxt_cancel_sp_work(bp); - bp->sp_event = 0;
bnxt_clear_int_mode(bp); bnxt_hwrm_func_drv_unrgtr(bp); @@@ -12098,7 -12088,7 +12097,7 @@@ static int bnxt_init_mac_addr(struct bn static void bnxt_vpd_read_info(struct bnxt *bp) { struct pci_dev *pdev = bp->pdev; - int i, len, pos, ro_size; + int i, len, pos, ro_size, size; ssize_t vpd_size; u8 *vpd_data;
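The two VPD hunks that follow replace strlcpy() with a bounded memcpy(): PCI VPD fields are length-prefixed, not NUL-terminated, so strlcpy() could read past the end of the source looking for a terminator. A sketch of the bounded-copy idiom, assuming a zeroed destination of FLD_LEN bytes so the final byte stays NUL:

	size = min(len, FLD_LEN - 1);	/* FLD_LEN mirrors BNXT_VPD_FLD_LEN */
	memcpy(dst, &vpd_data[pos], size);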
@@@ -12133,8 -12123,7 +12132,8 @@@ if (len + pos > vpd_size) goto read_sn;
- strlcpy(bp->board_partno, &vpd_data[pos], min(len, BNXT_VPD_FLD_LEN)); + size = min(len, BNXT_VPD_FLD_LEN - 1); + memcpy(bp->board_partno, &vpd_data[pos], size);
read_sn: pos = pci_vpd_find_info_keyword(vpd_data, i, ro_size, @@@ -12147,8 -12136,7 +12146,8 @@@ if (len + pos > vpd_size) goto exit;
- strlcpy(bp->board_serialno, &vpd_data[pos], min(len, BNXT_VPD_FLD_LEN)); + size = min(len, BNXT_VPD_FLD_LEN - 1); + memcpy(bp->board_serialno, &vpd_data[pos], size); exit: kfree(vpd_data); } diff --combined drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c index fecdfd875af1,5a65f28ef771..6a3453f46d9a --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c @@@ -1322,9 -1322,6 +1322,9 @@@ static int bnxt_get_regs_len(struct net struct bnxt *bp = netdev_priv(dev); int reg_len;
+ if (!BNXT_PF(bp)) + return -EOPNOTSUPP; + reg_len = BNXT_PXP_REG_LEN;
if (bp->fw_cap & BNXT_FW_CAP_PCIE_STATS_SUPPORTED) @@@ -1781,6 -1778,22 +1781,22 @@@ static void bnxt_get_pauseparam(struct epause->tx_pause = !!(link_info->req_flow_ctrl & BNXT_LINK_PAUSE_TX); }
+ static void bnxt_get_pause_stats(struct net_device *dev, + struct ethtool_pause_stats *epstat) + { + struct bnxt *bp = netdev_priv(dev); + u64 *rx, *tx; + + if (BNXT_VF(bp) || !(bp->flags & BNXT_FLAG_PORT_STATS)) + return; + + rx = bp->port_stats.sw_stats; + tx = bp->port_stats.sw_stats + BNXT_TX_PORT_STATS_BYTE_OFFSET / 8; + + epstat->rx_pause_frames = BNXT_GET_RX_PORT_STATS64(rx, rx_pause_frames); + epstat->tx_pause_frames = BNXT_GET_TX_PORT_STATS64(tx, tx_pause_frames); + } + static int bnxt_set_pauseparam(struct net_device *dev, struct ethtool_pauseparam *epause) { @@@ -1791,12 -1804,9 +1807,12 @@@ if (!BNXT_PHY_CFG_ABLE(bp)) return -EOPNOTSUPP;
+ mutex_lock(&bp->link_lock); if (epause->autoneg) { - if (!(link_info->autoneg & BNXT_AUTONEG_SPEED)) - return -EINVAL; + if (!(link_info->autoneg & BNXT_AUTONEG_SPEED)) { + rc = -EINVAL; + goto pause_exit; + }
link_info->autoneg |= BNXT_AUTONEG_FLOW_CTRL; if (bp->hwrm_spec_code >= 0x10201) @@@ -1817,11 -1827,11 +1833,11 @@@ if (epause->tx_pause) link_info->req_flow_ctrl |= BNXT_LINK_PAUSE_TX;
- if (netif_running(dev)) { - mutex_lock(&bp->link_lock); + if (netif_running(dev)) rc = bnxt_hwrm_set_pause(bp); - mutex_unlock(&bp->link_lock); - } + +pause_exit: + mutex_unlock(&bp->link_lock); return rc; }
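The bnxt_set_pauseparam() hunks above widen link_lock to cover parameter validation as well as the firmware call, and convert the early returns to a single unlock label so no path can leave the mutex held. A generic sketch of the idiom (names are illustrative):

	static int foo_set_param(struct foo *bp, bool autoneg)
	{
		int rc = 0;

		mutex_lock(&bp->link_lock);
		if (autoneg && !bp->autoneg_supported) {
			rc = -EINVAL;
			goto out;	/* every error path shares the unlock */
		}
		rc = foo_apply_settings(bp);	/* hypothetical helper */
	out:
		mutex_unlock(&bp->link_lock);
		return rc;
	}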
@@@ -2558,7 -2568,8 +2574,7 @@@ static int bnxt_set_eee(struct net_devi struct bnxt *bp = netdev_priv(dev); struct ethtool_eee *eee = &bp->eee; struct bnxt_link_info *link_info = &bp->link_info; - u32 advertising = - _bnxt_fw_to_ethtool_adv_spds(link_info->advertising, 0); + u32 advertising; int rc = 0;
if (!BNXT_PHY_CFG_ABLE(bp)) @@@ -2567,23 -2578,19 +2583,23 @@@ if (!(bp->flags & BNXT_FLAG_EEE_CAP)) return -EOPNOTSUPP;
+ mutex_lock(&bp->link_lock); + advertising = _bnxt_fw_to_ethtool_adv_spds(link_info->advertising, 0); if (!edata->eee_enabled) goto eee_ok;
if (!(link_info->autoneg & BNXT_AUTONEG_SPEED)) { netdev_warn(dev, "EEE requires autoneg\n"); - return -EINVAL; + rc = -EINVAL; + goto eee_exit; } if (edata->tx_lpi_enabled) { if (bp->lpi_tmr_hi && (edata->tx_lpi_timer > bp->lpi_tmr_hi || edata->tx_lpi_timer < bp->lpi_tmr_lo)) { netdev_warn(dev, "Valid LPI timer range is %d and %d microsecs\n", bp->lpi_tmr_lo, bp->lpi_tmr_hi); - return -EINVAL; + rc = -EINVAL; + goto eee_exit; } else if (!bp->lpi_tmr_hi) { edata->tx_lpi_timer = eee->tx_lpi_timer; } @@@ -2593,8 -2600,7 +2609,8 @@@ } else if (edata->advertised & ~advertising) { netdev_warn(dev, "EEE advertised %x must be a subset of autoneg advertised speeds %x\n", edata->advertised, advertising); - return -EINVAL; + rc = -EINVAL; + goto eee_exit; }
eee->advertised = edata->advertised; @@@ -2606,8 -2612,6 +2622,8 @@@ eee_ok if (netif_running(dev)) rc = bnxt_hwrm_set_link_setting(bp, false, true);
+eee_exit: + mutex_unlock(&bp->link_lock); return rc; }
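bnxt_get_pause_stats() added earlier feeds the new ETHTOOL_A_PAUSE_STATS attribute and is wired into the ops table in the next hunk. The ethtool core pre-fills the structure with a not-set marker, so a driver only assigns the counters its hardware actually keeps. A minimal sketch of the callback shape for a hypothetical driver (field names assumed):

	static void foo_get_pause_stats(struct net_device *dev,
					struct ethtool_pause_stats *epstat)
	{
		struct foo_priv *priv = netdev_priv(dev);

		/* fields left untouched are reported as unsupported */
		epstat->rx_pause_frames = priv->hw.rx_pause;	/* hypothetical */
		epstat->tx_pause_frames = priv->hw.tx_pause;
	}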
@@@ -3657,6 -3661,7 +3673,7 @@@ const struct ethtool_ops bnxt_ethtool_o ETHTOOL_COALESCE_USE_ADAPTIVE_RX, .get_link_ksettings = bnxt_get_link_ksettings, .set_link_ksettings = bnxt_set_link_ksettings, + .get_pause_stats = bnxt_get_pause_stats, .get_pauseparam = bnxt_get_pauseparam, .set_pauseparam = bnxt_set_pauseparam, .get_drvinfo = bnxt_get_drvinfo, diff --combined drivers/net/ethernet/cadence/macb_main.c index 9179f7b0b900,830c537bc08c..9c8f40e8a721 --- a/drivers/net/ethernet/cadence/macb_main.c +++ b/drivers/net/ethernet/cadence/macb_main.c @@@ -647,7 -647,8 +647,7 @@@ static void macb_mac_link_up(struct phy ctrl |= GEM_BIT(GBE); }
- /* We do not support MLO_PAUSE_RX yet */ - if (tx_pause) + if (rx_pause) ctrl |= MACB_BIT(PAE);
macb_set_tx_clk(bp->tx_clk, speed, ndev); @@@ -1465,9 -1466,9 +1465,9 @@@ static int macb_poll(struct napi_struc return work_done; }
- static void macb_hresp_error_task(unsigned long data) + static void macb_hresp_error_task(struct tasklet_struct *t) { - struct macb *bp = (struct macb *)data; + struct macb *bp = from_tasklet(bp, t, hresp_err_tasklet); struct net_device *dev = bp->dev; struct macb_queue *queue; unsigned int q; @@@ -4559,8 -4560,7 +4559,7 @@@ static int macb_probe(struct platform_d goto err_out_unregister_mdio; }
- tasklet_init(&bp->hresp_err_tasklet, macb_hresp_error_task, - (unsigned long)bp); + tasklet_setup(&bp->hresp_err_tasklet, macb_hresp_error_task);
netdev_info(dev, "Cadence %s rev 0x%08x at 0x%08lx irq %d (%pM)\n", macb_is_gem(bp) ? "GEM" : "MACB", macb_readl(bp, MID), diff --combined drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c index 481498585ead,f6c1ec140e09..6ec5f2f26f05 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c @@@ -604,17 -604,14 +604,14 @@@ int cxgb4_get_free_ftid(struct net_devi /* If the new rule wants to get inserted into * HPFILTER region, but its prio is greater * than the rule with the highest prio in HASH - * region, then reject the rule. - */ - if (t->tc_hash_tids_max_prio && - tc_prio > t->tc_hash_tids_max_prio) - break; - - /* If there's not enough slots available - * in HPFILTER region, then move on to - * normal FILTER region immediately. + * region, or if there's not enough slots + * available in HPFILTER region, then skip + * trying to insert this rule into HPFILTER + * region and directly go to the next region. */ - if (ftid + n > t->nhpftids) { + if ((t->tc_hash_tids_max_prio && + tc_prio > t->tc_hash_tids_max_prio) || + (ftid + n) > t->nhpftids) { ftid = t->nhpftids; continue; } @@@ -1911,16 -1908,13 +1908,16 @@@ out static int configure_filter_tcb(struct adapter *adap, unsigned int tid, struct filter_entry *f) { - if (f->fs.hitcnts) + if (f->fs.hitcnts) { set_tcb_field(adap, f, tid, TCB_TIMESTAMP_W, - TCB_TIMESTAMP_V(TCB_TIMESTAMP_M) | + TCB_TIMESTAMP_V(TCB_TIMESTAMP_M), + TCB_TIMESTAMP_V(0ULL), + 1); + set_tcb_field(adap, f, tid, TCB_RTT_TS_RECENT_AGE_W, TCB_RTT_TS_RECENT_AGE_V(TCB_RTT_TS_RECENT_AGE_M), - TCB_TIMESTAMP_V(0ULL) | TCB_RTT_TS_RECENT_AGE_V(0ULL), 1); + }
if (f->fs.newdmac) set_tcb_tflag(adap, f, tid, TF_CCTRL_ECE_S, 1, diff --combined drivers/net/ethernet/dec/tulip/de2104x.c index 2610efe4f873,e648724c2c36..d9f6c19940ef --- a/drivers/net/ethernet/dec/tulip/de2104x.c +++ b/drivers/net/ethernet/dec/tulip/de2104x.c @@@ -85,7 -85,7 +85,7 @@@ MODULE_PARM_DESC (rx_copybreak, "de2104 #define DSL CONFIG_DE2104X_DSL #endif
-#define DE_RX_RING_SIZE 64 +#define DE_RX_RING_SIZE 128 #define DE_TX_RING_SIZE 64 #define DE_RING_BYTES \ ((sizeof(struct de_desc) * DE_RX_RING_SIZE) + \ @@@ -443,21 -443,23 +443,23 @@@ static void de_rx (struct de_private *d }
if (!copying_skb) { - pci_unmap_single(de->pdev, mapping, - buflen, PCI_DMA_FROMDEVICE); + dma_unmap_single(&de->pdev->dev, mapping, buflen, + DMA_FROM_DEVICE); skb_put(skb, len);
mapping = de->rx_skb[rx_tail].mapping = - pci_map_single(de->pdev, copy_skb->data, - buflen, PCI_DMA_FROMDEVICE); + dma_map_single(&de->pdev->dev, copy_skb->data, + buflen, DMA_FROM_DEVICE); de->rx_skb[rx_tail].skb = copy_skb; } else { - pci_dma_sync_single_for_cpu(de->pdev, mapping, len, PCI_DMA_FROMDEVICE); + dma_sync_single_for_cpu(&de->pdev->dev, mapping, len, + DMA_FROM_DEVICE); skb_reserve(copy_skb, RX_OFFSET); skb_copy_from_linear_data(skb, skb_put(copy_skb, len), len); - pci_dma_sync_single_for_device(de->pdev, mapping, len, PCI_DMA_FROMDEVICE); + dma_sync_single_for_device(&de->pdev->dev, mapping, + len, DMA_FROM_DEVICE);
/* We'll reuse the original ring buffer. */ skb = copy_skb; @@@ -554,13 -556,15 +556,15 @@@ static void de_tx (struct de_private *d goto next;
if (unlikely(skb == DE_SETUP_SKB)) { - pci_unmap_single(de->pdev, de->tx_skb[tx_tail].mapping, - sizeof(de->setup_frame), PCI_DMA_TODEVICE); + dma_unmap_single(&de->pdev->dev, + de->tx_skb[tx_tail].mapping, + sizeof(de->setup_frame), + DMA_TO_DEVICE); goto next; }
- pci_unmap_single(de->pdev, de->tx_skb[tx_tail].mapping, - skb->len, PCI_DMA_TODEVICE); + dma_unmap_single(&de->pdev->dev, de->tx_skb[tx_tail].mapping, + skb->len, DMA_TO_DEVICE);
if (status & LastFrag) { if (status & TxError) { @@@ -620,7 -624,8 +624,8 @@@ static netdev_tx_t de_start_xmit (struc txd = &de->tx_ring[entry];
len = skb->len; - mapping = pci_map_single(de->pdev, skb->data, len, PCI_DMA_TODEVICE); + mapping = dma_map_single(&de->pdev->dev, skb->data, len, + DMA_TO_DEVICE); if (entry == (DE_TX_RING_SIZE - 1)) flags |= RingEnd; if (!tx_free || (tx_free == (DE_TX_RING_SIZE / 2))) @@@ -763,8 -768,8 +768,8 @@@ static void __de_set_rx_mode (struct ne
de->tx_skb[entry].skb = DE_SETUP_SKB; de->tx_skb[entry].mapping = mapping = - pci_map_single (de->pdev, de->setup_frame, - sizeof (de->setup_frame), PCI_DMA_TODEVICE); + dma_map_single(&de->pdev->dev, de->setup_frame, + sizeof(de->setup_frame), DMA_TO_DEVICE);
/* Put the setup frame on the Tx list. */ txd = &de->tx_ring[entry]; @@@ -1279,8 -1284,10 +1284,10 @@@ static int de_refill_rx (struct de_priv if (!skb) goto err_out;
- de->rx_skb[i].mapping = pci_map_single(de->pdev, - skb->data, de->rx_buf_sz, PCI_DMA_FROMDEVICE); + de->rx_skb[i].mapping = dma_map_single(&de->pdev->dev, + skb->data, + de->rx_buf_sz, + DMA_FROM_DEVICE); de->rx_skb[i].skb = skb;
de->rx_ring[i].opts1 = cpu_to_le32(DescOwn); @@@ -1313,7 -1320,8 +1320,8 @@@ static int de_init_rings (struct de_pri
static int de_alloc_rings (struct de_private *de) { - de->rx_ring = pci_alloc_consistent(de->pdev, DE_RING_BYTES, &de->ring_dma); + de->rx_ring = dma_alloc_coherent(&de->pdev->dev, DE_RING_BYTES, + &de->ring_dma, GFP_KERNEL); if (!de->rx_ring) return -ENOMEM; de->tx_ring = &de->rx_ring[DE_RX_RING_SIZE]; @@@ -1333,8 -1341,9 +1341,9 @@@ static void de_clean_rings (struct de_p
for (i = 0; i < DE_RX_RING_SIZE; i++) { if (de->rx_skb[i].skb) { - pci_unmap_single(de->pdev, de->rx_skb[i].mapping, - de->rx_buf_sz, PCI_DMA_FROMDEVICE); + dma_unmap_single(&de->pdev->dev, + de->rx_skb[i].mapping, de->rx_buf_sz, + DMA_FROM_DEVICE); dev_kfree_skb(de->rx_skb[i].skb); } } @@@ -1344,15 -1353,15 +1353,15 @@@ if ((skb) && (skb != DE_DUMMY_SKB)) { if (skb != DE_SETUP_SKB) { de->dev->stats.tx_dropped++; - pci_unmap_single(de->pdev, - de->tx_skb[i].mapping, - skb->len, PCI_DMA_TODEVICE); + dma_unmap_single(&de->pdev->dev, + de->tx_skb[i].mapping, + skb->len, DMA_TO_DEVICE); dev_kfree_skb(skb); } else { - pci_unmap_single(de->pdev, - de->tx_skb[i].mapping, - sizeof(de->setup_frame), - PCI_DMA_TODEVICE); + dma_unmap_single(&de->pdev->dev, + de->tx_skb[i].mapping, + sizeof(de->setup_frame), + DMA_TO_DEVICE); } } } @@@ -1364,7 -1373,8 +1373,8 @@@ static void de_free_rings (struct de_private *de) { de_clean_rings(de); - pci_free_consistent(de->pdev, DE_RING_BYTES, de->rx_ring, de->ring_dma); + dma_free_coherent(&de->pdev->dev, DE_RING_BYTES, de->rx_ring, + de->ring_dma); de->rx_ring = NULL; de->tx_ring = NULL; } diff --combined drivers/net/ethernet/huawei/hinic/hinic_main.c index 28581bd8ce07,2c63e3a690cd..19d01def891f --- a/drivers/net/ethernet/huawei/hinic/hinic_main.c +++ b/drivers/net/ethernet/huawei/hinic/hinic_main.c @@@ -24,6 -24,7 +24,7 @@@ #include <linux/delay.h> #include <linux/err.h>
+ #include "hinic_debugfs.h" #include "hinic_hw_qp.h" #include "hinic_hw_dev.h" #include "hinic_devlink.h" @@@ -153,6 -154,8 +154,8 @@@ static int create_txqs(struct hinic_de if (!nic_dev->txqs) return -ENOMEM;
+ hinic_sq_dbgfs_init(nic_dev); + for (i = 0; i < num_txqs; i++) { struct hinic_sq *sq = hinic_hwdev_get_sq(nic_dev->hwdev, i);
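The hunks below add one debugfs entry per SQ and, on failure, unwind in reverse: the partially set-up queue i is cleaned first, then queues 0..i-1 lose both their debugfs entry and their queue state. A generic sketch of that two-label unwind (helper names hypothetical):

	for (i = 0; i < n; i++) {
		err = init_queue(&q[i]);
		if (err)
			goto err_init;
		err = add_debugfs(&q[i]);
		if (err)
			goto err_dbgfs;
	}
	return 0;

	err_dbgfs:
		clean_queue(&q[i]);	/* queue i has no debugfs entry yet */
	err_init:
		for (j = 0; j < i; j++) {
			rem_debugfs(&q[j]);
			clean_queue(&q[j]);
		}
		return err;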
@@@ -162,36 -165,32 +165,50 @@@ "Failed to init Txq\n"); goto err_init_txq; } + + err = hinic_sq_debug_add(nic_dev, i); + if (err) { + netif_err(nic_dev, drv, netdev, + "Failed to add SQ%d debug\n", i); + goto err_add_sq_dbg; + } + }
return 0;
+ err_add_sq_dbg: + hinic_clean_txq(&nic_dev->txqs[i]); err_init_txq: - for (j = 0; j < i; j++) + for (j = 0; j < i; j++) { + hinic_sq_debug_rem(nic_dev->txqs[j].sq); hinic_clean_txq(&nic_dev->txqs[j]); + } + + hinic_sq_dbgfs_uninit(nic_dev);
devm_kfree(&netdev->dev, nic_dev->txqs); return err; }
+static void enable_txqs_napi(struct hinic_dev *nic_dev) +{ + int num_txqs = hinic_hwdev_num_qps(nic_dev->hwdev); + int i; + + for (i = 0; i < num_txqs; i++) + napi_enable(&nic_dev->txqs[i].napi); +} + +static void disable_txqs_napi(struct hinic_dev *nic_dev) +{ + int num_txqs = hinic_hwdev_num_qps(nic_dev->hwdev); + int i; + + for (i = 0; i < num_txqs; i++) + napi_disable(&nic_dev->txqs[i].napi); +} + /** * free_txqs - Free the Logical Tx Queues of specific NIC device * @nic_dev: the specific NIC device @@@ -204,8 -203,12 +221,12 @@@ static void free_txqs(struct hinic_dev if (!nic_dev->txqs) return;
- for (i = 0; i < num_txqs; i++) + for (i = 0; i < num_txqs; i++) { + hinic_sq_debug_rem(nic_dev->txqs[i].sq); hinic_clean_txq(&nic_dev->txqs[i]); + } + + hinic_sq_dbgfs_uninit(nic_dev);
devm_kfree(&netdev->dev, nic_dev->txqs); nic_dev->txqs = NULL; @@@ -231,6 -234,8 +252,8 @@@ static int create_rxqs(struct hinic_de if (!nic_dev->rxqs) return -ENOMEM;
+ hinic_rq_dbgfs_init(nic_dev); + for (i = 0; i < num_rxqs; i++) { struct hinic_rq *rq = hinic_hwdev_get_rq(nic_dev->hwdev, i);
@@@ -240,13 -245,26 +263,26 @@@ "Failed to init rxq\n"); goto err_init_rxq; } + + err = hinic_rq_debug_add(nic_dev, i); + if (err) { + netif_err(nic_dev, drv, netdev, + "Failed to add RQ%d debug\n", i); + goto err_add_rq_dbg; + } }
return 0;
+ err_add_rq_dbg: + hinic_clean_rxq(&nic_dev->rxqs[i]); err_init_rxq: - for (j = 0; j < i; j++) + for (j = 0; j < i; j++) { + hinic_rq_debug_rem(nic_dev->rxqs[j].rq); hinic_clean_rxq(&nic_dev->rxqs[j]); + } + + hinic_rq_dbgfs_uninit(nic_dev);
devm_kfree(&netdev->dev, nic_dev->rxqs); return err; @@@ -264,8 -282,12 +300,12 @@@ static void free_rxqs(struct hinic_dev if (!nic_dev->rxqs) return;
- for (i = 0; i < num_rxqs; i++) + for (i = 0; i < num_rxqs; i++) { + hinic_rq_debug_rem(nic_dev->rxqs[i].rq); hinic_clean_rxq(&nic_dev->rxqs[i]); + } + + hinic_rq_dbgfs_uninit(nic_dev);
devm_kfree(&netdev->dev, nic_dev->rxqs); nic_dev->rxqs = NULL; @@@ -418,8 -440,6 +458,8 @@@ int hinic_open(struct net_device *netde goto err_create_txqs; }
+ enable_txqs_napi(nic_dev); + err = create_rxqs(nic_dev); if (err) { netif_err(nic_dev, drv, netdev, @@@ -504,7 -524,6 +544,7 @@@ err_port_state }
err_create_rxqs: + disable_txqs_napi(nic_dev); free_txqs(nic_dev);
err_create_txqs: @@@ -518,9 -537,6 +558,9 @@@ int hinic_close(struct net_device *netd struct hinic_dev *nic_dev = netdev_priv(netdev); unsigned int flags;
+ /* Disable txq napi first to avoid rewaking txq in free_tx_poll */ + disable_txqs_napi(nic_dev); + down(&nic_dev->mgmt_lock);

flags = nic_dev->flags; @@@ -913,11 -929,16 +953,16 @@@ static void netdev_features_init(struc netdev->hw_features = NETIF_F_SG | NETIF_F_HIGHDMA | NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM | NETIF_F_TSO | NETIF_F_TSO6 | NETIF_F_RXCSUM | NETIF_F_LRO | - NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX; + NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX | + NETIF_F_GSO_UDP_TUNNEL | NETIF_F_GSO_UDP_TUNNEL_CSUM;
netdev->vlan_features = netdev->hw_features;
netdev->features = netdev->hw_features | NETIF_F_HW_VLAN_CTAG_FILTER; + + netdev->hw_enc_features = NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM | NETIF_F_SCTP_CRC | + NETIF_F_SG | NETIF_F_TSO | NETIF_F_TSO6 | NETIF_F_TSO_ECN | + NETIF_F_GSO_UDP_TUNNEL_CSUM | NETIF_F_GSO_UDP_TUNNEL; }
static void hinic_refresh_nic_cfg(struct hinic_dev *nic_dev) @@@ -1284,6 -1305,16 +1329,16 @@@ static int nic_dev_init(struct pci_dev goto err_init_intr; }
+ hinic_dbg_init(nic_dev); + + hinic_func_tbl_dbgfs_init(nic_dev); + + err = hinic_func_table_debug_add(nic_dev); + if (err) { + dev_err(&pdev->dev, "Failed to add func_table debug\n"); + goto err_add_func_table_dbg; + } + err = register_netdev(netdev); if (err) { dev_err(&pdev->dev, "Failed to register netdev\n"); @@@ -1293,6 -1324,10 +1348,10 @@@ return 0;
err_reg_netdev: + hinic_func_table_debug_rem(nic_dev); + err_add_func_table_dbg: + hinic_func_tbl_dbgfs_uninit(nic_dev); + hinic_dbg_uninit(nic_dev); hinic_free_intr_coalesce(nic_dev); err_init_intr: err_set_pfc: @@@ -1415,6 -1450,12 +1474,12 @@@ static void hinic_remove(struct pci_de
unregister_netdev(netdev);
+ hinic_func_table_debug_rem(nic_dev); + + hinic_func_tbl_dbgfs_uninit(nic_dev); + + hinic_dbg_uninit(nic_dev); + hinic_free_intr_coalesce(nic_dev);
hinic_port_del_mac(nic_dev, netdev->dev_addr, 0); @@@ -1469,4 -1510,17 +1534,17 @@@ static struct pci_driver hinic_driver .sriov_configure = hinic_pci_sriov_configure, };
- module_pci_driver(hinic_driver); + static int __init hinic_module_init(void) + { + hinic_dbg_register_debugfs(HINIC_DRV_NAME); + return pci_register_driver(&hinic_driver); + } + + static void __exit hinic_module_exit(void) + { + pci_unregister_driver(&hinic_driver); + hinic_dbg_unregister_debugfs(); + } + + module_init(hinic_module_init); + module_exit(hinic_module_exit); diff --combined drivers/net/ethernet/huawei/hinic/hinic_rx.c index d0072f5e7efc,f403a6711e97..070a7cc6392e --- a/drivers/net/ethernet/huawei/hinic/hinic_rx.c +++ b/drivers/net/ethernet/huawei/hinic/hinic_rx.c @@@ -543,25 -543,18 +543,25 @@@ static int rx_request_irq(struct hinic_ if (err) { netif_err(nic_dev, drv, rxq->netdev, "Failed to set RX interrupt coalescing attribute\n"); - rx_del_napi(rxq); - return err; + goto err_req_irq; }
err = request_irq(rq->irq, rx_irq, 0, rxq->irq_name, rxq); - if (err) { - rx_del_napi(rxq); - return err; - } + if (err) + goto err_req_irq;
cpumask_set_cpu(qp->q_id % num_online_cpus(), &rq->affinity_mask); - return irq_set_affinity_hint(rq->irq, &rq->affinity_mask); + err = irq_set_affinity_hint(rq->irq, &rq->affinity_mask); + if (err) + goto err_irq_affinity; + + return 0; + +err_irq_affinity: + free_irq(rq->irq, rxq); +err_req_irq: + rx_del_napi(rxq); + return err; }
static void rx_free_irq(struct hinic_rxq *rxq) @@@ -595,7 -588,7 +595,7 @@@ int hinic_init_rxq(struct hinic_rxq *rx rxq_stats_init(rxq);
rxq->irq_name = devm_kasprintf(&netdev->dev, GFP_KERNEL, - "hinic_rxq%d", qp->q_id); + "%s_rxq%d", netdev->name, qp->q_id); if (!rxq->irq_name) return -ENOMEM;
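Embedding the netdev name in the IRQ name above makes /proc/interrupts legible when several hinic ports are present. devm_kasprintf() also collapses the measure-allocate-print sequence that the tx side (later in this mail) still spells out. A sketch of the two equivalent forms, assuming a device-managed context:

	/* measure, allocate, print */
	len = snprintf(NULL, 0, "%s_txq%d", netdev->name, q_id) + 1;
	name = devm_kzalloc(&netdev->dev, len, GFP_KERNEL);
	if (name)
		sprintf(name, "%s_txq%d", netdev->name, q_id);

	/* one call, freed automatically with the device */
	name = devm_kasprintf(&netdev->dev, GFP_KERNEL, "%s_txq%d",
			      netdev->name, q_id);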
diff --combined drivers/net/ethernet/huawei/hinic/hinic_tx.c index c1f81e9144a1,c249b7e6e432..8da7d46363b2 --- a/drivers/net/ethernet/huawei/hinic/hinic_tx.c +++ b/drivers/net/ethernet/huawei/hinic/hinic_tx.c @@@ -357,6 -357,7 +357,7 @@@ static int offload_csum(struct hinic_sq enum hinic_l4_offload_type l4_offload; u32 offset, l4_len, network_hdr_len; enum hinic_l3_offload_type l3_type; + u32 tunnel_type = NOT_TUNNEL; union hinic_l3 ip; union hinic_l4 l4; u8 l4_proto; @@@ -367,27 -368,55 +368,55 @@@ if (skb->encapsulation) { u32 l4_tunnel_len;
+ tunnel_type = TUNNEL_UDP_NO_CSUM; ip.hdr = skb_network_header(skb);
- if (ip.v4->version == 4) + if (ip.v4->version == 4) { l3_type = IPV4_PKT_NO_CHKSUM_OFFLOAD; - else if (ip.v4->version == 6) + l4_proto = ip.v4->protocol; + } else if (ip.v4->version == 6) { + unsigned char *exthdr; + __be16 frag_off; l3_type = IPV6_PKT; - else + tunnel_type = TUNNEL_UDP_CSUM; + exthdr = ip.hdr + sizeof(*ip.v6); + l4_proto = ip.v6->nexthdr; + l4.hdr = skb_transport_header(skb); + if (l4.hdr != exthdr) + ipv6_skip_exthdr(skb, exthdr - skb->data, + &l4_proto, &frag_off); + } else { l3_type = L3TYPE_UNKNOWN; + l4_proto = IPPROTO_RAW; + }
hinic_task_set_outter_l3(task, l3_type, skb_network_header_len(skb));
- l4_tunnel_len = skb_inner_network_offset(skb) - - skb_transport_offset(skb); - - hinic_task_set_tunnel_l4(task, TUNNEL_UDP_NO_CSUM, - l4_tunnel_len); + switch (l4_proto) { + case IPPROTO_UDP: + l4_tunnel_len = skb_inner_network_offset(skb) - + skb_transport_offset(skb); + ip.hdr = skb_inner_network_header(skb); + l4.hdr = skb_inner_transport_header(skb); + network_hdr_len = skb_inner_network_header_len(skb); + break; + case IPPROTO_IPIP: + case IPPROTO_IPV6: + tunnel_type = NOT_TUNNEL; + l4_tunnel_len = 0; + + ip.hdr = skb_inner_network_header(skb); + l4.hdr = skb_transport_header(skb); + network_hdr_len = skb_network_header_len(skb); + break; + default: + /* Unsupported tunnel packet, disable csum offload */ + skb_checksum_help(skb); + return 0; + }
- ip.hdr = skb_inner_network_header(skb); - l4.hdr = skb_inner_transport_header(skb); - network_hdr_len = skb_inner_network_header_len(skb); + hinic_task_set_tunnel_l4(task, tunnel_type, l4_tunnel_len); } else { ip.hdr = skb_network_header(skb); l4.hdr = skb_transport_header(skb); @@@ -717,8 -746,8 +746,8 @@@ static int free_tx_poll(struct napi_str netdev_txq = netdev_get_tx_queue(txq->netdev, qp->q_id);
__netif_tx_lock(netdev_txq, smp_processor_id()); - - netif_wake_subqueue(nic_dev->netdev, qp->q_id); + if (!netif_testing(nic_dev->netdev)) + netif_wake_subqueue(nic_dev->netdev, qp->q_id);
__netif_tx_unlock(netdev_txq);
@@@ -745,6 -774,18 +774,6 @@@ return budget; }
-static void tx_napi_add(struct hinic_txq *txq, int weight) -{ - netif_napi_add(txq->netdev, &txq->napi, free_tx_poll, weight); - napi_enable(&txq->napi); -} - -static void tx_napi_del(struct hinic_txq *txq) -{ - napi_disable(&txq->napi); - netif_napi_del(&txq->napi); -} - static irqreturn_t tx_irq(int irq, void *data) { struct hinic_txq *txq = data; @@@ -778,7 -819,7 +807,7 @@@ static int tx_request_irq(struct hinic_
qp = container_of(sq, struct hinic_qp, sq);
- tx_napi_add(txq, nic_dev->tx_weight); + netif_napi_add(txq->netdev, &txq->napi, free_tx_poll, nic_dev->tx_weight);
hinic_hwdev_msix_set(nic_dev->hwdev, sq->msix_entry, TX_IRQ_NO_PENDING, TX_IRQ_NO_COALESC, @@@ -795,14 -836,14 +824,14 @@@ if (err) { netif_err(nic_dev, drv, txq->netdev, "Failed to set TX interrupt coalescing attribute\n"); - tx_napi_del(txq); + netif_napi_del(&txq->napi); return err; }
err = request_irq(sq->irq, tx_irq, 0, txq->irq_name, txq); if (err) { dev_err(&pdev->dev, "Failed to request Tx irq\n"); - tx_napi_del(txq); + netif_napi_del(&txq->napi); return err; }
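These hinic_tx.c hunks drop the tx_napi_add()/tx_napi_del() wrappers: IRQ setup now only registers the NAPI context, while enabling moves to hinic_open() through the enable_txqs_napi() helper added earlier in this mail. Splitting registration from enabling closes the window where the poll function could run, and wake queues, before all queues exist. Sketch of the split, with hypothetical surrounding code:

	/* queue/IRQ setup: register only */
	netif_napi_add(netdev, &txq->napi, free_tx_poll, weight);

	/* ndo_open, once every queue is ready */
	for (i = 0; i < num_txqs; i++)
		napi_enable(&txqs[i].napi);

	/* ndo_stop, before tearing queues down */
	for (i = 0; i < num_txqs; i++)
		napi_disable(&txqs[i].napi);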
@@@ -814,7 -855,7 +843,7 @@@ static void tx_free_irq(struct hinic_tx struct hinic_sq *sq = txq->sq;
free_irq(sq->irq, txq); - tx_napi_del(txq); + netif_napi_del(&txq->napi); }
/** @@@ -853,14 -894,14 +882,14 @@@ int hinic_init_txq(struct hinic_txq *tx goto err_alloc_free_sges; }
- irqname_len = snprintf(NULL, 0, "hinic_txq%d", qp->q_id) + 1; + irqname_len = snprintf(NULL, 0, "%s_txq%d", netdev->name, qp->q_id) + 1; txq->irq_name = devm_kzalloc(&netdev->dev, irqname_len, GFP_KERNEL); if (!txq->irq_name) { err = -ENOMEM; goto err_alloc_irqname; }
- sprintf(txq->irq_name, "hinic_txq%d", qp->q_id); + sprintf(txq->irq_name, "%s_txq%d", netdev->name, qp->q_id);
err = hinic_hwdev_hw_ci_addr_set(hwdev, sq, CI_UPDATE_NO_PENDING, CI_UPDATE_NO_COALESC); diff --combined drivers/net/ethernet/ibm/ibmvnic.c index 1b702a43a5d0,6d320be47e60..a151ff37fda2 --- a/drivers/net/ethernet/ibm/ibmvnic.c +++ b/drivers/net/ethernet/ibm/ibmvnic.c @@@ -104,8 -104,7 +104,7 @@@ static int send_login(struct ibmvnic_ad static void send_cap_queries(struct ibmvnic_adapter *adapter); static int init_sub_crqs(struct ibmvnic_adapter *); static int init_sub_crq_irqs(struct ibmvnic_adapter *adapter); - static int ibmvnic_init(struct ibmvnic_adapter *); - static int ibmvnic_reset_init(struct ibmvnic_adapter *); + static int ibmvnic_reset_init(struct ibmvnic_adapter *, bool reset); static void release_crq_queue(struct ibmvnic_adapter *); static int __ibmvnic_set_mac(struct net_device *, u8 *); static int init_crq_queue(struct ibmvnic_adapter *adapter); @@@ -297,8 -296,7 +296,7 @@@ static void deactivate_rx_pools(struct { int i;
- for (i = 0; i < be32_to_cpu(adapter->login_rsp_buf->num_rxadd_subcrqs); - i++) + for (i = 0; i < adapter->num_active_rx_pools; i++) adapter->rx_pool[i].active = 0; }
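deactivate_rx_pools() above is the first of several ibmvnic hunks that stop re-deriving values from the raw login response on every use; handle_login_rsp() (later in this mail) now parses the buffer once and caches the counts, buffer size, and queue handles on the adapter. A sketch of the parse-once shape (field names mirror the ibmvnic login response):

	/* at login time: resolve the offset once and cache per queue */
	u64 *handles = (u64 *)((u8 *)rsp +
			       be32_to_cpu(rsp->off_rxadd_subcrqs));

	for (i = 0; i < num_rx; i++)
		adapter->rx_scrq[i]->handle = handles[i];
	adapter->num_active_rx_scrqs = num_rx;

	/* hot paths then use the cached values */
	for (i = 0; i < adapter->num_active_rx_pools; i++)
		adapter->rx_pool[i].active = 0;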
@@@ -306,6 -304,7 +304,7 @@@ static void replenish_rx_pool(struct ib struct ibmvnic_rx_pool *pool) { int count = pool->size - atomic_read(&pool->available); + u64 handle = adapter->rx_scrq[pool->index]->handle; struct device *dev = &adapter->vdev->dev; int buffers_added = 0; unsigned long lpar_rc; @@@ -314,7 -313,6 +313,6 @@@ unsigned int offset; dma_addr_t dma_addr; unsigned char *dst; - u64 *handle_array; int shift = 0; int index; int i; @@@ -322,10 -320,6 +320,6 @@@ if (!pool->active) return;
- handle_array = (u64 *)((u8 *)(adapter->login_rsp_buf) + - be32_to_cpu(adapter->login_rsp_buf-> - off_rxadd_subcrqs)); - for (i = 0; i < count; ++i) { skb = alloc_skb(pool->buff_size, GFP_ATOMIC); if (!skb) { @@@ -369,8 -363,7 +363,7 @@@ #endif sub_crq.rx_add.len = cpu_to_be32(pool->buff_size << shift);
- lpar_rc = send_subcrq(adapter, handle_array[pool->index], - &sub_crq); + lpar_rc = send_subcrq(adapter, handle, &sub_crq); if (lpar_rc != H_SUCCESS) goto failure;
@@@ -407,8 -400,7 +400,7 @@@ static void replenish_pools(struct ibmv int i;
adapter->replenish_task_cycles++; - for (i = 0; i < be32_to_cpu(adapter->login_rsp_buf->num_rxadd_subcrqs); - i++) { + for (i = 0; i < adapter->num_active_rx_pools; i++) { if (adapter->rx_pool[i].active) replenish_rx_pool(adapter, &adapter->rx_pool[i]); } @@@ -475,25 -467,23 +467,23 @@@ static int init_stats_token(struct ibmv static int reset_rx_pools(struct ibmvnic_adapter *adapter) { struct ibmvnic_rx_pool *rx_pool; + u64 buff_size; int rx_scrqs; int i, j, rc; - u64 *size_array;
if (!adapter->rx_pool) return -1;
- size_array = (u64 *)((u8 *)(adapter->login_rsp_buf) + - be32_to_cpu(adapter->login_rsp_buf->off_rxadd_buff_size)); - - rx_scrqs = be32_to_cpu(adapter->login_rsp_buf->num_rxadd_subcrqs); + buff_size = adapter->cur_rx_buf_sz; + rx_scrqs = adapter->num_active_rx_pools; for (i = 0; i < rx_scrqs; i++) { rx_pool = &adapter->rx_pool[i];
netdev_dbg(adapter->netdev, "Re-setting rx_pool[%d]\n", i);
- if (rx_pool->buff_size != be64_to_cpu(size_array[i])) { + if (rx_pool->buff_size != buff_size) { free_long_term_buff(adapter, &rx_pool->long_term_buff); - rx_pool->buff_size = be64_to_cpu(size_array[i]); + rx_pool->buff_size = buff_size; rc = alloc_long_term_buff(adapter, &rx_pool->long_term_buff, rx_pool->size * @@@ -561,13 -551,11 +551,11 @@@ static int init_rx_pools(struct net_dev struct device *dev = &adapter->vdev->dev; struct ibmvnic_rx_pool *rx_pool; int rxadd_subcrqs; - u64 *size_array; + u64 buff_size; int i, j;
- rxadd_subcrqs = - be32_to_cpu(adapter->login_rsp_buf->num_rxadd_subcrqs); - size_array = (u64 *)((u8 *)(adapter->login_rsp_buf) + - be32_to_cpu(adapter->login_rsp_buf->off_rxadd_buff_size)); + rxadd_subcrqs = adapter->num_active_rx_scrqs; + buff_size = adapter->cur_rx_buf_sz;
adapter->rx_pool = kcalloc(rxadd_subcrqs, sizeof(struct ibmvnic_rx_pool), @@@ -585,11 -573,11 +573,11 @@@ netdev_dbg(adapter->netdev, "Initializing rx_pool[%d], %lld buffs, %lld bytes each\n", i, adapter->req_rx_add_entries_per_subcrq, - be64_to_cpu(size_array[i])); + buff_size);
rx_pool->size = adapter->req_rx_add_entries_per_subcrq; rx_pool->index = i; - rx_pool->buff_size = be64_to_cpu(size_array[i]); + rx_pool->buff_size = buff_size; rx_pool->active = 1;
rx_pool->free_map = kcalloc(rx_pool->size, sizeof(int), @@@ -655,7 -643,7 +643,7 @@@ static int reset_tx_pools(struct ibmvni if (!adapter->tx_pool) return -1;
- tx_scrqs = be32_to_cpu(adapter->login_rsp_buf->num_txsubm_subcrqs); + tx_scrqs = adapter->num_active_tx_pools; for (i = 0; i < tx_scrqs; i++) { rc = reset_one_tx_pool(adapter, &adapter->tso_pool[i]); if (rc) @@@ -744,7 -732,7 +732,7 @@@ static int init_tx_pools(struct net_dev int tx_subcrqs; int i, rc;
- tx_subcrqs = be32_to_cpu(adapter->login_rsp_buf->num_txsubm_subcrqs); + tx_subcrqs = adapter->num_active_tx_scrqs; adapter->tx_pool = kcalloc(tx_subcrqs, sizeof(struct ibmvnic_tx_pool), GFP_KERNEL); if (!adapter->tx_pool) @@@ -980,7 -968,7 +968,7 @@@ static int set_link_state(struct ibmvni return -1; }
- if (adapter->init_done_rc == 1) { + if (adapter->init_done_rc == PARTIALSUCCESS) { /* Partial success, delay and re-send */ mdelay(1000); resend = true; @@@ -1530,9 -1518,9 +1518,9 @@@ static netdev_tx_t ibmvnic_xmit(struct unsigned int offset; int num_entries = 1; unsigned char *dst; - u64 *handle_array; int index = 0; u8 proto = 0; + u64 handle; netdev_tx_t ret = NETDEV_TX_OK;
if (test_bit(0, &adapter->resetting)) { @@@ -1559,8 -1547,7 +1547,7 @@@
tx_scrq = adapter->tx_scrq[queue_num]; txq = netdev_get_tx_queue(netdev, skb_get_queue_mapping(skb)); - handle_array = (u64 *)((u8 *)(adapter->login_rsp_buf) + - be32_to_cpu(adapter->login_rsp_buf->off_txsubm_subcrqs)); + handle = tx_scrq->handle;
index = tx_pool->free_map[tx_pool->consumer_index];
@@@ -1672,14 -1659,14 +1659,14 @@@ ret = NETDEV_TX_OK; goto tx_err_out; } - lpar_rc = send_subcrq_indirect(adapter, handle_array[queue_num], + lpar_rc = send_subcrq_indirect(adapter, handle, (u64)tx_buff->indir_dma, (u64)num_entries); dma_unmap_single(dev, tx_buff->indir_dma, sizeof(tx_buff->indir_arr), DMA_TO_DEVICE); } else { tx_buff->num_entries = num_entries; - lpar_rc = send_subcrq(adapter, handle_array[queue_num], + lpar_rc = send_subcrq(adapter, handle, &tx_crq); } if (lpar_rc != H_SUCCESS) { @@@ -1874,7 -1861,7 +1861,7 @@@ static int do_change_param_reset(struc return rc; }
- rc = ibmvnic_reset_init(adapter); + rc = ibmvnic_reset_init(adapter, true); if (rc) return IBMVNIC_INIT_FAILED;
@@@ -1992,7 -1979,7 +1979,7 @@@ static int do_reset(struct ibmvnic_adap goto out; }
- rc = ibmvnic_reset_init(adapter); + rc = ibmvnic_reset_init(adapter, true); if (rc) { rc = IBMVNIC_INIT_FAILED; goto out; @@@ -2032,18 -2019,16 +2019,18 @@@
} else { rc = reset_tx_pools(adapter); - if (rc) + if (rc) { netdev_dbg(adapter->netdev, "reset tx pools failed (%d)\n", rc); goto out; + }
rc = reset_rx_pools(adapter); - if (rc) + if (rc) { netdev_dbg(adapter->netdev, "reset rx pools failed (%d)\n", rc); goto out; + } } ibmvnic_disable_irqs(adapter); } @@@ -2108,7 -2093,7 +2095,7 @@@ static int do_hard_reset(struct ibmvnic return rc; }
- rc = ibmvnic_init(adapter); + rc = ibmvnic_reset_init(adapter, false); if (rc) return rc;
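With this hunk the hard-reset path reuses ibmvnic_reset_init(); the duplicate ibmvnic_init() is deleted further down, and a bool selects the reset-only work. A sketch of the consolidated shape (names are illustrative, not the driver's):

	static int init_common(struct adapter *a, bool reset)
	{
		if (reset) {
			/* only a reset needs to detect queue-count changes
			 * and rearm the completion it waits on */
			a->old_rx = a->req_rx_queues;
			a->old_tx = a->req_tx_queues;
			reinit_completion(&a->init_done);
		}
		return do_crq_init(a);	/* shared by probe and reset */
	}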
@@@ -3583,8 -3568,7 +3570,7 @@@ static int ibmvnic_send_crq(struct ibmv if (rc) { if (rc == H_CLOSED) { dev_warn(dev, "CRQ Queue closed\n"); - if (test_bit(0, &adapter->resetting)) - ibmvnic_reset(adapter, VNIC_RESET_FATAL); + /* do not reset; report the failure and wait for passive init from the server */ }
dev_warn(dev, "Send error (rc=%d)\n", rc); @@@ -3595,14 -3579,31 +3581,31 @@@
static int ibmvnic_send_crq_init(struct ibmvnic_adapter *adapter) { + struct device *dev = &adapter->vdev->dev; union ibmvnic_crq crq; + int retries = 100; + int rc;
memset(&crq, 0, sizeof(crq)); crq.generic.first = IBMVNIC_CRQ_INIT_CMD; crq.generic.cmd = IBMVNIC_CRQ_INIT; netdev_dbg(adapter->netdev, "Sending CRQ init\n");
- return ibmvnic_send_crq(adapter, &crq); + do { + rc = ibmvnic_send_crq(adapter, &crq); + if (rc != H_CLOSED) + break; + retries--; + msleep(50); + + } while (retries > 0); + + if (rc) { + dev_err(dev, "Failed to send init request, rc = %d\n", rc); + return rc; + } + + return 0; }
static int send_version_xchg(struct ibmvnic_adapter *adapter) @@@ -4307,6 -4308,11 +4310,11 @@@ static int handle_login_rsp(union ibmvn struct net_device *netdev = adapter->netdev; struct ibmvnic_login_rsp_buffer *login_rsp = adapter->login_rsp_buf; struct ibmvnic_login_buffer *login = adapter->login_buf; + u64 *tx_handle_array; + u64 *rx_handle_array; + int num_tx_pools; + int num_rx_pools; + u64 *size_array; int i;
dma_unmap_single(dev, adapter->login_buf_token, adapter->login_buf_sz, @@@ -4341,6 -4347,30 +4349,30 @@@ ibmvnic_remove(adapter->vdev); return -EIO; } + size_array = (u64 *)((u8 *)(adapter->login_rsp_buf) + + be32_to_cpu(adapter->login_rsp_buf->off_rxadd_buff_size)); + /* variable buffer sizes are not supported, so just read the + * first entry. + */ + adapter->cur_rx_buf_sz = be64_to_cpu(size_array[0]); + + num_tx_pools = be32_to_cpu(adapter->login_rsp_buf->num_txsubm_subcrqs); + num_rx_pools = be32_to_cpu(adapter->login_rsp_buf->num_rxadd_subcrqs); + + tx_handle_array = (u64 *)((u8 *)(adapter->login_rsp_buf) + + be32_to_cpu(adapter->login_rsp_buf->off_txsubm_subcrqs)); + rx_handle_array = (u64 *)((u8 *)(adapter->login_rsp_buf) + + be32_to_cpu(adapter->login_rsp_buf->off_rxadd_subcrqs)); + + for (i = 0; i < num_tx_pools; i++) + adapter->tx_scrq[i]->handle = tx_handle_array[i]; + + for (i = 0; i < num_rx_pools; i++) + adapter->rx_scrq[i]->handle = rx_handle_array[i]; + + adapter->num_active_tx_scrqs = num_tx_pools; + adapter->num_active_rx_scrqs = num_rx_pools; + release_login_rsp_buffer(adapter); release_login_buffer(adapter); complete(&adapter->init_done);
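ibmvnic_send_crq_init() above now retries while the hypervisor returns H_CLOSED, up to 100 tries at 50 ms apart (about five seconds), instead of failing on the first send. The bounded-retry shape, as a sketch:

	int retries = 100;
	int rc;

	do {
		rc = send_crq(adapter, &crq);	/* hypothetical sender */
		if (rc != H_CLOSED)		/* success or a hard error */
			break;
		msleep(50);	/* give the partner end time to open */
	} while (--retries > 0);

	if (rc)
		return rc;	/* still closed, or another failure */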
@@@ -4812,9 -4842,9 +4844,9 @@@ static irqreturn_t ibmvnic_interrupt(in return IRQ_HANDLED; }
- static void ibmvnic_tasklet(void *data) + static void ibmvnic_tasklet(struct tasklet_struct *t) { - struct ibmvnic_adapter *adapter = data; + struct ibmvnic_adapter *adapter = from_tasklet(adapter, t, tasklet); struct ibmvnic_crq_queue *queue = &adapter->crq; union ibmvnic_crq *crq; unsigned long flags; @@@ -4949,8 -4979,7 +4981,7 @@@ static int init_crq_queue(struct ibmvni
retrc = 0;
- tasklet_init(&adapter->tasklet, (void *)ibmvnic_tasklet, - (unsigned long)adapter); + tasklet_setup(&adapter->tasklet, ibmvnic_tasklet);
netdev_dbg(adapter->netdev, "registering irq 0x%x\n", vdev->irq); snprintf(crq->name, sizeof(crq->name), "ibmvnic-%x", @@@ -4986,7 -5015,7 +5017,7 @@@ map_failed return retrc; }
- static int ibmvnic_reset_init(struct ibmvnic_adapter *adapter) + static int ibmvnic_reset_init(struct ibmvnic_adapter *adapter, bool reset) { struct device *dev = &adapter->vdev->dev; unsigned long timeout = msecs_to_jiffies(30000); @@@ -4995,12 -5024,19 +5026,19 @@@
adapter->from_passive_init = false;
- old_num_rx_queues = adapter->req_rx_queues; - old_num_tx_queues = adapter->req_tx_queues; + if (reset) { + old_num_rx_queues = adapter->req_rx_queues; + old_num_tx_queues = adapter->req_tx_queues; + reinit_completion(&adapter->init_done); + }
- reinit_completion(&adapter->init_done); adapter->init_done_rc = 0; - ibmvnic_send_crq_init(adapter); + rc = ibmvnic_send_crq_init(adapter); + if (rc) { + dev_err(dev, "Send crq init failed with error %d\n", rc); + return rc; + } + if (!wait_for_completion_timeout(&adapter->init_done, timeout)) { dev_err(dev, "Initialization sequence timed out\n"); return -1; @@@ -5017,7 -5053,8 +5055,8 @@@ return -1; }
- if (test_bit(0, &adapter->resetting) && !adapter->wait_for_reset && + if (reset && + test_bit(0, &adapter->resetting) && !adapter->wait_for_reset && adapter->reset_reason != VNIC_RESET_MOBILITY) { if (adapter->req_rx_queues != old_num_rx_queues || adapter->req_tx_queues != old_num_tx_queues) { @@@ -5045,48 -5082,6 +5084,6 @@@ return rc; }
- static int ibmvnic_init(struct ibmvnic_adapter *adapter) - { - struct device *dev = &adapter->vdev->dev; - unsigned long timeout = msecs_to_jiffies(30000); - int rc; - - adapter->from_passive_init = false; - - adapter->init_done_rc = 0; - ibmvnic_send_crq_init(adapter); - if (!wait_for_completion_timeout(&adapter->init_done, timeout)) { - dev_err(dev, "Initialization sequence timed out\n"); - return -1; - } - - if (adapter->init_done_rc) { - release_crq_queue(adapter); - return adapter->init_done_rc; - } - - if (adapter->from_passive_init) { - adapter->state = VNIC_OPEN; - adapter->from_passive_init = false; - return -1; - } - - rc = init_sub_crqs(adapter); - if (rc) { - dev_err(dev, "Initialization of sub crqs failed\n"); - release_crq_queue(adapter); - return rc; - } - - rc = init_sub_crq_irqs(adapter); - if (rc) { - dev_err(dev, "Failed to initialize sub crq irqs\n"); - release_crq_queue(adapter); - } - - return rc; - } - static struct device_attribute dev_attr_failover;
static int ibmvnic_probe(struct vio_dev *dev, const struct vio_device_id *id) @@@ -5149,7 -5144,7 +5146,7 @@@ goto ibmvnic_init_fail; }
- rc = ibmvnic_init(adapter); + rc = ibmvnic_reset_init(adapter, false); if (rc && rc != EAGAIN) goto ibmvnic_init_fail; } while (rc == EAGAIN); @@@ -5299,8 -5294,7 +5296,7 @@@ static unsigned long ibmvnic_get_desire for (i = 0; i < adapter->req_tx_queues + adapter->req_rx_queues; i++) ret += 4 * PAGE_SIZE; /* the scrq message queue */
- for (i = 0; i < be32_to_cpu(adapter->login_rsp_buf->num_rxadd_subcrqs); - i++) + for (i = 0; i < adapter->num_active_rx_pools; i++) ret += adapter->rx_pool[i].size * IOMMU_PAGE_ALIGN(adapter->rx_pool[i].buff_size, tbl);
diff --combined drivers/net/ethernet/marvell/mvneta.c index c4345e3d616f,e5ddefe5cade..14df3aec285d --- a/drivers/net/ethernet/marvell/mvneta.c +++ b/drivers/net/ethernet/marvell/mvneta.c @@@ -330,7 -330,6 +330,6 @@@ #define MVNETA_SKB_HEADROOM ALIGN(max(NET_SKB_PAD, XDP_PACKET_HEADROOM), 8) #define MVNETA_SKB_PAD (SKB_DATA_ALIGN(sizeof(struct skb_shared_info) + \ MVNETA_SKB_HEADROOM)) - #define MVNETA_SKB_SIZE(len) (SKB_DATA_ALIGN(len) + MVNETA_SKB_PAD) #define MVNETA_MAX_RX_BUF_SIZE (PAGE_SIZE - MVNETA_SKB_PAD)
#define IS_TSO_HEADER(txq, addr) \ @@@ -752,13 -751,12 +751,12 @@@ static void mvneta_txq_inc_put(struct m static void mvneta_mib_counters_clear(struct mvneta_port *pp) { int i; - u32 dummy;
/* Perform dummy reads from MIB counters */ for (i = 0; i < MVNETA_MIB_LATE_COLLISION; i += 4) - dummy = mvreg_read(pp, (MVNETA_MIB_COUNTERS_BASE + i)); - dummy = mvreg_read(pp, MVNETA_RX_DISCARD_FRAME_COUNT); - dummy = mvreg_read(pp, MVNETA_OVERRUN_FRAME_COUNT); + mvreg_read(pp, (MVNETA_MIB_COUNTERS_BASE + i)); + mvreg_read(pp, MVNETA_RX_DISCARD_FRAME_COUNT); + mvreg_read(pp, MVNETA_OVERRUN_FRAME_COUNT); }
/* Get System Network Statistics */ @@@ -2029,11 -2027,11 +2027,11 @@@ mvneta_xdp_put_buff(struct mvneta_port struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp); int i;
- page_pool_put_page(rxq->page_pool, virt_to_head_page(xdp->data), - sync_len, napi); for (i = 0; i < sinfo->nr_frags; i++) page_pool_put_full_page(rxq->page_pool, skb_frag_page(&sinfo->frags[i]), napi); + page_pool_put_page(rxq->page_pool, virt_to_head_page(xdp->data), + sync_len, napi); }
static int @@@ -2227,8 -2225,7 +2225,7 @@@ mvneta_swbm_rx_frame(struct mvneta_por struct mvneta_rx_desc *rx_desc, struct mvneta_rx_queue *rxq, struct xdp_buff *xdp, int *size, - struct page *page, - struct mvneta_stats *stats) + struct page *page) { unsigned char *data = page_address(page); int data_len = -MVNETA_MH_SIZE, len; @@@ -2236,7 -2233,7 +2233,7 @@@ enum dma_data_direction dma_dir; struct skb_shared_info *sinfo;
- if (MVNETA_SKB_SIZE(rx_desc->data_size) > PAGE_SIZE) { + if (rx_desc->data_size > MVNETA_MAX_RX_BUF_SIZE) { len = MVNETA_MAX_RX_BUF_SIZE; data_len += len; } else { @@@ -2307,11 -2304,8 +2304,8 @@@ mvneta_swbm_build_skb(struct mvneta_por { struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp); int i, num_frags = sinfo->nr_frags; - skb_frag_t frags[MAX_SKB_FRAGS]; struct sk_buff *skb;
- memcpy(frags, sinfo->frags, sizeof(skb_frag_t) * num_frags); - skb = build_skb(xdp->data_hard_start, PAGE_SIZE); if (!skb) return ERR_PTR(-ENOMEM); @@@ -2323,12 -2317,12 +2317,12 @@@ mvneta_rx_csum(pp, desc_status, skb);
for (i = 0; i < num_frags; i++) { - struct page *page = skb_frag_page(&frags[i]); + skb_frag_t *frag = &sinfo->frags[i];
skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, - page, skb_frag_off(&frags[i]), - skb_frag_size(&frags[i]), PAGE_SIZE); - page_pool_release_page(rxq->page_pool, page); + skb_frag_page(frag), skb_frag_off(frag), + skb_frag_size(frag), PAGE_SIZE); + page_pool_release_page(rxq->page_pool, skb_frag_page(frag)); }
return skb; @@@ -2381,14 -2375,10 +2375,14 @@@ static int mvneta_rx_swbm(struct napi_s desc_status = rx_desc->status;
mvneta_swbm_rx_frame(pp, rx_desc, rxq, &xdp_buf, - &size, page, &ps); + &size, page); } else { - if (unlikely(!xdp_buf.data_hard_start)) + if (unlikely(!xdp_buf.data_hard_start)) { + rx_desc->buf_phys_addr = 0; + page_pool_put_full_page(rxq->page_pool, page, + true); continue; + }
mvneta_swbm_add_rx_fragment(pp, rx_desc, rxq, &xdp_buf, &size, page); diff --combined drivers/net/ethernet/qlogic/qed/qed_dev.c index 3db181f3617a,f7f08e6a3acf..d2f5855b2ea7 --- a/drivers/net/ethernet/qlogic/qed/qed_dev.c +++ b/drivers/net/ethernet/qlogic/qed/qed_dev.c @@@ -3973,6 -3973,7 +3973,7 @@@ static int qed_hw_get_nvm_info(struct q struct qed_mcp_link_speed_params *ext_speed; struct qed_mcp_link_capabilities *p_caps; struct qed_mcp_link_params *link; + int i;
/* Read global nvm_cfg address */ nvm_cfg_addr = qed_rd(p_hwfn, p_ptt, MISC_REG_GEN_PURP_CR0); @@@ -4253,8 -4254,7 +4254,8 @@@ cdev->mf_bits = BIT(QED_MF_LLH_MAC_CLSS) | BIT(QED_MF_LLH_PROTO_CLSS) | BIT(QED_MF_LL2_NON_UNICAST) | - BIT(QED_MF_INTER_PF_SWITCH); + BIT(QED_MF_INTER_PF_SWITCH) | + BIT(QED_MF_DISABLE_ARFS); break; case NVM_CFG1_GLOB_MF_MODE_DEFAULT: cdev->mf_bits = BIT(QED_MF_LLH_MAC_CLSS) | @@@ -4267,14 -4267,6 +4268,14 @@@
DP_INFO(p_hwfn, "Multi function mode is 0x%lx\n", cdev->mf_bits); + + /* In CMT the PF is unknown when the GFS block processes the + * packet. Therefore cannot use searcher as it has a per PF + * database, and thus ARFS must be disabled. + * + */ + if (QED_IS_CMT(cdev)) + cdev->mf_bits |= BIT(QED_MF_DISABLE_ARFS); }
DP_INFO(p_hwfn, "Multi function mode is 0x%lx\n", @@@ -4299,6 -4291,14 +4300,14 @@@ __set_bit(QED_DEV_CAP_ROCE, &p_hwfn->hw_info.device_capabilities);
+ /* Read device serial number information from shmem */ + addr = MCP_REG_SCRATCH + nvm_cfg1_offset + + offsetof(struct nvm_cfg1, glob) + + offsetof(struct nvm_cfg1_glob, serial_number); + + for (i = 0; i < 4; i++) + p_hwfn->hw_info.part_num[i] = qed_rd(p_hwfn, p_ptt, addr + i * 4); + return qed_mcp_fill_shmem_func_info(p_hwfn, p_ptt); }
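The qed_dev.c hunk above pulls the 16-byte serial number out of shmem as four 32-bit reads, using offsetof() against the nvm_cfg layout rather than hard-coded offsets. A sketch of the access pattern (shmem_base stands in for MCP_REG_SCRATCH + nvm_cfg1_offset):

	u32 addr = shmem_base +
		   offsetof(struct nvm_cfg1, glob) +
		   offsetof(struct nvm_cfg1_glob, serial_number);

	for (i = 0; i < 4; i++)		/* 4 dwords == 16 bytes */
		part_num[i] = qed_rd(p_hwfn, p_ptt, addr + i * 4);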
diff --combined drivers/net/ethernet/qlogic/qed/qed_main.c index 50e5eb22e60a,f2ab2b086946..5bd58c65e163 --- a/drivers/net/ethernet/qlogic/qed/qed_main.c +++ b/drivers/net/ethernet/qlogic/qed/qed_main.c @@@ -39,6 -39,7 +39,7 @@@ #include "qed_hw.h" #include "qed_selftest.h" #include "qed_debug.h" + #include "qed_devlink.h"
#define QED_ROCE_QPS (8192) #define QED_ROCE_DPIS (8) @@@ -444,8 -445,6 +445,8 @@@ int qed_fill_dev_info(struct qed_dev *c dev_info->fw_eng = FW_ENGINEERING_VERSION; dev_info->b_inter_pf_switch = test_bit(QED_MF_INTER_PF_SWITCH, &cdev->mf_bits); + if (!test_bit(QED_MF_DISABLE_ARFS, &cdev->mf_bits)) + dev_info->b_arfs_capable = true; dev_info->tx_switching = true;
if (hw_info->b_wol_support == QED_WOL_SUPPORT_PME) @@@ -480,6 -479,7 +481,7 @@@ }
dev_info->mtu = hw_info->mtu; + cdev->common_dev_info = *dev_info;
return 0; } @@@ -512,107 -512,6 +514,6 @@@ static int qed_set_power_state(struct q return 0; }
- struct qed_devlink { - struct qed_dev *cdev; - }; - - enum qed_devlink_param_id { - QED_DEVLINK_PARAM_ID_BASE = DEVLINK_PARAM_GENERIC_ID_MAX, - QED_DEVLINK_PARAM_ID_IWARP_CMT, - }; - - static int qed_dl_param_get(struct devlink *dl, u32 id, - struct devlink_param_gset_ctx *ctx) - { - struct qed_devlink *qed_dl; - struct qed_dev *cdev; - - qed_dl = devlink_priv(dl); - cdev = qed_dl->cdev; - ctx->val.vbool = cdev->iwarp_cmt; - - return 0; - } - - static int qed_dl_param_set(struct devlink *dl, u32 id, - struct devlink_param_gset_ctx *ctx) - { - struct qed_devlink *qed_dl; - struct qed_dev *cdev; - - qed_dl = devlink_priv(dl); - cdev = qed_dl->cdev; - cdev->iwarp_cmt = ctx->val.vbool; - - return 0; - } - - static const struct devlink_param qed_devlink_params[] = { - DEVLINK_PARAM_DRIVER(QED_DEVLINK_PARAM_ID_IWARP_CMT, - "iwarp_cmt", DEVLINK_PARAM_TYPE_BOOL, - BIT(DEVLINK_PARAM_CMODE_RUNTIME), - qed_dl_param_get, qed_dl_param_set, NULL), - }; - - static const struct devlink_ops qed_dl_ops; - - static int qed_devlink_register(struct qed_dev *cdev) - { - union devlink_param_value value; - struct qed_devlink *qed_dl; - struct devlink *dl; - int rc; - - dl = devlink_alloc(&qed_dl_ops, sizeof(*qed_dl)); - if (!dl) - return -ENOMEM; - - qed_dl = devlink_priv(dl); - - cdev->dl = dl; - qed_dl->cdev = cdev; - - rc = devlink_register(dl, &cdev->pdev->dev); - if (rc) - goto err_free; - - rc = devlink_params_register(dl, qed_devlink_params, - ARRAY_SIZE(qed_devlink_params)); - if (rc) - goto err_unregister; - - value.vbool = false; - devlink_param_driverinit_value_set(dl, - QED_DEVLINK_PARAM_ID_IWARP_CMT, - value); - - devlink_params_publish(dl); - cdev->iwarp_cmt = false; - - return 0; - - err_unregister: - devlink_unregister(dl); - - err_free: - cdev->dl = NULL; - devlink_free(dl); - - return rc; - } - - static void qed_devlink_unregister(struct qed_dev *cdev) - { - if (!cdev->dl) - return; - - devlink_params_unregister(cdev->dl, qed_devlink_params, - ARRAY_SIZE(qed_devlink_params)); - - devlink_unregister(cdev->dl); - devlink_free(cdev->dl); - } - /* probing */ static struct qed_dev *qed_probe(struct pci_dev *pdev, struct qed_probe_params *params) @@@ -641,12 -540,6 +542,6 @@@ } DP_INFO(cdev, "PCI init completed successfully\n");
- rc = qed_devlink_register(cdev); - if (rc) { - DP_INFO(cdev, "Failed to register devlink.\n"); - goto err2; - } - rc = qed_hw_prepare(cdev, QED_PCI_DEFAULT); if (rc) { DP_ERR(cdev, "hw prepare failed\n"); @@@ -676,8 -569,6 +571,6 @@@ static void qed_remove(struct qed_dev *
qed_set_power_state(cdev, PCI_D3hot);
- qed_devlink_unregister(cdev); - qed_free_cdev(cdev); }
@@@ -843,7 -734,7 +736,7 @@@ static irqreturn_t qed_single_int(int i
/* Slowpath interrupt */ if (unlikely(status & 0x1)) { - tasklet_schedule(hwfn->sp_dpc); + tasklet_schedule(&hwfn->sp_dpc); status &= ~0x1; rc = IRQ_HANDLED; } @@@ -889,7 -780,7 +782,7 @@@ int qed_slowpath_irq_req(struct qed_hwf id, cdev->pdev->bus->number, PCI_SLOT(cdev->pdev->devfn), hwfn->abs_pf_id); rc = request_irq(cdev->int_params.msix_table[id].vector, - qed_msix_sp_int, 0, hwfn->name, hwfn->sp_dpc); + qed_msix_sp_int, 0, hwfn->name, &hwfn->sp_dpc); } else { unsigned long flags = 0;
@@@ -921,8 -812,8 +814,8 @@@ static void qed_slowpath_tasklet_flush( * enable function makes this sequence a flush-like operation. */ if (p_hwfn->b_sp_dpc_enabled) { - tasklet_disable(p_hwfn->sp_dpc); - tasklet_enable(p_hwfn->sp_dpc); + tasklet_disable(&p_hwfn->sp_dpc); + tasklet_enable(&p_hwfn->sp_dpc); } }
@@@ -951,7 -842,7 +844,7 @@@ static void qed_slowpath_irq_free(struc break; synchronize_irq(cdev->int_params.msix_table[i].vector); free_irq(cdev->int_params.msix_table[i].vector, - cdev->hwfns[i].sp_dpc); + &cdev->hwfns[i].sp_dpc); } } else { if (QED_LEADING_HWFN(cdev)->b_int_requested) @@@ -970,11 -861,11 +863,11 @@@ static int qed_nic_stop(struct qed_dev struct qed_hwfn *p_hwfn = &cdev->hwfns[i];
if (p_hwfn->b_sp_dpc_enabled) { - tasklet_disable(p_hwfn->sp_dpc); + tasklet_disable(&p_hwfn->sp_dpc); p_hwfn->b_sp_dpc_enabled = false; DP_VERBOSE(cdev, NETIF_MSG_IFDOWN, "Disabled sp tasklet [hwfn %d] at %p\n", - i, p_hwfn->sp_dpc); + i, &p_hwfn->sp_dpc); } }
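The qed_main.c hunks above all follow from sp_dpc becoming a struct tasklet_struct embedded in the hwfn rather than a separately allocated pointer, so every caller now takes its address. The disable/enable pair works as a flush because tasklet_disable() waits for a running tasklet to finish. Sketch:

	struct hwfn_like {			/* hypothetical container */
		struct tasklet_struct sp_dpc;	/* embedded, not a pointer */
		bool b_sp_dpc_enabled;
	};

	tasklet_schedule(&hwfn->sp_dpc);	/* address of the member */

	/* flush: disable() blocks until a running instance completes */
	tasklet_disable(&hwfn->sp_dpc);
	tasklet_enable(&hwfn->sp_dpc);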
@@@ -2926,7 -2817,7 +2819,7 @@@ static int qed_set_led(struct qed_dev * return status; }
- static int qed_recovery_process(struct qed_dev *cdev) + int qed_recovery_process(struct qed_dev *cdev) { struct qed_hwfn *p_hwfn = QED_LEADING_HWFN(cdev); struct qed_ptt *p_ptt; @@@ -3114,6 -3005,9 +3007,9 @@@ const struct qed_common_ops qed_common_ .get_link = &qed_get_current_link, .drain = &qed_drain, .update_msglvl = &qed_init_dp, + .devlink_register = qed_devlink_register, + .devlink_unregister = qed_devlink_unregister, + .report_fatal_error = qed_report_fatal_error, .dbg_all_data = &qed_dbg_all_data, .dbg_all_data_size = &qed_dbg_all_data_size, .chain_alloc = &qed_chain_alloc, diff --combined drivers/net/ethernet/qlogic/qed/qed_rdma.c index 0df6e0587752,d3136556a1e9..da864d12916b --- a/drivers/net/ethernet/qlogic/qed/qed_rdma.c +++ b/drivers/net/ethernet/qlogic/qed/qed_rdma.c @@@ -504,8 -504,7 +504,8 @@@ static void qed_rdma_init_devinfo(struc dev->max_mw = 0; dev->max_mr_mw_fmr_pbl = (PAGE_SIZE / 8) * (PAGE_SIZE / 8); dev->max_mr_mw_fmr_size = dev->max_mr_mw_fmr_pbl * PAGE_SIZE; - dev->max_pkey = QED_RDMA_MAX_P_KEY; + if (QED_IS_ROCE_PERSONALITY(p_hwfn)) + dev->max_pkey = QED_RDMA_MAX_P_KEY;
dev->max_srq = p_hwfn->p_rdma_info->num_srqs; dev->max_srq_wr = QED_RDMA_MAX_SRQ_WQE_ELEM; @@@ -1152,7 -1151,6 +1152,6 @@@ qed_rdma_destroy_cq(void *rdma_cxt DP_VERBOSE(p_hwfn, QED_MSG_RDMA, "icid = %08x\n", in_params->icid);
p_ramrod_res = - (struct rdma_destroy_cq_output_params *) dma_alloc_coherent(&p_hwfn->cdev->pdev->dev, sizeof(struct rdma_destroy_cq_output_params), &ramrod_res_phys, GFP_KERNEL); @@@ -1464,14 -1462,14 +1463,14 @@@ static int qed_rdma_modify_qp(void *rdm
switch (qp->qp_type) { case QED_RDMA_QP_TYPE_XRC_INI: - qp->has_req = 1; + qp->has_req = true; break; case QED_RDMA_QP_TYPE_XRC_TGT: - qp->has_resp = 1; + qp->has_resp = true; break; default: - qp->has_req = 1; - qp->has_resp = 1; + qp->has_req = true; + qp->has_resp = true; }
if (QED_IS_IWARP_PERSONALITY(p_hwfn)) { @@@ -1521,7 -1519,7 +1520,7 @@@ qed_rdma_register_tid(void *rdma_cxt params->pbl_two_level);
SET_FIELD(flags, RDMA_REGISTER_TID_RAMROD_DATA_ZERO_BASED, - params->zbva); + false);
SET_FIELD(flags, RDMA_REGISTER_TID_RAMROD_DATA_PHY_MR, params->phy_mr);
@@@ -1583,7 -1581,15 +1582,7 @@@ p_ramrod->pd = cpu_to_le16(params->pd); p_ramrod->length_hi = (u8)(params->length >> 32); p_ramrod->length_lo = DMA_LO_LE(params->length); - if (params->zbva) { - /* Lower 32 bits of the registered MR address. - * In case of zero based MR, will hold FBO - */ - p_ramrod->va.hi = 0; - p_ramrod->va.lo = cpu_to_le32(params->fbo); - } else { - DMA_REGPAIR_LE(p_ramrod->va, params->vaddr); - } + DMA_REGPAIR_LE(p_ramrod->va, params->vaddr); DMA_REGPAIR_LE(p_ramrod->pbl_base, params->pbl_ptr);
/* DIF */ diff --combined drivers/net/ethernet/qlogic/qede/qede_main.c index 9e1f41ba766c,20d2296beb79..05e3a3b60269 --- a/drivers/net/ethernet/qlogic/qede/qede_main.c +++ b/drivers/net/ethernet/qlogic/qede/qede_main.c @@@ -804,7 -804,7 +804,7 @@@ static void qede_init_ndev(struct qede_ NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM | NETIF_F_TSO | NETIF_F_TSO6 | NETIF_F_HW_TC;
- if (!IS_VF(edev) && edev->dev_info.common.num_hwfns == 1) + if (edev->dev_info.common.b_arfs_capable) hw_features |= NETIF_F_NTUPLE;
if (edev->dev_info.common.vxlan_enable || @@@ -1170,10 -1170,23 +1170,23 @@@ static int __qede_probe(struct pci_dev rc = -ENOMEM; goto err2; } + + edev->devlink = qed_ops->common->devlink_register(cdev); + if (IS_ERR(edev->devlink)) { + DP_NOTICE(edev, "Cannot register devlink\n"); + edev->devlink = NULL; + /* Go on, we can live without devlink */ + } } else { struct net_device *ndev = pci_get_drvdata(pdev);
edev = netdev_priv(ndev); + + if (edev->devlink) { + struct qed_devlink *qdl = devlink_priv(edev->devlink); + + qdl->cdev = cdev; + } edev->cdev = cdev; memset(&edev->stats, 0, sizeof(edev->stats)); memcpy(&edev->dev_info, &dev_info, sizeof(dev_info)); @@@ -1225,7 -1238,10 +1238,10 @@@ err4: qede_rdma_dev_remove(edev, (mode == QEDE_PROBE_RECOVERY)); err3: - free_netdev(edev->ndev); + if (mode != QEDE_PROBE_RECOVERY) + free_netdev(edev->ndev); + else + edev->cdev = NULL; err2: qed_ops->common->slowpath_stop(cdev); err1: @@@ -1296,6 -1312,11 +1312,11 @@@ static void __qede_remove(struct pci_de qed_ops->common->slowpath_stop(cdev); if (system_state == SYSTEM_POWER_OFF) return; + + if (mode != QEDE_REMOVE_RECOVERY && edev->devlink) { + qed_ops->common->devlink_unregister(edev->devlink); + edev->devlink = NULL; + } qed_ops->common->remove(cdev); edev->cdev = NULL;
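__qede_probe() above treats devlink registration as best-effort: on failure it logs, stores NULL, and carries on, and the recovery path re-attaches the existing devlink instance to the new cdev instead of registering again. Every later use is guarded, so the feature degrades gracefully. A sketch of that shape (ops names as in the qed common interface):

	edev->devlink = qed_ops->common->devlink_register(cdev);
	if (IS_ERR(edev->devlink)) {
		DP_NOTICE(edev, "Cannot register devlink\n");
		edev->devlink = NULL;	/* feature off, driver still usable */
	}

	/* ... */

	if (edev->devlink)	/* guard every consumer */
		edev->ops->common->report_fatal_error(edev->devlink,
						      edev->last_err_type);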
@@@ -2274,7 -2295,7 +2295,7 @@@ static void qede_unload(struct qede_de qede_vlan_mark_nonconfigured(edev); edev->ops->fastpath_stop(edev->cdev);
- if (!IS_VF(edev) && edev->dev_info.common.num_hwfns == 1) { + if (edev->dev_info.common.b_arfs_capable) { qede_poll_for_freeing_arfs_filters(edev); qede_free_arfs(edev); } @@@ -2341,9 -2362,10 +2362,9 @@@ static int qede_load(struct qede_dev *e if (rc) goto err2;
- if (!IS_VF(edev) && edev->dev_info.common.num_hwfns == 1) { - rc = qede_alloc_arfs(edev); - if (rc) - DP_NOTICE(edev, "aRFS memory allocation failed\n"); + if (qede_alloc_arfs(edev)) { + edev->ndev->features &= ~NETIF_F_NTUPLE; + edev->dev_info.common.b_arfs_capable = false; }
qede_napi_add_enable(edev); @@@ -2454,7 -2476,8 +2475,8 @@@ static int qede_close(struct net_devic
qede_unload(edev, QEDE_UNLOAD_NORMAL, false);
- edev->ops->common->update_drv_state(edev->cdev, false); + if (edev->cdev) + edev->ops->common->update_drv_state(edev->cdev, false);
return 0; } @@@ -2576,19 -2599,12 +2598,12 @@@ static void qede_atomic_hw_err_handler(
static void qede_generic_hw_err_handler(struct qede_dev *edev) { - struct qed_dev *cdev = edev->cdev; - DP_NOTICE(edev, "Generic sleepable HW error handling started - err_flags 0x%lx\n", edev->err_flags);
- /* Trigger a recovery process. - * This is placed in the sleep requiring section just to make - * sure it is the last one, and that all the other operations - * were completed. - */ - if (test_bit(QEDE_ERR_IS_RECOVERABLE, &edev->err_flags)) - edev->ops->common->recovery_process(cdev); + if (edev->devlink) + edev->ops->common->report_fatal_error(edev->devlink, edev->last_err_type);
clear_bit(QEDE_ERR_IS_HANDLED, &edev->err_flags);
@@@ -2642,6 -2658,7 +2657,7 @@@ static void qede_schedule_hw_err_handle return; }
+ edev->last_err_type = err_type; qede_set_hw_err_flags(edev, err_type); qede_atomic_hw_err_handler(edev); set_bit(QEDE_SP_HW_ERR, &edev->sp_flags); diff --combined drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c index 31ad3a5cd128,1753736c56f7..d8882d0b6b49 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c @@@ -1,7 -1,8 +1,7 @@@ +// SPDX-License-Identifier: GPL-2.0-only /* * QLogic qlcnic NIC Driver * Copyright (c) 2009-2013 QLogic Corporation - * - * See LICENSE.qlcnic for copyright and licensing details. */
#include <linux/if_vlan.h> @@@ -657,11 -658,10 +657,10 @@@ int qlcnic_83xx_cam_lock(struct qlcnic_ void qlcnic_83xx_cam_unlock(struct qlcnic_adapter *adapter) { void __iomem *addr; - u32 val; struct qlcnic_hardware_context *ahw = adapter->ahw;
addr = ahw->pci_base0 + QLC_83XX_SEM_UNLOCK_FUNC(ahw->pci_func); - val = readl(addr); + readl(addr); }
void qlcnic_83xx_read_crb(struct qlcnic_adapter *adapter, char *buf, @@@ -3812,7 -3812,6 +3811,6 @@@ static int qlcnic_83xx_shutdown(struct { struct qlcnic_adapter *adapter = pci_get_drvdata(pdev); struct net_device *netdev = adapter->netdev; - int retval;
netif_device_detach(netdev); qlcnic_cancel_idc_work(adapter); @@@ -3823,11 -3822,7 +3821,7 @@@ qlcnic_83xx_disable_mbx_intr(adapter); cancel_delayed_work_sync(&adapter->idc_aen_work);
- retval = pci_save_state(pdev); - if (retval) - return retval; - - return 0; + return pci_save_state(pdev); }
static int qlcnic_83xx_resume(struct qlcnic_adapter *adapter) diff --combined drivers/net/ethernet/smsc/smc91x.c index d8254b0cfe45,f6b73afd1879..b5d053292e71 --- a/drivers/net/ethernet/smsc/smc91x.c +++ b/drivers/net/ethernet/smsc/smc91x.c @@@ -535,10 -535,10 +535,10 @@@ static inline void smc_rcv(struct net_ /* * This is called to actually send a packet to the chip. */ - static void smc_hardware_send_pkt(unsigned long data) + static void smc_hardware_send_pkt(struct tasklet_struct *t) { - struct net_device *dev = (struct net_device *)data; - struct smc_local *lp = netdev_priv(dev); + struct smc_local *lp = from_tasklet(lp, t, tx_task); + struct net_device *dev = lp->dev; void __iomem *ioaddr = lp->base; struct sk_buff *skb; unsigned int packet_no, len; @@@ -688,7 -688,7 +688,7 @@@ smc_hard_start_xmit(struct sk_buff *skb * Allocation succeeded: push packet to the chip's own memory * immediately. */ - smc_hardware_send_pkt((unsigned long)dev); + smc_hardware_send_pkt(&lp->tx_task); }
return NETDEV_TX_OK; @@@ -1036,7 -1036,6 +1036,6 @@@ static void smc_phy_configure(struct wo int phyaddr = lp->mii.phy_id; int my_phy_caps; /* My PHY capabilities */ int my_ad_caps; /* My Advertised capabilities */ - int status;
DBG(3, dev, "smc_program_phy()\n");
@@@ -1110,7 -1109,7 +1109,7 @@@ * auto-negotiation is restarted, sometimes it isn't ready and * the link does not come up. */ - status = smc_phy_read(dev, phyaddr, MII_ADVERTISE); + smc_phy_read(dev, phyaddr, MII_ADVERTISE);
DBG(2, dev, "phy caps=%x\n", my_phy_caps); DBG(2, dev, "phy advertised caps=%x\n", my_ad_caps); @@@ -1965,7 -1964,7 +1964,7 @@@ static int smc_probe(struct net_device dev->netdev_ops = &smc_netdev_ops; dev->ethtool_ops = &smc_ethtool_ops;
- tasklet_init(&lp->tx_task, smc_hardware_send_pkt, (unsigned long)dev); + tasklet_setup(&lp->tx_task, smc_hardware_send_pkt); INIT_WORK(&lp->phy_configure, smc_phy_configure); lp->dev = dev; lp->mii.phy_id_mask = 0x1f; @@@ -2190,7 -2189,6 +2189,7 @@@ static const struct of_device_id smc91x }; MODULE_DEVICE_TABLE(of, smc91x_match);
+#if defined(CONFIG_GPIOLIB) /** * try_toggle_control_gpio - configure a gpio if it exists */ @@@ -2215,15 -2213,6 +2214,15 @@@ static int try_toggle_control_gpio(stru
return 0; } +#else +static int try_toggle_control_gpio(struct device *dev, + struct gpio_desc **desc, + const char *name, int index, + int value, unsigned int nsdelay) +{ + return 0; +} +#endif #endif
/* diff --combined drivers/net/ethernet/ti/cpsw_new.c index 15672d0a4de6,a3528c5c823f..26073cd63e33 --- a/drivers/net/ethernet/ti/cpsw_new.c +++ b/drivers/net/ethernet/ti/cpsw_new.c @@@ -17,7 -17,6 +17,7 @@@ #include <linux/phy.h> #include <linux/phy/phy.h> #include <linux/delay.h> +#include <linux/pinctrl/consumer.h> #include <linux/pm_runtime.h> #include <linux/gpio/consumer.h> #include <linux/of.h> @@@ -1244,7 -1243,6 +1244,6 @@@ static int cpsw_probe_dt(struct cpsw_co
data->active_slave = 0; data->channels = CPSW_MAX_QUEUES; - data->ale_entries = CPSW_ALE_NUM_ENTRIES; data->dual_emac = true; data->bd_ram_size = CPSW_BD_RAM_SIZE; data->mac_control = 0; @@@ -2071,61 -2069,9 +2070,61 @@@ static int cpsw_remove(struct platform_ return 0; }
+static int __maybe_unused cpsw_suspend(struct device *dev) +{ + struct cpsw_common *cpsw = dev_get_drvdata(dev); + int i; + + rtnl_lock(); + + for (i = 0; i < cpsw->data.slaves; i++) { + struct net_device *ndev = cpsw->slaves[i].ndev; + + if (!(ndev && netif_running(ndev))) + continue; + + cpsw_ndo_stop(ndev); + } + + rtnl_unlock(); + + /* Select sleep pin state */ + pinctrl_pm_select_sleep_state(dev); + + return 0; +} + +static int __maybe_unused cpsw_resume(struct device *dev) +{ + struct cpsw_common *cpsw = dev_get_drvdata(dev); + int i; + + /* Select default pin state */ + pinctrl_pm_select_default_state(dev); + + /* shut up ASSERT_RTNL() warning in netif_set_real_num_tx/rx_queues */ + rtnl_lock(); + + for (i = 0; i < cpsw->data.slaves; i++) { + struct net_device *ndev = cpsw->slaves[i].ndev; + + if (!(ndev && netif_running(ndev))) + continue; + + cpsw_ndo_open(ndev); + } + + rtnl_unlock(); + + return 0; +} + +static SIMPLE_DEV_PM_OPS(cpsw_pm_ops, cpsw_suspend, cpsw_resume); + static struct platform_driver cpsw_driver = { .driver = { .name = "cpsw-switch", + .pm = &cpsw_pm_ops, .of_match_table = cpsw_of_mtable, }, .probe = cpsw_probe, diff --combined drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c index 3c07d1bbe1c6,991ca9e32be3..d4989e0cd7be --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c @@@ -664,15 -664,9 +664,15 @@@ static void pkt_align(struct sk_buff *p /* To check if there's window offered */ static bool data_ok(struct brcmf_sdio *bus) { - /* Reserve TXCTL_CREDITS credits for txctl */ - return (bus->tx_max - bus->tx_seq) > TXCTL_CREDITS && - ((bus->tx_max - bus->tx_seq) & 0x80) == 0; + u8 tx_rsv = 0; + + /* Reserve TXCTL_CREDITS credits for txctl when it is ready to send */ + if (bus->ctrl_frame_stat) + tx_rsv = TXCTL_CREDITS; + + return (bus->tx_max - bus->tx_seq - tx_rsv) != 0 && + ((bus->tx_max - bus->tx_seq - tx_rsv) & 0x80) == 0; + }
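Note on the reworked data_ok() above: it is easier to follow once the mod-256 arithmetic is spelled out. tx_max and tx_seq live in an 8-bit sequence space, so their difference wraps, and bit 0x80 serves as the sign bit of the wrapped result. A stand-alone hedged sketch of the same test (the helper name is hypothetical, not part of the driver):

	/* Window is open iff some credit remains after the reservation
	 * and the wrapped difference has not gone "negative".
	 */
	static bool window_open(u8 tx_max, u8 tx_seq, u8 reserved)
	{
		u8 avail = tx_max - tx_seq - reserved;	/* mod-256 */

		return avail != 0 && !(avail & 0x80);
	}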
/* To check if there's window offered */ @@@ -4278,8 -4272,9 +4278,9 @@@ static void brcmf_sdio_firmware_callbac brcmf_sdiod_writeb(sdiod, SBSDIO_FUNC1_MESBUSYCTRL, CY_43012_MESBUSYCTRL, &err); break; + case SDIO_DEVICE_ID_BROADCOM_4329: case SDIO_DEVICE_ID_BROADCOM_4339: - brcmf_dbg(INFO, "set F2 watermark to 0x%x*4 bytes for 4339\n", + brcmf_dbg(INFO, "set F2 watermark to 0x%x*4 bytes\n", CY_4339_F2_WATERMARK); brcmf_sdiod_writeb(sdiod, SBSDIO_WATERMARK, CY_4339_F2_WATERMARK, &err); @@@ -4292,7 -4287,7 +4293,7 @@@ CY_4339_MESBUSYCTRL, &err); break; case SDIO_DEVICE_ID_BROADCOM_43455: - brcmf_dbg(INFO, "set F2 watermark to 0x%x*4 bytes for 43455\n", + brcmf_dbg(INFO, "set F2 watermark to 0x%x*4 bytes\n", CY_43455_F2_WATERMARK); brcmf_sdiod_writeb(sdiod, SBSDIO_WATERMARK, CY_43455_F2_WATERMARK, &err); @@@ -4305,9 -4300,7 +4306,7 @@@ CY_43455_MESBUSYCTRL, &err); break; case SDIO_DEVICE_ID_BROADCOM_4359: - /* fallthrough */ case SDIO_DEVICE_ID_BROADCOM_4354: - /* fallthrough */ case SDIO_DEVICE_ID_BROADCOM_4356: brcmf_dbg(INFO, "set F2 watermark to 0x%x*4 bytes\n", CY_435X_F2_WATERMARK); diff --combined drivers/net/wireless/marvell/mwifiex/fw.h index d9f8bdbc817b,1f02c5058aed..470d669c7f14 --- a/drivers/net/wireless/marvell/mwifiex/fw.h +++ b/drivers/net/wireless/marvell/mwifiex/fw.h @@@ -513,10 -513,10 +513,10 @@@ enum mwifiex_channel_flags
#define RF_ANTENNA_AUTO 0xFFFF
- #define HostCmd_SET_SEQ_NO_BSS_INFO(seq, num, type) { \ - (((seq) & 0x00ff) | \ - (((num) & 0x000f) << 8)) | \ - (((type) & 0x000f) << 12); } + #define HostCmd_SET_SEQ_NO_BSS_INFO(seq, num, type) \ + ((((seq) & 0x00ff) | \ + (((num) & 0x000f) << 8)) | \ + (((type) & 0x000f) << 12))
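The macro rewrite above matters because the old brace-and-semicolon form was a statement, not an expression, so the value it computed could never reach an assignment. With the fixed version the packed field can be used directly; a hedged illustration with made-up values:

	/* seq 0x12, BSS number 2, BSS type 1 pack into 0x1212:
	 * bits 0-7 seq, bits 8-11 num, bits 12-15 type.
	 */
	u16 seq_field = HostCmd_SET_SEQ_NO_BSS_INFO(0x12, 2, 1);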
#define HostCmd_GET_SEQ_NO(seq) \ ((seq) & HostCmd_SEQ_NUM_MASK) @@@ -954,7 -954,7 +954,7 @@@ struct mwifiex_tkip_param struct mwifiex_aes_param { u8 pn[WPA_PN_SIZE]; __le16 key_len; - u8 key[WLAN_KEY_LEN_CCMP]; + u8 key[WLAN_KEY_LEN_CCMP_256]; } __packed;
struct mwifiex_wapi_param { diff --combined drivers/net/wireless/mediatek/mt76/mt7615/mcu.c index bd316dbd9041,084982eb6abd..7781530fb3e6 --- a/drivers/net/wireless/mediatek/mt76/mt7615/mcu.c +++ b/drivers/net/wireless/mediatek/mt76/mt7615/mcu.c @@@ -650,12 -650,12 +650,12 @@@ mt7615_mcu_add_beacon_offload(struct mt memcpy(req.pkt + MT_TXD_SIZE, skb->data, skb->len); req.pkt_len = cpu_to_le16(MT_TXD_SIZE + skb->len); req.tim_ie_pos = cpu_to_le16(MT_TXD_SIZE + offs.tim_offset); - if (offs.csa_counter_offs[0]) { + if (offs.cntdwn_counter_offs[0]) { u16 csa_offs;
- csa_offs = MT_TXD_SIZE + offs.csa_counter_offs[0] - 4; + csa_offs = MT_TXD_SIZE + offs.cntdwn_counter_offs[0] - 4; req.csa_ie_pos = cpu_to_le16(csa_offs); - req.csa_cnt = skb->data[offs.csa_counter_offs[0]]; + req.csa_cnt = skb->data[offs.cntdwn_counter_offs[0]]; } dev_kfree_skb(skb);
@@@ -1713,10 -1713,10 +1713,10 @@@ mt7615_mcu_uni_add_beacon_offload(struc req.beacon_tlv.pkt_len = cpu_to_le16(MT_TXD_SIZE + skb->len); req.beacon_tlv.tim_ie_pos = cpu_to_le16(MT_TXD_SIZE + offs.tim_offset);
- if (offs.csa_counter_offs[0]) { + if (offs.cntdwn_counter_offs[0]) { u16 csa_offs;
- csa_offs = MT_TXD_SIZE + offs.csa_counter_offs[0] - 4; + csa_offs = MT_TXD_SIZE + offs.cntdwn_counter_offs[0] - 4; req.beacon_tlv.csa_ie_pos = cpu_to_le16(csa_offs); } dev_kfree_skb(skb); @@@ -2128,8 -2128,7 +2128,8 @@@ static int mt7615_load_n9(struct mt7615 sizeof(dev->mt76.hw->wiphy->fw_version), "%.10s-%.15s", hdr->fw_ver, hdr->build_date);
- if (!strncmp(hdr->fw_ver, "2.0", sizeof(hdr->fw_ver))) { + if (!is_mt7615(&dev->mt76) && + !strncmp(hdr->fw_ver, "2.0", sizeof(hdr->fw_ver))) { dev->fw_ver = MT7615_FIRMWARE_V2; dev->mcu_ops = &sta_update_ops; } else { diff --combined drivers/s390/net/qeth_l2_main.c index 6384f7adba66,54e02518ce08..e12ac32b8b47 --- a/drivers/s390/net/qeth_l2_main.c +++ b/drivers/s390/net/qeth_l2_main.c @@@ -17,10 -17,13 +17,13 @@@ #include <linux/kernel.h> #include <linux/slab.h> #include <linux/etherdevice.h> + #include <linux/if_bridge.h> #include <linux/list.h> #include <linux/hash.h> #include <linux/hashtable.h> + #include <net/switchdev.h> #include <asm/chsc.h> + #include <asm/css_chars.h> #include <asm/setup.h> #include "qeth_core.h" #include "qeth_l2.h" @@@ -30,6 -33,7 +33,7 @@@ static void qeth_bridge_state_change(st struct qeth_ipa_cmd *cmd); static void qeth_addr_change_event(struct qeth_card *card, struct qeth_ipa_cmd *cmd); + static bool qeth_bridgeport_is_in_use(struct qeth_card *card); static void qeth_l2_vnicc_set_defaults(struct qeth_card *card); static void qeth_l2_vnicc_init(struct qeth_card *card); static bool qeth_l2_vnicc_recover_timeout(struct qeth_card *card, u32 vnicc, @@@ -273,8 -277,37 +277,37 @@@ static int qeth_l2_vlan_rx_kill_vid(str return qeth_l2_send_setdelvlan(card, vid, IPA_CMD_DELVLAN); }
+ static void qeth_l2_set_pnso_mode(struct qeth_card *card, + enum qeth_pnso_mode mode) + { + spin_lock_irq(get_ccwdev_lock(CARD_RDEV(card))); + WRITE_ONCE(card->info.pnso_mode, mode); + spin_unlock_irq(get_ccwdev_lock(CARD_RDEV(card))); + + if (mode == QETH_PNSO_NONE) + drain_workqueue(card->event_wq); + } + + static void qeth_l2_dev2br_fdb_flush(struct qeth_card *card) + { + struct switchdev_notifier_fdb_info info; + + QETH_CARD_TEXT(card, 2, "fdbflush"); + + info.addr = NULL; + /* flush all VLANs: */ + info.vid = 0; + info.added_by_user = false; + info.offloaded = true; + + call_switchdev_notifiers(SWITCHDEV_FDB_FLUSH_TO_BRIDGE, + card->dev, &info.info, NULL); + } + static void qeth_l2_stop_card(struct qeth_card *card) { + struct qeth_priv *priv = netdev_priv(card->dev); + QETH_CARD_TEXT(card, 2, "stopcard");
qeth_set_allowed_threads(card, 0, 1); @@@ -284,15 -317,21 +317,21 @@@
if (card->state == CARD_STATE_SOFTSETUP) { qeth_clear_ipacmd_list(card); - qeth_drain_output_queues(card); card->state = CARD_STATE_DOWN; }
qeth_qdio_clear_card(card, 0); + qeth_drain_output_queues(card); qeth_clear_working_pool_list(card); - flush_workqueue(card->event_wq); + qeth_l2_set_pnso_mode(card, QETH_PNSO_NONE); qeth_flush_local_addrs(card); card->info.promisc_mode = 0; + + if (priv->brport_features & BR_LEARNING_SYNC) { + rtnl_lock(); + qeth_l2_dev2br_fdb_flush(card); + rtnl_unlock(); + } }
static int qeth_l2_request_initial_mac(struct qeth_card *card) @@@ -631,6 -670,7 +670,7 @@@ static void qeth_l2_set_rx_mode(struct /** * qeth_l2_pnso() - perform network subchannel operation * @card: qeth_card structure pointer + * @oc: Operation Code * @cnc: Boolean Change-Notification Control * @cb: Callback function will be executed for each element * of the address list @@@ -641,7 -681,7 +681,7 @@@ * control" is set, further changes in the address list will be reported * via the IPA command. */ - static int qeth_l2_pnso(struct qeth_card *card, int cnc, + static int qeth_l2_pnso(struct qeth_card *card, u8 oc, int cnc, void (*cb)(void *priv, struct chsc_pnso_naid_l2 *entry), void *priv) { @@@ -652,13 -692,14 +692,14 @@@ int i, size, elems; int rc;
- QETH_CARD_TEXT(card, 2, "PNSO"); rr = (struct chsc_pnso_area *)get_zeroed_page(GFP_KERNEL); if (rr == NULL) return -ENOMEM; do { + QETH_CARD_TEXT(card, 2, "PNSO"); /* on the first iteration, naihdr.resume_token will be zero */ - rc = ccw_device_pnso(ddev, rr, rr->naihdr.resume_token, cnc); + rc = ccw_device_pnso(ddev, rr, oc, rr->naihdr.resume_token, + cnc); if (rc) continue; if (cb == NULL) @@@ -694,6 -735,218 +735,218 @@@ return rc; }
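To make the callback contract of qeth_l2_pnso() concrete, here is a minimal hedged sketch; the counting callback and its call site are illustrative only, not taken from the driver:

	/* Hypothetical callback: count entries reported by one PNSO run.
	 * @priv carries a caller-owned counter.
	 */
	static void count_naid_cb(void *priv, struct chsc_pnso_naid_l2 *entry)
	{
		(*(unsigned int *)priv)++;
	}

	/* at some call site, with change notification enabled: */
	unsigned int n = 0;
	int rc = qeth_l2_pnso(card, PNSO_OC_NET_ADDR_INFO, 1,
			      count_naid_cb, &n);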
+ static bool qeth_is_my_net_if_token(struct qeth_card *card, + struct net_if_token *token) + { + return ((card->info.ddev_devno == token->devnum) && + (card->info.cssid == token->cssid) && + (card->info.iid == token->iid) && + (card->info.ssid == token->ssid) && + (card->info.chpid == token->chpid) && + (card->info.chid == token->chid)); + } + + /** + * qeth_l2_dev2br_fdb_notify() - update fdb of master bridge + * @card: qeth_card structure pointer + * @code: event bitmask: high order bit 0x80 set to + * 1 - removal of an object + * 0 - addition of an object + * Object type(s): + * 0x01 - VLAN, 0x02 - MAC, 0x03 - VLAN and MAC + * @token: "network token" structure identifying 'physical' location + * of the target + * @addr_lnid: structure with MAC address and VLAN ID of the target + */ + static void qeth_l2_dev2br_fdb_notify(struct qeth_card *card, u8 code, + struct net_if_token *token, + struct mac_addr_lnid *addr_lnid) + { + struct switchdev_notifier_fdb_info info; + u8 ntfy_mac[ETH_ALEN]; + + ether_addr_copy(ntfy_mac, addr_lnid->mac); + /* Ignore VLAN only changes */ + if (!(code & IPA_ADDR_CHANGE_CODE_MACADDR)) + return; + /* Ignore mcast entries */ + if (is_multicast_ether_addr(ntfy_mac)) + return; + /* Ignore my own addresses */ + if (qeth_is_my_net_if_token(card, token)) + return; + + info.addr = ntfy_mac; + /* don't report VLAN IDs */ + info.vid = 0; + info.added_by_user = false; + info.offloaded = true; + + if (code & IPA_ADDR_CHANGE_CODE_REMOVAL) { + call_switchdev_notifiers(SWITCHDEV_FDB_DEL_TO_BRIDGE, + card->dev, &info.info, NULL); + QETH_CARD_TEXT(card, 4, "andelmac"); + QETH_CARD_TEXT_(card, 4, + "mc%012lx", ether_addr_to_u64(ntfy_mac)); + } else { + call_switchdev_notifiers(SWITCHDEV_FDB_ADD_TO_BRIDGE, + card->dev, &info.info, NULL); + QETH_CARD_TEXT(card, 4, "anaddmac"); + QETH_CARD_TEXT_(card, 4, + "mc%012lx", ether_addr_to_u64(ntfy_mac)); + } + } + + static void qeth_l2_dev2br_an_set_cb(void *priv, + struct chsc_pnso_naid_l2 *entry) + { + u8 code = IPA_ADDR_CHANGE_CODE_MACADDR; + struct qeth_card *card = priv; + + if (entry->addr_lnid.lnid < VLAN_N_VID) + code |= IPA_ADDR_CHANGE_CODE_VLANID; + qeth_l2_dev2br_fdb_notify(card, code, + (struct net_if_token *)&entry->nit, + (struct mac_addr_lnid *)&entry->addr_lnid); + } + + /** + * qeth_l2_dev2br_an_set() - + * Enable or disable 'dev to bridge network address notification' + * @card: qeth_card structure pointer + * @enable: Enable or disable 'dev to bridge network address notification' + * + * Returns negative errno-compatible error indication or 0 on success. + * + * On enable, emits a series of address notifications for all + * currently registered hosts. 
+ * + * Must be called under rtnl_lock + */ + static int qeth_l2_dev2br_an_set(struct qeth_card *card, bool enable) + { + int rc; + + if (enable) { + QETH_CARD_TEXT(card, 2, "anseton"); + rc = qeth_l2_pnso(card, PNSO_OC_NET_ADDR_INFO, 1, + qeth_l2_dev2br_an_set_cb, card); + if (rc == -EAGAIN) + /* address notification enabled, but inconsistent + * addresses reported -> disable address notification + */ + qeth_l2_pnso(card, PNSO_OC_NET_ADDR_INFO, 0, + NULL, NULL); + } else { + QETH_CARD_TEXT(card, 2, "ansetoff"); + rc = qeth_l2_pnso(card, PNSO_OC_NET_ADDR_INFO, 0, NULL, NULL); + } + + return rc; + } + + static int qeth_l2_bridge_getlink(struct sk_buff *skb, u32 pid, u32 seq, + struct net_device *dev, u32 filter_mask, + int nlflags) + { + struct qeth_priv *priv = netdev_priv(dev); + struct qeth_card *card = dev->ml_priv; + u16 mode = BRIDGE_MODE_UNDEF; + + /* Do not even show qeth devs that cannot do bridge_setlink */ + if (!priv->brport_hw_features || !netif_device_present(dev) || + qeth_bridgeport_is_in_use(card)) + return -EOPNOTSUPP; + + return ndo_dflt_bridge_getlink(skb, pid, seq, dev, + mode, priv->brport_features, + priv->brport_hw_features, + nlflags, filter_mask, NULL); + } + + static const struct nla_policy qeth_brport_policy[IFLA_BRPORT_MAX + 1] = { + [IFLA_BRPORT_LEARNING_SYNC] = { .type = NLA_U8 }, + }; + + /** + * qeth_l2_bridge_setlink() - set bridgeport attributes + * @dev: netdevice + * @nlh: netlink message header + * @flags: bridge flags (here: BRIDGE_FLAGS_SELF) + * @extack: extended ACK report struct + * + * Called under rtnl_lock + */ + static int qeth_l2_bridge_setlink(struct net_device *dev, struct nlmsghdr *nlh, + u16 flags, struct netlink_ext_ack *extack) + { + struct qeth_priv *priv = netdev_priv(dev); + struct nlattr *bp_tb[IFLA_BRPORT_MAX + 1]; + struct qeth_card *card = dev->ml_priv; + struct nlattr *attr, *nested_attr; + bool enable, has_protinfo = false; + int rem1, rem2; + int rc; + + if (!netif_device_present(dev)) + return -ENODEV; + if (!(priv->brport_hw_features)) + return -EOPNOTSUPP; + + nlmsg_for_each_attr(attr, nlh, sizeof(struct ifinfomsg), rem1) { + if (nla_type(attr) == IFLA_PROTINFO) { + rc = nla_parse_nested(bp_tb, IFLA_BRPORT_MAX, attr, + qeth_brport_policy, extack); + if (rc) + return rc; + has_protinfo = true; + } else if (nla_type(attr) == IFLA_AF_SPEC) { + nla_for_each_nested(nested_attr, attr, rem2) { + if (nla_type(nested_attr) == IFLA_BRIDGE_FLAGS) + continue; + NL_SET_ERR_MSG_ATTR(extack, nested_attr, + "Unsupported attribute"); + return -EINVAL; + } + } else { + NL_SET_ERR_MSG_ATTR(extack, attr, "Unsupported attribute"); + return -EINVAL; + } + } + if (!has_protinfo) + return 0; + if (!bp_tb[IFLA_BRPORT_LEARNING_SYNC]) + return -EINVAL; + enable = !!nla_get_u8(bp_tb[IFLA_BRPORT_LEARNING_SYNC]); + + if (enable == !!(priv->brport_features & BR_LEARNING_SYNC)) + return 0; + + mutex_lock(&card->sbp_lock); + /* do not change anything if BridgePort is enabled */ + if (qeth_bridgeport_is_in_use(card)) { + NL_SET_ERR_MSG(extack, "n/a (BridgePort)"); + rc = -EBUSY; + } else if (enable) { + qeth_l2_set_pnso_mode(card, QETH_PNSO_ADDR_INFO); + rc = qeth_l2_dev2br_an_set(card, true); + if (rc) + qeth_l2_set_pnso_mode(card, QETH_PNSO_NONE); + else + priv->brport_features |= BR_LEARNING_SYNC; + } else { + rc = qeth_l2_dev2br_an_set(card, false); + if (!rc) { + qeth_l2_set_pnso_mode(card, QETH_PNSO_NONE); + priv->brport_features ^= BR_LEARNING_SYNC; + qeth_l2_dev2br_fdb_flush(card); + } + } + mutex_unlock(&card->sbp_lock); + + return rc; + } + static 
const struct net_device_ops qeth_l2_netdev_ops = { .ndo_open = qeth_open, .ndo_stop = qeth_stop, @@@ -709,7 -962,9 +962,9 @@@ .ndo_vlan_rx_kill_vid = qeth_l2_vlan_rx_kill_vid, .ndo_tx_timeout = qeth_tx_timeout, .ndo_fix_features = qeth_fix_features, - .ndo_set_features = qeth_set_features + .ndo_set_features = qeth_set_features, + .ndo_bridge_getlink = qeth_l2_bridge_getlink, + .ndo_bridge_setlink = qeth_l2_bridge_setlink, };
static const struct net_device_ops qeth_osn_netdev_ops = { @@@ -810,8 -1065,78 +1065,78 @@@ static void qeth_l2_setup_bridgeport_at if (card->options.sbp.hostnotification) { if (qeth_bridgeport_an_set(card, 1)) card->options.sbp.hostnotification = 0; + } + } + + /** + * qeth_l2_detect_dev2br_support() - + * Detect whether this card supports 'dev to bridge fdb network address + * change notification' and thus can support the learning_sync bridgeport + * attribute + * @card: qeth_card structure pointer + * + * This is a destructive test and must be called before dev2br or + * bridgeport address notification is enabled! + */ + static void qeth_l2_detect_dev2br_support(struct qeth_card *card) + { + struct qeth_priv *priv = netdev_priv(card->dev); + bool dev2br_supported; + int rc; + + QETH_CARD_TEXT(card, 2, "d2brsup"); + if (!IS_IQD(card)) + return; + + /* dev2br requires valid cssid,iid,chid */ + if (!card->info.ids_valid) { + dev2br_supported = false; + } else if (css_general_characteristics.enarf) { + dev2br_supported = true; } else { - qeth_bridgeport_an_set(card, 0); + /* Old machines don't have the feature bit: + * Probe by testing whether a disable succeeds + */ + rc = qeth_l2_pnso(card, PNSO_OC_NET_ADDR_INFO, 0, NULL, NULL); + dev2br_supported = !rc; + } + QETH_CARD_TEXT_(card, 2, "D2Bsup%02x", dev2br_supported); + + if (dev2br_supported) + priv->brport_hw_features |= BR_LEARNING_SYNC; + else + priv->brport_hw_features &= ~BR_LEARNING_SYNC; + } + + static void qeth_l2_enable_brport_features(struct qeth_card *card) + { + struct qeth_priv *priv = netdev_priv(card->dev); + int rc; + + if (priv->brport_features & BR_LEARNING_SYNC) { + if (priv->brport_hw_features & BR_LEARNING_SYNC) { + qeth_l2_set_pnso_mode(card, QETH_PNSO_ADDR_INFO); + rc = qeth_l2_dev2br_an_set(card, true); + if (rc == -EAGAIN) { + /* Recoverable error, retry once */ + qeth_l2_set_pnso_mode(card, QETH_PNSO_NONE); + qeth_l2_dev2br_fdb_flush(card); + qeth_l2_set_pnso_mode(card, QETH_PNSO_ADDR_INFO); + rc = qeth_l2_dev2br_an_set(card, true); + } + if (rc) { + netdev_err(card->dev, + "failed to enable bridge learning_sync: %d\n", + rc); + qeth_l2_set_pnso_mode(card, QETH_PNSO_NONE); + qeth_l2_dev2br_fdb_flush(card); + priv->brport_features ^= BR_LEARNING_SYNC; + } + } else { + dev_warn(&card->gdev->dev, + "bridge learning_sync not supported\n"); + priv->brport_features ^= BR_LEARNING_SYNC; + } } }
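For reference, the learning_sync attribute plumbed in above is toggled from userspace through the standard bridge-port netlink path (RTM_SETLINK carrying IFLA_PROTINFO, with BRIDGE_FLAGS_SELF). With iproute2 that is typically, hedged on the exact tool version:

	bridge link set dev <qeth-interface> learning_sync on self

The 'self' keyword directs the request at the device's own ndo_bridge_setlink() rather than at a software bridge, which is why the qeth_l2_bridge_setlink() kernel-doc calls out BRIDGE_FLAGS_SELF.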
@@@ -829,6 -1154,9 +1154,9 @@@ static int qeth_l2_set_online(struct qe goto out_remove; }
+ /* query before bridgeport_notification may be enabled */ + qeth_l2_detect_dev2br_support(card); + mutex_lock(&card->sbp_lock); qeth_bridgeport_query_support(card); if (card->options.sbp.supported_funcs) { @@@ -871,6 -1199,7 +1199,7 @@@
netif_device_attach(dev); qeth_enable_hw_features(dev); + qeth_l2_enable_brport_features(card);
if (card->info.open_when_online) { card->info.open_when_online = 0; @@@ -1090,15 -1419,14 +1419,14 @@@ static void qeth_bridge_emit_host_event struct qeth_bridge_state_data { struct work_struct worker; struct qeth_card *card; - struct qeth_sbp_state_change qports; + u8 role; + u8 state; };
static void qeth_bridge_state_change_worker(struct work_struct *work) { struct qeth_bridge_state_data *data = container_of(work, struct qeth_bridge_state_data, worker); - /* We are only interested in the first entry - local port */ - struct qeth_sbp_port_entry *entry = &data->qports.entry[0]; char env_locrem[32]; char env_role[32]; char env_state[32]; @@@ -1109,22 -1437,16 +1437,16 @@@ NULL };
- /* Role should not change by itself, but if it did, */ - /* information from the hardware is authoritative. */ - mutex_lock(&data->card->sbp_lock); - data->card->options.sbp.role = entry->role; - mutex_unlock(&data->card->sbp_lock); - snprintf(env_locrem, sizeof(env_locrem), "BRIDGEPORT=statechange"); snprintf(env_role, sizeof(env_role), "ROLE=%s", - (entry->role == QETH_SBP_ROLE_NONE) ? "none" : - (entry->role == QETH_SBP_ROLE_PRIMARY) ? "primary" : - (entry->role == QETH_SBP_ROLE_SECONDARY) ? "secondary" : + (data->role == QETH_SBP_ROLE_NONE) ? "none" : + (data->role == QETH_SBP_ROLE_PRIMARY) ? "primary" : + (data->role == QETH_SBP_ROLE_SECONDARY) ? "secondary" : "<INVALID>"); snprintf(env_state, sizeof(env_state), "STATE=%s", - (entry->state == QETH_SBP_STATE_INACTIVE) ? "inactive" : - (entry->state == QETH_SBP_STATE_STANDBY) ? "standby" : - (entry->state == QETH_SBP_STATE_ACTIVE) ? "active" : + (data->state == QETH_SBP_STATE_INACTIVE) ? "inactive" : + (data->state == QETH_SBP_STATE_STANDBY) ? "standby" : + (data->state == QETH_SBP_STATE_ACTIVE) ? "active" : "<INVALID>"); kobject_uevent_env(&data->card->gdev->dev.kobj, KOBJ_CHANGE, env); @@@ -1134,10 -1456,8 +1456,8 @@@ static void qeth_bridge_state_change(struct qeth_card *card, struct qeth_ipa_cmd *cmd) { - struct qeth_sbp_state_change *qports = - &cmd->data.sbp.data.state_change; + struct qeth_sbp_port_data *qports = &cmd->data.sbp.data.port_data; struct qeth_bridge_state_data *data; - int extrasize;
QETH_CARD_TEXT(card, 2, "brstchng"); if (qports->num_entries == 0) { @@@ -1148,34 -1468,125 +1468,125 @@@ QETH_CARD_TEXT_(card, 2, "BPsz%04x", qports->entry_length); return; } - extrasize = sizeof(struct qeth_sbp_port_entry) * qports->num_entries; - data = kzalloc(sizeof(struct qeth_bridge_state_data) + extrasize, - GFP_ATOMIC); + + data = kzalloc(sizeof(*data), GFP_ATOMIC); if (!data) { QETH_CARD_TEXT(card, 2, "BPSalloc"); return; } INIT_WORK(&data->worker, qeth_bridge_state_change_worker); data->card = card; - memcpy(&data->qports, qports, - sizeof(struct qeth_sbp_state_change) + extrasize); + /* Information for the local port: */ + data->role = qports->entry[0].role; + data->state = qports->entry[0].state; + queue_work(card->event_wq, &data->worker); }
struct qeth_addr_change_data { - struct work_struct worker; + struct delayed_work dwork; struct qeth_card *card; struct qeth_ipacmd_addr_change ac_event; };
+ static void qeth_l2_dev2br_worker(struct work_struct *work) + { + struct delayed_work *dwork = to_delayed_work(work); + struct qeth_addr_change_data *data; + struct qeth_card *card; + struct qeth_priv *priv; + unsigned int i; + int rc; + + data = container_of(dwork, struct qeth_addr_change_data, dwork); + card = data->card; + priv = netdev_priv(card->dev); + + QETH_CARD_TEXT(card, 4, "dev2brew"); + + if (READ_ONCE(card->info.pnso_mode) == QETH_PNSO_NONE) + goto free; + + /* Potential re-config in progress, try again later: */ + if (!rtnl_trylock()) { + queue_delayed_work(card->event_wq, dwork, + msecs_to_jiffies(100)); + return; + } + if (!netif_device_present(card->dev)) + goto out_unlock; + + if (data->ac_event.lost_event_mask) { + QETH_DBF_MESSAGE(3, + "Address change notification overflow on device %x\n", + CARD_DEVID(card)); + /* Card fdb and bridge fdb are out of sync, card has stopped + * notifications (no need to drain_workqueue). Purge all + * 'extern_learn' entries from the parent bridge and restart + * the notifications. + */ + qeth_l2_dev2br_fdb_flush(card); + rc = qeth_l2_dev2br_an_set(card, true); + if (rc) { + /* TODO: if we want to retry after -EAGAIN, be + * aware there could be stale entries in the + * workqueue now, that need to be drained. + * For now we give up: + */ + netdev_err(card->dev, + "bridge learning_sync failed to recover: %d\n", + rc); + WRITE_ONCE(card->info.pnso_mode, + QETH_PNSO_NONE); + /* To remove fdb entries reported by an_set: */ + qeth_l2_dev2br_fdb_flush(card); + priv->brport_features ^= BR_LEARNING_SYNC; + } else { + QETH_DBF_MESSAGE(3, + "Address Notification resynced on device %x\n", + CARD_DEVID(card)); + } + } else { + for (i = 0; i < data->ac_event.num_entries; i++) { + struct qeth_ipacmd_addr_change_entry *entry = + &data->ac_event.entry[i]; + qeth_l2_dev2br_fdb_notify(card, + entry->change_code, + &entry->token, + &entry->addr_lnid); + } + } + + out_unlock: + rtnl_unlock(); + + free: + kfree(data); + } + static void qeth_addr_change_event_worker(struct work_struct *work) { - struct qeth_addr_change_data *data = - container_of(work, struct qeth_addr_change_data, worker); + struct delayed_work *dwork = to_delayed_work(work); + struct qeth_addr_change_data *data; + struct qeth_card *card; int i;
+ data = container_of(dwork, struct qeth_addr_change_data, dwork); + card = data->card; + QETH_CARD_TEXT(data->card, 4, "adrchgew"); + + if (READ_ONCE(card->info.pnso_mode) == QETH_PNSO_NONE) + goto free; + if (data->ac_event.lost_event_mask) { + /* Potential re-config in progress, try again later: */ + if (!mutex_trylock(&card->sbp_lock)) { + queue_delayed_work(card->event_wq, dwork, + msecs_to_jiffies(100)); + return; + } + dev_info(&data->card->gdev->dev, "Address change notification stopped on %s (%s)\n", data->card->dev->name, @@@ -1184,8 -1595,9 +1595,9 @@@ : (data->ac_event.lost_event_mask == 0x02) ? "Bridge port state change" : "Unknown reason"); - mutex_lock(&data->card->sbp_lock); + data->card->options.sbp.hostnotification = 0; + card->info.pnso_mode = QETH_PNSO_NONE; mutex_unlock(&data->card->sbp_lock); qeth_bridge_emit_host_event(data->card, anev_abort, 0, NULL, NULL); @@@ -1199,6 -1611,8 +1611,8 @@@ &entry->token, &entry->addr_lnid); } + + free: kfree(data); }
@@@ -1210,6 -1624,9 +1624,9 @@@ static void qeth_addr_change_event(stru struct qeth_addr_change_data *data; int extrasize;
+ if (card->info.pnso_mode == QETH_PNSO_NONE) + return; + QETH_CARD_TEXT(card, 4, "adrchgev"); if (cmd->hdr.return_code != 0x0000) { if (cmd->hdr.return_code == 0x0010) { @@@ -1229,11 -1646,14 +1646,14 @@@ QETH_CARD_TEXT(card, 2, "ACNalloc"); return; } - INIT_WORK(&data->worker, qeth_addr_change_event_worker); + if (card->info.pnso_mode == QETH_PNSO_BRIDGEPORT) + INIT_DELAYED_WORK(&data->dwork, qeth_addr_change_event_worker); + else + INIT_DELAYED_WORK(&data->dwork, qeth_l2_dev2br_worker); data->card = card; memcpy(&data->ac_event, hostevs, sizeof(struct qeth_ipacmd_addr_change) + extrasize); - queue_work(card->event_wq, &data->worker); + queue_delayed_work(card->event_wq, &data->dwork, 0); }
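Both event workers above share one locking idiom: because they run on the card's event workqueue they must not block on rtnl or the sbp_lock, so they take the lock with trylock and re-queue themselves on contention. Reduced to its skeleton (hedged sketch, names generic):

	/* Never block the event workqueue: retry in 100ms instead. */
	if (!rtnl_trylock()) {
		queue_delayed_work(wq, dwork, msecs_to_jiffies(100));
		return;
	}
	/* ... work that requires RTNL ... */
	rtnl_unlock();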
/* SETBRIDGEPORT support; sending commands */ @@@ -1418,8 -1838,8 +1838,8 @@@ static int qeth_bridgeport_query_ports_ struct qeth_reply *reply, unsigned long data) { struct qeth_ipa_cmd *cmd = (struct qeth_ipa_cmd *) data; - struct qeth_sbp_query_ports *qports = &cmd->data.sbp.data.query_ports; struct _qeth_sbp_cbctl *cbctl = (struct _qeth_sbp_cbctl *)reply->param; + struct qeth_sbp_port_data *qports; int rc;
QETH_CARD_TEXT(card, 2, "brqprtcb"); @@@ -1427,6 -1847,7 +1847,7 @@@ if (rc) return rc;
+ qports = &cmd->data.sbp.data.port_data; if (qports->entry_length != sizeof(struct qeth_sbp_port_entry)) { QETH_CARD_TEXT_(card, 2, "SBPs%04x", qports->entry_length); return -EINVAL; @@@ -1554,9 -1975,15 +1975,15 @@@ int qeth_bridgeport_an_set(struct qeth_
if (enable) { qeth_bridge_emit_host_event(card, anev_reset, 0, NULL, NULL); - rc = qeth_l2_pnso(card, 1, qeth_bridgeport_an_set_cb, card); - } else - rc = qeth_l2_pnso(card, 0, NULL, NULL); + qeth_l2_set_pnso_mode(card, QETH_PNSO_BRIDGEPORT); + rc = qeth_l2_pnso(card, PNSO_OC_NET_BRIDGE_INFO, 1, + qeth_bridgeport_an_set_cb, card); + if (rc) + qeth_l2_set_pnso_mode(card, QETH_PNSO_NONE); + } else { + rc = qeth_l2_pnso(card, PNSO_OC_NET_BRIDGE_INFO, 0, NULL, NULL); + qeth_l2_set_pnso_mode(card, QETH_PNSO_NONE); + } return rc; }
@@@ -1851,7 -2278,7 +2278,7 @@@ int qeth_l2_vnicc_get_timeout(struct qe }
/* check if VNICC is currently enabled */ - bool qeth_l2_vnicc_is_in_use(struct qeth_card *card) + static bool _qeth_l2_vnicc_is_in_use(struct qeth_card *card) { if (!card->options.vnicc.sup_chars) return false; @@@ -1866,6 -2293,21 +2293,21 @@@ return true; }
+ /** + * qeth_bridgeport_allowed - are any qeth_bridgeport functions allowed? + * @card: qeth_card structure pointer + * + * qeth_bridgeport functionality is mutually exclusive with usage of the + * VNIC Characteristics and dev2br address notifications + */ + bool qeth_bridgeport_allowed(struct qeth_card *card) + { + struct qeth_priv *priv = netdev_priv(card->dev); + + return (!_qeth_l2_vnicc_is_in_use(card) && + !(priv->brport_features & BR_LEARNING_SYNC)); + } + /* recover user timeout setting */ static bool qeth_l2_vnicc_recover_timeout(struct qeth_card *card, u32 vnicc, u32 *timeout) diff --combined drivers/s390/net/qeth_l3_main.c index 09ef518ca1ea,767c5bb7c24c..33fdad1a6887 --- a/drivers/s390/net/qeth_l3_main.c +++ b/drivers/s390/net/qeth_l3_main.c @@@ -314,7 -314,8 +314,8 @@@ static int qeth_l3_setdelip_cb(struct q }
static int qeth_l3_send_setdelmc(struct qeth_card *card, - struct qeth_ipaddr *addr, int ipacmd) + struct qeth_ipaddr *addr, + enum qeth_ipa_cmds ipacmd) { struct qeth_cmd_buffer *iob; struct qeth_ipa_cmd *cmd; @@@ -1168,11 -1169,11 +1169,11 @@@ static void qeth_l3_stop_card(struct qe if (card->state == CARD_STATE_SOFTSETUP) { qeth_l3_clear_ip_htable(card, 1); qeth_clear_ipacmd_list(card); - qeth_drain_output_queues(card); card->state = CARD_STATE_DOWN; }
qeth_qdio_clear_card(card, 0); + qeth_drain_output_queues(card); qeth_clear_working_pool_list(card); flush_workqueue(card->event_wq); qeth_flush_local_addrs(card); diff --combined fs/io_uring.c index 3790c7fe9fee,522b891dd187..005e8ccc4f85 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@@ -2980,15 -2980,14 +2980,15 @@@ static inline int io_rw_prep_async(stru bool force_nonblock) { struct io_async_rw *iorw = &req->io->rw; + struct iovec *iov; ssize_t ret;
- iorw->iter.iov = iorw->fast_iov; - ret = __io_import_iovec(rw, req, (struct iovec **) &iorw->iter.iov, - &iorw->iter, !force_nonblock); + iorw->iter.iov = iov = iorw->fast_iov; + ret = __io_import_iovec(rw, req, &iov, &iorw->iter, !force_nonblock); if (unlikely(ret < 0)) return ret;
+ iorw->iter.iov = iov; io_req_map_rw(req, iorw->iter.iov, iorw->fast_iov, &iorw->iter); return 0; } @@@ -4928,6 -4927,12 +4928,12 @@@ static bool io_arm_poll_handler(struct mask |= POLLIN | POLLRDNORM; if (def->pollout) mask |= POLLOUT | POLLWRNORM; + + /* If reading from MSG_ERRQUEUE using recvmsg, ignore POLLIN */ + if ((req->opcode == IORING_OP_RECVMSG) && + (req->sr_msg.msg_flags & MSG_ERRQUEUE)) + mask &= ~POLLIN; + mask |= POLLERR | POLLPRI;
ipt.pt._qproc = io_async_queue_proc; @@@ -8024,28 -8029,6 +8030,28 @@@ static bool io_match_link(struct io_kio return false; }
+static inline bool io_match_files(struct io_kiocb *req, + struct files_struct *files) +{ + return (req->flags & REQ_F_WORK_INITIALIZED) && req->work.files == files; +} + +static bool io_match_link_files(struct io_kiocb *req, + struct files_struct *files) +{ + struct io_kiocb *link; + + if (io_match_files(req, files)) + return true; + if (req->flags & REQ_F_LINK_HEAD) { + list_for_each_entry(link, &req->link_list, link_list) { + if (io_match_files(link, files)) + return true; + } + } + return false; +} + /* * We're looking to cancel 'req' because it's holding on to our files, but * 'req' could be a link to another request. See if it is, and cancel that @@@ -8120,38 -8103,12 +8126,38 @@@ static void io_attempt_cancel(struct io io_timeout_remove_link(ctx, req); }
+static void io_cancel_defer_files(struct io_ring_ctx *ctx, + struct files_struct *files) +{ + struct io_defer_entry *de = NULL; + LIST_HEAD(list); + + spin_lock_irq(&ctx->completion_lock); + list_for_each_entry_reverse(de, &ctx->defer_list, list) { + if (io_match_link_files(de->req, files)) { + list_cut_position(&list, &ctx->defer_list, &de->list); + break; + } + } + spin_unlock_irq(&ctx->completion_lock); + + while (!list_empty(&list)) { + de = list_first_entry(&list, struct io_defer_entry, list); + list_del_init(&de->list); + req_set_fail_links(de->req); + io_put_req(de->req); + io_req_complete(de->req, -ECANCELED); + kfree(de); + } +} + static void io_uring_cancel_files(struct io_ring_ctx *ctx, struct files_struct *files) { if (list_empty_careful(&ctx->inflight_list)) return;
+ io_cancel_defer_files(ctx, files); /* cancel all at once, should be faster than doing it one by one */ io_wq_cancel_cb(ctx->io_wq, io_wq_files_match, files, true);
diff --combined include/linux/bpf-cgroup.h index 82b26a1386d8,2f98d2fce62e..ed71bd1a0825 --- a/include/linux/bpf-cgroup.h +++ b/include/linux/bpf-cgroup.h @@@ -136,7 -136,7 +136,7 @@@ int __cgroup_bpf_check_dev_permission(s
int __cgroup_bpf_run_filter_sysctl(struct ctl_table_header *head, struct ctl_table *table, int write, - void **buf, size_t *pcount, loff_t *ppos, + char **buf, size_t *pcount, loff_t *ppos, enum bpf_attach_type type);
int __cgroup_bpf_run_filter_setsockopt(struct sock *sock, int *level, @@@ -279,6 -279,31 +279,31 @@@ int bpf_percpu_cgroup_storage_update(st #define BPF_CGROUP_RUN_PROG_UDP6_RECVMSG_LOCK(sk, uaddr) \ BPF_CGROUP_RUN_SA_PROG_LOCK(sk, uaddr, BPF_CGROUP_UDP6_RECVMSG, NULL)
+ /* The SOCK_OPS"_SK" macro should be used when sock_ops->sk is not a + * fullsock and its parent fullsock cannot be traced by + * sk_to_full_sk(). + * + * e.g. sock_ops->sk is a request_sock and it is under syncookie mode. + * Its listener-sk is not attached to the rsk_listener. + * In this case, the caller holds the listener-sk (unlocked), + * set its sock_ops->sk to req_sk, and call this SOCK_OPS"_SK" with + * the listener-sk such that the cgroup-bpf-progs of the + * listener-sk will be run. + * + * Regardless of syncookie mode or not, + * calling bpf_setsockopt on listener-sk will not make sense anyway, + * so passing 'sock_ops->sk == req_sk' to the bpf prog is appropriate here. + */ + #define BPF_CGROUP_RUN_PROG_SOCK_OPS_SK(sock_ops, sk) \ + ({ \ + int __ret = 0; \ + if (cgroup_bpf_enabled) \ + __ret = __cgroup_bpf_run_filter_sock_ops(sk, \ + sock_ops, \ + BPF_CGROUP_SOCK_OPS); \ + __ret; \ + }) + #define BPF_CGROUP_RUN_PROG_SOCK_OPS(sock_ops) \ ({ \ int __ret = 0; \ diff --combined include/linux/netdevice.h index 7bd4fcdd0738,fef0eb96cf69..a431c3229cbf --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@@ -70,6 -70,7 +70,7 @@@ struct udp_tunnel_nic struct bpf_prog; struct xdp_buff;
+ void synchronize_net(void); void netdev_set_default_ethtool_ops(struct net_device *dev, const struct ethtool_ops *ops);
@@@ -354,7 -355,7 +355,7 @@@ enum NAPI_STATE_MISSED, /* reschedule a napi */ NAPI_STATE_DISABLE, /* Disable pending */ NAPI_STATE_NPSVC, /* Netpoll - don't dequeue from poll_list */ - NAPI_STATE_HASHED, /* In NAPI hash (busy polling possible) */ + NAPI_STATE_LISTED, /* NAPI added to system lists */ NAPI_STATE_NO_BUSY_POLL,/* Do not add in napi_hash, no busy polling */ NAPI_STATE_IN_BUSY_POLL,/* sk_busy_loop() owns this NAPI */ }; @@@ -364,7 -365,7 +365,7 @@@ enum NAPIF_STATE_MISSED = BIT(NAPI_STATE_MISSED), NAPIF_STATE_DISABLE = BIT(NAPI_STATE_DISABLE), NAPIF_STATE_NPSVC = BIT(NAPI_STATE_NPSVC), - NAPIF_STATE_HASHED = BIT(NAPI_STATE_HASHED), + NAPIF_STATE_LISTED = BIT(NAPI_STATE_LISTED), NAPIF_STATE_NO_BUSY_POLL = BIT(NAPI_STATE_NO_BUSY_POLL), NAPIF_STATE_IN_BUSY_POLL = BIT(NAPI_STATE_IN_BUSY_POLL), }; @@@ -488,20 -489,6 +489,6 @@@ static inline bool napi_complete(struc return napi_complete_done(n, 0); }
- /** - * napi_hash_del - remove a NAPI from global table - * @napi: NAPI context - * - * Warning: caller must observe RCU grace period - * before freeing memory containing @napi, if - * this function returns true. - * Note: core networking stack automatically calls it - * from netif_napi_del(). - * Drivers might want to call this helper to combine all - * the needed RCU grace periods into a single one. - */ - bool napi_hash_del(struct napi_struct *napi); - /** * napi_disable - prevent NAPI from scheduling * @n: NAPI context @@@ -618,7 -605,7 +605,7 @@@ struct netdev_queue /* Subordinate device that the queue has been assigned to */ struct net_device *sb_dev; #ifdef CONFIG_XDP_SOCKETS - struct xdp_umem *umem; + struct xsk_buff_pool *pool; #endif /* * write-mostly part @@@ -640,11 -627,16 +627,16 @@@ extern int sysctl_fb_tunnels_only_for_init_net; extern int sysctl_devconf_inherit_init_net;
+ /* + * sysctl_fb_tunnels_only_for_init_net == 0 : For all netns + * == 1 : For initns only + * == 2 : For none. + */ static inline bool net_has_fallback_tunnels(const struct net *net) { - return net == &init_net || - !IS_ENABLED(CONFIG_SYSCTL) || - !sysctl_fb_tunnels_only_for_init_net; + return !IS_ENABLED(CONFIG_SYSCTL) || + !sysctl_fb_tunnels_only_for_init_net || + (net == &init_net && sysctl_fb_tunnels_only_for_init_net == 1); }
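A quick usage note for the tri-state above: it is exposed as the net.core.fb_tunnels_only_for_init_net sysctl, so an administrator who wants no fallback tunnel devices created in any netns would, assuming CONFIG_SYSCTL is enabled, run:

	sysctl -w net.core.fb_tunnels_only_for_init_net=2

Without CONFIG_SYSCTL the helper compiles to always-true and fallback tunnels are created in every netns.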
static inline int netdev_queue_numa_node_read(const struct netdev_queue *q) @@@ -751,7 -743,7 +743,7 @@@ struct netdev_rx_queue struct net_device *dev; struct xdp_rxq_info xdp_rxq; #ifdef CONFIG_XDP_SOCKETS - struct xdp_umem *umem; + struct xsk_buff_pool *pool; #endif } ____cacheline_aligned_in_smp;
@@@ -879,7 -871,7 +871,7 @@@ enum bpf_netdev_command /* BPF program for offload callbacks, invoked at program load time. */ BPF_OFFLOAD_MAP_ALLOC, BPF_OFFLOAD_MAP_FREE, - XDP_SETUP_XSK_UMEM, + XDP_SETUP_XSK_POOL, };
struct bpf_prog_offload_ops; @@@ -913,9 -905,9 +905,9 @@@ struct netdev_bpf struct { struct bpf_offloaded_map *offmap; }; - /* XDP_SETUP_XSK_UMEM */ + /* XDP_SETUP_XSK_POOL */ struct { - struct xdp_umem *umem; + struct xsk_buff_pool *pool; u16 queue_id; } xsk; }; @@@ -1784,7 -1776,6 +1776,7 @@@ enum netdev_priv_flags * the watchdog (see dev_watchdog()) * @watchdog_timer: List of timers * + * @proto_down_reason: reason a netdev interface is held down * @pcpu_refcnt: Number of references to this device * @todo_list: Delayed register/unregister * @link_watch_list: XXX: need comments on this one @@@ -1849,7 -1840,6 +1841,7 @@@ * @udp_tunnel_nic_info: static structure describing the UDP tunnel * offload capabilities of the device * @udp_tunnel_nic: UDP tunnel offload state + * @xdp_state: stores info on attached XDP BPF programs * * FIXME: cleanup struct net_device such that network protocol info * moves out. @@@ -2195,6 -2185,22 +2187,22 @@@ int netdev_get_num_tc(struct net_devic return dev->num_tc; }
+ static inline void net_prefetch(void *p) + { + prefetch(p); + #if L1_CACHE_BYTES < 128 + prefetch((u8 *)p + L1_CACHE_BYTES); + #endif + } + + static inline void net_prefetchw(void *p) + { + prefetchw(p); + #if L1_CACHE_BYTES < 128 + prefetchw((u8 *)p + L1_CACHE_BYTES); + #endif + } + void netdev_unbind_sb_channel(struct net_device *dev, struct net_device *sb_dev); int netdev_bind_sb_channel_queue(struct net_device *dev, @@@ -2349,13 -2355,27 +2357,27 @@@ static inline void netif_tx_napi_add(st netif_napi_add(dev, napi, poll, weight); }
+ /** + * __netif_napi_del - remove a NAPI context + * @napi: NAPI context + * + * Warning: caller must observe RCU grace period before freeing memory + * containing @napi. Drivers might want to call this helper to combine + * all the needed RCU grace periods into a single one. + */ + void __netif_napi_del(struct napi_struct *napi); + /** * netif_napi_del - remove a NAPI context * @napi: NAPI context * * netif_napi_del() removes a NAPI context from the network device NAPI list */ - void netif_napi_del(struct napi_struct *napi); + static inline void netif_napi_del(struct napi_struct *napi) + { + __netif_napi_del(napi); + synchronize_net(); + }
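The reason for splitting __netif_napi_del() out of netif_napi_del() is visible in the kernel-doc above: a driver tearing down many queues can batch the RCU grace periods. A hedged sketch of that pattern, with hypothetical ring structures that embed a napi_struct:

	/* Hypothetical teardown: remove every NAPI first, then pay for
	 * a single RCU grace period before freeing the embedding memory.
	 */
	for (i = 0; i < nr_rings; i++)
		__netif_napi_del(&rings[i]->napi);

	synchronize_net();	/* one grace period covers all rings */

	for (i = 0; i < nr_rings; i++)
		kfree(rings[i]);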
struct napi_gro_cb { /* Virtual address of skb_shinfo(skb)->frags[0].page + offset. */ @@@ -2779,7 -2799,6 +2801,6 @@@ static inline void unregister_netdevice int netdev_refcnt_read(const struct net_device *dev); void free_netdev(struct net_device *dev); void netdev_freemem(struct net_device *dev); - void synchronize_net(void); int init_dummy_netdev(struct net_device *dev);
struct net_device *netdev_get_xmit_slave(struct net_device *dev, @@@ -4659,16 -4678,6 +4680,6 @@@ int netdev_class_create_file_ns(const s void netdev_class_remove_file_ns(const struct class_attribute *class_attr, const void *ns);
- static inline int netdev_class_create_file(const struct class_attribute *class_attr) - { - return netdev_class_create_file_ns(class_attr, NULL); - } - - static inline void netdev_class_remove_file(const struct class_attribute *class_attr) - { - netdev_class_remove_file_ns(class_attr, NULL); - } - extern const struct kobj_ns_type_operations net_ns_type_operations;
const char *netdev_drivername(const struct net_device *dev); diff --combined include/linux/qed/qed_if.h index cdd73afc4c46,56fa55841d39..57fb295ea41a --- a/include/linux/qed/qed_if.h +++ b/include/linux/qed/qed_if.h @@@ -21,6 -21,7 +21,7 @@@ #include <linux/qed/common_hsi.h> #include <linux/qed/qed_chain.h> #include <linux/io-64-nonatomic-lo-hi.h> + #include <net/devlink.h>
enum dcbx_protocol_type { DCBX_PROTOCOL_ISCSI, @@@ -623,7 -624,6 +624,7 @@@ struct qed_dev_info #define QED_MFW_VERSION_3_OFFSET 24
u32 flash_size; + bool b_arfs_capable; bool b_inter_pf_switch; bool tx_switching; bool rdma_supported; @@@ -780,6 -780,11 +781,11 @@@ enum qed_nvm_flash_cmd QED_NVM_FLASH_CMD_NVM_MAX, };
+ struct qed_devlink { + struct qed_dev *cdev; + struct devlink_health_reporter *fw_reporter; + }; + struct qed_common_cb_ops { void (*arfs_filter_op)(void *dev, void *fltr, u8 fw_rc); void (*link_update)(void *dev, struct qed_link_output *link); @@@ -845,10 -850,9 +851,9 @@@ struct qed_common_ops struct qed_dev* (*probe)(struct pci_dev *dev, struct qed_probe_params *params);
- void (*remove)(struct qed_dev *cdev); + void (*remove)(struct qed_dev *cdev);
- int (*set_power_state)(struct qed_dev *cdev, - pci_power_t state); + int (*set_power_state)(struct qed_dev *cdev, pci_power_t state);
void (*set_name) (struct qed_dev *cdev, char name[]);
@@@ -856,50 -860,51 +861,51 @@@ * PF params required for the call before slowpath_start is * documented within the qed_pf_params structure definition. */ - void (*update_pf_params)(struct qed_dev *cdev, - struct qed_pf_params *params); - int (*slowpath_start)(struct qed_dev *cdev, - struct qed_slowpath_params *params); + void (*update_pf_params)(struct qed_dev *cdev, + struct qed_pf_params *params);
- int (*slowpath_stop)(struct qed_dev *cdev); + int (*slowpath_start)(struct qed_dev *cdev, + struct qed_slowpath_params *params); + + int (*slowpath_stop)(struct qed_dev *cdev);
/* Requests to use `cnt' interrupts for fastpath. * upon success, returns number of interrupts allocated for fastpath. */ - int (*set_fp_int)(struct qed_dev *cdev, - u16 cnt); + int (*set_fp_int)(struct qed_dev *cdev, u16 cnt);
/* Fills `info' with pointers required for utilizing interrupts */ - int (*get_fp_int)(struct qed_dev *cdev, - struct qed_int_info *info); - - u32 (*sb_init)(struct qed_dev *cdev, - struct qed_sb_info *sb_info, - void *sb_virt_addr, - dma_addr_t sb_phy_addr, - u16 sb_id, - enum qed_sb_type type); - - u32 (*sb_release)(struct qed_dev *cdev, - struct qed_sb_info *sb_info, - u16 sb_id, - enum qed_sb_type type); - - void (*simd_handler_config)(struct qed_dev *cdev, - void *token, - int index, - void (*handler)(void *)); - - void (*simd_handler_clean)(struct qed_dev *cdev, - int index); - int (*dbg_grc)(struct qed_dev *cdev, - void *buffer, u32 *num_dumped_bytes); + int (*get_fp_int)(struct qed_dev *cdev, struct qed_int_info *info); + + u32 (*sb_init)(struct qed_dev *cdev, + struct qed_sb_info *sb_info, + void *sb_virt_addr, + dma_addr_t sb_phy_addr, + u16 sb_id, + enum qed_sb_type type); + + u32 (*sb_release)(struct qed_dev *cdev, + struct qed_sb_info *sb_info, + u16 sb_id, + enum qed_sb_type type); + + void (*simd_handler_config)(struct qed_dev *cdev, + void *token, + int index, + void (*handler)(void *)); + + void (*simd_handler_clean)(struct qed_dev *cdev, int index); + + int (*dbg_grc)(struct qed_dev *cdev, void *buffer, u32 *num_dumped_bytes);
int (*dbg_grc_size)(struct qed_dev *cdev);
- int (*dbg_all_data) (struct qed_dev *cdev, void *buffer); + int (*dbg_all_data)(struct qed_dev *cdev, void *buffer);
- int (*dbg_all_data_size) (struct qed_dev *cdev); + int (*dbg_all_data_size)(struct qed_dev *cdev); + + int (*report_fatal_error)(struct devlink *devlink, + enum qed_hw_err_type err_type);
/** * @brief can_link_change - can the instance change the link or not @@@ -1138,6 -1143,10 +1144,10 @@@ * */ int (*set_grc_config)(struct qed_dev *cdev, u32 cfg_id, u32 val); + + struct devlink* (*devlink_register)(struct qed_dev *cdev); + + void (*devlink_unregister)(struct devlink *devlink); };
#define MASK_FIELD(_name, _value) \ diff --combined include/net/netlink.h index 8e0eb2c9c528,fdd317f8fde4..b2cf34f53e55 --- a/include/net/netlink.h +++ b/include/net/netlink.h @@@ -181,8 -181,6 +181,6 @@@ enum NLA_S64, NLA_BITFIELD32, NLA_REJECT, - NLA_EXACT_LEN, - NLA_MIN_LEN, __NLA_TYPE_MAX, };
@@@ -199,11 -197,11 +197,11 @@@ struct netlink_range_validation_signed enum nla_policy_validation { NLA_VALIDATE_NONE, NLA_VALIDATE_RANGE, + NLA_VALIDATE_RANGE_WARN_TOO_LONG, NLA_VALIDATE_MIN, NLA_VALIDATE_MAX, NLA_VALIDATE_RANGE_PTR, NLA_VALIDATE_FUNCTION, - NLA_VALIDATE_WARN_TOO_LONG, };
/** @@@ -222,7 -220,7 +220,7 @@@ * NLA_NUL_STRING Maximum length of string (excluding NUL) * NLA_FLAG Unused * NLA_BINARY Maximum length of attribute payload - * NLA_MIN_LEN Minimum length of attribute payload + * (but see also below with the validation type) * NLA_NESTED, * NLA_NESTED_ARRAY Length verification is done by checking len of * nested header (or empty); len field is used if @@@ -237,11 -235,6 +235,6 @@@ * just like "All other" * NLA_BITFIELD32 Unused * NLA_REJECT Unused - * NLA_EXACT_LEN Attribute should have exactly this length, otherwise - * it is rejected or warned about, the latter happening - * if and only if the `validation_type' is set to - * NLA_VALIDATE_WARN_TOO_LONG. - * NLA_MIN_LEN Minimum length of attribute payload * All other Minimum length of attribute payload * * Meaning of validation union: @@@ -296,6 -289,11 +289,11 @@@ * pointer to a struct netlink_range_validation_signed * that indicates the min/max values. * Use NLA_POLICY_FULL_RANGE_SIGNED(). + * + * NLA_BINARY If the validation type is like the ones for integers + * above, then the min/max length (not value like for + * integers) of the attribute is enforced. + * * All other Unused - but note that it's a union * * Meaning of `validate' field, use via NLA_POLICY_VALIDATE_FN: @@@ -309,7 -307,7 +307,7 @@@ * static const struct nla_policy my_policy[ATTR_MAX+1] = { * [ATTR_FOO] = { .type = NLA_U16 }, * [ATTR_BAR] = { .type = NLA_STRING, .len = BARSIZ }, - * [ATTR_BAZ] = { .type = NLA_EXACT_LEN, .len = sizeof(struct mystruct) }, + * [ATTR_BAZ] = NLA_POLICY_EXACT_LEN(sizeof(struct mystruct)), * [ATTR_GOO] = NLA_POLICY_BITFIELD32(myvalidflags), * }; */ @@@ -335,9 -333,10 +333,10 @@@ struct nla_policy * nesting validation starts here. * * Additionally, it means that NLA_UNSPEC is actually NLA_REJECT - * for any types >= this, so need to use NLA_MIN_LEN to get the - * previous pure { .len = xyz } behaviour. The advantage of this - * is that types not specified in the policy will be rejected. + * for any types >= this, so need to use NLA_POLICY_MIN_LEN() to + * get the previous pure { .len = xyz } behaviour. The advantage + * of this is that types not specified in the policy will be + * rejected. * * For completely new families it should be set to 1 so that the * validation is enforced for all attributes. For existing ones @@@ -349,12 -348,6 +348,6 @@@ }; };
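Since length validation for NLA_BINARY now rides on the generic range machinery, a policy can express value ranges and length ranges with the same macros (the new NLA_POLICY_EXACT_LEN/NLA_POLICY_MIN_LEN definitions appear further down in this hunk). A hedged example with made-up attribute names:

	/* Hypothetical policy: MAC must be exactly ETH_ALEN bytes, KEY
	 * between 16 and 64 bytes, MTU a u32 constrained by value.
	 */
	static const struct nla_policy demo_policy[DEMO_ATTR_MAX + 1] = {
		[DEMO_ATTR_MAC] = NLA_POLICY_EXACT_LEN(ETH_ALEN),
		[DEMO_ATTR_KEY] = NLA_POLICY_RANGE(NLA_BINARY, 16, 64),
		[DEMO_ATTR_MTU] = NLA_POLICY_RANGE(NLA_U32, 68, 9000),
	};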
- #define NLA_POLICY_EXACT_LEN(_len) { .type = NLA_EXACT_LEN, .len = _len } - #define NLA_POLICY_EXACT_LEN_WARN(_len) \ - { .type = NLA_EXACT_LEN, .len = _len, \ - .validation_type = NLA_VALIDATE_WARN_TOO_LONG, } - #define NLA_POLICY_MIN_LEN(_len) { .type = NLA_MIN_LEN, .len = _len } - #define NLA_POLICY_ETH_ADDR NLA_POLICY_EXACT_LEN(ETH_ALEN) #define NLA_POLICY_ETH_ADDR_COMPAT NLA_POLICY_EXACT_LEN_WARN(ETH_ALEN)
@@@ -370,19 -363,21 +363,21 @@@ { .type = NLA_BITFIELD32, .bitfield32_valid = valid }
#define __NLA_ENSURE(condition) BUILD_BUG_ON_ZERO(!(condition)) - #define NLA_ENSURE_UINT_TYPE(tp) \ + #define NLA_ENSURE_UINT_OR_BINARY_TYPE(tp) \ (__NLA_ENSURE(tp == NLA_U8 || tp == NLA_U16 || \ tp == NLA_U32 || tp == NLA_U64 || \ - tp == NLA_MSECS) + tp) + tp == NLA_MSECS || \ + tp == NLA_BINARY) + tp) #define NLA_ENSURE_SINT_TYPE(tp) \ (__NLA_ENSURE(tp == NLA_S8 || tp == NLA_S16 || \ tp == NLA_S32 || tp == NLA_S64) + tp) - #define NLA_ENSURE_INT_TYPE(tp) \ + #define NLA_ENSURE_INT_OR_BINARY_TYPE(tp) \ (__NLA_ENSURE(tp == NLA_S8 || tp == NLA_U8 || \ tp == NLA_S16 || tp == NLA_U16 || \ tp == NLA_S32 || tp == NLA_U32 || \ tp == NLA_S64 || tp == NLA_U64 || \ - tp == NLA_MSECS) + tp) + tp == NLA_MSECS || \ + tp == NLA_BINARY) + tp) #define NLA_ENSURE_NO_VALIDATION_PTR(tp) \ (__NLA_ENSURE(tp != NLA_BITFIELD32 && \ tp != NLA_REJECT && \ @@@ -390,14 -385,14 +385,14 @@@ tp != NLA_NESTED_ARRAY) + tp)
#define NLA_POLICY_RANGE(tp, _min, _max) { \ - .type = NLA_ENSURE_INT_TYPE(tp), \ + .type = NLA_ENSURE_INT_OR_BINARY_TYPE(tp), \ .validation_type = NLA_VALIDATE_RANGE, \ .min = _min, \ .max = _max \ }
#define NLA_POLICY_FULL_RANGE(tp, _range) { \ - .type = NLA_ENSURE_UINT_TYPE(tp), \ + .type = NLA_ENSURE_UINT_OR_BINARY_TYPE(tp), \ .validation_type = NLA_VALIDATE_RANGE_PTR, \ .range = _range, \ } @@@ -409,13 -404,13 +404,13 @@@ }
#define NLA_POLICY_MIN(tp, _min) { \ - .type = NLA_ENSURE_INT_TYPE(tp), \ + .type = NLA_ENSURE_INT_OR_BINARY_TYPE(tp), \ .validation_type = NLA_VALIDATE_MIN, \ .min = _min, \ }
#define NLA_POLICY_MAX(tp, _max) { \ - .type = NLA_ENSURE_INT_TYPE(tp), \ + .type = NLA_ENSURE_INT_OR_BINARY_TYPE(tp), \ .validation_type = NLA_VALIDATE_MAX, \ .max = _max, \ } @@@ -427,6 -422,15 +422,15 @@@ .len = __VA_ARGS__ + 0, \ }
+ #define NLA_POLICY_EXACT_LEN(_len) NLA_POLICY_RANGE(NLA_BINARY, _len, _len) + #define NLA_POLICY_EXACT_LEN_WARN(_len) { \ + .type = NLA_BINARY, \ + .validation_type = NLA_VALIDATE_RANGE_WARN_TOO_LONG, \ + .min = _len, \ + .max = _len \ + } + #define NLA_POLICY_MIN_LEN(_len) NLA_POLICY_MIN(NLA_BINARY, _len) + /** * struct nl_info - netlink source information * @nlh: Netlink message header of original request @@@ -726,6 -730,7 +730,6 @@@ static inline int __nlmsg_parse(const s * @hdrlen: length of family specific header * @tb: destination array with maxtype+1 elements * @maxtype: maximum attribute type to be expected - * @validate: validation strictness * @extack: extended ACK report struct * * See nla_parse() @@@ -823,6 -828,7 +827,6 @@@ static inline int nla_validate_deprecat * @len: length of attribute stream * @maxtype: maximum attribute type to be expected * @policy: validation policy - * @validate: validation strictness * @extack: extended ACK report struct * * Validates all attributes in the specified attribute stream against the diff --combined include/uapi/linux/ethtool_netlink.h index 72ba36be9655,9cee6df01a10..e2bf36e6964b --- a/include/uapi/linux/ethtool_netlink.h +++ b/include/uapi/linux/ethtool_netlink.h @@@ -79,7 -79,6 +79,7 @@@ enum ETHTOOL_MSG_TSINFO_GET_REPLY, ETHTOOL_MSG_CABLE_TEST_NTF, ETHTOOL_MSG_CABLE_TEST_TDR_NTF, + ETHTOOL_MSG_TUNNEL_INFO_GET_REPLY,
/* add new constants above here */ __ETHTOOL_MSG_KERNEL_CNT, @@@ -92,9 -91,12 +92,12 @@@ #define ETHTOOL_FLAG_COMPACT_BITSETS (1 << 0) /* provide optional reply for SET or ACT requests */ #define ETHTOOL_FLAG_OMIT_REPLY (1 << 1) + /* request statistics, if supported by the driver */ + #define ETHTOOL_FLAG_STATS (1 << 2)
#define ETHTOOL_FLAG_ALL (ETHTOOL_FLAG_COMPACT_BITSETS | \ - ETHTOOL_FLAG_OMIT_REPLY) + ETHTOOL_FLAG_OMIT_REPLY | \ + ETHTOOL_FLAG_STATS)
enum { ETHTOOL_A_HEADER_UNSPEC, @@@ -377,12 -379,25 +380,25 @@@ enum ETHTOOL_A_PAUSE_AUTONEG, /* u8 */ ETHTOOL_A_PAUSE_RX, /* u8 */ ETHTOOL_A_PAUSE_TX, /* u8 */ + ETHTOOL_A_PAUSE_STATS, /* nest - _PAUSE_STAT_* */
/* add new constants above here */ __ETHTOOL_A_PAUSE_CNT, ETHTOOL_A_PAUSE_MAX = (__ETHTOOL_A_PAUSE_CNT - 1) };
+ enum { + ETHTOOL_A_PAUSE_STAT_UNSPEC, + ETHTOOL_A_PAUSE_STAT_PAD, + + ETHTOOL_A_PAUSE_STAT_TX_FRAMES, + ETHTOOL_A_PAUSE_STAT_RX_FRAMES, + + /* add new constants above here */ + __ETHTOOL_A_PAUSE_STAT_CNT, + ETHTOOL_A_PAUSE_STAT_MAX = (__ETHTOOL_A_PAUSE_STAT_CNT - 1) + }; + /* EEE */
enum { diff --combined init/Kconfig index 2a5df1cf838c,6ecc00e130ff..91456ac0ef20 --- a/init/Kconfig +++ b/init/Kconfig @@@ -682,8 -682,7 +682,8 @@@ config IKHEADER
config LOG_BUF_SHIFT int "Kernel log buffer size (16 => 64KB, 17 => 128KB)" - range 12 25 + range 12 25 if !H8300 + range 12 19 if H8300 default 17 depends on PRINTK help @@@ -1692,6 -1691,7 +1692,7 @@@ config BPF_SYSCAL bool "Enable bpf() system call" select BPF select IRQ_WORK + select TASKS_TRACE_RCU default n help Enable the bpf() system call that allows to manipulate eBPF @@@ -1711,6 -1711,8 +1712,8 @@@ config BPF_JIT_DEFAULT_O def_bool ARCH_WANT_DEFAULT_BPF_JIT || BPF_JIT_ALWAYS_ON depends on HAVE_EBPF_JIT && BPF_JIT
+ source "kernel/bpf/preload/Kconfig" + config USERFAULTFD bool "Enable userfaultfd() system call" depends on MMU diff --combined kernel/bpf/hashtab.c index 7df28a45c66b,fe0e06284d33..3395cf140d22 --- a/kernel/bpf/hashtab.c +++ b/kernel/bpf/hashtab.c @@@ -9,6 -9,7 +9,7 @@@ #include <linux/rculist_nulls.h> #include <linux/random.h> #include <uapi/linux/btf.h> + #include <linux/rcupdate_trace.h> #include "percpu_freelist.h" #include "bpf_lru_list.h" #include "map_in_map.h" @@@ -577,8 -578,7 +578,7 @@@ static void *__htab_map_lookup_elem(str struct htab_elem *l; u32 hash, key_size;
- /* Must be called with rcu_read_lock. */ - WARN_ON_ONCE(!rcu_read_lock_held()); + WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_trace_held());
key_size = map->key_size;
@@@ -941,7 -941,7 +941,7 @@@ static int htab_map_update_elem(struct /* unknown flags */ return -EINVAL;
- WARN_ON_ONCE(!rcu_read_lock_held()); + WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_trace_held());
key_size = map->key_size;
@@@ -1032,7 -1032,7 +1032,7 @@@ static int htab_lru_map_update_elem(str /* unknown flags */ return -EINVAL;
- WARN_ON_ONCE(!rcu_read_lock_held()); + WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_trace_held());
key_size = map->key_size;
@@@ -1220,7 -1220,7 +1220,7 @@@ static int htab_map_delete_elem(struct u32 hash, key_size; int ret = -ENOENT;
- WARN_ON_ONCE(!rcu_read_lock_held()); + WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_trace_held());
key_size = map->key_size;
@@@ -1252,7 -1252,7 +1252,7 @@@ static int htab_lru_map_delete_elem(str u32 hash, key_size; int ret = -ENOENT;
- WARN_ON_ONCE(!rcu_read_lock_held()); + WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_trace_held());
key_size = map->key_size;
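The relaxed assertions above reflect that sleepable BPF programs hold the tasks-trace RCU read lock (hence the new rcupdate_trace.h include) rather than classic RCU when they call into these map operations. Schematically (illustrative only):

    rcu_read_lock_trace();
    /* a sleepable BPF program runs here; its htab lookup/update/delete
     * helpers reach the WARN_ON_ONCE() checks above under trace RCU,
     * not rcu_read_lock(), so both flavors must be accepted.
     */
    rcu_read_unlock_trace();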
@@@ -1622,6 -1622,7 +1622,6 @@@ struct bpf_iter_seq_hash_map_info struct bpf_map *map; struct bpf_htab *htab; void *percpu_value_buf; // non-zero means percpu hash - unsigned long flags; u32 bucket_id; u32 skip_elems; }; @@@ -1631,6 -1632,7 +1631,6 @@@ bpf_hash_map_seq_find_next(struct bpf_i struct htab_elem *prev_elem) { const struct bpf_htab *htab = info->htab; - unsigned long flags = info->flags; u32 skip_elems = info->skip_elems; u32 bucket_id = info->bucket_id; struct hlist_nulls_head *head; @@@ -1654,18 -1656,19 +1654,18 @@@
/* not found, unlock and go to the next bucket */ b = &htab->buckets[bucket_id++]; - htab_unlock_bucket(htab, b, flags); + rcu_read_unlock(); skip_elems = 0; }
for (i = bucket_id; i < htab->n_buckets; i++) { b = &htab->buckets[i]; - flags = htab_lock_bucket(htab, b); + rcu_read_lock();
count = 0; head = &b->head; hlist_nulls_for_each_entry_rcu(elem, n, head, hash_node) { if (count >= skip_elems) { - info->flags = flags; info->bucket_id = i; info->skip_elems = count; return elem; @@@ -1673,7 -1676,7 +1673,7 @@@ count++; }
- htab_unlock_bucket(htab, b, flags); + rcu_read_unlock(); skip_elems = 0; }
@@@ -1751,10 -1754,14 +1751,10 @@@ static int bpf_hash_map_seq_show(struc
static void bpf_hash_map_seq_stop(struct seq_file *seq, void *v) { - struct bpf_iter_seq_hash_map_info *info = seq->private; - if (!v) (void)__bpf_hash_map_seq_show(seq, NULL); else - htab_unlock_bucket(info->htab, - &info->htab->buckets[info->bucket_id], - info->flags); + rcu_read_unlock(); }
static int bpf_iter_init_hash_map(void *priv_data, @@@ -1803,6 -1810,7 +1803,7 @@@ static const struct bpf_iter_seq_info i
static int htab_map_btf_id; const struct bpf_map_ops htab_map_ops = { + .map_meta_equal = bpf_map_meta_equal, .map_alloc_check = htab_map_alloc_check, .map_alloc = htab_map_alloc, .map_free = htab_map_free, @@@ -1820,6 -1828,7 +1821,7 @@@
static int htab_lru_map_btf_id; const struct bpf_map_ops htab_lru_map_ops = { + .map_meta_equal = bpf_map_meta_equal, .map_alloc_check = htab_map_alloc_check, .map_alloc = htab_map_alloc, .map_free = htab_map_free, @@@ -1940,6 -1949,7 +1942,7 @@@ static void htab_percpu_map_seq_show_el
static int htab_percpu_map_btf_id; const struct bpf_map_ops htab_percpu_map_ops = { + .map_meta_equal = bpf_map_meta_equal, .map_alloc_check = htab_map_alloc_check, .map_alloc = htab_map_alloc, .map_free = htab_map_free, @@@ -1956,6 -1966,7 +1959,7 @@@
static int htab_lru_percpu_map_btf_id; const struct bpf_map_ops htab_lru_percpu_map_ops = { + .map_meta_equal = bpf_map_meta_equal, .map_alloc_check = htab_map_alloc_check, .map_alloc = htab_map_alloc, .map_free = htab_map_free, diff --combined kernel/bpf/inode.c index 18f4969552ac,b48a56f53495..dd4b7fd60ee7 --- a/kernel/bpf/inode.c +++ b/kernel/bpf/inode.c @@@ -20,6 -20,7 +20,7 @@@ #include <linux/filter.h> #include <linux/bpf.h> #include <linux/bpf_trace.h> + #include "preload/bpf_preload.h"
enum bpf_type { BPF_TYPE_UNSPEC = 0, @@@ -226,12 -227,10 +227,12 @@@ static void *map_seq_next(struct seq_fi else prev_key = key;
+ rcu_read_lock(); if (map->ops->map_get_next_key(map, prev_key, key)) { map_iter(m)->done = true; - return NULL; + key = NULL; } + rcu_read_unlock(); return key; }
@@@ -371,9 -370,10 +372,10 @@@ static struct dentry bpf_lookup(struct inode *dir, struct dentry *dentry, unsigned flags) { /* Dots in names (e.g. "/sys/fs/bpf/foo.bar") are reserved for future - * extensions. + * extensions. That allows populate_bpffs() to create special files. */ - if (strchr(dentry->d_name.name, '.')) + if ((dir->i_mode & S_IALLUGO) && + strchr(dentry->d_name.name, '.')) return ERR_PTR(-EPERM);
return simple_lookup(dir, dentry, flags); @@@ -411,6 -411,27 +413,27 @@@ static const struct inode_operations bp .unlink = simple_unlink, };
+ /* pin iterator link into bpffs */ + static int bpf_iter_link_pin_kernel(struct dentry *parent, + const char *name, struct bpf_link *link) + { + umode_t mode = S_IFREG | S_IRUSR; + struct dentry *dentry; + int ret; + + inode_lock(parent->d_inode); + dentry = lookup_one_len(name, parent, strlen(name)); + if (IS_ERR(dentry)) { + inode_unlock(parent->d_inode); + return PTR_ERR(dentry); + } + ret = bpf_mkobj_ops(dentry, mode, link, &bpf_link_iops, + &bpf_iter_fops); + dput(dentry); + inode_unlock(parent->d_inode); + return ret; + } + static int bpf_obj_do_pin(const char __user *pathname, void *raw, enum bpf_type type) { @@@ -640,6 -661,91 +663,91 @@@ static int bpf_parse_param(struct fs_co return 0; }
+ struct bpf_preload_ops *bpf_preload_ops; + EXPORT_SYMBOL_GPL(bpf_preload_ops); + + static bool bpf_preload_mod_get(void) + { + /* If bpf_preload.ko wasn't loaded earlier then load it now. + * When bpf_preload is built into vmlinux the module's __init + * function will populate it. + */ + if (!bpf_preload_ops) { + request_module("bpf_preload"); + if (!bpf_preload_ops) + return false; + } + /* And grab the reference, so the module doesn't disappear while the + * kernel is interacting with the kernel module and its UMD. + */ + if (!try_module_get(bpf_preload_ops->owner)) { + pr_err("bpf_preload module get failed.\n"); + return false; + } + return true; + } + + static void bpf_preload_mod_put(void) + { + if (bpf_preload_ops) + /* now user can "rmmod bpf_preload" if necessary */ + module_put(bpf_preload_ops->owner); + } + + static DEFINE_MUTEX(bpf_preload_lock); + + static int populate_bpffs(struct dentry *parent) + { + struct bpf_preload_info objs[BPF_PRELOAD_LINKS] = {}; + struct bpf_link *links[BPF_PRELOAD_LINKS] = {}; + int err = 0, i; + + /* grab the mutex to make sure the kernel interactions with bpf_preload + * UMD are serialized + */ + mutex_lock(&bpf_preload_lock); + + /* if bpf_preload.ko wasn't built into vmlinux then load it */ + if (!bpf_preload_mod_get()) + goto out; + + if (!bpf_preload_ops->info.tgid) { + /* preload() will start UMD that will load BPF iterator programs */ + err = bpf_preload_ops->preload(objs); + if (err) + goto out_put; + for (i = 0; i < BPF_PRELOAD_LINKS; i++) { + links[i] = bpf_link_by_id(objs[i].link_id); + if (IS_ERR(links[i])) { + err = PTR_ERR(links[i]); + goto out_put; + } + } + for (i = 0; i < BPF_PRELOAD_LINKS; i++) { + err = bpf_iter_link_pin_kernel(parent, + objs[i].link_name, links[i]); + if (err) + goto out_put; + /* do not unlink successfully pinned links even + * if later link fails to pin + */ + links[i] = NULL; + } + /* finish() will tell UMD process to exit */ + err = bpf_preload_ops->finish(); + if (err) + goto out_put; + } + out_put: + bpf_preload_mod_put(); + out: + mutex_unlock(&bpf_preload_lock); + for (i = 0; i < BPF_PRELOAD_LINKS && err; i++) + if (!IS_ERR_OR_NULL(links[i])) + bpf_link_put(links[i]); + return err; + } + static int bpf_fill_super(struct super_block *sb, struct fs_context *fc) { static const struct tree_descr bpf_rfiles[] = { { "" } }; @@@ -656,8 -762,8 +764,8 @@@ inode = sb->s_root->d_inode; inode->i_op = &bpf_dir_iops; inode->i_mode &= ~S_IALLUGO; + populate_bpffs(sb->s_root); inode->i_mode |= S_ISVTX | opts->mode; - return 0; }
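For context, a hedged sketch of the other side of this interface: a preload module (such as the in-tree bpf_preload UMD wrapper this series adds) would publish its ops roughly like so, with the preload()/finish() implementations elided:

    static struct bpf_preload_ops ops = {
            .preload = preload,     /* fills bpf_preload_info[] via the UMD */
            .finish  = finish,      /* tells the UMD process to exit */
            .owner   = THIS_MODULE,
    };

    static int __init load(void)
    {
            /* populate_bpffs() checks this pointer under bpf_preload_lock */
            bpf_preload_ops = &ops;
            return 0;
    }

With that in place, every new bpffs mount gets the preloaded iterator links pinned into its root by populate_bpffs() above.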
@@@ -707,6 -813,8 +815,8 @@@ static int __init bpf_init(void { int ret;
+ mutex_init(&bpf_preload_lock); + ret = sysfs_create_mount_point(fs_kobj, "bpf"); if (ret) return ret; diff --combined mm/filemap.c index 5202e38ab79e,054d93a86f8a..8c8f2deee4e3 --- a/mm/filemap.c +++ b/mm/filemap.c @@@ -827,10 -827,10 +827,10 @@@ int replace_page_cache_page(struct pag } EXPORT_SYMBOL_GPL(replace_page_cache_page);
- static int __add_to_page_cache_locked(struct page *page, - struct address_space *mapping, - pgoff_t offset, gfp_t gfp_mask, - void **shadowp) + noinline int __add_to_page_cache_locked(struct page *page, + struct address_space *mapping, + pgoff_t offset, gfp_t gfp_mask, + void **shadowp) { XA_STATE(xas, &mapping->i_pages, offset); int huge = PageHuge(page); @@@ -988,43 -988,9 +988,43 @@@ void __init pagecache_init(void page_writeback_init(); }
+/* + * The page wait code treats the "wait->flags" somewhat unusually, because + * we have multiple different kinds of waits, not just the usual "exclusive" + * one. + * + * We have: + * + * (a) no special bits set: + * + * We're just waiting for the bit to be released, and when a waker + * calls the wakeup function, we set WQ_FLAG_WOKEN and wake it up, + * and remove it from the wait queue. + * + * Simple and straightforward. + * + * (b) WQ_FLAG_EXCLUSIVE: + * + * The waiter is waiting to get the lock, and only one waiter should + * be woken up to avoid any thundering herd behavior. We'll set the + * WQ_FLAG_WOKEN bit, wake it up, and remove it from the wait queue. + * + * This is the traditional exclusive wait. + * + * (c) WQ_FLAG_EXCLUSIVE | WQ_FLAG_CUSTOM: + * + * The waiter is waiting to get the bit, and additionally wants the + * lock to be transferred to it for fair lock behavior. If the lock + * cannot be taken, we stop walking the wait queue without waking + * the waiter. + * + * This is the "fair lock handoff" case, and in addition to setting + * WQ_FLAG_WOKEN, we set WQ_FLAG_DONE to let the waiter easily see + * that it now has the lock. + */ static int wake_page_function(wait_queue_entry_t *wait, unsigned mode, int sync, void *arg) { - int ret; + unsigned int flags; struct wait_page_key *key = arg; struct wait_page_queue *wait_page = container_of(wait, struct wait_page_queue, wait); @@@ -1033,44 -999,35 +1033,44 @@@ return 0;
/* - * If it's an exclusive wait, we get the bit for it, and - * stop walking if we can't. - * - * If it's a non-exclusive wait, then the fact that this - * wake function was called means that the bit already - * was cleared, and we don't care if somebody then - * re-took it. + * If it's a lock handoff wait, we get the bit for it, and + * stop walking (and do not wake it up) if we can't. */ - ret = 0; - if (wait->flags & WQ_FLAG_EXCLUSIVE) { - if (test_and_set_bit(key->bit_nr, &key->page->flags)) + flags = wait->flags; + if (flags & WQ_FLAG_EXCLUSIVE) { + if (test_bit(key->bit_nr, &key->page->flags)) return -1; - ret = 1; + if (flags & WQ_FLAG_CUSTOM) { + if (test_and_set_bit(key->bit_nr, &key->page->flags)) + return -1; + flags |= WQ_FLAG_DONE; + } } - wait->flags |= WQ_FLAG_WOKEN;
+ /* + * We are holding the wait-queue lock, but the waiter that + * is waiting for this will be checking the flags without + * any locking. + * + * So update the flags atomically, and wake up the waiter + * afterwards to avoid any races. This store-release pairs + * with the load-acquire in wait_on_page_bit_common(). + */ + smp_store_release(&wait->flags, flags | WQ_FLAG_WOKEN); wake_up_state(wait->private, mode);
/* * Ok, we have successfully done what we're waiting for, * and we can unconditionally remove the wait entry. * - * Note that this has to be the absolute last thing we do, - * since after list_del_init(&wait->entry) the wait entry + * Note that this pairs with the "finish_wait()" in the + * waiter, and has to be the absolute last thing we do. + * After this list_del_init(&wait->entry) the wait entry * might be de-allocated and the process might even have * exited. */ list_del_init_careful(&wait->entry); - return ret; + return (flags & WQ_FLAG_EXCLUSIVE) != 0; }
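Distilled from the two sides of this change (illustrative only): the lockless handshake pairs a store-release in the waker with a load-acquire in the waiter, so the waiter never needs the wait-queue lock to inspect its own entry:

    /* waker (wake_page_function) */
    smp_store_release(&wait->flags, flags | WQ_FLAG_WOKEN);
    wake_up_state(wait->private, mode);

    /* waiter (wait_on_page_bit_common) */
    for (;;) {
            unsigned int flags = smp_load_acquire(&wait->flags);

            if (flags & WQ_FLAG_WOKEN)
                    break;  /* the waker's earlier writes are now visible */
            io_schedule();
    }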
static void wake_up_page_bit(struct page *page, int bit_nr) @@@ -1150,8 -1107,8 +1150,8 @@@ enum behavior };
/* - * Attempt to check (or get) the page bit, and mark the - * waiter woken if successful. + * Attempt to check (or get) the page bit, and mark us done + * if successful. */ static inline bool trylock_page_bit_common(struct page *page, int bit_nr, struct wait_queue_entry *wait) @@@ -1162,17 -1119,13 +1162,17 @@@ } else if (test_bit(bit_nr, &page->flags)) return false;
- wait->flags |= WQ_FLAG_WOKEN; + wait->flags |= WQ_FLAG_WOKEN | WQ_FLAG_DONE; return true; }
+/* How many times do we accept lock stealing from under a waiter? */ +int sysctl_page_lock_unfairness = 5; + static inline int wait_on_page_bit_common(wait_queue_head_t *q, struct page *page, int bit_nr, int state, enum behavior behavior) { + int unfairness = sysctl_page_lock_unfairness; struct wait_page_queue wait_page; wait_queue_entry_t *wait = &wait_page.wait; bool thrashing = false; @@@ -1190,18 -1143,11 +1190,18 @@@ }
init_wait(wait); - wait->flags = behavior == EXCLUSIVE ? WQ_FLAG_EXCLUSIVE : 0; wait->func = wake_page_function; wait_page.page = page; wait_page.bit_nr = bit_nr;
+repeat: + wait->flags = 0; + if (behavior == EXCLUSIVE) { + wait->flags = WQ_FLAG_EXCLUSIVE; + if (--unfairness < 0) + wait->flags |= WQ_FLAG_CUSTOM; + } + /* * Do one last check whether we can get the * page bit synchronously. @@@ -1224,63 -1170,27 +1224,63 @@@
/* * From now on, all the logic will be based on - * the WQ_FLAG_WOKEN flag, and the and the page - * bit testing (and setting) will be - or has - * already been - done by the wake function. + * the WQ_FLAG_WOKEN and WQ_FLAG_DONE flag, to + * see whether the page bit testing has already + * been done by the wake function. * * We can drop our reference to the page. */ if (behavior == DROP) put_page(page);
+ /* + * Note that until the "finish_wait()", or until + * we see the WQ_FLAG_WOKEN flag, we need to + * be very careful with the 'wait->flags', because + * we may race with a waker that sets them. + */ for (;;) { + unsigned int flags; + set_current_state(state);
- if (signal_pending_state(state, current)) + /* Loop until we've been woken or interrupted */ + flags = smp_load_acquire(&wait->flags); + if (!(flags & WQ_FLAG_WOKEN)) { + if (signal_pending_state(state, current)) + break; + + io_schedule(); + continue; + } + + /* If we were non-exclusive, we're done */ + if (behavior != EXCLUSIVE) break;
- if (wait->flags & WQ_FLAG_WOKEN) + /* If the waker got the lock for us, we're done */ + if (flags & WQ_FLAG_DONE) break;
- io_schedule(); + /* + * Otherwise, if we're getting the lock, we need to + * try to get it ourselves. + * + * And if that fails, we'll have to retry this all. + */ + if (unlikely(test_and_set_bit(bit_nr, &page->flags))) + goto repeat; + + wait->flags |= WQ_FLAG_DONE; + break; }
+ /* + * If a signal happened, this 'finish_wait()' may remove the last + * waiter from the wait-queues, but the PageWaiters bit will remain + * set. That's ok. The next wakeup will take care of it, and trying + * to do it here would be difficult and prone to races. + */ finish_wait(q, wait);
if (thrashing) { @@@ -1290,20 -1200,12 +1290,20 @@@ }
/* - * A signal could leave PageWaiters set. Clearing it here if - * !waitqueue_active would be possible (by open-coding finish_wait), - * but still fail to catch it in the case of wait hash collision. We - * already can fail to clear wait hash collision cases, so don't - * bother with signals either. + * NOTE! The wait->flags weren't stable until we've done the + * 'finish_wait()', and we could have exited the loop above due + * to a signal, and had a wakeup event happen after the signal + * test but before the 'finish_wait()'. + * + * So only after the finish_wait() can we reliably determine + * if we got woken up or not, so we can now figure out the final + * return value based on that state without races. + * + * Also note that WQ_FLAG_WOKEN is sufficient for a non-exclusive + * waiter, but an exclusive one requires WQ_FLAG_DONE. */ + if (behavior == EXCLUSIVE) + return wait->flags & WQ_FLAG_DONE ? 0 : -EINTR;
return wait->flags & WQ_FLAG_WOKEN ? 0 : -EINTR; } diff --combined net/batman-adv/bridge_loop_avoidance.c index c350ab63cd54,ab6cec3c7586..ba0027d1f2df --- a/net/batman-adv/bridge_loop_avoidance.c +++ b/net/batman-adv/bridge_loop_avoidance.c @@@ -25,7 -25,6 +25,7 @@@ #include <linux/lockdep.h> #include <linux/netdevice.h> #include <linux/netlink.h> +#include <linux/preempt.h> #include <linux/rculist.h> #include <linux/rcupdate.h> #include <linux/seq_file.h> @@@ -84,12 -83,11 +84,12 @@@ static inline u32 batadv_choose_claim(c */ static inline u32 batadv_choose_backbone_gw(const void *data, u32 size) { - const struct batadv_bla_claim *claim = (struct batadv_bla_claim *)data; + const struct batadv_bla_backbone_gw *gw; u32 hash = 0;
- hash = jhash(&claim->addr, sizeof(claim->addr), hash); - hash = jhash(&claim->vid, sizeof(claim->vid), hash); + gw = (struct batadv_bla_backbone_gw *)data; + hash = jhash(&gw->orig, sizeof(gw->orig), hash); + hash = jhash(&gw->vid, sizeof(gw->vid), hash);
return hash % size; } @@@ -1581,16 -1579,13 +1581,16 @@@ int batadv_bla_init(struct batadv_priv }
/** - * batadv_bla_check_bcast_duplist() - Check if a frame is in the broadcast dup. + * batadv_bla_check_duplist() - Check if a frame is in the broadcast dup. * @bat_priv: the bat priv with all the soft interface information - * @skb: contains the bcast_packet to be checked + * @skb: contains the multicast packet to be checked + * @payload_ptr: pointer to position inside the head buffer of the skb + * marking the start of the data to be CRC'ed + * @orig: originator mac address, NULL if unknown * - * check if it is on our broadcast list. Another gateway might - * have sent the same packet because it is connected to the same backbone, - * so we have to remove this duplicate. + * Check if it is on our broadcast list. Another gateway might have sent the + * same packet because it is connected to the same backbone, so we have to + * remove this duplicate. * * This is performed by checking the CRC, which will tell us * with a good chance that it is the same packet. If it is furthermore @@@ -1599,17 -1594,19 +1599,17 @@@ * * Return: true if a packet is in the duplicate list, false otherwise. */ -bool batadv_bla_check_bcast_duplist(struct batadv_priv *bat_priv, - struct sk_buff *skb) +static bool batadv_bla_check_duplist(struct batadv_priv *bat_priv, + struct sk_buff *skb, u8 *payload_ptr, + const u8 *orig) { - int i, curr; - __be32 crc; - struct batadv_bcast_packet *bcast_packet; struct batadv_bcast_duplist_entry *entry; bool ret = false; - - bcast_packet = (struct batadv_bcast_packet *)skb->data; + int i, curr; + __be32 crc;
/* calculate the crc ... */ - crc = batadv_skb_crc32(skb, (u8 *)(bcast_packet + 1)); + crc = batadv_skb_crc32(skb, payload_ptr);
spin_lock_bh(&bat_priv->bla.bcast_duplist_lock);
@@@ -1628,21 -1625,8 +1628,21 @@@ if (entry->crc != crc) continue;
- if (batadv_compare_eth(entry->orig, bcast_packet->orig)) - continue; + /* are the originators both known and not anonymous? */ + if (orig && !is_zero_ether_addr(orig) && + !is_zero_ether_addr(entry->orig)) { + /* If known, check if the new frame came from + * the same originator: + * We are safe to take identical frames from the + * same orig, if known, as multiplications in + * the mesh are detected via the (orig, seqno) pair. + * So we can be a bit more liberal here and allow + * identical frames from the same orig which the source + * host might have sent multiple times on purpose. + */ + if (batadv_compare_eth(entry->orig, orig)) + continue; + }
/* this entry seems to match: same crc, not too old, * and from another gw. therefore return true to forbid it. @@@ -1658,14 -1642,7 +1658,14 @@@ entry = &bat_priv->bla.bcast_duplist[curr]; entry->crc = crc; entry->entrytime = jiffies; - ether_addr_copy(entry->orig, bcast_packet->orig); + + /* known originator */ + if (orig) + ether_addr_copy(entry->orig, orig); + /* anonymous originator */ + else + eth_zero_addr(entry->orig); + bat_priv->bla.bcast_duplist_curr = curr;
out: @@@ -1674,48 -1651,6 +1674,48 @@@ return ret; }
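The decision the loop above implements can be distilled as follows (hypothetical helper, not in the diff): a frame is treated as a foreign duplicate when an entry carries the same payload CRC and the originators either differ or cannot be compared; recency is checked by the caller via entrytime:

    static bool batadv_is_foreign_dup(const struct batadv_bcast_duplist_entry *e,
                                      __be32 crc, const u8 *orig)
    {
            if (e->crc != crc)
                    return false;
            /* identical frames from the same known orig are tolerated */
            if (orig && !is_zero_ether_addr(orig) &&
                !is_zero_ether_addr(e->orig) &&
                batadv_compare_eth(e->orig, orig))
                    return false;
            return true;
    }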
+/** + * batadv_bla_check_ucast_duplist() - Check if a frame is in the broadcast dup. + * @bat_priv: the bat priv with all the soft interface information + * @skb: contains the multicast packet to be checked, decapsulated from a + * unicast_packet + * + * Check if it is on our broadcast list. Another gateway might have sent the + * same packet because it is connected to the same backbone, so we have to + * remove this duplicate. + * + * Return: true if a packet is in the duplicate list, false otherwise. + */ +static bool batadv_bla_check_ucast_duplist(struct batadv_priv *bat_priv, + struct sk_buff *skb) +{ + return batadv_bla_check_duplist(bat_priv, skb, (u8 *)skb->data, NULL); +} + +/** + * batadv_bla_check_bcast_duplist() - Check if a frame is in the broadcast dup. + * @bat_priv: the bat priv with all the soft interface information + * @skb: contains the bcast_packet to be checked + * + * Check if it is on our broadcast list. Another gateway might have sent the + * same packet because it is connected to the same backbone, so we have to + * remove this duplicate. + * + * Return: true if a packet is in the duplicate list, false otherwise. + */ +bool batadv_bla_check_bcast_duplist(struct batadv_priv *bat_priv, + struct sk_buff *skb) +{ + struct batadv_bcast_packet *bcast_packet; + u8 *payload_ptr; + + bcast_packet = (struct batadv_bcast_packet *)skb->data; + payload_ptr = (u8 *)(bcast_packet + 1); + + return batadv_bla_check_duplist(bat_priv, skb, payload_ptr, + bcast_packet->orig); +} + /** * batadv_bla_is_backbone_gw_orig() - Check if the originator is a gateway for * the VLAN identified by vid. @@@ -1863,7 -1798,7 +1863,7 @@@ batadv_bla_loopdetect_check(struct bata
ret = queue_work(batadv_event_workqueue, &backbone_gw->report_work);
- /* backbone_gw is unreferenced in the report work function function + /* backbone_gw is unreferenced in the report work function * if queue_work() call was successful */ if (!ret) @@@ -1877,7 -1812,7 +1877,7 @@@ * @bat_priv: the bat priv with all the soft interface information * @skb: the frame to be checked * @vid: the VLAN ID of the frame - * @is_bcast: the packet came in a broadcast packet type. + * @packet_type: the batman packet type this frame came in * * batadv_bla_rx avoidance checks if: * * we have to race for a claim @@@ -1889,7 -1824,7 +1889,7 @@@ * further process the skb. */ bool batadv_bla_rx(struct batadv_priv *bat_priv, struct sk_buff *skb, - unsigned short vid, bool is_bcast) + unsigned short vid, int packet_type) { struct batadv_bla_backbone_gw *backbone_gw; struct ethhdr *ethhdr; @@@ -1911,32 -1846,9 +1911,32 @@@ goto handled;
if (unlikely(atomic_read(&bat_priv->bla.num_requests))) - /* don't allow broadcasts while requests are in flight */ - if (is_multicast_ether_addr(ethhdr->h_dest) && is_bcast) - goto handled; + /* don't allow multicast packets while requests are in flight */ + if (is_multicast_ether_addr(ethhdr->h_dest)) + /* Both broadcast flooding or multicast-via-unicasts + * delivery might send to multiple backbone gateways + * sharing the same LAN and therefore need to coordinate + * which backbone gateway forwards into the LAN, + * by claiming the payload source address. + * + * Broadcast flooding and multicast-via-unicasts + * delivery use the following two batman packet types. + * Note: explicitly exclude BATADV_UNICAST_4ADDR, + * as the DHCP gateway feature will send explicitly + * to only one BLA gateway, so the claiming process + * should be avoided there. + */ + if (packet_type == BATADV_BCAST || + packet_type == BATADV_UNICAST) + goto handled; + + /* potential duplicates from foreign BLA backbone gateways via + * multicast-in-unicast packets + */ + if (is_multicast_ether_addr(ethhdr->h_dest) && + packet_type == BATADV_UNICAST && + batadv_bla_check_ucast_duplist(bat_priv, skb)) + goto handled;
ether_addr_copy(search_claim.addr, ethhdr->h_source); search_claim.vid = vid; @@@ -1971,14 -1883,13 +1971,14 @@@ goto allow; }
- /* if it is a broadcast ... */ - if (is_multicast_ether_addr(ethhdr->h_dest) && is_bcast) { + /* if it is a multicast ... */ + if (is_multicast_ether_addr(ethhdr->h_dest) && + (packet_type == BATADV_BCAST || packet_type == BATADV_UNICAST)) { /* ... drop it. the responsible gateway is in charge. * - * We need to check is_bcast because with the gateway + * We need to check packet type because with the gateway * feature, broadcasts (like DHCP requests) may be sent - * using a unicast packet type. + * using a unicast 4 address packet type. See comment above. */ goto handled; } else { diff --combined net/batman-adv/multicast.c index ca24a2e522b7,1622c3f5898f..0746fe2c2c04 --- a/net/batman-adv/multicast.c +++ b/net/batman-adv/multicast.c @@@ -51,7 -51,6 +51,7 @@@ #include <uapi/linux/batadv_packet.h> #include <uapi/linux/batman_adv.h>
+#include "bridge_loop_avoidance.h" #include "hard-interface.h" #include "hash.h" #include "log.h" @@@ -208,7 -207,7 +208,7 @@@ static u8 batadv_mcast_mla_rtr_flags_br return BATADV_MCAST_WANT_NO_RTR4 | BATADV_MCAST_WANT_NO_RTR6;
/* TODO: ask the bridge if a multicast router is present (the bridge - * is capable of performing proper RFC4286 multicast multicast router + * is capable of performing proper RFC4286 multicast router * discovery) instead of searching for a ff02::2 listener here */ ret = br_multicast_list_adjacent(dev, &bridge_mcast_list); @@@ -1435,35 -1434,6 +1435,35 @@@ batadv_mcast_forw_mode(struct batadv_pr return BATADV_FORW_ALL; }
+/** + * batadv_mcast_forw_send_orig() - send a multicast packet to an originator + * @bat_priv: the bat priv with all the soft interface information + * @skb: the multicast packet to send + * @vid: the vlan identifier + * @orig_node: the originator to send the packet to + * + * Return: NET_XMIT_DROP in case of error or NET_XMIT_SUCCESS otherwise. + */ +int batadv_mcast_forw_send_orig(struct batadv_priv *bat_priv, + struct sk_buff *skb, + unsigned short vid, + struct batadv_orig_node *orig_node) +{ + /* Avoid sending multicast-in-unicast packets to other BLA + * gateways - they already got the frame from the LAN side + * we share with them. + * TODO: Refactor to take BLA into account earlier, to avoid + * reducing the mcast_fanout count. + */ + if (batadv_bla_is_backbone_gw_orig(bat_priv, orig_node->orig, vid)) { + dev_kfree_skb(skb); + return NET_XMIT_SUCCESS; + } + + return batadv_send_skb_unicast(bat_priv, skb, BATADV_UNICAST, 0, + orig_node, vid); +} + /** * batadv_mcast_forw_tt() - forwards a packet to multicast listeners * @bat_priv: the bat priv with all the soft interface information @@@ -1501,8 -1471,8 +1501,8 @@@ batadv_mcast_forw_tt(struct batadv_pri break; }
- batadv_send_skb_unicast(bat_priv, newskb, BATADV_UNICAST, 0, - orig_entry->orig_node, vid); + batadv_mcast_forw_send_orig(bat_priv, newskb, vid, + orig_entry->orig_node); } rcu_read_unlock();
@@@ -1543,7 -1513,8 +1543,7 @@@ batadv_mcast_forw_want_all_ipv4(struct break; }
- batadv_send_skb_unicast(bat_priv, newskb, BATADV_UNICAST, 0, - orig_node, vid); + batadv_mcast_forw_send_orig(bat_priv, newskb, vid, orig_node); } rcu_read_unlock(); return ret; @@@ -1580,7 -1551,8 +1580,7 @@@ batadv_mcast_forw_want_all_ipv6(struct break; }
- batadv_send_skb_unicast(bat_priv, newskb, BATADV_UNICAST, 0, - orig_node, vid); + batadv_mcast_forw_send_orig(bat_priv, newskb, vid, orig_node); } rcu_read_unlock(); return ret; @@@ -1646,7 -1618,8 +1646,7 @@@ batadv_mcast_forw_want_all_rtr4(struct break; }
- batadv_send_skb_unicast(bat_priv, newskb, BATADV_UNICAST, 0, - orig_node, vid); + batadv_mcast_forw_send_orig(bat_priv, newskb, vid, orig_node); } rcu_read_unlock(); return ret; @@@ -1683,7 -1656,8 +1683,7 @@@ batadv_mcast_forw_want_all_rtr6(struct break; }
- batadv_send_skb_unicast(bat_priv, newskb, BATADV_UNICAST, 0, - orig_node, vid); + batadv_mcast_forw_send_orig(bat_priv, newskb, vid, orig_node); } rcu_read_unlock(); return ret; diff --combined net/batman-adv/soft-interface.c index cdde943c1b83,9d3974ba11ed..82e7ca886605 --- a/net/batman-adv/soft-interface.c +++ b/net/batman-adv/soft-interface.c @@@ -364,8 -364,9 +364,8 @@@ send goto dropped; ret = batadv_send_skb_via_gw(bat_priv, skb, vid); } else if (mcast_single_orig) { - ret = batadv_send_skb_unicast(bat_priv, skb, - BATADV_UNICAST, 0, - mcast_single_orig, vid); + ret = batadv_mcast_forw_send_orig(bat_priv, skb, vid, + mcast_single_orig); } else if (forw_mode == BATADV_FORW_SOME) { ret = batadv_mcast_forw_send(bat_priv, skb, vid); } else { @@@ -424,10 -425,10 +424,10 @@@ void batadv_interface_rx(struct net_dev struct vlan_ethhdr *vhdr; struct ethhdr *ethhdr; unsigned short vid; - bool is_bcast; + int packet_type;
batadv_bcast_packet = (struct batadv_bcast_packet *)skb->data; - is_bcast = (batadv_bcast_packet->packet_type == BATADV_BCAST); + packet_type = batadv_bcast_packet->packet_type;
skb_pull_rcsum(skb, hdr_size); skb_reset_mac_header(skb); @@@ -470,7 -471,7 +470,7 @@@ /* Let the bridge loop avoidance check the packet. If will * not handle it, we can safely push it up. */ - if (batadv_bla_rx(bat_priv, skb, vid, is_bcast)) + if (batadv_bla_rx(bat_priv, skb, vid, packet_type)) goto out;
if (orig_node) @@@ -648,7 -649,7 +648,7 @@@ static void batadv_softif_destroy_vlan( /** * batadv_interface_add_vid() - ndo_add_vid API implementation * @dev: the netdev of the mesh interface - * @proto: protocol of the the vlan id + * @proto: protocol of the vlan id * @vid: identifier of the new vlan * * Set up all the internal structures for handling the new vlan on top of the @@@ -706,7 -707,7 +706,7 @@@ static int batadv_interface_add_vid(str /** * batadv_interface_kill_vid() - ndo_kill_vid API implementation * @dev: the netdev of the mesh interface - * @proto: protocol of the the vlan id + * @proto: protocol of the vlan id * @vid: identifier of the deleted vlan * * Destroy all the internal structures used to handle the vlan identified by vid diff --combined net/core/dev.c index 266073e300b5,bd9c8510d86f..38a172a63318 --- a/net/core/dev.c +++ b/net/core/dev.c @@@ -98,6 -98,7 +98,7 @@@ #include <net/busy_poll.h> #include <linux/rtnetlink.h> #include <linux/stat.h> + #include <net/dsa.h> #include <net/dst.h> #include <net/dst_metadata.h> #include <net/pkt_sched.h> @@@ -1130,7 -1131,7 +1131,7 @@@ EXPORT_SYMBOL(__dev_get_by_flags) * @name: name string * * Network device names need to be valid file names to - * to allow sysfs to work. We also disallow any kind of + * allow sysfs to work. We also disallow any kind of * whitespace. */ bool dev_valid_name(const char *name) @@@ -5192,7 -5193,7 +5193,7 @@@ skip_classify } }
- if (unlikely(skb_vlan_tag_present(skb))) { + if (unlikely(skb_vlan_tag_present(skb)) && !netdev_uses_dsa(skb->dev)) { check_vlan_id: if (skb_vlan_tag_get_id(skb)) { /* Vlan id is non 0 and vlan_do_receive() above couldn't @@@ -5621,17 -5622,60 +5622,60 @@@ static void flush_backlog(struct work_s local_bh_enable(); }
+ static bool flush_required(int cpu) + { + #if IS_ENABLED(CONFIG_RPS) + struct softnet_data *sd = &per_cpu(softnet_data, cpu); + bool do_flush; + + local_irq_disable(); + rps_lock(sd); + + /* as insertion into process_queue happens with the rps lock held, + * process_queue access may race only with dequeue + */ + do_flush = !skb_queue_empty(&sd->input_pkt_queue) || + !skb_queue_empty_lockless(&sd->process_queue); + rps_unlock(sd); + local_irq_enable(); + + return do_flush; + #endif + /* without RPS we can't safely check input_pkt_queue: during a + * concurrent remote skb_queue_splice() we can detect as empty both + * input_pkt_queue and process_queue even if the latter could end-up + * containing a lot of packets. + */ + return true; + } + static void flush_all_backlogs(void) { + static cpumask_t flush_cpus; unsigned int cpu;
+ /* since we are under rtnl lock protection we can use static data + * for the cpumask and avoid allocating on stack the possibly + * large mask + */ + ASSERT_RTNL(); + get_online_cpus();
- for_each_online_cpu(cpu) - queue_work_on(cpu, system_highpri_wq, - per_cpu_ptr(&flush_works, cpu)); + cpumask_clear(&flush_cpus); + for_each_online_cpu(cpu) { + if (flush_required(cpu)) { + queue_work_on(cpu, system_highpri_wq, + per_cpu_ptr(&flush_works, cpu)); + cpumask_set_cpu(cpu, &flush_cpus); + } + }
- for_each_online_cpu(cpu) + /* we can have in flight packet[s] on the cpus we are not flushing, + * synchronize_net() in rollback_registered_many() will take care of + * them + */ + for_each_cpu(cpu, &flush_cpus) flush_work(per_cpu_ptr(&flush_works, cpu));
put_online_cpus(); @@@ -6293,7 -6337,7 +6337,7 @@@ EXPORT_SYMBOL(__napi_schedule) * @n: napi context * * Test if NAPI routine is already running, and if not mark - * it as running. This is used as a condition variable + * it as running. This is used as a condition variable to * insure only one NAPI poll instance runs. We also make * sure there is no pending NAPI disable. */ @@@ -6533,8 -6577,7 +6577,7 @@@ EXPORT_SYMBOL(napi_busy_loop)
static void napi_hash_add(struct napi_struct *napi) { - if (test_bit(NAPI_STATE_NO_BUSY_POLL, &napi->state) || - test_and_set_bit(NAPI_STATE_HASHED, &napi->state)) + if (test_bit(NAPI_STATE_NO_BUSY_POLL, &napi->state)) return;
spin_lock(&napi_hash_lock); @@@ -6555,20 -6598,14 +6598,14 @@@ /* Warning : caller is responsible to make sure rcu grace period * is respected before freeing memory containing @napi */ - bool napi_hash_del(struct napi_struct *napi) + static void napi_hash_del(struct napi_struct *napi) { - bool rcu_sync_needed = false; - spin_lock(&napi_hash_lock);
- if (test_and_clear_bit(NAPI_STATE_HASHED, &napi->state)) { - rcu_sync_needed = true; - hlist_del_rcu(&napi->napi_hash_node); - } + hlist_del_init_rcu(&napi->napi_hash_node); + spin_unlock(&napi_hash_lock); - return rcu_sync_needed; } - EXPORT_SYMBOL_GPL(napi_hash_del);
static enum hrtimer_restart napi_watchdog(struct hrtimer *timer) { @@@ -6600,7 -6637,11 +6637,11 @@@ static void init_gro_hash(struct napi_s void netif_napi_add(struct net_device *dev, struct napi_struct *napi, int (*poll)(struct napi_struct *, int), int weight) { + if (WARN_ON(test_and_set_bit(NAPI_STATE_LISTED, &napi->state))) + return; + INIT_LIST_HEAD(&napi->poll_list); + INIT_HLIST_NODE(&napi->napi_hash_node); hrtimer_init(&napi->timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_PINNED); napi->timer.function = napi_watchdog; init_gro_hash(napi); @@@ -6653,18 -6694,19 +6694,19 @@@ static void flush_gro_hash(struct napi_ }
/* Must be called in process context */ - void netif_napi_del(struct napi_struct *napi) + void __netif_napi_del(struct napi_struct *napi) { - might_sleep(); - if (napi_hash_del(napi)) - synchronize_net(); - list_del_init(&napi->dev_list); + if (!test_and_clear_bit(NAPI_STATE_LISTED, &napi->state)) + return; + + napi_hash_del(napi); + list_del_rcu(&napi->dev_list); napi_free_frags(napi);
flush_gro_hash(napi); napi->gro_bitmask = 0; } - EXPORT_SYMBOL(netif_napi_del); + EXPORT_SYMBOL(__netif_napi_del);
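netif_napi_del() itself is not gone; with the hashed/listed state now tracked explicitly, the header-side wrapper (outside this diff) presumably reduces to:

    static inline void netif_napi_del(struct napi_struct *napi)
    {
            __netif_napi_del(napi);
            /* callers still observe a full RCU grace period */
            synchronize_net();
    }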
static int napi_poll(struct napi_struct *n, struct list_head *repoll) { @@@ -8647,7 -8689,7 +8689,7 @@@ int dev_get_port_parent_id(struct net_d if (!first.id_len) first = *ppid; else if (memcmp(&first, ppid, sizeof(*ppid))) - return -ENODATA; + return -EOPNOTSUPP; }
return err; @@@ -9470,7 -9512,7 +9512,7 @@@ int __netdev_update_features(struct net /* driver might be less strict about feature dependencies */ features = netdev_fix_features(dev, features);
- /* some features can't be enabled if they're off an an upper device */ + /* some features can't be enabled if they're off on an upper device */ netdev_for_each_upper_dev_rcu(dev, upper, iter) features = netdev_sync_upper_features(dev, upper, features);
@@@ -9986,10 -10028,12 +10028,12 @@@ EXPORT_SYMBOL(netdev_refcnt_read) * We can get stuck here if buggy protocols don't correctly * call dev_put. */ + #define WAIT_REFS_MIN_MSECS 1 + #define WAIT_REFS_MAX_MSECS 250 static void netdev_wait_allrefs(struct net_device *dev) { unsigned long rebroadcast_time, warning_time; - int refcnt; + int wait = 0, refcnt;
linkwatch_forget_dev(dev);
@@@ -10023,7 -10067,13 +10067,13 @@@ rebroadcast_time = jiffies; }
- msleep(250); + if (!wait) { + rcu_barrier(); + wait = WAIT_REFS_MIN_MSECS; + } else { + msleep(wait); + wait = min(wait << 1, WAIT_REFS_MAX_MSECS); + }
refcnt = netdev_refcnt_read(dev);
diff --combined net/core/filter.c index 21eaf3b182f2,2ad9c0ef1946..08f577114acc --- a/net/core/filter.c +++ b/net/core/filter.c @@@ -4459,6 -4459,7 +4459,7 @@@ static int _bpf_setsockopt(struct sock } else { struct inet_connection_sock *icsk = inet_csk(sk); struct tcp_sock *tp = tcp_sk(sk); + unsigned long timeout;
if (optlen != sizeof(int)) return -EINVAL; @@@ -4480,6 -4481,20 +4481,20 @@@ tp->snd_ssthresh = val; } break; + case TCP_BPF_DELACK_MAX: + timeout = usecs_to_jiffies(val); + if (timeout > TCP_DELACK_MAX || + timeout < TCP_TIMEOUT_MIN) + return -EINVAL; + inet_csk(sk)->icsk_delack_max = timeout; + break; + case TCP_BPF_RTO_MIN: + timeout = usecs_to_jiffies(val); + if (timeout > TCP_RTO_MIN || + timeout < TCP_TIMEOUT_MIN) + return -EINVAL; + inet_csk(sk)->icsk_rto_min = timeout; + break; case TCP_SAVE_SYN: if (val < 0 || val > 1) ret = -EINVAL; @@@ -4550,9 -4565,9 +4565,9 @@@ static int _bpf_getsockopt(struct sock tp = tcp_sk(sk);
if (optlen <= 0 || !tp->saved_syn || - optlen > tp->saved_syn[0]) + optlen > tcp_saved_syn_len(tp->saved_syn)) goto err_clear; - memcpy(optval, tp->saved_syn + 1, optlen); + memcpy(optval, tp->saved_syn->data, optlen); break; default: goto err_clear; @@@ -4654,9 -4669,99 +4669,99 @@@ static const struct bpf_func_proto bpf_ .arg5_type = ARG_CONST_SIZE, };
+ static int bpf_sock_ops_get_syn(struct bpf_sock_ops_kern *bpf_sock, + int optname, const u8 **start) + { + struct sk_buff *syn_skb = bpf_sock->syn_skb; + const u8 *hdr_start; + int ret; + + if (syn_skb) { + /* sk is a request_sock here */ + + if (optname == TCP_BPF_SYN) { + hdr_start = syn_skb->data; + ret = tcp_hdrlen(syn_skb); + } else if (optname == TCP_BPF_SYN_IP) { + hdr_start = skb_network_header(syn_skb); + ret = skb_network_header_len(syn_skb) + + tcp_hdrlen(syn_skb); + } else { + /* optname == TCP_BPF_SYN_MAC */ + hdr_start = skb_mac_header(syn_skb); + ret = skb_mac_header_len(syn_skb) + + skb_network_header_len(syn_skb) + + tcp_hdrlen(syn_skb); + } + } else { + struct sock *sk = bpf_sock->sk; + struct saved_syn *saved_syn; + + if (sk->sk_state == TCP_NEW_SYN_RECV) + /* synack retransmit. bpf_sock->syn_skb will + * not be available. It has to resort to + * saved_syn (if it is saved). + */ + saved_syn = inet_reqsk(sk)->saved_syn; + else + saved_syn = tcp_sk(sk)->saved_syn; + + if (!saved_syn) + return -ENOENT; + + if (optname == TCP_BPF_SYN) { + hdr_start = saved_syn->data + + saved_syn->mac_hdrlen + + saved_syn->network_hdrlen; + ret = saved_syn->tcp_hdrlen; + } else if (optname == TCP_BPF_SYN_IP) { + hdr_start = saved_syn->data + + saved_syn->mac_hdrlen; + ret = saved_syn->network_hdrlen + + saved_syn->tcp_hdrlen; + } else { + /* optname == TCP_BPF_SYN_MAC */ + + /* TCP_SAVE_SYN may not have saved the mac hdr */ + if (!saved_syn->mac_hdrlen) + return -ENOENT; + + hdr_start = saved_syn->data; + ret = saved_syn->mac_hdrlen + + saved_syn->network_hdrlen + + saved_syn->tcp_hdrlen; + } + } + + *start = hdr_start; + return ret; + } + BPF_CALL_5(bpf_sock_ops_getsockopt, struct bpf_sock_ops_kern *, bpf_sock, int, level, int, optname, char *, optval, int, optlen) { + if (IS_ENABLED(CONFIG_INET) && level == SOL_TCP && + optname >= TCP_BPF_SYN && optname <= TCP_BPF_SYN_MAC) { + int ret, copy_len = 0; + const u8 *start; + + ret = bpf_sock_ops_get_syn(bpf_sock, optname, &start); + if (ret > 0) { + copy_len = ret; + if (optlen < copy_len) { + copy_len = optlen; + ret = -ENOSPC; + } + + memcpy(optval, start, copy_len); + } + + /* Zero out unused buffer at the end */ + memset(optval + copy_len, 0, optlen - copy_len); + + return ret; + } + return _bpf_getsockopt(bpf_sock->sk, level, optname, optval, optlen); }
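A hedged sketch (libbpf-style, headers elided, not from this diff) of a sockops program consuming the new optnames; TCP_BPF_SYN_IP asks for the network plus TCP header of the received SYN, falling back to saved_syn on synack retransmit as implemented above:

    SEC("sockops")
    int dump_syn(struct bpf_sock_ops *skops)
    {
            char hdrs[64] = {};

            if (skops->op == BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB)
                    /* returns total header length, or -ENOSPC if 64 is short */
                    bpf_getsockopt(skops, SOL_TCP, TCP_BPF_SYN_IP,
                                   hdrs, sizeof(hdrs));
            return 1;
    }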
@@@ -4838,7 -4943,6 +4943,7 @@@ static int bpf_ipv4_fib_lookup(struct n fl4.saddr = params->ipv4_src; fl4.fl4_sport = params->sport; fl4.fl4_dport = params->dport; + fl4.flowi4_multipath_hash = 0;
if (flags & BPF_FIB_LOOKUP_DIRECT) { u32 tbid = l3mdev_fib_table_rcu(dev) ? : RT_TABLE_MAIN; @@@ -6151,6 -6255,232 +6256,232 @@@ static const struct bpf_func_proto bpf_ .arg3_type = ARG_ANYTHING, };
+ static const u8 *bpf_search_tcp_opt(const u8 *op, const u8 *opend, + u8 search_kind, const u8 *magic, + u8 magic_len, bool *eol) + { + u8 kind, kind_len; + + *eol = false; + + while (op < opend) { + kind = op[0]; + + if (kind == TCPOPT_EOL) { + *eol = true; + return ERR_PTR(-ENOMSG); + } else if (kind == TCPOPT_NOP) { + op++; + continue; + } + + if (opend - op < 2 || opend - op < op[1] || op[1] < 2) + /* Something is wrong in the received header. + * Follow the TCP stack's tcp_parse_options() + * and just bail here. + */ + return ERR_PTR(-EFAULT); + + kind_len = op[1]; + if (search_kind == kind) { + if (!magic_len) + return op; + + if (magic_len > kind_len - 2) + return ERR_PTR(-ENOMSG); + + if (!memcmp(&op[2], magic, magic_len)) + return op; + } + + op += kind_len; + } + + return ERR_PTR(-ENOMSG); + } + + BPF_CALL_4(bpf_sock_ops_load_hdr_opt, struct bpf_sock_ops_kern *, bpf_sock, + void *, search_res, u32, len, u64, flags) + { + bool eol, load_syn = flags & BPF_LOAD_HDR_OPT_TCP_SYN; + const u8 *op, *opend, *magic, *search = search_res; + u8 search_kind, search_len, copy_len, magic_len; + int ret; + + /* 2 byte is the minimal option len except TCPOPT_NOP and + * TCPOPT_EOL which are useless for the bpf prog to learn + * and this helper disallow loading them also. + */ + if (len < 2 || flags & ~BPF_LOAD_HDR_OPT_TCP_SYN) + return -EINVAL; + + search_kind = search[0]; + search_len = search[1]; + + if (search_len > len || search_kind == TCPOPT_NOP || + search_kind == TCPOPT_EOL) + return -EINVAL; + + if (search_kind == TCPOPT_EXP || search_kind == 253) { + /* 16 or 32 bit magic. +2 for kind and kind length */ + if (search_len != 4 && search_len != 6) + return -EINVAL; + magic = &search[2]; + magic_len = search_len - 2; + } else { + if (search_len) + return -EINVAL; + magic = NULL; + magic_len = 0; + } + + if (load_syn) { + ret = bpf_sock_ops_get_syn(bpf_sock, TCP_BPF_SYN, &op); + if (ret < 0) + return ret; + + opend = op + ret; + op += sizeof(struct tcphdr); + } else { + if (!bpf_sock->skb || + bpf_sock->op == BPF_SOCK_OPS_HDR_OPT_LEN_CB) + /* This bpf_sock->op cannot call this helper */ + return -EPERM; + + opend = bpf_sock->skb_data_end; + op = bpf_sock->skb->data + sizeof(struct tcphdr); + } + + op = bpf_search_tcp_opt(op, opend, search_kind, magic, magic_len, + &eol); + if (IS_ERR(op)) + return PTR_ERR(op); + + copy_len = op[1]; + ret = copy_len; + if (copy_len > len) { + ret = -ENOSPC; + copy_len = len; + } + + memcpy(search_res, op, copy_len); + return ret; + } + + static const struct bpf_func_proto bpf_sock_ops_load_hdr_opt_proto = { + .func = bpf_sock_ops_load_hdr_opt, + .gpl_only = false, + .ret_type = RET_INTEGER, + .arg1_type = ARG_PTR_TO_CTX, + .arg2_type = ARG_PTR_TO_MEM, + .arg3_type = ARG_CONST_SIZE, + .arg4_type = ARG_ANYTHING, + }; + + BPF_CALL_4(bpf_sock_ops_store_hdr_opt, struct bpf_sock_ops_kern *, bpf_sock, + const void *, from, u32, len, u64, flags) + { + u8 new_kind, new_kind_len, magic_len = 0, *opend; + const u8 *op, *new_op, *magic = NULL; + struct sk_buff *skb; + bool eol; + + if (bpf_sock->op != BPF_SOCK_OPS_WRITE_HDR_OPT_CB) + return -EPERM; + + if (len < 2 || flags) + return -EINVAL; + + new_op = from; + new_kind = new_op[0]; + new_kind_len = new_op[1]; + + if (new_kind_len > len || new_kind == TCPOPT_NOP || + new_kind == TCPOPT_EOL) + return -EINVAL; + + if (new_kind_len > bpf_sock->remaining_opt_len) + return -ENOSPC; + + /* 253 is another experimental kind */ + if (new_kind == TCPOPT_EXP || new_kind == 253) { + if (new_kind_len < 4) + return -EINVAL; + /* 
Match for the 2 byte magic also. + * RFC 6994: the magic could be 2 or 4 bytes. + * Hence, matching by 2 byte only is on the + * conservative side but it is the right + * thing to do for the 'search-for-duplication' + * purpose. + */ + magic = &new_op[2]; + magic_len = 2; + } + + /* Check for duplication */ + skb = bpf_sock->skb; + op = skb->data + sizeof(struct tcphdr); + opend = bpf_sock->skb_data_end; + + op = bpf_search_tcp_opt(op, opend, new_kind, magic, magic_len, + &eol); + if (!IS_ERR(op)) + return -EEXIST; + + if (PTR_ERR(op) != -ENOMSG) + return PTR_ERR(op); + + if (eol) + /* The option has been ended. Treat it as no more + * header option can be written. + */ + return -ENOSPC; + + /* No duplication found. Store the header option. */ + memcpy(opend, from, new_kind_len); + + bpf_sock->remaining_opt_len -= new_kind_len; + bpf_sock->skb_data_end += new_kind_len; + + return 0; + } + + static const struct bpf_func_proto bpf_sock_ops_store_hdr_opt_proto = { + .func = bpf_sock_ops_store_hdr_opt, + .gpl_only = false, + .ret_type = RET_INTEGER, + .arg1_type = ARG_PTR_TO_CTX, + .arg2_type = ARG_PTR_TO_MEM, + .arg3_type = ARG_CONST_SIZE, + .arg4_type = ARG_ANYTHING, + }; + + BPF_CALL_3(bpf_sock_ops_reserve_hdr_opt, struct bpf_sock_ops_kern *, bpf_sock, + u32, len, u64, flags) + { + if (bpf_sock->op != BPF_SOCK_OPS_HDR_OPT_LEN_CB) + return -EPERM; + + if (flags || len < 2) + return -EINVAL; + + if (len > bpf_sock->remaining_opt_len) + return -ENOSPC; + + bpf_sock->remaining_opt_len -= len; + + return 0; + } + + static const struct bpf_func_proto bpf_sock_ops_reserve_hdr_opt_proto = { + .func = bpf_sock_ops_reserve_hdr_opt, + .gpl_only = false, + .ret_type = RET_INTEGER, + .arg1_type = ARG_PTR_TO_CTX, + .arg2_type = ARG_ANYTHING, + .arg3_type = ARG_ANYTHING, + }; + #endif /* CONFIG_INET */
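Taken together, the three helpers support a reserve-then-write flow. A hedged sketch (libbpf-style, headers elided; the option kind/magic values are illustrative, and the callbacks only fire once the corresponding BPF_SOCK_OPS_WRITE_HDR_OPT_CB_FLAG has been enabled):

    SEC("sockops")
    int add_hdr_opt(struct bpf_sock_ops *skops)
    {
            /* kind 254 (TCPOPT_EXP), total len 5, 2-byte magic, 1 data byte */
            __u8 opt[5] = { 254, 5, 0xeb, 0x9f, 0x01 };

            switch (skops->op) {
            case BPF_SOCK_OPS_HDR_OPT_LEN_CB:
                    bpf_reserve_hdr_opt(skops, sizeof(opt), 0);
                    break;
            case BPF_SOCK_OPS_WRITE_HDR_OPT_CB:
                    bpf_store_hdr_opt(skops, opt, sizeof(opt), 0);
                    break;
            }
            return 1;
    }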
bool bpf_helper_changes_pkt_data(void *func) @@@ -6179,6 -6509,9 +6510,9 @@@ func == bpf_lwt_seg6_store_bytes || func == bpf_lwt_seg6_adjust_srh || func == bpf_lwt_seg6_action || + #endif + #ifdef CONFIG_INET + func == bpf_sock_ops_store_hdr_opt || #endif func == bpf_lwt_in_push_encap || func == bpf_lwt_xmit_push_encap) @@@ -6551,6 -6884,12 +6885,12 @@@ sock_ops_func_proto(enum bpf_func_id fu case BPF_FUNC_sk_storage_delete: return &bpf_sk_storage_delete_proto; #ifdef CONFIG_INET + case BPF_FUNC_load_hdr_opt: + return &bpf_sock_ops_load_hdr_opt_proto; + case BPF_FUNC_store_hdr_opt: + return &bpf_sock_ops_store_hdr_opt_proto; + case BPF_FUNC_reserve_hdr_opt: + return &bpf_sock_ops_reserve_hdr_opt_proto; case BPF_FUNC_tcp_sock: return &bpf_tcp_sock_proto; #endif /* CONFIG_INET */ @@@ -7066,6 -7405,8 +7406,6 @@@ static int bpf_gen_ld_abs(const struct bool indirect = BPF_MODE(orig->code) == BPF_IND; struct bpf_insn *insn = insn_buf;
- /* We're guaranteed here that CTX is in R6. */ - *insn++ = BPF_MOV64_REG(BPF_REG_1, BPF_REG_CTX); if (!indirect) { *insn++ = BPF_MOV64_IMM(BPF_REG_2, orig->imm); } else { @@@ -7073,8 -7414,6 +7413,8 @@@ if (orig->imm) *insn++ = BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, orig->imm); } + /* We're guaranteed here that CTX is in R6. */ + *insn++ = BPF_MOV64_REG(BPF_REG_1, BPF_REG_CTX);
switch (BPF_SIZE(orig->code)) { case BPF_B: @@@ -7350,6 -7689,20 +7690,20 @@@ static bool sock_ops_is_valid_access(in return false; info->reg_type = PTR_TO_SOCKET_OR_NULL; break; + case offsetof(struct bpf_sock_ops, skb_data): + if (size != sizeof(__u64)) + return false; + info->reg_type = PTR_TO_PACKET; + break; + case offsetof(struct bpf_sock_ops, skb_data_end): + if (size != sizeof(__u64)) + return false; + info->reg_type = PTR_TO_PACKET_END; + break; + case offsetof(struct bpf_sock_ops, skb_tcp_flags): + bpf_ctx_record_field_size(info, size_default); + return bpf_ctx_narrow_access_ok(off, size, + size_default); default: if (size != size_default) return false; @@@ -8451,17 -8804,22 +8805,22 @@@ static u32 sock_ops_convert_ctx_access( return insn - insn_buf;
switch (si->off) { - case offsetof(struct bpf_sock_ops, op) ... + case offsetof(struct bpf_sock_ops, op): + *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct bpf_sock_ops_kern, + op), + si->dst_reg, si->src_reg, + offsetof(struct bpf_sock_ops_kern, op)); + break; + + case offsetof(struct bpf_sock_ops, replylong[0]) ... offsetof(struct bpf_sock_ops, replylong[3]): - BUILD_BUG_ON(sizeof_field(struct bpf_sock_ops, op) != - sizeof_field(struct bpf_sock_ops_kern, op)); BUILD_BUG_ON(sizeof_field(struct bpf_sock_ops, reply) != sizeof_field(struct bpf_sock_ops_kern, reply)); BUILD_BUG_ON(sizeof_field(struct bpf_sock_ops, replylong) != sizeof_field(struct bpf_sock_ops_kern, replylong)); off = si->off; - off -= offsetof(struct bpf_sock_ops, op); - off += offsetof(struct bpf_sock_ops_kern, op); + off -= offsetof(struct bpf_sock_ops, replylong[0]); + off += offsetof(struct bpf_sock_ops_kern, replylong[0]); if (type == BPF_WRITE) *insn++ = BPF_STX_MEM(BPF_W, si->dst_reg, si->src_reg, off); @@@ -8682,6 -9040,49 +9041,49 @@@ case offsetof(struct bpf_sock_ops, sk): SOCK_OPS_GET_SK(); break; + case offsetof(struct bpf_sock_ops, skb_data_end): + *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct bpf_sock_ops_kern, + skb_data_end), + si->dst_reg, si->src_reg, + offsetof(struct bpf_sock_ops_kern, + skb_data_end)); + break; + case offsetof(struct bpf_sock_ops, skb_data): + *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct bpf_sock_ops_kern, + skb), + si->dst_reg, si->src_reg, + offsetof(struct bpf_sock_ops_kern, + skb)); + *insn++ = BPF_JMP_IMM(BPF_JEQ, si->dst_reg, 0, 1); + *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct sk_buff, data), + si->dst_reg, si->dst_reg, + offsetof(struct sk_buff, data)); + break; + case offsetof(struct bpf_sock_ops, skb_len): + *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct bpf_sock_ops_kern, + skb), + si->dst_reg, si->src_reg, + offsetof(struct bpf_sock_ops_kern, + skb)); + *insn++ = BPF_JMP_IMM(BPF_JEQ, si->dst_reg, 0, 1); + *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct sk_buff, len), + si->dst_reg, si->dst_reg, + offsetof(struct sk_buff, len)); + break; + case offsetof(struct bpf_sock_ops, skb_tcp_flags): + off = offsetof(struct sk_buff, cb); + off += offsetof(struct tcp_skb_cb, tcp_flags); + *target_size = sizeof_field(struct tcp_skb_cb, tcp_flags); + *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct bpf_sock_ops_kern, + skb), + si->dst_reg, si->src_reg, + offsetof(struct bpf_sock_ops_kern, + skb)); + *insn++ = BPF_JMP_IMM(BPF_JEQ, si->dst_reg, 0, 1); + *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct tcp_skb_cb, + tcp_flags), + si->dst_reg, si->dst_reg, off); + break; } return insn - insn_buf; } @@@ -9523,7 -9924,7 +9925,7 @@@ BPF_CALL_1(bpf_skc_to_tcp6_sock, struc * trigger an explicit type generation here. */ BTF_TYPE_EMIT(struct tcp6_sock); - if (sk_fullsock(sk) && sk->sk_protocol == IPPROTO_TCP && + if (sk && sk_fullsock(sk) && sk->sk_protocol == IPPROTO_TCP && sk->sk_family == AF_INET6) return (unsigned long)sk;
@@@ -9541,7 -9942,7 +9943,7 @@@ const struct bpf_func_proto bpf_skc_to_
BPF_CALL_1(bpf_skc_to_tcp_sock, struct sock *, sk) { - if (sk_fullsock(sk) && sk->sk_protocol == IPPROTO_TCP) + if (sk && sk_fullsock(sk) && sk->sk_protocol == IPPROTO_TCP) return (unsigned long)sk;
return (unsigned long)NULL; @@@ -9559,12 -9960,12 +9961,12 @@@ const struct bpf_func_proto bpf_skc_to_ BPF_CALL_1(bpf_skc_to_tcp_timewait_sock, struct sock *, sk) { #ifdef CONFIG_INET - if (sk->sk_prot == &tcp_prot && sk->sk_state == TCP_TIME_WAIT) + if (sk && sk->sk_prot == &tcp_prot && sk->sk_state == TCP_TIME_WAIT) return (unsigned long)sk; #endif
#if IS_BUILTIN(CONFIG_IPV6) - if (sk->sk_prot == &tcpv6_prot && sk->sk_state == TCP_TIME_WAIT) + if (sk && sk->sk_prot == &tcpv6_prot && sk->sk_state == TCP_TIME_WAIT) return (unsigned long)sk; #endif
@@@ -9583,12 -9984,12 +9985,12 @@@ const struct bpf_func_proto bpf_skc_to_ BPF_CALL_1(bpf_skc_to_tcp_request_sock, struct sock *, sk) { #ifdef CONFIG_INET - if (sk->sk_prot == &tcp_prot && sk->sk_state == TCP_NEW_SYN_RECV) + if (sk && sk->sk_prot == &tcp_prot && sk->sk_state == TCP_NEW_SYN_RECV) return (unsigned long)sk; #endif
#if IS_BUILTIN(CONFIG_IPV6) - if (sk->sk_prot == &tcpv6_prot && sk->sk_state == TCP_NEW_SYN_RECV) + if (sk && sk->sk_prot == &tcpv6_prot && sk->sk_state == TCP_NEW_SYN_RECV) return (unsigned long)sk; #endif
@@@ -9610,7 -10011,7 +10012,7 @@@ BPF_CALL_1(bpf_skc_to_udp6_sock, struc * trigger an explicit type generation here. */ BTF_TYPE_EMIT(struct udp6_sock); - if (sk_fullsock(sk) && sk->sk_protocol == IPPROTO_UDP && + if (sk && sk_fullsock(sk) && sk->sk_protocol == IPPROTO_UDP && sk->sk_type == SOCK_DGRAM && sk->sk_family == AF_INET6) return (unsigned long)sk;
diff --combined net/core/skbuff.c index 4b02a527ee38,e0774471f56d..e038026d1e3b --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@@ -895,9 -895,6 +895,6 @@@ void __kfree_skb_defer(struct sk_buff *
void napi_consume_skb(struct sk_buff *skb, int budget) { - if (unlikely(!skb)) - return; - /* Zero budget indicate non-NAPI context called us, like netpoll */ if (unlikely(!budget)) { dev_consume_skb_any(skb); @@@ -2725,20 -2722,19 +2722,20 @@@ EXPORT_SYMBOL(skb_checksum) /* Both of above in one bottle. */
__wsum skb_copy_and_csum_bits(const struct sk_buff *skb, int offset, - u8 *to, int len, __wsum csum) + u8 *to, int len) { int start = skb_headlen(skb); int i, copy = start - offset; struct sk_buff *frag_iter; int pos = 0; + __wsum csum = 0;
/* Copy header. */ if (copy > 0) { if (copy > len) copy = len; csum = csum_partial_copy_nocheck(skb->data + offset, to, - copy, csum); + copy); if ((len -= copy) == 0) return csum; offset += copy; @@@ -2768,7 -2764,7 +2765,7 @@@ vaddr = kmap_atomic(p); csum2 = csum_partial_copy_nocheck(vaddr + p_off, to + copied, - p_len, 0); + p_len); kunmap_atomic(vaddr); csum = csum_block_add(csum, csum2, pos); pos += p_len; @@@ -2794,7 -2790,7 +2791,7 @@@ copy = len; csum2 = skb_copy_and_csum_bits(frag_iter, offset - start, - to, copy, 0); + to, copy); csum = csum_block_add(csum, csum2, pos); if ((len -= copy) == 0) return csum; @@@ -3014,7 -3010,7 +3011,7 @@@ void skb_copy_and_csum_dev(const struc csum = 0; if (csstart != skb->len) csum = skb_copy_and_csum_bits(skb, csstart, to + csstart, - skb->len - csstart, 0); + skb->len - csstart);
if (skb->ip_summed == CHECKSUM_PARTIAL) { long csstuff = csstart + skb->csum_offset; @@@ -3935,7 -3931,7 +3932,7 @@@ normal skb_copy_and_csum_bits(head_skb, offset, skb_put(nskb, len), - len, 0); + len); SKB_GSO_CB(nskb)->csum_start = skb_headroom(nskb) + doffset; } else { @@@ -5956,8 -5952,7 +5953,7 @@@ static int pskb_carve_inside_nonlinear( size = SKB_WITH_OVERHEAD(ksize(data));
memcpy((struct skb_shared_info *)(data + size), - skb_shinfo(skb), offsetof(struct skb_shared_info, - frags[skb_shinfo(skb)->nr_frags])); + skb_shinfo(skb), offsetof(struct skb_shared_info, frags[0])); if (skb_orphan_frags(skb, gfp_mask)) { kfree(data); return -ENOMEM; diff --combined net/dsa/slave.c index 16e5f98d4882,2d52bfba110a..e7c1d62fde99 --- a/net/dsa/slave.c +++ b/net/dsa/slave.c @@@ -303,13 -303,36 +303,36 @@@ static int dsa_slave_port_attr_set(stru return ret; }
+ /* Must be called under rcu_read_lock() */ + static int + dsa_slave_vlan_check_for_8021q_uppers(struct net_device *slave, + const struct switchdev_obj_port_vlan *vlan) + { + struct net_device *upper_dev; + struct list_head *iter; + + netdev_for_each_upper_dev_rcu(slave, upper_dev, iter) { + u16 vid; + + if (!is_vlan_dev(upper_dev)) + continue; + + vid = vlan_dev_vlan_id(upper_dev); + if (vid >= vlan->vid_begin && vid <= vlan->vid_end) + return -EBUSY; + } + + return 0; + } + static int dsa_slave_vlan_add(struct net_device *dev, const struct switchdev_obj *obj, struct switchdev_trans *trans) { + struct net_device *master = dsa_slave_to_master(dev); struct dsa_port *dp = dsa_slave_to_port(dev); struct switchdev_obj_port_vlan vlan; - int err; + int vid, err;
if (obj->orig_dev != dev) return -EOPNOTSUPP; @@@ -319,6 -342,17 +342,17 @@@
vlan = *SWITCHDEV_OBJ_PORT_VLAN(obj);
+ /* Deny adding a bridge VLAN when there is already an 802.1Q upper with + * the same VID. + */ + if (trans->ph_prepare && br_vlan_enabled(dp->bridge_dev)) { + rcu_read_lock(); + err = dsa_slave_vlan_check_for_8021q_uppers(dev, &vlan); + rcu_read_unlock(); + if (err) + return err; + } + err = dsa_port_vlan_add(dp, &vlan, trans); if (err) return err; @@@ -333,6 -367,12 +367,12 @@@ if (err) return err;
+ for (vid = vlan.vid_begin; vid <= vlan.vid_end; vid++) { + err = vlan_vid_add(master, htons(ETH_P_8021Q), vid); + if (err) + return err; + } + return 0; }
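The dsa_slave_vlan_add() hunk above now mirrors each bridge VLAN onto the DSA master interface, so a VLAN-filtering master NIC keeps accepting tagged frames for those VIDs; dsa_slave_vlan_del() below drops the reference again. The symmetric pair, lightly condensed from the two hunks (sketch):

    /* add path: keep the master's VLAN filter in sync */
    for (vid = vlan.vid_begin; vid <= vlan.vid_end; vid++) {
            err = vlan_vid_add(master, htons(ETH_P_8021Q), vid);
            if (err)
                    return err;
    }

    /* del path: release what the add path took */
    for (vid = vlan.vid_begin; vid <= vlan.vid_end; vid++)
            vlan_vid_del(master, htons(ETH_P_8021Q), vid);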
@@@ -376,7 -416,10 +416,10 @@@ static int dsa_slave_port_obj_add(struc static int dsa_slave_vlan_del(struct net_device *dev, const struct switchdev_obj *obj) { + struct net_device *master = dsa_slave_to_master(dev); struct dsa_port *dp = dsa_slave_to_port(dev); + struct switchdev_obj_port_vlan *vlan; + int vid, err;
if (obj->orig_dev != dev) return -EOPNOTSUPP; @@@ -384,10 -427,19 +427,19 @@@ if (dsa_port_skip_vlan_configuration(dp)) return 0;
+ vlan = SWITCHDEV_OBJ_PORT_VLAN(obj); + /* Do not deprogram the CPU port as it may be shared with other user * ports which can be members of this VLAN as well. */ - return dsa_port_vlan_del(dp, SWITCHDEV_OBJ_PORT_VLAN(obj)); + err = dsa_port_vlan_del(dp, vlan); + if (err) + return err; + + for (vid = vlan->vid_begin; vid <= vlan->vid_end; vid++) + vlan_vid_del(master, htons(ETH_P_8021Q), vid); + + return 0; }
static int dsa_slave_port_obj_del(struct net_device *dev, @@@ -1232,64 -1284,66 +1284,66 @@@ static int dsa_slave_get_ts_info(struc static int dsa_slave_vlan_rx_add_vid(struct net_device *dev, __be16 proto, u16 vid) { + struct net_device *master = dsa_slave_to_master(dev); struct dsa_port *dp = dsa_slave_to_port(dev); - struct bridge_vlan_info info; + struct switchdev_obj_port_vlan vlan = { + .obj.id = SWITCHDEV_OBJ_ID_PORT_VLAN, + .vid_begin = vid, + .vid_end = vid, + /* This API only allows programming tagged, non-PVID VIDs */ + .flags = 0, + }; + struct switchdev_trans trans; int ret;
- /* Check for a possible bridge VLAN entry now since there is no - * need to emulate the switchdev prepare + commit phase. - */ - if (dp->bridge_dev) { - if (dsa_port_skip_vlan_configuration(dp)) - return 0; + /* User port... */ + trans.ph_prepare = true; + ret = dsa_port_vlan_add(dp, &vlan, &trans); + if (ret) + return ret;
- /* br_vlan_get_info() returns -EINVAL or -ENOENT if the - * device, respectively the VID is not found, returning - * 0 means success, which is a failure for us here. - */ - ret = br_vlan_get_info(dp->bridge_dev, vid, &info); - if (ret == 0) - return -EBUSY; - } + trans.ph_prepare = false; + ret = dsa_port_vlan_add(dp, &vlan, &trans); + if (ret) + return ret;
- ret = dsa_port_vid_add(dp, vid, 0); + /* And CPU port... */ + trans.ph_prepare = true; + ret = dsa_port_vlan_add(dp->cpu_dp, &vlan, &trans); if (ret) return ret;
- ret = dsa_port_vid_add(dp->cpu_dp, vid, 0); + trans.ph_prepare = false; + ret = dsa_port_vlan_add(dp->cpu_dp, &vlan, &trans); if (ret) return ret;
- return 0; + return vlan_vid_add(master, proto, vid); }
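With the bridge-VLAN conflict check moved into the notifier path further down, the 8021q callbacks above now drive the switchdev transaction themselves: a prepare pass in which the driver may validate and allocate, then a commit pass with the same object. The two-phase pattern in isolation (a sketch; error unwinding between the phases is driver-specific):

    struct switchdev_trans trans;

    trans.ph_prepare = true;                /* phase 1: validate only */
    err = dsa_port_vlan_add(dp, &vlan, &trans);
    if (err)
            return err;

    trans.ph_prepare = false;               /* phase 2: program the hardware */
    err = dsa_port_vlan_add(dp, &vlan, &trans);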
static int dsa_slave_vlan_rx_kill_vid(struct net_device *dev, __be16 proto, u16 vid) { + struct net_device *master = dsa_slave_to_master(dev); struct dsa_port *dp = dsa_slave_to_port(dev); - struct bridge_vlan_info info; - int ret; - - /* Check for a possible bridge VLAN entry now since there is no - * need to emulate the switchdev prepare + commit phase. - */ - if (dp->bridge_dev) { - if (dsa_port_skip_vlan_configuration(dp)) - return 0; - - /* br_vlan_get_info() returns -EINVAL or -ENOENT if the - * device, respectively the VID is not found, returning - * 0 means success, which is a failure for us here. - */ - ret = br_vlan_get_info(dp->bridge_dev, vid, &info); - if (ret == 0) - return -EBUSY; - } + struct switchdev_obj_port_vlan vlan = { + .vid_begin = vid, + .vid_end = vid, + /* This API only allows programming tagged, non-PVID VIDs */ + .flags = 0, + }; + int err;
/* Do not deprogram the CPU port as it may be shared with other user * ports which can be members of this VLAN as well. */ - return dsa_port_vid_del(dp, vid); + err = dsa_port_vlan_del(dp, &vlan); + if (err) + return err; + + vlan_vid_del(master, proto, vid); + + return 0; }
struct dsa_hw_port { @@@ -1784,7 -1838,7 +1838,7 @@@ int dsa_slave_create(struct dsa_port *p rtnl_lock(); ret = dsa_slave_change_mtu(slave_dev, ETH_DATA_LEN); rtnl_unlock(); - if (ret) + if (ret && ret != -EOPNOTSUPP) dev_warn(ds->dev, "nonfatal error %d setting MTU on port %d\n", ret, port->index);
@@@ -1792,34 -1846,23 +1846,35 @@@
ret = dsa_slave_phy_setup(slave_dev); if (ret) { - netdev_err(master, "error %d setting up slave PHY for %s\n", - ret, slave_dev->name); + netdev_err(slave_dev, + "error %d setting up PHY for tree %d, switch %d, port %d\n", + ret, ds->dst->index, ds->index, port->index); goto out_gcells; }
dsa_slave_notify(slave_dev, DSA_PORT_REGISTER);
- ret = register_netdev(slave_dev); + rtnl_lock(); + + ret = register_netdevice(slave_dev); if (ret) { netdev_err(master, "error %d registering interface %s\n", ret, slave_dev->name); + rtnl_unlock(); goto out_phy; }
+ ret = netdev_upper_dev_link(master, slave_dev, NULL); + + rtnl_unlock(); + + if (ret) + goto out_unregister; + return 0;
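dsa_slave_create() above switches from register_netdev() to register_netdevice() so that registration and the netdev_upper_dev_link() between master and slave run inside one rtnl section, and the explicit upper-dev link tells the rest of the stack that the slave device is stacked on its master; dsa_slave_destroy() below unlinks and unregisters in the reverse order under the same lock. Boiled down to the ordering (a sketch; the full error unwinding is in the hunk above):

    rtnl_lock();
    err = register_netdevice(slave_dev);
    if (!err)
            err = netdev_upper_dev_link(master, slave_dev, NULL);
    rtnl_unlock();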
+out_unregister: + unregister_netdev(slave_dev); out_phy: rtnl_lock(); phylink_disconnect_phy(p->dp->pl); @@@ -1836,18 -1879,16 +1891,18 @@@ out_free
void dsa_slave_destroy(struct net_device *slave_dev) { + struct net_device *master = dsa_slave_to_master(slave_dev); struct dsa_port *dp = dsa_slave_to_port(slave_dev); struct dsa_slave_priv *p = netdev_priv(slave_dev);
netif_carrier_off(slave_dev); rtnl_lock(); + netdev_upper_dev_unlink(master, slave_dev); + unregister_netdevice(slave_dev); phylink_disconnect_phy(dp->pl); rtnl_unlock();
dsa_slave_notify(slave_dev, DSA_PORT_UNREGISTER); - unregister_netdev(slave_dev); phylink_destroy(dp->pl); gro_cells_destroy(&p->gcells); free_percpu(p->stats64); @@@ -1880,9 -1921,9 +1935,9 @@@ static int dsa_slave_changeupper(struc return err; }
- static int dsa_slave_upper_vlan_check(struct net_device *dev, - struct netdev_notifier_changeupper_info * - info) + static int + dsa_prevent_bridging_8021q_upper(struct net_device *dev, + struct netdev_notifier_changeupper_info *info) { struct netlink_ext_ack *ext_ack; struct net_device *slave; @@@ -1912,14 -1953,56 +1967,56 @@@ return NOTIFY_DONE; }
+ static int + dsa_slave_check_8021q_upper(struct net_device *dev, + struct netdev_notifier_changeupper_info *info) + { + struct dsa_port *dp = dsa_slave_to_port(dev); + struct net_device *br = dp->bridge_dev; + struct bridge_vlan_info br_info; + struct netlink_ext_ack *extack; + int err = NOTIFY_DONE; + u16 vid; + + if (!br || !br_vlan_enabled(br)) + return NOTIFY_DONE; + + extack = netdev_notifier_info_to_extack(&info->info); + vid = vlan_dev_vlan_id(info->upper_dev); + + /* br_vlan_get_info() returns -EINVAL or -ENOENT if the + * device, respectively the VID is not found, returning + * 0 means success, which is a failure for us here. + */ + err = br_vlan_get_info(br, vid, &br_info); + if (err == 0) { + NL_SET_ERR_MSG_MOD(extack, + "This VLAN is already configured by the bridge"); + return notifier_from_errno(-EBUSY); + } + + return NOTIFY_DONE; + } + static int dsa_slave_netdevice_event(struct notifier_block *nb, unsigned long event, void *ptr) { struct net_device *dev = netdev_notifier_info_to_dev(ptr);
- if (event == NETDEV_CHANGEUPPER) { + switch (event) { + case NETDEV_PRECHANGEUPPER: { + struct netdev_notifier_changeupper_info *info = ptr; + + if (!dsa_slave_dev_check(dev)) + return dsa_prevent_bridging_8021q_upper(dev, ptr); + + if (is_vlan_dev(info->upper_dev)) + return dsa_slave_check_8021q_upper(dev, ptr); + break; + } + case NETDEV_CHANGEUPPER: if (!dsa_slave_dev_check(dev)) - return dsa_slave_upper_vlan_check(dev, ptr); + return NOTIFY_DONE;
return dsa_slave_changeupper(dev, ptr); } diff --combined net/ipv4/icmp.c index bdaaee52c41b,8f2e974a1e4d..f949af6d5cbd --- a/net/ipv4/icmp.c +++ b/net/ipv4/icmp.c @@@ -352,7 -352,7 +352,7 @@@ static int icmp_glue_bits(void *from, c
csum = skb_copy_and_csum_bits(icmp_param->skb, icmp_param->offset + offset, - to, len, 0); + to, len);
skb->csum = csum_block_add(skb->csum, csum, odd); if (icmp_pointers[icmp_param->data.icmph.type].error) @@@ -376,15 -376,15 +376,15 @@@ static void icmp_push_reply(struct icmp ip_flush_pending_frames(sk); } else if ((skb = skb_peek(&sk->sk_write_queue)) != NULL) { struct icmphdr *icmph = icmp_hdr(skb); - __wsum csum = 0; + __wsum csum; struct sk_buff *skb1;
+ csum = csum_partial_copy_nocheck((void *)&icmp_param->data, + (char *)icmph, + icmp_param->head_len); skb_queue_walk(&sk->sk_write_queue, skb1) { csum = csum_add(csum, skb1->csum); } - csum = csum_partial_copy_nocheck((void *)&icmp_param->data, - (char *)icmph, - icmp_param->head_len, csum); icmph->checksum = csum_fold(csum); skb->ip_summed = CHECKSUM_NONE; ip_push_pending_frames(sk, fl4); @@@ -690,9 -690,9 +690,9 @@@ void __icmp_send(struct sk_buff *skb_in rcu_read_unlock(); }
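The icmp_push_reply() hunk above shows the knock-on effect of the seedless csum_partial_copy_nocheck(): the ICMP header copy now has to run first, and the checksums of the queued fragments are folded in with csum_add() afterwards, where the queue walk previously produced the seed for the final copy. In outline (sketch, names as in the hunk):

    csum = csum_partial_copy_nocheck((void *)&icmp_param->data,
                                     (char *)icmph, icmp_param->head_len);
    skb_queue_walk(&sk->sk_write_queue, skb1)
            csum = csum_add(csum, skb1->csum);
    icmph->checksum = csum_fold(csum);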
- tos = icmp_pointers[type].error ? ((iph->tos & IPTOS_TOS_MASK) | + tos = icmp_pointers[type].error ? (RT_TOS(iph->tos) | IPTOS_PREC_INTERNETCONTROL) : - iph->tos; + iph->tos; mark = IP4_REPLY_MARK(net, skb_in->mark);
if (__ip_options_echo(net, &icmp_param.replyopts.opt.opt, skb_in, opt)) @@@ -784,7 -784,7 +784,7 @@@ EXPORT_SYMBOL(icmp_ndo_send)
static void icmp_socket_deliver(struct sk_buff *skb, u32 info) { - const struct iphdr *iph = (const struct iphdr *) skb->data; + const struct iphdr *iph = (const struct iphdr *)skb->data; const struct net_protocol *ipprot; int protocol = iph->protocol;
diff --combined net/ipv4/ip_output.c index 5131cf70672a,5fb536ff51f0..879b76ae4435 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@@ -74,7 -74,6 +74,7 @@@ #include <net/icmp.h> #include <net/checksum.h> #include <net/inetpeer.h> +#include <net/inet_ecn.h> #include <net/lwtunnel.h> #include <linux/bpf-cgroup.h> #include <linux/igmp.h> @@@ -143,7 -142,8 +143,8 @@@ static inline int ip_select_ttl(struct * */ int ip_build_and_send_pkt(struct sk_buff *skb, const struct sock *sk, - __be32 saddr, __be32 daddr, struct ip_options_rcu *opt) + __be32 saddr, __be32 daddr, struct ip_options_rcu *opt, + u8 tos) { struct inet_sock *inet = inet_sk(sk); struct rtable *rt = skb_rtable(skb); @@@ -156,7 -156,7 +157,7 @@@ iph = ip_hdr(skb); iph->version = 4; iph->ihl = 5; - iph->tos = inet->tos; + iph->tos = tos; iph->ttl = ip_select_ttl(inet, &rt->dst); iph->daddr = (opt && opt->opt.srr ? opt->opt.faddr : daddr); iph->saddr = saddr; @@@ -997,7 -997,7 +998,7 @@@ static int __ip_append_data(struct soc
fragheaderlen = sizeof(struct iphdr) + (opt ? opt->optlen : 0); maxfraglen = ((mtu - fragheaderlen) & ~7) + fragheaderlen; - maxnonfragsize = ip_sk_ignore_df(sk) ? 0xFFFF : mtu; + maxnonfragsize = ip_sk_ignore_df(sk) ? IP_MAX_MTU : mtu;
if (cork->length + length > maxnonfragsize - fragheaderlen) { ip_local_error(sk, EMSGSIZE, fl4->daddr, inet->inet_dport, @@@ -1127,7 -1127,7 +1128,7 @@@ alloc_new_skb if (fraggap) { skb->csum = skb_copy_and_csum_bits( skb_prev, maxfraglen, - data + transhdrlen, fraggap, 0); + data + transhdrlen, fraggap); skb_prev->csum = csum_sub(skb_prev->csum, skb->csum); data += fraggap; @@@ -1352,7 -1352,7 +1353,7 @@@ ssize_t ip_append_page(struct sock *sk if (cork->flags & IPCORK_OPT) opt = cork->opt;
- if (!(rt->dst.dev->features&NETIF_F_SG)) + if (!(rt->dst.dev->features & NETIF_F_SG)) return -EOPNOTSUPP;
hh_len = LL_RESERVED_SPACE(rt->dst.dev); @@@ -1412,7 -1412,7 +1413,7 @@@ skb->csum = skb_copy_and_csum_bits(skb_prev, maxfraglen, skb_transport_header(skb), - fraggap, 0); + fraggap); skb_prev->csum = csum_sub(skb_prev->csum, skb->csum); pskb_trim_unique(skb_prev, maxfraglen); @@@ -1537,7 -1537,7 +1538,7 @@@ struct sk_buff *__ip_make_skb(struct so ip_select_ident(net, skb, sk);
if (opt) { - iph->ihl += opt->optlen>>2; + iph->ihl += opt->optlen >> 2; ip_options_build(skb, opt, cork->addr, rt, 0); }
@@@ -1649,7 -1649,7 +1650,7 @@@ static int ip_reply_glue_bits(void *dpt { __wsum csum;
- csum = csum_partial_copy_nocheck(dptr+offset, to, len, 0); + csum = csum_partial_copy_nocheck(dptr+offset, to, len); skb->csum = csum_block_add(skb->csum, csum, odd); return 0; } @@@ -1704,7 -1704,7 +1705,7 @@@ void ip_send_unicast_reply(struct sock if (IS_ERR(rt)) return;
- inet_sk(sk)->tos = arg->tos; + inet_sk(sk)->tos = arg->tos & ~INET_ECN_MASK;
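The one-line hunk above hardens ip_send_unicast_reply(): arg->tos is derived from the packet being answered, and its two low bits carry ECN state that should not be reflected into a locally generated reply, so they are masked off. Illustrative breakdown (INET_ECN_MASK is 0x3):

    /* tos byte layout: DSCP in bits 7..2, ECN in bits 1..0 */
    inet_sk(sk)->tos = arg->tos & ~INET_ECN_MASK;   /* keep DSCP, clear ECN */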
sk->sk_protocol = ip_hdr(skb)->protocol; sk->sk_bound_dev_if = arg->bound_dev_if; diff --combined net/ipv4/raw.c index 355f3ca868af,1170653a89cd..7d26e0f8bdae --- a/net/ipv4/raw.c +++ b/net/ipv4/raw.c @@@ -260,11 -260,12 +260,12 @@@ static void raw_err(struct sock *sk, st err = EHOSTUNREACH; if (code > NR_ICMP_UNREACH) break; - err = icmp_err_convert[code].errno; - harderr = icmp_err_convert[code].fatal; if (code == ICMP_FRAG_NEEDED) { harderr = inet->pmtudisc != IP_PMTUDISC_DONT; err = EMSGSIZE; + } else { + err = icmp_err_convert[code].errno; + harderr = icmp_err_convert[code].fatal; } }
@@@ -478,7 -479,7 +479,7 @@@ static int raw_getfrag(void *from, cha skb->csum = csum_block_add( skb->csum, csum_partial_copy_nocheck(rfv->hdr.c + offset, - to, copy, 0), + to, copy), odd);
odd = 0; diff --combined net/ipv4/route.c index 58642b29a499,2c05b863ae43..d15a78b26dfa --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@@ -623,7 -623,7 +623,7 @@@ static inline u32 fnhe_hashfun(__be32 d u32 hval;
net_get_random_once(&fnhe_hashrnd, sizeof(fnhe_hashrnd)); - hval = jhash_1word((__force u32) daddr, fnhe_hashrnd); + hval = jhash_1word((__force u32)daddr, fnhe_hashrnd); return hash_32(hval, FNHE_HASH_SHIFT); }
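The route.c hunks just below, in __ip_do_redirect() and __ip_rt_update_pmtu(), share one fix: a fib_select_path() call is inserted between the lookup and FIB_RES_NHC(), so that on a multipath route the redirect or PMTU exception is attached to the nexthop the flow actually hashes to rather than to the first one. The shape of the fix, as in the redirect hunk (sketch):

    if (fib_lookup(net, fl4, &res, 0) == 0) {
            struct fib_nh_common *nhc;

            fib_select_path(net, &res, fl4, skb);   /* pick the flow's real path */
            nhc = FIB_RES_NHC(res);
            update_or_create_fnhe(nhc, fl4->daddr, new_gw, 0, false,
                                  jiffies + ip_rt_gc_timeout);
    }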
@@@ -786,10 -786,8 +786,10 @@@ static void __ip_do_redirect(struct rta neigh_event_send(n, NULL); } else { if (fib_lookup(net, fl4, &res, 0) == 0) { - struct fib_nh_common *nhc = FIB_RES_NHC(res); + struct fib_nh_common *nhc;
+ fib_select_path(net, &res, fl4, skb); + nhc = FIB_RES_NHC(res); update_or_create_fnhe(nhc, fl4->daddr, new_gw, 0, false, jiffies + ip_rt_gc_timeout); @@@ -1015,14 -1013,14 +1015,15 @@@ out: kfree_skb(skb) static void __ip_rt_update_pmtu(struct rtable *rt, struct flowi4 *fl4, u32 mtu) { struct dst_entry *dst = &rt->dst; + struct net *net = dev_net(dst->dev); - u32 old_mtu = ipv4_mtu(dst); struct fib_result res; bool lock = false; + u32 old_mtu;
if (ip_mtu_locked(dst)) return;
+ old_mtu = ipv4_mtu(dst); if (old_mtu < mtu) return;
@@@ -1036,11 -1034,9 +1037,11 @@@ return;
rcu_read_lock(); - if (fib_lookup(dev_net(dst->dev), fl4, &res, 0) == 0) { - struct fib_nh_common *nhc = FIB_RES_NHC(res); + if (fib_lookup(net, fl4, &res, 0) == 0) { + struct fib_nh_common *nhc;
+ fib_select_path(net, &res, fl4, NULL); + nhc = FIB_RES_NHC(res); update_or_create_fnhe(nhc, fl4->daddr, 0, mtu, lock, jiffies + ip_rt_mtu_expires); } @@@ -1066,7 -1062,7 +1067,7 @@@ static void ip_rt_update_pmtu(struct ds void ipv4_update_pmtu(struct sk_buff *skb, struct net *net, u32 mtu, int oif, u8 protocol) { - const struct iphdr *iph = (const struct iphdr *) skb->data; + const struct iphdr *iph = (const struct iphdr *)skb->data; struct flowi4 fl4; struct rtable *rt; u32 mark = IP4_REPLY_MARK(net, skb->mark); @@@ -1083,7 -1079,7 +1084,7 @@@ EXPORT_SYMBOL_GPL(ipv4_update_pmtu)
static void __ipv4_sk_update_pmtu(struct sk_buff *skb, struct sock *sk, u32 mtu) { - const struct iphdr *iph = (const struct iphdr *) skb->data; + const struct iphdr *iph = (const struct iphdr *)skb->data; struct flowi4 fl4; struct rtable *rt;
@@@ -1101,7 -1097,7 +1102,7 @@@
void ipv4_sk_update_pmtu(struct sk_buff *skb, struct sock *sk, u32 mtu) { - const struct iphdr *iph = (const struct iphdr *) skb->data; + const struct iphdr *iph = (const struct iphdr *)skb->data; struct flowi4 fl4; struct rtable *rt; struct dst_entry *odst = NULL; @@@ -1131,7 -1127,7 +1132,7 @@@ new = true; }
- __ip_rt_update_pmtu((struct rtable *) xfrm_dst_path(&rt->dst), &fl4, mtu); + __ip_rt_update_pmtu((struct rtable *)xfrm_dst_path(&rt->dst), &fl4, mtu);
if (!dst_check(&rt->dst, 0)) { if (new) @@@ -1156,7 -1152,7 +1157,7 @@@ EXPORT_SYMBOL_GPL(ipv4_sk_update_pmtu) void ipv4_redirect(struct sk_buff *skb, struct net *net, int oif, u8 protocol) { - const struct iphdr *iph = (const struct iphdr *) skb->data; + const struct iphdr *iph = (const struct iphdr *)skb->data; struct flowi4 fl4; struct rtable *rt;
@@@ -1172,7 -1168,7 +1173,7 @@@ EXPORT_SYMBOL_GPL(ipv4_redirect)
void ipv4_sk_redirect(struct sk_buff *skb, struct sock *sk) { - const struct iphdr *iph = (const struct iphdr *) skb->data; + const struct iphdr *iph = (const struct iphdr *)skb->data; struct flowi4 fl4; struct rtable *rt; struct net *net = sock_net(sk); @@@ -1312,7 -1308,7 +1313,7 @@@ static unsigned int ipv4_default_advmss
static unsigned int ipv4_mtu(const struct dst_entry *dst) { - const struct rtable *rt = (const struct rtable *) dst; + const struct rtable *rt = (const struct rtable *)dst; unsigned int mtu = rt->rt_pmtu;
if (!mtu || time_after_eq(jiffies, rt->dst.expires)) @@@ -2152,7 -2148,6 +2153,7 @@@ static int ip_route_input_slow(struct s fl4.daddr = daddr; fl4.saddr = saddr; fl4.flowi4_uid = sock_net_uid(net, NULL); + fl4.flowi4_multipath_hash = 0;
if (fib4_rules_early_flow_dissect(net, skb, &fl4, &_flkeys)) { flkeys = &_flkeys; @@@ -2673,6 -2668,8 +2674,6 @@@ struct rtable *ip_route_output_key_hash fib_select_path(net, res, fl4, skb);
dev_out = FIB_RES_DEV(*res); - fl4->flowi4_oif = dev_out->ifindex; -
make_route: rth = __mkroute_output(res, fl4, orig_oif, dev_out, flags); diff --combined net/ipv6/ip6_fib.c index 4a664ad4f4d4,44d68ed70f24..141c0a4c569a --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@@ -1812,10 -1812,14 +1812,14 @@@ static struct fib6_node *fib6_repair_tr
children = 0; child = NULL; - if (fn_r) - child = fn_r, children |= 1; - if (fn_l) - child = fn_l, children |= 2; + if (fn_r) { + child = fn_r; + children |= 1; + } + if (fn_l) { + child = fn_l; + children |= 2; + }
if (children == 3 || FIB6_SUBTREE(fn) #ifdef CONFIG_IPV6_SUBTREES @@@ -1993,19 -1997,14 +1997,19 @@@ static void fib6_del_route(struct fib6_ /* Need to own table->tb6_lock */ int fib6_del(struct fib6_info *rt, struct nl_info *info) { - struct fib6_node *fn = rcu_dereference_protected(rt->fib6_node, - lockdep_is_held(&rt->fib6_table->tb6_lock)); - struct fib6_table *table = rt->fib6_table; struct net *net = info->nl_net; struct fib6_info __rcu **rtp; struct fib6_info __rcu **rtp_next; + struct fib6_table *table; + struct fib6_node *fn; + + if (rt == net->ipv6.fib6_null_entry) + return -ENOENT;
- if (!fn || rt == net->ipv6.fib6_null_entry) + table = rt->fib6_table; + fn = rcu_dereference_protected(rt->fib6_node, + lockdep_is_held(&table->tb6_lock)); + if (!fn) return -ENOENT;
WARN_ON(!(fn->fn_flags & RTN_RTINFO)); diff --combined net/ipv6/ip6_output.c index 2689498157d1,a2a65e327f49..9dd0a847c576 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@@ -1492,7 -1492,7 +1492,7 @@@ emsgsize * Otherwise, we need to reserve fragment header and * fragment alignment (= 8-15 octects, in total). * - * Note that we may need to "move" the data from the tail of + * of the buffer to the new fragment when we split * the message. * @@@ -1615,7 -1615,7 +1615,7 @@@ alloc_new_skb if (fraggap) { skb->csum = skb_copy_and_csum_bits( skb_prev, maxfraglen, - data + transhdrlen, fraggap, 0); + data + transhdrlen, fraggap); skb_prev->csum = csum_sub(skb_prev->csum, skb->csum); data += fraggap; diff --combined net/ipv6/route.c index fb075d9545b9,e8ee20720fe0..bde6d48a9942 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@@ -4202,7 -4202,7 +4202,7 @@@ static struct fib6_info *rt6_add_route_ .fc_nlinfo.nl_net = net, };
- cfg.fc_table = l3mdev_fib_table(dev) ? : RT6_TABLE_INFO, + cfg.fc_table = l3mdev_fib_table(dev) ? : RT6_TABLE_INFO; cfg.fc_dst = *prefix; cfg.fc_gateway = *gwaddr;
@@@ -5284,9 -5284,10 +5284,10 @@@ static int ip6_route_multipath_del(stru { struct fib6_config r_cfg; struct rtnexthop *rtnh; + int last_err = 0; int remaining; int attrlen; - int err = 1, last_err = 0; + int err;
remaining = cfg->fc_mp_len; rtnh = (struct rtnexthop *)cfg->fc_mp; diff --combined net/mac80211/mlme.c index 2e400b0ff696,50a9b9025725..2489c6c64c2d --- a/net/mac80211/mlme.c +++ b/net/mac80211/mlme.c @@@ -2432,23 -2432,6 +2432,6 @@@ static void ieee80211_set_disassoc(stru sdata->encrypt_headroom = IEEE80211_ENCRYPT_HEADROOM; }
- void ieee80211_sta_rx_notify(struct ieee80211_sub_if_data *sdata, - struct ieee80211_hdr *hdr) - { - /* - * We can postpone the mgd.timer whenever receiving unicast frames - * from AP because we know that the connection is working both ways - * at that time. But multicast frames (and hence also beacons) must - * be ignored here, because we need to trigger the timer during - * data idle periods for sending the periodic probe request to the - * AP we're connected to. - */ - if (is_multicast_ether_addr(hdr->addr1)) - return; - - ieee80211_sta_reset_conn_monitor(sdata); - } - static void ieee80211_reset_ap_probe(struct ieee80211_sub_if_data *sdata) { struct ieee80211_if_managed *ifmgd = &sdata->u.mgd; @@@ -2521,21 -2504,13 +2504,13 @@@ void ieee80211_sta_tx_notify(struct iee { ieee80211_sta_tx_wmm_ac_notify(sdata, hdr, tx_time);
- if (!ieee80211_is_data(hdr->frame_control)) - return; - - if (ieee80211_is_any_nullfunc(hdr->frame_control) && - sdata->u.mgd.probe_send_count > 0) { - if (ack) - ieee80211_sta_reset_conn_monitor(sdata); - else - sdata->u.mgd.nullfunc_failed = true; - ieee80211_queue_work(&sdata->local->hw, &sdata->work); + if (!ieee80211_is_any_nullfunc(hdr->frame_control) || + !sdata->u.mgd.probe_send_count) return; - }
- if (ack) - ieee80211_sta_reset_conn_monitor(sdata); + if (!ack) + sdata->u.mgd.nullfunc_failed = true; + ieee80211_queue_work(&sdata->local->hw, &sdata->work); }
static void ieee80211_mlme_send_probe_req(struct ieee80211_sub_if_data *sdata, @@@ -3548,6 -3523,9 +3523,9 @@@ static bool ieee80211_assoc_success(str goto out; }
+ if (sdata->wdev.use_4addr) + drv_sta_set_4addr(local, sdata, &sta->sta, true); + mutex_unlock(&sdata->local->sta_mtx);
/* @@@ -3605,8 -3583,8 +3583,8 @@@ * Start timer to probe the connection to the AP now. * Also start the timer that will detect beacon loss. */ - ieee80211_sta_rx_notify(sdata, (struct ieee80211_hdr *)mgmt); ieee80211_sta_reset_beacon_monitor(sdata); + ieee80211_sta_reset_conn_monitor(sdata);
ret = true; out: @@@ -4577,10 -4555,26 +4555,26 @@@ static void ieee80211_sta_conn_mon_time from_timer(sdata, t, u.mgd.conn_mon_timer); struct ieee80211_if_managed *ifmgd = &sdata->u.mgd; struct ieee80211_local *local = sdata->local; + struct sta_info *sta; + unsigned long timeout;
if (sdata->vif.csa_active && !ifmgd->csa_waiting_bcn) return;
+ sta = sta_info_get(sdata, ifmgd->bssid); + if (!sta) + return; + + timeout = sta->status_stats.last_ack; + if (time_before(sta->status_stats.last_ack, sta->rx_stats.last_rx)) + timeout = sta->rx_stats.last_rx; + timeout += IEEE80211_CONNECTION_IDLE_TIME; + + if (time_is_before_jiffies(timeout)) { + mod_timer(&ifmgd->conn_mon_timer, round_jiffies_up(timeout)); + return; + } + ieee80211_queue_work(&local->hw, &ifmgd->monitor_work); }
@@@ -4861,7 -4855,6 +4855,7 @@@ static int ieee80211_prep_channel(struc struct ieee80211_supported_band *sband; struct cfg80211_chan_def chandef; bool is_6ghz = cbss->channel->band == NL80211_BAND_6GHZ; + bool is_5ghz = cbss->channel->band == NL80211_BAND_5GHZ; struct ieee80211_bss *bss = (void *)cbss->priv; int ret; u32 i; @@@ -4880,7 -4873,7 +4874,7 @@@ ifmgd->flags |= IEEE80211_STA_DISABLE_HE; }
- if (!sband->vht_cap.vht_supported && !is_6ghz) { + if (!sband->vht_cap.vht_supported && is_5ghz) { ifmgd->flags |= IEEE80211_STA_DISABLE_VHT; ifmgd->flags |= IEEE80211_STA_DISABLE_HE; } diff --combined net/mac80211/rx.c index a959ebf56852,b04d6e01a346..7f88a1b2215c --- a/net/mac80211/rx.c +++ b/net/mac80211/rx.c @@@ -451,8 -451,7 +451,8 @@@ ieee80211_add_rx_radiotap_header(struc else if (status->bw == RATE_INFO_BW_5) channel_flags |= IEEE80211_CHAN_QUARTER;
- if (status->band == NL80211_BAND_5GHZ) + if (status->band == NL80211_BAND_5GHZ || + status->band == NL80211_BAND_6GHZ) channel_flags |= IEEE80211_CHAN_OFDM | IEEE80211_CHAN_5GHZ; else if (status->encoding != RX_ENC_LEGACY) channel_flags |= IEEE80211_CHAN_DYN | IEEE80211_CHAN_2GHZ; @@@ -1812,9 -1811,6 +1812,6 @@@ ieee80211_rx_h_sta_process(struct ieee8 sta->rx_stats.last_rate = sta_stats_encode_rate(status); }
- if (rx->sdata->vif.type == NL80211_IFTYPE_STATION) - ieee80211_sta_rx_notify(rx->sdata, hdr); - sta->rx_stats.fragments++;
u64_stats_update_begin(&rx->sta->rx_stats.syncp); @@@ -2900,7 -2896,7 +2897,7 @@@ ieee80211_rx_h_mesh_fwding(struct ieee8 fwd_hdr->frame_control &= ~cpu_to_le16(IEEE80211_FCTL_RETRY); info = IEEE80211_SKB_CB(fwd_skb); memset(info, 0, sizeof(*info)); - info->flags |= IEEE80211_TX_INTFL_NEED_TXPROCESSING; + info->control.flags |= IEEE80211_TX_INTCFL_NEED_TXPROCESSING; info->control.vif = &rx->sdata->vif; info->control.jiffies = jiffies; if (is_multicast_ether_addr(fwd_hdr->addr1)) { @@@ -4149,7 -4145,6 +4146,6 @@@ void ieee80211_check_fast_rx(struct sta fastrx.sa_offs = offsetof(struct ieee80211_hdr, addr2); fastrx.expected_ds_bits = 0; } else { - fastrx.sta_notify = sdata->u.mgd.probe_send_count > 0; fastrx.da_offs = offsetof(struct ieee80211_hdr, addr1); fastrx.sa_offs = offsetof(struct ieee80211_hdr, addr3); fastrx.expected_ds_bits = @@@ -4379,11 -4374,6 +4375,6 @@@ static bool ieee80211_invoke_fast_rx(st pskb_trim(skb, skb->len - fast_rx->icv_len)) goto drop;
- if (unlikely(fast_rx->sta_notify)) { - ieee80211_sta_rx_notify(rx->sdata, hdr); - fast_rx->sta_notify = false; - } - /* statistics part of ieee80211_rx_h_sta_process() */ if (!(status->flag & RX_FLAG_NO_SIGNAL_VAL)) { stats->last_signal = status->signal; diff --combined net/mac80211/vht.c index d1b64d0751f2,7e601d067d53..fb0e3a657d2d --- a/net/mac80211/vht.c +++ b/net/mac80211/vht.c @@@ -168,7 -168,10 +168,7 @@@ ieee80211_vht_cap_ie_to_sta_vht_cap(str /* take some capabilities as-is */ cap_info = le32_to_cpu(vht_cap_ie->vht_cap_info); vht_cap->cap = cap_info; - vht_cap->cap &= IEEE80211_VHT_CAP_MAX_MPDU_LENGTH_3895 | - IEEE80211_VHT_CAP_MAX_MPDU_LENGTH_7991 | - IEEE80211_VHT_CAP_MAX_MPDU_LENGTH_11454 | - IEEE80211_VHT_CAP_RXLDPC | + vht_cap->cap &= IEEE80211_VHT_CAP_RXLDPC | IEEE80211_VHT_CAP_VHT_TXOP_PS | IEEE80211_VHT_CAP_HTC_VHT | IEEE80211_VHT_CAP_MAX_A_MPDU_LENGTH_EXPONENT_MASK | @@@ -177,9 -180,6 +177,9 @@@ IEEE80211_VHT_CAP_RX_ANTENNA_PATTERN | IEEE80211_VHT_CAP_TX_ANTENNA_PATTERN;
+ vht_cap->cap |= min_t(u32, cap_info & IEEE80211_VHT_CAP_MAX_MPDU_MASK, + own_cap.cap & IEEE80211_VHT_CAP_MAX_MPDU_MASK); + /* and some based on our own capabilities */ switch (own_cap.cap & IEEE80211_VHT_CAP_SUPP_CHAN_WIDTH_MASK) { case IEEE80211_VHT_CAP_SUPP_CHAN_WIDTH_160MHZ: @@@ -315,10 -315,6 +315,6 @@@
sta->sta.bandwidth = ieee80211_sta_cur_vht_bw(sta);
- /* If HT IE reported 3839 bytes only, stay with that size. */ - if (sta->sta.max_amsdu_len == IEEE80211_MAX_MPDU_LEN_HT_3839) - return; - switch (vht_cap->cap & IEEE80211_VHT_CAP_MAX_MPDU_MASK) { case IEEE80211_VHT_CAP_MAX_MPDU_LENGTH_11454: sta->sta.max_amsdu_len = IEEE80211_MAX_MPDU_LEN_VHT_11454; diff --combined net/mptcp/pm_netlink.c index 770da3627848,6947f4fee6b9..b4a9624d7bf2 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@@ -23,8 -23,6 +23,6 @@@ static int pm_nl_pernet_id
struct mptcp_pm_addr_entry { struct list_head list; - unsigned int flags; - int ifindex; struct mptcp_addr_info addr; struct rcu_head rcu; }; @@@ -66,16 -64,6 +64,16 @@@ static bool addresses_equal(const struc return a->port == b->port; }
+static bool address_zero(const struct mptcp_addr_info *addr) +{ + struct mptcp_addr_info zero; + + memset(&zero, 0, sizeof(zero)); + zero.family = addr->family; + + return addresses_equal(addr, &zero, false); +} + static void local_address(const struct sock_common *skc, struct mptcp_addr_info *addr) { @@@ -129,7 -117,7 +127,7 @@@ select_local_address(const struct pm_nl rcu_read_lock(); spin_lock_bh(&msk->join_list_lock); list_for_each_entry_rcu(entry, &pernet->local_addr_list, list) { - if (!(entry->flags & MPTCP_PM_ADDR_FLAG_SUBFLOW)) + if (!(entry->addr.flags & MPTCP_PM_ADDR_FLAG_SUBFLOW)) continue;
/* avoid any address already in use by subflows and @@@ -160,7 -148,7 +158,7 @@@ select_signal_address(struct pm_nl_pern * can lead to additional addresses not being announced. */ list_for_each_entry_rcu(entry, &pernet->local_addr_list, list) { - if (!(entry->flags & MPTCP_PM_ADDR_FLAG_SIGNAL)) + if (!(entry->addr.flags & MPTCP_PM_ADDR_FLAG_SIGNAL)) continue; if (i++ == pos) { ret = entry; @@@ -181,9 -169,9 +179,9 @@@ static void check_work_pending(struct m
static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk) { + struct mptcp_addr_info remote = { 0 }; struct sock *sk = (struct sock *)msk; struct mptcp_pm_addr_entry *local; - struct mptcp_addr_info remote; struct pm_nl_pernet *pernet;
pernet = net_generic(sock_net((struct sock *)msk), pm_nl_pernet_id); @@@ -220,8 -208,7 +218,7 @@@ msk->pm.subflows++; check_work_pending(msk); spin_unlock_bh(&msk->pm.lock); - __mptcp_subflow_connect(sk, local->ifindex, - &local->addr, &remote); + __mptcp_subflow_connect(sk, &local->addr, &remote); spin_lock_bh(&msk->pm.lock); return; } @@@ -267,13 -254,13 +264,13 @@@ void mptcp_pm_nl_add_addr_received(stru local.family = remote.family;
spin_unlock_bh(&msk->pm.lock); - __mptcp_subflow_connect((struct sock *)msk, 0, &local, &remote); + __mptcp_subflow_connect((struct sock *)msk, &local, &remote); spin_lock_bh(&msk->pm.lock); }
static bool address_use_port(struct mptcp_pm_addr_entry *entry) { - return (entry->flags & + return (entry->addr.flags & (MPTCP_PM_ADDR_FLAG_SIGNAL | MPTCP_PM_ADDR_FLAG_SUBFLOW)) == MPTCP_PM_ADDR_FLAG_SIGNAL; } @@@ -303,9 -290,9 +300,9 @@@ static int mptcp_pm_nl_append_new_local goto out; }
- if (entry->flags & MPTCP_PM_ADDR_FLAG_SIGNAL) + if (entry->addr.flags & MPTCP_PM_ADDR_FLAG_SIGNAL) pernet->add_addr_signal_max++; - if (entry->flags & MPTCP_PM_ADDR_FLAG_SUBFLOW) + if (entry->addr.flags & MPTCP_PM_ADDR_FLAG_SUBFLOW) pernet->local_addr_max++;
entry->addr.id = pernet->next_id++; @@@ -333,13 -320,10 +330,13 @@@ int mptcp_pm_nl_get_local_id(struct mpt * addr */ local_address((struct sock_common *)msk, &msk_local); - local_address((struct sock_common *)msk, &skc_local); + local_address((struct sock_common *)skc, &skc_local); if (addresses_equal(&msk_local, &skc_local, false)) return 0;
+ if (address_zero(&skc_local)) + return 0; + pernet = net_generic(sock_net((struct sock *)msk), pm_nl_pernet_id);
rcu_read_lock(); @@@ -354,12 -338,13 +351,13 @@@ return ret;
/* address not found, add to local list */ - entry = kmalloc(sizeof(*entry), GFP_KERNEL); + entry = kmalloc(sizeof(*entry), GFP_ATOMIC); if (!entry) return -ENOMEM;
- entry->flags = 0; entry->addr = skc_local; + entry->addr.ifindex = 0; + entry->addr.flags = 0; ret = mptcp_pm_nl_append_new_local_addr(pernet, entry); if (ret < 0) kfree(entry); @@@ -397,8 -382,8 +395,8 @@@ mptcp_pm_addr_policy[MPTCP_PM_ADDR_ATTR [MPTCP_PM_ADDR_ATTR_FAMILY] = { .type = NLA_U16, }, [MPTCP_PM_ADDR_ATTR_ID] = { .type = NLA_U8, }, [MPTCP_PM_ADDR_ATTR_ADDR4] = { .type = NLA_U32, }, - [MPTCP_PM_ADDR_ATTR_ADDR6] = { .type = NLA_EXACT_LEN, - .len = sizeof(struct in6_addr), }, + [MPTCP_PM_ADDR_ATTR_ADDR6] = + NLA_POLICY_EXACT_LEN(sizeof(struct in6_addr)), [MPTCP_PM_ADDR_ATTR_PORT] = { .type = NLA_U16 }, [MPTCP_PM_ADDR_ATTR_FLAGS] = { .type = NLA_U32 }, [MPTCP_PM_ADDR_ATTR_IF_IDX] = { .type = NLA_S32 }, @@@ -473,14 -458,17 +471,17 @@@ static int mptcp_pm_parse_addr(struct n entry->addr.addr.s_addr = nla_get_in_addr(tb[addr_addr]);
skip_family: - if (tb[MPTCP_PM_ADDR_ATTR_IF_IDX]) - entry->ifindex = nla_get_s32(tb[MPTCP_PM_ADDR_ATTR_IF_IDX]); + if (tb[MPTCP_PM_ADDR_ATTR_IF_IDX]) { + u32 val = nla_get_s32(tb[MPTCP_PM_ADDR_ATTR_IF_IDX]); + + entry->addr.ifindex = val; + }
if (tb[MPTCP_PM_ADDR_ATTR_ID]) entry->addr.id = nla_get_u8(tb[MPTCP_PM_ADDR_ATTR_ID]);
if (tb[MPTCP_PM_ADDR_ATTR_FLAGS]) - entry->flags = nla_get_u32(tb[MPTCP_PM_ADDR_ATTR_FLAGS]); + entry->addr.flags = nla_get_u32(tb[MPTCP_PM_ADDR_ATTR_FLAGS]);
return 0; } @@@ -548,9 -536,9 +549,9 @@@ static int mptcp_nl_cmd_del_addr(struc ret = -EINVAL; goto out; } - if (entry->flags & MPTCP_PM_ADDR_FLAG_SIGNAL) + if (entry->addr.flags & MPTCP_PM_ADDR_FLAG_SIGNAL) pernet->add_addr_signal_max--; - if (entry->flags & MPTCP_PM_ADDR_FLAG_SUBFLOW) + if (entry->addr.flags & MPTCP_PM_ADDR_FLAG_SUBFLOW) pernet->local_addr_max--;
pernet->addrs--; @@@ -606,10 -594,10 +607,10 @@@ static int mptcp_nl_fill_addr(struct sk goto nla_put_failure; if (nla_put_u8(skb, MPTCP_PM_ADDR_ATTR_ID, addr->id)) goto nla_put_failure; - if (nla_put_u32(skb, MPTCP_PM_ADDR_ATTR_FLAGS, entry->flags)) + if (nla_put_u32(skb, MPTCP_PM_ADDR_ATTR_FLAGS, entry->addr.flags)) goto nla_put_failure; - if (entry->ifindex && - nla_put_s32(skb, MPTCP_PM_ADDR_ATTR_IF_IDX, entry->ifindex)) + if (entry->addr.ifindex && + nla_put_s32(skb, MPTCP_PM_ADDR_ATTR_IF_IDX, entry->addr.ifindex)) goto nla_put_failure;
if (addr->family == AF_INET && diff --combined net/mptcp/subflow.c index 9ead43f79023,34d6230df017..141d555b7bd2 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@@ -20,6 -20,7 +20,7 @@@ #include <net/ip6_route.h> #endif #include <net/mptcp.h> + #include <uapi/linux/mptcp.h> #include "protocol.h" #include "mib.h"
@@@ -434,6 -435,7 +435,7 @@@ static void mptcp_sock_destruct(struct sock_orphan(sk); }
+ skb_rbtree_purge(&mptcp_sk(sk)->out_of_order_queue); mptcp_token_destroy(mptcp_sk(sk)); inet_sock_destruct(sk); } @@@ -804,16 -806,25 +806,25 @@@ validate_seq return MAPPING_OK; }
- static int subflow_read_actor(read_descriptor_t *desc, - struct sk_buff *skb, - unsigned int offset, size_t len) + static void mptcp_subflow_discard_data(struct sock *ssk, struct sk_buff *skb, + u64 limit) { - size_t copy_len = min(desc->count, len); - - desc->count -= copy_len; - - pr_debug("flushed %zu bytes, %zu left", copy_len, desc->count); - return copy_len; + struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk); + bool fin = TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN; + u32 incr; + + incr = limit >= skb->len ? skb->len + fin : limit; + + pr_debug("discarding=%d len=%d seq=%d", incr, skb->len, + subflow->map_subflow_seq); + MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_DUPDATA); + tcp_sk(ssk)->copied_seq += incr; + if (!before(tcp_sk(ssk)->copied_seq, TCP_SKB_CB(skb)->end_seq)) + sk_eat_skb(ssk, skb); + if (mptcp_subflow_get_map_offset(subflow) >= subflow->map_data_len) + subflow->map_valid = 0; + if (incr) + tcp_cleanup_rbuf(ssk, incr); }
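mptcp_subflow_discard_data() above replaces the old subflow_read_actor()/tcp_read_sock() round trip: duplicate bytes are dropped in place by advancing copied_seq, eating fully consumed skbs and trimming the receive buffer, instead of re-entering the TCP receive path just to throw the data away. Caller-side sketch, matching its use further down in subflow_check_data_avail():

    /* the mapping says this range was already ACKed at the MPTCP
     * level: drop the duplicate bytes in place
     */
    if (before64(ack_seq, old_ack))
            mptcp_subflow_discard_data(ssk, skb, old_ack - ack_seq);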
static bool subflow_check_data_avail(struct sock *ssk) @@@ -825,13 -836,13 +836,13 @@@
pr_debug("msk=%p ssk=%p data_avail=%d skb=%p", subflow->conn, ssk, subflow->data_avail, skb_peek(&ssk->sk_receive_queue)); + if (!skb_peek(&ssk->sk_receive_queue)) + subflow->data_avail = 0; if (subflow->data_avail) return true;
msk = mptcp_sk(subflow->conn); for (;;) { - u32 map_remaining; - size_t delta; u64 ack_seq; u64 old_ack;
@@@ -849,6 -860,7 +860,7 @@@ subflow->map_data_len = skb->len; subflow->map_subflow_seq = tcp_sk(ssk)->copied_seq - subflow->ssn_offset; + subflow->data_avail = MPTCP_SUBFLOW_DATA_AVAIL; return true; }
@@@ -876,42 -888,18 +888,18 @@@ ack_seq = mptcp_subflow_get_mapped_dsn(subflow); pr_debug("msk ack_seq=%llx subflow ack_seq=%llx", old_ack, ack_seq); - if (ack_seq == old_ack) + if (ack_seq == old_ack) { + subflow->data_avail = MPTCP_SUBFLOW_DATA_AVAIL; break; + } else if (after64(ack_seq, old_ack)) { + subflow->data_avail = MPTCP_SUBFLOW_OOO_DATA; + break; + }
/* only accept in-sequence mapping. Old values are spurious - * retransmission; we can hit "future" values on active backup - * subflow switch, we relay on retransmissions to get - * in-sequence data. - * Cuncurrent subflows support will require subflow data - * reordering + * retransmission */ - map_remaining = subflow->map_data_len - - mptcp_subflow_get_map_offset(subflow); - if (before64(ack_seq, old_ack)) - delta = min_t(size_t, old_ack - ack_seq, map_remaining); - else - delta = min_t(size_t, ack_seq - old_ack, map_remaining); - - /* discard mapped data */ - pr_debug("discarding %zu bytes, current map len=%d", delta, - map_remaining); - if (delta) { - read_descriptor_t desc = { - .count = delta, - }; - int ret; - - ret = tcp_read_sock(ssk, &desc, subflow_read_actor); - if (ret < 0) { - ssk->sk_err = -ret; - goto fatal; - } - if (ret < delta) - return false; - if (delta == map_remaining) - subflow->map_valid = 0; - } + mptcp_subflow_discard_data(ssk, skb, old_ack - ack_seq); } return true;
@@@ -922,13 -910,13 +910,13 @@@ fatal ssk->sk_error_report(ssk); tcp_set_state(ssk, TCP_CLOSE); tcp_send_active_reset(ssk, GFP_ATOMIC); + subflow->data_avail = 0; return false; }
bool mptcp_subflow_data_available(struct sock *sk) { struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk); - struct sk_buff *skb;
/* check if current mapping is still valid */ if (subflow->map_valid && @@@ -941,15 -929,7 +929,7 @@@ subflow->map_data_len); }
- if (!subflow_check_data_avail(sk)) { - subflow->data_avail = 0; - return false; - } - - skb = skb_peek(&sk->sk_receive_queue); - subflow->data_avail = skb && - before(tcp_sk(sk)->copied_seq, TCP_SKB_CB(skb)->end_seq); - return subflow->data_avail; + return subflow_check_data_avail(sk); }
/* If ssk has an mptcp parent socket, use the mptcp rcvbuf occupancy, @@@ -996,8 -976,10 +976,10 @@@ static void subflow_write_space(struct struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk); struct sock *parent = subflow->conn;
- sk_stream_write_space(sk); - if (sk_stream_is_writeable(sk)) { + if (!sk_stream_is_writeable(sk)) + return; + + if (sk_stream_is_writeable(parent)) { set_bit(MPTCP_SEND_SPACE, &mptcp_sk(parent)->flags); smp_mb__after_atomic(); /* set SEND_SPACE before sk_stream_write_space clears NOSPACE */ @@@ -1056,14 -1038,12 +1038,13 @@@ static void mptcp_info2sockaddr(const s #endif }
- int __mptcp_subflow_connect(struct sock *sk, int ifindex, - const struct mptcp_addr_info *loc, + int __mptcp_subflow_connect(struct sock *sk, const struct mptcp_addr_info *loc, const struct mptcp_addr_info *remote) { struct mptcp_sock *msk = mptcp_sk(sk); struct mptcp_subflow_context *subflow; struct sockaddr_storage addr; + int remote_id = remote->id; int local_id = loc->id; struct socket *sf; struct sock *ssk; @@@ -1102,19 -1082,18 +1083,19 @@@ if (loc->family == AF_INET6) addrlen = sizeof(struct sockaddr_in6); #endif - ssk->sk_bound_dev_if = ifindex; + ssk->sk_bound_dev_if = loc->ifindex; err = kernel_bind(sf, (struct sockaddr *)&addr, addrlen); if (err) goto failed;
mptcp_crypto_key_sha(subflow->remote_key, &remote_token, NULL); - pr_debug("msk=%p remote_token=%u local_id=%d", msk, remote_token, - local_id); + pr_debug("msk=%p remote_token=%u local_id=%d remote_id=%d", msk, + remote_token, local_id, remote_id); subflow->remote_token = remote_token; subflow->local_id = local_id; + subflow->remote_id = remote_id; subflow->request_join = 1; - subflow->request_bkup = 1; + subflow->request_bkup = !!(loc->flags & MPTCP_PM_ADDR_FLAG_BACKUP); mptcp_info2sockaddr(remote, &addr);
err = kernel_connect(sf, (struct sockaddr *)&addr, addrlen, O_NONBLOCK); @@@ -1349,7 -1328,6 +1330,7 @@@ static void subflow_ulp_clone(const str new_ctx->fully_established = 1; new_ctx->backup = subflow_req->backup; new_ctx->local_id = subflow_req->local_id; + new_ctx->remote_id = subflow_req->remote_id; new_ctx->token = subflow_req->token; new_ctx->thmac = subflow_req->thmac; } diff --combined net/netfilter/nf_conntrack_netlink.c index c3a4214dc958,89d99f6dfd0a..3d0fd33be018 --- a/net/netfilter/nf_conntrack_netlink.c +++ b/net/netfilter/nf_conntrack_netlink.c @@@ -851,6 -851,7 +851,6 @@@ static int ctnetlink_done(struct netlin }
struct ctnetlink_filter { - u_int32_t cta_flags; u8 family;
u_int32_t orig_flags; @@@ -905,6 -906,10 +905,6 @@@ static int ctnetlink_parse_tuple_filter struct nf_conntrack_zone *zone, u_int32_t flags);
-/* applied on filters */ -#define CTA_FILTER_F_CTA_MARK (1 << 0) -#define CTA_FILTER_F_CTA_MARK_MASK (1 << 1) - static struct ctnetlink_filter * ctnetlink_alloc_filter(const struct nlattr * const cda[], u8 family) { @@@ -925,10 -930,14 +925,10 @@@ #ifdef CONFIG_NF_CONNTRACK_MARK if (cda[CTA_MARK]) { filter->mark.val = ntohl(nla_get_be32(cda[CTA_MARK])); - filter->cta_flags |= CTA_FILTER_FLAG(CTA_MARK); - - if (cda[CTA_MARK_MASK]) { + if (cda[CTA_MARK_MASK]) filter->mark.mask = ntohl(nla_get_be32(cda[CTA_MARK_MASK])); - filter->cta_flags |= CTA_FILTER_FLAG(CTA_MARK_MASK); - } else { + else filter->mark.mask = 0xffffffff; - } } else if (cda[CTA_MARK_MASK]) { err = -EINVAL; goto err_filter; @@@ -1108,7 -1117,11 +1108,7 @@@ static int ctnetlink_filter_match(struc }
#ifdef CONFIG_NF_CONNTRACK_MARK - if ((filter->cta_flags & CTA_FILTER_FLAG(CTA_MARK_MASK)) && - (ct->mark & filter->mark.mask) != filter->mark.val) - goto ignore_entry; - else if ((filter->cta_flags & CTA_FILTER_FLAG(CTA_MARK)) && - ct->mark != filter->mark.val) + if ((ct->mark & filter->mark.mask) != filter->mark.val) goto ignore_entry; #endif
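The conntrack hunks above delete the CTA_FILTER_F_CTA_MARK* flag tracking: ctnetlink_alloc_filter() now defaults the mask to 0xffffffff when userspace supplies only CTA_MARK, so the single masked comparison covers the exact-match case as well. Equivalence sketch:

    /* with mask == 0xffffffff the masked test degenerates to an
     * exact test: (mark & ~0U) != val  <=>  mark != val
     */
    if ((ct->mark & filter->mark.mask) != filter->mark.val)
            goto ignore_entry;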
@@@ -1391,8 -1404,7 +1391,8 @@@ ctnetlink_parse_tuple_filter(const stru if (err < 0) return err;
- + if (l3num != NFPROTO_IPV4 && l3num != NFPROTO_IPV6) + return -EOPNOTSUPP; tuple->src.l3num = l3num;
if (flags & CTA_FILTER_FLAG(CTA_IP_DST) || @@@ -2497,7 -2509,6 +2497,6 @@@ ctnetlink_ct_stat_cpu_fill_info(struct
if (nla_put_be32(skb, CTA_STATS_FOUND, htonl(st->found)) || nla_put_be32(skb, CTA_STATS_INVALID, htonl(st->invalid)) || - nla_put_be32(skb, CTA_STATS_IGNORE, htonl(st->ignore)) || nla_put_be32(skb, CTA_STATS_INSERT, htonl(st->insert)) || nla_put_be32(skb, CTA_STATS_INSERT_FAILED, htonl(st->insert_failed)) || @@@ -2505,7 -2516,9 +2504,9 @@@ nla_put_be32(skb, CTA_STATS_EARLY_DROP, htonl(st->early_drop)) || nla_put_be32(skb, CTA_STATS_ERROR, htonl(st->error)) || nla_put_be32(skb, CTA_STATS_SEARCH_RESTART, - htonl(st->search_restart))) + htonl(st->search_restart)) || + nla_put_be32(skb, CTA_STATS_CLASH_RESOLVE, + htonl(st->clash_resolve))) goto nla_put_failure;
nlmsg_end(skb, nlh); diff --combined net/netfilter/nf_tables_api.c index 4603b667973a,84c0c1aaae99..97fb6f776114 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@@ -650,6 -650,8 +650,8 @@@ static const struct nla_policy nft_tabl .len = NFT_TABLE_MAXNAMELEN - 1 }, [NFTA_TABLE_FLAGS] = { .type = NLA_U32 }, [NFTA_TABLE_HANDLE] = { .type = NLA_U64 }, + [NFTA_TABLE_USERDATA] = { .type = NLA_BINARY, + .len = NFT_USERDATA_MAXLEN } };
static int nf_tables_fill_table_info(struct sk_buff *skb, struct net *net, @@@ -676,6 -678,11 +678,11 @@@ NFTA_TABLE_PAD)) goto nla_put_failure;
+ if (table->udata) { + if (nla_put(skb, NFTA_TABLE_USERDATA, table->udlen, table->udata)) + goto nla_put_failure; + } + nlmsg_end(skb, nlh); return 0;
@@@ -684,18 -691,6 +691,18 @@@ nla_put_failure return -1; }
+struct nftnl_skb_parms { + bool report; +}; +#define NFT_CB(skb) (*(struct nftnl_skb_parms*)&((skb)->cb)) + +static void nft_notify_enqueue(struct sk_buff *skb, bool report, + struct list_head *notify_list) +{ + NFT_CB(skb).report = report; + list_add_tail(&skb->list, notify_list); +} + static void nf_tables_table_notify(const struct nft_ctx *ctx, int event) { struct sk_buff *skb; @@@ -727,7 -722,8 +734,7 @@@ goto err; }
- nfnetlink_send(skb, ctx->net, ctx->portid, NFNLGRP_NFTABLES, - ctx->report, GFP_KERNEL); + nft_notify_enqueue(skb, ctx->report, &ctx->net->nft.notify_list); return; err: nfnetlink_set_err(ctx->net, ctx->portid, NFNLGRP_NFTABLES, -ENOBUFS); @@@ -988,8 -984,9 +995,9 @@@ static int nf_tables_newtable(struct ne int family = nfmsg->nfgen_family; const struct nlattr *attr; struct nft_table *table; - u32 flags = 0; struct nft_ctx ctx; + u32 flags = 0; + u16 udlen = 0; int err;
lockdep_assert_held(&net->nft.commit_mutex); @@@ -1025,6 -1022,16 +1033,16 @@@ if (table->name == NULL) goto err_strdup;
+ if (nla[NFTA_TABLE_USERDATA]) { + udlen = nla_len(nla[NFTA_TABLE_USERDATA]); + table->udata = kzalloc(udlen, GFP_KERNEL); + if (table->udata == NULL) + goto err_table_udata; + + nla_memcpy(table->udata, nla[NFTA_TABLE_USERDATA], udlen); + table->udlen = udlen; + } + err = rhltable_init(&table->chains_ht, &nft_chain_ht_params); if (err) goto err_chain_ht; @@@ -1047,6 -1054,8 +1065,8 @@@ err_trans: rhltable_destroy(&table->chains_ht); err_chain_ht: + kfree(table->udata); + err_table_udata: kfree(table->name); err_strdup: kfree(table); @@@ -1479,7 -1488,8 +1499,7 @@@ static void nf_tables_chain_notify(cons goto err; }
- nfnetlink_send(skb, ctx->net, ctx->portid, NFNLGRP_NFTABLES, - ctx->report, GFP_KERNEL); + nft_notify_enqueue(skb, ctx->report, &ctx->net->nft.notify_list); return; err: nfnetlink_set_err(ctx->net, ctx->portid, NFNLGRP_NFTABLES, -ENOBUFS); @@@ -2817,7 -2827,8 +2837,7 @@@ static void nf_tables_rule_notify(cons goto err; }
- nfnetlink_send(skb, ctx->net, ctx->portid, NFNLGRP_NFTABLES, - ctx->report, GFP_KERNEL); + nft_notify_enqueue(skb, ctx->report, &ctx->net->nft.notify_list); return; err: nfnetlink_set_err(ctx->net, ctx->portid, NFNLGRP_NFTABLES, -ENOBUFS); @@@ -3846,7 -3857,8 +3866,7 @@@ static void nf_tables_set_notify(const goto err; }
- nfnetlink_send(skb, ctx->net, portid, NFNLGRP_NFTABLES, ctx->report, - gfp_flags); + nft_notify_enqueue(skb, ctx->report, &ctx->net->nft.notify_list); return; err: nfnetlink_set_err(ctx->net, portid, NFNLGRP_NFTABLES, -ENOBUFS); @@@ -4967,7 -4979,8 +4987,7 @@@ static void nf_tables_setelem_notify(co goto err; }
- nfnetlink_send(skb, net, portid, NFNLGRP_NFTABLES, ctx->report, - GFP_KERNEL); + nft_notify_enqueue(skb, ctx->report, &ctx->net->nft.notify_list); return; err: nfnetlink_set_err(net, portid, NFNLGRP_NFTABLES, -ENOBUFS); @@@ -5737,6 -5750,8 +5757,8 @@@ static const struct nla_policy nft_obj_ [NFTA_OBJ_TYPE] = { .type = NLA_U32 }, [NFTA_OBJ_DATA] = { .type = NLA_NESTED }, [NFTA_OBJ_HANDLE] = { .type = NLA_U64}, + [NFTA_OBJ_USERDATA] = { .type = NLA_BINARY, + .len = NFT_USERDATA_MAXLEN }, };
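Tables and objects both gain opaque user data in this merge: the netlink policies accept an NLA_BINARY attribute capped at NFT_USERDATA_MAXLEN, and the new* paths further down store it with the usual allocate-then-copy idiom. A note on the policy semantics (sketch, mirroring the object hunks below):

    /* NLA_BINARY with .len sets an upper bound only; the actual
     * attribute length comes back from nla_len() at store time
     */
    udlen = nla_len(nla[NFTA_OBJ_USERDATA]);    /* <= NFT_USERDATA_MAXLEN */
    obj->udata = kzalloc(udlen, GFP_KERNEL);
    if (obj->udata)
            nla_memcpy(obj->udata, nla[NFTA_OBJ_USERDATA], udlen);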
static struct nft_object *nft_obj_init(const struct nft_ctx *ctx, @@@ -5884,6 -5899,7 +5906,7 @@@ static int nf_tables_newobj(struct net struct nft_object *obj; struct nft_ctx ctx; u32 objtype; + u16 udlen; int err;
if (!nla[NFTA_OBJ_TYPE] || @@@ -5928,7 -5944,7 +5951,7 @@@ obj = nft_obj_init(&ctx, type, nla[NFTA_OBJ_DATA]); if (IS_ERR(obj)) { err = PTR_ERR(obj); - goto err1; + goto err_init; } obj->key.table = table; obj->handle = nf_tables_alloc_handle(table); @@@ -5936,32 -5952,44 +5959,44 @@@ obj->key.name = nla_strdup(nla[NFTA_OBJ_NAME], GFP_KERNEL); if (!obj->key.name) { err = -ENOMEM; - goto err2; + goto err_strdup; + } + + if (nla[NFTA_OBJ_USERDATA]) { + udlen = nla_len(nla[NFTA_OBJ_USERDATA]); + obj->udata = kzalloc(udlen, GFP_KERNEL); + if (obj->udata == NULL) + goto err_userdata; + + nla_memcpy(obj->udata, nla[NFTA_OBJ_USERDATA], udlen); + obj->udlen = udlen; }
err = nft_trans_obj_add(&ctx, NFT_MSG_NEWOBJ, obj); if (err < 0) - goto err3; + goto err_trans;
err = rhltable_insert(&nft_objname_ht, &obj->rhlhead, nft_objname_ht_params); if (err < 0) - goto err4; + goto err_obj_ht;
list_add_tail_rcu(&obj->list, &table->objects); table->use++; return 0; - err4: + err_obj_ht: /* queued in transaction log */ INIT_LIST_HEAD(&obj->list); return err; - err3: + err_trans: kfree(obj->key.name); - err2: + err_userdata: + kfree(obj->udata); + err_strdup: if (obj->ops->destroy) obj->ops->destroy(&ctx, obj); kfree(obj); - err1: + err_init: module_put(type->owner); return err; } @@@ -5993,6 -6021,10 +6028,10 @@@ static int nf_tables_fill_obj_info(stru NFTA_OBJ_PAD)) goto nla_put_failure;
+ if (obj->udata && + nla_put(skb, NFTA_OBJ_USERDATA, obj->udlen, obj->udata)) + goto nla_put_failure; + nlmsg_end(skb, nlh); return 0;
@@@ -6282,7 -6314,7 +6321,7 @@@ void nft_obj_notify(struct net *net, co goto err; }
- nfnetlink_send(skb, net, portid, NFNLGRP_NFTABLES, report, gfp); + nft_notify_enqueue(skb, report, &net->nft.notify_list); return; err: nfnetlink_set_err(net, portid, NFNLGRP_NFTABLES, -ENOBUFS); @@@ -7092,7 -7124,8 +7131,7 @@@ static void nf_tables_flowtable_notify( goto err; }
- nfnetlink_send(skb, ctx->net, ctx->portid, NFNLGRP_NFTABLES, - ctx->report, GFP_KERNEL); + nft_notify_enqueue(skb, ctx->report, &ctx->net->nft.notify_list); return; err: nfnetlink_set_err(ctx->net, ctx->portid, NFNLGRP_NFTABLES, -ENOBUFS); @@@ -7701,41 -7734,6 +7740,41 @@@ static void nf_tables_commit_release(st mutex_unlock(&net->nft.commit_mutex); }
+static void nft_commit_notify(struct net *net, u32 portid) +{ + struct sk_buff *batch_skb = NULL, *nskb, *skb; + unsigned char *data; + int len; + + list_for_each_entry_safe(skb, nskb, &net->nft.notify_list, list) { + if (!batch_skb) { +new_batch: + batch_skb = skb; + len = NLMSG_GOODSIZE - skb->len; + list_del(&skb->list); + continue; + } + len -= skb->len; + if (len > 0 && NFT_CB(skb).report == NFT_CB(batch_skb).report) { + data = skb_put(batch_skb, skb->len); + memcpy(data, skb->data, skb->len); + list_del(&skb->list); + kfree_skb(skb); + continue; + } + nfnetlink_send(batch_skb, net, portid, NFNLGRP_NFTABLES, + NFT_CB(batch_skb).report, GFP_KERNEL); + goto new_batch; + } + + if (batch_skb) { + nfnetlink_send(batch_skb, net, portid, NFNLGRP_NFTABLES, + NFT_CB(batch_skb).report, GFP_KERNEL); + } + + WARN_ON_ONCE(!list_empty(&net->nft.notify_list)); +} + static int nf_tables_commit(struct net *net, struct sk_buff *skb) { struct nft_trans *trans, *next; @@@ -7938,7 -7936,6 +7977,7 @@@ } }
+ nft_commit_notify(net, NETLINK_CB(skb).portid); nf_tables_gen_notify(net, skb, NFT_MSG_NEWGEN); nf_tables_commit_release(net);
@@@ -8763,7 -8760,6 +8802,7 @@@ static int __net_init nf_tables_init_ne INIT_LIST_HEAD(&net->nft.tables); INIT_LIST_HEAD(&net->nft.commit_list); INIT_LIST_HEAD(&net->nft.module_list); + INIT_LIST_HEAD(&net->nft.notify_list); mutex_init(&net->nft.commit_mutex); net->nft.base_seq = 1; net->nft.validate_state = NFT_VALIDATE_SKIP; @@@ -8780,7 -8776,6 +8819,7 @@@ static void __net_exit nf_tables_exit_n mutex_unlock(&net->nft.commit_mutex); WARN_ON_ONCE(!list_empty(&net->nft.tables)); WARN_ON_ONCE(!list_empty(&net->nft.module_list)); + WARN_ON_ONCE(!list_empty(&net->nft.notify_list)); }
static struct pernet_operations nf_tables_net_ops = { diff --combined net/tipc/link.c index cef38a910107,97dc4b5fb20b..06b880da2a8e --- a/net/tipc/link.c +++ b/net/tipc/link.c @@@ -216,11 -216,6 +216,6 @@@ enum #define TIPC_BC_RETR_LIM (jiffies + msecs_to_jiffies(10)) #define TIPC_UC_RETR_TIME (jiffies + msecs_to_jiffies(1))
- /* - * Interval between NACKs when packets arrive out of order - */ - #define TIPC_NACK_INTV (TIPC_MIN_LINK_WIN * 2) - /* Link FSM states: */ enum { @@@ -532,8 -527,7 +527,8 @@@ bool tipc_link_create(struct net *net, * tipc_link_bc_create - create new link to be used for broadcast * @net: pointer to associated network namespace * @mtu: mtu to be used initially if no peers - * @window: send window to be used + * @min_win: minimal send window to be used by link + * @max_win: maximal send window to be used by link * @inputq: queue to put messages ready for delivery * @namedq: queue to put binding table update messages ready for delivery * @link: return value, pointer to put the created link @@@ -1256,6 -1250,11 +1251,11 @@@ static bool tipc_data_input(struct tipc case MSG_FRAGMENTER: case BCAST_PROTOCOL: return false; + #ifdef CONFIG_TIPC_CRYPTO + case MSG_CRYPTO: + tipc_crypto_msg_rcv(l->net, skb); + return true; + #endif default: pr_warn("Dropping received illegal msg type\n"); kfree_skb(skb); diff --combined net/tipc/msg.c index 52e93ba4d8e2,2d9a383b8192..11a429f4c6cd --- a/net/tipc/msg.c +++ b/net/tipc/msg.c @@@ -150,8 -150,7 +150,8 @@@ int tipc_buf_append(struct sk_buff **he if (fragid == FIRST_FRAGMENT) { if (unlikely(head)) goto err; - if (unlikely(skb_unclone(frag, GFP_ATOMIC))) + frag = skb_unshare(frag, GFP_ATOMIC); + if (unlikely(!frag)) goto err; head = *headbuf = frag; *buf = NULL; @@@ -582,7 -581,7 +582,7 @@@ bundle * @pos: position in outer message of msg to be extracted. * Returns position of next msg * Consumes outer buffer when last packet extracted - * Returns true when when there is an extracted buffer, otherwise false + * Returns true when there is an extracted buffer, otherwise false */ bool tipc_msg_extract(struct sk_buff *skb, struct sk_buff **iskb, int *pos) { diff --combined net/tipc/socket.c index 11b27ddc75ba,0f894aca98ed..e795a8a2955b --- a/net/tipc/socket.c +++ b/net/tipc/socket.c @@@ -52,10 -52,9 +52,9 @@@ #define NAGLE_START_MAX 1024 #define CONN_TIMEOUT_DEFAULT 8000 /* default connect timeout = 8s */ #define CONN_PROBING_INTV msecs_to_jiffies(3600000) /* [ms] => 1 h */ - #define TIPC_FWD_MSG 1 #define TIPC_MAX_PORT 0xffffffff #define TIPC_MIN_PORT 1 - #define TIPC_ACK_RATE 4 /* ACK at 1/4 of of rcv window size */ + #define TIPC_ACK_RATE 4 /* ACK at 1/4 of rcv window size */
enum { TIPC_LISTEN = TCP_LISTEN, @@@ -2771,7 -2770,10 +2770,7 @@@ static int tipc_shutdown(struct socket
trace_tipc_sk_shutdown(sk, NULL, TIPC_DUMP_ALL, " "); __tipc_shutdown(sock, TIPC_CONN_SHUTDOWN); - if (tipc_sk_type_connectionless(sk)) - sk->sk_shutdown = SHUTDOWN_MASK; - else - sk->sk_shutdown = SEND_SHUTDOWN; + sk->sk_shutdown = SHUTDOWN_MASK;
if (sk->sk_state == TIPC_DISCONNECTING) { /* Discard any unreceived messages */ diff --combined net/wireless/util.c index 6fa99df52f86,ac2bb1a80f2b..f01746894a4e --- a/net/wireless/util.c +++ b/net/wireless/util.c @@@ -95,7 -95,7 +95,7 @@@ u32 ieee80211_channel_to_freq_khz(int c /* see 802.11ax D6.1 27.3.23.2 */ if (chan == 2) return MHZ_TO_KHZ(5935); - if (chan <= 253) + if (chan <= 233) return MHZ_TO_KHZ(5950 + chan * 5); break; case NL80211_BAND_60GHZ: @@@ -111,6 -111,33 +111,33 @@@ } EXPORT_SYMBOL(ieee80211_channel_to_freq_khz);
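The util.c fix above tightens the 6 GHz bound from 253 to 233: channel numbering in that band stops at 233, whose center frequency of 5950 + 233 * 5 = 7115 MHz sits just inside the 5925-7125 MHz band edge, while channel 253 would have mapped to 7215 MHz, outside the band. Quick check (sketch):

    u32 khz = MHZ_TO_KHZ(5950 + 233 * 5);   /* 7,115,000 kHz, last valid center */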
+ enum nl80211_chan_width + ieee80211_s1g_channel_width(const struct ieee80211_channel *chan) + { + if (WARN_ON(!chan || chan->band != NL80211_BAND_S1GHZ)) + return NL80211_CHAN_WIDTH_20_NOHT; + + /*S1G defines a single allowed channel width per channel. + * Extract that width here. + */ + if (chan->flags & IEEE80211_CHAN_1MHZ) + return NL80211_CHAN_WIDTH_1; + else if (chan->flags & IEEE80211_CHAN_2MHZ) + return NL80211_CHAN_WIDTH_2; + else if (chan->flags & IEEE80211_CHAN_4MHZ) + return NL80211_CHAN_WIDTH_4; + else if (chan->flags & IEEE80211_CHAN_8MHZ) + return NL80211_CHAN_WIDTH_8; + else if (chan->flags & IEEE80211_CHAN_16MHZ) + return NL80211_CHAN_WIDTH_16; + + pr_err("unknown channel width for channel at %dKHz?\n", + ieee80211_channel_to_khz(chan)); + + return NL80211_CHAN_WIDTH_1; + } + EXPORT_SYMBOL(ieee80211_s1g_channel_width); + int ieee80211_freq_khz_to_channel(u32 freq) { /* TODO: just handle MHz for now */ @@@ -399,6 -426,11 +426,11 @@@ unsigned int __attribute_const__ ieee80 { unsigned int hdrlen = 24;
+ if (ieee80211_is_ext(fc)) { + hdrlen = 4; + goto out; + } + if (ieee80211_is_data(fc)) { if (ieee80211_has_a4(fc)) hdrlen = 30; diff --combined net/xdp/xdp_umem.c index b010bfde0149,a7227b447228..56d052bc65cb --- a/net/xdp/xdp_umem.c +++ b/net/xdp/xdp_umem.c @@@ -23,162 -23,6 +23,6 @@@
static DEFINE_IDA(umem_ida);
- void xdp_add_sk_umem(struct xdp_umem *umem, struct xdp_sock *xs) - { - unsigned long flags; - - if (!xs->tx) - return; - - spin_lock_irqsave(&umem->xsk_tx_list_lock, flags); - list_add_rcu(&xs->list, &umem->xsk_tx_list); - spin_unlock_irqrestore(&umem->xsk_tx_list_lock, flags); - } - - void xdp_del_sk_umem(struct xdp_umem *umem, struct xdp_sock *xs) - { - unsigned long flags; - - if (!xs->tx) - return; - - spin_lock_irqsave(&umem->xsk_tx_list_lock, flags); - list_del_rcu(&xs->list); - spin_unlock_irqrestore(&umem->xsk_tx_list_lock, flags); - } - - /* The umem is stored both in the _rx struct and the _tx struct as we do - * not know if the device has more tx queues than rx, or the opposite. - * This might also change during run time. - */ - static int xdp_reg_umem_at_qid(struct net_device *dev, struct xdp_umem *umem, - u16 queue_id) - { - if (queue_id >= max_t(unsigned int, - dev->real_num_rx_queues, - dev->real_num_tx_queues)) - return -EINVAL; - - if (queue_id < dev->real_num_rx_queues) - dev->_rx[queue_id].umem = umem; - if (queue_id < dev->real_num_tx_queues) - dev->_tx[queue_id].umem = umem; - - return 0; - } - - struct xdp_umem *xdp_get_umem_from_qid(struct net_device *dev, - u16 queue_id) - { - if (queue_id < dev->real_num_rx_queues) - return dev->_rx[queue_id].umem; - if (queue_id < dev->real_num_tx_queues) - return dev->_tx[queue_id].umem; - - return NULL; - } - EXPORT_SYMBOL(xdp_get_umem_from_qid); - - static void xdp_clear_umem_at_qid(struct net_device *dev, u16 queue_id) - { - if (queue_id < dev->real_num_rx_queues) - dev->_rx[queue_id].umem = NULL; - if (queue_id < dev->real_num_tx_queues) - dev->_tx[queue_id].umem = NULL; - } - - int xdp_umem_assign_dev(struct xdp_umem *umem, struct net_device *dev, - u16 queue_id, u16 flags) - { - bool force_zc, force_copy; - struct netdev_bpf bpf; - int err = 0; - - ASSERT_RTNL(); - - force_zc = flags & XDP_ZEROCOPY; - force_copy = flags & XDP_COPY; - - if (force_zc && force_copy) - return -EINVAL; - - if (xdp_get_umem_from_qid(dev, queue_id)) - return -EBUSY; - - err = xdp_reg_umem_at_qid(dev, umem, queue_id); - if (err) - return err; - - umem->dev = dev; - umem->queue_id = queue_id; - - if (flags & XDP_USE_NEED_WAKEUP) { - umem->flags |= XDP_UMEM_USES_NEED_WAKEUP; - /* Tx needs to be explicitly woken up the first time. - * Also for supporting drivers that do not implement this - * feature. They will always have to call sendto(). - */ - xsk_set_tx_need_wakeup(umem); - } - - dev_hold(dev); - - if (force_copy) - /* For copy-mode, we are done. 
*/ - return 0; - - if (!dev->netdev_ops->ndo_bpf || !dev->netdev_ops->ndo_xsk_wakeup) { - err = -EOPNOTSUPP; - goto err_unreg_umem; - } - - bpf.command = XDP_SETUP_XSK_UMEM; - bpf.xsk.umem = umem; - bpf.xsk.queue_id = queue_id; - - err = dev->netdev_ops->ndo_bpf(dev, &bpf); - if (err) - goto err_unreg_umem; - - umem->zc = true; - return 0; - - err_unreg_umem: - if (!force_zc) - err = 0; /* fallback to copy mode */ - if (err) - xdp_clear_umem_at_qid(dev, queue_id); - return err; - } - - void xdp_umem_clear_dev(struct xdp_umem *umem) - { - struct netdev_bpf bpf; - int err; - - ASSERT_RTNL(); - - if (!umem->dev) - return; - - if (umem->zc) { - bpf.command = XDP_SETUP_XSK_UMEM; - bpf.xsk.umem = NULL; - bpf.xsk.queue_id = umem->queue_id; - - err = umem->dev->netdev_ops->ndo_bpf(umem->dev, &bpf); - - if (err) - WARN(1, "failed to disable umem!\n"); - } - - xdp_clear_umem_at_qid(umem->dev, umem->queue_id); - - dev_put(umem->dev); - umem->dev = NULL; - umem->zc = false; - } - static void xdp_umem_unpin_pages(struct xdp_umem *umem) { unpin_user_pages_dirty_lock(umem->pgs, umem->npgs, true); @@@ -195,38 -39,33 +39,33 @@@ static void xdp_umem_unaccount_pages(st } }
- static void xdp_umem_release(struct xdp_umem *umem) + static void xdp_umem_addr_unmap(struct xdp_umem *umem) { - rtnl_lock(); - xdp_umem_clear_dev(umem); - rtnl_unlock(); - - ida_simple_remove(&umem_ida, umem->id); + vunmap(umem->addrs); + umem->addrs = NULL; + }
- if (umem->fq) { - xskq_destroy(umem->fq); - umem->fq = NULL; - } + static int xdp_umem_addr_map(struct xdp_umem *umem, struct page **pages, + u32 nr_pages) + { + umem->addrs = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL); + if (!umem->addrs) + return -ENOMEM; + return 0; + }
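The vmap()-based xdp_umem_addr_map() above gives the whole umem one linear kernel address range, so resolving a descriptor address no longer needs a per-page lookup. A minimal sketch of that arithmetic, assuming a lookup helper shaped roughly like this (illustrative name and plain userspace C, not the kernel's code):

    #include <stdint.h>
    #include <stdio.h>

    /* Hedged sketch: once all umem pages sit behind one contiguous
     * mapping (like umem->addrs), resolving a descriptor address is
     * plain pointer arithmetic instead of a per-page table walk. */
    static inline void *umem_addr_to_ptr(void *addrs, uint64_t addr)
    {
            return (char *)addrs + addr;
    }

    int main(void)
    {
            char backing[4096] = "payload";    /* stands in for the mapping */

            printf("%s\n", (char *)umem_addr_to_ptr(backing, 0));
            return 0;
    }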
- if (umem->cq) { - xskq_destroy(umem->cq); - umem->cq = NULL; - } + static void xdp_umem_release(struct xdp_umem *umem) + { + umem->zc = false; + ida_simple_remove(&umem_ida, umem->id);
- xp_destroy(umem->pool); + xdp_umem_addr_unmap(umem); xdp_umem_unpin_pages(umem);
xdp_umem_unaccount_pages(umem); kfree(umem); }
- static void xdp_umem_release_deferred(struct work_struct *work) - { - struct xdp_umem *umem = container_of(work, struct xdp_umem, work); - - xdp_umem_release(umem); - } - void xdp_get_umem(struct xdp_umem *umem) { refcount_inc(&umem->users); @@@ -237,10 -76,8 +76,8 @@@ void xdp_put_umem(struct xdp_umem *umem if (!umem) return;
- if (refcount_dec_and_test(&umem->users)) { - INIT_WORK(&umem->work, xdp_umem_release_deferred); - schedule_work(&umem->work); - } + if (refcount_dec_and_test(&umem->users)) + xdp_umem_release(umem); }
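With xdp_umem_clear_dev() gone from the release path, nothing here sleeps or takes rtnl_lock() anymore, which is why the workqueue detour could be dropped. A small userspace model of the resulting get/put lifetime, using C11 atomics in place of refcount_t (names are illustrative):

    #include <stdatomic.h>
    #include <stdlib.h>

    struct umem_model {
            atomic_int users;
            /* pinned pages, accounting, ... elided */
    };

    static void umem_get(struct umem_model *u)
    {
            atomic_fetch_add(&u->users, 1);
    }

    /* The last put frees directly; no INIT_WORK/schedule_work detour
     * is needed once release can run in any context. */
    static void umem_put(struct umem_model *u)
    {
            if (atomic_fetch_sub(&u->users, 1) == 1)
                    free(u);
    }

    int main(void)
    {
            struct umem_model *u = calloc(1, sizeof(*u));

            atomic_init(&u->users, 1);
            umem_get(u);
            umem_put(u);
            umem_put(u);    /* last reference, frees synchronously */
            return 0;
    }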
static int xdp_umem_pin_pages(struct xdp_umem *umem, unsigned long address) @@@ -303,10 -140,10 +140,10 @@@ static int xdp_umem_account_pages(struc
static int xdp_umem_reg(struct xdp_umem *umem, struct xdp_umem_reg *mr) { + u32 npgs_rem, chunk_size = mr->chunk_size, headroom = mr->headroom; bool unaligned_chunks = mr->flags & XDP_UMEM_UNALIGNED_CHUNK_FLAG; - u32 chunk_size = mr->chunk_size, headroom = mr->headroom; u64 npgs, addr = mr->addr, size = mr->len; - unsigned int chunks, chunks_per_page; + unsigned int chunks, chunks_rem; int err;
if (chunk_size < XDP_UMEM_MIN_CHUNK_SIZE || chunk_size > PAGE_SIZE) { @@@ -319,8 -156,7 +156,7 @@@ return -EINVAL; }
- if (mr->flags & ~(XDP_UMEM_UNALIGNED_CHUNK_FLAG | - XDP_UMEM_USES_NEED_WAKEUP)) + if (mr->flags & ~XDP_UMEM_UNALIGNED_CHUNK_FLAG) return -EINVAL;
if (!unaligned_chunks && !is_power_of_2(chunk_size)) @@@ -336,18 -172,19 +172,18 @@@ if ((addr + size) < addr) return -EINVAL;
- npgs = size >> PAGE_SHIFT; + npgs = div_u64_rem(size, PAGE_SIZE, &npgs_rem); + if (npgs_rem) + npgs++; if (npgs > U32_MAX) return -EINVAL;
- chunks = (unsigned int)div_u64(size, chunk_size); + chunks = (unsigned int)div_u64_rem(size, chunk_size, &chunks_rem); if (chunks == 0) return -EINVAL;
- if (!unaligned_chunks) { - chunks_per_page = PAGE_SIZE / chunk_size; - if (chunks < chunks_per_page || chunks % chunks_per_page) - return -EINVAL; - } + if (!unaligned_chunks && chunks_rem) + return -EINVAL;
if (headroom >= chunk_size - XDP_PACKET_HEADROOM) return -EINVAL; @@@ -355,13 -192,13 +191,13 @@@ umem->size = size; umem->headroom = headroom; umem->chunk_size = chunk_size; + umem->chunks = chunks; umem->npgs = (u32)npgs; umem->pgs = NULL; umem->user = NULL; umem->flags = mr->flags; - INIT_LIST_HEAD(&umem->xsk_tx_list); - spin_lock_init(&umem->xsk_tx_list_lock);
+ INIT_LIST_HEAD(&umem->xsk_dma_list); refcount_set(&umem->users, 1);
err = xdp_umem_account_pages(umem); @@@ -372,15 -209,13 +208,13 @@@ if (err) goto out_account;
- umem->pool = xp_create(umem->pgs, umem->npgs, chunks, chunk_size, - headroom, size, unaligned_chunks); - if (!umem->pool) { - err = -ENOMEM; - goto out_pin; - } + err = xdp_umem_addr_map(umem, umem->pgs, umem->npgs); + if (err) + goto out_unpin; + return 0;
- out_pin: + out_unpin: xdp_umem_unpin_pages(umem); out_account: xdp_umem_unaccount_pages(umem); @@@ -412,8 -247,3 +246,3 @@@ struct xdp_umem *xdp_umem_create(struc
return umem; } - - bool xdp_umem_validate_queues(struct xdp_umem *umem) - { - return umem->fq && umem->cq; - } diff --combined net/xdp/xsk.c index 6c5e09e7440a,5eb6662f562a..0824ce838638 --- a/net/xdp/xsk.c +++ b/net/xdp/xsk.c @@@ -36,68 -36,108 +36,108 @@@ static DEFINE_PER_CPU(struct list_head bool xsk_is_setup_for_bpf_map(struct xdp_sock *xs) { return READ_ONCE(xs->rx) && READ_ONCE(xs->umem) && - READ_ONCE(xs->umem->fq); + (xs->pool->fq || READ_ONCE(xs->fq_tmp)); }
- void xsk_set_rx_need_wakeup(struct xdp_umem *umem) + void xsk_set_rx_need_wakeup(struct xsk_buff_pool *pool) { - if (umem->need_wakeup & XDP_WAKEUP_RX) + if (pool->cached_need_wakeup & XDP_WAKEUP_RX) return;
- umem->fq->ring->flags |= XDP_RING_NEED_WAKEUP; - umem->need_wakeup |= XDP_WAKEUP_RX; + pool->fq->ring->flags |= XDP_RING_NEED_WAKEUP; + pool->cached_need_wakeup |= XDP_WAKEUP_RX; } EXPORT_SYMBOL(xsk_set_rx_need_wakeup);
- void xsk_set_tx_need_wakeup(struct xdp_umem *umem) + void xsk_set_tx_need_wakeup(struct xsk_buff_pool *pool) { struct xdp_sock *xs;
- if (umem->need_wakeup & XDP_WAKEUP_TX) + if (pool->cached_need_wakeup & XDP_WAKEUP_TX) return;
rcu_read_lock(); - list_for_each_entry_rcu(xs, &umem->xsk_tx_list, list) { + list_for_each_entry_rcu(xs, &pool->xsk_tx_list, tx_list) { xs->tx->ring->flags |= XDP_RING_NEED_WAKEUP; } rcu_read_unlock();
- umem->need_wakeup |= XDP_WAKEUP_TX; + pool->cached_need_wakeup |= XDP_WAKEUP_TX; } EXPORT_SYMBOL(xsk_set_tx_need_wakeup);
- void xsk_clear_rx_need_wakeup(struct xdp_umem *umem) + void xsk_clear_rx_need_wakeup(struct xsk_buff_pool *pool) { - if (!(umem->need_wakeup & XDP_WAKEUP_RX)) + if (!(pool->cached_need_wakeup & XDP_WAKEUP_RX)) return;
- umem->fq->ring->flags &= ~XDP_RING_NEED_WAKEUP; - umem->need_wakeup &= ~XDP_WAKEUP_RX; + pool->fq->ring->flags &= ~XDP_RING_NEED_WAKEUP; + pool->cached_need_wakeup &= ~XDP_WAKEUP_RX; } EXPORT_SYMBOL(xsk_clear_rx_need_wakeup);
- void xsk_clear_tx_need_wakeup(struct xdp_umem *umem) + void xsk_clear_tx_need_wakeup(struct xsk_buff_pool *pool) { struct xdp_sock *xs;
- if (!(umem->need_wakeup & XDP_WAKEUP_TX)) + if (!(pool->cached_need_wakeup & XDP_WAKEUP_TX)) return;
rcu_read_lock(); - list_for_each_entry_rcu(xs, &umem->xsk_tx_list, list) { + list_for_each_entry_rcu(xs, &pool->xsk_tx_list, tx_list) { xs->tx->ring->flags &= ~XDP_RING_NEED_WAKEUP; } rcu_read_unlock();
- umem->need_wakeup &= ~XDP_WAKEUP_TX; + pool->cached_need_wakeup &= ~XDP_WAKEUP_TX; } EXPORT_SYMBOL(xsk_clear_tx_need_wakeup);
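These cached_need_wakeup bits mirror the XDP_RING_NEED_WAKEUP flag the kernel sets in the mmap'ed rings. The user-space half of the protocol checks that flag before paying for a syscall; a minimal sketch against the standard AF_XDP uapi (how the ring flags pointer is obtained from the mmap'ed ring is elided here):

    #include <sys/socket.h>
    #include <linux/if_xdp.h>

    /* Hedged sketch of the application side of need_wakeup: only kick
     * the kernel when it has asked to be woken, so the common fast
     * path stays syscall-free. */
    static void kick_tx_if_needed(int xsk_fd, const unsigned int *ring_flags)
    {
            if (*ring_flags & XDP_RING_NEED_WAKEUP)
                    sendto(xsk_fd, NULL, 0, MSG_DONTWAIT, NULL, 0);
    }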
- bool xsk_umem_uses_need_wakeup(struct xdp_umem *umem) + bool xsk_uses_need_wakeup(struct xsk_buff_pool *pool) { - return umem->flags & XDP_UMEM_USES_NEED_WAKEUP; + return pool->uses_need_wakeup; + } + EXPORT_SYMBOL(xsk_uses_need_wakeup); + + struct xsk_buff_pool *xsk_get_pool_from_qid(struct net_device *dev, + u16 queue_id) + { + if (queue_id < dev->real_num_rx_queues) + return dev->_rx[queue_id].pool; + if (queue_id < dev->real_num_tx_queues) + return dev->_tx[queue_id].pool; + + return NULL; + } + EXPORT_SYMBOL(xsk_get_pool_from_qid); + + void xsk_clear_pool_at_qid(struct net_device *dev, u16 queue_id) + { + if (queue_id < dev->real_num_rx_queues) + dev->_rx[queue_id].pool = NULL; + if (queue_id < dev->real_num_tx_queues) + dev->_tx[queue_id].pool = NULL; + } + + /* The buffer pool is stored both in the _rx struct and the _tx struct as we do + * not know if the device has more tx queues than rx, or the opposite. + * This might also change during run time. + */ + int xsk_reg_pool_at_qid(struct net_device *dev, struct xsk_buff_pool *pool, + u16 queue_id) + { + if (queue_id >= max_t(unsigned int, + dev->real_num_rx_queues, + dev->real_num_tx_queues)) + return -EINVAL; + + if (queue_id < dev->real_num_rx_queues) + dev->_rx[queue_id].pool = pool; + if (queue_id < dev->real_num_tx_queues) + dev->_tx[queue_id].pool = pool; + + return 0; } - EXPORT_SYMBOL(xsk_umem_uses_need_wakeup);
void xp_release(struct xdp_buff_xsk *xskb) { @@@ -155,12 -195,12 +195,12 @@@ static int __xsk_rcv(struct xdp_sock *x struct xdp_buff *xsk_xdp; int err;
- if (len > xsk_umem_get_rx_frame_size(xs->umem)) { + if (len > xsk_pool_get_rx_frame_size(xs->pool)) { xs->rx_dropped++; return -ENOSPC; }
- xsk_xdp = xsk_buff_alloc(xs->umem); + xsk_xdp = xsk_buff_alloc(xs->pool); if (!xsk_xdp) { xs->rx_dropped++; return -ENOSPC; @@@ -208,7 -248,7 +248,7 @@@ static int xsk_rcv(struct xdp_sock *xs static void xsk_flush(struct xdp_sock *xs) { xskq_prod_submit(xs->rx); - __xskq_cons_release(xs->umem->fq); + __xskq_cons_release(xs->pool->fq); sock_def_readable(&xs->sk); }
@@@ -249,32 -289,32 +289,32 @@@ void __xsk_map_flush(void } }
- void xsk_umem_complete_tx(struct xdp_umem *umem, u32 nb_entries) + void xsk_tx_completed(struct xsk_buff_pool *pool, u32 nb_entries) { - xskq_prod_submit_n(umem->cq, nb_entries); + xskq_prod_submit_n(pool->cq, nb_entries); } - EXPORT_SYMBOL(xsk_umem_complete_tx); + EXPORT_SYMBOL(xsk_tx_completed);
- void xsk_umem_consume_tx_done(struct xdp_umem *umem) + void xsk_tx_release(struct xsk_buff_pool *pool) { struct xdp_sock *xs;
rcu_read_lock(); - list_for_each_entry_rcu(xs, &umem->xsk_tx_list, list) { + list_for_each_entry_rcu(xs, &pool->xsk_tx_list, tx_list) { __xskq_cons_release(xs->tx); xs->sk.sk_write_space(&xs->sk); } rcu_read_unlock(); } - EXPORT_SYMBOL(xsk_umem_consume_tx_done); + EXPORT_SYMBOL(xsk_tx_release);
- bool xsk_umem_consume_tx(struct xdp_umem *umem, struct xdp_desc *desc) + bool xsk_tx_peek_desc(struct xsk_buff_pool *pool, struct xdp_desc *desc) { struct xdp_sock *xs;
rcu_read_lock(); - list_for_each_entry_rcu(xs, &umem->xsk_tx_list, list) { - if (!xskq_cons_peek_desc(xs->tx, desc, umem)) { + list_for_each_entry_rcu(xs, &pool->xsk_tx_list, tx_list) { + if (!xskq_cons_peek_desc(xs->tx, desc, pool)) { xs->tx->queue_empty_descs++; continue; } @@@ -284,7 -324,7 +324,7 @@@ * if there is space in it. This avoids having to implement * any buffering in the Tx path. */ - if (xskq_prod_reserve_addr(umem->cq, desc->addr)) + if (xskq_prod_reserve_addr(pool->cq, desc->addr)) goto out;
xskq_cons_release(xs->tx); @@@ -296,7 -336,7 +336,7 @@@ out rcu_read_unlock(); return false; } - EXPORT_SYMBOL(xsk_umem_consume_tx); + EXPORT_SYMBOL(xsk_tx_peek_desc);
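xsk_tx_peek_desc() only hands out a Tx descriptor after reserving its completion-queue slot, which is the backpressure the in-code comment describes. A toy single-producer model of the reserve-then-submit split (ring size and names are illustrative):

    #include <stdbool.h>

    #define CQ_SIZE 64u    /* illustrative power-of-two ring size */

    struct cq_model {
            unsigned int cached_prod;   /* reserved, not yet visible */
            unsigned int prod;          /* visible to the consumer */
            unsigned int cons;
    };

    /* Claim a completion slot up front; refuse the descriptor instead
     * of buffering it when the ring is full. */
    static bool cq_reserve(struct cq_model *q)
    {
            if (q->cached_prod - q->cons == CQ_SIZE)
                    return false;       /* backpressure: caller retries later */
            q->cached_prod++;
            return true;
    }

    /* Publish n previously reserved completions to the consumer. */
    static void cq_submit(struct cq_model *q, unsigned int n)
    {
            q->prod += n;
    }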
static int xsk_wakeup(struct xdp_sock *xs, u8 flags) { @@@ -322,7 -362,7 +362,7 @@@ static void xsk_destruct_skb(struct sk_ unsigned long flags;
spin_lock_irqsave(&xs->tx_completion_lock, flags); - xskq_prod_submit_addr(xs->umem->cq, addr); + xskq_prod_submit_addr(xs->pool->cq, addr); spin_unlock_irqrestore(&xs->tx_completion_lock, flags);
sock_wfree(skb); @@@ -342,7 -382,7 +382,7 @@@ static int xsk_generic_xmit(struct soc if (xs->queue_id >= xs->dev->real_num_tx_queues) goto out;
- while (xskq_cons_peek_desc(xs->tx, &desc, xs->umem)) { + while (xskq_cons_peek_desc(xs->tx, &desc, xs->pool)) { char *buffer; u64 addr; u32 len; @@@ -359,14 -399,14 +399,14 @@@
skb_put(skb, len); addr = desc.addr; - buffer = xsk_buff_raw_get_data(xs->umem, addr); + buffer = xsk_buff_raw_get_data(xs->pool, addr); err = skb_store_bits(skb, 0, buffer, len); /* This is the backpressure mechanism for the Tx path. * Reserve space in the completion queue and only proceed * if there is space in it. This avoids having to implement * any buffering in the Tx path. */ - if (unlikely(err) || xskq_prod_reserve(xs->umem->cq)) { + if (unlikely(err) || xskq_prod_reserve(xs->pool->cq)) { kfree_skb(skb); goto out; } @@@ -377,30 -417,15 +417,30 @@@ skb_shinfo(skb)->destructor_arg = (void *)(long)desc.addr; skb->destructor = xsk_destruct_skb;
+ /* Hinder dev_direct_xmit from freeing the packet and + * therefore completing it in the destructor + */ + refcount_inc(&skb->users); err = dev_direct_xmit(skb, xs->queue_id); + if (err == NETDEV_TX_BUSY) { + /* Tell user-space to retry the send */ + skb->destructor = sock_wfree; + /* Free skb without triggering the perf drop trace */ + consume_skb(skb); + err = -EAGAIN; + goto out; + } + xskq_cons_release(xs->tx); /* Ignore NET_XMIT_CN as packet might have been sent */ - if (err == NET_XMIT_DROP || err == NETDEV_TX_BUSY) { + if (err == NET_XMIT_DROP) { /* SKB completed but not sent */ + kfree_skb(skb); err = -EBUSY; goto out; }
+ consume_skb(skb); sent_frame = true; }
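The refcount_inc()/consume_skb() pair brackets dev_direct_xmit() so the final free cannot run while the error path may still retarget the destructor. A loose userspace model of that hold-across-transmit pattern, with C11 atomics standing in for skb refcounting (illustrative only, not the kernel's types):

    #include <stdatomic.h>
    #include <stddef.h>

    struct pkt_model {
            atomic_int users;
            void (*destructor)(struct pkt_model *);
    };

    static void pkt_put(struct pkt_model *p)
    {
            if (atomic_fetch_sub(&p->users, 1) == 1 && p->destructor)
                    p->destructor(p);
    }

    /* Hold an extra reference across a transmit that may consume the
     * packet, so on a busy return the destructor can be swapped out
     * before our final put would run it. */
    static int xmit_with_hold(struct pkt_model *p, int (*xmit)(struct pkt_model *))
    {
            int err;

            atomic_fetch_add(&p->users, 1); /* refcount_inc(&skb->users) analogue */
            err = xmit(p);                  /* may drop its own reference */
            if (err != 0)
                    p->destructor = NULL;   /* models resetting skb->destructor */
            pkt_put(p);                     /* consume_skb(skb) analogue */
            return err;
    }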
@@@ -446,16 -471,16 +486,16 @@@ static __poll_t xsk_poll(struct file *f __poll_t mask = datagram_poll(file, sock, wait); struct sock *sk = sock->sk; struct xdp_sock *xs = xdp_sk(sk); - struct xdp_umem *umem; + struct xsk_buff_pool *pool;
if (unlikely(!xsk_is_bound(xs))) return mask;
- umem = xs->umem; + pool = xs->pool;
- if (umem->need_wakeup) { + if (pool->cached_need_wakeup) { if (xs->zc) - xsk_wakeup(xs, umem->need_wakeup); + xsk_wakeup(xs, pool->cached_need_wakeup); else /* Poll needs to drive Tx also in copy mode */ __xsk_sendmsg(sk); @@@ -496,7 -521,7 +536,7 @@@ static void xsk_unbind_dev(struct xdp_s WRITE_ONCE(xs->state, XSK_UNBOUND);
/* Wait for driver to stop using the xdp socket. */ - xdp_del_sk_umem(xs->umem, xs); + xp_del_xsk(xs->pool, xs); xs->dev = NULL; synchronize_net(); dev_put(dev); @@@ -574,6 -599,8 +614,8 @@@ static int xsk_release(struct socket *s
xskq_destroy(xs->rx); xskq_destroy(xs->tx); + xskq_destroy(xs->fq_tmp); + xskq_destroy(xs->cq_tmp);
sock_orphan(sk); sock->sk = NULL; @@@ -601,6 -628,11 +643,11 @@@ static struct socket *xsk_lookup_xsk_fr return sock; }
+ static bool xsk_validate_queues(struct xdp_sock *xs) + { + return xs->fq_tmp && xs->cq_tmp; + } + static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len) { struct sockaddr_xdp *sxdp = (struct sockaddr_xdp *)addr; @@@ -669,29 -701,64 +716,64 @@@ sockfd_put(sock); goto out_unlock; } - if (umem_xs->dev != dev || umem_xs->queue_id != qid) { - err = -EINVAL; - sockfd_put(sock); - goto out_unlock; + + if (umem_xs->queue_id != qid || umem_xs->dev != dev) { + /* Share the umem with another socket on another qid + * and/or device. + */ + xs->pool = xp_create_and_assign_umem(xs, + umem_xs->umem); + if (!xs->pool) { + sockfd_put(sock); + goto out_unlock; + } + + err = xp_assign_dev_shared(xs->pool, umem_xs->umem, + dev, qid); + if (err) { + xp_destroy(xs->pool); + sockfd_put(sock); + goto out_unlock; + } + } else { + /* Share the buffer pool with the other socket. */ + if (xs->fq_tmp || xs->cq_tmp) { + /* Do not allow setting your own fq or cq. */ + err = -EINVAL; + sockfd_put(sock); + goto out_unlock; + } + + xp_get_pool(umem_xs->pool); + xs->pool = umem_xs->pool; }
xdp_get_umem(umem_xs->umem); WRITE_ONCE(xs->umem, umem_xs->umem); sockfd_put(sock); - } else if (!xs->umem || !xdp_umem_validate_queues(xs->umem)) { + } else if (!xs->umem || !xsk_validate_queues(xs)) { err = -EINVAL; goto out_unlock; } else { /* This xsk has its own umem. */ - err = xdp_umem_assign_dev(xs->umem, dev, qid, flags); - if (err) + xs->pool = xp_create_and_assign_umem(xs, xs->umem); + if (!xs->pool) { + err = -ENOMEM; + goto out_unlock; + } + + err = xp_assign_dev(xs->pool, dev, qid, flags); + if (err) { + xp_destroy(xs->pool); + xs->pool = NULL; goto out_unlock; + } }
xs->dev = dev; xs->zc = xs->umem->zc; xs->queue_id = qid; - xdp_add_sk_umem(xs->umem, xs); + xp_add_xsk(xs->pool, xs);
out_unlock: if (err) { @@@ -797,16 -864,10 +879,10 @@@ static int xsk_setsockopt(struct socke mutex_unlock(&xs->mutex); return -EBUSY; } - if (!xs->umem) { - mutex_unlock(&xs->mutex); - return -EINVAL; - }
- q = (optname == XDP_UMEM_FILL_RING) ? &xs->umem->fq : - &xs->umem->cq; + q = (optname == XDP_UMEM_FILL_RING) ? &xs->fq_tmp : + &xs->cq_tmp; err = xsk_init_queue(entries, q, true); - if (optname == XDP_UMEM_FILL_RING) - xp_set_fq(xs->umem->pool, *q); mutex_unlock(&xs->mutex); return err; } @@@ -873,7 -934,7 +949,7 @@@ static int xsk_getsockopt(struct socke if (extra_stats) { stats.rx_ring_full = xs->rx_queue_full; stats.rx_fill_ring_empty_descs = - xs->umem ? xskq_nb_queue_empty_descs(xs->umem->fq) : 0; + xs->pool ? xskq_nb_queue_empty_descs(xs->pool->fq) : 0; stats.tx_ring_empty_descs = xskq_nb_queue_empty_descs(xs->tx); } else { stats.rx_dropped += xs->rx_queue_full; @@@ -975,7 -1036,6 +1051,6 @@@ static int xsk_mmap(struct file *file, unsigned long size = vma->vm_end - vma->vm_start; struct xdp_sock *xs = xdp_sk(sock->sk); struct xsk_queue *q = NULL; - struct xdp_umem *umem; unsigned long pfn; struct page *qpg;
@@@ -987,16 -1047,12 +1062,12 @@@ } else if (offset == XDP_PGOFF_TX_RING) { q = READ_ONCE(xs->tx); } else { - umem = READ_ONCE(xs->umem); - if (!umem) - return -EINVAL; - /* Matches the smp_wmb() in XDP_UMEM_REG */ smp_rmb(); if (offset == XDP_UMEM_PGOFF_FILL_RING) - q = READ_ONCE(umem->fq); + q = READ_ONCE(xs->fq_tmp); else if (offset == XDP_UMEM_PGOFF_COMPLETION_RING) - q = READ_ONCE(umem->cq); + q = READ_ONCE(xs->cq_tmp); }
if (!q) @@@ -1034,8 -1090,8 +1105,8 @@@ static int xsk_notifier(struct notifier
xsk_unbind_dev(xs);
- /* Clear device references in umem. */ - xdp_umem_clear_dev(xs->umem); + /* Clear device references. */ + xp_clear_dev(xs->pool); } mutex_unlock(&xs->mutex); } @@@ -1079,7 -1135,7 +1150,7 @@@ static void xsk_destruct(struct sock *s if (!sock_flag(sk, SOCK_DEAD)) return;
- xdp_put_umem(xs->umem); + xp_put_pool(xs->pool);
sk_refcnt_debug_dec(sk); } @@@ -1087,8 -1143,8 +1158,8 @@@ static int xsk_create(struct net *net, struct socket *sock, int protocol, int kern) { - struct sock *sk; struct xdp_sock *xs; + struct sock *sk;
if (!ns_capable(net->user_ns, CAP_NET_RAW)) return -EPERM; diff --combined tools/bpf/bpftool/Makefile index 4828913703b6,02c99bc95c69..f60e6ad3a1df --- a/tools/bpf/bpftool/Makefile +++ b/tools/bpf/bpftool/Makefile @@@ -25,7 -25,7 +25,7 @@@ endi
LIBBPF = $(LIBBPF_PATH)libbpf.a
-BPFTOOL_VERSION := $(shell make -rR --no-print-directory -sC ../../.. kernelversion) +BPFTOOL_VERSION ?= $(shell make -rR --no-print-directory -sC ../../.. kernelversion)
$(LIBBPF): FORCE $(if $(LIBBPF_OUTPUT),@mkdir -p $(LIBBPF_OUTPUT)) @@@ -176,7 -176,11 +176,11 @@@ $(OUTPUT)bpftool: $(OBJS) $(LIBBPF $(OUTPUT)%.o: %.c $(QUIET_CC)$(CC) $(CFLAGS) -c -MMD -o $@ $<
- clean: $(LIBBPF)-clean + feature-detect-clean: + $(call QUIET_CLEAN, feature-detect) + $(Q)$(MAKE) -C $(srctree)/tools/build/feature/ clean >/dev/null + + clean: $(LIBBPF)-clean feature-detect-clean $(call QUIET_CLEAN, bpftool) $(Q)$(RM) -- $(OUTPUT)bpftool $(OUTPUT)*.o $(OUTPUT)*.d $(Q)$(RM) -- $(BPFTOOL_BOOTSTRAP) $(OUTPUT)*.skel.h $(OUTPUT)vmlinux.h diff --combined tools/lib/bpf/Makefile index 9ae8f4ef0aac,adbe994610f2..f43249696d9f --- a/tools/lib/bpf/Makefile +++ b/tools/lib/bpf/Makefile @@@ -1,6 -1,9 +1,9 @@@ # SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) # Most of this file is copied from tools/lib/traceevent/Makefile
+ RM ?= rm + srctree = $(abs_srctree) + LIBBPF_VERSION := $(shell \ grep -oE '^LIBBPF_([0-9.]+)' libbpf.map | \ sort -rV | head -n1 | cut -d'_' -f2) @@@ -56,10 -59,10 +59,10 @@@ ifndef VERBOS endif
FEATURE_USER = .libbpf - FEATURE_TESTS = libelf libelf-mmap zlib bpf reallocarray + FEATURE_TESTS = libelf zlib bpf FEATURE_DISPLAY = libelf zlib bpf
-INCLUDES = -I. -I$(srctree)/tools/include -I$(srctree)/tools/arch/$(ARCH)/include/uapi -I$(srctree)/tools/include/uapi +INCLUDES = -I. -I$(srctree)/tools/include -I$(srctree)/tools/include/uapi FEATURE_CHECK_CFLAGS-bpf = $(INCLUDES)
check_feat := 1 @@@ -98,16 -101,8 +101,8 @@@ els CFLAGS := -g -Wall endif
- ifeq ($(feature-libelf-mmap), 1) - override CFLAGS += -DHAVE_LIBELF_MMAP_SUPPORT - endif - - ifeq ($(feature-reallocarray), 0) - override CFLAGS += -DCOMPAT_NEED_REALLOCARRAY - endif - # Append required CFLAGS - override CFLAGS += $(EXTRA_WARNINGS) + override CFLAGS += $(EXTRA_WARNINGS) -Wno-switch-enum override CFLAGS += -Werror -Wall override CFLAGS += -fPIC override CFLAGS += $(INCLUDES) @@@ -152,7 -147,6 +147,7 @@@ GLOBAL_SYM_COUNT = $(shell readelf -s - awk '/GLOBAL/ && /DEFAULT/ && !/UND/ {print $$NF}' | \ sort -u | wc -l) VERSIONED_SYM_COUNT = $(shell readelf --dyn-syms --wide $(OUTPUT)libbpf.so | \ + awk '/GLOBAL/ && /DEFAULT/ && !/UND/ {print $$NF}' | \ grep -Eo '[^ ]+@LIBBPF_' | cut -d@ -f1 | sort -u | wc -l)
CMD_TARGETS = $(LIB_TARGET) $(PC_FILE) @@@ -197,7 -191,7 +192,7 @@@ $(OUTPUT)libbpf.so.$(LIBBPF_VERSION): $ @ln -sf $(@F) $(OUTPUT)libbpf.so.$(LIBBPF_MAJOR_VERSION)
$(OUTPUT)libbpf.a: $(BPF_IN_STATIC) - $(QUIET_LINK)$(RM) $@; $(AR) rcs $@ $^ + $(QUIET_LINK)$(RM) -f $@; $(AR) rcs $@ $^
$(OUTPUT)libbpf.pc: $(QUIET_GEN)sed -e "s|@PREFIX@|$(prefix)|" \ @@@ -220,7 -214,6 +215,7 @@@ check_abi: $(OUTPUT)libbpf.s awk '/GLOBAL/ && /DEFAULT/ && !/UND/ {print $$NF}'| \ sort -u > $(OUTPUT)libbpf_global_syms.tmp; \ readelf --dyn-syms --wide $(OUTPUT)libbpf.so | \ + awk '/GLOBAL/ && /DEFAULT/ && !/UND/ {print $$NF}'| \ grep -Eo '[^ ]+@LIBBPF_' | cut -d@ -f1 | \ sort -u > $(OUTPUT)libbpf_versioned_syms.tmp; \ diff -u $(OUTPUT)libbpf_global_syms.tmp \ @@@ -271,10 -264,10 +266,10 @@@ install: install_lib install_pkgconfig ### Cleaning rules
config-clean: - $(call QUIET_CLEAN, config) + $(call QUIET_CLEAN, feature-detect) $(Q)$(MAKE) -C $(srctree)/tools/build/feature/ clean >/dev/null
- clean: + clean: config-clean $(call QUIET_CLEAN, libbpf) $(RM) -rf $(CMD_TARGETS) \ *~ .*.d .*.cmd LIBBPF-CFLAGS $(BPF_HELPER_DEFS) \ $(SHARED_OBJDIR) $(STATIC_OBJDIR) \ @@@ -301,7 -294,7 +296,7 @@@ cscope cscope -b -q -I $(srctree)/include -f cscope.out
tags: - rm -f TAGS tags + $(RM) -f TAGS tags ls *.c *.h | xargs $(TAGS_PROG) -a
# Declare the contents of the .PHONY variable as phony. We keep that diff --combined tools/lib/bpf/btf.c index 6bdbc389b493,a3d259e614b0..24bb3c65f31d --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@@ -21,9 -21,6 +21,6 @@@ #include "libbpf_internal.h" #include "hashmap.h"
- /* make sure libbpf doesn't use kernel-only integer typedefs */ - #pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64 - #define BTF_MAX_NR_TYPES 0x7fffffffU #define BTF_MAX_STR_OFFSET 0x7fffffffU
@@@ -61,7 -58,7 +58,7 @@@ static int btf_add_type(struct btf *btf expand_by = max(btf->types_size >> 2, 16U); new_size = min(BTF_MAX_NR_TYPES, btf->types_size + expand_by);
- new_types = realloc(btf->types, sizeof(*new_types) * new_size); + new_types = libbpf_reallocarray(btf->types, new_size, sizeof(*new_types)); if (!new_types) return -ENOMEM;
@@@ -659,12 -656,6 +656,12 @@@ struct btf *btf__parse_raw(const char * err = -EIO; goto err_out; } + if (magic == __bswap_16(BTF_MAGIC)) { + /* non-native endian raw BTF */ + pr_warn("non-native BTF endianness is not supported\n"); + err = -LIBBPF_ERRNO__ENDIAN; + goto err_out; + } if (magic != BTF_MAGIC) { /* definitely not a raw BTF */ err = -EPROTO; @@@ -1137,14 -1128,14 +1134,14 @@@ static int btf_ext_setup_line_info(stru return btf_ext_setup_info(btf_ext, ¶m); }
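The new check in btf__parse_raw() above distinguishes byte-swapped BTF from garbage by testing the magic both ways. The same idiom in standalone C, using the real BTF_MAGIC value from linux/btf.h:

    #include <byteswap.h>
    #include <stdint.h>

    #define BTF_MAGIC 0xeB9F    /* from linux/btf.h */

    enum raw_btf_kind { BTF_RAW_NATIVE, BTF_RAW_SWAPPED, BTF_RAW_INVALID };

    /* A raw file whose magic matches only after a byte swap is valid
     * BTF of the other endianness -- worth a dedicated error instead
     * of a generic "not BTF". */
    static enum raw_btf_kind classify_btf_magic(uint16_t magic)
    {
            if (magic == BTF_MAGIC)
                    return BTF_RAW_NATIVE;
            if (magic == bswap_16(BTF_MAGIC))
                    return BTF_RAW_SWAPPED;
            return BTF_RAW_INVALID;
    }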
- static int btf_ext_setup_field_reloc(struct btf_ext *btf_ext) + static int btf_ext_setup_core_relos(struct btf_ext *btf_ext) { struct btf_ext_sec_setup_param param = { - .off = btf_ext->hdr->field_reloc_off, - .len = btf_ext->hdr->field_reloc_len, - .min_rec_size = sizeof(struct bpf_field_reloc), - .ext_info = &btf_ext->field_reloc_info, - .desc = "field_reloc", + .off = btf_ext->hdr->core_relo_off, + .len = btf_ext->hdr->core_relo_len, + .min_rec_size = sizeof(struct bpf_core_relo), + .ext_info = &btf_ext->core_relo_info, + .desc = "core_relo", };
return btf_ext_setup_info(btf_ext, ¶m); @@@ -1223,10 -1214,9 +1220,9 @@@ struct btf_ext *btf_ext__new(__u8 *data if (err) goto done;
- if (btf_ext->hdr->hdr_len < - offsetofend(struct btf_ext_header, field_reloc_len)) + if (btf_ext->hdr->hdr_len < offsetofend(struct btf_ext_header, core_relo_len)) goto done; - err = btf_ext_setup_field_reloc(btf_ext); + err = btf_ext_setup_core_relos(btf_ext); if (err) goto done;
@@@ -1581,7 -1571,7 +1577,7 @@@ static int btf_dedup_hypot_map_add(stru __u32 *new_list;
d->hypot_cap += max((size_t)16, d->hypot_cap / 2); - new_list = realloc(d->hypot_list, sizeof(__u32) * d->hypot_cap); + new_list = libbpf_reallocarray(d->hypot_list, d->hypot_cap, sizeof(__u32)); if (!new_list) return -ENOMEM; d->hypot_list = new_list; @@@ -1877,8 -1867,7 +1873,7 @@@ static int btf_dedup_strings(struct btf struct btf_str_ptr *new_ptrs;
strs.cap += max(strs.cnt / 2, 16U); - new_ptrs = realloc(strs.ptrs, - sizeof(strs.ptrs[0]) * strs.cap); + new_ptrs = libbpf_reallocarray(strs.ptrs, strs.cap, sizeof(strs.ptrs[0])); if (!new_ptrs) { err = -ENOMEM; goto done; @@@ -2963,8 -2952,8 +2958,8 @@@ static int btf_dedup_compact_types(stru d->btf->nr_types = next_type_id - 1; d->btf->types_size = d->btf->nr_types; d->btf->hdr->type_len = p - types_start; - new_types = realloc(d->btf->types, - (1 + d->btf->nr_types) * sizeof(struct btf_type *)); + new_types = libbpf_reallocarray(d->btf->types, (1 + d->btf->nr_types), + sizeof(struct btf_type *)); if (!new_types) return -ENOMEM; d->btf->types = new_types; diff --combined tools/lib/bpf/libbpf.c index 7253b833576c,b688aadf09c5..46d727b45c81 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@@ -44,7 -44,6 +44,6 @@@ #include <sys/vfs.h> #include <sys/utsname.h> #include <sys/resource.h> - #include <tools/libc_compat.h> #include <libelf.h> #include <gelf.h> #include <zlib.h> @@@ -56,9 -55,6 +55,6 @@@ #include "libbpf_internal.h" #include "hashmap.h"
- /* make sure libbpf doesn't use kernel-only integer typedefs */ - #pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64 - #ifndef EM_BPF #define EM_BPF 247 #endif @@@ -67,6 -63,8 +63,8 @@@ #define BPF_FS_MAGIC 0xcafe4a11 #endif
+ #define BPF_INSN_SZ (sizeof(struct bpf_insn)) + /* vsprintf() in __base_pr() uses nonliteral format string. It may break * compilation if user enables corresponding warning. Disable it explicitly. */ @@@ -154,34 -152,35 +152,35 @@@ static void pr_perm_msg(int err ___err; }) #endif
- #ifdef HAVE_LIBELF_MMAP_SUPPORT - # define LIBBPF_ELF_C_READ_MMAP ELF_C_READ_MMAP - #else - # define LIBBPF_ELF_C_READ_MMAP ELF_C_READ - #endif - static inline __u64 ptr_to_u64(const void *ptr) { return (__u64) (unsigned long) ptr; }
- struct bpf_capabilities { + enum kern_feature_id { /* v4.14: kernel support for program & map names. */ - __u32 name:1; + FEAT_PROG_NAME, /* v5.2: kernel support for global data sections. */ - __u32 global_data:1; + FEAT_GLOBAL_DATA, + /* BTF support */ + FEAT_BTF, /* BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO support */ - __u32 btf_func:1; + FEAT_BTF_FUNC, /* BTF_KIND_VAR and BTF_KIND_DATASEC support */ - __u32 btf_datasec:1; - /* BPF_F_MMAPABLE is supported for arrays */ - __u32 array_mmap:1; + FEAT_BTF_DATASEC, /* BTF_FUNC_GLOBAL is supported */ - __u32 btf_func_global:1; + FEAT_BTF_GLOBAL_FUNC, + /* BPF_F_MMAPABLE is supported for arrays */ + FEAT_ARRAY_MMAP, /* kernel support for expected_attach_type in BPF_PROG_LOAD */ - __u32 exp_attach_type:1; + FEAT_EXP_ATTACH_TYPE, + /* bpf_probe_read_{kernel,user}[_str] helpers */ + FEAT_PROBE_READ_KERN, + __FEAT_CNT, };
+ static bool kernel_supports(enum kern_feature_id feat_id); + enum reloc_type { RELO_LD64, RELO_CALL, @@@ -209,6 -208,7 +208,7 @@@ struct bpf_sec_def bool is_exp_attach_type_optional; bool is_attachable; bool is_attach_btf; + bool is_sleepable; attach_fn_t attach_fn; };
@@@ -253,8 -253,6 +253,6 @@@ struct bpf_program __u32 func_info_rec_size; __u32 func_info_cnt;
- struct bpf_capabilities *caps; - void *line_info; __u32 line_info_rec_size; __u32 line_info_cnt; @@@ -403,6 -401,7 +401,7 @@@ struct bpf_object Elf_Data *rodata; Elf_Data *bss; Elf_Data *st_ops_data; + size_t shstrndx; /* section index for section name strings */ size_t strtabidx; struct { GElf_Shdr shdr; @@@ -436,12 -435,18 +435,18 @@@ void *priv; bpf_object_clear_priv_t clear_priv;
- struct bpf_capabilities caps; - char path[]; }; #define obj_elf_valid(o) ((o)->efile.elf)
+ static const char *elf_sym_str(const struct bpf_object *obj, size_t off); + static const char *elf_sec_str(const struct bpf_object *obj, size_t off); + static Elf_Scn *elf_sec_by_idx(const struct bpf_object *obj, size_t idx); + static Elf_Scn *elf_sec_by_name(const struct bpf_object *obj, const char *name); + static int elf_sec_hdr(const struct bpf_object *obj, Elf_Scn *scn, GElf_Shdr *hdr); + static const char *elf_sec_name(const struct bpf_object *obj, Elf_Scn *scn); + static Elf_Data *elf_sec_data(const struct bpf_object *obj, Elf_Scn *scn); + void bpf_program__unload(struct bpf_program *prog) { int i; @@@ -503,7 -508,7 +508,7 @@@ static char *__bpf_program__pin_name(st }
static int - bpf_program__init(void *data, size_t size, char *section_name, int idx, + bpf_program__init(void *data, size_t size, const char *section_name, int idx, struct bpf_program *prog) { const size_t bpf_insn_sz = sizeof(struct bpf_insn); @@@ -552,7 -557,7 +557,7 @@@ errout
static int bpf_object__add_program(struct bpf_object *obj, void *data, size_t size, - char *section_name, int idx) + const char *section_name, int idx) { struct bpf_program prog, *progs; int nr_progs, err; @@@ -561,11 -566,10 +566,10 @@@ if (err) return err;
- prog.caps = &obj->caps; progs = obj->programs; nr_progs = obj->nr_programs;
- progs = reallocarray(progs, nr_progs + 1, sizeof(progs[0])); + progs = libbpf_reallocarray(progs, nr_progs + 1, sizeof(progs[0])); if (!progs) { /* * In this case the original obj->programs @@@ -578,7 -582,7 +582,7 @@@ return -ENOMEM; }
- pr_debug("found program %s\n", prog.section_name); + pr_debug("elf: found program '%s'\n", prog.section_name); obj->programs = progs; obj->nr_programs = nr_progs + 1; prog.obj = obj; @@@ -598,8 -602,7 +602,7 @@@ bpf_object__init_prog_names(struct bpf_
prog = &obj->programs[pi];
- for (si = 0; si < symbols->d_size / sizeof(GElf_Sym) && !name; - si++) { + for (si = 0; si < symbols->d_size / sizeof(GElf_Sym) && !name; si++) { GElf_Sym sym;
if (!gelf_getsym(symbols, si, &sym)) @@@ -609,11 -612,9 +612,9 @@@ if (GELF_ST_BIND(sym.st_info) != STB_GLOBAL) continue;
- name = elf_strptr(obj->efile.elf, - obj->efile.strtabidx, - sym.st_name); + name = elf_sym_str(obj, sym.st_name); if (!name) { - pr_warn("failed to get sym name string for prog %s\n", + pr_warn("prog '%s': failed to get symbol name\n", prog->section_name); return -LIBBPF_ERRNO__LIBELF; } @@@ -623,17 -624,14 +624,14 @@@ name = ".text";
if (!name) { - pr_warn("failed to find sym for prog %s\n", + pr_warn("prog '%s': failed to find program symbol\n", prog->section_name); return -EINVAL; }
prog->name = strdup(name); - if (!prog->name) { - pr_warn("failed to allocate memory for prog sym %s\n", - name); + if (!prog->name) return -ENOMEM; - } }
return 0; @@@ -1066,13 -1064,18 +1064,18 @@@ static void bpf_object__elf_finish(stru obj->efile.obj_buf_sz = 0; }
+ /* if libelf is old and doesn't support mmap(), fall back to read() */ + #ifndef ELF_C_READ_MMAP + #define ELF_C_READ_MMAP ELF_C_READ + #endif + static int bpf_object__elf_init(struct bpf_object *obj) { int err = 0; GElf_Ehdr *ep;
if (obj_elf_valid(obj)) { - pr_warn("elf init: internal error\n"); + pr_warn("elf: init internal error\n"); return -LIBBPF_ERRNO__LIBELF; }
@@@ -1090,31 -1093,44 +1093,44 @@@
err = -errno; cp = libbpf_strerror_r(err, errmsg, sizeof(errmsg)); - pr_warn("failed to open %s: %s\n", obj->path, cp); + pr_warn("elf: failed to open %s: %s\n", obj->path, cp); return err; }
- obj->efile.elf = elf_begin(obj->efile.fd, - LIBBPF_ELF_C_READ_MMAP, NULL); + obj->efile.elf = elf_begin(obj->efile.fd, ELF_C_READ_MMAP, NULL); }
if (!obj->efile.elf) { - pr_warn("failed to open %s as ELF file\n", obj->path); + pr_warn("elf: failed to open %s as ELF file: %s\n", obj->path, elf_errmsg(-1)); err = -LIBBPF_ERRNO__LIBELF; goto errout; }
if (!gelf_getehdr(obj->efile.elf, &obj->efile.ehdr)) { - pr_warn("failed to get EHDR from %s\n", obj->path); + pr_warn("elf: failed to get ELF header from %s: %s\n", obj->path, elf_errmsg(-1)); err = -LIBBPF_ERRNO__FORMAT; goto errout; } ep = &obj->efile.ehdr;
+ if (elf_getshdrstrndx(obj->efile.elf, &obj->efile.shstrndx)) { + pr_warn("elf: failed to get section names section index for %s: %s\n", + obj->path, elf_errmsg(-1)); + err = -LIBBPF_ERRNO__FORMAT; + goto errout; + } + + /* Elf is corrupted/truncated, avoid calling elf_strptr. */ + if (!elf_rawdata(elf_getscn(obj->efile.elf, obj->efile.shstrndx), NULL)) { + pr_warn("elf: failed to get section names strings from %s: %s\n", + obj->path, elf_errmsg(-1)); + return -LIBBPF_ERRNO__FORMAT; + } + /* Old LLVM set e_machine to EM_NONE */ if (ep->e_type != ET_REL || (ep->e_machine && ep->e_machine != EM_BPF)) { - pr_warn("%s is not an eBPF object file\n", obj->path); + pr_warn("elf: %s is not a valid eBPF object file\n", obj->path); err = -LIBBPF_ERRNO__FORMAT; goto errout; } @@@ -1136,7 -1152,7 +1152,7 @@@ static int bpf_object__check_endianness #else # error "Unrecognized __BYTE_ORDER__" #endif - pr_warn("endianness mismatch.\n"); + pr_warn("elf: endianness mismatch in %s.\n", obj->path); return -LIBBPF_ERRNO__ENDIAN; }
@@@ -1171,55 -1187,10 +1187,10 @@@ static bool bpf_map_type__is_map_in_map return false; }
- static int bpf_object_search_section_size(const struct bpf_object *obj, - const char *name, size_t *d_size) - { - const GElf_Ehdr *ep = &obj->efile.ehdr; - Elf *elf = obj->efile.elf; - Elf_Scn *scn = NULL; - int idx = 0; - - while ((scn = elf_nextscn(elf, scn)) != NULL) { - const char *sec_name; - Elf_Data *data; - GElf_Shdr sh; - - idx++; - if (gelf_getshdr(scn, &sh) != &sh) { - pr_warn("failed to get section(%d) header from %s\n", - idx, obj->path); - return -EIO; - } - - sec_name = elf_strptr(elf, ep->e_shstrndx, sh.sh_name); - if (!sec_name) { - pr_warn("failed to get section(%d) name from %s\n", - idx, obj->path); - return -EIO; - } - - if (strcmp(name, sec_name)) - continue; - - data = elf_getdata(scn, 0); - if (!data) { - pr_warn("failed to get section(%d) data from %s(%s)\n", - idx, name, obj->path); - return -EIO; - } - - *d_size = data->d_size; - return 0; - } - - return -ENOENT; - } - int bpf_object__section_size(const struct bpf_object *obj, const char *name, __u32 *size) { int ret = -ENOENT; - size_t d_size;
*size = 0; if (!name) { @@@ -1237,9 -1208,13 +1208,13 @@@ if (obj->efile.st_ops_data) *size = obj->efile.st_ops_data->d_size; } else { - ret = bpf_object_search_section_size(obj, name, &d_size); - if (!ret) - *size = d_size; + Elf_Scn *scn = elf_sec_by_name(obj, name); + Elf_Data *data = elf_sec_data(obj, scn); + + if (data) { + ret = 0; /* found it */ + *size = data->d_size; + } }
return *size ? 0 : ret; @@@ -1264,8 -1239,7 +1239,7 @@@ int bpf_object__variable_offset(const s GELF_ST_TYPE(sym.st_info) != STT_OBJECT) continue;
- sname = elf_strptr(obj->efile.elf, obj->efile.strtabidx, - sym.st_name); + sname = elf_sym_str(obj, sym.st_name); if (!sname) { pr_warn("failed to get sym name string for var %s\n", name); @@@ -1290,7 -1264,7 +1264,7 @@@ static struct bpf_map *bpf_object__add_ return &obj->maps[obj->nr_maps++];
new_cap = max((size_t)4, obj->maps_cap * 3 / 2); - new_maps = realloc(obj->maps, new_cap * sizeof(*obj->maps)); + new_maps = libbpf_reallocarray(obj->maps, new_cap, sizeof(*obj->maps)); if (!new_maps) { pr_warn("alloc maps for object failed\n"); return ERR_PTR(-ENOMEM); @@@ -1742,12 -1716,12 +1716,12 @@@ static int bpf_object__init_user_maps(s if (!symbols) return -EINVAL;
- scn = elf_getscn(obj->efile.elf, obj->efile.maps_shndx); - if (scn) - data = elf_getdata(scn, NULL); + + scn = elf_sec_by_idx(obj, obj->efile.maps_shndx); + data = elf_sec_data(obj, scn); if (!scn || !data) { - pr_warn("failed to get Elf_Data from map section %d\n", - obj->efile.maps_shndx); + pr_warn("elf: failed to get legacy map definitions for %s\n", + obj->path); return -EINVAL; }
@@@ -1769,12 -1743,12 +1743,12 @@@ nr_maps++; } /* Assume equally sized map definitions */ - pr_debug("maps in %s: %d maps in %zd bytes\n", - obj->path, nr_maps, data->d_size); + pr_debug("elf: found %d legacy map definitions (%zd bytes) in %s\n", + nr_maps, data->d_size, obj->path);
if (!data->d_size || nr_maps == 0 || (data->d_size % nr_maps) != 0) { - pr_warn("unable to determine map definition size section %s, %d maps in %zd bytes\n", - obj->path, nr_maps, data->d_size); + pr_warn("elf: unable to determine legacy map definition size in %s\n", + obj->path); return -EINVAL; } map_def_sz = data->d_size / nr_maps; @@@ -1795,8 -1769,7 +1769,7 @@@ if (IS_ERR(map)) return PTR_ERR(map);
- map_name = elf_strptr(obj->efile.elf, obj->efile.strtabidx, - sym.st_name); + map_name = elf_sym_str(obj, sym.st_name); if (!map_name) { pr_warn("failed to get map #%d name sym string for obj %s\n", i, obj->path); @@@ -1884,6 -1857,29 +1857,29 @@@ resolve_func_ptr(const struct btf *btf return btf_is_func_proto(t) ? t : NULL; }
+ static const char *btf_kind_str(const struct btf_type *t) + { + switch (btf_kind(t)) { + case BTF_KIND_UNKN: return "void"; + case BTF_KIND_INT: return "int"; + case BTF_KIND_PTR: return "ptr"; + case BTF_KIND_ARRAY: return "array"; + case BTF_KIND_STRUCT: return "struct"; + case BTF_KIND_UNION: return "union"; + case BTF_KIND_ENUM: return "enum"; + case BTF_KIND_FWD: return "fwd"; + case BTF_KIND_TYPEDEF: return "typedef"; + case BTF_KIND_VOLATILE: return "volatile"; + case BTF_KIND_CONST: return "const"; + case BTF_KIND_RESTRICT: return "restrict"; + case BTF_KIND_FUNC: return "func"; + case BTF_KIND_FUNC_PROTO: return "func_proto"; + case BTF_KIND_VAR: return "var"; + case BTF_KIND_DATASEC: return "datasec"; + default: return "unknown"; + } + } + /* * Fetch integer attribute of BTF map definition. Such attributes are * represented using a pointer to an array, in which dimensionality of array @@@ -1900,8 -1896,8 +1896,8 @@@ static bool get_map_field_int(const cha const struct btf_type *arr_t;
if (!btf_is_ptr(t)) { - pr_warn("map '%s': attr '%s': expected PTR, got %u.\n", - map_name, name, btf_kind(t)); + pr_warn("map '%s': attr '%s': expected PTR, got %s.\n", + map_name, name, btf_kind_str(t)); return false; }
@@@ -1912,8 -1908,8 +1908,8 @@@ return false; } if (!btf_is_array(arr_t)) { - pr_warn("map '%s': attr '%s': expected ARRAY, got %u.\n", - map_name, name, btf_kind(arr_t)); + pr_warn("map '%s': attr '%s': expected ARRAY, got %s.\n", + map_name, name, btf_kind_str(arr_t)); return false; } arr_info = btf_array(arr_t); @@@ -1924,7 -1920,7 +1920,7 @@@ static int build_map_pin_path(struct bpf_map *map, const char *path) { char buf[PATH_MAX]; - int err, len; + int len;
if (!path) path = "/sys/fs/bpf"; @@@ -1935,11 -1931,7 +1931,7 @@@ else if (len >= PATH_MAX) return -ENAMETOOLONG;
- err = bpf_map__set_pin_path(map, buf); - if (err) - return err; - - return 0; + return bpf_map__set_pin_path(map, buf); }
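build_map_pin_path() now returns bpf_map__set_pin_path() directly; the length checks above it are the usual snprintf() truncation guard. A self-contained sketch of that idiom (hypothetical helper name):

    #include <errno.h>
    #include <stdio.h>

    /* A negative snprintf() return is an output error; a return of
     * bufsize or more means the result was silently truncated. */
    static int join_pin_path(char *buf, size_t bufsize,
                             const char *dir, const char *name)
    {
            int len = snprintf(buf, bufsize, "%s/%s", dir, name);

            if (len < 0)
                    return -EINVAL;
            if ((size_t)len >= bufsize)
                    return -ENAMETOOLONG;
            return 0;
    }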
@@@ -2007,8 -1999,8 +1999,8 @@@ static int parse_btf_map_def(struct bpf return -EINVAL; } if (!btf_is_ptr(t)) { - pr_warn("map '%s': key spec is not PTR: %u.\n", - map->name, btf_kind(t)); + pr_warn("map '%s': key spec is not PTR: %s.\n", + map->name, btf_kind_str(t)); return -EINVAL; } sz = btf__resolve_size(obj->btf, t->type); @@@ -2049,8 -2041,8 +2041,8 @@@ return -EINVAL; } if (!btf_is_ptr(t)) { - pr_warn("map '%s': value spec is not PTR: %u.\n", - map->name, btf_kind(t)); + pr_warn("map '%s': value spec is not PTR: %s.\n", + map->name, btf_kind_str(t)); return -EINVAL; } sz = btf__resolve_size(obj->btf, t->type); @@@ -2107,14 -2099,14 +2099,14 @@@ t = skip_mods_and_typedefs(obj->btf, btf_array(t)->type, NULL); if (!btf_is_ptr(t)) { - pr_warn("map '%s': map-in-map inner def is of unexpected kind %u.\n", - map->name, btf_kind(t)); + pr_warn("map '%s': map-in-map inner def is of unexpected kind %s.\n", + map->name, btf_kind_str(t)); return -EINVAL; } t = skip_mods_and_typedefs(obj->btf, t->type, NULL); if (!btf_is_struct(t)) { - pr_warn("map '%s': map-in-map inner def is of unexpected kind %u.\n", - map->name, btf_kind(t)); + pr_warn("map '%s': map-in-map inner def is of unexpected kind %s.\n", + map->name, btf_kind_str(t)); return -EINVAL; }
@@@ -2205,8 -2197,8 +2197,8 @@@ static int bpf_object__init_user_btf_ma return -EINVAL; } if (!btf_is_var(var)) { - pr_warn("map '%s': unexpected var kind %u.\n", - map_name, btf_kind(var)); + pr_warn("map '%s': unexpected var kind %s.\n", + map_name, btf_kind_str(var)); return -EINVAL; } if (var_extra->linkage != BTF_VAR_GLOBAL_ALLOCATED && @@@ -2218,8 -2210,8 +2210,8 @@@
def = skip_mods_and_typedefs(obj->btf, var->type, NULL); if (!btf_is_struct(def)) { - pr_warn("map '%s': unexpected def kind %u.\n", - map_name, btf_kind(var)); + pr_warn("map '%s': unexpected def kind %s.\n", + map_name, btf_kind_str(var)); return -EINVAL; } if (def->size > vi->size) { @@@ -2259,12 -2251,11 +2251,11 @@@ static int bpf_object__init_user_btf_ma if (obj->efile.btf_maps_shndx < 0) return 0;
- scn = elf_getscn(obj->efile.elf, obj->efile.btf_maps_shndx); - if (scn) - data = elf_getdata(scn, NULL); + scn = elf_sec_by_idx(obj, obj->efile.btf_maps_shndx); + data = elf_sec_data(obj, scn); if (!scn || !data) { - pr_warn("failed to get Elf_Data from map section %d (%s)\n", - obj->efile.btf_maps_shndx, MAPS_ELF_SEC); + pr_warn("elf: failed to get %s map definitions for %s\n", + MAPS_ELF_SEC, obj->path); return -EINVAL; }
@@@ -2322,36 -2313,28 +2313,28 @@@ static int bpf_object__init_maps(struc
static bool section_have_execinstr(struct bpf_object *obj, int idx) { - Elf_Scn *scn; GElf_Shdr sh;
- scn = elf_getscn(obj->efile.elf, idx); - if (!scn) - return false; - - if (gelf_getshdr(scn, &sh) != &sh) + if (elf_sec_hdr(obj, elf_sec_by_idx(obj, idx), &sh)) return false;
- if (sh.sh_flags & SHF_EXECINSTR) - return true; - - return false; + return sh.sh_flags & SHF_EXECINSTR; }
static bool btf_needs_sanitization(struct bpf_object *obj) { - bool has_func_global = obj->caps.btf_func_global; - bool has_datasec = obj->caps.btf_datasec; - bool has_func = obj->caps.btf_func; + bool has_func_global = kernel_supports(FEAT_BTF_GLOBAL_FUNC); + bool has_datasec = kernel_supports(FEAT_BTF_DATASEC); + bool has_func = kernel_supports(FEAT_BTF_FUNC);
return !has_func || !has_datasec || !has_func_global; }
static void bpf_object__sanitize_btf(struct bpf_object *obj, struct btf *btf) { - bool has_func_global = obj->caps.btf_func_global; - bool has_datasec = obj->caps.btf_datasec; - bool has_func = obj->caps.btf_func; + bool has_func_global = kernel_supports(FEAT_BTF_GLOBAL_FUNC); + bool has_datasec = kernel_supports(FEAT_BTF_DATASEC); + bool has_func = kernel_supports(FEAT_BTF_FUNC); struct btf_type *t; int i, j, vlen;
@@@ -2499,7 -2482,7 +2482,7 @@@ static int bpf_object__load_vmlinux_btf int err;
/* CO-RE relocations need kernel BTF */ - if (obj->btf_ext && obj->btf_ext->field_reloc_info.len) + if (obj->btf_ext && obj->btf_ext->core_relo_info.len) need_vmlinux_btf = true;
bpf_object__for_each_program(prog, obj) { @@@ -2533,6 -2516,15 +2516,15 @@@ static int bpf_object__sanitize_and_loa if (!obj->btf) return 0;
+ if (!kernel_supports(FEAT_BTF)) { + if (kernel_needs_btf(obj)) { + err = -EOPNOTSUPP; + goto report; + } + pr_debug("Kernel doesn't support BTF, skipping uploading it.\n"); + return 0; + } + sanitize = btf_needs_sanitization(obj); if (sanitize) { const void *raw_data; @@@ -2558,6 -2550,7 +2550,7 @@@ } btf__free(kern_btf); } + report: if (err) { btf_mandatory = kernel_needs_btf(obj); pr_warn("Error loading .BTF into kernel: %d. %s\n", err, @@@ -2569,61 -2562,199 +2562,199 @@@ return err; }
+ static const char *elf_sym_str(const struct bpf_object *obj, size_t off) + { + const char *name; + + name = elf_strptr(obj->efile.elf, obj->efile.strtabidx, off); + if (!name) { + pr_warn("elf: failed to get section name string at offset %zu from %s: %s\n", + off, obj->path, elf_errmsg(-1)); + return NULL; + } + + return name; + } + + static const char *elf_sec_str(const struct bpf_object *obj, size_t off) + { + const char *name; + + name = elf_strptr(obj->efile.elf, obj->efile.shstrndx, off); + if (!name) { + pr_warn("elf: failed to get section name string at offset %zu from %s: %s\n", + off, obj->path, elf_errmsg(-1)); + return NULL; + } + + return name; + } + + static Elf_Scn *elf_sec_by_idx(const struct bpf_object *obj, size_t idx) + { + Elf_Scn *scn; + + scn = elf_getscn(obj->efile.elf, idx); + if (!scn) { + pr_warn("elf: failed to get section(%zu) from %s: %s\n", + idx, obj->path, elf_errmsg(-1)); + return NULL; + } + return scn; + } + + static Elf_Scn *elf_sec_by_name(const struct bpf_object *obj, const char *name) + { + Elf_Scn *scn = NULL; + Elf *elf = obj->efile.elf; + const char *sec_name; + + while ((scn = elf_nextscn(elf, scn)) != NULL) { + sec_name = elf_sec_name(obj, scn); + if (!sec_name) + return NULL; + + if (strcmp(sec_name, name) != 0) + continue; + + return scn; + } + return NULL; + } + + static int elf_sec_hdr(const struct bpf_object *obj, Elf_Scn *scn, GElf_Shdr *hdr) + { + if (!scn) + return -EINVAL; + + if (gelf_getshdr(scn, hdr) != hdr) { + pr_warn("elf: failed to get section(%zu) header from %s: %s\n", + elf_ndxscn(scn), obj->path, elf_errmsg(-1)); + return -EINVAL; + } + + return 0; + } + + static const char *elf_sec_name(const struct bpf_object *obj, Elf_Scn *scn) + { + const char *name; + GElf_Shdr sh; + + if (!scn) + return NULL; + + if (elf_sec_hdr(obj, scn, &sh)) + return NULL; + + name = elf_sec_str(obj, sh.sh_name); + if (!name) { + pr_warn("elf: failed to get section(%zu) name from %s: %s\n", + elf_ndxscn(scn), obj->path, elf_errmsg(-1)); + return NULL; + } + + return name; + } + + static Elf_Data *elf_sec_data(const struct bpf_object *obj, Elf_Scn *scn) + { + Elf_Data *data; + + if (!scn) + return NULL; + + data = elf_getdata(scn, 0); + if (!data) { + pr_warn("elf: failed to get section(%zu) %s data from %s: %s\n", + elf_ndxscn(scn), elf_sec_name(obj, scn) ?: "<?>", + obj->path, elf_errmsg(-1)); + return NULL; + } + + return data; + } + + static bool is_sec_name_dwarf(const char *name) + { + /* approximation, but the actual list is too long */ + return strncmp(name, ".debug_", sizeof(".debug_") - 1) == 0; + } + + static bool ignore_elf_section(GElf_Shdr *hdr, const char *name) + { + /* no special handling of .strtab */ + if (hdr->sh_type == SHT_STRTAB) + return true; + + /* ignore .llvm_addrsig section as well */ + if (hdr->sh_type == 0x6FFF4C03 /* SHT_LLVM_ADDRSIG */) + return true; + + /* no subprograms will lead to an empty .text section, ignore it */ + if (hdr->sh_type == SHT_PROGBITS && hdr->sh_size == 0 && + strcmp(name, ".text") == 0) + return true; + + /* DWARF sections */ + if (is_sec_name_dwarf(name)) + return true; + + if (strncmp(name, ".rel", sizeof(".rel") - 1) == 0) { + name += sizeof(".rel") - 1; + /* DWARF section relocations */ + if (is_sec_name_dwarf(name)) + return true; + + /* .BTF and .BTF.ext don't need relocations */ + if (strcmp(name, BTF_ELF_SEC) == 0 || + strcmp(name, BTF_EXT_ELF_SEC) == 0) + return true; + } + + return false; + } + static int bpf_object__elf_collect(struct bpf_object *obj) { Elf *elf = obj->efile.elf; - 
GElf_Ehdr *ep = &obj->efile.ehdr; Elf_Data *btf_ext_data = NULL; Elf_Data *btf_data = NULL; Elf_Scn *scn = NULL; int idx = 0, err = 0;
- /* Elf is corrupted/truncated, avoid calling elf_strptr. */ - if (!elf_rawdata(elf_getscn(elf, ep->e_shstrndx), NULL)) { - pr_warn("failed to get e_shstrndx from %s\n", obj->path); - return -LIBBPF_ERRNO__FORMAT; - } - while ((scn = elf_nextscn(elf, scn)) != NULL) { - char *name; + const char *name; GElf_Shdr sh; Elf_Data *data;
idx++; - if (gelf_getshdr(scn, &sh) != &sh) { - pr_warn("failed to get section(%d) header from %s\n", - idx, obj->path); + + if (elf_sec_hdr(obj, scn, &sh)) return -LIBBPF_ERRNO__FORMAT; - }
- name = elf_strptr(elf, ep->e_shstrndx, sh.sh_name); - if (!name) { - pr_warn("failed to get section(%d) name from %s\n", - idx, obj->path); + name = elf_sec_str(obj, sh.sh_name); + if (!name) return -LIBBPF_ERRNO__FORMAT; - }
- data = elf_getdata(scn, 0); - if (!data) { - pr_warn("failed to get section(%d) data from %s(%s)\n", - idx, name, obj->path); + if (ignore_elf_section(&sh, name)) + continue; + + data = elf_sec_data(obj, scn); + if (!data) return -LIBBPF_ERRNO__FORMAT; - } - pr_debug("section(%d) %s, size %ld, link %d, flags %lx, type=%d\n", + + pr_debug("elf: section(%d) %s, size %ld, link %d, flags %lx, type=%d\n", idx, name, (unsigned long)data->d_size, (int)sh.sh_link, (unsigned long)sh.sh_flags, (int)sh.sh_type);
if (strcmp(name, "license") == 0) { - err = bpf_object__init_license(obj, - data->d_buf, - data->d_size); + err = bpf_object__init_license(obj, data->d_buf, data->d_size); if (err) return err; } else if (strcmp(name, "version") == 0) { - err = bpf_object__init_kversion(obj, - data->d_buf, - data->d_size); + err = bpf_object__init_kversion(obj, data->d_buf, data->d_size); if (err) return err; } else if (strcmp(name, "maps") == 0) { @@@ -2636,8 -2767,7 +2767,7 @@@ btf_ext_data = data; } else if (sh.sh_type == SHT_SYMTAB) { if (obj->efile.symbols) { - pr_warn("bpf: multiple SYMTAB in %s\n", - obj->path); + pr_warn("elf: multiple symbol tables in %s\n", obj->path); return -LIBBPF_ERRNO__FORMAT; } obj->efile.symbols = data; @@@ -2650,16 -2780,8 +2780,8 @@@ err = bpf_object__add_program(obj, data->d_buf, data->d_size, name, idx); - if (err) { - char errmsg[STRERR_BUFSIZE]; - char *cp; - - cp = libbpf_strerror_r(-err, errmsg, - sizeof(errmsg)); - pr_warn("failed to alloc program %s (%s): %s", - name, obj->path, cp); + if (err) return err; - } } else if (strcmp(name, DATA_SEC) == 0) { obj->efile.data = data; obj->efile.data_shndx = idx; @@@ -2670,7 -2792,8 +2792,8 @@@ obj->efile.st_ops_data = data; obj->efile.st_ops_shndx = idx; } else { - pr_debug("skip section(%d) %s\n", idx, name); + pr_info("elf: skipping unrecognized data section(%d) %s\n", + idx, name); } } else if (sh.sh_type == SHT_REL) { int nr_sects = obj->efile.nr_reloc_sects; @@@ -2681,34 -2804,33 +2804,33 @@@ if (!section_have_execinstr(obj, sec) && strcmp(name, ".rel" STRUCT_OPS_SEC) && strcmp(name, ".rel" MAPS_ELF_SEC)) { - pr_debug("skip relo %s(%d) for section(%d)\n", - name, idx, sec); + pr_info("elf: skipping relo section(%d) %s for section(%d) %s\n", + idx, name, sec, + elf_sec_name(obj, elf_sec_by_idx(obj, sec)) ?: "<?>"); continue; }
- sects = reallocarray(sects, nr_sects + 1, - sizeof(*obj->efile.reloc_sects)); - if (!sects) { - pr_warn("reloc_sects realloc failed\n"); + sects = libbpf_reallocarray(sects, nr_sects + 1, + sizeof(*obj->efile.reloc_sects)); + if (!sects) return -ENOMEM; - }
obj->efile.reloc_sects = sects; obj->efile.nr_reloc_sects++;
obj->efile.reloc_sects[nr_sects].shdr = sh; obj->efile.reloc_sects[nr_sects].data = data; - } else if (sh.sh_type == SHT_NOBITS && - strcmp(name, BSS_SEC) == 0) { + } else if (sh.sh_type == SHT_NOBITS && strcmp(name, BSS_SEC) == 0) { obj->efile.bss = data; obj->efile.bss_shndx = idx; } else { - pr_debug("skip section(%d) %s\n", idx, name); + pr_info("elf: skipping section(%d) %s (size %zu)\n", idx, name, + (size_t)sh.sh_size); } }
if (!obj->efile.strtabidx || obj->efile.strtabidx > idx) { - pr_warn("Corrupted ELF file: index of strtab invalid\n"); + pr_warn("elf: symbol strings section missing or invalid in %s\n", obj->path); return -LIBBPF_ERRNO__FORMAT; } return bpf_object__init_btf(obj, btf_data, btf_ext_data); @@@ -2869,14 -2991,13 +2991,13 @@@ static int bpf_object__collect_externs( if (!obj->efile.symbols) return 0;
- scn = elf_getscn(obj->efile.elf, obj->efile.symbols_shndx); - if (!scn) - return -LIBBPF_ERRNO__FORMAT; - if (gelf_getshdr(scn, &sh) != &sh) + scn = elf_sec_by_idx(obj, obj->efile.symbols_shndx); + if (elf_sec_hdr(obj, scn, &sh)) return -LIBBPF_ERRNO__FORMAT; - n = sh.sh_size / sh.sh_entsize;
+ n = sh.sh_size / sh.sh_entsize; pr_debug("looking for externs among %d symbols...\n", n); + for (i = 0; i < n; i++) { GElf_Sym sym;
@@@ -2884,13 -3005,12 +3005,12 @@@ return -LIBBPF_ERRNO__FORMAT; if (!sym_is_extern(&sym)) continue; - ext_name = elf_strptr(obj->efile.elf, obj->efile.strtabidx, - sym.st_name); + ext_name = elf_sym_str(obj, sym.st_name); if (!ext_name || !ext_name[0]) continue;
ext = obj->externs; - ext = reallocarray(ext, obj->nr_extern + 1, sizeof(*ext)); + ext = libbpf_reallocarray(ext, obj->nr_extern + 1, sizeof(*ext)); if (!ext) return -ENOMEM; obj->externs = ext; @@@ -3109,7 -3229,7 +3229,7 @@@ bpf_object__section_to_libbpf_map_type(
static int bpf_program__record_reloc(struct bpf_program *prog, struct reloc_desc *reloc_desc, - __u32 insn_idx, const char *name, + __u32 insn_idx, const char *sym_name, const GElf_Sym *sym, const GElf_Rel *rel) { struct bpf_insn *insn = &prog->insns[insn_idx]; @@@ -3117,22 -3237,25 +3237,25 @@@ struct bpf_object *obj = prog->obj; __u32 shdr_idx = sym->st_shndx; enum libbpf_map_type type; + const char *sym_sec_name; struct bpf_map *map;
/* sub-program call relocation */ if (insn->code == (BPF_JMP | BPF_CALL)) { if (insn->src_reg != BPF_PSEUDO_CALL) { - pr_warn("incorrect bpf_call opcode\n"); + pr_warn("prog '%s': incorrect bpf_call opcode\n", prog->name); return -LIBBPF_ERRNO__RELOC; } /* text_shndx can be 0, if no default "main" program exists */ if (!shdr_idx || shdr_idx != obj->efile.text_shndx) { - pr_warn("bad call relo against section %u\n", shdr_idx); + sym_sec_name = elf_sec_name(obj, elf_sec_by_idx(obj, shdr_idx)); + pr_warn("prog '%s': bad call relo against '%s' in section '%s'\n", + prog->name, sym_name, sym_sec_name); return -LIBBPF_ERRNO__RELOC; } - if (sym->st_value % 8) { - pr_warn("bad call relo offset: %zu\n", - (size_t)sym->st_value); + if (sym->st_value % BPF_INSN_SZ) { + pr_warn("prog '%s': bad call relo against '%s' at offset %zu\n", + prog->name, sym_name, (size_t)sym->st_value); return -LIBBPF_ERRNO__RELOC; } reloc_desc->type = RELO_CALL; @@@ -3143,8 -3266,8 +3266,8 @@@ }
if (insn->code != (BPF_LD | BPF_IMM | BPF_DW)) { - pr_warn("invalid relo for insns[%d].code 0x%x\n", - insn_idx, insn->code); + pr_warn("prog '%s': invalid relo against '%s' for insns[%d].code 0x%x\n", + prog->name, sym_name, insn_idx, insn->code); return -LIBBPF_ERRNO__RELOC; }
@@@ -3159,12 -3282,12 +3282,12 @@@ break; } if (i >= n) { - pr_warn("extern relo failed to find extern for sym %d\n", - sym_idx); + pr_warn("prog '%s': extern relo failed to find extern for '%s' (%d)\n", + prog->name, sym_name, sym_idx); return -LIBBPF_ERRNO__RELOC; } - pr_debug("found extern #%d '%s' (sym %d) for insn %u\n", - i, ext->name, ext->sym_idx, insn_idx); + pr_debug("prog '%s': found extern #%d '%s' (sym %d) for insn #%u\n", + prog->name, i, ext->name, ext->sym_idx, insn_idx); reloc_desc->type = RELO_EXTERN; reloc_desc->insn_idx = insn_idx; reloc_desc->sym_off = i; /* sym_off stores extern index */ @@@ -3172,18 -3295,19 +3295,19 @@@ }
if (!shdr_idx || shdr_idx >= SHN_LORESERVE) { - pr_warn("invalid relo for '%s' in special section 0x%x; forgot to initialize global var?..\n", - name, shdr_idx); + pr_warn("prog '%s': invalid relo against '%s' in special section 0x%x; forgot to initialize global var?..\n", + prog->name, sym_name, shdr_idx); return -LIBBPF_ERRNO__RELOC; }
type = bpf_object__section_to_libbpf_map_type(obj, shdr_idx); + sym_sec_name = elf_sec_name(obj, elf_sec_by_idx(obj, shdr_idx));
/* generic map reference relocation */ if (type == LIBBPF_MAP_UNSPEC) { if (!bpf_object__shndx_is_maps(obj, shdr_idx)) { - pr_warn("bad map relo against section %u\n", - shdr_idx); + pr_warn("prog '%s': bad map relo against '%s' in section '%s'\n", + prog->name, sym_name, sym_sec_name); return -LIBBPF_ERRNO__RELOC; } for (map_idx = 0; map_idx < nr_maps; map_idx++) { @@@ -3192,14 -3316,14 +3316,14 @@@ map->sec_idx != sym->st_shndx || map->sec_offset != sym->st_value) continue; - pr_debug("found map %zd (%s, sec %d, off %zu) for insn %u\n", - map_idx, map->name, map->sec_idx, + pr_debug("prog '%s': found map %zd (%s, sec %d, off %zu) for insn #%u\n", + prog->name, map_idx, map->name, map->sec_idx, map->sec_offset, insn_idx); break; } if (map_idx >= nr_maps) { - pr_warn("map relo failed to find map for sec %u, off %zu\n", - shdr_idx, (size_t)sym->st_value); + pr_warn("prog '%s': map relo failed to find map for section '%s', off %zu\n", + prog->name, sym_sec_name, (size_t)sym->st_value); return -LIBBPF_ERRNO__RELOC; } reloc_desc->type = RELO_LD64; @@@ -3211,21 -3335,22 +3335,22 @@@
/* global data map relocation */ if (!bpf_object__shndx_is_data(obj, shdr_idx)) { - pr_warn("bad data relo against section %u\n", shdr_idx); + pr_warn("prog '%s': bad data relo against section '%s'\n", + prog->name, sym_sec_name); return -LIBBPF_ERRNO__RELOC; } for (map_idx = 0; map_idx < nr_maps; map_idx++) { map = &obj->maps[map_idx]; if (map->libbpf_type != type) continue; - pr_debug("found data map %zd (%s, sec %d, off %zu) for insn %u\n", - map_idx, map->name, map->sec_idx, map->sec_offset, - insn_idx); + pr_debug("prog '%s': found data map %zd (%s, sec %d, off %zu) for insn %u\n", + prog->name, map_idx, map->name, map->sec_idx, + map->sec_offset, insn_idx); break; } if (map_idx >= nr_maps) { - pr_warn("data relo failed to find map for sec %u\n", - shdr_idx); + pr_warn("prog '%s': data relo failed to find map for section '%s'\n", + prog->name, sym_sec_name); return -LIBBPF_ERRNO__RELOC; }
@@@ -3241,9 -3366,17 +3366,17 @@@ bpf_program__collect_reloc(struct bpf_p Elf_Data *data, struct bpf_object *obj) { Elf_Data *symbols = obj->efile.symbols; + const char *relo_sec_name, *sec_name; + size_t sec_idx = shdr->sh_info; int err, i, nrels;
- pr_debug("collecting relocating info for: '%s'\n", prog->section_name); + relo_sec_name = elf_sec_str(obj, shdr->sh_name); + sec_name = elf_sec_name(obj, elf_sec_by_idx(obj, sec_idx)); + if (!relo_sec_name || !sec_name) + return -EINVAL; + + pr_debug("sec '%s': collecting relocation for section(%zu) '%s'\n", + relo_sec_name, sec_idx, sec_name); nrels = shdr->sh_size / shdr->sh_entsize;
prog->reloc_desc = malloc(sizeof(*prog->reloc_desc) * nrels); @@@ -3254,35 -3387,34 +3387,34 @@@ prog->nr_reloc = nrels;
for (i = 0; i < nrels; i++) { - const char *name; + const char *sym_name; __u32 insn_idx; GElf_Sym sym; GElf_Rel rel;
if (!gelf_getrel(data, i, &rel)) {
- pr_warn("relocation: failed to get %d reloc\n", i);
+ pr_warn("sec '%s': failed to get relo #%d\n", relo_sec_name, i);
return -LIBBPF_ERRNO__FORMAT;
}
if (!gelf_getsym(symbols, GELF_R_SYM(rel.r_info), &sym)) {
- pr_warn("relocation: symbol %"PRIx64" not found\n",
- GELF_R_SYM(rel.r_info));
+ pr_warn("sec '%s': symbol 0x%zx not found for relo #%d\n",
+ relo_sec_name, (size_t)GELF_R_SYM(rel.r_info), i);
return -LIBBPF_ERRNO__FORMAT;
}
- if (rel.r_offset % sizeof(struct bpf_insn))
+ if (rel.r_offset % BPF_INSN_SZ) {
+ pr_warn("sec '%s': invalid offset 0x%zx for relo #%d\n",
+ relo_sec_name, (size_t)rel.r_offset, i);
return -LIBBPF_ERRNO__FORMAT;
+ }
- insn_idx = rel.r_offset / sizeof(struct bpf_insn); - name = elf_strptr(obj->efile.elf, obj->efile.strtabidx, - sym.st_name) ? : "<?>"; + insn_idx = rel.r_offset / BPF_INSN_SZ; + sym_name = elf_sym_str(obj, sym.st_name) ?: "<?>";
- pr_debug("relo for shdr %u, symb %zu, value %zu, type %d, bind %d, name %d ('%s'), insn %u\n", - (__u32)sym.st_shndx, (size_t)GELF_R_SYM(rel.r_info), - (size_t)sym.st_value, GELF_ST_TYPE(sym.st_info), - GELF_ST_BIND(sym.st_info), sym.st_name, name, - insn_idx); + pr_debug("sec '%s': relo #%d: insn #%u against '%s'\n", + relo_sec_name, i, insn_idx, sym_name);
err = bpf_program__record_reloc(prog, &prog->reloc_desc[i], - insn_idx, name, &sym, &rel); + insn_idx, sym_name, &sym, &rel); if (err) return err; } @@@ -3433,8 -3565,14 +3565,14 @@@ bpf_object__probe_loading(struct bpf_ob return 0; }
- static int - bpf_object__probe_name(struct bpf_object *obj) + static int probe_fd(int fd) + { + if (fd >= 0) + close(fd); + return fd >= 0; + } + + static int probe_kern_prog_name(void) { struct bpf_load_program_attr attr; struct bpf_insn insns[] = { @@@ -3452,16 -3590,10 +3590,10 @@@ attr.license = "GPL"; attr.name = "test"; ret = bpf_load_program_xattr(&attr, NULL, 0); - if (ret >= 0) { - obj->caps.name = 1; - close(ret); - } - - return 0; + return probe_fd(ret); }
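The probe functions are reworked here from per-object methods that flipped obj->caps bits into standalone helpers that simply report 1/0 through probe_fd(). Under that pattern, adding a detector reduces to "build the smallest program that needs the feature and try to load it". A minimal sketch of a hypothetical extra probe (probe_kern_something is illustrative only; ARRAY_SIZE is assumed from libbpf's internal headers):

  static int probe_kern_something(void)
  {
  	struct bpf_load_program_attr attr;
  	struct bpf_insn insns[] = {
  		BPF_MOV64_IMM(BPF_REG_0, 0),
  		BPF_EXIT_INSN(),
  	};

  	memset(&attr, 0, sizeof(attr));
  	attr.prog_type = BPF_PROG_TYPE_SOCKET_FILTER;
  	attr.insns = insns;
  	attr.insns_cnt = ARRAY_SIZE(insns);
  	attr.license = "GPL";

  	/* probe_fd() closes the fd and maps success to 1, failure to 0 */
  	return probe_fd(bpf_load_program_xattr(&attr, NULL, 0));
  }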
- static int - bpf_object__probe_global_data(struct bpf_object *obj) + static int probe_kern_global_data(void) { struct bpf_load_program_attr prg_attr; struct bpf_create_map_attr map_attr; @@@ -3498,16 -3630,23 +3630,23 @@@ prg_attr.license = "GPL";
ret = bpf_load_program_xattr(&prg_attr, NULL, 0); - if (ret >= 0) { - obj->caps.global_data = 1; - close(ret); - } - close(map); - return 0; + return probe_fd(ret); + } + + static int probe_kern_btf(void) + { + static const char strs[] = "\0int"; + __u32 types[] = { + /* int */ + BTF_TYPE_INT_ENC(1, BTF_INT_SIGNED, 0, 32, 4), + }; + + return probe_fd(libbpf__load_raw_btf((char *)types, sizeof(types), + strs, sizeof(strs))); }
- static int bpf_object__probe_btf_func(struct bpf_object *obj) + static int probe_kern_btf_func(void) { static const char strs[] = "\0int\0x\0a"; /* void x(int a) {} */ @@@ -3520,20 -3659,12 +3659,12 @@@ /* FUNC x */ /* [3] */ BTF_TYPE_ENC(5, BTF_INFO_ENC(BTF_KIND_FUNC, 0, 0), 2), }; - int btf_fd;
- btf_fd = libbpf__load_raw_btf((char *)types, sizeof(types), - strs, sizeof(strs)); - if (btf_fd >= 0) { - obj->caps.btf_func = 1; - close(btf_fd); - return 1; - } - - return 0; + return probe_fd(libbpf__load_raw_btf((char *)types, sizeof(types), + strs, sizeof(strs))); }
- static int bpf_object__probe_btf_func_global(struct bpf_object *obj) + static int probe_kern_btf_func_global(void) { static const char strs[] = "\0int\0x\0a"; /* static void x(int a) {} */ @@@ -3546,20 -3677,12 +3677,12 @@@ /* FUNC x BTF_FUNC_GLOBAL */ /* [3] */ BTF_TYPE_ENC(5, BTF_INFO_ENC(BTF_KIND_FUNC, 0, BTF_FUNC_GLOBAL), 2), }; - int btf_fd;
- btf_fd = libbpf__load_raw_btf((char *)types, sizeof(types), - strs, sizeof(strs)); - if (btf_fd >= 0) { - obj->caps.btf_func_global = 1; - close(btf_fd); - return 1; - } - - return 0; + return probe_fd(libbpf__load_raw_btf((char *)types, sizeof(types), + strs, sizeof(strs))); }
- static int bpf_object__probe_btf_datasec(struct bpf_object *obj) + static int probe_kern_btf_datasec(void) { static const char strs[] = "\0x\0.data"; /* static int a; */ @@@ -3573,20 -3696,12 +3696,12 @@@ BTF_TYPE_ENC(3, BTF_INFO_ENC(BTF_KIND_DATASEC, 0, 1), 4), BTF_VAR_SECINFO_ENC(2, 0, 4), }; - int btf_fd; - - btf_fd = libbpf__load_raw_btf((char *)types, sizeof(types), - strs, sizeof(strs)); - if (btf_fd >= 0) { - obj->caps.btf_datasec = 1; - close(btf_fd); - return 1; - }
- return 0; + return probe_fd(libbpf__load_raw_btf((char *)types, sizeof(types), + strs, sizeof(strs))); }
- static int bpf_object__probe_array_mmap(struct bpf_object *obj) + static int probe_kern_array_mmap(void) { struct bpf_create_map_attr attr = { .map_type = BPF_MAP_TYPE_ARRAY, @@@ -3595,27 -3710,17 +3710,17 @@@ .value_size = sizeof(int), .max_entries = 1, }; - int fd; - - fd = bpf_create_map_xattr(&attr); - if (fd >= 0) { - obj->caps.array_mmap = 1; - close(fd); - return 1; - }
- return 0; + return probe_fd(bpf_create_map_xattr(&attr)); }
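For context on what this probe (together with the BPF_F_MMAPABLE fixup in bpf_object__sanitize_maps() further down) gates: on supporting kernels, user space can map an array map's value area and read or write it without bpf() syscalls. A rough sketch, assuming map_fd refers to a BPF_F_MMAPABLE array whose value area fits in one page:

  #include <sys/mman.h>

  int *val = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
  		MAP_SHARED, map_fd, 0);
  if (val != MAP_FAILED)
  	*val = 42;	/* lands directly in the map, no syscall */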
- static int - bpf_object__probe_exp_attach_type(struct bpf_object *obj) + static int probe_kern_exp_attach_type(void) { struct bpf_load_program_attr attr; struct bpf_insn insns[] = { BPF_MOV64_IMM(BPF_REG_0, 0), BPF_EXIT_INSN(), }; - int fd;
memset(&attr, 0, sizeof(attr)); /* use any valid combination of program type and (optional) @@@ -3629,36 -3734,91 +3734,91 @@@ attr.insns_cnt = ARRAY_SIZE(insns); attr.license = "GPL";
- fd = bpf_load_program_xattr(&attr, NULL, 0); - if (fd >= 0) { - obj->caps.exp_attach_type = 1; - close(fd); - return 1; - } - return 0; + return probe_fd(bpf_load_program_xattr(&attr, NULL, 0)); }
- static int - bpf_object__probe_caps(struct bpf_object *obj) - { - int (*probe_fn[])(struct bpf_object *obj) = { - bpf_object__probe_name, - bpf_object__probe_global_data, - bpf_object__probe_btf_func, - bpf_object__probe_btf_func_global, - bpf_object__probe_btf_datasec, - bpf_object__probe_array_mmap, - bpf_object__probe_exp_attach_type, + static int probe_kern_probe_read_kernel(void) + { + struct bpf_load_program_attr attr; + struct bpf_insn insns[] = { + BPF_MOV64_REG(BPF_REG_1, BPF_REG_10), /* r1 = r10 (fp) */ + BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -8), /* r1 += -8 */ + BPF_MOV64_IMM(BPF_REG_2, 8), /* r2 = 8 */ + BPF_MOV64_IMM(BPF_REG_3, 0), /* r3 = 0 */ + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_probe_read_kernel), + BPF_EXIT_INSN(), }; - int i, ret;
- for (i = 0; i < ARRAY_SIZE(probe_fn); i++) { - ret = probe_fn[i](obj); - if (ret < 0) - pr_debug("Probe #%d failed with %d.\n", i, ret); + memset(&attr, 0, sizeof(attr)); + attr.prog_type = BPF_PROG_TYPE_KPROBE; + attr.insns = insns; + attr.insns_cnt = ARRAY_SIZE(insns); + attr.license = "GPL"; + + return probe_fd(bpf_load_program_xattr(&attr, NULL, 0)); + } + + enum kern_feature_result { + FEAT_UNKNOWN = 0, + FEAT_SUPPORTED = 1, + FEAT_MISSING = 2, + }; + + typedef int (*feature_probe_fn)(void); + + static struct kern_feature_desc { + const char *desc; + feature_probe_fn probe; + enum kern_feature_result res; + } feature_probes[__FEAT_CNT] = { + [FEAT_PROG_NAME] = { + "BPF program name", probe_kern_prog_name, + }, + [FEAT_GLOBAL_DATA] = { + "global variables", probe_kern_global_data, + }, + [FEAT_BTF] = { + "minimal BTF", probe_kern_btf, + }, + [FEAT_BTF_FUNC] = { + "BTF functions", probe_kern_btf_func, + }, + [FEAT_BTF_GLOBAL_FUNC] = { + "BTF global function", probe_kern_btf_func_global, + }, + [FEAT_BTF_DATASEC] = { + "BTF data section and variable", probe_kern_btf_datasec, + }, + [FEAT_ARRAY_MMAP] = { + "ARRAY map mmap()", probe_kern_array_mmap, + }, + [FEAT_EXP_ATTACH_TYPE] = { + "BPF_PROG_LOAD expected_attach_type attribute", + probe_kern_exp_attach_type, + }, + [FEAT_PROBE_READ_KERN] = { + "bpf_probe_read_kernel() helper", probe_kern_probe_read_kernel, } + };
- return 0; + static bool kernel_supports(enum kern_feature_id feat_id) + { + struct kern_feature_desc *feat = &feature_probes[feat_id]; + int ret; + + if (READ_ONCE(feat->res) == FEAT_UNKNOWN) { + ret = feat->probe(); + if (ret > 0) { + WRITE_ONCE(feat->res, FEAT_SUPPORTED); + } else if (ret == 0) { + WRITE_ONCE(feat->res, FEAT_MISSING); + } else { + pr_warn("Detection of kernel %s support failed: %d\n", feat->desc, ret); + WRITE_ONCE(feat->res, FEAT_MISSING); + } + } + + return READ_ONCE(feat->res) == FEAT_SUPPORTED; }
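kernel_supports() replaces the old eager probe-everything loop with lazy, process-wide memoization: the first query for a feature runs its probe once, and every later query is a single READ_ONCE(). Condensed to its core, the caching pattern looks roughly like this (cached_feature() is a hypothetical distillation, not code from the patch):

  static bool cached_feature(struct kern_feature_desc *feat)
  {
  	/* racing first callers may both run the probe, but they store
  	 * the same verdict, so READ_ONCE/WRITE_ONCE suffice and no
  	 * locking is needed
  	 */
  	if (READ_ONCE(feat->res) == FEAT_UNKNOWN)
  		WRITE_ONCE(feat->res, feat->probe() > 0 ? FEAT_SUPPORTED
  							: FEAT_MISSING);
  	return READ_ONCE(feat->res) == FEAT_SUPPORTED;
  }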
static bool map_is_reuse_compat(const struct bpf_map *map, int map_fd) @@@ -3760,7 -3920,7 +3920,7 @@@ static int bpf_object__create_map(struc
memset(&create_attr, 0, sizeof(create_attr));
- if (obj->caps.name) + if (kernel_supports(FEAT_PROG_NAME)) create_attr.name = map->name; create_attr.map_ifindex = map->map_ifindex; create_attr.map_type = def->type; @@@ -4011,6 -4171,10 +4171,10 @@@ struct bpf_core_spec const struct btf *btf; /* high-level spec: named fields and array indices only */ struct bpf_core_accessor spec[BPF_CORE_SPEC_MAX_LEN]; + /* original unresolved (no skip_mods_or_typedefs) root type ID */ + __u32 root_type_id; + /* CO-RE relocation kind */ + enum bpf_core_relo_kind relo_kind; /* high-level spec length */ int len; /* raw, low-level spec: 1-to-1 with accessor spec string */ @@@ -4041,8 -4205,66 +4205,66 @@@ static bool is_flex_arr(const struct bt return acc->idx == btf_vlen(t) - 1; }
+ static const char *core_relo_kind_str(enum bpf_core_relo_kind kind)
+ {
+ switch (kind) {
+ case BPF_FIELD_BYTE_OFFSET: return "byte_off";
+ case BPF_FIELD_BYTE_SIZE: return "byte_sz";
+ case BPF_FIELD_EXISTS: return "field_exists";
+ case BPF_FIELD_SIGNED: return "signed";
+ case BPF_FIELD_LSHIFT_U64: return "lshift_u64";
+ case BPF_FIELD_RSHIFT_U64: return "rshift_u64";
+ case BPF_TYPE_ID_LOCAL: return "local_type_id";
+ case BPF_TYPE_ID_TARGET: return "target_type_id";
+ case BPF_TYPE_EXISTS: return "type_exists";
+ case BPF_TYPE_SIZE: return "type_size";
+ case BPF_ENUMVAL_EXISTS: return "enumval_exists";
+ case BPF_ENUMVAL_VALUE: return "enumval_value";
+ default: return "unknown";
+ }
+ }
+
+ static bool core_relo_is_field_based(enum bpf_core_relo_kind kind)
+ {
+ switch (kind) {
+ case BPF_FIELD_BYTE_OFFSET:
+ case BPF_FIELD_BYTE_SIZE:
+ case BPF_FIELD_EXISTS:
+ case BPF_FIELD_SIGNED:
+ case BPF_FIELD_LSHIFT_U64:
+ case BPF_FIELD_RSHIFT_U64:
+ return true;
+ default:
+ return false;
+ }
+ }
+
+ static bool core_relo_is_type_based(enum bpf_core_relo_kind kind)
+ {
+ switch (kind) {
+ case BPF_TYPE_ID_LOCAL:
+ case BPF_TYPE_ID_TARGET:
+ case BPF_TYPE_EXISTS:
+ case BPF_TYPE_SIZE:
+ return true;
+ default:
+ return false;
+ }
+ }
+
+ static bool core_relo_is_enumval_based(enum bpf_core_relo_kind kind)
+ {
+ switch (kind) {
+ case BPF_ENUMVAL_EXISTS:
+ case BPF_ENUMVAL_VALUE:
+ return true;
+ default:
+ return false;
+ }
+ }
+
/*
- * Turn bpf_field_reloc into a low- and high-level spec representation,
+ * Turn bpf_core_relo into a low- and high-level spec representation,
* validating correctness along the way, as well as calculating resulting
* field bit offset, specified by accessor string. Low-level spec captures
* every single level of nestedness, including traversing anonymous
@@@ -4071,10 -4293,17 +4293,17 @@@
* - field 'a' access (corresponds to '2' in low-level spec);
* - array element #3 access (corresponds to '3' in low-level spec).
*
+ * Type-based relocations (TYPE_EXISTS/TYPE_SIZE,
+ * TYPE_ID_LOCAL/TYPE_ID_TARGET) don't capture any field information. Their
+ * spec and raw_spec are kept empty.
+
+ * Enum value-based relocations (ENUMVAL_EXISTS/ENUMVAL_VALUE) use the access
+ * string to specify the enumerator's value index that needs to be relocated.
*/
- static int bpf_core_spec_parse(const struct btf *btf,
+ static int bpf_core_parse_spec(const struct btf *btf,
__u32 type_id,
const char *spec_str,
+ enum bpf_core_relo_kind relo_kind,
struct bpf_core_spec *spec)
{
int access_idx, parsed_len, i;
@@@ -4089,6 -4318,15 +4318,15 @@@
memset(spec, 0, sizeof(*spec)); spec->btf = btf; + spec->root_type_id = type_id; + spec->relo_kind = relo_kind; + + /* type-based relocations don't have a field access string */ + if (core_relo_is_type_based(relo_kind)) { + if (strcmp(spec_str, "0")) + return -EINVAL; + return 0; + }
/* parse spec_str="0:1:2:3:4" into array raw_spec=[0, 1, 2, 3, 4] */ while (*spec_str) { @@@ -4105,16 -4343,28 +4343,28 @@@ if (spec->raw_len == 0) return -EINVAL;
- /* first spec value is always reloc type array index */ t = skip_mods_and_typedefs(btf, type_id, &id); if (!t) return -EINVAL;
access_idx = spec->raw_spec[0]; - spec->spec[0].type_id = id; - spec->spec[0].idx = access_idx; + acc = &spec->spec[0]; + acc->type_id = id; + acc->idx = access_idx; spec->len++;
+ if (core_relo_is_enumval_based(relo_kind)) { + if (!btf_is_enum(t) || spec->raw_len > 1 || access_idx >= btf_vlen(t)) + return -EINVAL; + + /* record enumerator name in a first accessor */ + acc->name = btf__name_by_offset(btf, btf_enum(t)[access_idx].name_off); + return 0; + } + + if (!core_relo_is_field_based(relo_kind)) + return -EINVAL; + sz = btf__resolve_size(btf, id); if (sz < 0) return sz; @@@ -4172,8 -4422,8 +4422,8 @@@ return sz; spec->bit_offset += access_idx * sz * 8; } else { - pr_warn("relo for [%u] %s (at idx %d) captures type [%d] of unexpected kind %d\n", - type_id, spec_str, i, id, btf_kind(t)); + pr_warn("relo for [%u] %s (at idx %d) captures type [%d] of unexpected kind %s\n", + type_id, spec_str, i, id, btf_kind_str(t)); return -EINVAL; } } @@@ -4223,16 -4473,16 +4473,16 @@@ static struct ids_vec *bpf_core_find_ca { size_t local_essent_len, targ_essent_len; const char *local_name, *targ_name; - const struct btf_type *t; + const struct btf_type *t, *local_t; struct ids_vec *cand_ids; __u32 *new_ids; int i, err, n;
- t = btf__type_by_id(local_btf, local_type_id); - if (!t) + local_t = btf__type_by_id(local_btf, local_type_id); + if (!local_t) return ERR_PTR(-EINVAL);
- local_name = btf__name_by_offset(local_btf, t->name_off); + local_name = btf__name_by_offset(local_btf, local_t->name_off); if (str_is_empty(local_name)) return ERR_PTR(-EINVAL); local_essent_len = bpf_core_essential_name_len(local_name); @@@ -4244,12 -4494,11 +4494,11 @@@ n = btf__get_nr_types(targ_btf); for (i = 1; i <= n; i++) { t = btf__type_by_id(targ_btf, i); - targ_name = btf__name_by_offset(targ_btf, t->name_off); - if (str_is_empty(targ_name)) + if (btf_kind(t) != btf_kind(local_t)) continue;
- t = skip_mods_and_typedefs(targ_btf, i, NULL); - if (!btf_is_composite(t) && !btf_is_array(t)) + targ_name = btf__name_by_offset(targ_btf, t->name_off); + if (str_is_empty(targ_name)) continue;
targ_essent_len = bpf_core_essential_name_len(targ_name); @@@ -4257,11 -4506,12 +4506,12 @@@ continue;
if (strncmp(local_name, targ_name, local_essent_len) == 0) { - pr_debug("[%d] %s: found candidate [%d] %s\n", - local_type_id, local_name, i, targ_name); - new_ids = reallocarray(cand_ids->data, - cand_ids->len + 1, - sizeof(*cand_ids->data)); + pr_debug("CO-RE relocating [%d] %s %s: found target candidate [%d] %s %s\n", + local_type_id, btf_kind_str(local_t), + local_name, i, btf_kind_str(t), targ_name); + new_ids = libbpf_reallocarray(cand_ids->data, + cand_ids->len + 1, + sizeof(*cand_ids->data)); if (!new_ids) { err = -ENOMEM; goto err_out; @@@ -4276,8 -4526,9 +4526,9 @@@ err_out return ERR_PTR(err); }
- /* Check two types for compatibility, skipping const/volatile/restrict and - * typedefs, to ensure we are relocating compatible entities: + /* Check two types for compatibility for the purpose of field access + * relocation. const/volatile/restrict and typedefs are skipped to ensure we + * are relocating semantically compatible entities: * - any two STRUCTs/UNIONs are compatible and can be mixed; * - any two FWDs are compatible, if their names match (modulo flavor suffix); * - any two PTRs are always compatible; @@@ -4411,25 -4662,119 +4662,119 @@@ static int bpf_core_match_member(const /* matching named field */ struct bpf_core_accessor *targ_acc;
- targ_acc = &spec->spec[spec->len++];
- targ_acc->type_id = targ_id;
- targ_acc->idx = i;
- targ_acc->name = targ_name;
+ targ_acc = &spec->spec[spec->len++];
+ targ_acc->type_id = targ_id;
+ targ_acc->idx = i;
+ targ_acc->name = targ_name;
+
+ *next_targ_id = m->type;
+ found = bpf_core_fields_are_compat(local_btf,
+ local_member->type,
+ targ_btf, m->type);
+ if (!found)
+ spec->len--; /* pop accessor */
+ return found;
+ }
+ /* member turned out not to be what we looked for */
+ spec->bit_offset -= bit_offset;
+ spec->raw_len--;
+ }
+
+ return 0;
+ }
+
+ /* Check local and target types for compatibility. This check is used for
+ * type-based CO-RE relocations and follows slightly different rules than
+ * field-based relocations. This function assumes that root types were already
+ * checked for name match. Beyond that initial root-level name check, names
+ * are completely ignored. Compatibility rules are as follows:
+ * - any two STRUCTs/UNIONs/FWDs/ENUMs/INTs are considered compatible, but
+ * kind should match for local and target types (i.e., STRUCT is not
+ * compatible with UNION);
+ * - for ENUMs, the size is ignored;
+ * - for INT, size and signedness are ignored;
+ * - for ARRAY, dimensionality is ignored, element types are checked for
+ * compatibility recursively;
+ * - CONST/VOLATILE/RESTRICT modifiers are ignored;
+ * - TYPEDEFs/PTRs are compatible if the types they point to are compatible;
+ * - FUNC_PROTOs are compatible if they have compatible signature: same
+ * number of input args and compatible return and argument types.
+ * These rules are not set in stone and probably will be adjusted as we get
+ * more experience with using BPF CO-RE relocations.
+ */
+ static int bpf_core_types_are_compat(const struct btf *local_btf, __u32 local_id,
+ const struct btf *targ_btf, __u32 targ_id)
+ {
+ const struct btf_type *local_type, *targ_type;
+ int depth = 32; /* max recursion depth */
+
+ /* caller made sure that names match (ignoring flavor suffix) */
+ local_type = btf__type_by_id(local_btf, local_id);
+ targ_type = btf__type_by_id(targ_btf, targ_id);
+ if (btf_kind(local_type) != btf_kind(targ_type))
+ return 0;
+
+ recur:
+ depth--;
+ if (depth < 0)
+ return -EINVAL;
- *next_targ_id = m->type; - found = bpf_core_fields_are_compat(local_btf, - local_member->type, - targ_btf, m->type); - if (!found) - spec->len--; /* pop accessor */ - return found; + local_type = skip_mods_and_typedefs(local_btf, local_id, &local_id); + targ_type = skip_mods_and_typedefs(targ_btf, targ_id, &targ_id); + if (!local_type || !targ_type) + return -EINVAL; + + if (btf_kind(local_type) != btf_kind(targ_type)) + return 0; + + switch (btf_kind(local_type)) { + case BTF_KIND_UNKN: + case BTF_KIND_STRUCT: + case BTF_KIND_UNION: + case BTF_KIND_ENUM: + case BTF_KIND_FWD: + return 1; + case BTF_KIND_INT: + /* just reject deprecated bitfield-like integers; all other + * integers are by default compatible between each other + */ + return btf_int_offset(local_type) == 0 && btf_int_offset(targ_type) == 0; + case BTF_KIND_PTR: + local_id = local_type->type; + targ_id = targ_type->type; + goto recur; + case BTF_KIND_ARRAY: + local_id = btf_array(local_type)->type; + targ_id = btf_array(targ_type)->type; + goto recur; + case BTF_KIND_FUNC_PROTO: { + struct btf_param *local_p = btf_params(local_type); + struct btf_param *targ_p = btf_params(targ_type); + __u16 local_vlen = btf_vlen(local_type); + __u16 targ_vlen = btf_vlen(targ_type); + int i, err; + + if (local_vlen != targ_vlen) + return 0; + + for (i = 0; i < local_vlen; i++, local_p++, targ_p++) { + skip_mods_and_typedefs(local_btf, local_p->type, &local_id); + skip_mods_and_typedefs(targ_btf, targ_p->type, &targ_id); + err = bpf_core_types_are_compat(local_btf, local_id, targ_btf, targ_id); + if (err <= 0) + return err; } - /* member turned out not to be what we looked for */ - spec->bit_offset -= bit_offset; - spec->raw_len--; - }
- return 0; + /* tail recurse for return type check */ + skip_mods_and_typedefs(local_btf, local_type->type, &local_id); + skip_mods_and_typedefs(targ_btf, targ_type->type, &targ_id); + goto recur; + } + default: + pr_warn("unexpected kind %s relocated, local [%d], target [%d]\n", + btf_kind_str(local_type), local_id, targ_id); + return 0; + } }
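Note the shape of bpf_core_types_are_compat(): it recurses explicitly only for FUNC_PROTO arguments and turns pointer/array/return-type descent into a goto-based loop, with depth capped at 32 so malformed (cyclic) BTF cannot run away. A hypothetical caller interprets the tri-state result like so:

  int compat = bpf_core_types_are_compat(local_btf, local_id,
  					 targ_btf, targ_id);
  if (compat < 0)
  	/* bad BTF or recursion limit hit: hard error */ ;
  else if (compat == 0)
  	/* kinds/signatures diverge: prune this candidate */ ;
  else
  	/* structurally compatible: keep the candidate */ ;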
/* @@@ -4447,10 -4792,51 +4792,51 @@@ static int bpf_core_spec_match(struct b
memset(targ_spec, 0, sizeof(*targ_spec)); targ_spec->btf = targ_btf; + targ_spec->root_type_id = targ_id; + targ_spec->relo_kind = local_spec->relo_kind; + + if (core_relo_is_type_based(local_spec->relo_kind)) { + return bpf_core_types_are_compat(local_spec->btf, + local_spec->root_type_id, + targ_btf, targ_id); + }
local_acc = &local_spec->spec[0]; targ_acc = &targ_spec->spec[0];
+ if (core_relo_is_enumval_based(local_spec->relo_kind)) { + size_t local_essent_len, targ_essent_len; + const struct btf_enum *e; + const char *targ_name; + + /* has to resolve to an enum */ + targ_type = skip_mods_and_typedefs(targ_spec->btf, targ_id, &targ_id); + if (!btf_is_enum(targ_type)) + return 0; + + local_essent_len = bpf_core_essential_name_len(local_acc->name); + + for (i = 0, e = btf_enum(targ_type); i < btf_vlen(targ_type); i++, e++) { + targ_name = btf__name_by_offset(targ_spec->btf, e->name_off); + targ_essent_len = bpf_core_essential_name_len(targ_name); + if (targ_essent_len != local_essent_len) + continue; + if (strncmp(local_acc->name, targ_name, local_essent_len) == 0) { + targ_acc->type_id = targ_id; + targ_acc->idx = i; + targ_acc->name = targ_name; + targ_spec->len++; + targ_spec->raw_spec[targ_spec->raw_len] = targ_acc->idx; + targ_spec->raw_len++; + return 1; + } + } + return 0; + } + + if (!core_relo_is_field_based(local_spec->relo_kind)) + return -EINVAL; + for (i = 0; i < local_spec->len; i++, local_acc++, targ_acc++) { targ_type = skip_mods_and_typedefs(targ_spec->btf, targ_id, &targ_id); @@@ -4507,18 -4893,29 +4893,29 @@@ }
static int bpf_core_calc_field_relo(const struct bpf_program *prog, - const struct bpf_field_reloc *relo, + const struct bpf_core_relo *relo, const struct bpf_core_spec *spec, __u32 *val, bool *validate) { - const struct bpf_core_accessor *acc = &spec->spec[spec->len - 1]; - const struct btf_type *t = btf__type_by_id(spec->btf, acc->type_id); + const struct bpf_core_accessor *acc; + const struct btf_type *t; __u32 byte_off, byte_sz, bit_off, bit_sz; const struct btf_member *m; const struct btf_type *mt; bool bitfield; __s64 sz;
+ if (relo->kind == BPF_FIELD_EXISTS) { + *val = spec ? 1 : 0; + return 0; + } + + if (!spec) + return -EUCLEAN; /* request instruction poisoning */ + + acc = &spec->spec[spec->len - 1]; + t = btf__type_by_id(spec->btf, acc->type_id); + /* a[n] accessor needs special handling */ if (!acc->name) { if (relo->kind == BPF_FIELD_BYTE_OFFSET) { @@@ -4604,21 -5001,158 +5001,158 @@@ break; case BPF_FIELD_EXISTS: default: - pr_warn("prog '%s': unknown relo %d at insn #%d\n", - bpf_program__title(prog, false), - relo->kind, relo->insn_off / 8); - return -EINVAL; + return -EOPNOTSUPP; + } + + return 0; + } + + static int bpf_core_calc_type_relo(const struct bpf_core_relo *relo, + const struct bpf_core_spec *spec, + __u32 *val) + { + __s64 sz; + + /* type-based relos return zero when target type is not found */ + if (!spec) { + *val = 0; + return 0; + } + + switch (relo->kind) { + case BPF_TYPE_ID_TARGET: + *val = spec->root_type_id; + break; + case BPF_TYPE_EXISTS: + *val = 1; + break; + case BPF_TYPE_SIZE: + sz = btf__resolve_size(spec->btf, spec->root_type_id); + if (sz < 0) + return -EINVAL; + *val = sz; + break; + case BPF_TYPE_ID_LOCAL: + /* BPF_TYPE_ID_LOCAL is handled specially and shouldn't get here */ + default: + return -EOPNOTSUPP; + } + + return 0; + } + + static int bpf_core_calc_enumval_relo(const struct bpf_core_relo *relo, + const struct bpf_core_spec *spec, + __u32 *val) + { + const struct btf_type *t; + const struct btf_enum *e; + + switch (relo->kind) { + case BPF_ENUMVAL_EXISTS: + *val = spec ? 1 : 0; + break; + case BPF_ENUMVAL_VALUE: + if (!spec) + return -EUCLEAN; /* request instruction poisoning */ + t = btf__type_by_id(spec->btf, spec->spec[0].type_id); + e = btf_enum(t) + spec->spec[0].idx; + *val = e->val; + break; + default: + return -EOPNOTSUPP; }
return 0; }
+ struct bpf_core_relo_res
+ {
+ /* expected value in the instruction, unless validate == false */
+ __u32 orig_val;
+ /* new value that needs to be patched up to */
+ __u32 new_val;
+ /* relocation unsuccessful, poison instruction, but don't fail load */
+ bool poison;
+ /* some relocations can't be validated against orig_val */
+ bool validate;
+ };
+
+ /* Calculate original and target relocation values, given local and target
+ * specs and relocation kind. These values are calculated for each candidate.
+ * If there are multiple candidates, resulting values should all be consistent
+ * with each other. Otherwise, libbpf will refuse to proceed due to ambiguity.
+ * If instruction has to be poisoned, *poison will be set to true.
+ */
+ static int bpf_core_calc_relo(const struct bpf_program *prog,
+ const struct bpf_core_relo *relo,
+ int relo_idx,
+ const struct bpf_core_spec *local_spec,
+ const struct bpf_core_spec *targ_spec,
+ struct bpf_core_relo_res *res)
+ {
+ int err = -EOPNOTSUPP;
+
+ res->orig_val = 0;
+ res->new_val = 0;
+ res->poison = false;
+ res->validate = true;
+
+ if (core_relo_is_field_based(relo->kind)) {
+ err = bpf_core_calc_field_relo(prog, relo, local_spec, &res->orig_val, &res->validate);
+ err = err ?: bpf_core_calc_field_relo(prog, relo, targ_spec, &res->new_val, NULL);
+ } else if (core_relo_is_type_based(relo->kind)) {
+ err = bpf_core_calc_type_relo(relo, local_spec, &res->orig_val);
+ err = err ?: bpf_core_calc_type_relo(relo, targ_spec, &res->new_val);
+ } else if (core_relo_is_enumval_based(relo->kind)) {
+ err = bpf_core_calc_enumval_relo(relo, local_spec, &res->orig_val);
+ err = err ?: bpf_core_calc_enumval_relo(relo, targ_spec, &res->new_val);
+ }
+
+ if (err == -EUCLEAN) {
+ /* EUCLEAN is used to signal instruction poisoning request */
+ res->poison = true;
+ err = 0;
+ } else if (err == -EOPNOTSUPP) {
+ /* EOPNOTSUPP means unknown/unsupported relocation */
+ pr_warn("prog '%s': relo #%d: unrecognized CO-RE relocation %s (%d) at insn #%d\n",
+ bpf_program__title(prog, false), relo_idx,
+ core_relo_kind_str(relo->kind), relo->kind, relo->insn_off / 8);
+ }
+
+ return err;
+ }
+
+ /*
+ * Turn an instruction for which CO-RE relocation failed into an invalid one
+ * with a distinct signature.
+ */
+ static void bpf_core_poison_insn(struct bpf_program *prog, int relo_idx,
+ int insn_idx, struct bpf_insn *insn)
+ {
+ pr_debug("prog '%s': relo #%d: substituting insn #%d w/ invalid insn\n",
+ bpf_program__title(prog, false), relo_idx, insn_idx);
+ insn->code = BPF_JMP | BPF_CALL;
+ insn->dst_reg = 0;
+ insn->src_reg = 0;
+ insn->off = 0;
+ /* if this instruction is reachable (not dead code),
+ * verifier will complain with the following message:
+ * invalid func unknown#195896080
+ */
+ insn->imm = 195896080; /* => 0xbad2310 => "bad relo" */
+ }
+
+ static bool is_ldimm64(struct bpf_insn *insn)
+ {
+ return insn->code == (BPF_LD | BPF_IMM | BPF_DW);
+ }
+
/*
* Patch relocatable BPF instruction.
*
* Patched value is determined by relocation kind and target specification.
- * For field existence relocation target spec will be NULL if field is not
- * found.
+ * For existence relocations target spec will be NULL if field/type is not found.
* Expected insn->imm value is determined using relocation kind and local
* spec, and is checked before patching instruction. If actual insn->imm value
* is wrong, bail out with error.
@@@ -4626,58 -5160,43 +5160,43 @@@
* Currently three kinds of BPF instructions are supported:
* 1.
rX = <imm> (assignment with immediate operand); * 2. rX += <imm> (arithmetic operations with immediate operand); + * 3. rX = <imm64> (load with 64-bit immediate value). */ - static int bpf_core_reloc_insn(struct bpf_program *prog, - const struct bpf_field_reloc *relo, + static int bpf_core_patch_insn(struct bpf_program *prog, + const struct bpf_core_relo *relo, int relo_idx, - const struct bpf_core_spec *local_spec, - const struct bpf_core_spec *targ_spec) + const struct bpf_core_relo_res *res) { __u32 orig_val, new_val; struct bpf_insn *insn; - bool validate = true; - int insn_idx, err; + int insn_idx; __u8 class;
- if (relo->insn_off % sizeof(struct bpf_insn)) + if (relo->insn_off % BPF_INSN_SZ) return -EINVAL; - insn_idx = relo->insn_off / sizeof(struct bpf_insn); + insn_idx = relo->insn_off / BPF_INSN_SZ; insn = &prog->insns[insn_idx]; class = BPF_CLASS(insn->code);
- if (relo->kind == BPF_FIELD_EXISTS) { - orig_val = 1; /* can't generate EXISTS relo w/o local field */ - new_val = targ_spec ? 1 : 0; - } else if (!targ_spec) { - pr_debug("prog '%s': relo #%d: substituting insn #%d w/ invalid insn\n", - bpf_program__title(prog, false), relo_idx, insn_idx); - insn->code = BPF_JMP | BPF_CALL; - insn->dst_reg = 0; - insn->src_reg = 0; - insn->off = 0; - /* if this instruction is reachable (not a dead code), - * verifier will complain with the following message: - * invalid func unknown#195896080 + if (res->poison) { + /* poison second part of ldimm64 to avoid confusing error from + * verifier about "unknown opcode 00" */ - insn->imm = 195896080; /* => 0xbad2310 => "bad relo" */ + if (is_ldimm64(insn)) + bpf_core_poison_insn(prog, relo_idx, insn_idx + 1, insn + 1); + bpf_core_poison_insn(prog, relo_idx, insn_idx, insn); return 0; - } else { - err = bpf_core_calc_field_relo(prog, relo, local_spec, - &orig_val, &validate); - if (err) - return err; - err = bpf_core_calc_field_relo(prog, relo, targ_spec, - &new_val, NULL); - if (err) - return err; }
+ orig_val = res->orig_val; + new_val = res->new_val; + switch (class) { case BPF_ALU: case BPF_ALU64: if (BPF_SRC(insn->code) != BPF_K) return -EINVAL; - if (validate && insn->imm != orig_val) { + if (res->validate && insn->imm != orig_val) { pr_warn("prog '%s': relo #%d: unexpected insn #%d (ALU/ALU64) value: got %u, exp %u -> %u\n", bpf_program__title(prog, false), relo_idx, insn_idx, insn->imm, orig_val, new_val); @@@ -4692,8 -5211,8 +5211,8 @@@ case BPF_LDX: case BPF_ST: case BPF_STX: - if (validate && insn->off != orig_val) { - pr_warn("prog '%s': relo #%d: unexpected insn #%d (LD/LDX/ST/STX) value: got %u, exp %u -> %u\n", + if (res->validate && insn->off != orig_val) { + pr_warn("prog '%s': relo #%d: unexpected insn #%d (LDX/ST/STX) value: got %u, exp %u -> %u\n", bpf_program__title(prog, false), relo_idx, insn_idx, insn->off, orig_val, new_val); return -EINVAL; @@@ -4710,8 -5229,37 +5229,37 @@@ bpf_program__title(prog, false), relo_idx, insn_idx, orig_val, new_val); break; + case BPF_LD: { + __u64 imm; + + if (!is_ldimm64(insn) || + insn[0].src_reg != 0 || insn[0].off != 0 || + insn_idx + 1 >= prog->insns_cnt || + insn[1].code != 0 || insn[1].dst_reg != 0 || + insn[1].src_reg != 0 || insn[1].off != 0) { + pr_warn("prog '%s': relo #%d: insn #%d (LDIMM64) has unexpected form\n", + bpf_program__title(prog, false), relo_idx, insn_idx); + return -EINVAL; + } + + imm = insn[0].imm + ((__u64)insn[1].imm << 32); + if (res->validate && imm != orig_val) { + pr_warn("prog '%s': relo #%d: unexpected insn #%d (LDIMM64) value: got %llu, exp %u -> %u\n", + bpf_program__title(prog, false), relo_idx, + insn_idx, (unsigned long long)imm, + orig_val, new_val); + return -EINVAL; + } + + insn[0].imm = new_val; + insn[1].imm = 0; /* currently only 32-bit values are supported */ + pr_debug("prog '%s': relo #%d: patched insn #%d (LDIMM64) imm64 %llu -> %u\n", + bpf_program__title(prog, false), relo_idx, insn_idx, + (unsigned long long)imm, new_val); + break; + } default: - pr_warn("prog '%s': relo #%d: trying to relocate unrecognized insn #%d, code:%x, src:%x, dst:%x, off:%x, imm:%x\n", + pr_warn("prog '%s': relo #%d: trying to relocate unrecognized insn #%d, code:0x%x, src:0x%x, dst:0x%x, off:0x%x, imm:0x%x\n", bpf_program__title(prog, false), relo_idx, insn_idx, insn->code, insn->src_reg, insn->dst_reg, insn->off, insn->imm); @@@ -4728,29 -5276,48 +5276,48 @@@ static void bpf_core_dump_spec(int level, const struct bpf_core_spec *spec) { const struct btf_type *t; + const struct btf_enum *e; const char *s; __u32 type_id; int i;
- type_id = spec->spec[0].type_id; + type_id = spec->root_type_id; t = btf__type_by_id(spec->btf, type_id); s = btf__name_by_offset(spec->btf, t->name_off); - libbpf_print(level, "[%u] %s + ", type_id, s);
- for (i = 0; i < spec->raw_len; i++) - libbpf_print(level, "%d%s", spec->raw_spec[i], - i == spec->raw_len - 1 ? " => " : ":"); + libbpf_print(level, "[%u] %s %s", type_id, btf_kind_str(t), str_is_empty(s) ? "<anon>" : s);
- libbpf_print(level, "%u.%u @ &x", - spec->bit_offset / 8, spec->bit_offset % 8); + if (core_relo_is_type_based(spec->relo_kind)) + return;
- for (i = 0; i < spec->len; i++) { - if (spec->spec[i].name) - libbpf_print(level, ".%s", spec->spec[i].name); - else - libbpf_print(level, "[%u]", spec->spec[i].idx); + if (core_relo_is_enumval_based(spec->relo_kind)) { + t = skip_mods_and_typedefs(spec->btf, type_id, NULL); + e = btf_enum(t) + spec->raw_spec[0]; + s = btf__name_by_offset(spec->btf, e->name_off); + + libbpf_print(level, "::%s = %u", s, e->val); + return; }
+ if (core_relo_is_field_based(spec->relo_kind)) { + for (i = 0; i < spec->len; i++) { + if (spec->spec[i].name) + libbpf_print(level, ".%s", spec->spec[i].name); + else if (i > 0 || spec->spec[i].idx > 0) + libbpf_print(level, "[%u]", spec->spec[i].idx); + } + + libbpf_print(level, " ("); + for (i = 0; i < spec->raw_len; i++) + libbpf_print(level, "%s%d", i == 0 ? "" : ":", spec->raw_spec[i]); + + if (spec->bit_offset % 8) + libbpf_print(level, " @ offset %u.%u)", + spec->bit_offset / 8, spec->bit_offset % 8); + else + libbpf_print(level, " @ offset %u)", spec->bit_offset / 8); + return; + } }
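The new BPF_LD case in bpf_core_patch_insn() above relies on the ldimm64 encoding, where one logical instruction occupies two struct bpf_insn slots with the 64-bit immediate split across their imm fields. A minimal sketch of that layout (helper names are hypothetical):

  #include <linux/bpf.h>

  static unsigned long long ldimm64_imm(const struct bpf_insn *insn)
  {
  	/* low 32 bits live in insn[0].imm, high 32 bits in insn[1].imm */
  	return (__u32)insn[0].imm |
  	       ((unsigned long long)(__u32)insn[1].imm << 32);
  }

  static void ldimm64_patch(struct bpf_insn *insn, __u32 new_val)
  {
  	insn[0].imm = new_val;
  	insn[1].imm = 0;	/* only 32-bit relocated values for now */
  }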
static size_t bpf_core_hash_fn(const void *key, void *ctx) @@@ -4814,22 -5381,23 +5381,23 @@@ static void *u32_as_hash_key(__u32 x * CPU-wise compared to prebuilding a map from all local type names to * a list of candidate type names. It's also sped up by caching resolved * list of matching candidates per each local "root" type ID, that has at - * least one bpf_field_reloc associated with it. This list is shared + * least one bpf_core_relo associated with it. This list is shared * between multiple relocations for the same type ID and is updated as some * of the candidates are pruned due to structural incompatibility. */ - static int bpf_core_reloc_field(struct bpf_program *prog, - const struct bpf_field_reloc *relo, - int relo_idx, - const struct btf *local_btf, - const struct btf *targ_btf, - struct hashmap *cand_cache) + static int bpf_core_apply_relo(struct bpf_program *prog, + const struct bpf_core_relo *relo, + int relo_idx, + const struct btf *local_btf, + const struct btf *targ_btf, + struct hashmap *cand_cache) { const char *prog_name = bpf_program__title(prog, false); - struct bpf_core_spec local_spec, cand_spec, targ_spec; + struct bpf_core_spec local_spec, cand_spec, targ_spec = {}; const void *type_key = u32_as_hash_key(relo->type_id); - const struct btf_type *local_type, *cand_type; - const char *local_name, *cand_name; + struct bpf_core_relo_res cand_res, targ_res; + const struct btf_type *local_type; + const char *local_name; struct ids_vec *cand_ids; __u32 local_id, cand_id; const char *spec_str; @@@ -4841,32 -5409,49 +5409,49 @@@ return -EINVAL;
local_name = btf__name_by_offset(local_btf, local_type->name_off); - if (str_is_empty(local_name)) + if (!local_name) return -EINVAL;
spec_str = btf__name_by_offset(local_btf, relo->access_str_off); if (str_is_empty(spec_str)) return -EINVAL;
- err = bpf_core_spec_parse(local_btf, local_id, spec_str, &local_spec); + err = bpf_core_parse_spec(local_btf, local_id, spec_str, relo->kind, &local_spec); if (err) { - pr_warn("prog '%s': relo #%d: parsing [%d] %s + %s failed: %d\n", - prog_name, relo_idx, local_id, local_name, spec_str, - err); + pr_warn("prog '%s': relo #%d: parsing [%d] %s %s + %s failed: %d\n", + prog_name, relo_idx, local_id, btf_kind_str(local_type), + str_is_empty(local_name) ? "<anon>" : local_name, + spec_str, err); return -EINVAL; }
- pr_debug("prog '%s': relo #%d: kind %d, spec is ", prog_name, relo_idx, - relo->kind); + pr_debug("prog '%s': relo #%d: kind <%s> (%d), spec is ", prog_name, + relo_idx, core_relo_kind_str(relo->kind), relo->kind); bpf_core_dump_spec(LIBBPF_DEBUG, &local_spec); libbpf_print(LIBBPF_DEBUG, "\n");
+ /* TYPE_ID_LOCAL relo is special and doesn't need candidate search */ + if (relo->kind == BPF_TYPE_ID_LOCAL) { + targ_res.validate = true; + targ_res.poison = false; + targ_res.orig_val = local_spec.root_type_id; + targ_res.new_val = local_spec.root_type_id; + goto patch_insn; + } + + /* libbpf doesn't support candidate search for anonymous types */ + if (str_is_empty(spec_str)) { + pr_warn("prog '%s': relo #%d: <%s> (%d) relocation doesn't support anonymous types\n", + prog_name, relo_idx, core_relo_kind_str(relo->kind), relo->kind); + return -EOPNOTSUPP; + } + if (!hashmap__find(cand_cache, type_key, (void **)&cand_ids)) { cand_ids = bpf_core_find_cands(local_btf, local_id, targ_btf); if (IS_ERR(cand_ids)) { - pr_warn("prog '%s': relo #%d: target candidate search failed for [%d] %s: %ld", - prog_name, relo_idx, local_id, local_name, - PTR_ERR(cand_ids)); + pr_warn("prog '%s': relo #%d: target candidate search failed for [%d] %s %s: %ld", + prog_name, relo_idx, local_id, btf_kind_str(local_type), + local_name, PTR_ERR(cand_ids)); return PTR_ERR(cand_ids); } err = hashmap__set(cand_cache, type_key, cand_ids, NULL, NULL); @@@ -4878,36 -5463,51 +5463,51 @@@
for (i = 0, j = 0; i < cand_ids->len; i++) { cand_id = cand_ids->data[i]; - cand_type = btf__type_by_id(targ_btf, cand_id); - cand_name = btf__name_by_offset(targ_btf, cand_type->name_off); - - err = bpf_core_spec_match(&local_spec, targ_btf, - cand_id, &cand_spec); - pr_debug("prog '%s': relo #%d: matching candidate #%d %s against spec ", - prog_name, relo_idx, i, cand_name); - bpf_core_dump_spec(LIBBPF_DEBUG, &cand_spec); - libbpf_print(LIBBPF_DEBUG, ": %d\n", err); + err = bpf_core_spec_match(&local_spec, targ_btf, cand_id, &cand_spec); if (err < 0) { - pr_warn("prog '%s': relo #%d: matching error: %d\n", - prog_name, relo_idx, err); + pr_warn("prog '%s': relo #%d: error matching candidate #%d ", + prog_name, relo_idx, i); + bpf_core_dump_spec(LIBBPF_WARN, &cand_spec); + libbpf_print(LIBBPF_WARN, ": %d\n", err); return err; } + + pr_debug("prog '%s': relo #%d: %s candidate #%d ", prog_name, + relo_idx, err == 0 ? "non-matching" : "matching", i); + bpf_core_dump_spec(LIBBPF_DEBUG, &cand_spec); + libbpf_print(LIBBPF_DEBUG, "\n"); + if (err == 0) continue;
+ err = bpf_core_calc_relo(prog, relo, relo_idx, &local_spec, &cand_spec, &cand_res); + if (err) + return err; + if (j == 0) { + targ_res = cand_res; targ_spec = cand_spec; } else if (cand_spec.bit_offset != targ_spec.bit_offset) { - /* if there are many candidates, they should all - * resolve to the same bit offset + /* if there are many field relo candidates, they + * should all resolve to the same bit offset */ - pr_warn("prog '%s': relo #%d: offset ambiguity: %u != %u\n", + pr_warn("prog '%s': relo #%d: field offset ambiguity: %u != %u\n", prog_name, relo_idx, cand_spec.bit_offset, targ_spec.bit_offset); return -EINVAL; + } else if (cand_res.poison != targ_res.poison || cand_res.new_val != targ_res.new_val) { + /* all candidates should result in the same relocation + * decision and value, otherwise it's dangerous to + * proceed due to ambiguity + */ + pr_warn("prog '%s': relo #%d: relocation decision ambiguity: %s %u != %s %u\n", + prog_name, relo_idx, + cand_res.poison ? "failure" : "success", cand_res.new_val, + targ_res.poison ? "failure" : "success", targ_res.new_val); + return -EINVAL; }
- cand_ids->data[j++] = cand_spec.spec[0].type_id; + cand_ids->data[j++] = cand_spec.root_type_id; }
/* @@@ -4926,19 -5526,25 +5526,25 @@@ * as well as expected case, depending whether instruction w/ * relocation is guarded in some way that makes it unreachable (dead * code) if relocation can't be resolved. This is handled in - * bpf_core_reloc_insn() uniformly by replacing that instruction with + * bpf_core_patch_insn() uniformly by replacing that instruction with * BPF helper call insn (using invalid helper ID). If that instruction * is indeed unreachable, then it will be ignored and eliminated by * verifier. If it was an error, then verifier will complain and point * to a specific instruction number in its log. */ - if (j == 0) - pr_debug("prog '%s': relo #%d: no matching targets found for [%d] %s + %s\n", - prog_name, relo_idx, local_id, local_name, spec_str); + if (j == 0) { + pr_debug("prog '%s': relo #%d: no matching targets found\n", + prog_name, relo_idx); + + /* calculate single target relo result explicitly */ + err = bpf_core_calc_relo(prog, relo, relo_idx, &local_spec, NULL, &targ_res); + if (err) + return err; + }
- /* bpf_core_reloc_insn should know how to handle missing targ_spec */ - err = bpf_core_reloc_insn(prog, relo, relo_idx, &local_spec, - j ? &targ_spec : NULL); + patch_insn: + /* bpf_core_patch_insn() should know how to handle missing targ_spec */ + err = bpf_core_patch_insn(prog, relo, relo_idx, &targ_res); if (err) { pr_warn("prog '%s': relo #%d: failed to patch insn at offset %d: %d\n", prog_name, relo_idx, relo->insn_off, err); @@@ -4949,10 -5555,10 +5555,10 @@@ }
static int - bpf_core_reloc_fields(struct bpf_object *obj, const char *targ_btf_path) + bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path) { const struct btf_ext_info_sec *sec; - const struct bpf_field_reloc *rec; + const struct bpf_core_relo *rec; const struct btf_ext_info *seg; struct hashmap_entry *entry; struct hashmap *cand_cache = NULL; @@@ -4961,6 -5567,9 +5567,9 @@@ const char *sec_name; int i, err = 0;
+ if (obj->btf_ext->core_relo_info.len == 0) + return 0; + if (targ_btf_path) targ_btf = btf__parse_elf(targ_btf_path, NULL); else @@@ -4976,7 -5585,7 +5585,7 @@@ goto out; }
- seg = &obj->btf_ext->field_reloc_info; + seg = &obj->btf_ext->core_relo_info; for_each_btf_ext_sec(seg, sec) { sec_name = btf__name_by_offset(obj->btf, sec->sec_name_off); if (str_is_empty(sec_name)) { @@@ -4997,15 -5606,15 +5606,15 @@@ goto out; }
- pr_debug("prog '%s': performing %d CO-RE offset relocs\n", + pr_debug("sec '%s': found %d CO-RE relocations\n", sec_name, sec->num_info);
for_each_btf_ext_rec(seg, sec, i, rec) { - err = bpf_core_reloc_field(prog, rec, i, obj->btf, - targ_btf, cand_cache); + err = bpf_core_apply_relo(prog, rec, i, obj->btf, + targ_btf, cand_cache); if (err) { pr_warn("prog '%s': relo #%d: failed to relocate: %d\n", - sec_name, i, err); + prog->name, i, err); goto out; } } @@@ -5024,17 -5633,6 +5633,6 @@@ out return err; }
- static int - bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path) - { - int err = 0; - - if (obj->btf_ext->field_reloc_info.len) - err = bpf_core_reloc_fields(obj, targ_btf_path); - - return err; - } - static int bpf_program__reloc_text(struct bpf_program *prog, struct bpf_object *obj, struct reloc_desc *relo) @@@ -5051,7 -5649,7 +5649,7 @@@ return -LIBBPF_ERRNO__RELOC; } new_cnt = prog->insns_cnt + text->insns_cnt; - new_insn = reallocarray(prog->insns, new_cnt, sizeof(*insn)); + new_insn = libbpf_reallocarray(prog->insns, new_cnt, sizeof(*insn)); if (!new_insn) { pr_warn("oom in prog realloc\n"); return -ENOMEM; @@@ -5136,7 -5734,8 +5734,8 @@@ bpf_program__relocate(struct bpf_progra return err; break; default: - pr_warn("relo #%d: bad relo type %d\n", i, relo->type); + pr_warn("prog '%s': relo #%d: bad relo type %d\n", + prog->name, i, relo->type); return -EINVAL; } } @@@ -5171,7 -5770,8 +5770,8 @@@ bpf_object__relocate(struct bpf_object
err = bpf_program__relocate(prog, obj); if (err) { - pr_warn("failed to relocate '%s'\n", prog->section_name); + pr_warn("prog '%s': failed to relocate data references: %d\n", + prog->name, err); return err; } break; @@@ -5186,7 -5786,8 +5786,8 @@@
err = bpf_program__relocate(prog, obj); if (err) { - pr_warn("failed to relocate '%s'\n", prog->section_name); + pr_warn("prog '%s': failed to relocate calls: %d\n", + prog->name, err); return err; } } @@@ -5203,8 -5804,8 +5804,8 @@@ static int bpf_object__collect_map_relo int i, j, nrels, new_sz; const struct btf_var_secinfo *vi = NULL; const struct btf_type *sec, *var, *def; + struct bpf_map *map = NULL, *targ_map; const struct btf_member *member; - struct bpf_map *map, *targ_map; const char *name, *mname; Elf_Data *symbols; unsigned int moff; @@@ -5230,8 -5831,7 +5831,7 @@@ i, (size_t)GELF_R_SYM(rel.r_info)); return -LIBBPF_ERRNO__FORMAT; } - name = elf_strptr(obj->efile.elf, obj->efile.strtabidx, - sym.st_name) ? : "<?>"; + name = elf_sym_str(obj, sym.st_name) ?: "<?>"; if (sym.st_shndx != obj->efile.btf_maps_shndx) { pr_warn(".maps relo #%d: '%s' isn't a BTF-defined map\n", i, name); @@@ -5293,7 -5893,7 +5893,7 @@@ moff /= bpf_ptr_sz; if (moff >= map->init_slots_sz) { new_sz = moff + 1; - tmp = realloc(map->init_slots, new_sz * host_ptr_sz); + tmp = libbpf_reallocarray(map->init_slots, new_sz, host_ptr_sz); if (!tmp) return -ENOMEM; map->init_slots = tmp; @@@ -5348,6 -5948,51 +5948,51 @@@ static int bpf_object__collect_reloc(st return 0; }
+ static bool insn_is_helper_call(struct bpf_insn *insn, enum bpf_func_id *func_id) + { + if (BPF_CLASS(insn->code) == BPF_JMP && + BPF_OP(insn->code) == BPF_CALL && + BPF_SRC(insn->code) == BPF_K && + insn->src_reg == 0 && + insn->dst_reg == 0) { + *func_id = insn->imm; + return true; + } + return false; + } + + static int bpf_object__sanitize_prog(struct bpf_object* obj, struct bpf_program *prog) + { + struct bpf_insn *insn = prog->insns; + enum bpf_func_id func_id; + int i; + + for (i = 0; i < prog->insns_cnt; i++, insn++) { + if (!insn_is_helper_call(insn, &func_id)) + continue; + + /* on kernels that don't yet support + * bpf_probe_read_{kernel,user}[_str] helpers, fall back + * to bpf_probe_read() which works well for old kernels + */ + switch (func_id) { + case BPF_FUNC_probe_read_kernel: + case BPF_FUNC_probe_read_user: + if (!kernel_supports(FEAT_PROBE_READ_KERN)) + insn->imm = BPF_FUNC_probe_read; + break; + case BPF_FUNC_probe_read_kernel_str: + case BPF_FUNC_probe_read_user_str: + if (!kernel_supports(FEAT_PROBE_READ_KERN)) + insn->imm = BPF_FUNC_probe_read_str; + break; + default: + break; + } + } + return 0; + } + static int load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt, char *license, __u32 kern_version, int *pfd) @@@ -5364,12 -6009,12 +6009,12 @@@ memset(&load_attr, 0, sizeof(struct bpf_load_program_attr)); load_attr.prog_type = prog->type; /* old kernels might not support specifying expected_attach_type */ - if (!prog->caps->exp_attach_type && prog->sec_def && + if (!kernel_supports(FEAT_EXP_ATTACH_TYPE) && prog->sec_def && prog->sec_def->is_exp_attach_type_optional) load_attr.expected_attach_type = 0; else load_attr.expected_attach_type = prog->expected_attach_type; - if (prog->caps->name) + if (kernel_supports(FEAT_PROG_NAME)) load_attr.name = prog->name; load_attr.insns = insns; load_attr.insns_cnt = insns_cnt; @@@ -5387,7 -6032,7 +6032,7 @@@ } /* specify func_info/line_info only if kernel supports them */ btf_fd = bpf_object__btf_fd(prog->obj); - if (btf_fd >= 0 && prog->obj->caps.btf_func) { + if (btf_fd >= 0 && kernel_supports(FEAT_BTF_FUNC)) { load_attr.prog_btf_fd = btf_fd; load_attr.func_info = prog->func_info; load_attr.func_info_rec_size = prog->func_info_rec_size; @@@ -5425,7 -6070,7 +6070,7 @@@ retry_load free(log_buf); goto retry_load; } - ret = -errno; + ret = errno ? -errno : -LIBBPF_ERRNO__LOAD; cp = libbpf_strerror_r(errno, errmsg, sizeof(errmsg)); pr_warn("load bpf program failed: %s\n", cp); pr_perm_msg(ret); @@@ -5562,13 -6207,19 +6207,19 @@@ bpf_object__load_progs(struct bpf_objec size_t i; int err;
+ for (i = 0; i < obj->nr_programs; i++) { + prog = &obj->programs[i]; + err = bpf_object__sanitize_prog(obj, prog); + if (err) + return err; + } + for (i = 0; i < obj->nr_programs; i++) { prog = &obj->programs[i]; if (bpf_program__is_function_storage(prog, obj)) continue; if (!prog->load) { - pr_debug("prog '%s'('%s'): skipped loading\n", - prog->name, prog->section_name); + pr_debug("prog '%s': skipped loading\n", prog->name); continue; } prog->log_level |= log_level; @@@ -5641,6 -6292,8 +6292,8 @@@ __bpf_object__open(const char *path, co /* couldn't guess, but user might manually specify */ continue;
+ if (prog->sec_def->is_sleepable) + prog->prog_flags |= BPF_F_SLEEPABLE; bpf_program__set_type(prog, prog->sec_def->prog_type); bpf_program__set_expected_attach_type(prog, prog->sec_def->expected_attach_type); @@@ -5750,11 -6403,11 +6403,11 @@@ static int bpf_object__sanitize_maps(st bpf_object__for_each_map(m, obj) { if (!bpf_map__is_internal(m)) continue; - if (!obj->caps.global_data) { + if (!kernel_supports(FEAT_GLOBAL_DATA)) { pr_warn("kernel doesn't support global data\n"); return -ENOTSUP; } - if (!obj->caps.array_mmap) + if (!kernel_supports(FEAT_ARRAY_MMAP)) m->def.map_flags ^= BPF_F_MMAPABLE; }
@@@ -5904,7 -6557,6 +6557,6 @@@ int bpf_object__load_xattr(struct bpf_o }
err = bpf_object__probe_loading(obj); - err = err ? : bpf_object__probe_caps(obj); err = err ? : bpf_object__resolve_externs(obj, obj->kconfig); err = err ? : bpf_object__sanitize_and_load_btf(obj); err = err ? : bpf_object__sanitize_maps(obj); @@@ -6713,7 -7365,7 +7365,7 @@@ int bpf_program__fd(const struct bpf_pr
size_t bpf_program__size(const struct bpf_program *prog) { - return prog->insns_cnt * sizeof(struct bpf_insn); + return prog->insns_cnt * BPF_INSN_SZ; }
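BPF_INSN_SZ, used throughout this patch in place of a literal 8 or sizeof(struct bpf_insn), is assumed here to be libbpf's internal alias for the fixed BPF instruction size; a one-line sketch of that assumed definition and a typical use:

  #include <linux/bpf.h>

  #define BPF_INSN_SZ (sizeof(struct bpf_insn))	/* 8 bytes */

  /* e.g. translating a relocation byte offset into an insn index:
   *   insn_idx = rel.r_offset / BPF_INSN_SZ;
   */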
int bpf_program__set_prep(struct bpf_program *prog, int nr_instances, @@@ -6910,6 -7562,21 +7562,21 @@@ static const struct bpf_sec_def section .expected_attach_type = BPF_TRACE_FEXIT, .is_attach_btf = true, .attach_fn = attach_trace), + SEC_DEF("fentry.s/", TRACING, + .expected_attach_type = BPF_TRACE_FENTRY, + .is_attach_btf = true, + .is_sleepable = true, + .attach_fn = attach_trace), + SEC_DEF("fmod_ret.s/", TRACING, + .expected_attach_type = BPF_MODIFY_RETURN, + .is_attach_btf = true, + .is_sleepable = true, + .attach_fn = attach_trace), + SEC_DEF("fexit.s/", TRACING, + .expected_attach_type = BPF_TRACE_FEXIT, + .is_attach_btf = true, + .is_sleepable = true, + .attach_fn = attach_trace), SEC_DEF("freplace/", EXT, .is_attach_btf = true, .attach_fn = attach_trace), @@@ -6917,6 -7584,11 +7584,11 @@@ .is_attach_btf = true, .expected_attach_type = BPF_LSM_MAC, .attach_fn = attach_lsm), + SEC_DEF("lsm.s/", LSM, + .is_attach_btf = true, + .is_sleepable = true, + .expected_attach_type = BPF_LSM_MAC, + .attach_fn = attach_lsm), SEC_DEF("iter/", TRACING, .expected_attach_type = BPF_TRACE_ITER, .is_attach_btf = true, @@@ -7122,8 -7794,7 +7794,7 @@@ static int bpf_object__collect_st_ops_r return -LIBBPF_ERRNO__FORMAT; }
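On the BPF program side, the new *.s section definitions make sleepable mode purely a matter of section naming; libbpf then sets BPF_F_SLEEPABLE automatically (see the __bpf_object__open() hunk above). A minimal sketch of a sleepable fentry program; the traced function and headers follow the usual selftests conventions and are illustrative:

  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>
  #include <bpf/bpf_tracing.h>

  char LICENSE[] SEC("license") = "GPL";

  SEC("fentry.s/do_unlinkat")	/* ".s" suffix => sleepable program */
  int BPF_PROG(trace_unlinkat, int dfd, struct filename *name)
  {
  	/* body may use sleepable-only helpers, e.g. bpf_copy_from_user() */
  	return 0;
  }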
- name = elf_strptr(obj->efile.elf, obj->efile.strtabidx, - sym.st_name) ? : "<?>"; + name = elf_sym_str(obj, sym.st_name) ?: "<?>"; map = find_struct_ops_map_by_offset(obj, rel.r_offset); if (!map) { pr_warn("struct_ops reloc: cannot find map at rel.r_offset %zu\n", @@@ -7640,7 -8311,7 +8311,7 @@@ int bpf_prog_load_xattr(const struct bp
prog->prog_ifindex = attr->ifindex; prog->log_level = attr->log_level; - prog->prog_flags = attr->prog_flags; + prog->prog_flags |= attr->prog_flags; if (!first_prog) first_prog = prog; } @@@ -8594,7 -9265,7 +9265,7 @@@ struct perf_buffer *perf_buffer__new(in struct perf_buffer_params p = {}; struct perf_event_attr attr = { 0, };
- attr.config = PERF_COUNT_SW_BPF_OUTPUT, + attr.config = PERF_COUNT_SW_BPF_OUTPUT; attr.type = PERF_TYPE_SOFTWARE; attr.sample_type = PERF_SAMPLE_RAW; attr.sample_period = 1; @@@ -8832,6 -9503,11 +9503,11 @@@ static int perf_buffer__process_records return 0; }
+ int perf_buffer__epoll_fd(const struct perf_buffer *pb) + { + return pb->epoll_fd; + } + int perf_buffer__poll(struct perf_buffer *pb, int timeout_ms) { int i, cnt, err; @@@ -8849,6 -9525,55 +9525,55 @@@ return cnt < 0 ? -errno : cnt; }
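perf_buffer__epoll_fd() exists so the ring buffers can be folded into an application-owned event loop instead of blocking inside perf_buffer__poll(). A minimal sketch, assuming pb came from perf_buffer__new() and with error handling trimmed:

  #include <sys/epoll.h>

  int efd = epoll_create1(0);
  struct epoll_event ev = {
  	.events = EPOLLIN,
  	.data.fd = perf_buffer__epoll_fd(pb),
  };

  epoll_ctl(efd, EPOLL_CTL_ADD, ev.data.fd, &ev);

  for (;;) {
  	struct epoll_event out;

  	if (epoll_wait(efd, &out, 1, -1) > 0)
  		perf_buffer__poll(pb, 0);	/* drain whatever is ready */
  }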
+ /* Return number of PERF_EVENT_ARRAY map slots set up by this perf_buffer + * manager. + */ + size_t perf_buffer__buffer_cnt(const struct perf_buffer *pb) + { + return pb->cpu_cnt; + } + + /* + * Return perf_event FD of a ring buffer in *buf_idx* slot of + * PERF_EVENT_ARRAY BPF map. This FD can be polled for new data using + * select()/poll()/epoll() Linux syscalls. + */ + int perf_buffer__buffer_fd(const struct perf_buffer *pb, size_t buf_idx) + { + struct perf_cpu_buf *cpu_buf; + + if (buf_idx >= pb->cpu_cnt) + return -EINVAL; + + cpu_buf = pb->cpu_bufs[buf_idx]; + if (!cpu_buf) + return -ENOENT; + + return cpu_buf->fd; + } + + /* + * Consume data from perf ring buffer corresponding to slot *buf_idx* in + * PERF_EVENT_ARRAY BPF map without waiting/polling. If there is no data to + * consume, do nothing and return success. + * Returns: + * - 0 on success; + * - <0 on failure. + */ + int perf_buffer__consume_buffer(struct perf_buffer *pb, size_t buf_idx) + { + struct perf_cpu_buf *cpu_buf; + + if (buf_idx >= pb->cpu_cnt) + return -EINVAL; + + cpu_buf = pb->cpu_bufs[buf_idx]; + if (!cpu_buf) + return -ENOENT; + + return perf_buffer__process_records(pb, cpu_buf); + } + int perf_buffer__consume(struct perf_buffer *pb) { int i, err; @@@ -8861,7 -9586,7 +9586,7 @@@
err = perf_buffer__process_records(pb, cpu_buf); if (err) { - pr_warn("error while processing records: %d\n", err); + pr_warn("perf_buffer: failed to process records in buffer #%d: %d\n", i, err); return err; } }
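Taken together, perf_buffer__buffer_cnt(), perf_buffer__buffer_fd() and perf_buffer__consume_buffer() let an application drive each per-CPU ring individually; a sketch, again assuming pb from perf_buffer__new() with error handling trimmed:

  size_t i, n = perf_buffer__buffer_cnt(pb);

  for (i = 0; i < n; i++) {
  	int fd = perf_buffer__buffer_fd(pb, i);

  	if (fd < 0)
  		continue;	/* slot not set up (e.g. offline CPU) */
  	/* ... poll/select on fd as needed, then drain just that ring: */
  	perf_buffer__consume_buffer(pb, i);
  }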