summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-11-18mrp: introduce active flags to prevent UAF when applicant uninitSchspa Shi2-5/+14
The caller of del_timer_sync must prevent restarting of the timer, If we have no this synchronization, there is a small probability that the cancellation will not be successful. And syzbot report the fellowing crash: ================================================================== BUG: KASAN: use-after-free in hlist_add_head include/linux/list.h:929 [inline] BUG: KASAN: use-after-free in enqueue_timer+0x18/0xa4 kernel/time/timer.c:605 Write at addr f9ff000024df6058 by task syz-fuzzer/2256 Pointer tag: [f9], memory tag: [fe] CPU: 1 PID: 2256 Comm: syz-fuzzer Not tainted 6.1.0-rc5-syzkaller-00008- ge01d50cbd6ee #0 Hardware name: linux,dummy-virt (DT) Call trace: dump_backtrace.part.0+0xe0/0xf0 arch/arm64/kernel/stacktrace.c:156 dump_backtrace arch/arm64/kernel/stacktrace.c:162 [inline] show_stack+0x18/0x40 arch/arm64/kernel/stacktrace.c:163 __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x68/0x84 lib/dump_stack.c:106 print_address_description mm/kasan/report.c:284 [inline] print_report+0x1a8/0x4a0 mm/kasan/report.c:395 kasan_report+0x94/0xb4 mm/kasan/report.c:495 __do_kernel_fault+0x164/0x1e0 arch/arm64/mm/fault.c:320 do_bad_area arch/arm64/mm/fault.c:473 [inline] do_tag_check_fault+0x78/0x8c arch/arm64/mm/fault.c:749 do_mem_abort+0x44/0x94 arch/arm64/mm/fault.c:825 el1_abort+0x40/0x60 arch/arm64/kernel/entry-common.c:367 el1h_64_sync_handler+0xd8/0xe4 arch/arm64/kernel/entry-common.c:427 el1h_64_sync+0x64/0x68 arch/arm64/kernel/entry.S:576 hlist_add_head include/linux/list.h:929 [inline] enqueue_timer+0x18/0xa4 kernel/time/timer.c:605 mod_timer+0x14/0x20 kernel/time/timer.c:1161 mrp_periodic_timer_arm net/802/mrp.c:614 [inline] mrp_periodic_timer+0xa0/0xc0 net/802/mrp.c:627 call_timer_fn.constprop.0+0x24/0x80 kernel/time/timer.c:1474 expire_timers+0x98/0xc4 kernel/time/timer.c:1519 To fix it, we can introduce a new active flags to make sure the timer will not restart. Reported-by: syzbot+6fd64001c20aa99e34a4@syzkaller.appspotmail.com Signed-off-by: Schspa Shi <schspa@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-18net: dsa: sja1105: disallow C45 transactions on the BASE-TX MDIO busVladimir Oltean1-0/+6
You'd think people know that the internal 100BASE-TX PHY on the SJA1110 responds only to clause 22 MDIO transactions, but they don't :) When a clause 45 transaction is attempted, sja1105_base_tx_mdio_read() and sja1105_base_tx_mdio_write() don't expect "reg" to contain bit 30 set (MII_ADDR_C45) and pack this value into the SPI transaction buffer. But the field in the SPI buffer has a width smaller than 30 bits, so we see this confusing message from the packing() API rather than a proper rejection of C45 transactions: Call trace: dump_stack+0x1c/0x38 sja1105_pack+0xbc/0xc0 [sja1105] sja1105_xfer+0x114/0x2b0 [sja1105] sja1105_xfer_u32+0x44/0xf4 [sja1105] sja1105_base_tx_mdio_read+0x44/0x7c [sja1105] mdiobus_read+0x44/0x80 get_phy_c45_ids+0x70/0x234 get_phy_device+0x68/0x15c fwnode_mdiobus_register_phy+0x74/0x240 of_mdiobus_register+0x13c/0x380 sja1105_mdiobus_register+0x368/0x490 [sja1105] sja1105_setup+0x94/0x119c [sja1105] Cannot store 401d2405 inside bits 24-4 (would truncate) Fixes: 5a8f09748ee7 ("net: dsa: sja1105: register the MDIO buses for 100base-T1 and 100base-TX") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-18Merge tag 'rxrpc-next-20221116' of ↵David S. Miller3-11/+19
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== rxrpc: Fix oops and missing config conditionals The patches that were pulled into net-next previously[1] had some issues that this patchset fixes: (1) Fix missing IPV6 config conditionals. (2) Fix an oops caused by calling udpv6_sendmsg() directly on an AF_INET socket. (3) Fix the validation of network addresses on entry to socket functions so that we don't allow an AF_INET6 address if we've selected an AF_INET transport socket. Link: https://lore.kernel.org/r/166794587113.2389296.16484814996876530222.stgit@warthog.procyon.org.uk/ [1] ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-18rxrpc: Fix race between conn bundle lookup and bundle removal [ZDI-CAN-15975]David Howells2-15/+24
After rxrpc_unbundle_conn() has removed a connection from a bundle, it checks to see if there are any conns with available channels and, if not, removes and attempts to destroy the bundle. Whilst it does check after grabbing client_bundles_lock that there are no connections attached, this races with rxrpc_look_up_bundle() retrieving the bundle, but not attaching a connection for the connection to be attached later. There is therefore a window in which the bundle can get destroyed before we manage to attach a new connection to it. Fix this by adding an "active" counter to struct rxrpc_bundle: (1) rxrpc_connect_call() obtains an active count by prepping/looking up a bundle and ditches it before returning. (2) If, during rxrpc_connect_call(), a connection is added to the bundle, this obtains an active count, which is held until the connection is discarded. (3) rxrpc_deactivate_bundle() is created to drop an active count on a bundle and destroy it when the active count reaches 0. The active count is checked inside client_bundles_lock() to prevent a race with rxrpc_look_up_bundle(). (4) rxrpc_unbundle_conn() then calls rxrpc_deactivate_bundle(). Fixes: 245500d853e9 ("rxrpc: Rewrite the client connection manager") Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-15975 Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: zdi-disclosures@trendmicro.com cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-18net: fix napi_disable() logic errorEric Dumazet1-2/+2
Dan reported a new warning after my recent patch: New smatch warnings: net/core/dev.c:6409 napi_disable() error: uninitialized symbol 'new'. Indeed, we must first wait for STATE_SCHED and STATE_NPSVC to be cleared, to make sure @new variable has been initialized properly. Fixes: 4ffa1d1c6842 ("net: adopt try_cmpxchg() in napi_{enable|disable}()") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-18rxrpc: uninitialized variable in rxrpc_send_ack_packet()Dan Carpenter1-2/+0
The "pkt" was supposed to have been deleted in a previous patch. It leads to an uninitialized variable bug. Fixes: 72f0c6fb0579 ("rxrpc: Allocate ACK records at proposal and queue for transmission") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-18rxrpc: fix rxkad_verify_response()Dan Carpenter1-2/+4
The error handling for if skb_copy_bits() fails was accidentally deleted so the rxkad_decrypt_ticket() function is not called. Fixes: 5d7edbc9231e ("rxrpc: Get rid of the Rx ring") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-18selftests/net: fix missing xdp_dummyWang Yufen5-15/+23
After commit afef88e65554 ("selftests/bpf: Store BPF object files with .bpf.o extension"), we should use xdp_dummy.bpf.o instade of xdp_dummy.o. In addition, use the BPF_FILE variable to save the BPF object file name, which can be better identified and modified. Fixes: afef88e65554 ("selftests/bpf: Store BPF object files with .bpf.o extension") Signed-off-by: Wang Yufen <wangyufen@huawei.com> Cc: Daniel Müller <deso@posteo.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-18net: ethernet: mtk_eth_soc: remove cpu_relax in mtk_pending_workLorenzo Bianconi1-5/+2
Get rid of cpu_relax in mtk_pending_work routine since MTK_RESETTING is set only in mtk_pending_work() and it runs holding rtnl lock Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-18net: ethernet: mtk_eth_soc: do not overwrite mtu configuration running reset ↵Lorenzo Bianconi1-19/+34
routine Restore user configured MTU running mtk_hw_init() during tx timeout routine since it will be overwritten after a hw reset. Reported-by: Felix Fietkau <nbd@nbd.name> Fixes: 9ea4d311509f ("net: ethernet: mediatek: add the whole ethernet reset into the reset process") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-18net: ipa: avoid a null pointer dereferenceAlex Elder1-3/+6
Dan Carpenter reported that Smatch found an instance where a pointer which had previously been assumed could be null (as indicated by a null check) was later dereferenced without a similar check. In practice this doesn't lead to a problem because currently the pointers used are all non-null. Nevertheless this patch addresses the reported problem. In addition, I spotted another bug that arose in the same commit. When the command to initialize a routing table memory region was added, the number of entries computed for the non-hashed table was wrong (it ended up being a Boolean rather than the count intended). This bug is fixed here as well. Reported-by: Dan Carpenter <error27@gmail.com> Link: https://lore.kernel.org/kernel-janitors/Y3OOP9dXK6oEydkf@kili Tested-by: Caleb Connolly <caleb.connolly@linaro.com> Fixes: 5cb76899fb47 ("net: ipa: reduce arguments to ipa_table_init_add()") Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-18Merge tag 'wireless-next-2022-11-18' of ↵David S. Miller106-1419/+5790
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Kalle Valo says: ==================== wireless-next patches for v6.2 Second set of patches for v6.2. Only driver patches this time, nothing really special. Unused platform data support was removed from wl1251 and rtw89 got WoWLAN support. Major changes: ath11k * support configuring channel dwell time during scan rtw89 * new dynamic header firmware format support * Wake-over-WLAN support rtl8xxxu * enable IEEE80211_HW_SUPPORT_FAST_XMIT ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-18Merge branch 'sctp-vrf'David S. Miller14-69/+461
Xin Long says: ==================== sctp: support vrf processing This patchset adds the VRF processing in SCTP. Simliar to TCP/UDP, it includes socket bind and socket/association lookup changes. For socket bind change, it allows sockets to bind to a VRF device and allows multiple sockets with the same IP and PORT to bind to different interfaces in patch 1-3. For socket/association lookup change, it adds dif and sdif check in both asoc and ep lookup in patch 4 and 5, and when binding to nodev, users can decide if accept the packets received from one l3mdev by setup a sysctl option in patch 6. Note with VRF support, in a netns, an association will be decided by src ip + src port + dst ip + dst port + bound_dev_if, and it's possible for ss to have: State Local Address:Port Peer Address:Port ESTAB 192.168.1.2%vrf-s1:1234 `- ESTAB 192.168.1.2%veth1:1234 192.168.1.1:1234 ESTAB 192.168.1.2%vrf-s2:1234 `- ESTAB 192.168.1.2%veth2:1234 192.168.1.1:1234 See the selftest in patch 7 for more usage. Also, thanks Carlo for testing this patch series on their use. v1->v2: - In Patch 5, move sctp_sk_bound_dev_eq() definition to net/sctp/ input.c to avoid a build error when IP_SCTP is disabled, as Paolo suggested. - In Patch 7, avoid one sleep by disabling the IPv6 dad, and remove another sleep by using ss to check if the server's ready, and also delete two unncessary sleeps in sctp_hello.c, as Paolo suggested. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-18selftests: add a selftest for sctp vrfXin Long3-0/+317
This patch adds 12 small test cases: 01-04 test for the sysctl net.sctp.l3mdev_accept. 05-10 test for only binding to a right l3mdev device, the connection can be created. 11-12 test for two socks binding to different l3mdev devices at the same time, each of them can process the packets from the corresponding peer. The tests run for both IPv4 and IPv6 SCTP. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-18sctp: add sysctl net.sctp.l3mdev_acceptXin Long2-0/+20
This patch is to add sysctl net.sctp.l3mdev_accept to allow users to change the pernet global l3mdev_accept. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-18sctp: add dif and sdif check in asoc and ep lookupXin Long8-61/+89
This patch at first adds a pernet global l3mdev_accept to decide if it accepts the packets from a l3mdev when a SCTP socket doesn't bind to any interface. It's set to 1 to avoid any possible incompatible issue, and in next patch, a sysctl will be introduced to allow to change it. Then similar to inet/udp_sk_bound_dev_eq(), sctp_sk_bound_dev_eq() is added to check either dif or sdif is equal to sk_bound_dev_if, and to check sid is 0 or l3mdev_accept is 1 if sk_bound_dev_if is not set. This function is used to match a association or a endpoint, namely called by sctp_addrs_lookup_transport() and sctp_endpoint_is_match(). All functions that needs updating are: sctp_rcv(): asoc: __sctp_rcv_lookup() __sctp_lookup_association() -> sctp_addrs_lookup_transport() __sctp_rcv_lookup_harder() __sctp_rcv_init_lookup() __sctp_lookup_association() -> sctp_addrs_lookup_transport() __sctp_rcv_walk_lookup() __sctp_rcv_asconf_lookup() __sctp_lookup_association() -> sctp_addrs_lookup_transport() ep: __sctp_rcv_lookup_endpoint() -> sctp_endpoint_is_match() sctp_connect(): sctp_endpoint_is_peeled_off() __sctp_lookup_association() sctp_has_association() sctp_lookup_association() __sctp_lookup_association() -> sctp_addrs_lookup_transport() sctp_diag_dump_one(): sctp_transport_lookup_process() -> sctp_addrs_lookup_transport() Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-18sctp: add skb_sdif in struct sctp_afXin Long3-1/+14
Add skb_sdif function in struct sctp_af to get the enslaved device for both ipv4 and ipv6 when adding SCTP VRF support in sctp_rcv in the next patch. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-18sctp: check sk_bound_dev_if when matching ep in get_portXin Long1-1/+4
In sctp_get_port_local(), when binding to IP and PORT, it should also check sk_bound_dev_if to match listening sk if it's set by SO_BINDTOIFINDEX, so that multiple sockets with the same IP and PORT, but different sk_bound_dev_if can be listened at the same time. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-18sctp: check ipv6 addr with sk_bound_dev if setXin Long1-3/+11
When binding to an ipv6 address, it calls ipv6_chk_addr() to check if this address is on any dev. If a socket binds to a l3mdev but no dev is passed to do this check, all l3mdev and slaves will be skipped and the check will fail. This patch is to pass the bound_dev to make sure the devices under the same l3mdev can be returned in ipv6_chk_addr(). When the bound_dev is not a l3mdev or l3slave, l3mdev_master_dev_rcu() will return NULL in __ipv6_chk_addr_and_flags(), it will keep compitable with before when NULL dev was passed. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-18sctp: verify the bind address with the tb_id from l3mdevXin Long1-3/+6
After binding to a l3mdev, it should use the route table from the corresponding VRF to verify the addr when binding to an address. Note ipv6 doesn't need it, as binding to ipv6 address does not verify the addr with route lookup. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-18spi: spi-imx: spi_imx_transfer_one(): check for DMA transfer firstMarc Kleine-Budde1-3/+7
The SPI framework checks for each transfer (with the struct spi_controller::can_dma callback) whether the driver wants to use DMA for the transfer. If the driver returns true, the SPI framework will map the transfer's data to the device, start the actual transfer and map the data back. In commit 07e759387788 ("spi: spi-imx: add PIO polling support") the spi-imx driver's spi_imx_transfer_one() function was extended. If the estimated duration of a transfer does not exceed a configurable duration, a polling transfer function is used. This check happens before checking if the driver decided earlier for a DMA transfer. If spi_imx_can_dma() decided to use a DMA transfer, and the user configured a big maximum polling duration, a polling transfer will be used. The DMA unmap after the transfer destroys the transferred data. To fix this problem check in spi_imx_transfer_one() if the driver decided for DMA transfer first, then check the limits for a polling transfer. Fixes: 07e759387788 ("spi: spi-imx: add PIO polling support") Link: https://lore.kernel.org/all/20221111003032.82371-1-festevam@gmail.com Reported-by: Frieder Schrempf <frieder.schrempf@kontron.de> Reported-by: Fabio Estevam <festevam@gmail.com> Tested-by: Fabio Estevam <festevam@gmail.com> Cc: David Jander <david@protonic.nl> Cc: stable@vger.kernel.org Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Tested-by: Frieder Schrempf <frieder.schrempf@kontron.de> Reviewed-by: Frieder Schrempf <frieder.schrempf@kontron.de> Link: https://lore.kernel.org/r/20221116164930.855362-1-mkl@pengutronix.de Signed-off-by: Mark Brown <broonie@kernel.org>
2022-11-18ASoC: SOF: dai: move AMD_HS to end of list to restore backwards-compatibilityPierre-Louis Bossart1-1/+1
The addition of AMD_HS breaks Mediatek platforms by using an index previously allocated to Mediatek. This is a backwards-compatibility issue and needs to be fixed. All firmware released by AMD needs to be re-generated and re-distributed. Fixes: ed2562c64b4f ("ASoC: SOF: Adding amd HS functionality to the sof core") Link: https://github.com/thesofproject/sof/issues/6615 Link: https://lore.kernel.org/alsa-devel/36a45c7a-820a-7675-d740-c0e83ae2c417@collabora.com/ Reported-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Daniel Baluta <daniel.baluta@nxp.com> Reviewed-by: Basavaraj Hiregoudar <basavaraj.hiregoudar@amd.com> Reviewed-by: V sujith kumar Reddy <Vsujithkumar.Reddy@amd.com> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Link: https://lore.kernel.org/r/20221117232120.112639-1-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-11-18net: libwx: Fix dead code for duplicate checkJiawen Wu1-2/+0
Fix duplicate check on polling timeout. Fixes: 1efa9bfe58c5 ("net: libwx: Implement interaction with firmware") Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-18ipvlan: hold lower dev to avoid possible use-after-freeMahesh Bandewar2-0/+3
Recently syzkaller discovered the issue of disappearing lower device (NETDEV_UNREGISTER) while the virtual device (like macvlan) is still having it as a lower device. So it's just a matter of time similar discovery will be made for IPvlan device setup. So fixing it preemptively. Also while at it, add a refcount tracker. Fixes: 2ad7bf363841 ("ipvlan: Initial check-in of the IPVLAN driver.") Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-18net: neigh: decrement the family specific qlenThomas Zeitlhofer2-29/+31
Commit 0ff4eb3d5ebb ("neighbour: make proxy_queue.qlen limit per-device") introduced the length counter qlen in struct neigh_parms. There are separate neigh_parms instances for IPv4/ARP and IPv6/ND, and while the family specific qlen is incremented in pneigh_enqueue(), the mentioned commit decrements always the IPv4/ARP specific qlen, regardless of the currently processed family, in pneigh_queue_purge() and neigh_proxy_process(). As a result, with IPv6/ND, the family specific qlen is only incremented (and never decremented) until it exceeds PROXY_QLEN, and then, according to the check in pneigh_enqueue(), neighbor solicitations are not answered anymore. As an example, this is noted when using the subnet-router anycast address to access a Linux router. After a certain amount of time (in the observed case, qlen exceeded PROXY_QLEN after two days), the Linux router stops answering neighbor solicitations for its subnet-router anycast address and effectively becomes unreachable. Another result with IPv6/ND is that the IPv4/ARP specific qlen is decremented more often than incremented. This leads to negative qlen values, as a signed integer has been used for the length counter qlen, and potentially to an integer overflow. Fix this by introducing the helper function neigh_parms_qlen_dec(), which decrements the family specific qlen. Thereby, make use of the existing helper function neigh_get_dev_parms_rcu(), whose definition therefore needs to be placed earlier in neighbour.c. Take the family member from struct neigh_table to determine the currently processed family and appropriately call neigh_parms_qlen_dec() from pneigh_queue_purge() and neigh_proxy_process(). Additionally, use an unsigned integer for the length counter qlen. Fixes: 0ff4eb3d5ebb ("neighbour: make proxy_queue.qlen limit per-device") Signed-off-by: Thomas Zeitlhofer <thomas.zeitlhofer+lkml@ze-it.at> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-18net: liquidio: simplify if expressionLeon Romanovsky1-2/+2
Fix the warning reported by kbuild: cocci warnings: (new ones prefixed by >>) >> drivers/net/ethernet/cavium/liquidio/lio_main.c:1797:54-56: WARNING !A || A && B is equivalent to !A || B drivers/net/ethernet/cavium/liquidio/lio_main.c:1827:54-56: WARNING !A || A && B is equivalent to !A || B Fixes: 8979f428a4af ("net: liquidio: release resources when liquidio driver open failed") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Saeed Mahameed <saeed@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-18net: phy: mscc: macsec: do not copy encryption keysAntoine Tenart2-29/+30
Following 1b16b3fdf675 ("net: phy: mscc: macsec: clear encryption keys when freeing a flow"), go one step further and instead of calling memzero_explicit on the key when freeing a flow, simply not copy the key in the first place as it's only used when a new flow is set up. Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-18gpu: host1x: Avoid trying to use GART on Tegra20Robin Murphy2-0/+8
Since commit c7e3ca515e78 ("iommu/tegra: gart: Do not register with bus") quite some time ago, the GART driver has effectively disabled itself to avoid issues with the GPU driver expecting it to work in ways that it doesn't. As of commit 57365a04c921 ("iommu: Move bus setup to IOMMU device registration") that bodge no longer works, but really the GPU driver should be responsible for its own behaviour anyway. Make the workaround explicit. Reported-by: Jon Hunter <jonathanh@nvidia.com> Suggested-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2022-11-18Merge branch 'mptcp-selftests-fix-timeouts-and-test-isolation'Jakub Kicinski3-9/+11
Mat Martineau says: ==================== mptcp: selftests: Fix timeouts and test isolation Patches 1 and 3 adjust test timeouts to reduce false negatives on slow machines. Patch 2 improves test isolation by running the mptcp_sockopt test in its own net namespace. ==================== Link: https://lore.kernel.org/r/20221115221046.20370-1-mathew.j.martineau@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-18Merge branch 'net-ipa-change-gsi-firmware-load-specification'Jakub Kicinski2-38/+135
Alex Elder says: ==================== net: ipa: change GSI firmware load specification Currently, GSI firmware must be loaded for IPA before it can be used--either by the modem, or by the AP. New hardware supports a third option, with the bootloader taking responsibility for loading GSI firmware. In that case, neither the AP nor the modem needs to do that. The first patch in this series deprecates the "modem-init" Device Tree property in the IPA binding, using a new "qcom,gsi-loader" property instead. The second and third implement logic in the code to support either the "old" or the "new" way of specifying how GSI firmware is loaded. The last two patches implement a new value for the "qcom,gsi-loader" property. If the value is "skip", neither the AP nor modem needs to load the GSI firmware. The first of these patches implements the change in the IPA binding; the second implements it in the code. ==================== Link: https://lore.kernel.org/r/20221116073257.34010-1-elder@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-18net: ipa: permit GSI firmware loading to be skippedAlex Elder1-4/+12
Define a new value "skip" for the "qcom,gsi-loader" Device Tree property. If used, it indicates that neither the AP nor the modem need to load GSI firmware (because it has already been loaded--for example by the boot loader). Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-18dt-bindings: net: qcom,ipa: support skipping GSI firmware loadAlex Elder1-0/+2
Add a new enumerated value to those defined for the qcom,gsi-loader property. If the qcom,gsi-loader is "skip", the GSI firmware will already be loaded, so neither the AP nor modem is required to load GSI firmware. Signed-off-by: Alex Elder <elder@linaro.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-18net: ipa: introduce "qcom,gsi-loader" propertyAlex Elder1-7/+37
Introduce a new way of specifying how the GSI firmware gets loaded for IPA. Currently, this is indicated by the presence or absence of the Boolean "modem-init" Device Tree property. The new property must have a value--either "self" or "modem"--which indicates whether the AP or modem is the GSI firmware loader, respectively. For legacy systems, the new property will not exist, and the "modem-init" property will be used. For newer systems, the "qcom,gsi-loader" property *must* exist, and must have one of the two prescribed values. It is an error to have both properties defined, and it is an error for the new property to have an unrecognized value. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-18net: ipa: encapsulate decision about firmware loadAlex Elder1-8/+31
The GSI layer used for IPA requires firmware to be loaded. Currently either the AP or the modem loads the firmware, distinguished by whether the "modem-init" Device Tree property is defined. Some newer systems implement a third option. In preparation for that, encapsulate the code that determines how the GSI firmware gets loaded in a new function, ipa_firmware_loader(). Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-18dt-bindings: net: qcom,ipa: deprecate modem-initAlex Elder1-21/+55
GSI firmware for IPA must be loaded during initialization, either by the AP or by the modem. The loader is currently specified based on whether the Boolean modem-init property is present. Instead, use a new property with an enumerated value to indicate explicitly how GSI firmware gets loaded. With this in place, a third approach can be added in an upcoming patch. The new qcom,gsi-loader property has two defined values: - self: The AP loads GSI firmware - modem: The modem loads GSI firmware The modem-init property must still be supported, but is now marked deprecated. Update the example so it represents the SC7180 SoC, and provide examples for the qcom,gsi-loader, memory-region, and firmware-name properties. Signed-off-by: Alex Elder <elder@linaro.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-18selftests: mptcp: fix mibit vs mbit mix upMatthieu Baerts1-2/+3
The estimated time was supposing the rate was expressed in mibit (bit * 1024^2) but it is in mbit (bit * 1000^2). This makes the threshold higher but in a more realistic way to avoid false positives reported by CI instances. Before this patch, the thresholds were at 7561/4005ms and now they are at 7906/4178ms. While at it, also fix a typo in the linked comment, spotted by Mat. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/310 Fixes: 1a418cb8e888 ("mptcp: simult flow self-tests") Suggested-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-18selftests: mptcp: run mptcp_sockopt from a new netnsMatthieu Baerts1-4/+5
Not running it from a new netns causes issues if some MPTCP settings are modified, e.g. if MPTCP is disabled from the sysctl knob, if multiple addresses are available and added to the MPTCP path-manager, etc. In these cases, the created connection will not behave as expected, e.g. unable to create an MPTCP socket, more than one subflow is seen, etc. A new "sandbox" net namespace is now created and used to run mptcp_sockopt from this controlled environment. Fixes: ce9979129a0b ("selftests: mptcp: add mptcp getsockopt test cases") Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-18selftests: mptcp: gives slow test-case more timePaolo Abeni1-3/+3
On slow or busy VM, some test-cases still fail because the data transfer completes before the endpoint manipulation actually took effect. Address the issue by artificially increasing the runtime for the relevant test-cases. Fixes: ef360019db40 ("selftests: mptcp: signal addresses testcases") Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/309 Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-18sctp: move SCTP_PAD4 and SCTP_TRUNC4 to linux/sctp.hXin Long4-7/+5
Move these two macros from net/sctp/sctp.h to linux/sctp.h, so that it will be enough to include only linux/sctp.h in nft_exthdr.c and xt_sctp.c. It should not include "net/sctp/sctp.h" if a module does not have a dependence on SCTP module. Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Saeed Mahameed <saeed@kernel.org> Link: https://lore.kernel.org/r/ef6468a687f36da06f575c2131cd4612f6b7be88.1668526821.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-18sctp: change to include linux/sctp.h in net/sctp/checksum.hXin Long1-1/+1
Currently "net/sctp/checksum.h" including "net/sctp/sctp.h" is included in quite some places in netfilter and openswitch and net/sched. It's not necessary to include "net/sctp/sctp.h" if a module does not have dependence on SCTP, "linux/sctp.h" is the right one to include. Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Saeed Mahameed <saeed@kernel.org> Link: https://lore.kernel.org/r/ca7ea96d62a26732f0491153c3979dc1c0d8d34a.1668526793.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-18Merge branch 'implement-devlink-rate-api-and-extend-it'Jakub Kicinski18-28/+970
Michal Wilczynski says: ==================== Implement devlink-rate API and extend it This patch series implements devlink-rate for ice driver. Unfortunately current API isn't flexible enough for our use case, so there is a need to extend it. Some functions have been introduced to enable the driver to export current Tx scheduling configuration. Pasting justification for this series from commit implementing devlink-rate in ice driver(that is a part of this series): There is a need to support modification of Tx scheduler tree, in the ice driver. This will allow user to control Tx settings of each node in the internal hierarchy of nodes. As a result user will be able to use Hierarchy QoS implemented entirely in the hardware. This patch implemenents devlink-rate API. It also exports initial default hierarchy. It's mostly dictated by the fact that the tree can't be removed entirely, all we can do is enable the user to modify it. For example root node shouldn't ever be removed, also nodes that have children are off-limits. Example initial tree with 2 VF's: [root@fedora ~]# devlink port function rate show pci/0000:4b:00.0/node_27: type node parent node_26 pci/0000:4b:00.0/node_26: type node parent node_0 pci/0000:4b:00.0/node_34: type node parent node_33 pci/0000:4b:00.0/node_33: type node parent node_32 pci/0000:4b:00.0/node_32: type node parent node_16 pci/0000:4b:00.0/node_19: type node parent node_18 pci/0000:4b:00.0/node_18: type node parent node_17 pci/0000:4b:00.0/node_17: type node parent node_16 pci/0000:4b:00.0/node_21: type node parent node_20 pci/0000:4b:00.0/node_20: type node parent node_3 pci/0000:4b:00.0/node_14: type node parent node_5 pci/0000:4b:00.0/node_5: type node parent node_3 pci/0000:4b:00.0/node_13: type node parent node_4 pci/0000:4b:00.0/node_12: type node parent node_4 pci/0000:4b:00.0/node_11: type node parent node_4 pci/0000:4b:00.0/node_10: type node parent node_4 pci/0000:4b:00.0/node_9: type node parent node_4 pci/0000:4b:00.0/node_8: type node parent node_4 pci/0000:4b:00.0/node_7: type node parent node_4 pci/0000:4b:00.0/node_6: type node parent node_4 pci/0000:4b:00.0/node_4: type node parent node_3 pci/0000:4b:00.0/node_3: type node parent node_16 pci/0000:4b:00.0/node_16: type node parent node_15 pci/0000:4b:00.0/node_15: type node parent node_0 pci/0000:4b:00.0/node_2: type node parent node_1 pci/0000:4b:00.0/node_1: type node parent node_0 pci/0000:4b:00.0/node_0: type node pci/0000:4b:00.0/1: type leaf parent node_27 pci/0000:4b:00.0/2: type leaf parent node_27 Let me visualize part of the tree: +---------+ | node_0 | +---------+ | +----v----+ | node_26 | +----+----+ | +----v----+ | node_27 | +----+----+ | |-----------------| +----v----+ +----v----+ | VF 1 | | VF 2 | +----+----+ +----+----+ So at this point there is a couple things that can be done. For example we could only assign parameters to VF's. [root@fedora ~]# devlink port function rate set pci/0000:4b:00.0/1 \ tx_max 5Gbps This would cap the VF 1 BW to 5Gbps. But let's say you would like to create a completely new branch. This can be done like this: [root@fedora ~]# devlink port function rate add \ pci/0000:4b:00.0/node_custom parent node_0 [root@fedora ~]# devlink port function rate add \ pci/0000:4b:00.0/node_custom_1 parent node_custom [root@fedora ~]# devlink port function rate set \ pci/0000:4b:00.0/1 parent node_custom_1 This creates a completely new branch and reassigns VF 1 to it. A number of parameters is supported per each node: tx_max, tx_share, tx_priority and tx_weight. ==================== Link: https://lore.kernel.org/r/20221115104825.172668-1-michal.wilczynski@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-18Documentation: Add documentation for new devlink-rate attributesMichal Wilczynski1-1/+32
Provide documentation for newly introduced netlink attributes for devlink-rate: tx_priority and tx_weight. Mention the possibility to export tree from the driver. Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-18ice: Add documentation for devlink-rate implementationMichal Wilczynski1-0/+115
Add documentation to a newly added devlink-rate feature. Provide some examples on how to use the commands, which netlink attributes are supported and descriptions of the attributes. Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-18ice: Prevent ADQ, DCB coexistence with Custom Tx schedulerMichal Wilczynski6-0/+111
ADQ, DCB might interfere with Custom Tx Scheduler changes that user might introduce using devlink-rate API. Check if ADQ, DCB is active, when user tries to change any setting in exported Tx scheduler tree. If any of those are active block the user from doing so, and log an appropriate message. Remove the exported hierarchy if user enable ADQ or DCB. Prevent ADQ or DCB from getting configured if user already made some changes using devlink-rate API. Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-18ice: Implement devlink-rate APIMichal Wilczynski3-0/+426
There is a need to support modification of Tx scheduler tree, in the ice driver. This will allow user to control Tx settings of each node in the internal hierarchy of nodes. As a result user will be able to use Hierarchy QoS implemented entirely in the hardware. This patch implemenents devlink-rate API. It also exports initial default hierarchy. It's mostly dictated by the fact that the tree can't be removed entirely, all we can do is enable the user to modify it. For example root node shouldn't ever be removed, also nodes that have children are off-limits. Example initial tree with 2 VF's: [root@fedora ~]# devlink port function rate show pci/0000:4b:00.0/node_27: type node parent node_26 pci/0000:4b:00.0/node_26: type node parent node_0 pci/0000:4b:00.0/node_34: type node parent node_33 pci/0000:4b:00.0/node_33: type node parent node_32 pci/0000:4b:00.0/node_32: type node parent node_16 pci/0000:4b:00.0/node_19: type node parent node_18 pci/0000:4b:00.0/node_18: type node parent node_17 pci/0000:4b:00.0/node_17: type node parent node_16 pci/0000:4b:00.0/node_21: type node parent node_20 pci/0000:4b:00.0/node_20: type node parent node_3 pci/0000:4b:00.0/node_14: type node parent node_5 pci/0000:4b:00.0/node_5: type node parent node_3 pci/0000:4b:00.0/node_13: type node parent node_4 pci/0000:4b:00.0/node_12: type node parent node_4 pci/0000:4b:00.0/node_11: type node parent node_4 pci/0000:4b:00.0/node_10: type node parent node_4 pci/0000:4b:00.0/node_9: type node parent node_4 pci/0000:4b:00.0/node_8: type node parent node_4 pci/0000:4b:00.0/node_7: type node parent node_4 pci/0000:4b:00.0/node_6: type node parent node_4 pci/0000:4b:00.0/node_4: type node parent node_3 pci/0000:4b:00.0/node_3: type node parent node_16 pci/0000:4b:00.0/node_16: type node parent node_15 pci/0000:4b:00.0/node_15: type node parent node_0 pci/0000:4b:00.0/node_2: type node parent node_1 pci/0000:4b:00.0/node_1: type node parent node_0 pci/0000:4b:00.0/node_0: type node pci/0000:4b:00.0/1: type leaf parent node_27 pci/0000:4b:00.0/2: type leaf parent node_27 Let me visualize part of the tree: +---------+ | node_0 | +---------+ | +----v----+ | node_26 | +----+----+ | +----v----+ | node_27 | +----+----+ | |-----------------| +----v----+ +----v----+ | VF 1 | | VF 2 | +----+----+ +----+----+ So at this point there is a couple things that can be done. For example we could only assign parameters to VF's. [root@fedora ~]# devlink port function rate set pci/0000:4b:00.0/1 \ tx_max 5Gbps This would cap the VF 1 BW to 5Gbps. But let's say you would like to create a completely new branch. This can be done like this: [root@fedora ~]# devlink port function rate add \ pci/0000:4b:00.0/node_custom parent node_0 [root@fedora ~]# devlink port function rate add \ pci/0000:4b:00.0/node_custom_1 parent node_custom [root@fedora ~]# devlink port function rate set \ pci/0000:4b:00.0/1 parent node_custom_1 This creates a completely new branch and reassigns VF 1 to it. A number of parameters is supported per each node: tx_max, tx_share, tx_priority and tx_weight. Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-18ice: Add an option to pre-allocate memory for ice_sched_nodeMichal Wilczynski4-11/+24
devlink-rate API requires a priv object to be allocated when node still doesn't have a parent. This is problematic, because ice_sched_node can't be currently created without a parent. Add an option to pre-allocate memory for ice_sched_node struct. Add new arguments to ice_sched_add() and ice_sched_add_elems() that allow for pre-allocation of memory for ice_sched_node struct. Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-18ice: Introduce new parameters in ice_sched_nodeMichal Wilczynski5-7/+116
To support new devlink-rate API ice_sched_node struct needs to store a number of additional parameters. This includes tx_max, tx_share, tx_weight, and tx_priority. Add new fields to ice_sched_node struct. Add new functions to configure the hardware with new parameters. Introduce new xarray to identify nodes uniquely. Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-18devlink: Allow to set up parent in devl_rate_leaf_create()Michal Wilczynski4-5/+14
Currently the driver is able to create leaf nodes for the devlink-rate, but is unable to set parent for them. This wasn't as issue before the possibility to export hierarchy from the driver. After adding the export feature, in order for the driver to supply correct hierarchy, it's necessary for it to be able to supply a parent name to devl_rate_leaf_create(). Introduce a new parameter 'parent_name' in devl_rate_leaf_create(). Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-18devlink: Allow for devlink-rate nodes parent reassignmentMichal Wilczynski1-5/+7
Currently it's not possible to reassign the parent of the node using one command. As the previous commit introduced a way to export entire hierarchy from the driver, being able to modify and reassign parents become important. This way user might easily change QoS settings without interrupting traffic. Example command: devlink port function rate set pci/0000:4b:00.0/1 parent node_custom_1 This reassigns leaf node parent to node_custom_1. Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-18devlink: Enable creation of the devlink-rate nodes from the driverMichal Wilczynski2-0/+48
Intel 100G card internal firmware hierarchy for Hierarchicial QoS is very rigid and can't be easily removed. This requires an ability to export default hierarchy to allow user to modify it. Currently the driver is only able to create the 'leaf' nodes, which usually represent the vport. This is not enough for HQoS implemented in Intel hardware. Introduce new function devl_rate_node_create() that allows for creation of the devlink-rate nodes from the driver. Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>