summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-08-18net: dsa: sja1105: fix buffer overflow in sja1105_setup_devlink_regions()Rustam Subkhankulov1-1/+1
If an error occurs in dsa_devlink_region_create(), then 'priv->regions' array will be accessed by negative index '-1'. Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Rustam Subkhankulov <subkhankulov@ispras.ru> Fixes: bf425b82059e ("net: dsa: sja1105: expose static config as devlink region") Link: https://lore.kernel.org/r/20220817003845.389644-1-subkhankulov@ispras.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nfJakub Kicinski9-261/+390
Florian Westphal says: ==================== netfilter: conntrack and nf_tables bug fixes The following patchset contains netfilter fixes for net. Broken since 5.19: A few ancient connection tracking helpers assume TCP packets cannot exceed 64kb in size, but this isn't the case anymore with 5.19 when BIG TCP got merged, from myself. Regressions since 5.19: 1. 'conntrack -E expect' won't display anything because nfnetlink failed to enable events for expectations, only for normal conntrack events. 2. partially revert change that added resched calls to a function that can be in atomic context. Both broken and fixed up by myself. Broken for several releases (up to original merge of nf_tables): Several fixes for nf_tables control plane, from Pablo. This fixes up resource leaks in error paths and adds more sanity checks for mutually exclusive attributes/flags. Kconfig: NF_CONNTRACK_PROCFS is very old and doesn't provide all info provided via ctnetlink, so it should not default to y. From Geert Uytterhoeven. Selftests: rework nft_flowtable.sh: it frequently indicated failure; the way it tried to detect an offload failure did not work reliably. * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: testing: selftests: nft_flowtable.sh: rework test to detect offload failure testing: selftests: nft_flowtable.sh: use random netns names netfilter: conntrack: NF_CONNTRACK_PROCFS should no longer default to y netfilter: nf_tables: check NFT_SET_CONCAT flag if field_count is specified netfilter: nf_tables: disallow NFT_SET_ELEM_CATCHALL and NFT_SET_ELEM_INTERVAL_END netfilter: nf_tables: NFTA_SET_ELEM_KEY_END requires concat and interval flags netfilter: nf_tables: validate NFTA_SET_ELEM_OBJREF based on NFT_SET_OBJECT flag netfilter: nf_tables: really skip inactive sets when allocating name netfilter: nfnetlink: re-enable conntrack expectation events netfilter: nf_tables: fix scheduling-while-atomic splat netfilter: nf_ct_irc: cap packet search space to 4k netfilter: nf_ct_ftp: prefer skb_linearize netfilter: nf_ct_h323: cap packet size at 64k netfilter: nf_ct_sane: remove pseudo skb linearization netfilter: nf_tables: possible module reference underflow in error path netfilter: nf_tables: disallow NFTA_SET_ELEM_KEY_END with NFT_SET_ELEM_INTERVAL_END flag netfilter: nf_tables: use READ_ONCE and WRITE_ONCE for shared generation id access ==================== Link: https://lore.kernel.org/r/20220817140015.25843-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-18net: Fix suspicious RCU usage in bpf_sk_reuseport_detach()David Howells2-1/+26
bpf_sk_reuseport_detach() calls __rcu_dereference_sk_user_data_with_flags() to obtain the value of sk->sk_user_data, but that function is only usable if the RCU read lock is held, and neither that function nor any of its callers hold it. Fix this by adding a new helper, __locked_read_sk_user_data_with_flags() that checks to see if sk->sk_callback_lock() is held and use that here instead. Alternatively, making __rcu_dereference_sk_user_data_with_flags() use rcu_dereference_checked() might suffice. Without this, the following warning can be occasionally observed: ============================= WARNING: suspicious RCU usage 6.0.0-rc1-build2+ #563 Not tainted ----------------------------- include/net/sock.h:592 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 5 locks held by locktest/29873: #0: ffff88812734b550 (&sb->s_type->i_mutex_key#9){+.+.}-{3:3}, at: __sock_release+0x77/0x121 #1: ffff88812f5621b0 (sk_lock-AF_INET){+.+.}-{0:0}, at: tcp_close+0x1c/0x70 #2: ffff88810312f5c8 (&h->lhash2[i].lock){+.+.}-{2:2}, at: inet_unhash+0x76/0x1c0 #3: ffffffff83768bb8 (reuseport_lock){+...}-{2:2}, at: reuseport_detach_sock+0x18/0xdd #4: ffff88812f562438 (clock-AF_INET){++..}-{2:2}, at: bpf_sk_reuseport_detach+0x24/0xa4 stack backtrace: CPU: 1 PID: 29873 Comm: locktest Not tainted 6.0.0-rc1-build2+ #563 Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014 Call Trace: <TASK> dump_stack_lvl+0x4c/0x5f bpf_sk_reuseport_detach+0x6d/0xa4 reuseport_detach_sock+0x75/0xdd inet_unhash+0xa5/0x1c0 tcp_set_state+0x169/0x20f ? lockdep_sock_is_held+0x3a/0x3a ? __lock_release.isra.0+0x13e/0x220 ? reacquire_held_locks+0x1bb/0x1bb ? hlock_class+0x31/0x96 ? mark_lock+0x9e/0x1af __tcp_close+0x50/0x4b6 tcp_close+0x28/0x70 inet_release+0x8e/0xa7 __sock_release+0x95/0x121 sock_close+0x14/0x17 __fput+0x20f/0x36a task_work_run+0xa3/0xcc exit_to_user_mode_prepare+0x9c/0x14d syscall_exit_to_user_mode+0x18/0x44 entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: cf8c1e967224 ("net: refactor bpf_sk_reuseport_detach()") Signed-off-by: David Howells <dhowells@redhat.com> cc: Hawkins Jiawei <yin31149@gmail.com> Link: https://lore.kernel.org/r/166064248071.3502205.10036394558814861778.stgit@warthog.procyon.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-18Merge tag 'ntfs3_for_6.0' of ↵Linus Torvalds14-329/+835
https://github.com/Paragon-Software-Group/linux-ntfs3 Pull ntfs3 updates from Konstantin Komarov: - implement FALLOC_FL_INSERT_RANGE - fix some logic errors - fixed xfstests (tested on x86_64): generic/064 generic/213 generic/300 generic/361 generic/449 generic/485 - some dead code removed or refactored * tag 'ntfs3_for_6.0' of https://github.com/Paragon-Software-Group/linux-ntfs3: (39 commits) fs/ntfs3: uninitialized variable in ntfs_set_acl_ex() fs/ntfs3: Remove unused function wnd_bits fs/ntfs3: Make ni_ins_new_attr return error fs/ntfs3: Create MFT zone only if length is large enough fs/ntfs3: Refactoring attr_insert_range to restore after errors fs/ntfs3: Refactoring attr_punch_hole to restore after errors fs/ntfs3: Refactoring attr_set_size to restore after errors fs/ntfs3: New function ntfs_bad_inode fs/ntfs3: Make MFT zone less fragmented fs/ntfs3: Check possible errors in run_pack in advance fs/ntfs3: Added comments to frecord functions fs/ntfs3: Fill duplicate info in ni_add_name fs/ntfs3: Make static function attr_load_runs fs/ntfs3: Add new argument is_mft to ntfs_mark_rec_free fs/ntfs3: Remove unused mi_mark_free fs/ntfs3: Fix very fragmented case in attr_punch_hole fs/ntfs3: Fix work with fragmented xattr fs/ntfs3: Make ntfs_fallocate return -ENOSPC instead of -EFBIG fs/ntfs3: extend ni_insert_nonresident to return inserted ATTR_LIST_ENTRY fs/ntfs3: Check reserved size for maximum allowed ...
2022-08-18dcache: move the DCACHE_OP_COMPARE case out of the __d_lookup_rcu loopLinus Torvalds1-23/+49
__d_lookup_rcu() is one of the hottest functions in the kernel on certain loads, and it is complicated by filesystems that might want to have their own name compare function. We can improve code generation by moving the test of DCACHE_OP_COMPARE outside the loop, which makes the loop itself much simpler, at the cost of some code duplication. But both cases end up being simpler, and the "native" direct case-sensitive compare particularly so. Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-08-17net: dsa: microchip: ksz9477: fix fdb_dump last invalid entryArun Ramadoss1-0/+3
In the ksz9477_fdb_dump function it reads the ALU control register and exit from the timeout loop if there is valid entry or search is complete. After exiting the loop, it reads the alu entry and report to the user space irrespective of entry is valid. It works till the valid entry. If the loop exited when search is complete, it reads the alu table. The table returns all ones and it is reported to user space. So bridge fdb show gives ff:ff:ff:ff:ff:ff as last entry for every port. To fix it, after exiting the loop the entry is reported only if it is valid one. Fixes: b987e98e50ab ("dsa: add DSA switch driver for Microchip KSZ9477") Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20220816105516.18350-1-arun.ramadoss@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-17ice: Fix VF not able to send tagged traffic with no VLAN filtersSylwester Dziedziuch2-11/+57
VF was not able to send tagged traffic when it didn't have any VLAN interfaces and VLAN anti-spoofing was enabled. Fix this by allowing VFs with no VLAN filters to send tagged traffic. After VF adds a VLAN interface it will be able to send tagged traffic matching VLAN filters only. Testing hints: 1. Spawn VF 2. Send tagged packet from a VF 3. The packet should be sent out and not dropped 4. Add a VLAN interface on VF 5. Send tagged packet on that VLAN interface 6. Packet should be sent out and not dropped 7. Send tagged packet with id different than VLAN interface 8. Packet should be dropped Fixes: daf4dd16438b ("ice: Refactor spoofcheck configuration functions") Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-17ice: Ignore error message when setting same promiscuous modeBenjamin Mikailenko1-4/+4
Commit 1273f89578f2 ("ice: Fix broken IFF_ALLMULTI handling") introduced new checks when setting/clearing promiscuous mode. But if the requested promiscuous mode setting already exists, an -EEXIST error message would be printed. This is incorrect because promiscuous mode is either on/off and shouldn't print an error when the requested configuration is already set. This can happen when removing a bridge with two bonded interfaces and promiscuous most isn't fully cleared from VLAN VSI in hardware. Fix this by ignoring cases where requested promiscuous mode exists. Fixes: 1273f89578f2 ("ice: Fix broken IFF_ALLMULTI handling") Signed-off-by: Benjamin Mikailenko <benjamin.mikailenko@intel.com> Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com> Link: https://lore.kernel.org/all/CAK8fFZ7m-KR57M_rYX6xZN39K89O=LGooYkKsu6HKt0Bs+x6xQ@mail.gmail.com/ Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-17ice: Fix clearing of promisc mode with bridge over bondGrzegorz Siwik2-2/+16
When at least two interfaces are bonded and a bridge is enabled on the bond, an error can occur when the bridge is removed and re-added. The reason for the error is because promiscuous mode was not fully cleared from the VLAN VSI in the hardware. With this change, promiscuous mode is properly removed when the bridge disconnects from bonding. [ 1033.676359] bond1: link status definitely down for interface enp95s0f0, disabling it [ 1033.676366] bond1: making interface enp175s0f0 the new active one [ 1033.676369] device enp95s0f0 left promiscuous mode [ 1033.676522] device enp175s0f0 entered promiscuous mode [ 1033.676901] ice 0000:af:00.0 enp175s0f0: Error setting Multicast promiscuous mode on VSI 6 [ 1041.795662] ice 0000:af:00.0 enp175s0f0: Error setting Multicast promiscuous mode on VSI 6 [ 1041.944826] bond1: link status definitely down for interface enp175s0f0, disabling it [ 1041.944874] device enp175s0f0 left promiscuous mode [ 1041.944918] bond1: now running without any active interface! Fixes: c31af68a1b94 ("ice: Add outer_vlan_ops and VSI specific VLAN ops implementations") Co-developed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com> Link: https://lore.kernel.org/all/CAK8fFZ7m-KR57M_rYX6xZN39K89O=LGooYkKsu6HKt0Bs+x6xQ@mail.gmail.com/ Tested-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com> Tested-by: Igor Raits <igor@gooddata.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-17ice: Ignore EEXIST when setting promisc modeGrzegorz Siwik1-1/+1
Ignore EEXIST error when setting promiscuous mode. This fix is needed because the driver could set promiscuous mode when it still has not cleared properly. Promiscuous mode could be set only once, so setting it second time will be rejected. Fixes: 5eda8afd6bcc ("ice: Add support for PF/VF promiscuous mode") Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com> Link: https://lore.kernel.org/all/CAK8fFZ7m-KR57M_rYX6xZN39K89O=LGooYkKsu6HKt0Bs+x6xQ@mail.gmail.com/ Tested-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com> Tested-by: Igor Raits <igor@gooddata.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-17ice: Fix double VLAN error when entering promisc modeGrzegorz Siwik1-0/+7
Avoid enabling or disabling VLAN 0 when trying to set promiscuous VLAN mode if double VLAN mode is enabled. This fix is needed because the driver tries to add the VLAN 0 filter twice (once for inner and once for outer) when double VLAN mode is enabled. The filter program is rejected by the firmware when double VLAN is enabled, because the promiscuous filter only needs to be set once. This issue was missed in the initial implementation of double VLAN mode. Fixes: 5eda8afd6bcc ("ice: Add support for PF/VF promiscuous mode") Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com> Link: https://lore.kernel.org/all/CAK8fFZ7m-KR57M_rYX6xZN39K89O=LGooYkKsu6HKt0Bs+x6xQ@mail.gmail.com/ Tested-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com> Tested-by: Igor Raits <igor@gooddata.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-17Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds15-121/+59
Pull virtio fixes from Michael Tsirkin: "Most notably this drops the commits that trip up google cloud (turns out, any legacy device). Plus a kerneldoc patch" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio: kerneldocs fixes and enhancements virtio: Revert "virtio: find_vqs() add arg sizes" virtio_vdpa: Revert "virtio_vdpa: support the arg sizes of find_vqs()" virtio_pci: Revert "virtio_pci: support the arg sizes of find_vqs()" virtio-mmio: Revert "virtio_mmio: support the arg sizes of find_vqs()" virtio: Revert "virtio: add helper virtio_find_vqs_ctx_size()" virtio_net: Revert "virtio_net: set the default max ring size by find_vqs()"
2022-08-17testing: selftests: nft_flowtable.sh: rework test to detect offload failureFlorian Westphal1-57/+84
This test fails on current kernel releases because the flotwable path now calls dst_check from packet path and will then remove the offload. Test script has two purposes: 1. check that file (random content) can be sent to other netns (and vv) 2. check that the flow is offloaded (rather than handled by classic forwarding path). Since dst_check is in place, 2) fails because the nftables ruleset in router namespace 1 intentionally blocks traffic under the assumption that packets are not passed via classic path at all. Rework this: Instead of blocking traffic, create two named counters, one for original and one for reverse direction. The first three test cases are handled by classic forwarding path (path mtu discovery is disabled and packets exceed MTU). But all other tests enable PMTUD, so the originator and responder are expected to lower packet size and flowtable is expected to do the packet forwarding. For those tests, check that the packet counters (which are only incremented for packets that are passed up to classic forward path) are significantly lower than the file size transferred. I've tested that the counter-checks fail as expected when the 'flow add' statement is removed from the ruleset. Signed-off-by: Florian Westphal <fw@strlen.de>
2022-08-17Merge branch '40GbE' of ↵David S. Miller2-4/+8
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue ==================== Intel Wired LAN Driver Updates 2022-08-16) This series contains updates to i40e driver only. Przemyslaw fixes issue with checksum offload on VXLAN tunnels. Alan disables VSI for Tx timeout when all recovery methods have failed. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-17tls: rx: react to strparser initialization errorsJakub Kicinski1-1/+3
Even though the normal strparser's init function has a return value we got away with ignoring errors until now, as it only validates the parameters and we were passing correct parameters. tls_strp can fail to init on memory allocation errors, which syzbot duly induced and reported. Reported-by: syzbot+abd45eb849b05194b1b6@syzkaller.appspotmail.com Fixes: 84c61fe1a75b ("tls: rx: do not use the standard strparser") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-17testing: selftests: nft_flowtable.sh: use random netns namesFlorian Westphal1-118/+128
"ns1" is a too generic name, use a random suffix to avoid errors when such a netns exists. Also allows to run multiple instances of the script in parallel. Signed-off-by: Florian Westphal <fw@strlen.de>
2022-08-17netfilter: conntrack: NF_CONNTRACK_PROCFS should no longer default to yGeert Uytterhoeven1-1/+0
NF_CONNTRACK_PROCFS was marked obsolete in commit 54b07dca68557b09 ("netfilter: provide config option to disable ancient procfs parts") in v3.3. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Florian Westphal <fw@strlen.de>
2022-08-17net: sched: fix misuse of qcpu->backlog in gnet_stats_add_queue_cpuZhengchao Shao1-1/+1
In the gnet_stats_add_queue_cpu function, the qstats->qlen statistics are incorrectly set to qcpu->backlog. Fixes: 448e163f8b9b ("gen_stats: Add gnet_stats_add_queue()") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Link: https://lore.kernel.org/r/20220815030848.276746-1-shaozhengchao@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-16Merge tag 'nios2_fixes_v6.0' of ↵Linus Torvalds5-9/+22
git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux Pull NIOS2 fixes from Dinh Nguyen: - Security fixes from Al Viro * tag 'nios2_fixes_v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux: nios2: add force_successful_syscall_return() nios2: restarts apply only to the first sigframe we build... nios2: fix syscall restart checks nios2: traced syscall does need to check the syscall number nios2: don't leave NULLs in sys_call_table[] nios2: page fault et.al. are *not* restartable syscalls...
2022-08-16Merge tag 'spi-fix-v6.0-rc1' of ↵Linus Torvalds6-39/+112
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "A few fixes that came in since my pull request, the Meson fix is a little large since it's fixing all possible cases of the problem that was observed with the driver and clock API trying to share configuration by integrating the device clocking fully with the clock API rather than spot fixing the one instance that was observed" * tag 'spi-fix-v6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: dt-bindings: Drop Pratyush Yadav spi: meson-spicc: add local pow2 clock ops to preserve rate between messages MAINTAINERS: rectify entry for ARM/HPE GXP ARCHITECTURE spi: spi.c: Add missing __percpu annotations in users of spi_statistics
2022-08-16Merge tag 'regulator-fix-v6.0-rc1' of ↵Linus Torvalds2-12/+1
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fixes from Mark Brown: "A couple of small fixes that came in since my pull request, nothing major here" * tag 'regulator-fix-v6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: core: Fix missing error return from regulator_bulk_get() regulator: pca9450: Remove restrictions for regulator-name
2022-08-16x86: simplify load_unaligned_zeropad() implementationLinus Torvalds3-43/+60
The exception for the "unaligned access at the end of the page, next page not mapped" never happens, but the fixup code ends up causing trouble for compilers to optimize well. clang in particular ends up seeing it being in the middle of a loop, and tries desperately to optimize the exception fixup code that is never really reached. The simple solution is to just move all the fixups into the exception handler itself, which moves it all out of the hot case code, and means that the compiler never sees it or needs to worry about it. Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-08-16locking/atomic: Make test_and_*_bit() ordered on failureHector Martin2-7/+1
These operations are documented as always ordered in include/asm-generic/bitops/instrumented-atomic.h, and producer-consumer type use cases where one side needs to ensure a flag is left pending after some shared data was updated rely on this ordering, even in the failure case. This is the case with the workqueue code, which currently suffers from a reproducible ordering violation on Apple M1 platforms (which are notoriously out-of-order) that ends up causing the TTY layer to fail to deliver data to userspace properly under the right conditions. This change fixes that bug. Change the documentation to restrict the "no order on failure" story to the _lock() variant (for which it makes sense), and remove the early-exit from the generic implementation, which is what causes the missing barrier semantics in that case. Without this, the remaining atomic op is fully ordered (including on ARM64 LSE, as of recent versions of the architecture spec). Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: stable@vger.kernel.org Fixes: e986a0d6cb36 ("locking/atomics, asm-generic/bitops/atomic.h: Rewrite using atomic_*() APIs") Fixes: 61e02392d3c7 ("locking/atomic/bitops: Document and clarify ordering semantics for failed test_and_{}_bit()") Signed-off-by: Hector Martin <marcan@marcan.st> Acked-by: Will Deacon <will@kernel.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-08-16i40e: Fix to stop tx_timeout recovery if GLOBR failsAlan Brady1-1/+3
When a tx_timeout fires, the PF attempts to recover by incrementally resetting. First we try a PFR, then CORER and finally a GLOBR. If the GLOBR fails, then we keep hitting the tx_timeout and incrementing the recovery level and issuing dmesgs, which is both annoying to the user and accomplishes nothing. If the GLOBR fails, then we're pretty much totally hosed, and there's not much else we can do to recover, so this makes it such that we just kill the VSI and stop hitting the tx_timeout in such a case. Fixes: 41c445ff0f48 ("i40e: main driver core") Signed-off-by: Alan Brady <alan.brady@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-16i40e: Fix tunnel checksum offload with fragmented trafficPrzemyslaw Patynowski1-3/+5
Fix checksum offload on VXLAN tunnels. In case, when mpls protocol is not used, set l4 header to transport header of skb. This fixes case, when user tries to offload checksums of VXLAN tunneled traffic. Steps for reproduction (requires link partner with tunnels): ip l s enp130s0f0 up ip a f enp130s0f0 ip a a 10.10.110.2/24 dev enp130s0f0 ip l s enp130s0f0 mtu 1600 ip link add vxlan12_sut type vxlan id 12 group 238.168.100.100 dev \ enp130s0f0 dstport 4789 ip l s vxlan12_sut up ip a a 20.10.110.2/24 dev vxlan12_sut iperf3 -c 20.10.110.1 #should connect Without this patch, TX descriptor was using wrong data, due to l4 header pointing wrong address. NIC would then drop those packets internally, due to incorrect TX descriptor data, which increased GLV_TEPC register. Fixes: b4fb2d33514a ("i40e: Add support for MPLS + TSO") Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Marek Szlosek <marek.szlosek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-16virtio: kerneldocs fixes and enhancementsRicardo Cañuelo4-11/+25
Fix variable names in some kerneldocs, naming in others. Add kerneldocs for struct vring_desc and vring_interrupt. Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com> Message-Id: <20220810094004.1250-2-ricardo.canuelo@collabora.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2022-08-16virtio: Revert "virtio: find_vqs() add arg sizes"Michael S. Tsirkin10-22/+10
This reverts commit a10fba0377145fccefea4dc4dd5915b7ed87e546: the proposed API isn't supported on all transports but no effort was made to address this. It might not be hard to fix if we want to: maybe just rename size to size_hint and make sure legacy transports ignore the hint. But it's not sure what the benefit is in any case, so let's drop it. Fixes: a10fba037714 ("virtio: find_vqs() add arg sizes") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20220816053602.173815-8-mst@redhat.com>
2022-08-16virtio_vdpa: Revert "virtio_vdpa: support the arg sizes of find_vqs()"Michael S. Tsirkin1-9/+6
This reverts commit 99e8927d8a4da8eb8a8a5904dc13a3156be8e7c0: proposed API isn't supported on all transports but no effort was made to address this. It might not be hard to fix if we want to: maybe just rename size to size_hint and make sure legacy transports ignore the hint. But it's not sure what the benefit is in any case, so let's drop it. Fixes: 99e8927d8a4d ("virtio_vdpa: support the arg sizes of find_vqs()") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20220816053602.173815-6-mst@redhat.com>
2022-08-16virtio_pci: Revert "virtio_pci: support the arg sizes of find_vqs()"Michael S. Tsirkin4-23/+12
This reverts commit cdb44806fca2d0ad29ca644cbf1505433902ee0c: the legacy path is wrong and in fact can not support the proposed API since for a legacy device we never communicate the vq size to the hypervisor. Reported-by: Andres Freund <andres@anarazel.de> Fixes: cdb44806fca2 ("virtio_pci: support the arg sizes of find_vqs()") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20220816053602.173815-5-mst@redhat.com>
2022-08-16virtio-mmio: Revert "virtio_mmio: support the arg sizes of find_vqs()"Michael S. Tsirkin1-6/+2
This reverts commit fbed86abba6e0472d98079790e58060e4332608a. The API is now unused, let's not carry dead code around. Fixes: fbed86abba6e ("virtio_mmio: support the arg sizes of find_vqs()") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20220816053602.173815-4-mst@redhat.com>
2022-08-16virtio: Revert "virtio: add helper virtio_find_vqs_ctx_size()"Michael S. Tsirkin1-12/+0
This reverts commit fe3dc04e31aa51f91dc7f741a5f76cc4817eb5b4: the API is now unused and in fact can't be implemented on top of a legacy device. Fixes: fe3dc04e31aa ("virtio: add helper virtio_find_vqs_ctx_size()") Cc: "Xuan Zhuo" <xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20220816053602.173815-3-mst@redhat.com>
2022-08-16virtio_net: Revert "virtio_net: set the default max ring size by find_vqs()"Michael S. Tsirkin1-38/+4
This reverts commit 762faee5a2678559d3dc09d95f8f2c54cd0466a7. This has been reported to trip up guests on GCP (Google Cloud). The reason is that virtio_find_vqs_ctx_size is broken on legacy devices. We can in theory fix virtio_find_vqs_ctx_size but in fact the patch itself has several other issues: - It treats unknown speed as < 10G - It leaves userspace no way to find out the ring size set by hypervisor - It tests speed when link is down - It ignores the virtio spec advice: Both \field{speed} and \field{duplex} can change, thus the driver is expected to re-read these values after receiving a configuration change notification. - It is not clear the performance impact has been tested properly Revert the patch for now. Reported-by: Andres Freund <andres@anarazel.de> Link: https://lore.kernel.org/r/20220814212610.GA3690074%40roeck-us.net Link: https://lore.kernel.org/r/20220815070203.plwjx7b3cyugpdt7%40awork3.anarazel.de Link: https://lore.kernel.org/r/3df6bb82-1951-455d-a768-e9e1513eb667%40www.fastmail.com Link: https://lore.kernel.org/r/FCDC5DDE-3CDD-4B8A-916F-CA7D87B547CE%40anarazel.de Fixes: 762faee5a267 ("virtio_net: set the default max ring size by find_vqs()") Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Andres Freund <andres@anarazel.de> Tested-by: Guenter Roeck <linux@roeck-us.net> Message-Id: <20220816053602.173815-2-mst@redhat.com>
2022-08-16Merge branch '40GbE' of ↵Jakub Kicinski2-7/+30
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-08-12 (iavf) This series contains updates to iavf driver only. Przemyslaw frees memory for admin queues in initialization error paths, prevents freeing of vf_res which is causing null pointer dereference, and adjusts calls in error path of reset to avoid iavf_close() which could cause deadlock. Ivan Vecera avoids deadlock that can occur when driver if part of failover. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: iavf: Fix deadlock in initialization iavf: Fix reset error handling iavf: Fix NULL pointer dereference in iavf_get_link_ksettings iavf: Fix adminq error handling ==================== Link: https://lore.kernel.org/r/20220812172309.853230-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-16net: rtnetlink: fix module reference count leak issue in rtnetlink_rcv_msgZhengchao Shao1-0/+1
When bulk delete command is received in the rtnetlink_rcv_msg function, if bulk delete is not supported, module_put is not called to release the reference counting. As a result, module reference count is leaked. Fixes: a6cec0bcd342 ("net: rtnetlink: add bulk delete support flag") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20220815024629.240367-1-shaozhengchao@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-16net: moxa: pass pdev instead of ndev to DMA functionsSergei Antonov1-10/+10
dma_map_single() calls fail in moxart_mac_setup_desc_ring() and moxart_mac_start_xmit() which leads to an incessant output of this: [ 16.043925] moxart-ethernet 92000000.mac eth0: DMA mapping error [ 16.050957] moxart-ethernet 92000000.mac eth0: DMA mapping error [ 16.058229] moxart-ethernet 92000000.mac eth0: DMA mapping error Passing pdev to DMA is a common approach among net drivers. Fixes: 6c821bd9edc9 ("net: Add MOXA ART SoCs ethernet driver") Signed-off-by: Sergei Antonov <saproj@gmail.com> Suggested-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20220812171339.2271788-1-saproj@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-15selftests/landlock: fix broken include of linux/landlock.hGuillaume Tucker1-2/+5
Revert part of the earlier changes to fix the kselftest build when using a sub-directory from the top of the tree as this broke the landlock test build as a side-effect when building with "make -C tools/testing/selftests/landlock". Reported-by: Mickaël Salaün <mic@digikod.net> Fixes: a917dd94b832 ("selftests/landlock: drop deprecated headers dependency") Fixes: f2745dc0ba3d ("selftests: stop using KSFT_KHDR_INSTALL") Signed-off-by: Guillaume Tucker <guillaume.tucker@collabora.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-08-15netfilter: nf_tables: check NFT_SET_CONCAT flag if field_count is specifiedPablo Neira Ayuso1-0/+5
Since f3a2181e16f1 ("netfilter: nf_tables: Support for sets with multiple ranged fields"), it possible to combine intervals and concatenations. Later on, ef516e8625dd ("netfilter: nf_tables: reintroduce the NFT_SET_CONCAT flag") provides the NFT_SET_CONCAT flag for userspace to report that the set stores a concatenation. Make sure NFT_SET_CONCAT is set on if field_count is specified for consistency. Otherwise, if NFT_SET_CONCAT is specified with no field_count, bail out with EINVAL. Fixes: ef516e8625dd ("netfilter: nf_tables: reintroduce the NFT_SET_CONCAT flag") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-08-15nios2: add force_successful_syscall_return()Al Viro2-0/+8
If we use the ancient SysV syscall ABI, we'd better have tell the kernel how to claim that a negative return value is a success. Use ->orig_r2 for that - it's inaccessible via ptrace, so it's a fair game for changes and it's normally[*] non-negative on return from syscall. Set to -1; syscall is not going to be restart-worthy by definition, so we won't interfere with that use either. [*] the only exception is rt_sigreturn(), where we skip the entire messing with r1/r2 anyway. Fixes: 82ed08dd1b0e ("nios2: Exception handling") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2022-08-15nios2: restarts apply only to the first sigframe we build...Al Viro1-0/+1
Fixes: b53e906d255d ("nios2: Signal handling support") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2022-08-15nios2: fix syscall restart checksAl Viro1-1/+1
sys_foo() returns -512 (aka -ERESTARTSYS) => do_signal() sees 512 in r2 and 1 in r1. sys_foo() returns 512 => do_signal() sees 512 in r2 and 0 in r1. The former is restart-worthy; the latter obviously isn't. Fixes: b53e906d255d ("nios2: Signal handling support") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2022-08-15nios2: traced syscall does need to check the syscall numberAl Viro1-3/+8
all checks done before letting the tracer modify the register state are worthless... Fixes: 82ed08dd1b0e ("nios2: Exception handling") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2022-08-15nios2: don't leave NULLs in sys_call_table[]Al Viro2-1/+1
fill the gaps in there with sys_ni_syscall, as everyone does... Fixes: 82ed08dd1b0e ("nios2: Exception handling") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2022-08-15nios2: page fault et.al. are *not* restartable syscalls...Al Viro2-4/+3
make sure that ->orig_r2 is negative for everything except the syscalls. Fixes: 82ed08dd1b0e ("nios2: Exception handling") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2022-08-15netfilter: nf_tables: disallow NFT_SET_ELEM_CATCHALL and ↵Pablo Neira Ayuso1-0/+3
NFT_SET_ELEM_INTERVAL_END These flags are mutually exclusive, report EINVAL in this case. Fixes: aaa31047a6d2 ("netfilter: nftables: add catch-all set element support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-08-15netfilter: nf_tables: NFTA_SET_ELEM_KEY_END requires concat and interval flagsPablo Neira Ayuso1-0/+24
If the NFT_SET_CONCAT|NFT_SET_INTERVAL flags are set on, then the netlink attribute NFTA_SET_ELEM_KEY_END must be specified. Otherwise, NFTA_SET_ELEM_KEY_END should not be present. For catch-all element, NFTA_SET_ELEM_KEY_END should not be present. The NFT_SET_ELEM_INTERVAL_END is never used with this set flags combination. Fixes: 7b225d0b5c6d ("netfilter: nf_tables: add NFTA_SET_ELEM_KEY_END attribute") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-08-15Merge branch 'mlxsw-fixes'David S. Miller3-16/+34
Petr Machata says: ==================== mlxsw: Fixes for PTP support This set fixes several issues in mlxsw PTP code. - Patch #1 fixes compilation warnings. - Patch #2 adjusts the order of operation during cleanup, thereby closing the window after PTP state was already cleaned in the ASIC for the given port, but before the port is removed, when the user could still in theory make changes to the configuration. - Patch #3 protects the PTP configuration with a custom mutex, instead of relying on RTNL, which is not held in all access paths. - Patch #4 forbids enablement of PTP only in RX or only in TX. The driver implicitly assumed this would be the case, but neglected to sanitize the configuration. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-15mlxsw: spectrum_ptp: Forbid PTP enablement only in RX or in TXAmit Cohen1-0/+3
Currently mlxsw driver configures one global PTP configuration for all ports. The reason is that the switch behaves like a transparent clock between CPU port and front-panel ports. When time stamp is enabled in any port, the hardware is configured to update the correction field. The fact that the configuration of CPU port affects all the ports, makes the correction field update to be global for all ports. Otherwise, user will see odd values in the correction field, as the switch will update the correction field in the CPU port, but not in all the front-panel ports. The CPU port is relevant in both RX and TX, so to avoid problematic configuration, forbid PTP enablement only in one direction, i.e., only in RX or TX. Without the change: $ hwstamp_ctl -i swp1 -r 12 -t 0 current settings: tx_type 0 rx_filter 0 new settings: tx_type 0 rx_filter 2 $ echo $? 0 With the change: $ hwstamp_ctl -i swp1 -r 12 -t 0 current settings: tx_type 1 rx_filter 2 SIOCSHWTSTAMP failed: Invalid argument Fixes: 08ef8bc825d96 ("mlxsw: spectrum_ptp: Support SIOCGHWTSTAMP, SIOCSHWTSTAMP ioctls") Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-15mlxsw: spectrum_ptp: Protect PTP configuration with a mutexAmit Cohen1-7/+20
Currently the functions mlxsw_sp2_ptp_{configure, deconfigure}_port() assume that they are called when RTNL is locked and they warn otherwise. The deconfigure function can be called when port is removed, for example as part of device reload, then there is no locked RTNL and the function warns [1]. To avoid such case, do not assume that RTNL protects this code, add a dedicated mutex instead. The mutex protects 'ptp_state->config' which stores the existing global configuration in hardware. Use this mutex also to protect the code which configures the hardware. Then, there will be only one configuration in any time, which will be updated in 'ptp_state' and a race will be avoided. [1]: RTNL: assertion failed at drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c (1600) WARNING: CPU: 1 PID: 1583493 at drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c:1600 mlxsw_sp2_ptp_hwtstamp_set+0x2d3/0x300 [mlxsw_spectrum] [...] CPU: 1 PID: 1583493 Comm: devlink Not tainted5.19.0-rc8-custom-127022-gb371dffda095 #789 Hardware name: Mellanox Technologies Ltd.MSN3420/VMOD0005, BIOS 5.11 01/06/2019 RIP: 0010:mlxsw_sp2_ptp_hwtstamp_set+0x2d3/0x300[mlxsw_spectrum] [...] Call Trace: <TASK> mlxsw_sp_port_remove+0x7e/0x190 [mlxsw_spectrum] mlxsw_sp_fini+0xd1/0x270 [mlxsw_spectrum] mlxsw_core_bus_device_unregister+0x55/0x280 [mlxsw_core] mlxsw_devlink_core_bus_device_reload_down+0x1c/0x30[mlxsw_core] devlink_reload+0x1ee/0x230 devlink_nl_cmd_reload+0x4de/0x580 genl_family_rcv_msg_doit+0xdc/0x140 genl_rcv_msg+0xd7/0x1d0 netlink_rcv_skb+0x49/0xf0 genl_rcv+0x1f/0x30 netlink_unicast+0x22f/0x350 netlink_sendmsg+0x208/0x440 __sys_sendto+0xf0/0x140 __x64_sys_sendto+0x1b/0x20 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: 08ef8bc825d96 ("mlxsw: spectrum_ptp: Support SIOCGHWTSTAMP, SIOCSHWTSTAMP ioctls") Reported-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-15mlxsw: spectrum: Clear PTP configuration after unregistering the netdeviceAmit Cohen1-1/+1
Currently as part of removing port, PTP API is called to clear the existing configuration and set the 'rx_filter' and 'tx_type' to zero. The clearing is done before unregistering the netdevice, which means that there is a window of time in which the user can reconfigure PTP in the port, and this configuration will not be cleared. Reorder the operations, clear PTP configuration after unregistering the netdevice. Fixes: 8748642751ede ("mlxsw: spectrum: PTP: Support SIOCGHWTSTAMP, SIOCSHWTSTAMP ioctls") Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-15mlxsw: spectrum_ptp: Fix compilation warningsAmit Cohen1-8/+10
In case that 'CONFIG_PTP_1588_CLOCK' is not enabled in the config file, there are implementations for the functions mlxsw_{sp,sp2}_ptp_txhdr_construct() as part of 'spectrum_ptp.h'. In this case, they should be defined as 'static' as they are not supposed to be used out of this file. Make the functions 'static', otherwise the following warnings are returned: "warning: no previous prototype for 'mlxsw_sp_ptp_txhdr_construct'" "warning: no previous prototype for 'mlxsw_sp2_ptp_txhdr_construct'" In addition, make the functions 'inline' for case that 'spectrum_ptp.h' will be included anywhere else and the functions would probably not be used, so compilation warnings about unused static will be returned. Fixes: 24157bc69f45 ("mlxsw: Send PTP packets as data packets to overcome a limitation") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>