summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-08-17net: sched: delete unused input parameter in qdisc_createZhengchao Shao1-3/+3
The input parameter p is unused in qdisc_create. Delete it. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Link: https://lore.kernel.org/r/20220815061023.51318-1-shaozhengchao@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-17net: vertexcom: mse102x: Update email addressStefan Wahren1-1/+1
in-tech smart charging is now chargebyte. So update the email address accordingly. Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com> Link: https://lore.kernel.org/r/20220815080626.9688-2-stefan.wahren@i2se.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-17dt-bindings: vertexcom-mse102x: Update email addressStefan Wahren1-1/+1
in-tech smart charging is now chargebyte. So update the email address accordingly. Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20220815080626.9688-1-stefan.wahren@i2se.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-17net: sched: fix misuse of qcpu->backlog in gnet_stats_add_queue_cpuZhengchao Shao1-1/+1
In the gnet_stats_add_queue_cpu function, the qstats->qlen statistics are incorrectly set to qcpu->backlog. Fixes: 448e163f8b9b ("gen_stats: Add gnet_stats_add_queue()") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Link: https://lore.kernel.org/r/20220815030848.276746-1-shaozhengchao@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-17net: sched: remove the unused return value of unregister_qdiscZhengchao Shao2-3/+4
Return value of unregister_qdisc is unused, remove it. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Link: https://lore.kernel.org/r/20220815030417.271894-1-shaozhengchao@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-16selftests/bpf: Fix attach point for non-x86 arches in test_progs/lsmArtem Savkov2-2/+3
Use SYS_PREFIX macro from bpf_misc.h instead of hard-coded '__x64_' prefix for sys_setdomainname attach point in lsm test. Signed-off-by: Artem Savkov <asavkov@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220816055231.717006-1-asavkov@redhat.com
2022-08-16Merge tag 'nios2_fixes_v6.0' of ↵Linus Torvalds5-9/+22
git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux Pull NIOS2 fixes from Dinh Nguyen: - Security fixes from Al Viro * tag 'nios2_fixes_v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux: nios2: add force_successful_syscall_return() nios2: restarts apply only to the first sigframe we build... nios2: fix syscall restart checks nios2: traced syscall does need to check the syscall number nios2: don't leave NULLs in sys_call_table[] nios2: page fault et.al. are *not* restartable syscalls...
2022-08-16Merge tag 'spi-fix-v6.0-rc1' of ↵Linus Torvalds6-39/+112
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "A few fixes that came in since my pull request, the Meson fix is a little large since it's fixing all possible cases of the problem that was observed with the driver and clock API trying to share configuration by integrating the device clocking fully with the clock API rather than spot fixing the one instance that was observed" * tag 'spi-fix-v6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: dt-bindings: Drop Pratyush Yadav spi: meson-spicc: add local pow2 clock ops to preserve rate between messages MAINTAINERS: rectify entry for ARM/HPE GXP ARCHITECTURE spi: spi.c: Add missing __percpu annotations in users of spi_statistics
2022-08-16Merge tag 'regulator-fix-v6.0-rc1' of ↵Linus Torvalds2-12/+1
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fixes from Mark Brown: "A couple of small fixes that came in since my pull request, nothing major here" * tag 'regulator-fix-v6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: core: Fix missing error return from regulator_bulk_get() regulator: pca9450: Remove restrictions for regulator-name
2022-08-16x86: simplify load_unaligned_zeropad() implementationLinus Torvalds3-43/+60
The exception for the "unaligned access at the end of the page, next page not mapped" never happens, but the fixup code ends up causing trouble for compilers to optimize well. clang in particular ends up seeing it being in the middle of a loop, and tries desperately to optimize the exception fixup code that is never really reached. The simple solution is to just move all the fixups into the exception handler itself, which moves it all out of the hot case code, and means that the compiler never sees it or needs to worry about it. Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-08-16locking/atomic: Make test_and_*_bit() ordered on failureHector Martin2-7/+1
These operations are documented as always ordered in include/asm-generic/bitops/instrumented-atomic.h, and producer-consumer type use cases where one side needs to ensure a flag is left pending after some shared data was updated rely on this ordering, even in the failure case. This is the case with the workqueue code, which currently suffers from a reproducible ordering violation on Apple M1 platforms (which are notoriously out-of-order) that ends up causing the TTY layer to fail to deliver data to userspace properly under the right conditions. This change fixes that bug. Change the documentation to restrict the "no order on failure" story to the _lock() variant (for which it makes sense), and remove the early-exit from the generic implementation, which is what causes the missing barrier semantics in that case. Without this, the remaining atomic op is fully ordered (including on ARM64 LSE, as of recent versions of the architecture spec). Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: stable@vger.kernel.org Fixes: e986a0d6cb36 ("locking/atomics, asm-generic/bitops/atomic.h: Rewrite using atomic_*() APIs") Fixes: 61e02392d3c7 ("locking/atomic/bitops: Document and clarify ordering semantics for failed test_and_{}_bit()") Signed-off-by: Hector Martin <marcan@marcan.st> Acked-by: Will Deacon <will@kernel.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-08-16ice: introduce ice_ptp_reset_cached_phctime functionJacob Keller1-23/+76
If the PTP hardware clock is adjusted, the ice driver must update the cached PHC timestamp. This is required in order to perform timestamp extension on the shorter timestamps captured by the PHY. Currently, we simply call ice_ptp_update_cached_phctime in the settime and adjtime callbacks. This has a few issues: 1) if ICE_CFG_BUSY is set because another thread is updating the Rx rings, we will exit with an error. This is not checked, and the functions do not re-schedule the update. This could leave the cached timestamp incorrect until the next scheduled work item execution. 2) even if we did handle an update, any currently outstanding Tx timestamp would be extended using the wrong cached PHC time. This would produce incorrect results. To fix these issues, introduce a new ice_ptp_reset_cached_phctime function. This function calls the ice_ptp_update_cached_phctime, and discards outstanding Tx timestamps. If the ice_ptp_update_cached_phctime function fails because ICE_CFG_BUSY is set, we log a warning and schedule the thread to execute soon. The update function is modified so that it always updates the cached copy in the PF regardless. This ensures we have the most up to date values possible and minimizes the risk of a packet timestamp being extended with the wrong value. It would be nice if we could skip reporting Rx timestamps until the cached values are up to date. However, we can't access the Rx rings while ICE_CFG_BUSY is set because they are actively being updated by another thread. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-16ice: re-arrange some static functions in ice_ptp.cJacob Keller1-336/+336
A following change is going to want to make use of ice_ptp_flush_tx_tracker earlier in the ice_ptp.c file. To make this work, move the Tx timestamp tracking functions higher up in the file, and pull the ice_ptp_update_cached_timestamp function below them. This should have no functional change. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-16ice: track and warn when PHC update is lateJacob Keller3-3/+34
The ice driver requires a cached copy of the PHC time in order to perform timestamp extension on Tx and Rx hardware timestamp values. This cached PHC time must always be updated at least once every 2 seconds. Otherwise, the math used to perform the extension would produce invalid results. The updates are supposed to occur periodically in the PTP kthread work item, which is scheduled to run every half second. Thus, we do not expect an update to be delayed for so long. However, there are error conditions which can cause the update to be delayed. Track this situation by using jiffies to determine approximately how long ago the last update occurred. Add a new statistic and a dev_warn when we have failed to update the cached PHC time. This makes the error case more obvious. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-16ice: track Tx timestamp stats similar to other Intel driversJacob Keller4-4/+20
Several Intel networking drivers which support PTP track when Tx timestamps are skipped or when they timeout without a timestamp from hardware. The conditions which could cause these events are rare, but it can be useful to know when and how often they occur. Implement similar statistics for the ice driver, tx_hwtstamp_skipped, tx_hwtstamp_timeouts, and tx_hwtstamp_flushed. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-16ice: initialize cached_phctime when creating Rx ringsJacob Keller2-0/+2
When we create new Rx rings, the cached_phctime field is zero initialized. This could result in incorrect timestamp reporting due to the cached value not yet being updated. Although a background task will periodically update the cached value, ensure it matches the existing cached value in the PF structure at ring initialization. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-16ice: set tx_tstamps when creating new Tx rings via ethtoolJacob Keller1-0/+1
When the user changes the number of queues via ethtool, the driver allocates new rings. This allocation did not initialize tx_tstamps. This results in the tx_tstamps field being zero (due to kcalloc allocation), and would result in a NULL pointer dereference when attempting a transmit timestamp on the new ring. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-16i40e: Fix to stop tx_timeout recovery if GLOBR failsAlan Brady1-1/+3
When a tx_timeout fires, the PF attempts to recover by incrementally resetting. First we try a PFR, then CORER and finally a GLOBR. If the GLOBR fails, then we keep hitting the tx_timeout and incrementing the recovery level and issuing dmesgs, which is both annoying to the user and accomplishes nothing. If the GLOBR fails, then we're pretty much totally hosed, and there's not much else we can do to recover, so this makes it such that we just kill the VSI and stop hitting the tx_timeout in such a case. Fixes: 41c445ff0f48 ("i40e: main driver core") Signed-off-by: Alan Brady <alan.brady@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-16i40e: Fix tunnel checksum offload with fragmented trafficPrzemyslaw Patynowski1-3/+5
Fix checksum offload on VXLAN tunnels. In case, when mpls protocol is not used, set l4 header to transport header of skb. This fixes case, when user tries to offload checksums of VXLAN tunneled traffic. Steps for reproduction (requires link partner with tunnels): ip l s enp130s0f0 up ip a f enp130s0f0 ip a a 10.10.110.2/24 dev enp130s0f0 ip l s enp130s0f0 mtu 1600 ip link add vxlan12_sut type vxlan id 12 group 238.168.100.100 dev \ enp130s0f0 dstport 4789 ip l s vxlan12_sut up ip a a 20.10.110.2/24 dev vxlan12_sut iperf3 -c 20.10.110.1 #should connect Without this patch, TX descriptor was using wrong data, due to l4 header pointing wrong address. NIC would then drop those packets internally, due to incorrect TX descriptor data, which increased GLV_TEPC register. Fixes: b4fb2d33514a ("i40e: Add support for MPLS + TSO") Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Marek Szlosek <marek.szlosek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-16virtio: kerneldocs fixes and enhancementsRicardo Cañuelo4-11/+25
Fix variable names in some kerneldocs, naming in others. Add kerneldocs for struct vring_desc and vring_interrupt. Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com> Message-Id: <20220810094004.1250-2-ricardo.canuelo@collabora.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2022-08-16virtio: Revert "virtio: find_vqs() add arg sizes"Michael S. Tsirkin10-22/+10
This reverts commit a10fba0377145fccefea4dc4dd5915b7ed87e546: the proposed API isn't supported on all transports but no effort was made to address this. It might not be hard to fix if we want to: maybe just rename size to size_hint and make sure legacy transports ignore the hint. But it's not sure what the benefit is in any case, so let's drop it. Fixes: a10fba037714 ("virtio: find_vqs() add arg sizes") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20220816053602.173815-8-mst@redhat.com>
2022-08-16virtio_vdpa: Revert "virtio_vdpa: support the arg sizes of find_vqs()"Michael S. Tsirkin1-9/+6
This reverts commit 99e8927d8a4da8eb8a8a5904dc13a3156be8e7c0: proposed API isn't supported on all transports but no effort was made to address this. It might not be hard to fix if we want to: maybe just rename size to size_hint and make sure legacy transports ignore the hint. But it's not sure what the benefit is in any case, so let's drop it. Fixes: 99e8927d8a4d ("virtio_vdpa: support the arg sizes of find_vqs()") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20220816053602.173815-6-mst@redhat.com>
2022-08-16virtio_pci: Revert "virtio_pci: support the arg sizes of find_vqs()"Michael S. Tsirkin4-23/+12
This reverts commit cdb44806fca2d0ad29ca644cbf1505433902ee0c: the legacy path is wrong and in fact can not support the proposed API since for a legacy device we never communicate the vq size to the hypervisor. Reported-by: Andres Freund <andres@anarazel.de> Fixes: cdb44806fca2 ("virtio_pci: support the arg sizes of find_vqs()") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20220816053602.173815-5-mst@redhat.com>
2022-08-16virtio-mmio: Revert "virtio_mmio: support the arg sizes of find_vqs()"Michael S. Tsirkin1-6/+2
This reverts commit fbed86abba6e0472d98079790e58060e4332608a. The API is now unused, let's not carry dead code around. Fixes: fbed86abba6e ("virtio_mmio: support the arg sizes of find_vqs()") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20220816053602.173815-4-mst@redhat.com>
2022-08-16virtio: Revert "virtio: add helper virtio_find_vqs_ctx_size()"Michael S. Tsirkin1-12/+0
This reverts commit fe3dc04e31aa51f91dc7f741a5f76cc4817eb5b4: the API is now unused and in fact can't be implemented on top of a legacy device. Fixes: fe3dc04e31aa ("virtio: add helper virtio_find_vqs_ctx_size()") Cc: "Xuan Zhuo" <xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20220816053602.173815-3-mst@redhat.com>
2022-08-16virtio_net: Revert "virtio_net: set the default max ring size by find_vqs()"Michael S. Tsirkin1-38/+4
This reverts commit 762faee5a2678559d3dc09d95f8f2c54cd0466a7. This has been reported to trip up guests on GCP (Google Cloud). The reason is that virtio_find_vqs_ctx_size is broken on legacy devices. We can in theory fix virtio_find_vqs_ctx_size but in fact the patch itself has several other issues: - It treats unknown speed as < 10G - It leaves userspace no way to find out the ring size set by hypervisor - It tests speed when link is down - It ignores the virtio spec advice: Both \field{speed} and \field{duplex} can change, thus the driver is expected to re-read these values after receiving a configuration change notification. - It is not clear the performance impact has been tested properly Revert the patch for now. Reported-by: Andres Freund <andres@anarazel.de> Link: https://lore.kernel.org/r/20220814212610.GA3690074%40roeck-us.net Link: https://lore.kernel.org/r/20220815070203.plwjx7b3cyugpdt7%40awork3.anarazel.de Link: https://lore.kernel.org/r/3df6bb82-1951-455d-a768-e9e1513eb667%40www.fastmail.com Link: https://lore.kernel.org/r/FCDC5DDE-3CDD-4B8A-916F-CA7D87B547CE%40anarazel.de Fixes: 762faee5a267 ("virtio_net: set the default max ring size by find_vqs()") Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Andres Freund <andres@anarazel.de> Tested-by: Guenter Roeck <linux@roeck-us.net> Message-Id: <20220816053602.173815-2-mst@redhat.com>
2022-08-16Merge branch '40GbE' of ↵Jakub Kicinski2-7/+30
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-08-12 (iavf) This series contains updates to iavf driver only. Przemyslaw frees memory for admin queues in initialization error paths, prevents freeing of vf_res which is causing null pointer dereference, and adjusts calls in error path of reset to avoid iavf_close() which could cause deadlock. Ivan Vecera avoids deadlock that can occur when driver if part of failover. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: iavf: Fix deadlock in initialization iavf: Fix reset error handling iavf: Fix NULL pointer dereference in iavf_get_link_ksettings iavf: Fix adminq error handling ==================== Link: https://lore.kernel.org/r/20220812172309.853230-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-16net: rtnetlink: fix module reference count leak issue in rtnetlink_rcv_msgZhengchao Shao1-0/+1
When bulk delete command is received in the rtnetlink_rcv_msg function, if bulk delete is not supported, module_put is not called to release the reference counting. As a result, module reference count is leaked. Fixes: a6cec0bcd342 ("net: rtnetlink: add bulk delete support flag") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20220815024629.240367-1-shaozhengchao@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-16net: moxa: pass pdev instead of ndev to DMA functionsSergei Antonov1-10/+10
dma_map_single() calls fail in moxart_mac_setup_desc_ring() and moxart_mac_start_xmit() which leads to an incessant output of this: [ 16.043925] moxart-ethernet 92000000.mac eth0: DMA mapping error [ 16.050957] moxart-ethernet 92000000.mac eth0: DMA mapping error [ 16.058229] moxart-ethernet 92000000.mac eth0: DMA mapping error Passing pdev to DMA is a common approach among net drivers. Fixes: 6c821bd9edc9 ("net: Add MOXA ART SoCs ethernet driver") Signed-off-by: Sergei Antonov <saproj@gmail.com> Suggested-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20220812171339.2271788-1-saproj@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-16libbpf: Making bpf_prog_load() ignore name if kernel doesn't supportHangbin Liu3-6/+16
Similar with commit 10b62d6a38f7 ("libbpf: Add names for auxiliary maps"), let's make bpf_prog_load() also ignore name if kernel doesn't support program name. To achieve this, we need to call sys_bpf_prog_load() directly in probe_kern_prog_name() to avoid circular dependency. sys_bpf_prog_load() also need to be exported in the libbpf_internal.h file. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20220813000936.6464-1-liuhangbin@gmail.com
2022-08-15selftests/bpf: Update CI kconfigDaniel Xu1-0/+2
The previous selftest changes require two kconfig changes in bpf-ci. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/bpf/2c27c6ebf7a03954915f83560653752450389564.1660254747.git.dxu@dxuuu.xyz
2022-08-15selftests/bpf: Add connmark read testDaniel Xu2-1/+5
Test that the prog can read from the connection mark. This test is nice because it ensures progs can interact with netfilter subsystem correctly. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/bpf/d3bc620a491e4c626c20d80631063922cbe13e2b.1660254747.git.dxu@dxuuu.xyz
2022-08-15selftests/bpf: Add existing connection bpf_*_ct_lookup() testDaniel Xu2-0/+77
Add a test where we do a conntrack lookup on an existing connection. This is nice because it's a more realistic test than artifically creating a ct entry and looking it up afterwards. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/bpf/de5a617832f38f8b5631cc87e2a836da7c94d497.1660254747.git.dxu@dxuuu.xyz
2022-08-15bpftool: Clear errno after libcap's checksQuentin Monnet1-0/+10
When bpftool is linked against libcap, the library runs a "constructor" function to compute the number of capabilities of the running kernel [0], at the beginning of the execution of the program. As part of this, it performs multiple calls to prctl(). Some of these may fail, and set errno to a non-zero value: # strace -e prctl ./bpftool version prctl(PR_CAPBSET_READ, CAP_MAC_OVERRIDE) = 1 prctl(PR_CAPBSET_READ, 0x30 /* CAP_??? */) = -1 EINVAL (Invalid argument) prctl(PR_CAPBSET_READ, CAP_CHECKPOINT_RESTORE) = 1 prctl(PR_CAPBSET_READ, 0x2c /* CAP_??? */) = -1 EINVAL (Invalid argument) prctl(PR_CAPBSET_READ, 0x2a /* CAP_??? */) = -1 EINVAL (Invalid argument) prctl(PR_CAPBSET_READ, 0x29 /* CAP_??? */) = -1 EINVAL (Invalid argument) ** fprintf added at the top of main(): we have errno == 1 ./bpftool v7.0.0 using libbpf v1.0 features: libbfd, libbpf_strict, skeletons +++ exited with 0 +++ This has been addressed in libcap 2.63 [1], but until this version is available everywhere, we can fix it on bpftool side. Let's clean errno at the beginning of the main() function, to make sure that these checks do not interfere with the batch mode, where we error out if errno is set after a bpftool command. [0] https://git.kernel.org/pub/scm/libs/libcap/libcap.git/tree/libcap/cap_alloc.c?h=libcap-2.65#n20 [1] https://git.kernel.org/pub/scm/libs/libcap/libcap.git/commit/?id=f25a1b7e69f7b33e6afb58b3e38f3450b7d2d9a0 Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220815162205.45043-1-quentin@isovalent.com
2022-08-15selftests/landlock: fix broken include of linux/landlock.hGuillaume Tucker1-2/+5
Revert part of the earlier changes to fix the kselftest build when using a sub-directory from the top of the tree as this broke the landlock test build as a side-effect when building with "make -C tools/testing/selftests/landlock". Reported-by: Mickaël Salaün <mic@digikod.net> Fixes: a917dd94b832 ("selftests/landlock: drop deprecated headers dependency") Fixes: f2745dc0ba3d ("selftests: stop using KSFT_KHDR_INSTALL") Signed-off-by: Guillaume Tucker <guillaume.tucker@collabora.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-08-15netfilter: nf_tables: check NFT_SET_CONCAT flag if field_count is specifiedPablo Neira Ayuso1-0/+5
Since f3a2181e16f1 ("netfilter: nf_tables: Support for sets with multiple ranged fields"), it possible to combine intervals and concatenations. Later on, ef516e8625dd ("netfilter: nf_tables: reintroduce the NFT_SET_CONCAT flag") provides the NFT_SET_CONCAT flag for userspace to report that the set stores a concatenation. Make sure NFT_SET_CONCAT is set on if field_count is specified for consistency. Otherwise, if NFT_SET_CONCAT is specified with no field_count, bail out with EINVAL. Fixes: ef516e8625dd ("netfilter: nf_tables: reintroduce the NFT_SET_CONCAT flag") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-08-15nios2: add force_successful_syscall_return()Al Viro2-0/+8
If we use the ancient SysV syscall ABI, we'd better have tell the kernel how to claim that a negative return value is a success. Use ->orig_r2 for that - it's inaccessible via ptrace, so it's a fair game for changes and it's normally[*] non-negative on return from syscall. Set to -1; syscall is not going to be restart-worthy by definition, so we won't interfere with that use either. [*] the only exception is rt_sigreturn(), where we skip the entire messing with r1/r2 anyway. Fixes: 82ed08dd1b0e ("nios2: Exception handling") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2022-08-15nios2: restarts apply only to the first sigframe we build...Al Viro1-0/+1
Fixes: b53e906d255d ("nios2: Signal handling support") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2022-08-15nios2: fix syscall restart checksAl Viro1-1/+1
sys_foo() returns -512 (aka -ERESTARTSYS) => do_signal() sees 512 in r2 and 1 in r1. sys_foo() returns 512 => do_signal() sees 512 in r2 and 0 in r1. The former is restart-worthy; the latter obviously isn't. Fixes: b53e906d255d ("nios2: Signal handling support") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2022-08-15nios2: traced syscall does need to check the syscall numberAl Viro1-3/+8
all checks done before letting the tracer modify the register state are worthless... Fixes: 82ed08dd1b0e ("nios2: Exception handling") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2022-08-15nios2: don't leave NULLs in sys_call_table[]Al Viro2-1/+1
fill the gaps in there with sys_ni_syscall, as everyone does... Fixes: 82ed08dd1b0e ("nios2: Exception handling") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2022-08-15nios2: page fault et.al. are *not* restartable syscalls...Al Viro2-4/+3
make sure that ->orig_r2 is negative for everything except the syscalls. Fixes: 82ed08dd1b0e ("nios2: Exception handling") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2022-08-15netfilter: nf_tables: disallow NFT_SET_ELEM_CATCHALL and ↵Pablo Neira Ayuso1-0/+3
NFT_SET_ELEM_INTERVAL_END These flags are mutually exclusive, report EINVAL in this case. Fixes: aaa31047a6d2 ("netfilter: nftables: add catch-all set element support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-08-15netfilter: nf_tables: NFTA_SET_ELEM_KEY_END requires concat and interval flagsPablo Neira Ayuso1-0/+24
If the NFT_SET_CONCAT|NFT_SET_INTERVAL flags are set on, then the netlink attribute NFTA_SET_ELEM_KEY_END must be specified. Otherwise, NFTA_SET_ELEM_KEY_END should not be present. For catch-all element, NFTA_SET_ELEM_KEY_END should not be present. The NFT_SET_ELEM_INTERVAL_END is never used with this set flags combination. Fixes: 7b225d0b5c6d ("netfilter: nf_tables: add NFTA_SET_ELEM_KEY_END attribute") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-08-15bpf: Clear up confusion in bpf_skb_adjust_room()'s documentationQuentin Monnet2-4/+8
Adding or removing room space _below_ layers 2 or 3, as the description mentions, is ambiguous. This was written with a mental image of the packet with layer 2 at the top, layer 3 under it, and so on. But it has led users to believe that it was on lower layers (before the beginning of the L2 and L3 headers respectively). Let's make it more explicit, and specify between which layers the room space is adjusted. Reported-by: Rumen Telbizov <rumen.telbizov@menlosecurity.com> Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220812153727.224500-3-quentin@isovalent.com
2022-08-15bpftool: Fix a typo in a commentQuentin Monnet1-1/+1
This is the wrong library name: libcap, not libpcap. Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220812153727.224500-1-quentin@isovalent.com
2022-08-15Merge branch 'mlxsw-fixes'David S. Miller3-16/+34
Petr Machata says: ==================== mlxsw: Fixes for PTP support This set fixes several issues in mlxsw PTP code. - Patch #1 fixes compilation warnings. - Patch #2 adjusts the order of operation during cleanup, thereby closing the window after PTP state was already cleaned in the ASIC for the given port, but before the port is removed, when the user could still in theory make changes to the configuration. - Patch #3 protects the PTP configuration with a custom mutex, instead of relying on RTNL, which is not held in all access paths. - Patch #4 forbids enablement of PTP only in RX or only in TX. The driver implicitly assumed this would be the case, but neglected to sanitize the configuration. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-15mlxsw: spectrum_ptp: Forbid PTP enablement only in RX or in TXAmit Cohen1-0/+3
Currently mlxsw driver configures one global PTP configuration for all ports. The reason is that the switch behaves like a transparent clock between CPU port and front-panel ports. When time stamp is enabled in any port, the hardware is configured to update the correction field. The fact that the configuration of CPU port affects all the ports, makes the correction field update to be global for all ports. Otherwise, user will see odd values in the correction field, as the switch will update the correction field in the CPU port, but not in all the front-panel ports. The CPU port is relevant in both RX and TX, so to avoid problematic configuration, forbid PTP enablement only in one direction, i.e., only in RX or TX. Without the change: $ hwstamp_ctl -i swp1 -r 12 -t 0 current settings: tx_type 0 rx_filter 0 new settings: tx_type 0 rx_filter 2 $ echo $? 0 With the change: $ hwstamp_ctl -i swp1 -r 12 -t 0 current settings: tx_type 1 rx_filter 2 SIOCSHWTSTAMP failed: Invalid argument Fixes: 08ef8bc825d96 ("mlxsw: spectrum_ptp: Support SIOCGHWTSTAMP, SIOCSHWTSTAMP ioctls") Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-15mlxsw: spectrum_ptp: Protect PTP configuration with a mutexAmit Cohen1-7/+20
Currently the functions mlxsw_sp2_ptp_{configure, deconfigure}_port() assume that they are called when RTNL is locked and they warn otherwise. The deconfigure function can be called when port is removed, for example as part of device reload, then there is no locked RTNL and the function warns [1]. To avoid such case, do not assume that RTNL protects this code, add a dedicated mutex instead. The mutex protects 'ptp_state->config' which stores the existing global configuration in hardware. Use this mutex also to protect the code which configures the hardware. Then, there will be only one configuration in any time, which will be updated in 'ptp_state' and a race will be avoided. [1]: RTNL: assertion failed at drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c (1600) WARNING: CPU: 1 PID: 1583493 at drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c:1600 mlxsw_sp2_ptp_hwtstamp_set+0x2d3/0x300 [mlxsw_spectrum] [...] CPU: 1 PID: 1583493 Comm: devlink Not tainted5.19.0-rc8-custom-127022-gb371dffda095 #789 Hardware name: Mellanox Technologies Ltd.MSN3420/VMOD0005, BIOS 5.11 01/06/2019 RIP: 0010:mlxsw_sp2_ptp_hwtstamp_set+0x2d3/0x300[mlxsw_spectrum] [...] Call Trace: <TASK> mlxsw_sp_port_remove+0x7e/0x190 [mlxsw_spectrum] mlxsw_sp_fini+0xd1/0x270 [mlxsw_spectrum] mlxsw_core_bus_device_unregister+0x55/0x280 [mlxsw_core] mlxsw_devlink_core_bus_device_reload_down+0x1c/0x30[mlxsw_core] devlink_reload+0x1ee/0x230 devlink_nl_cmd_reload+0x4de/0x580 genl_family_rcv_msg_doit+0xdc/0x140 genl_rcv_msg+0xd7/0x1d0 netlink_rcv_skb+0x49/0xf0 genl_rcv+0x1f/0x30 netlink_unicast+0x22f/0x350 netlink_sendmsg+0x208/0x440 __sys_sendto+0xf0/0x140 __x64_sys_sendto+0x1b/0x20 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: 08ef8bc825d96 ("mlxsw: spectrum_ptp: Support SIOCGHWTSTAMP, SIOCSHWTSTAMP ioctls") Reported-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-15mlxsw: spectrum: Clear PTP configuration after unregistering the netdeviceAmit Cohen1-1/+1
Currently as part of removing port, PTP API is called to clear the existing configuration and set the 'rx_filter' and 'tx_type' to zero. The clearing is done before unregistering the netdevice, which means that there is a window of time in which the user can reconfigure PTP in the port, and this configuration will not be cleared. Reorder the operations, clear PTP configuration after unregistering the netdevice. Fixes: 8748642751ede ("mlxsw: spectrum: PTP: Support SIOCGHWTSTAMP, SIOCSHWTSTAMP ioctls") Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>