summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-11-17i2c: Drop legacy muxing pseudo-driversJean Delvare7-534/+1
The i2c-amd756-s4882 and i2c-nforce2-s4985 muxing pseudo-drivers were written at a time when the i2c core did not support muxing. They are essentially board-specific hacks. If we had to add support for these boards today, we would implement it in a completely different way. These Tyan server boards are 19 years old by now, so I very much doubt any of these is still running today. So let's just drop this clumsy code. If anyone really still needs this support and complains, I'll rewrite it in a proper way on top of i2c-mux. This also fixes the following warnings: drivers/i2c/busses/i2c-amd756.c:286:20: warning: symbol 'amd756_smbus' was not declared. Should it be static? drivers/i2c/busses/i2c-nforce2.c:123:20: warning: symbol 'nforce2_smbus' was not declared. Should it be static? Signed-off-by: Jean Delvare <jdelvare@suse.de> Cc: Andi Shyti <andi.shyti@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-11-17i2c: imx: prevent rescheduling in non dma modeStefan Eichenberger1-23/+249
We are experiencing a problem with the i.MX I2C controller when communicating with SMBus devices. We are seeing devices time-out because the time between sending/receiving two bytes is too long, and the SMBus device returns to the idle state. This happens because the i.MX I2C controller sends and receives byte by byte. When a byte is sent or received, we get an interrupt and can send or receive the next byte. The current implementation sends a byte and then waits for an event generated by the interrupt subroutine. After the event is received, the next byte is sent and we wait again. This waiting allows the scheduler to reschedule other tasks, with the disadvantage that we may not send the next byte for a long time because the send task is not immediately scheduled. For example, if the rescheduling takes more than 25ms, this can cause SMBus devices to timeout and communication to fail. This patch changes the behavior so that we do not reschedule the send/receive task, but instead send or receive the next byte in the interrupt subroutine. This prevents rescheduling and drastically reduces the time between sending/receiving bytes. The cost in the interrupt subroutine is relatively small, we check what state we are in and then send/receive the next byte. Before we had to call wake_up, which is even less expensive. However, we also had to do some scheduling, which increased the overall cost compared to the new solution. The wake_up function to wake up the send/receive task is now only called when an error occurs or when the transfer is complete. Signed-off-by: Stefan Eichenberger <stefan.eichenberger@toradex.com> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-11-17i2c: imx: separate atomic, dma and non-dma use caseStefan Eichenberger1-37/+70
Separate the atomic, dma and non-dma use case as a preparation step for moving the non-dma use case to the isr to avoid rescheduling while a transfer is in progress. Signed-off-by: Stefan Eichenberger <stefan.eichenberger@toradex.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-11-17i2c: imx: do not poll for bus busy in single master modeStefan Eichenberger1-2/+13
According to the i.MX8M Mini reference manual chapter "16.1.4.2 Generation of Start" it is only necessary to poll for bus busy and arbitration lost in multi master mode. This helps to avoid rescheduling while the i2c bus is busy and avoids SMBus devices to timeout. For backward compatibility, the single-master property needs to be explicitly set to disable the bus busy polling. Signed-off-by: Stefan Eichenberger <stefan.eichenberger@toradex.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-11-17i2c: designware: Add a new ACPI HID for HJMC01 I2C controllerHunter Yu1-0/+1
Define a new ACPI HID for HJMC01 Signed-off-by: Hunter Yu <hunter.yu@hj-micro.com> Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-11-17i2c: qcom-geni: Keep comment why interrupts start disabledWolfram Sang1-0/+2
The to-be-fixed commit rightfully reduced a race window, but also removed a comment which is still helpful after the fix. Bring the comment back. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-11-17dt-bindings: i2c: microchip: corei2c: Add PIC64GX as compatible with driverPierre-Henry Moussay1-1/+3
PIC64GX i2c is compatible with the microchip corei2c, just add fallback Signed-off-by: Pierre-Henry Moussay <pierre-henry.moussay@microchip.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-11-17i2c: designware: constify abort_sourcesRaag Jadav1-1/+1
We never modify abort_sources, mark it as const. Signed-off-by: Raag Jadav <raag.jadav@intel.com> Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-11-17i2c: Switch back to struct platform_driver::remove()Uwe Kleine-König88-88/+88
After commit 0edb555a65d1 ("platform: Make platform_driver::remove() return void") .remove() is (again) the right callback to implement for platform drivers. Convert all platform drivers below drivers/i2c to use .remove(), with the eventual goal to drop struct platform_driver::remove_new(). As .remove() and .remove_new() have the same prototypes, conversion is done by just changing the structure member name in the driver initializer. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-11-17i2c: qcom-geni: Support systems with 32MHz serial engine clockManikanta Mylavarapu1-4/+19
In existing socs, I2C serial engine is sourced from XO (19.2MHz). Where as in IPQ5424, I2C serial engine is sourced from GPLL0 (32MHz). The existing map table is based on 19.2MHz. This patch incorporates the clock map table to derive the SCL clock from the 32MHz source clock frequency. Signed-off-by: Manikanta Mylavarapu <quic_mmanikan@quicinc.com> Reviewed-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-11-17i2c: qcom-cci: Stop complaining about DT set clock rateBryan O'Donoghue1-8/+0
It is common practice in the downstream and upstream CCI dt to set CCI clock rates to 19.2 MHz. It appears to be fairly common for initial code to set the CCI clock rate to 37.5 MHz. Applying the widely used CCI clock rates from downstream ought not to cause warning messages in the upstream kernel where our general policy is to usually copy downstream hardware clock rates across the range of Qualcomm drivers. Drop the warning it is pervasive across CAMSS users but doesn't add any information or warrant any changes to the DT to align the DT clock rate to the bootloader clock rate. Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Reviewed-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org> Link: https://lore.kernel.org/linux-arm-msm/20240824115900.40702-1-bryan.odonoghue@linaro.org Signed-off-by: Richard Acayan <mailingradian@gmail.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-11-17dt-bindings: i2c: qcom-cci: Document SDM670 compatibleRichard Acayan1-0/+19
The CCI on the Snapdragon 670 is the interface for controlling camera hardware over I2C. Add the compatible so it can be added to the SDM670 device tree. Signed-off-by: Richard Acayan <mailingradian@gmail.com> Reviewed-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-11-17i2c: npcm: use a software flag to indicate a BER conditionTyrone Ting1-1/+14
If not clearing the BB (bus busy) condition in the BER (bus error) interrupt, the driver causes a timeout and hence the i2c core doesn't do the i2c transfer retry but returns the driver's return value to the upper layer instead. Clear the BB condition in the BER interrupt and a software flag is used. The driver does an i2c recovery without causing the timeout if the flag is set. Signed-off-by: Tyrone Ting <kfting@nuvoton.com> Reviewed-by: Tali Perry <tali.perry1@gmail.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-11-17i2c: npcm: correct the read/write operation procedureTyrone Ting1-5/+2
Originally the driver uses the XMIT bit in SMBnST register to decide the upcoming i2c transaction. If XMIT bit is 1, then it will be an i2c write operation. If it's 0, then a read operation will be executed. In slave mode the XMIT bit can simply be used directly to set the state. XMIT bit can be used as an indication to the current state of the state machine during slave operation. (meaning XMIT = 1 during writing and XMIT = 0 during reading). In master operation XMIT is valid only if there are no bus errors. For example: in a multi master where the same module is switching from master to slave at runtime, and there are collisions, the XMIT bit cannot be trusted. However the maser already "knows" what the bus state is, so this bit is not needed and the driver can just track what it is currently doing. Signed-off-by: Tyrone Ting <kfting@nuvoton.com> Reviewed-by: Tali Perry <tali.perry1@gmail.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-11-17i2c: amd-asf: Fix uninitialized variables issue in amd_asf_process_targetQianqiang Liu1-1/+1
The len variable is not initialized, which may cause the for loop to behave unexpectedly. Fixes: 9b25419ad397 ("i2c: amd-asf: Add routine to handle the ASF slave process") Signed-off-by: Qianqiang Liu <qianqiang.liu@163.com> Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Acked-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-11-17efi: Fix memory leak in efivar_ssdt_loadCyrill Gorcunov1-13/+28
When we load SSDT from efi variable (specified with efivar_ssdt=<var> boot command line argument) a name for the variable is allocated dynamically because we traverse all EFI variables. Unlike ACPI table data, which is later used by ACPI engine, the name is no longer needed once traverse is complete -- don't forget to free this memory. Same time we silently ignore any errors happened here let's print a message if something went wrong (but do not exit since this is not a critical error and the system should continue to boot). Also while here -- add a note why we keep SSDT table on success. Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-11-17efi/libstub: Take command line overrides into account for loaded filesArd Biesheuvel1-1/+2
When CONFIG_CMDLINE_OVERRIDE or CONFIG_CMDLINE_FORCE are configured, the command line provided by the boot stack should be ignored, and only the built-in command line should be taken into account. Add the required handling of this when dealing with initrd= or dtb= command line options in the EFI stub. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-11-17efi/libstub: Fix command line fallback handling when loading filesArd Biesheuvel1-0/+21
CONFIG_CMDLINE, when set, is supposed to serve either as a fallback when no command line is provided by the bootloader, or to be taken into account unconditionally, depending on the configured options. The initrd and dtb loader ignores CONFIG_CMDLINE in either case, and only takes the EFI firmware provided load options into account. This means that configuring the kernel with initrd= or dtb= on the built-in command line does not produce the expected result. Fix this by doing a separate pass over the built-in command line when dealing with initrd= or dtb= options. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-11-17Merge tag 'mm-hotfixes-stable-2024-11-16-15-33' of ↵Linus Torvalds19-10/+62
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull hotfixes from Andrew Morton: "10 hotfixes, 7 of which are cc:stable. All singletons, please see the changelogs for details" * tag 'mm-hotfixes-stable-2024-11-16-15-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm: revert "mm: shmem: fix data-race in shmem_getattr()" ocfs2: uncache inode which has failed entering the group mm: fix NULL pointer dereference in alloc_pages_bulk_noprof mm, doc: update read_ahead_kb for MADV_HUGEPAGE fs/proc/task_mmu: prevent integer overflow in pagemap_scan_get_args() sched/task_stack: fix object_is_on_stack() for KASAN tagged pointers crash, powerpc: default to CRASH_DUMP=n on PPC_BOOK3S_32 mm/mremap: fix address wraparound in move_page_tables() tools/mm: fix compile error mm, swap: fix allocation and scanning race with swapoff
2024-11-17mm: revert "mm: shmem: fix data-race in shmem_getattr()"Andrew Morton1-2/+0
Revert d949d1d14fa2 ("mm: shmem: fix data-race in shmem_getattr()") as suggested by Chuck [1]. It is causing deadlocks when accessing tmpfs over NFS. As Hugh commented, "added just to silence a syzbot sanitizer splat: added where there has never been any practical problem". Link: https://lkml.kernel.org/r/ZzdxKF39VEmXSSyN@tissot.1015granger.net [1] Fixes: d949d1d14fa2 ("mm: shmem: fix data-race in shmem_getattr()") Acked-by: Hugh Dickins <hughd@google.com> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Jeongjun Park <aha310510@gmail.com> Cc: Yu Zhao <yuzhao@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-17Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linuxLinus Torvalds7-17/+50
Pull ARM fixes from Russell King: - Fix kernel mapping for XIP kernels - Fix SMP support for XIP kernels - Fix complication corner case with CFI - Fix a typo in nommu code - Fix cacheflush syscall when PAN is enabled on LPAE platforms * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux: ARM: fix cacheflush with PAN ARM: 9435/1: ARM/nommu: Fix typo "absence" ARM: 9434/1: cfi: Fix compilation corner case ARM: 9420/1: smp: Fix SMP for xip kernels ARM: 9419/1: mm: Fix kernel memory mapping for xip kernels
2024-11-17Merge tag 'drm-fixes-2024-11-17' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds12-84/+36
Pull drm fix from Dave Airlie: "Alex sent on a last minute revert for a amdgpu/swsmu regression: - revert patch to fix swsmu regression" * tag 'drm-fixes-2024-11-17' of https://gitlab.freedesktop.org/drm/kernel: Revert "drm/amd/pm: correct the workload setting"
2024-11-17Merge tag 'amd-drm-fixes-6.12-2024-11-16' of ↵Dave Airlie12-84/+36
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.12-2024-11-16: amdgpu: - Revert a swsmu patch to fix a regression Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241116145320.2507156-1-alexander.deucher@amd.com
2024-11-16libbpf: Change hash_combine parameters from long to unsigned longSidong Yang1-1/+1
The hash_combine() could be trapped when compiled with sanitizer like "zig cc" or clang with signed-integer-overflow option. This patch parameters and return type to unsigned long to remove the potential overflow. Signed-off-by: Sidong Yang <sidong.yang@furiosa.ai> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20241116081054.65195-1-sidong.yang@furiosa.ai Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-11-16selftests/bpf: Fix build error with llvm 19Alexei Starovoitov1-1/+1
llvm 19 fails to compile arena self test: CLNG-BPF [test_progs] verifier_arena_large.bpf.o progs/verifier_arena_large.c:90:24: error: unsupported signed division, please convert to unsigned div/mod. 90 | pg_idx = (pg - base) / PAGE_SIZE; Though llvm <= 18 and llvm >= 20 don't have this issue, fix the test to avoid the build error. Reported-by: Jiri Olsa <olsajiri@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-11-16Merge tag 'trace-ringbuffer-v6.12-rc7-2' of ↵Linus Torvalds2-8/+29
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull ring buffer fixes from Steven Rostedt: - Revert: "ring-buffer: Do not have boot mapped buffers hook to CPU hotplug" A crash that happened on cpu hotplug was actually caused by the incorrect ref counting that was fixed by commit 2cf9733891a4 ("ring-buffer: Fix refcount setting of boot mapped buffers"). The removal of calling cpu hotplug callbacks on memory mapped buffers was not an issue even though the tests at the time pointed toward it. But in fact, there's a check in that code that tests to see if the buffers are already allocated or not, and will not allocate them again if they are. Not calling the cpu hotplug callbacks ended up not initializing the non boot CPU buffers. Simply remove that change. - Clear all CPU buffers when starting tracing in a boot mapped buffer To properly process events from a previous boot, the address space needs to be accounted for due to KASLR and the events in the buffer are updated accordingly when read. This also requires that when the buffer has tracing enabled again in the current boot that the buffers are reset so that events from the previous boot do not interact with the events of the current boot and cause confusing due to not having the proper meta data. It was found that if a CPU is taken offline, that its per CPU buffer is not reset when tracing starts. This allows for events to be from both the previous boot and the current boot to be in the buffer at the same time. Clear all CPU buffers when tracing is started in a boot mapped buffer. * tag 'trace-ringbuffer-v6.12-rc7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing/ring-buffer: Clear all memory mapped CPU ring buffers on first recording Revert: "ring-buffer: Do not have boot mapped buffers hook to CPU hotplug"
2024-11-16Documentation: alienware-wmi: Describe THERMAL_INFORMATION operation 0x02Kurt Borja1-1/+11
This operation is used by alienware-wmi driver to avoid brute-forcing operation 0x03. Signed-off-by: Kurt Borja <kuurtb@gmail.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20241111183639.14726-1-kuurtb@gmail.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-11-16alienware-wmi: create_thermal_profile() no longer brute-forces IDsKurt Borja1-2/+12
WMAX_METHOD_THERMAL_INFORMATION has a *system description* operation that outputs a buffer with the following structure: out[0] -> Number of fans out[1] -> Number of sensors out[2] -> 0x00 out[3] -> Number of thermal modes This is now used by create_thermal_profile() to retrieve available thermal codes instead of brute-forcing every ID. Tested on an Alienware x15 R1. Verified by checking ACPI tables of supported models. Signed-off-by: Kurt Borja <kuurtb@gmail.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20241111183623.14691-1-kuurtb@gmail.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-11-16alienware-wmi: Adds support to Alienware x17 R2Kurt Borja1-0/+9
Adds support to Alienware x17 R2 Tested-by: Samith Castro <SamithNarayam@hotmail.com> Signed-off-by: Kurt Borja <kuurtb@gmail.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20241111183609.14653-1-kuurtb@gmail.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-11-16alienware-wmi: extends the list of supported modelsKurt Borja1-0/+81
Adds thermal + gmode quirk to: - Dell G15 5510 - Dell G15 5511 - Dell G15 5515 - Dell G3 3500 - Dell G3 3590 - Dell G5 5500 Adds thermal quirk to: - Alienware m18 R2 - Alienware m17 R5 AMD Support for these models was manually verified by reading their respective ACPI tables. Signed-off-by: Kurt Borja <kuurtb@gmail.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20241111183546.14617-1-kuurtb@gmail.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-11-16alienware-wmi: order alienware_quirks[] alphabeticallyKurt Borja1-26/+26
alienware_quirks[] entries are now ordered alphabetically Signed-off-by: Kurt Borja <kuurtb@gmail.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20241111183520.14573-1-kuurtb@gmail.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-11-16Revert "drm/amd/pm: correct the workload setting"Alex Deucher12-84/+36
This reverts commit 74e1006430a5377228e49310f6d915628609929e. This causes a regression in the workload selection. A more extensive fix is being worked on. For now, revert. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3618 Fixes: 74e1006430a5 ("drm/amd/pm: correct the workload setting") Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-16Merge branch 'virtio-net-support-af_xdp-zero-copy-tx'Jakub Kicinski3-249/+489
Xuan Zhuo says: ==================== virtio-net: support AF_XDP zero copy (tx) XDP socket(AF_XDP) is an excellent bypass kernel network framework. The zero copy feature of xsk (XDP socket) needs to be supported by the driver. The performance of zero copy is very good. mlx5 and intel ixgbe already support this feature, This patch set allows virtio-net to support xsk's zerocopy xmit feature. At present, we have completed some preparation: 1. vq-reset (virtio spec and kernel code) 2. virtio-core premapped dma 3. virtio-net xdp refactor So it is time for Virtio-Net to complete the support for the XDP Socket Zerocopy. Virtio-net can not increase the queue num at will, so xsk shares the queue with kernel. This patch set includes some refactor to the virtio-net to let that to support AF_XDP. The current configuration sets the virtqueue (vq) to premapped mode, implying that all buffers submitted to this queue must be mapped ahead of time. This presents a challenge for the virtnet send queue (sq): the virtnet driver would be required to keep track of dma information for vq size * 17, which can be substantial. However, if the premapped mode were applied on a per-buffer basis, the complexity would be greatly reduced. With AF_XDP enabled, AF_XDP buffers would become premapped, while kernel skb buffers could remain unmapped. We can distinguish them by sg_page(sg), When sg_page(sg) is NULL, this indicates that the driver has performed DMA mapping in advance, allowing the Virtio core to directly utilize sg_dma_address(sg) without conducting any internal DMA mapping. Additionally, DMA unmap operations for this buffer will be bypassed. ENV: Qemu with vhost-user(polling mode). Host CPU: Intel(R) Xeon(R) Platinum 8163 CPU @ 2.50GHz testpmd> show port stats all ######################## NIC statistics for port 0 ######################## RX-packets: 19531092064 RX-missed: 0 RX-bytes: 1093741155584 RX-errors: 0 RX-nombuf: 0 TX-packets: 5959955552 TX-errors: 0 TX-bytes: 371030645664 Throughput (since last show) Rx-pps: 8861574 Rx-bps: 3969985208 Tx-pps: 8861493 Tx-bps: 3969962736 ############################################################################ testpmd> show port stats all ######################## NIC statistics for port 0 ######################## RX-packets: 68152727 RX-missed: 0 RX-bytes: 3816552712 RX-errors: 0 RX-nombuf: 0 TX-packets: 68114967 TX-errors: 33216 TX-bytes: 3814438152 Throughput (since last show) Rx-pps: 6333196 Rx-bps: 2837272088 Tx-pps: 6333227 Tx-bps: 2837285936 ############################################################################ But AF_XDP consumes more CPU for tx and rx napi(100% and 86%). ==================== Link: https://patch.msgid.link/20241112012928.102478-1-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-16virtio_net: xdp_features add NETDEV_XDP_ACT_XSK_ZEROCOPYXuan Zhuo1-1/+2
Now, we support AF_XDP(xsk). Add NETDEV_XDP_ACT_XSK_ZEROCOPY to xdp_features. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-14-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-16virtio_net: update tx timeout recordXuan Zhuo1-0/+7
If send queue sent some packets, we update the tx timeout record to prevent the tx timeout. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-13-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-16virtio_net: xsk: tx: support xmit xsk bufferXuan Zhuo1-11/+168
The driver's tx napi is very important for XSK. It is responsible for obtaining data from the XSK queue and sending it out. At the beginning, we need to trigger tx napi. virtnet_free_old_xmit distinguishes three type ptr(skb, xdp frame, xsk buffer) by the last bits of the pointer. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-12-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-16virtio_net: xsk: prevent disable tx napiXuan Zhuo1-1/+9
Since xsk's TX queue is consumed by TX NAPI, if sq is bound to xsk, then we must stop tx napi from being disabled. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-11-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-16virtio_net: xsk: bind/unbind xsk for txXuan Zhuo1-0/+53
This patch implement the logic of bind/unbind xsk pool to sq and rq. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-10-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-16virtio_net: refactor the xmit typeXuan Zhuo1-39/+51
Because the af-xdp will introduce a new xmit type, so I refactor the xmit type mechanism first. We know both xdp_frame and sk_buff are at least 4 bytes aligned. For the xdp tx, we do not pass any pointer to virtio core as data, we just need to pass the len of the packet. So we will push len to the void pointer. We can make sure the pointer is 4 bytes aligned. And the data structure of AF_XDP also is at least 4 bytes aligned. So the last two bits of the pointers are free, we can't use these to distinguish them. 00 for skb 01 for SKB_ORPHAN 10 for XDP 11 for AF-XDP tx Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-9-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-16virtio_ring: remove API virtqueue_set_dma_premappedXuan Zhuo3-63/+0
Now, this API is useless. remove it. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-8-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-16virtio-net: rq submits premapped per-bufferXuan Zhuo1-11/+11
virtio-net rq submits premapped per-buffer by setting sg page to NULL; Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-7-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-16virtio_ring: introduce add api for premappedXuan Zhuo2-0/+59
Two APIs are introduced to submit premapped per-buffers. int virtqueue_add_inbuf_premapped(struct virtqueue *vq, struct scatterlist *sg, unsigned int num, void *data, void *ctx, gfp_t gfp); int virtqueue_add_outbuf_premapped(struct virtqueue *vq, struct scatterlist *sg, unsigned int num, void *data, gfp_t gfp); Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-6-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-16virtio_ring: perform premapped operations based on per-bufferXuan Zhuo1-48/+53
The current configuration sets the virtqueue (vq) to premapped mode, implying that all buffers submitted to this queue must be mapped ahead of time. This presents a challenge for the virtnet send queue (sq): the virtnet driver would be required to keep track of dma information for vq size * 17, which can be substantial. However, if the premapped mode were applied on a per-buffer basis, the complexity would be greatly reduced. With AF_XDP enabled, AF_XDP buffers would become premapped, while kernel skb buffers could remain unmapped. And consider that some sgs are not generated by the virtio driver, that may be passed from the block stack. So we can not change the sgs, new APIs are the better way. So we pass the new argument 'premapped' to indicate the buffers submitted to virtio are premapped in advance. Additionally, DMA unmap operations for these buffers will be bypassed. Suggested-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-5-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-16virtio_ring: packed: record extras for indirect buffersXuan Zhuo1-24/+36
The subsequent commit needs to know whether every indirect buffer is premapped or not. So we need to introduce an extra struct for every indirect buffer to record this info. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-4-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-16virtio_ring: split: record extras for indirect buffersXuan Zhuo1-60/+52
The subsequent commit needs to know whether every indirect buffer is premapped or not. So we need to introduce an extra struct for every indirect buffer to record this info. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-3-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-16virtio_ring: introduce vring_need_unmap_bufferXuan Zhuo1-15/+12
To make the code readable, introduce vring_need_unmap_buffer() to replace do_unmap. use_dma_api premapped -> vring_need_unmap_buffer() 1. false false false 2. true false true 3. true true false Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-2-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-16Merge branch '100GbE' of ↵Jakub Kicinski27-481/+513
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2024-11-05 (ice, ixgbe, igc. igb, igbvf, e1000) For ice: Mateusz refactors and adds additional SerDes configuration values to be output. Przemek refactors processing of DDP and adds support for a flag field in the DDP's signature segment header. Joe Damato adds support for persistent NAPI config. Brett adjusts setting of Tx promiscuous based on unicast/multicast setting. Jake moves setting of pf->supported_rxdids to occur directly after DDP load and changes a small struct to use stack memory. Frederic Weisbecker adds WQ_UNBOUND flag to the workqueue. For ixgbe: Diomidis Spinellis removes a circular dependency. For igc: Vitaly removes an unneeded autoneg parameter. For igb: Johnny Park fixes a couple of typos. For igbvf: Wander Lairson Costa removes an unused spinlock. For e1000: Joe Damato adds RTNL lock to some calls where it is expected to be held. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: e1000: Hold RTNL when e1000_down can be called igbvf: remove unused spinlock igb: Fix 2 typos in comments in igb_main.c igc: remove autoneg parameter from igc_mac_info ixgbe: Break include dependency cycle ice: Unbind the workqueue ice: use stack variable for virtchnl_supported_rxdids ice: initialize pf->supported_rxdids immediately after loading DDP ice: only allow Tx promiscuous for multicast ice: Add support for persistent NAPI config ice: support optional flags in signature segment header ice: refactor "last" segment of DDP pkg ice: extend dump serdes equalizer values feature ice: rework of dump serdes equalizer values feature ==================== Link: https://patch.msgid.link/20241113185431.1289708-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-16Merge branch 'net-ndo_fdb_add-del-have-drivers-report-whether-they-notified'Jakub Kicinski29-267/+419
Petr Machata says: ==================== net: ndo_fdb_add/del: Have drivers report whether they notified Currently when FDB entries are added to or deleted from a VXLAN netdevice, the VXLAN driver emits one notification, including the VXLAN-specific attributes. The core however always sends a notification as well, a generic one. Thus two notifications are unnecessarily sent for these operations. A similar situation comes up with bridge driver, which also emits notifications on its own. # ip link add name vx type vxlan id 1000 dstport 4789 # bridge monitor fdb & [1] 1981693 # bridge fdb add de:ad:be:ef:13:37 dev vx self dst 192.0.2.1 de:ad:be:ef:13:37 dev vx dst 192.0.2.1 self permanent de:ad:be:ef:13:37 dev vx self permanent In order to prevent this duplicity, add a parameter, bool *notified, to ndo_fdb_add and ndo_fdb_del. The flag is primed to false, and if the callee sends a notification on its own, it sets the flag to true, thus informing the core that it should not generate another notification. Patches #1 to #2 are concerned with the above. In the remaining patches, #3 to #7, add a selftest. This takes place across several patches. Many of the helpers we would like to use for the test are in forwarding/lib.sh, whereas net/ is a more suitable place for the test, so the libraries need to be massaged a bit first. ==================== Link: https://patch.msgid.link/cover.1731589511.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-16selftests: net: fdb_notify: Add a test for FDB notificationsPetr Machata3-1/+114
Check that only one notification is produced for various FDB edit operations. Regarding the ip_link_add() and ip_link_master() helpers. This pattern of action plus corresponding defer is bound to come up often, and a dedicated vocabulary to capture it will be handy. tunnel_create() and vlan_create() from forwarding/lib.sh are somewhat opaque and perhaps too kitchen-sinky, so I tried to go in the opposite direction with these ones, and wrapped only the bare minimum to schedule a corresponding cleanup. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://patch.msgid.link/910c5880ae6d3b558d6889cbdba2be690c2615c6.1731589511.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-16selftests: net: lib: Add kill_processPetr Machata15-34/+41
A number of selftests run processes in the background and need to kill them afterwards. Instead for everyone to open-code the kill / wait / redirect mantra, add a helper in net/lib.sh. Convert existing open-code sites. Signed-off-by: Petr Machata <petrm@nvidia.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Link: https://patch.msgid.link/a9db102067d741c118f0bd93b10c75e2a34665ea.1731589511.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>