summaryrefslogtreecommitdiff
path: root/lib
AgeCommit message (Collapse)AuthorFilesLines
2025-05-29Merge tag 'net-next-6.16' of ↵Linus Torvalds3-12/+6
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Paolo Abeni: "Core: - Implement the Device Memory TCP transmit path, allowing zero-copy data transmission on top of TCP from e.g. GPU memory to the wire. - Move all the IPv6 routing tables management outside the RTNL scope, under its own lock and RCU. The route control path is now 3x times faster. - Convert queue related netlink ops to instance lock, reducing again the scope of the RTNL lock. This improves the control plane scalability. - Refactor the software crc32c implementation, removing unneeded abstraction layers and improving significantly the related micro-benchmarks. - Optimize the GRO engine for UDP-tunneled traffic, for a 10% performance improvement in related stream tests. - Cover more per-CPU storage with local nested BH locking; this is a prep work to remove the current per-CPU lock in local_bh_disable() on PREMPT_RT. - Introduce and use nlmsg_payload helper, combining buffer bounds verification with accessing payload carried by netlink messages. Netfilter: - Rewrite the procfs conntrack table implementation, improving considerably the dump performance. A lot of user-space tools still use this interface. - Implement support for wildcard netdevice in netdev basechain and flowtables. - Integrate conntrack information into nft trace infrastructure. - Export set count and backend name to userspace, for better introspection. BPF: - BPF qdisc support: BPF-qdisc can be implemented with BPF struct_ops programs and can be controlled in similar way to traditional qdiscs using the "tc qdisc" command. - Refactor the UDP socket iterator, addressing long standing issues WRT duplicate hits or missed sockets. Protocols: - Improve TCP receive buffer auto-tuning and increase the default upper bound for the receive buffer; overall this improves the single flow maximum thoughput on 200Gbs link by over 60%. - Add AFS GSSAPI security class to AF_RXRPC; it provides transport security for connections to the AFS fileserver and VL server. - Improve TCP multipath routing, so that the sources address always matches the nexthop device. - Introduce SO_PASSRIGHTS for AF_UNIX, to allow disabling SCM_RIGHTS, and thus preventing DoS caused by passing around problematic FDs. - Retire DCCP socket. DCCP only receives updates for bugs, and major distros disable it by default. Its removal allows for better organisation of TCP fields to reduce the number of cache lines hit in the fast path. - Extend TCP drop-reason support to cover PAWS checks. Driver API: - Reorganize PTP ioctl flag support to require an explicit opt-in for the drivers, avoiding the problem of drivers not rejecting new unsupported flags. - Converted several device drivers to timestamping APIs. - Introduce per-PHY ethtool dump helpers, improving the support for dump operations targeting PHYs. Tests and tooling: - Add support for classic netlink in user space C codegen, so that ynl-c can now read, create and modify links, routes addresses and qdisc layer configuration. - Add ynl sub-types for binary attributes, allowing ynl-c to output known struct instead of raw binary data, clarifying the classic netlink output. - Extend MPTCP selftests to improve the code-coverage. - Add tests for XDP tail adjustment in AF_XDP. New hardware / drivers: - OpenVPN virtual driver: offload OpenVPN data channels processing to the kernel-space, increasing the data transfer throughput WRT the user-space implementation. - Renesas glue driver for the gigabit ethernet RZ/V2H(P) SoC. - Broadcom asp-v3.0 ethernet driver. - AMD Renoir ethernet device. - ReakTek MT9888 2.5G ethernet PHY driver. - Aeonsemi 10G C45 PHYs driver. Drivers: - Ethernet high-speed NICs: - nVidia/Mellanox (mlx5): - refactor the steering table handling to significantly reduce the amount of memory used - add support for complex matches in H/W flow steering - improve flow streeing error handling - convert to netdev instance locking - Intel (100G, ice, igb, ixgbe, idpf): - ice: add switchdev support for LLDP traffic over VF - ixgbe: add firmware manipulation and regions devlink support - igb: introduce support for frame transmission premption - igb: adds persistent NAPI configuration - idpf: introduce RDMA support - idpf: add initial PTP support - Meta (fbnic): - extend hardware stats coverage - add devlink dev flash support - Broadcom (bnxt): - add support for RX-side device memory TCP - Wangxun (txgbe): - implement support for udp tunnel offload - complete PTP and SRIOV support for AML 25G/10G devices - Ethernet NICs embedded and virtual: - Google (gve): - add device memory TCP TX support - Amazon (ena): - support persistent per-NAPI config - Airoha: - add H/W support for L2 traffic offload - add per flow stats for flow offloading - RealTek (rtl8211): add support for WoL magic packet - Synopsys (stmmac): - dwmac-socfpga 1000BaseX support - add Loongson-2K3000 support - introduce support for hardware-accelerated VLAN stripping - Broadcom (bcmgenet): - expose more H/W stats - Freescale (enetc, dpaa2-eth): - enetc: add MAC filter, VLAN filter RSS and loopback support - dpaa2-eth: convert to H/W timestamping APIs - vxlan: convert FDB table to rhashtable, for better scalabilty - veth: apply qdisc backpressure on full ring to reduce TX drops - Ethernet switches: - Microchip (kzZ88x3): add ETS scheduler support - Ethernet PHYs: - RealTek (rtl8211): - add support for WoL magic packet - add support for PHY LEDs - CAN: - Adds RZ/G3E CANFD support to the rcar_canfd driver. - Preparatory work for CAN-XL support. - Add self-tests framework with support for CAN physical interfaces. - WiFi: - mac80211: - scan improvements with multi-link operation (MLO) - Qualcomm (ath12k): - enable AHB support for IPQ5332 - add monitor interface support to QCN9274 - add multi-link operation support to WCN7850 - add 802.11d scan offload support to WCN7850 - monitor mode for WCN7850, better 6 GHz regulatory - Qualcomm (ath11k): - restore hibernation support - MediaTek (mt76): - WiFi-7 improvements - implement support for mt7990 - Intel (iwlwifi): - enhanced multi-link single-radio (EMLSR) support on 5 GHz links - rework device configuration - RealTek (rtw88): - improve throughput for RTL8814AU - RealTek (rtw89): - add multi-link operation support - STA/P2P concurrency improvements - support different SAR configs by antenna - Bluetooth: - introduce HCI Driver protocol - btintel_pcie: do not generate coredump for diagnostic events - btusb: add HCI Drv commands for configuring altsetting - btusb: add RTL8851BE device 0x0bda:0xb850 - btusb: add new VID/PID 13d3/3584 for MT7922 - btusb: add new VID/PID 13d3/3630 and 13d3/3613 for MT7925 - btnxpuart: implement host-wakeup feature" * tag 'net-next-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1611 commits) selftests/bpf: Fix bpf selftest build warning selftests: netfilter: Fix skip of wildcard interface test net: phy: mscc: Stop clearing the the UDPv4 checksum for L2 frames net: openvswitch: Fix the dead loop of MPLS parse calipso: Don't call calipso functions for AF_INET sk. selftests/tc-testing: Add a test for HFSC eltree double add with reentrant enqueue behaviour on netem net_sched: hfsc: Address reentrant enqueue adding class to eltree twice octeontx2-pf: QOS: Refactor TC_HTB_LEAF_DEL_LAST callback octeontx2-pf: QOS: Perform cache sync on send queue teardown net: mana: Add support for Multi Vports on Bare metal net: devmem: ncdevmem: remove unused variable net: devmem: ksft: upgrade rx test to send 1K data net: devmem: ksft: add 5 tuple FS support net: devmem: ksft: add exit_wait to make rx test pass net: devmem: ksft: add ipv4 support net: devmem: preserve sockc_err page_pool: fix ugly page_pool formatting net: devmem: move list_add to net_devmem_bind_dmabuf. selftests: netfilter: nft_queue.sh: include file transfer duration in log message net: phy: mscc: Fix memory leak when using one step timestamping ...
2025-05-28Merge tag 'drm-next-2025-05-28' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds2-13/+66
Pull drm updates from Dave Airlie: "As part of building up nova-core/nova-drm pieces we've brought in some rust abstractions through this tree, aux bus being the main one, with devres changes also in the driver-core tree. Along with the drm core abstractions and enough nova-core/nova-drm to use them. This is still all stub work under construction, to build the nova driver upstream. The other big NVIDIA related one is nouveau adds support for Hopper/Blackwell GPUs, this required a new GSP firmware update to 570.144, and a bunch of rework in order to support multiple fw interfaces. There is also the introduction of an asahi uapi header file as a precursor to getting the real driver in later, but to unblock userspace mesa packages while the driver is trapped behind rust enablement. Otherwise it's the usual mixture of stuff all over, amdgpu, i915/xe, and msm being the main ones, and some changes to vsprintf. new drivers: - bring in the asahi uapi header standalone - nova-drm: stub driver rust dependencies (for nova-core): - auxiliary - bus abstractions - driver registration - sample driver - devres changes from driver-core - revocable changes core: - add Apple fourcc modifiers - add virtio capset definitions - extend EXPORT_SYNC_FILE for timeline syncobjs - convert to devm_platform_ioremap_resource - refactor shmem helper page pinning - DP powerup/down link helpers - extended %p4cc in vsprintf.c to support fourcc prints - change vsprintf %p4cn to %p4chR, remove %p4cn - Add drm_file_err function - IN_FORMATS_ASYNC property - move sitronix from tiny to their own subdir rust: - add drm core infrastructure rust abstractions (device/driver, ioctl, file, gem) dma-buf: - adjust sg handling to not cache map on attach - allow setting dma-device for import - Add a helper to sort and deduplicate dma_fence arrays docs: - updated drm scheduler docs - fbdev todo update - fb rendering - actual brightness ttm: - fix delayed destroy resv object bridge: - add kunit tests - convert tc358775 to atomic - convert drivers to devm_drm_bridge_alloc - convert rk3066_hdmi to bridge driver scheduler: - add kunit tests panel: - refcount panels to improve lifetime handling - Powertip PH128800T004-ZZA01 - NLT NL13676BC25-03F, Tianma TM070JDHG34-00 - Himax HX8279/HX8279-D DDIC - Visionox G2647FB105 - Sitronix ST7571 - ZOTAC rotation quirk vkms: - allow attaching more displays i915: - xe3lpd display updates - vrr refactor - intel_display struct conversions - xe2hpd memory type identification - add link rate/count to i915_display_info - cleanup VGA plane handling - refactor HDCP GSC - fix SLPC wait boosting reference counting - add 20ms delay to engine reset - fix fence release on early probe errors xe: - SRIOV updates - BMG PCI ID update - support separate firmware for each GT - SVM fix, prelim SVM multi-device work - export fan speed - temp disable d3cold on BMG - backup VRAM in PM notifier instead of suspend/freeze - update xe_ttm_access_memory to use GPU for non-visible access - fix guc_info debugfs for VFs - use copy_from_user instead of __copy_from_user - append PCIe gen5 limitations to xe_firmware document amdgpu: - DSC cleanup - DC Scaling updates - Fused I2C-over-AUX updates - DMUB updates - Use drm_file_err in amdgpu - Enforce isolation updates - Use new dma_fence helpers - USERQ fixes - Documentation updates - SR-IOV updates - RAS updates - PSP 12 cleanups - GC 9.5 updates - SMU 13.x updates - VCN / JPEG SR-IOV updates amdkfd: - Update error messages for SDMA - Userptr updates - XNACK fixes radeon: - CIK doorbell cleanup nouveau: - add support for NVIDIA r570 GSP firmware - enable Hopper/Blackwell support nova-core: - fix task list - register definition infrastructure - move firmware into own rust module - register auxiliary device for nova-drm nova-drm: - initial driver skeleton msm: - GPU: - ACD (adaptive clock distribution) for X1-85 - drop fictional address_space_size - improve GMU HFI response time out robustness - fix crash when throttling during boot - DPU: - use single CTL path for flushing on DPU 5.x+ - improve SSPP allocation code for better sharing - Enabled SmartDMA on SM8150, SC8180X, SC8280XP, SM8550 - Added SAR2130P support - Disabled DSC support on MSM8937, MSM8917, MSM8953, SDM660 - DP: - switch to new audio helpers - better LTTPR handling - DSI: - Added support for SA8775P - Added SAR2130P support - HDMI: - Switched to use new helpers for ACR data - Fixed old standing issue of HPD not working in some cases amdxdna: - add dma-buf support - allow empty command submits renesas: - add dma-buf support - add zpos, alpha, blend support panthor: - fail properly for NO_MMAP bos - add SET_LABEL ioctl - debugfs BO dumping support imagination: - update DT bindings - support TI AM68 GPU hibmc: - improve interrupt handling and HPD support virtio: - add panic handler support rockchip: - add RK3588 support - add DP AUX bus panel support ivpu: - add heartbeat based hangcheck mediatek: - prepares support for MT8195/99 HDMIv2/DDCv2 anx7625: - improve HPD tegra: - speed up firmware loading * tag 'drm-next-2025-05-28' of https://gitlab.freedesktop.org/drm/kernel: (1627 commits) drm/nouveau/tegra: Fix error pointer vs NULL return in nvkm_device_tegra_resource_addr() drm/xe: Default auto_link_downgrade status to false drm/xe/guc: Make creation of SLPC debugfs files conditional drm/i915/display: Add check for alloc_ordered_workqueue() and alloc_workqueue() drm/i915/dp_mst: Work around Thunderbolt sink disconnect after SINK_COUNT_ESI read drm/i915/ptl: Use everywhere the correct DDI port clock select mask drm/nouveau/kms: add support for GB20x drm/dp: add option to disable zero sized address only transactions. drm/nouveau: add support for GB20x drm/nouveau/gsp: add hal for fifo.chan.doorbell_handle drm/nouveau: add support for GB10x drm/nouveau/gf100-: track chan progress with non-WFI semaphore release drm/nouveau/nv50-: separate CHANNEL_GPFIFO handling out from CHANNEL_DMA drm/nouveau: add helper functions for allocating pinned/cpu-mapped bos drm/nouveau: add support for GH100 drm/nouveau: improve handling of 64-bit BARs drm/nouveau/gv100-: switch to volta semaphore methods drm/nouveau/gsp: support deeper page tables in COPY_SERVER_RESERVED_PDES drm/nouveau/gsp: init client VMMs with NV0080_CTRL_DMA_SET_PAGE_DIRECTORY drm/nouveau/gsp: fetch level shift and PDE from BAR2 VMM ...
2025-05-28Merge tag 'hardening-v6.16-rc1' of ↵Linus Torvalds6-1/+348
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening updates from Kees Cook: - Update overflow helpers to ease refactoring of on-stack flex array instances (Gustavo A. R. Silva, Kees Cook) - lkdtm: Use SLAB_NO_MERGE instead of constructors (Harry Yoo) - Simplify CONFIG_CC_HAS_COUNTED_BY (Jan Hendrik Farr) - Disable u64 usercopy KUnit test on 32-bit SPARC (Thomas Weißschuh) - Add missed designated initializers now exposed by fixed randstruct (Nathan Chancellor, Kees Cook) - Document compilers versions for __builtin_dynamic_object_size - Remove ARM_SSP_PER_TASK GCC plugin - Fix GCC plugin randstruct, add selftests, and restore COMPILE_TEST builds - Kbuild: induce full rebuilds when dependencies change with GCC plugins, the Clang sanitizer .scl file, or the randstruct seed. - Kbuild: Switch from -Wvla to -Wvla-larger-than=1 - Correct several __nonstring uses for -Wunterminated-string-initialization * tag 'hardening-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (23 commits) Revert "hardening: Disable GCC randstruct for COMPILE_TEST" lib/tests: randstruct: Add deep function pointer layout test lib/tests: Add randstruct KUnit test randstruct: gcc-plugin: Remove bogus void member net: qede: Initialize qede_ll_ops with designated initializer scsi: qedf: Use designated initializer for struct qed_fcoe_cb_ops md/bcache: Mark __nonstring look-up table integer-wrap: Force full rebuild when .scl file changes randstruct: Force full rebuild when seed changes gcc-plugins: Force full rebuild when plugins change kbuild: Switch from -Wvla to -Wvla-larger-than=1 hardening: simplify CONFIG_CC_HAS_COUNTED_BY overflow: Fix direct struct member initialization in _DEFINE_FLEX() kunit/overflow: Add tests for STACK_FLEX_ARRAY_SIZE() helper overflow: Add STACK_FLEX_ARRAY_SIZE() helper input/joystick: magellan: Mark __nonstring look-up table const watchdog: exar: Shorten identity name to fit correctly mod_devicetable: Enlarge the maximum platform_device_id name length overflow: Clarify expectations for getting DEFINE_FLEX variable sizes compiler_types: Identify compiler versions for __builtin_dynamic_object_size ...
2025-05-28Merge tag 'sysctl-6.16-rc1' of ↵Linus Torvalds1-40/+91
git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl Pull sysctl updates from Joel Granados: - Move kern_table members out of kernel/sysctl.c Moved a subset (tracing, panic, signal, stack_tracer and sparc) out of the kern_table array. The goal is for kern_table to only have sysctl elements. All this increases modularity by placing the ctl_tables closer to where they are used while reducing the chances of merge conflicts in kernel/sysctl.c. - Fixed sysctl unit test panic by relocating it to selftests * tag 'sysctl-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl: sysctl: Close test ctl_headers with a for loop sysctl: call sysctl tests with a for loop sysctl: Add 0012 to test the u8 range check sysctl: move u8 register test to lib/test_sysctl.c sparc: mv sparc sysctls into their own file under arch/sparc/kernel stack_tracer: move sysctl registration to kernel/trace/trace_stack.c tracing: Move trace sysctls into trace.c signal: Move signal ctl tables into signal.c panic: Move panic ctl tables into panic.c
2025-05-28llist: make llist_add_batch() a static inlineJens Axboe1-22/+0
The function is small enough that it should be, and it's a (very) hot path for io_uring. Doing this actually reduces my vmlinux text size for my standard build/test box. Before: axboe@r7625 ~/g/linux (test)> size vmlinux text data bss dec hex filename 19892174 5938310 2470432 28300916 1afd674 vmlinux After: axboe@r7625 ~/g/linux (test)> size vmlinux text data bss dec hex filename 19891878 5938310 2470436 28300624 1afd550 vmlinux Link: https://lkml.kernel.org/r/f1d104c6-7ac8-457a-a53d-6bb741421b2f@kernel.dk Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-28Merge tag 'thermal-6.16-rc1' of ↵Linus Torvalds1-8/+2
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull thermal control updates from Rafael Wysocki: "These add support for a new feature, Platform Temperature Control (PTC), to the Intel int340x thermal driver, add support for the Airoha EN7581 thermal sensor and the IPQ5018 platform, fix up the ACPI thermal zones handling, fix other assorted issues and clean up code Specifics: - Add Platform Temperature Control (PTC) support to the Intel int340x thermal driver (Srinivas Pandruvada) - Make the Hisilicon thermal driver compile by default when ARCH_HISI is set (Krzysztof Kozlowski) - Clean up printk() format by using %pC instead of %pCn in the bcm2835 thermal driver (Luca Ceresoli) - Fix variable name coding style in the AmLogic thermal driver (Enrique Isidoro Vazquez Ramos) - Fix missing debugfs entry removal on failure by using the devm_ variant in the LVTS thermal driver (AngeloGioacchino Del Regno) - Remove the unused lvts_debugfs_exit() function as the devm_ variant introduced before takes care of removing the debugfs entry in the LVTS driver (Arnd Bergmann) - Add the Airoha EN7581 thermal sensor support along with its DT bindings (Christian Marangi) - Add ipq5018 compatible string DT binding, cleanup and add its suppot to the QCom Tsens thermal driver (Sricharan Ramabadhran, George Moussalem) - Fix comments typos in the Airoha driver (Christian Marangi, Colin Ian King) - Address a sparse warning by making a local variable static in the QCom thermal driver (George Moussalem) - Fix the usage of the _SCP control method in the driver for ACPI thermal zones (Armin Wolf)" * tag 'thermal-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: thermal: qcom: ipq5018: make ops_ipq5018 struct static thermal/drivers/airoha: Fix spelling mistake "calibrarion" -> "calibration" ACPI: thermal: Execute _SCP before reading trip points ACPI: OSI: Stop advertising support for "3.0 _SCP Extensions" thermal/drivers/airoha: Fix spelling mistake thermal/drivers/qcom/tsens: Add support for IPQ5018 tsens thermal/drivers/qcom/tsens: Add support for tsens v1 without RPM thermal/drivers/qcom/tsens: Update conditions to strictly evaluate for IP v2+ dt-bindings: thermal: qcom-tsens: Add ipq5018 compatible thermal/drivers: Add support for Airoha EN7581 thermal sensor dt-bindings: thermal: Add support for Airoha EN7581 thermal sensor thermal/drivers/mediatek/lvts: Remove unused lvts_debugfs_exit thermal/drivers/mediatek/lvts: Fix debugfs unregister on failure thermal/drivers/amlogic: Rename Uptat to uptat to follow kernel coding style vsprintf: remove redundant and unused %pCn format specifier thermal/drivers/bcm2835: Use %pC instead of %pCn thermal/drivers/hisi: Do not enable by default during compile testing thermal: int340x: processor_thermal: Platform temperature control documentation thermal: intel: int340x: Enable platform temperature control thermal: intel: int340x: Add platform temperature control interface
2025-05-28Merge tag 'sound-6.16-rc1' of ↵Linus Torvalds1-18/+21
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound updates from Takashi Iwai: "We've received a lot of activities in this cycle, mostly about leaf driver codes rather than the core part, but with a good mixture of code cleanups and new driver additions. Below are some highlights: ASoC: - Support for automatically enumerating DAIs from standards conforming SoundWire SDCA devices; not much used as of this writing, rather for future implementations - Conversion of quite a few drivers to newer GPIO APIs - Continued cleanups and helper usages in allover places - Support for a wider range of Intel AVS platforms - Support for AMD ACP 7.x platforms, Cirrus Logic CS35L63 and CS48L32 Everest Semiconductor ES8375 and ES8389, Longsoon-1 AC'97 controllers, nVidia Tegra264, Richtek ALC203 and RT9123 and Rockchip SAI controllers HD-audio: - Lots of cleanups of TAS2781 codec drivers - A new HD-audio control bound via ACPI for Nvidia - Support for Tegra264, Intel WCL, usual new codec quirks USB-audio: - Fix a race at removal of MIDI device - Pioneer DJM-V10 support, Scarlett2 driver cleanups Misc: - Cleanups of deprecated PCI functions - Removal of unused / dead function codes" * tag 'sound-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (364 commits) firmware: cs_dsp: Fix OOB memory read access in KUnit test ASoC: codecs: add support for ES8375 ASoC: dt-bindings: Add Everest ES8375 audio CODEC ALSA: hda: acpi: Make driver's match data const static ALSA: hda: acpi: Use SYSTEM_SLEEP_PM_OPS() ALSA: atmel: Replace deprecated strcpy() with strscpy() ALSA: core: fix up bus match const issues. ASoC: wm_adsp: Make cirrus_dir const ASoC: tegra: Tegra264 support in isomgr_bw ASoC: tegra: AHUB: Add Tegra264 support ASoC: tegra: ADX: Add Tegra264 support ASoC: tegra: AMX: Add Tegra264 support ASoC: tegra: I2S: Add Tegra264 support ASoC: tegra: Update PLL rate for Tegra264 ASoC: tegra: ASRC: Update ARAM address ASoC: tegra: ADMAIF: Add Tegra264 support ASoC: tegra: CIF: Add Tegra264 support dt-bindings: ASoC: Document Tegra264 APE support dt-bindings: ASoC: admaif: Add missing properties ASoC: dt-bindings: audio-graph-card2: reference audio-graph routing property ...
2025-05-27Merge tag 'ratelimit.2025.05.25a' of ↵Linus Torvalds1-23/+52
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull rate-limit updates from Paul McKenney: "lib/ratelimit: Reduce false-positive and silent misses: - Reduce open-coded use of ratelimit_state structure fields. - Convert the ->missed field to atomic_t. - Count misses that are due to lock contention. - Eliminate jiffies=0 special case. - Reduce ___ratelimit() false-positive rate limiting (Petr Mladek). - Allow zero ->burst to hard-disable rate limiting. - Optimize away atomic operations when a miss is guaranteed. - Warn if ->interval or ->burst are negative (Petr Mladek). - Simplify the resulting code. A smoke test and stress test have been created, but they are not yet ready for mainline. With luck, we will offer them for the v6.17 merge window" * tag 'ratelimit.2025.05.25a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: ratelimit: Drop redundant accesses to burst ratelimit: Use nolock_ret restructuring to collapse common case code ratelimit: Use nolock_ret label to collapse lock-failure code ratelimit: Use nolock_ret label to save a couple of lines of code ratelimit: Simplify common-case exit path ratelimit: Warn if ->interval or ->burst are negative ratelimit: Avoid atomic decrement under lock if already rate-limited ratelimit: Avoid atomic decrement if already rate-limited ratelimit: Don't flush misses counter if RATELIMIT_MSG_ON_RELEASE ratelimit: Force re-initialization when rate-limiting re-enabled ratelimit: Allow zero ->burst to disable ratelimiting ratelimit: Reduce ___ratelimit() false-positive rate limiting ratelimit: Avoid jiffies=0 special case ratelimit: Count misses due to lock contention ratelimit: Convert the ->missed field to atomic_t drm/amd/pm: Avoid open-coded use of ratelimit_state structure's internals drm/i915: Avoid open-coded use of ratelimit_state structure's ->missed field random: Avoid open-coded use of ratelimit_state structure's ->missed field ratelimit: Create functions to handle ratelimit_state internals
2025-05-27Merge tag 'x86_cache_for_v6.16_rc1' of ↵Linus Torvalds1-0/+11
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 resource control updates from Borislav Petkov: "Carve out the resctrl filesystem-related code into fs/resctrl/ so that multiple architectures can share the fs API for manipulating their respective hw resource control implementation. This is the second step in the work towards sharing the resctrl filesystem interface, the next one being plugging ARM's MPAM into the aforementioned fs API" * tag 'x86_cache_for_v6.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits) MAINTAINERS: Add reviewers for fs/resctrl x86,fs/resctrl: Move the resctrl filesystem code to live in /fs/resctrl x86/resctrl: Always initialise rid field in rdt_resources_all[] x86/resctrl: Relax some asm #includes x86/resctrl: Prefer alloc(sizeof(*foo)) idiom in rdt_init_fs_context() x86/resctrl: Squelch whitespace anomalies in resctrl core code x86/resctrl: Move pseudo lock prototypes to include/linux/resctrl.h x86/resctrl: Fix types in resctrl_arch_mon_ctx_{alloc,free}() stubs x86/resctrl: Move enum resctrl_event_id to resctrl.h x86/resctrl: Move the filesystem bits to headers visible to fs/resctrl fs/resctrl: Add boiler plate for external resctrl code x86/resctrl: Add 'resctrl' to the title of the resctrl documentation x86/resctrl: Split trace.h x86/resctrl: Expand the width of domid by replacing mon_data_bits x86/resctrl: Add end-marker to the resctrl_event_id enum x86/resctrl: Move is_mba_sc() out of core.c x86/resctrl: Drop __init/__exit on assorted symbols x86/resctrl: Resctrl_exit() teardown resctrl but leave the mount point x86/resctrl: Check all domains are offline in resctrl_exit() x86/resctrl: Rename resctrl_sched_in() to begin with "resctrl_arch_" ...
2025-05-27Merge tag 'linux_kselftest-kunit-6.16-rc1' of ↵Linus Torvalds2-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kunit updates from Shuah Khan: - Enable qemu_config for riscv32, sparc 64-bit, PowerPC 32-bit BE and 64-bit LE - Enable CONFIG_SPARC32 to clearly differentiate between sparc 32-bit and 64-bit configurations - Enable CONFIG_CPU_BIG_ENDIAN to clearly differentiate between powerpc LE and BE configurations - Add feature to list available architectures to kunit tool - Fixes to bugs and changes to documentation * tag 'linux_kselftest-kunit-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: kunit: Fix wrong parameter to kunit_deactivate_static_stub() kunit: tool: add test counts to JSON output Documentation: kunit: improve example on testing static functions kunit: executor: Remove const from kunit_filter_suites() allocation type kunit: qemu_configs: Disable faulting tests on 32-bit SPARC kunit: qemu_configs: Add 64-bit SPARC configuration kunit: qemu_configs: sparc: Explicitly enable CONFIG_SPARC32=y kunit: qemu_configs: Add PowerPC 32-bit BE and 64-bit LE kunit: qemu_configs: powerpc: Explicitly enable CONFIG_CPU_BIG_ENDIAN=y kunit: tool: Implement listing of available architectures kunit: qemu_configs: Add riscv32 config kunit: configs: Enable CONFIG_INIT_STACK_ALL_PATTERN in all_tests
2025-05-26Merge tag 'v6.16-p1' of ↵Linus Torvalds15-244/+553
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - Fix memcpy_sglist to handle partially overlapping SG lists - Use memcpy_sglist to replace null skcipher - Rename CRYPTO_TESTS to CRYPTO_BENCHMARK - Flip CRYPTO_MANAGER_DISABLE_TEST into CRYPTO_SELFTESTS - Hide CRYPTO_MANAGER - Add delayed freeing of driver crypto_alg structures Compression: - Allocate large buffers on first use instead of initialisation in scomp - Drop destination linearisation buffer in scomp - Move scomp stream allocation into acomp - Add acomp scatter-gather walker - Remove request chaining - Add optional async request allocation Hashing: - Remove request chaining - Add optional async request allocation - Move partial block handling into API - Add ahash support to hmac - Fix shash documentation to disallow usage in hard IRQs Algorithms: - Remove unnecessary SIMD fallback code on x86 and arm/arm64 - Drop avx10_256 xts(aes)/ctr(aes) on x86 - Improve avx-512 optimisations for xts(aes) - Move chacha arch implementations into lib/crypto - Move poly1305 into lib/crypto and drop unused Crypto API algorithm - Disable powerpc/poly1305 as it has no SIMD fallback - Move sha256 arch implementations into lib/crypto - Convert deflate to acomp - Set block size correctly in cbcmac Drivers: - Do not use sg_dma_len before mapping in sun8i-ss - Fix warm-reboot failure by making shutdown do more work in qat - Add locking in zynqmp-sha - Remove cavium/zip - Add support for PCI device 0x17D8 to ccp - Add qat_6xxx support in qat - Add support for RK3576 in rockchip-rng - Add support for i.MX8QM in caam Others: - Fix irq_fpu_usable/kernel_fpu_begin inconsistency during CPU bring-up - Add new SEV/SNP platform shutdown API in ccp" * tag 'v6.16-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (382 commits) x86/fpu: Fix irq_fpu_usable() to return false during CPU onlining crypto: qat - add missing header inclusion crypto: api - Redo lookup on EEXIST Revert "crypto: testmgr - Add hash export format testing" crypto: marvell/cesa - Do not chain submitted requests crypto: powerpc/poly1305 - add depends on BROKEN for now Revert "crypto: powerpc/poly1305 - Add SIMD fallback" crypto: ccp - Add missing tee info reg for teev2 crypto: ccp - Add missing bootloader info reg for pspv5 crypto: sun8i-ce - move fallback ahash_request to the end of the struct crypto: octeontx2 - Use dynamic allocated memory region for lmtst crypto: octeontx2 - Initialize cptlfs device info once crypto: xts - Only add ecb if it is not already there crypto: lrw - Only add ecb if it is not already there crypto: testmgr - Add hash export format testing crypto: testmgr - Use ahash for generic tfm crypto: hmac - Add ahash support crypto: testmgr - Ignore EEXIST on shash allocation crypto: algapi - Add driver template support to crypto_inst_setname crypto: shash - Set reqsize in shash_alg ...
2025-05-26Merge tag 'crc-for-linus' of ↵Linus Torvalds2-8/+5
git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux Pull CRC updates from Eric Biggers: "Cleanups for the kernel's CRC (cyclic redundancy check) code: - Use __ro_after_init where appropriate - Remove unnecessary static_key on s390 - Rename some source code files - Rename the crc32 and crc32c crypto API modules - Use subsys_initcall instead of arch_initcall - Restore maintainers for crc_kunit.c - Fold crc16_byte() into crc16.c - Add some SPDX license identifiers" * tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: lib/crc32: add SPDX license identifier lib/crc16: unexport crc16_table and crc16_byte() w1: ds2406: use crc16() instead of crc16_byte() loop MAINTAINERS: add crc_kunit.c back to CRC LIBRARY lib/crc: make arch-optimized code use subsys_initcall crypto: crc32 - remove "generic" from file and module names x86/crc: drop "glue" from filenames sparc/crc: drop "glue" from filenames s390/crc: drop "glue" from filenames powerpc/crc: rename crc32-vpmsum_core.S to crc-vpmsum-template.S powerpc/crc: drop "glue" from filenames arm64/crc: drop "glue" from filenames arm/crc: drop "glue" from filenames s390/crc32: Remove no-op module init and exit functions s390/crc32: Remove have_vxrs static key lib/crc: make the CPU feature static keys __ro_after_init
2025-05-25alloc_tag: allocate percpu counters for module tags dynamicallySuren Baghdasaryan2-20/+72
When a module gets unloaded it checks whether any of its tags are still in use and if so, we keep the memory containing module's allocation tags alive until all tags are unused. However percpu counters referenced by the tags are freed by free_module(). This will lead to UAF if the memory allocated by a module is accessed after module was unloaded. To fix this we allocate percpu counters for module allocation tags dynamically and we keep it alive for tags which are still in use after module unloading. This also removes the requirement of a larger PERCPU_MODULE_RESERVE when memory allocation profiling is enabled because percpu memory for counters does not need to be reserved anymore. Link: https://lkml.kernel.org/r/20250517000739.5930-1-surenb@google.com Fixes: 0db6f8d7820a ("alloc_tag: load module tags into separate contiguous memory") Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reported-by: David Wang <00107082@163.com> Closes: https://lore.kernel.org/all/20250516131246.6244-1-00107082@163.com/ Tested-by: David Wang <00107082@163.com> Cc: Christoph Lameter (Ampere) <cl@gentwo.org> Cc: Dennis Zhou <dennis@kernel.org> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Tejun Heo <tj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-23alloc_tag: check mem_profiling_support in alloc_tag_initCasey Chen1-15/+19
If mem_profiling_support is false, for example by sysctl.vm.mem_profiling=never, alloc_tag_init should skip module tags allocation, codetag type registration and procfs init. Link: https://lkml.kernel.org/r/20250513182602.121843-1-cachen@purestorage.com Signed-off-by: Casey Chen <cachen@purestorage.com> Reviewed-by: Yuanyuan Zhong <yzhong@purestorage.com> Acked-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-22lib/crc32: remove unused support for CRC32C combinationEric Biggers2-12/+0
crc32c_combine() and crc32c_shift() are no longer used (except by the KUnit test that tests them), and their current implementation is very slow. Remove them. Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://patch.msgid.link/20250519175012.36581-8-ebiggers@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-21kunit: Fix wrong parameter to kunit_deactivate_static_stub()Tzung-Bi Shih1-1/+1
kunit_deactivate_static_stub() accepts real_fn_addr instead of replacement_addr. In the case, it always passes NULL to kunit_deactivate_static_stub(). Fix it. Link: https://lore.kernel.org/r/20250520082050.2254875-1-tzungbi@kernel.org Fixes: e047c5eaa763 ("kunit: Expose 'static stub' API to redirect functions") Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org> Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2025-05-16vsprintf: remove redundant and unused %pCn format specifierLuca Ceresoli1-8/+2
%pC and %pCn print the same string, and commit 900cca294425 ("lib/vsprintf: add %pC{,n,r} format specifiers for clocks") introducing them does not clarify any intended difference. It can be assumed %pC is a default for %pCn as some other specifiers do, but not all are consistent with this policy. Moreover there is now no other suffix other than 'n', which makes a default not really useful. All users in the kernel were using %pC except for one which has been converted. So now remove %pCn and all the unnecessary extra code and documentation. Acked-by: Stephen Boyd <sboyd@kernel.org> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Tested-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Yanteng Si <si.yanteng@linux.dev> Link: https://lore.kernel.org/r/20250311-vsprintf-pcn-v2-2-0af40fc7dee4@bootlin.com Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2025-05-15find: Add find_first_andnot_bit()Yury Norov [NVIDIA]1-0/+11
The function helps to implement cpumask_andnot() APIs. Signed-off-by: Yury Norov [NVIDIA] <yury.norov@gmail.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: James Morse <james.morse@arm.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: James Morse <james.morse@arm.com> Tested-by: Tony Luck <tony.luck@intel.com> Tested-by: Fenghua Yu <fenghuay@nvidia.com> Link: https://lore.kernel.org/20250515165855.31452-3-james.morse@arm.com
2025-05-15pldmfw: Don't require send_package_data or send_component_table to be definedLee Trager1-0/+6
Not all drivers require send_package_data or send_component_table when updating firmware. Instead of forcing drivers to implement a stub allow these functions to go undefined. Signed-off-by: Lee Trager <lee@trager.us> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250512190109.2475614-2-lee@trager.us Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-14lib/crc32: add SPDX license identifierEric Biggers1-3/+1
lib/crc32.c and include/linux/crc32.h got missed by the bulk SPDX conversion because of the nonstandard explanation of the license. However, crc32.c clearly states that it's licensed under the GNU General Public License, Version 2. And the comment in crc32.h clearly indicates that it's meant to have the same license as crc32.c. Therefore, apply SPDX-License-Identifier: GPL-2.0-only to both files. Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20250514052409.194822-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-05-14lib/crc16: unexport crc16_table and crc16_byte()Eric Biggers1-5/+4
Now that neither crc16_table nor crc16_byte() is used outside lib/crc16.c, fold them into lib/crc16.c. Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250513022115.39109-3-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-05-13xarray: fix kerneldoc for __xa_cmpxchgChristoph Hellwig1-3/+6
Fix the documentation for __xa_cmpxchg to actually describe the cmpxch-like semantics correctly, based on the version for xa_cmpxchg. Link: https://lkml.kernel.org/r/20250507051656.3900864-1-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12crypto: testmgr - make it easier to enable the full set of testsEric Biggers1-1/+1
Currently the full set of crypto self-tests requires CONFIG_CRYPTO_MANAGER_EXTRA_TESTS=y. This is problematic in two ways. First, developers regularly overlook this option. Second, the description of the tests as "extra" sometimes gives the impression that it is not required that all algorithms pass these tests. Given that the main use case for the crypto self-tests is for developers, make enabling CONFIG_CRYPTO_SELFTESTS=y just enable the full set of crypto self-tests by default. The slow tests can still be disabled by adding the command-line parameter cryptomgr.noextratests=1, soon to be renamed to cryptomgr.noslowtests=1. The only known use case for doing this is for people trying to use the crypto self-tests to satisfy the FIPS 140-3 pre-operational self-testing requirements when the kernel is being validated as a FIPS 140-3 cryptographic module. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-05-12crypto: testmgr - replace CRYPTO_MANAGER_DISABLE_TESTS with CRYPTO_SELFTESTSEric Biggers6-11/+8
The negative-sense of CRYPTO_MANAGER_DISABLE_TESTS is a longstanding mistake that regularly causes confusion. Especially bad is that you can have CRYPTO=n && CRYPTO_MANAGER_DISABLE_TESTS=n, which is ambiguous. Replace CRYPTO_MANAGER_DISABLE_TESTS with CRYPTO_SELFTESTS which has the expected behavior. The tests continue to be disabled by default. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-05-12crypto: lib/chacha - add array bounds to function prototypesEric Biggers2-9/+8
Add explicit array bounds to the function prototypes for the parameters that didn't already get handled by the conversion to use chacha_state: - chacha_block_*(): Change 'u8 *out' or 'u8 *stream' to u8 out[CHACHA_BLOCK_SIZE]. - hchacha_block_*(): Change 'u32 *out' or 'u32 *stream' to u32 out[HCHACHA_OUT_WORDS]. - chacha_init(): Change 'const u32 *key' to 'const u32 key[CHACHA_KEY_WORDS]'. Change 'const u8 *iv' to 'const u8 iv[CHACHA_IV_SIZE]'. No functional changes. This just makes it clear when fixed-size arrays are expected. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-05-12crypto: lib/chacha - add strongly-typed state zeroizationEric Biggers1-3/+3
Now that the ChaCha state matrix is strongly-typed, add a helper function chacha_zeroize_state() which zeroizes it. Then convert all applicable callers to use it instead of direct memzero_explicit. No functional changes. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-05-12crypto: lib/chacha - use struct assignment to copy stateEric Biggers1-6/+2
Use struct assignment instead of memcpy() in lib/crypto/chacha.c where appropriate. No functional change. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-05-12crypto: lib/chacha - strongly type the ChaCha stateEric Biggers4-44/+52
The ChaCha state matrix is 16 32-bit words. Currently it is represented in the code as a raw u32 array, or even just a pointer to u32. This weak typing is error-prone. Instead, introduce struct chacha_state: struct chacha_state { u32 x[16]; }; Convert all ChaCha and HChaCha functions to use struct chacha_state. No functional changes. Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-05-12lib/oid_registry.c: remove unused sprint_OIDDr. David Alan Gilbert1-24/+1
sprint_OID() was added as part of 2012's commit 4f73175d0375 ("X.509: Add utility functions to render OIDs as strings") but it hasn't been used. Remove it. Note that there's also 'sprint_oid' (lower case) which is used in a lot of places; that's left as is except for fixing its case in a comment. Link: https://lkml.kernel.org/r/20250501010502.326472-1-linux@treblig.org Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12lib/test_kmod: do not hardcode/depend on any filesystemHerton R. Krzesinski2-36/+34
Right now test_kmod has hardcoded dependencies on btrfs/xfs. That is not optimal since you end up needing to select/build them, but it is not really required since other fs could be selected for the testing. Also, we can't change the default/driver module used for testing on initialization. Thus make it more generic: introduce two module parameters (start_driver and start_test_fs), which allow to select which modules/fs to use for the testing on test_kmod initialization. Then it's up to the user to select which modules/fs to use for testing based on his config. However, keep test_module as required default. This way, config/modules becomes selectable as when the testing is done from selftests (userspace). While at it, also change trigger_config_run_type, since at module initialization we already set the defaults at __kmod_config_init and should not need to do it again in test_kmod_init(), thus we can avoid to again set test_driver/test_fs. Link: https://lkml.kernel.org/r/20250418165047.702487-1-herton@redhat.com Signed-off-by: Herton R. Krzesinski <herton@redhat.com> Reviewed-by: Luis Chambelrain <mcgrof@kernel.org> Cc: Daniel Gomez <da.gomez@samsung.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Petr Pavlu <petr.pavlu@suse.com> Cc: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12scatterlist: inline sg_next()Caleb Sander Mateos1-23/+0
sg_next() is a short function called frequently in I/O paths. Define it in the header file so it can be inlined into its callers. Link: https://lkml.kernel.org/r/20250416160615.3571958-1-csander@purestorage.com Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Cc: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12errseq: eliminate special limitation for macro MAX_ERRNOZijun Hu1-6/+7
Current errseq implementation depends on a very special precondition that macro MAX_ERRNO must be (2^n - 1). Eliminate the limitation by - redefining macro ERRSEQ_SHIFT - defining a new macro ERRNO_MASK instead of MAX_ERRNO for errno mask. There is no plan to change the value of MAX_ERRNO, but this makes the implementation more generic and eliminates the BUILD_BUG_ON(). Link: https://lkml.kernel.org/r/20250407-improve_errseq-v1-1-7b27cbeb8298@quicinc.com Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12kstrtox: add support for enabled and disabled in kstrtobool()Mario Limonciello1-0/+4
In some places in the kernel there is a design pattern for sysfs attributes to use kstrtobool() in store() and str_enabled_disabled() in show(). This is counterintuitive to interact with because kstrtobool() takes on/off but str_enabled_disabled() shows enabled/disabled. Some of those sysfs uses could switch to str_on_off() but for some attributes enabled/disabled really makes more sense. Add support for kstrtobool() to accept enabled/disabled. Link: https://lkml.kernel.org/r/20250321022538.1532445-1-superm1@kernel.org Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12lib/rbtree.c: fix the example typoChisheng Chen1-4/+4
Replace `sr` with `Sr`. The condition `!tmp1 || rb_is_black(tmp1)` ensures that `tmp1` (which is `sibling->rb_right`) is either NULL or a black node. Therefore, the right child of the sibling must be black, and the example should use `Sr` instead of `sr`. Link: https://lkml.kernel.org/r/20250403112614.570140-1-johnny1001s000602@gmail.com Signed-off-by: Chisheng Chen <johnny1001s000602@gmail.com> Cc: Hsin Chang Yu <zxcvb600870024@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12lib/test_vmalloc.c: allow built-in executionUladzislau Rezki (Sony)2-4/+4
Remove the dependency on module loading ("m") for the vmalloc test suite, enabling it to be built directly into the kernel, so both ("=m") and ("=y") are supported. Motivation: - Faster debugging/testing of vmalloc code; - It allows to configure the test via kernel-boot parameters. Configuration example: test_vmalloc.nr_threads=64 test_vmalloc.run_test_mask=7 test_vmalloc.sequential_test_order=1 Link: https://lkml.kernel.org/r/20250417161216.88318-2-urezki@gmail.com Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Reviewed-by: Baoquan He <bhe@redhat.com> Reviewed-by: Adrian Huang <ahuang12@lenovo.com> Tested-by: Adrian Huang <ahuang12@lenovo.com> Cc: Christop Hellwig <hch@infradead.org> Cc: Mateusz Guzik <mjguzik@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12lib/test_vmalloc.c: replace RWSEM to SRCU for setupUladzislau Rezki (Sony)1-10/+7
The test has the initialization step during which threads are created. To prevent the workers from starting prematurely a write lock was previously used by the main setup thread, while each worker would block on a read lock. Replace this RWSEM based synchronization with a simpler SRCU based approach. Which does two basic steps: - Main thread wraps the setup phase in an SRCU read-side critical section. Pair of srcu_read_lock()/srcu_read_unlock(). - Each worker calls synchronize_srcu() on entry, ensuring it waits for the initialization phase to be completed. This patch eliminates the need for down_read()/up_read() and down_write()/up_write() pairs thus simplifying the logic and improving clarity. [urezki@gmail.com: fix compile error with CONFIG_TINY_RCU] Link: https://lkml.kernel.org/r/20250420142029.103169-1-urezki@gmail.com Link: https://lkml.kernel.org/r/20250417161216.88318-1-urezki@gmail.com Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Reviewed-by: Adrian Huang <ahuang12@lenovo.com> Tested-by: Adrian Huang <ahuang12@lenovo.com> Cc: Baoquan He <bhe@redhat.com> Cc: Christop Hellwig <hch@infradead.org> Cc: Mateusz Guzik <mjguzik@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12maple_tree: reorder mas->store_type case statementsSidhartha Kumar1-26/+25
Move the unlikely case that mas->store_type is invalid to be the last evaluated case and put liklier cases higher up. Link: https://lkml.kernel.org/r/20250410191446.2474640-7-sidhartha.kumar@oracle.com Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Suggested-by: Liam R. Howlett <liam.howlett@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12maple_tree: add sufficient heightSidhartha Kumar1-3/+16
In order to support rebalancing and spanning stores using less than the worst case number of nodes, we need to track more than just the vacant height. Using only vacant height to reduce the worst case maple node allocation count can lead to a shortcoming of nodes in the following scenarios. For rebalancing writes, when a leaf node becomes insufficient, it may be combined with a sibling into a single node. This means that the parent node which has entries for this children will lose one entry. If this parent node was just meeting the minimum entries, losing one entry will now cause this parent node to be insufficient. This leads to a cascading operation of rebalancing at different levels and can lead to more node allocations than simply using vacant height can return. For spanning writes, a similar situation occurs. At the location at which a spanning write is detected, the number of ancestor nodes may similarly need to rebalanced into a smaller number of nodes and the same cascading situation could occur. To use less than the full height of the tree for the number of allocations, we also need to track the height at which a non-leaf node cannot become insufficient. This means even if a rebalance occurs to a child of this node, it currently has enough entries that it can lose one without any further action. This field is stored in the maple write state as sufficient height. In mas_prealloc_calc() when figuring out how many nodes to allocate, we check if the vacant node is lower in the tree than a sufficient node (has a larger value). If it is, we cannot use the vacant height and must use the difference in the height and sufficient height as the basis for the number of nodes needed. An off by one bug was also discovered in mast_overflow() where it is using >= rather than >. This caused extra iterations of the mas_spanning_rebalance() loop and lead to unneeded allocations. A test is also added to check the number of allocations is correct. Link: https://lkml.kernel.org/r/20250410191446.2474640-6-sidhartha.kumar@oracle.com Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12maple_tree: break on convergence in mas_spanning_rebalance()Sidhartha Kumar1-3/+13
This allows support for using the vacant height to calculate the worst case number of nodes needed for wr_rebalance operation. mas_spanning_rebalance() was seen to perform unnecessary node allocations. We can reduce allocations by breaking early during the rebalancing loop once we realize that we have ascended to a common ancestor. Link: https://lkml.kernel.org/r/20250410191446.2474640-5-sidhartha.kumar@oracle.com Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Suggested-by: Liam Howlett <liam.howlett@oracle.com> Reviewed-by: Wei Yang <richard.weiyang@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12maple_tree: use vacant nodes to reduce worst case allocationsSidhartha Kumar1-4/+9
In order to determine the store type for a maple tree operation, a walk of the tree is done through mas_wr_walk(). This function descends the tree until a spanning write is detected or we reach a leaf node. While descending, keep track of the height at which we encounter a node with available space. This is done by checking if mas->end is less than the number of slots a given node type can fit. Now that the height of the vacant node is tracked, we can use the difference between the height of the tree and the height of the vacant node to know how many levels we will have to propagate creating new nodes. Update mas_prealloc_calc() to consider the vacant height and reduce the number of worst-case allocations. Rebalancing and spanning stores are not supported and fall back to using the full height of the tree for allocations. Update preallocation testing assertions to take into account vacant height. Link: https://lkml.kernel.org/r/20250410191446.2474640-4-sidhartha.kumar@oracle.com Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12maple_tree: use height and depth consistentlySidhartha Kumar1-40/+44
For the maple tree, the root node is defined to have a depth of 0 with a height of 1. Each level down from the node, these values are incremented by 1. Various code paths define a root with depth 1 which is inconsisent with the definition. Modify the code to be consistent with this definition. In mas_spanning_rebalance(), l_mas.depth was being used to track the height based on the number of iterations done in the main loop. This information was then used in mas_put_in_tree() to set the height. Rather than overload the l_mas.depth field to track height, simply keep track of height in the local variable new_height and directly pass this to mas_wmb_replace() which will be passed into mas_put_in_tree(). This allows up to remove writes to l_mas.depth. Link: https://lkml.kernel.org/r/20250410191446.2474640-3-sidhartha.kumar@oracle.com Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12maple_tree: convert mas_prealloc_calc() to take in a maple write stateSidhartha Kumar1-8/+8
Patch series "Track node vacancy to reduce worst case allocation counts", v5. ================ overview ======================== Currently, the maple tree preallocates the worst case number of nodes for given store type by taking into account the whole height of the tree. This comes from a worst case scenario of every node in the tree being full and having to propagate node allocation upwards until we reach the root of the tree. This can be optimized if there are vacancies in nodes that are at a lower depth than the root node. This series implements tracking the level at which there is a vacant node so we only need to allocate until this level is reached, rather than always using the full height of the tree. The ma_wr_state struct is modified to add a field which keeps track of the vacant height and is updated during walks of the tree. This value is then read in mas_prealloc_calc() when we decide how many nodes to allocate. For rebalancing and spanning stores, we also need to track the lowest height at which a node has 1 more entry than the minimum sufficient number of entries. This is because rebalancing can cause a parent node to become insufficient which results in further node allocations. In this case, we need to use the sufficient height as the worst case rather than the vacant height. patch 1-2: preparatory patches patch 3: implement vacant height tracking + update the tests patch 4: support vacant height tracking for rebalancing writes patch 5: implement sufficient height tracking patch 6: reorder switch case statements ================ results ========================= Bpftrace was used to profile the allocation path for requesting new maple nodes while running stress-ng mmap 120s. The histograms below represent requests to kmem_cache_alloc_bulk() and show the count argument. This represnts how many maple nodes the caller is requesting in kmem_cache_alloc_bulk() command: stress-ng --mmap 4 --timeout 120 mm-unstable @bulk_alloc_req: [3, 4) 4 | | [4, 5) 54170 |@ | [5, 6) 0 | | [6, 7) 893057 |@@@@@@@@@@@@@@@@@@@@ | [7, 8) 4 | | [8, 9) 2230287 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [9, 10) 55811 |@ | [10, 11) 77834 |@ | [11, 12) 0 | | [12, 13) 1368684 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | [13, 14) 0 | | [14, 15) 0 | | [15, 16) 367197 |@@@@@@@@ | @maple_node_total: 46,630,160 @total_vmas: 46184591 mm-unstable + this series @bulk_alloc_req: [2, 3) 198 | | [3, 4) 4 | | [4, 5) 43 | | [5, 6) 0 | | [6, 7) 1069503 |@@@@@@@@@@@@@@@@@@@@@ | [7, 8) 4 | | [8, 9) 2597268 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [9, 10) 472191 |@@@@@@@@@ | [10, 11) 191904 |@@@ | [11, 12) 0 | | [12, 13) 247316 |@@@@ | [13, 14) 0 | | [14, 15) 0 | | [15, 16) 98769 |@ | @maple_node_total: 37,813,856 @total_vmas: 43493287 This represents a ~19% reduction in the number of bulk maple nodes allocated. For more reproducible results, a historgram of the return value of mas_prealloc_calc() is displayed while running the maple_tree_tests whcih have a deterministic store pattern mas_prealloc_calc() return value mm-unstable 1 : (12068) 3 : (11836) 5 : ***** (271192) 7 : ************************************************** (2329329) 9 : *********** (534186) 10 : (435) 11 : *************** (704306) 13 : ******** (409781) mas_prealloc_calc() return value mm-unstable + this series 1 : (12070) 3 : ************************************************** (3548777) 5 : ******** (633458) 7 : (65081) 9 : (11224) 10 : (341) 11 : (2973) 13 : (68) do_mmap latency was also measured for regressions: command: stress-ng --mmap 4 --timeout 120 mm-unstable: avg = 7162 nsecs, total: 16101821292 nsecs, count: 2248034 mm-unstable + this series: avg = 6689 nsecs, total: 15135391764 nsecs, count: 2262726 stress-ng --mmap4 --timeout 120 with vacant_height: stress-ng: info: [257] 21526312 Maple Tree Read 0.176 M/sec stress-ng: info: [257] 339979348 Maple Tree Write 2.774 M/sec without vacant_height: stress-ng: info: [8228] 20968900 Maple Tree Read 0.171 M/sec stress-ng: info: [8228] 312214648 Maple Tree Write 2.547 M/sec This represents an increase of ~3% read throughput and ~9% increase in write throughput. This patch (of 6): In a subsequent patch, mas_prealloc_calc() will need to access fields only in the ma_wr_state. Convert the function to take in a ma_wr_state and modify all callers. There is no functional change. Link: https://lkml.kernel.org/r/20250410191446.2474640-1-sidhartha.kumar@oracle.com Link: https://lkml.kernel.org/r/20250410191446.2474640-2-sidhartha.kumar@oracle.com Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12xarray: make xa_alloc_cyclic() return 0 on all success casesPrzemek Kitszel1-2/+15
Change xa_alloc_cyclic() to return 0 even on wrap-around. Do the same for xa_alloc_cyclic_irq() and xa_alloc_cyclic_bh(). This will prevent any future bug of treating return of 1 as an error: int ret = xa_alloc_cyclic(...) if (ret) // currently mishandles ret==1 goto failure; If there will be someone interested in when wrap-around occurs, there is still __xa_alloc_cyclic() that behaves as before. For now there is no such user. Link: https://lkml.kernel.org/r/20250320102219.8101-1-przemyslaw.kitszel@intel.com Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Suggested-by: Matthew Wilcox <willy@infradead.org> Link: https://lore.kernel.org/netdev/Z9gUd-5t8b5NX2wE@casper.infradead.org Cc: Andriy Shevchenko <andriy.shevchenko@intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12iov_iter: convert iov_iter_extract_xarray_pages() to use foliosMatthew Wilcox (Oracle)1-8/+8
ITER_XARRAY is exclusively used with xarrays that contain folios, not pages, so extract folio pointers from it, not page pointers. Removes a use of find_subpage(). Link: https://lkml.kernel.org/r/20250402210612.2444135-5-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12iov_iter: convert iter_xarray_populate_pages() to use foliosMatthew Wilcox (Oracle)1-7/+7
ITER_XARRAY is exclusively used with xarrays that contain folios, not pages, so extract folio pointers from it, not page pointers. Removes a hidden call to compound_head() and a use of find_subpage(). Link: https://lkml.kernel.org/r/20250402210612.2444135-4-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-09ratelimit: Drop redundant accesses to burstPaul E. McKenney1-2/+2
Now that there is the "burst <= 0" fastpath, for all later code, burst must be strictly greater than zero. Therefore, drop the redundant checks of this local variable. Link: https://lore.kernel.org/all/fbe93a52-365e-47fe-93a4-44a44547d601@paulmck-laptop/ Link: https://lore.kernel.org/all/20250423115409.3425-1-spasswolf@web.de/ Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Petr Mladek <pmladek@suse.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Kuniyuki Iwashima <kuniyu@amazon.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: John Ogness <john.ogness@linutronix.de> Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
2025-05-09ratelimit: Use nolock_ret restructuring to collapse common case codePaul E. McKenney1-10/+3
Now that unlock_ret releases the lock, then falls into nolock_ret, which handles ->missed based on the value of ret, the common-case lock-held code can be collapsed into a single "if" statement with a single-statement "then" clause. Yes, we could go further and just assign the "if" condition to ret, but in the immortal words of MSDOS, "Are you sure?". Link: https://lore.kernel.org/all/fbe93a52-365e-47fe-93a4-44a44547d601@paulmck-laptop/ Link: https://lore.kernel.org/all/20250423115409.3425-1-spasswolf@web.de/ Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Petr Mladek <pmladek@suse.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Kuniyuki Iwashima <kuniyu@amazon.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: John Ogness <john.ogness@linutronix.de> Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
2025-05-09ratelimit: Use nolock_ret label to collapse lock-failure codePaul E. McKenney1-14/+4
Now that we have a nolock_ret label that handles ->missed correctly based on the value of ret, we can eliminate a local variable and collapse several "if" statements on the lock-acquisition-failure code path. Link: https://lore.kernel.org/all/fbe93a52-365e-47fe-93a4-44a44547d601@paulmck-laptop/ Link: https://lore.kernel.org/all/20250423115409.3425-1-spasswolf@web.de/ Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Petr Mladek <pmladek@suse.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Kuniyuki Iwashima <kuniyu@amazon.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: John Ogness <john.ogness@linutronix.de> Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
2025-05-09ratelimit: Use nolock_ret label to save a couple of lines of codePaul E. McKenney1-5/+3
Create a nolock_ret label in order to start consolidating the unlocked return paths that conditionally invoke ratelimit_state_inc_miss(). Link: https://lore.kernel.org/all/fbe93a52-365e-47fe-93a4-44a44547d601@paulmck-laptop/ Link: https://lore.kernel.org/all/20250423115409.3425-1-spasswolf@web.de/ Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Petr Mladek <pmladek@suse.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Kuniyuki Iwashima <kuniyu@amazon.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: John Ogness <john.ogness@linutronix.de> Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
2025-05-09ratelimit: Simplify common-case exit pathPaul E. McKenney1-9/+5
By making "ret" always be initialized, and moving the final call to ratelimit_state_inc_miss() out from under the lock, we save a goto and a couple lines of code. This also saves a couple of lines of code from the unconditional enable/disable slowpath. Link: https://lore.kernel.org/all/fbe93a52-365e-47fe-93a4-44a44547d601@paulmck-laptop/ Link: https://lore.kernel.org/all/20250423115409.3425-1-spasswolf@web.de/ Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Petr Mladek <pmladek@suse.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Kuniyuki Iwashima <kuniyu@amazon.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: John Ogness <john.ogness@linutronix.de> Cc: Sergey Senozhatsky <senozhatsky@chromium.org>