summaryrefslogtreecommitdiff
path: root/security
AgeCommit message (Collapse)AuthorFilesLines
2025-08-28apparmor: Fix 8-byte alignment for initial dfa blob streamsHelge Deller1-2/+2
commit c567de2c4f5fe6e079672e074e1bc6122bf7e444 upstream. The dfa blob stream for the aa_dfa_unpack() function is expected to be aligned on a 8 byte boundary. The static nulldfa_src[] and stacksplitdfa_src[] arrays store the initial apparmor dfa blob streams, but since they are declared as an array-of-chars the compiler and linker will only ensure a "char" (1-byte) alignment. Add an __aligned(8) annotation to the arrays to tell the linker to always align them on a 8-byte boundary. This avoids runtime warnings at startup on alignment-sensitive platforms like parisc such as: Kernel: unaligned access to 0x7f2a584a in aa_dfa_unpack+0x124/0x788 (iir 0xca0109f) Kernel: unaligned access to 0x7f2a584e in aa_dfa_unpack+0x210/0x788 (iir 0xca8109c) Kernel: unaligned access to 0x7f2a586a in aa_dfa_unpack+0x278/0x788 (iir 0xcb01090) Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org Fixes: 98b824ff8984 ("apparmor: refcount the pdb") Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-20apparmor: fix x_table_lookup when stacking is not the first entryJohn Johansen1-23/+29
[ Upstream commit a9eb185be84e998aa9a99c7760534ccc06216705 ] x_table_lookup currently does stacking during label_parse() if the target specifies a stack but its only caller ensures that it will never be used with stacking. Refactor to slightly simplify the code in x_to_label(), this also fixes a long standing problem where x_to_labels check on stacking is only on the first element to the table option list, instead of the element that is found and used. Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-20apparmor: use the condition in AA_BUG_FMT even with debug disabledMateusz Guzik1-1/+5
[ Upstream commit 67e370aa7f968f6a4f3573ed61a77b36d1b26475 ] This follows the established practice and fixes a build failure for me: security/apparmor/file.c: In function ‘__file_sock_perm’: security/apparmor/file.c:544:24: error: unused variable ‘sock’ [-Werror=unused-variable] 544 | struct socket *sock = (struct socket *) file->private_data; | ^~~~ Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-20apparmor: shift ouid when mediating hard links in usernsGabriel Totev1-2/+4
[ Upstream commit c5bf96d20fd787e4909b755de4705d52f3458836 ] When using AppArmor profiles inside an unprivileged container, the link operation observes an unshifted ouid. (tested with LXD and Incus) For example, root inside container and uid 1000000 outside, with `owner /root/link l,` profile entry for ln: /root$ touch chain && ln chain link ==> dmesg apparmor="DENIED" operation="link" class="file" namespace="root//lxd-feet_<var-snap-lxd-common-lxd>" profile="linkit" name="/root/link" pid=1655 comm="ln" requested_mask="l" denied_mask="l" fsuid=1000000 ouid=0 [<== should be 1000000] target="/root/chain" Fix by mapping inode uid of old_dentry in aa_path_link() rather than using it directly, similarly to how it's mapped in __file_path_perm() later in the file. Signed-off-by: Gabriel Totev <gabriel.totev@zetier.com> Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-20securityfs: don't pin dentries twice, once is enough...Al Viro1-2/+0
[ Upstream commit 27cd1bf1240d482e4f02ca4f9812e748f3106e4f ] incidentally, securityfs_recursive_remove() is broken without that - it leaks dentries, since simple_recursive_removal() does not expect anything of that sort. It could be worked around by dput() in remove_one() callback, but it's easier to just drop that double-get stuff. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-20landlock: opened file never has a negative dentryAl Viro1-1/+0
[ Upstream commit d1832e648d2be564e4b5e357f94d0f33156590dc ] Reviewed-by: Christian Brauner <brauner@kernel.org> Acked-by: Mickaël Salaün <mic@digikod.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15apparmor: Fix unaligned memory accesses in KUnit testHelge Deller1-2/+4
[ Upstream commit c68804199dd9d63868497a27b5da3c3cd15356db ] The testcase triggers some unnecessary unaligned memory accesses on the parisc architecture: Kernel: unaligned access to 0x12f28e27 in policy_unpack_test_init+0x180/0x374 (iir 0x0cdc1280) Kernel: unaligned access to 0x12f28e67 in policy_unpack_test_init+0x270/0x374 (iir 0x64dc00ce) Use the existing helper functions put_unaligned_le32() and put_unaligned_le16() to avoid such warnings on architectures which prefer aligned memory accesses. Signed-off-by: Helge Deller <deller@gmx.de> Fixes: 98c0cc48e27e ("apparmor: fix policy_unpack_test on big endian systems") Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15apparmor: fix loop detection used in conflicting attachment resolutionRyan Lee2-15/+12
[ Upstream commit a88db916b8c77552f49f7d9f8744095ea01a268f ] Conflicting attachment resolution is based on the number of states traversed to reach an accepting state in the attachment DFA, accounting for DFA loops traversed during the matching process. However, the loop counting logic had multiple bugs: - The inc_wb_pos macro increments both position and length, but length is supposed to saturate upon hitting buffer capacity, instead of wrapping around. - If no revisited state is found when traversing the history, is_loop would still return true, as if there was a loop found the length of the history buffer, instead of returning false and signalling that no loop was found. As a result, the adjustment step of aa_dfa_leftmatch would sometimes produce negative counts with loop- free DFAs that traversed enough states. - The iteration in the is_loop for loop is supposed to stop before i = wb->len, so the conditional should be < instead of <=. This patch fixes the above bugs as well as the following nits: - The count and size fields in struct match_workbuf were not used, so they can be removed. - The history buffer in match_workbuf semantically stores aa_state_t and not unsigned ints, even if aa_state_t is currently unsigned int. - The local variables in is_loop are counters, and thus should be unsigned ints instead of aa_state_t's. Fixes: 21f606610502 ("apparmor: improve overlapping domain attachment resolution") Signed-off-by: Ryan Lee <ryan.lee@canonical.com> Co-developed-by: John Johansen <john.johansen@canonical.com> Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15apparmor: ensure WB_HISTORY_SIZE value is a power of 2Ryan Lee2-1/+3
[ Upstream commit 6c055e62560b958354625604293652753d82bcae ] WB_HISTORY_SIZE was defined to be a value not a power of 2, despite a comment in the declaration of struct match_workbuf stating it is and a modular arithmetic usage in the inc_wb_pos macro assuming that it is. Bump WB_HISTORY_SIZE's value up to 32 and add a BUILD_BUG_ON_NOT_POWER_OF_2 line to ensure that any future changes to the value of WB_HISTORY_SIZE respect this requirement. Fixes: 136db994852a ("apparmor: increase left match history buffer size") Signed-off-by: Ryan Lee <ryan.lee@canonical.com> Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15landlock: Fix warning from KUnit testsTingmao Wang1-27/+42
[ Upstream commit e0a69cf2c03e61bd8069becb97f66c173d0d1fa1 ] get_id_range() expects a positive value as first argument but get_random_u8() can return 0. Fix this by clamping it. Validated by running the test in a for loop for 1000 times. Note that MAX() is wrong as it is only supposed to be used for constants, but max() is good here. [..] ok 9 test_range2_rand1 [..] ok 10 test_range2_rand2 [..] ok 11 test_range2_rand15 [..] ------------[ cut here ]------------ [..] WARNING: CPU: 6 PID: 104 at security/landlock/id.c:99 test_range2_rand16 (security/landlock/id.c:99 (discriminator 1) security/landlock/id.c:234 (discriminator 1)) [..] Modules linked in: [..] CPU: 6 UID: 0 PID: 104 Comm: kunit_try_catch Tainted: G N 6.16.0-rc1-dev-00001-g314a2f98b65f #1 PREEMPT(undef) [..] Tainted: [N]=TEST [..] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [..] RIP: 0010:test_range2_rand16 (security/landlock/id.c:99 (discriminator 1) security/landlock/id.c:234 (discriminator 1)) [..] Code: 49 c7 c0 10 70 30 82 4c 89 ff 48 c7 c6 a0 63 1e 83 49 c7 45 a0 e0 63 1e 83 e8 3f 95 17 00 e9 1f ff ff ff 0f 0b e9 df fd ff ff <0f> 0b ba 01 00 00 00 e9 68 fe ff ff 49 89 45 a8 49 8d 4d a0 45 31 [..] RSP: 0000:ffff888104eb7c78 EFLAGS: 00010246 [..] RAX: 0000000000000000 RBX: 000000000870822c RCX: 0000000000000000 ^^^^^^^^^^^^^^^^ [..] [..] Call Trace: [..] [..] ---[ end trace 0000000000000000 ]--- [..] ok 12 test_range2_rand16 [..] # landlock_id: pass:12 fail:0 skip:0 total:12 [..] # Totals: pass:12 fail:0 skip:0 total:12 [..] ok 1 landlock_id Fixes: d9d2a68ed44b ("landlock: Add unique ID generator") Signed-off-by: Tingmao Wang <m@maowtm.org> Link: https://lore.kernel.org/r/73e28efc5b8cc394608b99d5bc2596ca917d7c4a.1750003733.git.m@maowtm.org [mic: Minor cosmetic improvements] Signed-off-by: Mickaël Salaün <mic@digikod.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-06-19selinux: change security_compute_sid to return the ssid or tsid on matchStephen Smalley1-5/+11
If the end result of a security_compute_sid() computation matches the ssid or tsid, return that SID rather than looking it up again. This avoids the problem of multiple initial SIDs that map to the same context. Cc: stable@vger.kernel.org Reported-by: Guido Trentalancia <guido@trentalancia.com> Fixes: ae254858ce07 ("selinux: introduce an initial SID for early boot processes") Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com> Tested-by: Guido Trentalancia <guido@trentalancia.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-06-17selinux: fix selinux_xfrm_alloc_user() to set correct ctx_lenStephen Smalley1-1/+1
We should count the terminating NUL byte as part of the ctx_len. Otherwise, UBSAN logs a warning: UBSAN: array-index-out-of-bounds in security/selinux/xfrm.c:99:14 index 60 is out of range for type 'char [*]' The allocation itself is correct so there is no actual out of bounds indexing, just a warning. Cc: stable@vger.kernel.org Suggested-by: Christian Göttsche <cgzones@googlemail.com> Link: https://lore.kernel.org/selinux/CAEjxPJ6tA5+LxsGfOJokzdPeRomBHjKLBVR6zbrg+_w3ZZbM3A@mail.gmail.com/ Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-06-11KEYS: Invert FINAL_PUT bitHerbert Xu2-4/+5
Invert the FINAL_PUT bit so that test_bit_acquire and clear_bit_unlock can be used instead of smp_mb. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-05-31Merge tag 'gcc-minimum-version-6.16' of ↵Linus Torvalds1-76/+0
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull compiler version requirement update from Arnd Bergmann: "Require gcc-8 and binutils-2.30 x86 already uses gcc-8 as the minimum version, this changes all other architectures to the same version. gcc-8 is used is Debian 10 and Red Hat Enterprise Linux 8, both of which are still supported, and binutils 2.30 is the oldest corresponding version on those. Ubuntu Pro 18.04 and SUSE Linux Enterprise Server 15 both use gcc-7 as the system compiler but additionally include toolchains that remain supported. With the new minimum toolchain versions, a number of workarounds for older versions can be dropped, in particular on x86_64 and arm64. Importantly, the updated compiler version allows removing two of the five remaining gcc plugins, as support for sancov and structeak features is already included in modern compiler versions. I tried collecting the known changes that are possible based on the new toolchain version, but expect that more cleanups will be possible. Since this touches multiple architectures, I merged the patches through the asm-generic tree." * tag 'gcc-minimum-version-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: Makefile.kcov: apply needed compiler option unconditionally in CFLAGS_KCOV Documentation: update binutils-2.30 version reference gcc-plugins: remove SANCOV gcc plugin Kbuild: remove structleak gcc plugin arm64: drop binutils version checks raid6: skip avx512 checks kbuild: require gcc-8 and binutils-2.30
2025-05-29Merge tag 'ipe-pr-20250527' of ↵Linus Torvalds4-26/+63
git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe Pull IPE update from Fan Wu: "A single commit from Jasjiv Singh, that adds an errno field to IPE policy load auditing to log failures with error details, not just successes. This improves the security audit trail and helps diagnose policy deployment issues" * tag 'ipe-pr-20250527' of git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe: ipe: add errno field to IPE policy load auditing
2025-05-29Merge tag 'net-next-6.16' of ↵Linus Torvalds5-70/+2
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Paolo Abeni: "Core: - Implement the Device Memory TCP transmit path, allowing zero-copy data transmission on top of TCP from e.g. GPU memory to the wire. - Move all the IPv6 routing tables management outside the RTNL scope, under its own lock and RCU. The route control path is now 3x times faster. - Convert queue related netlink ops to instance lock, reducing again the scope of the RTNL lock. This improves the control plane scalability. - Refactor the software crc32c implementation, removing unneeded abstraction layers and improving significantly the related micro-benchmarks. - Optimize the GRO engine for UDP-tunneled traffic, for a 10% performance improvement in related stream tests. - Cover more per-CPU storage with local nested BH locking; this is a prep work to remove the current per-CPU lock in local_bh_disable() on PREMPT_RT. - Introduce and use nlmsg_payload helper, combining buffer bounds verification with accessing payload carried by netlink messages. Netfilter: - Rewrite the procfs conntrack table implementation, improving considerably the dump performance. A lot of user-space tools still use this interface. - Implement support for wildcard netdevice in netdev basechain and flowtables. - Integrate conntrack information into nft trace infrastructure. - Export set count and backend name to userspace, for better introspection. BPF: - BPF qdisc support: BPF-qdisc can be implemented with BPF struct_ops programs and can be controlled in similar way to traditional qdiscs using the "tc qdisc" command. - Refactor the UDP socket iterator, addressing long standing issues WRT duplicate hits or missed sockets. Protocols: - Improve TCP receive buffer auto-tuning and increase the default upper bound for the receive buffer; overall this improves the single flow maximum thoughput on 200Gbs link by over 60%. - Add AFS GSSAPI security class to AF_RXRPC; it provides transport security for connections to the AFS fileserver and VL server. - Improve TCP multipath routing, so that the sources address always matches the nexthop device. - Introduce SO_PASSRIGHTS for AF_UNIX, to allow disabling SCM_RIGHTS, and thus preventing DoS caused by passing around problematic FDs. - Retire DCCP socket. DCCP only receives updates for bugs, and major distros disable it by default. Its removal allows for better organisation of TCP fields to reduce the number of cache lines hit in the fast path. - Extend TCP drop-reason support to cover PAWS checks. Driver API: - Reorganize PTP ioctl flag support to require an explicit opt-in for the drivers, avoiding the problem of drivers not rejecting new unsupported flags. - Converted several device drivers to timestamping APIs. - Introduce per-PHY ethtool dump helpers, improving the support for dump operations targeting PHYs. Tests and tooling: - Add support for classic netlink in user space C codegen, so that ynl-c can now read, create and modify links, routes addresses and qdisc layer configuration. - Add ynl sub-types for binary attributes, allowing ynl-c to output known struct instead of raw binary data, clarifying the classic netlink output. - Extend MPTCP selftests to improve the code-coverage. - Add tests for XDP tail adjustment in AF_XDP. New hardware / drivers: - OpenVPN virtual driver: offload OpenVPN data channels processing to the kernel-space, increasing the data transfer throughput WRT the user-space implementation. - Renesas glue driver for the gigabit ethernet RZ/V2H(P) SoC. - Broadcom asp-v3.0 ethernet driver. - AMD Renoir ethernet device. - ReakTek MT9888 2.5G ethernet PHY driver. - Aeonsemi 10G C45 PHYs driver. Drivers: - Ethernet high-speed NICs: - nVidia/Mellanox (mlx5): - refactor the steering table handling to significantly reduce the amount of memory used - add support for complex matches in H/W flow steering - improve flow streeing error handling - convert to netdev instance locking - Intel (100G, ice, igb, ixgbe, idpf): - ice: add switchdev support for LLDP traffic over VF - ixgbe: add firmware manipulation and regions devlink support - igb: introduce support for frame transmission premption - igb: adds persistent NAPI configuration - idpf: introduce RDMA support - idpf: add initial PTP support - Meta (fbnic): - extend hardware stats coverage - add devlink dev flash support - Broadcom (bnxt): - add support for RX-side device memory TCP - Wangxun (txgbe): - implement support for udp tunnel offload - complete PTP and SRIOV support for AML 25G/10G devices - Ethernet NICs embedded and virtual: - Google (gve): - add device memory TCP TX support - Amazon (ena): - support persistent per-NAPI config - Airoha: - add H/W support for L2 traffic offload - add per flow stats for flow offloading - RealTek (rtl8211): add support for WoL magic packet - Synopsys (stmmac): - dwmac-socfpga 1000BaseX support - add Loongson-2K3000 support - introduce support for hardware-accelerated VLAN stripping - Broadcom (bcmgenet): - expose more H/W stats - Freescale (enetc, dpaa2-eth): - enetc: add MAC filter, VLAN filter RSS and loopback support - dpaa2-eth: convert to H/W timestamping APIs - vxlan: convert FDB table to rhashtable, for better scalabilty - veth: apply qdisc backpressure on full ring to reduce TX drops - Ethernet switches: - Microchip (kzZ88x3): add ETS scheduler support - Ethernet PHYs: - RealTek (rtl8211): - add support for WoL magic packet - add support for PHY LEDs - CAN: - Adds RZ/G3E CANFD support to the rcar_canfd driver. - Preparatory work for CAN-XL support. - Add self-tests framework with support for CAN physical interfaces. - WiFi: - mac80211: - scan improvements with multi-link operation (MLO) - Qualcomm (ath12k): - enable AHB support for IPQ5332 - add monitor interface support to QCN9274 - add multi-link operation support to WCN7850 - add 802.11d scan offload support to WCN7850 - monitor mode for WCN7850, better 6 GHz regulatory - Qualcomm (ath11k): - restore hibernation support - MediaTek (mt76): - WiFi-7 improvements - implement support for mt7990 - Intel (iwlwifi): - enhanced multi-link single-radio (EMLSR) support on 5 GHz links - rework device configuration - RealTek (rtw88): - improve throughput for RTL8814AU - RealTek (rtw89): - add multi-link operation support - STA/P2P concurrency improvements - support different SAR configs by antenna - Bluetooth: - introduce HCI Driver protocol - btintel_pcie: do not generate coredump for diagnostic events - btusb: add HCI Drv commands for configuring altsetting - btusb: add RTL8851BE device 0x0bda:0xb850 - btusb: add new VID/PID 13d3/3584 for MT7922 - btusb: add new VID/PID 13d3/3630 and 13d3/3613 for MT7925 - btnxpuart: implement host-wakeup feature" * tag 'net-next-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1611 commits) selftests/bpf: Fix bpf selftest build warning selftests: netfilter: Fix skip of wildcard interface test net: phy: mscc: Stop clearing the the UDPv4 checksum for L2 frames net: openvswitch: Fix the dead loop of MPLS parse calipso: Don't call calipso functions for AF_INET sk. selftests/tc-testing: Add a test for HFSC eltree double add with reentrant enqueue behaviour on netem net_sched: hfsc: Address reentrant enqueue adding class to eltree twice octeontx2-pf: QOS: Refactor TC_HTB_LEAF_DEL_LAST callback octeontx2-pf: QOS: Perform cache sync on send queue teardown net: mana: Add support for Multi Vports on Bare metal net: devmem: ncdevmem: remove unused variable net: devmem: ksft: upgrade rx test to send 1K data net: devmem: ksft: add 5 tuple FS support net: devmem: ksft: add exit_wait to make rx test pass net: devmem: ksft: add ipv4 support net: devmem: preserve sockc_err page_pool: fix ugly page_pool formatting net: devmem: move list_add to net_devmem_bind_dmabuf. selftests: netfilter: nft_queue.sh: include file transfer duration in log message net: phy: mscc: Fix memory leak when using one step timestamping ...
2025-05-28Merge tag 'selinux-pr-20250527' of ↵Linus Torvalds11-85/+232
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux updates from Paul Moore: - Reduce the SELinux impact on path walks. Add a small directory access cache to the per-task SELinux state. This cache allows SELinux to cache the most recently used directory access decisions in order to avoid repeatedly querying the AVC on path walks where the majority of the directories have similar security contexts/labels. My performance measurements are crude, but prior to this patch the time spent in SELinux code on a 'make allmodconfig' run was 103% that of __d_lookup_rcu(), and with this patch the time spent in SELinux code dropped to 63% of __d_lookup_rcu(), a ~40% improvement. Additional improvments can be expected in the future, but those will require additional SELinux policy/toolchain support. - Add support for wildcards in genfscon policy statements. This patch allows for wildcards in the genfscon patch matching logic as opposed to the prefix matching that was used prior to this change. Adding wilcard support allows for more expressive and efficient path matching in the policy which is especially helpful for sysfs, and has resulted in a ~15% boot time reduction in Android. SELinux policies can opt into wilcard matching by using the "genfs_seclabel_wildcard" policy capability. - Unify the error/OOM handling of the SELinux network caches. A failure to allocate memory for the SELinux network caches isn't fatal as the object label can still be safely returned to the caller, it simply means that we cannot add the new data to the cache, at least temporarily. This patch corrects this behavior for the InfiniBand cache and does some minor cleanup. - Minor improvements around constification, 'likely' annotations, and removal of bogus comments. * tag 'selinux-pr-20250527' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: fix the kdoc header for task_avdcache_update selinux: remove a duplicated include selinux: reduce path walk overhead selinux: support wildcard match in genfscon selinux: drop copy-paste comment selinux: unify OOM handling in network hashtables selinux: add likely hints for fast paths selinux: contify network namespace pointer selinux: constify network address pointer
2025-05-28Merge tag 'lsm-pr-20250527' of ↵Linus Torvalds1-18/+18
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull lsm update from Paul Moore: "One minor LSM framework patch to move the selinux_netlink_send() hook under the CONFIG_SECURITY_NETWORK Kconfig knob" * tag 'lsm-pr-20250527' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: lsm: Move security_netlink_send to under CONFIG_SECURITY_NETWORK
2025-05-28Merge tag 'integrity-v6.16' of ↵Linus Torvalds4-33/+185
git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity Pull integrity updates from Mimi Zohar: "Carrying the IMA measurement list across kexec is not a new feature, but is updated to address a couple of issues: - Carrying the IMA measurement list across kexec required knowing apriori all the file measurements between the "kexec load" and "kexec execute" in order to measure them before the "kexec load". Any delay between the "kexec load" and "kexec exec" exacerbated the problem. - Any file measurements post "kexec load" were not carried across kexec, resulting in the measurement list being out of sync with the TPM PCR. With these changes, the buffer for the IMA measurement list is still allocated at "kexec load", but copying the IMA measurement list is deferred to after quiescing the TPM. Two new kexec critical data records are defined" * tag 'integrity-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity: ima: do not copy measurement list to kdump kernel ima: measure kexec load and exec events as critical data ima: make the kexec extra memory configurable ima: verify if the segment size has changed ima: kexec: move IMA log copy from kexec load to execute ima: kexec: define functions to copy IMA log at soft boot ima: kexec: skip IMA segment validation after kexec soft reboot kexec: define functions to map and unmap segments ima: define and call ima_alloc_kexec_file_buf() ima: rename variable the seq_file "file" to "ima_kexec_file"
2025-05-28Merge tag 'Smack-for-6.16' of https://github.com/cschaufler/smack-nextLinus Torvalds1-7/+5
Pull smack update from Casey Schaufler: "One trivial kernel doc fix" * tag 'Smack-for-6.16' of https://github.com/cschaufler/smack-next: security/smack/smackfs: small kernel-doc fixes
2025-05-28Merge tag 'hardening-v6.16-rc1' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening updates from Kees Cook: - Update overflow helpers to ease refactoring of on-stack flex array instances (Gustavo A. R. Silva, Kees Cook) - lkdtm: Use SLAB_NO_MERGE instead of constructors (Harry Yoo) - Simplify CONFIG_CC_HAS_COUNTED_BY (Jan Hendrik Farr) - Disable u64 usercopy KUnit test on 32-bit SPARC (Thomas Weißschuh) - Add missed designated initializers now exposed by fixed randstruct (Nathan Chancellor, Kees Cook) - Document compilers versions for __builtin_dynamic_object_size - Remove ARM_SSP_PER_TASK GCC plugin - Fix GCC plugin randstruct, add selftests, and restore COMPILE_TEST builds - Kbuild: induce full rebuilds when dependencies change with GCC plugins, the Clang sanitizer .scl file, or the randstruct seed. - Kbuild: Switch from -Wvla to -Wvla-larger-than=1 - Correct several __nonstring uses for -Wunterminated-string-initialization * tag 'hardening-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (23 commits) Revert "hardening: Disable GCC randstruct for COMPILE_TEST" lib/tests: randstruct: Add deep function pointer layout test lib/tests: Add randstruct KUnit test randstruct: gcc-plugin: Remove bogus void member net: qede: Initialize qede_ll_ops with designated initializer scsi: qedf: Use designated initializer for struct qed_fcoe_cb_ops md/bcache: Mark __nonstring look-up table integer-wrap: Force full rebuild when .scl file changes randstruct: Force full rebuild when seed changes gcc-plugins: Force full rebuild when plugins change kbuild: Switch from -Wvla to -Wvla-larger-than=1 hardening: simplify CONFIG_CC_HAS_COUNTED_BY overflow: Fix direct struct member initialization in _DEFINE_FLEX() kunit/overflow: Add tests for STACK_FLEX_ARRAY_SIZE() helper overflow: Add STACK_FLEX_ARRAY_SIZE() helper input/joystick: magellan: Mark __nonstring look-up table const watchdog: exar: Shorten identity name to fit correctly mod_devicetable: Enlarge the maximum platform_device_id name length overflow: Clarify expectations for getting DEFINE_FLEX variable sizes compiler_types: Identify compiler versions for __builtin_dynamic_object_size ...
2025-05-28ipe: add errno field to IPE policy load auditingJasjiv Singh4-26/+63
Users of IPE require a way to identify when and why an operation fails, allowing them to both respond to violations of policy and be notified of potentially malicious actions on their systems with respect to IPE. This patch introduces a new error field to the AUDIT_IPE_POLICY_LOAD event to log policy loading failures. Currently, IPE only logs successful policy loads, but not failures. Tracking failures is crucial to detect malicious attempts and ensure a complete audit trail for security events. The new error field will capture the following error codes: * -ENOKEY: Key used to sign the IPE policy not found in the keyring * -ESTALE: Attempting to update an IPE policy with an older version * -EKEYREJECTED: IPE signature verification failed * -ENOENT: Policy was deleted while updating * -EEXIST: Same name policy already deployed * -ERANGE: Policy version number overflow * -EINVAL: Policy version parsing error * -EPERM: Insufficient permission * -ENOMEM: Out of memory (OOM) * -EBADMSG: Policy is invalid Here are some examples of the updated audit record types: AUDIT_IPE_POLICY_LOAD(1422): audit: AUDIT1422 policy_name="Test_Policy" policy_version=0.0.1 policy_digest=sha256:84EFBA8FA71E62AE0A537FAB962F8A2BD1053964C4299DCA 92BFFF4DB82E86D3 auid=1000 ses=3 lsm=ipe res=1 errno=0 The above record shows a new policy has been successfully loaded into the kernel with the policy name, version, and hash with the errno=0. AUDIT_IPE_POLICY_LOAD(1422) with error: audit: AUDIT1422 policy_name=? policy_version=? policy_digest=? auid=1000 ses=3 lsm=ipe res=0 errno=-74 The above record shows a policy load failure due to an invalid policy (-EBADMSG). By adding this error field, we ensure that all policy load attempts, whether successful or failed, are logged, providing a comprehensive audit trail for IPE policy management. Signed-off-by: Jasjiv Singh <jasjivsingh@linux.microsoft.com> Signed-off-by: Fan Wu <wufan@kernel.org>
2025-05-26Merge tag 'vfs-6.16-rc1.async.dir' of ↵Linus Torvalds3-5/+5
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs directory lookup updates from Christian Brauner: "This contains cleanups for the lookup_one*() family of helpers. We expose a set of functions with names containing "lookup_one_len" and others without the "_len". This difference has nothing to do with "len". It's rater a historical accident that can be confusing. The functions without "_len" take a "mnt_idmap" pointer. This is found in the "vfsmount" and that is an important question when choosing which to use: do you have a vfsmount, or are you "inside" the filesystem. A related question is "is permission checking relevant here?". nfsd and cachefiles *do* have a vfsmount but *don't* use the non-_len functions. They pass nop_mnt_idmap and refuse to work on filesystems which have any other idmap. This work changes nfsd and cachefile to use the lookup_one family of functions and to explictily pass &nop_mnt_idmap which is consistent with all other vfs interfaces used where &nop_mnt_idmap is explicitly passed. The remaining uses of the "_one" functions do not require permission checks so these are renamed to be "_noperm" and the permission checking is removed. This series also changes these lookup function to take a qstr instead of separate name and len. In many cases this simplifies the call" * tag 'vfs-6.16-rc1.async.dir' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: VFS: change lookup_one_common and lookup_noperm_common to take a qstr Use try_lookup_noperm() instead of d_hash_and_lookup() outside of VFS VFS: rename lookup_one_len family to lookup_noperm and remove permission check cachefiles: Use lookup_one() rather than lookup_one_len() nfsd: Use lookup_one() rather than lookup_one_len() VFS: improve interface for lookup_one functions
2025-05-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski3-5/+35
Cross-merge networking fixes after downstream PR (net-6.15-rc8). Conflicts: 80f2ab46c2ee ("irdma: free iwdev->rf after removing MSI-X") 4bcc063939a5 ("ice, irdma: fix an off by one in error handling code") c24a65b6a27c ("iidc/ice/irdma: Update IDC to support multiple consumers") https://lore.kernel.org/20250513130630.280ee6c5@canb.auug.org.au No extra adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-20security/smack/smackfs: small kernel-doc fixesRandy Dunlap1-7/+5
Add function short descriptions to the kernel-doc where missing. Correct a verb and add ending periods to sentences. smackfs.c:1080: warning: missing initial short description on line: * smk_net4addr_insert smackfs.c:1343: warning: missing initial short description on line: * smk_net6addr_insert Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Casey Schaufler <casey@schaufler-ca.com> Cc: linux-security-module@vger.kernel.org Cc: Paul Moore <paul@paul-moore.com> Cc: James Morris <jmorris@namei.org> Cc: "Serge E. Hallyn" <serge@hallyn.com>
2025-05-14ima: do not copy measurement list to kdump kernelSteven Chen1-0/+3
Kdump kernel doesn't need IMA to do integrity measurement. Hence the measurement list in 1st kernel doesn't need to be copied to kdump kernel. Here skip allocating buffer for measurement list copying if loading kdump kernel. Then there won't be the later handling related to ima_kexec_buffer. Signed-off-by: Steven Chen <chenste@linux.microsoft.com> Tested-by: Baoquan He <bhe@redhat.com> Acked-by: Baoquan He <bhe@redhat.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2025-05-12landlock: Improve bit operations in audit codeMickaël Salaün3-4/+34
Use the BIT() and BIT_ULL() macros in the new audit code instead of explicit shifts to improve readability. Use bitmask instead of modulo operation to simplify code. Add test_range1_rand15() and test_range2_rand15() KUnit tests to improve get_id_range() coverage. Signed-off-by: Günther Noack <gnoack3000@gmail.com> Link: https://lore.kernel.org/r/20250512093732.1408485-1-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-05-08Revert "hardening: Disable GCC randstruct for COMPILE_TEST"Kees Cook1-1/+1
This reverts commit f5c68a4e84f9feca3be578199ec648b676db2030. It is again possible to build "allmodconfig" with the randstruct GCC plugin, so enable it for COMPILE_TEST to catch future bugs. Signed-off-by: Kees Cook <kees@kernel.org>
2025-05-03landlock: Remove KUnit test that triggers a warningMickaël Salaün1-1/+1
A KUnit test checking boundaries triggers a canary warning, which may be disturbing. Let's remove this test for now. Hopefully, KUnit will soon get support for suppressing warning backtraces [1]. Cc: Alessandro Carminati <acarmina@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Günther Noack <gnoack@google.com> Reported-by: Tingmao Wang <m@maowtm.org> Closes: https://lore.kernel.org/r/20250327213807.12964-1-m@maowtm.org Link: https://lore.kernel.org/r/20250425193249.78b45d2589575c15f483c3d8@linux-foundation.org [1] Link: https://lore.kernel.org/r/20250503065359.3625407-1-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-05-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski3-17/+16
Cross-merge networking fixes after downstream PR (net-6.15-rc5). No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-30Kbuild: remove structleak gcc pluginArnd Bergmann1-76/+0
gcc-12 and higher support the -ftrivial-auto-var-init= flag, after gcc-8 is the minimum version, this is half of the supported ones, and the vast majority of the versions that users are actually likely to have, so it seems like a good time to stop having the fallback plugin implementation Older toolchains are still able to build kernels normally without this plugin, but won't be able to use variable initialization.. Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-04-29ima: measure kexec load and exec events as critical dataSteven Chen3-0/+32
The amount of memory allocated at kexec load, even with the extra memory allocated, might not be large enough for the entire measurement list. The indeterminate interval between kexec 'load' and 'execute' could exacerbate this problem. Define two new IMA events, 'kexec_load' and 'kexec_execute', to be measured as critical data at kexec 'load' and 'execute' respectively. Report the allocated kexec segment size, IMA binary log size and the runtime measurements count as part of those events. These events, and the values reported through them, serve as markers in the IMA log to verify the IMA events are captured during kexec soft reboot. The presence of a 'kexec_load' event in between the last two 'boot_aggregate' events in the IMA log implies this is a kexec soft reboot, and not a cold-boot. And the absence of 'kexec_execute' event after kexec soft reboot implies missing events in that window which results in inconsistency with TPM PCR quotes, necessitating a cold boot for a successful remote attestation. These critical data events are displayed as hex encoded ascii in the ascii_runtime_measurement_list. Verifying the critical data hash requires calculating the hash of the decoded ascii string. For example, to verify the 'kexec_load' data hash: sudo cat /sys/kernel/security/integrity/ima/ascii_runtime_measurements | grep kexec_load | cut -d' ' -f 6 | xxd -r -p | sha256sum To verify the 'kexec_execute' data hash: sudo cat /sys/kernel/security/integrity/ima/ascii_runtime_measurements | grep kexec_execute | cut -d' ' -f 6 | xxd -r -p | sha256sum Co-developed-by: Tushar Sugandhi <tusharsu@linux.microsoft.com> Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com> Signed-off-by: Steven Chen <chenste@linux.microsoft.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Acked-by: Baoquan He <bhe@redhat.com> Tested-by: Stefan Berger <stefanb@linux.ibm.com> # ppc64/kvm Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2025-04-29ima: make the kexec extra memory configurableSteven Chen2-5/+22
The extra memory allocated for carrying the IMA measurement list across kexec is hard-coded as half a PAGE. Make it configurable. Define a Kconfig option, IMA_KEXEC_EXTRA_MEMORY_KB, to configure the extra memory (in kb) to be allocated for IMA measurements added during kexec soft reboot. Ensure the default value of the option is set such that extra half a page of memory for additional measurements is allocated for the additional measurements. Update ima_add_kexec_buffer() function to allocate memory based on the Kconfig option value, rather than the currently hard-coded one. Suggested-by: Stefan Berger <stefanb@linux.ibm.com> Co-developed-by: Tushar Sugandhi <tusharsu@linux.microsoft.com> Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com> Signed-off-by: Steven Chen <chenste@linux.microsoft.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Acked-by: Baoquan He <bhe@redhat.com> Tested-by: Stefan Berger <stefanb@linux.ibm.com> # ppc64/kvm Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2025-04-29ima: verify if the segment size has changedSteven Chen1-0/+10
kexec 'load' may be called multiple times. Free and realloc the buffer only if the segment_size is changed from the previous kexec 'load' call. Signed-off-by: Steven Chen <chenste@linux.microsoft.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Acked-by: Baoquan He <bhe@redhat.com> Tested-by: Stefan Berger <stefanb@linux.ibm.com> # ppc64/kvm Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2025-04-29ima: kexec: move IMA log copy from kexec load to executeSteven Chen1-14/+29
The IMA log is currently copied to the new kernel during kexec 'load' using ima_dump_measurement_list(). However, the IMA measurement list copied at kexec 'load' may result in loss of IMA measurements records that only occurred after the kexec 'load'. Move the IMA measurement list log copy from kexec 'load' to 'execute' Make the kexec_segment_size variable a local static variable within the file, so it can be accessed during both kexec 'load' and 'execute'. Define kexec_post_load() as a wrapper for calling ima_kexec_post_load() and machine_kexec_post_load(). Replace the existing direct call to machine_kexec_post_load() with kexec_post_load(). When there is insufficient memory to copy all the measurement logs, copy as much of the measurement list as possible. Co-developed-by: Tushar Sugandhi <tusharsu@linux.microsoft.com> Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Baoquan He <bhe@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Dave Young <dyoung@redhat.com> Signed-off-by: Steven Chen <chenste@linux.microsoft.com> Tested-by: Stefan Berger <stefanb@linux.ibm.com> # ppc64/kvm Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2025-04-29ima: kexec: define functions to copy IMA log at soft bootSteven Chen1-0/+47
The IMA log is currently copied to the new kernel during kexec 'load' using ima_dump_measurement_list(). However, the log copied at kexec 'load' may result in loss of IMA measurements that only occurred after kexec "load'. Setup the needed infrastructure to move the IMA log copy from kexec 'load' to 'execute'. Define a new IMA hook ima_update_kexec_buffer() as a stub function. It will be used to call ima_dump_measurement_list() during kexec 'execute'. Implement ima_kexec_post_load() function to be invoked after the new Kernel image has been loaded for kexec. ima_kexec_post_load() maps the IMA buffer to a segment in the newly loaded Kernel. It also registers the reboot notifier_block to trigger ima_update_kexec_buffer() at kexec 'execute'. Set the priority of register_reboot_notifier to INT_MIN to ensure that the IMA log copy operation will happen at the end of the operation chain, so that all the IMA measurement records extended into the TPM are copied Co-developed-by: Tushar Sugandhi <tusharsu@linux.microsoft.com> Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Baoquan He <bhe@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Dave Young <dyoung@redhat.com> Signed-off-by: Steven Chen <chenste@linux.microsoft.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Acked-by: Baoquan He <bhe@redhat.com> Tested-by: Stefan Berger <stefanb@linux.ibm.com> # ppc64/kvm Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2025-04-29ima: kexec: skip IMA segment validation after kexec soft rebootSteven Chen1-0/+3
Currently, the function kexec_calculate_store_digests() calculates and stores the digest of the segment during the kexec_file_load syscall, where the IMA segment is also allocated. Later, the IMA segment will be updated with the measurement log at the kexec execute stage when a kexec reboot is initiated. Therefore, the digests should be updated for the IMA segment in the normal case. The problem is that the content of memory segments carried over to the new kernel during the kexec systemcall can be changed at kexec 'execute' stage, but the size and the location of the memory segments cannot be changed at kexec 'execute' stage. To address this, skip the calculation and storage of the digest for the IMA segment in kexec_calculate_store_digests() so that it is not added to the purgatory_sha_regions. With this change, the IMA segment is not included in the digest calculation, storage, and verification. Cc: Eric Biederman <ebiederm@xmission.com> Cc: Baoquan He <bhe@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Dave Young <dyoung@redhat.com> Co-developed-by: Tushar Sugandhi <tusharsu@linux.microsoft.com> Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com> Signed-off-by: Steven Chen <chenste@linux.microsoft.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Acked-by: Baoquan He <bhe@redhat.com> Tested-by: Stefan Berger <stefanb@linux.ibm.com> # ppc64/kvm [zohar@linux.ibm.com: Fixed Signed-off-by tag to match author's email ] Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2025-04-29ima: define and call ima_alloc_kexec_file_buf()Steven Chen1-11/+35
In the current implementation, the ima_dump_measurement_list() API is called during the kexec "load" phase, where a buffer is allocated and the measurement records are copied. Due to this, new events added after kexec load but before kexec execute are not carried over to the new kernel during kexec operation Carrying the IMA measurement list across kexec requires allocating a buffer and copying the measurement records. Separate allocating the buffer and copying the measurement records into separate functions in order to allocate the buffer at kexec 'load' and copy the measurements at kexec 'execute'. After moving the vfree() here at this stage in the patch set, the IMA measurement list fails to verify when doing two consecutive "kexec -s -l" with/without a "kexec -s -u" in between. Only after "ima: kexec: move IMA log copy from kexec load to execute" the IMA measurement list verifies properly with the vfree() here. Co-developed-by: Tushar Sugandhi <tusharsu@linux.microsoft.com> Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com> Signed-off-by: Steven Chen <chenste@linux.microsoft.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Acked-by: Baoquan He <bhe@redhat.com> Tested-by: Stefan Berger <stefanb@linux.ibm.com> # ppc64/kvm Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2025-04-29ima: rename variable the seq_file "file" to "ima_kexec_file"Steven Chen1-15/+16
Before making the function local seq_file "file" variable file static global, rename it to "ima_kexec_file". Signed-off-by: Steven Chen <chenste@linux.microsoft.com> Acked-by: Baoquan He <bhe@redhat.com> Tested-by: Stefan Berger <stefanb@linux.ibm.com> # ppc64/kvm Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2025-04-24Merge tag 'landlock-6.15-rc4' of ↵Linus Torvalds3-17/+16
git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull landlock fixes from Mickaël Salaün: "Fix some Landlock audit issues, add related tests, and updates documentation" * tag 'landlock-6.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: landlock: Update log documentation landlock: Fix documentation for landlock_restrict_self(2) landlock: Fix documentation for landlock_create_ruleset(2) selftests/landlock: Add PID tests for audit records selftests/landlock: Factor out audit fixture in audit_test landlock: Log the TGID of the domain creator landlock: Remove incorrect warning
2025-04-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2-2/+4
Cross-merge networking fixes after downstream PR (net-6.15-rc4). This pull includes wireless and a fix to vxlan which isn't in Linus's tree just yet. The latter creates with a silent conflict / build breakage, so merging it now to avoid causing problems. drivers/net/vxlan/vxlan_vnifilter.c 094adad91310 ("vxlan: Use a single lock to protect the FDB table") 087a9eb9e597 ("vxlan: vnifilter: Fix unlocked deletion of default FDB entry") https://lore.kernel.org/20250423145131.513029-1-idosch@nvidia.com No "normal" conflicts, or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-22lsm: Move security_netlink_send to under CONFIG_SECURITY_NETWORKSong Liu1-18/+18
security_netlink_send() is a networking hook, so it fits better under CONFIG_SECURITY_NETWORK. Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-04-22ima: process_measurement() needlessly takes inode_lock() on MAY_READFrederick Lawler1-1/+3
On IMA policy update, if a measure rule exists in the policy, IMA_MEASURE is set for ima_policy_flags which makes the violation_check variable always true. Coupled with a no-action on MAY_READ for a FILE_CHECK call, we're always taking the inode_lock(). This becomes a performance problem for extremely heavy read-only workloads. Therefore, prevent this only in the case there's no action to be taken. Signed-off-by: Frederick Lawler <fred@cloudflare.com> Acked-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2025-04-17landlock: Fix documentation for landlock_restrict_self(2)Mickaël Salaün1-6/+6
Fix, deduplicate, and improve rendering of landlock_restrict_self(2)'s flags documentation. The flags are now rendered like the syscall's parameters and description. Cc: Günther Noack <gnoack@google.com> Cc: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20250416154716.1799902-2-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-04-17landlock: Fix documentation for landlock_create_ruleset(2)Mickaël Salaün1-8/+7
Move and fix the flags documentation, and improve formatting. It makes more sense and it eases maintenance to document syscall flags in landlock.h, where they are defined. This is already the case for landlock_restrict_self(2)'s flags. The flags are now rendered like the syscall's parameters and description. Cc: Günther Noack <gnoack@google.com> Cc: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20250416154716.1799902-1-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-04-15hardening: Disable GCC randstruct for COMPILE_TESTKees Cook1-1/+1
There is a GCC crash bug in the randstruct for latest GCC versions that is being tickled by landlock[1]. Temporarily disable GCC randstruct for COMPILE_TEST builds to unbreak CI systems for the coming -rc2. This can be restored once the bug is fixed. Suggested-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/all/20250407-kbuild-disable-gcc-plugins-v1-1-5d46ae583f5e@kernel.org/ [1] Acked-by: Mark Brown <broonie@kernel.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20250409151154.work.872-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-04-12selinux: fix the kdoc header for task_avdcache_updatePaul Moore1-1/+1
The kdoc header incorrectly references an older parameter, update it to reference what is currently used in the function. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202504122308.Ch8PzJdD-lkp@intel.com/ Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-04-12selinux: remove a duplicated includePaul Moore1-1/+0
The "linux/parser.h" header was included twice, we only need it once. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202504121945.Q0GDD0sG-lkp@intel.com/ Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-04-12net: Retire DCCP socket.Kuniyuki Iwashima5-70/+2
DCCP was orphaned in 2021 by commit 054c4610bd05 ("MAINTAINERS: dccp: move Gerrit Renker to CREDITS"), which noted that the last maintainer had been inactive for five years. In recent years, it has become a playground for syzbot, and most changes to DCCP have been odd bug fixes triggered by syzbot. Apart from that, the only changes have been driven by treewide or networking API updates or adjustments related to TCP. Thus, in 2023, we announced we would remove DCCP in 2025 via commit b144fcaf46d4 ("dccp: Print deprecation notice."). Since then, only one individual has contacted the netdev mailing list. [0] There is ongoing research for Multipath DCCP. The repository is hosted on GitHub [1], and development is not taking place through the upstream community. While the repository is published under the GPLv2 license, the scheduling part remains proprietary, with a LICENSE file [2] stating: "This is not Open Source software." The researcher mentioned a plan to address the licensing issue, upstream the patches, and step up as a maintainer, but there has been no further communication since then. Maintaining DCCP for a decade without any real users has become a burden. Therefore, it's time to remove it. Removing DCCP will also provide significant benefits to TCP. It allows us to freely reorganize the layout of struct inet_connection_sock, which is currently shared with DCCP, and optimize it to reduce the number of cachelines accessed in the TCP fast path. Note that we keep DCCP netfilter modules as requested. [3] Link: https://lore.kernel.org/netdev/20230710182253.81446-1-kuniyu@amazon.com/T/#u #[0] Link: https://github.com/telekom/mp-dccp #[1] Link: https://github.com/telekom/mp-dccp/blob/mpdccp_v03_k5.10/net/dccp/non_gpl_scheduler/LICENSE #[2] Link: https://lore.kernel.org/netdev/Z_VQ0KlCRkqYWXa-@calendula/ #[3] Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by: Paul Moore <paul@paul-moore.com> (LSM and SELinux) Acked-by: Casey Schaufler <casey@schaufler-ca.com> Link: https://patch.msgid.link/20250410023921.11307-3-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-11selinux: reduce path walk overheadPaul Moore2-54/+185
Reduce the SELinux performance overhead during path walks through the use of a per-task directory access cache and some minor code optimizations. The directory access cache is per-task because it allows for a lockless cache while also fitting well with a common application pattern of heavily accessing a relatively small number of SELinux directory labels. The cache is inherited by child processes when the child runs with the same SELinux domain as the parent, and invalidated on changes to the task's SELinux domain or the loaded SELinux policy. A cache of four entries was chosen based on testing with the Fedora "targeted" policy, a SELinux Reference Policy variant, and 'make allmodconfig' on Linux v6.14. Code optimizations include better use of inline functions to reduce function calls in the common case, especially in the inode revalidation code paths, and elimination of redundant checks between the LSM and SELinux layers. As mentioned briefly above, aside from general use and regression testing with the selinux-testsuite, performance was measured using 'make allmodconfig' with Linux v6.14 as a base reference. As expected, there were variations from one test run to another, but the measurements below are a good representation of the test results seen on my test system. * Linux v6.14 REF 1.26% [k] __d_lookup_rcu SELINUX (1.31%) 0.58% [k] selinux_inode_permission 0.29% [k] avc_lookup 0.25% [k] avc_has_perm_noaudit 0.19% [k] __inode_security_revalidate * Linux v6.14 + patch REF 1.41% [k] __d_lookup_rcu SELINUX (0.89%) 0.65% [k] selinux_inode_permission 0.15% [k] avc_lookup 0.05% [k] avc_has_perm_noaudit 0.04% [k] avc_policy_seqno X.XX% [k] __inode_security_revalidate (now inline) In both cases the __d_lookup_rcu() function was used as a reference point to establish a context for the SELinux related functions. On a unpatched Linux v6.14 system we see the time spent in the combined SELinux functions exceeded that of __d_lookup_rcu(), 1.31% compared to 1.26%. However, with this patch applied the time spent in the combined SELinux functions dropped to roughly 65% of the time spent in __d_lookup_rcu(), 0.89% compared to 1.41%. Aside from the significant decrease in time spent in the SELinux AVC, it appears that any additional time spent searching and updating the cache is offset by other code improvements, e.g. time spent in selinux_inode_permission() + __inode_security_revalidate() + avc_policy_seqno() is less on the patched kernel than the unpatched kernel. It is worth noting that in this patch the use of the per-task cache is limited to the security_inode_permission() LSM callback, selinux_inode_permission(), but future work could expand the cache into inode_has_perm(), likely through consolidation of the two functions. While this would likely have little to no impact on path walks, it may benefit other operations. Reviewed-by: Stephen Smalley <stephen.smalley.work@gmail.com> Tested-by: Stephen Smalley <stephen.smalley.work@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>