summaryrefslogtreecommitdiff
path: root/security
AgeCommit message (Collapse)AuthorFilesLines
2020-05-27Merge branch 'for-5.7-fixes' of ↵Linus Torvalds2-4/+17
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: - Reverted stricter synchronization for cgroup recursive stats which was prepping it for event counter usage which never got merged. The change was causing performation regressions in some cases. - Restore bpf-based device-cgroup operation even when cgroup1 device cgroup is disabled. - An out-param init fix. * 'for-5.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: device_cgroup: Cleanup cgroup eBPF device filter code xattr: fix uninitialized out-param Revert "cgroup: Add memory barriers to plug cgroup_rstat_updated() race window"
2020-05-27Merge branch 'exec-linus' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull execve fix from Eric Biederman: "While working on my exec cleanups I found a bug in exec that winds up miscomputing the ambient credentials during exec. Andy appears to have to been confused as to why credentials are computed for both the script and the interpreter From the original patch description: [3] Linux very confusingly processes both the script and the interpreter if applicable, for reasons that elude me. The results from thinking about a script's file capabilities and/or setuid bits are mostly discarded. The only value in struct cred that gets changed in cap_bprm_set_creds that I could find that might persist between the script and the interpreter was cap_ambient. Which is fixed with this trivial change" * 'exec-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: exec: Always set cap_ambient in cap_bprm_set_creds
2020-05-26exec: Always set cap_ambient in cap_bprm_set_credsEric W. Biederman1-0/+1
An invariant of cap_bprm_set_creds is that every field in the new cred structure that cap_bprm_set_creds might set, needs to be set every time to ensure the fields does not get a stale value. The field cap_ambient is not set every time cap_bprm_set_creds is called, which means that if there is a suid or sgid script with an interpreter that has neither the suid nor the sgid bits set the interpreter should be able to accept ambient credentials. Unfortuantely because cap_ambient is not reset to it's original value the interpreter can not accept ambient credentials. Given that the ambient capability set is expected to be controlled by the caller, I don't think this is particularly serious. But it is definitely worth fixing so the code works correctly. I have tested to verify my reading of the code is correct and the interpreter of a sgid can receive ambient capabilities with this change and cannot receive ambient capabilities without this change. Cc: stable@vger.kernel.org Cc: Andy Lutomirski <luto@kernel.org> Fixes: 58319057b784 ("capabilities: ambient capabilities") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-05-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds1-2/+14
Pull networking fixes from David Miller: 1) Fix RCU warnings in ipv6 multicast router code, from Madhuparna Bhowmik. 2) Nexthop attributes aren't being checked properly because of mis-initialized iterator, from David Ahern. 3) Revert iop_idents_reserve() change as it caused performance regressions and was just working around what is really a UBSAN bug in the compiler. From Yuqi Jin. 4) Read MAC address properly from ROM in bmac driver (double iteration proceeds past end of address array), from Jeremy Kerr. 5) Add Microsoft Surface device IDs to r8152, from Marc Payne. 6) Prevent reference to freed SKB in __netif_receive_skb_core(), from Boris Sukholitko. 7) Fix ACK discard behavior in rxrpc, from David Howells. 8) Preserve flow hash across packet scrubbing in wireguard, from Jason A. Donenfeld. 9) Cap option length properly for SO_BINDTODEVICE in AX25, from Eric Dumazet. 10) Fix encryption error checking in kTLS code, from Vadim Fedorenko. 11) Missing BPF prog ref release in flow dissector, from Jakub Sitnicki. 12) dst_cache must be used with BH disabled in tipc, from Eric Dumazet. 13) Fix use after free in mlxsw driver, from Jiri Pirko. 14) Order kTLS key destruction properly in mlx5 driver, from Tariq Toukan. 15) Check devm_platform_ioremap_resource() return value properly in several drivers, from Tiezhu Yang. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (71 commits) net: smsc911x: Fix runtime PM imbalance on error net/mlx4_core: fix a memory leak bug. net: ethernet: ti: cpsw: fix ASSERT_RTNL() warning during suspend net: phy: mscc: fix initialization of the MACsec protocol mode net: stmmac: don't attach interface until resume finishes net: Fix return value about devm_platform_ioremap_resource() net/mlx5: Fix error flow in case of function_setup failure net/mlx5e: CT: Correctly get flow rule net/mlx5e: Update netdev txq on completions during closure net/mlx5: Annotate mutex destroy for root ns net/mlx5: Don't maintain a case of del_sw_func being null net/mlx5: Fix cleaning unmanaged flow tables net/mlx5: Fix memory leak in mlx5_events_init net/mlx5e: Fix inner tirs handling net/mlx5e: kTLS, Destroy key object after destroying the TIS net/mlx5e: Fix allowed tc redirect merged eswitch offload cases net/mlx5: Avoid processing commands before cmdif is ready net/mlx5: Fix a race when moving command interface to events mode net/mlx5: Add command entry handling completion rxrpc: Fix a memory leak in rxkad_verify_response() ...
2020-05-22apparmor: Fix use-after-free in aa_audit_rule_initNavid Emamdoost1-1/+2
In the implementation of aa_audit_rule_init(), when aa_label_parse() fails the allocated memory for rule is released using aa_audit_rule_free(). But after this release, the return statement tries to access the label field of the rule which results in use-after-free. Before releasing the rule, copy errNo and return it after release. Fixes: 52e8c38001d8 ("apparmor: Fix memory leak of rule on error exit path") Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
2020-05-22apparmor: Fix aa_label refcnt leak in policy_updateXiyu Yang1-1/+2
policy_update() invokes begin_current_label_crit_section(), which returns a reference of the updated aa_label object to "label" with increased refcount. When policy_update() returns, "label" becomes invalid, so the refcount should be decreased to keep refcount balanced. The reference counting issue happens in one exception handling path of policy_update(). When aa_may_manage_policy() returns not NULL, the refcnt increased by begin_current_label_crit_section() is not decreased, causing a refcnt leak. Fix this issue by jumping to "end_section" label when aa_may_manage_policy() returns not NULL. Fixes: 5ac8c355ae00 ("apparmor: allow introspecting the loaded policy pre internal transform") Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
2020-05-22apparmor: fix potential label refcnt leak in aa_change_profileXiyu Yang1-2/+1
aa_change_profile() invokes aa_get_current_label(), which returns a reference of the current task's label. According to the comment of aa_get_current_label(), the returned reference must be put with aa_put_label(). However, when the original object pointed by "label" becomes unreachable because aa_change_profile() returns or a new object is assigned to "label", reference count increased by aa_get_current_label() is not decreased, causing a refcnt leak. Fix this by calling aa_put_label() before aa_change_profile() return and dropping unnecessary aa_get_current_label(). Fixes: 9fcf78cca198 ("apparmor: update domain transitions that are subsets of confinement at nnp") Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
2020-05-21security: Fix hook iteration for secid_to_secctxKP Singh1-2/+14
secid_to_secctx is not stackable, and since the BPF LSM registers this hook by default, the call_int_hook logic is not suitable which "bails-on-fail" and casues issues when other LSMs register this hook and eventually breaks Audit. In order to fix this, directly iterate over the security hooks instead of using call_int_hook as suggested in: https: //lore.kernel.org/bpf/9d0eb6c6-803a-ff3a-5603-9ad6d9edfc00@schaufler-ca.com/#t Fixes: 98e828a0650f ("security: Refactor declaration of LSM hooks") Fixes: 625236ba3832 ("security: Fix the default value of secid_to_secctx hook") Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: James Morris <jamorris@linux.microsoft.com> Link: https://lore.kernel.org/bpf/20200520125616.193765-1-kpsingh@chromium.org
2020-05-18Merge branch 'fixes' of ↵Linus Torvalds5-34/+40
git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity Pull integrity fixes from Mimi Zohar: "A couple of miscellaneous bug fixes for the integrity subsystem: IMA: - Properly modify the open flags in order to calculate the file hash. - On systems requiring the IMA policy to be signed, the policy is loaded differently. Don't differentiate between "enforce" and either "log" or "fix" modes how the policy is loaded. EVM: - Two patches to fix an EVM race condition, normally the result of attempting to load an unsupported hash algorithm. - Use the lockless RCU version for walking an append only list" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity: evm: Fix a small race in init_desc() evm: Fix RCU list related warnings ima: Fix return value of ima_write_policy() evm: Check also if *tfm is an error pointer in init_desc() ima: Set file->f_mode instead of file->f_flags in ima_calc_file_hash()
2020-05-15evm: Fix a small race in init_desc()Dan Carpenter1-22/+22
The IS_ERR_OR_NULL() function has two conditions and if we got really unlucky we could hit a race where "ptr" started as an error pointer and then was set to NULL. Both conditions would be false even though the pointer at the end was NULL. This patch fixes the problem by ensuring that "*tfm" can only be NULL or valid. I have introduced a "tmp_tfm" variable to make that work. I also reversed a condition and pulled the code in one tab. Reported-by: Roberto Sassu <roberto.sassu@huawei.com> Fixes: 53de3b080d5e ("evm: Check also if *tfm is an error pointer in init_desc()") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Roberto Sassu <roberto.sassu@huawei.com> Acked-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-05-08evm: Fix RCU list related warningsMadhuparna Bhowmik3-4/+11
This patch fixes the following warning and few other instances of traversal of evm_config_xattrnames list: [ 32.848432] ============================= [ 32.848707] WARNING: suspicious RCU usage [ 32.848966] 5.7.0-rc1-00006-ga8d5875ce5f0b #1 Not tainted [ 32.849308] ----------------------------- [ 32.849567] security/integrity/evm/evm_main.c:231 RCU-list traversed in non-reader section!! Since entries are only added to the list and never deleted, use list_for_each_entry_lockless() instead of list_for_each_entry_rcu for traversing the list. Also, add a relevant comment in evm_secfs.c to indicate this fact. Reported-by: kernel test robot <lkp@intel.com> Suggested-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com> Acked-by: Paul E. McKenney <paulmck@kernel.org> (RCU viewpoint) Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-05-08ima: Fix return value of ima_write_policy()Roberto Sassu1-2/+1
This patch fixes the return value of ima_write_policy() when a new policy is directly passed to IMA and the current policy requires appraisal of the file containing the policy. Currently, if appraisal is not in ENFORCE mode, ima_write_policy() returns 0 and leads user space applications to an endless loop. Fix this issue by denying the operation regardless of the appraisal mode. Cc: stable@vger.kernel.org # 4.10.x Fixes: 19f8a84713edc ("ima: measure and appraise the IMA policy itself") Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Reviewed-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-05-08evm: Check also if *tfm is an error pointer in init_desc()Roberto Sassu1-1/+1
This patch avoids a kernel panic due to accessing an error pointer set by crypto_alloc_shash(). It occurs especially when there are many files that require an unsupported algorithm, as it would increase the likelihood of the following race condition: Task A: *tfm = crypto_alloc_shash() <= error pointer Task B: if (*tfm == NULL) <= *tfm is not NULL, use it Task B: rc = crypto_shash_init(desc) <= panic Task A: *tfm = NULL This patch uses the IS_ERR_OR_NULL macro to determine whether or not a new crypto context must be created. Cc: stable@vger.kernel.org Fixes: d46eb3699502b ("evm: crypto hash replaced by shash") Co-developed-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com> Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com> Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-05-08ima: Set file->f_mode instead of file->f_flags in ima_calc_file_hash()Roberto Sassu1-6/+6
Commit a408e4a86b36 ("ima: open a new file instance if no read permissions") tries to create a new file descriptor to calculate a file digest if the file has not been opened with O_RDONLY flag. However, if a new file descriptor cannot be obtained, it sets the FMODE_READ flag to file->f_flags instead of file->f_mode. This patch fixes this issue by replacing f_flags with f_mode as it was before that commit. Cc: stable@vger.kernel.org # 4.20.x Fixes: a408e4a86b36 ("ima: open a new file instance if no read permissions") Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-05-01Merge tag 'selinux-pr-20200430' of ↵Linus Torvalds2-26/+46
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull SELinux fixes from Paul Moore: "Two more SELinux patches to fix problems in the v5.7-rcX releases. Wei Yongjun's patch fixes a return code in an error path, and my patch fixes a problem where we were not correctly applying access controls to all of the netlink messages in the netlink_send LSM hook" * tag 'selinux-pr-20200430' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: properly handle multiple messages in selinux_netlink_send() selinux: fix error return code in cond_read_list()
2020-04-30selinux: properly handle multiple messages in selinux_netlink_send()Paul Moore1-25/+45
Fix the SELinux netlink_send hook to properly handle multiple netlink messages in a single sk_buff; each message is parsed and subject to SELinux access control. Prior to this patch, SELinux only inspected the first message in the sk_buff. Cc: stable@vger.kernel.org Reported-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Stephen Smalley <stephen.smalley.work@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-04-28selinux: fix error return code in cond_read_list()Wei Yongjun1-1/+1
Fix to return negative error code -ENOMEM from the error handling case instead of 0, as done elsewhere in this function. Fixes: 60abd3181db2 ("selinux: convert cond_list to array") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-04-16Merge tag 'selinux-pr-20200416' of ↵Linus Torvalds1-4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull SELinux fix from Paul Moore: "One small SELinux fix to ensure we cleanup properly on an error condition" * tag 'selinux-pr-20200416' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: free str on error in str_read()
2020-04-16keys: Fix proc_keys_next to increase position indexVasily Averin1-0/+2
If seq_file .next function does not change position index, read after some lseek can generate unexpected output: $ dd if=/proc/keys bs=1 # full usual output 0f6bfdf5 I--Q--- 2 perm 3f010000 1000 1000 user 4af2f79ab8848d0a: 740 1fb91b32 I--Q--- 3 perm 1f3f0000 1000 65534 keyring _uid.1000: 2 27589480 I--Q--- 1 perm 0b0b0000 0 0 user invocation_id: 16 2f33ab67 I--Q--- 152 perm 3f030000 0 0 keyring _ses: 2 33f1d8fa I--Q--- 4 perm 3f030000 1000 1000 keyring _ses: 1 3d427fda I--Q--- 2 perm 3f010000 1000 1000 user 69ec44aec7678e5a: 740 3ead4096 I--Q--- 1 perm 1f3f0000 1000 65534 keyring _uid_ses.1000: 1 521+0 records in 521+0 records out 521 bytes copied, 0,00123769 s, 421 kB/s But a read after lseek in middle of last line results in the partial last line and then a repeat of the final line: $ dd if=/proc/keys bs=500 skip=1 dd: /proc/keys: cannot skip to specified offset g _uid_ses.1000: 1 3ead4096 I--Q--- 1 perm 1f3f0000 1000 65534 keyring _uid_ses.1000: 1 0+1 records in 0+1 records out 97 bytes copied, 0,000135035 s, 718 kB/s and a read after lseek beyond end of file results in the last line being shown: $ dd if=/proc/keys bs=1000 skip=1 # read after lseek beyond end of file dd: /proc/keys: cannot skip to specified offset 3ead4096 I--Q--- 1 perm 1f3f0000 1000 65534 keyring _uid_ses.1000: 1 0+1 records in 0+1 records out 76 bytes copied, 0,000119981 s, 633 kB/s See https://bugzilla.kernel.org/show_bug.cgi?id=206283 Fixes: 1f4aace60b0e ("fs/seq_file.c: simplify seq_file iteration code ...") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-16selinux: free str on error in str_read()Ondrej Mosnacek1-4/+4
In [see "Fixes:"] I missed the fact that str_read() may give back an allocated pointer even if it returns an error, causing a potential memory leak in filename_trans_read_one(). Fix this by making the function free the allocated string whenever it returns a non-zero value, which also makes its behavior more obvious and prevents repeating the same mistake in the future. Reported-by: coverity-bot <keescook+coverity-bot@chromium.org> Addresses-Coverity-ID: 1461665 ("Resource leaks") Fixes: c3a276111ea2 ("selinux: optimize storage of filename transitions") Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-04-13device_cgroup: Cleanup cgroup eBPF device filter codeOdin Ugedal2-4/+17
Original cgroup v2 eBPF code for filtering device access made it possible to compile with CONFIG_CGROUP_DEVICE=n and still use the eBPF filtering. Change commit 4b7d4d453fc4 ("device_cgroup: Export devcgroup_check_permission") reverted this, making it required to set it to y. Since the device filtering (and all the docs) for cgroup v2 is no longer a "device controller" like it was in v1, someone might compile their kernel with CONFIG_CGROUP_DEVICE=n. Then (for linux 5.5+) the eBPF filter will not be invoked, and all processes will be allowed access to all devices, no matter what the eBPF filter says. Signed-off-by: Odin Ugedal <odin@ugedal.com> Acked-by: Roman Gushchin <guro@fb.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2020-04-04Merge tag 'keys-fixes-20200329' of ↵Linus Torvalds8-52/+113
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull keyrings fixes from David Howells: "Here's a couple of patches that fix a circular dependency between holding key->sem and mm->mmap_sem when reading data from a key. One potential issue is that a filesystem looking to use a key inside, say, ->readpages() could deadlock if the key being read is the key that's required and the buffer the key is being read into is on a page that needs to be fetched. The case actually detected is a bit more involved - with a filesystem calling request_key() and locking the target keyring for write - which could be being read" * tag 'keys-fixes-20200329' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: KEYS: Avoid false positive ENOMEM error on key read KEYS: Don't write out to userspace while holding key semaphore
2020-04-03Merge tag 'spdx-5.7-rc1' of ↵Linus Torvalds3-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx Pull SPDX updates from Greg KH: "Here are three SPDX patches for 5.7-rc1. One fixes up the SPDX tag for a single driver, while the other two go through the tree and add SPDX tags for all of the .gitignore files as needed. Nothing too complex, but you will get a merge conflict with your current tree, that should be trivial to handle (one file modified by two things, one file deleted.) All three of these have been in linux-next for a while, with no reported issues other than the merge conflict" * tag 'spdx-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx: ASoC: MT6660: make spdxcheck.py happy .gitignore: add SPDX License Identifier .gitignore: remove too obvious comments
2020-04-03Merge branch 'next-integrity' of ↵Linus Torvalds19-34/+19
git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity Pull integrity updates from Mimi Zohar: "Just a couple of updates for linux-5.7: - A new Kconfig option to enable IMA architecture specific runtime policy rules needed for secure and/or trusted boot, as requested. - Some message cleanup (eg. pr_fmt, additional error messages)" * 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity: ima: add a new CONFIG for loading arch-specific policies integrity: Remove duplicate pr_fmt definitions IMA: Add log statements for failure conditions IMA: Update KBUILD_MODNAME for IMA files to ima
2020-04-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextLinus Torvalds5-16/+68
Pull networking updates from David Miller: "Highlights: 1) Fix the iwlwifi regression, from Johannes Berg. 2) Support BSS coloring and 802.11 encapsulation offloading in hardware, from John Crispin. 3) Fix some potential Spectre issues in qtnfmac, from Sergey Matyukevich. 4) Add TTL decrement action to openvswitch, from Matteo Croce. 5) Allow paralleization through flow_action setup by not taking the RTNL mutex, from Vlad Buslov. 6) A lot of zero-length array to flexible-array conversions, from Gustavo A. R. Silva. 7) Align XDP statistics names across several drivers for consistency, from Lorenzo Bianconi. 8) Add various pieces of infrastructure for offloading conntrack, and make use of it in mlx5 driver, from Paul Blakey. 9) Allow using listening sockets in BPF sockmap, from Jakub Sitnicki. 10) Lots of parallelization improvements during configuration changes in mlxsw driver, from Ido Schimmel. 11) Add support to devlink for generic packet traps, which report packets dropped during ACL processing. And use them in mlxsw driver. From Jiri Pirko. 12) Support bcmgenet on ACPI, from Jeremy Linton. 13) Make BPF compatible with RT, from Thomas Gleixnet, Alexei Starovoitov, and your's truly. 14) Support XDP meta-data in virtio_net, from Yuya Kusakabe. 15) Fix sysfs permissions when network devices change namespaces, from Christian Brauner. 16) Add a flags element to ethtool_ops so that drivers can more simply indicate which coalescing parameters they actually support, and therefore the generic layer can validate the user's ethtool request. Use this in all drivers, from Jakub Kicinski. 17) Offload FIFO qdisc in mlxsw, from Petr Machata. 18) Support UDP sockets in sockmap, from Lorenz Bauer. 19) Fix stretch ACK bugs in several TCP congestion control modules, from Pengcheng Yang. 20) Support virtual functiosn in octeontx2 driver, from Tomasz Duszynski. 21) Add region operations for devlink and use it in ice driver to dump NVM contents, from Jacob Keller. 22) Add support for hw offload of MACSEC, from Antoine Tenart. 23) Add support for BPF programs that can be attached to LSM hooks, from KP Singh. 24) Support for multiple paths, path managers, and counters in MPTCP. From Peter Krystad, Paolo Abeni, Florian Westphal, Davide Caratti, and others. 25) More progress on adding the netlink interface to ethtool, from Michal Kubecek" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2121 commits) net: ipv6: rpl_iptunnel: Fix potential memory leak in rpl_do_srh_inline cxgb4/chcr: nic-tls stats in ethtool net: dsa: fix oops while probing Marvell DSA switches net/bpfilter: remove superfluous testing message net: macb: Fix handling of fixed-link node net: dsa: ksz: Select KSZ protocol tag netdevsim: dev: Fix memory leak in nsim_dev_take_snapshot_write net: stmmac: add EHL 2.5Gbps PCI info and PCI ID net: stmmac: add EHL PSE0 & PSE1 1Gbps PCI info and PCI ID net: stmmac: create dwmac-intel.c to contain all Intel platform net: dsa: bcm_sf2: Support specifying VLAN tag egress rule net: dsa: bcm_sf2: Add support for matching VLAN TCI net: dsa: bcm_sf2: Move writing of CFP_DATA(5) into slicing functions net: dsa: bcm_sf2: Check earlier for FLOW_EXT and FLOW_MAC_EXT net: dsa: bcm_sf2: Disable learning for ASP port net: dsa: b53: Deny enslaving port 7 for 7278 into a bridge net: dsa: b53: Prevent tagged VLAN on port 7 for 7278 net: dsa: b53: Restore VLAN entries upon (re)configuration net: dsa: bcm_sf2: Fix overflow checks hv_netvsc: Remove unnecessary round_up for recv_completion_cnt ...
2020-04-01Merge tag 'selinux-pr-20200330' of ↵Linus Torvalds18-448/+448
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull SELinux updates from Paul Moore: "We've got twenty SELinux patches for the v5.7 merge window, the highlights are below: - Deprecate setting /sys/fs/selinux/checkreqprot to 1. This flag was originally created to deal with legacy userspace and the READ_IMPLIES_EXEC personality flag. We changed the default from 1 to 0 back in Linux v4.4 and now we are taking the next step of deprecating it, at some point in the future we will take the final step of rejecting 1. - Allow kernfs symlinks to inherit the SELinux label of the parent directory. In order to preserve backwards compatibility this is protected by the genfs_seclabel_symlinks SELinux policy capability. - Optimize how we store filename transitions in the kernel, resulting in some significant improvements to policy load times. - Do a better job calculating our internal hash table sizes which resulted in additional policy load improvements and likely general SELinux performance improvements as well. - Remove the unused initial SIDs (labels) and improve how we handle initial SIDs. - Enable per-file labeling for the bpf filesystem. - Ensure that we properly label NFS v4.2 filesystems to avoid a temporary unlabeled condition. - Add some missing XFS quota command types to the SELinux quota access controls. - Fix a problem where we were not updating the seq_file position index correctly in selinuxfs. - We consolidate some duplicated code into helper functions. - A number of list to array conversions. - Update Stephen Smalley's email address in MAINTAINERS" * tag 'selinux-pr-20200330' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: clean up indentation issue with assignment statement NFS: Ensure security label is set for root inode MAINTAINERS: Update my email address selinux: avtab_init() and cond_policydb_init() return void selinux: clean up error path in policydb_init() selinux: remove unused initial SIDs and improve handling selinux: reduce the use of hard-coded hash sizes selinux: Add xfs quota command types selinux: optimize storage of filename transitions selinux: factor out loop body from filename_trans_read() security: selinux: allow per-file labeling for bpffs selinux: generalize evaluate_cond_node() selinux: convert cond_expr to array selinux: convert cond_av_list to array selinux: convert cond_list to array selinux: sel_avc_get_stat_idx should increase position index selinux: allow kernfs symlinks to inherit parent directory context selinux: simplify evaluate_cond_node() Documentation,selinux: deprecate setting checkreqprot to 1 selinux: move status variables out of selinux_ss
2020-03-31selinux: clean up indentation issue with assignment statementColin Ian King1-4/+3
The assignment of e->type_names is indented one level too deep, clean this up by removing the extraneous tab. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-03-31Merge branch 'efi-core-for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI updates from Ingo Molnar: "The EFI changes in this cycle are much larger than usual, for two (positive) reasons: - The GRUB project is showing signs of life again, resulting in the introduction of the generic Linux/UEFI boot protocol, instead of x86 specific hacks which are increasingly difficult to maintain. There's hope that all future extensions will now go through that boot protocol. - Preparatory work for RISC-V EFI support. The main changes are: - Boot time GDT handling changes - Simplify handling of EFI properties table on arm64 - Generic EFI stub cleanups, to improve command line handling, file I/O, memory allocation, etc. - Introduce a generic initrd loading method based on calling back into the firmware, instead of relying on the x86 EFI handover protocol or device tree. - Introduce a mixed mode boot method that does not rely on the x86 EFI handover protocol either, and could potentially be adopted by other architectures (if another one ever surfaces where one execution mode is a superset of another) - Clean up the contents of 'struct efi', and move out everything that doesn't need to be stored there. - Incorporate support for UEFI spec v2.8A changes that permit firmware implementations to return EFI_UNSUPPORTED from UEFI runtime services at OS runtime, and expose a mask of which ones are supported or unsupported via a configuration table. - Partial fix for the lack of by-VA cache maintenance in the decompressor on 32-bit ARM. - Changes to load device firmware from EFI boot service memory regions - Various documentation updates and minor code cleanups and fixes" * 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (114 commits) efi/libstub/arm: Fix spurious message that an initrd was loaded efi/libstub/arm64: Avoid image_base value from efi_loaded_image partitions/efi: Fix partition name parsing in GUID partition entry efi/x86: Fix cast of image argument efi/libstub/x86: Use ULONG_MAX as upper bound for all allocations efi: Fix a mistype in comments mentioning efivar_entry_iter_begin() efi/libstub: Avoid linking libstub/lib-ksyms.o into vmlinux efi/x86: Preserve %ebx correctly in efi_set_virtual_address_map() efi/x86: Ignore the memory attributes table on i386 efi/x86: Don't relocate the kernel unless necessary efi/x86: Remove extra headroom for setup block efi/x86: Add kernel preferred address to PE header efi/x86: Decompress at start of PE image load address x86/boot/compressed/32: Save the output address instead of recalculating it efi/libstub/x86: Deal with exit() boot service returning x86/boot: Use unsigned comparison for addresses efi/x86: Avoid using code32_start efi/x86: Make efi32_pe_entry() more readable efi/x86: Respect 32-bit ABI in efi32_pe_entry() efi/x86: Annotate the LOADED_IMAGE_PROTOCOL_GUID with SYM_DATA ...
2020-03-30bpf: lsm: Initialize the BPF LSM hooksKP Singh4-5/+38
* The hooks are initialized using the definitions in include/linux/lsm_hook_defs.h. * The LSM can be enabled / disabled with CONFIG_BPF_LSM. Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Brendan Jackman <jackmanb@google.com> Reviewed-by: Florent Revest <revest@google.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: James Morris <jamorris@linux.microsoft.com> Link: https://lore.kernel.org/bpf/20200329004356.27286-6-kpsingh@chromium.org
2020-03-30security: Refactor declaration of LSM hooksKP Singh1-11/+30
The information about the different types of LSM hooks is scattered in two locations i.e. union security_list_options and struct security_hook_heads. Rather than duplicating this information even further for BPF_PROG_TYPE_LSM, define all the hooks with the LSM_HOOK macro in lsm_hook_defs.h which is then used to generate all the data structures required by the LSM framework. The LSM hooks are defined as: LSM_HOOK(<return_type>, <default_value>, <hook_name>, args...) with <default_value> acccessible in security.c as: LSM_RET_DEFAULT(<hook_name>) Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Brendan Jackman <jackmanb@google.com> Reviewed-by: Florent Revest <revest@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Casey Schaufler <casey@schaufler-ca.com> Acked-by: James Morris <jamorris@linux.microsoft.com> Link: https://lore.kernel.org/bpf/20200329004356.27286-3-kpsingh@chromium.org
2020-03-29KEYS: Avoid false positive ENOMEM error on key readWaiman Long2-15/+55
By allocating a kernel buffer with a user-supplied buffer length, it is possible that a false positive ENOMEM error may be returned because the user-supplied length is just too large even if the system do have enough memory to hold the actual key data. Moreover, if the buffer length is larger than the maximum amount of memory that can be returned by kmalloc() (2^(MAX_ORDER-1) number of pages), a warning message will also be printed. To reduce this possibility, we set a threshold (PAGE_SIZE) over which we do check the actual key length first before allocating a buffer of the right size to hold it. The threshold is arbitrary, it is just used to trigger a buffer length check. It does not limit the actual key length as long as there is enough memory to satisfy the memory request. To further avoid large buffer allocation failure due to page fragmentation, kvmalloc() is used to allocate the buffer so that vmapped pages can be used when there is not a large enough contiguous set of pages available for allocation. In the extremely unlikely scenario that the key keeps on being changed and made longer (still <= buflen) in between 2 __keyctl_read_key() calls, the __keyctl_read_key() calling loop in keyctl_read_key() may have to be iterated a large number of times, but definitely not infinite. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com>
2020-03-29KEYS: Don't write out to userspace while holding key semaphoreWaiman Long7-51/+72
A lockdep circular locking dependency report was seen when running a keyutils test: [12537.027242] ====================================================== [12537.059309] WARNING: possible circular locking dependency detected [12537.088148] 4.18.0-147.7.1.el8_1.x86_64+debug #1 Tainted: G OE --------- - - [12537.125253] ------------------------------------------------------ [12537.153189] keyctl/25598 is trying to acquire lock: [12537.175087] 000000007c39f96c (&mm->mmap_sem){++++}, at: __might_fault+0xc4/0x1b0 [12537.208365] [12537.208365] but task is already holding lock: [12537.234507] 000000003de5b58d (&type->lock_class){++++}, at: keyctl_read_key+0x15a/0x220 [12537.270476] [12537.270476] which lock already depends on the new lock. [12537.270476] [12537.307209] [12537.307209] the existing dependency chain (in reverse order) is: [12537.340754] [12537.340754] -> #3 (&type->lock_class){++++}: [12537.367434] down_write+0x4d/0x110 [12537.385202] __key_link_begin+0x87/0x280 [12537.405232] request_key_and_link+0x483/0xf70 [12537.427221] request_key+0x3c/0x80 [12537.444839] dns_query+0x1db/0x5a5 [dns_resolver] [12537.468445] dns_resolve_server_name_to_ip+0x1e1/0x4d0 [cifs] [12537.496731] cifs_reconnect+0xe04/0x2500 [cifs] [12537.519418] cifs_readv_from_socket+0x461/0x690 [cifs] [12537.546263] cifs_read_from_socket+0xa0/0xe0 [cifs] [12537.573551] cifs_demultiplex_thread+0x311/0x2db0 [cifs] [12537.601045] kthread+0x30c/0x3d0 [12537.617906] ret_from_fork+0x3a/0x50 [12537.636225] [12537.636225] -> #2 (root_key_user.cons_lock){+.+.}: [12537.664525] __mutex_lock+0x105/0x11f0 [12537.683734] request_key_and_link+0x35a/0xf70 [12537.705640] request_key+0x3c/0x80 [12537.723304] dns_query+0x1db/0x5a5 [dns_resolver] [12537.746773] dns_resolve_server_name_to_ip+0x1e1/0x4d0 [cifs] [12537.775607] cifs_reconnect+0xe04/0x2500 [cifs] [12537.798322] cifs_readv_from_socket+0x461/0x690 [cifs] [12537.823369] cifs_read_from_socket+0xa0/0xe0 [cifs] [12537.847262] cifs_demultiplex_thread+0x311/0x2db0 [cifs] [12537.873477] kthread+0x30c/0x3d0 [12537.890281] ret_from_fork+0x3a/0x50 [12537.908649] [12537.908649] -> #1 (&tcp_ses->srv_mutex){+.+.}: [12537.935225] __mutex_lock+0x105/0x11f0 [12537.954450] cifs_call_async+0x102/0x7f0 [cifs] [12537.977250] smb2_async_readv+0x6c3/0xc90 [cifs] [12538.000659] cifs_readpages+0x120a/0x1e50 [cifs] [12538.023920] read_pages+0xf5/0x560 [12538.041583] __do_page_cache_readahead+0x41d/0x4b0 [12538.067047] ondemand_readahead+0x44c/0xc10 [12538.092069] filemap_fault+0xec1/0x1830 [12538.111637] __do_fault+0x82/0x260 [12538.129216] do_fault+0x419/0xfb0 [12538.146390] __handle_mm_fault+0x862/0xdf0 [12538.167408] handle_mm_fault+0x154/0x550 [12538.187401] __do_page_fault+0x42f/0xa60 [12538.207395] do_page_fault+0x38/0x5e0 [12538.225777] page_fault+0x1e/0x30 [12538.243010] [12538.243010] -> #0 (&mm->mmap_sem){++++}: [12538.267875] lock_acquire+0x14c/0x420 [12538.286848] __might_fault+0x119/0x1b0 [12538.306006] keyring_read_iterator+0x7e/0x170 [12538.327936] assoc_array_subtree_iterate+0x97/0x280 [12538.352154] keyring_read+0xe9/0x110 [12538.370558] keyctl_read_key+0x1b9/0x220 [12538.391470] do_syscall_64+0xa5/0x4b0 [12538.410511] entry_SYSCALL_64_after_hwframe+0x6a/0xdf [12538.435535] [12538.435535] other info that might help us debug this: [12538.435535] [12538.472829] Chain exists of: [12538.472829] &mm->mmap_sem --> root_key_user.cons_lock --> &type->lock_class [12538.472829] [12538.524820] Possible unsafe locking scenario: [12538.524820] [12538.551431] CPU0 CPU1 [12538.572654] ---- ---- [12538.595865] lock(&type->lock_class); [12538.613737] lock(root_key_user.cons_lock); [12538.644234] lock(&type->lock_class); [12538.672410] lock(&mm->mmap_sem); [12538.687758] [12538.687758] *** DEADLOCK *** [12538.687758] [12538.714455] 1 lock held by keyctl/25598: [12538.732097] #0: 000000003de5b58d (&type->lock_class){++++}, at: keyctl_read_key+0x15a/0x220 [12538.770573] [12538.770573] stack backtrace: [12538.790136] CPU: 2 PID: 25598 Comm: keyctl Kdump: loaded Tainted: G [12538.844855] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 12/27/2015 [12538.881963] Call Trace: [12538.892897] dump_stack+0x9a/0xf0 [12538.907908] print_circular_bug.isra.25.cold.50+0x1bc/0x279 [12538.932891] ? save_trace+0xd6/0x250 [12538.948979] check_prev_add.constprop.32+0xc36/0x14f0 [12538.971643] ? keyring_compare_object+0x104/0x190 [12538.992738] ? check_usage+0x550/0x550 [12539.009845] ? sched_clock+0x5/0x10 [12539.025484] ? sched_clock_cpu+0x18/0x1e0 [12539.043555] __lock_acquire+0x1f12/0x38d0 [12539.061551] ? trace_hardirqs_on+0x10/0x10 [12539.080554] lock_acquire+0x14c/0x420 [12539.100330] ? __might_fault+0xc4/0x1b0 [12539.119079] __might_fault+0x119/0x1b0 [12539.135869] ? __might_fault+0xc4/0x1b0 [12539.153234] keyring_read_iterator+0x7e/0x170 [12539.172787] ? keyring_read+0x110/0x110 [12539.190059] assoc_array_subtree_iterate+0x97/0x280 [12539.211526] keyring_read+0xe9/0x110 [12539.227561] ? keyring_gc_check_iterator+0xc0/0xc0 [12539.249076] keyctl_read_key+0x1b9/0x220 [12539.266660] do_syscall_64+0xa5/0x4b0 [12539.283091] entry_SYSCALL_64_after_hwframe+0x6a/0xdf One way to prevent this deadlock scenario from happening is to not allow writing to userspace while holding the key semaphore. Instead, an internal buffer is allocated for getting the keys out from the read method first before copying them out to userspace without holding the lock. That requires taking out the __user modifier from all the relevant read methods as well as additional changes to not use any userspace write helpers. That is, 1) The put_user() call is replaced by a direct copy. 2) The copy_to_user() call is replaced by memcpy(). 3) All the fault handling code is removed. Compiling on a x86-64 system, the size of the rxrpc_read() function is reduced from 3795 bytes to 2384 bytes with this patch. Fixes: ^1da177e4c3f4 ("Linux-2.6.12-rc2") Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com>
2020-03-25.gitignore: add SPDX License IdentifierMasahiro Yamada3-0/+3
Add SPDX License Identifier to all .gitignore files. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-25.gitignore: remove too obvious commentsMasahiro Yamada1-3/+0
Some .gitignore files have comments like "Generated files", "Ignore generated files" at the header part, but they are too obvious. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-15KEYS: reaching the keys quotas correctlyYang Xu2-3/+3
Currently, when we add a new user key, the calltrace as below: add_key() key_create_or_update() key_alloc() __key_instantiate_and_link generic_key_instantiate key_payload_reserve ...... Since commit a08bf91ce28e ("KEYS: allow reaching the keys quotas exactly"), we can reach max bytes/keys in key_alloc, but we forget to remove this limit when we reserver space for payload in key_payload_reserve. So we can only reach max keys but not max bytes when having delta between plen and type->def_datalen. Remove this limit when instantiating the key, so we can keep consistent with key_alloc. Also, fix the similar problem in keyctl_chown_key(). Fixes: 0b77f5bfb45c ("keys: make the keyring quotas controllable through /proc/sys") Fixes: a08bf91ce28e ("KEYS: allow reaching the keys quotas exactly") Cc: stable@vger.kernel.org # 5.0.x Cc: Eric Biggers <ebiggers@google.com> Signed-off-by: Yang Xu <xuyang2018.jy@cn.fujitsu.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
2020-03-12ima: add a new CONFIG for loading arch-specific policiesNayna Jain1-0/+7
Every time a new architecture defines the IMA architecture specific functions - arch_ima_get_secureboot() and arch_ima_get_policy(), the IMA include file needs to be updated. To avoid this "noise", this patch defines a new IMA Kconfig IMA_SECURE_AND_OR_TRUSTED_BOOT option, allowing the different architectures to select it. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Nayna Jain <nayna@linux.ibm.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Philipp Rudo <prudo@linux.ibm.com> (s390) Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-03-05selinux: avtab_init() and cond_policydb_init() return voidPaul Moore5-21/+7
The avtab_init() and cond_policydb_init() functions always return zero so mark them as returning void and update the callers not to check for a return value. Suggested-by: Stephen Smalley <stephen.smalley.work@gmail.com> Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-03-05selinux: clean up error path in policydb_init()Ondrej Mosnacek1-13/+5
Commit e0ac568de1fa ("selinux: reduce the use of hard-coded hash sizes") moved symtab initialization out of policydb_init(), but left the cleanup of symtabs from the error path. This patch fixes the oversight. Suggested-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-02-28integrity: Remove duplicate pr_fmt definitionsTushar Sugandhi17-31/+6
The #define for formatting log messages, pr_fmt, is duplicated in the files under security/integrity. This change moves the definition to security/integrity/integrity.h and removes the duplicate definitions in the other files under security/integrity. With this change, the messages in the following files will be prefixed with 'integrity'. security/integrity/platform_certs/platform_keyring.c security/integrity/platform_certs/load_powerpc.c security/integrity/platform_certs/load_uefi.c security/integrity/iint.c e.g. "integrity: Error adding keys to platform keyring %s\n" And the messages in the following file will be prefixed with 'ima'. security/integrity/ima/ima_mok.c e.g. "ima: Allocating IMA blacklist keyring.\n" For the rest of the files under security/integrity, there will be no change in the message format. Suggested-by: Shuah Khan <skhan@linuxfoundation.org> Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com> Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-02-28IMA: Add log statements for failure conditionsTushar Sugandhi1-0/+3
process_buffer_measurement() does not have log messages for failure conditions. This change adds a log statement in the above function. Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com> Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-02-28IMA: Update KBUILD_MODNAME for IMA files to imaTushar Sugandhi1-3/+3
The kbuild Makefile specifies object files for vmlinux in the $(obj-y) lists. These lists depend on the kernel configuration[1]. The kbuild Makefile for IMA combines the object files for IMA into a single object file namely ima.o. All the object files for IMA should be combined into ima.o. But certain object files are being added to their own $(obj-y). This results in the log messages from those modules getting prefixed with their respective base file name, instead of "ima". This is inconsistent with the log messages from the IMA modules that are combined into ima.o. This change fixes the above issue. [1] Documentation\kbuild\makefiles.rst Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com> Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-02-28selinux: remove unused initial SIDs and improve handlingStephen Smalley4-56/+58
Remove initial SIDs that have never been used or are no longer used by the kernel from its string table, which is also used to generate the SECINITSID_* symbols referenced in code. Update the code to gracefully handle the fact that these can now be NULL. Stop treating it as an error if a policy defines additional initial SIDs unknown to the kernel. Do not load unused initial SID contexts into the sidtab. Fix the incorrect usage of the name from the ocontext in error messages when loading initial SIDs since these are not presently written to the kernel policy and are therefore always NULL. After this change, it is possible to safely reclaim and reuse some of the unused initial SIDs without compatibility issues. Specifically, unused initial SIDs that were being assigned the same context as the unlabeled initial SID in policies can be reclaimed and reused for another purpose, with existing policies still treating them as having the unlabeled context and future policies having the option of mapping them to a more specific context. For example, this could have been used when the infiniband labeling support was introduced to define initial SIDs for the default pkey and endport SIDs similar to the handling of port/netif/node SIDs rather than always using SECINITSID_UNLABELED as the default. The set of safely reclaimable unused initial SIDs across all known policies is igmp_packet (13), icmp_socket (14), tcp_socket (15), kmod (24), policy (25), and scmp_packet (26); these initial SIDs were assigned the same context as unlabeled in all known policies including mls. If only considering non-mls policies (i.e. assuming that mls users always upgrade policy with their kernels), the set of safely reclaimable unused initial SIDs further includes file_labels (6), init (7), sysctl_modprobe (16), and sysctl_fs (18) through sysctl_dev (23). Adding new initial SIDs beyond SECINITSID_NUM to policy unfortunately became a fatal error in commit 24ed7fdae669 ("selinux: use separate table for initial SID lookup") and even before that it could cause problems on a policy reload (collision between the new initial SID and one allocated at runtime) ever since commit 42596eafdd75 ("selinux: load the initial SIDs upon every policy load") so we cannot safely start adding new initial SIDs to policies beyond SECINITSID_NUM (27) until such a time as all such kernels do not need to be supported and only those that include this commit are relevant. That is not a big deal since we haven't added a new initial SID since 2004 (v2.6.7) and we have plenty of unused ones we can reclaim if we truly need one. If we want to avoid the wasted storage in initial_sid_to_string[] and/or sidtab->isids[] for the unused initial SIDs, we could introduce an indirection between the kernel initial SID values and the policy initial SID values and just map the policy SID values in the ocontexts to the kernel values during policy_load_isids(). Originally I thought we'd do this by preserving the initial SID names in the kernel policy and creating a mapping at load time like we do for the security classes and permissions but that would require a new kernel policy format version and associated changes to libsepol/checkpolicy and I'm not sure it is justified. Simpler approach is just to create a fixed mapping table in the kernel from the existing fixed policy values to the kernel values. Less flexible but probably sufficient. A separate selinux userspace change was applied in https://github.com/SELinuxProject/selinux/commit/8677ce5e8f592950ae6f14cea1b68a20ddc1ac25 to enable removal of most of the unused initial SID contexts from policies, but there is no dependency between that change and this one. That change permits removing all of the unused initial SID contexts from policy except for the fs and sysctl SID contexts. The initial SID declarations themselves would remain in policy to preserve the values of subsequent ones but the contexts can be dropped. If/when the kernel decides to reuse one of them, future policies can change the name and start assigning a context again without breaking compatibility. Here is how I would envision staging changes to the initial SIDs in a compatible manner after this commit is applied: 1. At any time after this commit is applied, the kernel could choose to reclaim one of the safely reclaimable unused initial SIDs listed above for a new purpose (i.e. replace its NULL entry in the initial_sid_to_string[] table with a new name and start using the newly generated SECINITSID_name symbol in code), and refpolicy could at that time rename its declaration of that initial SID to reflect its new purpose and start assigning it a context going forward. Existing/old policies would map the reclaimed initial SID to the unlabeled context, so that would be the initial default behavior until policies are updated. This doesn't depend on the selinux userspace change; it will work with existing policies and userspace. 2. In 6 months or so we'll have another SELinux userspace release that will include the libsepol/checkpolicy support for omitting unused initial SID contexts. 3. At any time after that release, refpolicy can make that release its minimum build requirement and drop the sid context statements (but not the sid declarations) for all of the unused initial SIDs except for fs and sysctl, which must remain for compatibility on policy reload with old kernels and for compatibility with kernels that were still using SECINITSID_SYSCTL (< 2.6.39). This doesn't depend on this kernel commit; it will work with previous kernels as well. 4. After N years for some value of N, refpolicy decides that it no longer cares about policy reload compatibility for kernels that predate this kernel commit, and refpolicy drops the fs and sysctl SID contexts from policy too (but retains the declarations). 5. After M years for some value of M, the kernel decides that it no longer cares about compatibility with refpolicies that predate step 4 (dropping the fs and sysctl SIDs), and those two SIDs also become safely reclaimable. This step is optional and need not ever occur unless we decide that the need to reclaim those two SIDs outweighs the compatibility cost. 6. After O years for some value of O, refpolicy decides that it no longer cares about policy load (not just reload) compatibility for kernels that predate this kernel commit, and both kernel and refpolicy can then start adding and using new initial SIDs beyond 27. This does not depend on the previous change (step 5) and can occur independent of it. Fixes: https://github.com/SELinuxProject/selinux-kernel/issues/12 Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-02-28selinux: reduce the use of hard-coded hash sizesOndrej Mosnacek4-40/+45
Instead allocate hash tables with just the right size based on the actual number of elements (which is almost always known beforehand, we just need to defer the hashtab allocation to the right time). The only case when we don't know the size (with the current policy format) is the new filename transitions hashtable. Here I just left the existing value. After this patch, the time to load Fedora policy on x86_64 decreases from 790 ms to 167 ms. If the unconfined module is removed, it decreases from 750 ms to 122 ms. It is also likely that other operations are going to be faster, mainly string_to_context_struct() or mls_compute_sid(), but I didn't try to quantify that. The memory usage of all hash table arrays increases from ~58 KB to ~163 KB (with Fedora policy on x86_64). Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-02-26Merge tag 'efi-next' of ↵Ingo Molnar1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi into efi/core Pull EFI updates for v5.7 from Ard Biesheuvel: This time, the set of changes for the EFI subsystem is much larger than usual. The main reasons are: - Get things cleaned up before EFI support for RISC-V arrives, which will increase the size of the validation matrix, and therefore the threshold to making drastic changes, - After years of defunct maintainership, the GRUB project has finally started to consider changes from the distros regarding UEFI boot, some of which are highly specific to the way x86 does UEFI secure boot and measured boot, based on knowledge of both shim internals and the layout of bootparams and the x86 setup header. Having this maintenance burden on other architectures (which don't need shim in the first place) is hard to justify, so instead, we are introducing a generic Linux/UEFI boot protocol. Summary of changes: - Boot time GDT handling changes (Arvind) - Simplify handling of EFI properties table on arm64 - Generic EFI stub cleanups, to improve command line handling, file I/O, memory allocation, etc. - Introduce a generic initrd loading method based on calling back into the firmware, instead of relying on the x86 EFI handover protocol or device tree. - Introduce a mixed mode boot method that does not rely on the x86 EFI handover protocol either, and could potentially be adopted by other architectures (if another one ever surfaces where one execution mode is a superset of another) - Clean up the contents of struct efi, and move out everything that doesn't need to be stored there. - Incorporate support for UEFI spec v2.8A changes that permit firmware implementations to return EFI_UNSUPPORTED from UEFI runtime services at OS runtime, and expose a mask of which ones are supported or unsupported via a configuration table. - Various documentation updates and minor code cleanups (Heinrich) - Partial fix for the lack of by-VA cache maintenance in the decompressor on 32-bit ARM. Note that these patches were deliberately put at the beginning so they can be used as a stable branch that will be shared with a PR containing the complete fix, which I will send to the ARM tree. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-02-23integrity: Check properly whether EFI GetVariable() is availableArd Biesheuvel1-1/+1
Testing the value of the efi.get_variable function pointer is not the right way to establish whether the platform supports EFI variables at runtime. Instead, use the newly added granular check that can test for the presence of each EFI runtime service individually. Acked-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2020-02-22selinux: Add xfs quota command typesRichard Haines1-0/+7
Add Q_XQUOTAOFF, Q_XQUOTAON and Q_XSETQLIM to trigger filesystem quotamod permission check. Add Q_XGETQUOTA, Q_XGETQSTAT, Q_XGETQSTATV and Q_XGETNEXTQUOTA to trigger filesystem quotaget permission check. Signed-off-by: Richard Haines <richard_c_haines@btinternet.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-02-22selinux: optimize storage of filename transitionsOndrej Mosnacek3-80/+110
In these rules, each rule with the same (target type, target class, filename) values is (in practice) always mapped to the same result type. Therefore, it is much more efficient to group the rules by (ttype, tclass, filename). Thus, this patch drops the stype field from the key and changes the datum to be a linked list of one or more structures that contain a result type and an ebitmap of source types that map the given target to the given result type under the given filename. The size of the hash table is also incremented to 2048 to be more optimal for Fedora policy (which currently has ~2500 unique (ttype, tclass, filename) tuples, regardless of whether the 'unconfined' module is enabled). Not only does this dramtically reduce memory usage when the policy contains a lot of unconfined domains (ergo a lot of filename based transitions), but it also slightly reduces memory usage of strongly confined policies (modeled on Fedora policy with 'unconfined' module disabled) and significantly reduces lookup times of these rules on Fedora (roughly matches the performance of the rhashtable conversion patch [1] posted recently to selinux@vger.kernel.org). An obvious next step is to change binary policy format to match this layout, so that disk space is also saved. However, since that requires more work (including matching userspace changes) and this patch is already beneficial on its own, I'm posting it separately. Performance/memory usage comparison: Kernel | Policy load | Policy load | Mem usage | Mem usage | openbench | | (-unconfined) | | (-unconfined) | (createfiles) -----------------|-------------|---------------|-----------|---------------|-------------- reference | 1,30s | 0,91s | 90MB | 77MB | 55 us/file rhashtable patch | 0.98s | 0,85s | 85MB | 75MB | 38 us/file this patch | 0,95s | 0,87s | 75MB | 75MB | 40 us/file (Memory usage is measured after boot. With SELinux disabled the memory usage was ~60MB on the same system.) [1] https://lore.kernel.org/selinux/20200116213937.77795-1-dev@lynxeye.de/T/ Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-02-21Merge branch 'next-integrity' of ↵Linus Torvalds2-14/+31
git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity Pull IMA fixes from Mimi Zohar: "Two bug fixes and an associated change for each. The one that adds SM3 to the IMA list of supported hash algorithms is a simple change, but could be considered a new feature" * 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity: ima: add sm3 algorithm to hash algorithm configuration list crypto: rename sm3-256 to sm3 in hash_algo_name efi: Only print errors about failing to get certs if EFI vars are found x86/ima: use correct identifier for SetupMode variable
2020-02-18ima: add sm3 algorithm to hash algorithm configuration listTianjia Zhang1-0/+5
sm3 has been supported by the ima hash algorithm, but it is not yet in the Kconfig configuration list. After adding, both ima and tpm2 can support sm3 well. Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-02-18efi: Only print errors about failing to get certs if EFI vars are foundJavier Martinez Canillas1-14/+26
If CONFIG_LOAD_UEFI_KEYS is enabled, the kernel attempts to load the certs from the db, dbx and MokListRT EFI variables into the appropriate keyrings. But it just assumes that the variables will be present and prints an error if the certs can't be loaded, even when is possible that the variables may not exist. For example the MokListRT variable will only be present if shim is used. So only print an error message about failing to get the certs list from an EFI variable if this is found. Otherwise these printed errors just pollute the kernel log ring buffer with confusing messages like the following: [ 5.427251] Couldn't get size: 0x800000000000000e [ 5.427261] MODSIGN: Couldn't get UEFI db list [ 5.428012] Couldn't get size: 0x800000000000000e [ 5.428023] Couldn't get UEFI MokListRT Reported-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Tested-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>