summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-08-10Merge tag 'cxl-for-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxlLinus Torvalds39-576/+5498
Pull cxl updates from Dan Williams: "Compute Express Link (CXL) updates for 6.0: - Introduce a 'struct cxl_region' object with support for provisioning and assembling persistent memory regions. - Introduce alloc_free_mem_region() to accompany the existing request_free_mem_region() as a method to allocate physical memory capacity out of an existing resource. - Export insert_resource_expand_to_fit() for the CXL subsystem to late-publish CXL platform windows in iomem_resource. - Add a polled mode PCI DOE (Data Object Exchange) driver service and use it in cxl_pci to retrieve the CDAT (Coherent Device Attribute Table)" * tag 'cxl-for-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (74 commits) cxl/hdm: Fix skip allocations vs multiple pmem allocations cxl/region: Disallow region granularity != window granularity cxl/region: Fix x1 interleave to greater than x1 interleave routing cxl/region: Move HPA setup to cxl_region_attach() cxl/region: Fix decoder interleave programming Documentation: cxl: remove dangling kernel-doc reference cxl/region: describe targets and nr_targets members of cxl_region_params cxl/regions: add padding for cxl_rr_ep_add nested lists cxl/region: Fix IS_ERR() vs NULL check cxl/region: Fix region reference target accounting cxl/region: Fix region commit uninitialized variable warning cxl/region: Fix port setup uninitialized variable warnings cxl/region: Stop initializing interleave granularity cxl/hdm: Fix DPA reservation vs cxl_endpoint_decoder lifetime cxl/acpi: Minimize granularity for x1 interleaves cxl/region: Delete 'region' attribute from root decoders cxl/acpi: Autoload driver for 'cxl_acpi' test devices cxl/region: decrement ->nr_targets on error in cxl_region_attach() cxl/region: prevent underflow in ways_to_cxl() cxl/region: uninitialized variable in alloc_hpa() ...
2022-08-10Merge tag 'pinctrl-v6.0-1' of ↵Linus Torvalds88-667/+13354
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control updates from Linus Walleij: "Outside the pinctrl driver and DT bindings we hit some Arm DT files, patched by the maintainers. Other than that it is business as usual. Core changes: - Add PINCTRL_PINGROUP() helper macro (and use it in the AMD driver). New drivers: - Intel Meteor Lake support. - Reneasas RZ/V2M and r8a779g0 (R-Car V4H). - AXP209 variants AXP221, AXP223 and AXP809. - Qualcomm MSM8909, PM8226, PMP8074 and SM6375. - Allwinner D1. Improvements: - Proper pin multiplexing in the AMD driver. - Mediatek MT8192 can use generic drive strength and pin bias, then fixes on top plus some I2C pin group fixes. - Have the Allwinner Sunplus SP7021 use the generic DT schema and make interrupts optional. - Handle Qualcomm SC7280 ADSP. - Handle Qualcomm MSM8916 CAMSS GP clock muxing. - High impedance bias on ZynqMP. - Serialize StarFive access to MMIO. - Immutable gpiochip for BCM2835, Ingenic, Qualcomm SPMI GPIO" * tag 'pinctrl-v6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (117 commits) dt-bindings: pinctrl: qcom,pmic-gpio: add PM8226 constraints pinctrl: qcom: Make PINCTRL_SM8450 depend on PINCTRL_MSM pinctrl: qcom: sm8250: Fix PDC map pinctrl: amd: Fix an unused variable dt-bindings: pinctrl: mt8186: Add and use drive-strength-microamp dt-bindings: pinctrl: mt8186: Add gpio-line-names property ARM: dts: imxrt1170-pinfunc: Add pinctrl binding header pinctrl: amd: Use unicode for debugfs output pinctrl: amd: Fix newline declaration in debugfs output pinctrl: at91: Fix typo 'the the' in comment dt-bindings: pinctrl: st,stm32: Correct 'resets' property name pinctrl: mvebu: Missing a blank line after declarations. pinctrl: qcom: Add SM6375 TLMM driver dt-bindings: pinctrl: Add DT schema for SM6375 TLMM dt-bindings: pinctrl: mt8195: Use drive-strength-microamp in examples Revert "pinctrl: qcom: spmi-gpio: make the irqchip immutable" pinctrl: imx93: Add MODULE_DEVICE_TABLE() pinctrl: sunxi: Add driver for Allwinner D1 pinctrl: sunxi: Make some layout parameters dynamic pinctrl: sunxi: Refactor register/offset calculation ...
2022-08-10Merge tag 'apparmor-pr-2022-08-08' of ↵Linus Torvalds30-340/+492
git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor Pull AppArmor updates from John Johansen: "This is mostly cleanups and bug fixes with the one bigger change being Mathew Wilcox's patch to use XArrays instead of the IDR from the thread around the locking weirdness. Features: - Convert secid mapping to XArrays instead of IDR - Add a kernel label to use on kernel objects - Extend policydb permission set by making use of the xbits - Make export of raw binary profile to userspace optional - Enable tuning of policy paranoid load for embedded systems - Don't create raw_sha1 symlink if sha1 hashing is disabled - Allow labels to carry debug flags Cleanups: - Update MAINTAINERS file - Use struct_size() helper in kmalloc() - Move ptrace mediation to more logical task.{h,c} - Resolve uninitialized symbol warnings - Remove redundant ret variable - Mark alloc_unconfined() as static - Update help description of policy hash for introspection - Remove some casts which are no-longer required Bug Fixes: - Fix aa_label_asxprint return check - Fix reference count leak in aa_pivotroot() - Fix memleak in aa_simple_write_to_buffer() - Fix kernel doc comments - Fix absroot causing audited secids to begin with = - Fix quiet_denied for file rules - Fix failed mount permission check error message - Disable showing the mode as part of a secid to secctx - Fix setting unconfined mode on a loaded profile - Fix overlapping attachment computation - Fix undefined reference to `zlib_deflate_workspacesize'" * tag 'apparmor-pr-2022-08-08' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor: (34 commits) apparmor: Update MAINTAINERS file with new email address apparmor: correct config reference to intended one apparmor: move ptrace mediation to more logical task.{h,c} apparmor: extend policydb permission set by making use of the xbits apparmor: allow label to carry debug flags apparmor: fix overlapping attachment computation apparmor: fix setting unconfined mode on a loaded profile apparmor: Fix some kernel-doc comments apparmor: Mark alloc_unconfined() as static apparmor: disable showing the mode as part of a secid to secctx apparmor: Convert secid mapping to XArrays instead of IDR apparmor: add a kernel label to use on kernel objects apparmor: test: Remove some casts which are no-longer required apparmor: Fix memleak in aa_simple_write_to_buffer() apparmor: fix reference count leak in aa_pivotroot() apparmor: Fix some kernel-doc comments apparmor: Fix undefined reference to `zlib_deflate_workspacesize' apparmor: fix aa_label_asxprint return check apparmor: Fix some kernel-doc comments apparmor: Fix some kernel-doc comments ...
2022-08-10Merge tag 'kbuild-v5.20' of ↵Linus Torvalds36-313/+111
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Remove the support for -O3 (CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3) - Fix error of rpm-pkg cross-builds - Support riscv for checkstack tool - Re-enable -Wformwat warnings for Clang - Clean up modpost, Makefiles, and misc scripts * tag 'kbuild-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (30 commits) modpost: remove .symbol_white_list field entirely modpost: remove unneeded .symbol_white_list initializers modpost: add PATTERNS() helper macro modpost: shorten warning messages in report_sec_mismatch() Revert "Kbuild, lto, workaround: Don't warn for initcall_reference in modpost" modpost: use more reliable way to get fromsec in section_rel(a)() modpost: add array range check to sec_name() modpost: refactor get_secindex() kbuild: set EXIT trap before creating temporary directory modpost: remove unused Elf_Sword macro Makefile.extrawarn: re-enable -Wformat for clang kbuild: add dtbs_prepare target kconfig: Qt5: tell the user which packages are required modpost: use sym_get_data() to get module device_table data modpost: drop executable ELF support checkstack: add riscv support for scripts/checkstack.pl kconfig: shorten the temporary directory name for cc-option scripts: headers_install.sh: Update config leak ignore entries kbuild: error out if $(INSTALL_MOD_PATH) contains % or : kbuild: error out if $(KBUILD_EXTMOD) contains % or : ...
2022-08-10add barriers to buffer_uptodate and set_buffer_uptodateMikulas Patocka1-1/+24
Let's have a look at this piece of code in __bread_slow: get_bh(bh); bh->b_end_io = end_buffer_read_sync; submit_bh(REQ_OP_READ, 0, bh); wait_on_buffer(bh); if (buffer_uptodate(bh)) return bh; Neither wait_on_buffer nor buffer_uptodate contain any memory barrier. Consequently, if someone calls sb_bread and then reads the buffer data, the read of buffer data may be executed before wait_on_buffer(bh) on architectures with weak memory ordering and it may return invalid data. Fix this bug by adding a memory barrier to set_buffer_uptodate and an acquire barrier to buffer_uptodate (in a similar way as folio_test_uptodate and folio_mark_uptodate). Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-08-10Merge tag 'nfsd-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linuxLinus Torvalds33-945/+1426
Pull nfsd updates from Chuck Lever: "Work on 'courteous server', which was introduced in 5.19, continues apace. This release introduces a more flexible limit on the number of NFSv4 clients that NFSD allows, now that NFSv4 clients can remain in courtesy state long after the lease expiration timeout. The client limit is adjusted based on the physical memory size of the server. The NFSD filecache is a cache of files held open by NFSv4 clients or recently touched by NFSv2 or NFSv3 clients. This cache had some significant scalability constraints that have been relieved in this release. Thanks to all who contributed to this work. A data corruption bug found during the most recent NFS bake-a-thon that involves NFSv3 and NFSv4 clients writing the same file has been addressed in this release. This release includes several improvements in CPU scalability for NFSv4 operations. In addition, Neil Brown provided patches that simplify locking during file lookup, creation, rename, and removal that enables subsequent work on making these operations more scalable. We expect to see that work materialize in the next release. There are also numerous single-patch fixes, clean-ups, and the usual improvements in observability" * tag 'nfsd-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (78 commits) lockd: detect and reject lock arguments that overflow NFSD: discard fh_locked flag and fh_lock/fh_unlock NFSD: use (un)lock_inode instead of fh_(un)lock for file operations NFSD: use explicit lock/unlock for directory ops NFSD: reduce locking in nfsd_lookup() NFSD: only call fh_unlock() once in nfsd_link() NFSD: always drop directory lock in nfsd_unlink() NFSD: change nfsd_create()/nfsd_symlink() to unlock directory before returning. NFSD: add posix ACLs to struct nfsd_attrs NFSD: add security label to struct nfsd_attrs NFSD: set attributes when creating symlinks NFSD: introduce struct nfsd_attrs NFSD: verify the opened dentry after setting a delegation NFSD: drop fh argument from alloc_init_deleg NFSD: Move copy offload callback arguments into a separate structure NFSD: Add nfsd4_send_cb_offload() NFSD: Remove kmalloc from nfsd4_do_async_copy() NFSD: Refactor nfsd4_do_copy() NFSD: Refactor nfsd4_cleanup_inter_ssc() (2/2) NFSD: Refactor nfsd4_cleanup_inter_ssc() (1/2) ...
2022-08-09Merge tag 'fscache-fixes-20220809' of ↵Linus Torvalds2-2/+9
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull fscache updates from David Howells: - Fix a cookie access ref leak if a cookie is invalidated a second time before the first invalidation is actually processed. - Add a tracepoint to log cookie lookup failure * tag 'fscache-fixes-20220809' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: fscache: add tracepoint when failing cookie fscache: don't leak cookie access refs if invalidation is in progress or failed
2022-08-09Merge tag 'afs-fixes-20220802' of ↵Linus Torvalds9-116/+124
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull AFS fixes from David Howells: "Fix AFS refcount handling. The first patch converts afs to use refcount_t for its refcounts and the second patch fixes afs_put_call() and afs_put_server() to save the values they're going to log in the tracepoint before decrementing the refcount" * tag 'afs-fixes-20220802' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: afs: Fix access after dec in put functions afs: Use refcount_t rather than atomic_t
2022-08-09Merge tag 'fs.setgid.v6.0' of ↵Linus Torvalds5-19/+102
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull setgid updates from Christian Brauner: "This contains the work to move setgid stripping out of individual filesystems and into the VFS itself. Creating files that have both the S_IXGRP and S_ISGID bit raised in directories that themselves have the S_ISGID bit set requires additional privileges to avoid security issues. When a filesystem creates a new inode it needs to take care that the caller is either in the group of the newly created inode or they have CAP_FSETID in their current user namespace and are privileged over the parent directory of the new inode. If any of these two conditions is true then the S_ISGID bit can be raised for an S_IXGRP file and if not it needs to be stripped. However, there are several key issues with the current implementation: - S_ISGID stripping logic is entangled with umask stripping. For example, if the umask removes the S_IXGRP bit from the file about to be created then the S_ISGID bit will be kept. The inode_init_owner() helper is responsible for S_ISGID stripping and is called before posix_acl_create(). So we can end up with two different orderings: 1. FS without POSIX ACL support First strip umask then strip S_ISGID in inode_init_owner(). In other words, if a filesystem doesn't support or enable POSIX ACLs then umask stripping is done directly in the vfs before calling into the filesystem: 2. FS with POSIX ACL support First strip S_ISGID in inode_init_owner() then strip umask in posix_acl_create(). In other words, if the filesystem does support POSIX ACLs then unmask stripping may be done in the filesystem itself when calling posix_acl_create(). Note that technically filesystems are free to impose their own ordering between posix_acl_create() and inode_init_owner() meaning that there's additional ordering issues that influence S_ISGID inheritance. (Note that the commit message of commit 1639a49ccdce ("fs: move S_ISGID stripping into the vfs_*() helpers") gets the ordering between inode_init_owner() and posix_acl_create() the wrong way around. I realized this too late.) - Filesystems that don't rely on inode_init_owner() don't get S_ISGID stripping logic. While that may be intentional (e.g. network filesystems might just defer setgid stripping to a server) it is often just a security issue. Note that mandating the use of inode_init_owner() was proposed as an alternative solution but that wouldn't fix the ordering issues and there are examples such as afs where the use of inode_init_owner() isn't possible. In any case, we should also try the cleaner and generalized solution first before resorting to this approach. - We still have S_ISGID inheritance bugs years after the initial round of S_ISGID inheritance fixes: e014f37db1a2 ("xfs: use setattr_copy to set vfs inode attributes") 01ea173e103e ("xfs: fix up non-directory creation in SGID directories") fd84bfdddd16 ("ceph: fix up non-directory creation in SGID directories") All of this led us to conclude that the current state is too messy. While we won't be able to make it completely clean as posix_acl_create() is still a filesystem specific call we can improve the S_SIGD stripping situation quite a bit by hoisting it out of inode_init_owner() and into the respective vfs creation operations. The obvious advantage is that we don't need to rely on individual filesystems getting S_ISGID stripping right and instead can standardize the ordering between S_ISGID and umask stripping directly in the VFS. A few short implementation notes: - The stripping logic needs to happen in vfs_*() helpers for the sake of stacking filesystems such as overlayfs that rely on these helpers taking care of S_ISGID stripping. - Security hooks have never seen the mode as it is ultimately seen by the filesystem because of the ordering issue we mentioned. Nothing is changed for them. We simply continue to strip the umask before passing the mode down to the security hooks. - The following filesystems use inode_init_owner() and thus relied on S_ISGID stripping: spufs, 9p, bfs, btrfs, ext2, ext4, f2fs, hfsplus, hugetlbfs, jfs, minix, nilfs2, ntfs3, ocfs2, omfs, overlayfs, ramfs, reiserfs, sysv, ubifs, udf, ufs, xfs, zonefs, bpf, tmpfs. We've audited all callchains as best as we could. More details can be found in the commit message to 1639a49ccdce ("fs: move S_ISGID stripping into the vfs_*() helpers")" * tag 'fs.setgid.v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: ceph: rely on vfs for setgid stripping fs: move S_ISGID stripping into the vfs_*() helpers fs: Add missing umask strip in vfs_tmpfile fs: add mode_strip_sgid() helper
2022-08-09Merge tag 'memblock-v5.20-rc1' of ↵Linus Torvalds14-370/+920
git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock Pull memblock updates from Mike Rapoport: - An optimization in memblock_add_range() to reduce array traversals - Improvements to the memblock test suite * tag 'memblock-v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock: memblock test: Modify the obsolete description in README memblock tests: fix compilation errors memblock tests: change build options to run-time options memblock tests: remove completed TODO items memblock tests: set memblock_debug to enable memblock_dbg() messages memblock tests: add verbose output to memblock tests memblock tests: Makefile: add arguments to control verbosity memblock: avoid some repeat when add new range
2022-08-09Merge tag 'm68knommu-for-v5.20' of ↵Linus Torvalds3-5/+5
git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu Pull m68knommu fixes from Greg Ungerer: - spelling in comment - compilation when flexcan driver enabled - sparse warning * tag 'm68knommu-for-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu: m68k: Fix syntax errors in comments m68k: coldfire: make symbol m523x_clk_lookup static m68k: coldfire/device.c: protect FLEXCAN blocks
2022-08-09Merge tag 'x86_bugs_pbrsb' of ↵Linus Torvalds9-30/+116
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 eIBRS fixes from Borislav Petkov: "More from the CPU vulnerability nightmares front: Intel eIBRS machines do not sufficiently mitigate against RET mispredictions when doing a VM Exit therefore an additional RSB, one-entry stuffing is needed" * tag 'x86_bugs_pbrsb' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/speculation: Add LFENCE to RSB fill sequence x86/speculation: Add RSB VM Exit protections
2022-08-09fscache: add tracepoint when failing cookieJeff Layton2-0/+4
Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: David Howells <dhowells@redhat.com>
2022-08-09fscache: don't leak cookie access refs if invalidation is in progress or failedJeff Layton1-2/+5
It's possible for a request to invalidate a fscache_cookie will come in while we're already processing an invalidation. If that happens we currently take an extra access reference that will leak. Only call __fscache_begin_cookie_access if the FSCACHE_COOKIE_DO_INVALIDATE bit was previously clear. Also, ensure that we attempt to clear the bit when the cookie is "FAILED" and put the reference to avoid an access leak. Fixes: 85e4ea1049c7 ("fscache: Fix invalidation/lookup race") Suggested-by: David Howells <dhowells@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: David Howells <dhowells@redhat.com>
2022-08-09Merge tag '5.20-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbdLinus Torvalds20-220/+322
Pull ksmbd updates from Steve French: - fixes for memory access bugs (out of bounds access, oops, leak) - multichannel fixes - session disconnect performance improvement, and session register improvement - cleanup * tag '5.20-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd: ksmbd: fix heap-based overflow in set_ntacl_dacl() ksmbd: prevent out of bound read for SMB2_TREE_CONNNECT ksmbd: prevent out of bound read for SMB2_WRITE ksmbd: fix use-after-free bug in smb2_tree_disconect ksmbd: fix memory leak in smb2_handle_negotiate ksmbd: fix racy issue while destroying session on multichannel ksmbd: use wait_event instead of schedule_timeout() ksmbd: fix kernel oops from idr_remove() ksmbd: add channel rwlock ksmbd: replace sessions list in connection with xarray MAINTAINERS: ksmbd: add entry for documentation ksmbd: remove unused ksmbd_share_configs_cleanup function
2022-08-09Merge tag 'pull-work.iov_iter-rebased' of ↵Linus Torvalds30-621/+508
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull more iov_iter updates from Al Viro: - more new_sync_{read,write}() speedups - ITER_UBUF introduction - ITER_PIPE cleanups - unification of iov_iter_get_pages/iov_iter_get_pages_alloc and switching them to advancing semantics - making ITER_PIPE take high-order pages without splitting them - handling copy_page_from_iter() for high-order pages properly * tag 'pull-work.iov_iter-rebased' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (32 commits) fix copy_page_from_iter() for compound destinations hugetlbfs: copy_page_to_iter() can deal with compound pages copy_page_to_iter(): don't split high-order page in case of ITER_PIPE expand those iov_iter_advance()... pipe_get_pages(): switch to append_pipe() get rid of non-advancing variants ceph: switch the last caller of iov_iter_get_pages_alloc() 9p: convert to advancing variant of iov_iter_get_pages_alloc() af_alg_make_sg(): switch to advancing variant of iov_iter_get_pages() iter_to_pipe(): switch to advancing variant of iov_iter_get_pages() block: convert to advancing variants of iov_iter_get_pages{,_alloc}() iov_iter: advancing variants of iov_iter_get_pages{,_alloc}() iov_iter: saner helper for page array allocation fold __pipe_get_pages() into pipe_get_pages() ITER_XARRAY: don't open-code DIV_ROUND_UP() unify the rest of iov_iter_get_pages()/iov_iter_get_pages_alloc() guts unify xarray_get_pages() and xarray_get_pages_alloc() unify pipe_get_pages() and pipe_get_pages_alloc() iov_iter_get_pages(): sanity-check arguments iov_iter_get_pages_alloc(): lift freeing pages array on failure exits into wrapper ...
2022-08-09fix copy_page_from_iter() for compound destinationsAl Viro1-4/+18
had been broken for ITER_BVEC et.al. since ever (OK, v3.17 when ITER_BVEC had first appeared)... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-09hugetlbfs: copy_page_to_iter() can deal with compound pagesAl Viro1-30/+1
... since April 2021 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-09copy_page_to_iter(): don't split high-order page in case of ITER_PIPEAl Viro1-15/+6
... just shove it into one pipe_buffer. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-09expand those iov_iter_advance()...Al Viro1-2/+9
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-09pipe_get_pages(): switch to append_pipe()Al Viro1-29/+6
now that we are advancing the iterator, there's no need to treat the first page separately - just call append_pipe() in a loop. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-09get rid of non-advancing variantsAl Viro2-31/+20
mechanical change; will be further massaged in subsequent commits Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-09ceph: switch the last caller of iov_iter_get_pages_alloc()Al Viro1-1/+1
here nothing even looks at the iov_iter after the call, so we couldn't care less whether it advances or not. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-099p: convert to advancing variant of iov_iter_get_pages_alloc()Al Viro3-19/+26
that one is somewhat clumsier than usual and needs serious testing. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-09af_alg_make_sg(): switch to advancing variant of iov_iter_get_pages()Al Viro2-4/+4
... and adjust the callers Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-09iter_to_pipe(): switch to advancing variant of iov_iter_get_pages()Al Viro1-23/+24
... and untangle the cleanup on failure to add into pipe. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-09block: convert to advancing variants of iov_iter_get_pages{,_alloc}()Al Viro2-14/+18
... doing revert if we end up not using some pages Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-09iov_iter: advancing variants of iov_iter_get_pages{,_alloc}()Al Viro13-30/+34
Most of the users immediately follow successful iov_iter_get_pages() with advancing by the amount it had returned. Provide inline wrappers doing that, convert trivial open-coded uses of those. BTW, iov_iter_get_pages() never returns more than it had been asked to; such checks in cifs ought to be removed someday... Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-09iov_iter: saner helper for page array allocationAl Viro1-45/+32
All call sites of get_pages_array() are essenitally identical now. Replace with common helper... Returns number of slots available in resulting array or 0 on OOM; it's up to the caller to make sure it doesn't ask to zero-entry array (i.e. neither maxpages nor size are allowed to be zero). Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-09fold __pipe_get_pages() into pipe_get_pages()Al Viro1-37/+38
... and don't mangle maxsize there - turn the loop into counting one instead. Easier to see that we won't run out of array that way. Note that special treatment of the partial buffer in that thing is an artifact of the non-advancing semantics of iov_iter_get_pages() - if not for that, it would be append_pipe(), same as the body of the loop that follows it. IOW, once we make iov_iter_get_pages() advancing, the whole thing will turn into calculate how many pages do we want allocate an array (if needed) call append_pipe() that many times. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-09ITER_XARRAY: don't open-code DIV_ROUND_UP()Al Viro1-9/+1
Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-09unify the rest of iov_iter_get_pages()/iov_iter_get_pages_alloc() gutsAl Viro1-59/+27
same as for pipes and xarrays; after that iov_iter_get_pages() becomes a wrapper for __iov_iter_get_pages_alloc(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-09unify xarray_get_pages() and xarray_get_pages_alloc()Al Viro1-39/+10
same as for pipes Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-09unify pipe_get_pages() and pipe_get_pages_alloc()Al Viro1-32/+17
The differences between those two are * pipe_get_pages() gets a non-NULL struct page ** value pointing to preallocated array + array size. * pipe_get_pages_alloc() gets an address of struct page ** variable that contains NULL, allocates the array and (on success) stores its address in that variable. Not hard to combine - always pass struct page ***, have the previous pipe_get_pages_alloc() caller pass ~0U as cap for array size. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-09iov_iter_get_pages(): sanity-check argumentsAl Viro1-7/+2
zero maxpages is bogus, but best treated as "just return 0"; NULL pages, OTOH, should be treated as a hard bug. get rid of now completely useless checks in xarray_get_pages{,_alloc}(). Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-09iov_iter_get_pages_alloc(): lift freeing pages array on failure exits into ↵Al Viro1-16/+22
wrapper Incidentally, ITER_XARRAY did *not* free the sucker in case when iter_xarray_populate_pages() returned 0... Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-09ITER_PIPE: fold data_start() and pipe_space_for_user() togetherAl Viro2-45/+19
All their callers are next to each other; all of them want the total amount of pages and, possibly, the offset in the partial final buffer. Combine into a new helper (pipe_npages()), fix the bogosity in pipe_space_for_user(), while we are at it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-09ITER_PIPE: cache the type of last bufferAl Viro2-40/+42
We often need to find whether the last buffer is anon or not, and currently it's rather clumsy: check if ->iov_offset is non-zero (i.e. that pipe is not empty) if so, get the corresponding pipe_buffer and check its ->ops if it's &default_pipe_buf_ops, we have an anon buffer. Let's replace the use of ->iov_offset (which is nowhere near similar to its role for other flavours) with signed field (->last_offset), with the following rules: empty, no buffers occupied: 0 anon, with bytes up to N-1 filled: N zero-copy, with bytes up to N-1 filled: -N That way abs(i->last_offset) is equal to what used to be in i->iov_offset and empty vs. anon vs. zero-copy can be distinguished by the sign of i->last_offset. Checks for "should we extend the last buffer or should we start a new one?" become easier to follow that way. Note that most of the operations can only be done in a sane state - i.e. when the pipe has nothing past the current position of iterator. About the only thing that could be done outside of that state is iov_iter_advance(), which transitions to the sane state by truncating the pipe. There are only two cases where we leave the sane state: 1) iov_iter_get_pages()/iov_iter_get_pages_alloc(). Will be dealt with later, when we make get_pages advancing - the callers are actually happier that way. 2) iov_iter copied, then something is put into the copy. Since they share the underlying pipe, the original gets behind. When we decide that we are done with the copy (original is not usable until then) we advance the original. direct_io used to be done that way; nowadays it operates on the original and we do iov_iter_revert() to discard the excessive data. At the moment there's nothing in the kernel that could do that to ITER_PIPE iterators, so this reason for insane state is theoretical right now. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-09ITER_PIPE: clean iov_iter_revert()Al Viro1-46/+14
Fold pipe_truncate() into it, clean up. We can release buffers in the same loop where we walk backwards to the iterator beginning looking for the place where the new position will be. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-09ITER_PIPE: clean pipe_advance() upAl Viro1-17/+17
instead of setting ->iov_offset for new position and calling pipe_truncate() to adjust ->len of the last buffer and discard everything after it, adjust ->len at the same time we set ->iov_offset and use pipe_discard_from() to deal with buffers past that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-09ITER_PIPE: lose iter_head argument of __pipe_get_pages()Al Viro1-4/+3
it's only used to get to the partial buffer we can add to, and that's always the last one, i.e. pipe->head - 1. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-09ITER_PIPE: fold push_pipe() into __pipe_get_pages()Al Viro1-55/+25
Expand the only remaining call of push_pipe() (in __pipe_get_pages()), combine it with the page-collecting loop there. Note that the only reason it's not a loop doing append_pipe() is that append_pipe() is advancing, while iov_iter_get_pages() is not. As soon as it switches to saner semantics, this thing will switch to using append_pipe(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-09ITER_PIPE: allocate buffers as we go in copy-to-pipe primitivesAl Viro1-73/+98
New helper: append_pipe(). Extends the last buffer if possible, allocates a new one otherwise. Returns page and offset in it on success, NULL on failure. iov_iter is advanced past the data we've got. Use that instead of push_pipe() in copy-to-pipe primitives; they get simpler that way. Handling of short copy (in "mc" one) is done simply by iov_iter_revert() - iov_iter is in consistent state after that one, so we can use that. [Fix for braino caught by Liu Xinpeng <liuxp11@chinatelecom.cn> folded in] [another braino fix, this time in copy_pipe_to_iter() and pipe_zero(); caught by testcase from Hugh Dickins] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-09ITER_PIPE: helpers for adding pipe buffersAl Viro1-42/+46
There are only two kinds of pipe_buffer in the area used by ITER_PIPE. 1) anonymous - copy_to_iter() et.al. end up creating those and copying data there. They have zero ->offset, and their ->ops points to default_pipe_page_ops. 2) zero-copy ones - those come from copy_page_to_iter(), and page comes from caller. ->offset is also caller-supplied - it might be non-zero. ->ops points to page_cache_pipe_buf_ops. Move creation and insertion of those into helpers - push_anon(pipe, size) and push_page(pipe, page, offset, size) resp., separating them from the "could we avoid creating a new buffer by merging with the current head?" logics. Acked-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-09ITER_PIPE: helper for getting pipe buffer by indexAl Viro1-6/+9
pipe_buffer instances of a pipe are organized as a ring buffer, with power-of-2 size. Indices are kept *not* reduced modulo ring size, so the buffer refered to by index N is pipe->bufs[N & (pipe->ring_size - 1)]. Ring size can change over the lifetime of a pipe, but not while the pipe is locked. So for any iov_iter primitives it's a constant. Original conversion of pipes to this layout went overboard trying to microoptimize that - calculating pipe->ring_size - 1, storing it in a local variable and using through the function. In some cases it might be warranted, but most of the times it only obfuscates what's going on in there. Introduce a helper (pipe_buf(pipe, N)) that would encapsulate that and use it in the obvious cases. More will follow... Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-09splice: stop abusing iov_iter_advance() to flush a pipeAl Viro1-5/+2
Use pipe_discard_from() explicitly in generic_file_read_iter(); don't bother with rather non-obvious use of iov_iter_advance() in there. Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-09switch new_sync_{read,write}() to ITER_UBUFAl Viro1-4/+2
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-09new iov_iter flavour - ITER_UBUFAl Viro12-31/+108
Equivalent of single-segment iovec. Initialized by iov_iter_ubuf(), checked for by iter_is_ubuf(), otherwise behaves like ITER_IOVEC ones. We are going to expose the things like ->write_iter() et.al. to those in subsequent commits. New predicate (user_backed_iter()) that is true for ITER_IOVEC and ITER_UBUF; places like direct-IO handling should use that for checking that pages we modify after getting them from iov_iter_get_pages() would need to be dirtied. DO NOT assume that replacing iter_is_iovec() with user_backed_iter() will solve all problems - there's code that uses iter_is_iovec() to decide how to poke around in iov_iter guts and for that the predicate replacement obviously won't suffice. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-09Merge tag 'rproc-v5.20' of ↵Linus Torvalds28-227/+958
git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux Pull remoteproc updates from Bjorn Andersson: "This introduces support for the remoteproc on Mediatek MT8188, and enables caches for MT8186 SCP. It adds support for PRU cores found on the TI K3 AM62x SoCs. It moves the recovery work after a firmware crash to an unbound workqueue, to allow recovery to happen in parallel. A new DMA API is introduced to release dma_mem for a device. It adds support a panic handler for the Qualcomm modem remoteproc, with the goal of having caches flushed in memory dumps for post-mortem debugging and it introduces a mechanism to wait for the modem firmware on SM8450 to decrypt part of its memory for post-mortem debugging. Qualcomm sysmon is restricted to only inform remote processors about peers that are actually running, to avoid a race where Linux tries to notify a recovering remote processor about its peers new state. A mechanism for waiting for the sysmon connection to be established is also introduced, to avoid out-of-sync updates for rapidly restarting remote processors. A number of Devicetree binding cleanups and conversions to YAML are introduced, to facilitate Devicetree validation. Lastly it introduces a number of smaller fixes and cleanups in the core and a few different drivers" * tag 'rproc-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux: (42 commits) remoteproc: qcom_q6v5_pas: Do not fail if regulators are not found drivers/remoteproc: fix repeated words in comments remoteproc: Directly use ida_alloc()/free() remoteproc: Use unbounded workqueue for recovery work remoteproc: using pm_runtime_resume_and_get instead of pm_runtime_get_sync remoteproc: qcom_q6v5_pas: Deal silently with optional px and cx regulators remoteproc: sysmon: Send sysmon state only for running rprocs remoteproc: sysmon: Wait for SSCTL service to come up remoteproc: qcom: q6v5: Set q6 state to offline on receiving wdog irq remoteproc: qcom: pas: Check if coredump is enabled remoteproc: qcom: pas: Mark devices as wakeup capable remoteproc: qcom: pas: Mark va as io memory remoteproc: qcom: pas: Add decrypt shutdown support for modem remoteproc: qcom: q6v5-mss: add powerdomains to MSM8996 config remoteproc: qcom_q6v5: Introduce panic handler for MSS remoteproc: qcom_q6v5_mss: Update MBA log info remoteproc: qcom: correct kerneldoc remoteproc: qcom_q6v5_mss: map/unmap metadata region before/after use remoteproc: qcom: using pm_runtime_resume_and_get to simplify the code remoteproc: mediatek: Support MT8188 SCP ...
2022-08-09Merge tag 'rpmsg-v5.20' of ↵Linus Torvalds7-17/+20
git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux Pull rpmsg updates from Bjorn Andersson: "This contains fixes and cleanups in the rpmsg core, Qualcomm SMD and GLINK drivers, a circular lock dependency in the Mediatek driver and a possible race condition in the rpmsg_char driver is resolved" * tag 'rpmsg-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux: rpmsg: convert sysfs snprintf to sysfs_emit rpmsg: qcom_smd: Fix refcount leak in qcom_smd_parse_edge rpmsg: qcom: correct kerneldoc rpmsg: qcom: glink: remove unused name rpmsg: qcom: glink: replace strncpy() with strscpy_pad() rpmsg: Strcpy is not safe, use strscpy_pad() instead rpmsg: Fix possible refcount leak in rpmsg_register_device_override() rpmsg: Fix parameter naming for announce_create/destroy ops rpmsg: mtk_rpmsg: Fix circular locking dependency rpmsg: char: Add mutex protection for rpmsg_eptdev_open()