summaryrefslogtreecommitdiff
path: root/arch/mips
AgeCommit message (Collapse)AuthorFilesLines
2021-09-01Merge tag 'driver-core-5.15-rc1' of ↵Linus Torvalds1-2/+1
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the big set of driver core patches for 5.15-rc1. These do change a number of different things across different subsystems, and because of that, there were 2 stable tags created that might have already come into your tree from different pulls that did the following - changed the bus remove callback to return void - sysfs iomem_get_mapping rework Other than those two things, there's only a few small things in here: - kernfs performance improvements for huge numbers of sysfs users at once - tiny api cleanups - other minor changes All of these have been in linux-next for a while with no reported problems, other than the before-mentioned merge issue" * tag 'driver-core-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (33 commits) MAINTAINERS: Add dri-devel for component.[hc] driver core: platform: Remove platform_device_add_properties() ARM: tegra: paz00: Handle device properties with software node API bitmap: extend comment to bitmap_print_bitmask/list_to_buf drivers/base/node.c: use bin_attribute to break the size limitation of cpumap ABI topology: use bin_attribute to break the size limitation of cpumap ABI lib: test_bitmap: add bitmap_print_bitmask/list_to_buf test cases cpumask: introduce cpumap_print_list/bitmask_to_buf to support large bitmask and list sysfs: Rename struct bin_attribute member to f_mapping sysfs: Invoke iomem_get_mapping() from the sysfs open callback debugfs: Return error during {full/open}_proxy_open() on rmmod zorro: Drop useless (and hardly used) .driver member in struct zorro_dev zorro: Simplify remove callback sh: superhyway: Simplify check in remove callback nubus: Simplify check in remove callback nubus: Make struct nubus_driver::remove return void kernfs: dont call d_splice_alias() under kernfs node lock kernfs: use i_lock to protect concurrent inode updates kernfs: switch kernfs to use an rwsem kernfs: use VFS negative dentry caching ...
2021-09-01Merge tag 'net-next-5.15' of ↵Linus Torvalds2-13/+13
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core: - Enable memcg accounting for various networking objects. BPF: - Introduce bpf timers. - Add perf link and opaque bpf_cookie which the program can read out again, to be used in libbpf-based USDT library. - Add bpf_task_pt_regs() helper to access user space pt_regs in kprobes, to help user space stack unwinding. - Add support for UNIX sockets for BPF sockmap. - Extend BPF iterator support for UNIX domain sockets. - Allow BPF TCP congestion control progs and bpf iterators to call bpf_setsockopt(), e.g. to switch to another congestion control algorithm. Protocols: - Support IOAM Pre-allocated Trace with IPv6. - Support Management Component Transport Protocol. - bridge: multicast: add vlan support. - netfilter: add hooks for the SRv6 lightweight tunnel driver. - tcp: - enable mid-stream window clamping (by user space or BPF) - allow data-less, empty-cookie SYN with TFO_SERVER_COOKIE_NOT_REQD - more accurate DSACK processing for RACK-TLP - mptcp: - add full mesh path manager option - add partial support for MP_FAIL - improve use of backup subflows - optimize option processing - af_unix: add OOB notification support. - ipv6: add IFLA_INET6_RA_MTU to expose MTU value advertised by the router. - mac80211: Target Wake Time support in AP mode. - can: j1939: extend UAPI to notify about RX status. Driver APIs: - Add page frag support in page pool API. - Many improvements to the DSA (distributed switch) APIs. - ethtool: extend IRQ coalesce uAPI with timer reset modes. - devlink: control which auxiliary devices are created. - Support CAN PHYs via the generic PHY subsystem. - Proper cross-chip support for tag_8021q. - Allow TX forwarding for the software bridge data path to be offloaded to capable devices. Drivers: - veth: more flexible channels number configuration. - openvswitch: introduce per-cpu upcall dispatch. - Add internet mix (IMIX) mode to pktgen. - Transparently handle XDP operations in the bonding driver. - Add LiteETH network driver. - Renesas (ravb): - support Gigabit Ethernet IP - NXP Ethernet switch (sja1105): - fast aging support - support for "H" switch topologies - traffic termination for ports under VLAN-aware bridge - Intel 1G Ethernet - support getcrosststamp() with PCIe PTM (Precision Time Measurement) for better time sync - support Credit-Based Shaper (CBS) offload, enabling HW traffic prioritization and bandwidth reservation - Broadcom Ethernet (bnxt) - support pulse-per-second output - support larger Rx rings - Mellanox Ethernet (mlx5) - support ethtool RSS contexts and MQPRIO channel mode - support LAG offload with bridging - support devlink rate limit API - support packet sampling on tunnels - Huawei Ethernet (hns3): - basic devlink support - add extended IRQ coalescing support - report extended link state - Netronome Ethernet (nfp): - add conntrack offload support - Broadcom WiFi (brcmfmac): - add WPA3 Personal with FT to supported cipher suites - support 43752 SDIO device - Intel WiFi (iwlwifi): - support scanning hidden 6GHz networks - support for a new hardware family (Bz) - Xen pv driver: - harden netfront against malicious backends - Qualcomm mobile - ipa: refactor power management and enable automatic suspend - mhi: move MBIM to WWAN subsystem interfaces Refactor: - Ambient BPF run context and cgroup storage cleanup. - Compat rework for ndo_ioctl. Old code removal: - prism54 remove the obsoleted driver, deprecated by the p54 driver. - wan: remove sbni/granch driver" * tag 'net-next-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1715 commits) net: Add depends on OF_NET for LiteX's LiteETH ipv6: seg6: remove duplicated include net: hns3: remove unnecessary spaces net: hns3: add some required spaces net: hns3: clean up a type mismatch warning net: hns3: refine function hns3_set_default_feature() ipv6: remove duplicated 'net/lwtunnel.h' include net: w5100: check return value after calling platform_get_resource() net/mlxbf_gige: Make use of devm_platform_ioremap_resourcexxx() net: mdio: mscc-miim: Make use of the helper function devm_platform_ioremap_resource() net: mdio-ipq4019: Make use of devm_platform_ioremap_resource() fou: remove sparse errors ipv4: fix endianness issue in inet_rtm_getroute_build_skb() octeontx2-af: Set proper errorcode for IPv4 checksum errors octeontx2-af: Fix static code analyzer reported issues octeontx2-af: Fix mailbox errors in nix_rss_flowkey_cfg octeontx2-af: Fix loop in free and unmap counter af_unix: fix potential NULL deref in unix_dgram_connect() dpaa2-eth: Replace strlcpy with strscpy octeontx2-af: Use NDC TX for transmit packet data ...
2021-08-31Merge tag 'for-5.15/block-2021-08-30' of git://git.kernel.dk/linux-blockLinus Torvalds1-2/+0
Pull block updates from Jens Axboe: "Nothing major in here - lots of good cleanups and tech debt handling, which is also evident in the diffstats. In particular: - Add disk sequence numbers (Matteo) - Discard merge fix (Ming) - Relax disk zoned reporting restrictions (Niklas) - Bio error handling zoned leak fix (Pavel) - Start of proper add_disk() error handling (Luis, Christoph) - blk crypto fix (Eric) - Non-standard GPT location support (Dmitry) - IO priority improvements and cleanups (Damien)o - blk-throtl improvements (Chunguang) - diskstats_show() stack reduction (Abd-Alrhman) - Loop scheduler selection (Bart) - Switch block layer to use kmap_local_page() (Christoph) - Remove obsolete disk_name helper (Christoph) - block_device refcounting improvements (Christoph) - Ensure gendisk always has a request queue reference (Christoph) - Misc fixes/cleanups (Shaokun, Oliver, Guoqing)" * tag 'for-5.15/block-2021-08-30' of git://git.kernel.dk/linux-block: (129 commits) sg: pass the device name to blk_trace_setup block, bfq: cleanup the repeated declaration blk-crypto: fix check for too-large dun_bytes blk-zoned: allow BLKREPORTZONE without CAP_SYS_ADMIN blk-zoned: allow zone management send operations without CAP_SYS_ADMIN block: mark blkdev_fsync static block: refine the disk_live check in del_gendisk mmc: sdhci-tegra: Enable MMC_CAP2_ALT_GPT_TEGRA mmc: block: Support alternative_gpt_sector() operation partitions/efi: Support non-standard GPT location block: Add alternative_gpt_sector() operation bio: fix page leak bio_add_hw_page failure block: remove CONFIG_DEBUG_BLOCK_EXT_DEVT block: remove a pointless call to MINOR() in device_add_disk null_blk: add error handling support for add_disk() virtio_blk: add error handling support for add_disk() block: add error handling for device_add_disk / add_disk block: return errors from disk_alloc_events block: return errors from blk_integrity_add block: call blk_register_queue earlier in device_add_disk ...
2021-08-29Merge tag 'irqchip-5.15' of ↵Thomas Gleixner8-39/+29
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core Pull irqchip updates from Marc Zyngier: - API updates: - Treewide conversion to generic_handle_domain_irq() for anything that looks like a chained interrupt controller - Update the irqdomain documentation - Use of bitmap_zalloc() throughout the tree - New functionalities: - Support for GICv3 EPPI partitions - Fixes: - Qualcomm PDC hierarchy fixes - Yet another priority decoding fix for the GICv3 pseudo-NMIs - Fix the apple-aic driver irq_eoi() callback to always unmask the interrupt - Properly handle edge interrupts on loongson-pch-pic - Let the mtk-sysirq driver advertise IRQCHIP_SKIP_SET_WAKE Link: https://lore.kernel.org/r/20210828121013.2647964-1-maz@kernel.org
2021-08-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski3-8/+14
Conflicts: drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.h 9e26680733d5 ("bnxt_en: Update firmware call to retrieve TX PTP timestamp") 9e518f25802c ("bnxt_en: 1PPS functions to configure TSIO pins") 099fdeda659d ("bnxt_en: Event handler for PPS events") kernel/bpf/helpers.c include/linux/bpf-cgroup.h a2baf4e8bb0f ("bpf: Fix potentially incorrect results with bpf_get_local_storage()") c7603cfa04e7 ("bpf: Add ambient BPF runtime context stored in current") drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c 5957cc557dc5 ("net/mlx5: Set all field of mlx5_irq before inserting it to the xarray") 2d0b41a37679 ("net/mlx5: Refcount mlx5_irq with integer") MAINTAINERS 7b637cd52f02 ("MAINTAINERS: fix Microchip CAN BUS Analyzer Tool entry typo") 7d901a1e878a ("net: phy: add Maxlinear GPY115/21x/24x driver") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-12mips: Bulk conversion to generic_handle_domain_irq()Marc Zyngier8-39/+29
Wherever possible, replace constructs that match either generic_handle_irq(irq_find_mapping()) or generic_handle_irq(irq_linear_revmap()) to a single call to generic_handle_domain_irq(). Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-08-09Merge 5.14-rc5 into driver-core-nextGreg Kroah-Hartman5-9/+17
We need the driver core fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-08-08Merge tag 'tty-5.14-rc5' of ↵Linus Torvalds1-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial fixes from Greg KH: "Here are some small tty/serial driver fixes for 5.14-rc5 to resolve a number of reported problems. They include: - mips serial driver fixes - 8250 driver fixes for reported problems - fsl_lpuart driver fixes - other tiny driver fixes All have been in linux-next for a while with no reported problems" * tag 'tty-5.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: serial: 8250_pci: Avoid irq sharing for MSI(-X) interrupts. serial: 8250_mtk: fix uart corruption issue when rx power off tty: serial: fsl_lpuart: fix the wrong return value in lpuart32_get_mctrl serial: 8250_pci: Enumerate Elkhart Lake UARTs via dedicated driver serial: 8250: fix handle_irq locking serial: tegra: Only print FIFO error message when an error occurs MIPS: Malta: Do not byte-swap accesses to the CBUS UART serial: 8250: Mask out floating 16/32-bit bus bits serial: max310x: Unprepare and disable clock in error path
2021-08-07Merge tag 'kbuild-fixes-v5.14-2' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Correct the Extended Regular Expressions in tools - Adjust scripts/checkversion.pl for the current Kbuild - Unset sub_make_done for 'make install' to make DKMS work again * tag 'kbuild-fixes-v5.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kbuild: cancel sub_make_done for the install target to fix DKMS scripts: checkversion: modernize linux/version.h search strings mips: Fix non-POSIX regexp x86/tools/relocs: Fix non-POSIX regexp
2021-08-06Merge tag 'mips-fixes_5.14_1' of ↵Linus Torvalds1-6/+11
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS fix from Thomas Bogendoerfer: "Fix PMD accounting change" * tag 'mips-fixes_5.14_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: check return value of pgtable_pmd_page_ctor
2021-08-05mips: Fix non-POSIX regexpH. Nikolaus Schaller1-1/+1
When cross compiling a MIPS kernel on a BSD based HOSTCC leads to errors like SYNC include/config/auto.conf.cmd - due to: .config egrep: empty (sub)expression UPD include/config/kernel.release HOSTCC scripts/dtc/dtc.o - due to target missing It turns out that egrep uses this egrep pattern: (|MINOR_|PATCHLEVEL_) This is not valid syntax or gives undefined results according to POSIX 9.5.3 ERE Grammar https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html It seems to be silently accepted by the Linux egrep implementation while a BSD host complains. Such patterns can be replaced by a transformation like "(|p1|p2)" -> "(p1|p2)?" Fixes: 48c35b2d245f ("[MIPS] There is no __GNUC_MAJOR__") Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2021-08-05MIPS: check return value of pgtable_pmd_page_ctorHuang Pei1-6/+11
+. According to Documentation/vm/split_page_table_lock, handle failure of pgtable_pmd_page_ctor +. Use GFP_KERNEL_ACCOUNT instead of GFP_KERNEL|__GFP_ACCOUNT +. Adjust coding style Fixes: ed914d48b6a1 ("MIPS: add PMD table accounting into MIPS') Reported-by: Joshua Kinard <kumba@gentoo.org> Signed-off-by: Huang Pei <huangpei@loongson.cn> Reviewed-by: Joshua Kinard <kumba@gentoo.org> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-08-04sock: allow reading and changing sk_userlocks with setsockoptPavel Tikhomirov1-0/+2
SOCK_SNDBUF_LOCK and SOCK_RCVBUF_LOCK flags disable automatic socket buffers adjustment done by kernel (see tcp_fixup_rcvbuf() and tcp_sndbuf_expand()). If we've just created a new socket this adjustment is enabled on it, but if one changes the socket buffer size by setsockopt(SO_{SND,RCV}BUF*) it becomes disabled. CRIU needs to call setsockopt(SO_{SND,RCV}BUF*) on each socket on restore as it first needs to increase buffer sizes for packet queues restore and second it needs to restore back original buffer sizes. So after CRIU restore all sockets become non-auto-adjustable, which can decrease network performance of restored applications significantly. CRIU need to be able to restore sockets with enabled/disabled adjustment to the same state it was before dump, so let's add special setsockopt for it. Let's also export SOCK_SNDBUF_LOCK and SOCK_RCVBUF_LOCK flags to uAPI so that using these interface one can reenable automatic socket buffer adjustment on their sockets. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-02MIPS: don't include <linux/genhd.h> in <asm/mach-rc32434/rb.h>Christoph Hellwig1-2/+0
There is no need to include genhd.h from a random arch header, and not doing so prevents the possibility for nasty include loops. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20210727055646.118787-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-07-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2-1/+3
Conflicting commits, all resolutions pretty trivial: drivers/bus/mhi/pci_generic.c 5c2c85315948 ("bus: mhi: pci-generic: configurable network interface MRU") 56f6f4c4eb2a ("bus: mhi: pci_generic: Apply no-op for wake using sideband wake boolean") drivers/nfc/s3fwrn5/firmware.c a0302ff5906a ("nfc: s3fwrn5: remove unnecessary label") 46573e3ab08f ("nfc: s3fwrn5: fix undefined parameter values in dev_err()") 801e541c79bb ("nfc: s3fwrn5: fix undefined parameter values in dev_err()") MAINTAINERS 7d901a1e878a ("net: phy: add Maxlinear GPY115/21x/24x driver") 8a7b46fa7902 ("MAINTAINERS: add Yasushi SHOJI as reviewer for the Microchip CAN BUS Analyzer Tool driver") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-07-31Merge tag 'net-5.14-rc4' of ↵Linus Torvalds1-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Networking fixes for 5.14-rc4, including fixes from bpf, can, WiFi (mac80211) and netfilter trees. Current release - regressions: - mac80211: fix starting aggregation sessions on mesh interfaces Current release - new code bugs: - sctp: send pmtu probe only if packet loss in Search Complete state - bnxt_en: add missing periodic PHC overflow check - devlink: fix phys_port_name of virtual port and merge error - hns3: change the method of obtaining default ptp cycle - can: mcba_usb_start(): add missing urb->transfer_dma initialization Previous releases - regressions: - set true network header for ECN decapsulation - mlx5e: RX, avoid possible data corruption w/ relaxed ordering and LRO - phy: re-add check for PHY_BRCM_DIS_TXCRXC_NOENRGY on the BCM54811 PHY - sctp: fix return value check in __sctp_rcv_asconf_lookup Previous releases - always broken: - bpf: - more spectre corner case fixes, introduce a BPF nospec instruction for mitigating Spectre v4 - fix OOB read when printing XDP link fdinfo - sockmap: fix cleanup related races - mac80211: fix enabling 4-address mode on a sta vif after assoc - can: - raw: raw_setsockopt(): fix raw_rcv panic for sock UAF - j1939: j1939_session_deactivate(): clarify lifetime of session object, avoid UAF - fix number of identical memory leaks in USB drivers - tipc: - do not blindly write skb_shinfo frags when doing decryption - fix sleeping in tipc accept routine" * tag 'net-5.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (91 commits) gve: Update MAINTAINERS list can: esd_usb2: fix memory leak can: ems_usb: fix memory leak can: usb_8dev: fix memory leak can: mcba_usb_start(): add missing urb->transfer_dma initialization can: hi311x: fix a signedness bug in hi3110_cmd() MAINTAINERS: add Yasushi SHOJI as reviewer for the Microchip CAN BUS Analyzer Tool driver bpf: Fix leakage due to insufficient speculative store bypass mitigation bpf: Introduce BPF nospec instruction for mitigating Spectre v4 sis900: Fix missing pci_disable_device() in probe and remove net: let flow have same hash in two directions nfc: nfcsim: fix use after free during module unload tulip: windbond-840: Fix missing pci_disable_device() in probe and remove sctp: fix return value check in __sctp_rcv_asconf_lookup nfc: s3fwrn5: fix undefined parameter values in dev_err() net/mlx5: Fix mlx5_vport_tbl_attr chain from u16 to u32 net/mlx5e: Fix nullptr in mlx5e_hairpin_get_mdev() net/mlx5: Unload device upon firmware fatal error net/mlx5e: Fix page allocation failure for ptp-RQ over SF net/mlx5e: Fix page allocation failure for trap-RQ over SF ...
2021-07-30Merge tag 'libata-5.14-2021-07-30' of git://git.kernel.dk/linux-blockLinus Torvalds1-1/+0
Pull libata fixlets from Jens Axboe: - A fix for PIO highmem (Christoph) - Kill HAVE_IDE as it's now unused (Lukas) * tag 'libata-5.14-2021-07-30' of git://git.kernel.dk/linux-block: arch: Kconfig: clean up obsolete use of HAVE_IDE libata: fix ata_pio_sector for CONFIG_HIGHMEM
2021-07-30arch: Kconfig: clean up obsolete use of HAVE_IDELukas Bulwahn1-1/+0
The arch-specific Kconfig files use HAVE_IDE to indicate if IDE is supported. As IDE support and the HAVE_IDE config vanishes with commit b7fb14d3ac63 ("ide: remove the legacy ide driver"), there is no need to mention HAVE_IDE in all those arch-specific Kconfig files. The issue was identified with ./scripts/checkkconfigsymbols.py. Fixes: b7fb14d3ac63 ("ide: remove the legacy ide driver") Suggested-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20210728182115.4401-1-lukas.bulwahn@gmail.com Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-07-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller1-0/+3
Daniel Borkmann says: ==================== pull-request: bpf 2021-07-29 The following pull-request contains BPF updates for your *net* tree. We've added 9 non-merge commits during the last 14 day(s) which contain a total of 20 files changed, 446 insertions(+), 138 deletions(-). The main changes are: 1) Fix UBSAN out-of-bounds splat for showing XDP link fdinfo, from Lorenz Bauer. 2) Fix insufficient Spectre v4 mitigation in BPF runtime, from Daniel Borkmann, Piotr Krysiuk and Benedict Schlueter. 3) Batch of fixes for BPF sockmap found under stress testing, from John Fastabend. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-29bpf: Introduce BPF nospec instruction for mitigating Spectre v4Daniel Borkmann1-0/+3
In case of JITs, each of the JIT backends compiles the BPF nospec instruction /either/ to a machine instruction which emits a speculation barrier /or/ to /no/ machine instruction in case the underlying architecture is not affected by Speculative Store Bypass or has different mitigations in place already. This covers both x86 and (implicitly) arm64: In case of x86, we use 'lfence' instruction for mitigation. In case of arm64, we rely on the firmware mitigation as controlled via the ssbd kernel parameter. Whenever the mitigation is enabled, it works for all of the kernel code with no need to provide any additional instructions here (hence only comment in arm64 JIT). Other archs can follow as needed. The BPF nospec instruction is specifically targeting Spectre v4 since i) we don't use a serialization barrier for the Spectre v1 case, and ii) mitigation instructions for v1 and v4 might be different on some archs. The BPF nospec is required for a future commit, where the BPF verifier does annotate intermediate BPF programs with speculation barriers. Co-developed-by: Piotr Krysiuk <piotras@gmail.com> Co-developed-by: Benedict Schlueter <benedict.schlueter@rub.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Piotr Krysiuk <piotras@gmail.com> Signed-off-by: Benedict Schlueter <benedict.schlueter@rub.de> Acked-by: Alexei Starovoitov <ast@kernel.org>
2021-07-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller2-1/+3
Conflicts are simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23compat: make linux/compat.h available everywhereArnd Bergmann1-13/+11
Parts of linux/compat.h are under an #ifdef, but we end up using more of those over time, moving things around bit by bit. To get it over with once and for all, make all of this file uncondititonal now so it can be accessed everywhere. There are only a few types left that are in asm/compat.h but not yet in the asm-generic version, so add those in the process. This requires providing a few more types in asm-generic/compat.h that were not already there. The only tricky one is compat_sigset_t, which needs a little help on 32-bit architectures and for x86. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-21MIPS: Malta: Do not byte-swap accesses to the CBUS UARTMaciej W. Rozycki1-1/+2
Correct big-endian accesses to the CBUS UART, a Malta on-board discrete TI16C550C part wired directly to the system controller's device bus, and do not use byte swapping with the 32-bit accesses to the device. The CBUS is used for devices such as the boot flash memory needed early on in system bootstrap even before PCI has been initialised. Therefore it uses the system controller's device bus, which follows the endianness set with the CPU, which means no byte-swapping is ever required for data accesses to CBUS, unlike with PCI. The CBUS UART uses the UPIO_MEM32 access method, that is the `readl' and `writel' MMIO accessors, which on the MIPS platform imply byte-swapping with PCI systems. Consequently the wrong byte lane is accessed with the big-endian configuration and the UART is not correctly accessed. As it happens the UPIO_MEM32BE access method makes use of the `ioread32' and `iowrite32' MMIO accessors, which still use `readl' and `writel' respectively, however they byte-swap data passed, effectively cancelling swapping done with the accessors themselves and making it suitable for the CBUS UART. Make the CBUS UART switch between UPIO_MEM32 and UPIO_MEM32BE then, based on the endianness selected. With this change in place the device is correctly recognised with big-endian Malta at boot, along with the Super I/O devices behind PCI: Serial: 8250/16550 driver, 5 ports, IRQ sharing enabled printk: console [ttyS0] disabled serial8250.0: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A printk: console [ttyS0] enabled printk: bootconsole [uart8250] disabled serial8250.0: ttyS1 at I/O 0x2f8 (irq = 3, base_baud = 115200) is a 16550A serial8250.0: ttyS2 at MMIO 0x1f000900 (irq = 20, base_baud = 230400) is a 16550A Fixes: e7c4782f92fc ("[MIPS] Put an end to <asm/serial.h>'s long and annyoing existence") Cc: stable@vger.kernel.org # v2.6.23+ Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Link: https://lore.kernel.org/r/alpine.DEB.2.21.2106260524430.37803@angie.orcam.me.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-21bus: Make remove callback return voidUwe Kleine-König1-2/+1
The driver core ignores the return value of this callback because there is only little it can do when a device disappears. This is the final bit of a long lasting cleanup quest where several buses were converted to also return void from their remove callback. Additionally some resource leaks were fixed that were caused by drivers returning an error code in the expectation that the driver won't go away. With struct bus_type::remove returning void it's prevented that newly implemented buses return an ignored error code and so don't anticipate wrong expectations for driver authors. Reviewed-by: Tom Rix <trix@redhat.com> (For fpga) Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Reviewed-by: Cornelia Huck <cohuck@redhat.com> (For drivers/s390 and drivers/vfio) Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> (For ARM, Amba and related parts) Acked-by: Mark Brown <broonie@kernel.org> Acked-by: Chen-Yu Tsai <wens@csie.org> (for sunxi-rsb) Acked-by: Pali Rohár <pali@kernel.org> Acked-by: Mauro Carvalho Chehab <mchehab@kernel.org> (for media) Acked-by: Hans de Goede <hdegoede@redhat.com> (For drivers/platform) Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Acked-By: Vinod Koul <vkoul@kernel.org> Acked-by: Juergen Gross <jgross@suse.com> (For xen) Acked-by: Lee Jones <lee.jones@linaro.org> (For mfd) Acked-by: Johannes Thumshirn <jth@kernel.org> (For mcb) Acked-by: Johan Hovold <johan@kernel.org> Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> (For slimbus) Acked-by: Kirti Wankhede <kwankhede@nvidia.com> (For vfio) Acked-by: Maximilian Luz <luzmaximilian@gmail.com> Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> (For ulpi and typec) Acked-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> (For ipack) Acked-by: Geoff Levand <geoff@infradead.org> (For ps3) Acked-by: Yehezkel Bernat <YehezkelShB@gmail.com> (For thunderbolt) Acked-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> (For intel_th) Acked-by: Dominik Brodowski <linux@dominikbrodowski.net> (For pcmcia) Acked-by: Rafael J. Wysocki <rafael@kernel.org> (For ACPI) Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org> (rpmsg and apr) Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> (For intel-ish-hid) Acked-by: Dan Williams <dan.j.williams@intel.com> (For CXL, DAX, and NVDIMM) Acked-by: William Breathitt Gray <vilhelm.gray@gmail.com> (For isa) Acked-by: Stefan Richter <stefanr@s5r6.in-berlin.de> (For firewire) Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> (For hid) Acked-by: Thorsten Scherer <t.scherer@eckelmann.de> (For siox) Acked-by: Sven Van Asbroeck <TheSven73@gmail.com> (For anybuss) Acked-by: Ulf Hansson <ulf.hansson@linaro.org> (For MMC) Acked-by: Wolfram Sang <wsa@kernel.org> # for I2C Acked-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Acked-by: Finn Thain <fthain@linux-m68k.org> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20210713193522.1770306-6-u.kleine-koenig@pengutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-13MIPS: Fix unreachable code issueGustavo A. R. Silva1-1/+1
Fix the following warning (mips-randconfig): arch/mips/include/asm/fpu.h:79:3: warning: fallthrough annotation in unreachable code [-Wimplicit-fallthrough] Originally, the /* fallthrough */ comment was introduced by commit: 597ce1723e0f ("MIPS: Support for 64-bit FP with O32 binaries") and it was wrongly replaced with fallthrough; by commit: c9b029903466 ("MIPS: Use fallthrough for arch/mips") As the original comment is actually useful, fix this issue by removing unreachable fallthrough; statement and place the original /* fallthrough */ comment back. Fixes: c9b029903466 ("MIPS: Use fallthrough for arch/mips") Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/lkml/60edca25.k00ut905IFBjPyt5%25lkp@intel.com/ Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2021-07-13MIPS: Fix fall-through warnings for ClangGustavo A. R. Silva1-0/+2
Fix the following fallthrough warnings: arch/mips/mm/tlbex.c:1386:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough] arch/mips/mm/tlbex.c:2173:3: error: unannotated fall-through between switch labels [-Werror,-Wimplicit-fallthrough] Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/lkml/60edca25.k00ut905IFBjPyt5%25lkp@intel.com/ Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2021-07-11Merge tag 'irq-urgent-2021-07-11' of ↵Linus Torvalds2-0/+19
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Ingo Molnar: "Two fixes: - Fix a MIPS IRQ handling RCU bug - Remove a DocBook annotation for a parameter that doesn't exist anymore" * tag 'irq-urgent-2021-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/mips: Fix RCU violation when using irqdomain lookup on interrupt entry genirq/irqdesc: Drop excess kernel-doc entry @lookup
2021-07-10Merge tag 'kbuild-v5.14' of ↵Linus Torvalds1-3/+0
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Increase the -falign-functions alignment for the debug option. - Remove ugly libelf checks from the top Makefile. - Make the silent build (-s) more silent. - Re-compile the kernel if KBUILD_BUILD_TIMESTAMP is specified. - Various script cleanups * tag 'kbuild-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (27 commits) scripts: add generic syscallnr.sh scripts: check duplicated syscall number in syscall table sparc: syscalls: use pattern rules to generate syscall headers parisc: syscalls: use pattern rules to generate syscall headers nds32: add arch/nds32/boot/.gitignore kbuild: mkcompile_h: consider timestamp if KBUILD_BUILD_TIMESTAMP is set kbuild: modpost: Explicitly warn about unprototyped symbols kbuild: remove trailing slashes from $(KBUILD_EXTMOD) kconfig.h: explain IS_MODULE(), IS_ENABLED() kconfig: constify long_opts scripts/setlocalversion: simplify the short version part scripts/setlocalversion: factor out 12-chars hash construction scripts/setlocalversion: add more comments to -dirty flag detection scripts/setlocalversion: remove workaround for old make-kpkg scripts/setlocalversion: remove mercurial, svn and git-svn supports kbuild: clean up ${quiet} checks in shell scripts kbuild: sink stdout from cmd for silent build init: use $(call cmd,) for generating include/generated/compile.h kbuild: merge scripts/mkmakefile to top Makefile sh: move core-y in arch/sh/Makefile to arch/sh/Kbuild ...
2021-07-10Merge tag 'mips_5.14_1' of ↵Linus Torvalds3-3/+5
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS fixes from Thomas Bogendoerfer: - fix for accesing gic via vdso - two build fixes * tag 'mips_5.14_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: vdso: Invalid GIC access through VDSO mips: disable branch profiling in boot/decompress.o mips: always link byteswap helpers into decompressor
2021-07-09Merge tag 'irqchip-fixes-5.14-1' of ↵Thomas Gleixner2-0/+19
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent Pull irqchip fixes from Marc Zyngier: - Fix a MIPS bug where irqdomain loopkups could occur in a context where RCU is not allowed - Fix a documentation bug for handle_domain_irq
2021-07-09MIPS: vdso: Invalid GIC access through VDSOMartin Fäcknitz1-1/+1
Accessing raw timers (currently only CLOCK_MONOTONIC_RAW) through VDSO doesn't return the correct time when using the GIC as clock source. The address of the GIC mapped page is in this case not calculated correctly. The GIC mapped page is calculated from the VDSO data by subtracting PAGE_SIZE: void *get_gic(const struct vdso_data *data) { return (void __iomem *)data - PAGE_SIZE; } However, the data pointer is not page aligned for raw clock sources. This is because the VDSO data for raw clock sources (CS_RAW = 1) is stored after the VDSO data for coarse clock sources (CS_HRES_COARSE = 0). Therefore, only the VDSO data for CS_HRES_COARSE is page aligned: +--------------------+ | | | vd[CS_RAW] | ---+ | vd[CS_HRES_COARSE] | | +--------------------+ | -PAGE_SIZE | | | | GIC mapped page | <--+ | | +--------------------+ When __arch_get_hw_counter() is called with &vd[CS_RAW], get_gic returns the wrong address (somewhere inside the GIC mapped page). The GIC counter values are not returned which results in an invalid time. Fixes: a7f4df4e21dd ("MIPS: VDSO: Add implementations of gettimeofday() and clock_gettime()") Signed-off-by: Martin Fäcknitz <faecknitz@hotsplots.de> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-07-09irqchip/mips: Fix RCU violation when using irqdomain lookup on interrupt entryMarc Zyngier2-0/+19
Since d4a45c68dc81 ("irqdomain: Protect the linear revmap with RCU"), any irqdomain lookup requires the RCU read lock to be held. This assumes that the architecture code will be structured such as irq_enter() will be called *before* the interrupt is looked up in the irq domain. However, this isn't the case for MIPS, and a number of drivers are structured to do it the other way around when handling an interrupt in their root irqchip (secondary irqchips are OK by construction). This results in a RCU splat on a lockdep-enabled kernel when the kernel takes an interrupt from idle, as reported by Guenter Roeck. Note that this could have fired previously if any driver had used tree-based irqdomain, which always had the RCU requirement. To solve this, provide a MIPS-specific helper (do_domain_IRQ()) as the pendent of do_IRQ() that will do thing in the right order (and maybe save some cycles in the process). Ideally, MIPS would be moved over to using handle_domain_irq(), but that's much more ambitious. Reported-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> [maz: add dependency on CONFIG_IRQ_DOMAIN after report from the kernelci bot] Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Serge Semin <fancer.lancer@gmail.com> Link: https://lore.kernel.org/r/20210705172352.GA56304@roeck-us.net Link: https://lore.kernel.org/r/20210706110647.3979002-1-maz@kernel.org
2021-07-08mm: rename p4d_page_vaddr to p4d_pgtable and make it return pud_t *Aneesh Kumar K.V1-2/+2
No functional change in this patch. [aneesh.kumar@linux.ibm.com: m68k build error reported by kernel robot] Link: https://lkml.kernel.org/r/87tulxnb2v.fsf@linux.ibm.com Link: https://lkml.kernel.org/r/20210615110859.320299-2-aneesh.kumar@linux.ibm.com Link: https://lore.kernel.org/linuxppc-dev/CAHk-=wi+J+iodze9FtjM3Zi4j4OeS+qqbKxME9QN4roxPEXH9Q@mail.gmail.com/ Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Hugh Dickins <hughd@google.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-08mm: rename pud_page_vaddr to pud_pgtable and make it return pmd_t *Aneesh Kumar K.V1-2/+2
No functional change in this patch. [aneesh.kumar@linux.ibm.com: fix] Link: https://lkml.kernel.org/r/87wnqtnb60.fsf@linux.ibm.com [sfr@canb.auug.org.au: another fix] Link: https://lkml.kernel.org/r/20210619134410.89559-1-aneesh.kumar@linux.ibm.com Link: https://lkml.kernel.org/r/20210615110859.320299-1-aneesh.kumar@linux.ibm.com Link: https://lore.kernel.org/linuxppc-dev/CAHk-=wi+J+iodze9FtjM3Zi4j4OeS+qqbKxME9QN4roxPEXH9Q@mail.gmail.com/ Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Hugh Dickins <hughd@google.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-06Merge tag 'tty-5.14-rc1' of ↵Linus Torvalds1-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty / serial updates from Greg KH: "Here is the big set of tty and serial driver patches for 5.14-rc1. A bit more than normal, but nothing major, lots of cleanups. Highlights are: - lots of tty api cleanups and mxser driver cleanups from Jiri - build warning fixes - various serial driver updates - coding style cleanups - various tty driver minor fixes and updates - removal of broken and disable r3964 line discipline (finally!) All of these have been in linux-next for a while with no reported issues" * tag 'tty-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (227 commits) serial: mvebu-uart: remove unused member nb from struct mvebu_uart arm64: dts: marvell: armada-37xx: Fix reg for standard variant of UART dt-bindings: mvebu-uart: fix documentation serial: mvebu-uart: correctly calculate minimal possible baudrate serial: mvebu-uart: do not allow changing baudrate when uartclk is not available serial: mvebu-uart: fix calculation of clock divisor tty: make linux/tty_flip.h self-contained serial: Prefer unsigned int to bare use of unsigned serial: 8250: 8250_omap: Fix possible interrupt storm on K3 SoCs serial: qcom_geni_serial: use DT aliases according to DT bindings Revert "tty: serial: Add UART driver for Cortina-Access platform" tty: serial: Add UART driver for Cortina-Access platform MAINTAINERS: add me back as mxser maintainer mxser: Documentation, fix typos mxser: Documentation, make the docs up-to-date mxser: Documentation, remove traces of callout device mxser: introduce mxser_16550A_or_MUST helper mxser: rename flags to old_speed in mxser_set_serial_info mxser: use port variable in mxser_set_serial_info mxser: access info->MCR under info->slock ...
2021-07-06Merge tag 'staging-5.14-rc1' of ↵Linus Torvalds1-0/+10
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging / IIO driver updates from Greg KH: "Here is the big set of IIO and staging driver patches for 5.14-rc1. Loads of IIO driver updates and additions in here, the shortlog has the full details. For the staging side, we moved a few drivers out of staging, and deleted the kpc2000 drivers as the original developer asked us to because no one was working on them anymore. Also in here are loads of coding style cleanups due to different intern projects focusing on the staging tree to try to get experience doing kernel development. All of these have been in the linux-next tree for a while with no reported problems" * tag 'staging-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (744 commits) staging: hi6421-spmi-pmic: cleanup some macros staging: hi6421-spmi-pmic: change identation of a table staging: hi6421-spmi-pmic: change a return code staging: hi6421-spmi-pmic: better name IRQs staging: hi6421-spmi-pmic: use devm_request_threaded_irq() staging: hisilicon,hi6421-spmi-pmic.yaml: cleanup descriptions spmi: hisi-spmi-controller: move driver from staging phy: phy-hi3670-usb3: move driver from staging into phy staging: rtl8188eu: remove include/rtw_debug.h header staging: rtl8188eu: remove GlobalDebugLevel variable staging: rtl8188eu: remove DRIVER_PREFIX preprocessor definition staging: rtl8188eu: remove RT_TRACE macro staging: rtl8188eu: remove all RT_TRACE calls from hal/rtl8188eu_recv.c staging: rtl8188eu: remove all RT_TRACE calls from hal/hal_intf.c staging: rtl8188eu: remove all RT_TRACE calls from hal/rtl8188eu_xmit.c staging: rtl8188eu: remove all RT_TRACE calls from core/rtw_xmit.c staging: rtl8188eu: remove all RT_TRACE calls from core/rtw_pwrctrl.c staging: rtl8188eu: remove all RT_TRACE calls from core/rtw_recv.c staging: rtl8188eu: remove all RT_TRACE calls from core/rtw_ioctl_set.c staging: rtl8188eu: remove all RT_TRACE calls from core/rtw_ieee80211.c ...
2021-07-05mips: disable branch profiling in boot/decompress.oRandy Dunlap1-0/+2
Use DISABLE_BRANCH_PROFILING for arch/mips/boot/compressed/decompress.o to prevent linkage errors. mips64-linux-ld: arch/mips/boot/compressed/decompress.o: in function `LZ4_decompress_fast_extDict': decompress.c:(.text+0x8c): undefined reference to `ftrace_likely_update' mips64-linux-ld: decompress.c:(.text+0xf4): undefined reference to `ftrace_likely_update' mips64-linux-ld: decompress.c:(.text+0x200): undefined reference to `ftrace_likely_update' mips64-linux-ld: decompress.c:(.text+0x230): undefined reference to `ftrace_likely_update' mips64-linux-ld: decompress.c:(.text+0x320): undefined reference to `ftrace_likely_update' mips64-linux-ld: arch/mips/boot/compressed/decompress.o:decompress.c:(.text+0x3f4): more undefined references to `ftrace_likely_update' follow Fixes: e76e1fdfa8f8 ("lib: add support for LZ4-compressed kernel") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: linux-mips@vger.kernel.org Cc: Kyungsik Lee <kyungsik.lee@lge.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-07-05mips: always link byteswap helpers into decompressorArnd Bergmann1-2/+2
My series to clean up the unaligned access implementation across architectures caused some mips randconfig builds to fail with: mips64-linux-ld: arch/mips/boot/compressed/decompress.o: in function `decompress_kernel': decompress.c:(.text.decompress_kernel+0x54): undefined reference to `__bswapsi2' It turns out that this problem has already been fixed for the XZ decompressor but now it also shows up in (at least) LZO and LZ4. From my analysis I concluded that the compiler could always have emitted those calls, but the different implementation allowed it to make otherwise better decisions about not inlining the byteswap, which results in the link error when the out-of-line code is missing. While it could be addressed by adding it to the two decompressor implementations that are known to be affected, but as this only adds 112 bytes to the kernel, the safer choice is to always add them. Fixes: c50ec6787536 ("MIPS: zboot: Fix the build with XZ compression on older GCC versions") Fixes: 0652035a5794 ("asm-generic: unaligned: remove byteshift helpers") Link: https://lore.kernel.org/linux-mm/202106301304.gz2wVY9w-lkp@intel.com/ Link: https://lore.kernel.org/linux-mm/202106260659.TyMe8mjr-lkp@intel.com/ Link: https://lore.kernel.org/linux-mm/202106172016.onWT6Tza-lkp@intel.com/ Link: https://lore.kernel.org/linux-mm/202105231743.JJcALnhS-lkp@intel.com/ Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-07-02Merge tag 'asm-generic-unaligned-5.14' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm/unaligned.h unification from Arnd Bergmann: "Unify asm/unaligned.h around struct helper The get_unaligned()/put_unaligned() helpers are traditionally architecture specific, with the two main variants being the "access-ok.h" version that assumes unaligned pointer accesses always work on a particular architecture, and the "le-struct.h" version that casts the data to a byte aligned type before dereferencing, for architectures that cannot always do unaligned accesses in hardware. Based on the discussion linked below, it appears that the access-ok version is not realiable on any architecture, but the struct version probably has no downsides. This series changes the code to use the same implementation on all architectures, addressing the few exceptions separately" Link: https://lore.kernel.org/lkml/75d07691-1e4f-741f-9852-38c0b4f520bc@synopsys.com/ Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100363 Link: https://lore.kernel.org/lkml/20210507220813.365382-14-arnd@kernel.org/ Link: git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic.git unaligned-rework-v2 Link: https://lore.kernel.org/lkml/CAHk-=whGObOKruA_bU3aPGZfoDqZM1_9wBkwREp0H0FgR-90uQ@mail.gmail.com/ * tag 'asm-generic-unaligned-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: asm-generic: simplify asm/unaligned.h asm-generic: uaccess: 1-byte access is always aligned netpoll: avoid put_unaligned() on single character mwifiex: re-fix for unaligned accesses apparmor: use get_unaligned() only for multi-byte words partitions: msdos: fix one-byte get_unaligned() asm-generic: unaligned always use struct helpers asm-generic: unaligned: remove byteshift helpers powerpc: use linux/unaligned/le_struct.h on LE power7 m68k: select CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS sh: remove unaligned access for sh4a openrisc: always use unaligned-struct header asm-generic: use asm-generic/unaligned.h for most architectures
2021-07-02Merge branch 'akpm' (patches from Andrew)Linus Torvalds8-10/+6
Merge more updates from Andrew Morton: "190 patches. Subsystems affected by this patch series: mm (hugetlb, userfaultfd, vmscan, kconfig, proc, z3fold, zbud, ras, mempolicy, memblock, migration, thp, nommu, kconfig, madvise, memory-hotplug, zswap, zsmalloc, zram, cleanups, kfence, and hmm), procfs, sysctl, misc, core-kernel, lib, lz4, checkpatch, init, kprobes, nilfs2, hfs, signals, exec, kcov, selftests, compress/decompress, and ipc" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (190 commits) ipc/util.c: use binary search for max_idx ipc/sem.c: use READ_ONCE()/WRITE_ONCE() for use_global_lock ipc: use kmalloc for msg_queue and shmid_kernel ipc sem: use kvmalloc for sem_undo allocation lib/decompressors: remove set but not used variabled 'level' selftests/vm/pkeys: exercise x86 XSAVE init state selftests/vm/pkeys: refill shadow register after implicit kernel write selftests/vm/pkeys: handle negative sys_pkey_alloc() return code selftests/vm/pkeys: fix alloc_random_pkey() to make it really, really random kcov: add __no_sanitize_coverage to fix noinstr for all architectures exec: remove checks in __register_bimfmt() x86: signal: don't do sas_ss_reset() until we are certain that sigframe won't be abandoned hfsplus: report create_date to kstat.btime hfsplus: remove unnecessary oom message nilfs2: remove redundant continue statement in a while-loop kprobes: remove duplicated strong free_insn_page in x86 and s390 init: print out unknown kernel parameters checkpatch: do not complain about positive return values starting with EPOLL checkpatch: improve the indented label test checkpatch: scripts/spdxcheck.py now requires python3 ...
2021-07-02Merge tag 'mips_5.14' of ↵Linus Torvalds37-115/+250
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS updates from Thomas Bogendoerfer: - add support for OpeneEmbed SOM9331 board - Ingenic fixes/improvments - other fixes and cleanups * tag 'mips_5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (39 commits) MIPS: Fix PKMAP with 32-bit MIPS huge page support MIPS: CI20: Add second percpu timer for SMP. MIPS: CI20: Reduce clocksource to 750 kHz. MIPS: Ingenic: Add MAC syscon nodes for Ingenic SoCs. dt-bindings: clock: Add documentation for MAC PHY control bindings. MIPS: X1830: Respect cell count of common properties. MIPS: set mips32r5 for virt extensions MIPS: loongsoon64: Reserve memory below starting pfn to prevent Oops MIPS: MT extensions are not available on MIPS32r1 mips/kvm: Use BUG_ON instead of if condition followed by BUG MIPS: OCTEON: octeon-usb: Use devm_platform_get_and_ioremap_resource() MIPS: add PMD table accounting into MIPS'pmd_alloc_one MIPS: Loongson64: fix spelling of SPDX tag MIPS: ingenic: rs90: Add dedicated VRAM memory region MIPS: ingenic: gcw0: Set codec to cap-less mode for FM radio MIPS: ingenic: jz4780: Fix I2C nodes to match DT doc MIPS: ingenic: Select CPU_SUPPORTS_CPUFREQ && MIPS_EXTERNAL_TIMER MIPS: Kconfig: ingenic: Ensure MACH_INGENIC_GENERIC selects all SoCs MIPS: cpu-probe: Fix FPU detection on Ingenic JZ4760(B) MIPS: boot: Support specifying UART port on Ingenic SoCs ...
2021-07-02Merge tag 'pinctrl-v5.14-1' of ↵Linus Torvalds11-717/+7
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control updates from Linus Walleij: "This is the bulk of pin control changes for the v5.14 kernel. Not so much going on. No core changes, just drivers. The most interesting would be that MIPS Ralink is migrating to pin control and we have some bindings but not yet code for the Apple M1 pin controller. New drivers: - Last merge window we created a driver for the Ralink RT2880. We are now moving the Ralink SoC pin control drivers out of the MIPS architecture code and into the pin control subsystem. This concerns RT288X, MT7620, RT305X, RT3883 and MT7621. - Qualcomm SM6125 SoC pin control driver. - Qualcomm spmi-gpio support for PM7325. - Qualcomm spmi-mpp also handles PMI8994 (just a compatible string) - Mediatek MT8365 SoC pin controller. - New device HID for the AMD GPIO controller. Improvements: - Pin bias config support for a slew of Renesas pin controllers. - Incremental improvements and non-urgent bug fixes to the Renesas SoC drivers. - Implement irq_set_wake on the AMD pin controller so we can wake up from external pin events. Misc: - Devicetree bindings for the Apple M1 pin controller, we will probably see a proper driver for this soon as well" * tag 'pinctrl-v5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (54 commits) pinctrl: ralink: rt305x: add missing include pinctrl: stm32: check for IRQ MUX validity during alloc() pinctrl: zynqmp: some code cleanups drivers: qcom: pinctrl: Add pinctrl driver for sm6125 dt-bindings: pinctrl: qcom: sm6125: Document SM6125 pinctrl driver dt-bindings: pinctrl: mcp23s08: add documentation for reset-gpios pinctrl: mcp23s08: Add optional reset GPIO pinctrl: mediatek: fix mode encoding pinctrl: mcp23s08: Fix missing unlock on error in mcp23s08_irq() pinctrl: bcm: Constify static pinmux_ops pinctrl: bcm: Constify static pinctrl_ops pinctrl: ralink: move RT288X SoC pinmux config into a new 'pinctrl-rt288x.c' file pinctrl: ralink: move MT7620 SoC pinmux config into a new 'pinctrl-mt7620.c' file pinctrl: ralink: move RT305X SoC pinmux config into a new 'pinctrl-rt305x.c' file pinctrl: ralink: move RT3883 SoC pinmux config into a new 'pinctrl-rt3883.c' file pinctrl: ralink: move MT7621 SoC pinmux config into a new 'pinctrl-mt7621.c' file pinctrl: ralink: move ralink architecture pinmux header into the driver pinctrl: single: config: enable the pin's input pinctrl: mtk: Fix mt8365 Kconfig dependency pinctrl: mcp23s08: fix race condition in irq handler ...
2021-07-01Merge tag 'clk-for-linus' of ↵Linus Torvalds6-159/+34
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk updates from Stephen Boyd: "This round has a diffstat dominated by Qualcomm clk drivers. Honestly though that's just a bunch of data so the diffstat reflects that. Looking beyond that there's just a bunch of updates all around in various clk drivers. Renesas and NXP (for i.MX) are two SoC vendors that have a lot of patches in here. Overall the driver changes look to be mostly enabling more clks and non-critical fixes that we could hold until the next merge window. I'm especially excited about the series from Arnd that graduates clkdev to be the only implementation of clk_get() and clk_put(). That's a good step in the right direction to migreate eveerything over to the common clk framework. Now we don't have to worry about clkdev specific details, they're just part of the clk API now. Core: - clkdev is now the only option, i.e. clk_get()/clk_put() is implemented in only one place in the kernel instead of in drivers/clk/clkdev.c and in architectures that want their own implementation New Drivers: - Texas Instruments' LMK04832 Ultra Low-Noise JESD204B Compliant Clock Jitter Cleaner With Dual Loop PLLs - Qualcomm MDM9607 GCC - Qualcomm SC8180X display clks - Qualcomm SM6125 GCC - Qualcomm SM8250 CAMCC (camera) - Renesas RZ/G2L SoC - Hisilicon hi3559A SoC Updates: - Stop using clock-output-names in ST clk drivers (yay!) - Support secure mode of STM32MP1 SoCs - Improve clock support for Actions S500 SoC - duty cycle setting support on qcom clks - Add TI am33xx spread spectrum clock support - Use determine_rate() for the Amlogic pll ops instead of round_rate() - Restrict Amlogic gp0/1 and audio plls range on g12a/sm1 - Improve Amlogic axg-audio controller error on deferral - Add NNA clocks on Amlogic g12a - Reduce memory footprint of Rockchip PLL rate tables - A fix for the newly added Rockchip rk3568 clk driver - Exported clock for the newly added Rockchip video decoder - Remove audio ipg clock from i.MX8MP - Remove deprecated legacy clock binding for i.MX SCU clock driver - Use common clk-imx8qxp for both i.MX8QXP and i.MX8QM - Add multiple clocks to clk-imx8qxp driver (enet, hdmi, lcdif, audio, parallel interface) - Add dedicated clock ops for i.MX paralel interface - Different fixes for clocks controlled by ATF on i.MX SoCs - Add A53/A72 frequency scaling support i.MX clk-scu driver - Add special case for DCSS clock on suspend for i.MX clk-scu driver - Add parent save/restore on suspend/resume to i.MX clk-scu driver - Skip runtime PM enablement for CPU clocks in i.MX clk-scu driver - Remove the sys1_pll/sys2_pll clock gates for i.MX8MQ and their bindings - Tegra clk driver no longer deasserts resets on clk_enable as it gets in the way of certain power-up sequences - Fix compile testing for Tegra clk driver - One patch to fix a divider on the Allwinner v3s Audio PLL - Add support for CPU core clock boost modes on Renesas R-Car Gen3 - Add ISPCS (Image Signal Processor) clocks on Renesas R-Car V3U - Switch SH/R-Mobile and R-Car "DIV6" clocks to .determine_rate() and improve support for multiple parents - Switch Renesas RZ/N1 divider clocks to .determine_rate() - Add ZA2 (Audio Clock Generator) clock on Renesas R-Car D3 - Convert ar7 to common clk framework - Convert ralink to common clk framework" * tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (161 commits) clk: zynqmp: Handle divider specific read only flag clk: zynqmp: Use firmware specific mux clock flags clk: zynqmp: Use firmware specific divider clock flags clk: zynqmp: Use firmware specific common clock flags clk: lmk04832: Use of match table clk: lmk04832: Depend on SPI clk: stm32mp1: new compatible for secure RCC support dt-bindings: clock: stm32mp1 new compatible for secure rcc dt-bindings: reset: add MCU HOLD BOOT ID for SCMI reset domains on stm32mp15 dt-bindings: reset: add IDs for SCMI reset domains on stm32mp15 dt-bindings: clock: add IDs for SCMI clocks on stm32mp15 reset: stm32mp1: remove stm32mp1 reset clk: hisilicon: Add clock driver for hi3559A SoC dt-bindings: Document the hi3559a clock bindings clk: si5341: Add sysfs properties to allow checking/resetting device faults clk: si5341: Add silabs,iovdd-33 property clk: si5341: Add silabs,xaxb-ext-clk property clk: si5341: Allow different output VDD_SEL values clk: si5341: Update initialization magic clk: si5341: Check for input clock presence and PLL lock on startup ...
2021-07-01Merge tag 'fs_for_v5.14-rc1' of ↵Linus Torvalds3-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull misc fs updates from Jan Kara: "The new quotactl_fd() syscall (remake of quotactl_path() syscall that got introduced & disabled in 5.13 cycle), and couple of udf, reiserfs, isofs, and writeback fixes and cleanups" * tag 'fs_for_v5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: writeback: fix obtain a reference to a freeing memcg css quota: remove unnecessary oom message isofs: remove redundant continue statement quota: Wire up quotactl_fd syscall quota: Change quotactl_path() systcall to an fd-based one reiserfs: Remove unneed check in reiserfs_write_full_page() udf: Fix NULL pointer dereference in udf_symlink function reiserfs: add check for invalid 1st journal block
2021-07-01kernel.h: split out panic and oops helpersAndy Shevchenko3-0/+3
kernel.h is being used as a dump for all kinds of stuff for a long time. Here is the attempt to start cleaning it up by splitting out panic and oops helpers. There are several purposes of doing this: - dropping dependency in bug.h - dropping a loop by moving out panic_notifier.h - unload kernel.h from something which has its own domain At the same time convert users tree-wide to use new headers, although for the time being include new header back to kernel.h to avoid twisted indirected includes for existing users. [akpm@linux-foundation.org: thread_info.h needs limits.h] [andriy.shevchenko@linux.intel.com: ia64 fix] Link: https://lkml.kernel.org/r/20210520130557.55277-1-andriy.shevchenko@linux.intel.com Link: https://lkml.kernel.org/r/20210511074137.33666-1-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Co-developed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Corey Minyard <cminyard@mvista.com> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Wei Liu <wei.liu@kernel.org> Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Sebastian Reichel <sre@kernel.org> Acked-by: Luis Chamberlain <mcgrof@kernel.org> Acked-by: Stephen Boyd <sboyd@kernel.org> Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Acked-by: Helge Deller <deller@gmx.de> # parisc Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-01mm/thp: define default pmd_pgtable()Anshuman Khandual1-1/+0
Currently most platforms define pmd_pgtable() as pmd_page() duplicating the same code all over. Instead just define a default value i.e pmd_page() for pmd_pgtable() and let platforms override when required via <asm/pgtable.h>. All the existing platform that override pmd_pgtable() have been moved into their respective <asm/pgtable.h> header in order to precede before the new generic definition. This makes it much cleaner with reduced code. Link: https://lkml.kernel.org/r/1623646133-20306-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Nick Hu <nickhu@andestech.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Brian Cain <bcain@codeaurora.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Michal Simek <monstr@monstr.eu> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Stafford Horne <shorne@gmail.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Chris Zankel <chris@zankel.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-01mm: define default value for FIRST_USER_ADDRESSAnshuman Khandual2-2/+0
Currently most platforms define FIRST_USER_ADDRESS as 0UL duplication the same code all over. Instead just define a generic default value (i.e 0UL) for FIRST_USER_ADDRESS and let the platforms override when required. This makes it much cleaner with reduced code. The default FIRST_USER_ADDRESS here would be skipped in <linux/pgtable.h> when the given platform overrides its value via <asm/pgtable.h>. Link: https://lkml.kernel.org/r/1620615725-24623-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Acked-by: Guo Ren <guoren@kernel.org> [csky] Acked-by: Stafford Horne <shorne@gmail.com> [openrisc] Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Acked-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Palmer Dabbelt <palmerdabbelt@google.com> [RISC-V] Cc: Richard Henderson <rth@twiddle.net> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Brian Cain <bcain@codeaurora.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Michal Simek <monstr@monstr.eu> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Stafford Horne <shorne@gmail.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Chris Zankel <chris@zankel.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-01mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault page tablesDavid Hildenbrand1-0/+3
I. Background: Sparse Memory Mappings When we manage sparse memory mappings dynamically in user space - also sometimes involving MAP_NORESERVE - we want to dynamically populate/ discard memory inside such a sparse memory region. Example users are hypervisors (especially implementing memory ballooning or similar technologies like virtio-mem) and memory allocators. In addition, we want to fail in a nice way (instead of generating SIGBUS) if populating does not succeed because we are out of backend memory (which can happen easily with file-based mappings, especially tmpfs and hugetlbfs). While MADV_DONTNEED, MADV_REMOVE and FALLOC_FL_PUNCH_HOLE allow for reliably discarding memory for most mapping types, there is no generic approach to populate page tables and preallocate memory. Although mmap() supports MAP_POPULATE, it is not applicable to the concept of sparse memory mappings, where we want to populate/discard dynamically and avoid expensive/problematic remappings. In addition, we never actually report errors during the final populate phase - it is best-effort only. fallocate() can be used to preallocate file-based memory and fail in a safe way. However, it cannot really be used for any private mappings on anonymous files via memfd due to COW semantics. In addition, fallocate() does not actually populate page tables, so we still always get pagefaults on first access - which is sometimes undesired (i.e., real-time workloads) and requires real prefaulting of page tables, not just a preallocation of backend storage. There might be interesting use cases for sparse memory regions along with mlockall(MCL_ONFAULT) which fallocate() cannot satisfy as it does not prefault page tables. II. On preallcoation/prefaulting from user space Because we don't have a proper interface, what applications (like QEMU and databases) end up doing is touching (i.e., reading+writing one byte to not overwrite existing data) all individual pages. However, that approach 1) Can result in wear on storage backing, because we end up reading/writing each page; this is especially a problem for dax/pmem. 2) Can result in mmap_sem contention when prefaulting via multiple threads. 3) Requires expensive signal handling, especially to catch SIGBUS in case of hugetlbfs/shmem/file-backed memory. For example, this is problematic in hypervisors like QEMU where SIGBUS handlers might already be used by other subsystems concurrently to e.g, handle hardware errors. "Simply" doing preallocation concurrently from other thread is not that easy. III. On MADV_WILLNEED Extending MADV_WILLNEED is not an option because 1. It would change the semantics: "Expect access in the near future." and "might be a good idea to read some pages" vs. "Definitely populate/ preallocate all memory and definitely fail on errors.". 2. Existing users (like virtio-balloon in QEMU when deflating the balloon) don't want populate/prealloc semantics. They treat this rather as a hint to give a little performance boost without too much overhead - and don't expect that a lot of memory might get consumed or a lot of time might be spent. IV. MADV_POPULATE_READ and MADV_POPULATE_WRITE Let's introduce MADV_POPULATE_READ and MADV_POPULATE_WRITE, inspired by MAP_POPULATE, with the following semantics: 1. MADV_POPULATE_READ can be used to prefault page tables just like manually reading each individual page. This will not break any COW mappings. The shared zero page might get mapped and no backend storage might get preallocated -- allocation might be deferred to write-fault time. Especially shared file mappings require an explicit fallocate() upfront to actually preallocate backend memory (blocks in the file system) in case the file might have holes. 2. If MADV_POPULATE_READ succeeds, all page tables have been populated (prefaulted) readable once. 3. MADV_POPULATE_WRITE can be used to preallocate backend memory and prefault page tables just like manually writing (or reading+writing) each individual page. This will break any COW mappings -- e.g., the shared zeropage is never populated. 4. If MADV_POPULATE_WRITE succeeds, all page tables have been populated (prefaulted) writable once. 5. MADV_POPULATE_READ and MADV_POPULATE_WRITE cannot be applied to special mappings marked with VM_PFNMAP and VM_IO. Also, proper access permissions (e.g., PROT_READ, PROT_WRITE) are required. If any such mapping is encountered, madvise() fails with -EINVAL. 6. If MADV_POPULATE_READ or MADV_POPULATE_WRITE fails, some page tables might have been populated. 7. MADV_POPULATE_READ and MADV_POPULATE_WRITE will return -EHWPOISON when encountering a HW poisoned page in the range. 8. Similar to MAP_POPULATE, MADV_POPULATE_READ and MADV_POPULATE_WRITE cannot protect from the OOM (Out Of Memory) handler killing the process. While the use case for MADV_POPULATE_WRITE is fairly obvious (i.e., preallocate memory and prefault page tables for VMs), one issue is that whenever we prefault pages writable, the pages have to be marked dirty, because the CPU could dirty them any time. while not a real problem for hugetlbfs or dax/pmem, it can be a problem for shared file mappings: each page will be marked dirty and has to be written back later when evicting. MADV_POPULATE_READ allows for optimizing this scenario: Pre-read a whole mapping from backend storage without marking it dirty, such that eviction won't have to write it back. As discussed above, shared file mappings might require an explciit fallocate() upfront to achieve preallcoation+prepopulation. Although sparse memory mappings are the primary use case, this will also be useful for other preallocate/prefault use cases where MAP_POPULATE is not desired or the semantics of MAP_POPULATE are not sufficient: as one example, QEMU users can trigger preallocation/prefaulting of guest RAM after the mapping was created -- and don't want errors to be silently suppressed. Looking at the history, MADV_POPULATE was already proposed in 2013 [1], however, the main motivation back than was performance improvements -- which should also still be the case. V. Single-threaded performance comparison I did a short experiment, prefaulting page tables on completely *empty mappings/files* and repeated the experiment 10 times. The results correspond to the shortest execution time. In general, the performance benefit for huge pages is negligible with small mappings. V.1: Private mappings POPULATE_READ and POPULATE_WRITE is fastest. Note that Reading/POPULATE_READ will populate the shared zeropage where applicable -- which result in short population times. The fastest way to allocate backend storage (here: swap or huge pages) and prefault page tables is POPULATE_WRITE. V.2: Shared mappings fallocate() is fastest, however, doesn't prefault page tables. POPULATE_WRITE is faster than simple writes and read/writes. POPULATE_READ is faster than simple reads. Without a fd, the fastest way to allocate backend storage and prefault page tables is POPULATE_WRITE. With an fd, the fastest way is usually FALLOCATE+POPULATE_READ or FALLOCATE+POPULATE_WRITE respectively; one exception are actual files: FALLOCATE+Read is slightly faster than FALLOCATE+POPULATE_READ. The fastest way to allocate backend storage prefault page tables is FALLOCATE+POPULATE_WRITE -- except when dealing with actual files; then, FALLOCATE+POPULATE_READ is fastest and won't directly mark all pages as dirty. v.3: Detailed results ================================================== 2 MiB MAP_PRIVATE: ************************************************** Anon 4 KiB : Read : 0.119 ms Anon 4 KiB : Write : 0.222 ms Anon 4 KiB : Read/Write : 0.380 ms Anon 4 KiB : POPULATE_READ : 0.060 ms Anon 4 KiB : POPULATE_WRITE : 0.158 ms Memfd 4 KiB : Read : 0.034 ms Memfd 4 KiB : Write : 0.310 ms Memfd 4 KiB : Read/Write : 0.362 ms Memfd 4 KiB : POPULATE_READ : 0.039 ms Memfd 4 KiB : POPULATE_WRITE : 0.229 ms Memfd 2 MiB : Read : 0.030 ms Memfd 2 MiB : Write : 0.030 ms Memfd 2 MiB : Read/Write : 0.030 ms Memfd 2 MiB : POPULATE_READ : 0.030 ms Memfd 2 MiB : POPULATE_WRITE : 0.030 ms tmpfs : Read : 0.033 ms tmpfs : Write : 0.313 ms tmpfs : Read/Write : 0.406 ms tmpfs : POPULATE_READ : 0.039 ms tmpfs : POPULATE_WRITE : 0.285 ms file : Read : 0.033 ms file : Write : 0.351 ms file : Read/Write : 0.408 ms file : POPULATE_READ : 0.039 ms file : POPULATE_WRITE : 0.290 ms hugetlbfs : Read : 0.030 ms hugetlbfs : Write : 0.030 ms hugetlbfs : Read/Write : 0.030 ms hugetlbfs : POPULATE_READ : 0.030 ms hugetlbfs : POPULATE_WRITE : 0.030 ms ************************************************** 4096 MiB MAP_PRIVATE: ************************************************** Anon 4 KiB : Read : 237.940 ms Anon 4 KiB : Write : 708.409 ms Anon 4 KiB : Read/Write : 1054.041 ms Anon 4 KiB : POPULATE_READ : 124.310 ms Anon 4 KiB : POPULATE_WRITE : 572.582 ms Memfd 4 KiB : Read : 136.928 ms Memfd 4 KiB : Write : 963.898 ms Memfd 4 KiB : Read/Write : 1106.561 ms Memfd 4 KiB : POPULATE_READ : 78.450 ms Memfd 4 KiB : POPULATE_WRITE : 805.881 ms Memfd 2 MiB : Read : 357.116 ms Memfd 2 MiB : Write : 357.210 ms Memfd 2 MiB : Read/Write : 357.606 ms Memfd 2 MiB : POPULATE_READ : 356.094 ms Memfd 2 MiB : POPULATE_WRITE : 356.937 ms tmpfs : Read : 137.536 ms tmpfs : Write : 954.362 ms tmpfs : Read/Write : 1105.954 ms tmpfs : POPULATE_READ : 80.289 ms tmpfs : POPULATE_WRITE : 822.826 ms file : Read : 137.874 ms file : Write : 987.025 ms file : Read/Write : 1107.439 ms file : POPULATE_READ : 80.413 ms file : POPULATE_WRITE : 857.622 ms hugetlbfs : Read : 355.607 ms hugetlbfs : Write : 355.729 ms hugetlbfs : Read/Write : 356.127 ms hugetlbfs : POPULATE_READ : 354.585 ms hugetlbfs : POPULATE_WRITE : 355.138 ms ************************************************** 2 MiB MAP_SHARED: ************************************************** Anon 4 KiB : Read : 0.394 ms Anon 4 KiB : Write : 0.348 ms Anon 4 KiB : Read/Write : 0.400 ms Anon 4 KiB : POPULATE_READ : 0.326 ms Anon 4 KiB : POPULATE_WRITE : 0.273 ms Anon 2 MiB : Read : 0.030 ms Anon 2 MiB : Write : 0.030 ms Anon 2 MiB : Read/Write : 0.030 ms Anon 2 MiB : POPULATE_READ : 0.030 ms Anon 2 MiB : POPULATE_WRITE : 0.030 ms Memfd 4 KiB : Read : 0.412 ms Memfd 4 KiB : Write : 0.372 ms Memfd 4 KiB : Read/Write : 0.419 ms Memfd 4 KiB : POPULATE_READ : 0.343 ms Memfd 4 KiB : POPULATE_WRITE : 0.288 ms Memfd 4 KiB : FALLOCATE : 0.137 ms Memfd 4 KiB : FALLOCATE+Read : 0.446 ms Memfd 4 KiB : FALLOCATE+Write : 0.330 ms Memfd 4 KiB : FALLOCATE+Read/Write : 0.454 ms Memfd 4 KiB : FALLOCATE+POPULATE_READ : 0.379 ms Memfd 4 KiB : FALLOCATE+POPULATE_WRITE : 0.268 ms Memfd 2 MiB : Read : 0.030 ms Memfd 2 MiB : Write : 0.030 ms Memfd 2 MiB : Read/Write : 0.030 ms Memfd 2 MiB : POPULATE_READ : 0.030 ms Memfd 2 MiB : POPULATE_WRITE : 0.030 ms Memfd 2 MiB : FALLOCATE : 0.030 ms Memfd 2 MiB : FALLOCATE+Read : 0.031 ms Memfd 2 MiB : FALLOCATE+Write : 0.031 ms Memfd 2 MiB : FALLOCATE+Read/Write : 0.031 ms Memfd 2 MiB : FALLOCATE+POPULATE_READ : 0.030 ms Memfd 2 MiB : FALLOCATE+POPULATE_WRITE : 0.030 ms tmpfs : Read : 0.416 ms tmpfs : Write : 0.369 ms tmpfs : Read/Write : 0.425 ms tmpfs : POPULATE_READ : 0.346 ms tmpfs : POPULATE_WRITE : 0.295 ms tmpfs : FALLOCATE : 0.139 ms tmpfs : FALLOCATE+Read : 0.447 ms tmpfs : FALLOCATE+Write : 0.333 ms tmpfs : FALLOCATE+Read/Write : 0.454 ms tmpfs : FALLOCATE+POPULATE_READ : 0.380 ms tmpfs : FALLOCATE+POPULATE_WRITE : 0.272 ms file : Read : 0.191 ms file : Write : 0.511 ms file : Read/Write : 0.524 ms file : POPULATE_READ : 0.196 ms file : POPULATE_WRITE : 0.434 ms file : FALLOCATE : 0.004 ms file : FALLOCATE+Read : 0.197 ms file : FALLOCATE+Write : 0.554 ms file : FALLOCATE+Read/Write : 0.480 ms file : FALLOCATE+POPULATE_READ : 0.201 ms file : FALLOCATE+POPULATE_WRITE : 0.381 ms hugetlbfs : Read : 0.030 ms hugetlbfs : Write : 0.030 ms hugetlbfs : Read/Write : 0.030 ms hugetlbfs : POPULATE_READ : 0.030 ms hugetlbfs : POPULATE_WRITE : 0.030 ms hugetlbfs : FALLOCATE : 0.030 ms hugetlbfs : FALLOCATE+Read : 0.031 ms hugetlbfs : FALLOCATE+Write : 0.031 ms hugetlbfs : FALLOCATE+Read/Write : 0.030 ms hugetlbfs : FALLOCATE+POPULATE_READ : 0.030 ms hugetlbfs : FALLOCATE+POPULATE_WRITE : 0.030 ms ************************************************** 4096 MiB MAP_SHARED: ************************************************** Anon 4 KiB : Read : 1053.090 ms Anon 4 KiB : Write : 913.642 ms Anon 4 KiB : Read/Write : 1060.350 ms Anon 4 KiB : POPULATE_READ : 893.691 ms Anon 4 KiB : POPULATE_WRITE : 782.885 ms Anon 2 MiB : Read : 358.553 ms Anon 2 MiB : Write : 358.419 ms Anon 2 MiB : Read/Write : 357.992 ms Anon 2 MiB : POPULATE_READ : 357.533 ms Anon 2 MiB : POPULATE_WRITE : 357.808 ms Memfd 4 KiB : Read : 1078.144 ms Memfd 4 KiB : Write : 942.036 ms Memfd 4 KiB : Read/Write : 1100.391 ms Memfd 4 KiB : POPULATE_READ : 925.829 ms Memfd 4 KiB : POPULATE_WRITE : 804.394 ms Memfd 4 KiB : FALLOCATE : 304.632 ms Memfd 4 KiB : FALLOCATE+Read : 1163.359 ms Memfd 4 KiB : FALLOCATE+Write : 933.186 ms Memfd 4 KiB : FALLOCATE+Read/Write : 1187.304 ms Memfd 4 KiB : FALLOCATE+POPULATE_READ : 1013.660 ms Memfd 4 KiB : FALLOCATE+POPULATE_WRITE : 794.560 ms Memfd 2 MiB : Read : 358.131 ms Memfd 2 MiB : Write : 358.099 ms Memfd 2 MiB : Read/Write : 358.250 ms Memfd 2 MiB : POPULATE_READ : 357.563 ms Memfd 2 MiB : POPULATE_WRITE : 357.334 ms Memfd 2 MiB : FALLOCATE : 356.735 ms Memfd 2 MiB : FALLOCATE+Read : 358.152 ms Memfd 2 MiB : FALLOCATE+Write : 358.331 ms Memfd 2 MiB : FALLOCATE+Read/Write : 358.018 ms Memfd 2 MiB : FALLOCATE+POPULATE_READ : 357.286 ms Memfd 2 MiB : FALLOCATE+POPULATE_WRITE : 357.523 ms tmpfs : Read : 1087.265 ms tmpfs : Write : 950.840 ms tmpfs : Read/Write : 1107.567 ms tmpfs : POPULATE_READ : 922.605 ms tmpfs : POPULATE_WRITE : 810.094 ms tmpfs : FALLOCATE : 306.320 ms tmpfs : FALLOCATE+Read : 1169.796 ms tmpfs : FALLOCATE+Write : 933.730 ms tmpfs : FALLOCATE+Read/Write : 1191.610 ms tmpfs : FALLOCATE+POPULATE_READ : 1020.474 ms tmpfs : FALLOCATE+POPULATE_WRITE : 798.945 ms file : Read : 654.101 ms file : Write : 1259.142 ms file : Read/Write : 1289.509 ms file : POPULATE_READ : 661.642 ms file : POPULATE_WRITE : 1106.816 ms file : FALLOCATE : 1.864 ms file : FALLOCATE+Read : 656.328 ms file : FALLOCATE+Write : 1153.300 ms file : FALLOCATE+Read/Write : 1180.613 ms file : FALLOCATE+POPULATE_READ : 668.347 ms file : FALLOCATE+POPULATE_WRITE : 996.143 ms hugetlbfs : Read : 357.245 ms hugetlbfs : Write : 357.413 ms hugetlbfs : Read/Write : 357.120 ms hugetlbfs : POPULATE_READ : 356.321 ms hugetlbfs : POPULATE_WRITE : 356.693 ms hugetlbfs : FALLOCATE : 355.927 ms hugetlbfs : FALLOCATE+Read : 357.074 ms hugetlbfs : FALLOCATE+Write : 357.120 ms hugetlbfs : FALLOCATE+Read/Write : 356.983 ms hugetlbfs : FALLOCATE+POPULATE_READ : 356.413 ms hugetlbfs : FALLOCATE+POPULATE_WRITE : 356.266 ms ************************************************** [1] https://lkml.org/lkml/2013/6/27/698 [akpm@linux-foundation.org: coding style fixes] Link: https://lkml.kernel.org/r/20210419135443.12822-3-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Rik van Riel <riel@surriel.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Helge Deller <deller@gmx.de> Cc: Chris Zankel <chris@zankel.net> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Peter Xu <peterx@redhat.com> Cc: Rolf Eike Beer <eike-kernel@sf-tec.de> Cc: Ram Pai <linuxram@us.ibm.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-01mm: generalize ZONE_[DMA|DMA32]Kefeng Wang1-7/+0
ZONE_[DMA|DMA32] configs have duplicate definitions on platforms that subscribe to them. Instead, just make them generic options which can be selected on applicable platforms. Also only x86/arm64 architectures could enable both ZONE_DMA and ZONE_DMA32 if EXPERT, add ARCH_HAS_ZONE_DMA_SET to make dma zone configurable and visible on the two architectures. Link: https://lkml.kernel.org/r/20210528074557.17768-1-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Acked-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Palmer Dabbelt <palmerdabbelt@google.com> [RISC-V] Acked-by: Michal Simek <michal.simek@xilinx.com> [microblaze] Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Richard Henderson <rth@twiddle.net> Cc: Russell King <linux@armlinux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-01mm/kconfig: move HOLES_IN_ZONE into mmKefeng Wang1-3/+0
commit a55749639dc1 ("ia64: drop marked broken DISCONTIGMEM and VIRTUAL_MEM_MAP") drop VIRTUAL_MEM_MAP, so there is no need HOLES_IN_ZONE on ia64. Also move HOLES_IN_ZONE into mm/Kconfig, select it if architecture needs this feature. Link: https://lkml.kernel.org/r/20210417075946.181402-1-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Cc: Will Deacon <will@kernel.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>