Age | Commit message (Collapse) | Author | Files | Lines |
|
The following function is the only place that checks USER_LOCK_ATTACHED.
This flag is set when lock request is granted through user_ast() and only
the following function will clear it.
Checking of this flag here is to make sure ocfs2_dlm_unlock is not issued
if this lock is never granted. For example, lock file is created and then
get removed, open file never happens.
Clearing the flag here is not necessary because this is the only function
that checks it, if another flow is executing user_dlm_destroy_lock(), it
will bail out at the beginning because of USER_LOCK_IN_TEARDOWN and never
check USER_LOCK_ATTACHED. Drop the clear, so we don't need take care of
it for the following error handling patch.
int user_dlm_destroy_lock(struct user_lock_res *lockres)
{
...
status = 0;
if (!(lockres->l_flags & USER_LOCK_ATTACHED)) {
spin_unlock(&lockres->l_lock);
goto bail;
}
lockres->l_flags &= ~USER_LOCK_ATTACHED;
lockres->l_flags |= USER_LOCK_BUSY;
spin_unlock(&lockres->l_lock);
status = ocfs2_dlm_unlock(conn, &lockres->l_lksb, DLM_LKF_VALBLK);
if (status) {
user_log_dlm_error("ocfs2_dlm_unlock", status, lockres);
goto bail;
}
...
}
V1 discussion with Joseph:
https://lore.kernel.org/all/7b620c53-0c45-da2c-829e-26195cbe7d4e@linux.alibaba.com/T/
Link: https://lkml.kernel.org/r/20220518235224.87100-1-junxiao.bi@oracle.com
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core
----
- Support TCPv6 segmentation offload with super-segments larger than
64k bytes using the IPv6 Jumbogram extension header (AKA BIG TCP).
- Generalize skb freeing deferral to per-cpu lists, instead of
per-socket lists.
- Add a netdev statistic for packets dropped due to L2 address
mismatch (rx_otherhost_dropped).
- Continue work annotating skb drop reasons.
- Accept alternative netdev names (ALT_IFNAME) in more netlink
requests.
- Add VLAN support for AF_PACKET SOCK_RAW GSO.
- Allow receiving skb mark from the socket as a cmsg.
- Enable memcg accounting for veth queues, sysctl tables and IPv6.
BPF
---
- Add libbpf support for User Statically-Defined Tracing (USDTs).
- Speed up symbol resolution for kprobes multi-link attachments.
- Support storing typed pointers to referenced and unreferenced
objects in BPF maps.
- Add support for BPF link iterator.
- Introduce access to remote CPU map elements in BPF per-cpu map.
- Allow middle-of-the-road settings for the
kernel.unprivileged_bpf_disabled sysctl.
- Implement basic types of dynamic pointers e.g. to allow for
dynamically sized ringbuf reservations without extra memory copies.
Protocols
---------
- Retire port only listening_hash table, add a second bind table
hashed by port and address. Avoid linear list walk when binding to
very popular ports (e.g. 443).
- Add bridge FDB bulk flush filtering support allowing user space to
remove all FDB entries matching a condition.
- Introduce accept_unsolicited_na sysctl for IPv6 to implement
router-side changes for RFC9131.
- Support for MPTCP path manager in user space.
- Add MPTCP support for fallback to regular TCP for connections that
have never connected additional subflows or transmitted
out-of-sequence data (partial support for RFC8684 fallback).
- Avoid races in MPTCP-level window tracking, stabilize and improve
throughput.
- Support lockless operation of GRE tunnels with seq numbers enabled.
- WiFi support for host based BSS color collision detection.
- Add support for SO_TXTIME/SCM_TXTIME on CAN sockets.
- Support transmission w/o flow control in CAN ISOTP (ISO 15765-2).
- Support zero-copy Tx with TLS 1.2 crypto offload (sendfile).
- Allow matching on the number of VLAN tags via tc-flower.
- Add tracepoint for tcp_set_ca_state().
Driver API
----------
- Improve error reporting from classifier and action offload.
- Add support for listing line cards in switches (devlink).
- Add helpers for reporting page pool statistics with ethtool -S.
- Add support for reading clock cycles when using PTP virtual clocks,
instead of having the driver convert to time before reporting. This
makes it possible to report time from different vclocks.
- Support configuring low-latency Tx descriptor push via ethtool.
- Separate Clause 22 and Clause 45 MDIO accesses more explicitly.
New hardware / drivers
----------------------
- Ethernet:
- Marvell's Octeon NIC PCI Endpoint support (octeon_ep)
- Sunplus SP7021 SoC (sp7021_emac)
- Add support for Renesas RZ/V2M (in ravb)
- Add support for MediaTek mt7986 switches (in mtk_eth_soc)
- Ethernet PHYs:
- ADIN1100 industrial PHYs (w/ 10BASE-T1L and SQI reporting)
- TI DP83TD510 PHY
- Microchip LAN8742/LAN88xx PHYs
- WiFi:
- Driver for pureLiFi X, XL, XC devices (plfxlc)
- Driver for Silicon Labs devices (wfx)
- Support for WCN6750 (in ath11k)
- Support Realtek 8852ce devices (in rtw89)
- Mobile:
- MediaTek T700 modems (Intel 5G 5000 M.2 cards)
- CAN:
- ctucanfd: add support for CTU CAN FD open-source IP core from
Czech Technical University in Prague
Drivers
-------
- Delete a number of old drivers still using virt_to_bus().
- Ethernet NICs:
- intel: support TSO on tunnels MPLS
- broadcom: support multi-buffer XDP
- nfp: support VF rate limiting
- sfc: use hardware tx timestamps for more than PTP
- mlx5: multi-port eswitch support
- hyper-v: add support for XDP_REDIRECT
- atlantic: XDP support (including multi-buffer)
- macb: improve real-time perf by deferring Tx processing to NAPI
- High-speed Ethernet switches:
- mlxsw: implement basic line card information querying
- prestera: add support for traffic policing on ingress and egress
- Embedded Ethernet switches:
- lan966x: add support for packet DMA (FDMA)
- lan966x: add support for PTP programmable pins
- ti: cpsw_new: enable bc/mc storm prevention
- Qualcomm 802.11ax WiFi (ath11k):
- Wake-on-WLAN support for QCA6390 and WCN6855
- device recovery (firmware restart) support
- support setting Specific Absorption Rate (SAR) for WCN6855
- read country code from SMBIOS for WCN6855/QCA6390
- enable keep-alive during WoWLAN suspend
- implement remain-on-channel support
- MediaTek WiFi (mt76):
- support Wireless Ethernet Dispatch offloading packet movement
between the Ethernet switch and WiFi interfaces
- non-standard VHT MCS10-11 support
- mt7921 AP mode support
- mt7921 IPv6 NS offload support
- Ethernet PHYs:
- micrel: ksz9031/ksz9131: cabletest support
- lan87xx: SQI support for T1 PHYs
- lan937x: add interrupt support for link detection"
* tag 'net-next-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1809 commits)
ptp: ocp: Add firmware header checks
ptp: ocp: fix PPS source selector debugfs reporting
ptp: ocp: add .init function for sma_op vector
ptp: ocp: vectorize the sma accessor functions
ptp: ocp: constify selectors
ptp: ocp: parameterize input/output sma selectors
ptp: ocp: revise firmware display
ptp: ocp: add Celestica timecard PCI ids
ptp: ocp: Remove #ifdefs around PCI IDs
ptp: ocp: 32-bit fixups for pci start address
Revert "net/smc: fix listen processing for SMC-Rv2"
ath6kl: Use cc-disable-warning to disable -Wdangling-pointer
selftests/bpf: Dynptr tests
bpf: Add dynptr data slices
bpf: Add bpf_dynptr_read and bpf_dynptr_write
bpf: Dynptr support for ring buffers
bpf: Add bpf_dynptr_from_mem for local dynptrs
bpf: Add verifier support for dynptrs
bpf: Suppress 'passing zero to PTR_ERR' warning
bpf: Introduce bpf_arch_text_invalidate for bpf_prog_pack
...
|
|
Pull workqueue update from Tejun Heo:
"A lone commit fixing CPU offline handling for per-cpu wq workers so
that they don't bother isolated CPUs"
* 'for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: Restrict kworker in the offline CPU pool running on housekeeping CPUs
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
"Nothing too interesting. This adds cpu controller selftests and there
are a couple code cleanup patches"
* 'for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: remove the superfluous judgment
cgroup: Make cgroup_debug static
kseltest/cgroup: Make test_stress.sh work if run interactively
kselftest/cgroup: fix test_stress.sh to use OUTPUT dir
cgroup: Add config file to cgroup selftest suite
cgroup: Add test_cpucg_max_nested() testcase
cgroup: Add test_cpucg_max() testcase
cgroup: Add test_cpucg_nested_weight_underprovisioned() testcase
cgroup: Adding test_cpucg_nested_weight_overprovisioned() testcase
cgroup: Add test_cpucg_weight_underprovisioned() testcase
cgroup: Add test_cpucg_weight_overprovisioned() testcase
cgroup: Add test_cpucg_stats() testcase to cgroup cpu selftests
cgroup: Add new test_cpu.c test suite in cgroup selftests
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull KUnit updates from Shuah Khan:
"Several fixes, cleanups, and enhancements to tests and framework:
- introduce _NULL and _NOT_NULL macros to pointer error checks
- rework kunit_resource allocation policy to fix memory leaks when
caller doesn't specify free() function to be used when allocating
memory using kunit_add_resource() and kunit_alloc_resource() funcs.
- add ability to specify suite-level init and exit functions"
* tag 'linux-kselftest-kunit-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (41 commits)
kunit: tool: Use qemu-system-i386 for i386 runs
kunit: fix executor OOM error handling logic on non-UML
kunit: tool: update riscv QEMU config with new serial dependency
kcsan: test: use new suite_{init,exit} support
kunit: tool: Add list of all valid test configs on UML
kunit: take `kunit_assert` as `const`
kunit: tool: misc cleanups
kunit: tool: minor cosmetic cleanups in kunit_parser.py
kunit: tool: make parser stop overwriting status of suites w/ no_tests
kunit: tool: remove dead parse_crash_in_log() logic
kunit: tool: print clearer error message when there's no TAP output
kunit: tool: stop using a shell to run kernel under QEMU
kunit: tool: update test counts summary line format
kunit: bail out of test filtering logic quicker if OOM
lib/Kconfig.debug: change KUnit tests to default to KUNIT_ALL_TESTS
kunit: Rework kunit_resource allocation policy
kunit: fix debugfs code to use enum kunit_status, not bool
kfence: test: use new suite_{init/exit} support, add .kunitconfig
kunit: add ability to specify suite-level init and exit functions
kunit: rename print_subtest_{start,end} for clarity (s/subtest/suite)
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull Kselftest updates from Shuah Khan:
"Several fixes, cleanups, and enhancements to tests:
- add mips support for kprobe args string and syntax tests
- updates to resctrl test to use kselftest framework
- fixes, cleanups, and enhancements to tests"
* tag 'linux-kselftest-next-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kselftests/ir : Improve readability of modprobe error message
selftests/resctrl: Fix null pointer dereference on open failed
selftests/resctrl: Add missing SPDX license to Makefile
selftests/resctrl: Update README about using kselftest framework to build/run resctrl_tests
selftests/resctrl: Make resctrl_tests run using kselftest framework
selftests/resctrl: Fix resctrl_tests' return code to work with selftest framework
selftests/resctrl: Change the default limited time to 120 seconds
selftests/resctrl: Kill child process before parent process terminates if SIGTERM is received
selftests/resctrl: Print a message if the result of MBM&CMT tests is failed on Intel CPU
selftests/resctrl: Extend CPU vendor detection
selftests/x86/corrupt_xstate_header: Use provided __cpuid_count() macro
selftests/x86/amx: Use provided __cpuid_count() macro
selftests/vm/pkeys: Use provided __cpuid_count() macro
selftests: Provide local define of __cpuid_count()
selftests/damon: add damon to selftests root Makefile
selftests/binderfs: Improve message to provide more info
selftests: mqueue: drop duplicate min definition
selftests/ftrace: add mips support for kprobe args syntax tests
selftests/ftrace: add mips support for kprobe args string tests
|
|
Pull documentation updates from Jonathan Corbet:
"It was a moderately busy cycle for documentation; highlights include:
- After a long period of inactivity, the Japanese translations are
seeing some much-needed maintenance and updating.
- Reworked IOMMU documentation
- Some new documentation for static-analysis tools
- A new overall structure for the memory-management documentation.
This is an LSFMM outcome that, it is hoped, will help encourage
developers to fill in the many gaps. Optimism is eternal...but
hopefully it will work.
- More Chinese translations.
Plus the usual typo fixes, updates, etc"
* tag 'docs-5.19' of git://git.lwn.net/linux: (70 commits)
docs: pdfdocs: Add space for chapter counts >= 100 in TOC
docs/zh_CN: Add dev-tools/gdb-kernel-debugging.rst Chinese translation
input: Docs: correct ntrig.rst typo
input: Docs: correct atarikbd.rst typos
MAINTAINERS: Become the docs/zh_CN maintainer
docs/zh_CN: fix devicetree usage-model translation
mm,doc: Add new documentation structure
Documentation: drop more IDE boot options and ide-cd.rst
Documentation/process: use scripts/get_maintainer.pl on patches
MAINTAINERS: Add entry for DOCUMENTATION/JAPANESE
docs/trans/ja_JP/howto: Don't mention specific kernel versions
docs/ja_JP/SubmittingPatches: Request summaries for commit references
docs/ja_JP/SubmittingPatches: Add Suggested-by as a standard signature
docs/ja_JP/SubmittingPatches: Randy has moved
docs/ja_JP/SubmittingPatches: Suggest the use of scripts/get_maintainer.pl
docs/ja_JP/SubmittingPatches: Update GregKH links
Documentation/sysctl: document max_rcu_stall_to_panic
Documentation: add missing angle bracket in cgroup-v2 doc
Documentation: dev-tools: use literal block instead of code-block
docs/zh_CN: add vm numa translation
...
|
|
Use PAGE_ALIGNED macro instead of IS_ALIGNED and passing PAGE_SIZE.
Link: https://lkml.kernel.org/r/20220520021833.121405-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Muchun Song <songmuchun@bytedance.com>
Cc: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The default "timeout" for one kselftest is 45 seconds, while some cases in
run_vmtests.sh require more time. This will cause testing timeout like:
not ok 4 selftests: vm: run_vmtests.sh # TIMEOUT 45 seconds
Therefore, add the "settings" file with timeout variable so users can set
the "timeout" value.
Link: https://lkml.kernel.org/r/20220521083825.319654-4-patrick.wang.shcn@gmail.com
Signed-off-by: Patrick Wang <patrick.wang.shcn@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The "test_hmm.sh" file used by run_vmtests.sh dose not be installed into
INSTALL_PATH. Thus run_vmtests.sh can not call it in INSTALL_PATH:
---------------------------
running ./test_hmm.sh smoke
---------------------------
./run_vmtests.sh: line 74: ./test_hmm.sh: No such file or directory
[FAIL]
-----------------------
Add "test_hmm.sh" to TEST_FILES so that it will be installed.
Link: https://lkml.kernel.org/r/20220521083825.319654-3-patrick.wang.shcn@gmail.com
Signed-off-by: Patrick Wang <patrick.wang.shcn@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
in ksm_tests
Patch series "selftests: vm: a few fixup patches".
This series contains three fixup patches for vm selftests. They are
independent. Please see the patches.
This patch (of 3):
Currently, ksm_tests operates "merge_across_nodes" with NUMA either
enabled or disabled. In a system with NUMA disabled, these operations
will fail and output a misleading report given "merge_across_nodes" does
not exist in sysfs:
----------------------------
running ./ksm_tests -M -p 10
----------------------------
f /sys/kernel/mm/ksm/merge_across_nodes
fopen: No such file or directory
Cannot save default tunables
[FAIL]
----------------------
So check numa_available() before those operations to skip them if NUMA is
disabled.
Link: https://lkml.kernel.org/r/20220521083825.319654-1-patrick.wang.shcn@gmail.com
Link: https://lkml.kernel.org/r/20220521083825.319654-2-patrick.wang.shcn@gmail.com
Signed-off-by: Patrick Wang <patrick.wang.shcn@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Add newly added migration test object to .gitignore file.
Link: https://lkml.kernel.org/r/20220521094313.166505-1-usama.anjum@collabora.com
Fixes: 0c2d08728470 ("mm: add selftests for migration entries")
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Spelling mistake (triple letters) in comment. Detected with the help of
Coccinelle.
Link: https://lkml.kernel.org/r/20220521111145.81697-80-Julia.Lawall@inria.fr
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Spelling mistake (triple letters) in comment. Detected with the help of
Coccinelle.
Link: https://lkml.kernel.org/r/20220521111145.81697-94-Julia.Lawall@inria.fr
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Introduce process_mrelease syscall sanity tests which include tests
which expect to fail:
- process_mrelease with invalid pidfd and flags inputs
- process_mrelease on a live process with no pending signals
and valid process_mrelease usage which is expected to succeed. Because
process_mrelease has to be used against a process with a pending SIGKILL,
it's possible that the process exits before process_mrelease gets called.
In such cases we retry the test with a victim that allocates twice more
memory up to 1GB. This would require the victim process to spend more
time during exit and process_mrelease has a better chance of catching the
process before it exits and succeeding.
On success the test reports the amount of memory the child had to allocate
for reaping to succeed. Sample output:
$ mrelease_test
Success reaping a child with 1MB of memory allocations
On failure the test reports the failure. Sample outputs:
$ mrelease_test
All process_mrelease attempts failed!
$ mrelease_test
process_mrelease: Invalid argument
Link: https://lkml.kernel.org/r/20220518204316.13131-1-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
This reverts commit 3a235693d3930e1276c8d9cc0ca5807ef292cf0a.
Its premise was that cgroup reclaim cares about freeing memory inside the
cgroup, and demotion just moves them around within the cgroup limit.
Hence, pages from toptier nodes should be reclaimed directly.
However, with NUMA balancing now doing tier promotions, demotion is part
of the page aging process. Global reclaim demotes the coldest toptier
pages to secondary memory, where their life continues and from which they
have a chance to get promoted back. Essentially, tiered memory systems
have an LRU order that spans multiple nodes.
When cgroup reclaims pages coming off the toptier directly, there can be
colder pages on lower tier nodes that were demoted by global reclaim.
This is an aging inversion, not unlike if cgroups were to reclaim directly
from the active lists while there are inactive pages.
Proactive reclaim is another factor. The goal of that it is to offload
colder pages from expensive RAM to cheaper storage. When lower tier
memory is available as an intermediate layer, we want offloading to take
advantage of it instead of bypassing to storage.
Revert the patch so that cgroups respect the LRU order spanning the memory
hierarchy.
Of note is a specific undercommit scenario, where all cgroup limits in the
system add up to <= available toptier memory. In that case, shuffling
pages out to lower tiers first to reclaim them from there is inefficient.
This is something could be optimized/short-circuited later on (although
care must be taken not to accidentally recreate the aging inversion).
Let's ensure correctness first.
Link: https://lkml.kernel.org/r/20220518190911.82400-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
By printing information, we can friendly prompt the status change
information of kfence by dmesg and record by syslog.
Also, set kfence_enabled to false only when needed.
Link: https://lkml.kernel.org/r/20220518073105.3160335-1-liu.yun@linux.dev
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
Co-developed-by: Marco Elver <elver@google.com>
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
percpu_alloc_percpu event trace"
Fix sparse warning about incorrect gfp_t cast.
Link: https://lkml.kernel.org/r/001979f3-e978-0998-cbed-61a4a2ac87b8@openvz.org
Fixes: f67bed134a05 ("percpu: improve percpu_alloc_percpu event trace")
Signed-off-by: Vasily Averin <vvs@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
conversion"
Redefines __def_gfpflag_names array according to akpm@, willy@ and Joe
Perches recommendations.
Link: https://lkml.kernel.org/r/6f811e19-41c6-f3e8-fca6-23a19a62e313@openvz.org
Fixes: fe573327ffb1 ("tracing: incorrect gfp_t conversion")
Signed-off-by: Vasily Averin <vvs@openvz.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Joe Perches <joe@perches.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
In isolate_single_pageblock() called by start_isolate_page_range(), there
are some pageblock isolation issues causing a potential infinite loop when
isolating a page range. This is reported by Qian Cai.
1. the pageblock was isolated by just changing pageblock migratetype
without checking unmovable pages. Calling set_migratetype_isolate() to
isolate pageblock properly.
2. an off-by-one error caused migrating pages unnecessarily, since the page
is not crossing pageblock boundary.
3. migrating a compound page across pageblock boundary then splitting the
free page later has a small race window that the free page might be
allocated again, so that the code will try again, causing an potential
infinite loop. Temporarily set the to-be-migrated page's pageblock to
MIGRATE_ISOLATE to prevent that and bail out early if no free page is
found after page migration.
An additional fix to split_free_page() aims to avoid crashing in
__free_one_page(). When the free page is split at the specified
split_pfn_offset, free_page_order should check both the first bit of
free_page_pfn and the last bit of split_pfn_offset and use the smaller
one. For example, if free_page_pfn=0x10000, split_pfn_offset=0xc000,
free_page_order should first be 0x8000 then 0x4000, instead of 0x4000 then
0x8000, which the original algorithm did.
[akpm@linux-foundation.org: suppress min() warning]
Link: https://lkml.kernel.org/r/20220524194756.1698351-1-zi.yan@sent.com
Fixes: b2c9e2fbba3253 ("mm: make alloc_contig_range work at pageblock granularity")
Signed-off-by: Zi Yan <ziy@nvidia.com>
Reported-by: Qian Cai <quic_qiancai@quicinc.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: David Hildenbrand <david@redhat.com>
Cc: Eric Ren <renzhengeek@gmail.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
I have been focusing on mm for the past two years. e.g. developing,
fixing bugs, reviewing related to HugeTLB system. I would like to help
Mike and other people working on HugeTLB by reviewing their work.
When I first introduced the vmemmmap reduction, I forgot to update
MAINTAINERS file. Let's update it as well. And rename "HUGETLB
FILESYSTEM" to "HUGETLB SUBSYSTEM" since some files are not only related
to filesystem but also memory management (the name of FILESYSTEM cannot
cover this area).
Link: https://lkml.kernel.org/r/20220521074103.79468-1-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
ZSMALLOC depends on MMU so ZRAM should also depend on MMU since 'select'
does not follow any dependency chains.
Fixes this Kconfig warning:
WARNING: unmet direct dependencies detected for ZSMALLOC
Depends on [n]: MMU [=n]
Selected by [y]:
- ZRAM [=y] && BLK_DEV [=y] && BLOCK [=y] && SYSFS [=y] && (CRYPTO_LZO [=y] || CRYPTO_ZSTD [=m] || CRYPTO_LZ4 [=m] || CRYPTO_LZ4HC [=n] || CRYPTO_842 [=n])
Link: https://lkml.kernel.org/r/20220522204027.22964-1-rdunlap@infradead.org
Fixes: b3fbd58fcbb10 ("mm: Kconfig: simplify zswap configuration")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Shmem swapoff makes no progress: the index to indices is not incremented.
But "ret" is no longer a return value, so use folio_batch_count() instead.
Link: https://lkml.kernel.org/r/c32bee8a-f0aa-245-f94e-24dd271924fa@google.com
Fixes: da08e9b79323 ("mm/shmem: convert shmem_swapin_page() to shmem_swapin_folio()")
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Tested-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
If the first goto is taken, 'fd' is not opened yet (and is un-initialized).
So a direct return is safer.
Link: https://lkml.kernel.org/r/628312312eb40e0e39463a2c06415fde5295c716.1653229120.git.christophe.jaillet@wanadoo.fr
Fixes: c1a31a2f7a9c ("cgroup: fix racy check in alloc_pagecache_max_30M() helper function")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Zefan Li <lizefan.x@bytedance.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: David Vernet <void@manifault.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk updates from Petr Mladek:
- Offload writing printk() messages on consoles to per-console
kthreads.
It prevents soft-lockups when an extensive amount of messages is
printed. It was observed, for example, during boot of large systems
with a lot of peripherals like disks or network interfaces.
It prevents live-lockups that were observed, for example, when
messages about allocation failures were reported and a CPU handled
consoles instead of reclaiming the memory. It was hard to solve even
with rate limiting because it would need to take into account the
amount of messages and the speed of all consoles.
It is a must to have for real time. Otherwise, any printk() might
break latency guarantees.
The per-console kthreads allow to handle each console on its own
speed. Slow consoles do not longer slow down faster ones. And
printk() does not longer unpredictably slows down various code paths.
There are situations when the kthreads are either not available or
not reliable, for example, early boot, suspend, or panic. In these
situations, printk() uses the legacy mode and tries to handle
consoles immediately.
- Add documentation for the printk index.
* tag 'printk-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
printk, tracing: fix console tracepoint
printk: remove @console_locked
printk: extend console_lock for per-console locking
printk: add kthread console printers
printk: add functions to prefer direct printing
printk: add pr_flush()
printk: move buffer definitions into console_emit_next_record() caller
printk: refactor and rework printing logic
printk: add con_printk() macro for console details
printk: call boot_delay_msec() in printk_delay()
printk: get caller_id/timestamp after migration disable
printk: wake waiters for safe and NMI contexts
printk: wake up all waiters
printk: add missing memory barrier to wake_up_klogd()
printk: cpu sync always disable interrupts
printk: rename cpulock functions
printk/index: Printk index feature documentation
MAINTAINERS: Add printk indexing maintainers on mention of printk_index
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab updates from Vlastimil Babka:
- Conversion of slub_debug stack traces to stackdepot, allowing more
useful debugfs-based inspection for e.g. memory leak debugging.
Allocation and free debugfs info now includes full traces and is
sorted by the unique trace frequency.
The stackdepot conversion was already attempted last year but
reverted by ae14c63a9f20. The memory overhead (while not actually
enabled on boot) has been meanwhile solved by making the large
stackdepot allocation dynamic. The xfstest issues haven't been
reproduced on current kernel locally nor in -next, so the slab cache
layout changes that originally made that bug manifest were probably
not the root cause.
- Refactoring of dma-kmalloc caches creation.
- Trivial cleanups such as removal of unused parameters, fixes and
clarifications of comments.
- Hyeonggon Yoo joins as a reviewer.
* tag 'slab-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
MAINTAINERS: add myself as reviewer for slab
mm/slub: remove unused kmem_cache_order_objects max
mm: slab: fix comment for __assume_kmalloc_alignment
mm: slab: fix comment for ARCH_KMALLOC_MINALIGN
mm/slub: remove unneeded return value of slab_pad_check
mm/slab_common: move dma-kmalloc caches creation into new_kmalloc_cache()
mm/slub: remove meaningless node check in ___slab_alloc()
mm/slub: remove duplicate flag in allocate_slab()
mm/slub: remove unused parameter in setup_object*()
mm/slab.c: fix comments
slab, documentation: add description of debugfs files for SLUB caches
mm/slub: sort debugfs output by frequency of stack traces
mm/slub: distinguish and print stack traces in debugfs files
mm/slub: use stackdepot to save stack trace in objects
mm/slub: move struct track init out of set_track()
lib/stackdepot: allow requesting early initialization dynamically
mm/slub, kunit: Make slub_kunit unaffected by user specified flags
mm/slab: remove some unused functions
|
|
Commit c724c866bb70 ("linux/types.h: remove unnecessary __bitwise__")
was right that there are no users of __bitwise__ in the kernel, but it
turns out there are user space users of it that do expect it.
It is, after all, in the uapi directory, so user space usage is to be
expected.
Instead of reverting the commit completely, let's just clarify the
situation so that it doesn't happen again, and have some in-code
explanations for why that "__bitwise__" still exists.
Reported-by: Jiri Slaby <jirislaby@kernel.org>
Cc: Bjorn Helgaas <helgaas@kernel.org>
Link: https://lore.kernel.org/all/b5c0a68d-8387-4909-beea-f70ab9e6e3d5@kernel.org/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit b2a90f4fcb14 ("media: lirc: remove unused lirc features") removed
feature flags which were never implemented, but they are still used by
the lirc daemon went built from source.
Reinstate these symbols in order not to break the lirc build.
Fixes: b2a90f4fcb14 ("media: lirc: remove unused lirc features")
Link: https://lore.kernel.org/all/a0470450-ecfd-2918-e04a-7b57c1fd7694@kernel.org/
Reported-by: Jiri Slaby <jirislaby@kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The IXP4xx Kconfig we ended up with for mach-ixp4xx creates
as kismet warning:
WARNING: unmet direct dependencies detected for GPIO_IXP4XX
Depends on [n]: GPIOLIB [=y] && HAS_IOMEM [=y] && ARCH_IXP4XX [=y] && OF [=n]
Selected by [y]:
- ARCH_IXP4XX [=y] && <choice>
This is because it is possible to select ARCH_IXP4XX witout
OF while that selects the GPIO driver that now depends on
OF.
Fix this by creating a single ARCH_IXP4XX kconfig that selects
USE_OF.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Imre Kaloz <kaloz@openwrt.org>
Cc: Krzysztof Halasa <khalasa@piap.pl>
Link: https://lore.kernel.org/r/20220522072356.34062-1-linus.walleij@linaro.org'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
When kernel handles the vm-exit caused by external interrupts and NMI,
it always sets kvm_intr_type to tell if it's dealing an IRQ or NMI. For
the PMI scenario, it could be IRQ or NMI.
However, intel_pt PMIs are only generated for HARDWARE perf events, and
HARDWARE events are always configured to generate NMIs. Use
kvm_handling_nmi_from_guest() to precisely identify if the intel_pt PMI
came from the guest; this avoids false positives if an intel_pt PMI/NMI
arrives while the host is handling an unrelated IRQ VM-Exit.
Fixes: db215756ae59 ("KVM: x86: More precisely identify NMI from guest when handling PMI")
Signed-off-by: Yanfei Xu <yanfei.xu@intel.com>
Message-Id: <20220523140821.1345605-1-yanfei.xu@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Fixing side effect of the so-called opportunistic change in the commit.
Fixes: dc8a9febbab0 ("KVM: selftests: x86: Fix test failure on arch lbr capable platforms")
Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <20220518170118.66263-2-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Commit ddd7ed842627 ("x86/kvm: Alloc dummy async #PF token outside of
raw spinlock") leads to the following Smatch static checker warning:
arch/x86/kernel/kvm.c:212 kvm_async_pf_task_wake()
warn: sleeping in atomic context
arch/x86/kernel/kvm.c
202 raw_spin_lock(&b->lock);
203 n = _find_apf_task(b, token);
204 if (!n) {
205 /*
206 * Async #PF not yet handled, add a dummy entry for the token.
207 * Allocating the token must be down outside of the raw lock
208 * as the allocator is preemptible on PREEMPT_RT kernels.
209 */
210 if (!dummy) {
211 raw_spin_unlock(&b->lock);
--> 212 dummy = kzalloc(sizeof(*dummy), GFP_KERNEL);
^^^^^^^^^^
Smatch thinks the caller has preempt disabled. The `smdb.py preempt
kvm_async_pf_task_wake` output call tree is:
sysvec_kvm_asyncpf_interrupt() <- disables preempt
-> __sysvec_kvm_asyncpf_interrupt()
-> kvm_async_pf_task_wake()
The caller is this:
arch/x86/kernel/kvm.c
290 DEFINE_IDTENTRY_SYSVEC(sysvec_kvm_asyncpf_interrupt)
291 {
292 struct pt_regs *old_regs = set_irq_regs(regs);
293 u32 token;
294
295 ack_APIC_irq();
296
297 inc_irq_stat(irq_hv_callback_count);
298
299 if (__this_cpu_read(apf_reason.enabled)) {
300 token = __this_cpu_read(apf_reason.token);
301 kvm_async_pf_task_wake(token);
302 __this_cpu_write(apf_reason.token, 0);
303 wrmsrl(MSR_KVM_ASYNC_PF_ACK, 1);
304 }
305
306 set_irq_regs(old_regs);
307 }
The DEFINE_IDTENTRY_SYSVEC() is a wrapper that calls this function
from the call_on_irqstack_cond(). It's inside the call_on_irqstack_cond()
where preempt is disabled (unless it's already disabled). The
irq_enter/exit_rcu() functions disable/enable preempt.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The timer is disarmed when switching between TSC deadline and other modes;
however, the pending timer is still in-flight, so let's accurately remove
any traces of the previous mode.
Fixes: 4427593258 ("KVM: x86: thoroughly disarm LAPIC timer around TSC deadline switch")
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Drop the raw spinlock in kvm_async_pf_task_wake() before allocating the
the dummy async #PF token, the allocator is preemptible on PREEMPT_RT
kernels and must not be called from truly atomic contexts.
Opportunistically document why it's ok to loop on allocation failure,
i.e. why the function won't get stuck in an infinite loop.
Reported-by: Yajun Deng <yajun.deng@linux.dev>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Whenever x86_decode_emulated_instruction() detects a breakpoint, it
returns the value that kvm_vcpu_check_breakpoint() writes into its
pass-by-reference second argument. Unfortunately this is completely
bogus because the expected outcome of x86_decode_emulated_instruction
is an EMULATION_* value.
Then, if kvm_vcpu_check_breakpoint() does "*r = 0" (corresponding to
a KVM_EXIT_DEBUG userspace exit), it is misunderstood as EMULATION_OK
and x86_emulate_instruction() is called without having decoded the
instruction. This causes various havoc from running with a stale
emulation context.
The fix is to move the call to kvm_vcpu_check_breakpoint() where it was
before commit 4aa2691dcbd3 ("KVM: x86: Factor out x86 instruction
emulation with decoding") introduced x86_decode_emulated_instruction().
The other caller of the function does not need breakpoint checks,
because it is invoked as part of a vmexit and the processor has already
checked those before executing the instruction that #GP'd.
This fixes CVE-2022-1852.
Reported-by: Qiuhao Li <qiuhao@sysec.org>
Reported-by: Gaoning Pan <pgn@zju.edu.cn>
Reported-by: Yongkang Jia <kangel@zju.edu.cn>
Fixes: 4aa2691dcbd3 ("KVM: x86: Factor out x86 instruction emulation with decoding")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220311032801.3467418-2-seanjc@google.com>
[Rewrote commit message according to Qiuhao's report, since a patch
already existed to fix the bug. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
For some sev ioctl interfaces, the length parameter that is passed maybe
less than or equal to SEV_FW_BLOB_MAX_SIZE, but larger than the data
that PSP firmware returns. In this case, kmalloc will allocate memory
that is the size of the input rather than the size of the data.
Since PSP firmware doesn't fully overwrite the allocated buffer, these
sev ioctl interface may return uninitialized kernel slab memory.
Reported-by: Andy Nguyen <theflow@google.com>
Suggested-by: David Rientjes <rientjes@google.com>
Suggested-by: Peter Gonda <pgonda@google.com>
Cc: kvm@vger.kernel.org
Cc: stable@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Fixes: eaf78265a4ab3 ("KVM: SVM: Move SEV code to separate file")
Fixes: 2c07ded06427d ("KVM: SVM: add support for SEV attestation command")
Fixes: 4cfdd47d6d95a ("KVM: SVM: Add KVM_SEV SEND_START command")
Fixes: d3d1af85e2c75 ("KVM: SVM: Add KVM_SEND_UPDATE_DATA command")
Fixes: eba04b20e4861 ("KVM: x86: Account a variety of miscellaneous allocations")
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Reviewed-by: Peter Gonda <pgonda@google.com>
Message-Id: <20220516154310.3685678-1-Ashish.Kalra@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Set the starting uABI size of KVM's guest FPU to 'struct kvm_xsave',
i.e. to KVM's historical uABI size. When saving FPU state for usersapce,
KVM (well, now the FPU) sets the FP+SSE bits in the XSAVE header even if
the host doesn't support XSAVE. Setting the XSAVE header allows the VM
to be migrated to a host that does support XSAVE without the new host
having to handle FPU state that may or may not be compatible with XSAVE.
Setting the uABI size to the host's default size results in out-of-bounds
writes (setting the FP+SSE bits) and data corruption (that is thankfully
caught by KASAN) when running on hosts without XSAVE, e.g. on Core2 CPUs.
WARN if the default size is larger than KVM's historical uABI size; all
features that can push the FPU size beyond the historical size must be
opt-in.
==================================================================
BUG: KASAN: slab-out-of-bounds in fpu_copy_uabi_to_guest_fpstate+0x86/0x130
Read of size 8 at addr ffff888011e33a00 by task qemu-build/681
CPU: 1 PID: 681 Comm: qemu-build Not tainted 5.18.0-rc5-KASAN-amd64 #1
Hardware name: /DG35EC, BIOS ECG3510M.86A.0118.2010.0113.1426 01/13/2010
Call Trace:
<TASK>
dump_stack_lvl+0x34/0x45
print_report.cold+0x45/0x575
kasan_report+0x9b/0xd0
fpu_copy_uabi_to_guest_fpstate+0x86/0x130
kvm_arch_vcpu_ioctl+0x72a/0x1c50 [kvm]
kvm_vcpu_ioctl+0x47f/0x7b0 [kvm]
__x64_sys_ioctl+0x5de/0xc90
do_syscall_64+0x31/0x50
entry_SYSCALL_64_after_hwframe+0x44/0xae
</TASK>
Allocated by task 0:
(stack is not available)
The buggy address belongs to the object at ffff888011e33800
which belongs to the cache kmalloc-512 of size 512
The buggy address is located 0 bytes to the right of
512-byte region [ffff888011e33800, ffff888011e33a00)
The buggy address belongs to the physical page:
page:0000000089cd4adb refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11e30
head:0000000089cd4adb order:2 compound_mapcount:0 compound_pincount:0
flags: 0x4000000000010200(slab|head|zone=1)
raw: 4000000000010200 dead000000000100 dead000000000122 ffff888001041c80
raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff888011e33900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff888011e33980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffff888011e33a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
^
ffff888011e33a80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff888011e33b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
Disabling lock debugging due to kernel taint
Fixes: be50b2065dfa ("kvm: x86: Add support for getting/setting expanded xstate buffer")
Fixes: c60427dd50ba ("x86/fpu: Add uabi_size to guest_fpu")
Reported-by: Zdenek Kaspar <zkaspar82@gmail.com>
Cc: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Tested-by: Zdenek Kaspar <zkaspar82@gmail.com>
Message-Id: <20220504001219.983513-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
KVM: s390: Fix and feature for 5.19
- ultravisor communication device driver
- fix TEID on terminating storage key ops
|
|
KVM/riscv changes for 5.19
- Added Sv57x4 support for G-stage page table
- Added range based local HFENCE functions
- Added remote HFENCE functions based on VCPU requests
- Added ISA extension registers in ONE_REG interface
- Updated KVM RISC-V maintainers entry to cover selftests support
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for 5.19
- Add support for the ARMv8.6 WFxT extension
- Guard pages for the EL2 stacks
- Trap and emulate AArch32 ID registers to hide unsupported features
- Ability to select and save/restore the set of hypercalls exposed
to the guest
- Support for PSCI-initiated suspend in collaboration with userspace
- GICv3 register-based LPI invalidation support
- Move host PMU event merging into the vcpu data structure
- GICv3 ITS save/restore fixes
- The usual set of small-scale cleanups and fixes
[Due to the conflict, KVM_SYSTEM_EVENT_SEV_TERM is relocated
from 4 to 6. - Paolo]
|
|
On Arch LBR capable platforms, LBR_FMT in perf capability msr is 0x3f,
so the last format test will fail. Use a true invalid format(0x30) for
the test if it's running on these platforms. Opportunistically change
the file name to reflect the tests actually carried out.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Message-Id: <20220512084046.105479-1-weijiang.yang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
In commit ec0671d5684a ("KVM: LAPIC: Delay trace_kvm_wait_lapic_expire
tracepoint to after vmexit", 2019-06-04), trace_kvm_wait_lapic_expire
was moved after guest_exit_irqoff() because invoking tracepoints within
kvm_guest_enter/kvm_guest_exit caused a lockdep splat.
These days this is not necessary, because commit 87fa7f3e98a1 ("x86/kvm:
Move context tracking where it belongs", 2020-07-09) restricted
the RCU extended quiescent state to be closer to vmentry/vmexit.
Moving the tracepoint back to __kvm_wait_lapic_expire is more accurate,
because it will be reported even if vcpu_enter_guest causes multiple
vmentries via the IPI/Timer fast paths, and it allows the removal of
advance_expire_delta.
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1650961551-38390-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Pull page cache updates from Matthew Wilcox:
- Appoint myself page cache maintainer
- Fix how scsicam uses the page cache
- Use the memalloc_nofs_save() API to replace AOP_FLAG_NOFS
- Remove the AOP flags entirely
- Remove pagecache_write_begin() and pagecache_write_end()
- Documentation updates
- Convert several address_space operations to use folios:
- is_dirty_writeback
- readpage becomes read_folio
- releasepage becomes release_folio
- freepage becomes free_folio
- Change filler_t to require a struct file pointer be the first
argument like ->read_folio
* tag 'folio-5.19' of git://git.infradead.org/users/willy/pagecache: (107 commits)
nilfs2: Fix some kernel-doc comments
Appoint myself page cache maintainer
fs: Remove aops->freepage
secretmem: Convert to free_folio
nfs: Convert to free_folio
orangefs: Convert to free_folio
fs: Add free_folio address space operation
fs: Convert drop_buffers() to use a folio
fs: Change try_to_free_buffers() to take a folio
jbd2: Convert release_buffer_page() to use a folio
jbd2: Convert jbd2_journal_try_to_free_buffers to take a folio
reiserfs: Convert release_buffer_page() to use a folio
fs: Remove last vestiges of releasepage
ubifs: Convert to release_folio
reiserfs: Convert to release_folio
orangefs: Convert to release_folio
ocfs2: Convert to release_folio
nilfs2: Remove comment about releasepage
nfs: Convert to release_folio
jfs: Convert to release_folio
...
|
|
Pull iomap updates from Darrick Wong:
"There's a couple of corrections sent in by Andreas for some accounting
errors.
The biggest change this time around is that writeback errors longer
clear pageuptodate nor does XFS invalidate the page cache anymore.
This brings XFS (and gfs2/zonefs) behavior in line with every other
Linux filesystem driver, and fixes some UAF bugs that only cropped up
after willy turned on multipage folios for XFS in 5.18-rc1.
Regrettably, it took all the way to the end of the 5.18 cycle to find
the source of these bugs and reach a consensus that XFS' writeback
failure behavior from 20 years ago is no longer necessary.
Summary:
- Fix a couple of accounting errors in the buffered io code.
- Discontinue the practice of marking folios !uptodate and
invalidating them when writeback fails.
This fixes some UAF bugs when multipage folios are enabled, and
brings the behavior of XFS/gfs/zonefs into alignment with the
behavior of all the other Linux filesystems"
* tag 'iomap-5.19-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
iomap: don't invalidate folios after writeback errors
iomap: iomap_write_end cleanup
iomap: iomap_write_failed fix
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm
Pull dlm updates from David Teigland:
"This includes several large patches to improve endian handling and
remove sparse warnings. The code previously used in/out, in-place
endianness conversion functions.
Other code cleanup includes the list iterator changes.
Finally, a long standing bug was found and fixed, caused by missed
decrement on an lock struct ref count"
* tag 'dlm-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: (28 commits)
dlm: use kref_put_lock in __put_lkb
dlm: use kref_put_lock in put_rsb
dlm: remove unnecessary error assign
dlm: fix missing lkb refcount handling
fs: dlm: cast resource pointer to uintptr_t
dlm: replace usage of found with dedicated list iterator variable
dlm: remove usage of list iterator for list_add() after the loop body
dlm: fix pending remove if msg allocation fails
dlm: fix wake_up() calls for pending remove
dlm: check required context while close
dlm: cleanup lock handling in dlm_master_lookup
dlm: remove found label in dlm_master_lookup
dlm: remove __user conversion warnings
dlm: move conversion to compile time
dlm: use __le types for dlm messages
dlm: use __le types for rcom messages
dlm: use __le types for dlm header
dlm: use __le types for options header
dlm: add __CHECKER__ for false positives
dlm: move global to static inits
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o:
"Various bug fixes and cleanups for ext4.
In particular, move the crypto related fucntions from fs/ext4/super.c
into a new fs/ext4/crypto.c, and fix a number of bugs found by fuzzers
and error injection tools"
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (25 commits)
ext4: only allow test_dummy_encryption when supported
ext4: fix bug_on in __es_tree_search
ext4: avoid cycles in directory h-tree
ext4: verify dir block before splitting it
ext4: filter out EXT4_FC_REPLAY from on-disk superblock field s_state
ext4: fix bug_on in ext4_writepages
ext4: refactor and move ext4_ioctl_get_encryption_pwsalt()
ext4: cleanup function defs from ext4.h into crypto.c
ext4: move ext4 crypto code to its own file crypto.c
ext4: fix memory leak in parse_apply_sb_mount_options()
ext4: reject the 'commit' option on ext2 filesystems
ext4: remove duplicated #include of dax.h in inode.c
ext4: fix race condition between ext4_write and ext4_convert_inline_data
ext4: convert symlink external data block mapping to bdev
ext4: add nowait mode for ext4_getblk()
ext4: fix journal_ioprio mount option handling
ext4: mark group as trimmed only if it was fully scanned
ext4: fix use-after-free in ext4_rename_dir_prepare
ext4: add unmount filesystem message
ext4: remove unnecessary conditionals
...
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-next
drm/i915 fixes for v5.19 merge window:
- Build, sparse, UB, and CFI fixes
- Variable scope fix
- Audio pipe logging fix
- ICL+ DSI NULL dereference fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87sfozuj44.fsf@intel.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
Pull gfs2 updates from Andreas Gruenbacher:
- Clean up the allocation of glocks that have an address space attached
- Quota locking fix and quota iomap conversion
- Fix the FITRIM error reporting
- Some list iterator cleanups
* tag 'gfs2-v5.18-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
gfs2: Convert function bh_get to use iomap
gfs2: use i_lock spin_lock for inode qadata
gfs2: Return more useful errors from gfs2_rgrp_send_discards()
gfs2: Use container_of() for gfs2_glock(aspace)
gfs2: Explain some direct I/O oddities
gfs2: replace 'found' with dedicated list iterator variable
|