summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-10-08drm/xe/guc: Dead CT helperJohn Harrison5-29/+335
Add a worker function helper for asynchronously dumping state when an internal/fatal error is detected in CT processing. Being asynchronous is required to avoid deadlocks and scheduling-while-atomic or process-stalled-for-too-long issues. Also check for a bunch more error conditions and improve the handling of some existing checks. v2: Use compile time CONFIG check for new (but not directly CT_DEAD related) checks and use unsigned int for a bitmask, rename CT_DEAD_RESET to CT_DEAD_REARM and add some explaining comments, rename 'hxg' macro parameter to 'ctb' - review feedback from Michal W. Drop CT_DEAD_ALIVE as no need for a bitfield define to just set the entire mask to zero. v3: Fix kerneldoc v4: Nullify some floating pointers after free. v5: Add section headings and device info to make the state dump look more like a devcoredump to allow parsing by the same tools (eventual aim is to just call the devcoredump code itself, but that currently requires an xe_sched_job, which is not available in the CT code). v6: Fix potential for leaking snapshots with concurrent error conditions (review feedback from Julia F). v7: Don't complain about unexpected G2H messages yet because there is a known issue causing them. Fix bit shift bug with v6 change. Add GT id to fake coredump headers and use puts instead of printf. v8: Disable the head mis-match check in g2h_read because it is failing on various discrete platforms due to unknown reasons. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241003004611.2323493-9-John.C.Harrison@Intel.com
2024-10-08drm/print: Introduce drm_line_printerMichal Wajdeczko2-0/+78
This drm printer wrapper can be used to increase the robustness of the captured output generated by any other drm_printer to make sure we didn't lost any intermediate lines of the output by adding line numbers to each output line. Helpful for capturing some crash data. v2: Extended short int counters to full int (JohnH) Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: dri-devel@lists.freedesktop.org Reviewed-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241003004611.2323493-8-John.C.Harrison@Intel.com
2024-10-08drm/xe/guc: Use a two stage dump for GuC logs and add more infoJohn Harrison4-15/+195
Split the GuC log dump into a two stage snapshot and print mechanism. This allows the log to be captured at the point of an error (which may be in a restricted context) and then dump it out later (from a regular context such as a worker function or a sysfs file handler). Also add a bunch of other useful pieces of information that can help (or are fundamentally required!) to decode and parse the log. v2: Add kerneldoc and fix a couple of comment typos - review feedback from Michal W. v3: Move chunking code to this patch as it makes the deltas simpler. Fix a bunch of kerneldoc issues. v4: Move the CS frequency out of the coredump snapshot function into the debugfs only code (as that info is already part of the main devcoredump). Add a header to the debugfs log to match the one in the devcoredump to aid processing by a unified tool. Add forcewake to the GuC timestamp read so it actually works. v6: Add colon to GuC version string (review feedback by Julia F). Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241003004611.2323493-7-John.C.Harrison@Intel.com
2024-10-08drm/xe/guc: Copy GuC log prior to dumpingJohn Harrison1-17/+23
Add an extra stage to the GuC log print to copy the log buffer into regular host memory first, rather than printing the live GPU buffer object directly. Doing so helps prevent inconsistencies due to the log being updated as it is being dumped. It also allows the use of the ASCII85 helper function for printing the log in a more compact form than a straight hex dump. v2: Use %zx instead of %lx for size_t prints. v3: Replace hexdump code with ascii85 call (review feedback from Matthew B). Move chunking code into next patch as that reduces the deltas of both. v4: Add a prefix to the ASCII85 output to aid tool parsing. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241003004611.2323493-6-John.C.Harrison@Intel.com
2024-10-08drm/xe/devcoredump: Add ASCII85 dump helper functionJohn Harrison2-0/+93
There is a need to include the GuC log and other large binary objects in core dumps and via dmesg. So add a helper for dumping to a printer function via conversion to ASCII85 encoding. Another issue with dumping such a large buffer is that it can be slow, especially if dumping to dmesg over a serial port. So add a yield to prevent the 'task has been stuck for 120s' kernel hang check feature from firing. v2: Add a prefix to the output string. Fix memory allocation bug. v3: Correct a string size calculation and clean up a define (review feedback from Julia F). Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241003004611.2323493-5-John.C.Harrison@Intel.com
2024-10-08drm/xe/devcoredump: Improve section headings and add tile infoJohn Harrison5-3/+9
The xe_guc_exec_queue_snapshot is not really a GuC internal thing and is definitely not a GuC CT thing. So give it its own section heading. The snapshot itself is really a capture of the submission backend's internal state. Although all it currently prints out is the submission contexts. So label it as 'Contexts'. If more general state is added later then it could be change to 'Submission backend' or some such. Further, everything from the GuC CT section onwards is GT specific but there was no indication of which GT it was related to (and that is impossible to work out from the other fields that are given). So add a GT section heading. Also include the tile id of the GT, because again significant information. Lastly, drop a couple of unnecessary line feeds within sections. v2: Add GT section heading, add tile id to device section. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241003004611.2323493-4-John.C.Harrison@Intel.com
2024-10-08drm/xe/devcoredump: Use drm_puts and already cached local variablesJohn Harrison1-20/+20
There are a bunch of calls to drm_printf with static strings. Switch them to drm_puts instead. There are also a bunch of 'coredump->snapshot.XXX' references when 'coredump->snapshot' has alread been cached locally as 'ss'. So use 'ss->XXX' instead. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241003004611.2323493-3-John.C.Harrison@Intel.com
2024-10-08drm/xe/guc: Remove spurious line feed in debug printJohn Harrison1-1/+1
Including line feeds at the start of a debug print messes up the output when sent to dmesg. The break appears between all the useful prefix information and the actual string being printed. In this case, each block of data has a very clear start line and an extra delimeter is really not necessary. So don't do it. v2: Fix typo in commit message (review feedback from Michal W.) Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241003004611.2323493-2-John.C.Harrison@Intel.com
2024-10-07drm/xe/pf: Allow to save and restore VF config blob from debugfsMichal Wajdeczko1-0/+78
For feature enabling and testing purposes, allow to capture and replace full VF configuration using debugfs blob file, but only under strict debug config. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240919171528.1451-6-michal.wajdeczko@intel.com
2024-10-07drm/xe/pf: Add functions to save and restore VF configuration blobMichal Wajdeczko2-0/+174
We already have support to save and restore GuC VF state, but that will only work when the target VF configuration (provisioning) will be exactly the same as the source VF configuration. To help with assuring that both configurations match, allow to encode whole VF configuration that can be saved as blob and restored later. In the future we may want to use such captured configuration blobs as templates to make sure we provision VFs with exactly the same configuration that was previously tested or recommended, or when debugfs knobs are not be available. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240919171528.1451-5-michal.wajdeczko@intel.com
2024-10-07drm/xe/pf: Allow to encode subset of VF configuration KLVsMichal Wajdeczko1-13/+19
We want to reuse format of the GuC KLVs while saving and restoring VF configuration by the PF driver, but some of those KLVs (like doorbell begin index or GGTT starting offset) are not strictly needed to correctly restore VF configuration. Modify functions to omit encoding of some of the KLVs with GuC only details. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240919171528.1451-4-michal.wajdeczko@intel.com
2024-10-07drm/xe/pf: Update success code of pf_validate_vf_config()Michal Wajdeczko1-1/+1
This function may return negative error codes on invalid or incomplete VF configuration, but unlike other int functions, it was returning 1 instead of 0 on success, which might be little inconvinient if we would like to use it directly in other functions. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240919171528.1451-3-michal.wajdeczko@intel.com
2024-10-07drm/xe/guc: Add yet another helper macro for thresholdMichal Wajdeczko1-0/+7
We already have and use MAKE_GUC_KLV_VF_CFG_THRESHOLD_KEY, but in upcoming patch we want to generate the key for the length. Add MAKE_GUC_KLV_VF_CFG_THRESHOLD_LEN for this purpose. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240919171528.1451-2-michal.wajdeczko@intel.com
2024-10-06drm/xe: Add memirq report page address helpersMichal Wajdeczko1-14/+21
Both xe_memirq_{source,status}_ptr functions are now strictly for obtaining an address of the memory based interrupt report pages used by the HW engines. When initializing the GuC that does not require special per instance page preparation, we don't need to abuse these public functions and pass a NULL instead of valid hwe pointer. Also, without further fixes, this actually may lead to NPD crash once the hw_reports_to_instance_zero() will be true. Add internal helpers that will provide report page addresses based solely on the instance number, which will be always 0 for both GuCs. Fixes: ef6103d20f97 ("drm/xe: memirq infra changes for MSI-X") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Piotr Piórkowski <piotr.piorkowski@intel.com> Cc: Ilia Levi <ilia.levi@intel.com> Reviewed-by: Ilia Levi <ilia.levi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240925084445.1495-1-michal.wajdeczko@intel.com
2024-10-04drm/xe: Make wedged_mode debugfs writableMatt Roper1-1/+1
The intent of this debugfs entry is to allow modification of wedging behavior, either from IGT tests or during manual debug; it should be marked as writable to properly reflect this. In practice this hasn't caused a problem because we always access wedged_mode as root, which ignores file permissions, but it's still misleading to have the entry incorrectly marked as RO. Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Fixes: 6b8ef44cc0a9 ("drm/xe: Introduce the wedged_mode debugfs") Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241002230620.1249258-2-matthew.d.roper@intel.com
2024-10-04Merge drm/drm-next into drm-xe-nextThomas Hellström10325-169349/+413369
Backmerging to resolve a conflict with core locally. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-10-03drm/xe: Restore GT freq on GSC load errorVinay Belgaumkar1-1/+3
As part of a Wa_22019338487, ensure that GT freq is restored even when GSC reload is not successful. Fixes: 3b1592fb7835 ("drm/xe/lnl: Apply Wa_22019338487") Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240925204918.1989574-1-vinay.belgaumkar@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-10-03drm/xe: Use fault injection infrastructure to find issues at probe timeFrancois Dugast13-0/+49
The kernel fault injection infrastructure is used to test proper error handling during probe. The return code of the functions using ALLOW_ERROR_INJECTION() can be conditionnally modified at runtime by tuning some debugfs entries. This requires CONFIG_FUNCTION_ERROR_INJECTION (among others). One way to use fault injection at probe time by making each of those functions fail one at a time is: FAILTYPE=fail_function DEVICE="0000:00:08.0" # depends on the system ERRNO=-12 # -ENOMEM, can depend on the function echo N > /sys/kernel/debug/$FAILTYPE/task-filter echo 100 > /sys/kernel/debug/$FAILTYPE/probability echo 0 > /sys/kernel/debug/$FAILTYPE/interval echo -1 > /sys/kernel/debug/$FAILTYPE/times echo 0 > /sys/kernel/debug/$FAILTYPE/space echo 1 > /sys/kernel/debug/$FAILTYPE/verbose modprobe xe echo $DEVICE > /sys/bus/pci/drivers/xe/unbind grep -oP "^.* \[xe\]" /sys/kernel/debug/$FAILTYPE/injectable | \ cut -d ' ' -f 1 | while read -r FUNCTION ; do echo "Injecting fault in $FUNCTION" echo "" > /sys/kernel/debug/$FAILTYPE/inject echo $FUNCTION > /sys/kernel/debug/$FAILTYPE/inject printf %#x $ERRNO > /sys/kernel/debug/$FAILTYPE/$FUNCTION/retval echo $DEVICE > /sys/bus/pci/drivers/xe/bind done rmmod xe It will also be integrated into IGT for systematic execution by CI. v2: Wrappers are not needed in the cases covered by this patch, so remove them and use ALLOW_ERROR_INJECTION() directly. v3: Document the use of fault injection at probe time in xe_pci_probe and refer to it where ALLOW_ERROR_INJECTION() is used. Signed-off-by: Francois Dugast <francois.dugast@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240927151207.399354-1-francois.dugast@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-10-03drm/xe/ct: drop irq usage of xa_erase()Matthew Auld1-2/+2
Unclear why disabling interrupts is needed here. Nothing seems to be touching fence_lookup and its corresponding lock from an irq so there should be no risk of deadlock. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241001084346.98516-8-matthew.auld@intel.com
2024-10-03drm/xe/guc_submit: fix xa_store() error checkingMatthew Auld1-6/+3
Looks like we are meant to use xa_err() to extract the error encoded in the ptr. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Badal Nilawar <badal.nilawar@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241001084346.98516-7-matthew.auld@intel.com
2024-10-03drm/xe/ct: fix xa_store() error checkingMatthew Auld1-15/+8
Looks like we are meant to use xa_err() to extract the error encoded in the ptr. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Badal Nilawar <badal.nilawar@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241001084346.98516-6-matthew.auld@intel.com
2024-10-03drm/xe/ct: prevent UAF in send_recv()Matthew Auld1-3/+18
Ensure we serialize with completion side to prevent UAF with fence going out of scope on the stack, since we have no clue if it will fire after the timeout before we can erase from the xa. Also we have some dependent loads and stores for which we need the correct ordering, and we lack the needed barriers. Fix this by grabbing the ct->lock after the wait, which is also held by the completion side. v2 (Badal): - Also print done after acquiring the lock and seeing timeout. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Badal Nilawar <badal.nilawar@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241001084346.98516-5-matthew.auld@intel.com
2024-10-02drm/xe: Fix memory leak when aborting bindsMatthew Brost1-1/+1
Make sure to call xe_pt_update_ops_fini in xe_pt_update_ops_abort to free any memory the bind allocated. Caught by kmemleak when running Vulkan CTS tests on LNL. The leak seems to happen only when there's some kind of failure happening, like the lack of memory. Example output: unreferenced object 0xffff9120bdf62000 (size 8192): comm "deqp-vk", pid 115008, jiffies 4310295728 hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 1b 05 f9 28 01 00 00 40 ...........(...@ 00 00 00 00 00 00 00 00 1b 15 f9 28 01 00 00 40 ...........(...@ backtrace (crc 7a56be79): [<ffffffff86dd81f0>] __kmalloc_cache_noprof+0x310/0x3d0 [<ffffffffc08e8211>] xe_pt_new_shared.constprop.0+0x81/0xb0 [xe] [<ffffffffc08e8309>] xe_pt_insert_entry+0xb9/0x140 [xe] [<ffffffffc08eab6d>] xe_pt_stage_bind_entry+0x12d/0x5b0 [xe] [<ffffffffc08ecbca>] xe_pt_walk_range+0xea/0x280 [xe] [<ffffffffc08eccea>] xe_pt_walk_range+0x20a/0x280 [xe] [<ffffffffc08eccea>] xe_pt_walk_range+0x20a/0x280 [xe] [<ffffffffc08eccea>] xe_pt_walk_range+0x20a/0x280 [xe] [<ffffffffc08eccea>] xe_pt_walk_range+0x20a/0x280 [xe] [<ffffffffc08e9eff>] xe_pt_stage_bind.constprop.0+0x25f/0x580 [xe] [<ffffffffc08eb21a>] bind_op_prepare+0xea/0x6e0 [xe] [<ffffffffc08ebab8>] xe_pt_update_ops_prepare+0x1c8/0x440 [xe] [<ffffffffc08ffbf3>] ops_execute+0x143/0x850 [xe] [<ffffffffc0900b64>] vm_bind_ioctl_ops_execute+0x244/0x800 [xe] [<ffffffffc0906467>] xe_vm_bind_ioctl+0x1877/0x2370 [xe] [<ffffffffc05e92b3>] drm_ioctl_kernel+0xb3/0x110 [drm] unreferenced object 0xffff9120bdf72000 (size 8192): comm "deqp-vk", pid 115008, jiffies 4310295728 hex dump (first 32 bytes): 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk backtrace (crc 23b2f0b5): [<ffffffff86dd81f0>] __kmalloc_cache_noprof+0x310/0x3d0 [<ffffffffc08e8211>] xe_pt_new_shared.constprop.0+0x81/0xb0 [xe] [<ffffffffc08e8453>] xe_pt_stage_unbind_post_descend+0xb3/0x150 [xe] [<ffffffffc08ecd26>] xe_pt_walk_range+0x246/0x280 [xe] [<ffffffffc08eccea>] xe_pt_walk_range+0x20a/0x280 [xe] [<ffffffffc08eccea>] xe_pt_walk_range+0x20a/0x280 [xe] [<ffffffffc08eccea>] xe_pt_walk_range+0x20a/0x280 [xe] [<ffffffffc08ece31>] xe_pt_walk_shared+0xc1/0x110 [xe] [<ffffffffc08e7b2a>] xe_pt_stage_unbind+0x9a/0xd0 [xe] [<ffffffffc08e913d>] unbind_op_prepare+0xdd/0x270 [xe] [<ffffffffc08eb9f6>] xe_pt_update_ops_prepare+0x106/0x440 [xe] [<ffffffffc08ffbf3>] ops_execute+0x143/0x850 [xe] [<ffffffffc0900b64>] vm_bind_ioctl_ops_execute+0x244/0x800 [xe] [<ffffffffc0906467>] xe_vm_bind_ioctl+0x1877/0x2370 [xe] [<ffffffffc05e92b3>] drm_ioctl_kernel+0xb3/0x110 [drm] [<ffffffffc05e95a0>] drm_ioctl+0x280/0x4e0 [drm] Reported-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2877 Fixes: a708f6501c69 ("drm/xe: Update PT layer with better error handling") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240927232228.3255246-1-matthew.brost@intel.com
2024-10-02drm/xe: Prevent null pointer access in xe_migrate_copyZhanjun Dong1-2/+2
xe_migrate_copy designed to copy content of TTM resources. When source resource is null, it will trigger a NULL pointer dereference in xe_migrate_copy. To avoid this situation, update lacks source flag to true for this case, the flag will trigger xe_migrate_clear rather than xe_migrate_copy. Issue trace: <7> [317.089847] xe 0000:00:02.0: [drm:xe_migrate_copy [xe]] Pass 14, sizes: 4194304 & 4194304 <7> [317.089945] xe 0000:00:02.0: [drm:xe_migrate_copy [xe]] Pass 15, sizes: 4194304 & 4194304 <1> [317.128055] BUG: kernel NULL pointer dereference, address: 0000000000000010 <1> [317.128064] #PF: supervisor read access in kernel mode <1> [317.128066] #PF: error_code(0x0000) - not-present page <6> [317.128069] PGD 0 P4D 0 <4> [317.128071] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI <4> [317.128074] CPU: 1 UID: 0 PID: 1440 Comm: kunit_try_catch Tainted: G U N 6.11.0-rc7-xe #1 <4> [317.128078] Tainted: [U]=USER, [N]=TEST <4> [317.128080] Hardware name: Intel Corporation Lunar Lake Client Platform/LNL-M LP5 RVP1, BIOS LNLMFWI1.R00.3221.D80.2407291239 07/29/2024 <4> [317.128082] RIP: 0010:xe_migrate_copy+0x66/0x13e0 [xe] <4> [317.128158] Code: 00 00 48 89 8d e0 fe ff ff 48 8b 40 10 4c 89 85 c8 fe ff ff 44 88 8d bd fe ff ff 65 48 8b 3c 25 28 00 00 00 48 89 7d d0 31 ff <8b> 79 10 48 89 85 a0 fe ff ff 48 8b 00 48 89 b5 d8 fe ff ff 83 ff <4> [317.128162] RSP: 0018:ffffc9000167f9f0 EFLAGS: 00010246 <4> [317.128164] RAX: ffff8881120d8028 RBX: ffff88814d070428 RCX: 0000000000000000 <4> [317.128166] RDX: ffff88813cb99c00 RSI: 0000000004000000 RDI: 0000000000000000 <4> [317.128168] RBP: ffffc9000167fbb8 R08: ffff88814e7b1f08 R09: 0000000000000001 <4> [317.128170] R10: 0000000000000001 R11: 0000000000000001 R12: ffff88814e7b1f08 <4> [317.128172] R13: ffff88814e7b1f08 R14: ffff88813cb99c00 R15: 0000000000000001 <4> [317.128174] FS: 0000000000000000(0000) GS:ffff88846f280000(0000) knlGS:0000000000000000 <4> [317.128176] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4> [317.128178] CR2: 0000000000000010 CR3: 000000011f676004 CR4: 0000000000770ef0 <4> [317.128180] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 <4> [317.128182] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400 <4> [317.128184] PKRU: 55555554 <4> [317.128185] Call Trace: <4> [317.128187] <TASK> <4> [317.128189] ? show_regs+0x67/0x70 <4> [317.128194] ? __die_body+0x20/0x70 <4> [317.128196] ? __die+0x2b/0x40 <4> [317.128198] ? page_fault_oops+0x15f/0x4e0 <4> [317.128203] ? do_user_addr_fault+0x3fb/0x970 <4> [317.128205] ? lock_acquire+0xc7/0x2e0 <4> [317.128209] ? exc_page_fault+0x87/0x2b0 <4> [317.128212] ? asm_exc_page_fault+0x27/0x30 <4> [317.128216] ? xe_migrate_copy+0x66/0x13e0 [xe] <4> [317.128263] ? __lock_acquire+0xb9d/0x26f0 <4> [317.128265] ? __lock_acquire+0xb9d/0x26f0 <4> [317.128267] ? sg_free_append_table+0x20/0x80 <4> [317.128271] ? lock_acquire+0xc7/0x2e0 <4> [317.128273] ? mark_held_locks+0x4d/0x80 <4> [317.128275] ? trace_hardirqs_on+0x1e/0xd0 <4> [317.128278] ? _raw_spin_unlock_irqrestore+0x31/0x60 <4> [317.128281] ? __pm_runtime_resume+0x60/0xa0 <4> [317.128284] xe_bo_move+0x682/0xc50 [xe] <4> [317.128315] ? lock_is_held_type+0xaa/0x120 <4> [317.128318] ttm_bo_handle_move_mem+0xe5/0x1a0 [ttm] <4> [317.128324] ttm_bo_validate+0xd1/0x1a0 [ttm] <4> [317.128328] shrink_test_run_device+0x721/0xc10 [xe] <4> [317.128360] ? find_held_lock+0x31/0x90 <4> [317.128363] ? lock_release+0xd1/0x2a0 <4> [317.128365] ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10 [kunit] <4> [317.128370] xe_bo_shrink_kunit+0x11/0x20 [xe] <4> [317.128397] kunit_try_run_case+0x6e/0x150 [kunit] <4> [317.128400] ? trace_hardirqs_on+0x1e/0xd0 <4> [317.128402] ? _raw_spin_unlock_irqrestore+0x31/0x60 <4> [317.128404] kunit_generic_run_threadfn_adapter+0x1e/0x40 [kunit] <4> [317.128407] kthread+0xf5/0x130 <4> [317.128410] ? __pfx_kthread+0x10/0x10 <4> [317.128412] ret_from_fork+0x39/0x60 <4> [317.128415] ? __pfx_kthread+0x10/0x10 <4> [317.128416] ret_from_fork_asm+0x1a/0x30 <4> [317.128420] </TASK> Fixes: 266c85885263 ("drm/xe/xe2: Handle flat ccs move for igfx.") Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240927161308.862323-2-zhanjun.dong@intel.com
2024-10-01drm/xe/compat: remove unused i915_gpu_error.hJani Nikula1-17/+0
The last user of the compat header was removed in commit d6b933912df0 ("drm/i915/dmc: convert intel_dmc_print_error_state() to drm_printer"). Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240930164052.3862911-1-jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-09-30Linux 6.12-rc1Linus Torvalds1-2/+2
2024-09-30x86: kvm: fix build errorLinus Torvalds1-0/+2
The cpu_emergency_register_virt_callback() function is used unconditionally by the x86 kvm code, but it is declared (and defined) conditionally: #if IS_ENABLED(CONFIG_KVM_INTEL) || IS_ENABLED(CONFIG_KVM_AMD) void cpu_emergency_register_virt_callback(cpu_emergency_virt_cb *callback); ... leading to a build error when neither KVM_INTEL nor KVM_AMD support is enabled: arch/x86/kvm/x86.c: In function ‘kvm_arch_enable_virtualization’: arch/x86/kvm/x86.c:12517:9: error: implicit declaration of function ‘cpu_emergency_register_virt_callback’ [-Wimplicit-function-declaration] 12517 | cpu_emergency_register_virt_callback(kvm_x86_ops.emergency_disable_virtualization_cpu); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ arch/x86/kvm/x86.c: In function ‘kvm_arch_disable_virtualization’: arch/x86/kvm/x86.c:12522:9: error: implicit declaration of function ‘cpu_emergency_unregister_virt_callback’ [-Wimplicit-function-declaration] 12522 | cpu_emergency_unregister_virt_callback(kvm_x86_ops.emergency_disable_virtualization_cpu); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fix the build by defining empty helper functions the same way the old cpu_emergency_disable_virtualization() function was dealt with for the same situation. Maybe we could instead have made the call sites conditional, since the callers (kvm_arch_{en,dis}able_virtualization()) have an empty weak fallback. I'll leave that to the kvm people to argue about, this at least gets the build going for that particular config. Fixes: 590b09b1d88e ("KVM: x86: Register "emergency disable" callbacks when virt is enabled") Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Kai Huang <kai.huang@intel.com> Cc: Chao Gao <chao.gao@intel.com> Cc: Farrah Chen <farrah.chen@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-09-29Merge tag 'mailbox-v6.12' of ↵Linus Torvalds10-51/+32
git://git.kernel.org/pub/scm/linux/kernel/git/jassibrar/mailbox Pull mailbox updates from Jassi Brar: - fix kconfig dependencies (mhu-v3, omap2+) - use devie name instead of genereic imx_mu_chan as interrupt name (imx) - enable sa8255p and qcs8300 ipc controllers (qcom) - Fix timeout during suspend mode (bcm2835) - convert to use use of_property_match_string (mailbox) - enable mt8188 (mediatek) - use devm_clk_get_enabled helpers (spreadtrum) - fix device-id typo (rockchip) * tag 'mailbox-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jassibrar/mailbox: mailbox, remoteproc: omap2+: fix compile testing dt-bindings: mailbox: qcom-ipcc: Document QCS8300 IPCC dt-bindings: mailbox: qcom-ipcc: document the support for SA8255p dt-bindings: mailbox: mtk,adsp-mbox: Add compatible for MT8188 mailbox: Use of_property_match_string() instead of open-coding mailbox: bcm2835: Fix timeout during suspend mode mailbox: sprd: Use devm_clk_get_enabled() helpers mailbox: rockchip: fix a typo in module autoloading mailbox: imx: use device name in interrupt name mailbox: ARM_MHU_V3 should depend on ARM64
2024-09-29Merge tag 'i2c-for-6.12-rc1-additional_fixes' of ↵Linus Torvalds6-3/+58
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: - fix DesignWare driver ENABLE-ABORT sequence, ensuring ABORT can always be sent when needed - check for PCLK in the SynQuacer controller as an optional clock, allowing ACPI to directly provide the clock rate - KEBA driver Kconfig dependency fix - fix XIIC driver power suspend sequence * tag 'i2c-for-6.12-rc1-additional_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: xiic: Fix pm_runtime_set_suspended() with runtime pm enabled i2c: keba: I2C_KEBA should depend on KEBA_CP500 i2c: synquacer: Deal with optional PCLK correctly i2c: designware: fix controller is holding SCL low while ENABLE bit is disabled
2024-09-29Merge tag 'dma-mapping-6.12-2024-09-29' of ↵Linus Torvalds1-18/+19
git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping fix from Christoph Hellwig: - handle chained SGLs in the new tracing code (Christoph Hellwig) * tag 'dma-mapping-6.12-2024-09-29' of git://git.infradead.org/users/hch/dma-mapping: dma-mapping: fix DMA API tracing for chained scatterlists
2024-09-29Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds34-154/+415
Pull more SCSI updates from James Bottomley: "These are mostly minor updates. There are two drivers (lpfc and mpi3mr) which missed the initial pull and a core change to retry a start/stop unit which affect suspend/resume" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (32 commits) scsi: lpfc: Update lpfc version to 14.4.0.5 scsi: lpfc: Support loopback tests with VMID enabled scsi: lpfc: Revise TRACE_EVENT log flag severities from KERN_ERR to KERN_WARNING scsi: lpfc: Ensure DA_ID handling completion before deleting an NPIV instance scsi: lpfc: Fix kref imbalance on fabric ndlps from dev_loss_tmo handler scsi: lpfc: Restrict support for 32 byte CDBs to specific HBAs scsi: lpfc: Update phba link state conditional before sending CMF_SYNC_WQE scsi: lpfc: Add ELS_RSP cmd to the list of WQEs to flush in lpfc_els_flush_cmd() scsi: mpi3mr: Update driver version to 8.12.0.0.50 scsi: mpi3mr: Improve wait logic while controller transitions to READY state scsi: mpi3mr: Update MPI Headers to revision 34 scsi: mpi3mr: Use firmware-provided timestamp update interval scsi: mpi3mr: Enhance the Enable Controller retry logic scsi: sd: Fix off-by-one error in sd_read_block_characteristics() scsi: pm8001: Do not overwrite PCI queue mapping scsi: scsi_debug: Remove a useless memset() scsi: pmcraid: Convert comma to semicolon scsi: sd: Retry START STOP UNIT commands scsi: mpi3mr: A performance fix scsi: ufs: qcom: Update MODE_MAX cfg_bw value ...
2024-09-29Merge tag 'bcachefs-2024-09-28' of git://evilpiepirate.org/bcachefsLinus Torvalds39-309/+469
Pull more bcachefs updates from Kent Overstreet: "Assorted minor syzbot fixes, and for bigger stuff: Fix two disk accounting rewrite bugs: - Disk accounting keys use the version field of bkey so that journal replay can tell which updates have been applied to the btree. This is set in the transaction commit path, after we've gotten our journal reservation (and our time ordering), but the BCH_TRANS_COMMIT_skip_accounting_apply flag that journal replay uses was incorrectly skipping this for new updates generated prior to journal replay. This fixes the underlying cause of an assertion pop in disk_accounting_read. - A couple of fixes for disk accounting + device removal. Checking if acocunting replicas entries were marked in the superblock was being done at the wrong point, when deltas in the journal could still zero them out, and then additionally we'd try to add a missing replicas entry to the superblock without checking if it referred to an invalid (removed) device. A whole slew of repair fixes: - fix infinite loop in propagate_key_to_snapshot_leaves(), this fixes an infinite loop when repairing a filesystem with many snapshots - fix incorrect transaction restart handling leading to occasional "fsck counted ..." warnings - fix warning in __bch2_fsck_err() for bkey fsck errors - check_inode() in fsck now correctly checks if the filesystem was clean - there shouldn't be pending logged ops if the fs was clean, we now check for this - remove_backpointer() doesn't remove a dirent that doesn't actually point to the inode - many more fsck errors are AUTOFIX" * tag 'bcachefs-2024-09-28' of git://evilpiepirate.org/bcachefs: (35 commits) bcachefs: check_subvol_path() now prints subvol root inode bcachefs: remove_backpointer() now checks if dirent points to inode bcachefs: dirent_points_to_inode() now warns on mismatch bcachefs: Fix lost wake up bcachefs: Check for logged ops when clean bcachefs: BCH_FS_clean_recovery bcachefs: Convert disk accounting BUG_ON() to WARN_ON() bcachefs: Fix BCH_TRANS_COMMIT_skip_accounting_apply bcachefs: Check for accounting keys with bversion=0 bcachefs: rename version -> bversion bcachefs: Don't delete unlinked inodes before logged op resume bcachefs: Fix BCH_SB_ERRS() so we can reorder bcachefs: Fix fsck warnings from bkey validation bcachefs: Move transaction commit path validation to as late as possible bcachefs: Fix disk accounting attempting to mark invalid replicas entry bcachefs: Fix unlocked access to c->disk_sb.sb in bch2_replicas_entry_validate() bcachefs: Fix accounting read + device removal bcachefs: bch_accounting_mode bcachefs: fix transaction restart handling in check_extents(), check_dirents() bcachefs: kill inode_walker_entry.seen_this_pos ...
2024-09-29Merge tag 'x86-urgent-2024-09-29' of ↵Linus Torvalds2-0/+11
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Fix TDX MMIO #VE fault handling, and add two new Intel model numbers for 'Pantherlake' and 'Diamond Rapids'" * tag 'x86-urgent-2024-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu: Add two Intel CPU model numbers x86/tdx: Fix "in-kernel MMIO" check
2024-09-29Merge tag 'locking-urgent-2024-09-29' of ↵Linus Torvalds10-44/+240
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "lockdep: - Fix potential deadlock between lockdep and RCU (Zhiguo Niu) - Use str_plural() to address Coccinelle warning (Thorsten Blum) - Add debuggability enhancement (Luis Claudio R. Goncalves) static keys & calls: - Fix static_key_slow_dec() yet again (Peter Zijlstra) - Handle module init failure correctly in static_call_del_module() (Thomas Gleixner) - Replace pointless WARN_ON() in static_call_module_notify() (Thomas Gleixner) <linux/cleanup.h>: - Add usage and style documentation (Dan Williams) rwsems: - Move is_rwsem_reader_owned() and rwsem_owner() under CONFIG_DEBUG_RWSEMS (Waiman Long) atomic ops, x86: - Redeclare x86_32 arch_atomic64_{add,sub}() as void (Uros Bizjak) - Introduce the read64_nonatomic macro to x86_32 with cx8 (Uros Bizjak)" Signed-off-by: Ingo Molnar <mingo@kernel.org> * tag 'locking-urgent-2024-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/rwsem: Move is_rwsem_reader_owned() and rwsem_owner() under CONFIG_DEBUG_RWSEMS jump_label: Fix static_key_slow_dec() yet again static_call: Replace pointless WARN_ON() in static_call_module_notify() static_call: Handle module init failure correctly in static_call_del_module() locking/lockdep: Simplify character output in seq_line() lockdep: fix deadlock issue between lockdep and rcu lockdep: Use str_plural() to fix Coccinelle warning cleanup: Add usage and style documentation lockdep: suggest the fix for "lockdep bfs error:-1" on print_bfs_bug locking/atomic/x86: Redeclare x86_32 arch_atomic64_{add,sub}() as void locking/atomic/x86: Introduce the read64_nonatomic macro to x86_32 with cx8
2024-09-29Merge tag 'cocci-for-6.12' of ↵Linus Torvalds1-22/+237
git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux Pull coccinelle updates from Julia Lawall: "Extend string_choices.cocci to use more available helpers Ten patches from Hongbo Li extending string_choices.cocci with the complete set of functions offered by include/linux/string_choices.h. One patch from myself reducing the number of redundant cases that are checked by Coccinelle, giving a small performance improvement" * tag 'cocci-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux: Reduce Coccinelle choices in string_choices.cocci coccinelle: Remove unnecessary parentheses for only one possible change. coccinelle: Add rules to find str_yes_no() replacements coccinelle: Add rules to find str_on_off() replacements coccinelle: Add rules to find str_write_read() replacements coccinelle: Add rules to find str_read_write() replacements coccinelle: Add rules to find str_enable{d}_disable{d}() replacements coccinelle: Add rules to find str_lo{w}_hi{gh}() replacements coccinelle: Add rules to find str_hi{gh}_lo{w}() replacements coccinelle: Add rules to find str_false_true() replacements coccinelle: Add rules to find str_true_false() replacements
2024-09-29Merge tag 'linux_kselftest-next-6.12-rc1-fixes' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kselftest fix from Shuah Khan: "One urgent fix to vDSO as automated testing is failing due to this bug" * tag 'linux_kselftest-next-6.12-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests: vDSO: align stack for O2-optimized memcpy
2024-09-29Merge branch 'locking/core' into locking/urgent, to pick up pending commitsIngo Molnar8-36/+201
Merge all pending locking commits into a single branch. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2024-09-28Reduce Coccinelle choices in string_choices.cocciJulia Lawall1-50/+41
The isomorphism neg_if_exp negates the test of a ?: conditional, making it unnecessary to have an explicit case for a negated test with the branches inverted. At the same time, we can disable neg_if_exp in cases where a different API function may be more suitable for a negated test. Finally, in the non-patch cases, E matches an expression with parentheses around it, so there is no need to mention () explicitly in the pattern. The () are still needed in the patch cases, because we want to drop them, if they are present. Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
2024-09-28coccinelle: Remove unnecessary parentheses for only one possible change.Hongbo Li1-8/+0
The parentheses are only needed if there is a disjunction, ie a set of possible changes. If there is only one pattern, we can remove these parentheses. Just like the format: - x + y not: ( - x + y ) Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
2024-09-28coccinelle: Add rules to find str_yes_no() replacementsHongbo Li1-0/+19
As other rules done, we add rules for str_yes_no() to check the relative opportunities. Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
2024-09-28coccinelle: Add rules to find str_on_off() replacementsHongbo Li1-0/+19
As other rules done, we add rules for str_on_off() to check the relative opportunities. Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
2024-09-28coccinelle: Add rules to find str_write_read() replacementsHongbo Li1-0/+19
As other rules done, we add rules for str_write_read() to check the relative opportunities. Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
2024-09-28coccinelle: Add rules to find str_read_write() replacementsHongbo Li1-0/+19
As other rules done, we add rules for str_read_write() to check the relative opportunities. Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
2024-09-28coccinelle: Add rules to find str_enable{d}_disable{d}() replacementsHongbo Li1-0/+38
As other rules done, we add rules for str_enable{d}_ disable{d}() to check the relative opportunities. Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
2024-09-28coccinelle: Add rules to find str_lo{w}_hi{gh}() replacementsHongbo Li1-0/+38
As other rules done, we add rules for str_lo{w}_hi{gh}() to check the relative opportunities. Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
2024-09-28coccinelle: Add rules to find str_hi{gh}_lo{w}() replacementsHongbo Li1-0/+42
As other rules done, we add rules for str_hi{gh}_lo{w}() to check the relative opportunities. Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
2024-09-28coccinelle: Add rules to find str_false_true() replacementsHongbo Li1-0/+19
As done with str_true_false(), add checks for str_false_true() opportunities. A simple test can find over 9 cases currently exist in the tree. Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
2024-09-28coccinelle: Add rules to find str_true_false() replacementsHongbo Li1-0/+19
After str_true_false() has been introduced in the tree, we can add rules for finding places where str_true_false() can be used. A simple test can find over 10 locations. Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
2024-09-28Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds82-1448/+2799
Pull x86 kvm updates from Paolo Bonzini: "x86: - KVM currently invalidates the entirety of the page tables, not just those for the memslot being touched, when a memslot is moved or deleted. This does not traditionally have particularly noticeable overhead, but Intel's TDX will require the guest to re-accept private pages if they are dropped from the secure EPT, which is a non starter. Actually, the only reason why this is not already being done is a bug which was never fully investigated and caused VM instability with assigned GeForce GPUs, so allow userspace to opt into the new behavior. - Advertise AVX10.1 to userspace (effectively prep work for the "real" AVX10 functionality that is on the horizon) - Rework common MSR handling code to suppress errors on userspace accesses to unsupported-but-advertised MSRs This will allow removing (almost?) all of KVM's exemptions for userspace access to MSRs that shouldn't exist based on the vCPU model (the actual cleanup is non-trivial future work) - Rework KVM's handling of x2APIC ICR, again, because AMD (x2AVIC) splits the 64-bit value into the legacy ICR and ICR2 storage, whereas Intel (APICv) stores the entire 64-bit value at the ICR offset - Fix a bug where KVM would fail to exit to userspace if one was triggered by a fastpath exit handler - Add fastpath handling of HLT VM-Exit to expedite re-entering the guest when there's already a pending wake event at the time of the exit - Fix a WARN caused by RSM entering a nested guest from SMM with invalid guest state, by forcing the vCPU out of guest mode prior to signalling SHUTDOWN (the SHUTDOWN hits the VM altogether, not the nested guest) - Overhaul the "unprotect and retry" logic to more precisely identify cases where retrying is actually helpful, and to harden all retry paths against putting the guest into an infinite retry loop - Add support for yielding, e.g. to honor NEED_RESCHED, when zapping rmaps in the shadow MMU - Refactor pieces of the shadow MMU related to aging SPTEs in prepartion for adding multi generation LRU support in KVM - Don't stuff the RSB after VM-Exit when RETPOLINE=y and AutoIBRS is enabled, i.e. when the CPU has already flushed the RSB - Trace the per-CPU host save area as a VMCB pointer to improve readability and cleanup the retrieval of the SEV-ES host save area - Remove unnecessary accounting of temporary nested VMCB related allocations - Set FINAL/PAGE in the page fault error code for EPT violations if and only if the GVA is valid. If the GVA is NOT valid, there is no guest-side page table walk and so stuffing paging related metadata is nonsensical - Fix a bug where KVM would incorrectly synthesize a nested VM-Exit instead of emulating posted interrupt delivery to L2 - Add a lockdep assertion to detect unsafe accesses of vmcs12 structures - Harden eVMCS loading against an impossible NULL pointer deref (really truly should be impossible) - Minor SGX fix and a cleanup - Misc cleanups Generic: - Register KVM's cpuhp and syscore callbacks when enabling virtualization in hardware, as the sole purpose of said callbacks is to disable and re-enable virtualization as needed - Enable virtualization when KVM is loaded, not right before the first VM is created Together with the previous change, this simplifies a lot the logic of the callbacks, because their very existence implies virtualization is enabled - Fix a bug that results in KVM prematurely exiting to userspace for coalesced MMIO/PIO in many cases, clean up the related code, and add a testcase - Fix a bug in kvm_clear_guest() where it would trigger a buffer overflow _if_ the gpa+len crosses a page boundary, which thankfully is guaranteed to not happen in the current code base. Add WARNs in more helpers that read/write guest memory to detect similar bugs Selftests: - Fix a goof that caused some Hyper-V tests to be skipped when run on bare metal, i.e. NOT in a VM - Add a regression test for KVM's handling of SHUTDOWN for an SEV-ES guest - Explicitly include one-off assets in .gitignore. Past Sean was completely wrong about not being able to detect missing .gitignore entries - Verify userspace single-stepping works when KVM happens to handle a VM-Exit in its fastpath - Misc cleanups" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (127 commits) Documentation: KVM: fix warning in "make htmldocs" s390: Enable KVM_S390_UCONTROL config in debug_defconfig selftests: kvm: s390: Add VM run test case KVM: SVM: let alternatives handle the cases when RSB filling is required KVM: VMX: Set PFERR_GUEST_{FINAL,PAGE}_MASK if and only if the GVA is valid KVM: x86/mmu: Use KVM_PAGES_PER_HPAGE() instead of an open coded equivalent KVM: x86/mmu: Add KVM_RMAP_MANY to replace open coded '1' and '1ul' literals KVM: x86/mmu: Fold mmu_spte_age() into kvm_rmap_age_gfn_range() KVM: x86/mmu: Morph kvm_handle_gfn_range() into an aging specific helper KVM: x86/mmu: Honor NEED_RESCHED when zapping rmaps and blocking is allowed KVM: x86/mmu: Add a helper to walk and zap rmaps for a memslot KVM: x86/mmu: Plumb a @can_yield parameter into __walk_slot_rmaps() KVM: x86/mmu: Move walk_slot_rmaps() up near for_each_slot_rmap_range() KVM: x86/mmu: WARN on MMIO cache hit when emulating write-protected gfn KVM: x86/mmu: Detect if unprotect will do anything based on invalid_list KVM: x86/mmu: Subsume kvm_mmu_unprotect_page() into the and_retry() version KVM: x86: Rename reexecute_instruction()=>kvm_unprotect_and_retry_on_failure() KVM: x86: Update retry protection fields when forcing retry on emulation failure KVM: x86: Apply retry protection to "unprotect on failure" path KVM: x86: Check EMULTYPE_WRITE_PF_TO_SP before unprotecting gfn ...
2024-09-28Merge tag 's390-6.12-2' of ↵Linus Torvalds5-51/+86
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull more s390 updates from Vasily Gorbik: - Clean up and improve vdso code: use SYM_* macros for function and data annotations, add CFI annotations to fix GDB unwinding, optimize the chacha20 implementation - Add vfio-ap driver feature advertisement for use by libvirt and mdevctl * tag 's390-6.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/vfio-ap: Driver feature advertisement s390/vdso: Use one large alternative instead of an alternative branch s390/vdso: Use SYM_DATA_START_LOCAL()/SYM_DATA_END() for data objects tools: Add additional SYM_*() stubs to linkage.h s390/vdso: Use macros for annotation of asm functions s390/vdso: Add CFI annotations to __arch_chacha20_blocks_nostack() s390/vdso: Fix comment within __arch_chacha20_blocks_nostack() s390/vdso: Get rid of permutation constants