kernel/linux.git/drivers/cxl/core, branch v7.0.11

cxl/mbox: Use proper endpoint validity check upon sanitize

2026-03-18T15:49:29+00:00

Fuzzying CXL triggered: BUG: KASAN: null-ptr-deref in cxl_num_decoders_committed+0x3e/0x80 drivers/cxl/core/port.c:49 Read of size 4 at addr 0000000000000642 by task syz.0.97/2282 CPU: 2 UID: 0 PID: 2282 Comm: syz.0.97 Not tainted 7.0.0-rc1-gebd11be59f74-dirty #494 PREEMPT(full) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120 kasan_report+0xe0/0x110 mm/kasan/report.c:595 cxl_num_decoders_committed+0x3e/0x80 drivers/cxl/core/port.c:49 cxl_mem_sanitize+0x141/0x170 drivers/cxl/core/mbox.c:1304 security_sanitize_store+0xb0/0x120 drivers/cxl/core/memdev.c:173 dev_attr_store+0x46/0x70 drivers/base/core.c:2437 sysfs_kf_write+0x95/0xb0 fs/sysfs/file.c:142 kernfs_fop_write_iter+0x276/0x330 fs/kernfs/file.c:352 new_sync_write fs/read_write.c:595 [inline] vfs_write+0x5df/0xaa0 fs/read_write.c:688 ksys_write+0x103/0x1f0 fs/read_write.c:740 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0x111/0x680 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f60a584ba79 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f60a42a7038 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00007f60a5ab5fa0 RCX: 00007f60a584ba79 RDX: 0000000000000002 RSI: 00002000000001c0 RDI: 0000000000000003 RBP: 00007f60a58a49df R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00007f60a5ab6038 R14: 00007f60a5ab5fa0 R15: 00007ffe58fad8b8 This goes away using the correct check instead of abusing cxlmd->endpoint, which is unusable (ENXIO) until the driver has probed. During that window the memdev sysfs attributes are already visible, as soon as device_add() completes. Fixes: 29317f8dc6ed ("cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation") Signed-off-by: Davidlohr Bueso Reviewed-by: Jonathan Cameron Reviewed-by: Gregory Price Link: https://patch.msgid.link/20260301221739.1726722-1-dave@stgolabs.net Signed-off-by: Dave Jiang

cxl/hdm: Avoid incorrect DVSEC fallback when HDM decoders are enabled

2026-03-16T23:58:32+00:00

Check the global CXL_HDM_DECODER_ENABLE bit instead of looping over per-decoder COMMITTED bits to determine whether to fall back to DVSEC range emulation. When the HDM decoder capability is globally enabled, ignore DVSEC range registers regardless of individual decoder commit state. should_emulate_decoders() currently loops over per-decoder COMMITTED bits, which leads to an incorrect DVSEC fallback when those bits are zero. One way to trigger this is to destroy a region and bounce the memdev: cxl disable-region region0 cxl destroy-region region0 cxl disable-memdev mem0 cxl enable-memdev mem0 Region teardown zeroes the HDM decoder registers including the committed bits. The subsequent memdev re-probe finds uncommitted decoders and falls back to DVSEC emulation, even though HDM remains globally enabled. Observed failures: should_emulate_decoders: cxl_port endpoint6: decoder6.0: committed: 0 base: 0x0_00000000 size: 0x0_00000000 devm_cxl_setup_hdm: cxl_port endpoint6: Fallback map 1 range register .. devm_cxl_add_region: cxl_acpi ACPI0017:00: decoder0.0: created region0 __construct_region: cxl_pci 0000:e1:00.0: mem1:decoder6.0: __construct_region region0 res: [mem 0x850000000-0x284fffffff flags 0x200] iw: 1 ig: 4096 cxl region0: pci0000:e0:port1 cxl_port_setup_targets expected iw: 1 ig: 4096 .. cxl region0: pci0000:e0:port1 cxl_port_setup_targets got iw: 1 ig: 256 state: disabled .. cxl_port endpoint6: failed to attach decoder6.0 to region0: -6 .. devm_cxl_add_region: cxl_acpi ACPI0017:00: decoder0.0: created region4 alloc_hpa: cxl region4: HPA allocation error (-34) .. Fixes: 52cc48ad2a76 ("cxl/hdm: Limit emulation to the number of range registers") Signed-off-by: Smita Koralahalli Reviewed-by: Dan Williams Link: https://patch.msgid.link/20260316201950.224567-1-Smita.KoralahalliChannabasappa@amd.com Signed-off-by: Dave Jiang

cxl/region: Fix leakage in __construct_region()

2026-03-04T17:26:39+00:00

Failing the first sysfs_update_group() needs to explicitly kfree the resource as it is too early for cxl_region_iomem_release() to do so. Signed-off-by: Davidlohr Bueso Reviewed-by: Ira Weiny Reviewed-by: Gregory Price Fixes: d6602e25819d (cxl/region: Add support to indicate region has extended linear cache) Link: https://patch.msgid.link/20260202191330.245608-1-dave@stgolabs.net Signed-off-by: Dave Jiang

cxl/port: Fix use after free of parent_port in cxl_detach_ep()

2026-03-03T17:20:19+00:00

cxl_detach_ep() is called during bottom-up removal when all CXL memory devices beneath a switch port have been removed. For each port in the hierarchy it locks both the port and its parent, removes the endpoint, and if the port is now empty, marks it dead and unregisters the port by calling delete_switch_port(). There are two places during this work where the parent_port may be used after freeing: First, a concurrent detach may have already processed a port by the time a second worker finds it via bus_find_device(). Without pinning parent_port, it may already be freed when we discover port->dead and attempt to unlock the parent_port. In a production kernel that's a silent memory corruption, with lock debug, it looks like this: []DEBUG_LOCKS_WARN_ON(__owner_task(owner) != get_current()) []WARNING: kernel/locking/mutex.c:949 at __mutex_unlock_slowpath+0x1ee/0x310 []Call Trace: []mutex_unlock+0xd/0x20 []cxl_detach_ep+0x180/0x400 [cxl_core] []devm_action_release+0x10/0x20 []devres_release_all+0xa8/0xe0 []device_unbind_cleanup+0xd/0xa0 []really_probe+0x1a6/0x3e0 Second, delete_switch_port() releases three devm actions registered against parent_port. The last of those is unregister_port() and it calls device_unregister() on the child port, which can cascade. If parent_port is now also empty the device core may unregister and free it too. So by the time delete_switch_port() returns, parent_port may be free, and the subsequent device_unlock(&parent_port->dev) operates on freed memory. The kernel log looks same as above, with a different offset in cxl_detach_ep(). Both of these issues stem from the absence of a lifetime guarantee between a child port and its parent port. Establish a lifetime rule for ports: child ports hold a reference to their parent device until release. Take the reference when the port is allocated and drop it when released. This ensures the parent is valid for the full lifetime of the child and eliminates the use after free window in cxl_detach_ep(). This is easily reproduced with a reload of cxl_acpi in QEMU with CXL devices present. Fixes: 2345df54249c ("cxl/memdev: Fix endpoint port removal") Reviewed-by: Dave Jiang Reviewed-by: Li Ming Signed-off-by: Alison Schofield Reviewed-by: Jonathan Cameron Link: https://patch.msgid.link/20260226184439.1732841-1-alison.schofield@intel.com Signed-off-by: Dave Jiang

cxl/region: Test CXL_DECODER_F_NORMALIZED_ADDRESSING as a bitmask

2026-02-24T15:33:30+00:00

The CXL decoder flags are defined as bitmasks, not bit indices. Using test_bit() to check them interprets the mask value as a bit index, which is the wrong test. For CXL_DECODER_F_NORMALIZED_ADDRESSING the test reads beyond the flags word, making the flag sometimes appear set and blocking creation of CXL region debugfs attributes that support poison operations. Replace test_bit() with a bitmask check. Found with cxl-test. Fixes: 208f432406b7 ("cxl: Disable HPA/SPA translation handlers for Normalized Addressing") Signed-off-by: Alison Schofield Reviewed-by: Gregory Price Tested-by: Gregory Price Link: https://patch.msgid.link/63fe4a6203e40e404347f1cdc7a1c55cb4959b86.1771873256.git.alison.schofield@intel.com Signed-off-by: Dave Jiang

cxl: Test CXL_DECODER_F_LOCK as a bitmask

2026-02-24T15:33:30+00:00

The CXL decoder flags are defined as bitmasks, not bit indices. Using test_bit() to check them interprets the mask value as a bit index, which is the wrong test. For CXL_DECODER_F_LOCK the test reads beyond the defined bits, causing the test to always return false and allowing resets that should have been blocked. Replace test_bit() with a bitmask check. Fixes: 2230c4bdc412 ("cxl: Add handling of locked CXL decoder") Signed-off-by: Alison Schofield Reviewed-by: Gregory Price Tested-by: Gregory Price Link: https://patch.msgid.link/98851c4770e4631753cf9f75b58a3a6daeca2ea2.1771873256.git.alison.schofield@intel.com Signed-off-by: Dave Jiang

cxl/mbox: validate payload size before accessing contents in cxl_payload_from_user_allowed()

2026-02-24T15:33:29+00:00

cxl_payload_from_user_allowed() casts and dereferences the input payload without first verifying its size. When a raw mailbox command is sent with an undersized payload (ie: 1 byte for CXL_MBOX_OP_CLEAR_LOG, which expects a 16-byte UUID), uuid_equal() reads past the allocated buffer, triggering a KASAN splat: BUG: KASAN: slab-out-of-bounds in memcmp+0x176/0x1d0 lib/string.c:683 Read of size 8 at addr ffff88810130f5c0 by task syz.1.62/2258 CPU: 2 UID: 0 PID: 2258 Comm: syz.1.62 Not tainted 6.19.0-dirty #3 PREEMPT(voluntary) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0xab/0xe0 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0xce/0x650 mm/kasan/report.c:482 kasan_report+0xce/0x100 mm/kasan/report.c:595 memcmp+0x176/0x1d0 lib/string.c:683 uuid_equal include/linux/uuid.h:73 [inline] cxl_payload_from_user_allowed drivers/cxl/core/mbox.c:345 [inline] cxl_mbox_cmd_ctor drivers/cxl/core/mbox.c:368 [inline] cxl_validate_cmd_from_user drivers/cxl/core/mbox.c:522 [inline] cxl_send_cmd+0x9c0/0xb50 drivers/cxl/core/mbox.c:643 __cxl_memdev_ioctl drivers/cxl/core/memdev.c:698 [inline] cxl_memdev_ioctl+0x14f/0x190 drivers/cxl/core/memdev.c:713 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:597 [inline] __se_sys_ioctl fs/ioctl.c:583 [inline] __x64_sys_ioctl+0x18e/0x210 fs/ioctl.c:583 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xa8/0x330 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fdaf331ba79 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007fdaf1d77038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007fdaf3585fa0 RCX: 00007fdaf331ba79 RDX: 00002000000001c0 RSI: 00000000c030ce02 RDI: 0000000000000003 RBP: 00007fdaf33749df R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00007fdaf3586038 R14: 00007fdaf3585fa0 R15: 00007ffced2af768 Add 'in_size' parameter to cxl_payload_from_user_allowed() and validate the payload is large enough. Fixes: 6179045ccc0c ("cxl/mbox: Block immediate mode in SET_PARTITION_INFO command") Fixes: 206f9fa9d555 ("cxl/mbox: Add Clear Log mailbox command") Signed-off-by: Davidlohr Bueso Reviewed-by: Alison Schofield Reviewed-by: Dave Jiang Link: https://patch.msgid.link/20260220001618.963490-2-dave@stgolabs.net Signed-off-by: Dave Jiang

cxl: Fix race of nvdimm_bus object when creating nvdimm objects

2026-02-24T15:33:21+00:00

Found issue during running of cxl-translate.sh unit test. Adding a 3s sleep right before the test seems to make the issue reproduce fairly consistently. The cxl_translate module has dependency on cxl_acpi and causes orphaned nvdimm objects to reprobe after cxl_acpi is removed. The nvdimm_bus object is registered by the cxl_nvb object when cxl_acpi_probe() is called. With the nvdimm_bus object missing, __nd_device_register() will trigger NULL pointer dereference when accessing the dev->parent that points to &nvdimm_bus->dev. [ 192.884510] BUG: kernel NULL pointer dereference, address: 000000000000006c [ 192.895383] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS edk2-20250812-19.fc42 08/12/2025 [ 192.897721] Workqueue: cxl_port cxl_bus_rescan_queue [cxl_core] [ 192.899459] RIP: 0010:kobject_get+0xc/0x90 [ 192.924871] Call Trace: [ 192.925959] [ 192.926976] ? pm_runtime_init+0xb9/0xe0 [ 192.929712] __nd_device_register.part.0+0x4d/0xc0 [libnvdimm] [ 192.933314] __nvdimm_create+0x206/0x290 [libnvdimm] [ 192.936662] cxl_nvdimm_probe+0x119/0x1d0 [cxl_pmem] [ 192.940245] cxl_bus_probe+0x1a/0x60 [cxl_core] [ 192.943349] really_probe+0xde/0x380 This patch also relies on the previous change where devm_cxl_add_nvdimm_bridge() is called from drivers/cxl/pmem.c instead of drivers/cxl/core.c to ensure the dependency of cxl_acpi on cxl_pmem. 1. Set probe_type of cxl_nvb to PROBE_FORCE_SYNCHRONOUS to ensure the driver is probed synchronously when add_device() is called. 2. Add a check in __devm_cxl_add_nvdimm_bridge() to ensure that the cxl_nvb driver is attached during cxl_acpi_probe(). 3. Take the cxl_root uport_dev lock and the cxl_nvb->dev lock in devm_cxl_add_nvdimm() before checking nvdimm_bus is valid. 4. Set cxl_nvdimm flag to CXL_NVD_F_INVALIDATED so cxl_nvdimm_probe() will exit with -EBUSY. The removal of cxl_nvdimm devices should prevent any orphaned devices from probing once the nvdimm_bus is gone. [ dj: Fixed 0-day reported kdoc issue. ] [ dj: Fix cxl_nvb reference leak on error. Gregory (kreview-0811365) ] Suggested-by: Dan Williams Fixes: 8fdcb1704f61 ("cxl/pmem: Add initial infrastructure for pmem support") Tested-by: Alison Schofield Reviewed-by: Alison Schofield Link: https://patch.msgid.link/20260205001633.1813643-3-dave.jiang@intel.com Signed-off-by: Dave Jiang

cxl: Move devm_cxl_add_nvdimm_bridge() to cxl_pmem.ko

2026-02-23T18:29:05+00:00

Moving the symbol devm_cxl_add_nvdimm_bridge() to drivers/cxl/cxl_pmem.c, so that cxl_pmem can export a symbol that gives cxl_acpi a depedency on cxl_pmem kernel module. This is a prepatory patch to resolve the issue of a race for nvdimm_bus object that is created during cxl_acpi_probe(). No functional changes besides moving code. Suggested-by: Dan Williams Acked-by: Ira Weiny Tested-by: Alison Schofield Reviewed-by: Alison Schofield Link: https://patch.msgid.link/20260205001633.1813643-2-dave.jiang@intel.com Signed-off-by: Dave Jiang

cxl/port: Hold port host lock during dport adding.

2026-02-23T16:31:22+00:00

CXL testing environment can trigger following trace Oops: general protection fault, probably for non-canonical address 0xdffffc0000000092: 0000 [#1] SMP KASAN NOPTI KASAN: null-ptr-deref in range [0x0000000000000490-0x0000000000000497] RIP: 0010:cxl_dpa_to_region+0x105/0x1f0 [cxl_core] Call Trace: cxl_event_trace_record+0xd1/0xa70 [cxl_core] __cxl_event_trace_record+0x12f/0x1e0 [cxl_core] cxl_mem_get_records_log+0x261/0x500 [cxl_core] cxl_mem_get_event_records+0x7c/0xc0 [cxl_core] cxl_mock_mem_probe+0xd38/0x1c60 [cxl_mock_mem] platform_probe+0x9d/0x130 really_probe+0x1c8/0x960 __driver_probe_device+0x187/0x3e0 driver_probe_device+0x45/0x120 __device_attach_driver+0x15d/0x280 When CXL subsystem adds a cxl port to a hierarchy, there is a small window where the new port becomes visible before it is bound to a driver. This happens because device_add() adds a device to bus device list before bus_probe_device() binds it to a driver. So if two cxl memdevs are trying to add a dport to a same port via devm_cxl_enumerate_ports(), the second cxl memdev may observe the port and attempt to add a dport, but fails because the port has not yet been attached to cxl port driver. That causes the memdev->endpoint can not be updated. The sequence is like: CPU 0 CPU 1 devm_cxl_enumerate_ports() # port not found, add it add_port_attach_ep() # hold the parent port lock # to add the new port devm_cxl_create_port() device_add() # Add dev to bus devs list bus_add_device() devm_cxl_enumerate_ports() # found the port find_cxl_port_by_uport() # hold port lock to add a dport device_lock(the port) find_or_add_dport() cxl_port_add_dport() return -ENXIO because port->dev.driver is NULL device_unlock(the port) bus_probe_device() # hold the port lock # for attaching device_lock(the port) attaching the new port device_unlock(the port) To fix this race, require that dport addition holds the host lock of the target port(the host of CXL root and all cxl host bridge ports is the platform firmware device, the host of all other ports is their parent port). The CXL subsystem already requires holding the host lock while attaching a new port. Therefore, successfully acquiring the host lock guarantees that port attaching has completed. Fixes: 4f06d81e7c6a ("cxl: Defer dport allocation for switch ports") Signed-off-by: Li Ming Reviewed-by: Dan Williams Tested-by: Alison Schofield Link: https://patch.msgid.link/20260210-fix-port-enumeration-failure-v3-2-06acce0b9ead@zohomail.com Signed-off-by: Dave Jiang