<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/cxl, branch v6.12.80</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.80</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.80'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-04-02T11:09:25+00:00</updated>
<entry>
<title>cxl/hdm: Avoid incorrect DVSEC fallback when HDM decoders are enabled</title>
<updated>2026-04-02T11:09:25+00:00</updated>
<author>
<name>Smita Koralahalli</name>
<email>Smita.KoralahalliChannabasappa@amd.com</email>
</author>
<published>2026-03-16T20:19:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=92c47342c66d4e3ded6a035df015ea7a1b259813'/>
<id>urn:sha1:92c47342c66d4e3ded6a035df015ea7a1b259813</id>
<content type='text'>
[ Upstream commit 75cea0776de502f2a1be5ca02d37c586dc81887e ]

Check the global CXL_HDM_DECODER_ENABLE bit instead of looping over
per-decoder COMMITTED bits to determine whether to fall back to DVSEC
range emulation. When the HDM decoder capability is globally enabled,
ignore DVSEC range registers regardless of individual decoder commit
state.

should_emulate_decoders() currently loops over per-decoder COMMITTED
bits, which leads to an incorrect DVSEC fallback when those bits are
zero. One way to trigger this is to destroy a region and bounce the
memdev:

  cxl disable-region region0
  cxl destroy-region region0
  cxl disable-memdev mem0
  cxl enable-memdev mem0

Region teardown zeroes the HDM decoder registers including the committed
bits. The subsequent memdev re-probe finds uncommitted decoders and falls
back to DVSEC emulation, even though HDM remains globally enabled.

Observed failures:

  should_emulate_decoders: cxl_port endpoint6: decoder6.0: committed: 0 base: 0x0_00000000 size: 0x0_00000000
  devm_cxl_setup_hdm: cxl_port endpoint6: Fallback map 1 range register
  ..
  devm_cxl_add_region: cxl_acpi ACPI0017:00: decoder0.0: created region0
  __construct_region: cxl_pci 0000:e1:00.0: mem1:decoder6.0:
  __construct_region region0 res: [mem 0x850000000-0x284fffffff flags 0x200] iw: 1 ig: 4096
  cxl region0: pci0000:e0:port1 cxl_port_setup_targets expected iw: 1 ig: 4096 ..
  cxl region0: pci0000:e0:port1 cxl_port_setup_targets got iw: 1 ig: 256 state: disabled ..
  cxl_port endpoint6: failed to attach decoder6.0 to region0: -6
  ..
  devm_cxl_add_region: cxl_acpi ACPI0017:00: decoder0.0: created region4
  alloc_hpa: cxl region4: HPA allocation error (-34) ..

Fixes: 52cc48ad2a76 ("cxl/hdm: Limit emulation to the number of range registers")
Signed-off-by: Smita Koralahalli &lt;Smita.KoralahalliChannabasappa@amd.com&gt;
Reviewed-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Link: https://patch.msgid.link/20260316201950.224567-1-Smita.KoralahalliChannabasappa@amd.com
Signed-off-by: Dave Jiang &lt;dave.jiang@intel.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>cxl/port: Fix use after free of parent_port in cxl_detach_ep()</title>
<updated>2026-04-02T11:09:24+00:00</updated>
<author>
<name>Alison Schofield</name>
<email>alison.schofield@intel.com</email>
</author>
<published>2026-02-26T18:44:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d216a4bd138eb57cc4ae7c43b2f709e3482af7e2'/>
<id>urn:sha1:d216a4bd138eb57cc4ae7c43b2f709e3482af7e2</id>
<content type='text'>
[ Upstream commit 19d2f0b97a131198efc2c4ca3eb7f980bba8c2b4 ]

cxl_detach_ep() is called during bottom-up removal when all CXL memory
devices beneath a switch port have been removed. For each port in the
hierarchy it locks both the port and its parent, removes the endpoint,
and if the port is now empty, marks it dead and unregisters the port
by calling delete_switch_port(). There are two places during this work
where the parent_port may be used after freeing:

First, a concurrent detach may have already processed a port by the
time a second worker finds it via bus_find_device(). Without pinning
parent_port, it may already be freed when we discover port-&gt;dead and
attempt to unlock the parent_port. In a production kernel that's a
silent memory corruption, with lock debug, it looks like this:

[]DEBUG_LOCKS_WARN_ON(__owner_task(owner) != get_current())
[]WARNING: kernel/locking/mutex.c:949 at __mutex_unlock_slowpath+0x1ee/0x310
[]Call Trace:
[]mutex_unlock+0xd/0x20
[]cxl_detach_ep+0x180/0x400 [cxl_core]
[]devm_action_release+0x10/0x20
[]devres_release_all+0xa8/0xe0
[]device_unbind_cleanup+0xd/0xa0
[]really_probe+0x1a6/0x3e0

Second, delete_switch_port() releases three devm actions registered
against parent_port. The last of those is unregister_port() and it
calls device_unregister() on the child port, which can cascade. If
parent_port is now also empty the device core may unregister and free
it too. So by the time delete_switch_port() returns, parent_port may
be free, and the subsequent device_unlock(&amp;parent_port-&gt;dev) operates
on freed memory. The kernel log looks same as above, with a different
offset in cxl_detach_ep().

Both of these issues stem from the absence of a lifetime guarantee
between a child port and its parent port.

Establish a lifetime rule for ports: child ports hold a reference to
their parent device until release. Take the reference when the port
is allocated and drop it when released. This ensures the parent is
valid for the full lifetime of the child and eliminates the use after
free window in cxl_detach_ep().

This is easily reproduced with a reload of cxl_acpi in QEMU with CXL
devices present.

Fixes: 2345df54249c ("cxl/memdev: Fix endpoint port removal")
Reviewed-by: Dave Jiang &lt;dave.jiang@intel.com&gt;
Reviewed-by: Li Ming &lt;ming.li@zohomail.com&gt;
Signed-off-by: Alison Schofield &lt;alison.schofield@intel.com&gt;
Reviewed-by: Jonathan Cameron &lt;jonathan.cameron@huawei.com&gt;
Link: https://patch.msgid.link/20260226184439.1732841-1-alison.schofield@intel.com
Signed-off-by: Dave Jiang &lt;dave.jiang@intel.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>cxl: Fix premature commit_end increment on decoder commit failure</title>
<updated>2026-03-04T12:20:22+00:00</updated>
<author>
<name>Yuxiong Wang</name>
<email>yuxiong.wang@linux.alibaba.com</email>
</author>
<published>2026-01-29T06:45:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=130381dd82e733a78f17f3f0482770c1756d11d9'/>
<id>urn:sha1:130381dd82e733a78f17f3f0482770c1756d11d9</id>
<content type='text'>
[ Upstream commit 7b6f9d9b1ea05c9c22570126547c780e8c6c3f62 ]

In cxl_decoder_commit(), commit_end is incremented before verifying
whether the commit succeeded, and the CXL_DECODER_F_ENABLE bit in
cxld-&gt;flags is only set after a successful commit. As a result, if the
commit fails, commit_end has been incremented and cxld-&gt;reset() has no
effect since the flag is not set, so commit_end remains incorrectly
incremented. The inconsistency between commit_end and CXL_DECODER_F_ENABLE
causes failure during subsequent either commit or reset operations.

Fix this by incrementing commit_end only after confirming the commit
succeeded. Also, remove the ineffective cxld-&gt;reset() call. According to
CXL Spec r4.0 8.2.4.20.12 Committing Decoder Programming, since
cxld_await_commit() has cleared the decoder commit bit on failure, no
additional reset is required.

[dj: Fixed commit log 80 char wrapping. ]
[dj: Fix "Fixes" tag to correct hash length. ]
[dj: Change spec to r4.0. ]

Fixes: 176baefb2eb5 ("cxl/hdm: Commit decoder state to hardware")
Signed-off-by: Yuxiong Wang &lt;yuxiong.wang@linux.alibaba.com&gt;
Acked-by: Huang Ying &lt;ying.huang@linux.alibaba.com&gt;
Reviewed-by: Dave Jiang &lt;dave.jiang@intel.com&gt;
Reviewed-by: Alison Schofield &lt;alison.schofield@intel.com&gt;
Link: https://patch.msgid.link/20260129064552.31180-1-yuxiong.wang@linux.alibaba.com
Signed-off-by: Dave Jiang &lt;dave.jiang@intel.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>cxl/region: Add a dev_err() on missing target list entries</title>
<updated>2025-07-06T09:01:32+00:00</updated>
<author>
<name>Robert Richter</name>
<email>rrichter@amd.com</email>
</author>
<published>2025-05-09T15:06:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b3dc5c735a754142831f705068fcfa77551724ff'/>
<id>urn:sha1:b3dc5c735a754142831f705068fcfa77551724ff</id>
<content type='text'>
[ Upstream commit d90acdf49e18029cfe4194475c45ef143657737a ]

Broken target lists are hard to discover as the driver fails at a
later initialization stage. Add an error message for this.

Example log messages:

  cxl_mem mem1: failed to find endpoint6:0000:e0:01.3 in target list of decoder1.1
  cxl_port endpoint6: failed to register decoder6.0: -6
  cxl_port endpoint6: probe: 0

Signed-off-by: Robert Richter &lt;rrichter@amd.com&gt;
Reviewed-by: Gregory Price &lt;gourry@gourry.net&gt;
Reviewed-by: Jonathan Cameron &lt;Jonathan.Cameron@huawei.com&gt;
Reviewed-by: Dave Jiang &lt;dave.jiang@intel.com&gt;
Reviewed-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Reviewed-by: Alison Schofield &lt;alison.schofield@intel.com&gt;
Reviewed-by: "Fabio M. De Francesco" &lt;fabio.m.de.francesco@linux.intel.com&gt;
Tested-by: Gregory Price &lt;gourry@gourry.net&gt;
Acked-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Link: https://patch.msgid.link/20250509150700.2817697-14-rrichter@amd.com
Signed-off-by: Dave Jiang &lt;dave.jiang@intel.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>cxl/core/regs.c: Skip Memory Space Enable check for RCD and RCH Ports</title>
<updated>2025-05-02T05:59:08+00:00</updated>
<author>
<name>Smita Koralahalli</name>
<email>Smita.KoralahalliChannabasappa@amd.com</email>
</author>
<published>2025-04-07T19:27:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4c5f6925e267be8a39c15e6fa6293882ca7369ed'/>
<id>urn:sha1:4c5f6925e267be8a39c15e6fa6293882ca7369ed</id>
<content type='text'>
commit 078d3ee7c162cd66d76171579c02d7890bd77daf upstream.

According to CXL r3.2 section 8.2.1.2, the PCI_COMMAND register fields,
including Memory Space Enable bit, have no effect on the behavior of an
RCD Upstream Port. Retaining this check may incorrectly cause
cxl_pci_probe() to fail on a valid RCD upstream Port.

While the specification is explicit only for RCD Upstream Ports, this
check is solely for accessing the RCRB, which is always mapped through
memory space. Therefore, its safe to remove the check entirely. In
practice, firmware reliably enables the Memory Space Enable bit for
RCH Downstream Ports and no failures have been observed.

Removing the check simplifies the code and avoids unnecessary
special-casing, while relying on BIOS/firmware to configure devices
correctly. Moreover, any failures due to inaccessible RCRB regions
will still be caught either in __rcrb_to_component() or while
parsing the component register block.

The following failure was observed in dmesg when the check was present:
	cxl_pci 0000:7f:00.0: No component registers (-6)

Fixes: d5b1a27143cb ("cxl/acpi: Extract component registers of restricted hosts from RCRB")
Signed-off-by: Smita Koralahalli &lt;Smita.KoralahalliChannabasappa@amd.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Reviewed-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
Reviewed-by: Terry Bowman &lt;terry.bowman@amd.com&gt;
Reviewed-by: Dave Jiang &lt;dave.jiang@intel.com&gt;
Reviewed-by: Robert Richter &lt;rrichter@amd.com&gt;
Link: https://patch.msgid.link/20250407192734.70631-1-Smita.KoralahalliChannabasappa@amd.com
Signed-off-by: Dave Jiang &lt;dave.jiang@intel.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>cxl/region: Fix region creation for greater than x2 switches</title>
<updated>2024-12-27T13:02:01+00:00</updated>
<author>
<name>Huaisheng Ye</name>
<email>huaisheng.ye@intel.com</email>
</author>
<published>2024-12-09T23:33:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=cec68375b8e15627d96557f3ae5e3e553fad1ea9'/>
<id>urn:sha1:cec68375b8e15627d96557f3ae5e3e553fad1ea9</id>
<content type='text'>
[ Upstream commit 76467a94810c2aa4dd3096903291ac6df30c399e ]

The cxl_port_setup_targets() algorithm fails to identify valid target list
ordering in the presence of 4-way and above switches resulting in
'cxl create-region' failures of the form:

  $ cxl create-region -d decoder0.0 -g 1024 -s 2G -t ram -w 8 -m mem4 mem1 mem6 mem3 mem2 mem5 mem7 mem0
  cxl region: create_region: region0: failed to set target7 to mem0
  cxl region: cmd_create_region: created 0 regions

  [kernel debug message]
  check_last_peer:1213: cxl region0: pci0000:0c:port1: cannot host mem6:decoder7.0 at 2
  bus_remove_device:574: bus: 'cxl': remove device region0

QEMU can create this failing topology:

                       ACPI0017:00 [root0]
                           |
                         HB_0 [port1]
                        /             \
                     RP_0             RP_1
                      |                 |
                USP [port2]           USP [port3]
            /    /    \    \        /   /    \    \
          DSP   DSP   DSP   DSP   DSP  DSP   DSP  DSP
           |     |     |     |     |    |     |    |
          mem4  mem6  mem2  mem7  mem1 mem3  mem5  mem0
 Pos:      0     2     4     6     1    3     5    7

 HB: Host Bridge
 RP: Root Port
 USP: Upstream Port
 DSP: Downstream Port

...with the following command steps:

$ qemu-system-x86_64 -machine q35,cxl=on,accel=tcg  \
        -smp cpus=8 \
        -m 8G \
        -hda /home/work/vm-images/centos-stream8-02.qcow2 \
        -object memory-backend-ram,size=4G,id=m0 \
        -object memory-backend-ram,size=4G,id=m1 \
        -object memory-backend-ram,size=2G,id=cxl-mem0 \
        -object memory-backend-ram,size=2G,id=cxl-mem1 \
        -object memory-backend-ram,size=2G,id=cxl-mem2 \
        -object memory-backend-ram,size=2G,id=cxl-mem3 \
        -object memory-backend-ram,size=2G,id=cxl-mem4 \
        -object memory-backend-ram,size=2G,id=cxl-mem5 \
        -object memory-backend-ram,size=2G,id=cxl-mem6 \
        -object memory-backend-ram,size=2G,id=cxl-mem7 \
        -numa node,memdev=m0,cpus=0-3,nodeid=0 \
        -numa node,memdev=m1,cpus=4-7,nodeid=1 \
        -netdev user,id=net0,hostfwd=tcp::2222-:22 \
        -device virtio-net-pci,netdev=net0 \
        -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
        -device cxl-rp,port=0,bus=cxl.1,id=root_port0,chassis=0,slot=0 \
        -device cxl-rp,port=1,bus=cxl.1,id=root_port1,chassis=0,slot=1 \
        -device cxl-upstream,bus=root_port0,id=us0 \
        -device cxl-downstream,port=0,bus=us0,id=swport0,chassis=0,slot=4 \
        -device cxl-type3,bus=swport0,volatile-memdev=cxl-mem0,id=cxl-vmem0 \
        -device cxl-downstream,port=1,bus=us0,id=swport1,chassis=0,slot=5 \
        -device cxl-type3,bus=swport1,volatile-memdev=cxl-mem1,id=cxl-vmem1 \
        -device cxl-downstream,port=2,bus=us0,id=swport2,chassis=0,slot=6 \
        -device cxl-type3,bus=swport2,volatile-memdev=cxl-mem2,id=cxl-vmem2 \
        -device cxl-downstream,port=3,bus=us0,id=swport3,chassis=0,slot=7 \
        -device cxl-type3,bus=swport3,volatile-memdev=cxl-mem3,id=cxl-vmem3 \
        -device cxl-upstream,bus=root_port1,id=us1 \
        -device cxl-downstream,port=4,bus=us1,id=swport4,chassis=0,slot=8 \
        -device cxl-type3,bus=swport4,volatile-memdev=cxl-mem4,id=cxl-vmem4 \
        -device cxl-downstream,port=5,bus=us1,id=swport5,chassis=0,slot=9 \
        -device cxl-type3,bus=swport5,volatile-memdev=cxl-mem5,id=cxl-vmem5 \
        -device cxl-downstream,port=6,bus=us1,id=swport6,chassis=0,slot=10 \
        -device cxl-type3,bus=swport6,volatile-memdev=cxl-mem6,id=cxl-vmem6 \
        -device cxl-downstream,port=7,bus=us1,id=swport7,chassis=0,slot=11 \
        -device cxl-type3,bus=swport7,volatile-memdev=cxl-mem7,id=cxl-vmem7 \
        -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=32G &amp;

In Guest OS:
$ cxl create-region -d decoder0.0 -g 1024 -s 2G -t ram -w 8 -m mem4 mem1 mem6 mem3 mem2 mem5 mem7 mem0

Fix the method to calculate @distance by iterativeley multiplying the
number of targets per switch port. This also follows the algorithm
recommended here [1].

Fixes: 27b3f8d13830 ("cxl/region: Program target lists")
Link: http://lore.kernel.org/6538824b52349_7258329466@dwillia2-xfh.jf.intel.com.notmuch [1]
Signed-off-by: Huaisheng Ye &lt;huaisheng.ye@intel.com&gt;
Tested-by: Li Zhijian &lt;lizhijian@fujitsu.com&gt;
[djbw: add a comment explaining 'distance']
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Link: https://patch.msgid.link/173378716722.1270362.9546805175813426729.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dave Jiang &lt;dave.jiang@intel.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>cxl/pci: Fix potential bogus return value upon successful probing</title>
<updated>2024-12-27T13:02:01+00:00</updated>
<author>
<name>Davidlohr Bueso</name>
<email>dave@stgolabs.net</email>
</author>
<published>2024-11-15T17:00:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=87d5a852f0bc2f7b9d6dfdb6db0bdb0740ef2717'/>
<id>urn:sha1:87d5a852f0bc2f7b9d6dfdb6db0bdb0740ef2717</id>
<content type='text'>
[ Upstream commit da4d8c83358163df9a4addaeba0ef8bcb03b22e8 ]

If cxl_pci_ras_unmask() returns non-zero, cxl_pci_probe() will end up
returning that value, instead of zero.

Fixes: 248529edc86f ("cxl: add RAS status unmasking for CXL")
Reviewed-by: Fan Ni &lt;fan.ni@samsung.com&gt;
Signed-off-by: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Reviewed-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
Link: https://patch.msgid.link/20241115170032.108445-1-dave@stgolabs.net
Signed-off-by: Dave Jiang &lt;dave.jiang@intel.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>cxl/port: Prevent out-of-order decoder allocation</title>
<updated>2024-10-25T21:07:03+00:00</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2024-10-23T01:43:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=105b6235ad0f24f271aef17f8865186c4546cb3a'/>
<id>urn:sha1:105b6235ad0f24f271aef17f8865186c4546cb3a</id>
<content type='text'>
With the recent change to allow out-of-order decoder de-commit it
highlights a need to strengthen the in-order decoder commit guarantees.
As it stands match_free_decoder() ensures that if 2 regions are racing
decoder allocations the one that wins the race will get the lower id
decoder, but that still leaves the race to *commit* the decoder.

Rather than have this complicated case of "reserved in-order, but may
still commit out-of-order", just arrange for the reservation order to
match the commit-order. In other words, prevent subsequent allocations
until the last reservation is committed.

This precludes overlapping region creation events and requires the
previous regionN to either move forward to the decoder commit stage or
drop its reservation before regionN+1 can move forward. That is,
provided that regionN and regionN+1 decode through the same switch port.

As a side effect this allows match_free_decoder() to drop its dependency
on needing write access to the device_find_child() @data parameter [1].

Reported-by: Zijun Hu &lt;quic_zijuhu@quicinc.com&gt;
Closes: http://lore.kernel.org/20240905-const_dfc_prepare-v4-0-4180e1d5a244@quicinc.com
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Vishal Verma &lt;vishal.l.verma@intel.com&gt;
Cc: Alison Schofield &lt;alison.schofield@intel.com&gt;
Cc: Jonathan Cameron &lt;jonathan.cameron@huawei.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Reviewed-by: Jonathan Cameron &lt;Jonathan.Cameron@huawei.com&gt;
Reviewed-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
Link: https://patch.msgid.link/172964783668.81806.14962699553881333486.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
</content>
</entry>
<entry>
<title>cxl/port: Fix use-after-free, permit out-of-order decoder shutdown</title>
<updated>2024-10-25T21:07:03+00:00</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2024-10-23T01:43:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=101c268bd2f37e965a5468353e62d154db38838e'/>
<id>urn:sha1:101c268bd2f37e965a5468353e62d154db38838e</id>
<content type='text'>
In support of investigating an initialization failure report [1],
cxl_test was updated to register mock memory-devices after the mock
root-port/bus device had been registered. That led to cxl_test crashing
with a use-after-free bug with the following signature:

    cxl_port_attach_region: cxl region3: cxl_host_bridge.0:port3 decoder3.0 add: mem0:decoder7.0 @ 0 next: cxl_switch_uport.0 nr_eps: 1 nr_targets: 1
    cxl_port_attach_region: cxl region3: cxl_host_bridge.0:port3 decoder3.0 add: mem4:decoder14.0 @ 1 next: cxl_switch_uport.0 nr_eps: 2 nr_targets: 1
    cxl_port_setup_targets: cxl region3: cxl_switch_uport.0:port6 target[0] = cxl_switch_dport.0 for mem0:decoder7.0 @ 0
1)  cxl_port_setup_targets: cxl region3: cxl_switch_uport.0:port6 target[1] = cxl_switch_dport.4 for mem4:decoder14.0 @ 1
    [..]
    cxld_unregister: cxl decoder14.0:
    cxl_region_decode_reset: cxl_region region3:
    mock_decoder_reset: cxl_port port3: decoder3.0 reset
2)  mock_decoder_reset: cxl_port port3: decoder3.0: out of order reset, expected decoder3.1
    cxl_endpoint_decoder_release: cxl decoder14.0:
    [..]
    cxld_unregister: cxl decoder7.0:
3)  cxl_region_decode_reset: cxl_region region3:
    Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6bc3: 0000 [#1] PREEMPT SMP PTI
    [..]
    RIP: 0010:to_cxl_port+0x8/0x60 [cxl_core]
    [..]
    Call Trace:
     &lt;TASK&gt;
     cxl_region_decode_reset+0x69/0x190 [cxl_core]
     cxl_region_detach+0xe8/0x210 [cxl_core]
     cxl_decoder_kill_region+0x27/0x40 [cxl_core]
     cxld_unregister+0x5d/0x60 [cxl_core]

At 1) a region has been established with 2 endpoint decoders (7.0 and
14.0). Those endpoints share a common switch-decoder in the topology
(3.0). At teardown, 2), decoder14.0 is the first to be removed and hits
the "out of order reset case" in the switch decoder. The effect though
is that region3 cleanup is aborted leaving it in-tact and
referencing decoder14.0. At 3) the second attempt to teardown region3
trips over the stale decoder14.0 object which has long since been
deleted.

The fix here is to recognize that the CXL specification places no
mandate on in-order shutdown of switch-decoders, the driver enforces
in-order allocation, and hardware enforces in-order commit. So, rather
than fail and leave objects dangling, always remove them.

In support of making cxl_region_decode_reset() always succeed,
cxl_region_invalidate_memregion() failures are turned into warnings.
Crashing the kernel is ok there since system integrity is at risk if
caches cannot be managed around physical address mutation events like
CXL region destruction.

A new device_for_each_child_reverse_from() is added to cleanup
port-&gt;commit_end after all dependent decoders have been disabled. In
other words if decoders are allocated 0-&gt;1-&gt;2 and disabled 1-&gt;2-&gt;0 then
port-&gt;commit_end only decrements from 2 after 2 has been disabled, and
it decrements all the way to zero since 1 was disabled previously.

Link: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net [1]
Cc: stable@vger.kernel.org
Fixes: 176baefb2eb5 ("cxl/hdm: Commit decoder state to hardware")
Reviewed-by: Jonathan Cameron &lt;Jonathan.Cameron@huawei.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Dave Jiang &lt;dave.jiang@intel.com&gt;
Cc: Alison Schofield &lt;alison.schofield@intel.com&gt;
Cc: Ira Weiny &lt;ira.weiny@intel.com&gt;
Cc: Zijun Hu &lt;quic_zijuhu@quicinc.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Reviewed-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
Link: https://patch.msgid.link/172964782781.81806.17902885593105284330.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
</content>
</entry>
<entry>
<title>cxl/acpi: Ensure ports ready at cxl_acpi_probe() return</title>
<updated>2024-10-25T21:07:03+00:00</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2024-10-23T01:43:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=48f62d38a07d464a499fa834638afcfd2b68f852'/>
<id>urn:sha1:48f62d38a07d464a499fa834638afcfd2b68f852</id>
<content type='text'>
In order to ensure root CXL ports are enabled upon cxl_acpi_probe()
when the 'cxl_port' driver is built as a module, arrange for the
module to be pre-loaded or built-in.

The "Fixes:" but no "Cc: stable" on this patch reflects that the issue
is merely by inspection since the bug that triggered the discovery of
this potential problem [1] is fixed by other means. However, a stable
backport should do no harm.

Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver")
Link: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net [1]
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Tested-by: Gregory Price &lt;gourry@gourry.net&gt;
Reviewed-by: Jonathan Cameron &lt;Jonathan.Cameron@huawei.com&gt;
Reviewed-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
Link: https://patch.msgid.link/172964781969.81806.17276352414854540808.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
</content>
</entry>
</feed>
