<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/misc/cxl, branch v3.18.76</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v3.18.76</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v3.18.76'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2016-05-17T18:32:43+00:00</updated>
<entry>
<title>cxl: Keep IRQ mappings on context teardown</title>
<updated>2016-05-17T18:32:43+00:00</updated>
<author>
<name>Michael Neuling</name>
<email>mikey@neuling.org</email>
</author>
<published>2016-04-22T04:57:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=625e43ab5cdd182ea89b160e8ae7008f5bc0219c'/>
<id>urn:sha1:625e43ab5cdd182ea89b160e8ae7008f5bc0219c</id>
<content type='text'>
[ Upstream commit d6776bba44d9752f6cdf640046070e71ee4bba7b ]

Keep IRQ mappings on context teardown.  This won't leak IRQs as if we
allocate the mapping again, the generic code will give the same
mapping used last time.

Doing this works around a race in the generic code. Masking the
interrupt introduces a race which can crash the kernel or result in
IRQ that is never EOIed. The lost of EOI results in all subsequent
mappings to the same HW IRQ never receiving an interrupt.

We've seen this race with cxl test cases which are doing heavy context
startup and teardown at the same time as heavy interrupt load.

A fix to the generic code is being investigated also.

Signed-off-by: Michael Neuling &lt;mikey@neuling.org&gt;
Cc: stable@vger.kernel.org # 3.8
Tested-by: Andrew Donnellan &lt;andrew.donnellan@au1.ibm.com&gt;
Acked-by: Ian Munsie &lt;imunsie@au1.ibm.com&gt;
Tested-by: Vaibhav Jain &lt;vaibhav@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
</entry>
<entry>
<title>cxl: Fix unbalanced pci_dev_get in cxl_probe</title>
<updated>2015-10-27T13:33:08+00:00</updated>
<author>
<name>Daniel Axtens</name>
<email>dja@axtens.net</email>
</author>
<published>2015-09-15T05:04:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5324b25311a4676cda90a280ec273b7b46a79d28'/>
<id>urn:sha1:5324b25311a4676cda90a280ec273b7b46a79d28</id>
<content type='text'>
[ Upstream commit 2925c2fdf1e0eb642482f5b30577e9435aaa8edb ]

Currently the first thing we do in cxl_probe is to grab a reference
on the pci device. Later on, we call device_register on our adapter.
In our remove path, we call device_unregister, but we never call
pci_dev_put. We therefore leak the device every time we do a
reflash.

device_register/unregister is sufficient to hold the reference.
Therefore, drop the call to pci_dev_get.

Here's why this is safe.
The proposed cxl_probe(pdev) calls cxl_adapter_init:
    a) init calls cxl_adapter_alloc, which creates a struct cxl,
       conventionally called adapter. This struct contains a
       device entry, adapter-&gt;dev.

    b) init calls cxl_configure_adapter, where we set
       adapter-&gt;dev.parent = &amp;dev-&gt;dev (here dev is the pci dev)

So at this point, the cxl adapter's device's parent is the PCI
device that I want to be refcounted properly.

    c) init calls cxl_register_adapter
       *) cxl_register_adapter calls device_register(&amp;adapter-&gt;dev)

So now we're in device_register, where dev is the adapter device, and
we want to know if the PCI device is safe after we return.

device_register(&amp;adapter-&gt;dev) calls device_initialize() and then
device_add().

device_add() does a get_device(). device_add() also explicitly grabs
the device's parent, and calls get_device() on it:

         parent = get_device(dev-&gt;parent);

So therefore, device_register() takes a lock on the parent PCI dev,
which is what pci_dev_get() was guarding. pci_dev_get() can therefore
be safely removed.

Fixes: f204e0b8cedd ("cxl: Driver code for powernv PCIe based cards for userspace access")
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Axtens &lt;dja@axtens.net&gt;
Acked-by: Ian Munsie &lt;imunsie@au1.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
</entry>
<entry>
<title>cxl: Add missing return statement after handling AFU errror</title>
<updated>2015-03-24T01:02:54+00:00</updated>
<author>
<name>Ian Munsie</name>
<email>imunsie@au1.ibm.com</email>
</author>
<published>2015-02-04T08:10:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=da7a05331943f70380ceb71a6e2d462d6196334c'/>
<id>urn:sha1:da7a05331943f70380ceb71a6e2d462d6196334c</id>
<content type='text'>
commit a6130ed253a931d2169c26ab0958d81b0dce4d6e upstream.

We were missing a return statement in the PSL interrupt handler in the
case of an AFU error, which would trigger an "Unhandled CXL PSL IRQ"
warning. We do actually handle these type of errors (by notifying
userspace), so add the missing return IRQ_HANDLED so we don't throw
unecessary warnings.

Signed-off-by: Ian Munsie &lt;imunsie@au1.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>cxl: Fix device_node reference counting</title>
<updated>2015-03-24T01:02:54+00:00</updated>
<author>
<name>Ryan Grimm</name>
<email>grimm@linux.vnet.ibm.com</email>
</author>
<published>2015-01-29T02:16:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3bc64c1b582e4de917fbbdeaec199864ec8194af'/>
<id>urn:sha1:3bc64c1b582e4de917fbbdeaec199864ec8194af</id>
<content type='text'>
commit 6f963ec2d6bf2476a16799eece920acb2100ff1c upstream.

When unbinding and rebinding the driver on a system with a card in PHB0, this
error condition is reached after a few attempts:

ERROR: Bad of_node_put() on /pciex@3fffe40000000
CPU: 0 PID: 3040 Comm: bash Not tainted 3.18.0-rc3-12545-g3627ffe #152
Call Trace:
[c000000721acb5c0] [c00000000086ef94] .dump_stack+0x84/0xb0 (unreliable)
[c000000721acb640] [c00000000073a0a8] .of_node_release+0xd8/0xe0
[c000000721acb6d0] [c00000000044bc44] .kobject_release+0x74/0xe0
[c000000721acb760] [c0000000007394fc] .of_node_put+0x1c/0x30
[c000000721acb7d0] [c000000000545cd8] .cxl_probe+0x1a98/0x1d50
[c000000721acb900] [c0000000004845a0] .local_pci_probe+0x40/0xc0
[c000000721acb980] [c000000000484998] .pci_device_probe+0x128/0x170
[c000000721acba30] [c00000000052400c] .driver_probe_device+0xac/0x2a0
[c000000721acbad0] [c000000000522468] .bind_store+0x108/0x160
[c000000721acbb70] [c000000000521448] .drv_attr_store+0x38/0x60
[c000000721acbbe0] [c000000000293840] .sysfs_kf_write+0x60/0xa0
[c000000721acbc50] [c000000000292500] .kernfs_fop_write+0x140/0x1d0
[c000000721acbcf0] [c000000000208648] .vfs_write+0xd8/0x260
[c000000721acbd90] [c000000000208b18] .SyS_write+0x58/0x100
[c000000721acbe30] [c000000000009258] syscall_exit+0x0/0x98

We are missing a call to of_node_get(). pnv_pci_to_phb_node() should
call of_node_get() otherwise np's reference count isn't incremented and
it might go away. Rename pnv_pci_to_phb_node() to pnv_pci_get_phb_node()
so it's clear it calls of_node_get().

Signed-off-by: Ryan Grimm &lt;grimm@linux.vnet.ibm.com&gt;
Acked-by: Ian Munsie &lt;imunsie@au1.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>cxl: Use image state defaults for reloading FPGA</title>
<updated>2015-03-24T01:02:54+00:00</updated>
<author>
<name>Ryan Grimm</name>
<email>grimm@linux.vnet.ibm.com</email>
</author>
<published>2015-01-19T17:52:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0153a8f025996a5d465717a8782c26d1ce09d254'/>
<id>urn:sha1:0153a8f025996a5d465717a8782c26d1ce09d254</id>
<content type='text'>
commit 4beb5421babee1204757b877622830c6aa31be6d upstream.

Select defaults such that a PERST causes flash image reload.  Select which
image based on what the card is set up to load.

CXL_VSEC_PERST_LOADS_IMAGE selects whether PERST assertion causes flash image
load.

CXL_VSEC_PERST_SELECT_USER selects which image is loaded on the next PERST.

cxl_update_image_control writes these bits into the VSEC.

Signed-off-by: Ryan Grimm &lt;grimm@linux.vnet.ibm.com&gt;
Acked-by: Ian Munsie &lt;imunsie@au1.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>cxl: Unmap MMIO regions when detaching a context</title>
<updated>2015-01-27T16:29:36+00:00</updated>
<author>
<name>Ian Munsie</name>
<email>imunsie@au1.ibm.com</email>
</author>
<published>2014-12-08T08:18:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=00ba02150d664472961b83b016fe03825a2176e9'/>
<id>urn:sha1:00ba02150d664472961b83b016fe03825a2176e9</id>
<content type='text'>
commit b123429e6a9e8d03aacf888d23262835f0081448 upstream.

If we need to force detach a context (e.g. due to EEH or simply force
unbinding the driver) we should prevent the userspace contexts from
being able to access the Problem State Area MMIO region further, which
they may have mapped with mmap().

This patch unmaps any mapped MMIO regions when detaching a userspace
context.

Signed-off-by: Ian Munsie &lt;imunsie@au1.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>cxl: Add timeout to process element commands</title>
<updated>2015-01-27T16:29:36+00:00</updated>
<author>
<name>Ian Munsie</name>
<email>imunsie@au1.ibm.com</email>
</author>
<published>2014-12-08T08:17:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3db65dba3a87e17e1cb341b585f9e854afd87e1b'/>
<id>urn:sha1:3db65dba3a87e17e1cb341b585f9e854afd87e1b</id>
<content type='text'>
commit a98e6e9f4e0224d85b4d951edc44af16dfe6094a upstream.

In the event that something goes wrong in the hardware and it is unable
to complete a process element comment we would end up polling forever,
effectively making the associated process unkillable.

This patch adds a timeout to the process element command code path, so
that we will give up if the hardware does not respond in a reasonable
time.

Signed-off-by: Ian Munsie &lt;imunsie@au1.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>cxl: Change contexts_lock to a mutex to fix sleep while atomic bug</title>
<updated>2015-01-27T16:29:36+00:00</updated>
<author>
<name>Ian Munsie</name>
<email>imunsie@au1.ibm.com</email>
</author>
<published>2014-12-08T08:17:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d4a26d119b968328e0b039a00520e08cd862962a'/>
<id>urn:sha1:d4a26d119b968328e0b039a00520e08cd862962a</id>
<content type='text'>
commit ee41d11d53c8fc4968f0816504651541d606cf40 upstream.

We had a known sleep while atomic bug if a CXL device was forcefully
unbound while it was in use. This could occur as a result of EEH, or
manually induced with something like this while the device was in use:

echo 0000:01:00.0 &gt; /sys/bus/pci/drivers/cxl-pci/unbind

The issue was that in this code path we iterated over each context and
forcefully detached it with the contexts_lock spin lock held, however
the detach also needed to take the spu_mutex, and call schedule.

This patch changes the contexts_lock to a mutex so that we are not in
atomic context while doing the detach, thereby avoiding the sleep while
atomic.

Also delete the related TODO comment, which suggested an alternate
solution which turned out to not be workable.

Signed-off-by: Ian Munsie &lt;imunsie@au1.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>cxl: Fix PSL error due to duplicate segment table entries</title>
<updated>2014-10-28T08:52:52+00:00</updated>
<author>
<name>Ian Munsie</name>
<email>imunsie@au1.ibm.com</email>
</author>
<published>2014-10-28T03:25:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=eb01d4c2388ce3b5bcc120d0f72912117ed7599d'/>
<id>urn:sha1:eb01d4c2388ce3b5bcc120d0f72912117ed7599d</id>
<content type='text'>
In certain circumstances the PSL (Power Service Layer, which provides
translation services for CXL hardware) can send an interrupt for a
segment miss that the kernel has already handled. This can happen if
multiple translations for the same segment are queued in the PSL before
the kernel has restarted the first translation.

The CXL driver does not expect this situation and does not check if a
segment had already been handled. This could cause a duplicate segment
table entry which in turn caused a PSL error taking down the card.

This patch fixes the issue by checking for existing entries in the
segment table that match the segment we are trying to insert, so as to
avoid inserting duplicate entries.

Signed-off-by: Ian Munsie &lt;imunsie@au1.ibm.com&gt;
Reviewed-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
<entry>
<title>cxl: Refactor cxl_load_segment() and find_free_sste()</title>
<updated>2014-10-28T08:52:24+00:00</updated>
<author>
<name>Ian Munsie</name>
<email>imunsie@au1.ibm.com</email>
</author>
<published>2014-10-28T03:25:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b03a7f578ac831276fe3870ebda01609d18167de'/>
<id>urn:sha1:b03a7f578ac831276fe3870ebda01609d18167de</id>
<content type='text'>
This moves the segment table hash calculation from cxl_load_segment()
into find_free_sste() since that is the only place it is actually used.

Signed-off-by: Ian Munsie &lt;imunsie@au1.ibm.com&gt;
Reviewed-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
</feed>
