<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/net/ethernet/amd, branch v6.6.39</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.39</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.39'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2024-03-15T14:48:21+00:00</updated>
<entry>
<title>net: pds_core: Fix possible double free in error handling path</title>
<updated>2024-03-15T14:48:21+00:00</updated>
<author>
<name>Yongzhi Liu</name>
<email>hyperlyzcs@gmail.com</email>
</author>
<published>2024-03-06T10:57:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=995f802abff209514ac2ee03b96224237646cec3'/>
<id>urn:sha1:995f802abff209514ac2ee03b96224237646cec3</id>
<content type='text'>
[ Upstream commit ba18deddd6d502da71fd6b6143c53042271b82bd ]

When auxiliary_device_add() returns error and then calls
auxiliary_device_uninit(), Callback function pdsc_auxbus_dev_release
calls kfree(padev) to free memory. We shouldn't call kfree(padev)
again in the error handling path.

Fix this by cleaning up the redundant kfree() and putting
the error handling back to where the errors happened.

Fixes: 4569cce43bc6 ("pds_core: add auxiliary_bus devices")
Signed-off-by: Yongzhi Liu &lt;hyperlyzcs@gmail.com&gt;
Reviewed-by: Wojciech Drewek &lt;wojciech.drewek@intel.com&gt;
Reviewed-by: Shannon Nelson &lt;shannon.nelson@amd.com&gt;
Link: https://lore.kernel.org/r/20240306105714.20597-1-hyperlyzcs@gmail.com
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>pds_core: Prevent health thread from running during reset/remove</title>
<updated>2024-02-05T20:14:39+00:00</updated>
<author>
<name>Brett Creeley</name>
<email>brett.creeley@amd.com</email>
</author>
<published>2024-01-29T23:40:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bd8740928aacda3d9a4cbb77e2ca3a951f20ba6b'/>
<id>urn:sha1:bd8740928aacda3d9a4cbb77e2ca3a951f20ba6b</id>
<content type='text'>
[ Upstream commit d9407ff11809c6812bb84fe7be9c1367d758e5c8 ]

The PCIe reset handlers can run at the same time as the
health thread. This can cause the health thread to
stomp on the PCIe reset. Fix this by preventing the
health thread from running while a PCIe reset is happening.

As part of this use timer_shutdown_sync() during reset and
remove to make sure the timer doesn't ever get rearmed.

Fixes: ffa55858330f ("pds_core: implement pci reset handlers")
Signed-off-by: Brett Creeley &lt;brett.creeley@amd.com&gt;
Reviewed-by: Shannon Nelson &lt;shannon.nelson@amd.com&gt;
Reviewed-by: Przemek Kitszel &lt;przemyslaw.kitszel@intel.com&gt;
Link: https://lore.kernel.org/r/20240129234035.69802-2-brett.creeley@amd.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>pds_core: Rework teardown/setup flow to be more common</title>
<updated>2024-02-05T20:14:37+00:00</updated>
<author>
<name>Brett Creeley</name>
<email>brett.creeley@amd.com</email>
</author>
<published>2024-01-29T23:40:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2bbf2b1c20f934a054172175ccbabcc01fe69ef6'/>
<id>urn:sha1:2bbf2b1c20f934a054172175ccbabcc01fe69ef6</id>
<content type='text'>
[ Upstream commit bc90fbe0c3182157d2be100a2f6c2edbb1820677 ]

Currently the teardown/setup flow for driver probe/remove is quite
a bit different from the reset flows in pdsc_fw_down()/pdsc_fw_up().
One key piece that's missing are the calls to pci_alloc_irq_vectors()
and pci_free_irq_vectors(). The pcie reset case is calling
pci_free_irq_vectors() on reset_prepare, but not calling the
corresponding pci_alloc_irq_vectors() on reset_done. This is causing
unexpected/unwanted interrupt behavior due to the adminq interrupt
being accidentally put into legacy interrupt mode. Also, the
pci_alloc_irq_vectors()/pci_free_irq_vectors() functions are being
called directly in probe/remove respectively.

Fix this inconsistency by making the following changes:
  1. Always call pdsc_dev_init() in pdsc_setup(), which calls
     pci_alloc_irq_vectors() and get rid of the now unused
     pds_dev_reinit().
  2. Always free/clear the pdsc-&gt;intr_info in pdsc_teardown()
     since this structure will get re-alloced in pdsc_setup().
  3. Move the calls of pci_free_irq_vectors() to pdsc_teardown()
     since pci_alloc_irq_vectors() will always be called in
     pdsc_setup()-&gt;pdsc_dev_init() for both the probe/remove and
     reset flows.
  4. Make sure to only create the debugfs "identity" entry when it
     doesn't already exist, which it will in the reset case because
     it's already been created in the initial call to pdsc_dev_init().

Fixes: ffa55858330f ("pds_core: implement pci reset handlers")
Signed-off-by: Brett Creeley &lt;brett.creeley@amd.com&gt;
Reviewed-by: Shannon Nelson &lt;shannon.nelson@amd.com&gt;
Link: https://lore.kernel.org/r/20240129234035.69802-7-brett.creeley@amd.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>pds_core: Clear BARs on reset</title>
<updated>2024-02-05T20:14:37+00:00</updated>
<author>
<name>Brett Creeley</name>
<email>brett.creeley@amd.com</email>
</author>
<published>2024-01-29T23:40:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f6ec6ac9432941ec85a2221c91b1ecfc85680d89'/>
<id>urn:sha1:f6ec6ac9432941ec85a2221c91b1ecfc85680d89</id>
<content type='text'>
[ Upstream commit e96094c1d11cce4deb5da3c0500d49041ab845b8 ]

During reset the BARs might be accessed when they are
unmapped. This can cause unexpected issues, so fix it by
clearing the cached BAR values so they are not accessed
until they are re-mapped.

Also, make sure any places that can access the BARs
when they are NULL are prevented.

Fixes: 49ce92fbee0b ("pds_core: add FW update feature to devlink")
Signed-off-by: Brett Creeley &lt;brett.creeley@amd.com&gt;
Reviewed-by: Shannon Nelson &lt;shannon.nelson@amd.com&gt;
Reviewed-by: Przemek Kitszel &lt;przemyslaw.kitszel@intel.com&gt;
Link: https://lore.kernel.org/r/20240129234035.69802-6-brett.creeley@amd.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>pds_core: Prevent race issues involving the adminq</title>
<updated>2024-02-05T20:14:37+00:00</updated>
<author>
<name>Brett Creeley</name>
<email>brett.creeley@amd.com</email>
</author>
<published>2024-01-29T23:40:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=22cd6046eb2148b18990257505834dd45c672a1b'/>
<id>urn:sha1:22cd6046eb2148b18990257505834dd45c672a1b</id>
<content type='text'>
[ Upstream commit 7e82a8745b951b1e794cc780d46f3fbee5e93447 ]

There are multiple paths that can result in using the pdsc's
adminq.

[1] pdsc_adminq_isr and the resulting work from queue_work(),
    i.e. pdsc_work_thread()-&gt;pdsc_process_adminq()

[2] pdsc_adminq_post()

When the device goes through reset via PCIe reset and/or
a fw_down/fw_up cycle due to bad PCIe state or bad device
state the adminq is destroyed and recreated.

A NULL pointer dereference can happen if [1] or [2] happens
after the adminq is already destroyed.

In order to fix this, add some further state checks and
implement reference counting for adminq uses. Reference
counting was used because multiple threads can attempt to
access the adminq at the same time via [1] or [2]. Additionally,
multiple clients (i.e. pds-vfio-pci) can be using [2]
at the same time.

The adminq_refcnt is initialized to 1 when the adminq has been
allocated and is ready to use. Users/clients of the adminq
(i.e. [1] and [2]) will increment the refcnt when they are using
the adminq. When the driver goes into a fw_down cycle it will
set the PDSC_S_FW_DEAD bit and then wait for the adminq_refcnt
to hit 1. Setting the PDSC_S_FW_DEAD before waiting will prevent
any further adminq_refcnt increments. Waiting for the
adminq_refcnt to hit 1 allows for any current users of the adminq
to finish before the driver frees the adminq. Once the
adminq_refcnt hits 1 the driver clears the refcnt to signify that
the adminq is deleted and cannot be used. On the fw_up cycle the
driver will once again initialize the adminq_refcnt to 1 allowing
the adminq to be used again.

Fixes: 01ba61b55b20 ("pds_core: Add adminq processing and commands")
Signed-off-by: Brett Creeley &lt;brett.creeley@amd.com&gt;
Reviewed-by: Shannon Nelson &lt;shannon.nelson@amd.com&gt;
Reviewed-by: Przemek Kitszel &lt;przemyslaw.kitszel@intel.com&gt;
Link: https://lore.kernel.org/r/20240129234035.69802-5-brett.creeley@amd.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>pds_core: implement pci reset handlers</title>
<updated>2024-02-05T20:14:37+00:00</updated>
<author>
<name>Shannon Nelson</name>
<email>shannon.nelson@amd.com</email>
</author>
<published>2023-09-14T22:31:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=699f5416c33e515424982eacbe5a8567c5f64f04'/>
<id>urn:sha1:699f5416c33e515424982eacbe5a8567c5f64f04</id>
<content type='text'>
[ Upstream commit ffa55858330f267beec995fc4f68098c91311c64 ]

Implement the callbacks for a nice PCI reset.  These get called
when a user is nice enough to use the sysfs PCI reset entry, e.g.
    echo 1 &gt; /sys/bus/pci/devices/0000:2b:00.0/reset

Signed-off-by: Shannon Nelson &lt;shannon.nelson@amd.com&gt;
Reviewed-by: Brett Creeley &lt;brett.creeley@amd.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Stable-dep-of: 7e82a8745b95 ("pds_core: Prevent race issues involving the adminq")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>pds_core: Use struct pdsc for the pdsc_adminq_isr private data</title>
<updated>2024-02-05T20:14:37+00:00</updated>
<author>
<name>Brett Creeley</name>
<email>brett.creeley@amd.com</email>
</author>
<published>2024-01-29T23:40:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=10839a1892dd91912dfbd5740410518c87a7893f'/>
<id>urn:sha1:10839a1892dd91912dfbd5740410518c87a7893f</id>
<content type='text'>
[ Upstream commit 951705151e50f9022bc96ec8b3fd5697380b1df6 ]

The initial design for the adminq interrupt was done based
on client drivers having their own adminq and adminq
interrupt. So, each client driver's adminq isr would use
their specific adminqcq for the private data struct. For the
time being the design has changed to only use a single
adminq for all clients. So, instead use the struct pdsc for
the private data to simplify things a bit.

This also has the benefit of not dereferencing the adminqcq
to access the pdsc struct when the PDSC_S_STOPPING_DRIVER bit
is set and the adminqcq has actually been cleared/freed.

Fixes: 01ba61b55b20 ("pds_core: Add adminq processing and commands")
Signed-off-by: Brett Creeley &lt;brett.creeley@amd.com&gt;
Reviewed-by: Shannon Nelson &lt;shannon.nelson@amd.com&gt;
Reviewed-by: Przemek Kitszel &lt;przemyslaw.kitszel@intel.com&gt;
Link: https://lore.kernel.org/r/20240129234035.69802-4-brett.creeley@amd.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>pds_core: Cancel AQ work on teardown</title>
<updated>2024-02-05T20:14:37+00:00</updated>
<author>
<name>Brett Creeley</name>
<email>brett.creeley@amd.com</email>
</author>
<published>2024-01-29T23:40:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b26628142b99201bf40d856e810faa3530946f57'/>
<id>urn:sha1:b26628142b99201bf40d856e810faa3530946f57</id>
<content type='text'>
[ Upstream commit d321067e2cfa4d5e45401a00912ca9da8d1af631 ]

There is a small window where pdsc_work_thread()
calls pdsc_process_adminq() and pdsc_process_adminq()
passes the PDSC_S_STOPPING_DRIVER check and starts
to process adminq/notifyq work and then the driver
starts a fw_down cycle. This could cause some
undefined behavior if the notifyqcq/adminqcq are
free'd while pdsc_process_adminq() is running. Use
cancel_work_sync() on the adminqcq's work struct
to make sure any pending work items are cancelled
and any in progress work items are completed.

Also, make sure to not call cancel_work_sync() if
the work item has not be initialized. Without this,
traces will happen in cases where a reset fails and
teardown is called again or if reset fails and the
driver is removed.

Fixes: 01ba61b55b20 ("pds_core: Add adminq processing and commands")
Signed-off-by: Brett Creeley &lt;brett.creeley@amd.com&gt;
Reviewed-by: Shannon Nelson &lt;shannon.nelson@amd.com&gt;
Reviewed-by: Przemek Kitszel &lt;przemyslaw.kitszel@intel.com&gt;
Link: https://lore.kernel.org/r/20240129234035.69802-3-brett.creeley@amd.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>amd-xgbe: propagate the correct speed and duplex status</title>
<updated>2023-12-03T06:33:05+00:00</updated>
<author>
<name>Raju Rangoju</name>
<email>Raju.Rangoju@amd.com</email>
</author>
<published>2023-11-21T19:14:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=54f06863c1e6b272a515fdba8f0514ace67b509d'/>
<id>urn:sha1:54f06863c1e6b272a515fdba8f0514ace67b509d</id>
<content type='text'>
[ Upstream commit 7a2323ac24a50311f64a3a9b54ed5bef5821ecae ]

xgbe_get_link_ksettings() does not propagate correct speed and duplex
information to ethtool during cable unplug. Due to which ethtool reports
incorrect values for speed and duplex.

Address this by propagating correct information.

Fixes: 7c12aa08779c ("amd-xgbe: Move the PHY support into amd-xgbe")
Acked-by: Shyam Sundar S K &lt;Shyam-sundar.S-k@amd.com&gt;
Signed-off-by: Raju Rangoju &lt;Raju.Rangoju@amd.com&gt;
Reviewed-by: Wojciech Drewek &lt;wojciech.drewek@intel.com&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>amd-xgbe: handle the corner-case during tx completion</title>
<updated>2023-12-03T06:33:05+00:00</updated>
<author>
<name>Raju Rangoju</name>
<email>Raju.Rangoju@amd.com</email>
</author>
<published>2023-11-21T19:14:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b2943d777f718bb1e71239b74a3b41bf3d22d3fb'/>
<id>urn:sha1:b2943d777f718bb1e71239b74a3b41bf3d22d3fb</id>
<content type='text'>
[ Upstream commit 7121205d5330c6a3cb3379348886d47c77b78d06 ]

The existing implementation uses software logic to accumulate tx
completions until the specified time (1ms) is met and then poll them.
However, there exists a tiny gap which leads to a race between
resetting and checking the tx_activate flag. Due to this the tx
completions are not reported to upper layer and tx queue timeout
kicks-in restarting the device.

To address this, introduce a tx cleanup mechanism as part of the
periodic maintenance process.

Fixes: c5aa9e3b8156 ("amd-xgbe: Initial AMD 10GbE platform driver")
Acked-by: Shyam Sundar S K &lt;Shyam-sundar.S-k@amd.com&gt;
Signed-off-by: Raju Rangoju &lt;Raju.Rangoju@amd.com&gt;
Reviewed-by: Wojciech Drewek &lt;wojciech.drewek@intel.com&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
