| Age | Commit message (Collapse) | Author | Files | Lines |
|
priv->rc.of_node is never set in reset core. Even if it were: tasking
the reset-gpio driver with controlling the reference count of an OF node
set up in reset core is a weird inversion of responsability. But it's
also wrong in that the underlying device never actually gets removed so
the node should not be put at all and especially not at driver detach.
Remove the devres action.
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
|
|
Driver core holds a reference to the USB interface and its parent USB
device while the interface is bound to a driver and there is no need to
take additional references unless the structures are needed after
disconnect.
Drop the redundant device reference to reduce cargo culting, make it
easier to spot drivers where an extra reference is needed, and reduce
the risk of memory leaks when drivers fail to release it.
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Linus Walleij <linusw@kernel.org>
Link: https://patch.msgid.link/20260305124945.10781-1-johan@kernel.org
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
|
|
Currently Tyr defines a convenience type alias for its DRM device type,
`TyrDrmDevice` but it does not use the alias outside of `tyr/driver.rs`.
Replace `drm::Device<TyrDrmDriver>` with the alias `TyrDrmDevice` across
the driver.
This change will ease future upstream Tyr development by reducing the
diffs when multiple series are touching these files.
No functional changes are intended.
Signed-off-by: Deborah Brouwer <deborah.brouwer@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://patch.msgid.link/20260302202331.176140-1-deborah.brouwer@collabora.com
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
|
|
The NPU 60XX uses the default boot params location specified
in the firmware image header, consistent with earlier generations.
Remove the unnecessary MMIO register write, freeing the AON register
for future use.
Fixes: 44e4c88951fa ("accel/ivpu: Implement warm boot flow for NPU6 and unify boot handling")
Signed-off-by: Andrzej Kacprowski <andrzej.kacprowski@linux.intel.com>
Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://patch.msgid.link/20260305142226.194995-1-andrzej.kacprowski@linux.intel.com
|
|
Add a new kunit test gpu_test_buddy_alloc_range() that exercises the
__gpu_buddy_alloc_range() exact-range allocation path, triggered when
start + size == end with flags=0.
The test covers:
- Basic exact-range allocation of the full mm
- Exact-range allocation of equal sub-ranges (quarters)
- Minimum chunk-size exact ranges at start, middle, and end offsets
- Non power-of-two mm size with multiple roots, including cross-root
exact-range allocation
- Randomized exact-range allocations of N contiguous page-aligned
slices in random order
- Negative: partially allocated range must reject overlapping exact
alloc
- Negative: checkerboard allocation pattern rejects exact range over
partially occupied pairs
- Negative: misaligned start, unaligned size, and out-of-bounds end
- Free and re-allocate the same exact range across multiple iterations
- Various power-of-two exact ranges at natural alignment
Cc: Christian König <christian.koenig@amd.com>
Cc: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Suggested-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Sanjay Yadav <sanjay.kumar.yadav@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Link: https://patch.msgid.link/20260302150947.47535-2-sanjay.kumar.yadav@intel.com
|
|
checkpatch.pl reported warnings where seq_printf() was used for simple
strings with no format specifiers.
Replace these instances with seq_puts() to avoid the overhead of runtime
string parsing and to conform to kernel coding standards.
Signed-off-by: Coby McKinney <coby@bytemap.space>
Reviewed-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Link: https://patch.msgid.link/20260202205214.18898-1-coby@bytemap.space
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
For eDP read the ALPM DPCD caps after DPCD initalization and just before
the PSR init.
v2: Move intel_alpm_init to intel_edp_init_dpcd (Jouni)
v3: Add Fixes with commit-id (Jouni)
v4: Separated the alpm dpcd read caps from alpm_init and moved to
intel_edp_init_dpcd.
v5: Read alpm_caps always for eDP irrespective of the eDP version (Jouni)
v6: replace drm_dp_dpcd_readb with drm_dp_dpcd_read_byte (Jouni)
Fixes: 15438b325987 ("drm/i915/alpm: Add compute config for lobf")
Signed-off-by: Arun R Murthy <arun.r.murthy@intel.com>
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
Reviewed-by: Jouni Högander <jouni.hogander@intel.com>
Signed-off-by: Animesh Manna <animesh.manna@intel.com>
Link: https://patch.msgid.link/20260304072157.1123283-1-arun.r.murthy@intel.com
|
|
Two members of struct xdma_desc_block are not descibed leading to
warnings, document them.
Warning: drivers/dma/xilinx/xdma.c:75 struct member 'last_interrupt' not described in 'xdma_chan'
Warning: drivers/dma/xilinx/xdma.c:75 struct member 'stop_requested' not described in 'xdma_chan'
Link: https://patch.msgid.link/20260227022905.233721-1-vkoul@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Use scoped for-each loop when iterating over device nodes to make code a
bit simpler.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Link: https://patch.msgid.link/20260301142158.90319-2-krzysztof.kozlowski@oss.qualcomm.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
On prep, a spin lock is taken and the next entry in the circular buffer
is filled. On submit, the spin lock just needs to be released as the
requests are already pending.
When switchtec_dma_issue_pending() is called, the sq_tail register
is written to indicate there are new jobs for the dma engine to start
on.
Pause and resume operations are implemented by writing to a control
register.
Signed-off-by: Kelvin Cao <kelvin.cao@microchip.com>
Co-developed-by: George Ge <george.ge@microchip.com>
Signed-off-by: George Ge <george.ge@microchip.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://patch.msgid.link/20260302210419.3656-4-logang@deltatee.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Initialize the hardware and create the dma channel queues.
Signed-off-by: Kelvin Cao <kelvin.cao@microchip.com>
Co-developed-by: George Ge <george.ge@microchip.com>
Signed-off-by: George Ge <george.ge@microchip.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://patch.msgid.link/20260302210419.3656-3-logang@deltatee.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Some Switchtec Switches can expose DMA engines via extra PCI functions
on the upstream ports. At most one such function can be supported on
each upstream port. Each function can have one or more DMA channels.
This patch is just the core PCI driver skeleton and dma
engine registration.
Signed-off-by: Kelvin Cao <kelvin.cao@microchip.com>
Co-developed-by: George Ge <george.ge@microchip.com>
Signed-off-by: George Ge <george.ge@microchip.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://patch.msgid.link/20260302210419.3656-2-logang@deltatee.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Add support for gracefully terminating hardware cyclic DMA transfers when
a new transfer is queued and is flagged with DMA_PREP_LOAD_EOT. Without
this, cyclic transfers would continue indefinitely until we brute force
it with .device_terminate_all().
When a new descriptor is queued while a cyclic transfer is active, mark
the cyclic transfer for termination. For hardware with scatter-gather
support, modify the last segment flags to trigger end-of-transfer. For
non-SG hardware, clear the CYCLIC flag to allow natural completion.
Older IP core versions (pre-4.6.a) can prefetch data when clearing the
CYCLIC flag, causing corruption in the next transfer. Work around this
by disabling and re-enabling the core to flush prefetched data.
The cyclic_eot flag tracks transfers marked for termination, preventing
new transfers from starting until the cyclic one completes. Non-EOT
transfers submitted after cyclic transfers are discarded with a warning.
Also note that for hardware cyclic transfers not using SG, we need to
make sure that chan->next_desc is also set to NULL (so we can look at
possible EOT transfers) and we also need to move the queue check to
after axi_dmac_get_next_desc() because with hardware based cyclic
transfers we might get the queue marked as full and hence we would not
be able to check for cyclic termination.
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
Link: https://patch.msgid.link/20260303-axi-dac-cyclic-support-v2-5-0db27b4be95a@analog.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
As of now, to terminate a cyclic transfer, one pretty much needs to use
brute force and terminate all transfers with .device_terminate_all().
With this change, when a cyclic transfer terminates (and generates an
EOT interrupt), look at any new pending transfer with the DMA_PREP_LOAD_EOT
flag set. If there is one, the current cyclic transfer is terminated and
the next one is enqueued. If the flag is not set, that transfer is ignored.
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
Link: https://patch.msgid.link/20260303-axi-dac-cyclic-support-v2-4-0db27b4be95a@analog.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Add a new helper for getting the next valid struct axi_dmac_desc. This
will be extended in follow up patches to support to gracefully terminate
cyclic transfers.
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
Link: https://patch.msgid.link/20260303-axi-dac-cyclic-support-v2-3-0db27b4be95a@analog.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
.device_prep_peripheral_dma_vec()
Add support for cyclic transfers by checking the DMA_PREP_REPEAT
flag. If the flag is set, close the loop and clear the flag for the last
segment (the same done for .device_prep_dma_cyclic().
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
Link: https://patch.msgid.link/20260303-axi-dac-cyclic-support-v2-2-0db27b4be95a@analog.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
The ioat_sysfs_entry structures are never modified, mark them as
read-only.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260304-sysfs-const-ioat-v2-4-b9b82651219b@weissschuh.net
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
ioat_ktype is never modified, so make it const.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260304-sysfs-const-ioat-v2-3-b9b82651219b@weissschuh.net
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Move struct ioat_sysfs_entry into sysfs.c because it is only used in it.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Link: https://patch.msgid.link/20260304-sysfs-const-ioat-v2-2-b9b82651219b@weissschuh.net
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
These structures are only used in sysfs.c, where are defined.
Make them static and remove them from the header.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Link: https://patch.msgid.link/20260304-sysfs-const-ioat-v2-1-b9b82651219b@weissschuh.net
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
devm_regmap_init_mmio returns an ERR_PTR() upon error, not NULL.
Fix the error check and also fix the error message. Use the error code
from ERR_PTR() instead of the wrong value in ret.
Fixes: 17ce252266c7 ("dmaengine: xilinx: xdma: Add xilinx xdma driver")
Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20251014061309.283468-1-alexander.stein@ew.tq-group.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
CYCLE_BIT bits for HDMA.
Others have submitted this issue (https://lore.kernel.org/dmaengine/
20240722030405.3385-1-zhengdongxiong@gxmicro.cn/),
but it has not been fixed yet. Therefore, more supplementary information
is provided here.
As mentioned in the "PCS-CCS-CB-TCB" Producer-Consumer Synchronization of
"DesignWare Cores PCI Express Controller Databook, version 6.00a":
1. The Consumer CYCLE_STATE (CCS) bit in the register only needs to be
initialized once; the value will update automatically to be
~CYCLE_BIT (CB) in the next chunk.
2. The Consumer CYCLE_BIT bit in the register is loaded from the LL
element and tested against CCS. When CB = CCS, the data transfer is
executed. Otherwise not.
The current logic sets customer (HDMA) CS and CB bits to 1 in each chunk
while setting the producer (software) CB of odd chunks to 0 and even
chunks to 1 in the linked list. This is leading to a mismatch between
the producer CB and consumer CS bits.
This issue can be reproduced by setting the transmission data size to
exceed one chunk. By the way, in the EDMA using the same "PCS-CCS-CB-TCB"
mechanism, the CS bit is only initialized once and this issue was not
found. Refer to
drivers/dma/dw-edma/dw-edma-v0-core.c:dw_edma_v0_core_start.
So fix this issue by initializing the CYCLE_STATE and CYCLE_BIT bits
only once.
Fixes: e74c39573d35 ("dmaengine: dw-edma: Add support for native HDMA")
Signed-off-by: LUO Haowen <luo-hw@foxmail.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/tencent_CB11AA9F3920C1911AF7477A9BD8EFE0AD05@qq.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Add KUnit test to validate offset-aligned allocations in the DRM buddy
allocator.
Validate offset-aligned allocation:
The test covers allocations with sizes smaller than the alignment constraint
and verifies correct size preservation, offset alignment, and behavior across
multiple allocation sizes. It also exercises fragmentation by freeing
alternating blocks and confirms that allocation fails once all aligned offsets
are consumed.
Stress subtree_max_alignment propagation:
Exercise subtree_max_alignment tracking by allocating blocks with descending
alignment constraints and freeing them in reverse order. This verifies that
free-tree augmentation correctly propagates the maximum offset alignment
present in each subtree at every stage.
v2:
- Move the patch to gpu/tests/gpu_buddy_test.c file.
v3:
- Fixed build warnings reported by kernel test robot <lkp@intel.com>
v4:(Matthew)
- Use IS_ALIGNED() instead of manual alignment checks
- Simplify order iteration loop for readability
- Remove extra newline
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patch.msgid.link/20260306060155.2114-2-Arunpravin.PaneerSelvam@amd.com
|
|
Large alignment requests previously forced the buddy allocator to search by
alignment order, which often caused higher-order free blocks to be split even
when a suitably aligned smaller region already existed within them. This led
to excessive fragmentation, especially for workloads requesting small sizes
with large alignment constraints.
This change prioritizes the requested allocation size during the search and
uses an augmented RB-tree field (subtree_max_alignment) to efficiently locate
free blocks that satisfy both size and offset-alignment requirements. As a
result, the allocator can directly select an aligned sub-region without
splitting larger blocks unnecessarily.
A practical example is the VKCTS test
dEQP-VK.memory.allocation.basic.size_8KiB.reverse.count_4000, which repeatedly
allocates 8 KiB buffers with a 256 KiB alignment. Previously, such allocations
caused large blocks to be split aggressively, despite smaller aligned regions
being sufficient. With this change, those aligned regions are reused directly,
significantly reducing fragmentation.
This improvement is visible in the amdgpu VRAM buddy allocator state
(/sys/kernel/debug/dri/1/amdgpu_vram_mm). After the change, higher-order blocks
are preserved and the number of low-order fragments is substantially reduced.
Before:
order- 5 free: 1936 MiB, blocks: 15490
order- 4 free: 967 MiB, blocks: 15486
order- 3 free: 483 MiB, blocks: 15485
order- 2 free: 241 MiB, blocks: 15486
order- 1 free: 241 MiB, blocks: 30948
After:
order- 5 free: 493 MiB, blocks: 3941
order- 4 free: 246 MiB, blocks: 3943
order- 3 free: 123 MiB, blocks: 4101
order- 2 free: 61 MiB, blocks: 4101
order- 1 free: 61 MiB, blocks: 8018
By avoiding unnecessary splits, this change improves allocator efficiency and
helps maintain larger contiguous free regions under heavy offset-aligned
allocation workloads.
v2:(Matthew)
- Update augmented information along the path to the inserted node.
v3:
- Move the patch to gpu/buddy.c file.
v4:(Matthew)
- Use the helper instead of calling _ffs directly
- Remove gpu_buddy_block_order(block) >= order check and drop order
- Drop !node check as all callers handle this already
- Return larger than any other possible alignment for __ffs64(0)
- Replace __ffs with __ffs64
v5:(Matthew)
- Drop subtree_max_alignment initialization at gpu_block_alloc()
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patch.msgid.link/20260306060155.2114-1-Arunpravin.PaneerSelvam@amd.com
|
|
We only allow up to 1 bpt stream running on a SoundWire bus.
bus.bpt_stream will be assigned when it is opened and will be set to
NULL when it is closed. We do check bus->bpt_stream_refcount if the
stream type is SDW_STREAM_BPT in sdw_master_rt_alloc(), but at that
moment the bpt stream is allocated and set to bus.bpt_stream. It will
lead to the original bus.bpt_stream be changed to the new and not used
bpt stream. And it will be released and set to NULL when
sdw_slave_bpt_stream_add() return error as it supposed to. Then the
original stream will try to use the NULL bus.bpt_stream.
Fixes: 4c1ce9f37d8a ("soundwire: intel_ace2x: add BPT send_async/wait callbacks")
Reported-by: Simon Trimmer <simont@opensource.cirrus.com>
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Simon Trimmer <simont@opensource.cirrus.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Link: https://patch.msgid.link/20260126054045.2504103-1-yung-chuan.liao@linux.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
The dev_warn() messages in sdw_handle_slave_status() for UNATTACHED
transitions were added in commit d1b328557058 ("soundwire: bus: add
dev_warn() messages to track UNATTACHED devices") to debug attachment
failures with dynamic debug enabled.
These warnings fire during normal operation -- for example when a codec
driver triggers a hardware reset after firmware download, causing the
device to momentarily go UNATTACHED before re-attaching -- producing
misleading noise on every boot.
Demote the messages to dev_dbg() so they remain available via dynamic
debug for diagnosing real attachment failures without alarming users
during expected initialization sequences.
Fixes: d1b328557058 ("soundwire: bus: add dev_warn() messages to track UNATTACHED devices")
Signed-off-by: Cole Leavitt <cole@unwrap.rs>
Reviewed-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Link: https://patch.msgid.link/20260218180210.9263-1-cole@unwrap.rs
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Replace the wait_for_completion_timeout() in sdw_prep_deprep_slave_ports()
with a read_poll_timeout().
The original intent of the wait_for_completion_timeout() was to wait for
the port prepare interrupt. But at this time the code is holding the
bus_lock, which prevents the interrupt handler from running. Because of
this, the port_prep completion will not be signaled and the
wait_for_completion_timeout() will always timeout.
Rewriting the code to avoid taking the bus_lock carries risks, and
needs careful consideration of the consequences. It is safer and simpler
to replace the completion with a simple register poll.
As the code is holding the bus_lock, it is already blocking other activity
so consuming control channel bandwidth for polling isn't really a concern.
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Link: https://patch.msgid.link/20260227111648.175548-1-rf@opensource.cirrus.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
For current platforms(ACP6.3/ACP7.0/ACP7.1/ACP7.2), AMD SoundWire manager
doesn't have banked registers for data port programming on Manager's side.
Need to use fixed block offsets, hstart & hstop for manager ports.
Earlier amd manager driver has support for 12 MHz as a bus clock frequency
where frame rate is 48000 and number of bits is 500, frame shape as
50 x 10 with fixed block offset mapping based on port number.
Got a new requirement to support 6 MHz as a bus clock frequency.
For 6 MHz bus clock frequency amd manager driver needs to support two
different frame shapes i.e number of bits as 250 with frame rate as 48000
and frame shape as 125 x 2 and for the second combination number of bits as
500 where frame rate is 24000 and frame shape is 50 x 10.
Few SoundWire peripherals doesn't support 125 x 2 as a frame shape for
6 MHz bus clock frequency. They have explicit requirement for the frame
shape. In this scenario, amd manager driver needs to use 50 x 10 as a frame
shape where frame rate is 24000. Based on the platform and SoundWire
topology for 6Mhz support frame shape will be decided which is part of
SoundWire manager DisCo tables.
For current platforms, amd manager driver supports only two bus clock
frequencies(12 MHz & 6 MHz). Refactor bandwidth logic to support different
bus clock frequencies.
Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.dev>
Link: https://patch.msgid.link/20260226065638.1251771-3-Vijendar.Mukunda@amd.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Add generic SoundWire clock initialization sequence to support
different SoundWire bus clock frequencies for ACP6.3/7.0/7.1/7.2
platforms and remove hard coding initializations for 12Mhz bus
clock frequency.
Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.dev>
Link: https://patch.msgid.link/20260226065638.1251771-2-Vijendar.Mukunda@amd.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
If the dw_pcie_resume_noirq() API fails, it just returns the errno without
doing cleanup in the error path, leading to resource leak.
So perform cleanup in the error path.
Fixes: 4774faf854f5 ("PCI: dwc: Implement generic suspend/resume functionality")
Reported-by: Senchuan Zhang <zhangsenchuan@eswincomputing.com>
Closes: https://lore.kernel.org/linux-pci/78296255.3869.19c8eb694d6.Coremail.zhangsenchuan@eswincomputing.com
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://patch.msgid.link/20260226133951.296743-1-mani@kernel.org
|
|
PCIe r7.0, section 7.5.3.6 states that for multi-function devices, the
Max Link Width and Max Link Speed fields in the Link Capabilities
Register must report the same values for all functions.
Currently, dw_pcie_setup() programs these fields only for Function 0
via dw_pcie_link_set_max_speed() and dw_pcie_link_set_max_link_width().
For multi-function endpoint configurations, Function 1 and beyond retain
their default values, violating the PCIe specification.
Fix this by reading the Max Link Width and Max Link Speed fields from
Link Capabilities Register of Function 0 after dw_pcie_setup() completes,
then mirroring these values to all other functions.
Fixes: 24ede430fa49 ("PCI: designware-ep: Add multiple PFs support for DWC")
Fixes: 89db0793c9f2 ("PCI: dwc: Add missing PCI_EXP_LNKCAP_MLW handling")
Signed-off-by: Aksh Garg <a-garg7@ti.com>
[mani: renamed ref_lnkcap to func0_lnkcap]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://patch.msgid.link/20260224083817.916782-3-a-garg7@ti.com
|
|
In dw_pcie_ep_set_msix(), while updating the MSI-X Table Size value for
individual functions, Message Control register is read from the passed
function number register space using dw_pcie_ep_readw_dbi(), but always
written back to the Function 0's register space using dw_pcie_writew_dbi().
This causes incorrect MSI-X configuration for the rest of the functions,
other than Function 0.
Fix this by using dw_pcie_ep_writew_dbi() to write to the correct
function's register space, matching the read operation.
Fixes: 70fa02ca1446 ("PCI: dwc: Add dw_pcie_ep_{read,write}_dbi[2] helpers")
Signed-off-by: Aksh Garg <a-garg7@ti.com>
[mani: commit log]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Link: https://patch.msgid.link/20260224083817.916782-2-a-garg7@ti.com
|
|
There are no offsets in `FalconUCodeDescV2` to give the non-secure and
secure IMEM sections start offsets relative to the beginning of the
firmware object.
The start offsets for both sections were set to `0`, but that is
obviously incorrect since two different sections cannot start at the
same offset. Since these offsets were not used by the bootloader, this
doesn't prevent proper function but is incorrect nonetheless.
Fix this by computing the start of the secure IMEM section relatively to
the start of the firmware object and setting it properly. Also add and
improve comments to explain how the values are obtained.
Fixes: dbfb5aa41f16 ("gpu: nova-core: add FalconUCodeDescV2 support")
Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Link: https://patch.msgid.link/20260306-turing_prep-v11-9-8f0042c5d026@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
There are slice row per frame and pic height parameters in DSC that needs
to be configured on every Selective Update in Early Transport mode. Use
helper provided by DSC code to configure these on Selective Update when in
Early Transport mode. Also fill crtc_state->psr2_su_area with full frame
area on full frame update for DSC calculation.
v2: move psr2_su_area under skip_sel_fetch_set_loop label
Bspec: 68927, 71709
Fixes: 467e4e061c44 ("drm/i915/psr: Enable psr2 early transport as possible")
Cc: <stable@vger.kernel.org> # v6.9+
Signed-off-by: Jouni Högander <jouni.hogander@intel.com>
Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Link: https://patch.msgid.link/20260304113011.626542-5-jouni.hogander@intel.com
|
|
There are slice row per frame and pic height configuration in DSC Selective
Update Parameter Set 1 register. Add helper for configuring these.
v2:
- Add WARN_ON_ONCE if vdsc instances per pipe > 2
- instead of checking vdsc instances per pipe being > 1 check == 2
Bspec: 71709
Signed-off-by: Jouni Högander <jouni.hogander@intel.com>
Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Link: https://patch.msgid.link/20260304113011.626542-4-jouni.hogander@intel.com
|
|
Add definitions for DSC_SU_PARAMETER_SET_0_DSC0 and
DSC_SU_PARAMETER_SET_0_DSC1 registers. These are for Selective Update Early
Transport configuration.
Bspec: 71709
Signed-off-by: Jouni Högander <jouni.hogander@intel.com>
Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Link: https://patch.msgid.link/20260304113011.626542-3-jouni.hogander@intel.com
|
|
Currently we are aligning Selective Update area to cover cursor fully if
needed only once. It may happen that cursor is in Selective Update area
after pipe alignment and after that covering cursor plane only
partially. Fix this by looping alignment as long as alignment isn't needed
anymore.
v2:
- do not unecessarily loop if cursor was already fully covered
- rename aligned as su_area_changed
Fixes: 1bff93b8bc27 ("drm/i915/psr: Extend SU area to cover cursor fully if needed")
Cc: <stable@vger.kernel.org> # v6.9+
Signed-off-by: Jouni Högander <jouni.hogander@intel.com>
Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Link: https://patch.msgid.link/20260304113011.626542-2-jouni.hogander@intel.com
|
|
There is no member in `FalconUCodeDescV3` to describe the start offsets
of the IMEM and DMEM section in the firmware object. Add comments to
justify how they are computed.
Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Link: https://patch.msgid.link/20260306-turing_prep-v11-8-8f0042c5d026@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
Clients sometimes need to know whether the mailbox TX queue has room
before posting a new message. Rather than exposing internal queue state
through a struct field, provide a proper accessor function that returns
the number of available slots for a given channel.
This lets clients choose to back off when the queue is full instead of
hitting the -ENOBUFS error path and the misleading "Try increasing
MBOX_TX_QUEUE_LEN" warning.
Tested-by: Tanmay Shah <tanmay.shah@amd.com>
Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
|
|
On Turing and GA100, a new firmware image called the Generic Bootloader
(gen_bootloader) must be used to load FWSEC into Falcon memory. The
driver loads the generic bootloader into Falcon IMEM, passes a
descriptor that points to FWSEC using DMEM, and then boots the generic
bootloader. The bootloader will then load FWSEC into IMEM and boot it.
Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260306-turing_prep-v11-12-8f0042c5d026@nvidia.com
|
|
Turing GPUs need an additional firmware file (the FWSEC generic
bootloader) in order to initialize. Add it to `ModInfoBuilder`.
Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260306-turing_prep-v11-11-8f0042c5d026@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
We will use this method from const context.
Also take `self` by value since it is the size of a primitive type and
implements `Copy`.
Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260306-turing_prep-v11-10-8f0042c5d026@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
This safety check was an assumption based on the firmwares we work with
- it is not based on an actual hardware limitation. Thus, remove it.
Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260306-turing_prep-v11-7-8f0042c5d026@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
Turing and GA100 use programmed I/O (PIO) instead of DMA to upload
firmware images into Falcon memory.
Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260306-turing_prep-v11-6-8f0042c5d026@nvidia.com
|
|
These methods are relevant no matter the loading method used, thus move
them to the common `FalconFirmware` trait.
Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260306-turing_prep-v11-5-8f0042c5d026@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
Not all firmware is necessarily loaded by DMA. Remove the requirement
for `FalconFirmware` to implement `FalconDmaLoadable`, and adapt
`Falcon`'s methods constraints accordingly.
Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260306-turing_prep-v11-4-8f0042c5d026@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
The current `FalconLoadParams` and `FalconLoadTarget` types are fit for
DMA loading, but not so much for PIO loading which will require its own
types. Start by renaming them to something that indicates that they are
indeed DMA-related.
Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260306-turing_prep-v11-3-8f0042c5d026@nvidia.com
[acourbot@nvidia.com: fixup order of import items.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
Falcon memory blocks are 256 bytes in size. This is a hard constant on
all models.
This value was hardcoded, so turn it into a documented constant. It will
also become useful with the PIO loading code.
Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260306-turing_prep-v11-2-8f0042c5d026@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
When DMA was the only loading option for falcon firmwares, we decided to
store them in DMA objects as soon as they were loaded from disk and
patch them in-place to avoid having to do an extra copy.
This decision complicates the PIO loading patch considerably, and
actually does not even stand on its own when put into perspective with
the fact that it requires 8 unsafe statements in the code that wouldn't
exist if we stored the firmware into a `KVVec` and copied it into a DMA
object at the last minute.
The cost of the copy is, as can be expected, imperceptible at runtime.
Thus, switch to a lazy DMA object creation model and simplify our code
a bit. This will also have the nice side-effect of being more fit for
PIO loading.
Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260306-turing_prep-v11-1-8f0042c5d026@nvidia.com
[acourbot@nvidia.com: add TODO item to switch back to a coherent
allocation when it becomes convenient to do so.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
Cross-merge BPF and other fixes after downstream PR.
No conflicts.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|