Age | Commit message (Collapse) | Author | Files | Lines |
|
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine
Pull dmaengine updates from Vinod Koul:
"New controller support and updates to drivers.
New support:
- Qualcomm SM6115 and QCM2290 dmaengine support
- at_xdma support for microchip,sam9x7 controller
Updates:
- idxd updates for wq simplification and ats knob updates
- fsl edma updates for v3 support
- Xilinx AXI4-Stream control support
- Yaml conversion for bcm dma binding"
* tag 'dmaengine-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (53 commits)
dmaengine: fsl-edma: integrate v3 support
dt-bindings: fsl-dma: fsl-edma: add edma3 compatible string
dmaengine: fsl-edma: move tcd into struct fsl_dma_chan
dmaengine: fsl-edma: refactor chan_name setup and safety
dmaengine: fsl-edma: move clearing of register interrupt into setup_irq function
dmaengine: fsl-edma: refactor using devm_clk_get_enabled
dmaengine: fsl-edma: simply ATTR_DSIZE and ATTR_SSIZE by using ffs()
dmaengine: fsl-edma: move common IRQ handler to common.c
dmaengine: fsl-edma: Remove enum edma_version
dmaengine: fsl-edma: transition from bool fields to bitmask flags in drvdata
dmaengine: fsl-edma: clean up EXPORT_SYMBOL_GPL in fsl-edma-common.c
dmaengine: fsl-edma: fix build error when arch is s390
dmaengine: idxd: Fix issues with PRS disable sysfs knob
dmaengine: idxd: Allow ATS disable update only for configurable devices
dmaengine: xilinx_dma: Program interrupt delay timeout
dmaengine: xilinx_dma: Use tasklet_hi_schedule for timing critical usecase
dmaengine: xilinx_dma: Freeup active list based on descriptor completion bit
dmaengine: xilinx_dma: Increase AXI DMA transaction segment count
dmaengine: xilinx_dma: Pass AXI4-Stream control words to dma client
dt-bindings: dmaengine: xilinx_dma: Add xlnx,irq-delay property
...
|
|
Commit c05257b5600b ("dmanegine: idxd: open code the dsa_drv registration")
removed idxd_{un}register_driver() definitions but not the declarations.
Commit 034b3290ba25 ("dmaengine: idxd: create idxd_device sub-driver")
declared idxd_{un}register_idxd_drv() but never implemented it.
Commit 8f47d1a5e545 ("dmaengine: idxd: connect idxd to dmaengine
subsystem") declared idxd_parse_completion_status() but never implemented
it.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230817114135.50264-1-yuehaibing@huawei.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Kernel workqueues were disabled due to flawed use of kernel VA and SVA
API. Now that we have the support for attaching PASID to the device's
default domain and the ability to reserve global PASIDs from SVA APIs,
we can re-enable the kernel work queues and use them under DMA API.
We also use non-privileged access for in-kernel DMA to be consistent
with the IOMMU settings. Consequently, interrupt for user privilege is
enabled for work completion IRQs.
Link: https://lore.kernel.org/linux-iommu/20210511194726.GP1002214@nvidia.com/
Tested-by: Tony Zhu <tony.zhu@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Link: https://lore.kernel.org/r/20230802212427.1497170-9-jacob.jun.pan@linux.intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine
Pull dmaengine updates from Vinod Koul:
"New support:
- Apple admac t8112 device support
- StarFive JH7110 DMA controller
Updates:
- Big pile of idxd updates to support IAA 2.0 device capabilities,
DSA 2.0 Event Log and completion record faulting features and
new DSA operations
- at_xdmac supend & resume updates and driver code cleanup
- k3-udma supend & resume support
- k3-psil thread support for J784s4"
* tag 'dmaengine-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (57 commits)
dmaengine: idxd: add per wq PRS disable
dmaengine: idxd: add pid to exported sysfs attribute for opened file
dmaengine: idxd: expose fault counters to sysfs
dmaengine: idxd: add a device to represent the file opened
dmaengine: idxd: add per file user counters for completion record faults
dmaengine: idxd: process batch descriptor completion record faults
dmaengine: idxd: add descs_completed field for completion record
dmaengine: idxd: process user page faults for completion record
dmaengine: idxd: add idxd_copy_cr() to copy user completion record during page fault handling
dmaengine: idxd: create kmem cache for event log fault items
dmaengine: idxd: add per DSA wq workqueue for processing cr faults
dmanegine: idxd: add debugfs for event log dump
dmaengine: idxd: add interrupt handling for event log
dmaengine: idxd: setup event log configuration
dmaengine: idxd: add event log size sysfs attribute
dmaengine: idxd: make misc interrupt one shot
dt-bindings: dma: snps,dw-axi-dmac: constrain the items of resets for JH7110 dma
dt-bindings: dma: Drop unneeded quotes
dmaengine: at_xdmac: align declaration of ret with the rest of variables
dmaengine: at_xdmac: add a warning message regarding for unpaused channels
...
|
|
Add sysfs knob for per wq Page Request Service disable. This knob
disables PRS support for the specific wq. When this bit is set,
it also overrides the wq's block on fault enabling.
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-17-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Embed a struct device for the user file context in order to export sysfs
attributes related with the opened file. Tie the lifetime of the file
context to the device. The sysfs entry will be added under the char device.
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-14-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Add counters per opened file for the char device in order to keep track how
many completion record faults occurred and how many of those faults failed
the writeback by the driver after attempt to fault in the page. The
counters are managed by xarray that associates the PASID with
struct idxd_user_context.
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-13-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Add event log processing for faulting of user batch descriptor completion
record.
When encountering an event log entry for a page fault on a completion
record, the driver is expected to do the following:
1. If the "first error in batch" bit in event log entry error info is
set, discard any previously recorded errors associated with the
"batch identifier".
2. Fix the page fault according to the fault address in the event log. If
successful, write the completion record to the fault address in user space.
3. If an error is encountered while writing the completion record and it is
associated to a descriptor in the batch, the driver associates the error
with the batch identifier of the event log entry and tracks it until the
event log entry for the corresponding batch desc is encountered.
While processing an event log entry for a batch descriptor with error
indicating that one or more descs in the batch had event log entries,
the driver will do the following before writing the batch completion
record:
1. If the status field of the completion record is 0x1, the driver will
change it to error code 0x5 (one or more operations in batch completed
with status not successful) and changes the result field to 1.
2. If the status is error code 0x6 (page fault on batch descriptor list
address), change the result field to 1.
3. If status is any other value, the completion record is not changed.
4. Clear the recorded error in preparation for next batch with same batch
identifier.
The result field is for user software to determine whether to set the
"Batch Error" flag bit in the descriptor for continuation of partial
batch descriptor completion. See DSA spec 2.0 for additional information.
If no error has been recorded for the batch, the batch completion record is
written to user space as is.
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-12-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
DSA supports page fault handling through PRS. However, the DMA engine
that's processing the descriptor is blocked until the PRS response is
received. Other workqueues sharing the engine are also blocked.
Page fault handing by the driver with PRS disabled can be used to
mitigate the stalling.
With PRS disabled while ATS remain enabled, DSA handles page faults on
a completion record by reporting an event in the event log. In this
instance, the descriptor is completed and the event log contains the
completion record address and the contents of the completion record. Add
support to the event log handling code to fault in the completion record
and copy the content of the completion record to user memory.
A bitmap is introduced to keep track of discarded event log entries. When
the user process initiates ->release() of the char device, it no longer is
interested in any remaining event log entries tied to the relevant wq and
PASID. The driver will mark the event log entry index in the bitmap. Upon
encountering the entries during processing, the event log handler will just
clear the bitmap bit and skip the entry rather than attempt to process the
event log entry.
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-10-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
page fault handling
Define idxd_copy_cr() to copy completion record to fault address in
user address that is found by work queue (wq) and PASID.
It will be used to write the user's completion record that the hardware
device is not able to write due to user completion record page fault.
An xarray is added to associate the PASID and mm with the
struct idxd_user_context so mm can be found by PASID and wq.
It is called when handling the completion record fault in a kernel thread
context. Switch to the mm using kthread_use_vm() and copy the
completion record to the mm via copy_to_user(). Once the copy is
completed, switch back to the current mm using kthread_unuse_mm().
Suggested-by: Christoph Hellwig <hch@infradead.org>
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Suggested-by: Tony Luck <tony.luck@intel.com>
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-9-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Add a kmem cache per device for allocating event log fault context. The
context allows an event log entry to be copied and passed to a software
workqueue to be processed. Due to each device can have different sized
event log entry depending on device type, it's not possible to have a
global kmem cache.
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-8-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Add a workqueue for user submitted completion record fault processing.
The workqueue creation and destruction lifetime will be tied to the user
sub-driver since it will only be used when the wq is a user type.
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-7-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Add debugfs entry to dump the content of the event log for debugging. The
function will dump all non-zero entries in the event log. It will note
which entries are processed and which entries are still pending processing
at the time of the dump. The entries may not always be in chronological
order due to the log is a circular buffer.
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-6-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Add setup of event log feature for supported device. Event log addresses
error reporting that was lacking in gen 1 DSA devices where a second error
event does not get reported when a first event is pending software
handling. The event log allows a circular buffer that the device can push
error events to. It is up to the user to create a large enough event log
ring in order to capture the expected events. The evl size can be set in
the device sysfs attribute. By default 64 entries are supported as minimal
when event log is enabled.
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-4-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Add support for changing of the event log size. Event log is a
feature added to DSA 2.0 hardware to improve error reporting.
It supersedes the SWERROR register on DSA 1.0 hardware and hope
to prevent loss of reported errors.
The error log size determines how many error entries supported for
the device. It can be configured by the user via sysfs attribute.
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-3-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Add IAA (IAX) capability mask sysfs attribute to expose to applications.
The mask provides application knowledge of what capabilities this IAA
device supports. This mask is available for IAA 2.0 device or later.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230303213732.3357494-3-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
SWERROR register is 4 64bit wide registers. Currently the sysfs attribute
just outputs 4 64bit hex integers. Convert to output with %*pb format
specifier.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230303213732.3357494-2-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
This has no use anymore, delete it all.
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Link: https://lore.kernel.org/r/20230322200803.869130-8-jacob.jun.pan@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
INVALID_IOASID and IOMMU_PASID_INVALID are duplicated. Rename
INVALID_IOASID and consolidate since we are moving away from IOASID
infrastructure.
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Link: https://lore.kernel.org/r/20230322200803.869130-7-jacob.jun.pan@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
>From Intel IAA spec [1], Intel IAA does not support batch processing.
Two batch related default values for IAA are incorrect in current code:
(1) The max batch size of device is set during device initialization,
that indicates batch is supported. It should be always 0 on IAA.
(2) The max batch size of work queue is set to WQ_DEFAULT_MAX_BATCH (32)
as the default value regardless of Intel DSA or IAA device during
work queue setup and cleanup. It should be always 0 on IAA.
Fix the issues by setting the max batch size of device and max batch
size of work queue to 0 on IAA device, that means batch is not
supported.
[1]: https://cdrdv2.intel.com/v1/dl/getContent/721858
Fixes: 23084545dbb0 ("dmaengine: idxd: set max_xfer and max_batch for RO device")
Fixes: 92452a72ebdf ("dmaengine: idxd: set defaults for wq configs")
Fixes: bfe1d56091c1 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20220930201528.18621-2-xiaochen.shen@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Add sysfs knob to allow control of the number of batch descriptors that can
be concurrently processed by an engine in the group as a fraction of the
Maximum Work Descriptors in Progress value specfied in ENGCAP register.
This control knob is part of toggle for QoS control.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20220917161222.2835172-6-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Add sysfs knob to allow control of the number of work descriptors that can
be concurrently processed by an engine in the group as a fraction of the
Maximum Work Descriptors in Progress value specified in ENGCAP register.
This control knob is part of toggle for QoS control.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20220917161222.2835172-5-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
DSA 2.0 add the capability of configuring DMA ops on a per workqueue basis.
This means that certain ops can be disabled by the system administrator for
certain wq. By default, all ops are available. A bitmap is used to store
the ops due to total op size of 256 bits and it is more convenient to use a
range list to specify which bits are enabled.
One of the usage to support this is for VM migration between different
iteration of devices. The newer ops are disabled in order to allow guest to
migrate to a host that only support older ops. Another usage is to
restrict the WQ to certain operations for QoS of performance.
A sysfs of ops_config attribute is added per wq. It is only usable when the
ops_config bit is set under WQ_CAP register. This means that this attribute
will return -EOPNOTSUPP on DSA 1.x devices. The expected input is a range
list for the bits per operation the WQ supports.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20220917161222.2835172-4-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
To make input and output consistent and prepping for the per WQ operation
configuration support, change the output of opcap display to match the
input that is expected by bitmap_parse() helper function. The output will
be a bitmap with field width as the number of bits using the %*pb format
specifier for printk() family.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20220917161222.2835172-3-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Make wq attributes access consistent. Convert ats_dis to wq flag
WQ_FLAG_ATS_DISABLE.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20220917161222.2835172-2-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Now that idxd_wq_disable_cleanup() sets the workqueue state to
IDXD_WQ_DISABLED, use a bitmap to track which workqueues have been
enabled. This will then be used to determine which workqueues
should be re-enabled when attempting a software reset to recover
from a device halt state.
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20220928154856.623545-3-jsnitsel@redhat.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Since idxd_register/unregister_dma_channel() are only called locally, make
them static.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/165187583222.3287435.12882651040433040246.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
The idxd driver always gated the pasid enabling under a single knob and
this assumption is incorrect. The pasid used for kernel operation can be
independently toggled and has no dependency on the user pasid (and vice
versa). Split the two so they are independent "enabled" flags.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/165231431746.986466.5666862038354800551.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Move the core driver operations from wq driver to the drv_enable_wq() and
drv_disable_wq() functions. The move should reduce the wq driver's
knowledge of the core driver operations and prevent code confusion for
future wq drivers.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/165047301643.3841827.11222723219862233060.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
DSA spec v1.2 has changed the term of "bandwidth tokens" to "read buffers"
in order to make the concept clearer. Deprecate bandwidth token
naming in the driver and convert to read buffers in order to match with
the spec and reduce confusion when reading the spec.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163951338932.2988321.6162640806935567317.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Change the driver where WQ interrupt is requested only when wq is being
enabled. This new scheme set things up so that request_threaded_irq() is
only called when a kernel wq type is being enabled. This also sets up for
future interrupt request where different interrupt handler such as wq
occupancy interrupt can be setup instead of the wq completion interrupt.
Not calling request_irq() until the WQ actually needs an irq also prevents
wasting of CPU irq vectors on x86 systems, which is a limited resource.
idxd_flush_pending_descs() is moved to device.c since descriptor flushing
is now part of wq disable rather than shutdown().
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163942149487.2412839.6691222855803875848.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
With irq_entry already being associated with the wq in a 1:1 relationship,
embed the irq_entry in the idxd_wq struct and remove back pointers for
idxe_wq and idxd_device. In the process of this work, clean up the interrupt
handle assignment so that there's no decision to be made during submit
call on where interrupt handle value comes from. Set the interrupt handle
during irq request initialization time.
irq_entry 0 is designated as special and is tied to the device itself.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163942148362.2412839.12055447853311267866.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Add a sysfs knob to allow tuning of retries for the kernel ENQCMDS
descriptor submission. While on host, it is not as likely that ENQCMDS
return busy during normal operations due to the driver controlling the
number of descriptors allocated for submission. However, when the driver is
operating as a guest driver, the chance of retry goes up significantly due
to sharing a wq with multiple VMs. A default value is provided with the
system admin being able to tune the value on a per WQ basis.
Suggested-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163820629464.2702134.7577370098568297574.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Add default values for wq size, max_xfer_size and max_batch_size. These
values should provide a general guidance for the wq configuration when
the user does not specify any specific values.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163528473483.3926048.7950067926287180976.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
"Interrupt handle revoked" is an event that happens when the driver is
running on a guest kernel and the VM is migrated to a new machine.
The device will trigger an interrupt that signals to the guest driver
that the interrupt handles need to be replaced.
The misc irq thread function calls a helper function to handle the
event. The function uses the WQ percpu_ref to quiesce the kernel
submissions. It then replaces the interrupt handles by requesting
interrupt handle command for each I/O MSIX vector. Once the handle is
updated, the driver will unblock the submission path to allow new
submissions.
The submitter will attempt to acquire a percpu_ref before submission. When
the request fails, it will wait on the wq_resurrect 'completion'.
The driver does anticipate the possibility of descriptors being submitted
before the WQ percpu_ref is killed. If a descriptor has already been
submitted, it will return with incorrect interrupt handle status. The
descriptor will be re-submitted with the new interrupt handle on the
completion path. For descriptors with incorrect interrupt handles,
completion interrupt won't be triggered.
At the completion of the interrupt handle refresh, the handling function
will call idxd_int_handle_refresh_drain() to issue drain descriptors to
each of the wq with associated interrupt handle. The drain descriptor will have
interrupt request set but without completion record. This will ensure all
descriptors with incorrect interrupt completion handle get drained and
a completion interrupt is triggered for the guest driver to process them.
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Co-Developed-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163528420189.3925689.18212568593220415551.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Handle a descriptor that has been marked with invalid interrupt handle
error in status. Create a work item that will resubmit the descriptor. This
typically happens when the driver has handled the revoke interrupt handle
event and has a new interrupt handle.
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163528419601.3925689.4166517602890523193.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Add a locked version of idxd_quiesce() call so that the quiesce can be
called with a lock in situations where the lock is not held by the caller.
In the driver probe/remove path, the lock is already held, so the raw
version can be called w/o locking.
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163528418980.3925689.5841907054957931211.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Attach int_handle to irq_entry. This removes the separate management of int
handles and reduces the confusion of interating through int handles that is
off by 1 count.
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163528417065.3925689.11505755433684476288.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Refactor the completion function to allow skipping of descriptor freeing on
the submission failure path. This completely removes descriptor freeing
from the submit failure path and leave the responsibility to the caller.
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163528416222.3925689.12859769271667814762.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
According to core-api/dma-api-howto.rst, the address from
dma_alloc_coherent is gauranteed to align to the smallest PAGE_SIZE order.
That supercedes the 64B/32B alignment requirement of the completion record.
Remove alignment adjustment code.
Tested-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163517396063.3484297.7494385225280705372.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Enabling device and wq returns standard errno and that does not provide
enough details to indicate what exactly failed. The hardware command status
is only 8bits. Expand the command status to 32bits and use the upper 16
bits to define software errors to provide more details on the exact
failure. Bit 31 will be used to indicate the error is software set as the
driver is using some of the spec defined hardware error as well.
Cc: Ramesh Thomas <ramesh.thomas@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162681373579.1968485.5891788397526827892.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
The device submission portal is on a 4k page and any of those 64bit aligned
address on the page can be used for descriptor submission. By rotating the
offset through the 4k range and prevent successive writes to the same MMIO
address, performance improvement is observed through testing.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162681372446.1968485.10634280461681015569.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Set GRPCFG traffic class to value of 1 for best performance on current
generation of accelerators. Also add override option to allow experimentation.
Sysfs knobs are disabled for DSA/IAX gen1 devices.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162681373005.1968485.3761065664382799202.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
The original architecture of /sys/bus/dsa invented a scheme whereby
a single entry in the list of bus drivers, /sys/bus/drivers/dsa,
handled all device types and internally routed them to different
different drivers. Those internal drivers were invisible to
userspace.
With the idxd driver transitioned to a proper bus device-driver model,
the legacy behavior needs to be preserved due to it being exposed to
user space via sysfs. Create a compat driver to provide the legacy
behavior for /sys/bus/dsa/drivers/dsa. This should satisfy user
tool accel-config v3.2 or ealier where this behavior is expected.
If the distro has a newer accel-config then the legacy mode does
not need to be enabled.
When the compat driver binds the device (i.e. dsa0) to the dsa driver,
it will be bound to the new idxd_drv. The wq device (i.e. wq0.0) will
be bound to either the dmaengine_drv or the user_drv. The dsa_drv
becomes a routing mechansim for the new drivers. It will not support
additional external drivers that are implemented later.
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162637468705.744545.4399080971745974435.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
The original architecture of /sys/bus/dsa invented a scheme whereby a
single entry in the list of bus drivers, /sys/bus/drivers/dsa, handled
all device types and internally routed them to different drivers.
Those internal drivers were invisible to userspace. Now, as
/sys/bus/dsa wants to grow support for alternate drivers for a given
device, for example vfio-mdev instead of kernel-internal-dmaengine, a
proper bus device-driver model is needed. The first step in that process
is separating the existing omnibus/implicit "dsa" driver into proper
individual drivers registered on /sys/bus/dsa. Establish the
idxd_user_drv driver that controls the enabling and disabling of the
wq and also register and unregister a char device to allow user space
to mmap the descriptor submission portal.
The cdev related bits are moved to the cdev driver probe/remove and out of
the drv_enabe/disable_wq() calls. These bits are exclusive to the cdev
operation and not part of the generic enable/disable of the wq device.
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162637467578.744545.10203997610072341376.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
The original architecture of /sys/bus/dsa invented a scheme whereby a
single entry in the list of bus drivers, /sys/bus/drivers/dsa, handled
all device types and internally routed them to different drivers.
Those internal drivers were invisible to userspace. Now, as
/sys/bus/dsa wants to grow support for alternate drivers for a given
device, for example vfio-mdev instead of kernel-internal-dmaengine, a
proper bus device-driver model is needed. The first step in that process
is separating the existing omnibus/implicit "dsa" driver into proper
individual drivers registered on /sys/bus/dsa. Establish the
idxd_dmaengine_drv driver that controls the enabling and disabling of the
wq and also register and unregister the dma channel.
idxd_wq_alloc_resources() and idxd_wq_free_resources() also get moved to
the dmaengine driver. The resources (dma descriptors allocation and setup)
are only used by the dmaengine driver and should only happen when it loads.
The char dev driver (cdev) related bits are left in the __drv_enable_wq()
and __drv_disable_wq() calls to be moved when we split out the char dev
driver just like how the dmaengine driver is split out.
WQ autoload support is not expected currently. With the amount of
configuration needed for the device, the wq is always expected to
be enabled by a tool (or via sysfs) rather than auto enabled at driver
load.
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162637467033.744545.12330636655625405394.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
The original architecture of /sys/bus/dsa invented a scheme whereby a
single entry in the list of bus drivers, /sys/bus/drivers/dsa, handled
all device types and internally routed them to different drivers.
Those internal drivers were invisible to userspace. Now, as
/sys/bus/dsa wants to grow support for alternate drivers for a given
device, for example vfio-mdev instead of kernel-internal-dmaengine, a
proper bus device-driver model is needed. The first step in that process
is separating the existing omnibus/implicit "dsa" driver into proper
individual drivers registered on /sys/bus/dsa. Establish the idxd_drv
driver that control the enabling and disabling of the accelerator device.
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162637466439.744545.15210886092627144577.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Add an array of support device types to the idxd_device_driver
definition in order to enable simple matching of device type to a
given driver. The deprecated / omnibus dsa_drv driver specifies
IDXD_DEV_NONE as its only role is to service legacy userspace (old
accel-config) directed bind requests and route them to them the proper
driver. It need not attach to a device when the bus is autoprobed. The
accel-config tooling is being updated to drop its dependency on this
deprecated bind scheme.
Reviewed-by: Dan Willliams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162637465882.744545.17456174666211577867.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Don't need a wrapper to register the driver. Just do it directly.
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162637465319.744545.16325178432532362906.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Move the code related to a ->remove() function for the idxd
'struct device' to device.c to prep for the idxd device
sub-driver in device.c.
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162637464768.744545.15797285510999151668.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|