path: root/drivers/dma/idxd/irq.c

2022-09-29  dmaengine: idxd: Remove unused struct idxd_fault  (Yuan Can, 1 file, -6/+0)

Since fault processing code has been removed, struct idxd_fault is not used
any more and can be removed as well.

Signed-off-by: Yuan Can <yuancan@huawei.com>
Link: https://lore.kernel.org/r/20220928014747.106808-1-yuancan@huawei.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

2022-09-29  dmaengine: idxd: track enabled workqueues in bitmap  (Jerry Snitselaar, 1 file, -2/+3)

Now that idxd_wq_disable_cleanup() sets the workqueue state to
IDXD_WQ_DISABLED, use a bitmap to track which workqueues have been enabled.
This will then be used to determine which workqueues should be re-enabled
when attempting a software reset to recover from a device halt state.

Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20220928154856.623545-3-jsnitsel@redhat.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
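
A minimal sketch of the bitmap-tracking pattern described here; the type,
field, and helper names (example_device, wq_enable_map, MAX_WQS) are
illustrative, not the actual idxd code:

    #include <linux/bitmap.h>
    #include <linux/bitops.h>

    #define MAX_WQS 128     /* illustrative limit */

    struct example_device {
            DECLARE_BITMAP(wq_enable_map, MAX_WQS);
            /* ... */
    };

    /* remember that a wq was enabled */
    static void example_wq_mark_enabled(struct example_device *dev, int wq_id)
    {
            set_bit(wq_id, dev->wq_enable_map);
    }

    /* forget it again in the disable/cleanup path */
    static void example_wq_mark_disabled(struct example_device *dev, int wq_id)
    {
            clear_bit(wq_id, dev->wq_enable_map);
    }

    /* after a software reset, re-enable only what was enabled before */
    static void example_recover(struct example_device *dev)
    {
            int i;

            for_each_set_bit(i, dev->wq_enable_map, MAX_WQS)
                    /* re-enable wq i (not shown) */;
    }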

2022-09-04  dmaengine: idxd: avoid deadlock in process_misc_interrupts()  (Jerry Snitselaar, 1 file, -2/+0)

idxd_device_clear_state() now grabs the idxd->dev_lock itself, so don't grab
the lock prior to calling it. This was seen in testing after a DMAR fault
occurred on a system, resulting in lockup stack traces.

Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Vinod Koul <vkoul@kernel.org>
Cc: dmaengine@vger.kernel.org
Fixes: cf4ac3fef338 ("dmaengine: idxd: fix lockdep warning on device driver removal")
Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20220823163709.2102468-1-jsnitsel@redhat.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

2022-01-05  dmaengine: idxd: change MSIX allocation based on per wq activation  (Dave Jiang, 1 file, -0/+3)

Change the driver so that the WQ interrupt is requested only when the wq is
being enabled. This new scheme sets things up so that request_threaded_irq()
is only called when a kernel wq type is being enabled. It also prepares for
future interrupt requests where a different interrupt handler, such as a wq
occupancy interrupt, can be set up instead of the wq completion interrupt.

Not calling request_irq() until the WQ actually needs an irq also avoids
wasting CPU irq vectors on x86 systems, where they are a limited resource.

idxd_flush_pending_descs() is moved to device.c since descriptor flushing is
now part of wq disable rather than shutdown().

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163942149487.2412839.6691222855803875848.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
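
A rough sketch of requesting the wq interrupt only at wq enable time and
releasing it at disable time; the function and structure names are
illustrative, not the driver's:

    #include <linux/interrupt.h>

    struct example_wq;      /* stand-in for the driver's wq structure */

    static int example_wq_request_irq(struct example_wq *wq, int irq,
                                      irq_handler_t thread_fn)
    {
            /*
             * IRQF_ONESHOT is required when no hard handler is supplied.
             * The vector is consumed only while the wq is actually enabled.
             */
            return request_threaded_irq(irq, NULL, thread_fn, IRQF_ONESHOT,
                                        "example-wq", wq);
    }

    static void example_wq_free_irq(struct example_wq *wq, int irq)
    {
            free_irq(irq, wq);
    }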

2022-01-05  dmaengine: idxd: embed irq_entry in idxd_wq struct  (Dave Jiang, 1 file, -5/+5)

With irq_entry already being associated with the wq in a 1:1 relationship,
embed the irq_entry in the idxd_wq struct and remove the back pointers for
idxd_wq and idxd_device. In the process, clean up the interrupt handle
assignment so that there is no decision to be made during the submit call on
where the interrupt handle value comes from. Set the interrupt handle during
irq request initialization time. irq_entry 0 is designated as special and is
tied to the device itself.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163942148362.2412839.12055447853311267866.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
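
Illustrative sketch of the embed-plus-container_of() pattern this enables;
the names are made up for the example:

    #include <linux/kernel.h>

    struct example_irq_entry {
            int vector;
            /* ... */
    };

    struct example_wq {
            struct example_irq_entry ie;    /* embedded, 1:1 with the wq */
            /* ... */
    };

    /* recover the owning wq from the irq_entry without a back pointer */
    static inline struct example_wq *ie_to_wq(struct example_irq_entry *ie)
    {
            return container_of(ie, struct example_wq, ie);
    }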

2022-01-05  Merge branch 'fixes' into next  (Vinod Koul, 1 file, -1/+1)

We have a conflict in the idxd driver between 'fixes' and 'next', and there
are patches dependent on this, so merge the 'fixes' branch into next.

2021-12-17  dmaengine: idxd: add knob for enqcmds retries  (Dave Jiang, 1 file, -1/+1)

Add a sysfs knob to allow tuning of retries for the kernel ENQCMDS descriptor
submission. On the host, ENQCMDS is not as likely to return busy during
normal operation because the driver controls the number of descriptors
allocated for submission. However, when the driver operates as a guest
driver, the chance of a retry goes up significantly due to sharing a wq with
multiple VMs. A default value is provided, and the system admin can tune the
value on a per-WQ basis.

Suggested-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163820629464.2702134.7577370098568297574.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
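
A sketch of what a bounded retry loop looks like, assuming the x86 enqcmds()
helper (which returns -EAGAIN when the command is not accepted); max_retries
stands in for the sysfs knob and is not the driver's actual field name:

    #include <linux/errno.h>
    #include <linux/io.h>
    #include <asm/special_insns.h>  /* enqcmds(), x86 only */

    static int example_submit(void __iomem *portal, const void *desc,
                              unsigned int max_retries)
    {
            unsigned int tried = 0;
            int rc;

            do {
                    rc = enqcmds(portal, desc);     /* -EAGAIN: wq busy, retry */
                    if (rc != -EAGAIN)
                            break;
            } while (tried++ < max_retries);

            return rc;
    }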

2021-11-22  dmaengine: idxd: fix calling wq quiesce inside spinlock  (Dave Jiang, 1 file, -1/+1)

Dan reports that smatch has found idxd_wq_quiesce() being called inside the
idxd->dev_lock. idxd_wq_quiesce() calls wait_for_completion() and therefore
can sleep. Move the call outside of the spinlock, as it does not need the
device lock.

Fixes: 5b0c68c473a1 ("dmaengine: idxd: support reporting of halt interrupt")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163716858508.1721911.15051495873516709923.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
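
The general shape of the fix, sketched with the names from the message above
(surrounding code omitted):

    /*
     * Before (broken): idxd_wq_quiesce(), which may sleep in
     * wait_for_completion(), was called between spin_lock(&idxd->dev_lock)
     * and spin_unlock(&idxd->dev_lock).
     *
     * After: do the sleeping work first, then take the lock only for the
     * state updates that actually need it.
     */
    idxd_wq_quiesce(wq);

    spin_lock(&idxd->dev_lock);
    /* ... update device state ... */
    spin_unlock(&idxd->dev_lock);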

2021-11-22  dmaengine: idxd: handle interrupt handle revoked event  (Dave Jiang, 1 file, -0/+137)

"Interrupt handle revoked" is an event that happens when the driver is
running on a guest kernel and the VM is migrated to a new machine. The device
triggers an interrupt that signals to the guest driver that the interrupt
handles need to be replaced.

The misc irq thread function calls a helper function to handle the event. The
helper uses the WQ percpu_ref to quiesce the kernel submissions. It then
replaces the interrupt handles by requesting an interrupt handle command for
each I/O MSIX vector. Once the handles are updated, the driver unblocks the
submission path to allow new submissions. A submitter attempts to acquire a
percpu_ref before submission; when that fails, it waits on the wq_resurrect
completion.

The driver does anticipate the possibility of descriptors being submitted
before the WQ percpu_ref is killed. If a descriptor has already been
submitted, it will return with an incorrect interrupt handle status and will
be re-submitted with the new interrupt handle on the completion path. For
descriptors with incorrect interrupt handles, no completion interrupt is
triggered.

At the completion of the interrupt handle refresh, the handling function
calls idxd_int_handle_refresh_drain() to issue drain descriptors to each wq
with an associated interrupt handle. The drain descriptor has the interrupt
request set but no completion record. This ensures that all descriptors with
incorrect interrupt completion handles get drained and a completion interrupt
is triggered for the guest driver to process them.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Co-developed-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163528420189.3925689.18212568593220415551.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

2021-11-22  dmaengine: idxd: handle invalid interrupt handle descriptors  (Dave Jiang, 1 file, -0/+50)

Handle a descriptor whose status has been marked with an invalid interrupt
handle error. Create a work item that will resubmit the descriptor. This
typically happens when the driver has handled the revoke interrupt handle
event and has a new interrupt handle.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163528419601.3925689.4166517602890523193.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
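
A generic sketch of the deferred-resubmit idea, with a work item carried in
the descriptor; names are illustrative, not the driver's:

    #include <linux/workqueue.h>
    #include <linux/kernel.h>

    struct example_desc {
            struct work_struct resubmit_work;
            /* ... descriptor fields ... */
    };

    static void example_resubmit_fn(struct work_struct *work)
    {
            struct example_desc *desc =
                    container_of(work, struct example_desc, resubmit_work);

            (void)desc;     /* re-issue the descriptor with the new handle here */
    }

    /* called from the completion path on an invalid-handle status */
    static void example_defer_resubmit(struct example_desc *desc)
    {
            INIT_WORK(&desc->resubmit_work, example_resubmit_fn);
            schedule_work(&desc->resubmit_work);
    }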

2021-11-22  dmaengine: idxd: add helper for per interrupt handle drain  (Dave Jiang, 1 file, -0/+39)

The helper is called at the completion of the interrupt handle refresh event.
It issues drain descriptors to each of the wqs with an associated interrupt
handle. The drain descriptor has the interrupt request set but no completion
record. This ensures that all descriptors with incorrect interrupt completion
handles get drained and a completion interrupt is triggered for the guest
driver to process them.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163528418315.3925689.7944718440052849626.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

2021-11-22  dmaengine: idxd: rework descriptor free path on failure  (Dave Jiang, 1 file, -4/+4)

Refactor the completion function to allow skipping of descriptor freeing on
the submission failure path. This completely removes descriptor freeing from
the submit failure path and leaves the responsibility with the caller.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163528416222.3925689.12859769271667814762.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

2021-10-25  dmaengine: idxd: add halt interrupt support  (Dave Jiang, 1 file, -0/+5)

Add halt interrupt support. Given that the misc interrupt handler already
checks the halt state, the driver just needs to run the halt handling code
when receiving the halt interrupt.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163114224352.846654.14334468363464318828.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

2021-10-25  dmaengine: idxd: Use list_move_tail instead of list_del/list_add_tail  (Bixuan Cui, 1 file, -2/+1)

Use list_move_tail() instead of list_del() + list_add_tail().

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Bixuan Cui <cuibixuan@huawei.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20210908092826.67765-1-cuibixuan@huawei.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
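
This is the standard list API simplification; a before/after fragment with
illustrative variable names:

    /* before */
    list_del(&desc->list);
    list_add_tail(&desc->list, &flist);

    /* after: same effect in one call */
    list_move_tail(&desc->list, &flist);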

2021-08-29  dmaengine: idxd: remove interrupt disable for dev_lock  (Dave Jiang, 1 file, -4/+4)

The spinlock is not used in hard interrupt context, so there is no need to
disable irqs when acquiring the lock. The interrupt thread handler is also
not in bottom half context, so disabling of bottom halves can be removed as
well. Convert all dev_lock acquisitions to plain spin_lock() calls.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162984026772.1939166.11504067782824765879.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
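
Sketch of the conversion; the lock name comes from the message above and the
surrounding code is omitted:

    unsigned long flags;

    /* before: interrupts disabled around the critical section */
    spin_lock_irqsave(&idxd->dev_lock, flags);
    /* ... */
    spin_unlock_irqrestore(&idxd->dev_lock, flags);

    /* after: the lock is never taken in hard-irq or bottom-half context,
     * so a plain spin_lock() is sufficient */
    spin_lock(&idxd->dev_lock);
    /* ... */
    spin_unlock(&idxd->dev_lock);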

2021-08-06  dmaengine: idxd: remove interrupt flag for completion list spinlock  (Dave Jiang, 1 file, -7/+5)

The list lock is never acquired in interrupt context, so there is no need to
disable interrupts. Remove the interrupt flags from the lock operations.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162826417450.3454650.3733188117742416238.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

2021-08-06  dmaengine: idxd: make I/O interrupt handler one shot  (Dave Jiang, 1 file, -51/+8)

The interrupt thread handler currently loops forever to process outstanding
completions. This causes either an "irq X: nobody cared" kernel splat or
makes the NMI watchdog kick in due to running too long in the function. The
irq thread handler is expected to run again after exiting if interrupts fire
while the thread handler is running, so the handler code can process all the
completed I/O in a single pass and exit without losing any follow-on
completions.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162802977005.3084234.11836261157026497585.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
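
A minimal sketch of the one-shot threaded handler shape described here; the
handler names are illustrative and the completion processing itself is
elided:

    #include <linux/interrupt.h>

    /* hard handler: just wake the thread */
    static irqreturn_t example_irq(int vec, void *data)
    {
            return IRQ_WAKE_THREAD;
    }

    /*
     * Thread handler: process whatever is complete right now and return.
     * If the device raises another interrupt while this runs, the irq core
     * re-runs the thread after it exits, so nothing is lost and the handler
     * never loops forever.
     */
    static irqreturn_t example_irq_thread(int vec, void *data)
    {
            /* walk the pending/work lists once and complete finished I/O (not shown) */
            return IRQ_HANDLED;
    }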

2021-08-06  dmaengine: idxd: Remove unused status variable in irq_process_work_list()  (Nathan Chancellor, 1 file, -2/+0)

status is no longer used within this block:

    drivers/dma/idxd/irq.c:255:6: warning: unused variable 'status' [-Wunused-variable]
            u8 status = desc->completion->status & DSA_COMP_STATUS_MASK;
               ^
    1 warning generated.

Fixes: b60bb6e2bfc1 ("dmaengine: idxd: fix abort status check")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20210802175820.3153920-1-nathan@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>

2021-07-29  dmaengine: idxd: fix abort status check  (Dave Jiang, 1 file, -2/+10)

Coverity static analysis of linux-next found an issue: the check
(status == IDXD_COMP_DESC_ABORT) is always false, since status was previously
masked with 0x7f and IDXD_COMP_DESC_ABORT is 0xff.

Fixes: 6b4b87f2c31a ("dmaengine: idxd: fix submission race window")
Reported-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162698465160.3560828.18173186265683415384.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
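
In outline, using the identifiers from this and the previous entry (the
surrounding code is simplified to a fragment):

    /* buggy shape: the status byte is masked before the abort check */
    u8 status = desc->completion->status & DSA_COMP_STATUS_MASK;   /* 0x7f mask */

    if (status == IDXD_COMP_DESC_ABORT)     /* 0xff: can never match the masked value */
            /* ... */;

    /* fixed shape: test the unmasked status for the abort marker first,
     * then mask only for the normal status decode */
    if (desc->completion->status == IDXD_COMP_DESC_ABORT)
            /* handle the aborted descriptor */;
    status = desc->completion->status & DSA_COMP_STATUS_MASK;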

2021-07-21  dmaengine: idxd: remove fault processing code  (Dave Jiang, 1 file, -91/+4)

Kernel memory is pinned and will not cause faults. Since the driver does not
support interrupts for user descriptors, no fault errors are expected to come
through the misc interrupt. Remove the dead code.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162630502789.631986.10591230961790023856.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

2021-07-21  dmaengine: idxd: add 'struct idxd_dev' as wrapper for conf_dev  (Dave Jiang, 1 file, -1/+1)

Add a 'struct idxd_dev' that wraps the 'struct device' for the idxd conf_dev
that registers with the dsa bus. This is introduced in order to deal with the
multiple different types of 'devices' registered on the dsa_bus, so the
compat driver can route them to the correct driver to attach. The bind() call
can now determine the type of device and then do the appropriate driver
matching.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162637460065.744545.584492831446090984.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

2021-07-21  Merge branch 'fixes' into next  (Vinod Koul, 1 file, -9/+18)

Signed-off-by: Vinod Koul <vkoul@kernel.org>

2021-07-20  dmaengine: idxd: fix submission race window  (Dave Jiang, 1 file, -9/+18)

Konstantin observed that when descriptors are submitted, the descriptor is
added to the pending list after the submission. This creates a race window
with the slight possibility that the descriptor can complete before it gets
added to the pending list, which would cause the completion handler to miss
processing the descriptor. To address the issue, the addition of the
descriptor to the pending list must be done before it gets submitted to the
hardware.

However, submitting to a swq with the ENQCMDS instruction can fail when the
wq is either full or no longer "active". With descriptor allocation being the
gate on wq capacity, it is not possible to hit a retry with ENQCMDS
submission to the swq. The only possible failure is when the wq is no longer
"active" due to a hw error, and in that case we are moving towards taking
down the portal anyway. Given that this is a rare condition and there is no
longer a concern over I/O performance, the driver can walk the completion
lists in order to retrieve and abort the descriptor.

The error path sets the descriptor to aborted status. It takes the work list
lock to prevent further processing of the work list, and does a delete_all on
the pending llist to retrieve all descriptors on it; the delete_all action
does not require a lock. It then walks through the acquired llist to find the
aborted descriptor while adding all remaining descriptors to the work list,
since it holds the lock. If it does not find the aborted descriptor on the
llist, it walks through the work list. If it still does not find the
descriptor, then the interrupt handler has removed the desc from the llist
but is pending on the work list lock, and will process it once the error path
releases the lock.

Fixes: eb15e7154fbf ("dmaengine: idxd: add interrupt handle request and release support")
Reported-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162628855747.360485.10101925573082466530.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
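
A sketch of the reordered submission path; iosubmit_cmds512() and enqcmds()
are the x86 helpers for movdir64b and ENQCMDS, while the llist/portal names
and the dedicated_wq flag are illustrative:

    /* put the descriptor where the interrupt handler can find it *before*
     * ringing the hardware, so a completion racing with the return from
     * the submit instruction is not missed */
    llist_add(&desc->llnode, &ie->pending_llist);

    if (dedicated_wq)
            iosubmit_cmds512(portal, desc->hw, 1);  /* movdir64b: cannot fail */
    else if (enqcmds(portal, desc->hw))
            /* ENQCMDS not accepted: mark the descriptor aborted and pull it
             * back off the pending/work lists under the list lock (not shown) */;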

2021-07-14  dmaengine: idxd: cleanup all device related bits after disabling device  (Dave Jiang, 1 file, -2/+2)

The previous state cleanup patch only performed wq state cleanups. This does
not go far enough: when the device is disabled or reset, the state for groups
and engines must also be cleaned up. Add the additional state cleanup beyond
wq cleanup, and tie those cleanups directly to device disable and reset, and
to wq disable and reset.

Fixes: da32b28c95a7 ("dmaengine: idxd: cleanup workqueue config after disabling")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162285154108.2096632.5572805472362321307.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

2021-04-25  dmaengine: idxd: Enable IDXD performance monitor support  (Tom Zanussi, 1 file, -4/+1)

Add the code needed in the main IDXD driver to interface with the IDXD
perfmon implementation.

[ Based on work originally by Jing Lin. ]

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Link: https://lore.kernel.org/r/a5564a5583911565d31c2af9234218c5166c4b2c.1619276133.git.zanussi@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>

2021-04-23  dmaengine: idxd: remove MSIX masking for interrupt handlers  (Dave Jiang, 1 file, -12/+0)

Remove interrupt masking and just let the hard irq handler keep firing for
new events. This has less of a performance impact than the MMIO readback
inside pci_msi_{mask,unmask}_irq(); especially on a loaded system, those
flushes can be stuck behind large amounts of other MMIO writes. When the
guest kernel is running on top of a VFIO mdev, each mask/unmask causes a
vmexit and is not desirable.

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/161894523436.3210025.1834640110556139277.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

2021-04-23  dmaengine: idxd: support reporting of halt interrupt  (Dave Jiang, 1 file, -0/+2)

Unmask the halt error interrupt so it gets reported to the interrupt handler.
When the halt state interrupt is received, quiesce the kernel WQs and unmap
the portals to stop submission.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161894441167.3202472.9485946398140619501.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

2021-04-20  dmaengine: idxd: fix cdev setup and free device lifetime issues  (Dave Jiang, 1 file, -2/+2)

The char device setup and cleanup has device lifetime issues regarding when
parts are initialized and cleaned up. The initialization of struct device is
done incorrectly: device_initialize() needs to be called on the 'struct
device' before additional setup can be done, and the ->release() function
needs to be set up via the device_type before dev_set_name() to allow proper
cleanup. The change re-parents the cdev under wq->conf_dev to get natural
reference inheritance. No known dependency on the old device path exists.

Reported-by: Jason Gunthorpe <jgg@nvidia.com>
Fixes: 42d279f9137a ("dmaengine: idxd: add char driver to expose submission portal to userland")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/161852987721.2203940.1478218825576630810.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

2021-04-20  dmaengine: idxd: fix wq conf_dev 'struct device' lifetime  (Dave Jiang, 1 file, -3/+3)

Remove the devm_* allocation and fix the wq->conf_dev 'struct device'
lifetime, addressing issues flagged by CONFIG_DEBUG_KOBJECT_RELEASE. Add
release functions in order to free the allocated memory for the wq context at
device destruction time.

Reported-by: Jason Gunthorpe <jgg@nvidia.com>
Fixes: bfe1d56091c1 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161852985907.2203940.6840120734115043753.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

2021-04-12  dmaengine: idxd: Fix clobbering of SWERR overflow bit on writeback  (Dave Jiang, 1 file, -1/+3)

The current code blindly writes over the SWERR and OVERFLOW bits. Write back
the bits actually read instead, so the driver avoids clobbering an OVERFLOW
bit that gets set after the register is read.

Fixes: bfe1d56091c1 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Reported-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161352082229.3511254.1002151220537623503.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
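
The general read-then-write-back-what-you-read pattern, sketched with a
single 32-bit register and a made-up offset name (the real SWERR register is
wider and read in pieces):

    u32 val;

    val = ioread32(idxd->reg_base + EXAMPLE_SWERR_OFFSET);

    /* the register is write-1-to-clear: writing back only the bits we
     * actually read leaves an OVERFLOW bit that was set after the read
     * untouched, instead of blindly clearing everything */
    iowrite32(val, idxd->reg_base + EXAMPLE_SWERR_OFFSET);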

2021-01-17  dmaengine: idxd: fix misc interrupt completion  (Dave Jiang, 1 file, -9/+27)

Nikhil reported that the misc interrupt handler can sometimes miss handling
the command interrupt when an error interrupt happens at nearly the same
time. Have the irq handling thread continue to process the misc interrupts
until all interrupts are processed. This is a low-usage interrupt and is not
expected to handle high-volume traffic, so there is no concern about this
thread running for a long time.

Fixes: 0d5c10b4c84d ("dmaengine: idxd: add work queue drain support")
Reported-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161074755329.2183844.13295528344116907983.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

2021-01-17  dmaengine: idxd: Fix list corruption in description completion  (Dave Jiang, 1 file, -42/+44)

Sanjay reported the following kernel splat after running dmatest for stress
testing. The current code gives up the spinlock in the middle of a completion
list walk, and that opens up an opportunity for list corruption if another
thread touches the list at the same time. In order to make sure the list is
always protected, the hardware-completed descriptors are put on a local list
and completed with callbacks from outside the list lock.

    kernel: general protection fault, probably for non-canonical address 0xdead000000000100: 0000 [#1] SMP NOPTI
    kernel: CPU: 62 PID: 1814 Comm: irq/89-idxd-por Tainted: G W 5.10.0-intel-next_10_16+ #1
    kernel: Hardware name: Intel Corporation ArcherCity/ArcherCity, BIOS EGSDCRB1.SBT.4915.D02.2012070418 12/07/2020
    kernel: RIP: 0010:irq_process_work_list+0xcd/0x170 [idxd]
    kernel: Code: e8 18 65 5c d3 8b 45 d4 85 c0 75 b3 4c 89 f7 e8 b9 fe ff ff 84 c0 74 bf 4c 89 e7 e8 dd 6b 5c d3 49 8b 3f 49 8b 4f 08 48 89 c6 <48> 89 4f 08 48 89 39 4c 89 e7 48 b9 00 01 00 00 00 00 ad de 49 89
    kernel: RSP: 0018:ff256768c4353df8 EFLAGS: 00010046
    kernel: RAX: 0000000000000202 RBX: dead000000000100 RCX: dead000000000122
    kernel: RDX: 0000000000000001 RSI: 0000000000000202 RDI: dead000000000100
    kernel: RBP: ff256768c4353e40 R08: 0000000000000001 R09: 0000000000000000
    kernel: R10: 0000000000000000 R11: 00000000ffffffff R12: ff1fdf9fd06b3e48
    kernel: R13: 0000000000000005 R14: ff1fdf9fc4275980 R15: ff1fdf9fc4275a00
    kernel: FS:  0000000000000000(0000) GS:ff1fdfa32f880000(0000) knlGS:0000000000000000
    kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    kernel: CR2: 00007f87f24012a0 CR3: 000000010f630004 CR4: 0000000003771ee0
    kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    kernel: DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
    kernel: PKRU: 55555554
    kernel: Call Trace:
    kernel:  ? irq_thread+0xa9/0x1b0
    kernel:  idxd_wq_thread+0x34/0x90 [idxd]
    kernel:  irq_thread_fn+0x24/0x60
    kernel:  irq_thread+0x10f/0x1b0
    kernel:  ? irq_forced_thread_fn+0x80/0x80
    kernel:  ? wake_threads_waitq+0x30/0x30
    kernel:  ? irq_thread_check_affinity+0xe0/0xe0
    kernel:  kthread+0x142/0x160
    kernel:  ? kthread_park+0x90/0x90
    kernel:  ret_from_fork+0x1f/0x30
    kernel: Modules linked in: idxd_ktest dmatest intel_rapl_msr idxd_mdev iTCO_wdt vfio_pci vfio_virqfd iTCO_vendor_support intel_rapl_common i10nm_edac nfit x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper rapl msr pcspkr snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg ofpart snd_hda_codec snd_hda_core snd_hwdep cmdlinepart snd_seq snd_seq_device idxd snd_pcm intel_spi_pci intel_spi snd_timer spi_nor input_leds joydev snd i2c_i801 mtd soundcore i2c_smbus mei_me mei i2c_ismt ipmi_ssif acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler mac_hid sunrpc nls_iso8859_1 sch_fq_codel ip_tables x_tables xfs libcrc32c ast drm_vram_helper drm_ttm_helper ttm igc wmi pinctrl_sunrisepoint hid_generic usbmouse usbkbd usbhid hid
    kernel: ---[ end trace cd5ca950ef0db25f ]---

Reported-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Tested-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Fixes: e4f4d8cdeb9a ("dmaengine: idxd: Clean up descriptors with fault error")
Link: https://lore.kernel.org/r/161074757267.2183951.17912830858060607266.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
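
A compact sketch of the "collect under the lock, complete outside the lock"
pattern the fix moves to; the structures and field names are illustrative,
not the driver's:

    #include <linux/list.h>
    #include <linux/spinlock.h>
    #include <linux/types.h>

    struct example_desc {
            struct list_head list;
            u8 status;              /* set by hardware when complete */
    };

    struct example_irq_entry {
            spinlock_t list_lock;
            struct list_head work_list;
    };

    static void example_process_work_list(struct example_irq_entry *ie)
    {
            struct example_desc *desc, *n;
            LIST_HEAD(flist);

            /* never drop the lock in the middle of the walk; move completed
             * descriptors to a private list instead */
            spin_lock(&ie->list_lock);
            list_for_each_entry_safe(desc, n, &ie->work_list, list) {
                    if (desc->status)
                            list_move_tail(&desc->list, &flist);
            }
            spin_unlock(&ie->list_lock);

            /* run the callbacks unlocked; they may take other locks or sleep */
            list_for_each_entry_safe(desc, n, &flist, list) {
                    list_del(&desc->list);
                    /* invoke the dmaengine callback and free desc (not shown) */
            }
    }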

2020-10-30  dmaengine: idxd: Clean up descriptors with fault error  (Dave Jiang, 1 file, -12/+134)

Add code to "complete" a descriptor when the descriptor or its completion
address hits a fault error while SVA mode is being used. This error can be
triggered by bad programming from the user. A lock is introduced in order to
protect the descriptor completion lists, since the fault handler runs from
the system work queue after being scheduled in the interrupt handler.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/160382008092.3911367.12766483427643278985.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

2020-09-11  Merge tag 'v5.9-rc4' into next  (Vinod Koul, 1 file, -12/+0)

Linux 5.9-rc4

2020-08-17  dmaengine: idxd: clear misc interrupt cause after read  (Dave Jiang, 1 file, -1/+1)

Move the clearing of the misc interrupt cause to immediately after reading
the register in order to allow the next interrupt to be asserted.

Suggested-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/159603824665.28647.5344356370364397996.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

2020-08-17  dmaengine: idxd: reset states after device disable or reset  (Dave Jiang, 1 file, -12/+0)

The state for WQs should be reset to disabled when a device is reset or
disabled.

Fixes: da32b28c95a7 ("dmaengine: idxd: cleanup workqueue config after disabling")
Reported-by: Mona Hossain <mona.hossain@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/159586777684.27150.17589406415773568534.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

2020-08-05  Merge branch 'for-linus' into fixes  (Vinod Koul, 1 file, -24/+19)

Signed-off-by: Vinod Koul <vkoul@kernel.org>

Conflicts:
	drivers/dma/idxd/sysfs.c

2020-07-13  dmaengine: idxd: move idxd interrupt handling to mask instead of ignore  (Dave Jiang, 1 file, -2/+0)

Switch the driver to use MSIX mask and unmask instead of the ignore bit. When
the ignore bit is cleared, we must issue an MMIO read to ensure the writes
have all arrived, and then check and process any additional completions. The
ignore bit does not queue up any pending MSIX interrupts; the mask bit,
however, does. Use an API call from the interrupt subsystem to mask the MSIX
interrupt, since the hardware does not have a convenient mask bit register.

Suggested-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/159319517621.70410.11816465052708900506.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

2020-07-13  dmaengine: idxd: add work queue drain support  (Dave Jiang, 1 file, -22/+19)

Add wq drain support. When a wq is being released, it needs to wait for all
in-flight operations to complete. A device control function, idxd_wq_drain(),
has been added to facilitate this. A wq drain call is added to the char dev
on release to make sure all user operations are complete, and a wq drain is
also issued before the wq is disabled.

A drain command can take an unpredictable period of time, so interrupt
support for device commands is added to allow waiting for the command to
finish. If a previous command is in progress, the new submitter blocks until
the current command is finished before proceeding. The interrupt-based path
submits the command and then waits for the command completion interrupt. All
commands are moved to interrupt-based command submission except for the
device reset during probe, which is polled.

Fixes: 42d279f9137a ("dmaengine: idxd: add char driver to expose submission portal to userland")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/159319502515.69593.13451647706946040301.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

2020-07-02  dmaengine: idxd: fix misc interrupt handler thread unmasking  (Dave Jiang, 1 file, -1/+2)

Fix unmasking of the misc interrupt when the handler completes normally. The
current implementation exits early and skips the unmasking; unmask the
interrupt when exiting normally.

Fixes: bfe1d56091c1 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/159311256528.855.11527922406329728512.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

2020-05-04  dmaengine: idxd: fix interrupt completion after unmasking  (Dave Jiang, 1 file, -7/+19)

The current implementation may miss completions after we unmask the
interrupt. In order to make sure we process all completions, we need to:

 1. Do an MMIO read from the device as a barrier to ensure that all PCI
    writes for completions have arrived.
 2. Check for any additional completions that we missed.

Fixes: 8f47d1a5e545 ("dmaengine: idxd: connect idxd to dmaengine subsystem")
Reported-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/158834641769.35613.1341160109892008587.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
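
In outline (the unmask helper, register offset, and completion-processing
call are placeholders, not real idxd identifiers):

    /* 1. re-enable / unmask the interrupt */
    example_unmask_irq(ie);

    /* 2. MMIO read from the device: acts as a flush so that completion
     *    writes already posted by the device are visible before we look */
    (void)ioread32(idxd->reg_base + EXAMPLE_INTCAUSE_OFFSET);

    /* 3. pick up anything that completed while the interrupt was masked */
    example_process_completions(ie);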

2020-01-24  dmaengine: idxd: add char driver to expose submission portal to userland  (Dave Jiang, 1 file, -0/+18)

Create a char device region that will allow acquisition of user portals in
order to allow applications to submit DMA operations. A char device will be
created per work queue that gets exposed. The workqueue type "user" is used
to mark a work queue for a user char device. For example, if workqueue 0 of
DSA device 0 is marked for a char device, then a device node /dev/dsa/wq0.0
will be created.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/157965026985.73301.976523230037106742.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

2020-01-24  dmaengine: idxd: connect idxd to dmaengine subsystem  (Dave Jiang, 1 file, -0/+87)

Add plumbing for the dmaengine subsystem connection. The driver registers a
DMA device per DSA device. The channels are dynamically registered when a
workqueue is configured to be of "kernel:dmaengine" type.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/157965026376.73301.13867988830650740445.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

2020-01-24  dmaengine: idxd: Init and probe for Intel data accelerators  (Dave Jiang, 1 file, -0/+156)

The idxd driver introduces the Intel Data Stream Accelerator [1] that will be
available on future Intel Xeon CPUs. One of the kernel access points for the
driver is through the dmaengine subsystem. It will initially provide the DMA
copy service to the kernel. Some of the main functionality introduced with
this accelerator is shared virtual memory (SVM) support and descriptor
submission using the Intel CPU instructions movdir64b and enqcmds. There will
be additional accelerator devices that share the same driver, with variations
in capabilities.

This commit introduces the probe and initialization component of the driver.

[1]: https://software.intel.com/en-us/download/intel-data-streaming-accelerator-preliminary-architecture-specification

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/157965023991.73301.6186843973135311580.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>