Age | Commit message (Collapse) | Author | Files | Lines |
|
This commit adds a cache eviction algorithm for the SDMA
user buffer cache.
Besides the interval RB tree used for node lookup, the cache
nodes are also arranged in a doubly-linked list. When a node is
used, it is put at the beginning of the list. Less frequently
used nodes naturally move to the tail of the list.
When the cache limit is reached, the eviction code starts
traversing the linked list in reverse, freeing buffers until
enough space has been freed to fit the new user buffer. This
guarantees that only the least used cache nodes will be removed
from the cache.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Use the new function to query whether the expected receive
user buffer can be pinned successfully. This requires that
a new variable be added to the hfi1_filedata structure used
to hold the number of pages pinned by the expected receive
code.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
This change adds a pointer to the process mm_struct when
calling hfi1_release_user_pages().
Previously, the function used the mm_struct of the current
process to adjust the number of pinned pages. However, is some
cases, namely when unpinning pages due to a MMU notifier call,
we want to drop into that code block as it will cause a deadlock
(the MMU notifiers take the process' mmap_sem prior to calling
the callbacks).
By allowing to caller to specify the pointer to the mm_struct,
the caller has finer control over that part of hfi1_release_user_pages().
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
System administrators can use the locked memory
ulimit setting to set the maximum amount of memory
a user can lock/pin. However, this setting alone is not
enough to guarantee good operation of the hfi1 driver
due to the fact that the setting does not have fine
enough granularity to account for the limit being used
by multiple user processes and caches.
Therefore, a better limiting algorithm is needed. This
is where the new hfi1_can_pin_pages() function and the
cache_size module parameter come in.
The function works by looking at the ulimit and cache_size
value to compute a cache size. The algorithm examines the
ulimit value and, if it is not "unlimited", computes a
per-cache limit based on the number of configured user
contexts.
After that, the lower of the two - cache_size and computed
per-cache limit - is used.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Add support for caching of user buffers used for SDMA
transfers. This change improves performance by
avoiding repeatedly pinning the pages of buffers, which
are being re-used by the application.
While the cost of the pinning operation has been made
heavier by adding the extra code to search the cache tree,
re-allocate pages arrays, and future cache evictions,
that cost will be amortized against the savings when the
same buffer is re-used. It is also worth noting that in
most cases, the cost of pinning should be much lower due
to the buffer already being in the cache.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Last address values for intervals in the interval RB tree
nodes should be non-inclusive in order to avoid confusing
ranges.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
This commit adds a filter callback, which can be used to filter
out interval RB nodes matching a certain interval down to a
single one.
This is needed for the upcoming SDMA-side caching where buffers
will need to be filtered by their virtual address.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Interval RB trees provide their own searching function,
which also takes care of determining the path through
the tree that should be taken.
This make the compare callback unnecessary.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Add a new tracepoint type for the MMU functions and calls
to that tracepoint to allow tracing of MMU functionality.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The interval RB trees can handle RB nodes which
hold ranged information. This is exactly the usage
for the buffer cache implemented in the expected
receive code path.
Convert the MMU/RB functions to use the interval RB
tree API. This will help with future users of the
caching API, as well.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Tell the remove MMU/RB callback if it's being called as
part of a memory invalidation or not. This can be important
in preventing a deadlock if the remove callback attempts to
take the map_sem semaphore because the kernel's MMU
invalidation functions have already taken it.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The usage of function pointers for RB node insertion
and removal in the expected receive code path was
meant to be a small performance optimization. However,
maintaining it, especially with the new MMU API, would
become more troublesome as the API is extended.
Since the performance optimization is minor, remove the
function pointers and replace with direct calls.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
In order to allow the remove MMU callbacks to free the
RB nodes, it is necessary to prevent any references to
the nodes after the remove callback has been called.
Therefore, remove the node from the tree prior to calling
the callback. In other words, the MMU/RB API now guarantees
that all RB node operations it performs will be done prior
to calling the remove callback and that the RB node will
not be touched afterwards.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Prevent a potential NULL pointer dereference (found
by code inspection) when unregistering an MMU handler.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Future users of the MMU/RB functions might be searching or
manipulating the MMU RB trees in interrupt context. Therefore,
the MMU/RB functions need to be able to run in interrupt
context. This requires that we use the IRQ-aware API for
spin locks.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The MMU notification code added to the
expected receive side has been re-factored and
split into it's own file. This was done in
order to make the code more general and, therefore,
usable by other parts of the driver.
The caching behavior remains the same. However,
the handling of the RB tree (insertion, deletions,
and searching) as well as the MMU invalidation
processing is now handled by functions in the
mmu_rb.[ch] files.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Accordingly IB Spec post WR to receive queue must
complete with error if QP is in Error state.
Please refer to C10-42, C10-97.2.1
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Set the piothreshold to the agreed upon default of 256B.
Reviewed-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The adaptive pio heuristic missed a case that causes a corrupted
packet on the wire.
The case is if SDMA egress had been chosen for a pio-able packet and
then encountered a ring space wait, the packet is queued. The sge
cursor had been incremented as part of the packet build out for SDMA.
After the send engine restart, the heuristic might now chose pio based
on the sdma count being zero and start the mmio copy using the already
incremented sge cursor.
Fix this by forcing SDMA egress when the SDMA descriptor has already
been built.
Additionally, the code to wait for a QPs pio count to zero when
switching to SDMA was missing. Add it.
There is also an issue with UD QPs, in that the different SLs can pick
a different egress send context. For now, just insure the UD/GSI
always go through SDMA.
Reviewed-by: Vennila Megavannan <vennila.megavannan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The following panic occurs while running ib_send_bw -a with
adaptive pio turned on:
[ 8551.143596] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 8551.152986] IP: [<ffffffffa0902a94>] pio_wait.isra.21+0x34/0x190 [hfi1]
[ 8551.160926] PGD 80db21067 PUD 80bb45067 PMD 0
[ 8551.166431] Oops: 0000 [#1] SMP
[ 8551.276725] task: ffff880816bf15c0 ti: ffff880812ac0000 task.ti: ffff880812ac0000
[ 8551.285705] RIP: 0010:[<ffffffffa0902a94>] pio_wait.isra.21+0x34/0x190 [hfi1]
[ 8551.296462] RSP: 0018:ffff880812ac3b58 EFLAGS: 00010282
[ 8551.303029] RAX: 000000000000002d RBX: 0000000000000000 RCX: 0000000000000800
[ 8551.311633] RDX: ffff880812ac3c08 RSI: 0000000000000000 RDI: ffff8800b6665e40
[ 8551.320228] RBP: ffff880812ac3ba0 R08: 0000000000001000 R09: ffffffffa09039a0
[ 8551.328820] R10: ffff880817a0c000 R11: 0000000000000000 R12: ffff8800b6665e40
[ 8551.337406] R13: ffff880817a0c000 R14: ffff8800b6665800 R15: ffff8800b6665e40
[ 8551.355640] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8551.362674] CR2: 0000000000000000 CR3: 000000080abe8000 CR4: 00000000001406e0
[ 8551.371262] Stack:
[ 8551.374119] ffff880812ac3bf0 ffff88080cf54010 ffff880800000800 ffff880812ac3c08
[ 8551.383036] ffff8800b6665800 ffff8800b6665e40 0000000000000202 ffffffffa08e7b80
[ 8551.391941] 00000001007de431 ffff880812ac3bc8 ffffffffa0904645 ffff8800b6665800
[ 8551.400859] Call Trace:
[ 8551.404214] [<ffffffffa08e7b80>] ? hfi1_del_timers_sync+0x30/0x30 [hfi1]
[ 8551.412417] [<ffffffffa0904645>] hfi1_verbs_send+0x215/0x330 [hfi1]
[ 8551.420154] [<ffffffffa08ec126>] hfi1_do_send+0x166/0x350 [hfi1]
[ 8551.427618] [<ffffffffa055a533>] rvt_post_send+0x533/0x6a0 [rdmavt]
[ 8551.435367] [<ffffffffa050760f>] ib_uverbs_post_send+0x30f/0x530 [ib_uverbs]
[ 8551.443999] [<ffffffffa0501367>] ib_uverbs_write+0x117/0x380 [ib_uverbs]
[ 8551.452269] [<ffffffff815810ab>] ? sock_recvmsg+0x3b/0x50
[ 8551.459071] [<ffffffff81581152>] ? sock_read_iter+0x92/0xe0
[ 8551.466068] [<ffffffff81212857>] __vfs_write+0x37/0x100
[ 8551.472692] [<ffffffff81213532>] ? rw_verify_area+0x52/0xd0
[ 8551.479682] [<ffffffff81213782>] vfs_write+0xa2/0x1a0
[ 8551.486089] [<ffffffff81003176>] ? do_audit_syscall_entry+0x66/0x70
[ 8551.493891] [<ffffffff812146c5>] SyS_write+0x55/0xc0
[ 8551.500220] [<ffffffff816ae0ee>] entry_SYSCALL_64_fastpath+0x12/0x71
[ 8551.531284] RIP [<ffffffffa0902a94>] pio_wait.isra.21+0x34/0x190 [hfi1]
[ 8551.539508] RSP <ffff880812ac3b58>
[ 8551.544110] CR2: 0000000000000000
The priv s_sendcontext pointer was not setup properly. Fix with this
patch by using the s_sendcontext and eliminating its send engine use.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
There is a timing hole if there had been greater than
PIO_WAIT_BATCH_SIZE waiters. This code will dispatch the first
batch but leave the others in the queue. If the restarted waiters
don't in turn wait on a buffer, there is a hang.
Fix by forcing a return when the QP queue is non-empty.
Reviewed-by: Vennila Megavannan <vennila.megavannan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The postitioning of the sdma ibhdr trace was
causing an extra trace message when the tx send
returned -EBUSY.
Move the trace to just before the return
and handle negative return values to avoid
any trace.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
This allows for separately enabling pio and sdma
tracepoints to cut the volume of trace information.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The changes are to aid in coorelating trace information
with QPs between the trace and qp_stats information
Such changes include adds a space after QP and clarifying that the second
QP is actually the remote QP.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Tracking user/QP ownership is needed to debug issues with
user ULPs like OpenMPI.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The current LED beaconing code is unclear and uses the timer handler to
turn off the timer. This patch simplifies the code by removing the
special semantics of timeon = timeoff = 0 being interpreted as a request
to turn off the beaconing.
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
This patch fixed the problem where the driver might reschedule in atomic
mode when sending packets. This is due to the fact that the call to
cond_resched() in hfi1_do_send() might occur in atomic mode and a check is
required to avoid the warning message:
"kernel: BUG: scheduling while atomic: swapper/2/0/0x10000100."
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The kernel memcpy is faster than a cacheless copy. However,
if too much of the L3 cache is overwritten by one-time copies
then overall bandwidth suffers. Implement an adaptive scheme
where full page copies are tracked and if the number of unique
entries are larger than a threshold, verbs will use a cacheless
copy. Tracked entries are gradually cleaned, allowing memcpy to
resume once the larger copies have stopped.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Host handshake timeout can occur during the verify capability
state. This is a LNI related failure and should be
handled in the same way as other LNI failures.
Reviewed-by: Dean Luick <dean.luick@intel.com>
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Different OSes using parts of the same hardware may leave
cross-device flags set. Export a debugfs file to view and
clear these flags if needed.
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
External i2c firmware updates are done in multiple steps and
cannot have other things done in between. For debugfs files,
acquire the resource on open and release it on close.
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The hardware mutex is now held only long enough to set
or clear flags. Reduce the timeout to something more
reasonable.
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The flag HFI1_DO_INIT_ASIC flag is no longer used. Remove
the flag and the code that sets it.
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Use the resource reservation system to flag that the ASIC
thermal has been initialized.
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Remove the mutex guarding each operation in favor the ASIC
resource acquire/release. Push the resource acquire/release,
above each operation call to allow exclusive access across
multiple operations.
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The SBus resource includes SBUS, PCIE, and THERM registers.
Change SBus handling to use the new ASIC resource reservation system.
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Change EPROM handling to use the new ASIC resource reservation system.
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The ASIC block is a shared hardware resource between two devices
on the chip. Add functions to acquire and release these resources
in a way that is safe for both multiple users on the same OS
and multiple users on different OSes, while holding the hardware
mutex as little as possible.
Reservations are noted in a scratch register in the shared region.
There are two types of reservations: per-HFI dynamic and permanent.
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Create a shared structure to exist between devices that share the
same ASIC.
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The ASIC block is shared between two HFIs. Individual devices
should not initialize registers there. Retain the power-on values.
Individual users set registers as needed with one exception.
Clear sbus fast mode on "slow" calls.
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
This change was recommended by Coccinelle tool when I ran the command:
-bash-4.2$ make coccicheck MODE=patch M=drivers/infiniband/hw/hfi1/
Reviewed-by: Jubin John <jubin.john@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Implement changes recommended by the Coccinelle tool to move constant to
the right in bitwise operations
-bash-4.2$ make coccicheck MODE=report M=drivers/infiniband/hw/hfi1/
drivers/infiniband/hw/hfi1/pio.c:765:4-16: Move constant to right.
drivers/infiniband/hw/hfi1/rc.c:2503:19-29: Move constant to right.
drivers/infiniband/hw/hfi1/chip.c:9813:11-22: Move constant to right.
drivers/infiniband/hw/hfi1/chip.c:14468:29-40: Move constant to right.
Reviewed-by: Jubin John <jubin.john@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The break statement was unintentionally removed in this patch
commit 41ca419abc0ca7ee65d765408cdc1a7fed2897a3
("staging/rdma/hfi1: Remove hfi1 MR and hfi1 specific qp type")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Fix 3 memory leaks reported by the LeakCheck tool in the KEDR framework.
The following resources were allocated memory during their respective
initializations but not freed during cleanup:
1. SDMA map elements
2. PIO map elements
3. HW send context to SW index map
This patch fixes the memory leaks by freeing the allocated memory in the
cleanup path.
Reviewed-by: Dean Luick <dean.luick@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The LedInfo SMA attribute is redefined to control the LED beaconing
state machine instead of the LED directly. In accordance, we now
return the state of LED beaconing, represented by whether the beaconing
timer is active, instead of the state of the LED itself for SMA queries
Get(LedInfo) and Get(PortInfo). While we are at it, we fix the beaconing
timer control code so that the state of the timer is accurately updated.
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
This patch tests the interrupt registers when the driver has no access to
its upstream component. In this case, it is highly likely that it is
running in a virtual machine (eg, Qemu-kvm guest). If the interrupt
registers are not mapped properly by the virtual machine monitor, an
error message will be printed and the probing will be terminated. This
will help the user identify the issue. On the other hand, if the driver
is running in a host or has access to its upstream component in some
other VM, it will do nothing.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
When the hfi1 device is assigned to a VM (eg KVM), the hfi1 driver has
no access to the upstream component and therefore cannot use it to perform
some operations, such as secondary bus reset. As a result, the hfi1 driver
cannot perform the pcie Gen3 transition. Instead, those operation should
be done in the host environment, preferrably done during the Option ROM
initialization. Similarly, the hfi1 driver cannot support ASPM and tune
the pcie capability under this circumstance.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
There is a header size counter in both the QP struture and the txreq
structure. The counter in the txreq structure is not updated properly
for RC and UC queue pairs with GRH enabled, and thus causing SDMA
send to fail. This patch fixes the RC and UC path.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The lkey_table_size driver specific parameter value is used before its
value is sanity checked and restricted to RVT_MAX_LKEY_TABLE_BITS.
This causes a vmalloc allocation failure for large values. Fix this
by moving the value check before the first usage of the value.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
A cp or cat of /sys/kernel/debug/hfi1/hfi1_0/port1counters
produces the following message:
hfi1 0000:81:00.0: hfi1_0: index not supported
hfi1 0000:81:00.0: hfi1_0: read_cntrs does not support indexing
Fix by removing the file position logic and the associated messages
and make the file positioning the responsibility of the caller.
The port counter read function argument is changed to the per port
data structure since the counters are relative to the port and not
the device.
Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|