summaryrefslogtreecommitdiff
path: root/drivers/misc/habanalabs/common
AgeCommit message (Collapse)AuthorFilesLines
2021-12-26habanalabs: print va_range in vm node debugfsYuri Nudelman1-0/+25
VA range info could assist in debugging VA allocation bugs. Signed-off-by: Yuri Nudelman <ynudelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-12-26habanalabs: modify wait for boot fit in dynamic FW loadOhad Sharabi1-1/+0
In the dynamic FW load protocol the boot status is updated to "Ready to Boot" once uboot is active. Polling on other boot status values is a residue of code duplication from the static protocol and should be removed. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-10-25dma-buf: move dma-buf symbols into the DMA_BUF module namespaceGreg Kroah-Hartman1-0/+2
In order to better track where in the kernel the dma-buf code is used, put the symbols in the namespace DMA_BUF and modify all users of the symbols to properly import the namespace to not break the build at the same time. Now the output of modinfo shows the use of these symbols, making it easier to watch for users over time: $ modinfo drivers/misc/fastrpc.ko | grep import import_ns: DMA_BUF Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: dri-devel@lists.freedesktop.org Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Sumit Semwal <sumit.semwal@linaro.org> Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20211010124628.17691-1-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-18habanalabs: refactor fence handling in hl_cs_poll_fencesDani Liberman1-35/+36
To avoid checking if fence exists multipled times, changed fence handling to depend only on the fence status field: Busy, which means CS still did not completed : Add its QID so multi CS wait on its completion. Finished, which means CS completed and fence exists: Raise its completion bit if it finished mcs handling and update if necessary the earliest timestamp. Gone, which means CS already completed and fence deleted: Update multi CS data to ignore timestamp and raise its completion bit. Signed-off-by: Dani Liberman <dliberman@habana.ai> Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-10-18habanalabs: context cleanup cosmeticsOmer Shpigelman1-7/+1
No need to check the return value if the following action is the same for both cases. In addition, now that hl_ctx_free() doesn't print if the context is not released, its name can be misleading as the context might stay alive after it is executed with no indication for that. Hence we can discard it and simply put the refcount. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-10-18habanalabs: simplify wait for interrupt with timestamp flowYuri Nudelman3-10/+7
Remove the flag that determines whether to take a timestamp once the interrupt arrives. Instead, always take the timestamp once per interrupt. This is a must for the user-space to measure its graph operations to evaluate the graph computation time. Signed-off-by: Yuri Nudelman <ynudelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-10-18habanalabs: initialize hpriv fields before adding new nodeMoti Haimovski1-8/+15
When adding a new node to the hpriv list, the driver should initialize its fields before adding the new node. Otherwise, there may be some small chance of another thread traversing that list and accessing the new node's fields without them being initialized. Signed-off-by: Moti Haimovski <mhaimovski@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-10-18habanalabs: Unify frequency set/get functionalityRajaravi Krishna Katta3-1/+128
Make the frequency set/get functionality common to all ASICs. This makes more code reusable when adding support for newer ASICs. Signed-off-by: Rajaravi Krishna Katta <rkatta@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-10-18habanalabs: add support for dma-buf exporterTomer Tayar2-3/+532
Implement the calls to the dma-buf kernel api to create a dma-buf object backed by FD. We block the option to mmap the DMA-BUF object because we don't support DIRECT_IO and implicit P2P. We only implement support for explicit P2P through importing the FD of the DMA-BUF. In the export phase, we provide to the DMA-BUF object an array of pages that represent the device's memory area. During the map callback, we convert the array of pages into an SGT. We split/merge the pages according to the dma max segment size of the importer. To get the DMA address of the PCI bar, we use the dma_map_resources() kernel API, because our device memory is not backed by page struct and this API doesn't need page struct to map the physical address to a DMA address. We set the orig_nents member of the SGT to be 0, to indicate to other drivers that we don't support CPU mappings. Note that in Habanalabs's ASICs, the device memory is pinned and immutable. Therefore, there is no need for dynamic mappings and pinning callbacks. Also note that in GAUDI we don't have an MMU towards the device memory and the user works on physical addresses. Therefore, the user doesn't pass through the kernel driver to allocate memory there. As a result, only for GAUDI we receive from the user a device memory physical address (instead of a handle) and a size. We check the p2p distance using pci_p2pdma_distance_many() and refusing to map dmabuf in case the distance doesn't allow p2p. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Reviewed-by: Gal Pressman <galpress@amazon.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-10-18habanalabs: fix NULL pointer dereferenceDani Liberman1-2/+11
When polling fences for multi CS, it is possible that fence is no longer exists (its corresponding CS completed and the fence was deleted) but we still accessing its parameters, causing NULL pointer dereference. Fixed by checking if fence exits before accessing its parameters. Signed-off-by: Dani Liberman <dliberman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-10-18habanalabs: fix race condition in multi CS completionDani Liberman2-1/+23
Race condition occurs when CS fence completes and multi CS did not completed yet, while waiting for multi CS ends and returns indication to user that the CS completed. Next wait for multi CS may be triggered by previous multi CS completion without any current CS completed, causing an error. Example scenario : 1. User do multi CS wait for CSs 1 and 2 on master QID 0 2. CS 1 and 2 reached the "cs release" code. The thread of CS 1 completed both the CS and multi CS handling but the completion thread of CS 2 completed the CS but still did not executed complete_multi_cs (note that in CS completion the sequence is to first do complete all for the CS and then another complete all to signal the multi_cs) 3. User received indication that CS 1 and 2 completed (since we check the CS fence and both indicated as completed) and immediately waits on CS 3 and 4, also on master QID 0. 4. Completion thread of CS2 executed complete_multi_cs before completion of CS 3 and 4 and so will trigger the multi CS wait of CSs 3 and 4 as they wait on master QID 0. This will trigger multi CS completion although none of its current CS has been completed. Fixed by adding multi CS complete handling indication for each CS. CS will be marked to the user as completed only if its fence completed and multi CS handling is done. Signed-off-by: Dani Liberman <dliberman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-10-18habanalabs: bypass reset for continuous h/w error eventBharat Jauhari3-34/+99
There may be a situation where drivers receives continuous fatal H/W error events from FW immediately post reset cycle. This may be due to some fault on the silicon itself. In such case its better to bypass reset cycle so we won't be stuck in endless loop of resets. This commit bypasses reset request in case driver received two back to back FW fatal error before first occurrence of heartbeat event. Signed-off-by: Bharat Jauhari <bjauhari@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-10-18habanalabs: take timestamp on wait for interruptYuri Nudelman3-3/+21
Taking an accurate timestamp in a close proximity of the interrupt is required for user side statistics management. Signed-off-by: Yuri Nudelman <ynudelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-10-18habanalabs: prevent race between fd close/openOded Gabbay1-7/+23
The driver allows only a single process to open a device's FD at any single time. This is done by checking "hdev->compute_ctx" under mutex. Therefore, to prevent a race between the moment a user closes it's FD and when another user tries to open the device, we need to make sure that clearing this variable is the very last thing that is done in the code of the FD's release. I'm moving the idle check before clearing this variable and the "reset on device release". btw, if the reset happens it will prevent any other user from opening the device until the reset is finished. An important thing to note is that we need to remove the user process that is closing the device from the process list BEFORE calling the reset function. That is to prevent a case where the reset code will try to kill that user process and it is unnecessary as the process doesn't hold any device/driver resources anymore. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-10-18habanalabs: refactor reset log messageOded Gabbay1-1/+8
Reset to the device is not necessarily due to an error, so print it as info instead of error. In addition, print the type of reset we are doing: - reset of the entire device (aka hard reset) - reset of the device after user have released it (less than hard reset) - lighter reset of an inference device (aka soft reset) Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-10-18habanalabs: define soft-reset as inference opOded Gabbay3-7/+11
Soft-reset is the procedure where we reset only the compute/DMA engines of the device, without requiring the current user-space process to release the device. This type of reset can happen if TDR event occurred (a workload got stuck) or by a root request through sysfs. This is only relevant for inference ASICs, as there is no real-world use-case to do that in training, because training runs on multiple devices. In addition, we also do (in certain ASICs) a reset upon device release. That reset uses the same code as the soft-reset. Therefore, to better differentiate between the two resets, it is better to rename the soft-reset support as "inference soft-reset", to make the code more self-explanatory. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-10-18habanalabs: fix debugfs device memory MMU VA translationYuri Nudelman1-14/+16
The translation in debugfs of device memory MMU virtual addresses was wrong as it did not take into consideration the fact that the page sizes there can be not power of 2. Signed-off-by: Yuri Nudelman <ynudelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-10-18habanalabs: add support for a long interrupt target valueOfir Bitton1-4/+4
In order to avoid user target value wraparound, we modify the current interface so user will be able to wait for an 8-byte target value rather than a 4-byte value. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-10-18habanalabs: remove redundant cs validity checksOfir Bitton1-3/+2
During TDR handling, we check multiple times if CS is valid. No need to perform this check as CS must be valid at all time during the TDR handling. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-10-18habanalabs: enable power info via HWMON frameworkRajaravi Krishna Katta2-0/+98
Add support to retrieve following power info via HWMON: - instantaneous power value - highest value since last reset - reset the highest place holder Signed-off-by: Rajaravi Krishna Katta <rkatta@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-10-18habanalabs: generalize COMMS message sending procedureAlon Mizrahi1-10/+18
Instead of having dedicated function per message that we want to send to the firmware in COMMS protocol, have a generic function that we can call to from other parts of the driver Signed-off-by: Alon Mizrahi <amizrahi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-10-18habanalabs: create static map of f/w hwmon enumsRajaravi Krishna Katta1-9/+91
Instead of using the Linux kernel HWMON enums definition when communicating with the firmware, use proprietary HWMON based enums i.e. map hwmon.h header enum to cpucp_if.h based enum while. This is needed because the HWMON enums are not forcing backward compatibility and therefore changes can break compatibility between newer driver and older firmware. The driver will check for CPU_BOOT_DEV_STS0_MAP_HWMON_EN bit to validate if f/w supports cpucp->hwmon enum mapping to support older firmware where this mapping won't be available. Signed-off-by: Rajaravi Krishna Katta <rkatta@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-10-18habanalabs: add debugfs node for configuring CS timeoutOfir Bitton1-0/+51
Command submission timeout is currently determined during driver loading time. As some environments requires this timeout to be modified in runtime, we introduce a new debugfs node that controls the timeout value without the need to reload the driver. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-09-29habanalabs: fix resetting args in wait for CS IOCTLRajaravi Krishna Katta1-14/+19
In wait for CS IOCTL code, the driver resets the incoming args structure before returning to the user, regardless of the return value of the IOCTL. In case the IOCTL returns EINTR, resetting the args will result in error in case the userspace will repeat the ioctl call immediately (which is the behavior in the hl-thunk userspace library). The solution is to reset the args only if the driver returns success (0) as a return value for the IOCTL. Signed-off-by: Rajaravi Krishna Katta <rkatta@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-09-14habanalabs: expose a single cs seq in staged submissionsOfir Bitton1-0/+6
Staged submission consists of multiple command submissions. In order to be explicit, driver should return a single cs sequence for every cs in the submission, or else user may try to wait on an internal CS rather than waiting for the whole submission. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-09-14habanalabs: fix wait offset handlingfarah kassabri1-2/+7
Add handling for case where the user doesn't set wait offset, and keeps it as 0. In such a case the driver will decrement one from this zero value which will cause the code to wait for wrong number of signals. The solution is to treat this case as in legacy wait cs, and wait for the next signal. Signed-off-by: farah kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-09-14habanalabs: rate limit multi CS completion errorsOfir Bitton1-1/+2
As user can send wrong arguments to multi CS API, we rate limit the amount of errors dumped to dmesg, in addition we change the severity to warning. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-09-14habanalabs: fail collective wait when not supportedOfir Bitton1-0/+9
As collective wait operation is required only when NIC ports are available, we disable the option to submit a CS in case all the ports are disabled, which is the current situation in the upstream driver. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-09-14habanalabs: fix kernel OOPs related to staged csfarah kassabri1-5/+13
In case of single staged cs with both first/last indications set, we reach a scenario where in cs_release function flow we don't cancel the TDR work before freeing the cs memory, this lead to kernel OOPs since when the timer expires the work pointer will be freed already. In addition treat wait encaps cs "not found" handle as "OK" for the user in order to keep the user interface for both legacy and encpas signal/wait features the same. Signed-off-by: farah kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-09-14habanalabs: fix potential race in interrupt wait ioctlOfir Bitton1-14/+21
We have a potential race where a user interrupt can be received in between user thread value comparison and before request was added to wait list. This means that if no consecutive interrupt will be received, user thread will timeout and fail. The solution is to add the request to wait list before we perform the comparison. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-09-01habanalabs: add support for f/w resetOded Gabbay3-15/+33
When the f/w runs in secured mode, it can reset the ASIC when certain events occur. In unsecured mode, the driver asks the f/w to reset the ASIC for those events. We need to perform the entire reset procedure but without accessing the ASIC. i.e. without halting the engines and without sending messages to the f/w. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-09-01habanalabs: cannot sleep while holding spinlockfarah kassabri2-3/+1
Fix 2 areas in the code where it's possible the code will go to sleep while holding a spinlock. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: farah kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-09-01habanalabs: never copy_from_user inside spinlockOded Gabbay1-23/+12
copy_from_user might sleep so we can never call it when we have a spinlock. Moreover, it is not necessary in waiting for user interrupt, because if multiple threads will call this function on the same interrupt, each one will have it's own fence object inside the driver. The user address might be the same, but it doesn't really matter to us, as we only read from it. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-09-01habanalabs: remove unnecessary device status checkOded Gabbay1-4/+6
Checking if the device is operational when entering the function to wait for user interrupt is not something that is useful or necessary. It is not done in any other wait_for_cs ioctl path. If the device becomes non-operational during the wait, the reset function will make sure the process wait is interrupted. Instead, move the check to the beginning of hl_wait_ioctl(). It will block any attempt to wait on CS or user interrupt once the device is already marked as non-operational. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-09-01habanalabs: disable IRQ in user interrupts spinlockOded Gabbay1-12/+13
Because this spinlock is taken in an interrupt handler, we must use the spin_lock_irqsave/irqrestore version to disable the interrupts on the local CPU. Otherwise, we can have a potential deadlock (if the interrupt handler is scheduled to run on the same cpu that the code who took the lock was running on). Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-09-01habanalabs: add "in device creation" statusOmer Shpigelman4-16/+17
On init, the disabled state is cleared right before hw_init and that causes the device to report on "Operational" state before the device initialization is finished. Although the char device is not yet exposed to the user at this stage, the sysfs entries are exposed. This can cause errors in monitoring applications that use the sysfs entries. In order to avoid this, a new state "in device creation" is introduced to ne reported when the device is not disabled but is still in init flow. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-09-01habanalabs: add userptr_lookup node in debugfsYuri Nudelman3-19/+93
It is useful to have the ability to see which user address was pinned to which physical address during the initial mapping. We already have all that info stored, but no means to search this data (which may be quite large). Signed-off-by: Yuri Nudelman <ynudelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-09-01habanalabs: modify multi-CS to wait on stream mastersOhad Sharabi3-27/+48
During the integration, the multi-CS requirements were refined: - The multi CS call shall wait on "per-ASIC" predefined stream masters instead of set of streams. - Stream masters are set of QIDs used by the upper SW layers (synapse) for completion (must be an external/HW queue). Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-09-01habanalabs/gaudi: increase boot fit timeoutOded Gabbay1-0/+4
Various f/w versions have different timeouts, so increase the default timeout to accommodate all the options. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-09-01habanalabs: fix mmu node address resolution in debugfsYuri Nudelman1-1/+1
The address resolution via debugfs was not taking into consideration the page offset, resulting in a wrong address. Signed-off-by: Yuri Nudelman <ynudelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-09-01habanalabs: save pid per userptrYuri Nudelman3-4/+7
Currently userptr endpoint in debugfs prints out virtual addresses in the user process memory space, without specifying their owner process ID. User space virtual address is meaningless without knowing the owner process. Signed-off-by: Yuri Nudelman <ynudelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-09-01habanalabs: clear msg_to_cpu_reg to avoid misread after resetKoby Elbaz1-16/+12
For some ASICs, the f/w reads the msg_to_cpu_reg value after reset, and for some it doesn't. Therefore, to be sure f/w doesn't read a wrong value after reset, we need to clear this register before the reset occurs. Signed-off-by: Koby Elbaz <kelbaz@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-09-01habanalabs: make set_pci_regions asic functionOhad Sharabi1-0/+2
In order to better support variants of the same ASIC the set_pci_regions function is now an ASIC function which allows each ASIC to implement it internally, thus keeping all definitions static to the file. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-09-01habanalabs: convert PCI BAR offset to u64Ohad Sharabi1-1/+1
Done as the bar size can exceed 4GB. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-09-01habanalabs: expose server type in INFO IOCTLOded Gabbay3-1/+6
Add the server type property to the hl_info_hw_ip_info structure that is exposed to the user via the INFO IOCTL. This is needed by the userspace s/w stack to know the connections map of the internal links that connect the ASIC among themselves inside the server. The F/W will tell us, as part of the NIC information, the server type that the GAUDI is located in. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-08-29habanalabs: remove redundant warning messageOded Gabbay1-3/+0
This warning is redundant as we will print a notice in case the device is still in use after the FD was closed. No need to print the same message per context. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-08-29habanalabs: add support for encapsulated signals submissionfarah kassabri4-248/+262
This commit is the second part of the encapsulated signals feature. It contains the driver support for submission of cs with encapsulated signals and the wait for them. Signed-off-by: farah kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-08-29habanalabs: add support for encapsulated signals reservationfarah kassabri5-14/+358
The signaling from within encapsulated OP capability is merged into the existing stream architecture, such that one can trigger multiple signaling from an encapsulated op, according to the time the event was done in the graph execution and avoid the need to wait for the whole encapsulated OP execution to be complete before the stream can signal. This commit implements only the reserve/unreserve part. Signed-off-by: farah kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-08-29habanalabs: signal/wait change sync object reset flowfarah kassabri3-49/+113
Currently the SOB reset was in fence release function which happens only at the CS wraparound during the CS allocation time. In order to support the new encapsulated signals reservation feature, we need to move the SOB reset to an earlier phase because this SOB could reach it's max value very fast using the signal reservation. Signed-off-by: farah kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-08-29habanalabs: add wait-for-multi-CS uAPIOhad Sharabi5-3/+533
When user sends multiple CSs, waiting for each CS is not efficient as it involves many user-kernel context switches. In order to address this issue we add support to "wait on multiple CSs" using a new uAPI which can wait on maximum of 32 CSs. The new uAPI is defined using a new flag - WAIT_FOR_MULTI_CS - in the wait_for_cs IOCTL. The input parameters for this uAPI will be: @seq: user pointer to an array of up to 32 CS's sequence numbers. @seq_array_len: length of sequence array. @timeout_us: timeout for waiting for any CS. The output paramateres for this API will be: @status: multi CS ioctl completion status (dedicated status was added as well). @flags: bitmap of output flags of the CS. @cs_completion_map: bitmap for multi CS, if CS sequence that was placed in index N in input seq array has completed- the N-th bit in cs_completion_map will be 1, otherwise it will be 0. @timestamp_nsec: timestamp of the first completed CS Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>