summaryrefslogtreecommitdiff
path: root/drivers/misc/habanalabs/include/gaudi
AgeCommit message (Collapse)AuthorFilesLines
2023-01-26habanalabs: move driver to accel subsystemOded Gabbay100-81039/+0
Now that we have a subsystem for compute accelerators, move the habanalabs driver to it. This patch only moves the files and fixes the Makefiles. Future patches will change the existing code to register to the accel subsystem and expose the accel device char files instead of the habanalabs device char files. Update the MAINTAINERS file to reflect this change. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-07-12habanalabs/gaudi: enable error interrupt on ARB WDTOded Gabbay1-0/+1
We want to receive an error interrupt in case the watchdog timer expires on arbitration event in the queues. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-03-28Merge tag 'char-misc-5.18-rc1' of ↵Linus Torvalds1-0/+10
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc and other driver updates from Greg KH: "Here is the big set of char/misc and other small driver subsystem updates for 5.18-rc1. Included in here are merges from driver subsystems which contain: - iio driver updates and new drivers - fsi driver updates - fpga driver updates - habanalabs driver updates and support for new hardware - soundwire driver updates and new drivers - phy driver updates and new drivers - coresight driver updates - icc driver updates Individual changes include: - mei driver updates - interconnect driver updates - new PECI driver subsystem added - vmci driver updates - lots of tiny misc/char driver updates All of these have been in linux-next for a while with no reported problems" * tag 'char-misc-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (556 commits) firmware: google: Properly state IOMEM dependency kgdbts: fix return value of __setup handler firmware: sysfb: fix platform-device leak in error path firmware: stratix10-svc: add missing callback parameter on RSU arm64: dts: qcom: add non-secure domain property to fastrpc nodes misc: fastrpc: Add dma handle implementation misc: fastrpc: Add fdlist implementation misc: fastrpc: Add helper function to get list and page misc: fastrpc: Add support to secure memory map dt-bindings: misc: add fastrpc domain vmid property misc: fastrpc: check before loading process to the DSP misc: fastrpc: add secure domain support dt-bindings: misc: add property to support non-secure DSP misc: fastrpc: Add support to get DSP capabilities misc: fastrpc: add support for FASTRPC_IOCTL_MEM_MAP/UNMAP misc: fastrpc: separate fastrpc device from channel context dt-bindings: nvmem: brcm,nvram: add basic NVMEM cells dt-bindings: nvmem: make "reg" property optional nvmem: brcm_nvram: parse NVRAM content into NVMEM cells nvmem: dt-bindings: Fix the error of dt-bindings check ...
2022-02-28habanalabs/gaudi: add missing handling of NIC related eventsOded Gabbay1-0/+10
There are a few events that can arrive from the f/w and without proper handling can cause errors to appear in the kernel log without reason. Add the relevant handling that was missing. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-02-17treewide: Replace zero-length arrays with flexible-array membersGustavo A. R. Silva1-2/+2
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. This code was transformed with the help of Coccinelle: (next-20220214$ spatch --jobs $(getconf _NPROCESSORS_ONLN) --sp-file script.cocci --include-headers --dir . > output.patch) @@ identifier S, member, array; type T1, T2; @@ struct S { ... T1 member; T2 array[ - 0 ]; }; UAPI and wireless changes were intentionally excluded from this patch and will be sent out separately. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://www.kernel.org/doc/html/v5.16/process/deprecated.html#zero-length-and-one-element-arrays Link: https://github.com/KSPP/linux/issues/78 Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2021-10-18habanalabs: update firmware filesOded Gabbay2-6/+5
Update the firmware headers to the latest version Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-09-14habanalabs/gaudi: use direct MSI in single modeOmer Shpigelman1-0/+2
Due to FLR scenario when running inside a VM, we must not use indirect MSI because it might cause some issues on VM destroy. In a VM we use single MSI mode in contrary to multi MSI mode which is used in bare-metal. Hence direct MSI should be used in single MSI mode only. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-08-29habanalabs/gaudi: implement state dumpYuri Nudelman2-0/+20
At the first stage, only gaudi core dump shall be implemented, not including the status registers. Signed-off-by: Yuri Nudelman <ynudelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-08-29habanalabs: update firmware header filesOfir Bitton1-2/+0
Update recent changes made in firmware header files, which contain a minor COMMS protocol change and new error status definitions. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-06-21habanalabs/gaudi: add support for NIC DERROfir Bitton2-5/+15
We add support for NIC DERR ECC error events, in case this error is received a device reset will be performed. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-06-18habanalabs/gaudi: correct driver events numberingOfir Bitton2-12/+12
Currently driver sends fc interrupt id to FW instead of using cpu interrupt id. We intend to fix that and keep backward compatibility by using the same interrupt values. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-06-18habanalabs/gaudi: add FW alive event supportOfir Bitton2-1/+2
In order for driver to be aware of process or thread crashes inside GAUDI's CPU, we introduce a new event which contains all relevant information. Upon event reception, driver will dump information and will reset the device. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-06-18habanalabs/gaudi: update to latest f/w specsOded Gabbay1-0/+7
Update the firmware interface files to their latest version. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-06-18habanalabs/gaudi: split host irq interfaces towards FWOfir Bitton1-0/+4
Current implementation uses a single interrupt interface towards FW, this interface is causing races between interrupt types. We split this interface to interface per interrupt type. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-06-18habanalabs/gaudi: add ARB to QM stop on error masksTomer Tayar1-5/+10
Update the QM stop on error masks to also stop on ARB errors. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-06-18habanalabs/gaudi: use scratchpad regs instead of GIC controllerKoby Elbaz1-0/+6
Due to new security restrictions, GIC controller can no longer be accessed from user/kernel. To monitor that, a new status bit will be read from preboot caps, indicating whether direct access to GIC is blocked. In case it is blocked, driver will use scratchpad registers instead of using GIC interface on two main scenarios: The first of which LKD triggers interrupts to F/W through GIC, and the second of when LKD configures all engines/QMANs to write to GIC when they want to report an error. From F/W perspective, it will poll on all SPs, and once IRQ number is retrieved, SP register is cleared, and it will perform the write to the GIC to trigger the IRQ handler. Signed-off-by: Koby Elbaz <kelbaz@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-06-18habanalabs: update to latest f/w headersOded Gabbay1-0/+39
Update the common and GAUDI firmware header files to the latest version. The latest version use the correct endianness types so this commit also contains minor changes to the code to use the correct conversions when reading/writing to the firmware structures. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-04-09habanalabs: support legacy and new pll indexesOhad Sharabi1-14/+0
In order to use minimum of hard coded values common to LKD and F/W a dynamic method to work with PLLs is introduced in this patch. Formerly asic specific PLL numbering is now common for all asics. To be backward compatible a bit in dev status is defined, if the bit is not set LKD will keep working with old PLL numbering. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-04-09habanalabs/gaudi: Update async events headerOfir Bitton2-13/+24
Update with latest version from the Firmware team. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-04-09habanalabs/gaudi: reset device upon BMC requestOfir Bitton2-1/+2
In case the BMC of the devices' box wants to initiate a reset of a specific device, it must go through driver. Once driver will receive the request it will initiate a hard reset flow. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-04-09habanalabs/gaudi: update extended async event headerOfir Bitton1-5/+5
Update to the latest definition of the firmware Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-04-09habanalabs: reset device in case of sync errorOhad Sharabi1-0/+1
As the F/wW is the first to detect out of sync event, a new event is added to notify the driver on such event. In which case the driver performs hard reset. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-04-09habanalabs: set max asid to 2farah kassabri1-1/+1
currently we support only 2 asids in all asics. asid 0 for driver, and asic 1 for user. no need to setup 1024 asids configurations at init phase. Signed-off-by: farah kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs: fix ETR security issueOhad Sharabi1-1/+4
ETR should always be non-secured as it is used by the users to record profiling/trace data. This patch fixes the configuration to match those requirements. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs/gaudi: print sync manager SEI interrupt infoOfir Bitton1-0/+4
Driver must print sync manager SEI information upon receiving interrupt from FW. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs/gaudi: remove PCI access to SM blockOfir Bitton1-0/+3
Due to HW limitation we must remove all direct access to SM registers, in order to do that we will access SM registers using the HW QMANS. When possible and no user context is present, we can directly access the HW QMANS. Whenever there is an active user, driver will prepare a pending command buffer list which will be sent upon user submissions. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27habanalabs/gaudi: remove duplicated gaudi packets masksOfir Bitton1-24/+0
As all packets use the same CTL register masks, we remove duplicated masks and use common masks instead. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs: update firmware filesOded Gabbay1-1/+2
Update various firmware header files with new defines. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs: fetch pll frequency from firmwareAlon Mizrahi3-231/+11
Once firmware security is enabled, driver must fetch pll frequencies through the firmware message interface instead of reading the registers directly. Signed-off-by: Alon Mizrahi <amizrahi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs: fetch security indication from FWOfir Bitton1-0/+2
Add support for fetching security indication from FW. This indication is needed in order to skip unnecessary initializations done by FW. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs/gaudi: add NIC firmware-related definitionsOded Gabbay1-0/+24
Add new structures and messages that the driver use to interact with the firmware to receive information and events (errors) about GAUDI's NIC. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30habanalabs/gaudi: add NIC QMAN H/W and registers definitionsOded Gabbay13-1/+9168
Add auto-generated header files that describe the NIC QMANs registers used by the driver. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-04habanalabs/gaudi: mask WDT error in QMANOded Gabbay1-1/+0
This interrupt cause is not relevant because of how the user use the QMAN arbitration mechanism. We must mask it as the log explodes with it. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-09-22habanalabs: update scratchpad register mapOded Gabbay1-0/+1
Our firmware use some scratchpad registers in the device for different roles. Update the file to the latest version of the firmware code. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22habanalabs: update GAUDI hardware specsOded Gabbay1-0/+2
Add define for the 2 MME slave engines. Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22habanalabs: Include linux/bitfield.h only in habanalabs.hTomer Tayar1-1/+0
Include linux/bitfield.h only in habanalabs.h, instead of in each and every file that needs it, as habanalabs.h is already included by all. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22habanalabs: use FIELD_PREP() instead of <<Oded Gabbay1-152/+122
Use the standard FIELD_PREP() macro instead of << operator to perform bitmask operations. This ensures type check safety and eliminate compiler warnings. Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-08-31habanalabs: fix report of RAZWI initiator coordinatesOfir Bitton1-16/+16
All initiator coordinates received upon an 'MMU page fault RAZWI event' should be the routers coordinates, the only exception is the DMA initiators for which the reported coordinates correspond to their actual location. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24habanalabs: halt device CPU only upon certain resetOded Gabbay2-0/+4
Currently the driver halts the device CPU in the halt engines function, which halts all the engines of the ASIC. The problem is that if later on we stop the reset process (due to inability to clean memory mappings in time), the CPU will remain in halt mode. This creates many issues, such as thermal/power control and FLR handling. Therefore, move the halting of the device CPU to the very end of the reset process, just before writing to the registers to initiate the reset. In addition, the driver now needs to send a message to the device F/W to disable it from sending interrupts to the host machine because during halt engines function the driver disables the MSI/MSI-X interrupts. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai>
2020-07-24habanalabs: Extract ECC information from FWOded Gabbay1-11/+8
ECC (Error Correcting Code) interrupts are going to be handled by the FW. Hence, we define an interface in which the driver can obtain the relevant ECC information. This information is needed for monitoring and can also lead to a hard reset if ECC error is not correctable. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24habanalabs: calculate trace frequency from PLLAdam Aharon2-0/+115
The profiler needs to know the PLL values for correctly showing the profiling data. Because our firmware can use different PLL configurations, we need to read the PLL values from the ASIC to pass them to the profiler. Signed-off-by: Adam Aharon <aaharon@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24habanalabs: Use mask instead of shift in sync stream registersOfir Bitton1-2/+2
Use proper bitfield masks instead of shifting values when configuring packets sent to device. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-06-24habanalabs: block scalar load_and_exe on external queueOded Gabbay1-0/+3
In Gaudi, the user can't execute scalar load_and_exe on external queue because it can be a security hole. The driver doesn't parse the commands being loaded and it can be msg_prot, which the user isn't allowed to use. Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19habanalabs: move event handling to common firmware fileOfir Bitton2-687/+694
Instead of writing similar event handling code for each ASIC, move the code to the common firmware file. This code will be used for GAUDI and all future ASICs. In addition, add two new fields to the auto-generated events file: valid and description. This will save the need to manually write the events description in the source code and simplify the code. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19habanalabs: add gaudi asic-dependent codeOded Gabbay1-0/+687
Add the ASIC-dependent code for GAUDI. Supply (almost) all of the function callbacks that the driver's common code need to initialize, finalize and submit workloads to the GAUDI ASIC. It also contains the code to initialize the F/W of the GAUDI ASIC and to receive events from the F/W. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19habanalabs: add gaudi asic registers header filesOded Gabbay89-0/+71194
Add the relevant GAUDI ASIC registers header files. These files are generated automatically from a tool maintained by the VLSI engineers. There are more files which are not upstreamed because only very few defines from those files are used in the driver. For those files, we copied the relevant defines into gaudi_regs.h and gaudi_masks.h, to reduce the size of this patch. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>