summaryrefslogtreecommitdiff
path: root/drivers/misc/habanalabs/include
AgeCommit message (Collapse)AuthorFilesLines
2023-01-26habanalabs: move driver to accel subsystemOded Gabbay378-251969/+0
Now that we have a subsystem for compute accelerators, move the habanalabs driver to it. This patch only moves the files and fixes the Makefiles. Future patches will change the existing code to register to the accel subsystem and expose the accel device char files instead of the habanalabs device char files. Update the MAINTAINERS file to reflect this change. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-01-26habanalabs: pass-through request from user to f/wfarah kassabri1-7/+58
Add a uAPI, as part of the INFO IOCTL, to allow users to send requests directly to f/w, according to a pre-defined set of opcodes that the f/w exposes. The f/w will put the result in a kernel-allocated buffer, which the driver will then copy to the user-supplied buffer. This will allow f/w tools to communicate directly with the f/w without the need to add a new uAPI to the driver for each new type of request. Signed-off-by: farah kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-01-26habanalabs: support receiving ascii message from preboot f/wTal Cohen1-0/+19
An Ascii message that is sent from preboot towards the driver will indicate the specific error that occurred on the f/w. This commit supports that message and parse the ascii string in order to print it into the kernel log The commit also changes the way the descriptor struct is declared. While its size increased (it now above 1024 bytes), it will be allocated by using kmalloc instead of stack declaration. Signed-off-by: Tal Cohen <talcohen@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-01-26habanalabs/gaudi2: support abrupt device reset eventOfir Bitton2-0/+3
In certain scenarios, firmware might encounter a fatal event for which a device reset is required. Hence, a proper notification is needed for driver to be aware and initiate a reset sequence. In secured environments the reset will be performed by firmware without an explicit request from the driver. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-01-26habanalabs: read binning info from prebootfarah kassabri1-9/+22
Sometimes we need the binning info at a very early state of the driver initialization. Therefore, support was added in preboot to provide the binning info as part of the f/w descriptor and the driver can now use that. Signed-off-by: farah kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23habanalabs/gaudi2: implement fp32 not supported eventOfir Bitton2-1/+4
Due to binning, Gaudi2 does not always support fp32. We add support for such an event in case fp32 is used by the user in such a device. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-11-23habanalabs/gaudi2: add PCI revision 2 supportOfir Bitton1-0/+7
Add support for Gaudi2 Device with PCI revision 2. Functionality is exactly the same as revision 1, the only difference is device name exposed to user. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-19habanalabs/gaudi2: allow user to flush PCIE by readOfir Bitton2-0/+186
In order for the user to flush PCIE he needs to read some register from PCIE block. The chosen register is SPECIAL_GLBL_SPARE_0 and hence needs to be unsecured. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-19habanalabs/gaudi2: add secured attestation info uapiDani Liberman1-2/+75
User will provide a nonce via the ioctl, and will retrieve secured attestation data of the boot, generated using given nonce. Signed-off-by: Dani Liberman <dliberman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-19habanalabs/gaudi2: get f/w reset status register dynamicallyfarah kassabri1-1/+3
Get the firmware reset status address from the dynamic registers we read from the firmware instead of using a define. Signed-off-by: farah kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-19habanalabs: add support for new cpucp return codesOfir Bitton1-1/+16
Firmware now responds with a more detailed cpucp return codes. Driver can now distinguish between error and debug return codes. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-19habanalabs: send device active message to f/wfarah kassabri1-0/+11
As part of the RAS that is done by the f/w, we should send a message to the f/w when a user either acquires or releases the device. Signed-off-by: farah kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-18habanalabs: ignore EEPROM errors during bootOfir Bitton1-0/+5
EEPROM errors reported by firmware are basically warnings and should not fail the boot process. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-18habanalabs/gaudi2: new API to control engine cores running modeTal Cohen1-0/+1
The current flow of halting the engine cores is implemented by command buffers built by the user space and sent towards the Driver. This current flow is broken since the user space does not know when the cores actually halt as sending a workload is async op. Therefore the application can not free the memory that is mapped to the engine cores. This new API allows the user space to control the running mode. The API call is sync (returns after the cores are set to the requested mode). Signed-off-by: Tal Cohen <talcohen@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-18habanalabs: remove left-over code from bring-upOded Gabbay1-7/+21
There is some left-over code from the gaudi2 bring-up that wasn't removed so far. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-18habanalabs/gaudi2: remove old interrupt mappingsOfir Bitton1-57/+0
Interrupt enumration has changed some time ago but the old mapping was accidentally left in the driver. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-07-12habanalabs/gaudi2: map virtual MSI-X doorbell memory for userTomer Tayar1-0/+3
Upon the initialization of a user context, map the host memory page of the virtual MSI-X doorbell in the device MMU. A reserved VA is used for this purpose, so user can use it directly without any allocation/map operation. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-07-12habanalabs/gaudi2: modify decoder to use virtual MSI-X doorbellTomer Tayar1-0/+6
Modify the decoder wrapper blocks to generate interrupts using the virtual MSI-X doorbell. As a decoder wrapper block cannot write directly to HBW upon completion, it writes instead to SOB which is monitored by a master monitor. When resolved, this monitor will be the one to actually write to the virtual MSI-X doorbell. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-07-12habanalabs/gaudi2: reset device upon critical ECC eventOfir Bitton1-1/+1
Correctable ECC events are not fatal, but as they accumulate, the f/w can decide that a hard-rest is required. This indication is propagated to the host using the existing ECC event interface. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-07-12habanalabs: add gaudi2 asic-specific codeOded Gabbay2-17/+284
Add the ASIC-specific code for Gaudi2. Supply (almost) all of the function callbacks that the driver's common code need to initialize, finalize and submit workloads to the Gaudi2 ASIC. It also contains the code to initialize the F/W of the Gaudi2 ASIC and to receive events from the F/W. It contains new debugfs entry to dump razwi events. razwi is a case where the device's engines create a transaction that reaches an invalid destination. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-07-12habanalabs/gaudi2: add asic registers header filesOded Gabbay168-2/+136492
Add the relevant GAUDI2 ASIC registers header files. These files are generated automatically from a tool maintained by the VLSI engineers. There are more files which are not upstreamed because only very few defines from those files are used in the driver. For those files, I copied the relevant defines into gaudi2_regs.h and gaudi2_masks.h, to reduce the size of this patch. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-07-12habanalabs/gaudi: enable error interrupt on ARB WDTOded Gabbay1-0/+1
We want to receive an error interrupt in case the watchdog timer expires on arbitration event in the queues. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-07-12habanalabs/goya: move dma direction enum to uapi fileOded Gabbay1-12/+0
The values in this enum are not used by h/w but are a contract between userspace and the kernel driver so they must be defined in the uapi file. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-07-12habanalabs: add critical indication in sram eccran shalit1-1/+2
Multiple SRAM SERR events are treated as critical events, and host should be notified about it. Thus, adding is_critical indication as part of SRAM ECC failure packet. Signed-off-by: ran shalit <rshalit@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-05-22habanalabs: update firmware headerOded Gabbay1-1/+31
Update cpucp_if.h to latest version. Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-22habanalabs/gaudi: add debugfs to fetch internal sync statusOhad Sharabi1-0/+38
When Gaudi device is secured the monitors data in the configuration space is blocked from PCI access. As we need to enable user to get sync-manager monitors registers when debugging, this patch adds a debugfs that dumps the information to a binary file (blob). When a root user will trigger the dump, the driver will send request to the f/w to fill a data structure containing dump of all monitors registers. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-22habanalabs: convert all MMU masks/shifts to arraysOhad Sharabi1-0/+10
There is no need to hold each MMU mask/shift as a denoted structure member (e.g. hop0_mask). Instead converting it to array will result in smaller and more readable code. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-28Merge tag 'char-misc-5.18-rc1' of ↵Linus Torvalds3-0/+17
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc and other driver updates from Greg KH: "Here is the big set of char/misc and other small driver subsystem updates for 5.18-rc1. Included in here are merges from driver subsystems which contain: - iio driver updates and new drivers - fsi driver updates - fpga driver updates - habanalabs driver updates and support for new hardware - soundwire driver updates and new drivers - phy driver updates and new drivers - coresight driver updates - icc driver updates Individual changes include: - mei driver updates - interconnect driver updates - new PECI driver subsystem added - vmci driver updates - lots of tiny misc/char driver updates All of these have been in linux-next for a while with no reported problems" * tag 'char-misc-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (556 commits) firmware: google: Properly state IOMEM dependency kgdbts: fix return value of __setup handler firmware: sysfb: fix platform-device leak in error path firmware: stratix10-svc: add missing callback parameter on RSU arm64: dts: qcom: add non-secure domain property to fastrpc nodes misc: fastrpc: Add dma handle implementation misc: fastrpc: Add fdlist implementation misc: fastrpc: Add helper function to get list and page misc: fastrpc: Add support to secure memory map dt-bindings: misc: add fastrpc domain vmid property misc: fastrpc: check before loading process to the DSP misc: fastrpc: add secure domain support dt-bindings: misc: add property to support non-secure DSP misc: fastrpc: Add support to get DSP capabilities misc: fastrpc: add support for FASTRPC_IOCTL_MEM_MAP/UNMAP misc: fastrpc: separate fastrpc device from channel context dt-bindings: nvmem: brcm,nvram: add basic NVMEM cells dt-bindings: nvmem: make "reg" property optional nvmem: brcm_nvram: parse NVRAM content into NVMEM cells nvmem: dt-bindings: Fix the error of dt-bindings check ...
2022-02-28habanalabs/gaudi: add missing handling of NIC related eventsOded Gabbay1-0/+10
There are a few events that can arrive from the f/w and without proper handling can cause errors to appear in the kernel log without reason. Add the relevant handling that was missing. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-02-28habanalabs: update to latest f/w specsOded Gabbay1-0/+5
Copy the latest versions of the f/w specs files. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-02-28habanalabs: sysfs support for fw os versionRajaravi Krishna Katta1-0/+2
Adds new sysfs entry to display firmware os version /sys/class/habanalabs/hl<n>/fw_os_ver Signed-off-by: Rajaravi Krishna Katta <rkatta@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-02-17treewide: Replace zero-length arrays with flexible-array membersGustavo A. R. Silva3-7/+7
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. This code was transformed with the help of Coccinelle: (next-20220214$ spatch --jobs $(getconf _NPROCESSORS_ONLN) --sp-file script.cocci --include-headers --dir . > output.patch) @@ identifier S, member, array; type T1, T2; @@ struct S { ... T1 member; T2 array[ - 0 ]; }; UAPI and wireless changes were intentionally excluded from this patch and will be sent out separately. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://www.kernel.org/doc/html/v5.16/process/deprecated.html#zero-length-and-one-element-arrays Link: https://github.com/KSPP/linux/issues/78 Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2021-12-26habanalabs: add CPU-CP packet for engine core ASID cfgTomer Tayar1-0/+5
In some cases the driver cannot configure ASID of some engines due to the security level of the relevant registers. For this a new CPU-CP packet is introduced, which will allow the driver to ask the F/W to do this configuration instead. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-12-26habanalabs: clean MMU headers definitionsOhad Sharabi3-21/+36
During the MMU development the MMU header files were left with unclean definitions: - MMU "version specific" definitions that were left in the mmu_general file - unused definitions This patch attempts, where possible, to keep definitions that can serve multiple MMU versions (but that are not tightly bound with specific MMU arch) in the mmu_general header file (e.g. different definitions for number of HOPs). Otherwise, move MMU version specific definitions (e.g. HOPs masks and shifts) to the specific MMU version file. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-12-26habanalabs: sysfs support for two infineon versionsOfir Bitton1-3/+10
Currently sysfs support dumping a single infineon version, in future asics we will have two infineon versions. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-12-26habanalabs: handle device TPM boot error as warningOfir Bitton1-0/+4
AS TPM error indication is not fatal, driver should dump a warning and continue booting. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-12-26habanalabs: debugfs support for larger I2C transactionsOfir Bitton1-1/+8
I2C debugfs support is limited to 1 byte. We extend functionality to more than 1 byte by using one of the pad fields as a length. No backward compatibility issues as new F/W versions will treat 0 length as a 1 byte length transaction. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-12-26habanalabs: add new opcodes for INFO IOCTLfarah kassabri1-1/+32
Add implementation for new opcodes in the INFO IOCTL: 1. Retrieve the replaced DRAM rows from f/w. 2. Retrieve the pending DRAM rows from f/w. Signed-off-by: farah kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-12-26habanalabs: add dedicated message towards f/w to set powerRajaravi Krishna Katta1-0/+4
CPUCP_PACKET_POWER_GET packet type was used for both hl_get_power() and hl_set_power(). To align with other sensor functions hl_set_power() should use CPUCP_PACKET_POWER_SET. This packet will only be used with newer ASICs, so need to add a compatibility flag to the asic properties to indicate whether to use this packet or the GET packet. Signed-off-by: Rajaravi Krishna Katta <rkatta@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-10-18habanalabs: update firmware filesOded Gabbay4-78/+130
Update the firmware headers to the latest version Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-10-18habanalabs: enable power info via HWMON frameworkRajaravi Krishna Katta1-0/+10
Add support to retrieve following power info via HWMON: - instantaneous power value - highest value since last reset - reset the highest place holder Signed-off-by: Rajaravi Krishna Katta <rkatta@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-10-18habanalabs: create static map of f/w hwmon enumsRajaravi Krishna Katta1-0/+6
Instead of using the Linux kernel HWMON enums definition when communicating with the firmware, use proprietary HWMON based enums i.e. map hwmon.h header enum to cpucp_if.h based enum while. This is needed because the HWMON enums are not forcing backward compatibility and therefore changes can break compatibility between newer driver and older firmware. The driver will check for CPU_BOOT_DEV_STS0_MAP_HWMON_EN bit to validate if f/w supports cpucp->hwmon enum mapping to support older firmware where this mapping won't be available. Signed-off-by: Rajaravi Krishna Katta <rkatta@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-09-14habanalabs/gaudi: use direct MSI in single modeOmer Shpigelman1-0/+2
Due to FLR scenario when running inside a VM, we must not use indirect MSI because it might cause some issues on VM destroy. In a VM we use single MSI mode in contrary to multi MSI mode which is used in bare-metal. Hence direct MSI should be used in single MSI mode only. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-09-01habanalabs: update to latest firmware headersOded Gabbay2-6/+121
Add several new packets between driver and firmware. Add matching compatibility bits for backward compatibility. Add support for 4K event types. Add information about pcie errors. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-09-01habanalabs: expose server type in INFO IOCTLOded Gabbay1-0/+11
Add the server type property to the hl_info_hw_ip_info structure that is exposed to the user via the INFO IOCTL. This is needed by the userspace s/w stack to know the connections map of the internal links that connect the ASIC among themselves inside the server. The F/W will tell us, as part of the NIC information, the server type that the GAUDI is located in. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-08-29habanalabs: update firmware header to latest versionOded Gabbay1-1/+3
Add two new fields regarding interrupts communication between driver and f/w. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-08-29habanalabs/gaudi: implement state dumpYuri Nudelman2-0/+20
At the first stage, only gaudi core dump shall be implemented, not including the status registers. Signed-off-by: Yuri Nudelman <ynudelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-08-29habanalabs: update firmware header filesOfir Bitton2-6/+31
Update recent changes made in firmware header files, which contain a minor COMMS protocol change and new error status definitions. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-06-21habanalabs/gaudi: add support for NIC DERROfir Bitton2-5/+15
We add support for NIC DERR ECC error events, in case this error is received a device reset will be performed. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-06-18habanalabs/gaudi: correct driver events numberingOfir Bitton2-12/+12
Currently driver sends fc interrupt id to FW instead of using cpu interrupt id. We intend to fix that and keep backward compatibility by using the same interrupt values. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>