starfive-tech/linux.git - StarFive Tech Linux Kernel for VisionFive (JH7110) boards (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2022-02-24	can: mcp251xfd: mcp251xfd_chip_stop(): convert to a void function	Marc Kleine-Budde	1	-3/+3
	The mcp251xfd_chip_stop() function tries the best to stop the chip and put it into sleep mode. It continues, even if some intermediate steps fail. As none of the callers use the return value, let this function return void. Link: https://lore.kernel.org/all/20220207131047.282110-6-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-02-24	can: mcp251xfd: mcp251xfd_chip_sleep(): introduce function to bring chip ↵	Marc Kleine-Budde	1	-8/+13
	into sleep mode This patch adds a new function to bring the chip into sleep mode, and replaces several occurrences of open coded variants. Link: https://lore.kernel.org/all/20220207131047.282110-5-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-02-24	can: mcp251xfd: mcp251xfd_unregister(): simplify runtime PM handling	Marc Kleine-Budde	1	-4/+4
	The mcp251xfd driver supports runtime PM enabled kernels, but also works on !CONFIG_PM configurations. This patch simplifies the runtime PM handling in the mcp251xfd_unregister(). In the CONFIG_PM case, runtime PM has been enabled in the mcp251xfd_probe() function, so we can disable it here. For !CONFIG_PM builds call mcp251xfd_clks_and_vdd_disable() directly. Link: https://lore.kernel.org/all/20220207131047.282110-4-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-02-24	can: mcp251xfd: mcp251xfd_regmap_crc_read(): ignore CRC error only if solely ↵	Marc Kleine-Budde	1	-1/+1
	OSC register is read MCP251XFD_REG_OSC is the first register the driver reads from. The chip may be in deep sleep and the SPI transfer (i.e. the assertion of the CS) will wake the chip up. This takes about 3ms. The CRC of this transfer is wrong, or there isn't any chip at all, in this case the CRC will be wrong, too. The driver ignores the CRC error and returns the read data to the caller. To avoid any confusion, this patch changes the mcp251xfd_regmap_crc_read() function to only ignore the CRC error if solely the OSC register is read. So when reading more than the OSC registers at once, CRC errors are not ignored. Link: https://lore.kernel.org/all/20220207131047.282110-3-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-02-24	can: mcp251xfd: mcp251xfd_reg_invalid(): rename from mcp251xfd_osc_invalid()	Marc Kleine-Budde	1	-6/+6
	This patch renames mcp251xfd_osc_invalid() to mcp251xfd_reg_invalid(), as it will be used for other registers than the "osc" register in a later patch. This patch also moves this function to more towards the beginning of the file, to be available for other functions, too. Link: https://lore.kernel.org/all/20220207131047.282110-2-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-02-24	can: etas_es58x: use BITS_PER_TYPE() instead of manual calculation	Vincent Mailhol	1	-1/+2
	The input to the GENMASK() macro was calculated by hand. Replaced it with a dedicated macro: BITS_PER_TYPE() which does the exact same job. Link: https://lore.kernel.org/all/20220212130737.3008-1-mailhol.vincent@wanadoo.fr Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-02-24	can: xilinx_can: Add check for NAPI Poll function	Srinivas Neeli	1	-4/+5
	Add check for NAPI poll function to avoid enabling interrupts with out completing the NAPI call. Link: https://lore.kernel.org/all/20220208162053.39896-1-srinivas.neeli@xilinx.com Signed-off-by: Srinivas Neeli <srinivas.neeli@xilinx.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-02-24	can: softing: softing_netdev_open(): remove redundant ret variable	Minghao Chi	1	-4/+1
	Return value from softing_startstop() directly instead of taking this in another redundant variable. Link: https://lore.kernel.org/all/20220112080629.667191-1-chi.minghao@zte.com.cn Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn> Signed-off-by: CGEL ZTE <cgel.zte@gmail.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-02-24	can: c_can: ethtool: use default drvinfo	Marc Kleine-Budde	1	-9/+0
	The ethtool core implements a default drvinfo. There's no need to replicate this in the driver, no additional information is added, so remove this and rely on the default. Link: https://lore.kernel.org/all/20220124215642.3474154-10-mkl@pengutronix.de Cc: Dario Binacchi <dariobin@libero.it> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-02-24	can: kvaser_usb: kvaser_usb_send_cmd(): remove redundant variable actual_len	Marc Kleine-Budde	1	-3/+1
	The function usb_bulk_msg() can be called with a NULL pointer as the "actual_length" parameter. This patch removes this variable. Link: https://lore.kernel.org/all/20220124215642.3474154-9-mkl@pengutronix.de Cc: Jimmy Assarsson <extja@kvaser.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-02-24	can: bittiming: mark function arguments and local variables as const	Marc Kleine-Budde	2	-9/+9
	This patch marks the arguments of some functions as well as some local variables as constant. Link: https://lore.kernel.org/all/20220124215642.3474154-7-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-02-24	can: bittiming: can_validate_bitrate(): simplify bit rate checking	Marc Kleine-Budde	1	-6/+2
	This patch simplifies the validation of the fixed bit rates. If a supported bit rate is found, directly return 0. If no valid bit rate is found return -EINVAL; Link: https://lore.kernel.org/all/20220124215642.3474154-6-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-02-24	can: gw: use call_rcu() instead of costly synchronize_rcu()	Eric Dumazet	1	-6/+10
	Commit fb8696ab14ad ("can: gw: synchronize rcu operations before removing gw job entry") added three synchronize_rcu() calls to make sure one rcu grace period was observed before freeing a "struct cgw_job" (which are tiny objects). This should be converted to call_rcu() to avoid adding delays in device / network dismantles. Use the rcu_head that was already in struct cgw_job, not yet used. Link: https://lore.kernel.org/all/20220207190706.1499190-1-eric.dumazet@gmail.com Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Oliver Hartkopp <socketcan@hartkopp.net> Tested-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-02-24	dt-binding: can: m_can: include common CAN controller bindings	Marc Kleine-Budde	1	-0/+3
	Since commit \| 1f9234401ce0 ("dt-bindings: can: add can-controller.yaml") there is a common CAN controller binding. Add this to the m_can binding. Link: https://lore.kernel.org/all/20220124220653.3477172-4-mkl@pengutronix.de Reviewed-by: Chandrasekar Ramakrishnan <rcsekar@samsung.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-02-24	dt-binding: can: m_can: fix indention of table in bosch,mram-cfg description	Marc Kleine-Budde	1	-2/+2
	This patch fixes the indention of the table in the description of the bosch,mram-cfg property. Link: https://lore.kernel.org/all/20220217101111.2291151-1-mkl@pengutronix.de Reviewed-by: Chandrasekar Ramakrishnan <rcsekar@samsung.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-02-24	dt-binding: can: m_can: list Chandrasekar Ramakrishnan as maintainer	Marc Kleine-Budde	1	-1/+1
	Since Sriram Dash's email bounces, change the maintainer entry to Chandrasekar Ramakrishnan. Chandrasekar Ramakrishnan is already listed as a maintainer in the MAINTAINERS file. Link: https://lore.kernel.org/all/20220217113839.2311417-1-mkl@pengutronix.de Cc: Chandrasekar Ramakrishnan <rcsekar@samsung.com> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-02-24	dt-binding: can: sun4i_can: include common CAN controller bindings	Marc Kleine-Budde	1	-0/+3
	Since commit \| 1f9234401ce0 ("dt-bindings: can: add can-controller.yaml") there is a common CAN controller binding. Add this to the sun4i_can binding. Link: https://lore.kernel.org/all/20220124220653.3477172-3-mkl@pengutronix.de Cc: Evgeny Boger <boger@wirenboard.com> Cc: Gerhard Bertelsmann <info@gerhard-bertelsmann.de> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-02-24	dt-binding: can: mcp251xfd: include common CAN controller bindings	Marc Kleine-Budde	1	-0/+3
	Since commit \| 1f9234401ce0 ("dt-bindings: can: add can-controller.yaml") there is a common CAN controller binding. Add this to the mcp251xfd binding. Link: https://lore.kernel.org/all/20220124220653.3477172-2-mkl@pengutronix.de Cc: Manivannan Sadhasivam <mani@kernel.org> Cc: Thomas Kopp <thomas.kopp@microchip.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-02-24	Merge branch 'add-ethtool-support-for-completion-queue-event-size'	Jakub Kicinski	10	-8/+55
	Subbaraya Sundeep says: ==================== Add ethtool support for completion queue event size After a packet is sent or received by NIC then NIC posts a completion queue event which consists of transmission status (like send success or error) and received status(like pointers to packet fragments). These completion events may also use a ring similar to rx and tx rings. This patchset introduces cqe-size ethtool parameter to modify the size of the completion queue event if NIC hardware has that capability. A bigger completion queue event can have more receive buffer pointers inturn NIC can transfer a bigger frame from wire as long as hardware(MAC) receive frame size limit is not exceeded. Patch 1 adds support setting/getting cqe-size via ethtool -G and ethtool -g. Patch 2 includes octeontx2 driver changes to use completion queue event size set from ethtool -G. ==================== Link: https://lore.kernel.org/r/1645555153-4932-1-git-send-email-sbhatta@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24	octeontx2-pf: Vary completion queue event size	Subbaraya Sundeep	5	-5/+21
	Completion Queue Entry(CQE) is a descriptor written by hardware to notify software about the send and receive completion status. The CQE can be of size 128 or 512 bytes. A 512 bytes CQE can hold more receive fragments pointers compared to 128 bytes CQE. This patch enables to modify CQE size using: <ethtool -G cqe-size N>. Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24	ethtool: add support to set/get completion queue event size	Subbaraya Sundeep	5	-3/+34
	Add support to set completion queue event size via ethtool -G parameter and get it via ethtool -g parameter. ~ # ./ethtool -G eth0 cqe-size 512 ~ # ./ethtool -g eth0 Ring parameters for eth0: Pre-set maximums: RX: 1048576 RX Mini: n/a RX Jumbo: n/a TX: 1048576 Current hardware settings: RX: 256 RX Mini: n/a RX Jumbo: n/a TX: 4096 RX Buf Len: 2048 CQE Size: 128 Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24	Merge tag 'mlx5-fixes-2022-02-23' of ↵	Jakub Kicinski	24	-94/+236
	git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2022-02-22 This series provides bug fixes to mlx5 driver. Please pull and let me know if there is any problem. * tag 'mlx5-fixes-2022-02-23' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5e: Fix VF min/max rate parameters interchange mistake net/mlx5e: Add missing increment of count net/mlx5e: MPLSoUDP decap, fix check for unsupported matches net/mlx5e: Fix MPLSoUDP encap to use MPLS action information net/mlx5e: Add feature check for set fec counters net/mlx5e: TC, Skip redundant ct clear actions net/mlx5e: TC, Reject rules with forward and drop actions net/mlx5e: TC, Reject rules with drop and modify hdr action net/mlx5e: kTLS, Use CHECKSUM_UNNECESSARY for device-offloaded packets net/mlx5e: Fix wrong return value on ioctl EEPROM query failure net/mlx5: Fix possible deadlock on rule deletion net/mlx5: Fix tc max supported prio for nic mode net/mlx5: Fix wrong limitation of metadata match on ecpf net/mlx5: Update log_max_qp value to be 17 at most net/mlx5: DR, Fix the threshold that defines when pool sync is initiated net/mlx5: DR, Don't allow match on IP w/o matching on full ethertype/ip_version net/mlx5: DR, Fix slab-out-of-bounds in mlx5_cmd_dr_create_fte net/mlx5: DR, Cache STE shadow memory net/mlx5: Update the list of the PCI supported devices ==================== Link: https://lore.kernel.org/r/20220224001123.365265-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24	Merge tag 'devicetree-fixes-for-5.17-2' of ↵	Linus Torvalds	10	-22/+13
	git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree fixes from Rob Herring: - Update some maintainers email addresses - Fix handling of elfcorehdr reservation for crash dump kernel - Fix unittest expected warnings text * tag 'devicetree-fixes-for-5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: dt-bindings: update Roger Quadros email MAINTAINERS: sifive: drop Yash Shah of/fdt: move elfcorehdr reservation early for crash dump kernel of: unittest: update text of expected warnings
2022-02-24	Merge tag 'selinux-pr-20220223' of ↵	Linus Torvalds	1	-2/+2
	git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux fix from Paul Moore: "A second small SELinux fix which addresses an incorrect mutex_is_locked() check" * tag 'selinux-pr-20220223' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: fix misuse of mutex_is_locked()
2022-02-24	net/mlx5e: Fix VF min/max rate parameters interchange mistake	Gal Pressman	1	-1/+1
	The VF min and max rate were passed incorrectly and resulted in wrongly interchanging them. Fix the order of parameters in mlx5_esw_qos_set_vport_rate(). Fixes: d7df09f5e7b4 ("net/mlx5: E-switch, Enable vport QoS on demand") Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-24	net/mlx5e: Add missing increment of count	Lama Kayal	1	-0/+1
	Add mistakenly missing increment of count variable when looping over output buffer in mlx5e_self_test(). This resolves the issue of garbage values output when querying with self test via ethtool. before: $ ethtool -t eth2 The test result is PASS The test extra info: Link Test 0 Speed Test 1768697188 Health Test 758528120 Loopback Test 3288687 after: $ ethtool -t eth2 The test result is PASS The test extra info: Link Test 0 Speed Test 0 Health Test 0 Loopback Test 0 Fixes: 7990b1b5e8bd ("net/mlx5e: loopback test is not supported in switchdev mode") Signed-off-by: Lama Kayal <lkayal@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-24	net/mlx5e: MPLSoUDP decap, fix check for unsupported matches	Maor Dickman	1	-17/+11
	Currently offload of rule on bareudp device require tunnel key in order to match on mpls fields and without it the mpls fields are ignored, this is incorrect due to the fact udp tunnel doesn't have key to match on. Fix by returning error in case flow is matching on tunnel key. Fixes: 72046a91d134 ("net/mlx5e: Allow to match on mpls parameters") Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-24	net/mlx5e: Fix MPLSoUDP encap to use MPLS action information	Maor Dickman	7	-3/+32
	Currently the MPLSoUDP encap builds the MPLS header using encap action information (tunnel id, ttl and tos) instead of the MPLS action information (label, ttl, tc and bos) which is wrong. Fix by storing the MPLS action information during the flow action parse and later using it to create the encap MPLS header. Fixes: f828ca6a2fb6 ("net/mlx5e: Add support for hw encapsulation of MPLS over UDP") Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-24	net/mlx5e: Add feature check for set fec counters	Lama Kayal	1	-3/+3
	Fec counters support is checked via the PCAM feature_cap_mask, bit 0: PPCNT_counter_group_Phy_statistical_counter_group. Add feature check to avoid faulty behavior. Fixes: 0a1498ebfa55 ("net/mlx5e: Expose FEC counters via ethtool") Signed-off-by: Lama Kayal <lkayal@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-24	net/mlx5e: TC, Skip redundant ct clear actions	Roi Dayan	2	-0/+8
	Offload of ct clear action is just resetting the reg_c register. It's done by allocating modify hdr resources which is limited. Doing it multiple times is redundant and wasting modify hdr resources and if resources depleted the driver will fail offloading the rule. Ignore redundant ct clear actions after the first one. Fixes: 806401c20a0f ("net/mlx5e: CT, Fix multiple allocations and memleak of mod acts") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Ariel Levkovich <lariel@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-24	net/mlx5e: TC, Reject rules with forward and drop actions	Roi Dayan	1	-0/+6
	Such rules are redundant but allowed and passed to the driver. The driver does not support offloading such rules so return an error. Fixes: 03a9d11e6eeb ("net/mlx5e: Add TC drop and mirred/redirect action parsing for SRIOV offloads") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Oz Shlomo <ozsh@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-24	net/mlx5e: TC, Reject rules with drop and modify hdr action	Roi Dayan	1	-0/+6
	This kind of action is not supported by firmware and generates a syndrome. kernel: mlx5_core 0000:08:00.0: mlx5_cmd_check:777:(pid 102063): SET_FLOW_TABLE_ENTRY(0x936) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x8708c3) Fixes: d7e75a325cb2 ("net/mlx5e: Add offloading of E-Switch TC pedit (header re-write) actions") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Oz Shlomo <ozsh@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-24	net/mlx5e: kTLS, Use CHECKSUM_UNNECESSARY for device-offloaded packets	Tariq Toukan	1	-1/+2
	For RX TLS device-offloaded packets, the HW spec guarantees checksum validation for the offloaded packets, but does not define whether the CQE.checksum field matches the original packet (ciphertext) or the decrypted one (plaintext). This latitude allows architetctural improvements between generations of chips, resulting in different decisions regarding the value type of CQE.checksum. Hence, for these packets, the device driver should not make use of this CQE field. Here we block CHECKSUM_COMPLETE usage for RX TLS device-offloaded packets, and use CHECKSUM_UNNECESSARY instead. Value of the packet's tcp_hdr.csum is not modified by the HW, and it always matches the original ciphertext. Fixes: 1182f3659357 ("net/mlx5e: kTLS, Add kTLS RX HW offload support") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-24	net/mlx5e: Fix wrong return value on ioctl EEPROM query failure	Gal Pressman	1	-1/+1
	The ioctl EEPROM query wrongly returns success on read failures, fix that by returning the appropriate error code. Fixes: bb64143eee8c ("net/mlx5e: Add ethtool support for dump module EEPROM") Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-24	net/mlx5: Fix possible deadlock on rule deletion	Maor Gottlieb	1	-0/+2
	Add missing call to up_write_ref_node() which releases the semaphore in case the FTE doesn't have destinations, such in drop rule case. Fixes: 465e7baab6d9 ("net/mlx5: Fix deletion of duplicate rules") Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-24	net/mlx5: Fix tc max supported prio for nic mode	Chris Mi	1	-0/+3
	Only prio 1 is supported if firmware doesn't support ignore flow level for nic mode. The offending commit removed the check wrongly. Add it back. Fixes: 9a99c8f1253a ("net/mlx5e: E-Switch, Offload all chain 0 priorities when modify header and forward action is not supported") Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-24	net/mlx5: Fix wrong limitation of metadata match on ecpf	Ariel Levkovich	1	-4/+0
	Match metadata support check returns false for ecpf device. However, this support does exist for ecpf and therefore this limitation should be removed to allow feature such as stacked devices and internal port offloaded to be supported. Fixes: 92ab1eb392c6 ("net/mlx5: E-Switch, Enable vport metadata matching if firmware supports it") Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-24	net/mlx5: Update log_max_qp value to be 17 at most	Maher Sanalla	1	-1/+1
	Currently, log_max_qp value is dependent on what FW reports as its max capability. In reality, due to a bug, some FWs report a value greater than 17, even though they don't support log_max_qp > 17. This FW issue led the driver to exhaust memory on startup. Thus, log_max_qp value is set to be no more than 17 regardless of what FW reports, as it was before the cited commit. Fixes: f79a609ea6bf ("net/mlx5: Update log_max_qp value to FW max capability") Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Avihai Horon <avihaih@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-24	net/mlx5: DR, Fix the threshold that defines when pool sync is initiated	Yevgeny Kliteynik	1	-4/+7
	When deciding whether to start syncing and actually free all the "hot" ICM chunks, we need to consider the type of the ICM chunks that we're dealing with. For instance, the amount of available ICM for MODIFY_ACTION is significantly lower than the usual STE ICM, so the threshold should account for that - otherwise we can deplete MODIFY_ACTION memory just by creating and deleting the same modify header action in a continuous loop. This patch replaces the hard-coded threshold with a dynamic value. Fixes: 1c58651412bb ("net/mlx5: DR, ICM memory pools sync optimization") Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-24	net/mlx5: DR, Don't allow match on IP w/o matching on full ethertype/ip_version	Yevgeny Kliteynik	3	-17/+45
	Currently SMFS allows adding rule with matching on src/dst IP w/o matching on full ethertype or ip_version, which is not supported by HW. This patch fixes this issue and adds the check as it is done in DMFS. Fixes: 26d688e33f88 ("net/mlx5: DR, Add Steering entry (STE) utilities") Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-24	net/mlx5: DR, Fix slab-out-of-bounds in mlx5_cmd_dr_create_fte	Yevgeny Kliteynik	1	-7/+26
	When adding a rule with 32 destinations, we hit the following out-of-band access issue: BUG: KASAN: slab-out-of-bounds in mlx5_cmd_dr_create_fte+0x18ee/0x1e70 This patch fixes the issue by both increasing the allocated buffers to accommodate for the needed actions and by checking the number of actions to prevent this issue when a rule with too many actions is provided. Fixes: 1ffd498901c1 ("net/mlx5: DR, Increase supported num of actions to 32") Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-24	net/mlx5: DR, Cache STE shadow memory	Yevgeny Kliteynik	2	-35/+79
	During rule insertion on each ICM memory chunk we also allocate shadow memory used for management. This includes the hw_ste, dr_ste and miss list per entry. Since the scale of these allocations is large we noticed a performance hiccup that happens once malloc and free are stressed. In extreme usecases when ~1M chunks are freed at once, it might take up to 40 seconds to complete this, up to the point the kernel sees this as self-detected stall on CPU: rcu: INFO: rcu_sched self-detected stall on CPU To resolve this we will increase the reuse of shadow memory. Doing this we see that a time in the aforementioned usecase dropped from ~40 seconds to ~8-10 seconds. Fixes: 29cf8febd185 ("net/mlx5: DR, ICM pool memory allocator") Signed-off-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-24	net/mlx5: Update the list of the PCI supported devices	Meir Lichtinger	1	-0/+2
	Add the upcoming BlueField-4 and ConnectX-8 device IDs. Fixes: 2e9d3e83ab82 ("net/mlx5: Update the list of the PCI supported devices") Signed-off-by: Meir Lichtinger <meirl@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-24	net/mlx5: Add clarification on sync reset failure	Moshe Shemesh	5	-19/+74
	In case devlink reload action fw_activate failed in sync reset stage, use the new MFRL field reset_state to find why it failed and share this clarification with the user. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-24	net/mlx5: Add reset_state field to MFRL register	Moshe Shemesh	1	-2/+12
	Add new field reset_state to MFRL register. This field expose current state of sync reset for fw update. This field enables sharing with the user more details on why fw activate failed in case it failed the sync reset stage. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-24	RDMA/mlx5: Use new command interface API	Saeed Mahameed	1	-23/+32
	DEVX can now use mlx5_cmd_do() which will not intercept the command execution status and will provide full information of the return code. DEVX can now propagate the error code safely to upper layers, to indicate to the callers if the command was actually executed and the error code indicates the command execution status availability in the command outbox buffer. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-24	net/mlx5: cmdif, Refactor error handling and reporting of async commands	Saeed Mahameed	3	-27/+52
	Same as the new mlx5_cmd_do API, report all information to callers and let them handle the error values and outbox parsing. The user callback status "work->user_callback(status)" is now similar to the error rc code returned from the blocking mlx5_cmd_do() version, and now is defined as follows: -EREMOTEIO : Command executed by FW, outbox.status != MLX5_CMD_STAT_OK. Caller must check FW outbox status. 0 : Command execution successful, outbox.status == MLX5_CMD_STAT_OK. < 0 : Command couldn't execute, FW or driver induced error. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-24	net/mlx5: Use mlx5_cmd_do() in core create_{cq,dct}	Saeed Mahameed	5	-7/+21
	mlx5_core_create_{cq/dct} functions are non-trivial mlx5 commands functions. They check command execution status themselves and hide valuable FW failure information. For mlx5_core/eth kernel user this is what we actually want, but for a devx/rdma user the hidden information is essential and should be propagated up to the caller, thus we convert these commands to use mlx5_cmd_do to return the FW/driver and command outbox status as is, and let the caller decide what to do with it. For kernel callers of mlx5_core_create_{cq/dct} or those who only care about the binary status (FAIL/SUCCESS) they must check status themselves via mlx5_cmd_check() to restore the current behavior. err = mlx5_create_cq(in, out) err = mlx5_cmd_check(err, in, out) if (err) // handle err For DEVX users and those who care about full visibility, They will just propagate the error to user space, and app can check if err == -EREMOTEIO, then outbox.{status,syndrome} are valid. API Note: mlx5_cmd_check() must be used by kernel users since it allows the driver to intercept the command execution status and return a driver simulated status in case of driver induced error handling or reset/recovery flows. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-24	net/mlx5: cmdif, Add new api for command execution	Saeed Mahameed	2	-13/+68
	Add mlx5_cmd_do. Unlike mlx5_cmd_exec, this function will not modify or translate outbox.status. The function will return: return = 0: Command was executed, outbox.status == MLX5_CMD_STAT_OK. return = -EREMOTEIO: Executed, outbox.status != MLX5_CMD_STAT_OK. return < 0: Command execution couldn't be performed by FW or driver. And document other mlx5_cmd_exec functions. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-24	net/mlx5: cmdif, cmd_check refactoring	Saeed Mahameed	3	-84/+95
	Do not mangle the command outbox in the internal low level cmd_exec and cmd_invoke functions. Instead return a proper unique error code and move the driver error checking to be at a higher level in mlx5_cmd_exec(). Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>