From 62a04f81e6133c8eaa5e93e15eab1ad2511a45db Mon Sep 17 00:00:00 2001 From: Shivasharan S Date: Tue, 7 May 2019 10:05:35 -0700 Subject: scsi: megaraid_sas: IRQ poll to avoid CPU hard lockups Issue Description: We have seen cpu lock up issues from field if system has a large (more than 96) logical cpu count. SAS3.0 controller (Invader series) supports max 96 MSI-X vector and SAS3.5 product (Ventura) supports max 128 MSI-X vectors. This may be a generic issue (if PCI device support completion on multiple reply queues). Let me explain it w.r.t megaraid_sas supported h/w just to simplify the problem and possible changes to handle such issues. MegaRAID controller supports multiple reply queues in completion path. Driver creates MSI-X vectors for controller as "minimum of (FW supported Reply queues, Logical CPUs)". If submitter is not interrupted via completion on same CPU, there is a loop in the IO path. This behavior can cause hard/soft CPU lockups, IO timeout, system sluggish etc. Example - one CPU (e.g. CPU A) is busy submitting the IOs and another CPU (e.g. CPU B) is busy with processing the corresponding IO's reply descriptors from reply descriptor queue upon receiving the interrupts from HBA. If CPU A is continuously pumping the IOs then always CPU B (which is executing the ISR) will see the valid reply descriptors in the reply descriptor queue and it will be continuously processing those reply descriptor in a loop without quitting the ISR handler. megaraid_sas driver will exit ISR handler if it finds unused reply descriptor in the reply descriptor queue. Since CPU A will be continuously sending the IOs, CPU B may always see a valid reply descriptor (posted by HBA Firmware after processing the IO) in the reply descriptor queue. In worst case, driver will not quit from this loop in the ISR handler. Eventually, CPU lockup will be detected by watchdog. Above mentioned behavior is not common if "rq_affinity" set to 2 or affinity_hint is honored by irqbalancer as "exact". If rq_affinity is set to 2, submitter will be always interrupted via completion on same CPU. If irqbalancer is using "exact" policy, interrupt will be delivered to submitter CPU. Problem statement: If CPU count to MSI-X vectors (reply descriptor Queues) count ratio is not 1:1, we still have exposure of issue explained above and for that we don't have any solution. Exposure of soft/hard lockup is seen if CPU count is more than MSI-X supported by device. If CPUs count to MSI-X vectors count ratio is not 1:1, (Other way, if CPU counts to MSI-X vector count ratio is something like X:1, where X > 1) then 'exact' irqbalance policy OR rq_affinity = 2 won't help to avoid CPU hard/soft lockups. There won't be any one to one mapping between CPU to MSI-X vector instead one MSI-X interrupt (or reply descriptor queue) is shared with group/set of CPUs and there is a possibility of having a loop in the IO path within that CPU group and may observe lockups. For example: Consider a system having two NUMA nodes and each node having four logical CPUs and also consider that number of MSI-X vectors enabled on the HBA is two, then CPUs count to MSI-X vector count ratio as 4:1. e.g. MSI-X vector 0 is affinity to CPU 0, CPU 1, CPU 2 & CPU 3 of NUMA node 0 and MSI-X vector 1 is affinity to CPU 4, CPU 5, CPU 6 & CPU 7 of NUMA node 1. numactl --hardware available: 2 nodes (0-1) node 0 cpus: 0 1 2 3 --> MSI-X 0 node 0 size: 65536 MB node 0 free: 63176 MB node 1 cpus: 4 5 6 7 --> MSI-X 1 node 1 size: 65536 MB node 1 free: 63176 MB Assume that user started an application which uses all the CPUs of NUMA node 0 for issuing the IOs. Only one CPU from affinity list (it can be any cpu since this behavior depends upon irqbalance) CPU0 will receive the interrupts from MSI-X 0 for all the IOs. Eventually, CPU 0 IO submission percentage will be decreasing and ISR processing percentage will be increasing as it is more busy with processing the interrupts. Gradually IO submission percentage on CPU 0 will be zero and it's ISR processing percentage will be 100% as IO loop has already formed within the NUMA node 0, i.e. CPU 1, CPU 2 & CPU 3 will be continuously busy with submitting the heavy IOs and only CPU 0 is busy in the ISR path as it always find the valid reply descriptor in the reply descriptor queue. Eventually, we will observe the hard lockup here. Chances of occurring of hard/soft lockups are directly proportional to value of X. If value of X is high, then chances of observing CPU lockups is high. Solution: Use IRQ poll interface defined in "irq_poll.c". megaraid_sas driver will execute ISR routine in softirq context and it will always quit the loop based on budget provided in IRQ poll interface. Driver will switch to IRQ poll only when more than a threshold number of reply descriptors are handled in one ISR. Currently threshold is set as 1/4th of HBA queue depth. In these scenarios (i.e. where CPUs count to MSI-X vectors count ratio is X:1 (where X > 1)), IRQ poll interface will avoid CPU hard lockups due to voluntary exit from the reply queue processing based on budget. Note - Only one MSI-X vector is busy doing processing. Select CONFIG_IRQ_POLL from driver Kconfig for driver compilation. Signed-off-by: Kashyap Desai Signed-off-by: Shivasharan S Signed-off-by: Martin K. Petersen --- drivers/scsi/megaraid/megaraid_sas_fusion.h | 1 - 1 file changed, 1 deletion(-) (limited to 'drivers/scsi/megaraid/megaraid_sas_fusion.h') diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h b/drivers/scsi/megaraid/megaraid_sas_fusion.h index 1481bf029490..160ac16941fe 100644 --- a/drivers/scsi/megaraid/megaraid_sas_fusion.h +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h @@ -100,7 +100,6 @@ enum MR_RAID_FLAGS_IO_SUB_TYPE { #define MEGASAS_FP_CMD_LEN 16 #define MEGASAS_FUSION_IN_RESET 0 -#define THRESHOLD_REPLY_COUNT 50 #define RAID_1_PEER_CMDS 2 #define JBOD_MAPS_COUNT 2 #define MEGASAS_REDUCE_QD_COUNT 64 -- cgit v1.2.3 From ba53572bf02da1836e5780e8c283c8e0cea714e2 Mon Sep 17 00:00:00 2001 From: Shivasharan S Date: Tue, 7 May 2019 10:05:49 -0700 Subject: scsi: megaraid_sas: Export RAID map through debugfs Create a debugfs interface for megaraid_sas driver. Provide interface to dump driver RAID map in debugfs. Signed-off-by: Sumit Saxena Signed-off-by: Shivasharan S Signed-off-by: Martin K. Petersen --- drivers/scsi/megaraid/Makefile | 2 +- drivers/scsi/megaraid/megaraid_sas.h | 4 + drivers/scsi/megaraid/megaraid_sas_base.c | 14 +++ drivers/scsi/megaraid/megaraid_sas_debugfs.c | 180 +++++++++++++++++++++++++++ drivers/scsi/megaraid/megaraid_sas_fusion.h | 5 + 5 files changed, 204 insertions(+), 1 deletion(-) create mode 100644 drivers/scsi/megaraid/megaraid_sas_debugfs.c (limited to 'drivers/scsi/megaraid/megaraid_sas_fusion.h') diff --git a/drivers/scsi/megaraid/Makefile b/drivers/scsi/megaraid/Makefile index 6e74d21227a5..12177e4cae65 100644 --- a/drivers/scsi/megaraid/Makefile +++ b/drivers/scsi/megaraid/Makefile @@ -3,4 +3,4 @@ obj-$(CONFIG_MEGARAID_MM) += megaraid_mm.o obj-$(CONFIG_MEGARAID_MAILBOX) += megaraid_mbox.o obj-$(CONFIG_MEGARAID_SAS) += megaraid_sas.o megaraid_sas-objs := megaraid_sas_base.o megaraid_sas_fusion.o \ - megaraid_sas_fp.o + megaraid_sas_fp.o megaraid_sas_debugfs.o diff --git a/drivers/scsi/megaraid/megaraid_sas.h b/drivers/scsi/megaraid/megaraid_sas.h index 840506f2f33c..56b3204d3fc6 100644 --- a/drivers/scsi/megaraid/megaraid_sas.h +++ b/drivers/scsi/megaraid/megaraid_sas.h @@ -2390,6 +2390,10 @@ struct megasas_instance { u8 task_abort_tmo; u8 max_reset_tmo; u8 snapdump_wait_time; +#ifdef CONFIG_DEBUG_FS + struct dentry *debugfs_root; + struct dentry *raidmap_dump; +#endif u8 enable_fw_dev_list; }; struct MR_LD_VF_MAP { diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c index 555292e5ff27..3a9128ed304d 100644 --- a/drivers/scsi/megaraid/megaraid_sas_base.c +++ b/drivers/scsi/megaraid/megaraid_sas_base.c @@ -188,6 +188,12 @@ static bool support_nvme_encapsulation; /* define lock for aen poll */ spinlock_t poll_aen_lock; +extern struct dentry *megasas_debugfs_root; +extern void megasas_init_debugfs(void); +extern void megasas_exit_debugfs(void); +extern void megasas_setup_debugfs(struct megasas_instance *instance); +extern void megasas_destroy_debugfs(struct megasas_instance *instance); + void megasas_complete_cmd(struct megasas_instance *instance, struct megasas_cmd *cmd, u8 alt_status); @@ -7139,6 +7145,8 @@ static int megasas_probe_one(struct pci_dev *pdev, goto fail_start_aen; } + megasas_setup_debugfs(instance); + /* Get current SR-IOV LD/VF affiliation */ if (instance->requestorId) megasas_get_ld_vf_affiliation(instance, 1); @@ -7609,6 +7617,8 @@ skip_firing_dcmds: megasas_free_ctrl_mem(instance); + megasas_destroy_debugfs(instance); + scsi_host_put(host); pci_disable_device(pdev); @@ -8536,6 +8546,8 @@ static int __init megasas_init(void) megasas_mgmt_majorno = rval; + megasas_init_debugfs(); + /* * Register ourselves as PCI hotplug module */ @@ -8595,6 +8607,7 @@ err_dcf_rel_date: err_dcf_attr_ver: pci_unregister_driver(&megasas_pci_driver); err_pcidrv: + megasas_exit_debugfs(); unregister_chrdev(megasas_mgmt_majorno, "megaraid_sas_ioctl"); return rval; } @@ -8617,6 +8630,7 @@ static void __exit megasas_exit(void) &driver_attr_support_nvme_encapsulation); pci_unregister_driver(&megasas_pci_driver); + megasas_exit_debugfs(); unregister_chrdev(megasas_mgmt_majorno, "megaraid_sas_ioctl"); } diff --git a/drivers/scsi/megaraid/megaraid_sas_debugfs.c b/drivers/scsi/megaraid/megaraid_sas_debugfs.c new file mode 100644 index 000000000000..e52837bb6807 --- /dev/null +++ b/drivers/scsi/megaraid/megaraid_sas_debugfs.c @@ -0,0 +1,180 @@ +/* + * Linux MegaRAID driver for SAS based RAID controllers + * + * Copyright (c) 2003-2018 LSI Corporation. + * Copyright (c) 2003-2018 Avago Technologies. + * Copyright (c) 2003-2018 Broadcom Inc. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version 2 + * of the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program. If not, see . + * + * Authors: Broadcom Inc. + * Kashyap Desai + * Sumit Saxena + * Shivasharan S + * + * Send feedback to: megaraidlinux.pdl@broadcom.com + */ +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +#include "megaraid_sas_fusion.h" +#include "megaraid_sas.h" + +#ifdef CONFIG_DEBUG_FS +#include + +struct dentry *megasas_debugfs_root; + +static ssize_t +megasas_debugfs_read(struct file *filp, char __user *ubuf, size_t cnt, + loff_t *ppos) +{ + struct megasas_debugfs_buffer *debug = filp->private_data; + + if (!debug || !debug->buf) + return 0; + + return simple_read_from_buffer(ubuf, cnt, ppos, debug->buf, debug->len); +} + +static int +megasas_debugfs_raidmap_open(struct inode *inode, struct file *file) +{ + struct megasas_instance *instance = inode->i_private; + struct megasas_debugfs_buffer *debug; + struct fusion_context *fusion; + + fusion = instance->ctrl_context; + + debug = kzalloc(sizeof(struct megasas_debugfs_buffer), GFP_KERNEL); + if (!debug) + return -ENOMEM; + + debug->buf = (void *)fusion->ld_drv_map[(instance->map_id & 1)]; + debug->len = fusion->drv_map_sz; + file->private_data = debug; + + return 0; +} + +static int +megasas_debugfs_release(struct inode *inode, struct file *file) +{ + struct megasas_debug_buffer *debug = file->private_data; + + if (!debug) + return 0; + + file->private_data = NULL; + kfree(debug); + return 0; +} + +static const struct file_operations megasas_debugfs_raidmap_fops = { + .owner = THIS_MODULE, + .open = megasas_debugfs_raidmap_open, + .read = megasas_debugfs_read, + .release = megasas_debugfs_release, +}; + +/* + * megasas_init_debugfs : Create debugfs root for megaraid_sas driver + */ +void megasas_init_debugfs(void) +{ + megasas_debugfs_root = debugfs_create_dir("megaraid_sas", NULL); + if (!megasas_debugfs_root) + pr_info("Cannot create debugfs root\n"); +} + +/* + * megasas_exit_debugfs : Remove debugfs root for megaraid_sas driver + */ +void megasas_exit_debugfs(void) +{ + debugfs_remove_recursive(megasas_debugfs_root); +} + +/* + * megasas_setup_debugfs : Setup debugfs per Fusion adapter + * instance: Soft instance of adapter + */ +void +megasas_setup_debugfs(struct megasas_instance *instance) +{ + char name[64]; + struct fusion_context *fusion; + + fusion = instance->ctrl_context; + + if (fusion) { + snprintf(name, sizeof(name), + "scsi_host%d", instance->host->host_no); + if (!instance->debugfs_root) { + instance->debugfs_root = + debugfs_create_dir(name, megasas_debugfs_root); + if (!instance->debugfs_root) { + dev_err(&instance->pdev->dev, + "Cannot create per adapter debugfs directory\n"); + return; + } + } + + snprintf(name, sizeof(name), "raidmap_dump"); + instance->raidmap_dump = + debugfs_create_file(name, S_IRUGO, + instance->debugfs_root, instance, + &megasas_debugfs_raidmap_fops); + if (!instance->raidmap_dump) { + dev_err(&instance->pdev->dev, + "Cannot create raidmap debugfs file\n"); + debugfs_remove(instance->debugfs_root); + return; + } + } + +} + +/* + * megasas_destroy_debugfs : Destroy debugfs per Fusion adapter + * instance: Soft instance of adapter + */ +void megasas_destroy_debugfs(struct megasas_instance *instance) +{ + debugfs_remove_recursive(instance->debugfs_root); +} + +#else +void megasas_init_debugfs(void) +{ +} +void megasas_exit_debugfs(void) +{ +} +void megasas_setup_debugfs(struct megasas_instance *instance) +{ +} +void megasas_destroy_debugfs(struct megasas_instance *instance) +{ +} +#endif /*CONFIG_DEBUG_FS*/ diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h b/drivers/scsi/megaraid/megaraid_sas_fusion.h index 160ac16941fe..98738290c533 100644 --- a/drivers/scsi/megaraid/megaraid_sas_fusion.h +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h @@ -1360,6 +1360,11 @@ struct MR_SNAPDUMP_PROPERTIES { u8 reserved[12]; }; +struct megasas_debugfs_buffer { + void *buf; + u32 len; +}; + void megasas_free_cmds_fusion(struct megasas_instance *instance); int megasas_ioc_init_fusion(struct megasas_instance *instance); u8 megasas_get_map_info(struct megasas_instance *instance); -- cgit v1.2.3 From 49f2bf1071f06a430920888ff2d1a89395a3b6b5 Mon Sep 17 00:00:00 2001 From: Chandrakanth Patil Date: Tue, 25 Jun 2019 16:34:28 +0530 Subject: scsi: megaraid_sas: RAID1 PCI bandwidth limit algorithm is applicable for only Ventura RAID1 PCI bandwidth limit algorithm is not applicable to Aero as it's PCIe Gen4 adapter. Signed-off-by: Sumit Saxena Signed-off-by: Chandrakanth Patil Signed-off-by: Martin K. Petersen --- drivers/scsi/megaraid/megaraid_sas_base.c | 3 +++ drivers/scsi/megaraid/megaraid_sas_fusion.c | 24 +++++++++++++----------- drivers/scsi/megaraid/megaraid_sas_fusion.h | 2 +- 3 files changed, 17 insertions(+), 12 deletions(-) (limited to 'drivers/scsi/megaraid/megaraid_sas_fusion.h') diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c index a886de3e3ffb..5244b6eedfec 100644 --- a/drivers/scsi/megaraid/megaraid_sas_base.c +++ b/drivers/scsi/megaraid/megaraid_sas_base.c @@ -5777,6 +5777,9 @@ static int megasas_init_fw(struct megasas_instance *instance) MR_MAX_RAID_MAP_SIZE_MASK); } + if (instance->adapter_type == VENTURA_SERIES) + fusion->pcie_bw_limitation = true; + /* Check if MSI-X is supported while in ready state */ msix_enable = (instance->instancet->read_fw_status_reg(instance) & 0x4000000) >> 0x1a; diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c index 5121d4c6eea3..ad18474f06b7 100644 --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c @@ -2621,9 +2621,10 @@ static void megasas_stream_detect(struct megasas_instance *instance, * */ static void -megasas_set_raidflag_cpu_affinity(union RAID_CONTEXT_UNION *praid_context, - struct MR_LD_RAID *raid, bool fp_possible, - u8 is_read, u32 scsi_buff_len) +megasas_set_raidflag_cpu_affinity(struct fusion_context *fusion, + union RAID_CONTEXT_UNION *praid_context, + struct MR_LD_RAID *raid, bool fp_possible, + u8 is_read, u32 scsi_buff_len) { u8 cpu_sel = MR_RAID_CTX_CPUSEL_0; struct RAID_CONTEXT_G35 *rctx_g35; @@ -2681,11 +2682,11 @@ megasas_set_raidflag_cpu_affinity(union RAID_CONTEXT_UNION *praid_context, * vs MR_RAID_FLAGS_IO_SUB_TYPE_CACHE_BYPASS. * IO Subtype is not bitmap. */ - if ((raid->level == 1) && (!is_read)) { - if (scsi_buff_len > MR_LARGE_IO_MIN_SIZE) - praid_context->raid_context_g35.raid_flags = - (MR_RAID_FLAGS_IO_SUB_TYPE_LDIO_BW_LIMIT - << MR_RAID_CTX_RAID_FLAGS_IO_SUB_TYPE_SHIFT); + if ((fusion->pcie_bw_limitation) && (raid->level == 1) && (!is_read) && + (scsi_buff_len > MR_LARGE_IO_MIN_SIZE)) { + praid_context->raid_context_g35.raid_flags = + (MR_RAID_FLAGS_IO_SUB_TYPE_LDIO_BW_LIMIT + << MR_RAID_CTX_RAID_FLAGS_IO_SUB_TYPE_SHIFT); } } @@ -2834,8 +2835,9 @@ megasas_build_ldio_fusion(struct megasas_instance *instance, (instance->host->can_queue)) { fp_possible = false; atomic_dec(&instance->fw_outstanding); - } else if ((scsi_buff_len > MR_LARGE_IO_MIN_SIZE) || - (atomic_dec_if_positive(&mrdev_priv->r1_ldio_hint) > 0)) { + } else if (fusion->pcie_bw_limitation && + ((scsi_buff_len > MR_LARGE_IO_MIN_SIZE) || + (atomic_dec_if_positive(&mrdev_priv->r1_ldio_hint) > 0))) { fp_possible = false; atomic_dec(&instance->fw_outstanding); if (scsi_buff_len > MR_LARGE_IO_MIN_SIZE) @@ -2860,7 +2862,7 @@ megasas_build_ldio_fusion(struct megasas_instance *instance, /* If raid is NULL, set CPU affinity to default CPU0 */ if (raid) - megasas_set_raidflag_cpu_affinity(&io_request->RaidContext, + megasas_set_raidflag_cpu_affinity(fusion, &io_request->RaidContext, raid, fp_possible, io_info.isRead, scsi_buff_len); else diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h b/drivers/scsi/megaraid/megaraid_sas_fusion.h index 98738290c533..b50da3822aa0 100644 --- a/drivers/scsi/megaraid/megaraid_sas_fusion.h +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h @@ -1335,7 +1335,7 @@ struct fusion_context { dma_addr_t ioc_init_request_phys; struct MPI2_IOC_INIT_REQUEST *ioc_init_request; struct megasas_cmd *ioc_init_cmd; - + bool pcie_bw_limitation; }; union desc_value { -- cgit v1.2.3 From 7fc557005c454fa053153ac0bf7c7c96f58dab4f Mon Sep 17 00:00:00 2001 From: Chandrakanth Patil Date: Tue, 25 Jun 2019 16:34:29 +0530 Subject: scsi: megaraid_sas: Offload Aero RAID5/6 division calculations to driver For RAID5/RAID6 volumes configured behind Aero, driver will be doing 64bit division operations on behalf of firmware as controller's ARM CPU is very slow in this division. Later, driver calculates Q-ARM, P-ARM and Log-ARM and passes those values to firmware by writing these values to RAID_CONTEXT. Signed-off-by: Sumit Saxena Signed-off-by: Chandrakanth Patil Signed-off-by: Martin K. Petersen --- drivers/scsi/megaraid/megaraid_sas_base.c | 10 +++- drivers/scsi/megaraid/megaraid_sas_fp.c | 78 +++++++++++++++++++++++++++++ drivers/scsi/megaraid/megaraid_sas_fusion.c | 6 +-- drivers/scsi/megaraid/megaraid_sas_fusion.h | 24 ++++++--- 4 files changed, 108 insertions(+), 10 deletions(-) (limited to 'drivers/scsi/megaraid/megaraid_sas_fusion.h') diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c index 5244b6eedfec..81e708b2eca0 100644 --- a/drivers/scsi/megaraid/megaraid_sas_base.c +++ b/drivers/scsi/megaraid/megaraid_sas_base.c @@ -5777,8 +5777,16 @@ static int megasas_init_fw(struct megasas_instance *instance) MR_MAX_RAID_MAP_SIZE_MASK); } - if (instance->adapter_type == VENTURA_SERIES) + switch (instance->adapter_type) { + case VENTURA_SERIES: fusion->pcie_bw_limitation = true; + break; + case AERO_SERIES: + fusion->r56_div_offload = true; + break; + default: + break; + } /* Check if MSI-X is supported while in ready state */ msix_enable = (instance->instancet->read_fw_status_reg(instance) & diff --git a/drivers/scsi/megaraid/megaraid_sas_fp.c b/drivers/scsi/megaraid/megaraid_sas_fp.c index d296255a4f12..43a2e49807c4 100644 --- a/drivers/scsi/megaraid/megaraid_sas_fp.c +++ b/drivers/scsi/megaraid/megaraid_sas_fp.c @@ -901,6 +901,77 @@ u8 MR_GetPhyParams(struct megasas_instance *instance, u32 ld, u64 stripRow, return retval; } +/* + * mr_get_phy_params_r56_rmw - Calculate parameters for R56 CTIO write operation + * @instance: Adapter soft state + * @ld: LD index + * @stripNo: Strip Number + * @io_info: IO info structure pointer + * pRAID_Context: RAID context pointer + * map: RAID map pointer + * + * This routine calculates the logical arm, data Arm, row number and parity arm + * for R56 CTIO write operation. + */ +static void mr_get_phy_params_r56_rmw(struct megasas_instance *instance, + u32 ld, u64 stripNo, + struct IO_REQUEST_INFO *io_info, + struct RAID_CONTEXT_G35 *pRAID_Context, + struct MR_DRV_RAID_MAP_ALL *map) +{ + struct MR_LD_RAID *raid = MR_LdRaidGet(ld, map); + u8 span, dataArms, arms, dataArm, logArm; + s8 rightmostParityArm, PParityArm; + u64 rowNum; + u64 *pdBlock = &io_info->pdBlock; + + dataArms = raid->rowDataSize; + arms = raid->rowSize; + + rowNum = mega_div64_32(stripNo, dataArms); + /* parity disk arm, first arm is 0 */ + rightmostParityArm = (arms - 1) - mega_mod64(rowNum, arms); + + /* logical arm within row */ + logArm = mega_mod64(stripNo, dataArms); + /* physical arm for data */ + dataArm = mega_mod64((rightmostParityArm + 1 + logArm), arms); + + if (raid->spanDepth == 1) { + span = 0; + } else { + span = (u8)MR_GetSpanBlock(ld, rowNum, pdBlock, map); + if (span == SPAN_INVALID) + return; + } + + if (raid->level == 6) { + /* P Parity arm, note this can go negative adjust if negative */ + PParityArm = (arms - 2) - mega_mod64(rowNum, arms); + + if (PParityArm < 0) + PParityArm += arms; + + /* rightmostParityArm is P-Parity for RAID 5 and Q-Parity for RAID */ + pRAID_Context->flow_specific.r56_arm_map = rightmostParityArm; + pRAID_Context->flow_specific.r56_arm_map |= + (u16)(PParityArm << RAID_CTX_R56_P_ARM_SHIFT); + } else { + pRAID_Context->flow_specific.r56_arm_map |= + (u16)(rightmostParityArm << RAID_CTX_R56_P_ARM_SHIFT); + } + + pRAID_Context->reg_lock_row_lba = cpu_to_le64(rowNum); + pRAID_Context->flow_specific.r56_arm_map |= + (u16)(logArm << RAID_CTX_R56_LOG_ARM_SHIFT); + cpu_to_le16s(&pRAID_Context->flow_specific.r56_arm_map); + pRAID_Context->span_arm = (span << RAID_CTX_SPANARM_SPAN_SHIFT) | dataArm; + pRAID_Context->raid_flags = (MR_RAID_FLAGS_IO_SUB_TYPE_R56_DIV_OFFLOAD << + MR_RAID_CTX_RAID_FLAGS_IO_SUB_TYPE_SHIFT); + + return; +} + /* ****************************************************************************** * @@ -1108,6 +1179,13 @@ MR_BuildRaidContext(struct megasas_instance *instance, /* save pointer to raid->LUN array */ *raidLUN = raid->LUN; + /* Aero R5/6 Division Offload for WRITE */ + if (fusion->r56_div_offload && (raid->level >= 5) && !isRead) { + mr_get_phy_params_r56_rmw(instance, ld, start_strip, io_info, + (struct RAID_CONTEXT_G35 *)pRAID_Context, + map); + return true; + } /*Get Phy Params only if FP capable, or else leave it to MR firmware to do the calculation.*/ diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c index ad18474f06b7..058d22b4fa7b 100644 --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c @@ -3324,9 +3324,9 @@ void megasas_prepare_secondRaid1_IO(struct megasas_instance *instance, r1_cmd->request_desc->SCSIIO.DevHandle = cmd->r1_alt_dev_handle; r1_cmd->io_request->DevHandle = cmd->r1_alt_dev_handle; r1_cmd->r1_alt_dev_handle = cmd->io_request->DevHandle; - cmd->io_request->RaidContext.raid_context_g35.smid.peer_smid = + cmd->io_request->RaidContext.raid_context_g35.flow_specific.peer_smid = cpu_to_le16(r1_cmd->index); - r1_cmd->io_request->RaidContext.raid_context_g35.smid.peer_smid = + r1_cmd->io_request->RaidContext.raid_context_g35.flow_specific.peer_smid = cpu_to_le16(cmd->index); /*MSIxIndex of both commands request descriptors should be same*/ r1_cmd->request_desc->SCSIIO.MSIxIndex = @@ -3444,7 +3444,7 @@ megasas_complete_r1_command(struct megasas_instance *instance, rctx_g35 = &cmd->io_request->RaidContext.raid_context_g35; fusion = instance->ctrl_context; - peer_smid = le16_to_cpu(rctx_g35->smid.peer_smid); + peer_smid = le16_to_cpu(rctx_g35->flow_specific.peer_smid); r1_cmd = fusion->cmd_list[peer_smid - 1]; scmd_local = cmd->scmd; diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h b/drivers/scsi/megaraid/megaraid_sas_fusion.h index b50da3822aa0..ca32b2b72515 100644 --- a/drivers/scsi/megaraid/megaraid_sas_fusion.h +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h @@ -87,7 +87,8 @@ enum MR_RAID_FLAGS_IO_SUB_TYPE { MR_RAID_FLAGS_IO_SUB_TYPE_RMW_P = 3, MR_RAID_FLAGS_IO_SUB_TYPE_RMW_Q = 4, MR_RAID_FLAGS_IO_SUB_TYPE_CACHE_BYPASS = 6, - MR_RAID_FLAGS_IO_SUB_TYPE_LDIO_BW_LIMIT = 7 + MR_RAID_FLAGS_IO_SUB_TYPE_LDIO_BW_LIMIT = 7, + MR_RAID_FLAGS_IO_SUB_TYPE_R56_DIV_OFFLOAD = 8 }; /* @@ -151,12 +152,15 @@ struct RAID_CONTEXT_G35 { u16 timeout_value; /* 0x02 -0x03 */ u16 routing_flags; // 0x04 -0x05 routing flags u16 virtual_disk_tgt_id; /* 0x06 -0x07 */ - u64 reg_lock_row_lba; /* 0x08 - 0x0F */ + __le64 reg_lock_row_lba; /* 0x08 - 0x0F */ u32 reg_lock_length; /* 0x10 - 0x13 */ - union { - u16 next_lmid; /* 0x14 - 0x15 */ - u16 peer_smid; /* used for the raid 1/10 fp writes */ - } smid; + union { // flow specific + u16 rmw_op_index; /* 0x14 - 0x15, R5/6 RMW: rmw operation index*/ + u16 peer_smid; /* 0x14 - 0x15, R1 Write: peer smid*/ + u16 r56_arm_map; /* 0x14 - 0x15, Unused [15], LogArm[14:10], P-Arm[9:5], Q-Arm[4:0] */ + + } flow_specific; + u8 ex_status; /* 0x16 : OUT */ u8 status; /* 0x17 status */ u8 raid_flags; /* 0x18 resvd[7:6], ioSubType[5:4], @@ -247,6 +251,13 @@ union RAID_CONTEXT_UNION { #define RAID_CTX_SPANARM_SPAN_SHIFT (5) #define RAID_CTX_SPANARM_SPAN_MASK (0xE0) +/* LogArm[14:10], P-Arm[9:5], Q-Arm[4:0] */ +#define RAID_CTX_R56_Q_ARM_MASK (0x1F) +#define RAID_CTX_R56_P_ARM_SHIFT (5) +#define RAID_CTX_R56_P_ARM_MASK (0x3E0) +#define RAID_CTX_R56_LOG_ARM_SHIFT (10) +#define RAID_CTX_R56_LOG_ARM_MASK (0x7C00) + /* number of bits per index in U32 TrackStream */ #define BITS_PER_INDEX_STREAM 4 #define INVALID_STREAM_NUM 16 @@ -1336,6 +1347,7 @@ struct fusion_context { struct MPI2_IOC_INIT_REQUEST *ioc_init_request; struct megasas_cmd *ioc_init_cmd; bool pcie_bw_limitation; + bool r56_div_offload; }; union desc_value { -- cgit v1.2.3 From f39e5e52c5b5407173d87b03a6385fbe6ccf1026 Mon Sep 17 00:00:00 2001 From: Chandrakanth Patil Date: Tue, 25 Jun 2019 16:34:34 +0530 Subject: scsi: megaraid_sas: Use high IOPS queues based on IO workload The driver will use round-robin method for IO submission in batches within the high IOPS queues when the number of in-flight ios on the target device is larger than 8. Otherwise the driver will use low latency reply queues. Signed-off-by: Kashyap Desai Signed-off-by: Chandrakanth Patil Signed-off-by: Martin K. Petersen --- drivers/scsi/megaraid/megaraid_sas.h | 3 +++ drivers/scsi/megaraid/megaraid_sas_fp.c | 1 + drivers/scsi/megaraid/megaraid_sas_fusion.c | 16 ++++++++++++++-- drivers/scsi/megaraid/megaraid_sas_fusion.h | 1 + 4 files changed, 19 insertions(+), 2 deletions(-) (limited to 'drivers/scsi/megaraid/megaraid_sas_fusion.h') diff --git a/drivers/scsi/megaraid/megaraid_sas.h b/drivers/scsi/megaraid/megaraid_sas.h index 02e6e15d8c2a..3f4cb52177f6 100644 --- a/drivers/scsi/megaraid/megaraid_sas.h +++ b/drivers/scsi/megaraid/megaraid_sas.h @@ -2253,6 +2253,8 @@ enum MR_PD_TYPE { /*Aero performance parameters*/ #define MR_HIGH_IOPS_QUEUE_COUNT 8 +#define MR_DEVICE_HIGH_IOPS_DEPTH 8 +#define MR_HIGH_IOPS_BATCH_COUNT 16 struct megasas_instance { @@ -2362,6 +2364,7 @@ struct megasas_instance { atomic_t ldio_outstanding; atomic_t fw_reset_no_pci_access; atomic64_t total_io_count; + atomic64_t high_iops_outstanding; struct megasas_instance_template *instancet; struct tasklet_struct isr_tasklet; diff --git a/drivers/scsi/megaraid/megaraid_sas_fp.c b/drivers/scsi/megaraid/megaraid_sas_fp.c index 43a2e49807c4..f9f7c34e93c3 100644 --- a/drivers/scsi/megaraid/megaraid_sas_fp.c +++ b/drivers/scsi/megaraid/megaraid_sas_fp.c @@ -1038,6 +1038,7 @@ MR_BuildRaidContext(struct megasas_instance *instance, stripSize = 1 << raid->stripeShift; stripe_mask = stripSize-1; + io_info->data_arms = raid->rowDataSize; /* * calculate starting row and stripe, and number of strips and rows diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c index 845ca2f94e5c..90dced4290a2 100644 --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c @@ -2811,6 +2811,7 @@ megasas_build_ldio_fusion(struct megasas_instance *instance, io_info.r1_alt_dev_handle = MR_DEVHANDLE_INVALID; scsi_buff_len = scsi_bufflen(scp); io_request->DataLength = cpu_to_le32(scsi_buff_len); + io_info.data_arms = 1; if (scp->sc_data_direction == DMA_FROM_DEVICE) io_info.isRead = 1; @@ -2830,7 +2831,13 @@ megasas_build_ldio_fusion(struct megasas_instance *instance, fp_possible = (io_info.fpOkForIo > 0) ? true : false; } - if (instance->msix_load_balance) + if (instance->balanced_mode && + atomic_read(&scp->device->device_busy) > + (io_info.data_arms * MR_DEVICE_HIGH_IOPS_DEPTH)) + cmd->request_desc->SCSIIO.MSIxIndex = + mega_mod64((atomic64_add_return(1, &instance->high_iops_outstanding) / + MR_HIGH_IOPS_BATCH_COUNT), instance->low_latency_index_start); + else if (instance->msix_load_balance) cmd->request_desc->SCSIIO.MSIxIndex = (mega_mod64(atomic64_add_return(1, &instance->total_io_count), instance->msix_vectors)); @@ -3157,7 +3164,12 @@ megasas_build_syspd_fusion(struct megasas_instance *instance, cmd->request_desc->SCSIIO.DevHandle = io_request->DevHandle; - if (instance->msix_load_balance) + if (instance->balanced_mode && + atomic_read(&scmd->device->device_busy) > MR_DEVICE_HIGH_IOPS_DEPTH) + cmd->request_desc->SCSIIO.MSIxIndex = + mega_mod64((atomic64_add_return(1, &instance->high_iops_outstanding) / + MR_HIGH_IOPS_BATCH_COUNT), instance->low_latency_index_start); + else if (instance->msix_load_balance) cmd->request_desc->SCSIIO.MSIxIndex = (mega_mod64(atomic64_add_return(1, &instance->total_io_count), instance->msix_vectors)); diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h b/drivers/scsi/megaraid/megaraid_sas_fusion.h index ca32b2b72515..6fe334348c46 100644 --- a/drivers/scsi/megaraid/megaraid_sas_fusion.h +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h @@ -962,6 +962,7 @@ struct IO_REQUEST_INFO { u8 pd_after_lb; u16 r1_alt_dev_handle; /* raid 1/10 only */ bool ra_capable; + u8 data_arms; }; struct MR_LD_TARGET_SYNC { -- cgit v1.2.3