genirq/affinity: Spread IRQs to all available NUMA nodes

If the number of NUMA nodes exceeds the number of MSI/MSI-X interrupts
which are allocated for a device, the interrupt affinity spreading code
fails to spread them across all nodes.

The reason is that the spreading code starts from node 0 and continues up
to the number of interrupts requested for allocation. This leaves the nodes
past the last interrupt unused.

This results in interrupt concentration on the first nodes, which violates
the assumption of the block layer that all nodes are covered evenly. As a
consequence, the NUMA nodes above the number of interrupts are all assigned
to hardware queue 0 and therefore NUMA node 0, which results in bad
performance and has CPU hotplug implications, because queue 0 gets shut
down when the last CPU of node 0 is offlined.

Go over all NUMA nodes and assign them round-robin to all requested
interrupts to solve this.

[ tglx: Massaged changelog ]

Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Cc: Michael Kelley <mikelley@microsoft.com>
Link: https://lkml.kernel.org/r/20181102180248.13583-1-longli@linuxonhyperv.com
author: Long Li <longli@microsoft.com> 2018-11-02 21:02:48 +0300
committer: Thomas Gleixner <tglx@linutronix.de> 2018-11-05 14:16:26 +0300
commit: b82592199032bf7c778f861b936287e37ebc9f62 (patch)
tree: 2544a1b9e363aa177430a1c2d168f113a9a8f0e6 /kernel/irq
parent: 651022382c7f8da46cb4872a545ee1da6d097d2a (diff)
download: linux-b82592199032bf7c778f861b936287e37ebc9f62.tar.xz
1 files changed, 2 insertions, 3 deletions
diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
index f4f29b9d90ee..e12cdf637c71 100644
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -117,12 +117,11 @@ static int irq_build_affinity_masks(const struct irq_affinity *affd,
 	 */
 	if (numvecs <= nodes) {
 		for_each_node_mask(n, nodemsk) {
-			cpumask_copy(masks + curvec, node_to_cpumask[n]);
-			if (++done == numvecs)
-				break;
+			cpumask_or(masks + curvec, masks + curvec, node_to_cpumask[n]);
 			if (++curvec == last_affv)
 				curvec = affd->pre_vectors;
 		}
+		done = numvecs;
 		goto out;
 	}
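
The effect of the change can be illustrated outside the kernel with a minimal
userspace sketch. It is not the kernel code: plain 64-bit integers stand in
for struct cpumask, the four-node topology and the names spread_old(),
spread_new() and covered() are hypothetical, pre_vectors is taken to be 0,
and the masks are assumed to start out zeroed.

```c
#include <stdint.h>

#define NR_NODES 4

/* Hypothetical topology: node n owns CPUs 2n and 2n+1. */
static const uint64_t node_to_cpumask[NR_NODES] = { 0x03, 0x0c, 0x30, 0xc0 };

/*
 * Pre-patch behaviour: copy one node into each vector and stop as soon
 * as every vector has been handed a node. Nodes past that point are
 * never visited.
 */
static void spread_old(uint64_t *masks, int numvecs)
{
	int curvec = 0, done = 0;

	for (int n = 0; n < NR_NODES; n++) {
		masks[curvec] = node_to_cpumask[n];
		if (++done == numvecs)
			break;
		if (++curvec == numvecs)
			curvec = 0;
	}
}

/*
 * Post-patch behaviour: visit every node and OR it into the current
 * vector, wrapping around so the nodes are dealt out round-robin.
 * The caller must pass zero-initialised masks.
 */
static void spread_new(uint64_t *masks, int numvecs)
{
	int curvec = 0;

	for (int n = 0; n < NR_NODES; n++) {
		masks[curvec] |= node_to_cpumask[n];
		if (++curvec == numvecs)
			curvec = 0;
	}
}

/* Union of all vector masks, i.e. the set of CPUs that got a vector. */
static uint64_t covered(const uint64_t *masks, int numvecs)
{
	uint64_t all = 0;

	for (int i = 0; i < numvecs; i++)
		all |= masks[i];
	return all;
}
```

With two vectors and four nodes, spread_old() fills the masks with 0x03 and
0x0c and leaves the CPUs of nodes 2 and 3 uncovered, while spread_new()
produces 0x33 and 0xcc, so every CPU belongs to exactly one vector.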