summaryrefslogtreecommitdiff
path: root/arch/powerpc/platforms
diff options
context:
space:
mode:
authorAnton Blanchard <anton@samba.org>2012-06-07 22:14:48 +0400
committerBenjamin Herrenschmidt <benh@kernel.crashing.org>2012-07-03 08:14:48 +0400
commitb4c3a8729ae57b4f84d661e16a192f828eca1d03 (patch)
tree03ff960dc63b7c60ed54cbf88f98c8b6df1823ec /arch/powerpc/platforms
parentd362213722c8875b40d712796392682968ce685e (diff)
downloadlinux-b4c3a8729ae57b4f84d661e16a192f828eca1d03.tar.xz
powerpc/iommu: Implement IOMMU pools to improve multiqueue adapter performance
At the moment all queues in a multiqueue adapter will serialise against the IOMMU table lock. This is proving to be a big issue, especially with 10Gbit ethernet. This patch creates 4 pools and tries to spread the load across them. If the table is under 1GB in size we revert back to the original behaviour of 1 pool and 1 largealloc pool. We create a hash to map CPUs to pools. Since we prefer interrupts to be affinitised to primary CPUs, without some form of hashing we are very likely to end up using the same pool. As an example, POWER7 has 4 way SMT and with 4 pools all primary threads will map to the same pool. The largealloc pool is reduced from 1/2 to 1/4 of the space to partially offset the overhead of breaking the table up into pools. Some performance numbers were obtained with a Chelsio T3 adapter on two POWER7 boxes, running a 100 session TCP round robin test. Performance improved 69% with this patch applied. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Diffstat (limited to 'arch/powerpc/platforms')
-rw-r--r--arch/powerpc/platforms/cell/iommu.c1
1 files changed, 0 insertions, 1 deletions
diff --git a/arch/powerpc/platforms/cell/iommu.c b/arch/powerpc/platforms/cell/iommu.c
index b9f509a34c01..c264969c9319 100644
--- a/arch/powerpc/platforms/cell/iommu.c
+++ b/arch/powerpc/platforms/cell/iommu.c
@@ -518,7 +518,6 @@ cell_iommu_setup_window(struct cbe_iommu *iommu, struct device_node *np,
__set_bit(0, window->table.it_map);
tce_build_cell(&window->table, window->table.it_offset, 1,
(unsigned long)iommu->pad_page, DMA_TO_DEVICE, NULL);
- window->table.it_hint = window->table.it_blocksize;
return window;
}