<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/net/rds/ib.c, branch v6.12.91</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.91</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.91'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-05-23T11:04:25+00:00</updated>
<entry>
<title>net/rds: Restrict use of RDS/IB to the initial network namespace</title>
<updated>2026-05-23T11:04:25+00:00</updated>
<author>
<name>Greg Jumper</name>
<email>greg.jumper@oracle.com</email>
</author>
<published>2026-04-08T08:04:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=fb407343c0c16e94584707b2dfdd350a5f81b000'/>
<id>urn:sha1:fb407343c0c16e94584707b2dfdd350a5f81b000</id>
<content type='text'>
[ Upstream commit ebf71dd4aff46e8e421d455db3e231ba43d2fa8a ]

Prevent using RDS/IB in network namespaces other than the initial one.
The existing RDS/IB code will not work properly in non-initial network
namespaces.

Fixes: d5a8ac28a7ff ("RDS-TCP: Make RDS-TCP work correctly when it is set up in a netns other than init_net")
Reported-by: syzbot+da8e060735ae02c8f3d1@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=da8e060735ae02c8f3d1
Signed-off-by: Greg Jumper &lt;greg.jumper@oracle.com&gt;
Signed-off-by: Allison Henderson &lt;achender@kernel.org&gt;
Link: https://patch.msgid.link/20260408080420.540032-3-achender@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>net/rds: Optimize rds_ib_laddr_check</title>
<updated>2026-05-23T11:04:25+00:00</updated>
<author>
<name>Håkon Bugge</name>
<email>haakon.bugge@oracle.com</email>
</author>
<published>2026-04-08T08:04:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=91ac4bdeb93a15abcd3d3499497eb216edc9ffa4'/>
<id>urn:sha1:91ac4bdeb93a15abcd3d3499497eb216edc9ffa4</id>
<content type='text'>
[ Upstream commit 236f718ac885965fa886440b9898dfae185c9733 ]

rds_ib_laddr_check() creates a CM_ID and attempts to bind the address
in question to it. This in order to qualify the allegedly local
address as a usable IB/RoCE address.

In the field, ExaWatcher runs rds-ping to all ports in the fabric from
all local ports. This using all active ToS'es. In a full rack system,
we have 14 cell servers and eight db servers. Typically, 6 ToS'es are
used. This implies 528 rds-ping invocations per ExaWatcher's "RDSinfo"
interval.

Adding to this, each rds-ping invocation creates eight sockets and
binds the local address to them:

socket(AF_RDS, SOCK_SEQPACKET, 0)       = 3
bind(3, {sa_family=AF_INET, sin_port=htons(0),
	sin_addr=inet_addr("192.168.36.2")}, 16) = 0
socket(AF_RDS, SOCK_SEQPACKET, 0)       = 4
bind(4, {sa_family=AF_INET, sin_port=htons(0),
	sin_addr=inet_addr("192.168.36.2")}, 16) = 0
socket(AF_RDS, SOCK_SEQPACKET, 0)       = 5
bind(5, {sa_family=AF_INET, sin_port=htons(0),
	sin_addr=inet_addr("192.168.36.2")}, 16) = 0
socket(AF_RDS, SOCK_SEQPACKET, 0)       = 6
bind(6, {sa_family=AF_INET, sin_port=htons(0),
	sin_addr=inet_addr("192.168.36.2")}, 16) = 0
socket(AF_RDS, SOCK_SEQPACKET, 0)       = 7
bind(7, {sa_family=AF_INET, sin_port=htons(0),
	sin_addr=inet_addr("192.168.36.2")}, 16) = 0
socket(AF_RDS, SOCK_SEQPACKET, 0)       = 8
bind(8, {sa_family=AF_INET, sin_port=htons(0),
	sin_addr=inet_addr("192.168.36.2")}, 16) = 0
socket(AF_RDS, SOCK_SEQPACKET, 0)       = 9
bind(9, {sa_family=AF_INET, sin_port=htons(0),
	sin_addr=inet_addr("192.168.36.2")}, 16) = 0
socket(AF_RDS, SOCK_SEQPACKET, 0)       = 10
bind(10, {sa_family=AF_INET, sin_port=htons(0),
	sin_addr=inet_addr("192.168.36.2")}, 16) = 0

So, at every interval ExaWatcher executes rds-ping's, 4224 CM_IDs are
allocated, considering this full-rack system. After the a CM_ID has
been allocated, rdma_bind_addr() is called, with the port number being
zero. This implies that the CMA will attempt to search for an un-used
ephemeral port. Simplified, the algorithm is to start at a random
position in the available port space, and then if needed, iterate
until an un-used port is found.

The book-keeping of used ports uses the idr system, which again uses
slab to allocate new struct idr_layer's. The size is 2092 bytes and
slab tries to reduce the wasted space. Hence, it chooses an order:3
allocation, for which 15 idr_layer structs will fit and only 1388
bytes are wasted per the 32KiB order:3 chunk.

Although this order:3 allocation seems like a good space/speed
trade-off, it does not resonate well with how it used by the CMA. The
combination of the randomized starting point in the port space (which
has close to zero spatial locality) and the close proximity in time of
the 4224 invocations of the rds-ping's, creates a memory hog for
order:3 allocations.

These costly allocations may need reclaims and/or compaction. At
worst, they may fail and produce a stack trace such as (from uek4):

[&lt;ffffffff811a72d5&gt;] __inc_zone_page_state+0x35/0x40
[&lt;ffffffff811c2e97&gt;] page_add_file_rmap+0x57/0x60
[&lt;ffffffffa37ca1df&gt;] remove_migration_pte+0x3f/0x3c0 [ksplice_6cn872bt_vmlinux_new]
[&lt;ffffffff811c3de8&gt;] rmap_walk+0xd8/0x340
[&lt;ffffffff811e8860&gt;] remove_migration_ptes+0x40/0x50
[&lt;ffffffff811ea83c&gt;] migrate_pages+0x3ec/0x890
[&lt;ffffffff811afa0d&gt;] compact_zone+0x32d/0x9a0
[&lt;ffffffff811b00ed&gt;] compact_zone_order+0x6d/0x90
[&lt;ffffffff811b03b2&gt;] try_to_compact_pages+0x102/0x270
[&lt;ffffffff81190e56&gt;] __alloc_pages_direct_compact+0x46/0x100
[&lt;ffffffff8119165b&gt;] __alloc_pages_nodemask+0x74b/0xaa0
[&lt;ffffffff811d8411&gt;] alloc_pages_current+0x91/0x110
[&lt;ffffffff811e3b0b&gt;] new_slab+0x38b/0x480
[&lt;ffffffffa41323c7&gt;] __slab_alloc+0x3b7/0x4a0 [ksplice_s0dk66a8_vmlinux_new]
[&lt;ffffffff811e42ab&gt;] kmem_cache_alloc+0x1fb/0x250
[&lt;ffffffff8131fdd6&gt;] idr_layer_alloc+0x36/0x90
[&lt;ffffffff8132029c&gt;] idr_get_empty_slot+0x28c/0x3d0
[&lt;ffffffff813204ad&gt;] idr_alloc+0x4d/0xf0
[&lt;ffffffffa051727d&gt;] cma_alloc_port+0x4d/0xa0 [rdma_cm]
[&lt;ffffffffa0517cbe&gt;] rdma_bind_addr+0x2ae/0x5b0 [rdma_cm]
[&lt;ffffffffa09d8083&gt;] rds_ib_laddr_check+0x83/0x2c0 [ksplice_6l2xst5i_rds_rdma_new]
[&lt;ffffffffa05f892b&gt;] rds_trans_get_preferred+0x5b/0xa0 [rds]
[&lt;ffffffffa05f09f2&gt;] rds_bind+0x212/0x280 [rds]
[&lt;ffffffff815b4016&gt;] SYSC_bind+0xe6/0x120
[&lt;ffffffff815b4d3e&gt;] SyS_bind+0xe/0x10
[&lt;ffffffff816b031a&gt;] system_call_fastpath+0x18/0xd4

To avoid these excessive calls to rdma_bind_addr(), we optimize
rds_ib_laddr_check() by simply checking if the address in question has
been used before. The rds_rdma module keeps track of addresses
associated with IB devices, and the function rds_ib_get_device() is
used to determine if the address already has been qualified as a valid
local address. If not found, we call the legacy rds_ib_laddr_check(),
now renamed to rds_ib_laddr_check_cm().

Signed-off-by: Håkon Bugge &lt;haakon.bugge@oracle.com&gt;
Signed-off-by: Somasundaram Krishnasamy &lt;somasundaram.krishnasamy@oracle.com&gt;
Signed-off-by: Gerd Rausch &lt;gerd.rausch@oracle.com&gt;
Signed-off-by: Allison Henderson &lt;achender@kernel.org&gt;
Link: https://patch.msgid.link/20260408080420.540032-2-achender@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Stable-dep-of: ebf71dd4aff4 ("net/rds: Restrict use of RDS/IB to the initial network namespace")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>RDMA: Split kernel-only global device caps from uverbs device caps</title>
<updated>2022-04-06T18:02:13+00:00</updated>
<author>
<name>Jason Gunthorpe</name>
<email>jgg@nvidia.com</email>
</author>
<published>2022-04-04T15:26:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e945c653c8e972d1b81a88e474d79f801b60213a'/>
<id>urn:sha1:e945c653c8e972d1b81a88e474d79f801b60213a</id>
<content type='text'>
Split out flags from ib_device::device_cap_flags that are only used
internally to the kernel into kernel_cap_flags that is not part of the
uapi. This limits the device_cap_flags to being the same bitmap that will
be copied to userspace.

This cleanly splits out the uverbs flags from the kernel flags to avoid
confusion in the flags bitmap.

Add some short comments describing which each of the kernel flags is
connected to. Remove unused kernel flags.

Link: https://lore.kernel.org/r/0-v2-22c19e565eef+139a-kern_caps_jgg@nvidia.com
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Max Gurtovoy &lt;mgurtovoy@nvidia.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</content>
</entry>
<entry>
<title>rds: stop using dmapool</title>
<updated>2020-11-17T19:22:06+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2020-11-06T18:19:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=42f2611cc1738b201701e717246e11e86bef4e1e'/>
<id>urn:sha1:42f2611cc1738b201701e717246e11e86bef4e1e</id>
<content type='text'>
RDMA ULPs should only perform DMA through the ib_dma_* API instead of
using the hidden dma_device directly.  In addition using the dma coherent
API family that dmapool is a part of can be very ineffcient on plaforms
that are not DMA coherent.  Switch to use slab allocations and the
ib_dma_* APIs instead.

Link: https://lore.kernel.org/r/20201106181941.1878556-6-hch@lst.de
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Acked-by: Santosh Shilimkar &lt;santosh.shilimkar@oracle.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</content>
</entry>
<entry>
<title>RDMA: Remove 'max_fmr'</title>
<updated>2020-06-02T23:32:54+00:00</updated>
<author>
<name>Jason Gunthorpe</name>
<email>jgg@mellanox.com</email>
</author>
<published>2020-05-28T19:45:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=649392bf75a423287a9c4936b341677f12e8cf0b'/>
<id>urn:sha1:649392bf75a423287a9c4936b341677f12e8cf0b</id>
<content type='text'>
Now that FMR support is gone, this attribute can be deleted from all
places.

Link: https://lore.kernel.org/r/12-v3-f58e6669d5d3+2cf-fmr_removal_jgg@mellanox.com
Reviewed-by: Max Gurtovoy &lt;maxg@mellanox.com&gt;
Reviewed-by: Bernard Metzler &lt;bmt@zurich.ibm.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</content>
</entry>
<entry>
<title>RDMA/rds: Remove FMR support for memory registration</title>
<updated>2020-06-02T23:32:53+00:00</updated>
<author>
<name>Max Gurtovoy</name>
<email>maxg@mellanox.com</email>
</author>
<published>2020-05-28T19:45:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=07549ee21ce5247143ffb069bf838025d86b908c'/>
<id>urn:sha1:07549ee21ce5247143ffb069bf838025d86b908c</id>
<content type='text'>
Use FRWR method for memory registration by default and remove the ancient
and unsafe FMR method.

Link: https://lore.kernel.org/r/3-v3-f58e6669d5d3+2cf-fmr_removal_jgg@mellanox.com
Signed-off-by: Max Gurtovoy &lt;maxg@mellanox.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</content>
</entry>
<entry>
<title>RDMA: Allow ib_client's to fail when add() is called</title>
<updated>2020-05-06T14:57:33+00:00</updated>
<author>
<name>Jason Gunthorpe</name>
<email>jgg@mellanox.com</email>
</author>
<published>2020-04-21T17:24:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=11a0ae4c4bff9b2a471b54dbe910fc0f60e58e62'/>
<id>urn:sha1:11a0ae4c4bff9b2a471b54dbe910fc0f60e58e62</id>
<content type='text'>
When a client is added it isn't allowed to fail, but all the client's have
various failure paths within their add routines.

This creates the very fringe condition where the client was added, failed
during add and didn't set the client_data. The core code will then still
call other client_data centric ops like remove(), rename(), get_nl_info(),
and get_net_dev_by_params() with NULL client_data - which is confusing and
unexpected.

If the add() callback fails, then do not call any more client ops for the
device, even remove.

Remove all the now redundant checks for NULL client_data in ops callbacks.

Update all the add() callbacks to return error codes
appropriately. EOPNOTSUPP is used for cases where the ULP does not support
the ib_device - eg because it only works with IB.

Link: https://lore.kernel.org/r/20200421172440.387069-1-leon@kernel.org
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Acked-by: Ursula Braun &lt;ubraun@linux.ibm.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</content>
</entry>
<entry>
<title>net/rds: Handle ODP mr registration/unregistration</title>
<updated>2020-01-18T09:48:19+00:00</updated>
<author>
<name>Hans Westgaard Ry</name>
<email>hans.westgaard.ry@oracle.com</email>
</author>
<published>2020-01-15T12:43:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2eafa1746f17872483d1033b0116ec71435ea19d'/>
<id>urn:sha1:2eafa1746f17872483d1033b0116ec71435ea19d</id>
<content type='text'>
On-Demand-Paging MRs are registered using ib_reg_user_mr and
unregistered with ib_dereg_mr.

Signed-off-by: Hans Westgaard Ry &lt;hans.westgaard.ry@oracle.com&gt;
Acked-by: Santosh Shilimkar &lt;santosh.shilimkar@oracle.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
</content>
</entry>
<entry>
<title>net/rds: Remove unnecessary null check</title>
<updated>2019-10-17T19:23:03+00:00</updated>
<author>
<name>YueHaibing</name>
<email>yuehaibing@huawei.com</email>
</author>
<published>2019-10-15T11:47:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ce753e66dcc37e19572a87f70585ec6537dede81'/>
<id>urn:sha1:ce753e66dcc37e19572a87f70585ec6537dede81</id>
<content type='text'>
Null check before dma_pool_destroy is redundant, so remove it.
This is detected by coccinelle.

Signed-off-by: YueHaibing &lt;yuehaibing@huawei.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net/rds: Add missing include file</title>
<updated>2019-10-06T16:34:10+00:00</updated>
<author>
<name>YueHaibing</name>
<email>yuehaibing@huawei.com</email>
</author>
<published>2019-10-06T07:08:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d131c5bb60123f29ed15dd2f829b6644c2deec87'/>
<id>urn:sha1:d131c5bb60123f29ed15dd2f829b6644c2deec87</id>
<content type='text'>
Fix build error:

net/rds/ib_cm.c: In function rds_dma_hdrs_alloc:
net/rds/ib_cm.c:475:13: error: implicit declaration of function dma_pool_zalloc; did you mean mempool_alloc? [-Werror=implicit-function-declaration]
   hdrs[i] = dma_pool_zalloc(pool, GFP_KERNEL, &amp;hdr_daddrs[i]);
             ^~~~~~~~~~~~~~~
             mempool_alloc

net/rds/ib.c: In function rds_ib_dev_free:
net/rds/ib.c:111:3: error: implicit declaration of function dma_pool_destroy; did you mean mempool_destroy? [-Werror=implicit-function-declaration]
   dma_pool_destroy(rds_ibdev-&gt;rid_hdrs_pool);
   ^~~~~~~~~~~~~~~~
   mempool_destroy

Reported-by: Hulk Robot &lt;hulkci@huawei.com&gt;
Fixes: 9b17f5884be4 ("net/rds: Use DMA memory pool allocation for rds_header")
Signed-off-by: YueHaibing &lt;yuehaibing@huawei.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
