<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/infiniband/core/device.c, branch v6.19.11</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.19.11</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.19.11'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-03-04T12:20:55+00:00</updated>
<entry>
<title>RDMA/core: Fix stale RoCE GIDs during netdev events at registration</title>
<updated>2026-03-04T12:20:55+00:00</updated>
<author>
<name>Jiri Pirko</name>
<email>jiri@nvidia.com</email>
</author>
<published>2026-01-27T09:38:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=087d4f1939b97ee08924d2be2add26a90ff71e9e'/>
<id>urn:sha1:087d4f1939b97ee08924d2be2add26a90ff71e9e</id>
<content type='text'>
[ Upstream commit 9af0feae8016ba58ad7ff784a903404986b395b1 ]

RoCE GID entries become stale when netdev properties change during the
IB device registration window. This is reproducible with a udev rule
that sets a MAC address when a VF netdev appears:

  ACTION=="add", SUBSYSTEM=="net", KERNEL=="eth4", \
    RUN+="/sbin/ip link set eth4 address 88:22:33:44:55:66"

After VF creation, show_gids displays GIDs derived from the original
random MAC rather than the configured one.

The root cause is a race between netdev event processing and device
registration:

  CPU 0 (driver)                    CPU 1 (udev/workqueue)
  ──────────────                    ──────────────────────
  ib_register_device()
    ib_cache_setup_one()
      gid_table_setup_one()
        _gid_table_setup_one()
          ← GID table allocated
        rdma_roce_rescan_device()
          ← GIDs populated with
            OLD MAC
                                    ip link set eth4 addr NEW_MAC
                                    NETDEV_CHANGEADDR queued
                                    netdevice_event_work_handler()
                                      ib_enum_all_roce_netdevs()
                                        ← Iterates DEVICE_REGISTERED
                                        ← Device NOT marked yet, SKIP!
    enable_device_and_get()
      xa_set_mark(DEVICE_REGISTERED)
          ← Too late, event was lost

The netdev event handler uses ib_enum_all_roce_netdevs() which only
iterates devices marked DEVICE_REGISTERED. However, this mark is set
late in the registration process, after the GID cache is already
populated. Events arriving in this window are silently dropped.

Fix this by introducing a new xarray mark DEVICE_GID_UPDATES that is
set immediately after the GID table is allocated and initialized. Use
the new mark in ib_enum_all_roce_netdevs() function to iterate devices
instead of DEVICE_REGISTERED.

This is safe because:
- After _gid_table_setup_one(), all required structures exist (port_data,
  immutable, cache.gid)
- The GID table mutex serializes concurrent access between the initial
  rescan and event handlers
- Event handlers correctly update stale GIDs even when racing with rescan
- The mark is cleared in ib_cache_cleanup_one() before teardown

This also fixes similar races for IP address events (inetaddr_event,
inet6addr_event) which use the same enumeration path.

Fixes: 0df91bb67334 ("RDMA/devices: Use xarray to store the client_data")
Signed-off-by: Jiri Pirko &lt;jiri@nvidia.com&gt;
Link: https://patch.msgid.link/20260127093839.126291-1-jiri@resnulli.us
Reported-by: syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>RDMA/core: always drop device refcount in ib_del_sub_device_and_put()</title>
<updated>2025-12-21T11:45:17+00:00</updated>
<author>
<name>Tetsuo Handa</name>
<email>penguin-kernel@I-love.SAKURA.ne.jp</email>
</author>
<published>2025-12-20T02:11:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=fa3c411d21ebc26ffd175c7256c37cefa35020aa'/>
<id>urn:sha1:fa3c411d21ebc26ffd175c7256c37cefa35020aa</id>
<content type='text'>
Since nldev_deldev() (introduced by commit 060c642b2ab8 ("RDMA/nldev: Add
support to add/delete a sub IB device through netlink") grabs a reference
using ib_device_get_by_index() before calling ib_del_sub_device_and_put(),
we need to drop that reference before returning -EOPNOTSUPP error.

Reported-by: syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84
Fixes: bca51197620a ("RDMA/core: Support IB sub device with type "SMI"")
Signed-off-by: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Link: https://patch.msgid.link/80749a85-cbe2-460c-8451-42516013f9fa@I-love.SAKURA.ne.jp
Reviewed-by: Parav Pandit &lt;parav@nvidia.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
</entry>
<entry>
<title>RDMA/core: WQ_PERCPU added to alloc_workqueue users</title>
<updated>2025-11-06T07:23:23+00:00</updated>
<author>
<name>Marco Crivellari</name>
<email>marco.crivellari@suse.com</email>
</author>
<published>2025-11-01T16:31:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e60c5583b661da65b09bfd6ae91126607397490e'/>
<id>urn:sha1:e60c5583b661da65b09bfd6ae91126607397490e</id>
<content type='text'>
Currently if a user enqueue a work item using schedule_delayed_work() the
used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.
This lack of consistentcy cannot be addressed without refactoring the API.

alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.

This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.

This change adds a new WQ_PERCPU flag to explicitly request
alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.

With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.

Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.

Suggested-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Marco Crivellari &lt;marco.crivellari@suse.com&gt;
Link: https://patch.msgid.link/20251101163121.78400-3-marco.crivellari@suse.com
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
</entry>
<entry>
<title>RDMA/core: fix "truely"-&gt;"truly"</title>
<updated>2025-09-11T06:18:35+00:00</updated>
<author>
<name>Xichao Zhao</name>
<email>zhao.xichao@vivo.com</email>
</author>
<published>2025-08-27T12:00:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2ed096dc41536ded233612ac9f308dcdefebd286'/>
<id>urn:sha1:2ed096dc41536ded233612ac9f308dcdefebd286</id>
<content type='text'>
Trivial fix to spelling mistake in comment text.

Signed-off-by: Xichao Zhao &lt;zhao.xichao@vivo.com&gt;
Link: https://patch.msgid.link/20250827120007.489496-1-zhao.xichao@vivo.com
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
</entry>
<entry>
<title>RDMA/core: Introduce a DMAH object and its alloc/free APIs</title>
<updated>2025-07-23T05:42:10+00:00</updated>
<author>
<name>Yishai Hadas</name>
<email>yishaih@nvidia.com</email>
</author>
<published>2025-07-17T12:17:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d83edab562a496a42720902a1d2effccd05c37c5'/>
<id>urn:sha1:d83edab562a496a42720902a1d2effccd05c37c5</id>
<content type='text'>
Introduce a new DMA handle (DMAH) object along with its corresponding
allocation and deallocation APIs.

This DMAH object encapsulates attributes intended for use in DMA
transactions.

While its initial purpose is to support TPH functionality, it is
designed to be extensible for future features such as DMA PCI multipath,
PCI UIO configurations, PCI traffic class selection, and more.

Further details:
----------------
We ensure that a caller requesting a DMA handle for a specific CPU ID is
permitted to be scheduled on it. This prevent a potential security issue
where a non privilege user may trigger DMA operations toward a CPU that
it's not allowed to run on.

We manage reference counting for the DMAH object and its consumers
(e.g., memory regions) as will be detailed in subsequent patches in the
series.

Signed-off-by: Yishai Hadas &lt;yishaih@nvidia.com&gt;
Reviewed-by: Edward Srouji &lt;edwards@nvidia.com&gt;
Link: https://patch.msgid.link/2cad097e849597e49d6b61e6865dba878257f371.1752752567.git.leon@kernel.org
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
</entry>
<entry>
<title>RDMA/uverbs: Add a common way to create CQ with umem</title>
<updated>2025-07-13T08:00:34+00:00</updated>
<author>
<name>Michael Margolin</name>
<email>mrgolin@amazon.com</email>
</author>
<published>2025-07-08T20:23:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1a40c362ae265ca4004f7373b34c22af6810f6cb'/>
<id>urn:sha1:1a40c362ae265ca4004f7373b34c22af6810f6cb</id>
<content type='text'>
Add ioctl command attributes and a common handling for the option to
create CQs with memory buffers passed from userspace. When required
attributes are supplied, create umem and provide it for driver's use.
The extension enables creation of CQs on top of preallocated CPU
virtual or device memory buffers, by supplying VA or dmabuf fd, in a
common way.
Drivers can support this flow by initializing a new create_cq_umem fp
field in their ops struct, with a function that can handle the new
parameter.

Signed-off-by: Michael Margolin &lt;mrgolin@amazon.com&gt;
Link: https://patch.msgid.link/20250708202308.24783-2-mrgolin@amazon.com
Reviewed-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
</entry>
<entry>
<title>RDMA/uverbs: Check CAP_NET_RAW in user namespace for flow create</title>
<updated>2025-07-01T09:21:27+00:00</updated>
<author>
<name>Parav Pandit</name>
<email>parav@nvidia.com</email>
</author>
<published>2025-06-26T18:58:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f458ccd2aa2c5a6f0129a9b1548f2825071fdc6b'/>
<id>urn:sha1:f458ccd2aa2c5a6f0129a9b1548f2825071fdc6b</id>
<content type='text'>
Currently, the capability check is done in the default
init_user_ns user namespace. When a process runs in a
non default user namespace, such check fails. Due to this
when a process is running using Podman, it fails to create
the flow resource.

Since the RDMA device is a resource within a network namespace,
use the network namespace associated with the RDMA device to
determine its owning user namespace.

Fixes: 436f2ad05a0b ("IB/core: Export ib_create/destroy_flow through uverbs")
Signed-off-by: Parav Pandit &lt;parav@nvidia.com&gt;
Suggested-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Link: https://patch.msgid.link/6df6f2f24627874c4f6d041c19dc1f6f29f68f84.1750963874.git.leon@kernel.org
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
</entry>
<entry>
<title>RDMA/core: Extend RDMA device registration to be net namespace aware</title>
<updated>2025-06-26T12:10:07+00:00</updated>
<author>
<name>Mark Bloch</name>
<email>mbloch@nvidia.com</email>
</author>
<published>2025-06-17T08:44:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8cffca866ba86cbf0d097e56521b17d830956d4a'/>
<id>urn:sha1:8cffca866ba86cbf0d097e56521b17d830956d4a</id>
<content type='text'>
Presently, RDMA devices are always registered within the init network
namespace, even if the associated devlink device's namespace was
changed via a devlink reload. This mismatch leads to discrepancies
between the network namespace of the devlink device and that of the
RDMA device.

Therefore, extend the RDMA device allocation API to optionally take
the net namespace. This isn't limited to devices that support devlink
but allows all users to provide the network namespace if they need to
do so.

If a network namespace is provided during device allocation, it's up
to the caller to make sure the namespace stays valid until
ib_register_device() is called.

Signed-off-by: Shay Drory &lt;shayd@nvidia.com&gt;
Signed-off-by: Mark Bloch &lt;mbloch@nvidia.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
</content>
</entry>
<entry>
<title>RDMA/core: Add driver APIs pre_destroy_cq() and post_destroy_cq()</title>
<updated>2025-06-25T07:49:51+00:00</updated>
<author>
<name>Mark Zhang</name>
<email>markzhang@nvidia.com</email>
</author>
<published>2025-06-16T08:26:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5a2a5b65d5d67279be9e1f0e4b9baf39ee594cb1'/>
<id>urn:sha1:5a2a5b65d5d67279be9e1f0e4b9baf39ee594cb1</id>
<content type='text'>
Currently in ib_free_cq, it disables IRQ or cancel the CQ work before
driver destroy_cq. This isn't good as a new IRQ or a CQ work can be
submitted immediately after disabling IRQ or canceling CQ work, which
may run concurrently with destroy_cq and cause crashes.
The right flow should be:
 1. Driver disables CQ to make sure no new CQ event will be submitted;
 2. Disables IRQ or Cancels CQ work in core layer, to make sure no CQ
    polling work is running;
 3. Free all resources to destroy the CQ.

This patch adds 2 driver APIs:
- pre_destroy_cq(): Disable a CQ to prevent it from generating any new
  work completions, but not free any kernel resources;
- post_destroy_cq(): Free all kernel resources.

In ib_free_cq, the IRQ is disabled or CQ work is canceled after
pre_destroy_cq, and before post_destroy_cq.

Fixes: 14d3a3b2498e ("IB: add a proper completion queue abstraction")
Signed-off-by: Mark Zhang &lt;markzhang@nvidia.com&gt;
Link: https://patch.msgid.link/b5f7ae3d75f44a3e15ff3f4eb2bbdea13e06b97f.1750062328.git.leon@kernel.org
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
</entry>
<entry>
<title>RDMA/core: Fix "KASAN: slab-use-after-free Read in ib_register_device" problem</title>
<updated>2025-05-06T17:36:57+00:00</updated>
<author>
<name>Zhu Yanjun</name>
<email>yanjun.zhu@linux.dev</email>
</author>
<published>2025-05-06T15:10:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d0706bfd3ee40923c001c6827b786a309e2a8713'/>
<id>urn:sha1:d0706bfd3ee40923c001c6827b786a309e2a8713</id>
<content type='text'>
Call Trace:

 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120
 print_address_description mm/kasan/report.c:408 [inline]
 print_report+0xc3/0x670 mm/kasan/report.c:521
 kasan_report+0xe0/0x110 mm/kasan/report.c:634
 strlen+0x93/0xa0 lib/string.c:420
 __fortify_strlen include/linux/fortify-string.h:268 [inline]
 get_kobj_path_length lib/kobject.c:118 [inline]
 kobject_get_path+0x3f/0x2a0 lib/kobject.c:158
 kobject_uevent_env+0x289/0x1870 lib/kobject_uevent.c:545
 ib_register_device drivers/infiniband/core/device.c:1472 [inline]
 ib_register_device+0x8cf/0xe00 drivers/infiniband/core/device.c:1393
 rxe_register_device+0x275/0x320 drivers/infiniband/sw/rxe/rxe_verbs.c:1552
 rxe_net_add+0x8e/0xe0 drivers/infiniband/sw/rxe/rxe_net.c:550
 rxe_newlink+0x70/0x190 drivers/infiniband/sw/rxe/rxe.c:225
 nldev_newlink+0x3a3/0x680 drivers/infiniband/core/nldev.c:1796
 rdma_nl_rcv_msg+0x387/0x6e0 drivers/infiniband/core/netlink.c:195
 rdma_nl_rcv_skb.constprop.0.isra.0+0x2e5/0x450
 netlink_unicast_kernel net/netlink/af_netlink.c:1313 [inline]
 netlink_unicast+0x53a/0x7f0 net/netlink/af_netlink.c:1339
 netlink_sendmsg+0x8d1/0xdd0 net/netlink/af_netlink.c:1883
 sock_sendmsg_nosec net/socket.c:712 [inline]
 __sock_sendmsg net/socket.c:727 [inline]
 ____sys_sendmsg+0xa95/0xc70 net/socket.c:2566
 ___sys_sendmsg+0x134/0x1d0 net/socket.c:2620
 __sys_sendmsg+0x16d/0x220 net/socket.c:2652
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xcd/0x260 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

This problem is similar to the problem that the
commit 1d6a9e7449e2 ("RDMA/core: Fix use-after-free when rename device name")
fixes.

The root cause is: the function ib_device_rename() renames the name with
lock. But in the function kobject_uevent(), this name is accessed without
lock protection at the same time.

The solution is to add the lock protection when this name is accessed in
the function kobject_uevent().

Fixes: 779e0bf47632 ("RDMA/core: Do not indicate device ready when device enablement fails")
Link: https://patch.msgid.link/r/20250506151008.75701-1-yanjun.zhu@linux.dev
Reported-by: syzbot+e2ce9e275ecc70a30b72@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=e2ce9e275ecc70a30b72
Signed-off-by: Zhu Yanjun &lt;yanjun.zhu@linux.dev&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</content>
</entry>
</feed>
