<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/infiniband/core, branch v6.13.6</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.13.6</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.13.6'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2025-01-03T19:09:35+00:00</updated>
<entry>
<title>Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma</title>
<updated>2025-01-03T19:09:35+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-01-03T19:09:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=dea3165f989b259bc1293d9dfc50053223f5f4b2'/>
<id>urn:sha1:dea3165f989b259bc1293d9dfc50053223f5f4b2</id>
<content type='text'>
Pull rdma fixes from Jason Gunthorpe:
 "A lot of fixes accumulated over the holiday break:

   - Static tool fixes, value is already proven to be NULL, possible
     integer overflow

   - Many bnxt_re fixes:
      - Crashes due to a mismatch in the maximum SGE list size
      - Don't waste memory for user QPs by creating kernel-only
        structures
      - Fix compatability issues with older HW in some of the new HW
        features recently introduced: RTS-&gt;RTS feature, work around 9096
      - Do not allow destroy_qp to fail
      - Validate QP MTU against device limits
      - Add missing validation on madatory QP attributes for RTR-&gt;RTS
      - Report port_num in query_qp as required by the spec
      - Fix creation of QPs of the maximum queue size, and in the
        variable mode
      - Allow all QPs to be used on newer HW by limiting a work around
        only to HW it affects
      - Use the correct MSN table size for variable mode QPs
      - Add missing locking in create_qp() accessing the qp_tbl
      - Form WQE buffers correctly when some of the buffers are 0 hop
      - Don't crash on QP destroy if the userspace doesn't setup the
        dip_ctx
      - Add the missing QP flush handler call on the DWQE path to avoid
        hanging on error recovery
      - Consistently use ENXIO for return codes if the devices is
        fatally errored

   - Try again to fix VLAN support on iwarp, previous fix was reverted
     due to breaking other cards

   - Correct error path return code for rdma netlink events

   - Remove the seperate net_device pointer in siw and rxe which
     syzkaller found a way to UAF

   - Fix a UAF of a stack ib_sge in rtrs

   - Fix a regression where old mlx5 devices and FW were wrongly
     activing new device features and failing"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (28 commits)
  RDMA/mlx5: Enable multiplane mode only when it is supported
  RDMA/bnxt_re: Fix error recovery sequence
  RDMA/rtrs: Ensure 'ib_sge list' is accessible
  RDMA/rxe: Remove the direct link to net_device
  RDMA/hns: Fix missing flush CQE for DWQE
  RDMA/hns: Fix warning storm caused by invalid input in IO path
  RDMA/hns: Fix accessing invalid dip_ctx during destroying QP
  RDMA/hns: Fix mapping error of zero-hop WQE buffer
  RDMA/bnxt_re: Fix the locking while accessing the QP table
  RDMA/bnxt_re: Fix MSN table size for variable wqe mode
  RDMA/bnxt_re: Add send queue size check for variable wqe
  RDMA/bnxt_re: Disable use of reserved wqes
  RDMA/bnxt_re: Fix max_qp_wrs reported
  RDMA/siw: Remove direct link to net_device
  RDMA/nldev: Set error code in rdma_nl_notify_event
  RDMA/bnxt_re: Fix reporting hw_ver in query_device
  RDMA/bnxt_re: Fix to export port num to ib_query_qp
  RDMA/bnxt_re: Fix setting mandatory attributes for modify_qp
  RDMA/bnxt_re: Add check for path mtu in modify_qp
  RDMA/bnxt_re: Fix the check for 9060 condition
  ...
</content>
</entry>
<entry>
<title>RDMA/nldev: Set error code in rdma_nl_notify_event</title>
<updated>2024-12-19T10:15:20+00:00</updated>
<author>
<name>Chiara Meiohas</name>
<email>cmeiohas@nvidia.com</email>
</author>
<published>2024-12-10T07:33:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=13a6691910cc23ea9ba4066e098603088673d5b0'/>
<id>urn:sha1:13a6691910cc23ea9ba4066e098603088673d5b0</id>
<content type='text'>
In case of error set the error code before the goto.

Fixes: 6ff57a2ea7c2 ("RDMA/nldev: Fix NULL pointer dereferences issue in rdma_nl_notify_event")
Reported-by: Dan Carpenter &lt;dan.carpenter@linaro.org&gt;
Closes: https://lore.kernel.org/linux-rdma/a84a2fc3-33b6-46da-a1bd-3343fa07eaf9@stanley.mountain/
Signed-off-by: Chiara Meiohas &lt;cmeiohas@nvidia.com&gt;
Reviewed-by: Maher Sanalla &lt;msanalla@nvidia.com&gt;
Link: https://patch.msgid.link/13eb25961923f5de9eb9ecbbc94e26113d6049ef.1733815944.git.leonro@nvidia.com
Reviewed-by: Kalesh AP &lt;kalesh-anakkur.purayil@broadcom.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
</entry>
<entry>
<title>RDMA/core: Fix ENODEV error for iWARP test over vlan</title>
<updated>2024-12-10T08:58:03+00:00</updated>
<author>
<name>Anumula Murali Mohan Reddy</name>
<email>anumula@chelsio.com</email>
</author>
<published>2024-12-03T14:00:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a4048c83fd87c65657a4acb17d639092d4b6133d'/>
<id>urn:sha1:a4048c83fd87c65657a4acb17d639092d4b6133d</id>
<content type='text'>
If traffic is over vlan, cma_validate_port() fails to match
net_device ifindex with bound_if_index and results in ENODEV error.
As iWARP gid table is static, it contains entry corresponding  to
only one net device which is either real netdev or vlan netdev for
cases like  siw attached to a vlan interface.
This patch fixes the issue by assigning bound_if_index with net
device index, if real net device obtained from bound if index matches
with net device retrieved from gid table

Fixes: f8ef1be816bf ("RDMA/cma: Avoid GID lookups on iWARP devices")
Link: https://lore.kernel.org/all/ZzNgdrjo1kSCGbRz@chelsio.com/
Signed-off-by: Anumula Murali Mohan Reddy &lt;anumula@chelsio.com&gt;
Signed-off-by: Potnuri Bharat Teja &lt;bharat@chelsio.com&gt;
Link: https://patch.msgid.link/20241203140052.3985-1-anumula@chelsio.com
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
</entry>
<entry>
<title>RDMA/uverbs: Prevent integer overflow issue</title>
<updated>2024-12-04T14:19:39+00:00</updated>
<author>
<name>Dan Carpenter</name>
<email>dan.carpenter@linaro.org</email>
</author>
<published>2024-11-30T10:06:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d0257e089d1bbd35c69b6c97ff73e3690ab149a9'/>
<id>urn:sha1:d0257e089d1bbd35c69b6c97ff73e3690ab149a9</id>
<content type='text'>
In the expression "cmd.wqe_size * cmd.wr_count", both variables are u32
values that come from the user so the multiplication can lead to integer
wrapping.  Then we pass the result to uverbs_request_next_ptr() which also
could potentially wrap.  The "cmd.sge_count * sizeof(struct ib_uverbs_sge)"
multiplication can also overflow on 32bit systems although it's fine on
64bit systems.

This patch does two things.  First, I've re-arranged the condition in
uverbs_request_next_ptr() so that the use controlled variable "len" is on
one side of the comparison by itself without any math.  Then I've modified
all the callers to use size_mul() for the multiplications.

Fixes: 67cdb40ca444 ("[IB] uverbs: Implement more commands")
Cc: stable@vger.kernel.org
Signed-off-by: Dan Carpenter &lt;dan.carpenter@linaro.org&gt;
Link: https://patch.msgid.link/b8765ab3-c2da-4611-aae0-ddd6ba173d23@stanley.mountain
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
</entry>
<entry>
<title>module: Convert symbol namespace to string literal</title>
<updated>2024-12-02T19:34:44+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2024-12-02T14:59:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=cdd30ebb1b9f36159d66f088b61aee264e649d7a'/>
<id>urn:sha1:cdd30ebb1b9f36159d66f088b61aee264e649d7a</id>
<content type='text'>
Clean up the existing export namespace code along the same lines of
commit 33def8498fdd ("treewide: Convert macro and uses of __section(foo)
to __section("foo")") and for the same reason, it is not desired for the
namespace argument to be a macro expansion itself.

Scripted using

  git grep -l -e MODULE_IMPORT_NS -e EXPORT_SYMBOL_NS | while read file;
  do
    awk -i inplace '
      /^#define EXPORT_SYMBOL_NS/ {
        gsub(/__stringify\(ns\)/, "ns");
        print;
        next;
      }
      /^#define MODULE_IMPORT_NS/ {
        gsub(/__stringify\(ns\)/, "ns");
        print;
        next;
      }
      /MODULE_IMPORT_NS/ {
        $0 = gensub(/MODULE_IMPORT_NS\(([^)]*)\)/, "MODULE_IMPORT_NS(\"\\1\")", "g");
      }
      /EXPORT_SYMBOL_NS/ {
        if ($0 ~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+),/) {
  	if ($0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/ &amp;&amp;
  	    $0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(\)/ &amp;&amp;
  	    $0 !~ /^my/) {
  	  getline line;
  	  gsub(/[[:space:]]*\\$/, "");
  	  gsub(/[[:space:]]/, "", line);
  	  $0 = $0 " " line;
  	}

  	$0 = gensub(/(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/,
  		    "\\1(\\2, \"\\3\")", "g");
        }
      }
      { print }' $file;
  done

Requested-by: Masahiro Yamada &lt;masahiroy@kernel.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://mail.google.com/mail/u/2/#inbox/FMfcgzQXKWgMmjdFwwdsfgxzKpVHWPlc
Acked-by: Greg KH &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma</title>
<updated>2024-11-23T04:03:57+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-11-23T04:03:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2a163a4cea153348172e260a0c5b5569103a66a3'/>
<id>urn:sha1:2a163a4cea153348172e260a0c5b5569103a66a3</id>
<content type='text'>
Pull rdma updates from Jason Gunthorpe:
 "Seveal fixes scattered across the drivers and a few new features:

   - Minor updates and bug fixes to hfi1, efa, iopob, bnxt, hns

   - Force disassociate the userspace FD when hns does an async reset

   - bnxt new features for optimized modify QP to skip certain stayes,
     CQ coalescing, better debug dumping

   - mlx5 new data placement ordering feature

   - Faster destruction of mlx5 devx HW objects

   - Improvements to RDMA CM mad handling"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (51 commits)
  RDMA/bnxt_re: Correct the sequence of device suspend
  RDMA/bnxt_re: Use the default mode of congestion control
  RDMA/bnxt_re: Support different traffic class
  IB/cm: Rework sending DREQ when destroying a cm_id
  IB/cm: Do not hold reference on cm_id unless needed
  IB/cm: Explicitly mark if a response MAD is a retransmission
  RDMA/mlx5: Move events notifier registration to be after device registration
  RDMA/bnxt_re: Cache MSIx info to a local structure
  RDMA/bnxt_re: Refurbish CQ to NQ hash calculation
  RDMA/bnxt_re: Refactor NQ allocation
  RDMA/bnxt_re: Fail probe early when not enough MSI-x vectors are reserved
  RDMA/hns: Fix different dgids mapping to the same dip_idx
  RDMA/bnxt_re: Add set_func_resources support for P5/P7 adapters
  RDMA/bnxt_re: Enhance RoCE SRIOV resource configuration design
  bnxt_en: Add support for RoCE sriov configuration
  RDMA/hns: Fix NULL pointer derefernce in hns_roce_map_mr_sg()
  RDMA/hns: Fix out-of-order issue of requester when setting FENCE
  RDMA/nldev: Add IB device and net device rename events
  RDMA/mlx5: Add implementation for ufile_hw_cleanup device operation
  RDMA/core: Move ib_uverbs_file struct to uverbs_types.h
  ...
</content>
</entry>
<entry>
<title>Merge tag 'pull-fd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs</title>
<updated>2024-11-18T20:24:06+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-11-18T20:24:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0f25f0e4efaeb68086f7e65c442f2d648b21736f'/>
<id>urn:sha1:0f25f0e4efaeb68086f7e65c442f2d648b21736f</id>
<content type='text'>
Pull 'struct fd' class updates from Al Viro:
 "The bulk of struct fd memory safety stuff

  Making sure that struct fd instances are destroyed in the same scope
  where they'd been created, getting rid of reassignments and passing
  them by reference, converting to CLASS(fd{,_pos,_raw}).

  We are getting very close to having the memory safety of that stuff
  trivial to verify"

* tag 'pull-fd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (28 commits)
  deal with the last remaing boolean uses of fd_file()
  css_set_fork(): switch to CLASS(fd_raw, ...)
  memcg_write_event_control(): switch to CLASS(fd)
  assorted variants of irqfd setup: convert to CLASS(fd)
  do_pollfd(): convert to CLASS(fd)
  convert do_select()
  convert vfs_dedupe_file_range().
  convert cifs_ioctl_copychunk()
  convert media_request_get_by_fd()
  convert spu_run(2)
  switch spufs_calls_{get,put}() to CLASS() use
  convert cachestat(2)
  convert do_preadv()/do_pwritev()
  fdget(), more trivial conversions
  fdget(), trivial conversions
  privcmd_ioeventfd_assign(): don't open-code eventfd_ctx_fdget()
  o2hb_region_dev_store(): avoid goto around fdget()/fdput()
  introduce "fd_pos" class, convert fdget_pos() users to it.
  fdget_raw() users: switch to CLASS(fd_raw)
  convert vmsplice() to CLASS(fd)
  ...
</content>
</entry>
<entry>
<title>IB/cm: Rework sending DREQ when destroying a cm_id</title>
<updated>2024-11-17T09:51:49+00:00</updated>
<author>
<name>Sean Hefty</name>
<email>shefty@nvidia.com</email>
</author>
<published>2024-11-13T11:12:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=fc0856c3a32576fb21c494f38b9c6c8dc3bf58ab'/>
<id>urn:sha1:fc0856c3a32576fb21c494f38b9c6c8dc3bf58ab</id>
<content type='text'>
A DREQ is sent in 2 situations:

  1. When requested by the user.
     This DREQ has to wait for a DREP, which will be routed to the user.

  2. When the cm_id is destroyed.
     This DREQ is generated by the CM to notify the peer that the
     connection has been destroyed.

In the latter case, any DREP that is received will be discarded.
There's no need to hold a reference on the cm_id.  Today, both
situations are covered by the same function: cm_send_dreq_locked().
When invoked in the cm_id destroy path, the cm_id reference would be
held until the DREQ completes, blocking the destruction.  Because it
could take several seconds to minutes before the DREQ receives a DREP,
the destroy call posts a send for the DREQ then immediately cancels the
MAD.  However, cancellation is not immediate in the MAD layer.  There
could still be a delay before the MAD layer returns the DREQ to the CM.
Moreover, the only guarantee is that the DREQ will be sent at most once.

Introduce a separate flow for sending a DREQ when destroying the cm_id.
The new flow will not hold a reference on the cm_id, allowing it to be
cleaned up immediately.  The cancellation trick is no longer needed.
The MAD layer will send the DREQ exactly once.

Signed-off-by: Sean Hefty &lt;shefty@nvidia.com&gt;
Signed-off-by: Or Har-Toov &lt;ohartoov@nvidia.com&gt;
Signed-off-by: Vlad Dumitrescu &lt;vdumitrescu@nvidia.com&gt;
Link: https://patch.msgid.link/a288a098b8e0550305755fd4a7937431699317f4.1731495873.git.leon@kernel.org
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
</entry>
<entry>
<title>IB/cm: Do not hold reference on cm_id unless needed</title>
<updated>2024-11-17T09:51:43+00:00</updated>
<author>
<name>Sean Hefty</name>
<email>shefty@nvidia.com</email>
</author>
<published>2024-11-13T11:12:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1e5159219076ddb2e44338c667c83fd1bd43dfef'/>
<id>urn:sha1:1e5159219076ddb2e44338c667c83fd1bd43dfef</id>
<content type='text'>
Typically, when the CM sends a MAD it bumps a reference count
on the associated cm_id.  There are some exceptions, such
as when the MAD is a direct response to a receive MAD.  For
example, the CM may generate an MRA in response to a duplicate
REQ.  But, in general, if a MAD may be sent as a result of
the user invoking an API call (e.g. ib_send_cm_rep(),
ib_send_cm_rtu(), etc.), a reference is taken on the cm_id.

This reference is necessary if the MAD requires a response.
The reference allows routing a response MAD back to the
cm_id, or, if no response is received, allows updating the
cm_id state to reflect the failure.

For MADs which do not generate a response from the
target, however, there's no need to hold a reference on the cm_id.
Such MADs will not be retried by the MAD layer and their
completions do not change the state of the cm_id.

There are 2 internal calls used to allocate MADs which take
a reference on the cm_id: cm_alloc_msg() and cm_alloc_priv_msg().
The latter calls the former.  It turns out that all other places
where cm_alloc_msg() is called are for MADs that do not generate
a response from the target: sending an RTU, DREP, REJ, MRA, or
SIDR REP.  In all of these cases, there's no need to hold a
reference on the cm_id.

The benefit of dropping unneeded references is that it allows
destruction of the cm_id to proceed immediately.  Currently,
the cm_destroy_id() call blocks as long as there's a reference
held on the cm_id.  Worse, is that cm_destroy_id() will send
MADs, which it then needs to complete.  Sending the MADs is
beneficial, as they notify the peer that a connection is
being destroyed.  However, since the MADs hold a reference
on the cm_id, they block destruction and cannot be retried.

Move cm_id referencing from cm_alloc_msg() to cm_alloc_priv_msg().
The latter should hold a reference on the cm_id in all cases but
one, which will be handled in a separate patch.  cm_alloc_priv_msg()
is used when sending a REQ, REP, DREQ, and SIDR REQ, all of which
require a response.

Also, merge common code into cm_alloc_priv_msg() and combine the
freeing of all messages which do not need a response.

Signed-off-by: Sean Hefty &lt;shefty@nvidia.com&gt;
Signed-off-by: Or Har-Toov &lt;ohartoov@nvidia.com&gt;
Signed-off-by: Vlad Dumitrescu &lt;vdumitrescu@nvidia.com&gt;
Link: https://patch.msgid.link/1f0f96acace72790ecf89087fc765dead960189e.1731495873.git.leon@kernel.org
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
</entry>
<entry>
<title>IB/cm: Explicitly mark if a response MAD is a retransmission</title>
<updated>2024-11-17T09:51:38+00:00</updated>
<author>
<name>Sean Hefty</name>
<email>shefty@nvidia.com</email>
</author>
<published>2024-11-13T11:12:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0492458750c9fbd69cfc7baddd3ddcac77f2a0c8'/>
<id>urn:sha1:0492458750c9fbd69cfc7baddd3ddcac77f2a0c8</id>
<content type='text'>
In several situations the CM may send a reply to a received MAD
without the reply being directly linked with a cm_id.  For
example, it may send a REJ in response to a REQ which does not
match a listener.  Or, it may send a DREP in response to a DREQ
if the cm_id has already been destroyed.  This can happen if the
original DREP was lost and the DREQ was retried.

When such a response MAD completes, it updates a counter tracking
how many MADs were retried.  However, not all response MADs issued
directly by the CM may be retries.  The REJ mentioned in the example
above is such a case.  To distinguish between responses which were
retries versus those that are not, the send_handler performs the
following check: is a retry if the response is not associated with
a cm_id and the response is not a REJ message.

Replace this indirect method of checking if a response is a retry
with an explicit check.  Note that these retries are generated
directly by the CM, rather than retried by the MAD layer.

This change will be needed by later changes which would otherwise
break the indirect check.

Signed-off-by: Sean Hefty &lt;shefty@nvidia.com&gt;
Signed-off-by: Or Har-Toov &lt;ohartoov@nvidia.com&gt;
Signed-off-by: Vlad Dumitrescu &lt;vdumitrescu@nvidia.com&gt;
Link: https://patch.msgid.link/1ee6e2a68f8de1992b9da23aa1d7e3f9f25e0036.1731495873.git.leon@kernel.org
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
</entry>
</feed>
