<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include/linux/mlx4, branch v6.6.132</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.132</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.132'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2025-06-04T12:42:09+00:00</updated>
<entry>
<title>net/mlx4_core: Avoid impossible mlx4_db_alloc() order value</title>
<updated>2025-06-04T12:42:09+00:00</updated>
<author>
<name>Kees Cook</name>
<email>kees@kernel.org</email>
</author>
<published>2025-02-10T17:45:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d655dbd82b555abf19ca38b57cde76175d4ae615'/>
<id>urn:sha1:d655dbd82b555abf19ca38b57cde76175d4ae615</id>
<content type='text'>
[ Upstream commit 4a6f18f28627e121bd1f74b5fcc9f945d6dbeb1e ]

GCC can see that the value range for "order" is capped, but this leads
it to consider that it might be negative, leading to a false positive
warning (with GCC 15 with -Warray-bounds -fdiagnostics-details):

../drivers/net/ethernet/mellanox/mlx4/alloc.c:691:47: error: array subscript -1 is below array bounds of 'long unsigned int *[2]' [-Werror=array-bounds=]
  691 |                 i = find_first_bit(pgdir-&gt;bits[o], MLX4_DB_PER_PAGE &gt;&gt; o);
      |                                    ~~~~~~~~~~~^~~
  'mlx4_alloc_db_from_pgdir': events 1-2
  691 |                 i = find_first_bit(pgdir-&gt;bits[o], MLX4_DB_PER_PAGE &gt;&gt; o);                        |                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                     |                         |                                                   |                     |                         (2) out of array bounds here
      |                     (1) when the condition is evaluated to true                             In file included from ../drivers/net/ethernet/mellanox/mlx4/mlx4.h:53,
                 from ../drivers/net/ethernet/mellanox/mlx4/alloc.c:42:
../include/linux/mlx4/device.h:664:33: note: while referencing 'bits'
  664 |         unsigned long          *bits[2];
      |                                 ^~~~

Switch the argument to unsigned int, which removes the compiler needing
to consider negative values.

Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
Link: https://patch.msgid.link/20250210174504.work.075-kees@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>mlx4: Delete custom device management logic</title>
<updated>2023-08-23T07:25:28+00:00</updated>
<author>
<name>Petr Pavlu</name>
<email>petr.pavlu@suse.com</email>
</author>
<published>2023-08-21T13:12:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c138cdb89a14ce00d45e77d3db18263e4b9f9465'/>
<id>urn:sha1:c138cdb89a14ce00d45e77d3db18263e4b9f9465</id>
<content type='text'>
After the conversion to use the auxiliary bus, the custom device
management is not needed anymore and can be deleted.

Signed-off-by: Petr Pavlu &lt;petr.pavlu@suse.com&gt;
Tested-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
Reviewed-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
Acked-by: Tariq Toukan &lt;tariqt@nvidia.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>mlx4: Register mlx4 devices to an auxiliary virtual bus</title>
<updated>2023-08-23T07:25:28+00:00</updated>
<author>
<name>Petr Pavlu</name>
<email>petr.pavlu@suse.com</email>
</author>
<published>2023-08-21T13:12:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8c2d2b87719bad9c1db4a7919e74f7818a8ca3de'/>
<id>urn:sha1:8c2d2b87719bad9c1db4a7919e74f7818a8ca3de</id>
<content type='text'>
Add an auxiliary virtual bus to model the mlx4 driver structure. The
code is added along the current custom device management logic.
Subsequent patches switch mlx4_en and mlx4_ib to the auxiliary bus and
the old interface is then removed.

Structure mlx4_priv gains a new adev dynamic array to keep track of its
auxiliary devices. Access to the array is protected by the global
mlx4_intf mutex.

Functions mlx4_register_device() and mlx4_unregister_device() are
updated to expose auxiliary devices on the bus in order to load mlx4_en
and/or mlx4_ib. Functions mlx4_register_auxiliary_driver() and
mlx4_unregister_auxiliary_driver() are added to substitute
mlx4_register_interface() and mlx4_unregister_interface(), respectively.
Function mlx4_do_bond() is adjusted to walk over the adev array and
re-adds a specific auxiliary device if its driver sets the
MLX4_INTFF_BONDING flag.

Signed-off-by: Petr Pavlu &lt;petr.pavlu@suse.com&gt;
Tested-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
Reviewed-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
Acked-by: Tariq Toukan &lt;tariqt@nvidia.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>mlx4: Move the bond work to the core driver</title>
<updated>2023-08-23T07:25:28+00:00</updated>
<author>
<name>Petr Pavlu</name>
<email>petr.pavlu@suse.com</email>
</author>
<published>2023-08-21T13:12:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e2fb47d4eb5cd245c38c8c57d969ac6b12efc764'/>
<id>urn:sha1:e2fb47d4eb5cd245c38c8c57d969ac6b12efc764</id>
<content type='text'>
Function mlx4_en_queue_bond_work() is used in mlx4_en to start a bond
reconfiguration. It gathers data about a new port map setting, takes
a reference on the netdev that triggered the change and queues a work
object on mlx4_en_priv.mdev.workqueue to perform the operation. The
scheduled work is mlx4_en_bond_work() which calls
mlx4_bond()/mlx4_unbond() and consequently mlx4_do_bond().

At the same time, function mlx4_change_port_types() in mlx4_core might
be invoked to change the port type configuration. As part of its logic,
it re-registers the whole device by calling mlx4_unregister_device(),
followed by mlx4_register_device().

The two operations can result in concurrent access to the data about
currently active interfaces on the device.

Functions mlx4_register_device() and mlx4_unregister_device() lock the
intf_mutex to gain exclusive access to this data. The current
implementation of mlx4_do_bond() doesn't do that which could result in
an unexpected behavior. An updated version of mlx4_do_bond() for use
with an auxiliary bus goes and locks the intf_mutex when accessing a new
auxiliary device array.

However, doing so can then result in the following deadlock:
* A two-port mlx4 device is configured as an Ethernet bond.
* One of the ports is changed from eth to ib, for instance, by writing
  into a mlx4_port&lt;x&gt; sysfs attribute file.
* mlx4_change_port_types() is called to update port types. It invokes
  mlx4_unregister_device() to unregister the device which locks the
  intf_mutex and starts removing all associated interfaces.
* Function mlx4_en_remove() gets invoked and starts destroying its first
  netdev. This triggers mlx4_en_netdev_event() which recognizes that the
  configured bond is broken. It runs mlx4_en_queue_bond_work() which
  takes a reference on the netdev. Removing the netdev now cannot
  proceed until the work is completed.
* Work function mlx4_en_bond_work() gets scheduled. It calls
  mlx4_unbond() -&gt; mlx4_do_bond(). The latter function tries to lock the
  intf_mutex but that is not possible because it is held already by
  mlx4_unregister_device().

This particular case could be possibly solved by unregistering the
mlx4_en_netdev_event() notifier in mlx4_en_remove() earlier, but it
seems better to decouple mlx4_en more and break this reference order.

Avoid then this scenario by recognizing that the bond reconfiguration
operates only on a mlx4_dev. The logic to queue and execute the bond
work can be moved into the mlx4_core driver. Only a reference on the
respective mlx4_dev object is needed to be taken during the work's
lifetime. This removes a call from mlx4_en that can directly result in
needing to lock the intf_mutex, it remains a privilege of the core
driver.

Signed-off-by: Petr Pavlu &lt;petr.pavlu@suse.com&gt;
Tested-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
Reviewed-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
Acked-by: Tariq Toukan &lt;tariqt@nvidia.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>mlx4: Get rid of the mlx4_interface.activate callback</title>
<updated>2023-08-23T07:25:27+00:00</updated>
<author>
<name>Petr Pavlu</name>
<email>petr.pavlu@suse.com</email>
</author>
<published>2023-08-21T13:12:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=13f857111cb23f2a8dbcd0271c3ff1824913d980'/>
<id>urn:sha1:13f857111cb23f2a8dbcd0271c3ff1824913d980</id>
<content type='text'>
The mlx4_interface.activate callback was introduced in commit
79857cd31fe7 ("net/mlx4: Postpone the registration of net_device"). It
dealt with a situation when a netdev notifier received a NETDEV_REGISTER
event for a new net_device created by mlx4_en but the same device was
not yet visible to mlx4_get_protocol_dev(). The callback can be removed
now that mlx4_get_protocol_dev() is gone.

Signed-off-by: Petr Pavlu &lt;petr.pavlu@suse.com&gt;
Tested-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
Reviewed-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
Acked-by: Tariq Toukan &lt;tariqt@nvidia.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>mlx4: Replace the mlx4_interface.event callback with a notifier</title>
<updated>2023-08-23T07:25:27+00:00</updated>
<author>
<name>Petr Pavlu</name>
<email>petr.pavlu@suse.com</email>
</author>
<published>2023-08-21T13:12:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=73d68002a02efd370dba6b8fc570427326e36d1a'/>
<id>urn:sha1:73d68002a02efd370dba6b8fc570427326e36d1a</id>
<content type='text'>
Use a notifier to implement mlx4_dispatch_event() in preparation to
switch mlx4_en and mlx4_ib to be an auxiliary device.

A problem is that if the mlx4_interface.event callback was replaced with
something as mlx4_adrv.event then the implementation of
mlx4_dispatch_event() would need to acquire a lock on a given device
before executing this callback. That is necessary because otherwise
there is no guarantee that the associated driver cannot get unbound when
the callback is running. However, taking this lock is not possible
because mlx4_dispatch_event() can be invoked from the hardirq context.
Using an atomic notifier allows the driver to accurately record when it
wants to receive these events and solves this problem.

A handler registration is done by both mlx4_en and mlx4_ib at the end of
their mlx4_interface.add callback. This matches the current situation
when mlx4_add_device() would enable events for a given device
immediately after this callback, by adding the device on the
mlx4_priv.list.

Signed-off-by: Petr Pavlu &lt;petr.pavlu@suse.com&gt;
Tested-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
Acked-by: Tariq Toukan &lt;tariqt@nvidia.com&gt;
Reviewed-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>mlx4: Use 'void *' as the event param of mlx4_dispatch_event()</title>
<updated>2023-08-23T07:25:27+00:00</updated>
<author>
<name>Petr Pavlu</name>
<email>petr.pavlu@suse.com</email>
</author>
<published>2023-08-21T13:12:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7ba189ac52acab44129f09b302069e5a86fd92c2'/>
<id>urn:sha1:7ba189ac52acab44129f09b302069e5a86fd92c2</id>
<content type='text'>
Function mlx4_dispatch_event() takes an 'unsigned long' as its event
parameter. The actual value is none (MLX4_DEV_EVENT_CATASTROPHIC_ERROR),
a pointer to mlx4_eqe (MLX4_DEV_EVENT_PORT_MGMT_CHANGE), or a 32-bit
integer (remaining events).

In preparation to switch mlx4_en and mlx4_ib to be an auxiliary device,
the mlx4_interface.event callback is replaced with a notifier and
function mlx4_dispatch_event() gets updated to invoke
atomic_notifier_call_chain(). This requires forwarding the input 'param'
value from the former function to the latter. A problem is that the
notifier call takes 'void *' as its 'param' value, compared to
'unsigned long' used by mlx4_dispatch_event(). Re-passing the value
would need either punning it to 'void *' or passing down the address of
the input 'param'. Both approaches create a number of unnecessary casts.

Change instead the input 'param' of mlx4_dispatch_event() from
'unsigned long' to 'void *'. A mlx4_eqe pointer can be passed directly,
callers using an int value are adjusted to pass its address.

Signed-off-by: Petr Pavlu &lt;petr.pavlu@suse.com&gt;
Reviewed-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>mlx4: Get rid of the mlx4_interface.get_dev callback</title>
<updated>2023-08-23T07:25:27+00:00</updated>
<author>
<name>Petr Pavlu</name>
<email>petr.pavlu@suse.com</email>
</author>
<published>2023-08-21T13:12:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=71ab55a9af80fcffe1f42b9b8dba4d0e3dd0c351'/>
<id>urn:sha1:71ab55a9af80fcffe1f42b9b8dba4d0e3dd0c351</id>
<content type='text'>
Simplify the mlx4 driver interface by removing mlx4_get_protocol_dev()
and the associated mlx4_interface.get_dev callbacks. This is done in
preparation to use an auxiliary bus to model the mlx4 driver structure.

The change is motivated by the following situation:
* The mlx4_en interface is being initialized by mlx4_en_add() and
  mlx4_en_activate().
* The latter activate function calls mlx4_en_init_netdev() -&gt;
  register_netdev() to register a new net_device.
* A netdev event NETDEV_REGISTER is raised for the device.
* The netdev notififier mlx4_ib_netdev_event() is called and it invokes
  mlx4_ib_scan_netdevs() -&gt; mlx4_get_protocol_dev() -&gt;
  mlx4_en_get_netdev() [via mlx4_interface.get_dev].

This chain creates a problem when mlx4_en gets switched to be an
auxiliary driver. It contains two device calls which would both need to
take a respective device lock.

Avoid this situation by updating mlx4_ib_scan_netdevs() to no longer
call mlx4_get_protocol_dev() but instead to utilize the information
passed in net_device.parent and net_device.dev_port. This data is
sufficient to determine that an updated port is one that the mlx4_ib
driver should take care of and to keep mlx4_ib_dev.iboe.netdevs up to
date.

Following that, update mlx4_ib_get_netdev() to also not call
mlx4_get_protocol_dev() and instead scan all current netdevs to find
find a matching one. Note that mlx4_ib_get_netdev() is called early from
ib_register_device() and cannot use data tracked in
mlx4_ib_dev.iboe.netdevs which is not at that point yet set.

Finally, remove function mlx4_get_protocol_dev() and the
mlx4_interface.get_dev callbacks (only mlx4_en_get_netdev()) as they
became unused.

Signed-off-by: Petr Pavlu &lt;petr.pavlu@suse.com&gt;
Tested-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
Reviewed-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
Acked-by: Tariq Toukan &lt;tariqt@nvidia.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma</title>
<updated>2023-02-24T23:11:03+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-02-24T23:11:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8cbd92339db08b19b93d1637e5799ff2a8dddfd2'/>
<id>urn:sha1:8cbd92339db08b19b93d1637e5799ff2a8dddfd2</id>
<content type='text'>
Pull rdma updates from Jason Gunthorpe:
 "Quite a small cycle this time, even with the rc8. I suppose everyone
  went to sleep over xmas.

   - Minor driver updates for hfi1, cxgb4, erdma, hns, irdma, mlx5, siw,
     mana

   - inline CQE support for hns

   - Have mlx5 display device error codes

   - Pinned DMABUF support for irdma

   - Continued rxe cleanups, particularly converting the MRs to use
     xarray

   - Improvements to what can be cached in the mlx5 mkey cache"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (61 commits)
  IB/mlx5: Extend debug control for CC parameters
  IB/hfi1: Fix sdma.h tx-&gt;num_descs off-by-one errors
  IB/hfi1: Fix math bugs in hfi1_can_pin_pages()
  RDMA/irdma: Add support for dmabuf pin memory regions
  RDMA/mlx5: Use query_special_contexts for mkeys
  net/mlx5e: Use query_special_contexts for mkeys
  net/mlx5: Change define name for 0x100 lkey value
  net/mlx5: Expose bits for querying special mkeys
  RDMA/rxe: Fix missing memory barriers in rxe_queue.h
  RDMA/mana_ib: Fix a bug when the PF indicates more entries for registering memory on first packet
  RDMA/rxe: Remove rxe_alloc()
  RDMA/cma: Distinguish between sockaddr_in and sockaddr_in6 by size
  Subject: RDMA/rxe: Handle zero length rdma
  iw_cxgb4: Fix potential NULL dereference in c4iw_fill_res_cm_id_entry()
  RDMA/mlx5: Use rdma_umem_for_each_dma_block()
  RDMA/umem: Remove unused 'work' member from struct ib_umem
  RDMA/irdma: Cap MSIX used to online CPUs + 1
  RDMA/mlx5: Check reg_create() create for errors
  RDMA/restrack: Correct spelling
  RDMA/cxgb4: Fix potential null-ptr-deref in pass_establish()
  ...
</content>
</entry>
<entry>
<title>net/mlx4_en: Introduce flexible array to silence overflow warning</title>
<updated>2023-02-21T00:38:00+00:00</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2023-02-18T18:38:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f8f185e39b4de91bc5235e5be0d829bea69d9b06'/>
<id>urn:sha1:f8f185e39b4de91bc5235e5be0d829bea69d9b06</id>
<content type='text'>
The call "skb_copy_from_linear_data(skb, inl + 1, spc)" triggers a FORTIFY
memcpy() warning on ppc64 platform:

In function ‘fortify_memcpy_chk’,
    inlined from ‘skb_copy_from_linear_data’ at ./include/linux/skbuff.h:4029:2,
    inlined from ‘build_inline_wqe’ at drivers/net/ethernet/mellanox/mlx4/en_tx.c:722:4,
    inlined from ‘mlx4_en_xmit’ at drivers/net/ethernet/mellanox/mlx4/en_tx.c:1066:3:
./include/linux/fortify-string.h:513:25: error: call to ‘__write_overflow_field’ declared with
attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()?
[-Werror=attribute-warning]
  513 |                         __write_overflow_field(p_size_field, size);
      |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Same behaviour on x86 you can get if you use "__always_inline" instead of
"inline" for skb_copy_from_linear_data() in skbuff.h

The call here copies data into inlined tx destricptor, which has 104
bytes (MAX_INLINE) space for data payload. In this case "spc" is known
in compile-time but the destination is used with hidden knowledge
(real structure of destination is different from that the compiler
can see). That cause the fortify warning because compiler can check
bounds, but the real bounds are different.  "spc" can't be bigger than
64 bytes (MLX4_INLINE_ALIGN), so the data can always fit into inlined
tx descriptor. The fact that "inl" points into inlined tx descriptor is
determined earlier in mlx4_en_xmit().

Avoid confusing the compiler with "inl + 1" constructions to get to past
the inl header by introducing a flexible array "data" to the struct so
that the compiler can see that we are not dealing with an array of inl
structs, but rather, arbitrary data following the structure. There are
no changes to the structure layout reported by pahole, and the resulting
machine code is actually smaller.

Reported-by: Josef Oskera &lt;joskera@redhat.com&gt;
Link: https://lore.kernel.org/lkml/20230217094541.2362873-1-joskera@redhat.com
Fixes: f68f2ff91512 ("fortify: Detect struct member overflows in memcpy() at compile-time")
Cc: Yishai Hadas &lt;yishaih@nvidia.com&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Reviewed-by: Tariq Toukan &lt;tariqt@nvidia.com&gt;
Link: https://lore.kernel.org/r/20230218183842.never.954-kees@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
</feed>
