<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include/net/bonding.h, branch v5.15.3</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v5.15.3</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v5.15.3'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2021-08-11T21:57:31+00:00</updated>
<entry>
<title>bonding: remove extraneous definitions from bonding.h</title>
<updated>2021-08-11T21:57:31+00:00</updated>
<author>
<name>Jonathan Toppins</name>
<email>jtoppins@redhat.com</email>
</author>
<published>2021-08-11T02:53:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=891a88f4f5768b1e6ff52a2386d48558aa5a3f63'/>
<id>urn:sha1:891a88f4f5768b1e6ff52a2386d48558aa5a3f63</id>
<content type='text'>
All of the symbols either only exist in bond_options.c or nowhere at
all. These symbols were verified to not exist in the code base by
using `git grep` and their removal was verified by compiling bonding.ko.

Signed-off-by: Jonathan Toppins &lt;jtoppins@redhat.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next</title>
<updated>2021-08-10T14:53:22+00:00</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2021-08-10T14:27:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d1a4e0a9576fd2b29a0d13b306a9f52440908ab4'/>
<id>urn:sha1:d1a4e0a9576fd2b29a0d13b306a9f52440908ab4</id>
<content type='text'>
Daniel Borkmann says:

====================
bpf-next 2021-08-10

We've added 31 non-merge commits during the last 8 day(s) which contain
a total of 28 files changed, 3644 insertions(+), 519 deletions(-).

1) Native XDP support for bonding driver &amp; related BPF selftests, from Jussi Maki.

2) Large batch of new BPF JIT tests for test_bpf.ko that came out as a result from
   32-bit MIPS JIT development, from Johan Almbladh.

3) Rewrite of netcnt BPF selftest and merge into test_progs, from Stanislav Fomichev.

4) Fix XDP bpf_prog_test_run infra after net to net-next merge, from Andrii Nakryiko.

5) Follow-up fix in unix_bpf_update_proto() to enforce socket type, from Cong Wang.

6) Fix bpf-iter-tcp4 selftest to print the correct dest IP, from Jose Blanquicet.

7) Various misc BPF XDP sample improvements, from Niklas Söderlund, Matthew Cover,
   and Muhammad Falak R Wani.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (31 commits)
  bpf, tests: Add tail call test suite
  bpf, tests: Add tests for BPF_CMPXCHG
  bpf, tests: Add tests for atomic operations
  bpf, tests: Add test for 32-bit context pointer argument passing
  bpf, tests: Add branch conversion JIT test
  bpf, tests: Add word-order tests for load/store of double words
  bpf, tests: Add tests for ALU operations implemented with function calls
  bpf, tests: Add more ALU64 BPF_MUL tests
  bpf, tests: Add more BPF_LSH/RSH/ARSH tests for ALU64
  bpf, tests: Add more ALU32 tests for BPF_LSH/RSH/ARSH
  bpf, tests: Add more tests of ALU32 and ALU64 bitwise operations
  bpf, tests: Fix typos in test case descriptions
  bpf, tests: Add BPF_MOV tests for zero and sign extension
  bpf, tests: Add BPF_JMP32 test cases
  samples, bpf: Add an explict comment to handle nested vlan tagging.
  selftests/bpf: Add tests for XDP bonding
  selftests/bpf: Fix xdp_tx.c prog section name
  net, core: Allow netdev_lower_get_next_private_rcu in bh context
  bpf, devmap: Exclude XDP broadcast to master device
  net, bonding: Add XDP support to the bonding driver
  ...
====================

Link: https://lore.kernel.org/r/20210810130038.16927-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>net, bonding: Add XDP support to the bonding driver</title>
<updated>2021-08-09T21:20:14+00:00</updated>
<author>
<name>Jussi Maki</name>
<email>joamaki@gmail.com</email>
</author>
<published>2021-07-31T05:57:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9e2ee5c7e7c35d195e2aa0692a7241d47a433d1e'/>
<id>urn:sha1:9e2ee5c7e7c35d195e2aa0692a7241d47a433d1e</id>
<content type='text'>
XDP is implemented in the bonding driver by transparently delegating
the XDP program loading, removal and xmit operations to the bonding
slave devices. The overall goal of this work is that XDP programs
can be attached to a bond device *without* any further changes (or
awareness) necessary to the program itself, meaning the same XDP
program can be attached to a native device but also a bonding device.

Semantics of XDP_TX when attached to a bond are equivalent in such
setting to the case when a tc/BPF program would be attached to the
bond, meaning transmitting the packet out of the bond itself using one
of the bond's configured xmit methods to select a slave device (rather
than XDP_TX on the slave itself). Handling of XDP_TX to transmit
using the configured bonding mechanism is therefore implemented by
rewriting the BPF program return value in bpf_prog_run_xdp. To avoid
performance impact this check is guarded by a static key, which is
incremented when a XDP program is loaded onto a bond device. This
approach was chosen to avoid changes to drivers implementing XDP. If
the slave device does not match the receive device, then XDP_REDIRECT
is transparently used to perform the redirection in order to have
the network driver release the packet from its RX ring. The bonding
driver hashing functions have been refactored to allow reuse with
xdp_buff's to avoid code duplication.

The motivation for this change is to enable use of bonding (and
802.3ad) in hairpinning L4 load-balancers such as [1] implemented with
XDP and also to transparently support bond devices for projects that
use XDP given most modern NICs have dual port adapters. An alternative
to this approach would be to implement 802.3ad in user-space and
implement the bonding load-balancing in the XDP program itself, but
is rather a cumbersome endeavor in terms of slave device management
(e.g. by watching netlink) and requires separate programs for native
vs bond cases for the orchestrator. A native in-kernel implementation
overcomes these issues and provides more flexibility.

Below are benchmark results done on two machines with 100Gbit
Intel E810 (ice) NIC and with 32-core 3970X on sending machine, and
16-core 3950X on receiving machine. 64 byte packets were sent with
pktgen-dpdk at full rate. Two issues [2, 3] were identified with the
ice driver, so the tests were performed with iommu=off and patch [2]
applied. Additionally the bonding round robin algorithm was modified
to use per-cpu tx counters as high CPU load (50% vs 10%) and high rate
of cache misses were caused by the shared rr_tx_counter (see patch
2/3). The statistics were collected using "sar -n dev -u 1 10". On top
of that, for ice, further work is in progress on improving the XDP_TX
numbers [4].

 -----------------------|  CPU  |--| rxpck/s |--| txpck/s |----
 without patch (1 dev):
   XDP_DROP:              3.15%      48.6Mpps
   XDP_TX:                3.12%      18.3Mpps     18.3Mpps
   XDP_DROP (RSS):        9.47%      116.5Mpps
   XDP_TX (RSS):          9.67%      25.3Mpps     24.2Mpps
 -----------------------
 with patch, bond (1 dev):
   XDP_DROP:              3.14%      46.7Mpps
   XDP_TX:                3.15%      13.9Mpps     13.9Mpps
   XDP_DROP (RSS):        10.33%     117.2Mpps
   XDP_TX (RSS):          10.64%     25.1Mpps     24.0Mpps
 -----------------------
 with patch, bond (2 devs):
   XDP_DROP:              6.27%      92.7Mpps
   XDP_TX:                6.26%      17.6Mpps     17.5Mpps
   XDP_DROP (RSS):       11.38%      117.2Mpps
   XDP_TX (RSS):         14.30%      28.7Mpps     27.4Mpps
 --------------------------------------------------------------

RSS: Receive Side Scaling, e.g. the packets were sent to a range of
destination IPs.

  [1]: https://cilium.io/blog/2021/05/20/cilium-110#standalonelb
  [2]: https://lore.kernel.org/bpf/20210601113236.42651-1-maciej.fijalkowski@intel.com/T/#t
  [3]: https://lore.kernel.org/bpf/CAHn8xckNXci+X_Eb2WMv4uVYjO2331UWB2JLtXr_58z0Av8+8A@mail.gmail.com/
  [4]: https://lore.kernel.org/bpf/20210805230046.28715-1-maciej.fijalkowski@intel.com/T/#t

Signed-off-by: Jussi Maki &lt;joamaki@gmail.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Cc: Jay Vosburgh &lt;j.vosburgh@gmail.com&gt;
Cc: Veaceslav Falico &lt;vfalico@gmail.com&gt;
Cc: Andy Gospodarek &lt;andy@greyhouse.net&gt;
Cc: Maciej Fijalkowski &lt;maciej.fijalkowski@intel.com&gt;
Cc: Magnus Karlsson &lt;magnus.karlsson@intel.com&gt;
Link: https://lore.kernel.org/bpf/20210731055738.16820-4-joamaki@gmail.com
</content>
</entry>
<entry>
<title>bonding: add new option lacp_active</title>
<updated>2021-08-03T10:50:22+00:00</updated>
<author>
<name>Hangbin Liu</name>
<email>liuhangbin@gmail.com</email>
</author>
<published>2021-08-02T03:02:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3a755cd8b7c601f756cbbf908b84f7cc8c04a02b'/>
<id>urn:sha1:3a755cd8b7c601f756cbbf908b84f7cc8c04a02b</id>
<content type='text'>
Add an option lacp_active, which is similar with team's runner.active.
This option specifies whether to send LACPDU frames periodically. If set
on, the LACPDU frames are sent along with the configured lacp_rate
setting. If set off, the LACPDU frames acts as "speak when spoken to".

Note, the LACPDU state frames still will be sent when init or unbind port.

v2: remove module parameter

Signed-off-by: Hangbin Liu &lt;liuhangbin@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bonding: Add struct bond_ipesc to manage SA</title>
<updated>2021-07-06T17:36:59+00:00</updated>
<author>
<name>Taehee Yoo</name>
<email>ap420073@gmail.com</email>
</author>
<published>2021-07-05T15:38:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9a5605505d9c7dbfdb89cc29a8f5fc5cf9fd2334'/>
<id>urn:sha1:9a5605505d9c7dbfdb89cc29a8f5fc5cf9fd2334</id>
<content type='text'>
bonding has been supporting ipsec offload.
When SA is added, bonding just passes SA to its own active real interface.
But it doesn't manage SA.
So, when events(add/del real interface, active real interface change, etc)
occur, bonding can't handle that well because It doesn't manage SA.
So some problems(panic, UAF, refcnt leak)occur.

In order to make it stable, it should manage SA.
That's the reason why struct bond_ipsec is added.
When a new SA is added to bonding interface, it is stored in the
bond_ipsec list. And the SA is passed to a current active real interface.
If events occur, it uses bond_ipsec data to handle these events.
bond-&gt;ipsec_list is protected by bond-&gt;ipsec_lock.

If a current active real interface is changed, the following logic works.
1. delete all SAs from old active real interface
2. Add all SAs to the new active real interface.
3. If a new active real interface doesn't support ipsec offload or SA's
option, it sets real_dev to NULL.

Fixes: 18cb261afd7b ("bonding: support hardware encryption offload to slaves")
Signed-off-by: Taehee Yoo &lt;ap420073@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: bonding: Use per-cpu rr_tx_counter</title>
<updated>2021-06-15T18:26:15+00:00</updated>
<author>
<name>Jussi Maki</name>
<email>joamaki@gmail.com</email>
</author>
<published>2021-06-15T08:54:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=848ca9182a7d25bb54955c3aab9a3a2742bf9678'/>
<id>urn:sha1:848ca9182a7d25bb54955c3aab9a3a2742bf9678</id>
<content type='text'>
The round-robin rr_tx_counter was shared across CPUs leading to
significant cache thrashing at high packet rates. This patch switches
the round-robin packet counter to use a per-cpu variable to decide
the destination slave.

On a test with 2x100Gbit ICE nic with pktgen_sample_04_many_flows.sh
(-s 64 -t 32) the tx rate was 19.6Mpps before and 22.3Mpps after
this patch.

"perf top -e cache_misses" before:
    12.31%  [bonding]       [k] bond_xmit_roundrobin_slave_get
    10.59%  [sch_fq_codel]  [k] fq_codel_dequeue
     9.34%  [kernel]        [k] skb_release_data
after:
    15.42%  [sch_fq_codel]  [k] fq_codel_dequeue
    10.06%  [kernel]        [k] __memset
     9.12%  [kernel]        [k] skb_release_data

Signed-off-by: Jussi Maki &lt;joamaki@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net/bonding: Declare TLS RX device offload support</title>
<updated>2021-01-19T04:48:40+00:00</updated>
<author>
<name>Tariq Toukan</name>
<email>tariqt@nvidia.com</email>
</author>
<published>2021-01-17T14:59:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=dc5809f9e2b674a489723bd8d0131c97e565ca8d'/>
<id>urn:sha1:dc5809f9e2b674a489723bd8d0131c97e565ca8d</id>
<content type='text'>
Following the description in previous patch (for TX):
As the bond interface is being bypassed by the TLS module, interacting
directly against the lower devs, there is no way for the bond interface
to disable its device offload capabilities, as long as the mode/policy
config allows it.
Hence, the feature flag is not directly controllable, but just reflects
the offload status based on the logic under bond_sk_check().

Here we just declare RX device offload support, and expose it via the
NETIF_F_HW_TLS_RX flag.

Signed-off-by: Tariq Toukan &lt;tariqt@nvidia.com&gt;
Reviewed-by: Boris Pismenny &lt;borisp@nvidia.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>net/bonding: Implement TLS TX device offload</title>
<updated>2021-01-19T04:48:40+00:00</updated>
<author>
<name>Tariq Toukan</name>
<email>tariqt@nvidia.com</email>
</author>
<published>2021-01-17T14:59:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=89df6a8104706f94800ed527ad73d07465ea4d12'/>
<id>urn:sha1:89df6a8104706f94800ed527ad73d07465ea4d12</id>
<content type='text'>
Implement TLS TX device offload for bonding interfaces.
This allows kTLS sockets running on a bond to benefit from the
device offload on capable lower devices.

To allow a simple and fast maintenance of the TLS context in SW and
lower devices, we bind the TLS socket to a specific lower dev.
To achieve a behavior similar to SW kTLS, we support only balance-xor
and 802.3ad modes, with xmit_hash_policy=layer3+4. This is enforced
in bond_sk_check(), done in a previous patch.

For the above configuration, the SW implementation keeps picking the
same exact lower dev for all the socket's SKBs. The device offload
behaves similarly, making the decision once at the connection creation.

Per socket, the TLS module should work directly with the lowest netdev
in chain, to call the tls_dev_ops operations.

As the bond interface is being bypassed by the TLS module, interacting
directly against the lower devs, there is no way for the bond interface
to disable its device offload capabilities, as long as the mode/policy
config allows it.
Hence, the feature flag is not directly controllable, but just reflects
the current offload status based on the logic under bond_sk_check().

Signed-off-by: Tariq Toukan &lt;tariqt@nvidia.com&gt;
Reviewed-by: Boris Pismenny &lt;borisp@nvidia.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>net/bonding: Implement ndo_sk_get_lower_dev</title>
<updated>2021-01-19T04:48:40+00:00</updated>
<author>
<name>Tariq Toukan</name>
<email>tariqt@nvidia.com</email>
</author>
<published>2021-01-17T14:59:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=007feb87fb15933b5de7135e6bdf57c219b3fbec'/>
<id>urn:sha1:007feb87fb15933b5de7135e6bdf57c219b3fbec</id>
<content type='text'>
Add ndo_sk_get_lower_dev() implementation for bond interfaces.

Support only for the cases where the socket's and SKBs' hash
yields identical value for the whole connection lifetime.

Here we restrict it to L3+4 sockets only, with
xmit_hash_policy==LAYER34 and bond modes xor/802.3ad.

Signed-off-by: Tariq Toukan &lt;tariqt@nvidia.com&gt;
Reviewed-by: Boris Pismenny &lt;borisp@nvidia.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>bonding: fix feature flag setting at init time</title>
<updated>2020-12-08T19:26:08+00:00</updated>
<author>
<name>Jarod Wilson</name>
<email>jarod@redhat.com</email>
</author>
<published>2020-12-05T17:22:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=007ab5345545aba2f9cbe4c096cc35d2fd3275ac'/>
<id>urn:sha1:007ab5345545aba2f9cbe4c096cc35d2fd3275ac</id>
<content type='text'>
Don't try to adjust XFRM support flags if the bond device isn't yet
registered. Bad things can currently happen when netdev_change_features()
is called without having wanted_features fully filled in yet. This code
runs both on post-module-load mode changes, as well as at module init
time, and when run at module init time, it is before register_netdevice()
has been called and filled in wanted_features. The empty wanted_features
led to features also getting emptied out, which was definitely not the
intended behavior, so prevent that from happening.

Originally, I'd hoped to stop adjusting wanted_features at all in the
bonding driver, as it's documented as being something only the network
core should touch, but we actually do need to do this to properly update
both the features and wanted_features fields when changing the bond type,
or we get to a situation where ethtool sees:

    esp-hw-offload: off [requested on]

I do think we should be using netdev_update_features instead of
netdev_change_features here though, so we only send notifiers when the
features actually changed.

Fixes: a3b658cfb664 ("bonding: allow xfrm offload setup post-module-load")
Reported-by: Ivan Vecera &lt;ivecera@redhat.com&gt;
Suggested-by: Ivan Vecera &lt;ivecera@redhat.com&gt;
Cc: Jay Vosburgh &lt;j.vosburgh@gmail.com&gt;
Cc: Veaceslav Falico &lt;vfalico@gmail.com&gt;
Cc: Andy Gospodarek &lt;andy@greyhouse.net&gt;
Signed-off-by: Jarod Wilson &lt;jarod@redhat.com&gt;
Link: https://lore.kernel.org/r/20201205172229.576587-1-jarod@redhat.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
</feed>
