Age | Commit message (Collapse) | Author | Files | Lines |
|
Add test with multiple maps where each socket is inserted in multiple
maps. Test protocols: TCP, UDP, stream af_unix and dgram af_unix.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/r/20231221232327.43678-5-john.fastabend@gmail.com
|
|
Add test with a single map where each socket is inserted multiple
times. Test protocols: TCP, UDP, stream af_unix and dgram af_unix.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/r/20231221232327.43678-4-john.fastabend@gmail.com
|
|
Add a comment describing that the psock update proto callbback can be
called multiple times and this must be safe.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/r/20231221232327.43678-3-john.fastabend@gmail.com
|
|
When sockets are added to a sockmap or sockhash we allocate and init a
psock. Then update the proto ops with sock_map_init_proto the flow is
sock_hash_update_common
sock_map_link
psock = sock_map_psock_get_checked() <-returns existing psock
sock_map_init_proto(sk, psock) <- updates sk_proto
If the socket is already in a map this results in the sock_map_init_proto
being called multiple times on the same socket. We do this because when
a socket is added to multiple maps this might result in a new set of BPF
programs being attached to the socket requiring an updated ops struct.
This creates a rule where it must be safe to call psock_update_sk_prot
multiple times. When we added a fix for UAF through unix sockets in patch
4dd9a38a753fc we broke this rule by adding a sock_hold in that path
to ensure the sock is not released. The result is if a af_unix stream sock
is placed in multiple maps it results in a memory leak because we call
sock_hold multiple times with only a single sock_put on it.
Fixes: 8866730aed51 ("bpf, sockmap: af_unix stream sockets need to hold ref for pair sock")
Reported-by: Xingwei Lee <xrivendell7@gmail.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/r/20231221232327.43678-2-john.fastabend@gmail.com
|
|
Alexei Starovoitov says:
====================
bpf: volatile compare
From: Alexei Starovoitov <ast@kernel.org>
v2->v3:
Debugged profiler.c regression. It was caused by basic block layout.
Introduce bpf_cmp_likely() and bpf_cmp_unlikely() macros.
Debugged redundant <<=32, >>=32 with u32 variables. Added cast workaround.
v1->v2:
Fixed issues pointed out by Daniel, added more tests, attempted to convert profiler.c,
but barrier_var() wins vs bpf_cmp(). To be investigated.
Patches 1-4 are good to go, but 5 needs more work.
====================
Link: https://lore.kernel.org/r/20231226191148.48536-1-alexei.starovoitov@gmail.com
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
|
|
Convert profiler[123].c to "volatile compare" to compare barrier_var() approach vs bpf_cmp_likely() vs bpf_cmp_unlikely().
bpf_cmp_unlikely() produces correct code, but takes much longer to verify:
./veristat -C -e prog,insns,states before after_with_unlikely
Program Insns (A) Insns (B) Insns (DIFF) States (A) States (B) States (DIFF)
------------------------------------ --------- --------- ------------------ ---------- ---------- -----------------
kprobe__proc_sys_write 1603 19606 +18003 (+1123.08%) 123 1678 +1555 (+1264.23%)
kprobe__vfs_link 11815 70305 +58490 (+495.05%) 971 4967 +3996 (+411.53%)
kprobe__vfs_symlink 5464 42896 +37432 (+685.07%) 434 3126 +2692 (+620.28%)
kprobe_ret__do_filp_open 5641 44578 +38937 (+690.25%) 446 3162 +2716 (+608.97%)
raw_tracepoint__sched_process_exec 2770 35962 +33192 (+1198.27%) 226 3121 +2895 (+1280.97%)
raw_tracepoint__sched_process_exit 1526 2135 +609 (+39.91%) 133 208 +75 (+56.39%)
raw_tracepoint__sched_process_fork 265 337 +72 (+27.17%) 19 24 +5 (+26.32%)
tracepoint__syscalls__sys_enter_kill 18782 140407 +121625 (+647.56%) 1286 12176 +10890 (+846.81%)
bpf_cmp_likely() is equivalent to barrier_var():
./veristat -C -e prog,insns,states before after_with_likely
Program Insns (A) Insns (B) Insns (DIFF) States (A) States (B) States (DIFF)
------------------------------------ --------- --------- -------------- ---------- ---------- -------------
kprobe__proc_sys_write 1603 1663 +60 (+3.74%) 123 127 +4 (+3.25%)
kprobe__vfs_link 11815 12090 +275 (+2.33%) 971 971 +0 (+0.00%)
kprobe__vfs_symlink 5464 5448 -16 (-0.29%) 434 426 -8 (-1.84%)
kprobe_ret__do_filp_open 5641 5739 +98 (+1.74%) 446 446 +0 (+0.00%)
raw_tracepoint__sched_process_exec 2770 2608 -162 (-5.85%) 226 216 -10 (-4.42%)
raw_tracepoint__sched_process_exit 1526 1526 +0 (+0.00%) 133 133 +0 (+0.00%)
raw_tracepoint__sched_process_fork 265 265 +0 (+0.00%) 19 19 +0 (+0.00%)
tracepoint__syscalls__sys_enter_kill 18782 18970 +188 (+1.00%) 1286 1286 +0 (+0.00%)
kprobe__proc_sys_write 2700 2809 +109 (+4.04%) 107 109 +2 (+1.87%)
kprobe__vfs_link 12238 12366 +128 (+1.05%) 267 269 +2 (+0.75%)
kprobe__vfs_symlink 7139 7365 +226 (+3.17%) 167 175 +8 (+4.79%)
kprobe_ret__do_filp_open 7264 7070 -194 (-2.67%) 180 182 +2 (+1.11%)
raw_tracepoint__sched_process_exec 3768 3453 -315 (-8.36%) 211 199 -12 (-5.69%)
raw_tracepoint__sched_process_exit 3138 3138 +0 (+0.00%) 83 83 +0 (+0.00%)
raw_tracepoint__sched_process_fork 265 265 +0 (+0.00%) 19 19 +0 (+0.00%)
tracepoint__syscalls__sys_enter_kill 26679 24327 -2352 (-8.82%) 1067 1037 -30 (-2.81%)
kprobe__proc_sys_write 1833 1833 +0 (+0.00%) 157 157 +0 (+0.00%)
kprobe__vfs_link 9995 10127 +132 (+1.32%) 803 803 +0 (+0.00%)
kprobe__vfs_symlink 5606 5672 +66 (+1.18%) 451 451 +0 (+0.00%)
kprobe_ret__do_filp_open 5716 5782 +66 (+1.15%) 462 462 +0 (+0.00%)
raw_tracepoint__sched_process_exec 3042 3042 +0 (+0.00%) 278 278 +0 (+0.00%)
raw_tracepoint__sched_process_exit 1680 1680 +0 (+0.00%) 146 146 +0 (+0.00%)
raw_tracepoint__sched_process_fork 299 299 +0 (+0.00%) 25 25 +0 (+0.00%)
tracepoint__syscalls__sys_enter_kill 18372 18372 +0 (+0.00%) 1558 1558 +0 (+0.00%)
default (mcpu=v3), no_alu32, cpuv4 have similar differences.
Note one place where bpf_nop_mov() is used to workaround the verifier lack of link
between the scalar register and its spill to stack.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20231226191148.48536-7-alexei.starovoitov@gmail.com
|
|
bpf_nop_mov(var) asm macro emits nop register move: rX = rX.
If 'var' is a scalar and not a fixed constant the verifier will assign ID to it.
If it's later spilled the stack slot will carry that ID as well.
Hence the range refining comparison "if rX < const" will update all copies
including spilled slot.
This macro is a temporary workaround until the verifier gets smarter.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20231226191148.48536-6-alexei.starovoitov@gmail.com
|
|
Since the last user was converted to bpf_cmp, remove bpf_assert_eq/ne/... macros.
__bpf_assert_op() macro is kept for experiments, since it's slightly more efficient
than bpf_assert(bpf_cmp_unlikely()) until LLVM is fixed.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/bpf/20231226191148.48536-5-alexei.starovoitov@gmail.com
|
|
Convert exceptions_assert.c to bpf_cmp_unlikely() macro.
Since
bpf_assert(bpf_cmp_unlikely(var, ==, 100));
other code;
will generate assembly code:
if r1 == 100 goto L2;
r0 = 0
call bpf_throw
L1:
other code;
...
L2: goto L1;
LLVM generates redundant basic block with extra goto. LLVM will be fixed eventually.
Right now it's less efficient than __bpf_assert(var, ==, 100) macro that produces:
if r1 == 100 goto L1;
r0 = 0
call bpf_throw
L1:
other code;
But extra goto doesn't hurt the verification process.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/bpf/20231226191148.48536-4-alexei.starovoitov@gmail.com
|
|
Compilers optimize conditional operators at will, but often bpf programmers
want to force compilers to keep the same operator in asm as it's written in C.
Introduce bpf_cmp_likely/unlikely(var1, conditional_op, var2) macros that can be used as:
- if (seen >= 1000)
+ if (bpf_cmp_unlikely(seen, >=, 1000))
The macros take advantage of BPF assembly that is C like.
The macros check the sign of variable 'seen' and emits either
signed or unsigned compare.
For example:
int a;
bpf_cmp_unlikely(a, >, 0) will be translated to 'if rX s> 0 goto' in BPF assembly.
unsigned int a;
bpf_cmp_unlikely(a, >, 0) will be translated to 'if rX > 0 goto' in BPF assembly.
C type conversions coupled with comparison operator are tricky.
int i = -1;
unsigned int j = 1;
if (i < j) // this is false.
long i = -1;
unsigned int j = 1;
if (i < j) // this is true.
Make sure BPF program is compiled with -Wsign-compare then the macros will catch
the mistake.
The macros check LHS (left hand side) only to figure out the sign of compare.
'if 0 < rX goto' is not allowed in the assembly, so the users
have to use a variable on LHS anyway.
The patch updates few tests to demonstrate the use of the macros.
The macro allows to use BPF_JSET in C code, since LLVM doesn't generate it at
present. For example:
if (i & j) compiles into r0 &= r1; if r0 == 0 goto
while
if (bpf_cmp_unlikely(i, &, j)) compiles into if r0 & r1 goto
Note that the macros has to be careful with RHS assembly predicate.
Since:
u64 __rhs = 1ull << 42;
asm goto("if r0 < %[rhs] goto +1" :: [rhs] "ri" (__rhs));
LLVM will silently truncate 64-bit constant into s32 imm.
Note that [lhs] "r"((short)LHS) the type cast is a workaround for LLVM issue.
When LHS is exactly 32-bit LLVM emits redundant <<=32, >>=32 to zero upper 32-bits.
When LHS is 64 or 16 or 8-bit variable there are no shifts.
When LHS is 32-bit the (u64) cast doesn't help. Hence use (short) cast.
It does _not_ truncate the variable before it's assigned to a register.
Traditional likely()/unlikely() macros that use __builtin_expect(!!(x), 1 or 0)
have no effect on these macros, hence macros implement the logic manually.
bpf_cmp_unlikely() macro preserves compare operator as-is while
bpf_cmp_likely() macro flips the compare.
Consider two cases:
A.
for() {
if (foo >= 10) {
bar += foo;
}
other code;
}
B.
for() {
if (foo >= 10)
break;
other code;
}
It's ok to use either bpf_cmp_likely or bpf_cmp_unlikely macros in both cases,
but consider that 'break' is effectively 'goto out_of_the_loop'.
Hence it's better to use bpf_cmp_unlikely in the B case.
While 'bar += foo' is better to keep as 'fallthrough' == likely code path in the A case.
When it's written as:
A.
for() {
if (bpf_cmp_likely(foo, >=, 10)) {
bar += foo;
}
other code;
}
B.
for() {
if (bpf_cmp_unlikely(foo, >=, 10))
break;
other code;
}
The assembly will look like:
A.
for() {
if r1 < 10 goto L1;
bar += foo;
L1:
other code;
}
B.
for() {
if r1 >= 10 goto L2;
other code;
}
L2:
The bpf_cmp_likely vs bpf_cmp_unlikely changes basic block layout, hence it will
greatly influence the verification process. The number of processed instructions
will be different, since the verifier walks the fallthrough first.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/bpf/20231226191148.48536-3-alexei.starovoitov@gmail.com
|
|
GCC's -Wall includes -Wsign-compare while clang does not.
Since BPF programs are built with clang we need to add this flag explicitly
to catch problematic comparisons like:
int i = -1;
unsigned int j = 1;
if (i < j) // this is false.
long i = -1;
unsigned int j = 1;
if (i < j) // this is true.
C standard for reference:
- If either operand is unsigned long the other shall be converted to unsigned long.
- Otherwise, if one operand is a long int and the other unsigned int, then if a
long int can represent all the values of an unsigned int, the unsigned int
shall be converted to a long int; otherwise both operands shall be converted to
unsigned long int.
- Otherwise, if either operand is long, the other shall be converted to long.
- Otherwise, if either operand is unsigned, the other shall be converted to unsigned.
Unfortunately clang's -Wsign-compare is very noisy.
It complains about (s32)a == (u32)b which is safe and doen't have surprising behavior.
This patch fixes some of the issues. It needs a follow up to fix the rest.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/bpf/20231226191148.48536-2-alexei.starovoitov@gmail.com
|
|
Andrei Matei says:
====================
bpf: Simplify checking size of helper accesses
v3->v4:
- kept only the minimal change, undoing debatable changes (Andrii)
- dropped the second patch from before, with changes to the error
message (Andrii)
- extracted the new test into a separate patch (Andrii)
- added Acked by Andrii
v2->v3:
- split the error-logging function to a separate patch (Andrii)
- make the error buffers smaller (Andrii)
- include size of memory region for PTR_TO_MEM (Andrii)
- nits from Andrii and Eduard
v1->v2:
- make the error message include more info about the context of the
zero-sized access (Andrii)
====================
Link: https://lore.kernel.org/r/20231221232225.568730-1-andreimatei1@gmail.com
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
|
|
This patch adds a test for the condition that the previous patch mucked
with - illegal zero-sized helper memory access. As opposed to existing
tests, this new one uses a size whose lower bound is zero, as opposed to
a known-zero one.
Signed-off-by: Andrei Matei <andreimatei1@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20231221232225.568730-3-andreimatei1@gmail.com
|
|
This patch simplifies the verification of size arguments associated to
pointer arguments to helpers and kfuncs. Many helpers take a pointer
argument followed by the size of the memory access performed to be
performed through that pointer. Before this patch, the handling of the
size argument in check_mem_size_reg() was confusing and wasteful: if the
size register's lower bound was 0, then the verification was done twice:
once considering the size of the access to be the lower-bound of the
respective argument, and once considering the upper bound (even if the
two are the same). The upper bound checking is a super-set of the
lower-bound checking(*), except: the only point of the lower-bound check
is to handle the case where zero-sized-accesses are explicitly not
allowed and the lower-bound is zero. This static condition is now
checked explicitly, replacing a much more complex, expensive and
confusing verification call to check_helper_mem_access().
Error messages change in this patch. Before, messages about illegal
zero-size accesses depended on the type of the pointer and on other
conditions, and sometimes the message was plain wrong: in some tests
that changed you'll see that the old message was something like "R1 min
value is outside of the allowed memory range", where R1 is the pointer
register; the error was wrongly claiming that the pointer was bad
instead of the size being bad. Other times the information that the size
came for a register with a possible range of values was wrong, and the
error presented the size as a fixed zero. Now the errors refer to the
right register. However, the old error messages did contain useful
information about the pointer register which is now lost; recovering
this information was deemed not important enough.
(*) Besides standing to reason that the checks for a bigger size access
are a super-set of the checks for a smaller size access, I have also
mechanically verified this by reading the code for all types of
pointers. I could convince myself that it's true for all but
PTR_TO_BTF_ID (check_ptr_to_btf_access). There, simply looking
line-by-line does not immediately prove what we want. If anyone has any
qualms, let me know.
Signed-off-by: Andrei Matei <andreimatei1@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20231221232225.568730-2-andreimatei1@gmail.com
|
|
In function `tc_dump_tfilter`, the attributes array is parsed via
tcf_tfilter_dump_policy which only describes TCA_DUMP_FLAGS. However,
the NLA TCA_CHAIN is also accessed with `nla_get_u32`.
The access to TCA_CHAIN is introduced in commit 5bc1701881e3 ("net:
sched: introduce multichain support for filters") and no nla_policy is
provided for parsing at that point. Later on, tcf_tfilter_dump_policy is
introduced in commit f8ab1807a9c9 ("net: sched: introduce terse dump
flag") while still ignoring the fact that TCA_CHAIN needs a check. This
patch does that by complementing the policy to allow the access
discussed here can be safe as other cases just choose rtm_tca_policy as
the parsing policy.
Signed-off-by: Lin Ma <linma@zju.edu.cn>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix spelling typo in comment
Reported-by: k2ci <kernel-bot@kylinos.cn>
Signed-off-by: liyouhong <liyouhong@kylinos.cn>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231227015831.289077-1-liyouhong@kylinos.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 13e59344fb9d ("net: ethtool: add support for symmetric-xor RSS hash")
adds a check to the ethtool set_rxnfc operation, which checks the RX
flow hash if the flag RXH_XFRM_SYM_XOR is set. This flag is introduced
with the same commit. It calls the ethtool get_rxfh operation to get the
RX flow hash data. If get_rxfh is not supported, then EOPNOTSUPP is
returned.
There are driver like tsnep, macb, asp2, genet, gianfar, mtk, ... which
support the ethtool operation set_rxnfc but not get_rxfh. This results
in EOPNOTSUPP returned by ethtool_set_rxnfc() without actually calling
the ethtool operation set_rxnfc. Thus, set_rxnfc got broken for all
these drivers.
Check RX flow hash in ethtool_set_rxnfc() only if driver supports RX
flow hash.
Fixes: 13e59344fb9d ("net: ethtool: add support for symmetric-xor RSS hash")
Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Reviewed-by: Ravi Gunasekaran <r-gunasekaran@ti.com>
Link: https://lore.kernel.org/r/20231226205536.32003-1-gerhard@engleder-embedded.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Ahmed Zaki says:
====================
Bug fixes for RSS symmetric-xor
A couple of fixes for the symmetric-xor recently merged in net-next [1].
The first patch copies the xfrm value back to user-space when ethtool is
built with --disable-netlink. The second allows ethtool to change other
RSS attributes while not changing the xfrm values.
Link: https://lore.kernel.org/netdev/20231213003321.605376-1-ahmed.zaki@intel.com/ [1]
====================
Link: https://lore.kernel.org/r/20231221184235.9192-1-ahmed.zaki@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a NO_CHANGE uAPI value for the new RXFH/RSS input_xfrm uAPI field.
This needed so that user-space can set other RSS values (hkey or indir
table) without affecting input_xfrm.
Should have been part of [1].
Link: https://lore.kernel.org/netdev/20231213003321.605376-1-ahmed.zaki@intel.com/ [1]
Fixes: 13e59344fb9d ("net: ethtool: add support for symmetric-xor RSS hash")
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
Link: https://lore.kernel.org/r/20231221184235.9192-3-ahmed.zaki@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The ioctl path of ethtool's get channels is missing the final step of
copying the new input_xfrm field to user-space. This should have been
part of [1].
Link: https://lore.kernel.org/netdev/20231213003321.605376-1-ahmed.zaki@intel.com/ [1]
Fixes: 13e59344fb9d ("net: ethtool: add support for symmetric-xor RSS hash")
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
Link: https://lore.kernel.org/r/20231221184235.9192-2-ahmed.zaki@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The ice driver fails to build when CONFIG_XDP_SOCKETS is disabled.
drivers/net/ethernet/intel/ice/ice_base.c:533:21: error:
variable has incomplete type 'struct xsk_cb_desc'
struct xsk_cb_desc desc = {};
^
include/net/xsk_buff_pool.h:15:8: note:
forward declaration of 'struct xsk_cb_desc'
struct xsk_cb_desc;
^
Fixes: d68d707dcbbf ("ice: Support XDP hints in AF_XDP ZC mode")
Closes: https://lore.kernel.org/netdev/8b76dad3-8847-475b-aa17-613c9c978f7a@infradead.org/
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Larysa Zaremba <larysa.zaremba@intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Link: https://lore.kernel.org/r/20231219110205.1289506-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This reverts commit cff9c565e65f3622e8dc1dcc21c1520a083dff35.
Revert based on feedback from Russell.
Link: https://lore.kernel.org/all/ZZPtUIRerqTI2%2Fyh@shell.armlinux.org.uk/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Claudiu Beznea says:
====================
renesas: rzg3s: Add support for Ethernet
Series adds Ethernet support for Renesas RZ/G3S.
Along with it preparatory cleanups and fixes were included.
====================
Link: https://lore.kernel.org/r/20231207070700.4156557-1-claudiu.beznea.uj@bp.renesas.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Document Ethernet RZ/G3S support. Ethernet IP is similar to the one
available on RZ/G2L devices.
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Jamal Hadi Salim says:
====================
net/sched: Remove UAPI support for retired TC qdiscs and classifiers
Classifiers RSVP and tcindex as well as qdiscs dsmark, CBQ and ATM have already
been deleted. This patchset removes their UAPI support.
User space - with a focus on iproute2 - typically copies these UAPI headers for
different kernels.
These deletion patches are coordinated with the iproute2 maintainers to make
sure that they delete any user space code referencing removed objects at their
leisure.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 051d44209842 ("net/sched: Retire CBQ qdisc") retired the CBQ qdisc.
Remove UAPI for it. Iproute2 will sync by equally removing it from user space.
Reviewed-by: Victor Nogueira <victor@mojatatu.com>
Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit fb38306ceb9e ("net/sched: Retire ATM qdisc") retired the ATM qdisc.
Remove UAPI for it. Iproute2 will sync by equally removing it from user space.
Reviewed-by: Victor Nogueira <victor@mojatatu.com>
Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit bbe77c14ee61 ("net/sched: Retire dsmark qdisc") retired the dsmark
classifier. Remove UAPI support for it.
Iproute2 will sync by equally removing it from user space.
Reviewed-by: Victor Nogueira <victor@mojatatu.com>
Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
commit 8c710f75256b ("net/sched: Retire tcindex classifier") retired the TC
tcindex classifier.
Remove UAPI for it. Iproute2 will sync by equally removing it from user space.
Reviewed-by: Victor Nogueira <victor@mojatatu.com>
Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
commit 265b4da82dbf ("net/sched: Retire rsvp classifier") retired the TC RSVP
classifier.
Remove UAPI for it. Iproute2 will sync by equally removing it from user space.
Reviewed-by: Victor Nogueira <victor@mojatatu.com>
Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Shinas Rasheed says:
====================
add octeon_ep_vf driver
This driver implements networking functionality of Marvell's Octeon
PCI Endpoint NIC VF.
This driver support following devices:
* Network controller: Cavium, Inc. Device b203
* Network controller: Cavium, Inc. Device b403
* Network controller: Cavium, Inc. Device b103
* Network controller: Cavium, Inc. Device b903
* Network controller: Cavium, Inc. Device ba03
* Network controller: Cavium, Inc. Device bc03
* Network controller: Cavium, Inc. Device bd03
Changes:
V2:
- Removed linux/version.h header file from inclusion in
octep_vf_main.c
- Corrected Makefile entry to include building octep_vf_mbox.c in
[6/8] patch.
- Removed redundant vzalloc pointer cast and vfree pointer check in
[6/8] patch.
V1: https://lore.kernel.org/all/20231221092844.2885872-1-srasheed@marvell.com/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
add MAINTAINERS for octeon_ep_vf driver.
Signed-off-by: Shinas Rasheed <srasheed@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add support for the following ethtool commands:
ethtool -i|--driver devname
ethtool devname
ethtool -S|--statistics devname
Signed-off-by: Shinas Rasheed <srasheed@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add support to enable MSI-x and register interrupts.
Add support to process Tx and Rx traffic. Includes processing
Tx completions and Rx refill.
Signed-off-by: Shinas Rasheed <srasheed@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add support for ndo ops to set MAC address, change MTU, get stats.
Add control path support to set MAC address, change MTU, get stats,
set speed, get and set link mode.
Signed-off-by: Shinas Rasheed <srasheed@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Implement Tx/Rx ring resource allocation and cleanup.
Signed-off-by: Shinas Rasheed <srasheed@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Implement VF-PF mailbox to send all control commands from VF to PF
and receive responses and notifications from PF to VF.
Signed-off-by: Shinas Rasheed <srasheed@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Implement hardware resource init and shutdown helper APIs, like
hardware Tx/Rx queue init/enable/disable/reset.
Signed-off-by: Shinas Rasheed <srasheed@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add driver framework and device setup and initialization for Octeon
PCI Endpoint NIC VF.
Add implementation to load module, initialize, register network device,
cleanup and unload module.
Signed-off-by: Shinas Rasheed <srasheed@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In an effort to make the PS3 gelic driver easier to maintain, create two
new structures, struct gelic_hw_regs and struct gelic_chain_link, and
replace the corresponding members of struct gelic_descr with the new
structures.
The new struct gelic_hw_regs holds the register variables used by the
gelic hardware device. The new struct gelic_chain_link holds variables
used to manage the driver's linked list of gelic descr structures.
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Michael Chan says:
====================
bnxt_en: Add basic ntuple filter support
The current driver only supports ntuple filters added by aRFS. This
patch series adds basic support for user defined TCP/UDP ntuple filters
added by the user using ethtool. Many of the patches are refactoring
patches to make the existing code more general to support both aRFS
and user defined filters. aRFS filters always have the Toeplitz hash
value from the NIC. A Toepliz hash function is added in patch 5 to
get the same hash value for user defined filters. The hash is used
to store all ntuple filters in the table and all filters must be
hashed identically using the same function and key.
v2: Fix compile error in patch #4 when CONFIG_BNXT_SRIOV is disabled.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add logic to delete a user specified ntuple filter from ethtool.
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add support for adding user defined ntuple TCP/UDP filters. These
filters are similar to aRFS filters except that they don't get aged.
Source IP, destination IP, source port, or destination port can be
unspecifed as wildcard. At least one of these tuples must be specifed.
If a tuple is specified, the full mask must be specified.
All ntuple related ethtool functions are now no longer compiled only
for CONFIG_RFS_ACCEL.
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
aRFS filters match all 5 tuples. User defined ntuple filters may
specify some of the tuples as wildcards. To support that, we add the
ntuple_flags to the bnxt_ntuple_filter struct to specify which tuple
fields are to be matched. The matching tuple fields will then be
passed to the firmware in bnxt_hwrm_cfa_ntuple_filter_alloc() to create
the proper filter.
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Refactor the logic into a new function bnxt_del_ntp_filters(). The
same call will be used when the user deletes an ntuple filter.
The bnxt_hwrm_cfa_ntuple_filter_free() function to call fw to free
the ntuple filter is exported so that the ethtool logic can call it.
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Generalize the ethtool logic that walks the ntuple hash table now that
we have the common bnxt_filter_base structure. This will allow the code
to easily extend to cover user defined ntuple or ether filters.
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a new function bnxt_insert_ntp_filter() to insert the ntuple filter
into the hash table and other basic setup. We'll use this function
to insert a user defined filter from ethtool.
Also, export bnxt_lookup_ntp_filter_from_idx() and bnxt_get_ntp_filter_idx()
for similar purposes. All ntuple related functions are now no longer
compiled only for CONFIG_RFS_ACCEL
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Change the unused flag to BNXT_FLTR_INSERTED. To prepare for multiple
pathways that an ntuple filter can be deleted, we add this flag. These
filter structures can be retreived from the RCU hash table but only
the caller that sees that the BNXT_FLTR_INSERTED flag is set can delete
the filter structure and clear the flag under spinlock.
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add the helper function to look up the ntuple filter from the
hash index and use it in bnxt_rx_flow_steer(). The helper function
will also be used by user defined ntuple filters in the next
patches.
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
For ntuple filters added by aRFS, the Toeplitz hash calculated by our
NIC is available and is used to store the ntuple filter for quick
retrieval. In the next patches, user defined ntuple filter support
will be added and we need to calculate the same hash for these
filters. The same hash function needs to be used so we can detect
duplicates.
Add the function bnxt_toeplitz() to calculate the Toeplitz hash for
user defined ntuple filters. bnxt_toeplitz() uses the same Toeplitz
key and the same key length as the NIC.
bnxt_get_ntp_filter_idx() is added to return the hash index. For
aRFS, the hash comes from the NIC. For user defined ntuple, we call
bnxt_toeplitz() to calculate the hash index.
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|