summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2022-04-14cdc_ether: export usbnet_cdc_zte_rx_fixupLech Perczak1-0/+1
Commit bfe9b9d2df66 ("cdc_ether: Improve ZTE MF823/831/910 handling") introduces a workaround for certain ZTE modems reporting invalid MAC addresses over CDC-ECM. The same issue was present on their RNDIS interface,which was fixed in commit a5a18bdf7453 ("rndis_host: Set valid random MAC on buggy devices"). However, internal modem of ZTE MF286R router, on its RNDIS interface, also exhibits a second issue fixed already in CDC-ECM, of the device not respecting configured random MAC address. In order to share the fixup for this with rndis_host driver, export the workaround function, which will be re-used in the following commit in rndis_host. Cc: Kristian Evensen <kristian.evensen@gmail.com> Cc: Bjørn Mork <bjorn@mork.no> Cc: Oliver Neukum <oliver@neukum.org> Signed-off-by: Lech Perczak <lech.perczak@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-13net: icmp: introduce function icmpv6_param_prob_reason()Menglong Dong1-2/+9
In order to add the skb drop reasons support to icmpv6_param_prob(), introduce the function icmpv6_param_prob_reason() and make icmpv6_param_prob() an inline call to it. This new function will be used in the following patches. Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: Jiang Biao <benbjiang@tencent.com> Reviewed-by: Hao Peng <flyingpeng@tencent.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: ip: add skb drop reasons to ip forwardingMenglong Dong2-0/+4
Replace kfree_skb() which is used in ip6_forward() and ip_forward() with kfree_skb_reason(). The new drop reason 'SKB_DROP_REASON_PKT_TOO_BIG' is introduced for the case that the length of the packet exceeds MTU and can't fragment. Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: Jiang Biao <benbjiang@tencent.com> Reviewed-by: Hao Peng <flyingpeng@tencent.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: ipv4: add skb drop reasons to ip_error()Menglong Dong2-0/+8
Eventually, I find out the handler function for inputting route lookup fail: ip_error(). The drop reasons we used in ip_error() are almost corresponding to IPSTATS_MIB_*, and following new reasons are introduced: SKB_DROP_REASON_IP_INADDRERRORS SKB_DROP_REASON_IP_INNOROUTES Isn't the name SKB_DROP_REASON_IP_HOSTUNREACH and SKB_DROP_REASON_IP_NETUNREACH more accurate? To make them corresponding to IPSTATS_MIB_*, we keep their name still. Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: Jiang Biao <benbjiang@tencent.com> Reviewed-by: Hao Peng <flyingpeng@tencent.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13skb: add some helpers for skb drop reasonsMenglong Dong1-0/+12
In order to simply the definition and assignment for 'enum skb_drop_reason', introduce some helpers. SKB_DR() is used to define a variable of type 'enum skb_drop_reason' with the 'SKB_DROP_REASON_NOT_SPECIFIED' initial value. SKB_DR_SET() is used to set the value of the variable. Seems it is a little useless? But it makes the code shorter. SKB_DR_OR() is used to set the value of the variable if it is not set yet, which means its value is SKB_DROP_REASON_NOT_SPECIFIED. Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: Jiang Biao <benbjiang@tencent.com> Reviewed-by: Hao Peng <flyingpeng@tencent.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: rtnetlink: add ndm flags and state mask attributesNikolay Aleksandrov1-0/+2
Add ndm flags/state masks which will be used for bulk delete filtering. All of these are used by the bridge and vxlan drivers. Also minimal attr policy validation is added, it is up to ndo_fdb_del_bulk implementers to further validate them. Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: add ndo_fdb_del_bulkNikolay Aleksandrov1-0/+9
Add a new netdev op called ndo_fdb_del_bulk, it will be later used for driver-specific bulk delete implementation dispatched from rtnetlink. The first user will be the bridge, we need it to signal to rtnetlink from the driver that we support bulk delete operation (NLM_F_BULK). Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: rtnetlink: add bulk delete support flagNikolay Aleksandrov1-1/+2
Add a new rtnl flag (RTNL_FLAG_BULK_DEL_SUPPORTED) which is used to verify that the delete operation allows bulk object deletion. Also emit a warning if anyone tries to set it for non-delete kind. Suggested-by: David Ahern <dsahern@kernel.org> Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: netlink: add NLM_F_BULK delete request modifierNikolay Aleksandrov1-0/+1
Add a new delete request modifier called NLM_F_BULK which, when supported, would cause the request to delete multiple objects. The flag is a convenient way to signal that a multiple delete operation is requested which can be gradually added to different delete requests. In order to make sure older kernels will error out if the operation is not supported instead of doing something unintended we have to break a required condition when implementing support for this flag, f.e. for neighbors we will omit the mandatory mac address attribute. Initially it will be used to add flush with filtering support for bridge fdbs, but it also opens the door to add similar support to others. Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: rtnetlink: use BIT for flag valuesNikolay Aleksandrov1-1/+1
Use BIT to define flag values. Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: rtnetlink: add helper to extract msg type's kindNikolay Aleksandrov1-0/+6
Add a helper which extracts the msg type's kind using the kind mask (0x3). Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: rtnetlink: add msg kind namesNikolay Aleksandrov1-0/+7
Add rtnl kind names instead of using raw values. We'll need to check for DEL kind later to validate bulk flag support. Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13net: phylink: remove phylink_helper_basex_speed()Russell King (Oracle)1-6/+0
As there are now no users of phylink_helper_basex_speed(), we can remove this obsolete functionality. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-12net: remove noblock parameter from recvmsg() entitiesOliver Hartkopp6-10/+9
The internal recvmsg() functions have two parameters 'flags' and 'noblock' that were merged inside skb_recv_datagram(). As a follow up patch to commit f4b41f062c42 ("net: remove noblock parameter from skb_recv_datagram()") this patch removes the separate 'noblock' parameter for recvmsg(). Analogue to the referenced patch for skb_recv_datagram() the 'flags' and 'noblock' parameters are unnecessarily split up with e.g. err = sk->sk_prot->recvmsg(sk, msg, size, flags & MSG_DONTWAIT, flags & ~MSG_DONTWAIT, &addr_len); or in err = INDIRECT_CALL_2(sk->sk_prot->recvmsg, tcp_recvmsg, udp_recvmsg, sk, msg, size, flags & MSG_DONTWAIT, flags & ~MSG_DONTWAIT, &addr_len); instead of simply using only flags all the time and check for MSG_DONTWAIT where needed (to preserve for the formerly separated no(n)block condition). Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://lore.kernel.org/r/20220411124955.154876-1-socketcan@hartkopp.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-12Merge branch 'mlx5-next' of ↵Jakub Kicinski5-242/+17
https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Leon Romanovsky says: ==================== Mellanox shared branch that includes: * Removal of FPGA TLS code https://lore.kernel.org/all/cover.1649073691.git.leonro@nvidia.com Mellanox INNOVA TLS cards are EOL in May, 2018 [1]. As such, the code is unmaintained, untested and not in-use by any upstream/distro oriented customers. In order to reduce code complexity, drop the kernel code, clean build config options and delete useless kTLS vs. TLS separation. [1] https://network.nvidia.com/related-docs/eol/LCR-000286.pdf * Removal of FPGA IPsec code https://lore.kernel.org/all/cover.1649232994.git.leonro@nvidia.com Together with FPGA TLS, the IPsec went to EOL state in the November of 2019 [1]. Exactly like FPGA TLS, no active customers exist for this upstream code and all the complexity around that area can be deleted. [2] https://network.nvidia.com/related-docs/eol/LCR-000535.pdf * Fix to undefined behavior from Borislav https://lore.kernel.org/all/20220405151517.29753-11-bp@alien8.de * 'mlx5-next' of https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: (23 commits) net/mlx5: Remove not-implemented IPsec capabilities net/mlx5: Remove ipsec_ops function table net/mlx5: Reduce kconfig complexity while building crypto support net/mlx5: Move IPsec file to relevant directory net/mlx5: Remove not-needed IPsec config net/mlx5: Align flow steering allocation namespace to common style net/mlx5: Unify device IPsec capabilities check net/mlx5: Remove useless IPsec device checks net/mlx5: Remove ipsec vs. ipsec offload file separation RDMA/core: Delete IPsec flow action logic from the core RDMA/mlx5: Drop crypto flow steering API RDMA/mlx5: Delete never supported IPsec flow action net/mlx5: Remove FPGA ipsec specific statistics net/mlx5: Remove XFRM no_trailer flag net/mlx5: Remove not-used IDA field from IPsec struct net/mlx5: Delete metadata handling logic net/mlx5_fpga: Drop INNOVA IPsec support IB/mlx5: Fix undefined behavior due to shift overflowing the constant net/mlx5: Cleanup kTLS function names and their exposure net/mlx5: Remove tls vs. ktls separation as it is the same ... ==================== Link: https://lore.kernel.org/r/20220409055303.1223644-1-leon@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-12ipv4: Use dscp_t in struct fib_entry_notifier_infoGuillaume Nault1-1/+1
Use the new dscp_t type to replace the tos field of struct fib_entry_notifier_info. This ensures ECN bits are ignored and makes it compatible with the dscp field of struct fib_rt_info. This also allows sparse to flag potential incorrect uses of DSCP and ECN bits. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-12ipv4: Use dscp_t in struct fib_rt_infoGuillaume Nault1-1/+1
Use the new dscp_t type to replace the tos field of struct fib_rt_info. This ensures ECN bits are ignored and makes it compatible with the fa_dscp field of struct fib_alias. This also allows sparse to flag potential incorrect uses of DSCP and ECN bits. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-nextDavid S. Miller1-2/+6
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for net-next: 1) Replace unnecessary list_for_each_entry_continue() in nf_tables, from Jakob Koschel. 2) Add struct nf_conntrack_net_ecache to conntrack event cache and use it, from Florian Westphal. 3) Refactor ctnetlink_dump_list(), also from Florian. 4) Bump module reference counter on cttimeout object addition/removal, from Florian. 5) Consolidate nf_log MAC printer, from Phil Sutter. 6) Add basic logging support for unknown ethertype, from Phil Sutter. 7) Consolidate check for sysctl nf_log_all_netns toggle, also from Phil. 8) Replace hardcode value in nft_bitwise, from Jeremy Sowden. 9) Rename BASIC-like goto tags in nft_bitwise to more meaningful names, also from Jeremy. 10) nft_fib support for reverse path filtering with policy-based routing on iif. Extend selftests to cover for this new usecase, from Florian. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-11net: icmp: add skb drop reasons to icmp protocolMenglong Dong3-1/+8
Replace kfree_skb() used in icmp_rcv() and icmpv6_rcv() with kfree_skb_reason(). In order to get the reasons of the skb drops after icmp message handle, we change the return type of 'handler()' in 'struct icmp_control' from 'bool' to 'enum skb_drop_reason'. This may change its original intention, as 'false' means failure, but 'SKB_NOT_DROPPED_YET' means success now. Therefore, all 'handler' and the call of them need to be handled. Following 'handler' functions are involved: icmp_unreach() icmp_redirect() icmp_echo() icmp_timestamp() icmp_discard() And following new drop reasons are added: SKB_DROP_REASON_ICMP_CSUM SKB_DROP_REASON_INVALID_PROTO The reason 'INVALID_PROTO' is introduced for the case that the packet doesn't follow rfc 1122 and is dropped. This is not a common case, and I believe we can locate the problem from the data in the packet. For now, this 'INVALID_PROTO' is used for the icmp broadcasts with wrong types. Maybe there should be a document file for these reasons. For example, list all the case that causes the 'UNHANDLED_PROTO' and 'INVALID_PROTO' drop reason. Therefore, users can locate their problems according to the document. Reviewed-by: Hao Peng <flyingpeng@tencent.com> Reviewed-by: Jiang Biao <benbjiang@tencent.com> Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-11net: skb: rename SKB_DROP_REASON_PTYPE_ABSENTMenglong Dong2-6/+4
As David Ahern suggested, the reasons for skb drops should be more general and not be code based. Therefore, rename SKB_DROP_REASON_PTYPE_ABSENT to SKB_DROP_REASON_UNHANDLED_PROTO, which is used for the cases of no L3 protocol handler, no L4 protocol handler, version extensions, etc. From previous discussion, now we have the aim to make these reasons more abstract and users based, avoiding code based. Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-11net: sock: introduce sock_queue_rcv_skb_reason()Menglong Dong1-1/+8
In order to report the reasons of skb drops in 'sock_queue_rcv_skb()', introduce the function 'sock_queue_rcv_skb_reason()'. As the return value of 'sock_queue_rcv_skb()' is used as the error code, we can't make it as drop reason and have to pass extra output argument. 'sock_queue_rcv_skb()' is used in many places, so we can't change it directly. Introduce the new function 'sock_queue_rcv_skb_reason()' and make 'sock_queue_rcv_skb()' an inline call to it. Reviewed-by: Hao Peng <flyingpeng@tencent.com> Reviewed-by: Jiang Biao <benbjiang@tencent.com> Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-10tls: rx: simplify async waitJakub Kicinski1-1/+0
Since we are protected from async completions by decrypt_compl_lock we can drop the async_notify and reinit the completion before we start waiting. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-09net/mlx5: Remove not-implemented IPsec capabilitiesLeon Romanovsky1-7/+4
Clean a capabilities enum to remove not-implemented bits. Link: https://lore.kernel.org/r/1044bb7b779107ff38e48e3f6553421104f3f819.1649232994.git.leonro@nvidia.com Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2022-04-09net/mlx5: Remove ipsec_ops function tableLeon Romanovsky1-3/+0
There is only one IPsec implementation and ipsec_ops is not needed at all in this situation. Together with removal of ipsec_ops, we can drop the entry checks as these functions are called for IPsec devices only. Link: https://lore.kernel.org/r/bc8dd1c8a77b65dbf5e2cf92c813ffaca2505c5f.1649232994.git.leonro@nvidia.com Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2022-04-09net/mlx5: Remove not-needed IPsec configLeon Romanovsky2-3/+3
In current code, the CONFIG_MLX5_IPSEC and CONFIG_MLX5_EN_IPSEC are the same. So remove useless indirection. Link: https://lore.kernel.org/r/fd14492cbc01a0d51a5bfedde02bcd2154123fde.1649232994.git.leonro@nvidia.com Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2022-04-09net/mlx5: Unify device IPsec capabilities checkLeon Romanovsky1-2/+5
Merge two different function to one in order to provide coherent picture if the device is IPsec capable or not. Link: https://lore.kernel.org/r/8f10ea06ad19c6f651e9fb33921009658f01e1d5.1649232994.git.leonro@nvidia.com Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2022-04-09RDMA/core: Delete IPsec flow action logic from the coreLeon Romanovsky1-8/+0
The removal of mlx5 flow steering logic, left the kernel without any RDMA drivers that implements flow action callbacks supplied by RDMA/core. Any user access to them caused to EOPNOTSUPP error, which can be achieved by simply removing ioctl implementation. Link: https://lore.kernel.org/r/a638e376314a2eb1c66f597c0bbeeab2e5de7faf.1649232994.git.leonro@nvidia.com Reviewed-by: Raed Salem <raeds@nvidia.com> Acked-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2022-04-09RDMA/mlx5: Drop crypto flow steering APILeon Romanovsky1-8/+5
The mlx5 flow steering crypto API was intended to be used in FPGA devices, which is not supported for years already. The removal of mlx5 crypto FPGA code together with inability to configure encryption keys makes the low steering API completely unusable. So delete the code, so any ESP flow steering requests will fail with not supported error, as it is happening now anyway as no device support this type of API. Link: https://lore.kernel.org/r/634a5face7734381463d809bfb89850f6998deac.1649232994.git.leonro@nvidia.com Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2022-04-09net/mlx5_fpga: Drop INNOVA IPsec supportLeon Romanovsky1-148/+0
Mellanox INNOVA IPsec cards are EOL in Nov, 2019 [1]. As such, the code is unmaintained, untested and not in-use by any upstream/distro oriented customers. In order to reduce code complexity, drop the kernel code. [1] https://network.nvidia.com/related-docs/eol/LCR-000535.pdf Link: https://lore.kernel.org/r/2afe88ec5020a491079eacf6fe3c89b64d65195c.1649232994.git.leonro@nvidia.com Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2022-04-09Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextJakub Kicinski1-2/+2
Daniel Borkmann says: ==================== pull-request: bpf-next 2022-04-09 We've added 63 non-merge commits during the last 9 day(s) which contain a total of 68 files changed, 4852 insertions(+), 619 deletions(-). The main changes are: 1) Add libbpf support for USDT (User Statically-Defined Tracing) probes. USDTs are an abstraction built on top of uprobes, critical for tracing and BPF, and widely used in production applications, from Andrii Nakryiko. 2) While Andrii was adding support for x86{-64}-specific logic of parsing USDT argument specification, Ilya followed-up with USDT support for s390 architecture, from Ilya Leoshkevich. 3) Support name-based attaching for uprobe BPF programs in libbpf. The format supported is `u[ret]probe/binary_path:[raw_offset|function[+offset]]`, e.g. attaching to libc malloc can be done in BPF via SEC("uprobe/libc.so.6:malloc") now, from Alan Maguire. 4) Various load/store optimizations for the arm64 JIT to shrink the image size by using arm64 str/ldr immediate instructions. Also enable pointer authentication to verify return address for JITed code, from Xu Kuohai. 5) BPF verifier fixes for write access checks to helper functions, e.g. rd-only memory from bpf_*_cpu_ptr() must not be passed to helpers that write into passed buffers, from Kumar Kartikeya Dwivedi. 6) Fix overly excessive stack map allocation for its base map structure and buckets which slipped-in from cleanups during the rlimit accounting removal back then, from Yuntao Wang. 7) Extend the unstable CT lookup helpers for XDP and tc/BPF to report netfilter connection tracking tuple direction, from Lorenzo Bianconi. 8) Improve bpftool dump to show BPF program/link type names, Milan Landaverde. 9) Minor cleanups all over the place from various others. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (63 commits) bpf: Fix excessive memory allocation in stack_map_alloc() selftests/bpf: Fix return value checks in perf_event_stackmap test selftests/bpf: Add CO-RE relos into linked_funcs selftests libbpf: Use weak hidden modifier for USDT BPF-side API functions libbpf: Don't error out on CO-RE relos for overriden weak subprogs samples, bpf: Move routes monitor in xdp_router_ipv4 in a dedicated thread libbpf: Allow WEAK and GLOBAL bindings during BTF fixup libbpf: Use strlcpy() in path resolution fallback logic libbpf: Add s390-specific USDT arg spec parsing logic libbpf: Make BPF-side of USDT support work on big-endian machines libbpf: Minor style improvements in USDT code libbpf: Fix use #ifdef instead of #if to avoid compiler warning libbpf: Potential NULL dereference in usdt_manager_attach_usdt() selftests/bpf: Uprobe tests should verify param/return values libbpf: Improve string parsing for uprobe auto-attach libbpf: Improve library identification for uprobe binary path resolution selftests/bpf: Test for writes to map key from BPF helpers selftests/bpf: Test passing rdonly mem to global func bpf: Reject writes for PTR_TO_MAP_KEY in check_helper_mem_access bpf: Check PTR_TO_MEM | MEM_RDONLY in check_helper_mem_access ... ==================== Link: https://lore.kernel.org/r/20220408231741.19116-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-08net/sched: act_skbedit: Add extack messages for offload failureIdo Schimmel1-0/+12
For better error reporting to user space, add extack messages when skbedit action offload fails. Example: # echo 1 > /sys/kernel/tracing/events/netlink/netlink_extack/enable # tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action skbedit queue_mapping 1234 Error: cls_matchall: Failed to setup flow action. We have an error talking to the kernel # cat /sys/kernel/tracing/trace_pipe tc-185 [002] b..1. 31.802414: netlink_extack: msg=act_skbedit: Offload not supported when "queue_mapping" option is used tc-185 [002] ..... 31.802418: netlink_extack: msg=cls_matchall: Failed to setup flow action # tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action skbedit inheritdsfield Error: cls_matchall: Failed to setup flow action. We have an error talking to the kernel # cat /sys/kernel/tracing/trace_pipe tc-187 [002] b..1. 45.985145: netlink_extack: msg=act_skbedit: Offload not supported when "inheritdsfield" option is used tc-187 [002] ..... 45.985160: netlink_extack: msg=cls_matchall: Failed to setup flow action Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-08net/sched: act_gact: Add extack messages for offload failureIdo Schimmel1-0/+15
For better error reporting to user space, add extack messages when gact action offload fails. Example: # echo 1 > /sys/kernel/tracing/events/netlink/netlink_extack/enable # tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action continue Error: cls_matchall: Failed to setup flow action. We have an error talking to the kernel # cat /sys/kernel/tracing/trace_pipe tc-181 [002] b..1. 105.493450: netlink_extack: msg=act_gact: Offload of "continue" action is not supported tc-181 [002] ..... 105.493466: netlink_extack: msg=cls_matchall: Failed to setup flow action # tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action reclassify Error: cls_matchall: Failed to setup flow action. We have an error talking to the kernel # cat /sys/kernel/tracing/trace_pipe tc-183 [002] b..1. 124.126477: netlink_extack: msg=act_gact: Offload of "reclassify" action is not supported tc-183 [002] ..... 124.126489: netlink_extack: msg=cls_matchall: Failed to setup flow action # tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action pipe action drop Error: cls_matchall: Failed to setup flow action. We have an error talking to the kernel # cat /sys/kernel/tracing/trace_pipe tc-185 [002] b..1. 137.097791: netlink_extack: msg=act_gact: Offload of "pipe" action is not supported tc-185 [002] ..... 137.097804: netlink_extack: msg=cls_matchall: Failed to setup flow action Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-08net/sched: act_api: Add extack to offload_act_setup() callbackIdo Schimmel2-3/+6
The callback is used by various actions to populate the flow action structure prior to offload. Pass extack to this callback so that the various actions will be able to report accurate error messages to user space. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-08tls: rx: use a define for tag lengthJakub Kicinski1-0/+1
TLS 1.3 has to strip padding, and it starts out 16 bytes from the end of the record. Make it clear this is because of the auth tag. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-08tls: rx: don't store the decryption status in socket contextJakub Kicinski2-1/+1
Similar justification to previous change, the information about decryption status belongs in the skb. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-08tls: rx: don't store the record type in socket contextJakub Kicinski2-7/+6
Original TLS implementation was handling one record at a time. It stashed the type of the record inside tls context (per socket structure) for convenience. When async crypto support was added [1] the author had to use skb->cb to store the type per-message. The use of skb->cb overlaps with strparser, however, so a hybrid approach was taken where type is stored in context while parsing (since we parse a message at a time) but once parsed its copied to skb->cb. Recently a workaround for sockmaps [2] exposed the previously private struct _strp_msg and started a trend of adding user fields directly in strparser's header. This is cleaner than storing information about an skb in the context. This change is not strictly necessary, but IMHO the ownership of the context field is confusing. Information naturally belongs to the skb. [1] commit 94524d8fc965 ("net/tls: Add support for async decryption of tls records") [2] commit b2c4618162ec ("bpf, sockmap: sk_skb data_end access incorrect when src_reg = dst_reg") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-08netfilter: ecache: move to separate structureFlorian Westphal1-2/+6
This makes it easier for a followup patch to only expose ecache related parts of nf_conntrack_net structure. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-04-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski4-9/+4
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-08Merge tag 'net-5.18-rc2' of ↵Linus Torvalds2-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from bpf and netfilter. Current release - new code bugs: - mctp: correct mctp_i2c_header_create result - eth: fungible: fix reference to __udivdi3 on 32b builds - eth: micrel: remove latencies support lan8814 Previous releases - regressions: - bpf: resolve to prog->aux->dst_prog->type only for BPF_PROG_TYPE_EXT - vrf: fix packet sniffing for traffic originating from ip tunnels - rxrpc: fix a race in rxrpc_exit_net() - dsa: revert "net: dsa: stop updating master MTU from master.c" - eth: ice: fix MAC address setting Previous releases - always broken: - tls: fix slab-out-of-bounds bug in decrypt_internal - bpf: support dual-stack sockets in bpf_tcp_check_syncookie - xdp: fix coalescing for page_pool fragment recycling - ovs: fix leak of nested actions - eth: sfc: - add missing xdp queue reinitialization - fix using uninitialized xdp tx_queue - eth: ice: - clear default forwarding VSI during VSI release - fix broken IFF_ALLMULTI handling - synchronize_rcu() when terminating rings - eth: qede: confirm skb is allocated before using - eth: aqc111: fix out-of-bounds accesses in RX fixup - eth: slip: fix NPD bug in sl_tx_timeout()" * tag 'net-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (61 commits) drivers: net: slip: fix NPD bug in sl_tx_timeout() bpf: Adjust bpf_tcp_check_syncookie selftest to test dual-stack sockets bpf: Support dual-stack sockets in bpf_tcp_check_syncookie myri10ge: fix an incorrect free for skb in myri10ge_sw_tso net: usb: aqc111: Fix out-of-bounds accesses in RX fixup qede: confirm skb is allocated before using net: ipv6mr: fix unused variable warning with CONFIG_IPV6_PIMSM_V2=n net: phy: mscc-miim: reject clause 45 register accesses net: axiemac: use a phandle to reference pcs_phy dt-bindings: net: add pcs-handle attribute net: axienet: factor out phy_node in struct axienet_local net: axienet: setup mdio unconditionally net: sfc: fix using uninitialized xdp tx_queue rxrpc: fix a race in rxrpc_exit_net() net: openvswitch: fix leak of nested actions net: ethernet: mv643xx: Fix over zealous checking of_get_mac_address() net: openvswitch: don't send internal clone attribute to the userspace. net: micrel: Fix KS8851 Kconfig ice: clear cmd_type_offset_bsz for TX rings ice: xsk: fix VSI state check in ice_xsk_wakeup() ...
2022-04-08tcp: Add tracepoint for tcp_set_ca_statePing Gan2-9/+48
The congestion status of a tcp flow may be updated since there is congestion between tcp sender and receiver. It makes sense to add tracepoint for congestion status set function to summate cc status duration and evaluate the performance of network and congestion algorithm. the backgound of this patch is below. Link: https://github.com/iovisor/bcc/pull/3899 Signed-off-by: Ping Gan <jacky_gam_2001@163.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20220406010956.19656-1-jacky_gam_2001@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-08net-core: rx_otherhost_dropped to core_statsJeffrey Ji2-0/+7
Increment rx_otherhost_dropped counter when packet dropped due to mismatched dest MAC addr. An example when this drop can occur is when manually crafting raw packets that will be consumed by a user space application via a tap device. For testing purposes local traffic was generated using trafgen for the client and netcat to start a server Tested: Created 2 netns, sent 1 packet using trafgen from 1 to the other with "{eth(daddr=$INCORRECT_MAC...}", verified that iproute2 showed the counter was incremented. (Also had to modify iproute2 to show the stat, additional patch for that coming next.) Signed-off-by: Jeffrey Ji <jeffreyji@google.com> Reviewed-by: Brian Vazquez <brianvv@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20220406172600.1141083-1-jeffreyjilinux@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-08net: extract a few internals from netdevice.hJakub Kicinski1-70/+2
There's a number of functions and static variables used under net/core/ but not from the outside. We currently dump most of them into netdevice.h. That bad for many reasons: - netdevice.h is very cluttered, hard to figure out what the APIs are; - netdevice.h is very long; - we have to touch netdevice.h more which causes expensive incremental builds. Create a header under net/core/ and move some declarations. The new header is also a bit of a catch-all but that's fine, if we create more specific headers people will likely over-think where their declaration fit best. And end up putting them in netdevice.h, again. More work should be done on splitting netdevice.h into more targeted headers, but that'd be more time consuming so small steps. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-07Merge tag 'hyperv-fixes-signed-20220407' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv fixes from Wei Liu: - Correctly propagate coherence information for VMbus devices (Michael Kelley) - Disable balloon and memory hot-add on ARM64 temporarily (Boqun Feng) - Use barrier to prevent reording when reading ring buffer (Michael Kelley) - Use virt_store_mb in favour of smp_store_mb (Andrea Parri) - Fix VMbus device object initialization (Andrea Parri) - Deactivate sysctl_record_panic_msg on isolated guest (Andrea Parri) - Fix a crash when unloading VMbus module (Guilherme G. Piccoli) * tag 'hyperv-fixes-signed-20220407' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: Drivers: hv: vmbus: Replace smp_store_mb() with virt_store_mb() Drivers: hv: balloon: Disable balloon and hot-add accordingly Drivers: hv: balloon: Support status report for larger page sizes Drivers: hv: vmbus: Prevent load re-ordering when reading ring buffer PCI: hv: Propagate coherence from VMbus device to PCI device Drivers: hv: vmbus: Propagate VMbus coherence to each VMbus device Drivers: hv: vmbus: Fix potential crash on module unload Drivers: hv: vmbus: Fix initialization of device object in vmbus_device_register() Drivers: hv: vmbus: Deactivate sysctl_record_panic_msg by default in isolated guests
2022-04-07ipv6: fix locking issues with loops over idev->addr_listNiels Dossche1-0/+8
idev->addr_list needs to be protected by idev->lock. However, it is not always possible to do so while iterating and performing actions on inet6_ifaddr instances. For example, multiple functions (like addrconf_{join,leave}_anycast) eventually call down to other functions that acquire the idev->lock. The current code temporarily unlocked the idev->lock during the loops, which can cause race conditions. Moving the locks up is also not an appropriate solution as the ordering of lock acquisition will be inconsistent with for example mc_lock. This solution adds an additional field to inet6_ifaddr that is used to temporarily add the instances to a temporary list while holding idev->lock. The temporary list can then be traversed without holding idev->lock. This change was done in two places. In addrconf_ifdown, the list_for_each_entry_safe variant of the list loop is also no longer necessary as there is no deletion within that specific loop. Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Niels Dossche <dossche.niels@gmail.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20220403231523.45843-1-dossche.niels@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-07Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfJakub Kicinski1-1/+3
Alexei Starovoitov says: ==================== pull-request: bpf 2022-04-06 We've added 8 non-merge commits during the last 8 day(s) which contain a total of 9 files changed, 139 insertions(+), 36 deletions(-). The main changes are: 1) rethook related fixes, from Jiri and Masami. 2) Fix the case when tracing bpf prog is attached to struct_ops, from Martin. 3) Support dual-stack sockets in bpf_tcp_check_syncookie, from Maxim. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf: Adjust bpf_tcp_check_syncookie selftest to test dual-stack sockets bpf: Support dual-stack sockets in bpf_tcp_check_syncookie bpf: selftests: Test fentry tracing a struct_ops program bpf: Resolve to prog->aux->dst_prog->type only for BPF_PROG_TYPE_EXT rethook: Fix to use WRITE_ONCE() for rethook:: Handler selftests/bpf: Fix warning comparing pointer to 0 bpf: Fix sparse warnings in kprobe_multi_resolve_syms bpftool: Explicit errno handling in skeletons ==================== Link: https://lore.kernel.org/r/20220407031245.73026-1-alexei.starovoitov@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-06tcp: add accessors to read/set tp->snd_cwndEric Dumazet2-5/+16
We had various bugs over the years with code breaking the assumption that tp->snd_cwnd is greater than zero. Lately, syzbot reported the WARN_ON_ONCE(!tp->prior_cwnd) added in commit 8b8a321ff72c ("tcp: fix zero cwnd in tcp_cwnd_reduction") can trigger, and without a repro we would have to spend considerable time finding the bug. Instead of complaining too late, we want to catch where and when tp->snd_cwnd is set to an illegal value. Signed-off-by: Eric Dumazet <edumazet@google.com> Suggested-by: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Link: https://lore.kernel.org/r/20220405233538.947344-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-06net: ethernet: mtk_eth_soc: implement flow offloading to WED devicesFelix Fietkau1-0/+7
This allows hardware flow offloading from Ethernet to WLAN on MT7622 SoC Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-06net: ethernet: mtk_eth_soc: add support for Wireless Ethernet Dispatch (WED)Felix Fietkau1-0/+131
The Wireless Ethernet Dispatch subsystem on the MT7622 SoC can be configured to intercept and handle access to the DMA queues and PCIe interrupts for a MT7615/MT7915 wireless card. It can manage the internal WDMA (Wireless DMA) controller, which allows ethernet packets to be passed from the packet switch engine (PSE) to the wireless card, bypassing the CPU entirely. This can be used to implement hardware flow offloading from ethernet to WLAN. Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-06net, uapi: remove inclusion of arpa/inet.hNick Desaulniers1-16/+12
In include/uapi/linux/tipc_config.h, there's a comment that it includes arpa/inet.h for ntohs; but ntohs is not defined in any UAPI header. For now, reuse the definitions from include/linux/byteorder/generic.h, since the various conversion functions do exist in UAPI headers: include/uapi/linux/byteorder/big_endian.h include/uapi/linux/byteorder/little_endian.h We would like to get to the point where we can build UAPI header tests with -nostdinc, meaning that kernel UAPI headers should not have a circular dependency on libc headers. Link: https://android-review.googlesource.com/c/platform/bionic/+/2048127 Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-06net: remove noblock parameter from skb_recv_datagram()Oliver Hartkopp1-2/+1
skb_recv_datagram() has two parameters 'flags' and 'noblock' that are merged inside skb_recv_datagram() by 'flags | (noblock ? MSG_DONTWAIT : 0)' As 'flags' may contain MSG_DONTWAIT as value most callers split the 'flags' into 'flags' and 'noblock' with finally obsolete bit operations like this: skb_recv_datagram(sk, flags & ~MSG_DONTWAIT, flags & MSG_DONTWAIT, &rc); And this is not even done consistently with the 'flags' parameter. This patch removes the obsolete and costly splitting into two parameters and only performs bit operations when really needed on the caller side. One missing conversion thankfully reported by kernel test robot. I missed to enable kunit tests to build the mctp code. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: David S. Miller <davem@davemloft.net>