summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-03-03bpf, sockmap: Do not ignore orig_len parameterEric Dumazet1-1/+1
Currently, sk_psock_verdict_recv() returns skb->len This is problematic because tcp_read_sock() might have passed orig_len < skb->len, due to the presence of TCP urgent data. This causes an infinite loop from tcp_read_sock() Followup patch will make tcp_read_sock() more robust vs bad actors. Fixes: ef5659280eb1 ("bpf, sockmap: Allow skipping sk_skb parser program") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jakub Sitnicki <jakub@cloudflare.com> Tested-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20220302161723.3910001-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03tcp: Remove the unused apiTao Chen1-5/+0
Last tcp_write_queue_head() use was removed in commit 114f39feab36 ("tcp: restore autocorking"), so remove it. Signed-off-by: Tao Chen <chentao3@hotmail.com> Link: https://lore.kernel.org/r/SYZP282MB33317DEE1253B37C0F57231E86029@SYZP282MB3331.AUSP282.PROD.OUTLOOK.COM Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03flow_dissector: Add support for HSRKurt Kanzenbach3-16/+33
Network drivers such as igb or igc call eth_get_headlen() to determine the header length for their to be constructed skbs in receive path. When running HSR on top of these drivers, it results in triggering BUG_ON() in skb_pull(). The reason is the skb headlen is not sufficient for HSR to work correctly. skb_pull() notices that. For instance, eth_get_headlen() returns 14 bytes for TCP traffic over HSR which is not correct. The problem is, the flow dissection code does not take HSR into account. Therefore, add support for it. Reported-by: Anthony Harivel <anthony.harivel@linutronix.de> Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Link: https://lore.kernel.org/r/20220228195856.88187-1-kurt@linutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03net: dsa: mv88e6xxx: support RMII cmodeBaruch Siach1-0/+3
Add support for direct RMII MAC mode. This allows hardware with CPU port connected in direct 100M fixed link to work properly. Signed-off-by: Baruch Siach <baruch.siach@siklu.com> Link: https://lore.kernel.org/r/a962d1ccbeec42daa10dd8aff0e66e31f0faf1eb.1646050203.git.baruch@tkos.co.il Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03net: dsa: mv88e6xxx: don't error out cmode set on missing laneBaruch Siach1-0/+2
When the given cmode has no serdes, mv88e6xxx_serdes_get_lane() returns -NODEV. Earlier in the same function the code skips serdes handing in this case. Do the same after cmode set. Signed-off-by: Baruch Siach <baruch.siach@siklu.com> Link: https://lore.kernel.org/r/cd95cf3422ae8daf297a01fa9ec3931b203cdf45.1646050203.git.baruch@tkos.co.il Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03net: openvswitch: remove unneeded semicolonYang Li1-1/+1
Eliminate the following coccicheck warning: ./net/openvswitch/flow.c:379:2-3: Unneeded semicolon Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Link: https://lore.kernel.org/r/20220227132208.24658-1-yang.lee@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03flow_offload: improve extack msg for user when adding invalid filterBaowen Zheng1-0/+2
Add extack message to return exact message to user when adding invalid filter with conflict flags for TC action. In previous implement we just return EINVAL which is confusing for user. Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Link: https://lore.kernel.org/r/1646191769-17761-1-git-send-email-baowen.zheng@corigine.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03net: ipa: add an interconnect dependencyAlex Elder1-0/+1
In order to function, the IPA driver very clearly requires the interconnect framework to be enabled in the kernel configuration. State that dependency in the Kconfig file. This became a problem when CONFIG_COMPILE_TEST support was added. Non-Qualcomm platforms won't necessarily enable CONFIG_INTERCONNECT. Reported-by: kernel test robot <lkp@intel.com> Fixes: 38a4066f593c5 ("net: ipa: support COMPILE_TEST") Signed-off-by: Alex Elder <elder@linaro.org> Link: https://lore.kernel.org/r/20220301113440.257916-1-elder@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03Merge branch '40GbE' of ↵Jakub Kicinski6-189/+397
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== 40GbE Intel Wired LAN Driver Updates 2022-03-01 This series contains updates to iavf driver only. Mateusz adds support for interrupt moderation for 50G and 100G speeds as well as support for the driver to specify a request as its primary MAC address. He also refactors VLAN V2 capability exchange into more generic extended capabilities to ease the addition of future capabilities. Finally, he corrects the incorrect return of iavf_status values and removes non-inclusive language. Minghao Chi removes unneeded variables, instead returning values directly. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: iavf: Remove non-inclusive language iavf: Fix incorrect use of assigning iavf_status to int iavf: stop leaking iavf_status as "errno" values iavf: remove redundant ret variable iavf: Add usage of new virtchnl format to set default MAC iavf: refactor processing of VLAN V2 capability message iavf: Add support for 50G/100G in AIM algorithm ==================== Link: https://lore.kernel.org/r/20220301185939.3005116-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03nfp: flower: Remove usage of the deprecated ida_simple_xxx APIChristophe JAILLET1-5/+5
Use ida_alloc_xxx()/ida_free() instead to ida_simple_get()/ida_simple_remove(). The latter is deprecated and more verbose. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20220301131212.26348-1-simon.horman@corigine.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03net: fix up skbs delta_truesize in UDP GRO frag_listlena wang1-1/+1
The truesize for a UDP GRO packet is added by main skb and skbs in main skb's frag_list: skb_gro_receive_list p->truesize += skb->truesize; The commit 53475c5dd856 ("net: fix use-after-free when UDP GRO with shared fraglist") introduced a truesize increase for frag_list skbs. When uncloning skb, it will call pskb_expand_head and trusesize for frag_list skbs may increase. This can occur when allocators uses __netdev_alloc_skb and not jump into __alloc_skb. This flow does not use ksize(len) to calculate truesize while pskb_expand_head uses. skb_segment_list err = skb_unclone(nskb, GFP_ATOMIC); pskb_expand_head if (!skb->sk || skb->destructor == sock_edemux) skb->truesize += size - osize; If we uses increased truesize adding as delta_truesize, it will be larger than before and even larger than previous total truesize value if skbs in frag_list are abundant. The main skb truesize will become smaller and even a minus value or a huge value for an unsigned int parameter. Then the following memory check will drop this abnormal skb. To avoid this error we should use the original truesize to segment the main skb. Fixes: 53475c5dd856 ("net: fix use-after-free when UDP GRO with shared fraglist") Signed-off-by: lena wang <lena.wang@mediatek.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/1646133431-8948-1-git-send-email-lena.wang@mediatek.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03net: sfp: use %pe for printing errorsRussell King (Oracle)1-18/+30
Convert sfp to use %pe for printing error codes, which can print them as errno symbols rather than numbers. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/E1nOyEN-00BuuE-OB@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03net: phylink: use %pe for printing errorsRussell King (Oracle)1-9/+11
Convert phylink to use %pe for printing error codes, which can print them as errno symbols rather than numbers. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/E1nOyEI-00Buu8-K9@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03tuntap: add sanity checks about msg_controllen in sendmsgHarold Huang3-2/+5
In patch [1], tun_msg_ctl was added to allow pass batched xdp buffers to tun_sendmsg. Although we donot use msg_controllen in this path, we should check msg_controllen to make sure the caller pass a valid msg_ctl. [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fe8dd45bb7556246c6b76277b1ba4296c91c2505 Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Suggested-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Harold Huang <baymaxhuang@gmail.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20220303022441.383865-1-baymaxhuang@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03Merge tag 'batadv-next-pullrequest-20220302' of ↵Jakub Kicinski17-16/+19
git://git.open-mesh.org/linux-merge Simon Wunderlich says: ==================== This cleanup patchset includes the following patches: - bump version strings, by Simon Wunderlich - Remove redundant 'flush_workqueue()' calls, by Christophe JAILLET - Migrate to linux/container_of.h, by Sven Eckelmann - Demote batadv-on-batadv skip error message, by Sven Eckelmann * tag 'batadv-next-pullrequest-20220302' of git://git.open-mesh.org/linux-merge: batman-adv: Demote batadv-on-batadv skip error message batman-adv: Migrate to linux/container_of.h batman-adv: Remove redundant 'flush_workqueue()' calls batman-adv: Start new development cycle ==================== Link: https://lore.kernel.org/r/20220302163522.102842-1-sw@simonwunderlich.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03Merge tag 'batadv-net-pullrequest-20220302' of ↵Jakub Kicinski1-9/+20
git://git.open-mesh.org/linux-merge Simon Wunderlich says: ==================== Here are some batman-adv bugfixes: - Remove redundant iflink requests, by Sven Eckelmann (2 patches) - Don't expect inter-netns unique iflink indices, by Sven Eckelmann * tag 'batadv-net-pullrequest-20220302' of git://git.open-mesh.org/linux-merge: batman-adv: Don't expect inter-netns unique iflink indices batman-adv: Request iflink once in batadv_get_real_netdevice batman-adv: Request iflink once in batadv-on-batadv check ==================== Link: https://lore.kernel.org/r/20220302163049.101957-1-sw@simonwunderlich.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03Merge tag 'wireless-for-net-2022-03-02' of ↵Jakub Kicinski3-3/+6
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Johannes Berg says: ==================== Three more fixes: - fix build issue in iwlwifi, now that I understood what's going on there - propagate error in iwlwifi/mvm to userspace so it can figure out what's happening - fix channel switch related updates in P2P-client in cfg80211 * tag 'wireless-for-net-2022-03-02' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: iwlwifi: mvm: return value for request_ownership nl80211: Update bss channel on channel switch for P2P_CLIENT iwlwifi: fix build error for IWLMEI ==================== Link: https://lore.kernel.org/r/20220302214444.100180-1-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03Merge branch 'ucount-rlimit-fixes-for-v5.17' of ↵Linus Torvalds1-1/+13
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull ucounts fix from Eric Biederman: "Etienne Dechamps recently found a regression caused by enforcing RLIMIT_NPROC for root where the rlimit was not previously enforced. Michal Koutný had previously pointed out the inconsistency in enforcing the RLIMIT_NPROC that had been on the root owned process after the root user creates a user namespace. Which makes the fix for the regression simply removing the inconsistency" * 'ucount-rlimit-fixes-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: ucounts: Fix systemd LimitNPROC with private users regression
2022-03-03Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds3-9/+30
Pull ARM fixes from Russell King: - Fix kgdb breakpoint for Thumb2 - Fix dependency for BITREVERSE kconfig - Fix nommu early_params and __setup returns * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 9182/1: mmu: fix returns from early_param() and __setup() functions ARM: 9178/1: fix unmet dependency on BITREVERSE for HAVE_ARCH_BITREVERSE ARM: Fix kgdb breakpoint for Thumb2
2022-03-03auxdisplay: lcd2s: Use proper API to free the instance of charlcd objectAndy Shevchenko1-2/+2
While it might work, the current approach is fragile in a few ways: - whenever members in the structure are shuffled, the pointer will be wrong - the resource freeing may include more than covered by kfree() Fix this by using charlcd_free() call instead of kfree(). Fixes: 8c9108d014c5 ("auxdisplay: add a driver for lcd2s character display") Cc: Lars Poeschel <poeschel@lemonage.de> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2022-03-03auxdisplay: lcd2s: Fix memory leak in ->remove()Andy Shevchenko1-11/+7
Once allocated the struct lcd2s_data is never freed. Fix the memory leak by switching to devm_kzalloc(). Fixes: 8c9108d014c5 ("auxdisplay: add a driver for lcd2s character display") Cc: Lars Poeschel <poeschel@lemonage.de> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2022-03-03auxdisplay: lcd2s: Fix lcd2s_redefine_char() featureAndy Shevchenko1-1/+1
It seems that the lcd2s_redefine_char() has never been properly tested. The buffer is filled by DEF_CUSTOM_CHAR command followed by the character number (from 0 to 7), but immediately after that these bytes are rewritten by the decoded hex stream. Fix the index to fill the buffer after the command and number. Fixes: 8c9108d014c5 ("auxdisplay: add a driver for lcd2s character display") Cc: Lars Poeschel <poeschel@lemonage.de> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> [fixed typo in commit message] Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2022-03-03iwlwifi: mvm: return value for request_ownershipEmmanuel Grumbach1-2/+3
Propagate the value to the user space so it can understand if the operation failed or not. Fixes: bfcfdb59b669 ("iwlwifi: mvm: add vendor commands needed for iwlmei") Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Link: https://lore.kernel.org/r/20220302072715.4885-1-emmanuel.grumbach@intel.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-03-03nl80211: Update bss channel on channel switch for P2P_CLIENTSreeramya Soratkal1-1/+2
The wdev channel information is updated post channel switch only for the station mode and not for the other modes. Due to this, the P2P client still points to the old value though it moved to the new channel when the channel change is induced from the P2P GO. Update the bss channel after CSA channel switch completion for P2P client interface as well. Signed-off-by: Sreeramya Soratkal <quic_ssramya@quicinc.com> Link: https://lore.kernel.org/r/1646114600-31479-1-git-send-email-quic_ssramya@quicinc.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-03-03iwlwifi: fix build error for IWLMEIRandy Dunlap1-0/+1
When CONFIG_IWLWIFI=m and CONFIG_IWLMEI=y, the kernel build system must be told to build the iwlwifi/ subdirectory for both IWLWIFI and IWLMEI so that builds for both =y and =m are done. This resolves an undefined reference build error: ERROR: modpost: "iwl_mei_is_connected" [drivers/net/wireless/intel/iwlwifi/iwlwifi.ko] undefined! Fixes: 977df8bd5844 ("wlwifi: work around reverse dependency on MEI") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Luca Coelho <luciano.coelho@intel.com> Cc: linux-wireless@vger.kernel.org Link: https://lore.kernel.org/r/20220227200051.7176-1-rdunlap@infradead.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-03-02Merge tag 'erofs-for-5.17-rc7-fixes' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs Pull erofs fix from Gao Xiang: "A one-line patch to fix the new ztailpacking feature on > 4GiB filesystems because z_idataoff can get trimmed improperly. ztailpacking is still a brand new EXPERIMENTAL feature, but it'd be better to fix the issue as soon as possible to avoid unnecessary backporting. Summary: - Fix ztailpacking z_idataoff getting trimmed on > 4GiB filesystems" * tag 'erofs-for-5.17-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: erofs: fix ztailpacking on > 4GiB filesystems
2022-03-02Merge tag 'ntb-5.17-bugfixes' of git://github.com/jonmason/ntbLinus Torvalds4-9/+38
Pull NTB fixes from Jon Mason: "Bug fixes for sparse warning, intel port config offset, and a new mailing list" * tag 'ntb-5.17-bugfixes' of git://github.com/jonmason/ntb: MAINTAINERS: update mailing list address for NTB subsystem ntb: intel: fix port config status offset for SPR NTB/msi: Use struct_size() helper in devm_kzalloc()
2022-03-02ptp: ocp: Add ptp_ocp_adjtime_coarse for large adjustmentsJonathan Lemon1-2/+23
In ("ptp: ocp: Have FPGA fold in ns adjustment for adjtime."), the ns adjustment was written to the FPGA register, so the clock could accurately perform adjustments. However, the adjtime() call passes in a s64, while the clock adjustment registers use a s32. When trying to perform adjustments with a large value (37 sec), things fail. Examine the incoming delta, and if larger than 1 sec, use the original (coarse) adjustment method. If smaller than 1 sec, then allow the FPGA to fold in the changes over a 1 second window. Fixes: 6d59d4fa1789 ("ptp: ocp: Have FPGA fold in ns adjustment for adjtime.") Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Link: https://lore.kernel.org/r/20220228203957.367371-1-jonathan.lemon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-02net: hamradio: fix compliation errorWang Qing1-1/+1
add missing ")" which caused by previous commit. Fixes: 61c4fb9c4d09 ("net: hamradio: use time_is_after_jiffies() instead of open coding it") Link: https://lore.kernel.org/all/1646018012-61129-1-git-send-email-wangqing@vivo.com/ Signed-off-by: Wang Qing <wangqing@vivo.com> Link: https://lore.kernel.org/r/1646203277-83159-1-git-send-email-wangqing@vivo.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-02erofs: fix ztailpacking on > 4GiB filesystemsGao Xiang1-1/+1
z_idataoff here is an absolute physical offset, so it should use erofs_off_t (64 bits at least). Otherwise, it'll get trimmed and cause the decompresion failure. Link: https://lore.kernel.org/r/20220222033118.20540-1-hsiangkao@linux.alibaba.com Fixes: ab92184ff8f1 ("erofs: add on-disk compressed tail-packing inline support") Reviewed-by: Yue Hu <huyue2@yulong.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2022-03-02batman-adv: Don't expect inter-netns unique iflink indicesSven Eckelmann1-5/+14
The ifindex doesn't have to be unique for multiple network namespaces on the same machine. $ ip netns add test1 $ ip -net test1 link add dummy1 type dummy $ ip netns add test2 $ ip -net test2 link add dummy2 type dummy $ ip -net test1 link show dev dummy1 6: dummy1: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/ether 96:81:55:1e:dd:85 brd ff:ff:ff:ff:ff:ff $ ip -net test2 link show dev dummy2 6: dummy2: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/ether 5a:3c:af:35:07:c3 brd ff:ff:ff:ff:ff:ff But the batman-adv code to walk through the various layers of virtual interfaces uses this assumption because dev_get_iflink handles it internally and doesn't return the actual netns of the iflink. And dev_get_iflink only documents the situation where ifindex == iflink for physical devices. But only checking for dev->netdev_ops->ndo_get_iflink is also not an option because ipoib_get_iflink implements it even when it sometimes returns an iflink != ifindex and sometimes iflink == ifindex. The caller must therefore make sure itself to check both netns and iflink + ifindex for equality. Only when they are equal, a "physical" interface was detected which should stop the traversal. On the other hand, vxcan_get_iflink can also return 0 in case there was currently no valid peer. In this case, it is still necessary to stop. Fixes: b7eddd0b3950 ("batman-adv: prevent using any virtual device created on batman-adv as hard-interface") Fixes: 5ed4a460a1d3 ("batman-adv: additional checks for virtual interfaces on top of WiFi") Reported-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2022-03-02batman-adv: Request iflink once in batadv_get_real_netdeviceSven Eckelmann1-4/+5
There is no need to call dev_get_iflink multiple times for the same net_device in batadv_get_real_netdevice. And since some of the ndo_get_iflink callbacks are dynamic (for example via RCUs like in vxcan_get_iflink), it could easily happen that the returned values are not stable. The pre-checks before __dev_get_by_index are then of course bogus. Fixes: 5ed4a460a1d3 ("batman-adv: additional checks for virtual interfaces on top of WiFi") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2022-03-02batman-adv: Request iflink once in batadv-on-batadv checkSven Eckelmann1-4/+5
There is no need to call dev_get_iflink multiple times for the same net_device in batadv_is_on_batman_iface. And since some of the .ndo_get_iflink callbacks are dynamic (for example via RCUs like in vxcan_get_iflink), it could easily happen that the returned values are not stable. The pre-checks before __dev_get_by_index are then of course bogus. Fixes: b7eddd0b3950 ("batman-adv: prevent using any virtual device created on batman-adv as hard-interface") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2022-03-02batman-adv: Demote batadv-on-batadv skip error messageSven Eckelmann1-2/+2
The error message "Cannot find parent device" was shown for users of macvtap (on batadv devices) whenever the macvtap was moved to a different netns. This happens because macvtap doesn't provide an implementation for rtnl_link_ops->get_link_net. The situation for which this message is printed is actually not an error but just a warning that the optional sanity check was skipped. So demote the message from error to warning and adjust the text to better explain what happened. Reported-by: Leonardo Mörlein <freifunk@irrelefant.net> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2022-03-02batman-adv: Migrate to linux/container_of.hSven Eckelmann16-12/+16
The commit d2a8ebbf8192 ("kernel.h: split out container_of() and typeof_member() macros") introduced a new header for the container_of related macros from (previously) linux/kernel.h. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2022-03-02Merge branch 'if_ether-h-add-industrial-fieldbus-ethertypes'Jakub Kicinski1-0/+2
Daniel Braunwarth says: ==================== if_ether.h: add industrial fieldbus Ethertypes This set of patches adds the Ethertypes for PROFINET and EtherCAT. The defines should be used by iproute2 to extend the list of available link layer protocols. ==================== Link: https://lore.kernel.org/r/20220228133029.100913-1-daniel@braunwarth.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-02if_ether.h: add EtherCAT EthertypeDaniel Braunwarth1-0/+1
Add the Ethertype for EtherCAT protocol. Signed-off-by: Daniel Braunwarth <daniel@braunwarth.dev> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-02if_ether.h: add PROFINET EthertypeDaniel Braunwarth1-0/+1
Add the Ethertype for PROFINET protocol. Signed-off-by: Daniel Braunwarth <daniel@braunwarth.dev> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-02net: dsa: restore error path of dsa_tree_change_tag_protoVladimir Oltean1-1/+1
When the DSA_NOTIFIER_TAG_PROTO returns an error, the user space process which initiated the protocol change exits the kernel processing while still holding the rtnl_mutex. So any other process attempting to lock the rtnl_mutex would deadlock after such event. The error handling of DSA_NOTIFIER_TAG_PROTO was inadvertently changed by the blamed commit, introducing this regression. We must still call rtnl_unlock(), and we must still call DSA_NOTIFIER_TAG_PROTO for the old protocol. The latter is due to the limiting design of notifier chains for cross-chip operations, which don't have a built-in error recovery mechanism - we should look into using notifier_call_chain_robust for that. Fixes: dc452a471dba ("net: dsa: introduce tagger-owned storage for private and shared data") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20220228141715.146485-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-02macvtap: advertise link netns via netlinkSven Eckelmann1-0/+6
Assign rtnl_link_ops->get_link_net() callback so that IFLA_LINK_NETNSID is added to rtnetlink messages. This fixes iproute2 which otherwise resolved the link interface to an interface in the wrong namespace. Test commands: ip netns add nst ip link add dummy0 type dummy ip link add link macvtap0 link dummy0 type macvtap ip link set macvtap0 netns nst ip -netns nst link show macvtap0 Before: 10: macvtap0@gre0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 500 link/ether 5e:8f:ae:1d:60:50 brd ff:ff:ff:ff:ff:ff After: 10: macvtap0@if2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 500 link/ether 5e:8f:ae:1d:60:50 brd ff:ff:ff:ff:ff:ff link-netnsid 0 Reported-by: Leonardo Mörlein <freifunk@irrelefant.net> Signed-off-by: Sven Eckelmann <sven@narfation.org> Link: https://lore.kernel.org/r/20220228003240.1337426-1-sven@narfation.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-02nfp: avoid newline at end of message in NL_SET_ERR_MSG_MODWan Jiabing1-1/+1
Fix the following coccicheck warning: ./drivers/net/ethernet/netronome/nfp/flower/qos_conf.c:750:7-55: WARNING avoid newline at end of message in NL_SET_ERR_MSG_MOD Signed-off-by: Wan Jiabing <wanjiabing@vivo.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20220301112356.1820985-1-wanjiabing@vivo.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-02tun: support NAPI for packets received from batched XDP buffsHarold Huang1-13/+30
In tun, NAPI is supported and we can also use NAPI in the path of batched XDP buffs to accelerate packet processing. What is more, after we use NAPI, GRO is also supported. The iperf shows that the throughput of single stream could be improved from 4.5Gbps to 9.2Gbps. Additionally, 9.2 Gbps nearly reachs the line speed of the phy nic and there is still about 15% idle cpu core remaining on the vhost thread. Test topology: [iperf server]<--->tap<--->dpdk testpmd<--->phy nic<--->[iperf client] Iperf stream: iperf3 -c 10.0.0.2 -i 1 -t 10 Before: ... [ 5] 5.00-6.00 sec 558 MBytes 4.68 Gbits/sec 0 1.50 MBytes [ 5] 6.00-7.00 sec 556 MBytes 4.67 Gbits/sec 1 1.35 MBytes [ 5] 7.00-8.00 sec 556 MBytes 4.67 Gbits/sec 2 1.18 MBytes [ 5] 8.00-9.00 sec 559 MBytes 4.69 Gbits/sec 0 1.48 MBytes [ 5] 9.00-10.00 sec 556 MBytes 4.67 Gbits/sec 1 1.33 MBytes - - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 5.39 GBytes 4.63 Gbits/sec 72 sender [ 5] 0.00-10.04 sec 5.39 GBytes 4.61 Gbits/sec receiver After: ... [ 5] 5.00-6.00 sec 1.07 GBytes 9.19 Gbits/sec 0 1.55 MBytes [ 5] 6.00-7.00 sec 1.08 GBytes 9.30 Gbits/sec 0 1.63 MBytes [ 5] 7.00-8.00 sec 1.08 GBytes 9.25 Gbits/sec 0 1.72 MBytes [ 5] 8.00-9.00 sec 1.08 GBytes 9.25 Gbits/sec 77 1.31 MBytes [ 5] 9.00-10.00 sec 1.08 GBytes 9.24 Gbits/sec 0 1.48 MBytes - - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 10.8 GBytes 9.28 Gbits/sec 166 sender [ 5] 0.00-10.04 sec 10.8 GBytes 9.24 Gbits/sec receiver Reported-at: https://lore.kernel.org/all/CACGkMEvTLG0Ayg+TtbN4q4pPW-ycgCCs3sC3-TF8cuRTf7Pp1A@mail.gmail.com Signed-off-by: Harold Huang <baymaxhuang@gmail.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20220228033805.1579435-1-baymaxhuang@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-02Merge tag 'for-net-2022-03-01' of ↵Jakub Kicinski1-36/+63
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - Fix regression with scanning not working in some systems. * tag 'for-net-2022-03-01' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: Fix not checking MGMT cmd pending queue ==================== Link: https://lore.kernel.org/r/20220302004330.125536-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-02Merge branch 'sfc-optimize-rxqs-count-and-affinities'Jakub Kicinski1-19/+44
Íñigo Huguet says: ==================== sfc: optimize RXQs count and affinities In sfc driver one RX queue per physical core was allocated by default. Later on, IRQ affinities were set spreading the IRQs in all NUMA local CPUs. However, with that default configuration it result in a non very optimal configuration in many modern systems. Specifically, in systems with hyper threading and 2 NUMA nodes, affinities are set in a way that IRQs are handled by all logical cores of one same NUMA node. Handling IRQs from both hyper threading siblings has no benefit, and setting affinities to one queue per physical core is neither a very good idea because there is a performance penalty for moving data across nodes (I was able to check it with some XDP tests using pktgen). This patches reduce the default number of channels to one per physical core in the local NUMA node. Then, they set IRQ affinities to CPUs in the local NUMA node only. This way we save hardware resources since channels are limited resources. We also leave more room for XDP_TX channels without hitting driver's limit of 32 channels per interface. Running performance tests using iperf with a SFC9140 device showed no performance penalty for reducing the number of channels. RX XDP tests showed that performance can go down to less than half if the IRQ is handled by a CPU in a different NUMA node, which doesn't happen with the new defaults from this patches. ==================== Link: https://lore.kernel.org/r/20220228132254.25787-1-ihuguet@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-02sfc: set affinity hints in local NUMA node onlyÍñigo Huguet1-2/+10
Affinity hints were being set to CPUs in local NUMA node first, and then in other CPUs. This was creating 2 unintended issues: 1. Channels created to be assigned each to a different physical core were assigned to hyperthreading siblings because of being in same NUMA node. Since the patch previous to this one, this did not longer happen with default rss_cpus modparam because less channels are created. 2. XDP channels could be assigned to CPUs in different NUMA nodes, decreasing performance too much (to less than half in some of my tests). This patch sets the affinity hints spreading the channels only in local NUMA node's CPUs. A fallback for the case that no CPU in local NUMA node is online has been added too. Example of CPUs being assigned in a non optimal way before this and the previous patch (note: in this system, xdp-8 to xdp-15 are created because num_possible_cpus == 64, but num_present_cpus == 32 so they're never used): $ lscpu | grep -i numa NUMA node(s): 2 NUMA node0 CPU(s): 0-7,16-23 NUMA node1 CPU(s): 8-15,24-31 $ grep -H . /proc/irq/*/0000:07:00.0*/../smp_affinity_list /proc/irq/141/0000:07:00.0-0/../smp_affinity_list:0 /proc/irq/142/0000:07:00.0-1/../smp_affinity_list:1 /proc/irq/143/0000:07:00.0-2/../smp_affinity_list:2 /proc/irq/144/0000:07:00.0-3/../smp_affinity_list:3 /proc/irq/145/0000:07:00.0-4/../smp_affinity_list:4 /proc/irq/146/0000:07:00.0-5/../smp_affinity_list:5 /proc/irq/147/0000:07:00.0-6/../smp_affinity_list:6 /proc/irq/148/0000:07:00.0-7/../smp_affinity_list:7 /proc/irq/149/0000:07:00.0-8/../smp_affinity_list:16 /proc/irq/150/0000:07:00.0-9/../smp_affinity_list:17 /proc/irq/151/0000:07:00.0-10/../smp_affinity_list:18 /proc/irq/152/0000:07:00.0-11/../smp_affinity_list:19 /proc/irq/153/0000:07:00.0-12/../smp_affinity_list:20 /proc/irq/154/0000:07:00.0-13/../smp_affinity_list:21 /proc/irq/155/0000:07:00.0-14/../smp_affinity_list:22 /proc/irq/156/0000:07:00.0-15/../smp_affinity_list:23 /proc/irq/157/0000:07:00.0-xdp-0/../smp_affinity_list:8 /proc/irq/158/0000:07:00.0-xdp-1/../smp_affinity_list:9 /proc/irq/159/0000:07:00.0-xdp-2/../smp_affinity_list:10 /proc/irq/160/0000:07:00.0-xdp-3/../smp_affinity_list:11 /proc/irq/161/0000:07:00.0-xdp-4/../smp_affinity_list:12 /proc/irq/162/0000:07:00.0-xdp-5/../smp_affinity_list:13 /proc/irq/163/0000:07:00.0-xdp-6/../smp_affinity_list:14 /proc/irq/164/0000:07:00.0-xdp-7/../smp_affinity_list:15 /proc/irq/165/0000:07:00.0-xdp-8/../smp_affinity_list:24 /proc/irq/166/0000:07:00.0-xdp-9/../smp_affinity_list:25 /proc/irq/167/0000:07:00.0-xdp-10/../smp_affinity_list:26 /proc/irq/168/0000:07:00.0-xdp-11/../smp_affinity_list:27 /proc/irq/169/0000:07:00.0-xdp-12/../smp_affinity_list:28 /proc/irq/170/0000:07:00.0-xdp-13/../smp_affinity_list:29 /proc/irq/171/0000:07:00.0-xdp-14/../smp_affinity_list:30 /proc/irq/172/0000:07:00.0-xdp-15/../smp_affinity_list:31 CPUs assignments after this and previous patch, so normal channels created only one per core in NUMA node and affinities set only to local NUMA node: $ grep -H . /proc/irq/*/0000:07:00.0*/../smp_affinity_list /proc/irq/116/0000:07:00.0-0/../smp_affinity_list:0 /proc/irq/117/0000:07:00.0-1/../smp_affinity_list:1 /proc/irq/118/0000:07:00.0-2/../smp_affinity_list:2 /proc/irq/119/0000:07:00.0-3/../smp_affinity_list:3 /proc/irq/120/0000:07:00.0-4/../smp_affinity_list:4 /proc/irq/121/0000:07:00.0-5/../smp_affinity_list:5 /proc/irq/122/0000:07:00.0-6/../smp_affinity_list:6 /proc/irq/123/0000:07:00.0-7/../smp_affinity_list:7 /proc/irq/124/0000:07:00.0-xdp-0/../smp_affinity_list:16 /proc/irq/125/0000:07:00.0-xdp-1/../smp_affinity_list:17 /proc/irq/126/0000:07:00.0-xdp-2/../smp_affinity_list:18 /proc/irq/127/0000:07:00.0-xdp-3/../smp_affinity_list:19 /proc/irq/128/0000:07:00.0-xdp-4/../smp_affinity_list:20 /proc/irq/129/0000:07:00.0-xdp-5/../smp_affinity_list:21 /proc/irq/130/0000:07:00.0-xdp-6/../smp_affinity_list:22 /proc/irq/131/0000:07:00.0-xdp-7/../smp_affinity_list:23 /proc/irq/132/0000:07:00.0-xdp-8/../smp_affinity_list:0 /proc/irq/133/0000:07:00.0-xdp-9/../smp_affinity_list:1 /proc/irq/134/0000:07:00.0-xdp-10/../smp_affinity_list:2 /proc/irq/135/0000:07:00.0-xdp-11/../smp_affinity_list:3 /proc/irq/136/0000:07:00.0-xdp-12/../smp_affinity_list:4 /proc/irq/137/0000:07:00.0-xdp-13/../smp_affinity_list:5 /proc/irq/138/0000:07:00.0-xdp-14/../smp_affinity_list:6 /proc/irq/139/0000:07:00.0-xdp-15/../smp_affinity_list:7 Signed-off-by: Íñigo Huguet <ihuguet@redhat.com> Acked-by: Martin Habets <habetsm.xilinx@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-02sfc: default config to 1 channel/core in local NUMA node onlyÍñigo Huguet1-17/+34
Handling channels from CPUs in different NUMA node can penalize performance, so better configure only one channel per core in the same NUMA node than the NIC, and not per each core in the system. Fallback to all other online cores if there are not online CPUs in local NUMA node. Signed-off-by: Íñigo Huguet <ihuguet@redhat.com> Acked-by: Martin Habets <habetsm.xilinx@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-02net: smc: fix different types in min()Jakub Kicinski1-2/+2
Fix build: include/linux/minmax.h:45:25: note: in expansion of macro ‘__careful_cmp’ 45 | #define min(x, y) __careful_cmp(x, y, <) | ^~~~~~~~~~~~~ net/smc/smc_tx.c:150:24: note: in expansion of macro ‘min’ 150 | corking_size = min(sock_net(&smc->sk)->smc.sysctl_autocorking_size, | ^~~ Fixes: 12bbb0d163a9 ("net/smc: add sysctl for autocorking") Link: https://lore.kernel.org/r/20220301222446.1271127-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-02Bluetooth: Fix not checking MGMT cmd pending queueBrian Gix1-36/+63
A number of places in the MGMT handlers we examine the command queue for other commands (in progress but not yet complete) that will interact with the process being performed. However, not all commands go into the queue if one of: 1. There is no negative side effect of consecutive or redundent commands 2. The command is entirely perform "inline". This change examines each "pending command" check, and if it is not needed, deletes the check. Of the remaining pending command checks, we make sure that the command is in the pending queue by using the mgmt_pending_add/mgmt_pending_remove pair rather than the mgmt_pending_new/mgmt_pending_free pair. Link: https://lore.kernel.org/linux-bluetooth/f648f2e11bb3c2974c32e605a85ac3a9fac944f1.camel@redhat.com/T/ Tested-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Brian Gix <brian.gix@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-03-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nfJakub Kicinski13-20/+226
Pablo Neira Ayuso says: ==================== Netfilter fixes for net 1) Use kfree_rcu(ptr, rcu) variant, using kfree_rcu(ptr) was not intentional. From Eric Dumazet. 2) Use-after-free in netfilter hook core, from Eric Dumazet. 3) Missing rcu read lock side for netfilter egress hook, from Florian Westphal. 4) nf_queue assume state->sk is full socket while it might not be. Invoke sock_gen_put(), from Florian Westphal. 5) Add selftest to exercise the reported KASAN splat in 4) 6) Fix possible use-after-free in nf_queue in case sk_refcnt is 0. Also from Florian. 7) Use input interface index only for hardware offload, not for the software plane. This breaks tc ct action. Patch from Paul Blakey. * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: net/sched: act_ct: Fix flow table lookup failure with no originating ifindex netfilter: nf_queue: handle socket prefetch netfilter: nf_queue: fix possible use-after-free selftests: netfilter: add nfqueue TCP_NEW_SYN_RECV socket race test netfilter: nf_queue: don't assume sk is full socket netfilter: egress: silence egress hook lockdep splats netfilter: fix use-after-free in __nf_register_net_hook() netfilter: nf_tables: prefer kfree_rcu(ptr, rcu) variant ==================== Link: https://lore.kernel.org/r/20220301215337.378405-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-02net/sched: act_ct: Fix flow table lookup failure with no originating ifindexPaul Blakey3-6/+19
After cited commit optimizted hw insertion, flow table entries are populated with ifindex information which was intended to only be used for HW offload. This tuple ifindex is hashed in the flow table key, so it must be filled for lookup to be successful. But tuple ifindex is only relevant for the netfilter flowtables (nft), so it's not filled in act_ct flow table lookup, resulting in lookup failure, and no SW offload and no offload teardown for TCP connection FIN/RST packets. To fix this, add new tc ifindex field to tuple, which will only be used for offloading, not for lookup, as it will not be part of the tuple hash. Fixes: 9795ded7f924 ("net/sched: act_ct: Fill offloading tuple iifidx") Signed-off-by: Paul Blakey <paulb@nvidia.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>