summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2023-05-16ice: track interrupt vectors with xarrayPiotr Raczynski6-84/+89
Replace custom interrupt tracker with generic xarray data structure. Remove all code responsible for searching for a new entry with xa_alloc, which always tries to allocate at the lowes possible index. As a result driver is always using a contiguous region of the MSIX vector table. New tracker keeps ice_irq_entry entries in xarray as opaque for the rest of the driver hiding the entry details from the caller. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Piotr Raczynski <piotr.raczynski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-05-16ice: add individual interrupt allocationPiotr Raczynski13-282/+165
Currently interrupt allocations, depending on a feature are distributed in batches. Also, after allocation there is a series of operations that distributes per irq settings through that batch of interrupts. Although driver does not yet support dynamic interrupt allocation, keep allocated interrupts in a pool and add allocation abstraction logic to make code more flexible. Keep per interrupt information in the ice_q_vector structure, which yields ice_vsi::base_vector redundant. Also, as a result there are a few functions that can be removed. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Piotr Raczynski <piotr.raczynski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-05-16ice: remove redundant SRIOV codePiotr Raczynski1-36/+0
Remove redundant code from ice_get_max_valid_res_idx that has no effect. ice_pf::irq_tracker is initialized during driver probe, there is no reason to check it again. Also it is not possible for pf::sriov_base_vector to be lower than the tracker length, remove WARN_ON that will never happen. Get rid of ice_get_max_valid_res_idx helper function completely since it can never return negative value. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Piotr Raczynski <piotr.raczynski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-05-16ice: refactor VF control VSI interrupt handlingPiotr Raczynski3-66/+58
All VF control VSIs share the same interrupt vector. Currently, a helper function dedicated for that directly sets ice_vsi::base_vector. Use helper that returns pointer to first found VF control VSI instead. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Piotr Raczynski <piotr.raczynski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-05-16ice: use preferred MSIX allocation apiPiotr Raczynski3-38/+37
Move away from using pci_enable_msix_range/pci_disable_msix and use pci_alloc_irq_vectors/pci_free_irq_vectors instead. As a result stop tracking msix_entries since with newer API entries are handled by MSIX core. However, due to current design of communication with RDMA driver which accesses ice_pf::msix_entries directly, keep using the array just for RDMA driver use. Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Piotr Raczynski <piotr.raczynski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-05-16ice: use pci_irq_vector helper functionPiotr Raczynski5-11/+11
Currently, driver gets interrupt number directly from ice_pf::msix_entries array. Use helper function dedicated to do just that. While at it use a variable to store interrupt number in ice_free_irq_msix_misc instead of calling the helper function twice. Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Piotr Raczynski <piotr.raczynski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-05-16ice: move interrupt related code to separate filePiotr Raczynski5-218/+238
Keep interrupt handling code in a dedicated file. This helps keep driver structured better and prepares for more functionality added to this file. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Piotr Raczynski <piotr.raczynski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-05-16Merge branch 'spdx-conversion-for-bonding-8390-and-i825xx-drivers'Paolo Abeni22-61/+33
Bagas Sanjaya says: ==================== SPDX conversion for bonding, 8390, and i825xx drivers This series is SPDX conversion for bonding, 8390, and i825xx driver subsystems. It is splitted from v2 of my SPDX conversion series in response to Didi's GPL full name fixes [1] to make it easily digestible. The conversion in this series is divided by each subsystem and by license type. [1]: https://lore.kernel.org/linux-spdx/20230512100620.36807-1-bagasdotme@gmail.com/ ==================== Link: https://lore.kernel.org/r/20230515060714.621952-1-bagasdotme@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-16net: ethernet: i825xx: sun3_8256: Add SPDX license identifierBagas Sanjaya2-0/+2
The boilerplate reads that sun3_8256 driver is an extension to Linux kernel core, hence add SPDX license identifier for GPL 2.0. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Michael Hipp <hippm@informatik.uni-tuebingen.de> Cc: Sam Creasey <sammy@sammy.net> Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-16net: ethernet: i825xx: Replace unversioned GPL (GPL 1.0) notice with SPDX ↵Bagas Sanjaya3-9/+6
identifier Replace unversioned GPL boilerplate notice with corresponding SPDX license identifier, which is GPL 1.0+. Cc: Donald Becker <becker@scyld.com> Cc: Richard Hirst <richard@sleepie.demon.co.uk> Cc: Sam Creasey <sammy@sammy.net> Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-16net: ethernet: 8390: Replace GPL 2.0 boilerplate with SPDX identifierBagas Sanjaya5-23/+6
The boilerplate refers to COPYING in the top-level directory of kernel tree. Replace it with corresponding SPDX license identifier. Cc: Donald Becker <becker@scyld.com> Cc: Peter De Schrijver <p2@mind.be> Cc: Topi Kanerva <topi@susanna.oulu.fi> Cc: Alain Malek <Alain.Malek@cryogen.com> Cc: Bruce Abbott <bhabbott@inhb.co.nz> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Richard Fontana <rfontana@redhat.com> Acked-by: Greg Ungerer <gerg@linux-m68k.org> Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-16net: ethernet: 8390: Convert unversioned GPL notice to SPDX license identifierBagas Sanjaya9-22/+15
Replace boilerplate notice for unversioned GPL to SPDX tag for GPL 1.0+. For ne2k-pci.c, only add SPDX tag and keep the boilerplate instead, since the boilerplate notes that it must be preserved. Cc: David A. Hinds <dahinds@users.sourceforge.net> Cc: Donald Becker <becker@scyld.com> Cc: Alan Cox <alan@linux.intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Richard Fontana <rfontana@redhat.com> Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-16net: bonding: Add SPDX identifier to remaining filesBagas Sanjaya3-7/+4
Previous batches of SPDX conversion missed bond_main.c and bonding_priv.h because these files doesn't mention intended GPL version. Add SPDX identifier to these files, assuming GPL 1.0+. Cc: Thomas Davis <tadavis@lbl.gov> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-16net: skbuff: update comment about pfmemalloc propagatingYunsheng Lin1-5/+5
__skb_fill_page_desc_noacc() is not doing any pfmemalloc propagating, and yet it has a comment about that, commit 84ce071e38a6 ("net: introduce __skb_fill_page_desc_noacc") may have accidentally moved it to __skb_fill_page_desc_noacc(), so move it back to __skb_fill_page_desc() which is supposed to be doing pfmemalloc propagating. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> CC: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/20230515050107.46397-1-linyunsheng@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-15nfc: llcp: fix possible use of uninitialized variable in nfc_llcp_send_connect()Krzysztof Kozlowski1-1/+2
If sock->service_name is NULL, the local variable service_name_tlv_length will not be assigned by nfc_llcp_build_tlv(), later leading to using value frmo the stack. Smatch warning: net/nfc/llcp_commands.c:442 nfc_llcp_send_connect() error: uninitialized symbol 'service_name_tlv_length'. Fixes: de9e5aeb4f40 ("NFC: llcp: Fix usage of llcp_add_tlv()") Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-15octeontx2-pf: mcs: Remove unneeded semicolonYang Li1-2/+2
./drivers/net/ethernet/marvell/octeontx2/nic/cn10k_macsec.c:242:2-3: Unneeded semicolon ./drivers/net/ethernet/marvell/octeontx2/nic/cn10k_macsec.c:476:2-3: Unneeded semicolon Reported-by: Abaci Robot <abaci@linux.alibaba.com> Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4947 Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-15net: ethernet: microchip: vcap: Remove extra semicolonAnup Sharma1-4/+4
Remove the extra semicolon at end. Issue identified using semicolon.cocci Coccinelle semantic patch. drivers/net/ethernet/microchip/vcap/vcap_api.c:1124:3-4: Unneeded semicolon drivers/net/ethernet/microchip/vcap/vcap_api.c:1165:3-4: Unneeded semicolon drivers/net/ethernet/microchip/vcap/vcap_api.c:1239:3-4: Unneeded semicolon drivers/net/ethernet/microchip/vcap/vcap_api.c:1287:3-4: Unneeded semicolon Signed-off-by: Anup Sharma <anupnewsmail@gmail.com> Changes: V1 -> V2: Target tree included in the subject line. Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-15Merge branch 'octeontx2-pf-HTB'David S. Miller20-127/+2144
Hariprasad Kelam says: ==================== octeontx2-pf: HTB offload support octeontx2 silicon and CN10K transmit interface consists of five transmit levels starting from MDQ, TL4 to TL1. Once packets are submitted to MDQ, hardware picks all active MDQs using strict priority, and MDQs having the same priority level are chosen using round robin. Each packet will traverse MDQ, TL4 to TL1 levels. Each level contains an array of queues to support scheduling and shaping. As HTB supports classful queuing mechanism by supporting rate and ceil and allow the user to control the absolute bandwidth to particular classes of traffic the same can be achieved by configuring shapers and schedulers on different transmit levels. This series of patches adds support for HTB offload, Patch1: Allow strict priority parameter in HTB offload mode. Patch2: Rename existing total tx queues for better readability Patch3: defines APIs such that the driver can dynamically initialize/ deinitialize the send queues. Patch4: Refactors transmit alloc/free calls as preparation for QOS offload code. Patch5: moves rate limiting logic to common header which will be used by qos offload code. Patch6: Adds actual HTB offload support. Patch7: exposes qos send queue stats over ethtool. Patch8: Add documentation about htb offload flow in driver ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-15docs: octeontx2: Add Documentation for QOSHariprasad Kelam1-0/+45
Add QOS example configuration along with tc-htb commands Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-15octeontx2-pf: ethtool expose qos statsHariprasad Kelam1-9/+20
This patch extends ethtool stats support for QoS send queues as well. upon the number of transmit channels change request, Ensures the real number of transmit queues are equal to active QoS send queues plus configured transmit queues. ethtool -S eth0 txq_qos0: bytes: 3021391800 txq_qos0: frames: 1998275 txq_qos1: bytes: 4619766312 txq_qos1: frames: 3055401 ... ... Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-15octeontx2-pf: Add support for HTB offloadNaveen Mamindlapalli10-14/+1507
This patch registers callbacks to support HTB offload. Below are features supported, - supports traffic shaping on the given class by honoring rate and ceil configuration. - supports traffic scheduling, which prioritizes different types of traffic based on strict priority values. - supports the creation of leaf to inner classes such that parent node rate limits apply to all child nodes. Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-15octeontx2-pf: Prepare for QOS offloadHariprasad Kelam3-20/+43
This patch moves rate limiting definitions to a common header file and adds csr definitions required for QOS code. Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-15octeontx2-pf: Refactor schedular queue alloc/free callsHariprasad Kelam5-38/+94
1. Upon txschq free request, the transmit schedular config in hardware is not getting reset. This patch adds necessary changes to do the same. 2. Current implementation calls txschq alloc during interface initialization and in response handler updates the default txschq array. This creates a problem for htb offload where txsch alloc will be called for every tc class. This patch addresses the issue by reading txschq response in mbox caller function instead in the response handler. 3. Current otx2_txschq_stop routine tries to free all txschq nodes allocated to the interface. This creates a problem for htb offload. This patch introduces the otx2_txschq_free_one to free txschq in a given level. Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-15octeontx2-pf: qos send queues managementSubbaraya Sundeep10-42/+426
Current implementation is such that the number of Send queues (SQs) are decided on the device probe which is equal to the number of online cpus. These SQs are allocated and deallocated in interface open and c lose calls respectively. This patch defines new APIs for initializing and deinitializing Send queues dynamically and allocates more number of transmit queues for QOS feature. Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-15octeontx2-pf: Rename tot_tx_queues to non_qos_queuesHariprasad Kelam4-15/+15
current implementation is such that tot_tx_queues contains both xdp queues and normal tx queues. which will be allocated in interface open calls and deallocated on interface down calls respectively. With addition of QOS, where send quees are allocated/deallacated upon user request Qos send queues won't be part of tot_tx_queues. So this patch renames tot_tx_queues to non_qos_queues. Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-15sch_htb: Allow HTB priority parameter in offload modeNaveen Mamindlapalli3-5/+10
The current implementation of HTB offload returns the EINVAL error for unsupported parameters like prio and quantum. This patch removes the error returning checks for 'prio' parameter and populates its value to tc_htb_qopt_offload structure such that driver can use the same. Add prio parameter check in mlx5 driver, as mlx5 devices are not capable of supporting the prio parameter when htb offload is used. Report error if prio parameter is set to a non-default value. Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Co-developed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-15net: Remove low_thresh in ip defragAngus Chen6-24/+18
As low_thresh has no work in fragment reassembles,del it. And Mark it deprecated in sysctl Document. Signed-off-by: Angus Chen <angus.chen@jaguarmicro.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-15Merge tag 'wireless-next-2023-05-12' of ↵David S. Miller43-300/+3488
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Kalle valo says: ==================== wireless-next patches for v6.5 The first pull request for v6.5 and only driver changes this time. rtl8xxxu has been making lots of progress lately and now has AP mode support. Major changes: rtl8xxxu * AP mode support, initially only for rtl8188f rtw89 * provide RSSI, EVN and SNR statistics via debugfs * support U-NII-4 channels on 5 GHz band ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-13sfc: fix use-after-free in efx_tc_flower_record_encap_match()Edward Cree1-2/+2
When writing error messages to extack for pseudo collisions, we can't use encap->type as encap has already been freed. Fortunately the same value is stored in local variable em_type, so use that instead. Fixes: 3c9561c0a5b9 ("sfc: support TC decap rules matching on enc_ip_tos") Reported-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-13net: phylink: constify fwnode argumentsRussell King (Oracle)2-9/+11
Both phylink_create() and phylink_fwnode_phy_connect() do not modify the fwnode argument that they are passed, so lets constify these. Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-13net: fec: using the standard return codes when xdp xmit errorsShenwei Wang1-3/+3
This patch standardizes the inconsistent return values for unsuccessful XDP transmits by using standardized error codes (-EBUSY or -ENOMEM). Signed-off-by: Shenwei Wang <shenwei.wang@nxp.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-13net: macb: Shorten max_tx_len to 4KiB - 56 on mpfsDaire McNamara2-3/+10
On mpfs, with SRAM configured for 4 queues, setting max_tx_len to GEM_TX_MAX_LEN=0x3f0 results multiple AMBA errors. Setting max_tx_len to (4KiB - 56) removes those errors. The details are described in erratum 1686 by Cadence The max jumbo frame size is also reduced for mpfs to (4KiB - 56). Signed-off-by: Daire McNamara <daire.mcnamara@microchip.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-13ping: Convert hlist_nulls to plain hlist.Kuniyuki Iwashima1-26/+15
Since introduced in commit c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind"), ping socket does not use SLAB_TYPESAFE_BY_RCU nor check nulls marker in loops. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-13Merge branch 'skb_frag_fill_page_desc'David S. Miller20-109/+65
Yunsheng Lin says: ==================== net: introduce skb_frag_fill_page_desc() Most users use __skb_frag_set_page()/skb_frag_off_set()/ skb_frag_size_set() to fill the page desc for a skb frag. It does not make much sense to calling __skb_frag_set_page() without calling skb_frag_off_set(), as the offset may depend on whether the page is head page or tail page, so add skb_frag_fill_page_desc() to fill the page desc for a skb frag. In the future, we can make sure the page in the frag is head page of compound page or a base page, if not, we may warn about that and convert the tail page to head page and update the offset accordingly, if we see a warning about that, we also fix the caller to fill the head page in the frag. when the fixing is done, we may remove the warning and converting. In this way, we can remove the compound_head() or use page_ref_*() like the below case: https://elixir.bootlin.com/linux/latest/source/net/core/page_pool.c#L881 https://elixir.bootlin.com/linux/latest/source/include/linux/skbuff.h#L3383 It may also convert net stack to use the folio easier. V1: repost with all the ack/review tags included. RFC: remove a local variable as pointed out by Simon. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-13net: remove __skb_frag_set_page()Yunsheng Lin3-17/+1
The remaining users calling __skb_frag_set_page() with page being NULL seems to be doing defensive programming, as shinfo->nr_frags is already decremented, so remove them. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-13net: introduce and use skb_frag_fill_page_desc()Yunsheng Lin19-92/+64
Most users use __skb_frag_set_page()/skb_frag_off_set()/ skb_frag_size_set() to fill the page desc for a skb frag. Introduce skb_frag_fill_page_desc() to do that. net/bpf/test_run.c does not call skb_frag_off_set() to set the offset, "copy_from_user(page_address(page), ...)" and 'shinfo' being part of the 'data' kzalloced in bpf_test_init() suggest that it is assuming offset to be initialized as zero, so call skb_frag_fill_page_desc() with offset being zero for this case. Also, skb_frag_set_page() is not used anymore, so remove it. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-13selftests: net: vxlan: Add tests for vxlan nolocalbypass option.Vladimir Nikishkin2-0/+241
Add test to make sure that the localbypass option is on by default. Add test to change vxlan localbypass to nolocalbypass and check that packets are delivered to userspace. Signed-off-by: Vladimir Nikishkin <vladimir@nikishkin.pw> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-13net: vxlan: Add nolocalbypass option to vxlan.Vladimir Nikishkin3-3/+23
If a packet needs to be encapsulated towards a local destination IP, the packet will undergo a "local bypass" and be injected into the Rx path as if it was received by the target VXLAN device without undergoing encapsulation. If such a device does not exist, the packet will be dropped. There are scenarios where we do not want to perform such a bypass, but instead want the packet to be encapsulated and locally received by a user space program for post-processing. To that end, add a new VXLAN device attribute that controls whether a "local bypass" is performed or not. Default to performing a bypass to maintain existing behavior. Signed-off-by: Vladimir Nikishkin <vladimir@nikishkin.pw> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-13Merge branch 'broadcom-phy-wol'David S. Miller7-5/+416
Florian Fainelli says: ==================== Support for Wake-on-LAN for Broadcom PHYs This patch series adds support for Wake-on-LAN to the Broadcom PHY driver. Specifically the BCM54210E/B50212E are capable of supporting Wake-on-LAN using an external pin typically wired up to a system's GPIO. These PHY operate a programmable Ethernet MAC destination address comparator which will fire up an interrupt whenever a match is received. Because of that, it was necessary to introduce patch #1 which allows the PHY driver's ->suspend() routine to be called unconditionally. This is necessary in our case because we need a hook point into the device suspend/resume flow to enable the wake-up interrupt as late as possible. Patch #2 adds support for the Broadcom PHY library and driver for Wake-on-LAN proper with the WAKE_UCAST, WAKE_MCAST, WAKE_BCAST, WAKE_MAGIC and WAKE_MAGICSECURE. Note that WAKE_FILTER is supportable, however this will require further discussions and be submitted as a RFC series later on. Patch #3 updates the GENET driver to defer to the PHY for Wake-on-LAN if the PHY supports it, thus allowing the MAC to be powered down to conserve power. Changes in v3: - collected Reviewed-by tags - explicitly use return 0 in bcm54xx_phy_probe() (Paolo) Changes in v2: - introduce PHY_ALWAYS_CALL_SUSPEND and only have the Broadcom PHY driver set this flag to minimize changes to the suspend flow to only drivers that need it - corrected possibly uninitialized variable in bcm54xx_set_wakeup_irq (Simon) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-13net: bcmgenet: Add support for PHY-based Wake-on-LANFlorian Fainelli1-0/+14
If available, interrogate the PHY to find out whether we can use it for Wake-on-LAN. This can be a more power efficient way of implementing that feature, especially when the MAC is powered off in low power states. Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-13net: phy: broadcom: Add support for Wake-on-LANFlorian Fainelli4-3/+395
Add support for WAKE_UCAST, WAKE_MCAST, WAKE_BCAST, WAKE_MAGIC and WAKE_MAGICSECURE. This is only supported with the BCM54210E and compatible Ethernet PHYs. Using the in-band interrupt or an out of band GPIO interrupts are supported. Broadcom PHYs will generate a Wake-on-LAN level low interrupt on LED4 as soon as one of the supported patterns is being matched. That includes generating such an interrupt even if the PHY is operated during normal modes. If WAKE_UCAST is selected, this could lead to the LED4 interrupt firing up for every packet being received which is absolutely undesirable from a performance point of view. Because the Wake-on-LAN configuration can be set long before the system is actually put to sleep, we cannot have an interrupt service routine to clear on read the interrupt status register and ensure that new packet matches will be detected. It is desirable to enable the Wake-on-LAN interrupt as late as possible during the system suspend process such that we limit the number of interrupts to be handled by the system, but also conversely feed into the Linux's system suspend way of dealing with interrupts in and around the points of no return. Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-13net: phy: Allow drivers to always call into ->suspend()Florian Fainelli2-2/+7
A few PHY drivers are currently attempting to not suspend the PHY when Wake-on-LAN is enabled, however that code is not currently executing at all due to an early check in phy_suspend(). This prevents PHY drivers from making an appropriate decisions and put the hardware into a low power state if desired. In order to allow the PHY drivers to opt into getting their ->suspend routine to be called, add a PHY_ALWAYS_CALL_SUSPEND bit which can be set. A boolean that tracks whether the PHY or the attached MAC has Wake-on-LAN enabled is also provided for convenience. If phydev::wol_enabled then the PHY shall not prevent its own Wake-on-LAN detection logic from working and shall not prevent the Ethernet MAC from receiving packets for matching. Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-12Merge branch 'sfc-decap'David S. Miller4-64/+197
Edward Cree says: ==================== sfc: more flexible encap matches on TC decap rules This series extends the TC offload support on EF100 to support optionally matching on the IP ToS and UDP source port of the outer header in rules performing tunnel decapsulation. Both of these fields allow masked matches if the underlying hardware supports it (current EF100 hardware supports masking on ToS, but only exact-match on source port). Given that the source port is typically populated from a hash of inner header entropy, it's not clear whether filtering on it is useful, but since we can support it we may as well expose the capability. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-12sfc: support TC decap rules matching on enc_src_portEdward Cree4-16/+41
Allow efx_tc_encap_match entries to include a udp_sport and a udp_sport_mask. As with enc_ip_tos, use pseudos to enforce that all encap matches within a given <src_ip,dst_ip,udp_dport> tuple have the same udp_sport_mask. Note that since we use a single layer of pseudos for both fields, two matches that differ in (say) udp_sport value aren't permitted to have different ip_tos_mask, even though this would technically be safe. Current userland TC does not support setting enc_src_port; this patch was tested with an iproute2 patched to support it. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-12sfc: support TC decap rules matching on enc_ip_tosEdward Cree2-36/+133
Allow efx_tc_encap_match entries to include an ip_tos and ip_tos_mask. To avoid partially-overlapping Outer Rules (which can lead to undefined behaviour in the hardware), store extra "pseudo" entries in our encap_match hashtable, which are used to enforce that all Outer Rule entries within a given <src_ip,dst_ip,udp_dport> tuple (or IPv6 equivalent) have the same ip_tos_mask. The "direct" encap_match entry takes a reference on the "pseudo", allowing it to be destroyed when all "direct" entries using it are removed. efx_tc_em_pseudo_type is an enum rather than just a bool because in future an additional pseudo-type will be added to support Conntrack offload. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-12sfc: populate enc_ip_tos matches in MAE outer rulesEdward Cree4-7/+20
Currently tc.c will block them before they get here, but following patch will change that. Use the extack message from efx_mae_check_encap_match_caps() instead of writing a new one, since there's now more being fed in than just an IP version. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-12sfc: release encap match in efx_tc_flow_free()Edward Cree1-17/+15
When force-freeing leftover entries from our match_action_ht, call efx_tc_delete_rule(), which releases all the rule's resources, rather than open-coding it. The open-coded version was missing a call to release the rule's encap match (if any). It probably doesn't matter as everything's being torn down anyway, but it's cleaner this way and prevents further error messages potentially being logged by efx_tc_encap_match_free() later on. Move efx_tc_flow_free() further down the file to avoid introducing a forward declaration of efx_tc_delete_rule(). Fixes: 17654d84b47c ("sfc: add offloading of 'foreign' TC (decap) rules") Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-12net: liquidio: lio_main: Remove unnecessary (void*) conversionswuych1-12/+6
Pointer variables of void * type do not require type cast. Signed-off-by: wuych <yunchuan@nfschina.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-12sctp: add bpf_bypass_getsockopt proto callbackAlexander Mikhalitsyn1-0/+18
Implement ->bpf_bypass_getsockopt proto callback and filter out SCTP_SOCKOPT_PEELOFF, SCTP_SOCKOPT_PEELOFF_FLAGS and SCTP_SOCKOPT_CONNECTX3 socket options from running eBPF hook on them. SCTP_SOCKOPT_PEELOFF and SCTP_SOCKOPT_PEELOFF_FLAGS options do fd_install(), and if BPF_CGROUP_RUN_PROG_GETSOCKOPT hook returns an error after success of the original handler sctp_getsockopt(...), userspace will receive an error from getsockopt syscall and will be not aware that fd was successfully installed into a fdtable. As pointed by Marcelo Ricardo Leitner it seems reasonable to skip bpf getsockopt hook for SCTP_SOCKOPT_CONNECTX3 sockopt too. Because internaly, it triggers connect() and if error is masked then userspace will be confused. This patch was born as a result of discussion around a new SCM_PIDFD interface: https://lore.kernel.org/all/20230413133355.350571-3-aleksandr.mikhalitsyn@canonical.com/ Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt hooks") Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Christian Brauner <brauner@kernel.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Cc: Xin Long <lucien.xin@gmail.com> Cc: linux-sctp@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: netdev@vger.kernel.org Suggested-by: Stanislav Fomichev <sdf@google.com> Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Acked-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-12Merge branch 'selftests-fcnal'David S. Miller2-1/+132
Guillaume Nault says: ==================== selftests: fcnal: Test SO_DONTROUTE socket option. The objective is to cover kernel paths that use the RTO_ONLINK flag in .flowi4_tos. This way we'll be able to safely remove this flag in the future by properly setting .flowi4_scope instead. With these selftests in place, we can make sure this won't introduce regressions. For more context, the final objective is to convert .flowi4_tos to dscp_t, to ensure that ECN bits don't influence route and fib-rule lookups (see commit a410a0cf9885 ("ipv6: Define dscp_t and stop taking ECN bits into account in fib6-rules")). These selftests only cover IPv4, as SO_DONTROUTE has no effect on IPv6 sockets. v2: - Use two different nettest options for setting SO_DONTROUTE either on the server or on the client socket. - Use the above feature to run a single 'nettest -B' instance per test (instead of having two nettest processes for server and client). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>