path: root/drivers/net/ethernet/intel
2023-06-10  Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue  (Jakub Kicinski; 2 files, -7/+5)
Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2023-06-08 (ice) This series contains updates to the ice driver only. Simon Horman stops a NULL pointer dereference in the GNSS error path. Kamil fixes a memory leak when the interface is brought down while XDP is enabled. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: ice: Fix XDP memory leak when NIC is brought up and down ice: Don't dereference NULL in ice_gnss_read error path ==================== Link: https://lore.kernel.org/r/20230608200051.451752-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10  iavf: remove mask from iavf_irq_enable_queues()  (Ahmed Zaki; 3 files, -11/+8)
Enable more than 32 IRQs by removing the u32 bit mask in iavf_irq_enable_queues(). There is no need for the mask as there are no callers that select individual IRQs through the bitmask. Also, if the PF allocates more than 32 IRQs, this mask will prevent us from using all of them. Modify the comment in iavf_register.h to show that the maximum number allowed for the IRQ index is 63 as per the iAVF standard 1.0 [1]. link: [1] https://www.intel.com/content/dam/www/public/us/en/documents/product-specifications/ethernet-adaptive-virtual-function-hardware-spec.pdf Fixes: 5eae00c57f5e ("i40evf: main driver core") Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20230608200226.451861-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
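For illustration, the enable routine without the bitmask argument could look roughly like the sketch below. The register and mask names are assumed to follow the existing iavf register header; treat the exact body as a sketch, not the literal patch.

    void iavf_irq_enable_queues(struct iavf_adapter *adapter)
    {
            struct iavf_hw *hw = &adapter->hw;
            int i;

            /* vector 0 is the misc interrupt; queue vectors start at 1,
             * and no caller ever selected individual IRQs via a mask
             */
            for (i = 1; i < adapter->num_msix_vectors; i++)
                    wr32(hw, IAVF_VFINT_DYN_CTLN1(i - 1),
                         IAVF_VFINT_DYN_CTLN1_INTENA_MASK |
                         IAVF_VFINT_DYN_CTLN1_ITR_INDX_MASK);
    }

With the u32 mask gone, the number of usable queue vectors is no longer capped at 32.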
2023-06-09  igb: Fix extts capture value format for 82580/i354/i350  (Yuezhen Luan; 1 file, -2/+6)
The 82580/i354/i350 feature cycle-counter-like timestamp registers, unlike the newer i210. The EXTTS capture value in AUXTSMPx should be converted by the driver from a raw cycle-counter value to a timestamp value with 1 ns resolution. This issue can be reproduced on i350 NICs by connecting a 1PPS signal to an SDP pin and running the 'ts2phc' command to read the external 1PPS timestamp value. On i210 this works fine, but on i350 the extts value is not correctly converted. The i350/i354/82580 SYSTIM and other timestamp registers are 40-bit counters, covering a time range of 2^40 ns, which means they overflow roughly every 1099 s. As a result these registers cannot be used directly, in contrast to the newer i210/i211. The igb driver needs to convert the raw register values to a valid timestamp format using the kernel timecounter API for the i350 family; igb_extts() simply omitted this conversion. Fixes: 38970eac41db ("igb: support EXTTS on 82580/i354/i350") Signed-off-by: Yuezhen Luan <eggcar.luan@gmail.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230607164116.3768175-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
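A minimal sketch of the conversion idea using the kernel timecounter API; the register names and the adapter->tc field are assumptions based on the igb driver layout, not a quote of the patch.

    struct ptp_clock_event event;
    u64 cycles, ns;

    /* raw 40-bit cycle count latched for the SDP event */
    cycles = rd32(E1000_AUXSTMPL0);
    cycles |= (u64)rd32(E1000_AUXSTMPH0) << 32;

    /* convert cycles to nanoseconds through the driver's timecounter,
     * which handles the ~1099 s wraparound of the 40-bit counter
     */
    ns = timecounter_cyc2time(&adapter->tc, cycles);

    event.type = PTP_CLOCK_EXTTS;
    event.index = 0;
    event.timestamp = ns;
    ptp_clock_event(adapter->ptp_clock, &event);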
2023-06-08  ice: do not re-enable miscellaneous interrupt until thread_fn completes  (Jacob Keller; 1 file, -4/+5)
The ice driver uses a threaded IRQ for managing Tx timestamps via the devm_request_threaded_irq() interface. The ice_misc_intr() handler function is responsible for processing the hard interrupt context, and can wake the ice_misc_intr_thread_fn() by returning IRQ_WAKE_THREAD. The request_threaded_irq() function comment says: @handler is still called in hard interrupt context and has to check whether the interrupt originates from the device. If yes, it needs to disable the interrupt on the device and return IRQ_WAKE_THREAD which will wake up the handler thread and run the @thread_fn. We currently re-enable the Other Interrupt Cause Register (OICR) at the end of ice_misc_intr(). In practice, this seems to be ok, but it can make communicating between the handler function and the thread function difficult. This is because the interrupt can trigger again while the thread function is still processing. Move the OICR update to the end of the thread function, leaving the other interrupt cause disabled in hardware until we complete one pass of the thread function. This prevents the miscellaneous interrupt from firing until after we finish the thread function. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-06-08  ice: trigger PFINT_OICR_TSYN_TX interrupt instead of polling  (Jacob Keller; 1 file, -2/+9)
In ice_misc_intr_thread_fn(), if we do not complete all Tx timestamp work, the thread function will poll continuously forever. For E822 hardware, this wastes time as the return value from ice_ptp_process_ts() is accurate and always reports correctly that the PHY actually has new timestamp data. In addition, if we receive enough timestamps with the right pacing, we may never exit this polling. Should this occur, other tasks handled by the ice_misc_intr_thread_fn() will never be processed. Fix this by instead writing to PFINT_OICR, causing an emulated interrupt to be triggered immediately. This does take slightly more processing than just re-checking the timestamps. However, it allows all of the other interrupt causes a chance to be processed first in the hard IRQ function. Note that the OICR interrupt is configured to be throttled to no more than once every 124 microseconds. This gives an effective interrupt rate of ~8000 interrupts per second. This should thus not cause a significant increase in overall CPU usage when compared to sleeping. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
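A hedged sketch of the retrigger at the end of the thread function; the register and mask names follow the ice register header, but the exact call site is an assumption.

    if (ice_ptp_process_ts(pf) == ICE_TX_TSTAMP_WORK_PENDING) {
            /* instead of spinning here, fire a software-triggered OICR
             * interrupt so the hard IRQ handler runs again and other
             * causes get a chance to be serviced first
             */
            wr32(hw, PFINT_OICR, PFINT_OICR_TSYN_TX_M);
            ice_flush(hw);
    }

Because the OICR interrupt is throttled (roughly once per 124 us), the retrigger cannot turn into an unbounded interrupt storm.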
2023-06-08  ice: Fix XDP memory leak when NIC is brought up and down  (Kamil Maziarz; 1 file, -0/+4)
Fix the buffer leak that occurs while switching the port up and down with traffic and XDP by checking for an active XDP program and freeing all empty TX buffers. Fixes: efc2214b6047 ("ice: Add support for XDP") Signed-off-by: Kamil Maziarz <kamil.maziarz@intel.com> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel) Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-06-08  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  (Jakub Kicinski; 5 files, -73/+7)
Cross-merge networking fixes after downstream PR.
Conflicts:
net/sched/sch_taprio.c
  d636fc5dd692 ("net: sched: add rcu annotations around qdisc->qdisc_sleeping")
  dced11ef84fb ("net/sched: taprio: don't overwrite "sch" variable in taprio_dump_class_stats()")
net/ipv4/sysctl_net_ipv4.c
  e209fee4118f ("net/ipv4: ping_group_range: allow GID from 2147483648 to 4294967294")
  ccce324dabfe ("tcp: make the first N SYN RTO backoffs linear")
  https://lore.kernel.org/all/20230605100816.08d41a7b@canb.auug.org.au/
No adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08  ice: introduce ICE_TX_TSTAMP_WORK enumeration  (Jacob Keller; 3 files, -22/+42)
The ice_ptp_process_ts() function and its various helper functions return a boolean value indicating whether any work is remaining. This use of a boolean has grown confusing as we have multiple helpers that pass status between each other. Readers must be aware of what "true" and "false" mean, and it is very easy to get their meaning inverted. The names of the functions are not standard "yes/no" questions, which is the best practice for boolean returns. Replace this use of a boolean with a custom enumeration type, enum ice_tx_tstamp_work. This enumeration clearly indicates whether all work is done, or if more work is pending. To aid readability, factor the actual list iteration and processing out into ice_ptp_process_tx_tstamp(), making it void. Then call this in ice_ptp_tx_tstamp(), ensuring that we always check the tracker list at the end when determining the appropriate return value. Now the return value is an explicit name instead of a bare true or false. This is easier to follow and makes reading the resulting callers much simpler. In addition, this paves the way for future work to allow E822 hardware to process timestamps for all functions using a single interrupt on the clock owning PF. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
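A sketch of what the enumeration and the refactored caller could look like; the value names and the tx->in_use/tx->len fields are assumptions based on the description above.

    enum ice_tx_tstamp_work {
            ICE_TX_TSTAMP_WORK_DONE = 0,
            ICE_TX_TSTAMP_WORK_PENDING,
    };

    static enum ice_tx_tstamp_work ice_ptp_tx_tstamp(struct ice_ptp_tx *tx)
    {
            /* do the actual list walk in a void helper */
            ice_ptp_process_tx_tstamp(tx);

            /* any tracked timestamp still in flight means more work */
            return bitmap_empty(tx->in_use, tx->len) ?
                    ICE_TX_TSTAMP_WORK_DONE : ICE_TX_TSTAMP_WORK_PENDING;
    }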
2023-06-08  ice: always return IRQ_WAKE_THREAD in ice_misc_intr()  (Karol Kolacinski; 1 file, -10/+4)
Refactor the ice_misc_intr() function to always return IRQ_WAKE_THREAD, and schedule the service task during the soft IRQ thread function instead of at the end of the hard IRQ handler. Remove the duplicate call to ice_service_task_schedule() that happened when we got a PCI exception. Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-06-08  ice: handle extts in the miscellaneous interrupt thread  (Karol Kolacinski; 4 files, -19/+33)
The ice_ptp_extts_work() and ice_ptp_periodic_work() functions are both scheduled on the same kthread worker, pf.ptp.kworker. The ice_ptp_periodic_work() function sends messages to the firmware to interact with the PHY, and must block to wait for responses. This can delay the response to the PFINT_OICR_TSYN_EVNT interrupt cause, ultimately disrupting the processing of an input signal if its frequency is high enough. In our testing, even 100 Hz signals get disrupted. Fix this by instead processing the signal inside the miscellaneous interrupt thread prior to handling Tx timestamps. Use atomic bits in a new pf->misc_thread bitmap in order to safely communicate which tasks require processing within ice_misc_intr_thread_fn(). This ensures the tasks requested by ice_misc_intr() are correctly processed without racing, even if the interrupt triggers again before the thread function exits. Fixes: 172db5f91d5f ("ice: add support for auxiliary input/output pins") Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
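A sketch of the atomic-bit handshake between the hard IRQ and the thread; the bit name, the bitmap field, and ice_ptp_extts_event() are assumptions taken from the commit description rather than quoted code.

    /* hard IRQ context: record the cause and defer to the thread */
    if (oicr & PFINT_OICR_TSYN_EVNT_M)
            set_bit(ICE_MISC_THREAD_EXTTS_EVENT, pf->misc_thread);

    /* thread function: consume the request exactly once, even if the
     * interrupt fired again while the thread was still running
     */
    if (test_and_clear_bit(ICE_MISC_THREAD_EXTTS_EVENT, pf->misc_thread))
            ice_ptp_extts_event(pf);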
2023-06-08  ice: Don't dereference NULL in ice_gnss_read error path  (Simon Horman; 1 file, -7/+1)
If pf is NULL in ice_gnss_read() then it will be dereferenced in the error path by a call to dev_dbg(ice_pf_to_dev(pf), ...). Avoid this by simply returning in this case. If logging is desired an alternate approach might be to use pr_err() before returning. Flagged by Smatch as: .../ice_gnss.c:196 ice_gnss_read() error: we previously assumed 'pf' could be null (see line 131) Fixes: 43113ff73453 ("ice: add TTY for GNSS module for E810T device") Signed-off-by: Simon Horman <horms@kernel.org> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-06-08  eth: ixgbe: fix the wake condition  (Jakub Kicinski; 1 file, -1/+1)
Flip the netif_carrier_ok() condition in queue wake logic. When I moved it to inside __netif_txq_completed_wake() I missed negating it. This made the condition ineffective and could probably lead to crashes. Fixes: 301f227fc860 ("net: piggy back on the memory barrier in bql when waking queues") Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20230607010826.960226-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
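The corrected call could look roughly like this; the surrounding argument list is reconstructed from the ixgbe Tx cleanup path and should be treated as an assumption, the point being only the negated carrier check passed as the "down" condition.

    if (!__netif_txq_completed_wake(txq, total_packets, total_bytes,
                                    ixgbe_desc_unused(tx_ring),
                                    TX_WAKE_THRESHOLD,
                                    !netif_carrier_ok(tx_ring->netdev) ||
                                    test_bit(__IXGBE_DOWN, &adapter->state)))
            ++tx_ring->tx_stats.restart_queue;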
2023-06-07  ice: make writes to /dev/gnssX synchronous  (Michal Schmidt; 4 files, -72/+6)
The current ice driver's GNSS write implementation buffers writes and works through them asynchronously in a kthread. That's bad because:
- The GNSS write_raw operation is supposed to be synchronous[1][2].
- There is no upper bound on the number of pending writes. Userspace can submit writes much faster than the driver can process, consuming unlimited amounts of kernel memory.
A patch that's currently on review[3] ("[v3,net] ice: Write all GNSS buffers instead of first one") would add one more problem:
- The possibility of waiting for a very long time to flush the write work when doing rmmod, softlockups.
To fix these issues, simplify the implementation: drop the buffering, the write_work, and make the writes synchronous. I tested this with gpsd and ubxtool. [1] https://events19.linuxfoundation.org/wp-content/uploads/2017/12/The-GNSS-Subsystem-Johan-Hovold-Hovold-Consulting-AB.pdf "User interface" slide. [2] A comment in drivers/gnss/core.c:gnss_write(): /* Ignoring O_NONBLOCK, write_raw() is synchronous. */ [3] https://patchwork.ozlabs.org/project/intel-wired-lan/patch/20230217120541.16745-1-karol.kolacinski@intel.com/ Fixes: d6b98c8d242a ("ice: add write functionality for GNSS TTY") Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-02  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  (Jakub Kicinski; 1 file, -1/+1)
Cross-merge networking fixes after downstream PR. No conflicts.
Adjacent changes:
drivers/net/ethernet/sfc/tc.c
  622ab656344a ("sfc: fix error unwinds in TC offload")
  b6583d5e9e94 ("sfc: support TC decap rules matching on enc_src_port")
net/mptcp/protocol.c
  5b825727d087 ("mptcp: add annotations around msk->subflow accesses")
  e76c8ef5cc5b ("mptcp: refactor mptcp_stream_accept()")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01  ice: recycle/free all of the fragments from multi-buffer frame  (Maciej Fijalkowski; 1 file, -1/+1)
The ice driver caches next_to_clean value at the beginning of ice_clean_rx_irq() in order to remember the first buffer that has to be freed/recycled after main Rx processing loop. The end boundary is indicated by first descriptor of frame that Rx processing loop has ended its duties. Note that if mentioned loop ended in the middle of gathering multi-buffer frame, next_to_clean would be pointing to the descriptor in the middle of the frame BUT freeing/recycling stage will stop at the first descriptor. This means that next iteration of ice_clean_rx_irq() will miss the (first_desc, next_to_clean - 1) entries. When running various 9K MTU workloads, such splats were observed: [ 540.780716] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 540.787787] #PF: supervisor read access in kernel mode [ 540.793002] #PF: error_code(0x0000) - not-present page [ 540.798218] PGD 0 P4D 0 [ 540.800801] Oops: 0000 [#1] PREEMPT SMP NOPTI [ 540.805231] CPU: 18 PID: 3984 Comm: xskxceiver Tainted: G W 6.3.0-rc7+ #96 [ 540.813619] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019 [ 540.824209] RIP: 0010:ice_clean_rx_irq+0x2b6/0xf00 [ice] [ 540.829678] Code: 74 24 10 e9 aa 00 00 00 8b 55 78 41 31 57 10 41 09 c4 4d 85 ff 0f 84 83 00 00 00 49 8b 57 08 41 8b 4f 1c 65 8b 35 1a fa 4b 3f <48> 8b 02 48 c1 e8 3a 39 c6 0f 85 a2 00 00 00 f6 42 08 02 0f 85 98 [ 540.848717] RSP: 0018:ffffc9000f42fc50 EFLAGS: 00010282 [ 540.854029] RAX: 0000000000000004 RBX: 0000000000000002 RCX: 000000000000fffe [ 540.861272] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 00000000ffffffff [ 540.868519] RBP: ffff88984a05ac00 R08: 0000000000000000 R09: dead000000000100 [ 540.875760] R10: ffff88983fffcd00 R11: 000000000010f2b8 R12: 0000000000000004 [ 540.883008] R13: 0000000000000003 R14: 0000000000000800 R15: ffff889847a10040 [ 540.890253] FS: 00007f6ddf7fe640(0000) GS:ffff88afdf800000(0000) knlGS:0000000000000000 [ 540.898465] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 540.904299] CR2: 0000000000000000 CR3: 000000010d3da001 CR4: 00000000007706e0 [ 540.911542] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 540.918789] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 540.926032] PKRU: 55555554 [ 540.928790] Call Trace: [ 540.931276] <TASK> [ 540.933418] ice_napi_poll+0x4ca/0x6d0 [ice] [ 540.937804] ? __pfx_ice_napi_poll+0x10/0x10 [ice] [ 540.942716] napi_busy_loop+0xd7/0x320 [ 540.946537] xsk_recvmsg+0x143/0x170 [ 540.950178] sock_recvmsg+0x99/0xa0 [ 540.953729] __sys_recvfrom+0xa8/0x120 [ 540.957543] ? do_futex+0xbd/0x1d0 [ 540.961008] ? __x64_sys_futex+0x73/0x1d0 [ 540.965083] __x64_sys_recvfrom+0x20/0x30 [ 540.969155] do_syscall_64+0x38/0x90 [ 540.972796] entry_SYSCALL_64_after_hwframe+0x72/0xdc [ 540.977934] RIP: 0033:0x7f6de5f27934 To fix this, set cached_ntc to first_desc so that at the end, when freeing/recycling buffers, descriptors from first to ntc are not missed. Fixes: 2fba7dc5157b ("ice: Add support for XDP multi-buffer on Rx side") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20230531154457.3216621-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
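The fix described above boils down to caching the start of the frame under construction rather than the previous clean position; a minimal sketch, assuming the rx_ring->first_desc field from the multi-buffer support referenced in the Fixes tag.

    /* in ice_clean_rx_irq(): remember where the frame being built starts,
     * not where the previous clean pass stopped, so the freeing/recycling
     * stage covers (first_desc, next_to_clean - 1) as well
     */
    u32 cached_ntc = rx_ring->first_desc;   /* was: rx_ring->next_to_clean */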
2023-05-31  net/sched: taprio: replace tc_taprio_qopt_offload :: enable with a "cmd" enum  (Vladimir Oltean; 1 file, -2/+11)
Inspired from struct flow_cls_offload :: cmd, in order for taprio to be able to report statistics (which is future work), it seems that we need to drill one step further with the ndo_setup_tc(TC_SETUP_QDISC_TAPRIO) multiplexing, and pass the command as part of the common portion of the muxed structure. Since we already have an "enable" variable in tc_taprio_qopt_offload, refactor all drivers to check for "cmd" instead of "enable", and reject every other command except "replace" and "destroy" - to be future proof. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> # for lan966x Acked-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek Reviewed-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com> Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
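After this refactor, a driver's ndo_setup_tc(TC_SETUP_QDISC_TAPRIO) handler multiplexes on the new field roughly as below; the TAPRIO_CMD_* value names are assumptions based on the description, and the case bodies are placeholders.

    switch (qopt->cmd) {
    case TAPRIO_CMD_REPLACE:
            /* program the new gate control list */
            break;
    case TAPRIO_CMD_DESTROY:
            /* tear the offloaded schedule down */
            break;
    default:
            /* stay future proof: reject any command we do not know */
            return -EOPNOTSUPP;
    }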
2023-05-30  devlink: move port_split/unsplit() ops into devlink_port_ops  (Jiri Pirko; 1 file, -2/+2)
Move port_split/unsplit() from devlink_ops into newly introduced devlink_port_ops. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-30  ice: register devlink port for PF with ops  (Jiri Pirko; 1 file, -1/+5)
Use the newly introduced devlink port registration function variant and register the devlink port, passing the ops. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Michal Wilczynski <michal.wilczynski@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-26  overflow: Add struct_size_t() helper  (Kees Cook; 1 file, -5/+4)
While struct_size() is normally used in situations where the structure type already has a pointer instance, there are places where no variable is available. In the past, this has been worked around by using a typed NULL first argument, but this is a bit ugly. Add a helper to do this, and replace the handful of instances of the code pattern with it. Instances were found with this Coccinelle script: @struct_size_t@ identifier STRUCT, MEMBER; expression COUNT; @@ - struct_size((struct STRUCT *)\(0\|NULL\), + struct_size_t(struct STRUCT, MEMBER, COUNT) Suggested-by: Christoph Hellwig <hch@infradead.org> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Tony Nguyen <anthony.l.nguyen@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: James Smart <james.smart@broadcom.com> Cc: Keith Busch <kbusch@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: HighPoint Linux Team <linux@highpoint-tech.com> Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Kashyap Desai <kashyap.desai@broadcom.com> Cc: Sumit Saxena <sumit.saxena@broadcom.com> Cc: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Cc: Don Brace <don.brace@microchip.com> Cc: "Darrick J. Wong" <djwong@kernel.org> Cc: Dave Chinner <dchinner@redhat.com> Cc: Guo Xuenan <guoxuenan@huawei.com> Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Daniel Latypov <dlatypov@google.com> Cc: kernel test robot <lkp@intel.com> Cc: intel-wired-lan@lists.osuosl.org Cc: netdev@vger.kernel.org Cc: linux-nvme@lists.infradead.org Cc: linux-scsi@vger.kernel.org Cc: megaraidlinux.pdl@broadcom.com Cc: storagedev@microchip.com Cc: linux-xfs@vger.kernel.org Cc: linux-hardening@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://lore.kernel.org/r/20230522211810.never.421-kees@kernel.org
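For illustration, here is what the conversion looks like at a call site. The struct and member names below are invented for the example; only struct_size() and struct_size_t() come from the patch.

    struct flex_example {
            int nr;
            u32 items[];            /* flexible array member */
    };

    /* before: a typed NULL pointer only to satisfy struct_size() */
    size_t sz_old = struct_size((struct flex_example *)NULL, items, count);

    /* after: no pointer instance needed */
    size_t sz_new = struct_size_t(struct flex_example, items, count);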
2023-05-22  Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue  (David S. Miller; 12 files, -259/+351)
Tony Nguyen says: ==================== ice: allow matching on meta data Michal Swiatkowski says: This patchset is intended to improve the usability of the switchdev slow path. Without matching on meta data values, the slow path works based on the VF's MAC addresses. This causes a problem when the VF wants to use more than one MAC address (e.g. when it is in trusted mode). Parse all meta data in the same place where protocol type fields are parsed. Add a description for the currently implemented meta data. It is important to note that, depending on the DDP, not all described meta data may be available. Using unavailable meta data leads to an error returned by the function that looks up the correct words in the profiles read from the DDP. There is also one small improvement: removal of the rx field in the rule info structure (patch 2). It is redundant. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-20  Merge branch '1GbE' of ↵  (Jakub Kicinski; 3 files, -1/+11)
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2023-05-18 (igc, igb, e1000e) This series contains updates to igc, igb, and e1000e drivers. Kurt Kanzenbach adds calls to txq_trans_cond_update() for XDP transmit on igc. Tom Rix makes definition of igb_pm_ops conditional on CONFIG_PM for igb. Baozhu Ni adds a missing kdoc description on e1000e. * '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: e1000e: Add @adapter description to kdoc igb: Define igb_pm_ops conditionally on CONFIG_PM igc: Avoid transmit queue timeout for XDP ==================== Link: https://lore.kernel.org/r/20230518170942.418109-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-19  ice: use src VSI instead of src MAC in slow-path  (Michal Swiatkowski; 9 files, -102/+40)
Using a source MAC to direct packets from the VF to the corresponding port representor is only OK if there is a single MAC on the VF. To support this functionality when the number of MACs on a VF is greater, it is necessary to match on a source VSI instead of a source MAC. Let's use the new switch API that allows matching on meta data. If the MAC isn't used in the match criteria, there is no need to add a rule after the virtchnl command; instead, add the new rule while the port representor is being configured. Remove the rule_added field; checking sp_rule can be used instead. Also remove the check for the switchdev running flag when deleting the rule, as deletion can be called from the unroll context where the running flag isn't set. Checking sp_rule covers both contexts (with and without the running flag). Rules are added in the eswitch configuration flow, so there is no need for a replay function. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Piotr Raczynski <piotr.raczynski@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-05-19  ice: allow matching on meta data  (Michal Swiatkowski; 5 files, -105/+95)
Add meta data matching criteria in the same place as protocol matching criteria. There is no need to add meta data as special words after parsing all lookups; treat meta data the same way as other lookups. The one difference between meta data lookups and protocol lookups is that meta data doesn't affect how the packet looks, so ignore it when filling the test packet. Always match on tunnel type meta data if the tunnel type is different from TNL_LAST. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Piotr Raczynski <piotr.raczynski@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-05-19  ice: specify field names in ice_prot_ext init  (Michal Swiatkowski; 1 file, -23/+28)
Anonymous initializers are now discouraged. Define the ICE_PROTCOL_ENTRY macro to rewrite the anonymous initializers as named ones. No functional changes here. Suggested-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-05-19  ice: remove redundant Rx field from rule info  (Michal Swiatkowski; 4 files, -22/+14)
Information about the direction is currently stored in sw_act.flag. There is no need to duplicate it in another field. Setting the direction flag doesn't mean there is a match criterion for direction in the rule; it only tells the HW where the switch ID should be collected from (VSI or port). In the current implementation of advanced rule handling, without matching on direction meta data, we can always set the same flag and everything works the same. The ability to match on direction meta data will be added in follow-up patches. Recipes 0, 3 and 9 loaded from the package have direction match criteria, but they are handled in another function. Move the ice_adv_rule_info fields to avoid holes. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Piotr Raczynski <piotr.raczynski@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-05-19  ice: define meta data to match in switch  (Michal Swiatkowski; 3 files, -16/+183)
Add a description for each meta data. Redefine the tunnel mask to match only the tunneled MAC and tunneled VLAN; it shouldn't try to match other flags (previously the mask was 0xff, which is redundant). The VLAN mask was 0xd000; change it to 0xf000. The last 4 bits are flags derived from the same field in the packet (the VLAN tag), so it isn't harmful to also match on ITAG. Group all MDIDs and MDID offsets into enums to keep things organized. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Piotr Raczynski <piotr.raczynski@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-05-19  Merge branch '100GbE' of ↵  (Jakub Kicinski; 8 files, -366/+142)
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2023-05-17 (ice, MAINTAINERS) This series contains updates to ice driver and MAINTAINERS file. Paul refactors PHY to link mode reporting and updates some PHY types to report more accurate link modes for ice. Dave removes mutual exclusion policy between LAG and SR-IOV in ice driver. Jesse updates link for Intel Wired LAN in the MAINTAINERS file. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: MAINTAINERS: update Intel Ethernet links ice: Remove LAG+SRIOV mutual exclusion ice: update PHY type to ethtool link mode mapping ice: refactor PHY type to ethtool link mode ice: update ICE_PHY_TYPE_HIGH_MAX_INDEX ==================== Link: https://lore.kernel.org/r/20230517165530.3179965-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-19  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  (Jakub Kicinski; 10 files, -25/+40)
Conflicts:
drivers/net/ethernet/freescale/fec_main.c
  6ead9c98cafc ("net: fec: remove the xdp_return_frame when lack of tx BDs")
  144470c88c5d ("net: fec: using the standard return codes when xdp xmit errors")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-18  e1000e: Add @adapter description to kdoc  (Baozhu Ni; 1 file, -1/+1)
Provide a description for the kernel doc of the @adapter of e1000e_trigger_lsc() Signed-off-by: Baozhu Ni <nibaozhu@yeah.net> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-05-18  igb: Define igb_pm_ops conditionally on CONFIG_PM  (Tom Rix; 1 file, -0/+2)
For s390, gcc with W=1 reports
  drivers/net/ethernet/intel/igb/igb_main.c:186:32: error: 'igb_pm_ops' defined but not used [-Werror=unused-const-variable=]
    186 | static const struct dev_pm_ops igb_pm_ops = {
        |                                ^~~~~~~~~~
The only use of igb_pm_ops is conditional on CONFIG_PM. The definition of igb_pm_ops should also be conditional on CONFIG_PM. Signed-off-by: Tom Rix <trix@redhat.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
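A sketch of the shape of the fix; the suspend/resume callback names are taken from the existing igb driver and should be treated as assumptions here.

    #ifdef CONFIG_PM
    static const struct dev_pm_ops igb_pm_ops = {
            SET_SYSTEM_SLEEP_PM_OPS(igb_suspend, igb_resume)
            SET_RUNTIME_PM_OPS(igb_runtime_suspend, igb_runtime_resume,
                               igb_runtime_idle)
    };
    #endif
    /* the only user (.driver.pm = &igb_pm_ops) is already guarded by
     * #ifdef CONFIG_PM, so the unused-variable warning disappears
     */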
2023-05-18  igc: Avoid transmit queue timeout for XDP  (Kurt Kanzenbach; 1 file, -0/+8)
High XDP load triggers the netdev watchdog:
  |NETDEV WATCHDOG: enp3s0 (igc): transmit queue 2 timed out
The reason is that the Tx queue transmission start (txq->trans_start) is not updated in the XDP code path. Therefore, add it for all XDP transmission functions. Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
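A minimal sketch of the idea in an igc XDP transmit path; the exact call sites are an assumption, the point is that trans_start is refreshed while the queue lock is held.

    struct netdev_queue *nq = txring_txq(ring);

    __netif_tx_lock(nq, cpu);
    /* XDP shares this queue with the stack; refresh trans_start so the
     * watchdog does not see a stale timestamp and declare a timeout
     */
    txq_trans_cond_update(nq);
    /* ... fill descriptors and bump the tail ... */
    __netif_tx_unlock(nq);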
2023-05-17  ice: Remove LAG+SRIOV mutual exclusion  (Dave Ertman; 5 files, -91/+0)
A previous change stopped SR-IOV and LAG from existing on the same interface. This was to prevent violation of LACP (Link Aggregation Control Protocol). The method to achieve this was to add a no-op Rx handler onto the netdev when SR-IOV VFs were present, thus blocking bonding, bridging, etc. from claiming the interface by adding their own Rx handler. Also, when an interface was added to an aggregate, the SR-IOV capability was set to false. Some users have in-house solutions using both SR-IOV and bridging/bonding that this method interferes with (e.g. creating duplicate VFs on the bonded interfaces and failing over between them when the interface fails over). Since it makes more sense to provide the most functionality possible, the restriction on co-existence of these features is removed. No additional functionality is currently being provided beyond what existed before the co-existence restriction was put into place. It is up to the end user not to implement a solution that would interfere with existing network protocols. Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-05-17  ice: update PHY type to ethtool link mode mapping  (Paul Greenwalt; 1 file, -19/+19)
Some link modes can be reported more accurately thanks to newer link mode values that have been added to the kernel; update those PHY types to report modes that better reflect the actual link mode. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-05-17  ice: refactor PHY type to ethtool link mode  (Paul Greenwalt; 3 files, -274/+141)
Refactor ice_phy_type_to_ethtool to use phy_type_[low|high]_lkup table to map PHY type to AQ link speed and ethtool link mode. This removes complexity and simplifies future changes. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-05-17  ice: update ICE_PHY_TYPE_HIGH_MAX_INDEX  (Paul Greenwalt; 1 file, -1/+1)
ICE_PHY_TYPE_HIGH_MAX_INDEX should be the maximum index value and not the length/number of ICE_PHY_TYPE_HIGH. This is not an issue because this define is only used when calling ice_get_link_speed_based_on_phy_type(), which will return ICE_AQ_LINK_SPEED_UNKNOWN for any invalid index. The caller of ice_get_link_speed_based_on_phy_type(), ice_update_phy_type() checks that the return value is a valid link speed before using it and ICE_AQ_LINK_SPEED_UNKNOWN is not. However, update the define to reflect the correct value. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-05-17  Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue  (David S. Miller; 16 files, -620/+569)
Tony Nguyen says: ==================== ice: support dynamic interrupt allocation Piotr Raczynski says: This patchset reimplements the MSIX interrupt allocation logic to allow dynamic interrupt allocation after MSIX has been initially enabled. This allows current and future features to allocate and free interrupts as needed and will help to drastically decrease the number of initially preallocated interrupts (even down to the API hard limit of 1). Although this patchset does not change behavior in terms of the actual number of interrupts allocated during probe, that is subject to change. The first few patches prepare to introduce dynamic allocation by moving the interrupt allocation code to a separate file and updating the allocation API used in the driver to the currently preferred one. Due to the current contract between the ice and irdma drivers, where irdma directly accesses the msix entries allocated by ice, keep the msix_entries array for irdma use even after moving away from the older pci_enable_msix_range function. The next patches refactor and remove redundant code from the SRIOV-related logic, as that also makes it easier to move away from the static allocation scheme. The last patches actually enable dynamic allocation of MSIX interrupts. First, introduce functions to allocate and free interrupts individually. This sets the ground for the rest of the changes, even though that patch still allocates the interrupts from the preallocated pool. Since this patch starts to keep interrupt details in the ice_q_vector structure, we can get rid of the functions that calculate the base vector number and register offset for the interrupt, as each is equal to the interrupt index. Only keep separate register offset functions for the VF VSIs. Next, replace the homegrown interrupt tracker with a much simpler xarray-based approach. As the new API always allocates interrupts one by one, also track interrupts in the same manner. Lastly, extend the interrupt tracker to deal with both preallocated and dynamically allocated vectors and use the pci_msix_alloc_irq_at and pci_msix_free_irq functions. Since not all architectures support dynamic allocation, check for it before trying to allocate a new interrupt. As previously mentioned, this patchset does not change the number of interrupts initially allocated during the init phase, but now it can, and that will likely change.
Patch 1-3 -> move code around and use newer API
Patch 4-5 -> refactor and remove redundant SRIOV code
Patch 6 -> allocate every interrupt individually
Patch 7 -> replace homegrown interrupt tracker with xarray
Patch 8 -> allow dynamic interrupt allocation
---
v2: Patch 4 - simplify ice_vsi_setup_vector_base and account for num_avail_sw_msix; Patch 8 - prevent q_vector leak in case of vf ctrl VSI error
v1: https://lore.kernel.org/netdev/20230509170048.2235678-1-anthony.l.nguyen@intel.com/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-17  igb: fix bit_shift to be in [1..8] range  (Aleksandr Loktionov; 1 file, -2/+2)
In igb_hash_mc_addr(), in the expression "mc_addr[4] >> 8 - bit_shift", right-shifting "mc_addr[4]" by more than 7 bits always yields zero, so the hash values become much less distinct. Add initialization with bit_shift = 1 and add a loop condition to ensure bit_shift is always in the [1..8] range. Fixes: 9d5c824399de ("igb: PCI-Express 82575 Gigabit Ethernet driver") Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
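A simplified sketch of the bounded hash computation inside igb_hash_mc_addr(); the driver also adjusts bit_shift based on hw->mac.mc_filter_type, which is omitted here, and the exact loop bound is an assumption taken from the [1..8] constraint above.

    u32 hash_value, hash_mask;
    u8 bit_shift = 1;       /* was 0: start inside the valid [1..8] range */

    /* register count multiplied by bits per register */
    hash_mask = (hw->mac.mta_reg_count * 32) - 1;

    /* bound the loop so "mc_addr[4] >> (8 - bit_shift)" never shifts
     * by more than 7 bits
     */
    while (hash_mask >> bit_shift != 0xFF && bit_shift < 8)
            bit_shift++;

    hash_value = hash_mask & ((mc_addr[4] >> (8 - bit_shift)) |
                              ((u16)mc_addr[5] << bit_shift));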
2023-05-17  Merge tag 'for-netdev' of ↵  (Jakub Kicinski; 2 files, -11/+140)
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2023-05-16 We've added 57 non-merge commits during the last 19 day(s) which contain a total of 63 files changed, 3293 insertions(+), 690 deletions(-). The main changes are: 1) Add precision propagation to verifier for subprogs and callbacks, from Andrii Nakryiko. 2) Improve BPF's {g,s}setsockopt() handling with wrong option lengths, from Stanislav Fomichev. 3) Utilize pahole v1.25 for the kernel's BTF generation to filter out inconsistent function prototypes, from Alan Maguire. 4) Various dyn-pointer verifier improvements to relax restrictions, from Daniel Rosenberg. 5) Add a new bpf_task_under_cgroup() kfunc for designated task, from Feng Zhou. 6) Unblock tests for arm64 BPF CI after ftrace supporting direct call, from Florent Revest. 7) Add XDP hint kfunc metadata for RX hash/timestamp for igc, from Jesper Dangaard Brouer. 8) Add several new dyn-pointer kfuncs to ease their usability, from Joanne Koong. 9) Add in-depth LRU internals description and dot function graph, from Joe Stringer. 10) Fix KCSAN report on bpf_lru_list when accessing node->ref, from Martin KaFai Lau. 11) Only dump unprivileged_bpf_disabled log warning upon write, from Kui-Feng Lee. 12) Extend test_progs to directly passing allow/denylist file, from Stephen Veiss. 13) Fix BPF trampoline memleak upon failure attaching to fentry, from Yafang Shao. 14) Fix emitting struct bpf_tcp_sock type in vmlinux BTF, from Yonghong Song. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (57 commits) bpf: Fix memleak due to fentry attach failure bpf: Remove bpf trampoline selector bpf, arm64: Support struct arguments in the BPF trampoline bpftool: JIT limited misreported as negative value on aarch64 bpf: fix calculation of subseq_idx during precision backtracking bpf: Remove anonymous union in bpf_kfunc_call_arg_meta bpf: Document EFAULT changes for sockopt selftests/bpf: Correctly handle optlen > 4096 selftests/bpf: Update EFAULT {g,s}etsockopt selftests bpf: Don't EFAULT for {g,s}setsockopt with wrong optlen libbpf: fix offsetof() and container_of() to work with CO-RE bpf: Address KCSAN report on bpf_lru_list bpf: Add --skip_encoding_btf_inconsistent_proto, --btf_gen_optimized to pahole flags for v1.25 selftests/bpf: Accept mem from dynptr in helper funcs bpf: verifier: Accept dynptr mem as mem in helpers selftests/bpf: Check overflow in optional buffer selftests/bpf: Test allowing NULL buffer in dynptr slice bpf: Allow NULL buffers in bpf_dynptr_slice(_rw) selftests/bpf: Add testcase for bpf_task_under_cgroup bpf: Add bpf_task_under_cgroup() kfunc ... ==================== Link: https://lore.kernel.org/r/20230515225603.27027-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-16  iavf: send VLAN offloading caps once after VFR  (Ahmed Zaki; 1 file, -5/+0)
When the user disables rxvlan offloading and then changes the number of channels, all VLAN ports are unable to receive traffic. Changing the number of channels triggers a VFR reset. During re-init, when VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS is received, we do: 1 - set the IAVF_FLAG_SETUP_NETDEV_FEATURES flag 2 - call iavf_set_vlan_offload_features(adapter, 0, netdev->features); The second step sends to the PF the __default__ features, in this case aq_required |= IAVF_FLAG_AQ_ENABLE_CTAG_VLAN_STRIPPING While the first step forces the watchdog task to call netdev_update_features() -> iavf_set_features() -> iavf_set_vlan_offload_features(adapter, netdev->features, features). Since the user disabled the "rxvlan", this sets: aq_required |= IAVF_FLAG_AQ_DISABLE_CTAG_VLAN_STRIPPING When we start processing the AQ commands, both flags are enabled. Since we process DISABLE_XTAG first then ENABLE_XTAG, this results in the PF enabling the rxvlan offload. This breaks all communications on the VLAN net devices. Fix by removing the call to iavf_set_vlan_offload_features() (second step). Calling netdev_update_features() from watchdog task is enough for both init and reset paths. Fixes: 7598f4b40bd6 ("iavf: Move netdev_update_features() into watchdog task") Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-05-16  ice: Fix ice VF reset during iavf initialization  (Dawid Wesierski; 4 files, -4/+25)
Fix the current implementation that causes ice_trigger_vf_reset() to start resetting the VF even when the VF NIC is still initializing. When we reset the NIC with the ice driver it can interfere with iavf VF initialization, e.g. during consecutive resets induced by ice. [The original commit message illustrates this with a sequence diagram: ice triggers a VF reset while a previous iavf reset is still in progress, so ice resets the VF again in the middle of initialization and the iavf initialization fails.] This leads to a series of -53 errors (failed to init adminq) from the IAVF. Change the vf_state field to be not active while the IAVF is still initializing, and wait until the message is received on the mailbox to ensure that the VF is ready and initialized. In simple terms, the ACTIVE flag makes sure that the ice driver knows whether the iavf is ready for another reset. [A second diagram in the original shows the fixed flow: ice only resets the VF once vf_state == ACTIVE, i.e. after the iavf reset has finished.] Fixes: c54d209c78b8 ("ice: Wait for VF to be reset/ready before configuration") Signed-off-by: Dawid Wesierski <dawidx.wesierski@intel.com> Signed-off-by: Kamil Maziarz <kamil.maziarz@intel.com> Acked-by: Jacob Keller <Jacob.e.keller@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-05-16  ice: Fix stats after PF reset  (Ahmed Zaki; 1 file, -0/+5)
After a core PF reset, the VFs were showing wrong Rx/Tx stats. This is a regression in commit 6624e780a577 ("ice: split ice_vsi_setup into smaller functions") caused by missing to set "stat_offsets_loaded = false" in the ice_vsi_rebuild() path. Fixes: 6624e780a577 ("ice: split ice_vsi_setup into smaller functions") Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
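The fix boils down to resetting the flag in the rebuild path; a minimal sketch.

    /* in ice_vsi_rebuild(): force a fresh baseline for the HW stat
     * registers so post-reset counters are not offset by stale values
     */
    vsi->stat_offsets_loaded = false;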
2023-05-16  ice: add dynamic interrupt allocation  (Piotr Raczynski; 7 files, -21/+107)
Currently the driver can only allocate interrupt vectors during the init phase by calling pci_alloc_irq_vectors. Change that: make use of the new pci_msix_alloc_irq_at/pci_msix_free_irq API and allow allocating and freeing more interrupts after MSIX has been enabled. Since not all platforms support dynamic allocation, check for it with pci_msix_can_alloc_dyn. Extend the tracker to keep track of how many interrupts are allocated initially, so that when all such vectors are already in use, additional interrupts are automatically allocated dynamically. Remember each interrupt's allocation method so it can be freed appropriately. Since some features may require dynamically allocated interrupts, add an appropriate VSI flag and take it into account when allocating a new interrupt. Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Piotr Raczynski <piotr.raczynski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
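A hedged sketch of the dynamic path with the new PCI/MSI-X APIs; the fallback policy and the surrounding ice structures are assumptions, only the pci_msix_* calls are the documented kernel interface.

    struct msi_map map = { .index = -ENOENT };

    if (pci_msix_can_alloc_dyn(pf->pdev))
            map = pci_msix_alloc_irq_at(pf->pdev, MSI_ANY_INDEX, NULL);

    if (map.index < 0)
            return -ENOMEM;         /* or fall back to the preallocated pool */

    /* request_irq(map.virq, ...) and remember that this vector was
     * allocated dynamically, so the release path can later call:
     */
    pci_msix_free_irq(pf->pdev, map);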
2023-05-16  ice: track interrupt vectors with xarray  (Piotr Raczynski; 6 files, -84/+89)
Replace the custom interrupt tracker with the generic xarray data structure. Remove all the code responsible for searching for a new entry; xa_alloc always tries to allocate at the lowest possible index, so the driver keeps using a contiguous region of the MSIX vector table. The new tracker keeps ice_irq_entry entries in the xarray as opaque entries for the rest of the driver, hiding the entry details from the caller. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Piotr Raczynski <piotr.raczynski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
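A sketch of what per-interrupt tracking with an xarray looks like; the struct and field names (ice_irq_entry, pf->irq_tracker) are assumptions based on the commit text, while xa_alloc() and XA_LIMIT() are the generic kernel API.

    struct ice_irq_entry *entry;
    u32 index;
    int ret;

    entry = kzalloc(sizeof(*entry), GFP_KERNEL);
    if (!entry)
            return NULL;

    /* xa_alloc() hands back the lowest free index, which keeps the used
     * portion of the MSI-X vector table contiguous
     */
    ret = xa_alloc(&pf->irq_tracker.entries, &index, entry,
                   XA_LIMIT(0, pf->irq_tracker.num_entries - 1), GFP_KERNEL);
    if (ret) {
            kfree(entry);
            return NULL;
    }
    entry->index = index;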
2023-05-16  ice: add individual interrupt allocation  (Piotr Raczynski; 13 files, -282/+165)
Currently, interrupt allocations are distributed in batches, depending on the feature. Also, after allocation there is a series of operations that distributes per-IRQ settings through that batch of interrupts. Although the driver does not yet support dynamic interrupt allocation, keep allocated interrupts in a pool and add allocation abstraction logic to make the code more flexible. Keep per-interrupt information in the ice_q_vector structure, which makes ice_vsi::base_vector redundant. As a result, a few functions can also be removed. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Piotr Raczynski <piotr.raczynski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-05-16  ice: remove redundant SRIOV code  (Piotr Raczynski; 1 file, -36/+0)
Remove redundant code from ice_get_max_valid_res_idx that has no effect. ice_pf::irq_tracker is initialized during driver probe; there is no reason to check it again. Also, it is not possible for pf::sriov_base_vector to be lower than the tracker length, so remove the WARN_ON that can never trigger. Get rid of the ice_get_max_valid_res_idx helper function completely, since it can never return a negative value. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Piotr Raczynski <piotr.raczynski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-05-16  ice: refactor VF control VSI interrupt handling  (Piotr Raczynski; 3 files, -66/+58)
All VF control VSIs share the same interrupt vector. Currently, a helper function dedicated for that directly sets ice_vsi::base_vector. Use helper that returns pointer to first found VF control VSI instead. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Piotr Raczynski <piotr.raczynski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-05-16  ice: use preferred MSIX allocation api  (Piotr Raczynski; 3 files, -38/+37)
Move away from using pci_enable_msix_range/pci_disable_msix and use pci_alloc_irq_vectors/pci_free_irq_vectors instead. As a result, stop tracking msix_entries, since with the newer API the entries are handled by the MSIX core. However, due to the current design of communication with the RDMA driver, which accesses ice_pf::msix_entries directly, keep using the array just for the RDMA driver's use. Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Piotr Raczynski <piotr.raczynski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
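For reference, the preferred allocation/teardown pair looks like this; the ICE_MIN_MSIX lower bound and the total_vectors upper bound are illustrative, only the pci_*_irq_vectors calls are the documented API.

    int vectors;

    vectors = pci_alloc_irq_vectors(pf->pdev, ICE_MIN_MSIX, total_vectors,
                                    PCI_IRQ_MSIX);
    if (vectors < 0)
            return vectors;         /* could not even get the minimum */

    /* ... on teardown, a single call releases everything ... */
    pci_free_irq_vectors(pf->pdev);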
2023-05-16  ice: use pci_irq_vector helper function  (Piotr Raczynski; 5 files, -11/+11)
Currently, the driver gets the interrupt number directly from the ice_pf::msix_entries array. Use the helper function dedicated to doing just that. While at it, use a variable to store the interrupt number in ice_free_irq_msix_misc instead of calling the helper function twice. Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Piotr Raczynski <piotr.raczynski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-05-16  ice: move interrupt related code to separate file  (Piotr Raczynski; 5 files, -218/+238)
Keep interrupt handling code in a dedicated file. This helps keep driver structured better and prepares for more functionality added to this file. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Piotr Raczynski <piotr.raczynski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-05-12  ice: Fix undersized tx_flags variable  (Jan Sokolowski; 3 files, -14/+8)
Not all ICE_TX_FLAGS_* fit in the 16-bit tx_flags field introduced in the Fixes commit, so VLAN-related information was discarded completely. As a result, creating a VLAN and trying to run ping through it would result in no traffic passing. Fix that by splitting tx_flags into a flags-only field and a separate variable that holds the VLAN ID. As there is some space left, the type variable fits between the two. Pahole reports no size change to the ice_tx_buf struct. Fixes: aa1d3faf71a6 ("ice: Robustify cleaning/completing XDP Tx buffers") Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>