summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/chelsio/cxgb4/sge.c
AgeCommit message (Collapse)AuthorFilesLines
2021-05-05net:CXGB4: fix leak if sk_buff is not usedÍñigo Huguet1-7/+9
An sk_buff is allocated to send a flow control message, but it's not sent in all cases: in case the state is not appropiate to send it or if it can't be enqueued. In the first of these 2 cases, the sk_buff was discarded but not freed, producing a memory leak. Signed-off-by: Íñigo Huguet <ihuguet@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller1-3/+8
2021-02-15cxgb4/chtls/cxgbit: Keeping the max ofld immediate data size same in cxgb4 ↵Ayush Sawal1-3/+8
and ulds The Max imm data size in cxgb4 is not similar to the max imm data size in the chtls. This caused an mismatch in output of is_ofld_imm() of cxgb4 and chtls. So fixed this by keeping the max wreq size of imm data same in both chtls and cxgb4 as MAX_IMM_OFLD_TX_DATA_WR_LEN. As cxgb4's max imm. data value for ofld packets is changed to MAX_IMM_OFLD_TX_DATA_WR_LEN. Using the same in cxgbit also. Fixes: 36bedb3f2e5b8 ("crypto: chtls - Inline TLS record Tx") Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-01-17cxgb4: enable interrupt based Tx completions for T5Raju Rangoju1-5/+33
Enable interrupt based Tx completions to improve latency for T5. The consumer index (CIDX) will now come via interrupts so that Tx SKBs can be freed up sooner in Rx path. Also, enforce CIDX flush threshold override (CIDXFTHRESHO) to improve latency for slow traffic. This ensures that the interrupt is generated immediately whenever hardware catches up with driver (i.e. CIDX == PIDX is reached), which is often the case for slow traffic. Signed-off-by: Raju Rangoju <rajur@chelsio.com> Link: https://lore.kernel.org/r/20210115102059.6846-1-rajur@chelsio.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12cxgb4/ch_ktls: creating skbs causes panicRohit Maheshwari1-0/+108
Creating SKB per tls record and freeing the original one causes panic. There will be race if connection reset is requested. By freeing original skb, refcnt will be decremented and that means, there is no pending record to send, and so tls_dev_del will be requested in control path while SKB of related connection is in queue. Better approach is to use same SKB to send one record (partial data) at a time. We still have to create a new SKB when partial last part of a record is requested. This fix introduces new API cxgb4_write_partial_sgl() to send partial part of skb. Present cxgb4_write_sgl can only provide feasibility to start from an offset which limits to header only and it can write sgls for the whole skb len. But this new API will help in both. It can start from any offset and can end writing in middle of the skb. v4->v5: - Removed extra changes. Fixes: 429765a149f1 ("chcr: handle partial end part of a record") Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12cxgb4/ch_ktls: decrypted bit is not enoughRohit Maheshwari1-1/+2
If skb has retransmit data starting before start marker, e.g. ccs, decrypted bit won't be set for that, and if it has some data to encrypt, then it must be given to crypto ULD. So in place of decrypted, check if socket is tls offloaded. Also, unless skb has some data to encrypt, no need to give it for tls offload handling. v2->v3: - Removed ifdef. Fixes: 5a4b9fe7fece ("cxgb4/chcr: complete record tx handling") Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-09cxgb4: convert tasklets to use new tasklet_setup() APIAllen Pais1-2/+3
In preparation for unconditionally passing the struct tasklet_struct pointer to all tasklet callbacks, switch to using the new tasklet_setup() and from_tasklet() to pass the tasklet pointer explicitly. Signed-off-by: Romain Perier <romain.perier@gmail.com> Signed-off-by: Allen Pais <apais@linux.microsoft.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-30net: cxbg4: Remove pointless in_interrupt() checkThomas Gleixner1-3/+0
t4_sge_stop() is only ever called from task context and the in_interrupt() check is presumably a leftover from copying t3_sge_stop(). Aside of in_interrupt() being deprecated because it's not providing what it claims to provide, this check would paper over illegitimate callers. The functions invoked from t4_sge_stop() contain already warnings to catch invocations from invalid contexts. Remove it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-14chelsio: convert tasklets to use new tasklet_setup() APIAllen Pais1-8/+8
In preparation for unconditionally passing the struct tasklet_struct pointer to all tasklet callbacks, switch to using the new tasklet_setup() and from_tasklet() to pass the tasklet pointer explicitly. Signed-off-by: Romain Perier <romain.perier@gmail.com> Signed-off-by: Allen Pais <apais@linux.microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-12crypto/chcr: move nic TLS functionality to drivers/netRohit Maheshwari1-2/+2
This patch moves complete nic tls offload (kTLS) code from crypto directory to drivers/net/ethernet/chelsio/inline_crypto/ch_ktls directory. nic TLS is made a separate ULD of cxgb4. Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller1-3/+7
2020-08-22crypto/chcr: Moving chelsio's inline ipsec functionality to /drivers/netVinay Kumar Yadav1-2/+2
This patch seperates inline ipsec functionality from coprocessor driver chcr. Now inline ipsec is separate ULD, moved from "drivers/crypto/chelsio/" to "drivers/net/ethernet/chelsio/inline_crypto/ch_ipsec/" Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com> Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-18cxgb4: Fix race between loopback and normal Tx pathGanji Aravind1-1/+5
Even after Tx queues are marked stopped, there exists a small window where the current packet in the normal Tx path is still being sent out and loopback selftest ends up corrupting the same Tx ring. So, ensure selftest takes the Tx lock to synchronize access the Tx ring. Fixes: 7235ffae3d2c ("cxgb4: add loopback ethtool self-test") Signed-off-by: Ganji Aravind <ganji.aravind@chelsio.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-18cxgb4: Fix work request size calculation for loopback testGanji Aravind1-2/+2
Work request used for sending loopback packet needs to add the firmware work request only once. So, fix by using correct structure size. Fixes: 7235ffae3d2c ("cxgb4: add loopback ethtool self-test") Signed-off-by: Ganji Aravind <ganji.aravind@chelsio.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller1-0/+1
The UDP reuseport conflict was a little bit tricky. The net-next code, via bpf-next, extracted the reuseport handling into a helper so that the BPF sk lookup code could invoke it. At the same time, the logic for reuseport handling of unconnected sockets changed via commit efc6b6f6c3113e8b203b9debfb72d81e0f3dcace which changed the logic to carry on the reuseport result into the rest of the lookup loop if we do not return immediately. This requires moving the reuseport_has_conns() logic into the callers. While we are here, get rid of inline directives as they do not belong in foo.c files. The other changes were cases of more straightforward overlapping modifications. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23cxgb4: add loopback ethtool self-testVishal Kulkarni1-1/+106
In this test, loopback pkt is created and sent on default queue. The packet goes until the Multi Port Switch (MPS) just before the MAC and based on the specified channel number, it either goes outside the wire on one of the physical ports or looped back to Rx path by MPS. In this case, we're specifying loopback channel, instead of physical ports, so the packet gets looped back to Rx path, instead of getting transmitted on the wire. v3: - Modify commit message to include test details. v2: - Add only loopback self-test. Signed-off-by: Vishal Kulkarni <vishal@chelsio.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23cxgb4: add missing release on skb in uld_send()Navid Emamdoost1-0/+1
In the implementation of uld_send(), the skb is consumed on all execution paths except one. Release skb when returning NET_XMIT_DROP. Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller1-23/+24
Minor overlapping changes in xfrm_device.c, between the double ESP trailing bug fix setting the XFRM_INIT flag and the changes in net-next preparing for bonding encryption support. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24cxgb4: update kernel-doc line commentsRahul Lakkireddy1-8/+11
Update several kernel-doc line comments to fix warnings reported by make W=1. Fixes following class of warnings reported by make W=1 in several places: l2t.c:616: warning: Cannot understand * @dev: net_device pointer t4_hw.c:3175: warning: Function parameter or member 'adap' not described in 't4_get_exprom_version' t4_hw.c:3175: warning: Excess function parameter 'adapter' description in 't4_get_exprom_version' Fixes: 56d36be4dd5f ("cxgb4: Add HW and FW support code") Fixes: fd3a47900b6f ("cxgb4: Add packet queues and packet DMA code") Fixes: 26f7cbc0a5a4 ("cxgb4: Don't attempt to upgrade T4 firmware when cxgb4 will end up as a slave") Fixes: 793dad94e745 ("RDMA/cxgb4: Fix bug for active and passive LE hash collision path") Fixes: ba3f8cd55f2a ("cxgb4: Add support in cxgb4 to get expansion rom version via ethtool") Fixes: f7502659cec8 ("cxgb4: Add API to alloc l2t entry; also update existing ones") Fixes: ddc7740d9a7c ("cxgb4: Decode link down reason code obtained from firmware") Fixes: 193c4c2845f7 ("cxgb4: Update T6 Buffer Group and Channel Mappings") Fixes: 8f46d46715a1 ("cxgb4: Use Firmware params to get buffer-group map") Fixes: a456950445a0 ("cxgb4: time stamping interface for PTP") Fixes: 9c33e4208bce ("cxgb4: Add PTP Hardware Clock (PHC) support") Fixes: c3168cabe1af ("cxgb4/cxgbvf: Handle 32-bit fw port capabilities") Fixes: 5ccf9d049615 ("cxgb4: update API for TP indirect register access") Fixes: 3bdb376e6944 ("cxgb4: introduce SMT ops to prepare for SMAC rewrite support") Fixes: 736c3b94474e ("cxgb4: collect egress and ingress SGE queue contexts") Fixes: f56ec6766dcf ("cxgb4: Add support for ethtool i2c dump") Fixes: 9d5fd927d20b ("cxgb4/cxgb4vf: add support for ndo_set_vf_vlan") Fixes: 98f3697f8d41 ("cxgb4: add tc flower match support for tunnel VNI") Fixes: 02d805dc5fe3 ("cxgb4: use new fw interface to get the VIN and smt index") Fixes: 3f8cfd0d95e6 ("cxgb4/cxgb4vf: Program hash region for {t4/t4vf}_change_mac()") Fixes: d429005fdf2c ("cxgb4/cxgb4vf: Add support for SGE doorbell queue timer") Fixes: 0e395b3cb1fb ("cxgb4: add FLOWC based QoS offload") Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24cxgb4: remove cast when saving IPv4 partial checksumRahul Lakkireddy1-2/+1
The checksum field in IPv4 header is in __sum16 and ip_fast_csum() also returns __sum16. So, no need to cast it to u16. Fixes following sparse warning: sge.c:1539:47: warning: cast from restricted __sum16 sge.c:1539:44: warning: incorrect type in assignment (different base types) sge.c:1539:44: expected restricted __sum16 [usertype] check sge.c:1539:44: got unsigned short [usertype] Fixes: d0a1299c6bf7 ("cxgb4: add support for vxlan segmentation offload") Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24cxgb4: use unaligned conversion for fetching timestampRahul Lakkireddy1-1/+1
Use get_unaligned_be64() to fetch the timestamp needed for ns_to_ktime() conversion. Fixes following sparse warning: sge.c:3282:43: warning: cast to restricted __be64 Fixes: a456950445a0 ("cxgb4: time stamping interface for PTP") Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24cxgb4: move PTP lock and unlock to caller in Tx pathRahul Lakkireddy1-12/+11
Check for whether PTP is enabled or not at the caller and perform locking/unlocking at the caller. Fixes following sparse warning: sge.c:1641:26: warning: context imbalance in 'cxgb4_eth_xmit' - different lock contexts for basic block Fixes: a456950445a0 ("cxgb4: time stamping interface for PTP") Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-19cxgb4: Use struct_size() helperGustavo A. R. Silva1-1/+1
Make use of the struct_size() helper instead of an open-coded version in order to avoid any potential type mistakes. This code was detected with the help of Coccinelle and, audited and fixed manually. Addresses-KSPP-ID: https://github.com/KSPP/linux/issues/83 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15cxgb4: improve credits recovery in TC-MQPRIO Tx pathRahul Lakkireddy1-14/+26
Request credit update for every half credits consumed, including the current request. Also, avoid re-trying to post packets when there are no credits left. The credit update reply via interrupt will eventually restore the credits and will invoke the Tx path again. Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02cxgb4: Add missing annotation for service_ofldq()Jules Irenge1-0/+1
Sparse reports a warning at service_ofldq() warning: context imbalance in service_ofldq() - unexpected unlock The root cause is the missing annotation at service_ofldq() Add the missing __must_hold(&q->sendq.lock) annotation Signed-off-by: Jules Irenge <jbi.octave@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-01cxgb4: fix EOTID leak when disabling TC-MQPRIO offloadRahul Lakkireddy1-3/+36
Under heavy load, the EOTID termination FLOWC request fails to get enqueued to the end of the Tx ring due to lack of credits. This results in EOTID leak. When disabling TC-MQPRIO offload, the link is already brought down to cleanup EOTIDs. So, flush any pending enqueued skbs that can't be sent outside the wire, to make room for FLOWC request. Also, move the FLOWC descriptor consumption logic closer to when the FLOWC request is actually posted to hardware. Fixes: 0e395b3cb1fb ("cxgb4: add FLOWC based QoS offload") Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller1-42/+10
Overlapping header include additions in macsec.c A bug fix in 'net' overlapping with the removal of 'version' string in ena_netdev.c Overlapping test additions in selftests Makefile Overlapping PCI ID table adjustments in iwlwifi driver. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-20cxgb4: fix Txq restart check during backpressureRahul Lakkireddy1-2/+8
Driver reclaims descriptors in much smaller batches, even if hardware indicates more to reclaim, during backpressure. So, fix the check to restart the Txq during backpressure, by looking at how many descriptors hardware had indicated to reclaim, and not on how many descriptors that driver had actually reclaimed. Once the Txq is restarted, driver will reclaim even more descriptors when Tx path is entered again. Fixes: d429005fdf2c ("cxgb4/cxgb4vf: Add support for SGE doorbell queue timer") Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-20cxgb4: fix throughput drop during Tx backpressureRahul Lakkireddy1-40/+2
commit 7c3bebc3d868 ("cxgb4: request the TX CIDX updates to status page") reverted back to getting Tx CIDX updates via DMA, instead of interrupts, introduced by commit d429005fdf2c ("cxgb4/cxgb4vf: Add support for SGE doorbell queue timer") However, it missed reverting back several code changes where Tx CIDX updates are not explicitly requested during backpressure when using interrupt mode. These missed changes cause slow recovery during backpressure because the corresponding interrupt no longer comes and hence results in Tx throughput drop. So, revert back these missed code changes, as well, which will allow explicitly requesting Tx CIDX updates when backpressure happens. This enables the corresponding interrupt with Tx CIDX update message to get generated and hence speed up recovery and restore back throughput. Fixes: 7c3bebc3d868 ("cxgb4: request the TX CIDX updates to status page") Fixes: d429005fdf2c ("cxgb4/cxgb4vf: Add support for SGE doorbell queue timer") Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-09cxgb4/chcr: complete record tx handlingRohit Maheshwari1-0/+5
Added tx handling in this patch. This includes handling of segments contain single complete record. v1->v2: - chcr_write_cpl_set_tcb_ulp is added in this patch. v3->v4: - mss calculation logic. - replaced kfree_skb with dev_kfree_skb_any. - corrected error message reported by kbuild test robot <lkp@intel.com> Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-23cxgb4: add stats for MQPRIO QoS offload Tx pathRahul Lakkireddy1-0/+14
Export necessary stats for traffic flowing through MQPRIO QoS offload Tx path. v2: - No change. Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-11-23cxgb4: add UDP segmentation offload supportRahul Lakkireddy1-47/+112
Implement and export UDP segmentation offload (USO) support for both NIC and MQPRIO QoS offload Tx path. Update appropriate logic in Tx to parse GSO info in skb and configure FW_ETH_TX_EO_WR request needed to perform USO. v2: - Remove inline keyword from write_eo_udp_wr() in sge.c. Let the compiler decide. Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-11-23cxgb4/chcr: update SGL DMA unmap for USORahul Lakkireddy1-88/+31
The FW_ETH_TX_EO_WR used for sending UDP Segmentation Offload (USO) requests expects the headers to be part of the descriptor and the payload to be part of the SGL containing the DMA mapped addresses. Hence, the DMA address in the first entry of the SGL can start after the packet headers. Currently, unmap_sgl() tries to unmap from this wrong offset, instead of the originally mapped DMA address. So, use existing unmap_skb() instead, which takes originally saved DMA addresses as input. Update all necessary Tx paths to save the original DMA addresses, so that unmap_skb() can unmap them properly. v2: - No change. Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-11-20cxgb4: remove unneeded semicolon for switch blockRahul Lakkireddy1-1/+1
Semicolon is not required at the end of switch block. So, remove it. Addresses coccinelle warning: drivers/net/ethernet/chelsio/cxgb4/sge.c:2260:2-3: Unneeded semicolon Fixes: 4846d5330daf ("cxgb4: add Tx and Rx path for ETHOFLD traffic") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-12cxgb4: remove redundant assignment to hdr_lenColin Ian King1-1/+0
Variable hdr_len is being assigned a value that is never read. The assignment is redundant and hence can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-07cxgb4: add FLOWC based QoS offloadRahul Lakkireddy1-6/+144
Rework SCHED API to allow offloading TC-MQPRIO QoS configuration. The existing QUEUE based rate limiting throttles all queues sharing a traffic class, to the specified max rate limit value. So, if multiple queues share a traffic class, then all the queues get the aggregate specified max rate limit. So, introduce the new FLOWC based rate limiting, where multiple queues can share a traffic class with each queue getting its own individual specified max rate limit. For example, if 2 queues are bound to class 0, which is rate limited to 1 Gbps, then 2 queues using QUEUE based rate limiting, get the aggregate output of 1 Gbps only. In FLOWC based rate limiting, each queue gets its own output of max 1 Gbps each; i.e. 2 queues * 1 Gbps rate limit = 2 Gbps. Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-07cxgb4: add Tx and Rx path for ETHOFLD trafficRahul Lakkireddy1-48/+375
Implement Tx path for traffic flowing through software EOSW_TXQ and EOHW_TXQ. Since multiple EOSW_TXQ can post packets to a single EOHW_TXQ, protect the hardware queue with necessary spinlock. Also, move common code used to generate TSO work request to a common function. Implement Rx path to handle Tx completions for successfully transmitted packets. Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-07cxgb4: add ETHOFLD hardware queue supportRahul Lakkireddy1-25/+70
Add support for configuring and managing ETHOFLD hardware queues. Keep the queue count and MSI-X allocation scheme same as NIC queues. ETHOFLD hardware queues are dynamically allocated/destroyed as TC-MQPRIO Qdisc offload is enabled/disabled on the corresponding interface, respectively. Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-07cxgb4: parse and configure TC-MQPRIO offloadRahul Lakkireddy1-46/+116
Add logic for validation and configuration of TC-MQPRIO Qdisc offload. Also, add support to manage EOSW_TXQ, which have 1-to-1 mapping with EOTIDs, and expose them to network stack. Move common skb validation in Tx path to a separate function and add minimal Tx path for ETHOFLD. Update Tx queue selection to return normal NIC Txq to send traffic pattern that can't go through ETHOFLD Tx path. Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-07cxgb4: rework queue config and MSI-X allocationRahul Lakkireddy1-1/+12
Simplify queue configuration and MSI-X allocation logic. Use a single MSI-X information table for both NIC and ULDs. Remove hard-coded MSI-X indices for firmware event queue and non data interrupts. Instead, use the MSI-X bitmap to obtain a free MSI-X index dynamically. Save each Rxq's index into the MSI-X information table, within the Rxq structures themselves, for easier cleanup. Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-26cxgb4: request the TX CIDX updates to status pageRaju Rangoju1-6/+2
For adapters which support the SGE Doorbell Queue Timer facility, we configured the Ethernet TX Queues to send CIDX Updates to the Associated Ethernet RX Response Queue with CPL_SGE_EGR_UPDATE messages to allow us to respond more quickly to the CIDX Updates. But, this was adding load to PCIe Link RX bandwidth and, potentially, resulting in higher CPU Interrupt load. This patch requests the HW to deliver the CIDX updates to the TX queue status page rather than generating an ingress queue message (as an interrupt). With this patch, the load on RX bandwidth is reduced and a substantial improvement in BW is noticed at lower IO sizes. Fixes: d429005fdf2c ("cxgb4/cxgb4vf: Add support for SGE doorbell queue timer") Signed-off-by: Raju Rangoju <rajur@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-26chelsio: use BUG() instead of BUG_ON(1)Arnd Bergmann1-1/+1
clang warns about possible bugs in a dead code branch after BUG_ON(1) when CONFIG_PROFILE_ALL_BRANCHES is enabled: drivers/net/ethernet/chelsio/cxgb4/sge.c:479:3: error: variable 'buf_size' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized] BUG_ON(1); ^~~~~~~~~ include/asm-generic/bug.h:61:36: note: expanded from macro 'BUG_ON' #define BUG_ON(condition) do { if (unlikely(condition)) BUG(); } while (0) ^~~~~~~~~~~~~~~~~~~ include/linux/compiler.h:48:23: note: expanded from macro 'unlikely' # define unlikely(x) (__branch_check__(x, 0, __builtin_constant_p(x))) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/net/ethernet/chelsio/cxgb4/sge.c:482:9: note: uninitialized use occurs here return buf_size; ^~~~~~~~ drivers/net/ethernet/chelsio/cxgb4/sge.c:479:3: note: remove the 'if' if its condition is always true BUG_ON(1); ^ include/asm-generic/bug.h:61:32: note: expanded from macro 'BUG_ON' #define BUG_ON(condition) do { if (unlikely(condition)) BUG(); } while (0) ^ drivers/net/ethernet/chelsio/cxgb4/sge.c:459:14: note: initialize the variable 'buf_size' to silence this warning int buf_size; ^ = 0 Use BUG() here to create simpler code that clang understands correctly. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-04cxgb4/chtls: Prefix adapter flags with CXGB4Arjun Vynipadath1-6/+6
Some of these macros were conflicting with global namespace, hence prefixing them with CXGB4. Signed-off-by: Arjun Vynipadath <arjun@chelsio.com> Signed-off-by: Vishal Kulkarni <vishal@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-14cxgb4/cxgb4vf: Add support for SGE doorbell queue timerVishal Kulkarni1-56/+266
T6 introduced a Timer Mechanism in SGE called the SGE Doorbell Queue Timer. With this we can now configure TX Queues to get CIDX Updates when: Time(CIDX == PIDX) >= Timer Previously we rely on TX Queue Status Page updates by hardware for DMA completions. This will make Hardware/Firmware actually deliver the CIDX Updates as Ingress Queue messages with commensurate Interrupts. So we now have a new RX Path component for processing CIDX Updates and reclaiming TX Descriptors faster. Original work by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Vishal Kulkarni <vishal@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-08cross-tree: phase out dma_zalloc_coherent()Luis Chamberlain1-1/+1
We already need to zero out memory for dma_alloc_coherent(), as such using dma_zalloc_coherent() is superflous. Phase it out. This change was generated with the following Coccinelle SmPL patch: @ replace_dma_zalloc_coherent @ expression dev, size, data, handle, flags; @@ -dma_zalloc_coherent(dev, size, handle, flags) +dma_alloc_coherent(dev, size, handle, flags) Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> [hch: re-ran the script on the latest tree] Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-09-14cxgb4: add per rx-queue counter for packet errorsGanesh Goudar1-0/+4
print per rx-queue packet errors in sge_qinfo Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12cxgb4: specify IQTYPE in fw_iq_cmdArjun Vynipadath1-1/+3
congestion argument passed to t4_sge_alloc_rxq() is used to differentiate between nic/ofld queues. Signed-off-by: Arjun Vynipadath <arjun@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-29cxgb4: Add support for FW_ETH_TX_PKT_VM_WRArjun Vynipadath1-2/+370
The present TX workrequest(FW_ETH_TX_PKT_WR) cant be used for host->vf communication, since it doesn't loopback the outgoing packets to virtual interfaces on the same port. This can be done using FW_ETH_TX_PKT_VM_WR. This fix depends on ethtool_flags to determine what WR to use for TX path. Support for setting this flags by user is added in next commit. Based on the original work by : Casey Leedom <leedom@chelsio.com> Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Arjun Vynipadath <arjun@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-13treewide: kzalloc_node() -> kcalloc_node()Kees Cook1-1/+1
The kzalloc_node() function has a 2-factor argument form, kcalloc_node(). This patch replaces cases of: kzalloc_node(a * b, gfp, node) with: kcalloc_node(a * b, gfp, node) as well as handling cases of: kzalloc_node(a * b * c, gfp, node) with: kzalloc_node(array3_size(a, b, c), gfp, node) as it's slightly less ugly than: kcalloc_node(array_size(a, b), c, gfp, node) This does, however, attempt to ignore constant size factors like: kzalloc_node(4 * 1024, gfp, node) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( kzalloc_node( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) | kzalloc_node( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( kzalloc_node( - sizeof(u8) * (COUNT) + COUNT , ...) | kzalloc_node( - sizeof(__u8) * (COUNT) + COUNT , ...) | kzalloc_node( - sizeof(char) * (COUNT) + COUNT , ...) | kzalloc_node( - sizeof(unsigned char) * (COUNT) + COUNT , ...) | kzalloc_node( - sizeof(u8) * COUNT + COUNT , ...) | kzalloc_node( - sizeof(__u8) * COUNT + COUNT , ...) | kzalloc_node( - sizeof(char) * COUNT + COUNT , ...) | kzalloc_node( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( - kzalloc_node + kcalloc_node ( - sizeof(TYPE) * (COUNT_ID) + COUNT_ID, sizeof(TYPE) , ...) | - kzalloc_node + kcalloc_node ( - sizeof(TYPE) * COUNT_ID + COUNT_ID, sizeof(TYPE) , ...) | - kzalloc_node + kcalloc_node ( - sizeof(TYPE) * (COUNT_CONST) + COUNT_CONST, sizeof(TYPE) , ...) | - kzalloc_node + kcalloc_node ( - sizeof(TYPE) * COUNT_CONST + COUNT_CONST, sizeof(TYPE) , ...) | - kzalloc_node + kcalloc_node ( - sizeof(THING) * (COUNT_ID) + COUNT_ID, sizeof(THING) , ...) | - kzalloc_node + kcalloc_node ( - sizeof(THING) * COUNT_ID + COUNT_ID, sizeof(THING) , ...) | - kzalloc_node + kcalloc_node ( - sizeof(THING) * (COUNT_CONST) + COUNT_CONST, sizeof(THING) , ...) | - kzalloc_node + kcalloc_node ( - sizeof(THING) * COUNT_CONST + COUNT_CONST, sizeof(THING) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ - kzalloc_node + kcalloc_node ( - SIZE * COUNT + COUNT, SIZE , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( kzalloc_node( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc_node( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc_node( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc_node( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc_node( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc_node( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc_node( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc_node( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( kzalloc_node( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kzalloc_node( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kzalloc_node( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kzalloc_node( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kzalloc_node( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) | kzalloc_node( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( kzalloc_node( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc_node( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc_node( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc_node( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc_node( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc_node( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc_node( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc_node( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products, // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( kzalloc_node(C1 * C2 * C3, ...) | kzalloc_node( - (E1) * E2 * E3 + array3_size(E1, E2, E3) , ...) | kzalloc_node( - (E1) * (E2) * E3 + array3_size(E1, E2, E3) , ...) | kzalloc_node( - (E1) * (E2) * (E3) + array3_size(E1, E2, E3) , ...) | kzalloc_node( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants, // keeping sizeof() as the second factor argument. @@ expression THING, E1, E2; type TYPE; constant C1, C2, C3; @@ ( kzalloc_node(sizeof(THING) * C2, ...) | kzalloc_node(sizeof(TYPE) * C2, ...) | kzalloc_node(C1 * C2 * C3, ...) | kzalloc_node(C1 * C2, ...) | - kzalloc_node + kcalloc_node ( - sizeof(TYPE) * (E2) + E2, sizeof(TYPE) , ...) | - kzalloc_node + kcalloc_node ( - sizeof(TYPE) * E2 + E2, sizeof(TYPE) , ...) | - kzalloc_node + kcalloc_node ( - sizeof(THING) * (E2) + E2, sizeof(THING) , ...) | - kzalloc_node + kcalloc_node ( - sizeof(THING) * E2 + E2, sizeof(THING) , ...) | - kzalloc_node + kcalloc_node ( - (E1) * E2 + E1, E2 , ...) | - kzalloc_node + kcalloc_node ( - (E1) * (E2) + E1, E2 , ...) | - kzalloc_node + kcalloc_node ( - E1 * E2 + E1, E2 , ...) ) Signed-off-by: Kees Cook <keescook@chromium.org>
2018-06-04net: chelsio: Use zeroing memory allocator instead of allocator/memsetYueHaibing1-2/+1
Use dma_zalloc_coherent for allocating zeroed memory and remove unnecessary memset function. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>