summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-03-11afs: Fix potential thrashing in afs writebackDavid Howells1-1/+8
In afs_writepages_region(), if the dirty page we find is undergoing writeback or write to cache, but the sync_mode is WB_SYNC_NONE, we go round the loop trying the same page again and again with no pausing or waiting unless and until another thread manages to clear the writeback and fscache flags. Fix this with three measures: (1) Advance start to after the page we found. (2) Break out of the loop and return if rescheduling is requested. (3) Arbitrarily give up after a maximum of 5 skips. Fixes: 31143d5d515e ("AFS: implement basic file write support") Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Marc Dionne <marc.dionne@auristor.com> Acked-by: Marc Dionne <marc.dionne@auristor.com> Link: https://lore.kernel.org/r/164692725757.2097000.2060513769492301854.stgit@warthog.procyon.org.uk/ # v1 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-11x86/traps: Mark do_int3() NOKPROBE_SYMBOLLi Huafei1-0/+1
Since kprobe_int3_handler() is called in do_int3(), probing do_int3() can cause a breakpoint recursion and crash the kernel. Therefore, do_int3() should be marked as NOKPROBE_SYMBOL. Fixes: 21e28290b317 ("x86/traps: Split int3 handler up") Signed-off-by: Li Huafei <lihuafei1@huawei.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20220310120915.63349-1-lihuafei1@huawei.com
2022-03-11watch_queue: Make comment about setting ->defunct more accurateDavid Howells1-1/+1
watch_queue_clear() has a comment stating that setting ->defunct to true preventing new additions as well as preventing notifications. Whilst the latter is true, the first bit is superfluous since at the time this function is called, the pipe cannot be accessed to add new event sources. Remove the "new additions" bit from the comment. Fixes: c73be61cede5 ("pipe: Add general notification queue support") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-11watch_queue: Fix lack of barrier/sync/lock between post and readDavid Howells2-2/+3
There's nothing to synchronise post_one_notification() versus pipe_read(). Whilst posting is done under pipe->rd_wait.lock, the reader only takes pipe->mutex which cannot bar notification posting as that may need to be made from contexts that cannot sleep. Fix this by setting pipe->head with a barrier in post_one_notification() and reading pipe->head with a barrier in pipe_read(). If that's not sufficient, the rd_wait.lock will need to be taken, possibly in a ->confirm() op so that it only applies to notifications. The lock would, however, have to be dropped before copy_page_to_iter() is invoked. Fixes: c73be61cede5 ("pipe: Add general notification queue support") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-11watch_queue: Free the alloc bitmap when the watch_queue is torn downDavid Howells1-0/+1
Free the watch_queue note allocation bitmap when the watch_queue is destroyed. Fixes: c73be61cede5 ("pipe: Add general notification queue support") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-11watch_queue: Fix the alloc bitmap size to reflect notes allocatedDavid Howells1-1/+2
Currently, watch_queue_set_size() sets the number of notes available in wqueue->nr_notes according to the number of notes allocated, but sets the size of the bitmap to the unrounded number of notes originally asked for. Fix this by setting the bitmap size to the number of notes we're actually going to make available (ie. the number allocated). Fixes: c73be61cede5 ("pipe: Add general notification queue support") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-11watch_queue: Use the bitmap API when applicableChristophe JAILLET1-5/+2
Use bitmap_alloc() to simplify code, improve the semantic and reduce some open-coded arithmetic in allocator arguments. Also change a memset(0xff) into an equivalent bitmap_fill() to keep consistency. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-11watch_queue: Fix to always request a pow-of-2 pipe ring sizeDavid Howells1-1/+1
The pipe ring size must always be a power of 2 as the head and tail pointers are masked off by AND'ing with the size of the ring - 1. watch_queue_set_size(), however, lets you specify any number of notes between 1 and 511. This number is passed through to pipe_resize_ring() without checking/forcing its alignment. Fix this by rounding the number of slots required up to the nearest power of two. The request is meant to guarantee that at least that many notifications can be generated before the queue is full, so rounding down isn't an option, but, alternatively, it may be better to give an error if we aren't allowed to allocate that much ring space. Fixes: c73be61cede5 ("pipe: Add general notification queue support") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-11watch_queue: Fix to release page in ->release()David Howells1-0/+1
When a pipe ring descriptor points to a notification message, the refcount on the backing page is incremented by the generic get function, but the release function, which marks the bitmap, doesn't drop the page ref. Fix this by calling generic_pipe_buf_release() at the end of watch_queue_pipe_buf_release(). Fixes: c73be61cede5 ("pipe: Add general notification queue support") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-11watch_queue, pipe: Free watchqueue state after clearing pipe ringDavid Howells1-3/+5
In free_pipe_info(), free the watchqueue state after clearing the pipe ring as each pipe ring descriptor has a release function, and in the case of a notification message, this is watch_queue_pipe_buf_release() which tries to mark the allocation bitmap that was previously released. Fix this by moving the put of the pipe's ref on the watch queue to after the ring has been cleared. We still need to call watch_queue_clear() before doing that to make sure that the pipe is disconnected from any notification sources first. Fixes: c73be61cede5 ("pipe: Add general notification queue support") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-11watch_queue: Fix filter limit checkDavid Howells2-3/+4
In watch_queue_set_filter(), there are a couple of places where we check that the filter type value does not exceed what the type_filter bitmap can hold. One place calculates the number of bits by: if (tf[i].type >= sizeof(wfilter->type_filter) * 8) which is fine, but the second does: if (tf[i].type >= sizeof(wfilter->type_filter) * BITS_PER_LONG) which is not. This can lead to a couple of out-of-bounds writes due to a too-large type: (1) __set_bit() on wfilter->type_filter (2) Writing more elements in wfilter->filters[] than we allocated. Fix this by just using the proper WATCH_TYPE__NR instead, which is the number of types we actually know about. The bug may cause an oops looking something like: BUG: KASAN: slab-out-of-bounds in watch_queue_set_filter+0x659/0x740 Write of size 4 at addr ffff88800d2c66bc by task watch_queue_oob/611 ... Call Trace: <TASK> dump_stack_lvl+0x45/0x59 print_address_description.constprop.0+0x1f/0x150 ... kasan_report.cold+0x7f/0x11b ... watch_queue_set_filter+0x659/0x740 ... __x64_sys_ioctl+0x127/0x190 do_syscall_64+0x43/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae Allocated by task 611: kasan_save_stack+0x1e/0x40 __kasan_kmalloc+0x81/0xa0 watch_queue_set_filter+0x23a/0x740 __x64_sys_ioctl+0x127/0x190 do_syscall_64+0x43/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae The buggy address belongs to the object at ffff88800d2c66a0 which belongs to the cache kmalloc-32 of size 32 The buggy address is located 28 bytes inside of 32-byte region [ffff88800d2c66a0, ffff88800d2c66c0) Fixes: c73be61cede5 ("pipe: Add general notification queue support") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-11ice: Support GTP-U and GTP-C offload in switchdevMarcin Szycik8-24/+765
Add support for creating filters for GTP-U and GTP-C in switchdev mode. Add support for parsing GTP-specific options (QFI and PDU type) and TEID. By default, a filter for GTP-U will be added. To add a filter for GTP-C, specify enc_dst_port = 2123, e.g.: tc filter add dev $GTP0 ingress prio 1 flower enc_key_id 1337 \ enc_dst_port 2123 action mirred egress redirect dev $VF1_PR Note: GTP-U with outer IPv6 offload is not supported yet. Note: GTP-U with no payload offload is not supported yet. Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-11ice: Fix FV offset searchingMichal Swiatkowski3-51/+12
Checking only protocol ids while searching for correct FVs can lead to a situation, when incorrect FV will be added to the list. Incorrect means that FV has correct protocol id but incorrect offset. Call ice_get_sw_fv_list with ice_prot_lkup_ext struct which contains all protocol ids with offsets. With this modification allocating and collecting protocol ids list is not longer needed. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-11gtp: Add support for checking GTP device typeWojciech Drewek1-0/+6
Add a function that checks if a net device type is GTP. Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Reviewed-by: Harald Welte <laforge@gnumonks.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-11net/sched: Allow flower to match on GTP optionsWojciech Drewek4-1/+139
Options are as follows: PDU_TYPE:QFI and they refernce to the fields from the PDU Session Protocol. PDU Session data is conveyed in GTP-U Extension Header. GTP-U Extension Header is described in 3GPP TS 29.281. PDU Session Protocol is described in 3GPP TS 38.415. PDU_TYPE - indicates the type of the PDU Session Information (4 bits) QFI - QoS Flow Identifier (6 bits) # ip link add gtp_dev type gtp role sgsn # tc qdisc add dev gtp_dev ingress # tc filter add dev gtp_dev protocol ip parent ffff: \ flower \ enc_key_id 11 \ gtp_opts 1:8/ff:ff \ action mirred egress redirect dev eth0 Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-11gtp: Implement GTP echo requestWojciech Drewek2-37/+269
Adding GTP device through ip link creates the situation where GTP instance is not able to send GTP echo requests. Echo requests are used to check if GTP peer is still alive. With this patch, gtp_genl_ops are extended by new cmd (GTP_CMD_ECHOREQ) which allows to send echo request in the given version of GTP protocol (v0 or v1), from the given ms address to he given peer. TID is not inclued because in all path management messages it should be equal to 0. When GTP echo response is detected, multicast message is send to everyone in the gtp_genl_family. Message contains GTP version, ms address and peer address. Suggested-by: Harald Welte <laforge@gnumonks.org> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Reviewed-by: Harald Welte <laforge@gnumonks.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-11gtp: Implement GTP echo responseWojciech Drewek3-16/+229
Adding GTP device through ip link creates the situation where there is no userspace daemon which would handle GTP messages (Echo Request for example). GTP-U instance which would not respond to echo requests would violate GTP specification. When GTP packet arrives with GTP_ECHO_REQ message type, GTP_ECHO_RSP is send to the sender. GTP_ECHO_RSP message should contain information element with GTPIE_RECOVERY tag and restart counter value. For GTPv1 restart counter is not used and should be equal to 0, for GTPv0 restart counter contains information provided from userspace(IFLA_GTP_RESTART_COUNT). Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Suggested-by: Harald Welte <laforge@gnumonks.org> Reviewed-by: Harald Welte <laforge@gnumonks.org> Tested-by: Harald Welte <laforge@gnumonks.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-11gtp: Allow to create GTP device without FDsWojciech Drewek2-17/+85
Currently, when the user wants to create GTP device, he has to provide file handles to the sockets created in userspace (IFLA_GTP_FD0, IFLA_GTP_FD1). This behaviour is not ideal, considering the option of adding support for GTP device creation through ip link. Ip link application is not a good place to create such sockets. This patch allows to create GTP device without providing IFLA_GTP_FD0 and IFLA_GTP_FD1 arguments. If the user sets IFLA_GTP_CREATE_SOCKETS attribute, then GTP module takes care of creating UDP sockets by itself. Sockets are created with the commonly known UDP ports used for GTP protocol (GTP0_PORT and GTP1U_PORT). In this case we don't have to provide encap_destroy because no extra deinitialization is needed, everything is covered by udp_tunnel_sock_release. Note: GTP instance created with only this change applied, does not handle GTP Echo Requests. This is implemented in the following patch. Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-11Merge branch 'ptp-ocp-new-firmware-support'David S. Miller2-156/+1144
Jonathan Lemon says: ==================== ptp: ocp: support for new firmware This series contains support for new firmware features for the timecard. v1 -> v2: roundup() is not 32-bit safe, use DIV_ROUND_UP_ULL ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-11docs: ABI: Document new timecard sysfs nodes.Jonathan Lemon1-1/+93
Add sysfs nodes for the frequency generator and signal counters. Update SMA selector lists for these, and also add the new 'None', 'VCC' 'GND' selectors. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-11ptp: ocp: Add 2 more timestampersJonathan Lemon1-5/+56
The timecard now has 4 general purpose timestampers. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-11ptp: ocp: Add 4 frequency countersJonathan Lemon1-7/+165
Input signals can be steered to any of the frequency counters. The counter measures the frequency over a number of seconds: echo 0 > freq1/seconds = turns off measurement echo 1 > freq1/seconds = sets period & turns on measurment. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-11ptp: ocp: Program the signal generators via PTP_CLK_REQ_PEROUTJonathan Lemon1-9/+94
The signal generators can be programmed either via the sysfs file or through a PTP_CLK_REQ_PEROUT ioctl request. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-11ptp: ocp: Add signal generators and update sysfs nodesJonathan Lemon1-10/+476
Newer firmware provides 4 programmable signal generators, add support for those here. The signal generators provide the ability to set the period, duty cycle, phase offset, and polarity, with new values defaulting to prior values. The period and phase offset are specified in nanoseconds. E.g: period [duty [phase [polarity]]] echo 500000000 > signal # 1/2 second period echo 1000000 40 100 > signal # 1ms period, 40% on, offset 100ns echo 0 > signal # turn off generator Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-11ptp: ocp: Add firmware capability bits for feature gatingJonathan Lemon1-5/+33
Add the ability to group sysfs nodes behind a firmware feature check. This way non-present sysfs attributes are omitted on older firmware, which does not have newer features. This will be used in the upcoming patches which adds more features to the timecard. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-11ptp: ocp: Add GND and VCC output selectorsJonathan Lemon1-0/+2
These will provide constant outputs. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-11ptp: ocp: Rename output selector 'GNSS' to 'GNSS1'Jonathan Lemon1-4/+4
As there are may be 2 GNSS outputs, rename the first one for clarity. This also works around a parsing issue when specifying selectors. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-11ptp: ocp: Add ability to disable input selectors.Jonathan Lemon1-11/+23
This adds support for the "IN: None" selector, which disables the input on a sma pin. This should be compatible with old firmware (the firmware will ignore it if not supported). Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-11ptp: ocp: Add support for selectable SMA directions.Jonathan Lemon1-117/+211
Assuming the firmware allows it, the direction of each SMA connector is no longer fixed. Handle remapping directions for each pin. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-11net: lan966x: Improve the CPU TX bitrate.Horatiu Vultur1-0/+3
When doing manual injection of the frame, it is required to check if the TX FIFO is ready to accept the next word of the frame. For this we are using 'readx_poll_timeout_atomic', the only problem is that before it actually checks the status, is determining the time when to finish polling the status. Which seems to be an expensive operation. Therefore check the status of the TX FIFO before calling 'readx_poll_timeout_atomic'. Doing this will improve the TX bitrate by ~70%. Because 99% the FIFO is ready by that time. The measurements were done using iperf3. Before: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.03 sec 55.2 MBytes 46.2 Mbits/sec 0 sender [ 5] 0.00-10.04 sec 53.8 MBytes 45.0 Mbits/sec receiver After: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.10 sec 95.0 MBytes 78.9 Mbits/sec 0 sender [ 5] 0.00-10.11 sec 95.0 MBytes 78.8 Mbits/sec receiver Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-11net: ethernet: ezchip: fix platform_get_irq.cocci warningYihao Han1-1/+0
Remove dev_err() messages after platform_get_irq*() failures. platform_get_irq() already prints an error. Generated by: scripts/coccinelle/api/platform_get_irq.cocci Signed-off-by: Yihao Han <hanyihao@vivo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-11flow_dissector: Add support for HSRv0Kurt Kanzenbach1-0/+1
Commit bf08824a0f47 ("flow_dissector: Add support for HSR") added support for HSR within the flow dissector. However, it only works for HSR in version 1. Version 0 uses a different Ether Type. Add support for it. Reported-by: Anthony Harivel <anthony.harivel@linutronix.de> Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-11net: mv643xx_eth: use platform_get_irq() instead of platform_get_resource()Minghao Chi1-5/+5
It is not recommened to use platform_get_resource(pdev, IORESOURCE_IRQ) for requesting IRQ's resources any more, as they can be not ready yet in case of DT-booting. platform_get_irq() instead is a recommended way for getting IRQ even if it was not retrieved earlier. It also makes code simpler because we're getting "int" value right away and no conversion from resource to int is required. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-11net: ethernet: ti: davinci_emac: Use platform_get_irq() to get the interruptLad Prabhakar1-5/+20
platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static allocation of IRQ resources in DT core code, this causes an issue when using hierarchical interrupt domains using "interrupts" property in the node as this bypasses the hierarchical setup and messes up the irq chaining. In preparation for removal of static setup of IRQ resource from DT core code use platform_get_irq() for DT users only. While at it propagate error code in emac_dev_stop() in case platform_get_irq_optional() fails. Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-11net: ethernet: ti: am65-cpsw: Convert to PHYLINKSiddharth Vadapalli3-160/+129
Convert am65-cpsw driver and am65-cpsw ethtool to use Phylink APIs as described at Documentation/networking/sfp-phylink.rst. All calls to Phy APIs are replaced with their equivalent Phylink APIs. No functional change intended. Use Phylink instead of conventional Phylib, in preparation to add support for SGMII/QSGMII modes. Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-11mac80211: Add support to trigger sta disconnect on hardware restartYoughandhar Chintala4-3/+55
Currently in case of target hardware restart, we just reconfig and re-enable the security keys and enable the network queues to start data traffic back from where it was interrupted. Many ath10k wifi chipsets have sequence numbers for the data packets assigned by firmware and the mac sequence number will restart from zero after target hardware restart leading to mismatch in the sequence number expected by the remote peer vs the sequence number of the frame sent by the target firmware. This mismatch in sequence number will cause out-of-order packets on the remote peer and all the frames sent by the device are dropped until we reach the sequence number which was sent before we restarted the target hardware In order to fix this, we trigger a sta disconnect, in case of target hw restart. After this there will be a fresh connection and thereby avoiding the dropping of frames by remote peer. The right fix would be to pull the entire data path into the host which is not feasible or would need lots of complex changes and will still be inefficient. Tested on ath10k using WCN3990, QCA6174 Signed-off-by: Youghandhar Chintala <youghand@codeaurora.org> Link: https://lore.kernel.org/r/20220308115325.5246-2-youghand@codeaurora.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-03-11powerpc/net: Implement powerpc specific csum_shift() to remove branchChristophe Leroy2-0/+9
Today's implementation of csum_shift() leads to branching based on parity of 'offset' 000002f8 <csum_block_add>: 2f8: 70 a5 00 01 andi. r5,r5,1 2fc: 41 a2 00 08 beq 304 <csum_block_add+0xc> 300: 54 84 c0 3e rotlwi r4,r4,24 304: 7c 63 20 14 addc r3,r3,r4 308: 7c 63 01 94 addze r3,r3 30c: 4e 80 00 20 blr Use first bit of 'offset' directly as input of the rotation instead of branching. 000002f8 <csum_block_add>: 2f8: 54 a5 1f 38 rlwinm r5,r5,3,28,28 2fc: 20 a5 00 20 subfic r5,r5,32 300: 5c 84 28 3e rotlw r4,r4,r5 304: 7c 63 20 14 addc r3,r3,r4 308: 7c 63 01 94 addze r3,r3 30c: 4e 80 00 20 blr And change to left shift instead of right shift to skip one more instruction. This has no impact on the final sum. 000002f8 <csum_block_add>: 2f8: 54 a5 1f 38 rlwinm r5,r5,3,28,28 2fc: 5c 84 28 3e rotlw r4,r4,r5 300: 7c 63 20 14 addc r3,r3,r4 304: 7c 63 01 94 addze r3,r3 308: 4e 80 00 20 blr Seems like only powerpc benefits from a branchless implementation. Other main architectures like ARM or X86 get better code with the generic implementation and its branch. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-11Merge tag 'mlx5-updates-2022-03-10' of ↵David S. Miller33-69/+700
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2022-03-10 1) Leon removes useless includes from both mlx5 and mlx4 2) Tariq adds node awareness to some object allocations 3) Gal Cleanups and improvements to EEPROM query 4) Paul adds Software steering to Connection Tracking, to speed up CT Rules insertion. Paul Blakey Says: ================= To improve insertion rate, this series allows for using software steering API directly instead of going through the fs_core layer. This can be done for CT because it doesn't need fs_core layer extra facilities, such as autogroups, FTE IDs and modifications (which require a copy of the flow key/mask). Skipping fs_core layer also allows to create the software steering objects (dr_* objects) ahead of time and re-use them for multiple rules, whereas software steering under fs_core creates them on the fly and discards them. This in turn increased insertion rate. The series first introduces a lightweight CT flow steering provider with the first implementations using fs_core layer, and moves CT to use it. The next patches implement a provider using software steering directly, bypassing fs_core, and uses it if software steering is available. ================= ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-11mac80211: fix potential double free on mesh joinLinus Lüssing1-3/+0
While commit 6a01afcf8468 ("mac80211: mesh: Free ie data when leaving mesh") fixed a memory leak on mesh leave / teardown it introduced a potential memory corruption caused by a double free when rejoining the mesh: ieee80211_leave_mesh() -> kfree(sdata->u.mesh.ie); ... ieee80211_join_mesh() -> copy_mesh_setup() -> old_ie = ifmsh->ie; -> kfree(old_ie); This double free / kernel panics can be reproduced by using wpa_supplicant with an encrypted mesh (if set up without encryption via "iw" then ifmsh->ie is always NULL, which avoids this issue). And then calling: $ iw dev mesh0 mesh leave $ iw dev mesh0 mesh join my-mesh Note that typically these commands are not used / working when using wpa_supplicant. And it seems that wpa_supplicant or wpa_cli are going through a NETDEV_DOWN/NETDEV_UP cycle between a mesh leave and mesh join where the NETDEV_UP resets the mesh.ie to NULL via a memcpy of default_mesh_setup in cfg80211_netdev_notifier_call, which then avoids the memory corruption, too. The issue was first observed in an application which was not using wpa_supplicant but "Senf" instead, which implements its own calls to nl80211. Fixing the issue by removing the kfree()'ing of the mesh IE in the mesh join function and leaving it solely up to the mesh leave to free the mesh IE. Cc: stable@vger.kernel.org Fixes: 6a01afcf8468 ("mac80211: mesh: Free ie data when leaving mesh") Reported-by: Matthias Kretschmer <mathias.kretschmer@fit.fraunhofer.de> Signed-off-by: Linus Lüssing <ll@simonwunderlich.de> Tested-by: Mathias Kretschmer <mathias.kretschmer@fit.fraunhofer.de> Link: https://lore.kernel.org/r/20220310183513.28589-1-linus.luessing@c0d3.blue Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-03-11mac80211: correct legacy rates check in ieee80211_calc_rx_airtimeMeiChia Chiu1-1/+3
There are no legacy rates on 60GHz or sub-1GHz band, so modify the check. Signed-off-by: Ryder Lee <ryder.lee@mediatek.com> Signed-off-by: MeiChia Chiu <MeiChia.Chiu@mediatek.com> Link: https://lore.kernel.org/r/20220308021645.16272-1-MeiChia.Chiu@mediatek.com [Ghz -> GHz] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-03-11nl80211: fix typo of NL80211_IF_TYPE_OCB in documentationVeerendranath Jakkam1-1/+1
It should be NL80211_IFTYPE_OCB instead. Signed-off-by: Veerendranath Jakkam <quic_vjakkam@quicinc.com> Link: https://lore.kernel.org/r/1645542399-4680-1-git-send-email-quic_vjakkam@quicinc.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-03-11mac80211: Use GFP_KERNEL instead of GFP_ATOMIC when possibleChristophe JAILLET1-1/+1
Previous memory allocations in this function already use GFP_KERNEL, so use __dev_alloc_skb() and an explicit GFP_KERNEL instead of an implicit GFP_ATOMIC. This gives more opportunities of successful allocation. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/194a0e2ff00c3fae88cc9fba47431747360c8242.1645345378.git.christophe.jaillet@wanadoo.fr Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-03-11mac80211: replace DEFINE_SIMPLE_ATTRIBUTE with DEFINE_DEBUGFS_ATTRIBUTEYihao Han1-3/+3
Fix the following coccicheck warning: ./drivers/net/wireless/mac80211_hwsim.c:1040:0-23: WARNING: hwsim_fops_rx_rssi should be defined with DEFINE_DEBUGFS_ATTRIBUTE Signed-off-by: Yihao Han <hanyihao@vivo.com> Link: https://lore.kernel.org/r/20220218070228.6210-1-hanyihao@vivo.com [fix indentation] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-03-11net/mlx5e: Remove overzealous validations in netlink EEPROM queryGal Pressman1-23/+0
Unlike the legacy EEPROM callbacks, when using the netlink EEPROM query (get_module_eeprom_by_page) the driver should not try to validate the query parameters, but just perform the read requested by the userspace. Recent discussion in the mailing list: https://lore.kernel.org/netdev/20220120093051.70845141@kicinski-fedora-PC1C0HJN.hsd1.ca.comcast.net/ Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-11net/mlx5: Parse module mapping using mlx5_ifcGal Pressman3-8/+7
The assumption that the first byte in the module mapping dword is the module number shouldn't be hard-coded in the driver, but come from mlx5_ifc structs. While at it, fix the incorrect width for the 'rx_lane' and 'tx_lane' fields. Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-11net/mlx5: Query the maximum MCIA register read size from firmwareGal Pressman3-3/+10
The MCIA register supports either 12 or 32 dwords, use the correct value by querying the capability from the MCAM register. Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-11net/mlx5: CT: Create smfs dr matchers dynamicallyPaul Blakey1-88/+94
SMFS dr matchers are processed sequentially in hardware according to their priorities, and not skipped if empty. Currently, smfs ct fs creates four predefined dr matchers per ct table (ct/ct nat) with hardcoded priority. Compared to dmfs ct fs using autogroups, this might cause additional hops in fastpath for traffic patterns that match later priorties, even if previous priorites are empty, e.g user only using ipv6 UDP traffic will have additional 3 hops. Create the matchers dynamically, using the highest priority available, on first rule usage, and remove them on last usage. Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Oz Shlomo <ozsh@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-11net/mlx5: CT: Add software steering ct flow steering providerPaul Blakey6-7/+435
fs_core layer adds extra book keeping that is either unneeded for CT, or unused by the underlying software steering, such as allocating FTEs and FTE ids, saving the match key and mask, and autogroups management. On top of that, direct steering has a translation layer (fs_dr) from PRM commands to direct steering objects, for example, creating temporary dr_action objects. This has a performance impact when dealing with CT high insertion rate. To use direct steering (smfs) directly for ct, add a tc ct fs smfs implementation. Instead of dmfs autogroups, smfs ct fs uses one of 4 predefined dr matchers in CT and CT-NAT tables, for each combination of tuple ethertype (ipv4/ipv6), and tuple ip_proto (udp/tcp) that is currently used by nf flow table flow offload. At rule insertions, validate the flow rule fits one of the predfined matcher, and insert to it. To fill the dr_actions of the rule efficiently, create the fwd to post_ct tbl dr_action at fs init, the count dr_action at counter creation, and re-use the already pre-allocated modify header dr_action. Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Oz Shlomo <ozsh@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-11net/mlx5: Add smfs lib to export direct steering API to CTPaul Blakey3-1/+105
Add a thin layer that exports selected direct steering (dr) API which will be used by a ct fs implementation in a following patch. Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-11net/mlx5: DR, Add helper to get backing dr table from a mlx5 flow tablePaul Blakey2-0/+8
If sw steering was used to create the table, dr steeering fs creates a backing dr table for the mlx5 flow table. Add helper to return this table so it can be used to create matchers and add rules on it directly instead of passing via eswitch_offloads/fs_core insertion. Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Oz Shlomo <ozsh@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>