kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2024-04-29	platform/x86: ISST: Add Grand Ridge to HPM CPU list	Srinivas Pandruvada	1	-0/+1
	Add Grand Ridge (ATOM_CRESTMONT) to hpm_cpu_ids, so that MSR 0x54 can be used. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://lore.kernel.org/r/20240422212222.3881606-1-srinivas.pandruvada@linux.intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-04-29	bpf, docs: Clarify PC use in instruction-set.rst	Dave Thaler	1	-0/+6
	This patch elaborates on the use of PC by expanding the PC acronym, explaining the units, and the relative position to which the offset applies. Signed-off-by: Dave Thaler <dthaler1968@googlemail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/20240426231126.5130-1-dthaler1968@gmail.com
2024-04-29	Merge branch 'mlxsw-events-processing-performance'	David S. Miller	1	-54/+150
	Petr Machata says: ==================== mlxsw: Improve events processing performance Amit Cohen writes: Spectrum ASICs only support a single interrupt, it means that all the events are handled by one IRQ (interrupt request) handler. Currently, we schedule a tasklet to handle events in EQ, then we also use tasklet for CQ, SDQ and RDQ. Tasklet runs in softIRQ (software IRQ) context, and will be run on the same CPU which scheduled it. It means that today we have one CPU which handles all the packets (both network packets and EMADs) from hardware. The existing implementation is not efficient and can be improved. Measuring latency of EMADs in the driver (without the time in FW) shows that latency is increased by factor of 28 (x28) when network traffic is handled by the driver. Measuring throughput in CPU shows that CPU can handle ~35% less packets of specific flow when corrupted packets are also handled by the driver. There are cases that these values even worse, we measure decrease of ~44% packet rate. This can be improved if network packet and EMADs will be handled in parallel by several CPUs, and more than that, if different types of traffic will be handled in parallel. We can achieve this using NAPI. This set converts the driver to process completions from hardware via NAPI. The idea is to add NAPI instance per CQ (which is mapped 1:1 to SDQ/RDQ), which means that each DQ can be handled separately. we have DQ for EMADs and DQs for each trap group (like LLDP, BGP, L3 drops, etc..). See more details in commit messages. An additional improvement which is done as part of this set is related to doorbells' ring. The idea is to handle small chunks of Rx packets (which is also recommended using NAPI) and ring doorbells once per chunk. This reduces the access to hardware which is expensive (time wise) and might take time because of memory barriers. With this set we can see better performance. To summerize: EMADs latency: +------------------------------------------------------------------------+ \| \| Before this set \| Now \| \|------------------\|---------------------------\|-------------------------\| \| Increased factor \| x28 \| x1.5 \| +------------------------------------------------------------------------+ Note that we can see even measurements that show better latency when traffic is handled by the driver. Throughput: +------------------------------------------------------------------------+ \| \| Before this set \| Now \| \|-------------\|----------------------------\|-----------------------------\| \| Reduced \| 35% \| 6% \| \| packet rate \| \| \| +------------------------------------------------------------------------+ Additional improvements are planned - use page pool for buffer allocations and avoid cache miss of each SKB using napi_build_skb(). Patch set overview: Patches #1-#2 improve access to hardware by reducing dorbells' rings Patch #3-#4 are preaparations for NAPI usage Patch #5 converts the driver to use NAPI ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-29	mlxsw: pci: Use NAPI for event processing	Amit Cohen	1	-19/+77
	Spectrum ASICs only support a single interrupt, that means that all the events are handled by one IRQ (interrupt request) handler. Once an interrupt is received, we schedule tasklet to handle events from EQ and then schedule tasklets to handle completions from CQs. Tasklet runs in softIRQ (software IRQ) context, and will be run on the same CPU which scheduled it. That means that today we use only one CPU to handle all the packets (both network packets and EMADs) from hardware. This can be improved using NAPI. The idea is to use NAPI instance per CQ, which is mapped 1:1 to DQ (RDQ or SDQ). NAPI poll method can be run in kernel thread, so then the driver will be able to handle WQEs in several CPUs. Convert the existing code to use NAPI APIs. Add NAPI instance as part of 'struct mlxsw_pci_queue' and initialize it as part of CQs initialization. Set the appropriate poll method and dummy net device, according to queue number, similar to tasklet setup. For CQs which are used for completions of RDQ, use Rx poll method and 'napi_dev_rx', which is set as 'threaded'. It means that Rx poll method will run in kernel context, so several RDQs will be handled in parallel. For CQs which are used for completions of SDQ, use Tx poll method and 'napi_dev_tx', this method will run in softIRQ context, as it is recommended in NAPI documentation, as Tx packets' processing is short task. Convert mlxsw_pci_cq_{rx,tx}_tasklet() to poll methods. Handle 'budget' argument - ignore it in Tx poll method, as it is recommended to not limit Tx processing. For Rx processing, handle up to 'budget' completions. Return 'work_done' which is the amount of completions that were handled. Handle the following cases: 1. After processing 'budget' completions, the driver still has work to do: Return work-done = budget. In that case, the NAPI instance will be polled again (without the need to be rescheduled). Do not re-arm the queue, as NAPI will handle the reschedule, so we do not have to involve hardware to send an additional interrupt for the completions that should be processed. 2. Event processing has been completed: Call napi_complete_done() to mark NAPI processing as completed, which means that the poll method will not be rescheduled. Re-arm the queue, as all completions were handled. In case that poll method handled exactly 'budget' completions, return work-done = budget -1, to distinguish from the case that driver still has completions to handle. Otherwise, return the amount of completions that were handled. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-29	mlxsw: pci: Reorganize 'mlxsw_pci_queue' structure	Amit Cohen	1	-38/+38
	The next patch will set the driver to use NAPI for event processing. Then tasklet mechanism will be used only for EQ. Reorganize 'mlxsw_pci_queue' to hold EQ and CQ attributes in a union. For now, add tasklet for both EQ and CQ. This will be changed in the next patch, as 'tasklet_struct' will be replaced with NAPI instance. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-29	mlxsw: pci: Initialize dummy net devices for NAPI	Amit Cohen	1	-0/+41
	mlxsw will use NAPI for event processing in a next patch. As preparation, add two dummy net devices and initialize them. NAPI instance should be attached to net device. Usually each queue is used by a single net device in network drivers, so the mapping between net device to NAPI instance is intuitive. In our case, Rx queues are not per port, they are per trap-group. Tx queues are mapped to net devices, but we do not have a separate queue for each local port, several ports share the same queue. Use init_dummy_netdev() to initialize dummy net devices for NAPI. To run NAPI poll method in a kernel thread, the net device which NAPI instance is attached to should be marked as 'threaded'. It is recommended to handle Tx packets in softIRQ context, as usually this is a short task - just free the Tx packet which has been transmitted. Rx packets handling is more complicated task, so drivers can use a dedicated kernel thread to process them. It allows processing packets from different Rx queues in parallel. We would like to handle only Rx packets in kernel threads, which means that we will use two dummy net devices (one for Rx and one for Tx). Set only one of them with 'threaded' as it will be used for Rx processing. Do not fail in case that setting 'threaded' fails, as it is better to use regular softIRQ NAPI rather than preventing the driver from loading. Note that the net devices are initialized with init_dummy_netdev(), so they are not registered, which means that they will not be visible to user. It will not be possible to change 'threaded' configuration from user space, but it is reasonable in our case, as there is no another configuration which makes sense, considering that user has no influence on the usage of each queue. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-29	mlxsw: pci: Ring RDQ and CQ doorbells once per several completions	Amit Cohen	1	-7/+3
	Currently, for each CQE in CQ, we ring CQ doorbell, then handle RDQ and ring RDQ doorbell. Finally we ring CQ arm doorbell - once per CQ tasklet. The idea of ringing CQ doorbell before RDQ doorbell, is to be sure that when we post new WQE (after RDQ is handled), there is an available CQE. This was done because of a hardware bug as part of commit c9ebea04cb1b ("mlxsw: pci: Ring CQ's doorbell before RDQ's"). There is no real reason to ring RDQ and CQ doorbells for each completion, it is better to handle several completions and reduce number of ringings, as access to hardware is expensive (time wise) and might take time because of memory barriers. A previous patch changed CQ tasklet to handle up to 64 Rx packets. With this limitation, we can ring CQ and RDQ doorbells once per CQ tasklet. The counters of the doorbells are increased by the amount of packets that we handled, then the device will know for which completion to send an additional event. To avoid reordering CQ and RDQ doorbells' ring, let the tasklet to ring also RDQ doorbell, mlxsw_pci_cqe_rdq_handle() handles the counter but does not ring the doorbell. Note that with this change there is no need to copy the CQE, as we ring CQ doorbell only after Rx packet processing (which uses the CQE) is done. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-29	mlxsw: pci: Handle up to 64 Rx completions in tasklet	Amit Cohen	1	-2/+3
	We can get many completions in one interrupt. Currently, the CQ tasklet handles up to half queue size completions, and then arms the hardware to generate additional events, which means that in case that there were additional completions that we did not handle, we will get immediately an additional interrupt to handle the rest. The decision to handle up to half of the queue size is arbitrary and was determined in 2015, when mlxsw driver was added to the kernel. One additional fact that should be taken into account is that while WQEs from RDQ are handled, the CPU that handles the tasklet is dedicated for this task, which means that we might hold the CPU for a long time. Handle WQEs in smaller chucks, then arm CQ doorbell to notify the hardware to send additional notifications. Set the chunk size to 64 as this number is recommended using NAPI and the driver will use NAPI in a next patch. Note that for now we use ARM doorbell to retrigger CQ tasklet, but with NAPI it will be more efficient as software will reschedule the poll method and we will not involve hardware for that. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-29	firewire: ohci: fulfill timestamp for some local asynchronous transaction	Takashi Sakamoto	1	-2/+6
	1394 OHCI driver generates packet data for the response subaction to the request subaction to some local registers. In the case, the driver should assign timestamp to them by itself. This commit fulfills the timestamp for the subaction. Cc: stable@vger.kernel.org Fixes: dcadfd7f7c74 ("firewire: core: use union for callback of transaction completion") Link: https://lore.kernel.org/r/20240429084709.707473-1-o-takashi@sakamocchi.jp Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
2024-04-29	firewire: nosy: ensure user_length is taken into account when fetching ↵	Thanassis Avgerinos	1	-2/+4
	packet contents Ensure that packet_buffer_get respects the user_length provided. If the length of the head packet exceeds the user_length, packet_buffer_get will now return 0 to signify to the user that no data were read and a larger buffer size is required. Helps prevent user space overflows. Signed-off-by: Thanassis Avgerinos <thanassis.avgerinos@gmail.com> Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
2024-04-29	x86/sev: Add callback to apply RMP table fixups for kexec	Ashish Kalra	3	-0/+45
	Handle cases where the RMP table placement in the BIOS is not 2M aligned and the kexec-ed kernel could try to allocate from within that chunk which then causes a fatal RMP fault. The kexec failure is illustrated below: SEV-SNP: RMP table physical range [0x0000007ffe800000 - 0x000000807f0fffff] BIOS-provided physical RAM map: BIOS-e820: [mem 0x0000000000000000-0x000000000008efff] usable BIOS-e820: [mem 0x000000000008f000-0x000000000008ffff] ACPI NVS ... BIOS-e820: [mem 0x0000004080000000-0x0000007ffe7fffff] usable BIOS-e820: [mem 0x0000007ffe800000-0x000000807f0fffff] reserved BIOS-e820: [mem 0x000000807f100000-0x000000807f1fefff] usable As seen here in the e820 memory map, the end range of the RMP table is not aligned to 2MB and not reserved but it is usable as RAM. Subsequently, kexec -s (KEXEC_FILE_LOAD syscall) loads it's purgatory code and boot_param, command line and other setup data into this RAM region as seen in the kexec logs below, which leads to fatal RMP fault during kexec boot. Loaded purgatory at 0x807f1fa000 Loaded boot_param, command line and misc at 0x807f1f8000 bufsz=0x1350 memsz=0x2000 Loaded 64bit kernel at 0x7ffae00000 bufsz=0xd06200 memsz=0x3894000 Loaded initrd at 0x7ff6c89000 bufsz=0x4176014 memsz=0x4176014 E820 memmap: 0000000000000000-000000000008efff (1) 000000000008f000-000000000008ffff (4) 0000000000090000-000000000009ffff (1) ... 0000004080000000-0000007ffe7fffff (1) 0000007ffe800000-000000807f0fffff (2) 000000807f100000-000000807f1fefff (1) 000000807f1ff000-000000807fffffff (2) nr_segments = 4 segment[0]: buf=0x00000000e626d1a2 bufsz=0x4000 mem=0x807f1fa000 memsz=0x5000 segment[1]: buf=0x0000000029c67bd6 bufsz=0x1350 mem=0x807f1f8000 memsz=0x2000 segment[2]: buf=0x0000000045c60183 bufsz=0xd06200 mem=0x7ffae00000 memsz=0x3894000 segment[3]: buf=0x000000006e54f08d bufsz=0x4176014 mem=0x7ff6c89000 memsz=0x4177000 kexec_file_load: type:0, start:0x807f1fa150 head:0x1184d0002 flags:0x0 Check if RMP table start and end physical range in the e820 tables are not aligned to 2MB and in that case map this range to reserved in all the three e820 tables. [ bp: Massage. ] Fixes: c3b86e61b756 ("x86/cpufeatures: Enable/unmask SEV-SNP CPU feature") Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/df6e995ff88565262c2c7c69964883ff8aa6fc30.1714090302.git.ashish.kalra@amd.com
2024-04-29	misc/pvpanic-pci: register attributes via pci_driver	Thomas Weißschuh	1	-3/+1
	In __pci_register_driver(), the pci core overwrites the dev_groups field of the embedded struct device_driver with the dev_groups from the outer struct pci_driver unconditionally. Set dev_groups in the pci_driver to make sure it is used. This was broken since the introduction of pvpanic-pci. Fixes: db3a4f0abefd ("misc/pvpanic: add PCI driver") Cc: stable@vger.kernel.org Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Fixes: ded13b9cfd59 ("PCI: Add support for dev_groups to struct pci_driver") Link: https://lore.kernel.org/r/20240411-pvpanic-pci-dev-groups-v1-1-db8cb69f1b09@weissschuh.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-29	x86/e820: Add a new e820 table update helper	Ashish Kalra	2	-3/+5
	Add a new API helper e820__range_update_table() with which to update an arbitrary e820 table. Move all current users of e820__range_update_kexec() to this new helper. [ bp: Massage. ] Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/b726af213ad55053f8a7a1e793b01bb3f1ca9dd5.1714090302.git.ashish.kalra@amd.com
2024-04-29	ipv6: use call_rcu_hurry() in fib6_info_release()	Eric Dumazet	1	-1/+1
	This is a followup of commit c4e86b4363ac ("net: add two more call_rcu_hurry()") fib6_info_destroy_rcu() is calling nexthop_put() or fib6_nh_release() We must not delay it too much or risk unregister_netdevice/ref_tracker traces because references to netdev are not released in time. This should speedup device/netns dismantles when CONFIG_RCU_LAZY=y Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-29	Merge branch 'qed-error-codes'	David S. Miller	1	-6/+8
	Asbjørn Sloth Tønnesen says: ==================== net: qede: avoid overruling error codes This series fixes the qede driver, so that qede_parse_flow_attr() and it's subfunctions doesn't get their error codes overruled (ie. turning -EOPNOTSUPP into -EINVAL). --- I have two more patches along the same lines, but they are not yet causing any issues, so I have them destined for net-next. (those are for qede_flow_spec_validate_unused() and qede_flow_parse_ports().) After that I have a series for converting to extack + the final one for validating control flags. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-29	net: qede: use return from qede_parse_actions()	Asbjørn Sloth Tønnesen	1	-3/+2
	When calling qede_parse_actions() then the return code was only used for a non-zero check, and then -EINVAL was returned. qede_parse_actions() can currently fail with: * -EINVAL * -EOPNOTSUPP This patch changes the code to use the actual return code, not just return -EINVAL. The blaimed commit broke the implicit assumption that only -EINVAL would ever be returned. Only compile tested. Fixes: 319a1d19471e ("flow_offload: check for basic action hw stats type") Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-29	net: qede: use return from qede_parse_flow_attr() for flow_spec	Asbjørn Sloth Tønnesen	1	-3/+2
	In qede_flow_spec_to_rule(), when calling qede_parse_flow_attr() then the return code was only used for a non-zero check, and then -EINVAL was returned. qede_parse_flow_attr() can currently fail with: * -EINVAL * -EOPNOTSUPP * -EPROTONOSUPPORT This patch changes the code to use the actual return code, not just return -EINVAL. The blaimed commit introduced qede_flow_spec_to_rule(), and this call to qede_parse_flow_attr(), it looks like it just duplicated how it was already used. Only compile tested. Fixes: 37c5d3efd7f8 ("qede: use ethtool_rx_flow_rule() to remove duplicated parser code") Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-29	net: qede: use return from qede_parse_flow_attr() for flower	Asbjørn Sloth Tønnesen	1	-3/+2
	In qede_add_tc_flower_fltr(), when calling qede_parse_flow_attr() then the return code was only used for a non-zero check, and then -EINVAL was returned. qede_parse_flow_attr() can currently fail with: * -EINVAL * -EOPNOTSUPP * -EPROTONOSUPPORT This patch changes the code to use the actual return code, not just return -EINVAL. The blaimed commit introduced these functions. Only compile tested. Fixes: 2ce9c93eaca6 ("qede: Ingress tc flower offload (drop action) support.") Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-29	net: qede: sanitize 'rc' in qede_add_tc_flower_fltr()	Asbjørn Sloth Tønnesen	1	-3/+8
	Explicitly set 'rc' (return code), before jumping to the unlock and return path. By not having any code depend on that 'rc' remains at it's initial value of -EINVAL, then we can re-use 'rc' for the return code of function calls in subsequent patches. Only compile tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-29	inet: use call_rcu_hurry() in inet_free_ifa()	Eric Dumazet	1	-1/+6
	This is a followup of commit c4e86b4363ac ("net: add two more call_rcu_hurry()") Our reference to ifa->ifa_dev must be freed ASAP to release the reference to the netdev the same way. inet_rcu_free_ifa() in_dev_put() -> in_dev_finish_destroy() -> netdev_put() This should speedup device/netns dismantles when CONFIG_RCU_LAZY=y Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-29	net: give more chances to rcu in netdev_wait_allrefs_any()	Eric Dumazet	1	-1/+2
	This came while reviewing commit c4e86b4363ac ("net: add two more call_rcu_hurry()"). Paolo asked if adding one synchronize_rcu() would help. While synchronize_rcu() does not help, making sure to call rcu_barrier() before msleep(wait) is definitely helping to make sure lazy call_rcu() are completed. Instead of waiting ~100 seconds in my tests, the ref_tracker splats occurs one time only, and netdev_wait_allrefs_any() latency is reduced to the strict minimum. Ideally we should audit our call_rcu() users to make sure no refcount (or cascading call_rcu()) is held too long, because rcu_barrier() is quite expensive. Fixes: 0e4be9e57e8c ("net: use exponential backoff in netdev_wait_allrefs") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/all/28bbf698-befb-42f6-b561-851c67f464aa@kernel.org/T/#m76d73ed6b03cd930778ac4d20a777f22a08d6824 Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-29	MAINTAINERS: add an explicit entry for YNL	Jakub Kicinski	1	-0/+8
	Donald has been contributing to YNL a lot. Let's create a dedicated MAINTAINERS entry and add make his involvement official :) Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Donald Hunter <donald.hunter@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-29	xfrm: Correct spelling mistake in xfrm.h comment	Antony Antony	1	-1/+1
	A spelling error was found in the comment section of include/uapi/linux/xfrm.h. Since this header file is copied to many userspace programs and undergoes Debian spellcheck, it's preferable to fix it in upstream rather than downstream having exceptions. This commit fixes the spelling mistake. Fixes: df71837d5024 ("[LSM-IPSec]: Security association restriction.") Signed-off-by: Antony Antony <antony.antony@secunet.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2024-04-29	Merge branch 'bcmgenet-protect-contended-accesses'	David S. Miller	4	-5/+29
	Doug Berger says: ==================== net: bcmgenet: protect contended accesses Some registers may be modified by parallel execution contexts and require protections to prevent corruption. A review of the driver revealed the need for these additional protections. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-29	net: bcmgenet: synchronize UMAC_CMD access	Doug Berger	4	-3/+23
	The UMAC_CMD register is written from different execution contexts and has insufficient synchronization protections to prevent possible corruption. Of particular concern are the acceses from the phy_device delayed work context used by the adjust_link call and the BH context that may be used by the ndo_set_rx_mode call. A spinlock is added to the driver to protect contended register accesses (i.e. reg_lock) and it is used to synchronize accesses to UMAC_CMD. Fixes: 1c1008c793fa ("net: bcmgenet: add main driver file") Cc: stable@vger.kernel.org Signed-off-by: Doug Berger <opendmb@gmail.com> Acked-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-29	net: bcmgenet: synchronize use of bcmgenet_set_rx_mode()	Doug Berger	1	-1/+3
	The ndo_set_rx_mode function is synchronized with the netif_addr_lock spinlock and BHs disabled. Since this function is also invoked directly from the driver the same synchronization should be applied. Fixes: 72f96347628e ("net: bcmgenet: set Rx mode before starting netif") Cc: stable@vger.kernel.org Signed-off-by: Doug Berger <opendmb@gmail.com> Acked-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-29	net: bcmgenet: synchronize EXT_RGMII_OOB_CTRL access	Doug Berger	1	-1/+3
	The EXT_RGMII_OOB_CTRL register can be written from different contexts. It is predominantly written from the adjust_link handler which is synchronized by the phydev->lock, but can also be written from a different context when configuring the mii in bcmgenet_mii_config(). The chances of contention are quite low, but it is conceivable that adjust_link could occur during resume when WoL is enabled so use the phydev->lock synchronizer in bcmgenet_mii_config() to be sure. Fixes: afe3f907d20f ("net: bcmgenet: power on MII block for all MII modes") Cc: stable@vger.kernel.org Signed-off-by: Doug Berger <opendmb@gmail.com> Acked-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-29	net: ethernet: ti: am65-cpsw-qos: Add support to taprio for past base_time	Tanmay Patil	1	-3/+13
	If the base-time for taprio is in the past, start the schedule at the time of the form "base_time + N*cycle_time" where N is the smallest possible integer such that the above time is in the future. Signed-off-by: Tanmay Patil <t-patil@ti.com> Signed-off-by: Chintan Vankar <c-vankar@ti.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-29	xtensa: remove redundant flush_dcache_page and ↵	Barry Song	1	-16/+8
	ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE macros xtensa's flush_dcache_page() can be a no-op sometimes. There is a generic implementation for this case in include/asm-generic/ cacheflush.h. #ifndef ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE static inline void flush_dcache_page(struct page page) { } #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 0 #endif So remove the superfluous flush_dcache_page() definition, which also helps silence potential build warnings complaining the page variable passed to flush_dcache_page() is not used. In file included from crypto/scompress.c:12: include/crypto/scatterwalk.h: In function 'scatterwalk_pagedone': include/crypto/scatterwalk.h:76:30: warning: variable 'page' set but not used [-Wunused-but-set-variable] 76 \| struct page page; \| ^~~~ crypto/scompress.c: In function 'scomp_acomp_comp_decomp': >> crypto/scompress.c:174:38: warning: unused variable 'dst_page' [-Wunused-variable] 174 \| struct page *dst_page = sg_page(req->dst); \| The issue was originally reported on LoongArch by kernel test robot (Huacai fixed it on LoongArch), then reported by Guenter and me on xtensa. This patch also removes lots of redundant macros which have been defined by asm-generic/cacheflush.h. Cc: Huacai Chen <chenhuacai@loongson.cn> Cc: Herbert Xu <herbert@gondor.apana.org.au> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202403091614.NeUw5zcv-lkp@intel.com/ Reported-by: Barry Song <v-songbaohua@oppo.com> Closes: https://lore.kernel.org/all/CAGsJ_4yDk1+axbte7FKQEwD7X2oxUCFrEc9M5YOS1BobfDFXPA@mail.gmail.com/ Reported-by: Guenter Roeck <linux@roeck-us.net> Closes: https://lore.kernel.org/all/aaa8b7d7-5abe-47bf-93f6-407942436472@roeck-us.net/ Fixes: 77292bb8ca69 ("crypto: scomp - remove memcpy if sg_nents is 1 and pages are lowmem") Signed-off-by: Barry Song <v-songbaohua@oppo.com> Message-Id: <20240319010920.125192-1-21cnbao@gmail.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2024-04-29	softirq: Fix suspicious RCU usage in __do_softirq()	Zqiang	1	-4/+8
	Currently, the condition "__this_cpu_read(ksoftirqd) == current" is used to invoke rcu_softirq_qs() in ksoftirqd tasks context for non-RT kernels. This works correctly as long as the context is actually task context but this condition is wrong when: - the current task is ksoftirqd - the task is interrupted in a RCU read side critical section - __do_softirq() is invoked on return from interrupt Syzkaller triggered the following scenario: -> finish_task_switch() -> put_task_struct_rcu_user() -> call_rcu(&task->rcu, delayed_put_task_struct) -> __kasan_record_aux_stack() -> pfn_valid() -> rcu_read_lock_sched() <interrupt> __irq_exit_rcu() -> __do_softirq)() -> if (!IS_ENABLED(CONFIG_PREEMPT_RT) && __this_cpu_read(ksoftirqd) == current) -> rcu_softirq_qs() -> RCU_LOCKDEP_WARN(lock_is_held(&rcu_sched_lock_map)) The rcu quiescent state is reported in the rcu-read critical section, so the lockdep warning is triggered. Fix this by splitting out the inner working of __do_softirq() into a helper function which takes an argument to distinguish between ksoftirqd task context and interrupted context and invoke it from the relevant call sites with the proper context information and use that for the conditional invocation of rcu_softirq_qs(). Reported-by: syzbot+dce04ed6d1438ad69656@syzkaller.appspotmail.com Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Zqiang <qiang.zhang1211@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20240427102808.29356-1-qiang.zhang1211@gmail.com Link: https://lore.kernel.org/lkml/8f281a10-b85a-4586-9586-5bbc12dc784f@paulmck-laptop/T/#mea8aba4abfcb97bbf499d169ce7f30c4cff1b0e3
2024-04-29	wifi: rtlwifi: rtl8723be: Make read-only arrays static const	Colin Ian King	1	-19/+26
	Don't populate the read-only arrays cck_rates, ofdm_rates, ht_rates_1t and channel_all on the stack at run time, instead make them static const and clean up the formatting. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Acked-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://msgid.link/20240425155733.114423-1-colin.i.king@gmail.com
2024-04-29	wifi: rtw89: Remove the redundant else branch in the function ↵	Jiapeng Chong	1	-4/+2
	rtw89_phy_get_kpath The assignment of the else and if branches is the same in the "case: MLO_2_PLUS_0_1RF" branch of the function rtw89_phy_get_kpath, so we remove it and add comments here to make the code easier to understand. ./drivers/net/wireless/realtek/rtw89/phy.c:6406:2-4: WARNING: possible condition with no effect (if == else). Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=8812 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://msgid.link/20240422072922.50940-1-jiapeng.chong@linux.alibaba.com
2024-04-29	bcachefs: fix integer conversion bug	Kent Overstreet	1	-1/+1
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-29	bcachefs: btree node scan now fills in sectors_written	Kent Overstreet	2	-2/+6
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-29	bcachefs: Remove accidental debug assert	Kent Overstreet	1	-2/+0
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-29	wifi: rtw89: coex: Check and enable reports after run coex	Ching-Te Ku	1	-6/+40
	If there is any changes with Wi-Fi/Bluetooth, the mechanism will trigger run_coex to update information and coexistence mechanism. Enable/Disable reports here can make sure the action take effect in time. Signed-off-by: Ching-Te Ku <ku920601@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://msgid.link/20240423130502.32682-9-pkshih@realtek.com
2024-04-29	wifi: rtw89: coex: Add Wi-Fi role v8 condition when set BTG control	Ching-Te Ku	1	-0/+5
	BTG(A hardware block, which Wi-Fi 2.4Ghz & Bluetooth shared a part of hardware). Because some information are saved in role info. So the logic also need to get value from the version 8 role info. Signed-off-by: Ching-Te Ku <ku920601@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://msgid.link/20240423130502.32682-8-pkshih@realtek.com
2024-04-29	wifi: rtw89: coex: Add Wi-Fi role v8 condition when set Bluetooth channel	Ching-Te Ku	1	-0/+18
	This function is to let Bluetooth know Wi-Fi is using which channel, and ask Bluetooth do not hop into the nearby channel. Wi-Fi channel is saved at role info, this patch make the logic also get the channel value from version 8 role info. Signed-off-by: Ching-Te Ku <ku920601@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://msgid.link/20240423130502.32682-7-pkshih@realtek.com
2024-04-29	wifi: rtw89: coex: Fix unexpected value in version 7 slot parameter	Ching-Te Ku	1	-3/+15
	It will assign wrong value to version 7 slot parameter setting, because the structure member order has changed. Add a for-loop to assign variables manually. Signed-off-by: Ching-Te Ku <ku920601@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://msgid.link/20240423130502.32682-6-pkshih@realtek.com
2024-04-29	wifi: rtw89: coex: Add Bluetooth version report version 7	Ching-Te Ku	2	-15/+44
	The report is reported from Bluetooth, it shows the current Bluetooth driver & firmware version code. Wi-Fi & Bluetooth need to use compatible version. The version 7 report adjust the structure variables order. Signed-off-by: Ching-Te Ku <ku920601@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://msgid.link/20240423130502.32682-5-pkshih@realtek.com
2024-04-29	wifi: rtw89: coex: Add Bluetooth frequency hopping map version 7	Ching-Te Ku	2	-0/+24
	The report is reported from Bluetooth, it described the usable Bluetooth channel map. Bluetooth should not hopped into Wi-Fi using channel. Version 8 report adjust the structure variables order. Signed-off-by: Ching-Te Ku <ku920601@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://msgid.link/20240423130502.32682-4-pkshih@realtek.com
2024-04-29	wifi: rtw89: coex: Add Bluetooth scan parameter report version 7	Ching-Te Ku	2	-0/+24
	This report is reported from Bluetooth, it described Bluetooth scan parameters. Version 7 adjust the structure variables order. Signed-off-by: Ching-Te Ku <ku920601@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://msgid.link/20240423130502.32682-3-pkshih@realtek.com
2024-04-29	wifi: rtw89: coex: Add Wi-Fi null data status version 7	Ching-Te Ku	2	-0/+36
	The mechanism will use Wi-Fi null packet to stop the packets from access point to avoid the interference to Bluetooth when switch to Bluetooth slot. The report can check whether the null packet is working as expected or not. Signed-off-by: Ching-Te Ku <ku920601@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://msgid.link/20240423130502.32682-2-pkshih@realtek.com
2024-04-29	wifi: rtw89: 8852b: update hardware parameters for RFE type 5	Ping-Ke Shih	2	-0/+15
	RFE type 5 of 8852B is a type of hardware module, which can use different external components, so update register settings accordingly. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://msgid.link/20240423121247.24714-2-pkshih@realtek.com
2024-04-29	wifi: rtw89: fix CTS transmission issue with center frequency deviation	Kuan-Chung Chen	2	-0/+10
	The CTS cannot be received by the peer due to center frequency deviation. This issue can be solved by correct settings to transmit proper CTS. Signed-off-by: Kuan-Chung Chen <damon.chen@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://msgid.link/20240423121247.24714-1-pkshih@realtek.com
2024-04-29	ksmbd: fix uninitialized symbol 'share' in smb2_tree_connect()	Namjae Jeon	1	-2/+2
	Fix uninitialized symbol 'share' in smb2_tree_connect(). Fixes: e9d8c2f95ab8 ("ksmbd: add continuous availability share parameter") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-04-28	Linux 6.9-rc6v6.9-rc6	Linus Torvalds	1	-1/+1

2024-04-28	Merge tag 'sched-urgent-2024-04-28' of ↵	Linus Torvalds	3	-21/+38
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: - Fix EEVDF corner cases - Fix two nohz_full= related bugs that can cause boot crashes and warnings * tag 'sched-urgent-2024-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/isolation: Fix boot crash when maxcpus < first housekeeping CPU sched/isolation: Prevent boot crash when the boot CPU is nohz_full sched/eevdf: Prevent vlag from going out of bounds in reweight_eevdf() sched/eevdf: Fix miscalculation in reweight_entity() when se is not curr sched/eevdf: Always update V if se->on_rq when reweighting
2024-04-28	Merge tag 'x86-urgent-2024-04-28' of ↵	Linus Torvalds	10	-17/+53
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: - Make the CPU_MITIGATIONS=n interaction with conflicting mitigation-enabling boot parameters a bit saner. - Re-enable CPU mitigations by default on non-x86 - Fix TDX shared bit propagation on mprotect() - Fix potential show_regs() system hang when PKE initialization is not fully finished yet. - Add the 0x10-0x1f model IDs to the Zen5 range - Harden #VC instruction emulation some more * tag 'x86-urgent-2024-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: cpu: Ignore "mitigations" kernel parameter if CPU_MITIGATIONS=n cpu: Re-enable CPU mitigations by default for !X86 architectures x86/tdx: Preserve shared bit on mprotect() x86/cpu: Fix check for RDPKRU in __show_regs() x86/CPU/AMD: Add models 0x10-0x1f to the Zen5 range x86/sev: Check for MWAITX and MONITORX opcodes in the #VC handler
2024-04-28	Merge tag 'irq-urgent-2024-04-28' of ↵	Linus Torvalds	1	-7/+2
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fix from Ingo Molnar: "Fix a double free bug in the init error path of the GICv3 irqchip driver" * tag 'irq-urgent-2024-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/gic-v3-its: Prevent double free on error