kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2024-10-04	Merge tag 'acpi-6.12-rc2' of ↵	Linus Torvalds	3	-17/+39
	git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These fix up the ACPI IRQ override quirk list and add two new entries to it, add a new quirk to the ACPI backlight (video) driver, and fix the ACPI battery driver. Specifics: - Add a quirk for Dell OptiPlex 5480 AIO to the ACPI backlight (video) driver (Hans de Goede) - Prevent the ACPI battery driver from crashing when unregistering a battery hook and simplify battery hook locking in it (Armin Wolf) - Fix up the ACPI IRQ override quirk list and add quirks for Asus Vivobook X1704VAP and Asus ExpertBook B2502CVA to it (Hans de Goede)" * tag 'acpi-6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: battery: Fix possible crash when unregistering a battery hook ACPI: battery: Simplify battery hook locking ACPI: video: Add backlight=native quirk for Dell OptiPlex 5480 AIO ACPI: resource: Add Asus ExpertBook B2502CVA to irq1_level_low_skip_override[] ACPI: resource: Add Asus Vivobook X1704VAP to irq1_level_low_skip_override[] ACPI: resource: Loosen the Asus E1404GAB DMI match to also cover the E1404GA ACPI: resource: Remove duplicate Asus E1504GAB IRQ override
2024-10-04	Merge tag 'pm-6.12-rc2' of ↵	Linus Torvalds	2	-13/+9
	git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix two cpufreq issues, one in the core and one in the intel_pstate driver: - Fix CPU device node reference counting in the cpufreq core (Miquel Sabaté Solà) - Turn the spinlock used by the intel_pstate driver in hard IRQ context into a raw one to prevent the driver from crashing when PREEMPT_RT is enabled (Uwe Kleine-König)" * tag 'pm-6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: Avoid a bad reference count on CPU node cpufreq: intel_pstate: Make hwp_notify_lock a raw spinlock
2024-10-04	Merge branch 'netfilter-br_netfilter-fix-panic-with-metadata_dst-skb'	Jakub Kicinski	4	-0/+129
	Andy Roulin says: ==================== netfilter: br_netfilter: fix panic with metadata_dst skb There's a kernel panic possible in the br_netfilter module when sending untagged traffic via a VxLAN device. Traceback is included below. This happens during the check for fragmentation in br_nf_dev_queue_xmit if the MTU on the VxLAN device is not big enough. It is dependent on: 1) the br_netfilter module being loaded; 2) net.bridge.bridge-nf-call-iptables set to 1; 3) a bridge with a VxLAN (single-vxlan-device) netdevice as a bridge port; 4) untagged frames with size higher than the VxLAN MTU forwarded/flooded This case was never supported in the first place, so the first patch drops such packets. A regression selftest is added as part of the second patch. PING 10.0.0.2 (10.0.0.2) from 0.0.0.0 h1-eth0: 2000(2028) bytes of data. [ 176.291791] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000110 [ 176.292101] Mem abort info: [ 176.292184] ESR = 0x0000000096000004 [ 176.292322] EC = 0x25: DABT (current EL), IL = 32 bits [ 176.292530] SET = 0, FnV = 0 [ 176.292709] EA = 0, S1PTW = 0 [ 176.292862] FSC = 0x04: level 0 translation fault [ 176.293013] Data abort info: [ 176.293104] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 [ 176.293488] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 176.293787] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 176.293995] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000043ef5000 [ 176.294166] [0000000000000110] pgd=0000000000000000, p4d=0000000000000000 [ 176.294827] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP [ 176.295252] Modules linked in: vxlan ip6_udp_tunnel udp_tunnel veth br_netfilter bridge stp llc ipv6 crct10dif_ce [ 176.295923] CPU: 0 PID: 188 Comm: ping Not tainted 6.8.0-rc3-g5b3fbd61b9d1 #2 [ 176.296314] Hardware name: linux,dummy-virt (DT) [ 176.296535] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 176.296808] pc : br_nf_dev_queue_xmit+0x390/0x4ec [br_netfilter] [ 176.297382] lr : br_nf_dev_queue_xmit+0x2ac/0x4ec [br_netfilter] [ 176.297636] sp : ffff800080003630 [ 176.297743] x29: ffff800080003630 x28: 0000000000000008 x27: ffff6828c49ad9f8 [ 176.298093] x26: ffff6828c49ad000 x25: 0000000000000000 x24: 00000000000003e8 [ 176.298430] x23: 0000000000000000 x22: ffff6828c4960b40 x21: ffff6828c3b16d28 [ 176.298652] x20: ffff6828c3167048 x19: ffff6828c3b16d00 x18: 0000000000000014 [ 176.298926] x17: ffffb0476322f000 x16: ffffb7e164023730 x15: 0000000095744632 [ 176.299296] x14: ffff6828c3f1c880 x13: 0000000000000002 x12: ffffb7e137926a70 [ 176.299574] x11: 0000000000000001 x10: ffff6828c3f1c898 x9 : 0000000000000000 [ 176.300049] x8 : ffff6828c49bf070 x7 : 0008460f18d5f20e x6 : f20e0100bebafeca [ 176.300302] x5 : ffff6828c7f918fe x4 : ffff6828c49bf070 x3 : 0000000000000000 [ 176.300586] x2 : 0000000000000000 x1 : ffff6828c3c7ad00 x0 : ffff6828c7f918f0 [ 176.300889] Call trace: [ 176.301123] br_nf_dev_queue_xmit+0x390/0x4ec [br_netfilter] [ 176.301411] br_nf_post_routing+0x2a8/0x3e4 [br_netfilter] [ 176.301703] nf_hook_slow+0x48/0x124 [ 176.302060] br_forward_finish+0xc8/0xe8 [bridge] [ 176.302371] br_nf_hook_thresh+0x124/0x134 [br_netfilter] [ 176.302605] br_nf_forward_finish+0x118/0x22c [br_netfilter] [ 176.302824] br_nf_forward_ip.part.0+0x264/0x290 [br_netfilter] [ 176.303136] br_nf_forward+0x2b8/0x4e0 [br_netfilter] [ 176.303359] nf_hook_slow+0x48/0x124 [ 176.303803] __br_forward+0xc4/0x194 [bridge] [ 176.304013] br_flood+0xd4/0x168 [bridge] [ 176.304300] br_handle_frame_finish+0x1d4/0x5c4 [bridge] [ 176.304536] br_nf_hook_thresh+0x124/0x134 [br_netfilter] [ 176.304978] br_nf_pre_routing_finish+0x29c/0x494 [br_netfilter] [ 176.305188] br_nf_pre_routing+0x250/0x524 [br_netfilter] [ 176.305428] br_handle_frame+0x244/0x3cc [bridge] [ 176.305695] __netif_receive_skb_core.constprop.0+0x33c/0xecc [ 176.306080] __netif_receive_skb_one_core+0x40/0x8c [ 176.306197] __netif_receive_skb+0x18/0x64 [ 176.306369] process_backlog+0x80/0x124 [ 176.306540] __napi_poll+0x38/0x17c [ 176.306636] net_rx_action+0x124/0x26c [ 176.306758] __do_softirq+0x100/0x26c [ 176.307051] ____do_softirq+0x10/0x1c [ 176.307162] call_on_irq_stack+0x24/0x4c [ 176.307289] do_softirq_own_stack+0x1c/0x2c [ 176.307396] do_softirq+0x54/0x6c [ 176.307485] __local_bh_enable_ip+0x8c/0x98 [ 176.307637] __dev_queue_xmit+0x22c/0xd28 [ 176.307775] neigh_resolve_output+0xf4/0x1a0 [ 176.308018] ip_finish_output2+0x1c8/0x628 [ 176.308137] ip_do_fragment+0x5b4/0x658 [ 176.308279] ip_fragment.constprop.0+0x48/0xec [ 176.308420] __ip_finish_output+0xa4/0x254 [ 176.308593] ip_finish_output+0x34/0x130 [ 176.308814] ip_output+0x6c/0x108 [ 176.308929] ip_send_skb+0x50/0xf0 [ 176.309095] ip_push_pending_frames+0x30/0x54 [ 176.309254] raw_sendmsg+0x758/0xaec [ 176.309568] inet_sendmsg+0x44/0x70 [ 176.309667] __sys_sendto+0x110/0x178 [ 176.309758] __arm64_sys_sendto+0x28/0x38 [ 176.309918] invoke_syscall+0x48/0x110 [ 176.310211] el0_svc_common.constprop.0+0x40/0xe0 [ 176.310353] do_el0_svc+0x1c/0x28 [ 176.310434] el0_svc+0x34/0xb4 [ 176.310551] el0t_64_sync_handler+0x120/0x12c [ 176.310690] el0t_64_sync+0x190/0x194 [ 176.311066] Code: f9402e61 79402aa2 927ff821 f9400023 (f9408860) [ 176.315743] ---[ end trace 0000000000000000 ]--- [ 176.316060] Kernel panic - not syncing: Oops: Fatal exception in interrupt [ 176.316371] Kernel Offset: 0x37e0e3000000 from 0xffff800080000000 [ 176.316564] PHYS_OFFSET: 0xffff97d780000000 [ 176.316782] CPU features: 0x0,88000203,3c020000,0100421b [ 176.317210] Memory Limit: none [ 176.317527] ---[ end Kernel panic - not syncing: Oops: Fatal Exception in interrupt ]---\ ==================== Link: https://patch.msgid.link/20241001154400.22787-1-aroulin@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	selftests: add regression test for br_netfilter panic	Andy Roulin	3	-0/+124
	Add a new netfilter selftests to test against br_netfilter panics when VxLAN single-device is used together with untagged traffic and high MTU. Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Andy Roulin <aroulin@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20241001154400.22787-3-aroulin@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	netfilter: br_netfilter: fix panic with metadata_dst skb	Andy Roulin	1	-0/+5
	Fix a kernel panic in the br_netfilter module when sending untagged traffic via a VxLAN device. This happens during the check for fragmentation in br_nf_dev_queue_xmit. It is dependent on: 1) the br_netfilter module being loaded; 2) net.bridge.bridge-nf-call-iptables set to 1; 3) a bridge with a VxLAN (single-vxlan-device) netdevice as a bridge port; 4) untagged frames with size higher than the VxLAN MTU forwarded/flooded When forwarding the untagged packet to the VxLAN bridge port, before the netfilter hooks are called, br_handle_egress_vlan_tunnel is called and changes the skb_dst to the tunnel dst. The tunnel_dst is a metadata type of dst, i.e., skb_valid_dst(skb) is false, and metadata->dst.dev is NULL. Then in the br_netfilter hooks, in br_nf_dev_queue_xmit, there's a check for frames that needs to be fragmented: frames with higher MTU than the VxLAN device end up calling br_nf_ip_fragment, which in turns call ip_skb_dst_mtu. The ip_dst_mtu tries to use the skb_dst(skb) as if it was a valid dst with valid dst->dev, thus the crash. This case was never supported in the first place, so drop the packet instead. PING 10.0.0.2 (10.0.0.2) from 0.0.0.0 h1-eth0: 2000(2028) bytes of data. [ 176.291791] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000110 [ 176.292101] Mem abort info: [ 176.292184] ESR = 0x0000000096000004 [ 176.292322] EC = 0x25: DABT (current EL), IL = 32 bits [ 176.292530] SET = 0, FnV = 0 [ 176.292709] EA = 0, S1PTW = 0 [ 176.292862] FSC = 0x04: level 0 translation fault [ 176.293013] Data abort info: [ 176.293104] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 [ 176.293488] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 176.293787] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 176.293995] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000043ef5000 [ 176.294166] [0000000000000110] pgd=0000000000000000, p4d=0000000000000000 [ 176.294827] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP [ 176.295252] Modules linked in: vxlan ip6_udp_tunnel udp_tunnel veth br_netfilter bridge stp llc ipv6 crct10dif_ce [ 176.295923] CPU: 0 PID: 188 Comm: ping Not tainted 6.8.0-rc3-g5b3fbd61b9d1 #2 [ 176.296314] Hardware name: linux,dummy-virt (DT) [ 176.296535] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 176.296808] pc : br_nf_dev_queue_xmit+0x390/0x4ec [br_netfilter] [ 176.297382] lr : br_nf_dev_queue_xmit+0x2ac/0x4ec [br_netfilter] [ 176.297636] sp : ffff800080003630 [ 176.297743] x29: ffff800080003630 x28: 0000000000000008 x27: ffff6828c49ad9f8 [ 176.298093] x26: ffff6828c49ad000 x25: 0000000000000000 x24: 00000000000003e8 [ 176.298430] x23: 0000000000000000 x22: ffff6828c4960b40 x21: ffff6828c3b16d28 [ 176.298652] x20: ffff6828c3167048 x19: ffff6828c3b16d00 x18: 0000000000000014 [ 176.298926] x17: ffffb0476322f000 x16: ffffb7e164023730 x15: 0000000095744632 [ 176.299296] x14: ffff6828c3f1c880 x13: 0000000000000002 x12: ffffb7e137926a70 [ 176.299574] x11: 0000000000000001 x10: ffff6828c3f1c898 x9 : 0000000000000000 [ 176.300049] x8 : ffff6828c49bf070 x7 : 0008460f18d5f20e x6 : f20e0100bebafeca [ 176.300302] x5 : ffff6828c7f918fe x4 : ffff6828c49bf070 x3 : 0000000000000000 [ 176.300586] x2 : 0000000000000000 x1 : ffff6828c3c7ad00 x0 : ffff6828c7f918f0 [ 176.300889] Call trace: [ 176.301123] br_nf_dev_queue_xmit+0x390/0x4ec [br_netfilter] [ 176.301411] br_nf_post_routing+0x2a8/0x3e4 [br_netfilter] [ 176.301703] nf_hook_slow+0x48/0x124 [ 176.302060] br_forward_finish+0xc8/0xe8 [bridge] [ 176.302371] br_nf_hook_thresh+0x124/0x134 [br_netfilter] [ 176.302605] br_nf_forward_finish+0x118/0x22c [br_netfilter] [ 176.302824] br_nf_forward_ip.part.0+0x264/0x290 [br_netfilter] [ 176.303136] br_nf_forward+0x2b8/0x4e0 [br_netfilter] [ 176.303359] nf_hook_slow+0x48/0x124 [ 176.303803] __br_forward+0xc4/0x194 [bridge] [ 176.304013] br_flood+0xd4/0x168 [bridge] [ 176.304300] br_handle_frame_finish+0x1d4/0x5c4 [bridge] [ 176.304536] br_nf_hook_thresh+0x124/0x134 [br_netfilter] [ 176.304978] br_nf_pre_routing_finish+0x29c/0x494 [br_netfilter] [ 176.305188] br_nf_pre_routing+0x250/0x524 [br_netfilter] [ 176.305428] br_handle_frame+0x244/0x3cc [bridge] [ 176.305695] __netif_receive_skb_core.constprop.0+0x33c/0xecc [ 176.306080] __netif_receive_skb_one_core+0x40/0x8c [ 176.306197] __netif_receive_skb+0x18/0x64 [ 176.306369] process_backlog+0x80/0x124 [ 176.306540] __napi_poll+0x38/0x17c [ 176.306636] net_rx_action+0x124/0x26c [ 176.306758] __do_softirq+0x100/0x26c [ 176.307051] ____do_softirq+0x10/0x1c [ 176.307162] call_on_irq_stack+0x24/0x4c [ 176.307289] do_softirq_own_stack+0x1c/0x2c [ 176.307396] do_softirq+0x54/0x6c [ 176.307485] __local_bh_enable_ip+0x8c/0x98 [ 176.307637] __dev_queue_xmit+0x22c/0xd28 [ 176.307775] neigh_resolve_output+0xf4/0x1a0 [ 176.308018] ip_finish_output2+0x1c8/0x628 [ 176.308137] ip_do_fragment+0x5b4/0x658 [ 176.308279] ip_fragment.constprop.0+0x48/0xec [ 176.308420] __ip_finish_output+0xa4/0x254 [ 176.308593] ip_finish_output+0x34/0x130 [ 176.308814] ip_output+0x6c/0x108 [ 176.308929] ip_send_skb+0x50/0xf0 [ 176.309095] ip_push_pending_frames+0x30/0x54 [ 176.309254] raw_sendmsg+0x758/0xaec [ 176.309568] inet_sendmsg+0x44/0x70 [ 176.309667] __sys_sendto+0x110/0x178 [ 176.309758] __arm64_sys_sendto+0x28/0x38 [ 176.309918] invoke_syscall+0x48/0x110 [ 176.310211] el0_svc_common.constprop.0+0x40/0xe0 [ 176.310353] do_el0_svc+0x1c/0x28 [ 176.310434] el0_svc+0x34/0xb4 [ 176.310551] el0t_64_sync_handler+0x120/0x12c [ 176.310690] el0t_64_sync+0x190/0x194 [ 176.311066] Code: f9402e61 79402aa2 927ff821 f9400023 (f9408860) [ 176.315743] ---[ end trace 0000000000000000 ]--- [ 176.316060] Kernel panic - not syncing: Oops: Fatal exception in interrupt [ 176.316371] Kernel Offset: 0x37e0e3000000 from 0xffff800080000000 [ 176.316564] PHYS_OFFSET: 0xffff97d780000000 [ 176.316782] CPU features: 0x0,88000203,3c020000,0100421b [ 176.317210] Memory Limit: none [ 176.317527] ---[ end Kernel panic - not syncing: Oops: Fatal Exception in interrupt ]---\ Fixes: 11538d039ac6 ("bridge: vlan dst_metadata hooks in ingress and egress paths") Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Andy Roulin <aroulin@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20241001154400.22787-2-aroulin@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	Merge branch 'add-option-to-provide-opt_id-value-via-cmsg'	Jakub Kicinski	20	-48/+149
	Vadim Fedorenko says: ==================== Add option to provide OPT_ID value via cmsg SOF_TIMESTAMPING_OPT_ID socket option flag gives a way to correlate TX timestamps and packets sent via socket. Unfortunately, there is no way to reliably predict socket timestamp ID value in case of error returned by sendmsg. For UDP sockets it's impossible because of lockless nature of UDP transmit, several threads may send packets in parallel. In case of RAW sockets MSG_MORE option makes things complicated. More details are in the conversation [1]. This patch adds new control message type to give user-space software an opportunity to control the mapping between packets and values by providing ID with each sendmsg. The first patch in the series adds all needed definitions and implements the function for UDP sockets. The explicit check of socket's type is not added because subsequent patches in the series will add support for other types of sockets. The documentation is also included into the first patch. Patch 2/4 adds support for TCP sockets. This part is simple and straight forward. Patch 3/4 adds support for RAW sockets. It's a bit tricky because sock_tx_timestamp functions has to be refactored to receive full socket cookie information to fill in ID. The commit b534dc46c8ae ("net_tstamp: add SOF_TIMESTAMPING_OPT_ID_TCP") did the conversion of sk_tsflags to u32 but sock_tx_timestamp functions were not converted and still receive 16b flags. It wasn't a problem because SOF_TIMESTAMPING_OPT_ID_TCP was not checked in these functions, that's why no backporting is needed. Patch 4/4 adds selftests for new feature. [1] https://lore.kernel.org/netdev/CALCETrU0jB+kg0mhV6A8mrHfTE1D1pr1SD_B9Eaa9aDPfgHdtA@mail.gmail.com/ ==================== Link: https://patch.msgid.link/20241001125716.2832769-1-vadfed@meta.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	selftests: txtimestamp: add SCM_TS_OPT_ID test	Vadim Fedorenko	3	-15/+43
	Extend txtimestamp test to run with fixed tskey using SCM_TS_OPT_ID control message for all types of sockets. Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Link: https://patch.msgid.link/20241001125716.2832769-4-vadfed@meta.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	net_tstamp: add SCM_TS_OPT_ID for RAW sockets	Vadim Fedorenko	9	-21/+31
	The last type of sockets which supports SOF_TIMESTAMPING_OPT_ID is RAW sockets. To add new option this patch converts all callers (direct and indirect) of _sock_tx_timestamp to provide sockcm_cookie instead of tsflags. And while here fix __sock_tx_timestamp to receive tsflags as __u32 instead of __u16. Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Link: https://patch.msgid.link/20241001125716.2832769-3-vadfed@meta.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	net_tstamp: add SCM_TS_OPT_ID to provide OPT_ID in control message	Vadim Fedorenko	11	-12/+75
	SOF_TIMESTAMPING_OPT_ID socket option flag gives a way to correlate TX timestamps and packets sent via socket. Unfortunately, there is no way to reliably predict socket timestamp ID value in case of error returned by sendmsg. For UDP sockets it's impossible because of lockless nature of UDP transmit, several threads may send packets in parallel. In case of RAW sockets MSG_MORE option makes things complicated. More details are in the conversation [1]. This patch adds new control message type to give user-space software an opportunity to control the mapping between packets and values by providing ID with each sendmsg for UDP sockets. The documentation is also added in this patch. [1] https://lore.kernel.org/netdev/CALCETrU0jB+kg0mhV6A8mrHfTE1D1pr1SD_B9Eaa9aDPfgHdtA@mail.gmail.com/ Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Link: https://patch.msgid.link/20241001125716.2832769-2-vadfed@meta.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	Merge tag 'gpio-fixes-for-v6.12-rc2' of ↵	Linus Torvalds	2	-7/+7
	git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio fixes from Bartosz Golaszewski: - fix a potential NULL-pointer dereference in gpiolib core - fix a probe() regression from the v6.12 merge window and an older bug leading to missed interrupts in gpio-davinci * tag 'gpio-fixes-for-v6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: gpiolib: Fix potential NULL pointer dereference in gpiod_get_label() gpio: davinci: Fix condition for irqchip registration gpio: davinci: fix lazy disable
2024-10-04	Merge branch 'net-mlx5-hw-counters-refactor'	Jakub Kicinski	4	-278/+147
	Tariq Toukan says: ==================== net/mlx5: hw counters refactor This is a patchset re-post, see: https://lore.kernel.org/20240815054656.2210494-7-tariqt@nvidia.com In this patchset, Cosmin refactors hw counters and solves perf scaling issue. Series generated against: commit c824deb1a897 ("cxgb4: clip_tbl: Fix spelling mistake "wont" -> "won't"") HW counters are central to mlx5 driver operations. They are hardware objects created and used alongside most steering operations, and queried from a variety of places. Most counters are queried in bulk from a periodic task in fs_counters.c. Counter performance is important and as such, a variety of improvements have been done over the years. Currently, counters are allocated from pools, which are bulk allocated to amortize the cost of firmware commands. Counters are managed through an IDR, a doubly linked list and two atomic single linked lists. Adding/removing counters is a complex dance between user contexts requesting it and the mlx5_fc_stats_work task which does most of the work. Under high load (e.g. from connection tracking flow insertion/deletion), the counter code becomes a bottleneck, as seen on flame graphs. Whenever a counter is deleted, it gets added to a list and the wq task is scheduled to run immediately to actually delete it. This is done via mod_delayed_work which uses an internal spinlock. In some tests, waiting for this spinlock took up to 66% of all samples. This series refactors the counter code to use a more straight-forward approach, avoiding the mod_delayed_work problem and making the code easier to understand. For that: - patch #1 moves counters data structs to a more appropriate place. - patch #2 simplifies the bulk query allocation scheme by using vmalloc. - patch #3 replaces the IDR+3 lists with an xarray. This is the main patch of the series, solving the spinlock congestion issue. - patch #4 removes an unnecessary cacheline alignment causing a lot of memory to be wasted. - patches #5 and #6 are small cleanups enabled by the refactoring. ==================== Link: https://patch.msgid.link/20241001103709.58127-1-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	net/mlx5: hw counters: Remove mlx5_fc_create_ex	Cosmin Ratiu	3	-10/+2
	It no longer serves any purpose and is identical to mlx5_fc_create upon which it was originally based of. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241001103709.58127-7-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	net/mlx5: hw counters: Don't maintain a counter count	Cosmin Ratiu	1	-22/+18
	num_counters is only used for deciding whether to grow the bulk query buffer, which is done once more counters than a small initial threshold are present. After that, maintaining num_counters serves no purpose. This commit replaces that with an actual xarray traversal to count the counters. This appears expensive at first sight, but is only done when the number of counters is less than the initial threshold (8) and only once every sampling interval. Once the number of counters goes above the threshold, the bulk query buffer is grown to max size and the xarray traversal is never done again. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241001103709.58127-6-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	net/mlx5: hw counters: Drop unneeded cacheline alignment	Cosmin Ratiu	1	-1/+1
	The mlx5_fc struct has a cache for values queried from hw, which is cacheline aligned. On x86_64, this results in: struct mlx5_fc { u32 id; /* 0 4 / bool aging; / 4 1 / / XXX 3 bytes hole, try to pack / struct mlx5_fc_bulk bulk; /* 8 8 / / XXX 48 bytes hole, try to pack / / --- cacheline 1 boundary (64 bytes) --- / struct mlx5_fc_cache cache __attribute__((__aligned__(64))); / 64 24 / u64 lastpackets; / 88 8 / u64 lastbytes; / 96 8 / / size: 128, cachelines: 2, members: 6 / / sum members: 53, holes: 2, sum holes: 51 / / padding: 24 / / forced aligns: 1, forced holes: 1, sum forced holes: 48 / } __attribute__((__aligned__(64))); (output from pahole). ...So a 48+24=72 byte waste. As far as I can determine, this serves no purpose other than maybe making sure that the values in the cache do not span two cachelines in the worst case scenario, but that's not a valid enough reason to waste 72 bytes per counter, especially since this code is not performance-critical. There could potentially be hundreds of thousands of counters (e.g. for connection-tracking), so this quickly adds up to multiple MB wasted. This commit removes the alignment, resulting in: struct mlx5_fc { [...] / size: 56, cachelines: 1, members: 6 / / sum members: 53, holes: 1, sum holes: 3 / / last cacheline: 56 bytes */ }; Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241001103709.58127-5-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	net/mlx5: hw counters: Replace IDR+lists with xarray	Cosmin Ratiu	1	-194/+82
	Previously, managing counters was a complicated affair involving an IDR, a sorted double linked list, two single linked lists and a complex dance between a non-periodic wq task and users adding/deleting counters. Adding was done by inserting new counters into the IDR and into a single linked list, leaving the wq to process the list and actually add the counters into the double linked list, maintained sorted with the IDR. Deleting involved adding the counter into another single linked list, leaving the wq to actually unlink the counter from the other structures and release it. Dumping the counters is done with the bulk query API, which relies on the counter list being sorted and unmutable during querying to efficiently retrieve cached counter values. Finally, the IDR data struct is deprecated. This commit replaces all of that with an xarray. Adding is now done directly, by using xa_lock. Deleting is also done directly, under the xa_lock. Querying is done from a periodic task running every sampling_interval (default 1s) and uses the bulk query API for efficiency. It works by iterating over the xarray: - when a new bulk needs to be started, the bulk information is computed under the xa_lock. - the xa iteration state is saved and the xa_lock dropped. - the HW is queried for bulk counter values. - the xa_lock is reacquired. - counter caches with ids covered by the bulk response are updated. Querying always requests the max bulk length, for simplicity. Counters could be added/deleted while the HW is queried. This is safe, as the HW API simply returns unknown values for counters not in HW, but those values won't be accessed. Only counters present in xarray before bulk query will actually read queried cache values. This cuts down the size of mlx5_fc by 4 pointers (88->56 bytes), which amounts to ~3MB / 100K counters. But more importantly, this solves the wq spinlock congestion issue seen happening on high-rate counter insertion+deletion. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241001103709.58127-4-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	net/mlx5: hw counters: Use kvmalloc for bulk query buffer	Cosmin Ratiu	1	-37/+22
	The bulk query buffer starts out small (see [1]) and as soon as the number of counters goes past the initial threshold grows to max size (32K entries, 512KB) with a retry scheme. This commit switches to using kvmalloc for the buffer, which has a near zero likelihood of failing, and thus the explicit retry scheme becomes superfluous and is taken out. On the low chance the allocation fails, it will still be retried every sampling_interval, when the wq task runs. [1] commit b247f32aecad ("net/mlx5: Dynamically resize flow counters query buffer") Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241001103709.58127-3-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	net/mlx5: hw counters: Make fc_stats & fc_pool private	Cosmin Ratiu	2	-52/+60
	The mlx5_fc_stats and mlx5_fc_pool structs are only used from fs_counters.c. As such, make them private there. mlx5_fc_pool is not used or referenced at all outside fs_counters. mlx5_fc_stats is referenced from mlx5_core_dev, so instead of having it as a direct member (which requires exporting it from fs_counters), store a pointer to it, allocate it on init and clear it on destroy. One caveat is that a simple container_of to get from a 'work' struct to the outermost mlx5_core_dev struct directly no longer works, so an extra pointer had to be added to mlx5_fc_stats back to the parent dev. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241001103709.58127-2-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	Merge tag 'sound-6.12-rc2' of ↵	Linus Torvalds	43	-107/+162
	git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Slightly high amount of changes in this round, partly because of my vacation in the last weeks. But all changes are small and nothing looks worrisome. The biggest LOCs is MAINTAINERS updates, and there is a core change for card-ID string creation for non-ASCII inputs. Others are rather device-specific, such as new quirks and device IDs for ASoC, usual HD-audio and USB-audio quirks and fixes, as well as regression fixes in HD-audio HDMI audio and Conexant codec" * tag 'sound-6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (39 commits) ALSA: hda/conexant: Fix conflicting quirk for System76 Pangolin ALSA: line6: add hw monitor volume control to POD HD500X ALSA: gus: Fix some error handling paths related to get_bpos() usage ALSA: hda: Add missing parameter description for snd_hdac_stream_timecounter_init() ALSA: usb-audio: Add native DSD support for Luxman D-08u ALSA: core: add isascii() check to card ID generator MAINTAINERS: ALSA: use linux-sound@vger.kernel.org list Revert "ALSA: hda: Conditionally use snooping for AMD HDMI" ASoC: intel: sof_sdw: Add check devm_kasprintf() returned value ASoC: imx-card: Set card.owner to avoid a warning calltrace if SND=m ASoC: dt-bindings: davinci-mcasp: Fix interrupts property ASoC: qcom: sm8250: add qrb4210-rb2-sndcard compatible string ASoC: dt-bindings: qcom,sm8250: add qrb4210-rb2-sndcard ALSA: hda: fix trigger_tstamp_latched ALSA: hda/realtek: Add a quirk for HP Pavilion 15z-ec200 ALSA: hda/generic: Drop obsoleted obey_preferred_dacs flag ALSA: hda/generic: Unconditionally prefer preferred_dacs pairs ALSA: silence integer wrapping warning ASoC: Intel: soc-acpi: arl: Fix some missing empty terminators ASoC: Intel: soc-acpi-intel-rpl-match: add missing empty item ...
2024-10-04	Merge tag 'drm-fixes-2024-10-04' of https://gitlab.freedesktop.org/drm/kernel	Linus Torvalds	42	-108/+282
	Pull drm fixes from Dave Airlie: "Weekly fixes, xe and amdgpu lead the way, with panthor, and few core components getting various fixes. Nothing seems too out of the ordinary. atomic: - Use correct type when reading damage rectangles display: - Fix kernel docs dp-mst: - Fix DSC decompression detection hdmi: - Fix infoframe size sched: - Update maintainers - Fix race condition whne queueing up jobs - Fix locking in drm_sched_entity_modify_sched() - Fix pointer deref if entity queue changes sysfb: - Disable sysfb if framebuffer parent device is unknown amdgpu: - DML2 fix - DSC fix - Dispclk fix - eDP HDR fix - IPS fix - TBT fix i915: - One fix for bitwise and logical "and" mixup in PM code xe: - Restore pci state on resume - Fix locking on submission, queue and vm - Fix UAF on queue destruction - Fix resource release on freq init error path - Use rw_semaphore to reduce contention on ASID->VM lookup - Fix steering for media on Xe2_HPM - Tuning updates to Xe2 - Resume TDR after GT reset to prevent jobs running forever - Move id allocation to avoid userspace using a guessed number to trigger UAF - Fix OA stream close preventing pbatch buffers to complete - Fix NPD when migrating memory on LNL - Fix memory leak when aborting binds panthor: - Fix locking - Set FOP_UNSIGNED_OFFSET in fops instance - Acquire lock in panthor_vm_prepare_map_op_ctx() - Avoid uninitialized variable in tick_ctx_cleanup() - Do not block scheduler queue if work is pending - Do not add write fences to the shared BOs vbox: - Fix VLA handling" * tag 'drm-fixes-2024-10-04' of https://gitlab.freedesktop.org/drm/kernel: (41 commits) drm/xe: Fix memory leak when aborting binds drm/xe: Prevent null pointer access in xe_migrate_copy drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close drm/xe/queue: move xa_alloc to prevent UAF drm/xe/vm: move xa_alloc to prevent UAF drm/xe: Clean up VM / exec queue file lock usage. drm/xe: Resume TDR after GT reset drm/xe/xe2: Add performance tuning for L3 cache flushing drm/xe/xe2: Extend performance tuning to media GT drm/xe/mcr: Use Xe2_LPM steering tables for Xe2_HPM drm/xe: Use helper for ASID -> VM in GPU faults and access counters drm/xe: Convert to USM lock to rwsem drm/xe: use devm_add_action_or_reset() helper drm/xe: fix UAF around queue destruction drm/xe/guc_submit: add missing locking in wedged_fini drm/xe: Restore pci state upon resume drm/amd/display: Fix system hang while resume with TBT monitor drm/amd/display: Enable idle workqueue for more IPS modes drm/amd/display: Add HDR workaround for specific eDP drm/amd/display: avoid set dispclk to 0 ...
2024-10-04	net: dsa: sja1105: fix reception from VLAN-unaware bridges	Vladimir Oltean	1	-1/+0
	The blamed commit introduced an unexpected regression in the sja1105 driver. Packets from VLAN-unaware bridge ports get received correctly, but the protocol stack can't seem to decode them properly. For ds->untag_bridge_pvid users (thus also sja1105), the blamed commit did introduce a functional change: dsa_switch_rcv() used to call dsa_untag_bridge_pvid(), which looked like this: err = br_vlan_get_proto(br, &proto); if (err) return skb; /* Move VLAN tag from data to hwaccel / if (!skb_vlan_tag_present(skb) && skb->protocol == htons(proto)) { skb = skb_vlan_untag(skb); if (!skb) return NULL; } and now it calls dsa_software_vlan_untag() which has just this: / Move VLAN tag from data to hwaccel / if (!skb_vlan_tag_present(skb)) { skb = skb_vlan_untag(skb); if (!skb) return NULL; } thus lacks any skb->protocol == bridge VLAN protocol check. That check is deferred until a later check for skb->vlan_proto (in the hwaccel area). The new code is problematic because, for VLAN-untagged packets, skb_vlan_untag() blindly takes the 4 bytes starting with the EtherType and turns them into a hwaccel VLAN tag. This is what breaks the protocol stack. It would be tempting to "make it work as before" and only call skb_vlan_untag() for those packets with the skb->protocol actually representing a VLAN. But the premise of the newly introduced dsa_software_vlan_untag() core function is not wrong. Drivers set ds->untag_bridge_pvid or ds->untag_vlan_aware_bridge_pvid presumably because they send all traffic to the CPU reception path as VLAN-tagged. So why should we spend any additional CPU cycles assuming that the packet may be VLAN-untagged? And why does the sja1105 driver opt into ds->untag_bridge_pvid if it doesn't always deliver packets to the CPU as VLAN-tagged? The answer to the latter question is indeed more interesting: it doesn't need to. This got done in commit 884be12f8566 ("net: dsa: sja1105: add support for imprecise RX"), because I thought it would be needed, but I didn't realize that it doesn't actually make a difference. As explained in the commit message of the blamed patch, ds->untag_bridge_pvid only makes a difference in the VLAN-untagged receive path of a bridge port. However, in that operating mode, tag_sja1105.c makes use of VLAN tags with the ETH_P_SJA1105 TPID, and it decodes and consumes these VLAN tags as if they were DSA tags (aka tag_8021q operation). Even if commit 884be12f8566 ("net: dsa: sja1105: add support for imprecise RX") added this logic in sja1105_bridge_vlan_add(): / Always install bridge VLANs as egress-tagged on the CPU port. */ if (dsa_is_cpu_port(ds, port)) flags = 0; that was for _bridge_ VLANs, which are _not_ committed to hardware in VLAN-unaware mode (aka the mode where ds->untag_bridge_pvid does anything at all). Even prior to that change, the tag_8021q VLANs were always installed as egress-tagged on the CPU port, see dsa_switch_tag_8021q_vlan_add(): u16 flags = 0; // egress-tagged, non-PVID if (dsa_port_is_user(dp)) flags \|= BRIDGE_VLAN_INFO_UNTAGGED \| BRIDGE_VLAN_INFO_PVID; err = dsa_port_do_tag_8021q_vlan_add(dp, info->vid, flags); if (err) return err; Whether the sja1105 driver needs the new flag, ds->untag_vlan_aware_bridge_pvid, rather than ds->untag_bridge_pvid, is a separate discussion. To fix the current bug in VLAN-unaware bridge mode, I would argue that the sja1105 driver should not request something it doesn't need, rather than complicating the core DSA helper. Whereas before the blamed commit, this setting was harmless, now it has caused breakage. Fixes: 93e4649efa96 ("net: dsa: provide a software untagging function on RX for VLAN-aware bridges") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241001140206.50933-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	octeontx2-af: Change block parameter to const pointer in get_lf_str_list	Riyan Dhiman	1	-7/+7
	Convert struct rvu_block block to const struct rvu_block *block in get_lf_str_list() function parameter. This improves efficiency by avoiding structure copying and reflects the function's read-only access to block. Signed-off-by: Riyan Dhiman <riyandhiman14@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241001110542.5404-2-riyandhiman14@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	net: macb: Adding support for Jumbo Frames up to 10240 Bytes in SAMA5D2	Aleksander Jan Bajkowski	1	-1/+2
	As per the SAMA5D2 device specification it supports Jumbo frames. But the suggested flag and length of bytes it supports was not updated in this driver config_structure. The maximum jumbo frames the device supports: 10240 bytes as per the device spec. While changing the MTU value greater than 1500, it threw error: sudo ifconfig eth1 mtu 9000 SIOCSIFMTU: Invalid argument Add this support to driver so that it works as expected and designed. Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Link: https://patch.msgid.link/20241003171941.8814-1-olek2@wp.pl Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	Merge tag 'block-6.12-20241004' of git://git.kernel.dk/linux	Linus Torvalds	3	-7/+17
	Pull block fixes from Jens Axboe: - Fix another use-after-free in aoe - Fixup wrong nested non-saving irq disable/restore in blk-iocost - Fixup a kerneldoc complaint introduced by a merge window patch * tag 'block-6.12-20241004' of git://git.kernel.dk/linux: aoe: fix the potential use-after-free problem in more places blk_iocost: remove some duplicate irq disable/enables block: fix blk_rq_map_integrity_sg kernel-doc
2024-10-04	Merge tag 'io_uring-6.12-20241004' of git://git.kernel.dk/linux	Linus Torvalds	2	-3/+8
	Pull io_uring fixes from Jens Axboe: - Fix an error path memory leak, if one part fails to allocate. Obviously not something that'll generally hit without error injection. - Fix an io_req_flags_t cast to make sparse happier. - Improve the recv multishot termination. Not a bug now, but could be one in the future. This makes it do the same thing that recvmsg does in terms of when to terminate a request or not. * tag 'io_uring-6.12-20241004' of git://git.kernel.dk/linux: io_uring/net: harden multishot termination case for recv io_uring: fix casts to io_req_flags_t io_uring: fix memory leak when cache init fail
2024-10-04	Merge tag 'fsnotify_for_v6.12-rc2' of ↵	Linus Torvalds	8	-34/+25
	git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull fsnotify fixes from Jan Kara: "Fixes for an inotify deadlock and a data race in fsnotify" * tag 'fsnotify_for_v6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: inotify: Fix possible deadlock in fsnotify_destroy_mark fsnotify: Avoid data race between fsnotify_recalc_mask() and fsnotify_object_watched()
2024-10-04	Merge tag 'fs_for_v6.12-rc2' of ↵	Linus Torvalds	7	-106/+224
	git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull UDF fixes from Jan Kara: "A couple of UDF error handling fixes for issues spotted by syzbot" * tag 'fs_for_v6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: udf: fix uninit-value use in udf_get_fileshortad udf: refactor inode_bmap() to handle error udf: refactor udf_next_aext() to handle error udf: refactor udf_current_aext() to handle error
2024-10-04	Merge tag 'ceph-for-6.12-rc2' of https://github.com/ceph/ceph-client	Linus Torvalds	1	-2/+5
	Pull ceph fixes from Ilya Dryomov: "A fix from Patrick for a variety of CephFS lockup scenarios caused by a regression in cap handling which sneaked in through the netfs helper library in 5.18 (marked for stable) and an unrelated one-line cleanup" * tag 'ceph-for-6.12-rc2' of https://github.com/ceph/ceph-client: ceph: fix cap ref leak via netfs init_request ceph: use struct_size() helper in __ceph_pool_perm_get()
2024-10-04	Merge branches 'acpi-video' and 'acpi-battery'	Rafael J. Wysocki	2	-11/+26
	Merge an ACPI backlight (video) quirk and ACPI battery driver fix and cleanup for 6.12-rc2: - Add a quirk for Dell OptiPlex 5480 AIO to the ACPI backlight (video) driver (Hans de Goede). - Prevent the ACPI battery driver from crashing when unregistering a battery hook and simplify battery hook locking in it (Armin Wolf). * acpi-video: ACPI: video: Add backlight=native quirk for Dell OptiPlex 5480 AIO * acpi-battery: ACPI: battery: Fix possible crash when unregistering a battery hook ACPI: battery: Simplify battery hook locking
2024-10-04	Merge tag 'for-6.12-rc1-tag' of ↵	Linus Torvalds	7	-83/+58
	git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - in incremental send, fix invalid clone operation for file that got its size decreased - fix __counted_by() annotation of send path cache entries, we do not store the terminating NUL - fix a longstanding bug in relocation (and quite hard to hit by chance), drop back reference cache that can get out of sync after transaction commit - wait for fixup worker kthread before finishing umount - add missing raid-stripe-tree extent for NOCOW files, zoned mode cannot have NOCOW files but RST is meant to be a standalone feature - handle transaction start error during relocation, avoid potential NULL pointer dereference of relocation control structure (reported by syzbot) - disable module-wide rate limiting of debug level messages - minor fix to tracepoint definition (reported by checkpatch.pl) * tag 'for-6.12-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: disable rate limiting when debug enabled btrfs: wait for fixup workers before stopping cleaner kthread during umount btrfs: fix a NULL pointer dereference when failed to start a new trasacntion btrfs: send: fix invalid clone operation for file that got its size decreased btrfs: tracepoints: end assignment with semicolon at btrfs_qgroup_extent event class btrfs: drop the backref cache during relocation if we commit btrfs: also add stripe entries for NOCOW writes btrfs: send: fix buffer overflow detection when copying path to cache entry
2024-10-04	Merge tag 'v6.12-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6	Linus Torvalds	24	-59/+167
	Pull smb client fixes from Steve French: - statfs fix (e.g. when limited access to root directory of share) - special file handling fixes: fix packet validation to avoid buffer overflow for reparse points, fixes for symlink path parsing (one for reparse points, and one for SFU use case), and fix for cleanup after failed SET_REPARSE operation. - fix for SMB2.1 signing bug introduced by recent patch to NFS symlink path, and NFS reparse point validation - comment cleanup * tag 'v6.12-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: Do not convert delimiter when parsing NFS-style symlinks cifs: Validate content of NFS reparse point buffer cifs: Fix buffer overflow when parsing NFS reparse points smb: client: Correct typos in multiple comments across various files smb: client: use actual path when queryfs cifs: Remove intermediate object of failed create reparse call Revert "smb: client: make SHA-512 TFM ephemeral" smb: Update comments about some reparse point tags cifs: Check for UTF-16 null codepoint in SFU symlink target location
2024-10-04	Merge branch 'net-airoha-fix-pse-memory-configuration'	Jakub Kicinski	1	-6/+14
	Lorenzo Bianconi says: ==================== net: airoha: Fix PSE memory configuration Align PSE memory configuration to vendor SDK. Increase initial value of PSE reserved memory in airoha_fe_pse_ports_init() by the value used for the second Packet Processor Engine (PPE2). Do not overwrite the default value for the number of PSE reserved pages in airoha_fe_set_pse_oq_rsv(). These changes fix issues which are not visible to the user. v1: https://lore.kernel.org/20240930-airoha-eth-pse-fix-v1-0-f41f2f35abb9@kernel.org ==================== Link: https://patch.msgid.link/20241001-airoha-eth-pse-fix-v2-0-9a56cdffd074@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	net: airoha: fix PSE memory configuration in airoha_fe_pse_ports_init()	Lorenzo Bianconi	1	-2/+4
	Align PSE memory configuration to vendor SDK. In particular, increase initial value of PSE reserved memory in airoha_fe_pse_ports_init() routine by the value used for the second Packet Processor Engine (PPE2) and do not overwrite the default value. Introduced by commit 23020f049327 ("net: airoha: Introduce ethernet support for EN7581 SoC") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241001-airoha-eth-pse-fix-v2-2-9a56cdffd074@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	net: airoha: read default PSE reserved pages value before updating	Lorenzo Bianconi	1	-4/+10
	Store the default value for the number of PSE reserved pages in orig_val at the beginning of airoha_fe_set_pse_oq_rsv routine, before updating it with airoha_fe_set_pse_queue_rsv_pages(). Introduce airoha_fe_get_pse_all_rsv utility routine. Introduced by commit 23020f049327 ("net: airoha: Introduce ethernet support for EN7581 SoC") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241001-airoha-eth-pse-fix-v2-1-9a56cdffd074@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	Linus Torvalds	3	-83/+52
	Pull close_range() fix from Al Viro: "Fix the logic in descriptor table trimming" * tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: close_range(): fix the logics in descriptor table trimming
2024-10-04	Merge tag 'i2c-host-fixes-6.12-rc2' of ↵	Wolfram Sang	1	-3/+3
	git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current i2c-host fixes for v6.12-rc2 In the stm32f7 a potential deadlock is fixed during runtime suspend and resume.
2024-10-04	Merge branch 'net-switch-to-scoped-device_for_each_child_node'	Jakub Kicinski	2	-10/+4
	Javier Carrasco says: ==================== net: switch to scoped device_for_each_child_node() This series switches from the device_for_each_child_node() macro to its scoped variant. This makes the code more robust if new early exits are added to the loops, because there is no need for explicit calls to fwnode_handle_put(), which also simplifies existing code. The non-scoped macros to walk over nodes turn error-prone as soon as the loop contains early exits (break, goto, return), and patches to fix them show up regularly, sometimes due to new error paths in an existing loop [1]. Note that the child node is now declared in the macro, and therefore the explicit declaration is no longer required. The general functionality should not be affected by this modification. If functional changes are found, please report them back as errors. Link: https://lore.kernel.org/20240901160829.709296395@linuxfoundation.org v1: https://lore.kernel.org/r/20240930-net-device_for_each_child_node_scoped-v1-0-bbdd7f9fd649@gmail.com ==================== Link: https://patch.msgid.link/20240930-net-device_for_each_child_node_scoped-v2-0-35f09333c1d7@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	net: hns: hisilicon: hns_dsaf_mac: switch to scoped device_for_each_child_node()	Javier Carrasco	1	-7/+3
	Use device_for_each_child_node_scoped() to simplify the code by removing the need for explicit calls to fwnode_handle_put() in every error path. This approach also accounts for any error path that could be added. Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://patch.msgid.link/20240930-net-device_for_each_child_node_scoped-v2-2-35f09333c1d7@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	net: mdio: thunder: switch to scoped device_for_each_child_node()	Javier Carrasco	1	-3/+1
	There has already been an issue with the handling of early exits from device_for_each_child() in this driver, and it was solved with commit b1de5c78ebe9 ("net: mdio: thunder: Add missing fwnode_handle_put()") by adding a call to fwnode_handle_put() right after the loop. That solution is valid indeed, but if a new error path with a 'return' is added to the loop, this solution will fail. A more secure approach is using the scoped variant of the macro, which automatically decrements the refcount of the child node when it goes out of scope, removing the need for explicit calls to fwnode_handle_put(). Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://patch.msgid.link/20240930-net-device_for_each_child_node_scoped-v2-1-35f09333c1d7@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	Merge branch 'qed-ethtool-d-faster-less-latency'	Jakub Kicinski	3	-29/+18
	Michal Schmidt says: ==================== qed: 'ethtool -d' faster, less latency Here is a patch to make 'ethtool -d' on a qede network device a lot faster and 3 patches to make it cause less latency for other tasks on non-preemptible kernels. ==================== Link: https://patch.msgid.link/20240930201307.330692-1-mschmidt@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	qed: put cond_resched() in qed_dmae_operation_wait()	Michal Schmidt	1	-0/+1
	It is OK to sleep in qed_dmae_operation_wait, because it is called only in process context, while holding p_hwfn->dmae_info.mutex from one of the qed_dmae_{host,grc}2{host,grc} functions. The udelay(DMAE_MIN_WAIT_TIME=2) in the function is too short to replace with usleep_range, but at least it's a suitable point for checking if we should give up the CPU with cond_resched(). This lowers the latency caused by 'ethtool -d' from 10 ms to less than 2 ms on my test system with voluntary preemption. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Link: https://patch.msgid.link/20240930201307.330692-5-mschmidt@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	qed: allow the callee of qed_mcp_nvm_read() to sleep	Michal Schmidt	1	-8/+1
	qed_mcp_nvm_read has a loop where it calls qed_mcp_nvm_rd_cmd with the argument b_can_sleep=false. And it sleeps once every 0x1000 bytes read. Simplify this by letting qed_mcp_nvm_rd_cmd itself sleep (b_can_sleep=true). It will have slept at least once when successful (in the "Wait for the MFW response" loop). So the extra sleep once every 0x1000 bytes becomes superfluous. Delete it. On my test system with voluntary preemption, this lowers the latency caused by 'ethtool -d' from 53 ms to 10 ms. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Link: https://patch.msgid.link/20240930201307.330692-4-mschmidt@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	qed: put cond_resched() in qed_grc_dump_ctx_data()	Michal Schmidt	1	-0/+1
	On a kernel with preemption none or voluntary, 'ethtool -d' on a qede network device can cause a big latency spike. The biggest part of it is the loop in qed_grc_dump_ctx_data. The function is called only from the .get_size and .perform_dump callbacks for the "grc" feature defined in qed_features_lookup[]. As far as I can see, they are used in: - qed's devlink healh reporter .dump op - qede's ethtool get_regs/get_regs_len/get_dump_data ops - qedf's qedf_get_grc_dump, called from: - qedf_sysfs_write_grcdump - "grcdump" sysfs attribute write - qedf_wq_grcdump - a workqueue It is safe to sleep in all of them. Let's insert a cond_resched() in the outer loop to let other tasks run. Measured using this script: #!/bin/bash DEV=ens3f1 echo wakeup_rt > /sys/kernel/tracing/current_tracer echo 0 > /sys/kernel/tracing/tracing_max_latency echo 1 > /sys/kernel/tracing/tracing_on echo "Setting the task CPU affinity" taskset -p 1 $$ > /dev/null echo "Starting the real-time task" chrt -f 50 bash -c 'while sleep 0.01; do :; done' & sleep 1 echo "Running: ethtool -d $DEV" time ethtool -d $DEV > /dev/null kill %1 echo 0 > /sys/kernel/tracing/tracing_on echo "Measured latency: $(</sys/kernel/tracing/tracing_max_latency) us" echo "To see the latency trace: less /sys/kernel/tracing/trace" The patch lowers the latency from 180 ms to 53 ms on my test system with voluntary preemption. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Link: https://patch.msgid.link/20240930201307.330692-3-mschmidt@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	qed: make 'ethtool -d' 10 times faster	Michal Schmidt	1	-21/+15
	As a side effect of commit 5401c3e09928 ("qed: allow sleep in qed_mcp_trace_dump()"), 'ethtool -d' became much slower. Almost all the time is spent collecting the "mcp_trace". It is caused by sleeping too long in _qed_mcp_cmd_and_union. When called with sleeping not allowed, the function delays for 10 µs between firmware polls. But if sleeping is allowed, it sleeps for 10 ms instead. The sleeps in _qed_mcp_cmd_and_union are unnecessarily long. Replace msleep with usleep_range, which allows to achieve a similar polling interval like in the no-sleeping mode (10 - 20 µs). The only caller, qed_mcp_cmd_and_union, can stop doing the multiplication/division of the usecs/max_retries. The polling interval and the number of retries do not need to be parameters at all. On my test system, 'ethtool -d' now takes 4 seconds instead of 44. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Link: https://patch.msgid.link/20240930201307.330692-2-mschmidt@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	Merge branch 'net-mv643xx-devm-fixes'	Jakub Kicinski	1	-21/+7
	Rosen Penev says: ==================== net: mv643xx: devm fixes Small simplification and a fix for a seemingly wrong function usage. ==================== Link: https://patch.msgid.link/20240930202951.297737-1-rosenp@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	net: mv643xx: fix wrong devm_clk_get usage	Rosen Penev	1	-13/+4
	This clock should be optional. In addition, PTR_ERR can be -EPROBE_DEFER in which case it should return. devm_clk_get_optional_enabled also allows removing explicit clock enable and disable calls. Signed-off-by: Rosen Penev <rosenp@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20240930202951.297737-3-rosenp@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	net: mv643xx: use devm_platform_ioremap_resource	Rosen Penev	1	-8/+3
	This combines multiple steps in one function. Signed-off-by: Rosen Penev <rosenp@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20240930202951.297737-2-rosenp@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	Merge branch 'net-ag71xx-small-cleanups'	Jakub Kicinski	1	-23/+14
	Rosen Penev says: ==================== net: ag71xx: small cleanups More devm and some loose ends. ==================== Link: https://patch.msgid.link/20240930181823.288892-1-rosenp@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	net: ag71xx: move assignment into main loop	Rosen Penev	1	-2/+1
	Effectively what's going on here is there's a main loop and an identical one below with a single assignment. Simpler to move it up. Signed-off-by: Rosen Penev <rosenp@gmail.com> Link: https://patch.msgid.link/20240930181823.288892-6-rosenp@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	net: ag71xx: replace INIT_LIST_HEAD	Rosen Penev	1	-3/+1
	LIST_HEAD is a shorter macro. No real difference. Signed-off-by: Rosen Penev <rosenp@gmail.com> Link: https://patch.msgid.link/20240930181823.288892-5-rosenp@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04	net: ag71xx: remove platform_set_drvdata	Rosen Penev	1	-3/+0
	platform_get_drvdata is never called as a result of all the devm conversions. Signed-off-by: Rosen Penev <rosenp@gmail.com> Link: https://patch.msgid.link/20240930181823.288892-4-rosenp@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>