summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)AuthorFilesLines
2023-09-15ipv6: lockless IPV6_MTU_DISCOVER implementationEric Dumazet1-3/+2
Most np->pmtudisc reads are racy. Move this 3bit field on a full byte, add annotations and make IPV6_MTU_DISCOVER setsockopt() lockless. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-15ipv6: lockless IPV6_ROUTER_ALERT_ISOLATE implementationEric Dumazet1-2/+1
Reads from np->rtalert_isolate are racy. Move this flag to inet->inet_flags to fix data-races. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-15ipv6: move np->repflow to atomic flagsEric Dumazet1-1/+0
Move np->repflow to inet->inet_flags to fix data-races. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-15ipv6: lockless IPV6_RECVERR implemetationEric Dumazet1-2/+1
np->recverr is moved to inet->inet_flags to fix data-races. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-15ipv6: lockless IPV6_DONTFRAG implementationEric Dumazet1-1/+0
Move np->dontfrag flag to inet->inet_flags to fix data-races. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-15ipv6: lockless IPV6_AUTOFLOWLABEL implementationEric Dumazet1-2/+0
Move np->autoflowlabel and np->autoflowlabel_set in inet->inet_flags, to fix data-races. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-15ipv6: lockless IPV6_MULTICAST_ALL implementationEric Dumazet1-1/+0
Move np->mc_all to an atomic flags to fix data-races. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-15ipv6: lockless IPV6_RECVERR_RFC4884 implementationEric Dumazet1-1/+0
Move np->recverr_rfc4884 to an atomic flag to fix data-races. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-15ipv6: lockless IPV6_MULTICAST_HOPS implementationEric Dumazet1-8/+1
This fixes data-races around np->mcast_hops, and make IPV6_MULTICAST_HOPS lockless. Note that np->mcast_hops is never negative, thus can fit an u8 field instead of s16. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-15ipv6: lockless IPV6_MULTICAST_LOOP implementationEric Dumazet1-4/+14
Add inet6_{test|set|clear|assign}_bit() helpers. Note that I am using bits from inet->inet_flags, this might change in the future if we need more flags. While solving data-races accessing np->mc_loop, this patch also allows to implement lockless accesses to np->mcast_hops in the following patch. Also constify sk_mc_loop() argument. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-15ipv6: lockless IPV6_UNICAST_HOPS implementationEric Dumazet1-11/+1
Some np->hop_limit accesses are racy, when socket lock is not held. Add missing annotations and switch to full lockless implementation. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-15sched: Assert for_each_thread() is properly lockedMatthew Wilcox (Oracle)1-1/+2
list_for_each_entry_rcu() takes an optional fourth argument which allows RCU to assert that the correct lock is held. Several callers of for_each_thread() rely on their caller to be holding the appropriate lock, so this is a useful assertion to include. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Link: https://lore.kernel.org/r/20230821134428.2504912-1-willy@infradead.org
2023-09-14lsm: constify 'sb' parameter in security_sb_kern_mount()Khadija Kamran2-2/+2
The "sb_kern_mount" hook has implementation registered in SELinux. Looking at the function implementation we observe that the "sb" parameter is not changing. Mark the "sb" parameter of LSM hook security_sb_kern_mount() as "const" since it will not be changing in the LSM hook. Signed-off-by: Khadija Kamran <kamrankhadijadj@gmail.com> [PM: minor merge fuzzing due to other constification patches] Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-09-14lsm: constify 'bprm' parameter in security_bprm_committed_creds()Khadija Kamran2-3/+3
Three LSMs register the implementations for the 'bprm_committed_creds()' hook: AppArmor, SELinux and tomoyo. Looking at the function implementations we may observe that the 'bprm' parameter is not changing. Mark the 'bprm' parameter of LSM hook security_bprm_committed_creds() as 'const' since it will not be changing in the LSM hook. Signed-off-by: Khadija Kamran <kamrankhadijadj@gmail.com> [PM: minor merge fuzzing due to other constification patches] Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-09-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netPaolo Abeni8-24/+67
Cross-merge networking fixes after downstream PR. No conflicts. Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-09-14firmware: qcom-scm: drop unneeded 'extern' specifiersBartosz Golaszewski1-54/+47
The 'extern' specifier in front of a function declaration has no effect. Remove all of them from the qcom-scm header. Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20230913192826.36187-1-bartosz.golaszewski@linaro.org Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2023-09-14udplite: fix various data-racesEric Dumazet1-4/+2
udp->pcflag, udp->pcslen and udp->pcrlen reads/writes are racy. Move udp->pcflag to udp->udp_flags for atomicity, and add READ_ONCE()/WRITE_ONCE() annotations for pcslen and pcrlen. Fixes: ba4e58eca8aa ("[NET]: Supporting UDP-Lite (RFC 3828) in Linux") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-09-14udplite: remove UDPLITE_BITEric Dumazet1-3/+2
This flag is set but never read, we can remove it. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-09-14udp: lockless UDP_ENCAP_L2TPINUDP / UDP_GROEric Dumazet1-5/+4
Move udp->encap_enabled to udp->udp_flags. Add udp_test_and_set_bit() helper to allow lockless udp_tunnel_encap_enable() implementation. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-09-14udp: move udp->accept_udp_{l4|fraglist} to udp->udp_flagsEric Dumazet1-7/+9
These are read locklessly, move them to udp_flags to fix data-races. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-09-14udp: move udp->gro_enabled to udp->udp_flagsEric Dumazet1-1/+1
syzbot reported that udp->gro_enabled can be read locklessly. Use one atomic bit from udp->udp_flags. Fixes: e20cf8d3f1f7 ("udp: implement GRO for plain UDP sockets.") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-09-14udp: move udp->no_check6_rx to udp->udp_flagsEric Dumazet1-5/+5
syzbot reported that udp->no_check6_rx can be read locklessly. Use one atomic bit from udp->udp_flags. Fixes: 1c19448c9ba6 ("net: Make enabling of zero UDP6 csums more restrictive") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-09-14udp: move udp->no_check6_tx to udp->udp_flagsEric Dumazet1-5/+5
syzbot reported that udp->no_check6_tx can be read locklessly. Use one atomic bit from udp->udp_flags Fixes: 1c19448c9ba6 ("net: Make enabling of zero UDP6 csums more restrictive") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-09-14udp: introduce udp->udp_flagsEric Dumazet1-7/+21
According to syzbot, it is time to use proper atomic flags for various UDP flags. Add udp_flags field, and convert udp->corkflag to first bit in it. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-09-14Merge branch '6.6/scsi-staging' into 6.6/scsi-fixesMartin K. Petersen1-3/+3
Pull in staged fixes for 6.6. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-14lsm: constify 'bprm' parameter in security_bprm_committing_creds()Khadija Kamran2-3/+3
The 'bprm_committing_creds' hook has implementations registered in SELinux and Apparmor. Looking at the function implementations we observe that the 'bprm' parameter is not changing. Mark the 'bprm' parameter of LSM hook security_bprm_committing_creds() as 'const' since it will not be changing in the LSM hook. Signed-off-by: Khadija Kamran <kamrankhadijadj@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-09-14lsm: constify 'file' parameter in security_bprm_creds_from_file()Khadija Kamran3-5/+5
The 'bprm_creds_from_file' hook has implementation registered in commoncap. Looking at the function implementation we observe that the 'file' parameter is not changing. Mark the 'file' parameter of LSM hook security_bprm_creds_from_file() as 'const' since it will not be changing in the LSM hook. Signed-off-by: Khadija Kamran <kamrankhadijadj@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-09-14lsm: constify 'sb' parameter in security_quotactl()Khadija Kamran2-3/+3
SELinux registers the implementation for the "quotactl" hook. Looking at the function implementation we observe that the parameter "sb" is not changing. Mark the "sb" parameter of LSM hook security_quotactl() as "const" since it will not be changing in the LSM hook. Signed-off-by: Khadija Kamran <kamrankhadijadj@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-09-13mm: Remove kmem_valid_obj()Zhen Lei1-2/+3
Function kmem_dump_obj() will splat if passed a pointer to a non-slab object. So nothing calls it directly, instead calling kmem_valid_obj() first to determine whether the passed pointer to a valid slab object. This means that merging kmem_valid_obj() into kmem_dump_obj() will make the code more concise. Therefore, convert kmem_dump_obj() to work the same way as vmalloc_dump_obj(), removing the need for the kmem_dump_obj() caller to check kmem_valid_obj(). After this, there are no remaining calls to kmem_valid_obj() anymore, and it can be safely removed. Suggested-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2023-09-13rcu: Remove unused function declaration rcu_eqs_special_set()Yue Haibing1-1/+0
Commit a86baa69c2b7 ("rcu: Remove special bit at the bottom of the ->dynticks counter") left behind this, remove it. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2023-09-13Merge tag 'parisc-for-6.6-rc2' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc architecture fixes from Helge Deller: - fix reference to exported symbols for parisc64 [Masahiro Yamada] - Block-TLB (BTLB) support on 32-bit CPUs - sparse and build-warning fixes * tag 'parisc-for-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: linux/export: fix reference to exported functions for parisc64 parisc: BTLB: Initialize BTLB tables at CPU startup parisc: firmware: Simplify calling non-PA20 functions parisc: BTLB: _edata symbol has to be page aligned for BTLB support parisc: BTLB: Add BTLB insert and purge firmware function wrappers parisc: BTLB: Clear possibly existing BTLB entries parisc: Prepare for Block-TLB support on 32-bit kernel parisc: shmparam.h: Document aliasing requirements of PA-RISC parisc: irq: Make irq_stack_union static to avoid sparse warning parisc: drivers: Fix sparse warning parisc: iosapic.c: Fix sparse warnings parisc: ccio-dma: Fix sparse warnings parisc: sba-iommu: Fix sparse warnigs parisc: sba: Fix compile warning wrt list of SBA devices parisc: sba_iommu: Fix build warning if procfs if disabled
2023-09-13Merge tag 'trace-v6.6-rc1' of ↵Linus Torvalds1-4/+3
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Add missing LOCKDOWN checks for eventfs callers When LOCKDOWN is active for tracing, it causes inconsistent state when some functions succeed and others fail. - Use dput() to free the top level eventfs descriptor There was a race between accesses and freeing it. - Fix a long standing bug that eventfs exposed due to changing timings by dynamically creating files. That is, If a event file is opened for an instance, there's nothing preventing the instance from being removed which will make accessing the files cause use-after-free bugs. - Fix a ring buffer race that happens when iterating over the ring buffer while writers are active. Check to make sure not to read the event meta data if it's beyond the end of the ring buffer sub buffer. - Fix the print trigger that disappeared because the test to create it was looking for the event dir field being filled, but now it has the "ef" field filled for the eventfs structure. - Remove the unused "dir" field from the event structure. - Fix the order of the trace_dynamic_info as it had it backwards for the offset and len fields for which one was for which endianess. - Fix NULL pointer dereference with eventfs_remove_rec() If an allocation fails in one of the eventfs_add_*() functions, the caller of it in event_subsystem_dir() or event_create_dir() assigns the result to the structure. But it's assigning the ERR_PTR and not NULL. This was passed to eventfs_remove_rec() which expects either a good pointer or a NULL, not ERR_PTR. The fix is to not assign the ERR_PTR to the structure, but to keep it NULL on error. - Fix list_for_each_rcu() to use list_for_each_srcu() in dcache_dir_open_wrapper(). One iteration of the code used RCU but because it had to call sleepable code, it had to be changed to use SRCU, but one of the iterations was missed. - Fix synthetic event print function to use "as_u64" instead of passing in a pointer to the union. To fix big/little endian issues, the u64 that represented several types was turned into a union to define the types properly. * tag 'trace-v6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: eventfs: Fix the NULL pointer dereference bug in eventfs_remove_rec() tracefs/eventfs: Use list_for_each_srcu() in dcache_dir_open_wrapper() tracing/synthetic: Print out u64 values properly tracing/synthetic: Fix order of struct trace_dynamic_info selftests/ftrace: Fix dependencies for some of the synthetic event tests tracing: Remove unused trace_event_file dir field tracing: Use the new eventfs descriptor for print trigger ring-buffer: Do not attempt to read past "commit" tracefs/eventfs: Free top level files on removal ring-buffer: Avoid softlockup in ring_buffer_resize() tracing: Have event inject files inc the trace array ref count tracing: Have option files inc the trace array ref count tracing: Have current_trace inc the trace array ref count tracing: Have tracing_max_latency inc the trace array ref count tracing: Increase trace array ref count on enable and filter files tracefs/eventfs: Use dput to free the toplevel events directory tracefs/eventfs: Add missing lockdown checks tracefs: Add missing lockdown check to tracefs_create_dir()
2023-09-13firmware: qcom_scm: Add support for Qualcomm Secure Execution Environment ↵Maximilian Luz2-0/+68
SCM interface Add support for SCM calls to Secure OS and the Secure Execution Environment (SEE) residing in the TrustZone (TZ) via the QSEECOM interface. This allows communication with Secure/TZ applications, for example 'uefisecapp' managing access to UEFI variables. For better separation, make qcom_scm spin up a dedicated child (platform) device in case QSEECOM support has been detected. The corresponding driver for this device is then responsible for managing any QSEECOM clients. Specifically, this driver attempts to automatically detect known and supported applications, creating a client (auxiliary) device for each one. The respective client/auxiliary driver is then responsible for managing and communicating with the application. While this patch introduces only a very basic interface without the more advanced features (such as re-entrant and blocking SCM calls and listeners/callbacks), this is enough to talk to the aforementioned 'uefisecapp'. Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com> Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Link: https://lore.kernel.org/r/20230827211408.689076-3-luzmaximilian@gmail.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2023-09-13lib/ucs2_string: Add UCS-2 strscpy functionMaximilian Luz1-0/+1
Add a ucs2_strscpy() function for UCS-2 strings. The behavior is equivalent to the standard strscpy() function, just for 16-bit character UCS-2 strings. Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com> Reviewed-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20230827211408.689076-2-luzmaximilian@gmail.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2023-09-13NFSv4.1: fix pnfs MDS=DS session trunkingOlga Kornievskaia1-0/+1
Currently, when GETDEVICEINFO returns multiple locations where each is a different IP but the server's identity is same as MDS, then nfs4_set_ds_client() finds the existing nfs_client structure which has the MDS's max_connect value (and if it's 1), then the 1st IP on the DS's list will get dropped due to MDS trunking rules. Other IPs would be added as they fall under the pnfs trunking rules. For the list of IPs the 1st goes thru calling nfs4_set_ds_client() which will eventually call nfs4_add_trunk() and call into rpc_clnt_test_and_add_xprt() which has the check for MDS trunking. The other IPs (after the 1st one), would call rpc_clnt_add_xprt() which doesn't go thru that check. nfs4_add_trunk() is called when MDS trunking is happening and it needs to enforce the usage of max_connect mount option of the 1st mount. However, this shouldn't be applied to pnfs flow. Instead, this patch proposed to treat MDS=DS as DS trunking and make sure that MDS's max_connect limit does not apply to the 1st IP returned in the GETDEVICEINFO list. It does so by marking the newly created client with a new flag NFS_CS_PNFS which then used to pass max_connect value to use into the rpc_clnt_test_and_add_xprt() instead of the existing rpc client's max_connect value set by the MDS connection. For example, mount was done without max_connect value set so MDS's rpc client has cl_max_connect=1. Upon calling into rpc_clnt_test_and_add_xprt() and using rpc client's value, the caller passes in max_connect value which is previously been set in the pnfs path (as a part of handling GETDEVICEINFO list of IPs) in nfs4_set_ds_client(). However, when NFS_CS_PNFS flag is not set and we know we are doing MDS trunking, comparing a new IP of the same server, we then set the max_connect value to the existing MDS's value and pass that into rpc_clnt_test_and_add_xprt(). Fixes: dc48e0abee24 ("SUNRPC enforce creation of no more than max_connect xprts") Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2023-09-13NFS: Use the correct commit info in nfs_join_page_group()Trond Myklebust1-1/+3
Ensure that nfs_clear_request_commit() updates the correct counters when it removes them from the commit list. Fixes: ed5d588fe47f ("NFS: Try to join page groups before an O_DIRECT retransmission") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2023-09-13virtchnl: Add CRC stripping capabilityPaul M Stillwell Jr1-2/+9
Some VFs may want to disable CRC stripping on incoming packets so create an offload for that. The VF already sends information about configuring its RX queues so use that structure to indicate that the CRC stripping should be enabled or not. Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-09-13sched: Simplify tg_set_cfs_bandwidth()Peter Zijlstra1-0/+2
Use guards to reduce gotos and simplify control flow. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2023-09-13cleanup: Make no_free_ptr() __must_checkPeter Zijlstra1-3/+36
recent discussion brought about the realization that it makes sense for no_free_ptr() to have __must_check semantics in order to avoid leaking the resource. Additionally, add a few comments to clarify why/how things work. All credit to Linus on how to combine __must_check and the stmt-expression. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lkml.kernel.org/r/20230816103102.GF980931@hirez.programming.kicks-ass.net
2023-09-13smp: Change function signatures to use call_single_data_tLeonardo Bras1-1/+1
call_single_data_t is a size-aligned typedef of struct __call_single_data. This alignment is desirable in order to have smp_call_function*() avoid bouncing an extra cacheline in case of an unaligned csd, given this would hurt performance. Since the removal of struct request->csd in commit 660e802c76c8 ("blk-mq: use percpu csd to remote complete instead of per-rq csd") there are no current users of smp_call_function*() with unaligned csd. Change every 'struct __call_single_data' function parameter to 'call_single_data_t', so we have warnings if any new code tries to introduce an smp_call_function*() call with unaligned csd. Signed-off-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Guo Ren <guoren@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20230831063129.335425-1-leobras@redhat.com
2023-09-13power: supply: core: Use blocking_notifier_call_chain to avoid RCU complaintKai-Heng Feng1-1/+1
AMD PMF driver can cause the following warning: [ 196.159546] ------------[ cut here ]------------ [ 196.159556] Voluntary context switch within RCU read-side critical section! [ 196.159571] WARNING: CPU: 0 PID: 9 at kernel/rcu/tree_plugin.h:320 rcu_note_context_switch+0x43d/0x560 [ 196.159604] Modules linked in: nvme_fabrics ccm rfcomm snd_hda_scodec_cs35l41_spi cmac algif_hash algif_skcipher af_alg bnep joydev btusb btrtl uvcvideo btintel btbcm videobuf2_vmalloc intel_rapl_msr btmtk videobuf2_memops uvc videobuf2_v4l2 intel_rapl_common binfmt_misc hid_sensor_als snd_sof_amd_vangogh hid_sensor_trigger bluetooth industrialio_triggered_buffer videodev snd_sof_amd_rembrandt hid_sensor_iio_common amdgpu ecdh_generic kfifo_buf videobuf2_common hp_wmi kvm_amd sparse_keymap snd_sof_amd_renoir wmi_bmof industrialio ecc mc nls_iso8859_1 kvm snd_sof_amd_acp irqbypass snd_sof_xtensa_dsp crct10dif_pclmul crc32_pclmul mt7921e snd_sof_pci snd_ctl_led polyval_clmulni mt7921_common polyval_generic snd_sof ghash_clmulni_intel mt792x_lib mt76_connac_lib sha512_ssse3 snd_sof_utils aesni_intel snd_hda_codec_realtek crypto_simd mt76 snd_hda_codec_generic cryptd snd_soc_core snd_hda_codec_hdmi rapl ledtrig_audio input_leds snd_compress i2c_algo_bit drm_ttm_helper mac80211 snd_pci_ps hid_multitouch ttm drm_exec [ 196.159970] drm_suballoc_helper snd_rpl_pci_acp6x amdxcp drm_buddy snd_hda_intel snd_acp_pci snd_hda_scodec_cs35l41_i2c serio_raw gpu_sched snd_hda_scodec_cs35l41 snd_acp_legacy_common snd_intel_dspcfg snd_hda_cs_dsp_ctls snd_hda_codec libarc4 drm_display_helper snd_pci_acp6x cs_dsp snd_hwdep snd_soc_cs35l41_lib video k10temp snd_pci_acp5x thunderbolt snd_hda_core drm_kms_helper cfg80211 snd_seq snd_rn_pci_acp3x snd_pcm snd_acp_config cec snd_soc_acpi snd_seq_device rc_core ccp snd_pci_acp3x snd_timer snd soundcore wmi amd_pmf platform_profile amd_pmc mac_hid serial_multi_instantiate wireless_hotkey hid_sensor_hub sch_fq_codel msr parport_pc ppdev lp parport efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx libcrc32c xor raid6_pq raid1 raid0 multipath linear dm_mirror dm_region_hash dm_log cdc_ether usbnet r8152 mii hid_generic nvme i2c_hid_acpi i2c_hid nvme_core i2c_piix4 xhci_pci amd_sfh drm xhci_pci_renesas nvme_common hid [ 196.160382] CPU: 0 PID: 9 Comm: kworker/0:1 Not tainted 6.6.0-rc1 #4 [ 196.160397] Hardware name: HP HP EliteBook 845 14 inch G10 Notebook PC/8B6E, BIOS V82 Ver. 01.02.00 08/24/2023 [ 196.160405] Workqueue: events power_supply_changed_work [ 196.160426] RIP: 0010:rcu_note_context_switch+0x43d/0x560 [ 196.160440] Code: 00 48 89 be 40 08 00 00 48 89 86 48 08 00 00 48 89 10 e9 63 fe ff ff 48 c7 c7 10 e7 b0 9e c6 05 e8 d8 20 02 01 e8 13 0f f3 ff <0f> 0b e9 27 fc ff ff a9 ff ff ff 7f 0f 84 cf fc ff ff 65 48 8b 3c [ 196.160450] RSP: 0018:ffffc900001878f0 EFLAGS: 00010046 [ 196.160462] RAX: 0000000000000000 RBX: ffff88885e834040 RCX: 0000000000000000 [ 196.160470] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 196.160476] RBP: ffffc90000187910 R08: 0000000000000000 R09: 0000000000000000 [ 196.160482] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 196.160488] R13: 0000000000000000 R14: ffff888100990000 R15: ffff888100990000 [ 196.160495] FS: 0000000000000000(0000) GS:ffff88885e800000(0000) knlGS:0000000000000000 [ 196.160504] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 196.160512] CR2: 000055cb053c8246 CR3: 000000013443a000 CR4: 0000000000750ef0 [ 196.160520] PKRU: 55555554 [ 196.160526] Call Trace: [ 196.160532] <TASK> [ 196.160548] ? show_regs+0x72/0x90 [ 196.160570] ? rcu_note_context_switch+0x43d/0x560 [ 196.160580] ? __warn+0x8d/0x160 [ 196.160600] ? rcu_note_context_switch+0x43d/0x560 [ 196.160613] ? report_bug+0x1bb/0x1d0 [ 196.160637] ? handle_bug+0x46/0x90 [ 196.160658] ? exc_invalid_op+0x19/0x80 [ 196.160675] ? asm_exc_invalid_op+0x1b/0x20 [ 196.160709] ? rcu_note_context_switch+0x43d/0x560 [ 196.160727] __schedule+0xb9/0x15f0 [ 196.160746] ? srso_alias_return_thunk+0x5/0x7f [ 196.160765] ? srso_alias_return_thunk+0x5/0x7f [ 196.160778] ? acpi_ns_search_one_scope+0xbe/0x270 [ 196.160806] schedule+0x68/0x110 [ 196.160820] schedule_timeout+0x151/0x160 [ 196.160829] ? srso_alias_return_thunk+0x5/0x7f [ 196.160842] ? srso_alias_return_thunk+0x5/0x7f [ 196.160855] ? acpi_ns_lookup+0x3c5/0xa90 [ 196.160878] __down_common+0xff/0x220 [ 196.160905] __down_timeout+0x16/0x30 [ 196.160920] down_timeout+0x64/0x70 [ 196.160938] acpi_os_wait_semaphore+0x85/0x200 [ 196.160959] acpi_ut_acquire_mutex+0x9e/0x280 [ 196.160979] acpi_ex_enter_interpreter+0x2d/0xb0 [ 196.160992] acpi_ns_evaluate+0x2f0/0x5f0 [ 196.161005] acpi_evaluate_object+0x172/0x490 [ 196.161018] ? acpi_os_signal_semaphore+0x8a/0xd0 [ 196.161038] acpi_evaluate_integer+0x52/0xe0 [ 196.161055] ? kfree+0x79/0x120 [ 196.161071] ? srso_alias_return_thunk+0x5/0x7f [ 196.161089] acpi_ac_get_state.part.0+0x27/0x80 [ 196.161110] get_ac_property+0x5c/0x70 [ 196.161127] ? __pfx___power_supply_is_system_supplied+0x10/0x10 [ 196.161146] __power_supply_is_system_supplied+0x44/0xb0 [ 196.161166] class_for_each_device+0x124/0x160 [ 196.161184] ? acpi_ac_get_state.part.0+0x27/0x80 [ 196.161203] ? srso_alias_return_thunk+0x5/0x7f [ 196.161223] power_supply_is_system_supplied+0x3c/0x70 [ 196.161243] amd_pmf_get_power_source+0xe/0x20 [amd_pmf] [ 196.161276] amd_pmf_power_slider_update_event+0x49/0x90 [amd_pmf] [ 196.161310] amd_pmf_pwr_src_notify_call+0xe7/0x100 [amd_pmf] [ 196.161340] notifier_call_chain+0x5f/0xe0 [ 196.161362] atomic_notifier_call_chain+0x33/0x60 [ 196.161378] power_supply_changed_work+0x84/0x110 [ 196.161394] process_one_work+0x178/0x360 [ 196.161412] ? __pfx_worker_thread+0x10/0x10 [ 196.161424] worker_thread+0x307/0x430 [ 196.161440] ? __pfx_worker_thread+0x10/0x10 [ 196.161451] kthread+0xf4/0x130 [ 196.161467] ? __pfx_kthread+0x10/0x10 [ 196.161486] ret_from_fork+0x43/0x70 [ 196.161502] ? __pfx_kthread+0x10/0x10 [ 196.161518] ret_from_fork_asm+0x1b/0x30 [ 196.161558] </TASK> [ 196.161562] ---[ end trace 0000000000000000 ]--- Since there's no guarantee that all the callbacks can work in atomic context, switch to use blocking_notifier_call_chain to relax the constraint. Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Reported-by: Allen Zhong <allen@atr.me> Fixes: 4c71ae414474 ("platform/x86/amd/pmf: Add support SPS PMF feature") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217571 Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20230913033233.602986-1-kai.heng.feng@canonical.com Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2023-09-13i2c: Drop legacy callback .probe_new()Uwe Kleine-König1-10/+1
Now that all drivers are converted to the (new) .probe() callback, the temporary .probe_new() can go away. \o/ Link: https://lore.kernel.org/linux-i2c/20230626094548.559542-1-u.kleine-koenig@pengutronix.de Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2023-09-13x86/numa: Introduce numa_fill_memblks()Alison Schofield1-0/+7
numa_fill_memblks() fills in the gaps in numa_meminfo memblks over an physical address range. The ACPI driver will use numa_fill_memblks() to implement a new Linux policy that prescribes extending proximity domains in a portion of a CFMWS window to the entire window. Dan Williams offered this explanation of the policy: A CFWMS is an ACPI data structure that indicates *potential* locations where CXL memory can be placed. It is the playground where the CXL driver has free reign to establish regions. That space can be populated by BIOS created regions, or driver created regions, after hotplug or other reconfiguration. When BIOS creates a region in a CXL Window it additionally describes that subset of the Window range in the other typical ACPI tables SRAT, SLIT, and HMAT. The rationale for BIOS not pre-describing the entire CXL Window in SRAT, SLIT, and HMAT is that it can not predict the future. I.e. there is nothing stopping higher or lower performance devices being placed in the same Window. Compare that to ACPI memory hotplug that just onlines additional capacity in the proximity domain with little freedom for dynamic performance differentiation. That leaves the OS with a choice, should unpopulated window capacity match the proximity domain of an existing region, or should it allocate a new one? This patch takes the simple position of minimizing proximity domain proliferation by reusing any proximity domain intersection for the entire Window. If the Window has no intersections then allocate a new proximity domain. Note that SRAT, SLIT and HMAT information can be enumerated dynamically in a standard way from device provided data. Think of CXL as the end of ACPI needing to describe memory attributes, CXL offers a standard discovery model for performance attributes, but Linux still needs to interoperate with the old regime. Reported-by: Derick Marks <derick.w.marks@intel.com> Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Derick Marks <derick.w.marks@intel.com> Link: https://lore.kernel.org/all/ef078a6f056ca974e5af85997013c0fda9e3326d.1689018477.git.alison.schofield%40intel.com
2023-09-12bpf, x64: Fix tailcall infinite loopLeon Hwang1-0/+5
From commit ebf7d1f508a73871 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT"), the tailcall on x64 works better than before. From commit e411901c0b775a3a ("bpf: allow for tailcalls in BPF subprograms for x64 JIT"), tailcall is able to run in BPF subprograms on x64. From commit 5b92a28aae4dd0f8 ("bpf: Support attaching tracing BPF program to other BPF programs"), BPF program is able to trace other BPF programs. How about combining them all together? 1. FENTRY/FEXIT on a BPF subprogram. 2. A tailcall runs in the BPF subprogram. 3. The tailcall calls the subprogram's caller. As a result, a tailcall infinite loop comes up. And the loop would halt the machine. As we know, in tail call context, the tail_call_cnt propagates by stack and rax register between BPF subprograms. So do in trampolines. Fixes: ebf7d1f508a7 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT") Fixes: e411901c0b77 ("bpf: allow for tailcalls in BPF subprograms for x64 JIT") Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Leon Hwang <hffilwlqm@gmail.com> Link: https://lore.kernel.org/r/20230912150442.2009-3-hffilwlqm@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-12tcp: defer regular ACK while processing socket backlogEric Dumazet1-6/+8
This idea came after a particular workload requested the quickack attribute set on routes, and a performance drop was noticed for large bulk transfers. For high throughput flows, it is best to use one cpu running the user thread issuing socket system calls, and a separate cpu to process incoming packets from BH context. (With TSO/GRO, bottleneck is usually the 'user' cpu) Problem is the user thread can spend a lot of time while holding the socket lock, forcing BH handler to queue most of incoming packets in the socket backlog. Whenever the user thread releases the socket lock, it must first process all accumulated packets in the backlog, potentially adding latency spikes. Due to flood mitigation, having too many packets in the backlog increases chance of unexpected drops. Backlog processing unfortunately shifts a fair amount of cpu cycles from the BH cpu to the 'user' cpu, thus reducing max throughput. This patch takes advantage of the backlog processing, and the fact that ACK are mostly cumulative. The idea is to detect we are in the backlog processing and defer all eligible ACK into a single one, sent from tcp_release_cb(). This saves cpu cycles on both sides, and network resources. Performance of a single TCP flow on a 200Gbit NIC: - Throughput is increased by 20% (100Gbit -> 120Gbit). - Number of generated ACK per second shrinks from 240,000 to 40,000. - Number of backlog drops per second shrinks from 230 to 0. Benchmark context: - Regular netperf TCP_STREAM (no zerocopy) - Intel(R) Xeon(R) Platinum 8481C (Saphire Rapids) - MAX_SKB_FRAGS = 17 (~60KB per GRO packet) This feature is guarded by a new sysctl, and enabled by default: /proc/sys/net/ipv4/tcp_backlog_ack_defer Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Dave Taht <dave.taht@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-09-12x86/ibt: Suppress spurious ENDBRPeter Zijlstra1-0/+5
It was reported that under certain circumstances GCC emits ENDBR instructions for _THIS_IP_ usage. Specifically, when it appears at the start of a basic block -- but not elsewhere. Since _THIS_IP_ is never used for control flow, these ENDBR instructions are completely superfluous. Override the _THIS_IP_ definition for x86_64 to avoid this. Less ENDBR instructions is better. Fixes: 156ff4a544ae ("x86/ibt: Base IBT bits") Reported-by: David Kaplan <David.Kaplan@amd.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20230802110323.016197440@infradead.org
2023-09-12linux/export: fix reference to exported functions for parisc64Masahiro Yamada1-0/+2
John David Anglin reported parisc has been broken since commit ddb5cdbafaaa ("kbuild: generate KSYMTAB entries by modpost"). Like ia64, parisc64 uses a function descriptor. The function references must be prefixed with P%. Also, symbols prefixed $$ from the library have the symbol type STT_LOPROC instead of STT_FUNC. They should be handled as functions too. Fixes: ddb5cdbafaaa ("kbuild: generate KSYMTAB entries by modpost") Reported-by: John David Anglin <dave.anglin@bell.net> Tested-by: John David Anglin <dave.anglin@bell.net> Tested-by: Helge Deller <deller@gmx.de> Closes: https://lore.kernel.org/linux-parisc/1901598a-e11d-f7dd-a5d9-9a69d06e6b6e@bell.net/T/#u Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Helge Deller <deller@gmx.de>
2023-09-12KVM: arm64: nvhe: Ignore SVE hint in SMCCC function IDJean-Philippe Brucker1-0/+2
When SVE is enabled, the host may set bit 16 in SMCCC function IDs, a hint that indicates an unused SVE state. At the moment NVHE doesn't account for this bit when inspecting the function ID, and rejects most calls. Clear the hint bit before comparing function IDs. About version compatibility: the host's PSCI driver initially probes the firmware for a SMCCC version number. If the firmware implements a protocol recent enough (1.3), subsequent SMCCC calls have the hint bit set. Since the hint bit was reserved in earlier versions of the protocol, clearing it is fine regardless of the version in use. When a new hint is added to the protocol in the future, it will be added to ARM_SMCCC_CALL_HINTS and NVHE will handle it straight away. This patch only clears known hints and leaves reserved bits as is, because future SMCCC versions could use reserved bits as modifiers for the function ID, rather than hints. Fixes: cfa7ff959a78 ("arm64: smccc: Support SMCCC v1.3 SVE register saving hint") Reported-by: Ben Horgan <ben.horgan@arm.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230911145254.934414-4-jean-philippe@linaro.org
2023-09-12iio: Remove unused declarationsYue Haibing2-6/+0
Commit 0f3a8c3f34f7 ("iio: Add support for creating IIO devices via configfs") declared but never implemented iio_sw_device_type_configfs_{un}register(). Commit b662f809d410 ("iio: core: Introduce IIO software triggers") declared but never implemented iio_sw_trigger_type_configfs_{un}register(). Commit a3e0b51884ee ("iio: accel: add support for FXLS8962AF/FXLS8964AF accelerometers") declared but never implemented fxls8962af_core_remove(). Commit 8dedcc3eee3a ("iio: core: centralize ioctl() calls to the main chardev") declared but never implemented iio_device_ioctl(). Commit d430f3c36ca6 ("iio: imu: inv_mpu6050: Use regmap instead of i2c specific functions") removed inv_mpu6050_write_reg() but not its declaration. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Link: https://lore.kernel.org/r/20230811095701.35372-1-yuehaibing@huawei.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2023-09-12platform/x86: asus-wmi: add support for ASUS screenpadLuke D. Jones1-0/+4
Add support for the WMI methods used to turn off and adjust the brightness of the secondary "screenpad" device found on some high-end ASUS laptops like the GX650P series and others. There are some small quirks with this device when considering only the raw WMI methods: 1. The Off method can only switch the device off 2. Changing the brightness turns the device back on 3. To turn the device back on the brightness must be > 1 4. When the device is off the brightness can't be changed (so it is stored by the driver if device is off). 5. Booting with a value of 0 brightness (retained by bios) means the bios will set a value of >0 <15 6. When the device is off it is "unplugged" asus_wmi sets the minimum brightness as 20 in general use, and 60 for booting with values <= min. The ACPI methods are used in a new backlight device named asus_screenpad. Signed-off-by: Luke D. Jones <luke@ljones.dev> Link: https://lore.kernel.org/r/20230830032237.42987-2-luke@ljones.dev Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>