path: root/include/linux
Age | Commit message | Author | Files | Lines
2023-06-09 | splice, net: Use sendmsg(MSG_SPLICE_PAGES) rather than ->sendpage() | David Howells | 2 | -2/+2
Replace generic_splice_sendpage() + splice_from_pipe + pipe_to_sendpage() with a net-specific handler, splice_to_socket(), that calls sendmsg() with MSG_SPLICE_PAGES set instead of calling ->sendpage(). MSG_MORE is used to indicate if the sendmsg() is expected to be followed with more data. This allows multiple pipe-buffer pages to be passed in a single call in a BVEC iterator, allowing the processing to be pushed down to a loop in the protocol driver. This helps pave the way for passing multipage folios down too. Protocols that haven't been converted to handle MSG_SPLICE_PAGES yet should just ignore it and do a normal sendmsg() for now - although that may be a bit slower as it may copy everything. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-09 | net: Block MSG_SENDPAGE_* from being passed to sendmsg() by userspace | David Howells | 1 | -1/+3
It is necessary to allow MSG_SENDPAGE_* to be passed into ->sendmsg() to allow sendmsg(MSG_SPLICE_PAGES) to replace ->sendpage(). Unblocking them in the network protocol, however, allows these flags to be passed in by userspace too[1]. Fix this by marking MSG_SENDPAGE_NOPOLICY, MSG_SENDPAGE_NOTLAST and MSG_SENDPAGE_DECRYPTED as internal flags, which causes sendmsg() to object if they are passed to sendmsg() by userspace. Network protocol ->sendmsg() implementations can then allow them through. Note that it should be possible to remove MSG_SENDPAGE_NOTLAST once sendpage is removed as a whole slew of pages will be passed in in one go by splice through sendmsg, with MSG_MORE being set if it has more data waiting in the pipe. Signed-off-by: David Howells <dhowells@redhat.com> cc: Chuck Lever <chuck.lever@oracle.com> cc: Boris Pismenny <borisp@nvidia.com> cc: John Fastabend <john.fastabend@gmail.com> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Link: https://lore.kernel.org/r/20230526181338.03a99016@kernel.org/ [1] Signed-off-by: Jakub Kicinski <kuba@kernel.org>
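The idea of "internal" flags described above can be sketched in plain userspace C. This is only an illustration: every `EX_`/`ex_` name and every flag value below is hypothetical, and the sketch assumes the syscall path rejects internal flags with `-EINVAL`; the commit message only says that sendmsg() "objects", so the kernel's exact behaviour may differ.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical flag values, chosen only for this sketch. */
#define EX_MSG_MORE          0x0001u
#define EX_MSG_INTERNAL_A    0x0100u
#define EX_MSG_INTERNAL_B    0x0200u
#define EX_MSG_INTERNAL_C    0x0400u
#define EX_MSG_INTERNAL_MASK \
	(EX_MSG_INTERNAL_A | EX_MSG_INTERNAL_B | EX_MSG_INTERNAL_C)

/* Stand-in for a protocol's ->sendmsg(): it lets internal flags through. */
static int ex_proto_sendmsg(unsigned int flags, unsigned int len)
{
	(void)flags;
	return (int)len;	/* pretend everything was sent */
}

/* Userspace-facing entry point: internal flags are refused (assumed -EINVAL). */
static int ex_user_sendmsg(unsigned int flags, unsigned int len)
{
	if (flags & EX_MSG_INTERNAL_MASK)
		return -EINVAL;
	return ex_proto_sendmsg(flags, len);
}

/* In-kernel caller (splice, for example) may set internal flags. */
static int ex_kernel_sendmsg(unsigned int flags, unsigned int len)
{
	/* Only flags known to this sketch are expected here. */
	assert((flags & ~(EX_MSG_MORE | EX_MSG_INTERNAL_MASK)) == 0);
	return ex_proto_sendmsg(flags, len);
}
```

The point of the split is that the filter sits at the boundary with userspace, so protocol code can accept the flags without having to tell the two kinds of caller apart.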
2023-06-09 | Merge tag 'mlx5-updates-2023-06-06' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux | Jakub Kicinski | 2 | -3/+17
Saeed Mahameed says:

====================
mlx5-updates-2023-06-06

1) Support 4 ports VF LAG, part 2/2
2) Few extra trivial cleanup patches

Shay Drory Says:
================
Support 4 ports VF LAG, part 2/2

This series continues the series[1] "Support 4 ports VF LAG, part1/2". This series adds support for 4 ports VF LAG (single FDB E-Switch). This series of patches refactors LAG code that makes assumptions about VF LAG supporting only two ports, and then enables 4 ports VF LAG.

Patch 1:
- Fix for ib rep code
Patches 2-5:
- Refactors LAG layer.
Patches 6-7:
- Block LAG types which don't support 4 ports.
Patch 8:
- Enable 4 ports VF LAG.

This series specifically allows HCAs with 4 ports to create a VF LAG with only 4 ports. It is not possible to create a VF LAG with 2 or 3 ports using HCAs that have 4 ports.

Currently, the Merged E-Switch feature only supports HCAs with 2 ports. However, upcoming patches will introduce support for HCAs with 4 ports.

In order to activate VF LAG a user can execute:

    devlink dev eswitch set pci/0000:08:00.0 mode switchdev
    devlink dev eswitch set pci/0000:08:00.1 mode switchdev
    devlink dev eswitch set pci/0000:08:00.2 mode switchdev
    devlink dev eswitch set pci/0000:08:00.3 mode switchdev
    ip link add name bond0 type bond
    ip link set dev bond0 type bond mode 802.3ad
    ip link set dev eth2 master bond0
    ip link set dev eth3 master bond0
    ip link set dev eth4 master bond0
    ip link set dev eth5 master bond0

Where eth2, eth3, eth4 and eth5 are net-interfaces of pci/0000:08:00.0 pci/0000:08:00.1 pci/0000:08:00.2 pci/0000:08:00.3 respectively.

User can verify LAG state and type via debugfs:
    /sys/kernel/debug/mlx5/0000\:08\:00.0/lag/state
    /sys/kernel/debug/mlx5/0000\:08\:00.0/lag/type

[1] https://lore.kernel.org/netdev/20230601060118.154015-1-saeed@kernel.org/T/#mf1d2083780970ba277bfe721554d4925f03f36d1
================

* tag 'mlx5-updates-2023-06-06' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5e: simplify condition after napi budget handling change
  mlx5/core: E-Switch, Allocate ECPF vport if it's an eswitch manager
  net/mlx5: Skip inline mode check after mlx5_eswitch_enable_locked() failure
  net/mlx5e: TC, refactor access to hash key
  net/mlx5e: Remove RX page cache leftovers
  net/mlx5e: Expose catastrophic steering error counters
  net/mlx5: Enable 4 ports VF LAG
  net/mlx5: LAG, block multiport eswitch LAG in case ldev have more than 2 ports
  net/mlx5: LAG, block multipath LAG in case ldev have more than 2 ports
  net/mlx5: LAG, change mlx5_shared_fdb_supported() to static
  net/mlx5: LAG, generalize handling of shared FDB
  net/mlx5: LAG, check if all eswitches are paired for shared FDB
  {net/RDMA}/mlx5: introduce lag_for_each_peer
  RDMA/mlx5: Free second uplink ib port
====================

Link: https://lore.kernel.org/r/20230607210410.88209-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-09 | net: pcs: lynx: make lynx_pcs_create() static | Russell King (Oracle) | 1 | -1/+0
We no longer need to export lynx_pcs_create() for drivers to use as we now have all the functionality we need in the two new creation helpers. Remove the export and prototype, and make it static. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-09 | net: pcs: lynx: add lynx_pcs_create_fwnode() | Russell King (Oracle) | 1 | -0/+1
Add a helper to create a lynx PCS from a fwnode handle. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-09 | net: pcs: lynx: remove lynx_get_mdio_device() | Russell King (Oracle) | 1 | -2/+0
lynx_get_mdio_device() is no longer necessary, let's remove it so the lynx PCS code is always managing the lifetime of the mdiodev. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-09 | clk: Introduce clk_hw_determine_rate_no_reparent() | Stephen Boyd | 1 | -0/+2
Some clock drivers do not want to allow any reparenting on a given clock, but usually do so by not providing any determine_rate implementation. Whenever we call clk_round_rate() or clk_set_rate(), this leads to clk_core_can_round() returning false and thus the rest of the function either forwarding the rate request to its current parent if CLK_SET_RATE_PARENT is set, or just returning the current clock rate. This behaviour happens implicitly, and as we move forward to making a determine_rate implementation required for muxes, we need some way to explicitly opt-in for that behaviour. Fortunately, this is exactly what the clk_core_determine_rate_no_reparent() function is doing, so we can simply make it available to drivers. Cc: Abel Vesa <abelvesa@kernel.org> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: "Andreas Färber" <afaerber@suse.de> Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Charles Keepax <ckeepax@opensource.cirrus.com> Cc: Chen-Yu Tsai <wens@csie.org> Cc: Chen-Yu Tsai <wenst@chromium.org> Cc: Chunyan Zhang <zhang.lyra@gmail.com> Cc: Claudiu Beznea <claudiu.beznea@microchip.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: David Airlie <airlied@gmail.com> Cc: David Lechner <david@lechnology.com> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Fabio Estevam <festevam@gmail.com> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Jaroslav Kysela <perex@perex.cz> Cc: Jernej Skrabec <jernej.skrabec@gmail.com> Cc: Jonathan Hunter <jonathanh@nvidia.com> Cc: Kishon Vijay Abraham I <kishon@kernel.org> Cc: Liam Girdwood <lgirdwood@gmail.com> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Luca Ceresoli <luca.ceresoli@bootlin.com> Cc: Manivannan Sadhasivam <mani@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Markus Schneider-Pargmann <msp@baylibre.com> Cc: Max Filippov 
<jcmvbkbc@gmail.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Mikko Perttunen <mperttunen@nvidia.com> Cc: Miles Chen <miles.chen@mediatek.com> Cc: Nicolas Ferre <nicolas.ferre@microchip.com> Cc: Orson Zhai <orsonzhai@gmail.com> Cc: Paul Cercueil <paul@crapouillou.net> Cc: Peng Fan <peng.fan@nxp.com> Cc: Peter De Schrijver <pdeschrijver@nvidia.com> Cc: Prashant Gaikwad <pgaikwad@nvidia.com> Cc: Richard Fitzgerald <rf@opensource.cirrus.com> Cc: Samuel Holland <samuel@sholland.org> Cc: Sascha Hauer <s.hauer@pengutronix.de> Cc: Sekhar Nori <nsekhar@ti.com> Cc: Shawn Guo <shawnguo@kernel.org> Cc: Takashi Iwai <tiwai@suse.com> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Ulf Hansson <ulf.hansson@linaro.org> Cc: Vinod Koul <vkoul@kernel.org> Cc: dri-devel@lists.freedesktop.org Cc: linux-actions@lists.infradead.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-mips@vger.kernel.org Cc: linux-phy@lists.infradead.org Cc: linux-renesas-soc@vger.kernel.org Cc: linux-rtc@vger.kernel.org Cc: linux-stm32@st-md-mailman.stormreply.com Cc: linux-sunxi@lists.linux.dev Cc: linux-tegra@vger.kernel.org Cc: NXP Linux Team <linux-imx@nxp.com> Cc: patches@opensource.cirrus.com Cc: Pengutronix Kernel Team <kernel@pengutronix.de> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://lore.kernel.org/r/20221018-clk-range-checks-fixes-v4-4-971d5077e7d2@cerno.tech | Reported-by: kernel test robot <lkp@intel.com>:
2023-06-09 | sysctl: move security keys sysctl registration to its own file | Luis Chamberlain | 1 | -3/+0
The security keys sysctls are already declared in their own file; just move the sysctl registration to its own file to help avoid merge conflicts on sysctl.c, and help with clearing up sysctl.c further.

This creates a small penalty of 23 bytes:

./scripts/bloat-o-meter vmlinux.1 vmlinux.2
add/remove: 2/0 grow/shrink: 0/1 up/down: 49/-26 (23)
Function                                     old     new   delta
init_security_keys_sysctls                     -      33     +33
__pfx_init_security_keys_sysctls               -      16     +16
sysctl_init_bases                             85      59     -26
Total: Before=21256937, After=21256960, chg +0.00%

But soon we'll be saving tons of bytes anyway, as we modify the sysctl registrations to use ARRAY_SIZE and so we get rid of all the empty array elements so let's just clean this up now.

Reviewed-by: Paul Moore <paul@paul-moore.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-06-09 | sysctl: move umh sysctl registration to its own file | Luis Chamberlain | 1 | -2/+0
Move the umh sysctl registration to its own file, the array is already there. We do this to remove the clutter out of kernel/sysctl.c to avoid merge conflicts. This also lets the sysctls not be built at all now when CONFIG_SYSCTL is not enabled.

This has a small penalty of 23 bytes but soon we'll be removing all the empty entries on sysctl arrays so just do this cleanup now:

./scripts/bloat-o-meter vmlinux.base vmlinux.1
add/remove: 2/0 grow/shrink: 0/1 up/down: 49/-26 (23)
Function                                     old     new   delta
init_umh_sysctls                               -      33     +33
__pfx_init_umh_sysctls                         -      16     +16
sysctl_init_bases                            111      85     -26
Total: Before=21256914, After=21256937, chg +0.00%

Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-06-08 | kallsyms: make kallsyms_show_value() as generic function | Maninder Singh | 1 | -8/+3
This change makes kallsyms_show_value() a generic function without a dependency on CONFIG_KALLSYMS. The module address will now be displayed by lsmod and in /proc/modules.

Earlier:
=======
/ # insmod test.ko
/ # lsmod
test 12288 0 - Live 0x0000000000000000 (O) // No Module Load address
/ #

With change:
==========
/ # insmod test.ko
/ # lsmod
test 12288 0 - Live 0xffff800000fc0000 (O) // Module address
/ # cat /proc/modules
test 12288 0 - Live 0xffff800000fc0000 (O)

Co-developed-by: Onkarnath <onkarnath.1@samsung.com>
Signed-off-by: Onkarnath <onkarnath.1@samsung.com>
Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-06-08 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net | Jakub Kicinski | 9 | -25/+43
Cross-merge networking fixes after downstream PR. Conflicts: net/sched/sch_taprio.c d636fc5dd692 ("net: sched: add rcu annotations around qdisc->qdisc_sleeping") dced11ef84fb ("net/sched: taprio: don't overwrite "sch" variable in taprio_dump_class_stats()") net/ipv4/sysctl_net_ipv4.c e209fee4118f ("net/ipv4: ping_group_range: allow GID from 2147483648 to 4294967294") ccce324dabfe ("tcp: make the first N SYN RTO backoffs linear") https://lore.kernel.org/all/20230605100816.08d41a7b@canb.auug.org.au/ No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 | Merge tag 'net-6.4-rc6' of ↵ | Linus Torvalds | 1 | -3/+6
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from can, wifi, netfilter, bluetooth and ebpf. Current release - regressions: - bpf: sockmap: avoid potential NULL dereference in sk_psock_verdict_data_ready() - wifi: iwlwifi: fix -Warray-bounds bug in iwl_mvm_wait_d3_notif() - phylink: actually fix ksettings_set() ethtool call - eth: dwmac-qcom-ethqos: fix a regression on EMAC < 3 Current release - new code bugs: - wifi: mt76: fix possible NULL pointer dereference in mt7996_mac_write_txwi() Previous releases - regressions: - netfilter: fix NULL pointer dereference in nf_confirm_cthelper - wifi: rtw88/rtw89: correct PS calculation for SUPPORTS_DYNAMIC_PS - openvswitch: fix upcall counter access before allocation - bluetooth: - fix use-after-free in hci_remove_ltk/hci_remove_irk - fix l2cap_disconnect_req deadlock - nic: bnxt_en: prevent kernel panic when receiving unexpected PHC_UPDATE event Previous releases - always broken: - core: annotate rfs lockless accesses - sched: fq_pie: ensure reasonable TCA_FQ_PIE_QUANTUM values - netfilter: add null check for nla_nest_start_noflag() in nft_dump_basechain_hook() - bpf: fix UAF in task local storage - ipv4: ping_group_range: allow GID from 2147483648 to 4294967294 - ipv6: rpl: fix route of death. 
- tcp: gso: really support BIG TCP - mptcp: fixes for user-space PM address advertisement - smc: avoid to access invalid RMBs' MRs in SMCRv1 ADD LINK CONT - can: avoid possible use-after-free when j1939_can_rx_register fails - batman-adv: fix UaF while rescheduling delayed work - eth: qede: fix scheduling while atomic - eth: ice: make writes to /dev/gnssX synchronous" * tag 'net-6.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (83 commits) bnxt_en: Implement .set_port / .unset_port UDP tunnel callbacks bnxt_en: Prevent kernel panic when receiving unexpected PHC_UPDATE event bnxt_en: Skip firmware fatal error recovery if chip is not accessible bnxt_en: Query default VLAN before VNIC setup on a VF bnxt_en: Don't issue AP reset during ethtool's reset operation bnxt_en: Fix bnxt_hwrm_update_rss_hash_cfg() net: bcmgenet: Fix EEE implementation eth: ixgbe: fix the wake condition eth: bnxt: fix the wake condition lib: cpu_rmap: Fix potential use-after-free in irq_cpu_rmap_release() bpf: Add extra path pointer check to d_path helper net: sched: fix possible refcount leak in tc_chain_tmplt_add() net: sched: act_police: fix sparse errors in tcf_police_dump() net: openvswitch: fix upcall counter access before allocation net: sched: move rtm_tca_policy declaration to include file ice: make writes to /dev/gnssX synchronous net: sched: add rcu annotations around qdisc->qdisc_sleeping rfs: annotate lockless accesses to RFS sock flow table rfs: annotate lockless accesses to sk->sk_rxhash virtio_net: use control_buf for coalesce params ...
2023-06-08 | hwmon: (core) Add missing beep-related standard attributes | James Seo | 1 | -0/+10
beep_enable, inX_beep, currX_beep, fanX_beep, and tempX_beep are standard attributes mentioned in the sysfs interface specification but not implemented in the hwmon core. Since these are not deprecated, implement them. Adding beep_mask is not necessary, as it is deprecated and the drivers already using it are manually defining it. Signed-off-by: James Seo <james@equiv.tech> Link: https://lore.kernel.org/r/20230507152216.1862653-1-james@equiv.tech Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2023-06-08 | Move netfs_extract_iter_to_sg() to lib/scatterlist.c | David Howells | 2 | -4/+5
Move netfs_extract_iter_to_sg() to lib/scatterlist.c as it's going to be used by more than just network filesystems (AF_ALG, for example). Signed-off-by: David Howells <dhowells@redhat.com> cc: Jeff Layton <jlayton@kernel.org> cc: Steve French <sfrench@samba.org> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: Rohith Surabattula <rohiths.msft@gmail.com> cc: Jens Axboe <axboe@kernel.dk> cc: Herbert Xu <herbert@gondor.apana.org.au> cc: "David S. Miller" <davem@davemloft.net> cc: Eric Dumazet <edumazet@google.com> cc: Jakub Kicinski <kuba@kernel.org> cc: Paolo Abeni <pabeni@redhat.com> cc: Matthew Wilcox <willy@infradead.org> cc: linux-crypto@vger.kernel.org cc: linux-cachefs@redhat.com cc: linux-cifs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org cc: netdev@vger.kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-08 | Drop the netfs_ prefix from netfs_extract_iter_to_sg() | David Howells | 1 | -3/+3
Rename netfs_extract_iter_to_sg() and its auxiliary functions to drop the netfs_ prefix. Signed-off-by: David Howells <dhowells@redhat.com> cc: Jeff Layton <jlayton@kernel.org> cc: Steve French <sfrench@samba.org> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: Rohith Surabattula <rohiths.msft@gmail.com> cc: Jens Axboe <axboe@kernel.dk> cc: Herbert Xu <herbert@gondor.apana.org.au> cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> cc: "David S. Miller" <davem@davemloft.net> cc: Eric Dumazet <edumazet@google.com> cc: Jakub Kicinski <kuba@kernel.org> cc: Paolo Abeni <pabeni@redhat.com> cc: linux-crypto@vger.kernel.org cc: linux-cachefs@redhat.com cc: linux-cifs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org cc: netdev@vger.kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-08 | net: pcs: Add 10GBASE-R mode for Synopsys Designware XPCS | Jiawen Wu | 1 | -0/+1
Add basic support for XPCS using 10GBASE-R interface. This mode will be extended to use interrupt, so set pcs.poll false. And avoid soft reset so that the device using this mode is in the default configuration. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-08 | mm/slab: break up RCU readers on SLAB_TYPESAFE_BY_RCU example code | SeongJae Park | 1 | -3/+5
The SLAB_TYPESAFE_BY_RCU example code snippet uses a single RCU read-side critical section for retries. 'Documentation/RCU/rculist_nulls.rst' has a similar example code snippet, and commit da82af04352b ("doc: Update and wordsmith rculist_nulls.rst") broke it up. Apply the same change to the SLAB_TYPESAFE_BY_RCU example code snippet, too. Signed-off-by: SeongJae Park <sj@kernel.org> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2023-06-08 | mm/slab: add a missing semicolon on SLAB_TYPESAFE_BY_RCU example code | SeongJae Park | 1 | -1/+1
An example code snippet for SLAB_TYPESAFE_BY_RCU is missing a semicolon. Add it. Signed-off-by: SeongJae Park <sj@kernel.org> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2023-06-08 | ACPI: bus: Consolidate all arm specific initialisation into acpi_arm_init() | Sudeep Holla | 4 | -34/+6
Move all of the ARM-specific initialization into one function namely acpi_arm_init(), so it is not necessary to modify/update bus.c every time a new piece of it is added. Cc: Lorenzo Pieralisi <lpieralisi@kernel.org> Cc: Rafael J. Wysocki <rafael@kernel.org> Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Link: https://lore.kernel.org/r/CAJZ5v0iBZRZmV_oU+VurqxnVMbFN_ttqrL=cLh0sUH+=u0PYsw@mail.gmail.com Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Reviewed-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Reviewed-by: Shaoqin Huang <shahuang@redhat.com> Link: https://lore.kernel.org/r/20230606093531.2746732-1-sudeep.holla@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-08 | net/mlx5e: Expose catastrophic steering error counters | Lama Kayal | 1 | -2/+10
Add generated_pkt_steering_fail and handled_pkt_steering_fail to the devlink health reporter. generated_pkt_steering_fail indicates the number of packets dropped due to an illegal steering operation within the vport steering domain. handled_pkt_steering_fail indicates the number of packets dropped due to an illegal steering operation originated by the vport. Also, update the devlink reporter functionality documentation with the newly exposed counters. Signed-off-by: Lama Kayal <lkayal@nvidia.com> Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-06-08 | {net/RDMA}/mlx5: introduce lag_for_each_peer | Shay Drory | 1 | -1/+7
Introduce a generic API to iterate over all the devices which are part of the LAG. This API replaces mlx5_lag_get_peer_mdev(), which retrieves only a single peer device from the LAG. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-06-07 | Merge branches 'doc.2023.05.10a', 'fixes.2023.05.11a', 'kvfree.2023.05.10a', 'nocb.2023.05.11a', 'rcu-tasks.2023.05.10a', 'torture.2023.05.15a' and 'rcu-urgent.2023.06.06a' into HEAD | Paul E. McKenney | 3 | -50/+22
doc.2023.05.10a: Documentation updates
fixes.2023.05.11a: Miscellaneous fixes
kvfree.2023.05.10a: kvfree_rcu updates
nocb.2023.05.11a: Callback-offloading updates
rcu-tasks.2023.05.10a: Tasks RCU updates
torture.2023.05.15a: Torture-test updates
rcu-urgent.2023.06.06a: Urgent SRCU fix
2023-06-07 | notifier: Initialize new struct srcu_usage field | Chen-Yu Tsai | 1 | -0/+10
In commit 95433f726301 ("srcu: Begin offloading srcu_struct fields to srcu_update"), a new struct srcu_usage field was added, but was not properly initialized. This led to a "spinlock bad magic" BUG whenever the SRCU notifier was used. This was observed in the MediaTek CCI devfreq driver on next-20230525. The trimmed stack trace is as follows:

BUG: spinlock bad magic on CPU#4, swapper/0/1
 lock: 0xffffff80ff529ac0, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
Call trace:
 spin_bug+0xa4/0xe8
 do_raw_spin_lock+0xec/0x120
 _raw_spin_lock_irqsave+0x78/0xb8
 synchronize_srcu+0x3c/0x168
 srcu_notifier_chain_unregister+0x5c/0xa0
 cpufreq_unregister_notifier+0x94/0xe0
 devfreq_passive_event_handler+0x7c/0x3e0
 devfreq_remove_device+0x48/0xe8

Add __SRCU_USAGE_INIT() to SRCU_NOTIFIER_INIT() so that srcu_usage gets initialized properly.

Reported-by: Jon Hunter <jonathanh@nvidia.com>
Fixes: 95433f726301 ("srcu: Begin offloading srcu_struct fields to srcu_update")
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Cc: Sachin Sant <sachinp@linux.ibm.com>
Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Zqiang <qiang.zhang1211@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-06-07 | pktcdvd: Get rid of custom printing macros | Andy Shevchenko | 1 | -1/+0
We may use traditional dev_*() macros instead of custom ones provided by the driver. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20230310164549.22133-2-andriy.shevchenko@linux.intel.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-07 | TI TPS6594 PMIC support (RTC, pinctrl, regulators) | Mark Brown | 1 | -0/+1020
Merge series from Esteban Blanc <eblanc@baylibre.com>: TPS6594 is a Power Management IC which provides regulators and other features like GPIOs, RTC, watchdog, ESMs (Error Signal Monitor), and PFSM (Pre-configurable Finite State Machine). The SoC and the PMIC can communicate through the I2C or SPI interfaces. TPS6594 is the super-set device while TPS6593 and LP8764 are derivatives. This series adds support for the TI TPS6594 PMIC and its derivatives.
2023-06-07 | spi-geni-qcom: Add new interfaces and utilise them | Mark Brown | 24 | -57/+112
Merge series from Vijaya Krishna Nivarthi <quic_vnivarth@quicinc.com>: A "known issue" during implementation of SE DMA for spi geni driver was that it does DMA map/unmap internally instead of in spi framework. Current patches remove this hiccup and also clean up code a bit. Testing revealed no regressions and results with 1000 iterations of reading from EC showed no loss of performance. Results ======= Before - Iteration 999, min=5.10, max=5.17, avg=5.14, ints=25129 After - Iteration 999, min=5.10, max=5.20, avg=5.15, ints=25153
2023-06-07 | ALSA: hda: Add Loongson LS7A HD-Audio support | Yanteng Si | 1 | -0/+3
Add the new PCI IDs 0x0014 0x7a07 and 0x0014 0x7a37 for the Loongson HDA controller. Signed-off-by: Yanteng Si <siyanteng@loongson.cn> Acked-by: Huacai Chen <chenhuacai@loongson.cn> Link: https://lore.kernel.org/r/993587483b9509796b29a416f257fcfb4b15c6ea.1686128807.git.siyanteng@loongson.cn Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-06-07 | net: sched: add rcu annotations around qdisc->qdisc_sleeping | Eric Dumazet | 1 | -1/+1
syzbot reported a race around qdisc->qdisc_sleeping [1] It is time we add proper annotations to reads and writes to/from qdisc->qdisc_sleeping. [1] BUG: KCSAN: data-race in dev_graft_qdisc / qdisc_lookup_rcu read to 0xffff8881286fc618 of 8 bytes by task 6928 on cpu 1: qdisc_lookup_rcu+0x192/0x2c0 net/sched/sch_api.c:331 __tcf_qdisc_find+0x74/0x3c0 net/sched/cls_api.c:1174 tc_get_tfilter+0x18f/0x990 net/sched/cls_api.c:2547 rtnetlink_rcv_msg+0x7af/0x8c0 net/core/rtnetlink.c:6386 netlink_rcv_skb+0x126/0x220 net/netlink/af_netlink.c:2546 rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:6413 netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline] netlink_unicast+0x56f/0x640 net/netlink/af_netlink.c:1365 netlink_sendmsg+0x665/0x770 net/netlink/af_netlink.c:1913 sock_sendmsg_nosec net/socket.c:724 [inline] sock_sendmsg net/socket.c:747 [inline] ____sys_sendmsg+0x375/0x4c0 net/socket.c:2503 ___sys_sendmsg net/socket.c:2557 [inline] __sys_sendmsg+0x1e3/0x270 net/socket.c:2586 __do_sys_sendmsg net/socket.c:2595 [inline] __se_sys_sendmsg net/socket.c:2593 [inline] __x64_sys_sendmsg+0x46/0x50 net/socket.c:2593 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd write to 0xffff8881286fc618 of 8 bytes by task 6912 on cpu 0: dev_graft_qdisc+0x4f/0x80 net/sched/sch_generic.c:1115 qdisc_graft+0x7d0/0xb60 net/sched/sch_api.c:1103 tc_modify_qdisc+0x712/0xf10 net/sched/sch_api.c:1693 rtnetlink_rcv_msg+0x807/0x8c0 net/core/rtnetlink.c:6395 netlink_rcv_skb+0x126/0x220 net/netlink/af_netlink.c:2546 rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:6413 netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline] netlink_unicast+0x56f/0x640 net/netlink/af_netlink.c:1365 netlink_sendmsg+0x665/0x770 net/netlink/af_netlink.c:1913 sock_sendmsg_nosec net/socket.c:724 [inline] sock_sendmsg net/socket.c:747 [inline] ____sys_sendmsg+0x375/0x4c0 net/socket.c:2503 ___sys_sendmsg net/socket.c:2557 [inline] 
__sys_sendmsg+0x1e3/0x270 net/socket.c:2586 __do_sys_sendmsg net/socket.c:2595 [inline] __se_sys_sendmsg net/socket.c:2593 [inline] __x64_sys_sendmsg+0x46/0x50 net/socket.c:2593 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 6912 Comm: syz-executor.5 Not tainted 6.4.0-rc3-syzkaller-00190-g0d85b27b0cc6 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/16/2023 Fixes: 3a7d0d07a386 ("net: sched: extend Qdisc with rcu") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Vlad Buslov <vladbu@nvidia.com> Acked-by: Jamal Hadi Salim<jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-07 | rfs: annotate lockless accesses to RFS sock flow table | Eric Dumazet | 1 | -2/+5
Add READ_ONCE()/WRITE_ONCE() on accesses to the sock flow table. This also prevents a (smart ?) compiler from removing the condition in: if (table->ents[index] != newval) table->ents[index] = newval; We need the condition to avoid dirtying a shared cache line. Fixes: fec5e652e58f ("rfs: Receive Flow Steering") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
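The conditional-store pattern from this commit message can be tried out in userspace with simplified stand-ins for the kernel macros. In the sketch below, the `READ_ONCE`/`WRITE_ONCE` definitions are reduced volatile-cast approximations (they rely on the GNU `__typeof__` extension), and `struct ex_flow_table` and `ex_record_flow()` are hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified userspace approximations of the kernel macros (scalars only). */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

#define EX_TABLE_SIZE 8u	/* hypothetical size, must be a power of two */

struct ex_flow_table {
	uint32_t mask;			/* EX_TABLE_SIZE - 1 */
	uint32_t ents[EX_TABLE_SIZE];
};

/*
 * Store newval only if it differs from the current entry, so an unchanged
 * entry does not dirty a cache line shared with other CPUs.
 * Returns 1 if a store happened, 0 if it was skipped.
 */
static int ex_record_flow(struct ex_flow_table *table, uint32_t hash,
			  uint32_t newval)
{
	uint32_t index = hash & table->mask;

	assert(index < EX_TABLE_SIZE);
	if (READ_ONCE(table->ents[index]) != newval) {
		WRITE_ONCE(table->ents[index], newval);
		return 1;
	}
	return 0;
}
```

Because both accesses go through volatile-qualified lvalues, the compiler has to perform the load and cannot turn the guarded store into an unconditional one.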
2023-06-06 | Input: gameport - provide default trigger() and read() | Dmitry Torokhov | 1 | -9/+2
Instead of constantly checking pointer(s) for non-NULL-ness, provide default implementations of trigger() and read() and instantiate them during port registration if driver-specific versions were not provided. Link: https://lore.kernel.org/r/ZGvoqP5PAAsJuky4@google.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
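A minimal userspace sketch of that pattern follows: missing callbacks are filled in once at registration, so call sites never test for NULL. The `ex_port` structure, its fields, and the behaviour of the default callbacks are all hypothetical and are not taken from the gameport code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical port object with optional driver callbacks. */
struct ex_port {
	void (*trigger)(struct ex_port *port);
	unsigned char (*read)(struct ex_port *port);
	int trigger_count;	/* bookkeeping for the sketch only */
};

/* Defaults used when the driver supplies nothing. */
static void ex_default_trigger(struct ex_port *port)
{
	port->trigger_count++;
}

static unsigned char ex_default_read(struct ex_port *port)
{
	(void)port;
	return 0xff;	/* placeholder value for the sketch */
}

/* Registration installs the defaults exactly once. */
static void ex_port_register(struct ex_port *port)
{
	assert(port != NULL);
	if (!port->trigger)
		port->trigger = ex_default_trigger;
	if (!port->read)
		port->read = ex_default_read;
}

/* Call sites can now invoke the callbacks unconditionally. */
static unsigned char ex_port_read(struct ex_port *port)
{
	return port->read(port);
}
```

The trade-off is one indirect call even on the default path, in exchange for removing a branch and a possible NULL dereference from every caller.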
2023-06-06 | soc: qcom: geni-se: Add interfaces geni_se_tx_init_dma() and geni_se_rx_init_dma() | Vijaya Krishna Nivarthi | 1 | -0/+4
The geni_se_xx_dma_prep() interfaces necessarily do DMA mapping before initiating DMA transfers. This is not suitable for SPI, where the framework is expected to handle map/unmap. Expose new interfaces geni_se_xx_init_dma() which only do the DMA transfer. Signed-off-by: Vijaya Krishna Nivarthi <quic_vnivarth@quicinc.com> Reviewed-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Link: https://lore.kernel.org/r/1684325894-30252-2-git-send-email-quic_vnivarth@quicinc.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-06-06 | x86/efi: Safely enable unaccepted memory in UEFI | Dionna Glaze | 1 | -0/+3
The UEFI v2.9 specification includes a new memory type to be used in environments where the OS must accept memory that is provided from its host. Before the introduction of this memory type, all memory was accepted eagerly in the firmware. In order for the firmware to safely stop accepting memory on the OS's behalf, the OS must affirmatively indicate support to the firmware. This is only a problem for AMD SEV-SNP, since Linux has had support for it since 5.19. The other technology that can make use of unaccepted memory, Intel TDX, does not yet have Linux support, so it can strictly require unaccepted memory support as a dependency of CONFIG_TDX and not require communication with the firmware. Enabling unaccepted memory requires calling a 0-argument enablement protocol before ExitBootServices. This call is only made if the kernel is compiled with UNACCEPTED_MEMORY=y This protocol will be removed after the end of life of the first LTS that includes it, in order to give firmware implementations an expiration date for it. When the protocol is removed, firmware will strictly infer that a SEV-SNP VM is running an OS that supports the unaccepted memory type. At the earliest convenience, when unaccepted memory support is added to Linux, SEV-SNP may take strict dependence in it. After the firmware removes support for the protocol, this should be reverted. [tl: address some checkscript warnings] Signed-off-by: Dionna Glaze <dionnaglaze@google.com> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/0d5f3d9a20b5cf361945b7ab1263c36586a78a42.1686063086.git.thomas.lendacky@amd.com
2023-06-06efi: Add unaccepted memory supportKirill A. Shutemov1-0/+1
efi_config_parse_tables() reserves the memory that holds the unaccepted memory configuration table so it won't be reused by the page allocator. Core-mm requires a few helpers to support unaccepted memory: - accept_memory() checks the range of addresses against the bitmap and accepts memory if needed. - range_contains_unaccepted_memory() checks if anything within the range requires acceptance. Architectural code has to provide efi_get_unaccepted_table(), which returns a pointer to the unaccepted memory configuration table. arch_accept_memory() handles the arch-specific part of memory acceptance. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lore.kernel.org/r/20230606142637.5171-6-kirill.shutemov@linux.intel.com
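The two helpers named in the message above can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions: every `toy_` name is hypothetical, the bitmap is a single 64-bit word, a set bit is taken to mean "still unaccepted", and the arch hook is a stub that merely counts calls. It is not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_UNIT_SHIFT 21   /* 2 MiB units, the x86 unit size (assumption) */
#define TOY_NR_UNITS   64   /* toy bitmap: one 64-bit word, covers 128 MiB */

/* Hypothetical table: a set bit means the unit is still unaccepted. */
static uint64_t toy_unaccepted_bitmap;
/* Counts calls into the stubbed arch-specific acceptance step. */
static unsigned int toy_arch_accept_calls;

static void toy_arch_accept_unit(uint64_t unit)
{
	(void)unit;             /* a real arch hook would talk to the VMM here */
	toy_arch_accept_calls++;
}

/* True if any unit overlapping [start, end) is still unaccepted. */
static bool toy_range_contains_unaccepted(uint64_t start, uint64_t end)
{
	assert(start < end && end <= ((uint64_t)TOY_NR_UNITS << TOY_UNIT_SHIFT));
	for (uint64_t u = start >> TOY_UNIT_SHIFT; u <= (end - 1) >> TOY_UNIT_SHIFT; u++)
		if (toy_unaccepted_bitmap & (UINT64_C(1) << u))
			return true;
	return false;
}

/* Accept every still-unaccepted unit overlapping [start, end). */
static void toy_accept_memory(uint64_t start, uint64_t end)
{
	assert(start < end && end <= ((uint64_t)TOY_NR_UNITS << TOY_UNIT_SHIFT));
	for (uint64_t u = start >> TOY_UNIT_SHIFT; u <= (end - 1) >> TOY_UNIT_SHIFT; u++) {
		uint64_t bit = UINT64_C(1) << u;

		if (toy_unaccepted_bitmap & bit) {
			toy_arch_accept_unit(u);
			toy_unaccepted_bitmap &= ~bit;  /* unit is now accepted */
		}
	}
}
```

In this model a request for a single 4 KiB page accepts the whole 2 MiB unit that contains it, and a second request for the same range does no further arch work.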
2023-06-06bpf: Cleanup unused function declarationRuiqi Gong1-1/+0
All usage and the definition of `bpf_prog_free_linfo()` has been removed in commit e16301fbe183 ("bpf: Simplify freeing logic in linfo and jited_linfo"). Clean up its declaration in the header file. Signed-off-by: Ruiqi Gong <gongruiqi@huaweicloud.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/all/20230602030842.279262-1-gongruiqi@huaweicloud.com/ Link: https://lore.kernel.org/bpf/20230606021047.170667-1-gongruiqi@huaweicloud.com
2023-06-06efi/libstub: Implement support for unaccepted memoryKirill A. Shutemov1-1/+11
UEFI Specification version 2.9 introduces the concept of memory acceptance: Some Virtual Machine platforms, such as Intel TDX or AMD SEV-SNP, require memory to be accepted before it can be used by the guest. Accepting happens via a protocol specific to the Virtual Machine platform. Accepting memory is costly and it makes the VMM allocate memory for the accepted guest physical address range. It's better to postpone memory acceptance until memory is needed. It lowers boot time and reduces memory overhead. The kernel needs to know what memory has been accepted. Firmware communicates this information via the memory map: a new memory type -- EFI_UNACCEPTED_MEMORY -- indicates such memory. Range-based tracking works fine for firmware, but it gets bulky for the kernel: e820 (or whatever the arch uses) has to be modified on every page acceptance. It leads to table fragmentation and there's a limited number of entries in the e820 table. Another option is to mark such memory as usable in e820 and track if the range has been accepted in a bitmap. One bit in the bitmap represents a naturally aligned power-of-2-sized region of address space -- a unit. For x86, the unit size is 2MiB: 4k of the bitmap is enough to track 64GiB of physical address space. In the worst-case scenario -- a huge hole in the middle of the address space -- it needs 256MiB to handle 4PiB of the address space. Any unaccepted memory that is not aligned to unit_size gets accepted upfront. The bitmap is allocated and constructed in the EFI stub and passed down to the kernel via the EFI configuration table. allocate_e820() allocates the bitmap if unaccepted memory is present, according to the size of the unaccepted region. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20230606142637.5171-4-kirill.shutemov@linux.intel.com
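The sizing figures quoted in the message above (4k of bitmap for 64GiB, 256MiB for 4PiB) follow directly from one bit per 2MiB unit. A minimal sketch of that arithmetic, with hypothetical helper names that do not appear in the kernel:

```c
#include <assert.h>
#include <stdint.h>

/* Unit size on x86 as stated in the commit message: one bit covers 2 MiB. */
#define UNIT_SIZE (UINT64_C(2) << 20)

/* Bytes of bitmap needed to cover `span` bytes of address space. */
static uint64_t bitmap_bytes_for(uint64_t span)
{
	assert(span <= UINT64_MAX - (UNIT_SIZE - 1));        /* avoid overflow below */
	uint64_t units = (span + UNIT_SIZE - 1) / UNIT_SIZE; /* round up to whole units */

	return (units + 7) / 8;                              /* 8 units per byte */
}

/* Bytes of address space that a bitmap of `bytes` bytes can track. */
static uint64_t span_covered_by(uint64_t bytes)
{
	assert(bytes <= UINT64_MAX / (8 * UNIT_SIZE));       /* avoid overflow below */
	return bytes * 8 * UNIT_SIZE;
}
```

With these definitions, `span_covered_by(4096)` is 64 GiB and `bitmap_bytes_for(4 PiB)` is 256 MiB, matching the numbers in the message.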
2023-06-06mm: Add support for unaccepted memoryKirill A. Shutemov2-0/+27
UEFI Specification version 2.9 introduces the concept of memory acceptance. Some Virtual Machine platforms, such as Intel TDX or AMD SEV-SNP, require memory to be accepted before it can be used by the guest. Accepting happens via a protocol specific to the Virtual Machine platform. There are several ways the kernel can deal with unaccepted memory: 1. Accept all the memory during boot. It is easy to implement and it doesn't have runtime cost once the system is booted. The downside is very long boot time. Accept can be parallelized to multiple CPUs to keep it manageable (i.e. via DEFERRED_STRUCT_PAGE_INIT), but it tends to saturate memory bandwidth and does not scale beyond a certain point. 2. Accept a block of memory on the first use. It requires more infrastructure and changes in the page allocator to make it work, but it provides good boot time. On-demand memory accept means latency spikes every time the kernel steps onto a new memory block. The spikes will go away once the workload data set size stabilizes or all memory gets accepted. 3. Accept all memory in the background. Introduce a thread (or multiple) that gets memory accepted proactively. It will minimize the time the system experiences latency spikes on memory allocation while keeping boot time low. This approach cannot function on its own. It is an extension of #2: background memory acceptance requires a functional scheduler, but the page allocator may need to tap into unaccepted memory before that. The downside of the approach is that these threads also steal CPU cycles and memory bandwidth from the user's workload and may hurt user experience. Implement #1 and #2 for now. #2 is the default. Some workloads may want to use #1 with accept_memory=eager on the kernel command line. #3 can be implemented later based on users' demands. Support of unaccepted memory requires a few changes in core-mm code: - memblock accepts memory on allocation. It serves early boot memory allocations and doesn't limit them to a pre-accepted pool of memory. 
- page allocator accepts memory on the first allocation of the page. When the kernel runs out of accepted memory, it accepts memory until the high watermark is reached. It helps to minimize fragmentation. EFI code will provide two helpers if the platform supports unaccepted memory: - accept_memory() makes a range of physical addresses accepted. - range_contains_unaccepted_memory() checks whether anything within the range of physical addresses requires acceptance. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Mike Rapoport <rppt@linux.ibm.com> # memblock Link: https://lore.kernel.org/r/20230606142637.5171-2-kirill.shutemov@linux.intel.com
2023-06-06regulator: Add X-Powers AXP15060/AXP313a PMICMark Brown1-0/+32
Merge series from Andre Przywara <andre.przywara@arm.com>: This patch series adds support for the X-Powers AXP15060 and AXP313a PMIC, which are general purpose PMICs as seen on different boards with different SOCs, mostly from Allwinner. This is mostly a repost of the previous patches, combining both the AXP313a and AXP15060 series, rebased on top of v6.4-rc3, and omitting the patches that already got merged. The first two patches are the successors of the AXP313a v10 post, the third patch is based on Shengyu's AXP15060 v3 post. There were no code changes, just some tiny context differences due to the rebase, plus I added the newly gained tags. As the DT bindings and the AXP15060 MFD part are already in the tree, this is just completing support with the MFD part for the AXP313a, and the regulator support for both PMICs.
2023-06-06firmware: arm_scmi: Add Powercap protocol enable supportCristian Marussi1-0/+18
SCMI powercap protocol v3.2 supports disabling the powercap on a zone by zone basis by providing a zero valued powercap. Expose new operations to enable/disable powercapping on a per-zone base. Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> Link: https://lore.kernel.org/r/20230531152039.2363181-3-cristian.marussi@arm.com Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2023-06-06Merge tag 'platform-drivers-x86-v6.4-4' of ↵Linus Torvalds1-5/+1
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Hans de Goede: - various Microsoft Surface support fixes - one fix for the INT3472 driver * tag 'platform-drivers-x86-v6.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: int3472: Avoid crash in unregistering regulator gpio platform/surface: aggregator_tabletsw: Add support for book mode in POS subsystem platform/surface: aggregator_tabletsw: Add support for book mode in KIP subsystem platform/surface: aggregator: Allow completion work-items to be executed in parallel platform/surface: aggregator: Make to_ssam_device_driver() respect constness
2023-06-06wifi: mac80211: provide a helper to fetch the medium synchronization delayEmmanuel Grumbach1-0/+35
There are drivers which need this information. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230604120651.b1043f3126e2.Iad3806f8bf8df07f52ef0a02cc3d0373c44a8c93@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-06wifi: mac80211: fetch and store the EML capability informationEmmanuel Grumbach1-0/+35
We need to teach the low level driver about the EML capability which includes information for EMLSR / EMLMR operation. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230504134511.828474-11-gregory.greenman@intel.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-06wifi: mac80211: skip EHT BSS membership selectorJohannes Berg1-1/+4
Skip the EHT BSS membership selector for getting rates. While at it, add the definitions for GLK and EPS, and sort the list. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230504134511.828474-9-gregory.greenman@intel.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-06watch_queue: prevent dangling pipe pointerSiddh Raman Pant1-2/+1
NULL the dangling pipe reference while clearing watch_queue. If not done, a reference to a freed pipe remains in the watch_queue, as this function is called before freeing a pipe in free_pipe_info() (see line 834 of fs/pipe.c). The sole use of wqueue->defunct is for checking if the watch queue has been cleared, but wqueue->pipe is also NULLed while clearing. Thus, wqueue->defunct is superfluous, as wqueue->pipe can be checked for NULL. Hence, the former can be removed. Tested with keyutils testsuite. Cc: stable@vger.kernel.org # 6.1 Signed-off-by: Siddh Raman Pant <code@siddh.me> Acked-by: David Howells <dhowells@redhat.com> Message-Id: <20230605143616.640517-1-code@siddh.me> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-06-06lib/string_helpers: Change returned value of the strreplace()Andy Shevchenko1-1/+1
It's more useful to return the pointer to the string itself with strreplace(), so it may be used like attr->name = strreplace(name, '/', '_'); While at it, amend the kernel documentation. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605170553.7835-3-andriy.shevchenko@linux.intel.com
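The usage pattern in the message above depends on the function handing back its first argument. A userspace sketch of that behaviour, under the assumption that the replacement is done in place one character at a time; `strreplace_sketch` is a hypothetical name, not the kernel symbol:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Replace every occurrence of `old_ch` with `new_ch` in the NUL-terminated
 * string `s`, in place, and return `s` so the call can be used directly
 * in an assignment or as an argument.
 */
static char *strreplace_sketch(char *s, char old_ch, char new_ch)
{
	assert(s != NULL);
	for (char *p = s; *p; p++)
		if (*p == old_ch)
			*p = new_ch;
	return s;
}
```

Because the returned pointer is the input buffer itself, a caller can write `name = strreplace_sketch(buf, '/', '_');` without a separate statement for the replacement.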
2023-06-06Merge branch 'drm-i915-use-ref_tracker-library-for-tracking-wakerefs'Jakub Kicinski1-2/+23
Andrzej Hajda says: ==================== drm/i915: use ref_tracker library for tracking wakerefs This is reviewed series of ref_tracker patches, ready to merge via network tree, rebased on net-next/main. i915 patches will be merged later via intel-gfx tree. ==================== Merge on top of an -rc tag in case it's needed in another tree. Link: https://lore.kernel.org/r/20230224-track_gt-v9-0-5b47a33f55d1@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-06lib/ref_tracker: add printing to memory bufferAndrzej Hajda1-0/+8
Similar to stack_(depot|trace)_snprint, the patch adds a helper for printing stats to a memory buffer. It will be helpful in the case of debugfs. Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-06lib/ref_tracker: improve printing statsAndrzej Hajda1-2/+7
In case the library is tracking a busy subsystem, simply printing the stack for every active reference will spam the log with long, hard to read, redundant stack traces. To improve readability the following changes have been made: - reports are printed per stack_handle - log is more compact, - added display name for ref_tracker_dir - it will differentiate multiple subsystems, - stack trace is printed indented, in the same printk call, - info about dropped references is printed as well. Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-06lib/ref_tracker: add unlocked leak print helperAndrzej Hajda1-0/+8
To have reliable detection of leaks, the caller must be able to check both the tracked counter and the leaks under the same lock. dir.lock is a natural candidate for such a lock, and the unlocked print helper can be called with this lock taken. As a bonus we can reuse this helper in ref_tracker_dir_exit. Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-05sched/clock: Provide local_clock_noinstr()Peter Zijlstra1-1/+16
Now that all ARCH_WANTS_NO_INSTR architectures (arm64, loongarch, s390, x86) provide sched_clock_noinstr(), use this to provide local_clock_noinstr(). This local_clock_noinstr() will be safe to use from noinstr code with the assumption that any such noinstr code is non-preemptible (it had better be, entry code will have IRQs disabled while __cpuidle must have preemption disabled). Specifically, preempt_enable_notrace(), a common part of many a sched_clock() implementation calls out to schedule() -- even though, per the above, it will never trigger -- which frustrates noinstr validation. vmlinux.o: warning: objtool: local_clock+0xb5: call to preempt_schedule_notrace_thunk() leaves .noinstr.text section Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Michael Kelley <mikelley@microsoft.com> # Hyper-V Link: https://lore.kernel.org/r/20230519102715.978624636@infradead.org
2023-06-05math64: Always inline u128 version of mul_u64_u64_shr()Peter Zijlstra1-1/+1
In order to prevent the following complaint from happening, always inline the u128 variant of mul_u64_u64_shr() -- which is what x86_64 will use. vmlinux.o: warning: objtool: read_hv_sched_clock_tsc+0x5a: call to mul_u64_u64_shr.constprop.0() leaves .noinstr.text section It should compile into something like: asm("mul %[mul];" "shrd %rdx, %rax, %cl" : "+&a" (a) : "c" shift, [mul] "r" (mul) : "d"); Which is silly not to inline, but it happens. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Michael Kelley <mikelley@microsoft.com> # Hyper-V Link: https://lore.kernel.org/r/20230519102715.637420396@infradead.org
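For reference, the u128 variant discussed above amounts to a widening multiply followed by a shift. A userspace sketch, assuming a compiler that provides `unsigned __int128` (GCC and Clang on 64-bit targets); the `_sketch` suffix marks the name as hypothetical, and plain `static inline` stands in for the kernel's forced inlining:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compute (a * mul) >> shift without losing the upper 64 bits of the
 * product, by doing the multiplication in a 128-bit intermediate.
 */
static inline uint64_t mul_u64_u64_shr_sketch(uint64_t a, uint64_t mul,
					      unsigned int shift)
{
	assert(shift < 128);    /* shifting a 128-bit value by >= 128 is undefined */
	return (uint64_t)(((unsigned __int128)a * mul) >> shift);
}
```

The body is a single multiply and shift, so an out-of-line call costs more than the work itself, which is the point the message makes about inlining.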