summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-10-25cxl/core: Return error when cxl_endpoint_gather_bandwidth() handles a ↵Li Zhijian1-0/+3
non-PCI device The function cxl_endpoint_gather_bandwidth() invokes pci_bus_read/write_XXX(), however, not all CXL devices are presently implemented via PCI. It is recognized that the cxl_test has realized a CXL device using a platform device. Calling pci_bus_read/write_XXX() in cxl_test will cause kernel panic: platform cxl_host_bridge.3: host supports CXL (restricted) Oops: general protection fault, probably for non-canonical address 0x3ef17856fcae4fbd: 0000 [#1] PREEMPT SMP PTI Call Trace: <TASK> ? __die_body.cold+0x19/0x27 ? die_addr+0x38/0x60 ? exc_general_protection+0x1f5/0x4b0 ? asm_exc_general_protection+0x22/0x30 ? pci_bus_read_config_word+0x1c/0x60 pcie_capability_read_word+0x93/0xb0 pcie_link_speed_mbps+0x18/0x50 cxl_pci_get_bandwidth+0x18/0x60 [cxl_core] cxl_endpoint_gather_bandwidth.constprop.0+0xf4/0x230 [cxl_core] ? xas_store+0x54/0x660 ? preempt_count_add+0x69/0xa0 ? _raw_spin_lock+0x13/0x40 ? __kmalloc_cache_noprof+0xe7/0x270 cxl_region_shared_upstream_bandwidth_update+0x9c/0x790 [cxl_core] cxl_region_attach+0x520/0x7e0 [cxl_core] store_targetN+0xf2/0x120 [cxl_core] kernfs_fop_write_iter+0x13a/0x1f0 vfs_write+0x23b/0x410 ksys_write+0x53/0xd0 do_syscall_64+0x62/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e And Ying also reported a KASAN error with similar calltrace. Reported-by: Huang, Ying <ying.huang@intel.com> Closes: http://lore.kernel.org/87y12w9vp5.fsf@yhuang6-desk2.ccr.corp.intel.com Fixes: a5ab0de0ebaa ("cxl: Calculate region bandwidth of targets with shared upstream link") Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Huang, Ying <ying.huang@intel.com> Link: https://patch.msgid.link/20241022030054.258942-1-lizhijian@fujitsu.com Signed-off-by: Ira Weiny <ira.weiny@intel.com>
2024-10-24Merge tag 'probes-fixes-v6.12-rc4.2' of ↵Linus Torvalds6-6/+21
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull probes fixes from Masami Hiramatsu: - objpool: Fix choosing allocation for percpu slots Fixes to allocate objpool's percpu slots correctly according to the GFP flag. It checks whether "any bit" in GFP_ATOMIC is set to choose the vmalloc source, but it should check "all bits" in GFP_ATOMIC flag is set, because GFP_ATOMIC is a combined flag. - tracing/probes: Fix MAX_TRACE_ARGS limit handling If more than MAX_TRACE_ARGS are passed for creating a probe event, the entries over MAX_TRACE_ARG in trace_arg array are not initialized. Thus if the kernel accesses those entries, it crashes. This rejects creating event if the number of arguments is over MAX_TRACE_ARGS. - tracing: Consider the NUL character when validating the event length A strlen() is used when parsing the event name, and the original code does not consider the terminal null byte. Thus it can pass the name one byte longer than the buffer. This fixes to check it correctly. * tag 'probes-fixes-v6.12-rc4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Consider the NULL character when validating the event length tracing/probes: Fix MAX_TRACE_ARGS limit handling objpool: fix choosing allocation for percpu slots
2024-10-24Merge tag 'for-6.12-rc4-tag' of ↵Linus Torvalds9-34/+45
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - mount option fixes: - fix handling of compression mount options on remount - reject rw remount in case there are options that don't work in read-write mode (like rescue options) - fix zone accounting of unusable space - fix in-memory corruption when merging extent maps - fix delalloc range locking for sector < page - use more convenient default value of drop subtree threshold, clean more subvolumes without the fallback to marking quotas inconsistent - fix smatch warning about incorrect value passed to ERR_PTR * tag 'for-6.12-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix passing 0 to ERR_PTR in btrfs_search_dir_index_item() btrfs: reject ro->rw reconfiguration if there are hard ro requirements btrfs: fix read corruption due to race with extent map merging btrfs: fix the delalloc range locking if sector size < page size btrfs: qgroup: set a more sane default value for subtree drop threshold btrfs: clear force-compress on remount when compress mount option is given btrfs: zoned: fix zone unusable accounting for freed reserved extent
2024-10-24Merge tag 'jfs-6.12-rc5' of github.com:kleikamp/linux-shaggyLinus Torvalds1-1/+1
Pull jfs fix from David Kleikamp: "Fix a regression introduced in 6.12-rc1" * tag 'jfs-6.12-rc5' of github.com:kleikamp/linux-shaggy: jfs: Fix sanity check in dbMount
2024-10-24Merge tag 'bcachefs-2024-10-22' of https://github.com/koverstreet/bcachefsLinus Torvalds41-205/+475
Pull bcachefs fixes from Kent Overstreet: "Lots of hotfixes: - transaction restart injection has been shaking out a few things - fix a data corruption in the buffered write path on -ENOSPC, found by xfstests generic/299 - Some small show_options fixes - Repair mismatches in inode hash type, seed: different snapshot versions of an inode must have the same hash/type seed, used for directory entries and xattrs. We were checking the hash seed, but not the type, and a user contributed a filesystem where the hash type on one inode had somehow been flipped; these fixes allow his filesystem to repair. Additionally, the hash type flip made some directory entries invisible, which were then recreated by userspace; so the hash check code now checks for duplicate non dangling dirents, and renames one of them if necessary. - Don't use wait_event_interruptible() in recovery: this fixes some filesystems failing to mount with -ERESTARTSYS - Workaround for kvmalloc not supporting > INT_MAX allocations, causing an -ENOMEM when allocating the sorted array of journal keys: this allows a 75 TB filesystem to mount - Make sure bch_inode_unpacked.bi_snapshot is set in the old inode compat path: this alllows Marcin's filesystem (in use since before 6.7) to repair and mount" * tag 'bcachefs-2024-10-22' of https://github.com/koverstreet/bcachefs: (26 commits) bcachefs: Set bch_inode_unpacked.bi_snapshot in old inode path bcachefs: Mark more errors as AUTOFIX bcachefs: Workaround for kvmalloc() not supporting > INT_MAX allocations bcachefs: Don't use wait_event_interruptible() in recovery bcachefs: Fix __bch2_fsck_err() warning bcachefs: fsck: Improve hash_check_key() bcachefs: bch2_hash_set_or_get_in_snapshot() bcachefs: Repair mismatches in inode hash seed, type bcachefs: Add hash seed, type to inode_to_text() bcachefs: INODE_STR_HASH() for bch_inode_unpacked bcachefs: Run in-kernel offline fsck without ratelimit errors bcachefs: skip mount option handle for empty string. bcachefs: fix incorrect show_options results bcachefs: Fix data corruption on -ENOSPC in buffered write path bcachefs: bch2_folio_reservation_get_partial() is now better behaved bcachefs: fix disk reservation accounting in bch2_folio_reservation_get() bcachefS: ec: fix data type on stripe deletion bcachefs: Don't use commit_do() unnecessarily bcachefs: handle restarts in bch2_bucket_io_time_reset() bcachefs: fix restart handling in __bch2_resume_logged_op_finsert() ...
2024-10-24Revert "9p: Enable multipage folios"Dominique Martinet1-1/+0
This reverts commit 1325e4a91a405f88f1b18626904d37860a4f9069. using multipage folios apparently break some madvise operations like MADV_PAGEOUT which do not reliably unload the specified page anymore, Revert the patch until that is figured out. Reported-by: Andrii Nakryiko <andrii@kernel.org> Fixes: 1325e4a91a40 ("9p: Enable multipage folios") Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-10-24riscv: vdso: Prevent the compiler from inserting calls to memset()Alexandre Ghiti1-0/+1
The compiler is smart enough to insert a call to memset() in riscv_vdso_get_cpus(), which generates a dynamic relocation. So prevent this by using -fno-builtin option. Fixes: e2c0cdfba7f6 ("RISC-V: User-facing API") Cc: stable@vger.kernel.org Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20241016083625.136311-2-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-24iio: dac: Kconfig: Fix build error for ltc2664Jinjie Ruan1-1/+1
If REGMAP_SPI is n and LTC2664 is y, the following build error occurs: riscv64-unknown-linux-gnu-ld: drivers/iio/dac/ltc2664.o: in function `ltc2664_probe': ltc2664.c:(.text+0x714): undefined reference to `__devm_regmap_init_spi' Select REGMAP_SPI instead of REGMAP for LTC2664 to fix it. Fixes: 4cc2fc445d2e ("iio: dac: ltc2664: Add driver for LTC2664 and LTC2672") Reviewed-by: Nuno Sa <nuno.sa@analog.com> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Link: https://patch.msgid.link/20241024015553.1111253-1-ruanjinjie@huawei.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-10-24drm/xe: Don't restart parallel queues multiple times on GT resetNirmoy Das1-2/+12
In case of parallel submissions multiple GuC id will point to the same exec queue and on GT reset such exec queues will get restarted multiple times which is not desirable. v2: don't use exec_queue_enabled() which could race, do the same for xe_guc_submit_stop (Matt B) Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2295 Cc: Jonathan Cavitt <jonathan.cavitt@intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Tejas Upadhyay <tejas.upadhyay@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241022103555.731557-1-nirmoy.das@intel.com Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> (cherry picked from commit c8b0acd6d8745fd7e6450f5acc38f0227bd253b3) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-10-24drm/xe/ufence: Prefetch ufence addr to catch bogus addressNirmoy Das1-1/+2
access_ok() only checks for addr overflow so also try to read the addr to catch invalid addr sent from userspace. Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1630 Cc: Francois Dugast <francois.dugast@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241016082304.66009-2-nirmoy.das@intel.com Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> (cherry picked from commit 9408c4508483ffc60811e910a93d6425b8e63928) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-10-24drm/xe: Handle unreliable MMIO reads during forcewakeShuicheng Lin1-3/+9
In some cases, when the driver attempts to read an MMIO register, the hardware may return 0xFFFFFFFF. The current force wake path code treats this as a valid response, as it only checks the BIT. However, 0xFFFFFFFF should be considered an invalid value, indicating a potential issue. To address this, we should add a log entry to highlight this condition and return failure. The force wake failure log level is changed from notice to err to match the failure return value. v2 (Matt Brost): - set ret value (-EIO) to kick the error to upper layers v3 (Rodrigo): - add commit message for the log level promotion from notice to err v4: - update reviewed info Suggested-by: Alex Zuo <alex.zuo@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Acked-by: Badal Nilawar <badal.nilawar@intel.com> Cc: Anshuman Gupta <anshuman.gupta@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241017221547.1564029-1-shuicheng.lin@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit a9fbeabe7226a3bf90f82d0e28a02c18e3c67447) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-10-24drm/xe/guc/ct: Flush g2h worker in case of g2h response timeoutBadal Nilawar1-0/+18
In case if g2h worker doesn't get opportunity to within specified timeout delay then flush the g2h worker explicitly. v2: - Describe change in the comment and add TODO (Matt B/John H) - Add xe_gt_warn on fence done after G2H flush (John H) v3: - Updated the comment with root cause - Clean up xe_gt_warn message (John H) Closes: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1620 Closes: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2902 Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Acked-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241017111410.2553784-2-badal.nilawar@intel.com (cherry picked from commit e5152723380404acb8175e0777b1cea57f319a01) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-10-24drm/xe: Enlarge the invalidation timeout from 150 to 500Shuicheng Lin1-1/+1
There are error messages like below that are occurring during stress testing: "[ 31.004009] xe 0000:03:00.0: [drm] ERROR GT0: Global invalidation timeout". Previously it was hitting this 3 out of 1000 executions of warm reboot. After raising it to 500, 1000 warm reboot executions passed and it didn't fail. Due to the way xe_mmio_wait32() is implemented, the timeout is able to expire early when the register matches the expected value due to the wait increments starting small. So, the larger timeout value should have no effect during normal use cases. v2 (Jonathan): - rework the commit message v3 (Lucas): - add conclusive message for the fail rate and test case v4: - add suggested-by Suggested-by: Jia Yao <jia.yao@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Tested-by: Zongyao Bai <zongyao.bai@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241015161207.1373401-1-shuicheng.lin@intel.com (cherry picked from commit 2eb460ab9f4bc5b575f52568d17936da0af681d8) [ Fix conflict with gt->mmio ] Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-10-24iio: adc: ad7124: fix division by zero in ad7124_set_channel_odr()Zicheng Qu1-1/+1
In the ad7124_write_raw() function, parameter val can potentially be zero. This may lead to a division by zero when DIV_ROUND_CLOSEST() is called within ad7124_set_channel_odr(). The ad7124_write_raw() function is invoked through the sequence: iio_write_channel_raw() -> iio_write_channel_attribute() -> iio_channel_write(), with no checks in place to ensure val is non-zero. Cc: stable@vger.kernel.org Fixes: 7b8d045e497a ("iio: adc: ad7124: allow more than 8 channels") Signed-off-by: Zicheng Qu <quzicheng@huawei.com> Reviewed-by: Nuno Sa <nuno.sa@analog.com> Link: https://patch.msgid.link/20241022134330.574601-1-quzicheng@huawei.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-10-24staging: iio: frequency: ad9832: fix division by zero in ad9832_calc_freqreg()Zicheng Qu1-2/+5
In the ad9832_write_frequency() function, clk_get_rate() might return 0. This can lead to a division by zero when calling ad9832_calc_freqreg(). The check if (fout > (clk_get_rate(st->mclk) / 2)) does not protect against the case when fout is 0. The ad9832_write_frequency() function is called from ad9832_write(), and fout is derived from a text buffer, which can contain any value. Link: https://lore.kernel.org/all/2024100904-CVE-2024-47663-9bdc@gregkh/ Fixes: ea707584bac1 ("Staging: IIO: DDS: AD9832 / AD9835 driver") Cc: stable@vger.kernel.org Signed-off-by: Zicheng Qu <quzicheng@huawei.com> Reviewed-by: Nuno Sa <nuno.sa@analog.com> Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://patch.msgid.link/20241022134354.574614-1-quzicheng@huawei.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-10-24docs: iio: ad7380: fix supply for ad7380-4Julien Stephan1-2/+11
ad7380-4 is the only device from ad738x family that doesn't have an internal reference. Moreover it's external reference is called REFIN in the datasheet while all other use REFIO as an optional external reference. Update documentation to highlight this. Fixes: 3e82dfc82f38 ("docs: iio: new docs for ad7380 driver") Reviewed-by: David Lechner <dlechner@baylibre.com> Signed-off-by: Julien Stephan <jstephan@baylibre.com> Link: https://patch.msgid.link/20241022-ad7380-fix-supplies-v3-5-f0cefe1b7fa6@baylibre.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-10-24iio: adc: ad7380: fix supplies for ad7380-4Julien Stephan1-10/+26
ad7380-4 is the only device in the family that does not have an internal reference. It uses "refin" as a required external reference. All other devices in the family use "refio"" as an optional external reference. Fixes: 737413da8704 ("iio: adc: ad7380: add support for ad738x-4 4 channels variants") Reviewed-by: Nuno Sa <nuno.sa@analog.com> Reviewed-by: David Lechner <dlechner@baylibre.com> Signed-off-by: Julien Stephan <jstephan@baylibre.com> Link: https://patch.msgid.link/20241022-ad7380-fix-supplies-v3-4-f0cefe1b7fa6@baylibre.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-10-24iio: adc: ad7380: add missing suppliesJulien Stephan1-0/+43
vcc and vlogic are required but are not retrieved and enabled in the probe. Add them. In order to prepare support for additional parts requiring different supplies, add vcc and vlogic to the platform specific structures Reviewed-by: Nuno Sa <nuno.sa@analog.com> Reviewed-by: David Lechner <dlechner@baylibre.com> Signed-off-by: Julien Stephan <jstephan@baylibre.com> Link: https://patch.msgid.link/20241022-ad7380-fix-supplies-v3-3-f0cefe1b7fa6@baylibre.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-10-24iio: adc: ad7380: use devm_regulator_get_enable_read_voltage()Julien Stephan1-60/+21
Use devm_regulator_get_enable_read_voltage() to simplify the code. Reviewed-by: Nuno Sa <nuno.sa@analog.com> Reviewed-by: David Lechner <dlechner@baylibre.com> Signed-off-by: Julien Stephan <jstephan@baylibre.com> Link: https://patch.msgid.link/20241022-ad7380-fix-supplies-v3-2-f0cefe1b7fa6@baylibre.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-10-24dt-bindings: iio: adc: ad7380: fix ad7380-4 reference supplyJulien Stephan1-0/+21
ad7380-4 is the only device from ad738x family that doesn't have an internal reference. Moreover its external reference is called REFIN in the datasheet while all other use REFIO as an optional external reference. If refio-supply is omitted the internal reference is used. Fix the binding by adding refin-supply and makes it required for ad7380-4 only. Fixes: 1a291cc8ee17 ("dt-bindings: iio: adc: ad7380: add support for ad738x-4 4 channels variants") Acked-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: David Lechner <dlechner@baylibre.com> Signed-off-by: Julien Stephan <jstephan@baylibre.com> Link: https://patch.msgid.link/20241022-ad7380-fix-supplies-v3-1-f0cefe1b7fa6@baylibre.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-10-24iio: light: veml6030: fix microlux value calculationJavier Carrasco1-1/+1
The raw value conversion to obtain a measurement in lux as INT_PLUS_MICRO does not calculate the decimal part properly to display it as micro (in this case microlux). It only calculates the module to obtain the decimal part from a resolution that is 10000 times the provided in the datasheet (0.5376 lux/cnt for the veml6030). The resulting value must still be multiplied by 100 to make it micro. This bug was introduced with the original implementation of the driver. Only the illuminance channel is fixed becuase the scale is non sensical for the intensity channels anyway. Cc: stable@vger.kernel.org Fixes: 7b779f573c48 ("iio: light: add driver for veml6030 ambient light sensor") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://patch.msgid.link/20241016-veml6030-fix-processed-micro-v1-1-4a5644796437@gmail.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-10-24Merge branch 'add-the-missing-bpf_link_type-invocation-for-sockmap'Andrii Nakryiko4-5/+16
Hou Tao says: ==================== Add the missing BPF_LINK_TYPE invocation for sockmap From: Hou Tao <houtao1@huawei.com> Hi, The tiny patch set fixes the out-of-bound read problem when reading the fdinfo of sock map link fd. And in order to spot such omission early for the newly-added link type in the future, it also checks the validity of the link->type and adds a WARN_ONCE() for missed invocation. Please see individual patches for more details. And comments are always welcome. v3: * patch #2: check and warn the validity of link->type instead of adding a static assertion for bpf_link_type_strs array. v2: http://lore.kernel.org/bpf/d49fa2f4-f743-c763-7579-c3cab4dd88cb@huaweicloud.com ==================== Link: https://lore.kernel.org/r/20241024013558.1135167-1-houtao@huaweicloud.com Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-10-24bpf: Check validity of link->type in bpf_link_show_fdinfo()Hou Tao1-5/+9
If a newly-added link type doesn't invoke BPF_LINK_TYPE(), accessing bpf_link_type_strs[link->type] may result in an out-of-bounds access. To spot such missed invocations early in the future, checking the validity of link->type in bpf_link_show_fdinfo() and emitting a warning when such invocations are missed. Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241024013558.1135167-3-houtao@huaweicloud.com
2024-10-24bpf: Add the missing BPF_LINK_TYPE invocation for sockmapHou Tao3-0/+7
There is an out-of-bounds read in bpf_link_show_fdinfo() for the sockmap link fd. Fix it by adding the missing BPF_LINK_TYPE invocation for sockmap link Also add comments for bpf_link_type to prevent missing updates in the future. Fixes: 699c23f02c65 ("bpf: Add bpf_link support for sk_msg and sk_skb progs") Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241024013558.1135167-2-houtao@huaweicloud.com
2024-10-24sched_ext: Fix function pointer type mismatches in BPF selftestsVishal Chourasia19-80/+80
Fix incompatible function pointer type warnings in sched_ext BPF selftests by explicitly casting the function pointers when initializing struct_ops. This addresses multiple -Wincompatible-function-pointer-types warnings from the clang compiler where function signatures didn't match exactly. The void * cast ensures the compiler accepts the function pointer assignment despite minor type differences in the parameters. Signed-off-by: Vishal Chourasia <vishalc@linux.ibm.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2024-10-24drm/tegra: Fix NULL vs IS_ERR() check in probe()Dan Carpenter1-2/+2
The iommu_paging_domain_alloc() function doesn't return NULL pointers, it returns error pointers. Update the check to match. Fixes: 45c690aea8ee ("drm/tegra: Use iommu_paging_domain_alloc()") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://patchwork.freedesktop.org/patch/msgid/ba31cf3a-af3d-4ff1-87a8-f05aaf8c780b@stanley.mountain
2024-10-24cpufreq: CPPC: fix perf_to_khz/khz_to_perf conversion exceptionliwei1-5/+17
When the nominal_freq recorded by the kernel is equal to the lowest_freq, and the frequency adjustment operation is triggered externally, there is a logic error in cppc_perf_to_khz()/cppc_khz_to_perf(), resulting in perf and khz conversion errors. Fix this by adding a branch processing logic when nominal_freq is equal to lowest_freq. Fixes: ec1c7ad47664 ("cpufreq: CPPC: Fix performance/frequency conversion") Signed-off-by: liwei <liwei728@huawei.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://patch.msgid.link/20241024022952.2627694-1-liwei728@huawei.com [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-10-24ACPI: PRM: Clean up guid type in struct prm_handler_infoDan Carpenter1-1/+1
Clang 19 prints a warning when we pass &th->guid to efi_pa_va_lookup(): drivers/acpi/prmt.c:156:29: error: passing 1-byte aligned argument to 4-byte aligned parameter 1 of 'efi_pa_va_lookup' may result in an unaligned pointer access [-Werror,-Walign-mismatch] 156 | (void *)efi_pa_va_lookup(&th->guid, handler_info->handler_address); | ^ The problem is that efi_pa_va_lookup() takes a efi_guid_t and &th->guid is a regular guid_t. The difference between the two types is the alignment. efi_guid_t is a typedef. typedef guid_t efi_guid_t __aligned(__alignof__(u32)); It's possible that this a bug in Clang 19. Even though the alignment of &th->guid is not explicitly specified, it will still end up being aligned at 4 or 8 bytes. Anyway, as Ard points out, it's cleaner to change guid to efi_guid_t type and that also makes the warning go away. Fixes: 088984c8d54c ("ACPI: PRM: Find EFI_MEMORY_RUNTIME block for PRM handler and context") Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> Suggested-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Tested-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://patch.msgid.link/3777d71b-9e19-45f4-be4e-17bf4fa7a834@stanley.mountain [ rjw: Subject edit ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-10-24thermal: gov_power_allocator: Granted power set to max when nobody request powerZhengShaobo1-5/+13
When total_req_power is 0, divvy_up_power() will set granted_power to 0, and cdev will be limited to the lowest performance. If our polling delay is set to 200ms, it means that cdev cannot perform better within 200ms even if cdev has a sudden load. This will affect the performance of cdev and is not as expected. For this reason, if nobody requests power, then set the granted power to the max_power. Signed-off-by: ZhengShaobo <zhengshaobo1@xiaomi.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://patch.msgid.link/20241021121138.422-1-zhengshaobo1@xiaomi.com [ rjw: Fixed up tags ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-10-24thermal: core: Relocate thermal zone initialization routineRafael J. Wysocki1-41/+41
Move thermal_zone_device_init() along with thermal_zone_device_check() closer to the callers of the former, where they fit better together. No functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://patch.msgid.link/1906685.CQOukoFCf9@rjwysocki.net
2024-10-24thermal: core: Use trip lists for trip crossing detectionRafael J. Wysocki2-82/+137
Modify the thermal core to use three lists of trip points: trips_high, containing trips with thresholds strictly above the current thermal zone temperature, trips_reached, containing trips with thresholds at or below the current zone temperature, trips_invalid, containing trips with temperature equal to THERMAL_ZONE_INVALID, where the first two lists are always sorted by the current trip threshold. For each trip in trips_high, there is no mitigation under way and the trip threshold is equal to its temperature. In turn, for each trip in trips_reached, there is mitigation under way and the trip threshold is equal to its low temperature. The trips in trips_invalid, of course, need not be taken into consideration. The idea is to make __thermal_zone_device_update() walk trips_high and trips_reached instead of walking the entire table of trip points in a thermal zone. Usually, it will only need to walk a few entries in one of the lists and check one entry in the other list, depending on the direction of the zone temperature changes, because crossing many trips by the zone temperature in one go between two consecutive temperature checks should be unlikely (if it occurs often, the thermal zone temperature should probably be checked more often either or there are too many trips). This also helps to eliminate one temporary trip list used for trip crossing notification (only one temporary list is needed for this purpose instead of two) and the remaining temporary list may be sorted by the current trip threshold value, like the trips_reached list, so the additional notify_temp field in struct thermal_trip_desc is not necessary any more. Moreover, since the trips_reached and trips_high lists are sorted, the "low" and "high" values needed by thermal_zone_set_trips() can be determined in a straightforward way by looking at one end of each list. Of course, additional work is needed in some places in order to maintain the ordering of the lists, but it is limited to situations that should be rare, like updating a trip point temperature or hysteresis, thermal zone initialization, or system resume. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://patch.msgid.link/2003443.usQuhbGJ8B@rjwysocki.net [ rjw: Added a comment to thermal_zone_handle_trips() ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-10-24thermal: core: Eliminate thermal_zone_trip_down()Rafael J. Wysocki2-9/+1
Since thermal_zone_set_trip_temp() is now located in the same file as thermal_trip_crossed(), it can invoke the latter directly without using the thermal_zone_trip_down() wrapper that has no other users. Update thermal_zone_set_trip_temp() accordingly and drop thermal_zone_trip_down(). No functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://patch.msgid.link/1807510.VLH7GnMWUR@rjwysocki.net
2024-10-24thermal: core: Relocate functions that update trip pointsRafael J. Wysocki2-35/+35
In preparation for subsequent changes, move two functions used for updating trip points, thermal_zone_set_trip_temp() and thermal_zone_set_trip_hyst(), to thermal_core.c. No functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://patch.msgid.link/3248558.5fSG56mABF@rjwysocki.net
2024-10-24thermal: core: Move some trip processing to thermal_trip_crossed()Rafael J. Wysocki2-22/+16
Notice that some processing related to trip point crossing carried out in handle_thermal_trip() and thermal_zone_set_trip_temp() may as well be done in thermal_trip_crossed(), which allows code duplication to be reduced, so change the code accordingly. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://patch.msgid.link/1982859.PYKUYFuaPT@rjwysocki.net
2024-10-24thermal: core: Pass trip descriptor to thermal_trip_crossed()Rafael J. Wysocki3-7/+9
In preparation for subsequent changes, modify thermal_trip_crossed() to take a trip descriptor pointer instead of a pointer to struct thermal_trip and propagate this change to thermal_zone_trip_down(). No functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://patch.msgid.link/10547668.nUPlyArG6x@rjwysocki.net
2024-10-24thermal: core: Rearrange __thermal_zone_device_update()Rafael J. Wysocki1-4/+4
In preparation for subsequent changes, move the invocations of thermal_thresholds_handle() and thermal_zone_set_trips() in __thermal_zone_device_update() after the processing of the temporary trip lists. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://patch.msgid.link/3323276.44csPzL39Z@rjwysocki.net
2024-10-24thermal: core: Prepare for moving trips between sorted listsRafael J. Wysocki1-7/+18
Subsequently, trips will be moved between sorted lists in multiple places, so replace add_trip_to_sorted_list() with an analogous function, move_trip_to_sorted_list(), that will move a given trip to a given sorted list. To allow list_del() used in the new function to work, initialize the list_node fields in trip descriptors where applicable so they are always valid. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://patch.msgid.link/2960197.e9J7NaK4W3@rjwysocki.net
2024-10-24thermal: core: Rename trip list node in struct thermal_trip_descRafael J. Wysocki2-6/+6
Since the list node field in struct thermal_trip_desc is going to be used for purposes other than trip crossing notification, rename it to list_node. No functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://patch.msgid.link/2201558.irdbgypaU6@rjwysocki.net
2024-10-24thermal: core: Build sorted lists instead of sorting them laterRafael J. Wysocki1-15/+18
Since it is not expected that multiple trip points will be crossed in one go very often (if this happens, there are too many trip points in the given thermal zone or they are checked too rarely), quite likely it is more efficient to build a sorted list of crossed trip points than to put them on an unsorted list and sort it later. Moreover, trip points are often sorted in ascending temperature order during thermal zone registration, so building a sorted list out of them is quite straightforward and relatively inexpensive. Accordingly, make handle_thermal_trip() maintain list ordering when adding trip points to the lists and get rid of separate list sorting in __thermal_zone_device_update(). No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://patch.msgid.link/4930656.GXAFRqVoOG@rjwysocki.net
2024-10-24thermal/lib: Fix memory leak on error in thermal_genl_auto()Daniel Lezcano1-4/+7
The function thermal_genl_auto() does not free the allocated message in the error path. Fix that by putting a out label and jump to it which will free the message instead of directly returning an error. Fixes: 47c4b0de080a ("tools/lib/thermal: Add a thermal library") Reported-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://patch.msgid.link/20241024105938.1095358-1-daniel.lezcano@linaro.org [ rjw: Fixed up the !msg error path, added Fixes tag ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-10-24tools arch x86: Sync the msr-index.h copy with the kernel sourcesArnaldo Carvalho de Melo1-14/+20
To pick up the changes from these csets: dc1e67f70f6d4e33 ("KVM VMX: Move MSR_IA32_VMX_MISC bit defines to asm/vmx.h") d7bfc9ffd58037ff ("KVM: VMX: Move MSR_IA32_VMX_BASIC bit defines to asm/vmx.h") beb2e446046f8dd9 ("x86/cpu: KVM: Move macro to encode PAT value to common header") e7e80b66fb242a63 ("x86/cpu: KVM: Add common defines for architectural memory types (PAT, MTRRs, etc.)") That cause no changes to tooling: $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after $ diff -u before after $ To see how this works take a look at this previous update: https://git.kernel.org/torvalds/c/174372668933ede5 174372668933ede5 ("tools arch x86: Sync the msr-index.h copy with the kernel sources to pick IA32_MKTME_KEYID_PARTITIONING") Just silences this perf build warning: Warning: Kernel ABI header differences: diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h Please see tools/include/uapi/README for further details. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Sean Christopherson <seanjc@google.com> Cc: Xin Li <xin3.li@intel.com> Link: https://lore.kernel.org/lkml/ZxpLSBzGin3vjs3b@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-10-24thermal: thresholds: Fix thermal lock annotation issueDaniel Lezcano1-4/+9
When the thermal zone is unregistered (thermal sensor module being unloaded), no lock is held when flushing the thresholds. That results in a WARN when the lockdep validation is set in the kernel config. This has been reported by syzbot. As the thermal zone is in the process of being destroyed, there is no need to send a notification about purging the thresholds to the userspace as this one will receive a thermal zone deletion notification which imply the deletion of all the associated resources like the trip points or the user thresholds. Split the function thermal_thresholds_flush() into a lockless one without notification and its call with the lock annotation followed with the thresholds flushing notification. Please note this scenario is unlikely to happen, as the sensor drivers are usually compiled-in in order to have the thermal framework to be able to kick in at boot time if needed. Fixes: 445936f9e258 ("thermal: core: Add user thresholds support") Link: https://lore.kernel.org/all/67124175.050a0220.10f4f4.0012.GAE@google.com Reported-by: syzbot+f24dd060c1911fe54c85@syzkaller.appspotmail.com Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/20241024102303.1086147-1-daniel.lezcano@linaro.org [ rjw: Subject edit, added Fixes tag ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-10-24tools/thermal/thermal-engine: Take into account the thresholds APIDaniel Lezcano1-13/+92
Enhance the thermal-engine skeleton with the thresholds added in the kernel and use the API exported by the thermal library. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://patch.msgid.link/20241022155147.463475-6-daniel.lezcano@linaro.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-10-24tools/lib/thermal: Add the threshold netlink ABIDaniel Lezcano6-15/+242
The thermal framework supports the thresholds and allows the userspace to create, delete, flush, get the list of the thresholds as well as getting the list of the thresholds set for a specific thermal zone. Add the netlink abstraction in the thermal library to take full advantage of thresholds for the userspace program. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://patch.msgid.link/20241022155147.463475-5-daniel.lezcano@linaro.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-10-24tools/lib/thermal: Make more generic the command encoding functionDaniel Lezcano1-9/+32
The thermal netlink has been extended with more commands which require an encoding with more information. The generic encoding function puts the thermal zone id with the command name. It is the unique parameters. The next changes will provide more parameters to the command. Set the scene for those new parameters by making the encoding function more generic. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://patch.msgid.link/20241022155147.463475-4-daniel.lezcano@linaro.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-10-24thermal: netlink: Add the commands and the events for the thresholdsDaniel Lezcano5-28/+301
The thresholds exist but there is no notification neither action code related to them yet. These changes implement the netlink for the notifications when the thresholds are crossed, added, deleted or flushed as well as the commands which allows to get the list of the thresholds, flush them, add and delete. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://patch.msgid.link/20241022155147.463475-3-daniel.lezcano@linaro.org [ rjw: Use the thermal_zone guard for locking, subject edit ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-10-24thermal: core: Manage thermal_governor_lock using a mutex guardRafael J. Wysocki1-27/+13
Switch over the thermal core to using a mutex guard for thermal_governor_lock management. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/3679429.R56niFO833@rjwysocki.net Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-10-24thermal: core: Separate thermal zone governor initializationRafael J. Wysocki1-15/+21
In preparation for a subsequent change that will switch over the thermal core to using a mutex guard for managing thermal_governor_lock, move the code running in thermal_zone_device_register_with_trips() under that lock into a separate function called thermal_zone_init_governor(). While at it, drop a useless comment. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/4408795.ejJDZkT8p0@rjwysocki.net Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-10-24thermal: core: Add and use cooling device guardRafael J. Wysocki7-84/+60
Add and use a special guard for cooling devices. This allows quite a few error code paths to be simplified among other things and brings in code size reduction for a good measure. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/5837621.DvuYhMxLoT@rjwysocki.net Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-10-24iov_iter: Fix iov_iter_get_pages*() for folio_queueDavid Howells1-9/+12
p9_get_mapped_pages() uses iov_iter_get_pages_alloc2() to extract pages from an iterator when performing a zero-copy request and under some circumstances, this crashes with odd page errors[1], for example, I see: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xbcf0 flags: 0x2000000000000000(zone=1) ... page dumped because: VM_BUG_ON_FOLIO(((unsigned int) folio_ref_count(folio) + 127u <= 127u)) ------------[ cut here ]------------ kernel BUG at include/linux/mm.h:1444! This is because, unlike in iov_iter_extract_folioq_pages(), the iter_folioq_get_pages() helper function doesn't skip the current folio when iov_offset points to the end of it, but rather extracts the next page beyond the end of the folio and adds it to the list. Reading will then clobber the contents of this page, leading to system corruption, and if the page is not in use, put_page() may try to clean up the unused page. This can be worked around by copying the iterator before each extraction[2] and using iov_iter_advance() on the original as the advance function steps over the page we're at the end of. Fix this by skipping the page extraction if we're at the end of the folio. This was reproduced in the ktest environment[3] by forcing 9p to use the fscache caching mode and then reading a file through 9p. Fixes: db0aa2e9566f ("mm: Define struct folio_queue and ITER_FOLIOQ to handle a sequence of folios") Reported-by: Antony Antony <antony@phenome.org> Closes: https://lore.kernel.org/r/ZxFQw4OI9rrc7UYc@Antony2201.local/ Signed-off-by: David Howells <dhowells@redhat.com> cc: Eric Van Hensbergen <ericvh@kernel.org> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: v9fs@lists.linux.dev cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/ZxFEi1Tod43pD6JC@moon.secunet.de/ [1] Link: https://lore.kernel.org/r/2299159.1729543103@warthog.procyon.org.uk/ [2] Link: https://github.com/koverstreet/ktest.git [3] Tested-by: Antony Antony <antony.antony@secunet.com> Link: https://lore.kernel.org/r/3327438.1729678025@warthog.procyon.org.uk Signed-off-by: Christian Brauner <brauner@kernel.org>