summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-02-29Merge branch 'icc-sm7150' into icc-nextGeorgi Djakov6-0/+2139
Add dt-bindings and interconnect driver support for the Qualcomm SM7150 SoC. * icc-sm7150 dt-bindings: interconnect: Add Qualcomm SM7150 DT bindings interconnect: qcom: Add SM7150 driver support Link: https://lore.kernel.org/r/20240222174250.80493-1-danila@jiaxyga.com Signed-off-by: Georgi Djakov <djakov@kernel.org>
2024-02-29interconnect: qcom: Add SM7150 driver supportDanila Tikhonov4-0/+1905
Add a driver that handles the different NoCs found on SM7150, based on the downstream dtb. Signed-off-by: Danila Tikhonov <danila@jiaxyga.com> Link: https://lore.kernel.org/r/20240222174250.80493-3-danila@jiaxyga.com Signed-off-by: Georgi Djakov <djakov@kernel.org>
2024-02-29dt-bindings: interconnect: Add Qualcomm SM7150 DT bindingsDanila Tikhonov2-0/+234
The Qualcomm SM7150 platform has several bus fabrics that could be controlled and tuned dynamically according to the bandwidth demand. Signed-off-by: Danila Tikhonov <danila@jiaxyga.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20240222174250.80493-2-danila@jiaxyga.com Signed-off-by: Georgi Djakov <djakov@kernel.org>
2024-02-26Merge branch 'icc-cleanup' into icc-nextGeorgi Djakov22-1138/+189
* icc-cleanup interconnect: qcom: sm8550: Remove bogus per-RSC BCMs and nodes dt-bindings: interconnect: Remove bogus interconnect nodes interconnect: qcom: x1e80100: Remove bogus per-RSC BCMs and nodes interconnect: qcom: sa8775p: constify pointer to qcom_icc_node interconnect: qcom: sm8250: constify pointer to qcom_icc_node interconnect: qcom: sm6115: constify pointer to qcom_icc_node interconnect: qcom: sa8775p: constify pointer to qcom_icc_bcm interconnect: qcom: x1e80100: constify pointer to qcom_icc_bcm dt-bindings: interconnect: qcom,rpmh: Fix bouncing @codeaurora address interconnect: constify of_phandle_args in xlate Signed-off-by: Georgi Djakov <djakov@kernel.org>
2024-02-26interconnect: constify of_phandle_args in xlateKrzysztof Kozlowski14-21/+25
The xlate callbacks are supposed to translate of_phandle_args to proper provider without modifying the of_phandle_args. Make the argument pointer to const for code safety and readability. Acked-by: Konrad Dybcio <konrad.dybcio@linaro.org> Acked-by: Thierry Reding <treding@nvidia.com> # Tegra Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Acked-by: Alim Akhtar <alim.akhtar@samsung.com> # Samsung Link: https://lore.kernel.org/r/20240220072213.35779-1-krzysztof.kozlowski@linaro.org Signed-off-by: Georgi Djakov <djakov@kernel.org>
2024-02-19dt-bindings: interconnect: qcom,rpmh: Fix bouncing @codeaurora addressJeffrey Hugo1-1/+1
The servers for the @codeaurora domain have long been retired and any messages sent there will bounce. Fix Odelu's address in the binding to match the .mailmap entry so that folks see the correct address when looking at the documentation. Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Acked-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20240202181748.4124411-1-quic_jhugo@quicinc.com Signed-off-by: Georgi Djakov <djakov@kernel.org>
2024-02-11Merge branch 'icc-msm8909' into icc-nextGeorgi Djakov5-0/+1436
Add bindings and driver for the three MSM8909 NoCs/interconnects: BIMC, SNoC and PCNoC. The driver allows scaling the bus bandwidths for better power management. * icc-msm8909 dt-bindings: interconnect: Add Qualcomm MSM8909 DT bindings interconnect: qcom: Add MSM8909 interconnect provider driver interconnect: qcom: msm8909: constify pointer to qcom_icc_node Link: https://lore.kernel.org/r/20231220-icc-msm8909-v2-0-3b68bbed2891@kernkonzept.com Signed-off-by: Georgi Djakov <djakov@kernel.org>
2024-02-11interconnect: qcom: x1e80100: constify pointer to qcom_icc_bcmKrzysztof Kozlowski1-6/+6
Pointers to struct qcom_icc_bcm are const. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Andrew Halaney <ahalaney@redhat.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Link: https://lore.kernel.org/r/20240208105056.128448-7-krzysztof.kozlowski@linaro.org Signed-off-by: Georgi Djakov <djakov@kernel.org>
2024-02-11interconnect: qcom: sa8775p: constify pointer to qcom_icc_bcmKrzysztof Kozlowski1-14/+14
Pointers to struct qcom_icc_bcm are const. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Andrew Halaney <ahalaney@redhat.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Link: https://lore.kernel.org/r/20240208105056.128448-6-krzysztof.kozlowski@linaro.org Signed-off-by: Georgi Djakov <djakov@kernel.org>
2024-02-11interconnect: qcom: sm6115: constify pointer to qcom_icc_nodeKrzysztof Kozlowski1-6/+6
Pointers to struct qcom_icc_node are const. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Andrew Halaney <ahalaney@redhat.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Link: https://lore.kernel.org/r/20240208105056.128448-5-krzysztof.kozlowski@linaro.org Signed-off-by: Georgi Djakov <djakov@kernel.org>
2024-02-11interconnect: qcom: sm8250: constify pointer to qcom_icc_nodeKrzysztof Kozlowski1-1/+1
Pointers to struct qcom_icc_node are const. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Andrew Halaney <ahalaney@redhat.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Link: https://lore.kernel.org/r/20240208105056.128448-4-krzysztof.kozlowski@linaro.org Signed-off-by: Georgi Djakov <djakov@kernel.org>
2024-02-11interconnect: qcom: sa8775p: constify pointer to qcom_icc_nodeKrzysztof Kozlowski1-14/+14
Pointers to struct qcom_icc_node are const. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Andrew Halaney <ahalaney@redhat.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Link: https://lore.kernel.org/r/20240208105056.128448-3-krzysztof.kozlowski@linaro.org Signed-off-by: Georgi Djakov <djakov@kernel.org>
2024-02-11interconnect: qcom: msm8909: constify pointer to qcom_icc_nodeKrzysztof Kozlowski1-3/+3
Pointers to struct qcom_icc_node are const. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Andrew Halaney <ahalaney@redhat.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Link: https://lore.kernel.org/r/20240208105056.128448-2-krzysztof.kozlowski@linaro.org Signed-off-by: Georgi Djakov <djakov@kernel.org>
2024-01-31interconnect: qcom: x1e80100: Remove bogus per-RSC BCMs and nodesKonrad Dybcio1-315/+0
The downstream kernel has infrastructure for passing votes from different interconnect nodes onto different RPMh RSCs. This neither implemented, not is going to be implemented upstream (in favor of a different solution using ICC tags through the same node). Unfortunately, as it happens, meaningless (in the upstream context) parts of the vendor driver were copied, ending up causing havoc - since all "per-RSC" (in quotes because they all point to the main APPS one) BCMs defined within the driver overwrite the value in RPMh on every aggregation. To both avoid keeping bogus code around and possibly introducing impossible-to-track-down bugs (busses shutting down for no reason), get rid of the duplicated BCMs and their associated ICC nodes. Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org> Reviewed-by: Rajendra Nayak <quic_rjendra@quicinc.com> Link: https://lore.kernel.org/r/20240102-topic-x1e_fixes-v1-1-70723e08d5f6@linaro.org Signed-off-by: Georgi Djakov <djakov@kernel.org>
2024-01-31dt-bindings: interconnect: Remove bogus interconnect nodesKonrad Dybcio1-24/+0
The downstream kernel has infrastructure for passing votes from different interconnect nodes onto different RPMh RSCs. This neither implemented, not is going to be implemented upstream (in favor of a different solution using ICC tags through the same node). Unfortunately, as it happens, meaningless (in the upstream context) parts of the vendor driver were copied, ending up causing havoc - since all "per-RSC" (in quotes because they all point to the main APPS one) BCMs defined within the driver overwrite the value in RPMh on every aggregation. To both avoid keeping bogus code around and possibly introducing impossible-to-track-down bugs (busses shutting down for no reason), get rid of the duplicated ICC node definitions. Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20240102-topic-x1e_fixes-v1-2-70723e08d5f6@linaro.org Signed-off-by: Georgi Djakov <djakov@kernel.org>
2024-01-31interconnect: qcom: sm8550: Remove bogus per-RSC BCMs and nodesKonrad Dybcio2-736/+122
The downstream kernel has infrastructure for passing votes from different interconnect nodes onto different RPMh RSCs. This neither implemented, not is going to be implemented upstream (in favor of a different solution using ICC tags through the same node). Unfortunately, as it happens, meaningless (in the upstream context) parts of the vendor driver were copied, ending up causing havoc - since all "per-RSC" (in quotes because they all point to the main APPS one) BCMs defined within the driver overwrite the value in RPMh on every aggregation. To both avoid keeping bogus code around and possibly introducing impossible-to-track-down bugs (busses shutting down for no reason), get rid of the duplicated BCMs and their associated ICC nodes. Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org> Reviewed-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20231218-topic-8550_fixes-v1-1-ce1272d77540@linaro.org Signed-off-by: Georgi Djakov <djakov@kernel.org>
2024-01-31interconnect: qcom: Add MSM8909 interconnect provider driverAdam Skladowski3-0/+1340
Add driver for interconnect busses found in MSM8909 based platforms. The topology consists of three NoCs that are partially controlled by a RPM processor. In the downstream/vendor kernel from Qualcomm there is an additional "mm-snoc". However, it actually ends up using the same RPM "snoc_clk" as the normal "snoc". It looks like this is actually the same NoC in hardware and the "mm-snoc" was only defined to assign a different "qcom,util-fact" to increase bandwidth requests by a static margin. In mainline we can represent this by assigning the equivalent "ab_coeff" to all the nodes that are part of "mm-snoc" downstream. Signed-off-by: Adam Skladowski <a39.skl@gmail.com> [Stephan: Drop separate mm-snoc that exists downstream since it's actually the same NoC as SNoC in hardware, add qos_offset for BIMC, add ab_coeff for mm-snoc nodes and BIMC] Signed-off-by: Stephan Gerhold <stephan.gerhold@kernkonzept.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Link: https://lore.kernel.org/r/20231220-icc-msm8909-v2-2-3b68bbed2891@kernkonzept.com Signed-off-by: Georgi Djakov <djakov@kernel.org>
2024-01-31dt-bindings: interconnect: Add Qualcomm MSM8909 DT bindingsAdam Skladowski2-0/+96
Add bindings for Qualcomm MSM8909 Network-On-Chip interconnect devices. [Stephan: Drop separate mm-snoc that exists downstream since it's actually the same NoC as SNoC in hardware] Signed-off-by: Adam Skladowski <a39.skl@gmail.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Stephan Gerhold <stephan.gerhold@kernkonzept.com> Link: https://lore.kernel.org/r/20231220-icc-msm8909-v2-1-3b68bbed2891@kernkonzept.com Signed-off-by: Georgi Djakov <djakov@kernel.org>
2024-01-22Linux 6.8-rc1Linus Torvalds1-2/+2
2024-01-22Merge tag 'bcachefs-2024-01-21' of https://evilpiepirate.org/git/bcachefsLinus Torvalds78-1426/+1629
Pull more bcachefs updates from Kent Overstreet: "Some fixes, Some refactoring, some minor features: - Assorted prep work for disk space accounting rewrite - BTREE_TRIGGER_ATOMIC: after combining our trigger callbacks, this makes our trigger context more explicit - A few fixes to avoid excessive transaction restarts on multithreaded workloads: fstests (in addition to ktest tests) are now checking slowpath counters, and that's shaking out a few bugs - Assorted tracepoint improvements - Starting to break up bcachefs_format.h and move on disk types so they're with the code they belong to; this will make room to start documenting the on disk format better. - A few minor fixes" * tag 'bcachefs-2024-01-21' of https://evilpiepirate.org/git/bcachefs: (46 commits) bcachefs: Improve inode_to_text() bcachefs: logged_ops_format.h bcachefs: reflink_format.h bcachefs; extents_format.h bcachefs: ec_format.h bcachefs: subvolume_format.h bcachefs: snapshot_format.h bcachefs: alloc_background_format.h bcachefs: xattr_format.h bcachefs: dirent_format.h bcachefs: inode_format.h bcachefs; quota_format.h bcachefs: sb-counters_format.h bcachefs: counters.c -> sb-counters.c bcachefs: comment bch_subvolume bcachefs: bch_snapshot::btime bcachefs: add missing __GFP_NOWARN bcachefs: opts->compression can now also be applied in the background bcachefs: Prep work for variable size btree node buffers bcachefs: grab s_umount only if snapshotting ...
2024-01-21Merge tag 'timers-core-2024-01-21' of ↵Linus Torvalds7-12/+41
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner: "Updates for time and clocksources: - A fix for the idle and iowait time accounting vs CPU hotplug. The time is reset on CPU hotplug which makes the accumulated systemwide time jump backwards. - Assorted fixes and improvements for clocksource/event drivers" * tag 'timers-core-2024-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tick-sched: Fix idle and iowait sleeptime accounting vs CPU hotplug clocksource/drivers/ep93xx: Fix error handling during probe clocksource/drivers/cadence-ttc: Fix some kernel-doc warnings clocksource/drivers/timer-ti-dm: Fix make W=n kerneldoc warnings clocksource/timer-riscv: Add riscv_clock_shutdown callback dt-bindings: timer: Add StarFive JH8100 clint dt-bindings: timer: thead,c900-aclint-mtimer: separate mtime and mtimecmp regs
2024-01-21Merge tag 'powerpc-6.8-2' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Aneesh Kumar: - Increase default stack size to 32KB for Book3S Thanks to Michael Ellerman. * tag 'powerpc-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64s: Increase default stack size to 32KB
2024-01-21bcachefs: Improve inode_to_text()Kent Overstreet1-7/+18
Add line breaks - inode_to_text() is now much easier to read. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: logged_ops_format.hKent Overstreet2-27/+31
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: reflink_format.hKent Overstreet3-47/+48
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs; extents_format.hKent Overstreet2-279/+284
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: ec_format.hKent Overstreet2-16/+20
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: subvolume_format.hKent Overstreet2-32/+36
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: snapshot_format.hKent Overstreet2-33/+37
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: alloc_background_format.hKent Overstreet2-93/+94
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: xattr_format.hKent Overstreet2-15/+20
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: dirent_format.hKent Overstreet2-39/+43
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: inode_format.hKent Overstreet2-164/+167
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs; quota_format.hKent Overstreet2-42/+48
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: sb-counters_format.hKent Overstreet2-95/+100
bcachefs_format.h has gotten too big; let's do some organizing. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: counters.c -> sb-counters.cKent Overstreet5-8/+7
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: comment bch_subvolumeKent Overstreet1-0/+3
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: bch_snapshot::btimeKent Overstreet2-0/+3
Add a field to bch_snapshot for creation time; this will be important when we start exposing the snapshot tree to userspace. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: add missing __GFP_NOWARNKent Overstreet1-1/+1
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: opts->compression can now also be applied in the backgroundKent Overstreet11-23/+24
The "apply this compression method in the background" paths now use the compression option if background_compression is not set; this means that setting or changing the compression option will cause existing data to be compressed accordingly in the background. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: Prep work for variable size btree node buffersKent Overstreet18-97/+87
bcachefs btree nodes are big - typically 256k - and btree roots are pinned in memory. As we're now up to 18 btrees, we now have significant memory overhead in mostly empty btree roots. And in the future we're going to start enforcing that certain btree node boundaries exist, to solve lock contention issues - analagous to XFS's AGIs. Thus, we need to start allocating smaller btree node buffers when we can. This patch changes code that refers to the filesystem constant c->opts.btree_node_size to refer to the btree node buffer size - btree_buf_bytes() - where appropriate. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: grab s_umount only if snapshottingSu Yue1-6/+5
When I was testing mongodb over bcachefs with compression, there is a lockdep warning when snapshotting mongodb data volume. $ cat test.sh prog=bcachefs $prog subvolume create /mnt/data $prog subvolume create /mnt/data/snapshots while true;do $prog subvolume snapshot /mnt/data /mnt/data/snapshots/$(date +%s) sleep 1s done $ cat /etc/mongodb.conf systemLog: destination: file logAppend: true path: /mnt/data/mongod.log storage: dbPath: /mnt/data/ lockdep reports: [ 3437.452330] ====================================================== [ 3437.452750] WARNING: possible circular locking dependency detected [ 3437.453168] 6.7.0-rc7-custom+ #85 Tainted: G E [ 3437.453562] ------------------------------------------------------ [ 3437.453981] bcachefs/35533 is trying to acquire lock: [ 3437.454325] ffffa0a02b2b1418 (sb_writers#10){.+.+}-{0:0}, at: filename_create+0x62/0x190 [ 3437.454875] but task is already holding lock: [ 3437.455268] ffffa0a02b2b10e0 (&type->s_umount_key#48){.+.+}-{3:3}, at: bch2_fs_file_ioctl+0x232/0xc90 [bcachefs] [ 3437.456009] which lock already depends on the new lock. [ 3437.456553] the existing dependency chain (in reverse order) is: [ 3437.457054] -> #3 (&type->s_umount_key#48){.+.+}-{3:3}: [ 3437.457507] down_read+0x3e/0x170 [ 3437.457772] bch2_fs_file_ioctl+0x232/0xc90 [bcachefs] [ 3437.458206] __x64_sys_ioctl+0x93/0xd0 [ 3437.458498] do_syscall_64+0x42/0xf0 [ 3437.458779] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 3437.459155] -> #2 (&c->snapshot_create_lock){++++}-{3:3}: [ 3437.459615] down_read+0x3e/0x170 [ 3437.459878] bch2_truncate+0x82/0x110 [bcachefs] [ 3437.460276] bchfs_truncate+0x254/0x3c0 [bcachefs] [ 3437.460686] notify_change+0x1f1/0x4a0 [ 3437.461283] do_truncate+0x7f/0xd0 [ 3437.461555] path_openat+0xa57/0xce0 [ 3437.461836] do_filp_open+0xb4/0x160 [ 3437.462116] do_sys_openat2+0x91/0xc0 [ 3437.462402] __x64_sys_openat+0x53/0xa0 [ 3437.462701] do_syscall_64+0x42/0xf0 [ 3437.462982] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 3437.463359] -> #1 (&sb->s_type->i_mutex_key#15){+.+.}-{3:3}: [ 3437.463843] down_write+0x3b/0xc0 [ 3437.464223] bch2_write_iter+0x5b/0xcc0 [bcachefs] [ 3437.464493] vfs_write+0x21b/0x4c0 [ 3437.464653] ksys_write+0x69/0xf0 [ 3437.464839] do_syscall_64+0x42/0xf0 [ 3437.465009] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 3437.465231] -> #0 (sb_writers#10){.+.+}-{0:0}: [ 3437.465471] __lock_acquire+0x1455/0x21b0 [ 3437.465656] lock_acquire+0xc6/0x2b0 [ 3437.465822] mnt_want_write+0x46/0x1a0 [ 3437.465996] filename_create+0x62/0x190 [ 3437.466175] user_path_create+0x2d/0x50 [ 3437.466352] bch2_fs_file_ioctl+0x2ec/0xc90 [bcachefs] [ 3437.466617] __x64_sys_ioctl+0x93/0xd0 [ 3437.466791] do_syscall_64+0x42/0xf0 [ 3437.466957] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 3437.467180] other info that might help us debug this: [ 3437.469670] 2 locks held by bcachefs/35533: other info that might help us debug this: [ 3437.467507] Chain exists of: sb_writers#10 --> &c->snapshot_create_lock --> &type->s_umount_key#48 [ 3437.467979] Possible unsafe locking scenario: [ 3437.468223] CPU0 CPU1 [ 3437.468405] ---- ---- [ 3437.468585] rlock(&type->s_umount_key#48); [ 3437.468758] lock(&c->snapshot_create_lock); [ 3437.469030] lock(&type->s_umount_key#48); [ 3437.469291] rlock(sb_writers#10); [ 3437.469434] *** DEADLOCK *** [ 3437.469670] 2 locks held by bcachefs/35533: [ 3437.469838] #0: ffffa0a02ce00a88 (&c->snapshot_create_lock){++++}-{3:3}, at: bch2_fs_file_ioctl+0x1e3/0xc90 [bcachefs] [ 3437.470294] #1: ffffa0a02b2b10e0 (&type->s_umount_key#48){.+.+}-{3:3}, at: bch2_fs_file_ioctl+0x232/0xc90 [bcachefs] [ 3437.470744] stack backtrace: [ 3437.470922] CPU: 7 PID: 35533 Comm: bcachefs Kdump: loaded Tainted: G E 6.7.0-rc7-custom+ #85 [ 3437.471313] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 [ 3437.471694] Call Trace: [ 3437.471795] <TASK> [ 3437.471884] dump_stack_lvl+0x57/0x90 [ 3437.472035] check_noncircular+0x132/0x150 [ 3437.472202] __lock_acquire+0x1455/0x21b0 [ 3437.472369] lock_acquire+0xc6/0x2b0 [ 3437.472518] ? filename_create+0x62/0x190 [ 3437.472683] ? lock_is_held_type+0x97/0x110 [ 3437.472856] mnt_want_write+0x46/0x1a0 [ 3437.473025] ? filename_create+0x62/0x190 [ 3437.473204] filename_create+0x62/0x190 [ 3437.473380] user_path_create+0x2d/0x50 [ 3437.473555] bch2_fs_file_ioctl+0x2ec/0xc90 [bcachefs] [ 3437.473819] ? lock_acquire+0xc6/0x2b0 [ 3437.474002] ? __fget_files+0x2a/0x190 [ 3437.474195] ? __fget_files+0xbc/0x190 [ 3437.474380] ? lock_release+0xc5/0x270 [ 3437.474567] ? __x64_sys_ioctl+0x93/0xd0 [ 3437.474764] ? __pfx_bch2_fs_file_ioctl+0x10/0x10 [bcachefs] [ 3437.475090] __x64_sys_ioctl+0x93/0xd0 [ 3437.475277] do_syscall_64+0x42/0xf0 [ 3437.475454] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 3437.475691] RIP: 0033:0x7f2743c313af ====================================================== In __bch2_ioctl_subvolume_create(), we grab s_umount unconditionally and unlock it at the end of the function. There is a comment "why do we need this lock?" about the lock coming from commit 42d237320e98 ("bcachefs: Snapshot creation, deletion") The reason is that __bch2_ioctl_subvolume_create() calls sync_inodes_sb() which enforce locked s_umount to writeback all dirty nodes before doing snapshot works. Fix it by read locking s_umount for snapshotting only and unlocking s_umount after sync_inodes_sb(). Signed-off-by: Su Yue <glass.su@suse.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: kvfree bch_fs::snapshots in bch2_fs_snapshots_exitSu Yue1-1/+1
bch_fs::snapshots is allocated by kvzalloc in __snapshot_t_mut. It should be freed by kvfree not kfree. Or umount will triger: [ 406.829178 ] BUG: unable to handle page fault for address: ffffe7b487148008 [ 406.830676 ] #PF: supervisor read access in kernel mode [ 406.831643 ] #PF: error_code(0x0000) - not-present page [ 406.832487 ] PGD 0 P4D 0 [ 406.832898 ] Oops: 0000 [#1] PREEMPT SMP PTI [ 406.833512 ] CPU: 2 PID: 1754 Comm: umount Kdump: loaded Tainted: G OE 6.7.0-rc7-custom+ #90 [ 406.834746 ] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 [ 406.835796 ] RIP: 0010:kfree+0x62/0x140 [ 406.836197 ] Code: 80 48 01 d8 0f 82 e9 00 00 00 48 c7 c2 00 00 00 80 48 2b 15 78 9f 1f 01 48 01 d0 48 c1 e8 0c 48 c1 e0 06 48 03 05 56 9f 1f 01 <48> 8b 50 08 48 89 c7 f6 c2 01 0f 85 b0 00 00 00 66 90 48 8b 07 f6 [ 406.837810 ] RSP: 0018:ffffb9d641607e48 EFLAGS: 00010286 [ 406.838213 ] RAX: ffffe7b487148000 RBX: ffffb9d645200000 RCX: ffffb9d641607dc4 [ 406.838738 ] RDX: 000065bb00000000 RSI: ffffffffc0d88b84 RDI: ffffb9d645200000 [ 406.839217 ] RBP: ffff9a4625d00068 R08: 0000000000000001 R09: 0000000000000001 [ 406.839650 ] R10: 0000000000000001 R11: 000000000000001f R12: ffff9a4625d4da80 [ 406.840055 ] R13: ffff9a4625d00000 R14: ffffffffc0e2eb20 R15: 0000000000000000 [ 406.840451 ] FS: 00007f0a264ffb80(0000) GS:ffff9a4e2d500000(0000) knlGS:0000000000000000 [ 406.840851 ] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 406.841125 ] CR2: ffffe7b487148008 CR3: 000000018c4d2000 CR4: 00000000000006f0 [ 406.841464 ] Call Trace: [ 406.841583 ] <TASK> [ 406.841682 ] ? __die+0x1f/0x70 [ 406.841828 ] ? page_fault_oops+0x159/0x470 [ 406.842014 ] ? fixup_exception+0x22/0x310 [ 406.842198 ] ? exc_page_fault+0x1ed/0x200 [ 406.842382 ] ? asm_exc_page_fault+0x22/0x30 [ 406.842574 ] ? bch2_fs_release+0x54/0x280 [bcachefs] [ 406.842842 ] ? kfree+0x62/0x140 [ 406.842988 ] ? kfree+0x104/0x140 [ 406.843138 ] bch2_fs_release+0x54/0x280 [bcachefs] [ 406.843390 ] kobject_put+0xb7/0x170 [ 406.843552 ] deactivate_locked_super+0x2f/0xa0 [ 406.843756 ] cleanup_mnt+0xba/0x150 [ 406.843917 ] task_work_run+0x59/0xa0 [ 406.844083 ] exit_to_user_mode_prepare+0x197/0x1a0 [ 406.844302 ] syscall_exit_to_user_mode+0x16/0x40 [ 406.844510 ] do_syscall_64+0x4e/0xf0 [ 406.844675 ] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 406.844907 ] RIP: 0033:0x7f0a2664e4fb Signed-off-by: Su Yue <glass.su@suse.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: bios must be 512 byte alginedKent Overstreet1-0/+4
Fixes: 023f9ac9f70f bcachefs: Delete dio read alignment check Reported-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: remove redundant variable tmpColin Ian King1-3/+1
The variable tmp is being assigned a value but it isn't being read afterwards. The assignment is redundant and so tmp can be removed. Cleans up clang scan build warning: warning: Although the value stored to 'ret' is used in the enclosing expression, the value is never actually read from 'ret' [deadcode.DeadStores] Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: Improve trace_trans_restart_relockKent Overstreet5-24/+44
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: Fix excess transaction restarts in __bchfs_fallocate()Kent Overstreet4-16/+35
drop_locks_do() should not be used in a fastpath without first trying the do in nonblocking mode - the unlock and relock will cause excessive transaction restarts and potentially livelocking with other threads that are contending for the same locks. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: extents_to_bp_stateKent Overstreet1-48/+41
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: bkey_and_val_eq()Kent Overstreet1-3/+8
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: Better journal tracepointsKent Overstreet2-60/+79
Factor out bch2_journal_bufs_to_text(), and use it in the journal_entry_full() tracepoint; when we can't get a journal reservation we need to know the outstanding journal entry sizes to know if the problem is due to excessive flushing. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>