Age | Commit message (Collapse) | Author | Files | Lines |
|
When xe2 support started to be added, USM was still not functional. This
has changed, and now USM can be enabled for xe2. Remove FIXME leftover
to allow VM to be created with DRM_XE_VM_CREATE_FLAG_FAULT_MODE.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240129214510.123829-1-lucas.demarchi@intel.com
|
|
This fixes warnings in xe, i915 hwmon docs:
Warning: /sys/devices/.../hwmon/hwmon<i>/curr1_crit is defined 2 times: Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon:35 Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon:52
Warning: /sys/devices/.../hwmon/hwmon<i>/energy1_input is defined 2 times: Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon:54 Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon:65
Warning: /sys/devices/.../hwmon/hwmon<i>/in0_input is defined 2 times: Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon:46 Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon:0
Warning: /sys/devices/.../hwmon/hwmon<i>/power1_crit is defined 2 times: Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon:22 Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon:39
Warning: /sys/devices/.../hwmon/hwmon<i>/power1_max is defined 2 times: Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon:0 Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon:8
Warning: /sys/devices/.../hwmon/hwmon<i>/power1_max_interval is defined 2 times: Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon:62 Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon:30
Warning: /sys/devices/.../hwmon/hwmon<i>/power1_rated_max is defined 2 times: Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon:14 Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon:22
Use a path containing the driver name to differentiate the documentation
of each entry.
Fixes: fb1b70607f73 ("drm/xe/hwmon: Expose power attributes")
Fixes: 92d44a422d0d ("drm/xe/hwmon: Expose card reactive critical power")
Fixes: fbcdc9d3bf58 ("drm/xe/hwmon: Expose input voltage attribute")
Fixes: 71d0a32524f9 ("drm/xe/hwmon: Expose hwmon energy attribute")
Fixes: 4446fcf220ce ("drm/xe/hwmon: Expose power1_max_interval")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Closes: https://lore.kernel.org/all/20240125113345.291118ff@canb.auug.org.au/
Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240127165040.2348009-1-badal.nilawar@intel.com
|
|
If skip_guc_pc is set for a platform, C6 is disabled directly without
acquiring a mem_access reference, triggering an assertion inside
xe_gt_idle_disable_c6.
Fixes: 975e4a3795d4 ("drm/xe: Manually setup C6 when skip_guc_pc is set")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240126220613.865939-2-matthew.d.roper@intel.com
|
|
While one MI_STORE_DATA_IMM can take no more than 0x1fe qwords,
the size of the pgtable can be 512 entries.
Fixes: 43d48379c939 ("drm/xe: correct the calculation of remaining size")
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Fei Yang <fei.yang@intel.com>
Tested-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240125065245.1204731-2-fei.yang@intel.com
|
|
Make sure G2H handler is not running when changing the CT state to drop
messages or disabled. This will help prevent races in the code ensuring
that G2H are not being processed after changing the state.
v2:
- s/flush_g2h_handler/stop_g2h_handler (Michal)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Rodrigo remove the extra line while pushing]
Link: https://patchwork.freedesktop.org/patch/msgid/20240122210156.1517444-4-matthew.brost@intel.com
|
|
This is a software reset which can be done immediately after stopping
the UC.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240122210156.1517444-3-matthew.brost@intel.com
|
|
The Guc CT has more than enabled / disables states rather it has 4. The
4 states are not initialized, disabled, stopped, and enabled. Change the
code to reflect this. These states will enable proper return codes from
functions and therefore enable proper error messages.
v2:
- s/XE_GUC_CT_STATE_DROP_MESSAGES/XE_GUC_CT_STATE_STOPPED (Michal)
- Add assert for CT being initialized (Michal)
- Fix kernel for CT state enum (Michal)
v3:
- Kernel doc (Michal)
- s/reiecved/received (Michal)
- assert CT state not initialized in xe_guc_ct_init (Michal)
- add argument xe_guc_ct_set_state to clear g2h (Michal)
v4:
- Drop clear_outstanding_g2h argument (Michal)
v5:
- Move xa_destroy outside of fast lock (CI)
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240122210156.1517444-2-matthew.brost@intel.com
|
|
trace_dma_fence_init() uses dma_fence_ops functions
like get_driver_name() and get_timeline_name() to generate trace
information but the Xe KMD implementation of those functions makes
use of xe_hw_fence_ctx that was being set after dma_fence_init().
So here just inverting the order to fix the crash.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240124171830.95774-1-jose.souza@intel.com
|
|
If we unconditionally build the debugfs files, we'll get both the static
inline stubs from the headers and the real functions for
CONFIG_DEBUG_FS=n. Avoid building the debugfs files with that config.
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Closes: https://lore.kernel.org/r/152521f9-119f-4c61-b467-3e91f4aecb1a@infradead.org
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Tested-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240124090515.3363901-1-jani.nikula@intel.com
|
|
I guess the indention was to keep it visually aligned but that
would require a lot of spaces and was not followed by other registers
so lets just drop it.
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Maarten Lankhorst <dev@lankhorst.se>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240123204454.246788-7-jose.souza@intel.com
|
|
This makes easier to use those registers when copying its values to
calculator also makes easier for tools to parse it.
To avoids padding holes in xe_hw_engine_snapshot the u64 variables
were moved to the top of xe_hw_engine_snapshot.reg.
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Maarten Lankhorst <dev@lankhorst.se>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240123204454.246788-6-jose.souza@intel.com
|
|
To properly decode batch buffer Mesa tools needs to know what
platform is this one, for now we can do that with PCI id but
already making it future proof by also printing GTs GMD version.
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Maarten Lankhorst <dev@lankhorst.se>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240123204454.246788-5-jose.souza@intel.com
|
|
Although we've stored the major and minor versions for graphics/media in
xe_device, it will be simpler to implement the uapi version query if we
also stash the raw register value in the GT itself.
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240123204454.246788-4-jose.souza@intel.com
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
|
|
xe is never set in xe_devcoredump but if xe_device is needed
devcoredump_to_xe_device() can be used.
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Maarten Lankhorst <dev@lankhorst.se>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240123204454.246788-3-jose.souza@intel.com
|
|
When devcoredump start to dump the VMs contents it will be necessary
to know the starting addresses of batch buffers of the job that hang.
This information it set in xe_sched_job and xe_sched_job is not easily
acessible from xe_exec_queue, so here changing the parameter, next
patch will append the batch buffer addresses to devcoredump snapshot
capture.
v3:
- update functions documentation to xe_sched_job
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Maarten Lankhorst <dev@lankhorst.se>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240123204454.246788-2-jose.souza@intel.com
|
|
Right now devcoredump has a new line between '**** GuC CT ****' and
'H2G CTB (all sizes in DW):' while other sections don't have.
v2: remove double new line after IPEHR
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Maarten Lankhorst <dev@lankhorst.se>
Cc: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240123204454.246788-1-jose.souza@intel.com
|
|
Since the PCI IDs for PVC weren't added to the xe driver, the xe_wa
tests should not try to create a fake PVC device since they can't find
the right PCI ID. Fix bugs when running kunit:
# xe_wa_gt: ASSERTION FAILED at drivers/gpu/drm/xe/tests/xe_wa_test.c:111
Expected ret == 0, but
ret == -19 (0xffffffffffffffed)
[FAILED] PVC (B0)
# xe_wa_gt: ASSERTION FAILED at drivers/gpu/drm/xe/tests/xe_wa_test.c:111
Expected ret == 0, but
ret == -19 (0xffffffffffffffed)
[FAILED] PVC (B1)
# xe_wa_gt: ASSERTION FAILED at drivers/gpu/drm/xe/tests/xe_wa_test.c:111
Expected ret == 0, but
ret == -19 (0xffffffffffffffed)
[FAILED] PVC (C0)
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240123031242.3548724-1-lucas.demarchi@intel.com
|
|
Document nested struct members with full names as described in
Documentation/doc-guide/kernel-doc.rst.
For this documentation we allow a column width of 100 to make
it more readable.
This fixes warnings similar to:
drivers/gpu/drm/xe/xe_lrc_types.h:45: warning: Excess struct member 'size' description in 'xe_lrc'
v2:
- Only change the documentation, not the member.
v3:
- Fix the commit message wording.
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240123153147.27305-1-thomas.hellstrom@linux.intel.com
|
|
Add performance guide changes to Xe2_LPG.
BSpec: 72161
Signed-off-by: Shekhar Chauhan <shekhar.chauhan@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240123050552.2250699-2-shekhar.chauhan@intel.com
|
|
Fix function naming and description for xe_vram_freq_sysfs_init
function.
v2: Add fixes tag (Riana)
Fix review comments (Lucas)
Fixes: 4ae3aeab32d7 ("drm/xe: Add vram frequency sysfs attributes")
Signed-off-by: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com>
Reviewed-by: Riana Tauro <riana.tauro@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240117043215.2598677-1-sujaritha.sundaresan@intel.com
|
|
Sync to v6.8-rc1.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Fix xe_vm_create_ioctl routine not freeing the vm-id allocated to it
when the function fails.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240122102424.4008095-1-mhaimovski@habana.ai
|
|
The pat table entry associated with XE_CACHE_WB is coherent whereas
XE_CACHE_NONE is non coherent. Migration expects the coherency
with cpu therefore use the coherent entry XE_CACHE_WB for
buffers not supporting compression. For read/write to flat ccs region
the issue is not related to coherency with cpu. The hardware expects
the pat index associated with GPUVA for indirect access to be
compression enabled hence use XE_CACHE_NONE_COMPRESSION.
v2
- Fix the argument to emit_pte, pass the bool directly. (Thomas)
v3
- Rebase
- Update commit message (Matt)
v4
- Add a Fixes: tag. (Thomas)
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Fixes: 65ef8dbad1db ("drm/xe/xe2: Update emit_pte to use compression enabled PAT index")
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240119041826.1670496-1-himal.prasad.ghimiray@intel.com
|
|
|
|
Pull more bcachefs updates from Kent Overstreet:
"Some fixes, Some refactoring, some minor features:
- Assorted prep work for disk space accounting rewrite
- BTREE_TRIGGER_ATOMIC: after combining our trigger callbacks, this
makes our trigger context more explicit
- A few fixes to avoid excessive transaction restarts on
multithreaded workloads: fstests (in addition to ktest tests) are
now checking slowpath counters, and that's shaking out a few bugs
- Assorted tracepoint improvements
- Starting to break up bcachefs_format.h and move on disk types so
they're with the code they belong to; this will make room to start
documenting the on disk format better.
- A few minor fixes"
* tag 'bcachefs-2024-01-21' of https://evilpiepirate.org/git/bcachefs: (46 commits)
bcachefs: Improve inode_to_text()
bcachefs: logged_ops_format.h
bcachefs: reflink_format.h
bcachefs; extents_format.h
bcachefs: ec_format.h
bcachefs: subvolume_format.h
bcachefs: snapshot_format.h
bcachefs: alloc_background_format.h
bcachefs: xattr_format.h
bcachefs: dirent_format.h
bcachefs: inode_format.h
bcachefs; quota_format.h
bcachefs: sb-counters_format.h
bcachefs: counters.c -> sb-counters.c
bcachefs: comment bch_subvolume
bcachefs: bch_snapshot::btime
bcachefs: add missing __GFP_NOWARN
bcachefs: opts->compression can now also be applied in the background
bcachefs: Prep work for variable size btree node buffers
bcachefs: grab s_umount only if snapshotting
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
"Updates for time and clocksources:
- A fix for the idle and iowait time accounting vs CPU hotplug.
The time is reset on CPU hotplug which makes the accumulated
systemwide time jump backwards.
- Assorted fixes and improvements for clocksource/event drivers"
* tag 'timers-core-2024-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tick-sched: Fix idle and iowait sleeptime accounting vs CPU hotplug
clocksource/drivers/ep93xx: Fix error handling during probe
clocksource/drivers/cadence-ttc: Fix some kernel-doc warnings
clocksource/drivers/timer-ti-dm: Fix make W=n kerneldoc warnings
clocksource/timer-riscv: Add riscv_clock_shutdown callback
dt-bindings: timer: Add StarFive JH8100 clint
dt-bindings: timer: thead,c900-aclint-mtimer: separate mtime and mtimecmp regs
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Aneesh Kumar:
- Increase default stack size to 32KB for Book3S
Thanks to Michael Ellerman.
* tag 'powerpc-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64s: Increase default stack size to 32KB
|
|
Add line breaks - inode_to_text() is now much easier to read.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
bcachefs_format.h has gotten too big; let's do some organizing.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add a field to bch_snapshot for creation time; this will be important
when we start exposing the snapshot tree to userspace.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
The "apply this compression method in the background" paths now use the
compression option if background_compression is not set; this means that
setting or changing the compression option will cause existing data to
be compressed accordingly in the background.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
bcachefs btree nodes are big - typically 256k - and btree roots are
pinned in memory. As we're now up to 18 btrees, we now have significant
memory overhead in mostly empty btree roots.
And in the future we're going to start enforcing that certain btree node
boundaries exist, to solve lock contention issues - analagous to XFS's
AGIs.
Thus, we need to start allocating smaller btree node buffers when we
can. This patch changes code that refers to the filesystem constant
c->opts.btree_node_size to refer to the btree node buffer size -
btree_buf_bytes() - where appropriate.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
When I was testing mongodb over bcachefs with compression,
there is a lockdep warning when snapshotting mongodb data volume.
$ cat test.sh
prog=bcachefs
$prog subvolume create /mnt/data
$prog subvolume create /mnt/data/snapshots
while true;do
$prog subvolume snapshot /mnt/data /mnt/data/snapshots/$(date +%s)
sleep 1s
done
$ cat /etc/mongodb.conf
systemLog:
destination: file
logAppend: true
path: /mnt/data/mongod.log
storage:
dbPath: /mnt/data/
lockdep reports:
[ 3437.452330] ======================================================
[ 3437.452750] WARNING: possible circular locking dependency detected
[ 3437.453168] 6.7.0-rc7-custom+ #85 Tainted: G E
[ 3437.453562] ------------------------------------------------------
[ 3437.453981] bcachefs/35533 is trying to acquire lock:
[ 3437.454325] ffffa0a02b2b1418 (sb_writers#10){.+.+}-{0:0}, at: filename_create+0x62/0x190
[ 3437.454875]
but task is already holding lock:
[ 3437.455268] ffffa0a02b2b10e0 (&type->s_umount_key#48){.+.+}-{3:3}, at: bch2_fs_file_ioctl+0x232/0xc90 [bcachefs]
[ 3437.456009]
which lock already depends on the new lock.
[ 3437.456553]
the existing dependency chain (in reverse order) is:
[ 3437.457054]
-> #3 (&type->s_umount_key#48){.+.+}-{3:3}:
[ 3437.457507] down_read+0x3e/0x170
[ 3437.457772] bch2_fs_file_ioctl+0x232/0xc90 [bcachefs]
[ 3437.458206] __x64_sys_ioctl+0x93/0xd0
[ 3437.458498] do_syscall_64+0x42/0xf0
[ 3437.458779] entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.459155]
-> #2 (&c->snapshot_create_lock){++++}-{3:3}:
[ 3437.459615] down_read+0x3e/0x170
[ 3437.459878] bch2_truncate+0x82/0x110 [bcachefs]
[ 3437.460276] bchfs_truncate+0x254/0x3c0 [bcachefs]
[ 3437.460686] notify_change+0x1f1/0x4a0
[ 3437.461283] do_truncate+0x7f/0xd0
[ 3437.461555] path_openat+0xa57/0xce0
[ 3437.461836] do_filp_open+0xb4/0x160
[ 3437.462116] do_sys_openat2+0x91/0xc0
[ 3437.462402] __x64_sys_openat+0x53/0xa0
[ 3437.462701] do_syscall_64+0x42/0xf0
[ 3437.462982] entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.463359]
-> #1 (&sb->s_type->i_mutex_key#15){+.+.}-{3:3}:
[ 3437.463843] down_write+0x3b/0xc0
[ 3437.464223] bch2_write_iter+0x5b/0xcc0 [bcachefs]
[ 3437.464493] vfs_write+0x21b/0x4c0
[ 3437.464653] ksys_write+0x69/0xf0
[ 3437.464839] do_syscall_64+0x42/0xf0
[ 3437.465009] entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.465231]
-> #0 (sb_writers#10){.+.+}-{0:0}:
[ 3437.465471] __lock_acquire+0x1455/0x21b0
[ 3437.465656] lock_acquire+0xc6/0x2b0
[ 3437.465822] mnt_want_write+0x46/0x1a0
[ 3437.465996] filename_create+0x62/0x190
[ 3437.466175] user_path_create+0x2d/0x50
[ 3437.466352] bch2_fs_file_ioctl+0x2ec/0xc90 [bcachefs]
[ 3437.466617] __x64_sys_ioctl+0x93/0xd0
[ 3437.466791] do_syscall_64+0x42/0xf0
[ 3437.466957] entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.467180]
other info that might help us debug this:
[ 3437.469670] 2 locks held by bcachefs/35533:
other info that might help us debug this:
[ 3437.467507] Chain exists of:
sb_writers#10 --> &c->snapshot_create_lock --> &type->s_umount_key#48
[ 3437.467979] Possible unsafe locking scenario:
[ 3437.468223] CPU0 CPU1
[ 3437.468405] ---- ----
[ 3437.468585] rlock(&type->s_umount_key#48);
[ 3437.468758] lock(&c->snapshot_create_lock);
[ 3437.469030] lock(&type->s_umount_key#48);
[ 3437.469291] rlock(sb_writers#10);
[ 3437.469434]
*** DEADLOCK ***
[ 3437.469670] 2 locks held by bcachefs/35533:
[ 3437.469838] #0: ffffa0a02ce00a88 (&c->snapshot_create_lock){++++}-{3:3}, at: bch2_fs_file_ioctl+0x1e3/0xc90 [bcachefs]
[ 3437.470294] #1: ffffa0a02b2b10e0 (&type->s_umount_key#48){.+.+}-{3:3}, at: bch2_fs_file_ioctl+0x232/0xc90 [bcachefs]
[ 3437.470744]
stack backtrace:
[ 3437.470922] CPU: 7 PID: 35533 Comm: bcachefs Kdump: loaded Tainted: G E 6.7.0-rc7-custom+ #85
[ 3437.471313] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
[ 3437.471694] Call Trace:
[ 3437.471795] <TASK>
[ 3437.471884] dump_stack_lvl+0x57/0x90
[ 3437.472035] check_noncircular+0x132/0x150
[ 3437.472202] __lock_acquire+0x1455/0x21b0
[ 3437.472369] lock_acquire+0xc6/0x2b0
[ 3437.472518] ? filename_create+0x62/0x190
[ 3437.472683] ? lock_is_held_type+0x97/0x110
[ 3437.472856] mnt_want_write+0x46/0x1a0
[ 3437.473025] ? filename_create+0x62/0x190
[ 3437.473204] filename_create+0x62/0x190
[ 3437.473380] user_path_create+0x2d/0x50
[ 3437.473555] bch2_fs_file_ioctl+0x2ec/0xc90 [bcachefs]
[ 3437.473819] ? lock_acquire+0xc6/0x2b0
[ 3437.474002] ? __fget_files+0x2a/0x190
[ 3437.474195] ? __fget_files+0xbc/0x190
[ 3437.474380] ? lock_release+0xc5/0x270
[ 3437.474567] ? __x64_sys_ioctl+0x93/0xd0
[ 3437.474764] ? __pfx_bch2_fs_file_ioctl+0x10/0x10 [bcachefs]
[ 3437.475090] __x64_sys_ioctl+0x93/0xd0
[ 3437.475277] do_syscall_64+0x42/0xf0
[ 3437.475454] entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.475691] RIP: 0033:0x7f2743c313af
======================================================
In __bch2_ioctl_subvolume_create(), we grab s_umount unconditionally
and unlock it at the end of the function. There is a comment
"why do we need this lock?" about the lock coming from
commit 42d237320e98 ("bcachefs: Snapshot creation, deletion")
The reason is that __bch2_ioctl_subvolume_create() calls
sync_inodes_sb() which enforce locked s_umount to writeback all dirty
nodes before doing snapshot works.
Fix it by read locking s_umount for snapshotting only and unlocking
s_umount after sync_inodes_sb().
Signed-off-by: Su Yue <glass.su@suse.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
bch_fs::snapshots is allocated by kvzalloc in __snapshot_t_mut.
It should be freed by kvfree not kfree.
Or umount will triger:
[ 406.829178 ] BUG: unable to handle page fault for address: ffffe7b487148008
[ 406.830676 ] #PF: supervisor read access in kernel mode
[ 406.831643 ] #PF: error_code(0x0000) - not-present page
[ 406.832487 ] PGD 0 P4D 0
[ 406.832898 ] Oops: 0000 [#1] PREEMPT SMP PTI
[ 406.833512 ] CPU: 2 PID: 1754 Comm: umount Kdump: loaded Tainted: G OE 6.7.0-rc7-custom+ #90
[ 406.834746 ] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
[ 406.835796 ] RIP: 0010:kfree+0x62/0x140
[ 406.836197 ] Code: 80 48 01 d8 0f 82 e9 00 00 00 48 c7 c2 00 00 00 80 48 2b 15 78 9f 1f 01 48 01 d0 48 c1 e8 0c 48 c1 e0 06 48 03 05 56 9f 1f 01 <48> 8b 50 08 48 89 c7 f6 c2 01 0f 85 b0 00 00 00 66 90 48 8b 07 f6
[ 406.837810 ] RSP: 0018:ffffb9d641607e48 EFLAGS: 00010286
[ 406.838213 ] RAX: ffffe7b487148000 RBX: ffffb9d645200000 RCX: ffffb9d641607dc4
[ 406.838738 ] RDX: 000065bb00000000 RSI: ffffffffc0d88b84 RDI: ffffb9d645200000
[ 406.839217 ] RBP: ffff9a4625d00068 R08: 0000000000000001 R09: 0000000000000001
[ 406.839650 ] R10: 0000000000000001 R11: 000000000000001f R12: ffff9a4625d4da80
[ 406.840055 ] R13: ffff9a4625d00000 R14: ffffffffc0e2eb20 R15: 0000000000000000
[ 406.840451 ] FS: 00007f0a264ffb80(0000) GS:ffff9a4e2d500000(0000) knlGS:0000000000000000
[ 406.840851 ] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 406.841125 ] CR2: ffffe7b487148008 CR3: 000000018c4d2000 CR4: 00000000000006f0
[ 406.841464 ] Call Trace:
[ 406.841583 ] <TASK>
[ 406.841682 ] ? __die+0x1f/0x70
[ 406.841828 ] ? page_fault_oops+0x159/0x470
[ 406.842014 ] ? fixup_exception+0x22/0x310
[ 406.842198 ] ? exc_page_fault+0x1ed/0x200
[ 406.842382 ] ? asm_exc_page_fault+0x22/0x30
[ 406.842574 ] ? bch2_fs_release+0x54/0x280 [bcachefs]
[ 406.842842 ] ? kfree+0x62/0x140
[ 406.842988 ] ? kfree+0x104/0x140
[ 406.843138 ] bch2_fs_release+0x54/0x280 [bcachefs]
[ 406.843390 ] kobject_put+0xb7/0x170
[ 406.843552 ] deactivate_locked_super+0x2f/0xa0
[ 406.843756 ] cleanup_mnt+0xba/0x150
[ 406.843917 ] task_work_run+0x59/0xa0
[ 406.844083 ] exit_to_user_mode_prepare+0x197/0x1a0
[ 406.844302 ] syscall_exit_to_user_mode+0x16/0x40
[ 406.844510 ] do_syscall_64+0x4e/0xf0
[ 406.844675 ] entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 406.844907 ] RIP: 0033:0x7f0a2664e4fb
Signed-off-by: Su Yue <glass.su@suse.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Fixes: 023f9ac9f70f bcachefs: Delete dio read alignment check
Reported-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
The variable tmp is being assigned a value but it isn't being
read afterwards. The assignment is redundant and so tmp can be
removed.
Cleans up clang scan build warning:
warning: Although the value stored to 'ret' is used in the enclosing
expression, the value is never actually read from 'ret'
[deadcode.DeadStores]
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|