Age | Commit message (Collapse) | Author | Files | Lines |
|
Pull bcachefs updates from Kent Overstreet:
- Metadata version 1.8: Stripe sectors accounting, BCH_DATA_unstriped
This splits out the accounting of dirty sectors and stripe sectors in
alloc keys; this lets us see stripe buckets that still have unstriped
data in them.
This is needed for ensuring that erasure coding is working correctly,
as well as completing stripe creation after a crash.
- Metadata version 1.9: Disk accounting rewrite
The previous disk accounting scheme relied heavily on percpu counters
that were also sharded by outstanding journal buffer; it was fast but
not extensible or scalable, and meant that all accounting counters
were recorded in every journal entry.
The new disk accounting scheme stores accounting as normal btree
keys; updates are deltas until they are flushed by the btree write
buffer.
This means we have no practical limit on the number of counters, and
a new tagged union format that's easy to extend.
We now have counters for compression type/ratio, per-snapshot-id
usage, per-btree-id usage, and pending rebalance work.
- Self healing on read IO/checksum error
Data is now automatically rewritten if we get a read error and then a
successful retry
- Mount API conversion (thanks to Thomas Bertschinger)
- Better lockdep coverage
Previously, btree node locks were tracked individually by lockdep,
like any other lock. But we may take _many_ btree node locks
simultaneously, we easily blow through the limit of 48 locks that
lockdep can track, leading to lockdep turning itself off.
Tracking each btree node lock individually isn't really necessary
since we have our own cycle detector for deadlock avoidance and
centralized tracking of btree node locks, so we now have a single
lockdep_map in btree_trans for "any btree nodes are locked".
- Some more small incremental work towards online check_allocations
- Lots more debugging improvements
- Fixes, including:
- undefined behaviour fixes, originally noted as breaking userspace
LTO builds
- fix a spurious warning in fsck_err, reported by Marcin
- fix an integer overflow on trans->nr_updates, also reported by
Marcin; this broke during deletion of highly fragmented indirect
extents
* tag 'bcachefs-2024-07-18.2' of https://evilpiepirate.org/git/bcachefs: (120 commits)
lockdep: Add comments for lockdep_set_no{validate,track}_class()
bcachefs: Fix integer overflow on trans->nr_updates
bcachefs: silence silly kdoc warning
bcachefs: Fix fsck warning about btree_trans not passed to fsck error
bcachefs: Add an error message for insufficient rw journal devs
bcachefs: varint: Avoid left-shift of a negative value
bcachefs: darray: Don't pass NULL to memcpy()
bcachefs: Kill bch2_assert_btree_nodes_not_locked()
bcachefs: Rename BCH_WRITE_DONE -> BCH_WRITE_SUBMITTED
bcachefs: __bch2_read(): call trans_begin() on every loop iter
bcachefs: show none if label is not set
bcachefs: drop packed, aligned from bkey_inode_buf
bcachefs: btree node scan: fall back to comparing by journal seq
bcachefs: Add lockdep support for btree node locks
lockdep: lockdep_set_notrack_class()
bcachefs: Improve copygc_wait_to_text()
bcachefs: Convert clock code to u64s
bcachefs: Improve startup message
bcachefs: Self healing on read IO error
bcachefs: Make read_only a mount option again, but hidden
...
|
|
Cc: Waiman Long <longman@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add a new helper to disable lockdep tracking entirely for a given class.
This is needed for bcachefs, which takes too many btree node locks for
lockdep to track. Instead, we have a single lockdep_map for "btree_trans
has any btree nodes locked", which makes more since given that we have
centralized lock management and a cycle detector.
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add local_lock_nested_bh() locking. It is based on local_lock_t and the
naming follows the preempt_disable_nested() example.
For !PREEMPT_RT + !LOCKDEP it is a per-CPU annotation for locking
assumptions based on local_bh_disable(). The macro is optimized away
during compilation.
For !PREEMPT_RT + LOCKDEP the local_lock_nested_bh() is reduced to
the usual lock-acquire plus lockdep_assert_in_softirq() - ensuring that
BH is disabled.
For PREEMPT_RT local_lock_nested_bh() acquires the specified per-CPU
lock. It does not disable CPU migration because it relies on
local_bh_disable() disabling CPU migration.
With LOCKDEP it performans the usual lockdep checks as with !PREEMPT_RT.
Due to include hell the softirq check has been moved spinlock.c.
The intention is to use this locking in places where locking of a per-CPU
variable relies on BH being disabled. Instead of treating disabled
bottom halves as a big per-CPU lock, PREEMPT_RT can use this to reduce
the locking scope to what actually needs protecting.
A side effect is that it also documents the protection scope of the
per-CPU variables.
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20240620132727.660738-3-bigeasy@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
While the experiment did reveal that there are additional places that are
missing the lock during secondary bus reset, one of the places that needs
to take cfg_access_lock (pci_bus_lock()) is not prepared for lockdep
annotation.
Specifically, pci_bus_lock() takes pci_dev_lock() recursively and is
currently dependent on the fact that the device_lock() is marked
lockdep_set_novalidate_class(&dev->mutex). Otherwise, without that
annotation, pci_bus_lock() would need to use something like a new
pci_dev_lock_nested() helper, a scheme to track a PCI device's depth in the
topology, and a hope that the depth of a PCI tree never exceeds the max
value for a lockdep subclass.
The alternative to ripping out the lockdep coverage would be to deploy a
dynamic lock key for every PCI device. Unfortunately, there is evidence
that increasing the number of keys that lockdep needs to track to be
per-PCI-device is prohibitively expensive for something like the
cfg_access_lock.
The main motivation for adding the annotation in the first place was to
catch unlocked secondary bus resets, not necessarily catch lock ordering
problems between cfg_access_lock and other locks. Solve that narrower
problem with follow-on patches, and just due to targeted revert for now.
Link: https://lore.kernel.org/r/171711746402.1628941.14575335981264103013.stgit@dwillia2-xfh.jf.intel.com
Fixes: 7e89efc6e9e4 ("PCI: Lock upstream bridge for pci_reset_function()")
Reported-by: Imre Deak <imre.deak@intel.com>
Closes: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_134186v1/shard-dg2-1/igt@device_reset@unbind-reset-rebind.html
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Hans de Goede <hdegoede@redhat.com>
Tested-by: Kalle Valo <kvalo@kernel.org>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Cc: Jani Saarinen <jani.saarinen@intel.com>
|
|
Fix a long-standing locking gap for missing pci_cfg_access_lock() while
manipulating bridge reset registers and configuration during
pci_reset_bus_function().
If there is an upstream bridge, lock it before locking the device itself.
pci_dev_lock() calls pci_cfg_access_lock(), which blocks the writing of PCI
config space by user space.
Add lockdep assertion via pci_dev->cfg_access_lock to verify
pci_dev->block_cfg_access is set.
Co-developed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/20240502165851.1948523-3-dave.jiang@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
held_lock is embedded in task_struct, and we don't want sched.h pulling
in all of lockdep.h
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Acked-by: Waiman Long <longman@redhat.com>
|
|
Page pool use in hardirq is prohibited, add debug checks
to catch misuses. IIRC we previously discussed using
DEBUG_NET_WARN_ON_ONCE() for this, but there were concerns
that people will have DEBUG_NET enabled in perf testing.
I don't think anyone enables lockdep in perf testing,
so use lockdep to avoid pushback and arguing :)
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://lore.kernel.org/r/20230804180529.2483231-6-aleksander.lobakin@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
- Introduce cmpxchg128() -- aka. the demise of cmpxchg_double()
The cmpxchg128() family of functions is basically & functionally the
same as cmpxchg_double(), but with a saner interface.
Instead of a 6-parameter horror that forced u128 - u64/u64-halves
layout details on the interface and exposed users to complexity,
fragility & bugs, use a natural 3-parameter interface with u128
types.
- Restructure the generated atomic headers, and add kerneldoc comments
for all of the generic atomic{,64,_long}_t operations.
The generated definitions are much cleaner now, and come with
documentation.
- Implement lock_set_cmp_fn() on lockdep, for defining an ordering when
taking multiple locks of the same type.
This gets rid of one use of lockdep_set_novalidate_class() in the
bcache code.
- Fix raw_cpu_generic_try_cmpxchg() bug due to an unintended variable
shadowing generating garbage code on Clang on certain ARM builds.
* tag 'locking-core-2023-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (43 commits)
locking/atomic: scripts: fix ${atomic}_dec_if_positive() kerneldoc
percpu: Fix self-assignment of __old in raw_cpu_generic_try_cmpxchg()
locking/atomic: treewide: delete arch_atomic_*() kerneldoc
locking/atomic: docs: Add atomic operations to the driver basic API documentation
locking/atomic: scripts: generate kerneldoc comments
docs: scripts: kernel-doc: accept bitwise negation like ~@var
locking/atomic: scripts: simplify raw_atomic*() definitions
locking/atomic: scripts: simplify raw_atomic_long*() definitions
locking/atomic: scripts: split pfx/name/sfx/order
locking/atomic: scripts: restructure fallback ifdeffery
locking/atomic: scripts: build raw_atomic_long*() directly
locking/atomic: treewide: use raw_atomic*_<op>()
locking/atomic: scripts: add trivial raw_atomic*_<op>()
locking/atomic: scripts: factor out order template generation
locking/atomic: scripts: remove leftover "${mult}"
locking/atomic: scripts: remove bogus order parameter
locking/atomic: xtensa: add preprocessor symbols
locking/atomic: x86: add preprocessor symbols
locking/atomic: sparc: add preprocessor symbols
locking/atomic: sh: add preprocessor symbols
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull debugobjects fixes from Thomas Gleixner:
"Two fixes for debugobjects:
- Prevent the allocation path from waking up kswapd.
That's a long standing issue due to the GFP_ATOMIC allocation flag.
As debug objects can be invoked from pretty much any context waking
kswapd can end up in arbitrary lock chains versus the waitqueue
lock
- Correct the explicit lockdep wait-type violation in
debug_object_fill_pool()"
* tag 'core-debugobjects-2023-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
debugobjects: Don't wake up kswapd from fill_pool()
debugobjects,locking: Annotate debug_object_fill_pool() wait type violation
|
|
This implements a new interface to lockdep, lock_set_cmp_fn(), for
defining a custom ordering when taking multiple locks of the same
class.
This is an alternative to subclasses, but can not fully replace them
since subclasses allow lock hierarchies with other clasees
inter-twined, while this relies on pure class nesting.
Specifically, if A is our nesting class then:
A/0 <- B <- A/1
Would be a valid lock order with subclasses (each subclass really is a
full class from the validation PoV) but not with this annotation,
which requires all nesting to be consecutive.
Example output:
| ============================================
| WARNING: possible recursive locking detected
| 6.2.0-rc8-00003-g7d81e591ca6a-dirty #15 Not tainted
| --------------------------------------------
| kworker/14:3/938 is trying to acquire lock:
| ffff8880143218c8 (&b->lock l=0 0:2803368){++++}-{3:3}, at: bch_btree_node_get.part.0+0x81/0x2b0
|
| but task is already holding lock:
| ffff8880143de8c8 (&b->lock l=1 1048575:9223372036854775807){++++}-{3:3}, at: __bch_btree_map_nodes+0xea/0x1e0
| and the lock comparison function returns 1:
|
| other info that might help us debug this:
| Possible unsafe locking scenario:
|
| CPU0
| ----
| lock(&b->lock l=1 1048575:9223372036854775807);
| lock(&b->lock l=0 0:2803368);
|
| *** DEADLOCK ***
|
| May be due to missing lock nesting notation
|
| 3 locks held by kworker/14:3/938:
| #0: ffff888005ea9d38 ((wq_completion)bcache){+.+.}-{0:0}, at: process_one_work+0x1ec/0x530
| #1: ffff8880098c3e70 ((work_completion)(&cl->work)#3){+.+.}-{0:0}, at: process_one_work+0x1ec/0x530
| #2: ffff8880143de8c8 (&b->lock l=1 1048575:9223372036854775807){++++}-{3:3}, at: __bch_btree_map_nodes+0xea/0x1e0
[peterz: extended changelog]
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20230509195847.1745548-1-kent.overstreet@linux.dev
|
|
There is an explicit wait-type violation in debug_object_fill_pool()
for PREEMPT_RT=n kernels which allows them to more easily fill the
object pool and reduce the chance of allocation failures.
Lockdep's wait-type checks are designed to check the PREEMPT_RT
locking rules even for PREEMPT_RT=n kernels and object to this, so
create a lockdep annotation to allow this to stand.
Specifically, create a 'lock' type that overrides the inner wait-type
while it is held -- allowing one to temporarily raise it, such that
the violation is hidden.
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Qi Zheng <zhengqi.arch@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Qi Zheng <zhengqi.arch@bytedance.com>
Link: https://lkml.kernel.org/r/20230429100614.GA1489784@hirez.programming.kicks-ass.net
|
|
Lock scenario print is always a weak spot of lockdep splats. Improvement
can be made if we rework the dependency search and the error printing.
However without touching the graph search, we can improve a little for
the circular deadlock case, since we have the to-be-added lock
dependency, and know whether these two locks are read/write/sync.
In order to know whether a held_lock is sync or not, a bit was
"stolen" from ->references, which reduce our limit for the same lock
class nesting from 2^12 to 2^11, and it should still be good enough.
Besides, since we now have bit in held_lock for sync, we don't need the
"hardirqoffs being 1" trick, and also we can avoid the __lock_release()
if we jump out of __lock_acquire() before the held_lock stored.
With these changes, a deadlock case evolved with read lock and sync gets
a better print-out from:
[...] Possible unsafe locking scenario:
[...]
[...] CPU0 CPU1
[...] ---- ----
[...] lock(srcuA);
[...] lock(srcuB);
[...] lock(srcuA);
[...] lock(srcuB);
to
[...] Possible unsafe locking scenario:
[...]
[...] CPU0 CPU1
[...] ---- ----
[...] rlock(srcuA);
[...] lock(srcuB);
[...] lock(srcuA);
[...] sync(srcuB);
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
|
|
Currently, functions like synchronize_srcu() do not have lockdep
annotations resembling those of other write-side locking primitives.
Such annotations might look as follows:
lock_acquire();
lock_release();
Such annotations would tell lockdep that synchronize_srcu() acts like
an empty critical section that waits for other (read-side) critical
sections to finish. This would definitely catch some deadlock, but
as pointed out by Paul Mckenney [1], this could also introduce false
positives because of irq-safe/unsafe detection. Of course, there are
tricks could help with this:
might_sleep(); // Existing statement in __synchronize_srcu().
if (IS_ENABLED(CONFIG_PROVE_LOCKING)) {
local_irq_disable();
lock_acquire();
lock_release();
local_irq_enable();
}
But it would be better for lockdep to provide a separate annonation for
functions like synchronize_srcu(), so that people won't need to repeat
the ugly tricks above.
Therefore introduce lock_sync(), which is simply an lock+unlock
pair with no irq safe/unsafe deadlock check. This works because the
to-be-annontated functions do not create real critical sections, and
there is therefore no way that irq can create extra dependencies.
[1]: https://lore.kernel.org/lkml/20180412021233.ewncg5jjuzjw3x62@tardis/
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
[ boqun: Fix typos reported by Davidlohr Bueso and Paul E. Mckenney ]
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
|
|
The cross-release bits have been removed, lockdep_init_map_crosslock() is
a leftover.
Remove lockdep_init_map_crosslock.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Waiman Long <longman@redhat.com>
Acked-by: Waiman Long <longman@redhat.com>
Link: https://lore.kernel.org/r/20220311164457.46461-1-bigeasy@linutronix.de
Link: https://lore.kernel.org/r/YqITgY+2aPITu96z@linutronix.de
|
|
Commit dfd5e3f5fe27 ("locking/lockdep: Mark local_lock_t") added yet
another lockdep_init_map_*() variant, but forgot to update all the
existing users of the most complicated version.
This could lead to a loss of lock_type and hence an incorrect report.
Given the relative rarity of both local_lock and these annotations,
this is unlikely to happen in practise, still, best fix things.
Fixes: dfd5e3f5fe27 ("locking/lockdep: Mark local_lock_t")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/YqyEDtoan20K0CVD@worktop.programming.kicks-ass.net
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull cxl updates from Dan Williams:
"Compute Express Link (CXL) updates for this cycle.
The highlight is new driver-core infrastructure and CXL subsystem
changes for allowing lockdep to validate device_lock() usage. Thanks
to PeterZ for setting me straight on the current capabilities of the
lockdep API, and Greg acked it as well.
On the CXL ACPI side this update adds support for CXL _OSC so that
platform firmware knows that it is safe to still grant Linux native
control of PCIe hotplug and error handling in the presence of CXL
devices. A circular dependency problem was discovered between suspend
and CXL memory for cases where the suspend image might be stored in
CXL memory where that image also contains the PCI register state to
restore to re-enable the device. Disable suspend for now until an
architecture is defined to clarify that conflict.
Lastly a collection of reworks, fixes, and cleanups to the CXL
subsystem where support for snooping mailbox commands and properly
handling the "mem_enable" flow are the highlights.
Summary:
- Add driver-core infrastructure for lockdep validation of
device_lock(), and fixup a deadlock report that was previously
hidden behind the 'lockdep no validate' policy.
- Add CXL _OSC support for claiming native control of CXL hotplug and
error handling.
- Disable suspend in the presence of CXL memory unless and until a
protocol is identified for restoring PCI device context from memory
hosted on CXL PCI devices.
- Add support for snooping CXL mailbox commands to protect against
inopportune changes, like set-partition with the 'immediate' flag
set.
- Rework how the driver detects legacy CXL 1.1 configurations (CXL
DVSEC / 'mem_enable') before enabling new CXL 2.0 decode
configurations (CXL HDM Capability).
- Miscellaneous cleanups and fixes from -next exposure"
* tag 'cxl-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (47 commits)
cxl/port: Enable HDM Capability after validating DVSEC Ranges
cxl/port: Reuse 'struct cxl_hdm' context for hdm init
cxl/port: Move endpoint HDM Decoder Capability init to port driver
cxl/pci: Drop @info argument to cxl_hdm_decode_init()
cxl/mem: Merge cxl_dvsec_ranges() and cxl_hdm_decode_init()
cxl/mem: Skip range enumeration if mem_enable clear
cxl/mem: Consolidate CXL DVSEC Range enumeration in the core
cxl/pci: Move cxl_await_media_ready() to the core
cxl/mem: Validate port connectivity before dvsec ranges
cxl/mem: Fix cxl_mem_probe() error exit
cxl/pci: Drop wait_for_valid() from cxl_await_media_ready()
cxl/pci: Consolidate wait_for_media() and wait_for_media_ready()
cxl/mem: Drop mem_enabled check from wait_for_media()
nvdimm: Fix firmware activation deadlock scenarios
device-core: Kill the lockdep_mutex
nvdimm: Drop nd_device_lock()
ACPI: NFIT: Drop nfit_device_lock()
nvdimm: Replace lockdep_mutex with local lock classes
cxl: Drop cxl_device_lock()
cxl/acpi: Add root device lockdep validation
...
|
|
The CXL "root" device, ACPI0017, is an attach point for coordinating
platform level CXL resources and is the parent device for a CXL port
topology tree. As such it has distinct locking rules relative to other
CXL subsystem objects, but because it is an ACPI device the lock class
is established well before it is given to the cxl_acpi driver.
However, the lockdep API does support changing the lock class "live" for
situations like this. Add a device_lock_set_class() helper that a driver
can use in ->probe() to set a custom lock class, and
device_lock_reset_class() to return to the default "no validate" class
before the custom lock class key goes out of scope after ->remove().
Note the helpers are all macros to support dead code elimination in the
CONFIG_PROVE_LOCKING=n case, however device_set_lock_class() still needs
#ifdef CONFIG_PROVE_LOCKING since lockdep_match_class() explicitly does
not have a helper in the CONFIG_PROVE_LOCKING=n case (see comment in
lockdep.h). The lockdep API needs 2 small tweaks to prevent "unused"
warnings for the @key argument to lock_set_class(), and a new
lock_set_novalidate_class() is added to supplement
lockdep_set_novalidate_class() in the cases where the lock class is
converted while the lock is held.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ben Widawsky <ben.widawsky@intel.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/165100081305.1528964.11138612430659737238.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
kernel/sysctl.c is a kitchen sink where everyone leaves their dirty
dishes, this makes it very difficult to maintain.
To help with this maintenance let's start by moving sysctls to places
where they actually belong. The proc sysctl maintainers do not want to
know what sysctl knobs you wish to add for your own piece of code, we
just care about the core logic.
All filesystem syctls now get reviewed by fs folks. This commit
follows the commit of fs, move the prove_locking and lock_stat sysctls
to its own file, kernel/lockdep.c.
Signed-off-by: tangmeng <tangmeng@uniontech.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
parisc, ia64 and powerpc32 are the only remaining architectures that
provide custom arch_{spin,read,write}_lock_flags() functions, which are
meant to re-enable interrupts while waiting for a spinlock.
However, none of these can actually run into this codepath, because
it is only called on architectures without CONFIG_GENERIC_LOCKBREAK,
or when CONFIG_DEBUG_LOCK_ALLOC is set without CONFIG_LOCKDEP, and none
of those combinations are possible on the three architectures.
Going back in the git history, it appears that arch/mn10300 may have
been able to run into this code path, but there is a good chance that
it never worked. On the architectures that still exist, it was
already impossible to hit back in 2008 after the introduction of
CONFIG_GENERIC_LOCKBREAK, and possibly earlier.
As this is all dead code, just remove it and the helper functions built
around it. For arch/ia64, the inline asm could be cleaned up, but
it seems safer to leave it untouched.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Helge Deller <deller@gmx.de> # parisc
Link: https://lore.kernel.org/r/20211022120058.1031690-1-arnd@kernel.org
|
|
Extract lockdep_assert{,_once}() helpers to more easily write composite
assertions like, for example:
lockdep_assert(lockdep_is_held(&drm_device.master_mutex) ||
lockdep_is_held(&drm_file.master_lookup_lock));
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Acked-by: Boqun Feng <boqun.feng@gmail.com>
Acked-by: Waiman Long <longman@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210802105957.77692-2-desmondcheongzx@gmail.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
- rtmutex cleanup & spring cleaning pass that removes ~400 lines of
code
- Futex simplifications & cleanups
- Add debugging to the CSD code, to help track down a tenacious race
(or hw problem)
- Add lockdep_assert_not_held(), to allow code to require a lock to not
be held, and propagate this into the ath10k driver
- Misc LKMM documentation updates
- Misc KCSAN updates: cleanups & documentation updates
- Misc fixes and cleanups
- Fix locktorture bugs with ww_mutexes
* tag 'locking-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
kcsan: Fix printk format string
static_call: Relax static_call_update() function argument type
static_call: Fix unused variable warn w/o MODULE
locking/rtmutex: Clean up signal handling in __rt_mutex_slowlock()
locking/rtmutex: Restrict the trylock WARN_ON() to debug
locking/rtmutex: Fix misleading comment in rt_mutex_postunlock()
locking/rtmutex: Consolidate the fast/slowpath invocation
locking/rtmutex: Make text section and inlining consistent
locking/rtmutex: Move debug functions as inlines into common header
locking/rtmutex: Decrapify __rt_mutex_init()
locking/rtmutex: Remove pointless CONFIG_RT_MUTEXES=n stubs
locking/rtmutex: Inline chainwalk depth check
locking/rtmutex: Move rt_mutex_debug_task_free() to rtmutex.c
locking/rtmutex: Remove empty and unused debug stubs
locking/rtmutex: Consolidate rt_mutex_init()
locking/rtmutex: Remove output from deadlock detector
locking/rtmutex: Remove rtmutex deadlock tester leftovers
locking/rtmutex: Remove rt_mutex_timed_lock()
MAINTAINERS: Add myself as futex reviewer
locking/mutex: Remove repeated declaration
...
|
|
Fix ~16 single-word typos in locking code comments.
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for 5.13:
UAPI Changes:
Cross-subsystem Changes:
Core Changes:
- %p4cc printk format modifier
- atomic: introduce drm_crtc_commit_wait, rework atomic plane state
helpers to take the drm_commit_state structure
- dma-buf: heaps rework to return a struct dma_buf
- simple-kms: Add plate state helpers
- ttm: debugfs support, removal of sysfs
Driver Changes:
- Convert drivers to shadow plane helpers
- arc: Move to drm/tiny
- ast: cursor plane reworks
- gma500: Remove TTM and medfield support
- mxsfb: imx8mm support
- panfrost: MMU IRQ handling rework
- qxl: rework to better handle resources deallocation, locking
- sun4i: Add alpha properties for UI and VI layers
- vc4: RPi4 CEC support
- vmwgfx: doc cleanup
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210303100600.dgnkadonzuvfnu22@gilmour
|
|
Adds defines for lock state returns from lock_is_held_type() based on
Johannes Berg's suggestions as it make it easier to read and maintain
the lock states. These are defines and a enum to avoid changes to
lock_is_held_type() and lockdep_is_held() return types.
Updates to lock_is_held_type() and __lock_is_held() to use the new
defines.
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/linux-wireless/871rdmu9z9.fsf@codeaurora.org/
|
|
Some kernel functions must be called without holding a specific lock.
Add lockdep_assert_not_held() to be used in these functions to detect
incorrect calls while holding a lock.
lockdep_assert_not_held() provides the opposite functionality of
lockdep_assert_held() which is used to assert calls that require
holding a specific lock.
Incorporates suggestions from Peter Zijlstra to avoid misfires when
lockdep_off() is employed.
The need for lockdep_assert_not_held() came up in a discussion on
ath10k patch. ath10k_drain_tx() and i915_vma_pin_ww() are examples
of functions that can use lockdep_assert_not_held().
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/linux-wireless/871rdmu9z9.fsf@codeaurora.org/
|
|
DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT can't be used when we hold locks
since we are basically waiting for userspace to do something.
Holding a lock while doing so can trivial deadlock with page faults
etc...
So make lockdep complain when a driver tries to do this.
v2: Add lockdep_assert_none_held() macro.
v3: Add might_sleep() and also use lockdep_assert_none_held() in the
IOCTL path.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patchwork.freedesktop.org/patch/414944/
|
|
The local_lock_t's are special, because they cannot form IRQ
inversions, make sure we can tell them apart from the rest of the
locks.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- support "prefer busy polling" NAPI operation mode, where we defer
softirq for some time expecting applications to periodically busy
poll
- AF_XDP: improve efficiency by more batching and hindering the
adjacency cache prefetcher
- af_packet: make packet_fanout.arr size configurable up to 64K
- tcp: optimize TCP zero copy receive in presence of partial or
unaligned reads making zero copy a performance win for much smaller
messages
- XDP: add bulk APIs for returning / freeing frames
- sched: support fragmenting IP packets as they come out of conntrack
- net: allow virtual netdevs to forward UDP L4 and fraglist GSO skbs
BPF:
- BPF switch from crude rlimit-based to memcg-based memory accounting
- BPF type format information for kernel modules and related tracing
enhancements
- BPF implement task local storage for BPF LSM
- allow the FENTRY/FEXIT/RAW_TP tracing programs to use
bpf_sk_storage
Protocols:
- mptcp: improve multiple xmit streams support, memory accounting and
many smaller improvements
- TLS: support CHACHA20-POLY1305 cipher
- seg6: add support for SRv6 End.DT4/DT6 behavior
- sctp: Implement RFC 6951: UDP Encapsulation of SCTP
- ppp_generic: add ability to bridge channels directly
- bridge: Connectivity Fault Management (CFM) support as is defined
in IEEE 802.1Q section 12.14.
Drivers:
- mlx5: make use of the new auxiliary bus to organize the driver
internals
- mlx5: more accurate port TX timestamping support
- mlxsw:
- improve the efficiency of offloaded next hop updates by using
the new nexthop object API
- support blackhole nexthops
- support IEEE 802.1ad (Q-in-Q) bridging
- rtw88: major bluetooth co-existance improvements
- iwlwifi: support new 6 GHz frequency band
- ath11k: Fast Initial Link Setup (FILS)
- mt7915: dual band concurrent (DBDC) support
- net: ipa: add basic support for IPA v4.5
Refactor:
- a few pieces of in_interrupt() cleanup work from Sebastian Andrzej
Siewior
- phy: add support for shared interrupts; get rid of multiple driver
APIs and have the drivers write a full IRQ handler, slight growth
of driver code should be compensated by the simpler API which also
allows shared IRQs
- add common code for handling netdev per-cpu counters
- move TX packet re-allocation from Ethernet switch tag drivers to a
central place
- improve efficiency and rename nla_strlcpy
- number of W=1 warning cleanups as we now catch those in a patchwork
build bot
Old code removal:
- wan: delete the DLCI / SDLA drivers
- wimax: move to staging
- wifi: remove old WDS wifi bridging support"
* tag 'net-next-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1922 commits)
net: hns3: fix expression that is currently always true
net: fix proc_fs init handling in af_packet and tls
nfc: pn533: convert comma to semicolon
af_vsock: Assign the vsock transport considering the vsock address flags
af_vsock: Set VMADDR_FLAG_TO_HOST flag on the receive path
vsock_addr: Check for supported flag values
vm_sockets: Add VMADDR_FLAG_TO_HOST vsock flag
vm_sockets: Add flags field in the vsock address data structure
net: Disable NETIF_F_HW_TLS_TX when HW_CSUM is disabled
tcp: Add logic to check for SYN w/ data in tcp_simple_retransmit
net: mscc: ocelot: install MAC addresses in .ndo_set_rx_mode from process context
nfc: s3fwrn5: Release the nfc firmware
net: vxget: clean up sparse warnings
mlxsw: spectrum_router: Use eXtended mezzanine to offload IPv4 router
mlxsw: spectrum: Set KVH XLT cache mode for Spectrum2/3
mlxsw: spectrum_router_xm: Introduce basic XM cache flushing
mlxsw: reg: Add Router LPM Cache Enable Register
mlxsw: reg: Add Router LPM Cache ML Delete Register
mlxsw: spectrum_router_xm: Implement L-value tracking for M-index
mlxsw: reg: Add XM Router M Table Register
...
|
|
The current semantic for napi_consume_skb() is that caller need
to provide non-zero budget when calling from NAPI context, and
breaking this semantic will cause hard to debug problem, because
_kfree_skb_defer() need to run in atomic context in order to push
the skb to the particular cpu' napi_alloc_cache atomically.
So add the lockdep_assert_in_softirq() to assert when the running
context is not in_softirq, in_softirq means softirq is serving or
BH is disabled, which has a ambiguous semantics due to the BH
disabled confusion, so add a comment to emphasize that.
And the softirq context can be interrupted by hard IRQ or NMI
context, lockdep_assert_in_softirq() need to assert about hard
IRQ or NMI context too.
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When CONFIG_LOCKDEP is not set, lock_is_held() and lockdep_is_held()
are not declared or defined. This forces all callers to use #ifdefs
around these checks.
Recent RCU changes added a lot of lockdep_is_held() calls inside
rcu_dereference_protected(). This macro hides its argument on !LOCKDEP
builds, which can lead to false-positive unused-variable warnings.
This commit therefore provides forward declarations of lock_is_held()
and lockdep_is_held() but without defining them. This way callers
(including those internal to RCU) can keep them visible to the compiler
on !LOCKDEP builds and instead depend on dead code elimination to remove
the references, which in turn prevents the linker from complaining about
the lack of the corresponding function definitions.
[ paulmck: Apply Peter Zijlstra feedback on "extern". ]
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
--
CC: peterz@infradead.org
CC: mingo@redhat.com
CC: will@kernel.org
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The thinking in commit:
fddf9055a60d ("lockdep: Use raw_cpu_*() for per-cpu variables")
is flawed. While it is true that when we're migratable both CPUs will
have a 0 value, it doesn't hold that when we do get migrated in the
middle of a raw_cpu_op(), the old CPU will still have 0 by the time we
get around to reading it on the new CPU.
Luckily, the reason for that commit (s390 using preempt_disable()
instead of preempt_disable_notrace() in their percpu code), has since
been fixed by commit:
1196f12a2c96 ("s390: don't trace preemption in percpu macros")
An audit of arch/*/include/asm/percpu*.h shows there are no other
architectures affected by this particular issue.
Fixes: fddf9055a60d ("lockdep: Use raw_cpu_*() for per-cpu variables")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20201005095958.GJ2651@hirez.programming.kicks-ass.net
|
|
Steve reported that lockdep_assert*irq*(), when nested inside lockdep
itself, will trigger a false-positive.
One example is the stack-trace code, as called from inside lockdep,
triggering tracing, which in turn calls RCU, which then uses
lockdep_assert_irqs_disabled().
Fixes: a21ee6055c30 ("lockdep: Change hardirq{s_enabled,_context} to per-cpu variables")
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Now we have four types of dependencies in the dependency graph, and not
all the pathes carry real dependencies (the dependencies that may cause
a deadlock), for example:
Given lock A and B, if we have:
CPU1 CPU2
============= ==============
write_lock(A); read_lock(B);
read_lock(B); write_lock(A);
(assuming read_lock(B) is a recursive reader)
then we have dependencies A -(ER)-> B, and B -(SN)-> A, and a
dependency path A -(ER)-> B -(SN)-> A.
In lockdep w/o recursive locks, a dependency path from A to A
means a deadlock. However, the above case is obviously not a
deadlock, because no one holds B exclusively, therefore no one
waits for the other to release B, so who get A first in CPU1 and
CPU2 will run non-blockingly.
As a result, dependency path A -(ER)-> B -(SN)-> A is not a
real/strong dependency that could cause a deadlock.
From the observation above, we know that for a dependency path to be
real/strong, no two adjacent dependencies can be as -(*R)-> -(S*)->.
Now our mission is to make __bfs() traverse only the strong dependency
paths, which is simple: we record whether we only have -(*R)-> for the
previous lock_list of the path in lock_list::only_xr, and when we pick a
dependency in the traverse, we 1) filter out -(S*)-> dependency if the
previous lock_list only has -(*R)-> dependency (i.e. ->only_xr is true)
and 2) set the next lock_list::only_xr to true if we only have -(*R)->
left after we filter out dependencies based on 1), otherwise, set it to
false.
With this extension for __bfs(), we now need to initialize the root of
__bfs() properly (with a correct ->only_xr), to do so, we introduce some
helper functions, which also cleans up a little bit for the __bfs() root
initialization code.
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200807074238.1632519-8-boqun.feng@gmail.com
|
|
To add recursive read locks into the dependency graph, we need to store
the types of dependencies for the BFS later. There are four types of
dependencies:
* Exclusive -> Non-recursive dependencies: EN
e.g. write_lock(prev) held and try to acquire write_lock(next)
or non-recursive read_lock(next), which can be represented as
"prev -(EN)-> next"
* Shared -> Non-recursive dependencies: SN
e.g. read_lock(prev) held and try to acquire write_lock(next) or
non-recursive read_lock(next), which can be represented as
"prev -(SN)-> next"
* Exclusive -> Recursive dependencies: ER
e.g. write_lock(prev) held and try to acquire recursive
read_lock(next), which can be represented as "prev -(ER)-> next"
* Shared -> Recursive dependencies: SR
e.g. read_lock(prev) held and try to acquire recursive
read_lock(next), which can be represented as "prev -(SR)-> next"
So we use 4 bits for the presence of each type in lock_list::dep. Helper
functions and macros are also introduced to convert a pair of locks into
lock_list::dep bit and maintain the addition of different types of
dependencies.
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200807074238.1632519-7-boqun.feng@gmail.com
|
|
lock_list::distance is always not greater than MAX_LOCK_DEPTH (which
is 48 right now), so a u16 will fit. This patch reduces the size of
lock_list::distance to save space, so that we can introduce other fields
to help detect recursive read lock deadlocks without increasing the size
of lock_list structure.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200807074238.1632519-6-boqun.feng@gmail.com
|
|
On the archs using QUEUED_RWLOCKS, read_lock() is not always a recursive
read lock, actually it's only recursive if in_interrupt() is true. So
change the annotation accordingly to catch more deadlocks.
Note we used to treat read_lock() as pure recursive read locks in
lib/locking-seftest.c, and this is useful, especially for the lockdep
development selftest, so we keep this via a variable to force switching
lock annotation for read_lock().
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200807074238.1632519-2-boqun.feng@gmail.com
|
|
Sven reported that commit a21ee6055c30 ("lockdep: Change
hardirq{s_enabled,_context} to per-cpu variables") caused trouble on
s390 because their this_cpu_*() primitives disable preemption which
then lands back tracing.
On the one hand, per-cpu ops should use preempt_*able_notrace() and
raw_local_irq_*(), on the other hand, we can trivialy use raw_cpu_*()
ops for this.
Fixes: a21ee6055c30 ("lockdep: Change hardirq{s_enabled,_context} to per-cpu variables")
Reported-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Marco Elver <elver@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200821085348.192346882@infradead.org
|
|
By using lockdep_assert_*() from seqlock.h, the spaghetti monster
attacked.
Attack back by reducing seqlock.h dependencies from two key high level headers:
- <linux/seqlock.h>: -Remove <linux/ww_mutex.h>
- <linux/time.h>: -Remove <linux/seqlock.h>
- <linux/sched.h>: +Add <linux/seqlock.h>
The price was to add it to sched.h ...
Core header fallout, we add direct header dependencies instead of gaining them
parasitically from higher level headers:
- <linux/dynamic_queue_limits.h>: +Add <asm/bug.h>
- <linux/hrtimer.h>: +Add <linux/seqlock.h>
- <linux/ktime.h>: +Add <asm/bug.h>
- <linux/lockdep.h>: +Add <linux/smp.h>
- <linux/sched.h>: +Add <linux/seqlock.h>
- <linux/videodev2.h>: +Add <linux/kernel.h>
Arch headers fallout:
- PARISC: <asm/timex.h>: +Add <asm/special_insns.h>
- SH: <asm/io.h>: +Add <asm/page.h>
- SPARC: <asm/timer_64.h>: +Add <uapi/asm/asi.h>
- SPARC: <asm/vvar.h>: +Add <asm/processor.h>, <asm/barrier.h>
-Remove <linux/seqlock.h>
- X86: <asm/fixmap.h>: +Add <asm/pgtable_types.h>
-Remove <asm/acpi.h>
There's also a bunch of parasitic header dependency fallout in .c files, not listed
separately.
[ mingo: Extended the changelog, split up & fixed the original patch. ]
Co-developed-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200804133438.GK2674@hirez.programming.kicks-ass.net
|
|
Asserting that preemption is enabled or disabled is a critical sanity
check. Developers are usually reluctant to add such a check in a
fastpath as reading the preemption count can be costly.
Extend the lockdep API with macros asserting that preemption is disabled
or enabled. If lockdep is disabled, or if the underlying architecture
does not support kernel preemption, this assert has no runtime overhead.
References: f54bb2ec02c8 ("locking/lockdep: Add IRQs disabled/enabled assertion APIs: ...")
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200720155530.1173732-8-a.darwish@linutronix.de
|
|
Currently lockdep_types.h includes list.h without actually using any
of its macros or functions. All it needs are the type definitions
which were moved into types.h long ago. This potentially causes
inclusion loops because both are included by many core header
files.
This patch moves the list.h inclusion into lockdep.h. Note that
we could probably remove it completely but that could potentially
result in compile failures should any end users not include list.h
directly and also be unlucky enough to not get list.h via some other
header file.
Reported-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Petr Mladek <pmladek@suse.com>
Link: https://lkml.kernel.org/r/20200716063649.GA23065@gondor.apana.org.au
|
|
Now that the macros use per-cpu data, we no longer need the argument.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200623083721.571835311@infradead.org
|
|
Currently all IRQ-tracking state is in task_struct, this means that
task_struct needs to be defined before we use it.
Especially for lockdep_assert_irq*() this can lead to header-hell.
Move the hardirq state into per-cpu variables to avoid the task_struct
dependency.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200623083721.512673481@infradead.org
|
|
There is a header file inclusion loop between asm-generic/bug.h
and linux/kernel.h. This causes potential compile failurs depending
on the which file is included first. One way of breaking this loop
is to stop spinlock_types.h from including lockdep.h. This patch
splits lockdep.h into two files for this purpose.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Acked-by: Petr Mladek <pmladek@suse.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lkml.kernel.org/r/E1jlSJz-0003hE-8g@fornost.hmeau.com
|
|
These functions are called {early,late} in nmi_{enter,exit} and should
not be traced or probed. They are also puny, so 'inline' them.
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Link: https://lkml.kernel.org/r/20200505134101.048523500@linutronix.de
|
|
The warning was intended to spot complete_all() users from hardirq
context on PREEMPT_RT. The warning as-is will also trigger in interrupt
handlers, which are threaded on PREEMPT_RT, which was not intended.
Use lockdep_assert_RT_in_threaded_ctx() which triggers in non-preemptive
context on PREEMPT_RT.
Fixes: a5c6234e1028 ("completion: Use simple wait queues")
Reported-by: kernel test robot <rong.a.chen@intel.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200323152019.4qjwluldohuh3by5@linutronix.de
|
|
Extend lockdep to validate lock wait-type context.
The current wait-types are:
LD_WAIT_FREE, /* wait free, rcu etc.. */
LD_WAIT_SPIN, /* spin loops, raw_spinlock_t etc.. */
LD_WAIT_CONFIG, /* CONFIG_PREEMPT_LOCK, spinlock_t etc.. */
LD_WAIT_SLEEP, /* sleeping locks, mutex_t etc.. */
Where lockdep validates that the current lock (the one being acquired)
fits in the current wait-context (as generated by the held stack).
This ensures that there is no attempt to acquire mutexes while holding
spinlocks, to acquire spinlocks while holding raw_spinlocks and so on. In
other words, its a more fancy might_sleep().
Obviously RCU made the entire ordeal more complex than a simple single
value test because RCU can be acquired in (pretty much) any context and
while it presents a context to nested locks it is not the same as it
got acquired in.
Therefore its necessary to split the wait_type into two values, one
representing the acquire (outer) and one representing the nested context
(inner). For most 'normal' locks these two are the same.
[ To make static initialization easier we have the rule that:
.outer == INV means .outer == .inner; because INV == 0. ]
It further means that its required to find the minimal .inner of the held
stack to compare against the outer of the new lock; because while 'normal'
RCU presents a CONFIG type to nested locks, if it is taken while already
holding a SPIN type it obviously doesn't relax the rules.
Below is an example output generated by the trivial test code:
raw_spin_lock(&foo);
spin_lock(&bar);
spin_unlock(&bar);
raw_spin_unlock(&foo);
[ BUG: Invalid wait context ]
-----------------------------
swapper/0/1 is trying to lock:
ffffc90000013f20 (&bar){....}-{3:3}, at: kernel_init+0xdb/0x187
other info that might help us debug this:
1 lock held by swapper/0/1:
#0: ffffc90000013ee0 (&foo){+.+.}-{2:2}, at: kernel_init+0xd1/0x187
The way to read it is to look at the new -{n,m} part in the lock
description; -{3:3} for the attempted lock, and try and match that up to
the held locks, which in this case is the one: -{2,2}.
This tells that the acquiring lock requires a more relaxed environment than
presented by the lock stack.
Currently only the normal locks and RCU are converted, the rest of the
lockdep users defaults to .inner = INV which is ignored. More conversions
can be done when desired.
The check for spinlock_t nesting is not enabled by default. It's a separate
config option for now as there are known problems which are currently
addressed. The config option allows to identify these problems and to
verify that the solutions found are indeed solving them.
The config switch will be removed and the checks will permanently enabled
once the vast majority of issues has been addressed.
[ bigeasy: Move LD_WAIT_FREE,… out of CONFIG_LOCKDEP to avoid compile
failure with CONFIG_DEBUG_SPINLOCK + !CONFIG_LOCKDEP]
[ tglx: Add the config option ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200321113242.427089655@linutronix.de
|
|
Sync up with v5.5-rc1 to get the updated lock_release() API among other
things. Fix the conflict reported by Stephen Rothwell [1].
[1] http://lore.kernel.org/r/20191210093957.5120f717@canb.auug.org.au
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
Necessary to annotate functions where we might acquire a
mutex_lock_nested() or similar. Needed by i915.
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191104173720.2696-2-daniel.vetter@ffwll.ch
|