Pull ceph updates from Ilya Dryomov:
"This adds support for manual client session reset in CephFS, allowing
operators to get out of tricky livelock situations involving caps and
file locks without evicting the problematic client instance on the MDS
side or rebooting the client node, both of which can be disruptive"
* tag 'ceph-for-7.2-rc1' of https://github.com/ceph/ceph-client:
ceph: add manual reset debugfs control and tracepoints
ceph: add client reset state machine and session teardown
ceph: add diagnostic timeout loop to wait_caps_flush()
ceph: harden send_mds_reconnect and handle active-MDS peer reset
ceph: use proper endian conversion for flock_len in reconnect
ceph: convert inode flags to named bit positions and atomic bitops
rbd: switch to dynamic root device
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"The changes primarily focus on filesystem error reporting, reducing
memory footprint by reverting in-memory data structures used for
runtime validation, honoring FDP hints, and adding trace and debug
logs. In addition, there are critical bug fixes resolving
out-of-bounds read vulnerabilities in inline directory and ACL
handling, potential deadlocks in balance_fs, use-after-free issues in
atomic writes, and false data/node type assignments in large sections.
Enhancements:
- Revert in-memory sit version and block bitmaps
- support to report fserror
- add trace_f2fs_fault_report
- add iostat latency tracking for direct IO
- add logs in f2fs_disable_checkpoint()
- honor per-I/O write streams for direct writes
- map data writes to FDP streams
- skip inode folio lookup for cached overwrite
- skip direct I/O iostat context when disabled
- revert "check in-memory block bitmap"
- revert "check in-memory sit version bitmap"
Fixes:
- optimize representative type determination in GC
- fix incorrect FI_NO_EXTENT handling in __destroy_extent_node()
- fix potential deadlock in f2fs_balance_fs()
- fix potential deadlock in gc_merge path of f2fs_balance_fs()
- atomic: fix UAF issue on f2fs_inode_info.atomic_inode
- fix missing read bio submission on large folio error
- pass correct iostat type for single node writes
- fix to do sanity check on f2fs_get_node_folio_ra()
- validate orphan inode entry count
- keep atomic write retry from zeroing original data
- read COW data with the original inode during atomic write
- validate inline dentry name lengths before conversion
- validate dentry name length before lookup compares it
- reject setattr size changes on large folio files
- revert "remove non-uptodate folio from the page cache in move_data_block"
- validate ACL entry sizes in f2fs_acl_from_disk()
- bound i_inline_xattr_size for non-inline-xattr inodes
- fix listxattr handling of corrupted xattr entries
- fix to round down start offset of fallocate for pin file"
* tag 'f2fs-for-7.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (42 commits)
f2fs: fix to round down start offset of fallocate for pin file
f2fs: fix listxattr handling of corrupted xattr entries
f2fs: skip direct I/O iostat context when disabled
f2fs: remove unneeded f2fs_is_compressed_page()
f2fs: avoid unnecessary fscrypt_finalize_bounce_page()
f2fs: avoid unnecessary sanity check on ckpt_valid_blocks
f2fs: misc cleanup in f2fs_record_stop_reason()
f2fs: fix wrong description in printed log
f2fs: bound i_inline_xattr_size for non-inline-xattr inodes
f2fs: validate ACL entry sizes in f2fs_acl_from_disk()
Revert "f2fs: remove non-uptodate folio from the page cache in move_data_block"
f2fs: Split f2fs_write_end_io()
f2fs: Rename f2fs_post_read_wq into f2fs_wq
f2fs: Prepare for supporting delayed bio completion
f2fs: reject setattr size changes on large folio files
f2fs: validate dentry name length before lookup compares it
f2fs: validate inline dentry name lengths before conversion
f2fs: read COW data with the original inode during atomic write
f2fs: skip inode folio lookup for cached overwrite
f2fs: keep atomic write retry from zeroing original data
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull more MM updates from Andrew Morton:
- "khugepaged: add mTHP collapse support" (Nico Pache)
Provide khugepaged with the capability to collapse anonymous memory
regions to mTHPs
- "Remove CONFIG_READ_ONLY_THP_FOR_FS and enable file THP for writable
files" (Zi Yan)
Remove the READ_ONLY_THP_FOR_FS check in file_thp_enabled(), so that
khugepaged and MADV_COLLAPSE can run on filesystems with PMD THP
pagecache support even without READ_ONLY_THP_FOR_FS enabled
- "make MM selftests more CI friendly" (Mike Rapoport)
General fixes and cleanups to the MM selftests. Also move more MM
selftests under the kselftest framework, making them more amenable to
ongoing CI testing
- "selftests/mm: fix failures and robustness improvements" and
"selftests/mm: assorted fixes for hmm-tests" (Sayali Patil)
Fix several issues in MM selftests which were revealed by powerpc 64k
pagesize
* tag 'mm-stable-2026-06-23-08-55' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (118 commits)
Revert "mm: limit filemap_fault readahead to VMA boundaries"
mm/vmscan: pass NULL to trace vmscan node reclaim
mm: use mapping_mapped to simplify the code
selftests/mm: fix exclusive_cow test fork() handling
selftests/mm: remove hardcoded THP sizing assumptions in hmm tests
selftests/mm: allow PUD-level entries in compound testcase of hmm tests
mm/gup_test: reject wrapped user ranges
mm/page_frag: reject invalid CPUs in page_frag_test
mm/damon/core: always put unsuccessfully committed target pids
mm: page_isolation: avoid unsafe folio reads while scanning compound pages
mm/shrinker: do not hold RCU lock in shrinker_debugfs_count_show()
selftests: mm: fix and speedup "droppable" test
mm: merge writeout into pageout
MAINTAINERS: add Hao Ge as reviewer for codetag and alloc_tag
selftests/mm: clarify alternate unmapping in compaction_test
selftests/mm: move hwpoison setup into run_test() and silence modprobe output for memory-failure category
selftests/mm: skip uffd-stress test when nr_pages_per_cpu is zero
selftests/mm: skip uffd-wp-mremap if UFFD write-protect is unsupported
selftests/mm: ensure destination is hugetlb-backed in hugetlb-mremap
selftest/mm: register existing mapping with userfaultfd in hugetlb-mremap
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
Pull erofs updates from Gao Xiang:
"The most notable change is the removal of the fscache backend: it has
been deprecated for almost two years, mainly because EROFS file-backed
mounts and fanotify pre-content hooks (together with erofs-utils) now
provide better functionality and simpler codebase. In addition,
fscache has depended on netfslib for years, which is undesirable for
EROFS since it is a local filesystem. More details in [1].
In addition, sparse support has been added to the pcluster layout,
which is helpful for large sparse AI datasets, and map requests for
chunk-based inodes have been optimized to be more efficient as well.
There are also the usual fixes and cleanups.
Summary:
- Report more consecutive chunks of the same type for
each iomap request
- Add sparse support for the pcluster layout
- Update the EROFS documentation overview
- Remove the deprecated fscache backend
- Various fixes and cleanups"
Link: https://lore.kernel.org/r/20260622013622.934174-1-hsiangkao@linux.alibaba.com [1]
* tag 'erofs-for-7.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
erofs: handle 48-bit blocks_hi for compressed inodes
erofs: remove fscache backend entirely
erofs: simplify RCU read critical sections
erofs: add sparse support to pcluster layout
erofs: add folio order to trace_erofs_read_folio
erofs: introduce erofs_map_chunks()
erofs: call erofs_exit_ishare() before rcu_barrier()
erofs: update the overview of the documentation
erofs: clean up erofs_ishare_fill_inode()
|
|
Add the debugfs and trace plumbing used to trigger and observe
manual client reset.
The reset interface exposes a trigger file for operator-initiated
reset and a status file for tracking the most recent run. The
tracepoints record scheduling, completion, and blocked caller
behavior so reset progress can be diagnosed from the client side.
debugfs layout under /sys/kernel/debug/ceph/<client>/reset/:
trigger - write to initiate a manual reset
status - read to see the most recent reset result
The reset directory is cleaned up via debugfs_remove_recursive()
on the parent, so individual file dentries are not stored.
Tracepoints:
ceph_client_reset_schedule - reset queued
ceph_client_reset_complete - reset finished (success or failure)
ceph_client_reset_blocked - caller blocked waiting for reset
ceph_client_reset_unblocked - caller unblocked after reset
All tracepoints use a null-safe access for monc.auth->global_id
to guard against early-init or late-teardown edge cases.
Signed-off-by: Alex Markuze <amarkuze@redhat.com>
Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
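The debugfs layout described above can be exercised from user space roughly as follows. This is a minimal sketch, not the patch's own tooling: the helper names (`reset_path`, `trigger_reset`, `read_status`) are hypothetical, and the commit message does not say what must be written to `trigger`, so the payload is left to the caller.

```c
#include <stdio.h>
#include <string.h>

/* debugfs root for ceph clients, as given in the commit message. */
#define CEPH_DEBUGFS_BASE "/sys/kernel/debug/ceph"

/* Build "<base>/<client>/reset/<file>"; returns -1 if it does not fit. */
static int reset_path(char *buf, size_t cap, const char *base,
		      const char *client, const char *file)
{
	int n = snprintf(buf, cap, "%s/%s/reset/%s", base, client, file);

	return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}

/*
 * Write 'payload' to the trigger file. The accepted payload format is an
 * assumption; the commit message only says "write to initiate".
 */
static int trigger_reset(const char *base, const char *client,
			 const char *payload)
{
	char path[512];
	size_t len = strlen(payload);
	FILE *f;
	int ok;

	if (reset_path(path, sizeof(path), base, client, "trigger") < 0)
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fwrite(payload, 1, len, f) == len;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* Read the most recent reset result into 'out' (always NUL-terminated). */
static int read_status(const char *base, const char *client,
		       char *out, size_t cap)
{
	char path[512];
	size_t n;
	FILE *f;

	if (cap == 0 ||
	    reset_path(path, sizeof(path), base, client, "status") < 0)
		return -1;
	f = fopen(path, "r");
	if (!f)
		return -1;
	n = fread(out, 1, cap - 1, f);
	out[n] = '\0';
	fclose(f);
	return 0;
}
```

An operator tool would call `trigger_reset(CEPH_DEBUGFS_BASE, client, payload)` and then poll `read_status()`; debugfs access normally requires root.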
|
|
Add trace_f2fs_fault_report to trigger reporting upon f2fs_bug_on,
need_fsck, stop_checkpoint, and handle_eio. Since f2fs_bug_on and
need_fsck can be triggered in hundreds of scenarios, define set_sbi_flag
as a macro to help capture the effective fault function and line number.
Signed-off-by: shengyong1 <shengyong1@xiaomi.com>
Signed-off-by: liujinbao1 <liujinbao1@xiaomi.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
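The reason for turning set_sbi_flag into a macro can be illustrated in plain C: `__func__` and `__LINE__` expand at the call site only if the wrapper is a macro. The sketch below is a user-space illustration under that assumption; `record_flag`, `set_flag_traced`, and `struct fault_site` are hypothetical names, not the f2fs definitions.

```c
#include <string.h>

/* Hypothetical record of where a fault flag was last set. */
struct fault_site {
	const char *func;
	int line;
	int flag;
};

static struct fault_site last_fault;

/* Plain function: on its own it cannot know who called it. */
static void record_flag(int flag, const char *func, int line)
{
	last_fault.flag = flag;
	last_fault.func = func;
	last_fault.line = line;
}

/*
 * Macro wrapper: __func__ and __LINE__ are expanded where the macro is
 * used, so the recorded site is the caller, not the helper.
 */
#define set_flag_traced(flag) record_flag((flag), __func__, __LINE__)

/* Example caller standing in for one of the many fault paths. */
static void check_node(void)
{
	set_flag_traced(1);
}

/* Returns nonzero if the last recorded fault came from 'func'. */
static int fault_from(const char *func)
{
	return last_fault.func && strcmp(last_fault.func, func) == 0;
}
```

Had `set_flag_traced` been a function, every record would point at the helper itself, which is of little use when the flag can be set from hundreds of places.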
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty / serial driver updates from Greg KH:
"Here is the big set of TTY and Serial driver updates for 7.2-rc1.
Overall we end up removing more code than added, due to an obsolete
synclink_gt driver being removed from the tree, always a nice thing to
see happen.
Other than that driver removal, major things included in here are:
- max310x serial driver updates and fixes
- 8250 driver updates and rework in places to make it more "modern"
- dts file updates
- serial driver core tweaks and updates
- vt code cleanups
- vc_screen crash fixes
- other minor driver updates and cleanups
All of these have been in linux-next for well over a week with no
reported issues"
* tag 'tty-7.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (49 commits)
serial: 8250_pci: Don't specify conflicting values to pci_device_id members
vc_screen: fix null-ptr-deref in vcs_notifier() during concurrent vcs_write
serial: qcom_geni: Fix RX DMA stall when SE_DMA_RX_LEN_IN is zero
vt: merge ucs_is_zero_width()/ucs_is_double_width() into ucs_get_width()
serial: 8250: fix possible ISR soft lockup
dt-bindings: serial: rs485: remove deprecated .txt binding stub
serial: qcom-geni: trace: Add tracepoint support for Qualcomm GENI serial
tty: serial: Use named initializers for arrays of i2c_device_data
serial: 8250_dw: remove clock-notifier infrastructure
serial: 8250_dw: unregister 8250 port if clk_notifier_register() fails
amba/serial: amba-pl011: Bring back zx29 UART support
serial: 8250: Add support for console flow control
serial: 8250: Check LSR timeout on console flow control
serial: 8250: Set cons_flow on port registration
tty: serial: 8250: protect against NULL uart->port.dev in register
arm64: dts: add support for A9 based Amlogic BY401
dt-bindings: arm: amlogic: add A311Y3 support
serial: max310x: fix compile errors if CONFIG_SPI_MASTER is disabled
serial: qcom-geni: Avoid probing debug console UART without console support
serial: max310x: add comments for PLL limits
...
|
|
Add the order to the mm_collapse_huge_page<_swapin,_isolate> tracepoints
to give better insight into which order is being operated on.
Link: https://lore.kernel.org/20260605161422.213817-10-npache@redhat.com
Signed-off-by: Nico Pache <npache@redhat.com>
Reviewed-by: Lorenzo Stoakes <ljs@kernel.org>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Lance Yang <lance.yang@linux.dev>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Bagas Sanjaya <bagasdotme@gmail.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: Byungchul Park <byungchul@sk.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Gregory Price <gourry@gourry.net>
Cc: "Huang, Ying" <ying.huang@linux.alibaba.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nanyong Sun <sunnanyong@huawei.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rafael Aquini <raquini@redhat.com>
Cc: Rakie Kim <rakie.kim@sk.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shivank Garg <shivankg@amd.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Takashi Iwai (SUSE) <tiwai@suse.de>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Usama Arif <usamaarif642@gmail.com>
Cc: Usama Arif <usama.arif@linux.dev>
Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Shi <yang@os.amperecomputing.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- "selftests/mm: clean up build output and verbosity" (Li Wang)
Remove some noise from the MM selftests build
- "mm: Free contiguous order-0 pages efficiently" (Ryan Roberts)
Speed up the freeing of a batch of 0-order pages by first scanning
them for coalescing opportunities. This is applicable to vfree() and
to the releasing of frozen pages
- "mm/damon: introduce DAMOS failed region quota charge ratio"
(SeongJae Park)
Address a DAMOS usability issue: The DAMOS quota often exhausts
prematurely because it charges for all memory attempted, causing slow
and inconsistent performance when actions fail on unreclaimable
memory.
To fix this, a new feature lets users set a smaller, flexible quota
charge ratio (via a numerator and denominator) for failed regions.
Since failed actions cause less overhead, reducing their quota cost
ensures more predictable and efficient DAMOS processing
- "selftests/cgroup: improve zswap tests robustness and support large
page sizes" (Li Wang)
Fix various spurious failures and improve the overall robustness of
the cgroup zswap selftests
- "fix MAP_DROPPABLE not supported errno" (Anthony Yznaga)
Fix an issue in the mlock selftests on arm32
- "mm: huge_memory: clean up defrag sysfs with shared" (Breno Leitao)
Some maintenance work in the huge_memory code
- "treewide: fixup gfp_t printks" (Brendan Jackman)
Use the special vprintf() gfp_t conversion in various places
- "mm: Fix vmemmap optimization accounting and initialization" (Muchun
Song)
Fix several bugs in the vmemmap optimization, mainly around incorrect
page accounting and memmap initialization in the DAX and memory
hotplug paths. It also fixes pageblock migratetype initialization and
struct page initialization for ZONE_DEVICE compound pages
- "mm/damon: repost non-hotfix reviewed patches in damon/next tree"
A sprinkle of unrelated minor bugfixes for DAMON
- "mm: remove page_mapped()" (David Hildenbrand)
Remove this function from the tree, replacing it with folio_mapped()
- "mm/damon: let DAMON be paused and resumed" (SeongJae Park)
Allow DAMON to be paused and resumed without losing its current state
- "kasan: hw_tags: Disable tagging for stack and page-tables" (Muhammad
Usama Anjum)
Simplify and speed up kasan by removing its ineffective tagging of
stacks and page tables
- "mm/damon/reclaim,lru_sort: monitor all system rams by default"
(SeongJae Park)
Simplify deployment on diverse hardware like NUMA systems by updating
DAMON_RECLAIM and DAMON_LRU_SORT to automatically monitor the
physical address range covering all System RAM areas by default,
replacing the overly restrictive behavior that only targeted the
single largest memory block to save on negligible overhead
- "mm/damon/sysfs: document filters/ directory as deprecated" (SeongJae
Park)
Update some DAMON docs
- "mm: use spinlock guards for zone lock" (Dmitry Ilvokhin)
Switch zone->lock handling over to using the guard() mechanisms
- "mm/filemap: tighten mmap_miss hit accounting" (fujunjie)
Fix a flaw where the mmap_miss counter over-credited page cache hits
during fault-arounds and page-fault retries. This results in
significant reduction of redundant synchronous mmap readahead I/O,
drastically cutting down execution time and gigabytes read for sparse
random or strided memory access workloads
- "selftests/cgroup: Fix false positive failures in test_percpu_basic"
(Li Wang)
Fix a couple of false-positives in the cgroup kmem selftests
- "mm/damon/reclaim: support monitoring intervals auto-tuning"
(SeongJae Park)
Add a new parameter to DAMON permitting DAMON_RECLAIM to
automatically tune DAMON's sampling and aggregation intervals
- "mm/damon/stat: add kdamond_pid parameter" (SeongJae Park)
Change DAMON_STAT to provide the pid of its kdamond
- "mm/kmemleak: dedupe verbose scan output" (Breno Leitao)
Remove large amounts of duplicated backtraces from the verbose-mode
kmemleak output
- "mm: remove CONFIG_HAVE_BOOTMEM_INFO_NODE (Part 1)" (David
Hildenbrand)
Reduce our use of CONFIG_HAVE_BOOTMEM_INFO_NODE, with a view to
removing it entirely in a later series
- "mm/damon: validate min_region_size to be power of 2" (Liew Rui Yan)
Prevent users from passing a non-power-of-2 value of `addr_unit', as
this later results in undesirable behavior
- "mm: document read_pages and simplify usage" (Frederick Mayle)
- "tools/mm/page-types: Fix misc bugs" (Ye Liu)
Fix three issues in tools/mm/page-types.c
- "mm: misc cleanups from __GFP_UNMAPPED series" (Brendan Jackman)
Implement several cleanups in the page allocator and related code
- "mm, swap: swap table phase IV: unify allocation" (Kairui Song)
Unify the allocation and charging of anon and shmem swap in folios,
provide better synchronization, consolidate the metadata
management (hence dropping the static array and map), and improve
performance
- "mm/damon: introduce data attributes monitoring" (SeongJae Park)
Extend DAMON to monitor general data attributes other than accesses
- "mm/vmalloc: free unused pages on vrealloc() shrink" (Shivam Kalra)
Implement the TODO in vrealloc() to unmap and free unused pages when
shrinking across a page boundary
- "mm/damon: documentation and comment fixes" (niecheng)
- "remove mmap_action success, error hooks" (Lorenzo Stoakes)
Eliminate custom hooks from mmap_action by removing the problematic
success_hook which allowed drivers to improperly access uninitialized
VMAs. It replaces the error_hook with a simple error-code field and
updates the memory char driver accordingly
- "mm/damon: minor improvements for code readability and tests"
(SeongJae Park)
- "mm/damon: fix macro arguments and clarify quota goals doc" (Maksym
Shcherba)
- "userfaultfd: merge fs/userfaultfd.c into mm/userfaultfd.c" (Mike
Rapoport)
- "mm/mglru: improve reclaim loop and dirty folio" (Kairui Song and
others)
Clean up and slightly improve MGLRU's reclaim loop and dirty
writeback handling. Large performance improvements are measured
- "use vma locks for proc/pid/{smaps|numa_maps} reads" (Suren
Baghdasaryan)
Use per-vma locks when reading /proc/pid/smaps and numa_maps to
reduce contention on the central mmap_lock
- "refactors thpsize_shmem_enabled_store() and thpsize_shmem_enabled_show()"
(Ran Xiaokai)
Some cleanup work in the THP code
- "selftests/memfd: fix compilation warnings" (Konstantin Khorenko)
Fix a few build glitches in the memfd selftest code.
- "memcg: shrink obj_stock_pcp and cache multiple objcgs" (Shakeel
Butt)
Resolve a 68% performance regression caused by NUMA-node cache
thrashing around struct obj_stock_pcp by shrinking its existing
fields and expanding it into a multi-slot array that caches up to
five obj_cgroup pointers per CPU, allowing per-node variants of the
same memcg to coexist within a single 64-byte cache line.
- "zram: writeback fixes" (Sergey Senozhatsky)
Address a couple of unrelated zram writeback issues
- "mm: switch THP shrinker to list_lru" (Johannes Weiner)
Resolve NUMA-awareness issues and streamline callsite interaction by
refactoring and extending the list_lru API to completely replace the
complex, open-coded deferred split queue for Transparent Huge Pages
- "mm: improve large folio readahead for exec memory" (Usama Arif)
Improve large-folio readahead on systems like 64K-page arm64 by
preventing the mmap_miss check from permanently disabling
target-oriented VM_EXEC readahead, and by generalizing the
force_thp_readahead gate to support mappings with any usefully large
maximum folio order under the cache cap.
- "userfaultfd/pagemap: pre-existing fixes" (Kiryl Shutsemau)
Fix a bunch of minor issues in the userfaultfd/pagemap, all of which
were flagged by Sashiko review of proposed new material
- "mm/sparse-vmemmap: Provide generic vmemmap_set_pmd() and
vmemmap_check_pmd()" (Muchun Song)
Provide generic versions of these two functions so the four
arch-specific implementations can be removed.
- "mm/swap, PM: hibernate: fix swapoff race in uswsusp by pinning swap
device" (Youngjun Park)
Address a uswsusp-vs-swapoff race and reduce the swap device
reference taking/releasing frequency.
- "mm/hmm: A fix and a selftest" (Dev Jain)
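The "Free contiguous order-0 pages efficiently" entry above describes scanning a batch of pages for coalescing opportunities before freeing them. A minimal sketch of such a scan is shown here, assuming the batch is available as an array of page frame numbers; `for_each_contig_run` and `count_pages` are hypothetical names, and the real allocator has alignment and order constraints that this illustration ignores.

```c
#include <stddef.h>

/* Callback invoked once per contiguous run of page frame numbers. */
typedef void (*run_fn)(unsigned long start_pfn, size_t nr, void *ctx);

/*
 * Walk 'pfns' and group neighbours whose PFNs increase by exactly one.
 * Each group is reported once, so the caller can free it as a unit
 * instead of page by page. Returns the number of runs found.
 */
static size_t for_each_contig_run(const unsigned long *pfns, size_t n,
				  run_fn fn, void *ctx)
{
	size_t runs = 0;
	size_t i = 0;

	while (i < n) {
		size_t j = i + 1;

		/* Extend the run while the next PFN is adjacent. */
		while (j < n && pfns[j] == pfns[j - 1] + 1)
			j++;
		if (fn)
			fn(pfns[i], j - i, ctx);
		runs++;
		i = j;
	}
	return runs;
}

/* Example callback: accumulate the total number of pages seen. */
static void count_pages(unsigned long start_pfn, size_t nr, void *ctx)
{
	(void)start_pfn;
	*(size_t *)ctx += nr;
}
```

For the input `{10, 11, 12, 20, 21, 30}` the scan reports three runs covering six pages, so three batched frees could replace six single-page ones.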
* tag 'mm-stable-2026-06-18-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (321 commits)
selftests/mm/hmm-tests: test pagemap reads of PMD device-private entries
fs/proc/task_mmu: do not warn on seeing non-migration pmd entry
lib/test_hmm: check alloc_page_vma() return value and handle OOM
mm/compaction: cap compact_gap() at COMPACT_CLUSTER_MAX
mm/swap: remove redundant swap device reference in alloc/free
mm/swap, PM: hibernate: fix swapoff race in uswsusp by pinning swap device
mm/filemap: use folio_next_index() for start
vmalloc: fix NULL pointer dereference in is_vm_area_hugepages()
sparc/mm: drop vmemmap_check_pmd helper and use generic code
loongarch/mm: drop vmemmap_check_pmd helper and use generic code
riscv/mm: drop vmemmap_pmd helpers and use generic code
arm64/mm: drop vmemmap_pmd helpers and use generic code
mm/sparse-vmemmap: provide generic vmemmap_set_pmd() and vmemmap_check_pmd()
rust: page: mark Page::nid as inline
userfaultfd: build __VMA_UFFD_FLAGS from config-gated masks
userfaultfd: gate must_wait writability check on pte_present()
mm/huge_memory: preserve pmd_swp_uffd_wp on device-private PMD downgrade
fs/proc/task_mmu: fix hugetlb self-deadlock in pagemap_scan_pte_hole()
fs/proc/task_mmu: use huge_page_size() in pagemap_scan_hugetlb_entry()
fs/proc/task_mmu: fix make_uffd_wp_huge_pte() prot-update race
...
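The DAMOS "failed region quota charge ratio" series summarized in this pull charges failed regions at a reduced rate expressed as a numerator and denominator. The arithmetic might look like the sketch below. The function name `quota_charge`, its parameters, and the choice to charge in full when the denominator is zero are illustrative assumptions, not the DAMON implementation.

```c
#include <stdint.h>

/*
 * Charge successfully handled bytes in full and failed bytes at
 * num/den of their size. 'tried' is everything the action attempted;
 * 'applied' is the part that succeeded.
 */
static uint64_t quota_charge(uint64_t tried, uint64_t applied,
			     uint64_t num, uint64_t den)
{
	uint64_t failed;

	if (applied > tried)
		applied = tried;
	failed = tried - applied;

	/* Assumption: a zero denominator means "no discount". */
	if (den == 0)
		return tried;

	/*
	 * floor(failed * num / den), split into quotient and remainder
	 * parts so the intermediate product stays small.
	 */
	return applied + (failed / den) * num + (failed % den) * num / den;
}
```

With 1000 bytes tried, 200 applied, and a ratio of 1/10, the charge is 200 + 80 = 280 rather than 1000, so a scheme that mostly fails on unreclaimable memory consumes its quota far more slowly.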
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o:
- A major rework of the fast commit mechanism to avoid lock contention
and deadlocks. We also export snapshot statistics in
/proc/fs/ext4/*/fc_info
- Performance optimization for directory hash computation by processing
input in 4-byte chunks and removing function pointers, along with new
KUnit tests for directory hash
- Cleanups in JBD2 to remove special slabs and use kmalloc() instead
- Various bug fixes, including:
- Early validation of donor superblock in EXT4_IOC_MOVE_EXT to
avoid cross-fs deadlock
- Fix for a kernel BUG in ext4_write_inline_data_end under
data=journal
- Fix for a NULL dereference in jbd2_journal_dirty_metadata when
handle is aborted
- Fix for an underflow in JBD2 fast commit block initialization
check
- Fix for LOGFLUSH shutdown ordering to ensure ordered data
writeback
- Miscellaneous fixes for error path return values and KUnit
assertions
* tag 'ext4_for_linus-7.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: validate donor file superblock early in EXT4_IOC_MOVE_EXT
ext4: fix kernel BUG in ext4_write_inline_data_end
ext4: fix ERR_PTR(0) in ext4_mkdir()
jbd2: remove special jbd2 slabs
ext4: remove mention of PageWriteback
ext4: improve str2hashbuf by processing 4-byte chunks and removing function pointers
ext4: add Kunit coverage for directory hash computation
ext4: fast commit: export snapshot stats in fc_info
ext4: fast commit: add lock_updates tracepoint
ext4: fast commit: avoid i_data_sem by dropping ext4_map_blocks() in snapshots
ext4: fast commit: avoid self-deadlock in inode snapshotting
ext4: fast commit: avoid waiting for FC_COMMITTING
ext4: lockdep: handle i_data_sem subclassing for special inodes
ext4: fast commit: snapshot inode state before writing log
jbd2: fix integer underflow in jbd2_journal_initialize_fast_commit()
ext4: fix fast commit wait/wake bit mapping on 64-bit
jbd2: check for aborted handle in jbd2_journal_dirty_metadata()
ext4: Use %pe to print PTR_ERR()
ext4: fix LOGFLUSH shutdown ordering to allow ordered-mode data writeback
ext4: replace KUnit tests for memcmp() with KUNIT_ASSERT_MEMEQ()
|
|
erofs supports large folios for reads, but the actual folio order
instantiated in the page cache may be lower due to allocation
constraints such as memory fragmentation.
trace_erofs_read_folio already receives the folio being read but
currently records only its index. Add folio_order() to the tracepoint
so that users can observe the realized folio order and verify the
effectiveness of large folio reads.
Also drop the uptodate field: read_folio() is only called for a
non-uptodate folio, so it is always 0.
Suggested-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Zhan Xusheng <zhanxusheng@xiaomi.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
Pull drm updates from Dave Airlie:
"Highlights:
- xe: add initial CRI platform support
- amdgpu: initial HDMI 2.1 FRL support
- rust: add some new type concepts for device lifetimes
- scheduler: moves to a fair algorithm and lots of cleanups
But it's mostly the usual mountain of changes across the board.
core:
- add docbook for DRM_IOCTL_SYNCOBJ_EVENTFD
- change signature of drm_connector_attach_hdr_output_metadata_property
- dedup counter and timestamp retrieval in vblank code
- parse AMD VSDB v3 in CTA extension blocks
- add P230, Y7, XYYY2101010, T430, XVUY210101010 formats
- don't call drop master on file close if not master
- use drm_printf_indent in atomic / bridge
- fix 32b format descriptions
- docs: fix toctree
- hdmi: add common TMDS character rates
- fix drm_syncobj_find_fence leak
rust:
- introduce Higher-Ranked lifetime types
- replace drvdata with scoped registration data
- add GPUVM immediate mode abstraction for rust GPU drivers
- introduce DeviceContext type state for drm::Device
bridge:
- clarify drm_bridge_get/put
- create drm_get_bridge_by_endpoint and use it
- analogix_dp: add panel probing
- ite-it6211 - use drm audio hdmi helpers
buddy:
- add lockdep annotations
dp:
- add PR and VRR updates
- mst: fix buffer overflows
- add Adaptive Sync SDP decoding support
- fix OOB reads in dp-mst
ttm:
- bump fpfn/lpfn to 64-bit
scheduler:
- change default to fair scheduler
- map runqueue 1:1 with scheduler
dma-buf:
- port selftests to kunit
- convert dma-buf system/heap allocators to module
- add separate DMABUF_HEAPS_SYSTEM_CC_SHARED Kconfig
udmabuf:
- revert hugetlb support
- fix error with CONFIG_DMA_API_DEBUG
dma-fence:
- fix tracepoints lifetime
- remove unused signal on any support
ras:
- add clear error counter netlink command to drm ras
gpusvm:
- reject VMAs with VM_IO or VM_PFNMAP when creating SVM ranges
- use IOVA allocations
pagemap:
- use IOVA allocations
panels:
- update to use ref counts
- add support for CSW PNB601LS1-2, LGD LP116WHA-SPB1
- add support for waveshare panels
- CMN N116BCN-EA1, CMN N140HCA-EEK, IVO M140NWFQ R5,
IVO R140NWFW R0, BOE NT140*, BOE NV133FHM-N4F,
AUO B140*, AUO B133HAN06.6 and AUO B116XTN02.3 eDP panels
- Surface Pro 12 Panel
xe:
- add CRI PCI-IDs
- debugfs add multi-lrc info
- engine init cleanup
- PF fair scheduling auto provisioning
- system controller support for CRI/Xe3p
- PXP state machine fixes
- Reset/wedge/unload corner case fixes
- Wedge path memory allocation fixes
- PAT type cleanups
- Reject unsafe PAT for CPU cached memory
- OA improvements for CRI device memory
- kernel doc syntax in xe headers
- xe_drm.h documentation fixes
- include guard cleanups
- VF CCS memory pool
- i915/xe step unification
- Xe3p GT tuning fixes
- forcewake cleanup in GT and GuC
- admin-only PF mode
- enable hwmon energy attributes for CRI
- enable GT_MI_USER_INTERRUPT
- refactor emit functions
- oa workarounds
- multi_queue: allow QUEUE_TIMESTAMP register
- convert stolen memory to ttm range manager
- use xe2 style blitter as a feature flag
- make drm_driver const
- add/use IRQ page to HW engine definition
- fix oops when display disabled
i915:
- enable PIPEDMC_ERROR interrupt
- more common display code refactoring
- restructure DP/HDMI sink format handling
- eliminate FB usage from lowlevel pinning code
- panel replay bw optimization
- integrate sharpness filter into the scaler
- new fb_pin abstraction for xe/i915 fb transparent handling
- skip inactive MST connectors on HDCP
- start switching to display specific registers
- use polling when irq unavailable
- Adaptive-sync SDP prep
amdgpu:
- use drm_display_info for AMD VSDB data
- Initial HDMI 2.1 FRL support
- Initial DCN 4.2.1 support
- GART fixes for non-4k pages
- GC 11.5.6/SDMA 6.4.0/and other new IPs
- GFX9/DCE6/Hawaii/SDMA4/GART/Userq fixes
- Finish support for using multiple SDMA queues for TTM operations
- SWSMU updates
- GC 12.1 updates
- SMU 15.0.8 updates
- DCN 4.2 updates
- DC type conversion fixes
- Enable DC power module
- Replay/PSR updates
- SMU 13.x updates
- Compute queue quantum MQD updates
- ASPM fix
- Align VKMS with common implementation
- DC analog support fixes
- UVD 3 fixes
- TCC harvesting fixes for SI
- GC 11 APU module reload fix
- NBIO 6.3.2 support
- IH 7.1 updates
- DC cursor fixes
- VCN/JPEG user fence fixes
- DC support for connectors without DDC
- Prefer ROM BAR for default VGA device
- DC bandwidth fixes
- Add PTL support for profiler
- Introduce dc_plane_cm and migrate surface update color path
- Add FRL registers for HDMI 2.1
- Restructure VM state machine
- Auxless ALPM support
- GEM_OP locking/warning fixes
- switch to system_dfl_wq
amdkfd:
- GPUVM TLB flush fix
- Hotplug fix
- Boundary check fixes
- SVM fixes
- CRIU fixes
- add profiler API
- MES 12.1 updates
msm:
- core:
- fix shrinker documentation
- IFPC enabled for gen8
- PERFCNTR_CONFIG ioctl support
- GPU:
- reworked UBWC handling
- a810 support
- MDSS:
- add support for Milos platform
- reworked UBWC handling
- DisplayPort:
- reworked HPD handling as prep for MST
- DPU:
- Milos platform support
- reworked UBWC handling
- DSI:
- Milos platform support
nova:
- Hopper/Blackwell enablement (GH100/GB100/GB202)
- FSP support
- 32-bit firmware support
- HAL functions
- refactor GSP boot/unload
- GA100 support
- VBIOS hardening/refactoring
- Adopt higher order lifetime types
tyr:
- define register blocks
- add shmem backed GEM objects
- adopt higher order lifetime types
- move clock cleanup into Drop
radeon:
- Hawaii SMU fixes
- CS parser fix
- use struct drm_edid instead of edid
amdxdna:
- export per-client BO memory via fdinfo
- AIE4 device support
- support medium/lower power modes
- expandable device heap support
- revert read-only user-pointer BO mappings
ivpu:
- support frequency limiting
panthor:
- enable GEM shrinker support
- add eviction and reclaim info to fdinfo
v3d:
- enable runtime PM
mgag200:
- support XRGB1555 + C8
ast:
- support XRGB1555 + C8
- use constants for lots of registers
- fix register handling
imagination:
- fence handling refactoring
nouveau:
- fix sched double call
- expose VBIOS on GSP-RM systems
- add GA100 support
virtio:
- add VIRTIO_GPU_F_BLOB_ALIGNMENT flag
- add deferred mapping support
gud:
- add RCade Display Adapter
hibmc:
- fix no connectors usage
mediatek:
- hdmi: convert error handling
- simplify mtk_crtc allocation
exynos:
- move fbdev emulation to drm client buffers
- use drm format helpers for geometry/size
- adopt core DMA tracking
- fix framebuffer offset handling
renesas:
- add RZ/T2H SOC support
verisilicon:
- add cursor plane support
tegra:
- use drm client for framebuffer"
* tag 'drm-next-2026-06-17' of https://gitlab.freedesktop.org/drm/kernel: (1731 commits)
dma-buf: move system_cc_shared heap under separate Kconfig
accel/amdxdna: Clear sva pointer after unbind
agp/amd64: Fix broken error propagation in agp_amd64_probe()
accel/amdxdna: Require carveout when PASID and force_iova are disabled
drm/amdkfd: always resume_all after suspend_all
drm/amdgpu/gfx: move fault and EOP IRQ get/put to hw_init/hw_fini
drm/amd/display: Consult MCCS FreeSync cap only if requested & supported
drm/amd/pm: Use strscpy in profile mode parsing
drm/amdkfd: Fix infinite loop parsing CRAT with zero subtype length
drm/amdkfd: fix sysfs topology prop length on buffer truncation
drm/amdgpu: drop retry loop in amdgpu_hmm_range_get_pages
drm/amd/pm: bound OD parameter parsing to stack array size
drm/amd/pm: Stop pp_od_clk_voltage emit at PAGE_SIZE
drm/amdkfd: Unwind debug trap enable on copy_to_user failure
drm/amdgpu: validate the mes firmware version for gfx12.1
drm/amdgpu: validate the mes firmware version for gfx12
drm/amdgpu: compare MES firmware version ucode for gfx11
drm/amdkfd: Add bounds check for AMDKFD_IOC_WAIT_EVENTS
drm/amdgpu: restart the CS if some parts of the VM are still invalidated
drm/amd/display: use unsigned types for local pipe and REG_GET counters
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Pull bpf updates from Alexei Starovoitov:
"Major changes:
- Recover from BPF arena page faults using a scratch page and add
ptep_try_set() for lockless empty-slot installs on x86 and arm64.
This allows BPF kfuncs to access arena pointers directly.
The 'arena_direct_access' stable branch was created for this work
and was pulled into sched-ext and bpf-next trees (Tejun Heo, Kumar
Kartikeya Dwivedi)
- Lift old restriction and support 6+ arguments in BPF programs and
kfuncs on x86 and arm64 (Yonghong Song, Puranjay Mohan)
Other features and fixes:
- Add 24-bit BTF vlen and reclaim unused bits in the BTF UAPI to ease
addition of new BTF kinds (Alan Maguire)
- Raise the maximum BPF call chain depth from 8 to 16 frames (Alexei
Starovoitov)
- Refactor object relationship tracking in the verifier and fix a
dynptr use-after-free bug (Amery Hung)
- Harden the signed program loader and reject exclusive maps as inner
maps (Daniel Borkmann)
- Replace the verifier min/max bounds fields with a circular number
(cnum) representation and improve 32->64 bit range refinements
(Eduard Zingerman)
- Introduce the arena library and runtime (libarena) with a buddy
allocator, rbtree and SPMC queue data structures, ASAN support and
a parallel test harness. Allow subprograms to return arena pointers
and switch to a BTF type-tag based __arena annotation (Emil
Tsalapatis)
- Cache build IDs in the sleepable stackmap path and avoid faultable
build ID reads under mm locks (Ihor Solodrai)
- Introduce the tracing_multi link to attach a single BPF program to
many kernel functions at once. Allow specifying the uprobe_multi
target via FD (Jiri Olsa)
- Extend the bpf_list family of kfuncs with bpf_list_add/del(), and
bpf_list_is_first/is_last/empty() (Kaitao Cheng)
- Extend the BPF syscall with common attributes support for
prog_load, btf_load and map_create (Leon Hwang)
- Wrap rhashtable as BPF map (Mykyta Yatsenko, Herbert Xu)
- Add sleepable support for tracepoint programs and fix deadlocks in
LRU map due to NMI reentry (Mykyta Yatsenko)
- Fix OOB access in bpf_flow_keys, fix nullness analysis of inner
arrays, enforce write checks for global subprograms (Nuoqi Gui)
- Report the maximum combined stack depth and print a breakdown of
instructions processed per subprogram (Paul Chaignon)
- Add an XDP load-balancer benchmark and arm64 JIT support for stack
arguments (Puranjay Mohan)
- Add kfuncs to traverse over wakeup_sources (Samuel Wu)
- Allow sleepable BPF programs to use LPM trie maps directly (Vlad
Poenaru)
- Many more fixes and cleanups across the verifier, BTF, sockmap,
devmap, bpffs, security hooks, s390/riscv/loongarch JITs,
rqspinlock, libbpf, bpftool, selftests"
* tag 'bpf-next-7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (336 commits)
selftests/bpf: Work around llvm stack overflow in crypto progs
selftests/bpf: add test for bpf_msg_pop_data() overflow
bpf, sockmap: fix integer overflow in bpf_msg_pop_data() bounds check
sockmap: Fix use-after-free in udp_bpf_recvmsg()
bpf, sockmap: keep sk_msg copy state in sync
bpf, sockmap: Fix wrong rsge offset in bpf_msg_push_data()
bpf, sockmap: reject overflowing copy + len in bpf_msg_push_data()
selftsets/bpf: Retry map update on helper_fill_hashmap()
selftests/bpf: Add test for sleepable lsm_cgroup rejection
selftests/bpf: Add test to verify the fix for bpf_setsockopt() helper
bpf: Fix bpf_get/setsockopt to tos for ipv4-mapped ipv6 socket
selftests/bpf: Avoid static LLVM linking for cross builds
selftests/bpf: Use common CFLAGS for urandom_read
selftests/bpf: Initialize operation name before use
tools/bpf: build: Append extra cflags
libbpf: Initialize CFLAGS before including Makefile.include
bpftool: Append extra host flags
bpftool: Avoid adding EXTRA_CFLAGS to HOST_CFLAGS
bpftool: Pass host flags to bootstrap libbpf
selftests/bpf: correct CONFIG_PPC64 macro name in comment
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core & protocols:
- Work on removing rtnl_lock protection throughout the stack
continues. In this chapter:
- don't use rtnl_lock for IPv6 multicast routing configuration
- don't take rtnl_lock in ethtool for modern drivers
- prepare Qdisc dump callbacks for rtnl_lock removal
- Support dumping just ifindex + name of all interfaces, under RCU.
It's a common operation for Netlink CLI tools (when translating
names to ifindexes) and previously required full rtnl_lock.
- Support dumping qdiscs and page pools for a specific netdev. Even
though user space wants a dump of all netdevs most of the time, the
OOO programming model results in repeating the dump for each
netdev, which, in the absence of a cache, leads to O(n^2) behavior.
- Flush nexthops once on multi-nexthop removal (e.g. when device goes
down), another O(n^2) -> O(n) improvement.
- Rehash locally generated traffic to a different nexthop on
retransmit timeout.
- Honor oif when choosing nexthop for locally generated IPv6 traffic.
- Convert TCP Auth Option to crypto library, and drop non-RFC algos.
- Increase subflow limits in MPTCP to 64 and endpoint limit to 256.
- Support MPTCP signaling of IPv6 address + port (ADD_ADDR). We need
to selectively skip reporting of the standard TCP Timestamp option,
because they won't fit into the header space together (12 + 30 >
40).
- Support using bridge neighbor suppression, Duplicate Address
Detection, Gratuitous ARP and unsolicited NA forwarding - in EVPN
deployments, e.g. VXLAN fabrics (IPv4 and IPv6).
- Improve link state reporting for upper netdevs (e.g. macvlan) over
tunnel devices (again, mostly for EVPN deployments).
- Support binding GENEVE tunnels to a local address.
- Speed up UDP tunnel destruction (remove one synchronize_rcu()).
- Support exponential field encoding in multicast (IGMPv3 and MLDv2).
- Support attaching PSP crypto offload to containers (veth, netkit).
- Add a new IPSec Netlink message XFRM_MSG_MIGRATE_STATE that allows
migrating individual IPsec SAs independently of their policies.
The existing XFRM_MSG_MIGRATE is tightly coupled to policy+SA
migration, lacks SPI for unique SA identification, and cannot
express reqid changes or migrate Transport mode selectors.
The new interface identifies the SA via SPI and mark, supports
reqid changes, address family changes, encap removal, and uses an
atomic create+install flow under x->lock to prevent SN/IV reuse
during AEAD SA migration.
- Implement GRO/GSO support for PPPoE.
- Convert sockopt callbacks in a number of protocols to iov_iter.
Cross-tree stuff:
- Remove support for Crypto TFM cloning (unblocked after the TCP Auth
Option rework). This feature regressed performance for all crypto
API users, since it changed crypto transformation objects into
reference-counted objects.
- Add FCrypt-PCBC implementation to rxrpc and remove it from the
global crypto API as obsolete and insecure.
Wireless:
- Major rework of station bandwidth handling, fixing issues with
lower capability than AP.
- Cleanups for EMLSR spec issues (drafts differed).
- More Neighbor Awareness Networking (Wi-Fi Aware) work (multicast,
schedule improvements, multi-station etc.)
- Some Ultra High Reliability (UHR) / IEEE 802.11bn (D1.4) work
(e.g. non-primary channel access, UHR DBE support).
- Fine Timing Measurement ranging (i.e. distance measurement) APIs.
Netfilter:
- Use per-rule hash initval in nf_conncount. This avoids unnecessary
lock contention with short keys (e.g. conntrack zones) in different
namespaces.
- Various safety improvements, both in packet parsing and object
lifetimes. Notably add refcounts to conntrack timeout policy.
Deletions:
- Remove TLS + sockmap integration. TLS wants to pin user pages to
avoid a copy, and sockmap wants to write to the input stream. More
work on this integration is clearly needed, and we can't find any
users (original author admitted that they never deployed it).
- Remove support for TLS offload with TCP Offload Engine (the far
more common opportunistic offload is retained). The locking looks
unfixable (driver sleeps under TCP spin locks) and people from the
vendor that added this are AWOL.
- Remove more ATM code, trying to leave behind only what PPPoATM
needs, AAL5 and br2684 with permanent circuits.
- Remove AppleTalk. Let it join hamradio in our out of tree protocol
graveyard, I mean, repository.
- Disable 32-bit x_tables compatibility (32bit binaries on 64bit
kernel) interface in user namespaces. To be deleted completely,
soon.
- Remove 5/10 MHz support from cfg80211/mac80211.
Drivers:
- Software:
- Support DEVMEM/DMABUF Tx over NETMEM_TX_NO_DMA devices (netkit)
- bonding: add knob to strictly follow 802.3ad for link state
- New drivers:
- Alibaba Elastic Ethernet Adaptor (cloud vNIC).
- NXP NETC switch within i.MX94.
- DPLL:
- Add operational state to pins (implement in zl3073x).
- Add generic DPLL type, for daisy-chaining DPLLs (implement in ice).
- Ethernet high-speed NICs:
- Huawei (hinic3):
- enhance tc flow offload support with queue selection,
tunnels
- nVidia/Mellanox:
- avoid over-copying payload to the skb's linear part (up to
60% win for LRO on slow CPUs like ARM64 V2)
- expose more per-queue stats over the standard API
- support additional, unprivileged PFs in the DPU
configuration
- support Socket Direct (multi-PF) with switchdev offloads
- add a pool / frag allocator for DMA mapped buffers for
control objects, save memory on systems with 64kB page size
- take advantage of the ability to dynamically change RSS
table size, even when table is configured by the user
- increase the max RSS table size for even traffic
distribution
- Ethernet NICs:
- Marvell/Aquantia:
- AQC113 PTP support
- Realtek USB (r8152):
- support 10Gbit Link Speeds and Energy-Efficient Ethernet
(EEE)
- support firmware loading (for RTL8157/RTL8159)
- support for the RTL8159
- Intel (ixgbe):
- support Energy-Efficient Ethernet (EEE) on E610 devices
- Ethernet switches:
- Airoha:
- support multiple netdevs on a single GDM block / port
- Marvell (mv88e6xxx):
- support SERDES of mv88e6321
- Microchip (ksz8/9):
- rework the driver callbacks to remove one indirection layer
- Motorcomm (yt921x):
- support port rate policing
- support TBF qdisc offload
- support ACL/flower offload
- nVidia/Mellanox:
- expose per-PG rx_discards
- Realtek:
- rtl8365mb: bridge offloading and VLAN support
- Ethernet PHYs:
- Airoha:
- support Airoha AN8801R Gigabit PHYs.
- Micrel:
- implement 3 low-loss cable tunables
- Realtek:
- support MDI swapping for RTL8226-CG
- support MDIO for RTL931x
- Qualcomm:
- at803x: Rx and Tx clock management for IPQ5018 PHY
- Motorcomm:
- support YT8522 100M RMII PHY
- set drive strength in YT8531s RGMII
- TI:
- dp83822: add optional external PHY clock
- Bluetooth:
- hci_sync: add support for HCI_LE_Set_Host_Feature [v2]
- SMP: use AES-CMAC library API
- Intel:
- support Product level reset
- support smart trigger dump
- Mediatek:
- add event filter to filter specific events
- Realtek:
- fix RTL8761B/BU broken LE extended scan
- WiFi:
- Broadcom (b43):
- new support for an 11n device
- MediaTek (mt76):
- support mt7927
- mt792x: broken usb transport detection
- mt7921: regulatory improvements
- Qualcomm (ath9k):
- GPIO interface improvements
- Qualcomm (ath12k):
- WDS support
- replace dynamic memory allocation in WMI Rx path
- thermal throttling/cooling device support
- 6 GHz incumbent interference detection
- channel 177 in 5 GHz
- Realtek (rtw89):
- RTL8922AU support
- USB 3 mode switch for performance
- better monitor radiotap support
- RTL8922DE preparations"
* tag 'net-next-7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1778 commits)
ipv4: fib_rule: Move fib4_rules_exit() to ->exit().
net: serialize netif_running() check in enqueue_to_backlog()
net: skmsg: preserve sg.copy across SG transforms
appletalk: move the protocol out of tree
appletalk: stop storing per-interface state in struct net_device
selftests/bpf: test that TLS crypto is rejected on a sockmap socket
selftests/bpf: drop the unused kTLS program from test_sockmap
selftests/bpf: remove sockmap + ktls tests
tls: remove dead sockmap (psock) handling from the SW path
tls: reject the combination of TLS and sockmap
atm: remove orphaned uAPI for deleted drivers, protocols and SVCs
atm: remove unused ATM PHY operations
atm: remove the unused pre_send and send_bh device operations
atm: remove the unused change_qos device operation
atm: remove SVC socket support and the signaling daemon interface
atm: remove the local ATM (NSAP) address registry
atm: remove dead SONET PHY ioctls
atm: remove the unused send_oam / push_oam callbacks
atm: remove AAL3/4 transport support
net: dsa: sja1105: fix lastused timestamp in flower stats
...
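
The "12 + 30 > 40" remark in the MPTCP ADD_ADDR item above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the constant and function names are made up for this example and are not kernel identifiers. It assumes the RFC 8684 ADD_ADDR layout (4 bytes of kind/length/subtype/address ID, the address, an optional 2-byte port and an 8-byte truncated HMAC on non-echo options) and the usual padding of the 10-byte TCP Timestamp option to 12 bytes.

```python
# Illustrative arithmetic only; all names here are hypothetical, not kernel symbols.
TCP_MAX_HEADER_LEN = 60    # the 4-bit data offset counts 32-bit words: 15 * 4
TCP_BASE_HEADER_LEN = 20   # fixed part of the TCP header
TCP_OPTION_SPACE = TCP_MAX_HEADER_LEN - TCP_BASE_HEADER_LEN  # 40 bytes

TCP_TIMESTAMP_ALIGNED = 12  # 10-byte Timestamp option padded to a 4-byte boundary


def add_addr_len(ipv6: bool, with_port: bool) -> int:
    """Length of a non-echo MPTCP ADD_ADDR option, assuming the RFC 8684 layout."""
    length = 4                    # kind, length, subtype/flags, address ID
    length += 16 if ipv6 else 4   # the advertised address
    length += 2 if with_port else 0
    length += 8                   # truncated HMAC
    return length


def fits_with_timestamp(ipv6: bool, with_port: bool) -> bool:
    """True if the option and an aligned Timestamp option share the option space."""
    return TCP_TIMESTAMP_ALIGNED + add_addr_len(ipv6, with_port) <= TCP_OPTION_SPACE


if __name__ == "__main__":
    needed = TCP_TIMESTAMP_ALIGNED + add_addr_len(ipv6=True, with_port=True)
    print(f"IPv6 + port: {needed} bytes needed, {TCP_OPTION_SPACE} available")
```

Under these assumptions the IPv6 address + port case needs more than the 40 bytes of option space, which matches the stated reason for skipping the Timestamp option on those segments.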
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block updates from Jens Axboe:
- NVMe pull request via Keith:
- Per-controller admin and IO timeout sysfs attributes, and
letting the block layer set request timeouts (Maurizio,
Maximilian)
- Multipath passthrough iostats, and PCI P2PDMA enablement for
multipath devices (Keith, Kiran)
- A new diag sysfs attribute group exporting per-controller
counters (retries, multipath failover, error counters, requeue
and failure counts, reset and reconnect events) (Nilay)
- FDP configuration validation and bounds check fixes (liuxixin)
- Various nvmet fixes, including a pre-auth out-of-bounds read in
the Discovery Get Log Page handler, auth payload bounds
validation, and tcp error-path leak fixes (Bryam, Tianchu,
Geliang)
- nvme-tcp lockdep and workqueue fixes (Shin'ichiro, Kuniyuki,
Eric)
- Assorted other fixes and cleanups (John, Yao, Chao, Mateusz,
Achkinazi, Wentao)
- MD pull request via Yu Kuai:
- raid1/raid10 fixes for a deadlock in the read error recovery
path, error-path detection and bio accounting with cloned bios,
and an nr_pending leak in the REQ_ATOMIC bad-block error path
(Abd-Alrhman)
- PCI P2PDMA propagation from member devices to the RAID device
(Kiran)
- dm-raid bio requeue fix, and various smaller fixes and cleanups
(Benjamin, Chen, Li, Thorsten)
- Enable Clang lock context analysis for the block layer, with the
accompanying annotations across queue limits, the blk_holder_ops
callbacks, crypto, cgroup, iocost, kyber and mq-deadline (Bart)
- Block status code infrastructure work: a tagged status table, a
str_to_blk_op() helper, a bio_endio_status() helper, and on top of
that a new configurable block-layer error injection facility
(Christoph)
- DRBD netlink rework, replacing the genl_magic machinery with explicit
netlink serialization and moving the DRBD UAPI headers to
include/uapi/linux/ (Christoph Böhmwalder)
- bvec improvements: a bvec_folio() helper and making the bvec_iter
helpers proper inline functions (Willy, Christoph)
- ublk cleanups and a canceling-flag fix for the disk-not-allocated
case (Caleb, Ming)
- Partition handling fixes: bound the AIX pp_count scan, fix an of_node
refcount leak, and replace __get_free_page() with kmalloc() (Bryam,
Wentao, Mike)
- Convert numa_node to int in blk_mq_hw_ctx and ->init_request, and add
WQ_PERCPU to the block workqueue users (Mateusz, Marco)
- Block statistics and tracing: propagate in-flight to the whole disk
on partition IO, export passthrough stats, and a new
block_rq_tag_wait tracepoint (Tang, Keith, Aaron)
- A round of removals, unexports and cleanups across bio, direct-io and
the bvec helpers (Christoph)
- Various driver fixes (mtip32xx use-after-free, rbd snap_count
validation and strscpy conversion, nbd socket lockdep reclassify,
virtio-blk zone report clamp, floppy) and a batch of MAINTAINERS
email/list updates (Coly, Li, Yu, Christoph Böhmwalder)
- Other little fixes and cleanups all over
* tag 'for-7.2/block-20260615' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (117 commits)
MAINTAINERS: Update Coly Li's email address
block: check bio split for unaligned bvec
nbd: Reclassify sockets to avoid lockdep circular dependency
block: add configurable error injection
block: add a str_to_blk_op helper
block: add a "tag" for block status codes
block: add a macro to initialize the status table
floppy: Drop unused pnp driver data
block: propagate in_flight to whole disk on partition I/O
virtio-blk: clamp zone report to the report buffer capacity
block: optimize I/O merge hot path with unlikely() hints
drivers/block/rbd: Use strscpy() to copy strings into arrays
partitions: aix: bound the pp_count scan to the ppe array
block: Enable lock context analysis
block/mq-deadline: Make the lock context annotations compatible with Clang
block/Kyber: Make the lock context annotations compatible with Clang
block/blk-mq-debugfs: Improve lock context annotations
block/blk-iocost: Inline iocg_lock() and iocg_unlock()
block/blk-iocost: Split ioc_rqos_throttle()
block/crypto: Annotate the crypto functions
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs updates from David Sterba:
"The most noticeable change is to enable large folios by default, it's
been in testing for a few releases. Related to that is huge folio
support (still under experimental config). Otherwise a few ioctl
updates, performance improvements and usual fixes and core changes.
User visible changes:
- enable large folios by default, added in 6.17 (under experimental
build), no feature limitations, a big change internally
- new ioctl to return raw checksums to userspace (a bit tricky given
compression and tail extents), can be used for mkfs and
deduplication optimizations
- provide stable UUID for e.g. overlayfs and temp_fsid, also
reflected in statvfs() field f_fsid, internal dev_t is hashed in to
allow cloning
- add 32bit compat version of GET_SUBVOL_INFO ioctl
- in experimental build, support huge folios (up to 2M)
Performance related improvements/changes:
- limit bio size to the estimated optimum derived from the queue,
this prevents build up of too much data for writeback, which could
cause latency spikes (reported improvement 15% on sequential
writes)
- don't force direct IO to be serialized, forgotten change during
mount API port, brings back +60% of throughput
- lockless calculation of number of shrinkable extent maps, improve
performance with many memcg allocated objects
Notable fixes:
- in zoned mode, fix a deadlock due to zone reclaim and relocation
when space needs to be flushed
- don't trim device which is internally not tracked as writeable
(e.g. when missing device is being rescanned)
- fix deadlock when cloning inline extent and mounted with
flushoncommit
- fix false IO failures after direct IO falls back to buffered write
in some cases
Core:
- remove COW fixup mechanism completely; detect and fix changes to
pages outside of filesystem tracking, guaranteed since 5.8, grace
period is over
- remove 2K block size support, experimental to test subpage code on
x86_64 but now it would block folio changes
- tree-checker improvements of:
- free-space cache and tree items
- root reference and backref items
- extent state exceptions in reloc tree
- subpage mode updates:
- code optimizations, simplify tracking bitmaps
- re-enable readahead of compressed extent
- extend bitmap size to cover huge folios
- add tracepoints related to sync, tree-log and transactions
- device stats item tracking unification, remove item if there are no
stats recorded, also don't leave stale stats on replaced device
- allow extent buffer pages to be allocated as movable, to help page
migration
- added checks for proper extent buffer release
- btrfs.ko code size reduction due to transaction abort call
simplifications
- several struct size reductions
- more auto free conversions
- more verbose assertions"
* tag 'for-7.2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (130 commits)
btrfs: fix use-after-free after relocation failure with concurrent COW
btrfs: move WARN_ON on unexpected error in __add_tree_block()
btrfs: move locking into btrfs_get_reloc_bg_bytenr()
btrfs: lzo: reject compressed segment that overflows the compressed input
btrfs: retry faulting in the pages after a zero sized short direct write
btrfs: fix incorrect buffered IO fallback for append direct writes
btrfs: fix false IO failure after falling back to buffered write
btrfs: use verbose assertions in backref.c
btrfs: print a message when a missing device re-appears
btrfs: do not trim a device which is not writeable
btrfs: return real error after lookup failure in btrfs_ioctl_default_subvol()
btrfs: use mapping shared locking for reading super block
btrfs: use lockless read in nr_cached_objects shrinker callback
btrfs: switch local indicator variables to bools
btrfs: send: pass bool for pending_move and refs_processed parameters
btrfs: use shifts for sectorsize and nodesize
btrfs: fix deadlock cloning inline extent when using flushoncommit
btrfs: allocate eb-attached btree pages as movable
btrfs: add 32-bit compat ioctl for BTRFS_IOC_GET_SUBVOL_INFO
btrfs: derive f_fsid from on-disk fsid and dev_t
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi updates from Mark Brown:
"This has been quite a busy release, mainly due to the subsystem wide
work Johan Hovold has done to modernise resource allocation for the
subsystem on probe. The subsystem did some very clever allocation
management pre devm which didn't quite mesh comfortably with managed
allocations and made it far too easy to introduce error handling and
removal bugs.
- Cleanup and simplification of controller struct allocation, moving
everything over to devm and making the devm APIs more robust, from
Johan Hovold
- Support for spi-mem devices that don't assert chip select and
support for a secondary read command for memory mapped flashes,
some commits for this are shared with mtd.
- Support for SpacemiT K1"
* tag 'spi-v7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (118 commits)
spi: Fix mismatched DT property access types
spi: xilinx: use FIFO occupancy register to determine buffer size
spi: spi-mem: Fix spi_controller_mem_ops kdoc
spi: xilinx: let transfers timeout in case of no IRQ
spi: dt-bindings: nuvoton,npcm750-fiu: Convert to DT schema
spi: meson-spifc: fix runtime PM leak on remove
spi: Use named initializers for platform_device_id arrays
spi: rzv2h-rspi: Add suspend/resume support
spi: dw-pci: remove redundant pci_free_irq_vectors() calls
spi: ep93xx: fix double-free of zeropage on DMA setup failure
spi: cadence-xspi: Revert COMPILE_TEST support
spi: cadence-xspi: Support 32bit and 64bit slave dma interface
spi: tegra210-quad: Allocate DMA memory for DMA engine
spi: imx: replace dmaengine_terminate_all() with dmaengine_terminate_sync()
spi: fsl-lpspi: terminate the RX channel on TX prepare failure path
spi: fsl-lpspi: replace dmaengine_terminate_all() with dmaengine_terminate_sync()
spi: atmel: fix DMA channel and bounce buffer leaks
spi: omap2-mcspi: Use of_device_get_match_data()
spi: Use named initializers for arrays of i2c_device_data
spi: aspeed: Replace VLA parameter with flat pointer in calibration helper
...
|
|
gitolite.kernel.org:pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
"Futex updates:
- Optimize futex hash bucket access patterns (Peter Zijlstra)
- Large series to address the robust futex unlock race for real, by
Thomas Gleixner:
"The robust futex unlock mechanism is racy in respect to the
clearing of the robust_list_head::list_op_pending pointer because
unlock and clearing the pointer are not atomic.
The race window is between the unlock and clearing the pending op
pointer. If the task is forced to exit in this window, exit will
access a potentially invalid pending op pointer when cleaning up
the robust list.
That happens if another task manages to unmap the object
containing the lock before the cleanup, which results in an UAF.
In the worst case this UAF can lead to memory corruption when
unrelated content has been mapped to the same address by the time
the access happens.
User space can't solve this problem without help from the kernel.
This series provides the kernel side infrastructure to help it
along:
1) Combined unlock, pointer clearing, wake-up for the
contended case
2) VDSO based unlock and pointer clearing helpers with a
fix-up function in the kernel when user space was interrupted
within the critical section.
... with help by André Almeida:
- Add a note about robust list race condition (André Almeida)
- Add self-tests for robust release operations (André Almeida)
Context analysis updates:
- Implement context analysis for 'struct rt_mutex'. (Bart Van Assche)
- Bump required Clang version to 23 (Marco Elver)
Guard infrastructure updates:
- Series to remove NULL check from unconditional guards (Dmitry
Ilvokhin)
Lockdep updates:
- Restore self-test migrate_disable() and sched_rt_mutex state on
PREEMPT_RT (Karl Mehltretter)
Membarriers updates:
- Use per-CPU mutexes for targeted commands (Aniket Gattani)
- Modernize membarrier_global_expedited with cleanup guards (Aniket
Gattani)
- Add rseq stress test for CFS throttle interactions (Aniket Gattani)
percpu-rwsems updates:
- Extract __percpu_up_read() to optimize inlining overhead (Dmitry
Ilvokhin)
Seqlocks updates:
- Allow UBSAN_ALIGNMENT to fail optimizing (Heiko Carstens)
Lock tracing:
- Add contended_release tracepoint to sleepable locks such as
mutexes, percpu-rwsems, rtmutexes, rwsems and semaphores (Dmitry
Ilvokhin)
MAINTAINERS updates:
- MAINTAINERS: Add RUST [SYNC] entry (Boqun Feng)
Misc updates and fixes by Randy Dunlap, YE WEI-HONG, Fabricio Parra,
Dmitry Ilvokhin and Peter Zijlstra"
* tag 'locking-core-2026-06-14' of gitolite.kernel.org:pub/scm/linux/kernel/git/tip/tip: (36 commits)
locking: Add contended_release tracepoint to sleepable locks
locking/percpu-rwsem: Extract __percpu_up_read()
tracing/lock: Remove unnecessary linux/sched.h include
futex: Optimize futex hash bucket access patterns
rust: sync: completion: Mark inline complete_all and wait_for_completion
MAINTAINERS: Add RUST [SYNC] entry
cleanup: Specify nonnull argument index
selftests: futex: Add tests for robust release operations
Documentation: futex: Add a note about robust list race condition
x86/vdso: Implement __vdso_futex_robust_try_unlock()
x86/vdso: Prepare for robust futex unlock support
futex: Provide infrastructure to plug the non contended robust futex unlock race
futex: Add robust futex unlock IP range
futex: Add support for unlocking robust futexes
futex: Cleanup UAPI defines
x86: Select ARCH_MEMORY_ORDER_TSO
uaccess: Provide unsafe_atomic_store_release_user()
futex: Provide UABI defines for robust list entry modifiers
futex: Move futex related mm_struct data into a struct
futex: Make futex_mm_init() void
...
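
The robust futex unlock race described in the message above comes down to two non-atomic steps (unlock, then clear list_op_pending) with an exit in between. The following is a toy, single-threaded model of that ordering, not kernel or libc code: ToyAddressSpace, ToyTask, racy_unlock(), exit_cleanup() and run() are hypothetical names invented for this sketch, and an unmapped address raises an error here, whereas in the real case the address may have been reused for unrelated content.

```python
class ToyAddressSpace:
    """Hypothetical stand-in for a process address space (address -> word)."""

    def __init__(self):
        self._words = {}

    def map(self, addr, value=0):
        self._words[addr] = value

    def unmap(self, addr):
        del self._words[addr]

    def store(self, addr, value):
        if addr not in self._words:
            raise LookupError(f"store to unmapped address {addr:#x}")
        self._words[addr] = value

    def load(self, addr):
        if addr not in self._words:
            raise LookupError(f"load from unmapped address {addr:#x}")
        return self._words[addr]


class ToyTask:
    """Only models the pending-op pointer of the robust list head."""

    def __init__(self):
        self.list_op_pending = None


def racy_unlock(task, mem, lock_addr, killed_in_window):
    mem.store(lock_addr, 0)        # step 1: release the lock word
    if killed_in_window:
        return                     # task is forced to exit inside the window
    task.list_op_pending = None    # step 2: clear the pending-op pointer


def exit_cleanup(task, mem):
    """Model of exit-time robust list handling for the pending op only."""
    addr = task.list_op_pending
    if addr is None:
        return "clean"
    try:
        mem.load(addr)
    except LookupError:
        return "stale pointer"     # the toy version of the use-after-free
    return "pending lock handled"


def run(killed_in_window):
    mem, task, lock = ToyAddressSpace(), ToyTask(), 0x1000
    mem.map(lock, value=1)         # lock is held
    task.list_op_pending = lock    # unlock operation announced
    racy_unlock(task, mem, lock, killed_in_window)
    if killed_in_window:
        mem.unmap(lock)            # another task sees the lock free and unmaps it
    return exit_cleanup(task, mem)


if __name__ == "__main__":
    print("normal unlock:   ", run(killed_in_window=False))
    print("killed in window:", run(killed_in_window=True))
```

The model only shows why the two steps need to appear atomic to the exit path; it does not represent how the series implements that (the combined unlock operation and the VDSO helpers with kernel fix-up).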
|
|
gitolite.kernel.org:pub/scm/linux/kernel/git/tip/tip
Pull timer core updates from Thomas Gleixner:
"Updates for the time/timer core subsystem:
- Harden the user space controllable hrtimer interfaces further to
protect against unprivileged DoS attempts by arming timers in the
past.
- Add per-capacity hierarchies to the timer migration code to prevent
timer migration across different capacity domains. This code has
been disabled last minute as there is a pathological problem with
SoCs which advertise a larger number of capacity domains. The
problem is under investigation and the code won't be active before
v7.3, but that turned out to be less intrusive than a full revert
as it preserves the preparatory steps and allows people to work on
the final resolution
- Export time namespace functionality as a recent user can be built
as a module.
- Initialize the jiffies clocksource before using it. The recent
hardening against time moving backward requires that the related
members of struct clocksource have been initialized, otherwise it
clamps the readout to 0, which makes time stand still and causes
boot delays.
- Fix a more than twenty year old PID reference count leak in an
error path of the POSIX CPU timer code.
- The usual small fixes, improvements and cleanups all over the
place"
* tag 'timers-core-2026-06-13' of gitolite.kernel.org:pub/scm/linux/kernel/git/tip/tip: (31 commits)
posix-cpu-timers: Fix pid refcount leak in do_cpu_nanosleep() error path
time/jiffies: Register jiffies clocksource before usage
timers/migration: Temporarily disable per capacity hierarchies
timers/migration: Turn tmigr_hierarchy level_list into a flexible array
timers/migration: Deactivate per-capacity hierarchies under nohz_full
timers/migration: Fix hotplug migrator selection target on asymetric capacity machines
ntsync: Honour caller's time namespace for absolute MONOTONIC timeouts
time/namespace: Export init_time_ns and do_timens_ktime_to_host()
timers/migration: Update stale @online doc to @available
timers: Fix flseep() typo in kernel-doc comment
hrtimer: Fix the bogus return type of __hrtimer_start_range_ns()
hrtimer: Return ktime_t from hrtimer_get_next_event()/hrtimer_next_event_without()
clocksource: Clean up clocksource_update_freq() functions
alarmtimer: Remove stale return description from alarm_handle_timer()
selftests/posix_timers: Use CLOCK_THREAD_CPUTIME_ID for ITIMER_PROF measurements
scripts/timers: Add timer_migration_tree.py
timers/migration: Handle capacity in connect tracepoints
timers/migration: Split per-capacity hierarchies
timers/migration: Track CPUs in a hierarchy
timers/migration: Abstract out hierarchy to prepare for CPU capacity awareness
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from Juergen Gross:
- Several small cleanups of various Xen related drivers
(xen/platform-pci, xen-balloon, xenbus, xen/mcelog)
- Cleanup for Xen PV-mode related code (includes dropping the Xen
debugfs code)
- Drop the additional lazy mmu mode tracking done by Xen specific code
* tag 'for-linus-7.2-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/xenbus: Replace strcpy() with memcpy()
x86/xen: Replace generic lazy tracking with cpu specific one
x86/xen: Get rid of last XEN_LAZY_MMU uses
mm: Refactor lazy_mmu_mode_pause() and lazy_mmu_mode_resume()
x86/xen: Change interface of xen_mc_issue()
x86/xen: Drop lazy mode from trace entries
x86/xen: Remove Xen debugfs support
x86/xen: Cleanup Xen related trace points
x86/xen: Guard PV-only stuff in xen-ops.h with CONFIG_XEN_PV
xen: balloon: Replace sprintf() with sysfs_emit()
xen/mcelog: mark g_physinfo, ncpus and xen_mce_chrdev_device as __ro_after_init
xen: constify xsd_errors array
xen/platform-pci: Simplify initialization of pci_device_id array
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs writeback updates from Christian Brauner:
- Fix a race between cgroup_writeback_umount() and inode_switch_wbs()
When a container exits, a race between cgroup_writeback_umount() and
inode_switch_wbs()/cleanup_offline_cgwb() can trigger "VFS: Busy
inodes after unmount" followed by a use-after-free on percpu
counters.
There is a window between inode_prepare_wbs_switch() returning true
(having passed the SB_ACTIVE check and grabbed the inode) and the
subsequent wb_queue_isw() call: if cgroup_writeback_umount() observes
the global isw_nr_in_flight counter as non-zero but flush_workqueue()
finds nothing queued yet, it returns early - leaving a held inode
reference that blocks evict_inodes() and a later iput() that hits
freed percpu counters.
The race is closed by covering the window from
inode_prepare_wbs_switch() through wb_queue_isw() with an RCU
read-side critical section and synchronizing in the umount path.
On top of that the now-dead rcu_barrier() left over from the
queue_rcu_work() era is removed, and the global
synchronize_rcu()/flush_workqueue() pair is replaced with a per-sb
in-flight counter plus pin/unpin/drain helpers so umount no longer
serializes against switch activity on unrelated superblocks.
Under cgroup writeback churn on a 16 vCPU guest this takes umount
latency from ~92-138ms p50 down to ~5-8ms p50 and the cumulative cost
of cgroup_writeback_umount() from ~62ms to ~4us per call.
The initial race fix is kept separate and minimal so it backports
cleanly to stable trees that still queue switches via
queue_rcu_work().
- Improve write performance with RWF_DONTCACHE
Dirty DONTCACHE pages are now tracked per bdi_writeback so that the
writeback flusher can be kicked in a targeted fashion for
IOCB_DONTCACHE writes instead of relying on global writeback, and the
PG_dropbehind flag is preserved when a folio is split.
* tag 'vfs-7.2-rc1.writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
mm: kick writeback flusher for IOCB_DONTCACHE with targeted dirty tracking
mm: track DONTCACHE dirty pages per bdi_writeback
mm: preserve PG_dropbehind flag during folio split
writeback: use a per-sb counter to drain inode wb switches at umount
writeback: drop now-unnecessary rcu_barrier() in cgroup_writeback_umount()
writeback: fix race between cgroup_writeback_umount() and inode_switch_wbs()
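
The per-sb in-flight counter with pin/unpin/drain helpers described above can be sketched in userspace C11 as follows. This is a minimal illustration and not the kernel code: struct demo_sb, demo_isw_pin(), demo_isw_unpin() and demo_isw_drained() are hypothetical names, the real helpers and their memory ordering are not shown in this log, and a real drain would sleep until the counter reaches zero rather than return a boolean.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a superblock; not the kernel's struct super_block. */
struct demo_sb {
	atomic_bool active;        /* models the SB_ACTIVE check */
	atomic_int isw_in_flight;  /* per-sb count of in-flight wb switches */
};

/*
 * Pin: announce an in-flight switch first, then check that the sb is
 * still active. Incrementing before the check means umount either sees
 * the counter or the switcher sees the sb going away.
 */
static bool demo_isw_pin(struct demo_sb *sb)
{
	atomic_fetch_add(&sb->isw_in_flight, 1);
	if (!atomic_load(&sb->active)) {
		/* Umount already started: back out and refuse the switch. */
		atomic_fetch_sub(&sb->isw_in_flight, 1);
		return false;
	}
	return true;
}

/* Unpin: the switch for this sb has finished. */
static void demo_isw_unpin(struct demo_sb *sb)
{
	atomic_fetch_sub(&sb->isw_in_flight, 1);
}

/*
 * Umount side: stop new pins, then report whether this sb has drained.
 * Only this sb's counter is consulted, so switch activity on unrelated
 * superblocks does not delay the caller.
 */
static bool demo_isw_drained(struct demo_sb *sb)
{
	atomic_store(&sb->active, false);
	return atomic_load(&sb->isw_in_flight) == 0;
}
```

With two demo_sb instances, a pin held on one leaves demo_isw_drained() true for the other, which is the property the per-sb counter is meant to provide over a single global counter.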
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs directory delegations from Christian Brauner:
"This contains the VFS prerequisites for supporting directory
delegations in nfsd via CB_NOTIFY callbacks.
The filelock core gains support for ignoring delegation breaks for
directory change events together with an inode_lease_ignore_mask()
helper, and fsnotify gains fsnotify_modify_mark_mask() and a
FSNOTIFY_EVENT_RENAME data type.
With this in place nfsd can request delegations on directories and set
up inotify watches to trigger sending CB_NOTIFY events to clients
instead of having every directory change break the delegation.
New tracepoints are added to fsnotify() and to the start of
break_lease(), and trace_break_lease_block() is passed the currently
blocking lease instead of the new one.
A follow-up fix moves the LEASE_BREAK_* flags out of
#ifdef CONFIG_FILE_LOCKING to fix the build for CONFIG_FILE_LOCKING=n
configurations"
* tag 'vfs-7.2-rc1.directory.delegations' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
filelock: move LEASE_BREAK_* flags out of #ifdef CONFIG_FILE_LOCKING
fsnotify: add FSNOTIFY_EVENT_RENAME data type
fsnotify: add fsnotify_modify_mark_mask()
fsnotify: new tracepoint in fsnotify()
filelock: add an inode_lease_ignore_mask helper
filelock: add a tracepoint to start of break_lease()
filelock: add support for ignoring deleg breaks for dir change events
filelock: pass current blocking lease to trace_break_lease_block() rather than "new_fl"
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs inode updates from Christian Brauner:
"This extends the lockless ->i_count handling.
iput() could already decrement any value greater than one locklessly
but acquiring a reference always required taking inode->i_lock. Now
acquiring a reference is lockless as long as the count was already at
least 1, i.e., only the 0->1 and 1->0 transitions take the lock.
This avoids the lock for the common cases of nfs calling into the
inode hash and btrfs using igrab(). Cleanup-wise icount_read_once() is
added to line up with inode_state_read_once() and the open-coded
->i_count loads across the tree are converted, and ihold() is
relocated and tidied up.
On top of that some stale lock ordering annotations are retired from
the inode hash code: iunique() no longer takes the hash lock since the
inode hash became RCU-searchable and s_inode_list_lock is no longer
taken under the hash lock either"
* tag 'vfs-7.2-rc1.inode' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
fs: retire stale lock ordering annotations from inode hash
fs: allow lockless ->i_count bumps as long as it does not transition 0->1
fs: relocate and tidy up ihold()
fs: add icount_read_once() and stop open-coding ->i_count loads
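
The rule that only the 0->1 and 1->0 transitions take the lock can be sketched in userspace C11 as below. This is a simplified model under stated assumptions, not the kernel implementation: struct demo_inode, demo_get() and demo_put() are hypothetical names, and the lock_taken counter merely stands in for the places where the real code would hold inode->i_lock.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an inode; not the kernel's struct inode. */
struct demo_inode {
	atomic_int count;  /* models ->i_count */
	int lock_taken;    /* how often the locked slow path was needed */
};

/* Bump the count without the lock, but only if it is already >= 1. */
static bool demo_get_lockless(struct demo_inode *inode)
{
	int old = atomic_load(&inode->count);

	while (old >= 1) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&inode->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Acquire a reference; only the 0->1 transition uses the slow path. */
static void demo_get(struct demo_inode *inode)
{
	if (demo_get_lockless(inode))
		return;
	inode->lock_taken++;  /* real code would hold the lock here */
	atomic_fetch_add(&inode->count, 1);
}

/* Drop a reference; only the 1->0 transition uses the slow path. */
static void demo_put(struct demo_inode *inode)
{
	int old = atomic_load(&inode->count);

	while (old > 1) {
		if (atomic_compare_exchange_weak(&inode->count, &old, old - 1))
			return;
	}
	inode->lock_taken++;  /* real code would hold the lock here */
	atomic_fetch_sub(&inode->count, 1);
}
```

Taking two references and dropping both on a fresh demo_inode exercises the slow path exactly twice, once for 0->1 and once for 1->0, while the middle get and put stay lockless.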
|
|
Add the contended_release trace event. This tracepoint fires on the
holder side when a contended lock is released, complementing the
existing contention_begin/contention_end tracepoints which fire on the
waiter side.
This enables correlating lock hold time under contention with waiter
events by lock address.
Add trace_contended_release()/trace_call__contended_release() calls to
the slowpath unlock paths of sleepable locks: mutex, rtmutex, semaphore,
rwsem, percpu-rwsem, and RT-specific rwbase locks.
Where possible, trace_contended_release() fires before the lock is
released and before the waiter is woken. For some lock types, the
tracepoint fires after the release but before the wake. Making the
placement consistent across all lock types is not worth the added
complexity.
For reader/writer locks, the tracepoint fires for every reader releasing
while a writer is waiting, not only for the last reader.
Signed-off-by: Dmitry Ilvokhin <d@ilvokhin.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Acked-by: Usama Arif <usama.arif@linux.dev>
Link: https://patch.msgid.link/02f4f6c5ce6761e7f6587cf0ff2289d962ecddd4.1780506267.git.d@ilvokhin.com
|
|
None of the trace events in lock.h reference anything from
linux/sched.h. Remove the unnecessary include.
Signed-off-by: Dmitry Ilvokhin <d@ilvokhin.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Usama Arif <usama.arif@linux.dev>
Link: https://patch.msgid.link/85e14a0685c1d46c3e80de2dc01358a48169b263.1780506267.git.d@ilvokhin.com
|
|
Decode the 'RECLAIM_ZONES' state in tracepoints. Currently only the
numerical state is shown in the tracepoints.
Reviewed-by: Boris Burkov <boris@bur.io>
Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Print the type of the inode (directory, regular file, symlink, etc) to
facilitate debugging.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
btrfs_sync_log() is one of the main functions called during a fsync.
Add trace events for when entering and exiting that function.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
btrfs_log_new_name() is an important function that affects inode logging
and is called during link and rename operations. Add trace events for when
entering and exiting that function.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
btrfs_record_new_subvolume() is an important operation that affects
inode logging and is called during subvolume creation. Add a trace event
for it to help debug issues.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
btrfs_record_snapshot_destroy() is an important operation that affects
inode logging and is called during subvolume/snapshot deletion as well as
during rmdir. Add a trace event for it to help debug issues.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
btrfs_record_unlink_dir() is an important operation that affects inode
logging and is called during unlink and rename operations. Add a trace
event for it to help debug issues.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
log_new_delayed_dentries() is an important step called during a fsync, as
well as during rename and link operations on inodes that were previously
logged. Add trace events for when entering and exiting that function.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
log_conflicting_inodes() is an important step called during a fsync, as
well as during rename and link operations on inodes that were previously
logged. Add trace events for when entering and exiting that function.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
add_conflicting_inode() is an important step called during a fsync, as
well as during rename and link operations on inodes that were previously
logged. Add trace events for when entering and exiting that function.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
log_new_dir_dentries() is an important step called during a fsync, as
well as during rename and link operations on inodes that were previously
logged. Add trace events for when entering and exiting that function.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
log_all_new_ancestors() is an important step called during a fsync, as
well as during rename and link operations on inodes that were previously
logged. Add trace events for when entering and exiting that function.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
btrfs_log_all_parents() is an important step called during a fsync, as
well as during rename and link operations on inodes that were previously
logged. Add trace events for when entering and exiting that function.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
btrfs_log_inode() is one of the most important steps called during a fsync,
as well as during rename and link operations on inodes that were previously
logged. Add trace events for when entering and exiting that function.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
btrfs_log_inode_parent() is one of the most important steps called during
a fsync operation as well as during rename and link operations on inodes
that were previously logged. Add trace events for when entering and
exiting that function.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Currently we only have a trace event for when a fsync operation starts,
but this alone is not very helpful. Add a trace event for when fsync
finishes, which reports its return value, so that using tracing we can
see which other trace events happened in between (several will be added
soon for inode logging steps) and even measure execution time.
So rename the existing trace event btrfs_sync_file to
btrfs_sync_file_enter and add the trace event btrfs_sync_file_exit.
The naming is similar to what ext4 does (ext4_sync_file_enter and
ext4_sync_file_exit) and with similar information reported.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Currently the trace event is fired only when a transaction is fully
complete (its state is TRANS_STATE_COMPLETED). However during a
transaction commit we go through several states and as soon as the
state reaches TRANS_STATE_UNBLOCKED, another transaction can start.
Therefore it's useful to track every transaction state change during
the commit of a transaction, so that we can see if a new transaction
is started before the current one is completed. Add the transaction
state to the transaction commit event and call the event every time
we change the transaction state during commit.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
While tracing it's useful to know not just when a transaction is committed
or aborted, but also when a new one is started. So add a trace event for
transaction starts.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
While tracing it's useful to know not just when a transaction is committed
but also when one is aborted. So add a trace event for transaction aborts.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Include the in_fsync value from the transaction handle so that we can know
if a transaction commit was triggered by a fsync call.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
The transaction commit tracepoint prints fs_info->generation as if it
were the ID of the committed transaction but this does not always match
that ID. This is because the trace point is called in the transaction
commit path after the transaction is in the TRANS_STATE_COMPLETED state,
which means another transaction may have already started (which can happen
as soon as the transaction state was set to TRANS_STATE_UNBLOCKED), in
which case fs_info->generation was incremented and does not correspond
to the committed transaction anymore.
So fix this by passing a transaction handle to the trace event instead of
fs_info. This will also allow the trace event to dump other useful
information about the transaction later.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
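
The mismatch described in this commit can be reproduced with a small userspace model. This is only a sketch under simplifying assumptions: struct demo_fs_info, struct demo_trans and the demo_* functions are hypothetical names, and starting a transaction is reduced to incrementing a global generation counter and recording it in the handle.

```c
#include <stdint.h>

/* Hypothetical models; not the btrfs structures. */
struct demo_fs_info {
	uint64_t generation;  /* bumped whenever a new transaction starts */
};

struct demo_trans {
	uint64_t transid;     /* ID fixed when this transaction started */
	struct demo_fs_info *fs_info;
};

/* Start a transaction: bump the global generation and record it. */
static struct demo_trans demo_start_transaction(struct demo_fs_info *fs)
{
	struct demo_trans t;

	fs->generation++;
	t.transid = fs->generation;
	t.fs_info = fs;
	return t;
}

/* Old behaviour: the trace event reads the global generation. */
static uint64_t demo_trace_id_from_fs_info(const struct demo_fs_info *fs)
{
	return fs->generation;
}

/* New behaviour: the trace event reads the ID stored in the handle. */
static uint64_t demo_trace_id_from_trans(const struct demo_trans *t)
{
	return t->transid;
}
```

If a second transaction starts before the first one is reported as complete, demo_trace_id_from_fs_info() yields the second transaction's ID, while demo_trace_id_from_trans() on the first handle still yields the ID of the transaction that was actually committed.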
|
|
A transaction commit is global, not per root, yet we currently always
emit a root id field matching the root tree for no good reason, which
only causes confusion. So remove the root field.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
There is no need to add a double negation (!!) to the update field because
the field has a boolean type.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Drop the lazy mode (cpu or mmu) from the xen_mc_batch and xen_mc_issue
trace entries.
This is done in preparation of removing the xen_lazy_mode percpu
variable.
Signed-off-by: Juergen Gross <jgross@suse.com>
Message-ID: <20260526150514.129330-2-jgross@suse.com>
|
|
Since dropping Xen-PV support for 32-bit, include/trace/events/xen.h
contains several stale trace point definitions. Remove them.
Signed-off-by: Juergen Gross <jgross@suse.com>
Message-ID: <20260522152114.77319-3-jgross@suse.com>
|