| Age | Commit message | Author | Files | Lines |
|
The read_seqbegin/need_seqretry/done_seqretry API is cumbersome and
error prone. With the new helper the "typical" code like
    int seq, nextseq;
    unsigned long flags;

    nextseq = 0;
    do {
        seq = nextseq;
        flags = read_seqbegin_or_lock_irqsave(&seqlock, &seq);

        // read-side critical section

        nextseq = 1;
    } while (need_seqretry(&seqlock, seq));
    done_seqretry_irqrestore(&seqlock, seq, flags);

can be rewritten as

    scoped_seqlock_read (&seqlock, ss_lock_irqsave) {
        // read-side critical section
    }
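To illustrate how a scoped read loop can hide the retry bookkeeping, here is a minimal userspace sketch. It is not the kernel's scoped_seqlock_read() implementation: it covers only the lockless retry pass (no lock fallback, no irqsave), and every name in it (seq_demo, seq_scope, seq_scope_next, scoped_seq_read_demo, demo_write, demo_passes) is hypothetical. It is exercised single-threaded, purely to show the control flow.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for a sequence counter; NOT the kernel's seqlock_t. */
struct seq_demo {
	atomic_uint sequence;	/* odd while a writer is in progress */
	int a, b;		/* data covered by the counter */
};

/* Per-scope state that the macro hides from the caller. */
struct seq_scope {
	unsigned int start;
	bool started;
};

/* Writer: bump the counter before and after the update. */
static void demo_write(struct seq_demo *s, int a, int b)
{
	atomic_fetch_add(&s->sequence, 1);
	s->a = a;
	s->b = b;
	atomic_fetch_add(&s->sequence, 1);
}

/* Returns true when the body should run (first pass or retry). */
static bool seq_scope_next(struct seq_scope *sc, struct seq_demo *s)
{
	if (sc->started && atomic_load(&s->sequence) == sc->start)
		return false;	/* the last pass saw a stable counter */

	/* Wait for an even value, i.e. no writer in progress. */
	do {
		sc->start = atomic_load(&s->sequence);
	} while (sc->start & 1);
	sc->started = true;
	return true;
}

#define scoped_seq_read_demo(s) \
	for (struct seq_scope sc_ = { 0, false }; seq_scope_next(&sc_, (s)); )

/* Returns the number of passes; optionally simulates a write during pass 1. */
static int demo_passes(bool disturb, int *sum)
{
	struct seq_demo s;
	int passes = 0;

	atomic_init(&s.sequence, 0);
	s.a = 0;
	s.b = 0;

	scoped_seq_read_demo(&s) {
		passes++;
		*sum = s.a + s.b;
		if (disturb && passes == 1)
			demo_write(&s, 1, 2);	/* forces one retry */
	}
	return passes;
}
```

The caller writes only the body of the critical section; the begin/retry pairing lives in the macro, which is the property the commit message describes.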
Original idea by Oleg Nesterov; with contributions from Linus.
Originally-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
|
|
The 'old' argument in atomic_try_cmpxchg() and related functions is a
pointer to a normal non-atomic integer, which is not required to be
naturally aligned, unlike the atomic_t/atomic64_t types themselves.
In order to add an alignment check with CONFIG_DEBUG_ATOMIC into the
normal instrument_atomic_read_write() helper, change this check to use
the non-atomic instrument_read_write(), the same way that was done
earlier for try_cmpxchg() in commit ec570320b09f ("locking/atomic:
Correct (cmp)xchg() instrumentation").
This prevents warnings on m68k calling the 32-bit atomic_try_cmpxchg()
with 16-bit aligned arguments as well as several more architectures
including x86-32 when calling atomic64_try_cmpxchg() with 32-bit
aligned u64 arguments.
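The distinction between the atomic variable and the plain 'old' value can be seen in a userspace analogue. The sketch below uses C11 <stdatomic.h> rather than the kernel's atomic_t API, and inc_below() is a hypothetical helper: only the first argument of the compare-exchange is an atomic object, while the expected value is an ordinary int that is overwritten on failure.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Add 1 to *v unless it has already reached limit; true if incremented. */
static bool inc_below(atomic_int *v, int limit)
{
	/* 'old' is a plain int: it is read and written non-atomically. */
	int old = atomic_load(v);

	do {
		if (old >= limit)
			return false;
		/* On failure, 'old' is refreshed with the current value of *v. */
	} while (!atomic_compare_exchange_weak(v, &old, old + 1));

	return true;
}
```

Because 'old' is only ever accessed as a normal variable, instrumenting it as an atomic access (and applying atomic alignment checks to it) would be stricter than the operation requires.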
Reported-by: Finn Thain <fthain@linux-m68k.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/cover.1757810729.git.fthain@linux-m68k.org/
|
|
Reading the GPIO hardware number from a descriptor is a valid use-case
outside of the GPIO core. Export the symbol to consumers of GPIO
descriptors.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Link: https://lore.kernel.org/r/20251016-aspeed-gpiolib-include-v1-2-31201c06d124@linaro.org
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
Sumanth Korikkar says:
====================
Provide a new interface for dynamic configuration and
deconfiguration of hotplug memory on s390, with or without
memmap_on_memory support. It is a follow-up to the discussion with David
when memmap_on_memory support was introduced for s390, and adds support
for dynamic (de)configuration of memory:
https://lore.kernel.org/all/ee492da8-74b4-4a97-8b24-73e07257f01d@redhat.com/
https://lore.kernel.org/all/20241202082732.3959803-1-sumanthk@linux.ibm.com/
The original motivation for introducing memmap_on_memory on s390 was to
avoid using online memory to store struct pages metadata, particularly
for standby memory blocks. This became critical in cases where there was
an imbalance between standby and online memory, potentially leading to
boot failures due to insufficient memory for metadata allocation.
To address this, memmap_on_memory was utilized on s390. However, in its
current form, it adds struct pages metadata at the start of each memory
block at the time of addition (only standby memory), and this
configuration is static. It cannot be changed at runtime (for example,
when the user needs contiguous physical memory).
In order to provide more flexibility to the user and overcome the above
limitation, add an option to dynamically configure and deconfigure
hotpluggable memory blocks with/without memmap_on_memory.
With the new interface, s390 will not add all possible hotplug memory in
advance, like before, to make it visible in sysfs for online/offline
actions. Instead, before a memory block can be set online, it has to be
configured via a new interface in /sys/firmware/memory/memoryX/config,
which makes s390 similar to others, i.e. adding hotpluggable memory is
controlled by the user instead of being done at boot time.
s390 kernel sysfs interface to configure/deconfigure memory with
memmap_on_memory (with upcoming lsmem changes):
* Initial memory layout:
    lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
    RANGE                  SIZE  STATE    BLOCK  CONFIGURED  MEMMAP_ON_MEMORY
    0x00000000-0x7fffffff  2G    online   0-15   yes         no
    0x80000000-0xffffffff  2G    offline  16-31  no          yes
* Configure memory
    echo 1 > /sys/firmware/memory/memory16/config
    lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
    RANGE                  SIZE  STATE    BLOCK  CONFIGURED  MEMMAP_ON_MEMORY
    0x00000000-0x7fffffff  2G    online   0-15   yes         no
    0x80000000-0x87ffffff  128M  offline  16     yes         yes
    0x88000000-0xffffffff  1.9G  offline  17-31  no          yes
* Deconfigure memory
    echo 0 > /sys/firmware/memory/memory16/config
    lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
    RANGE                  SIZE  STATE    BLOCK  CONFIGURED  MEMMAP_ON_MEMORY
    0x00000000-0x7fffffff  2G    online   0-15   yes         no
    0x80000000-0xffffffff  2G    offline  16-31  no          yes
* Enable memmap_on_memory and online it.
  (Deconfigure first)
    echo 0 > /sys/devices/system/memory/memory5/online
    echo 0 > /sys/firmware/memory/memory5/config
    lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
    RANGE                  SIZE  STATE    BLOCK  CONFIGURED  MEMMAP_ON_MEMORY
    0x00000000-0x27ffffff  640M  online   0-4    yes         no
    0x28000000-0x2fffffff  128M  offline  5      no          no
    0x30000000-0x7fffffff  1.3G  online   6-15   yes         no
    0x80000000-0xffffffff  2G    offline  16-31  no          yes
  (Enable memmap_on_memory and online it)
    echo 1 > /sys/firmware/memory/memory5/memmap_on_memory
    echo 1 > /sys/firmware/memory/memory5/config
    echo 1 > /sys/devices/system/memory/memory5/online
    lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
    RANGE                  SIZE  STATE    BLOCK  CONFIGURED  MEMMAP_ON_MEMORY
    0x00000000-0x27ffffff  640M  online   0-4    yes         no
    0x28000000-0x2fffffff  128M  online   5      yes         yes
    0x30000000-0x7fffffff  1.3G  online   6-15   yes         no
    0x80000000-0xffffffff  2G    offline  16-31  no          yes
* Disable memmap_on_memory and online it.
  (Deconfigure first)
    echo 0 > /sys/devices/system/memory/memory5/online
    echo 0 > /sys/firmware/memory/memory5/config
    lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
    RANGE                  SIZE  STATE    BLOCK  CONFIGURED  MEMMAP_ON_MEMORY
    0x00000000-0x27ffffff  640M  online   0-4    yes         no
    0x28000000-0x2fffffff  128M  offline  5      no          yes
    0x30000000-0x7fffffff  1.3G  online   6-15   yes         no
    0x80000000-0xffffffff  2G    offline  16-31  no          yes
  (Disable memmap_on_memory and online it)
    echo 0 > /sys/firmware/memory/memory5/memmap_on_memory
    echo 1 > /sys/firmware/memory/memory5/config
    echo 1 > /sys/devices/system/memory/memory5/online
    lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
    RANGE                  SIZE  STATE    BLOCK  CONFIGURED  MEMMAP_ON_MEMORY
    0x00000000-0x7fffffff  2G    online   0-15   yes         no
    0x80000000-0xffffffff  2G    offline  16-31  no          yes
* Userspace changes:
lsmem/chmem tool is also changed to use the new interface. I will send
it to util-linux soon.
Patch 1 adds support for removal of boot-allocated memory blocks.
Patch 2 provides option to dynamically configure and deconfigure memory
with/without memmap_on_memory.
Patch 3 removes MHP_OFFLINE_INACCESSIBLE from s390. The mhp flag was
used to mark memory as not accessible until the memory hotplug online
phase begins. However, with patch 2, it is no longer essential. Memory
can be brought to an accessible state before adding memory, as the memory
is now added at runtime instead of at boot time.
Patch 4 removes the MEM_PREPARE_ONLINE/MEM_FINISH_OFFLINE notifiers. They
are no longer needed. Memory can be brought to an accessible state before
adding memory now, with runtime (de)configuration of memory.
Note: The patches apply to the linux-next branch.
v3:
Thanks David
* Avoid goto label in create_standby_sclp_mems().
* Use unsigned long instead of u64.
* Add Acked-by.
v2:
Thanks David
* Rename struct mblock/mblock_arg to struct sclp_mem/sclp_mem_arg.
* Rename all mblocks/mblock references to sclp_mems/sclp_mem -
structures, functions.
* Rename create_online_mblock() to create_configured_sclp_mem().
* Rename config_mblock_show()/config_mblock_store() to
config_sclp_mem_show()/config_sclp_mem_store().
* Remove contains_standby_increment() and
sclp_mem_notifier. sclp mem state change is performed when
adding/removing memory. sclp memory notifier - no longer needed with
this patchset.
* Recover sclp mem state when add_memory() fails.
* Refactor and add function init_sclp_mem().
* Use unsigned long instead of unsigned long long.
* Simplify and correct kobj handling. Thanks Heiko.
====================
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next
Marc Kleine-Budde says:
====================
pull-request: can-next 2025-10-17
The first patch is by me and adds support for an optional reset to the
m_can drivers.
Vincent Mailhol's patch targets all drivers and removes the
can_change_mtu() function, since the netdev's min and max MTU are
populated.
Markus Schneider-Pargmann contributes 4 patches to the m_can driver to
add am62 wakeup support.
The last 7 patches are by me and provide various cleanups to the m_can
driver.
* tag 'linux-can-next-for-6.19-20251017' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next:
can: m_can: m_can_get_berr_counter(): don't wake up controller if interface is down
can: m_can: m_can_tx_submit(): remove unneeded sanity checks
can: m_can: m_can_class_register(): remove error message in case devm_kzalloc() fails
can: m_can: m_can_interrupt_enable(): use m_can_write() instead of open coding it
net: m_can: convert dev_{dbg,info,err} -> netdev_{dbg,info,err}
can: m_can: hrtimer_callback(): rename to m_can_polling_timer()
can: m_can: m_can_init_ram(): make static
can: m_can: Support pinctrl wakeup state
can: m_can: Return ERR_PTR on error in allocation
can: m_can: Map WoL to device_set_wakeup_enable
dt-bindings: can: m_can: Add wakeup properties
can: treewide: remove can_change_mtu()
can: m_can: add support for optional reset
====================
Link: https://patch.msgid.link/20251017150819.1415685-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Changing the netif_carrier_*() state behind phylink's back has always
been prohibited because it messes with phylink's state tracking, and
means that phylink no longer guarantees to call the mac_link_down()
and mac_link_up() methods at the appropriate times. This was later
documented in the sfp-phylink network driver conversion guide.
stmmac was converted to phylink in 2019, but nothing was done with the
"PCS" code. Since then, apart from the updates as part of phylink
development, nothing has happened with stmmac to improve its use of
phylink, or even to address this point.
A couple of years ago, a has_integrated_pcs boolean was added by Bart,
which later became the STMMAC_FLAG_HAS_INTEGRATED_PCS flag, to avoid
manipulating the netif_carrier_*() state. This flag is mis-named,
because whenever the stmmac is synthesized for its native SGMII, TBI
or RTBI interfaces, it has an "integrated PCS". This boolean/flag
actually means "ignore the status from the integrated PCS".
Discussing with Bart, the reasons for this are lost to the winds of
time (which is why we should always document the reasons in the commit
message.)
RGMII also has in-band status, and the dwmac cores and stmmac code
support this, but with one bug that saves the day.
When dwmac cores are synthesised for RGMII only, they do not contain
an integrated PCS, and so priv->dma_cap.pcs is clear, which prevents
(incorrectly) the "RGMII PCS" being used, meaning we don't read the
in-band status. However, a core synthesised for RGMII and also SGMII,
TBI or RTBI will have this capability bit set, thus making these
code paths reachable.
The Jetson Xavier NX uses RGMII mode to talk to its PHY, and removing
the incorrect check for priv->dma_cap.pcs reveals the theoretical issue
with netif_carrier_*() manipulation is real:
dwc-eth-dwmac 2490000.ethernet eth0: Register MEM_TYPE_PAGE_POOL RxQ-0
dwc-eth-dwmac 2490000.ethernet eth0: PHY [stmmac-0:00] driver [RTL8211F Gigabit Ethernet] (irq=141)
dwc-eth-dwmac 2490000.ethernet eth0: No Safety Features support found
dwc-eth-dwmac 2490000.ethernet eth0: IEEE 1588-2008 Advanced Timestamp supported
dwc-eth-dwmac 2490000.ethernet eth0: registered PTP clock
dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii-id link mode
8021q: adding VLAN 0 to HW filter on device eth0
dwc-eth-dwmac 2490000.ethernet eth0: Adding VLAN ID 0 is not supported
Link is Up - 1000/Full
Link is Down
Link is Up - 1000/Full
This looks good until one realises that the phylink "Link" status
messages are missing, even when the RJ45 cable is reconnected. Nothing
one can do results in the interface working. The interrupt handler
(which prints those "Link is" messages) always wins over phylink's
resolve worker, meaning phylink never calls the mac_link_up() nor
mac_link_down() methods.
eth0 also sees no traffic received, and is unable to obtain a DHCP
address:
3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether e6:d3:6a:e6:92:de brd ff:ff:ff:ff:ff:ff
    RX:  bytes packets errors dropped overrun mcast
             0       0      0       0       0     0
    TX:  bytes packets errors dropped carrier collsns
         27686     149      0       0       0       0
With the STMMAC_FLAG_HAS_INTEGRATED_PCS flag set, which disables the
netif_carrier_*() manipulation, stmmac works normally:
dwc-eth-dwmac 2490000.ethernet eth0: Register MEM_TYPE_PAGE_POOL RxQ-0
dwc-eth-dwmac 2490000.ethernet eth0: PHY [stmmac-0:00] driver [RTL8211F Gigabit Ethernet] (irq=141)
dwc-eth-dwmac 2490000.ethernet eth0: No Safety Features support found
dwc-eth-dwmac 2490000.ethernet eth0: IEEE 1588-2008 Advanced Timestamp supported
dwc-eth-dwmac 2490000.ethernet eth0: registered PTP clock
dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii-id link mode
8021q: adding VLAN 0 to HW filter on device eth0
dwc-eth-dwmac 2490000.ethernet eth0: Adding VLAN ID 0 is not supported
Link is Up - 1000/Full
dwc-eth-dwmac 2490000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
and packets can be transferred.
This clearly shows that when priv->hw->pcs is set, but
STMMAC_FLAG_HAS_INTEGRATED_PCS is clear, the driver reliably fails.
Discovering whether a platform falls into this category is impossible,
as it would require parsing all the dtsi and dts files to find out which
use the stmmac driver and whether any of them use RGMII or SGMII, and
it also depends on whether an external interface is being used. The
kernel likely doesn't contain all dts files either.
The only driver that sets this flag uses the qcom,sa8775p-ethqos
compatible, and uses SGMII or 2500BASE-X.
but these are saved from this problem by the incorrect check for
priv->dma_cap.pcs.
So, we have to assume that every other platform that uses SGMII
with stmmac is using an external PCS.
Moreover, ethtool output can be incorrect. With the full-duplex link
negotiated, ethtool reports:
Speed: 1000Mb/s
Duplex: Half
because with dwmac4, the full-duplex bit is in bit 16 of the status,
priv->xstats.pcs_duplex becomes BIT(16) for full duplex, but the
ethtool ksettings duplex member is u8 - so becomes zero. Moreover,
the supported, advertised and link partner modes are all "not
reported".
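The truncation described above is ordinary C integer narrowing and can be reproduced in isolation. In this sketch DEMO_BIT(), duplex_truncated() and duplex_normalised() are hypothetical names; the second helper is shown only for contrast and is not what the commit does (the commit removes the code).

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_BIT(n) (1UL << (n))

/* Stand-in for copying a masked status bit straight into a u8 field. */
static uint8_t duplex_truncated(unsigned long status)
{
	unsigned long pcs_duplex = status & DEMO_BIT(16);

	return (uint8_t)pcs_duplex;	/* keeps bits 0-7 only: bit 16 is lost */
}

/* Reducing the bit to 0/1 before narrowing preserves the information. */
static uint8_t duplex_normalised(unsigned long status)
{
	return (status & DEMO_BIT(16)) ? 1 : 0;
}
```

With bit 16 set, duplex_truncated() yields 0, which ethtool interprets as half duplex (DUPLEX_HALF is 0), matching the "Duplex: Half" output quoted above.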
Finally, ksettings_set() won't be able to set the advertisement on
a PHY if this PCS code is activated, which is incorrect when SGMII
is used with a PHY.
Thus, remove:
1. the incorrect netif_carrier_*() manipulation.
2. the broken ethtool ksettings code.
Given that all uses of STMMAC_FLAG_HAS_INTEGRATED_PCS are now gone,
remove the flag from stmmac.h and dwmac-qcom-ethqos.c.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://patch.msgid.link/E1v9P5y-0000000AolC-1QWH@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
- Fix seqcount lockdep assertion failure in cgroup freezer on
PREEMPT_RT.
Plain seqcount_t expects preemption disabled, but PREEMPT_RT
spinlocks don't disable preemption. Switch to seqcount_spinlock_t to
properly associate css_set_lock with the freeze timing seqcount.
- Misc changes including kernel-doc warning fix for misc_res_type enum
and improved selftest diagnostics.
* tag 'cgroup-for-6.18-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup/misc: fix misc_res_type kernel-doc warning
selftests: cgroup: Use values_close_report in test_cpu
selftests: cgroup: add values_close_report helper
cgroup: Fix seqcount lockdep assertion in cgroup freezer
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull fsnotify fixes from Jan Kara:
- Stop-gap solution for a race between unmount of a filesystem with
fsnotify marks and someone inspecting fdinfo of fsnotify group with
those marks in procfs.
A proper solution is in the works but it will take a while to settle.
- Fix for non-decodable file handles (used by unprivileged apps using
fanotify)
* tag 'fsnotify_for_v6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
fs/notify: call exportfs_encode_fid with s_umount
expfs: Fix exportfs_can_encode_fh() for EXPORT_FH_FID
|
|
Drop unused function acpi_get_lps0_constraint().
No functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Link: https://patch.msgid.link/5032801.GXAFRqVoOG@rafael.j.wysocki
|
|
sysctl_hung_task_timeout_secs)
When a writeback work lasts for sysctl_hung_task_timeout_secs, we want
to identify that there are tasks waiting for a long time, which helps us
pinpoint potential issues.
Additionally, recording the starting jiffies is useful when debugging a
crashed vmcore.
Signed-off-by: Julian Sun <sunjunchao@bytedance.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Writing back a large number of pages can take a lot of time.
This issue is exacerbated when the underlying device is slow or
subject to block layer rate limiting, which in turn triggers
unexpected hung task warnings.
We can trigger a wake-up once a chunk has been written back and the
waiting time for writeback exceeds half of
sysctl_hung_task_timeout_secs.
This action allows the hung task detector to be aware of the writeback
progress, thereby eliminating these unexpected hung task warnings.
This patch has passed the xfstests 'check -g quick' test based on ext4,
with no additional failures introduced.
Signed-off-by: Julian Sun <sunjunchao@bytedance.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
... to make sure all accesses are properly validated.
Merely renaming the var to __i_state still lets the compiler make the
following suggestion:
error: 'struct inode' has no member named 'i_state'; did you mean '__i_state'?
Unfortunately some people will add the __'s and call it a day.
In order to make it harder to mess up in this way, hide it behind a
struct. The resulting error message should be convincing in terms of
checking what to do:
error: invalid operands to binary & (have 'struct inode_state_flags' and 'int')
Of course people determined to do a plain access can still do it, but
nothing can be done for that case.
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
coccinelle
Nothing to look at apart from iput_final().
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Open-coded accesses prevent asserting they are done correctly. One
obvious aspect is locking, but significantly more can be checked. For
example it can be detected when the code is clearing flags which are
already missing, or is setting flags when it is illegal (e.g., I_FREEING
when ->i_count > 0).
In order to keep things manageable this patchset merely gets the thing
off the ground with only lockdep checks baked in.
Current consumers can be trivially converted.
Suppose flags I_A and I_B are to be handled.
If ->i_lock is held, then:

    state = inode->i_state          => state = inode_state_read(inode)
    inode->i_state |= (I_A | I_B)   => inode_state_set(inode, I_A | I_B)
    inode->i_state &= ~(I_A | I_B)  => inode_state_clear(inode, I_A | I_B)
    inode->i_state = I_A | I_B      => inode_state_assign(inode, I_A | I_B)

If ->i_lock is not held or only held conditionally:

    state = inode->i_state          => state = inode_state_read_once(inode)
    inode->i_state |= (I_A | I_B)   => inode_state_set_raw(inode, I_A | I_B)
    inode->i_state &= ~(I_A | I_B)  => inode_state_clear_raw(inode, I_A | I_B)
    inode->i_state = I_A | I_B      => inode_state_assign_raw(inode, I_A | I_B)
The "_once" vs "_raw" discrepancy stems from the read variant differing
by READ_ONCE as opposed to just lockdep checks.
Finally, if you want to atomically clear flags and set new ones, the
following:
    state = inode->i_state;
    state &= ~I_A;
    state |= I_B;
    inode->i_state = state;

turns into:

    inode_state_replace(inode, I_A, I_B);
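The shape of such accessors can be sketched in userspace. This is a mock, not the kernel's implementation: struct inode_demo, struct inode_state_flags_demo and the lock_held field are hypothetical, with assert() standing in for the lockdep check. The accessor names are borrowed from the conversion table above only to keep the mapping readable.

```c
#include <assert.h>
#include <stdbool.h>

enum { I_A = 1 << 0, I_B = 1 << 1 };	/* placeholder flags from the text */

/* Wrapping the value in a struct makes a plain "i_state & I_A" a type error. */
struct inode_state_flags_demo {
	unsigned int val;
};

struct inode_demo {
	bool lock_held;				/* stand-in for ->i_lock */
	struct inode_state_flags_demo i_state;
};

static unsigned int inode_state_read(struct inode_demo *inode)
{
	assert(inode->lock_held);		/* stand-in for a lockdep check */
	return inode->i_state.val;
}

static void inode_state_set(struct inode_demo *inode, unsigned int flags)
{
	assert(inode->lock_held);
	inode->i_state.val |= flags;
}

static void inode_state_clear(struct inode_demo *inode, unsigned int flags)
{
	assert(inode->lock_held);
	inode->i_state.val &= ~flags;
}

/* Clear one set of flags and set another in a single store. */
static void inode_state_replace(struct inode_demo *inode,
				unsigned int clear, unsigned int set)
{
	assert(inode->lock_held);
	inode->i_state.val = (inode->i_state.val & ~clear) | set;
}
```

Funnelling every access through a helper gives one place to hang checks such as locking assertions, which open-coded accesses cannot provide.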
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
The incoming helpers don't ship with _release/_acquire variants, for
the time being anyway.
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
The only consumer outside of fs/inode.c is gfs2 and it already includes
fs.h in the relevant file.
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Suppose there are 2 CPUs racing inode hash lookup func (say ilookup5())
and unlock_new_inode().
In principle the latter can clear the I_NEW flag before prior stores
into the inode were made visible.
The former can in turn observe I_NEW is cleared and proceed to use the
inode, while possibly reading from not-yet-published areas.
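The ordering problem is the classic publish/consume pattern. The sketch below shows one common way to close such a race in userspace C11, using a release operation on the flag paired with an acquire load. It is an illustration only: the barriers the kernel patch actually uses are not shown in this message, and obj_demo, DEMO_I_NEW, publish() and try_consume() are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>

#define DEMO_I_NEW 0x1u

struct obj_demo {
	int payload;		/* initialised before publication */
	atomic_uint state;	/* DEMO_I_NEW is set while under construction */
};

/* Publisher: store the payload, then clear the flag with release ordering. */
static void publish(struct obj_demo *o, int value)
{
	o->payload = value;
	atomic_fetch_and_explicit(&o->state, ~DEMO_I_NEW, memory_order_release);
}

/* Consumer: returns 1 and fills *out once the flag is observed clear. */
static int try_consume(struct obj_demo *o, int *out)
{
	if (atomic_load_explicit(&o->state, memory_order_acquire) & DEMO_I_NEW)
		return 0;	/* still under construction */

	/* The acquire load orders this read after the publisher's stores. */
	*out = o->payload;
	return 1;
}
```

Without the release/acquire pairing, the consumer could see the flag cleared and still read a stale payload, which is the hazard described above.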
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Also remove the now redundant comment.
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Most callers of iomap_iter_advance() do not need the remaining length
returned. Get rid of the extra iomap_length() call that
iomap_iter_advance() does.
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Fix multiple warnings in enum vadc_scale_fn_type by adding a leading
'@' to the kernel-doc descriptions.
Fixed 14 warnings in this one enum, such as:
Warning: include/linux/iio/adc/qcom-vadc-common.h:123 Enum value
'SCALE_DEFAULT' not described in enum 'vadc_scale_fn_type'
Warning: ../include/linux/iio/adc/qcom-vadc-common.h:123 Enum value
'SCALE_THERM_100K_PULLUP' not described in enum 'vadc_scale_fn_type'
Warning: ../include/linux/iio/adc/qcom-vadc-common.h:123 Enum value
'SCALE_PMIC_THERM' not described in enum 'vadc_scale_fn_type'
Also prevent the warning on SCALE_HW_CALIB_INVALID by marking it
"private:" so that kernel-doc notation is not needed for it.
This leaves only one warning here, which I don't know the
appropriate description of:
qcom-vadc-common.h:125: warning: Enum value
'SCALE_HW_CALIB_PMIC_THERM_PM7' not described in enum 'vadc_scale_fn_type'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
|
|
Rather than do per-tw checking, which needs to dip into the task_struct
for checking flags, do it upfront before running task_work. This places
a 'cancel' member in io_tw_token_t, which is assigned before running
task_work for that given ctx.
This is both more efficient in doing it upfront rather than for every
task_work, and it means that io_should_terminate_tw() can be made
private in io_uring.c rather than need to be called by various
callbacks of task_work.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Extend __filemap_get_folio() to support NUMA memory policies by
renaming the implementation to __filemap_get_folio_mpol() and adding
a mempolicy parameter. The original function becomes a static inline
wrapper that passes NULL for the mempolicy.
This infrastructure will enable future support for NUMA-aware page cache
allocations for KVM guests backed by guest_memfd.
Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Shivank Garg <shivankg@amd.com>
Tested-by: Ashish Kalra <ashish.kalra@amd.com>
Link: https://lore.kernel.org/r/20250827175247.83322-5-shivankg@amd.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Add a mempolicy parameter to filemap_alloc_folio() to enable NUMA-aware
page cache allocations. This will be used by upcoming changes to
support NUMA policies in guest_memfd, where guest memory needs to be
allocated according to the NUMA policy specified by the VMM.
All existing users pass NULL, maintaining current behavior.
Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Shivank Garg <shivankg@amd.com>
Tested-by: Ashish Kalra <ashish.kalra@amd.com>
Link: https://lore.kernel.org/r/20250827175247.83322-4-shivankg@amd.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Add ACPM DVFS protocol handler. It constructs DVFS messages that
the APM firmware can understand.
Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
Reviewed-by: Peter Griffin <peter.griffin@linaro.org>
Tested-by: Peter Griffin <peter.griffin@linaro.org> # on gs101-oriole
Link: https://patch.msgid.link/20251010-acpm-clk-v6-2-321ee8826fd4@linaro.org
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
|
|
Sometimes, rhashtable_lookup() is expected to find the entry.
Therefore, we can use likely() for such cases.
The following new functions are introduced, which use likely() or
unlikely() during the lookup:
rhashtable_lookup_likely
rhltable_lookup_likely
A micro-benchmark was run for these new functions: looking up an existing
entry 100000000 times, rhashtable_lookup_likely() gets a ~30% speedup.
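As an illustration of the branch hint itself, here is a minimal userspace sketch. It assumes GCC or Clang (for __builtin_expect) and uses a hypothetical direct-mapped table; demo_likely(), struct slot and table_lookup_likely() are not part of the rhashtable API and do not reflect its implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Same idea as the kernel's likely(): tell the compiler the common outcome. */
#define demo_likely(x) __builtin_expect(!!(x), 1)

struct slot {
	int used;
	int key;
	int value;
};

/* Look up key in a direct-mapped table, hinting that a hit is expected. */
static const int *table_lookup_likely(const struct slot *tbl, size_t n, int key)
{
	const struct slot *s = &tbl[(unsigned int)key % n];

	if (demo_likely(s->used && s->key == key))
		return &s->value;	/* hot path: laid out as the fall-through */

	return NULL;			/* cold path: miss */
}
```

The hint changes only code layout and branch prediction defaults, not results, so it pays off only when the caller's expectation of a hit is usually right.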
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Fix kernel-doc warnings by using correct kernel-doc syntax and
formatting to prevent warnings:
Warning: include/linux/firmware/qcom/qcom_tzmem.h:25 Enum value
'QCOM_TZMEM_POLICY_STATIC' not described in enum 'qcom_tzmem_policy'
Warning: ../include/linux/firmware/qcom/qcom_tzmem.h:25 Enum value
'QCOM_TZMEM_POLICY_MULTIPLIER' not described in enum 'qcom_tzmem_policy'
Warning: ../include/linux/firmware/qcom/qcom_tzmem.h:25 Enum value
'QCOM_TZMEM_POLICY_ON_DEMAND' not described in enum 'qcom_tzmem_policy'
Fixes: 84f5a7b67b61 ("firmware: qcom: add a dedicated TrustZone buffer allocator")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20251017191323.1820167-1-rdunlap@infradead.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
|
|
Cross-merge BPF and other fixes after downstream PR.
No conflicts.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID fixes from Jiri Kosina:
- fix for sticky fingers handling in hid-multitouch (Benjamin
Tissoires)
- fix for reporting of 0 battery levels (Dmitry Torokhov)
- build fix for hid-haptic in certain configurations (Jonathan Denose)
- improved probe and avoiding spamming kernel log by hid-nintendo
(Vicki Pfau)
- fix for OOB in hid-cp2112 (Deepak Sharma)
- interrupt handling fix for intel-thc-hid (Even Xu)
- a couple of new device IDs and device-specific quirks
* tag 'hid-for-linus-2025101701' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
HID: logitech-hidpp: Add HIDPP_QUIRK_RESET_HI_RES_SCROLL
selftests/hid: add tests for missing release on the Dell Synaptics
HID: multitouch: fix sticky fingers
HID: multitouch: fix name of Stylus input devices
HID: hid-input: only ignore 0 battery events for digitizers
HID: hid-debug: Fix spelling mistake "Rechargable" -> "Rechargeable"
HID: Kconfig: Fix build error from CONFIG_HID_HAPTIC
HID: nintendo: Rate limit IMU compensation message
HID: nintendo: Wait longer for initial probe
HID: core: Add printk_ratelimited variants to hid_warn() etc
HID: quirks: Add ALWAYS_POLL quirk for VRS R295 steering wheel
HID: quirks: avoid Cooler Master MM712 dongle wakeup bug
HID: cp2112: Add parameter validation to data length
HID: intel-thc-hid: intel-quickspi: Add ARL PCI Device Id's
HID: intel-thc-hid: Intel-quickspi: switch first interrupt from level to edge detection
HID: intel-thc-hid: intel-quicki2c: Fix wrong type casting
|
|
Pull bpf fixes from Alexei Starovoitov:
- Replace bpf_map_kmalloc_node() with kmalloc_nolock() to fix kmemleak
imbalance in tracking of bpf_async_cb structures (Alexei Starovoitov)
- Make selftests/bpf arg_parsing.c more robust to errors (Andrii
Nakryiko)
- Fix redefinition of 'off' as different kind of symbol when I40E
driver is builtin (Brahmajit Das)
- Do not disable preemption in bpf_test_run (Sahil Chandna)
- Fix memory leak in __lookup_instance error path (Shardul Bankar)
- Ensure test data is flushed to disk before reading it (Xing Guo)
* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
selftests/bpf: Fix redefinition of 'off' as different kind of symbol
bpf: Do not disable preemption in bpf_test_run().
bpf: Fix memory leak in __lookup_instance error path
selftests: arg_parsing: Ensure data is flushed to disk before reading.
bpf: Replace bpf_map_kmalloc_node() with kmalloc_nolock() to allocate bpf_async_cb structures.
selftests/bpf: make arg_parsing.c more robust to crashes
bpf: test_run: Fix ctx leak in bpf_prog_test_run_xdp error path
|
|
Pull NFS client fixes from Anna Schumaker:
- Fix for FlexFiles mirror->dss allocation
- Apply delay_retrans to async operations
- Check if suid/sgid is cleared after a write when needed
- Fix setting the state renewal timer for early mounts after a reboot
* tag 'nfs-for-6.18-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
NFS4: Fix state renewals missing after boot
NFS: check if suid/sgid was cleared after a write as needed
NFS4: Apply delay_retrans to async operations
NFSv4/flexfiles: fix to allocate mirror->dss before use
|
|
Accessing non-existent PMU registers causes an SError, halting the
system.
Implement read and write access tables for the gs101-PMU to specify
which registers are read- and/or writable to avoid that SError.
Reviewed-by: Sam Protsenko <semen.protsenko@linaro.org>
Signed-off-by: André Draszik <andre.draszik@linaro.org>
Link: https://patch.msgid.link/20251009-gs101-pmu-regmap-tables-v2-3-2d64f5261952@linaro.org
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
|
|
Pull kvm fixes from Paolo Bonzini:
"ARM:
- Fix the handling of ZCR_EL2 in NV VMs
- Pick the correct translation regime when doing a PTW on the back of
a SEA
- Prevent userspace from injecting an event into a vcpu that isn't
initialised yet
- Move timer save/restore to the sysreg handling code, fixing EL2
timer access in the process
- Add FGT-based trapping of MDSCR_EL1 to reduce the overhead of debug
- Fix trapping configuration when the host isn't GICv3
- Improve the detection of HCR_EL2.E2H being RES1
- Drop a spurious 'break' statement in the S1 PTW
- Don't try to access SPE when owned by EL3
Documentation updates:
- Document the failure modes of event injection
- Document that a GICv3 guest can be created on a GICv5 host with
FEAT_GCIE_LEGACY
Selftest improvements:
- Add a selftest for the effective value of HCR_EL2.AMO
- Address build warning in the timer selftest when building with
clang
- Teach irqfd selftests about non-x86 architectures
- Add missing sysregs to the set_id_regs selftest
- Fix vcpu allocation in the vgic_lpi_stress selftest
- Correctly enable interrupts in the vgic_lpi_stress selftest
x86:
- Expand the KVM_PRE_FAULT_MEMORY selftest to add a regression test
for the bug fixed by commit 3ccbf6f47098 ("KVM: x86/mmu: Return
-EAGAIN if userspace deletes/moves memslot during prefault")
- Don't try to get PMU capabilities from perf when running a CPU with
hybrid CPUs/PMUs, as perf will rightly WARN.
guest_memfd:
- Rework KVM_CAP_GUEST_MEMFD_MMAP (newly introduced in 6.18) into a
more generic KVM_CAP_GUEST_MEMFD_FLAGS
- Add a guest_memfd INIT_SHARED flag and require userspace to
explicitly set said flag to initialize memory as SHARED,
irrespective of MMAP.
The behavior merged in 6.18 is that enabling mmap() implicitly
initializes memory as SHARED, which would result in an ABI
collision for x86 CoCo VMs as their memory is currently always
initialized PRIVATE.
- Allow mmap() on guest_memfd for x86 CoCo VMs, i.e. on VMs with
private memory, to enable testing such setups, i.e. to hopefully
flush out any other lurking ABI issues before 6.18 is officially
released.
- Add testcases to the guest_memfd selftest to cover guest_memfd
without MMAP, and host userspace accesses to mmap()'d private
memory"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (46 commits)
arm64: Revamp HCR_EL2.E2H RES1 detection
KVM: arm64: nv: Use FGT write trap of MDSCR_EL1 when available
KVM: arm64: Compute per-vCPU FGTs at vcpu_load()
KVM: arm64: selftests: Fix misleading comment about virtual timer encoding
KVM: arm64: selftests: Add an E2H=0-specific configuration to get_reg_list
KVM: arm64: selftests: Make dependencies on VHE-specific registers explicit
KVM: arm64: Kill leftovers of ad-hoc timer userspace access
KVM: arm64: Fix WFxT handling of nested virt
KVM: arm64: Move CNT*CT_EL0 userspace accessors to generic infrastructure
KVM: arm64: Move CNT*_CVAL_EL0 userspace accessors to generic infrastructure
KVM: arm64: Move CNT*_CTL_EL0 userspace accessors to generic infrastructure
KVM: arm64: Add timer UAPI workaround to sysreg infrastructure
KVM: arm64: Make timer_set_offset() generally accessible
KVM: arm64: Replace timer context vcpu pointer with timer_id
KVM: arm64: Introduce timer_context_to_vcpu() helper
KVM: arm64: Hide CNTHV_*_EL2 from userspace for nVHE guests
Documentation: KVM: Update GICv3 docs for GICv5 hosts
KVM: arm64: gic-v3: Only set ICH_HCR traps for v2-on-v3 or v3 guests
KVM: arm64: selftests: Actually enable IRQs in vgic_lpi_stress
KVM: arm64: selftests: Allocate vcpus with correct size
...
|
|
In order to create a CMA heap instance for each CMA region found in the
system, we need to register each of these instances.
While it would appear trivial, the CMA regions are created super early
in the kernel boot process, before most of the subsystems are
initialized. Thus, we can't just create an exported function to create a
heap from the CMA region being initialized.
What we can do however is create a two-step process, where we collect
all the CMA regions into an array early on, and then when we initialize
the heaps we iterate over that array and create the heaps from the CMA
regions we collected.
Reviewed-by: T.J. Mercier <tjmercier@google.com>
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Link: https://lore.kernel.org/r/20251013-dma-buf-ecc-heap-v8-2-04ce150ea3d9@kernel.org
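The two-step process described above can be sketched in plain C as follows. This is an illustration under stated assumptions only: `MAX_REGIONS`, `struct region`, `collect_region()`, `create_heap()` and `init_heaps()` are hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_REGIONS 8   /* hypothetical upper bound on collected regions */

struct region {
    const char *name;
};

static struct region *early_regions[MAX_REGIONS];
static size_t nr_early_regions;
static size_t nr_heaps_created;

/* Step 1: called very early; only records the region. 0 on success, -1 if full. */
static int collect_region(struct region *r)
{
    if (nr_early_regions >= MAX_REGIONS)
        return -1;
    early_regions[nr_early_regions++] = r;
    return 0;
}

/* Stand-in for the real heap creation, which needs a working heap subsystem. */
static int create_heap(struct region *r)
{
    (void)r;
    nr_heaps_created++;
    return 0;
}

/* Step 2: called later, once heaps can be created; walks the collected array. */
static int init_heaps(void)
{
    for (size_t i = 0; i < nr_early_regions; i++) {
        int ret = create_heap(early_regions[i]);

        if (ret)
            return ret;
    }
    return 0;
}
```

The early step allocates nothing and depends on no other subsystem, which is what makes it safe to call before most of the kernel is initialized.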
|
|
The pm_vt_switch_required() function fails silently when memory
allocation fails, offering no indication to callers that the operation
was unsuccessful. This behavior prevents drivers from handling allocation
errors correctly or implementing retry mechanisms. By ensuring that
failures are reported back to the caller, drivers can make informed
decisions, improve robustness, and avoid unexpected behavior during
critical power management operations.
Change the function signature to return an integer error code and modify
the implementation to return -ENOMEM when kmalloc() fails. Update both
the function declaration and the inline stub in include/linux/pm.h to
maintain consistency across CONFIG_VT_CONSOLE_SLEEP configurations.
The function now returns:
- 0 on success (including when updating existing entries)
- -ENOMEM when memory allocation fails
This change improves error reporting without breaking existing callers,
as the current callers in drivers/video/fbdev/core/fbmem.c already
ignore the return value, making this a backward-compatible improvement.
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Malaya Kumar Rout <mrout@redhat.com>
Reviewed-by: Dhruva Gole <d-gole@ti.com>
Link: https://patch.msgid.link/20251013193028.89570-1-mrout@redhat.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
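The return-value contract listed above (0 on success, including updates of existing entries, and -ENOMEM on allocation failure) can be modelled in userspace as below. This is a sketch, not the kernel function: `vt_entry` and `vt_switch_required()` are hypothetical names, `malloc()` stands in for `kmalloc()`, and the locking a real implementation would need is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

struct vt_entry {
    struct vt_entry *next;
    const void *dev;
    bool required;
};

static struct vt_entry *vt_head;

/* Returns 0 on success (new or updated entry), -ENOMEM if allocation fails. */
static int vt_switch_required(const void *dev, bool required)
{
    struct vt_entry *e;

    for (e = vt_head; e; e = e->next) {
        if (e->dev == dev) {
            e->required = required;   /* update the existing entry */
            return 0;
        }
    }

    e = malloc(sizeof(*e));
    if (!e)
        return -ENOMEM;               /* previously a silent failure */

    e->dev = dev;
    e->required = required;
    e->next = vt_head;
    vt_head = e;
    return 0;
}
```

A caller that cares about the failure can now test the result and propagate or log it; callers that ignore it behave as before.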
|
|
KVM x86 fixes for 6.18:
- Expand the KVM_PRE_FAULT_MEMORY selftest to add a regression test for the
bug fixed by commit 3ccbf6f47098 ("KVM: x86/mmu: Return -EAGAIN if userspace
deletes/moves memslot during prefault")
- Don't try to get PMU capabilities from perf when running a CPU with hybrid
CPUs/PMUs, as perf will rightly WARN.
- Rework KVM_CAP_GUEST_MEMFD_MMAP (newly introduced in 6.18) into a more
generic KVM_CAP_GUEST_MEMFD_FLAGS
- Add a guest_memfd INIT_SHARED flag and require userspace to explicitly set
said flag to initialize memory as SHARED, irrespective of MMAP. The
behavior merged in 6.18 is that enabling mmap() implicitly initializes
memory as SHARED, which would result in an ABI collision for x86 CoCo VMs
as their memory is currently always initialized PRIVATE.
- Allow mmap() on guest_memfd for x86 CoCo VMs, i.e. on VMs with private
memory, to enable testing such setups, i.e. to hopefully flush out any
other lurking ABI issues before 6.18 is officially released.
- Add testcases to the guest_memfd selftest to cover guest_memfd without MMAP,
and host userspace accesses to mmap()'d private memory.
|
|
TASCAM FW-1884/FW-1804/FW-1082 is too lazy to respond to asynchronous
requests at S400. The asynchronous transaction often results in a timeout.
This is a problematic quirk.
This commit adds support for the quirk. When the new quirk flag is
identified, the transaction speed is configured at S200.
Link: https://lore.kernel.org/r/20251018035532.287124-4-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
|
|
Add two functions to atomically replace RCU-protected hlist_nulls entries.
Keep using WRITE_ONCE() to assign values to ->next and ->pprev, as
mentioned in the patches below:
commit efd04f8a8b45 ("rcu: Use WRITE_ONCE() for assignments to ->next for
rculist_nulls")
commit 860c8802ace1 ("rcu: Use WRITE_ONCE() for assignments to ->pprev for
hlist_nulls")
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Xuanqiang Luo <luoxuanqiang@kylinos.cn>
Link: https://patch.msgid.link/20251015020236.431822-2-xuanqiang.luo@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
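The pointer updates involved in replacing a node of a nulls-terminated hlist can be sketched in userspace as follows. This is an assumption-laden illustration, not the helpers added by the commit: `struct nnode`, `make_nulls()` and `nulls_replace()` are hypothetical names, and plain stores are used where kernel code would use WRITE_ONCE() and a publishing store such as rcu_assign_pointer().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct nnode {
    struct nnode *next;    /* real pointer, or an odd-valued end marker */
    struct nnode **pprev;  /* address of the pointer that points at this node */
};

/* An end-of-list "nulls" marker is recognised by its low bit being set. */
static bool is_a_nulls(const struct nnode *p)
{
    return ((uintptr_t)p & 1) != 0;
}

/* Encode 'value' as an end marker. */
static struct nnode *make_nulls(uintptr_t value)
{
    return (struct nnode *)((value << 1) | 1);
}

/* Replace 'old' with 'new' in place; 'old' itself is left untouched. */
static void nulls_replace(struct nnode *old, struct nnode *new)
{
    struct nnode *next = old->next;

    new->next = next;            /* kernel code: WRITE_ONCE() */
    new->pprev = old->pprev;     /* kernel code: WRITE_ONCE() */
    *new->pprev = new;           /* the store that makes 'new' reachable */
    if (!is_a_nulls(next))
        next->pprev = &new->next;
}
```

The new node is fully initialised before the single store that links it in, so a concurrent reader walking the list sees either the old node or the new one, never a half-built entry.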
|
|
In {tcp6,udp6,raw6}_sock, struct ipv6_pinfo is always placed at
the beginning of a new cache line because
1. __alignof__(struct tcp_sock) is 64 due to ____cacheline_aligned
of __cacheline_group_begin(tcp_sock_write_tx)
2. __alignof__(struct udp_sock) is 64 due to ____cacheline_aligned
of struct numa_drop_counters
3. in raw6_sock, struct numa_drop_counters is placed before
struct ipv6_pinfo
struct ipv6_pinfo is 136 bytes, but the last cache line is
only used by ipv6_fl_list:
$ pahole -C ipv6_pinfo vmlinux
struct ipv6_pinfo {
...
/* --- cacheline 2 boundary (128 bytes) --- */
struct ipv6_fl_socklist * ipv6_fl_list; /* 128 8 */
/* size: 136, cachelines: 3, members: 23 */
Let's move ipv6_fl_list from struct ipv6_pinfo to struct inet_sock
to save a full cache line for {tcp6,udp6,raw6}_sock.
Now, struct ipv6_pinfo is 128 bytes, and {tcp6,udp6,raw6}_sock have
64 bytes less, while {tcp,udp,raw}_sock retain the same size.
Before:
# grep -E "^(RAW|UDP[^L\-]|TCP)" /proc/slabinfo | awk '{print $1, "\t", $4}'
RAWv6 1408
UDPv6 1472
TCPv6 2560
RAW 1152
UDP 1280
TCP 2368
After:
# grep -E "^(RAW|UDP[^L\-]|TCP)" /proc/slabinfo | awk '{print $1, "\t", $4}'
RAWv6 1344
UDPv6 1408
TCPv6 2496
RAW 1152
UDP 1280
TCP 2368
Also, ipv6_fl_list and inet_flags (SNDFLOW bit) are placed in the
same cache line.
$ pahole -C inet_sock vmlinux
...
/* --- cacheline 11 boundary (704 bytes) was 56 bytes ago --- */
struct ipv6_pinfo * pinet6; /* 760 8 */
/* --- cacheline 12 boundary (768 bytes) --- */
struct ipv6_fl_socklist * ipv6_fl_list; /* 768 8 */
unsigned long inet_flags; /* 776 8 */
Doc churn is due to the insufficient Type column (only 1 space short).
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20251014224210.2964778-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
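The size arithmetic behind the numbers above can be checked with a few lines of C. This is a simplified sketch that assumes 64-byte cache lines and an object that starts on a line boundary; `cachelines_used()` and `bytes_saved()` are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE 64   /* assumed cache line size in bytes */

/* Number of cache lines occupied by an object starting on a line boundary. */
static size_t cachelines_used(size_t size)
{
    return (size + CACHE_LINE - 1) / CACHE_LINE;
}

/* Bytes freed when a line-aligned object shrinks from 'before' to 'after'. */
static size_t bytes_saved(size_t before, size_t after)
{
    return (cachelines_used(before) - cachelines_used(after)) * CACHE_LINE;
}
```

With these helpers, a 136-byte structure occupies three lines and a 128-byte one occupies two, so moving the single 8-byte pointer out frees 64 bytes, which matches the RAWv6, UDPv6 and TCPv6 deltas in the slabinfo output.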
|
|
Add the ishtp_get_connection_state() function for struct ishtp_cl, allowing
ishtp client drivers to retrieve the current connection state.
Signed-off-by: Zhang Lixu <lixu.zhang@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
|
|
During suspend/resume tests with S2IDLE, some ISH functional failures were
observed because of a delay in executing the ISH resume handler. Here
schedule_work() is used from the resume handler to do the actual work.
schedule_work() uses system_wq, which is a per-CPU workqueue. Although
the queuing is not bound to a CPU, it prefers the local CPU of the caller
unless prohibited.
Users of this work queue are not supposed to queue long running work.
But in practice, there are scenarios where long running work items are
queued on other unbound workqueues, occupying the CPU. As a result, the
ISH resume handler may not get a chance to execute in a timely manner.
In one scenario, one of the ish_resume_handler() executions was delayed
nearly 1 second because another work item on an unbound workqueue occupied
the same CPU. This delay causes ISH functionality failures.
A similar issue was previously observed where the ISH HID driver timed out
while getting the HID descriptor during S4 resume in the recovery kernel,
likely caused by the same workqueue contention problem.
Create dedicated unbound workqueues for all ISH operations to allow work
items to execute on any available CPU, eliminating CPU-specific bottlenecks
and improving resume reliability under varying system loads. Also ISH has
three different components, a bus driver which implements ISH protocols, a
PCI interface layer and HID interface. Use one dedicated work queue for all
of them.
Signed-off-by: Zhang Lixu <lixu.zhang@intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
|
|
Format the kernel-doc for MISC_CG_RES_TDX correctly to
avoid a kernel-doc warning:
Warning: include/linux/misc_cgroup.h:26 Enum value
'MISC_CG_RES_TDX' not described in enum 'misc_res_type'
Fixes: 7c035bea9407 ("KVM: TDX: Register TDX host key IDs to cgroup misc controller")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull mmc cleanup from Ulf Hansson:
"Move rpmb_frame struct and constants to rpmb common header
This helps us to avoid sharing an immutable branch between our git
trees. I was planning to send it before rc1, but I didn't make it"
* tag 'mmc-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
rpmb: move rpmb_frame struct and constants to common header
|
|
soc/drivers
arm64: Xilinx SOC changes for 6.18
firmware:
- Add debugfs interface
- Wire versal-net compatible string
- Change SOC family detection
* tag 'zynqmp-soc-for-6.18' of https://github.com/Xilinx/linux-xlnx:
drivers: firmware: xilinx: Switch to new family code in zynqmp_pm_get_family_info()
drivers: firmware: xilinx: Add unique family code for all platforms
firmware: xilinx: Add Versal NET platform compatible string
firmware: xilinx: Add debugfs support for PM_GET_NODE_STATUS
|
|
can_change_mtu() became obsolete by commit 23049938605b ("can: populate the
minimum and maximum MTU values"). Now that net_device->min_mtu and
net_device->max_mtu are populated, all the checks are already done by
dev_validate_mtu() in net/core/dev.c.
Remove the net_device_ops->ndo_change_mtu() callback of all the physical
interfaces, then remove can_change_mtu(). Only keep the vcan_change_mtu()
and vxcan_change_mtu() because the virtual interfaces use their own
different MTU logic.
The only functional change this patch introduces is that now the user will
be able to change the MTU even if the interface is up. This does not matter
for Classical CAN and CAN FD because their MTU range is composed of only
one value, respectively CAN_MTU and CANFD_MTU. For the upcoming CAN XL, the
MTU will be configurable within the CANXL_MIN_MTU to CANXL_MAX_MTU range at
any time, even if the interface is up. This is consistent with the other
net protocols and does not contradict ISO 11898-1:2024 as having a
modifiable MTU is a kernel extension.
Signed-off-by: Vincent Mailhol <mailhol@kernel.org>
Link: https://patch.msgid.link/20251003-remove-can_change_mtu-v1-1-337f8bc21181@kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
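The generic range check that makes a per-driver MTU callback redundant can be sketched as below. This is a simplified userspace model, not the code in net/core/dev.c: `struct mtu_limits` and `validate_mtu()` are hypothetical, and the numeric MTU values used in the usage note are assumptions.

```c
#include <assert.h>
#include <errno.h>

/* Minimal stand-in for the min/max MTU fields of a network device. */
struct mtu_limits {
    unsigned int min_mtu;
    unsigned int max_mtu;
};

/* Returns 0 if new_mtu lies inside [min_mtu, max_mtu], -EINVAL otherwise. */
static int validate_mtu(const struct mtu_limits *lim, unsigned int new_mtu)
{
    if (new_mtu < lim->min_mtu || new_mtu > lim->max_mtu)
        return -EINVAL;
    return 0;
}
```

For an interface whose range holds a single value (assumed here to be 16 for Classical CAN), only that value is accepted, so allowing the change while the interface is up has no visible effect. For a range such as the assumed 76 to 2060 of CAN XL, any value inside the range passes.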
|
|
Commit 0f022d32c3ec ("net/sched: Fix mirred deadlock on device recursion")
added code in the fast path, even when act_mirred is not used.
Prepare its revert by implementing loop detection in act_mirred.
Adds an array of device pointers in struct netdev_xmit.
tcf_mirred_is_act_redirect() can detect if the array
already contains the target device.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Tested-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20251014171907.3554413-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
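Loop detection with an array of device pointers, as described above, can be sketched in userspace like this. The names `xmit_state`, `chain_contains()`, `chain_push()` and `chain_pop()` and the `NEST_LIMIT` value are hypothetical; the kernel keeps its array in struct netdev_xmit and its details may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NEST_LIMIT 4   /* hypothetical maximum redirect depth */

struct xmit_state {
    const void *devs[NEST_LIMIT];  /* devices already on the redirect chain */
    size_t nest_level;
};

/* True if 'dev' is already on the current redirect chain. */
static bool chain_contains(const struct xmit_state *x, const void *dev)
{
    for (size_t i = 0; i < x->nest_level; i++)
        if (x->devs[i] == dev)
            return true;
    return false;
}

/* Record 'dev' before redirecting to it; false means loop or depth overflow. */
static bool chain_push(struct xmit_state *x, const void *dev)
{
    if (x->nest_level >= NEST_LIMIT || chain_contains(x, dev))
        return false;
    x->devs[x->nest_level++] = dev;
    return true;
}

/* Undo the matching chain_push() once the redirect has returned. */
static void chain_pop(struct xmit_state *x)
{
    x->nest_level--;
}
```

Because the check runs only inside the redirect action, packets that never pass through a redirect pay nothing for it, which is the motivation for moving the detection out of the fast path.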
|
|
pci_msi_create_irq_domain() is now unused. Delete it.
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into for-6.19
Pull in tip/sched/core to receive:
50653216e4ff ("sched: Add support to pick functions to take rf")
4c95380701f5 ("sched/ext: Fold balance_scx() into pick_task_scx()")
which will enable clean integration of DL server support among other things.
This conflicts with the following from sched_ext/for-6.18-fixes:
a8ad873113d3 ("sched_ext: defer queue_balance_callback() until after ops.dispatch")
which adds maybe_queue_balance_callback() to balance_scx() which is removed
by 50653216e4ff. Resolve by moving the invocation to pick_task_scx() in the
equivalent location.
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Cross-merge networking fixes after downstream PR (net-6.18-rc2).
No conflicts or adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from CAN
Current release - regressions:
- udp: do not use skb_release_head_state() before
skb_attempt_defer_free()
- gro_cells: use nested-BH locking for gro_cell
- dpll: zl3073x: increase maximum size of flash utility
Previous releases - regressions:
- core: fix lockdep splat on device unregister
- tcp: fix tcp_tso_should_defer() vs large RTT
- tls:
- don't rely on tx_work during send()
- wait for pending async decryptions if tls_strp_msg_hold fails
- can: j1939: add missing calls in NETDEV_UNREGISTER notification
handler
- eth: lan78xx: fix lost EEPROM write timeout in
lan78xx_write_raw_eeprom
Previous releases - always broken:
- ip6_tunnel: prevent perpetual tunnel growth
- dpll: zl3073x: handle missing or corrupted flash configuration
- can: m_can: fix pm_runtime and CAN state handling
- eth:
- ixgbe: fix too early devlink_free() in ixgbe_remove()
- ixgbevf: fix mailbox API compatibility
- gve: Check valid ts bit on RX descriptor before hw timestamping
- idpf: cleanup remaining SKBs in PTP flows
- r8169: fix packet truncation after S4 resume on RTL8168H/RTL8111H"
* tag 'net-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (50 commits)
udp: do not use skb_release_head_state() before skb_attempt_defer_free()
net: usb: lan78xx: fix use of improperly initialized dev->chipid in lan78xx_reset
netdevsim: set the carrier when the device goes up
selftests: tls: add test for short splice due to full skmsg
selftests: net: tls: add tests for cmsg vs MSG_MORE
tls: don't rely on tx_work during send()
tls: wait for pending async decryptions if tls_strp_msg_hold fails
tls: always set record_type in tls_process_cmsg
tls: wait for async encrypt in case of error during latter iterations of sendmsg
tls: trim encrypted message to match the plaintext on short splice
tg3: prevent use of uninitialized remote_adv and local_adv variables
MAINTAINERS: new entry for IPv6 IOAM
gve: Check valid ts bit on RX descriptor before hw timestamping
net: core: fix lockdep splat on device unregister
MAINTAINERS: add myself as maintainer for b53
selftests: net: check jq command is supported
net: airoha: Take into account out-of-order tx completions in airoha_dev_xmit()
tcp: fix tcp_tso_should_defer() vs large RTT
r8152: add error handling in rtl8152_driver_init
usbnet: Fix using smp_processor_id() in preemptible code warnings
...
|
|
The IRQCHIP_PLATFORM_DRIVER macros can be used to convert OF irqchip
drivers to platform drivers but currently reuse the OF init callback
prototype that only takes OF nodes as arguments. This forces drivers to
do reverse lookups of their struct devices during probe if they need
them for things like dev_printk() and device managed resources.
Half of the drivers doing reverse lookups also currently fail to release
the additional reference taken during the lookup, while other drivers
have had the reference leak plugged in various ways (e.g. using
non-intuitive cleanup constructs which still confuse static checkers).
Switch to using a probe callback that takes a platform device as its
first argument to simplify drivers and plug the remaining (mostly
benign) reference leaks.
Fixes: 32c6c054661a ("irqchip: Add Broadcom BCM2712 MSI-X interrupt controller")
Fixes: 70afdab904d2 ("irqchip: Add IMX MU MSI controller driver")
Fixes: a6199bb514d8 ("irqchip: Add Qualcomm MPM controller driver")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Changhuang Liang <changhuang.liang@starfivetech.com>
|