Age | Commit message (Collapse) | Author | Files | Lines |
|
We want to check for this early so it can be reattached if necessary in
check_unreachable_inodes(); better than letting it be deleted and having
the children reattached, losing their filenames.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
link count works differently in bcachefs - it's only nonzero for files
with multiple hardlinks, which means we can also avoid checking it
except for files that are known to have hardlinks.
That means we need a few different checks instead; in particular, we
don't want fsck to delet a file that has a dirent pointing to it.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
It's legal for regular files to have missing backpointers (due to
hardlinks), and fsck should automatically add them, but for directories
this is an error that should be flagged.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
The fragmentation_lru field hasn't been needed since we reworked the LRU
btrees to use the btree write buffer; previously it was used to resolve
collisions, but the revised LRU btree uses the backpointer (the bucket)
as part of the key.
It should have been deleted at the time of the LRU rework; since it
wasn't, that left places for bugs to hide, in check/repair.
This fixes LRU fsck on a filesystem image helpfully provided by a user
who disappeared before I could get his name for the reported-by.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
check_lru_key() wasn't using write buffer updates for deleting bad lru
entries - dating from before the lru btree used the btree write buffer.
And when possibly flushing the btree write buffer (to make sure we're
seeing a real inconsistency), we need to be using the modern
bch2_btree_write_buffer_maybe_flush().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Errors are getting marked as AUTOFIX once they've been (re)-tested and
audited.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Newly generated keys, in the transaction commit path or write path,
should not be AUTOFIX; those indicate bugs that we need to fail fast
for.
Fixes: 5612daafb764 ("bcachefs: Fix fsck warnings from bkey validation")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Ensure a copy of the lost+found inode exists in the snapshot that we're
reattaching, so that we don't trigger warnings in
lookup_inode_for_snapshot() later.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This fixes two different bugs:
- Looser locking with the rhashtable means we need to recheck if the
inode is still hashed after prepare_to_wait(), and add a corresponding
wakeup after removing from the hash table.
- da18ecbf0fb6 ("fs: add i_state helpers") changed the bit waitqueues
used for inodes, and bcachefs wasn't updated and thus broke; this
updates bcachefs to the new helper.
Fixes: 112d21fd1a12 ("bcachefs: switch to rhashtable for vfs inodes hash")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Uwe Kleine-König says:
====================
net: Switch back to struct platform_driver::remove()
I already sent a patch last week that is very similar to patch #1 of
this series. However the previous submission was based on plain next.
I was asked to resend based on net-next once the merge window closed,
so here comes this v2. The additional patches address drivers/net/dsa,
drivers/net/mdio and the rest of drivers/net apart from wireless which
has its own tree and will addressed separately at a later point in time.
====================
Link: https://patch.msgid.link/cover.1727949050.git.u.kleine-koenig@baylibre.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
After commit 0edb555a65d1 ("platform: Make platform_driver::remove()
return void") .remove() is (again) the right callback to implement for
platform drivers.
Convert all platform drivers below drivers/net after the previous
conversion commits apart from the wireless drivers to use .remove(),
with the eventual goal to drop struct platform_driver::remove_new(). As
.remove() and .remove_new() have the same prototypes, conversion is done
by just changing the structure member name in the driver initializer.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Acked-by: Stefan Schmidt <stefan@datenfreihafen.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
After commit 0edb555a65d1 ("platform: Make platform_driver::remove()
return void") .remove() is (again) the right callback to implement for
platform drivers.
Convert all platform drivers below drivers/net/mdio to use .remove(),
with the eventual goal to drop struct platform_driver::remove_new(). As
.remove() and .remove_new() have the same prototypes, conversion is done
by just changing the structure member name in the driver initializer.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/0b60d8bfc45a3de8193f953794dda241e11032a9.1727949050.git.u.kleine-koenig@baylibre.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
After commit 0edb555a65d1 ("platform: Make platform_driver::remove()
return void") .remove() is (again) the right callback to implement for
platform drivers.
Convert all platform drivers below drivers/net/dsa to use .remove(),
with the eventual goal to drop struct platform_driver::remove_new(). As
.remove() and .remove_new() have the same prototypes, conversion is done
by just changing the structure member name in the driver initializer.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/36da477cb9fa0bffec32d50c2cf3d18e94a0e7e3.1727949050.git.u.kleine-koenig@baylibre.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
After commit 0edb555a65d1 ("platform: Make platform_driver::remove()
return void") .remove() is (again) the right callback to implement for
platform drivers.
Convert all platform drivers below drivers/net/ethernet to use
.remove(), with the eventual goal to drop struct
platform_driver::remove_new(). As .remove() and .remove_new() have the
same prototypes, conversion is done by just changing the structure
member name in the driver initializer.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://patch.msgid.link/18f7c585a1a8a8ac8b03a2fca7de19bd5c52ac2b.1727949050.git.u.kleine-koenig@baylibre.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
If phy_read_mmd() fails, the error code stored in 'bmsr' should be returned
instead of 'val' which is likely to be 0.
Fixes: 75f4d8d10e01 ("net: phy: add Broadcom BCM84881 PHY driver")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://patch.msgid.link/3e1755b0c40340d00e089d6adae5bca2f8c79e53.1727982168.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The SF2 crossbar register is a packed bitfield, giving the index of the
external port selected for each of the internal ports. On BCM4908 (the
only currently-supported switch family with a crossbar), there are 2
internal ports and 3 external ports, so there are 2 bits per internal
port.
The driver currently conflates the "bits per port" and "number of ports"
concepts, lumping both into the `num_crossbar_int_ports` field. Since it
is currently only possible for either of these counts to have a value of
2, there is no behavioral error resulting from this situation for now.
Make the code more readable (and support the future possibility of
larger crossbars) by adding a `num_crossbar_ext_bits` field to represent
the "bits per port" count and relying on this where appropriate instead.
Signed-off-by: Sam Edwards <CFSworks@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20241003212301.1339647-1-CFSworks@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Eric Dumazet says:
====================
net: prepare pacing offload support
Some network devices have the ability to offload EDT (Earliest
Departure Time) which is the model used for TCP pacing and FQ
packet scheduler.
Some of them implement the timing wheel mechanism described in
https://saeed.github.io/files/carousel-sigcomm17.pdf
with an associated 'timing wheel horizon'.
In order to upstream the NIC support, this series adds :
1) timing wheel horizon as a per-device attribute.
2) FQ packet scheduler support, to let paced packets
below the timing wheel horizon be handled by the driver.
v1: https://lore.kernel.org/20240930152304.472767-2-edumazet@google.com
====================
Link: https://patch.msgid.link/20241003121219.2396589-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Some network devices have the ability to offload EDT (Earliest
Departure Time) which is the model used for TCP pacing and FQ packet
scheduler.
Some of them implement the timing wheel mechanism described in
https://saeed.github.io/files/carousel-sigcomm17.pdf
with an associated 'timing wheel horizon'.
This patchs adds to FQ packet scheduler TCA_FQ_OFFLOAD_HORIZON
attribute.
Its value is capped by the device max_pacing_offload_horizon,
added in the prior patch.
It allows FQ to let packets within pacing offload horizon
to be delivered to the device, which will handle the needed
delay without host involvement.
Signed-off-by: Jeffrey Ji <jeffreyji@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20241003121219.2396589-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Some network devices have the ability to offload EDT (Earliest
Departure Time) which is the model used for TCP pacing and FQ
packet scheduler.
Some of them implement the timing wheel mechanism described in
https://saeed.github.io/files/carousel-sigcomm17.pdf
with an associated 'timing wheel horizon'.
This patch adds dev->max_pacing_offload_horizon expressing
this timing wheel horizon in nsec units.
This is a read-only attribute.
Unless a driver sets it, dev->max_pacing_offload_horizon
is zero.
v2: addressed Jakub feedback ( https://lore.kernel.org/netdev/20240930152304.472767-2-edumazet@google.com/T/#mf6294d714c41cc459962154cc2580ce3c9693663 )
v3: added yaml doc (also per Jakub feedback)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20241003121219.2396589-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The kernel may crash when deleting a genetlink family if there are still
listeners for that family:
Oops: Kernel access of bad area, sig: 11 [#1]
...
NIP [c000000000c080bc] netlink_update_socket_mc+0x3c/0xc0
LR [c000000000c0f764] __netlink_clear_multicast_users+0x74/0xc0
Call Trace:
__netlink_clear_multicast_users+0x74/0xc0
genl_unregister_family+0xd4/0x2d0
Change the unsafe loop on the list to a safe one, because inside the
loop there is an element removal from this list.
Fixes: b8273570f802 ("genetlink: fix netns vs. netlink table locking (2)")
Cc: stable@vger.kernel.org
Signed-off-by: Anastasia Kovaleva <a.kovaleva@yadro.com>
Reviewed-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20241003104431.12391-1-a.kovaleva@yadro.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
With the inclusion of commit c259acab839e ("ptp/ioctl: support
MONOTONIC{,_RAW} timestamps for PTP_SYS_OFFSET_EXTENDED") clock_gettime()
now allows retrieval of pre/post timestamps for CLOCK_MONOTONIC and
CLOCK_MONOTONIC_RAW timebases along with the previously supported
CLOCK_REALTIME.
This patch adds a command line option 'y' to the testptp program to
choose one of the allowed timebases [realtime aka system, monotonic,
and monotonic-raw).
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Link: https://patch.msgid.link/20241003101506.769418-1-maheshb@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Eric Dumazet says:
====================
tcp: add fast path in timer handlers
As mentioned in Netconf 2024:
TCP retransmit and delack timers are not stopped from
inet_csk_clear_xmit_timer() because we do not define
INET_CSK_CLEAR_TIMERS.
Enabling INET_CSK_CLEAR_TIMERS leads to lower performance,
mainly because del_timer() and mod_timer() happen from
different cpus quite often.
What we can do instead is to add fast paths to tcp_write_timer()
and tcp_delack_timer() to avoid socket spinlock acquisition.
====================
Link: https://patch.msgid.link/20241002173042.917928-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
delack timer is not stopped from inet_csk_clear_xmit_timer()
because we do not define INET_CSK_CLEAR_TIMERS.
This is a conscious choice : inet_csk_clear_xmit_timer()
is often called from another cpu. Calling del_timer()
would cause false sharing and lock contention.
This means that very often, tcp_delack_timer() is called
at the timer expiration, while there is no ACK to transmit.
This can be detected very early, avoiding the socket spinlock.
Notes:
- test about tp->compressed_ack is racy,
but in the unlikely case there is a race, the dedicated
compressed_ack_timer hrtimer would close it.
- Even if the fast path is not taken, reading
icsk->icsk_ack.pending and tp->compressed_ack
before acquiring the socket spinlock reduces
acquisition time and chances of contention.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241002173042.917928-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
retransmit timer is not stopped from inet_csk_clear_xmit_timer()
because we do not define INET_CSK_CLEAR_TIMERS.
This is a conscious choice : for active TCP flows, it is better
to only call mod_timer(), because there is more chances of
keeping the timer unchanged. Also inet_csk_clear_xmit_timer()
is often called from another cpu, and calling del_timer()
would cause false sharing and lock contention.
This means that very often, tcp_write_timer() is called
at the timer expiration, while there is nothing to retransmit.
This can be detected very early, avoiding the socket spinlock.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241002173042.917928-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
icsk->icsk_pending can be read locklessly already.
Following patch in the series will add another lockless read.
Add smp_load_acquire() and smp_store_release() annotations
because following patch will add a test in tcp_write_timer(),
and READ_ONCE()/WRITE_ONCE() alone would possibly lead to races.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241002173042.917928-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Justin Iurman says:
====================
selftests: net: ioam: add tunsrc support
TL;DR This patch comes from a discussion we had with Jakub and Paolo on
aligning the ioam selftests with its new "tunsrc" feature.
This patch updates the IOAM selftests to support the new "tunsrc"
feature of IOAM. As a consequence, some changes were required. For
example, the IPv6 header must be accessed to check some fields (i.e.,
the source address for the "tunsrc" feature), which is not possible
AFAIK with IPv6 raw sockets. The latter is currently used with
IPV6_RECVHOPOPTS and was introduced by commit 187bbb6968af ("selftests:
ioam: refactoring to align with the fix") to fix an issue. But, we
really need packet sockets actually... which is one of the changes in
this patch (see the description of the topology at the top of ioam6.sh
for explanations). Another change is that all IPv6 addresses used in the
topology are now based on the documentation prefix (2001:db8::/32).
Also, the tests have been improved and there are now many more of them.
Overall, the script is more robust.
====================
Link: https://patch.msgid.link/20241002162731.19847-1-justin.iurman@uliege.be
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch re-adds the (updated) ioam selftests with support for the
tunsrc feature.
Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Link: https://patch.msgid.link/20241002162731.19847-3-justin.iurman@uliege.be
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch entirely removes the ioam selftests to prepare for the next
patch in this series, which re-adds the new ioam selftests for better
readability.
Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Link: https://patch.msgid.link/20241002162731.19847-2-justin.iurman@uliege.be
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This adds control over the hardware LEDs in the Marvell
MV88E6xxx DSA switch and enables it for MV88E6352.
This fixes an imminent problem on the Inteno XG6846 which
has a WAN LED that simply do not work with hardware
defaults: driver amendment is necessary.
The patch is modeled after Christian Marangis LED support
code for the QCA8k DSA switch, I got help with the register
definitions from Tim Harvey.
After this patch it is possible to activate hardware link
indication like this (or with a similar script):
cd /sys/class/leds/Marvell\ 88E6352:05:00:green:wan/
echo netdev > trigger
echo 1 > link
This makes the green link indicator come up on any link
speed. It is also possible to be more elaborate, like this:
cd /sys/class/leds/Marvell\ 88E6352:05:00:green:wan/
echo netdev > trigger
echo 1 > link_1000
cd /sys/class/leds/Marvell\ 88E6352:05:01:amber:wan/
echo netdev > trigger
echo 1 > link_100
Making the green LED come on for a gigabit link and the
amber LED come on for a 100 mbit link.
Each port has 2 LED slots (the hardware may use just one or
none) and the hardware triggers are specified in four bits per
LED, and some of the hardware triggers are only available on the
SFP (fiber) uplink. The restrictions are described in the
port.h header file where the registers are described. For
example, selector 1 set for LED 1 on port 5 or 6 will indicate
Fiber 1000 (gigabit) and activity with a blinking LED, but
ONLY for an SFP connection. If port 5/6 is used with something
not SFP, this selector is a noop: something else need to be
selected.
After the previous series rewriting the MV88E6xxx DT
bindings to use YAML a "leds" subnode is already valid
for each port, in my scratch device tree it looks like
this:
leds {
#address-cells = <1>;
#size-cells = <0>;
led@0 {
reg = <0>;
color = <LED_COLOR_ID_GREEN>;
function = LED_FUNCTION_LAN;
default-state = "off";
linux,default-trigger = "netdev";
};
led@1 {
reg = <1>;
color = <LED_COLOR_ID_AMBER>;
function = LED_FUNCTION_LAN;
default-state = "off";
};
};
This DT config is not yet configuring everything: when the netdev
default trigger is assigned the hw acceleration callbacks are
not called, and there is no way to set the netdev sub-trigger
type (such as link_1000) from the device tree, such as if you want
a gigabit link indicator. This has to be done from userspace at
this point.
We add LED operations to all switches in the 6352 family:
6172, 6176, 6240 and 6352.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20241001-mv88e6xxx-leds-v4-1-cc11c4f49b18@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Wesley reported an issue:
==================================================================
EXT4-fs (dm-5): resizing filesystem from 7168 to 786432 blocks
------------[ cut here ]------------
kernel BUG at fs/ext4/resize.c:324!
CPU: 9 UID: 0 PID: 3576 Comm: resize2fs Not tainted 6.11.0+ #27
RIP: 0010:ext4_resize_fs+0x1212/0x12d0
Call Trace:
__ext4_ioctl+0x4e0/0x1800
ext4_ioctl+0x12/0x20
__x64_sys_ioctl+0x99/0xd0
x64_sys_call+0x1206/0x20d0
do_syscall_64+0x72/0x110
entry_SYSCALL_64_after_hwframe+0x76/0x7e
==================================================================
While reviewing the patch, Honza found that when adjusting resize_bg in
alloc_flex_gd(), it was possible for flex_gd->resize_bg to be bigger than
flexbg_size.
The reproduction of the problem requires the following:
o_group = flexbg_size * 2 * n;
o_size = (o_group + 1) * group_size;
n_group: [o_group + flexbg_size, o_group + flexbg_size * 2)
o_size = (n_group + 1) * group_size;
Take n=0,flexbg_size=16 as an example:
last:15
|o---------------|--------------n-|
o_group:0 resize to n_group:30
The corresponding reproducer is:
img=test.img
rm -f $img
truncate -s 600M $img
mkfs.ext4 -F $img -b 1024 -G 16 8M
dev=`losetup -f --show $img`
mkdir -p /tmp/test
mount $dev /tmp/test
resize2fs $dev 248M
Delete the problematic plus 1 to fix the issue, and add a WARN_ON_ONCE()
to prevent the issue from happening again.
[ Note: another reproucer which this commit fixes is:
img=test.img
rm -f $img
truncate -s 25MiB $img
mkfs.ext4 -b 4096 -E nodiscard,lazy_itable_init=0,lazy_journal_init=0 $img
truncate -s 3GiB $img
dev=`losetup -f --show $img`
mkdir -p /tmp/test
mount $dev /tmp/test
resize2fs $dev 3G
umount $dev
losetup -d $dev
-- TYT ]
Reported-by: Wesley Hershberger <wesley.hershberger@canonical.com>
Closes: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2081231
Reported-by: Stéphane Graber <stgraber@stgraber.org>
Closes: https://lore.kernel.org/all/20240925143325.518508-1-aleksandr.mikhalitsyn@canonical.com/
Tested-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
Tested-by: Eric Sandeen <sandeen@redhat.com>
Fixes: 665d3e0af4d3 ("ext4: reduce unnecessary memory allocation in alloc_flex_gd()")
Cc: stable@vger.kernel.org
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240927133329.1015041-1-libaokun@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
Calling ext4_fc_mark_ineligible() with a NULL handle is racy and may result
in a fast-commit being done before the filesystem is effectively marked as
ineligible. This patch moves the call to this function so that an handle
can be used. If a transaction fails to start, then there's not point in
trying to mark the filesystem as ineligible, and an error will eventually be
returned to user-space.
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240923104909.18342-3-luis.henriques@linux.dev
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
|
|
Calling ext4_fc_mark_ineligible() with a NULL handle is racy and may result
in a fast-commit being done before the filesystem is effectively marked as
ineligible. This patch fixes the calls to this function in
__track_dentry_update() by adding an extra parameter to the callback used in
ext4_fc_track_template().
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240923104909.18342-2-luis.henriques@linux.dev
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
|
|
Commit 4e0a1d8b0675
("Bluetooth: btusb: Don't suspend when there are connections")
introduces a check for connections to prevent auto-suspend but that
actually ignored the fact the .suspend callback can be called for
external suspend requests which
Documentation/driver-api/usb/power-management.rst states the following:
'External suspend calls should never be allowed to fail in this way,
only autosuspend calls. The driver can tell them apart by applying
the :c:func:`PMSG_IS_AUTO` macro to the message argument to the
``suspend`` method; it will return True for internal PM events
(autosuspend) and False for external PM events.'
In addition to that align system suspend with USB suspend by using
hci_suspend_dev since otherwise the stack would be expecting events
such as advertising reports which may not be delivered while the
transport is suspended.
Fixes: 4e0a1d8b0675 ("Bluetooth: btusb: Don't suspend when there are connections")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Tested-by: Kiran K <kiran.k@intel.com>
|
|
This checks if the ACL connection remains valid as it could be destroyed
while hci_enhanced_setup_sync is pending on cmd_sync leading to the
following trace:
BUG: KASAN: slab-use-after-free in hci_enhanced_setup_sync+0x91b/0xa60
Read of size 1 at addr ffff888002328ffd by task kworker/u5:2/37
CPU: 0 UID: 0 PID: 37 Comm: kworker/u5:2 Not tainted 6.11.0-rc6-01300-g810be445d8d6 #7099
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014
Workqueue: hci0 hci_cmd_sync_work
Call Trace:
<TASK>
dump_stack_lvl+0x5d/0x80
? hci_enhanced_setup_sync+0x91b/0xa60
print_report+0x152/0x4c0
? hci_enhanced_setup_sync+0x91b/0xa60
? __virt_addr_valid+0x1fa/0x420
? hci_enhanced_setup_sync+0x91b/0xa60
kasan_report+0xda/0x1b0
? hci_enhanced_setup_sync+0x91b/0xa60
hci_enhanced_setup_sync+0x91b/0xa60
? __pfx_hci_enhanced_setup_sync+0x10/0x10
? __pfx___mutex_lock+0x10/0x10
hci_cmd_sync_work+0x1c2/0x330
process_one_work+0x7d9/0x1360
? __pfx_lock_acquire+0x10/0x10
? __pfx_process_one_work+0x10/0x10
? assign_work+0x167/0x240
worker_thread+0x5b7/0xf60
? __kthread_parkme+0xac/0x1c0
? __pfx_worker_thread+0x10/0x10
? __pfx_worker_thread+0x10/0x10
kthread+0x293/0x360
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2f/0x70
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK>
Allocated by task 34:
kasan_save_stack+0x30/0x50
kasan_save_track+0x14/0x30
__kasan_kmalloc+0x8f/0xa0
__hci_conn_add+0x187/0x17d0
hci_connect_sco+0x2e1/0xb90
sco_sock_connect+0x2a2/0xb80
__sys_connect+0x227/0x2a0
__x64_sys_connect+0x6d/0xb0
do_syscall_64+0x71/0x140
entry_SYSCALL_64_after_hwframe+0x76/0x7e
Freed by task 37:
kasan_save_stack+0x30/0x50
kasan_save_track+0x14/0x30
kasan_save_free_info+0x3b/0x60
__kasan_slab_free+0x101/0x160
kfree+0xd0/0x250
device_release+0x9a/0x210
kobject_put+0x151/0x280
hci_conn_del+0x448/0xbf0
hci_abort_conn_sync+0x46f/0x980
hci_cmd_sync_work+0x1c2/0x330
process_one_work+0x7d9/0x1360
worker_thread+0x5b7/0xf60
kthread+0x293/0x360
ret_from_fork+0x2f/0x70
ret_from_fork_asm+0x1a/0x30
Cc: stable@vger.kernel.org
Fixes: e07a06b4eb41 ("Bluetooth: Convert SCO configure_datapath to hci_sync")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
rfcomm_sk_state_change attempts to use sock_lock so it must never be
called with it locked but rfcomm_sock_ioctl always attempt to lock it
causing the following trace:
======================================================
WARNING: possible circular locking dependency detected
6.8.0-syzkaller-08951-gfe46a7dd189e #0 Not tainted
------------------------------------------------------
syz-executor386/5093 is trying to acquire lock:
ffff88807c396258 (sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1671 [inline]
ffff88807c396258 (sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM){+.+.}-{0:0}, at: rfcomm_sk_state_change+0x5b/0x310 net/bluetooth/rfcomm/sock.c:73
but task is already holding lock:
ffff88807badfd28 (&d->lock){+.+.}-{3:3}, at: __rfcomm_dlc_close+0x226/0x6a0 net/bluetooth/rfcomm/core.c:491
Reported-by: syzbot+d7ce59b06b3eb14fd218@syzkaller.appspotmail.com
Tested-by: syzbot+d7ce59b06b3eb14fd218@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=d7ce59b06b3eb14fd218
Fixes: 3241ad820dbb ("[Bluetooth] Add timestamp support to L2CAP, RFCOMM and SCO")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
PSE controllers like the TPS23881 can forcefully turn off their
configuration state. In such cases, the is_enabled() and get_status()
callbacks will report the PSE as disabled, while admin_state_enabled
will show it as enabled. This mismatch can lead the user to attempt
to enable it, but no action is taken as admin_state_enabled remains set.
The solution is to disable the PSE before enabling it, ensuring the
actual status matches admin_state_enabled.
Fixes: d83e13761d5b ("net: pse-pd: Use regulator framework within PSE framework")
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20241002121706.246143-1-kory.maincent@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently, the second bridge command overwrites the first one.
Fix this by adding this VID to the interface behind $swp2.
The one_bridge_two_pvids() test intends to check that there is no
leakage of traffic between bridge ports which have a single VLAN - the
PVID VLAN.
Because of a typo, port $swp1 is configured with a PVID twice (second
command overwrites first), and $swp2 isn't configured at all (and since
the bridge vlan_default_pvid property is set to 0, this port will not
have a PVID at all, so it will drop all untagged and priority-tagged
traffic).
So, instead of testing the configuration that was intended, we are
testing a different one, where one port has PVID 2 and the other has
no PVID. This incorrect version of the test should also pass, but is
ineffective for its purpose, so fix the typo.
This typo has an impact on results of the test,
potentially leading to wrong conclusions regarding
the functionality of a network device.
The tests results:
TEST: Switch ports in VLAN-aware bridge with different PVIDs:
Unicast non-IP untagged [ OK ]
Multicast non-IP untagged [ OK ]
Broadcast non-IP untagged [ OK ]
Unicast IPv4 untagged [ OK ]
Multicast IPv4 untagged [ OK ]
Unicast IPv6 untagged [ OK ]
Multicast IPv6 untagged [ OK ]
Unicast non-IP VID 1 [ OK ]
Multicast non-IP VID 1 [ OK ]
Broadcast non-IP VID 1 [ OK ]
Unicast IPv4 VID 1 [ OK ]
Multicast IPv4 VID 1 [ OK ]
Unicast IPv6 VID 1 [ OK ]
Multicast IPv6 VID 1 [ OK ]
Unicast non-IP VID 4094 [ OK ]
Multicast non-IP VID 4094 [ OK ]
Broadcast non-IP VID 4094 [ OK ]
Unicast IPv4 VID 4094 [ OK ]
Multicast IPv4 VID 4094 [ OK ]
Unicast IPv6 VID 4094 [ OK ]
Multicast IPv6 VID 4094 [ OK ]
Fixes: 476a4f05d9b8 ("selftests: forwarding: add a no_forwarding.sh test")
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Kacper Ludwinski <kac.ludwinski@icloud.com>
Link: https://patch.msgid.link/20241002051016.849-1-kac.ludwinski@icloud.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
568894edbe48 ("sched_ext: Add scx_cgroup_enabled to gate cgroup operations
and fix scx_tg_online()") assumed that scx_cgroup_exit() is only called
after scx_cgroup_init() finished successfully. This isn't true.
scx_cgroup_exit() can be called without scx_cgroup_init() being called at
all or after scx_cgroup_init() failed in the middle.
As init state is tracked per cgroup, scx_cgroup_exit() can be used safely to
clean up in all cases. Remove the incorrect WARN_ON_ONCE().
Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: 568894edbe48 ("sched_ext: Add scx_cgroup_enabled to gate cgroup operations and fix scx_tg_online()")
|
|
When the BPF scheduler fails, ops.exit() allows rich error reporting through
scx_exit_info. Use scx.exit() path consistently for all failures which can
be caused by the BPF scheduler:
- scx_ops_error() is called after ops.init() and ops.cgroup_init() failure
to record error information.
- ops.init_task() failure now uses scx_ops_error() instead of pr_err().
- The err_disable path updated to automatically trigger scx_ops_error() to
cover cases that the error message hasn't already been generated and
always return 0 indicating init success so that the error is reported
through ops.exit().
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: David Vernet <void@manifault.com>
Cc: Daniel Hodges <hodges.daniel.scott@gmail.com>
Cc: Changwoo Min <multics69@gmail.com>
Cc: Andrea Righi <andrea.righi@linux.dev>
Cc: Dan Schatzberg <schatzberg.dan@gmail.com>
|
|
Current code allocates the pcpu_sum array with size num_possible_cpus().
This code assumes the cpu_possible_mask is dense, which is not true in
the general case per [1]. If cpu_possible_mask is sparse, the array
might be indexed by a value beyond the size of the array.
However, the configurations that Hyper-V provides to guest VMs on x86
and ARM64 hardware, in combination with how architecture specific code
assigns Linux CPU numbers, *does* always produce a dense cpu_possible_mask.
So the dense assumption is not currently causing failures. But for
robustness against future changes in how cpu_possible_mask is populated,
update the code to no longer assume dense.
The correct approach is to allocate and initialize the array using size
"nr_cpu_ids". While this leaves unused array entries corresponding to
holes in cpu_possible_mask, the holes are assumed to be minimal and hence
the amount of memory wasted by unused entries is minimal.
[1] https://lore.kernel.org/lkml/SN6PR02MB4157210CC36B2593F8572E5ED4692@SN6PR02MB4157.namprd02.prod.outlook.com/
Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20241003035333.49261-6-mhklinux@outlook.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This warning is emitted when a driver does not default populate an rss
key when one is not provided from userspace. Some devices do not
support individual rss keys per context. For these devices, it is ok
to leave the key zeroed out in ethtool_rxfh_context. Do not warn on
zeroed key when ethtool_ops.rxfh_per_ctx_key == 0.
Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com>
Link: https://patch.msgid.link/20241003162310.1310576-1-daniel.zahka@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2024-10-01 (ice)
This series contains updates to ice driver only.
Karol cleans up current PTP GPIO pin handling, fixes minor bugs,
refactors implementation for all products, introduces SDP (Software
Definable Pins) for E825C and implements reading SDP section from NVM
for E810 products.
Sergey replaces multiple aux buses and devices used in the PTP support
code with struct ice_adapter holding the necessary shared data.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
ice: Drop auxbus use for PTP to finalize ice_adapter move
ice: Use ice_adapter for PTP shared data instead of auxdev
ice: Initial support for E825C hardware in ice_adapter
ice: Add ice_get_ctrl_ptp() wrapper to simplify the code
ice: Introduce ice_get_phy_model() wrapper
ice: Enable 1PPS out from CGU for E825C products
ice: Read SDP section from NVM for pin definitions
ice: Disable shared pin on E810 on setfunc
ice: Cache perout/extts requests and check flags
ice: Align E810T GPIO to other products
ice: Add SDPs support for E825C
ice: Implement ice_ptp_pin_desc
====================
Link: https://patch.msgid.link/20241001201702.3252954-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
"A couple of build/config issues and expanding the speculative SSBS
workaround to more CPUs:
- Expand the speculative SSBS workaround to cover Cortex-A715,
Neoverse-N3 and Microsoft Azure Cobalt 100
- Force position-independent veneers - in some kernel configurations,
the LLD linker generates position-dependent veneers for otherwise
position-independent code, resulting in early boot-time failures
- Fix Kconfig selection of HAVE_DYNAMIC_FTRACE_WITH_ARGS so that it
is not enabled when not supported by the combination of clang and
GNU ld"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Subscribe Microsoft Azure Cobalt 100 to erratum 3194386
arm64: fix selection of HAVE_DYNAMIC_FTRACE_WITH_ARGS
arm64: errata: Expand speculative SSBS workaround once more
arm64: cputype: Add Neoverse-N3 definitions
arm64: Force position-independent veneers
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
- PERF_TYPE_BREAKPOINT now returns -EOPNOTSUPP instead of -ENOENT,
which aligns to other ports and is a saner value
- The KASAN-related stack size increasing logic has been moved to a C
header, to avoid dependency issues
* tag 'riscv-for-linus-6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Fix kernel stack size when KASAN is enabled
drivers/perf: riscv: Align errno for unsupported perf event
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Fix tp_printk command line option crashing the kernel
With the code that can handle a buffer from a previous boot, the
trace_check_vprintf() needed access to the delta of the address space
used by the old buffer and the current buffer. To do so, the
trace_array (tr) parameter was used. But when tp_printk is enabled on
the kernel command line, no trace buffer is used and the trace event
is sent directly to printk(). That meant the tr field of the iterator
descriptor was NULL, and since tp_printk still uses
trace_check_vprintf() it caused a NULL dereference.
- Add ptrace.h include to x86 ftrace file for completeness
- Fix rtla installation when done with out-of-tree build
- Fix the help messages in rtla that were incorrect
- Several fixes to fix races with the timerlat and hwlat code
Several locking issues were discovered with the coordination between
timerlat kthread creation and hotplug. As timerlat has callbacks from
hotplug code to start kthreads when CPUs come online. There are also
locking issues with grabbing the cpu_read_lock() and the locks within
timerlat.
* tag 'trace-v6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing/hwlat: Fix a race during cpuhp processing
tracing/timerlat: Fix a race during cpuhp processing
tracing/timerlat: Drop interface_lock in stop_kthread()
tracing/timerlat: Fix duplicated kthread creation due to CPU online/offline
x86/ftrace: Include <asm/ptrace.h>
rtla: Fix the help text in osnoise and timerlat top tools
tools/rtla: Fix installation from out-of-tree build
tracing: Fix trace_check_vprintf() when tp_printk is used
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab fixes from Vlastimil Babka:
"Fixes for issues introduced in this merge window: kobject memory leak,
unsupressed warning and possible lockup in new slub_kunit tests,
misleading code in kvfree_rcu_queue_batch()"
* tag 'slab-for-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
slub/kunit: skip test_kfree_rcu when the slub kunit test is built-in
mm, slab: suppress warnings in test_leak_destroy kunit test
rcu/kvfree: Refactor kvfree_rcu_queue_batch()
mm, slab: fix use of SLAB_SUPPORTS_SYSFS in kmem_cache_release()
|
|
Nick Child says:
====================
ibmvnic: Fix for send scrq direct
This is a v2 of a patchset (now just patch) which addresses a
bug in a new feature which is causing major link UP issues with
certain physical cards.
For a full summary of the issue:
1. During vnic initialization we get the following values from vnic
server regarding "Transmit / Receive Descriptor Requirement" (see
PAPR Table 584. CAPABILITIES Commands):
- LSO Tx frame = 0x0F , header offsets + L2, L3, L4 headers required
- CSO Tx frame = 0x0C , header offsets + L2 header required
- standard frame = 0x0C , header offsets + L2 header required
2. Assume we are dealing with only "standard frames" from now on (no
CSO, no LSO)
3. When using 100G backing device, we don't hand vnic server any header
information and TX is successful
4. When using 25G backing device, we don't hand vnic server any header
information and TX fails and we get "Adapter Error" transport events.
The obvious issue here is that vnic client should be respecting the 0X0C
header requirement for standard frames. But 100G cards will also give
0x0C despite the fact that we know TX works if we ignore it. That being
said, we still must respect values given from the managing server. Will
need to work with them going forward to hopefully get 100G cards to
return 0x00 for this bitstring so the performance gains of using
send_subcrq_direct can be continued.
====================
Link: https://patch.msgid.link/20241001163200.1802522-1-nnac123@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Previously, the TX header requirement for standard frames was ignored.
This requirement is a bitstring sent from the VIOS which maps to the
type of header information needed during TX. If no header information,
is needed then send subcrq direct can be used (which can be more
performant).
This bitstring was previously ignored for standard packets (AKA non LSO,
non CSO) due to the belief that the bitstring was over-cautionary. It
turns out that there are some configurations where the backing device
does need header information for transmission of standard packets. If
the information is not supplied then this causes continuous "Adapter
error" transport events. Therefore, this bitstring should be respected
and observed before considering the use of send subcrq direct.
Fixes: 74839f7a8268 ("ibmvnic: Introduce send sub-crq direct")
Signed-off-by: Nick Child <nnac123@linux.ibm.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241001163200.1802522-2-nnac123@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|