Age | Commit message (Collapse) | Author | Files | Lines |
|
commit 34e49994d0dcdb2d31d4d2908d04f4e9ce57e4d7 upstream.
The free space tree bitmap slab cache is created with SLAB_RED_ZONE but
that's a debugging flag and not always enabled. Also the other slabs are
created with at least SLAB_MEM_SPREAD that we want as well to average
the memory placement cost.
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Fixes: 3acd48507dc4 ("btrfs: fix allocation of free space cache v1 bitmap pages")
CC: stable@vger.kernel.org # 5.4+
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit dbcc7d57bffc0c8cac9dac11bec548597d59a6a5 upstream.
While resolving backreferences, as part of a logical ino ioctl call or
fiemap, we can end up hitting a BUG_ON() when replaying tree mod log
operations of a root, triggering a stack trace like the following:
------------[ cut here ]------------
kernel BUG at fs/btrfs/ctree.c:1210!
invalid opcode: 0000 [#1] SMP KASAN PTI
CPU: 1 PID: 19054 Comm: crawl_335 Tainted: G W 5.11.0-2d11c0084b02-misc-next+ #89
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
RIP: 0010:__tree_mod_log_rewind+0x3b1/0x3c0
Code: 05 48 8d 74 10 (...)
RSP: 0018:ffffc90001eb70b8 EFLAGS: 00010297
RAX: 0000000000000000 RBX: ffff88812344e400 RCX: ffffffffb28933b6
RDX: 0000000000000007 RSI: dffffc0000000000 RDI: ffff88812344e42c
RBP: ffffc90001eb7108 R08: 1ffff11020b60a20 R09: ffffed1020b60a20
R10: ffff888105b050f9 R11: ffffed1020b60a1f R12: 00000000000000ee
R13: ffff8880195520c0 R14: ffff8881bc958500 R15: ffff88812344e42c
FS: 00007fd1955e8700(0000) GS:ffff8881f5600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007efdb7928718 CR3: 000000010103a006 CR4: 0000000000170ee0
Call Trace:
btrfs_search_old_slot+0x265/0x10d0
? lock_acquired+0xbb/0x600
? btrfs_search_slot+0x1090/0x1090
? free_extent_buffer.part.61+0xd7/0x140
? free_extent_buffer+0x13/0x20
resolve_indirect_refs+0x3e9/0xfc0
? lock_downgrade+0x3d0/0x3d0
? __kasan_check_read+0x11/0x20
? add_prelim_ref.part.11+0x150/0x150
? lock_downgrade+0x3d0/0x3d0
? __kasan_check_read+0x11/0x20
? lock_acquired+0xbb/0x600
? __kasan_check_write+0x14/0x20
? do_raw_spin_unlock+0xa8/0x140
? rb_insert_color+0x30/0x360
? prelim_ref_insert+0x12d/0x430
find_parent_nodes+0x5c3/0x1830
? resolve_indirect_refs+0xfc0/0xfc0
? lock_release+0xc8/0x620
? fs_reclaim_acquire+0x67/0xf0
? lock_acquire+0xc7/0x510
? lock_downgrade+0x3d0/0x3d0
? lockdep_hardirqs_on_prepare+0x160/0x210
? lock_release+0xc8/0x620
? fs_reclaim_acquire+0x67/0xf0
? lock_acquire+0xc7/0x510
? poison_range+0x38/0x40
? unpoison_range+0x14/0x40
? trace_hardirqs_on+0x55/0x120
btrfs_find_all_roots_safe+0x142/0x1e0
? find_parent_nodes+0x1830/0x1830
? btrfs_inode_flags_to_xflags+0x50/0x50
iterate_extent_inodes+0x20e/0x580
? tree_backref_for_extent+0x230/0x230
? lock_downgrade+0x3d0/0x3d0
? read_extent_buffer+0xdd/0x110
? lock_downgrade+0x3d0/0x3d0
? __kasan_check_read+0x11/0x20
? lock_acquired+0xbb/0x600
? __kasan_check_write+0x14/0x20
? _raw_spin_unlock+0x22/0x30
? __kasan_check_write+0x14/0x20
iterate_inodes_from_logical+0x129/0x170
? iterate_inodes_from_logical+0x129/0x170
? btrfs_inode_flags_to_xflags+0x50/0x50
? iterate_extent_inodes+0x580/0x580
? __vmalloc_node+0x92/0xb0
? init_data_container+0x34/0xb0
? init_data_container+0x34/0xb0
? kvmalloc_node+0x60/0x80
btrfs_ioctl_logical_to_ino+0x158/0x230
btrfs_ioctl+0x205e/0x4040
? __might_sleep+0x71/0xe0
? btrfs_ioctl_get_supported_features+0x30/0x30
? getrusage+0x4b6/0x9c0
? __kasan_check_read+0x11/0x20
? lock_release+0xc8/0x620
? __might_fault+0x64/0xd0
? lock_acquire+0xc7/0x510
? lock_downgrade+0x3d0/0x3d0
? lockdep_hardirqs_on_prepare+0x210/0x210
? lockdep_hardirqs_on_prepare+0x210/0x210
? __kasan_check_read+0x11/0x20
? do_vfs_ioctl+0xfc/0x9d0
? ioctl_file_clone+0xe0/0xe0
? lock_downgrade+0x3d0/0x3d0
? lockdep_hardirqs_on_prepare+0x210/0x210
? __kasan_check_read+0x11/0x20
? lock_release+0xc8/0x620
? __task_pid_nr_ns+0xd3/0x250
? lock_acquire+0xc7/0x510
? __fget_files+0x160/0x230
? __fget_light+0xf2/0x110
__x64_sys_ioctl+0xc3/0x100
do_syscall_64+0x37/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fd1976e2427
Code: 00 00 90 48 8b 05 (...)
RSP: 002b:00007fd1955e5cf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007fd1955e5f40 RCX: 00007fd1976e2427
RDX: 00007fd1955e5f48 RSI: 00000000c038943b RDI: 0000000000000004
RBP: 0000000001000000 R08: 0000000000000000 R09: 00007fd1955e6120
R10: 0000557835366b00 R11: 0000000000000246 R12: 0000000000000004
R13: 00007fd1955e5f48 R14: 00007fd1955e5f40 R15: 00007fd1955e5ef8
Modules linked in:
---[ end trace ec8931a1c36e57be ]---
(gdb) l *(__tree_mod_log_rewind+0x3b1)
0xffffffff81893521 is in __tree_mod_log_rewind (fs/btrfs/ctree.c:1210).
1205 * the modification. as we're going backwards, we do the
1206 * opposite of each operation here.
1207 */
1208 switch (tm->op) {
1209 case MOD_LOG_KEY_REMOVE_WHILE_FREEING:
1210 BUG_ON(tm->slot < n);
1211 fallthrough;
1212 case MOD_LOG_KEY_REMOVE_WHILE_MOVING:
1213 case MOD_LOG_KEY_REMOVE:
1214 btrfs_set_node_key(eb, &tm->key, tm->slot);
Here's what happens to hit that BUG_ON():
1) We have one tree mod log user (through fiemap or the logical ino ioctl),
with a sequence number of 1, so we have fs_info->tree_mod_seq == 1;
2) Another task is at ctree.c:balance_level() and we have eb X currently as
the root of the tree, and we promote its single child, eb Y, as the new
root.
Then, at ctree.c:balance_level(), we call:
tree_mod_log_insert_root(eb X, eb Y, 1);
3) At tree_mod_log_insert_root() we create tree mod log elements for each
slot of eb X, of operation type MOD_LOG_KEY_REMOVE_WHILE_FREEING each
with a ->logical pointing to ebX->start. These are placed in an array
named tm_list.
Lets assume there are N elements (N pointers in eb X);
4) Then, still at tree_mod_log_insert_root(), we create a tree mod log
element of operation type MOD_LOG_ROOT_REPLACE, ->logical set to
ebY->start, ->old_root.logical set to ebX->start, ->old_root.level set
to the level of eb X and ->generation set to the generation of eb X;
5) Then tree_mod_log_insert_root() calls tree_mod_log_free_eb() with
tm_list as argument. After that, tree_mod_log_free_eb() calls
__tree_mod_log_insert() for each member of tm_list in reverse order,
from highest slot in eb X, slot N - 1, to slot 0 of eb X;
6) __tree_mod_log_insert() sets the sequence number of each given tree mod
log operation - it increments fs_info->tree_mod_seq and sets
fs_info->tree_mod_seq as the sequence number of the given tree mod log
operation.
This means that for the tm_list created at tree_mod_log_insert_root(),
the element corresponding to slot 0 of eb X has the highest sequence
number (1 + N), and the element corresponding to the last slot has the
lowest sequence number (2);
7) Then, after inserting tm_list's elements into the tree mod log rbtree,
the MOD_LOG_ROOT_REPLACE element is inserted, which gets the highest
sequence number, which is N + 2;
8) Back to ctree.c:balance_level(), we free eb X by calling
btrfs_free_tree_block() on it. Because eb X was created in the current
transaction, has no other references and writeback did not happen for
it, we add it back to the free space cache/tree;
9) Later some other task T allocates the metadata extent from eb X, since
it is marked as free space in the space cache/tree, and uses it as a
node for some other btree;
10) The tree mod log user task calls btrfs_search_old_slot(), which calls
get_old_root(), and finally that calls __tree_mod_log_oldest_root()
with time_seq == 1 and eb_root == eb Y;
11) First iteration of the while loop finds the tree mod log element with
sequence number N + 2, for the logical address of eb Y and of type
MOD_LOG_ROOT_REPLACE;
12) Because the operation type is MOD_LOG_ROOT_REPLACE, we don't break out
of the loop, and set root_logical to point to tm->old_root.logical
which corresponds to the logical address of eb X;
13) On the next iteration of the while loop, the call to
tree_mod_log_search_oldest() returns the smallest tree mod log element
for the logical address of eb X, which has a sequence number of 2, an
operation type of MOD_LOG_KEY_REMOVE_WHILE_FREEING and corresponds to
the old slot N - 1 of eb X (eb X had N items in it before being freed);
14) We then break out of the while loop and return the tree mod log operation
of type MOD_LOG_ROOT_REPLACE (eb Y), and not the one for slot N - 1 of
eb X, to get_old_root();
15) At get_old_root(), we process the MOD_LOG_ROOT_REPLACE operation
and set "logical" to the logical address of eb X, which was the old
root. We then call tree_mod_log_search() passing it the logical
address of eb X and time_seq == 1;
16) Then before calling tree_mod_log_search(), task T adds a key to eb X,
which results in adding a tree mod log operation of type
MOD_LOG_KEY_ADD to the tree mod log - this is done at
ctree.c:insert_ptr() - but after adding the tree mod log operation
and before updating the number of items in eb X from 0 to 1...
17) The task at get_old_root() calls tree_mod_log_search() and gets the
tree mod log operation of type MOD_LOG_KEY_ADD just added by task T.
Then it enters the following if branch:
if (old_root && tm && tm->op != MOD_LOG_KEY_REMOVE_WHILE_FREEING) {
(...)
} (...)
Calls read_tree_block() for eb X, which gets a reference on eb X but
does not lock it - task T has it locked.
Then it clones eb X while it has nritems set to 0 in its header, before
task T sets nritems to 1 in eb X's header. From hereupon we use the
clone of eb X which no other task has access to;
18) Then we call __tree_mod_log_rewind(), passing it the MOD_LOG_KEY_ADD
mod log operation we just got from tree_mod_log_search() in the
previous step and the cloned version of eb X;
19) At __tree_mod_log_rewind(), we set the local variable "n" to the number
of items set in eb X's clone, which is 0. Then we enter the while loop,
and in its first iteration we process the MOD_LOG_KEY_ADD operation,
which just decrements "n" from 0 to (u32)-1, since "n" is declared with
a type of u32. At the end of this iteration we call rb_next() to find the
next tree mod log operation for eb X, that gives us the mod log operation
of type MOD_LOG_KEY_REMOVE_WHILE_FREEING, for slot 0, with a sequence
number of N + 1 (steps 3 to 6);
20) Then we go back to the top of the while loop and trigger the following
BUG_ON():
(...)
switch (tm->op) {
case MOD_LOG_KEY_REMOVE_WHILE_FREEING:
BUG_ON(tm->slot < n);
fallthrough;
(...)
Because "n" has a value of (u32)-1 (4294967295) and tm->slot is 0.
Fix this by taking a read lock on the extent buffer before cloning it at
ctree.c:get_old_root(). This should be done regardless of the extent
buffer having been freed and reused, as a concurrent task might be
modifying it (while holding a write lock on it).
Reported-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Link: https://lore.kernel.org/linux-btrfs/20210227155037.GN28049@hungrycats.org/
Fixes: 834328a8493079 ("Btrfs: tree mod log's old roots could still be part of the tree")
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 3cce9d44321e460e7c88cdec4e4537a6e9ad7c0d upstream.
Commit f77ac2e378be9dd6 ("ARM: 9030/1: entry: omit FP emulation for UND
exceptions taken in kernel mode") failed to take into account that there
is in fact a case where we relied on this code path: during boot, the
VFP detection code issues a read of FPSID, which will trigger an undef
exception on cores that lack VFP support.
So let's reinstate this logic using an undef hook which is registered
only for the duration of the initcall to vpf_init(), and which sets
VFP_arch to a non-zero value - as before - if no VFP support is present.
Fixes: f77ac2e378be9dd6 ("ARM: 9030/1: entry: omit FP emulation for UND ...")
Reported-by: "kernelci.org bot" <bot@kernelci.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit f77ac2e378be9dd61eb88728f0840642f045d9d1 upstream.
There are a couple of problems with the exception entry code that deals
with FP exceptions (which are reported as UND exceptions) when building
the kernel in Thumb2 mode:
- the conditional branch to vfp_kmode_exception in vfp_support_entry()
may be out of range for its target, depending on how the linker decides
to arrange the sections;
- when the UND exception is taken in kernel mode, the emulation handling
logic is entered via the 'call_fpe' label, which means we end up using
the wrong value/mask pairs to match and detect the NEON opcodes.
Since UND exceptions in kernel mode are unlikely to occur on a hot path
(as opposed to the user mode version which is invoked for VFP support
code and lazy restore), we can use the existing undef hook machinery for
any kernel mode instruction emulation that is needed, including calling
the existing vfp_kmode_exception() routine for unexpected cases. So drop
the call to call_fpe, and instead, install an undef hook that will get
called for NEON and VFP instructions that trigger an UND exception in
kernel mode.
While at it, make sure that the PC correction is accurate for the
execution mode where the exception was taken, by checking the PSR
Thumb bit.
[nd: fix conflict in arch/arm/vfp/vfphw.S due to missing
commit 2cbd1cc3dcd3 ("ARM: 8991/1: use VFP assembler mnemonics if
available")]
Fixes: eff8728fe698 ("vmlinux.lds.h: Add PGO and AutoFDO input sections")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Cc: Dmitry Osipenko <digetx@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d54cb7d54877d529bc1e0e1f47a3dd082f73add3 upstream.
Commit 152e9b8676c6e ("s390/vtime: steal time exponential moving average")
inadvertently changed the input value for account_steal_time() from
"cputime_to_nsecs(steal)" to just "steal", resulting in broken increased
steal time accounting.
Fix this by changing it back to "cputime_to_nsecs(steal)".
Fixes: 152e9b8676c6e ("s390/vtime: steal time exponential moving average")
Cc: <stable@vger.kernel.org> # 5.1
Reported-by: Sabine Forkel <sabine.forkel@de.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 0cab893f409c53634d0d818fa414641cbcdb0dab upstream.
Revert commit 44cc89f76464 ("PM: runtime: Update device status
before letting suppliers suspend") that introduced a race condition
into __rpm_callback() which allowed a concurrent rpm_resume() to
run and resume the device prematurely after its status had been
changed to RPM_SUSPENDED by __rpm_callback().
Fixes: 44cc89f76464 ("PM: runtime: Update device status before letting suppliers suspend")
Link: https://lore.kernel.org/linux-pm/24dfb6fc-5d54-6ee2-9195-26428b7ecf8a@intel.com/
Reported-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: 4.10+ <stable@vger.kernel.org> # 4.10+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e1c86210fe27428399643861b81b080eccd79f87 upstream.
There is another fix for headset-mic problem on Redmibook (1d72:1602),
it also works on Redmibook Air (1d72:1947), which has the same issue.
Signed-off-by: Xiaoliang Yu <yxl_22@outlook.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/TYBP286MB02856DC016849DEA0F9B6A37EE6F9@TYBP286MB0285.JPNP286.PROD.OUTLOOK.COM
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 2bf44e0ee95f39cc54ea1b942f0a027e0181ca4e upstream.
Recently we found the micmute led init state is not correct after
freshly installing the ubuntu linux on a Lenovo AIO machine. The
internal mic is not muted, but the micmute led is on and led mode is
'follow mute'. If we mute internal mic, the led is keeping on, then
unmute the internal mic, the led is off. And from then on, the
micmute led will work correctly.
So the micmute led init state is not correct. The led is controlled
by codec gpio (ALC233_FIXUP_LENOVO_LINE2_MIC_HOTKEY), in the
patch_realtek, the gpio data is set to 0x4 initially and the led is
on with this data. In the hda_generic, the led_value is set to
0 initially, suppose users set the 'capture switch' to on from
user space and the micmute led should change to be off with this
operation, but the check "if (val == spec->micmute_led.led_value)" in
the call_micmute_led_update() will skip the led setting.
To guarantee the led state will be set by the 1st time of changing
"Capture Switch", set -1 to the init led_value.
Cc: <stable@vger.kernel.org>
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Link: https://lore.kernel.org/r/20210312041408.3776-1-hui.wang@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b95bc12e0412d14d5fc764f0b82631c7bcaf1959 upstream.
Built-in microphone and combojack on Xiaomi Notebook Pro (1d72:1701) needs
to be fixed, the existing quirk for Dell works well on that machine.
Signed-off-by: Xiaoliang Yu <yxl_22@outlook.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/OS0P286MB02749B9E13920E6899902CD8EE6C9@OS0P286MB0274.JPNP286.PROD.OUTLOOK.COM
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit dd7b836d6bc935df95c826f69ff4d051f5561604 upstream.
When node is removed from IEEE 1394 bus, any transaction fails to the node.
In the case, ALSA dice driver doesn't stop isochronous contexts even if
they are running. As a result, null pointer dereference occurs in callback
from the running context.
This commit fixes the bug to release isochronous contexts always.
Cc: <stable@vger.kernel.org> # v5.4 or later
Fixes: e9f21129b8d8 ("ALSA: dice: support AMDTP domain")
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Link: https://lore.kernel.org/r/20210312093407.23437-1-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 80cffd2468ddb850e678f17841fc356930b2304a upstream.
Add missed MODULE_DEVICE_TABLE for the driver can be loaded
automatically at boot.
Fixes: 920884777480 ("ASoC: ak5558: Add support for AK5558 ADC driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://lore.kernel.org/r/1614149872-25510-2-git-send-email-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 4ec5b96775a88dd9b1c3ba1d23c43c478cab95a2 upstream.
Add missed MODULE_DEVICE_TABLE for the driver can be loaded
automatically at boot.
Fixes: 08660086eff9 ("ASoC: ak4458: Add support for AK4458 DAC driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://lore.kernel.org/r/1614149872-25510-1-git-send-email-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Jason Self <jason@bluehome.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Link: https://lore.kernel.org/r/20210319121745.449875976@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit f9b3827ee66cfcf297d0acd6ecf33653a5f297ef upstream.
Add support for being able to set the learning attribute on port, and
make sure that the standalone ports start up with learning disabled.
We can remove the code in bcm_sf2 that configured the ports learning
attribute because we want the standalone ports to have learning disabled
by default and port 7 cannot be bridged, so its learning attribute will
not change past its initial configuration.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 9200f515c41f4cbaeffd8fdd1d8b6373a18b1b67 upstream.
A different TPID bit is used for 802.1ad VLAN frames.
Reported-by: Ilario Gelmetti <iochesonome@gmail.com>
Fixes: f0af34317f4b ("net: dsa: mediatek: combine MediaTek tag with VLAN tag")
Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 86ad60a65f29dd862a11c22bb4b5be28d6c5cef1 upstream.
The XTS asm helper arrangement is a bit odd: the 8-way stride helper
consists of back-to-back calls to the 4-way core transforms, which
are called indirectly, based on a boolean that indicates whether we
are performing encryption or decryption.
Given how costly indirect calls are on x86, let's switch to direct
calls, and given how the 8-way stride doesn't really add anything
substantial, use a 4-way stride instead, and make the asm core
routine deal with any multiple of 4 blocks. Since 512 byte sectors
or 4 KB blocks are the typical quantities XTS operates on, increase
the stride exported to the glue helper to 512 bytes as well.
As a result, the number of indirect calls is reduced from 3 per 64 bytes
of in/output to 1 per 512 bytes of in/output, which produces a 65% speedup
when operating on 1 KB blocks (measured on a Intel(R) Core(TM) i7-8650U CPU)
Fixes: 9697fa39efd3f ("x86/retpoline/crypto: Convert crypto assembler indirect jumps")
Tested-by: Eric Biggers <ebiggers@google.com> # x86_64
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
[ardb: rebase onto stable/linux-5.4.y]
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 032d049ea0f45b45c21f3f02b542aa18bc6b6428 upstream.
CMP $0,%reg can't set overflow flag, so we can use shorter TEST %reg,%reg
instruction when only zero and sign flags are checked (E,L,LE,G,GE conditions).
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Ard Biesheuvel <ardb@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 9c1e8836edbbaf3656bc07437b59c04be034ac4e upstream.
The crypto glue performed function prototype casting via macros to make
indirect calls to assembly routines. Instead of performing casts at the
call sites (which trips Control Flow Integrity prototype checking), switch
each prototype to a common standard set of arguments which allows the
removal of the existing macros. In order to keep pointer math unchanged,
internal casting between u128 pointers and u8 pointers is added.
Co-developed-by: João Moreira <joao.moreira@intel.com>
Signed-off-by: João Moreira <joao.moreira@intel.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Ard Biesheuvel <ardb@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 775c5033a0d164622d9d10dd0f0a5531639ed3ed upstream.
Commit 5d069dbe8aaf ("fuse: fix bad inode") replaced make_bad_inode()
in fuse_iget() with a private implementation fuse_make_bad().
The private implementation fails to remove the bad inode from inode
cache, so the retry loop with iget5_locked() finds the same bad inode
and marks it bad forever.
kmsg snip:
[ ] rcu: INFO: rcu_sched self-detected stall on CPU
...
[ ] ? bit_wait_io+0x50/0x50
[ ] ? fuse_init_file_inode+0x70/0x70
[ ] ? find_inode.isra.32+0x60/0xb0
[ ] ? fuse_init_file_inode+0x70/0x70
[ ] ilookup5_nowait+0x65/0x90
[ ] ? fuse_init_file_inode+0x70/0x70
[ ] ilookup5.part.36+0x2e/0x80
[ ] ? fuse_init_file_inode+0x70/0x70
[ ] ? fuse_inode_eq+0x20/0x20
[ ] iget5_locked+0x21/0x80
[ ] ? fuse_inode_eq+0x20/0x20
[ ] fuse_iget+0x96/0x1b0
Fixes: 5d069dbe8aaf ("fuse: fix bad inode")
Cc: stable@vger.kernel.org # 5.10+
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 4ceb06e7c336f4a8d3f3b6ac9a4fea2e9c97dc07 upstream
BXT/APL has different isr/irr/hpd regs compared with other GEN9. If not
setting these regs bits correctly according to the emulated monitor
(currently a DP on PORT_B), although gvt still triggers a virtual HPD
event, the guest driver won't detect a valid HPD pulse thus no full
display detection will be executed to read the updated EDID.
With this patch, the vfio_edid is enabled again on BXT/APL, which is
previously disabled.
Fixes: 642403e3599e ("drm/i915/gvt: Temporarily disable vfio_edid for BXT/APL")
Signed-off-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20201201060329.142375-1-colin.xu@intel.com
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
(cherry picked from commit 4ceb06e7c336f4a8d3f3b6ac9a4fea2e9c97dc07)
Signed-off-by: Colin Xu <colin.xu@intel.com>
Cc: <stable@vger.kernel.org> # 5.4.y
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
From: Zhenyu Wang <zhenyuw@linux.intel.com>
commit 28284943ac94014767ecc2f7b3c5747c4a5617a0 upstream
Current BDW virtual display port is initialized as PORT_B, so need
to use same port for VFIO EDID region, otherwise invalid EDID blob
pointer is assigned which caused kernel null pointer reference. We
might evaluate actual display hotplug for BDW to make this function
work as expected, anyway this is always required to be fixed first.
Reported-by: Alejandro Sior <aho@sior.be>
Cc: Alejandro Sior <aho@sior.be>
Fixes: 0178f4ce3c3b ("drm/i915/gvt: Enable vfio edid for all GVT supported platform")
Reviewed-by: Hang Yuan <hang.yuan@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20200914030302.2775505-1-zhenyuw@linux.intel.com
(cherry picked from commit 28284943ac94014767ecc2f7b3c5747c4a5617a0)
Signed-off-by: Colin Xu <colin.xu@intel.com>
Cc: <stable@vger.kernel.org> # 5.4.y
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit a5a8ef937cfa79167f4b2a5602092b8d14fd6b9a upstream
Program display related vregs to proper value at initialization, setup
virtual monitor and hotplug.
vGPU virtual display vregs inherit the value from pregs. The virtual DP
monitor is always setup on PORT_B for BXT/APL. However the host may
connect monitor on other PORT or without any monitor connected. Without
properly setup PIPE/DDI/PLL related vregs, guest driver may not setup
the virutal display as expected, and the guest desktop may not be
created.
Since only one virtual display is supported, enable PIPE_A only. And
enable transcoder/DDI/PLL based on which port is setup for BXT/APL.
V2:
Revise commit message.
V3:
set_edid should on PORT_B for BXT.
Inject hpd event for BXT.
V4:
Temporarily disable vfio edid on BXT/APL until issue fixed.
V5:
Rebase to use new HPD define GEN8_DE_PORT_HOTPLUG for BXT.
Put vfio edid disabling on BXT/APL to a separate patch.
Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20201109073922.757759-1-colin.xu@intel.com
(cherry picked from commit a5a8ef937cfa79167f4b2a5602092b8d14fd6b9a)
Signed-off-by: Colin Xu <colin.xu@intel.com>
Cc: <stable@vger.kernel.org> # 5.4.y
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 92010a97098c4c9fd777408cc98064d26b32695b upstream
- Remove dup mmio handler for BXT/APL. Otherwise mmio handler will fail
to init.
- Add engine GPR with F_CMD_ACCESS since BXT/APL will load them via
LRI. Otherwise, guest will enter failsafe mode.
V2:
Use RCS/BCS GPR macros instead of offset.
Revise commit message.
V3:
Use GEN8_RING_CS_GPR macros on ring base.
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20201016052913.209248-1-colin.xu@intel.com
(cherry picked from commit 92010a97098c4c9fd777408cc98064d26b32695b)
Signed-off-by: Colin Xu <colin.xu@intel.com>
Cc: <stable@vger.kernel.org> # 5.4.y
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 8fe105679765700378eb328495fcfe1566cdbbd0 upstream
If guest fills non-priv bb on ApolloLake/Broxton as Mesa i965 does in:
717e7539124d (i965: Use a WC map and memcpy for the batch instead of pw-)
Due to the missing flush of bb filled by VM vCPU, host GPU hangs on
executing these MI_BATCH_BUFFER.
Temporarily workaround this by setting SNOOP bit for PAT3 used by PPGTT
PML4 PTE: PAT(0) PCD(1) PWT(1).
The performance is still expected to be low, will need further improvement.
Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20201012045231.226748-1-colin.xu@intel.com
(cherry picked from commit 8fe105679765700378eb328495fcfe1566cdbbd0)
Signed-off-by: Colin Xu <colin.xu@intel.com>
Cc: <stable@vger.kernel.org> # 5.4.y
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b12de52896c0e8213f70e3a168fde9e6eee95909 upstream.
[BUG]
When running btrfs/072 with only one online CPU, it has a pretty high
chance to fail:
# btrfs/072 12s ... _check_dmesg: something found in dmesg (see xfstests-dev/results//btrfs/072.dmesg)
# - output mismatch (see xfstests-dev/results//btrfs/072.out.bad)
# --- tests/btrfs/072.out 2019-10-22 15:18:14.008965340 +0800
# +++ /xfstests-dev/results//btrfs/072.out.bad 2019-11-14 15:56:45.877152240 +0800
# @@ -1,2 +1,3 @@
# QA output created by 072
# Silence is golden
# +Scrub find errors in "-m dup -d single" test
# ...
And with the following call trace:
BTRFS info (device dm-5): scrub: started on devid 1
------------[ cut here ]------------
BTRFS: Transaction aborted (error -27)
WARNING: CPU: 0 PID: 55087 at fs/btrfs/block-group.c:1890 btrfs_create_pending_block_groups+0x3e6/0x470 [btrfs]
CPU: 0 PID: 55087 Comm: btrfs Tainted: G W O 5.4.0-rc1-custom+ #13
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:btrfs_create_pending_block_groups+0x3e6/0x470 [btrfs]
Call Trace:
__btrfs_end_transaction+0xdb/0x310 [btrfs]
btrfs_end_transaction+0x10/0x20 [btrfs]
btrfs_inc_block_group_ro+0x1c9/0x210 [btrfs]
scrub_enumerate_chunks+0x264/0x940 [btrfs]
btrfs_scrub_dev+0x45c/0x8f0 [btrfs]
btrfs_ioctl+0x31a1/0x3fb0 [btrfs]
do_vfs_ioctl+0x636/0xaa0
ksys_ioctl+0x67/0x90
__x64_sys_ioctl+0x43/0x50
do_syscall_64+0x79/0xe0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
---[ end trace 166c865cec7688e7 ]---
[CAUSE]
The error number -27 is -EFBIG, returned from the following call chain:
btrfs_end_transaction()
|- __btrfs_end_transaction()
|- btrfs_create_pending_block_groups()
|- btrfs_finish_chunk_alloc()
|- btrfs_add_system_chunk()
This happens because we have used up all space of
btrfs_super_block::sys_chunk_array.
The root cause is, we have the following bad loop of creating tons of
system chunks:
1. The only SYSTEM chunk is being scrubbed
It's very common to have only one SYSTEM chunk.
2. New SYSTEM bg will be allocated
As btrfs_inc_block_group_ro() will check if we have enough space
after marking current bg RO. If not, then allocate a new chunk.
3. New SYSTEM bg is still empty, will be reclaimed
During the reclaim, we will mark it RO again.
4. That newly allocated empty SYSTEM bg get scrubbed
We go back to step 2, as the bg is already mark RO but still not
cleaned up yet.
If the cleaner kthread doesn't get executed fast enough (e.g. only one
CPU), then we will get more and more empty SYSTEM chunks, using up all
the space of btrfs_super_block::sys_chunk_array.
[FIX]
Since scrub/dev-replace doesn't always need to allocate new extent,
especially chunk tree extent, so we don't really need to do chunk
pre-allocation.
To break above spiral, here we introduce a new parameter to
btrfs_inc_block_group(), @do_chunk_alloc, which indicates whether we
need extra chunk pre-allocation.
For relocation, we pass @do_chunk_alloc=true, while for scrub, we pass
@do_chunk_alloc=false.
This should keep unnecessary empty chunks from popping up for scrub.
Also, since there are two parameters for btrfs_inc_block_group_ro(),
add more comment for it.
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 0a13e3537ea67452d549a6a80da3776d6b7dedb3 upstream.
Fix up test_verifier error messages for the case where the original error
message changed, or for the case where pointer alu errors differ between
privileged and unprivileged tests. Also, add alternative tests for keeping
coverage of the original verifier rejection error message (fp alu), and
newly reject map_ptr += rX where rX == 0 given we now forbid alu on these
types for unprivileged. All test_verifier cases pass after the change. The
test case fixups were kept separate to ease backporting of core changes.
Signed-off-by: Piotr Krysiuk <piotras@gmail.com>
Co-developed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 1b1597e64e1a610c7a96710fc4717158e98a08b3 upstream.
Given we know the max possible value of ptr_limit at the time of retrieving
the latter, add basic assertions, so that the verifier can bail out if
anything looks odd and reject the program. Nothing triggered this so far,
but it also does not hurt to have these.
Signed-off-by: Piotr Krysiuk <piotras@gmail.com>
Co-developed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b5871dca250cd391885218b99cc015aca1a51aea upstream.
Instead of having the mov32 with aux->alu_limit - 1 immediate, move this
operation to retrieve_ptr_limit() instead to simplify the logic and to
allow for subsequent sanity boundary checks inside retrieve_ptr_limit().
This avoids in future that at the time of the verifier masking rewrite
we'd run into an underflow which would not sign extend due to the nature
of mov32 instruction.
Signed-off-by: Piotr Krysiuk <piotras@gmail.com>
Co-developed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 10d2bb2e6b1d8c4576c56a748f697dbeb8388899 upstream.
retrieve_ptr_limit() computes the ptr_limit for registers with stack and
map_value type. ptr_limit is the size of the memory area that is still
valid / in-bounds from the point of the current position and direction
of the operation (add / sub). This size will later be used for masking
the operation such that attempting out-of-bounds access in the speculative
domain is redirected to remain within the bounds of the current map value.
When masking to the right the size is correct, however, when masking to
the left, the size is off-by-one which would lead to an incorrect mask
and thus incorrect arithmetic operation in the non-speculative domain.
Piotr found that if the resulting alu_limit value is zero, then the
BPF_MOV32_IMM() from the fixup_bpf_calls() rewrite will end up loading
0xffffffff into AX instead of sign-extending to the full 64 bit range,
and as a result, this allows abuse for executing speculatively out-of-
bounds loads against 4GB window of address space and thus extracting the
contents of kernel memory via side-channel.
Fixes: 979d63d50c0c ("bpf: prevent out of bounds speculation on pointer arithmetic")
Signed-off-by: Piotr Krysiuk <piotras@gmail.com>
Co-developed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit f232326f6966cf2a1d1db7bc917a4ce5f9f55f76 upstream.
The purpose of this patch is to streamline error propagation and in particular
to propagate retrieve_ptr_limit() errors for pointer types that are not defining
a ptr_limit such that register-based alu ops against these types can be rejected.
The main rationale is that a gap has been identified by Piotr in the existing
protection against speculatively out-of-bounds loads, for example, in case of
ctx pointers, unprivileged programs can still perform pointer arithmetic. This
can be abused to execute speculatively out-of-bounds loads without restrictions
and thus extract contents of kernel memory.
Fix this by rejecting unprivileged programs that attempt any pointer arithmetic
on unprotected pointer types. The two affected ones are pointer to ctx as well
as pointer to map. Field access to a modified ctx' pointer is rejected at a
later point in time in the verifier, and 7c6967326267 ("bpf: Permit map_ptr
arithmetic with opcode add and offset 0") only relevant for root-only use cases.
Risk of unprivileged program breakage is considered very low.
Fixes: 7c6967326267 ("bpf: Permit map_ptr arithmetic with opcode add and offset 0")
Fixes: b2157399cc98 ("bpf: prevent out-of-bounds speculation")
Signed-off-by: Piotr Krysiuk <piotras@gmail.com>
Co-developed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b96b0c5de685df82019e16826a282d53d86d112c upstream
The nVHE KVM hyp drains and disables the SPE buffer, before
entering the guest, as the EL1&0 translation regime
is going to be loaded with that of the guest.
But this operation is performed way too late, because :
- The owning translation regime of the SPE buffer
is transferred to EL2. (MDCR_EL2_E2PB == 0)
- The guest Stage1 is loaded.
Thus the flush could use the host EL1 virtual address,
but use the EL2 translations instead of host EL1, for writing
out any cached data.
Fix this by moving the SPE buffer handling early enough.
The restore path is doing the right thing.
Cc: stable@vger.kernel.org # v5.4-
Cc: Christoffer Dall <christoffer.dall@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Jason Self <jason@bluehome.net>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Hulk Robot <hulkrobot@huawei.com>
Tested-by: Ross Schmidt <ross.schm.dev@gmail.com>
Link: https://lore.kernel.org/r/20210315135550.333963635@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b6622798bc50b625a1e62f82c7190df40c1f5b21 upstream.
When changing the cpu affinity of an event it can happen today that
(with some unlucky timing) the same event will be handled on the old
and the new cpu at the same time.
Avoid that by adding an "event active" flag to the per-event data and
call the handler only if this flag isn't set.
Cc: stable@vger.kernel.org
Reported-by: Julien Grall <julien@xen.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Link: https://lore.kernel.org/r/20210306161833.4552-4-jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 25da4618af240fbec6112401498301a6f2bc9702 upstream.
An event channel should be kept masked when an eoi is pending for it.
When being migrated to another cpu it might be unmasked, though.
In order to avoid this keep three different flags for each event channel
to be able to distinguish "normal" masking/unmasking from eoi related
masking/unmasking and temporary masking. The event channel should only
be able to generate an interrupt if all flags are cleared.
Cc: stable@vger.kernel.org
Fixes: 54c9de89895e ("xen/events: add a new "late EOI" evtchn framework")
Reported-by: Julien Grall <julien@xen.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Tested-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Link: https://lore.kernel.org/r/20210306161833.4552-3-jgross@suse.com
[boris -- corrected Fixed tag format]
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 9e77d96b8e2724ed00380189f7b0ded61113b39f upstream.
When creating a new event channel with 2-level events the affinity
needs to be reset initially in order to avoid using an old affinity
from earlier usage of the event channel port. So when tearing an event
channel down reset all affinity bits.
The same applies to the affinity when onlining a vcpu: all old
affinity settings for this vcpu must be reset. As percpu events get
initialized before the percpu event channel hook is called,
resetting of the affinities happens after offlining a vcpu (this is
working, as initial percpu memory is zeroed out).
Cc: stable@vger.kernel.org
Reported-by: Julien Grall <julien@xen.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Link: https://lore.kernel.org/r/20210306161833.4552-2-jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Commit 7d717558dd5ef10d28866750d5c24ff892ea3778 upstream.
KVM/arm64 has forever used a 40bit default IPA space, partially
due to its 32bit heritage (where the only choice is 40bit).
However, there are implementations in the wild that have a *cough*
much smaller *cough* IPA space, which leads to a misprogramming of
VTCR_EL2, and a guest that is stuck on its first memory access
if userspace dares to ask for the default IPA setting (which most
VMMs do).
Instead, blundly reject the creation of such VM, as we can't
satisfy the requirements from userspace (with a one-off warning).
Also clarify the boot warning, and document that the VM creation
will fail when an unsupported IPA size is provided.
Although this is an ABI change, it doesn't really change much
for userspace:
- the guest couldn't run before this change, but no error was
returned. At least userspace knows what is happening.
- a memory slot that was accepted because it did fit the default
IPA space now doesn't even get a chance to be registered.
The other thing that is left doing is to convince userspace to
actually use the IPA space setting instead of relying on the
antiquated default.
Fixes: 233a7cb23531 ("kvm: arm64: Allow tuning the physical address size for VM")
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20210311100016.3830038-2-maz@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Commit 01dc9262ff5797b675c32c0c6bc682777d23de05 upstream.
It recently became apparent that the ARMv8 architecture has interesting
rules regarding attributes being used when fetching instructions
if the MMU is off at Stage-1.
In this situation, the CPU is allowed to fetch from the PoC and
allocate into the I-cache (unless the memory is mapped with
the XN attribute at Stage-2).
If we transpose this to vcpus sharing a single physical CPU,
it is possible for a vcpu running with its MMU off to influence
another vcpu running with its MMU on, as the latter is expected to
fetch from the PoU (and self-patching code doesn't flush below that
level).
In order to solve this, reuse the vcpu-private TLB invalidation
code to apply the same policy to the I-cache, nuking it every time
the vcpu runs on a physical CPU that ran another vcpu of the same
VM in the past.
This involve renaming __kvm_tlb_flush_local_vmid() to
__kvm_flush_cpu_context(), and inserting a local i-cache invalidation
there.
Cc: stable@vger.kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210303164505.68492-1-maz@kernel.org
[maz: added 32bit ARM support]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit ac262508daa88fb12c5dc53cf30bde163f9f26c9 upstream.
If a namespace identification does not match the subsystem's head for
that NSID, release the reference that was taken when the matching head
was initially found.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d567572906d986dedb78b37f111c44eba033f3ef upstream.
The driver had been unlinking the namespace head from the subsystem's
list only after the last reference was released, and outside of the
list's subsys->lock protection.
There is no reason to track an empty head, so unlink the entry from the
subsystem's list when the last namespace using that head is removed and
with the mutex lock protecting the list update. The next namespace to
attach reusing the previous NSID will allocate a new head rather than
find the old head with mismatched identifiers.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 262b003d059c6671601a19057e9fe1a5e7f23722 upstream.
When registering a memslot, we check the size and location of that
memslot against the IPA size to ensure that we can provide guest
access to the whole of the memory.
Unfortunately, this check rejects memslot that end-up at the exact
limit of the addressing capability for a given IPA size. For example,
it refuses the creation of a 2GB memslot at 0x8000000 with a 32bit
IPA space.
Fix it by relaxing the check to accept a memslot reaching the
limit of the IPA space.
Fixes: c3058d5da222 ("arm/arm64: KVM: Ensure memslots are within KVM_PHYS_SIZE")
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Reviewed-by: Andrew Jones <drjones@redhat.com>
Link: https://lore.kernel.org/r/20210311100016.3830038-3-maz@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e504e74cc3a2c092b05577ce3e8e013fae7d94e6 upstream.
KASAN reserves "redzone" areas between stack frames in order to detect
stack overruns. A read or write to such an area triggers a KASAN
"stack-out-of-bounds" BUG.
Normally, the ORC unwinder stays in-bounds and doesn't access the
redzone. But sometimes it can't find ORC metadata for a given
instruction. This can happen for code which is missing ORC metadata, or
for generated code. In such cases, the unwinder attempts to fall back
to frame pointers, as a best-effort type thing.
This fallback often works, but when it doesn't, the unwinder can get
confused and go off into the weeds into the KASAN redzone, triggering
the aforementioned KASAN BUG.
But in this case, the unwinder's confusion is actually harmless and
working as designed. It already has checks in place to prevent
off-stack accesses, but those checks get short-circuited by the KASAN
BUG. And a BUG is a lot more disruptive than a harmless unwinder
warning.
Disable the KASAN checks by using READ_ONCE_NOCHECK() for all stack
accesses. This finishes the job started by commit 881125bfe65b
("x86/unwind: Disable KASAN checking in the ORC unwinder"), which only
partially fixed the issue.
Fixes: ee9f8fce9964 ("x86/unwind: Add the ORC unwinder")
Reported-by: Ivan Babrou <ivan@cloudflare.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Tested-by: Ivan Babrou <ivan@cloudflare.com>
Cc: stable@kernel.org
Link: https://lkml.kernel.org/r/9583327904ebbbeda399eca9c56d6c7085ac20fe.1612534649.git.jpoimboe@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e7850f4d844e0acfac7e570af611d89deade3146 upstream.
There is a deadlock in bm_register_write:
First, in the begining of the function, a lock is taken on the binfmt_misc
root inode with inode_lock(d_inode(root)).
Then, if the user used the MISC_FMT_OPEN_FILE flag, the function will call
open_exec on the user-provided interpreter.
open_exec will call a path lookup, and if the path lookup process includes
the root of binfmt_misc, it will try to take a shared lock on its inode
again, but it is already locked, and the code will get stuck in a deadlock
To reproduce the bug:
$ echo ":iiiii:E::ii::/proc/sys/fs/binfmt_misc/bla:F" > /proc/sys/fs/binfmt_misc/register
backtrace of where the lock occurs (#5):
0 schedule () at ./arch/x86/include/asm/current.h:15
1 0xffffffff81b51237 in rwsem_down_read_slowpath (sem=0xffff888003b202e0, count=<optimized out>, state=state@entry=2) at kernel/locking/rwsem.c:992
2 0xffffffff81b5150a in __down_read_common (state=2, sem=<optimized out>) at kernel/locking/rwsem.c:1213
3 __down_read (sem=<optimized out>) at kernel/locking/rwsem.c:1222
4 down_read (sem=<optimized out>) at kernel/locking/rwsem.c:1355
5 0xffffffff811ee22a in inode_lock_shared (inode=<optimized out>) at ./include/linux/fs.h:783
6 open_last_lookups (op=0xffffc9000022fe34, file=0xffff888004098600, nd=0xffffc9000022fd10) at fs/namei.c:3177
7 path_openat (nd=nd@entry=0xffffc9000022fd10, op=op@entry=0xffffc9000022fe34, flags=flags@entry=65) at fs/namei.c:3366
8 0xffffffff811efe1c in do_filp_open (dfd=<optimized out>, pathname=pathname@entry=0xffff8880031b9000, op=op@entry=0xffffc9000022fe34) at fs/namei.c:3396
9 0xffffffff811e493f in do_open_execat (fd=fd@entry=-100, name=name@entry=0xffff8880031b9000, flags=<optimized out>, flags@entry=0) at fs/exec.c:913
10 0xffffffff811e4a92 in open_exec (name=<optimized out>) at fs/exec.c:948
11 0xffffffff8124aa84 in bm_register_write (file=<optimized out>, buffer=<optimized out>, count=19, ppos=<optimized out>) at fs/binfmt_misc.c:682
12 0xffffffff811decd2 in vfs_write (file=file@entry=0xffff888004098500, buf=buf@entry=0xa758d0 ":iiiii:E::ii::i:CF
", count=count@entry=19, pos=pos@entry=0xffffc9000022ff10) at fs/read_write.c:603
13 0xffffffff811defda in ksys_write (fd=<optimized out>, buf=0xa758d0 ":iiiii:E::ii::i:CF
", count=19) at fs/read_write.c:658
14 0xffffffff81b49813 in do_syscall_64 (nr=<optimized out>, regs=0xffffc9000022ff58) at arch/x86/entry/common.c:46
15 0xffffffff81c0007c in entry_SYSCALL_64 () at arch/x86/entry/entry_64.S:120
To solve the issue, the open_exec call is moved to before the write
lock is taken by bm_register_write
Link: https://lkml.kernel.org/r/20210228224414.95962-1-liorribak@gmail.com
Fixes: 948b701a607f1 ("binfmt_misc: add persistent opened binary handler for containers")
Signed-off-by: Lior Ribak <liorribak@gmail.com>
Acked-by: Helge Deller <deller@gmx.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit cea15316ceee2d4a51dfdecd79e08a438135416c upstream.
'lis r2,N' is 'addis r2,0,N' and the instruction encoding in the macro
LIS_R2 is incorrect (it currently maps to 'addis r0,r2,N'). Fix the
same.
Fixes: c71b7eff426f ("powerpc: Add ABIv2 support to ppc_function_entry")
Cc: stable@vger.kernel.org # v3.16+
Reported-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210304020411.16796-1-naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit ce29ddc47b91f97e7f69a0fb7cbb5845f52a9825 upstream.
The function sync_runqueues_membarrier_state() should copy the
membarrier state from the @mm received as parameter to each runqueue
currently running tasks using that mm.
However, the use of smp_call_function_many() skips the current runqueue,
which is unintended. Replace by a call to on_each_cpu_mask().
Fixes: 227a4aadc75b ("sched/membarrier: Fix p->mm->membarrier_state racy load")
Reported-by: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org # 5.4.x+
Link: https://lore.kernel.org/r/74F1E842-4A84-47BF-B6C2-5407DFDD4A4A@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 57e0076e6575a7b7cef620a0bd2ee2549ef77818 upstream.
writeback_store's return value is overwritten by submit_bio_wait's return
value. Thus, writeback_store will return zero since there was no IO
error. In the end, write syscall from userspace will see the zero as
return value, which could make the process stall to keep trying the write
until it will succeed.
Link: https://lkml.kernel.org/r/20210312173949.2197662-1-minchan@kernel.org
Fixes: 3b82a051c101("drivers/block/zram/zram_drv.c: fix error return codes not being returned in writeback_store")
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: John Dias <joaodias@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 149fc787353f65b7e72e05e7b75d34863266c3e2 ]
Fix a sparse warning by using rcu_dereference(). Technically this is a
bug and a sufficiently aggressive compiler could reload the `real_parent'
pointer outside the protection of the rcu lock (and access freed memory),
but I think it's pretty unlikely to happen.
Link: https://lkml.kernel.org/r/20210221194207.1351703-1-willy@infradead.org
Fixes: b18dc5f291c0 ("mm, oom: skip vforked tasks from being selected")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit cbf78d85079cee662c45749ef4f744d41be85d48 ]
With clang-13, some functions only get partially inlined, with a
specialized version referring to a global variable. This triggers a
harmless build-time check for the intel-rng driver:
WARNING: modpost: drivers/char/hw_random/intel-rng.o(.text+0xe): Section mismatch in reference from the function stop_machine() to the function .init.text:intel_rng_hw_init()
The function stop_machine() references
the function __init intel_rng_hw_init().
This is often because stop_machine lacks a __init
annotation or the annotation of intel_rng_hw_init is wrong.
In this instance, an easy workaround is to force the stop_machine()
function to be inline, along with related interfaces that did not show the
same behavior at the moment, but theoretically could.
The combination of the two patches listed below triggers the behavior in
clang-13, but individually these commits are correct.
Link: https://lkml.kernel.org/r/20210225130153.1956990-1-arnd@kernel.org
Fixes: fe5595c07400 ("stop_machine: Provide stop_machine_cpuslocked()")
Fixes: ee527cd3a20c ("Use stop_machine_run in the Intel RNG driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 46eb1701c046cc18c032fa68f3c8ccbf24483ee4 ]
hrtimer_force_reprogram() and hrtimer_interrupt() invokes
__hrtimer_get_next_event() to find the earliest expiry time of hrtimer
bases. __hrtimer_get_next_event() does not update
cpu_base::[softirq_]_expires_next to preserve reprogramming logic. That
needs to be done at the callsites.
hrtimer_force_reprogram() updates cpu_base::softirq_expires_next only when
the first expiring timer is a softirq timer and the soft interrupt is not
activated. That's wrong because cpu_base::softirq_expires_next is left
stale when the first expiring timer of all bases is a timer which expires
in hard interrupt context. hrtimer_interrupt() does never update
cpu_base::softirq_expires_next which is wrong too.
That becomes a problem when clock_settime() sets CLOCK_REALTIME forward and
the first soft expiring timer is in the CLOCK_REALTIME_SOFT base. Setting
CLOCK_REALTIME forward moves the clock MONOTONIC based expiry time of that
timer before the stale cpu_base::softirq_expires_next.
cpu_base::softirq_expires_next is cached to make the check for raising the
soft interrupt fast. In the above case the soft interrupt won't be raised
until clock monotonic reaches the stale cpu_base::softirq_expires_next
value. That's incorrect, but what's worse it that if the softirq timer
becomes the first expiring timer of all clock bases after the hard expiry
timer has been handled the reprogramming of the clockevent from
hrtimer_interrupt() will result in an interrupt storm. That happens because
the reprogramming does not use cpu_base::softirq_expires_next, it uses
__hrtimer_get_next_event() which returns the actual expiry time. Once clock
MONOTONIC reaches cpu_base::softirq_expires_next the soft interrupt is
raised and the storm subsides.
Change the logic in hrtimer_force_reprogram() to evaluate the soft and hard
bases seperately, update softirq_expires_next and handle the case when a
soft expiring timer is the first of all bases by comparing the expiry times
and updating the required cpu base fields. Split this functionality into a
separate function to be able to use it in hrtimer_interrupt() as well
without copy paste.
Fixes: 5da70160462e ("hrtimer: Implement support for softirq based hrtimers")
Reported-by: Mikael Beckius <mikael.beckius@windriver.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Mikael Beckius <mikael.beckius@windriver.com>
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210223160240.27518-1-anna-maria@linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 7ba8f2b2d652cd8d8a2ab61f4be66973e70f9f88 ]
52-bit VA kernels can run on hardware that is only 48-bit capable, but
configure the ID map as 52-bit by default. This was not a problem until
recently, because the special T0SZ value for a 52-bit VA space was never
programmed into the TCR register anwyay, and because a 52-bit ID map
happens to use the same number of translation levels as a 48-bit one.
This behavior was changed by commit 1401bef703a4 ("arm64: mm: Always update
TCR_EL1 from __cpu_set_tcr_t0sz()"), which causes the unsupported T0SZ
value for a 52-bit VA to be programmed into TCR_EL1. While some hardware
simply ignores this, Mark reports that Amberwing systems choke on this,
resulting in a broken boot. But even before that commit, the unsupported
idmap_t0sz value was exposed to KVM and used to program TCR_EL2 incorrectly
as well.
Given that we already have to deal with address spaces being either 48-bit
or 52-bit in size, the cleanest approach seems to be to simply default to
a 48-bit VA ID map, and only switch to a 52-bit one if the placement of the
kernel in DRAM requires it. This is guaranteed not to happen unless the
system is actually 52-bit VA capable.
Fixes: 90ec95cda91a ("arm64: mm: Introduce VA_BITS_MIN")
Reported-by: Mark Salter <msalter@redhat.com>
Link: http://lore.kernel.org/r/20210310003216.410037-1-msalter@redhat.com
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20210310171515.416643-2-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 14fbbc8297728e880070f7b077b3301a8c698ef9 ]
Commit b0841eefd969 ("configfs: provide exclusion between IO and removals")
uses ->frag_dead to mark the fragment state, thus no bothering with extra
refcount on config_item when opening a file. The configfs_get_config_item
was removed in __configfs_open_file, but not with config_item_put. So the
refcount on config_item will lost its balance, causing use-after-free
issues in some occasions like this:
Test:
1. Mount configfs on /config with read-only items:
drwxrwx--- 289 root root 0 2021-04-01 11:55 /config
drwxr-xr-x 2 root root 0 2021-04-01 11:54 /config/a
--w--w--w- 1 root root 4096 2021-04-01 11:53 /config/a/1.txt
......
2. Then run:
for file in /config
do
echo $file
grep -R 'key' $file
done
3. __configfs_open_file will be called in parallel, the first one
got called will do:
if (file->f_mode & FMODE_READ) {
if (!(inode->i_mode & S_IRUGO))
goto out_put_module;
config_item_put(buffer->item);
kref_put()
package_details_release()
kfree()
the other one will run into use-after-free issues like this:
BUG: KASAN: use-after-free in __configfs_open_file+0x1bc/0x3b0
Read of size 8 at addr fffffff155f02480 by task grep/13096
CPU: 0 PID: 13096 Comm: grep VIP: 00 Tainted: G W 4.14.116-kasan #1
TGID: 13096 Comm: grep
Call trace:
dump_stack+0x118/0x160
kasan_report+0x22c/0x294
__asan_load8+0x80/0x88
__configfs_open_file+0x1bc/0x3b0
configfs_open_file+0x28/0x34
do_dentry_open+0x2cc/0x5c0
vfs_open+0x80/0xe0
path_openat+0xd8c/0x2988
do_filp_open+0x1c4/0x2fc
do_sys_open+0x23c/0x404
SyS_openat+0x38/0x48
Allocated by task 2138:
kasan_kmalloc+0xe0/0x1ac
kmem_cache_alloc_trace+0x334/0x394
packages_make_item+0x4c/0x180
configfs_mkdir+0x358/0x740
vfs_mkdir2+0x1bc/0x2e8
SyS_mkdirat+0x154/0x23c
el0_svc_naked+0x34/0x38
Freed by task 13096:
kasan_slab_free+0xb8/0x194
kfree+0x13c/0x910
package_details_release+0x524/0x56c
kref_put+0xc4/0x104
config_item_put+0x24/0x34
__configfs_open_file+0x35c/0x3b0
configfs_open_file+0x28/0x34
do_dentry_open+0x2cc/0x5c0
vfs_open+0x80/0xe0
path_openat+0xd8c/0x2988
do_filp_open+0x1c4/0x2fc
do_sys_open+0x23c/0x404
SyS_openat+0x38/0x48
el0_svc_naked+0x34/0x38
To fix this issue, remove the config_item_put in
__configfs_open_file to balance the refcount of config_item.
Fixes: b0841eefd969 ("configfs: provide exclusion between IO and removals")
Signed-off-by: Daiyue Zhang <zhangdaiyue1@huawei.com>
Signed-off-by: Yi Chen <chenyi77@huawei.com>
Signed-off-by: Ge Qiu <qiuge@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|