| Age | Commit message (Collapse) | Author | Files | Lines |
|
commit 233087ca063686964a53c829d547c7571e3f67bf upstream.
Minh Yuan reported a concurrency use-after-free issue in the floppy code
between raw_cmd_ioctl and seek_interrupt.
[ It turns out this has been around, and that others have reported the
KASAN splats over the years, but Minh Yuan had a reproducer for it and
so gets primary credit for reporting it for this fix - Linus ]
The problem is, this driver tends to break very easily and nowadays,
nobody is expected to use FDRAWCMD anyway since it was used to
manipulate non-standard formats. The risk of breaking the driver is
higher than the risk presented by this race, and accessing the device
requires privileges anyway.
Let's just add a config option to completely disable this ioctl and
leave it disabled by default. Distros shouldn't use it, and only those
running on antique hardware might need to enable it.
Link: https://lore.kernel.org/all/000000000000b71cdd05d703f6bf@google.com/
Link: https://lore.kernel.org/lkml/CAKcFiNC=MfYVW-Jt9A3=FPJpTwCD2PL_ULNCpsCVE5s8ZeBQgQ@mail.gmail.com
Link: https://lore.kernel.org/all/CAEAjamu1FRhz6StCe_55XY5s389ZP_xmCF69k987En+1z53=eg@mail.gmail.com
Reported-by: Minh Yuan <yuanmingbuaa@gmail.com>
Reported-by: syzbot+8e8958586909d62b6840@syzkaller.appspotmail.com
Reported-by: cruise k <cruise4k@gmail.com>
Reported-by: Kyungtae Kim <kt0755@gmail.com>
Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org>
Tested-by: Denis Efremov <efremov@linux.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 286901941fd18a52b2138fddbbf589ad3639eb00 ]
We want our pages not to change while they are being written.
Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 3e3876d322aef82416ecc496a4d4a587e0fdf7a3 ]
When poll request is timed out, it is removed from the poll list,
but not completed, so the request is leaked, and never get chance
to complete.
Fix the issue by ending it in timeout handler.
Fixes: 0a593fbbc245 ("null_blk: poll queue support")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20220413084836.1571995-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit ae4d37b5df749926891583d42a6801b5da11e3c1 upstream.
The bug is here:
idr_remove(&connection->peer_devices, vnr);
If the previous for_each_connection() don't exit early (no goto hit
inside the loop), the iterator 'connection' after the loop will be a
bogus pointer to an invalid structure object containing the HEAD
(&resource->connections). As a result, the use of 'connection' above
will lead to a invalid memory access (including a possible invalid free
as idr_remove could call free_layer).
The original intention should have been to remove all peer_devices,
but the following lines have already done the work. So just remove
this line and the unneeded label, to fix this bug.
Cc: stable@vger.kernel.org
Fixes: c06ece6ba6f1b ("drbd: Turn connection->volumes into connection->peer_devices")
Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com>
Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Reviewed-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit aadb22ba2f656581b2f733deb3a467c48cc618f6 ]
In get_initial_state, it calls notify_initial_state_done(skb,..) if
cb->args[5]==1. If genlmsg_put() failed in notify_initial_state_done(),
the skb will be freed by nlmsg_free(skb).
Then get_initial_state will goto out and the freed skb will be used by
return value skb->len, which is a uaf bug.
What's worse, the same problem goes even further: skb can also be
freed in the notify_*_state_change -> notify_*_state calls below.
Thus 4 additional uaf bugs happened.
My patch lets the problem callee functions: notify_initial_state_done
and notify_*_state_change return an error code if errors happen.
So that the error codes could be propagated and the uaf bugs can be avoid.
v2 reports a compilation warning. This v3 fixed this warning and built
successfully in my local environment with no additional warnings.
v2: https://lore.kernel.org/patchwork/patch/1435218/
Fixes: a29728463b254 ("drbd: Backport the "events2" command")
Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 7198bfc2017644c6b92d2ecef9b8b8e0363bb5fd upstream.
This reverts commit 6d35d04a9e18990040e87d2bbf72689252669d54.
Both Gabriel and Borislav report that this commit casues a regression
with nbd:
sysfs: cannot create duplicate filename '/dev/block/43:0'
Revert it before 5.18-rc1 and we'll investigage this separately in
due time.
Link: https://lore.kernel.org/all/YkiJTnFOt9bTv6A2@zn.tnic/
Reported-by: Gabriel L. Somlo <somlo@cmu.edu>
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b2479de38d8fc7ef13d5c78ff5ded6e5a1a4eac0 upstream.
My kernel robot report below:
drivers/block/n64cart.c: In function ‘n64cart_submit_bio’:
drivers/block/n64cart.c:91:26: error: ‘struct bio’ has no member named ‘bi_disk’
91 | struct device *dev = bio->bi_disk->private_data;
| ^~
CC drivers/slimbus/qcom-ctrl.o
CC drivers/auxdisplay/hd44780.o
CC drivers/watchdog/watchdog_core.o
CC drivers/nvme/host/fault_inject.o
AR drivers/accessibility/braille/built-in.a
make[2]: *** [scripts/Makefile.build:288: drivers/block/n64cart.o] Error 1
Fixes: 309dca309fc3 ("block: store a block_device pointer in struct bio");
Reported-by: k2ci <kernel-bot@kylinos.cn>
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220321071216.1549596-1-liu.yun@linux.dev
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 6d35d04a9e18990040e87d2bbf72689252669d54 upstream.
When 'index' is a big numbers, it may become negative which forced
to 'int'. then 'index << part_shift' might overflow to a positive
value that is not greater than '0xfffff', then sysfs might complains
about duplicate creation. Because of this, move the 'index' judgment
to the front will fix it and be better.
Fixes: b0d9111a2d53 ("nbd: use an idr to keep track of nbd devices")
Fixes: 940c264984fd ("nbd: fix possible overflow for 'first_minor' in nbd_dev_add()")
Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Link: https://lore.kernel.org/r/20220310093224.4002895-1-zhangwensheng5@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit f941c51eeac7ebe0f8ec30943bf78e7f60aad039 upstream.
Support for cryptoloop was deleted in commit 47e9624616c8 ("block:
remove support for cryptoloop and the xor transfer"), making the usage
of loop_info->lo_encrypt_type obsolete. However, this member was also
removed from the compat_loop_info definition and this breaks userspace
ioctl calls for 32-bit binaries and CONFIG_COMPAT=y.
This patch restores the compat_loop_info->lo_encrypt_type member and
marks it obsolete as well as in the uapi header definitions.
Fixes: 47e9624616c8 ("block: remove support for cryptoloop and the xor transfer")
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220329201815.1347500-1-cmllamas@google.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit b27824d31f09ea7b4a6ba2c1b18bd328df3e8bed ]
sprintf does not know the PAGE_SIZE maximum of the temporary buffer
used for outputting sysfs content and it's possible to overrun the
PAGE_SIZE buffer length.
Use a generic sysfs_emit function that knows the size of the
temporary buffer and ensures that no overrun is done for offset
attribute in
loop_attr_[offset|sizelimit|autoclear|partscan|dio]_show() callbacks.
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Link: https://lore.kernel.org/r/20220215213310.7264-2-kch@nvidia.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit f4329d1f848ac35757d9cc5487669d19dfc5979c upstream.
Scenario:
---------
bio chain generated by blk_queue_split().
Some split bio fails and propagates its error status to the "parent" bio.
But then the (last part of the) parent bio itself completes without error.
We would clobber the already recorded error status with BLK_STS_OK,
causing silent data corruption.
Reproducer:
-----------
How to trigger this in the real world within seconds:
DRBD on top of degraded parity raid,
small stripe_cache_size, large read_ahead setting.
Drop page cache (sysctl vm.drop_caches=1, fadvise "DONTNEED",
umount and mount again, "reboot").
Cause significant read ahead.
Large read ahead request is split by blk_queue_split().
Parts of the read ahead that are already in the stripe cache,
or find an available stripe cache to use, can be serviced.
Parts of the read ahead that would need "too much work",
would need to wait for a "stripe_head" to become available,
are rejected immediately.
For larger read ahead requests that are split in many pieces, it is very
likely that some "splits" will be serviced, but then the stripe cache is
exhausted/busy, and the remaining ones will be rejected.
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Cc: <stable@vger.kernel.org> # 4.13.x
Link: https://lore.kernel.org/r/20220330185551.3553196-1-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
"Several Linux PV device frontends are using the grant table interfaces
for removing access rights of the backends in ways being subject to
race conditions, resulting in potential data leaks, data corruption by
malicious backends, and denial of service triggered by malicious
backends:
- blkfront, netfront, scsifront and the gntalloc driver are testing
whether a grant reference is still in use. If this is not the case,
they assume that a following removal of the granted access will
always succeed, which is not true in case the backend has mapped
the granted page between those two operations.
As a result the backend can keep access to the memory page of the
guest no matter how the page will be used after the frontend I/O
has finished. The xenbus driver has a similar problem, as it
doesn't check the success of removing the granted access of a
shared ring buffer.
- blkfront, netfront, scsifront, usbfront, dmabuf, xenbus, 9p,
kbdfront, and pvcalls are using a functionality to delay freeing a
grant reference until it is no longer in use, but the freeing of
the related data page is not synchronized with dropping the granted
access.
As a result the backend can keep access to the memory page even
after it has been freed and then re-used for a different purpose.
- netfront will fail a BUG_ON() assertion if it fails to revoke
access in the rx path.
This will result in a Denial of Service (DoS) situation of the
guest which can be triggered by the backend"
* tag 'xsa396-5.17-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/netfront: react properly to failing gnttab_end_foreign_access_ref()
xen/gnttab: fix gnttab_end_foreign_access() without page specified
xen/pvcalls: use alloc/free_pages_exact()
xen/9p: use alloc/free_pages_exact()
xen/usb: don't use gnttab_end_foreign_access() in xenhcd_gnttab_done()
xen: remove gnttab_query_foreign_access()
xen/gntalloc: don't use gnttab_query_foreign_access()
xen/scsifront: don't use gnttab_query_foreign_access() for mapped status
xen/netfront: don't use gnttab_query_foreign_access() for mapped status
xen/blkfront: don't use gnttab_query_foreign_access() for mapped status
xen/grant-table: add gnttab_try_end_foreign_access()
xen/xenbus: don't let xenbus_grant_ring() remove grants in error case
|
|
It isn't enough to check whether a grant is still being in use by
calling gnttab_query_foreign_access(), as a mapping could be realized
by the other side just after having called that function.
In case the call was done in preparation of revoking a grant it is
better to do so via gnttab_end_foreign_access_ref() and check the
success of that operation instead.
For the ring allocation use alloc_pages_exact() in order to avoid
high order pages in case of a multi-page ring.
If a grant wasn't unmapped by the backend without persistent grants
being used, set the device state to "error".
This is CVE-2022-23036 / part of XSA-396.
Reported-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
---
V2:
- use gnttab_try_end_foreign_access()
V4:
- use alloc_pages_exact() and free_pages_exact()
- set state to error if backend didn't unmap (Roger Pau Monné)
|
|
Currently we have a BUG_ON() to make sure the number of sg
list does not exceed queue_max_segments() in virtio_queue_rq().
However, the block layer uses queue_max_discard_segments()
instead of queue_max_segments() to limit the sg list for
discard requests. So the BUG_ON() might be triggered if
virtio-blk device reports a larger value for max discard
segment than queue_max_segments(). To fix it, let's simply
remove the BUG_ON() which has become unnecessary after commit
02746e26c39e("virtio-blk: avoid preallocating big SGL for data").
And the unused vblk->sg_elems can also be removed together.
Fixes: 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support")
Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Link: https://lore.kernel.org/r/20220304100058.116-2-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Currently the value of max_discard_segment will be set to
MAX_DISCARD_SEGMENTS (256) with no basis in hardware if device
set 0 to max_discard_seg in configuration space. It's incorrect
since the device might not be able to handle such large descriptors.
To fix it, let's follow max_segments restrictions in this case.
Fixes: 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support")
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Link: https://lore.kernel.org/r/20220304100058.116-1-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Various block drivers call blk_set_queue_dying to mark a disk as dead due
to surprise removal events, but since commit 8e141f9eb803 that doesn't
work given that the GD_DEAD flag needs to be set to stop I/O.
Replace the driver calls to blk_set_queue_dying with a new (and properly
documented) blk_mark_disk_dead API, and fold blk_set_queue_dying into the
only remaining caller.
Fixes: 8e141f9eb803 ("block: drain file system I/O on del_gendisk")
Reported-by: Markus Blöchl <markus.bloechl@ipetronik.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Link: https://lore.kernel.org/r/20220217075231.1140-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
If backing file's filesystem has implemented ->fallocate(), we think the
loop device can support discard, then pass sb->s_blocksize as
discard_granularity. However, some underlying FS, such as overlayfs,
doesn't set sb->s_blocksize, and causes discard_granularity to be set as
zero, then the warning in __blkdev_issue_discard() is triggered.
Christoph suggested to pass kstatfs.f_bsize as discard granularity, and
this way is fine because kstatfs.f_bsize means 'Optimal transfer block
size', which still matches with definition of discard granularity.
So fix the issue by setting discard_granularity as kstatfs.f_bsize if it
is available, otherwise claims discard isn't supported.
Cc: Christoph Hellwig <hch@lst.de>
Cc: Vivek Goyal <vgoyal@redhat.com>
Reported-by: Pei Zhang <pezhang@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220126035830.296465-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The kernel test robot is reporting that xfstest which does
umount ext2 on xfs
umount xfs
sequence started failing, for commit 322c4293ecc58110 ("loop: make
autoclear operation asynchronous") removed a guarantee that fput() of
backing file is processed before lo_release() from close() returns to
user mode.
And syzbot is reporting that deferring destroy_workqueue() from
__loop_clr_fd() to a WQ context did not help [1]. Revert that commit.
Link: https://syzkaller.appspot.com/bug?extid=831661966588c802aae9 [1]
Reported-by: kernel test robot <oliver.sang@intel.com>
Acked-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reported-by: syzbot <syzbot+831661966588c802aae9@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Link: https://lore.kernel.org/r/20220211071554.3424-1-penguin-kernel@I-love.SAKURA.ne.jp
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Pull bitmap updates from Yury Norov:
- introduce for_each_set_bitrange()
- use find_first_*_bit() instead of find_next_*_bit() where possible
- unify for_each_bit() macros
* tag 'bitmap-5.17-rc1' of git://github.com/norov/linux:
vsprintf: rework bitmap_list_string
lib: bitmap: add performance test for bitmap_print_to_pagebuf
bitmap: unify find_bit operations
mm/percpu: micro-optimize pcpu_is_populated()
Replace for_each_*_bit_from() with for_each_*_bit() where appropriate
find: micro-optimize for_each_{set,clear}_bit()
include/linux: move for_each_bit() macros from bitops.h to find.h
cpumask: replace cpumask_next_* with cpumask_first_* where appropriate
tools: sync tools/bitmap with mother linux
all: replace find_next{,_zero}_bit with find_first{,_zero}_bit where appropriate
cpumask: use find_first_and_bit()
lib: add find_first_and_bit()
arch: remove GENERIC_FIND_FIRST_BIT entirely
include: move find.h from asm_generic to linux
bitops: move find_bit_*_le functions from le.h to find.h
bitops: protect find_first_{,zero}_bit properly
|
|
Pull block fixes from Jens Axboe:
"Various little minor fixes that should go into this release:
- Fix issue with cloned bios and IO accounting (Christoph)
- Remove redundant assignments (Colin, GuoYong)
- Fix an issue with the mq-deadline async_depth sysfs interface (me)
- Fix brd module loading race (Tetsuo)
- Shared tag map wakeup fix (Laibin)
- End of bdev read fix (OGAWA)
- srcu leak fix (Ming)"
* tag 'block-5.17-2022-01-21' of git://git.kernel.dk/linux-block:
block: fix async_depth sysfs interface for mq-deadline
block: Fix wrong offset in bio_truncate()
block: assign bi_bdev for cloned bios in blk_rq_prep_clone
block: cleanup q->srcu
block: Remove unnecessary variable assignment
brd: remove brd_devices_mutex mutex
aoe: remove redundant assignment on variable n
loop: remove redundant initialization of pointer node
blk-mq: fix tag_get wait task can't be awakened
|
|
Pull ceph updates from Ilya Dryomov:
"The highlight is the new mount "device" string syntax implemented by
Venky Shankar. It solves some long-standing issues with using
different auth entities and/or mounting different CephFS filesystems
from the same cluster, remounting and also misleading /proc/mounts
contents. The existing syntax of course remains to be maintained.
On top of that, there is a couple of fixes for edge cases in quota and
a new mount option for turning on unbuffered I/O mode globally instead
of on a per-file basis with ioctl(CEPH_IOC_SYNCIO)"
* tag 'ceph-for-5.17-rc1' of git://github.com/ceph/ceph-client:
ceph: move CEPH_SUPER_MAGIC definition to magic.h
ceph: remove redundant Lsx caps check
ceph: add new "nopagecache" option
ceph: don't check for quotas on MDS stray dirs
ceph: drop send metrics debug message
rbd: make const pointer spaces a static const array
ceph: Fix incorrect statfs report for small quota
ceph: mount syntax module parameter
doc: document new CephFS mount device syntax
ceph: record updated mon_addr on remount
ceph: new device mount syntax
libceph: rename parse_fsid() to ceph_parse_fsid() and export
libceph: generalize addr/ip parsing based on delimiter
|
|
Pull virtio updates from Michael Tsirkin:
"virtio,vdpa,qemu_fw_cfg: features, cleanups, and fixes.
- partial support for < MAX_ORDER - 1 granularity for virtio-mem
- driver_override for vdpa
- sysfs ABI documentation for vdpa
- multiqueue config support for mlx5 vdpa
- and misc fixes, cleanups"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (42 commits)
vdpa/mlx5: Fix tracking of current number of VQs
vdpa/mlx5: Fix is_index_valid() to refer to features
vdpa: Protect vdpa reset with cf_mutex
vdpa: Avoid taking cf_mutex lock on get status
vdpa/vdpa_sim_net: Report max device capabilities
vdpa: Use BIT_ULL for bit operations
vdpa/vdpa_sim: Configure max supported virtqueues
vdpa/mlx5: Report max device capabilities
vdpa: Support reporting max device capabilities
vdpa/mlx5: Restore cur_num_vqs in case of failure in change_num_qps()
vdpa: Add support for returning device configuration information
vdpa/mlx5: Support configuring max data virtqueue
vdpa/mlx5: Fix config_attr_mask assignment
vdpa: Allow to configure max data virtqueues
vdpa: Read device configuration only if FEATURES_OK
vdpa: Sync calls set/get config/status with cf_mutex
vdpa/mlx5: Distribute RX virtqueues in RQT object
vdpa: Provide interface to read driver features
vdpa: clean up get_config_size ret value handling
virtio_ring: mark ring unused on error
...
|
|
If brd_alloc() from brd_probe() is called before brd_alloc() from
brd_init() is called, module loading will fail with -EEXIST error.
To close this race, call __register_blkdev() just before leaving
brd_init().
Then, we can remove brd_devices_mutex mutex, for brd_device list
will no longer be accessed concurrently.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/6b074af7-c165-4fab-b7da-8270a4f6f6cd@i-love.sakura.ne.jp
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Merge misc updates from Andrew Morton:
"146 patches.
Subsystems affected by this patch series: kthread, ia64, scripts,
ntfs, squashfs, ocfs2, vfs, and mm (slab-generic, slab, kmemleak,
dax, kasan, debug, pagecache, gup, shmem, frontswap, memremap,
memcg, selftests, pagemap, dma, vmalloc, memory-failure, hugetlb,
userfaultfd, vmscan, mempolicy, oom-kill, hugetlbfs, migration, thp,
ksm, page-poison, percpu, rmap, zswap, zram, cleanups, hmm, and
damon)"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (146 commits)
mm/damon: hide kernel pointer from tracepoint event
mm/damon/vaddr: hide kernel pointer from damon_va_three_regions() failure log
mm/damon/vaddr: use pr_debug() for damon_va_three_regions() failure logging
mm/damon/dbgfs: remove an unnecessary variable
mm/damon: move the implementation of damon_insert_region to damon.h
mm/damon: add access checking for hugetlb pages
Docs/admin-guide/mm/damon/usage: update for schemes statistics
mm/damon/dbgfs: support all DAMOS stats
Docs/admin-guide/mm/damon/reclaim: document statistics parameters
mm/damon/reclaim: provide reclamation statistics
mm/damon/schemes: account how many times quota limit has exceeded
mm/damon/schemes: account scheme actions that successfully applied
mm/damon: remove a mistakenly added comment for a future feature
Docs/admin-guide/mm/damon/usage: update for kdamond_pid and (mk|rm)_contexts
Docs/admin-guide/mm/damon/usage: mention tracepoint at the beginning
Docs/admin-guide/mm/damon/usage: remove redundant information
Docs/admin-guide/mm/damon/usage: update for scheme quotas and watermarks
mm/damon: convert macro functions to static inline functions
mm/damon: modify damon_rand() macro to static inline function
mm/damon: move damon_rand() definition into damon.h
...
|
|
find_first{,_zero}_bit is a more effective analogue of 'next' version if
start == 0. This patch replaces 'next' with 'first' where things look
trivial.
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
|
|
Embrace ATTRIBUTE_GROUPS to avoid boiler plate code. This should not
introduce any functional changes.
Link: https://lkml.kernel.org/r/20211028203600.2157356-1-mcgrof@kernel.org
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This will enable cleanups down the road.
The idea is to disable cbs, then add "flush_queued_cbs" callback
as a parameter, this way drivers can flush any work
queued after callbacks have been disabled.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/r/20211013105226.20225-1-mst@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc and other driver updates from Greg KH:
"Here is the large set of char, misc, and other "small" driver
subsystem changes for 5.17-rc1.
Lots of different things are in here for char/misc drivers such as:
- habanalabs driver updates
- mei driver updates
- lkdtm driver updates
- vmw_vmci driver updates
- android binder driver updates
- other small char/misc driver updates
Also smaller driver subsystems have also been updated, including:
- fpga subsystem updates
- iio subsystem updates
- soundwire subsystem updates
- extcon subsystem updates
- gnss subsystem updates
- phy subsystem updates
- coresight subsystem updates
- firmware subsystem updates
- comedi subsystem updates
- mhi subsystem updates
- speakup subsystem updates
- rapidio subsystem updates
- spmi subsystem updates
- virtual driver updates
- counter subsystem updates
Too many individual changes to summarize, the shortlog contains the
full details.
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (406 commits)
counter: 104-quad-8: Fix use-after-free by quad8_irq_handler
dt-bindings: mux: Document mux-states property
dt-bindings: ti-serdes-mux: Add defines for J721S2 SoC
counter: remove old and now unused registration API
counter: ti-eqep: Convert to new counter registration
counter: stm32-lptimer-cnt: Convert to new counter registration
counter: stm32-timer-cnt: Convert to new counter registration
counter: microchip-tcb-capture: Convert to new counter registration
counter: ftm-quaddec: Convert to new counter registration
counter: intel-qep: Convert to new counter registration
counter: interrupt-cnt: Convert to new counter registration
counter: 104-quad-8: Convert to new counter registration
counter: Update documentation for new counter registration functions
counter: Provide alternative counter registration functions
counter: stm32-timer-cnt: Convert to counter_priv() wrapper
counter: stm32-lptimer-cnt: Convert to counter_priv() wrapper
counter: ti-eqep: Convert to counter_priv() wrapper
counter: ftm-quaddec: Convert to counter_priv() wrapper
counter: intel-qep: Convert to counter_priv() wrapper
counter: microchip-tcb-capture: Convert to counter_priv() wrapper
...
|
|
The variable n is being bit-wise or'd with a value and reassigned
before being returned. The update of n is redundant, replace
the |= operator with | instead. Cleans up clang scan warning:
drivers/block/aoe/aoecmd.c:125:9: warning: Although the value stored
to 'n' is used in the enclosing expression, the value is never
actually read from 'n' [deadcode.DeadStores]
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20220113000545.1307091-1-colin.i.king@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The pointer node is being initialized with a value that is never
read, it is being re-assigned the same value a little futher on.
Remove the redundant initialization. Cleans up clang scan warning:
drivers/block/loop.c:823:19: warning: Value stored to 'node' during
its initialization is never read [deadcode.DeadStores]
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20220113001432.1331871-1-colin.i.king@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Pull rdma updates from Jason Gunthorpe:
"Another small cycle. Mostly cleanups and bug fixes, quite a bit
assisted from bots. There are a few new syzkaller splats that haven't
been solved yet but they should get into the rcs in a few weeks, I
think.
Summary:
- Update drivers to use common helpers for GUIDs, pkeys, bitmaps,
memset_startat, and others
- General code cleanups from bots
- Simplify some of the rxe pool code in preparation for a larger
rework
- Clean out old stuff from hns, including all support for hip06
devices
- Fix a bug where GID table entries could be missed if the table had
holes in it
- Rename paths and sessions in rtrs for better understandability
- Consolidate the roce source port selection code
- NDR speed support in mlx5"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (83 commits)
RDMA/irdma: Remove the redundant return
RDMA/rxe: Use the standard method to produce udp source port
RDMA/irdma: Make the source udp port vary
RDMA/hns: Replace get_udp_sport with rdma_get_udp_sport
RDMA/core: Calculate UDP source port based on flow label or lqpn/rqpn
IB/qib: Fix typos
RDMA/rtrs-clt: Rename rtrs_clt to rtrs_clt_sess
RDMA/rtrs-srv: Rename rtrs_srv to rtrs_srv_sess
RDMA/rtrs-clt: Rename rtrs_clt_sess to rtrs_clt_path
RDMA/rtrs-srv: Rename rtrs_srv_sess to rtrs_srv_path
RDMA/rtrs: Rename rtrs_sess to rtrs_path
RDMA/hns: Modify the hop num of HIP09 EQ to 1
IB/iser: Align coding style across driver
IB/iser: Remove un-needed casting to/from void pointer
IB/iser: Don't suppress send completions
IB/iser: Rename ib_ret local variable
IB/iser: Fix RNR errors
IB/iser: Remove deprecated pi_guard module param
IB/mlx5: Expose NDR speed through MAD
RDMA/cxgb4: Set queue pair state when being queried
...
|
|
To resolve minor conflict in:
drivers/infiniband/hw/mlx5/mlx5_ib.h
By merging both hunks.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Don't populate the const array spaces on the stack but make it static
const and make the pointer an array to remove a dereference. Shrinks
object code a little too. Also clean up intent, currently it is spaces
and should be a tab.
Signed-off-by: Colin Ian King <colin.i.king@googlemail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
... and remove hardcoded function name in ceph_parse_ips().
[ idryomov: delim parameter, drop CEPH_ADDR_PARSE_DEFAULT_DELIM ]
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Pull block driver updates from Jens Axboe:
- mtip32xx pci cleanups (Bjorn)
- mtip32xx conversion to generic power management (Vaibhav)
- rsxx pci powermanagement cleanups (Bjorn)
- Remove the rsxx driver. This hardware never saw much adoption, and
it's been end of lifed for a while. (Christoph)
- MD pull request from Song:
- REQ_NOWAIT support (Vishal Verma)
- raid6 benchmark optimization (Dirk Müller)
- Fix for acct bioset (Xiao Ni)
- Clean up max_queued_requests (Mariusz Tkaczyk)
- PREEMPT_RT optimization (Davidlohr Bueso)
- Use default_groups in kobj_type (Greg Kroah-Hartman)
- Use attribute groups in pktcdvd and rnbd (Greg)
- NVMe pull request from Christoph:
- increment request genctr on completion (Keith Busch, Geliang
Tang)
- add a 'iopolicy' module parameter (Hannes Reinecke)
- print out valid arguments when reading from /dev/nvme-fabrics
(Hannes Reinecke)
- Use struct_group() in drbd (Kees)
- null_blk fixes (Ming)
- Get rid of congestion logic in pktcdvd (Neil)
- Floppy ejection hang fix (Tasos)
- Floppy max user request size fix (Xiongwei)
- Loop locking fix (Tetsuo)
* tag 'for-5.17/drivers-2022-01-11' of git://git.kernel.dk/linux-block: (32 commits)
md: use default_groups in kobj_type
md: Move alloc/free acct bioset in to personality
lib/raid6: Use strict priority ranking for pq gen() benchmarking
lib/raid6: skip benchmark of non-chosen xor_syndrome functions
md: fix spelling of "its"
md: raid456 add nowait support
md: raid10 add nowait support
md: raid1 add nowait support
md: add support for REQ_NOWAIT
md: drop queue limitation for RAID1 and RAID10
md/raid5: play nice with PREEMPT_RT
block/rnbd-clt-sysfs: use default_groups in kobj_type
pktcdvd: convert to use attribute groups
block: null_blk: only set set->nr_maps as 3 if active poll_queues is > 0
nvme: add 'iopolicy' module parameter
nvme: drop unused variable ctrl in nvme_setup_cmd
nvme: increment request genctr on completion
nvme-fabrics: print out valid arguments when reading from /dev/nvme-fabrics
block: remove the rsxx driver
rsxx: Drop PCI legacy power management
...
|
|
Pull block updates from Jens Axboe:
- Unify where the struct request handling code is located in the blk-mq
code (Christoph)
- Header cleanups (Christoph)
- Clean up the io_context handling code (Christoph, me)
- Get rid of ->rq_disk in struct request (Christoph)
- Error handling fix for add_disk() (Christoph)
- request allocation cleanusp (Christoph)
- Documentation updates (Eric, Matthew)
- Remove trivial crypto unregister helper (Eric)
- Reduce shared tag overhead (John)
- Reduce poll_stats memory overhead (me)
- Known indirect function call for dio (me)
- Use atomic references for struct request (me)
- Support request list issue for block and NVMe (me)
- Improve queue dispatch pinning (Ming)
- Improve the direct list issue code (Keith)
- BFQ improvements (Jan)
- Direct completion helper and use it in mmc block (Sebastian)
- Use raw spinlock for the blktrace code (Wander)
- fsync error handling fix (Ye)
- Various fixes and cleanups (Lukas, Randy, Yang, Tetsuo, Ming, me)
* tag 'for-5.17/block-2022-01-11' of git://git.kernel.dk/linux-block: (132 commits)
MAINTAINERS: add entries for block layer documentation
docs: block: remove queue-sysfs.rst
docs: sysfs-block: document virt_boundary_mask
docs: sysfs-block: document stable_writes
docs: sysfs-block: fill in missing documentation from queue-sysfs.rst
docs: sysfs-block: add contact for nomerges
docs: sysfs-block: sort alphabetically
docs: sysfs-block: move to stable directory
block: don't protect submit_bio_checks by q_usage_counter
block: fix old-style declaration
nvme-pci: fix queue_rqs list splitting
block: introduce rq_list_move
block: introduce rq_list_for_each_safe macro
block: move rq_list macros to blk-mq.h
block: drop needless assignment in set_task_ioprio()
block: remove unnecessary trailing '\'
bio.h: fix kernel-doc warnings
block: check minor range in device_add_disk()
block: use "unsigned long" for blk_validate_block_size().
block: fix error unwinding in device_add_disk
...
|
|
Structure rtrs_clt is used for sessions. So to avoid confusions rename it
to rtrs_clt_sess.
Transformations are done with the help of following coccinelle script.
@@
@@
struct
- rtrs_clt
+ rtrs_clt_sess
Link: https://lore.kernel.org/r/20220105180708.7774-6-jinpu.wang@ionos.com
Signed-off-by: Vaishali Thakkar <vaishali.thakkar@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Structure rtrs_srv is used for sessions so in order to avoid confusions
rename it to rtrs_srv_sess.
All changes were done with the help of following Coccinelle script:
@@
@@
struct
- rtrs_srv
+ rtrs_srv_sess
Link: https://lore.kernel.org/r/20220105180708.7774-5-jinpu.wang@ionos.com
Signed-off-by: Vaishali Thakkar <vaishali.thakkar@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
rtrs_srv_sess is used for paths and not sessions on the server side. This
creates confusion so let's rename it to rtrs_srv_path. Also, rename
related variables and functions.
Coccinelle is used to do the transformations for most of the occurrences
and remaining ones were handled manually.
Link: https://lore.kernel.org/r/20220105180708.7774-3-jinpu.wang@ionos.com
Signed-off-by: Vaishali Thakkar <vaishali.thakkar@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
There are currently 2 ways to create a set of sysfs files for a
kobj_type, through the default_attrs field, and the default_groups
field. Move the rnbd controller sysfs code to use default_groups field
which has been the preferred way since aa30f47cf666 ("kobject: Add
support for default attribute groups to kobj_type") so that we can soon
get rid of the obsolete default_attrs field.
Cc: "Md. Haris Iqbal" <haris.iqbal@ionos.com>
Cc: Jack Wang <jinpu.wang@ionos.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Link: https://lore.kernel.org/r/20220104162947.1320936-1-gregkh@linuxfoundation.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
There is no need to create kobject children of the pktcdvd device just
to display a subdirectory name. Instead, use a named attribute group
which removes the extra kobjects and also fixes the userspace race where
the device is created yet tools like libudev can not see the attributes
as they think the subdirectories are some other sort of device.
Cc: linux-block@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20220103162408.742003-1-gregkh@linuxfoundation.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
We need the fixes in here as well for testing.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
It isn't correct to set set->nr_maps as 3 if g_poll_queues is > 0 since
we can change it via configfs for null_blk device created there, so only
set it as 3 if active poll_queues is > 0.
Fixes divide zero exception reported by Shinichiro.
Fixes: 2bfdbe8b7ebd ("null_blk: allow zero poll queues")
Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20211224010831.1521805-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
MIPS include files define "PC" so when building the paride driver the
following build warning shows up:
rivers/block/paride/bpck.c:32: warning: "PC" redefined
Fix this by undefining PC before redefining it as is done for other
defines in this driver.
Cc: Tim Waugh <tim@cyberelk.net>
Acked-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/20211130084626.3215987-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This driver was for rare and shortlived high end enterprise hardware
and hasn't been maintained since 2014, which also means it never got
converted to use blk-mq.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The Xen blkfront driver is still vulnerable for an attack via excessive
number of events sent by the backend. Fix that by using lateeoi event
channels.
This is part of XSA-391
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
|
|
The rsxx driver doesn't support device suspend, so remove
rsxx_pci_suspend(), the legacy PCI .suspend() method, completely.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20211208192449.146076-5-helgaas@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Convert mtip32xx from legacy PCI power management to the generic power
management framework.
Previously, mtip32xx used legacy PCI power management, where
mtip_pci_suspend() and mtip_pci_resume() were responsible for both
device-specific things and generic PCI things:
mtip_pci_suspend
mtip_block_suspend(dd) <-- device-specific
pci_save_state(pdev) <-- generic PCI
pci_set_power_state(pdev, pci_choose_state(pdev, state))
mtip_pci_resume
pci_set_power_state(PCI_D0) <-- generic PCI
pci_restore_state(pdev) <-- generic PCI
pcim_enable_device(pdev) <-- generic PCI
pci_set_master(pdev) <-- generic PCI
mtip_block_resume(dd) <-- device-specific
With generic power management, the PCI bus PM methods do the generic PCI
things, and the driver needs only the device-specific part, i.e.,
suspend_devices_and_enter
dpm_suspend_start(PMSG_SUSPEND)
pci_pm_suspend # PCI bus .suspend() method
mtip_pci_suspend # dev->driver->pm->suspend
mtip_block_suspend <-- device-specific
suspend_enter
dpm_suspend_noirq(PMSG_SUSPEND)
pci_pm_suspend_noirq # PCI bus .suspend_noirq() method
pci_save_state <-- generic PCI
pci_prepare_to_sleep <-- generic PCI
pci_set_power_state
...
dpm_resume_end(PMSG_RESUME)
pci_pm_resume # PCI bus .resume() method
pci_restore_standard_config
pci_set_power_state(PCI_D0) <-- generic PCI
pci_restore_state <-- generic PCI
mtip_pci_resume # dev->driver->pm->resume
mtip_block_resume <-- device-specific
[bhelgaas: commit log]
Link: https://lore.kernel.org/r/20210114115423.52414-2-vaibhavgupta40@gmail.com
Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20211208192449.146076-4-helgaas@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Previously we passed a struct pci_dev * to mtip_check_surprise_removal(),
which immediately looked up the driver_data. But all callers already have
the driver_data pointer, so just pass it directly and skip the extra
lookup. No functional change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20211208192449.146076-3-helgaas@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The .suspend() and .resume() methods are only called after the .probe()
method (mtip_pci_probe()) has set the drvdata and returned success.
Therefore, if we get to mtip_pci_suspend() or mtip_pci_resume(), the
drvdata must be valid. Drop the unnecessary checking.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20211208192449.146076-2-helgaas@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|