Age | Commit message (Collapse) | Author | Files | Lines |
|
Pull virtio updates from Michael Tsirkin:
- A huge patchset supporting vq resize using the new vq reset
capability
- Features, fixes, and cleanups all over the place
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (88 commits)
vdpa/mlx5: Fix possible uninitialized return value
vdpa_sim_blk: add support for discard and write-zeroes
vdpa_sim_blk: add support for VIRTIO_BLK_T_FLUSH
vdpa_sim_blk: make vdpasim_blk_check_range usable by other requests
vdpa_sim_blk: check if sector is 0 for commands other than read or write
vdpa_sim: Implement suspend vdpa op
vhost-vdpa: uAPI to suspend the device
vhost-vdpa: introduce SUSPEND backend feature bit
vdpa: Add suspend operation
virtio-blk: Avoid use-after-free on suspend/resume
virtio_vdpa: support the arg sizes of find_vqs()
vhost-vdpa: Call ida_simple_remove() when failed
vDPA: fix 'cast to restricted le16' warnings in vdpa.c
vDPA: !FEATURES_OK should not block querying device config space
vDPA/ifcvf: support userspace to query features and MQ of a management device
vDPA/ifcvf: get_config_size should return a value no greater than dev implementation
vhost scsi: Allow user to control num virtqueues
vhost-scsi: Fix max number of virtqueues
vdpa/mlx5: Support different address spaces for control and data
vdpa/mlx5: Implement susupend virtqueue callback
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply
Pull power supply and reset updates from Sebastian Reichel:
"No core patches, only driver updates:
- pwr-mlxbf: new reset driver for Mellanox BlueField
- at91-reset: SAMA7G5 support
- ab8500: continue refurbishing
- misc minor fixes"
* tag 'for-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply: (29 commits)
power: supply: olpc_battery: Hold the reference returned by of_find_compatible_node
power: supply: ab8500: add missing destroy_workqueue in ab8500_charger_bind
power: supply: ab8500: Remove flush_scheduled_work() call.
power: supply: ab8500_fg: drop duplicated 'is' in comment
power: supply: ab8500: Drop external charger leftovers
power: supply: ab8500: Add MAINTAINERS entry
dt-bindings: power: reset: qcom,pshold: convert to dtschema
power: supply: Fix typo in power_supply_check_supplies
power: reset: pwr-mlxbf: change rst_pwr_hid and low_pwr_hid from global to local variables
power: reset: pwr-mlxbf: add missing include
power: reset: at91-reset: add support for SAMA7G5
power: reset: at91-reset: add reset_controller_dev support
power: reset: at91-reset: add at91_reset_data
power: reset: at91-reset: document structures and enums
dt-bindings: reset: add sama7g5 definitions
dt-bindings: reset: atmel,at91sam9260-reset: add sama7g5 bindings
dt-bindings: reset: convert Atmel/Microchip reset controller to YAML
power: reset: pwr-mlxbf: add BlueField SoC power control driver
power: supply: ab8500: Exit maintenance if too low voltage
power: supply: ab8500: Respect charge_restart_voltage_uv
...
|
|
Pull another VFIO update from Alex Williamson:
- Rename vfio source file to more easily allow additional source
files in the upcoming development cycles (Jason Gunthorpe)
* tag 'vfio-v6.0-rc1pt2' of https://github.com/awilliam/linux-vfio:
vfio: Move vfio.c to vfio_main.c
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- A few fixes for the DM verity and bufio changes in this merge window
- A smatch warning fix for DM writecache locking in writecache_map
* tag 'for-6.0/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm bufio: fix some cases where the code sleeps with spinlock held
dm writecache: fix smatch warning about invalid return from writecache_map
dm verity: fix verity_parse_opt_args parsing
dm verity: fix DM_VERITY_OPTS_MAX value yet again
dm bufio: simplify DM_BUFIO_CLIENT_NO_SLEEP locking
|
|
Pull drm fixes from Dave Airlie:
"Not much to squeeze into rc1, just two small fixes, one for core gem
and one for shmem-helpers:
gem:
- Annotate WW context in error paths
shmem-helper:
- Add missing vunmap in error paths"
* tag 'drm-next-2022-08-12-1' of git://anongit.freedesktop.org/drm/drm:
drm/gem: Properly annotate WW context on drm_gem_lock_reservations() error
drm/shmem-helper: Add missing vunmap on error
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from bluetooth, bpf, can and netfilter.
A little larger than usual but it's all fixes, no late features. It's
large partially because of timing, and partially because of follow ups
to stuff that got merged a week or so before the merge window and
wasn't as widely tested. Maybe the Bluetooth fixes are a little
alarming so we'll address that, but the rest seems okay and not scary.
Notably we're including a fix for the netfilter Kconfig [1], your WiFi
warning [2] and a bluetooth fix which should unblock syzbot [3].
Current release - regressions:
- Bluetooth:
- don't try to cancel uninitialized works [3]
- L2CAP: fix use-after-free caused by l2cap_chan_put
- tls: rx: fix device offload after recent rework
- devlink: fix UAF on failed reload and leftover locks in mlxsw
Current release - new code bugs:
- netfilter:
- flowtable: fix incorrect Kconfig dependencies [1]
- nf_tables: fix crash when nf_trace is enabled
- bpf:
- use proper target btf when exporting attach_btf_obj_id
- arm64: fixes for bpf trampoline support
- Bluetooth:
- ISO: unlock on error path in iso_sock_setsockopt()
- ISO: fix info leak in iso_sock_getsockopt()
- ISO: fix iso_sock_getsockopt for BT_DEFER_SETUP
- ISO: fix memory corruption on iso_pinfo.base
- ISO: fix not using the correct QoS
- hci_conn: fix updating ISO QoS PHY
- phy: dp83867: fix get nvmem cell fail
Previous releases - regressions:
- wifi: cfg80211: fix validating BSS pointers in
__cfg80211_connect_result [2]
- atm: bring back zatm uAPI after ATM had been removed
- properly fix old bug making bonding ARP monitor mode not being able
to work with software devices with lockless Tx
- tap: fix null-deref on skb->dev in dev_parse_header_protocol
- revert "net: usb: ax88179_178a needs FLAG_SEND_ZLP" it helps some
devices and breaks others
- netfilter:
- nf_tables: many fixes rejecting cross-object linking which may
lead to UAFs
- nf_tables: fix null deref due to zeroed list head
- nf_tables: validate variable length element extension
- bgmac: fix a BUG triggered by wrong bytes_compl
- bcmgenet: indicate MAC is in charge of PHY PM
Previous releases - always broken:
- bpf:
- fix bad pointer deref in bpf_sys_bpf() injected via test infra
- disallow non-builtin bpf programs calling the prog_run command
- don't reinit map value in prealloc_lru_pop
- fix UAFs during the read of map iterator fd
- fix invalidity check for values in sk local storage map
- reject sleepable program for non-resched map iterator
- mptcp:
- move subflow cleanup in mptcp_destroy_common()
- do not queue data on closed subflows
- virtio_net: fix memory leak inside XDP_TX with mergeable
- vsock: fix memory leak when multiple threads try to connect()
- rework sk_user_data sharing to prevent psock leaks
- geneve: fix TOS inheriting for ipv4
- tunnels & drivers: do not use RT_TOS for IPv6 flowlabel
- phy: c45 baset1: do not skip aneg configuration if clock role is
not specified
- rose: avoid overflow when /proc displays timer information
- x25: fix call timeouts in blocking connects
- can: mcp251x: fix race condition on receive interrupt
- can: j1939:
- replace user-reachable WARN_ON_ONCE() with netdev_warn_once()
- fix memory leak of skbs in j1939_session_destroy()
Misc:
- docs: bpf: clarify that many things are not uAPI
- seg6: initialize induction variable to first valid array index (to
silence clang vs objtool warning)
- can: ems_usb: fix clang 14's -Wunaligned-access warning"
* tag 'net-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (117 commits)
net: atm: bring back zatm uAPI
dpaa2-eth: trace the allocated address instead of page struct
net: add missing kdoc for struct genl_multicast_group::flags
nfp: fix use-after-free in area_cache_get()
MAINTAINERS: use my korg address for mt7601u
mlxsw: minimal: Fix deadlock in ports creation
bonding: fix reference count leak in balance-alb mode
net: usb: qmi_wwan: Add support for Cinterion MV32
bpf: Shut up kern_sys_bpf warning.
net/tls: Use RCU API to access tls_ctx->netdev
tls: rx: device: don't try to copy too much on detach
tls: rx: device: bound the frag walk
net_sched: cls_route: remove from list when handle is 0
selftests: forwarding: Fix failing tests with old libnet
net: refactor bpf_sk_reuseport_detach()
net: fix refcount bug in sk_psock_get (2)
selftests/bpf: Ensure sleepable program is rejected by hash map iter
selftests/bpf: Add write tests for sk local storage map iterator
selftests/bpf: Add tests for reading a dangling map iter fd
bpf: Only allow sleepable program for resched-able iterator
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more ACPI updates from Rafael Wysocki:
"These fix up direct references to the fwnode field in struct device
and extend ACPI device properties support.
Specifics:
- Replace direct references to the fwnode field in struct device with
dev_fwnode() and device_match_fwnode() (Andy Shevchenko)
- Make the ACPI code handling device properties support properties
with buffer values (Sakari Ailus)"
* tag 'acpi-5.20-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: property: Fix error handling in acpi_init_properties()
ACPI: VIOT: Do not dereference fwnode in struct device
ACPI: property: Read buffer properties as integers
ACPI: property: Add support for parsing buffer property UUID
ACPI: property: Unify integer value reading functions
ACPI: property: Switch node property referencing from ifs to a switch
ACPI: property: Move property ref argument parsing into a new function
ACPI: property: Use acpi_object_type consistently in property ref parsing
ACPI: property: Tie data nodes to acpi handles
ACPI: property: Return type of acpi_add_nondev_subnodes() should be bool
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-next
Short summary of fixes pull:
* gem: Annotate WW context in error paths
* shmem-helper: Add missing vunmap in error paths
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/YvOLPpufsvOJHiNY@linux-uq9g
|
|
Pull ceph updates from Ilya Dryomov:
"We have a good pile of various fixes and cleanups from Xiubo, Jeff,
Luis and others, almost exclusively in the filesystem.
Several patches touch files outside of our normal purview to set the
stage for bringing in Jeff's long awaited ceph+fscrypt series in the
near future. All of them have appropriate acks and sat in linux-next
for a while"
* tag 'ceph-for-5.20-rc1' of https://github.com/ceph/ceph-client: (27 commits)
libceph: clean up ceph_osdc_start_request prototype
libceph: fix ceph_pagelist_reserve() comment typo
ceph: remove useless check for the folio
ceph: don't truncate file in atomic_open
ceph: make f_bsize always equal to f_frsize
ceph: flush the dirty caps immediatelly when quota is approaching
libceph: print fsid and epoch with osd id
libceph: check pointer before assigned to "c->rules[]"
ceph: don't get the inline data for new creating files
ceph: update the auth cap when the async create req is forwarded
ceph: make change_auth_cap_ses a global symbol
ceph: fix incorrect old_size length in ceph_mds_request_args
ceph: switch back to testing for NULL folio->private in ceph_dirty_folio
ceph: call netfs_subreq_terminated with was_async == false
ceph: convert to generic_file_llseek
ceph: fix the incorrect comment for the ceph_mds_caps struct
ceph: don't leak snap_rwsem in handle_cap_grant
ceph: prevent a client from exceeding the MDS maximum xattr size
ceph: choose auth MDS for getxattr with the Xs caps
ceph: add session already open notify support
...
|
|
We should trace the allocated address instead of page struct.
Fixes: 27c874867c4e ("dpaa2-eth: Use a single page per Rx buffer")
Signed-off-by: Chen Lin <chen.lin5@zte.com.cn>
Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Link: https://lore.kernel.org/r/20220811151651.3327-1-chen45464546@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Merge changes adding support for device properties with buffer values
to the ACPI device properties handling code.
* acpi-properties:
ACPI: property: Fix error handling in acpi_init_properties()
ACPI: property: Read buffer properties as integers
ACPI: property: Add support for parsing buffer property UUID
ACPI: property: Unify integer value reading functions
ACPI: property: Switch node property referencing from ifs to a switch
ACPI: property: Move property ref argument parsing into a new function
ACPI: property: Use acpi_object_type consistently in property ref parsing
ACPI: property: Tie data nodes to acpi handles
ACPI: property: Return type of acpi_add_nondev_subnodes() should be bool
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
- changes to input core to properly queue synthetic events (such as
autorepeat) and to release multitouch contacts when an input device
is inhibited or suspended
- reworked quirk handling in i8042 driver that consolidates multiple
DMI tables into one and adds several quirks for TUXEDO line of
laptops
- update to mt6779 keypad to better reflect organization of the
hardware
- changes to mtk-pmic-keys driver preparing it to handle more variants
- facelift of adp5588-keys driver
- improvements to iqs7222 driver
- adjustments to various DT binding documents for input devices
- other assorted driver fixes.
* tag 'input-for-v5.20-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (54 commits)
Input: adc-joystick - fix ordering in adc_joystick_probe()
dt-bindings: input: ariel-pwrbutton: use spi-peripheral-props.yaml
Input: deactivate MT slots when inhibiting or suspending devices
Input: properly queue synthetic events
dt-bindings: input: iqs7222: Use central 'linux,code' definition
Input: i8042 - add dritek quirk for Acer Aspire One AO532
dt-bindings: input: gpio-keys: accept also interrupt-extended
dt-bindings: input: gpio-keys: reference input.yaml and document properties
dt-bindings: input: gpio-keys: enforce node names to match all properties
dt-bindings: input: Convert adc-keys to DT schema
dt-bindings: input: Centralize 'linux,input-type' definition
dt-bindings: input: Use common 'linux,keycodes' definition
dt-bindings: input: Centralize 'linux,code' definition
dt-bindings: input: Increase maximum keycode value to 0x2ff
Input: mt6779-keypad - implement row/column selection
Input: mt6779-keypad - match hardware matrix organization
Input: i8042 - add additional TUXEDO devices to i8042 quirk tables
Input: goodix - switch use of acpi_gpio_get_*_resource() APIs
Input: i8042 - add TUXEDO devices to i8042 quirk tables
Input: i8042 - add debug output for quirks
...
|
|
area_cache_get() is used to distribute cache->area and set cache->id,
and if cache->id is not 0 and cache->area->kref refcount is 0, it will
release the cache->area by nfp_cpp_area_release(). area_cache_get()
set cache->id before cpp->op->area_init() and nfp_cpp_area_acquire().
But if area_init() or nfp_cpp_area_acquire() fails, the cache->id is
is already set but the refcount is not increased as expected. At this
time, calling the nfp_cpp_area_release() will cause use-after-free.
To avoid the use-after-free, set cache->id after area_init() and
nfp_cpp_area_acquire() complete successfully.
Note: This vulnerability is triggerable by providing emulated device
equipped with specified configuration.
BUG: KASAN: use-after-free in nfp6000_area_init (drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c:760)
Write of size 4 at addr ffff888005b7f4a0 by task swapper/0/1
Call Trace:
<TASK>
nfp6000_area_init (drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c:760)
area_cache_get.constprop.8 (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:884)
Allocated by task 1:
nfp_cpp_area_alloc_with_name (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:303)
nfp_cpp_area_cache_add (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:802)
nfp6000_init (drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c:1230)
nfp_cpp_from_operations (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:1215)
nfp_pci_probe (drivers/net/ethernet/netronome/nfp/nfp_main.c:744)
Freed by task 1:
kfree (mm/slub.c:4562)
area_cache_get.constprop.8 (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:873)
nfp_cpp_read (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:924 drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:973)
nfp_cpp_readl (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cpplib.c:48)
Signed-off-by: Jialiang Wang <wangjialiang0806@163.com>
Reviewed-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Acked-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20220810073057.4032-1-wangjialiang0806@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Drop devl_lock() / devl_unlock() from ports creation and removal flows
since the devlink instance lock is now taken by mlxsw_core.
Fixes: 72a4c8c94efa ("mlxsw: convert driver to use unlocked devlink API during init/fini")
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/f4afce5ab0318617f3866b85274be52542d59b32.1660211614.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit d5410ac7b0ba ("net:bonding:support balance-alb interface
with vlan to bridge") introduced a reference count leak by not releasing
the reference acquired by ip_dev_find(). Remedy this by insuring the
reference is released.
Fixes: d5410ac7b0ba ("net:bonding:support balance-alb interface with vlan to bridge")
Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/26758.1660194413@famine
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit b32d45824aa7 ("dm bufio: Add DM_BUFIO_CLIENT_NO_SLEEP flag")
added a "NO_SLEEP" mode, it replaces a mutex with a spinlock, and it
is only usable when the device is in read-only mode (because the write
path may be sleeping while holding the dm_bufio_client lock).
However, there are still two points where the code could sleep even in
read-only mode. One is in __get_unclaimed_buffer -> __make_buffer_clean.
The other is in __try_evict_buffer -> __make_buffer_clean. These functions
will call __make_buffer_clean which sleeps if the buffer is being read.
Fix these cases so that if c->no_sleep is set __make_buffer_clean
will not be called and the buffer will be skipped instead.
Fixes: b32d45824aa7 ("dm bufio: Add DM_BUFIO_CLIENT_NO_SLEEP flag")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
|
There are 2 models for MV32 serials. MV32-W-A is designed
based on Qualcomm SDX62 chip, and MV32-W-B is designed based
on Qualcomm SDX65 chip. So we use 2 different PID to separate it.
Test evidence as below:
T: Bus=03 Lev=01 Prnt=01 Port=02 Cnt=03 Dev#= 3 Spd=480 MxCh= 0
D: Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1
P: Vendor=1e2d ProdID=00f3 Rev=05.04
S: Manufacturer=Cinterion
S: Product=Cinterion PID 0x00F3 USB Mobile Broadband
S: SerialNumber=d7b4be8d
C: #Ifs= 4 Cfg#= 1 Atr=a0 MxPwr=500mA
I: If#=0x0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan
I: If#=0x1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
I: If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
I: If#=0x3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
T: Bus=03 Lev=01 Prnt=01 Port=02 Cnt=03 Dev#= 10 Spd=480 MxCh= 0
D: Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1
P: Vendor=1e2d ProdID=00f4 Rev=05.04
S: Manufacturer=Cinterion
S: Product=Cinterion PID 0x00F4 USB Mobile Broadband
S: SerialNumber=d095087d
C: #Ifs= 4 Cfg#= 1 Atr=a0 MxPwr=500mA
I: If#=0x0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan
I: If#=0x1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
I: If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
I: If#=0x3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
Signed-off-by: Slark Xiao <slark_xiao@163.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Link: https://lore.kernel.org/r/20220810014521.9383-1-slark_xiao@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Initialize err local variable to return -EAGAIN if the asid cannot be
found thus avoiding returning uninitialized value.
Fixes: 8fcd20c30704 ("vdpa/mlx5: Support different address spaces for control and data")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20220811134010.952291-1-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Expose VIRTIO_BLK_F_DISCARD and VIRTIO_BLK_F_WRITE_ZEROES features
to the drivers and handle VIRTIO_BLK_T_DISCARD and
VIRTIO_BLK_T_WRITE_ZEROES requests checking ranges and flags.
The simulator behaves like a ramdisk, so for VIRTIO_BLK_F_DISCARD
does nothing, while for VIRTIO_BLK_T_WRITE_ZEROES sets to 0 the
specified region.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20220811083632.77525-5-sgarzare@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
The simulator behaves like a ramdisk, so we don't have to do
anything when a VIRTIO_BLK_T_FLUSH request is received, but it
could be useful to test driver behavior.
Let's expose the VIRTIO_BLK_F_FLUSH feature to inform the driver
that we support the flush command.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20220811083632.77525-4-sgarzare@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Next patches will add handling of other requests, where will be
useful to reuse vdpasim_blk_check_range().
So let's make it more generic by adding the `max_sectors` parameter,
since different requests allow different numbers of maximum sectors.
Let's also print the messages directly in vdpasim_blk_check_range()
to avoid duplicate prints.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20220811083632.77525-3-sgarzare@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
VIRTIO spec states: "The sector number indicates the offset
(multiplied by 512) where the read or write is to occur. This field is
unused and set to 0 for commands other than read or write."
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20220811083632.77525-2-sgarzare@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Implement suspend operation for vdpa_sim devices, so vhost-vdpa will
offer that backend feature and userspace can effectively suspend the
device.
This is a must before get virtqueue indexes (base) for live migration,
since the device could modify them after userland gets them. There are
individual ways to perform that action for some devices
(VHOST_NET_SET_BACKEND, VHOST_VSOCK_SET_RUNNING, ...) but there was no
way to perform it for any vhost device (and, in particular, vhost-vdpa).
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20220810171512.2343333-5-eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
The ioctl adds support for suspending the device from userspace.
This is a must before getting virtqueue indexes (base) for live migration,
since the device could modify them after userland gets them. There are
individual ways to perform that action for some devices
(VHOST_NET_SET_BACKEND, VHOST_VSOCK_SET_RUNNING, ...) but there was no
way to perform it for any vhost device (and, in particular, vhost-vdpa).
After a successful return of the ioctl call the device must not process
more virtqueue descriptors. The device can answer to read or writes of
config fields as if it were not suspended. In particular, writing to
"queue_enable" with a value of 1 will not make the device start
processing buffers of the virtqueue.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20220810171512.2343333-4-eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Userland knows if it can suspend the device or not by checking this feature
bit.
It's only offered if the vdpa driver backend implements the suspend()
operation callback, and to offer it or userland to ack it if the backend
does not offer that callback is an error.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20220810171512.2343333-3-eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
hctx->user_data is set to vq in virtblk_init_hctx(). However, vq is
freed on suspend and reallocated on resume. So, hctx->user_data is
invalid after resume, and it will cause use-after-free accessing which
will result in the kernel crash something like below:
[ 22.428391] Call Trace:
[ 22.428899] <TASK>
[ 22.429339] virtqueue_add_split+0x3eb/0x620
[ 22.430035] ? __blk_mq_alloc_requests+0x17f/0x2d0
[ 22.430789] ? kvm_clock_get_cycles+0x14/0x30
[ 22.431496] virtqueue_add_sgs+0xad/0xd0
[ 22.432108] virtblk_add_req+0xe8/0x150
[ 22.432692] virtio_queue_rqs+0xeb/0x210
[ 22.433330] blk_mq_flush_plug_list+0x1b8/0x280
[ 22.434059] __blk_flush_plug+0xe1/0x140
[ 22.434853] blk_finish_plug+0x20/0x40
[ 22.435512] read_pages+0x20a/0x2e0
[ 22.436063] ? folio_add_lru+0x62/0xa0
[ 22.436652] page_cache_ra_unbounded+0x112/0x160
[ 22.437365] filemap_get_pages+0xe1/0x5b0
[ 22.437964] ? context_to_sid+0x70/0x100
[ 22.438580] ? sidtab_context_to_sid+0x32/0x400
[ 22.439979] filemap_read+0xcd/0x3d0
[ 22.440917] xfs_file_buffered_read+0x4a/0xc0
[ 22.441984] xfs_file_read_iter+0x65/0xd0
[ 22.442970] __kernel_read+0x160/0x2e0
[ 22.443921] bprm_execve+0x21b/0x640
[ 22.444809] do_execveat_common.isra.0+0x1a8/0x220
[ 22.446008] __x64_sys_execve+0x2d/0x40
[ 22.446920] do_syscall_64+0x37/0x90
[ 22.447773] entry_SYSCALL_64_after_hwframe+0x63/0xcd
This patch fixes this issue by getting vq from vblk, and removes
virtblk_init_hctx().
Fixes: 4e0400525691 ("virtio-blk: support polling I/O")
Cc: "Suwan Kim" <suwan.kim027@gmail.com>
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Message-Id: <20220810160948.959781-1-syoshida@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Virtio vdpa support the new parameter sizes of find_vqs().
Signed-off-by: Bo Liu <liubo03@inspur.com>
Message-Id: <20220810085151.7251-1-liubo03@inspur.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
In function vhost_vdpa_probe(), when code execution fails, we should
call ida_simple_remove() to free ida.
Signed-off-by: Bo Liu <liubo03@inspur.com>
Message-Id: <20220805091254.20026-1-liubo03@inspur.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
This commit fixes spars warnings: cast to restricted __le16
in function vdpa_dev_net_config_fill() and
vdpa_fill_stats_rec()
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Message-Id: <20220722115309.82746-7-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Users may want to query the config space of a vDPA device,
to choose a appropriate one for a certain guest. This means the
users need to read the config space before FEATURES_OK, and
the existence of config space contents does not depend on
FEATURES_OK.
The spec says:
The device MUST allow reading of any device-specific configuration
field before FEATURES_OK is set by the driver. This includes
fields which are conditional on feature bits, as long as those
feature bits are offered by the device.
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <20220722115309.82746-5-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Adapting to current netlink interfaces, this commit allows userspace
to query feature bits and MQ capability of a management device.
Currently both the vDPA device and the management device are the VF itself,
thus this ifcvf should initialize the virtio capabilities in probe() before
setting up the struct vdpa_mgmt_dev.
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <20220722115309.82746-3-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
implementation
Drivers must not access a BAR outside the capability length,
and for a virtio device, ifcvf driver should not report any non-standard
capability contents to the upper layers.
Function ifcvf_get_config_size() is introduced here to return a safe value
of the device config capability size.
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <20220722115309.82746-2-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
We are currently hard coded to always create 128 IO virtqueues, so this
adds a modparam to control it. For large systems where we are ok with
using memory for virtqueues it allows us to add up to 1024. This limit
was just selected becuase that's qemu's limit.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Message-Id: <20220708030525.5065-3-michael.christie@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
|
|
Qemu takes it's num_queues limit then adds the fixed queues (control and
event) to the total it will request from the kernel. So when a user
requests 128 (or qemu does it's num_queues calculation based on vCPUS
and other system limits), we hit errors due to userspace trying to setup
130 queues when vhost-scsi has a hard coded limit of 128.
This has vhost-scsi adjust it's max so we can do a total of 130 virtqueues
(128 IO and 2 fixed). For the case where the user has 128 vCPUs the guest
OS can then nicely map each IO virtqueue to a vCPU and not have the odd case
where 2 vCPUs share a virtqueue.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Message-Id: <20220708030525.5065-2-michael.christie@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
|
|
Partition virtqueues to two different address spaces: one for control
virtqueue which is implemented in software, and one for data virtqueues.
Based-on: <20220526124338.36247-1-eperezma@redhat.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20220714113927.85729-3-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Implement the suspend callback allowing to suspend the virtqueues so
they stop processing descriptors. This is required to allow to query a
consistent state of the virtqueue while live migration is taking place.
Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20220714113927.85729-2-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
This introduces a new ioctl: VDUSE_IOTLB_GET_INFO to
support querying some information of IOVA regions.
Now it can be used to query whether the IOVA region
supports userspace memory registration.
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Message-Id: <20220803045523.23851-6-xieyongji@bytedance.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
|
|
Introduce two ioctls: VDUSE_IOTLB_REG_UMEM and
VDUSE_IOTLB_DEREG_UMEM to support registering
and de-registering userspace memory for IOVA
regions.
Now it only supports registering userspace memory
for bounce buffer region in virtio-vdpa case.
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220803045523.23851-5-xieyongji@bytedance.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Introduce two APIs: vduse_domain_add_user_bounce_pages()
and vduse_domain_remove_user_bounce_pages() to support
adding and removing userspace pages for bounce buffers.
During adding and removing, the DMA data would be copied
from the kernel bounce pages to the userspace bounce pages
and back.
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220803045523.23851-4-xieyongji@bytedance.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
kmap_atomic() is being deprecated in favor of kmap_local_page().
The use of kmap_atomic() in do_bounce() is all thread local therefore
kmap_local_page() is a sufficient replacement.
Convert to kmap_local_page() but, instead of open coding it,
use the helpers memcpy_to_page() and memcpy_from_page().
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Message-Id: <20220803045523.23851-3-xieyongji@bytedance.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Now we use domain->iotlb_lock to protect two different
variables: domain->bounce_maps->bounce_page and
domain->iotlb. But for domain->bounce_maps->bounce_page,
we actually don't need any synchronization between
vduse_domain_get_bounce_page() and vduse_domain_free_bounce_pages()
since vduse_domain_get_bounce_page() will only be called in
page fault handler and vduse_domain_free_bounce_pages() will
be called during file release.
So let's remove the unnecessary spin lock protection in
vduse_domain_get_bounce_page(). Then the usage of
domain->iotlb_lock could be more clear: the lock will be
only used to protect the domain->iotlb.
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220803045523.23851-2-xieyongji@bytedance.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
New VirtIO network feature: VIRTIO_NET_F_NOTF_COAL.
Control a Virtio network device notifications coalescing parameters
using the control virtqueue.
A device that supports this fetature can receive
VIRTIO_NET_CTRL_NOTF_COAL control commands.
- VIRTIO_NET_CTRL_NOTF_COAL_TX_SET:
Ask the network device to change the following parameters:
- tx_usecs: Maximum number of usecs to delay a TX notification.
- tx_max_packets: Maximum number of packets to send before a
TX notification.
- VIRTIO_NET_CTRL_NOTF_COAL_RX_SET:
Ask the network device to change the following parameters:
- rx_usecs: Maximum number of usecs to delay a RX notification.
- rx_max_packets: Maximum number of packets to receive before a
RX notification.
VirtIO spec. patch:
https://lists.oasis-open.org/archives/virtio-comment/202206/msg00100.html
Signed-off-by: Alvaro Karsz <alvaro.karsz@solid-run.com>
Message-Id: <20220718091102.498774-1-alvaro.karsz@solid-run.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Jason Wang <jasowang@redhat.com>
|
|
It's possible that dev_set_name() returns -ENOMEM, catch and handle this.
Signed-off-by: Bo Liu <liubo03@inspur.com>
Message-Id: <20220707031751.4802-1-liubo03@inspur.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
|
|
The assignment to pointer cfg is duplicated, the second assignment
is redundant and can be removed.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Message-Id: <20220704190456.593464-1-colin.i.king@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
|
|
There is a typo(does't) in comments.
It maybe 'doesn't' instead of 'does't'.
Signed-off-by: Zhang Jiaming <jiaming@nfschina.com>
Message-Id: <20220704024104.15535-1-jiaming@nfschina.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
|
|
Using eth_broadcast_addr() to assign broadcast address instead
of memset().
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Xu Qiang <xuqiang36@huawei.com>
Message-Id: <20220704021405.64545-1-xuqiang36@huawei.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
|
|
Commit bda324fd037a ("vdpasim: control virtqueue support") changed
the allocation of iotlbs calling vhost_iotlb_init() for each address
space, instead of vhost_iotlb_alloc().
With this change we forgot to use the limit we had introduced with
the `max_iotlb_entries` module parameter.
Fixes: bda324fd037a ("vdpasim: control virtqueue support")
Cc: gautam.dawar@xilinx.com
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20220621151208.189959-1-sgarzare@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
|
|
Commit bda324fd037a ("vdpasim: control virtqueue support") added two
new fields (nas, ngroups) to vdpasim_dev_attr, but we forgot to
initialize them for vdpa_sim_blk.
When creating a new vdpa_sim_blk device this causes the kernel
to panic in this way:
$ vdpa dev add mgmtdev vdpasim_blk name blk0
BUG: kernel NULL pointer dereference, address: 0000000000000030
...
RIP: 0010:vhost_iotlb_add_range_ctx+0x41/0x220 [vhost_iotlb]
...
Call Trace:
<TASK>
vhost_iotlb_add_range+0x11/0x800 [vhost_iotlb]
vdpasim_map_range+0x91/0xd0 [vdpa_sim]
vdpasim_alloc_coherent+0x56/0x90 [vdpa_sim]
...
This happens because vdpasim->iommu[0] is not initialized when
dev_attr.nas is 0.
Let's fix this issue by initializing both (nas, ngroups) to 1 for
vdpa_sim_blk.
Fixes: bda324fd037a ("vdpasim: control virtqueue support")
Cc: gautam.dawar@xilinx.com
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20220621151323.190431-1-sgarzare@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
|
|
Call vringh_complete_iotlb() even when we encounter a serious error
that prevents us from writing the status in the "in" header
(e.g. the header length is incorrect, etc.).
The guest is misbehaving, so maybe the ring is in a bad state, but
let's avoid making things worse.
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20220630153221.83371-4-sgarzare@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Limit the number of requests (4 per queue as for vdpa_sim_net) handled
in a batch to prevent the worker from using the CPU for too long.
Suggested-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20220630153221.83371-3-sgarzare@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
|