kernel/linux.git/drivers/s390, branch linux-4.20.y

s390/qeth: conclude all event processing before offlining a card

2019-03-13T21:04:15+00:00

[ Upstream commit c0a2e4d10d9366ada133a8ae4ff2f32397f8b15b ] Work for Bridgeport events is currently placed on a driver-wide workqueue. If the card is removed and freed while any such work is still active, this causes a use-after-free. So put the events on a per-card queue, where we can control their lifetime. As we also don't want stale events to last beyond an offline & online cycle, flush this queue when setting the card offline. Fixes: b4d72c08b358 ("qeth: bridgeport support - basic control") Signed-off-by: Julian Wiedmann Signed-off-by: David S. Miller Signed-off-by: Sasha Levin

s390/qeth: cancel close_dev work before removing a card

2019-03-13T21:04:15+00:00

[ Upstream commit c2780c1a3fb724560b1d44f7976e0de17bf153c7 ] A card's close_dev work is scheduled on a driver-wide workqueue. If the card is removed and freed while the work is still active, this causes a use-after-free. So make sure that the work is completed before freeing the card. Fixes: 0f54761d167f ("qeth: Support VEPA mode") Signed-off-by: Julian Wiedmann Signed-off-by: David S. Miller Signed-off-by: Sasha Levin

s390/qeth: fix use-after-free in error path

2019-03-13T21:04:15+00:00

[ Upstream commit afa0c5904ba16d59b0454f7ee4c807dae350f432 ] The error path in qeth_alloc_qdio_buffers() that takes care of cleaning up the Output Queues is buggy. It first frees the queue, but then calls qeth_clear_outq_buffers() with that very queue struct. Make the call to qeth_clear_outq_buffers() part of the free action (in the correct order), and while at it fix the naming of the helper. Fixes: 0da9581ddb0f ("qeth: exploit asynchronous delivery of storage blocks") Signed-off-by: Julian Wiedmann Reviewed-by: Alexandra Winter Signed-off-by: David S. Miller Signed-off-by: Sasha Levin

s390/qeth: release cmd buffer in error paths

2019-03-13T21:04:15+00:00

[ Upstream commit 5065b2dd3e5f9247a6c9d67974bc0472bf561b9d ] Whenever we fail before/while starting an IO, make sure to release the IO buffer. Usually qeth_irq() would do this for us, but if the IO doesn't even start we obviously won't get an interrupt for it either. Signed-off-by: Julian Wiedmann Signed-off-by: David S. Miller Signed-off-by: Sasha Levin

s390/zcrypt: fix specification exception on z196 during ap probe

2019-02-20T09:29:13+00:00

commit 8f9aca0c45322a807a343fc32f95f2500f83b9ae upstream. The older machines don't have the QCI instruction available. With support for up to 256 crypto cards the probing of each card has been extended to check card ids from 0 up to 255. For machines with QCI support there is a filter limiting the range of probed cards. The older machines (z196 and older) don't have this filter and so since support for 256 cards is in the driver all cards are probed. However, these machines also require to have the card id fit into 6 bits. Exceeding this limit results in a specification exception which happens on every kernel startup even when there is no crypto configured and used at all. This fix limits the range of probed crypto cards to 64 if there is no QCI instruction available to obey to the older ap architecture and so fixes the specification exceptions on z196 machines. Cc: stable@vger.kernel.org # v4.17+ Fixes: af4a72276d49 ("s390/zcrypt: Support up to 256 crypto adapters.") Signed-off-by: Harald Freudenberger Signed-off-by: Martin Schwidefsky Signed-off-by: Greg Kroah-Hartman

s390/zcrypt: improve special ap message cmd handling

2019-02-12T19:02:13+00:00

[ Upstream commit be534791011100d204602e2e0496e9e6ce8edf63 ] There exist very few ap messages which need to have the 'special' flag enabled. This flag tells the firmware layer to do some pre- and maybe postprocessing. However, it may happen that this special flag is enabled but the firmware is unable to deal with this kind of message and thus returns with reply code 0x41. For example older firmware may not know the newest messages triggered by the zcrypt device driver and thus react with reject and the named reply code. Unfortunately this reply code is not known to the zcrypt error routines and thus default behavior is to switch the ap queue offline. This patch now makes the ap error routine aware of the reply code and so userspace is informed about the bad processing result but the queue is not switched to offline state any more. Signed-off-by: Harald Freudenberger Signed-off-by: Martin Schwidefsky Signed-off-by: Sasha Levin

s390/qeth: utilize virtual MAC for Layer2 OSD devices

2019-02-12T19:02:06+00:00

[ Upstream commit b144b99fff69a5bc0d34c8e168bedb88c68ca23d ] By default, READ MAC on a Layer2 OSD device returns the adapter's burnt-in MAC address. Given the default scenario of many virtual devices on the same adapter, qeth can't make any use of this address and therefore skips the READ MAC call for this device type. But in some configurations, the READ MAC command for a Layer2 OSD device actually returns a pre-provisioned, virtual MAC address. So enable the READ MAC code to detect this situation, and let the L2 subdriver call READ MAC for OSD devices. This also removes the QETH_LAYER2_MAC_READ flag, which protects L2 devices against calling READ MAC multiple times. Instead protect the whole call to qeth_l2_request_initial_mac(). Signed-off-by: Julian Wiedmann Signed-off-by: David S. Miller Signed-off-by: Sasha Levin

s390/smp: fix CPU hotplug deadlock with CPU rescan

2019-01-31T07:15:38+00:00

commit b7cb707c373094ce4008d4a6ac9b6b366ec52da5 upstream. smp_rescan_cpus() is called without the device_hotplug_lock, which can lead to a dedlock when a new CPU is found and immediately set online by a udev rule. This was observed on an older kernel version, where the cpu_hotplug_begin() loop was still present, and it resulted in hanging chcpu and systemd-udev processes. This specific deadlock will not show on current kernels. However, there may be other possible deadlocks, and since smp_rescan_cpus() can still trigger a CPU hotplug operation, the device_hotplug_lock should be held. For reference, this was the deadlock with the old cpu_hotplug_begin() loop: chcpu (rescan) systemd-udevd echo 1 > /sys/../rescan -> smp_rescan_cpus() -> (*) get_online_cpus() (increases refcount) -> smp_add_present_cpu() (new CPU found) -> register_cpu() -> device_add() -> udev "add" event triggered -----------> udev rule sets CPU online -> echo 1 > /sys/.../online -> lock_device_hotplug_sysfs() (this is missing in rescan path) -> device_online() -> (**) device_lock(new CPU dev) -> cpu_up() -> cpu_hotplug_begin() (loops until refcount == 0) -> deadlock with (*) -> bus_probe_device() -> device_attach() -> device_lock(new CPU dev) -> deadlock with (**) Fix this by taking the device_hotplug_lock in the CPU rescan path. Cc: Signed-off-by: Gerald Schaefer Signed-off-by: Martin Schwidefsky Signed-off-by: Greg Kroah-Hartman

virtio: don't allocate vqs when names[i] = NULL

2019-01-22T20:09:52+00:00

commit a229989d975eb926076307c1f2f5e4c6111768e7 upstream. Some vqs may not need to be allocated when their related feature bits are disabled. So callers may pass in such vqs with "names = NULL". Then we skip such vq allocations. Signed-off-by: Wei Wang Signed-off-by: Michael S. Tsirkin Signed-off-by: Wei Wang Signed-off-by: Wei Wang Reviewed-by: Cornelia Huck Cc: stable@vger.kernel.org Fixes: 86a559787e6f ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT") Signed-off-by: Greg Kroah-Hartman

scsi: zfcp: fix posting too many status read buffers leading to adapter shutdown

2019-01-13T08:24:01+00:00

commit 60a161b7e5b2a252ff0d4c622266a7d8da1120ce upstream. Suppose adapter (open) recovery is between opened QDIO queues and before (the end of) initial posting of status read buffers (SRBs). This time window can be seconds long due to FSF_PROT_HOST_CONNECTION_INITIALIZING causing by design looping with exponential increase sleeps in the function performing exchange config data during recovery [zfcp_erp_adapter_strat_fsf_xconf()]. Recovery triggered by local link up. Suppose an event occurs for which the FCP channel would send an unsolicited notification to zfcp by means of a previously posted SRB. We saw it with local cable pull (link down) in multi-initiator zoning with multiple NPIV-enabled subchannels of the same shared FCP channel. As soon as zfcp_erp_adapter_strategy_open_fsf() starts posting the initial status read buffers from within the adapter's ERP thread, the channel does send an unsolicited notification. Since v2.6.27 commit d26ab06ede83 ("[SCSI] zfcp: receiving an unsolicted status can lead to I/O stall"), zfcp_fsf_status_read_handler() schedules adapter->stat_work to re-fill the just consumed SRB from a work item. Now the ERP thread and the work item post SRBs in parallel. Both contexts call the helper function zfcp_status_read_refill(). The tracking of missing (to be posted / re-filled) SRBs is not thread-safe due to separate atomic_read() and atomic_dec(), in order to depend on posting success. Hence, both contexts can see atomic_read(&adapter->stat_miss) == 1. One of the two contexts posts one too many SRB. Zfcp gets QDIO_ERROR_SLSB_STATE on the output queue (trace tag "qdireq1") leading to zfcp_erp_adapter_shutdown() in zfcp_qdio_handler_error(). An obvious and seemingly clean fix would be to schedule stat_work from the ERP thread and wait for it to finish. This would serialize all SRB re-fills. However, we already have another work item wait on the ERP thread: adapter->scan_work runs zfcp_fc_scan_ports() which calls zfcp_fc_eval_gpn_ft(). The latter calls zfcp_erp_wait() to wait for all the open port recoveries during zfcp auto port scan, but in fact it waits for any pending recovery including an adapter recovery. This approach leads to a deadlock. [see also v3.19 commit 18f87a67e6d6 ("zfcp: auto port scan resiliency"); v2.6.37 commit d3e1088d6873 ("[SCSI] zfcp: No ERP escalation on gpn_ft eval"); v2.6.28 commit fca55b6fb587 ("[SCSI] zfcp: fix deadlock between wq triggered port scan and ERP") fixing v2.6.27 commit c57a39a45a76 ("[SCSI] zfcp: wait until adapter is finished with ERP during auto-port"); v2.6.27 commit cc8c282963bd ("[SCSI] zfcp: Automatically attach remote ports")] Instead make the accounting of missing SRBs atomic for parallel execution in both the ERP thread and adapter->stat_work. Signed-off-by: Steffen Maier Fixes: d26ab06ede83 ("[SCSI] zfcp: receiving an unsolicted status can lead to I/O stall") Cc: #2.6.27+ Reviewed-by: Jens Remus Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman