summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/ibm/ibmvnic.c
AgeCommit message (Collapse)AuthorFilesLines
2019-11-20net: ibm: fix return type of ndo_start_xmit functionYueHaibing1-2/+2
[ Upstream commit 94b2bb28dbb43fcb943d5275ab19fd5a4972bedb ] The method ndo_start_xmit() is defined as returning an 'netdev_tx_t', which is a typedef for an enum type, so make sure the implementation in this driver has returns 'netdev_tx_t' value, and change the function return type to netdev_tx_t. Found by coccinelle. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-29net/ibmvnic: Fix EOI when running in XIVE mode.Cédric Le Goater1-5/+3
[ Upstream commit 11d49ce9f7946dfed4dcf5dbde865c78058b50ab ] pSeries machines on POWER9 processors can run with the XICS (legacy) interrupt mode or with the XIVE exploitation interrupt mode. These interrupt contollers have different interfaces for interrupt management : XICS uses hcalls and XIVE loads and stores on a page. H_EOI being a XICS interface the enable_scrq_irq() routine can fail when the machine runs in XIVE mode. Fix that by calling the EOI handler of the interrupt chip. Fixes: f23e0643cd0b ("ibmvnic: Clear pending interrupt after device reset") Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-01net/ibmvnic: Fix missing { in __ibmvnic_resetMichal Suchanek1-1/+1
[ Upstream commit c8dc55956b09b53ccffceb6e3146981210e27821 ] Commit 1c2977c09499 ("net/ibmvnic: free reset work of removed device from queue") adds a } without corresponding { causing build break. Fixes: 1c2977c09499 ("net/ibmvnic: free reset work of removed device from queue") Signed-off-by: Michal Suchanek <msuchanek@suse.de> Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com> Reviewed-by: Juliet Kim <julietk@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-01net/ibmvnic: free reset work of removed device from queueJuliet Kim1-2/+5
[ Upstream commit 1c2977c094998de032fee6e898c88b4a05483d08 ] Commit 36f1031c51a2 ("ibmvnic: Do not process reset during or after device removal") made the change to exit reset if the driver has been removed, but does not free reset work items of the adapter from queue. Ensure all reset work items are freed when breaking out of the loop early. Fixes: 36f1031c51a2 ("ibmnvic: Do not process reset during or after device removal”) Signed-off-by: Juliet Kim <julietk@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-21ibmvnic: Do not process reset during or after device removalThomas Falcon1-1/+5
[ Upstream commit 36f1031c51a2538e5558fb44c6d6b88f98d3c0f2 ] Currently, the ibmvnic driver will not schedule device resets if the device is being removed, but does not check the device state before the reset is actually processed. This leads to a race where a reset is scheduled with a valid device state but is processed after the driver has been removed, resulting in an oops. Fix this by checking the device state before processing a queued reset event. Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Tested-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-10ibmvnic: Unmap DMA address of TX descriptor buffers after useThomas Falcon1-9/+2
[ Upstream commit 80f0fe0934cd3daa13a5e4d48a103f469115b160 ] There's no need to wait until a completion is received to unmap TX descriptor buffers that have been passed to the hypervisor. Instead unmap it when the hypervisor call has completed. This patch avoids the possibility that a buffer will not be unmapped because a TX completion is lost or mishandled. Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Tested-by: Devesh K. Singh <devesh_singh@in.ibm.com> Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-07-14ibmvnic: Fix unchecked return codes of memory allocationsThomas Falcon1-6/+7
[ Upstream commit 7c940b1a5291e5069d561f5b8f0e51db6b7a259a ] The return values for these memory allocations are unchecked, which may cause an oops if the driver does not handle them after a failure. Fix by checking the function's return code. Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-07-14ibmvnic: Refresh device multicast list after resetThomas Falcon1-0/+3
[ Upstream commit be32a24372cf162e825332da1a7ccef058d4f20b ] It was observed that multicast packets were no longer received after a device reset. The fix is to resend the current multicast list to the backing device after recovery. Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-07-14ibmvnic: Do not close unopened driver during resetThomas Falcon1-1/+2
[ Upstream commit 1f94608b0ce141be5286dde31270590bdf35b86a ] Check driver state before halting it during a reset. If the driver is not running, do nothing. Otherwise, a request to deactivate a down link can cause an error and the reset will fail. Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-02net/ibmvnic: Fix RTNL deadlock during device resetThomas Falcon1-1/+1
[ Upstream commit 986103e7920cabc0b910749e77ae5589d3934d52 ] Commit a5681e20b541 ("net/ibmnvic: Fix deadlock problem in reset") made the change to hold the RTNL lock during driver reset but still calls netdev_notify_peers, which results in a deadlock. Instead, use call_netdevice_notifiers, which is functionally the same except that it does not take the RTNL lock again. Fixes: a5681e20b541 ("net/ibmnvic: Fix deadlock problem in reset") Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-17ibmvnic: Fix completion structure initializationThomas Falcon1-2/+3
[ Upstream commit bbd669a868bba591ffd38b7bc75a7b361bb54b04 ] Fix device initialization completion handling for vNIC adapters. Initialize the completion structure on probe and reinitialize when needed. This also fixes a race condition during kdump where the driver can attempt to access the completion struct before it is initialized: Unable to handle kernel paging request for data at address 0x00000000 Faulting instruction address: 0xc0000000081acbe0 Oops: Kernel access of bad area, sig: 11 [#1] LE SMP NR_CPUS=2048 NUMA pSeries Modules linked in: ibmvnic(+) ibmveth sunrpc overlay squashfs loop CPU: 19 PID: 301 Comm: systemd-udevd Not tainted 4.18.0-64.el8.ppc64le #1 NIP: c0000000081acbe0 LR: c0000000081ad964 CTR: c0000000081ad900 REGS: c000000027f3f990 TRAP: 0300 Not tainted (4.18.0-64.el8.ppc64le) MSR: 800000010280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> CR: 28228288 XER: 00000006 CFAR: c000000008008934 DAR: 0000000000000000 DSISR: 40000000 IRQMASK: 1 GPR00: c0000000081ad964 c000000027f3fc10 c0000000095b5800 c0000000221b4e58 GPR04: 0000000000000003 0000000000000001 000049a086918581 00000000000000d4 GPR08: 0000000000000007 0000000000000000 ffffffffffffffe8 d0000000014dde28 GPR12: c0000000081ad900 c000000009a00c00 0000000000000001 0000000000000100 GPR16: 0000000000000038 0000000000000007 c0000000095e2230 0000000000000006 GPR20: 0000000000400140 0000000000000001 c00000000910c880 0000000000000000 GPR24: 0000000000000000 0000000000000006 0000000000000000 0000000000000003 GPR28: 0000000000000001 0000000000000001 c0000000221b4e60 c0000000221b4e58 NIP [c0000000081acbe0] __wake_up_locked+0x50/0x100 LR [c0000000081ad964] complete+0x64/0xa0 Call Trace: [c000000027f3fc10] [c000000027f3fc60] 0xc000000027f3fc60 (unreliable) [c000000027f3fc60] [c0000000081ad964] complete+0x64/0xa0 [c000000027f3fca0] [d0000000014dad58] ibmvnic_handle_crq+0xce0/0x1160 [ibmvnic] [c000000027f3fd50] [d0000000014db270] ibmvnic_tasklet+0x98/0x130 [ibmvnic] [c000000027f3fda0] [c00000000813f334] tasklet_action_common.isra.3+0xc4/0x1a0 [c000000027f3fe00] [c000000008cd13f4] __do_softirq+0x164/0x400 [c000000027f3fef0] [c00000000813ed64] irq_exit+0x184/0x1c0 [c000000027f3ff20] [c0000000080188e8] __do_irq+0xb8/0x210 [c000000027f3ff90] [c00000000802d0a4] call_do_irq+0x14/0x24 [c000000026a5b010] [c000000008018adc] do_IRQ+0x9c/0x130 [c000000026a5b060] [c000000008008ce4] hardware_interrupt_common+0x114/0x120 Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-13ibmvnic: Fix non-atomic memory allocation in IRQ contextThomas Falcon1-1/+1
[ Upstream commit 1d1bbc37f89b0559c9e913682f2489d89cfde6b8 ] ibmvnic_reset allocated new reset work item objects in a non-atomic context. This can be called from a tasklet, generating the output below. Allocate work items with the GFP_ATOMIC flag instead. BUG: sleeping function called from invalid context at mm/slab.h:421 in_atomic(): 1, irqs_disabled(): 1, pid: 93, name: kworker/0:2 INFO: lockdep is turned off. irq event stamp: 66049 hardirqs last enabled at (66048): [<c000000000122468>] tasklet_action_common.isra.12+0x78/0x1c0 hardirqs last disabled at (66049): [<c000000000befce8>] _raw_spin_lock_irqsave+0x48/0xf0 softirqs last enabled at (66044): [<c000000000a8ac78>] dev_deactivate_queue.constprop.28+0xc8/0x160 softirqs last disabled at (66045): [<c0000000000306e0>] call_do_softirq+0x14/0x24 CPU: 0 PID: 93 Comm: kworker/0:2 Kdump: loaded Not tainted 4.20.0-rc6-00001-g1b50a8f03706 #7 Workqueue: events linkwatch_event Call Trace: [c0000003fffe7ae0] [c000000000bc83e4] dump_stack+0xe8/0x164 (unreliable) [c0000003fffe7b30] [c00000000015ba0c] ___might_sleep+0x2dc/0x320 [c0000003fffe7bb0] [c000000000391514] kmem_cache_alloc_trace+0x3e4/0x440 [c0000003fffe7c30] [d000000005b2309c] ibmvnic_reset+0x16c/0x360 [ibmvnic] [c0000003fffe7cc0] [d000000005b29834] ibmvnic_tasklet+0x1054/0x2010 [ibmvnic] [c0000003fffe7e00] [c0000000001224c8] tasklet_action_common.isra.12+0xd8/0x1c0 [c0000003fffe7e60] [c000000000bf1238] __do_softirq+0x1a8/0x64c [c0000003fffe7f90] [c0000000000306e0] call_do_softirq+0x14/0x24 [c0000003f3967980] [c00000000001ba50] do_softirq_own_stack+0x60/0xb0 [c0000003f39679c0] [c0000000001218a8] do_softirq+0xa8/0x100 [c0000003f39679f0] [c000000000121a74] __local_bh_enable_ip+0x174/0x180 [c0000003f3967a60] [c000000000bf003c] _raw_spin_unlock_bh+0x5c/0x80 [c0000003f3967a90] [c000000000a8ac78] dev_deactivate_queue.constprop.28+0xc8/0x160 [c0000003f3967ad0] [c000000000a8c8b0] dev_deactivate_many+0xd0/0x520 [c0000003f3967b70] [c000000000a8cd40] dev_deactivate+0x40/0x60 [c0000003f3967ba0] [c000000000a5e0c4] linkwatch_do_dev+0x74/0xd0 [c0000003f3967bd0] [c000000000a5e694] __linkwatch_run_queue+0x1a4/0x1f0 [c0000003f3967c30] [c000000000a5e728] linkwatch_event+0x48/0x60 [c0000003f3967c50] [c0000000001444e8] process_one_work+0x238/0x710 [c0000003f3967d20] [c000000000144a48] worker_thread+0x88/0x4e0 [c0000003f3967db0] [c00000000014e3a8] kthread+0x178/0x1c0 [c0000003f3967e20] [c00000000000bfd0] ret_from_kernel_thread+0x5c/0x6c Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-13ibmvnic: Convert reset work item mutex to spin lockThomas Falcon1-7/+9
[ Upstream commit 6c5c7489089608d89b7ce310bca44812e2b0a4a5 ] ibmvnic_reset can create and schedule a reset work item from an IRQ context, so do not use a mutex, which can sleep. Convert the reset work item mutex to a spin lock. Locking debugger generated the trace output below. BUG: sleeping function called from invalid context at kernel/locking/mutex.c:908 in_atomic(): 1, irqs_disabled(): 1, pid: 120, name: kworker/8:1 4 locks held by kworker/8:1/120: #0: 0000000017c05720 ((wq_completion)"events"){+.+.}, at: process_one_work+0x188/0x710 #1: 00000000ace90706 ((linkwatch_work).work){+.+.}, at: process_one_work+0x188/0x710 #2: 000000007632871f (rtnl_mutex){+.+.}, at: rtnl_lock+0x30/0x50 #3: 00000000fc36813a (&(&crq->lock)->rlock){..-.}, at: ibmvnic_tasklet+0x88/0x2010 [ibmvnic] irq event stamp: 26293 hardirqs last enabled at (26292): [<c000000000122468>] tasklet_action_common.isra.12+0x78/0x1c0 hardirqs last disabled at (26293): [<c000000000befce8>] _raw_spin_lock_irqsave+0x48/0xf0 softirqs last enabled at (26288): [<c000000000a8ac78>] dev_deactivate_queue.constprop.28+0xc8/0x160 softirqs last disabled at (26289): [<c0000000000306e0>] call_do_softirq+0x14/0x24 CPU: 8 PID: 120 Comm: kworker/8:1 Kdump: loaded Not tainted 4.20.0-rc6 #6 Workqueue: events linkwatch_event Call Trace: [c0000003fffa7a50] [c000000000bc83e4] dump_stack+0xe8/0x164 (unreliable) [c0000003fffa7aa0] [c00000000015ba0c] ___might_sleep+0x2dc/0x320 [c0000003fffa7b20] [c000000000be960c] __mutex_lock+0x8c/0xb40 [c0000003fffa7c30] [d000000006202ac8] ibmvnic_reset+0x78/0x330 [ibmvnic] [c0000003fffa7cc0] [d0000000062097f4] ibmvnic_tasklet+0x1054/0x2010 [ibmvnic] [c0000003fffa7e00] [c0000000001224c8] tasklet_action_common.isra.12+0xd8/0x1c0 [c0000003fffa7e60] [c000000000bf1238] __do_softirq+0x1a8/0x64c [c0000003fffa7f90] [c0000000000306e0] call_do_softirq+0x14/0x24 [c0000003f3f87980] [c00000000001ba50] do_softirq_own_stack+0x60/0xb0 [c0000003f3f879c0] [c0000000001218a8] do_softirq+0xa8/0x100 [c0000003f3f879f0] [c000000000121a74] __local_bh_enable_ip+0x174/0x180 [c0000003f3f87a60] [c000000000bf003c] _raw_spin_unlock_bh+0x5c/0x80 [c0000003f3f87a90] [c000000000a8ac78] dev_deactivate_queue.constprop.28+0xc8/0x160 [c0000003f3f87ad0] [c000000000a8c8b0] dev_deactivate_many+0xd0/0x520 [c0000003f3f87b70] [c000000000a8cd40] dev_deactivate+0x40/0x60 [c0000003f3f87ba0] [c000000000a5e0c4] linkwatch_do_dev+0x74/0xd0 [c0000003f3f87bd0] [c000000000a5e694] __linkwatch_run_queue+0x1a4/0x1f0 [c0000003f3f87c30] [c000000000a5e728] linkwatch_event+0x48/0x60 [c0000003f3f87c50] [c0000000001444e8] process_one_work+0x238/0x710 [c0000003f3f87d20] [c000000000144a48] worker_thread+0x88/0x4e0 [c0000003f3f87db0] [c00000000014e3a8] kthread+0x178/0x1c0 [c0000003f3f87e20] [c00000000000bfd0] ret_from_kernel_thread+0x5c/0x6c Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-12-13ibmvnic: Update driver queues after change in ring size supportThomas Falcon1-1/+8
[ Upstream commit 5bf032ef08e6a110edc1e3bfb3c66a208fb55125 ] During device reset, queue memory is not being updated to accommodate changes in ring buffer sizes supported by backing hardware. Track any differences in ring buffer sizes following the reset and update queue memory when possible. Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-12-13ibmvnic: Fix RX queue buffer cleanupThomas Falcon1-2/+2
[ Upstream commit b7cdec3d699db2e5985ad39de0f25d3b6111928e ] The wrong index is used when cleaning up RX buffer objects during release of RX queues. Update to use the correct index counter. Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-12-13net/ibmnvic: Fix deadlock problem in resetJuliet Kim1-38/+21
[ Upstream commit a5681e20b541a507c7d4fd48ae4a4040d32ee1ef ] This patch changes to use rtnl_lock only during a reset to avoid deadlock that could occur when a thread operating close is holding rtnl_lock and waiting for reset_lock acquired by another thread, which is waiting for rtnl_lock in order to set the number of tx/rx queues during a reset. Also, we now setting the number of tx/rx queues during a soft reset for failover or LPM events. Signed-off-by: Juliet Kim <julietk@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-11-23ibmvnic: fix accelerated VLAN handlingMichał Mirosław1-1/+1
[ Upstream commit e84b47941e15e6666afb8ee8b21d1c3fc1a013af ] Don't request tag insertion when it isn't present in outgoing skb. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-28ibmvnic: remove ndo_poll_controllerEric Dumazet1-16/+0
As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. ibmvnic uses NAPI for TX completions, so we better let core networking stack call the napi->poll() to avoid the capture. ibmvnic_netpoll_controller() was completely wrong anyway, as it was scheduling NAPI to service RX queues (instead of TX), so I doubt netpoll ever worked on this driver. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Cc: John Allen <jallen@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-01ibmvnic: Include missing return code checks in reset functionThomas Falcon1-3/+9
Check the return codes of these functions and halt reset in case of failure. The driver will remain in a dormant state until the next reset event, when device initialization will be re-attempted. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07ibmvnic: Update firmware error reporting with cause stringThomas Falcon1-4/+30
Print a string instead of the error code. Since there is a possibility that the driver can recover, classify it as a warning instead of an error. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07ibmvnic: Remove code to request error informationThomas Falcon1-143/+1
When backing device firmware reports an error, it provides an error ID, which is meant to be queried for more detailed error information. Currently, however, an error ID is not provided by the Virtual I/O server and there are not any plans to do so. For now, it is always unfilled or zero, so request_error_information will never be called. Remove it. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-17ibmvnic: Fix error recovery on login failureJohn Allen1-2/+2
Testing has uncovered a failure case that is not handled properly. In the event that a login fails and we are not able to recover on the spot, we return 0 from do_reset, preventing any error recovery code from being triggered. Additionally, the state is set to "probed" meaning that when we are able to trigger the error recovery, the driver always comes up in the probed state. To handle the case properly, we need to return a failure code here and set the adapter state to the state that we entered the reset in indicating the state that we would like to come out of the recovery reset in. Signed-off-by: John Allen <jallen@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-17ibmvnic: Revise RX/TX queue error messagesThomas Falcon1-12/+27
During a device failover, there may be latency between the loss of the current backing device and a notification from firmware that a failover has occurred. This latency can result in a large amount of error printouts as firmware returns outgoing traffic with a generic error code. These are not necessarily errors in this case as the firmware is busy swapping in a new backing adapter and is not ready to send packets yet. This patch reclassifies those error codes as warnings with an explanation that a failover may be pending. All other return codes will be considered errors. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-7/+15
Lots of easy overlapping changes in the confict resolutions here. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25ibmvnic: Fix partial success login retriesThomas Falcon1-1/+6
In its current state, the driver will handle backing device login in a loop for a certain number of retries while the device returns a partial success, indicating that the driver may need to try again using a smaller number of resources. The variable it checks to continue retrying may change over the course of operations, resulting in reallocation of resources but exits without sending the login attempt. Guard against this by introducing a boolean variable that will retain the state indicating that the driver needs to reattempt login with backing device firmware. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25ibmvnic: Introduce hard reset recoveryThomas Falcon1-4/+97
Introduce a recovery hard reset to handle reset failure as a result of change of device context following a transport event, such as a backing device failover or partition migration. These operations reset the device context to its initial state. If this occurs during a reset, any initialization commands are likely to fail with an invalid state error as backing device firmware requests reinitialization. When this happens, make one more attempt by performing a hard reset, which frees any resources currently allocated and performs device initialization. If a transport event occurs during a device reset, a flag is set which will trigger a new hard reset following the completionof the current reset event. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25ibmvnic: Set resetting state at earliest possible pointThomas Falcon1-2/+1
Set device resetting state at the earliest possible point: as soon as a reset is successfully scheduled. The reset state is toggled off when all resets have been processed to completion. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25ibmvnic: Create separate initialization routine for resetsThomas Falcon1-2/+46
Instead of having one initialization routine for all cases, create a separate, simpler function for standard initialization, such as during device probe. Use the original initialization function to handle device reset scenarios. The goal of this patch is to avoid having a single, cluttered init function to handle all possible scenarios. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25ibmvnic: Handle error case when setting link stateThomas Falcon1-0/+4
If setting the link state is not successful, print a warning with the resulting return code and return it to be handled by the caller. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25ibmvnic: Return error code if init interrupted by transport eventThomas Falcon1-1/+4
If device init is interrupted by a failover, set the init return code so that it can be checked and handled appropriately by the init routine. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25ibmvnic: Check CRQ command return codesThomas Falcon1-14/+37
Check whether CRQ command is successful before awaiting a response from the management partition. If the command was not successful, the driver may hang waiting for a response that will never come. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25ibmvnic: Introduce active CRQ stateThomas Falcon1-0/+10
Introduce an "active" state for a IBM vNIC Command-Response Queue. A CRQ is considered active once it has initialized or linked with its partner by sending an initialization request and getting a successful response back from the management partition. Until this has happened, do not allow CRQ commands to be sent other than the initialization request. This change will avoid a protocol error in case of a device transport event occurring during a initialization. When the driver receives a transport event notification indicating that the backing hardware has changed and needs reinitialization, any further commands other than the initialization handshake with the VIOS management partition will result in an invalid state error. Instead of sending a command that will be returned with an error, print a warning and return an error that will be handled by the caller. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25ibmvnic: Mark NAPI flag as disabled when releasedThomas Falcon1-0/+1
Set adapter NAPI state as disabled if they are removed. This will allow them to be enabled again if reallocated in case of a hard reset. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23ibmvnic: Only do H_EOI for mobility eventsNathan Fontenot1-6/+9
When enabling the sub-CRQ IRQ a previous update sent a H_EOI prior to the enablement to clear any pending interrupts that may be present across a partition migration. This fixed a firmware bug where a migration could erroneously indicate that a H_EOI was pending. The H_EOI should only be sent when enabling during a mobility event though. Doing so at other time could wrong and can produce extra driver output when IRQs are enabled when doing TX completion. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17ibmvnic: Fix statistics buffers memory leakThomas Falcon1-9/+15
Move initialization of statistics buffers from ibmvnic_init function into ibmvnic_probe. In the current state, ibmvnic_init will be called again during a device reset, resulting in the allocation of new buffers without freeing the old ones. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17ibmvnic: Fix non-fatal firmware error resetThomas Falcon1-2/+1
It is not necessary to disable interrupt lines here during a reset to handle a non-fatal firmware error. Move that call within the code block that handles the other cases that do require interrupts to be disabled and re-enabled. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17ibmvnic: Free coherent DMA memory if FW map failedThomas Falcon1-0/+1
If the firmware map fails for whatever reason, remember to free up the memory after. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-23ibmvnic: Clean actual number of RX or TX poolsThomas Falcon1-2/+2
Avoid using value stored in the login response buffer when cleaning TX and RX buffer pools since these could be inconsistent depending on the device state. Instead use the field in the driver's private data that tracks the number of active pools. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-16ibmvnic: Clear pending interrupt after device resetThomas Falcon1-4/+11
Due to a firmware bug, the hypervisor can send an interrupt to a transmit or receive queue just prior to a partition migration, not allowing the device enough time to handle it and send an EOI. When the partition migrates, the interrupt is lost but an "EOI-pending" flag for the interrupt line is still set in firmware. No further interrupts will be sent until that flag is cleared, effectively freezing that queue. To workaround this, the driver will disable the hardware interrupt and send an H_EOI signal prior to re-enabling it. This will flush the pending EOI and allow the driver to continue operation. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-13ibmvnic: Do not notify peers on parameter change resetsNathan Fontenot1-1/+2
When attempting to change the driver parameters, such as the MTU value or number of queues, do not call netdev_notify_peers(). Doing so will deadlock on the rtnl_lock. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-13ibmvnic: Handle all login error conditionsNathan Fontenot1-20/+35
There is a bug in handling the possible return codes from sending the login CRQ. The current code treats any non-success return value, minus failure to send the crq and a timeout waiting for a login response, as a need to re-send the login CRQ. This can put the drive in an infinite loop of trying to login when getting return values other that a partial success such as a return code of aborted. For these scenarios the login will not ever succeed at this point and the driver would need to be reset again. To resolve this loop trying to login is updated to only retry the login if the driver gets a return code of a partial success. Other return codes are treated as an error and the driver returns an error from ibmvnic_login(). To avoid infinite looping in the partial success return cases, the number of retries is capped at the maximum number of supported queues. This value was chosen because the driver does a renegotiation of capabilities which sets the number of queues possible and allows the driver to attempt a login for possible value for the number of queues supported. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-13ibmvnic: Define vnic_login_client_data name field as unsized arrayKees Cook1-6/+6
The "name" field of struct vnic_login_client_data is a char array of undefined length. This should be written as "char name[]" so the compiler can make better decisions about the field (for example, not assuming it's a single character). This was noticed while trying to tighten the CONFIG_FORTIFY_SOURCE checking. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-08ibmvnic: Do not reset CRQ for Mobility driver resetsNathan Fontenot1-23/+32
When resetting the ibmvnic driver after a partition migration occurs there is no requirement to do a reset of the main CRQ. The current driver code does the required re-enable of the main CRQ, then does a reset of the main CRQ later. What we should be doing for a driver reset after a migration is to re-enable the main CRQ, release all the sub-CRQs, and then allocate new sub-CRQs after capability negotiation. This patch updates the handling of mobility resets to do the proper work and not reset the main CRQ. To do this the initialization/reset of the main CRQ had to be moved out of the ibmvnic_init routine and in to the ibmvnic_probe and do_reset routines. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-08ibmvnic: Fix failover case for non-redundant configurationThomas Falcon1-8/+29
There is a failover case for a non-redundant pseries VNIC configuration that was not being handled properly. The current implementation assumes that the driver will always have a redandant device to communicate with following a failover notification. There are cases, however, when a non-redundant configuration can receive a failover request. If that happens, the driver should wait until it receives a signal that the device is ready for operation. The driver is agnostic of its backing hardware configuration, so this fix necessarily affects all device failover management. The driver needs to wait until it receives a signal that the device is ready for resetting. A flag is introduced to track this intermediary state where the driver is waiting for an active device. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-08ibmvnic: Fix reset scheduler error handlingThomas Falcon1-10/+29
In some cases, if the driver is waiting for a reset following a device parameter change, failure to schedule a reset can result in a hang since a completion signal is never sent. If the device configuration is being altered by a tool such as ethtool or ifconfig, it could cause the console to hang if the reset request does not get scheduled. Add some additional error handling code to exit the wait_for_completion if there is one in progress. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-08ibmvnic: Zero used TX descriptor counter on resetThomas Falcon1-0/+1
The counter that tracks used TX descriptors pending completion needs to be zeroed as part of a device reset. This change fixes a bug causing transmit queues to be stopped unnecessarily and in some cases a transmit queue stall and timeout reset. If the counter is not reset, the remaining descriptors will not be "removed", effectively reducing queue capacity. If the queue is over half full, it will cause the queue to stall if stopped. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-08ibmvnic: Fix DMA mapping mistakesThomas Falcon1-8/+6
Fix some mistakes caught by the DMA debugger. The first change fixes a unnecessary unmap that should have been removed in an earlier update. The next hunk fixes another bad unmap by zeroing the bit checked to determine that an unmap is needed. The final change fixes some buffers that are unmapped with the wrong direction specified. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-02ibmvnic: Disable irqs before exiting reset from closed stateJohn Allen1-10/+18
When the driver is closed, all the associated irqs are disabled. In the event that the driver exits a reset in the closed state, we should be consistent with the state we are in directly after a close. So before we exit the reset routine, all irqs should be disabled as well. This will prevent the irqs from being enabled twice in this case and reporting a number of noisy warning traces. Signed-off-by: John Allen <jallen@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-26ibmvnic: Potential NULL dereference in clean_one_tx_pool()Dan Carpenter1-1/+1
There is an && vs || typo here, which potentially leads to a NULL dereference. Fixes: e9e1e97884b7 ("ibmvnic: Update TX pool cleaning routine") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-18ibmvnic: Update TX pool cleaning routineThomas Falcon1-16/+24
Update routine that cleans up any outstanding transmits that have not received completions when the device needs to close. Introduces a helper function that cleans one TX pool to make code more readable. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>