BMC/Intel-BMC/linux.git - Intel OpenBMC Linux kernel source tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2022-04-08	ibmvnic: fix race between xmit and reset	Sukadev Bhattiprolu	2	-15/+55
	[ Upstream commit 4219196d1f662cb10a462eb9e076633a3fc31a15 ] There is a race between reset and the transmit paths that can lead to ibmvnic_xmit() accessing an scrq after it has been freed in the reset path. It can result in a crash like: Kernel attempted to read user page (0) - exploit attempt? (uid: 0) BUG: Kernel NULL pointer dereference on read at 0x00000000 Faulting instruction address: 0xc0080000016189f8 Oops: Kernel access of bad area, sig: 11 [#1] ... NIP [c0080000016189f8] ibmvnic_xmit+0x60/0xb60 [ibmvnic] LR [c000000000c0046c] dev_hard_start_xmit+0x11c/0x280 Call Trace: [c008000001618f08] ibmvnic_xmit+0x570/0xb60 [ibmvnic] (unreliable) [c000000000c0046c] dev_hard_start_xmit+0x11c/0x280 [c000000000c9cfcc] sch_direct_xmit+0xec/0x330 [c000000000bfe640] __dev_xmit_skb+0x3a0/0x9d0 [c000000000c00ad4] __dev_queue_xmit+0x394/0x730 [c008000002db813c] __bond_start_xmit+0x254/0x450 [bonding] [c008000002db8378] bond_start_xmit+0x40/0xc0 [bonding] [c000000000c0046c] dev_hard_start_xmit+0x11c/0x280 [c000000000c00ca4] __dev_queue_xmit+0x564/0x730 [c000000000cf97e0] neigh_hh_output+0xd0/0x180 [c000000000cfa69c] ip_finish_output2+0x31c/0x5c0 [c000000000cfd244] __ip_queue_xmit+0x194/0x4f0 [c000000000d2a3c4] __tcp_transmit_skb+0x434/0x9b0 [c000000000d2d1e0] __tcp_retransmit_skb+0x1d0/0x6a0 [c000000000d2d984] tcp_retransmit_skb+0x34/0x130 [c000000000d310e8] tcp_retransmit_timer+0x388/0x6d0 [c000000000d315ec] tcp_write_timer_handler+0x1bc/0x330 [c000000000d317bc] tcp_write_timer+0x5c/0x200 [c000000000243270] call_timer_fn+0x50/0x1c0 [c000000000243704] __run_timers.part.0+0x324/0x460 [c000000000243894] run_timer_softirq+0x54/0xa0 [c000000000ea713c] __do_softirq+0x15c/0x3e0 [c000000000166258] __irq_exit_rcu+0x158/0x190 [c000000000166420] irq_exit+0x20/0x40 [c00000000002853c] timer_interrupt+0x14c/0x2b0 [c000000000009a00] decrementer_common_virt+0x210/0x220 --- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c The immediate cause of the crash is the access of tx_scrq in the following snippet during a reset, where the tx_scrq can be either NULL or an address that will soon be invalid: ibmvnic_xmit() { ... tx_scrq = adapter->tx_scrq[queue_num]; txq = netdev_get_tx_queue(netdev, queue_num); ind_bufp = &tx_scrq->ind_buf; if (test_bit(0, &adapter->resetting)) { ... } But beyond that, the call to ibmvnic_xmit() itself is not safe during a reset and the reset path attempts to avoid this by stopping the queue in ibmvnic_cleanup(). However just after the queue was stopped, an in-flight ibmvnic_complete_tx() could have restarted the queue even as the reset is progressing. Since the queue was restarted we could get a call to ibmvnic_xmit() which can then access the bad tx_scrq (or other fields). We cannot however simply have ibmvnic_complete_tx() check the ->resetting bit and skip starting the queue. This can race at the "back-end" of a good reset which just restarted the queue but has not cleared the ->resetting bit yet. If we skip restarting the queue due to ->resetting being true, the queue would remain stopped indefinitely potentially leading to transmit timeouts. IOW ->resetting is too broad for this purpose. Instead use a new flag that indicates whether or not the queues are active. Only the open/ reset paths control when the queues are active. ibmvnic_complete_tx() and others wake up the queue only if the queue is marked active. So we will have: A. reset/open thread in ibmvnic_cleanup() and __ibmvnic_open() ->resetting = true ->tx_queues_active = false disable tx queues ... ->tx_queues_active = true start tx queues B. Tx interrupt in ibmvnic_complete_tx(): if (->tx_queues_active) netif_wake_subqueue(); To ensure that ->tx_queues_active and state of the queues are consistent, we need a lock which: - must also be taken in the interrupt path (ibmvnic_complete_tx()) - shared across the multiple queues in the adapter (so they don't become serialized) Use rcu_read_lock() and have the reset thread synchronize_rcu() after updating the ->tx_queues_active state. While here, consolidate a few boolean fields in ibmvnic_adapter for better alignment. Based on discussions with Brian King and Dany Madden. Fixes: 7ed5b31f4a66 ("net/ibmvnic: prevent more than one thread from running in reset") Reported-by: Vaishnavi Bhat <vaish123@in.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-08	ibmvnic: complete init_done on transport events	Sukadev Bhattiprolu	1	-0/+7
	[ Upstream commit 36491f2df9ad2501e5a4ec25d3d95d72bafd2781 ] If we get a transport event, set the error and mark the init as complete so the attempt to send crq-init or login fail sooner rather than wait for the timeout. Fixes: bbd669a868bb ("ibmvnic: Fix completion structure initialization") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-08	ibmvnic: define flush_reset_queue helper	Sukadev Bhattiprolu	1	-8/+16
	[ Upstream commit 83da53f7e4bd86dca4b2edc1e2bb324fb3c033a1 ] Define and use a helper to flush the reset queue. Fixes: 2770a7984db5 ("ibmvnic: Introduce hard reset recovery") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-08	ibmvnic: initialize rc before completing wait	Sukadev Bhattiprolu	1	-1/+1
	[ Upstream commit 765559b10ce514eb1576595834f23cdc92125fee ] We should initialize ->init_done_rc before calling complete(). Otherwise the waiting thread may see ->init_done_rc as 0 before we have updated it and may assume that the CRQ was successful. Fixes: 6b278c0cb378 ("ibmvnic delay complete()") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-08	ibmvnic: free reset-work-item when flushing	Sukadev Bhattiprolu	1	-1/+3
	commit 8d0657f39f487d904fca713e0bc39c2707382553 upstream. Fix a tiny memory leak when flushing the reset work queue. Fixes: 2770a7984db5 ("ibmvnic: Introduce hard reset recovery") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-08	ibmvnic: register netdev after init of adapter	Sukadev Bhattiprolu	1	-6/+8
	commit 570425f8c7c18b14fa8a2a58a0adb431968ad118 upstream. Finish initializing the adapter before registering netdev so state is consistent. Fixes: c26eba03e407 ("ibmvnic: Update reset infrastructure to support tunable parameters") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-08	ibmvnic: don't release napi in __ibmvnic_open()	Sukadev Bhattiprolu	1	-2/+7
	[ Upstream commit 61772b0908c640d0309c40f7d41d062ca4e979fa ] If __ibmvnic_open() encounters an error such as when setting link state, it calls release_resources() which frees the napi structures needlessly. Instead, have __ibmvnic_open() only clean up the work it did so far (i.e. disable napi and irqs) and leave the rest to the callers. If caller of __ibmvnic_open() is ibmvnic_open(), it should release the resources immediately. If the caller is do_reset() or do_hard_reset(), they will release the resources on the next reset. This fixes following crash that occurred when running the drmgr command several times to add/remove a vnic interface: [102056] ibmvnic 30000003 env3: Disabling rx_scrq[6] irq [102056] ibmvnic 30000003 env3: Disabling rx_scrq[7] irq [102056] ibmvnic 30000003 env3: Replenished 8 pools Kernel attempted to read user page (10) - exploit attempt? (uid: 0) BUG: Kernel NULL pointer dereference on read at 0x00000010 Faulting instruction address: 0xc000000000a3c840 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries ... CPU: 9 PID: 102056 Comm: kworker/9:2 Kdump: loaded Not tainted 5.16.0-rc5-autotest-g6441998e2e37 #1 Workqueue: events_long __ibmvnic_reset [ibmvnic] NIP: c000000000a3c840 LR: c0080000029b5378 CTR: c000000000a3c820 REGS: c0000000548e37e0 TRAP: 0300 Not tainted (5.16.0-rc5-autotest-g6441998e2e37) MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28248484 XER: 00000004 CFAR: c0080000029bdd24 DAR: 0000000000000010 DSISR: 40000000 IRQMASK: 0 GPR00: c0080000029b55d0 c0000000548e3a80 c0000000028f0200 0000000000000000 ... NIP [c000000000a3c840] napi_enable+0x20/0xc0 LR [c0080000029b5378] __ibmvnic_open+0xf0/0x430 [ibmvnic] Call Trace: [c0000000548e3a80] [0000000000000006] 0x6 (unreliable) [c0000000548e3ab0] [c0080000029b55d0] __ibmvnic_open+0x348/0x430 [ibmvnic] [c0000000548e3b40] [c0080000029bcc28] __ibmvnic_reset+0x500/0xdf0 [ibmvnic] [c0000000548e3c60] [c000000000176228] process_one_work+0x288/0x570 [c0000000548e3d00] [c000000000176588] worker_thread+0x78/0x660 [c0000000548e3da0] [c0000000001822f0] kthread+0x1c0/0x1d0 [c0000000548e3e10] [c00000000000cf64] ret_from_kernel_thread+0x5c/0x64 Instruction dump: 7d2948f8 792307e0 4e800020 60000000 3c4c01eb 384239e0 f821ffd1 39430010 38a0fff6 e92d1100 f9210028 39200000 <e9030010> f9010020 60420000 e9210020 ---[ end trace 5f8033b08fd27706 ]--- Fixes: ed651a10875f ("ibmvnic: Updated reset handling") Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Link: https://lore.kernel.org/r/20220208001918.900602-1-sukadev@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-02	ibmvnic: schedule failover only if vioctl fails	Sukadev Bhattiprolu	1	-1/+5
	commit 277f2bb14361790a70e4b3c649e794b75a91a597 upstream. If client is unable to initiate a failover reset via H_VIOCTL hcall, then it should schedule a failover reset as a last resort. Otherwise, there is no need to do a last resort. Fixes: 334c42414729 ("ibmvnic: improve failover sysfs entry") Reported-by: Cris Forno <cforno12@outlook.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: Dany Madden <drt@linux.ibm.com> Link: https://lore.kernel.org/r/20220221210545.115283-1-drt@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-01	ibmvnic: don't spin in tasklet	Sukadev Bhattiprolu	1	-6/+0
	[ Upstream commit 48079e7fdd0269d66b1d7d66ae88bd03162464ad ] ibmvnic_tasklet() continuously spins waiting for responses to all capability requests. It does this to avoid encountering an error during initialization of the vnic. However if there is a bug in the VIOS and we do not receive a response to one or more queries the tasklet ends up spinning continuously leading to hard lock ups. If we fail to receive a message from the VIOS it is reasonable to timeout the login attempt rather than spin indefinitely in the tasklet. Fixes: 249168ad07cd ("ibmvnic: Make CRQ interrupt tasklet wait for all capabilities crqs") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-02-01	ibmvnic: init ->running_cap_crqs early	Sukadev Bhattiprolu	1	-35/+71
	[ Upstream commit 151b6a5c06b678687f64f2d9a99fd04d5cd32b72 ] We use ->running_cap_crqs to determine when the ibmvnic_tasklet() should send out the next protocol message type. i.e when we get back responses to all our QUERY_CAPABILITY CRQs we send out REQUEST_CAPABILITY crqs. Similiary, when we get responses to all the REQUEST_CAPABILITY crqs, we send out the QUERY_IP_OFFLOAD CRQ. We currently increment ->running_cap_crqs as we send out each CRQ and have the ibmvnic_tasklet() send out the next message type, when this running_cap_crqs count drops to 0. This assumes that all the CRQs of the current type were sent out before the count drops to 0. However it is possible that we send out say 6 CRQs, get preempted and receive all the 6 responses before we send out the remaining CRQs. This can result in ->running_cap_crqs count dropping to zero before all messages of the current type were sent and we end up sending the next protocol message too early. Instead initialize the ->running_cap_crqs upfront so the tasklet will only send the next protocol message after all responses are received. Use the cap_reqs local variable to also detect any discrepancy (either now or in future) in the number of capability requests we actually send. Currently only send_query_cap() is affected by this behavior (of sending next message early) since it is called from the worker thread (during reset) and from application thread (during ->ndo_open()) and they can be preempted. send_request_cap() is only called from the tasklet which processes CRQ responses sequentially, is not be affected. But to maintain the existing symmtery with send_query_capability() we update send_request_capability() also. Fixes: 249168ad07cd ("ibmvnic: Make CRQ interrupt tasklet wait for all capabilities crqs") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-02-01	ibmvnic: Allow extra failures before disabling	Sukadev Bhattiprolu	1	-4/+17
	[ Upstream commit db9f0e8bf79e6da7068b5818fea0ffd9d0d4b4da ] If auto-priority-failover (APF) is enabled and there are at least two backing devices of different priorities, some resets like fail-over, change-param etc can cause at least two back to back failovers. (Failover from high priority backing device to lower priority one and then back to the higher priority one if that is still functional). Depending on the timimg of the two failovers it is possible to trigger a "hard" reset and for the hard reset to fail due to failovers. When this occurs, the driver assumes that the network is unstable and disables the VNIC for a 60-second "settling time". This in turn can cause the ethtool command to fail with "No such device" while the vnic automatically recovers a little while later. Given that it's possible to have two back to back failures, allow for extra failures before disabling the vnic for the settling time. Fixes: f15fde9d47b8 ("ibmvnic: delay next reset if hard reset fails") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18	ibmvnic: delay complete()	Sukadev Bhattiprolu	1	-7/+9
	[ Upstream commit 6b278c0cb378079f3c0c61ae4a369c09ff1a4188 ] If we get CRQ_INIT, we set errno to -EIO and first call complete() to notify the waiter. Then we try to schedule a FAILOVER reset. If this occurs while adapter is in PROBING state, ibmvnic_reset() changes the error code to EAGAIN and returns without scheduling the FAILOVER. The purpose of setting error code to EAGAIN is to ask the waiter to retry. But due to the earlier complete() call, the waiter may already have seen the -EIO response and decided not to retry. This can cause intermittent failures when bringing up ibmvnic adapters during boot, specially in in kexec/kdump kernels. Defer the complete() call until after scheduling the reset. Also streamline the error code to EAGAIN. Don't see why we need EIO sometimes. All 3 callers of ibmvnic_reset_init() can handle EAGAIN. Fixes: 17c8705838a5 ("ibmvnic: Return error code if init interrupted by transport event") Reported-by: Vaishnavi Bhat <vaish123@in.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18	ibmvnic: Process crqs after enabling interrupts	Sukadev Bhattiprolu	1	-0/+3
	[ Upstream commit 6e20d00158f31f7631d68b86996b7e951c4451c8 ] Soon after registering a CRQ it is possible that we get a fail over or maybe a CRQ_INIT from the VIOS while interrupts were disabled. Look for any such CRQs after enabling interrupts. Otherwise we can intermittently fail to bring up ibmvnic adapters during boot, specially in kexec/kdump kernels. Fixes: 032c5e82847a ("Driver for IBM System i/p VNIC protocol") Reported-by: Vaishnavi Bhat <vaish123@in.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18	ibmvnic: don't stop queue in xmit	Sukadev Bhattiprolu	1	-2/+0
	[ Upstream commit 8878e46fcfd46b19964bd90e13b25dd94cbfc9be ] If adapter's resetting bit is on, discard the packet but don't stop the transmit queue - instead leave that to the reset code. With this change, it is possible that we may get several calls to ibmvnic_xmit() that simply discard packets and return. But if we stop the queue here, we might end up doing so just after __ibmvnic_open() started the queues (during a hard/soft reset) and before the ->resetting bit was cleared. If that happens, there will be no one to restart queue and transmissions will be blocked indefinitely. This can cause a TIMEOUT reset and with auto priority failover enabled, an unnecessary FAILOVER reset to less favored backing device and then a FAILOVER back to the most favored backing device. If we hit the window repeatedly, we can get stuck in a loop of TIMEOUT, FAILOVER, FAILOVER resets leaving the adapter unusable for extended periods of time. Fixes: 7f5b030830fe ("ibmvnic: Free skb's in cases of failure in transmit") Reported-by: Abdul Haleem <abdhalee@in.ibm.com> Reported-by: Vaishnavi Bhat <vaish123@in.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-27	Revert "ibmvnic: check failover_pending in login response"	Desnes A. Nunes do Rosario	1	-8/+0
	This reverts commit d437f5aa23aa2b7bd07cd44b839d7546cc17166f. Code has been duplicated through commit <273c29e944bd> "ibmvnic: check failover_pending in login response" Signed-off-by: Desnes A. Nunes do Rosario <desnesn@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-09	ibmvnic: check failover_pending in login response	Sukadev Bhattiprolu	1	-0/+8
	If a failover occurs before a login response is received, the login response buffer maybe undefined. Check that there was no failover before accessing the login response buffer. Fixes: 032c5e82847a ("Driver for IBM System i/p VNIC protocol") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-08	ibmvnic: check failover_pending in login response	Sukadev Bhattiprolu	1	-0/+8
	If a failover occurs before a login response is received, the login response buffer maybe undefined. Check that there was no failover before accessing the login response buffer. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-27	dev_ioctl: split out ndo_eth_ioctl	Arnd Bergmann	2	-3/+3
	Most users of ndo_do_ioctl are ethernet drivers that implement the MII commands SIOCGMIIPHY/SIOCGMIIREG/SIOCSMIIREG, or hardware timestamping with SIOCSHWTSTAMP/SIOCGHWTSTAMP. Separate these from the few drivers that use ndo_do_ioctl to implement SIOCBOND, SIOCBR and SIOCWANDEV commands. This is a purely cosmetic change intended to help readers find their way through the implementation. Cc: Doug Ledford <dledford@redhat.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Vivien Didelot <vivien.didelot@gmail.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Vladimir Oltean <olteanv@gmail.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: linux-rdma@vger.kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-21	ibmvnic: Remove the proper scrq flush	Sukadev Bhattiprolu	1	-1/+1
	Commit 65d6470d139a ("ibmvnic: clean pending indirect buffs during reset") intended to remove the call to ibmvnic_tx_scrq_flush() when the ->resetting flag is true and was tested that way. But during the final rebase to net-next, the hunk got applied to a block few lines below (which happened to have the same diff context) and the wrong call to ibmvnic_tx_scrq_flush() got removed. Fix that by removing the correct ibmvnic_tx_scrq_flush() and restoring the one that was incorrectly removed. Fixes: 65d6470d139a ("ibmvnic: clean pending indirect buffs during reset") Reported-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-01	ibmvnic: retry reset if there are no other resets	Sukadev Bhattiprolu	1	-3/+19
	Normally, if a reset fails due to failover or other communication error there is another reset (eg: FAILOVER) in the queue and we would process that reset. But if we are unable to communicate with PHYP or VIOS after H_FREE_CRQ, there would be no other resets in the queue and the adapter would be in an undefined state even though it was in the OPEN state earlier. While starting the reset we set the carrier to off state so we won't even get the timeout resets. If the last queued reset fails, retry it as a hard reset (after the usual 60 second settling time). Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-30	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski	1	-24/+77
	Trivial conflict in net/netfilter/nf_tables_api.c. Duplicate fix in tools/testing/selftests/net/devlink_port_split.py - take the net-next version. skmsg, and L4 bpf - keep the bpf code but remove the flags and err params. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-06-24	ibmvnic: parenthesize a check	Sukadev Bhattiprolu	1	-1/+1
	Parenthesize a check to be more explicit and to fix a sparse warning seen on some distros. Fixes: 91dc5d2553fbf ("ibmvnic: fix miscellaneous checks") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24	ibmvnic: free tx_pool if tso_pool alloc fails	Sukadev Bhattiprolu	1	-1/+4
	Free tx_pool and clear it, if allocation of tso_pool fails. release_tx_pools() assumes we have both tx and tso_pools if ->tx_pool is non-NULL. If allocation of tso_pool fails in init_tx_pools(), the assumption will not be true and we would end up dereferencing ->tx_buff, ->free_map fields from a NULL pointer. Fixes: 3205306c6b8d ("ibmvnic: Update TX pool initialization routine") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24	ibmvnic: set ltb->buff to NULL after freeing	Sukadev Bhattiprolu	1	-11/+15
	free_long_term_buff() checks ltb->buff to decide whether we have a long term buffer to free. So set ltb->buff to NULL afer freeing. While here, also clear ->map_id, fix up some coding style and log an error. Fixes: 9c4eaabd1bb39 ("Check CRQ command return codes") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24	ibmvnic: account for bufs already saved in indir_buf	Sukadev Bhattiprolu	1	-1/+8
	This fixes a crash in replenish_rx_pool() when called from ibmvnic_poll() after a previous call to replenish_rx_pool() encountered an error when allocating a socket buffer. Thanks to Rick Lindsley and Dany Madden for helping debug the crash. Fixes: 4f0b6812e9b9 ("ibmvnic: Introduce batched RX buffer descriptor transmission") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24	ibmvnic: clean pending indirect buffs during reset	Sukadev Bhattiprolu	1	-2/+6
	We batch subordinate command response queue (scrq) descriptors that we need to send to the VIOS using an "indirect" buffer. If after we queue one or more scrqs in the indirect buffer encounter an error (say fail to allocate an skb), we leave the queued scrq descriptors in the indirect buffer until the next call to ibmvnic_xmit(). On the next call to ibmvnic_xmit(), it is possible that the adapter is going through a reset and it is possible that the long term buffers have been unmapped on the VIOS side. If we proceed to flush (send) the packets that are in the indirect buffer, we will end up using the old map ids and this can cause the VIOS to trigger an unnecessary FATAL error reset. Instead of flushing packets remaining on the indirect_buff, discard (clean) them instead. Fixes: 0d973388185d4 ("ibmvnic: Introduce xmit_more support using batched subCRQ hcalls") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24	Revert "ibmvnic: remove duplicate napi_schedule call in open function"	Dany Madden	1	-0/+5
	This reverts commit 7c451f3ef676c805a4b77a743a01a5c21a250a73. When a vnic interface is taken down and then up, connectivity is not restored. We bisected it to this commit. Reverting this commit until we can fully investigate the issue/benefit of the change. Fixes: 7c451f3ef676 ("ibmvnic: remove duplicate napi_schedule call in open function") Reported-by: Cristobal Forno <cforno12@linux.ibm.com> Reported-by: Abdul Haleem <abdhalee@in.ibm.com> Signed-off-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24	Revert "ibmvnic: simplify reset_long_term_buff function"	Sukadev Bhattiprolu	1	-8/+38
	This reverts commit 1c7d45e7b2c29080bf6c8cd0e213cc3cbb62a054. We tried to optimize the number of hcalls we send and skipped sending the REQUEST_MAP calls for some maps. However during resets, we need to resend all the maps to the VIOS since the VIOS does not remember the old values. In fact we may have failed over to a new VIOS which will not have any of the mappings. When we send packets with map ids the VIOS does not know about, it triggers a FATAL reset. While the client does recover from the FATAL error reset, we are seeing a large number of such resets. Handling FATAL resets is lot more unnecessary work than issuing a few more hcalls so revert the commit and resend the maps to the VIOS. Fixes: 1c7d45e7b2c ("ibmvnic: simplify reset_long_term_buff function") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-23	ibmveth: Set CHECKSUM_PARTIAL if NULL TCP CSUM.	David Wilder	1	-23/+28
	TCP checksums on received packets may be set to NULL by the sender if CSO is enabled. The hypervisor flags these packets as check-sum-ok and the skb is then flagged CHECKSUM_UNNECESSARY. If these packets are then forwarded the sender will not request CSO due to the CHECKSUM_UNNECESSARY flag. The result is a TCP packet sent with a bad checksum. This change sets up CHECKSUM_PARTIAL on these packets causing the sender to correctly request CSUM offload. Signed-off-by: David Wilder <dwilder@us.ibm.com> Reviewed-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com> Tested-by: Cristobal Forno <cforno12@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-22	ibmvnic: Use strscpy() instead of strncpy()	Kees Cook	1	-3/+3
	Since these strings are expected to be NUL-terminated and the buffers are exactly sized (in vnic_client_data_len()) with no padding, strncpy() can be safely replaced with strscpy() here, as strncpy() on NUL-terminated string is considered deprecated[1]. This has the side-effect of silencing a -Warray-bounds warning due to the compiler being confused about the vlcd incrementing: In file included from ./include/linux/string.h:253, from ./include/linux/bitmap.h:10, from ./include/linux/cpumask.h:12, from ./include/linux/mm_types_task.h:14, from ./include/linux/mm_types.h:5, from ./include/linux/buildid.h:5, from ./include/linux/module.h:14, from drivers/net/ethernet/ibm/ibmvnic.c:35: In function '__fortify_strncpy', inlined from 'vnic_add_client_data' at drivers/net/ethernet/ibm/ibmvnic.c:3919:2: ./include/linux/fortify-string.h:39:30: warning: '__builtin_strncpy' offset 12 from the object at 'v lcd' is out of the bounds of referenced subobject 'name' with type 'char[]' at offset 12 [-Warray-bo unds] 39 \| #define __underlying_strncpy __builtin_strncpy \| ^ ./include/linux/fortify-string.h:51:9: note: in expansion of macro '__underlying_strncpy' 51 \| return __underlying_strncpy(p, q, size); \| ^~~~~~~~~~~~~~~~~~~~ drivers/net/ethernet/ibm/ibmvnic.c: In function 'vnic_add_client_data': drivers/net/ethernet/ibm/ibmvnic.c:3883:7: note: subobject 'name' declared here 3883 \| char name[]; \| ^~~~ [1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings Cc: Dany Madden <drt@linux.ibm.com> Cc: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Cc: Thomas Falcon <tlfalcon@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: netdev@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-14	ibmvnic: fix send_request_map incompatible argument	Lijun Pan	1	-1/+1
	The 3rd argument is u32 by function definition while it is __be32 by function declaration. Signed-off-by: Lijun Pan <lijunp213@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-12	ibmvnic: fix kernel build warnings in build_hdr_descs_arr	Lijun Pan	1	-1/+2
	Fix the following kernel build warnings: drivers/net/ethernet/ibm/ibmvnic.c:1516: warning: Function parameter or member 'skb' not described in 'build_hdr_descs_arr' drivers/net/ethernet/ibm/ibmvnic.c:1516: warning: Function parameter or member 'indir_arr' not described in 'build_hdr_descs_arr' drivers/net/ethernet/ibm/ibmvnic.c:1516: warning: Excess function parameter 'txbuff' description in 'build_hdr_descs_arr' Signed-off-by: Lijun Pan <lijunp213@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-12	ibmvnic: fix kernel build warning	Lijun Pan	1	-0/+4
	drivers/net/ethernet/ibm/ibmvnic.c: In function ‘adapter_state_to_string’: drivers/net/ethernet/ibm/ibmvnic.c:855:2: warning: enumeration value ‘VNIC_DOWN’ not handled in switch [-Wswitch] 855 \| switch (state) { \| ^~~~~~ drivers/net/ethernet/ibm/ibmvnic.c: In function ‘reset_reason_to_string’: drivers/net/ethernet/ibm/ibmvnic.c:1958:2: warning: enumeration value ‘VNIC_RESET_PASSIVE_INIT’ not handled in switch [-Wswitch] 1958 \| switch (reason) { \| ^~~~~~ Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Lijun Pan <lijunp213@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	ibmvnic: fix kernel build warning in strncpy	Lijun Pan	1	-1/+1
	drivers/net/ethernet/ibm/ibmvnic.c: In function ‘handle_vpd_rsp’: drivers/net/ethernet/ibm/ibmvnic.c:4393:3: warning: ‘strncpy’ output truncated before terminating nul copying 3 bytes from a string of the same length [-Wstringop-truncation] 4393 \| strncpy((char )adapter->fw_version, "N/A", 3 sizeof(char)); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Lijun Pan <lijunp213@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	ibmvnic: Allow device probe if the device is not ready at boot	Cristobal Forno	2	-27/+132
	Allow the device to be initialized at a later time if it is not available at boot. The device will be allowed to probe but will be given a "down" state. After completing device probe and registering the net device, the driver will await an interrupt signal from its partner device, indicating that it is ready for boot. The driver will schedule a work event to perform the necessary procedure and begin operation. Co-developed-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: Cristobal Forno <cforno12@linux.ibm.com> Acked-by: Lijun Pan <lijunp213@gmail.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	ibmvnic: Use list_for_each_entry() to simplify code in ibmvnic.c	Wang Hai	1	-2/+1
	Convert list_for_each() to list_for_each_entry() where applicable. This simplifies the code. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Acked-by: Lijun Pan <lijunp213@gmail.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-29	ehea: fix error return code in ehea_restart_qps()	Zhen Lei	1	-4/+5
	Fix to return -EFAULT from the error handling case instead of 0, as done elsewhere in this function. By the way, when get_zeroed_page() fails, directly return -ENOMEM to simplify code. Fixes: 2c69448bbced ("ehea: DLPAR memory add fix") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Link: https://lore.kernel.org/r/20210528085555.9390-1-thunder.leizhen@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-24	ehea: Use DEVICE_ATTR_*() macro	YueHaibing	1	-9/+9
	Use DEVICE_ATTR_*() helper instead of plain DEVICE_ATTR, which makes the code a bit shorter and easier to read. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-21	ibmvnic: remove default label from to_string switch	Michal Suchanek	1	-4/+2
	This way the compiler warns when a new value is added to the enum but not to the string translation like: drivers/net/ethernet/ibm/ibmvnic.c: In function 'adapter_state_to_string': drivers/net/ethernet/ibm/ibmvnic.c:832:2: warning: enumeration value 'VNIC_FOOBAR' not handled in switch [-Wswitch] switch (state) { ^~~~~~ drivers/net/ethernet/ibm/ibmvnic.c: In function 'reset_reason_to_string': drivers/net/ethernet/ibm/ibmvnic.c:1935:2: warning: enumeration value 'VNIC_RESET_FOOBAR' not handled in switch [-Wswitch] switch (reason) { ^~~~~~ Signed-off-by: Michal Suchanek <msuchanek@suse.de> Acked-by: Lijun Pan <lijunp213@gmail.com> Link: https://lore.kernel.org/netdev/CAOhMmr701LecfuNM+EozqbiTxFvDiXjFdY2aYeKJYaXq9kqVDg@mail.gmail.com/ Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-19	net: ibm: remove leading spaces before tabs	Hui Tang	1	-1/+1
	There are a few leading spaces before tabs and remove it by running the following commard: $ find . -name '.c' \| xargs sed -r -i 's/^[ ]+\t/\t/' $ find . -name '.h' \| xargs sed -r -i 's/^[ ]+\t/\t/' Cc: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: Hui Tang <tanghui20@huawei.com> Acked-by: Lijun Pan <lijunp213@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-19	ibmveth: fix kobj_to_dev.cocci warnings	YueHaibing	1	-2/+1
	Use kobj_to_dev() instead of container_of() Generated by: scripts/coccinelle/api/kobj_to_dev.cocci Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Lijun Pan <lijunp213@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-17	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski	1	-16/+9
	drivers/net/ethernet/stmicro/stmmac/stmmac_main.c - keep the ZC code, drop the code related to reinit net/bridge/netfilter/ebtables.c - fix build after move to net_generic Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-04-14	ibmvnic: remove duplicate napi_schedule call in open function	Lijun Pan	1	-5/+0
	Remove the unnecessary napi_schedule() call in __ibmvnic_open() since interrupt_rx() calls napi_schedule_prep/__napi_schedule during every receive interrupt. Fixes: ed651a10875f ("ibmvnic: Updated reset handling") Signed-off-by: Lijun Pan <lijunp213@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-14	ibmvnic: remove duplicate napi_schedule call in do_reset function	Lijun Pan	1	-5/+1
	During adapter reset, do_reset/do_hard_reset calls ibmvnic_open(), which will calls napi_schedule if previous state is VNIC_CLOSED (i.e, the reset case, and "ifconfig down" case). So there is no need for do_reset to call napi_schedule again at the end of the function though napi_schedule will neglect the request if napi is already scheduled. Fixes: ed651a10875f ("ibmvnic: Updated reset handling") Signed-off-by: Lijun Pan <lijunp213@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-14	ibmvnic: avoid calling napi_disable() twice	Lijun Pan	1	-2/+1
	__ibmvnic_open calls napi_disable without checking whether NAPI polling has already been disabled or not. This could cause napi_disable being called twice, which could generate deadlock. For example, the first napi_disable will spin until NAPI_STATE_SCHED is cleared by napi_complete_done, then set it again. When napi_disable is called the second time, it will loop infinitely because no dev->poll will be running to clear NAPI_STATE_SCHED. To prevent above scenario from happening, call ibmvnic_napi_disable() which checks if napi is disabled or not before calling napi_disable. Fixes: bfc32f297337 ("ibmvnic: Move resource initialization to its own routine") Suggested-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: Lijun Pan <lijunp213@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-14	ibmvnic: queue reset work in system_long_wq	Lijun Pan	1	-3/+4
	The reset process for ibmvnic commonly takes multiple seconds, clearly making it inappropriate for schedule_work/system_wq. The reason to make this change is that ibmvnic's use of the default system-wide workqueue for a relatively long-running work item can negatively affect other workqueue users. So, queue the relatively slow reset job to the system_long_wq. Suggested-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Lijun Pan <lijunp213@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-14	ibmvnic: correctly use dev_consume/free_skb_irq	Lijun Pan	1	-4/+7
	It is more correct to use dev_kfree_skb_irq when packets are dropped, and to use dev_consume_skb_irq when packets are consumed. Fixes: 0d973388185d ("ibmvnic: Introduce xmit_more support using batched subCRQ hcalls") Suggested-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: Lijun Pan <lijunp213@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-14	ibmvnic: improve failover sysfs entry	Lijun Pan	1	-6/+8
	The current implementation relies on H_IOCTL call to issue a H_SESSION_ERR_DETECTED command to let the hypervisor to send a failover signal. However, it may not work if there is no backup device or if the vnic is already in error state, e.g., "ibmvnic 30000003 env3: rx buffer returned with rc 6". Add a last resort, that is to schedule a failover reset via CRQ command. Signed-off-by: Lijun Pan <lijunp213@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-12	ibmvnic: print adapter state as a string	Lijun Pan	1	-18/+49
	The adapter state can be added or deleted over different versions of the source code. Print a string instead of a number. Signed-off-by: Lijun Pan <lijunp213@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-12	ibmvnic: print reset reason as a string	Lijun Pan	1	-7/+28
	The reset reason can be added or deleted over different versions of the source code. Print a string instead of a number. Signed-off-by: Lijun Pan <lijunp213@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>