Age | Commit message (Collapse) | Author | Files | Lines |
|
[ Upstream commit b72a4aff947ba807177bdabb43debaf2c66bee05 ]
Double free crash is observed when FW recovery(caused by wmi
timeout/crash) is followed by immediate suspend event. The FW recovery
is triggered by ath10k_core_restart() which calls driver clean up via
ath10k_halt(). When the suspend event occurs between the FW recovery,
the restart worker thread is put into frozen state until suspend completes.
The suspend event triggers ath10k_stop() which again triggers ath10k_halt()
The double invocation of ath10k_halt() causes ath10k_htt_rx_free() to be
called twice(Note: ath10k_htt_rx_alloc was not called by restart worker
thread because of its frozen state), causing the crash.
To fix this, during the suspend flow, skip call to ath10k_halt() in
ath10k_stop() when the current driver state is ATH10K_STATE_RESTARTING.
Also, for driver state ATH10K_STATE_RESTARTING, call
ath10k_wait_for_suspend() in ath10k_stop(). This is because call to
ath10k_wait_for_suspend() is skipped later in
[ath10k_halt() > ath10k_core_stop()] for the driver state
ATH10K_STATE_RESTARTING.
The frozen restart worker thread will be cancelled during resume when the
device comes out of suspend.
Below is the crash stack for reference:
[ 428.469167] ------------[ cut here ]------------
[ 428.469180] kernel BUG at mm/slub.c:4150!
[ 428.469193] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 428.469219] Workqueue: events_unbound async_run_entry_fn
[ 428.469230] RIP: 0010:kfree+0x319/0x31b
[ 428.469241] RSP: 0018:ffffa1fac015fc30 EFLAGS: 00010246
[ 428.469247] RAX: ffffedb10419d108 RBX: ffff8c05262b0000
[ 428.469252] RDX: ffff8c04a8c07000 RSI: 0000000000000000
[ 428.469256] RBP: ffffa1fac015fc78 R08: 0000000000000000
[ 428.469276] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 428.469285] Call Trace:
[ 428.469295] ? dma_free_attrs+0x5f/0x7d
[ 428.469320] ath10k_core_stop+0x5b/0x6f
[ 428.469336] ath10k_halt+0x126/0x177
[ 428.469352] ath10k_stop+0x41/0x7e
[ 428.469387] drv_stop+0x88/0x10e
[ 428.469410] __ieee80211_suspend+0x297/0x411
[ 428.469441] rdev_suspend+0x6e/0xd0
[ 428.469462] wiphy_suspend+0xb1/0x105
[ 428.469483] ? name_show+0x2d/0x2d
[ 428.469490] dpm_run_callback+0x8c/0x126
[ 428.469511] ? name_show+0x2d/0x2d
[ 428.469517] __device_suspend+0x2e7/0x41b
[ 428.469523] async_suspend+0x1f/0x93
[ 428.469529] async_run_entry_fn+0x3d/0xd1
[ 428.469535] process_one_work+0x1b1/0x329
[ 428.469541] worker_thread+0x213/0x372
[ 428.469547] kthread+0x150/0x15f
[ 428.469552] ? pr_cont_work+0x58/0x58
[ 428.469558] ? kthread_blkcg+0x31/0x31
Tested-on: QCA6174 hw3.2 PCI WLAN.RM.4.4.1-00288-QCARMSWPZ-1
Co-developed-by: Wen Gong <quic_wgong@quicinc.com>
Signed-off-by: Wen Gong <quic_wgong@quicinc.com>
Signed-off-by: Abhishek Kumar <kuabhs@chromium.org>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20220426221859.v2.1.I650b809482e1af8d0156ed88b5dc2677a0711d46@changeid
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 161c64de239c7018e0295e7e0520a19f00aa32dc ]
When ath11k modules are removed using rmmod with spectral scan enabled,
crash is observed. Different crash trace is observed for each crash.
Send spectral scan disable WMI command to firmware before cleaning
the spectral dbring in the spectral_deinit API to avoid this crash.
call trace from one of the crash observed:
[ 1252.880802] Unable to handle kernel NULL pointer dereference at virtual address 00000008
[ 1252.882722] pgd = 0f42e886
[ 1252.890955] [00000008] *pgd=00000000
[ 1252.893478] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
[ 1253.093035] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.89 #0
[ 1253.115261] Hardware name: Generic DT based system
[ 1253.121149] PC is at ath11k_spectral_process_data+0x434/0x574 [ath11k]
[ 1253.125940] LR is at 0x88e31017
[ 1253.132448] pc : [<7f9387b8>] lr : [<88e31017>] psr: a0000193
[ 1253.135488] sp : 80d01bc8 ip : 00000001 fp : 970e0000
[ 1253.141737] r10: 88e31000 r9 : 970ec000 r8 : 00000080
[ 1253.146946] r7 : 94734040 r6 : a0000113 r5 : 00000057 r4 : 00000000
[ 1253.152159] r3 : e18cb694 r2 : 00000217 r1 : 1df1f000 r0 : 00000001
[ 1253.158755] Flags: NzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user
[ 1253.165266] Control: 10c0383d Table: 5e71006a DAC: 00000055
[ 1253.172472] Process swapper/0 (pid: 0, stack limit = 0x60870141)
[ 1253.458055] [<7f9387b8>] (ath11k_spectral_process_data [ath11k]) from [<7f917fdc>] (ath11k_dbring_buffer_release_event+0x214/0x2e4 [ath11k])
[ 1253.466139] [<7f917fdc>] (ath11k_dbring_buffer_release_event [ath11k]) from [<7f8ea3c4>] (ath11k_wmi_tlv_op_rx+0x1840/0x29cc [ath11k])
[ 1253.478807] [<7f8ea3c4>] (ath11k_wmi_tlv_op_rx [ath11k]) from [<7f8fe868>] (ath11k_htc_rx_completion_handler+0x180/0x4e0 [ath11k])
[ 1253.490699] [<7f8fe868>] (ath11k_htc_rx_completion_handler [ath11k]) from [<7f91308c>] (ath11k_ce_per_engine_service+0x2c4/0x3b4 [ath11k])
[ 1253.502386] [<7f91308c>] (ath11k_ce_per_engine_service [ath11k]) from [<7f9a4198>] (ath11k_pci_ce_tasklet+0x28/0x80 [ath11k_pci])
[ 1253.514811] [<7f9a4198>] (ath11k_pci_ce_tasklet [ath11k_pci]) from [<8032227c>] (tasklet_action_common.constprop.2+0x64/0xe8)
[ 1253.526476] [<8032227c>] (tasklet_action_common.constprop.2) from [<803021e8>] (__do_softirq+0x130/0x2d0)
[ 1253.537756] [<803021e8>] (__do_softirq) from [<80322610>] (irq_exit+0xcc/0xe8)
[ 1253.547304] [<80322610>] (irq_exit) from [<8036a4a4>] (__handle_domain_irq+0x60/0xb4)
[ 1253.554428] [<8036a4a4>] (__handle_domain_irq) from [<805eb348>] (gic_handle_irq+0x4c/0x90)
[ 1253.562321] [<805eb348>] (gic_handle_irq) from [<80301a78>] (__irq_svc+0x58/0x8c)
Tested-on: QCN6122 hw1.0 AHB WLAN.HK.2.6.0.1-00851-QCAHKSWPL_SILICONZ-1
Signed-off-by: Hari Chandrakanthan <quic_haric@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/1649396345-349-1-git-send-email-quic_haric@quicinc.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit e999a5da28a0e0f7de242d841ef7d5e48f4646ae ]
This patch fixes an invalid TX PA DC bias level on QCA9561, which
results in a very low output power and very low throughput as devices
are further away from the AP (compared to other 2.4GHz APs).
This patch was suggested by Felix Fietkau, who noted[1]:
"The value written to that register is wrong, because while the mask
definition AR_CH0_TOP2_XPABIASLVL uses a different value for 9561, the
shift definition AR_CH0_TOP2_XPABIASLVL_S is hardcoded to 12, which is
wrong for 9561."
In real life testing, without this patch the 2.4GHz throughput on
Yuncore XD3200 is around 10Mbps sitting next to the AP, and closer to
practical maximum with the patch applied.
[1] https://lore.kernel.org/all/91c58969-c60e-2f41-00ac-737786d435ae@nbd.name
Signed-off-by: Thibaut VARÈNE <hacks+kernel@slashdirt.org>
Acked-by: Felix Fietkau <nbd@nbd.name>
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20220417145145.1847-1-hacks+kernel@slashdirt.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 5a6b06f5927c940fa44026695779c30b7536474c upstream.
The ieee80211_tx_info_clear_status() helper also clears the rate counts and
the driver-private part of struct ieee80211_tx_info, so using it breaks
quite a few other things. So back out of using it, and instead define a
ath-internal helper that only clears the area between the
status_driver_data and the rates info. Combined with moving the
ath_frame_info struct to status_driver_data, this avoids clearing anything
we shouldn't be, and so we can keep the existing code for handling the rate
information.
While fixing this I also noticed that the setting of
tx_info->status.rates[tx_rateindex].count on hardware underrun errors was
always immediately overridden by the normal setting of the same fields, so
rearrange the code so that the underrun detection actually takes effect.
The new helper could be generalised to a 'memset_between()' helper, but
leave it as a driver-internal helper for now since this needs to go to
stable.
Cc: stable@vger.kernel.org
Reported-by: Peter Seiderer <ps.report@gmx.net>
Fixes: 037250f0a45c ("ath9k: Properly clear TX status area before reporting to mac80211")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Reviewed-by: Peter Seiderer <ps.report@gmx.net>
Tested-by: Peter Seiderer <ps.report@gmx.net>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220404204800.2681133-1-toke@toke.dk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 037250f0a45cf9ecf5b52d4b9ff8eadeb609c800 upstream.
The ath9k driver was not properly clearing the status area in the
ieee80211_tx_info struct before reporting TX status to mac80211. Instead,
it was manually filling in fields, which meant that fields introduced later
were left as-is.
Conveniently, mac80211 actually provides a helper to zero out the status
area, so use that to make sure we zero everything.
The last commit touching the driver function writing the status information
seems to have actually been fixing an issue that was also caused by the
area being uninitialised; but it only added clearing of a single field
instead of the whole struct. That is now redundant, though, so revert that
commit and use it as a convenient Fixes tag.
Fixes: cc591d77aba1 ("ath9k: Make sure to zero status.tx_time before reporting TX status")
Reported-by: Bagas Sanjaya <bagasdotme@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220330164409.16645-1-toke@toke.dk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 3df6d74aedfdca919cca475d15dfdbc8b05c9e5d ]
If amss.bin was missing ath11k would crash during 'rmmod ath11k_pci'. The
reason for that was that we were using mhi_async_power_up() which does not
check any errors. But mhi_sync_power_up() on the other hand does check for
errors so let's use that to fix the crash.
I was not able to find a reason why an async version was used.
ath11k_mhi_start() (which enables state ATH11K_MHI_POWER_ON) is called from
ath11k_hif_power_up(), which can sleep. So sync version should be safe to use
here.
[ 145.569731] general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN PTI
[ 145.569789] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
[ 145.569843] CPU: 2 PID: 1628 Comm: rmmod Kdump: loaded Tainted: G W 5.16.0-wt-ath+ #567
[ 145.569898] Hardware name: Intel(R) Client Systems NUC8i7HVK/NUC8i7HVB, BIOS HNKBLi70.86A.0067.2021.0528.1339 05/28/2021
[ 145.569956] RIP: 0010:ath11k_hal_srng_access_begin+0xb5/0x2b0 [ath11k]
[ 145.570028] Code: df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 ec 01 00 00 48 8b ab a8 00 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 ea 48 c1 ea 03 <0f> b6 14 02 48 89 e8 83 e0 07 83 c0 03 45 85 ed 75 48 38 d0 7c 08
[ 145.570089] RSP: 0018:ffffc900025d7ac0 EFLAGS: 00010246
[ 145.570144] RAX: dffffc0000000000 RBX: ffff88814fca2dd8 RCX: 1ffffffff50cb455
[ 145.570196] RDX: 0000000000000000 RSI: ffff88814fca2dd8 RDI: ffff88814fca2e80
[ 145.570252] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffffa8659497
[ 145.570329] R10: fffffbfff50cb292 R11: 0000000000000001 R12: ffff88814fca0000
[ 145.570410] R13: 0000000000000000 R14: ffff88814fca2798 R15: ffff88814fca2dd8
[ 145.570465] FS: 00007fa399988540(0000) GS:ffff888233e00000(0000) knlGS:0000000000000000
[ 145.570519] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 145.570571] CR2: 00007fa399b51421 CR3: 0000000137898002 CR4: 00000000003706e0
[ 145.570623] Call Trace:
[ 145.570675] <TASK>
[ 145.570727] ? ath11k_ce_tx_process_cb+0x34b/0x860 [ath11k]
[ 145.570797] ath11k_ce_tx_process_cb+0x356/0x860 [ath11k]
[ 145.570864] ? tasklet_init+0x150/0x150
[ 145.570919] ? ath11k_ce_alloc_pipes+0x280/0x280 [ath11k]
[ 145.570986] ? tasklet_clear_sched+0x42/0xe0
[ 145.571042] ? tasklet_kill+0xe9/0x1b0
[ 145.571095] ? tasklet_clear_sched+0xe0/0xe0
[ 145.571148] ? irq_has_action+0x120/0x120
[ 145.571202] ath11k_ce_cleanup_pipes+0x45a/0x580 [ath11k]
[ 145.571270] ? ath11k_pci_stop+0x10e/0x170 [ath11k_pci]
[ 145.571345] ath11k_core_stop+0x8a/0xc0 [ath11k]
[ 145.571434] ath11k_core_deinit+0x9e/0x150 [ath11k]
[ 145.571499] ath11k_pci_remove+0xd2/0x260 [ath11k_pci]
[ 145.571553] pci_device_remove+0x9a/0x1c0
[ 145.571605] __device_release_driver+0x332/0x660
[ 145.571659] driver_detach+0x1e7/0x2c0
[ 145.571712] bus_remove_driver+0xe2/0x2d0
[ 145.571772] pci_unregister_driver+0x21/0x250
[ 145.571826] __do_sys_delete_module+0x30a/0x4b0
[ 145.571879] ? free_module+0xac0/0xac0
[ 145.571933] ? lockdep_hardirqs_on_prepare.part.0+0x18c/0x370
[ 145.571986] ? syscall_enter_from_user_mode+0x1d/0x50
[ 145.572039] ? lockdep_hardirqs_on+0x79/0x100
[ 145.572097] do_syscall_64+0x3b/0x90
[ 145.572153] entry_SYSCALL_64_after_hwframe+0x44/0xae
Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03003-QCAHSPSWPL_V1_V2_SILICONZ_LITE-2
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20220127090117.2024-2-kvalo@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 22b59cb965f79ee1accf83172441c9ca0ecb632a ]
Call netif_napi_del() from ath11k_ahb_free_ext_irq() to fix
the following kernel panic when unload/load ath11k modules
for few iterations.
[ 971.201365] Unable to handle kernel paging request at virtual address 6d97a208
[ 971.204227] pgd = 594c2919
[ 971.211478] [6d97a208] *pgd=00000000
[ 971.214120] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
[ 971.412024] CPU: 2 PID: 4435 Comm: insmod Not tainted 5.4.89 #0
[ 971.434256] Hardware name: Generic DT based system
[ 971.440165] PC is at napi_by_id+0x10/0x40
[ 971.445019] LR is at netif_napi_add+0x160/0x1dc
[ 971.743127] (napi_by_id) from [<807d89a0>] (netif_napi_add+0x160/0x1dc)
[ 971.751295] (netif_napi_add) from [<7f1209ac>] (ath11k_ahb_config_irq+0xf8/0x414 [ath11k_ahb])
[ 971.759164] (ath11k_ahb_config_irq [ath11k_ahb]) from [<7f12135c>] (ath11k_ahb_probe+0x40c/0x51c [ath11k_ahb])
[ 971.768567] (ath11k_ahb_probe [ath11k_ahb]) from [<80666864>] (platform_drv_probe+0x48/0x94)
[ 971.779670] (platform_drv_probe) from [<80664718>] (really_probe+0x1c8/0x450)
[ 971.789389] (really_probe) from [<80664cc4>] (driver_probe_device+0x15c/0x1b8)
[ 971.797547] (driver_probe_device) from [<80664f60>] (device_driver_attach+0x44/0x60)
[ 971.805795] (device_driver_attach) from [<806650a0>] (__driver_attach+0x124/0x140)
[ 971.814822] (__driver_attach) from [<80662adc>] (bus_for_each_dev+0x58/0xa4)
[ 971.823328] (bus_for_each_dev) from [<80663a2c>] (bus_add_driver+0xf0/0x1e8)
[ 971.831662] (bus_add_driver) from [<806658a4>] (driver_register+0xa8/0xf0)
[ 971.839822] (driver_register) from [<8030269c>] (do_one_initcall+0x78/0x1ac)
[ 971.847638] (do_one_initcall) from [<80392524>] (do_init_module+0x54/0x200)
[ 971.855968] (do_init_module) from [<803945b0>] (load_module+0x1e30/0x1ffc)
[ 971.864126] (load_module) from [<803948b0>] (sys_init_module+0x134/0x17c)
[ 971.871852] (sys_init_module) from [<80301000>] (ret_fast_syscall+0x0/0x50)
Tested-on: IPQ8074 hw2.0 AHB WLAN.HK.2.6.0.1-00760-QCAHKSWPL_SILICONZ-1
Signed-off-by: Venkateswara Naralasetty <quic_vnaralas@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/1642583973-21599-1-git-send-email-quic_vnaralas@quicinc.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 564d4eceb97eaf381dd6ef6470b06377bb50c95a ]
The bug was found during fuzzing. Stacktrace locates it in
ath5k_eeprom_convert_pcal_info_5111.
When none of the curve is selected in the loop, idx can go
up to AR5K_EEPROM_N_PD_CURVES. The line makes pd out of bound.
pd = &chinfo[pier].pd_curves[idx];
There are many OOB writes using pd later in the code. So I
added a sanity check for idx. Checks for other loops involving
AR5K_EEPROM_N_PD_CURVES are not needed as the loop index is not
used outside the loops.
The patch is NOT tested with real device.
The following is the fuzzing report
BUG: KASAN: slab-out-of-bounds in ath5k_eeprom_read_pcal_info_5111+0x126a/0x1390 [ath5k]
Write of size 1 at addr ffff8880174a4d60 by task modprobe/214
CPU: 0 PID: 214 Comm: modprobe Not tainted 5.6.0 #1
Call Trace:
dump_stack+0x76/0xa0
print_address_description.constprop.0+0x16/0x200
? ath5k_eeprom_read_pcal_info_5111+0x126a/0x1390 [ath5k]
? ath5k_eeprom_read_pcal_info_5111+0x126a/0x1390 [ath5k]
__kasan_report.cold+0x37/0x7c
? ath5k_eeprom_read_pcal_info_5111+0x126a/0x1390 [ath5k]
kasan_report+0xe/0x20
ath5k_eeprom_read_pcal_info_5111+0x126a/0x1390 [ath5k]
? apic_timer_interrupt+0xa/0x20
? ath5k_eeprom_init_11a_pcal_freq+0xbc0/0xbc0 [ath5k]
? ath5k_pci_eeprom_read+0x228/0x3c0 [ath5k]
ath5k_eeprom_init+0x2513/0x6290 [ath5k]
? ath5k_eeprom_init_11a_pcal_freq+0xbc0/0xbc0 [ath5k]
? usleep_range+0xb8/0x100
? apic_timer_interrupt+0xa/0x20
? ath5k_eeprom_read_pcal_info_2413+0x2f20/0x2f20 [ath5k]
ath5k_hw_init+0xb60/0x1970 [ath5k]
ath5k_init_ah+0x6fe/0x2530 [ath5k]
? kasprintf+0xa6/0xe0
? ath5k_stop+0x140/0x140 [ath5k]
? _dev_notice+0xf6/0xf6
? apic_timer_interrupt+0xa/0x20
ath5k_pci_probe.cold+0x29a/0x3d6 [ath5k]
? ath5k_pci_eeprom_read+0x3c0/0x3c0 [ath5k]
? mutex_lock+0x89/0xd0
? ath5k_pci_eeprom_read+0x3c0/0x3c0 [ath5k]
local_pci_probe+0xd3/0x160
pci_device_probe+0x23f/0x3e0
? pci_device_remove+0x280/0x280
? pci_device_remove+0x280/0x280
really_probe+0x209/0x5d0
Reported-by: Brendan Dolan-Gavitt <brendandg@nyu.edu>
Signed-off-by: Zekun Shen <bruceshenzk@gmail.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/YckvDdj3mtCkDRIt@a-10-27-26-18.dynapool.vpn.nyu.edu
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 9747a78d5f758a5284751a10aee13c30d02bd5f1 ]
The device_node pointer is returned by of_parse_phandle() with refcount
incremented. We should use of_node_put() on it when done.
This function only calls of_node_put() in the regular path.
And it will cause refcount leak in error path.
Fixes: 727fec790ead ("ath10k: Setup the msa resources before qmi init")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Reviewed-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20220308070238.19295-1-linmq006@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d1e0df1c57bd30871dd1c855742a7c346dbca853 ]
Syzbot reported 2 KMSAN bugs in ath9k. All of them are caused by missing
field initialization.
In htc_connect_service() svc_meta_len and pad are not initialized. Based
on code it looks like in current skb there is no service data, so simply
initialize svc_meta_len to 0.
htc_issue_send() does not initialize htc_frame_hdr::control array. Based
on firmware code, it will initialize it by itself, so simply zero whole
array to make KMSAN happy
Fail logs:
BUG: KMSAN: kernel-usb-infoleak in usb_submit_urb+0x6c1/0x2aa0 drivers/usb/core/urb.c:430
usb_submit_urb+0x6c1/0x2aa0 drivers/usb/core/urb.c:430
hif_usb_send_regout drivers/net/wireless/ath/ath9k/hif_usb.c:127 [inline]
hif_usb_send+0x5f0/0x16f0 drivers/net/wireless/ath/ath9k/hif_usb.c:479
htc_issue_send drivers/net/wireless/ath/ath9k/htc_hst.c:34 [inline]
htc_connect_service+0x143e/0x1960 drivers/net/wireless/ath/ath9k/htc_hst.c:275
...
Uninit was created at:
slab_post_alloc_hook mm/slab.h:524 [inline]
slab_alloc_node mm/slub.c:3251 [inline]
__kmalloc_node_track_caller+0xe0c/0x1510 mm/slub.c:4974
kmalloc_reserve net/core/skbuff.c:354 [inline]
__alloc_skb+0x545/0xf90 net/core/skbuff.c:426
alloc_skb include/linux/skbuff.h:1126 [inline]
htc_connect_service+0x1029/0x1960 drivers/net/wireless/ath/ath9k/htc_hst.c:258
...
Bytes 4-7 of 18 are uninitialized
Memory access of size 18 starts at ffff888027377e00
BUG: KMSAN: kernel-usb-infoleak in usb_submit_urb+0x6c1/0x2aa0 drivers/usb/core/urb.c:430
usb_submit_urb+0x6c1/0x2aa0 drivers/usb/core/urb.c:430
hif_usb_send_regout drivers/net/wireless/ath/ath9k/hif_usb.c:127 [inline]
hif_usb_send+0x5f0/0x16f0 drivers/net/wireless/ath/ath9k/hif_usb.c:479
htc_issue_send drivers/net/wireless/ath/ath9k/htc_hst.c:34 [inline]
htc_connect_service+0x143e/0x1960 drivers/net/wireless/ath/ath9k/htc_hst.c:275
...
Uninit was created at:
slab_post_alloc_hook mm/slab.h:524 [inline]
slab_alloc_node mm/slub.c:3251 [inline]
__kmalloc_node_track_caller+0xe0c/0x1510 mm/slub.c:4974
kmalloc_reserve net/core/skbuff.c:354 [inline]
__alloc_skb+0x545/0xf90 net/core/skbuff.c:426
alloc_skb include/linux/skbuff.h:1126 [inline]
htc_connect_service+0x1029/0x1960 drivers/net/wireless/ath/ath9k/htc_hst.c:258
...
Bytes 16-17 of 18 are uninitialized
Memory access of size 18 starts at ffff888027377e00
Fixes: fb9987d0f748 ("ath9k_htc: Support for AR9271 chipset.")
Reported-by: syzbot+f83a1df1ed4f67e8d8ad@syzkaller.appspotmail.com
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20220115122733.11160-1-paskripkin@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit e3fb3d4418fce5484dfe7995fcd94c18b10a431a ]
In function ath10k_wow_convert_8023_to_80211(), it will do memcpy for
the new->pattern, and currently the new->pattern and new->mask is same
with the old, then the memcpy of new->pattern will also overwrite the
old->pattern, because the header format of new->pattern is 802.11,
its length is larger than the old->pattern which is 802.3. Then the
operation of "Copy frame body" will copy a mistake value because the
body memory has been overwrite when memcpy the new->pattern.
Assign another empty value to new_pattern to avoid the overwrite issue.
Tested-on: QCA6174 hw3.2 SDIO WLAN.RMH.4.4.1-00049
Fixes: fa3440fa2fa1 ("ath10k: convert wow pattern from 802.3 to 802.11")
Signed-off-by: Wen Gong <quic_wgong@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20211222031347.25463-1-quic_wgong@quicinc.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 02a95374b5eebdbd3b6413fd7ddec151d2ea75a1 upstream.
Currently tx_params is being re-assigned with a new value and the
previous setting IEEE80211_HT_MCS_TX_RX_DIFF is being overwritten.
The assignment operator is incorrect, the original intent was to
bit-wise or the value in. Fix this by replacing the = operator
with |= instead.
Kudos to Christian Lamparter for suggesting the correct fix.
Fixes: fe8ee9ad80b2 ("carl9170: mac80211 glue and command interface")
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Cc: <Stable@vger.kernel.org>
Acked-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20220125004406.344422-1-colin.i.king@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 98d504a82cc75840bec8e3c6ae0e4f411921962b upstream.
The spread of capability between the three WiFi silicon parts wcn36xx
supports is:
wcn3620 - 802.11 a/b/g
wcn3660 - 802.11 a/b/g/n
wcn3680 - 802.11 a/b/g/n/ac
We currently treat wcn3660 as wcn3620 thus limiting it to 2GHz channels.
Fix this regression by ensuring we differentiate between all three parts.
Fixes: 8490987bdb9a ("wcn36xx: Hook and identify RF_IRIS_WCN3680")
Cc: stable@vger.kernel.org
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Reviewed-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20220125004046.4058284-1-bryan.odonoghue@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 1ec7ed5163c70a0d040150d2279f932c7e7c143f upstream.
This reverts commit 2dc016599cfa9672a147528ca26d70c3654a5423.
Users are reporting regressions in regulatory domain detection and
channel availability.
The problem this was trying to resolve was fixed in firmware anyway:
QCA6174 hw3.0: sdio-4.4.1: add firmware.bin_WLAN.RMH.4.4.1-00042
https://github.com/kvalo/ath10k-firmware/commit/4d382787f0efa77dba40394e0bc604f8eff82552
Link: https://bbs.archlinux.org/viewtopic.php?id=254535
Link: http://lists.infradead.org/pipermail/ath10k/2020-April/014871.html
Link: http://lists.infradead.org/pipermail/ath10k/2020-May/015152.html
Link: https://lore.kernel.org/all/1c160dfb-6ccc-b4d6-76f6-4364e0adb6dd@reox.at/
Fixes: 2dc016599cfa ("ath: add support for special 0x0 regulatory domain")
Cc: <stable@vger.kernel.org>
Cc: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20200527165718.129307-1-briannorris@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 09b8cd69edcf2be04a781e1781e98e52a775c9ad upstream.
On an imx6dl-pico-pi board with a QCA9377 SDIO chip, simply trying to
connect via ssh to another machine causes:
[ 55.824159] ath10k_sdio mmc1:0001:1: failed to transmit packet, dropping: -12
[ 55.832169] ath10k_sdio mmc1:0001:1: failed to submit frame: -12
[ 55.838529] ath10k_sdio mmc1:0001:1: failed to push frame: -12
[ 55.905863] ath10k_sdio mmc1:0001:1: failed to transmit packet, dropping: -12
[ 55.913650] ath10k_sdio mmc1:0001:1: failed to submit frame: -12
[ 55.919887] ath10k_sdio mmc1:0001:1: failed to push frame: -12
, leading to an ssh connection failure.
One user inspected the size of frames on Wireshark and reported
the followig:
"I was able to narrow the issue down to the mtu. If I set the mtu for
the wlan0 device to 1486 instead of 1500, the issue does not happen.
The size of frames that I see on Wireshark is exactly 1500 after
setting it to 1486."
Clearing the HI_ACS_FLAGS_ALT_DATA_CREDIT_SIZE avoids the problem and
the ssh command works successfully after that.
Introduce a 'credit_size_workaround' field to ath10k_hw_params for
the QCA9377 SDIO, so that the HI_ACS_FLAGS_ALT_DATA_CREDIT_SIZE
is not set in this case.
Tested with QCA9377 SDIO with firmware WLAN.TF.1.1.1-00061-QCATFSWPZ-1.
Fixes: 2f918ea98606 ("ath10k: enable alt data of TX path for sdio")
Signed-off-by: Fabio Estevam <festevam@denx.de>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20211124131047.713756-1-festevam@denx.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit d943fdad7589653065be0e20aadc6dff37725ed4 ]
Similar to the same bug in ath10k, a napi disable w/out it being enabled
will hang forever. I believe I saw this while trying rmmod after driver
had some failure on startup. Fix it by keeping state on whether napi is
enabled or not.
And, remove un-used napi pointer in ath11k driver base struct.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20200903195254.29379-1-greearb@candelatech.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 6ce708f54cc8d73beca213cec66ede5ce100a781 ]
Large pkt_len can lead to out-out-bound memcpy. Current
ath9k_hif_usb_rx_stream allows combining the content of two urb
inputs to one pkt. The first input can indicate the size of the
pkt. Any remaining size is saved in hif_dev->rx_remain_len.
While processing the next input, memcpy is used with rx_remain_len.
4-byte pkt_len can go up to 0xffff, while a single input is 0x4000
maximum in size (MAX_RX_BUF_SIZE). Thus, the patch adds a check for
pkt_len which must not exceed 2 * MAX_RX_BUG_SIZE.
BUG: KASAN: slab-out-of-bounds in ath9k_hif_usb_rx_cb+0x490/0xed7 [ath9k_htc]
Read of size 46393 at addr ffff888018798000 by task kworker/0:1/23
CPU: 0 PID: 23 Comm: kworker/0:1 Not tainted 5.6.0 #63
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
Workqueue: events request_firmware_work_func
Call Trace:
<IRQ>
dump_stack+0x76/0xa0
print_address_description.constprop.0+0x16/0x200
? ath9k_hif_usb_rx_cb+0x490/0xed7 [ath9k_htc]
? ath9k_hif_usb_rx_cb+0x490/0xed7 [ath9k_htc]
__kasan_report.cold+0x37/0x7c
? ath9k_hif_usb_rx_cb+0x490/0xed7 [ath9k_htc]
kasan_report+0xe/0x20
check_memory_region+0x15a/0x1d0
memcpy+0x20/0x50
ath9k_hif_usb_rx_cb+0x490/0xed7 [ath9k_htc]
? hif_usb_mgmt_cb+0x2d9/0x2d9 [ath9k_htc]
? _raw_spin_lock_irqsave+0x7b/0xd0
? _raw_spin_trylock_bh+0x120/0x120
? __usb_unanchor_urb+0x12f/0x210
__usb_hcd_giveback_urb+0x1e4/0x380
usb_giveback_urb_bh+0x241/0x4f0
? __hrtimer_run_queues+0x316/0x740
? __usb_hcd_giveback_urb+0x380/0x380
tasklet_action_common.isra.0+0x135/0x330
__do_softirq+0x18c/0x634
irq_exit+0x114/0x140
smp_apic_timer_interrupt+0xde/0x380
apic_timer_interrupt+0xf/0x20
I found the bug using a custome USBFuzz port. It's a research work
to fuzz USB stack/drivers. I modified it to fuzz ath9k driver only,
providing hand-crafted usb descriptors to QEMU.
After fixing the value of pkt_tag to ATH_USB_RX_STREAM_MODE_TAG in QEMU
emulation, I found the KASAN report. The bug is triggerable whenever
pkt_len is above two MAX_RX_BUG_SIZE. I used the same input that crashes
to test the driver works when applying the patch.
Signed-off-by: Zekun Shen <bruceshenzk@gmail.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/YXsidrRuK6zBJicZ@10-18-43-117.dynapool.wireless.nyu.edu
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 767c94caf0efad136157110787fe221b74cb5c8a ]
With CONFIG_LOCKDEP=y and CONFIG_DEBUG_SPINLOCK=y, lockdep reports
below warning:
[ 166.059415] ============================================
[ 166.059416] WARNING: possible recursive locking detected
[ 166.059418] 5.15.0-wt-ath+ #10 Tainted: G W O
[ 166.059420] --------------------------------------------
[ 166.059421] kworker/0:2/116 is trying to acquire lock:
[ 166.059423] ffff9905f2083160 (&srng->lock){+.-.}-{2:2}, at: ath11k_hal_reo_cmd_send+0x20/0x490 [ath11k]
[ 166.059440]
but task is already holding lock:
[ 166.059442] ffff9905f2083230 (&srng->lock){+.-.}-{2:2}, at: ath11k_dp_process_reo_status+0x95/0x2d0 [ath11k]
[ 166.059491]
other info that might help us debug this:
[ 166.059492] Possible unsafe locking scenario:
[ 166.059493] CPU0
[ 166.059494] ----
[ 166.059495] lock(&srng->lock);
[ 166.059498] lock(&srng->lock);
[ 166.059500]
*** DEADLOCK ***
[ 166.059501] May be due to missing lock nesting notation
[ 166.059502] 3 locks held by kworker/0:2/116:
[ 166.059504] #0: ffff9905c0081548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x1f6/0x660
[ 166.059511] #1: ffff9d2400a5fe68 ((debug_obj_work).work){+.+.}-{0:0}, at: process_one_work+0x1f6/0x660
[ 166.059517] #2: ffff9905f2083230 (&srng->lock){+.-.}-{2:2}, at: ath11k_dp_process_reo_status+0x95/0x2d0 [ath11k]
[ 166.059532]
stack backtrace:
[ 166.059534] CPU: 0 PID: 116 Comm: kworker/0:2 Kdump: loaded Tainted: G W O 5.15.0-wt-ath+ #10
[ 166.059537] Hardware name: Intel(R) Client Systems NUC8i7HVK/NUC8i7HVB, BIOS HNKBLi70.86A.0059.2019.1112.1124 11/12/2019
[ 166.059539] Workqueue: events free_obj_work
[ 166.059543] Call Trace:
[ 166.059545] <IRQ>
[ 166.059547] dump_stack_lvl+0x56/0x7b
[ 166.059552] __lock_acquire+0xb9a/0x1a50
[ 166.059556] lock_acquire+0x1e2/0x330
[ 166.059560] ? ath11k_hal_reo_cmd_send+0x20/0x490 [ath11k]
[ 166.059571] _raw_spin_lock_bh+0x33/0x70
[ 166.059574] ? ath11k_hal_reo_cmd_send+0x20/0x490 [ath11k]
[ 166.059584] ath11k_hal_reo_cmd_send+0x20/0x490 [ath11k]
[ 166.059594] ath11k_dp_tx_send_reo_cmd+0x3f/0x130 [ath11k]
[ 166.059605] ath11k_dp_rx_tid_del_func+0x221/0x370 [ath11k]
[ 166.059618] ath11k_dp_process_reo_status+0x22f/0x2d0 [ath11k]
[ 166.059632] ? ath11k_dp_service_srng+0x2ea/0x2f0 [ath11k]
[ 166.059643] ath11k_dp_service_srng+0x2ea/0x2f0 [ath11k]
[ 166.059655] ath11k_pci_ext_grp_napi_poll+0x1c/0x70 [ath11k_pci]
[ 166.059659] __napi_poll+0x28/0x230
[ 166.059664] net_rx_action+0x285/0x310
[ 166.059668] __do_softirq+0xe6/0x4d2
[ 166.059672] irq_exit_rcu+0xd2/0xf0
[ 166.059675] common_interrupt+0xa5/0xc0
[ 166.059678] </IRQ>
[ 166.059679] <TASK>
[ 166.059680] asm_common_interrupt+0x1e/0x40
[ 166.059683] RIP: 0010:_raw_spin_unlock_irqrestore+0x38/0x70
[ 166.059686] Code: 83 c7 18 e8 2a 95 43 ff 48 89 ef e8 22 d2 43 ff 81 e3 00 02 00 00 75 25 9c 58 f6 c4 02 75 2d 48 85 db 74 01 fb bf 01 00 00 00 <e8> 63 2e 40 ff 65 8b 05 8c 59 97 5c 85 c0 74 0a 5b 5d c3 e8 00 6a
[ 166.059689] RSP: 0018:ffff9d2400a5fca0 EFLAGS: 00000206
[ 166.059692] RAX: 0000000000000002 RBX: 0000000000000200 RCX: 0000000000000006
[ 166.059694] RDX: 0000000000000000 RSI: ffffffffa404879b RDI: 0000000000000001
[ 166.059696] RBP: ffff9905c0053000 R08: 0000000000000001 R09: 0000000000000001
[ 166.059698] R10: ffff9d2400a5fc50 R11: 0000000000000001 R12: ffffe186c41e2840
[ 166.059700] R13: 0000000000000001 R14: ffff9905c78a1c68 R15: 0000000000000001
[ 166.059704] free_debug_processing+0x257/0x3d0
[ 166.059708] ? free_obj_work+0x1f5/0x250
[ 166.059712] __slab_free+0x374/0x5a0
[ 166.059718] ? kmem_cache_free+0x2e1/0x370
[ 166.059721] ? free_obj_work+0x1f5/0x250
[ 166.059724] kmem_cache_free+0x2e1/0x370
[ 166.059727] free_obj_work+0x1f5/0x250
[ 166.059731] process_one_work+0x28b/0x660
[ 166.059735] ? process_one_work+0x660/0x660
[ 166.059738] worker_thread+0x37/0x390
[ 166.059741] ? process_one_work+0x660/0x660
[ 166.059743] kthread+0x176/0x1a0
[ 166.059746] ? set_kthread_struct+0x40/0x40
[ 166.059749] ret_from_fork+0x22/0x30
[ 166.059754] </TASK>
Since these two lockes are both initialized in ath11k_hal_srng_setup,
they are assigned with the same key. As a result lockdep suspects that
the task is trying to acquire the same lock (due to same key) while
already holding it, and thus reports the DEADLOCK warning. However as
they are different spinlock instances, the warning is false positive.
On the other hand, even no dead lock indeed, this is a major issue for
upstream regression testing as it disables lockdep functionality.
Fix it by assigning separate lock class key for each srng->lock.
Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-01720.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1
Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20211209011949.151472-1-quic_bqiang@quicinc.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit e8a91863eba3966a447d2daa1526082d52b5db2a ]
While running stress tests in roaming scenarios (switching ap's every 5
seconds, we discovered a issue which leads to tx hangings of exactly 5
seconds while or after scanning for new accesspoints. We found out that
this hanging is triggered by ath10k_mac_wait_tx_complete since the
empty_tx_wq was not wake when the num_tx_pending counter reaches zero.
To fix this, we simply move the wake_up call to htt_tx_dec_pending,
since this call was missed on several locations within the ath10k code.
Signed-off-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20210505085806.11474-1-s.gottschall@dd-wrt.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit ed05c7cf1286d7e31e7623bce55ff135723591bf ]
When enable debug config, it print below warning while shut down wlan
interface shuh as run "ifconfig wlan0 down".
The reason is because ar->regd_update_work is ran once, and it is will
call wiphy_lock(ar->hw->wiphy) in function ath11k_regd_update() which
is running in workqueue of ieee80211_local queued by ieee80211_queue_work().
Another thread from "ifconfig wlan0 down" will also accuqire the lock
by wiphy_lock(sdata->local->hw.wiphy) in function ieee80211_stop(), and
then it call ieee80211_stop_device() to flush_workqueue(local->workqueue),
this will wait the workqueue of ieee80211_local finished. Then deadlock
will happen easily if the two thread run meanwhile.
Below warning disappeared after this change.
[ 914.088798] ath11k_pci 0000:05:00.0: mac remove interface (vdev 0)
[ 914.088806] ath11k_pci 0000:05:00.0: mac stop 11d scan
[ 914.088810] ath11k_pci 0000:05:00.0: mac stop 11d vdev id 0
[ 914.088827] ath11k_pci 0000:05:00.0: htc ep 2 consumed 1 credits (total 0)
[ 914.088841] ath11k_pci 0000:05:00.0: send 11d scan stop vdev id 0
[ 914.088849] ath11k_pci 0000:05:00.0: htc insufficient credits ep 2 required 1 available 0
[ 914.088856] ath11k_pci 0000:05:00.0: htc insufficient credits ep 2 required 1 available 0
[ 914.096434] ath11k_pci 0000:05:00.0: rx ce pipe 2 len 16
[ 914.096442] ath11k_pci 0000:05:00.0: htc ep 2 got 1 credits (total 1)
[ 914.096481] ath11k_pci 0000:05:00.0: htc ep 2 consumed 1 credits (total 0)
[ 914.096491] ath11k_pci 0000:05:00.0: WMI vdev delete id 0
[ 914.111598] ath11k_pci 0000:05:00.0: rx ce pipe 2 len 16
[ 914.111628] ath11k_pci 0000:05:00.0: htc ep 2 got 1 credits (total 1)
[ 914.114659] ath11k_pci 0000:05:00.0: rx ce pipe 2 len 20
[ 914.114742] ath11k_pci 0000:05:00.0: htc rx completion ep 2 skb pK-error
[ 914.115977] ath11k_pci 0000:05:00.0: vdev delete resp for vdev id 0
[ 914.116685] ath11k_pci 0000:05:00.0: vdev 00:03:7f:29:61:11 deleted, vdev_id 0
[ 914.117583] ======================================================
[ 914.117592] WARNING: possible circular locking dependency detected
[ 914.117600] 5.16.0-rc1-wt-ath+ #1 Tainted: G OE
[ 914.117611] ------------------------------------------------------
[ 914.117618] ifconfig/2805 is trying to acquire lock:
[ 914.117628] ffff9c00a62bb548 ((wq_completion)phy0){+.+.}-{0:0}, at: flush_workqueue+0x87/0x470
[ 914.117674]
but task is already holding lock:
[ 914.117682] ffff9c00baea07d0 (&rdev->wiphy.mtx){+.+.}-{4:4}, at: ieee80211_stop+0x38/0x180 [mac80211]
[ 914.117872]
which lock already depends on the new lock.
[ 914.117880]
the existing dependency chain (in reverse order) is:
[ 914.117888]
-> #3 (&rdev->wiphy.mtx){+.+.}-{4:4}:
[ 914.117910] __mutex_lock+0xa0/0x9c0
[ 914.117930] mutex_lock_nested+0x1b/0x20
[ 914.117944] reg_process_self_managed_hints+0x3a/0xb0 [cfg80211]
[ 914.118093] wiphy_regulatory_register+0x47/0x80 [cfg80211]
[ 914.118229] wiphy_register+0x84f/0x9c0 [cfg80211]
[ 914.118353] ieee80211_register_hw+0x6b1/0xd90 [mac80211]
[ 914.118486] ath11k_mac_register+0x6af/0xb60 [ath11k]
[ 914.118550] ath11k_core_qmi_firmware_ready+0x383/0x4a0 [ath11k]
[ 914.118598] ath11k_qmi_driver_event_work+0x347/0x4a0 [ath11k]
[ 914.118656] process_one_work+0x228/0x670
[ 914.118669] worker_thread+0x4d/0x440
[ 914.118680] kthread+0x16d/0x1b0
[ 914.118697] ret_from_fork+0x22/0x30
[ 914.118714]
-> #2 (rtnl_mutex){+.+.}-{4:4}:
[ 914.118736] __mutex_lock+0xa0/0x9c0
[ 914.118751] mutex_lock_nested+0x1b/0x20
[ 914.118767] rtnl_lock+0x17/0x20
[ 914.118783] ath11k_regd_update+0x15a/0x260 [ath11k]
[ 914.118841] ath11k_regd_update_work+0x15/0x20 [ath11k]
[ 914.118897] process_one_work+0x228/0x670
[ 914.118909] worker_thread+0x4d/0x440
[ 914.118920] kthread+0x16d/0x1b0
[ 914.118934] ret_from_fork+0x22/0x30
[ 914.118948]
-> #1 ((work_completion)(&ar->regd_update_work)){+.+.}-{0:0}:
[ 914.118972] process_one_work+0x1fa/0x670
[ 914.118984] worker_thread+0x4d/0x440
[ 914.118996] kthread+0x16d/0x1b0
[ 914.119010] ret_from_fork+0x22/0x30
[ 914.119023]
-> #0 ((wq_completion)phy0){+.+.}-{0:0}:
[ 914.119045] __lock_acquire+0x146d/0x1cf0
[ 914.119057] lock_acquire+0x19b/0x360
[ 914.119067] flush_workqueue+0xae/0x470
[ 914.119084] ieee80211_stop_device+0x3b/0x50 [mac80211]
[ 914.119260] ieee80211_do_stop+0x5d7/0x830 [mac80211]
[ 914.119409] ieee80211_stop+0x45/0x180 [mac80211]
[ 914.119557] __dev_close_many+0xb3/0x120
[ 914.119573] __dev_change_flags+0xc3/0x1d0
[ 914.119590] dev_change_flags+0x29/0x70
[ 914.119605] devinet_ioctl+0x653/0x810
[ 914.119620] inet_ioctl+0x193/0x1e0
[ 914.119631] sock_do_ioctl+0x4d/0xf0
[ 914.119649] sock_ioctl+0x262/0x340
[ 914.119665] __x64_sys_ioctl+0x96/0xd0
[ 914.119678] do_syscall_64+0x3d/0xd0
[ 914.119694] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 914.119709]
other info that might help us debug this:
[ 914.119717] Chain exists of:
(wq_completion)phy0 --> rtnl_mutex --> &rdev->wiphy.mtx
[ 914.119745] Possible unsafe locking scenario:
[ 914.119752] CPU0 CPU1
[ 914.119758] ---- ----
[ 914.119765] lock(&rdev->wiphy.mtx);
[ 914.119778] lock(rtnl_mutex);
[ 914.119792] lock(&rdev->wiphy.mtx);
[ 914.119807] lock((wq_completion)phy0);
[ 914.119819]
*** DEADLOCK ***
[ 914.119827] 2 locks held by ifconfig/2805:
[ 914.119837] #0: ffffffffba3dc010 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_lock+0x17/0x20
[ 914.119872] #1: ffff9c00baea07d0 (&rdev->wiphy.mtx){+.+.}-{4:4}, at: ieee80211_stop+0x38/0x180 [mac80211]
[ 914.120039]
stack backtrace:
[ 914.120048] CPU: 0 PID: 2805 Comm: ifconfig Tainted: G OE 5.16.0-rc1-wt-ath+ #1
[ 914.120064] Hardware name: LENOVO 418065C/418065C, BIOS 83ET63WW (1.33 ) 07/29/2011
[ 914.120074] Call Trace:
[ 914.120084] <TASK>
[ 914.120094] dump_stack_lvl+0x73/0xa4
[ 914.120119] dump_stack+0x10/0x12
[ 914.120135] print_circular_bug.isra.44+0x221/0x2e0
[ 914.120165] check_noncircular+0x106/0x150
[ 914.120203] __lock_acquire+0x146d/0x1cf0
[ 914.120215] ? __lock_acquire+0x146d/0x1cf0
[ 914.120245] lock_acquire+0x19b/0x360
[ 914.120259] ? flush_workqueue+0x87/0x470
[ 914.120286] ? lockdep_init_map_type+0x6b/0x250
[ 914.120310] flush_workqueue+0xae/0x470
[ 914.120327] ? flush_workqueue+0x87/0x470
[ 914.120344] ? lockdep_hardirqs_on+0xd7/0x150
[ 914.120391] ieee80211_stop_device+0x3b/0x50 [mac80211]
[ 914.120565] ? ieee80211_stop_device+0x3b/0x50 [mac80211]
[ 914.120736] ieee80211_do_stop+0x5d7/0x830 [mac80211]
[ 914.120906] ieee80211_stop+0x45/0x180 [mac80211]
[ 914.121060] __dev_close_many+0xb3/0x120
[ 914.121081] __dev_change_flags+0xc3/0x1d0
[ 914.121109] dev_change_flags+0x29/0x70
[ 914.121131] devinet_ioctl+0x653/0x810
[ 914.121149] ? __might_fault+0x77/0x80
[ 914.121179] inet_ioctl+0x193/0x1e0
[ 914.121194] ? inet_ioctl+0x193/0x1e0
[ 914.121218] ? __might_fault+0x77/0x80
[ 914.121238] ? _copy_to_user+0x68/0x80
[ 914.121266] sock_do_ioctl+0x4d/0xf0
[ 914.121283] ? inet_stream_connect+0x60/0x60
[ 914.121297] ? sock_do_ioctl+0x4d/0xf0
[ 914.121329] sock_ioctl+0x262/0x340
[ 914.121347] ? sock_ioctl+0x262/0x340
[ 914.121362] ? exit_to_user_mode_prepare+0x13b/0x280
[ 914.121388] ? syscall_enter_from_user_mode+0x20/0x50
[ 914.121416] __x64_sys_ioctl+0x96/0xd0
[ 914.121430] ? br_ioctl_call+0x90/0x90
[ 914.121445] ? __x64_sys_ioctl+0x96/0xd0
[ 914.121465] do_syscall_64+0x3d/0xd0
[ 914.121482] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 914.121497] RIP: 0033:0x7f0ed051737b
[ 914.121513] Code: 0f 1e fa 48 8b 05 15 3b 0d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e5 3a 0d 00 f7 d8 64 89 01 48
[ 914.121527] RSP: 002b:00007fff7be38b98 EFLAGS: 00000202 ORIG_RAX: 0000000000000010
[ 914.121544] RAX: ffffffffffffffda RBX: 00007fff7be38ba0 RCX: 00007f0ed051737b
[ 914.121555] RDX: 00007fff7be38ba0 RSI: 0000000000008914 RDI: 0000000000000004
[ 914.121566] RBP: 00007fff7be38c60 R08: 000000000000000a R09: 0000000000000001
[ 914.121576] R10: 0000000000000000 R11: 0000000000000202 R12: 00000000fffffffe
[ 914.121586] R13: 0000000000000004 R14: 0000000000000000 R15: 0000000000000000
[ 914.121620] </TASK>
Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-01720.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1
Signed-off-by: Wen Gong <quic_wgong@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20211201071745.17746-2-quic_wgong@quicinc.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit a93789ae541c7d5c1c2a4942013adb6bcc5e2848 ]
Currently 'ar' reference is not added in skb_cb during
WMI mgmt tx. Though this is generally not used during tx completion
callbacks, on interface removal the remaining idr cleanup callback
uses the ar ptr from skb_cb from mgmt txmgmt_idr. Hence
fill them during tx call for proper usage.
Also free the skb which is missing currently in these
callbacks.
Crash_info:
[19282.489476] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[19282.489515] pgd = 91eb8000
[19282.496702] [00000000] *pgd=00000000
[19282.502524] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
[19282.783728] PC is at ath11k_mac_vif_txmgmt_idr_remove+0x28/0xd8 [ath11k]
[19282.789170] LR is at idr_for_each+0xa0/0xc8
Tested-on: IPQ8074 hw2.0 AHB WLAN.HK.2.5.0.1-00729-QCAHKSWPL_SILICONZ-3 v2
Signed-off-by: Sriram R <quic_srirrama@quicinc.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1637832614-13831-1-git-send-email-quic_srirrama@quicinc.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 273703ebdb01b6c5f1aaf4b98fb57b177609055c ]
Commit 31582373a4a8 ("ath11k: Change number of TCL rings to one for
QCA6390") avoids initializing the other entries of dp->tx_ring cause
the corresponding TX rings on QCA6390/WCN6855 are not used, but leaves
those ring masks in ath11k_hw_ring_mask_qca6390.tx unchanged. Normally
this is OK because we will only get interrupts from the first TX ring
on these chips and thus only the first entry of dp->tx_ring is involved.
In case of one MSI vector, all DP rings share the same IRQ. For each
interrupt, all rings have to be checked, which means the other entries
of dp->tx_ring are involved. However since they are not initialized,
system crashes.
Fix this issue by simply removing those ring masks.
crash stack:
[ 102.907438] BUG: kernel NULL pointer dereference, address: 0000000000000028
[ 102.907447] #PF: supervisor read access in kernel mode
[ 102.907451] #PF: error_code(0x0000) - not-present page
[ 102.907453] PGD 1081f0067 P4D 1081f0067 PUD 1081f1067 PMD 0
[ 102.907460] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC NOPTI
[ 102.907465] CPU: 0 PID: 3511 Comm: apt-check Kdump: loaded Tainted: G E 5.15.0-rc4-wt-ath+ #20
[ 102.907470] Hardware name: AMD Celadon-RN/Celadon-RN, BIOS RCD1005E 10/08/2020
[ 102.907472] RIP: 0010:ath11k_dp_tx_completion_handler+0x201/0x830 [ath11k]
[ 102.907497] Code: 3c 24 4e 8d ac 37 10 04 00 00 4a 8d bc 37 68 04 00 00 48 89 3c 24 48 63 c8 89 83 84 18 00 00 48 c1 e1 05 48 03 8b 78 18 00 00 <8b> 51 08 89 d6 83 e6 07 89 74 24 24 83 fe 03 74 04 85 f6 75 63 41
[ 102.907501] RSP: 0000:ffff9b7340003e08 EFLAGS: 00010202
[ 102.907505] RAX: 0000000000000001 RBX: ffff8e21530c0100 RCX: 0000000000000020
[ 102.907508] RDX: 0000000000000000 RSI: 00000000fffffe00 RDI: ffff8e21530c1938
[ 102.907511] RBP: ffff8e21530c0000 R08: 0000000000000001 R09: 0000000000000000
[ 102.907513] R10: ffff8e2145534c10 R11: 0000000000000001 R12: ffff8e21530c2938
[ 102.907515] R13: ffff8e21530c18e0 R14: 0000000000000100 R15: ffff8e21530c2978
[ 102.907518] FS: 00007f5d4297e740(0000) GS:ffff8e243d600000(0000) knlGS:0000000000000000
[ 102.907521] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 102.907524] CR2: 0000000000000028 CR3: 00000001034ea000 CR4: 0000000000350ef0
[ 102.907527] Call Trace:
[ 102.907531] <IRQ>
[ 102.907537] ath11k_dp_service_srng+0x5c/0x2f0 [ath11k]
[ 102.907556] ath11k_pci_ext_grp_napi_poll+0x21/0x70 [ath11k_pci]
[ 102.907562] __napi_poll+0x2c/0x160
[ 102.907570] net_rx_action+0x251/0x310
[ 102.907576] __do_softirq+0x107/0x2fc
[ 102.907585] irq_exit_rcu+0x74/0x90
[ 102.907593] common_interrupt+0x83/0xa0
[ 102.907600] </IRQ>
[ 102.907601] asm_common_interrupt+0x1e/0x40
Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-01720.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1
Signed-off-by: Baochen Qiang <bqiang@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211026011605.58615-1-quic_bqiang@quicinc.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit ae80b6033834342601e99f74f6a62ff5092b1cee ]
Unexpected WDCMSG_TARGET_START replay can lead to null-ptr-deref
when ar->tx_cmd->odata is NULL. The patch adds a null check to
prevent such case.
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
ar5523_cmd+0x46a/0x581 [ar5523]
ar5523_probe.cold+0x1b7/0x18da [ar5523]
? ar5523_cmd_rx_cb+0x7a0/0x7a0 [ar5523]
? __pm_runtime_set_status+0x54a/0x8f0
? _raw_spin_trylock_bh+0x120/0x120
? pm_runtime_barrier+0x220/0x220
? __pm_runtime_resume+0xb1/0xf0
usb_probe_interface+0x25b/0x710
really_probe+0x209/0x5d0
driver_probe_device+0xc6/0x1b0
device_driver_attach+0xe2/0x120
I found the bug using a custome USBFuzz port. It's a research work
to fuzz USB stack/drivers. I modified it to fuzz ath9k driver only,
providing hand-crafted usb descriptors to QEMU.
After fixing the code (fourth byte in usb packet) to WDCMSG_TARGET_START,
I got the null-ptr-deref bug. I believe the bug is triggerable whenever
cmd->odata is NULL. After patching, I tested with the same input and no
longer see the KASAN report.
This was NOT tested on a real device.
Signed-off-by: Zekun Shen <bruceshenzk@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/YXsmPQ3awHFLuAj2@10-18-43-117.dynapool.wireless.nyu.edu
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit eccd25136386a04ebf46a64f3a34e8e0fab6d9e1 ]
In ath11k_mac_op_hw_scan(), the return value of kzalloc() is directly
used in memcpy(), which may lead to a NULL pointer dereference on
failure of kzalloc().
Fix this bug by adding a check of arg.extraie.ptr.
This bug was found by a static analyzer. The analysis employs
differential checking to identify inconsistent security operations
(e.g., checks or kfrees) between two code paths and confirms that the
inconsistent operations are not recovered in the current function or
the callers, so they constitute bugs.
Note that, as a bug found by static analysis, it can be a false
positive or hard to trigger. Multiple researchers have cross-reviewed
the bug.
Builds with CONFIG_ATH11K=m show no new warnings, and our static
analyzer no longer warns about this code.
Fixes: d5c65159f289 ("ath11k: driver for Qualcomm IEEE 802.11ax devices")
Signed-off-by: Zhou Qingyang <zhou1615@umn.edu>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20211202155348.71315-1-zhou1615@umn.edu
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit ba53ee7f7f38cf0592b8be1dcdabaf8f7535f8c1 ]
frag_timer will be created & initialized for stations when
they associate and will be deleted during every key installation
while flushing old fragments.
For AP interface self peer will be created and Group keys
will be installed for this peer, but there will be no real
Station entry & hence frag_timer won't be created and
initialized, deleting such uninitialized kernel timers causes below
warnings and backtraces printed with CONFIG_DEBUG_OBJECTS_TIMERS
enabled.
[ 177.828008] ODEBUG: assert_init not available (active state 0) object type: timer_list hint: 0x0
[ 177.836833] WARNING: CPU: 3 PID: 188 at lib/debugobjects.c:508 debug_print_object+0xb0/0xf0
[ 177.845185] Modules linked in: ath11k_pci ath11k qmi_helpers qrtr_mhi qrtr ns mhi
[ 177.852679] CPU: 3 PID: 188 Comm: hostapd Not tainted 5.14.0-rc3-32919-g4034139e1838-dirty #14
[ 177.865805] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
[ 177.871804] pc : debug_print_object+0xb0/0xf0
[ 177.876155] lr : debug_print_object+0xb0/0xf0
[ 177.880505] sp : ffffffc01169b5a0
[ 177.883810] x29: ffffffc01169b5a0 x28: ffffff80081c2320 x27: ffffff80081c4078
[ 177.890942] x26: ffffff8003fe8f28 x25: ffffff8003de9890 x24: ffffffc01134d738
[ 177.898075] x23: ffffffc010948f20 x22: ffffffc010b2d2e0 x21: ffffffc01169b628
[ 177.905206] x20: ffffffc01134d700 x19: ffffffc010c80d98 x18: 00000000000003f6
[ 177.912339] x17: 203a657079742074 x16: 63656a626f202930 x15: 0000000000000152
[ 177.919471] x14: 0000000000000152 x13: 00000000ffffffea x12: ffffffc010d732e0
[ 177.926603] x11: 0000000000000003 x10: ffffffc010d432a0 x9 : ffffffc010d432f8
[ 177.933735] x8 : 000000000002ffe8 x7 : c0000000ffffdfff x6 : 0000000000000001
[ 177.940866] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000ffffffff
[ 177.947997] x2 : ffffffc010c93240 x1 : ffffff80023624c0 x0 : 0000000000000054
[ 177.955130] Call trace:
[ 177.957567] debug_print_object+0xb0/0xf0
[ 177.961570] debug_object_assert_init+0x124/0x178
[ 177.966269] try_to_del_timer_sync+0x1c/0x70
[ 177.970536] del_timer_sync+0x30/0x50
[ 177.974192] ath11k_peer_frags_flush+0x34/0x68 [ath11k]
[ 177.979439] ath11k_mac_op_set_key+0x1e4/0x338 [ath11k]
[ 177.984673] ieee80211_key_enable_hw_accel+0xc8/0x3d0
[ 177.989722] ieee80211_key_replace+0x360/0x740
[ 177.994160] ieee80211_key_link+0x16c/0x210
[ 177.998337] ieee80211_add_key+0x138/0x338
[ 178.002426] nl80211_new_key+0xfc/0x258
[ 178.006257] genl_family_rcv_msg_doit.isra.17+0xd8/0x120
[ 178.011565] genl_rcv_msg+0xd8/0x1c8
[ 178.015134] netlink_rcv_skb+0x38/0xf8
[ 178.018877] genl_rcv+0x34/0x48
[ 178.022012] netlink_unicast+0x174/0x230
[ 178.025928] netlink_sendmsg+0x188/0x388
[ 178.029845] ____sys_sendmsg+0x218/0x250
[ 178.033763] ___sys_sendmsg+0x68/0x90
[ 178.037418] __sys_sendmsg+0x44/0x88
[ 178.040988] __arm64_sys_sendmsg+0x20/0x28
[ 178.045077] invoke_syscall.constprop.5+0x54/0xe0
[ 178.049776] do_el0_svc+0x74/0xc0
[ 178.053084] el0_svc+0x10/0x18
[ 178.056133] el0t_64_sync_handler+0x88/0xb0
[ 178.060310] el0t_64_sync+0x148/0x14c
[ 178.063966] ---[ end trace 8a5cf0bf9d34a058 ]---
Add changes to not to delete frag timer for peers during
group key installation.
Tested on: IPQ8074 hw2.0 AHB WLAN.HK.2.5.0.1-01092-QCAHKSWPL_SILICONZ-1
Fixes: c3944a562102 ("ath11k: Clear the fragment cache during key install")
Signed-off-by: Rameshkumar Sundaram <quic_ramess@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/1639071421-25078-1-git-send-email-quic_ramess@quicinc.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b689f091aafd1a874b2f88137934276ab0fca480 ]
CE interrupt configuration uses host ce parameters to assign/free
interrupts. Use host ce parameters to enable/disable interrupts.
This patch fixes below BUG,
BUG: KASAN: global-out-of-bounds in 0xffffffbffdfb035c at addr
ffffffbffde6eeac
Read of size 4 by task kworker/u8:2/132
Address belongs to variable ath11k_core_qmi_firmware_ready+0x1b0/0x5bc [ath11k]
OOB is due to ath11k_ahb_ce_irqs_enable() iterates ce_count(which is 12)
times and accessing 12th element in target_ce_config
(which has only 11 elements) from ath11k_ahb_ce_irq_enable().
With this change host ce configs are used to enable/disable interrupts.
Tested-on: IPQ8074 hw2.0 AHB WLAN.HK.2.5.0.1-00471-QCAHKSWPL_SILICONZ-1
Fixes: 967c1d1131fa ("ath11k: move target ce configs to hw_params")
Signed-off-by: Anilkumar Kolli <akolli@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1637249558-12793-1-git-send-email-akolli@codeaurora.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 64bc3aa02ae78b1fcb1b850e0eb1f0622002bfaa ]
The ath11k driver is caching the information about RSN/WPA IE in the
configured beacon template. The cached information is used during
associations to figure out whether 4-way PKT/2-way GTK peer flags need to
be set or not.
But the code never cleared the state when no such IE was found. This can
for example happen when moving from an WPA/RSN to an open setup. The
(seemingly connected) peer was then not able to communicate over the
link because the firmware assumed a different (encryption enabled) state
for the peer.
Tested-on: IPQ6018 hw1.0 AHB WLAN.HK.2.5.0.1-01100-QCAHKSWPL_SILICONZ-1
Fixes: 01e34233c645 ("ath11k: fix wmi peer flags in peer assoc command")
Cc: Venkateswara Naralasetty <vnaralas@codeaurora.org>
Reported-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Karthikeyan Kathirvel <kathirve@codeaurora.org>
[sven@narfation.org: split into separate patches, clean up commit message]
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211115100441.33771-2-sven@narfation.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 436a4e88659842a7cf634d7cc088c8f2cc94ebf5 ]
DISABLE_KEY sets the key_len to 0, firmware will not delete the keys if
key_len is 0. Changing from security mode to open mode will cause mcast
to be still encrypted without vdev restart.
Set the proper key_len for DISABLE_KEY cmd to clear the keys in
firmware.
Tested-on: IPQ6018 hw1.0 AHB WLAN.HK.2.5.0.1-01100-QCAHKSWPL_SILICONZ-1
Fixes: d5c65159f289 ("ath11k: driver for Qualcomm IEEE 802.11ax devices")
Reported-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Karthikeyan Kathirvel <kathirve@codeaurora.org>
[sven@narfation.org: split into separate patches, clean up commit message]
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211115100441.33771-1-sven@narfation.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 086c921a354089f209318501038d43c98d3f409f ]
Some ETSI countries have a small overlap in the wireless-regdb with an ETSI
channel (5590-5650). A good example is Australia:
country AU: DFS-ETSI
(2400 - 2483.5 @ 40), (36)
(5150 - 5250 @ 80), (23), NO-OUTDOOR, AUTO-BW
(5250 - 5350 @ 80), (20), NO-OUTDOOR, AUTO-BW, DFS
(5470 - 5600 @ 80), (27), DFS
(5650 - 5730 @ 80), (27), DFS
(5730 - 5850 @ 80), (36)
(57000 - 66000 @ 2160), (43), NO-OUTDOOR
If the firmware (or the BDF) is shipped with these rules then there is only
a 10 MHz overlap with the weather radar:
* below: 5470 - 5590
* weather radar: 5590 - 5600
* above: (none for the rule "5470 - 5600 @ 80")
There are several wrong assumption in the ath11k code:
* there is always a valid range below the weather radar
(actually: there could be no range below the weather radar range OR range
could be smaller than 20 MHz)
* intersected range in the weather radar range is valid
(actually: the range could be smaller than 20 MHz)
* range above weather radar is either empty or valid
(actually: the range could be smaller than 20 MHz)
These wrong assumption will lead in this example to a rule
(5590 - 5600 @ 20), (N/A, 27), (600000 ms), DFS, AUTO-BW
which is invalid according to is_valid_reg_rule() because the freq_diff is
only 10 MHz but the max_bandwidth is set to 20 MHz. Which results in a
rejection like:
WARNING: at backports-20210222_001-4.4.60-b157d2276/net/wireless/reg.c:3984
[...]
Call trace:
[<ffffffbffc3d2e50>] reg_get_max_bandwidth+0x300/0x3a8 [cfg80211]
[<ffffffbffc3d3d0c>] regulatory_set_wiphy_regd_sync+0x3c/0x98 [cfg80211]
[<ffffffbffc651598>] ath11k_regd_update+0x1a8/0x210 [ath11k]
[<ffffffbffc652108>] ath11k_regd_update_work+0x18/0x20 [ath11k]
[<ffffffc0000a93e0>] process_one_work+0x1f8/0x340
[<ffffffc0000a9784>] worker_thread+0x25c/0x448
[<ffffffc0000aedc8>] kthread+0xd0/0xd8
[<ffffffc000085550>] ret_from_fork+0x10/0x40
ath11k c000000.wifi: failed to perform regd update : -22
Invalid regulatory domain detected
To avoid this, the algorithm has to be changed slightly. Instead of
splitting a rule which overlaps with the weather radar range into 3 pieces
and accepting the first two parts blindly, it must actually be checked for
each piece whether it is a valid range. And only if it is valid, add it to
the output array.
When these checks are in place, the processed rules for AU would end up as
country AU: DFS-ETSI
(2400 - 2483 @ 40), (N/A, 36), (N/A)
(5150 - 5250 @ 80), (6, 23), (N/A), NO-OUTDOOR, AUTO-BW
(5250 - 5350 @ 80), (6, 20), (0 ms), NO-OUTDOOR, DFS, AUTO-BW
(5470 - 5590 @ 80), (6, 27), (0 ms), DFS, AUTO-BW
(5650 - 5730 @ 80), (6, 27), (0 ms), DFS, AUTO-BW
(5730 - 5850 @ 80), (6, 36), (N/A), AUTO-BW
and will be accepted by the wireless regulatory code.
Fixes: d5c65159f289 ("ath11k: driver for Qualcomm IEEE 802.11ax devices")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211112153116.1214421-1-sven@narfation.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 16a2c3d5406f95ef6139de52669c60a39443f5f7 ]
HTT_PPDU_STATS_CFG_PDEV_ID bit mask for target FW PPDU stats request message
was set as bit 8 to 15. Bit 8 is reserved for soc stats and pdev id starts from
bit 9. Hence change the bitmask as bit 9 to 15 and fill the proper pdev id in
the request message.
In commit 701e48a43e15 ("ath11k: add packet log support for QCA6390"), both
HTT_PPDU_STATS_CFG_PDEV_ID and pdev_mask were changed, but this pdev_mask
calculation is not valid for platforms which has multiple pdevs with 1 rxdma
per pdev, as this is writing same value(i.e. 2) for all pdevs. Hence fixed it
to consider pdev_idx as well, to make it compatible for both single and multi
pd cases.
Tested on: IPQ8074 hw2.0 AHB WLAN.HK.2.5.0.1-01092-QCAHKSWPL_SILICONZ-1
Tested on: IPQ6018 hw1.0 WLAN.HK.2.5.0.1-01067-QCAHKSWPL_SILICONZ-1
Fixes: 701e48a43e15 ("ath11k: add packet log support for QCA6390")
Co-developed-by: Sathishkumar Muruganandam <murugana@codeaurora.org>
Signed-off-by: Sathishkumar Muruganandam <murugana@codeaurora.org>
Signed-off-by: Rameshkumar Sundaram <ramess@codeaurora.org>
Signed-off-by: Jouni Malinen <jouni@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210721212029.142388-10-jouni@codeaurora.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit cfdf6b19e750f7de8ae71a26932f63b52e3bf74c ]
The linear mapping between the BD rate field and the driver's 5GHz
legacy rates table (wcn_5ghz_rates) does not only apply for the latter
four rates -- it applies to all eight rates.
Fixes: 6ea131acea98 ("wcn36xx: Fix warning due to bad rate_idx")
Signed-off-by: Benjamin Li <benl@squareup.com>
Tested-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211104010548.1107405-3-benl@squareup.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c9c5608fafe4dae975c9644c7d14c51ad3b0ed73 ]
status.band is used in determination of status.rate -- for 5GHz on legacy
rates there is a linear shift between the BD descriptor's rate field and
the wcn36xx driver's rate table (wcn_5ghz_rates).
We have a special clause to populate status.band for hardware scan offload
frames. However, this block occurs after status.rate is already populated.
Correctly handle this dependency by moving the band block before the rate
block.
This patch addresses kernel warnings & missing scan results for 5GHz APs
that send their beacons/probe responses at the higher four legacy rates
(24-54 Mbps), when using hardware scan offload:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at net/mac80211/rx.c:4532 ieee80211_rx_napi+0x744/0x8d8
Modules linked in: wcn36xx [...]
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 4.19.107-g73909fa #1
Hardware name: Square, Inc. T2 (all variants) (DT)
Call trace:
dump_backtrace+0x0/0x148
show_stack+0x14/0x1c
dump_stack+0xb8/0xf0
__warn+0x2ac/0x2d8
warn_slowpath_null+0x44/0x54
ieee80211_rx_napi+0x744/0x8d8
ieee80211_tasklet_handler+0xa4/0xe0
tasklet_action_common+0xe0/0x118
tasklet_action+0x20/0x28
__do_softirq+0x108/0x1ec
irq_exit+0xd4/0xd8
__handle_domain_irq+0x84/0xbc
gic_handle_irq+0x4c/0xb8
el1_irq+0xe8/0x190
lpm_cpuidle_enter+0x220/0x260
cpuidle_enter_state+0x114/0x1c0
cpuidle_enter+0x34/0x48
do_idle+0x150/0x268
cpu_startup_entry+0x20/0x24
rest_init+0xd4/0xe0
start_kernel+0x398/0x430
---[ end trace ae28cb759352b403 ]---
Fixes: 8a27ca394782 ("wcn36xx: Correct band/freq reporting on RX")
Signed-off-by: Benjamin Li <benl@squareup.com>
Tested-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211104010548.1107405-2-benl@squareup.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit ed04ea76e69e7194f7489cebe23a32a68f39218d ]
When deiniting the DXE hardware we should reset the block to ensure there
is no spurious DMA write transaction from the downstream WCNSS to upstream
MSM at a skbuff address we will have released.
Fixes: 8e84c2582169 ("wcn36xx: mac80211 driver for Qualcomm WCN3660/WCN3680 hardware")
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211105122152.1580542-4-bryan.odonoghue@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 3652096e5263ad67604b0323f71d133485f410e5 ]
When unloading the driver we are not releasing the DMA descriptors which we
previously allocated.
Fixes: 8e84c2582169 ("wcn36xx: mac80211 driver for Qualcomm WCN3660/WCN3680 hardware")
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211105122152.1580542-3-bryan.odonoghue@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 89dcb1da611d9b3ff0728502d58372fdaae9ebff ]
Right now we have a broken sequence where we enable DMA channel interrupts
which can be left enabled and never disabled if we hit an error path.
Worse still when we unload the driver, the DMA channel interrupt bits are
left intact. About the only saving grace here is that we do remember to
disable the wcnss interrupt when unload the driver.
Fixes: 8e84c2582169 ("wcn36xx: mac80211 driver for Qualcomm WCN3660/WCN3680 hardware")
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211105122152.1580542-2-bryan.odonoghue@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 588b45c88ae130fe373a8c50edaf54735c3f4fe3 ]
Firmware can trigger a missed beacon indication, this is not the same as a
lost signal.
Flag to Linux the missed beacon and let the WiFi stack decide for itself if
the link is up or down by sending its own probe to determine this.
We should only be signalling the link is lost when the firmware indicates
Fixes: 8e84c2582169 ("wcn36xx: mac80211 driver for Qualcomm WCN3660/WCN3680 hardware")
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211027232529.657764-1-bryan.odonoghue@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8f1ba8b0ee2679f0b3d22d2a5c1bc70c436fd872 ]
An SMD capture from the downstream prima driver on WCN3680B shows the
following command sequence for connected scans:
- init_scan_req
- start_scan_req, channel 1
- end_scan_req, channel 1
- start_scan_req, channel 2
- ...
- end_scan_req, channel 3
- finish_scan_req
- init_scan_req
- start_scan_req, channel 4
- ...
- end_scan_req, channel 6
- finish_scan_req
- ...
- end_scan_req, channel 165
- finish_scan_req
Upstream currently never calls wcn36xx_smd_end_scan, and in some cases[1]
still sends finish_scan_req twice in a row or before init_scan_req. A
typical connected scan looks like this:
- init_scan_req
- start_scan_req, channel 1
- finish_scan_req
- init_scan_req
- start_scan_req, channel 2
- ...
- start_scan_req, channel 165
- finish_scan_req
- finish_scan_req
This patch cleans up scanning so that init/finish and start/end are always
paired together and correctly nested.
- init_scan_req
- start_scan_req, channel 1
- end_scan_req, channel 1
- finish_scan_req
- init_scan_req
- start_scan_req, channel 2
- end_scan_req, channel 2
- ...
- start_scan_req, channel 165
- end_scan_req, channel 165
- finish_scan_req
Note that upstream will not do batching of 3 active-probe scans before
returning to the operating channel, and this patch does not change that.
To match downstream in this aspect, adjust IEEE80211_PROBE_DELAY and/or
the 125ms max off-channel time in ieee80211_scan_state_decision.
[1]: commit d195d7aac09b ("wcn36xx: Ensure finish scan is not requested
before start scan") addressed one case of finish_scan_req being sent
without a preceding init_scan_req (the case of the operating channel
coinciding with the first scan channel); two other cases are:
1) if SW scan is started and aborted immediately, without scanning any
channels, we send a finish_scan_req without ever sending init_scan_req,
and
2) as SW scan logic always returns us to the operating channel before
calling wcn36xx_sw_scan_complete, finish_scan_req is always sent twice
at the end of a SW scan
Fixes: 8e84c2582169 ("wcn36xx: mac80211 driver for Qualcomm WCN3660/WCN3680 hardware")
Signed-off-by: Benjamin Li <benl@squareup.com>
Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211027170306.555535-4-benl@squareup.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit a658c929ded7ea3aee324c8c2a9635a5e5a38e7f upstream.
If cfg80211 is providing extraie's for a scanning process then ath11k will
copy that over to the firmware. The extraie.len is a 32 bit value in struct
element_info and describes the amount of bytes for the vendor information
elements.
The WMI_TLV packet is having a special WMI_TAG_ARRAY_BYTE section. This
section can have a (payload) length up to 65535 bytes because the
WMI_TLV_LEN can store up to 16 bits. The code was missing such a check and
could have created a scan request which cannot be parsed correctly by the
firmware.
But the bigger problem was the allocation of the buffer. It has to align
the TLV sections by 4 bytes. But the code was using an u8 to store the
newly calculated length of this section (with alignment). And the new
calculated length was then used to allocate the skbuff. But the actual code
to copy in the data is using the extraie.len and not the calculated
"aligned" length.
The length of extraie with IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS enabled
was 264 bytes during tests with a QCA Milan card. But it only allocated 8
bytes (264 bytes % 256) for it. As consequence, the code to memcpy the
extraie into the skb was then just overwriting data after skb->end. Things
like shinfo were therefore corrupted. This could usually be seen by a crash
in skb_zcopy_clear which tried to call a ubuf_info callback (using a bogus
address).
Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-02892.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1
Cc: stable@vger.kernel.org
Fixes: d5c65159f289 ("ath11k: driver for Qualcomm IEEE 802.11ax devices")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20211207142913.1734635-1-sven@narfation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 937e79c67740d1d84736730d679f3cb2552f990e upstream.
Using a kernel pointer in place of a dma_addr_t token can
lead to undefined behavior if that makes it into cache
management functions. The compiler caught one such attempt
in a cast:
drivers/net/wireless/ath/ath10k/mac.c: In function 'ath10k_add_interface':
drivers/net/wireless/ath/ath10k/mac.c:5586:47: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
5586 | arvif->beacon_paddr = (dma_addr_t)arvif->beacon_buf;
| ^
Looking through how this gets used down the way, I'm fairly
sure that beacon_paddr is never accessed again for ATH10K_DEV_TYPE_HL
devices, and if it was accessed, that would be a bug.
Change the assignment to use a known-invalid address token
instead, which avoids the warning and makes it easier to catch
bugs if it does end up getting used.
Fixes: e263bdab9c0e ("ath10k: high latency fixes for beacon buffer")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211014075153.3655910-1-arnd@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 113f304dbc1627c6ec9d5329d839964095768980 ]
The firmware is offering features such as ARP offload, for which
firmware crafts its own (QoS)packets without waking up the host.
Point is that the sequence numbers generated by the firmware are
not in sync with the host mac80211 layer and can cause packets
such as firmware ARP reponses to be dropped by the AP (too old SN).
To fix this we need to let the firmware manages the sequence
numbers by its own (except for QoS null frames). There is a SN
counter for each QoS queue and one global/baseline counter for
Non-QoS.
Fixes: 84aff52e4f57 ("wcn36xx: Use sequence number allocated by mac80211")
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1635150336-18736-1-git-send-email-loic.poulain@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 9bfe38e064af5decba2ffce66a2958ab8b10eaa4 ]
This is essentially exactly following the dma_wmb()/dma_rmb() usage
instructions in Documentation/memory-barriers.txt.
The theoretical races here are:
1. DXE (the DMA Transfer Engine in the Wi-Fi subsystem) seeing the
dxe->ctrl & WCN36xx_DXE_CTRL_VLD write before the dxe->dst_addr_l
write, thus performing DMA into the wrong address.
2. CPU reading dxe->dst_addr_l before DXE unsets dxe->ctrl &
WCN36xx_DXE_CTRL_VLD. This should generally be harmless since DXE
doesn't write dxe->dst_addr_l (no risk of freeing the wrong skb).
Fixes: 8e84c2582169 ("wcn36xx: mac80211 driver for Qualcomm WCN3660/WCN3680 hardware")
Signed-off-by: Benjamin Li <benl@squareup.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211023001528.3077822-1-benl@squareup.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 0a491167fe0cf9f26062462de2a8688b96125d48 ]
Most of the txpower for the ath10k firmware is stored as twicepower (0.5 dB
steps). This isn't the case for max_antenna_gain - which is still expected
by the firmware as dB.
The firmware is converting it from dB to the internal (twicepower)
representation when it calculates the limits of a channel. This can be seen
in tpc_stats when configuring "12" as max_antenna_gain. Instead of the
expected 12 (6 dB), the tpc_stats shows 24 (12 dB).
Tested on QCA9888 and IPQ4019 with firmware 10.4-3.5.3-00057.
Fixes: 02256930d9b8 ("ath10k: use proper tx power unit")
Signed-off-by: Sven Eckelmann <seckelmann@datto.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20190611172131.6064-1-sven@narfation.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 4925642d541278575ad1948c5924d71ffd57ef14 ]
In tests with two Lima boards from 8devices (QCA4531 based) on OpenWrt
19.07 we could force a silent restart of a device with no serial
output when we were sending a high amount of UDP traffic (iperf3 at 80
MBit/s in both directions from external hosts, saturating the wifi and
causing a load of about 4.5 to 6) and were then triggering an
ath9k_queue_reset().
Further debugging showed that the restart was caused by the ath79
watchdog. With disabled watchdog we could observe that the device was
constantly going into ath_isr() interrupt handler and was returning
early after the ATH_OP_HW_RESET flag test, without clearing any
interrupts. Even though ath9k_queue_reset() calls
ath9k_hw_kill_interrupts().
With JTAG we could observe the following race condition:
1) ath9k_queue_reset()
...
-> ath9k_hw_kill_interrupts()
-> set_bit(ATH_OP_HW_RESET, &common->op_flags);
...
<- returns
2) ath9k_tasklet()
...
-> ath9k_hw_resume_interrupts()
...
<- returns
3) loops around:
...
handle_int()
-> ath_isr()
...
-> if (test_bit(ATH_OP_HW_RESET,
&common->op_flags))
return IRQ_HANDLED;
x) ath_reset_internal():
=> never reached <=
And in ath_isr() we would typically see the following interrupts /
interrupt causes:
* status: 0x00111030 or 0x00110030
* async_cause: 2 (AR_INTR_MAC_IPQ)
* sync_cause: 0
So the ath9k_tasklet() reenables the ath9k interrupts
through ath9k_hw_resume_interrupts() which ath9k_queue_reset() had just
disabled. And ath_isr() then keeps firing because it returns IRQ_HANDLED
without actually clearing the interrupt.
To fix this IRQ storm also clear/disable the interrupts again when we
are in reset state.
Cc: Sven Eckelmann <sven@narfation.org>
Cc: Simon Wunderlich <sw@simonwunderlich.de>
Cc: Linus Lüssing <linus.luessing@c0d3.blue>
Fixes: 872b5d814f99 ("ath9k: do not access hardware on IRQs during reset")
Signed-off-by: Linus Lüssing <ll@simonwunderlich.de>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210914192515.9273-3-linus.luessing@c0d3.blue
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 019edd01d174ce4bb2e517dd332922514d176601 ]
On a i.MX-based board with a QCA9377 Wifi chip, the following errors
are seen after launching the 'hostapd' application:
hostapd /etc/wifi.conf
Configuration file: /etc/wifi.conf
wlan0: interface state UNINITIALIZED->COUNTRY_UPDATE
NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
Using interface wlan0 with hwaddr 00:1f:7b:31:04:a0 and ssid "thessid"
IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
wlan0: interface state COUNTRY_UPDATE->ENABLED
wlan0: AP-ENABLED
NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
...
Fix this problem by adding the BH locking around napi-schedule(),
in the same way it was done in commit e63052a5dd3c ("mlx5e: add
add missing BH locking around napi_schdule()").
Its commit log provides the following explanation:
"It's not correct to call napi_schedule() in pure process
context. Because we use __raise_softirq_irqoff() we require
callers to be in a context which will eventually lead to
softirq handling (hardirq, bh disabled, etc.).
With code as is users will see:
NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
"
Fixes: cfee8793a74d ("ath10k: enable napi on RX path for sdio")
Signed-off-by: Fabio Estevam <festevam@denx.de>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210824144339.2796122-1-festevam@denx.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit e6dfbc3ba90cc2b619229be56b485f085a0a8e1c ]
When receiving a beacon or probe response, we should update the
boottime_ns field which is the timestamp the frame was received at.
(cf mac80211.h)
This fixes a scanning issue with Android since it relies on this
timestamp to determine when the AP has been seen for the last time
(via the nl80211 BSS_LAST_SEEN_BOOTTIME parameter).
Fixes: 5e3dd157d7e7 ("ath10k: mac80211 driver for Qualcomm Atheros 802.11ac CQA98xx devices")
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1629811733-7927-1-git-send-email-loic.poulain@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 72de799aa9e3e064b35238ef053d2f0a49db055a ]
The buffer pointed to by event is not freed in case
ATH11K_FLAG_UNREGISTERING bit is set, resulting in
memory leak, so fix it.
Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-01720.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1
Fixes: d5c65159f289 ("ath11k: driver for Qualcomm IEEE 802.11ax devices")
Signed-off-by: Baochen Qiang <bqiang@codeaurora.org>
Signed-off-by: Jouni Malinen <jouni@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210913180246.193388-4-jouni@codeaurora.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 9d6ae1f5cf733c0e8d7f904c501fd015c4b9f0f4 ]
Frequency in rx status is being filled incorrectly in the 6 GHz band as
channel number received is invalid in this case which is causing packet
drops. So fix that.
Fixes: 5dcf42f8b79d ("ath11k: Use freq instead of channel number in rx path")
Signed-off-by: Pradeep Kumar Chitrapu <pradeepc@codeaurora.org>
Signed-off-by: Jouni Malinen <jouni@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210722102054.43419-2-jouni@codeaurora.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 1db2b0d0a39102238fcbf9092cefa65a710642e9 ]
Whenever ath11k is bootup with a user country already set, cfg80211
notifies this country info to ath11k soon after registration, where the
notification is sent to the firmware for fetching the rules of this user
country input.
Multiple race conditions could be seen in this scenario where a new
request is either lost as pointed in [1] or a new regd overwrites the
default regd provided by the firmware during bootup. Note that, the
default regd is used for intersection purpose and hence it should not be
overwritten.
The main reason as pointed by [1] is the usage of ATH11K_FLAG_REGISTERED
flag which is updated after completion of core registration, whereas the
reg notification from cfg80211 and wmi events for the corresponding
request can happen much before that. Since the ATH11K_FLAG_REGISTERED is
currently used to determine if the event containing reg rules belong to
default regd or for user request, there is a possibility of the default
regd getting overwritten.
Since the default reg rules will be received only once per pdev on
firmware load, the above flag based check can be replaced with a check
to see if default_regd is already set, so that we can now always update
the new_regd. Also if the new_regd is set, this will be always used to
update the reg rules for the registered phy.
[1] https://patchwork.kernel.org/project/linux-wireless/patch/1829665.1PRlr7bOQj@ripper/
Tested-on: IPQ8074 hw2.0 AHB WLAN.HK.2.4.0.1-01460-QCAHKSWPL_SILICONZ-1
Fixes: d5c65159f289 ("ath11k: driver for Qualcomm IEEE 802.11ax devices")
Signed-off-by: Sriram R <srirrama@codeaurora.org>
Signed-off-by: Jouni Malinen <jouni@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210721212029.142388-4-jouni@codeaurora.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit aadf7c81a0771b8f1c97dabca6a48bae1b387779 ]
The ath11k_dbring_bufs_replenish() and ath11k_dbring_fill_bufs()
take a "gfp" parameter but they since they take spinlocks, the
allocations they do have to be atomic. This causes a bug because
ath11k_dbring_buf_setup passes GFP_KERNEL for the gfp flags.
The fix is to use GFP_ATOMIC and remove the unused parameters.
Fixes: bd6478559e27 ("ath11k: Add direct buffer ring support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210812070434.GE31863@kili
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 701668d3bfa03dabc5095fc383d5315544ee5b31 ]
We have been tracking a strange bug with Antenna Diversity Switching (ADS)
on wcn3680b for a while.
ADS is configured like this:
A. Via a firmware configuration table baked into the NV area.
1. Defines if ADS is enabled.
2. Defines which GPIOs are connected to which antenna enable pin.
3. Defines which antenna/GPIO is primary and which is secondary.
B. WCN36XX_CFG_VAL(ANTENNA_DIVERSITY, N)
N is a bitmask of available antenna.
Setting N to 3 indicates a bitmask of enabled antenna (1 | 2).
Obviously then we can set N to 1 or N to 2 to fix to a particular
antenna and disable antenna diversity.
C. WCN36XX_CFG_VAL(ASD_PROBE_INTERVAL, XX)
XX is the number of beacons between each antenna RSSI check.
Setting this value to 50 means, every 50 received beacons, run the
ADS algorithm.
D. WCN36XX_CFG_VAL(ASD_TRIGGER_THRESHOLD, YY)
YY is a two's complement integer which specifies the RSSI decibel
threshold below which ADS will run.
We default to -60db here, meaning a measured RSSI <= -60db will
trigger an ADS probe.
E. WCN36XX_CFG_VAL(ASD_RTT_RSSI_HYST_THRESHOLD, Z)
Z is a hysteresis value, indicating a delta which the RSSI must
exceed for the antenna switch to be valid.
For example if HYST_THRESHOLD == 3 AntennaId1-RSSI == -60db and
AntennaId-2-RSSI == -58db then firmware will not switch antenna.
The threshold needs to be -57db or better to satisfy the criteria.
F. A firmware feature bit also exists ANTENNA_DIVERSITY_SELECTION.
This feature bit is used by the firmware to report if
ANTENNA_DIVERSITY_SELECTION is supported. The host is not required to
toggle this bit to enable or disable ADS.
ADS works like this:
A. Every XX beacons the firmware switches to or remains on the primary
antenna.
B. The firmware then sends a Request-To-Send (RTS) packet to the AP.
C. The firmware waits for a Clear-To-Send (CTS) response from the AP.
D. The firmware then notes the received RSSI on the CTS packet.
E. The firmware then repeats steps A-D on the secondary antenna.
F. Subsequently if the RSSI on the measured antenna is better than
ASD_TRIGGER_THRESHOLD + the active antenna's RSSI then the
measured antenna becomes the active antenna.
G. If RSSI rises past ASD_TRIGGER_THRESHOLD then ADS doesn't run at
all even if there is a substantially better RSSI on the alternative
antenna.
What we have been observing is that the RTS packet is being sent but the
MAC address is a byte-swapped version of the target MAC. The ADS/RTS MAC is
corrupted only when the link is encrypted, if the AP is open the RTS MAC is
correct. Similarly if we configure the firmware to an RTS/CTS sequence for
regular data - the transmitted RTS MAC is correctly formatted.
Internally the wcn36xx firmware uses the indexes in the SMD commands to
populate and extract data from specific entries in an STA lookup table. The
AP's MAC appears a number of times in different indexes within this lookup
table, so the MAC address extracted for the data-transmit RTS and the MAC
address extracted for the ADS/RTS packet are not the same STA table index.
Our analysis indicates the relevant firmware STA table index is
"bssSelfStaIdx".
There is an STA populate function responsible for formatting the MAC
address of the bssSelfStaIdx including byte-swapping the MAC address.
Its clear then that the required STA populate command did not run for
bssSelfStaIdx.
So taking a look at the sequence of SMD commands sent to the firmware we
see the following downstream when moving from an unencrypted to encrypted
BSS setup.
- WLAN_HAL_CONFIG_BSS_REQ
- WLAN_HAL_CONFIG_STA_REQ
- WLAN_HAL_SET_STAKEY_REQ
Upstream in wcn36xx we have
- WLAN_HAL_CONFIG_BSS_REQ
- WLAN_HAL_SET_STAKEY_REQ
The solution then is to add the missing WLAN_HAL_CONFIG_STA_REQ between
WLAN_HAL_CONFIG_BSS_REQ and WLAN_HAL_SET_STAKEY_REQ.
No surprise WLAN_HAL_CONFIG_STA_REQ is the routine responsible for
populating the STA lookup table in the firmware and once done the MAC sent
by the ADS routine is in the correct byte-order.
This bug is apparent with ADS but it is also the case that any other
firmware routine that depends on the "bssSelfStaIdx" would retrieve
malformed data on an encrypted link.
Fixes: 3e977c5c523d ("wcn36xx: Define wcn3680 specific firmware parameters")
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Tested-by: Benjamin Li <benl@squareup.com>
Reviewed-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210909144428.2564650-2-bryan.odonoghue@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|