kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2025-05-29	usb: xhci: set page size to the xHCI-supported size	Niklas Neronin	2	-20/+22
	[ Upstream commit 68c1f1671650b49bbd26e6a65ddcf33f2565efa3 ] The current xHCI driver does not validate whether a page size of 4096 bytes is supported. Address the issue by setting the page size to the value supported by the xHCI controller, as read from the Page Size register. In the event of an unexpected value; default to a 4K page size. Additionally, this commit removes unnecessary debug messages and instead prints the supported and used page size once. The xHCI controller supports page sizes of (2^{(n+12)}) bytes, where 'n' is the Page Size Bit. Only one page size is supported, with a maximum page size of 128 KB. Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250306144954.3507700-10-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29	usb: xhci: Don't change the status of stalled TDs on failed Stop EP	Michal Pecio	1	-1/+11
	[ Upstream commit dfc88357b6b6356dadea06b2c0bc8041f5e11720 ] When the device stalls an endpoint, current TD is assigned -EPIPE status and Reset Endpoint is queued. If a Stop Endpoint is pending at the time, it will run before Reset Endpoint and fail due to the stall. Its handler will change TD's status to -EPROTO before Reset Endpoint handler runs and initiates giveback. Check if the stall has already been handled and don't try to do it again. Since xhci_handle_halted_endpoint() performs this check too, not overwriting td->status is the only difference. I haven't seen this case yet, but I have seen a related one where the xHC has already executed Reset Endpoint, EP Context state is now Stopped and EP_HALTED is set. If the xHC took a bit longer to execute Reset Endpoint, said case would become this one. Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250311154551.4035726-3-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-18	usb: host: tegra: Prevent host controller crash when OTG port is used	Jim Lin	1	-0/+3
	commit 732f35cf8bdfece582f6e4a9c659119036577308 upstream. When a USB device is connected to the OTG port, the tegra_xhci_id_work() routine transitions the PHY to host mode and calls xhci_hub_control() with the SetPortFeature command to enable port power. In certain cases, the XHCI controller may be in a low-power state when this operation occurs. If xhci_hub_control() is invoked while the controller is suspended, the PORTSC register may return 0xFFFFFFFF, indicating a read failure. This causes xhci_hc_died() to be triggered, leading to host controller shutdown. Example backtrace: [ 105.445736] Workqueue: events tegra_xhci_id_work [ 105.445747] dump_backtrace+0x0/0x1e8 [ 105.445759] xhci_hc_died.part.48+0x40/0x270 [ 105.445769] tegra_xhci_set_port_power+0xc0/0x240 [ 105.445774] tegra_xhci_id_work+0x130/0x240 To prevent this, ensure the controller is fully resumed before interacting with hardware registers by calling pm_runtime_get_sync() prior to the host mode transition and xhci_hub_control(). Fixes: f836e7843036 ("usb: xhci-tegra: Add OTG support") Cc: stable <stable@kernel.org> Signed-off-by: Jim Lin <jilin@nvidia.com> Signed-off-by: Wayne Chang <waynec@nvidia.com> Link: https://lore.kernel.org/r/20250422114001.126367-1-waynec@nvidia.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-18	usb: uhci-platform: Make the clock really optional	Alexey Charkov	1	-1/+1
	commit a5c7973539b010874a37a0e846e62ac6f00553ba upstream. Device tree bindings state that the clock is optional for UHCI platform controllers, and some existing device trees don't provide those - such as those for VIA/WonderMedia devices. The driver however fails to probe now if no clock is provided, because devm_clk_get returns an error pointer in such case. Switch to devm_clk_get_optional instead, so that it could probe again on those platforms where no clocks are given. Cc: stable <stable@kernel.org> Fixes: 26c502701c52 ("usb: uhci: Add clk support to uhci-platform") Signed-off-by: Alexey Charkov <alchark@gmail.com> Link: https://lore.kernel.org/r/20250425-uhci-clock-optional-v1-1-a1d462592f29@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-18	xhci: dbc: Avoid event polling busyloop if pending rx transfers are inactive.	Mathias Nyman	2	-3/+19
	commit cab63934c33b12c0d1e9f4da7450928057f2c142 upstream. Event polling delay is set to 0 if there are any pending requests in either rx or tx requests lists. Checking for pending requests does not work well for "IN" transfers as the tty driver always queues requests to the list and TRBs to the ring, preparing to receive data from the host. This causes unnecessary busylooping and cpu hogging. Only set the event polling delay to 0 if there are pending tx "write" transfers, or if it was less than 10ms since last active data transfer in any direction. Cc: Łukasz Bartosik <ukaszb@chromium.org> Fixes: fb18e5bb9660 ("xhci: dbc: poll at different rate depending on data transfer activity") Cc: stable@vger.kernel.org Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250505125630.561699-3-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-02	usb: xhci: Fix Short Packet handling rework ignoring errors	Michal Pecio	1	-1/+1
	commit 9e3a28793d2fde7a709e814d2504652eaba6ae98 upstream. A Short Packet event before the last TRB of a TD is followed by another event on the final TRB on spec-compliant HCs, which is most of them. A 'last_td_was_short' flag was added to know if a TD has just completed as Short Packet and another event is to come. The flag was cleared after seeing the event (unless no TDs are pending, but that's a separate bug) or seeing a new TD complete as something other than Short Packet. A rework replaced the flag with an 'old_trb_comp_code' variable. When an event doesn't match the pending TD and the previous event was Short Packet, the new event is silently ignored. To preserve old behavior, 'old_trb_comp_code' should be cleared at this point, but instead it is being set to current comp code, which is often Short Packet again. This can cause more events to be silently ignored, even though they are no longer connected with the old TD that completed short and indicate a serious problem with the driver or the xHC. Common device classes like UAC in async mode, UVC, serial or the UAS status pipe complete as Short Packet routinely and could be affected. Clear 'old_trb_comp_code' to zero, which is an invalid completion code and the same value the variable starts with. This restores original behavior on Short Packet and also works for illegal Etron events, which the code has been extended to cover too. Fixes: b331a3d8097f ("xhci: Handle spurious events on Etron host isoc enpoints") Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250410151828.2868740-4-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-02	usb: host: xhci-plat: mvebu: use ->quirks instead of ->init_quirk() func	Théo Lebrun	3	-17/+1
	[ Upstream commit 64eb182d5f7a5ec30227bce4f6922ff663432f44 ] Compatible "marvell,armada3700-xhci" match data uses the struct xhci_plat_priv::init_quirk() function pointer to add XHCI_RESET_ON_RESUME as quirk on XHCI. Instead, use the struct xhci_plat_priv::quirks field. Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com> Link: https://lore.kernel.org/r/20250205-s2r-cdns-v7-1-13658a271c3c@bootlin.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-02	usb: xhci: Avoid Stop Endpoint retry loop if the endpoint seems Running	Michal Pecio	1	-4/+7
	[ Upstream commit 28a76fcc4c85dd39633fb96edb643c91820133e3 ] Nothing prevents a broken HC from claiming that an endpoint is Running and repeatedly rejecting Stop Endpoint with Context State Error. Avoid infinite retries and give back cancelled TDs. No such cases known so far, but HCs have bugs. Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250311154551.4035726-4-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-02	xhci: Handle spurious events on Etron host isoc enpoints	Mathias Nyman	2	-13/+27
	[ Upstream commit b331a3d8097fad4e541d212684192f21fedbd6e5 ] Unplugging a USB3.0 webcam from Etron hosts while streaming results in errors like this: [ 2.646387] xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 18 comp_code 13 [ 2.646446] xhci_hcd 0000:03:00.0: Looking for event-dma 000000002fdf8630 trb-start 000000002fdf8640 trb-end 000000002fdf8650 [ 2.646560] xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 18 comp_code 13 [ 2.646568] xhci_hcd 0000:03:00.0: Looking for event-dma 000000002fdf8660 trb-start 000000002fdf8670 trb-end 000000002fdf8670 Etron xHC generates two transfer events for the TRB if an error is detected while processing the last TRB of an isoc TD. The first event can be any sort of error (like USB Transaction or Babble Detected, etc), and the final event is Success. The xHCI driver will handle the TD after the first event and remove it from its internal list, and then print an "Transfer event TRB DMA ptr not part of current TD" error message after the final event. Commit 5372c65e1311 ("xhci: process isoc TD properly when there was a transaction error mid TD.") is designed to address isoc transaction errors, but unfortunately it doesn't account for this scenario. This issue is similar to the XHCI_SPURIOUS_SUCCESS case where a success event follows a 'short transfer' event, but the TD the event points to is already given back. Expand the spurious success 'short transfer' event handling to cover the spurious success after error on Etron hosts. Kuangyi Chiang reported this issue and submitted a different solution based on using error_mid_td. This commit message is mostly taken from that patch. Reported-by: Kuangyi Chiang <ki.chiang65@gmail.com> Closes: https://lore.kernel.org/linux-usb/20241028025337.6372-6-ki.chiang65@gmail.com/ Tested-by: Kuangyi Chiang <ki.chiang65@gmail.com> Tested-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250306144954.3507700-16-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-02	usb: xhci: Fix isochronous Ring Underrun/Overrun event handling	Michal Pecio	1	-6/+14
	[ Upstream commit 906dec15b9b321b546fd31a3c99ffc13724c7af4 ] The TRB pointer of these events points at enqueue at the time of error occurrence on xHCI 1.1+ HCs or it's NULL on older ones. By the time we are handling the event, a new TD may be queued at this ring position. I can trigger this race by rising interrupt moderation to increase IRQ handling delay. Similar delay may occur naturally due to system load. If this ever happens after a Missed Service Error, missed TDs will be skipped and the new TD processed as if it matched the event. It could be given back prematurely, risking data loss or buffer UAF by the xHC. Don't complete TDs on xrun events and don't warn if queued TDs don't match the event's TRB pointer, which can be NULL or a link/no-op TRB. Don't warn if there are no queued TDs at all. Now that it's safe, also handle xrun events if the skip flag is clear. This ensures completion of any TD stuck in 'error mid TD' state right before the xrun event, which could happen if a driver submits a finite number of URBs to a buggy HC and then an error occurs on the last TD. Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250306144954.3507700-6-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-02	usb: xhci: Complete 'error mid TD' transfers when handling Missed Service	Michal Pecio	1	-1/+5
	[ Upstream commit bfa8459942822bdcc86f0e87f237c0723ae64948 ] Missed Service Error after an error mid TD means that the failed TD has already been passed by the xHC without acknowledgment of the final TRB, a known hardware bug. So don't wait any more and give back the TD. Reproduced on NEC uPD720200 under conditions of ludicrously bad USB link quality, confirmed to behave as expected using dynamic debug. Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250306144954.3507700-5-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-02	usb: host: max3421-hcd: Add missing spi_device_id table	Alexander Stein	1	-0/+7
	[ Upstream commit 41d5e3806cf589f658f92c75195095df0b66f66a ] "maxim,max3421" DT compatible is missing its SPI device ID entry, not allowing module autoloading and leading to the following message: "SPI driver max3421-hcd has no spi_device_id for maxim,max3421" Fix this by adding the spi_device_id table. Signed-off-by: Alexander Stein <alexander.stein@mailbox.org> Link: https://lore.kernel.org/r/20250128195114.56321-1-alexander.stein@mailbox.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-02	USB: OHCI: Add quirk for LS7A OHCI controller (rev 0x02)	Huacai Chen	1	-0/+23
	commit bcb60d438547355b8f9ad48645909139b64d3482 upstream. The OHCI controller (rev 0x02) under LS7A PCI host has a hardware flaw. MMIO register with offset 0x60/0x64 is treated as legacy PS2-compatible keyboard/mouse interface, which confuse the OHCI controller. Since OHCI only use a 4KB BAR resource indeed, the LS7A OHCI controller's 32KB BAR is wrapped around (the second 4KB BAR space is the same as the first 4KB internally). So we can add an 4KB offset (0x1000) to the OHCI registers (from the PCI BAR resource) as a quirk. Cc: stable <stable@kernel.org> Suggested-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alan Stern <stern@rowland.harvard.edu> Tested-by: Mingcong Bai <baimingcong@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Link: https://lore.kernel.org/r/20250328040059.3672979-1-chenhuacai@loongson.cn Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-02	usb: xhci: Fix invalid pointer dereference in Etron workaround	Michal Pecio	1	-1/+1
	commit 1ea050da5562af9b930d17cbbe9632d30f5df43a upstream. This check is performed before prepare_transfer() and prepare_ring(), so enqueue can already point at the final link TRB of a segment. And indeed it will, some 0.4% of times this code is called. Then enqueue + 1 is an invalid pointer. It will crash the kernel right away or load some junk which may look like a link TRB and cause the real link TRB to be replaced with a NOOP. This wouldn't end well. Use a functionally equivalent test which doesn't dereference the pointer and always gives correct result. Something has crashed my machine twice in recent days while playing with an Etron HC, and a control transfer stress test ran for confirmation has just crashed it again. The same test passes with this patch applied. Fixes: 5e1c67abc930 ("xhci: Fix control transfer error on Etron xHCI host") Cc: stable@vger.kernel.org Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Reviewed-by: Kuangyi Chiang <ki.chiang65@gmail.com> Link: https://lore.kernel.org/r/20250410151828.2868740-5-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-02	xhci: Limit time spent with xHC interrupts disabled during bus resume	Mathias Nyman	3	-16/+20
	commit bea5892d0ed274e03655223d1977cf59f9aff2f2 upstream. Current xhci bus resume implementation prevents xHC host from generating interrupts during high-speed USB 2 and super-speed USB 3 bus resume. Only reason to disable interrupts during bus resume would be to prevent the interrupt handler from interfering with the resume process of USB 2 ports. Host initiated resume of USB 2 ports is done in two stages. The xhci driver first transitions the port from 'U3' to 'Resume' state, then wait in Resume for 20ms, and finally moves port to U0 state. xhci driver can't prevent interrupts by keeping the xhci spinlock due to this 20ms sleep. Limit interrupt disabling to the USB 2 port resume case only. resuming USB 2 ports in bus resume is only done in special cases where USB 2 ports had to be forced to suspend during bus suspend. The current way of preventing interrupts by clearing the 'Interrupt Enable' (INTE) bit in USBCMD register won't prevent the Interrupter registers 'Interrupt Pending' (IP), 'Event Handler Busy' (EHB) and USBSTS register Event Interrupt (EINT) bits from being set. New interrupts can't be issued before those bits are properly clered. Disable interrupts by clearing the interrupter register 'Interrupt Enable' (IE) bit instead. This way IP, EHB and INTE won't be set before IE is enabled again and a new interrupt is triggered. Reported-by: Devyn Liu <liudingyuan@huawei.com> Closes: https://lore.kernel.org/linux-usb/b1a9e2d51b4d4ff7a304f77c5be8164e@huawei.com/ Cc: stable@vger.kernel.org Tested-by: Devyn Liu <liudingyuan@huawei.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250410151828.2868740-6-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-10	usb: xhci: correct debug message page size calculation	Niklas Neronin	1	-3/+3
	[ Upstream commit 55741c723318905e6d5161bf1e12749020b161e3 ] The ffs() function returns the index of the first set bit, starting from 1. If no bits are set, it returns zero. This behavior causes an off-by-one page size in the debug message, as the page size calculation [1] is zero-based, while ffs() is one-based. Fix this by subtracting one from the result of ffs(). Note that since variable 'val' is unsigned, subtracting one from zero will result in the maximum unsigned integer value. Consequently, the condition 'if (val < 16)' will still function correctly. [1], Page size: (2^(n+12)), where 'n' is the set page size bit. Fixes: 81720ec5320c ("usb: host: xhci: use ffs() in xhci_mem_init()") Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250306144954.3507700-9-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-04-07	usb: xhci: Apply the link chain quirk on NEC isoc endpoints	Michal Pecio	1	-2/+11
	commit bb0ba4cb1065e87f9cc75db1fa454e56d0894d01 upstream. Two clearly different specimens of NEC uPD720200 (one with start/stop bug, one without) were seen to cause IOMMU faults after some Missed Service Errors. Faulting address is immediately after a transfer ring segment and patched dynamic debug messages revealed that the MSE was received when waiting for a TD near the end of that segment: [ 1.041954] xhci_hcd: Miss service interval error for slot 1 ep 2 expected TD DMA ffa08fe0 [ 1.042120] xhci_hcd: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0005 address=0xffa09000 flags=0x0000] [ 1.042146] xhci_hcd: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0005 address=0xffa09040 flags=0x0000] It gets even funnier if the next page is a ring segment accessible to the HC. Below, it reports MSE in segment at ff1e8000, plows through a zero-filled page at ff1e9000 and starts reporting events for TRBs in page at ff1ea000 every microframe, instead of jumping to seg ff1e6000. [ 7.041671] xhci_hcd: Miss service interval error for slot 1 ep 2 expected TD DMA ff1e8fe0 [ 7.041999] xhci_hcd: Miss service interval error for slot 1 ep 2 expected TD DMA ff1e8fe0 [ 7.042011] xhci_hcd: WARN: buffer overrun event for slot 1 ep 2 on endpoint [ 7.042028] xhci_hcd: All TDs skipped for slot 1 ep 2. Clear skip flag. [ 7.042134] xhci_hcd: WARN: buffer overrun event for slot 1 ep 2 on endpoint [ 7.042138] xhci_hcd: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 31 [ 7.042144] xhci_hcd: Looking for event-dma 00000000ff1ea040 trb-start 00000000ff1e6820 trb-end 00000000ff1e6820 [ 7.042259] xhci_hcd: WARN: buffer overrun event for slot 1 ep 2 on endpoint [ 7.042262] xhci_hcd: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 31 [ 7.042266] xhci_hcd: Looking for event-dma 00000000ff1ea050 trb-start 00000000ff1e6820 trb-end 00000000ff1e6820 At some point completion events change from Isoch Buffer Overrun to Short Packet and the HC finally finds cycle bit mismatch in ff1ec000. [ 7.098130] xhci_hcd: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13 [ 7.098132] xhci_hcd: Looking for event-dma 00000000ff1ecc50 trb-start 00000000ff1e6820 trb-end 00000000ff1e6820 [ 7.098254] xhci_hcd: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13 [ 7.098256] xhci_hcd: Looking for event-dma 00000000ff1ecc60 trb-start 00000000ff1e6820 trb-end 00000000ff1e6820 [ 7.098379] xhci_hcd: Overrun event on slot 1 ep 2 It's possible that data from the isochronous device were written to random buffers of pending TDs on other endpoints (either IN or OUT), other devices or even other HCs in the same IOMMU domain. Lastly, an error from a different USB device on another HC. Was it caused by the above? I don't know, but it may have been. The disk was working without any other issues and generated PCIe traffic to starve the NEC of upstream BW and trigger those MSEs. The two HCs shared one x1 slot by means of a commercial "PCIe splitter" board. [ 7.162604] usb 10-2: reset SuperSpeed USB device number 3 using xhci_hcd [ 7.178990] sd 9:0:0:0: [sdb] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x07 driverbyte=DRIVER_OK cmd_age=0s [ 7.179001] sd 9:0:0:0: [sdb] tag#0 CDB: opcode=0x28 28 00 04 02 ae 00 00 02 00 00 [ 7.179004] I/O error, dev sdb, sector 67284480 op 0x0:(READ) flags 0x80700 phys_seg 5 prio class 0 Fortunately, it appears that this ridiculous bug is avoided by setting the chain bit of Link TRBs on isochronous rings. Other ancient HCs are known which also expect the bit to be set and they ignore Link TRBs if it's not. Reportedly, 0.95 spec guaranteed that the bit is set. The bandwidth-starved NEC HC running a 32KB/uframe UVC endpoint reports tens of MSEs per second and runs into the bug within seconds. Chaining Link TRBs allows the same workload to run for many minutes, many times. No negative side effects seen in UVC recording and UAC playback with a few devices at full speed, high speed and SuperSpeed. The problem doesn't reproduce on the newer Renesas uPD720201/uPD720202 and on old Etron EJ168 and VIA VL805 (but the VL805 has other bug). [shorten line length of log snippets in commit messge -Mathias] Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250306144954.3507700-14-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-07	usb: xhci: Don't skip on Stopped - Length Invalid	Michal Pecio	1	-0/+4
	commit 58d0a3fab5f4fdc112c16a4c6d382f62097afd1c upstream. Up until commit d56b0b2ab142 ("usb: xhci: ensure skipped isoc TDs are returned when isoc ring is stopped") in v6.11, the driver didn't skip missed isochronous TDs when handling Stoppend and Stopped - Length Invalid events. Instead, it erroneously cleared the skip flag, which would cause the ring to get stuck, as future events won't match the missed TD which is never removed from the queue until it's cancelled. This buggy logic seems to have been in place substantially unchanged since the 3.x series over 10 years ago, which probably speaks first and foremost about relative rarity of this case in normal usage, but by the spec I see no reason why it shouldn't be possible. After d56b0b2ab142, TDs are immediately skipped when handling those Stopped events. This poses a potential problem in case of Stopped - Length Invalid, which occurs either on completed TDs (likely already given back) or Link and No-Op TRBs. Such event won't be recognized as matching any TD (unless it's the rare Link TRB inside a TD) and will result in skipping all pending TDs, giving them back possibly before they are done, risking isoc data loss and maybe UAF by HW. As a compromise, don't skip and don't clear the skip flag on this kind of event. Then the next event will skip missed TDs. A downside of not handling Stopped - Length Invalid on a Link inside a TD is that if the TD is cancelled, its actual length will not be updated to account for TRBs (silently) completed before the TD was stopped. I had no luck producing this sequence of completion events so there is no compelling demonstration of any resulting disaster. It may be a very rare, obscure condition. The sole motivation for this patch is that if such unlikely event does occur, I'd rather risk reporting a cancelled partially done isoc frame as empty than gamble with UAF. This will be fixed more properly by looking at Stopped event's TRB pointer when making skipping decisions, but such rework is unlikely to be backported to v6.12, which will stay around for a few years. Fixes: d56b0b2ab142 ("usb: xhci: ensure skipped isoc TDs are returned when isoc ring is stopped") Cc: stable@vger.kernel.org Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250306144954.3507700-4-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-04	usb: xhci: Fix host controllers "dying" after suspend and resume	Michal Pecio	1	-1/+5
	A recent cleanup went a bit too far and dropped clearing the cycle bit of link TRBs, so it stays different from the rest of the ring half of the time. Then a race occurs: if the xHC reaches such link TRB before more commands are queued, the link's cycle bit unintentionally matches the xHC's cycle so it follows the link and waits for further commands. If more commands are queued before the xHC gets there, inc_enq() flips the bit so the xHC later sees a mismatch and stops executing commands. This function is called before suspend and 50% of times after resuming the xHC is doomed to get stuck sooner or later. Then some Stop Endpoint command fails to complete in 5 seconds and this shows up xhci_hcd 0000:00:10.0: xHCI host not responding to stop endpoint command xhci_hcd 0000:00:10.0: xHCI host controller not responding, assume dead xhci_hcd 0000:00:10.0: HC died; cleaning up followed by loss of all USB decives on the affected bus. That's if you are lucky, because if Set Deq gets stuck instead, the failure is silent. Likely responsible for kernel bug 219824. I found this while searching for possible causes of that regression and reproduced it locally before hearing back from the reporter. To repro, simply wait for link cycle to become set (debugfs), then suspend, resume and wait. To accelerate the failure I used a script which repeatedly starts and stops a UVC camera. Some HCs get fully reinitialized on resume and they are not affected. Link: https://bugzilla.kernel.org/show_bug.cgi?id=219824 Fixes: 36b972d4b7ce ("usb: xhci: improve xhci_clear_command_ring()") Cc: stable@vger.kernel.org Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250304113147.3322584-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-03	xhci: Restrict USB4 tunnel detection for USB3 devices to Intel hosts	Marc Zyngier	1	-0/+8
	When adding support for USB3-over-USB4 tunnelling detection, a check for an Intel-specific capability was added. This capability, which goes by ID 206, is used without any check that we are actually dealing with an Intel host. As it turns out, the Cadence XHCI controller also exposes an extended capability numbered 206 (for unknown purposes), but of course doesn't have the Intel-specific registers that the tunnelling code is trying to access. Fun follows. The core of the problems is that the tunnelling code blindly uses vendor-specific capabilities without any check (the Intel-provided documentation I have at hand indicates that 192-255 are indeed vendor-specific). Restrict the detection code to Intel HW for real, preventing any further explosion on my (non-Intel) HW. Cc: stable <stable@kernel.org> Fixes: 948ce83fbb7df ("xhci: Add USB4 tunnel detection for USB3 devices on Intel hosts") Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250227194529.2288718-1-maz@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-27	usb: xhci: Enable the TRB overfetch quirk on VIA VL805	Michal Pecio	3	-5/+10
	Raspberry Pi is a major user of those chips and they discovered a bug - when the end of a transfer ring segment is reached, up to four TRBs can be prefetched from the next page even if the segment ends with link TRB and on page boundary (the chip claims to support standard 4KB pages). It also appears that if the prefetched TRBs belong to a different ring whose doorbell is later rung, they may be used without refreshing from system RAM and the endpoint will stay idle if their cycle bit is stale. Other users complain about IOMMU faults on x86 systems, unsurprisingly. Deal with it by using existing quirk which allocates a dummy page after each transfer ring segment. This was seen to resolve both problems. RPi came up with a more efficient solution, shortening each segment by four TRBs, but it complicated the driver and they ditched it for this quirk. Also rename the quirk and add VL805 device ID macro. Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Link: https://github.com/raspberrypi/linux/issues/4685 Closes: https://bugzilla.kernel.org/show_bug.cgi?id=215906 CC: stable@vger.kernel.org Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250225095927.2512358-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-14	usb: xhci: Restore xhci_pci support for Renesas HCs	Michal Pecio	1	-3/+4
	Some Renesas HCs require firmware upload to work, this is handled by the xhci_pci_renesas driver. Other variants of those chips load firmware from a SPI flash and are ready to work with xhci_pci alone. A refactor merged in v6.12 broke the latter configuration so that users are finding their hardware ignored by the normal driver and are forced to enable the firmware loader which isn't really necessary on their systems. Let xhci_pci work with those chips as before when the firmware loader is disabled by kernel configuration. Fixes: 25f51b76f90f ("xhci-pci: Make xhci-pci-renesas a proper modular driver") Cc: stable <stable@kernel.org> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219616 Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219726 Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Tested-by: Nicolai Buchwitz <nb@tipi-net.de> Link: https://lore.kernel.org/r/20250128104529.58a79bfc@foxbook Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-14	USB: pci-quirks: Fix HCCPARAMS register error for LS7A EHCI	Huacai Chen	1	-0/+9
	LS7A EHCI controller doesn't have extended capabilities, so the EECP (EHCI Extended Capabilities Pointer) field of HCCPARAMS register should be 0x0, but it reads as 0xa0 now. This is a hardware flaw and will be fixed in future, now just clear the EECP field to avoid error messages on boot: ...... [ 0.581675] pci 0000:00:04.1: EHCI: unrecognized capability ff [ 0.581699] pci 0000:00:04.1: EHCI: unrecognized capability ff [ 0.581716] pci 0000:00:04.1: EHCI: unrecognized capability ff [ 0.581851] pci 0000:00:04.1: EHCI: unrecognized capability ff ...... [ 0.581916] pci 0000:00:05.1: EHCI: unrecognized capability ff [ 0.581951] pci 0000:00:05.1: EHCI: unrecognized capability ff [ 0.582704] pci 0000:00:05.1: EHCI: unrecognized capability ff [ 0.582799] pci 0000:00:05.1: EHCI: unrecognized capability ff ...... Cc: stable <stable@kernel.org> Signed-off-by: Baoqi Zhang <zhangbaoqi@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Link: https://lore.kernel.org/r/20250202124935.480500-1-chenhuacai@loongson.cn Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-28	Merge tag 'driver-core-6.14-rc1' of ↵	Linus Torvalds	1	-21/+4
	git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core and debugfs updates from Greg KH: "Here is the big set of driver core and debugfs updates for 6.14-rc1. Included in here is a bunch of driver core, PCI, OF, and platform rust bindings (all acked by the different subsystem maintainers), hence the merge conflict with the rust tree, and some driver core api updates to mark things as const, which will also require some fixups due to new stuff coming in through other trees in this merge window. There are also a bunch of debugfs updates from Al, and there is at least one user that does have a regression with these, but Al is working on tracking down the fix for it. In my use (and everyone else's linux-next use), it does not seem like a big issue at the moment. Here's a short list of the things in here: - driver core rust bindings for PCI, platform, OF, and some i/o functions. We are almost at the "write a real driver in rust" stage now, depending on what you want to do. - misc device rust bindings and a sample driver to show how to use them - debugfs cleanups in the fs as well as the users of the fs api for places where drivers got it wrong or were unnecessarily doing things in complex ways. - driver core const work, making more of the api take const * for different parameters to make the rust bindings easier overall. - other small fixes and updates All of these have been in linux-next with all of the aforementioned merge conflicts, and the one debugfs issue, which looks to be resolved "soon"" * tag 'driver-core-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (95 commits) rust: device: Use as_char_ptr() to avoid explicit cast rust: device: Replace CString with CStr in property_present() devcoredump: Constify 'struct bin_attribute' devcoredump: Define 'struct bin_attribute' through macro rust: device: Add property_present() saner replacement for debugfs_rename() orangefs-debugfs: don't mess with ->d_name octeontx2: don't mess with ->d_parent or ->d_parent->d_name arm_scmi: don't mess with ->d_parent->d_name slub: don't mess with ->d_name sof-client-ipc-flood-test: don't mess with ->d_name qat: don't mess with ->d_name xhci: don't mess with ->d_iname mtu3: don't mess wiht ->d_iname greybus/camera - stop messing with ->d_iname mediatek: stop messing with ->d_iname netdevsim: don't embed file_operations into your structs b43legacy: make use of debugfs_get_aux() b43: stop embedding struct file_operations into their objects carl9170: stop embedding file_operations into their objects ...
2025-01-17	usb: xhci: tegra: Fix OF boolean read warning	Jon Hunter	1	-1/+1
	After commit c141ecc3cecd ("of: Warn when of_property_read_bool() is used on non-boolean properties") was added, the following warning is observed for the Tegra XHCI driver ... OF: /bus@0/usb@3610000: Read of boolean property 'power-domains' with a value. Previously, of_property_read_bool() was used to determine if a property was present but has now been replaced by of_property_present(). The warning is meant to prevent new users but this user existed before the change was made. Fix this by updating the Tegra XHCI driver to use of_property_present() function to determine if the 'power-domains' property is present. Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Link: https://lore.kernel.org/r/20250116153829.477360-1-jonathanh@nvidia.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-17	usb: host: xhci-plat: add support compatible ID PNP0D15	Chunfeng Yun	1	-0/+1
	Add support compatible ID PNP0D15 which declare that the xHCI controller doesn't support standard debug capability. Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com> Link: https://lore.kernel.org/r/20250116125141.25856-1-chunfeng.yun@mediatek.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-15	USB: host: Use str_enable_disable-like helpers	Krzysztof Kozlowski	5	-7/+12
	Replace ternary (condition ? "enable" : "disable") syntax with helpers from string_choices.h because: 1. Simple function call with one argument is easier to read. Ternary operator has three arguments and with wrapping might lead to quite long code. 2. Is slightly shorter thus also easier to read. 3. It brings uniformity in the text - same string. 4. Allows deduping by the linker, which results in a smaller binary file. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20250114-str-enable-disable-usb-v1-2-c8405df47c19@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-15	xhci: don't mess with ->d_iname	Al Viro	1	-21/+4
	use debugs_{creat_file,get}_aux() instead Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Link: https://lore.kernel.org/r/20250112080705.141166-14-viro@zeniv.linux.org.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-13	Merge 6.13-rc7 into usb-next	Greg Kroah-Hartman	1	-1/+2
	We need the USB fixes in here as well for testing. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-11	usb: host: xhci-plat: Assign shared_hcd->rsrc_start	WangYuli	1	-0/+2
	When inserting a USB device, examining hcd->rsrc_start can be helpful in identifying which hcd is mounted, as the physical address represented here is typically unique. The following code snippet demonstrates this: struct usb_hcd *hcd = bus_to_hcd(udev->bus); unsigned long long usb_hcd_addr = (unsigned long long)hcd->rsrc_start; However, this approach has limitations now. For USB hosts with an MMIO interface, the effectiveness of this method is restricted to USB 2.0. Because commit 3429e91a661e ("usb: host: xhci: add platform driver support") assigned res->start to hcd->rsrc_start. But shared_hcd->rsrc_start remains unassigned, which is also necessary in certain scenarios. Fixes: 3429e91a661e ("usb: host: xhci: add platform driver support") Co-developed-by: Xu Rao <raoxu@uniontech.com> Signed-off-by: Xu Rao <raoxu@uniontech.com> Signed-off-by: WangYuli <wangyuli@uniontech.com> Link: https://lore.kernel.org/r/186B9F56972457B4+20250107133854.172309-1-wangyuli@uniontech.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-27	xhci: Add missing capability definition bits	Mathias Nyman	1	-0/+6
	Add capability bits for the xHC Capability Parameters 2 (HCCPARAMS2) register described in xHCI specification section 5.3.9 bit 7 Extended TBC TRB Status Capability (ETC_TSC) bit 8 Get/Set Extended Property Capability (GSC) bit 9 Virtualization Based Trusted I/O Capability (VTC) Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241227120142.1035206-6-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-27	xhci: Add command completion parameter support	Mathias Nyman	2	-4/+10
	xHC hosts can pass 24 bits of data with a command completion event TRB as the completion code only uses 8 bits of the 32 bit status field. Only Configure Endpoint, and Get Extended Property commands utilize this "command completion parameter" 24 bit field. For other command completion events the xHC should keep it RsvdZ. Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241227120142.1035206-5-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-27	usb: xhci: Fix NULL pointer dereference on certain command aborts	Michal Pecio	1	-1/+2
	If a command is queued to the final usable TRB of a ring segment, the enqueue pointer is advanced to the subsequent link TRB and no further. If the command is later aborted, when the abort completion is handled the dequeue pointer is advanced to the first TRB of the next segment. If no further commands are queued, xhci_handle_stopped_cmd_ring() sees the ring pointers unequal and assumes that there is a pending command, so it calls xhci_mod_cmd_timer() which crashes if cur_cmd was NULL. Don't attempt timer setup if cur_cmd is NULL. The subsequent doorbell ring likely is unnecessary too, but it's harmless. Leave it alone. This is probably Bug 219532, but no confirmation has been received. The issue has been independently reproduced and confirmed fixed using a USB MCU programmed to NAK the Status stage of SET_ADDRESS forever. Everything continued working normally after several prevented crashes. Link: https://bugzilla.kernel.org/show_bug.cgi?id=219532 Fixes: c311e391a7ef ("xhci: rework command timeout and cancellation,") CC: stable@vger.kernel.org Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241227120142.1035206-4-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-27	xhci: dbgtty: Improve performance by handling received data immediately.	Mathias Nyman	1	-33/+65
	Improve dbc transfer rate performance by copying the received data to the tty buffer directly in the request complete callback function if possible. Only defer it in case there is already pending deferred work, tty is throttled, or we fail copy the data to the tty buffer The request complete callback is already called by a workqueue. This is part 3/3 of a dbc performance improvement series that roughly triples dbc performace when using adb push and pull over dbc. Max/min push rate after patches is 210/118 MB/s, pull rate 171/133 MB/s, tested with large files (300MB-9GB) by Łukasz Bartosik Cc: Łukasz Bartosik <ukaszb@chromium.org> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241227120142.1035206-3-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-27	xhci: dbc: Improve performance by removing delay in transfer event polling.	Mathias Nyman	1	-1/+1
	Queue event polling work with 0 delay in case there are pending transfers queued up. This is part 2 of a 3 part series that roughly triples dbc performace when using adb push and pull over dbc. Max/min push rate after patches is 210/118 MB/s, pull rate 171/133 MB/s, tested with large files (300MB-9GB) by Łukasz Bartosik First performance improvement patch was commit 31128e7492dc ("xhci: dbc: add dbgtty request to end of list once it completes") Cc: Łukasz Bartosik <ukaszb@chromium.org> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241227120142.1035206-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-23	usb: host: xhci-plat: set skip_phy_initialization if software node has ↵	Xu Yang	1	-1/+2
	XHCI_SKIP_PHY_INIT property The source of quirk XHCI_SKIP_PHY_INIT comes from xhci_plat_priv.quirks or software node property. This will set skip_phy_initialization if software node also has XHCI_SKIP_PHY_INIT property. Fixes: a6cd2b3fa894 ("usb: host: xhci-plat: Parse xhci-missing_cas_quirk and apply quirk") Cc: stable <stable@kernel.org> Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Link: https://lore.kernel.org/r/20241209111423.4085548-1-xu.yang_2@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-17	usb: xhci: fix ring expansion regression in 6.13-rc1	Niklas Neronin	1	-1/+1
	The source and destination rings were incorrectly assigned during the ring linking process. The "source" ring, which contains the new segments, was not spliced into the "destination" ring, leading to incorrect ring expansion. Fixes: fe688e500613 ("usb: xhci: refactor xhci_link_rings() to use source and destination rings") Reported-by: Jeff Chua <jeff.chua.linux@gmail.com> Closes: https://lore.kernel.org/lkml/CAAJw_ZtppNqC9XA=-WVQDr+vaAS=di7jo15CzSqONeX48H75MA@mail.gmail.com/ Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241217102122.2316814-3-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-17	xhci: Turn NEC specific quirk for handling Stop Endpoint errors generic	Mathias Nyman	1	-2/+0
	xHC hosts from several vendors have the same issue where endpoints start so slowly that a later queued 'Stop Endpoint' command may complete before endpoint is up and running. The 'Stop Endpoint' command fails with context state error as the endpoint still appears as stopped. See commit 42b758137601 ("usb: xhci: Limit Stop Endpoint retries") for details CC: stable@vger.kernel.org Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241217102122.2316814-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-04	usb: host: max3421-hcd: Correctly abort a USB request.	Mark Tomlinson	1	-5/+11
	If the current USB request was aborted, the spi thread would not respond to any further requests. This is because the "curr_urb" pointer would not become NULL, so no further requests would be taken off the queue. The solution here is to set the "urb_done" flag, as this will cause the correct handling of the URB. Also clear interrupts that should only be expected if an URB is in progress. Fixes: 2d53139f3162 ("Add support for using a MAX3421E chip as a host driver.") Cc: stable <stable@kernel.org> Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz> Link: https://lore.kernel.org/r/20241124221430.1106080-1-mark.tomlinson@alliedtelesis.co.nz Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-04	usb: ehci-hcd: fix call balance of clocks handling routines	Vitalii Mordan	1	-2/+7
	If the clocks priv->iclk and priv->fclk were not enabled in ehci_hcd_sh_probe, they should not be disabled in any path. Conversely, if they was enabled in ehci_hcd_sh_probe, they must be disabled in all error paths to ensure proper cleanup. Found by Linux Verification Center (linuxtesting.org) with Klever. Fixes: 63c845522263 ("usb: ehci-hcd: Add support for SuperH EHCI.") Cc: stable@vger.kernel.org # ff30bd6a6618: sh: clk: Fix clk_enable() to return 0 on NULL clk Signed-off-by: Vitalii Mordan <mordan@ispras.ru> Reviewed-by: Alan Stern <stern@rowland.harvard.edu> Link: https://lore.kernel.org/r/20241121114700.2100520-1-mordan@ispras.ru Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-02	module: Convert symbol namespace to string literal	Peter Zijlstra	2	-3/+3
	Clean up the existing export namespace code along the same lines of commit 33def8498fdd ("treewide: Convert macro and uses of __section(foo) to __section("foo")") and for the same reason, it is not desired for the namespace argument to be a macro expansion itself. Scripted using git grep -l -e MODULE_IMPORT_NS -e EXPORT_SYMBOL_NS \| while read file; do awk -i inplace ' /^#define EXPORT_SYMBOL_NS/ { gsub(/__stringify$ns$/, "ns"); print; next; } /^#define MODULE_IMPORT_NS/ { gsub(/__stringify$ns$/, "ns"); print; next; } /MODULE_IMPORT_NS/ { $0 = gensub(/MODULE_IMPORT_NS$([^)])$/, "MODULE_IMPORT_NS(\"\\1\")", "g"); } /EXPORT_SYMBOL_NS/ { if ($0 ~ /(EXPORT_SYMBOL_NS[^(])$([^,]+),/) { if ($0 !~ /(EXPORT_SYMBOL_NS[^(])\(([^,]+), ([^)]+)$/ && $0 !~ /(EXPORT_SYMBOL_NS[^(])/ && $0 !~ /^my/) { getline line; gsub(/[[:space:]]\\$/, ""); gsub(/[[:space:]]/, "", line); $0 = $0 " " line; } $0 = gensub(/(EXPORT_SYMBOL_NS[^(])$([^,]+), ([^)]+)$/, "\\1(\\2, \"\\3\")", "g"); } } { print }' $file; done Requested-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://mail.google.com/mail/u/2/#inbox/FMfcgzQXKWgMmjdFwwdsfgxzKpVHWPlc Acked-by: Greg KH <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-11-16	usb: ehci-spear: fix call balance of sehci clk handling routines	Vitalii Mordan	1	-3/+4
	If the clock sehci->clk was not enabled in spear_ehci_hcd_drv_probe, it should not be disabled in any path. Conversely, if it was enabled in spear_ehci_hcd_drv_probe, it must be disabled in all error paths to ensure proper cleanup. Found by Linux Verification Center (linuxtesting.org) with Klever. Fixes: 7675d6ba436f ("USB: EHCI: make ehci-spear a separate driver") Cc: stable@vger.kernel.org Signed-off-by: Vitalii Mordan <mordan@ispras.ru> Acked-by: Alan Stern <stern@rowland.harvard.edu> Link: https://lore.kernel.org/r/20241114230310.432213-1-mordan@ispras.ru Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-13	drivers/usb/host: refactor min/max with min_t/max_t	Sabyrzhan Tasbolatov	3	-4/+4
	Ensure type safety by using min_t/max_t instead of casted min/max. Signed-off-by: Sabyrzhan Tasbolatov <snovitoll@gmail.com> Link: https://lore.kernel.org/r/20241112155817.3512577-4-snovitoll@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-13	usb: cdns3: Synchronise PCI IDs via common data base	Andy Shevchenko	1	-3/+2
	There are a few places in the kernel where PCI IDs for different Cadence USB controllers are being used. Besides different naming, they duplicate each other. Make this all in order by providing common definitions via PCI IDs database and use in all users. While doing that, rename definitions as Roger suggested. Suggested-by: Roger Quadros <rogerq@kernel.org> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20241112160125.2340972-1-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06	usb: xhci: Avoid queuing redundant Stop Endpoint commands	Michal Pecio	3	-4/+29
	Stop Endpoint command on an already stopped endpoint fails and may be misinterpreted as a known hardware bug by the completion handler. This results in an unnecessary delay with repeated retries of the command. Avoid queuing this command when endpoint state flags indicate that it's stopped or halted and the command will fail. If commands are pending on the endpoint, their completion handlers will process cancelled TDs so it's done. In case of waiting for external operations like clearing TT buffer, the endpoint is stopped and cancelled TDs can be processed now. This eliminates practically all unnecessary retries because an endpoint with pending URBs is maintained in Running state by the driver, unless aforementioned commands or other operations are pending on it. This is guaranteed by xhci_ring_ep_doorbell() and by the fact that it is called every time any of those operations completes. The only known exceptions are hardware bugs (the endpoint never starts at all) and Stream Protocol errors not associated with any TRB, which cause an endpoint reset not followed by restart. Sounds like a bug. Generally, these retries are only expected to happen when the endpoint fails to start for unknown/no reason, which is a worse problem itself, and fixing the bug eliminates the retries too. All cases were tested and found to work as expected. SET_DEQ_PENDING was produced by patching uvcvideo to unlink URBs in 100us intervals, which then runs into this case very often. EP_HALTED was produced by restarting 'cat /dev/ttyUSB0' on a serial dongle with broken cable. EP_CLEARING_TT by the same, with the dongle on an external hub. Fixes: fd9d55d190c0 ("xhci: retry Stop Endpoint on buggy NEC controllers") CC: stable@vger.kernel.org Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-34-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06	usb: xhci: Fix TD invalidation under pending Set TR Dequeue	Michal Pecio	1	-5/+13
	xhci_invalidate_cancelled_tds() may not work correctly if the hardware is modifying endpoint or stream contexts at the same time by executing a Set TR Dequeue command. And even if it worked, it would be unable to queue Set TR Dequeue for the next stream, failing to clear xHC cache. On stream endpoints, a chain of Set TR Dequeue commands may take some time to execute and we may want to cancel more TDs during this time. Currently this leads to Stop Endpoint completion handler calling this function without testing for SET_DEQ_PENDING, which will trigger the aforementioned problems when it happens. On all endpoints, a halt condition causes Reset Endpoint to be queued and an error status given to the class driver, which may unlink more URBs in response. Stop Endpoint is queued and its handler may execute concurrently with Set TR Dequeue queued by Reset Endpoint handler. (Reset Endpoint handler calls this function too, but there seems to be no possibility of it running concurrently with Set TR Dequeue). Fix xhci_invalidate_cancelled_tds() to work correctly under a pending Set TR Dequeue. Bail out of the function when SET_DEQ_PENDING is set, then make the completion handler call the function again and also call xhci_giveback_invalidated_tds(), which needs to be called next. This seems to fix another potential bug, where the handler would call xhci_invalidate_cancelled_tds(), which may clear some deferred TDs if a sanity check fails, and the TDs wouldn't be given back promptly. Said sanity check seems to be wrong and prone to false positives when the endpoint halts, but fixing it is beyond the scope of this change, besides ensuring that cleared TDs are given back properly. Fixes: 5ceac4402f5d ("xhci: Handle TD clearing for multiple streams case") CC: stable@vger.kernel.org Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-33-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06	usb: xhci: Limit Stop Endpoint retries	Michal Pecio	3	-4/+27
	Some host controllers fail to atomically transition an endpoint to the Running state on a doorbell ring and enter a hidden "Restarting" state, which looks very much like Stopped, with the important difference that it will spontaneously transition to Running anytime soon. A Stop Endpoint command queued in the Restarting state typically fails with Context State Error and the completion handler sees the Endpoint Context State as either still Stopped or already Running. Even a case of Halted was observed, when an error occurred right after the restart. The Halted state is already recovered from by resetting the endpoint. The Running state is handled by retrying Stop Endpoint. The Stopped state was recognized as a problem on NEC controllers and worked around also by retrying, because the endpoint soon restarts and then stops for good. But there is a risk: the command may fail if the endpoint is "stopped for good" already, and retries will fail forever. The possibility of this was not realized at the time, but a number of cases were discovered later and reproduced. Some proved difficult to deal with, and it is outright impossible to predict if an endpoint may fail to ever start at all due to a hardware bug. One such bug (albeit on ASM3142, not on NEC) was found to be reliably triggered simply by toggling an AX88179 NIC up/down in a tight loop for a few seconds. An endless retries storm is quite nasty. Besides putting needless load on the xHC and CPU, it causes URBs never to be given back, paralyzing the device and connection/disconnection logic for the whole bus if the device is unplugged. User processes waiting for URBs become unkillable, drivers and kworker threads lock up and xhci_hcd cannot be reloaded. For peace of mind, impose a timeout on Stop Endpoint retries in this case. If they don't succeed in 100ms, consider the endpoint stopped permanently for some reason and just give back the unlinked URBs. This failure case is rare already and work is under way to make it rarer. Start this work today by also handling one simple case of race with Reset Endpoint, because it costs just two lines to implement. Fixes: fd9d55d190c0 ("xhci: retry Stop Endpoint on buggy NEC controllers") CC: stable@vger.kernel.org Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-32-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06	usb: xhci: remove irrelevant comment	Niklas Neronin	1	-5/+0
	The code which it is referencing does not exist in the same function, or the file for that matter. Since it was added [1], the Interrupter Moderation Interval can be changed within xhci addon, e.g. PCI xhci_pci_setup(). [1], commit 0ebbab374223 ("USB: xhci: Ring allocation and initialization.") Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-31-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06	usb: xhci: add help function xhci_dequeue_td()	Niklas Neronin	1	-16/+13
	Add xhci_dequeue_td() helper function to reduce code duplication. Function xhci_dequeue_td() advances the dequeue pointer past the specified Transfer Descriptor (TD) and releases the TD. Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-30-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06	usb: xhci: refactor xhci_td_cleanup() to return void	Niklas Neronin	1	-31/+28
	The function is modified to return 'void' instead of an integer since it invariably returns '0'. Additionally, multiple functions which only return xhci_td_cleanup() are also refactored to return void. This change eliminates the need for callers to handle a return value that does not convey meaningful information and improve code readability, as it becomes immediately clear that the function does not produce a significant output. Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-29-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>