summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2023-01-18Linux 6.1.7v6.1.7Greg Kroah-Hartman1-1/+1
Link: https://lore.kernel.org/r/20230116154803.321528435@linuxfoundation.org Tested-by: Salvatore Bonaccorso <carnil@debian.org> Tested-by: Ronald Warsow <rwarsow@gmx.de> Tested-by: Conor Dooley <conor.dooley@microchip.com> Tested-by: Shuah Khan <skhan@linuxfoundation.org> Tested-by: Justin M. Forbes <jforbes@fedoraproject.org> Tested-by: Takeshi Ogasawara <takeshi.ogasawara@futuring-girl.com> Tested-by: Rudi Heitbaum <rudi@heitbaum.com> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Tested-by: Ron Economos <re@w6rz.net> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk> Tested-by: Allen Pais <apais@linux.microsoft.com> Link: https://lore.kernel.org/r/20230117124546.116438951@linuxfoundation.org Tested-by: Allen Pais <apais@linux.microsoft.com> Tested-by: Ronald Warsow <rwarsow@gmx.de> Tested-by: Ron Economos <re@w6rz.net> Tested-by: Takeshi Ogasawara <takeshi.ogasawara@futuring-girl.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Kelsey Steele <kelseysteele@linux.microsoft.com> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Tested-by: Fenil Jain <fkjainco@gmail.com> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-18pinctrl: amd: Add dynamic debugging for active GPIOsMario Limonciello1-4/+6
commit 1d66e379731f79ae5039a869c0fde22a4f6a6a91 upstream. Some laptops have been reported to wake up from s2idle when plugging in the AC adapter or by closing the lid. This is a surprising behavior that is further clarified by commit cb3e7d624c3ff ("PM: wakeup: Add extra debugging statement for multiple active IRQs"). With that commit in place the following interaction can be seen when the lid is closed: [ 28.946038] PM: suspend-to-idle [ 28.946083] ACPI: EC: ACPI EC GPE status set [ 28.946101] ACPI: PM: Rearming ACPI SCI for wakeup [ 28.950152] Timekeeping suspended for 3.320 seconds [ 28.950152] PM: Triggering wakeup from IRQ 9 [ 28.950152] ACPI: EC: ACPI EC GPE status set [ 28.950152] ACPI: EC: ACPI EC GPE dispatched [ 28.995057] ACPI: EC: ACPI EC work flushed [ 28.995075] ACPI: PM: Rearming ACPI SCI for wakeup [ 28.995131] PM: Triggering wakeup from IRQ 9 [ 28.995271] ACPI: EC: ACPI EC GPE status set [ 28.995291] ACPI: EC: ACPI EC GPE dispatched [ 29.098556] ACPI: EC: ACPI EC work flushed [ 29.207020] ACPI: EC: ACPI EC work flushed [ 29.207037] ACPI: PM: Rearming ACPI SCI for wakeup [ 29.211095] Timekeeping suspended for 0.739 seconds [ 29.211095] PM: Triggering wakeup from IRQ 9 [ 29.211079] PM: Triggering wakeup from IRQ 7 [ 29.211095] ACPI: PM: ACPI non-EC GPE wakeup [ 29.211095] PM: resume from suspend-to-idle * IRQ9 on this laptop is used for the ACPI SCI. * IRQ7 on this laptop is used for the GPIO controller. What has occurred is when the lid was closed the EC woke up the SoC from it's deepest sleep state and the kernel's s2idle loop processed all EC events. When it was finished processing EC events, it checked for any other reasons to wake (break the s2idle loop). The IRQ for the GPIO controller was active so the loop broke, and then this IRQ was processed. This is not a kernel bug but it is certainly a surprising behavior, and to better debug it we should have a dynamic debugging message that we can enact to catch it. Acked-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com> Acked-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Acked-by: Mark Pearson <markpearson@lenovo.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20221013134729.5592-2-mario.limonciello@amd.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-18Revert "usb: ulpi: defer ulpi_register on ulpi_read_id timeout"Ferry Toth1-1/+1
commit b659b613cea2ae39746ca8bd2b69d1985dd9d770 upstream. This reverts commit 8a7b31d545d3a15f0e6f5984ae16f0ca4fd76aac. This patch results in some qemu test failures, specifically xilinx-zynq-a9 machine and zynq-zc702 as well as zynq-zed devicetree files, when trying to boot from USB drive. Link: https://lore.kernel.org/lkml/20221220194334.GA942039@roeck-us.net/ Fixes: 8a7b31d545d3 ("usb: ulpi: defer ulpi_register on ulpi_read_id timeout") Cc: stable@vger.kernel.org Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Ferry Toth <ftoth@exalondelft.nl> Link: https://lore.kernel.org/r/20221222205302.45761-1-ftoth@exalondelft.nl Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-18block: handle bio_split_to_limits() NULL returnJens Axboe8-2/+19
commit 613b14884b8595e20b9fac4126bf627313827fbe upstream. This can't happen right now, but in preparation for allowing bio_split_to_limits() returning NULL if it ended the bio, check for it in all the callers. Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-18io_uring/io-wq: only free worker if it was allocated for creationJens Axboe1-1/+6
commit e6db6f9398dadcbc06318a133d4c44a2d3844e61 upstream. We have two types of task_work based creation, one is using an existing worker to setup a new one (eg when going to sleep and we have no free workers), and the other is allocating a new worker. Only the latter should be freed when we cancel task_work creation for a new worker. Fixes: af82425c6a2d ("io_uring/io-wq: free worker if task_work creation is canceled") Reported-by: syzbot+d56ec896af3637bdb7e4@syzkaller.appspotmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-18io_uring/io-wq: free worker if task_work creation is canceledJens Axboe1-0/+1
commit af82425c6a2d2f347c79b63ce74fca6dc6be157f upstream. If we cancel the task_work, the worker will never come into existance. As this is the last reference to it, ensure that we get it freed appropriately. Cc: stable@vger.kernel.org Reported-by: 진호 <wnwlsgh98@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-18drm/i915: Fix CFI violations in gt_sysfsNathan Chancellor3-258/+220
commit a8a4f0467d706fc22d286dfa973946e5944b793c upstream. When booting with CONFIG_CFI_CLANG, there are numerous violations when accessing the files under /sys/devices/pci0000:00/0000:00:02.0/drm/card0/gt/gt0: $ cd /sys/devices/pci0000:00/0000:00:02.0/drm/card0/gt/gt0 $ grep . * id:0 punit_req_freq_mhz:350 rc6_enable:1 rc6_residency_ms:214934 rps_act_freq_mhz:1300 rps_boost_freq_mhz:1300 rps_cur_freq_mhz:350 rps_max_freq_mhz:1300 rps_min_freq_mhz:350 rps_RP0_freq_mhz:1300 rps_RP1_freq_mhz:350 rps_RPn_freq_mhz:350 throttle_reason_pl1:0 throttle_reason_pl2:0 throttle_reason_pl4:0 throttle_reason_prochot:0 throttle_reason_ratl:0 throttle_reason_status:0 throttle_reason_thermal:0 throttle_reason_vr_tdc:0 throttle_reason_vr_thermalert:0 $ sudo dmesg &| grep "CFI failure at" [ 214.595903] CFI failure at kobj_attr_show+0x19/0x30 (target: id_show+0x0/0x70 [i915]; expected type: 0xc527b809) [ 214.596064] CFI failure at kobj_attr_show+0x19/0x30 (target: punit_req_freq_mhz_show+0x0/0x40 [i915]; expected type: 0xc527b809) [ 214.596407] CFI failure at kobj_attr_show+0x19/0x30 (target: rc6_enable_show+0x0/0x40 [i915]; expected type: 0xc527b809) [ 214.596528] CFI failure at kobj_attr_show+0x19/0x30 (target: rc6_residency_ms_show+0x0/0x270 [i915]; expected type: 0xc527b809) [ 214.596682] CFI failure at kobj_attr_show+0x19/0x30 (target: act_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809) [ 214.596792] CFI failure at kobj_attr_show+0x19/0x30 (target: boost_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809) [ 214.596893] CFI failure at kobj_attr_show+0x19/0x30 (target: cur_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809) [ 214.596996] CFI failure at kobj_attr_show+0x19/0x30 (target: max_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809) [ 214.597099] CFI failure at kobj_attr_show+0x19/0x30 (target: min_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809) [ 214.597198] CFI failure at kobj_attr_show+0x19/0x30 (target: RP0_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809) [ 214.597301] CFI failure at kobj_attr_show+0x19/0x30 (target: RP1_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809) [ 214.597405] CFI failure at kobj_attr_show+0x19/0x30 (target: RPn_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809) [ 214.597538] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809) [ 214.597701] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809) [ 214.597836] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809) [ 214.597952] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809) [ 214.598071] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809) [ 214.598177] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809) [ 214.598307] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809) [ 214.598439] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809) [ 214.598542] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809) With kCFI, indirect calls are validated against their expected type versus actual type and failures occur when the two types do not match. The ultimate issue is that these sysfs functions are expecting to be called via dev_attr_show() but they may also be called via kobj_attr_show(), as certain files are created under two different kobjects that have two different sysfs_ops in intel_gt_sysfs_register(), hence the warnings above. When accessing the gt_ files under /sys/devices/pci0000:00/0000:00:02.0/drm/card0, which are using the same sysfs functions, there are no violations, meaning the functions are being called with the proper type. To make everything work properly, adjust certain functions to match the type of the ->show() and ->store() members in 'struct kobj_attribute'. Add a macro to generate functions for that can be called via both dev_attr_{show,store}() or kobj_attr_{show,store}() so that they can be called through both kobject locations without violating kCFI and adjust the attribute groups to account for this. Link: https://github.com/ClangBuiltLinux/linux/issues/1716 Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221013205909.1282545-1-nathan@kernel.org Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-18io_uring/poll: attempt request issue after racy poll wakeupJens Axboe1-11/+22
commit 6e5aedb9324aab1c14a23fae3d8eeb64a679c20e upstream. If we have multiple requests waiting on the same target poll waitqueue, then it's quite possible to get a request triggered and get disappointed in not being able to make any progress with it. If we race in doing so, we'll potentially leave the poll request on the internal tables, but removed from the waitqueue. That means that any subsequent trigger of the poll waitqueue will not kick that request into action, causing an application to potentially wait for completion of a request that will never happen. Fix this by adding a new poll return state, IOU_POLL_REISSUE. Rather than have complicated logic for how to re-arm a given type of request, just punt it for a reissue. While in there, move the 'ret' variable to the only section where it gets used. This avoids confusion the scope of it. Cc: stable@vger.kernel.org Fixes: eb0089d629ba ("io_uring: single shot poll removal optimisation") Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18io_uring: lock overflowing for IOPOLLPavel Begunkov1-1/+5
commit 544d163d659d45a206d8929370d5a2984e546cb7 upstream. syzbot reports an issue with overflow filling for IOPOLL: WARNING: CPU: 0 PID: 28 at io_uring/io_uring.c:734 io_cqring_event_overflow+0x1c0/0x230 io_uring/io_uring.c:734 CPU: 0 PID: 28 Comm: kworker/u4:1 Not tainted 6.2.0-rc3-syzkaller-16369-g358a161a6a9e #0 Workqueue: events_unbound io_ring_exit_work Call trace:  io_cqring_event_overflow+0x1c0/0x230 io_uring/io_uring.c:734  io_req_cqe_overflow+0x5c/0x70 io_uring/io_uring.c:773  io_fill_cqe_req io_uring/io_uring.h:168 [inline]  io_do_iopoll+0x474/0x62c io_uring/rw.c:1065  io_iopoll_try_reap_events+0x6c/0x108 io_uring/io_uring.c:1513  io_uring_try_cancel_requests+0x13c/0x258 io_uring/io_uring.c:3056  io_ring_exit_work+0xec/0x390 io_uring/io_uring.c:2869  process_one_work+0x2d8/0x504 kernel/workqueue.c:2289  worker_thread+0x340/0x610 kernel/workqueue.c:2436  kthread+0x12c/0x158 kernel/kthread.c:376  ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:863 There is no real problem for normal IOPOLL as flush is also called with uring_lock taken, but it's getting more complicated for IOPOLL|SQPOLL, for which __io_cqring_overflow_flush() happens from the CQ waiting path. Reported-and-tested-by: syzbot+6805087452d72929404e@syzkaller.appspotmail.com Cc: stable@vger.kernel.org # 5.10+ Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18efi: fix NULL-deref in init error pathJohan Hovold1-3/+6
[ Upstream commit 703c13fe3c9af557d312f5895ed6a5fda2711104 ] In cases where runtime services are not supported or have been disabled, the runtime services workqueue will never have been allocated. Do not try to destroy the workqueue unconditionally in the unlikely event that EFI initialisation fails to avoid dereferencing a NULL pointer. Fixes: 98086df8b70c ("efi: add missed destroy_workqueue when efisubsys_init fails") Cc: stable@vger.kernel.org Cc: Li Heng <liheng40@huawei.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18ALSA: usb-audio: Fix possible NULL pointer dereference in ↵Jaroslav Kysela1-1/+2
snd_usb_pcm_has_fixed_rate() [ Upstream commit 92a9c0ad86d47ff4cce899012e355c400f02cfb8 ] The subs function argument may be NULL, so do not use it before the NULL check. Fixes: 291e9da91403 ("ALSA: usb-audio: Always initialize fixed_rate in snd_usb_find_implicit_fb_sync_format()") Reported-by: coverity-bot <keescook@chromium.org> Link: https://lore.kernel.org/alsa-devel/202301121424.4A79A485@keescook/ Signed-off-by: Jaroslav Kysela <perex@perex.cz> Link: https://lore.kernel.org/r/20230113085311.623325-1-perex@perex.cz Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18platform/x86/amd: Fix refcount leak in amd_pmc_probeMiaoqian Lin1-1/+1
[ Upstream commit ccb32e2be14271a60e9ba89c6d5660cc9998773c ] pci_get_domain_bus_and_slot() takes reference, the caller should release the reference by calling pci_dev_put() after use. Call pci_dev_put() in the error path to fix this. Fixes: 3d7d407dfb05 ("platform/x86: amd-pmc: Add support for AMD Spill to DRAM STB feature") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20221229072534.1381432-1-linmq006@gmail.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18platform/surface: aggregator: Add missing call to ssam_request_sync_free()Maximilian Luz1-1/+3
[ Upstream commit c965daac370f08a9b71d573a71d13cda76f2a884 ] Although rare, ssam_request_sync_init() can fail. In that case, the request should be freed via ssam_request_sync_free(). Currently it is leaked instead. Fix this. Fixes: c167b9c7e3d6 ("platform/surface: Add Surface Aggregator subsystem") Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com> Link: https://lore.kernel.org/r/20221220175608.1436273-1-luzmaximilian@gmail.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18bnxt: make sure we return pages to the poolJakub Kicinski1-2/+2
[ Upstream commit 97f5e03a4a27d27ee4fed0cdb1658c81cf2784db ] Before the commit under Fixes the page would have been released from the pool before the napi_alloc_skb() call, so normal page freeing was fine (released page == no longer in the pool). After the change we just mark the page for recycling so it's still in the pool if the skb alloc fails, we need to recycle. Same commit added the same bug in the new bnxt_rx_multi_page_skb(). Fixes: 1dc4c557bfed ("bnxt: adding bnxt_xdp_build_skb to build skb from multibuffer xdp_buff") Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Link: https://lore.kernel.org/r/20230111042547.987749-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18net: hns3: fix wrong use of rss size during VF rss configJie Wang1-1/+1
[ Upstream commit ae9f29fdfd827ad06c1ae8155c042245a9d00757 ] Currently, it used old rss size to get current tc mode. As a result, the rss size is updated, but the tc mode is still configured based on the old rss size. So this patch fixes it by using the new rss size in both process. Fixes: 93969dc14fcd ("net: hns3: refactor VF rss init APIs with new common rss init APIs") Signed-off-by: Jie Wang <wangjie125@huawei.com> Signed-off-by: Hao Lan <lanhao@huawei.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://lore.kernel.org/r/20230110115359.10163-1-lanhao@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18net: lan966x: check for ptp to be enabled in lan966x_ptp_deinit()Clément Léger1-0/+3
[ Upstream commit b0e380b5d4275299adf43e249f18309331b6f54f ] If ptp was not enabled due to missing IRQ for instance, lan966x_ptp_deinit() will dereference NULL pointers. Fixes: d096459494a8 ("net: lan966x: Add support for ptp clocks") Signed-off-by: Clément Léger <clement.leger@bootlin.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18igc: Fix PPS delta between two synchronized end-pointsChristopher S Hall2-4/+8
[ Upstream commit 5e91c72e560cc85f7163bbe3d14197268de31383 ] This patch fix the pulse per second output delta between two synchronized end-points. Based on Intel Discrete I225 Software User Manual Section 4.2.15 TimeSync Auxiliary Control Register, ST0[Bit 4] and ST1[Bit 7] must be set to ensure that clock output will be toggles based on frequency value defined. This is to ensure that output of the PPS is aligned with the clock. How to test: 1) Running time synchronization on both end points. Ex: ptp4l --step_threshold=1 -m -f gPTP.cfg -i <interface name> 2) Configure PPS output using below command for both end-points Ex: SDP0 on I225 REV4 SKU variant ./testptp -d /dev/ptp0 -L 0,2 ./testptp -d /dev/ptp0 -p 1000000000 3) Measure the output using analyzer for both end-points Fixes: 87938851b6ef ("igc: enable auxiliary PHC functions for the i225") Signed-off-by: Christopher S Hall <christopher.s.hall@intel.com> Signed-off-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com> Acked-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18perf kmem: Support field "node" in evsel__process_alloc_event() coping with ↵Leo Yan1-12/+24
recent tracepoint restructuring [ Upstream commit dce088ab0d51ae3b14fb2bd608e9c649aadfe5dc ] Commit 11e9734bcb6a7361 ("mm/slab_common: unify NUMA and UMA version of tracepoints") adds the field "node" into the tracepoints 'kmalloc' and 'kmem_cache_alloc', so this patch modifies the event process function to support the field "node". If field "node" is detected by checking function evsel__field(), it stats the cross allocation. When the "node" value is NUMA_NO_NODE (-1), it means the memory can be allocated from any memory node, in this case, we don't account it as a cross allocation. Fixes: 11e9734bcb6a7361 ("mm/slab_common: unify NUMA and UMA version of tracepoints") Reported-by: Ravi Bangoria <ravi.bangoria@amd.com> Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vlastimil Babka <vbabka@suse.cz> Link: https://lore.kernel.org/r/20230108062400.250690-2-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18perf kmem: Support legacy tracepointsLeo Yan1-3/+26
[ Upstream commit b3719108ae60169eda5c941ca5e1be1faa371c57 ] Commit 11e9734bcb6a7361 ("mm/slab_common: unify NUMA and UMA version of tracepoints") removed tracepoints 'kmalloc_node' and 'kmem_cache_alloc_node', we need to consider the tool should be backward compatible. If it detect the tracepoint "kmem:kmalloc_node", this patch enables the legacy tracepoints, otherwise, it will ignore them. Fixes: 11e9734bcb6a7361 ("mm/slab_common: unify NUMA and UMA version of tracepoints") Reported-by: Ravi Bangoria <ravi.bangoria@amd.com> Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vlastimil Babka <vbabka@suse.cz> Link: https://lore.kernel.org/r/20230108062400.250690-1-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18perf build: Properly guard libbpf includesIan Rogers2-0/+8
[ Upstream commit d891f2b724b39a2a41e3ad7b57110193993242ff ] Including libbpf header files should be guarded by HAVE_LIBBPF_SUPPORT. In bpf_counter.h, move the skeleton utilities under HAVE_BPF_SKEL. Fixes: d6a735ef3277c45f ("perf bpf_counter: Move common functions to bpf_counter.h") Reported-by: Mike Leach <mike.leach@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Mike Leach <mike.leach@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20230105172243.7238-1-mike.leach@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18octeontx2-pf: Fix resource leakage in VF driver unbindHariprasad Kelam1-0/+2
[ Upstream commit 53da7aec32982f5ee775b69dce06d63992ce4af3 ] resources allocated like mcam entries to support the Ntuple feature and hash tables for the tc feature are not getting freed in driver unbind. This patch fixes the issue. Fixes: 2da489432747 ("octeontx2-pf: devlink params support to set mcam entry count") Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Link: https://lore.kernel.org/r/20230109061325.21395-1-hkelam@marvell.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18selftests/net: l2_tos_ttl_inherit.sh: Ensure environment cleanup on failure.Guillaume Nault1-4/+36
[ Upstream commit d68ff8ad3351b8fc8d6f14b9a4f5cc8ba3e8bd13 ] Use 'set -e' and an exit handler to stop the script if a command fails and ensure the test environment is cleaned up in any case. Also, handle the case where the script is interrupted by SIGINT. The only command that's expected to fail is 'wait $ping_pid', since it's killed by the script. Handle this case with '|| true' to make it play well with 'set -e'. Finally, return the Kselftest SKIP code (4) when the script breaks because of an environment problem or a command line failure. The 0 and 1 return codes should now reliably indicate that all tests have been run (0: all tests run and passed, 1: all tests run but at least one failed, 4: test script didn't run completely). Fixes: b690842d12fd ("selftests/net: test l2 tunnel TOS/TTL inheriting") Reported-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Tested-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18selftests/net: l2_tos_ttl_inherit.sh: Run tests in their own netns.Guillaume Nault1-69/+93
[ Upstream commit c53cb00f7983a5474f2d36967f84908b85af9159 ] This selftest currently runs half in the current namespace and half in a netns of its own. Therefore, the test can fail if the current namespace is already configured with incompatible parameters (for example if it already has a veth0 interface). Adapt the script to put both ends of the veth pair in their own netns. Now veth0 is created in NS0 instead of the current namespace, while veth1 is set up in NS1 (instead of the 'testing' netns). The user visible netns names are randomised to minimise the risk of conflicts with already existing namespaces. The cleanup() function doesn't need to remove the virtual interface anymore: deleting NS0 and NS1 automatically removes the virtual interfaces they contained. We can remove $ns, which was only used to run ip commands in the 'testing' netns (let's use the builtin "-netns" option instead). However, we still need a similar functionality as ping and tcpdump now need to run in NS0. So we now have $RUN_NS0 for that. Fixes: b690842d12fd ("selftests/net: test l2 tunnel TOS/TTL inheriting") Reported-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Tested-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18selftests/net: l2_tos_ttl_inherit.sh: Set IPv6 addresses with "nodad".Guillaume Nault1-4/+4
[ Upstream commit e59370b2e96eb8e7e057a2a16e999ff385a3f2fb ] The ping command can run before DAD completes. In that case, ping may fail and break the selftest. We don't need DAD here since we're working on isolated device pairs. Fixes: b690842d12fd ("selftests/net: test l2 tunnel TOS/TTL inheriting") Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18net/mlx5e: Fix macsec possible null dereference when updating MAC security ↵Emeel Hakim1-7/+2
entity (SecY) [ Upstream commit 9828994ac492e8e7de47fe66097b7e665328f348 ] Upon updating MAC security entity (SecY) in hw offload path, the macsec security association (SA) initialization routine is called. In case of extended packet number (epn) is enabled the salt and ssci attributes are retrieved using the MACsec driver rx_sa context which is unavailable when updating a SecY property such as encoding-sa hence the null dereference. Fix by using the provided SA to set those attributes. Fixes: 4411a6c0abd3 ("net/mlx5e: Support MACsec offload extended packet number (EPN)") Signed-off-by: Emeel Hakim <ehakim@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18net/mlx5e: Fix macsec ssci attribute handling in offload pathEmeel Hakim1-3/+7
[ Upstream commit f5e1ed04aa2ea665a796f0109091ca3f2b01024a ] Currently when macsec offload is set with extended packet number (epn) enabled, the driver wrongly deduce the short secure channel identifier (ssci) from the salt instead of the stand alone ssci attribute as it should, consequently creating a mismatch between the kernel and driver's ssci values. Fix by using the ssci value from the relevant attribute. Fixes: 4411a6c0abd3 ("net/mlx5e: Support MACsec offload extended packet number (EPN)") Signed-off-by: Emeel Hakim <ehakim@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18net/mlx5e: Don't support encap rules with gbp optionGavin Li1-0/+2
[ Upstream commit d515d63cae2cd186acf40deaa8ef33067bb7f637 ] Previously, encap rules with gbp option would be offloaded by mistake but driver does not support gbp option offload. To fix this issue, check if the encap rule has gbp option and don't offload the rule Fixes: d8f9dfae49ce ("net: sched: allow flower to match vxlan options") Signed-off-by: Gavin Li <gavinl@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18net/mlx5: Fix ptp max frequency adjustment rangeRahul Rameshbabu1-1/+1
[ Upstream commit fe91d57277eef8bb4aca05acfa337b4a51d0bba4 ] .max_adj of ptp_clock_info acts as an absolute value for the amount in ppb that can be set for a single call of .adjfine. This means that a single call to .getfine cannot be greater than .max_adj or less than -(.max_adj). Provides correct value for max frequency adjustment value supported by devices. Fixes: 3d8c38af1493 ("net/mlx5e: Add PTP Hardware Clock (PHC) support") Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18net/mlx5e: IPoIB, Fix child PKEY interface stats on rx pathDragos Tatulea1-1/+1
[ Upstream commit b5e23931c45a2f99f60a2f2b98a9e4d5a62a5b13 ] The current code always does the accounting using the stats from the parent interface (linked in the rq). This doesn't work when there are child interfaces configured. Fix this behavior by always using the stats from the child interface priv. This will also work for parent only interfaces: the child (netdev) and parent netdev (rq->netdev) will point to the same thing. Fixes: be98737a4faa ("net/mlx5e: Use dynamic per-channel allocations in stats") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18net/mlx5e: IPoIB, Block PKEY interfaces with less rx queues than parentDragos Tatulea1-0/+9
[ Upstream commit 31c70bfe58ef09fe36327ddcced9143a16e9e83d ] A user is able to configure an arbitrary number of rx queues when creating an interface via netlink. This doesn't work for child PKEY interfaces because the child interface uses the parent receive channels. Although the child shares the parent's receive channels, the number of rx queues is important for the channel_stats array: the parent's rx channel index is used to access the child's channel_stats. So the array has to be at least as large as the parent's rx queue size for the counting to work correctly and to prevent out of bound accesses. This patch checks for the mentioned scenario and returns an error when trying to create the interface. The error is propagated to the user. Fixes: be98737a4faa ("net/mlx5e: Use dynamic per-channel allocations in stats") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18net/mlx5e: IPoIB, Block queue count configuration when sub interfaces are ↵Dragos Tatulea4-7/+62
present [ Upstream commit 806a8df7126a8c05d60411eeb81057c2a8bbe7a7 ] PKEY sub interfaces share the receive queues with the parent interface. While setting the sub interface queue count is not supported, it is currently possible to change the number of queues of the parent interface. Thus we can end up with inconsistent queue sizes between the parent and its sub interfaces. This change disallows setting the queue count on the parent interface when sub interfaces are present. This is achieved by introducing an explicit reference to the parent netdev in the mlx5i_priv of the child interface. An additional counter is also required on the parent side to detect when sub interfaces are attached and for proper cleanup. The rtnl lock is taken during the ethtool op and the sub interface ndo_init/uninit ops. There is no race here around counting the sub interfaces, reading the sub interfaces and setting the number of channels. The ASSERT_RTNL was added to document that. Fixes: be98737a4faa ("net/mlx5e: Use dynamic per-channel allocations in stats") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18net/mlx5e: Verify dev is present for fix features ndoRoy Novich1-0/+3
[ Upstream commit ab4b01bfdaa69492fb36484026b0a0f0af02d75a ] The native NIC port net device instance is being used as Uplink representor. While changing profiles private resources are not available, fix features ndo does not check if the netdev is present. Add driver protection to verify private resources are ready. Fixes: 7a9fb35e8c3a ("net/mlx5e: Do not reload ethernet ports when changing eswitch mode") Signed-off-by: Roy Novich <royno@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18net/mlx5: Fix command stats access after freeMoshe Shemesh2-12/+3
[ Upstream commit da2e552b469a0cd130ff70a88ccc4139da428a65 ] Command may fail while driver is reloading and can't accept FW commands till command interface is reinitialized. Such command failure is being logged to command stats. This results in NULL pointer access as command stats structure is being freed and reallocated during mlx5 devlink reload (see kernel log below). Fix it by making command stats statically allocated on driver probe. Kernel log: [ 2394.808802] BUG: unable to handle kernel paging request at 000000000002a9c0 [ 2394.810610] PGD 0 P4D 0 [ 2394.811811] Oops: 0002 [#1] SMP NOPTI ... [ 2394.815482] RIP: 0010:native_queued_spin_lock_slowpath+0x183/0x1d0 ... [ 2394.829505] Call Trace: [ 2394.830667] _raw_spin_lock_irq+0x23/0x26 [ 2394.831858] cmd_status_err+0x55/0x110 [mlx5_core] [ 2394.833020] mlx5_access_reg+0xe7/0x150 [mlx5_core] [ 2394.834175] mlx5_query_port_ptys+0x78/0xa0 [mlx5_core] [ 2394.835337] mlx5e_ethtool_get_link_ksettings+0x74/0x590 [mlx5_core] [ 2394.836454] ? kmem_cache_alloc_trace+0x140/0x1c0 [ 2394.837562] __rh_call_get_link_ksettings+0x33/0x100 [ 2394.838663] ? __rtnl_unlock+0x25/0x50 [ 2394.839755] __ethtool_get_link_ksettings+0x72/0x150 [ 2394.840862] duplex_show+0x6e/0xc0 [ 2394.841963] dev_attr_show+0x1c/0x40 [ 2394.843048] sysfs_kf_seq_show+0x9b/0x100 [ 2394.844123] seq_read+0x153/0x410 [ 2394.845187] vfs_read+0x91/0x140 [ 2394.846226] ksys_read+0x4f/0xb0 [ 2394.847234] do_syscall_64+0x5b/0x1a0 [ 2394.848228] entry_SYSCALL_64_after_hwframe+0x65/0xca Fixes: 34f46ae0d4b3 ("net/mlx5: Add command failures data to debugfs") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18net/mlx5e: TC, Keep mod hdr actions after mod hdr allocAriel Levkovich1-2/+3
[ Upstream commit 5e72f3f1c558019082cfeedeed73748f35d780c6 ] When offloading TC NIC rule which has mod_hdr action, the mod_hdr actions list is freed upon mod_hdr allocation. In the new format of handling multi table actions and CT in particular, the mod_hdr actions list is still relevant when setting the pre and post rules and therefore, freeing the list may cause adding rules which don't set the FTE_ID. Therefore, the mod_hdr actions list needs to be kept for the pre/post flows as well and should be left for these handler to be freed. Fixes: 8300f225268b ("net/mlx5e: Create new flow attr for multi table actions") Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18net/mlx5: check attr pointer validity before dereferencing itAriel Levkovich1-1/+1
[ Upstream commit e0bf81bf0d3d4747c146e0bf44774d3d881d7137 ] Fix attr pointer validity checks after it was already dereferenced. Fixes: cb0d54cbf948 ("net/mlx5e: Fix wrong source vport matching on tunnel rule") Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18Revert "r8169: disable detection of chip version 36"Heiner Kallweit1-4/+1
[ Upstream commit 2ea26b4de6f42b74a5f1701de41efa6bc9f12666 ] This reverts commit 42666b2c452ce87894786aae05e3fad3cfc6cb59. This chip version seems to be very rare, but it exits in consumer devices, see linked report. Link: https://stackoverflow.com/questions/75049473/cant-setup-a-wired-network-in-archlinux-fresh-install Fixes: 42666b2c452c ("r8169: disable detection of chip version 36") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/42e9674c-d5d0-a65a-f578-e5c74f244739@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18net/sched: act_mpls: Fix warning during failed attribute validationIdo Schimmel1-1/+7
[ Upstream commit 9e17f99220d111ea031b44153fdfe364b0024ff2 ] The 'TCA_MPLS_LABEL' attribute is of 'NLA_U32' type, but has a validation type of 'NLA_VALIDATE_FUNCTION'. This is an invalid combination according to the comment above 'struct nla_policy': " Meaning of `validate' field, use via NLA_POLICY_VALIDATE_FN: NLA_BINARY Validation function called for the attribute. All other Unused - but note that it's a union " This can trigger the warning [1] in nla_get_range_unsigned() when validation of the attribute fails. Despite being of 'NLA_U32' type, the associated 'min'/'max' fields in the policy are negative as they are aliased by the 'validate' field. Fix by changing the attribute type to 'NLA_BINARY' which is consistent with the above comment and all other users of NLA_POLICY_VALIDATE_FN(). As a result, move the length validation to the validation function. No regressions in MPLS tests: # ./tdc.py -f tc-tests/actions/mpls.json [...] # echo $? 0 [1] WARNING: CPU: 0 PID: 17743 at lib/nlattr.c:118 nla_get_range_unsigned+0x1d8/0x1e0 lib/nlattr.c:117 Modules linked in: CPU: 0 PID: 17743 Comm: syz-executor.0 Not tainted 6.1.0-rc8 #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-48-gd9c812dda519-prebuilt.qemu.org 04/01/2014 RIP: 0010:nla_get_range_unsigned+0x1d8/0x1e0 lib/nlattr.c:117 [...] Call Trace: <TASK> __netlink_policy_dump_write_attr+0x23d/0x990 net/netlink/policy.c:310 netlink_policy_dump_write_attr+0x22/0x30 net/netlink/policy.c:411 netlink_ack_tlv_fill net/netlink/af_netlink.c:2454 [inline] netlink_ack+0x546/0x760 net/netlink/af_netlink.c:2506 netlink_rcv_skb+0x1b7/0x240 net/netlink/af_netlink.c:2546 rtnetlink_rcv+0x18/0x20 net/core/rtnetlink.c:6109 netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline] netlink_unicast+0x5e9/0x6b0 net/netlink/af_netlink.c:1345 netlink_sendmsg+0x739/0x860 net/netlink/af_netlink.c:1921 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg net/socket.c:734 [inline] ____sys_sendmsg+0x38f/0x500 net/socket.c:2482 ___sys_sendmsg net/socket.c:2536 [inline] __sys_sendmsg+0x197/0x230 net/socket.c:2565 __do_sys_sendmsg net/socket.c:2574 [inline] __se_sys_sendmsg net/socket.c:2572 [inline] __x64_sys_sendmsg+0x42/0x50 net/socket.c:2572 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Link: https://lore.kernel.org/netdev/CAO4mrfdmjvRUNbDyP0R03_DrD_eFCLCguz6OxZ2TYRSv0K9gxA@mail.gmail.com/ Fixes: 2a2ea50870ba ("net: sched: add mpls manipulation actions to TC") Reported-by: Wei Chen <harperchen1110@gmail.com> Tested-by: Wei Chen <harperchen1110@gmail.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://lore.kernel.org/r/20230107171004.608436-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18drm/vmwgfx: Remove rcu locks from user resourcesZack Rusin6-233/+87
[ Upstream commit a309c7194e8a2f8bd4539b9449917913f6c2cd50 ] User resource lookups used rcu to avoid two extra atomics. Unfortunately the rcu paths were buggy and it was easy to make the driver crash by submitting command buffers from two different threads. Because the lookups never show up in performance profiles replace them with a regular spin lock which fixes the races in accesses to those shared resources. Fixes kernel oops'es in IGT's vmwgfx execution_buffer stress test and seen crashes with apps using shared resources. Fixes: e14c02e6b699 ("drm/vmwgfx: Look up objects without taking a reference") Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Martin Krastev <krastevm@vmware.com> Reviewed-by: Maaz Mombasawala <mombasawalam@vmware.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221207172907.959037-1-zack@kde.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18drm/vmwgfx: Remove vmwgfx_hashtabMaaz Mombasawala8-303/+12
[ Upstream commit 9da30cdd6a318595199319708c143ae318f804ef ] The vmwgfx driver has migrated from using the hashtable in vmwgfx_hashtab to the linux/hashtable implementation. Remove the vmwgfx_hashtab from the driver. Signed-off-by: Maaz Mombasawala <mombasawalam@vmware.com> Reviewed-by: Martin Krastev <krastevm@vmware.com> Reviewed-by: Zack Rusin <zackr@vmware.com> Signed-off-by: Zack Rusin <zackr@vmware.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221022040236.616490-12-zack@kde.org Stable-dep-of: a309c7194e8a ("drm/vmwgfx: Remove rcu locks from user resources") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18drm/vmwgfx: Refactor ttm reference object hashtable to use linux/hashtable.Maaz Mombasawala3-49/+56
[ Upstream commit 76a9e07f270cf5fb556ac237dbf11f5dacd61fef ] This is part of an effort to move from the vmwgfx_open_hash hashtable to linux/hashtable implementation. Refactor the ref_hash hashtable, used for fast lookup of reference objects associated with a ttm file. This also exposed a problem related to inconsistently using 32-bit and 64-bit keys with this hashtable. The hash function used changes depending on the size of the type, and results are not consistent across numbers, for example, hash_32(329) = 329, but hash_long(329) = 328. This would cause the lookup to fail for objects already in the hashtable, since keys of different sizes were being passed during adding and lookup. This was not an issue before because vmwgfx_open_hash always used hash_long. Fix this by always using 64-bit keys for this hashtable, which means that hash_long is always used. Signed-off-by: Maaz Mombasawala <mombasawalam@vmware.com> Reviewed-by: Zack Rusin <zackr@vmware.com> Signed-off-by: Zack Rusin <zackr@vmware.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221022040236.616490-11-zack@kde.org Stable-dep-of: a309c7194e8a ("drm/vmwgfx: Remove rcu locks from user resources") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18drm/vmwgfx: Refactor resource validation hashtable to use linux/hashtable ↵Maaz Mombasawala5-66/+58
implementation. [ Upstream commit 9e931f2e09701e25744f3d186a4ba13b5342b136 ] Vmwgfx's hashtab implementation needs to be replaced with linux/hashtable to reduce maintenence burden. As part of this effort, refactor the res_ht hashtable used for resource validation during execbuf execution to use linux/hashtable implementation. This also refactors vmw_validation_context to use vmw_sw_context as the container for the hashtable, whereas before it used a vmwgfx_open_hash directly. This makes vmw_validation_context less generic, but there is no functional change since res_ht is the only instance where validation context used a hashtable in vmwgfx driver. Signed-off-by: Maaz Mombasawala <mombasawalam@vmware.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Zack Rusin <zackr@vmware.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221022040236.616490-6-zack@kde.org Stable-dep-of: a309c7194e8a ("drm/vmwgfx: Remove rcu locks from user resources") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18drm/vmwgfx: Remove ttm object hashtableMaaz Mombasawala3-23/+9
[ Upstream commit 931e09d8d5b4aa19bdae0234f2727049f1cd13d9 ] The object_hash hashtable for ttm objects is not being used. Remove it and perform refactoring in ttm_object init function. Signed-off-by: Maaz Mombasawala <mombasawalam@vmware.com> Reviewed-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Martin Krastev <krastevm@vmware.com> Signed-off-by: Zack Rusin <zackr@vmware.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221022040236.616490-5-zack@kde.org Stable-dep-of: a309c7194e8a ("drm/vmwgfx: Remove rcu locks from user resources") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18drm/vmwgfx: Refactor resource manager's hashtable to use linux/hashtable ↵Maaz Mombasawala1-36/+26
implementation. [ Upstream commit 43531dc661b7fb6be249c023bf25847b38215545 ] Vmwgfx's hashtab implementation needs to be replaced with linux/hashtable to reduce maintenance burden. Refactor cmdbuf resource manager to use linux/hashtable.h implementation as part of this effort. Signed-off-by: Maaz Mombasawala <mombasawalam@vmware.com> Reviewed-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Martin Krastev <krastevm@vmware.com> Signed-off-by: Zack Rusin <zackr@vmware.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221022040236.616490-4-zack@kde.org Stable-dep-of: a309c7194e8a ("drm/vmwgfx: Remove rcu locks from user resources") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18drm/vmwgfx: Write the driver id registersZack Rusin1-9/+34
[ Upstream commit 7f4c33778686cc2d34cb4ef65b4265eea874c159 ] Driver id registers are a new mechanism in the svga device to hint to the device which driver is running. This should not change device behavior in any way, but might be convenient to work-around specific bugs in guest drivers. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Martin Krastev <krastevm@vmware.com> Reviewed-by: Maaz Mombasawala <mombasawalam@vmware.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221022040236.616490-2-zack@kde.org Stable-dep-of: a309c7194e8a ("drm/vmwgfx: Remove rcu locks from user resources") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18ice: Add check for kzallocJiasheng Jiang1-9/+14
[ Upstream commit 40543b3d9d2c13227ecd3aa90a713c201d1d7f09 ] Add the check for the return value of kzalloc in order to avoid NULL pointer dereference. Moreover, use the goto-label to share the clean code. Fixes: d6b98c8d242a ("ice: add write functionality for GNSS TTY") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18ice: Fix potential memory leak in ice_gnss_tty_write()Yuan Can1-0/+1
[ Upstream commit f58985620f55580a07d40062c4115d8c9cf6ae27 ] The ice_gnss_tty_write() return directly if the write_buf alloc failed, leaking the cmd_buf. Fix by free cmd_buf if write_buf alloc failed. Fixes: d6b98c8d242a ("ice: add write functionality for GNSS TTY") Signed-off-by: Yuan Can <yuancan@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18drm/amdgpu: Fix potential NULL dereferenceLuben Tuikov1-2/+3
[ Upstream commit 0be7ed8e7eb15282b5d0f6fdfea884db594ea9bf ] Fix potential NULL dereference, in the case when "man", the resource manager might be NULL, when/if we print debug information. Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: AMD Graphics <amd-gfx@lists.freedesktop.org> Cc: Dan Carpenter <error27@gmail.com> Cc: kernel test robot <lkp@intel.com> Fixes: 7554886daa31ea ("drm/amdgpu: Fix size validation for non-exclusive domains (v4)") Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18tools/nolibc: fix the O_* fcntl/open macro definitions for riscvWilly Tarreau1-7/+7
[ Upstream commit 00b18da4089330196906b9fe075c581c17eb726c ] When RISCV port was imported in 5.2, the O_* macros were taken with their octal value and written as-is in hex, resulting in the getdents64() to fail in nolibc-test. Fixes: 582e84f7b779 ("tool headers nolibc: add RISCV support") #5.2 Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18tools/nolibc: restore mips branch ordering in the _start blockWilly Tarreau1-0/+2
[ Upstream commit 184177c3d6e023da934761e198c281344d7dd65b ] Depending on the compiler used and the optimization options, the sbrk() test was crashing, both on real hardware (mips-24kc) and in qemu. One such example is kernel.org toolchain in version 11.3 optimizing at -Os. Inspecting the sys_brk() call shows the following code: 0040047c <sys_brk>: 40047c: 24020fcd li v0,4045 400480: 27bdffe0 addiu sp,sp,-32 400484: 0000000c syscall 400488: 27bd0020 addiu sp,sp,32 40048c: 10e00001 beqz a3,400494 <sys_brk+0x18> 400490: 00021023 negu v0,v0 400494: 03e00008 jr ra It is obviously wrong, the "negu" instruction is placed in beqz's delayed slot, and worse, there's no nop nor instruction after the return, so the next function's first instruction (addiu sip,sip,-32) will also be executed as part of the delayed slot that follows the return. This is caused by the ".set noreorder" directive in the _start block, that applies to the whole program. The compiler emits code without the delayed slots and relies on the compiler to swap instructions when this option is not set. Removing the option would require to change the startup code in a way that wouldn't make it look like the resulting code, which would not be easy to debug. Instead let's just save the default ordering before changing it, and restore it at the end of the _start block. Now the code is correct: 0040047c <sys_brk>: 40047c: 24020fcd li v0,4045 400480: 27bdffe0 addiu sp,sp,-32 400484: 0000000c syscall 400488: 10e00002 beqz a3,400494 <sys_brk+0x18> 40048c: 27bd0020 addiu sp,sp,32 400490: 00021023 negu v0,v0 400494: 03e00008 jr ra 400498: 00000000 nop Fixes: 66b6f755ad45 ("rcutorture: Import a copy of nolibc") #5.0 Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18ASoC: qcom: Fix building APQ8016 machine driver without SOUNDWIREStephan Gerhold8-133/+157
[ Upstream commit 0cbf1ecd8c4801ec7566231491f7ad9cec31098b ] Older Qualcomm platforms like APQ8016 do not have hardware support for SoundWire, so kernel configurations made specifically for those platforms will usually not have CONFIG_SOUNDWIRE enabled. Unfortunately commit 8d89cf6ff229 ("ASoC: qcom: cleanup and fix dependency of QCOM_COMMON") breaks those kernel configurations, because SOUNDWIRE is now a required dependency for SND_SOC_QCOM_COMMON (and in turn also SND_SOC_APQ8016_SBC). Trying to migrate such a kernel config silently disables SND_SOC_APQ8016_SBC and breaks audio functionality. The soundwire helpers in common.c are only used by two of the Qualcomm audio machine drivers, so building and requiring CONFIG_SOUNDWIRE for all platforms is unnecessary. There is no need to stuff all common code into a single module. Fix the issue by moving the soundwire helpers to a separate SND_SOC_QCOM_SDW module/option that is selected only by the machine drivers that make use of them. This also allows reverting the imply/depends changes from the previous fix because both SM8250 and SC8280XP already depend on SOUNDWIRE, so the soundwire helpers will be only built if SOUNDWIRE is really enabled. Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Fixes: 8d89cf6ff229 ("ASoC: qcom: cleanup and fix dependency of QCOM_COMMON") Signed-off-by: Stephan Gerhold <stephan@gerhold.net> Link: https://lore.kernel.org/r/20221231115506.82991-1-stephan@gerhold.net Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>