summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2023-11-18fs: Pass AT_GETATTR_NOSEC flag to getattr interface functionStefan Berger5-8/+31
When vfs_getattr_nosec() calls a filesystem's getattr interface function then the 'nosec' should propagate into this function so that vfs_getattr_nosec() can again be called from the filesystem's gettattr rather than vfs_getattr(). The latter would add unnecessary security checks that the initial vfs_getattr_nosec() call wanted to avoid. Therefore, introduce the getattr flag GETATTR_NOSEC and allow to pass with the new getattr_flags parameter to the getattr interface function. In overlayfs and ecryptfs use this flag to determine which one of the two functions to call. In a recent code change introduced to IMA vfs_getattr_nosec() ended up calling vfs_getattr() in overlayfs, which in turn called security_inode_getattr() on an exiting process that did not have current->fs set anymore, which then caused a kernel NULL pointer dereference. With this change the call to security_inode_getattr() can be avoided, thus avoiding the NULL pointer dereference. Reported-by: <syzbot+a67fc5321ffb4b311c98@syzkaller.appspotmail.com> Fixes: db1d1e8b9867 ("IMA: use vfs_getattr_nosec to get the i_version") Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: <linux-fsdevel@vger.kernel.org> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Amir Goldstein <amir73il@gmail.com> Cc: Tyler Hicks <code@tyhicks.com> Cc: Mimi Zohar <zohar@linux.ibm.com> Suggested-by: Christian Brauner <brauner@kernel.org> Co-developed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Link: https://lore.kernel.org/r/20231002125733.1251467-1-stefanb@linux.vnet.ibm.com Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-11-18drm/msm: remove unnecessary NULL checkDan Carpenter1-2/+1
This NULL check was required when it was added, but we shuffled the code around and now it's not. The inconsistent NULL checking triggers a Smatch warning: drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c:847 mdp5_init() warn: variable dereferenced before check 'mdp5_kms' (see line 782) Fixes: 1f50db2f3e1e ("drm/msm/mdp5: move resource allocation to the _probe function") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/562559/ Link: https://lore.kernel.org/r/ZSj+6/J6YsoSpLak@kadam Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
2023-11-18Merge tag 'bcachefs-2023-11-17' of https://evilpiepirate.org/git/bcachefsLinus Torvalds27-377/+248
Pull bcachefs fixes from Kent Overstreet: "Lots of small fixes for minor nits and compiler warnings. Bigger items: - The six locks lost wakeup is finally fixed: six_read_trylock() was checking for the waiting bit before decrementing the number of readers - validated the fix with a torture test. - Fix for a memory reclaim issue: when needing to reallocate a key cache key, we now do our usual GFP_NOWAIT; unlock(); GFP_KERNEL dance. - Multiple deleted inodes btree fixes - Fix an issue in fsck, where i_nlink would be recalculated incorrectly for hardlinked files if a snapshot had ever been taken. - Kill journal pre-reservations: This is a bigger patch than I would normally send at this point, but it deletes code and it fixes some of our tests that would sporadically die with the journal getting stuck, and it's a performance improvement, too" * tag 'bcachefs-2023-11-17' of https://evilpiepirate.org/git/bcachefs: (22 commits) bcachefs: Fix missing locking for dentry->d_parent access bcachefs: six locks: Fix lost wakeup bcachefs: Fix no_data_io mode checksum check bcachefs: Fix bch2_check_nlinks() for snapshots bcachefs: Don't decrease BTREE_ITER_MAX when LOCKDEP=y bcachefs: Disable debug log statements bcachefs: Fix missing transaction commit bcachefs: Fix error path in bch2_mount() bcachefs: Fix potential sleeping during mount bcachefs: Fix iterator leak in may_delete_deleted_inode() bcachefs: Kill journal pre-reservations bcachefs: Check for nonce offset inconsistency in data_update path bcachefs: Make sure to drop/retake btree locks before reclaim bcachefs: btree_trans->write_locked bcachefs: Run btree key cache shrinker less aggressively bcachefs: Split out btree_key_cache_types.h bcachefs: Guard against insufficient devices to create stripes bcachefs: Fix null ptr deref in bch2_backpointer_get_node() bcachefs: Fix multiple -Warray-bounds warnings bcachefs: Use DECLARE_FLEX_ARRAY() helper and fix multiple -Warray-bounds warnings ...
2023-11-18Merge tag 'mm-hotfixes-stable-2023-11-17-14-04' of ↵Linus Torvalds16-38/+57
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "Thirteen hotfixes. Seven are cc:stable and the remainder pertain to post-6.6 issues or aren't considered suitable for backporting" * tag 'mm-hotfixes-stable-2023-11-17-14-04' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm: more ptep_get() conversion parisc: fix mmap_base calculation when stack grows upwards mm/damon/core.c: avoid unintentional filtering out of schemes mm: kmem: drop __GFP_NOFAIL when allocating objcg vectors mm/damon/sysfs-schemes: handle tried region directory allocation failure mm/damon/sysfs-schemes: handle tried regions sysfs directory allocation failure mm/damon/sysfs: check error from damon_sysfs_update_target() mm: fix for negative counter: nr_file_hugepages selftests/mm: add hugetlb_fault_after_madv to .gitignore selftests/mm: restore number of hugepages selftests: mm: fix some build warnings selftests: mm: skip whole test instead of failure mm/damon/sysfs: eliminate potential uninitialized variable warning
2023-11-18Merge tag 'block-6.7-2023-11-17' of git://git.kernel.dk/linuxLinus Torvalds1-37/+38
Pull block fix from Jens Axboe: "Just a single fix from Christoph/Ming, fixing a case where integrity IO could be called without having an appropriate queue reference" * tag 'block-6.7-2023-11-17' of git://git.kernel.dk/linux: blk-mq: make sure active queue usage is held for bio_integrity_prep()
2023-11-18Merge tag 'io_uring-6.7-2023-11-17' of git://git.kernel.dk/linuxLinus Torvalds2-9/+12
Pull io_uring fix from Jens Axboe: "Just a single fixup for a change we made in this release, which caused a regression in sometimes missing fdinfo output if the SQPOLL thread had the lock held when fdinfo output was retrieved. This brings us back on par with what we had before, where just the main uring_lock will prevent that output. We'd love to get rid of that too, but that is beyond the scope of this release and will have to wait for 6.8" * tag 'io_uring-6.7-2023-11-17' of git://git.kernel.dk/linux: io_uring/fdinfo: remove need for sqpoll lock for thread/pid retrieval
2023-11-18Merge tag 'drm-fixes-2023-11-17' of git://anongit.freedesktop.org/drm/drmLinus Torvalds32-98/+182
Pull drm fixes from Daniel Vetter: "This is a 'blast from the bast' fixes pull, because it contains a bunch of AGP fixes for amdgpu. Otherwise nothing out of the ordinary. Next week is back to Dave unless he's knocked out by some conference bug. - amdgpu: fixes all over, including a set of AGP fixes - nouvea: GSP + other bugfixes - ivpu build fix - lenovo legion go panel orientation quirk" * tag 'drm-fixes-2023-11-17' of git://anongit.freedesktop.org/drm/drm: (30 commits) drm/amdgpu/gmc9: disable AGP aperture drm/amdgpu/gmc10: disable AGP aperture drm/amdgpu/gmc11: disable AGP aperture drm/amdgpu: add a module parameter to control the AGP aperture drm/amdgpu/gmc11: fix logic typo in AGP check drm/amd/display: Fix encoder disable logic drm/amd/display: Change the DMCUB mailbox memory location from FB to inbox drm/amdgpu: add and populate the port num into xgmi topology info drm/amd/display: Negate IPS allow and commit bits drm/amd/pm: Don't send unload message for reset drm/amdgpu: fix ras err_data null pointer issue in amdgpu_ras.c drm/amd/display: Clear dpcd_sink_ext_caps if not set drm/amd/display: Enable fast plane updates on DCN3.2 and above drm/amd/display: fix NULL dereference drm/amd/display: fix a NULL pointer dereference in amdgpu_dm_i2c_xfer() drm/amd/display: Add null checks for 8K60 lightup drm/amd/pm: Fill pcie error counters for gpu v1_4 drm/amd/pm: Update metric table for smu v13_0_6 drm/amdgpu: correct chunk_ptr to a pointer to chunk. drm/amd/display: Fix DSC not Enabled on Direct MST Sink ...
2023-11-18drm/panel: auo,b101uan08.3: Fine tune the panel power sequenceXuxin Xiong1-0/+1
For "auo,b101uan08.3" this panel, it is stipulated in the panel spec that MIPI needs to keep the LP11 state before the lcm_reset pin is pulled high. Fixes: 56ad624b4cb5 ("drm/panel: support for auo, b101uan08.3 wuxga dsi video mode panel") Signed-off-by: Xuxin Xiong <xuxinxiong@huaqin.corp-partner.google.com> Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20231114044205.613421-1-xuxinxiong@huaqin.corp-partner.google.com
2023-11-17NFSD: Fix checksum mismatches in the duplicate reply cacheChuck Lever3-24/+54
nfsd_cache_csum() currently assumes that the server's RPC layer has been advancing rq_arg.head[0].iov_base as it decodes an incoming request, because that's the way it used to work. On entry, it expects that buf->head[0].iov_base points to the start of the NFS header, and excludes the already-decoded RPC header. These days however, head[0].iov_base now points to the start of the RPC header during all processing. It no longer points at the NFS Call header when execution arrives at nfsd_cache_csum(). In a retransmitted RPC the XID and the NFS header are supposed to be the same as the original message, but the contents of the retransmitted RPC header can be different. For example, for krb5, the GSS sequence number will be different between the two. Thus if the RPC header is always included in the DRC checksum computation, the checksum of the retransmitted message might not match the checksum of the original message, even though the NFS part of these messages is identical. The result is that, even if a matching XID is found in the DRC, the checksum mismatch causes the server to execute the retransmitted RPC transaction again. Reviewed-by: Jeff Layton <jlayton@kernel.org> Tested-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-11-17NFSD: Fix "start of NFS reply" pointer passed to nfsd_cache_update()Chuck Lever1-1/+3
The "statp + 1" pointer that is passed to nfsd_cache_update() is supposed to point to the start of the egress NFS Reply header. In fact, it does point there for AUTH_SYS and RPCSEC_GSS_KRB5 requests. But both krb5i and krb5p add fields between the RPC header's accept_stat field and the start of the NFS Reply header. In those cases, "statp + 1" points at the extra fields instead of the Reply. The result is that nfsd_cache_update() caches what looks to the client like garbage. A connection break can occur for a number of reasons, but the most common reason when using krb5i/p is a GSS sequence number window underrun. When an underrun is detected, the server is obliged to drop the RPC and the connection to force a retransmit with a fresh GSS sequence number. The client presents the same XID, it hits in the server's DRC, and the server returns the garbage cache entry. The "statp + 1" argument has been used since the oldest changeset in the kernel history repo, so it has been in nfsd_dispatch() literally since before history began. The problem arose only when the server-side GSS implementation was added twenty years ago. Reviewed-by: Jeff Layton <jlayton@kernel.org> Tested-by: Jeff Layton <jlayton@kernel.org Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-11-17NFSD: Update nfsd_cache_append() to use xdr_streamChuck Lever1-15/+8
When inserting a DRC-cached response into the reply buffer, ensure that the reply buffer's xdr_stream is updated properly. Otherwise the server will send a garbage response. Cc: stable@vger.kernel.org # v6.3+ Reviewed-by: Jeff Layton <jlayton@kernel.org> Tested-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-11-17nfsd: fix file memleak on client_opens_releaseMahmoud Adam1-1/+1
seq_release should be called to free the allocated seq_file Cc: stable@vger.kernel.org # v5.3+ Signed-off-by: Mahmoud Adam <mngyadam@amazon.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Fixes: 78599c42ae3c ("nfsd4: add file to display list of client's opens") Reviewed-by: NeilBrown <neilb@suse.de> Tested-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-11-17dm-crypt: start allocating with MAX_ORDERMikulas Patocka1-1/+1
Commit 23baf831a32c ("mm, treewide: redefine MAX_ORDER sanely") changed the meaning of MAX_ORDER from exclusive to inclusive. So, we can allocate compound pages with up to 1 << MAX_ORDER pages. Reflect this change in dm-crypt and start trying to allocate compound pages with MAX_ORDER. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-11-17dm-verity: don't use blocking calls from taskletsMikulas Patocka3-14/+15
The commit 5721d4e5a9cd enhanced dm-verity, so that it can verify blocks from tasklets rather than from workqueues. This reportedly improves performance significantly. However, dm-verity was using the flag CRYPTO_TFM_REQ_MAY_SLEEP from tasklets which resulted in warnings about sleeping function being called from non-sleeping context. BUG: sleeping function called from invalid context at crypto/internal.h:206 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 14, name: ksoftirqd/0 preempt_count: 100, expected: 0 RCU nest depth: 0, expected: 0 CPU: 0 PID: 14 Comm: ksoftirqd/0 Tainted: G W 6.7.0-rc1 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x32/0x50 __might_resched+0x110/0x160 crypto_hash_walk_done+0x54/0xb0 shash_ahash_update+0x51/0x60 verity_hash_update.isra.0+0x4a/0x130 [dm_verity] verity_verify_io+0x165/0x550 [dm_verity] ? free_unref_page+0xdf/0x170 ? psi_group_change+0x113/0x390 verity_tasklet+0xd/0x70 [dm_verity] tasklet_action_common.isra.0+0xb3/0xc0 __do_softirq+0xaf/0x1ec ? smpboot_thread_fn+0x1d/0x200 ? sort_range+0x20/0x20 run_ksoftirqd+0x15/0x30 smpboot_thread_fn+0xed/0x200 kthread+0xdc/0x110 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork+0x28/0x40 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork_asm+0x11/0x20 </TASK> This commit fixes dm-verity so that it doesn't use the flags CRYPTO_TFM_REQ_MAY_SLEEP and CRYPTO_TFM_REQ_MAY_BACKLOG from tasklets. The crypto API would do GFP_ATOMIC allocation instead, it could return -ENOMEM and we catch -ENOMEM in verity_tasklet and requeue the request to the workqueue. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org # v6.0+ Fixes: 5721d4e5a9cd ("dm verity: Add optional "try_verify_in_tasklet" feature") Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-11-17dm-bufio: fix no-sleep modeMikulas Patocka1-25/+62
dm-bufio has a no-sleep mode. When activated (with the DM_BUFIO_CLIENT_NO_SLEEP flag), the bufio client is read-only and we could call dm_bufio_get from tasklets. This is used by dm-verity. Unfortunately, commit 450e8dee51aa ("dm bufio: improve concurrent IO performance") broke this and the kernel would warn that cache_get() was calling down_read() from no-sleeping context. The bug can be reproduced by using "veritysetup open" with the "--use-tasklets" flag. This commit fixes dm-bufio, so that the tasklet mode works again, by expanding use of the 'no_sleep_enabled' static_key to conditionally use either a rw_semaphore or rwlock_t (which are colocated in the buffer_tree structure using a union). Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org # v6.4 Fixes: 450e8dee51aa ("dm bufio: improve concurrent IO performance") Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-11-17dm-delay: avoid duplicate logicMikulas Patocka1-44/+21
This is small refactoring of dm-delay - we avoid duplicate logic in flush_delayed_bios and flush_delayed_bios_fast and join these two functions into one. We also add cond_resched() to flush_delayed_bios because the list may have unbounded number of entries. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-11-17dm-delay: fix bugs introduced by kthread modeMikulas Patocka1-26/+35
This commit fixes the following bugs introduced by commit 70bbeb29fab0 ("dm delay: for short delays, use kthread instead of timers and wq"): * the function flush_worker_fn has no exit path - on unload, this function will just loop and consume 100% CPU without any progress * the wake-up mechanism in flush_worker_fn is racy - a wake up will be missed if the process adds entries to the delayed_bios list just before set_current_state(TASK_INTERRUPTIBLE) * flush_delayed_bios_fast submits a bio while holding a global mutex; this may deadlock if we have multiple stacked dm-delay devices and the underlying device attempts to acquire the mutex too * if the target constructor fails, it will call delay_dtr. delay_dtr would attempt to free dc->timer_lock without it being initialized by the constructor. * if the target constructor's kthread allocation fails, delay_dtr would crash trying to dereference dc->worker because it is non-NULL due to ERR_PTR. Fixes: 70bbeb29fab0 ("dm delay: for short delays, use kthread instead of timers and wq") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-11-17dm-delay: fix a race between delay_presuspend and delay_bioMikulas Patocka1-5/+11
In delay_presuspend, we set the atomic variable may_delay and then stop the timer and flush pending bios. The intention here is to prevent the delay target from re-arming the timer again. However, this test is racy. Suppose that one thread goes to delay_bio, sees that dc->may_delay is one and proceeds; now, another thread executes delay_presuspend, it sets dc->may_delay to zero, deletes the timer and flushes pending bios. Then, the first thread continues and adds the bio to delayed->list despite the fact that dc->may_delay is false. Fix this bug by changing may_delay's type from atomic_t to bool and only access it while holding the delayed_bios_lock mutex. Note that we don't have to grab the mutex in delay_resume because there are no bios in flight at this point. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-11-17blk-cgroup: bypass blkcg_deactivate_policy after destroyingMing Lei1-0/+13
blkcg_deactivate_policy() can be called after blkg_destroy_all() returns, and it isn't necessary since blkg_destroy_all has covered policy deactivation. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20231117023527.3188627-4-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-11-17blk-cgroup: avoid to warn !rcu_read_lock_held() in blkg_lookup()Ming Lei1-2/+0
So far, all callers either holds spin lock or rcu read explicitly, and most of the caller has added WARN_ON_ONCE(!rcu_read_lock_held()) or lockdep_assert_held(&disk->queue->queue_lock). Remove WARN_ON_ONCE(!rcu_read_lock_held()) from blkg_lookup() for killing the false positive warning from blkg_conf_prep(). Reported-by: Changhui Zhong <czhong@redhat.com> Fixes: 83462a6c971c ("blkcg: Drop unnecessary RCU read [un]locks from blkg_conf_prep/finish()") Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20231117023527.3188627-3-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-11-17blk-throttle: fix lockdep warning of "cgroup_mutex or RCU read lock required!"Ming Lei1-0/+2
Inside blkg_for_each_descendant_pre(), both css_for_each_descendant_pre() and blkg_lookup() requires RCU read lock, and either cgroup_assert_mutex_or_rcu_locked() or rcu_read_lock_held() is called. Fix the warning by adding rcu read lock. Reported-by: Changhui Zhong <czhong@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20231117023527.3188627-2-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-11-17accel/qaic: Update MAX_ORDER use to be inclusiveJeffrey Hugo1-1/+1
MAX_ORDER was redefined so that valid allocations to the page allocator are in the range of 0..MAX_ORDER, inclusive in the commit 23baf831a32c ("mm, treewide: redefine MAX_ORDER sanely"). We are treating MAX_ORDER as an exclusive value, and thus could be requesting larger allocations. Update our use to match the redefinition of MAX_ORDER. Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com> Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com> Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231103153302.20642-1-quic_jhugo@quicinc.com
2023-11-17drm/amdgpu: Add function parameter 'xcc_mask' not described in ↵Srinivasan Shanmugam1-0/+1
'amdgpu_vm_flush_compute_tlb' Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:1373: warning: Function parameter or member 'xcc_mask' not described in 'amdgpu_vm_flush_compute_tlb' Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17drm/amdgpu: add amdgpu runpm usage trace for separate funcsPrike Liang3-0/+21
Add trace for amdgpu runpm separate funcs usage and this will help debugging on the case of runpm usage missed to dereference. In the normal case the runpm usage count referred by one kind of functionality pairwise and usage should be changed from 1 to 0, otherwise there will be an issue in the amdgpu runpm usage dereference. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17drm/amdgpu: add pm metrics structure definitionAlex Deucher1-0/+15
Define the pm metrics structures to be exposed via sysfs. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
2023-11-17drm/amdgpu: expose the connected port num info through sysfsShiwu Zhang1-0/+44
By catting the xgmi_port_num sysfs node, it prints out the info in the format of <src node id>:<src port num> -> <dst node id>:<dst port num> for one xgmi link. For example, in case of 4 sockets fully and evenly connected setup, it would be like as below for the first node in the hive. 01:02 -> 02:03 01:03 -> 02:02 01:07 -> 03:04 01:04 -> 03:07 01:06 -> 04:05 01:05 -> 04:06 Based on the fact that there is two xgmi links between each socket pair, "01:02 -> 02:03" means that the current socket in question use the port 2 to connect with port 3 of the second node in the hive and so on. v2: print out the src/dst node id for each xgmi link (lijo) v3: replace the current_node++ with +1 to align with dst node (le) and use the dev_err instead of pr_err (lijo) v4: fix checkpatch warning (alex) Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17drm/amd/display: Promote DC to 3.2.260Aric Cyr1-1/+1
- Add missing chips for HDCP - Add new command to disable replay timing resync - Fix encoder disable logic - Enable DSC Flag in MST Mode Validation - Change the DMCUB mailbox memory location from FB to inbox - Add disable timeout option - Negate IPS allow and commit bits - Enable DCN clock gating for DCN35 - Prefer currently used OTG master when acquiring free pipe - Try to acquire a free OTG master not used in cur ctx first - Clear dpcd_sink_ext_caps if not set - Enable fast plane updates on DCN3.2 and above - Add null checks for 8K60 lightup - Refactor resource into component directory - Fix DSC not Enabled on Direct MST Sink - Guard against invalid RPTR/WPTR being set - Enable CM low mem power optimization - Fix a debugfs null pointer error Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Aric Cyr <aric.cyr@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17drm/amd: Exclude dGPUs in eGPU enclosures from DPM quirksMario Limonciello1-2/+6
The PCIe speed capabilities advertised by a USB4 or TBT3 link are limited to PCIe gen 1 per the USB4 spec. In reality the speed will change dynamically based on fabric conditions and other traffic. DPM is disabled when dGPUs are connected directly to Intel hosts since the PCIe root port isn't able to handle dynamic speed switching. As this limitation is specifically for PCIe root ports in the SoC, don't apply it when connected to an eGPU enclosure connected to an Intel host. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2885 Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17drm/amd: Use the first non-dGPU PCI device for BW limitsMario Limonciello1-2/+35
When bandwidth limits are looked up using pcie_bandwidth_available() virtual links such as USB4 are analyzed which might not represent the real speed. Furthermore devices may change speeds autonomously which may introduce conditional variation to the results reported in the status registers. Instead look at the capabilities of first PCI device outside of dGPU to decide upper limits that the dGPU will work at. For eGPU this effectively means that it will use the speed of the link partner. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925#note_2145860 Link: https://www.usb.org/document-library/usb4r-specification-v20 USB4 V2 with Errata and ECN through June 2023 Section 11.2.1 Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17drm/amd/display: Add missing chips for HDCPRodrigo Siqueira1-0/+5
[WHAT] Add missing HDCP ID in the message id enum. Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17drm/amd/display: Add new command to disable replay timing resyncAnthony Koo1-0/+41
[WHY & HOW] Add new command to disable replay timing resync Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Anthony Koo <anthony.koo@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17drm/amd/display: Enable DSC Flag in MST Mode ValidationFangzhi Zuo1-18/+26
[WHY & HOW] When dsc is possible, MST mode validation includes: 1. if maximum dsc compression cannot fit into end to end bw, mode pruned 2. if native bw cannot fit into end to end bw, try to enabled dsc to see whether a feasible dsc config can be found 3. if native bw can fit into end to end bw, mode supported Reviewed-by: Wayne Lin <wayne.lin@amd.com> Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17drm/amd/display: Send PQ bit in AMD VSIFKrunoslav Kovac1-2/+4
[WHY & HOW] PB9 bit 5 was added to signal PQ EOTF in AMD vendor specific infoframe. This change sets it when appropriate. Reviewed-by: Aric Cyr <aric.cyr@amd.com> Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Krunoslav Kovac <krunoslav.kovac@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17drm/amd/display: Add disable timeout optionDuncan Ma3-6/+27
[WHY] Driver continues running whenever there is is timeout from smu or dmcub. It is difficult to track failure state when dcn, dc or dmcub changes on root failure. [HOW] Add disable_timeout option to halt driver whenever there is a failure in response. Reviewed-by: Charlene Liu <charlene.liu@amd.com> Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Duncan Ma <duncan.ma@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17drm/amd/display: Enable DCN clock gating for DCN35Daniel Miess11-30/+108
[WHY & HOW] Enable DCN clock gating for DCN35. Disable DTBCLK gate before link training and re-enable afterwards Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Daniel Miess <daniel.miess@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17drm/amd/display: Prefer currently used OTG master when acquiring free pipeWenjing Liu2-1/+48
[WHY & HOW] When acquiring an OTG master pipe we should prefer currently enabled OTG master pipes first. If there are no free pipes used as current OTG master pipe then we will try to acquire a currently unused free pipe as new OTG master instead of tearing down current secondary pipes from ODM or MPC combine. Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Reviewed-by: Dillon Varone <dillon.varone@amd.com> Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Wenjing Liu <wenjing.liu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17drm/amd/display: Try to acquire a free OTG master not used in cur ctx firstWenjing Liu1-2/+6
[WHY & HOW] The current otg master pipe allocation logic is not optimized based current resource context. We should try to acquire a free OTG master not used in cur cts first to avoid unnecessary pipe switch from current state. Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Wenjing Liu <wenjing.liu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17drm/amd/display: Refactor resource into component directoryMounika Adhuri68-181/+276
[WHY] Move all resource files to unique folder resource. [HOW] Created resource folder in dc, moved the dcnxx_resource.c and dcnxx_resource.h files into corresponding new folders inside the resource and made appropriate changes for compilation in Makefiles. Reviewed-by: Martin Leung <martin.leung@amd.com> Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Mounika Adhuri <moadhuri@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17drm/amd/display: add a debugfs interface for the DMUB trace maskHamza Mahfooz4-2/+209
For features that are implemented primarily in DMUB (e.g. PSR), it is useful to be able to trace them at a DMUB level from the kernel, especially when debugging issues. So, introduce a debugfs interface that is able to read and set the DMUB trace mask dynamically at runtime and document how to use it. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17drm/amd/display: Enable CM low mem power optimizationYihan Zhu2-6/+9
[WHY & HOW] MPC MCM low mem power optimization still causes color distortion on first SCE enablement, only forces light sleep for it. DPP low memory power optimization still needs this bit to save power. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Yihan Zhu <yihan.zhu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17drm/amdgpu: correct mca ipid die/socket/addr decodeYang Wang1-5/+13
correct mca ipid die/socket/addr decode v2: squash in fix from Yang Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17drm/amd/display: Fix a debugfs null pointer errorAurabindo Pillai1-1/+5
[WHY & HOW] Check whether get_subvp_en() callback exists before calling it. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Alex Hung <alex.hung@amd.com> Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17drm/amdkfd: Clear the VALU exception state in the trap handlerLaurent Morichetti2-332/+338
The trap handler could be entered with pending VALU exceptions, so clear the exception state before issuing vector instructions. Reviewed-by: Jay Cornwall <jay.cornwall@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Laurent Morichetti <laurent.morichetti@amd.com> Tested-by: Lancelot Six <lancelot.six@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17drm/amdgpu: Refactor 'amdgpu_connector_dvi_detect' in amdgpu_connectors.cSrinivasan Shanmugam1-27/+42
Fixes the below: WARNING: Prefer 'unsigned int' to bare use of 'unsigned' WARNING: Missing a blank line after declarations WARNING: Too many leading tabs - consider code refactoring + if (list_connector->connector_type != DRM_MODE_CONNECTOR_VGA) { WARNING: Too many leading tabs - consider code refactoring + if (!amdgpu_display_hpd_sense(adev, amdgpu_connector->hpd.hpd)) { Cc: Guchun Chen <guchun.chen@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17drm/amd/pm: Make smu_v13_0_baco_set_armd3_sequence() staticMa Jun2-4/+1
smu_v13_0_baco_set_armd3_sequence is not used by other files, so make it as static type. Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17drm/amd/pm: Move some functions to smu_v13_0.c as generic codeMa Jun3-78/+32
Use generic functions and remove the duplicate code Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17drm/radeon: Fix warning using plain integer as NULLAbhinav Singh1-4/+4
sparse static analysis tools generate a warning with this message "Using plain integer as NULL pointer". In this case this warning is being shown because we are trying to intialize a pointer to NULL using integer value 0. Signed-off-by: Abhinav Singh <singhabhinav9051571833@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17amdgpu: Adjust kmalloc_array calls for new -Walloc-sizeSam James5-8/+8
GCC 14 introduces a new -Walloc-size included in -Wextra which errors out on various files in drivers/gpu/drm/amd/amdgpu like: ``` amdgpu_amdkfd_gfx_v8.c:241:15: error: allocation of insufficient size ‘4’ for type ‘uint32_t[2]’ {aka ‘unsigned int[2]'} with size ‘8’ [-Werror=alloc-size] ``` This is because each HQD_N_REGS is actually a uint32_t[2]. Move the * 2 to the size argument so GCC sees we're allocating enough. Originally did 'sizeof(uint32_t) * 2' for the size but a friend suggested 'sizeof(**dump)' better communicates the intent. Link: https://lore.kernel.org/all/87wmuwo7i3.fsf@gentoo.org/ Signed-off-by: Sam James <sam@gentoo.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17drm/amd/pm: Fix return value and drop redundant paramMa Jun11-38/+28
Fix the return value and drop redundant parameter of get_asic_baco_capability function. Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17drm/amdkfd: Move TLB flushing logic into amdgpuFelix Kuehling6-67/+57
This will make it possible for amdgpu GEM ioctls to flush TLBs on compute VMs. This removes VMID-based TLB flushing and always uses PASID-based flushing. This still works because it scans the VMID-PASID mapping registers to find the right VMID. It's only slightly less efficient. This is not a production use case. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>