summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-03-11smb: client: retry compound request without reusing leaseMeetakshi Setiya1-3/+38
There is a shortcoming in the current implementation of the file lease mechanism exposed when the lease keys were attempted to be reused for unlink, rename and set_path_size operations for a client. As per MS-SMB2, lease keys are associated with the file name. Linux smb client maintains lease keys with the inode. If the file has any hardlinks, it is possible that the lease for a file be wrongly reused for an operation on the hardlink or vice versa. In these cases, the mentioned compound operations fail with STATUS_INVALID_PARAMETER. This patch adds a fallback to the old mechanism of not sending any lease with these compound operations if the request with lease key fails with STATUS_INVALID_PARAMETER. Resending the same request without lease key should not hurt any functionality, but might impact performance especially in cases where the error is not because of the usage of wrong lease key and we might end up doing an extra roundtrip. Signed-off-by: Meetakshi Setiya <msetiya@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-11smb: client: do not defer close open handles to deleted filesMeetakshi Setiya6-5/+74
When a file/dentry has been deleted before closing all its open handles, currently, closing them can add them to the deferred close list. This can lead to problems in creating file with the same name when the file is re-created before the deferred close completes. This issue was seen while reusing a client's already existing lease on a file for compound operations and xfstest 591 failed because of the deferred close handle that remained valid even after the file was deleted and was being reused to create a file with the same name. The server in this case returns an error on open with STATUS_DELETE_PENDING. Recreating the file would fail till the deferred handles are closed (duration specified in closetimeo). This patch fixes the issue by flagging all open handles for the deleted file (file path to be precise) by setting status_file_deleted to true in the cifsFileInfo structure. As per the information classes specified in MS-FSCC, SMB2 query info response from the server has a DeletePending field, set to true to indicate that deletion has been requested on that file. If this is the case, flag the open handles for this file too. When doing close in cifs_close for each of these handles, check the value of this boolean field and do not defer close these handles if the corresponding filepath has been deleted. Signed-off-by: Meetakshi Setiya <msetiya@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-11smb: client: reuse file lease key in compound operationsMeetakshi Setiya6-31/+48
Currently, when a rename, unlink or set path size compound operation is requested on a file that has a lot of dirty pages to be written to the server, we do not send the lease key for these requests. As a result, the server can assume that this request is from a new client, and send a lease break notification to the same client, on the same connection. As a response to the lease break, the client can consume several credits to write the dirty pages to the server. Depending on the server's credit grant implementation, the server can stop granting more credits to this connection, and this can cause a deadlock (which can only be resolved when the lease timer on the server expires). One of the problems here is that the client is sending no lease key, even if it has a lease for the file. This patch fixes the problem by reusing the existing lease key on the file for rename, unlink and set path size compound operations so that the client does not break its own lease. A very trivial example could be a set of commands by a client that maintains open handle (for write) to a file and then tries to copy the contents of that file to another one, eg., tail -f /dev/null > myfile & mv myfile myfile2 Presently, the network capture on the client shows that the move (or rename) would trigger a lease break on the same client, for the same file. With the lease key reused, the lease break request-response overhead is eliminated, thereby reducing the roundtrips performed for this set of operations. The patch fixes the bug described above and also provides perf benefit. Signed-off-by: Meetakshi Setiya <msetiya@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-11smb3: update allocation size more accurately on write completionSteve French1-1/+8
Changes to allocation size are approximated for extending writes of cached files until the server returns the actual value (on SMB3 close or query info for example), but it was setting the estimated value for number of blocks to larger than the file size even if the file is likely sparse which breaks various xfstests (e.g. generic/129, 130, 221, 228). When i_size and i_blocks are updated in write completion do not increase allocation size more than what was written (rounded up to 512 bytes). Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-11cifs: minor update to list of reviewersSteve French1-0/+1
Add Bharath for reviewing deferred close and leases Acked-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-11smb: remove SLAB_MEM_SPREAD flag usageChengming Zhou1-1/+1
The SLAB_MEM_SPREAD flag is already a no-op as of 6.8-rc1, remove its usage so we can delete it from slab. No functional change. Link: https://lore.kernel.org/all/20240223-slab-cleanup-flags-v2-0-02f1753e8303@suse.cz/ Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-11cifs: allow changing password during remountSteve French4-5/+30
There are cases where a session is disconnected and password has changed on the server (or expired) for this user and this currently can not be fixed without unmount and mounting again. This patch allows remount to change the password (for the non Kerberos case, Kerberos ticket refresh is handled differently) when the session is disconnected and the user can not reconnect due to still using old password. Future patches should also allow us to setup the keyring (cifscreds) to have an "alternate password" so we would be able to change the password before the session drops (without the risk of races between when the password changes and the disconnect occurs - ie cases where the old password is still needed because the new password has not fully rolled out to all servers yet). Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-11cifs: prevent updating file size from server if we have a read/write leaseBharath SM4-12/+17
In cases of large directories, the readdir operation may span multiple round trips to retrieve contents. This introduces a potential race condition in case of concurrent write and readdir operations. If the readdir operation initiates before a write has been processed by the server, it may update the file size attribute to an older value. Address this issue by avoiding file size updates from readdir when we have read/write lease. Scenario: 1) process1: open dir xyz 2) process1: readdir instance 1 on xyz 3) process2: create file.txt for write 4) process2: write x bytes to file.txt 5) process2: close file.txt 6) process2: open file.txt for read 7) process1: readdir 2 - overwrites file.txt inode size to 0 8) process2: read contents of file.txt - bug, short read with 0 bytes Cc: stable@vger.kernel.org Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-11mailbox: imx: support i.MX95 Generic/ELE/V2X MUPeng Fan1-0/+3
Add i.MX95 Generic/ELE/V2X MU support, its register layout is same as i.MX8ULP, but the Parameter registers would show different TR/RR. Since the driver already supports get TR/RR from Parameter registers, not hardcoding the number, this patch just add the compatible entry to reuse i.MX8ULP S4 cfg data. Signed-off-by: Peng Fan <peng.fan@nxp.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-03-11mailbox: imx: populate sub-nodesPeng Fan1-0/+3
Some MUs such as i.MX95 MU, have internal SRAM which could be used for SCMI shared memory, so populate the sub-nodes to use the SRAM. Signed-off-by: Peng Fan <peng.fan@nxp.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-03-11mailbox: imx: get RR/TR registers num from Parameter registerPeng Fan1-11/+36
i.MX8ULP, i.MX93 MU has a Parameter register encoded as below: BIT: 15 --- 8 | 7 --- 0 RR_NUM TR_NUM So to make driver easy to support more variants, get the RR/TR registers number from Parameter register. The patch only adds support the specific MU, such as ELE MU. For generic MU, not add support for number larger than 4. Reviewed-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Peng Fan <peng.fan@nxp.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-03-11mailbox: imx: support return value of initPeng Fan1-11/+24
There will be changes that init may fail, so adding return value for init function. Reviewed-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Peng Fan <peng.fan@nxp.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-03-11dt-bindings: mailbox: fsl,mu: add i.MX95 Generic/ELE/V2X MU compatiblePeng Fan1-1/+57
Add i.MX95 Generic, Secure Enclave and V2X Message Unit compatible string. And the MUs in AONMIX has internal RAMs for SCMI shared buffer usage. Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Peng Fan <peng.fan@nxp.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-03-10Linux 6.8v6.8Linus Torvalds1-1/+1
2024-03-10hwmon: (dell-smm) Add XPS 9315 to fan control whitelistArmin Wolf1-0/+13
A user reported that on this machine, disabling BIOS fan control is necessary in order to change the fan speed. Signed-off-by: Armin Wolf <W_Armin@gmx.de> Acked-by: Pali Rohár <pali@kernel.org> Link: https://lore.kernel.org/r/20240309212025.13758-1-W_Armin@gmx.de Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-03-10bcachefs: bch2_lookup() gives better error message on inode not foundKent Overstreet1-9/+64
When a dirent points to a missing inode, we really should print out the dirent. This requires quite a bit of refactoring, but there's some other benefits: we now do the entire looup (dirent and inode) in a single btree transaction, and copy to the VFS inode with btree locks still held, like the create path. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10bcachefs: bch2_inode_insert()Kent Overstreet1-62/+76
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10mm: introduce PF_MEMALLOC_NORECLAIM, PF_MEMALLOC_NOWARNKent Overstreet2-6/+15
Introduce PF_MEMALLOC_* equivalents of some GFP_ flags: PF_MEMALLOC_NORECLAIM -> GFP_NOWAIT PF_MEMALLOC_NOWARN -> __GFP_NOWARN Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Darrick J. Wong <djwong@kernel.org> Cc: linux-mm@kvack.org Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10mm: introduce memalloc_flags_{save,restore}Kent Overstreet1-17/+26
Our proliferation of memalloc_*_{save,restore} APIs is getting a bit silly, this adds a generic version and converts the existing save/restore functions to wrappers. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Darrick J. Wong <djwong@kernel.org> Cc: linux-mm@kvack.org Acked-by: Vlastimil Babka <vbabka@suse.cz>
2024-03-10bcachefs: factor out check_inode_backpointer()Kent Overstreet1-9/+29
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10bcachefs: Factor out check_subvol_dirent()Kent Overstreet2-48/+58
Going to be adding more code here for checking subvol structure. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10bcachefs: Kill some -EINVALsKent Overstreet2-5/+5
Repurposing standard error codes in bcachefs code is banned in new code, and we need to get rid of the remaining ones - private error codes give us much better error messages. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10bcachefs: bump max_active on btree_interior_update_workerKent Overstreet1-1/+1
WQ_UNBOUND with max_active 1 means ordered workqueue, but we don't actually need or want ordered semantics - and probably want a higher concurrency limit anyways. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10bcachefs: move fsck_write_inode() to inode.cKent Overstreet3-40/+44
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10bcachefs: Initialize super_block->s_uuidKent Overstreet1-0/+1
Need to fix this oversight for the new FS_IOC_(GET|SET)UUID ioctls. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10bcachefs: Switch to uuid_to_fsid()Kent Overstreet1-5/+1
switch the statfs code from something horrible and open coded to the more standard uuid_to_fsid() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10bcachefs: Subvolumes may now be renamedKent Overstreet2-26/+55
Files within a subvolume cannot be renamed into another subvolume, but subvolumes themselves were intended to be. This implements subvolume renaming - we need to ensure that there's only a single dirent that points to a subvolume key (not multiple versions in different snapshots), and we need to ensure that dirent.d_parent_subol and inode.bi_parent_subvol are updated. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10bcachefs: btree node prefetching in check_topologyKent Overstreet4-3/+42
btree_and_journal_iter is old code that we want to get rid of, but we're not ready to yet. lack of btree node prefetching is, it turns out, a real performance issue for fsck on spinning rust, so - add it. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10bcachefs: btree_and_journal_iter.transKent Overstreet4-17/+21
we now always have a btree_trans when using a btree_and_journal_iter; prep work for adding prefetching to btree_and_journal_iter Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10bcachefs: better journal pipeliningKent Overstreet4-59/+98
Recently a severe performance regression was discovered, which bisected to a6548c8b5eb5 bcachefs: Avoid flushing the journal in the discard path It turns out the old behaviour, which issued excessive journal flushes, worked around a performance issue where queueing delays would cause the journal to not be able to write quickly enough and stall. The journal flushes masked the issue because they periodically flushed the device write cache, reducing write latency for non flushes. This patch reworks the journalling code to allow more than one (non-flush) write to be in flight at a time. With this patch, doing 4k random writes and an iodepth of 128, we are now able to hit 560k iops to a Samsung 970 EVO Plus - previously, we were stuck in the ~200k range. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10bcachefs: closure per journal bufKent Overstreet3-23/+41
Prep work for having multiple journal writes in flight. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10bcachefs: bio per journal bufKent Overstreet3-29/+34
Prep work for having multiple journal writes in flight. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10bcachefs: jset_entry_datetimeKent Overstreet4-17/+67
This gives us a way to record the date and time every journal entry was written - useful for debugging. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10bcachefs: improve journal entry read fsck error messagesKent Overstreet1-41/+55
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10bcachefs: convert journal replay ptrs to darrayKent Overstreet3-58/+36
Eliminates some error paths - no longer have a hardcoded BCH_REPLICAS_MAX limit. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10bcachefs: Cleanup bch2_dirent_lookup_trans()Kent Overstreet3-26/+14
Drop an unnecessary bch2_subvolume_get_snapshot() call, and drop the __ from the name - this is a normal interface. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10bcachefs: bch2_hash_set_snapshot() -> bch2_hash_set_in_snapshot()Kent Overstreet3-18/+12
Minor renaming for clarity, bit of refactoring. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10bcachefs: Workqueues should be WQ_HIGHPRIKent Overstreet1-4/+4
Most bcachefs workqueues are used for completions, and should be WQ_HIGHPRI - this helps reduce queuing delays, we want to complete quickly once we can no longer signal backpressure by blocking. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10bcachefs: Improve bch2_dirent_to_text()Kent Overstreet1-9/+11
For DT_SUBVOL, we now print both parent and child subvol IDs. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10bcachefs: fixup for building in userspaceKent Overstreet1-1/+1
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10bcachefs: Avoid taking journal lock unnecessarilyKent Overstreet2-53/+55
Previously, any time we failed to get a journal reservation we'd retry, with the journal lock held; but this isn't necessary given wait_event()/wake_up() ordering. This avoids performance cliffs when the journal starts to get backed up and lock contention shoots up. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10bcachefs: Journal writes should be REQ_SYNC|REQ_METAKent Overstreet1-1/+1
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10bcachefs: Avoid setting j->write_work unnecessarilyKent Overstreet1-13/+11
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10bcachefs: Split out journal workqueueKent Overstreet3-16/+19
We don't want journal write completions to be blocked behind btree transactions - io_complete_wq is used for btree updates after data and metadata writes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10bcachefs: Kill unnecessary wakeups in journal reclaimKent Overstreet1-11/+9
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10bcachefs: skip invisible entries in empty subvolume checkingGuoyu Ou3-5/+9
When we are checking whether a subvolume is empty in the specified snapshot, entries that do not belong to this subvolume should be skipped. This fixes the following case: $ bcachefs subvolume create ./sub $ cd sub $ bcachefs subvolume create ./sub2 $ bcachefs subvolume snapshot . ./snap $ ls -a snap . .. $ rmdir snap rmdir: failed to remove 'snap': Directory not empty As Kent suggested, we pass 0 in may_delete_deleted_inode() to ignore subvols in the subvol we are checking, because inode.bi_subvol is only set on subvolume roots, and we can't go through every inode in the subvolume and change bi_subvol when taking a snapshot. It makes the check less strict, but that's ok, the rest of fsck will still catch it. Signed-off-by: Guoyu Ou <benogy@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10bcachefs: fix split brain messageKent Overstreet1-1/+1
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10bcachefs: Set path->uptodate when no node at levelKent Overstreet1-2/+2
We were failing to set path->uptodate when reaching the end of a btree node iterator, causing the new prefetch code for backpointers gc to go into an infinite loop. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10bcachefs: Correctly validate k->u64s in btree node read pathKent Overstreet3-6/+27
validate_bset_keys() never properly validated k->u64s; it checked if it was 0, but not if it was smaller than keys for the given packed format; this fixes that small oversight. This patch was backported, so it's adding quite a few error enums so that they don't get renumbered and we don't have confusing gaps. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10bcachefs: Fix degraded mode fsckKent Overstreet1-18/+18
We don't know where the superblock and journal lives on offline devices; that means if a device is offline fsck can't check those buckets. Previously, fsck would incorrectly clear bucket data types for those buckets on offline devices; now we just use the previous state. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>